Image to Markdown AI 📄🤖

A web application that converts an uploaded image containing text into clean, structured Markdown using vision LLMs (Google Gemini 2.0 Flash, or Llama 4 Scout via Groq).

Built with Next.js, Vercel AI SDK, and Tailwind CSS, featuring a bold, Gumroad-inspired Neubrutalism user interface.
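As a rough illustration of the request a vision model needs, the sketch below builds one user message that pairs an instruction with the image bytes. The type names, the `buildExtractionMessage` helper, and the prompt wording are all assumptions made for this example; the project's actual prompt and message shape are not shown in this README.

```typescript
// Hypothetical message shapes: one text part plus one image part.
// These mirror the common "content parts" layout of multimodal chat APIs,
// but they are defined locally and are not imported from any SDK.
type TextPart = { type: "text"; text: string };
type ImagePart = { type: "image"; image: Uint8Array };
type UserMessage = { role: "user"; content: Array<TextPart | ImagePart> };

// Assumed instruction text; the real prompt may differ.
const INSTRUCTION =
  "Convert all text in this image to clean, structured Markdown. " +
  "Preserve headings, lists, and code blocks. Return only the Markdown.";

// Builds the single user message sent to the vision model.
// Rejects non-image uploads and empty files before any API call is made.
function buildExtractionMessage(image: Uint8Array, mimeType: string): UserMessage {
  if (mimeType.indexOf("image/") !== 0) {
    throw new Error("Expected an image upload, got: " + mimeType);
  }
  if (image.length === 0) {
    throw new Error("Image is empty");
  }
  return {
    role: "user",
    content: [
      { type: "text", text: INSTRUCTION },
      { type: "image", image: image },
    ],
  };
}
```

A message shaped like this could then be handed to whichever model call the app uses; validating the upload first avoids spending an API request on a file the model cannot read.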

✨ Features

  • Accurate Text Extraction: Recognizes paragraphs, heading hierarchies, bullet points, and code blocks directly from screenshots/images.
  • Multiple AI Models: Switch between Gemini 2.0 Flash (via Google) and Llama 4 Scout (via Groq) on the fly.
  • Neubrutalist UI: A fun, highly tactile design featuring heavy black borders, hard block shadows, vibrant colors, and click-depth animations.
  • Drag & Drop: Easily drop images directly into the browser.
  • 1-Click Copy: Copy the generated markdown to your clipboard instantly.
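One way to support switching models on the fly is a small lookup table that maps the UI choice to a provider, a model ID, and the environment variable holding its key. This is a minimal sketch, not the project's code: the `ModelChoice` values, the `resolveModel` helper, and the exact model ID strings are assumptions and should be checked against each provider's current model list.

```typescript
// Hypothetical identifiers for the two models offered in the UI.
type ModelChoice = "gemini" | "llama";

interface ModelConfig {
  provider: "google" | "groq";
  modelId: string;   // assumed ID strings; verify against provider docs
  apiKeyEnv: string; // environment variable that holds the API key
}

const MODELS: Record<ModelChoice, ModelConfig> = {
  gemini: {
    provider: "google",
    modelId: "gemini-2.0-flash",
    apiKeyEnv: "GOOGLE_GENERATIVE_AI_API_KEY",
  },
  llama: {
    provider: "groq",
    modelId: "meta-llama/llama-4-scout-17b-16e-instruct",
    apiKeyEnv: "GROQ_API_KEY",
  },
};

// Resolves an untrusted string (for example, a form field) to a known config.
// Unknown values throw instead of silently falling back to a default model.
function resolveModel(choice: string): ModelConfig {
  if (choice === "gemini" || choice === "llama") {
    return MODELS[choice];
  }
  throw new Error("Unknown model choice: " + choice);
}
```

Keeping the mapping in one table means adding a third model later is a one-entry change, and the server never passes a client-supplied model string straight to a provider.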

🚀 How to Use / Run Locally

1. Prerequisites

Ensure you have Node.js and pnpm installed on your machine; the install and dev commands below use pnpm.

2. Clone the Repository

git clone https://github.com/khalidkhankakar/image-to-md
cd image-to-md

3. Install Dependencies

pnpm install

4. Setup Environment Variables

Create a file named .env.local in the root of the project:

touch .env.local

Open it and add your API keys:

# Get this from Google AI Studio (https://aistudio.google.com/)
GOOGLE_GENERATIVE_AI_API_KEY=your_gemini_api_key_here

# Get this from Groq Cloud (https://console.groq.com/keys)
GROQ_API_KEY=your_groq_api_key_here
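A missing or placeholder key usually surfaces only as a failed API call, so an early check can save some debugging. The sketch below is a hypothetical helper (`findMissingKeys` is not part of the project) that reports which of the two variables above are unset, blank, or still set to the `your_..._here` placeholder.

```typescript
// The two variables this README asks you to define in .env.local.
const REQUIRED_KEYS = ["GOOGLE_GENERATIVE_AI_API_KEY", "GROQ_API_KEY"];

// Returns the names of required keys that are unset, blank,
// or still hold the placeholder value from the example above.
function findMissingKeys(env: Record<string, string | undefined>): string[] {
  return REQUIRED_KEYS.filter(function (key) {
    const value = env[key];
    if (value === undefined) return true;
    const trimmed = value.trim();
    // Placeholder detection is a heuristic based on the sample values.
    return trimmed === "" || trimmed.indexOf("your_") === 0;
  });
}
```

In a Node.js or Next.js server context this could be called as `findMissingKeys(process.env)` at startup, logging the returned names before any request is handled.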

5. Start the Development Server

pnpm run dev

Open your browser and navigate to http://localhost:3000 (or whichever port Next.js assigns if 3000 is taken) to use the app.


🛠 Tech Stack

  • Next.js
  • Vercel AI SDK
  • Tailwind CSS
  • Google Gemini 2.0 Flash (via Google)
  • Llama 4 Scout (via Groq)
