SnapText

Fast, lightweight, privacy-focused screenshot + OCR tool for macOS, built with Rust.

Press a hotkey, capture your screen, and instantly extract text — all offline, using Apple's Vision framework on Apple Silicon.

Features

  • Global hotkey (Cmd+Shift+2) to trigger instant screenshot capture
  • Apple Vision OCR — native, hardware-accelerated on M1–M4, offline
  • Tesseract fallback — optional cross-platform OCR engine
  • Floating overlay — shows extracted text with action buttons
  • Annotation tools — arrows, rectangles, highlights, text boxes
  • One-click copy — text or image to clipboard
  • Auto-save — screenshot PNG + extracted text TXT side-by-side
  • Menu bar tray icon — runs as a background agent
  • macOS notifications — capture confirmation with text preview
  • Configurable — hotkey, save path, language, format via ~/.config/snaptext/config.toml

Requirements

  • macOS 12.3+ (ScreenCaptureKit)
  • A Rust toolchain with 2021 edition support
  • Screen Recording permission (prompted on first run)

Quick Start

# Clone and build
git clone https://github.com/ediestel/snap-text.git
cd snap-text
cargo build --release

# Run
cargo run --release

On first launch, macOS will prompt for Screen Recording permission. Grant it in System Settings > Privacy & Security > Screen Recording, then restart the app.

Optional: Tesseract fallback

brew install tesseract leptonica
cargo run --release --features tesseract-ocr
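
One way the `tesseract-ocr` feature could gate the fallback engine is sketched below. Only the `OcrEngine` trait name and the feature name come from this README. The method signatures, the `VisionEngine` and `TesseractEngine` structs, and the `available_engines` and `recognize_with_fallback` helpers are assumptions for illustration, and both engines are stubbed so the sketch compiles on any platform.

```rust
/// Port for OCR backends. The trait name appears in the project layout;
/// the methods here are hypothetical stand-ins.
trait OcrEngine {
    fn name(&self) -> &'static str;
    fn recognize(&self, image: &[u8]) -> Result<String, String>;
}

/// Hypothetical Apple Vision adapter (stubbed: always reports an error).
struct VisionEngine;

impl OcrEngine for VisionEngine {
    fn name(&self) -> &'static str {
        "vision"
    }
    fn recognize(&self, _image: &[u8]) -> Result<String, String> {
        Err("Vision is not wired up in this sketch".to_string())
    }
}

/// Hypothetical Tesseract adapter, compiled only when the feature is enabled.
#[cfg(feature = "tesseract-ocr")]
struct TesseractEngine;

#[cfg(feature = "tesseract-ocr")]
impl OcrEngine for TesseractEngine {
    fn name(&self) -> &'static str {
        "tesseract"
    }
    fn recognize(&self, _image: &[u8]) -> Result<String, String> {
        Err("Tesseract is not wired up in this sketch".to_string())
    }
}

/// Engines in priority order: Vision first, Tesseract only if compiled in.
fn available_engines() -> Vec<Box<dyn OcrEngine>> {
    #[allow(unused_mut)]
    let mut engines: Vec<Box<dyn OcrEngine>> = vec![Box::new(VisionEngine)];
    #[cfg(feature = "tesseract-ocr")]
    engines.push(Box::new(TesseractEngine));
    engines
}

/// Tries each engine in order and returns the first successful result,
/// or the last error if every engine fails.
fn recognize_with_fallback(
    engines: &[Box<dyn OcrEngine>],
    image: &[u8],
) -> Result<String, String> {
    let mut last_err = "no OCR engine available".to_string();
    for engine in engines {
        match engine.recognize(image) {
            Ok(text) => return Ok(text),
            Err(e) => last_err = format!("{}: {}", engine.name(), e),
        }
    }
    Err(last_err)
}
```

With this layout, a build without `--features tesseract-ocr` never compiles or links the Tesseract adapter, which would keep the default binary free of the Homebrew dependencies.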

Usage

| Action | How |
| --- | --- |
| Capture screenshot | Press Cmd+Shift+2 or click tray icon > "Capture Screenshot" |
| Copy extracted text | Click "Copy Text" in overlay |
| Copy screenshot image | Click "Copy Image" in overlay |
| Save both to disk | Click "Save Both" (saves PNG + TXT to ~/Screenshots/SnapText/) |
| Annotate | Click "Annotate" to enable drawing tools |
| Quit | Click tray icon > "Quit SnapText" or Ctrl+C in terminal |
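
The "Save Both" action writes a PNG and a TXT file side by side. A minimal sketch of that pairing follows, using only the standard library. The `paired_paths` and `save_both` helpers and the `stem` parameter are hypothetical, since the actual file-naming scheme is not documented here.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Builds the PNG and TXT paths for one capture so both share a file stem.
/// `stem` is a hypothetical base name (for example, a timestamp).
fn paired_paths(save_dir: &Path, stem: &str) -> (PathBuf, PathBuf) {
    (
        save_dir.join(format!("{}.png", stem)),
        save_dir.join(format!("{}.txt", stem)),
    )
}

/// Writes the image bytes and the extracted text next to each other,
/// creating the save directory if it does not exist yet.
fn save_both(
    save_dir: &Path,
    stem: &str,
    png_bytes: &[u8],
    text: &str,
) -> io::Result<(PathBuf, PathBuf)> {
    fs::create_dir_all(save_dir)?;
    let (png, txt) = paired_paths(save_dir, stem);
    fs::write(&png, png_bytes)?;
    fs::write(&txt, text)?;
    Ok((png, txt))
}
```

Appending the extension with `format!` rather than calling `with_extension` avoids truncating stems that already contain dots, such as time-of-day stamps.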

Configuration

Config file: ~/.config/snaptext/config.toml (auto-created on first run)

hotkey = "CmdOrCtrl+Shift+2"
save_dir = "/Users/you/Screenshots/SnapText"
ocr_language = "en-US"
show_notifications = true
copy_text_on_capture = true
image_format = "png"

Architecture

src/
├── domain/           # Pure business logic (no system dependencies)
│   ├── annotation.rs # Annotation types (Point, Color, Annotation)
│   ├── capture.rs    # ScreenCapturer trait, CaptureTarget, CaptureRequest
│   ├── ocr.rs        # OcrEngine trait, OcrResult
│   └── snapshot.rs   # Snapshot, SnapshotId, SnapshotMetadata
├── app/              # Application layer (state + commands + events)
│   ├── state_machine.rs # AppState enum, AppEvent, transition logic
│   ├── state.rs      # AppContext (state + snapshots + command history)
│   ├── commands.rs   # Command trait + CopyText/CopyImage/SaveSnapshot
│   └── event_bus.rs  # Pub/sub EventBus, SystemEvent, listeners
├── capture.rs        # XcapCapturer (ScreenCaptureKit adapter)
├── ocr/              # OCR engine adapters
│   ├── vision.rs     # Apple Vision framework
│   └── tesseract.rs  # Tesseract (optional)
├── gui/              # Presentation layer
│   ├── overlay.rs    # eframe floating window
│   └── annotation.rs # Drawing tools UI
├── main.rs           # Event loop
├── config.rs         # TOML config
├── clipboard.rs      # Clipboard manager
├── hotkey.rs         # Global hotkey registration
├── tray.rs           # Menu bar tray icon
├── permissions.rs    # Screen recording permission check
└── utils.rs          # File save, notifications
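
The `state_machine.rs` entry above names an `AppState` enum, an `AppEvent` type, and transition logic. A minimal sketch of what such a state machine might look like follows. The two type names come from the listing, but every variant and the `transition` function are hypothetical.

```rust
/// Lifecycle states; the variants are illustrative guesses.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AppState {
    Idle,
    Capturing,
    Recognizing,
    ShowingOverlay,
    Annotating,
}

/// Events that drive the lifecycle; also illustrative.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AppEvent {
    HotkeyPressed,
    CaptureFinished,
    OcrFinished,
    AnnotateClicked,
    OverlayClosed,
    Failed,
}

/// Returns the next state, or `None` when the event is not valid
/// in the current state. Invalid events never change the state.
fn transition(state: AppState, event: AppEvent) -> Option<AppState> {
    use AppEvent::*;
    use AppState::*;
    match (state, event) {
        (Idle, HotkeyPressed) => Some(Capturing),
        (Capturing, CaptureFinished) => Some(Recognizing),
        (Recognizing, OcrFinished) => Some(ShowingOverlay),
        (ShowingOverlay, AnnotateClicked) => Some(Annotating),
        (ShowingOverlay, OverlayClosed) | (Annotating, OverlayClosed) => Some(Idle),
        (Capturing, Failed) | (Recognizing, Failed) => Some(Idle),
        _ => None,
    }
}
```

Keeping `transition` a pure function of `(state, event)` is what makes it possible to test the lifecycle without a screen, a hotkey, or an OCR engine.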

Design patterns: hexagonal architecture (domain ports + infrastructure adapters), state machine (explicit lifecycle), command pattern (undoable actions), event bus (decoupled notifications).
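
As an illustration of the command pattern with undo, the sketch below pairs a `Command` trait with a small history stack. The names `Command` and `CopyText` appear in the listing, but the method signatures, the `String` standing in for the system clipboard, and the `CommandHistory` type are assumptions rather than the project's actual API.

```rust
/// Hypothetical shape of the Command trait: each command can apply
/// itself and later revert its own effect.
trait Command {
    fn execute(&mut self, clipboard: &mut String);
    fn undo(&mut self, clipboard: &mut String);
}

/// Stand-in for CopyText: replaces the clipboard contents and
/// remembers what was there before.
struct CopyText {
    text: String,
    previous: Option<String>,
}

impl Command for CopyText {
    fn execute(&mut self, clipboard: &mut String) {
        self.previous = Some(std::mem::replace(clipboard, self.text.clone()));
    }
    fn undo(&mut self, clipboard: &mut String) {
        if let Some(old) = self.previous.take() {
            *clipboard = old;
        }
    }
}

/// Executed commands, most recent last.
#[derive(Default)]
struct CommandHistory {
    done: Vec<Box<dyn Command>>,
}

impl CommandHistory {
    /// Executes a command and records it so it can be undone later.
    fn run(&mut self, mut cmd: Box<dyn Command>, clipboard: &mut String) {
        cmd.execute(clipboard);
        self.done.push(cmd);
    }

    /// Undoes the most recent command; returns false if there is none.
    fn undo_last(&mut self, clipboard: &mut String) -> bool {
        match self.done.pop() {
            Some(mut cmd) => {
                cmd.undo(clipboard);
                true
            }
            None => false,
        }
    }
}
```

Because each command stores its own undo data, the history only needs to keep the commands in order and does not have to know what any of them does.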

Tests

cargo test

35 tests covering state machine transitions, event bus, AppContext lifecycle, capture cropping, config serialization, OCR engine abstraction, and file utilities.

License

MIT