- 🔍 Advanced OCR Capabilities
- ⚡ Quick Access
  - Menubar integration for easy access
  - Global keyboard shortcuts for instant capture
  - Custom shortcuts for specific OCR configurations
- 🎨 User Interface
  - Clean and modern interface in the menubar
  - Dark and light theme support
  - Overlay panel (Spotlight-style) for choosing which model to use for OCR
- 🛠️ Customization
  - Configurable OCR settings
  - Customizable keyboard shortcuts
  - Auto-start option
  - Sound feedback toggle
- 📋 Clipboard Integration
  - Automatic clipboard copying
  - Quick copy buttons for results
- 📝 History Management
  - Screenshot history tracking
  - Search through past OCR results
  - Option to disable history for privacy
- 🔒 Privacy and Security
  - Local processing of screenshots
  - Optional history disable feature
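
As a rough illustration of how searchable history with a privacy opt-out could behave, here is a minimal sketch using Python's standard `sqlite3` module. The table layout and the names `open_history`, `record_result`, and `search_history` are hypothetical and not taken from the app, which may store and query its history quite differently.

```python
import sqlite3


def open_history(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a history database. The schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS history ("
        "id INTEGER PRIMARY KEY, "
        "created_at TEXT DEFAULT CURRENT_TIMESTAMP, "
        "result TEXT NOT NULL)"
    )
    return conn


def record_result(conn: sqlite3.Connection, result: str,
                  history_enabled: bool = True) -> bool:
    """Store an OCR result unless the user has disabled history."""
    if not history_enabled:
        return False  # nothing is written when history is turned off
    with conn:  # commits on success, rolls back on error
        conn.execute("INSERT INTO history (result) VALUES (?)", (result,))
    return True


def search_history(conn: sqlite3.Connection, query: str) -> list[str]:
    """Return stored results containing the query text, newest first."""
    rows = conn.execute(
        "SELECT result FROM history WHERE result LIKE ? ORDER BY id DESC",
        (f"%{query}%",),
    )
    return [row[0] for row in rows]
```

Note that `%` and `_` inside the query act as `LIKE` wildcards in this sketch; a real implementation would probably escape them.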
- Bun - Fast all-in-one JavaScript runtime & package manager
- uv - Fast Python package installer
- Rust - For Tauri's backend
- Install the JavaScript project dependencies:
bun install
- Python Sidecar Setup:
# Navigate to the src-python directory
cd src-python
# Create a virtual environment
uv venv
# Activate the virtual environment
source .venv/bin/activate
# Sync the dependencies
uv sync
The Python sidecar (`ocr_mlx`) is packaged using box-packager, which:
- Takes the entry point defined in pyproject.toml (`ocr_mlx.endpoint:main`)
- Bundles all dependencies into a single executable
- Places the executable in `src-tauri/binaries` with platform-specific naming
- Uses PyApp to bootstrap the Python environment at runtime
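
The "platform-specific naming" most likely refers to Tauri's sidecar convention, where the binary name carries the Rust target triple as a suffix (plus `.exe` on Windows). The sketch below shows how such a name could be derived. The helper names are hypothetical, and using `ocr_mlx` as the binary's base name is an assumption; the actual packaging script may work differently.

```python
def host_triple_from_rustc(rustc_verbose_output: str) -> str:
    """Extract the host target triple from the output of `rustc -vV`."""
    for line in rustc_verbose_output.splitlines():
        if line.startswith("host:"):
            return line.split(":", 1)[1].strip()
    raise ValueError("no 'host:' line found in rustc output")


def sidecar_filename(name: str, target_triple: str) -> str:
    """Build the file name Tauri looks for: <name>-<target-triple>[.exe]."""
    suffix = ".exe" if "windows" in target_triple else ""
    return f"{name}-{target_triple}{suffix}"
```

For example, on an Apple Silicon Mac the triple is `aarch64-apple-darwin`, so `sidecar_filename("ocr_mlx", "aarch64-apple-darwin")` yields `ocr_mlx-aarch64-apple-darwin`.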
- Build the Python Sidecar:
# This command will:
# 1. Package the Python app using box-packager
# 2. Copy the executable to src-tauri/binaries
# 3. Name it appropriately for your platform
bun run python:package:build
# Initialize the packaged environment
bun run python:package:reset
- Start Development Server:
# This will:
# 1. Start the Svelte dev server
# 2. Launch the Tauri window
# 3. Initialize the Python sidecar for OCR
bun run tauri dev
# Build the complete application:
# 1. Checks if Python sidecar needs rebuilding
# 2. Rebuilds sidecar if needed
# 3. Builds the Tauri application
bun run tauri build
The built application will be available in `src-tauri/target/release`.
- `src/` - SvelteKit frontend code
- `src-tauri/` - Tauri backend code
- Hot reload enabled for both frontend and Rust changes
- `src-python/` - Python OCR service code
- Development mode:
bun run python:package:dev
- API development:
# Generate OpenAPI specs from Python service
bun run python:package:generate-openapi
# Generate TypeScript client from OpenAPI specs
bun run python:package:generate-client
- `bun run icon:generate` - Generate app icons from SVG
- `bun run icon:generate-tray` - Generate animated tray icons
- SvelteKit with Svelte 5 runes for a reactive frontend
- Tauri v2 for native capabilities
- Tailwind CSS + DaisyUI for styling
- MLX-Nougat for efficient OCR processing
- SQLite with Drizzle ORM for data persistence
- TypeScript for type safety
- Python sidecar with FastAPI for OCR service
Note: The Python sidecar is automatically rebuilt when changes are detected in the src-python directory. The build process is managed by the `pretauri` script.
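
How the `pretauri` script detects changes is not described here. One common approach, sketched below purely as an assumption, is to hash the source tree and compare the result with a digest saved after the last successful build. The function names and the stamp file are hypothetical.

```python
import hashlib
from pathlib import Path


def tree_digest(root: Path) -> str:
    """Hash the relative path and contents of every file under root."""
    digest = hashlib.sha256()
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        relative = path.relative_to(root)
        if ".venv" in relative.parts:
            continue  # skip the virtual environment; it is not source code
        digest.update(relative.as_posix().encode())
        digest.update(b"\0")  # separator between file name and contents
        digest.update(path.read_bytes())
    return digest.hexdigest()


def needs_rebuild(source_dir: Path, stamp_file: Path) -> bool:
    """True when the sources differ from the digest saved at the last build."""
    previous = stamp_file.read_text().strip() if stamp_file.exists() else ""
    return tree_digest(source_dir) != previous


def mark_built(source_dir: Path, stamp_file: Path) -> None:
    """Save the current digest after a successful build."""
    stamp_file.write_text(tree_digest(source_dir))
```

The stamp file should live outside the hashed directory; otherwise writing it would itself change the digest.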
Contributions are always welcome!
See `contributing.md` for ways to get started.
- Enhanced OCR Capabilities
  - Support for handwritten text recognition
  - Table structure recognition and export to Excel/CSV
  - Chemical formula recognition
  - Multi-language support with language auto-detection
- Unified processing backend
- User Experience
  - Interactive tutorial for new users
  - Customizable OCR region presets
- Platform Support
  - Windows support
  - Linux support
This project is heavily inspired by the following commercial projects:
- Tauri v2: A framework for building web-based desktop apps using Rust.
- tauri-toolkit: A Tauri plugin toolkit for menubar apps.
- JacobBolda: A YouTube channel with a lot of Tauri content.
- MrJakob: A YouTube channel with a lot of Tauri content.
- OCRS: A lightweight OCR library project.
- mlx-nougat: Nougat implementation for MLX.
- nougat: Facebook Nougat OCR.
- Fluent Icons: A library of icons for Windows (used for the tray icon).