NexusVision is a Python-only AI productivity studio that combines real-time computer vision, gesture creativity, and business workflow automation in one desktop application.
The platform has three integrated studios:
- NexusVision Lab: webcam-based hand, pose, face, object, and motion intelligence.
- HoloCanvas Studio: air drawing and gesture-based creative interaction.
- AutoFlow Studio: AI-powered document workflow automation for PDF/TXT processing, summaries, and exports.
The project is intentionally desktop-first:
- Python only
- PyQt6 dashboard UI
- OpenCV camera processing
- MediaPipe hand, pose, and face intelligence
- NumPy image processing
- PyAutoGUI presentation control with explicit safety toggle
- PyMuPDF document extraction
- python-docx report export
- SQLite local history
- Windows-friendly run flow
Implemented:
- Modern dark PyQt6 desktop dashboard
- Professional grouped sidebar navigation with active state
- Live webcam preview
- Camera selector with refresh support
- Mirror camera mode for natural webcam movement
- Camera performance profiles for 60 FPS-oriented capture
- Professional operator profiles for balanced, performance, low-light, creator, and presentation scenarios
- Low-latency camera mode with MJPG and small buffer requests
- Camera display filters for cleaner or more stylized previews
- Person-focused background blur and custom uploaded background images
- Start and stop camera controls
- FPS counter
- Mode switcher
- Hand tracking mode
- Pose tracking mode
- Face detection mode
- Gesture recognition mode
- Stabilized gesture labels for less jittery gesture recognition
- Motion tracking mode
- Professional HUD metrics for quality, confidence, movement, coverage, and gesture state
- HUD levels and signal tips for cleaner real-time operation
- Mode-specific diagnostics for hand fingers, pose posture, face framing, object density, and motion activity
- Motion heatmap overlay for visual activity tracking
- HoloCanvas Studio air-drawing mode
- HoloCanvas color, brush size, eraser, undo, clear, PNG export, and replay controls
- Presentation controller mode with explicit keyboard-control toggle
- Lightweight OpenCV object detection mode
- AutoFlow Studio PDF/TXT automation
- AutoFlow focused document workspace that hides the camera preview while active
- AutoFlow grounded local summary generation with source evidence and missing-information checks
- AutoFlow document intelligence, quality review, template-specific output, recommended actions, and confidence
- AutoFlow TXT and DOCX exports
- AutoFlow workflow history
- Screenshot capture to assets/screenshots
- SQLite session and screenshot history
- SQLite-backed app settings for camera, effects, HUD, quality gates, presentation, metrics, and AutoFlow defaults
- Session metric snapshots for reviewing tracking quality
- Diagnostics report export from the Settings tab
- Missing webcam error handling
- Ctrl+Q safe quit shortcut
- Unit tests for non-camera core modules
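The stabilized gesture labels listed above can be approximated with a sliding-window majority vote. This is a minimal sketch, not the project's actual implementation: the class name `GestureStabilizer`, the window size, the vote threshold, and the label strings are all assumptions made for illustration.

```python
from collections import Counter, deque


class GestureStabilizer:
    """Smooth per-frame gesture labels with a sliding-window majority vote.

    Hypothetical helper: names and default values are illustrative only.
    """

    def __init__(self, window: int = 7, min_votes: int = 4) -> None:
        self._recent = deque(maxlen=window)  # most recent raw labels
        self._min_votes = min_votes
        self.stable_label = "none"

    def update(self, raw_label: str) -> str:
        """Record one raw label and return the current stable label."""
        self._recent.append(raw_label)
        label, votes = Counter(self._recent).most_common(1)[0]
        # Switch only when one label clearly dominates the window.
        if votes >= self._min_votes:
            self.stable_label = label
        return self.stable_label


# A single misdetected frame ("fist") does not change the reported label.
stabilizer = GestureStabilizer()
frames = ["open_palm"] * 5 + ["fist"] + ["open_palm"] * 2
stable = [stabilizer.update(label) for label in frames]
```

A larger window gives steadier labels at the cost of a slower reaction to real gesture changes, so the two defaults would need tuning against the actual frame rate.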
nexus-vision/
|-- app/
| |-- main.py
| |-- camera/
| |-- vision/
| |-- gestures/
| |-- hand/
| |-- pose/
| |-- face/
| |-- objects/
| |-- motion/
| |-- controllers/
| |-- reports/
| |-- autoflow/
| |-- ui/
| `-- utils/
|-- assets/
| `-- screenshots/
|-- docs/
| |-- ARCHITECTURE.md
| |-- ROADMAP.md
| `-- FEATURES.md
|-- tests/
|-- requirements.txt
|-- README.md
`-- run.py
Create a virtual environment:
python -m venv .venv
.\.venv\Scripts\Activate.ps1
python -m pip install --upgrade pip
python -m pip install -r requirements.txt
Run the app:
python run.py
Run tests:
python -m pytest
The app can scan and select available camera indices through OpenCV. If the preview does not start:
- Confirm the webcam is connected.
- Close other apps that may be using the camera.
- Check Windows privacy permissions for camera access.
- Click Refresh Cameras.
- Select another camera from the camera dropdown.
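Scanning camera indices through OpenCV, as described above, usually means trying each index and keeping those that deliver a frame. The sketch below is an assumption about how that could look, not the app's own code. The function name `scan_cameras` and the `open_capture` parameter are hypothetical, and the parameter exists only so the logic can be exercised without a webcam.

```python
from typing import Callable, List, Optional


def scan_cameras(
    max_index: int = 5,
    open_capture: Optional[Callable[[int], object]] = None,
) -> List[int]:
    """Return the camera indices that open and deliver at least one frame."""
    if open_capture is None:
        import cv2  # imported lazily so the function can be tested without OpenCV

        open_capture = cv2.VideoCapture

    available = []
    for index in range(max_index):
        capture = open_capture(index)
        try:
            if capture.isOpened():
                ok, _frame = capture.read()
                if ok:
                    available.append(index)
        finally:
            capture.release()  # always free the device for other apps
    return available
```

Calling `scan_cameras()` with no arguments probes indices 0 to 4 with `cv2.VideoCapture`. Releasing every capture matters because a device left open by the scan would block the preview.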
The camera profile dropdown includes:
- Performance 60 FPS: 640x480 at 60 FPS target
- Balanced HD: 1280x720 at 30 FPS target
- HD 60 FPS: 1280x720 at 60 FPS target
- Detail 1080p: 1920x1080 at 30 FPS target
The app requests the selected FPS, MJPG encoding, and a small camera buffer when low latency is enabled. Actual FPS still depends on webcam hardware, USB bandwidth, driver support, lighting, and the selected AI mode.
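The profile values and low-latency requests described above map onto standard OpenCV capture properties. The following is a sketch under assumptions: `CameraProfile`, `capture_requests`, and `apply_requests` are hypothetical names, and the buffer size of 1 is a guess at what "small buffer" means here.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class CameraProfile:
    name: str
    width: int
    height: int
    fps: int


# Resolution and FPS targets taken from the profile list above.
PROFILES = {
    p.name: p
    for p in (
        CameraProfile("Performance 60 FPS", 640, 480, 60),
        CameraProfile("Balanced HD", 1280, 720, 30),
        CameraProfile("HD 60 FPS", 1280, 720, 60),
        CameraProfile("Detail 1080p", 1920, 1080, 30),
    )
}


def fourcc(code: str) -> int:
    """Pack a four-character codec code the way cv2.VideoWriter_fourcc does."""
    a, b, c, d = (ord(ch) for ch in code)
    return a | (b << 8) | (c << 16) | (d << 24)


def capture_requests(profile: CameraProfile, low_latency: bool) -> Dict[str, int]:
    """Map OpenCV property names to the values the profile asks for."""
    requests = {
        "CAP_PROP_FRAME_WIDTH": profile.width,
        "CAP_PROP_FRAME_HEIGHT": profile.height,
        "CAP_PROP_FPS": profile.fps,
    }
    if low_latency:
        requests["CAP_PROP_FOURCC"] = fourcc("MJPG")
        requests["CAP_PROP_BUFFERSIZE"] = 1  # assumed "small buffer" value
    return requests


def apply_requests(capture, requests: Dict[str, int]) -> Dict[str, bool]:
    """Send each request to a cv2.VideoCapture; drivers may ignore any of them."""
    import cv2  # lazy import: only needed when talking to a real camera

    return {
        name: bool(capture.set(getattr(cv2, name), value))
        for name, value in requests.items()
    }
```

Because these are requests rather than guarantees, reading the values back with `capture.get(...)` after applying them is the only way to know what the driver actually accepted.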
Camera enhancement presets are available in the Vision tab:
- Off: raw camera feed
- Natural: balanced denoise, contrast, white balance, and light sharpening
- Low light: stronger cleanup for noisy laptop webcams
- Sharp: stronger edge detail for clearer tracking previews
For weak laptop webcams, use Performance 60 FPS or Balanced HD with Natural first. Use Low light when the frame is very grainy, but expect a lower FPS on older hardware.
Mirror camera flips the webcam horizontally before tracking so movement feels natural, like a mirror. Disable it if you need a raw non-mirrored camera feed.
Camera effects are also available in the Vision tab:
- Filters: None, Clean, Studio light, Mono, Warm, Cool, Cinematic, Vintage, High contrast, Soft focus, Document scan, Noir, Blueprint, Thermal, Sketch, Edges, and Neon edges
- Background modes: Original, Blur, and Custom Background
- Custom Background lets you upload your own PNG/JPG/BMP/WebP background image
- Blur controls how strongly the background is blurred while keeping the person in focus
Background blur and custom backgrounds use MediaPipe selfie segmentation. They look best with good front lighting and a clear distance between the person and the wall. If FPS drops, use Performance 60 FPS, keep enhancement on Natural, and avoid heavy filters such as Sketch.
Operator profiles in the Vision tab can quickly tune the workspace:
- Balanced: general tracking and demo use
- Performance: lighter HUD and 60 FPS-oriented capture
- Low Light: webcam cleanup for weaker rooms
- Creator: cinematic look with professional tracking HUD
- Presentation: fast capture and cleaner slide-control monitoring
AutoFlow Studio is integrated as the business automation wing of the app. It supports:
- Select PDF/TXT/MD input
- Choose workflow template
- Extract document text
- Generate grounded deterministic local summary draft
- Build a document intelligence table
- Include source evidence and missing-information notes
- Add quality review checks and template-specific output
- Export TXT
- Export DOCX
- Store workflow history in SQLite
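Storing workflow history in SQLite, the last step above, needs only the standard library. The sketch below shows one possible shape. The table name `workflow_history`, its columns, and the example file names are hypothetical and are not taken from the project's schema.

```python
import sqlite3
from datetime import datetime, timezone


def open_history(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create the (hypothetical) history table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS workflow_history (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            created_at TEXT NOT NULL,
            source_file TEXT NOT NULL,
            template TEXT NOT NULL,
            export_path TEXT
        )
        """
    )
    return conn


def record_run(conn, source_file, template, export_path=None) -> int:
    """Insert one workflow run and return its row id."""
    created_at = datetime.now(timezone.utc).isoformat()
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute(
            "INSERT INTO workflow_history "
            "(created_at, source_file, template, export_path) VALUES (?, ?, ?, ?)",
            (created_at, source_file, template, export_path),
        )
    return cursor.lastrowid


def recent_runs(conn, limit: int = 10):
    """Return the newest runs first."""
    rows = conn.execute(
        "SELECT source_file, template, export_path "
        "FROM workflow_history ORDER BY id DESC LIMIT ?",
        (limit,),
    )
    return rows.fetchall()


conn = open_history()  # in-memory database for the example
record_run(conn, "q3_report.pdf", "Report Automation",
           "assets/autoflow_exports/q3_report.docx")
record_run(conn, "invoice_0042.txt", "Invoice Intelligence")
history = recent_runs(conn)
```

Parameterized queries (`?` placeholders) keep file names with quotes or other special characters from breaking the SQL.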
When the AutoFlow tab is active, NexusVision hides the camera preview and camera metric bar so the workspace focuses on document analysis and output review.
Templates included:
- Report Automation
- Invoice Intelligence
- Thesis Study Assistant
Full product notes are in docs/AUTOFLOW_STUDIO.md.
HoloCanvas Studio is the integrated air-drawing and gesture creative mode. It supports:
- Index finger air drawing
- Brush color selector
- Brush size selector
- Eraser mode
- Clear canvas
- Save/export PNG to assets/canvas_exports
- Stroke replay
- Drawing export history in SQLite
Gesture mapping is documented in docs/GESTURES.md. Settings persistence is documented in docs/SETTINGS.md.
Presentation keyboard control is disabled by default. To use it:
- Select Presentation Controller.
- Check Enable presentation keys.
- Move an open hand horizontally to trigger previous or next slide actions.
HoloCanvas uses the index finger to draw, two fingers for eraser mode, an open palm to clear, and pinch to cycle colors.
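The gesture mapping in the previous sentence can be expressed as a small decision function over per-finger states. This is an illustrative sketch only: the function names, the palette, and the assumption that an upstream hand tracker supplies `fingers_up` and `pinching` are not taken from the project.

```python
from typing import Optional, Sequence

COLORS = ["white", "cyan", "magenta", "yellow"]  # hypothetical palette


def holo_action(fingers_up: Sequence[bool], pinching: bool) -> Optional[str]:
    """Map a hand state to a canvas action.

    fingers_up is (thumb, index, middle, ring, pinky), True when extended.
    """
    _thumb, index, middle, ring, pinky = fingers_up
    if pinching:
        return "cycle_color"
    if all(fingers_up):
        return "clear"  # open palm
    if index and middle and not (ring or pinky):
        return "erase"  # two fingers
    if index and not (middle or ring or pinky):
        return "draw"  # index finger only
    return None


def next_color(current: str) -> str:
    """Advance to the next palette color, wrapping at the end."""
    i = COLORS.index(current)
    return COLORS[(i + 1) % len(COLORS)]
```

Checking the pinch first avoids misreading a thumb-to-index pinch as a draw gesture, since the index finger is often still extended while pinching.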
Screenshots are saved to:
assets/screenshots/
Canvas exports are saved to:
assets/canvas_exports/
AutoFlow exports are saved to:
assets/autoflow_exports/
Session data is stored locally in:
data/nexus_vision.sqlite3
Diagnostics reports are exported to:
assets/reports/
These are runtime artifacts and are ignored by Git except for .gitkeep placeholders.
This project is open source under the MIT License.