xFalzz/nexus-vision

NexusVision

NexusVision is a Python-only AI productivity studio that combines real-time computer vision, gesture creativity, and business workflow automation in one desktop application.

The platform has three integrated studios:

  • NexusVision Lab: webcam-based hand, pose, face, object, and motion intelligence.
  • HoloCanvas Studio: air drawing and gesture-based creative interaction.
  • AutoFlow Studio: AI-powered document workflow automation for PDF/TXT processing, summaries, and exports.

The project is intentionally desktop-first:

  • Python only
  • PyQt6 dashboard UI
  • OpenCV camera processing
  • MediaPipe hand, pose, and face intelligence
  • NumPy image processing
  • PyAutoGUI presentation control with explicit safety toggle
  • PyMuPDF document extraction
  • python-docx report export
  • SQLite local history
  • Windows-friendly run flow

Phase 1 and Phase 2

Implemented:

  • Modern dark PyQt6 desktop dashboard
  • Professional grouped sidebar navigation with active state
  • Live webcam preview
  • Camera selector with refresh support
  • Mirror camera mode for natural webcam movement
  • Camera performance profiles for 60 FPS-oriented capture
  • Professional operator profiles for balanced, performance, low-light, creator, and presentation scenarios
  • Low-latency camera mode with MJPG and small buffer requests
  • Camera display filters for cleaner or more stylized previews
  • Person-focused background blur and custom uploaded background images
  • Start and stop camera controls
  • FPS counter
  • Mode switcher
  • Hand tracking mode
  • Pose tracking mode
  • Face detection mode
  • Gesture recognition mode
  • Stabilized gesture labels for less jittery gesture recognition
  • Motion tracking mode
  • Professional HUD metrics for quality, confidence, movement, coverage, and gesture state
  • HUD levels and signal tips for cleaner real-time operation
  • Mode-specific diagnostics for hand fingers, pose posture, face framing, object density, and motion activity
  • Motion heatmap overlay for visual activity tracking
  • HoloCanvas Studio air-drawing mode
  • HoloCanvas color, brush size, eraser, undo, clear, PNG export, and replay controls
  • Presentation controller mode with explicit keyboard-control toggle
  • Lightweight OpenCV object detection mode
  • AutoFlow Studio PDF/TXT automation
  • AutoFlow focused document workspace that hides the camera preview while active
  • AutoFlow grounded local summary generation with source evidence and missing-information checks
  • AutoFlow document intelligence, quality review, template-specific output, recommended actions, and confidence
  • AutoFlow TXT and DOCX exports
  • AutoFlow workflow history
  • Screenshot capture to assets/screenshots
  • SQLite session and screenshot history
  • SQLite-backed app settings for camera, effects, HUD, quality gates, presentation, metrics, and AutoFlow defaults
  • Session metric snapshots for reviewing tracking quality
  • Diagnostics report export from the Settings tab
  • Missing webcam error handling
  • Ctrl+Q safe quit shortcut
  • Unit tests for non-camera core modules
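The "stabilized gesture labels" item above does not say how the smoothing is done. One common approach is a sliding-window majority vote over per-frame labels. The sketch below assumes that technique; the class name, window size, and vote threshold are illustrative and not taken from the project.

```python
from collections import Counter, deque


class GestureStabilizer:
    """Smooth per-frame gesture labels with a sliding-window majority vote."""

    def __init__(self, window: int = 7, min_votes: int = 4) -> None:
        self._recent = deque(maxlen=window)  # most recent raw labels
        self._min_votes = min_votes
        self.stable_label = "none"

    def update(self, raw_label: str) -> str:
        """Record one raw label and return the current stable label."""
        self._recent.append(raw_label)
        label, votes = Counter(self._recent).most_common(1)[0]
        # Switch only when one label clearly dominates the recent frames.
        if votes >= self._min_votes:
            self.stable_label = label
        return self.stable_label
```

A larger window gives steadier labels at the cost of a slower reaction to a real gesture change.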

Project Structure

nexus-vision/
|-- app/
|   |-- main.py
|   |-- camera/
|   |-- vision/
|   |-- gestures/
|   |-- hand/
|   |-- pose/
|   |-- face/
|   |-- objects/
|   |-- motion/
|   |-- controllers/
|   |-- reports/
|   |-- autoflow/
|   |-- ui/
|   `-- utils/
|-- assets/
|   `-- screenshots/
|-- docs/
|   |-- ARCHITECTURE.md
|   |-- ROADMAP.md
|   `-- FEATURES.md
|-- tests/
|-- requirements.txt
|-- README.md
`-- run.py

Setup on Windows

Create a virtual environment and install the dependencies:

python -m venv .venv
.\.venv\Scripts\Activate.ps1
python -m pip install --upgrade pip
python -m pip install -r requirements.txt

Run the app:

python run.py

Run tests:

python -m pytest

Camera Notes

The app can scan and select available camera indices through OpenCV. If the preview does not start:

  • Confirm the webcam is connected.
  • Close other apps that may be using the camera.
  • Check Windows privacy permissions for camera access.
  • Click Refresh Cameras.
  • Select another camera from the camera dropdown.
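Scanning camera indices through OpenCV, as described above, could look roughly like this. It is a minimal sketch: the function names and the default range of five indices are assumptions, and the probe is injectable so the scan logic can be exercised without a webcam.

```python
from typing import Callable, List


def _opencv_probe(index: int) -> bool:
    """Return True if OpenCV can open this camera index and read one frame."""
    import cv2  # imported lazily so the scan logic can be tested without OpenCV

    cap = cv2.VideoCapture(index)
    try:
        if not cap.isOpened():
            return False
        ok, _frame = cap.read()
        return bool(ok)
    finally:
        cap.release()  # always free the device so other apps can use it


def scan_cameras(max_index: int = 5,
                 probe: Callable[[int], bool] = _opencv_probe) -> List[int]:
    """Probe indices 0..max_index-1 and return the ones that respond."""
    return [index for index in range(max_index) if probe(index)]
```

Releasing each capture after probing matters because a camera held open by the scan would look "in use" when the preview starts.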

The camera profile dropdown includes:

  • Performance 60 FPS: 640x480 at 60 FPS target
  • Balanced HD: 1280x720 at 30 FPS target
  • HD 60 FPS: 1280x720 at 60 FPS target
  • Detail 1080p: 1920x1080 at 30 FPS target

The app requests the selected FPS, MJPG encoding, and a small camera buffer when low latency is enabled. Actual FPS still depends on webcam hardware, USB bandwidth, driver support, lighting, and the selected AI mode.
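As a rough illustration of those requests, the sketch below maps each profile to OpenCV capture properties. The profile values come from the list above; the helper names and the buffer size of 1 are assumptions, not the project's actual code.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class CameraProfile:
    width: int
    height: int
    fps: int


# Values taken from the camera profile list above.
PROFILES = {
    "Performance 60 FPS": CameraProfile(640, 480, 60),
    "Balanced HD": CameraProfile(1280, 720, 30),
    "HD 60 FPS": CameraProfile(1280, 720, 60),
    "Detail 1080p": CameraProfile(1920, 1080, 30),
}


def fourcc(code: str) -> int:
    """Pack a four-character codec code into the integer OpenCV expects."""
    return sum(ord(ch) << (8 * i) for i, ch in enumerate(code))


def capture_requests(profile_name: str, low_latency: bool) -> Dict[str, int]:
    """Map a profile to OpenCV capture property names and requested values."""
    profile = PROFILES[profile_name]
    requests = {
        "CAP_PROP_FRAME_WIDTH": profile.width,
        "CAP_PROP_FRAME_HEIGHT": profile.height,
        "CAP_PROP_FPS": profile.fps,
    }
    if low_latency:
        requests["CAP_PROP_FOURCC"] = fourcc("MJPG")
        requests["CAP_PROP_BUFFERSIZE"] = 1  # assumed "small buffer" value
    return requests


def apply_requests(cap, requests: Dict[str, int]) -> None:
    """Send each request to a cv2.VideoCapture; drivers may ignore any of them."""
    import cv2  # lazy import: only needed when a real capture is configured

    for name, value in requests.items():
        cap.set(getattr(cv2, name), value)
```

These are requests only. Reading the properties back with `cap.get(...)` after applying them is the way to see what the driver actually accepted.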

Camera enhancement presets are available in the Vision tab:

  • Off: raw camera feed
  • Natural: balanced denoise, contrast, white balance, and light sharpening
  • Low light: stronger cleanup for noisy laptop webcams
  • Sharp: stronger edge detail for clearer tracking previews

For weak laptop webcams, use Performance 60 FPS or Balanced HD with Natural first. Use Low light when the frame is very grainy, but expect a lower FPS on older hardware.

Mirror camera flips the webcam horizontally before tracking so movement feels natural, like a mirror. Disable it if you need a raw non-mirrored camera feed.

Camera effects are also available in the Vision tab:

  • Filters: None, Clean, Studio light, Mono, Warm, Cool, Cinematic, Vintage, High contrast, Soft focus, Document scan, Noir, Blueprint, Thermal, Sketch, Edges, and Neon edges
  • Background modes: Original, Blur, and Custom
  • Background lets you upload your own PNG/JPG/BMP/WebP background image
  • Blur controls how strongly the background is blurred while keeping the person in focus

Background blur and custom backgrounds use MediaPipe selfie segmentation. They look best with good front lighting and a clear distance between the person and the wall. If FPS drops, use Performance 60 FPS, keep enhancement on Natural, and avoid heavy filters such as Sketch.

Operator profiles in the Vision tab can quickly tune the workspace:

  • Balanced: general tracking and demo use
  • Performance: lighter HUD and 60 FPS-oriented capture
  • Low Light: webcam cleanup for weaker rooms
  • Creator: cinematic look with professional tracking HUD
  • Presentation: fast capture and cleaner slide-control monitoring

AutoFlow Studio

AutoFlow Studio is integrated as the business automation wing of the app. A typical workflow lets you:

  • Select PDF/TXT/MD input
  • Choose workflow template
  • Extract document text
  • Generate grounded deterministic local summary draft
  • Build a document intelligence table
  • Include source evidence and missing-information notes
  • Add quality review checks and template-specific output
  • Export TXT
  • Export DOCX
  • Store workflow history in SQLite
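The list above does not describe how the grounded, deterministic summary is produced. One plausible approach is extractive: score each sentence by word frequency and return the top sentences together with their source positions as evidence. The sketch below assumes that technique; the function names and the length-based stop-word filter are illustrative only.

```python
import re
from collections import Counter
from typing import List, Tuple


def split_sentences(text: str) -> List[str]:
    """Split on sentence-ending punctuation followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def summarize(text: str, max_sentences: int = 2) -> List[Tuple[int, str]]:
    """Return (source_index, sentence) pairs for the highest-scoring sentences."""
    sentences = split_sentences(text)
    words = re.findall(r"[a-z0-9]+", text.lower())
    freq = Counter(w for w in words if len(w) > 3)  # crude stop-word filter

    def score(sentence: str) -> int:
        unique = set(re.findall(r"[a-z0-9]+", sentence.lower()))
        return sum(freq[w] for w in unique)

    # Sort by score (descending), then by position, so ties always
    # resolve the same way and the output is reproducible.
    ranked = sorted(range(len(sentences)),
                    key=lambda i: (-score(sentences[i]), i))
    chosen = sorted(ranked[:max_sentences])  # restore document order
    return [(i, sentences[i]) for i in chosen]
```

Because every returned sentence is copied verbatim from the input and carries its index, the summary can be traced back to the source text.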

When the AutoFlow tab is active, NexusVision hides the camera preview and camera metric bar so the workspace focuses on document analysis and output review.

Templates included:

  • Report Automation
  • Invoice Intelligence
  • Thesis Study Assistant

Full product notes are in docs/AUTOFLOW_STUDIO.md.

HoloCanvas Studio

HoloCanvas Studio is the integrated air-drawing and gesture creative mode. It supports:

  • Index finger air drawing
  • Brush color selector
  • Brush size selector
  • Eraser mode
  • Clear canvas
  • Save/export PNG to assets/canvas_exports
  • Stroke replay
  • Drawing export history in SQLite

Gesture mapping is documented in docs/GESTURES.md. Settings persistence is documented in docs/SETTINGS.md.

Phase 2 Controls

Presentation keyboard control is disabled by default. To use it:

  1. Select Presentation Controller.
  2. Check Enable presentation keys.
  3. Move an open hand horizontally to trigger previous or next slide actions.
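The steps above can be sketched as a small controller that ignores swipes until the toggle is enabled. This is an assumption-laden illustration: the class name, the 0.25 threshold, and the mapping of rightward movement to the next slide are hypothetical, and only `pyautogui.press` is a real library call.

```python
from typing import Callable, Optional


def _pyautogui_press(key: str) -> None:
    import pyautogui  # lazy import: only needed when keys are actually sent

    pyautogui.press(key)


class SwipeController:
    """Turn horizontal open-hand movement into slide keys, gated by a toggle."""

    def __init__(self, press: Callable[[str], None] = _pyautogui_press,
                 threshold: float = 0.25) -> None:
        self.enabled = False  # keyboard control stays off until enabled
        self._press = press
        self._threshold = threshold  # fraction of frame width (assumed)
        self._anchor_x: Optional[float] = None

    def update(self, hand_x: Optional[float], open_hand: bool) -> Optional[str]:
        """Feed one frame; hand_x is normalized to 0..1. Returns the key sent."""
        if hand_x is None or not open_hand:
            self._anchor_x = None  # hand lost or closed: start over
            return None
        if self._anchor_x is None:
            self._anchor_x = hand_x  # first sighting becomes the swipe origin
            return None
        delta = hand_x - self._anchor_x
        if abs(delta) < self._threshold:
            return None
        self._anchor_x = hand_x  # re-anchor so one swipe fires once
        key = "right" if delta > 0 else "left"
        if self.enabled:
            self._press(key)
            return key
        return None
```

Keeping the `enabled` check next to the key press means gestures are still tracked while the toggle is off, but nothing reaches the keyboard.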

HoloCanvas uses the index finger to draw, two fingers for eraser mode, an open palm to clear, and pinch to cycle colors.
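That mapping can be expressed as a small decision function. The sketch below is hypothetical: it assumes "two fingers" means index plus middle, that a pinch takes priority over the other poses, and it uses a made-up palette.

```python
from typing import Sequence

PALETTE = ["cyan", "magenta", "yellow", "white"]  # hypothetical colors


def canvas_action(fingers_up: Sequence[bool], pinching: bool) -> str:
    """Map finger states (thumb, index, middle, ring, pinky) to an action."""
    _thumb, index, middle, ring, pinky = fingers_up
    if pinching:
        return "cycle_color"
    if all(fingers_up):
        return "clear"  # open palm
    if index and middle and not ring and not pinky:
        return "erase"  # two fingers
    if index and not middle and not ring and not pinky:
        return "draw"  # index finger only
    return "idle"


def next_color(current: str) -> str:
    """Advance to the next palette color, wrapping at the end."""
    return PALETTE[(PALETTE.index(current) + 1) % len(PALETTE)]
```

Checking the most specific poses first keeps an open palm from being misread as a two-finger erase.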

Screenshots and Data

Screenshots are saved to:

assets/screenshots/

Canvas exports are saved to:

assets/canvas_exports/

AutoFlow exports are saved to:

assets/autoflow_exports/

Session data is stored locally in:

data/nexus_vision.sqlite3

Diagnostics reports are exported to:

assets/reports/

All of these paths hold runtime artifacts and are ignored by Git except for .gitkeep placeholders.
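If you need to recreate these folders on a fresh checkout, a helper along the following lines would do it. The directory names come from the paths listed above; the function itself, and the choice to drop a .gitkeep into every folder including data/, are assumptions.

```python
from pathlib import Path
from typing import List

# Relative paths taken from the locations listed above.
RUNTIME_DIRS = (
    "assets/screenshots",
    "assets/canvas_exports",
    "assets/autoflow_exports",
    "assets/reports",
    "data",
)


def ensure_runtime_dirs(root: Path) -> List[Path]:
    """Create each runtime folder under root with a .gitkeep placeholder."""
    folders = []
    for rel in RUNTIME_DIRS:
        folder = root / rel
        folder.mkdir(parents=True, exist_ok=True)  # safe to call repeatedly
        (folder / ".gitkeep").touch(exist_ok=True)
        folders.append(folder)
    return folders
```

Calling `ensure_runtime_dirs(Path("."))` from the repository root would create any missing folders and leave existing ones untouched.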

License

This project is open source under the MIT License.