RBAC-RAG Assistant

RBAC-RAG Assistant is a local-first retrieval-augmented generation application for internal knowledge bases. It combines role-based access control, document retrieval, source citations, and local LLM inference to demonstrate how an organization can expose private documents through a controlled assistant experience.

The stack uses Streamlit for the user interface, FastAPI for the backend API, Qdrant for vector search, sentence-transformers for local embeddings, LangGraph for the optional graph workflow, and Ollama for local chat completion.

Highlights

Role-scoped document retrieval across Engineering, Finance, HR, Marketing, General, and executive access levels.
RAG and LangGraph execution modes with source citations.
Local LLM runtime through Ollama, with no hosted LLM required by default.
Qdrant-backed semantic search with optional Chroma support.
Document explorer filtered by the signed-in user's role.
Admin reindex action for Engineering and C-level users.
Session-local usage analytics for demo and evaluation workflows.
Evaluation fixtures for correctness and RBAC leakage checks.

Architecture

flowchart LR
  subgraph UI["User Interface"]
    Streamlit["Streamlit app"]
  end

  subgraph API["FastAPI Backend"]
    Auth["Basic auth and RBAC"]
    Rag["RAG service"]
    Graph["LangGraph workflow"]
    Indexer["Indexer service"]
  end

  subgraph Data["Knowledge Base"]
    Docs["resources/data/<department>"]
    Qdrant["Qdrant vector store"]
  end

  subgraph LLM["Local Model Runtime"]
    Ollama["Ollama - qwen2.5:3b-instruct"]
  end

  Streamlit -->|"HTTP + Basic Auth"| Auth
  Auth --> Rag
  Auth --> Graph
  Docs --> Indexer
  Indexer --> Qdrant
  Rag --> Qdrant
  Graph --> Qdrant
  Rag --> Ollama
  Graph --> Ollama
  Rag --> Streamlit
  Graph --> Streamlit

Repository Layout

.
|-- app/
|   |-- graph/              # LangGraph RAG workflow
|   |-- services/           # Auth, indexing, retrieval, generation helpers
|   |-- schemas/            # Pydantic models
|   |-- utils/              # File reading and chunking utilities
|   |-- main.py             # FastAPI application
|   `-- policy.py           # Role-to-department access policy
|-- docs/                   # Screenshots
|-- evals/                  # Evaluation cases
|-- pages/                  # Streamlit multipage views
|-- resources/data/         # Sample department documents
|-- scripts/                # CLI utilities
|-- tests/                  # RBAC and evaluation tests
|-- Home.py                 # Streamlit entrypoint
|-- docker-compose.yml      # Qdrant, Ollama, API, and web services
`-- requirements*.txt       # Runtime and development dependencies

Quick Start

Prerequisites

Docker Desktop
Python 3.10 or newer for local development
Several GB of free disk space for the chat model and the embedding dependencies

1. Configure Environment

cp .env.example .env

The defaults use:

VECTOR_DB=qdrant
EMBED_BACKEND=local
ST_MODEL=sentence-transformers/all-MiniLM-L6-v2
OLLAMA_MODEL=qwen2.5:3b-instruct

2. Start Infrastructure

docker compose up -d qdrant ollama
docker compose exec ollama ollama pull qwen2.5:3b-instruct

3. Run the Full Docker Stack

docker compose up --build api web

Open the application at:

Streamlit UI: http://localhost:8501
FastAPI backend: http://localhost:8000
API health check: http://localhost:8000/healthz
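
When scripting the stack, it can help to wait for the API to come up before opening the UI. The following is a minimal sketch, assuming `/healthz` answers with HTTP 200 once the service is ready; the response body is not described here, so it is ignored. The names `http_status`, `wait_for_health`, and `fetch_status` are hypothetical helpers, not part of the project.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_status(url: str, timeout: float = 2.0) -> int:
    """Return the HTTP status of a GET request, or 0 if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (for example 503).
        return exc.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, or timeout.
        return 0


def wait_for_health(
    url: str = "http://localhost:8000/healthz",
    attempts: int = 30,
    delay: float = 1.0,
    fetch_status: Callable[[str], int] = http_status,
) -> bool:
    """Poll the health endpoint until it returns 200 or the attempts run out."""
    for attempt in range(attempts):
        if fetch_status(url) == 200:
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Calling `wait_for_health()` after `docker compose up` returns `True` once the API responds, or `False` after roughly `attempts * delay` seconds. The injectable `fetch_status` argument exists only so the loop can be exercised without a running server.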

4. Local Development Mode

For faster iteration, run Qdrant and Ollama in Docker and run the API and UI from a local virtual environment. First create the environment, start the infrastructure, and ingest the sample documents:

python -m venv .venv
source .venv/bin/activate
pip install -r requirements.dev.txt

docker compose up -d qdrant ollama
docker compose exec ollama ollama pull qwen2.5:3b-instruct

PYTHONPATH=. \
VECTOR_DB=qdrant \
QDRANT_URL=http://localhost:6333 \
OLLAMA_HOST=http://localhost:11434 \
DATA_DIR=resources/data \
python scripts/cli.py ingest

Start the API:

PYTHONPATH=. \
VECTOR_DB=qdrant \
QDRANT_URL=http://localhost:6333 \
OLLAMA_HOST=http://localhost:11434 \
DATA_DIR=resources/data \
AUTO_INDEX=0 \
uvicorn app.main:app --host 127.0.0.1 --port 8000

Start the UI in a second terminal:

PYTHONPATH=. \
API_URL=http://127.0.0.1:8000 \
streamlit run Home.py --server.port=8501 --server.address=127.0.0.1

Demo Users

The project ships with local demo users for role-based testing:

Username	Password	Role
`Peter`	`pete123`	Engineering
`Mariam`	`mariampass123`	Marketing
`Natasha`	`hrpass123`	HR
`Sam`	`financepass`	Finance
`Cathy`	`cathyceo`	C-level
`Emma`	`password`	Employee

These credentials are for local demonstration only. Replace them with BASIC_USERS_JSON or a production identity provider before using this pattern beyond a demo environment.
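
As a rough illustration of replacing the built-in users, the sketch below parses a user table from a JSON string and checks credentials with a constant-time comparison. The JSON shape shown here (username mapped to `password` and `role`) is an assumption made for the example; the schema that BASIC_USERS_JSON actually expects is defined by the project and may differ. `load_users` and `authenticate` are hypothetical names.

```python
import hmac
import json
from typing import Optional

# ASSUMPTION: username -> {"password": ..., "role": ...}.
# Check the project's own settings for the real BASIC_USERS_JSON schema.
EXAMPLE_USERS_JSON = '{"Peter": {"password": "pete123", "role": "engineering"}}'


def load_users(raw: str) -> dict:
    """Parse the user table and reject anything that is not a JSON object."""
    users = json.loads(raw)
    if not isinstance(users, dict):
        raise ValueError("expected a JSON object keyed by username")
    return users


def authenticate(users: dict, username: str, password: str) -> Optional[str]:
    """Return the user's role if the credentials match, otherwise None."""
    record = users.get(username)
    if record is None:
        return None
    # compare_digest avoids leaking how many characters matched through timing.
    if hmac.compare_digest(record["password"].encode(), password.encode()):
        return record["role"]
    return None
```

Even with a constant-time comparison, plaintext passwords in an environment variable remain a demo-only arrangement; a production setup would store hashes or delegate to an identity provider.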

Application Views

Chat

The main chat screen supports both RAG and Graph execution modes, displays generated answers, and shows source files used for retrieval.

Document Explorer

The document explorer shows only documents visible to the current role and allows simple filtering by department and text search.

Admin Tools

Engineering and C-level users can trigger reindexing from the UI. The page also shows API health and runtime model/vector-store settings.

Usage Analytics

The analytics page tracks session-local request counts, engine usage, request status, latency, answer length, and source count. It is intended for local demos and lightweight validation, not durable production reporting.
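
The metrics listed above could be aggregated along these lines. This is a sketch under stated assumptions, not the project's implementation: `RequestRecord` and `summarize` are hypothetical names, and in a Streamlit app the list of records would typically live in `st.session_state` so that it disappears with the session.

```python
from collections import Counter
from dataclasses import dataclass
from statistics import mean


@dataclass
class RequestRecord:
    engine: str         # "rag" or "graph"
    ok: bool            # whether the request succeeded
    latency_s: float    # wall-clock latency in seconds
    answer_chars: int   # length of the generated answer
    source_count: int   # number of cited source files


def summarize(records: list[RequestRecord]) -> dict:
    """Aggregate per-session request records into headline metrics."""
    if not records:
        return {"requests": 0}
    return {
        "requests": len(records),
        "by_engine": dict(Counter(r.engine for r in records)),
        "errors": sum(1 for r in records if not r.ok),
        "avg_latency_s": mean(r.latency_s for r in records),
        "avg_answer_chars": mean(r.answer_chars for r in records),
        "avg_source_count": mean(r.source_count for r in records),
    }
```

Because the records are held in memory only, a page refresh or new session starts from an empty list, which matches the session-local scope described above.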

API Endpoints

Method	Path	Description
`GET`	`/healthz`	Basic service health check
`GET`	`/version`	Runtime model and vector database settings
`GET`	`/login`	Validate Basic Auth credentials
`POST`	`/chat/rag`	Single-turn RAG answer generation
`POST`	`/chat/graph`	LangGraph answer generation with thread memory
`POST`	`/admin/reindex`	Rebuild vector index from `resources/data`

Example RAG request:

curl -u Peter:pete123 \
  -H "Content-Type: application/json" \
  -d '{"message":"What are the key components of the engineering architecture?"}' \
  http://localhost:8000/chat/rag
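
The same request can be issued from Python with only the standard library. This sketch mirrors the curl call above: Basic Auth header, JSON body with a `message` field, POST to `/chat/rag`. The helper names `build_rag_request` and `send` are hypothetical, and the response is assumed to be JSON; its fields are not documented here, so the sketch returns the parsed payload without interpreting it.

```python
import base64
import json
import urllib.request


def build_rag_request(
    base_url: str, username: str, password: str, message: str
) -> urllib.request.Request:
    """Build a POST /chat/rag request with HTTP Basic Auth and a JSON body."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    body = json.dumps({"message": message}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/chat/rag",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
    )


def send(request: urllib.request.Request, timeout: float = 120.0):
    """Send the request and return the decoded JSON payload (assumed to be JSON)."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the stack running, `send(build_rag_request("http://localhost:8000", "Peter", "pete123", "What are the key components of the engineering architecture?"))` performs the equivalent of the curl command. The generous timeout allows for slow local model inference.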

Data Model and Access Control

Documents are organized by department:

resources/data/
|-- engineering/
|-- finance/
|-- general/
|-- hr/
`-- marketing/

The access policy is defined in app/policy.py, which maps each role to the set of departments it may read. Retrieval filters enforce that boundary before an answer is generated. Intent detection adds a second check: when a request appears to target a department outside the user's access, it can deny the request or soften it by answering only from allowed and general documents.
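
A role-to-department policy of this kind can be sketched as a plain mapping plus a filter. Everything below is illustrative: the specific role names and department sets are assumptions (the real mapping lives in app/policy.py), the `department` payload key is assumed, and `allowed_departments`, `filter_chunks`, and `qdrant_filter` are hypothetical helpers.

```python
from typing import Dict, FrozenSet, List

ALL_DEPARTMENTS: FrozenSet[str] = frozenset(
    {"engineering", "finance", "general", "hr", "marketing"}
)

# ASSUMED mapping for illustration only; see app/policy.py for the real policy.
ROLE_POLICY: Dict[str, FrozenSet[str]] = {
    "engineering": frozenset({"engineering", "general"}),
    "finance": frozenset({"finance", "general"}),
    "hr": frozenset({"hr", "general"}),
    "marketing": frozenset({"marketing", "general"}),
    "c-level": ALL_DEPARTMENTS,
    "employee": frozenset({"general"}),
}


def allowed_departments(role: str) -> FrozenSet[str]:
    """Return the departments a role may read; unknown roles get nothing."""
    return ROLE_POLICY.get(role.strip().lower(), frozenset())


def filter_chunks(chunks: List[dict], role: str) -> List[dict]:
    """Drop retrieved chunks whose department is outside the role's policy."""
    allowed = allowed_departments(role)
    return [c for c in chunks if c.get("department") in allowed]


def qdrant_filter(role: str) -> dict:
    """Build a Qdrant-style payload filter limited to the allowed departments.

    Assumes each point stores its department under the "department" key.
    Callers should skip the query entirely when the allowed set is empty.
    """
    return {
        "must": [
            {"key": "department", "match": {"any": sorted(allowed_departments(role))}}
        ]
    }
```

Pushing the restriction into the vector-store query, rather than filtering after retrieval, keeps out-of-scope chunks from ever reaching the prompt; the post-retrieval `filter_chunks` is shown only as a defensive second layer.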

Configuration

Most runtime behavior is controlled through .env.

Variable	Purpose
`VECTOR_DB`	Vector backend: `qdrant` or `chroma`
`QDRANT_URL`	Qdrant endpoint used by the API
`DATA_DIR`	Document root used during indexing
`EMBED_BACKEND`	Embedding provider: `local` or `openai`
`ST_MODEL`	Local sentence-transformer model
`OLLAMA_HOST`	Ollama server URL
`OLLAMA_MODEL`	Chat model used for answer generation
`AUTO_INDEX`	Reindex once on API startup when set to `1`
`RBAC_INTENT`	Enable department intent detection
`RBAC_INTENT_SOFT`	Soften cross-department intent to allowed/general docs
`RERANK_CE`	Enable optional cross-encoder reranking
`PASSAGE_SELECTION`	Enable optional LLM passage selection

See .env.example for the full set of supported settings.
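
For orientation, the variables in the table could be read into a typed settings object roughly as follows. This is a sketch, not the project's loader: `Settings` and `load_settings` are hypothetical names, the URL and `DATA_DIR` defaults are borrowed from the local development commands above, and treating an unset `AUTO_INDEX` as off is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    vector_db: str
    qdrant_url: str
    data_dir: str
    embed_backend: str
    st_model: str
    ollama_host: str
    ollama_model: str
    auto_index: bool


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from an environment mapping, validating enumerated values."""
    vector_db = env.get("VECTOR_DB", "qdrant")
    if vector_db not in {"qdrant", "chroma"}:
        raise ValueError(f"VECTOR_DB must be 'qdrant' or 'chroma', got {vector_db!r}")

    embed_backend = env.get("EMBED_BACKEND", "local")
    if embed_backend not in {"local", "openai"}:
        raise ValueError(f"EMBED_BACKEND must be 'local' or 'openai', got {embed_backend!r}")

    return Settings(
        vector_db=vector_db,
        qdrant_url=env.get("QDRANT_URL", "http://localhost:6333"),
        data_dir=env.get("DATA_DIR", "resources/data"),
        embed_backend=embed_backend,
        st_model=env.get("ST_MODEL", "sentence-transformers/all-MiniLM-L6-v2"),
        ollama_host=env.get("OLLAMA_HOST", "http://localhost:11434"),
        ollama_model=env.get("OLLAMA_MODEL", "qwen2.5:3b-instruct"),
        # Documented behavior: reindex on startup when set to "1".
        auto_index=env.get("AUTO_INDEX", "0") == "1",
    )
```

Failing fast on an unrecognized `VECTOR_DB` or `EMBED_BACKEND` surfaces a typo in .env at startup instead of at the first query.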

Testing

Install development dependencies:

pip install -r requirements.dev.txt

Run the test suite:

pytest -q

The tests cover RBAC leakage scenarios and evaluation cases under evals/. Some tests require the embedding model and vector index to be available.
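
An RBAC leakage check boils down to asserting that no cited source comes from a department outside the caller's policy. The sketch below shows one way to express that, assuming sources are reported as file paths under `resources/data/<department>/`; the actual source format, the file names used in the usage note, and the helpers `department_of` and `find_leaks` are all assumptions made for illustration.

```python
from pathlib import PurePosixPath
from typing import Iterable, List, Optional, Set


def department_of(source: str, data_root: str = "resources/data") -> Optional[str]:
    """Return the department directory of a source path, or None if it cannot be determined."""
    try:
        relative = PurePosixPath(source).relative_to(data_root)
        return relative.parts[0]
    except (ValueError, IndexError):
        # Path is outside the data root, or is the data root itself.
        return None


def find_leaks(
    sources: Iterable[str], allowed: Set[str], data_root: str = "resources/data"
) -> List[str]:
    """List sources outside the allowed departments.

    Sources whose department cannot be determined count as leaks (fail closed).
    """
    return [s for s in sources if department_of(s, data_root) not in allowed]
```

In a pytest case this could be used as `assert find_leaks(sources, {"hr", "general"}) == []`, where `sources` is whatever the chat endpoint returned for an HR user asking a finance question.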

Operational Notes

The first local run may download the embedding model from Hugging Face and the chat model from Ollama.
Docker Compose persists Qdrant, Ollama, and Hugging Face cache data in named volumes.
The checked-in documents are synthetic sample data for demonstration and evaluation.
Demo authentication uses plaintext credentials and should not be used as-is in production.
For production use, add durable auth, encrypted secrets management, persistent analytics, monitoring, and document-level ACLs.

Roadmap

Document-level access control.
Durable analytics and audit logging.
CI workflow for tests and linting.
Optional hosted LLM and embedding providers.
Additional evaluation coverage for retrieval quality and access-boundary behavior.
