Urban Transportation Analysis

Research and implementation of transportation analytics for Lahore and Riyadh utilizing deep learning and real-time data processing.

Project Structure

Lahore Traffic Monitoring: Spatio-temporal prediction and route optimization.
Riyadh Transportation Analysis: Graph-based multi-modal system integration.

Quick Start

Infrastructure

docker compose up -d

Verification

export PYTHONPATH=$PYTHONPATH:$(pwd)
source venv/bin/activate
python shared/utils/verify_infra.py

Development Progress

Day 1: Infrastructure and Data Ingestion

Achievements:

Configured Python 3.12 environment with specialized geospatial dependencies.
Deployed PostgreSQL (PostGIS), Redis, and Kafka infrastructure via Docker.
Extracted and processed 145,998 road nodes and 380,264 edges for the Lahore District.

Commands Executed:

# 1. Start Infrastructure
docker compose up -d

# 2. Enable PostGIS Extension
docker exec lahore_postgres psql -U traffic_user -d lahore_traffic -c "CREATE EXTENSION IF NOT EXISTS postgis;"

# 3. Verify Connectivity
export PYTHONPATH=$PYTHONPATH:$(pwd)
source venv/bin/activate
python shared/utils/verify_infra.py

# 4. Run Data Ingestion
python lahore/src/data_pipeline/ingestion.py

Verification Output:

INFO - PostgreSQL connection successful!
INFO - Redis connection successful!
INFO - Kafka connection successful!
INFO - All infrastructure components are online and reachable!

Day 2: Network Graph and Feature Engineering

Achievements:

Constructed a hierarchical networkx.MultiDiGraph representing the Lahore road network.
Developed a feature extraction pipeline for spatial road attributes (length, density, hierarchy).
Validated 100% geometry integrity and topological connectivity.

Commands Executed:

# 1. Build Network Graph
python lahore/src/data_pipeline/graph.py

# 2. Extract Features
python lahore/src/data_pipeline/features.py

# 3. Validate Data Quality
python lahore/src/data_pipeline/validation.py

Verification Output:

INFO - Graph constructed: 145998 nodes, 380264 edges.
INFO - All geometries are spatially valid.
INFO - Data quality validation complete!
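
To make the connectivity check concrete, here is a minimal standard-library sketch. It is not the contents of validation.py: the toy edge list, the node names, and the helper weakly_connected_components are hypothetical, and a plain dict stands in for networkx.MultiDiGraph.

```python
from collections import defaultdict, deque

def weakly_connected_components(nodes, edges):
    """Group nodes into components, ignoring edge direction."""
    neighbours = defaultdict(set)
    for u, v in edges:  # parallel edges collapse harmlessly into the sets
        neighbours[u].add(v)
        neighbours[v].add(u)

    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), set()
        while queue:  # breadth-first search from an unvisited node
            node = queue.popleft()
            component.add(node)
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        components.append(component)
    return components

# Hypothetical toy network: A->B has two parallel directed edges, D is isolated.
nodes = ["A", "B", "C", "D"]
edges = [("A", "B"), ("A", "B"), ("B", "C")]
components = weakly_connected_components(nodes, edges)
largest = max(components, key=len)
coverage = len(largest) / len(nodes)  # share of nodes in the main component
```

A coverage of 1.0 would correspond to a fully connected network; any smaller value points at isolated fragments worth inspecting.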

Day 3: Simulation and Deep Learning Architecture

Achievements:

Implemented a CNN-LSTM hybrid model for simultaneous spatial and temporal feature learning.
Developed a real-time traffic simulation engine using the Kafka streaming protocol.
Resolved loss instability through feature scaling and automated temporal imputation.
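
The README does not say which scaling or imputation method was used. As an illustration only, the sketch below assumes forward-fill imputation followed by min-max scaling; the function names and the sample speeds are hypothetical.

```python
def forward_fill(series):
    """Replace None with the last observed value; leading gaps take the first observation."""
    first = next((x for x in series if x is not None), None)
    if first is None:
        raise ValueError("series has no observations")
    filled, last = [], first
    for x in series:
        if x is not None:
            last = x
        filled.append(last)
    return filled

def min_max_scale(values):
    """Rescale values linearly into [0, 1]; a constant series maps to zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical speed readings (km/h) with missing time steps.
speeds = [None, 40.0, None, None, 60.0, 20.0]
filled = forward_fill(speeds)   # [40.0, 40.0, 40.0, 40.0, 60.0, 20.0]
scaled = min_max_scale(filled)  # [0.5, 0.5, 0.5, 0.5, 1.0, 0.0]
```

Keeping inputs in a bounded range is a common way to stabilize training loss, since unscaled features of very different magnitudes tend to dominate the gradients.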

Commands Executed:

# 1. Start Traffic Simulation (Background)
python lahore/src/data_pipeline/traffic_simulator.py &
python lahore/src/data_pipeline/traffic_consumer.py &

# 2. Run Model Training
python lahore/src/ml_models/train.py

Day 4: Model Optimization and Uncertainty

Achievements:

Introduced Traffic Transformer architecture with self-attention for sequence modeling.
Implemented Gated Ensemble to fuse CNN-LSTM and Transformer predictions.
Integrated Uncertainty Quantification using Monte Carlo Dropout to estimate prediction confidence.
Upgraded evaluation suite with industry-standard metrics: MAPE, RMSE, and MAE.
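
Monte Carlo Dropout keeps dropout active at inference time, runs several stochastic forward passes, and reads the spread of the outputs as an uncertainty estimate. The sketch below shows the idea on a toy linear model in plain Python; the weights, inputs, and function names are hypothetical and unrelated to the project's trained networks.

```python
import random
import statistics

def forward(x, weights, drop_p, rng):
    """One stochastic pass of a toy linear model with inverted dropout on its inputs."""
    total = 0.0
    for xi, wi in zip(x, weights):
        if rng.random() >= drop_p:  # keep this input with probability 1 - drop_p
            total += wi * xi / (1.0 - drop_p)  # rescale so the expectation is unchanged
    return total

def mc_dropout_predict(x, weights, drop_p=0.2, passes=200, seed=0):
    """Return the mean prediction and its standard deviation across stochastic passes."""
    rng = random.Random(seed)
    samples = [forward(x, weights, drop_p, rng) for _ in range(passes)]
    return statistics.fmean(samples), statistics.pstdev(samples)

x = [0.5, 0.8, 0.3]   # hypothetical scaled features
w = [1.0, -0.5, 2.0]  # hypothetical weights
mean, std = mc_dropout_predict(x, w)
```

A larger std signals a less confident prediction; with drop_p=0.0 every pass is identical and the spread collapses to zero.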

Commands Executed:

# 1. Run Advanced Model Training (CNN-LSTM & Transformer)
python lahore/src/ml_models/train.py

# 2. Generate Performance Comparison Plots
python lahore/src/visualization/compare_results.py

Model Performance Comparison

Statistical Distribution	Performance Metrics (MAPE)

Baseline speed profile across network.	Comparison of CNN-LSTM vs Transformer performance.

Benchmark Results (Evaluation Set):

CNN-LSTM: MAE: 0.92, MAPE: 245.1%, RMSE: 1.18
Transformer: MAE: 0.88, MAPE: 238.4%, RMSE: 1.12
Key Insight: The Transformer outperforms the CNN-LSTM baseline on all three metrics within 5 epochs, lowering MAPE by about 7 percentage points (245.1% to 238.4%), a relative reduction of roughly 3%.
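
For reference, the three metrics can be computed as in the sketch below. The sample values are hypothetical. They also illustrate a general property of MAPE: it can exceed 100% when some true values sit close to zero, for example after feature scaling. Whether that explains the figures above is an assumption, not something the benchmark confirms.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mape(y_true, y_pred):
    """Mean absolute percentage error; undefined where a true value is 0."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical scaled targets: the last true value is close to zero.
y_true = [0.5, 1.0, 0.01]
y_pred = [0.4, 1.1, 0.05]

print(f"MAE={mae(y_true, y_pred):.3f}  "
      f"RMSE={rmse(y_true, y_pred):.3f}  "
      f"MAPE={mape(y_true, y_pred):.1f}%")
```

Here the absolute errors are small, yet the near-zero target alone contributes a 400% error term, which is why MAE and RMSE are usually read alongside MAPE.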

Technical Architecture

graph TD
    subgraph "Data Sources"
        OSM["OpenStreetMap (OSMnx)"]
        SIM["Traffic Simulator (Python)"]
    end

    subgraph "Real-time Ingestion (Kafka)"
        T_PROD["Traffic Producer"]
        K_BUS["Kafka Topic: lahore_traffic_updates"]
        T_CONS["Traffic Consumer"]
        
        SIM --> T_PROD
        T_PROD --> K_BUS
        K_BUS --> T_CONS
    end

    subgraph "Storage Layer"
        PG["PostgreSQL + PostGIS"]
        RD["Redis (Real-time Cache)"]
        
        OSM -->|Static Road Network| PG
        T_CONS -->|Historical Training Data| PG
        T_CONS -->|Live Traffic State| RD
    end

    subgraph "Deep Learning Core (Day 3)"
        DS["Data Scaler & Imputer"]
        CNN["CNN Layer (Spatial Correlations)"]
        LSTM["LSTM Layer (Temporal Trends)"]
        PRED["Traffic Forecast"]

        PG --> DS
        DS --> CNN
        CNN --> LSTM
        LSTM --> PRED
    end

Model Validation and Performance

Spatial Network Coverage	Temporal Traffic Trends

Graph-based spatial distribution of road segments.	Mean speed variance across the simulation sequence.

Speed Distribution	Volume Distribution

Simulated speed profile (km/h).	Vehicle volume density per segment.

Geospatial Congestion Analysis

Static geospatial analysis showing traffic density and road hierarchy across Lahore. Red segments indicate simulated bottlenecks.

Verification Results:

Training Stability: MSE loss reduced from 0.057 to 0.050 over initial calibration.
Data Throughput: Successfully processed sequences for 20,448 nodes with multi-dimensional features.
Pipeline Integrity: End-to-end verification from Kafka ingestion to model inference confirmed.

Day 5: Route Optimization

Achievements:

Implemented Dijkstra and A* algorithms with haversine heuristic for shortest path finding.
Developed Genetic Algorithm for multi-objective optimization (balancing time and distance).
Created Congestion-Aware Routing Engine that dynamically adjusts paths based on traffic conditions.
Built modular lahore/src/optimization package with full benchmarking suite.

Commands Executed:

# 1. Run Route Optimization Verification
python lahore/src/optimization/verify_routing.py

Benchmark Results:

Algorithm	Avg Execution Time	Notes
Dijkstra	590.99 ms	Baseline shortest path
A*	64.36 ms	9x faster with haversine heuristic
Genetic Algorithm	182.64 ms	Multi-objective optimization
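
The gap between Dijkstra and A* comes from the heuristic: A* orders the frontier by cost so far plus a lower bound on the remaining distance, so it expands fewer nodes. The sketch below shows this on a hypothetical 5x5 grid near Lahore's coordinates. It is not the project's optimization package; search, haversine_km, and the grid are illustrative names, and edge lengths are set to the great-circle distance so the heuristic never overestimates.

```python
import heapq
import math

def haversine_km(a, b):
    """Great-circle distance between two (lat, lon) points in kilometres."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def search(graph, coords, source, target, use_heuristic):
    """Dijkstra when use_heuristic is False, A* when it is True."""
    def h(n):
        return haversine_km(coords[n], coords[target]) if use_heuristic else 0.0

    best = {source: 0.0}
    frontier = [(h(source), source)]
    done, expanded = set(), 0
    while frontier:
        _, node = heapq.heappop(frontier)
        if node in done:
            continue
        done.add(node)
        expanded += 1
        if node == target:
            return best[node], expanded
        for nxt, length in graph[node]:
            cost = best[node] + length
            if cost < best.get(nxt, math.inf):
                best[nxt] = cost
                heapq.heappush(frontier, (cost + h(nxt), nxt))
    return math.inf, expanded

# Hypothetical grid: nodes 0.01 degrees apart, linked to their 4-neighbours.
SIZE = 5
coords = {(i, j): (31.50 + 0.01 * i, 74.30 + 0.01 * j)
          for i in range(SIZE) for j in range(SIZE)}
graph = {n: [] for n in coords}
for (i, j) in coords:
    for di, dj in ((0, 1), (1, 0)):
        m = (i + di, j + dj)
        if m in coords:
            d = haversine_km(coords[(i, j)], coords[m])
            graph[(i, j)].append((m, d))
            graph[m].append(((i, j), d))

dijkstra_cost, dijkstra_expanded = search(graph, coords, (0, 0), (0, 4), False)
astar_cost, astar_expanded = search(graph, coords, (0, 0), (0, 4), True)
print(dijkstra_expanded, astar_expanded)  # A* expands fewer nodes for the same cost
```

Both searches return the same path cost; only the number of expanded nodes differs, which is the effect the timings above reflect at city scale.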

Verification Results:

Congestion Diversion: Successfully reroutes traffic around simulated bottlenecks.
Time Saved: ~3,759 cost units when avoiding congested segments.
Graph Coverage: Tested on 145,998 nodes and 380,264 edges.
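
One way to picture congestion-aware routing is to keep the search algorithm fixed and swap the edge cost. The sketch below assumes a cost of length multiplied by the ratio of free-flow speed to current speed; that formula, the attribute names, and the four-node graph are hypothetical and may differ from the engine's actual cost model.

```python
import heapq

def shortest_path(graph, source, target, weight):
    """Dijkstra over graph[u] = [(v, attrs), ...]; weight(attrs) returns the edge cost."""
    best = {source: 0.0}
    parent = {}
    frontier = [(0.0, source)]
    done = set()
    while frontier:
        cost, node = heapq.heappop(frontier)
        if node in done:
            continue
        done.add(node)
        if node == target:
            break
        for nxt, attrs in graph.get(node, []):
            new_cost = cost + weight(attrs)
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                parent[nxt] = node
                heapq.heappush(frontier, (new_cost, nxt))
    if target not in best:
        raise ValueError("target is unreachable")
    path, node = [target], target
    while node != source:  # walk parents back to the source
        node = parent[node]
        path.append(node)
    return path[::-1], best[target]

# Hypothetical network: lengths in km, speeds in km/h.
graph = {
    "A": [("B", {"length": 2.0, "free": 50.0, "now": 50.0}),
          ("C", {"length": 3.0, "free": 50.0, "now": 50.0})],
    "B": [("D", {"length": 2.0, "free": 50.0, "now": 10.0})],  # simulated bottleneck
    "C": [("D", {"length": 3.0, "free": 50.0, "now": 50.0})],
}

static_route, static_cost = shortest_path(graph, "A", "D", lambda e: e["length"])
aware_route, aware_cost = shortest_path(
    graph, "A", "D", lambda e: e["length"] * e["free"] / e["now"])
```

The static search picks A, B, D because it is shortest by length, while the congestion-aware cost diverts through C once the B to D segment slows down.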

Route Optimization Visualizations

Algorithm Performance	Route Comparison

A* is 9x faster than Dijkstra with haversine heuristic.	Static (red) vs Congestion-Aware (green) routes.

Network congestion heatmap showing ~75,000 simulated bottlenecks across Lahore's road network.

📍 Landmark Routing Demos

We've selected recognizable Lahore landmarks to demonstrate the real-world utility of the congestion-aware routing engine.

Minar-e-Pakistan → Gaddafi Stadium	Data Darbar → DHA Phase 5

Route optimization between North and Central Lahore.	Long-distance optimization from Old City to DHA.

Tip

You can open the interactive .html files in lahore/data/plots/ to zoom, pan, and explore these routes in detail.

Day 6: Advanced Analytics

Achievements:

Implemented Anomaly Detection using Isolation Forests and Z-score statistics (100% detection rate on incident simulations).
Developed Bottleneck Identification algorithms to map persistent infrastructure stress points.
Created Temporal Trend Analyzers to track diurnal cycles and compare weekday/weekend behavior.
Built Emergency Routing Engine that favors wide boulevards and high-capacity roads for mission-critical response.
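
The Z-score half of the anomaly detector can be sketched with the standard library alone (the Isolation Forest half needs a third-party library and is not shown). The threshold of 3.0, the function name, and the speed readings are assumptions chosen for illustration.

```python
import statistics

def zscore_anomalies(values, threshold=3.0):
    """Return the indices whose value lies more than `threshold` std devs from the mean."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return []  # a constant series has no outliers
    return [i for i, v in enumerate(values) if abs(v - mean) / std > threshold]

# Hypothetical speeds (km/h) for one segment; index 12 simulates an incident.
speeds = [52, 50, 51, 49, 53, 50, 48, 51, 52, 50, 49, 51, 12, 50, 52]
anomalies = zscore_anomalies(speeds)  # [12]
```

Because a large outlier also inflates the standard deviation it is measured against, short series limit how extreme a Z-score can get; a baseline computed from incident-free history is one way around that.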

Commands Executed:

# Run Advanced Analytics Verification
python lahore/src/analytics/verify_analytics.py

Analytics Visualizations:

Diurnal Traffic Cycle	Weekday vs. Weekend Speeds

Identified peaks at 9:00 AM and 5:00 PM.	Weekends show higher average speeds during evening hours.

Verification Highlights:

Incident Detection: Successfully identified sudden speed drops as anomalies.
Priority Routing: Emergency paths correctly prioritize high-capacity arterials over narrow shortcuts.
Network Stress: Identified top 10 segments requiring potential infrastructure upgrades.

Next: Day 7 - Streaming Analytics (Flink/Spark Streaming)

Day 7: Streaming Analytics

Achievements:

Built Faust-based Stream Processor with 5-minute tumbling windows for real-time traffic aggregation.
Implemented Sliding Window Feature Extractor computing rolling statistics (mean, std, congestion index).
Created Online Predictor module for live ML inference from Kafka streams.
Verified end-to-end pipeline: 288 messages → 288 features → 288 predictions at sub-millisecond latency.
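
A tumbling window assigns every message to exactly one fixed, non-overlapping interval. The sketch below shows that bucketing for a 5-minute window in plain Python rather than Faust, processing a finished list instead of a live stream; the message fields ts, segment, and speed are hypothetical.

```python
from collections import defaultdict
from statistics import fmean

WINDOW_SECONDS = 300  # 5-minute tumbling window

def tumbling_mean_speed(messages, window=WINDOW_SECONDS):
    """Average speed per (window start, segment); each message lands in one window."""
    buckets = defaultdict(list)
    for msg in messages:
        start = msg["ts"] - msg["ts"] % window  # floor the timestamp to its window start
        buckets[(start, msg["segment"])].append(msg["speed"])
    return {key: fmean(vals) for key, vals in sorted(buckets.items())}

messages = [
    {"ts": 0, "segment": "s1", "speed": 40.0},
    {"ts": 120, "segment": "s1", "speed": 50.0},
    {"ts": 299, "segment": "s1", "speed": 60.0},
    {"ts": 300, "segment": "s1", "speed": 20.0},  # first message of the next window
]
result = tumbling_mean_speed(messages)
# {(0, 's1'): 50.0, (300, 's1'): 20.0}
```

With one message every 5 minutes, a full day yields 24 * 12 = 288 windows, which is consistent with the 288-message verification run.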

Commands Executed:

# Start Docker services
docker compose up -d

# Run Streaming Verification
python lahore/src/streaming/verify_streaming.py

Streaming Architecture:

Simulator → [lahore_traffic_updates] → Processor → [lahore_traffic_features] → Predictor → [lahore_traffic_predictions]

Verification Results:

Stage	Messages	Latency
Simulator	288	—
Feature Extractor	288	<0.01ms
Predictor	288	<0.01ms
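
The one-to-one ratio between messages and features follows from a sliding window that emits updated statistics on every arrival. Below is a minimal sketch of such an extractor. The README does not define the congestion index, so the formula used here (1 minus mean speed over free-flow speed, floored at 0) is an assumption, as are the class and parameter names.

```python
from collections import deque
from statistics import fmean, pstdev

class SlidingWindowFeatures:
    """Rolling statistics over the most recent `size` speed readings."""

    def __init__(self, size, free_flow_speed):
        self.window = deque(maxlen=size)  # old readings fall out automatically
        self.free_flow_speed = free_flow_speed

    def update(self, speed):
        """Add one reading and return the features for the current window."""
        self.window.append(speed)
        mean = fmean(self.window)
        return {
            "mean": mean,
            "std": pstdev(self.window),
            # Assumed definition: 0 = free flow, values near 1 = standstill.
            "congestion_index": max(0.0, 1.0 - mean / self.free_flow_speed),
        }

extractor = SlidingWindowFeatures(size=3, free_flow_speed=60.0)
features = [extractor.update(s) for s in (60.0, 30.0, 30.0, 30.0)]
```

Four readings produce four feature dicts, and the last one reflects only the three most recent speeds.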

Performance Visualization:

The latency plot shows one point per real-time prediction. Response time stays nearly flat and very low (<0.01 ms) across the 288-message run, which suggests per-message latency does not grow as the stream progresses.

Live Monitoring

You can watch the system heartbeat in real-time by running the monitor script:

python lahore/src/streaming/live_monitor.py

Day 8: Advanced ML Pipeline (Adaptive Intelligence)

Achievements:

Implemented an Online Learning Engine for incremental model fine-tuning via active Kafka data streams.
Developed a Statistical Drift Detector (Kolmogorov-Smirnov test) to identify real-time shifts in traffic distributions.
Created an A/B Testing Framework (Champion vs Challenger) to validate model performance before deployment.
Integrated automated model promotion logic based on relative error metrics (MAE/RMSE).

Technical Highlights:

Model Adaptation: System fine-tunes weights incrementally without requiring full dataset retraining.
Drift Detection: Monitors input features for distribution drift, helping keep the model reliable during seasonal or weather-related changes.
Performance Validation: Baseline "Champion" compared against a "Challenger" (Transformer-based) showing significant accuracy gains.
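
The two-sample Kolmogorov-Smirnov test compares the empirical distribution functions of a reference window and a recent window. The sketch below computes the statistic D and an asymptotic p-value with the standard library only; in practice a statistics library would normally be used. The sample sizes, the Gaussian speed distributions, and the 0.001 cut-off in the usage are assumptions for illustration.

```python
import bisect
import math
import random

def ks_two_sample(a, b):
    """Two-sample KS statistic D and an asymptotic p-value."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    # D is the largest gap between the two empirical CDFs.
    d = max(
        abs(bisect.bisect_right(a, x) / n - bisect.bisect_right(b, x) / m)
        for x in a + b
    )
    ne = n * m / (n + m)  # effective sample size
    lam = (math.sqrt(ne) + 0.12 + 0.11 / math.sqrt(ne)) * d
    if lam < 0.1:
        return d, 1.0  # the series below converges slowly here; p is effectively 1
    # Tail of the Kolmogorov distribution as a truncated alternating series.
    p = 2 * sum((-1) ** (k - 1) * math.exp(-2 * k * k * lam * lam)
                for k in range(1, 101))
    return d, min(1.0, max(0.0, p))

rng = random.Random(42)
baseline = [rng.gauss(50, 5) for _ in range(500)]   # hypothetical free-flow speeds
congested = [rng.gauss(35, 5) for _ in range(500)]  # hypothetical shifted speeds
d, p = ks_two_sample(baseline, congested)
drift_detected = p < 0.001
```

A small p-value means the two windows are unlikely to come from the same distribution, which is the signal a detector like this would use to trigger retraining.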

Performance Visuals:

Left (Data Drift): This illustrates the AI's "spidey-sense." When the blue (normal) and red (congested) curves don't match, the system knows the city's traffic behavior has shifted.
Right (A/B Testing): The green line shows the newer model being significantly more accurate than the old one (gray), saving us from wrong predictions.

Commands Executed:

# Run ML Pipeline Verification
python lahore/src/analytics/verify_ml_pipeline.py

Adaptive Architecture:

Stream → [Drift Detector] → [Online Predictor (Champion/Challenger)] → [A/B Tester]
              ↑ (retrain)
          [Online Learner]

Verification Highlights:

Drift Detection: Successfully identified shifts with p < 0.001.
Model Promotion: Identified shadow model with 65% better accuracy.
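
Promotion logic based on relative error can be as small as the sketch below. It assumes MAE as the deciding metric and a 5% minimum relative gain; both the threshold and the sample predictions are hypothetical and are not the values behind the result above.

```python
def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def should_promote(y_true, champion_pred, challenger_pred, min_relative_gain=0.05):
    """Promote the challenger only if its MAE beats the champion's by a relative margin."""
    champion_mae = mae(y_true, champion_pred)
    challenger_mae = mae(y_true, challenger_pred)
    if champion_mae == 0:
        return False, 0.0  # a perfect champion cannot be improved on
    gain = (champion_mae - challenger_mae) / champion_mae
    return gain >= min_relative_gain, gain

# Hypothetical speeds (km/h) and predictions from both models on the same traffic.
y_true = [40.0, 50.0, 60.0, 30.0]
champion = [45.0, 44.0, 66.0, 27.0]    # MAE 5.0
challenger = [42.0, 48.0, 63.0, 29.0]  # MAE 2.0
promote, gain = should_promote(y_true, champion, challenger)
```

Requiring a minimum gain rather than any improvement guards against promoting a model on noise alone.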

Day 9: System Integration

Achievements:

Developed a production-grade FastAPI backend to expose traffic intelligence to the world.
Implemented WebSockets for a real-time "Traffic Heartbeat" stream.
Integrated Redis Caching to ensure routing and prediction results are served in milliseconds.
Created an automated verification suite for all API endpoints.

Technical Highlights:

API Gateway: Unified access point for ML predictions and graph-based route optimization.
Real-time Updates: Push-based architecture for broadcasting Kafka traffic events to connected clients.
Performance Optimization: Redis implementation reduced routing latency to sub-150ms for large-scale graph traversals.
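
The latency gain comes from the cache-aside pattern: look up the result first and run the expensive graph traversal only on a miss. The sketch below uses an in-memory class as a stand-in for Redis so it runs without a server; TTLCache, cached_route, the key format, and the 30-second expiry are all hypothetical.

```python
import time

class TTLCache:
    """In-memory stand-in for a Redis-style get / set-with-expiry store."""

    def __init__(self, clock=time.monotonic):
        self._store = {}
        self._clock = clock

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:  # entry is stale, drop it
            del self._store[key]
            return None
        return value

    def setex(self, key, ttl_seconds, value):
        self._store[key] = (value, self._clock() + ttl_seconds)

def cached_route(cache, origin, destination, compute_route, ttl_seconds=30):
    """Serve a route from the cache, computing and storing it on a miss."""
    key = f"route:{origin}:{destination}"
    route = cache.get(key)
    if route is None:
        route = compute_route(origin, destination)
        cache.setex(key, ttl_seconds, route)
    return route

calls = []

def slow_route(origin, destination):
    """Placeholder for an expensive graph search."""
    calls.append((origin, destination))
    return [origin, "via", destination]

cache = TTLCache()
first = cached_route(cache, "A", "B", slow_route)
second = cached_route(cache, "A", "B", slow_route)  # served from the cache
```

A short expiry matters for routing because congestion changes quickly; a stale cached route can be worse than a slower fresh one.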

Commands Executed:

# 1. Start the API Server
source venv/bin/activate
export PYTHONPATH=$PYTHONPATH:$(pwd)
uvicorn lahore.src.api.main:app --host 0.0.0.0 --port 8000

# 2. Run API Verification
python lahore/src/api/verify_api.py

Verification Highlights:

Health Check: ✅ All modules (Redis, Kafka, Graph) reported healthy.
Routing Latency: Successfully calculated an 11km route across 145k nodes in <150ms.
Live Bridge: WebSocket successfully received and broadcast real-time drift alerts.

Day 10: Interactive Dashboard

Achievements:

Built an interactive Streamlit dashboard for real-time city-scale traffic monitoring.
Developed a high-resolution map component for visualizing live road-level congestion.
Created a routing UI for testing congestion-aware pathfinding algorithms.
Integrated ML performance monitoring (Drift and A/B testing) into the frontend.

Technical Highlights:

State-of-the-art UI: Clean, responsive interface for traffic analysts and urban planners.
Dynamic Visualization: Color-coded road segments reflecting real-time speed predictions.
Unified Monitoring: Centralized access to back-end stats, routing metrics, and model health.

Commands Executed:

# 1. Activate the Environment and Install Dashboard Dependencies
source venv/bin/activate
pip install streamlit-folium httpx pydeck requests

# 2. Start the Dashboard
export PYTHONPATH=$PYTHONPATH:$(pwd)
streamlit run lahore/dashboard/app.py

Next: Day 11 - Advanced Visualization (Phase 4: Optimization)
