An advanced analytics and scouting dashboard built to support performance analysis and recruitment at Pogoń Szczecin during the 2025/26 season. The project combines match performance metrics from FotMob with market valuations, contract details, and player-agency data from Transfermarkt.
- Multi-Source Ingestion & Event Scraping (Phase 1):
  - Real-time extraction of 37 performance metrics from FotMob and Transfermarkt.
  - Event data scraper capturing spatial coordinate charts: Sofascore /average-positions via Playwright Chromium and FotMob /shotmap from next-data props.
- ETL Pipeline & SQLite Database (Phase 2):
  - Normalization of count statistics to per-90-minute rates.
  - Calculation of positional percentiles (GK, DF, MF, FW) across the entire Ekstraklasa to eliminate role-based bias.
  - Storage in an optimized relational SQLite database (db/scouting.db).
- Scouting Similarity Engine (Phase 3):
  - Cosine similarity matching using z-standardized vectors against dynamic candidate pools.
  - Scouting filters by age, market value, and position group, with zero-variance protection.
- Visualizations & Interactive Dashboard (Phase 4):
  - Interactive Plotly radar charts displaying raw metrics alongside percentile levels.
  - Football Manager-style Plotly scatter matrices for league-wide quadrant placement.
  - Season shot maps (xG-scaled circles on half-pitch layouts) and computed passing networks.
- Integration Testing & Case Study (Phase 5):
  - Pytest pipeline checking data flow from SQLite querying to similarity matching.
  - Case study on Thomas Thomasberg's vertical 4-4-2 tactics.
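The Phase 2 normalization steps can be illustrated with a small standard-library sketch. It is not the code in etl.py: the function names, the dictionary keys (`name`, `position`, `shots_p90`), and the mid-rank tie handling are assumptions made for this example only.

```python
from collections import defaultdict


def per_90(count, minutes):
    """Scale a raw count to a per-90-minutes rate (None if no minutes played)."""
    if minutes <= 0:
        return None
    return count * 90.0 / minutes


def positional_percentiles(players, metric):
    """Return {player name: percentile in [0, 100]} ranked within each position group.

    Players are compared only against others in the same group (GK, DF, MF, FW),
    so a defender is never ranked against a forward.
    """
    groups = defaultdict(list)
    for player in players:
        if player[metric] is not None:
            groups[player["position"]].append(player)

    result = {}
    for members in groups.values():
        values = [m[metric] for m in members]
        for m in members:
            below = sum(v < m[metric] for v in values)
            equal = sum(v == m[metric] for v in values)
            # Mid-rank percentile (an assumed convention): ties share one score.
            result[m["name"]] = 100.0 * (below + 0.5 * equal) / len(values)
    return result


# Hypothetical players, used only to exercise the functions.
players = [
    {"name": "A", "position": "FW", "shots_p90": per_90(10, 900)},
    {"name": "B", "position": "FW", "shots_p90": per_90(30, 1350)},
    {"name": "C", "position": "MF", "shots_p90": per_90(9, 810)},
]
percentiles = positional_percentiles(players, "shots_p90")
```

Ranking inside each position group is what removes the role-based bias: a midfielder's shot rate is judged against other midfielders rather than against forwards.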
- data/raw/ – Raw JSON data files.
- db/ – Local SQLite database files (scouting.db).
- docs/ – Technical documentation, project schedule (WBS), and data acquisition analysis.
- src/ – Python source modules:
  - fetch_data.py – Multi-source scrapers and squad data aggregator.
  - etl.py – SQLite schema management and data normalization.
  - similarity.py – Cosine similarity calculation engine.
  - fetch_events.py – SQL query interfaces for shot maps and average positions.
  - viz.py – Matplotlib/Plotly tactical visualization charts.
  - scrape_and_insert_sofascore.py – Playwright scraper for average player positions.
  - scrape_fotmob_shots.py – Scraper for shot coordinates and xG.
- tests/ – Unit tests verifying ETL, matching logic, and viz outputs.
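To make the Phase 3 matching concrete, here is a minimal standard-library sketch of cosine similarity over z-standardized vectors with a zero-variance guard. It does not reproduce similarity.py: the function names, the `pool` structure, and the choice to zero out constant columns are assumptions for illustration.

```python
import math
from statistics import mean, pstdev


def z_standardize(rows):
    """Column-wise z-scores; a column with zero variance becomes all zeros."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    stds = [pstdev(col) for col in columns]
    return [
        # Zero-variance protection: avoid dividing by a standard deviation of 0.
        [(x - m) / s if s > 0 else 0.0 for x, m, s in zip(row, means, stds)]
        for row in rows
    ]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def most_similar(target, pool, top_n=3):
    """Rank the other players in `pool` ({name: metric list}) by similarity to `target`."""
    names = list(pool)
    z = dict(zip(names, z_standardize([pool[n] for n in names])))
    scores = [(n, cosine_similarity(z[target], z[n])) for n in names if n != target]
    return sorted(scores, key=lambda item: item[1], reverse=True)[:top_n]


# Hypothetical candidate pool; the third metric is constant on purpose.
pool = {
    "Target": [1.0, 2.0, 5.0],
    "A": [1.1, 2.1, 5.0],
    "B": [3.0, 0.5, 5.0],
    "C": [2.0, 1.0, 5.0],
}
ranking = most_similar("Target", pool)
```

Because standardization is recomputed for whatever pool is passed in, filtering the candidates by age, market value, or position group before calling `most_similar` changes the z-scores, which is one plausible reading of "dynamic candidate pools".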
- Python 3.13 or newer
- Virtual environment tool (venv)
- Clone the repository:

      git clone https://github.com/SirSail/pogon-analytics.git
      cd pogon-analytics

- Create and activate a virtual environment:

      python -m venv .venv
      # Windows (PowerShell):
      .\.venv\Scripts\Activate.ps1

- Install dependencies:

      pip install -r requirements.txt
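To run the Streamlit dashboard locally, execute:

    streamlit run app.py

To run the test suite (Windows PowerShell), set PYTHONPATH to the project root and call pytest from the virtual environment:

    $env:PYTHONPATH="."
    .\.venv\Scripts\pytest -v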