SirSail/pogon-analytics

Pogoń Szczecin Performance & Recruitment Dashboard (2025/26)

Python 3.13+ | License: MIT | Code Style: Black

An analytics and scouting dashboard for performance analysis and recruitment support at Pogoń Szczecin during the 2025/26 season. The project combines match performance metrics from FotMob with market valuations, contract details, and player agency data from Transfermarkt.


🚀 Key Features

  1. Multi-Source Ingestion & Event Scraping (Phase 1):

    • Real-time extraction of 37 performance metrics from FotMob and Transfermarkt.
    • Event-data scrapers that capture spatial coordinates: Sofascore /average-positions responses fetched via Playwright (Chromium), and FotMob /shotmap data read from the page's next-data props.
  2. ETL Pipeline & SQLite Database (Phase 2):

    • Normalization of count statistics to per-90 minutes rates.
    • Calculation of positional percentiles (GK, DF, MF, FW) across the entire Ekstraklasa league to eliminate role-based bias.
    • Storage in an optimized relational SQLite database (db/scouting.db).
  3. Scouting Similarity Engine (Phase 3):

    • Cosine similarity matching using z-standardized vectors against dynamic candidate pools.
    • Scouting filters by age, market value, and position group with zero-variance protection.
  4. Visualizations & Interactive Dashboard (Phase 4):

    • Interactive Plotly Radar charts displaying raw metrics alongside percentile levels.
    • Football Manager-style Plotly Scatter matrices for league-wide quadrant placement.
    • Season Shotmaps (xG-scaled circles on half-pitch layouts) and computed Passing Networks.
  5. Integration Testing & Case Study (Phase 5):

    • Pytest pipeline checking data flow from SQLite querying to similarity matching.
    • Case study on Thomas Thomasberg's vertical 4-4-2 tactics.
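The Phase 2 normalization steps can be illustrated with a short, standard-library-only sketch. It is not the code in etl.py: the function names, the dictionary keys (`position_group`, `goals`, `minutes`), and the sample rows are all hypothetical, and the percentile definition used here (share of same-position players at or below a value) is just one common choice.

```python
from collections import defaultdict


def per_90(count: float, minutes: float) -> float:
    """Convert a raw count into a per-90-minutes rate."""
    if minutes <= 0:
        return 0.0  # avoid dividing by zero for players without minutes
    return count * 90.0 / minutes


def positional_percentiles(players, metric):
    """Percentile (0-100) of `metric` within each player's position group."""
    groups = defaultdict(list)
    for p in players:
        groups[p["position_group"]].append(p[metric])

    result = {}
    for p in players:
        peers = groups[p["position_group"]]
        at_or_below = sum(1 for v in peers if v <= p[metric])
        result[p["name"]] = 100.0 * at_or_below / len(peers)
    return result


# Hypothetical sample rows, not real player data.
players = [
    {"name": "A", "position_group": "FW", "goals": 10, "minutes": 1800},
    {"name": "B", "position_group": "FW", "goals": 4, "minutes": 900},
    {"name": "C", "position_group": "DF", "goals": 2, "minutes": 2700},
    {"name": "D", "position_group": "DF", "goals": 1, "minutes": 450},
]
for p in players:
    p["goals_p90"] = per_90(p["goals"], p["minutes"])

pct = positional_percentiles(players, "goals_p90")
```

Ranking within the position group is what removes role-based bias: in this sample, defender D is compared only with defender C, not with the forwards.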

📂 Project Structure

  • data/raw/ – Raw JSON data files.
  • db/ – Local SQLite database files (scouting.db).
  • docs/ – Technical documentation, project schedule (WBS), and data acquisition analysis.
  • src/ – Python source modules:
    • fetch_data.py – Multi-source scrapers and squad data aggregator.
    • etl.py – SQLite schema management and data normalization.
    • similarity.py – Cosine similarity calculation engine.
    • fetch_events.py – SQL query interfaces for shot maps and average positions.
    • viz.py – Matplotlib/Plotly tactical visualization charts.
    • scrape_and_insert_sofascore.py – Playwright scraper for average player positions.
    • scrape_fotmob_shots.py – Scraper for shot coordinates and xG.
  • tests/ – Unit tests verifying ETL, matching logic, and viz outputs.
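To make the role of similarity.py concrete, here is a minimal sketch of cosine similarity over z-standardized metric vectors, including a guard for zero-variance columns. It assumes nothing about the module's real interface: `z_standardize`, `cosine_similarity`, `rank_similar`, and the sample pool are hypothetical names and data used only for illustration.

```python
import math
from statistics import mean, pstdev


def z_standardize(rows):
    """Column-wise z-scores; a zero-variance column becomes all zeros."""
    scaled_cols = []
    for col in zip(*rows):
        mu, sigma = mean(col), pstdev(col)
        if sigma == 0:
            # Zero-variance protection: the metric carries no information.
            scaled_cols.append([0.0] * len(col))
        else:
            scaled_cols.append([(v - mu) / sigma for v in col])
    return [list(r) for r in zip(*scaled_cols)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_similar(target_name, pool):
    """Rank every other player in `pool` by similarity to the target."""
    names = list(pool)
    scaled = dict(zip(names, z_standardize([pool[n] for n in names])))
    scores = [
        (n, cosine_similarity(scaled[target_name], scaled[n]))
        for n in names
        if n != target_name
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical per-90 vectors; the third metric is identical for everyone.
pool = {
    "Target": [0.50, 2.0, 1.0],
    "X": [0.45, 1.9, 1.0],
    "Y": [0.10, 0.5, 1.0],
}
ranked = rank_similar("Target", pool)
```

Standardizing over the candidate pool, rather than over a fixed reference set, means the scores depend on which players pass the scouting filters, which fits the "dynamic candidate pools" described in the feature list.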

🛠️ Installation & Setup

Prerequisites

  • Python 3.13 or newer
  • Virtual environment tool (venv)

Environment Configuration

  1. Clone the repository:

    git clone https://github.com/SirSail/pogon-analytics.git
    cd pogon-analytics
  2. Create and activate a virtual environment:

    python -m venv .venv
    # Windows (PowerShell):
    .\.venv\Scripts\Activate.ps1
    # macOS/Linux:
    source .venv/bin/activate
  3. Install dependencies:

    pip install -r requirements.txt

Running the App

To run the Streamlit dashboard locally, execute:

streamlit run app.py

Running QA Tests

# Windows (PowerShell), run from the repository root:
$env:PYTHONPATH="."
.\.venv\Scripts\pytest -v
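For orientation, a test in the spirit of the Phase 5 pipeline checks might look like the sketch below. It builds an in-memory SQLite database and verifies that a scouting filter on position group, age, and market value returns the expected candidate pool. The `players` table, its columns, and the helper names are assumptions made for this example, not the actual schema of db/scouting.db or the contents of tests/.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical players table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE players ("
        "name TEXT PRIMARY KEY, position_group TEXT, "
        "age INTEGER, market_value_eur INTEGER)"
    )
    conn.executemany(
        "INSERT INTO players VALUES (?, ?, ?, ?)",
        [
            ("Player A", "FW", 29, 800_000),
            ("Player B", "FW", 22, 600_000),
            ("Player C", "FW", 21, 2_500_000),
            ("Player D", "MF", 20, 300_000),
        ],
    )
    conn.commit()
    return conn


def candidate_pool(conn, position_group, max_age, max_value_eur):
    """Return names matching the scouting filters, sorted alphabetically."""
    rows = conn.execute(
        "SELECT name FROM players "
        "WHERE position_group = ? AND age <= ? AND market_value_eur <= ? "
        "ORDER BY name",
        (position_group, max_age, max_value_eur),
    ).fetchall()
    return [row[0] for row in rows]


def test_candidate_pool_respects_filters():
    conn = build_demo_db()
    try:
        # Only Player B is a forward aged 24 or under and worth at most 1M.
        assert candidate_pool(conn, "FW", 24, 1_000_000) == ["Player B"]
    finally:
        conn.close()
```

Using `:memory:` keeps the test independent of any database file on disk, so it can run before the scrapers have produced data.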
