Swiss trains are famously punctual; German trains famously aren't. This project quantifies the gap using each country's own open data — and finds it's even larger than the reputation suggests. At the busiest 20 stations of each network, Swiss trains are punctual ~96% of the time vs. ~78% for German trains, and every single major Swiss station outperforms even the best major German one.
Scope: analysis covers 30 days of SBB Ist-Daten (September 2024) compared against the German piebro/deutsche-bahn-data window. Punctuality includes cancellations as "not punctual"; the 6-minute threshold matches DB's long-distance definition, the 3-minute threshold matches SBB's internal target.
At Germany's and Switzerland's 20 busiest train stations over the analysis window:
- Swiss trains are punctual roughly 96% of the time at the standard 6-minute threshold; German trains, 78% — a gap of ~18 percentage points.
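- The gap widens to ~28 percentage points at the stricter 3-minute threshold (SBB 91%, DB 63%). DB clears the easier bar much more often than the harder one: about 15% of its arrivals fall between 3 and 6 minutes late, versus about 5% for SBB, which points to a large mass of small-but-noticeable delays.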
- Every one of SBB's 20 busiest stations outperforms the best of DB's 20 busiest. DB's most punctual large station (87.6%) still falls short of SBB's least punctual one (89.7%).
- Mean arrival delay is ~3.7× higher in Germany (3.81 min vs 1.04 min).
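The headline figures above can be cross-checked with a few lines of arithmetic. This is only a sketch that re-derives the gaps and the delay ratio from the rounded numbers quoted in the list; the variable names are illustrative and not taken from the notebook.

```python
# Rounded headline figures copied from the summary list above.
punctual_6min = {"SBB": 96.0, "DB": 78.0}   # % punctual at the 6-minute threshold
punctual_3min = {"SBB": 91.0, "DB": 63.0}   # % punctual at the 3-minute threshold
mean_delay_min = {"SBB": 1.04, "DB": 3.81}  # mean arrival delay in minutes

# Punctuality gap in percentage points at each threshold.
gap_6min = punctual_6min["SBB"] - punctual_6min["DB"]
gap_3min = punctual_3min["SBB"] - punctual_3min["DB"]

# Share of stop events that pass the 6-minute bar but fail the 3-minute bar.
band_db = punctual_6min["DB"] - punctual_3min["DB"]
band_sbb = punctual_6min["SBB"] - punctual_3min["SBB"]

# How many times larger the German mean delay is.
delay_ratio = mean_delay_min["DB"] / mean_delay_min["SBB"]

print(f"gap at 6 min: {gap_6min:.0f} pp, gap at 3 min: {gap_3min:.0f} pp")
print(f"3-to-6-minute band: DB {band_db:.0f}%, SBB {band_sbb:.0f}%")
print(f"mean delay ratio: {delay_ratio:.1f}x")
```

Because the inputs are rounded, the results are approximate in the same way the bullet points are.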
A focused two-stat comparison between Deutsche Bahn and Schweizerische Bundesbahnen, using:
- German data: piebro/deutsche-bahn-data, a community-maintained Hugging Face dataset (CC BY 4.0, underlying data © Deutsche Bahn), covering ~100 busiest German stations from 2024-07 onwards.
- Swiss data: SBB Ist-Daten on opentransportdata.swiss, daily CSV files with actual vs. planned arrival/departure times across the SBB network.
The comparison is restricted to two questions:
- What does the arrival-delay distribution look like in each country?
- How does punctuality vary across the 20 busiest stations in each network?
Nothing else. No machine learning, no time-of-day analysis, no seasonal effects. The scope is deliberately small.
Punctuality is computed at two thresholds: <3 minutes (SBB Group's internal target) and <6 minutes (DB's long-distance definition). Both are reported for both countries so the comparison is honest about which definitional regime is being used. Cancellations are counted as "not punctual" in both datasets, which differs from each operator's official reporting. The analysis window is 2024-07 onwards, matching the German dataset's coverage. The German data is pre-flattened with a computed delay_in_min column; for Swiss data the delay is computed as (actual_arrival - scheduled_arrival) in minutes, clipped to non-negative. Station selection is "top 20 by stop-event volume in the analysis window" within each country independently — comparing each network's busy hubs to the other's.
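The rules in this paragraph can be sketched in plain Python. The example below is an illustration under stated assumptions, not the notebook's actual code: the event tuples, the timestamp format, and the function names (`delay_minutes`, `top_stations`, `punctuality`) are all hypothetical, and the sample data is made up. It shows the delay being clipped to non-negative, cancellations counted as not punctual, and stations ranked by stop-event volume.

```python
from collections import Counter
from datetime import datetime

FMT = "%Y-%m-%d %H:%M"  # assumed timestamp format for this sketch

# Made-up stop events: (station, scheduled arrival, actual arrival or None if cancelled).
events = [
    ("Basel SBB", "2024-09-02 08:00", "2024-09-02 08:01"),
    ("Basel SBB", "2024-09-02 08:10", "2024-09-02 08:14"),
    ("Basel SBB", "2024-09-02 08:20", None),                # cancelled
    ("Bern",      "2024-09-02 08:30", "2024-09-02 08:28"),  # early, clipped to 0
    ("Bern",      "2024-09-02 08:40", "2024-09-02 08:47"),
    ("Olten",     "2024-09-02 08:50", "2024-09-02 08:50"),
]

def delay_minutes(scheduled, actual):
    """Arrival delay in minutes, clipped to non-negative; None for a cancelled stop."""
    if actual is None:
        return None
    delta = datetime.strptime(actual, FMT) - datetime.strptime(scheduled, FMT)
    return max(0.0, delta.total_seconds() / 60)

def top_stations(events, n):
    """The n stations with the most stop events in the window."""
    counts = Counter(station for station, _, _ in events)
    return [station for station, _ in counts.most_common(n)]

def punctuality(events, stations, threshold_min):
    """Share of stop events with delay < threshold; cancellations count as not punctual."""
    selected = [e for e in events if e[0] in stations]
    on_time = 0
    for _, scheduled, actual in selected:
        delay = delay_minutes(scheduled, actual)
        if delay is not None and delay < threshold_min:
            on_time += 1
    return on_time / len(selected)

busiest = top_stations(events, n=2)   # the analysis itself uses n=20 per country
p3 = punctuality(events, busiest, 3)
p6 = punctuality(events, busiest, 6)
print(busiest, f"<3 min: {p3:.0%}", f"<6 min: {p6:.0%}")
```

Keeping the cancelled stop in the denominator is what makes the resulting figures lower than an operator's headline numbers would be for the same events.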
db-vs-sbb-punctuality/
├── README.md
├── requirements.txt
├── LICENSE
├── .gitignore
├── notebooks/
│   └── analysis.ipynb        ← the full comparison
├── scripts/
│   ├── download_de_data.py   ← German data from Hugging Face
│   └── download_ch_data.py   ← Swiss data from opentransportdata.swiss
├── figures/                  ← PNG charts (git-tracked)
└── data/
    ├── de/                   ← German Parquet files (git-ignored)
    └── ch/                   ← Swiss CSV files (git-ignored)
# Setup
git clone https://github.com/ajitagupta/db-vs-sbb-punctuality.git
cd db-vs-sbb-punctuality
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
# Download both datasets
python scripts/download_de_data.py
python scripts/download_ch_data.py # this one takes longer — see script for date range
# Run the notebook
jupyter notebook notebooks/analysis.ipynb

The Swiss download is the slower step (one CSV per day, ~30 MB each). The script downloads a configurable range; see python scripts/download_ch_data.py --help.
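Since the Swiss data comes as one file per day, the size of a download follows directly from the date range. The sketch below is a rough estimate only: it walks the calendar days of the September 2024 window and multiplies by the ~30 MB per-file figure quoted above. The helper name `daily_dates` is hypothetical, and the real script's arguments and file naming are not reproduced here.

```python
from datetime import date, timedelta

MB_PER_DAILY_CSV = 30  # rough per-file size quoted above; real files vary

def daily_dates(start, end):
    """Yield every calendar date from start to end, inclusive."""
    current = start
    while current <= end:
        yield current
        current += timedelta(days=1)

# 30-day window in September 2024, as described in the scope note.
days = list(daily_dates(date(2024, 9, 1), date(2024, 9, 30)))
estimated_mb = len(days) * MB_PER_DAILY_CSV

print(f"{len(days)} daily files, roughly {estimated_mb} MB in total")
```

An estimate like this is useful mainly for checking free disk space before starting a long download.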
- The Swiss dataset is much larger than the German one in raw row count (entire SBB network vs. ~100 stations). For a fair top-20 comparison this isn't a problem, but absolute distribution comparisons are sensitive to the coverage difference.
- DB and SBB define "punctual" differently in their official reporting (6 min vs 3 min thresholds; differing cancellation treatment). This project shows both thresholds and treats cancellations uniformly. Numbers therefore differ from each operator's headline figures.
- The Swiss Ist-Daten represents the SBB network, not the full Swiss rail system: BLS, SOB, and foreign operators on Swiss tracks are partially included; private regional operators are not.
- No causal claims. This is descriptive. Reasons why one country outperforms the other — track sharing with freight, infrastructure investment, network topology, average journey length — are not analyzed here.
- German data: piebro/deutsche-bahn-data, CC BY 4.0, underlying data © Deutsche Bahn.
- Swiss data: SBB Ist-Daten, opentransportdata.swiss, © SBB AG.
- Prior public analyses of DB data by David Kriesel (CCC 2019) and the Bahn-Vorhersage project informed the methodology choices.
MIT — see LICENSE. Underlying data is governed by each operator's open-data license terms.
Built as a data-analysis portfolio project by Ajita Gupta — full-stack engineer based in Zurich.
