Covers data pipeline (PDFs → Claude API → PostgreSQL → JSON),
all 8 pages, project structure, tech stack, setup instructions,
deployment, data types, and severity score calculation.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
A public gallery of privacy violation cases and enforcement actions.
A global registry of **771 data privacy enforcement cases** across 7 jurisdictions, totaling **$507M+** in fines. Browse cases, compare enforcement actions side-by-side, explore jurisdictions on an interactive map, and learn what privacy enforcement terms actually mean.
The project has two halves: a **data pipeline** that extracts structured case data from legal PDFs using Claude AI, and a **React frontend** that presents it as an interactive gallery.
### Data Pipeline
```
Google Drive (PDFs)
        |
        v
Python Agent (agent.py) ---> Claude API (extracts structured fields)
```
1. **Source documents**: Complaint filings, consent orders, compliance decisions, and penalty notices are collected from regulator websites and stored in [Google Drive](https://drive.google.com/drive/folders/1j3XpwO0N2ttEjjVin-x-pHpq3KT3gwYj), organized by jurisdiction
2. **PDF processing**: `files/agent.py` watches an inbox folder, extracts text from PDFs, and sends them to the Claude API with a structured extraction prompt
3. **Claude extraction**: Claude parses each legal document and returns structured fields: company name, jurisdiction, violation types, legal bases, fines, impacted individuals, claims vs reality, regulatory findings, and more
4. **Database storage**: Extracted data is stored in PostgreSQL with the full JSON payload
5. **Frontend export**: `files/export_to_frontend.py` reads the database, calculates derived fields (severity scores, fine displays), and exports everything to `src/data/generatedCases.json`
6. **Static frontend**: The React app imports the JSON at build time. No runtime API calls or database connections
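
The export step (step 5) can be illustrated with a minimal sketch. It is not the contents of `files/export_to_frontend.py`: the function names `fine_display` and `export_cases`, the field names (`fine`, `fineDisplay`), the display rule, and the sample rows are all assumptions made for illustration. An in-memory list stands in for the PostgreSQL query so the example runs on its own.

```python
import json
import tempfile
from pathlib import Path


def fine_display(amount):
    """Hypothetical display rule: whole-dollar string, or a placeholder when no fine is recorded."""
    return "No fine" if amount is None else f"${amount:,.0f}"


def export_cases(rows, out_path):
    """Add a derived display field to each row and write all rows as one JSON array."""
    cases = [{**row, "fineDisplay": fine_display(row.get("fine"))} for row in rows]
    Path(out_path).write_text(json.dumps(cases, indent=2), encoding="utf-8")
    return cases


# Stand-in for rows read from PostgreSQL; names and values are made-up placeholders.
sample_rows = [
    {"id": 1, "company": "Example Corp", "jurisdiction": "US FTC", "fine": 1500000},
    {"id": 2, "company": "Sample Ltd", "jurisdiction": "UK ICO", "fine": None},
]

# Written to a temporary directory here; the real export targets src/data/generatedCases.json.
out_file = Path(tempfile.mkdtemp()) / "generatedCases.json"
exported = export_cases(sample_rows, out_file)
```

Writing a single JSON file is what lets the frontend stay static: the React build imports the file directly, so nothing needs a database connection at runtime.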
### Severity Score
Each case gets a deterministic severity rating (1-5) based on:
- **Final score** = data + people, clamped to [1, 5]
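
The final-score rule can be sketched as a simple clamp. The rules that produce the data and people components are not reproduced in this section, so the sketch takes them as plain integer inputs. The name `final_severity` is hypothetical and may not match what `files/export_to_frontend.py` uses.

```python
def final_severity(data_score: int, people_score: int) -> int:
    """Sum the two component scores and clamp the result to the range [1, 5]."""
    return max(1, min(5, data_score + people_score))


# The component values below are arbitrary illustrations, not taken from the dataset.
low = final_severity(0, 0)    # sum 0 is raised to the floor of 1
mid = final_severity(2, 2)    # sum 4 is already in range
high = final_severity(4, 3)   # sum 7 is capped at the ceiling of 5
```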
---
## Pages
| Page | Route | Description |
|------|-------|-------------|
| **Cases** | `/` | Searchable, filterable grid of all 771 cases with jurisdiction, sector, violation type, and sort controls |
| **Case Detail** | `/case/:id` | Full case breakdown: what they did, why they were wrong, claims vs reality, legal findings, outcome, attached PDFs |
| **Compare** | `/compare` | Matrix view (patterns across jurisdictions/violations/sectors) and side-by-side comparison of up to 3 individual cases |
| **Explore** | `/explore` | Interactive world map highlighting 7 jurisdictions; click a region to see its enforcement framework, key laws, and dataset statistics |
| **Leaderboard** | `/leaderboard` | Rankings: top companies by fines, most active jurisdictions, most common violations and sectors |
| **Learn** | `/learn` | Educational glossary explaining enforcement outcomes, violation types, and key legal concepts with cross-references |
| **About** | `/about` | Project information and attribution |
---
## Jurisdictions Covered
| Jurisdiction | Abbreviation | Region |
|---|---|---|
| Federal Trade Commission | US FTC | United States |
| California DOJ | CA DOJ | United States (California) |
| Information Commissioner's Office | UK ICO | United Kingdom |
| Personal Data Protection Commission | SG PDPC | Singapore |
| General Data Protection Regulation | EU GDPR | European Union |
| European Data Protection Board | EU EDPB | European Union |
| Office of the Australian Information Commissioner | AU OAIC | Australia |