Nordic Deal Sourcing Graph

Ranks acquisition / investment targets from public Nordic & EU procurement data — by contract-win momentum, scale, and institutional diversity — and drafts a first sourcing memo per top candidate. A rifle scope, not a CSV dump.

Why this matters

PE, corp dev, and private-credit teams all fight the same fight: find the target before the market prices it. EQT built Motherbrain to do exactly this and treats an 18–24 month tech lead as a moat. This repo is the Nordic angle that most candidates never touch — it fuses public data sources few people use well into a ranked, explainable target list. That's the differentiator between "another AI builder with American CSVs" and someone who speaks the local market.

Research basis

Cao et al. (EQT Motherbrain), "Beyond Gut Feel," ICANN 2024: frames deal sourcing as multivariate time-series classification of company trajectories.
ML-in-M&A thesis (Aalto, surfaced): "Predicting Merger and Acquisition Outcomes: A Machine Learning Approach" (Aaltodoc).
Government-contract signal literature: public procurement carries both a pricing signal and an information signal about firm health and momentum.
Master's thesis to reverse-engineer (Aalto, surfaced; read and vet next): "Predicting venture-capital-backed start-up success with machine learning" (Aaltodoc).

Math we're stealing: the trajectory features and classifier that score a company's odds of IPO/acquisition — Motherbrain's "Beyond Gut Feel" idea, made reproducible on public data.
Frontier to push beyond (2025–2026): the LLM multi-agent investment-analysis wave (e.g. arXiv 2602.00082) applied to public-data target ranking.

Hypothesis: contract-win momentum, scale, and institutional breadth are leading, public, and underused signals for sourcing — legible long before a banker's teaser lands in the inbox.

Data

Source	Provides	Cost	License caveat
TED v3 Search API	EU public-contract notices	Free, no key	Open. Polite User-Agent required (set via `TED_USER_AGENT`).
Bolagsverket open data	Swedish ownership / financials	Free	Open (roadmap; validated manually today).
GLEIF	LEI entity identifiers	Free	Open (roadmap; for entity resolution beyond suffix-stripping).

TED v3 coverage caveat: the v3 Search API only carries notices from ~early 2026 onward, due to the EU's eForms transition. Older windows return sparse data. The MVP defaults to a 6-month look-back for this reason.

System design

flowchart LR
    A[TED v3<br/>award notices] --> B[TEDClient<br/>iteration-token pagination]
    B --> C[Notice model<br/>buyer, winner, value, CPV, date]
    C --> D[normalize_supplier_name<br/>Nordic suffix strip]
    D --> E[aggregate_suppliers<br/>group by canonical name]
    E --> F[SupplierSnapshot<br/>features per supplier]
    F --> G[score_supplier<br/>momentum + scale + diversity]
    G --> H[Composite rank<br/>balanced/momentum_led/scale_led]
    H --> I[CSV + PNG + per-supplier memo]
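The normalize and aggregate steps in the diagram can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the function and class names are taken from the diagram, but their fields, signatures, and the abbreviated suffix list are assumptions.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field

# Abbreviated, assumed suffix list; the real pipeline's list may differ.
_SUFFIXES = ["Aktiebolag", "AB", "Oyj", "Oy", "A/S", "ApS", "ASA", "AS", "SE", "GmbH"]
# A suffix is stripped only when it is the trailing, whitespace-separated token.
_SUFFIX_RE = re.compile(
    r"\s+(?:" + "|".join(re.escape(s) for s in _SUFFIXES) + r")\s*$",
    re.IGNORECASE,
)


def normalize_supplier_name(raw: str) -> str:
    """Collapse whitespace, drop one trailing legal suffix, and case-fold."""
    name = " ".join(raw.split())
    name = _SUFFIX_RE.sub("", name)
    return name.casefold()


@dataclass(frozen=True)
class Notice:
    # Hypothetical subset of the notice fields named in the diagram.
    winner: str
    buyer: str
    value: float | None = None


@dataclass
class SupplierSnapshot:
    canonical_name: str
    wins: int = 0
    total_value: float = 0.0
    buyers: set[str] = field(default_factory=set)


def aggregate_suppliers(notices: list[Notice]) -> dict[str, SupplierSnapshot]:
    """Group notices by canonical supplier name and accumulate simple features."""
    snapshots: dict[str, SupplierSnapshot] = {}
    for notice in notices:
        key = normalize_supplier_name(notice.winner)
        snap = snapshots.setdefault(key, SupplierSnapshot(key))
        snap.wins += 1
        snap.total_value += notice.value or 0.0  # undisclosed values count as 0
        snap.buyers.add(notice.buyer)
    return snapshots


# Illustrative input only; buyers and values are made up.
demo = aggregate_suppliers([
    Notice("Volvo AB", "Buyer A", 100.0),
    Notice("VOLVO  AB", "Buyer B", None),
    Notice("Volvo Aktiebolag", "Buyer A", 50.0),
    Notice("Volvo Cars AB", "Buyer C", 75.0),
])
for snap in demo.values():
    print(snap.canonical_name, snap.wins, len(snap.buyers))
```

Requiring whitespace before the suffix keeps a name that consists only of a suffix token from collapsing to an empty string, and anchoring at the end of the string leaves "Volvo Cars AB" distinct from "Volvo AB".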

Two disciplines do most of the work:

Entity resolution that's honest about its limits. Legal-form suffixes, Nordic and other European (AB, Aktiebolag, Oy, Oyj, A/S, ApS, ASA, SE, GmbH, ...), are stripped with a word-boundary regex, and case and whitespace are normalised. Volvo AB, VOLVO AB, and Volvo Aktiebolag collapse to one supplier. Volvo Cars AB and Volvo Trucks AB stay separate: they're distinct legal entities, and merging them would be wrong. Typo merging and corporate-tree resolution wait for GLEIF LEI integration in v2.
Scoring weights are defensible by construction, not by fit. No M&A label dataset exists to train a classifier against, so we don't pretend to. The three sub-scores (momentum, scale, diversity) are percentile-ranked within the universe and combined with documented weights per profile (balanced, momentum_led, scale_led). Easy to inspect; easy to defend; easy to swap for a real model when labels arrive.
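The percentile-rank-and-weight scheme described above might look something like this. It is a sketch under stated assumptions: the 60/20/20 and 20/60/20 splits come from the profile comments in this README, while the equal-weight `balanced` split, the tie handling, and the function names `percentile_rank` and `composite_scores` are placeholders chosen for illustration.

```python
from __future__ import annotations

from bisect import bisect_right

# Weights are ordered (momentum, scale, diversity).
# "balanced" is an assumed equal split; the repository may use other values.
PROFILES = {
    "balanced": (1 / 3, 1 / 3, 1 / 3),
    "momentum_led": (0.6, 0.2, 0.2),
    "scale_led": (0.2, 0.6, 0.2),
}


def percentile_rank(values: list[float]) -> list[float]:
    """Fraction of the universe at or below each value; ties share a rank."""
    ordered = sorted(values)
    n = len(ordered)
    return [bisect_right(ordered, v) / n for v in values]


def composite_scores(
    momentum: list[float],
    scale: list[float],
    diversity: list[float],
    profile: str = "balanced",
) -> list[float]:
    """Weighted blend of the three percentile-ranked sub-scores, one per supplier."""
    w_m, w_s, w_d = PROFILES[profile]
    ranks = zip(
        percentile_rank(momentum),
        percentile_rank(scale),
        percentile_rank(diversity),
    )
    return [w_m * m + w_s * s + w_d * d for m, s, d in ranks]


# Three made-up suppliers: raw momentum ratio, contract value, distinct buyers.
print(composite_scores([2.67, 1.0, 2.0], [100.0, 300.0, 200.0], [5, 1, 3], "momentum_led"))
```

Because every sub-score is a rank within the scanned universe, the composite stays in the range 0 to 1 and is comparable across profiles, but not across countries or look-back windows.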

Results

Live scans against TED v3, 6-month look-back, balanced scoring profile. Both runs hit the live TED API and produced the artifacts in data/sample/.

Sweden — 2,266 notices, 1,398 unique suppliers, 265 ranked

Rank	Supplier	Wins (12mo)	Wins (3mo)	Momentum	Composite
1	Peab Sverige AB	6	4	2.67×	0.902
2	Swedbank AB	3	2	2.67×	0.869
3	Securitas Sverige Aktiebolag	4	3	3.00×	0.837
4	Stena Recycling AB	8	4	2.00×	0.831
5	Movab AB	5	3	2.40×	0.812
6	Avarn Security AB	4	2	2.00×	0.812
7	CGI Sverige AB	3	2	2.67×	0.811
8	TN Bygg & Anläggning AB	3	2	2.67×	0.795
9	Anticimex Aktiebolag	7	4	2.29×	0.788
10	OneMed Sverige AB	5	2	1.60×	0.783

These are real, recognizable Nordic suppliers showing actual contract-win momentum: Peab (large-cap construction), Swedbank, Stena group, CGI Sweden, Anticimex (pest control / inspections). #1 Peab has 1.43B SEK of disclosed contract value across 5 distinct public buyers in 12 months — exactly the institutional-depth signal a corp-dev team would surface manually after weeks of work.
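Every row in the table is consistent with momentum being the annualised 3-month win count divided by the 12-month win count, that is, 4 × (3-month wins) / (12-month wins). That formula is inferred from the published figures rather than read from the source code, so treat the sketch below as a consistency check, with `momentum_ratio` as a hypothetical name.

```python
def momentum_ratio(wins_12mo: int, wins_3mo: int) -> float:
    """Annualised 3-month win rate relative to the 12-month count.

    A value of 1.0 means the last quarter matched the yearly pace;
    above 1.0 means wins are accelerating.
    """
    if wins_12mo <= 0:
        return 0.0  # assumed convention for suppliers with no 12-month wins
    return (wins_3mo * 4) / wins_12mo


# (12-month wins, 3-month wins, published momentum) taken from the Sweden table.
rows = [(6, 4, 2.67), (4, 3, 3.00), (8, 4, 2.00), (7, 4, 2.29), (5, 2, 1.60)]
for wins_12mo, wins_3mo, published in rows:
    print(wins_12mo, wins_3mo, round(momentum_ratio(wins_12mo, wins_3mo), 2), published)
```

With counts this small the ratio moves in coarse steps (a single extra win in the quarter shifts it substantially), which is one reason to read it alongside the scale and diversity sub-scores, not on its own.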

Norway — 903 notices, 581 unique suppliers, 92 ranked

Rank	Supplier	Wins (12mo)	Wins (3mo)	Momentum	Composite
1	Matriks AS	5	3	2.40×	0.818
2	Atea AS	6	2	1.33×	0.798
3	Crayon AS	5	2	1.60×	0.766
4	Asko Øst AS	3	2	2.67×	0.753
5	AF Energi AS	4	1	1.00×	0.744
6	GK Norge AS	3	1	1.33×	0.740
7	VWR International AS	3	2	2.67×	0.739
8	Schindler AS (hovedenhet)	2	1	2.00×	0.729
9	HENT AS	3	1	1.33×	0.725
10	LÆRE AS	2	2	4.00×	0.721

Norway's procurement volume is meaningfully smaller than Sweden's: 903 notices against 2,266, a ratio of roughly 2.5×, which is broadly consistent with the difference in population and private-sector mix. Top Norwegian names skew toward IT services (Atea, Crayon, and similar CGI-class firms) and infrastructure (HENT, AF Energi, GK Norge).

Per-supplier sourcing memo (data/sample/supplier_memo_top1.md for Sweden's #1 Peab): cadence, scale, sector focus, institutional depth, and a structured "what to validate next" checklist (ownership, financials, sector context, competitive position). Auto-drafted from the same SupplierSnapshot used for ranking — no LLM, no hallucination surface, reproducible.
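A template-based memo of this kind can be produced with plain string formatting, which is what keeps it deterministic. The sketch below is an assumption about the general shape, not the repository's generator: `draft_memo` and the snapshot keys are hypothetical, the sector-focus line is omitted because no CPV data is shown here, and the Peab figures are the ones quoted in this README.

```python
from __future__ import annotations

# Checklist items as listed in this README.
CHECKLIST = ("ownership", "financials", "sector context", "competitive position")


def draft_memo(snapshot: dict) -> str:
    """Render a first sourcing memo from snapshot features; no model calls involved."""
    value_bn = snapshot["total_value_sek"] / 1e9
    lines = [
        f"# Sourcing memo: {snapshot['name']}",
        "",
        f"- Cadence: {snapshot['wins_12mo']} wins in 12 months, "
        f"{snapshot['wins_3mo']} in the last 3",
        f"- Scale: {value_bn:.2f}B SEK disclosed contract value",
        f"- Institutional depth: {snapshot['distinct_buyers']} distinct public buyers",
        "",
        "## What to validate next",
    ]
    lines += [f"- [ ] {item}" for item in CHECKLIST]
    return "\n".join(lines) + "\n"


peab = {
    "name": "Peab Sverige AB",
    "wins_12mo": 6,
    "wins_3mo": 4,
    "total_value_sek": 1.43e9,
    "distinct_buyers": 5,
}
print(draft_memo(peab))
```

Since the memo is a pure function of the snapshot, re-running a scan on cached notices should yield a byte-identical file, which makes memo diffs meaningful between runs.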

Bundled artifacts:

data/sample/ranked_targets.csv — Sweden top-20
data/sample/ranked_targets.png — Sweden bar chart
data/sample/supplier_memo_top1.md — Peab memo
data/sample/no/ — same set, Norway market

Risks & limitations

Entity resolution is suffix-strip only. Catches ~80% of trivial variants. Doesn't catch typos, corporate-tree relationships (parent / subsidiary), or abbreviations. GLEIF LEI lookup is the v2 fix for the remaining 20%.
TED v3 only carries ~early-2026-onward notices. Older windows are sparse. For a true long-term momentum signal, ingestion from the historical TED CSV archive (2014-2024) is needed.
Public procurement is one customer channel, not the P&L. A supplier with strong public-contract momentum may still be losing money in the private-sector book. The sourcing memo explicitly flags this and pushes the user to validate financials separately.
Scoring weights are not learned. No M&A outcome dataset is available to train against on public data. The balanced / momentum_led / scale_led profiles are defensible-by-construction heuristics — swap for a trained classifier when labels arrive.
Anonymous TED winners (~12% of notices for SE, ~16% for NO) are dropped silently. They represent real economic activity that's invisible to this pipeline.
No ownership data yet. A founder-owned or PE-backed signal would be the highest-value filter; deferred to v2 (Bolagsverket integration).
Country code mapping: TED uses ISO 3166-1 alpha-3 codes (SWE, NOR, DNK, FIN); the CLI accepts both the 2-letter and 3-letter forms and resolves either to the code TED expects.
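The country-code resolution in the last point could be as small as a lookup table. This is a hedged sketch covering only the four codes named above; `resolve_country` is a hypothetical helper, not necessarily how the CLI implements it.

```python
# ISO 3166-1 alpha-2 to alpha-3 for the four markets named in this README.
_ALPHA2_TO_ALPHA3 = {"SE": "SWE", "NO": "NOR", "DK": "DNK", "FI": "FIN"}


def resolve_country(code: str) -> str:
    """Return the alpha-3 code for a 2-letter or 3-letter input, case-insensitively."""
    normalized = code.strip().upper()
    if normalized in _ALPHA2_TO_ALPHA3:
        return _ALPHA2_TO_ALPHA3[normalized]
    if normalized in _ALPHA2_TO_ALPHA3.values():
        return normalized
    raise ValueError(f"unsupported country code: {code!r}")


print(resolve_country("se"), resolve_country("NOR"))
```

Failing loudly on an unknown code is a deliberate choice here: a silently mistyped country would otherwise return an empty scan that looks like a sparse market.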

What I'd build next for a real firm

GLEIF LEI integration for verified entity resolution — catches the 20% the suffix-strip can't.
Bolagsverket / Brønnøysund ownership overlay — distinguish founder-owned from PE-backed from corporate-subsidiary, the single highest-value filter for a PE sourcing desk.
Historical TED CSV ingestion (2014-2024) to extend the momentum window from 6 months to 5+ years.
Per-supplier financial pull via local statutory accounts — moves the score from "public-contract momentum" to "public-contract momentum on a healthy P&L."
Sector-specific scoring profiles tuned to the sourcing mandate (e.g. healthcare-services-focused profile up-weights CPV-85 contracts).
CRM hand-off — flagged target → task in the firm's Salesforce.

Reproducibility

# install
python3.12 -m venv .venv && .venv/bin/pip install -e .

# polite User-Agent (TED requires identification)
cp .env.example .env
# edit .env: TED_USER_AGENT="your-project (Your Name your.email@example.com)"
# source .env with auto-export so quoted values containing spaces survive intact
set -a; . ./.env; set +a

# scan Sweden (live TED, ~5s for 2.3K notices)
ndsg scan --country SE --months 6 --top 20 \
  --cache-dir data/cache --out-dir data/sample

# scan Norway
ndsg scan --country NO --months 6 --top 20 \
  --cache-dir data/cache --out-dir data/sample/no

# re-score from cached notices (no live TED call)
ndsg scan --country SE --offline \
  --cache-dir data/cache --out-dir /tmp/se_rerun

# scoring profiles
ndsg scan --country SE --profile momentum_led  # 60/20/20 weighting
ndsg scan --country SE --profile scale_led     # 20/60/20

# tests
pytest                       # 67 hermetic tests, no network
pytest -m slow               # runs only the tests marked slow (live-TED integration)

Engineering standards: permissive code (MIT), public data first (TED v3, no key), every claim grounded in the artifact CSVs, entity-resolution honesty surfaced explicitly.
