Commit 9766049
fix: incremental-update.py — align changed-QIDs discovery with expanded filters
The previous commit aligned build_scoped_ids_query (the second-stage
per-QID fetch) with the shared Phase 1 query shape. But the first-stage
discovery query in fetch_changed_qids was still using the OLD narrow
filters, so even if a newly reachable entity got edited on Wikidata,
the incremental update wouldn't see it.
Two filters expanded to match fetch-wikidata-entities.py:
1. Team discovery: a three-path UNION matching build_team_ids_query:
   - subclass of Q476028 (association football club), existing
   - subclass of Q103229495 (men's association football team), new
   - P641=Q2736 AND subclass of Q847017 (generic sports club, restricted
     to football via P641), new
   This matters because Wikidata classifies many well-known clubs solely
   under Q103229495 (Q7156 Barcelona, Q170703 Boca Juniors) or under the
   generic Q847017 (Q8206935 Estudiantes de Río Cuarto). Those were
   silently unreachable by the old filter.
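A minimal sketch of what the three-path UNION described above could look like when built from Python. The helper name `sketch_team_discovery_query`, the `wdt:P31/wdt:P279*` property path, and the `schema:dateModified` cutoff are assumptions for illustration; the actual query text in build_team_ids_query and fetch_changed_qids may differ.

```python
from datetime import datetime, timedelta, timezone

# QIDs taken from the commit message.
FOOTBALL_CLUB = "Q476028"          # association football club
MENS_FOOTBALL_TEAM = "Q103229495"  # men's association football team
SPORTS_CLUB = "Q847017"            # generic sports club
FOOTBALL = "Q2736"                 # association football (value of P641)


def sketch_team_discovery_query(since: datetime) -> str:
    """Build a three-path UNION query for teams modified since `since`.

    Hypothetical helper: the P31/P279* path and the dateModified filter
    are assumptions about how the real query is written. `since` must be
    a timezone-aware datetime.
    """
    cutoff = since.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"""
SELECT DISTINCT ?e WHERE {{
  {{ ?e wdt:P31/wdt:P279* wd:{FOOTBALL_CLUB} . }}
  UNION
  {{ ?e wdt:P31/wdt:P279* wd:{MENS_FOOTBALL_TEAM} . }}
  UNION
  {{ ?e wdt:P31/wdt:P279* wd:{SPORTS_CLUB} ;
        wdt:P641 wd:{FOOTBALL} . }}
  ?e schema:dateModified ?modified .
  FILTER(?modified >= "{cutoff}"^^xsd:dateTime)
}}
""".strip()


if __name__ == "__main__":
    # Entities touched in the past day, as in the live verification.
    print(sketch_team_discovery_query(datetime.now(timezone.utc) - timedelta(days=1)))
```

Only the third path carries the P641 = Q2736 constraint, since Q847017 on its own would also match clubs for other sports.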
2. Competition discovery: now mirrors the shape of the full-fetch
   competition query:
   - The class path requires an explicit P641 = Q2736 (blocks
     NHL/PGA/rugby from leaking through via their P31 chain).
   - The property path allows entities with a football-specific provider
     claim, with the non-football FILTER NOT EXISTS guard.
   - FILTER NOT EXISTS { ?e wdt:P3450 ?parentComp } excludes seasonal
     entities (they're handled by the season query below).
   This matters because the old incremental discovery was picking up
   seasonal entities as competitions and letting non-football
   competitions leak through the property path.
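The competition filter above can be sketched in the same way. The commit does not list the competition class QID, the provider properties, or the exact form of the non-football guard, so all three are hypothetical parameters here, and the function name `sketch_competition_discovery_pattern` is illustrative only. The guard is assumed to be one FILTER NOT EXISTS per non-football provider property.

```python
def sketch_competition_discovery_pattern(
    competition_class: str,
    provider_props: list[str],
    non_football_props: list[str],
) -> str:
    """Return an illustrative WHERE-body for competition discovery.

    All three arguments are hypothetical inputs: the commit does not
    name the class QID or the provider properties it uses.
    """
    # Property path: any football-specific provider claim qualifies...
    provider_union = "\n      UNION\n".join(
        f"      {{ ?e wdt:{p} [] . }}" for p in provider_props
    )
    # ...unless the entity also carries a non-football provider claim.
    guards = "\n".join(
        f"    FILTER NOT EXISTS {{ ?e wdt:{p} [] }}" for p in non_football_props
    )
    return f"""
  {{
    # Class path: the class alone is not enough, the sport must be football.
    ?e wdt:P31/wdt:P279* wd:{competition_class} ;
       wdt:P641 wd:Q2736 .
  }}
  UNION
  {{
    # Property path: football-specific provider claim, guarded.
{provider_union}
{guards}
  }}
  # Seasonal entities point at a parent competition via P3450; skip them.
  FILTER NOT EXISTS {{ ?e wdt:P3450 ?parentComp }}
""".strip("\n")
```

The P3450 filter sits outside both UNION branches so that it applies to either path, which matches the stated goal of keeping seasonal entities out of competition discovery entirely.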
Live-verified: the new team discovery query returns 98 entities
modified in the past day. It is syntactically valid, returns a non-zero
result, and executes cleanly against Wikidata's endpoint.
Note that this closes the drift for discovery only. The scoped per-QID
fetch query (build_scoped_ids_query) still uses VALUES to restrict its
results, which means entities that were modified BEFORE the expanded
filter went live still won't be discovered until a full refresh. For
those, the dump-based workflow (coming next) is the right tool.
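To illustrate why a VALUES restriction limits the second stage to whatever discovery returned, here is a small sketch. The helper names and the batch size are hypothetical and not taken from incremental-update.py.

```python
def sketch_values_clause(qids: list[str], var: str = "e") -> str:
    """Render a SPARQL VALUES clause restricting ?var to `qids`."""
    items = " ".join(f"wd:{q}" for q in qids)
    return f"VALUES ?{var} {{ {items} }}"


def sketch_scoped_batches(discovered: list[str], batch_size: int = 200) -> list[str]:
    """Split discovered QIDs into one VALUES clause per batch.

    Only QIDs present in `discovered` can ever appear in a scoped query,
    so an entity that discovery never returned is never fetched.
    """
    return [
        sketch_values_clause(discovered[i : i + batch_size])
        for i in range(0, len(discovered), batch_size)
    ]
```

For example, `sketch_scoped_batches(["Q7156", "Q170703", "Q8206935"], batch_size=2)` yields two clauses, the first covering Q7156 and Q170703 and the second covering Q8206935.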
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 088318e commit 9766049
1 file changed
Lines changed: 52 additions & 10 deletions