Commit b895eae

Merge pull request #14 from BayyinahEnterprise/v0.9.2-onnx-type-compliance
v0.9.2: D11-onnx type-compliance via check_type=True
2 parents baf0059 + f034f70 commit b895eae

8 files changed

Lines changed: 687 additions & 75 deletions

CHANGELOG.md

Lines changed: 170 additions & 0 deletions
@@ -19,6 +19,176 @@ introduced this convention.
 
 ---
 
+## [0.9.2] - 2026-05-04
+
+D11-onnx adds type-compliance via ``check_type=True``. Closes
+Gap 1 from the v0.9.1 post-release NeuroGolf evaluation:
+opset-compliance was previously missing
+opset-correct-but-type-incompatible operator usages
+(Equal-on-float in opset 10 / cont46 NeuroGolf bug class).
+Decision 1 of the v0.9.2 prompt passes ``check_type=True``
+alongside ``strict_mode=True`` in the
+``onnx.shape_inference.infer_shapes`` call. Empirically verified
+against onnx 1.17.0: purely additive to ``strict_mode``; zero
+regressions on v0.9.1 silent-pass / firing cases.
+
+### Fixed
+
+- **Gap 1 closure (HIGH).** D11-onnx now catches type-restriction
+failures via ``check_type=True``. Equal-on-float in opset 10
+was silent under v0.9.1's ``strict_mode``-only call; v0.9.2
+fires with ``category="type_mismatch"`` and
+``error_kind="TypeInferenceError"``. The failure mode is a
+real one identified by the post-release NeuroGolf evaluation
+(cont46 bug class) where users encountered it as a runtime
+failure that the lint pipeline did not surface.
+- The ``ShapeCoverageDiagnostic.category`` field is **required**
+(no default) per the round-30 fail-fast and round-32 MED-1
+discipline. A future construction site that forgets to set
+``category`` raises ``TypeError`` at construction time rather
+than silently mislabeling ``type_mismatch`` findings as
+``shape_mismatch``. The dataclass extension is additive for
+consumers (existing fields unchanged) and breaking for
+constructors; the only construction site is
+``check_shape_coverage`` in the same module, updated atomically.
+
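The required-``category`` behaviour described in the Fixed entry above can be illustrated with a minimal sketch. The class below is a hypothetical stand-in, not the project's ``ShapeCoverageDiagnostic``: the real field list is not shown in this diff, so ``op_type`` and ``message`` are assumed names. It only demonstrates that a frozen dataclass field without a default makes a forgetful constructor fail immediately with ``TypeError``.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShapeCoverageDiagnosticSketch:
    """Hypothetical stand-in; field names other than `category` are assumed."""

    op_type: str
    message: str
    category: str  # required: deliberately no default value


# A construction site that sets every field works as before.
ok = ShapeCoverageDiagnosticSketch(
    op_type="Equal", message="...", category="type_mismatch"
)

# A construction site that forgets `category` fails at construction time
# instead of silently producing a mislabeled finding.
try:
    ShapeCoverageDiagnosticSketch(op_type="Equal", message="...")
except TypeError as exc:
    forgot_category = str(exc)
else:
    forgot_category = None
```

Consumers that only read existing attributes are unaffected by the extra field, which matches the "additive for consumers, breaking for constructors" wording.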
+### Added
+
+- ``ShapeCoverageDiagnostic.category`` field discriminates three
+sub-types: ``"shape_mismatch"`` (v0.9.1 behavior preserved),
+``"type_mismatch"`` (v0.9.2 addition under ``check_type=True``),
+and ``"unparseable"`` (the fallback when neither known per-op
+pattern matches). Decision 2 of the v0.9.2 prompt: same outer
+diagnostic family with sub-type metadata on the dataclass,
+preserving the v0.9.1 runner-tag contract.
+- ``_TYPE_MISMATCH_RE`` regex parses two empirical body shapes
+observed against onnx 1.14-1.21:
+``(op_type:Equal): A typestr: T, has unsupported type: tensor(float)``
+and
+``(op_type:Reshape, ...): shape typestr: tensor(int64), has unsupported type: tensor(float)``.
+The first uses the schema parameter form; the second uses the
+named-input form. Both share the literal ``" typestr: "``
+substring; the regex captures the general ``<token> typestr:``
+shape so both variants route to ``type_mismatch`` rather than
+the unparseable-fallback.
+- ``_classify_per_op_finding(text)`` dispatcher tries
+``_SHAPE_MISMATCH_RE`` first per Decision 3 of the v0.9.2
+prompt: shape-mismatch is preferred when both could match
+because its ``error_kind`` is explicit. The fixed ordering is
+the spec for future-onnx-version stability.
+- ``_format_for_category(op, body, category)`` produces sub-type-
+specific diagnosis and ``minimal_fix`` prose. The
+``type_mismatch`` fix enumerates three actionable options:
+Cast the input, raise the model's opset_import to a version
+where the op accepts the current input type, or replace the op
+with one that natively accepts the type.
+- Three new fixture builders in ``tests/fixtures/onnx/builders.py``:
+``make_equal_float_op10_model`` (fires type_mismatch),
+``make_equal_int64_op10_model`` (silent-pass, well-typed),
+``make_reshape_float_shape_op13_model`` (fires type_mismatch
+via the shape-typestr body shape).
+
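The two ``typestr`` body shapes and the shape-first dispatch order listed above can be sketched as follows. Both patterns are assumptions written for illustration: the project's actual ``_SHAPE_MISMATCH_RE`` and ``_TYPE_MISMATCH_RE`` are not shown in this diff, and the shape pattern here is only a placeholder that keys on an explicit bracketed error kind.

```python
import re

# Placeholder for the shape pattern (assumed): keys on an explicit error kind.
SHAPE_MISMATCH_RE = re.compile(r"\[(?P<error_kind>ShapeInferenceError)\]")

# Assumed pattern for the general "<token> typestr: ..." body shape.
TYPE_MISMATCH_RE = re.compile(
    r"(?P<token>\w+) typestr: (?P<expected>[^,]+), "
    r"has unsupported type: (?P<actual>\S+)"
)


def classify(text: str) -> str:
    """Try shape-mismatch first, then type-mismatch, else fall back."""
    if SHAPE_MISMATCH_RE.search(text):
        return "shape_mismatch"
    if TYPE_MISMATCH_RE.search(text):
        return "type_mismatch"
    return "unparseable"


# The two body shapes quoted in the changelog entry.
schema_form = "(op_type:Equal): A typestr: T, has unsupported type: tensor(float)"
named_form = (
    "(op_type:Reshape, ...): shape typestr: tensor(int64), "
    "has unsupported type: tensor(float)"
)
```

Because the order is fixed, a body that happens to satisfy both patterns is always reported as ``shape_mismatch``, which keeps the classification stable if message formats drift across onnx versions.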
+### Changed
+
+- ``onnx.shape_inference.infer_shapes`` is now called with both
+``strict_mode=True`` and ``check_type=True``. Standing Rule of
+the v0.9.2 prompt: the two flags are paired in v0.9.2+; no
+path in ``shape_coverage.py`` uses one without the other.
+- ``check_shape_coverage`` rewrote its message-walking logic:
+the master ``_OP_TYPE_MARKER`` regex locates each
+``(op_type:NAME):`` marker in the InferenceError message and
+the per-marker classifier decides shape-mismatch vs
+type-mismatch vs unparseable. Document order of findings is
+preserved for diagnostic stability. The unparseable-fallback
+fires only when no per-op marker is found anywhere in the
+message, so a single multi-op message never duplicates the
+generic finding alongside per-op findings.
+- The three v0.9.1 mock ``_raise`` functions in
+``tests/test_onnx_shape_coverage.py`` updated to accept
+``**kwargs`` so they match the new flag combination. The
+v0.9.1 D11-onnx tests automatically become the regression
+suite for the strict_mode + check_type combination per the
+round-32 MED-2 closure.
+
+### Deferred
+
+- **Gap 2** (score-validity advisory via ``onnx_tool.profile``)
+scheduled for **v0.9.3** per Decision 9. The new optional
+extra (``[onnx-profile]`` pulling ``onnx-tool``) needs a
+design round addressing dependency containment, gate semantics
+(advisory vs MARAD), and message format. Closes MARAD-3 from
+cont45 (TopK chain profiler-blocker), unblocks 2 rebuilds.
+Estimated time budget 4 hours.
+- **Gap 3** (channel-sum / input-mask consistency) **held until
+NeuroGolf provides a tight spec**. The substrate-side
+consistency definition does not yet exist; implementing
+without a spec would invent a definition that may not match
+actual needs. Per the framework's reproducer-first discipline,
+substrate must precede surface.
+- **Gap 4** (numpy-vs-ONNX divergence detection) scheduled for
+**v0.9.4** per Decision 9. Per the round-32 NeuroGolf
+leverage analysis, this is the dominant-failure-mode catch
+(every NeuroGolf primitive has a numpy reference; the
+substrate-vs-surface gap is structurally where bugs hide:
+cont48, cont42 task284 first build, cont44 task313 prefix-sum
+direction). v0.9.4 design round resolves the build-script-to-
+lint contract for discovering numpy reference functions, the
+probe-grid strategy, and the tolerance before commit 1.
+Estimated 2 design + implementation sessions.
+- v1.3 paper revision waits for v0.9.2 to ship and tracks v0.9.2
+substrate per Decision 8. The empirical record at v1.3
+publication is "29 versions" instead of "28," and Gap 1
+closure is cited as a worked example of the audit chain
+working post-release.
+
+### Tests
+
+Test count: 380 (v0.9.1 ship state) -> 389 (v0.9.2). Net
+delta: +9.
+
+Breakdown:
+
+- Type-compliance firing tests: +2
+(``test_d11_onnx_type_mismatch_fires_on_equal_float_op10``,
+``test_d11_onnx_type_mismatch_fires_on_reshape_float_shape_op13``)
+- Type-compliance silent-pass tests: +2
+(``test_d11_onnx_type_compliance_silent_pass_on_well_typed_equal_int64``,
+``test_d11_onnx_type_compliance_silent_pass_on_well_typed_relu_float``)
+- Parser tests: +2
+(``test_d11_onnx_type_mismatch_regex_matches_message``,
+``test_d11_onnx_classifier_prefers_shape_mismatch_first``)
+- Fallback test: +1
+(``test_d11_onnx_classifier_unparseable_returns_category``)
+- Diagnostic-prose test: +1
+(``test_d11_onnx_type_mismatch_diagnosis_prose``)
+- V0_9_2 surface snapshot: +1
+(``test_v0_9_2_onnx_adapter_surface_snapshot``)
+
+Sum: 2 + 2 + 2 + 1 + 1 + 1 = +9 ✓
+
+The ten v0.9.1 D11-onnx tests in
+``tests/test_onnx_shape_coverage.py`` automatically become the
+regression suite for the strict_mode + check_type combination
+per Decision 1's "purely additive" property and round-32 MED-2
+closure. Their three mock ``_raise`` functions accept
+``**kwargs`` so they match the new flag combination without
+behavioral change.
+
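Why the mock ``_raise`` functions need ``**kwargs`` can be shown with a small sketch. Everything here is hypothetical: the real mocks' signatures and the exception they raise are not visible in this diff, so a plain ``RuntimeError`` and an assumed ``strict_mode``-only signature stand in for the v0.9.1 state.

```python
recorded_kwargs = []


def _raise_strict_only(model, strict_mode=False):
    """Assumed v0.9.1-style mock: knows only about strict_mode."""
    raise RuntimeError("simulated inference failure")


def _raise_any_flags(model, **kwargs):
    """v0.9.2-style mock: accepts whatever flags the call site passes."""
    recorded_kwargs.append(kwargs)
    raise RuntimeError("simulated inference failure")


def call_site(infer):
    """Stand-in for the call site, which always passes both flags together."""
    try:
        infer(object(), strict_mode=True, check_type=True)
    except RuntimeError:
        return "mock reached"
    except TypeError:
        # The mock rejected an unexpected keyword before simulating anything.
        return "signature mismatch"


old_result = call_site(_raise_strict_only)
new_result = call_site(_raise_any_flags)
```

With the ``strict_mode``-only signature the test would fail on the unexpected ``check_type`` keyword rather than exercising the code under test; accepting ``**kwargs`` lets the existing tests run unchanged against the new flag pair.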
+### Round-31 / 32 closure ledger
+
+| Finding | Source | Severity | Closure |
+| --- | --- | --- | --- |
+| Gap 1 | v0.9.1 NeuroGolf evaluation | HIGH | Closed in v0.9.2 round 31 via Decisions 1-3 (check_type=True flag, ShapeCoverageDiagnostic.category split, _TYPE_MISMATCH_RE regex extension). Empirically verified zero regressions; four firing cases confirmed. |
+| META: v0.9.1 missed check_type=True | round-31 audit-of-self | INFORMATIONAL | Acknowledged. The v0.9.1 round-29 design correctly identified strict-mode shape inference as canonical for shape mismatches; check_type was not considered because the round-29 empirical probes focused on shape, not type. The post-release NeuroGolf evaluation surfaced the gap; v0.9.2 closes it. The framework's reciprocal contract working as designed: substrate (NeuroGolf bug class) governs surface (v0.9.2 spec). |
+| Gap 2 | v0.9.1 NeuroGolf evaluation | MEDIUM | Scheduled for v0.9.3 per Decision 9. Tracked in ### Deferred section. |
+| Gap 3 | v0.9.1 NeuroGolf evaluation | MEDIUM | Held until NeuroGolf provides a tight spec. Tracked in ### Deferred section. |
+| Gap 4 | round-32 leverage analysis | HIGH (competition lever) | Scheduled for v0.9.4 per Decision 9. The dominant-failure-mode catch; alone catches ~4 historical sessions and likely prevents ~5-7 future ones. |
+| Round-32 MED-1: "purely additive" framing wrong for frozen dataclass | round-32 audit | MEDIUM | §3.1 wording corrected: additive for consumers, breaking for constructors. ``category`` kept required (no default) per round-30 fail-fast. The only construction site is ``check_shape_coverage``, updated atomically. |
+| Round-32 MED-2: regression test count duplicated v0.9.1 | round-32 audit | MEDIUM | Test math reduced from +12 to +9. The v0.9.1 D11-onnx tests automatically become the regression suite via the ``check_shape_coverage`` entry point; the v0.9.2 change is at the call site inside that function. The three mock ``_raise`` functions updated to accept ``**kwargs``. |
+| Round-32 LOW: Q1/Q2 strategic decisions left open | round-32 audit | LOW | §1 gains Decisions 8 and 9 locking v1.3-tracks-v0.9.2 and the v0.9.x competition-leverage sequence (v0.9.2 type-compliance, v0.9.3 score-validity, v0.9.4 numpy-vs-ONNX divergence). |
+
+Eight closures (1 HIGH closed in v0.9.2, 1 INFORMATIONAL, 2 deferrals tracked, 1 NEW HIGH scheduled for v0.9.4, 3 round-32 audit findings) plus the integration-items tracking from the NeuroGolf evaluation noted as out-of-scope deployment-side use cases.
+
 ## [0.9.1] - 2026-05-04
 
 D11-onnx (shape-coverage on ONNX edges) ships, deferred from

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 
 [project]
 name = "furqan-lint"
-version = "0.9.1"
+version = "0.9.2"
 description = "Structural-honesty checks for Python, powered by Furqan"
 readme = "README.md"
 requires-python = ">=3.10"

src/furqan_lint/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 """furqan-lint: structural-honesty checks for Python."""
 
-__version__ = "0.9.1"
+__version__ = "0.9.2"
 
 # Explicit public surface declaration. The implicit surface (anything
 # not starting with an underscore at module level) is fragile: any
