Fix get_parent same-time-tag bug in LineageClustering #383

Open
DonNabla wants to merge 3 commits into main from fix/get-parent-same-time-tag

@DonNabla

Fix get_parent same-time-tag bug in LineageClustering

Summary

Tie-break by spatial proximity to the daughter's creation position when multiple parent interactions in lineage_cluster.get_parent share the same time tag.

For fast γ events that Compton-scatter at one site and photoabsorb at another within G4's sub-ns time-tag precision (~0.1 ns of travel time, bit-identical float64 t), the original time-cut + array-last logic returned the last same-time parent step in array order, which can be the wrong vertex. The Compton e⁻ daughter at site A then inherited the lineage of the γ's photoabsorption step at site B, collapsing the two physical scatter sites into a single cluster.

What's affected

Empirically the bug only flips event-level SS/MS classification for single-γ-decay isotopes (K40, Co60). Cascade chains (Th, U, Ra) and β-only sources (Pb212 with nucleusLimits excluding daughter chain) are unaffected, because either (a) cascade γ's have distinct decay-time tags so get_parent's time-cut already disambiguates, or (b) there's no γ Compton+phot pair to trigger the same-time-tag tie in the first place.

For K40/Co60, ~65 % of physically multi-scatter events end up reclassified as single-scatter, biasing the MS templates by ~50 % vs WFsim.

Validation

5 isotopes × 20 jobs × 100k primaries each (InnerCryostatFlange for K40/Co60/Th232/Ra226; WholeLXe for Pb212; SR0 Fermi-Dirac SS/MS classifier):

isotope | topology                     | vanilla MS-ratio vs WFsim 2026 | patched MS-ratio vs WFsim 2026
K40     | 1.46 MeV single γ            | 0.385                          | 1.006
Co60    | 1.17+1.33 MeV 2-γ cascade    | 0.351                          | 0.998
Th232   | chain cascade γ's            | 0.997                          | 0.996
Ra226   | α + 186 keV γ + chain        | 1.002                          | 1.000
Pb212   | intrinsic β (no γ)           | 1.000                          | 1.000 (bit-identical h5)

Single-event spot check (K40 g4id=39232):

             | site A (-35.6, -21.8, -0.2) | site B (-33.2, -18.4, -0.6) | alt_cs2_wo_timecorr | SS/MS
WFsim epix   | 423.27 keV                  | 1036.87 keV                 | 154,972 PE          | MS
Vanilla fuse | 0.06 keV                    | 1460.76 keV                 | 128 PE              | SS ✗
Patched fuse | 423.95 keV                  | 1036.87 keV                 | 189,862 PE          | MS ✓
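
As a rough consistency check on the "~0.1 ns" scale quoted in the summary, the snippet below computes the separation of the two sites and the corresponding light travel time. It assumes the coordinates above are in cm, which the table does not state, so treat the numbers as indicative only.

```python
import math

# Site coordinates copied from the spot-check table.
# ASSUMPTION: units are cm (not stated in the table).
site_a = (-35.6, -21.8, -0.2)
site_b = (-33.2, -18.4, -0.6)

C_CM_PER_NS = 29.9792458  # speed of light in cm/ns

# Euclidean distance between the two scatter sites.
separation_cm = math.dist(site_a, site_b)

# Time a photon needs to cover that distance.
travel_time_ns = separation_cm / C_CM_PER_NS

print(f"separation  = {separation_cm:.2f} cm")
print(f"travel time = {travel_time_ns:.3f} ns")
```

Under that assumption the sites are about 4.2 cm apart, i.e. roughly 0.14 ns of photon travel time, which is the regime where both parent steps can end up with the same time tag.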

The diff

# Among parents with t <= particle.t, pick the latest time and then
# tie-break by spatial proximity if multiple steps share that latest time.
candidates = parent_interactions[parent_interactions_time_cut]
candidate_lineages = parent_lineages[parent_interactions_time_cut]
t_max = candidates["t"].max()
same_time_mask = candidates["t"] == t_max
if same_time_mask.sum() == 1:
    idx = int(np.where(same_time_mask)[0][0])
    return candidates[idx], candidate_lineages[idx]

same_time = candidates[same_time_mask]
same_time_lineages = candidate_lineages[same_time_mask]
distances = np.sqrt(
    (same_time["x"] - particle["x"]) ** 2
    + (same_time["y"] - particle["y"]) ** 2
    + (same_time["z"] - particle["z"]) ** 2
)
nearest = int(np.argmin(distances))
return same_time[nearest], same_time_lineages[nearest]

The existing nearest-in-time fallback (when no parent step satisfies t <= particle.t) is preserved unchanged.
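
To make the behaviour change concrete, here is a minimal, self-contained sketch of the tie-break on toy data. It is not the fuse code itself: `pick_parent_step`, the reduced dtype, the time values, and the integer lineage ids are all hypothetical stand-ins, and only the candidates that already passed the time cut are modelled.

```python
import numpy as np

# Toy step record; the real fuse arrays carry more fields.
step_dtype = [("t", "f8"), ("x", "f8"), ("y", "f8"), ("z", "f8")]


def pick_parent_step(candidates, candidate_lineages, particle):
    """Pick the latest-time candidate; break ties by distance to `particle`."""
    t_max = candidates["t"].max()
    same_time = np.flatnonzero(candidates["t"] == t_max)
    if same_time.size == 1:
        idx = int(same_time[0])
        return candidates[idx], candidate_lineages[idx]

    tied = candidates[same_time]
    # Squared distance is enough for argmin; no sqrt needed.
    d2 = (
        (tied["x"] - particle["x"]) ** 2
        + (tied["y"] - particle["y"]) ** 2
        + (tied["z"] - particle["z"]) ** 2
    )
    idx = int(same_time[np.argmin(d2)])
    return candidates[idx], candidate_lineages[idx]


# One earlier step plus two steps sharing the same time tag (sites A and B).
parent_steps = np.array(
    [
        (0.5, 0.0, 0.0, 0.0),
        (1.0, -35.6, -21.8, -0.2),  # site A: Compton scatter
        (1.0, -33.2, -18.4, -0.6),  # site B: photoabsorption
    ],
    dtype=step_dtype,
)
lineages = np.array([9, 10, 11])  # hypothetical lineage ids

# Compton electron created at site A with the same time tag.
electron = np.array([(1.0, -35.6, -21.8, -0.2)], dtype=step_dtype)[0]

array_last_lineage = lineages[-1]  # what "last in array order" would pick
step, lineage = pick_parent_step(parent_steps, lineages, electron)
print(array_last_lineage, lineage)
```

In this toy event the array-last choice attaches the electron to the site B lineage, while the spatial tie-break attaches it to site A.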

Alternatives considered

I also prototyped and validated two position-free alternatives:

  • Process-match: daughter.creaproc == parent.edproc (G4's causal mapping). Closes ~85 % of the gap.
  • Precompute hybrid: {trackid: parent_step_index} map computed once per event using process-match → spatial fallback, making get_parent O(1). Also ~85 %.

Both are causally explicit but converge on the same coverage. The spatial tie-break in this PR catches the additional ~15 % of edge cases (daughters whose creaproc isn't directly in the parent's edproc set, e.g. secondary brems and cascaded photo-absorptions). A true hybrid (process-match prefilter on the same-time-max candidate set with spatial tie-break inside) is the principled long-term refinement — worth a follow-up PR but not blocking.
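
The hybrid mentioned above could look roughly like the following sketch: restrict to the latest-time candidates, prefer those whose `edproc` equals the daughter's `creaproc`, and fall back to all latest-time candidates when nothing matches, with the spatial tie-break applied in either case. The function name `pick_parent_hybrid`, the reduced dtype, and the process strings are illustrative assumptions, not part of this PR.

```python
import numpy as np

# Toy step record with the process that ended the step (edproc).
step_dtype = [
    ("t", "f8"), ("x", "f8"), ("y", "f8"), ("z", "f8"), ("edproc", "U16"),
]


def pick_parent_hybrid(candidates, daughter_pos, daughter_creaproc):
    """Return the index into `candidates` of the chosen parent step."""
    # 1) Keep only the steps sharing the latest time tag.
    latest = np.flatnonzero(candidates["t"] == candidates["t"].max())
    # 2) Process-match prefilter: parent edproc == daughter creaproc.
    matched = latest[candidates["edproc"][latest] == daughter_creaproc]
    # 3) Fall back to all latest-time steps if no process matches.
    pool = matched if matched.size > 0 else latest
    # 4) Spatial tie-break inside the pool.
    d2 = (
        (candidates["x"][pool] - daughter_pos[0]) ** 2
        + (candidates["y"][pool] - daughter_pos[1]) ** 2
        + (candidates["z"][pool] - daughter_pos[2]) ** 2
    )
    return int(pool[np.argmin(d2)])


steps = np.array(
    [
        (1.0, 0.0, 0.0, 0.0, "compt"),
        (1.0, 3.0, 0.0, 0.0, "phot"),
        (1.0, 0.5, 0.0, 0.0, "phot"),
    ],
    dtype=step_dtype,
)

# Process match wins even though step 2 is spatially closer.
idx_compt = pick_parent_hybrid(steps, (0.4, 0.0, 0.0), "compt")
# Two process matches: the spatial tie-break picks the nearer one.
idx_phot = pick_parent_hybrid(steps, (0.4, 0.0, 0.0), "phot")
# No process match: pure spatial fallback over all latest-time steps.
idx_brem = pick_parent_hybrid(steps, (2.9, 0.0, 0.0), "eBrem")
print(idx_compt, idx_phot, idx_brem)
```

Whether the process match should be allowed to override a spatially closer step, as it does in the first call, is a design choice that would need validating against the same isotope set.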

Test plan

  • Existing fuse unit tests pass
  • Spot-check K40 g4id=39232 produces the 423/1037 keV split documented above
  • Cross-isotope MS-rate ratios vs WFsim land within statistics of the table above on a re-run

[Image: screenshot (2026-05-10), ces_sum_comparison]

@DonNabla DonNabla requested review from HenningSE and cfuselli May 12, 2026 13:15
Tie-break by spatial proximity to the daughter's creation position when
multiple parent interactions share the same time tag.

For fast gammas that Compton-scatter at one site and photoabsorb at
another within G4's sub-ns time-tag precision (~0.1 ns / 3 cm), the
original time-cut + array-last logic returned whichever same-time
parent step happened to be last in the array. This mis-attributed the
Compton e- daughter at site A to the gamma's photoabsorption vertex at
site B, collapsing the two scatter sites into one cluster.

Net effect on single-gamma decays (e.g. K40, Co60):
~65 % of physically multi-scatter events get reclassified as single-
scatter at the recon stage, biasing the MS templates by ~50 %.
Chain isotopes (Th232, Ra226, etc.) and intrinsic-β isotopes (Pb212
with nucleusLimits excluding daughter chain) are unaffected because
their secondaries do not produce same-time-tag γ Compton+phot pairs.

Validated across 5 isotopes (20 jobs × 100k primaries each):

isotope | vanilla MS-ratio | patched MS-ratio | vs WFsim 2026
K40     | 0.385            | 1.006            | reference 1.000
Co60    | 0.351            | 0.998            | reference 1.000
Th232   | 0.997            | 0.996            | reference 1.000
Ra226   | 1.002            | 1.000            | reference 1.000
Pb212   | 1.000            | 1.000            | bit-identical h5

The fix preserves the existing nearest-in-time fallback when no parent
step satisfies t <= particle.t.

Full validation report and forensic walkthrough:
  /project/lgrandi/mpierre/high-ER-analysis/Inference_highER/
  template_builder/notebooks/WFsim_fuse_comparison_checks/
    Report_get_parent_patch_validation.md
    Investigate_one_K40_event_stage_by_stage.ipynb
@DonNabla DonNabla force-pushed the fix/get-parent-same-time-tag branch from 25c4017 to 2668d76 Compare May 12, 2026 13:24
@DonNabla

Author

Happy to get the fuse experts' opinion on this change. It is mostly relevant for material ER simulations.

DonNabla and others added 2 commits May 15, 2026 05:34
The docformatter v1.7.7 hook fails on the default Python that pre-commit
picks in CI; pinning language_version restores a working interpreter
(suggested fix from a fuse maintainer).

@HenningSE HenningSE left a comment


Looks good to me, approved
