Skip to content

fix: stabilize solid-phase numerics (SIGFPE, conditioning, blowups)#22

Open
trymarz wants to merge 11 commits into
mainfrom
fix/solid-phase-numerics-stability
Open

fix: stabilize solid-phase numerics (SIGFPE, conditioning, blowups)#22
trymarz wants to merge 11 commits into
mainfrom
fix/solid-phase-numerics-stability

Conversation

@trymarz

@trymarz trymarz commented Jun 23, 2026

Copy link
Copy Markdown
Owner

Eight atomic fixes for solid-phase numerical stability, plus a ninth that narrows
fix 4. These bugs exist on main independently of DEM coupling — they affect any
case with solid advection, chemistry, or porous regions spanning the full
porosity range [0, 1]. Found during poster-demo development.

  1. solver — floor porosityF in the rhoEqn ddt term (1e-3). The
    diagonal-only rho solve hits a zero diagonal → SIGFPE in solid-full cells.
  2. pyrolysis — floor solid rhoCp at 100 J/(m³·K). Pure-gas cells collapse
    rhoCp→1e-15, giving a ~1e21 TEqn condition number and garbage Ts.
  3. pyrolysis — clamp porosity ≤ 1 after the solve. Advection overshoot >1
    drives rho*(1-porosity)<0, the rhoLoc floor, and a ~1/SMALL species blowup.
  4. pyrolysis — rename geometric masks (whereIs_porLessOne_,
    whereIsNot_porEqlOne_) and add a mass-based hasSolidMass mask.
  5. pyrolysis — iterate the energy solve ≥2× per region evolve (the explicit
    Ts↔Tgas coupling is under-relaxed with a single pass).
  6. pyrolysis — zero heatTransfField only in near-empty interface cells, and
    only on outflow (restore the porosity>0.999 guard, & grad>0).
  7. chemistry — guard the post-integration solid-T update against vanishing
    heat capacity (newCp*solidRho→0 → inf/NaN → SIGFPE).
  8. chemistry — mirror the ±500 K/s dT/dt limiter from the ODE RHS into that
    post-integration update.
  9. pyrolysis (narrows 4) — gate only the chemistry source sRhoSi with
    hasSolidMass, not the advective transport, so solid still advects downstream.

Each has a documented root cause in its commit body and an inline source comment.

Verification: the serial charOnlyMove regression passes (fix 2 bounds a real
Ts = 8.1e12 K → physical; char/porosity bit-stable to 1e-4). Fixes 3/7/8 are
defensive guards on physically-bounded quantities. Full evidence — Part 1, the
updraftDemo main-vs-fix A/B, and the baseline diff-gate — is in the PR comment.

trymarz added 9 commits June 23, 2026 12:27
…IGFPE

The gas continuity equation is a diagonal-only system (only fvm::ddt is
implicit; fvc::div(phi) and Srho are explicit), so OpenFOAM solves it with
the diagonal solver: rho[i] = source[i] / diag[i]. The diagonal is
porosityF*V/deltaT. volPyrolysis::evolvePorosity() floors the gas void
fraction to exactly 0 in solid-full cells, making that diagonal exactly
zero, so diagonalSolver::solve divides by zero -> SIGFPE.

Floor porosityF in the ddt term to 1e-3 so the diagonal stays non-zero.
The explicit Srho*(1-porosityF) source is left on the true porosity field.
…ix conditioning

The solid heat capacity rhoCp = rho*Cp*(1-porosityF). In pure-gas cells
where porosityF=1, (1-porosityF)=0 so rhoCp -> 0; the old SMALL floor left
rhoCp ~1e-15 J/(m3.K). Combined with the immersed-boundary off-diagonal
zeroing at the porous interface, the TEqn diagonal for those cells is
~rhoCp*V/dt ~1e-18 while solid cells carry ~1e3. The DIC preconditioner
inverts that ~1e21 condition number into trillion-scale off-diagonals and
DICPCG 'converges' to garbage Ts (e.g. -51485 K) in 1-2 iterations.

Raise the floor to 100 J/(m3.K). These cells have no energy sources
(chemistry, heat transfer, radiation all zero where 1-porosity=0), so the
floor changes only the conditioning, never the solid-cell physics (real
rhoCp there is 5e5-1e6, far above 100).
…n blowup

The porosity transport equation uses explicit solid advection
(fvc::div(Us, por)). Near the porous interface this can overshoot porosity
slightly above 1.0. When porosityF > 1, rho*(1-porosityF) goes negative and
the rhoLoc floor max(...,SMALL) engages, leaving rhoLoc = SMALL. The
solid-species advection then amplifies by ~1/SMALL, blowing up the field
(observed: Ywood from 1.0 to ~7.7e12 in one timestep at t ~ 0.04 s).

Clamp porosity_ to <= 1.0 immediately after the porosity solve.
…dMass guard

Rename the two geometric mask member fields for clarity:
  whereIs_    -> porLessOne_  (1 where porosityF < 1, porous geometry present)
  whereIsNot_ -> porEqlOne_   (1 where porosityF = 1, pure gas / freeboard)
and the local solveEnergy() gradient whereIsNotGrad -> porEqlOneGrad. The
registered IOobject names ("whereIs"/"whereIsNot", NO_READ/NO_WRITE) and the
heterogeneousPyrolysisModel base-class 'whereIs' parameter are left untouched.

Add a mass-based mask in solveSpeciesMass(): hasSolidMass is 1 where
rho*(1-porosity) > SMALL. The species solve uses rhoLoc = max(rho*(1-por),
SMALL) as the implicit ddt coefficient; when solid has fully advected out of
a cell rhoLoc = SMALL, and the explicit fvc::div(solidFlux_, Yi) term pulls
neighbour Yi values amplified by ~1/SMALL -> blowup. porLessOne_ can't guard
this (it stays 1 wherever porosity < 1, even with no solid mass left), so
gate the YsEqn RHS with hasSolidMass.

Also add a documented placeholder comment in solveEnergy() for an optional
per-cell Ts clamp, to be enabled only if overshoots persist after the other
guards.
…mal coupling

The solid and gas energy equations couple through the explicit
heatTransfField = CONV*(Tgas - Ts)*porLessOne_. solveEnergy() uses Tgas from
the start of the timestep; with only one pass neither field sees the other's
updated value within the timestep, so the coupling is effectively
under-relaxed to the previous timestep's state (same principle as PISO
nCorrectors for p-U coupling).

Run the energy-solve loop at least 2 passes regardless of nNonOrthCorr_,
preserving a larger user-set nNonOrthCorr_ via max().
…, outflow only

Two bugs in the porous-interface heatTransfField treatment in solveEnergy():

Bug A: the porosity guard was commented out, so heatTransfField was zeroed
for ALL interface cells, including those with real solid. This thermally
insulated the interface and blocked heat transfer from gas into the bed.
Restore the guard so zeroing fires only in near-empty cells
(porosity > 0.999).

Bug B: the entry condition used (Us & porEqlOneGrad) != 0, which triggers for
both outflow (solid -> gas, positive) and inflow (gas -> solid, negative).
The interface rescaling is only meaningful for outflow; restrict to > 0.
… capacity

In calculateSourceTerms(), after the ODE integrator returns, the solid
temperature change is dTi = newhi/(newCp*solidRho)*dt. A cell can have
porosityF < 1 (admitted by the reacting-cell criterion) but nearly all solid
mass consumed or advected away, leaving solidRho -> 0 and newCp -> 0 while
newhi stays finite (computed from mass fractions, not absolute mass). The
division then overflows to inf/NaN -> FOAM_SIGFPE. The ODE integrator itself
completes fine; the bug is in the post-integration temperature update.

Guard the division on solidHeatCapacity (newCp*solidRho) > VSMALL; skip the
update when negligible (no solid mass to heat). A one-shot Pout diagnostic
reports the offending cell state the first time the guard fires.
…rature update

derivatives() (the ODE RHS) bounds the reaction heating to |dT/dt| <= 500 K/s
during integration, but the post-integration temperature update in
calculateSourceTerms() applied no such bound. The same physical process was
limited DURING integration and unbounded AFTER: with a small sub-step dt and
large newhi, dTi could reach hundreds of K, far exceeding the 500*dt the ODE
internally allows.

Apply max(min(dTi, 500*dt), -500*dt) after computing dTi, making the
temperature bounding symmetric between the ODE internal steps and the
post-integration update.
…id transport

The hasSolidMass mask introduced in 636da23 multiplied the entire species
RHS, including the advective transport term -fvc::div(solidFlux_, Yi). Because
the mask is 0 in any cell that does not already hold solid, it zeroed the
inflow as solid advected into fresh downstream cells, annihilating it instead
of transporting it. Per-commit isolation on charOnlyMoveCases/serial (a pure
solid-transport case, no chemistry) showed char's domain integral collapsing
to a hard zero by t=0.75 s, versus the gradual transport-out on main.

Narrow the mask to the chemistry source term only (hasSolidMass * sRhoSi),
leaving advection ungated. The advection blowup the mask once also guarded
against (porosity>1 -> rhoLoc<0 -> floored -> amplified by 1/rhoLoc) is already
removed upstream by the porosity<=1 clamp in evolvePorosity() (e753f6c).
@trymarz trymarz self-assigned this Jun 23, 2026
Regenerated charOnlyMoveCases/serial reference from the fix-tip build. Diff-gated
against the previous baseline (rtol 1e-4):
  - regressionSolidSpeciesIntegrals (char/ash/hum/rubberwood): unchanged
  - regressionPorosityAverage / regressionPorosityMin: unchanged
  - regressionTsMax:     8.10e12 K -> ~226-264 K
  - regressionTsAverage: 1.55e11 K -> ~3-26 K
  - regressionTsMin:     -2013 K   -> -1.0 K
Only the Ts dead-cell artifacts changed; char transport and porosity are
bit-stable. The bounded Ts is the intended effect of the rhoCp-floor and
heat-capacity guards (PR #22).
@trymarz trymarz marked this pull request as ready for review June 24, 2026 04:23
@trymarz trymarz requested a review from pjzuk June 24, 2026 04:25
@trymarz

trymarz commented Jun 24, 2026

Copy link
Copy Markdown
Owner Author

Verification detail

State exactly what was observed — build ≠ regression ≠ correct physics.

Expanded root causes (the 9 changes)

  1. solver — gas continuity is a diagonal-only system solved by the diagonal
    solver (rho = source/diag); evolvePorosity() floors the gas void fraction to
    0 in solid-full cells, zeroing the diagonal → SIGFPE. The floor (1e-3) is in
    the ddt term only; the explicit (1-porosityF) source stays on the true field.
  2. pyrolysis — in pure-gas cells (1-porosityF)=0 collapses rhoCp→1e-15;
    combined with the immersed-boundary off-diagonal zeroing this gives a ~1e21 TEqn
    condition number and DICPCG "converges" to garbage Ts (e.g. −51485 K). These
    cells have no energy sources, so the floor changes only conditioning, never
    solid-cell physics (real rhoCp is 5e5–1e6).
  3. pyrolysis — explicit solid advection can overshoot porosity>1; then
    rho*(1-porosity) goes negative, the rhoLoc floor engages, and species
    advection amplifies by 1/SMALL (originally observed Ywood~1e12 at t0.04 s).
  4. pyrolysisporLessOne_ is geometric (1 wherever porosity<1, even with no
    solid mass left); the new hasSolidMass mask is 0 once solid has advected/burned
    out. Only the registered IOobject name symbols were renamed.
  5. pyrolysis — the Ts↔Tgas coupling through heatTransfField is fully
    explicit; one pass leaves it under-relaxed to the previous timestep (same
    principle as PISO nCorrectors). A larger user-set nNonOrthogonalCorrectors is
    preserved.
  6. pyrolysis — restore the porosity>0.999 guard that had been commented out
    (so the interface is no longer thermally insulated) and zero only on outflow
    (& grad>0 instead of != 0).
  7. chemistry — a reacting cell can have porosity<1 but ~zero solid mass, so
    newCp*solidRho→0 while reaction heat stays finite → newhi/(newCp*solidRho)
    overflows → SIGFPE. Skip when solidHeatCapacity ≤ VSMALL.
  8. chemistry — bounds reaction heating symmetrically during and after
    integration.
  9. pyrolysis (narrows 4) — the first fix 4 gated the entire species RHS
    (source + fvc::div(solidFlux_, Yi)), which annihilated solid as it advected
    into still-empty downstream cells (charOnlyMove showed char → hard zero). The
    advection blowup that gating also guarded is already removed upstream by fix 3's
    porosity≤1 clamp, so the source-only gate is sufficient.

Part 1 — serial regression (charOnlyMoveCases/serial) — PASS

Ran on the foam machine, full clean build per step. charOnlyMove is pure solid
transport (Us=0.01 m/s advects char out the outlet, no active reaction), so the
correct signal is char's domain integral decaying gradually.

  • Per-commit isolation showed fix 2 (minRhoCp) is the decisive Ts win (bounds Ts
    8.1e12 → 18.7 K, char bit-identical to main), and that the first fix 4 was a
    regression (char annihilated to 0) — which is why commit 9 narrows it.
  • At the fix tip: char transport restored, matches main to 5+ sig figs at every
    step
    ; Ts max bounded at 257.7 K (vs main 8.10e12). The rise above fix 2's
    18.7 K is fix 6 deliberately re-enabling interface heat exchange, not a regression.

Part 2 — DEM + chemistry coupling

DEM smoke pass (fix tip). Built --yade and ran the registered DEM suite
(applications/test/regression/Allrun.yade --short, 200 Yade steps).
DEM_UsInterp_solidU: RUN FINISH, DEM coupling: active, both ranks finished,
FOAM_SIGFPE trapping enabled and never tripped, fields bounded (rho 1.139–1.157,
continuity ~1e-6). Chemistry on + volPyrolysis active + 2-proc Yade coupling, no
FPE / no blow-up. (test2 and MicroTGA-DEM abort in FoamCoupling::StartFoamSolver()
at MPI init — an OpenMPI back-to-back-launch limitation that reproduces on main,
not a regression. --short reaches only t≈0.0015 s.)

updraftDemo A/B (main vs fix tip, endTime 0.06 s) — bounded on both sides.
I ran the full DEM+chemistry case main-vs-fix to t≈0.06 s under three timestep
regimes, expecting to witness the documented porosity-overshoot blow-up on main.
It did not reproduce in any of themmain and the fix tip both reach End
with bounded fields:

regime gas Courant max main result fix-tip result
native adaptive (maxDeltaT 1e-3) 0.44 End, porosity max 1, Ywood max 0.999 identical, bounded
fixed deltaT 1e-6, adjustTimeStep no (tiny) End, Ywood max 1.00002 identical, bounded
raised maxDeltaT 5e-3 (so maxCo=1 binds) 1.04 End @ t=0.0593, porosity max 1, Ywood max 0.999, no FPE identical, bounded

In the shipped updraftDemo definition the solid advection never drives porosity
above 1 even at Courant ≈ 1, so fix 3's clamp never engages and the 1/SMALL species
amplification never fires. The original Ywood→7.7e12 @ t≈0.04 s observation came
from a poster-development variant that is not reproducible from the committed
case at this window.

What this means: the demonstrated catch is Part 1 — fix 2 bounds a real
Ts = 8.1e12 K blow-up to physical values. Fixes 3, 7, 8 are retained as defensive
guards: each clamps a physically-bounded quantity (porosity ≤ 1, finite heat
capacity, bounded dT/dt) at a site where the unclamped path is mathematically
capable of the overflow, even though the shipped cases stay on the safe side of it.
Part 1 confirms they don't perturb results (char + porosity bit-stable to 1e-4).

Baseline regeneration — DONE (diff-gated)

Regenerated charOnlyMoveCases/serial/reference/postProcessing/ from the fix-tip
run, diff-gated against the previous committed baseline (rtol 1e-4):

metric old (junk) new (bounded) verdict
regressionSolidSpeciesIntegrals (char/ash/hum/rubberwood) OK, unchanged
regressionPorosityAverage / regressionPorosityMin OK, unchanged
regressionTsMax 8.10e12 K ~226–264 K changed (junk→bounded)
regressionTsAverage 1.55e11 K ~3–26 K changed (junk→bounded)
regressionTsMin −2013 K −1.0 K changed (junk→bounded)

Only the Ts dead-cell junk changed; char transport and porosity are bit-stable
within 1e-4.

Notes for reviewers

  • Only the C++ mask symbols were renamed (fix 4). The registered IOobject name
    strings "whereIs"/"whereIsNot" (NO_READ/NO_WRITE, internal) were left
    unchanged — renaming a registered field name is an out-of-scope behaviour change.
  • heterogeneousP1's own whereIs_ members and the base-class whereIs parameter
    are intentionally untouched.
  • No DIAG debug prints remain in the touched files (verified by grep).

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in my opinion, this is not the best modification. It should fail in the unlikely case of porosityF=0. The porosityF must be prevented to become 0 elsewhere, eg. in volPyrolysis.C,
The point is that there is no point in solving any of gas equations in such cells.
We deliberately keep away from porosityF=0.

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can ensure porosityF is larger in zero in volPyro then


//- Masks for equations
volScalarField whereIs_;
volScalarField porLessOne_; // 1 where porosityF < 1 (porous geometry present)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you want to modify names, do it for better:
porosityLessThanOne_

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I started with that but switched for shorter ;). We can roll back to longer one

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should not be necessary, as we should evolve only chemistry where the porous medium actually, really is before the displacement. No porous media - no chemical sources. The guardian is a good thing, but should never fire.

fvm::ddt(rhoLoc,Yi)
==
sRhoSi
hasSolidMass * sRhoSi

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sRhoSi, should not be necessary here, because sRhoSi or RRs(i) should be inactive in such cells.

@@ -148,7 +164,7 @@ void volPyrolysis::solveSpeciesMass()

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think this is the place where porosityF should be moved.

@@ -148,7 +164,7 @@ void volPyrolysis::solveSpeciesMass()

for (label i = 0; i < Ys_.size(); ++i)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

after motion of porosity only, we can renormalize mass fractions

// garbage Ts (e.g. -51485 K) in 2 iterations. A 100 floor
// lifts the isolated diagonal to ~0.2, dropping the
// condition number to ~1e4. These cells have no energy
// sources (chemistry/htf/radiation are all zero where

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this might be all right, but it suggests that there are non 0 fluxes and sources where they should be straight 0

Resolve pjzuk's review by fixing root causes instead of guarding symptoms
downstream, and fix the charOnlyMove energy-conservation violation that shares
the same root cause.

- rhoEqn.H / evolvePorosity (comments 1, 5): drop the `max(porosityF,porFloor)`
  floor in the gas-continuity ddt. Keep porosityF strictly in (1e-3, 1] at its
  source in evolvePorosity(), and remove the contradictory `porosity<1e-4 -> 0`
  set that was driving the diagonal to zero (SIGFPE) in the first place.

- preEvolveRegion (comments 3, 4): mark reacting cells by real solid mass
  (rho*(1-porosity) > 0), not geometric porosity<1. Chemistry sources are now
  zero wherever there is no porous medium, so the chemistry-source gate in
  solveSpeciesMass() is redundant and reverts to plain sRhoSi; the ODE solid-T
  guard becomes unreachable (kept as a defensive backstop, diagnostic Pout
  removed). dT/dt limiter retained.

- heatTransfer (comment 7 + charOnlyMove energy leak): gate the gas<->solid
  coupling on real solid mass instead of the geometric mask. A fresh/empty cell
  carrying a dead-cell sentinel Ts no longer equilibrates against the gas and
  injects spurious negative energy (observed: gas T and solid Ts both dragged to
  ~200 K, below the 300 K inlet, with heatTransfer ~ -1.45 MW/m^3 while Sh=0).
  Removes the porosity>0.999 heatTransfField=0 band-aid and dead trial blocks.

- rename masks (comment 2): porLessOne_ -> porosityLessThanOne_,
  porEqlOne_ -> porosityEqualOne_ (C++ symbols only; registered field names
  "whereIs"/"whereIsNot" left unchanged).

Verification: in-container build is unavailable (no OpenFOAM); changes are
prepared for a foam-machine run via pr22-verify-work/verify-pr22.sh. Not yet
run; baselines to be regenerated after the energy gates pass.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants