Skip to content

fix(meshing): SPD-floor the mmpde metric to stop silent NaN-bail on deformed meshes#259

Merged
lmoresi merged 2 commits into
developmentfrom
bugfix/mmpde-spd-floor
Jun 20, 2026
Merged

fix(meshing): SPD-floor the mmpde metric to stop silent NaN-bail on deformed meshes#259
lmoresi merged 2 commits into
developmentfrom
bugfix/mmpde-spd-floor

Conversation

@lmoresi

@lmoresi lmoresi commented Jun 20, 2026

Copy link
Copy Markdown
Member

Problem

_winslow_mmpde's energy/gradient use fractional powers — sqrt(detM), detM**((1-p)/2), and S**q with S = tr(J·M⁻¹·Jᵀ) — that are defined only for an SPD metric.

The metric is a guide field FE-evaluated at a fixed reference cloud. Once the interior deforms, a reference point can fall outside the current mesh and the P1 density is then evaluated by FE extrapolation, going negative. A scalar density ρ<0 gives M = ρ·I with det = ρ² > 0 — so a detM>0 check passes — but M is negative-definite, so S<0 and S**q = NaN. The energy becomes non-finite and the mover bails with zero displacement.

The effect is intermittent and silent: on an adaptive convection loop roughly half the adapts no-op, the mesh under-resolves, and the symptom is easily misread as an adapt/solver coupling problem or "passive-fault damping."

Fix

Project every evaluated metric tensor onto SPD with a small relative eigenvalue floor at the single _eval_M chokepoint:

  • a genuinely SPD metric is returned unchanged (short-circuit no-op, bit-identical on valid cells);
  • extrapolation garbage becomes a benign "coarsen here" (tiny positive eigenvalues) instead of a NaN.

No API change. No behaviour change where the metric is already valid — verified bit-identical on the adapts that previously succeeded.

Validation

  • test_0762_fault_metric_tensor, test_0850_mesh_smoothing, test_0750_meshing_follow_metric, test_0830_mesh_adapt_variable_transfer: 33 passed, 5 skipped (MMG), 4 pre-existing xfail.
  • Stagnant-lid adaptive convection driver (Ra 1e6, res 24): with the fix all forced adapts move, T stays in [0,1], vrms bounded over 52 steps (no-fault) and 32 steps (passive fault). Without it, ~half the adapts NaN-bail and the mesh stops adapting.

Underworld development team with AI support from Claude Code

…eformed meshes

_winslow_mmpde's energy/gradient use fractional powers (sqrt(detM),
detM**((1-p)/2), and S**q with S = tr(J M^-1 J^T)) that are defined only
for an SPD metric. The metric is a guide field FE-evaluated at a FIXED
reference cloud; once the interior deforms, a reference point can fall
outside the current mesh and the P1 density is then evaluated by FE
extrapolation, going negative. A scalar density rho<0 gives M = rho*I
with det = rho^2 > 0 (so a detM>0 check passes) but M negative-definite,
so S<0 and S**q = NaN. The energy is then non-finite and the mover bails
with zero displacement -> adaptation silently stops on roughly half the
adapts of an adaptive convection loop (mesh under-resolves; the symptom
was previously misread as adapt/solver coupling or "passive fault damping").

Project every evaluated metric tensor onto SPD with a small relative
eigenvalue floor at the _eval_M chokepoint. A genuinely SPD metric is
returned unchanged (short-circuit no-op); extrapolation garbage becomes a
benign "coarsen here" instead of a NaN. No behaviour change on valid
metrics (bit-identical), no API change.

Tests: test_0762/0850/0750/0830 green (33 passed, 5 skipped MMG, 4
pre-existing xfail). Validated on the stagnant-lid adaptive convection
driver: all forced adapts move, T stays in [0,1], vrms bounded over 52
steps (no-fault) and 32 steps (passive fault).

Underworld development team with AI support from Claude Code
Copilot AI review requested due to automatic review settings June 20, 2026 04:56

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens the Winslow/MMPDE mesh smoother against non-SPD metric tensors (typically produced by FE extrapolation after mesh deformation) by sanitizing the evaluated metric so fractional-power terms in the MMPDE functional do not generate NaNs and silently stall adaptation.

Changes:

  • Wraps _eval_M with an SPD projection step using eigenvalue flooring to ensure the metric is symmetric positive-definite.
  • Adds detailed in-code rationale describing how non-SPD metrics arise and how the sanitizer prevents NaN-driven no-op adapts.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/underworld3/meshing/smoothing.py Outdated
Comment on lines +3184 to +3196
def _spd_sanitise(M):
Ms = 0.5 * (M + np.swapaxes(M, -1, -2))
w, Vc = np.linalg.eigh(Ms)
if w.size:
wmax = float(np.nanmax(w))
else:
wmax = 1.0
floor = max(wmax, 1.0) * 1.0e-8
if np.all(np.isfinite(w)) and float(w.min()) >= floor:
return Ms # already SPD → no-op
w = np.clip(np.nan_to_num(w, nan=floor, posinf=wmax, neginf=floor),
floor, None)
return np.einsum('nij,nj,nkj->nik', Vc, w, Vc)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in 732557e: added an empty-batch early return (parallel rank with no owned cells no longer hits the zero-size reduction) and made the SPD test per-tensor — only the cells whose own eigenvalues fail are projected, every already-SPD tensor is returned bit-identical. Verified empty-safe + valid cells untouched + all outputs SPD.

Address Copilot review on _spd_sanitise:
- Guard the empty batch (a parallel rank that owns no cells): return
  early before the eigenvalue reductions, which previously raised on a
  zero-size array.
- Make the SPD test per-tensor: only the cells whose OWN eigenvalues are
  non-finite or below the floor are projected; every already-SPD tensor
  is returned bit-identical to the symmetrised input. One bad point can
  no longer perturb otherwise-valid cells (matches the "no-op for SPD"
  contract). Verified: empty input is safe, a single garbage cell leaves
  the rest bit-identical, all output tensors SPD.

Underworld development team with AI support from Claude Code
@lmoresi lmoresi merged commit 4d0069a into development Jun 20, 2026
1 check passed
@lmoresi lmoresi deleted the bugfix/mmpde-spd-floor branch June 20, 2026 05:54
lmoresi added a commit that referenced this pull request Jun 22, 2026
Restores the intended "RBF metric eval" robustness. The mmpde mover's
RBF/Shepard metric path baked the metric via an FE evaluation at the FIXED
reference cloud `ref`; on a deformed mesh that reference can mis-locate / drift
outside the deformed interior and the FE evaluation returns garbage (the P1
density — strictly positive by construction — comes back NEGATIVE, even at a
field's own DOFs). A negative density is a non-SPD metric, which the mover then
either NaN-bails on (a hidden stall) or, with the #259 SPD-floor, acts on as
"coarsen hard here" → giant cells / holes / divergence in the adapted mesh.

Fix: bake the metric at the CURRENT mesh NODES (its own DOF locations) and
Shepard-interpolate from there. A Shepard (positive-weight, convex) average of
the positive nodal values is GUARANTEED >= 0 (monotone) — the metric can never
go non-SPD from the eval — and nodes are always inside the mesh (no
out-of-domain / drift) and need no per-step cell location (fast). `ref` is kept
for the _edge_mats reference frame.

Best paired with #264 (deformed-mesh point-location fix), which makes the nodal
FE bake exact; together the adapted mesh stays clean under forced every-step
adaptation at R=5 (folded=0, cell-area-ratio flat) where it previously tore
holes. Validated in a sibling worktree (this is a one-line source change; CI
runs the full mover suite).

Underworld development team with AI support from Claude Code
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants