Skip to content

Stokes_Constrained parallel correctness: gauge, convergence, knockout, rotation-gauge#265

Open
lmoresi wants to merge 3 commits into
developmentfrom
bugfix/stokes-constrained-parallel
Open

Stokes_Constrained parallel correctness: gauge, convergence, knockout, rotation-gauge#265
lmoresi wants to merge 3 commits into
developmentfrom
bugfix/stokes-constrained-parallel

Conversation

@lmoresi

@lmoresi lmoresi commented Jun 21, 2026

Copy link
Copy Markdown
Member

Parallel-correctness work for SNES_Stokes_Constrained (the in-saddle Lagrange-multiplier free-slip solver), bringing the constrained free-slip solve to round-off partition-independence.

What's here

1. Auto pressure gauge. Guarded auto_pressure_gauge (default on) pins the surface-mean pressure on the first constraint boundary when a pressure null space is active and the user set no gauge. Physics-neutral; raw mean pressure 10.5%→0.4% at np=8, velocity bit-identical. Topography is read gauge-invariantly via topography(reference="mean") — the raw multiplier keeps its own independent gauge freedom the pressure pin does not touch.

2. Solver convergence. The grouped u | [p,h] Schur preconditioner uses inner iterative sub-solves = a variable preconditioner; plain gmres is invalid for that and false-converged (preconditioned norm reported CONVERGED while the true residual blew up), landing on a wrong, partition-dependent answer. Fixed with ksp_type=fgmres, ksp_norm_type=unpreconditioned, a tightened Eisenstat–Walker default, and a defensive ksp_max_it. Now genuinely converges (true resid ~1e-13) and reproduces the Zhong benchmark.

3. Knockout audit + LU guard. Confirmed the interior-multiplier knockout is lossless (the earlier "2–5% off" was iterative conditioning, not lost physics). Added a warning that a monolithic direct solve (pc_type lu/cholesky) of the constrained saddle point is a serial diagnostic only.

4. Rigid-rotation gauge fix (the main new result). On a fully free-slip velocity the rigid rotations are a true nullspace of the velocity block (A_uu·rotation = 0). The monolithic nullspace attached for the solve is not removed inside the fieldsplit/Schur iteration — the inner velocity KSP has no rotation nullspace, so it leaves an unconstrained, partition-dependent rigid rotation in the velocity. The operator is blind to it (the true residual still converges to machine precision), so the tangential velocity differed serial-vs-parallel by ~0.1–1% while the physical (radial) flow was already clean. SNES_Stokes_SaddlePt.solve now projects the velocity rotation modes out of the converged solution via MatNullSpaceRemove (_remove_velocity_rotation_gauge) — the velocity analogue of the constant-pressure gauge. Because the rotation is a genuine nullspace, removing it does not change the residual and yields the same rotation-free solution on any decomposition.

Validated on the constrained free-slip problem: tangential surface velocity serial-vs-np8 relative difference 1.2% → 2.5e-14, iteration count unchanged. New serial guard tests/test_1065_rotation_gauge_freeslip.py (the existing test_1063 cases all pin the rotation nullspace with a Dirichlet BC, so they did not cover this path).

Known PETSc limitation (documented, not fixed here)

add_natural_bc on an internal surface assembles a slightly partition-dependent load: DMPlexComputeBdResidual_Single_Internal attributes each interior facet's integral to a partition-dependent support[0] cell closure, so a seam facet's non-owned closure DOFs are dropped at global assembly (F0 differs ~3e-4). The refined TODO(BUG) records this and the workarounds that were tested and ruled out (owner-only label strip, 1-cell overlap, codim-1 submesh). Once the rotation gauge is removed, this leaves only ~3e-6 in the velocity (the elliptic solve smooths the localized load error); applying the internal traction as a volume body force makes it round-off.

Tests

  • test_1064 (constrained spherical shell) + test_1010 (Cartesian Stokes): 9 passed, 2 pre-existing xfails.
  • test_1063 (parallel constrained free-slip, np=2): passed.
  • test_1065 (new rotation-gauge guard): passed.

Underworld development team with AI support from Claude Code

lmoresi added 3 commits June 20, 2026 12:33
… audit

Three fixes from a focused parallel-correctness investigation of the in-saddle
Lagrange-multiplier free-slip solver (Stokes_Constrained), validated against the
Zhong (2008) spherical-shell response (3-D, SphericalShellInternalBoundary).

Item 1 — consistent pressure gauge (raw-field reproducibility).
On an enclosed constrained problem the constant pressure and constant multiplier
are independent gauge freedoms; the solver lands on a partition-dependent level
for each, so raw pressure/multiplier are not reproducible across ranks. Added a
guarded automatic pressure gauge (auto_pressure_gauge, default on) that pins the
surface-mean pressure on the first constraint boundary when a pressure null space
is active and the user has registered no gauge of their own. It is physics-
neutral (velocity bit-identical) and fixes the raw mean pressure (10.5% -> 0.4%
serial vs np=8). The pressure pin does NOT fix the raw multiplier (separate gauge
freedom): dynamic topography is read gauge-invariantly via
topography(reference="mean"). New parallel regression in test_1063.

Item 2 — the grouped-Schur solve was false-converging.
The grouped u | [p,h] Schur preconditioner uses inner iterative sub-solves (a
variable/nonlinear preconditioner). A non-flexible Krylov method (the base's
gmres) is invalid for that and false-converged: it reported CONVERGED_RTOL in ~2
iterations on the preconditioned norm while the TRUE residual ||r||/||b|| blew up
to ~1e3, landing on a wrong, partition-dependent answer (the ~0.4% velocity
spread and the failure to reproduce the Zhong response). Fixed with four defaults
in Stokes_Constrained: ksp_type=fgmres (flexible Krylov), ksp_norm_type=
unpreconditioned (honest true-residual stopping test), Eisenstat-Walker
rtol0=rtolmax=tolerance*0.1 (EW's default 0.3 capped the linear solve at one
iteration), and a bounded ksp_max_it. The solve now genuinely converges (true
residual ~1e-13) and reproduces Zhong (Ut within 1%, Ub within 0.1%); new
test_1064 default-constrained test asserts this. Velocity partition spread
0.36% -> 0.12%.

The residual 0.12% is a separate, pre-existing bug: add_natural_bc on an INTERNAL
surface assembles a partition-dependent load (the force *function* integrates
identically to 1e-16, but the assembled vector differs 0.027% because interior
facets at partition seams lack their neighbour support cell on the non-overlapped
assembly DM). Marked with TODO(BUG) at the natural-BC assembly; filed in the
planning file. Universal (affects Nitsche too), amplified here by the augmented
conditioning.

Item 3 — interior-multiplier knockout audit.
Confirmed the knockout is genuinely lossless (a converged solve with it disabled
moves the answer ~1e-8 in velocity, no interior-h blow-up); the previously
observed "2-5% when off" is an iterative-conditioning artifact, not lost physics.
Rewrote the docstrings accordingly. Added a guard that warns a monolithic direct
solve (pc_type lu/cholesky) of the constrained saddle point is a serial
diagnostic only (wrong response + segfaults in parallel).

Underworld development team with AI support from Claude Code
…p solution

For a free-slip (non-Dirichlet) velocity the rigid-body rotations are a true
nullspace of the velocity block (A_uu·rotation = 0). The monolithic nullspace
attached for the solve is not removed inside the fieldsplit/Schur iteration: the
inner velocity KSP has no rotation nullspace, so it leaves an unconstrained rigid
rotation in the velocity. The operator is blind to it (true residual still
converges to machine precision), but the rotation amplitude is partition-
dependent — the tangential velocity differed serial-vs-parallel by ~0.1–1% even
at a converged residual, while the physical (radial / gauge-invariant) flow was
already partition-clean.

SNES_Stokes_SaddlePt.solve() now projects the velocity rotation modes out of the
converged solution via MatNullSpaceRemove (_remove_velocity_rotation_gauge).
Because the rotation is a genuine nullspace, removing it does not change the
residual and yields the same rotation-free solution on any decomposition. This
is the velocity analogue of the constant-pressure gauge (set_pressure_gauge).
Surgical: only the velocity rotation modes (zero in the pressure/multiplier
blocks), so the existing pressure/multiplier gauge handling is untouched; a
no-op when there are no rotation modes (e.g. Dirichlet velocity BCs).

Validated on the SphericalShellInternalBoundary constrained free-slip problem:
the tangential surface velocity serial-vs-np8 relative difference drops from
1.2% to 2.5e-14, with the iteration count unchanged. Regression clean:
test_1064 + test_1010 (9 passed, 2 pre-existing xfails) and test_1063 parallel
(np=2) all pass.

Also refines the TODO(BUG) at the natural-BC assembly loop to document the
underlying PETSc interior-facet support[0] partition-dependence and the
workarounds that were tested and ruled out.

Underworld development team with AI support from Claude Code
Adds a fast serial tier-a/level-1 test that solves a fully free-slip annulus
(both boundaries constraint BCs, no Dirichlet velocity BC -> rigid-rotation
nullspace active) driven by a purely radial body force, and asserts the
converged velocity carries no rigid-rotation component. The existing test_1063
cases all pin the rotation nullspace with a Dirichlet BC, so they did not cover
this path; this test fails without _remove_velocity_rotation_gauge (the rotation
coefficient is ~1e-3) and passes with it (~4e-15).

Underworld development team with AI support from Claude Code
Copilot AI review requested due to automatic review settings June 21, 2026 12:13

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves parallel correctness and solver robustness for the constrained free-slip Stokes solver (SNES_Stokes_Constrained / SNES_Stokes_SaddlePt) by addressing gauge freedoms (pressure and rigid-rotation), fixing false convergence in the grouped Schur fieldsplit configuration, and adding targeted regression tests to lock in partition-independent behavior.

Changes:

  • Add guarded automatic pressure gauge pinning for enclosed constrained problems (auto_pressure_gauge) to make raw mean pressure partition-reproducible.
  • Switch default outer Krylov/convergence settings for constrained solves to a flexible method with true-residual norm monitoring (FGMRES + unpreconditioned norm) and tighten EW behavior.
  • Project rigid-body rotation modes out of the converged velocity to remove a partition-dependent rotation gauge; add regression coverage (serial + MPI).

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
src/underworld3/systems/solvers.py Adds constrained-solver defaults (fgmres/unpreconditioned norm/EW tightening) and guarded automatic pressure gauge installation.
src/underworld3/cython/petsc_generic_snes_solvers.pyx Adds post-solve rigid-rotation gauge removal plus extensive notes on internal-surface natural BC partition dependence.
tests/test_1065_rotation_gauge_freeslip.py New serial regression test asserting the free-slip solution has ~0 projection onto the rigid-rotation mode.
tests/parallel/test_1063_constrained_freeslip_parallel.py Adds an MPI regression for raw pressure gauge reproducibility and updates the CLI recomputation helper.
tests/test_1064_constrained_spherical_shell_response.py Adds a new test ensuring the default constrained path matches Zhong benchmark velocity response.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +2201 to +2206
self._tolerance = value
self.petsc_options["snes_rtol"] = value
self.petsc_options["ksp_rtol"] = value * 1.0e-1
self.petsc_options["ksp_atol"] = value * 1.0e-6
self.petsc_options["snes_ksp_ew_rtol0"] = value * 1.0e-1
self.petsc_options["snes_ksp_ew_rtolmax"] = value * 1.0e-1
Comment on lines +5565 to +5568
if getattr(self, "_velocity_rotation_nullspace", None) is not None:
return self._velocity_rotation_nullspace
if not self._petsc_velocity_nullspace_basis:
return None
Comment on lines +5605 to +5607
rot_ns = self._build_velocity_rotation_nullspace()
if rot_ns is not None:
rot_ns.remove(gvec)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants