What it does
- Builds air shells around a body mesh, precomputes tetrahedral geometry, and scales volumes to unit body volume.
- Assembles an AMG hierarchy on CPU (PyAMG) and solves the magnetostatics subproblem on device using JAX PCG with a per‑component V‑cycle preconditioner (vector potential A) or a scalar V‑cycle (scalar potential U).
- Minimizes full micromagnetic energy with a two‑loop L‑BFGS outer loop across an external‑field schedule to emit M–H loops and VTU snapshots.
- Python with:
jax(x64 enabled),jaxopt,numpy,scipy,pyamg,meshio(for VTU), plus your project modules (geom,energies,optimize,add_shell,amg_pcg).
The code explicitly enablesjax_enable_x64and requires PyAMG to build the hierarchy. - GPU optional: JAX runs on CPU/GPU if available; AMG is built on CPU and converted to JAX sparse (BCOO) before device use.
CPU only environment:
conda env create -f environment_cpu.yml
NVIDIA GPU:
conda env create -f environment_gpu.yml
CPU only environement:
pip install numpy scipy pyamg meshio meshpy jax jaxopt
NVIDIA GPU:
pip install numpy scipy pyamg meshio meshpy jax jaxopt[cuda12]
Create a body mesh before running the solvers. Two backends are available:
- meshpy (TetGen): quality tetrahedralization with target size
h(uses a volume heuristic ≈0.1·h^3). - grid: regular brick grid each split into six tets (works without meshpy).
Box example (60×40×20, oriented with default axes), generate a box with the extension of 60 nm x 40 nm x 20 nm:
python mesh.py --geom box --extent 60,40,20 --h 2.0 --backend meshpy --out-name body
Tip: If you don’t have TetGen/meshpy, switch to
--backend gridto get a quick starter mesh (VTU export still needsmeshio).
This repo also supports an "eye" (Bézier-shaped) extruded prism geometry in the single-solid mesher.
You can generate meshes from the CLI in src/mesh.py or programmatically via run_single_solid_mesher(geom='eye', ...).
- Quick notes:
- Geometry choices supported by the mesher:
box,ellipsoid,eye, andelliptic_cylinder. elliptic_cylinder: an extruded ellipse (cross-section semi-axes a=Lx/2, b=Ly/2) with total thickness Lz along the local z-axis; supply--extent Lx,Ly,Lzas for other geometries.- Backends:
meshpy(TetGen) for high-quality tetrahedral meshes, orgrid(pure‑NumPy brick grid) for fast, dependency‑free mesh generation. The mesher prefersmeshpyif available. - Use
--force-gridto force the grid backend even whenmeshpyis installed (handy for CI or when TetGen isn't desired).
Examples (elliptic_cylinder, run from repository root):
# grid backend (fast, no extra deps)
python3 -c "import sys; sys.path.insert(0, '.'); from src.mesh import run_single_solid_mesher; run_single_solid_mesher(geom='elliptic_cylinder', extent='3.5,2.0,0.01', h=0.8, backend='grid', out_name='smoke_elliptic_grid', no_vis=True, verbose=True)"
# prefer meshpy backend (requires meshpy/TetGen)
python3 -c "import sys; sys.path.insert(0, '.'); from src.mesh import run_single_solid_mesher; run_single_solid_mesher(geom='elliptic_cylinder', extent='3.5,2.0,0.01', h=0.8, backend='meshpy', out_name='smoke_elliptic_meshpy', no_vis=True, verbose=True)"
Compute the demagnetization curve
python loop.py --mesh body.npz
The field direction is set from (hx, hy, hz) and normalized; the sweep runs magnitudes from hstart to hfinal with step hstep (internally in A/m; the .p2 values are given in Tesla and converted by dividing by μ₀). After each minimization, the volume‑average M, the scalar M·ĥ (reported as μ₀ M·H in Tesla), and the normalized energy are appended to <basename>.dat. VTU snapshots are written whenever |Δ(μ₀·M·ĥ)| ≥ mstep.
A body‑only tetrahedral mesh saved as NPZ; the shelling (air layers) is generated in memory. The module reads mesh.npz, looks for a sidecar mesh.p2 (see below), then creates shells using run_add_shell_pipeline(..., K, KL, auto_layers=True), and finally assembles a canonical TetGeom (connectivity, gradients, volumes, material IDs). You do not need to pre‑mesh air. citeturn1search2
A simple INI file next to the mesh, same basename. Sections and keys used:
[mesh]
size = 1.0e-9
K = 1.5
KL = 5.0
[initial state]
mx = 0.0
my = 0.0
mz = 1.0
[field]
hstart = 0.0
hfinal = 0.0
hstep = -0.002
hx = 0.0
hy = 0.0
hz = 1.0
mstep = 0.4
mfinal = -1.2
[minimizer]
tol_fun = 1e-8
tol_hmag_factor = 1.0All keys are parsed robustly; missing entries fall back to sensible defaults.
| key | type | comment | comment |
|---|---|---|---|
| size | float | mesh units in m | |
| KL | float | expansion factor of sample | if omitted defaults from CLI are used, CLI overwrite |
| K | float | expansion factor for air layers | if omitted defaults from CLI are used, CLI overwrite |
| mx,mx,mz | float | initial magnetic state | uniform |
| hstart | float | start value of the field | in Tesla (to convert from A/m multiply with µ0) |
| hfinal | float | final value of the field | in Tesla (to convert from A/m multiply with µ0) |
| hstep | float | field step | in Tesla (to convert from A/m multiply with µ0) |
| hx,hy,hz | float | field direction | will be normalized |
| mstep | float | output frequency | write vtu file when delta µ0M > mstep |
| mfinal | float | µ0M for early stopping | not used |
| tol_fun | float | Optimality tolerance | see Practical Optimization, Gill & Murry; can be overwrite with CLI |
| tol_hmag_factor | float | tolerance for magnetic potential | ≈ tol_hmag_factor * tol_fun^(1/3) |
A whitespace‑delimited text file; each non‑comment line describes one ferromagnetic material group in the mesh:
# theta phi K1 (unused) Js A
1.5708 0.000 5e4 0.0 1.8 1.3e-11
theta, phi– easy‑axis orientation in spherical angles (radians).K1– uniaxial anisotropy constant [J/m³].Js– saturation polarization [T]; converted to Ms = Js / μ₀ [A/m].A– exchange stiffness [J/m]; automatically rescaled by1/size²from the.p2mesh size.
Number of lines required = G − 1, where G = max(geom.mat_id) is the total number of groups (the last group is reserved for air and is auto‑appended with Ms=0, A=0, K1=0). If more lines are present, the extra ones are ignored with a warning.
-
VTU:
<basename>.0000.vtufor the initial state and then<basename>.<step>.vtuduring the sweep. Cell data include integermat_idand vectors M, H, B (M and H are written in SI after multiplying by μ₀). -
DAT:
<basename>.datwith rows:[vtu_id] [μ0*h (T)] [μ0*M·ĥ (T)] [μ0*Mx (T)] [μ0*My (T)] [μ0*Mz (T)] [E_norm]where
ĥis the (unit) field direction from.p2andE_norm = E / E_ref(dimensionless).E_refis the magnetostatic energy of the initial state.
Here’s a concise README section you can drop in to document initial magnetic states and how to resume from a stored state.
loop.py supports several ways to choose the initial magnetization for a run:
-
By default, the code starts from a uniform magnetization taken from the
[initial state]section of the sidecar.p2file:[initial state] mx = 0.0 my = 0.0 mz = 1.0 ini = 0´resume index ini, set it to the number of the start state, = 0 no resume.
-
You can override this on the CLI with:
--ini uniformwhich ignoresmx,my,mzin.p2and uses(0,0,1).or
--ini numberfor restarting from a stored state. (see below)
- For debugging or exploratory runs, you can start from a y–z plane vortex field: '
--ini vortex
During the sweep, whenever a VTU snapshot is written, the solver also saves a checkpoint next to it:
<basename>.<step>.state.npz
This file contains:
- the current magnetization variable (same shape as m-nodes, already unit‑normalized), and
- the current auxiliary potential (A for vector formulation or U for scalar),
- a mode flag ensuring you resume with the same magnetostatics formulation.
You can resume from such a checkpoint in two ways:
- CLI takes precedence: python loop.py --mesh body.npz --ini 0020 # resumes from body.0020.state.npz
- From
.p2(if--iniis not supplied): [initial state]ini = 20 ; resumes from .0020.state.npz where<basename>is the mesh basename (e.g.,body⇒body.0020.state.npz).
What happens on resume
- The solver loads the saved magnetization and the A/U potential and continues the field schedule from your current
.p2. - VTU numbering continues from the resume index (the next VTU will be
<resume_index+1>), and the stored A/U field is used as a warm start for the magnetostatics solve at the next step—this speeds up the first iteration after resuming.
Requirements / caveats
- The checkpoint must match the magnetostatics mode you run with now (
--ms Uor--ms A); cross‑mode resumes are rejected. - The mesh must be the same (same node count/connectivity). Changing the mesh invalidates the checkpoint.
- Checkpoints are written only when a VTU is emitted, which is controlled by the cadence parameter
mstepin.p2. If you stop before a VTU threshold is crossed, there may be no state file to resume from.
The only CLI entry point is now loop.py. (The magnetostatics core is a library module.)
python loop.py --mesh body.npz --krn body.krn [OPTIONS]- Magnetostatics formulation:
--ms {A,U}(aliases:vector,scalar).
A= vector potential with explicit gauge stabilization;U= scalar potential (no gauge). - AMG type:
--amg {sa,rs}(smoothed‑aggregation or Ruge–Stüben). - Gauge (A‑formulation):
--gauge <value>(addsgauge·Ion the finest level), or provide--hand--Lto auto‑computegauge = 0.1*(h/L)^2(overrides--gauge). - Air‑shell grading:
--K,--KL(defaults 1.5, 5.0 if not in.p2). - Initial LBFGS model:
--h0 {gamma,bb,bb1,bb2,diag,block_jacobi,identity}with optional--h0-damping μ(fordiag/block_jacobi). - Line search:
--ls-init {current,max,value,increase}with--ls-init-stepsize,--ls-max-stepsize,--ls-increase. - A/U inner solve controls:
--a-it,--a-nu-pre,--a-nu-post,--a-omega,--a-coarse-it,--a-coarse-omega. - Toggles & prints:
--print-p2,--print-materials,--print-energy,--no-vtu,--no-demag,--debug-lbfgs.
It is recommend to use the magnetic scalar potential --ms U which is the default.
- Prepare shells & geometry from the body mesh. Stores
volume_scalefactor = 1 / (body_volume)and scales element volumes accordingly. - Build AMG hierarchy: A‑mode builds on
(1/μ0)·K + gauge·I(then used per‑component); U‑mode builds onμ0·K(scalar). Converted to JAX‑BCOO tuples plus JacobiDinv; optional coarse Cholesky factor. - Read
.p2: field schedule (Tesla→A/m), initial magnetization, tolerances, cadence. - Read
.krn: buildMs,A_exchange/size²,K1, and per‑element easy‑axis LUT. - Initial energy: solve A or U for the uniform initial state and report Brown energy (density and scaled) and a classical demag estimate using
Nz(density and scaled). - Initial VTU: write
basename.0000.vtuwith cell datamat_id, and vectors M, H, B (M and H are multiplied by μ₀ for VTU, i.e., in Tesla). - Demag sweep: at each field step, minimize energy with two‑loop L‑BFGS; append to
basename.datand write VTUs whenever Δ(μ₀·M·ĥ) ≥mstep.
At each external field step the code minimizes the total energy
E = S[m,A(m)] + E_ex[m] + E_an[m] + E_Zeeman[m; H_ext],
with the unit‑length constraint enforced by parameterizing an unconstrained vector u_raw per node and projecting to m = u/||u||. The gradient is mapped back to u_raw using the tangent‑plane projector (I − m mᵀ) (chain rule). citeturn1search1
The outer optimizer is two‑loop L‑BFGS (jaxopt.LBFGS), where the initial inverse‑Hessian seed H₀ is configurable: scalar γ (from last curvature pair), Barzilai–Borwein (bb/bb1/bb2), identity, or physics‑informed diagonal / block‑Jacobi operators assembled from exchange and anisotropy surrogates on the tangent space. A custom line search is used with several initialization strategies. citeturn1search1
To improve scaling across fields, the energy returned to the optimizer is normalized by a reference E_ref (Brown energy at the initial state), but all outputs also retain physical units when written (e.g., fields multiplied by μ₀ when exporting).
- Vector potential A: PCG on the linearized Brown functional in A with right‑preconditioner = per‑component AMG V‑cycle. Explicit gauge stabilization is added both in the finest‑level matrix (AMG build) and in the operator used by CG.
- Scalar potential U: PCG on
μ0·Kusing a scalar AMG V‑cycle; no gauge term. Operator and RHS are chosen so that the physical sign convention yieldsH = −∇U.
Both paths compute M, H, B on elements for VTU:
A‑path: B = curl A (per tet), H = (1/μ0)·B − M.
U‑path: H = −∇U, B = μ0·(H+M).
This guide explains how to use run_example.py (sometimes referred to as run_examples in messages) to run any example from your matrixfree repository without polluting the example folder. The wrapper:
- copies the chosen example directory into a fresh run folder under your current working directory (default:
./runs/<example>-YYYYmmdd-HHMMSS), - executes the chosen entry script from
src/(default:loop.py) inside that run folder, - keeps all outputs in the run folder, leaving the original example clean.
Place
run_example.pyat the repo root, alongsidesrc/andexamples/.
From any directory:
# Run the example named 'rotate_to_easyaxis' using the default entry script (loop.py)
python /path/to/matrixfree/run_example.py \
--example rotate_to_easy_axis \
-- --mesh cube --debug-lbfgsWhat happens:
examples/rotate_to_easy_axisis copied to./runs/rotate_to_easyaxis-<timestamp>under your current directory.- The wrapper runs
matrixfree/src/loop.pywith CWD = that run folder. - All outputs land in
./runs/rotate_to_easyaxis-<timestamp>.
usage: run_example.py --example EXAMPLE [--parent-dir PARENT] [--script SCRIPT]
[--repo-root PATH] [--workdir PATH]
[--clean-on-success] [--dry-run]
[--] [SCRIPT_ARGS ...]
Required
--example EXAMPLE
The name of an example directory (e.g.,anisotropy), or a full/relative path to an example directory. If you pass just a name, the wrapper resolves it under<repo_root>/<parent-dir>/EXAMPLE(default parent isexamples).
Common options
--script SCRIPT(default:loop.py)
The entry script insrc/to run. Examples:loop.py,mesh.py. You can also pass an absolute path to a script file.--parent-dir PARENT(default:examples)
Parent directory under the repo root where examples live. Change this if your examples are in a different folder name.--workdir PATH
Where to run. Default creates./runs/<example>-<timestamp>under your current directory. Use--workdir .to copy the example into the current directory (may overwrite!).--repo-root PATH
Explicit path to the repo root (folder that containssrc/and the examples parent). Defaults to the directory that containsrun_example.py.--clean-on-success
Delete the run directory only if the entry script exits with code 0.--dry-run
Print what would be done (copy target and the command to execute) and exit.
Positional pass‑through
- Everything after
--is passed verbatim to your entry script (e.g., flags forloop.py). Example:python run_example.py --example anisotropy -- --mesh cube --debug-lbfgs --h0 block_jacobi
All examples are located in the directory examples. Each directory contains a *READM.md with instructions and main results for comparison.
-
rotate_to_easyaxis relax a permanent magnet toward its easy axis and compare L‑BFGS preconditioners
-
hard_axis_loop test the contributions of anisotropy energy and Zeeman energy in the micromagnetic solver
-
switch_sphere compute the nucleation field of a small hard magnetic sphere
-
switch_prolate_ellipsoid compute the nucleation field of a small prolate hard magnetic ellipsoid
-
switch_oblate_ellipsoid compute the nucleation field of a small oblate hard magnetic ellipsoid
The above three switching experiments test the magnetostatic, the anisotropy energy, and Zeeman energy
-
standard_problem_2 run the micromagnetic standard problem 2, test the magnetostatic, exchange and Zeeman terms
-
standard problem_3 run the micromagnetic standard problem 3, test the anisotropy, magnetostatic, and exchange terms
python path/to/matrixfree/run_example.py \
--example rotate_to_easyaxis \
--script loop.py
-- --mesh cube --debug-lbfgs --h0 block_jacobipython run_example.py --example rotate_to_easyaxis --dry-run -- --mesh cubepython run_example.py \
--example rotate_to_easyaxis \
--clean-on-success \
-- --mesh cube --debug-lbfgsThe micromagnetic solver is designed to compute magnetization processes and nucleation fields in hard magnetic materials using a finite element approach. It is grounded in the continuum theory of micromagnetism and implements energy minimization techniques to find stable magnetic configurations.
The solver minimizes the total magnetic Gibbs free energy, which includes:
- Exchange energy
- Magnetocrystalline anisotropy energy
- Zeeman energy (interaction with external field)
- Stray-field energy (demagnetizing field)
The magnetization polarization vector ( \mathbf{J} ) is constrained to have constant magnitude, and the energy is expressed in terms of local variables and their gradients.
To avoid the computational cost of long-range interactions in the stray-field energy, the solver uses auxiliary potentials:
- Magnetic vector potential ( \mathbf{A} ): leads to a curl–curl equation and a bounded energy functional.
- Magnetic scalar potential (U): leads to a Poisson equation but results in a saddle-point problem.
The solver supports both formulations, but prefers the vector potential due to its better numerical properties and compatibility with minimization techniques.
- The domain is discretized using linear tetrahedral elements.
- The solver avoids assembling global sparse matrices. Instead, it computes local element contributions (e.g., ( \nabla \times \mathbf{A} )) and uses scatter-add operations to accumulate nodal values.
- This results in a matrix-free implementation that is memory-efficient and well-suited for JAX-based JIT compilation.
- AMG is used to precondition the linear systems arising from the vector or scalar potential formulations.
- A scalar Laplacian surrogate is built from the mesh geometry to guide AMG coarsening.
- The solver uses Jacobi smoothing only, avoiding complex H(curl) or H¹-specific multigrid components.
- The solver uses L-BFGS or conjugate gradient methods to minimize the energy functional.
- Termination criteria are based on:
- Function value change
- Step size
- Gradient norm
- A damping parameter ( \mu ) may be added to the Hessian approximation to improve stability in flat regions.
- The solver computes equilibrium magnetization states for decreasing external field values.
- Previous solutions are used as initial guesses for the next field step, improving convergence.
- The switching field is identified when the magnetization reverses.
- The solver reproduces known analytical results for ellipsoidal particles.
- It captures inhomogeneous nucleation and stray-field effects in cubic and dodecahedral particles.
- 2D simulations provide lower bounds for nucleation fields, while 3D simulations offer more accurate predictions, especially for larger grains.
To enforce the physical constraint (|\mathbf m(\mathbf x)|=1) without ad‑hoc renormalization, we optimize over a curvilinear path on the sphere at each iteration. Starting from the current nodal magnetization (\mathbf U \in \mathbb{R}^{N\times 3}) with unit rows (|\mathbf U_i|=1), we first build a tangent search direction
(\mathbf d_{\mathrm{tan}} ;=; \mathbf d ;-; (\mathbf d!\cdot!\mathbf U),\mathbf U,)
i.e., the component of the descent direction orthogonal to each (\mathbf U_i). We then move along a great‑circle–like curve on (\mathbb S^2) parameterized by a scalar step
(\mathbf U(\tau)=\operatorname{CurvilinearStep}!\left(\mathbf U,,\mathbf d_{\mathrm{tan}},,\tau\right))
is chosen so that (|\mathbf U_i(\tau)|=1) holds exactly for all (\tau) and all nodes (i); no post‑step projection is needed. In code, the path is implemented by forming
(\mathbf H = \mathbf U \times \mathbf d_{\mathrm{tan}},\qquad \mathbf U'(\tau)=\tfrac12,\mathbf W_{-}(\tau)^{-1},\mathbf H_\times!\big(\mathbf U+\mathbf U(\tau)\big), \quad \mathbf W_{-}(\tau)=\mathbf I-\tfrac{\tau}{2},\mathbf H_\times,)
where (\mathbf H_\times) is the per‑node skew matrix of the cross product with (\mathbf H). This gives a differentiable path on the sphere with (\mathbf U'(0)=\mathbf d_{\mathrm{tan}}), which we use for line search (see below).
- Constraint preserved by construction. Because the curve lives on (\mathbb S^2), every trial point satisfies (|\mathbf m|=1); you avoid drift and repeated normalize‑project cycles.
-
Correct directional derivatives. We evaluate the derivative of the merit function along the curve via (\phi'(\tau)=\langle \nabla E(\mathbf U(\tau)),,\mathbf U'(\tau)\rangle), using the analytic
$\mathbf U'(\tau)$ above. This is essential for robust Armijo/Wolfe‑type tests on manifolds. - Works with L‑BFGS and BB. The same curvilinear path underlies both our two‑loop L‑BFGS (with custom (H_0) and Barzilai–Borwein drivers; only the strategy for picking (\tau) changes.
We provide two backtracking strategies defined along the curvilinear path:
- Modified Armijo on the curve. A robust variant using the dimensionless decrease ratio (D(\tau)=(\phi(\tau)-\phi(0))/(\tau,\phi'(0))) with adaptive enlarge/reduce phases; this converges well even when the local model is non‑quadratic. This line search is used for the L-BFGS method.
- Armijo backtracking line search on the curve that uses a fixed shrinkage factor and sufficient decrease condition. This line search is used for the starting iterations of the Barzilai-Borwein method.
Both yield a step
A variety of solvers are used in static micromagnetics:
- Steepest descent / projected gradient. Simple but can be very slow on stiff, ill‑conditioned energies.
- Nonlinear conjugate gradient (NCG). Popular and efficient with line‑search safeguards; preconditioning brings substantial gains in micromagnetic energy minimization.
- Truncated Newton / Hessian‑based methods. Fast local convergence, but Hessian actions (including nonlocal magnetostatics) are expensive to implement robustly at scale.
- Quasi‑Newton methods (L‑BFGS). Exploit curvature information via a short memory of curvature pairs ((s_k,y_k)), with linear memory and (O(nm)) cost per iteration (small (m)). They are attractive for very large problems where full Hessian information is impractical.
We use L‑BFGS with line search, because it provides robust, often superlinear, convergence in high dimensions at modest per‑iteration cost, and it integrates naturally with the normalized representation. Compared with NCG, L‑BFGS typically attains a good metric of the local curvature after a few iterations, reducing sensitivity to line‑search parameters and delivering fewer energy/gradient evaluations overall.
Alternatively, the solver supports the Barzilai–Borwein (BB) method, a gradient-based optimizer that approximates curvature information using simple scalar step sizes derived from previous iterates. While BB typically requires more iterations than L‑BFGS, it offers a lightweight alternative with minimal memory overhead and no need to store or update quasi-Newton matrices. This makes it particularly attractive for large-scale problems.
To switch between the optimization methods in loop.py, use the --solver command-line option:
--solver {lbfgs, bb}
- Use
--solver lbfgsto run the L‑BFGS optimizer (default). - Use
--solver bbto switch to the Barzilai–Borwein method.
You can also fine-tune each method with additional flags:
- For L‑BFGS:
--lbfgs-history,--h0,--tol-fun, etc. - For BB:
--bb-variant,--bb2-precond,--bb-init-steps, and shared line search options.
Although L‑BFGS internally builds a metric from recent curvature pairs, external preconditioning (i.e., using a change of inner product or left‑preconditioning operator) can dramatically reduce iteration counts for ill‑conditioned micromagnetic energies (e.g., strong exchange–anisotropy contrasts, mesh aspect‑ratio effects). The literature documents substantial speed‑ups for preconditioned quasi‑Newton methods in large‑scale energy minimization, and Frimannslund’s work on combining L‑BFGS with Newton‑type ideas provides a principled rationale for mixing cheap Hessian surrogates with L‑BFGS updates to improve robustness and rate.
From a practical standpoint, left‑preconditioning L‑BFGS (or equivalently, changing the inner product to a problem‑aware metric) improves the quality of the initial inverse‑Hessian guess in the two‑loop recursion, aligns search directions with the dominant physics, and reduces the burden on the limited history of curvature pairs. This viewpoint is standard in preconditioned optimization libraries and has been shown repeatedly to accelerate quasi‑Newton methods.
We approximate the local part of the energy Hessian (exchange + crystalline anisotropy) and ignore nonlocal magnetostatics in the preconditioner. This yields sparse, easy‑to‑apply operators that capture the stiffest directions (exchange) and the preferred axes (anisotropy):
- diag: a scalar per node derived from local exchange/anisotropy contributions—minimal storage and cost;
- block‑Jacobi: a (3\times 3) SPD block per node (principal minors of the local Hessian), capturing directional coupling and the local easy‑axis geometry.
Using such local Hessian surrogates as preconditioners is a well‑established strategy in micromagnetic energy minimization and has been shown to speed up Krylov and NCG solvers by factors of 3–7 on realistic problems; the same rationale carries over to L‑BFGS because both benefit from better‑conditioned search spaces.
These preconditioners are cheap to build (one pass over elements) and cheap to apply (per‑node scalings or (3\times 3) solves), and they commute naturally with the tangent‑plane projection used in the normalized variable (\mathbf m=\mathbf u/|\mathbf u|). In spirit they are akin to “neighborhood” preconditioners known to accelerate geometry optimization in materials modeling.
Every iteration computes the projected (tangent) gradient via the normalization rule (\mathbf m=\mathbf u/|\mathbf u|) and a preconditioned L‑BFGS direction. A standard backtracking/Wolfe line search along the geodesic‑consistent update (implemented here by trial steps in (\mathbf u) followed by normalization) guarantees energy decrease and preserves the constraint, akin to projected‑descent methods on convex sets, but on the sphere.
The alternative nonlinear CG route is also very competitive in micromagnetics, especially with the same local preconditioners, and has been demonstrated to deliver large speed‑ups on demagnetization curves. Our choice of L‑BFGS is guided by its lower sensitivity to line‑search tuning and its curvature accumulation, which are advantageous for multi‑term energies and heterogeneous meshes; in practice we view L‑BFGS and NCG as complementary tools.
To highlight the effect of preconditioning in a controlled setting, we consider a zero‑field relaxation where the magnetization aligns with the local uniaxial anisotropy axis everywhere. Starting from a uniform initial guess, we minimize (E_{\text{ex}}+E_{\text{ani}}) on the mesh:
- With conventional (\gamma)‑scaling only (i.e., a uniform scalar scaling of the direction), the solver requires 47 L‑BFGS iterations to reach the target tolerance.
- With the block‑Jacobi preconditioner, the iteration count drops to 18, reflecting the improved conditioning and the alignment of search directions with local easy‑axis curvature.
This reduction is consistent with the general experience that directionally aware local preconditioners substantially accelerate quasi‑Newton methods on micromagnetic energies. (Results from our implementation; mesh and material parameters as in the Methods section.)
- Exl, L. et al. “Preconditioned nonlinear conjugate gradient method for micromagnetic energy minimization.” arXiv:1801.03690 (2018). (Local‑Hessian preconditioning idea.)
- Frimannslund, L.; Steihaug, T. “A class of methods combining L‑BFGS and truncated Newton.” Univ. Bergen report (2006). (Quasi‑Newton/second‑order hybrid rationale.)
- Byrd, R. H. et al. “A preconditioned L‑BFGS algorithm with application to molecular energy minimization.” (Tech. report, 2004). (Preconditioned L‑BFGS in large‑scale energy problems.)
Following the Magnetostatic field computation section, we compute the magnetostatic field using either the vector potential
We solve the gauged curl–curl problem resulting from minimizing Brown’s functional with respect to (\mathbf A):
where (V_h) is the space of nodal finite element functions. Which leads to the symmetric positive system:
with (C) the discrete curl operator assembled from nodal basis functions, (M) a mass matrix, and (\mathbf b_M) the magnetization term. The small gauge (\alpha>0) removes the large kernel of the curl operator and enables robust conjugate gradient solves.
Alternatively, we solve for the magnetic scalar potential (U) via a Poisson-like equation derived from minimizing the scalar form of Brown’s functional:
which yields the symmetric positive system:
where (K) is the stiffness matrix and (\mathbf b_U) encodes the divergence of the magnetization.
For both vector and scalar formulations, we use a plain V‑cycle AMG preconditioner with damped Jacobi smoothing, coupled to a PCG solver:
- Smoother. On each level (\ell), we apply (\nu_{\text{pre}}) and (\nu_{\text{post}}) steps of damped Jacobi with damping (\omega\in(0,1)) (defaults: (\nu_{\text{pre}}=\nu_{\text{post}}=2), (\omega\approx0.7)). This is simple and effective for both systems.
- Transfers and coarse operators. We use standard algebraic coarsening to build restriction/prolongation and define coarse matrices by the Galerkin product (A_{\ell+1}=R_\ell A_\ell P_\ell); the coarsest level is solved by a small sparse Cholesky or a few inner iterations.
To guide coarsening, we assemble a scalar Laplacian surrogate on the tetrahedral mesh using the standard (P_1) stiffness:
computed element-wise via gradients (\nabla\phi_i) and volumes (V_e). This surrogate provides a robust graph for algebraic multigrid aggregation.
- Assemble system matrices and right hand side for either vector or scalar potential.
- Build AMG hierarchy using the scalar Laplacian surrogate:
- Compute surrogate (L),
- Define strength‑of‑connection and transfers,
- Construct coarse operators and store coarsest-level Cholesky.
- Algebraic mulitgrid - preconditioned conjugate gradient (AMG‑PCG) solve:
- For vector potential: solve (\left(\tfrac{1}{\mu_0}C^{\top}C+\alpha M\right)\mathbf A = C^{\top}\mathbf b_M),
- For scalar potential: solve (\mu_0 K U = \mathbf b_U),
- Use Jacobi smoothing and coarse correction via (L).
Magnetostatic and electrostatic problems are posed on unbounded domains, but numerical methods must truncate this domain to a finite region. Classical strategies include:
- Domain truncation with an air region (graded mesh or uniform box),
- Shell transformation (mapping a finite shell to infinity), and
- Hybrid FEM–BEM coupling (Fredkin–Koehler approach).
The first approach—graded air region—is widely used for micromagnetics because it is simple, robust, and integrates well with standard finite-element solvers. Shell transformation and FEM–BEM methods are more rigorous but require specialized formulations and coupling operators.
We surround the magnetic body with nested homothetic shells forming a graded “air box.” Starting from the body’s outer surface
about a chosen center (typically the origin). These surfaces are meshed into tetrahedral layers using TetGen via MeshPy. Each layer is assigned a region attribute and a maximum volume constraint to enforce gradual coarsening.
Mesh-size schedule:
where (h_0) is the target edge near the magnet, (K>1) the geometric scale factor, and (\beta) controls growth. If (h_0) is not provided, we estimate it from the median boundary-edge length of the body surface.
This strategy pushes the artificial boundary far away while keeping the number of degrees of freedom manageable. It is consistent with best practices in computational electromagnetics for open boundaries.
- Surface extraction: Identify outer faces of the body mesh (faces appearing only once).
- Shell generation: Create scaled copies of the surface and build a PLC (piecewise-linear complex) for TetGen.
- Region seeds: Place one seed per shell gap to apply TetGen’s volume constraints.
- Meshing: Call TetGen with quality and volume options (
pqAaY). - Merge and clean: Weld duplicate nodes, orient tetrahedra positively, and remove degenerate elements.
The result is a single mesh combining the magnetic body and graded air layers, with mat_id tags distinguishing regions.
For test problems, we provide a lightweight mesher supporting:
- Box or ellipsoid geometry, oriented via user-specified axes,
- Backends: MeshPy/TetGen for quality tetrahedralization or a structured brick grid split into tetrahedra,
- Automatic surface tessellation for ellipsoids using an icosphere subdivision heuristic.
This tool ensures reproducible meshes for benchmarks and integrates seamlessly with the shell-generation pipeline.
Software architecture and implementation (JAX, analytic gradients, and an almost matrix‑free design)
Our implementation targets three goals that are often in tension in micromagnetics: throughput, memory efficiency, and clarity/reproducibility. To this end we build the numerical kernels in JAX, leveraging its NumPy‑like API and XLA just‑in‑time compilation to fuse array operations into a small number of optimized kernels that run on CPU or GPU without changing source code. JAX traces pure Python functions to an intermediate representation (jaxpr) and lowers them to XLA, which removes Python overhead and enables aggressive operator fusion and memory planning; this is particularly effective for our elementwise contractions and scatter‑adds.
JAX’s transformations are composable—jit for compilation, vmap/pmap for vectorization or SPMD parallelism, and grad for automatic differentiation—but in this work we rely only on jit (and array programming), for reasons detailed below.
All energy terms (exchange, uniaxial anisotropy, Zeeman, and the magnetostatic Brown functional) are implemented with closed‑form analytic expressions for both value and gradient. We intentionally do not call jax.grad on the total energy:
- Lower tracing/tape overhead & better kernel fusion. Reverse‑mode AD would record and replay large nonlocal expressions; providing explicit gradients lets us reuse per‑element intermediates (e.g., constant per‑tet curls, elementwise sums of (\mathbf M)) in both value and gradient, which reduces the size of the compiled graph and the number of kernels.
- Numerical control. The unit‑length constraint is enforced by the normalized variable (\mathbf m=\mathbf u/|\mathbf u|). With analytic gradients we incorporate the tangent‑plane projection ((I-\mathbf m\mathbf m^\top)) directly, avoiding extra AD‑generated casts and synchronizations and keeping updates on (\mathbb S^2) by construction. (Projected‑gradient/normalization is standard for constrained smooth optimization.)
- Transform‑friendly purity. The kernels remain side‑effect‑free array programs with static shapes, which JIT compiles reliably and makes portable to vectorization or multi‑device execution in future work.
(Background on JAX’s JIT/AD model and pure‑function requirement can be found in the official documentation and overview papers.)
We pursue an almost matrix‑free approach for the physics:
- For linear tetrahedra, both the vector potential
$\mathbf A$ and scalar potential$U$ are discretized using nodal finite elements. In the vector formulation,$\nabla\times \mathbf A$ is constant per element; we compute element-wise curls by contracting nodal values with stored shape-function gradients and scatter‑add the contributions to nodal arrays. Exchange and anisotropy are assembled analogously from per-element contractions and broadcasted arithmetic. This avoids building global sparse matrices and keeps memory use dominated by fields, not by matrices. The resulting pure array programs JIT‑compile to a few fused kernels with low memory traffic. - The one deliberate exception is the algebraic multigrid (AMG) preconditioner, used in both the vector and scalar potential solves. We construct a single scalar Laplacian surrogate (once per mesh) to guide algebraic coarsening and reuse the AMG hierarchy throughout the run. This balances a matrix‑free inner operator with a proven global preconditioner and is consistent with best‑practice AMG design. The same AMG framework applies to the scalar potential formulation, where the system resembles a standard Poisson problem.
To enforce (|\mathbf m|=1) we optimize in an unconstrained variable (\mathbf u) and set (\mathbf m=\mathbf u/|\mathbf u|) at each iteration; the gradient w.r.t. (\mathbf u) is the projected gradient (|\mathbf u|^{-1}(I-\mathbf m\mathbf m^\top)\nabla_{\mathbf m}E). This integrates cleanly with our preconditioned L‑BFGS minimizer (see Optimization section), and because all steps are expressed as fused array kernels, they JIT‑compile efficiently. Project‑then‑update schemes are standard for constrained smooth problems and are compatible with JAX’s tracing model.
jitwhere it matters. All hot kernels—energy, gradients, and the matrix‑free actions—are wrapped injax.jitso XLA can perform fusion and memory scheduling; data stays on device across iterations.- Static shapes & purity. Mesh connectivity, element gradients, and volumes are treated as compile‑time constants; we avoid Python‑side side effects in traced regions, following JAX’s “pure functions” guidance.
- Separation of concerns. The AMG hierarchy is set up outside traced regions (once per mesh); inside the iterative solve we use compiled matrix‑free physics actions and call the AMG V‑cycle as a preconditioner, minimizing device/host synchronization.
This repository includes two official MaMMoS benchmark workflows, each with its own example subfolder and detailed README:
-
Benchmark 1: Grain-Resolved Magnetostatics
- See:
examples/benchmark_1/ - Description: Simulates the magnetostatic response of polycrystalline or granular samples, focusing on demagnetization and field distribution in complex geometries. The workflow and scientific background are described in the README in that folder.
- See:
-
Benchmark 2: Magnetic Field Sensor (TMR)
- See:
examples/sensor_loop/ - Description: Simulates the full hysteresis loop and electrical response of a magnetic field sensor (TMR), including mesh generation, initial state computation, and field sweeps. The workflow, physical context, and analysis steps are described in detail in the README in that folder.
- See:
For full scientific definitions, parameters, and step-by-step instructions, see the README.md files in each benchmark subfolder. These are based on the official MaMMoS Deliverable 6.2 and provide all necessary details to reproduce the benchmarks.