perf(pm): adopt mimalloc as global allocator #3149

Open
elrrrrrrr wants to merge 1 commit into next from perf/pm-mimalloc

@elrrrrrrr

Contributor

Split out of #3146 — single orthogonal change so the benchmark comment shows its isolated effect.

Sets mimalloc as the #[global_allocator] for the utoo binary, via the existing workspace [patch.crates-io] utooland fork (the same allocator family the pack crates use through TurboMalloc).

Why re-test: the April experiment (perf/pm-mimalloc-allocator) showed no clear win, but that predates the v1.1.1 demand-resolver rework — parsing now runs on a dedicated rayon pool with results handed across threads, which is exactly the cross-thread alloc/free pattern where mimalloc beats glibc's arena locking.
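For reviewers who want the allocation pattern spelled out, here is a minimal sketch of it using only std threads and a channel. It is not the resolver's code: `Parsed`, `parse`, and `resolve` are hypothetical names, and plain `std::thread` stands in for the dedicated rayon pool. The point is only that each value is allocated on a worker thread and freed on a different one.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical stand-in for a parsed manifest: a heap-heavy value.
struct Parsed {
    deps: Vec<String>,
}

/// Hypothetical parser: every String and the Vec are allocated on the calling thread.
fn parse(id: usize) -> Parsed {
    Parsed {
        deps: (0..8).map(|d| format!("dep-{id}-{d}")).collect(),
    }
}

/// Parse on `workers` threads, hand each result to the calling thread,
/// and drop it there. Returns the total number of dependency strings seen.
fn resolve(workers: usize, per_worker: usize) -> usize {
    let (tx, rx) = mpsc::channel::<Parsed>();
    let handles: Vec<_> = (0..workers)
        .map(|w| {
            let tx = tx.clone();
            thread::spawn(move || {
                for i in 0..per_worker {
                    // Allocation happens here, on the worker thread.
                    tx.send(parse(w * per_worker + i)).expect("receiver alive");
                }
            })
        })
        .collect();
    drop(tx); // the channel closes once every worker's sender is gone

    let mut total_deps = 0;
    for parsed in rx {
        total_deps += parsed.deps.len();
        // `parsed` is dropped here: memory allocated on a worker is freed on this thread.
    }
    for handle in handles {
        handle.join().expect("worker panicked");
    }
    total_deps
}

fn main() {
    println!("dependency strings processed: {}", resolve(4, 1000));
}
```

Timing this loop under each allocator would be a rough microbenchmark at best; the pm-bench phases remain the signal that matters for this PR.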

Expected bench signal: p1 (and p0's resolve half) improve a few percent if the hypothesis holds; p4 unchanged (syscall-bound). RSS will read higher (mimalloc retains freed pages).

🤖 Generated with Claude Code

@elrrrrrrr added the "benchmark" (Run pm-bench on PR) label on Jun 11, 2026

@gemini-code-assist (bot) left a comment

Contributor


Code Review

This pull request configures mimalloc as the global allocator for the pm crate to optimize performance for allocation-heavy workloads. However, because mimalloc is only declared as a dependency for non-wasm32 targets in Cargo.toml, unconditionally importing and setting it in main.rs will break compilation for wasm32 targets. A review comment correctly identifies this issue and suggests wrapping the global allocator setup in a #[cfg(not(target_arch = "wasm32"))] conditional compilation attribute.

Comment thread crates/pm/src/main.rs
Comment on lines +5 to +8
use mimalloc::MiMalloc;

#[global_allocator]
static GLOBAL: MiMalloc = MiMalloc;

Contributor


Severity: high

The mimalloc dependency is only specified for non-wasm32 targets in Cargo.toml. Unconditionally importing and setting it as the global allocator will cause compilation to fail when targeting wasm32. Wrap the global allocator definition in a #[cfg(not(target_arch = "wasm32"))] attribute.

Suggested change
- use mimalloc::MiMalloc;
- #[global_allocator]
- static GLOBAL: MiMalloc = MiMalloc;
+ #[cfg(not(target_arch = "wasm32"))]
+ #[global_allocator]
+ static GLOBAL: mimalloc::MiMalloc = mimalloc::MiMalloc;
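
The gating can be tried without pulling in the mimalloc crate. The sketch below is an illustration, not the code in this PR: `CountingAlloc`, `ALLOC_CALLS`, and `alloc_calls` are hypothetical names, and the wrapper around `std::alloc::System` stands in for `mimalloc::MiMalloc` so the example builds with std alone. The `#[cfg]` attribute on the static is the same mechanism as in the suggestion.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical stand-in for `mimalloc::MiMalloc`: forwards to the system
/// allocator and counts `alloc` calls.
struct CountingAlloc;

static ALLOC_CALLS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOC_CALLS.fetch_add(1, Ordering::Relaxed);
        // Delegate the actual allocation to the platform allocator.
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

// Only non-wasm32 builds install the custom allocator; on wasm32 this static
// is compiled out and the default allocator stays in place.
#[cfg(not(target_arch = "wasm32"))]
#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Number of allocations routed through `CountingAlloc` so far.
fn alloc_calls() -> usize {
    ALLOC_CALLS.load(Ordering::Relaxed)
}

fn main() {
    let before = alloc_calls();
    let buf: Vec<u64> = Vec::with_capacity(1024);
    std::hint::black_box(&buf); // keep the allocation from being optimized away
    println!("allocations observed: {}", alloc_calls() - before);
}
```

Using the fully qualified path in the static, as the suggestion does, also avoids leaving a `use` line behind that would fail to resolve on targets where the dependency is absent.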

Re-landed on the post-#3157 next. The workspace [patch.crates-io] already
points mimalloc/libmimalloc-sys at the utooland fork (shared with the pack
crates' TurboMalloc); pm now opts the binary in via #[global_allocator].

Allocation-heavy install workload (multi-MB JSON parses, graph build,
per-file clone bookkeeping) with cross-thread alloc/free where mimalloc
beats glibc arena locking. wasm32 target left on the system allocator.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@github-actions


📊 pm-bench-phases · ea2db0e · linux (ubuntu-latest)

Workflow run — ant-design

registry.npmjs.org

utoo (PR) vs utoo-next (baseline): ✅ p0 +4.5% · ✅ p1 -20.6% · ✅ p3 +6.7% · ✅ p4 +2.0%

✅ within noise · 🚀 faster · ⚠️ slower — Δ is the median of per-round paired deltas (interleaved rounds share weather windows); flagged when |Δ| > 5% and ≥80% of rounds agree on sign
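
One reading of the flagging rule in the legend, as a sketch. The names `Verdict`, `median`, and `classify` are hypothetical, and the delta formula `(pr - baseline) / baseline` is an assumption; the bot's actual implementation is not shown in this thread. Under this reading, a large median delta still reads as within noise when fewer than 80% of rounds agree on its sign.

```rust
#[derive(Debug, PartialEq)]
enum Verdict {
    WithinNoise,
    Faster,
    Slower,
}

/// Median of a non-empty slice (sorts the slice in place).
fn median(values: &mut [f64]) -> f64 {
    values.sort_by(|a, b| a.total_cmp(b));
    let n = values.len();
    if n % 2 == 1 {
        values[n / 2]
    } else {
        (values[n / 2 - 1] + values[n / 2]) / 2.0
    }
}

/// `rounds` holds one `(pr_secs, baseline_secs)` pair per interleaved round.
/// Returns the median paired delta in percent and the resulting verdict.
fn classify(rounds: &[(f64, f64)]) -> (f64, Verdict) {
    assert!(!rounds.is_empty(), "need at least one round");

    // Assumed definition: relative wall-time change of the PR against the baseline.
    let mut deltas: Vec<f64> = rounds
        .iter()
        .map(|&(pr, base)| (pr - base) / base * 100.0)
        .collect();
    let delta = median(&mut deltas);

    // Fraction of rounds whose delta has the same sign as the median.
    let agreeing = deltas
        .iter()
        .filter(|&&d| d != 0.0 && d.signum() == delta.signum())
        .count();
    let agreement = agreeing as f64 / deltas.len() as f64;

    // Flag only when both conditions from the legend hold.
    let verdict = if delta.abs() > 5.0 && agreement >= 0.8 {
        if delta < 0.0 { Verdict::Faster } else { Verdict::Slower }
    } else {
        Verdict::WithinNoise
    };
    (delta, verdict)
}

fn main() {
    // Illustrative numbers only, not taken from the benchmark run.
    let rounds = [(3.0, 4.0), (2.0, 4.0), (3.5, 4.0), (5.0, 4.0), (4.5, 4.0)];
    println!("{:?}", classify(&rounds));
}
```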

p0 · full cold install

| PM | wall (mean ± σ) | min | user | sys | RSS | Δ vs baseline |
|---|---|---|---|---|---|---|
| utoo (PR) | 10.4s ± 2.18s | 7.92s | 9.29s | 12.7s | 1.07G | +4.5% ✅ |
| utoo-next (baseline) | 9.81s ± 1.82s | 7.72s | 9.34s | 12.6s | 947M | |
| bun | 12.2s ± 1.12s | 11.3s | 11.1s | 10.9s | 690M | +11.4% ⚠️ |

p1 · resolve

| PM | wall (mean ± σ) | min | user | sys | RSS | Δ vs baseline |
|---|---|---|---|---|---|---|
| utoo (PR) | 2.73s ± 0.69s | 2.31s | 4.33s | 1.43s | 754M | -20.6% ✅ |
| utoo-next (baseline) | 3.62s ± 1.32s | 2.45s | 4.42s | 1.69s | 586M | |
| bun | 4.67s ± 1.89s | 2.60s | 4.44s | 1.23s | 520M | +12.8% ✅ |

p3 · cold install

| PM | wall (mean ± σ) | min | user | sys | RSS | Δ vs baseline |
|---|---|---|---|---|---|---|
| utoo (PR) | 7.00s ± 0.49s | 6.57s | 5.10s | 11.6s | 570M | +6.7% ✅ |
| utoo-next (baseline) | 6.97s ± 0.55s | 6.46s | 5.18s | 11.4s | 517M | |
| bun | 8.03s ± 0.72s | 7.46s | 6.82s | 10.8s | 566M | +17.0% ⚠️ |

p4 · warm link

| PM | wall (mean ± σ) | min | user | sys | RSS | Δ vs baseline |
|---|---|---|---|---|---|---|
| utoo (PR) | 2.54s ± 0.07s | 2.47s | 0.43s | 3.73s | 208M | +2.0% ✅ |
| utoo-next (baseline) | 2.56s ± 0.23s | 2.33s | 0.48s | 3.77s | 49M | |
| bun | 3.90s ± 0.10s | 3.80s | 0.19s | 2.59s | 132M | +53.0% ⚠️ |

Resources & footprint

p0 · full cold install

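| PM | vCtx | iCtx | netRX | netTX | cache | node_modules | lock |
|---|---|---|---|---|---|---|---|
| utoo | 70564 | 59058 | 1.25G | 7M | 1.83G | 1.82G | 2M |
| utoo-next | 74105 | 59906 | 1.25G | 7M | 1.83G | 1.82G | 2M |
| bun | 15797 | 18091 | 1.27G | 6M | 1.98G | 1.85G | 1M |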

p1 · resolve

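| PM | vCtx | iCtx | netRX | netTX | cache | node_modules | lock |
|---|---|---|---|---|---|---|---|
| utoo | 33302 | 50046 | 210M | 2M | 7M | 3M | 2M |
| utoo-next | 32493 | 50717 | 210M | 2M | 7M | 3M | 2M |
| bun | 11846 | 3711 | 212M | 3M | 116M | 0B | 1M |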

p3 · cold install

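| PM | vCtx | iCtx | netRX | netTX | cache | node_modules | lock |
|---|---|---|---|---|---|---|---|
| utoo | 67100 | 49708 | 1.04G | 4M | 1.82G | 1.82G | 2M |
| utoo-next | 72806 | 49150 | 1.04G | 4M | 1.82G | 1.82G | 2M |
| bun | 4466 | 7820 | 1.06G | 4M | 1.87G | 1.87G | 1M |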

p4 · warm link

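| PM | vCtx | iCtx | netRX | netTX | cache | node_modules | lock |
|---|---|---|---|---|---|---|---|
| utoo | 21685 | 13614 | 1K | 14K | 1.82G | 1.82G | 2M |
| utoo-next | 22714 | 13843 | 1K | 14K | 1.82G | 1.82G | 2M |
| bun | 281 | 22 | 3K | 11K | 1.87G | 1.87G | 1M |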
