Skip to content

sjqtentacles/sml-gossip

Repository files navigation

sml-gossip

A SWIM-style epidemic membership and failure-detection protocol — node liveness states (Alive/Suspect/Dead) with incarnation-number refutation, random-peer probing, suspicion timeouts, and infection-style dissemination — modelled as a pure, deterministic state machine in Standard ML. ALL randomness (which peer a node probes) is drawn from a seeded sml-prng generator threaded through the cluster; never ambient randomness. Time is a caller-supplied logical tick, not a wall clock. No FFI and no external dependencies at runtime, deterministic and byte-identical under both MLton and Poly/ML.

CI

Status

  • 39 assertions, green on MLton and Poly/ML.
  • Basis-library + vendored sml-prng only; deterministic across compilers.
  • Vendors sml-prng (Layout B) under lib/github.com/sjqtentacles/sml-prng/, so the repo builds standalone.

Purity

No FFI, no IO inside the library, no wall-clock, no ambient randomness, and no threads. Every random choice flows from the seeded sml-prng state threaded through the cluster, and every protocol round is driven by a logical tick the caller supplies to step. Consequently a given make {nodes, seed, suspectTimeout} plus a given sequence of kill/step/inject calls always produces the EXACT same simulation trace — byte-for-byte, across runs, machines, and compilers. The test suite and demo are byte-identical under MLton and Poly/ML, and the determinism test asserts that two same-seed runs reproduce the full membership trace exactly.

The protocol (SWIM)

SWIM (Scalable Weakly-consistent Infection-style process group Membership) combines two mechanisms, both simulated here:

  • Failure detection. Each round, every live node picks a random peer (via the threaded PRNG) and probes it. A peer that does not respond — because it has actually failed in the simulation (kill) — is recorded Suspect. A node that stays Suspect longer than suspectTimeout logical ticks is aged out to Dead.
  • Epidemic dissemination. When two live nodes interact, they merge each other's membership tables, so an update started anywhere reaches the whole cluster in a logarithmic number of rounds.

Incarnation numbers (refutation). Each node carries a monotonic incarnation counter. Updates about a node are ordered by incarnation first, then by status severity (Alive < Suspect < Dead): a higher incarnation always wins, and at equal incarnation the more-severe status wins. A live node that learns it is being suspected refutes by bumping its incarnation and re-announcing itself Alive; the higher incarnation overrides the stale suspicion as it spreads. This is what prevents a stale Alive rumour from resurrecting a node that a newer Suspect/Dead rumour has already condemned.

Install

With smlpkg:

smlpkg add github.com/sjqtentacles/sml-gossip
smlpkg sync

Include the MLB from your own (it pulls in the vendored sml-prng):

local
  $(SML_LIB)/basis/basis.mlb
  lib/github.com/sjqtentacles/sml-gossip/... (via smlpkg)
in
  ...
end

This brings structure Gossip (and the vendored Prng) into scope.

Quick start

(* a 6-node cluster, all alive; suspects age out to Dead after 2 ticks *)
val c0 = Gossip.make { nodes = 6, seed = 0wxC0FFEE, suspectTimeout = 2 }

(* run three healthy rounds at logical ticks 0,1,2 -- nothing changes *)
fun stepN c t 0 = c
  | stepN c t n = stepN (Gossip.step c t) (t + 1) (n - 1)
val c3 = stepN c0 0 3

(* node 4 actually fails *)
val cK = Gossip.kill c3 4

(* keep stepping: the failure detector + gossip drive everyone to agree *)
val c  = stepN cK 3 23
val () = (* SOME Dead once every live node agrees *)
  case Gossip.agreedStatus c 4 of
    SOME Gossip.Dead => ()        (* converged on the failure *)
  | _ => ()

val view = Gossip.membershipOf c 0
(* [(0,Alive),(1,Alive),(2,Alive),(3,Alive),(4,Dead),(5,Alive)] *)
val ok = Gossip.converged c       (* true *)

API (signature GOSSIP)

type cluster
datatype status = Alive | Suspect | Dead

val make    : { nodes : int, seed : Word64.word, suspectTimeout : int } -> cluster
val size    : cluster -> int
val step    : cluster -> int -> cluster          (* advance one round at tick t *)
val kill    : cluster -> int -> cluster          (* actually fail a node        *)
val inject  : cluster -> int -> int -> int -> cluster  (* plant a suspicion     *)
val isAlive : cluster -> int -> bool             (* actual liveness            *)

val membershipOf      : cluster -> int -> (int * status) list  (* node i's view *)
val statusInView      : cluster -> int -> int -> status
val incarnationInView : cluster -> int -> int -> int
val converged         : cluster -> bool
val agreedStatus      : cluster -> int -> status option
val statusToString    : status -> string

Conventions

  • Cluster model. Node ids are 0 .. nodes-1. A cluster is an immutable value holding every node's local membership view, each node's actual liveness, the suspicion clocks, and the threaded PRNG state; every transition returns a fresh cluster.
  • Logical time. step c t runs one protocol round "at" tick t. Suspicion timeouts are measured in these ticks: a Suspect entry older than suspectTimeout ticks is aged out to Dead. The caller chooses the tick sequence (typically 0, 1, 2, …).
  • Seeded randomness only. Peer selection uses Prng.SplitMix64 seeded by seed and threaded through the cluster — there is no ambient randomness, so the trace is fully reproducible.
  • Merge ordering. Opinions about a node are ordered by incarnation, then by severity (Alive < Suspect < Dead); kill does not touch the PRNG, and inject models a (possibly false) suspicion so refutation can be exercised.
  • Convergence. converged is true when every actually-alive node agrees on every node's status; agreedStatus c j returns SOME s when all live nodes assign j the same status s, else NONE.

Build & test

make test        # MLton
make test-poly   # Poly/ML
make all-tests   # both
make example     # build + run examples/demo.sml
make clean

Both compilers run the same strict-TDD suite (39 assertions):

  • initial state — a fresh cluster is all-alive at incarnation 0 and trivially converged; degenerate (0- and 1-node) clusters are handled, and make rejects negative inputs;
  • healthy cluster — with no failures every probe succeeds, so after many rounds every view is still all-alive, no incarnation was ever bumped, and the cluster stays converged;
  • failure propagation — a killed node is detected, aged from Suspect to Dead, and epidemically marked Dead in every live node's view within a bounded number of rounds, while live nodes stay Alive;
  • incarnation / refutation — a falsely injected suspicion of a live node is refuted (its incarnation bumps above 0, it returns to Alive everywhere, and the higher incarnation propagates); and a stale Alive (incarnation 0) rumour never resurrects a node already agreed Dead;
  • determinism — two runs from the same seed produce the EXACT same full membership trace (asserted by structural equality of every round's snapshot), while different seeds diverge; plus a pinned exact reference trace (statuses, membership lists, and incarnation numbers) that catches any change to the PRNG wiring, peer-selection arithmetic, or merge/age-out logic.

All assertions compare exact integers, statuses, incarnation numbers, and booleans — there are no reals in the protocol, so equality is exact (no epsilon).

Vendoring

This library depends on sml-prng, whose sources are vendored verbatim under lib/github.com/sjqtentacles/sml-prng/ (prng.sig, prng.sml, and a prng.mlb; the dependency's own tests are not vendored). src/gossip.mlb references that prng.mlb first, then gossip.sig/gossip.sml; the Poly/ML use-chain loads the vendored signature and structure first, in dependency order. sml.pkg records the dependency in its require block so smlpkg sync can refresh it.

Example

make example stands up a 6-node cluster, kills a node, and prints each node's view as the failure is detected and disseminated (output is byte-identical under MLton and Poly/ML):

=== sml-gossip demo ==========================================

Cluster of 6 nodes, seeded PRNG (all peer choices deterministic),
suspectTimeout = 2 logical ticks. Legend: A=Alive S=Suspect D=Dead,
trailing 'x' marks a node that has actually failed.

Initial views (everyone sees everyone Alive):
  node 0  |A A A A A A
  node 1  |A A A A A A
  node 2  |A A A A A A
  node 3  |A A A A A A
  node 4  |A A A A A A
  node 5  |A A A A A A

Killing node 4 at tick 3.

--- after tick 5 ------------------------------------
  node 0  |A A A A S A
  node 1  |A A A A S A
  node 2  |A A A A S A
  node 3  |A A A A S A
  node 4x |A A A A A A
  node 5  |A A A A S A
  all live nodes agree node 4 is Suspect

--- after tick 12 ------------------------------------
  node 0  |A A A A D A
  node 1  |A A A A D A
  node 2  |A A A A D A
  node 3  |A A A A D A
  node 4x |A A A A A A
  node 5  |A A A A D A
  all live nodes agree node 4 is Dead

--- after tick 25 ------------------------------------
  node 0  |A A A A D A
  node 1  |A A A A D A
  node 2  |A A A A D A
  node 3  |A A A A D A
  node 4x |A A A A A A
  node 5  |A A A A D A
  all live nodes agree node 4 is Dead
  converged = true

===============================================================

Poly/ML note

CI builds Poly/ML 5.9.1 from source rather than using the Ubuntu package (Poly/ML 5.7.1), whose X86 code generator crashes (asGenReg raised while compiling) on some code. See .github/workflows/ci.yml.

License

MIT — see LICENSE.

About

SWIM-style gossip membership & failure detection simulated as a pure deterministic state machine in Standard ML (seeded PRNG). MLton + Poly/ML.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors