State-based convergent replicated data types (CvRDTs) for Standard ML.
A CvRDT is a value together with a join-semilattice merge -- a commutative,
associative, idempotent least-upper-bound -- and monotone mutators. Replicas
exchange whole states and merge them; because merge is a semilattice join,
replicas that have observed the same set of updates converge to the same value
regardless of the order or number of merges (strong eventual consistency).
This library provides four classic CvRDTs:
GCounter-- a grow-only counter. State is a per-replica count;valueis the sum across replicas;mergetakes the pointwise maximum.PNCounter-- an increment/decrement counter built from twoGCounters (a positive tally and a negative tally);valueis their difference.LWWRegister-- a last-write-wins register holding a single'avalue tagged with a caller-supplied timestamp and replica.mergekeeps the higher timestamp; ties are broken deterministically by the higher replica id.ORSet-- an observed-remove set with correct add / remove / re-add semantics. Eachaddstamps the element with a globally unique tag (replica * sequence);removetombstones exactly the tags observed locally; an element is present iff it has an untombstoned add. A concurrent add whose tag the remove never observed survives the merge.
Everything is exported through the single opaque structure Crdt (see
crdt.sig).
The library is pure Standard ML using only the Basis library -- no FFI, no threads, no external dependencies. Verified on MLton and Poly/ML, with byte-identical, deterministic test output across both.
The test suite additionally uses sml-test
for property-based testing; it is a test-only dependency, vendored under
lib/ and wired in from test/sources.mlb. The library's own basis
(lib/github.com/sjqtentacles/sml-crdt/sources.mlb) stays dependency-free.
make test # build + run the suite under MLton (default)
make test-poly # run the suite under Poly/ML
make all-tests # run under both
make cleansmlpkg add github.com/sjqtentacles/sml-crdt
smlpkg syncThen reference the library basis from your own .mlb:
lib/github.com/sjqtentacles/sml-crdt/sml-crdt.mlb
For Poly/ML, use the crdt.sig and crdt.sml sources in order.
(* A grow-only counter across two replicas, merged. *)
val a = Crdt.GCounter.incBy 0 3 Crdt.GCounter.empty
val b = Crdt.GCounter.incBy 1 4 Crdt.GCounter.empty
val v = Crdt.GCounter.value (Crdt.GCounter.merge (a, b)) (* 7 *)
(* Last-write-wins register; higher timestamp wins. *)
val r = Crdt.LWWRegister.merge
( Crdt.LWWRegister.set {ts=1, replica=0} "old" Crdt.LWWRegister.empty
, Crdt.LWWRegister.set {ts=2, replica=1} "new" Crdt.LWWRegister.empty )
val cur = Crdt.LWWRegister.value r (* SOME "new" *)
(* Observed-remove set over ints. *)
val s = Crdt.ORSet.empty (op = : int * int -> bool)
val s = Crdt.ORSet.add 0 42 s
val present = Crdt.ORSet.member 42 (Crdt.ORSet.remove 42 s) (* false *)- Canonical state. Counters keep per-replica maps as assoc lists sorted by replica with zero entries dropped; the OR-set keeps its adds and tombstones as tag-sorted, deduped lists. Canonical representations make the semilattice laws hold structurally and make equality-by-value well defined and deterministic -- which is what the property tests compare.
- OR-set element equality.
ORSetis value-polymorphic, so it cannot use SML's built-in=. Each set carries an expliciteq : 'a * 'a -> boolsupplied at construction (ORSet.empty eq); all of the set's operations use it. This keeps the library dependency-free while supporting arbitrary element types. - Property tests. The algebraic laws (commutativity, associativity,
idempotence of
merge) are checked withTest.Propfrom the vendoredsml-test, using a fixed seed for reproducibility. Random states are built by generating small operation logs (lists of(replica, op)) and replaying them. Because OR-set tags must be globally unique, independently generated operands draw from disjoint replica-id pools so they never mint the same tag for different elements.