|
| 1 | +# sml-aead |
| 2 | + |
| 3 | +A single, algorithm-agnostic **AEAD** (authenticated encryption with associated |
| 4 | +data) facade in pure Standard ML, over the cipher primitives that already exist |
| 5 | +in the sjqtentacles ecosystem: |
| 6 | + |
| 7 | +- **ChaCha20-Poly1305** ([RFC 8439](https://www.rfc-editor.org/rfc/rfc8439)) — via vendored [`sml-chacha20`](https://github.com/sjqtentacles/sml-chacha20) |
| 8 | +- **AES-128-GCM / AES-256-GCM** (NIST GCM) — via vendored [`sml-aes`](https://github.com/sjqtentacles/sml-aes) |
| 9 | + |
| 10 | +Instead of programming against three different structures, callers (and |
| 11 | +downstream protocol work such as a future `sml-tls`/`sml-ssh`) target one |
| 12 | +`alg`-keyed `seal`/`open'` interface. This repo **implements no new |
| 13 | +cryptography** — it is a thin dispatch layer plus a consolidated home for the |
| 14 | +canonical RFC 8439 / NIST GCM test vectors. The standalone Poly1305 one-time |
| 15 | +MAC is also re-exported. |
| 16 | + |
| 17 | +No FFI, no external dependencies, and **deterministic** — byte-identical under |
| 18 | +both [MLton](http://mlton.org/) and [Poly/ML](https://www.polyml.org/). |
| 19 | + |
| 20 | +## Status |
| 21 | + |
| 22 | +- 50 assertions, green on MLton and Poly/ML (byte-identical output). |
| 23 | +- Vendors `sml-chacha20` + `sml-aes` (Layout B), so the repo builds standalone. |
| 24 | +- Validated against the **official vectors**: |
| 25 | + - **RFC 8439 §2.5.2** — Poly1305 one-time MAC. |
| 26 | + - **RFC 8439 §2.8.2** — the canonical ChaCha20-Poly1305 AEAD "sunscreen" |
| 27 | + example (ciphertext **and** tag, byte-exact). |
| 28 | + - **AES-GCM** — McGrew & Viega Appendix B test cases 4 (AES-128) and 16 |
| 29 | + (AES-256), the de-facto NIST GCM known-answer vectors (ciphertext **and** |
| 30 | + tag, byte-exact). |
| 31 | +- Tamper detection: a flipped tag byte, ciphertext byte, wrong AAD, or wrong |
| 32 | + nonce all make `open'` return `NONE`. |
| 33 | + |
| 34 | +> Building this facade surfaced and fixed a GHASH bug in `sml-aes` (its AES-GCM |
| 35 | +> produced correct ciphertext but a non-interoperable authentication tag); the |
| 36 | +> upstream fix plus NIST known-answer vectors landed in `sml-aes` first, and the |
| 37 | +> byte-identical fixed tree is vendored here. |
| 38 | +
|
| 39 | +## Install |
| 40 | + |
| 41 | +With [`smlpkg`](https://github.com/diku-dk/smlpkg): |
| 42 | + |
| 43 | +``` |
| 44 | +smlpkg add github.com/sjqtentacles/sml-aead |
| 45 | +smlpkg sync |
| 46 | +``` |
| 47 | + |
| 48 | +Include the MLB from your own (it pulls in the vendored `sml-chacha20` + |
| 49 | +`sml-aes`): |
| 50 | + |
| 51 | +``` |
| 52 | +local |
| 53 | + $(SML_LIB)/basis/basis.mlb |
| 54 | + lib/github.com/sjqtentacles/sml-aead/... (via smlpkg) |
| 55 | +in |
| 56 | + ... |
| 57 | +end |
| 58 | +``` |
| 59 | + |
| 60 | +This brings `structure Aead` (and the vendored primitive structures) into scope. |
| 61 | + |
| 62 | +## Quick start |
| 63 | + |
| 64 | +```sml |
| 65 | +(* keys, nonces, aad, plaintext and ciphertext are all raw byte strings: |
| 66 | + one char per byte, 0-255 *) |
| 67 | +
|
| 68 | +(* 32-byte key, 12-byte nonce for ChaCha20-Poly1305 *) |
| 69 | +val sealed = Aead.seal Aead.ChaCha20Poly1305 |
| 70 | + { key = key32, nonce = nonce12, aad = "header", plaintext = "secret" } |
| 71 | +(* sealed = ciphertext || 16-byte tag *) |
| 72 | +
|
| 73 | +val recovered = Aead.open' Aead.ChaCha20Poly1305 |
| 74 | + { key = key32, nonce = nonce12, aad = "header", ciphertext = sealed } |
| 75 | +(* SOME "secret", or NONE if anything was tampered with *) |
| 76 | +
|
| 77 | +(* AES-256-GCM: 32-byte key, 12-byte IV *) |
| 78 | +val sealed' = Aead.seal Aead.AesGcm256 |
| 79 | + { key = key32, nonce = iv12, aad = "", plaintext = "secret" } |
| 80 | +
|
| 81 | +(* standalone Poly1305 one-time MAC *) |
| 82 | +val tagHex = Aead.Poly1305.macHex oneTimeKey32 "message" |
| 83 | +``` |
| 84 | + |
| 85 | +## Demo |
| 86 | + |
| 87 | +`make example` runs [`examples/demo.sml`](examples/demo.sml), sealing one |
| 88 | +message under all three algorithms with fixed keys/nonces: |
| 89 | + |
| 90 | +``` |
| 91 | +sml-aead demo |
| 92 | +============= |
| 93 | +plaintext : attack at dawn |
| 94 | +aad : hdr:v1 |
| 95 | +
|
| 96 | +ChaCha20-Poly1305: |
| 97 | + sealed : 3fbe31666572e6086fe65702aec51e96d0c044d64b4e8c7b5adf1ab69cdf |
| 98 | + opened : true |
| 99 | +AES-128-GCM: |
| 100 | + sealed : a55a77ce6c24968e63fd3994b0499fdab3ff07e908165aa3a63689380b45 |
| 101 | + opened : true |
| 102 | +AES-256-GCM: |
| 103 | + sealed : 1c8aec772aa21ad2be556c7c7817dd8320f4fdcc354535b6df9615f13af7 |
| 104 | + opened : true |
| 105 | +
|
| 106 | +Poly1305 MAC of "attack at dawn" under a fixed key: |
| 107 | + daf8f71535db4152c9b148bda19abcfa |
| 108 | +``` |
| 109 | + |
| 110 | +## API |
| 111 | + |
| 112 | +```sml |
| 113 | +datatype alg = ChaCha20Poly1305 (* 256-bit key, 96-bit nonce *) |
| 114 | + | AesGcm128 (* 128-bit key, 96-bit nonce *) |
| 115 | + | AesGcm256 (* 256-bit key, 96-bit nonce *) |
| 116 | +
|
| 117 | +exception Aead of string (* wrong key/nonce length *) |
| 118 | +
|
| 119 | +val tagLen : int (* = 16 *) |
| 120 | +val keyLen : alg -> int |
| 121 | +val nonceLen : alg -> int |
| 122 | +
|
| 123 | +val seal : alg -> {key:string, nonce:string, aad:string, plaintext:string} |
| 124 | + -> string (* ciphertext || tag *) |
| 125 | +val open' : alg -> {key:string, nonce:string, aad:string, ciphertext:string} |
| 126 | + -> string option (* SOME plaintext, or NONE if auth fails *) |
| 127 | +
|
| 128 | +structure Poly1305 : |
| 129 | + sig |
| 130 | + val mac : string -> string -> string (* raw 16-byte tag *) |
| 131 | + val macHex : string -> string -> string (* 32-char hex tag *) |
| 132 | + end |
| 133 | +``` |
| 134 | + |
| 135 | +| Function | Behavior | |
| 136 | +| --- | --- | |
| 137 | +| `keyLen alg` / `nonceLen alg` | required key / nonce length in bytes (`nonceLen` is always 12) | |
| 138 | +| `seal alg {key, nonce, aad, plaintext}` | encrypt + authenticate; returns `ciphertext` with the 16-byte tag appended; raises `Aead` on a wrong key/nonce length | |
| 139 | +| `open' alg {key, nonce, aad, ciphertext}` | verify the tag in constant time and decrypt; `SOME plaintext` on success, `NONE` if authentication fails or the input is shorter than the tag | |
| 140 | +| `Poly1305.mac key msg` | RFC 8439 §2.5 one-time authenticator with a 32-byte `key` | |
| 141 | + |
| 142 | +### Conventions |
| 143 | + |
| 144 | +- **Bytes as `string`.** Keys, nonces, AAD, plaintext and ciphertext are raw |
| 145 | + byte strings (one char per byte, 0–255), matching the rest of the |
| 146 | + sjqtentacles crypto/codec family. They are never hex. |
| 147 | +- **Tag appended.** `seal` returns `ciphertext || tag` (16-byte tag), exactly as |
| 148 | + the underlying RFC 8439 / NIST GCM constructions specify, and `open'` expects |
| 149 | + that same layout. |
| 150 | +- **Nonce reuse is catastrophic.** As with any AEAD, never reuse a |
| 151 | + (key, nonce) pair; the caller is responsible for unique nonces. |
| 152 | +- **No new crypto.** The ciphers live in `sml-chacha20` / `sml-aes`; this repo |
| 153 | + only unifies and vector-tests them. |
| 154 | + |
| 155 | +## Build & test |
| 156 | + |
| 157 | +``` |
| 158 | +make test # MLton |
| 159 | +make test-poly # Poly/ML |
| 160 | +make all-tests # both |
| 161 | +make example # build + run examples/demo.sml |
| 162 | +make clean |
| 163 | +``` |
| 164 | + |
| 165 | +## License |
| 166 | + |
| 167 | +MIT — see [LICENSE](LICENSE). |
0 commit comments