
Add delta-compression using an operation log #700

Merged
Shatur merged 48 commits into simgine:master from cBournhonesque:cb/delta-compression on Jun 23, 2026


@cBournhonesque

Contributor

Lightyear implemented delta compression by providing an API to compute diffs between two component states, keeping track of past component states and manually sending diffs from the last ACK-ed state for each client.

This was inefficient for the common case of sending a Vec<Point> or VecDeque<Point>, i.e. an appendable data structure, because we had to compute diffs in an expensive manner (diffs between 2 vecs) and we had to store the full vec state for a lot of past ticks.

This PR implements delta-compression with a different approach:

  • the user has to provide methods to manually modify the component that they want to replicate, and use them in their code. This lets us avoid the expensive diff computation from state, since the user directly provides the diff methods!
  • replicon can then send those diffs, along with a u64 cursor to make sure that a received diff is not applied multiple times.
    On the send side: we only send the diffs since the last acked diff.
    On the receive side: we store received diffs in a buffer, and only apply them in order.
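
To make the two sides concrete, here is a minimal, self-contained sketch of the idea. It is not the code from this PR: `Op`, `Sender`, `Receiver` and all method names are hypothetical, the component is a plain `Vec<u32>`, and the log is never truncated (a real implementation would drop acked entries).

```rust
use std::collections::BTreeMap;

/// Hypothetical operation on a `Vec<u32>`-like component.
#[derive(Clone, Debug, PartialEq)]
enum Op {
    Push(u32),
    Clear,
}

fn apply(state: &mut Vec<u32>, op: &Op) {
    match op {
        Op::Push(v) => state.push(*v),
        Op::Clear => state.clear(),
    }
}

/// Sender side: the component state plus a log of the operations applied to it.
#[derive(Default)]
struct Sender {
    state: Vec<u32>,
    log: Vec<Op>, // log[i] carries cursor i + 1
}

impl Sender {
    /// Mutate through the log, so no diff ever has to be computed from state.
    fn apply_op(&mut self, op: Op) {
        apply(&mut self.state, &op);
        self.log.push(op);
    }

    /// Operations the client has not acknowledged yet, tagged with their cursor.
    fn ops_since(&self, acked: u64) -> Vec<(u64, Op)> {
        self.log
            .iter()
            .enumerate()
            .skip(acked as usize)
            .map(|(i, op)| (i as u64 + 1, op.clone()))
            .collect()
    }
}

/// Receiver side: buffers operations and applies them strictly in cursor order.
#[derive(Default)]
struct Receiver {
    state: Vec<u32>,
    applied: u64, // cursor of the last applied operation
    pending: BTreeMap<u64, Op>,
}

impl Receiver {
    fn receive(&mut self, cursor: u64, op: Op) {
        // Operations at or below `applied` were already applied and are dropped.
        if cursor > self.applied {
            self.pending.insert(cursor, op);
        }
        // Apply everything that is now contiguous with the applied prefix.
        while let Some(op) = self.pending.remove(&(self.applied + 1)) {
            apply(&mut self.state, &op);
            self.applied += 1;
        }
    }
}
```

With this shape, a duplicate or re-sent operation is harmless (its cursor is not ahead of `applied`), and an operation that arrives early simply waits in `pending` until the gap before it is filled.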

I've tried it in a personal game where I need to replicate a Vec, and I got a ~80% reduction in bandwidth.

@Shatur

Shatur commented Jun 3, 2026

Contributor

Sounds awesome, I like the idea!

I'll review tomorrow (it's 2:30 AM for me 😅).

@Shatur

Shatur commented Jun 4, 2026

Contributor

Wait, can't we already do this with ordered events? They're buffered by the backend and can represent operations. For example, what if, instead of using a special apply_op_delta, we register the component as "replicate once" and use server to apply operations?

One downside: right now this won't take visibility into account. If you use Broadcast, this will send the event to every client, and clients that don't see the entity will log an error on deserialization. But we can fix that: all events that touch entities implement the MapEntities trait and are registered in a special way. We could use this to do "fake" entity mapping that actually just collects the entities touched by the event into a Vec. Then, before sending, we check whether each entity is visible (we have an API for it). This shouldn't be hard to implement. It would be a nice change in general, because right now Broadcast is useless when you hide at least one entity.

What do you think about this? Am I missing something?

@cBournhonesque

Contributor Author

I think that in general it's preferred to keep all replication-related code in replication-land instead of message-land, since over time replication might accumulate more changes that only work with replication-messages.

Some issues with implementing this via messages:

  • visibility is better handled via replication as you mentioned
  • priority accumulation wouldn't work via messages
  • I would have to special-case this in my code by reading FromServer messages and re-applying manually on the receiver side to get the correct component, whereas right now on the receiver side I can treat Points as a normal component. The only 'special' handling code is on the sender where I apply an op.
  • Prediction/Interpolation rely on ServerMutateTicks to correctly know the ticks where all mutation messages were received. If these delta-ops are sent as separate ordered events they wouldn't impact ServerMutateTicks so we would also need to special-case this
  • future replication-only logic might also not work (priority per component, authority change, etc.)

@Shatur

Shatur commented Jun 4, 2026

Contributor

Agreed. Let's proceed with your approach. I'll finish my review in a few minutes. And I'll open an issue about visibility, I think it's something that we should handle regardless.

> I would have to special-case this in my code by reading FromServer messages

Everything you said makes sense, but I want to note that this part wouldn't require any special casing. You can send the event on the server instead of manually editing the component, and apply the changes using FromClient on both client and server. I.e. instead of defining a trait, you define a system/observer that applies the logic. Never mind, I think you're right about this one too.

@Shatur Shatur left a comment

Contributor


I like the idea, left a couple suggestions. Could you also update to the latest master?

Comment thread src/shared/replication/rules/component.rs Outdated
Comment thread src/shared/replication/op_delta.rs Outdated
Comment thread src/shared/replication/client_ticks.rs Outdated
Comment thread src/shared/replication/op_delta.rs Outdated
Comment thread src/shared/replication/rules.rs Outdated
Comment thread src/server.rs Outdated
Comment thread src/shared/replication/op_delta.rs Outdated
Comment thread src/shared/replication/op_delta.rs Outdated

@malfuu malfuu left a comment

Contributor


Hi! I think your solution is much stronger than what I was thinking. I tried going for a snapshot solution (think video encoding), by trading possible correctness for a smaller memory footprint (and leaving out diffs for variable length types). For my use case, I would have argued it had its worth; but I am also interested to see how it plays out with this solution.
Perhaps in the future I hope to argue for the snapshot strategy again.
I hope to test it out soon!

@cBournhonesque cBournhonesque force-pushed the cb/delta-compression branch from deb3e48 to a0f21ed Compare June 5, 2026 15:14
Comment thread CHANGELOG.md Outdated
Comment thread Cargo.toml Outdated
Comment thread src/shared/replication/diff.rs Outdated
Comment thread src/shared/replication/rules.rs Outdated
@cBournhonesque

Contributor Author

@Shatur should be ready to review!

@Shatur Shatur left a comment

Contributor


Thanks! Left a few small suggestions / questions.

Comment thread src/shared/replication/rules.rs Outdated
Comment thread src/shared/replication/diff.rs Outdated
Comment thread src/shared/replication/diff.rs Outdated
Comment thread src/shared/replication/diff.rs Outdated
Comment thread src/shared/replication/diff.rs Outdated
Comment thread src/shared/replication/diff.rs Outdated
Charles Bournhonesque added 13 commits June 13, 2026 14:47
Lightyear implemented delta compression by providing an API to compute diffs
between two component states, keeping track of past component states and manually
sending diffs from the last ACK-ed state for each client.

This was inefficient for the common case of sending a `Vec<Point>` or `VecDeque<Point>`,
i.e. an appendable data structure, because we had to compute diffs in an expensive manner
and we had to store the full vec for a lot of past ticks.

This PR implements delta-compression with a different approach:
- the user has to provide methods to manually modify the component that they want to replicate, and use them in their code. This lets us avoid the expensive diff computation from state, since the user directly provides the diffs!
- replicon can then send those diffs, along with a u64 cursor to make sure that a received diff is not applied multiple times

Change-Id: I3869167e660a403f37480c1d1daef1f088b06ca5
Change-Id: I686ce4d6eb09a563351862d4306775b46de45b6e
Change-Id: I64ed3af039e3c72b5545ae194f806597eb8372cc
Change-Id: Ic0c22b920c24660359149813fbcd311de9378c0e
…rite. This means we need to add a cursor index on the component itself.

Change-Id: Ia2561bb0111e53339ad65d945f796c9bb26eb79a
Change-Id: I6e1afd401856d3689218b1fbf52476d9f28088e9
Change-Id: I765f281420128dc894ae874d5cb97d980f96dd9e
Change-Id: I32a34846de52ee815628845e3e40a0260c7e575b
Change-Id: Ia8977db93de82cb67dfdc08afd2e073fb5313f0d
The diff-log was assigning a new patch index for each individual diff.
That means a component that could add 1000 new points in one tick would increment the
cursor by 1000!
The DIFF_HISTORY_LEN is then hard to choose and very component-specific.
Instead we now only increment the patch index when we send the component.

We could probably lower the history value
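
A rough sketch of the change this commit message describes, assuming a hypothetical `DiffLog` type (none of these names come from the PR): diffs recorded between sends are grouped, and the patch index advances once per send instead of once per diff.

```rust
/// Hypothetical diff log: all diffs recorded between two sends share one patch index.
#[derive(Default)]
struct DiffLog {
    pending: Vec<u32>,             // diffs recorded since the last send
    batches: Vec<(u64, Vec<u32>)>, // (patch index, diffs sent together)
    next_index: u64,
}

impl DiffLog {
    /// Records a diff without touching the patch index.
    fn record(&mut self, diff: u32) {
        self.pending.push(diff);
    }

    /// Called when the component is sent: allocates one index for the whole batch.
    /// Returns `None` if nothing changed since the last send.
    fn flush(&mut self) -> Option<u64> {
        if self.pending.is_empty() {
            return None;
        }
        let index = self.next_index;
        self.next_index += 1;
        self.batches.push((index, std::mem::take(&mut self.pending)));
        Some(index)
    }
}
```

With this grouping, a component that adds 1000 points in one tick advances the index by 1, so a history length can be reasoned about in sends rather than in individual diffs.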

Change-Id: I7009af651f16407fe159aaacb0d1bedabfd06bfb
Change-Id: I7c7dc584d45abc0b1e0a0690a4472c4075dce2c8
Change-Id: Iff38b0c401a4f64fd570c6c56a4f7b679e5033ec
Change-Id: Ifda8d3a8b1c843c3ee5300738abf32af88119fa7

@Shatur Shatur left a comment

Contributor


Left 2 last comments to the previous suggestions, but looks good overall. Once resolved, I'm planning to fix the CI and make a few minor changes myself (mostly just stylistic things and docs), don't want to bother you with this 🙂

Charles Bournhonesque added 2 commits June 13, 2026 15:25
Change-Id: I49dc4af018ffb4299bf61050e7cd32adff2e1d3f
Change-Id: I3d6b242b9e4bcdfea18da46bc174fd7a82fe6643

@Shatur Shatur left a comment

Contributor


LGTM! Tomorrow I'll do the mentioned changes and merge it 🙂

Shatur added 8 commits June 14, 2026 21:04
It logically belongs here + allows avoiding
one extra lookup during param initialization.

One `expect` is unavoidable with any approach
anyway.
While it doesn't make much sense, it's probably better to keep it for
now to avoid introducing
a breaking change.
Shorter, just a preference.
No reason to fill the `first_index` if it will be discarded. I think it's
simpler this way.
Used only in one place where we no longer need to unwrap.
Replace subtraction with cursor increment. Since
we no longer need to subtract, the > 0 check is no
longer needed.

Also, I think this reads better, for my taste.
@Shatur

Shatur commented Jun 14, 2026

Contributor

Got a bit busy with life stuff today, so I didn't finish, but pushed a few small changes. I'll wrap this up tomorrow 🙂

Previously, we used `u64`, which is unlikely to ever overflow.
Even at 60 Hz, it would take ~9.7 billion years.

But the code used checked operations, even though they would not help.
I was going to switch to regular operations, but realized that if we
handle overflows properly, we could save some traffic.

By default, integers in postcard are serialized with varint encoding:
values of 128 and above take 2 bytes, 16384 and above take 3 bytes, and
so on. But we can switch to `u16` with overflow handling and fixint
encoding, which always takes 2 bytes.

This also simplifies the logic inside `batches_after`.
I also had to replace `BTreeMap` with `HashMap`, but we don't need to
preserve the ordering since we iterate over indices, so it's fine.
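
For illustration, the size argument and the overflow handling can be sketched as below. `varint_len` follows the 7-bits-per-byte varint layout that postcard uses for unsigned integers. `is_newer` is the usual wrap-aware comparison for sequence numbers and is an assumption here, since the commit message does not spell out how the index comparison is implemented.

```rust
/// Bytes needed to encode `value` as a varint with 7 payload bits per byte.
fn varint_len(mut value: u64) -> usize {
    let mut len = 1;
    while value >= 0x80 {
        value >>= 7;
        len += 1;
    }
    len
}

/// Wrap-aware "a is newer than b" for `u16` indices. Only meaningful while
/// the two indices are less than half the range (32768) apart.
fn is_newer(a: u16, b: u16) -> bool {
    a != b && a.wrapping_sub(b) < 0x8000
}
```

Under varint encoding a `u16` index costs 1 to 3 bytes depending on its value, while the fixed-width encoding always costs `size_of::<u16>()`, i.e. 2 bytes. The price is that ordering has to go through something like `is_newer` rather than plain `<`, which also rules out an ordered map keyed by the raw index.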
@Shatur Shatur force-pushed the cb/delta-compression branch from 4a20cad to d751b3a Compare June 15, 2026 16:29
Comment thread src/shared/replication/diff.rs Outdated
Shatur added 2 commits June 15, 2026 20:20
It's easier to understand and faster.
I still want to rework the mutation serialization, but
it's a good step in this direction.
@Shatur Shatur force-pushed the cb/delta-compression branch from b16e3d7 to 6e7d6d4 Compare June 15, 2026 17:20
Comment thread src/shared/replication/diff.rs Outdated
@Shatur

Shatur commented Jun 22, 2026

Contributor

Done! Notable changes that might affect you:

  • PatchIndex is now a dedicated type that works similarly to tick, but wraps around u16. postcard uses varint encoding, so with the old approach, the first 128 indices would take 1 byte, while the rest would take 2+ bytes. With this approach, the index always takes only 2 bytes.
  • PatchHistory and PatchBuffer are no longer components and are accessible from ctx. See the default_* functions. I also moved the default functions to the rule_fns module. Because of this change, we no longer need separate DiffFns and marker overrides.
  • PatchIndex inside WireDiff::Snapshot is no longer optional. We always allocate an index if there is a change.
  • PatchIndex inside WireDiff::Patches is now the last index, not the first. This makes the math easier and more consistent.
  • External mutations are now handled properly by storing Tick. Previously, the detection was based on checking whether the number of missing patches was 0, but the client may not have acked an earlier patch.
  • I split queue_and_take_ready into two functions to make tests more convenient: push and drain_ready.
  • patches_after no longer allocates and now returns an iterator plus the current patch index. If the iterator is empty, the caller should fall back to a snapshot with the snapshot index. You can still get the first index by subtracting the number of received patches from it if needed.
  • I moved some of the tests into unit tests and simplified the integration tests. I removed the test with need_history since it’s not really testing a Replicon feature. I'd suggest adding a similar test inside Lightyear.
  • Patch storage is now flat. One index now corresponds to one patch. I think this is more efficient and easier to understand.
  • TestFnsEntityExt now works with patches. It's a helper for testing serialization functions.
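
As a rough picture of how a flat, wrapping-index receive buffer with separate `push` and `drain_ready` steps could look, here is a self-contained sketch. The names `PatchBuffer`, `push` and `drain_ready` are taken from the list above, but the fields, signatures and behavior shown are assumptions, not the merged implementation.

```rust
use std::collections::HashMap;

/// Hypothetical receive buffer keyed by a wrapping `u16` patch index.
/// One index corresponds to one patch.
struct PatchBuffer<P> {
    last_applied: u16,
    pending: HashMap<u16, P>,
}

impl<P> PatchBuffer<P> {
    fn new(last_applied: u16) -> Self {
        Self { last_applied, pending: HashMap::new() }
    }

    /// Stores a patch unless its index is at or behind the last applied one.
    fn push(&mut self, index: u16, patch: P) {
        let ahead = index.wrapping_sub(self.last_applied);
        if ahead != 0 && ahead < 0x8000 {
            self.pending.insert(index, patch);
        }
    }

    /// Returns buffered patches in index order, stopping at the first gap.
    fn drain_ready(&mut self) -> Vec<P> {
        let mut ready = Vec::new();
        loop {
            let next = self.last_applied.wrapping_add(1);
            match self.pending.remove(&next) {
                Some(patch) => {
                    self.last_applied = next;
                    ready.push(patch);
                }
                None => break,
            }
        }
        ready
    }
}
```

Because `drain_ready` walks consecutive indices instead of iterating the map, a `HashMap` is enough, and splitting the two steps lets a test push patches out of order and assert on exactly what becomes ready.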

I’d probably recommend taking another look at the PR diff. It's not big, most of the + lines come from tests. As you can see, there are almost no - lines now 🙂

Please double-check that this approach works for Lightyear.

We're planning to implement client-server replication.
Comment thread src/shared/replication/diff.rs Outdated
@Shatur Shatur enabled auto-merge (squash) June 23, 2026 21:23
@Shatur Shatur merged commit b3215c7 into simgine:master Jun 23, 2026
10 checks passed