Tree snapshot and delta sync in JavaScript. Content-defined chunking splits files into variable-size pieces whose boundaries are determined by content, not position — so inserting a byte at the start of a file only invalidates the chunks immediately around the insertion. Chunks are stored by SHA-256 hash, giving automatic cross-file deduplication. Folder trees are fingerprinted as Merkle trees, so unchanged subtrees can be skipped with a single hash compare.
Zero dependencies beyond @noble/hashes.
pnpm add @fizzyflow/doublesyncimport DoubleSync, {
CDCStore,
DoubleSyncMemoryFolder,
DoubleSyncMemoryFile,
DoubleSyncSession,
} from '@fizzyflow/doublesync';
const sync = new DoubleSync({ avgSize: 8192 });
// --- Sender ---
const root = new DoubleSyncMemoryFolder('project');
const src = await root.addFolder('src');
await src.addFile('main.js', new TextEncoder().encode('console.log("hello")'));
const senderStore = new CDCStore();
const sender = new DoubleSyncSession({ sync, senderStore });
const patch = await sender.snapshot(root); // full patch (first call)
// --- Receiver ---
const dest = new DoubleSyncMemoryFolder('copy');
const receiverStore = new CDCStore();
const receiver = new DoubleSyncSession({ sync, store: receiverStore, dest });
await receiver.apply(patch);
// dest now has src/main.js with the same content
// --- Incremental update ---
await src.addFile('main.js', new TextEncoder().encode('console.log("updated")'));
const diffPatch = await sender.snapshot(root); // diff patch (subsequent calls)
await receiver.apply(diffPatch);
// only changed chunks are transferredFiles are split into variable-size chunks using the FastCDC algorithm with normalized chunking. A rolling Gear hash scans byte-by-byte and declares boundaries when the hash's low bits match a mask. Two masks keep chunk sizes tightly centered around avgSize:
- Strict mask in
[minSize, avgSize)— fewer boundaries, avoids tiny chunks - Lenient mask in
[avgSize, maxSize)— more boundaries, avoids huge chunks - Hard clamps at
minSize(default:avgSize / 4, min 64) andmaxSize(default:avgSize * 8)
Because boundaries are determined by content, inserting or deleting bytes only affects the immediately surrounding chunks — all other chunks keep their identity (same hash) and don't need retransmission.
A snapshot is a binary document describing a folder tree at a point in time. Each file entry carries a CDC manifest (ordered list of chunk hashes) and a fingerprint (SHA-256 of content). Each folder carries a fingerprint computed recursively from its sorted children, forming a Merkle tree.
A full patch bundles a snapshot with every chunk it references — fully self-contained.
A diff patch carries only the operations needed to go from the previous snapshot to the current one (add file, replace file, remove, add empty folder), plus only the new chunks. The receiver must hold the previous snapshot to apply it.
DoubleSyncSession manages this automatically: first call to snapshot() returns a full patch, subsequent calls return diff patches.
Any patch can be wrapped in a gzip compression envelope. The receiver auto-detects and unwraps transparently. Enable with compress: 'gzip' on DoubleSyncSession.
The core sync engine.
const sync = new DoubleSync({ avgSize: 8192, enableMerkleSkip: true });| Parameter | Type | Default | Description |
|---|---|---|---|
avgSize |
number | 8192 | Target average chunk size (must be power of 2) |
minSize |
number | avgSize / 4 |
Minimum chunk size (min 64) |
maxSize |
number | avgSize * 8 |
Maximum chunk size |
enableMerkleSkip |
boolean | true | Skip unchanged subtrees by fingerprint comparison |
| Method | Returns | Description |
|---|---|---|
buildSnapshot({ root, store }) |
Promise<Uint8Array> |
Walk tree, chunk files into store, return snapshot document |
buildPatch({ root, senderStore, receiverStore? }) |
Promise<Uint8Array> |
Build self-contained patch (snapshot + all chunks) |
buildDiffPatch({ root, senderStore, receiverStore?, prevSnapshot }) |
Promise<{ patch, newSnapshot }> |
Build incremental diff patch against previous snapshot |
applySnapshot({ snapshot, store, dest }) |
Promise<void> |
Reconstruct tree from snapshot + chunks in store |
applyPatch({ patch, store, dest }) |
Promise<void> |
Apply full patch to destination |
applyDiffPatch({ patch, store, dest, prevSnapshot }) |
Promise<Uint8Array> |
Apply diff patch; returns new snapshot |
missingChunks({ snapshot, store }) |
Array<{ hash, length }> |
List chunks referenced by snapshot that store lacks |
Stateful sender/receiver wrapper that tracks the previous snapshot automatically.
// Sender
const sender = new DoubleSyncSession({ sync, senderStore, compress: 'gzip' });
const patch = await sender.snapshot(root);
// Receiver
const receiver = new DoubleSyncSession({ sync, store, dest });
await receiver.apply(patch);| Parameter | Type | Required | Description |
|---|---|---|---|
sync |
DoubleSync | yes | Sync engine instance |
senderStore |
CDCStore | for snapshot() |
Chunk store for building patches |
store |
CDCStore | for apply() |
Chunk store for receiving |
dest |
DoubleSyncFolder | for apply() |
Target folder to write into |
compress |
'gzip' | false |
no | Compress outgoing patches |
| Method | Returns | Description |
|---|---|---|
snapshot(root) |
Promise<Uint8Array> |
Build next patch (full on first call, diff on subsequent) |
apply(patch) |
Promise<void> |
Apply patch (auto-detects full vs diff, handles decompression) |
| Property | Type | Description |
|---|---|---|
lastSnapshot |
Uint8Array | null |
Snapshot from the last snapshot() or apply() call |
Low-level chunking engine for a single byte buffer (no tree awareness).
const cdc = new CDCSync({ avgSize: 8192 });
const { manifest, chunks } = cdc.split(data);
const rebuilt = cdc.reconstruct({ manifest, store });| Method | Returns | Description |
|---|---|---|
split(data) |
{ manifest, chunks } |
Split buffer, return manifest + chunk list |
splitInto({ data, store }) |
Uint8Array |
Split + insert chunks into store; return manifest bytes |
reconstruct({ manifest, store }) |
Uint8Array |
Reassemble buffer from manifest + store |
missingChunks({ manifest, store }) |
Array<{ hash, length }> |
Chunks manifest references that store lacks |
In-memory content-addressable chunk store.
const store = new CDCStore();
const hash = store.put(bytes); // store by content, returns SHA-256
store.putWithHash(hash, bytes); // store by known hash
store.has(hash); // boolean
const data = store.get(hash); // Uint8Array | null
store.delete(hash); // boolean| Static Method | Returns | Description |
|---|---|---|
CDCStore.hashOf(bytes) |
Uint8Array |
SHA-256 of bytes |
CDCStore.hashToHex(hash) |
string |
32-byte hash to lowercase hex |
CDCStore.hashFromHex(hex) |
Uint8Array |
Hex string to 32-byte hash |
DoubleSync operates on an abstract filesystem interface. Two in-memory implementations are provided; extend DoubleSyncFile / DoubleSyncFolder to back them with real FS, S3, a database, etc.
| Method | Returns | Description |
|---|---|---|
list() |
Promise<Array> |
Direct children (files and folders) |
walk(prefix?) |
async generator | Depth-first { path, file } for all files |
findByPath(parts) |
Promise<File | Folder | null> |
Navigate by path segments |
addFile(name, bytes) |
Promise<DoubleSyncFile> |
Create or replace file |
addFolder(name) |
Promise<DoubleSyncFolder> |
Create or return existing subfolder |
removeChild(name) |
Promise<void> |
Remove direct child |
ensurePath(parts) |
Promise<DoubleSyncFolder> |
Create nested folder path |
| Method | Returns | Description |
|---|---|---|
getContent() |
Promise<Uint8Array> |
Read file bytes |
getSize() |
Promise<number> |
File size |
getFingerprint() |
Promise<Uint8Array> |
SHA-256 of content |
setContent(bytes) |
Promise<void> |
Write file bytes |
DoubleSyncMemoryFolder— children stored in aMapDoubleSyncMemoryFile— content stored asUint8Array
import { DoubleSyncFormat, MAGICS } from '@fizzyflow/doublesync';
DoubleSyncFormat.detect(bytes); // 'manifest' | 'snapshot' | 'patch' | 'diff-patch' | 'compressed' | null
DoubleSyncFormat.isDoubleSync(bytes); // true if any known format
DoubleSyncFormat.isPatch(bytes); // true if full patch
DoubleSyncFormat.isDiffPatch(bytes); // true if diff patch
DoubleSyncFormat.isCompressed(bytes); // true if compression envelopeParsed view of a manifest document — the ordered list of chunk hashes and lengths that fully describes one file under CDC. Pair with CDCStore to reconstruct the original bytes.
import { CDCManifest } from '@fizzyflow/doublesync';
const m = new CDCManifest(bytes); // parse from wire format
m.chunkCount; // number
m.totalLength; // bigint — original file byte count
for (const { hash, length } of m.chunks()) { /* hash: Uint8Array(32), length: number */ }
CDCManifest.fromChunks(chunks); // build manifest from an array of { hash, length }Pure diff algorithm: given two parsed DoubleSyncSnapshot instances, returns the ordered list of operations that turn prev into next. Used internally by DoubleSync.buildDiffPatch but available for advanced use.
import { diffSnapshots } from '@fizzyflow/doublesync';
const ops = diffSnapshots(prevSnapshot, nextSnapshot);
// ops: Array<{ kind, path, manifest?, fingerprint? }>
// kind: 'replace_file' | 'add_file' | 'add_empty_folder' | 'remove'For advanced use, parse patch documents directly:
import { DoubleSyncPatch, DoubleSyncDiffPatch, DoubleSyncSnapshot } from '@fizzyflow/doublesync';
// Full patch
const p = new DoubleSyncPatch(bytes);
p.snapshot; // Uint8Array — embedded snapshot
p.chunkCount; // number
for (const { hash, bytes } of p.chunks()) { /* ... */ }
// Diff patch
const d = new DoubleSyncDiffPatch(bytes);
d.prevSnapshotHash; // Uint8Array (32 bytes)
d.opCount; // number
for (const { kind, path, manifest, fingerprint } of d.ops()) { /* ... */ }
// kind: 'replace_file' | 'add_file' | 'add_empty_folder' | 'remove'
// Snapshot
const s = new DoubleSyncSnapshot(bytes);
s.fileCount; // number
s.folderCount; // number
s.totalBytes; // bigint
for (const file of s.files()) { /* ... */ }The following are also exported for advanced/internal use:
| Export | Description |
|---|---|
FastCDC |
Raw FastCDC chunker (used by CDCSync internally) |
GEAR |
The 256-entry Gear hash table used by FastCDC |
folderFingerprint |
Compute a Merkle fingerprint from a list of { name, fingerprint } child specs |
All documents use little-endian byte order.
| Document | Magic | Description |
|---|---|---|
| CDC Manifest | 0xCDCAA001 |
Ordered chunk hash + length list for one file |
| Snapshot | 0xCDCB0001 |
Full tree structure with manifests and fingerprints |
| Full Patch | 0xCDCB0101 |
Snapshot + all referenced chunks |
| Diff Patch | 0xCDCB0102 |
Prev-snapshot hash + ops + new chunks only |
| Compressed | 0xCDCB02FF |
Gzip envelope wrapping any of the above |
pnpm testTests use vitest:
test/cdc.test.js— FastCDC chunking, manifest round-trip, deduplicationtest/double-sync.test.js— end-to-end tree sync: build snapshot, apply, verifytest/double-sync-diff.test.js— incremental diff patches, add/remove/replace opstest/fingerprint.test.js— folder Merkle fingerprint correctnesstest/double-sync-compressed.test.js— gzip compression envelope round-triptest/double-sync-stress.test.js— large trees, edge casestest/format.test.js— magic number detection and classification
AGPL-3.0