Skip to content

fix(node): anchor the real old_sha and issue a per-ref certificate#72

Open
beardthelion wants to merge 7 commits into
mainfrom
fix/arweave-anchor-real-oldsha-per-ref-cert
Open

fix(node): anchor the real old_sha and issue a per-ref certificate#72
beardthelion wants to merge 7 commits into
mainfrom
fix/arweave-anchor-real-oldsha-per-ref-cert

Conversation

@beardthelion

@beardthelion beardthelion commented Jun 20, 2026

Copy link
Copy Markdown
Collaborator

Refs #26 (integrity subset only; the issue stays open for the foundation work).

What was wrong

The push handler built the Arweave anchor from a copy of the ref updates that had already dropped the real old_sha, then hardcoded old_sha to 64 zeros. The signed RefCertificate with the real old -> new stayed local. On top of that, a multi-ref push only ever produced one certificate (for the first ref), and an empty push signed a timestamp string as new_sha. The 64-zero placeholder was also the wrong width: git object ids are 40-hex SHA-1.

What this changes

  • Thread the parsed old_sha through the clone moved into the anchoring task, so the anchor records the real old -> new transition. This drops the "0".repeat(64) placeholder and fixes the width mismatch.
  • Issue a signed RefCertificate for every ref the push advanced, each carrying that ref's real old/new, instead of one cert for the first ref.

No new dependencies, no schema or config changes.

Scope

Deliberately the integrity subset of #26. Carrying the signed certificate on Turbo, capturing the pusher signature, chaining certificates, and gl-side verification against the anchor remain open under #26.

Tests

  • New arweave::tests::test_anchor_body_carries_real_old_sha locks the anchor-body contract: the upload must serialize the real old_sha/new_sha, never a placeholder.
  • Full gitlawb-node suite green (126 passed); fmt and clippy clean.

The per-ref certificate issuance and old_sha threading live in the git_receive_pack handler, which has no test harness (it needs a live Postgres and git), so those are verified by compile and review rather than an automated test.

Summary by CodeRabbit

Release Notes

  • Bug Fixes

    • Improved multi-ref receive handling so each ref update preserves and propagates its correct old_sha/new_sha across pinning, gossip, sync notifications, broadcasting, and anchoring.
    • Refactored sync notifications to send one signed /api/v1/sync/notify request per ref, including the ref’s real old_sha.
    • Updated push webhook behavior so newly created refs are marked correctly.
  • Tests

    • Added regression coverage for per-ref /sync/notify payloads and for anchoring payloads carrying the real old_sha/new_sha.

@coderabbitai

coderabbitai Bot commented Jun 20, 2026

Copy link
Copy Markdown

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3ad83b45-ac93-4f48-b324-9e89f9fff2ba

📥 Commits

Reviewing files that changed from the base of the PR and between cc83943 and f8658cb.

📒 Files selected for processing (2)
  • crates/gitlawb-node/src/api/repos.rs
  • crates/gitlawb-node/src/arweave.rs
🚧 Files skipped from review as they are similar to previous changes (2)
  • crates/gitlawb-node/src/arweave.rs
  • crates/gitlawb-node/src/api/repos.rs

📝 Walkthrough

Walkthrough

git_receive_pack now carries each ref update’s real old_sha through certificate issuance, peer notifications, gossip, and Arweave anchoring. The PR also adds regression tests for multi-ref notifications and anchor upload bodies.

Changes

Per-ref old_sha propagation

Layer / File(s) Summary
Tuple shape and ref certificates
crates/gitlawb-node/src/api/repos.rs
ref_updates_clone now carries (ref_name, old_sha, new_sha), ref certificates are issued once per ref update, and the push webhook created flag checks old_sha against ZERO_SHA.
Peer sync, gossip, and anchoring
crates/gitlawb-node/src/api/repos.rs
Pinata/gossip, GraphQL ref-update broadcasts, and Arweave anchor records now use each ref update’s old_sha.
Peer notify helpers and tests
crates/gitlawb-node/src/api/repos.rs
/api/v1/sync/notify is sent once per ref with that ref’s old_sha, and tests cover multi-ref notifications plus created-ref zero-hash forwarding.
Arweave anchor tests
crates/gitlawb-node/src/arweave.rs
Arweave tests use 40-character zero SHA fixtures and verify anchor_ref_update posts the real old_sha and new_sha to Irys.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related issues

Possibly related PRs

  • Gitlawb/node#34: Both PRs modify git_receive_pack downstream dissemination in crates/gitlawb-node/src/api/repos.rs, with this PR changing per-ref old_sha handling for notifications and anchoring.

Suggested labels

sev:medium

Suggested reviewers

  • kevincodex1

Poem

🐇 Hop hop, each ref tells true,
Old SHA travels through and through.
Zeros out, the real values stay,
Anchors and certs now track the way.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title is concise and accurately summarizes the main change: threading real old_sha and issuing per-ref certificates.
Description check ✅ Passed The description covers the bug, fix, scope, and tests, though it doesn't follow the template headings exactly.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/arweave-anchor-real-oldsha-per-ref-cert

Comment @coderabbitai help to get the list of available commands.

@beardthelion

Copy link
Copy Markdown
Collaborator Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 21, 2026

Copy link
Copy Markdown
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@beardthelion

Copy link
Copy Markdown
Collaborator Author

@coderabbitai full review

@coderabbitai

coderabbitai Bot commented Jun 21, 2026

Copy link
Copy Markdown
✅ Action performed

Full review finished.

@beardthelion beardthelion added kind:bug Defect fix — wrong or unsafe behavior crate:node gitlawb-node — the serving node and REST API subsystem:attestation Certificates, anchoring, per-ref attestation sev:high Major break or real security/trust risk, no easy workaround labels Jun 22, 2026
@kevincodex1

Copy link
Copy Markdown
Contributor

@coderabbitai make a full review of this PR

@coderabbitai

coderabbitai Bot commented Jun 24, 2026

Copy link
Copy Markdown

@kevincodex1 Sure! I'll kick off a full review of this PR right away.

✅ Action performed

Full review finished.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/gitlawb-node/src/api/repos.rs (1)

826-848: 🗄️ Data Integrity & Integration | 🟡 Minor | ⚡ Quick win

Gossip event still emits empty old_sha though the real value is now in hand.

The destructure discards old_sha as _old_sha at Line 826, then publish_ref_update hardcodes old_sha: "".to_string() at Line 842. This is the same integrity gap this PR closes for Arweave/DB — peers receiving the gossip event get no old→new transition. Bind and forward the real value.

Proposed fix
-            for (ref_name, _old_sha, new_sha) in &ref_updates_clone {
+            for (ref_name, old_sha, new_sha) in &ref_updates_clone {
                 let cid = cid_map.get(new_sha).map(|s| s.as_str());
@@
                         p2p.publish_ref_update(crate::p2p::RefUpdateEvent {
                             node_did: node_did_str.clone(),
                             pusher_did: pusher_did_clone.clone(),
                             repo: repo_slug.clone(),
                             ref_name: ref_name.clone(),
-                            old_sha: "".to_string(),
+                            old_sha: old_sha.clone(),
                             new_sha: new_sha.clone(),
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/gitlawb-node/src/api/repos.rs` around lines 826 - 848, The gossip
event being published via p2p.publish_ref_update() in the loop over
ref_updates_clone hardcodes old_sha as an empty string instead of using the
actual value available from the destructuring. Change the destructuring pattern
to capture the old_sha value instead of discarding it with an underscore, and
then use that captured value when constructing the RefUpdateEvent struct passed
to publish_ref_update, replacing the hardcoded empty string assignment.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/gitlawb-node/src/api/repos.rs`:
- Around line 920-928: The GraphQL RefUpdateBroadcast is collapsing multi-ref
pushes by only emitting the first ref with a hardcoded zero old_sha. Instead of
using ref_updates_clone.first() once to create a single broadcast, loop through
each tuple in ref_updates_clone and emit one RefUpdateBroadcast per ref. For
each iteration, extract the ref_name from the first tuple element, the real
old_sha from the second tuple element, and the new_sha from the third tuple
element, rather than hardcoding old_sha to zeros and only processing the first
ref. This ensures subscribers receive all refs in a multi-ref push with their
accurate state transitions.

---

Outside diff comments:
In `@crates/gitlawb-node/src/api/repos.rs`:
- Around line 826-848: The gossip event being published via
p2p.publish_ref_update() in the loop over ref_updates_clone hardcodes old_sha as
an empty string instead of using the actual value available from the
destructuring. Change the destructuring pattern to capture the old_sha value
instead of discarding it with an underscore, and then use that captured value
when constructing the RefUpdateEvent struct passed to publish_ref_update,
replacing the hardcoded empty string assignment.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e4bde16d-c752-4472-b903-6d9411879d89

📥 Commits

Reviewing files that changed from the base of the PR and between e37ea7f and 31ed541.

📒 Files selected for processing (2)
  • crates/gitlawb-node/src/api/repos.rs
  • crates/gitlawb-node/src/arweave.rs

Comment thread crates/gitlawb-node/src/api/repos.rs Outdated

@jatmn jatmn left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary

The PR correctly threads each ref's real old_sha into Arweave anchoring and issues one signed ref certificate per updated ref. I found one remaining integrity gap in downstream notifications/subscriptions that still flatten multi-ref pushes to the first ref and a zero old_sha.

Findings

  1. HTTP sync notifications still misreport multi-ref pushes and old_sha
    In git_receive_pack, the PR updates
    ef_updates_clone to include (ref_name, old_sha, new_sha) and uses that for Arweave anchoring, but the HTTP peer notification body still takes
    ef_name/
    ew_sha from
    ef_updates_clone.first() and hardcodes old_sha to all zeros (crates/gitlawb-node/src/api/repos.rs:868-878). On the receiving side,
    otify_sync persists those fields into ReceivedRefUpdate (crates/gitlawb-node/src/api/peers.rs:264-283), matching the gossipsub path that is supposed to carry per-ref provenance (crates/gitlawb-node/src/p2p/mod.rs:33-54,297-318). So after this change, a multi-ref push will anchor correctly on Arweave but peers reached via the HTTP fallback will still record only the first ref and the wrong previous SHA. That leaves the integrity bug unresolved for one of the codepaths this handler already supports.

beardthelion added a commit that referenced this pull request Jun 24, 2026
PR #72 made the Arweave/DB anchor path per-ref with the real old_sha but
left three sibling fan-out paths in git_receive_pack flattening multi-ref
pushes and/or emitting a placeholder old_sha. Bring them in line:

- gossip publish_ref_update now forwards the real old_sha (was ""), matching
  the RefUpdateEvent contract that documents all-zeros for a created ref
- HTTP /sync/notify sends one signed request per ref with its real old_sha;
  the receiver is single-ref, so a multi-ref push no longer drops refs 2..N
- GraphQL RefUpdateBroadcast emits one broadcast per ref with the real
  old_sha instead of only the first ref + a zero placeholder

No receiver, struct, or wire-schema change; the all-zeros-for-created-ref
convention is preserved on the wire. Addresses jatmn's CHANGES_REQUESTED and
the two CodeRabbit findings.
beardthelion added a commit that referenced this pull request Jun 24, 2026
Extract the HTTP /sync/notify send into notify_peer_of_ref (one ref) and
notify_peer_of_refs (one peer, looping refs), collapsing the deeply-nested
inline block in git_receive_pack and giving the fan-out a unit-testable seam.
The ref_name is now also logged on every warn arm.

Add mockito regression tests: a two-ref push must emit one request per ref
each carrying its real old_sha (guards against re-flattening to .first() with
a zero placeholder, the bug PR #72 fixed), and a created ref must forward the
all-zeros hash verbatim. Behavior is unchanged; this only adds the seam, the
tests, and ref_name in the logs.
@beardthelion

Copy link
Copy Markdown
Collaborator Author

@jatmn addressed in 3ed5acd. The HTTP /sync/notify body now loops ref_updates_clone and sends one signed request per ref carrying that ref's real old_sha, so a multi-ref push no longer collapses to the first ref with a zero placeholder. Applied the same per-ref treatment to the gossip event and the GraphQL broadcast so all four fan-out paths match the Arweave loop. Receiver and wire schema unchanged.

Follow-up in ada4e55 extracts the send into a notify_peer_of_ref/notify_peer_of_refs helper and adds a mockito regression test asserting a two-ref push emits one request per ref with the correct per-ref old_sha.

One thing worth a separate issue, not this PR: enqueue_sync uses ON CONFLICT DO NOTHING keyed on the UUID id, so it never dedupes, and the per-ref fan-out now enqueues N rows per peer, each driving a full refspec-less fetch. Pre-existing, but the fan-out amplifies it. Happy to file it.

Re-requesting review.

@beardthelion

Copy link
Copy Markdown
Collaborator Author

Filed the enqueue_sync dedup / redundant-fetch issue as #96 (out of scope for this PR).

@jatmn jatmn left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Findings

BLOCKING

1. Merge conflict with main must be resolved

Severity: Blocking
Evidence:

  • gh pr view 72 --repo Gitlawb/node --json mergeable returns "CONFLICTING".
  • git merge-base HEAD origin/main reports fatal: no merge base in the local checkout.
  • git diff origin/main...HEAD is therefore not usable; the PR branch diverged from main before commit 1af4fdf ("feat(node): pin the per-push object delta...").
  • The current main contains changes that the PR branch is missing, including:
    • owner_push_rejection enforcement and Extension(auth)-based identity in git_receive_pack (the PR branch still uses headers: HeaderMap and extract_did_from_auth).
    • EncryptedManifest::blobs changed from &'a [(String, String, Vec<String>)] to &'a [(String, String)].
    • pinata::pin_new_objects and ipfs_pin::pin_new_objects signatures changed to accept per-push candidates instead of a withheld set.
    • replication_withheld_set helper extracted in main.

Recommendation: Rebase or merge main into the PR branch and reconcile the conflicting ref-update fan-out logic with the new owner-push gate and pin-delta changes. Do not merge until the PR is reported as mergeable.


HIGH

2. EncryptedManifest blob tuple shape is inconsistent with current main

Severity: High
Evidence: In the PR branch, arweave.rs line 115 defines blobs: &'a [(String, String, Vec<String>)] and manifest_blob_json unpacks a 3-tuple; current main uses &'a [(String, String)] and a 2-tuple.

Impact: Once the merge conflict is resolved, the branch will not compile against main unless the tuple shape is reconciled. This is a direct conflict in the changed file.

Recommendation: During rebase, adopt main's 2-tuple shape (no recipient list in the manifest) and remove the third element from tuple destructuring in arweave.rs and the caller in repos.rs.


MEDIUM

3. Existing Arweave test still uses a 64-hex zero old_sha

Severity: Medium
Evidence: arweave.rs line 246 in test_anchor_noop_when_url_empty uses old_sha: "0000000000000000000000000000000000000000000000000000000000000000".into() (64 hex chars). The PR's commit message and new test correctly use 40-hex SHA-1 values.

Impact: Minor inconsistency in test fixtures; the production code path now uses real 40-hex SHAs from parse_ref_updates, so this does not affect runtime correctness, but it weakens the signal that the width mismatch was fixed.

Recommendation: Update the existing fixture to a 40-hex zero string to match the fixed contract.


LOW / NITS

4. notify_peer_of_ref takes many positional arguments

Severity: Low
Evidence: repos.rs lines 468–519 define notify_peer_of_ref with ten positional parameters.

Impact: The signature is error-prone for callers and reviewers, although it is only called from one place in production and from tests.

Recommendation: Consider grouping related parameters into small structs (e.g., RefUpdate { ref_name, old_sha, new_sha } and PeerContext { peer_did, notify_url, ... }) in a follow-up. This is not blocking.

5. Tests rely on real HTTP mock timing

Severity: Low
Evidence: The new sync-notify tests use mockito with .expect(1). They passed locally but are sensitive to request ordering; mockito by default matches any request that satisfies the matcher, and the two mocks overlap only by ref_name, so they are correct as written.

Recommendation: No change required; the test design is sound.


The push handler discarded the real old->new transition before anchoring
and only certified the first ref of a multi-ref push, so the permanent
Arweave record was a weaker, incomplete copy of facts the node had
already parsed.

- Thread the parsed old_sha through the clone moved into the anchor task
  instead of hardcoding a 64-zero placeholder. This also fixes the SHA
  width mismatch: object ids are 40-hex SHA-1, not 64-hex.
- Issue a signed RefCertificate for every ref the push advanced, each
  carrying that ref's real old/new, rather than one cert for the first
  ref (which, on an empty push, signed a timestamp string as new_sha).

Lock the anchor-body contract with a test asserting the upload carries
the real old_sha and new_sha, never a placeholder.

Scope is the integrity subset only. Carrying the signed certificate on
Turbo, capturing the pusher signature, chaining certificates, and
gl-side verification remain open under #26.

Refs #26
PR #72 made the Arweave/DB anchor path per-ref with the real old_sha but
left three sibling fan-out paths in git_receive_pack flattening multi-ref
pushes and/or emitting a placeholder old_sha. Bring them in line:

- gossip publish_ref_update now forwards the real old_sha (was ""), matching
  the RefUpdateEvent contract that documents all-zeros for a created ref
- HTTP /sync/notify sends one signed request per ref with its real old_sha;
  the receiver is single-ref, so a multi-ref push no longer drops refs 2..N
- GraphQL RefUpdateBroadcast emits one broadcast per ref with the real
  old_sha instead of only the first ref + a zero placeholder

No receiver, struct, or wire-schema change; the all-zeros-for-created-ref
convention is preserved on the wire. Addresses jatmn's CHANGES_REQUESTED and
the two CodeRabbit findings.
Extract the HTTP /sync/notify send into notify_peer_of_ref (one ref) and
notify_peer_of_refs (one peer, looping refs), collapsing the deeply-nested
inline block in git_receive_pack and giving the fan-out a unit-testable seam.
The ref_name is now also logged on every warn arm.

Add mockito regression tests: a two-ref push must emit one request per ref
each carrying its real old_sha (guards against re-flattening to .first() with
a zero placeholder, the bug PR #72 fixed), and a created ref must forward the
all-zeros hash verbatim. Behavior is unchanged; this only adds the seam, the
tests, and ref_name in the logs.
Prefix the two new sync-notify tests with test_ and use the _mock binding
convention (matching arweave.rs/pinata.rs). No behavior change.
@beardthelion beardthelion force-pushed the fix/arweave-anchor-real-oldsha-per-ref-cert branch from 54ba2e1 to cc83943 Compare June 24, 2026 13:09
@beardthelion beardthelion requested a review from jatmn June 24, 2026 13:10

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
crates/gitlawb-node/src/api/repos.rs (2)

1071-1094: 🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

Don’t let best-effort peer notify block local broadcast and anchoring.

This loop awaits every peer/ref HTTP request before Lines 1099-1144 run. With per-ref fan-out, one slow peer can delay GraphQL subscribers and Arweave anchoring. Move this block after the local broadcast/anchor work, or spawn it as an isolated best-effort task.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/gitlawb-node/src/api/repos.rs` around lines 1071 - 1094, The peer
notification fan-out in repos.rs is currently awaited inline, which can block
the local broadcast and anchoring path. Move the announce/list_peers loop that
calls notify_peer_of_refs out of the main ref-update flow so the GraphQL
subscriber broadcast and Arweave anchoring run first, or run that peer-notify
work as an isolated best-effort task. Keep the change localized around the
announce block and the notify_peer_of_refs call sites so the existing local work
is not delayed by slow peers.

210-234: 🎯 Functional Correctness | 🟠 Major

Align paginated owner filtering with did_matches. list_all_repos_paged still uses owner_did = $1 OR owner_did LIKE '%:' || $1, so a bare owner ID can match other DID methods with the same suffix. That makes ?owner=...&limit=... behave differently from the unpaginated path.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/gitlawb-node/src/api/repos.rs` around lines 210 - 234, The paginated
branch in repos listing applies owner filtering differently from the
non-paginated branch, so `list_all_repos_paged` must be updated to use the same
`did_matches` semantics as the `filtered` logic in `repos.rs`. Adjust the
query-building/path used by `list_all_repos_paged` so `?owner=...&limit=...`
matches only the intended DID, not any DID with the same suffix, and keep the
behavior consistent with `crate::api::did_matches`.

Source: Learnings

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@crates/gitlawb-node/src/api/repos.rs`:
- Around line 1071-1094: The peer notification fan-out in repos.rs is currently
awaited inline, which can block the local broadcast and anchoring path. Move the
announce/list_peers loop that calls notify_peer_of_refs out of the main
ref-update flow so the GraphQL subscriber broadcast and Arweave anchoring run
first, or run that peer-notify work as an isolated best-effort task. Keep the
change localized around the announce block and the notify_peer_of_refs call
sites so the existing local work is not delayed by slow peers.
- Around line 210-234: The paginated branch in repos listing applies owner
filtering differently from the non-paginated branch, so `list_all_repos_paged`
must be updated to use the same `did_matches` semantics as the `filtered` logic
in `repos.rs`. Adjust the query-building/path used by `list_all_repos_paged` so
`?owner=...&limit=...` matches only the intended DID, not any DID with the same
suffix, and keep the behavior consistent with `crate::api::did_matches`.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 191bb03d-8643-43f5-b311-975e943d972d

📥 Commits

Reviewing files that changed from the base of the PR and between 54ba2e1 and cc83943.

📒 Files selected for processing (2)
  • crates/gitlawb-node/src/api/repos.rs
  • crates/gitlawb-node/src/arweave.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • crates/gitlawb-node/src/arweave.rs

@jatmn jatmn left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Findings

BLOCKING

1. cargo fmt --check fails on test-module import order

Severity: Blocking
Evidence:

  • cargo fmt --check --all reports a diff in crates/gitlawb-node/src/api/repos.rs at the #[cfg(test)] mod tests block.
  • Current code (line 1447):
    use super::*;
    use gitlawb_core::identity::Keypair;
    use crate::auth::caller_authorized_to_push;
    use crate::error::AppError;
  • Expected order (rustfmt groups use crate::... before external crates):
    use super::*;
    use crate::auth::caller_authorized_to_push;
    use crate::error::AppError;
    use gitlawb_core::identity::Keypair;
  • GitHub Actions fmt + clippy run also fails with the same cargo fmt --check exit code.

Impact: Blocks CI and therefore merge.
Recommendation: Run cargo fmt --all and commit the resulting change.


MEDIUM

2. Best-effort peer notification now blocks local broadcast and Arweave anchoring

Severity: Medium
Evidence:

  • In crates/gitlawb-node/src/api/repos.rs, the spawned background task orders work as follows (lines 1068–1150):
    1. Gossip publish (for ref_updates_clone, lines 1041–1066)
    2. HTTP peer notify loop (if announce { ... notify_peer_of_refs(...).await; }, lines 1068–1097)
    3. GraphQL broadcast (lines 1099–1111)
    4. Arweave anchoring (lines 1113–1150)
  • notify_peer_of_refs awaits every peer/ref HTTP request sequentially. With per-ref fan-out, a slow or unreachable peer can now delay the GraphQL broadcast and Arweave anchoring by the sum of all HTTP round-trips.
  • This is inside a tokio::spawn task, so it does not block the HTTP response, but it does delay internal downstream work that does not depend on peer notification.

Impact: Local subscribers and permanent anchors are delayed by best-effort peer notifications.
Recommendation: Move the HTTP peer-notification block after the GraphQL broadcast and Arweave anchoring, or spawn it as an independent best-effort task so that local work cannot be delayed by peer round-trips.


LOW / NITS

3. Existing Arweave test fixtures still use 64-hex zero old_sha

Severity: Low
Evidence:

  • crates/gitlawb-node/src/arweave.rs line 246:
    old_sha: "0000000000000000000000000000000000000000000000000000000000000000".into(),
  • crates/gitlawb-node/src/arweave.rs line 273:
    old_sha: "0".repeat(64),

Impact: The PR description and the new test_anchor_body_carries_real_old_sha correctly use 40-hex SHA-1 values and state that the old 64-hex placeholder was the wrong width. Leaving the older fixtures at 64 hex weakens the signal that the width mismatch was fixed, although it does not affect production correctness.
Recommendation: Update both fixtures to the canonical 40-hex zero string (0000000000000000000000000000000000000000) to match the fixed contract.

4. Webhook created check duplicates the all-zeros literal

Severity: Low
Evidence:

  • crates/gitlawb-node/src/api/repos.rs line 832:
    "created": update.old_sha == "0000000000000000000000000000000000000000",
  • The file already defines const ZERO_SHA: &str = "0000000000000000000000000000000000000000"; at line 21.

Impact: Minor maintainability risk; the literal and constant could drift.
Recommendation: Use update.old_sha == ZERO_SHA in the webhook payload.


- Reorder the repos test-module imports so `use crate::...` precedes the
  external `gitlawb_core` import (unblocks cargo fmt --check / CI).
- Webhook `created` check reuses the existing ZERO_SHA constant instead of
  repeating the 40-hex literal.
- Arweave test fixtures use the canonical 40-hex zero old_sha, matching the
  SHA-1 width the per-ref anchoring contract now carries.
The post-push background task awaited the HTTP peer-notify fan-out before
the GraphQL subscription broadcast and Arweave anchoring. With per-ref
notification a slow or unreachable peer could delay both local subscribers
and permanent anchoring by the sum of the peer round-trips. Move the
peer-notify loop to run last; peers already learned of the push via the
gossip publish earlier in the task, so this only deprioritizes the
best-effort HTTP fallback.
Review follow-ups on the two prior commits:
- The peer-notify ordering comment claimed peers 'already learned via the
  gossip publish', but that publish is gated on a present p2p handle; with
  no handle (the very case the HTTP fallback exists for) it does not run.
  State the actual rationale: peer-notify is the lowest-priority best-effort
  step, ordered last so it cannot delay local broadcast or anchoring.
- The all-zeros notify test still inlined the 40-hex literal; point it at the
  ZERO_SHA constant like the production path.
@beardthelion

Copy link
Copy Markdown
Collaborator Author

All four addressed.

1 (blocking) — fmt. Fixed the test-module import order; cargo fmt --all -- --check is clean.

2 — peer notification blocking local work. Moved the HTTP peer-notify loop to run last in the background task, after the GraphQL broadcast and Arweave anchoring. Peers are notified later when Irys is slow, but that's the intended deprioritization: it's the lowest-priority best-effort step and can no longer delay local subscribers or anchoring. The whole if announce { … } block moved as a unit, so the readability gate still applies unchanged.

3 — Arweave fixtures. Both old_sha placeholders are now the canonical 40-hex zero string.

4 — duplicated zero literal. The webhook created check uses ZERO_SHA; I also caught a third inline copy in the all-zeros notify test and pointed it at the constant.

One thing I'll leave for a separate issue rather than widen this PR: notify_peer_of_ref sends with no per-request timeout, so a hung peer can hold the spawned task open. Pre-existing (this PR only relocated the loop), but worth tracking.

@beardthelion beardthelion requested a review from jatmn June 24, 2026 17:00
@beardthelion

Copy link
Copy Markdown
Collaborator Author

Correcting my own note above, not finding #2: I said the peer-notify path has no per-request timeout. That isn't right, and I don't want to leave it on the record.

The shared http_client is built in main.rs with a 10s total timeout and redirect: none, and the post-push task passes that same client through notify_peer_of_refsnotify_peer_of_ref, where the POST is sent without overriding it. So each peer request is bounded at ~10s rather than able to hang the task. There's no separate timeout issue to file.

The real residual is the one your #2 was pointing at: the notify loop is sequential over peers × refs, so a batch of unreachable peers can still sum to roughly peers × 10s of work. Running it after the broadcast and anchoring keeps that off the local path, which was the goal here. If peer fan-out grows, parallelizing or bounding that loop is a reasonable follow-up, but it doesn't seem to block this PR. Happy to file a tracking issue if you'd rather have it written down.

@jatmn jatmn left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, lets open a tracking issue for followup on that one.

Otherwise looks good.

@kevincodex1 LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

crate:node gitlawb-node — the serving node and REST API kind:bug Defect fix — wrong or unsafe behavior sev:high Major break or real security/trust risk, no easy workaround subsystem:attestation Certificates, anchoring, per-ref attestation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants