Skip to content

feat: add ExecApprovalsStore write path and wire coordinator side effects#526

Open
AlexAlves87 wants to merge 7 commits into
openclaw:mainfrom
AlexAlves87:feat/exec-approvals-write-path
Open

feat: add ExecApprovalsStore write path and wire coordinator side effects#526
AlexAlves87 wants to merge 7 commits into
openclaw:mainfrom
AlexAlves87:feat/exec-approvals-write-path

Conversation

@AlexAlves87

@AlexAlves87 AlexAlves87 commented May 23, 2026

Copy link
Copy Markdown
Contributor

What

Adds the write path to ExecApprovalsStore and wires the two side-effect calls into ExecApprovalsCoordinator, closing the persistence loop for the exec approvals V2 pipeline.

Store — new public API:

  • AddAllowlistEntryAsync(agentId, pattern) — persists a new allowlist entry after an AllowAlways prompt decision. Deduplicates on write (OrdinalIgnoreCase). Returns true if the entry is present after the call (added or already there), false on empty pattern or I/O failure. New entries carry {id, pattern}; lastUsedAt is stamped later by RecordAllowlistUseAsync on first successful use (matches macOS parity).
  • RecordAllowlistUseAsync(agentId, pattern, resolvedPath) — updates lastUsedAt and lastResolvedPath for every matching entry after a final allow. No command text is persisted. Returns false if pattern not found or on I/O failure.

Store — private infrastructure:

  • UpdateFileAsync(mutate) — load → mutate → atomic save, serialized by the existing SemaphoreSlim. Never throws. Refuses to overwrite a malformed file. Handles transient IOException on the atomic move as a degraded path: logs Warn, no retry.

Coordinator — side effects wired:

  • RecordAllowlistUsageAsync fires on both allow exit points: the pass1 pre-approved branch and the post-pass2 branch. Both are required to cover the common allowlist-satisfied case.
  • PersistAllowlistEntriesAsync fires only after pass2 = Allow and followupDecision == AllowAlways, strictly outside the _promptLock block.
  • Each side effect is wrapped in its own independent best-effort try/catch so a failure in one does not skip the other and cannot flip an already-decided allow into InternalError.

Not wired in production yet: the coordinator is still not referenced in any production src/ file. The ProductionWiring_CoordinatorNotReferencedInSrc test remains green.

Design notes

Side effects fire strictly after the final allow decision is confirmed — not before the second evaluator pass. This is a deliberate structural safety choice: the guarantee is structural rather than relying on proof that Evaluate(context, AllowAlways) always produces Allow.

Pattern validation in AddAllowlistEntryAsync is non-empty only, matching macOS parity. Basename-only patterns are inert at match time but not rejected at persist time.

No command text reaches disk. lastUsedAt (Unix ms) and lastResolvedPath provide sufficient operational metadata without exposing tokens or secrets embedded in command arguments.

Testing

149 ExecApprovals tests passing, 0 failures. Build: 0 warnings, 0 errors (TreatWarningsAsErrors clean). Tray tests: 961/961.

Store tests cover: success paths, dedup, not-found, malformed-file refusal, I/O failure degradation on both mutators, concurrency (5 concurrent writes produce a single entry), round-trip JSON validation, and a regression guard asserting lastUsedCommand is absent from the persisted file.

Coordinator tests cover: AllowAlways persistence, non-allowlist security guard, duplicate pattern dedup, allowlist usage recording, allowlist-not-satisfied guard, pass1 pre-approved path, and fallback path with allowlist satisfied.

Real behavior proof

End-to-end coordinator/store runtime proof using real filesystem I/O (test RuntimeProof_AllowAlways_PersistsAndRecordsLastUsed in tests/OpenClaw.Shared.Tests/ExecApprovalsCoordinatorTests.cs).

Scope clarification: this is slice runtime proof, not full production runtime proof. The coordinator is intentionally not wired in production yet. A follow-up production wiring slice connects the coordinator, and the WinUI prompt dialog is a separate slice after that. UI-driven proof against the live app is only meaningful once those land.

Reproduce locally:

dotnet test tests/OpenClaw.Shared.Tests/OpenClaw.Shared.Tests.csproj \
  --filter "FullyQualifiedName~RuntimeProof_AllowAlways_PersistsAndRecordsLastUsed" \
  --logger "console;verbosity=detailed" --nologo

Captured output — current head (ad4021d), lastUsedCommand absent:

=== Initial exec-approvals.json ===
{"version":1,"agents":{"main":{"security":"allowlist","ask":"always"}}}

=== After AllowAlways (correlationId=proof-1) ===
{
  "version": 1,
  "agents": {
    "main": {
      "security": "allowlist",
      "ask": "always",
      "allowlist": [
        {
          "id": "af40c359-2cd8-4afb-a6f7-068330ac5b08",
          "pattern": "C:\WINDOWS\system32\cmd.EXE"
        }
      ]
    }
  }
}

=== After allowlist hit (correlationId=proof-2) ===
{
  "version": 1,
  "agents": {
    "main": {
      "security": "allowlist",
      "ask": "always",
      "allowlist": [
        {
          "id": "af40c359-2cd8-4afb-a6f7-068330ac5b08",
          "pattern": "C:\WINDOWS\system32\cmd.EXE",
          "lastUsedAt": 1780863068197,
          "lastResolvedPath": "C:\WINDOWS\system32\cmd.EXE"
        }
      ]
    }
  }
}

The entry id is identical between the two invocations, proving on-disk dedup. lastUsedAt and lastResolvedPath appear only after the second invocation, proving RecordAllowlistUseAsync fires on the allowlist-hit path. No lastUsedCommand field is present at any stage.

Co-Authored-By: Claude Sonnet 4.6 noreply@anthropic.com

@clawsweeper

clawsweeper Bot commented May 23, 2026

Copy link
Copy Markdown

Codex review: needs maintainer review before merge. Reviewed June 7, 2026, 8:36 PM ET / 00:36 UTC.

Summary
The PR adds ExecApprovalsStore allowlist write/update APIs, wires best-effort ExecApprovalsCoordinator persistence side effects, removes persisted command-text metadata, and expands store/coordinator tests.

Reproducibility: not applicable. this is a feature PR rather than a bug report with a current-main failure path. The relevant proof is the filesystem-backed coordinator/store runtime test shown in the PR body.

Review metrics: 3 noteworthy metrics.

  • Changed files: 15 files changed. The diff spans shared exec-approval contracts, store/coordinator logic, and tests, so review has to cover both persistence and policy side effects.
  • Store APIs: 2 public write APIs added. AddAllowlistEntryAsync and RecordAllowlistUseAsync create the new security-sensitive approvals persistence surface.
  • Production wiring: 0 production src references added. The PR intentionally keeps the coordinator unwired, so live-app proof should wait for a later production wiring slice.

Merge readiness
Overall: 🐚 platinum hermit
Proof: 🦞 diamond lobster
Patch quality: 🐚 platinum hermit
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

  • Get explicit maintainer acceptance for persisted allowlist patterns and lastResolvedPath metadata.
  • [P2] Confirm the required build/shared/tray validation for the exact PR head before merge.

Risk before merge

  • [P1] Merging establishes a new local exec-approvals.json write path that persists allowlist patterns and lastResolvedPath; command text is removed, but maintainers still need to accept that metadata boundary.
  • [P1] The proof is strong for the coordinator/store slice but intentionally not live tray/UI production proof because the coordinator remains unwired in production src.
  • [P1] This read-only review did not run ./build.ps1 or dotnet test; final merge should rely on maintainer-accepted validation for the exact head.

Maintainer options:

  1. Accept the metadata boundary explicitly
    Maintainers can land the slice after explicitly accepting that exec-approvals.json stores allowlist patterns and lastResolvedPath metadata, with final validation on the exact head.
  2. Minimize persisted path metadata first
    If the boundary is not acceptable, revise the write path to omit or further reduce lastResolvedPath-style metadata and update the runtime proof.
  3. Pause until production wiring review
    Maintainers may defer this PR if they want the storage boundary, production wiring, and UI proof reviewed as one product decision instead of staged slices.

Next step before merge

  • [P2] The remaining action is maintainer security-boundary acceptance and final exact-head validation, not a narrow automated repair.

Security
Needs attention: No command text remains in the proposed persisted contract, but maintainers still need to accept the sensitive metadata boundary for allowlist patterns and lastResolvedPath.

Review details

Best possible solution:

Merge only after maintainers explicitly accept the exec-approvals metadata boundary and exact-head validation is clean; otherwise narrow the persisted metadata before landing the write path.

Do we have a high-confidence way to reproduce the issue?

Not applicable: this is a feature PR rather than a bug report with a current-main failure path. The relevant proof is the filesystem-backed coordinator/store runtime test shown in the PR body.

Is this the best way to solve the issue?

Unclear: the staged coordinator/store slice is coherent and avoids command-text persistence, but it is the best merge path only if maintainers accept the remaining persisted metadata boundary.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 25c11dd07f09.

Label changes

Label justifications:

  • P2: This is a normal-priority feature/security-boundary review item with limited blast radius because the coordinator remains unwired in production.
  • merge-risk: 🚨 security-boundary: The PR adds a local approvals persistence path that stores allowlist and resolved-path metadata, which is a sensitive-data boundary decision beyond ordinary CI.
  • rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🦞 diamond lobster and patch quality is 🐚 platinum hermit.
  • status: ⏳ waiting on author: ClawSweeper has contributor-facing work open and is waiting for author action. Sufficient (terminal): The PR body includes terminal-style runtime output from a real filesystem-backed coordinator/store test for the changed slice, including before/after JSON and absence of lastUsedCommand.
  • proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes terminal-style runtime output from a real filesystem-backed coordinator/store test for the changed slice, including before/after JSON and absence of lastUsedCommand.
Evidence reviewed

Security concerns:

  • [medium] Confirm persisted approval metadata boundary — src/OpenClaw.Shared/ExecApprovals/ExecApprovalsStore.cs:123
    The PR intentionally persists allowlist patterns and lastResolvedPath in exec-approvals.json; those paths can reveal local filesystem context even though full command text has been removed.
    Confidence: 0.82

What I checked:

Likely related people:

  • AlexAlves87: Git history shows this person introduced the ExecApprovalsStore read path, the ExecApprovalsCoordinator, and related evaluator/allowlist matcher work before this PR. (role: feature-history owner and current-area contributor; confidence: high; commits: a4f106c7d09c, 12416d282a23, 6d58171c5847; files: src/OpenClaw.Shared/ExecApprovals/ExecApprovalsStore.cs, src/OpenClaw.Shared/ExecApprovals/ExecApprovalsCoordinator.cs, src/OpenClaw.Shared/ExecApprovals/ExecAllowlistMatcher.cs)
  • Christine Yan: Current-main blame attributes the present store/coordinator files to commit 85445c7, although the commit subject is broad localization work and deeper feature history points to AlexAlves87. (role: recent current-main file provenance; confidence: low; commits: 85445c78066b; files: src/OpenClaw.Shared/ExecApprovals/ExecApprovalsStore.cs, src/OpenClaw.Shared/ExecApprovals/ExecApprovalsCoordinator.cs)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@clawsweeper clawsweeper Bot added rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. P2 Normal priority bug or improvement with limited blast radius. merge-risk: 🚨 security-boundary 🚨 Merging this PR could weaken sandboxing, authorization, credentials, or sensitive data. and removed rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. labels May 23, 2026
@clawsweeper

clawsweeper Bot commented May 23, 2026

Copy link
Copy Markdown

ClawSweeper PR egg

✨ Hatched: 🌱 uncommon Clockwork Patch Peep

Hatch command

Comment @clawsweeper hatch when this PR is hatchable.

Hatchability rules:

  • Merged PRs are hatchable.
  • Open PRs are hatchable when they are status: 👀 ready for maintainer look, status: 🚀 automerge armed, or labeled clawsweeper:automerge.
  • Closed unmerged PRs are hatchable only when one of those hatchable labels is still present in the durable record.

Rarity: 🌱 uncommon.
Trait: sniffs out flaky tests.
Image traits: location flaky test forest; accessory little merge flag; palette rose quartz and slate; mood sleepy but ready; pose peeking out from the egg shell; shell glossy opal shell; lighting warm desk-lamp glow; background gentle dashboard dots.
Share on X: post this hatch
Copy: My PR egg hatched a 🌱 uncommon Clockwork Patch Peep in ClawSweeper.

What is this egg doing here?
  • Eggs appear after the PR passes real-behavior proof. It is here for vibes, not verdicts: it does not change labels, ratings, merge decisions, or automation.
  • The shell reacts to review momentum: open follow-up work warms it up, re-review makes it wobble, and a clean final review lets it hatch.
  • Hatchability usually comes from sufficient real-behavior proof, no blocking P0/P1/P2 findings, no security attention needed, and clean correctness. A merged PR is already final, so merge makes the egg hatchable independently.
  • The hatch is seeded from this repository and PR number, so the same PR keeps the same creature; the reviewed head SHA can only change safe visual details.
  • Rarity is just collectible sparkle: 🥚 common, 🌱 uncommon, 💎 rare, ✨ glimmer, and 🌈 legendary.

@AlexAlves87

Copy link
Copy Markdown
Contributor Author

@clawsweeper re-review

@clawsweeper

clawsweeper Bot commented May 23, 2026

Copy link
Copy Markdown

🦞👀
ClawSweeper picked this up.

Command router queued. I will update this comment with the next step.

@clawsweeper clawsweeper Bot added proof: sufficient Contributor real behavior proof is sufficient. rating: 🦐 gold shrimp Decent PR readiness signal, but merge confidence is limited. status: 🔁 re-review loop A fresh ClawSweeper review was explicitly requested after the latest review. and removed rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. labels May 23, 2026
@clawsweeper

clawsweeper Bot commented May 23, 2026

Copy link
Copy Markdown

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

@AlexAlves87

Copy link
Copy Markdown
Contributor Author

@clawsweeper re-review

@clawsweeper

clawsweeper Bot commented May 23, 2026

Copy link
Copy Markdown

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

@clawsweeper clawsweeper Bot added rating: 🦞 diamond lobster Very strong PR readiness with only minor maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. and removed rating: 🦐 gold shrimp Decent PR readiness signal, but merge confidence is limited. status: 🔁 re-review loop A fresh ClawSweeper review was explicitly requested after the latest review. labels May 26, 2026
@shanselman

Copy link
Copy Markdown
Contributor

First-pass triage: holding this one rather than merging.

A dual-model review found enough security-boundary/correctness risk that I don’t think it clears the quick-merge bar yet:

  • Potentially sensitive full command text appears to be persisted in exec-approvals.json usage metadata (lastUsedCommand). That can leak tokens/secrets embedded in command arguments.
  • Approval-store side effects are wired into Allow paths; if the side-effect wrapper throws unexpectedly, an already-approved command could be reported as denied.

Suggested follow-up: avoid persisting full command text (or store only a redacted/structured summary), and wrap approval metadata side effects so they log/fail-soft without changing the approval decision.

@AlexAlves87

Copy link
Copy Markdown
Contributor Author

Both findings addressed in the latest commit (90b36db):

1. lastUsedCommand removed. The field is gone from ExecAllowlistEntry and the command parameter is dropped from RecordAllowlistUseAsync. Only lastUsedAt and lastResolvedPath are persisted — no command text reaches disk. A DoesNotContain("lastUsedCommand") assertion in the store round-trip test acts as a regression guard.

2. Side effects isolated. Each side effect (PersistAllowlistEntriesAsync, RecordAllowlistUsageAsync) is now wrapped in its own independent best-effort try/catch in both the pre-approved (pass1) and post-prompt paths. A failure in one does not skip the other and cannot flip an already-decided allow into InternalError.

Co-Authored-By: Claude Sonnet 4.6 noreply@anthropic.com

@clawsweeper clawsweeper Bot added rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. and removed proof: sufficient Contributor real behavior proof is sufficient. rating: 🦞 diamond lobster Very strong PR readiness with only minor maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. labels Jun 7, 2026
@clawsweeper clawsweeper Bot added status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. proof: sufficient Contributor real behavior proof is sufficient. rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. and removed rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. labels Jun 7, 2026
AlexAlves87 and others added 7 commits June 7, 2026 22:42
…ects (PR8)

Implements the store write path (AddAllowlistEntryAsync, RecordAllowlistUseAsync)
and wires the side-effect calls into ExecApprovalsCoordinator.

Side effects fire strictly after the final allow decision is confirmed (both
pass1 pre-approved and post-pass2 branches). Best-effort: never throw; any
I/O failure degrades to a logged warning. UpdateFileAsync refuses to write
a malformed file. SemaphoreSlim serializes intra-process writes.

Pattern validation is non-empty only, matching macOS parity. New entries
carry {id, pattern} only — lastUsedAt is absent on creation and stamped by
RecordAllowlistUseAsync on first use (macOS addAllowlistEntry parity).

Rail 19 preserved: coordinator not referenced in any production src/ file.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Strip rail codes, PR numbers, research doc references, D-labels, CVE/ADR
tags, and other planning labels from all ExecApprovals source files.
No logic changed — comments only.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ce and lastUsed recording

End-to-end coordinator/store test using real filesystem I/O. Surfaces the
on-disk exec-approvals.json content at three points (initial, post-AllowAlways,
post-allowlist-hit) via ITestOutputHelper so the JSON is visible under
`dotnet test ... --logger "console;verbosity=detailed"`. Demonstrates both
side-effect paths: AllowAlways persists a new entry, and a later allowlist
hit records lastUsed* metadata against the same entry id (dedup).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
ResolveReadOnly merges entries from agents["*"] into the resolved
allowlist for any concrete agent, so a hit can be authorized by the
wildcard bucket. RecordAllowlistUseAsync was only searching the concrete
agent bucket, so wildcard-authorized executions never accumulated
lastUsed* metadata.

The method now iterates both the concrete agent bucket and "*", updating
metadata wherever the pattern matches. If agentId is already "*", only
the wildcard bucket is searched (no double-pass).

Tests added:
- Store: wildcard-only bucket hit updates metadata.
- Store: same pattern in both buckets updates both entries.
- Coordinator: end-to-end regression with allowlist living under "*".

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ation

lastUsedCommand persisted the full command text in exec-approvals.json,
leaking tokens or secrets embedded in command arguments. Remove the field
from ExecAllowlistEntry and the command parameter from RecordAllowlistUseAsync;
lastUsedAt and lastResolvedPath provide sufficient operational metadata.

Approval-store side effects (PersistAllowlistEntriesAsync, RecordAllowlistUsageAsync)
were unguarded, so an unexpected exception could cause an already-approved command
to be reported as InternalError. Each side effect in both the pre-approved (pass1)
and post-prompt (step 8) paths is now wrapped in its own best-effort try/catch so
a failure in one does not skip the other and never flips an allow to a deny.

Add DoesNotContain("lastUsedCommand") assertion to the store round-trip test
as a regression guard against accidental reintroduction of the field.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@AlexAlves87 AlexAlves87 force-pushed the feat/exec-approvals-write-path branch from df28407 to ad4021d Compare June 7, 2026 20:43
@clawsweeper clawsweeper Bot added status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. and removed status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. labels Jun 7, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

merge-risk: 🚨 security-boundary 🚨 Merging this PR could weaken sandboxing, authorization, credentials, or sensitive data. P2 Normal priority bug or improvement with limited blast radius. proof: sufficient Contributor real behavior proof is sufficient. rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants