[Repo Assist] fix(setup): cancel wizard session before disconnect to prevent stale-session errors #717
…session retry errors

When wizard.next timed out (e.g. Teams channel selection hanging), EnterWizardErrorAsync called DisconnectAsync, which nulled _client, then showed "Start wizard again" / "Skip wizard" buttons. CancelCurrentSessionAsync only sends wizard.cancel when _client != null, so with _client already null it skipped the call and left the server-side session active. Subsequent "Start wizard again" clicks then hit a gateway "wizard already running" error.

Fix: replace await DisconnectAsync() with await CancelCurrentSessionAsync() in both EnterWizardErrorAsync and StartWizardAsync. CancelCurrentSessionAsync sends wizard.cancel (best-effort, catch ignored) and then calls DisconnectAsync, so the disconnect still happens. The session cancel is a no-op when _client is already null or _sessionId is empty, so the first-start path is unaffected.

Closes #709

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Codex review: needs maintainer review before merge.
Reviewed June 7, 2026, 10:08 PM ET / 02:08 UTC.

Summary
Reproducibility: yes, at source level: current main disconnects and nulls the wizard client before retry/skip can send
Review metrics: 2 noteworthy metrics.
Merge readiness
Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.
Review details
Best possible solution: Land the narrow cancellation change after redacted live setup proof and required validation show retry or skip recovery no longer hits a stale gateway wizard session.
Do we have a high-confidence way to reproduce the issue? Yes, at source level: current main disconnects and nulls the wizard client before retry/skip can send
Is this the best way to solve the issue? Likely yes: reusing the existing
AGENTS.md: found and applied where relevant.
Codex review notes: model gpt-5.5, reasoning high; reviewed against d1b136347e95.
What the crustacean ranks mean
Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.
🤖 This fix is proposed by Repo Assist, an automated AI assistant.
Problem

When the setup wizard's wizard.next call timed out (e.g. Teams channel selection hung for longer than the step timeout), the error recovery path left the gateway-side wizard session active. Subsequent attempts to "Start wizard again" then failed with a "wizard already running" gateway error, leaving users permanently stuck in a broken setup loop.

Root cause (src/OpenClaw.SetupEngine.UI/Pages/WizardPage.xaml.cs): EnterWizardErrorAsync called DisconnectAsync(), which nulls _client, then showed "Start wizard again" / "Skip wizard" buttons. When the user clicked either button, CancelCurrentSessionAsync checked _client != null, found it null, and skipped sending wizard.cancel, leaving the server-side session alive. The next wizard.start hit a live existing session.

Fix

Replace await DisconnectAsync() with await CancelCurrentSessionAsync() in two places:

- EnterWizardErrorAsync: cancel the session immediately when entering error state, before disconnect, so subsequent "Start wizard again" clicks don't hit a stale session.
- StartWizardAsync: cancel any prior server-side session before starting a fresh wizard.start. This covers the ShowError path in ApplyPayloadAsync where _client is still connected when the error is shown.

CancelCurrentSessionAsync already:

- Sends wizard.cancel best-effort (catch { } ignores failures)
- Calls DisconnectAsync() at the end, so the disconnect still happens
- Is a no-op when _client is null or _sessionId is empty (no change to first-start path)

Test Status

- GITHUB_ENV file (environment infrastructure issue, not caused by this change; pre-existing in this run environment)
- WizardPage.xaml.cs is in a WinUI project; no unit tests exist for this UI code-behind file (WinUI runtime required). The fix is a minimal 2-line substitution.

Closes #709
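
The ordering problem described under Root cause and Fix can be modeled outside WinUI. The following is a minimal Python sketch, not the actual C# from WizardPage.xaml.cs: `FakeGateway`, the `WizardPage` model class, the `retry` handler, and the `cancel_on_error` switch are all hypothetical names introduced for illustration. It assumes the gateway allows one wizard session at a time and that the retry button calls the cancel helper before starting again.

```python
import asyncio


class FakeGateway:
    """Hypothetical stand-in for the gateway: one wizard session at a time."""

    def __init__(self):
        self.active_session = None
        self._next_id = 0

    async def wizard_start(self):
        if self.active_session is not None:
            raise RuntimeError("wizard already running")
        self._next_id += 1
        self.active_session = f"s{self._next_id}"
        return self.active_session

    async def wizard_cancel(self, session_id):
        if self.active_session == session_id:
            self.active_session = None


class WizardPage:
    """Simplified model of the page; cancel_on_error=True mimics the fix."""

    def __init__(self, gateway, cancel_on_error):
        self._gateway = gateway
        self._cancel_on_error = cancel_on_error
        self._client = None
        self._session_id = ""

    async def disconnect(self):
        self._client = None

    async def cancel_current_session(self):
        # Only reachable while the client is still connected.
        if self._client is not None and self._session_id:
            try:
                await self._client.wizard_cancel(self._session_id)
            except Exception:
                pass  # best-effort: failures are ignored
        await self.disconnect()

    async def _teardown(self):
        # Pre-fix behavior: plain disconnect. Post-fix: cancel, then disconnect.
        if self._cancel_on_error:
            await self.cancel_current_session()
        else:
            await self.disconnect()

    async def enter_wizard_error(self):
        await self._teardown()

    async def start_wizard(self):
        await self._teardown()
        self._client = self._gateway  # "connect"
        self._session_id = await self._client.wizard_start()

    async def retry(self):
        # Assumed shape of the "Start wizard again" handler.
        await self.cancel_current_session()
        await self.start_wizard()


async def run_scenario(cancel_on_error):
    page = WizardPage(FakeGateway(), cancel_on_error)
    await page.start_wizard()        # first start succeeds
    await page.enter_wizard_error()  # e.g. wizard.next timed out
    try:
        await page.retry()
        return "ok"
    except RuntimeError as exc:
        return str(exc)


before_fix = asyncio.run(run_scenario(cancel_on_error=False))
after_fix = asyncio.run(run_scenario(cancel_on_error=True))
print(before_fix)  # wizard already running
print(after_fix)   # ok
```

In the pre-fix run, the error path drops the client first, so the later cancel has nothing to send through and the fake gateway still holds the old session. In the post-fix run, the cancel goes out while the client is still connected, and the repeated cancel calls on retry are no-ops.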