Skip to content

fix(#370): run foreground codex task in the detached worker so long turns survive the 10-minute Bash ceiling#372

Open
brenoperucchi wants to merge 3 commits into
openai:mainfrom
brenoperucchi:fix/issue-370-foreground-detach
Open

fix(#370): run foreground codex task in the detached worker so long turns survive the 10-minute Bash ceiling#372
brenoperucchi wants to merge 3 commits into
openai:mainfrom
brenoperucchi:fix/issue-370-foreground-detach

Conversation

@brenoperucchi

Copy link
Copy Markdown

Problem (#370)

Foreground task runs — the /codex:rescue default and the stop-gate review — executed the Codex turn inline in the process Claude Code's Bash tool SIGTERMs at the 600,000 ms (10-minute) ceiling. Any turn that ran past ~10 minutes was killed mid-run and the job was left stuck in status: "running" with a frozen log forever. The --background path was already immune because it spawns a detached/setsid worker that survives the SIGTERM.

Fix

Route the foreground path through the same detached, setsid-isolated worker the --background path already used, and have the foreground process only observe it: stream the worker's job log and poll until the job finishes. When the harness kills the observer at the ceiling, the worker keeps running in its own session, records completion, and the result stays retrievable via /codex:result <id> instead of dying silently.

  • handleTask foreground → enqueueDetachedTask + observeDetachedTask, preserving stdout/stderr/exit-code parity for the three consumers: the rescue subagent, the CLI, and the stop hook (which parses task --jsonrawOutput).
  • enqueueBackgroundTaskenqueueDetachedTask; the job record (with its request payload) is now persisted before spawning the worker, removing a worker-reads-before-record race that the foreground observer would otherwise surface as a hang; the worker pid is patched into the index afterward so /codex:cancel can still reach a not-yet-running job.
  • Bump DEFAULT_STATUS_WAIT_TIMEOUT_MS 240000 → 1800000 so /codex:status --wait (and the observer backstop) no longer reports "still running" after only 4 minutes on a healthy long job.

Scope: review / adversarial-review intentionally left inline

Their long-running path is already detached at a different layer — commands/review.md runs the companion under Claude Code's Bash(..., run_in_background: true), which is immune to the foreground ceiling — and the companion's review --background is pinned to inline, full-result behavior by an existing test (tests/runtime.test.mjs). The foreground review flow is intentionally limited to "clearly tiny, roughly 1–2 file" reviews. Rerouting review through the worker would break that pinned contract for no real safety gain, so it was left as-is.

Tests

node --test tests/*.test.mjs — no regressions introduced (verified by diffing the failing-test set against the pristine tree).

New regression test in tests/runtime.test.mjs: "foreground task survives its observer being killed mid-run and stays retrievable (#370)" launches a foreground task as its own process-group leader, SIGKILLs the observer's process group mid-turn, and asserts the detached worker still drives the job to completed and /codex:result returns the output — instead of the job being stranded in running.

Also smoke-tested end-to-end against the real Codex CLI: a foreground task now dispatches to the detached worker, streams progress, and returns the result on stdout.

Collaboration note

The investigation here was author-led — reading the original error, reproducing it, and tracing the root cause started from me, and I reviewed and validated the solution as well. Claude (Claude Code) was used to carry out the implementation work; the resulting code was analyzed and reviewed by both of us.

🤖 Generated with Claude Code

Foreground `task` (the /codex:rescue default and the stop-gate review)
ran the Codex turn inline in the process Claude Code's Bash tool SIGTERMs
at the 600,000 ms ceiling, so any turn past ~10 min was killed mid-run and
left stuck in status "running" with a frozen log forever.

Route the foreground path through the same detached, setsid-isolated
worker the --background path already used, and have the foreground process
only observe it: stream the worker's job log and poll until the job
finishes. When the harness kills the observer at the ceiling, the worker
keeps running, records completion, and the result stays retrievable via
/codex:result instead of dying silently.

- handleTask foreground -> enqueueDetachedTask + observeDetachedTask,
  preserving stdout/stderr/exit-code parity for the rescue subagent, the
  CLI, and the stop hook (which parses `task --json`).
- enqueueBackgroundTask -> enqueueDetachedTask; persist the job record
  (with its request payload) before spawning to remove a worker-reads-
  first race, then patch the worker pid into the index for /codex:cancel.
- bump DEFAULT_STATUS_WAIT_TIMEOUT_MS 240000 -> 1800000 so status --wait
  no longer reports "still running" after 4 min on a healthy long job.
- add a regression test that kills the observer mid-turn and asserts the
  worker still completes and /codex:result returns the output.

review/adversarial-review are intentionally left inline: their long-run
path is already detached via Bash(run_in_background: true) in review.md.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@brenoperucchi brenoperucchi requested a review from a team June 13, 2026 06:42

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ba61776a11

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

const length = stat.size - fromOffset;
const buffer = Buffer.alloc(length);
fs.readSync(fd, buffer, 0, length, fromOffset);
process.stderr.write(buffer.toString("utf8"));

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Avoid streaming result blocks to stderr

For non-JSON foreground tasks this tails the persisted job log verbatim to stderr. That log also contains logBody blocks such as the full assistant message and the Final output block, so a normal successful foreground /codex:rescue or CLI task now emits Codex's answer on stderr before emitting the same result on stdout. Callers that display or record stderr now see duplicated/noisy output, and this breaks the wrapper contract that the rescue subagent returns the task stdout unchanged; stream only progress messages or filter output blocks from the tailed log.

Useful? React with 👍 / 👎.

…i#372)

Addresses the Codex review on openai#372: the foreground observer tailed the
persisted job log verbatim to stderr, which also emitted the logBody
blocks (the assistant message and the `Final output` block) — so a normal
successful foreground rescue/CLI task printed Codex's answer on stderr
before printing the same result on stdout.

Filter the tailed log to progress lines only via a new exported
isStreamableProgressLine() (reusing the existing block-title detection),
and consume only complete lines so a body is never partially echoed. The
rendered result still goes to stdout exactly once.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d17b8aff5d

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +67 to +71
if (typeof line !== "string" || !line.startsWith("[")) {
return false;
}
const stripped = stripLogPrefix(line);
return Boolean(stripped) && !isProgressBlockTitle(stripped);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Skip block bodies when streaming log lines

Fresh evidence in this revision is that the new filter still decides streamability one line at a time: any persisted assistant/final-output body line that begins with [ (for example Codex output starting with [P2] ... or a JSON array) passes this predicate even though appendLogBlock writes block bodies as raw continuation lines after the block title. In those cases a foreground /codex:rescue still leaks part of Codex's answer to stderr before returning the same content on stdout, so the tailer needs to track that it is inside a block body rather than treating every bracket-prefixed line as progress.

Useful? React with 👍 / 👎.

@brenoperucchi

Copy link
Copy Markdown
Author

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d17b8aff5d

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +67 to +71
if (typeof line !== "string" || !line.startsWith("[")) {
return false;
}
const stripped = stripLogPrefix(line);
return Boolean(stripped) && !isProgressBlockTitle(stripped);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Require a timestamp before streaming log lines

When a Codex answer or reasoning block contains a body line that starts with brackets, such as [1] ... or a markdown link/reference, this predicate treats it as a progress entry because it only checks startsWith("[") before stripping the first bracketed segment. streamJobLogTail now uses this for foreground tasks, so those block-body lines are echoed to stderr and then the full rendered result is printed to stdout, reintroducing the result duplication/leak that the new filtering is meant to avoid. Validate the actual log timestamp prefix before stripping/streaming the line.

Useful? React with 👍 / 👎.

…nai#372)

Follow-up to the Codex review: isStreamableProgressLine decided
streamability from a bare leading "[", so a block-body line that starts
with a bracket -- a markdown reference like "[1] ...", a "[P2] ..." line,
or a JSON array in Codex's answer -- was treated as a progress entry and
echoed to stderr, re-leaking part of the result before stdout.

Match the actual `[<ISO-8601>]` log prefix written by appendLogLine/nowIso
instead, so only real progress entries stream and block bodies never do.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@brenoperucchi

Copy link
Copy Markdown
Author

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: dbd56b2c63

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +953 to +954
const { logFile } = enqueueDetachedTask(cwd, job, request);
await observeDetachedTask(cwd, { ...job, logFile }, { json: options.json });

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve subagent labels when detaching foreground tasks

Because foreground tasks now always enter the detached-worker path, tasks that spawn Codex subagents can lose the thread/started metadata before the worker's turn id is known; the job log then records raw ids like thr_2 instead of the subagent nickname (design-challenger). I verified this with npm test -- --test-reporter=spec: tests/runtime.test.mjs's subagent-log test fails, and users lose named subagent context in foreground task logs/status for this scenario. Please preserve/apply buffered thread metadata when moving foreground tasks through the detached worker.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for flagging this — I tried to reproduce it and couldn't, so I want to share what I found before changing anything.

The subagent-label test passes for me both in isolation and in the full suite (8/8 runs), and also in the delicate path you'd expect to be at risk — a warm shared-broker + detached-worker + subagent run — where labels resolve to design-challenger with zero raw thr_* ids (3/3 runs).

This PR doesn't touch the subagent-label path in codex.mjs: the foreground task now runs the same captureTurn code the inline path already used (it only moved process), and buffered notifications are drained in arrival order (captureTurn), so the subagent's thread/started registers the nickname before the collabAgentToolCall item reads it via labelForThread. So a label would only fall back to the raw id if the app-server emitted the collab item before the subagent thread/started — which would affect the inline path identically and isn't changed here.

If you have the failing run's output or a seed, I'm glad to dig in — but with npm test green here I'd rather not add speculative handling to that path without a reproducer.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant