[FE Fix] Keep evaluator chain connected on app-revision change + Railway deploy diagnostics by mmabrouk · Pull Request #4513 · Agenta-AI/agenta

mmabrouk · 2026-06-01T17:43:19Z

Combines #4489 with the Railway preview-deploy diagnostics from #4509 (cherry-picked: d97cf94 + 63136af).

Purpose: run #4489's Railway preview deploy with the improved diagnostics so a failure surfaces the real cause instead of a bare 'Process completed with exit code 1':

- ERR traces with the failing command + file:line call stack
- railway errors go to stderr, so they are no longer swallowed when callers send stdout to /dev/null
- on a failed deploy, the key services' Railway logs are pulled into the job
- setup/deploy logs persisted as artifacts + step summaries

Secrets are redacted from every diagnostic path.

Base PR: #4489 • Diagnostics PR: #4509

changePrimaryNode cleared all output connections when the primary app node was re-selected. Because the primary node is updated in place (its id is preserved), the app -> evaluator edge stayed valid, but clearing it orphaned the still-present evaluator node. connectDownstreamNode then no-ops on the existing evaluator, so the edge was never recreated and the evaluator silently stopped running after the first app-revision change. Preserve connections whose endpoints still reference existing nodes instead of clearing unconditionally.
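The fix described above amounts to filtering edges by whether both endpoints still exist, rather than dropping them all. A minimal sketch of that idea in Python follows; the actual change lives in the frontend's changePrimaryNode, and the `Connection` type, its field names, and `preserve_valid_connections` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Connection:
    """Hypothetical edge type: an output of source_id feeds target_id."""
    source_id: str
    target_id: str


def preserve_valid_connections(connections, node_ids):
    """Keep only the edges whose source and target both still exist."""
    live = set(node_ids)
    return [
        c for c in connections
        if c.source_id in live and c.target_id in live
    ]


# The primary app node is updated in place, so its id ("app") survives the
# revision change and the app -> evaluator edge is still valid.
edges = [Connection("app", "evaluator"), Connection("app", "removed-node")]
kept = preserve_valid_connections(edges, ["app", "evaluator"])
print(kept)
```

Only the edge to the node that no longer exists is dropped; the app -> evaluator edge survives, so nothing downstream has to recreate it.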

…bility

Failed Railway preview deploys reported only 'Process completed with exit code 1'. The real cause was either swallowed by >/dev/null in the deploy scripts or lived only in the Railway dashboard, which CI never surfaced.

- Add install_error_trap to bootstrap/configure/deploy-from-images so a failure prints the command, exit code, and a file:line call stack.
- railway_call now prints failures to stderr instead of stdout, so callers that send stdout to /dev/null still surface the underlying railway error.
- Add dump_railway_logs: on a failed deploy, pull the tail of key services' Railway logs (Postgres, alembic, api, ...) into the job.
- Persist setup/deploy output to a log file, upload it as an artifact, and write a step summary with the log tail on failure.

Diagnostics only; no deploy logic changes. The artifact-upload steps are marked continue-on-error so they can never fail an otherwise-passing job.

(cherry picked from commit d97cf94)

Address CodeRabbit review. The ERR handler printed $BASH_COMMAND and railway_call printed railway's output on failure. configure.sh passes real secret values as CLI args to 'railway variable set' (POSTGRES_PASSWORD, AGENTA_AUTH_KEY/CRYPT_KEY, *_API_KEY), so a failure could emit plaintext secrets into the uploaded deploy-log artifact. Add _railway_redact (masks KEY=value for PASSWORD/TOKEN/SECRET/KEY keys and scheme://user:password@host) and apply it to every diagnostic path: railway_call failure + rate-limit output, the ERR handler command, and dump_railway_logs. The success path stays unredacted so callers that parse 'variable list -k' output (e.g. resolve_postgres_password) keep working. (cherry picked from commit 63136af)
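The two masking rules described above (KEY=value for sensitive key names, and credentials embedded in a URL) can be sketched with two regular expressions. This is an illustrative Python approximation, not the repository's _railway_redact shell helper; the exact patterns, the `***` mask, and the function name `redact` are assumptions.

```python
import re

# Assumed pattern: NAME=value where NAME contains PASSWORD, TOKEN, SECRET or KEY.
_KEY_VALUE = re.compile(
    r"\b([A-Z0-9_]*(?:PASSWORD|TOKEN|SECRET|KEY)[A-Z0-9_]*)=(\S+)",
    re.IGNORECASE,
)
# Assumed pattern: scheme://user:password@host
_URL_CREDS = re.compile(r"([a-zA-Z][a-zA-Z0-9+.-]*://[^\s:/@]+):([^\s@]+)@")


def redact(text: str) -> str:
    """Mask secret values in diagnostic output before it is logged."""
    text = _KEY_VALUE.sub(r"\1=***", text)
    return _URL_CREDS.sub(r"\1:***@", text)


print(redact("railway variable set POSTGRES_PASSWORD=hunter2"))
print(redact("postgresql://agenta:s3cret@db:5432/app"))
```

A pattern this broad errs toward over-masking (any key name containing "key" is hidden), which is the safer failure mode for output that ends up in an uploaded artifact. It is applied only on diagnostic paths, which matches the commit's note that the success path stays unredacted for callers that parse the output.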

vercel · 2026-06-01T17:43:25Z


Project	Deployment	Actions	Updated (UTC)
agenta-documentation	Ready	Preview, Comment	Jun 1, 2026 7:38pm

coderabbitai · 2026-06-01T17:43:28Z

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.


railway_call previously retried only rate limits. Railway's API also intermittently times out write mutations (notably 'variable set'); this reproduced locally on ~20% of calls, even single-threaded and independent of env size. With ~15 variable-set calls per deploy and no retry, a deploy succeeded only ~3.5% of the time (0.8^15), i.e. it failed ~96% of the time.

Retry policy:

- rate-limit (429): always retried (request was rejected, not processed)
- transient network/timeout: retried ONLY for idempotent commands; a timed-out create (init/add/environment new/volume add) may have succeeded server-side, so it is not blind-retried (avoids duplicate projects/services/volumes); rate-limit retries still apply to them
- deterministic errors (not found/unauthorized): fail fast, no retry

Failure output stays redacted. Covered by unit tests.

(cherry picked from commit 060441a)
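The retry policy above can be expressed as a small decision function plus a retry loop. The sketch below is a Python illustration, not the repository's railway_call shell function: the error substrings used to classify failures, the attempt count, and the names `classify`, `should_retry`, and `call_with_retry` are all assumptions. The list of non-idempotent commands is taken from the commit message.

```python
import time

# Create-style commands named in the commit message: a timed-out call may
# have succeeded server-side, so it must not be blindly retried.
NON_IDEMPOTENT = ("init", "add", "environment new", "volume add")


def classify(output: str) -> str:
    """Bucket a failure by its output. The substrings are assumptions."""
    text = output.lower()
    if "429" in text or "rate limit" in text:
        return "rate_limit"
    if "timed out" in text or "timeout" in text or "connection reset" in text:
        return "transient"
    return "deterministic"


def is_idempotent(command: str) -> bool:
    return not any(
        command == name or command.startswith(name + " ")
        for name in NON_IDEMPOTENT
    )


def should_retry(command: str, output: str) -> bool:
    kind = classify(output)
    if kind == "rate_limit":
        return True  # rejected, not processed: always safe to retry
    if kind == "transient":
        return is_idempotent(command)
    return False  # deterministic errors fail fast


def call_with_retry(run, command, attempts=5, delay=0.0):
    """run(command) returns (ok, output); retry according to the policy."""
    for attempt in range(1, attempts + 1):
        ok, output = run(command)
        if ok:
            return output
        if attempt == attempts or not should_retry(command, output):
            # The raw output is deliberately kept out of the error message.
            raise RuntimeError(f"{command!r} failed after {attempt} attempt(s)")
        time.sleep(delay * attempt)  # simple linear backoff
```

For the arithmetic in the commit message: with an independent 20% failure rate per call, `1 - 0.8 ** 15` is roughly 0.965, which is where the ~96% deploy failure rate comes from.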

dump_railway_logs ran after the 'deploy-from-images.sh | tee $log_file' pipeline, so its Railway service-log tails only reached the live Actions log, not the uploaded artifact or the step-summary tail. If the live log truncates, the root-cause dump was lost again. Tee the dump into $log_file so it lands in the artifact and the summary. (Addresses CodeRabbit review.) (cherry picked from commit da990a8)

github-actions · 2026-06-01T20:25:39Z

Railway Preview Environment


Image tag	`pr-4513-ae79381`
Status	Failed
Logs	View workflow run
Updated at 2026-06-01T20:25:38.321Z

ardaerzin and others added 8 commits May 29, 2026 13:25

Merge branch 'main' into fe-fix/evaluator-chain-run-reliability

dd0afda

Merge branch 'main' into fe-fix/evaluator-chain-run-reliability

752d3db

Merge branch 'main' into fe-fix/evaluator-chain-run-reliability

635730b

Merge branch 'main' into fe-fix/evaluator-chain-run-reliability

275512c

Merge branch 'release/v0.100.9' into fe-fix/evaluator-chain-run-reliability

0e379e2

dosubot Bot added the size:L label (this PR changes 100-499 lines, ignoring generated files) Jun 1, 2026

dosubot Bot added ci/cd Frontend labels Jun 1, 2026

vercel Bot deployed to Preview June 1, 2026 17:43 View deployment

vercel Bot deployed to Preview June 1, 2026 19:34 View deployment

vercel Bot deployed to Preview June 1, 2026 19:38 View deployment

mmabrouk wants to merge 10 commits into release/v0.100.9 from ci-diag/4489-evaluator-chain


3 participants
