[FE Fix] Keep evaluator chain connected on app-revision change + Railway deploy diagnostics #4513
mmabrouk wants to merge 10 commits into
changePrimaryNode cleared all output connections when the primary app node was re-selected. Because the primary node is updated in place (its id is preserved), the app -> evaluator edge stayed valid, but clearing it orphaned the still-present evaluator node. connectDownstreamNode then no-ops on the existing evaluator, so the edge was never recreated and the evaluator silently stopped running after the first app-revision change. Preserve connections whose endpoints still reference existing nodes instead of clearing unconditionally.
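The fix described above can be sketched as a filter over the existing connections. This is a minimal illustration in Python, not the PR's frontend code: the `Connection` type, its field names, and `preserve_valid_connections` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Connection:
    """Hypothetical edge between two nodes, identified by node id."""
    source_id: str
    target_id: str


def preserve_valid_connections(connections, node_ids):
    """Keep only connections whose two endpoints both still exist.

    This replaces an unconditional clear: edges that are still valid
    survive, and edges pointing at removed nodes are dropped.
    """
    existing = set(node_ids)
    return [
        c for c in connections
        if c.source_id in existing and c.target_id in existing
    ]


# The primary app node is updated in place, so its id ("app") is unchanged
# and the app -> evaluator edge remains valid after re-selection.
connections = [
    Connection("app", "evaluator"),
    Connection("app", "removed-node"),
]
kept = preserve_valid_connections(connections, ["app", "evaluator"])
```

Here `kept` contains only the app -> evaluator edge, so the evaluator stays connected without relying on a later step to recreate the edge.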
Failed Railway preview deploys reported only 'Process completed with exit code 1'. The real cause was either swallowed by >/dev/null in the deploy scripts or lived only in the Railway dashboard, which CI never surfaced.

- Add install_error_trap to bootstrap/configure/deploy-from-images so a failure prints the command, exit code, and a file:line call stack.
- railway_call now prints failures to stderr instead of stdout, so callers that send stdout to /dev/null still surface the underlying railway error.
- Add dump_railway_logs: on a failed deploy, pull the tail of key services' Railway logs (Postgres, alembic, api, ...) into the job.
- Persist setup/deploy output to a log file, upload it as an artifact, and write a step summary with the log tail on failure.

Diagnostics only; no deploy logic changes. The artifact-upload steps are marked continue-on-error so they can never fail an otherwise-passing job.

(cherry picked from commit d97cf94)
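The stderr point in the list above can be demonstrated in isolation. The sketch below is in Python rather than the shell used by the deploy scripts, and the child command and its error message are made up for the example; it only shows that output written to stderr survives when the caller discards stdout.

```python
import subprocess
import sys

# Hypothetical child process: writes one line to stdout, one to stderr,
# then exits with a non-zero status (standing in for a failed CLI call).
CHILD = (
    "import sys;"
    "print('normal output');"
    "print('railway error: request timed out', file=sys.stderr);"
    "sys.exit(1)"
)


def run_discarding_stdout():
    """Run the child with stdout discarded (like `>/dev/null`), keeping stderr."""
    return subprocess.run(
        [sys.executable, "-c", CHILD],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.PIPE,
        text=True,
    )


result = run_discarding_stdout()
```

After the call, `result.stdout` is `None` because stdout was discarded, while `result.stderr` still holds the error line.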
Address CodeRabbit review. The ERR handler printed $BASH_COMMAND and railway_call printed railway's output on failure. configure.sh passes real secret values as CLI args to 'railway variable set' (POSTGRES_PASSWORD, AGENTA_AUTH_KEY/CRYPT_KEY, *_API_KEY), so a failure could emit plaintext secrets into the uploaded deploy-log artifact. Add _railway_redact (masks KEY=value for PASSWORD/TOKEN/SECRET/KEY keys and scheme://user:password@host) and apply it to every diagnostic path: railway_call failure + rate-limit output, the ERR handler command, and dump_railway_logs. The success path stays unredacted so callers that parse 'variable list -k' output (e.g. resolve_postgres_password) keep working. (cherry picked from commit 63136af)
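The two masking rules described above (sensitive KEY=value pairs and credentials embedded in URLs) can be sketched with two regular expressions. This is an illustrative Python version, not the `_railway_redact` shell function from the PR: the patterns, the `***` mask, and the `redact` name are assumptions for the example.

```python
import re

# KEY=value where the key name contains PASSWORD, TOKEN, SECRET, or KEY.
_SENSITIVE_ASSIGNMENT = re.compile(
    r"\b([A-Za-z0-9_]*(?:PASSWORD|TOKEN|SECRET|KEY)[A-Za-z0-9_]*)=(\S+)"
)

# scheme://user:password@host, masking only the password part.
_URL_CREDENTIALS = re.compile(
    r"([A-Za-z][A-Za-z0-9+.-]*://[^\s:/@]+):([^\s@/]+)@"
)


def redact(text: str) -> str:
    """Mask secret values in diagnostic output before it is logged."""
    text = _SENSITIVE_ASSIGNMENT.sub(r"\1=***", text)
    text = _URL_CREDENTIALS.sub(r"\1:***@", text)
    return text


masked_command = redact("railway variable set POSTGRES_PASSWORD=hunter2")
masked_url = redact("DATABASE_URL=postgres://user:s3cret@db:5432/app")
```

As in the commit message, such a filter belongs only on failure and diagnostic paths; applying it to successful output would break callers that parse real values.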
railway_call previously retried only rate-limits. Railway's API also intermittently times out write mutations (notably 'variable set'), reproduced locally at ~20% even single-threaded and independent of env size. With ~15 variable-set calls per deploy and no retry, deploys failed ~96% of the time (1 - 0.8^15).

Retry policy:
- rate-limit (429): always retried (request was rejected, not processed)
- transient network/timeout: retried ONLY for idempotent commands; a timed-out create (init/add/environment new/volume add) may have succeeded server-side, so it is not blind-retried (avoids duplicate projects/services/volumes); rate-limit retries still apply to them
- deterministic errors (not found/unauthorized): fail fast, no retry

Failure output stays redacted. Covered by unit tests.

(cherry picked from commit 060441a)
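The retry policy and the failure-rate arithmetic above can be written out as a small decision function. This is a Python sketch under stated assumptions, not the shell logic in `railway_call`: the `Failure` enum, the prefix matching of command arguments, and both function names are hypothetical.

```python
from enum import Enum


class Failure(Enum):
    RATE_LIMIT = "rate_limit"        # 429: request rejected, not processed
    TRANSIENT = "transient"          # network error or timeout
    DETERMINISTIC = "deterministic"  # e.g. not found, unauthorized


# Create-style commands: a timed-out attempt may have succeeded server-side.
NON_IDEMPOTENT_PREFIXES = {
    ("init",),
    ("add",),
    ("environment", "new"),
    ("volume", "add"),
}


def is_idempotent(args):
    """True unless the command starts with a known create-style prefix."""
    return not any(
        tuple(args[:len(prefix)]) == prefix
        for prefix in NON_IDEMPOTENT_PREFIXES
    )


def should_retry(failure, args):
    """Apply the three-way policy: always, only if idempotent, never."""
    if failure is Failure.RATE_LIMIT:
        return True
    if failure is Failure.TRANSIENT:
        return is_idempotent(args)
    return False


def deploy_failure_probability(per_call_failure, calls):
    """Chance that at least one of `calls` independent calls fails."""
    return 1 - (1 - per_call_failure) ** calls


retry_variable_set = should_retry(Failure.TRANSIENT, ["variable", "set", "X=1"])
retry_volume_add = should_retry(Failure.TRANSIENT, ["volume", "add"])
no_retry_failure_rate = deploy_failure_probability(0.2, 15)
```

With a 20% per-call timeout rate and 15 calls, `no_retry_failure_rate` is about 0.965, consistent with the ~96% figure above. Under this policy a timed-out `variable set` is retried, while a timed-out `volume add` is not.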
dump_railway_logs ran after the 'deploy-from-images.sh | tee $log_file' pipeline, so its Railway service-log tails only reached the live Actions log, not the uploaded artifact or the step-summary tail. If the live log truncates, the root-cause dump was lost again. Tee the dump into $log_file so it lands in the artifact and the summary. (Addresses CodeRabbit review.) (cherry picked from commit da990a8)
Combines #4489 with the Railway preview-deploy diagnostics from #4509 (cherry-picked: d97cf94 + 63136af).

Purpose: run #4489's Railway preview deploy with the improved diagnostics so a failure surfaces the real cause instead of a bare 'Process completed with exit code 1'.

Secrets are redacted from every diagnostic path.
Base PR: #4489 • Diagnostics PR: #4509