[release] v0.100.9 + Railway deploy diagnostics #4514

Closed
mmabrouk wants to merge 24 commits into main from ci-diag/4504-release-v0.100.9

Member

@mmabrouk mmabrouk commented Jun 1, 2026

Combines #4504 with the Railway preview-deploy diagnostics from #4509 (cherry-picked: d97cf94 + 63136af).

Purpose: run #4504's Railway preview deploy with the improved diagnostics so a failure surfaces the real cause instead of a bare "Process completed with exit code 1":

  • ERR traces with the failing command + file:line call stack
  • railway errors go to stderr instead of /dev/null
  • on a failed deploy, the key services' Railway logs are pulled into the job
  • setup/deploy logs persisted as artifacts + step summaries

Secrets are redacted from every diagnostic path.

Base PR: #4504 • Diagnostics PR: #4509

ardaerzin and others added 22 commits May 29, 2026 22:44
…ects

Switching projects while on an entity-scoped page (evaluation, playground,
testset) kept the old entity id in the URL. Since that entity does not exist
in the target project, the user landed on an empty screen.

Preserve only the top-level section segment on project switch and drop nested
entity ids. All top-level sections have index pages, so the truncated path
always resolves. Logic extracted to a pure helper with regression tests.
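
The truncation rule described in this commit message can be sketched as follows. This is an illustration in Python, not the actual TypeScript helper: the function name `build_project_switch_path` and the fallback for unrecognized paths are hypothetical, the `/w/<workspace_id>/p/<project_id>/<section>/...` shape is assumed from the page paths listed in this PR, and query-string handling is left out.

```python
def build_project_switch_path(current_path: str, target_project_id: str) -> str:
    """Keep only the top-level section when switching projects (sketch)."""
    segments = [s for s in current_path.split("/") if s]

    # Assumed route shape: /w/<workspace_id>/p/<project_id>/<section>/<entity...>
    if len(segments) < 4 or segments[0] != "w" or segments[2] != "p":
        # Hypothetical fallback: leave unrecognized paths untouched.
        return current_path

    new_segments = ["w", segments[1], "p", target_project_id]
    if len(segments) >= 5:
        # Preserve the top-level section, drop nested entity ids.
        new_segments.append(segments[4])
    return "/" + "/".join(new_segments)
```

For example, a path ending in `/evaluations/results/abc123` under the old project would map to `/evaluations` under the new one, which is an index page and therefore always resolves.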
…lity table

The observability table's evaluator metric columns only carry a slug, but
evaluatorReferenceAtomFamily resolved names solely through its id-based
workflow-revision query. Slug-only callers fell through to a minimal
reference that returned name=slug, leaking slugs into the table.

Add a slug-resolution branch that matches the slug against the loaded
app+evaluator union list (workflowsListQueryStateAtom) to surface the real
name, deferring while the list loads. Mirrors appReferenceAtomFamily.
Render archived evaluators through the same single PageLayout as the
active Evaluators page, with a Back arrow + title in the standard h-11
header row. Drops the ArchivedEntityLayout wrapper, which double-framed
the page (extra p-4 + tall stacked Back/title/subtitle header) and
pushed the table ~60-70px lower, causing a visible vertical shift when
toggling between Evaluators and Archived.
…-table-shows-evaluator-slugs-instead-of-names

[FE Fix]: Show evaluator names instead of slugs in observability table
…ge-layout-fix

[FE Fix]: Align archived evaluators page layout with active page
…jects-keeps-evaluation-or-playground-in-url

[FE Fix]: Correctly handle project changes from entity scoped pages
Failed Railway preview deploys reported only 'Process completed with exit
code 1'. The real cause was either swallowed by >/dev/null in the deploy
scripts or lived only in the Railway dashboard, which CI never surfaced.

- Add install_error_trap to bootstrap/configure/deploy-from-images so a
  failure prints the command, exit code, and a file:line call stack.
- railway_call now prints failures to stderr instead of stdout, so callers
  that send stdout to /dev/null still surface the underlying railway error.
- Add dump_railway_logs: on a failed deploy, pull the tail of key services'
  Railway logs (Postgres, alembic, api, ...) into the job.
- Persist setup/deploy output to a log file, upload it as an artifact, and
  write a step summary with the log tail on failure.

Diagnostics only; no deploy logic changes. The artifact-upload steps are
marked continue-on-error so they can never fail an otherwise-passing job.

(cherry picked from commit d97cf94)
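
Why moving failures to stderr matters can be shown with a small sketch. It is written in Python rather than the bash used by the deploy scripts, the child process is only a stand-in for a failing `railway` invocation, and the error text is made up for illustration.

```python
import subprocess
import sys

# Stand-in for a failing CLI call: normal output on stdout, the error on stderr.
CHILD = (
    "import sys\n"
    "print('progress output')\n"
    "print('railway: service not found', file=sys.stderr)\n"
    "sys.exit(1)\n"
)


def run_quietly(code: str) -> subprocess.CompletedProcess:
    """Run a child like a caller that sends stdout to /dev/null."""
    return subprocess.run(
        [sys.executable, "-c", code],
        stdout=subprocess.DEVNULL,  # equivalent of `>/dev/null`
        stderr=subprocess.PIPE,     # stderr is still available to the caller
        text=True,
    )


result = run_quietly(CHILD)
```

Because the error travels on stderr, `result.stderr` still carries it even though stdout was discarded. Had the child printed the error to stdout, the caller would see only the non-zero exit code.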
Address CodeRabbit review. The ERR handler printed $BASH_COMMAND and
railway_call printed railway's output on failure. configure.sh passes real
secret values as CLI args to 'railway variable set' (POSTGRES_PASSWORD,
AGENTA_AUTH_KEY/CRYPT_KEY, *_API_KEY), so a failure could emit plaintext
secrets into the uploaded deploy-log artifact.

Add _railway_redact (masks KEY=value for PASSWORD/TOKEN/SECRET/KEY keys and
scheme://user:password@host) and apply it to every diagnostic path:
railway_call failure + rate-limit output, the ERR handler command, and
dump_railway_logs. The success path stays unredacted so callers that parse
'variable list -k' output (e.g. resolve_postgres_password) keep working.

(cherry picked from commit 63136af)
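
The two masking rules named in this commit message (KEY=value for PASSWORD/TOKEN/SECRET/KEY keys, and scheme://user:password@host) could look roughly like this. It is a Python sketch of the idea only: the real `_railway_redact` is a shell function in lib.sh whose exact patterns are not shown here, and the regexes, the `***` mask, and the `redact` name are assumptions.

```python
import re

# Assumed pattern: NAME=value where NAME contains PASSWORD, TOKEN, SECRET or KEY.
_KEY_VALUE = re.compile(
    r"\b([A-Za-z0-9_]*(?:PASSWORD|TOKEN|SECRET|KEY)[A-Za-z0-9_]*)=(\S+)"
)
# Assumed pattern: scheme://user:password@host
_URL_CREDENTIALS = re.compile(r"([A-Za-z][A-Za-z0-9+.-]*://[^:/\s@]+):([^@\s]+)@")


def redact(text: str) -> str:
    """Mask secret values in diagnostic output (illustrative sketch)."""
    text = _URL_CREDENTIALS.sub(r"\1:***@", text)
    text = _KEY_VALUE.sub(r"\1=***", text)
    return text
```

As in the commit, a filter like this belongs only on diagnostic paths (failure output, the ERR handler command, dumped logs). Applying it to successful output would break callers that parse real values.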

vercel Bot commented Jun 1, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project: agenta-documentation • Deployment: Ready • Actions: Preview, Comment • Updated (UTC): Jun 1, 2026 7:38pm


@dosubot dosubot Bot added the size:L (This PR changes 100-499 lines, ignoring generated files), ci/cd, and devops labels Jun 1, 2026

coderabbitai Bot commented Jun 1, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: fb8220ac-eef2-46bb-b16d-280fc141dea3

📥 Commits

Reviewing files that changed from the base of the PR and between c78070f and 81c0c43.

📒 Files selected for processing (1)
  • .github/workflows/43-railway-deploy.yml
🚧 Files skipped from review as they are similar to previous changes (1)
  • .github/workflows/43-railway-deploy.yml

📝 Walkthrough

Summary by CodeRabbit

  • New Features

    • Audit Log now always shows the entitlement-gated UI in EE and is dynamically loaded on settings pages.
    • Project switch preserves top-level section when moving between projects.
  • Bug Fixes

    • Improved deploy/preview setup with persisted logs, richer redaction and error diagnostics, and safer retries.
    • Sidebar Audit Log visibility now requires EE + view-permission.
  • Chores

    • Version bumped across packages and charts to v0.100.9.

Walkthrough

Bumps release to 0.100.9; separates EE retention DAOs/services/routers and moves events query into EE with entitlements checks; hardens Railway scripts/workflows with redaction and persistent logs; refactors Audit Log EE wiring and Settings injection; extracts buildProjectSwitchHref with tests; and applies version bumps.

Changes

Railway Infrastructure Hardening

Layer / File(s) Summary
Error trapping, redaction, and retry
hosting/railway/oss/scripts/lib.sh
Adds _railway_redact, reworks railway_call to capture/retry/redact output, and adds install_error_trap/_railway_on_error with stack traces and optional Actions annotations; updates dump_railway_logs to redact output.
Install error trap in scripts
hosting/railway/oss/scripts/bootstrap.sh, hosting/railway/oss/scripts/configure.sh, hosting/railway/oss/scripts/deploy-from-images.sh
Call install_error_trap immediately after sourcing lib.sh to activate centralized error handling.
Workflow log persistence & artifact upload
.github/workflows/41-railway-setup.yml, .github/workflows/43-railway-deploy.yml
Persist bootstrap/deploy logs to workspace files, stream with tee, capture pipeline exit codes, append failure tails to GITHUB_STEP_SUMMARY, and upload logs as artifacts in always() steps tolerant to missing files and upload failures.

Events and Tracing Retention Refactoring

Layer / File(s) Summary
DAO renames to retention-specific classes
api/ee/src/dbs/postgres/tracing/dao.py, api/ee/src/dbs/postgres/events/dao.py
Rename EE DAOs to TracingRetentionDAO and EventsRetentionDAO and update docstrings to clarify retention-only responsibility.
Service renames to retention services
api/ee/src/core/tracing/service.py, api/ee/src/core/events/service.py
Rename services to TracingRetentionService and EventsRetentionService, switching injected DAO types to the retention variants.
Events API router: query + retention admin
api/ee/src/apis/fastapi/events/router.py
Add EE POST /events/query endpoint with Permission.VIEW_EVENTS and Flag.AUDIT checks; introduce EventsRetentionRouter for admin flush calling events_retention_service.flush_events().
Spans router: retention-focused
api/ee/src/apis/fastapi/spans/router.py
Replace SpansRouter wiring to depend on TracingRetentionService and call tracing_retention_service.flush_spans() in admin flush.
EE main wiring & entrypoint cleanup
api/ee/src/main.py, api/entrypoints/routers.py, api/oss/src/apis/fastapi/events/router.py
Instantiate retention DAOs/services and query_events_service, wire retention routers in main, mount admin retention routes, remove OSS Events router module and its mounting from entrypoint.
Tests updated for retention
api/ee/tests/pytest/unit/*
Update unit tests to use retention services/routers and patch/assert EE-qualified symbols accordingly.

Frontend Audit Log Architecture

Layer / File(s) Summary
Audit Log EE gating and injection
web/ee/src/components/pages/settings/AuditLog/AuditLog.tsx, web/ee/src/components/pages/settings/AuditLog/components/AuditLogTable.tsx, web/ee/src/pages/.../settings/index.tsx
EE AuditLog now always renders AuditLogGated; table casts store for type compatibility; EE settings page dynamically imports and injects AuditLog into the shared Settings component which now accepts an AuditLogComponent prop and gates the tab by EE + canViewEvents.
Sidebar Audit Log gating
web/oss/src/components/Sidebar/SettingsSidebar.tsx
Require both isEE() and canViewEvents for Audit Log tab visibility.

Frontend Navigation Helper

Layer / File(s) Summary
buildProjectSwitchHref helper & tests
web/oss/src/components/Sidebar/components/assets/projectSwitchHref.ts, projectSwitchHref.test.ts
Add buildProjectSwitchHref with BuildProjectSwitchHrefParams and comprehensive regression tests covering top-level path preservation, query/tab handling, and defaults.
ListOfProjects integration
web/oss/src/components/Sidebar/components/ListOfProjects.tsx
Replace inline href-building with buildProjectSwitchHref call.

Frontend Evaluators UI Cleanup

Layer / File(s) Summary
Archived Evaluators simplification
web/oss/src/components/Evaluators/ArchivedEvaluatorsPage.tsx
Render EvaluatorsRegistry directly; remove layout wrapper and routing helpers.
Evaluators archive title and layout
web/oss/src/components/Evaluators/index.tsx
Add archivedTitle with back button, standardize PageLayout styling, and disable header tabs for archived view.
Entity reference slug resolution
web/oss/src/components/References/atoms/entityReferences.ts
Add slug-only resolution branch that matches slug against workflows list and returns resolved metadata or loading state.

Version Bumps and Miscellaneous

Layer / File(s) Summary
Version 0.100.9 bumps
api/pyproject.toml, clients/python/pyproject.toml, hosting/kubernetes/helm/Chart.yaml, sdks/python/pyproject.toml, services/pyproject.toml, web/*/package.json, web/packages/agenta-api-client/package.json
Increment versions from 0.100.8 → 0.100.9 across manifests and Chart.yaml appVersion.
Event API payload typing tweak
web/packages/agenta-entities/src/event/api/api.ts
Refactor event payload construction to use an untyped literal with an as AgentaApi.EventQuery type assertion, without behavior change.

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly Related PRs

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 26.47%, which is below the required threshold of 60.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
  • Title check: ✅ Passed. The title clearly and specifically describes the pull request's main changes: a version release (v0.100.9) and addition of Railway deploy diagnostics.
  • Description check: ✅ Passed. The description is directly related to the changeset, explaining the purpose of combining two PRs (#4504 and #4509) and detailing the diagnostic improvements made to Railway preview deployment.
  • Linked Issues check: ✅ Passed. Check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes check: ✅ Passed. Check skipped because no linked issues were found for this pull request.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
web/oss/src/pages/w/[workspace_id]/p/[project_id]/settings/index.tsx (1)

1-1: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Fix missing React type imports in Settings props/component types

web/oss/src/pages/w/[workspace_id]/p/[project_id]/settings/index.tsx references React.ComponentType and React.FC, but only imports hooks from "react", so the React namespace types will fail type-checking. Import ComponentType/FC (and use them directly).

Proposed fix
-import {useCallback, useEffect, useMemo, useState} from "react"
+import {
+    type ComponentType,
+    type FC,
+    useCallback,
+    useEffect,
+    useMemo,
+    useState,
+} from "react"
@@
 interface SettingsProps {
-    AuditLogComponent?: React.ComponentType
+    AuditLogComponent?: ComponentType
 }
 
-export const Settings: React.FC<SettingsProps> = ({AuditLogComponent}) => {
+export const Settings: FC<SettingsProps> = ({AuditLogComponent}) => {
🧹 Nitpick comments (5)
web/ee/src/components/pages/settings/AuditLog/components/AuditLogTable.tsx (1)

250-252: 🏗️ Heavy lift

Don’t erase the datasetStore type contract in AuditLogTable

  • eventsPaginatedStore.store comes from @agenta/entities’ own InfiniteDatasetStore implementation (createPaginatedEntityStore → entities createInfiniteDatasetStore), but InfiniteVirtualTableFeatureShell expects the InfiniteDatasetStore type from @agenta/ui/oss (web/oss/src/components/InfiniteVirtualTable/createInfiniteDatasetStore.ts). The as never cast removes the compile-time boundary that should catch contract drift.
  • Prefer a narrow bridge/adapter (or share the exact InfiniteDatasetStore types across @agenta/ui and @agenta/entities) instead of per-callsite as never (this pattern already shows up across other tables).
api/ee/src/core/tracing/service.py (1)

14-18: ⚡ Quick win

Keep the constructor and field retention-specific.

This still accepts the DAO positionally and stores it as self.tracing_dao, which leaves the old generic name in place after the retention/query split. Making the dependency keyword-only and carrying the tracing_retention_dao name through the instance state would make this boundary much clearer.

Suggested change
 class TracingRetentionService:
     def __init__(
         self,
-        tracing_retention_dao: TracingRetentionDAO,
+        *,
+        tracing_retention_dao: TracingRetentionDAO,
     ):
-        self.tracing_dao = tracing_retention_dao
+        self.tracing_retention_dao = tracing_retention_dao
# follow-up updates in this file
project_ids = await self.tracing_retention_dao.fetch_projects_with_plan(...)
traces, spans = await self.tracing_retention_dao.delete_traces_before_cutoff(...)

As per coding guidelines, "Prefer keyword-only parameters using *".
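
A minimal sketch of what the bare `*` buys, using stand-in classes rather than the real EE service and DAO (both class names below are hypothetical):

```python
class StubRetentionDAO:
    """Stand-in for the real DAO; it only needs to exist for this sketch."""


class RetentionServiceSketch:
    def __init__(
        self,
        *,  # everything after the bare * must be passed by keyword
        tracing_retention_dao: StubRetentionDAO,
    ):
        self.tracing_retention_dao = tracing_retention_dao


dao = StubRetentionDAO()
service = RetentionServiceSketch(tracing_retention_dao=dao)  # accepted

try:
    RetentionServiceSketch(dao)  # positional call is rejected
except TypeError as exc:
    positional_error = str(exc)
```

With the bare `*`, a positional call fails at construction time with a `TypeError`, so a later reordering or renaming of dependencies cannot silently bind the wrong object.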

api/ee/src/apis/fastapi/spans/router.py (1)

22-26: ⚡ Quick win

Make the injected service keyword-only.

The updated call sites already use a named argument, so adding * here would tighten the router API without extra churn.

Suggested change
 class SpansRetentionRouter:
     def __init__(
         self,
-        tracing_retention_service: TracingRetentionService,
+        *,
+        tracing_retention_service: TracingRetentionService,
     ):
         self.tracing_retention_service = tracing_retention_service

As per coding guidelines, "Prefer keyword-only parameters using *".

api/ee/src/core/events/service.py (1)

24-29: ⚡ Quick win

Keep the injected DAO name retention-specific.

The constructor was renamed to events_retention_dao, but storing it on self.events_dao reintroduces the old generic name and blurs the retention/query split this refactor is establishing.

Suggested rename
 class EventsRetentionService:
     def __init__(
         self,
         events_retention_dao: EventsRetentionDAO,
     ):
-        self.events_dao = events_retention_dao
+        self.events_retention_dao = events_retention_dao

Update the two call sites in _flush_events_for_plan() to use self.events_retention_dao as well.

api/ee/tests/pytest/unit/events/test_events_router_audit.py (1)

52-53: ⚡ Quick win

Assert the forwarded query kwargs too.

This only proves the service was awaited. Since the EE router now owns the project_id mapping for /events/query, it’s worth pinning the forwarded kwargs so a broken request-to-service translation does not keep passing.

Suggested assertion upgrade
     assert response == EventsQueryResponse(count=0, events=[])
     query_events_service.query.assert_awaited_once()
+    assert (
+        str(query_events_service.query.await_args.kwargs["project_id"])
+        == request.state.project_id
+    )
+    assert query_events_service.query.await_args.kwargs["event"] is None
+    assert query_events_service.query.await_args.kwargs["windowing"] is None
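
The difference between "was awaited" and "was awaited with the right arguments" can be seen in a self-contained sketch. The `handler` function and the `proj-123` id are hypothetical stand-ins for the EE router and request state; only the `unittest.mock.AsyncMock` API is real.

```python
import asyncio
from unittest.mock import AsyncMock


async def handler(service, project_id: str):
    # Hypothetical router body: forwards the request to the service by keyword.
    return await service.query(project_id=project_id, event=None, windowing=None)


service = AsyncMock()
service.query.return_value = {"count": 0, "events": []}

response = asyncio.run(handler(service, "proj-123"))

# Proves only that the call happened:
service.query.assert_awaited_once()

# Pins the request-to-service translation as well:
forwarded = service.query.await_args.kwargs
assert forwarded["project_id"] == "proj-123"
assert forwarded["event"] is None
assert forwarded["windowing"] is None
```

If the handler forwarded the wrong project id, `assert_awaited_once()` alone would still pass, while the keyword assertions would fail.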

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 51e9ddad-52ff-4324-b10c-53898ec72130

📥 Commits

Reviewing files that changed from the base of the PR and between d7c60c1 and c668181.

⛔ Files ignored due to path filters (4)
  • api/uv.lock is excluded by !**/*.lock
  • clients/python/uv.lock is excluded by !**/*.lock
  • sdks/python/uv.lock is excluded by !**/*.lock
  • services/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (46)
  • .github/workflows/41-railway-setup.yml
  • .github/workflows/43-railway-deploy.yml
  • api/ee/src/apis/fastapi/events/models.py
  • api/ee/src/apis/fastapi/events/router.py
  • api/ee/src/apis/fastapi/spans/router.py
  • api/ee/src/core/events/service.py
  • api/ee/src/core/tracing/service.py
  • api/ee/src/dbs/postgres/events/dao.py
  • api/ee/src/dbs/postgres/tracing/dao.py
  • api/ee/src/main.py
  • api/ee/tests/pytest/unit/events/test_events_router_audit.py
  • api/ee/tests/pytest/unit/test_admin_retention_routers.py
  • api/ee/tests/pytest/unit/test_events_retention.py
  • api/entrypoints/routers.py
  • api/oss/src/apis/fastapi/events/__init__.py
  • api/oss/src/apis/fastapi/events/router.py
  • api/pyproject.toml
  • clients/python/pyproject.toml
  • hosting/kubernetes/helm/Chart.yaml
  • hosting/railway/oss/scripts/bootstrap.sh
  • hosting/railway/oss/scripts/configure.sh
  • hosting/railway/oss/scripts/deploy-from-images.sh
  • hosting/railway/oss/scripts/lib.sh
  • sdks/python/pyproject.toml
  • services/pyproject.toml
  • web/ee/package.json
  • web/ee/src/components/pages/settings/AuditLog/AuditLog.tsx
  • web/ee/src/components/pages/settings/AuditLog/assets/constants.ts
  • web/ee/src/components/pages/settings/AuditLog/components/AuditEventCells.tsx
  • web/ee/src/components/pages/settings/AuditLog/components/AuditEventDrawer.tsx
  • web/ee/src/components/pages/settings/AuditLog/components/AuditLogFilters.tsx
  • web/ee/src/components/pages/settings/AuditLog/components/AuditLogTable.tsx
  • web/ee/src/components/pages/settings/AuditLog/state.ts
  • web/ee/src/pages/w/[workspace_id]/p/[project_id]/settings/index.tsx
  • web/oss/package.json
  • web/oss/src/components/Evaluators/ArchivedEvaluatorsPage.tsx
  • web/oss/src/components/Evaluators/index.tsx
  • web/oss/src/components/References/atoms/entityReferences.ts
  • web/oss/src/components/Sidebar/SettingsSidebar.tsx
  • web/oss/src/components/Sidebar/components/ListOfProjects.tsx
  • web/oss/src/components/Sidebar/components/assets/projectSwitchHref.test.ts
  • web/oss/src/components/Sidebar/components/assets/projectSwitchHref.ts
  • web/oss/src/pages/w/[workspace_id]/p/[project_id]/settings/index.tsx
  • web/package.json
  • web/packages/agenta-api-client/package.json
  • web/packages/agenta-entities/src/event/api/api.ts
💤 Files with no reviewable changes (2)
  • api/oss/src/apis/fastapi/events/router.py
  • api/entrypoints/routers.py

Comment on lines +103 to +113
- name: Upload setup log
if: always()
# Diagnostics only: a failed/duplicate upload must never fail the job.
continue-on-error: true
uses: actions/upload-artifact@v4
with:
name: railway-setup-log-${{ inputs.pr_number }}
path: railway-setup-*.log
if-no-files-found: ignore
overwrite: true
retention-days: 7

@coderabbitai coderabbitai Bot Jun 1, 2026

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Show the relevant portion of the workflow
echo "=== .github/workflows/41-railway-setup.yml (around lines 90-130) ==="
nl -ba .github/workflows/41-railway-setup.yml | sed -n '80,140p'

# Find all occurrences of upload-artifact in workflows
echo
echo "=== Occurrences of actions/upload-artifact@ in .github/workflows ==="
rg -n --hidden "uses:\s*actions/upload-artifact@" .github/workflows || true

# Find occurrences of "actions/upload-artifact@" anywhere in repo
echo
echo "=== Occurrences of actions/upload-artifact@ in entire repo ==="
rg -n --hidden "actions/upload-artifact@" . || true

# Check if any workflows pin actions to full SHAs (heuristic: @<40 hex> or `@refs/tags/` with SHA)
echo
echo "=== Heuristic: actions pinned to full commit SHAs in workflows ==="
rg -n --hidden "uses:\s*[^@]+@([0-9a-f]{40})" .github/workflows || true

# Inspect repository workflow/policy configuration files that might specify pinning requirements
echo
echo "=== Search for policy/pinning config hints ==="
rg -n --hidden "(supply.?chain|pin(n|ning)?|actions\/(upload-artifact|checkout)|COMMIT_SHA|commit sha)" .github . || true

Repository: Agenta-AI/agenta

Length of output: 190


🏁 Script executed:

#!/bin/bash
set -euo pipefail

WF=".github/workflows/41-railway-setup.yml"

echo "=== .github/workflows/41-railway-setup.yml (around lines 90-130) ==="
# Use cat -n (nl may be unavailable)
cat -n "$WF" | sed -n '80,140p'

echo
echo "=== Occurrences of actions/upload-artifact@ in .github/workflows ==="
rg -n --hidden "uses:\s*actions/upload-artifact@" .github/workflows || true

echo
echo "=== Occurrences of actions/upload-artifact@ in entire repo ==="
rg -n --hidden "actions/upload-artifact@" . || true

echo
echo "=== Heuristic: actions pinned to full commit SHAs in workflows (@40-hex) ==="
rg -n --hidden "uses:\s*[^@]+@[0-9a-f]{40}" .github/workflows || true

echo
echo "=== Search for policy/pinning config hints ==="
rg -n --hidden "(supply.?chain|pin(n|ning)?|COMMIT_SHA|commit sha|actions\/(checkout|upload-artifact))" .github . || true

Repository: Agenta-AI/agenta

Length of output: 3727


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Any actions pinned to full commit SHAs in workflows? ==="
# Match uses: owner/repo@<40-hex>
rg -n --hidden "uses:\s*[^#\s]+@[0-9a-f]{40}\b" .github/workflows || true

echo
echo "=== Any uses: ...@refs/tags or @v* (non-SHA) ==="
rg -n --hidden "uses:\s*[^#\s]+@(refs/tags/|refs/heads/|v[0-9]+|v[0-9]+\.[0-9]+|v[0-9]+\.[0-9]+\.[0-9]+|main)\b" .github/workflows || true

echo
echo "=== Search for pinning / supply-chain policy files/config ==="
# Look for common policy file names/strings
rg -n --hidden "(supply.?chain|pin(n|ning)?|actions\/(checkout|upload-artifact)|COMMIT_SHA|commit[ -]?sha|allowed-actions|untrusted|workflow|GITHUB_TOKEN)" .github || true

echo
echo "=== List all workflows referencing actions/upload-artifact@v4 ==="
rg -n --hidden "uses:\s*actions/upload-artifact@v4\b" .github/workflows || true

Repository: Agenta-AI/agenta

Length of output: 7918


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Files under .github (sample) ==="
git ls-files .github | head -n 200

echo
echo "=== Search for common pinning/supply-chain policy config strings (no complex regex) ==="
rg -n --hidden "supply chain|supply-chain|pinning|pinned|allowed-actions|untrusted|action pin|commit sha|commit-sha|COMMIT_SHA" .github . || true

echo
echo "=== Search for any uses: ...@<40hex> (full SHA pinning) ==="
rg -n --hidden "uses:\s*[^#[:space:]]+@[0-9a-f]{40}\b" .github/workflows || true

Repository: Agenta-AI/agenta

Length of output: 50375


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Any pinned actions to full commit SHA under .github/workflows? ==="
rg -n --hidden "uses:\s*[^#[:space:]]+@[0-9a-f]{40}\b" .github/workflows || true

echo
echo "=== All occurrences of actions/upload-artifact@v4 ==="
rg -n --hidden "uses:\s*actions/upload-artifact@v4\b" .github/workflows

Repository: Agenta-AI/agenta

Length of output: 775


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Pinning / supply-chain policy hints in .github (YAML) ==="
rg -n --hidden -g'*.yml' -g'*.yaml' \
  "supply.?chain|pin(n|ning)?|pinned actions|allowed-actions|action pin|COMMIT_SHA|commit sha|workflow.*pin" \
  .github || true

echo
echo "=== Files that commonly hold Actions pinning config ==="
find .github -maxdepth 3 -type f \( -name "*action*" -o -name "*supply*" -o -name "*policy*" -o -name "*security*" \) | head -n 200

Repository: Agenta-AI/agenta

Length of output: 1306


Pin actions/upload-artifact to a full commit SHA (not @v4).

In .github/workflows/41-railway-setup.yml (Upload setup log step, line ~107), uses: actions/upload-artifact@v4 relies on a mutable tag; switch to actions/upload-artifact@<commit_sha> to preserve supply-chain integrity.

🧰 Tools
🪛 zizmor (1.25.2)

[error] 107-107: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)

(unpinned-uses)

Member Author

Same as the note on the deploy workflow: deferring the @v4 → SHA pin to a dedicated repo-wide change for consistency (it's the convention across all workflows here).

@mmabrouk, understood — deferring to a repo-wide consistency pass makes sense. Would you like me to open a GitHub issue to track pinning all actions/upload-artifact@v4 (and other mutable action tags) to full commit SHAs across the workflow files?

Comment thread .github/workflows/43-railway-deploy.yml
Comment on lines +216 to +226
- name: Upload deploy log
if: always()
# Diagnostics only: a failed/duplicate upload must never fail the job.
continue-on-error: true
uses: actions/upload-artifact@v4
with:
name: railway-deploy-log-${{ inputs.pr_number }}
path: railway-deploy-*.log
if-no-files-found: ignore
overwrite: true
retention-days: 7

@coderabbitai coderabbitai Bot Jun 1, 2026

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE=".github/workflows/43-railway-deploy.yml"

echo "== File exists? =="
ls -l "$FILE" || exit 1

echo
echo "== Show lines 190-250 =="
nl -ba "$FILE" | sed -n '190,250p'

echo
echo "== Find upload-artifact usage =="
rg -n "actions/upload-artifact@" .github/workflows/43-railway-deploy.yml || true

echo
echo "== Find any pinned upload-artifact elsewhere (full repo) =="
rg -n "actions/upload-artifact@([0-9]+|v[0-9]+|[0-9a-f]{40})" .github/workflows -S || true

Repository: Agenta-AI/agenta

Length of output: 257


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE=".github/workflows/43-railway-deploy.yml"

echo "== Show lines 200-240 (approx.) =="
sed -n '200,240p' "$FILE" | cat -n

echo
echo "== Find upload-artifact usage (in this file) =="
rg -n "actions/upload-artifact@" .github/workflows/43-railway-deploy.yml || true

echo
echo "== Find any upload-artifact usage elsewhere (any workflow) =="
rg -n "actions/upload-artifact@" .github/workflows -S || true

echo
echo "== Look specifically for pinned commit SHA or tag usage =="
rg -n "actions/upload-artifact@([0-9a-f]{40}|v[0-9]+)" .github/workflows -S || true

Repository: Agenta-AI/agenta

Length of output: 3370


Pin actions/upload-artifact to a full commit SHA (instead of @v4).

actions/upload-artifact@v4 appears in multiple workflows—.github/workflows/43-railway-deploy.yml (line 220), .github/workflows/41-railway-setup.yml (line 107), and .github/workflows/44-railway-tests.yml (lines 387, 466, 547, 641, 653). Using the mutable @v4 tag is a supply-chain risk; replace each occurrence with a pinned commit SHA.

🧰 Tools
🪛 zizmor (1.25.2)

[error] 220-220: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)

(unpinned-uses)

Member Author

Leaving actions/upload-artifact@v4 as a version tag here. It's the repo-wide convention — @v4 is used in 5 places in 44-railway-tests.yml, and actions/checkout@v6 / actions/github-script@v7 are tag-pinned everywhere too. Pinning only the two steps this PR touches would be inconsistent and wouldn't meaningfully reduce supply-chain risk while the same action stays unpinned elsewhere. SHA-pinning is worth doing, but as a dedicated repo-wide change (ideally with Dependabot to keep the pins fresh), not piecemeal in a diagnostics PR.

@mmabrouk, that's a fair and consistent rationale. Keeping version tags uniform across the repo makes sense — piecemeal SHA-pinning without tooling (e.g. Dependabot/Renovate) to keep the hashes fresh would create more maintenance burden than it eliminates risk. Happy to open a follow-up issue to track the repo-wide SHA-pinning + Dependabot setup if you'd like one for the backlog.


✏️ Learnings added
Learnt from: mmabrouk
Repo: Agenta-AI/agenta PR: 4514
File: .github/workflows/43-railway-deploy.yml:219-229
Timestamp: 2026-06-01T19:38:51.520Z
Learning: In the Agenta-AI/agenta repository, GitHub Actions are pinned to version tags (e.g., `actions/upload-artifact@v4`, `actions/checkout@v6`, `actions/github-script@v7`) rather than full commit SHAs. This is a deliberate repo-wide convention. SHA-pinning has been intentionally deferred to a dedicated future effort (ideally with Dependabot), so review comments requesting SHA-pinning in individual PRs should not be raised.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

Comment on lines +348 to +382
// Slug-only resolution (e.g. the observability table's evaluator
// metric columns, which only carry a slug). The id query above
// never runs in that case, so match the slug against the already
// loaded app+evaluator union list to surface the real `name`
// instead of leaking the slug. Mirrors `appReferenceAtomFamily`.
if (slug) {
    const listState = get(workflowsListQueryStateAtom)
    const listMatch = listState.data.find((w) => w.slug === slug)

    if (listMatch) {
        return {
            data: {
                id: listMatch.id ?? id ?? null,
                slug: listMatch.slug ?? slug,
                name: listMatch.name ?? listMatch.slug ?? slug,
                metrics: extractMetricsFromWorkflow(listMatch),
            },
            isPending: false,
            isFetching: false,
            isLoading: false,
            isError: false,
        }
    }

    // List still loading — defer rather than flash the slug.
    if (listState.isPending) {
        return {
            data: null,
            isPending: true,
            isFetching: true,
            isLoading: true,
            isError: false,
        }
    }
}

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Check if workflowsListQueryStateAtom returns items with the nested 'data' structure

# Search for the definition of workflowsListQueryStateAtom to understand its structure
ast-grep --pattern $'export const workflowsListQueryStateAtom = $_'

# Find how the workflows list is populated and what structure it returns
rg -nP -A 10 'workflowsListQueryStateAtom.*=' --type=ts

# Check if there are any transformations that flatten the workflow structure
rg -nP -C 5 'resolveOutputSchemaProperties|workflow.*data' --type=ts -g '**/workflow/**'

Repository: Agenta-AI/agenta

Length of output: 50374


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# 1) Inspect extractMetricsFromWorkflow in the target OSS file
echo "=== entityReferences.ts: extractMetricsFromWorkflow ==="
target1="web/oss/src/components/References/atoms/entityReferences.ts"
if [ -f "$target1" ]; then
  rg -n "function extractMetricsFromWorkflow|extractMetricsFromWorkflow" "$target1" || true
  # Print likely implementation region
  line=$(rg -n "extractMetricsFromWorkflow" "$target1" | head -n 1 | cut -d: -f1 || true)
  if [ -n "${line:-}" ]; then
    start=$((line-40)); end=$((line+80))
    sed -n "${start},${end}p" "$target1" | cat -n
  fi
else
  echo "Missing: $target1"
fi

# 2) Inspect workflowsListQueryStateAtom implementation
echo -e "\n=== allWorkflows.ts: workflowsListQueryStateAtom ==="
target2="web/packages/agenta-entities/src/workflow/state/allWorkflows.ts"
if [ -f "$target2" ]; then
  rg -n "workflowsListQueryStateAtom" "$target2"
  line=$(rg -n "workflowsListQueryStateAtom" "$target2" | head -n 1 | cut -d: -f1)
  start=$((line-30)); end=$((line+70))
  sed -n "${start},${end}p" "$target2" | cat -n
else
  echo "Missing: $target2"
fi

# 3) Inspect list atom sources referenced by workflowsListQueryStateAtom
echo -e "\n=== store.ts: list \"Thin workflow reference\" / data presence ==="
target3="web/packages/agenta-entities/src/workflow/state/store.ts"
if [ -f "$target3" ]; then
  rg -n "Thin workflow reference|Full workflow data lives in the molecule|POST /workflows/query|has data|no data" "$target3" || true
  # Print around the comment area if found
  line=$(rg -n "Thin workflow reference" "$target3" | head -n 1 | cut -d: -f1 || true)
  if [ -n "${line:-}" ]; then
    start=$((line-40)); end=$((line+90))
    sed -n "${start},${end}p" "$target3" | cat -n
  fi
else
  echo "Missing: $target3"
fi

# 4) Inspect evaluator utils list atoms too (in case they differ)
echo -e "\n=== evaluatorUtils.ts: evaluatorsListDataAtom shape (if relevant) ==="
target4="web/packages/agenta-entities/src/workflow/state/evaluatorUtils.ts"
if [ -f "$target4" ]; then
  rg -n "evaluatorsListDataAtom|list response|no data|Thin workflow" "$target4" || true
  line=$(rg -n "evaluatorsListDataAtom" "$target4" | head -n 1 | cut -d: -f1 || true)
  if [ -n "${line:-}" ]; then
    start=$((line-25)); end=$((line+80))
    sed -n "${start},${end}p" "$target4" | cat -n
  fi
else
  echo "Missing: $target4"
fi

Repository: Agenta-AI/agenta

Length of output: 18647


Fix metrics extraction in slug-only evaluator resolution

workflowsListQueryStateAtom is built from “thin workflow references” (see toWorkflowListRef in web/packages/agenta-entities/src/workflow/state/store.ts), and the list API explicitly returns lean objects with no data field. The slug-only path then calls extractMetricsFromWorkflow(listMatch), but extractMetricsFromWorkflow derives metrics from workflow?.data via resolveOutputSchemaProperties(...), so slug-only resolution will produce empty metrics (not the real output-schema columns).

Update the slug-only path to either:

  1. Resolve metrics using full workflow data (e.g., fetch latest revision / workflow entity using listMatch.id), or
  2. Intentionally return empty metrics for slug-only cases and document that observability metric columns require ID-based resolution.
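
A rough sketch of how the two options above could be combined, not the fix actually applied in this PR: resolve the name from the thin list entry, and only report metrics when full workflow data happens to be available. All type and field names here (`ThinWorkflowRef`, `FullWorkflow`, `outputSchemaProperties`, `loadFull`, `metricsResolved`) are hypothetical, and `extractMetricKeys` is a simplified stand-in for `extractMetricsFromWorkflow`, whose real implementation is not shown in this thread.

```typescript
// Hypothetical shapes for illustration only; the real types live in the repo.
interface ThinWorkflowRef {
    id: string
    slug: string
    name?: string | null
}

interface FullWorkflow extends ThinWorkflowRef {
    // Assumption: metric columns are derived from output-schema properties on `data`.
    data?: {outputSchemaProperties?: Record<string, unknown>}
}

interface EvaluatorReference {
    id: string
    slug: string
    name: string
    metrics: string[]
    // False when only the thin list entry was available (slug-only case).
    metricsResolved: boolean
}

// Simplified stand-in for extractMetricsFromWorkflow: returns [] when `data` is absent.
function extractMetricKeys(workflow?: FullWorkflow | null): string[] {
    return Object.keys(workflow?.data?.outputSchemaProperties ?? {})
}

// Resolve a slug against the thin list, then enrich with metrics if the
// caller can supply the full workflow (e.g. from an id-based cache or fetch).
function resolveEvaluatorBySlug(
    slug: string,
    list: ThinWorkflowRef[],
    loadFull: (id: string) => FullWorkflow | undefined,
): EvaluatorReference | null {
    const match = list.find((w) => w.slug === slug)
    if (!match) return null

    const full = loadFull(match.id)
    return {
        id: match.id,
        slug: match.slug,
        name: match.name ?? match.slug,
        metrics: extractMetricKeys(full),
        metricsResolved: full !== undefined,
    }
}
```

With this shape, a caller that only has the thin list still gets the real name, and the explicit `metricsResolved` flag makes the "empty because unknown" case distinguishable from "evaluator has no metrics".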

railway_call previously retried only rate-limits. Railway's API also
intermittently times out write mutations (notably 'variable set'),
reproduced locally at ~20% even single-threaded and independent of env
size. With ~15 variable-set calls per deploy and no retry, deploys failed
~96% of the time (1 - 0.8^15, since all 15 calls must succeed).

Retry policy:
- rate-limit (429): always retried (request was rejected, not processed)
- transient network/timeout: retried ONLY for idempotent commands; a
  timed-out create (init/add/environment new/volume add) may have
  succeeded server-side, so it is not blind-retried (avoids duplicate
  projects/services/volumes); rate-limit retries still apply to them
- deterministic errors (not found/unauthorized): fail fast, no retry

Failure output stays redacted. Covered by unit tests.
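
The retry policy above can be read as a small decision table. The sketch below is illustrative only: the real `railway_call` is a shell helper whose code is not shown here, and the names `FailureKind`, `CREATE_COMMANDS`, `isIdempotent`, `shouldRetry`, and `deployFailureProbability` are hypothetical. It also reproduces the failure-rate arithmetic from the message, assuming each call fails independently.

```typescript
// Failure classes described in the retry policy.
type FailureKind = "rate-limit" | "transient" | "deterministic"

// Create-style commands listed in the commit message; a timed-out create may
// already have succeeded server-side, so it is treated as non-idempotent.
const CREATE_COMMANDS = ["init", "add", "environment new", "volume add"]

function isIdempotent(command: string): boolean {
    return !CREATE_COMMANDS.some((c) => command === c || command.startsWith(c + " "))
}

function shouldRetry(kind: FailureKind, command: string): boolean {
    switch (kind) {
        case "rate-limit":
            // The request was rejected, not processed: always safe to retry.
            return true
        case "transient":
            // Network errors and timeouts: retry only when a repeat is harmless.
            return isIdempotent(command)
        case "deterministic":
            // Not found / unauthorized: retrying cannot help, fail fast.
            return false
    }
}

// Probability that at least one of `calls` independent calls fails,
// given a per-call success rate.
function deployFailureProbability(perCallSuccess: number, calls: number): number {
    return 1 - Math.pow(perCallSuccess, calls)
}
```

For example, `shouldRetry("transient", "variable set FOO=bar")` is true while `shouldRetry("transient", "volume add")` is false, and `deployFailureProbability(0.8, 15)` is roughly 0.965, matching the ~96% figure.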

(cherry picked from commit 060441a)

dump_railway_logs ran after the 'deploy-from-images.sh | tee $log_file'
pipeline, so its Railway service-log tails only reached the live Actions
log, not the uploaded artifact or the step-summary tail. If the live log
truncates, the root-cause dump was lost again. Tee the dump into $log_file
so it lands in the artifact and the summary. (Addresses CodeRabbit review.)

(cherry picked from commit da990a8)

github-actions Bot commented Jun 1, 2026

Railway Preview Environment

Status: Destroyed (PR closed)

Updated at: 2026-06-02T12:00:13.731Z

@mmabrouk mmabrouk closed this Jun 2, 2026
Labels

ci/cd devops size:L This PR changes 100-499 lines, ignoring generated files.

5 participants