feat(webapp,sdk): in-dashboard AI agent #4018
Range-scoped to ^18 so packages on react 19 (rsc) stay untouched. Collapses the duplicate react copies that otherwise break hook-based SDK components with an invalid hook call.
When set, the returned action runs under the given baseURL and access token (via apiClientManager.runWithConfig) instead of the ambient SDK config, so one server can start chat sessions in a different runtime environment.
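A minimal sketch of the scoping idea, assuming an AsyncLocalStorage-backed override. The `ApiClientConfig` shape, the example URLs, and this stand-in `runWithConfig` are hypothetical and only illustrate the behaviour described above; they are not the SDK's implementation of apiClientManager.runWithConfig.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Hypothetical config shape; the SDK's real type is not shown in this PR text.
interface ApiClientConfig {
  baseURL: string;
  accessToken: string;
}

// Stand-in for the ambient SDK config of the server process.
const ambient: ApiClientConfig = {
  baseURL: "https://ambient.example.test",
  accessToken: "ambient-token",
};

const scoped = new AsyncLocalStorage<ApiClientConfig>();

// Returns the override when called inside runWithConfig, else the ambient config.
function currentConfig(): ApiClientConfig {
  return scoped.getStore() ?? ambient;
}

// Runs fn with the given config visible to everything it calls, sync or async.
function runWithConfig<T>(config: ApiClientConfig, fn: () => T): T {
  return scoped.run(config, fn);
}

const inside = runWithConfig(
  { baseURL: "https://other-env.example.test", accessToken: "scoped-token" },
  () => currentConfig().baseURL
);
const outside = currentConfig().baseURL;
```

Because the override lives in async-local state rather than a global, concurrent requests on one server can each target a different environment without clobbering one another.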
Adds an "Ask the agent" side panel on every environment page, backed by a chat.agent task that runs as its own internal Trigger project with no main-database or ClickHouse access. Conversations persist to a dedicated Postgres schema via a small Drizzle store; the panel restores the last open chat and resumes streaming without replay. No tools yet: the agent answers from the model only. Tools, delegated auth, and the data lane land in follow-ups.
Adds hasDashboardAgentAccess to the flag catalog, off by default via DASHBOARD_AGENT_ENABLED. The panel renders only when the flag is on for the org or the viewer is staff, and the resource route enforces the same check so a non-flagged user cannot start sessions by calling it directly. Controllable globally and per-org from the admin flags UI.
…+ caching
Drive the agent and title models from dashboard-managed prompts resolved through an Anthropic provider registry (agent claude-sonnet-4-6, title claude-haiku-4-5). Generate the chat title in the background after the first turn with the cheaper model so it never blocks the response, writing it only while the chat still has the default title. Add Anthropic prompt caching: an ephemeral breakpoint on the system block plus a rolling breakpoint on the last message.
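The breakpoint placement can be sketched as a pure function over the message list. The `Msg` shape and the `cacheControl` field are illustrative assumptions, not the AI SDK's actual message types. The sketch marks system messages and the final message, and clears older markers so the rolling breakpoint moves forward each turn.

```typescript
// Illustrative message shape; not the AI SDK's real types.
type CacheControl = { type: "ephemeral" };

interface Msg {
  role: "system" | "user" | "assistant";
  content: string;
  cacheControl?: CacheControl;
}

function withCacheBreakpoints(messages: Msg[]): Msg[] {
  const lastIndex = messages.length - 1;
  return messages.map((m, i): Msg => {
    // Rebuild without any earlier marker so stale breakpoints do not accumulate.
    const base: Msg = { role: m.role, content: m.content };
    const isBreakpoint = m.role === "system" || i === lastIndex;
    return isBreakpoint ? { ...base, cacheControl: { type: "ephemeral" } } : base;
  });
}
```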
The deploy-time prompt dedup hashed only the prompt text, so changing a code prompt's model or config was silently skipped and the previous version kept serving. Include model and config in the version-definition hash so those changes create a new version.
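A sketch of a version-definition hash that covers model and config as well as text. The `PromptDefinition` shape, the key-sorted serialization, and the choice of SHA-256 are assumptions for illustration; the PR text does not say how the real hash is computed.

```typescript
import { createHash } from "node:crypto";

// Hypothetical prompt definition shape.
interface PromptDefinition {
  text: string;
  model?: string;
  config?: Record<string, unknown>;
}

// Serialize with sorted keys so equal definitions hash equally regardless of key order.
function stableStringify(value: unknown): string {
  if (value === undefined) return "null";
  if (Array.isArray(value)) return `[${value.map(stableStringify).join(",")}]`;
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const entries = Object.keys(obj)
      .sort()
      .map((k) => `${JSON.stringify(k)}:${stableStringify(obj[k])}`);
    return `{${entries.join(",")}}`;
  }
  return JSON.stringify(value);
}

// Hash text, model, and config together so a change to any of them yields a new version.
function versionHash(def: PromptDefinition): string {
  const payload = stableStringify({
    text: def.text,
    model: def.model ?? null,
    config: def.config ?? {},
  });
  return createHash("sha256").update(payload).digest("hex");
}
```

Hashing only `def.text` would make the first two definitions in a model-only change collide, which is the skipped-update behaviour described above.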
Open the agent inside a ResizablePanelGroup (content + handle + agent panel) using the shared Resizable primitive, so the panel drag-resizes between 320 and 720px with the standard handle and keyboard support. autosaveId persists the width across opens and reloads.
The project worker-by-tag endpoint now accepts a delegated user-actor token the same way it accepts a personal access token, so a first-party caller can list a project's deployed tasks on the user's behalf. It stays identity-scoped, with no change to what the user can already do.
A caret range resolved a newer ai build than the SDK uses, pulling a second copy of the AI SDK tool types and breaking type-checking against the chat agent's tool set. Pinning the exact version dedupes them.
The dashboard agent can now read your projects, environments, runs, and deployed tasks by calling the public API as you. Each turn mints a short-lived, read-only delegated token on the server and hands it to the agent through a same-origin proxy on the message-send path, so the token is never exposed to the browser. The response stream stays pointed directly at the realtime host. Tools: list_projects, list_environments, get_run, list_tasks.
The dashboard agent can now list recent runs in the current environment (filterable by status, task, and a time window of up to 30 days) and fetch a run's execution trace to explain why it failed, retried, or was slow.
chat.headStart now accepts an apiClient (base URL + access token), so the head-start route can create the session and trigger the agent run against a different project or environment than the warm server's ambient config. Mirrors chat.createStartSessionAction; the run callback's LLM keys are unaffected.
Pins ai (^6) and @ai-sdk/provider-utils (^4) via pnpm overrides so the AI SDK tool types resolve to one instance across the app, the SDK, and internal packages. Two provider-utils copies previously broke type-checking a shared tool set across package boundaries. Also adds @ai-sdk/anthropic to the webapp for the agent's first-turn route.
The first turn of a new dashboard agent chat now streams step 1 from the webapp while the agent run boots in parallel, cutting first-token latency. The warm route runs step 1 with schema-only tools (the tool execute fns stay in the agent, never the webapp bundle) and mints a fresh read-only delegated token server-side, injected into the run so the first turn's tool calls are authed as the user without the token ever reaching the browser. Falls back to the normal path when no Anthropic key is configured.
The dashboard agent can now read error groups as the signed-in user: list_errors lists distinct errors by fingerprint with occurrence counts and status, get_error returns the full detail for one, and list_runs can filter to the runs behind an error group. All read-only and scoped to the current environment.
…eMessages
A Head Start handover hands the first turn's pending tool call to the agent as a tool-approval round whose trailing tool message must reach the model untouched for the call to execute. A prepareMessages hook that rewrote the last message (for example the recommended prompt-caching breakpoint) dropped it, so the turn failed with "tool_use ids were found without tool_result". The agent now preserves that approval tail across prepareMessages, so caching and Head Start compose cleanly.
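One way to picture the fix is a wrapper that hides a trailing tool message from the hook and reattaches it afterwards. This is a sketch under assumed types (`Msg`, `PrepareMessages`) with a hypothetical helper name, not necessarily how the SDK preserves the tail.

```typescript
// Illustrative types; the SDK's message and hook types differ.
type Msg = { role: "system" | "user" | "assistant" | "tool"; content: string };
type PrepareMessages = (messages: Msg[]) => Msg[];

// Wrap a user hook so a trailing tool message reaches the model untouched.
function preserveApprovalTail(hook: PrepareMessages): PrepareMessages {
  return (messages) => {
    const last = messages[messages.length - 1];
    if (!last || last.role !== "tool") return hook(messages);
    // The hook only sees the head; the approval tail is reattached as-is.
    return [...hook(messages.slice(0, -1)), last];
  };
}

// Example hook that rewrites the last message it is given,
// standing in for a prompt-caching breakpoint.
const markLast: PrepareMessages = (ms) =>
  ms.map((m, i) => (i === ms.length - 1 ? { ...m, content: `${m.content} [cached]` } : m));
```

With the wrapper, the breakpoint lands on the last non-tool message instead of replacing the approval tail.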
Drive the agent through real turns offline with mockChatAgent and MockLanguageModelV3: text streaming, tool execution, the prompt-cache breakpoint, the Head Start handover resume, and the read tools failing closed without a delegated token. The agent's datastore and model are now injectable via locals so the tests need no database or provider.
…answer quality
Vitest evals that exercise the agent against a real model with fixture tools: a tool-selection set scored on expected tool choices, plus an LLM-as-judge case for answer quality. Runs behind its own config so it stays out of the unit test run, and *.eval.ts files are kept out of the task index.
After each turn the agent enqueues a decoupled, idempotency-keyed task that judges the turn (grounding, whether the question was answered, intent, outcome) and flags product signal (capability gaps, docs gaps, support opportunities, feature requests), then writes one row. The judge runs out of band so it never blocks or bills the agent run. Adds the chat_turn_evals table and migration to @internal/dashboard-agent-db.
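The "one row per turn" guarantee follows from keying the job on the turn. Below is a sketch with an in-memory queue standing in for the real task trigger; the key format, `TurnEvalJob`, and `TurnEvalQueue` are hypothetical names used only for illustration.

```typescript
import { createHash } from "node:crypto";

interface TurnEvalJob {
  key: string;
  chatId: string;
  turnIndex: number;
}

// Derive a deterministic key so retries of the same turn map to the same job.
function turnEvalKey(chatId: string, turnIndex: number): string {
  return createHash("sha256").update(`turn-eval:${chatId}:${turnIndex}`).digest("hex");
}

// In-memory stand-in for an idempotent enqueue.
class TurnEvalQueue {
  private readonly jobs = new Map<string, TurnEvalJob>();

  // Returns true when a new job was recorded, false when the key was already seen.
  enqueue(chatId: string, turnIndex: number): boolean {
    const key = turnEvalKey(chatId, turnIndex);
    if (this.jobs.has(key)) return false;
    this.jobs.set(key, { key, chatId, turnIndex });
    return true;
  }

  get size(): number {
    return this.jobs.size;
  }
}
```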
…mode
When a project has a connected GitHub repo, the in-dashboard agent can now read its source to ground answers. It pulls the repo at the tracked commit onto its own filesystem and exposes read-only list, read, and grep tools, so it can explain a run or error against the actual task code, citing file and line. The webapp resolves a short-lived signed archive URL server-side and injects it per turn, so the GitHub token never reaches the agent. With no connected repo the agent stays in its usual assistant mode.
…deployed from
The agent code tools now take an optional runId. When investigating a specific run, the agent reads the exact source that run version was deployed from instead of the latest commit, so the explanation matches the code that actually ran. A new endpoint maps the run to its deployment commit and resolves a signed archive URL for it server-side (the GitHub token stays on the server); the agent downloads and reads that commit. Runs with no deployed version fall back to the tracked branch head.
The dashboard agent now answers "why did this run fail?" with a structured failure card (summary, category, likely cause, confidence, evidence, impact, next steps, and action buttons) instead of plain prose. The card is the first block in a small view catalog: the agent emits a render_view tool call constrained to a fixed set of blocks, and the dashboard renders it through a component registry. Only known block types render, so the agent can never produce arbitrary markup. New blocks plug in by adding a schema member plus a registry entry.
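A sketch of the catalog-plus-registry pattern, assuming two block types and string output in place of React components. The field names are trimmed and hypothetical, and per-field schema validation is omitted to keep the example short.

```typescript
// Trimmed, hypothetical block shapes; the real schema has more fields.
type FailureCardBlock = { type: "failure_card"; summary: string; category: string };
type ChartBlock = { type: "chart"; kind: "line" | "bar"; query: string };
type ViewBlock = FailureCardBlock | ChartBlock;

type BlockRenderer<B> = (block: B) => string;

// One renderer per block type; the mapped type forces every schema member to have an entry.
type Registry = {
  [K in ViewBlock["type"]]: BlockRenderer<Extract<ViewBlock, { type: K }>>;
};

const registry: Registry = {
  failure_card: (b) => `[failure: ${b.category}] ${b.summary}`,
  chart: (b) => `[${b.kind} chart] ${b.query}`,
};

// Renders only known block types; anything else yields null instead of markup.
// Only the type tag is checked here; a real implementation would validate the fields too.
function renderView(input: unknown): string | null {
  if (typeof input !== "object" || input === null) return null;
  const type = (input as { type?: unknown }).type;
  if (typeof type !== "string" || !Object.prototype.hasOwnProperty.call(registry, type)) {
    return null;
  }
  const render = registry[type as ViewBlock["type"]] as BlockRenderer<ViewBlock>;
  return render(input as ViewBlock);
}
```

Adding a block means adding a member to `ViewBlock`, after which the compiler rejects the registry until the matching renderer exists.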
…iew block
The dashboard agent can now answer analytics questions with data and charts. Two read-only query tools let it discover the TRQL schema and run queries against the current environment, and a new "chart" block in the view catalog renders a line or bar chart. The chart block carries the TRQL query rather than rows: the panel runs it through the dashboard's existing query execution and chart components, so the chart is live and matches the Query page. Queries run as the user over the public query API (read:query), so the agent still reaches no data directly.
… questions
The dashboard agent can now answer "how does Trigger.dev work?" questions (docs, concepts, features, how-tos) by asking the Trigger.dev support assistant, instead of guessing or limiting itself to the user's own data. The ask_support tool forwards the question to a service-to-service ask endpoint (the support assistant composes the answer) and returns it. It carries no user data and uses a shared secret server-side only, so nothing reaches the browser.
AgentChat's .in/append and .out SSE calls built their headers by hand and omitted x-trigger-branch, so a chat.agent deployed to a preview branch returned 401 "x-trigger-branch header required for preview env" the moment a message was appended. sessions.start already sends the branch via the API client; these two raw fetches now do too, from the same apiClientManager.branchName.
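The fix amounts to building headers for the raw fetches from the same source as the API client. Below is a sketch with a hypothetical `RequestContext` and helper name; the Bearer authorization header is an assumption, and only the x-trigger-branch header name comes from the error message above.

```typescript
// Hypothetical per-request context; branchName is set only for preview branches.
interface RequestContext {
  accessToken: string;
  branchName?: string;
}

// Build headers for the raw .in/append and .out requests.
function buildSessionHeaders(ctx: RequestContext): Record<string, string> {
  const headers: Record<string, string> = {
    Authorization: `Bearer ${ctx.accessToken}`,
  };
  // Forward the branch so preview environments accept the request.
  if (ctx.branchName) {
    headers["x-trigger-branch"] = ctx.branchName;
  }
  return headers;
}
```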
Resolve dashboard agent access in the env layout loader (global env override, then admins, then the global or per-org feature flag, default off) and pass it to the launcher. The button previously read only the per-org flag on the client, so enabling the agent globally granted server access but left the button hidden for non-admins. The launcher now matches the server check.
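The resolution order can be written as one function shared by the loader and the route. This is a sketch with a hypothetical `AccessInput` shape; treating the global and per-org flags as a simple OR is an assumption, since the text does not say which takes precedence.

```typescript
// Hypothetical inputs gathered by the env layout loader.
interface AccessInput {
  envOverride: boolean; // global env override, e.g. derived from DASHBOARD_AGENT_ENABLED
  isAdmin: boolean;
  globalFlag?: boolean;
  orgFlag?: boolean;
}

// Global env override first, then admins, then either feature flag; default off.
function resolveDashboardAgentAccess(input: AccessInput): boolean {
  if (input.envOverride) return true;
  if (input.isAdmin) return true;
  return input.globalFlag === true || input.orgFlag === true;
}
```

Passing the resolved boolean to the launcher keeps the button and the server check from drifting apart again.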
🦋 Changeset detected
Latest commit: 13ec864
The changes in this PR will be included in the next version bump. This PR includes changesets to release 28 packages.
Walkthrough
This PR adds a dashboard agent feature across the webapp, a dedicated Drizzle datastore package, and a Trigger task package. It introduces new routes for chat loading, message proxying, repo snapshot lookup, and user-actor token flow, plus UI components for chat, history, draft, diagnosis cards, charts, and view rendering. It also adds dashboard-agent access gating and environment configuration. Separately, the SDK updates add preview-branch header forwarding, preserve head-start approval tails through …
Pre-merge checks: 3 passed, 2 failed (2 warnings)
Self-hosted containers now apply the trigger_dashboard_agent schema migrations on startup, alongside the Prisma and ClickHouse migrations, via a small runtime migrator (drizzle-orm over the committed SQL, no drizzle-kit in the image). A new SKIP_DASHBOARD_AGENT_MIGRATIONS flag lets deployments that apply migrations out of band opt out.
The dashboard agent now creates chat sessions server-side: the server generates the chat id and owns the chat record (bound to the user and org), and only issues a session token for a chat the requesting user owns. A new chat opens in a draft composer; the first message creates the session via chat.startHeadStart, so the client never chooses the id. Also hardens the repo-read tools with collision-resistant workspace keys and a realpath guard against symlink escapes.
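A realpath guard of the kind mentioned above can be sketched as follows. The helper name and the temp-directory demo are hypothetical; the idea is to resolve symlinks on both the workspace root and the target, then require the target to stay under the root.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Resolve a requested path and reject it if, after following symlinks, it leaves the root.
// Throws if the path does not exist (realpathSync requires an existing path).
function resolveInsideRoot(root: string, requested: string): string {
  const realRoot = fs.realpathSync(root);
  const realTarget = fs.realpathSync(path.resolve(realRoot, requested));
  const rel = path.relative(realRoot, realTarget);
  if (rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel)) {
    throw new Error(`path escapes workspace: ${requested}`);
  }
  return realTarget;
}

// Demo workspace: one real file, one file outside, and a symlink pointing outside.
const base = fs.mkdtempSync(path.join(os.tmpdir(), "agent-ws-"));
const root = path.join(base, "repo");
fs.mkdirSync(root);
fs.writeFileSync(path.join(root, "task.ts"), "export {};\n");
fs.writeFileSync(path.join(base, "secret.txt"), "outside\n");
fs.symlinkSync(path.join(base, "secret.txt"), path.join(root, "link.txt"));

function attempt(requested: string): "ok" | "blocked" {
  try {
    resolveInsideRoot(root, requested);
    return "ok";
  } catch {
    return "blocked";
  }
}

const results = [attempt("task.ts"), attempt("../secret.txt"), attempt("link.txt")];
```

A plain string-prefix check on the unresolved path would let `link.txt` through, which is why both sides are passed through realpath first.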
0d6821e to 8a37eda
- Log head-start warm-step failures instead of swallowing them, so a failed first turn is visible in the server logs.
- Guard the new-chat create against stale responses so a slow create can't replace a chat the user just switched to.
- Scope the chat ownership check by organization, not just by user.
- Clear the head start idle timer on the warm-step failure path (it was only cleared on the success path).
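The stale-response guard can be sketched with a generation counter: every user action bumps the counter, and a create result is applied only if the counter has not moved since the create began. `PanelState` and `createChatSelector` are hypothetical names, not the panel's actual code.

```typescript
interface PanelState {
  activeChatId: string | null;
}

function createChatSelector(state: PanelState) {
  let generation = 0;
  return {
    // User picks an existing chat: this invalidates any create still in flight.
    select(chatId: string): void {
      generation += 1;
      state.activeChatId = chatId;
    },
    // Start a create; the returned commit applies the result only if it is still current.
    beginCreate(): (createdChatId: string) => boolean {
      generation += 1;
      const mine = generation;
      return (createdChatId) => {
        if (mine !== generation) return false; // stale response: drop it
        state.activeChatId = createdChatId;
        return true;
      };
    },
  };
}
```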
Summary
Adds an in-dashboard AI agent: a chat panel, reachable from any environment
page, that answers questions about your runs, errors, tasks, and analytics,
diagnoses why a run failed, charts your data, reads your connected repo's
source, and answers product and how-to questions. It is gated behind the
hasDashboardAgentAccess feature flag (global or per-org, default off), so
this PR ships disabled: the launcher is hidden unless the flag is enabled.
Design
The agent runs as a standalone chat.agent Trigger task in its own internal
package, with no access to the webapp database, Prisma, or ClickHouse. It reads
the user's data over the public API, acting as the user via a short-lived
delegated user-actor token minted server-side each turn (never in the browser),
building on #3997. The
error and analytics tools use #4005
and the TRQL query API.
The first turn of a new chat streams from a warm webapp route (Head Start) while
the durable agent boots in parallel. Structured answers (a run-failure diagnosis
card, a live chart) render through a small typed view catalog rather than
arbitrary markup. A knowledge lane forwards product and how-to questions to the
support assistant.
Conversation history lives in a separate Drizzle-backed store on its own
Postgres schema, kept as a display read-model so it can never corrupt the
agent's model context.
The SDK changes add an apiClient option to chat.createStartSessionAction and
chat.headStart, and keep the Head Start tool-approval tail intact across a
custom prepareMessages hook so prompt caching and Head Start compose.