test(e2e): add insights and online-insights lifecycle tests#1599

Open
notgitika wants to merge 3 commits into
aws:mainfrom
notgitika:test/insights-e2e
@notgitika

@notgitika notgitika commented Jun 22, 2026

Contributor

This PR adds e2e tests for the recently shipped insights feature. Two independent describe.sequential suites live in a single file (e2e-tests/insights-lifecycle.test.ts), each owning its own deployed agent and CFN stack so a deploy failure in one suite does not blank the other.

Suite A: online-insights lifecycle
add online-insights → deploy → invoke → pause → resume → teardown. Mirrors evals-lifecycle.test.ts. Verifies live executionStatus toggling through the control plane.

Suite B: run insights + recommendation chain
deploy → invoke → run insights (async) → view list → view detail → archive → run insights --wait → run recommendation --from-insights → teardown. Covers:

  • Async submission of run insights (no --wait) and local job record
  • view insights list + per-id detail
  • archive insights round-trip (job no longer appears in view insights)
  • run insights --wait reaching a terminal status
  • run recommendation --from-insights chaining off the completed insights job

The chain step accepts either success: true or a service error containing no sessions / completed_with_errors / failed / empty, since real trace volume is not guaranteed seconds after invoke. Flag-parsing errors (e.g. Unknown option, --from-insights) still fail the test hard, so the wiring is verified even when the upstream job has nothing to learn from.
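The acceptance rule for the chain step can be sketched as a small classifier. This is a minimal illustration and not the code in the PR: the `CliJsonResult` shape, the `classifyChainResult` name, and the case-insensitive substring matching are assumptions; only the marker strings come from the description above.

```typescript
// Hypothetical shape of the CLI's --json output; only `success` and `error` are assumed.
interface CliJsonResult {
  success: boolean;
  error?: string;
}

// Substrings the description lists as acceptable "nothing to learn from" outcomes.
const TOLERATED_ERROR_MARKERS = ["no sessions", "completed_with_errors", "failed", "empty"];

// Substrings that point at broken flag wiring and must always fail the test.
const WIRING_ERROR_MARKERS = ["unknown option", "--from-insights"];

type ChainOutcome = "ok" | "tolerated" | "fail";

function classifyChainResult(result: CliJsonResult): ChainOutcome {
  if (result.success) return "ok";
  const message = (result.error ?? "").toLowerCase();
  // Check wiring errors first: a message such as "failed to parse --from-insights"
  // also contains the tolerated marker "failed" and must not slip through.
  if (WIRING_ERROR_MARKERS.some((marker) => message.includes(marker))) return "fail";
  if (TOLERATED_ERROR_MARKERS.some((marker) => message.includes(marker))) return "tolerated";
  // Any other error (permissions, throttling, ...) is unexpected and fails.
  return "fail";
}
```

Checking the wiring markers before the tolerated ones is the design choice that matters here, because the tolerated list is broad enough to overlap with genuine parser failures.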

Adds end-to-end coverage for the Lens/Insights feature shipped in the NYS
summit release. Two independent sequential suites in
e2e-tests/insights-lifecycle.test.ts, each owning its own deployed agent
and CFN stack so a deploy failure in one suite does not blank the other:

- online-insights lifecycle: add online-insights -> deploy -> invoke ->
  pause -> resume -> teardown. Verifies live executionStatus toggling
  through the control plane.
- run-insights and recommendation chain: deploy -> invoke ->
  run insights (async) -> view list -> view detail -> archive ->
  run insights --wait -> run recommendation --from-insights -> teardown.
  Covers async submission, local job storage, view/archive round-trip,
  and the chain from a completed insights job into a system-prompt
  recommendation. The chain step accepts either success or a service
  error indicating the upstream job had no usable sessions, since real
  trace volume is not guaranteed seconds after invoke; flag-parsing
  errors fail hard.
@github-actions github-actions Bot added the size/m PR size: M label Jun 22, 2026
@github-actions github-actions Bot added the agentcore-harness-reviewing AgentCore Harness review in progress label Jun 22, 2026
@agentcore-devx-automation agentcore-devx-automation Bot added the claude-security-reviewing Claude Code /security-review in progress label Jun 22, 2026
@agentcore-devx-automation

Contributor

Claude Security Review: no high-confidence findings. (run)

@agentcore-devx-automation agentcore-devx-automation Bot removed the claude-security-reviewing Claude Code /security-review in progress label Jun 22, 2026
@github-actions

Contributor

Package Tarball

aws-agentcore-0.20.2.tgz

How to install

gh release download pr-1599-tarball --repo aws/agentcore-cli --pattern "*.tgz" --dir /tmp/pr-tarball
npm install -g /tmp/pr-tarball/aws-agentcore-0.20.2.tgz

@agentcore-cli-automation agentcore-cli-automation left a comment

Test additions look thorough and follow the existing evals-lifecycle.test.ts pattern. One blocker before merge: the view insights <id> test asserts a region field that the CLI doesn't actually emit. Otherwise the wiring against the real commands looks correct.

Comment thread e2e-tests/insights-lifecycle.test.ts Outdated
@github-actions github-actions Bot removed the agentcore-harness-reviewing AgentCore Harness review in progress label Jun 22, 2026
JobRecordBase does not store region — region is parsed from arn. The
view insights <id> --json output therefore has no region field. Switch
the detail-call assertion to match arn against the bedrock-agentcore
ARN shape and update the InsightsJobJson interface accordingly.
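A minimal sketch of matching an ARN shape and recovering the region from it, assuming the standard `arn:partition:service:region:account-id:resource` layout. The regex, the `regionFromArn` helper, and the example resource path are illustrative assumptions, not the assertion used in the test file.

```typescript
// Assumed pattern: standard ARN layout with the bedrock-agentcore service segment.
// Capture groups: 1 = region, 2 = 12-digit account id, 3 = resource part.
const AGENTCORE_ARN = /^arn:aws[a-z-]*:bedrock-agentcore:([a-z0-9-]+):(\d{12}):(.+)$/;

// Hypothetical helper: returns the region segment, or undefined if the ARN does not match.
function regionFromArn(arn: string): string | undefined {
  const match = AGENTCORE_ARN.exec(arn);
  return match ? match[1] : undefined;
}

// The resource path below is made up for illustration only.
const exampleArn = "arn:aws:bedrock-agentcore:us-west-2:123456789012:example-job/abc123";
console.log(AGENTCORE_ARN.test(exampleArn), regionFromArn(exampleArn));
```

Asserting on the ARN shape keeps the test tied to a field the job record actually stores, while still implicitly checking that a region is present.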
@github-actions github-actions Bot added size/m PR size: M and removed size/m PR size: M labels Jun 22, 2026
@agentcore-devx-automation agentcore-devx-automation Bot added the claude-security-reviewing Claude Code /security-review in progress label Jun 22, 2026
@agentcore-devx-automation

Contributor

Claude Security Review: no high-confidence findings. (run)

@agentcore-devx-automation agentcore-devx-automation Bot removed the claude-security-reviewing Claude Code /security-review in progress label Jun 22, 2026
@notgitika notgitika marked this pull request as ready for review June 23, 2026 18:06
@notgitika notgitika requested review from a team and agentcore-cli-automation June 23, 2026 18:06
…on step

The BatchEvaluation API rejects `--insights` and `--evaluator` together
("evaluators and insights are mutually exclusive"), so the --wait submit
exited 1 with success:false and waitJobId stayed undefined. The follow-up
recommendation chain test then ran with id=undefined and tripped the
recommendation handler's 'exactly one evaluator' check. Pass --evaluator
on the recommendation invocation instead, where the handler actually
requires it for system-prompt type.
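The failure mode described above (an undefined job id flowing into the next step and surfacing as an unrelated error) can be guarded against with an explicit check. This is a sketch under assumptions: `requireJobId` and `recommendationArgs` are hypothetical helpers, and the argument order is a guess at how the flags named in this PR might be combined, not the CLI's documented syntax.

```typescript
// Hypothetical guard: fail with an accurate message when an earlier step produced no id.
function requireJobId(id: string | undefined, producedBy: string): string {
  if (!id) {
    throw new Error(`no job id available: "${producedBy}" did not succeed`);
  }
  return id;
}

// Assumed argument layout; --evaluator goes on the recommendation call only,
// because the insights submission rejects it alongside --insights.
function recommendationArgs(jobId: string | undefined, evaluator: string): string[] {
  const id = requireJobId(jobId, "run insights --wait");
  return ["run", "recommendation", "--from-insights", id, "--evaluator", evaluator, "--json"];
}
```

With a guard like this, a failed `--wait` submission would be reported as a missing job id rather than as the recommendation handler's evaluator check.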
@github-actions github-actions Bot added size/m PR size: M and removed size/m PR size: M labels Jun 23, 2026
@agentcore-devx-automation agentcore-devx-automation Bot added the claude-security-reviewing Claude Code /security-review in progress label Jun 23, 2026
@agentcore-devx-automation

Contributor

Claude Security Review: no high-confidence findings. (run)

@agentcore-devx-automation agentcore-devx-automation Bot removed the claude-security-reviewing Claude Code /security-review in progress label Jun 23, 2026
Labels

size/m PR size: M

2 participants