Draft: Fix runAsUser error, migration ordering, and details during session errors#1592
Draft: Fix runAsUser error, migration ordering, and details during session errors#1592bsquizz wants to merge 2 commits into
Conversation
…ering Add runAsUser: 1000 to pod securityContext and bake a non-root appuser into the Dockerfile so the image satisfies the runAsNonRoot requirement without relying solely on the manifest override. Fix typedFKMigration ID from 202505130001 to 202605150001 so it sorts after the 202603100138 migration that creates the role_bindings table; the old ID caused fresh-cluster deployments to fail with "relation role_bindings does not exist". Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Return 401/403 to the caller when the K8s dynamic client gets an Unauthorized or Forbidden error during session listing, rather than masking both as 500. Applies to both the initial List call and the paginated continue call. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
✅ Deploy Preview for cheerful-kitten-f556a0 canceled.
|
📝 WalkthroughWalkthroughThe PR hardens the ambient-api-server container to run as non-root user UID 1000 across both the Docker image and Kubernetes deployment manifest. It also improves error response granularity in the session handler by distinguishing between Kubernetes API authorization failures (401/403) and generic failures (500). A migration ID is bumped in roleBindings. ChangesNon-root user and session error handling
🚥 Pre-merge checks | ✅ 5 | ❌ 3❌ Failed checks (3 warnings)
✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
✨ Simplify code
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
components/manifests/base/core/ambient-api-server-service.yml (1)
59-59: 🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy liftSet
readOnlyRootFilesystem: truein both container security contexts.Both the init container (line 59) and main container (line 160) have
readOnlyRootFilesystem: false, which violates the coding guideline. If the containers need writable filesystem areas (e.g.,/tmp), addemptyDirvolume mounts for those specific paths.As per coding guidelines, "All containers must have restricted SecurityContext:
runAsNonRoot, dropALLcapabilities,readOnlyRootFilesystem".🔒 Suggested security context updates
securityContext: allowPrivilegeEscalation: false - readOnlyRootFilesystem: false + readOnlyRootFilesystem: true capabilities: drop: - ALLIf writable paths are needed, add volume mounts:
initContainers: - name: migration # ... existing config ... volumeMounts: - name: db-secrets mountPath: /secrets/db + - name: tmp + mountPath: /tmpvolumes: - name: db-secrets secret: secretName: ambient-api-server-db + - name: tmp + emptyDir: {}Apply the same pattern to the main
api-servercontainer.Also applies to: 160-160
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@components/manifests/base/core/ambient-api-server-service.yml` at line 59, Update the containers' securityContext: set readOnlyRootFilesystem: true for the initContainer and the main api-server container (the securityContext blocks currently setting readOnlyRootFilesystem: false), and if either container needs writable paths (e.g., /tmp) add an emptyDir volume and mount it at that path; also ensure the securityContext includes runAsNonRoot: true and capabilities: drop: ["ALL"] to match the guideline.
🧹 Nitpick comments (1)
components/ambient-api-server/plugins/roleBindings/migration.go (1)
77-88: ⚡ Quick winCollect or log rollback errors instead of discarding.
The rollback silently discards all errors with
_ = tx.Exec(...).Error. This makes diagnosing partial rollback failures difficult.♻️ Proposed fix to collect errors
Rollback: func(tx *gorm.DB) error { - _ = tx.Exec(`DROP INDEX IF EXISTS idx_role_bindings_credential_id`).Error - _ = tx.Exec(`DROP INDEX IF EXISTS idx_role_bindings_session_id`).Error - _ = tx.Exec(`DROP INDEX IF EXISTS idx_role_bindings_agent_id`).Error - _ = tx.Exec(`DROP INDEX IF EXISTS idx_role_bindings_project_id`).Error - _ = tx.Exec(`ALTER TABLE role_bindings DROP COLUMN IF EXISTS credential_id`).Error - _ = tx.Exec(`ALTER TABLE role_bindings DROP COLUMN IF EXISTS session_id`).Error - _ = tx.Exec(`ALTER TABLE role_bindings DROP COLUMN IF EXISTS agent_id`).Error - _ = tx.Exec(`ALTER TABLE role_bindings DROP COLUMN IF EXISTS project_id`).Error - _ = tx.Exec(`ALTER TABLE role_bindings ALTER COLUMN user_id SET NOT NULL`).Error - return tx.Exec(`ALTER TABLE role_bindings ADD COLUMN IF NOT EXISTS scope_id TEXT`).Error + var errs []error + if err := tx.Exec(`DROP INDEX IF EXISTS idx_role_bindings_credential_id`).Error; err != nil { + errs = append(errs, err) + } + if err := tx.Exec(`DROP INDEX IF EXISTS idx_role_bindings_session_id`).Error; err != nil { + errs = append(errs, err) + } + if err := tx.Exec(`DROP INDEX IF EXISTS idx_role_bindings_agent_id`).Error; err != nil { + errs = append(errs, err) + } + if err := tx.Exec(`DROP INDEX IF EXISTS idx_role_bindings_project_id`).Error; err != nil { + errs = append(errs, err) + } + if err := tx.Exec(`ALTER TABLE role_bindings DROP COLUMN IF EXISTS credential_id`).Error; err != nil { + errs = append(errs, err) + } + if err := tx.Exec(`ALTER TABLE role_bindings DROP COLUMN IF EXISTS session_id`).Error; err != nil { + errs = append(errs, err) + } + if err := tx.Exec(`ALTER TABLE role_bindings DROP COLUMN IF EXISTS agent_id`).Error; err != nil { + errs = append(errs, err) + } + if err := tx.Exec(`ALTER TABLE role_bindings DROP COLUMN IF EXISTS project_id`).Error; err != nil { + errs = append(errs, err) + } + if err := tx.Exec(`ALTER TABLE role_bindings ALTER COLUMN user_id SET NOT NULL`).Error; err != nil { + errs = append(errs, err) + } + if err := tx.Exec(`ALTER TABLE role_bindings ADD COLUMN IF NOT EXISTS scope_id TEXT`).Error; err != nil { + errs = append(errs, err) + } + if len(errs) > 0 { + return fmt.Errorf("rollback encountered %d error(s): %v", len(errs), errs) + } + return nil }As per coding guidelines: Never silently swallow partial failures; every error path must propagate or be explicitly collected, never discarded.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@components/ambient-api-server/plugins/roleBindings/migration.go` around lines 77 - 88, The Rollback function currently discards all tx.Exec(...).Error results; change it to collect each error from the calls inside Rollback (the tx.Exec calls that DROP INDEX, DROP COLUMN, ALTER COLUMN, and ADD COLUMN) into an error accumulator (e.g., a slice or use errors.Join) and at the end return a single non-nil error if any step failed; ensure the final return does not overwrite earlier errors but is combined with them so partial rollback failures are not silently swallowed (update Rollback in migration.go where Rollback: func(tx *gorm.DB) error { ... } is defined).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@components/ambient-api-server/plugins/roleBindings/migration.go`:
- Line 35: The Rollback function for the migration with ID "202605150001"
currently discards errors from the tx.Exec calls (the `_` assignments in
Rollback), so update Rollback(func(tx *gorm.DB)) to collect each error into a
slice (e.g., var errs []error) by appending err when a tx.Exec(...).Error != nil
for every DROP/DDL operation, and at the end return a combined error (e.g.,
fmt.Errorf("rollback errors: %v", errs)) if len(errs)>0, otherwise return nil;
ensure every tx.Exec in Rollback is changed to check and append its error
instead of ignoring it.
---
Outside diff comments:
In `@components/manifests/base/core/ambient-api-server-service.yml`:
- Line 59: Update the containers' securityContext: set readOnlyRootFilesystem:
true for the initContainer and the main api-server container (the
securityContext blocks currently setting readOnlyRootFilesystem: false), and if
either container needs writable paths (e.g., /tmp) add an emptyDir volume and
mount it at that path; also ensure the securityContext includes runAsNonRoot:
true and capabilities: drop: ["ALL"] to match the guideline.
---
Nitpick comments:
In `@components/ambient-api-server/plugins/roleBindings/migration.go`:
- Around line 77-88: The Rollback function currently discards all
tx.Exec(...).Error results; change it to collect each error from the calls
inside Rollback (the tx.Exec calls that DROP INDEX, DROP COLUMN, ALTER COLUMN,
and ADD COLUMN) into an error accumulator (e.g., a slice or use errors.Join) and
at the end return a single non-nil error if any step failed; ensure the final
return does not overwrite earlier errors but is combined with them so partial
rollback failures are not silently swallowed (update Rollback in migration.go
where Rollback: func(tx *gorm.DB) error { ... } is defined).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: b6dea970-049f-4784-949f-0af55fe6a235
📒 Files selected for processing (4)
components/ambient-api-server/Dockerfilecomponents/ambient-api-server/plugins/roleBindings/migration.gocomponents/backend/handlers/sessions.gocomponents/manifests/base/core/ambient-api-server-service.yml
| func typedFKMigration() *gormigrate.Migration { | ||
| return &gormigrate.Migration{ | ||
| ID: "202505130001", | ||
| ID: "202605150001", |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check if the old migration ID exists in version control history or deployed environments
# Search for the old migration ID in git history
echo "=== Checking git history for old migration ID ==="
git log --all --oneline --grep="202505130001" || echo "No commits mention old ID"
# Search for migration tracking in the codebase
echo -e "\n=== Checking for migration version tracking ==="
rg -n "202505130001" --type=go
# Look for any migration state or version files
echo -e "\n=== Looking for migration state files ==="
fd -e json -e yaml -e yml migration | head -20Repository: ambient-code/platform
Length of output: 299
🏁 Script executed:
cat -n components/ambient-api-server/plugins/roleBindings/migration.goRepository: ambient-code/platform
Length of output: 4195
🏁 Script executed:
git log --oneline --all -- components/ambient-api-server/plugins/roleBindings/migration.go | head -20Repository: ambient-code/platform
Length of output: 328
🏁 Script executed:
git show 5556220:components/ambient-api-server/plugins/roleBindings/migration.go | head -50Repository: ambient-code/platform
Length of output: 1630
Fix Rollback function to propagate all errors.
The migration ID change from 202505130001 to 202605150001 is confirmed and will trigger a rerun if already deployed. While the Migrate function is idempotent (all DDL uses IF EXISTS/IF NOT EXISTS guards), the Rollback function silently discards all errors except the last one (lines 78–86). Per guidelines, errors must be propagated or explicitly collected, never discarded. If any operation fails during rollback, the failure is invisible and leaves the schema in an inconsistent state.
Collect all rollback errors instead of silencing them with _ assignment:
Suggested fix
Rollback: func(tx *gorm.DB) error {
var errs []error
if err := tx.Exec(`DROP INDEX IF EXISTS idx_role_bindings_credential_id`).Error; err != nil {
errs = append(errs, err)
}
if err := tx.Exec(`DROP INDEX IF EXISTS idx_role_bindings_session_id`).Error; err != nil {
errs = append(errs, err)
}
// ... continue for all operations
if len(errs) > 0 {
return fmt.Errorf("rollback errors: %v", errs)
}
return nil
},🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@components/ambient-api-server/plugins/roleBindings/migration.go` at line 35,
The Rollback function for the migration with ID "202605150001" currently
discards errors from the tx.Exec calls (the `_` assignments in Rollback), so
update Rollback(func(tx *gorm.DB)) to collect each error into a slice (e.g., var
errs []error) by appending err when a tx.Exec(...).Error != nil for every
DROP/DDL operation, and at the end return a combined error (e.g.,
fmt.Errorf("rollback errors: %v", errs)) if len(errs)>0, otherwise return nil;
ensure every tx.Exec in Rollback is changed to check and append its error
instead of ignoring it.
Summary
runAsNonRootcompliance in ambient-api-server: Addedshadow-utilsand a dedicatedappuser(UID 1000) to the Dockerfile so the image satisfiesrunAsNonRootwithout depending solely on the manifest override. AddedrunAsUser: 1000to the podsecurityContextto make the UID explicit.roleBindings: CorrectedtypedFKMigrationID from202505130001to202605150001so it sorts after the202603100138migration that creates therole_bindingstable. The wrong ID caused fresh-cluster deployments to fail with "relation role_bindings does not exist".ListSessions: Instead of masking all K8s errors as 500, the handler now returns 401 for expired/invalid tokens and 403 for insufficient permissions. Applies to both the initial list call and the paginated continue call.Test plan
ambient-api-serverstarts without migration errorskubectl describe podshows the container running as UID 1000 withrunAsNonRoot: trueListSessionsand verify the response is 401, not 500🤖 Generated with Claude Code