Skip to content

feat(rest-api): Support for adding Tenant information when reported Issue for Delete Instance#2829

Open
hwadekar-nv wants to merge 1 commit into
NVIDIA:mainfrom
hwadekar-nv:fix/tenant-deleted-issue
Open

feat(rest-api): Support for adding Tenant information when reported Issue for Delete Instance#2829
hwadekar-nv wants to merge 1 commit into
NVIDIA:mainfrom
hwadekar-nv:fix/tenant-deleted-issue

Conversation

@hwadekar-nv

@hwadekar-nv hwadekar-nv commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

As detailed in the description provided here: #2107

This PR aims to address the following issues:

  1. When a Tenant Repair Issue is present, the ReleaseInstance call should include Tenant and User information as a JSON string.
  2. A unit test has been added to verify that the ReleaseInstance request contains Tenant and User information when a Tenant Repair Issue is present.

Type of Change

  • Add - New feature or capability

Testing

  • Unit tests added/updated

@hwadekar-nv hwadekar-nv self-assigned this Jun 23, 2026
@hwadekar-nv hwadekar-nv requested a review from a team as a code owner June 23, 2026 23:37
@hwadekar-nv hwadekar-nv force-pushed the fix/tenant-deleted-issue branch from 0fc7298 to b086bab Compare June 23, 2026 23:38
@coderabbitai

coderabbitai Bot commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Summary by CodeRabbit

  • New Features
    • Instance deletion now captures attribution details, including deletion initiator information and tenant-reported issues
    • Enhanced deletion workflows with structured tracking of who initiated the deletion and any associated issue details for improved auditing

Walkthrough

A canonical delete-attribution JSON payload structure (InstanceDeleteAttributionConfig) is introduced in the instance model, along with ToInstanceDeleteAttributionConfig to build and serialize it from the requesting user and instance context. The delete handler is updated to invoke this method and assign the resulting JSON string to releaseInstanceRequest.Issue.Details when an issue is present.

Changes

Instance Delete Attribution

Layer / File(s) Summary
Attribution model types and serialization
rest-api/api/pkg/api/model/instance.go, rest-api/api/pkg/api/model/instance_test.go
Adds DeleteAttributionIssueDetailsMaxLen constant, InstanceDeleteInitiatedBy, InstanceDeleteAttributionTenantReported, and InstanceDeleteAttributionConfig structs, and the ToInstanceDeleteAttributionConfig method that maps MachineHealthIssue.Category to uppercase, conditionally includes details, and serializes to JSON. Table-driven unit tests cover all omission and mapping cases for initiated_by and tenant_reported.
Handler wiring and integration test
rest-api/api/pkg/api/handler/instance.go, rest-api/api/pkg/api/handler/instance_test.go
Conditionally calls ToInstanceDeleteAttributionConfig and assigns the result to releaseInstanceRequest.Issue.Details; returns a 500 response on failure. Integration test asserts non-empty status detail records and validates attribution fields extracted from the recorded Site-controller workflow call.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately captures the primary feature addition: supporting tenant information inclusion in delete instance operations when a repair issue is reported.
Description check ✅ Passed The description is directly related to the changeset, clearly mapping to the implemented feature for including tenant and user information in ReleaseInstance calls.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands.

@github-actions

github-actions Bot commented Jun 23, 2026

Copy link
Copy Markdown

🔍 Container Scan Summary

Service Total Critical High Medium Low Other
nico-flow 44 3 22 9 2 8
nico-nsm 51 0 12 31 8 0
nico-psm 46 3 24 9 2 8
nico-rest-api 46 3 24 9 2 8
nico-rest-cert-manager 43 3 22 9 1 8
nico-rest-db 44 3 22 9 2 8
nico-rest-site-agent 43 3 22 9 1 8
nico-rest-site-manager 43 3 22 9 1 8
nico-rest-workflow 44 3 22 9 2 8
TOTAL 404 24 192 103 21 64

Per-CVE detail lives in the per-service grype-* artifacts (JSON + SARIF). Severity counts only — no CVE IDs published here.

@github-actions

Copy link
Copy Markdown

@github-actions

Copy link
Copy Markdown

🔐 TruffleHog Secret Scan

No secrets or credentials found!

Your code has been scanned for 700+ types of secrets and credentials. All clear! 🎉

🔗 View scan details

🕐 Last updated: 2026-06-23 23:40:21 UTC | Commit: 0fc7298

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@rest-api/api/pkg/api/handler/instance_test.go`:
- Around line 9726-9750: Replace the conditional guard that checks if tsc.Calls
has elements and if Arguments has sufficient length with mandatory require
statements. Use require.NotEmpty(t, tsc.Calls) to ensure the test fails if calls
are missing when MachineHealthIssue is present, then safely access the Arguments
array without the length check. This makes the workflow-attribution assertions
mandatory rather than allowing them to be silently skipped, ensuring regressions
are caught deterministically.

In `@rest-api/api/pkg/api/handler/instance.go`:
- Around line 4906-4909: The error handling block in the transaction closure
(where derr is checked) is returning cutil.NewAPIErrorResponse(c,
http.StatusInternalServerError, "Failed to delete Instance", nil), but inside
WithTx closures you should return cutil.NewAPIError instead and let the outer
handler translate it after transaction unwind. Replace the
cutil.NewAPIErrorResponse call with cutil.NewAPIError, removing the context
parameter c and keeping only the status code, message, and data parameters.

In `@rest-api/api/pkg/api/model/instance_test.go`:
- Around line 2688-2695: The test currently only validates top-level keys in the
parsed JSON structure but does not check the nested structure of tenant_reported
to ensure the omitempty contract is enforced for the details field. After the
existing assertions that validate wantRawKeys and wantAbsentKeys at the top
level, add additional assertions that unmarshal the tenant_reported object from
the parsed map and verify that the details field is properly absent (not null)
when it should be omitted according to the omitempty semantics for nil or empty
detail cases.

In `@rest-api/api/pkg/api/model/instance.go`:
- Around line 1651-1678: The ToInstanceDeleteAttributionConfig method serializes
the config to JSON and returns it as a string, but does not validate that the
resulting serialized payload respects the DeleteAttributionIssueDetailsMaxLen
size contract. After the json.Marshal call succeeds and before the return
statement, add a length validation check to ensure the serialized string data
does not exceed DeleteAttributionIssueDetailsMaxLen. If the length exceeds the
limit, return an empty string and an appropriate error to prevent cross-service
contract violations when this attribution payload is used as Issue.Details
elsewhere.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: bc6cda04-ca4d-4afe-9d9b-12145bc8d98f

📥 Commits

Reviewing files that changed from the base of the PR and between 8f81824 and b086bab.

📒 Files selected for processing (4)
  • rest-api/api/pkg/api/handler/instance.go
  • rest-api/api/pkg/api/handler/instance_test.go
  • rest-api/api/pkg/api/model/instance.go
  • rest-api/api/pkg/api/model/instance_test.go

Comment on lines +9726 to +9750
if len(tsc.Calls) > 0 && len(tsc.Calls[len(tsc.Calls)-1].Arguments) > 3 {
releaseReq := tsc.Calls[len(tsc.Calls)-1].Arguments[3].(*cwssaws.InstanceReleaseRequest)
require.NotNil(t, releaseReq.Issue)
require.NotNil(t, releaseReq.Issue.Details)
var issueDetails model.InstanceDeleteAttributionConfig
require.NoError(t, json.Unmarshal([]byte(releaseReq.Issue.Details), &issueDetails))
assert.Equal(t, tt.args.reqOrg, issueDetails.InitiatedBy.Org)
assert.Equal(t, tt.args.reqUser.ID.String(), issueDetails.InitiatedBy.UserID)
assert.Equal(t, dinstance.TenantID.String(), issueDetails.InitiatedBy.TenantID)
assert.Equal(t, tt.args.reqOrg, issueDetails.InitiatedBy.TenantOrg)
require.NotNil(t, issueDetails.TenantReported)
assert.Equal(t, strings.ToUpper(tt.args.reqData.MachineHealthIssue.Category), issueDetails.TenantReported.Category)
if tt.args.reqData.MachineHealthIssue.Summary != nil {
assert.Equal(t, *tt.args.reqData.MachineHealthIssue.Summary, issueDetails.TenantReported.Summary)
}
if tt.args.reqData.MachineHealthIssue.Details != nil {
assert.Equal(t, *tt.args.reqData.MachineHealthIssue.Details, *issueDetails.TenantReported.Details)
}
if issueDetails.TenantReported.Details == nil {
assert.Nil(t, issueDetails.TenantReported.Details)
} else {
require.NotNil(t, issueDetails.TenantReported.Details)
assert.Equal(t, *issueDetails.TenantReported.Details, *tt.args.reqData.MachineHealthIssue.Details)
}
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Make workflow-attribution assertions mandatory for the machine-health test path.

The conditional guard around tsc.Calls allows this branch to be skipped silently. For cases where MachineHealthIssue is present, require call presence first (require.NotEmpty) and then assert arguments, so regressions fail deterministically.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@rest-api/api/pkg/api/handler/instance_test.go` around lines 9726 - 9750,
Replace the conditional guard that checks if tsc.Calls has elements and if
Arguments has sufficient length with mandatory require statements. Use
require.NotEmpty(t, tsc.Calls) to ensure the test fails if calls are missing
when MachineHealthIssue is present, then safely access the Arguments array
without the length check. This makes the workflow-attribution assertions
mandatory rather than allowing them to be silently skipped, ensuring regressions
are caught deterministically.

Comment on lines +4906 to +4909
if derr != nil {
logger.Error().Err(derr).Msg("error building delete attribution config")
return cutil.NewAPIErrorResponse(c, http.StatusInternalServerError, "Failed to delete Instance", nil)
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Use transaction-scoped error construction inside the WithTx closure.

At Line 4908, this branch returns cutil.NewAPIErrorResponse(...) from inside the transaction closure. The closure should return cutil.NewAPIError(...) and let the outer handler path translate it after transaction unwind.

💡 Proposed fix
-				return cutil.NewAPIErrorResponse(c, http.StatusInternalServerError, "Failed to delete Instance", nil)
+				return cutil.NewAPIError(http.StatusInternalServerError, "Failed to delete Instance", nil)

As per path instructions, "Inside transaction closures, return cutil.NewAPIError(code, msg, data) for errors; outside, return cutil.NewAPIErrorResponse(c, code, msg, data) directly."

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if derr != nil {
logger.Error().Err(derr).Msg("error building delete attribution config")
return cutil.NewAPIErrorResponse(c, http.StatusInternalServerError, "Failed to delete Instance", nil)
}
if derr != nil {
logger.Error().Err(derr).Msg("error building delete attribution config")
return cutil.NewAPIError(http.StatusInternalServerError, "Failed to delete Instance", nil)
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@rest-api/api/pkg/api/handler/instance.go` around lines 4906 - 4909, The error
handling block in the transaction closure (where derr is checked) is returning
cutil.NewAPIErrorResponse(c, http.StatusInternalServerError, "Failed to delete
Instance", nil), but inside WithTx closures you should return cutil.NewAPIError
instead and let the outer handler translate it after transaction unwind. Replace
the cutil.NewAPIErrorResponse call with cutil.NewAPIError, removing the context
parameter c and keeping only the status code, message, and data parameters.

Source: Path instructions

Comment on lines +2688 to +2695
var parsed map[string]json.RawMessage
require.NoError(t, json.Unmarshal([]byte(raw), &parsed))
for _, key := range tt.wantRawKeys {
assert.Contains(t, parsed, key)
}
for _, key := range tt.wantAbsentKeys {
assert.NotContains(t, parsed, key)
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

Add nested raw-key assertions for tenant_reported.details omission semantics.

The current raw JSON check only validates top-level keys. For the nil/empty details cases, the test should also assert that tenant_reported.details is absent (not null) to fully lock the omitempty contract.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@rest-api/api/pkg/api/model/instance_test.go` around lines 2688 - 2695, The
test currently only validates top-level keys in the parsed JSON structure but
does not check the nested structure of tenant_reported to ensure the omitempty
contract is enforced for the details field. After the existing assertions that
validate wantRawKeys and wantAbsentKeys at the top level, add additional
assertions that unmarshal the tenant_reported object from the parsed map and
verify that the details field is properly absent (not null) when it should be
omitted according to the omitempty semantics for nil or empty detail cases.

Comment on lines +1651 to +1678
// ToInstanceDeleteAttributionConfig builds the delete attribution config from the API request.
func (idr *APIInstanceDeleteRequest) ToInstanceDeleteAttributionConfig(user *cdbm.User, instance *cdbm.Instance) (string, error) {
config := InstanceDeleteAttributionConfig{
InitiatedBy: InstanceDeleteInitiatedBy{
Org: instance.Tenant.Org,
UserID: user.ID.String(),
TenantID: instance.Tenant.ID.String(),
TenantOrg: instance.Tenant.Org,
},
}
if idr.MachineHealthIssue != nil {
mhi := idr.MachineHealthIssue
reported := InstanceDeleteAttributionTenantReported{
Category: strings.ToUpper(mhi.Category),
}
if mhi.Summary != nil {
reported.Summary = *mhi.Summary
}
if mhi.Details != nil && *mhi.Details != "" {
reported.Details = mhi.Details
}
config.TenantReported = &reported
}
data, err := json.Marshal(config)
if err != nil {
return "", err
}
return string(data), nil

@coderabbitai coderabbitai Bot Jun 23, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

Enforce the Issue.Details size contract before returning the serialized attribution payload.

At Line 1674, the serialized JSON can exceed DeleteAttributionIssueDetailsMaxLen even when request fields pass validation (Summary and Details can each be 1024). That creates a cross-service contract risk when this string is forwarded as Issue.Details.

💡 Proposed fix
 data, err := json.Marshal(config)
 if err != nil {
 	return "", err
 }
+if len(data) > DeleteAttributionIssueDetailsMaxLen {
+	return "", fmt.Errorf("delete attribution payload exceeds %d bytes", DeleteAttributionIssueDetailsMaxLen)
+}
 return string(data), nil
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@rest-api/api/pkg/api/model/instance.go` around lines 1651 - 1678, The
ToInstanceDeleteAttributionConfig method serializes the config to JSON and
returns it as a string, but does not validate that the resulting serialized
payload respects the DeleteAttributionIssueDetailsMaxLen size contract. After
the json.Marshal call succeeds and before the return statement, add a length
validation check to ensure the serialized string data does not exceed
DeleteAttributionIssueDetailsMaxLen. If the length exceeds the limit, return an
empty string and an appropriate error to prevent cross-service contract
violations when this attribution payload is used as Issue.Details elsewhere.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hwadekar-nv This one seems important?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!


// InstanceDeleteInitiatedBy records who initiated an Instance delete in the cloud API layer.
type InstanceDeleteInitiatedBy struct {
Org string `json:"org"`

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the difference between org and tenant_org?

}

// InstanceDeleteAttributionTenantReported captures tenant-reported issue fields included in delete attribution.
type InstanceDeleteAttributionTenantReported struct {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// InstanceDeleteAttributionConfig is the canonical JSON payload persisted on delete and forwarded to Site when an issue is reported.
type InstanceDeleteAttributionConfig struct {
InitiatedBy InstanceDeleteInitiatedBy `json:"initiated_by"`
TenantReported *InstanceDeleteAttributionTenantReported `json:"tenant_reported,omitempty"`

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we already send this information when Tenant specifies it? Seems like we're duplicating the information?

}

// ToInstanceDeleteAttributionConfig builds the delete attribution config from the API request.
func (idr *APIInstanceDeleteRequest) ToInstanceDeleteAttributionConfig(user *cdbm.User, instance *cdbm.Instance) (string, error) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should call this GetTenantAttributionDetails. Also let's pass user and tenant as argument, we don't seem to use Instance for anything other than accessing Tenant.

@sunilkumar-nvidia

Copy link
Copy Markdown
Contributor

I’d recommend moving delete attribution out of Issue.details and into a new optional delete_attribution field on InstanceReleaseRequest. That keeps details backward-compatible as tenant-provided issue text, works for both issue and no-issue delete flows, and lets Core store attribution as a separate structured field instead of parsing escaped JSON from details. For all-delete audit coverage, the API should populate this field on every delete and also emit/persist structured audit metadata cloud-side.

Something like:
Add a new optional proto field on InstanceReleaseRequest, outside issue:

`
message DeleteInitiatedBy {
string org = 1;
string user_id = 2;
string tenant_id = 3;
string tenant_org = 4;
}

message DeleteAttribution {
DeleteInitiatedBy initiated_by = 1;
}

message InstanceReleaseRequest {
common.InstanceId id = 1;
optional Issue issue = 2;
optional bool is_repair_tenant = 3;
optional DeleteAttribution delete_attribution = 4;
}`

@hwadekar-nv

Copy link
Copy Markdown
Contributor Author

Thanks @thossain-nv @sunilkumar-nvidia, while implementing this, I also felt to have separate field in proto instead using Issue.details, if okey, will add that into proto?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants