fix: validate semantic token edit delete count before allocation (fixes #322571) by vs-code-engineering[bot] · Pull Request #322575 · microsoft/vscode

vs-code-engineering · 2026-06-23T16:31:10Z

Summary

ModelSemanticColoring._setDocumentSemanticTokens applies token edits returned by a DocumentSemanticTokensProvider by allocating a destination buffer sized srcData.length + deltaLength, where deltaLength accumulates (edit.data?.length ?? 0) - edit.deleteCount across all edits. When a provider returns edits whose cumulative deleteCount, net of any inserted data, exceeds the number of Uint32 entries in the previous result, srcData.length + deltaLength becomes negative and new Uint32Array(negativeLength) throws a RangeError, for example "RangeError: Invalid typed array length: -57045".

The method already validates one form of bad edit (edit.start > srcData.length) and recovers by warning and calling setSemanticTokens(null, true) to force a full re-fetch — but that validation runs after the buffer is allocated, so the negative-length case crashes before it is reached. The fix adds the missing validation at the same designated recovery point, before allocation.
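The failure can be reproduced outside the editor in a few lines. The following is a minimal sketch, not the VS Code source: TokenEdit and computeDestLength are hypothetical stand-ins for the edit shape and the accumulation described above, and the numbers are invented for illustration.

```typescript
// Hypothetical stand-in for one semantic token edit (not the VS Code type).
interface TokenEdit {
	start: number;
	deleteCount: number;
	data?: Uint32Array;
}

// Mirrors the accumulation described above: each edit contributes
// (inserted entries) - (deleted entries) to the destination length.
function computeDestLength(srcLength: number, edits: TokenEdit[]): number {
	let deltaLength = 0;
	for (const edit of edits) {
		deltaLength += (edit.data?.length ?? 0) - edit.deleteCount;
	}
	return srcLength + deltaLength;
}

// The previous result holds 10 entries, but the single edit deletes 15.
const destLength = computeDestLength(10, [{ start: 5, deleteCount: 15 }]);

let thrown: unknown = undefined;
try {
	// destLength is -5 here, so the typed array constructor rejects it.
	new Uint32Array(destLength);
} catch (err) {
	thrown = err;
}

console.log(destLength, thrown instanceof RangeError);
```

Because the allocation happens before any per-edit validation, nothing in this path gets a chance to reject the edit first.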

Fixes #322571
Recommended reviewer: @alexdima

Culprit Commit

Not identified — pre-existing. The allocation new Uint32Array(srcData.length + deltaLength) and the surrounding edit-application loop predate the window visible in this shallow checkout (fetch-depth=1), and recent commits touching the file (a listener-leak fix, the ESM migration, and a ResourceMap change) do not modify the deltaLength accumulation or the allocation math. The bug is in the original edit-application logic rather than a recent regression; it surfaces only when a provider emits edits that delete more tokens than exist, which is why it appears intermittently in telemetry rather than tracing to a single change.

Code Flow

flowchart TD
    A["Language extension DocumentSemanticTokensProvider returns SemanticTokensEdits"] --> B["ModelSemanticColoring._setDocumentSemanticTokens"]
    B --> C["Loop accumulates deltaLength += edit.data.length - edit.deleteCount"]
    C --> D["destLength = srcData.length + deltaLength"]
    D -->|"destLength is negative"| E["new Uint32Array(destLength) throws RangeError: Invalid typed array length"]
    D -->|"destLength is non-negative"| F["Allocate destData, validate edit.start, copy and splice tokens"]
    F -->|"edit.start greater than srcData.length"| G["warnInvalidEditStart + setSemanticTokens(null, true) + return"]

The existing recovery branch G proves the intended contract: invalid edits are survivable and should trigger a full re-fetch, not a crash. The negative-length case simply reaches the allocation E before the loop that contains the edit.start check.

Affected Files

src/vs/editor/contrib/semanticTokens/browser/documentSemanticTokens.ts — crash site; the destination buffer is allocated here from the unvalidated, possibly-negative computed length.
src/vs/editor/common/services/semanticTokensProviderStyling.ts — houses the one-time invalid-edit warning helpers (warnInvalidEditStart, warnOverlappingSemanticTokens, warnInvalidLengthSemanticTokens); a sibling helper for this case is added here.

Repro Steps

1. Install/activate a language extension whose DocumentSemanticTokensProvider.provideDocumentSemanticTokensEdits returns a SemanticTokensEdits result.
2. Have it return one or more edits whose combined deleteCount is larger than the number of Uint32 entries in the previously delivered full token result (for example, deleting a trailing range after the document shrank, so the edit removes more entries than remain).
3. The editor applies the edits in _setDocumentSemanticTokens; deltaLength goes sufficiently negative that srcData.length + deltaLength < 0.
4. new Uint32Array(...) throws RangeError: Invalid typed array length, which reaches unhandled-error telemetry.
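As one hypothetical illustration of how a provider might end up emitting such an edit, consider a provider that sizes a trailing deletion from a stale token count. Semantic tokens are encoded as five integers per token, so the entry counts below are multiples of five. The names buildTrailingDeleteEdit and staleTokenCount are invented for this sketch, and the real provider API runs in the extension host, so the code only models the arithmetic.

```typescript
// Semantic tokens are encoded as 5 unsigned integers per token.
const ENTRIES_PER_TOKEN = 5;

// Hypothetical edit shape, modeled on { start, deleteCount }.
interface TrailingEdit {
	start: number;
	deleteCount: number;
}

// Hypothetical buggy provider logic: it deletes everything after `keepTokens`,
// but sizes the deletion from its own (possibly stale) token count.
function buildTrailingDeleteEdit(keepTokens: number, staleTokenCount: number): TrailingEdit {
	return {
		start: keepTokens * ENTRIES_PER_TOKEN,
		deleteCount: (staleTokenCount - keepTokens) * ENTRIES_PER_TOKEN,
	};
}

// The editor holds 2 tokens (10 entries); the provider still believes there are 4.
const editorEntries = 2 * ENTRIES_PER_TOKEN;
const staleEdit = buildTrailingDeleteEdit(1, 4);

// start (5) does not exceed editorEntries (10), so a start-only check passes ...
const startLooksValid = staleEdit.start <= editorEntries;
// ... while the resulting length is 10 - 15 = -5.
const resultingLength = editorEntries - staleEdit.deleteCount;

console.log(staleEdit, startLooksValid, resultingLength);
```

The point of the sketch is that the edit.start validation alone cannot catch this shape of bad edit, since the start offset is in range and only the delete count is not.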

How the Fix Works

Chosen approach.

documentSemanticTokens.ts: Before allocating, compute destDataLength = srcData.length + deltaLength and guard with if (destDataLength < 0). On a negative length, call the new styling.warnInvalidEditDeleteCount(...) to emit a diagnostic, then recover with this._model.tokenization.setSemanticTokens(null, true) and return, which is the identical recovery already used for the edit.start validation a few lines below. This fixes the problem at the place the code itself designates for handling invalid edits, completing validation that was already present but ordered after the allocation. The real producer of the bad data is an out-of-process language extension reachable only across the extension-host boundary, so the producer itself cannot be fixed from core; the correct core behavior for any malformed edit is exactly this reject-and-re-fetch path, and the negative-length case is just another malformed edit.
semanticTokensProviderStyling.ts: Add warnInvalidEditDeleteCount(previousResultId, resultId, srcLength, deltaLength) mirroring the existing warnInvalidEditStart — a one-time _logService.warn gated by a _hasWarnedInvalidEditDeleteCount flag. The violation is surfaced (logged with the offending lengths), never swallowed, so the underlying provider bug remains diagnosable instead of being silently masked.

This is a guard placed upstream of the throw that rejects the invalid input, rather than a try/catch wrapped around the allocation; the error path is never entered, and no existing logService.error/telemetry call is removed or weakened. The edits are discarded and a fresh full result is requested — the data is not coerced into a benign-but-wrong value.
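The guard and its one-time warning can be sketched as follows. This is a simplified model under stated assumptions, not the actual patch: StylingSketch, validateDestLength, and the log and refetch callbacks are hypothetical stand-ins for the styling helper, _logService.warn, and setSemanticTokens(null, true). Only the name warnInvalidEditDeleteCount and its parameter list come from the description above; the message text and the parameter types are invented.

```typescript
// Hypothetical stand-in for the styling helper that owns the one-time warning.
class StylingSketch {
	private hasWarnedInvalidEditDeleteCount = false;
	private readonly log: (message: string) => void;

	constructor(log: (message: string) => void) {
		this.log = log;
	}

	warnInvalidEditDeleteCount(
		previousResultId: string | undefined,
		resultId: string | undefined,
		srcLength: number,
		deltaLength: number
	): void {
		if (this.hasWarnedInvalidEditDeleteCount) {
			return; // warn only once per instance
		}
		this.hasWarnedInvalidEditDeleteCount = true;
		this.log(
			`Invalid semantic tokens edit: previousResultId=${previousResultId}, ` +
			`resultId=${resultId}, srcLength=${srcLength}, deltaLength=${deltaLength}`
		);
	}
}

// Returns the validated destination length, or undefined after requesting a re-fetch.
function validateDestLength(
	styling: StylingSketch,
	previousResultId: string | undefined,
	resultId: string | undefined,
	srcLength: number,
	deltaLength: number,
	refetch: () => void
): number | undefined {
	const destDataLength = srcLength + deltaLength;
	if (destDataLength < 0) {
		styling.warnInvalidEditDeleteCount(previousResultId, resultId, srcLength, deltaLength);
		refetch(); // stands in for setSemanticTokens(null, true)
		return undefined; // the allocation is never reached
	}
	return destDataLength;
}

const messages: string[] = [];
let refetchCount = 0;
const styling = new StylingSketch(message => { messages.push(message); });
const onRefetch = (): void => { refetchCount++; };

const first = validateDestLength(styling, "1", "2", 10, -15, onRefetch);
const second = validateDestLength(styling, "2", "3", 10, -15, onRefetch);
const valid = validateDestLength(styling, "3", "4", 10, -5, onRefetch);

console.log(first, second, valid, messages.length, refetchCount);
```

In this model every invalid edit triggers a re-fetch, while the warning is logged only for the first one, which matches the "surfaced, never swallowed" intent without flooding the log.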

Alternatives considered.

Wrapping the allocation in try/catch: rejected because it hides the malformed-edit condition from diagnostics and leaves the model in an inconsistent partial state instead of forcing a clean re-fetch — it masks the symptom rather than handling the bad data at the designated recovery point.
Clamping destDataLength to 0 (or to srcData.length) and proceeding: rejected because it silently fabricates a token buffer that does not correspond to the provider's intent, producing wrong highlighting with no signal; the existing code's contract for invalid edits is to drop them and re-fetch, not to coerce.

Recommended Owner

@alexdima — original author and most recent maintainer of the semantic tokens stack (documentSemanticTokens.ts and semanticTokensProviderStyling.ts), owner of editor core/tokenization, and actively committing in this area. Best positioned to confirm the recovery semantics and review the added validation.

Generated by errors-fix · 1.8K AIC · ⌖ 88.7 AIC · ⊞ 69.4K · ◷

A DocumentSemanticTokensProvider can return edits whose cumulative deleteCount exceeds the number of tokens in the previous result. This makes srcData.length + deltaLength negative, so new Uint32Array() throws "RangeError: Invalid typed array length". Validate the computed destination length before allocating, mirroring the existing edit.start validation, and recover gracefully via setSemanticTokens(null, true) while logging a one-time diagnostic warning.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

vs-code-engineering Bot added the errors-fix label Jun 23, 2026

vs-code-engineering Bot requested review from Copilot June 23, 2026 16:31

Copilot AI reviewed Jun 23, 2026

vs-code-engineering Bot requested review from alexdima and Copilot June 23, 2026 16:32

vs-code-engineering Bot added the *error-fix-driving label Jun 23, 2026

Copilot AI reviewed Jun 23, 2026

vs-code-engineering Bot assigned alexdima Jun 23, 2026

vs-code-engineering Bot marked this pull request as ready for review June 23, 2026 16:33

vs-code-engineering Bot enabled auto-merge (squash) June 23, 2026 16:33

vs-code-engineering[bot] wants to merge 1 commit into main from fix/semantic-tokens-negative-array-length-322571-80ef07510a827ddf
