fix: validate semantic token edit delete count before allocation (fixes #322571)#322575
Open
vs-code-engineering[bot] wants to merge 1 commit into
Open
Conversation
A DocumentSemanticTokensProvider can return edits whose cumulative deleteCount exceeds the number of tokens in the previous result. This makes srcData.length + deltaLength negative, so new Uint32Array() throws "RangeError: Invalid typed array length". Validate the computed destination length before allocating, mirroring the existing edit.start validation, and recover gracefully via setSemanticTokens(null, true) while logging a one-time diagnostic warning. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
ModelSemanticColoring._setDocumentSemanticTokensapplies token edits returned by aDocumentSemanticTokensProviderby allocating a destination buffer sizedsrcData.length + deltaLength, wheredeltaLengthaccumulates(edit.data?.length ?? 0) - edit.deleteCountacross all edits. When a provider returns edits whose cumulativedeleteCountexceeds the number ofUint32entries in the previous result, that sum becomes negative andnew Uint32Array(negativeLength)throwsRangeError: Invalid typed array length: -57045.The method already validates one form of bad edit (
edit.start > srcData.length) and recovers by warning and callingsetSemanticTokens(null, true)to force a full re-fetch — but that validation runs after the buffer is allocated, so the negative-length case crashes before it is reached. The fix adds the missing validation at the same designated recovery point, before allocation.Fixes #322571
Recommended reviewer:
@alexdimaCulprit Commit
Not identified — pre-existing. The allocation
new Uint32Array(srcData.length + deltaLength)and the surrounding edit-application loop predate the window visible in this shallow checkout (fetch-depth=1), and recent commits touching the file (a listener-leak fix, the ESM migration, and aResourceMapchange) do not modify thedeltaLengthaccumulation or the allocation math. The bug is in the original edit-application logic rather than a recent regression; it surfaces only when a provider emits edits that delete more tokens than exist, which is why it appears intermittently in telemetry rather than tracing to a single change.Code Flow
flowchart TD A["Language extension DocumentSemanticTokensProvider returns SemanticTokensEdits"] --> B["ModelSemanticColoring._setDocumentSemanticTokens"] B --> C["Loop accumulates deltaLength += edit.data.length - edit.deleteCount"] C --> D["destLength = srcData.length + deltaLength"] D -->|"destLength is negative"| E["new Uint32Array(destLength) throws RangeError: Invalid typed array length"] D -->|"destLength is non-negative"| F["Allocate destData, validate edit.start, copy and splice tokens"] F -->|"edit.start greater than srcData.length"| G["warnInvalidEditStart + setSemanticTokens(null, true) + return"]The existing recovery branch
Gproves the intended contract: invalid edits are survivable and should trigger a full re-fetch, not a crash. The negative-length case simply reaches the allocationEbefore the loop that contains theedit.startcheck.Affected Files
src/vs/editor/contrib/semanticTokens/browser/documentSemanticTokens.ts— crash site; the destination buffer is allocated here from the unvalidated, possibly-negative computed length.src/vs/editor/common/services/semanticTokensProviderStyling.ts— houses the one-time invalid-edit warning helpers (warnInvalidEditStart,warnOverlappingSemanticTokens,warnInvalidLengthSemanticTokens); a sibling helper for this case is added here.Repro Steps
DocumentSemanticTokensProvider.provideDocumentSemanticTokensEditsreturns aSemanticTokensEditsresult.deleteCountis larger than the number ofUint32entries in the previously delivered full token result (for example, deleting a trailing range after the document shrank, so the edit removes more entries than remain)._setDocumentSemanticTokens;deltaLengthgoes sufficiently negative thatsrcData.length + deltaLength < 0.new Uint32Array(...)throwsRangeError: Invalid typed array length, which reaches unhandled-error telemetry.How the Fix Works
Chosen approach.
documentSemanticTokens.ts: Before allocating, computedestDataLength = srcData.length + deltaLengthand guardif (destDataLength < 0). On a negative length, call the newstyling.warnInvalidEditDeleteCount(...)to emit a diagnostic, then recover withthis._model.tokenization.setSemanticTokens(null, true)andreturn— the identical recovery already used for theedit.startvalidation a few lines below. This fixes the problem at the place the code itself designates for handling invalid edits, completing validation that was already present but ordered after the allocation. The real producer of the bad data is an out-of-process language extension reachable only across the extension-host boundary, so it cannot be guarded in core; the correct core behavior for any malformed edit is exactly this reject-and-re-fetch path, and the negative-length case is just another malformed edit.semanticTokensProviderStyling.ts: AddwarnInvalidEditDeleteCount(previousResultId, resultId, srcLength, deltaLength)mirroring the existingwarnInvalidEditStart— a one-time_logService.warngated by a_hasWarnedInvalidEditDeleteCountflag. The violation is surfaced (logged with the offending lengths), never swallowed, so the underlying provider bug remains diagnosable instead of being silently masked.This is a guard placed upstream of the throw that rejects the invalid input, rather than a
try/catchwrapped around the allocation; the error path is never entered, and no existinglogService.error/telemetry call is removed or weakened. The edits are discarded and a fresh full result is requested — the data is not coerced into a benign-but-wrong value.Alternatives considered.
try/catch: rejected because it hides the malformed-edit condition from diagnostics and leaves the model in an inconsistent partial state instead of forcing a clean re-fetch — it masks the symptom rather than handling the bad data at the designated recovery point.destDataLengthto0(or tosrcData.length) and proceeding: rejected because it silently fabricates a token buffer that does not correspond to the provider's intent, producing wrong highlighting with no signal; the existing code's contract for invalid edits is to drop them and re-fetch, not to coerce.Recommended Owner
@alexdima— original author and most recent maintainer of the semantic tokens stack (documentSemanticTokens.tsandsemanticTokensProviderStyling.ts), owner of editor core/tokenization, and actively committing in this area. Best positioned to confirm the recovery semantics and review the added validation.