speculative : fix out-of-bounds read in ngram-map on prompt shrink #23936

Open
o7si wants to merge 1 commit into ggml-org:master from o7si:issue-23889


o7si commented May 31, 2026

common_ngram_map_begin() invalidates stale entries using map.size_last_begin (the previous generation start), not size_begin (the new prompt length):

if (key.key_idx >= map.size_last_begin) {

When a slot is reused for a shorter prompt, size_begin < size_last_begin, so keys with key_idx in [size_begin, size_last_begin) are kept and later read out of bounds in common_ngram_map_draft():

if (inp[map.keys[i].key_idx + j] != key_tokens[j]) {

Since a key n-gram is size_key tokens long, use size_begin and remove a key when key_idx + size_key > size_begin.
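
For illustration, the proposed condition can be sketched in isolation. This is a minimal sketch, not the actual llama.cpp code: `ngram_key`, `ngram_map`, `prune_stale_keys()` and `key_matches()` are hypothetical, simplified stand-ins, and only the names `key_idx`, `size_key` and `size_begin` are taken from the description above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-ins for the real ngram-map types.
struct ngram_key {
    size_t key_idx; // index in the token history where the key n-gram starts
};

struct ngram_map {
    size_t size_key = 0;         // number of tokens in a key n-gram
    std::vector<ngram_key> keys; // known keys, in insertion order
};

// Remove every key whose n-gram [key_idx, key_idx + size_key) does not fit
// entirely inside the new prompt of size_begin tokens.
// Returns the number of keys removed.
size_t prune_stale_keys(ngram_map & map, size_t size_begin) {
    const size_t before = map.keys.size();
    map.keys.erase(
        std::remove_if(map.keys.begin(), map.keys.end(),
            [&](const ngram_key & key) {
                // the condition described in this PR
                return key.key_idx + map.size_key > size_begin;
            }),
        map.keys.end());
    return before - map.keys.size();
}

// Mirrors the comparison loop quoted from the draft step: compares the
// tokens at the key position against key_tokens. The assert documents the
// precondition that pruning is meant to guarantee.
bool key_matches(const std::vector<int32_t> & inp,
                 const ngram_key & key,
                 const std::vector<int32_t> & key_tokens) {
    assert(key.key_idx + key_tokens.size() <= inp.size());
    for (size_t j = 0; j < key_tokens.size(); ++j) {
        if (inp[key.key_idx + j] != key_tokens[j]) {
            return false;
        }
    }
    return true;
}
```

With `size_key = 3` and a prompt that shrinks to 5 tokens, a key at `key_idx = 2` survives (it covers tokens 2..4), while a key at `key_idx = 3` is removed because its last token would be read at index 5. A check of the form `key_idx >= size_begin` alone would still keep that key, which is why the key length is included in the comparison.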

Related issue:

@ggml-gh-bot

This comment was marked as resolved.

@o7si o7si marked this pull request as ready for review May 31, 2026 10:48
@ggerganov ggerganov requested a review from srogmann May 31, 2026 13:16