Potential privacy / IP leakage from datastore-backed speculative decoding #36

@fastdecoding

We are reporting a potential privacy / IP leakage issue in REST-style datastore-backed speculative decoding.

Although the implementation limits the number of datastore tokens accepted in a single verification step, we found that consecutive datastore-backed chunks can still accumulate into longer recovered fragments within one response. In our 1000-prompt evaluation, the median stitched recovery per deployment, averaged across deployments, was 155 words for text-model deployments and 87 words for code-model deployments.
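The accumulation effect can be illustrated with a toy simulation. This is a minimal sketch and not REST's actual implementation: `MAX_ACCEPT_PER_STEP`, `longest_stitched_run`, and the per-step records are hypothetical names. It assumes each verification step reports how many tokens it accepted and whether they came from a datastore draft.

```python
from typing import List, Tuple

# Hypothetical per-step cap on accepted datastore tokens
# (an assumption for illustration, not REST's real setting).
MAX_ACCEPT_PER_STEP = 8


def longest_stitched_run(steps: List[Tuple[int, bool]]) -> int:
    """Return the longest run of consecutive datastore-backed tokens.

    Each step is (accepted_tokens, from_datastore). A step that accepts
    no datastore tokens breaks the current run.
    """
    longest = current = 0
    for accepted, from_datastore in steps:
        if accepted > MAX_ACCEPT_PER_STEP:
            raise ValueError("step exceeds the per-step cap")
        if from_datastore and accepted > 0:
            current += accepted
            longest = max(longest, current)
        else:
            current = 0
    return longest


# Ten consecutive steps, each within the per-step cap.
steps = [(MAX_ACCEPT_PER_STEP, True)] * 10
stitched = longest_stitched_run(steps)
print(stitched)
```

In this toy model every step respects the cap, yet the stitched fragment is ten times longer than the cap, because the cap applies per step and not per response.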

Representative examples below are lightly normalized for readability.

Text example

  • model: lmsys/vicuna-13b-v1.5
  • datastore: datastore_chat_large.idx

Thank you for your patience and understanding during this challenging time.

Best regards,

[Your Name]
[Your Title]
[Your Company]

Code example

  • model: codellama/CodeLlama-13b-hf
  • datastore: datastore_stack_large.idx

def verify_reset_password_token(token):
    try:
        id = jwt.decode(token, current_app.config['SECRET_KEY'],
                       algorithms=['HS256'])['reset_password']
    except:
        return
    return User.query.get(id)

We understand that this is at least partly a deployment issue and not necessarily a core correctness bug. The risk is more serious when REST is used with streaming output and when the datastore contains private, user-derived, or otherwise sensitive content.

For that reason, it would be useful to add explicit guidance in the documentation, for example:

  • avoid streaming partial outputs in privacy-sensitive REST deployments
  • do not place private or sensitive content in the datastore unless the leakage risk is acceptable
  • document that per-step acceptance limits do not fully bound cumulative recovery within a single generation
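As an illustration of the first point, a deployment could buffer the full response before returning it. The sketch below is hypothetical: `deliver` and `privacy_sensitive` are illustrative names and not part of REST, and the chunks are assumed to arrive as plain strings from the generation loop.

```python
from typing import Iterable, Iterator


def deliver(chunks: Iterable[str], privacy_sensitive: bool) -> Iterator[str]:
    """Yield output for the client.

    Hypothetical wrapper: when privacy_sensitive is True, the whole
    response is buffered and emitted once, so no partial output is
    exposed before generation finishes.
    """
    if privacy_sensitive:
        yield "".join(chunks)
    else:
        yield from chunks


streamed = list(deliver(["Thank ", "you ", "for"], privacy_sensitive=False))
buffered = list(deliver(["Thank ", "you ", "for"], privacy_sensitive=True))
```

Buffering only changes when output becomes visible. It does not remove datastore-derived text from the final response, so the second and third points still apply.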

We are intentionally not including low-level reproduction details in this public report. We are preparing an academic disclosure and wanted to notify the maintainers before publication. We would be happy to share a private technical write-up and affected configurations directly with the maintainers.

Thanks.
