fix(tts): add streamFlush() API and drain Kokoro buffer on streamStop #1197

Open
msluszniak wants to merge 1 commit into main from @ms/tts-stream-insert-can-flush

@msluszniak msluszniak commented May 28, 2026

Description

Kokoro::stream hung when the buffer held content without an end-of-sentence character: the timer-based force-flush (after kStreamMaxSkippedIterations idle iterations) tried to extract 0 characters, so the loop reset its skip counter without draining anything and streamStop(false) never returned. The same threshold also fired mid-token for LLM-style streaming, partitioning sentences before they finished.
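
The failure mode described above can be reproduced with a small toy model. This is a sketch under stated assumptions, not the actual Kokoro::stream code: the names `kMaxSkippedIterations`, `eosPrefixLength` and `oldLoopDrains`, the threshold value, and the set of end-of-sentence characters are all hypothetical and chosen only for illustration.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for the removed tuning constant; the real value is
// not stated in this PR.
constexpr int kMaxSkippedIterations = 3;

// Length of the prefix that ends at the last end-of-sentence character,
// or 0 when the buffer contains none. The EOS set here is an assumption.
std::size_t eosPrefixLength(const std::string& buffer) {
  const std::size_t pos = buffer.find_last_of(".!?");
  return pos == std::string::npos ? 0 : pos + 1;
}

// Toy model of the old loop. Runs for at most maxIterations and reports
// whether the buffer was drained within that budget.
bool oldLoopDrains(std::string buffer, int maxIterations) {
  int skipped = 0;
  for (int i = 0; i < maxIterations && !buffer.empty(); ++i) {
    const std::size_t n = eosPrefixLength(buffer);
    if (n == 0 && ++skipped < kMaxSkippedIterations) {
      continue;  // idle iteration: keep waiting for more input
    }
    // Either a full sentence is ready, or the force-flush fired. In the
    // force-flush case n is still 0, so nothing is erased and the counter
    // is reset anyway, which is the hang described above.
    buffer.erase(0, n);
    skipped = 0;
  }
  return buffer.empty();
}
```

In this model a buffer such as "Done." drains on the first iteration, while a buffer with no end-of-sentence character never drains regardless of the iteration budget.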

Replace the threshold with caller-driven flushing:

  • New streamFlush() API. Caller signals "drain what's currently buffered, EOS or not" — typically right before streamStop(false) for normal apps that ended on un-terminated content.
  • streamStop(false) now drains automatically (equivalent to streamFlush() plus auto-stop on empty buffer), so it's guaranteed to return even if the trailing tail has no EOS.
  • LLM-style callers feeding partial tokens never call streamFlush() — model punctuation drives natural EOS partitioning, and the residual tail is drained by streamStop(false) when generation completes.

The skipped-iteration force-flush and its tuning constant kStreamMaxSkippedIterations are removed.
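
The caller-driven behaviour listed above might be modelled roughly as follows. This is a minimal sketch, not the implementation in this PR: the `StreamBuffer` class, its method names, and the end-of-sentence character set are hypothetical, with `flush()` standing in for `streamFlush()` and `stopAfterDrain()` standing in for `streamStop(false)`.

```cpp
#include <cstddef>
#include <string>

// Hypothetical model of a text buffer with caller-driven flushing.
class StreamBuffer {
 public:
  // Append incoming text (for example, partial LLM tokens).
  void insert(const std::string& text) { buffer_ += text; }

  // Analogue of streamFlush(): the next chunk takes everything buffered,
  // whether or not it ends with an end-of-sentence character.
  void flush() { flushRequested_ = true; }

  // Analogue of streamStop(false): drain the tail, then stop once empty.
  void stopAfterDrain() {
    flushRequested_ = true;
    stopWhenEmpty_ = true;
  }

  // Returns the next chunk to synthesize; empty when nothing is ready.
  std::string nextChunk() {
    std::size_t n = buffer_.size();
    if (!flushRequested_) {
      // Without a flush request, only complete sentences are released.
      const std::size_t pos = buffer_.find_last_of(".!?");
      n = (pos == std::string::npos) ? 0 : pos + 1;
    }
    flushRequested_ = false;
    std::string chunk = buffer_.substr(0, n);
    buffer_.erase(0, n);
    return chunk;
  }

  // True once a stop was requested and the buffer has been drained.
  bool finished() const { return stopWhenEmpty_ && buffer_.empty(); }

 private:
  std::string buffer_;
  bool flushRequested_ = false;
  bool stopWhenEmpty_ = false;
};
```

With this model, inserting "Hello world. And then" yields "Hello world." as the first chunk and nothing afterwards until `flush()` or `stopAfterDrain()` is called, at which point the un-terminated tail is released. Because no timer is involved, there is no threshold that could split a sentence while tokens are still arriving.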

Introduces a breaking change?

  • Yes
  • No

Type of change

  • Bug fix (change which fixes an issue)
  • New feature (change which adds functionality)
  • Documentation update (improves or adds clarity to existing documentation)
  • Other (chores, tests, code style improvements etc.)

Tested on

  • iOS
  • Android

Testing instructions

  1. Build & run apps/speech on Android.
  2. Open Text to Speech - LLM Streaming.
  3. Tap Generate and let the story play through.
  4. Tap Stop mid-stream — verify it returns promptly without hanging.

Related issues

Fixes #1153

@msluszniak msluszniak self-assigned this May 28, 2026
@msluszniak msluszniak added the bug fix PRs that are fixing bugs label May 28, 2026
@msluszniak msluszniak requested a review from IgorSwat May 28, 2026 11:19
@msluszniak
Member Author

@IgorSwat
I'm now in favour of changing direction to:

Explicit streamFlush() API. Caller signals "I'm done feeding for now, partition what's left." LLM mode never calls it; normal apps call it before streamStop(false). No threshold tuning.

because, unlike the current state of this PR, it does not introduce a breaking change.

@msluszniak msluszniak changed the title fix(tts): drain Kokoro buffer on streamStop and add canFlush insert option fix(tts)!: drain Kokoro buffer on streamStop and add canFlush insert option May 28, 2026
@msluszniak msluszniak force-pushed the @ms/tts-stream-insert-can-flush branch from 7358b19 to 0e0cdb9 Compare May 28, 2026 12:21
@msluszniak msluszniak changed the title fix(tts)!: drain Kokoro buffer on streamStop and add canFlush insert option fix(tts): add streamFlush() API and drain Kokoro buffer on streamStop May 28, 2026
@msluszniak msluszniak force-pushed the @ms/tts-stream-insert-can-flush branch from 0e0cdb9 to 160cf11 Compare May 28, 2026 12:51
Member Author

@msluszniak msluszniak left a comment

I still need to add documentation for the new streamFlush() function.

The streaming loop used a timer-based force-flush after
`kStreamMaxSkippedIterations` idle iterations. Whenever the buffer held
content without an end-of-sentence character, the force-flush tried to
extract zero characters, so the skip counter reset without draining
anything and `streamStop(false)` hung forever (#1153). The same
threshold also fired mid-token for LLM-style streaming, partitioning
sentences before they finished.

Replace the threshold with caller-driven flushing:

- New `streamFlush()` API. Caller signals "drain what's currently
  buffered, EOS or not" — typically right before `streamStop(false)` for
  normal apps that ended on un-terminated content.
- `streamStop(false)` now drains automatically (equivalent to
  `streamFlush()` plus auto-stop on empty buffer), so it's guaranteed to
  return even if the trailing tail has no EOS.
- LLM-style callers feeding partial tokens never call `streamFlush()` —
  model punctuation drives natural EOS partitioning, and the residual
  tail is drained by `streamStop(false)` when generation completes.

The skipped-iteration force-flush and its tuning constant
`kStreamMaxSkippedIterations` are removed.

Fixes #1153.
@msluszniak msluszniak force-pushed the @ms/tts-stream-insert-can-flush branch from 160cf11 to 3f49e51 Compare May 28, 2026 21:09
Development

Successfully merging this pull request may close these issues.

Kokoro::stream hangs on non-EOS-terminated buffer; streamStop(false) never returns

2 participants