feat: add token estimation utility for chat completions #1922
Adds estimateTokens() function for cost estimation and dry-run capability without making API calls.
Closes openai#318
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e0cf99201f
    } else if ('input_audio' in part) {
      // Audio is tokenized separately; rough estimate
      tokens += 50;
    } else if ('refusal' in part && typeof part.refusal === 'string') {
Handle file content parts in token estimates
When a user message contains a ChatCompletionContentPart.File such as content: [{ type: 'file', file: { file_data: ... } }], none of these content-part branches adds any tokens, so the estimate includes only the fixed message overhead even when the file is the bulk of the prompt. Because this helper is advertised for cost and context-window checks, file-backed chat requests can be reported as fitting cheaply when they do not; please add file handling or return a conservative/unknown estimate for these parts.
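One possible shape for the missing branch is sketched below. It is not the PR's code: the `FilePart` type is a simplified stand-in for `ChatCompletionContentPart.File`, `estimateFilePart` and `ASSUMED_BYTES_PER_TOKEN` are hypothetical names, and the 4-bytes-per-token ratio is a rough assumption rather than anything the tokenizer guarantees. It also assumes `file_data` is base64, optionally wrapped in a data URL. Parts that carry only a `file_id` cannot be sized locally, so the sketch reports them as unknown instead of silently adding zero.

```typescript
// Simplified stand-in for ChatCompletionContentPart.File (assumption, not the SDK type).
interface FilePart {
  type: 'file';
  file: { file_data?: string; file_id?: string; filename?: string };
}

// Assumed ratio for a rough estimate; real token counts depend on how the file is parsed.
const ASSUMED_BYTES_PER_TOKEN = 4;

interface FileEstimate {
  tokens: number;
  // false when the file size cannot be determined locally (e.g. only file_id is set)
  known: boolean;
}

function estimateFilePart(part: FilePart): FileEstimate {
  const data = part.file.file_data;
  if (typeof data !== 'string') {
    // Nothing to measure without fetching the file; let the caller decide how to treat it.
    return { tokens: 0, known: false };
  }

  // Drop an optional "data:<mime>;base64," prefix before measuring.
  const comma = data.indexOf(',');
  const base64 = data.startsWith('data:') && comma !== -1 ? data.slice(comma + 1) : data;

  // Decoded size of a base64 string: 3 bytes per 4 characters, minus padding.
  const padding = base64.endsWith('==') ? 2 : base64.endsWith('=') ? 1 : 0;
  const bytes = Math.max(0, Math.floor((base64.length * 3) / 4) - padding);

  return { tokens: Math.ceil(bytes / ASSUMED_BYTES_PER_TOKEN), known: true };
}
```

A caller could add `tokens` to the running total when `known` is true and otherwise mark the whole estimate as a lower bound, so context-window checks do not report a file-backed request as fitting when its size is unknown.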
    if (model.startsWith(prefix)) {
      return MODEL_CONTEXT_WINDOWS[prefix];
Avoid matching unrelated model IDs by prefix
For any valid but unlisted model that shares a prefix with an older key, this fallback returns the older model's context window; for example gpt-4.5-preview appears in the repo's generated model unions, but it would match gpt-4 here and report an 8k window instead of unknown or the actual larger window. That makes maxOutputTokens wildly too small or zero for valid requests, so prefix matching should be limited to version suffixes that are known to share a base model or the missing models should be added explicitly.
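A narrower lookup along these lines might look like the sketch below. It is illustrative only: `CONTEXT_WINDOWS`, `SNAPSHOT_SUFFIX`, and `lookupContextWindow` are hypothetical names, the table holds two example entries rather than the PR's full `MODEL_CONTEXT_WINDOWS`, and the assumption that a dated snapshot shares its base model's window would need checking per model. The idea is to try an exact match first, then accept a prefix only when the remainder is a snapshot suffix such as `-0613` or `-2024-08-06`.

```typescript
// Example entries only; the real table would list every supported model explicitly.
const CONTEXT_WINDOWS: Record<string, number> = {
  'gpt-4': 8192,
  'gpt-4o': 128000,
};

// Assumed snapshot suffix formats: "-MMDD" style (e.g. -0613) or "-YYYY-MM-DD".
const SNAPSHOT_SUFFIX = /^-(\d{4}-\d{2}-\d{2}|\d{4})$/;

function lookupContextWindow(model: string): number | undefined {
  // 1. Exact match wins.
  if (Object.prototype.hasOwnProperty.call(CONTEXT_WINDOWS, model)) {
    return CONTEXT_WINDOWS[model];
  }

  // 2. Otherwise accept a base model only if what follows it is a snapshot suffix.
  for (const base of Object.keys(CONTEXT_WINDOWS)) {
    if (model.startsWith(base) && SNAPSHOT_SUFFIX.test(model.slice(base.length))) {
      return CONTEXT_WINDOWS[base];
    }
  }

  // 3. Unknown model: report that instead of guessing from an unrelated prefix.
  return undefined;
}
```

With this shape `gpt-4-0613` resolves through `gpt-4`, while `gpt-4.5-preview` and `gpt-4-turbo` return `undefined`, leaving the caller to skip the `maxOutputTokens` calculation rather than compute it from the wrong window.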
Adds token counting utility for cost estimation and dry-run support.
Closes #318