[codex] Add generic model context window hints #23

Merged
senamakel merged 2 commits into main from openhuman/model-context-patterns
Jul 5, 2026
@senamakel senamakel commented Jul 5, 2026

Member

Summary

  • add a provider-neutral context-window hint helper for common model family ids
  • expose the helper through the harness public API
  • cover common provider ids plus o1/o3 boundary handling

Validation

  • cargo fmt --check
  • timeout 180s cargo clippy --all-targets -- -D warnings
  • timeout 120s cargo test context_window_patterns_cover_common_provider_families
  • timeout 120s cargo test o1_o3_context_patterns_require_segment_boundaries

Summary by CodeRabbit

  • New Features
    • Added support for recognizing model IDs and returning a matching context window size when available.
    • Improved handling of model-name formats, including common provider patterns and special segment-based IDs.
  • Bug Fixes
    • Empty, whitespace-only, and unknown model IDs now return no context window instead of an incorrect value.

coderabbitai Bot commented Jul 5, 2026

Review details
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 9495c254-3b6d-4d34-8a46-7b51cb07113c

📥 Commits

Reviewing files that changed from the base of the PR and between 4570237 and 8092871.

📒 Files selected for processing (1)
  • src/harness/model/mod.rs
📝 Walkthrough

This PR adds a context_window_for_model_id function to the harness model module that maps model-id patterns to conservative context-window token sizes using substring or segment-delimited matching, exports it publicly, and adds unit tests covering common model formats and o1/o3 boundary rules.
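The lookup described in the walkthrough can be sketched as follows. This is a minimal illustration, not the merged code: the type and function names come from the walkthrough, the table is a small subset built from entries quoted in the review threads, and the segment delimiter rule (any non-alphanumeric character) and the lowercase normalization are assumptions of this sketch.

```rust
/// How a pattern is compared against a model id.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ContextPatternMatch {
    /// The pattern may appear anywhere in the id.
    Substring,
    /// The pattern must be a whole delimiter-separated segment of the id.
    Segment,
}

// Subset of the table for illustration only; order is specificity-first.
const MODEL_CONTEXT_PATTERNS: &[(&str, ContextPatternMatch, u64)] = &[
    ("claude-haiku-4.5", ContextPatternMatch::Substring, 200_000),
    ("gpt-4.1", ContextPatternMatch::Substring, 1_047_576),
    ("gpt-4o", ContextPatternMatch::Substring, 128_000),
    ("o1", ContextPatternMatch::Segment, 200_000),
    ("o3", ContextPatternMatch::Segment, 200_000),
    ("deepseek", ContextPatternMatch::Substring, 128_000),
];

fn matches_context_pattern(id: &str, pattern: &str, kind: ContextPatternMatch) -> bool {
    match kind {
        ContextPatternMatch::Substring => id.contains(pattern),
        // Assumption: segments are split on any non-alphanumeric character,
        // so "o1" matches "openai/o1-mini" but not "foo1" or "o10".
        ContextPatternMatch::Segment => id
            .split(|c: char| !c.is_ascii_alphanumeric())
            .any(|segment| segment == pattern),
    }
}

/// Returns a context-window hint in tokens, or `None` for empty,
/// whitespace-only, or unrecognized ids.
pub fn context_window_for_model_id(model_id: &str) -> Option<u64> {
    // Assumption: ids are trimmed and lowercased before matching.
    let id = model_id.trim().to_ascii_lowercase();
    if id.is_empty() {
        return None;
    }
    // First match wins, so table order matters.
    MODEL_CONTEXT_PATTERNS
        .iter()
        .find_map(|&(pattern, kind, window)| {
            matches_context_pattern(&id, pattern, kind).then_some(window)
        })
}
```

Under these assumptions, `context_window_for_model_id("o1-preview")` yields a hint while `context_window_for_model_id("foo1")` does not, which is the boundary behaviour the o1/o3 test names in the PR description refer to.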

Changes

Context Window Inference

| Layer / File(s) | Summary |
| --- | --- |
| Pattern matching and public export (src/harness/model/mod.rs, src/harness/mod.rs) | Adds ContextPatternMatch, MODEL_CONTEXT_PATTERNS, matches_context_pattern, and the public context_window_for_model_id function; re-exports it from the harness module. |
| Unit tests (src/harness/model/test.rs) | Adds tests validating context window sizes for common provider model IDs, o1/o3 segment rules, and unknown/whitespace inputs. |

Estimated code review effort: 2 (Simple) | ~10 minutes

Poem

A rabbit sniffs each model's name,
Matching patterns, tokens tame,
Windows sized with careful care,
Tests confirm what's fair and rare,
Hop, hop — context found with flair! 🐇

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly matches the main change: adding generic model context window hints. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

Comment @coderabbitai help to get the list of available commands.

@senamakel senamakel marked this pull request as ready for review July 5, 2026 03:30

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
src/harness/model/mod.rs (1)

47-68: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Document the ordering invariant of MODEL_CONTEXT_PATTERNS.

Correctness relies on more-specific substrings (e.g. gpt-4.1, gpt-4o, gpt-4-turbo) preceding shorter, more general ones (gpt-4) since find_map returns the first match. This is implicit and easy to break silently when new entries are added later.

♻️ Suggested doc comment
 const MODEL_CONTEXT_PATTERNS: &[(&str, ContextPatternMatch, u64)] = &[
+    // NOTE: order matters — `find_map` returns the first match, so more
+    // specific substrings (e.g. "gpt-4.1") must precede shorter, more
+    // general ones (e.g. "gpt-4") that would otherwise shadow them.
     ("claude-haiku-4.5", ContextPatternMatch::Substring, 200_000),
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/harness/model/mod.rs` around lines 47 - 68, Document the ordering rule
for MODEL_CONTEXT_PATTERNS in model/mod.rs: because
ModelContext::context_pattern_for uses find_map, the first matching entry wins,
so more specific substrings like gpt-4.1, gpt-4o, and gpt-4-turbo must stay
before broader matches like gpt-4. Add a doc comment on MODEL_CONTEXT_PATTERNS
that states this invariant and warns future additions must preserve
specificity-first ordering to avoid silent misclassification.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@src/harness/model/mod.rs`:
- Around line 47-68: Document the ordering rule for MODEL_CONTEXT_PATTERNS in
model/mod.rs: because ModelContext::context_pattern_for uses find_map, the first
matching entry wins, so more specific substrings like gpt-4.1, gpt-4o, and
gpt-4-turbo must stay before broader matches like gpt-4. Add a doc comment on
MODEL_CONTEXT_PATTERNS that states this invariant and warns future additions
must preserve specificity-first ordering to avoid silent misclassification.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 606b6a99-9637-4766-99b8-511f8c42f891

📥 Commits

Reviewing files that changed from the base of the PR and between e72036d and 4570237.

📒 Files selected for processing (3)
  • src/harness/mod.rs
  • src/harness/model/mod.rs
  • src/harness/model/test.rs

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 45702373ae

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/harness/model/mod.rs
("gpt-4.1", ContextPatternMatch::Substring, 1_047_576),
("gpt-4o", ContextPatternMatch::Substring, 128_000),
("gpt-4-turbo", ContextPatternMatch::Substring, 128_000),
("gpt-4", ContextPatternMatch::Substring, 128_000),

[P2] Keep the gpt-4 fallback at 8k

OpenAI lists the plain gpt-4 model's context window as 8,192 tokens (https://developers.openai.com/api/docs/models/gpt-4), so context_window_for_model_id("gpt-4") now returns a 128k budget for a model that rejects prompts above 8k. Because this helper is intended for pre-dispatch budgeting, callers that use the plain gpt-4 id can skip summarization/compaction and hit provider context-limit errors; keep the 128k value scoped to gpt-4-turbo/gpt-4o variants.
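
A sketch of the suggested scoping, assuming substring matching and first-match-wins lookup. The 8,192 value for plain gpt-4 follows this comment and the other values are the ones quoted from the diff; the names `GPT4_PATTERNS` and `gpt4_context_window` are hypothetical.

```rust
// Specificity-first: longer gpt-4 variants must precede the plain "gpt-4" entry.
const GPT4_PATTERNS: &[(&str, u64)] = &[
    ("gpt-4.1", 1_047_576),
    ("gpt-4o", 128_000),
    ("gpt-4-turbo", 128_000),
    // Fallback for plain gpt-4 ids, per the review comment.
    ("gpt-4", 8_192),
];

/// Returns the window of the first pattern contained in `model_id`.
fn gpt4_context_window(model_id: &str) -> Option<u64> {
    GPT4_PATTERNS
        .iter()
        .find_map(|&(pattern, window)| model_id.contains(pattern).then_some(window))
}
```

With this table, any gpt-4 id that is not one of the listed variants falls back to the 8k hint, which errs on the side of compacting too early rather than exceeding the provider limit.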


Comment thread src/harness/model/mod.rs
Comment on lines +66 to +67
("llama-3", ContextPatternMatch::Substring, 128_000),
("llama3", ContextPatternMatch::Substring, 128_000),

[P2] Don't assign Llama 3 the Llama 3.1 window

Meta's Llama 3 model card lists an 8k context length for the Llama 3 8B/70B models (https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct), while the 128k window belongs to later Llama 3.1 models. With these generic patterns, context_window_for_model_id("llama3:8b") returns 128k, so local/hosted Llama 3 callers can over-budget by 16x and fail at runtime; add separate llama-3.1/llama3.1 patterns or lower the plain Llama 3 fallback.
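
One way to apply the first option, sketched under the assumption of lowercase substring matching with first-match-wins. The 8,192 and 128,000 values follow this comment; `LLAMA_PATTERNS` and `llama_context_window` are hypothetical names, and other Llama point releases would need their own entries.

```rust
// The 3.1 patterns come first so they are not shadowed by the plain Llama 3 ones.
const LLAMA_PATTERNS: &[(&str, u64)] = &[
    ("llama-3.1", 128_000),
    ("llama3.1", 128_000),
    // Plain Llama 3 fallback, per the model card cited in the comment.
    ("llama-3", 8_192),
    ("llama3", 8_192),
];

/// Lowercases the id, then returns the window of the first matching pattern.
fn llama_context_window(model_id: &str) -> Option<u64> {
    let id = model_id.trim().to_ascii_lowercase();
    LLAMA_PATTERNS
        .iter()
        .find_map(|&(pattern, window)| id.contains(pattern).then_some(window))
}
```

Here `llama3:8b` resolves to the 8k hint and `llama3.1:8b` to the 128k hint, which is the split the comment asks for.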


@senamakel senamakel merged commit 0bc4642 into main Jul 5, 2026
3 checks passed
@senamakel senamakel deleted the openhuman/model-context-patterns branch July 5, 2026 03:44

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8092871096

Comment thread src/harness/model/mod.rs
("o1", ContextPatternMatch::Segment, 200_000),
("o3", ContextPatternMatch::Segment, 200_000),
("deepseek", ContextPatternMatch::Substring, 128_000),
("gemma3", ContextPatternMatch::Substring, 8_192),

[P2] Return Gemma 3's input context window

For Gemma 3 ids such as gemma3:4b (the added test covers this exact Ollama-style id), this returns 8,192 even though Google's Gemma 3 model card lists 128K input context for the 4B/12B/27B sizes and 32K for 1B/270M: https://ai.google.dev/gemma/docs/core/model_card_3. Since this helper is meant to feed pre-dispatch and capability budgeting, using the 8K value will unnecessarily reject/compact long prompts that those Gemma 3 models can accept; split Gemma 3 from the older Gemma fallback or make the pattern size-aware.
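
A size-aware variant might look like the sketch below. It is illustrative only: the function name, the `gemma-3` spelling, the size-tag parsing, and the rounded 128,000 / 32,000 values are assumptions, with the size split taken from this comment. Ids without a recognized large-size tag fall back to the smaller window.

```rust
/// Returns a context-window hint for Gemma 3 ids, or `None` for other ids.
fn gemma3_context_window(model_id: &str) -> Option<u64> {
    let id = model_id.trim().to_ascii_lowercase();
    if !id.contains("gemma3") && !id.contains("gemma-3") {
        return None;
    }
    // Assumption: the parameter size appears as its own segment, e.g. the
    // "4b" in "gemma3:4b" or "gemma-3-4b-it".
    let is_large = id
        .split(|c: char| !c.is_ascii_alphanumeric())
        .any(|segment| matches!(segment, "4b" | "12b" | "27b"));
    // Unknown or small sizes get the smaller hint so callers never over-budget.
    Some(if is_large { 128_000 } else { 32_000 })
}
```

Defaulting to the smaller window when the size tag is missing keeps the hint conservative, in line with the helper's stated purpose of pre-dispatch budgeting.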

