Add Verifiers v1 support to Prime CLI #758
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 518d3d9d3a
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Reviewed by Cursor Bugbot for commit a3d41ab.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a3d41abef6
saved_headers = saved_config.get("client", {}).get("headers", {})
job_id = saved_headers.get("X-PI-Job-Id") or _build_job_id(job_target, model)
display_id = saved_headers.get(INTERNAL_ENV_DISPLAY_HEADER)
upstream_slug = display_id if isinstance(display_id, str) and "/" in display_id else None
Validate display headers before reusing them as slugs
When resuming a run saved from a local environment that was ahead of its upstream, INTERNAL_ENV_DISPLAY_HEADER can contain a display string such as wiki-search (local - ahead of primeintellect/wiki-search), not an owner/name slug. This check accepts any value containing /, so push_eval_results_to_hub bypasses metadata lookup and tries to resolve/upload with that malformed slug, causing resumed uploads to fail or attach incorrectly. Only reuse the header when it is a real slug, otherwise fall back to metadata resolution.
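A minimal sketch of the stricter check suggested above, assuming a slug is exactly one owner segment and one name segment with no spaces or parentheses. The pattern and the helper name `slug_from_display_header` are hypothetical and not taken from the PR; the real rules for valid slugs may differ.

```python
from __future__ import annotations

import re

# Hypothetical slug pattern: "owner/name", each part made of letters, digits,
# dots, underscores, or hyphens. The actual slug rules are an assumption here.
_SLUG_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*/[A-Za-z0-9][A-Za-z0-9._-]*")


def slug_from_display_header(display_id: object) -> str | None:
    """Return display_id only when the whole string looks like owner/name."""
    if isinstance(display_id, str) and _SLUG_RE.fullmatch(display_id):
        return display_id
    # Anything else (display strings, missing headers) falls back to
    # metadata resolution in the caller.
    return None
```

Because the whole string must match, a display value such as `wiki-search (local - ahead of primeintellect/wiki-search)` yields `None`, while `primeintellect/wiki-search` is returned unchanged.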
from verifiers.v1.task import WireTask
from verifiers.v1.trace import Trace


trace_type = Trace[WireTask]
i think we acc have a WireTrace type for this built-in iirc
main_messages = (
    [
        message.model_dump(mode="json", exclude_none=True)
        for message in branches[-1].messages
should we index with 0 instead of -1?
    message.model_dump(mode="json", exclude_none=True)
    for message in branch.messages
],
"reward": trace.reward,
was this v0 behavior? i thought step reward was None by default?
"prompt": [],
"completion": main_messages,
i think this is fair, but need to think abt if this will look weird on platform somehow. iirc one table is showing the prompt as a col (we prob shouldnt do this anyways on platform) but yea smth to look out for
_print_environment_source_footer(resolved_env)
return


resume_dir = _parse_value_option(passthrough_args, "--resume", None)
hmm getting a bit of smell from this code block. feels like this can be simplified


Overview
Adds Verifiers v1 support to the Prime CLI while preserving compatibility with legacy and hosted evaluation flows.
What changed
- prime eval run and prime env init through the Verifiers v1 entrypoints
- --save-results on the legacy v0 evaluator
- verifiers>=0.1.15.dev371 from PyPI instead of a Git source
- exclude-newer policy
- uv sync and uv run workflow for Python 3.11–3.13

User impact
Local evaluations use the Verifiers v1 CLI by default. Legacy and hosted-generated commands remain compatible with the existing hosted runner, while local v1 results can be uploaded through the Prime Evals flow.
Note
Medium Risk
Large changes to the default eval execution and upload path affect most local runs; legacy and hosted paths are retained but users on Python 3.10 or unpinned older verifiers will break.
Overview
Local prime eval run and prime env init now go through Verifiers v1 (verifiers.v1.cli.*), with plugin invocation using v1 console-style entrypoints and help text rewritten for prime commands.

v1 eval flow adds taskset TOML (@config), --resume, --dry-run, --client.base-url / --client.headers, default output dirs, post-run metadata.json, and convert_eval_results so v1 traces upload to Prime Evals while legacy results.jsonl still works.

--save-results / -s keeps the v0 evaluator (hosted-generated commands). CLI --sampling-args is rejected for v1 unless legacy save mode; v1 uses --sampling.*.

Dependencies & platform: verifiers>=0.1.15.dev371 from PyPI (git pin removed), Python >=3.11,<3.14, CI/docker matrices 3.11–3.13 only. Root lock/trust metadata updated for the v1 dependency graph.
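The --sampling-args rule in the summary above can be illustrated with a small sketch. It only assumes what the summary states: --sampling-args is accepted when --save-results or -s selects the legacy v0 evaluator, and is rejected otherwise. The function names and the error message are hypothetical and do not come from the Prime CLI source.

```python
from __future__ import annotations

# Flags that select the legacy v0 evaluator, per the summary above.
LEGACY_SAVE_FLAGS = ("--save-results", "-s")


def uses_legacy_save(args: list[str]) -> bool:
    """True when the argument list asks for the legacy save mode."""
    return any(arg in LEGACY_SAVE_FLAGS for arg in args)


def check_sampling_flags(args: list[str]) -> None:
    """Raise ValueError if --sampling-args is passed on the v1 path.

    Hypothetical helper: the real CLI may detect and report this differently.
    """
    if uses_legacy_save(args):
        return  # legacy v0 evaluator still accepts --sampling-args
    for arg in args:
        if arg == "--sampling-args" or arg.startswith("--sampling-args="):
            raise ValueError(
                "--sampling-args is only accepted with --save-results/-s; "
                "use --sampling.* options for v1"
            )
```

Checking the flag before dispatching to either evaluator keeps the failure early and explicit, instead of letting the v1 entrypoint see an option it does not define.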