Define the Verifiers CLI lifecycle surface by xeophon · Pull Request #1857 · PrimeIntellect-ai/verifiers

xeophon · 2026-06-24T15:41:18Z

Overview

This PR makes Verifiers the authoritative local CLI and runtime surface for creating, validating, serving, evaluating, and optimizing environments. Prime-specific acquisition and account management stay in Prime CLI.

High-level changes

Exports eval, init, validate, serve, and gepa through verifiers.cli.CLI_MODULES.
Uses strict Pydantic configuration for the complete local lifecycle, including native @ TOML loading.
Makes V1 tasksets the default scaffold while retaining explicit V0 creation, evaluation, and serving.
Supports typed GEPA configuration for one or multiple local environments.
Restricts taskset, harness, and V0 environment IDs to built-in or locally importable packages.
Removes Prime profile loading, Hub reference resolution, downloads, caching, and installation from Verifiers.
Keeps config.toml and results.jsonl as the native evaluation artifacts without an additional manifest.

Prime and Verifiers boundary

Prime CLI acquires owner/name[@version] packages and materializes its selected account as environment variables. Verifiers consumes the installed local package and those explicit runtime credentials; it does not read Prime profile files or perform platform transfers.
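
One way to picture the boundary is a resolver that only accepts IDs it can find locally. The sketch below is an illustration under assumptions, not the PR's resolver: `BUILTIN_IDS`, `is_locally_resolvable`, and the hyphen-to-underscore mapping from ID to module name are all hypothetical. It treats anything shaped like `owner/name[@version]` as Prime CLI's responsibility and never attempts a download.

```python
import importlib.util

# Hypothetical set of IDs that ship with the package itself.
BUILTIN_IDS = {"example-taskset"}


def is_locally_resolvable(env_id: str) -> bool:
    """Return True only for built-in IDs or packages importable right now."""
    if env_id in BUILTIN_IDS:
        return True
    # owner/name[@version] references are acquired by a separate tool,
    # so this sketch refuses them instead of fetching anything.
    if "/" in env_id or "@" in env_id:
        return False
    # Assumption: the importable module name is the ID with "-" mapped to "_".
    module_name = env_id.replace("-", "_")
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        return False
```

A caller can then report "install this package first" for a `False` result instead of reaching out to a remote service.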

Verifiers continues to own the configuration and behavior of local evaluation, environment initialization and validation, serving, GEPA, plugin loading, execution, and artifact production.

Public surface

The typed command models, command-module registry, and artifact readers are exported for host applications. Bundled examples, templates, documentation, and Lab guidance use the same command and artifact contracts.

Companion change

Prime CLI integration: PrimeIntellect-ai/prime#760

macroscopeapp · 2026-06-29T14:55:07Z

+            continue
+        task = trace.task.model_dump(mode="json", exclude_none=True)
+        branches = trace.branches
+        main_messages = (


🟡 Medium cli/output.py:166

convert_results_for_upload sets "prompt": [] and puts branches[-1].messages (the full root-to-leaf conversation, including the original prompt messages) into "completion". This duplicates the prompt inside completion and loses the prompt/completion split, so consumers that concatenate prompt + completion or gate on both fields being non-empty will misrender or skip native-run transcripts. Consider splitting branches[-1].messages so the initial prompt messages populate "prompt" and the remaining messages populate "completion".
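
The split suggested above could look roughly like the following. This is a sketch under one stated assumption: the prompt is every message before the first assistant turn. The actual trace may record the prompt boundary differently, and `split_prompt_completion` is a hypothetical helper, not a function from this PR.

```python
from typing import Any

Message = dict[str, Any]


def split_prompt_completion(
    messages: list[Message],
) -> tuple[list[Message], list[Message]]:
    """Split a root-to-leaf conversation at the first assistant message.

    Assumption: everything before the first assistant turn is the prompt.
    """
    for index, message in enumerate(messages):
        if message.get("role") == "assistant":
            return messages[:index], messages[index:]
    # No assistant turn yet: the whole conversation is prompt.
    return list(messages), []
```

Because the two halves partition the original list, `prompt + completion` reproduces the full conversation without duplicating the prompt messages.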

macroscopeapp · 2026-06-29T14:55:07Z

+    """Read JSONL results while reporting incomplete or invalid records to the caller."""
+    results: list[dict[str, Any]] = []
+    invalid: list[InvalidResultLine] = []
+    with path.open(encoding="utf-8") as handle:


🟡 Medium cli/output.py:82

read_results opens results.jsonl with encoding="utf-8" but does not catch UnicodeDecodeError inside the loop, so a single malformed UTF-8 line (e.g. from an interrupted write of a multibyte character) raises during for line in handle and aborts the entire read. The malformed line is never appended to invalid, so one corrupted line makes the whole run unloadable. Consider opening with encoding="utf-8", errors="replace" (or wrapping the iteration in a try/except UnicodeDecodeError) so partial corruption is reported per line instead of failing the whole load.

Suggested change

-    with path.open(encoding="utf-8") as handle:
+    with path.open(encoding="utf-8", errors="replace") as handle:
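
To make the effect of the suggestion concrete, here is a self-contained sketch of a tolerant JSONL reader. It is not the PR's `read_results`: `InvalidLine` is a hypothetical stand-in for `InvalidResultLine`, and `read_jsonl_tolerant` and `demo` are illustrative names. With `errors="replace"`, an undecodable byte becomes U+FFFD instead of raising, so a truncated line fails JSON parsing and is reported on its own.

```python
import json
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Any


@dataclass
class InvalidLine:
    """Hypothetical stand-in for the PR's InvalidResultLine."""

    line_number: int
    error: str


def read_jsonl_tolerant(path: Path) -> tuple[list[dict[str, Any]], list[InvalidLine]]:
    """Read JSONL, collecting bad lines instead of aborting the whole load."""
    results: list[dict[str, Any]] = []
    invalid: list[InvalidLine] = []
    # errors="replace" maps undecodable bytes to U+FFFD rather than raising.
    with path.open(encoding="utf-8", errors="replace") as handle:
        for line_number, line in enumerate(handle, start=1):
            if not line.strip():
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError as exc:
                invalid.append(InvalidLine(line_number, str(exc)))
                continue
            if not isinstance(record, dict):
                invalid.append(InvalidLine(line_number, "record is not a JSON object"))
                continue
            results.append(record)
    return results, invalid


def demo() -> tuple[list[dict[str, Any]], list[InvalidLine]]:
    """Write a file whose second line was cut off mid-character, then read it."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "results.jsonl"
        path.write_bytes(b'{"reward": 1.0}\n{"answer": "caf\xc3\n{"reward": 0.5}\n')
        return read_jsonl_tolerant(path)
```

One caveat with this approach: if the damaged byte sits inside an otherwise complete JSON string, the line still parses and the record silently carries a replacement character, so callers that need to detect that case may prefer reading bytes and decoding each line explicitly.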

add versioned eval process protocol

638d611

xeophon mentioned this pull request Jun 24, 2026

Rewrite Prime CLI around Pydantic configs PrimeIntellect-ai/prime#760

Draft

macroscopeapp Bot reviewed Jun 24, 2026

Comment thread verifiers/v1/cli/eval/main.py Outdated

simplify eval process protocol

0cf1ecb

macroscopeapp Bot reviewed Jun 25, 2026

Comment thread verifiers/v1/cli/eval/resolver.py Outdated

xeophon added 2 commits June 25, 2026 15:07

simplify eval resolver

163b8e6

Merge remote-tracking branch 'origin/main' into codex/eval-process-protocol

dd9d5b9

macroscopeapp Bot reviewed Jun 25, 2026

Comment thread verifiers/v1/cli/eval/resolver.py Outdated

xeophon added 2 commits June 25, 2026 15:17

stabilize eval process run identity

82f52a7

Merge remote-tracking branch 'origin/codex/eval-process-protocol' into codex/eval-process-protocol (conflicts: verifiers/v1/cli/eval/main.py, verifiers/v1/cli/eval/resolver.py)

592f290

xeophon changed the title ~~[codex] add versioned eval process protocol~~ Add versioned eval process protocol Jun 25, 2026

macroscopeapp Bot reviewed Jun 25, 2026

Comment thread verifiers/v1/cli/eval/main.py Outdated

xeophon added 4 commits June 25, 2026 15:54

remove eval process tests

8561532

remove eval run sidecar

ffa575d

raise catchable eval resume errors

1ffe6c7

simplify eval to a one-pass CLI

bed41b8

xeophon changed the title ~~Add versioned eval process protocol~~ Simplify the V1 eval CLI for host delegation Jun 26, 2026

macroscopeapp Bot reviewed Jun 26, 2026

Comment thread verifiers/cli/plugins/prime.py Outdated

Simplify the CLI surface

9f5c234

xeophon changed the title ~~Simplify the V1 eval CLI for host delegation~~ Define the Verifiers CLI lifecycle surface Jun 26, 2026

Unify the Verifiers CLI lifecycle

1741600

macroscopeapp Bot reviewed Jun 26, 2026

Harden Hub source resolution

065011a

macroscopeapp Bot reviewed Jun 26, 2026

Comment thread verifiers/utils/install_utils.py Outdated

Comment thread verifiers/utils/install_utils.py Outdated

Comment thread verifiers/v1/utils/install.py Outdated

Remove platform acquisition from Verifiers

dd1f34f

macroscopeapp Bot reviewed Jun 26, 2026

Comment thread docs/overview.md Outdated

Export eval artifact upload helpers

d76d165

macroscopeapp Bot reviewed Jun 29, 2026

Comment thread verifiers/v1/cli/output.py Outdated

xeophon mentioned this pull request Jun 29, 2026

Prime CLI v1 PrimeIntellect-ai/prime#766

Open

Merge main and address PR feedback

6789718

macroscopeapp Bot reviewed Jun 29, 2026

xeophon wants to merge 16 commits into main from codex/eval-process-protocol
