docs(synthesizer): document + version + test synthesizer behavior (v1.0 #5) #43

Closed
ernestprovo23 wants to merge 2 commits into main from feat/v1-synthesizer-behavior

@ernestprovo23

Member

PR-C of conclave v1.0 — document + test synthesizer behavior

Closes readiness must-do #5 (~/.claude/sdlc/conclave-v1/00_READINESS_REVIEW.md): the synthesizer path is the heart of the "council" value prop but was undocumented and lightly tested, risking silent degradation.

Investigation — actual current behavior (file:line)

  • Which model synthesizes: Council.synthesizer = synthesizer or config.synthesizer (council.py:76); CLI --synthesizer/-s (cli.py:245); config synthesizer: (config.py:80/191); default claude = anthropic/claude-sonnet-4-6 (registry.py:138). Same model is the judge (adversarial, modes.py:421) and debate consolidator (modes.py:237).
  • Failure / unkeyed path: already OBSERVABLE, not silent. No usable answers (council.py:364), synthesizer unkeyed (council.py:372), and synthesizer call failure (council.py:390) each set CouncilResult.synthesis_error; synthesis stays None and member answers are preserved. The adversarial judge mirrors to AdversarialResult.verdict_error<->synthesis_error (modes.py:431/454/344). No silent quiet-concat path existed, so this was confirmed and pinned with tests rather than "fixed."
  • Prompt: _SYNTH_SYSTEM constant (council.py), reused by streaming (streaming.py:189); debate/judge prompts in prompts.py. Was not versioned → addressed here.
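
The selection precedence and the three degraded paths described above can be sketched in a few lines. This is a minimal illustration and not conclave's code: `SketchResult`, `pick_synthesizer`, `synthesize`, and the `has_key` / `call` callables are hypothetical names, and the error strings are invented. Only the default name "claude" and the `synthesis` / `synthesis_error` fields come from the PR text.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

DEFAULT_SYNTHESIZER = "claude"  # built-in default named in the PR


@dataclass
class SketchResult:
    """Hypothetical stand-in for CouncilResult (only the fields discussed here)."""
    answers: Dict[str, str]
    synthesis: Optional[str] = None
    synthesis_error: Optional[str] = None


def pick_synthesizer(arg: Optional[str] = None,
                     config_value: Optional[str] = None) -> str:
    """Explicit argument wins, then the config value, then the default."""
    return arg or config_value or DEFAULT_SYNTHESIZER


def synthesize(answers: Dict[str, str],
               synthesizer: str,
               has_key: Callable[[str], bool],
               call: Callable[[str, Dict[str, str]], str]) -> SketchResult:
    """Every degraded path sets synthesis_error and keeps the member answers."""
    result = SketchResult(answers=dict(answers))
    usable = {name: text for name, text in answers.items() if text}
    if not usable:
        result.synthesis_error = "no usable answers"  # degraded path 1
        return result
    if not has_key(synthesizer):
        result.synthesis_error = f"synthesizer {synthesizer!r} has no API key"  # path 2
        return result
    try:
        result.synthesis = call(synthesizer, usable)
    except Exception as exc:  # degraded path 3: the synthesizer call itself failed
        result.synthesis_error = f"synthesizer call failed: {exc}"
    return result


def failing_call(model: str, usable: Dict[str, str]) -> str:
    raise RuntimeError("boom")


result = synthesize({"gpt": "42"}, pick_synthesizer(), lambda m: True, failing_call)
```

In this sketch a caller can tell a degraded run from a successful one by checking `synthesis_error`, while `answers` stays intact in both cases.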

What changed

  1. Versioned synthesis prompt: conclave.prompts.SYNTHESIS_PROMPT_VERSION ("2026-06-14"), re-exported from council and stamped onto every CouncilResult as prompt_version (a lazy default_factory avoids the prompts<->models import cycle). The prompt text is byte-stable; the constant and text are pinned so a prompt change without a version bump fails CI.
  2. Documentation: module + _synthesize docstrings (selection precedence, the three degraded paths, versioning) and a README "Synthesizer behavior" section. Doc index: new test row + CouncilResult field note + changelog row.
  3. Confirmed observability: no behavior change to the degraded path; regression tests assert the signal is present.
  4. tests/test_synthesizer.py (21 tests): (a) default selection, (b) arg/config override, (c) CLI --synthesizer override, (d) degraded path signaled for synthesize + debate + adversarial judge (unkeyed + call-failure), (e) prompt-version stability across every mode. Mocks at the existing httpx-transport boundary (offline).

Invariants

  • No non-synthesis behavior changed; happy-path synthesis output is byte-for-byte unchanged (additive prompt_version metadata only).
  • ruff check + ruff format --check clean; coverage 89.39% (CI floor 75%); 210 passed.
  • The 2 test_logging.py failures seen locally are a pre-existing caplog handler-isolation artifact (byte-identical to main, green in CI), unrelated to this PR.

🤖 Generated with Claude Code

#5)

The synthesizer/judge path is the heart of conclave's "council" value prop but
was undocumented and lightly tested, risking silent degradation. This makes it
sound and observable for 1.0.

Investigation (current behavior, unchanged):
- synthesizer = constructor arg, else config `synthesizer:`, else built-in
  default "claude" (registry.DEFAULT_SYNTHESIZER). Same model judges in
  adversarial and consolidates in debate.
- degraded paths were ALREADY observable, not silent: no-usable-answers,
  unkeyed synthesizer, and synthesizer-call-failure each set
  CouncilResult.synthesis_error (adversarial: AdversarialResult.verdict_error,
  mirrored). synthesis stays None; member answers preserved. No silent
  quiet-concat path existed to fix -- confirmed + pinned with tests.

Changes:
- version the synthesis prompt set: new conclave.prompts.SYNTHESIS_PROMPT_VERSION,
  re-exported from council, stamped onto every CouncilResult as `prompt_version`
  (lazy default_factory avoids the prompts<->models import cycle). Prompt text is
  byte-stable; the constant + text are pinned so a prompt change without a version
  bump fails CI.
- document selection/default/configurability/fallback in the council module
  docstring + _synthesize docstring, and a README "Synthesizer behavior" section.
- DOCUMENTATION_INDEX: new test file row, CouncilResult field note, changelog row.
- tests/test_synthesizer.py (21 tests): default + arg/config/CLI override
  selection; observable degradation for synthesize, debate, and adversarial judge
  (unkeyed + call-failure); prompt-version stability across every mode.

No non-synthesis behavior changed; happy-path synthesis output is byte-for-byte
unchanged. Mocks at the existing httpx-transport boundary (offline).
… capture

CI on this branch resolved pytest 9.1.0 (deps are constrained to >=8.0.0, not pinned;
main's last green run predates the 9.1.0 release). pytest 9.x attaches its LogCaptureHandler
(a StreamHandler subclass) directly to the non-propagating `conclave` logger
during a run, so `len(logger.handlers) == 1` now sees 3 handlers and
test_logging.py's one-shot-configuration assertions fail across 3.11/3.12/3.13.

Count only conclave's own handler via `type(h) is logging.StreamHandler`
(pytest's is a subclass) instead of all handlers. This preserves the test's
intent exactly -- the factory installs one StreamHandler and never duplicates it
-- while ignoring pytest-injected capture handlers, and is stable across pytest
versions. No production code changed.
@ernestprovo23

Member Author

Added a second commit (e0cf138): CI on this branch resolved pytest 9.1.0 (deps are >=8.0.0; main's last green run predates 9.1.0). pytest 9.x attaches its LogCaptureHandler (a StreamHandler subclass) to the non-propagating conclave logger, so test_logging.py's len(handlers) == 1 assertions saw 3 handlers and failed across all 3 Python versions. Fix counts only conclave's own handler via type(h) is logging.StreamHandler (pytest's is a subclass) — intent-preserving, no production code changed. This is a pre-existing latent fragility surfaced (not caused) by the pytest upgrade.
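
The exact-type check can be demonstrated with the standard library alone. This is a sketch and not the actual test_logging.py change: `CaptureLikeHandler` is a hypothetical stand-in for a capture handler that subclasses StreamHandler, and the logger name and `own_stream_handlers` helper are invented for the demo.

```python
import logging


class CaptureLikeHandler(logging.StreamHandler):
    """Hypothetical stand-in for an injected capture handler (a StreamHandler subclass)."""


def own_stream_handlers(logger: logging.Logger) -> list:
    # `type(h) is ...` matches the exact class only; isinstance() would
    # also match subclasses such as the capture handlers above.
    return [h for h in logger.handlers if type(h) is logging.StreamHandler]


logger = logging.getLogger("sketch_conclave_demo")  # demo name, not the real logger
logger.propagate = False
logger.handlers.clear()
logger.addHandler(logging.StreamHandler())  # the one handler the factory installs
logger.addHandler(CaptureLikeHandler())     # simulated injected handler
logger.addHandler(CaptureLikeHandler())     # simulated injected handler

total = len(logger.handlers)
own = len(own_stream_handlers(logger))
```

Here `total` counts all three handlers while `own` counts only the plain StreamHandler. Note that the exact-type check also skips other StreamHandler subclasses such as logging.FileHandler, which is fine as long as the factory under test installs a plain StreamHandler.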

@ernestprovo23

Member Author

Superseded by #45 (integrated into the v1.0.0 release commit).
