30 changes: 30 additions & 0 deletions .gitleaks.toml
@@ -0,0 +1,30 @@
# gitleaks configuration for conclave.
#
# conclave is a bring-your-own-keys tool whose security tests deliberately plant
# OBVIOUSLY-FAKE, key-SHAPED strings (e.g. "sk-FAKE...", "AIza-FAKE...") to prove
# that redact() / the cache / streaming never let a real key escape. Those fake
# fixtures are not secrets, but a key-shaped literal can trip gitleaks' generic
# rules. The allowlist exempts only the test tree and the explicit fake-marker
# tokens, so a real secret committed in the source or docs is still caught.
#
# We extend gitleaks' bundled default ruleset rather than replacing it, so every
# upstream detector stays active outside the allowlisted paths.

[extend]
useDefault = true

[allowlist]
description = "Fake, key-shaped fixtures used by the key-leak regression tests are not secrets."
# Restrict the allowance to the tests directory: production code, docs, and
# config are still fully scanned.
paths = [
'''tests/.*\.py''',
]
# Defense in depth: also allow the explicit fake-key marker tokens anywhere they
# appear, so a fixture moved/quoted in a doc example is not flagged. These are
# intentionally synthetic sentinels, never real credentials.
regexes = [
'''FAKE-?[A-Za-z0-9_\-]*''',
'''sk-(test|FAKE|streamleak|CONCLAVE)[A-Za-z0-9_\-]*''',
'''AIza-?(dummy|test|FAKE)[A-Za-z0-9_\-]*''',
]
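
One way to sanity-check the three allowlist regexes above is to run them against sample strings. The sketch below uses Python's `re` module as a stand-in for gitleaks' own regex engine and assumes the patterns are applied with search semantics (a match anywhere in the candidate). Every sample string is invented for illustration.

```python
import re

# The three allowlist patterns, copied from the `regexes` list above.
ALLOWLIST = [
    re.compile(r"FAKE-?[A-Za-z0-9_\-]*"),
    re.compile(r"sk-(test|FAKE|streamleak|CONCLAVE)[A-Za-z0-9_\-]*"),
    re.compile(r"AIza-?(dummy|test|FAKE)[A-Za-z0-9_\-]*"),
]


def is_allowlisted(candidate: str) -> bool:
    """Return True if any allowlist pattern matches somewhere in the candidate."""
    return any(pattern.search(candidate) for pattern in ALLOWLIST)


# Hypothetical fixture-style strings: each carries an explicit fake marker.
fixtures = ["sk-FAKE-0000", "AIza-FAKE-abc", "sk-streamleak-1", "AIzadummy123"]
# Hypothetical key-shaped string with no marker: it should stay flagged.
unmarked = "sk-a1b2c3d4e5f6"

results = {s: is_allowlisted(s) for s in fixtures + [unmarked]}
```

Note that the first pattern matches any string containing the literal `FAKE`, so that marker has to stay out of real credentials for the allowlist to remain safe.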
149 changes: 149 additions & 0 deletions SECURITY.md
@@ -7,6 +7,155 @@ handling: a weakness that causes a real API key to be stored, logged, serialized
or echoed back to the user breaks the core trust promise. We treat reports against
that surface as the highest priority.

## Threat model

This section is the honest, current map of what conclave's key handling **does**
and **does not** defend against. It backs the headline BYO-keys claim. The threat
we model is **credential leakage**: a real provider API key escaping the process
into a result object, a log line, a serialized payload, an on-disk cache file, or
terminal output. Accuracy here is the product — we document accepted limitations
rather than overclaim.

### Trust boundary

The boundary conclave defends is **"the user's key value never leaves the in-flight
HTTPS request to the provider, except where the user themselves directs it."**

```
environment ──(read by NAME, at call time)──▶ adapter builds request
     │                                           │
     │                                 headers carry the key
     │                                           ▼
     └────────────── INSIDE the boundary ──▶ httpx → provider (TLS)
════════════ TRUST BOUNDARY ══════════════       │ response / error
redact() scrubs every error/diagnostic string conclave produces
CouncilResult / logs / cache / stdout ◀── OUTSIDE the boundary (must be key-free)
```

Inside the boundary the key value legitimately exists: in the environment, in the
local variable that reads it, and in the request headers handed to httpx. Outside
the boundary — anything conclave returns, logs, caches, serializes, or prints —
must be free of key material. Everything below is about keeping that second set
clean.
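
As a rough illustration of the boundary, the sketch below keeps only the env var name on a config record and reads the value at call time. `ProviderRef`, `resolve_key`, `build_headers`, and `EXAMPLE_API_KEY` are hypothetical names chosen for this example; they are not conclave's actual API.

```python
from __future__ import annotations

import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ProviderRef:
    """Hypothetical config record: stores the env var NAME, never the value."""

    name: str
    env_var: str


def resolve_key(ref: ProviderRef) -> str | None:
    """Read the key value from the environment at call time."""
    return os.environ.get(ref.env_var)


def build_headers(ref: ProviderRef) -> dict[str, str]:
    """Build transient request headers; the caller hands them to the transport."""
    key = resolve_key(ref)
    if key is None:
        raise RuntimeError(f"{ref.env_var} is not set")
    return {"Authorization": f"Bearer {key}"}


os.environ["EXAMPLE_API_KEY"] = "sk-FAKE-demo-value"  # obviously fake
ref = ProviderRef(name="example", env_var="EXAMPLE_API_KEY")
headers = build_headers(ref)  # inside the boundary: carries the key
summary = repr(ref)  # outside the boundary: env var name only
```

Because the record never holds the value, its `repr` can appear in logs or tracebacks without exposing the key; only the short-lived `headers` dict carries it.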

### What IS protected

- **Name-only key handling.** Keys are referenced by env var **name** in config
and code. The value is read from the environment **at call time** in
`providers._resolve_key`, used only to build the request, and **never assigned
to any object, cached field, or model**. `registry.key_present` /
`key_source` report only *whether* a var is set and its *name* — never its value.
- **`redact()` scope.** Every error/diagnostic string conclave surfaces passes
through `conclave.adapters.base.redact()` before it reaches a result field, a
log, or stdout. `redact()` scrubs, in order: (1) the live **value** of every env
var conclave knows a name for — built-in providers **and** custom-endpoint
`env_var` names declared in config (this catches a BYO key of *any* shape);
(2) `x-api-key` / `x-goog-api-key` header echoes; (3) `Authorization: Bearer …`
tokens; (4) standalone provider-shaped key tokens (`sk-…`, `xai-…`, `pplx-…`,
`AIza…`). `ProviderError` redacts **on construction**; the provider call path
redacts again at capture (belt-and-suspenders).
- **No key persistence — including the cache.** conclave never writes a key to
disk. The optional result cache (`conclave.cache`, off by default) stores only
the already-redacted `CouncilResult` (`model_dump(mode="json")`), and its cache
key is a SHA-256 over prompt + mode + member/synthesizer **names** + model ids +
  params; **no env var name or value** is read when computing it. As a result, no
  cache file, cache filename, or cache key can carry a secret.
- **Streaming path.** Streamed text deltas carry only parsed answer **content**.
A mid-stream provider error is captured and **redacted** on the final
`ModelAnswer` exactly like the buffered path; the error path emits no text
delta, so a key echoed in an error reaches neither a streamed event nor the
final answer.
- **Partial-failure isolation.** One member failing never aborts a run, and each
member's error is independently redacted, so a leak in one provider's error
response cannot smear into another member's result. The defense-in-depth
catch-alls in `Council.fan_out` and `streaming._drive_member` (which only fire on
an *unexpected* raise escaping the already-redacting provider call) also run
their exception text through `redact()`, so the "every surfaced error string is
scrubbed" invariant holds even on those paths.
- **`repr` / `str` safety.** No config, adapter, or result object stores a key, so
none can render one in a `repr`/`str` or a traceback frame that references it.
The transient request `headers` dict does carry the key (it must, to authenticate),
but it is built inside the adapter and handed straight to the transport — it is
not retained on any object.
- **Exception cause-chain hardening.** The transport raises its `TransportError`
with the cause chain dropped, so the surfaced error retains **no** reference to
the underlying httpx exception. That httpx exception's `.request.headers` holds
the live auth header; had it survived as `__cause__`/`__context__` it would be one
cause-chain hop from the surfaced error and would leak the key via
`traceback.format_exception`, `logging.exception`, or a cause-chain `repr` of a
transport error. `raise … from None` clears `__cause__` and sets
`__suppress_context__` (so no standard formatter renders the httpx exception or
its headers), and the transport additionally nulls `__context__` at a boundary so
even a direct `err.__context__` attribute walk finds no header-bearing exception.
- **CLI.** `conclave providers` prints key **presence** and the env var **name**,
never a value; `--json` serializes the same redacted `CouncilResult`.
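
The cause-chain hardening described in the list above can be sketched in plain Python. `LowLevelError` stands in for an httpx exception that holds request headers, and `TransportErrorSketch`, `guarded_send`, and `boundary_send` are hypothetical names; the real transport code is not shown in this diff.

```python
class TransportErrorSketch(Exception):
    """Hypothetical stand-in for the transport's surfaced error type."""


class LowLevelError(Exception):
    """Hypothetical stand-in for an exception that carries request headers."""

    def __init__(self, message: str, headers: dict) -> None:
        super().__init__(message)
        self.headers = headers


def send() -> None:
    raise LowLevelError("connect failed", {"Authorization": "Bearer sk-FAKE-demo"})


def guarded_send() -> None:
    try:
        send()
    except LowLevelError as exc:
        # `from None` clears __cause__ and sets __suppress_context__, but the
        # interpreter still records the handled exception in __context__.
        raise TransportErrorSketch(f"{type(exc).__name__}: {exc}") from None


def boundary_send() -> None:
    """Outer boundary: also clear __context__, which `from None` leaves set."""
    try:
        guarded_send()
    except TransportErrorSketch as err:
        err.__context__ = None
        raise


try:
    boundary_send()
except TransportErrorSketch as err:
    caught = err
```

After the boundary, neither `caught.__cause__` nor `caught.__context__` references the header-bearing exception, so an attribute walk from the surfaced error cannot reach the auth header.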

These guarantees are pinned by the regression suite in
[`tests/test_keyleak_audit.py`](tests/test_keyleak_audit.py) (one test class per
vector below), plus the redaction/cache/streaming tests in `tests/test_providers.py`,
`tests/test_cache.py`, and `tests/test_streaming.py`.
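
The four `redact()` passes listed above can be approximated as follows. This is a minimal sketch, not conclave's implementation: the regexes, the `[REDACTED]` mask, the `KNOWN_ENV_VARS` list, and the `EXAMPLE_API_KEY` name are all assumptions made for illustration.

```python
import os
import re

# Hypothetical env var names; the real list comes from providers and config.
KNOWN_ENV_VARS = ["EXAMPLE_API_KEY"]
MASK = "[REDACTED]"

HEADER_ECHO = re.compile(r"(?i)\b(x-api-key|x-goog-api-key)(\s*[:=]\s*)\S+")
BEARER = re.compile(r"(?i)\b(Authorization\s*:\s*Bearer\s+)\S+")
KEY_SHAPES = re.compile(r"\b(?:sk-|xai-|pplx-|AIza)[A-Za-z0-9_\-]{8,}")


def redact_sketch(text: str) -> str:
    """Apply the four passes in the order listed above."""
    # (1) Value pass: mask the live value of every known env var.
    for name in KNOWN_ENV_VARS:
        value = os.environ.get(name)
        if value:
            text = text.replace(value, MASK)
    # (2) Header echoes such as "x-api-key: <value>".
    text = HEADER_ECHO.sub(lambda m: m.group(1) + m.group(2) + MASK, text)
    # (3) "Authorization: Bearer <token>".
    text = BEARER.sub(lambda m: m.group(1) + MASK, text)
    # (4) Standalone provider-shaped tokens.
    return KEY_SHAPES.sub(MASK, text)


os.environ["EXAMPLE_API_KEY"] = "custom-FAKE-shape-123"  # obviously fake
sample = "\n".join(
    [
        "401 invalid key custom-FAKE-shape-123",
        "x-api-key: abc123",
        "Authorization: Bearer tok123",
        "see sk-FAKE12345678",
    ]
)
cleaned = redact_sketch(sample)
```

Running the value pass first is what lets a key of any shape be caught; the later passes only recognise fixed header and token formats.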

### What redact() does NOT cover — accepted limitations

`redact()` is a defense for the strings **conclave itself** produces. It is not a
universal egress filter, and we do not claim it is. Known gaps, accepted for 1.0:

- **httpx / httpcore DEBUG logging (out of band — guarded by default).** httpx and
httpcore have their own loggers. At **DEBUG** level httpcore logs full request
headers, including the `Authorization` / `x-api-key` value, to whatever handler
the host application configured. This bypasses `redact()` entirely — it never
sees those records. The guard against it is now **default-on, opt-out**:
constructing a `Council` automatically calls `conclave.guard_transport_logging()`,
which installs a filter that drops httpx/httpcore **DEBUG** records (the only
level that emits header content) while leaving INFO+ diagnostics intact. The
guard is scoped to the `httpx`/`httpcore` loggers only — it never touches the
host application's root logger or any other logger. Opt out with
`Council(…, allow_transport_debug_logging=True)` for the rare case where you
need that DEBUG band in a process that holds no real keys; you remain responsible
for it then. Consumers using the provider functions directly **without** a
`Council` can install the same guard by calling `conclave.guard_transport_logging()`
once at startup (it is idempotent). Either way, the standing guidance remains:
do not enable httpx/httpcore DEBUG logging in a process that holds real provider
keys (e.g. avoid `logging.basicConfig(level=logging.DEBUG)` process-wide).
- **Partial / URL-encoded / transformed key fragments.** `redact()` masks the
exact env-var value and a fixed set of known key *shapes*. It does **not** catch
a key that a provider has split, truncated, URL-encoded, base64-wrapped, or
otherwise transformed before echoing it back, **unless** that transformed form
still equals the live env-var value (the value-based pass) or matches a known
shape. A novel provider error that leaks `<first-12-chars>…` of a key, or a
  percent-encoded form, can slip past the pattern pass. The value-based pass is the
primary defense; the shape patterns are best-effort secondary.
- **Anything the user explicitly logs or prints.** If a consumer reads a key from
the environment themselves, or logs/prints the request headers, the raw
`os.environ`, or their own constructed Authorization header, that is outside
conclave's control. conclave only governs the strings it returns and logs.
- **The in-flight request and the provider side.** The key is, by necessity,
present in the request headers and transmitted to the provider over TLS. What the
provider does with it (its logs, its breach posture) is outside scope. Memory
inspection of the running process (a local attacker with debugger access) is also
out of scope — the env var value is in process memory by design.
- **Dependencies.** Vulnerabilities in httpx, pydantic, typer, pyyaml, or rich
themselves are upstream; report conclave's *use* of them to us if exploitable,
but the libraries' own CVEs belong upstream. CI runs gitleaks on every push.

### Key-leak vector map (what a reviewer probes day 1)

| # | Vector | Risk | Status |
|---|--------|------|--------|
| 1 | Cache write path ordering | HIGH if pre-redaction | **Protected** — cache stores only the redacted `CouncilResult`; key never in file/filename/key. Test: V1. |
| 2 | Streaming chunk path | MED | **Protected** — deltas are answer content; mid-stream errors redacted on the final answer, never streamed. Test: V2. |
| 3 | config/transport `__repr__` in tracebacks | MED | **Protected** — no object stores a key; transient headers are not retained. Test: V3. |
| 4 | Provider 400/422 echoing request fragments | MED | **Protected** — error capture runs through `redact()` (and `ProviderError` redacts on construction). Test: V4. |
| 5 | httpx/httpcore DEBUG logging | HIGH (bypasses redact) | **Default-on guard (Council installs it) + opt-out** — `Council.__init__` calls `guard_transport_logging()` automatically; opt out with `allow_transport_debug_logging=True`. Tests: V5, V9. |
| 6 | redact() misses URL-encoded / partial fragments | MED | **Accepted limitation** — documented above; value-based pass is primary, shape patterns best-effort. |
| 7 | Test fixtures with key-shaped strings | LOW | **Protected** — all fixtures use obviously-fake `…FAKE…` patterns; `.gitleaks.toml` allowlists the test tree only. |
| 8 | Partial-failure catch-all error construction (audit-found; not in original map) | LOW | **Protected** — `fan_out` / `_drive_member` catch-alls now `redact()` the raw exception text too. Test: V7. |
| 9 | TransportError cause chain retaining the httpx exception (header-bearing `.request`) | HIGH (leaks via traceback/`logging.exception`) | **Protected** — transport raises `… from None`; surfaced error keeps no `__cause__`/`__context__` ref to the httpx exception, so its auth header cannot leak via traceback/cause-chain repr. Test: V8. |

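The accepted limitation in vector 6 is easy to reproduce. The sketch below applies only an exact value-based pass (a simplified stand-in for the first `redact()` pass) to three hypothetical provider echoes; the env var name and key value are made up.

```python
import os
from urllib.parse import quote

os.environ["EXAMPLE_API_KEY"] = "key/FAKE+demo=value"  # obviously fake, odd shape


def value_pass(text: str) -> str:
    """Value-based pass only: mask exact occurrences of the live env var value."""
    value = os.environ["EXAMPLE_API_KEY"]
    return text.replace(value, "[REDACTED]")


key = os.environ["EXAMPLE_API_KEY"]
exact_echo = value_pass(f"bad key: {key}")
encoded_echo = value_pass(f"bad key: {quote(key, safe='')}")
truncated_echo = value_pass(f"bad key: {key[:12]}...")
```

The exact echo is masked, while the percent-encoded and truncated echoes pass through unchanged because neither contains the full raw value.
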
## Reporting a vulnerability

**Do not open a public GitHub issue, pull request, or discussion for a security
3 changes: 2 additions & 1 deletion src/conclave/__init__.py
@@ -49,7 +49,7 @@
     StreamEvent,
     TokenUsage,
 )
-from .transport import aclose
+from .transport import aclose, guard_transport_logging

 __version__ = "0.3.0"

@@ -64,5 +64,6 @@
     "ConclaveConfig",
     "load_config",
     "aclose",
+    "guard_transport_logging",
     "__version__",
 ]
11 changes: 11 additions & 0 deletions src/conclave/cache.py
@@ -195,6 +195,17 @@ def store(key: str, result: CouncilResult) -> None:
     try:
         path = _entry_path(key)
         path.parent.mkdir(parents=True, exist_ok=True)
+        # KEY-LEAK INVARIANT (audit vector 1): the cache only ever persists a
+        # CouncilResult that has ALREADY passed through redaction upstream. Every
+        # error string on the result (ModelAnswer.error, synthesis_error,
+        # verdict_error) is scrubbed by redact() at the point of capture in
+        # conclave.providers, BEFORE it is placed on the result and therefore long
+        # before it reaches this write. Member/synthesis answer TEXT is provider
+        # content, never key material. The cache KEY (make_key) is composed solely
+        # of prompt + mode + member/synthesizer NAMES + model ids + params -- no
+        # env var name or value is read here. Net: no env var name or key value can
+        # reach a cache file or filename. Do not place any un-redacted capture on the
+        # result: that breaks this contract and would persist a secret to disk.
         payload = result.model_dump(mode="json")
         payload["cached"] = False
         # Atomic-ish write: write to a temp sibling then replace, so a crash mid
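
For readers following the invariant in the comment above, a cache key built only from non-secret inputs might be derived like this. The canonical-JSON layout and the `make_key_sketch` name are assumptions for this example; the real `make_key` is not shown in this diff.

```python
from __future__ import annotations

import hashlib
import json


def make_key_sketch(
    prompt: str,
    mode: str,
    members: list[str],
    synthesizer: str | None,
    model_ids: list[str],
    params: dict[str, float],
) -> str:
    """Hash only names, ids, and params; no env var is read anywhere here."""
    payload = {
        "prompt": prompt,
        "mode": mode,
        "members": members,
        "synthesizer": synthesizer,
        "model_ids": model_ids,
        "params": params,
    }
    # Sorted keys and fixed separators make the serialization deterministic.
    blob = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(blob.encode("utf-8")).hexdigest()


args = ("hello", "ask", ["grok", "perplexity"], "claude", ["m1", "m2"], {"temperature": 0.7})
key_a = make_key_sketch(*args)
key_b = make_key_sketch(*args)
key_c = make_key_sketch("different prompt", *args[1:])
```

Since the function takes no key material as input and never touches `os.environ`, the resulting digest (and any filename derived from it) cannot encode a secret.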
40 changes: 32 additions & 8 deletions src/conclave/council.py
@@ -17,6 +17,7 @@

 from . import cache as cache_mod
 from . import transport
+from .adapters.base import redact
 from .config import ConclaveConfig, load_config
 from .logging import get_logger
 from .models import CouncilResult, ModelAnswer, StreamEvent
@@ -55,6 +56,20 @@ class Council:
             identical repeat run is served from the on-disk cache instead of
             re-calling the providers. The cache never persists API keys --
             see :mod:`conclave.cache`.
+        allow_transport_debug_logging: Opt **out** of the transport-logging guard.
+            Defaults to ``False``, which means the guard is **ON**: constructing a
+            ``Council`` installs :func:`conclave.transport.guard_transport_logging`
+            so httpx/httpcore ``DEBUG`` records -- the only band that emits request
+            headers, including the live ``Authorization``/``x-api-key`` value -- are
+            dropped before any handler formats them (key-leak audit, RANK 6). The
+            guard is idempotent, so constructing many councils installs it once. The
+            filter is scoped to the ``httpx``/``httpcore`` loggers only; it never
+            touches the host application's root logger or any other logger.
+            Set ``True`` to skip installation for the rare case where you genuinely
+            need httpx/httpcore ``DEBUG`` output in a process that does not hold real
+            keys; you remain responsible for that band then. Consumers using the
+            provider functions directly (without a ``Council``) can still call
+            :func:`conclave.guard_transport_logging` themselves.

     Example:
         >>> council = Council(models=["grok", "perplexity"], synthesizer="claude")
@@ -70,6 +85,7 @@ def __init__(
         temperature: float = 0.7,
         timeout: float = 120.0,
         cache: bool | None = None,
+        allow_transport_debug_logging: bool = False,
     ) -> None:
         self.config = config or load_config()
         self.requested_models = list(models)
@@ -78,6 +94,15 @@ def __init__(
         self.timeout = timeout
         # Explicit override wins; otherwise defer to config (off by default).
         self.cache_enabled = self.config.cache if cache is None else cache
+        # Default-on transport-logging guard (key-leak audit, RANK 6): drop
+        # httpx/httpcore DEBUG records (the only band that emits the auth header)
+        # so a process holding a real key cannot leak it via verbose transport
+        # logging, even if the host enables DEBUG app-wide. Idempotent, so many
+        # councils install it once; scoped to the httpx/httpcore loggers only.
+        # ``allow_transport_debug_logging=True`` opts out for callers who need
+        # that DEBUG band and accept the responsibility.
+        if not allow_transport_debug_logging:
+            transport.guard_transport_logging()

     def _available_members(self) -> tuple[list[tuple[str, str]], list[str]]:
         """Partition requested members into (available, skipped-for-no-key).
@@ -205,14 +230,13 @@ async def fan_out(
             if isinstance(outcome, ModelAnswer):
                 answers.append(outcome)
             else:
-                logger.warning("%s raised unexpectedly: %s", name, outcome)
-                answers.append(
-                    ModelAnswer(
-                        name=name,
-                        model_id=model_id,
-                        error=f"{type(outcome).__name__}: {outcome}",
-                    )
-                )
+                # call_model already redacts and never raises, so this arm only
+                # fires on an UNEXPECTED escape. Redact the exception text anyway:
+                # the invariant "every error string conclave surfaces is scrubbed"
+                # must hold even on this defense-in-depth path (key-leak audit).
+                message = redact(f"{type(outcome).__name__}: {outcome}")
+                logger.warning("%s raised unexpectedly: %s", name, message)
+                answers.append(ModelAnswer(name=name, model_id=model_id, error=message))
         return answers

     async def ask(self, prompt: str, synthesize: bool = True) -> CouncilResult:
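
The catch-all arm in the hunk above follows a common pattern: gather member calls with exceptions returned as values, then scrub any exception text before it lands on a result. The sketch below assumes `asyncio.gather(..., return_exceptions=True)` as the fan-out mechanism (the diff does not show it), and `scrub` and `call_member` are hypothetical stand-ins for `redact()` and the provider call.

```python
import asyncio


def scrub(text: str) -> str:
    """Hypothetical stand-in for redact(): masks one known fake value."""
    return text.replace("sk-FAKE-demo", "[REDACTED]")


async def call_member(name: str) -> dict:
    """Pretend provider call; the member named 'broken' raises unexpectedly."""
    if name == "broken":
        raise RuntimeError("auth failed for sk-FAKE-demo")
    return {"name": name, "text": f"answer from {name}"}


async def fan_out_sketch(names: list) -> list:
    # return_exceptions=True turns a raise into a value, so one failing
    # member cannot abort the others.
    outcomes = await asyncio.gather(
        *(call_member(n) for n in names), return_exceptions=True
    )
    answers = []
    for name, outcome in zip(names, outcomes):
        if isinstance(outcome, BaseException):
            # Scrub first, then use the scrubbed text everywhere.
            message = scrub(f"{type(outcome).__name__}: {outcome}")
            answers.append({"name": name, "error": message})
        else:
            answers.append(outcome)
    return answers


answers = asyncio.run(fan_out_sketch(["ok", "broken"]))
```

Building the scrubbed `message` once and reusing it for both the log line and the result, as the new code in the hunk does, avoids the old pattern where the log call received the raw exception.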
14 changes: 8 additions & 6 deletions src/conclave/streaming.py
@@ -31,6 +31,7 @@
 from collections.abc import AsyncIterator
 from typing import TYPE_CHECKING

+from .adapters.base import redact
 from .logging import get_logger
 from .models import CouncilResult, ModelAnswer, StreamEvent
 from .providers import call_model_stream
@@ -81,15 +82,16 @@ async def _drive_member(
             )
         )
     except Exception as exc:  # noqa: BLE001 -- a member must never wedge the run
-        logger.warning("%s streaming raised unexpectedly: %s", name, exc)
+        # call_model_stream already redacts and never raises, so this arm only
+        # fires on an UNEXPECTED escape. Redact the exception text anyway so the
+        # "every surfaced error string is scrubbed" invariant holds even on this
+        # defense-in-depth path (key-leak audit, vector 2).
+        message = redact(f"{type(exc).__name__}: {exc}")
+        logger.warning("%s streaming raised unexpectedly: %s", name, message)
         await queue.put(
             (
                 "answer",
-                ModelAnswer(
-                    name=name,
-                    model_id=model_id,
-                    error=f"{type(exc).__name__}: {exc}",
-                ),
+                ModelAnswer(name=name, model_id=model_id, error=message),
             )
         )
     finally: