From da35e542097c530d080a8e50dddd00dbf7cfeee1 Mon Sep 17 00:00:00 2001
From: Beatrice Marti <mabesa@users.noreply.github.com>
Date: Sat, 13 Jun 2026 08:01:22 +0200
Subject: [PATCH 01/16] =?UTF-8?q?docs:=20Phase=200=20decision=20pack=20?=
 =?UTF-8?q?=E2=80=94=20resolve=20Nepal=20v1=20design=20questions,=20add=20?=
 =?UTF-8?q?FI=E2=86=94SAP3=20mapping?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Settle the open design questions where SAP3 behaviour already determines the
answer, and route the rest to the model developer as a decision-ready list.

- open_design_questions.md: split into resolved decisions (station-keyed output
  per Option a, target declaration, state-free, spatial vocab → SpatialRepresentation,
  quantile floor) and open developer questions. Correct two stale premises:
  forecast_horizon IS consumed by the adapter; the issue_datetime column is renamed
  to valid_time, not dropped.
- fi-sap3-mapping.md (new): authoritative FI↔SAP3 adapter-boundary contract —
  governance, the two adapter paths, output/input mapping tables, station identity,
  state bridge, and artifact-metadata ownership split.
- model_interface.md: replace the "for later integration" stub with the Training &
  Lifecycle Protocol target spec (unified ForecastModel, ArtifactScope, train/retrain/
  serialize/deserialize, rng injection, TrainedArtifact, station-keyed ModelOutput).
- nepal-model-requirements.md: record developer answers — eastern group ships first
  (GROUP-scoped), SnowMapper starts with SWE + ROF, artifact transfer is east→west.

Also wire up the mandated bump-my-version workflow (was unconfigured) and add
__version__ to the package. Patch bump 0.1.1 → 0.1.2.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 docs/fi-sap3-mapping.md          | 321 +++++++++++++++++++++++++++++++
 docs/model_interface.md          |  97 ++++++++--
 docs/nepal-model-requirements.md | 251 ++++++++++++++++++++++++
 docs/open_design_questions.md    | 137 ++++++-------
 forecast_interface/__init__.py   |   2 +
 pyproject.toml                   |  19 +-
 uv.lock                          | 240 ++++++++++++++++++++++-
 7 files changed, 979 insertions(+), 88 deletions(-)
 create mode 100644 docs/fi-sap3-mapping.md
 create mode 100644 docs/nepal-model-requirements.md

diff --git a/docs/fi-sap3-mapping.md b/docs/fi-sap3-mapping.md
new file mode 100644
index 0000000..7848fef
--- /dev/null
+++ b/docs/fi-sap3-mapping.md
@@ -0,0 +1,321 @@
+# ForecastInterface ↔ SAPPHIRE Flow (SAP3) Adapter-Boundary Mapping
+
+> **Status:** Forward contract (Phase 0, docs-only). No adapter exists in SAP3 yet.
+> **FI side:** authoritative for output types; co-designs input types; owns the model protocol.
+> **Companion document:** SAP3 `docs/plans/archive/014-forecast-interface-adapter-design.md`
+> (referred to below as **doc 014**). This document is the **FI-side** formalization of doc 014.
+
+This is the single authoritative place that says, type by type, how a ForecastInterface
+(FI) model maps onto SAP3's `StationForecastModel` / `GroupForecastModel` contract across
+the planned `ForecastInterfaceAdapter` boundary.
+
+File references use `path:line`. FI paths are relative to this repo
+(`/Users/bea/Documents/GitHub/ForecastInterface`); SAP3 paths are relative to the sibling
+repo (`/Users/bea/Documents/GitHub/SAPPHIRE_flow`).
+
+---
+
+## 1. Purpose & governance
+
+FI is the **model-author-facing contract**: a model developer implements `ForecastModel`
+(`forecast_interface/interface/protocol.py:10`), declares its `InputRequirement`
+(`forecast_interface/input/requirement.py:35`), and returns a `ModelResult`
+(`forecast_interface/interface/result.py:40`) wrapping a `ModelOutput`
+(`forecast_interface/output/model_output.py:9`). FI knows nothing about stations, groups,
+batching, stacking, QC, alerts, or artifact storage.
+
+SAP3 is the **operational system**. It plans to wrap FI models behind a thin
+`ForecastInterfaceAdapter` (doc 014 §A Task 1, lines 157–240; naming per doc 014 line 343)
+that translates between FI's contract and SAP3's `StationForecastModel` /
+`GroupForecastModel` protocols (`src/sapphire_flow/protocols/forecast_model.py:23,49`).
+
+### Governance split (doc 014 lines 300–305)
+
+| Concern | Authority | Mechanism |
+|---|---|---|
+| FI **output** types (`ModelOutput`, `VariableOutput`, data containers, enums) | **FI** | Stable & test-locked; SAP3 adapts to them. |
+| FI **input** types (`InputRequirement` & friends) | **Co-designed** | SAP3 contributes via a SAP3→FI PR (doc 014 Task 3, lines 257–283). |
+| FI **interface / model Protocol** (`ForecastModel`) | **FI** | FI-owned; SAP3 wraps thin (doc 014 Tasks 4–5, lines 287–305). |
+
+### Current state of the adapter (verified)
+
+As of this writing there is **no adapter implemented in SAP3**:
+
+- No import of the FI package anywhere in `SAPPHIRE_flow/src/`.
+- No `ForecastInterfaceAdapter` class.
+- No use of FI's `ModelOutput`. The only `ModelOutput`-named symbol in SAP3 src is
+  `ModelOutputError` (`src/sapphire_flow/exceptions.py:17`) — the *planned* error subclass,
+  not a use of FI's output type.
+
+This document therefore describes a **forward contract**, not an implemented one. Everything
+below states what the adapter *must* do when built; it is not a description of running code.
+
+---
+
+## 2. The two adapter paths
+
+The adapter boundary sits between SAP3's assembly layer and the FI model. SAP3 selects a
+path by `artifact_scope` (doc 014 lines 114–127).
+
+```
+STATION path (v0b — ships first):
+  ModelDataRequirements → assembly → StationModelInputs → adapter → FI input
+    → FI model → ModelOutput → adapter → tuple[dict[str, ForecastEnsemble], bytes | None]
+        (SAP3)                 (boundary)    (FI)        (boundary)        (SAP3)
+
+GROUP path (v1 — multi-station):
+  ModelDataRequirements → assembly → GroupModelInputs → adapter → FI input
+    → FI model → ModelOutput → adapter → dict[StationId, tuple[dict[str, ForecastEnsemble], bytes | None]]
+        (SAP3)                 (boundary)    (FI)        (boundary)            (SAP3)
+```
+
+- **v0b ships the STATION path only.** FI-wrapped models implement
+  `StationForecastModel` (`src/sapphire_flow/protocols/forecast_model.py:23`). Its
+  `predict()` returns `tuple[dict[str, ForecastEnsemble], bytes | None]` (lines 32–39).
+- **GROUP / multi-station is v1.** `GroupForecastModel.predict_batch()` must return
+  `dict[StationId, tuple[...]]` (`src/sapphire_flow/protocols/forecast_model.py:58–63`).
+  Decomposing a single `ModelOutput` per station requires station-keyed FI output — see §5.
+
+---
+
+## 3. Output mapping (FI → SAP3)
+
+FI's output is authoritative; SAP3 adapts. This table grounds and supersedes doc 014's
+divergence table (lines 138–151) on the FI side.
+
+| FI type / field (`path:line`) | SAP3 target (`path:line`) | Adapter responsibility |
+|---|---|---|
+| `ModelOutput` (`output/model_output.py:9`) | `tuple[dict[str, ForecastEnsemble], bytes \| None]` (`protocols/forecast_model.py:38`) | Convert whole container → forecast dict + state bytes. |
+| `VariableOutput.deterministic` / `.quantiles` / `.trajectories` (`output/variable_output.py:109–111`) | `ForecastEnsemble` (`types/ensemble.py:18`) | Pick whichever is populated; route to the matching factory. |
+| `TrajectoryData` (`output/variable_output.py:64`) | `ForecastEnsemble.from_members()` (`types/ensemble.py:39`) → `MEMBERS` | Reshape member columns → `member_id`/`value`; ≥1 member (SAP3 lines 60–62). |
+| `QuantileData` (`output/variable_output.py:33`) | `ForecastEnsemble.from_quantiles()` (`types/ensemble.py:76`) → `QUANTILES` | Reshape quantile columns → `quantile`/`value`. **See operational gap below.** |
+| `DeterministicData` (`output/variable_output.py:20`) | single-member `MEMBERS` ensemble (`types/ensemble.py:39`) | Wrap the single `value` column as `member_id=1`. The result is flagged `insufficient_ensemble_size` and skips operational alert thresholds (doc 014 lines 190–196). |
+| `EpistemicUncertaintyData` (`output/variable_output.py:90`) | — (no SAP3 target) | **Dropped at the boundary in v0b** (FI-only; doc 014 lines 197–204). Revisit if models emit it. |
+| `VariableOutput.flags: frozenset[ForecastFlag]` (`output/variable_output.py:113`; enum `output/flags.py:4`) | `QcFlag.rule_id` strings | Map to `fi_*` rule ids (table below). |
+| `VariableStatus` (`output/status.py:4`) | `QcStatus` (`types/enums.py:4`) | Map per table below. |
+| `VariableMetadata.unit: Unit` (`output/metadata.py:11`; enum `common/units.py:4`) | `ForecastEnsemble.units: str` (`types/ensemble.py:24`) | Map `Unit` enum → SAP3 canonical unit string (table below). |
+| `ModelOutput.issue_datetime` (`output/model_output.py:13`) | `ForecastEnsemble.issued_at: UtcDatetime` (`types/ensemble.py:22`) | Apply `ensure_utc()`. |
+| per-row `datetime` column (all FI data containers) | `valid_time` column (SAP3 factories require it: `types/ensemble.py:54,91`) | Rename `datetime` → `valid_time`. |
+| `VariableMetadata.forecast_horizon: int` (`output/metadata.py:14`) | `ForecastEnsemble.forecast_horizon_steps: int` (`types/ensemble.py:25`) | **DIRECT** — both int, both step counts. **`forecast_horizon` IS consumed by the adapter** (corrects any prior "never consumed" belief). See note below. |
+| `VariableMetadata.timedelta: timedelta` (`output/metadata.py:13`) | `ForecastEnsemble.time_step: timedelta` (`types/ensemble.py:23`) | **DIRECT** assignment. |
+| `VariableMetadata.resolution: TemporalResolution` (`output/metadata.py:12`; enum `common/resolutions.py:4`) | — (no direct target) | Categorical label only; **cross-validate** against `timedelta`, never the conversion source. |
+| `ModelOutput.variables` key / `VariableMetadata.name` (`output/model_output.py:14`, `output/metadata.py:10`) | `ForecastEnsemble.parameter: str` (`types/ensemble.py:23`) | Validate against `ForecastParameter = Literal["discharge","water_level"]` and `ModelDataRequirements.target_parameters` (`types/model.py:261`). |
+| empty `ModelOutput.variables` **or** all-`FAILURE` | `ModelOutputError` (`exceptions.py:17`) | Adapter **raises** — zero usable ensembles (doc 014 lines 160–168, 218–223). |
+
+### Status & flag mapping
+
+`VariableStatus` → `QcStatus` (doc 014 lines 247–248):
+
+| FI `VariableStatus` | SAP3 `QcStatus` (`types/enums.py:4`) | Note |
+|---|---|---|
+| `SUCCESS` | `QC_PASSED` (`"qc_passed"`) | — |
+| `FAILURE` | `QC_FAILED` (`"qc_failed"`) | If **all** variables fail → raise `ModelOutputError` instead. |
+| `PARTIAL` | `QC_SUSPECT` (`"qc_suspect"`) + flag | No exact equivalent; attach `fi_partial_output`. |
+
+`ForecastFlag` → `QcFlag.rule_id` (doc 014 lines 250–253; `fi_` prefix marks FI-origin):
+
+| FI flag (`output/flags.py:4`) / status | SAP3 `QcFlag.rule_id` |
+|---|---|
+| `VariableStatus.PARTIAL` | `fi_partial_output` |
+| `ForecastFlag.HIGH_EPISTEMIC_UNCERTAINTY` | `fi_high_epistemic_uncertainty` |
+| `ForecastFlag.DATA_AVAILABILITY` | `fi_data_availability` |
+
+All FI enums use UPPER_CASE member names; SAP3 stores lowercase `.value`. Convert at the
+boundary — never pass FI enum values into the SAP3 domain layer (doc 014 lines 151, 254).
+
+### Unit mapping (`Unit` enum → SAP3 canonical unit string)
+
+FI's `Unit.value` holds a glyph form (e.g. `"m³/s"`, `common/units.py:5`). SAP3 expects an
+ASCII canonical string from its `parameters` table (doc 014 line 147). The adapter maps by
+**enum member**, not by `.value`:
+
+| FI `Unit` member (`common/units.py`) | FI `.value` | SAP3 canonical string |
+|---|---|---|
+| `M3_PER_S` | `m³/s` | `m3/s` |
+| `MM_PER_DAY` | `mm/day` | `mm/day` |
+| `MM_PER_S` | `mm/s` | `mm/s` |
+| `MM` | `mm` | `mm` |
+| `CM` | `cm` | `cm` |
+| `M` | `m` | `m` |
+| `DEG_C` | `°C` | `degC` |
+| `UNITLESS` | `-` | `-` |
+
+### `QuantileData` operational gap (FI valid ≠ SAP3 usable)
+
+FI's `QuantileData` requires only **≥1** quantile level in `(0,1)`, sorted & unique
+(`output/variable_output.py:39–51`). SAP3's `from_quantiles()` requires **≥7** quantile
+levels **with tail coverage** (min ≤ 0.05 and max ≥ 0.95) (`types/ensemble.py:98–106`).
+
+Consequently an FI model emitting fewer than 7 quantiles (or without tail coverage) is
+**structurally valid FI output but NOT operationally usable by SAP3** — `from_quantiles()`
+raises `ValueError`. State this to model authors explicitly: FI's quantile floor is a
+permissive structural minimum; SAP3's operational floor is stricter.
+
+### `forecast_horizon` consumption note
+
+Two horizon notions coexist and must not be conflated:
+
+- `VariableMetadata.forecast_horizon` (`output/metadata.py:14`) is the **declared** step
+  count. The adapter assigns it directly to `ForecastEnsemble.forecast_horizon_steps`.
+- SAP3's factories also recompute a horizon internally as
+  `values["valid_time"].n_unique()` (`types/ensemble.py:63,107`). The adapter should ensure
+  these agree (declared horizon == distinct `valid_time` count) and treat a mismatch as a
+  structural error.
+
+### `success` property caveat
+
+`ModelOutput.success` returns `True` when `variables` is empty, because `all()` over an
+empty iterable is `True` (`output/model_output.py:33–36`). The adapter **must not** rely on
+`success` alone to gate conversion (doc 014 lines 219–223).
+
+> **Discrepancy vs doc 014:** Current FI now *forbids* empty `variables` at construction —
+> `ModelOutput._at_least_one_variable` raises if the dict is empty
+> (`output/model_output.py:23–31`). So the empty-variables case is no longer constructible
+> through the public API. The all-`FAILURE` → `ModelOutputError` guard remains live and
+> necessary; the empty-variables guard is now defense-in-depth.
+
+---
+
+## 4. Input mapping (SAP3 → FI)
+
+Per doc 014 Task 3 (lines 257–283), SAP3 will PR FI's input types. FI's input contract is
+*already partially implemented* in this repo (`forecast_interface/input/`) — see the
+discrepancy note at the end of this section.
+
+### Concept → field mapping
+
+| FI input concept (`path:line`) | SAP3 `ModelDataRequirements` field (`types/model.py:260`) |
+|---|---|
+| `past_known` temporality (`input/requirement.py:9`) | `past_dynamic_features: frozenset[str]` (line 262) |
+| `future_known` temporality (`input/requirement.py:10`) | `future_dynamic_features: frozenset[str]` (line 263) |
+| `InputRequirement.static` (`input/requirement.py:37`) | `static_features: frozenset[str]` (line 264) |
+| `PastKnownVariable.lookback` (`input/variable.py:12`) | `lookback_steps: int` (line 266) |
+| `FutureKnownVariable.future_steps` (`input/variable.py:31`) | `forecast_horizon_steps: int` (line 267) |
+| `TemporalResolution` keys + `VariableMetadata.timedelta` | `supported_time_steps: frozenset[timedelta]` (line 265) |
+| `SpatialResolution` keys (`input/requirement.py:22`) | `spatial_input_type: SpatialRepresentation` (line 268) |
+| *(no FI equivalent yet)* — proposed by SAP3 PR | `target_parameters: frozenset[str]` (line 261) |
+| `PastKnownVariable.max_nan` / `FutureKnownVariable.max_nan` (`input/variable.py:13,32`) | Derivable from SAP3 QC config (doc 014 line 273) |
+| `FutureKnownVariable.ensemble_mode` (`input/variable.py:33`) | Derivable from NWP ensemble config (doc 014 line 273) |
+
+### Assembled inputs (4-slot)
+
+After requirement matching, SAP3 assembles concrete DataFrames into a 4-slot shape, passed
+to the FI model via the adapter. The slots are identical for station and group:
+
+| Slot | `StationModelInputs` (`types/model.py:59`, via `StationInputData` line 51) | `GroupModelInputs` (`types/model.py:78`) |
+|---|---|---|
+| `past_targets` | target history | stacked, `station_id`-keyed |
+| `past_dynamic` | past dynamic features | stacked |
+| `future_dynamic` | future dynamic features | stacked |
+| `static` (`pl.DataFrame \| None`) | catchment attributes | stacked, one row per station |
+
+`GroupModelInputs.for_station()` (`types/model.py:89`) slices a group into per-station
+`StationInputData`. Both carry `issue_time`, `forecast_horizon_steps`, `time_step`.
+
+### Spatial enum mapping (FI → SAP3)
+
+FI `SpatialResolution` (`common/resolutions.py:15`) → SAP3 `SpatialRepresentation`
+(`types/enums.py:73`):
+
+| FI `SpatialResolution` | SAP3 `SpatialRepresentation` |
+|---|---|
+| `LUMPED` | `BASIN_AVERAGE` (`"basin_average"`) |
+| `HRU` | `ELEVATION_BAND` (`"elevation_band"`) |
+| `GRIDDED` | `GRIDDED` (`"gridded"`) |
+| *(new in FI — proposed)* `POINT` | `POINT` (`"point"`) |
+
+> SAP3 has a `POINT` member already (`types/enums.py:74`); FI's `SpatialResolution`
+> currently has only `LUMPED`/`HRU`/`GRIDDED` (`common/resolutions.py:16–18`). Adding
+> `POINT` to FI is part of the SAP3→FI input PR (see Open Items §8).
+
+---
+
+## 5. Station identity & the GROUP path (Option a)
+
+FI's `ModelOutput.variables` is currently `dict[str, VariableOutput]`
+(`output/model_output.py:14`) — keyed by **variable name**, with no station decomposition.
+SAP3's `GroupForecastModel.predict_batch()` requires per-station results
+(`dict[StationId, ...]`, `protocols/forecast_model.py:63`).
+
+**Decision recorded (doc 014 "Option (a)", lines 208–217):** FI adopts **station-keyed
+output** so the GROUP-path adapter can map per-station 1:1:
+
+```
+ModelOutput.variables : dict[station_id, dict[variable, VariableOutput]]
+```
+
+- Single-station models return a **one-key dict** (one station id → its variable map).
+- The STATION-path adapter unwraps the single key into
+  `tuple[dict[str, ForecastEnsemble], bytes | None]`.
+- The GROUP-path adapter maps each station key → one `(forecast_dict, state)` entry of the
+  `dict[StationId, tuple[...]]` return.
+
+**Cross-repo coordination item (FLAG):** SAP3's adapter design in doc 014 is currently
+**STATION-path-only** for v0b (lines 208–217 explicitly defer GROUP support). When FI moves
+to station-keyed output, SAP3's `ForecastInterfaceAdapter` must be extended to consume it.
+Until both sides land this change, GROUP-path FI wrapping is not possible. This is the
+single largest open structural divergence between the two repos.
+
+---
+
+## 6. State bridge
+
+FI is **state-free**: the `ForecastModel` protocol (`interface/protocol.py:10`) has no
+state parameter or return, and `ModelOutput` carries no warm-up snapshot.
+
+SAP3 carries warm-up state as `bytes | None`:
+
+- `StationForecastModel.predict(..., prior_state: bytes | None = None)` →
+  `tuple[..., bytes | None]` (`protocols/forecast_model.py:32–39`).
+- The state lifecycle is handled **entirely by the adapter**, not by FI.
+
+**v0b:** FI-wrapped models are stateless (doc 014 lines 205–207). The adapter:
+
+- Ignores `prior_state` (nothing to feed an FI model).
+- Returns `(forecast_dict, None)` — no state to snapshot.
+
+**v1 (future):** optional `dump_state` / `restore_state` methods on the FI `ForecastModel`
+protocol would let conceptual / hybrid models round-trip warm-up state through the adapter
+(doc 014 lines 129–134, 206–207). Not part of the current FI protocol.
+
+---
+
+## 7. Artifact metadata ownership
+
+Training provenance and storage metadata are split between FI-declared / artifact-embedded
+fields and SAP3-stored fields (`ModelArtifactRecord`, `types/model.py:290`).
+
+| Field / concept | Owner | Where |
+|---|---|---|
+| Model identity / name | FI / artifact | `ModelOutput.model_name` (`output/model_output.py:12`); embedded in artifact |
+| `interface_version` | FI / artifact | declared by model, embedded in artifact |
+| `model_version` | FI / artifact | declared by model, embedded in artifact |
+| Training provenance hashes | FI / artifact | embedded in artifact |
+| Training seed | FI / artifact | embedded in artifact (deterministic training) |
+| Product / data-source versions | FI / artifact | embedded in artifact |
+| Region scope | FI / artifact | declared by model, embedded in artifact |
+| Embedding-key behaviour when station set differs | FI / artifact | model declares how it keys stations (relevant to GROUP path, §5) |
+| `sha256_hash` | **SAP3** | `ModelArtifactRecord.sha256_hash` (`types/model.py:297`) |
+| `training_period_start` / `_end` | **SAP3** | `types/model.py:298–299` |
+| `trained_at` | **SAP3** | `types/model.py:300` |
+| `status` | **SAP3** | `ModelArtifactStatus` (`types/model.py:295`; enum `types/enums.py:47`) |
+| scope / `group_id` / `station_id` | **SAP3** | `types/model.py:293–294`; scope via `ArtifactScope` (`types/enums.py:41`) |
+
+FI/artifact side answers *"what is this model and how was it built"*; SAP3 side answers
+*"which trained binary is stored, for which scope, in what lifecycle state"*.
+
+---
+
+## 8. Open cross-repo items
+
+| # | Item | Detail |
+|---|---|---|
+| 1 | Typed IDs vs str | SAP3 `StationId`/`StationGroupId` are `NewType(UUID)` but `ModelId` is `NewType(str)` (`types/ids.py:4,16,21`). FI uses free-form `str` for variable/model names. Decide whether FI adopts typed ids at the boundary or the adapter parses str→UUID. |
+| 2 | SAP3 input-types PR scope | Whether the SAP3→FI PR lands `target_parameters` and `spatial_input_type`/`POINT` into FI's input spec (doc 014 lines 275–276; §4 above). |
+| 3 | GROUP-path adapter extension | Station-keyed FI output (§5) requires SAP3's STATION-only adapter design to extend to GROUP. Largest structural divergence. |
+| 4 | Quantile floor mismatch | FI requires ≥1 quantile (`output/variable_output.py:39–51`); SAP3 requires ≥7 with tail coverage (`types/ensemble.py:98–106`). FI output can be valid yet operationally unusable (§3). |
+| 5 | Epistemic uncertainty | `EpistemicUncertaintyData` is dropped at the boundary in v0b (doc 014 lines 197–204). Revisit (add to `ForecastEnsemble` / store as metadata) if models emit it. |
+| 6 | Interface module now exists | doc 014 assumes FI's `interface/` is unimplemented (lines 80, 287). It is now implemented (`ForecastModel`, `ModelResult`, `FailureCause`). SAP3 should re-evaluate Tasks 4–5 against the real protocol. |
+| 7 | `ModelResult` failure channel | FI now returns `ModelResult = ModelSuccess \| ModelFailure` (`interface/result.py:40`) with a `FailureCause` enum (`interface/failure.py:4`). SAP3's `ModelOutputError` path must account for the `ModelFailure` branch, not only all-`FAILURE` `ModelOutput`. |
+| 8 | Resolution enum split | FI split into `TemporalResolution` + `SpatialResolution` (`common/resolutions.py`); doc 014 references a single `Resolution`. Mapping tables above use the current split. |
+```
diff --git a/docs/model_interface.md b/docs/model_interface.md
index e215eb8..a152680 100644
--- a/docs/model_interface.md
+++ b/docs/model_interface.md
@@ -2,35 +2,94 @@
 
 The primary goal of this package is to define the interface between any forecasting library and the forecasting model. The forecasting model can be implemented in any package / code base but needs to follow the protocol defined here.
 
-Core functionalites include:
- **Forecast Function** forecast()
-Takes as input the ModelInput and outputs the ModelOutput (Forecast).
+There is **one unified protocol**: `ForecastModel`. The scope of a model (single station vs. group / national) is **declared**, not split into separate protocols. SAP3 consumes the FI protocol through a thin adapter that dispatches to its own `StationForecastModel` / `GroupForecastModel` — see [`docs/fi-sap3-mapping.md`](./fi-sap3-mapping.md). The driving requirements for the first (Nepal v1) integration are in [`docs/nepal-model-requirements.md`](./nepal-model-requirements.md).
 
-**Hindcast Function** hindcast()
-Takes as input the ModelInput and outputs the ModelOutput (Hindcast).
+Core functionalities include:
 
-**Initialization** __init__():
-The model should take the model specific config file which needs to provide the input requirements as an argument and loads internally all relevant artifacts / checkpoints etc.
+**Forecast Function** `predict()`
+Takes as input the `ModelInput` and a trained artifact, and outputs the `ModelOutput` (Forecast).
 
-For later integration:
-**Calibrate**
-    - can be
-    - **Calibrate** (for conceptual or single basin ml model)
-    - **Finetune** (if a base model is provided)
-    - **Retrain** (if the model is already trained)
+**Hindcast Function** `hindcast()`
+Takes as input the `ModelInput` and a trained artifact, and outputs the `ModelOutput` (Hindcast).
+
+**Training Functions** `train()` / `retrain()`
+Produce a `TrainedArtifact` from training inputs. See the Training & Lifecycle Protocol below.
+
+---
+
+## Training & Lifecycle Protocol (target spec)
+
+> **Status: target contract.** This section describes the protocol surface FI is committed to, settled by the Nepal v1 decisions. It is **not yet reflected in `forecast_interface/` code**. The current `forecast_interface/interface/protocol.py` exposes only `input_requirement`, `predict(*, inputs, issue_datetime)` and `hindcast(*, inputs, issue_datetime)` — with **no** `TrainedArtifact`, **no** `rng`, and **no** training methods. The implementation lands in a later phase; this is the forward target.
+
+### Scope: `ArtifactScope`
+
+A model declares its scope rather than implementing a scope-specific protocol.
+
+```python
+class ArtifactScope(Enum):
+    STATION = auto()  # one artifact per station
+    GROUP = auto()    # one artifact covering multiple stations
+```
+
+A "national-group" model is a `GROUP` (it is just a group whose station set happens to be national). There is no separate national scope.
+
+### The `ForecastModel` protocol surface
+
+| Member | Signature | Required? | Notes |
+|---|---|---|---|
+| `input_requirement` | `property -> InputRequirement` | required | Declares data needs **and** `target_parameters` (the targets, parallel to features). |
+| `artifact_scope` | `property -> ArtifactScope` | required | Declared scope (`STATION` / `GROUP`). |
+| `train` | `train(inputs, *, config, rng) -> TrainedArtifact` | **required** | Cold, full rebuild from scratch. This is the required baseline every model must support. |
+| `retrain` | `retrain(base_artifact, inputs, *, config, rng) -> TrainedArtifact` | optional | Warm-start from an existing artifact, for models capable of it. Models that cannot warm-start simply do not implement it; callers fall back to `train`. |
+| `predict` | `predict(artifact, *, inputs, issue_datetime, rng) -> ModelResult` | required | Forecast. Returns FI's `ModelResult` → `ModelOutput`. |
+| `hindcast` | `hindcast(artifact, *, inputs, issue_datetime, rng) -> ModelResult` | required | Hindcast. Same return type as `predict`. |
+| `serialize_artifact` | `serialize_artifact(artifact) -> bytes` | required | Opaque byte serialization of a `TrainedArtifact`. |
+| `deserialize_artifact` | `deserialize_artifact(raw: bytes) -> TrainedArtifact` | required | Inverse of `serialize_artifact`. |
+
+`input_requirement.target_parameters` declares the model's prediction targets alongside its feature requirements. (The current `InputRequirement` has only `dynamic` / `static` feature declarations; `target_parameters` is part of this forward spec.)
+
+### Determinism (dependency injection)
+
+`train`, `retrain`, `predict`, and `hindcast` all take an **injected** `rng: random.Random`. Models **MUST** be deterministic under a fixed `(data, config, seed)` triple: the same inputs, the same config, and an RNG seeded the same way must produce identical artifacts and identical outputs. No model may use the global `random` / `numpy.random` state or call `datetime.now()` directly in its training or forecast logic; all nondeterminism is injected. This matches both SAP3's contract and the repository's dependency-injection rule.
+
+### `TrainedArtifact`
+
+A `TrainedArtifact` is an **opaque, self-contained, deployment-portable** object representing everything a model needs to produce forecasts:
+
+- **Opaque** to FI: FI never inspects its internals. It is produced by `train` / `retrain` and consumed by `predict` / `hindcast`.
+- **Self-contained**: `serialize_artifact` produces `bytes` that embed all weights, scalers, and metadata — **with no absolute filesystem paths** and no machine-local references.
+- **Deployment-portable**: `deserialize_artifact(serialize_artifact(a))` must reconstruct an artifact that runs **unchanged on another SAP3 instance**.
+
+**Group / national artifacts and station identity.** An artifact whose scope is `GROUP` typically embeds the station identifiers it was trained on. Such artifacts **must document their embedding key** (how station IDs are stored and matched). They **must also define behaviour when the station set at predict time differs** from the trained set — either handle the mismatch gracefully (e.g. predict only for known stations, emit explicit `FAILURE` entries for unknown ones) or raise an explicit error. A group artifact must **never silently mis-associate** a prediction with the wrong station.
+
+### State-free
+
+FI's protocol is **state-free**: there is no `state` parameter and no state in the return value. SAP3's `prior_state` bytes are handled entirely inside the SAP3 adapter (see [`docs/fi-sap3-mapping.md`](./fi-sap3-mapping.md)), not by the FI protocol. An optional `dump_state` / `restore_state` pair is noted only as a **possible future extension** and is out of scope for v1.
+
+### Output stays FI-authoritative
+
+`predict` / `hindcast` **return** `ModelResult` → `ModelOutput` (defined below). `ModelOutput` is **not** replaced by SAP3's `ForecastEnsemble`: the SAP3 adapter maps `ModelOutput` *into* its own representation, never the other way around. See [`docs/fi-sap3-mapping.md`](./fi-sap3-mapping.md) for the field-level mapping.
 
 ---
 
 ## ModelOutput
 
-`ModelOutput` is the unified return type for both `forecast()` and `hindcast()`.
+`ModelOutput` is the unified return type for both `predict()` and `hindcast()`.
 
 | Field | Type | Description |
 |---|---|---|
 | `model_name` | `str` | Identifier of the model that produced the output |
 | `issue_datetime` | `datetime` (UTC) | Single issue datetime for the entire output |
-| `variables` | `dict[str, VariableOutput]` | Mapping of variable name to its output |
-| `success` | `bool` | Computed property — `True` when all variables contain valid data |
+| `variables` | `dict[str, dict[str, VariableOutput]]` | Station-keyed: `station_id → variable_name → VariableOutput` |
+| `success` | `bool` | Computed property — `True` when all variables (across all stations) contain valid data |
+
+### Station-keyed variables
+
+`variables` is keyed first by `station_id`, then by `variable_name`. This supersedes the previous flat `dict[str, VariableOutput]`.
+
+- A **single-station** model returns a one-key dict (`{station_id: {variable_name: VariableOutput}}`).
+- A **group / national** model returns one entry per station it forecasts.
+- **Missing stations are explicit `FAILURE` entries**, never absent keys. A caller can always look up every expected station; the absence of usable data is represented by a `VariableOutput` with `status == FAILURE`, not by a missing key.
 
 ### DataFrame Column Schema
 
@@ -44,6 +103,8 @@ All data classes share a unified DataFrame schema with two required datetime col
 **Forecast**: `issue_datetime` is constant across all rows (single issue time).
 **Hindcast**: `issue_datetime` varies across rows (multiple issue times).
 
+`predict` and `hindcast` return the **same** `ModelOutput` type; the only distinction is whether `issue_datetime` is constant (forecast) or varies (hindcast) across rows.
+
 ### Data Classes
 
 Each data class wraps a DataFrame with the two temporal columns above, plus class-specific value columns:
@@ -61,7 +122,7 @@ Captures model uncertainty as standard deviation and range.
 
 ### VariableOutput
 
-Groups data for a single output variable:
+Groups data for a single output variable (within a single station):
 
 | Field | Type | Description |
 |---|---|---|
@@ -75,7 +136,7 @@ Groups data for a single output variable:
 
 At least one of `deterministic`, `quantiles`, or `trajectories` must be present when status is `SUCCESS`.
 
-`variables` must contain at least one entry. When status is `PARTIAL`, at least one data representation must still be present (same rule as `SUCCESS`).
+`variables` must contain at least one station entry, and each station's inner dict must contain at least one variable. When status is `PARTIAL`, at least one data representation must still be present (same rule as `SUCCESS`). A station that produced no usable data is represented by a `FAILURE` `VariableOutput`, not by an empty or missing entry.
 
 ### ForecastFlag
 
diff --git a/docs/nepal-model-requirements.md b/docs/nepal-model-requirements.md
new file mode 100644
index 0000000..dba4fb1
--- /dev/null
+++ b/docs/nepal-model-requirements.md
@@ -0,0 +1,251 @@
+# Nepal Requirements for ForecastInterface and Model Implementers
+
+**Date:** 2026-06-11
+**Audience:** ForecastInterface maintainer, model implementer, SAPPHIRE Flow
+maintainers
+
+## Goal
+
+ForecastInterface must support a Nepal workflow where SAPPHIRE Flow implements
+and validates the eastern part of the country, while DHM/hydromet staff train
+and configure models for the western part. The same interface must also allow
+operators to test one model across all gauges or keep separate east/west models.
+
+## Current Alignment With SAPPHIRE Flow
+
+SAPPHIRE Flow already has these concepts internally:
+
+- explicit model data requirements:
+  - target parameters;
+  - past dynamic features;
+  - future dynamic features;
+  - static features;
+  - supported time steps;
+  - lookback steps;
+  - forecast horizon;
+  - spatial input type;
+- station-scoped and group-scoped model artifacts;
+- stacked multi-station inputs for group models;
+- per-station output from group models;
+- active/superseded artifact lifecycle;
+- hindcast and skill computation after training;
+- multiple model assignments per station;
+- pooled and BMA forecast combination for member ensembles.
+
+ForecastInterface already has:
+
+- an `InputRequirement` structure with temporal/spatial/product axes;
+- `predict()` and `hindcast()` protocol methods;
+- output containers for deterministic, quantile, trajectory, and epistemic
+  uncertainty data.
+
+The main missing pieces are training/retraining, artifact metadata/provenance,
+target declaration, and the station dimension for multi-station models.
+
+## Required Interface Capabilities
+
+### 1. Explicit target declaration
+
+ForecastInterface must declare what variables the model forecasts, separate
+from predictor variables.
+
+Required:
+
+- `target_variables`, e.g. `{"discharge"}` or `{"water_level", "discharge"}`;
+- units for each target;
+- output representation supported for each target: deterministic, quantiles,
+  trajectories/members, or combinations;
+- validation that output variable keys match declared targets unless a model
+  returns a documented subset with `PARTIAL` status.
+
+Why: SAPPHIRE Flow uses target variables for station compatibility, skill
+scoring, API storage, and deciding which observations are needed.
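+
+The key-matching rule above could be checked along these lines. This is a sketch
+only: the function name `check_targets` and the use of a plain string for the status
+are assumptions made for illustration, not part of ForecastInterface.
+
+```python
+def check_targets(declared_targets, output_keys, status):
+    """Validate output variable keys against the declared targets.
+
+    Returns the sorted list of declared targets that are missing from the
+    output. Missing targets are tolerated only when status is "PARTIAL".
+    """
+    declared, produced = set(declared_targets), set(output_keys)
+    unexpected = produced - declared
+    if unexpected:
+        raise ValueError(f"undeclared output variables: {sorted(unexpected)}")
+    missing = declared - produced
+    if missing and status != "PARTIAL":
+        raise ValueError(f"missing declared targets: {sorted(missing)}")
+    return sorted(missing)
+
+
+declared = {"water_level", "discharge"}
+missing = check_targets(declared, {"discharge"}, "PARTIAL")
+```
+
+With these inputs `missing` is `["water_level"]`; the same call with status
+`"SUCCESS"` raises `ValueError`.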
+
+### 2. Station and group model output shape
+
+ForecastInterface must support models that predict for many gauges in one call.
+
+Required output shape:
+
+```python
+variables: dict[str, dict[str, VariableOutput]]
+#          station_id  variable_name
+```
+
+Rules:
+
+- station IDs in the output must match station IDs in the input;
+- single-station models return a one-station mapping;
+- forecast DataFrames can remain per-station and do not need a `station_id`
+  column if the station dimension is in the outer dict;
+- missing station outputs must be explicit failures, not absent keys.
+
+### 3. Training and retraining protocol
+
+ForecastInterface needs a training API in addition to `predict()` and
+`hindcast()`.
+
+Required operations:
+
+```python
+train(inputs, *, config, rng) -> TrainedArtifact
+retrain(base_artifact, inputs, *, config, rng) -> TrainedArtifact
+serialize_artifact(artifact) -> bytes
+deserialize_artifact(raw: bytes) -> TrainedArtifact
+```
+
+The ForecastInterface maintainer can finalize the names, but the
+capabilities must exist.
+
+Required behavior:
+
+- `train()` creates a new model artifact from training data.
+- `retrain()` can start from an existing artifact or checkpoint and update it
+  with new data.
+- Retraining must be deterministic when config, input data, and random seed are
+  fixed.
+- The model must fail with a clear error when the provided data does not satisfy
+  declared requirements.
+- Training and retraining must work for station-scoped, regional group-scoped,
+  and national group-scoped models.
+
+**Resolved (SAPPHIRE Flow, 2026-06-11): cold retrain required, warm-start optional.**
+`train()`, a full rebuild on the (updated) dataset, is the **required** contract every
+model must implement; it works for conceptual and ML models alike. `retrain(base_artifact,
+…)`, a warm start that fine-tunes an existing artifact, is an **optional** capability; a
+model that does not implement it is simply cold-trained. This resolves the corresponding
+open question at the end of this document.
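+
+A toy sketch of the cold/warm split and of seed-determinism, assuming the `train` /
+`retrain` signatures listed above. The `TrainedArtifact` fields, `ColdOnlyModel`, and
+the helper `retrain_or_rebuild` are hypothetical and exist only for this example.
+
+```python
+import random
+from dataclasses import dataclass
+
+
+@dataclass(frozen=True)
+class TrainedArtifact:  # hypothetical reduced stand-in
+    weights: tuple[float, ...]
+    warm_started: bool
+
+
+class ColdOnlyModel:
+    """Toy model that implements only the required train()."""
+
+    def train(self, inputs, *, config, rng):
+        # All randomness comes from the injected rng, so a fixed seed
+        # reproduces the artifact exactly.
+        noise = tuple(rng.random() for _ in range(config["n_weights"]))
+        mean = sum(inputs) / len(inputs)
+        return TrainedArtifact(tuple(mean + n for n in noise), warm_started=False)
+
+
+def retrain_or_rebuild(model, base_artifact, inputs, *, config, rng):
+    """Warm-start when the model offers retrain(), otherwise cold-train."""
+    retrain = getattr(model, "retrain", None)
+    if callable(retrain):
+        return retrain(base_artifact, inputs, config=config, rng=rng)
+    return model.train(inputs, config=config, rng=rng)
+
+
+model = ColdOnlyModel()
+cfg = {"n_weights": 3}
+data = [1.0, 2.0, 3.0]
+first = model.train(data, config=cfg, rng=random.Random(42))
+second = retrain_or_rebuild(model, first, data, config=cfg, rng=random.Random(42))
+```
+
+Because config, data, and seed are identical, `first == second` holds, and
+`second.warm_started` is `False` because the model fell back to a cold rebuild.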
+
+### 4. Artifact metadata and provenance
+
+Every trained artifact must carry metadata that SAPPHIRE Flow can store and use
+for safe promotion/rollback.
+
+Required artifact metadata:
+
+| Field | Purpose |
+|---|---|
+| `model_name` / `model_id` | Stable model identity. |
+| `interface_version` | Compatibility with ForecastInterface. |
+| `model_version` | Code/config version from implementer. |
+| `artifact_scope` | `station`, `group`, or `national`/country-level group. |
+| `region_scope` | `east`, `west`, `national`, or another agreed label. |
+| `station_ids` | Gauges used for training and valid application. |
+| `training_period_start`, `training_period_end` | Reproducibility and skill context. |
+| `input_requirement_hash` | Detects incompatible interface changes. |
+| `training_data_hash` | Detects data changes. |
+| `catchment_package_version` | Links artifact to gateway shapefile/catchment version. |
+| `snowmapper_product_version` | Links artifact to SnowMapper inputs if used. |
+| `weather_product_versions` | Links artifact to NWP/reanalysis inputs. |
+| `random_seed` | Reproducibility. |
+| `created_at_utc` | Audit trail. |
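+
+One possible way to derive `input_requirement_hash` is sketched below. The source does
+not fix a hash algorithm or serialization; SHA-256 over canonical JSON and the
+function name `requirement_hash` are assumptions for illustration.
+
+```python
+import hashlib
+import json
+
+
+def requirement_hash(requirement: dict) -> str:
+    """Hash a JSON-serialisable requirement dict, independent of key order."""
+    canonical = json.dumps(requirement, sort_keys=True, separators=(",", ":"))
+    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
+
+
+a = requirement_hash({"targets": ["discharge"], "lookback_steps": 30})
+b = requirement_hash({"lookback_steps": 30, "targets": ["discharge"]})
+c = requirement_hash({"targets": ["discharge"], "lookback_steps": 60})
+```
+
+`a` and `b` are equal because key order is normalised; `c` differs because the
+lookback changed, which is the kind of incompatibility the field is meant to expose.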
+
+### 5. Region and application constraints
+
+Model artifacts must make their valid application scope explicit.
+
+Required:
+
+- a national model can declare valid application to all listed Nepal gauges;
+- an eastern model can declare valid application to eastern gauges only unless
+  explicitly marked as transferable/test-only;
+- a western model can declare valid application to western gauges only unless
+  explicitly marked as transferable/test-only;
+- applying a model outside its declared region must be possible in test mode but
+  should require an explicit operator choice.
+
+Why: DHM may want to test one model on all gauges, but accidental cross-region
+promotion should be prevented.
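+
+The scope rule above could be enforced roughly as follows. The function name and the
+`test_mode` / `operator_confirmed` flags are hypothetical; how SAPPHIRE Flow records
+an operator's choice is not specified here.
+
+```python
+def stations_outside_scope(valid_station_ids, requested_station_ids, *,
+                           test_mode=False, operator_confirmed=False):
+    """Return requested stations outside the declared scope, or raise.
+
+    Out-of-scope application is allowed only in test mode and only after an
+    explicit operator confirmation.
+    """
+    outside = sorted(set(requested_station_ids) - set(valid_station_ids))
+    if outside and not (test_mode and operator_confirmed):
+        raise PermissionError(
+            f"artifact is not declared valid for stations {outside}; "
+            "re-run in test mode with explicit operator confirmation"
+        )
+    return outside
+
+
+eastern = {"east_01", "east_02"}
+in_scope = stations_outside_scope(eastern, ["east_01"])
+tested = stations_outside_scope(
+    eastern, ["east_01", "west_07"], test_mode=True, operator_confirmed=True
+)
+```
+
+`in_scope` is an empty list and `tested` is `["west_07"]`, which a caller can use to
+label the resulting forecasts as out-of-scope test output.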
+
+### 6. Output requirements for model comparison and combination
+
+If a model should participate in pooled/BMA combination in SAPPHIRE Flow, it
+must output trajectory/member forecasts, not quantiles only.
+
+Required:
+
+- trajectory/member outputs must have stable member IDs;
+- all members must share the same issue time, valid times, units, and horizon;
+- quantile-only outputs are acceptable for primary/fallback operation but should
+  be marked as not combinable for pooled/BMA.
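+
+A sketch of one part of the alignment rule, the shared valid times. The function name
+and the `member_id -> list of valid times` input shape are assumptions; issue time,
+units, and horizon would need analogous checks.
+
+```python
+from datetime import datetime, timedelta
+
+
+def members_share_time_grid(members: dict[str, list[datetime]]) -> bool:
+    """True when every member has the same non-empty list of valid times."""
+    grids = list(members.values())
+    if not grids or not grids[0]:
+        return False
+    return all(grid == grids[0] for grid in grids[1:])
+
+
+start = datetime(2026, 6, 11)
+grid = [start + timedelta(days=step) for step in range(3)]
+aligned = {"m01": list(grid), "m02": list(grid)}
+ragged = {"m01": list(grid), "m02": grid[:2]}
+```
+
+`members_share_time_grid(aligned)` is `True`, while the `ragged` set, where one member
+stops a step early, is rejected.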
+
+### 7. Input requirements for SnowMapper and catchment data
+
+ForecastInterface input requirements must be able to express:
+
+- SnowMapper variables such as SWE and snowmelt as past and/or future dynamic
+  features;
+- product/source names and versions;
+- spatial representation: lumped/basin-average, HRU/elevation-band, or gridded;
+- static catchment attributes required by the model;
+- allowed missing-data thresholds per variable and product.
+
+### 8. Artifact portability across deployments (staging → production)
+
+HSOL trains models on a cloud **staging** instance; trained artifacts are then promoted to
+the on-prem **production** deployment (where DHM also retrains). The interface must make
+artifacts portable across instances.
+
+Required:
+
+- a trained artifact MUST be **self-contained and deployment-independent**: it serializes
+  to bytes with no absolute paths and no dependence on the training environment, and
+  **deserializes and runs unchanged on a different SAPPHIRE Flow instance**;
+- if a group/national artifact embeds station identifiers (e.g. per-station ML
+  embeddings), it MUST document the **embedding key** and its behaviour when the station
+  set at predict time **differs** from training (a new western gauge; an identifier
+  remapped between staging and production) — it must handle this gracefully or raise an
+  **explicit error**, and never silently associate a station with the wrong embedding.
+
+Why: SAPPHIRE Flow promotes the serialized artifact between instances; it cannot reach
+inside the artifact, so the artifact must state its environment and ID
+assumptions. The provenance fields in §4 (`interface_version`, `input_requirement_hash`,
+`station_ids`) support compatibility checks but do not by themselves guarantee runtime
+portability.
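+
+The "explicit error" branch of the embedding-key rule might look like the sketch
+below. `UnknownStationError` and `embedding_for` are hypothetical names, and the
+embeddings are plain tuples purely for illustration.
+
+```python
+class UnknownStationError(KeyError):
+    """Raised when a station has no embedding in the trained artifact."""
+
+
+def embedding_for(station_id, embeddings):
+    """Look up a station embedding by its documented key, never by position."""
+    try:
+        return embeddings[station_id]
+    except KeyError:
+        raise UnknownStationError(
+            f"station {station_id!r} was not in the training station set"
+        ) from None
+
+
+trained = {"east_01": (0.1, 0.2), "east_02": (0.3, 0.4)}
+known = embedding_for("east_01", trained)
+```
+
+Looking up a western gauge such as `"west_07"` raises `UnknownStationError` instead of
+falling back to some other station's embedding.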
+
+## SAPPHIRE Flow Integration Implications
+
+SAPPHIRE Flow can already store model artifacts and supersede old active
+artifacts. It still needs additional work before full Nepal east/west operation:
+
+- implement configurable retrain strategies;
+- audit group-scoped model assignments and artifact lookup;
+- implement merged data requirements before running multiple models with
+  different inputs on one gauge;
+- add operator workflows for test, promote, rollback, and regional assignment.
+
+ForecastInterface should not assume these are solved by the model package; it
+should expose enough metadata for SAPPHIRE Flow to implement them safely.
+
+## Acceptance Criteria
+
+- A model implementer can train an eastern, western, or national model artifact
+  through ForecastInterface.
+- DHM can retrain a model using new western Nepal data: by a full rebuild via
+  `train()`, or from an existing artifact where the model supports warm-start.
+- The artifact metadata tells SAPPHIRE Flow whether the model is valid for east,
+  west, or all Nepal gauges.
+- A group model can forecast multiple gauges in one call and return explicit
+  per-station outputs.
+- SAPPHIRE Flow can reject or warn on missing target variables, incompatible
+  static/dynamic inputs, unsupported time steps, or invalid region application.
+- Models intended for pooled/BMA combination provide trajectory/member outputs.
+
+## Open Questions for the Model Implementer
+
+- ~~Should retraining always start from a previous artifact, or can it also mean a
+  full rebuild on an expanded dataset?~~ **Resolved:** full rebuild (cold) is the required
+  baseline; warm-start from an artifact is optional (see §3).
+- ~~What is the first intended artifact scope for Nepal: station, east/west group,
+  or national group?~~ **Resolved:** the **eastern regional group** ships first, so the
+  first production artifact is **GROUP-scoped**.
+- ~~Which SnowMapper variables and lead times will the model consume?~~ **Resolved
+  (partial):** starts with **SWE** and **ROF**, as banded forcing at elevation-band
+  granularity (see §7). Lead times still to be confirmed.
+- ~~Are western Nepal models expected to transfer to eastern gauges for testing,
+  or only the other way around?~~ **Resolved:** transfer is **east → west** — an
+  eastern group artifact is applied to western gauges. The eastern artifact must
+  define its behaviour when the station set differs (see §8 embedding-key contract).
+- Can all stateful models reconstruct state from lookback data, or do any
+  require persisted hidden state between forecasts?
+
diff --git a/docs/open_design_questions.md b/docs/open_design_questions.md
index 7c8b8b3..facf68c 100644
--- a/docs/open_design_questions.md
+++ b/docs/open_design_questions.md
@@ -1,20 +1,26 @@
 # Open Design Questions
 
-Questions to discuss with the model developer before finalizing the interface.
+This file tracks design decisions for the ForecastInterface (FI) — the model-author-facing contract — and the questions still owed to the model developer.
 
----
+## Context
+
+ForecastInterface (FI) is the contract model authors implement. SAPPHIRE Flow (SAP3) consumes FI models through a thin, planned-but-not-yet-built `ForecastInterfaceAdapter`. Governance (SAP3 doc 014, "ForecastInterface Adapter Design") sets the ownership boundaries:
 
-## 1. Multi-Station Model Output
+- **FI OUTPUT types are authoritative** — SAP3 adapts to them.
+- **FI INPUT types are co-designed** via a SAP3 → FI PR.
+- **FI's INTERFACE / protocol is FI-owned**; SAP3 adds only a thin wrapper around it.
 
-**Status:** Open, high priority — proposal for discussion
+These decisions are reflected in `docs/model_interface.md` and the new `docs/fi-sap3-mapping.md` (the FI ↔ SAP3 adapter mapping). This file does not duplicate their content.
+
+---
 
-### Problem
+# 1. Resolved decisions
 
-The typical ML model predicts for many stations in one forward pass. Currently `ModelOutput.variables` is `dict[str, VariableOutput]` — keyed by variable name only, with no station dimension. This forces the orchestrator to call the model once per station, losing batching efficiency.
+Questions that SAP3's behaviour already settles. Each is marked resolved with its rationale and where it will be reflected.
 
-### Proposal
+## 1.1 Multi-station output structure — RESOLVED
 
-Add a station dimension to `ModelOutput`. The model receives station identifiers in its input data (as a grouping variable to correlate targets with features) and returns per-station results keyed by the same identifiers:
+**Decision:** `ModelOutput.variables` becomes `dict[str, dict[str, VariableOutput]]` (station_id → variable_name → `VariableOutput`).
 
 ```python
 class ModelOutput(BaseModel):
@@ -24,100 +30,97 @@ class ModelOutput(BaseModel):
     #               ^station_id  ^variable_name
 ```
 
-A single-station model simply returns a dict with one key. This keeps one protocol for both cases.
+A single-station model returns a dict with one key, e.g. `{"station_xyz": {"discharge": ..., "water_level": ...}}`. The per-variable DataFrames stay per-station — the station dimension lives in the dict structure, not in a `station_id` column inside the forecast DataFrames.
 
-### How this works in SAPPHIRE_flow today
+**Rationale:** matches SAP3's GROUP-path output `dict[StationId, dict[str, ForecastEnsemble]]` and Nepal requirement §2. Missing stations must be explicit **FAILURE entries**, never absent keys.
 
-SAPPHIRE_flow uses this pattern for group (multi-station) models:
+**Cross-repo note:** this advances a v1-deferred GROUP-path item and requires SAP3's adapter to extend from STATION-only to GROUP — a cross-repo coordination item.
 
-**Input side:** All per-station DataFrames are stacked into one long DataFrame with a `station_id` column (string type, always first column). The model receives one `GroupModelInputs` object containing all stations. A convenience method `for_station(sid)` filters and drops the column, giving clean per-station DataFrames.
+**Reflected in:** `docs/model_interface.md`, `docs/fi-sap3-mapping.md`.
 
-**Output side:** The model returns `dict[StationId, dict[str, ForecastEnsemble]]` — the station dimension is in the dict structure, not in the DataFrames. Each `ForecastEnsemble` is already per-station.
+## 1.2 Target declaration — RESOLVED
 
-**Pattern summary:**
-```
-Input:  stacked DataFrames with station_id column → model
-Output: model → dict[station_id, dict[variable_name, data]]
-```
+**Decision:** FI will declare `target_parameters` (plus, per target, its unit and supported output representation), parallel to feature inputs.
 
-### What changes for the model developer
+**Rationale:** mirrors SAP3's `ModelDataRequirements.target_parameters`. SAP3 doc 014 Task 3 explicitly plans to PR `target_parameters` + `spatial_input_type` into FI's input spec.
 
-- `predict()` returns a `ModelResult` wrapping a `ModelOutput` where `variables` is now `dict[str, dict[str, VariableOutput]]`
-- Station identifiers come from the input data — the model echoes back the same IDs it received
-- The per-variable DataFrames remain per-station (no `station_id` column in the forecast DataFrames themselves)
-- Models that predict for a single station return `{"station_xyz": {"discharge": ..., "water_level": ...}}`
+**Reflected in:** `docs/input_requirement.md` (later phase).
 
-### Questions for the model developer
+## 1.3 Model state — RESOLVED (already)
 
-- Does this match how your models work? (station_id as grouping variable in, per-station results out)
-- Are station identifiers always strings, or do you use typed IDs?
-- Is there a case where the model defines its own spatial units that don't map 1:1 to input station IDs?
+**Decision:** FI stays **state-free**. SAP3's `prior_state` bytes are handled entirely by the adapter. An optional `dump_state()` / `restore_state(bytes)` pair on the protocol is a **future extension only**, not part of the current contract.
 
----
+**Rationale:** state management is an orchestrator-side concern (SAP3's `PgModelStateStore`, `WarmUpSource`, `prior_state` on predict). FI's `predict()` takes no state parameter and returns no state.
 
-## 2. Target Variable Declaration
+**Reflected in:** `docs/model_interface.md`.
 
-**Status:** Open
+## 1.4 Spatial vocabulary — RESOLVED
 
-`InputRequirement` declares what data the model consumes, but does not distinguish between **target variables** (what the model forecasts) and **feature variables** (predictors).
+**Decision:** align FI's spatial vocabulary to SAP3's `SpatialRepresentation` enum:
 
-In the current YAML example, `discharge` sits under `past_known` alongside `precipitation` — they look identical structurally. SAPPHIRE_flow's `ModelDataRequirements` has an explicit `target_parameters: frozenset[str]` field.
+| FI value | Notes |
+|---|---|
+| `POINT` | |
+| `BASIN_AVERAGE` | replaces the old `LUMPED` |
+| `ELEVATION_BAND` | replaces the old `HRU` |
+| `GRIDDED` | |
 
-**Key constraint:** Target past observations are NOT always available. Pure simulation/process-based models can forecast a variable without having seen its history.
+Banded SnowMapper SWE / snowmelt is declared at `ELEVATION_BAND`.
 
-**Questions for the model developer:**
+**Reflected in:** `docs/input_requirement.md`, `docs/model_interface.md`.
 
-- Should `InputRequirement` declare which variables are forecast targets? Or should target declaration live elsewhere (e.g. a separate field on the model protocol)?
-- Do your models always have historical observations of the target variable, or do some models forecast without past target data?
-- Should the interface enforce that `ModelOutput.variables` keys match declared targets?
+## 1.5 Quantile floor — RESOLVED (split responsibility)
 
----
+**Decision:** FI proposes a **structural minimum of ≥3 quantiles** (center + two tails). SAP3's operational requirement of **≥7 quantiles with tail coverage** (a level ≤ 0.05 and a level ≥ 0.95) is enforced at the **adapter boundary**, NOT in FI.
 
-## 3. `VariableMetadata` Field Review
+An FI model emitting fewer than 7 quantiles is **structurally valid but NOT operationally usable in SAP3**.
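+
+The two thresholds can be sketched as separate checks. The function names are hypothetical, and counting distinct levels is an assumption about how duplicates would be treated:
+
+```python
+def structurally_valid(levels):
+    """FI-side floor: at least 3 distinct quantile levels."""
+    return len(set(levels)) >= 3
+
+
+def operationally_usable(levels):
+    """Adapter-side rule: at least 7 levels with coverage of both tails."""
+    return len(set(levels)) >= 7 and min(levels) <= 0.05 and max(levels) >= 0.95
+
+
+three = [0.1, 0.5, 0.9]
+seven = [0.05, 0.1, 0.25, 0.5, 0.75, 0.9, 0.95]
+```
+
+`three` passes `structurally_valid` but fails `operationally_usable`; `seven` passes both.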
 
-**Status:** Partially resolved
+**Reflected in:** `docs/model_interface.md`, `docs/fi-sap3-mapping.md`.
 
-`VariableMetadata` currently has: `name`, `unit`, `resolution`, `timedelta`, `forecast_horizon`, `offset`.
+## 1.6 Nepal v1 deployment specifics — RESOLVED (model developer)
 
-### Fields confirmed as necessary
-- **`unit`** — consumed by the SAPPHIRE_flow adapter (mapped to string)
-- **`timedelta`** — consumed by the adapter as `time_step`
-- **`resolution`** — not redundant with `timedelta`; it is the categorical label (e.g. SUB_DAILY) that `timedelta` refines (e.g. 15min). Could benefit from a cross-validator.
-- **`offset`** — number of timesteps (of length `timedelta`) between the last observed data point and the first forecast step. Not currently consumed by SAPPHIRE_flow but potentially relevant for lead-time aware skill scoring.
+**Decisions provided by the model developer:**
 
-### Fields to discuss
-- **`name`** — currently redundant with the dict key in `ModelOutput.variables`. No validator enforces `key == metadata.name`. The adapter uses only the dict key. Should we remove `name` and rely solely on the dict key, or add a validator to keep them in sync?
-- **`forecast_horizon`** — never consumed; SAPPHIRE_flow derives the horizon from the DataFrame row count. Is there a use case where a declared horizon that differs from actual rows is meaningful (e.g. the model intended to produce 48 steps but only managed 30)?
+- **First artifact scope: the eastern regional group ships first.** The first production artifact is therefore **GROUP-scoped** (`ArtifactScope.GROUP`). This makes the station-keyed output of decision 1.1 and the GROUP adapter path **load-bearing from day one**, not a later concern.
+- **SnowMapper forcing starts with SWE and ROF** (snow water equivalent and runoff), declared as **banded dynamic forcing at `ELEVATION_BAND`** (see decision 1.4). Specific **lead times** are still to be confirmed — see Q7 residual.
+- **Artifact transfer direction is east → west** (an eastern group artifact applied to western gauges). This makes the embedding-key / station-set-mismatch contract (Nepal §8) concrete: the eastern GROUP artifact **must define its behaviour when applied to the western station set** — handle gracefully or raise an explicit error, never silently associate a station with the wrong embedding.
 
-### DataFrame `issue_datetime` column
-Every DataFrame requires an `issue_datetime` column, but the adapter drops it immediately and uses only the top-level `ModelOutput.issue_datetime` scalar. No cross-validation ensures they match. With multi-station output (question 1), the column could become meaningful (per-station issue times). Should we remove the column requirement for now, or add a validator?
+**Reflected in:** `docs/nepal-model-requirements.md`, `docs/model_interface.md` (artifact portability), `docs/fi-sap3-mapping.md` (artifact metadata ownership).
 
 ---
 
-## 4. Quantile Minimum Count
+# 2. Open questions for the model developer
 
-**Status:** Open
+A decision-ready list. Each needs the model developer's input before the corresponding spec is frozen.
 
-ForecastInterface currently allows ≥1 quantile level. SAPPHIRE_flow requires ≥7 with min ≤ 0.05 and max ≥ 0.95.
+### Q1 — Station ID typing
 
-**Proposal:** Set ForecastInterface minimum to ≥3 (structurally meaningful: center + two tails) and leave SAPPHIRE's stricter constraint as an operational requirement enforced at the adapter boundary.
+SAP3 uses a typed `StationId = NewType(..., UUID)`. Should FI expose **opaque `str` station keys** (the adapter maps them to/from `StationId`), or **adopt typed IDs** directly? And: is there any case where the model defines its own spatial units that do **not** map 1:1 to the input station IDs?
 
-**Question for the model developer:**
-- Is there a valid use case for producing fewer than 3 quantiles?
+### Q2 — Past-target availability
 
----
+Do all your models see the **target's own history**, or do some pure-simulation / process-based models forecast a target **without** any past observations of it? This determines whether past-target is a *required* declared input or an optional one.
+
+### Q3 — Quantile minimum
+
+Is there any valid use case for emitting **fewer than 3 quantiles**? (Reminder: anything below 7 is non-operational in SAP3.)
+
+### Q4 — State reconstruction
+
+Can every stateful model rebuild its internal state from a **sufficiently long lookback window**, or does any model **strictly require persisted hidden state** between calls? (Confirms decision 1.3 covers all cases.)
+
+### Q5 — `VariableMetadata` fields
 
-## 5. Model State for Recurrent Models
+`VariableMetadata` currently has `name`, `unit`, `resolution`, `timedelta`, `forecast_horizon`, `offset`. Three points to settle:
 
-**Status:** Resolved — keep ForecastInterface state-free
+- **(a) `name`** — redundant with the dict key in `ModelOutput.variables`. Drop `name`, or keep it and add a validator enforcing `key == metadata.name`?
+- **(b) `forecast_horizon`** — this **is consumed** by the (designed) adapter: SAP3 doc 014 (lines 149, 228–229) assigns `ForecastEnsemble.forecast_horizon_steps` directly from `VariableMetadata.forecast_horizon`. The open question is **not** whether to keep it (we do) but whether to add a **cross-validator** that it matches the DataFrame row count.
+- **(c) `offset`** — confirm semantics: number of steps (each of length `timedelta`) between the last observed point and the first forecast step.
 
-SAPPHIRE_flow manages model state on the orchestrator side (`PgModelStateStore`, `WarmUpSource`, `prior_state` parameter on predict). ForecastInterface's `predict()` has no state parameter and no state in the return.
+### Q6 — Per-row `issue_datetime` column
 
-**Decision:** ForecastInterface remains state-free. Stateful models (LSTMs etc.) either:
-- Reconstruct state from the lookback window provided in inputs (warm-up from data)
-- Get state injected by a thin SAPPHIRE_flow adapter wrapping the FI model
+The adapter maps `ModelOutput.issue_datetime` → `ForecastEnsemble.issued_at` and renames the per-row datetime column → `valid_time`. The question is whether to **keep the per-row `issue_datetime` column requirement** — with a cross-validator that it matches the top-level `issue_datetime` for forecasts — or **relax it**. Frame this as a validator question, not a removal: the column is not "dropped", it is renamed and re-used.
 
-If needed later, optional `restore_state(bytes)` / `dump_state() -> bytes` methods on the protocol would be a clean extension.
+### Q7 — SnowMapper lead times (residual)
 
-**Question for the model developer:**
-- Can your stateful models always reconstruct their internal state from a sufficiently long lookback window? Or do some models strictly require persisted state between calls?
+The Nepal deployment specifics are otherwise settled (see decision 1.6): the eastern regional group ships first, SnowMapper forcing starts with **SWE** and **ROF**, and artifact transfer is **east → west**. The one residual: which **lead times** of SWE and ROF will the model consume — past-known lookback, future-known horizon, and how far in each?
diff --git a/forecast_interface/__init__.py b/forecast_interface/__init__.py
index e5997cb..560bc93 100644
--- a/forecast_interface/__init__.py
+++ b/forecast_interface/__init__.py
@@ -1,3 +1,5 @@
+__version__ = "0.1.2"
+
 from .input import (
     DynamicInputSpec,
     EnsembleMode,
diff --git a/pyproject.toml b/pyproject.toml
index c99e724..7778a90 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -1,6 +1,6 @@
 [project]
 name = "forecastinterface"
-version = "0.1.1"
+version = "0.1.2"
 description = "Add your description here"
 readme = "README.md"
 requires-python = ">=3.11"
@@ -12,8 +12,25 @@ dependencies = [
 [tool.pytest.ini_options]
 pythonpath = ["."]
 
+[tool.bumpversion]
+current_version = "0.1.2"
+commit = false
+tag = false
+allow_dirty = true
+
+[[tool.bumpversion.files]]
+filename = "pyproject.toml"
+search = 'version = "{current_version}"'
+replace = 'version = "{new_version}"'
+
+[[tool.bumpversion.files]]
+filename = "forecast_interface/__init__.py"
+search = '__version__ = "{current_version}"'
+replace = '__version__ = "{new_version}"'
+
 [dependency-groups]
 dev = [
+    "bump-my-version>=1.3.0",
     "pytest>=9.0.2",
     "ruff>=0.15.7",
 ]
diff --git a/uv.lock b/uv.lock
index 4f513f9..66f7497 100644
--- a/uv.lock
+++ b/uv.lock
@@ -1,5 +1,5 @@
 version = 1
-revision = 2
+revision = 3
 requires-python = ">=3.11"
 
 [[package]]
@@ -11,6 +11,69 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/78/b6/6307fbef88d9b5ee7421e68d78a9f162e0da4900bc5f5793f6d3d0e34fb8/annotated_types-0.7.0-py3-none-any.whl", hash = "sha256:1f02e8b43a8fbbc3f3e0d4f0f4bfc8131bcb4eebe8849b8e5c773f3a1c582a53", size = 13643, upload-time = "2024-05-20T21:33:24.1Z" },
 ]
 
+[[package]]
+name = "anyio"
+version = "4.13.0"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "idna" },
+    { name = "typing-extensions", marker = "python_full_version < '3.13'" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/19/14/2c5dd9f512b66549ae92767a9c7b330ae88e1932ca57876909410251fe13/anyio-4.13.0.tar.gz", hash = "sha256:334b70e641fd2221c1505b3890c69882fe4a2df910cba14d97019b90b24439dc", size = 231622, upload-time = "2026-03-24T12:59:09.671Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/da/42/e921fccf5015463e32a3cf6ee7f980a6ed0f395ceeaa45060b61d86486c2/anyio-4.13.0-py3-none-any.whl", hash = "sha256:08b310f9e24a9594186fd75b4f73f4a4152069e3853f1ed8bfbf58369f4ad708", size = 114353, upload-time = "2026-03-24T12:59:08.246Z" },
+]
+
+[[package]]
+name = "bracex"
+version = "2.6"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/63/9a/fec38644694abfaaeca2798b58e276a8e61de49e2e37494ace423395febc/bracex-2.6.tar.gz", hash = "sha256:98f1347cd77e22ee8d967a30ad4e310b233f7754dbf31ff3fceb76145ba47dc7", size = 26642, upload-time = "2025-06-22T19:12:31.254Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/9d/2a/9186535ce58db529927f6cf5990a849aa9e052eea3e2cfefe20b9e1802da/bracex-2.6-py3-none-any.whl", hash = "sha256:0b0049264e7340b3ec782b5cb99beb325f36c3782a32e36e876452fd49a09952", size = 11508, upload-time = "2025-06-22T19:12:29.781Z" },
+]
+
+[[package]]
+name = "bump-my-version"
+version = "1.3.0"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "click" },
+    { name = "httpx" },
+    { name = "pydantic" },
+    { name = "pydantic-settings" },
+    { name = "questionary" },
+    { name = "rich" },
+    { name = "rich-click" },
+    { name = "tomlkit" },
+    { name = "wcmatch" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/3d/61/07b90027091a4192b4a0290dc3da1aeea6b9e7b6b4c0f7fd30dab36070c1/bump_my_version-1.3.0.tar.gz", hash = "sha256:5780137a8d93378af3839798fcba01c7e6cb28dcc5aa5a7ab4d8507787f1995c", size = 1142429, upload-time = "2026-03-22T13:27:34.923Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/36/01/b168791bfbfb0322ef6d38d236f6f17a02e41fb7753e23e4cdb0f19ac969/bump_my_version-1.3.0-py3-none-any.whl", hash = "sha256:3cdaa54588d2443a29303b77e7539417187952c3d22f87bfdd32c0fe6af2f570", size = 64878, upload-time = "2026-03-22T13:27:33.006Z" },
+]
+
+[[package]]
+name = "certifi"
+version = "2026.5.20"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/f3/ce/ee2ecad540810a79593028e88299baeae54d346cc7a0d94b6199988b89b1/certifi-2026.5.20.tar.gz", hash = "sha256:69dea482ab64caa7b9f6aba1c6bf48bb6a5448d1c0f1b17ab42ad8c763a5344d", size = 135422, upload-time = "2026-05-20T11:46:50.073Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/59/8c/57e832b7af6d7c5abe66eb3fbe3a3a32f4d11ea23a1aa7131371035be991/certifi-2026.5.20-py3-none-any.whl", hash = "sha256:3c52e209ba0a4ad7aebe60436a4ab349c39e1e602e8c134221e546902ad25897", size = 134134, upload-time = "2026-05-20T11:46:48.578Z" },
+]
+
+[[package]]
+name = "click"
+version = "8.3.3"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "colorama", marker = "sys_platform == 'win32'" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/bb/63/f9e1ea081ce35720d8b92acde70daaedace594dc93b693c869e0d5910718/click-8.3.3.tar.gz", hash = "sha256:398329ad4837b2ff7cbe1dd166a4c0f8900c3ca3a218de04466f38f6497f18a2", size = 328061, upload-time = "2026-04-22T15:11:27.506Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/ae/44/c1221527f6a71a01ec6fbad7fa78f1d50dfa02217385cf0fa3eec7087d59/click-8.3.3-py3-none-any.whl", hash = "sha256:a2bf429bb3033c89fa4936ffb35d5cb471e3719e1f3c8a7c3fff0b8314305613", size = 110502, upload-time = "2026-04-22T15:11:25.044Z" },
+]
+
 [[package]]
 name = "colorama"
 version = "0.4.6"
@@ -22,7 +85,7 @@ wheels = [
 
 [[package]]
 name = "forecastinterface"
-version = "0.1.0"
+version = "0.1.2"
 source = { virtual = "." }
 dependencies = [
     { name = "polars" },
@@ -31,6 +94,7 @@ dependencies = [
 
 [package.dev-dependencies]
 dev = [
+    { name = "bump-my-version" },
     { name = "pytest" },
     { name = "ruff" },
 ]
@@ -43,10 +107,57 @@ requires-dist = [
 
 [package.metadata.requires-dev]
 dev = [
+    { name = "bump-my-version", specifier = ">=1.3.0" },
     { name = "pytest", specifier = ">=9.0.2" },
     { name = "ruff", specifier = ">=0.15.7" },
 ]
 
+[[package]]
+name = "h11"
+version = "0.16.0"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/01/ee/02a2c011bdab74c6fb3c75474d40b3052059d95df7e73351460c8588d963/h11-0.16.0.tar.gz", hash = "sha256:4e35b956cf45792e4caa5885e69fba00bdbc6ffafbfa020300e549b208ee5ff1", size = 101250, upload-time = "2025-04-24T03:35:25.427Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/04/4b/29cac41a4d98d144bf5f6d33995617b185d14b22401f75ca86f384e87ff1/h11-0.16.0-py3-none-any.whl", hash = "sha256:63cf8bbe7522de3bf65932fda1d9c2772064ffb3dae62d55932da54b31cb6c86", size = 37515, upload-time = "2025-04-24T03:35:24.344Z" },
+]
+
+[[package]]
+name = "httpcore"
+version = "1.0.9"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "certifi" },
+    { name = "h11" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/06/94/82699a10bca87a5556c9c59b5963f2d039dbd239f25bc2a63907a05a14cb/httpcore-1.0.9.tar.gz", hash = "sha256:6e34463af53fd2ab5d807f399a9b45ea31c3dfa2276f15a2c3f00afff6e176e8", size = 85484, upload-time = "2025-04-24T22:06:22.219Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/7e/f5/f66802a942d491edb555dd61e3a9961140fd64c90bce1eafd741609d334d/httpcore-1.0.9-py3-none-any.whl", hash = "sha256:2d400746a40668fc9dec9810239072b40b4484b640a8c38fd654a024c7a1bf55", size = 78784, upload-time = "2025-04-24T22:06:20.566Z" },
+]
+
+[[package]]
+name = "httpx"
+version = "0.28.1"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "anyio" },
+    { name = "certifi" },
+    { name = "httpcore" },
+    { name = "idna" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/b1/df/48c586a5fe32a0f01324ee087459e112ebb7224f646c0b5023f5e79e9956/httpx-0.28.1.tar.gz", hash = "sha256:75e98c5f16b0f35b567856f597f06ff2270a374470a5c2392242528e3e3e42fc", size = 141406, upload-time = "2024-12-06T15:37:23.222Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/2a/39/e50c7c3a983047577ee07d2a9e53faf5a69493943ec3f6a384bdc792deb2/httpx-0.28.1-py3-none-any.whl", hash = "sha256:d909fcccc110f8c7faf814ca82a9a4d816bc5a6dbfea25d6591d6985b8ba59ad", size = 73517, upload-time = "2024-12-06T15:37:21.509Z" },
+]
+
+[[package]]
+name = "idna"
+version = "3.18"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/cd/63/9496c57188a2ee585e0f1db071d75089a11e98aa86eb99d9d7618fc1edce/idna-3.18.tar.gz", hash = "sha256:ffb385a7e039654cef1ab9ef32c6fafe283c0c0467bba1d9029738ce4a14a848", size = 196711, upload-time = "2026-06-02T14:34:07.794Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/1e/5e/d4e9f1a599fb8e573b7b87160658329fbf28d19eac2718f51fc3def3aa5a/idna-3.18-py3-none-any.whl", hash = "sha256:7f952cbe720b688055e3f87de14f5c3e5fdaa8bc3928985c4077ca689de849a2", size = 65455, upload-time = "2026-06-02T14:34:06.319Z" },
+]
+
 [[package]]
 name = "iniconfig"
 version = "2.3.0"
@@ -56,6 +167,27 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/cb/b1/3846dd7f199d53cb17f49cba7e651e9ce294d8497c8c150530ed11865bb8/iniconfig-2.3.0-py3-none-any.whl", hash = "sha256:f631c04d2c48c52b84d0d0549c99ff3859c98df65b3101406327ecc7d53fbf12", size = 7484, upload-time = "2025-10-18T21:55:41.639Z" },
 ]
 
+[[package]]
+name = "markdown-it-py"
+version = "4.2.0"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "mdurl" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/06/ff/7841249c247aa650a76b9ee4bbaeae59370dc8bfd2f6c01f3630c35eb134/markdown_it_py-4.2.0.tar.gz", hash = "sha256:04a21681d6fbb623de53f6f364d352309d4094dd4194040a10fd51833e418d49", size = 82454, upload-time = "2026-05-07T12:08:28.36Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/b3/81/4da04ced5a082363ecfa159c010d200ecbd959ae410c10c0264a38cac0f5/markdown_it_py-4.2.0-py3-none-any.whl", hash = "sha256:9f7ebbcd14fe59494226453aed97c1070d83f8d24b6fc3a3bcf9a38092641c4a", size = 91687, upload-time = "2026-05-07T12:08:27.182Z" },
+]
+
+[[package]]
+name = "mdurl"
+version = "0.1.2"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/d6/54/cfe61301667036ec958cb99bd3efefba235e65cdeb9c84d24a8293ba1d90/mdurl-0.1.2.tar.gz", hash = "sha256:bb413d29f5eea38f31dd4754dd7377d4465116fb207585f97bf925588687c1ba", size = 8729, upload-time = "2022-08-14T12:40:10.846Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/b3/38/89ba8ad64ae25be8de66a6d463314cf1eb366222074cfda9ee839c56a4b4/mdurl-0.1.2-py3-none-any.whl", hash = "sha256:84008a41e51615a49fc9966191ff91509e3c40b939176e643fd50a5c2196b8f8", size = 9979, upload-time = "2022-08-14T12:40:09.779Z" },
+]
+
 [[package]]
 name = "packaging"
 version = "26.0"
@@ -102,6 +234,18 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/da/76/2d48927e0aa2abbdde08cbf4a2536883b73277d47fbeca95e952de86df34/polars_runtime_32-1.39.3-cp310-abi3-win_arm64.whl", hash = "sha256:f49f51461de63f13e5dd4eb080421c8f23f856945f3f8bd5b2b1f59da52c2860", size = 41857648, upload-time = "2026-03-20T11:15:01.142Z" },
 ]
 
+[[package]]
+name = "prompt-toolkit"
+version = "3.0.52"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "wcwidth" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/a1/96/06e01a7b38dce6fe1db213e061a4602dd6032a8a97ef6c1a862537732421/prompt_toolkit-3.0.52.tar.gz", hash = "sha256:28cde192929c8e7321de85de1ddbe736f1375148b02f2e17edd840042b1be855", size = 434198, upload-time = "2025-08-27T15:24:02.057Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/84/03/0d3ce49e2505ae70cf43bc5bb3033955d2fc9f932163e84dc0779cc47f48/prompt_toolkit-3.0.52-py3-none-any.whl", hash = "sha256:9aac639a3bbd33284347de5ad8d68ecc044b91a762dc39b7c21095fcd6a19955", size = 391431, upload-time = "2025-08-27T15:23:59.498Z" },
+]
+
 [[package]]
 name = "pydantic"
 version = "2.12.5"
@@ -214,6 +358,20 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/36/c7/cfc8e811f061c841d7990b0201912c3556bfeb99cdcb7ed24adc8d6f8704/pydantic_core-2.41.5-pp311-pypy311_pp73-win_amd64.whl", hash = "sha256:56121965f7a4dc965bff783d70b907ddf3d57f6eba29b6d2e5dabfaf07799c51", size = 2145302, upload-time = "2025-11-04T13:43:46.64Z" },
 ]
 
+[[package]]
+name = "pydantic-settings"
+version = "2.14.1"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "pydantic" },
+    { name = "python-dotenv" },
+    { name = "typing-inspection" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/07/60/1d1e59c9c90d54591469ada7d268251f71c24bdb765f1a8a832cee8c6653/pydantic_settings-2.14.1.tar.gz", hash = "sha256:e874d3bec7e787b0c9958277956ed9b4dd5de6a80e162188fdaff7c5e26fd5fa", size = 235551, upload-time = "2026-05-08T13:40:06.542Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/ae/8d/f1af3832f5e6eb13ba94ee809e72b8ecb5eef226d27ee0bef7d963d943c7/pydantic_settings-2.14.1-py3-none-any.whl", hash = "sha256:6e3c7edfd8277687cdc598f56e5cff0e9bfff0910a3749deaa8d4401c3a2b9de", size = 60964, upload-time = "2026-05-08T13:40:04.958Z" },
+]
+
 [[package]]
 name = "pygments"
 version = "2.19.2"
@@ -239,6 +397,54 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/3b/ab/b3226f0bd7cdcf710fbede2b3548584366da3b19b5021e74f5bde2a8fa3f/pytest-9.0.2-py3-none-any.whl", hash = "sha256:711ffd45bf766d5264d487b917733b453d917afd2b0ad65223959f59089f875b", size = 374801, upload-time = "2025-12-06T21:30:49.154Z" },
 ]
 
+[[package]]
+name = "python-dotenv"
+version = "1.2.2"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/82/ed/0301aeeac3e5353ef3d94b6ec08bbcabd04a72018415dcb29e588514bba8/python_dotenv-1.2.2.tar.gz", hash = "sha256:2c371a91fbd7ba082c2c1dc1f8bf89ca22564a087c2c287cd9b662adde799cf3", size = 50135, upload-time = "2026-03-01T16:00:26.196Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/0b/d7/1959b9648791274998a9c3526f6d0ec8fd2233e4d4acce81bbae76b44b2a/python_dotenv-1.2.2-py3-none-any.whl", hash = "sha256:1d8214789a24de455a8b8bd8ae6fe3c6b69a5e3d64aa8a8e5d68e694bbcb285a", size = 22101, upload-time = "2026-03-01T16:00:25.09Z" },
+]
+
+[[package]]
+name = "questionary"
+version = "2.1.1"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "prompt-toolkit" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/f6/45/eafb0bba0f9988f6a2520f9ca2df2c82ddfa8d67c95d6625452e97b204a5/questionary-2.1.1.tar.gz", hash = "sha256:3d7e980292bb0107abaa79c68dd3eee3c561b83a0f89ae482860b181c8bd412d", size = 25845, upload-time = "2025-08-28T19:00:20.851Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/3c/26/1062c7ec1b053db9e499b4d2d5bc231743201b74051c973dadeac80a8f43/questionary-2.1.1-py3-none-any.whl", hash = "sha256:a51af13f345f1cdea62347589fbb6df3b290306ab8930713bfae4d475a7d4a59", size = 36753, upload-time = "2025-08-28T19:00:19.56Z" },
+]
+
+[[package]]
+name = "rich"
+version = "15.0.0"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "markdown-it-py" },
+    { name = "pygments" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/c0/8f/0722ca900cc807c13a6a0c696dacf35430f72e0ec571c4275d2371fca3e9/rich-15.0.0.tar.gz", hash = "sha256:edd07a4824c6b40189fb7ac9bc4c52536e9780fbbfbddf6f1e2502c31b068c36", size = 230680, upload-time = "2026-04-12T08:24:00.75Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/82/3b/64d4899d73f91ba49a8c18a8ff3f0ea8f1c1d75481760df8c68ef5235bf5/rich-15.0.0-py3-none-any.whl", hash = "sha256:33bd4ef74232fb73fe9279a257718407f169c09b78a87ad3d296f548e27de0bb", size = 310654, upload-time = "2026-04-12T08:24:02.83Z" },
+]
+
+[[package]]
+name = "rich-click"
+version = "1.9.8"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "click" },
+    { name = "colorama", marker = "sys_platform == 'win32'" },
+    { name = "rich" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/f7/ea/21e4867ea0ef881ffd4c0550fc21a061435e50d6324bcd034396633cbc18/rich_click-1.9.8.tar.gz", hash = "sha256:4008f921da88b5d91646c134ec881c1500e5a6b3f093e90e8f29400e09608371", size = 75363, upload-time = "2026-05-28T19:54:59.144Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/6d/97/a87901aef6b7e7e4a34c6dd6cc17dca8594a592ef9d9dd765fca2b7facf7/rich_click-1.9.8-py3-none-any.whl", hash = "sha256:12873865396e6927835d4eabb1cc3996edcd65b7ac9b2391a29eca4f335a2f93", size = 72189, upload-time = "2026-05-28T19:54:57.867Z" },
+]
+
 [[package]]
 name = "ruff"
 version = "0.15.7"
@@ -264,6 +470,15 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/8f/e8/726643a3ea68c727da31570bde48c7a10f1aa60eddd628d94078fec586ff/ruff-0.15.7-py3-none-win_arm64.whl", hash = "sha256:18e8d73f1c3fdf27931497972250340f92e8c861722161a9caeb89a58ead6ed2", size = 11023304, upload-time = "2026-03-19T16:26:51.669Z" },
 ]
 
+[[package]]
+name = "tomlkit"
+version = "0.15.0"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/51/db/03eaf4331631ef6b27d6e3c9b68c54dc6f0d63d87201fed600cc409307fd/tomlkit-0.15.0.tar.gz", hash = "sha256:7d1a9ecba3086638211b13814ea79c90dd54dd11993564376f3aa92271f5c7a3", size = 161875, upload-time = "2026-05-10T07:38:22.245Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/6a/43/8bd850ee71a191bf072e31302c73a66be413fecdd98fdcd111ecbcce13ca/tomlkit-0.15.0-py3-none-any.whl", hash = "sha256:4dbc8f0fc024412b57ced8757ac7461305126a648ff8c2c807fcb8e133a78738", size = 41328, upload-time = "2026-05-10T07:38:23.517Z" },
+]
+
 [[package]]
 name = "typing-extensions"
 version = "4.15.0"
@@ -284,3 +499,24 @@ sdist = { url = "https://files.pythonhosted.org/packages/55/e3/70399cb7dd41c10ac
 wheels = [
     { url = "https://files.pythonhosted.org/packages/dc/9b/47798a6c91d8bdb567fe2698fe81e0c6b7cb7ef4d13da4114b41d239f65d/typing_inspection-0.4.2-py3-none-any.whl", hash = "sha256:4ed1cacbdc298c220f1bd249ed5287caa16f34d44ef4e9c3d0cbad5b521545e7", size = 14611, upload-time = "2025-10-01T02:14:40.154Z" },
 ]
+
+[[package]]
+name = "wcmatch"
+version = "10.1"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "bracex" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/79/3e/c0bdc27cf06f4e47680bd5803a07cb3dfd17de84cde92dd217dcb9e05253/wcmatch-10.1.tar.gz", hash = "sha256:f11f94208c8c8484a16f4f48638a85d771d9513f4ab3f37595978801cb9465af", size = 117421, upload-time = "2025-06-22T19:14:02.49Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/eb/d8/0d1d2e9d3fabcf5d6840362adcf05f8cf3cd06a73358140c3a97189238ae/wcmatch-10.1-py3-none-any.whl", hash = "sha256:5848ace7dbb0476e5e55ab63c6bbd529745089343427caa5537f230cc01beb8a", size = 39854, upload-time = "2025-06-22T19:14:00.978Z" },
+]
+
+[[package]]
+name = "wcwidth"
+version = "0.8.1"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/49/b4/51fe890511f0f242d07cb1ebe6a5b6db417262b9d2568b460347c57d95cc/wcwidth-0.8.1.tar.gz", hash = "sha256:faf5b4a5366a72dc49cad48cdf21f52bdf63bdda995178e483ba247ff79089b9", size = 1466072, upload-time = "2026-06-08T05:57:23.146Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/bd/6e/95b0e537de1f4d4301f76f944642c6da50d1511cc7b3d64dc418a66c7509/wcwidth-0.8.1-py3-none-any.whl", hash = "sha256:f453740b1e4a4f3291faa37944c555d71056c4da08d59809b307ef4feba695c8", size = 323092, upload-time = "2026-06-08T05:57:21.413Z" },
+]

From a1d6413959ab560ab2a776962c4af8d9464fa23e Mon Sep 17 00:00:00 2001
From: Beatrice Marti <mabesa@users.noreply.github.com>
Date: Mon, 15 Jun 2026 11:04:55 +0200
Subject: [PATCH 02/16] feat(input): align spatial vocabulary to
 SpatialRepresentation and add target declaration
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Phase 1 of the SAP3-alignment work — the settled input-contract changes.

- Rename SpatialResolution → SpatialRepresentation, adopting SAP3's exact member
  names/values (POINT, BASIN_AVERAGE, ELEVATION_BAND, GRIDDED) so the adapter mapping
  is identity. Banded SnowMapper forcing (SWE, ROF) is declared at ELEVATION_BAND.
- Add explicit target declaration: new input/target.py with OutputRepresentation
  (DETERMINISTIC/QUANTILES/TRAJECTORIES) and TargetSpec(unit, representations);
  InputRequirement gains a required, non-empty `targets: dict[str, TargetSpec]`.
  Targets are declared independently of inputs (accommodates Q2 — pure-simulation
  models that lack the target's own history).
- Update tests (117 passing), docs/input_requirement.md, README.md, and the
  fi-sap3-mapping.md spatial/target rows (mapping is now identity).

BREAKING: SpatialResolution renamed and InputRequirement now requires `targets`.
Acceptable pre-1.0 with no external conformers. Patch bump 0.1.2 → 0.1.3.
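
Migration note: a minimal sketch, assuming a downstream config still keys its
`data` mapping by the old enum values (`lumped`, `hru`). The helper name
`migrate_spatial_keys` and the `OLD_TO_NEW` table are hypothetical and not part
of this patch; the renames themselves follow the mapping table in
fi-sap3-mapping.md.

```python
# Hypothetical helper, not shipped with forecast_interface.
# Old SpatialResolution values -> new SpatialRepresentation values.
OLD_TO_NEW = {
    "lumped": "basin_average",
    "hru": "elevation_band",
    "gridded": "gridded",  # unchanged
}


def migrate_spatial_keys(data: dict) -> dict:
    """Return a copy of a `data` mapping with old spatial keys renamed.

    Keys that are already in the new vocabulary (e.g. "point") pass through.
    """
    migrated = {}
    for key, spec in data.items():
        new_key = OLD_TO_NEW.get(key, key)
        if new_key in migrated:
            # e.g. both "lumped" and "basin_average" were present
            raise ValueError(f"duplicate spatial key after migration: {new_key}")
        migrated[new_key] = spec
    return migrated
```

A config migrated this way still needs a `targets` block added by hand, since
the old schema carries no information about what the model forecasts.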

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 README.md                                |  14 +-
 docs/fi-sap3-mapping.md                  |  25 ++-
 docs/input_requirement.md                |  74 +++++--
 forecast_interface/__init__.py           |  10 +-
 forecast_interface/common/__init__.py    |   4 +-
 forecast_interface/common/resolutions.py |   7 +-
 forecast_interface/input/__init__.py     |  12 +-
 forecast_interface/input/requirement.py  |  30 ++-
 forecast_interface/input/target.py       |  28 +++
 forecast_interface/output/metadata.py    |  10 +-
 pyproject.toml                           |   4 +-
 tests/test_input.py                      | 233 +++++++++++++++++++----
 tests/test_interface.py                  |  14 +-
 uv.lock                                  |   2 +-
 14 files changed, 365 insertions(+), 102 deletions(-)
 create mode 100644 forecast_interface/input/target.py

diff --git a/README.md b/README.md
index 11d6019..8fee9d5 100644
--- a/README.md
+++ b/README.md
@@ -126,12 +126,16 @@ See [Input Requirement Specification](docs/input_requirement.md) for full docume
 
 ```
 InputRequirement
+    targets: dict[str, TargetSpec]                  # what the model forecasts
     dynamic: dict[TemporalResolution, SpatialInputSpec]
     static: set[str]
 
+TargetSpec
+    unit: Unit
+    representations: frozenset[OutputRepresentation]  # DETERMINISTIC | QUANTILES | TRAJECTORIES
+
 SpatialInputSpec
-    distributed: DynamicInputSpec | None
-    lumped: DynamicInputSpec | None
+    data: dict[SpatialRepresentation, DynamicInputSpec]
 
 DynamicInputSpec
     past_known: dict[str, dict[str, PastKnownVariable]]
@@ -146,3 +150,9 @@ FutureKnownVariable
     max_nan: int
     ensemble_mode: EnsembleMode    # SINGLE or ENSEMBLE
 ```
+
+### Enums
+
+**SpatialRepresentation** -- `POINT`, `BASIN_AVERAGE`, `ELEVATION_BAND`, `GRIDDED`
+
+**OutputRepresentation** -- `DETERMINISTIC`, `QUANTILES`, `TRAJECTORIES`
diff --git a/docs/fi-sap3-mapping.md b/docs/fi-sap3-mapping.md
index 7848fef..3a7d6cd 100644
--- a/docs/fi-sap3-mapping.md
+++ b/docs/fi-sap3-mapping.md
@@ -192,8 +192,8 @@ discrepancy note at the end of this section.
 | `PastKnownVariable.lookback` (`input/variable.py:12`) | `lookback_steps: int` (line 266) |
 | `FutureKnownVariable.future_steps` (`input/variable.py:31`) | `forecast_horizon_steps: int` (line 267) |
 | `TemporalResolution` keys + `VariableMetadata.timedelta` | `supported_time_steps: frozenset[timedelta]` (line 265) |
-| `SpatialResolution` keys (`input/requirement.py:22`) | `spatial_input_type: SpatialRepresentation` (line 268) |
-| *(no FI equivalent yet)* — proposed by SAP3 PR | `target_parameters: frozenset[str]` (line 261) |
+| `SpatialRepresentation` keys (`input/requirement.py`) | `spatial_input_type: SpatialRepresentation` (line 268) |
+| `InputRequirement.targets` keys + `TargetSpec.unit`/`.representations` (`input/target.py`) | `target_parameters: frozenset[str]` (line 261) |
 | `PastKnownVariable.max_nan` / `FutureKnownVariable.max_nan` (`input/variable.py:13,32`) | Derivable from SAP3 QC config (doc 014 line 273) |
 | `FutureKnownVariable.ensemble_mode` (`input/variable.py:33`) | Derivable from NWP ensemble config (doc 014 line 273) |
 
@@ -214,19 +214,18 @@ to the FI model via the adapter. The slots are identical for station and group:
 
 ### Spatial enum mapping (FI → SAP3)
 
-FI `SpatialResolution` (`common/resolutions.py:15`) → SAP3 `SpatialRepresentation`
-(`types/enums.py:73`):
+As of Phase 1, FI's `SpatialRepresentation` (`common/resolutions.py`) adopts SAP3's exact
+member names and values (`types/enums.py:73`), so the mapping is **identity**:
 
-| FI `SpatialResolution` | SAP3 `SpatialRepresentation` |
+| FI `SpatialRepresentation` | SAP3 `SpatialRepresentation` |
 |---|---|
-| `LUMPED` | `BASIN_AVERAGE` (`"basin_average"`) |
-| `HRU` | `ELEVATION_BAND` (`"elevation_band"`) |
-| `GRIDDED` | `GRIDDED` (`"gridded"`) |
-| *(new in FI — proposed)* `POINT` | `POINT` (`"point"`) |
+| `POINT` (`"point"`) | `POINT` (`"point"`) |
+| `BASIN_AVERAGE` (`"basin_average"`) | `BASIN_AVERAGE` (`"basin_average"`) |
+| `ELEVATION_BAND` (`"elevation_band"`) | `ELEVATION_BAND` (`"elevation_band"`) |
+| `GRIDDED` (`"gridded"`) | `GRIDDED` (`"gridded"`) |
 
-> SAP3 has a `POINT` member already (`types/enums.py:74`); FI's `SpatialResolution`
-> currently has only `LUMPED`/`HRU`/`GRIDDED` (`common/resolutions.py:16–18`). Adding
-> `POINT` to FI is part of the SAP3→FI input PR (see Open Items §8).
+> The earlier `LUMPED`/`HRU` names were renamed to `BASIN_AVERAGE`/`ELEVATION_BAND` and
+> `POINT` was added, completing the alignment proposed in the SAP3→FI input PR.
 
 ---
 
@@ -317,5 +316,5 @@ FI/artifact side answers *"what is this model and how was it built"*; SAP3 side
 | 5 | Epistemic uncertainty | `EpistemicUncertaintyData` is dropped at the boundary in v0b (doc 014 lines 197–204). Revisit (add to `ForecastEnsemble` / store as metadata) if models emit it. |
 | 6 | Interface module now exists | doc 014 assumes FI's `interface/` is unimplemented (lines 80, 287). It is now implemented (`ForecastModel`, `ModelResult`, `FailureCause`). SAP3 should re-evaluate Tasks 4–5 against the real protocol. |
 | 7 | `ModelResult` failure channel | FI now returns `ModelResult = ModelSuccess \| ModelFailure` (`interface/result.py:40`) with a `FailureCause` enum (`interface/failure.py:4`). SAP3's `ModelOutputError` path must account for the `ModelFailure` branch, not only all-`FAILURE` `ModelOutput`. |
-| 8 | Resolution enum split | FI split into `TemporalResolution` + `SpatialResolution` (`common/resolutions.py`); doc 014 references a single `Resolution`. Mapping tables above use the current split. |
+| 8 | Resolution enum split | FI split into `TemporalResolution` + `SpatialRepresentation` (`common/resolutions.py`); doc 014 references a single `Resolution`. Mapping tables above use the current split. |
 ```
diff --git a/docs/input_requirement.md b/docs/input_requirement.md
index 3b06c45..446e9e5 100644
--- a/docs/input_requirement.md
+++ b/docs/input_requirement.md
@@ -6,10 +6,32 @@ All declared inputs are **required** — the pipeline fails if any are missing.
 
 ## Input Categories
 
-Two top-level categories:
+Three top-level declarations:
 
-1. **Dynamic inputs** — time-varying data (e.g. discharge, precipitation, temperature)
-2. **Static inputs** — time-invariant attributes (e.g. catchment area, slope, land cover fraction)
+1. **Targets** — what the model forecasts (the output variables and their supported representations)
+2. **Dynamic inputs** — time-varying data (e.g. discharge, precipitation, temperature)
+3. **Static inputs** — time-invariant attributes (e.g. catchment area, slope, land cover fraction)
+
+---
+
+## Targets
+
+A model must declare what it forecasts. `targets` is a dict keyed by variable name, each mapping to a `TargetSpec`:
+
+```python
+class TargetSpec(BaseModel):
+    unit: Unit
+    representations: frozenset[OutputRepresentation]
+```
+
+- **unit** — the physical unit of the target (e.g. `Unit.M3_PER_S`).
+- **representations** — the output forms the model can produce for this target, a non-empty set of `OutputRepresentation`: `deterministic`, `quantiles`, `trajectories`. A target may support more than one form.
+
+The combinability rule (whether a target's forecasts can be BMA-combined) is derived downstream from whether `TRAJECTORIES` is present; it is not encoded here.
+
+Targets are declared **independently** of inputs. A model that needs the target's own past history simply lists that variable under `past_known` in its dynamic inputs; a pure-simulation model omits it. (See Q2 in `open_design_questions.md`.)
+
+`targets` must contain at least one entry — a model must forecast something.
 
 ---
 
@@ -32,22 +54,23 @@ temporal_resolution
 
 The time step of the data. One of: `sub_hourly`, `hourly`, `sub_daily`, `daily`, `weekly`, `monthly`, `seasonal`, `annual`.
 
-#### 2. Spatial Resolution
+#### 2. Spatial Representation
 
-How spatial information is represented. Keyed by the `SpatialResolution` enum:
+How spatial information is represented. Keyed by the `SpatialRepresentation` enum (values mirror SAP3's enum, so the adapter mapping is identity):
 
-- **lumped** — single time series per basin (station observations or basin-averaged values)
-- **hru** — semi-distributed: multiple time series per basin (elevation bands, clusters, HRUs)
+- **point** — a single point location (e.g. a gauge or grid-cell extraction)
+- **basin_average** — single time series per basin (station observations or basin-averaged values)
+- **elevation_band** — semi-distributed: one time series per elevation band (e.g. banded SnowMapper forcing such as `swe` and `rof` is declared at `elevation_band`)
 - **gridded** — fully distributed raster data (spatial variability preserved)
 
-The `SpatialInputSpec` model holds a `data` dict keyed by `SpatialResolution`:
+The `SpatialInputSpec` model holds a `data` dict keyed by `SpatialRepresentation`:
 
 ```python
 class SpatialInputSpec(BaseModel):
-    data: dict[SpatialResolution, DynamicInputSpec]
+    data: dict[SpatialRepresentation, DynamicInputSpec]
 ```
 
-A model can require any combination of the three resolutions within the same temporal resolution.
+A model can require any combination of representations within the same temporal resolution.
 
 #### 3. Temporality
 
@@ -88,10 +111,17 @@ Example: `["catchment_area", "mean_slope", "forest_fraction", "clay_fraction"]`
 ## Full Example
 
 ```yaml
+targets:
+  discharge:
+    unit: "m³/s"
+    representations:
+      - quantiles
+      - trajectories
+
 dynamic:
   daily:
     data:
-      lumped:
+      basin_average:
         past_known:
           obs:
             discharge:
@@ -124,18 +154,20 @@ dynamic:
             precipitation:
               lookback: 30
               max_nan: 3
-      hru:
-        past_known:
-          obs:
-            precipitation:
-              lookback: 30
-              max_nan: 5
-            temperature:
-              lookback: 30
-              max_nan: 3
+      elevation_band:
+        future_known:
+          SnowMapper:
+            swe:
+              future_steps: 10
+              max_nan: 0
+              ensemble_mode: single
+            rof:
+              future_steps: 10
+              max_nan: 0
+              ensemble_mode: single
   hourly:
     data:
-      lumped:
+      basin_average:
         past_known:
           obs:
             discharge:
diff --git a/forecast_interface/__init__.py b/forecast_interface/__init__.py
index 560bc93..b983fcf 100644
--- a/forecast_interface/__init__.py
+++ b/forecast_interface/__init__.py
@@ -1,13 +1,15 @@
-__version__ = "0.1.2"
+__version__ = "0.1.3"
 
 from .input import (
     DynamicInputSpec,
     EnsembleMode,
     FutureKnownVariable,
     InputRequirement,
+    OutputRepresentation,
     PastKnownVariable,
     SpatialInputSpec,
-    SpatialResolution,
+    SpatialRepresentation,
+    TargetSpec,
 )
 from .interface import (
     FailureCause,
@@ -44,10 +46,12 @@
     "ModelOutput",
     "ModelResult",
     "ModelSuccess",
+    "OutputRepresentation",
     "PastKnownVariable",
     "QuantileData",
     "SpatialInputSpec",
-    "SpatialResolution",
+    "SpatialRepresentation",
+    "TargetSpec",
     "TemporalResolution",
     "TrajectoryData",
     "Unit",
diff --git a/forecast_interface/common/__init__.py b/forecast_interface/common/__init__.py
index f7c3169..4615ca6 100644
--- a/forecast_interface/common/__init__.py
+++ b/forecast_interface/common/__init__.py
@@ -1,8 +1,8 @@
-from .resolutions import SpatialResolution, TemporalResolution
+from .resolutions import SpatialRepresentation, TemporalResolution
 from .units import Unit
 
 __all__ = [
-    "SpatialResolution",
+    "SpatialRepresentation",
     "TemporalResolution",
     "Unit",
 ]
diff --git a/forecast_interface/common/resolutions.py b/forecast_interface/common/resolutions.py
index 1c33eff..95a2a92 100644
--- a/forecast_interface/common/resolutions.py
+++ b/forecast_interface/common/resolutions.py
@@ -12,7 +12,8 @@ class TemporalResolution(Enum):
     ANNUAL = "annual"
 
 
-class SpatialResolution(Enum):
-    LUMPED = "lumped"
-    HRU = "hru"
+class SpatialRepresentation(Enum):
+    POINT = "point"
+    BASIN_AVERAGE = "basin_average"
+    ELEVATION_BAND = "elevation_band"
     GRIDDED = "gridded"
diff --git a/forecast_interface/input/__init__.py b/forecast_interface/input/__init__.py
index 86f9611..baf731d 100644
--- a/forecast_interface/input/__init__.py
+++ b/forecast_interface/input/__init__.py
@@ -1,10 +1,14 @@
-from forecast_interface.common.resolutions import SpatialResolution, TemporalResolution
+from forecast_interface.common.resolutions import (
+    SpatialRepresentation,
+    TemporalResolution,
+)
 
 from .requirement import (
     DynamicInputSpec,
     InputRequirement,
     SpatialInputSpec,
 )
+from .target import OutputRepresentation, TargetSpec
 from .variable import EnsembleMode, FutureKnownVariable, PastKnownVariable
 
 __all__ = [
@@ -12,8 +16,10 @@
     "EnsembleMode",
     "FutureKnownVariable",
     "InputRequirement",
+    "OutputRepresentation",
     "PastKnownVariable",
-    "SpatialResolution",
-    "TemporalResolution",
     "SpatialInputSpec",
+    "SpatialRepresentation",
+    "TargetSpec",
+    "TemporalResolution",
 ]
diff --git a/forecast_interface/input/requirement.py b/forecast_interface/input/requirement.py
index d626330..44b7b49 100644
--- a/forecast_interface/input/requirement.py
+++ b/forecast_interface/input/requirement.py
@@ -1,7 +1,11 @@
 from pydantic import BaseModel, field_validator, model_validator
 
-from forecast_interface.common.resolutions import SpatialResolution, TemporalResolution
+from forecast_interface.common.resolutions import (
+    SpatialRepresentation,
+    TemporalResolution,
+)
 
+from .target import TargetSpec
 from .variable import FutureKnownVariable, PastKnownVariable
 
 
@@ -19,23 +23,39 @@ def _at_least_one_temporality(self) -> "DynamicInputSpec":
 
 
 class SpatialInputSpec(BaseModel):
-    data: dict[SpatialResolution, DynamicInputSpec]
+    data: dict[SpatialRepresentation, DynamicInputSpec]
 
     @field_validator("data")
     @classmethod
     def _at_least_one_spatial(
         cls,
-        v: dict[SpatialResolution, DynamicInputSpec],
-    ) -> dict[SpatialResolution, DynamicInputSpec]:
+        v: dict[SpatialRepresentation, DynamicInputSpec],
+    ) -> dict[SpatialRepresentation, DynamicInputSpec]:
         if not v:
-            raise ValueError("data must contain at least one spatial resolution")
+            raise ValueError("data must contain at least one spatial representation")
         return v
 
 
 class InputRequirement(BaseModel):
+    # Targets are declared independently of inputs; a model needing the target's own
+    # history lists it under past_known (see Q2 in open_design_questions.md).
+    targets: dict[str, TargetSpec]
     dynamic: dict[TemporalResolution, SpatialInputSpec]
     static: set[str] = set()
 
+    @field_validator("targets")
+    @classmethod
+    def _at_least_one_target(
+        cls,
+        v: dict[str, TargetSpec],
+    ) -> dict[str, TargetSpec]:
+        if not v:
+            raise ValueError("targets must contain at least one entry")
+        for name in v:
+            if not name or not name.strip():
+                raise ValueError("target variable names must be non-empty strings")
+        return v
+
     @field_validator("dynamic")
     @classmethod
     def _at_least_one_resolution(
diff --git a/forecast_interface/input/target.py b/forecast_interface/input/target.py
new file mode 100644
index 0000000..9b23274
--- /dev/null
+++ b/forecast_interface/input/target.py
@@ -0,0 +1,28 @@
+from enum import Enum
+
+from pydantic import BaseModel, field_validator
+
+from forecast_interface.common import Unit
+
+
+class OutputRepresentation(Enum):
+    DETERMINISTIC = "deterministic"
+    QUANTILES = "quantiles"
+    TRAJECTORIES = "trajectories"
+
+
+class TargetSpec(BaseModel):
+    unit: Unit
+    representations: frozenset[OutputRepresentation]
+
+    @field_validator("representations")
+    @classmethod
+    def _non_empty_representations(
+        cls,
+        v: frozenset[OutputRepresentation],
+    ) -> frozenset[OutputRepresentation]:
+        if not v:
+            raise ValueError(
+                "representations must contain at least one output representation"
+            )
+        return v
diff --git a/forecast_interface/output/metadata.py b/forecast_interface/output/metadata.py
index 2d19dbf..4d4d00f 100644
--- a/forecast_interface/output/metadata.py
+++ b/forecast_interface/output/metadata.py
@@ -7,12 +7,14 @@
 
 
 class VariableMetadata(BaseModel):
-    name: str # Name of the variable, e.g. "discharge", "water_level", etc.
+    name: str  # Name of the variable, e.g. "discharge", "water_level", etc.
     unit: Unit
     resolution: TemporalResolution
-    timedelta: datetime.timedelta # Concrete value im minutes for example, but can be any positive timedelta that is consistent with the resolution
-    forecast_horizon: int # Number of time steps of length timedelta that the forecast is made for 
-    offset: int # Number of time steps of length timedelta between the last observed data point and the first forecasted data point
+    timedelta: datetime.timedelta  # Concrete value in minutes, for example, but can be any positive timedelta that is consistent with the resolution
+    forecast_horizon: (
+        int  # Number of time steps of length timedelta that the forecast is made for
+    )
+    offset: int  # Number of time steps of length timedelta between the last observed data point and the first forecasted data point
 
     @field_validator("name")
     @classmethod
diff --git a/pyproject.toml b/pyproject.toml
index 7778a90..dd72002 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -1,6 +1,6 @@
 [project]
 name = "forecastinterface"
-version = "0.1.2"
+version = "0.1.3"
 description = "Add your description here"
 readme = "README.md"
 requires-python = ">=3.11"
@@ -13,7 +13,7 @@ dependencies = [
 pythonpath = ["."]
 
 [tool.bumpversion]
-current_version = "0.1.2"
+current_version = "0.1.3"
 commit = false
 tag = false
 allow_dirty = true
diff --git a/tests/test_input.py b/tests/test_input.py
index 91943ee..d4e2944 100644
--- a/tests/test_input.py
+++ b/tests/test_input.py
@@ -1,18 +1,30 @@
 import pytest
 from pydantic import ValidationError
 
+from forecast_interface.common import Unit
 from forecast_interface.input import (
     DynamicInputSpec,
     EnsembleMode,
     FutureKnownVariable,
     InputRequirement,
+    OutputRepresentation,
     PastKnownVariable,
-    TemporalResolution,
     SpatialInputSpec,
-    SpatialResolution,
+    SpatialRepresentation,
+    TargetSpec,
+    TemporalResolution,
 )
 
 
+def _target() -> dict[str, TargetSpec]:
+    return {
+        "discharge": TargetSpec(
+            unit=Unit.M3_PER_S,
+            representations=frozenset({OutputRepresentation.QUANTILES}),
+        )
+    }
+
+
 # ---------------------------------------------------------------------------
 # Variable types
 # ---------------------------------------------------------------------------
@@ -66,6 +78,40 @@ def test_max_nan_negative(self) -> None:
             FutureKnownVariable(future_steps=1, max_nan=-1)
 
 
+# ---------------------------------------------------------------------------
+# Target types
+# ---------------------------------------------------------------------------
+
+
+class TestOutputRepresentation:
+    def test_members(self) -> None:
+        assert OutputRepresentation.DETERMINISTIC.value == "deterministic"
+        assert OutputRepresentation.QUANTILES.value == "quantiles"
+        assert OutputRepresentation.TRAJECTORIES.value == "trajectories"
+        assert len(OutputRepresentation) == 3
+
+
+class TestTargetSpec:
+    def test_valid(self) -> None:
+        spec = TargetSpec(
+            unit=Unit.M3_PER_S,
+            representations=frozenset(
+                {OutputRepresentation.QUANTILES, OutputRepresentation.TRAJECTORIES}
+            ),
+        )
+        assert spec.unit == Unit.M3_PER_S
+        assert OutputRepresentation.QUANTILES in spec.representations
+        assert len(spec.representations) == 2
+
+    def test_empty_representations_raises(self) -> None:
+        with pytest.raises(ValidationError, match="at least one output representation"):
+            TargetSpec(unit=Unit.M3_PER_S, representations=frozenset())
+
+    def test_unit_required(self) -> None:
+        with pytest.raises(ValidationError, match="unit"):
+            TargetSpec(representations=frozenset({OutputRepresentation.DETERMINISTIC}))
+
+
 # ---------------------------------------------------------------------------
 # Spec containers
 # ---------------------------------------------------------------------------
@@ -96,47 +142,62 @@ def test_empty_raises(self) -> None:
 
 
 class TestSpatialInputSpec:
-    def test_lumped_only(self) -> None:
+    def test_basin_average_only(self) -> None:
         dynamic = DynamicInputSpec(
             past_known={"obs": {"q": PastKnownVariable(lookback=10, max_nan=0)}}
         )
-        spec = SpatialInputSpec(data={SpatialResolution.LUMPED: dynamic})
-        assert SpatialResolution.LUMPED in spec.data
+        spec = SpatialInputSpec(data={SpatialRepresentation.BASIN_AVERAGE: dynamic})
+        assert SpatialRepresentation.BASIN_AVERAGE in spec.data
         assert len(spec.data) == 1
 
     def test_gridded_only(self) -> None:
         dynamic = DynamicInputSpec(
             past_known={"ERA5": {"swe": PastKnownVariable(lookback=90, max_nan=5)}}
         )
-        spec = SpatialInputSpec(data={SpatialResolution.GRIDDED: dynamic})
-        assert SpatialResolution.GRIDDED in spec.data
+        spec = SpatialInputSpec(data={SpatialRepresentation.GRIDDED: dynamic})
+        assert SpatialRepresentation.GRIDDED in spec.data
         assert len(spec.data) == 1
 
+    def test_elevation_band(self) -> None:
+        dynamic = DynamicInputSpec(
+            past_known={
+                "SnowMapper": {"swe": PastKnownVariable(lookback=30, max_nan=2)}
+            }
+        )
+        spec = SpatialInputSpec(data={SpatialRepresentation.ELEVATION_BAND: dynamic})
+        assert SpatialRepresentation.ELEVATION_BAND in spec.data
+
     def test_both(self) -> None:
-        lumped = DynamicInputSpec(
+        basin = DynamicInputSpec(
             past_known={"obs": {"q": PastKnownVariable(lookback=10, max_nan=0)}}
         )
         gridded = DynamicInputSpec(
             past_known={"ERA5": {"swe": PastKnownVariable(lookback=90, max_nan=5)}}
         )
         spec = SpatialInputSpec(
-            data={SpatialResolution.LUMPED: lumped, SpatialResolution.GRIDDED: gridded}
+            data={
+                SpatialRepresentation.BASIN_AVERAGE: basin,
+                SpatialRepresentation.GRIDDED: gridded,
+            }
         )
-        assert SpatialResolution.LUMPED in spec.data
-        assert SpatialResolution.GRIDDED in spec.data
+        assert SpatialRepresentation.BASIN_AVERAGE in spec.data
+        assert SpatialRepresentation.GRIDDED in spec.data
         assert len(spec.data) == 2
 
     def test_neither_raises(self) -> None:
-        with pytest.raises(ValidationError, match="at least one spatial resolution"):
+        with pytest.raises(
+            ValidationError, match="at least one spatial representation"
+        ):
             SpatialInputSpec(data={})
 
 
-class TestSpatialResolution:
+class TestSpatialRepresentation:
     def test_members(self) -> None:
-        assert SpatialResolution.LUMPED.value == "lumped"
-        assert SpatialResolution.HRU.value == "hru"
-        assert SpatialResolution.GRIDDED.value == "gridded"
-        assert len(SpatialResolution) == 3
+        assert SpatialRepresentation.POINT.value == "point"
+        assert SpatialRepresentation.BASIN_AVERAGE.value == "basin_average"
+        assert SpatialRepresentation.ELEVATION_BAND.value == "elevation_band"
+        assert SpatialRepresentation.GRIDDED.value == "gridded"
+        assert len(SpatialRepresentation) == 4
 
 
 # ---------------------------------------------------------------------------
@@ -147,10 +208,11 @@ def test_members(self) -> None:
 class TestInputRequirement:
     def test_minimal(self) -> None:
         req = InputRequirement(
+            targets=_target(),
             dynamic={
                 TemporalResolution.DAILY: SpatialInputSpec(
                     data={
-                        SpatialResolution.LUMPED: DynamicInputSpec(
+                        SpatialRepresentation.BASIN_AVERAGE: DynamicInputSpec(
                             past_known={
                                 "obs": {
                                     "discharge": PastKnownVariable(
@@ -161,17 +223,19 @@ def test_minimal(self) -> None:
                         )
                     }
                 )
-            }
+            },
         )
         assert TemporalResolution.DAILY in req.dynamic
         assert req.static == set()
+        assert "discharge" in req.targets
 
     def test_with_static(self) -> None:
         req = InputRequirement(
+            targets=_target(),
             dynamic={
                 TemporalResolution.DAILY: SpatialInputSpec(
                     data={
-                        SpatialResolution.LUMPED: DynamicInputSpec(
+                        SpatialRepresentation.BASIN_AVERAGE: DynamicInputSpec(
                             past_known={
                                 "obs": {"q": PastKnownVariable(lookback=30, max_nan=0)}
                             }
@@ -183,17 +247,63 @@ def test_with_static(self) -> None:
         )
         assert len(req.static) == 3
 
+    def test_empty_targets_raises(self) -> None:
+        with pytest.raises(ValidationError, match="at least one entry"):
+            InputRequirement(
+                targets={},
+                dynamic={
+                    TemporalResolution.DAILY: SpatialInputSpec(
+                        data={
+                            SpatialRepresentation.BASIN_AVERAGE: DynamicInputSpec(
+                                past_known={
+                                    "obs": {
+                                        "q": PastKnownVariable(lookback=1, max_nan=0)
+                                    }
+                                }
+                            )
+                        }
+                    )
+                },
+            )
+
+    def test_whitespace_target_key_raises(self) -> None:
+        with pytest.raises(
+            ValidationError, match="target variable names must be non-empty"
+        ):
+            InputRequirement(
+                targets={
+                    "  ": TargetSpec(
+                        unit=Unit.M3_PER_S,
+                        representations=frozenset({OutputRepresentation.QUANTILES}),
+                    )
+                },
+                dynamic={
+                    TemporalResolution.DAILY: SpatialInputSpec(
+                        data={
+                            SpatialRepresentation.BASIN_AVERAGE: DynamicInputSpec(
+                                past_known={
+                                    "obs": {
+                                        "q": PastKnownVariable(lookback=1, max_nan=0)
+                                    }
+                                }
+                            )
+                        }
+                    )
+                },
+            )
+
     def test_empty_dynamic_raises(self) -> None:
         with pytest.raises(ValidationError, match="at least one temporal resolution"):
-            InputRequirement(dynamic={})
+            InputRequirement(targets=_target(), dynamic={})
 
     def test_empty_static_string_raises(self) -> None:
         with pytest.raises(ValidationError, match="non-empty strings"):
             InputRequirement(
+                targets=_target(),
                 dynamic={
                     TemporalResolution.DAILY: SpatialInputSpec(
                         data={
-                            SpatialResolution.LUMPED: DynamicInputSpec(
+                            SpatialRepresentation.BASIN_AVERAGE: DynamicInputSpec(
                                 past_known={
                                     "obs": {
                                         "q": PastKnownVariable(lookback=1, max_nan=0)
@@ -208,10 +318,11 @@ def test_empty_static_string_raises(self) -> None:
 
     def test_duplicate_static_deduplicated(self) -> None:
         req = InputRequirement(
+            targets=_target(),
             dynamic={
                 TemporalResolution.DAILY: SpatialInputSpec(
                     data={
-                        SpatialResolution.LUMPED: DynamicInputSpec(
+                        SpatialRepresentation.BASIN_AVERAGE: DynamicInputSpec(
                             past_known={
                                 "obs": {"q": PastKnownVariable(lookback=1, max_nan=0)}
                             }
@@ -230,10 +341,11 @@ def test_duplicate_static_deduplicated(self) -> None:
     def test_whitespace_static_string_raises(self) -> None:
         with pytest.raises(ValidationError, match="non-empty strings"):
             InputRequirement(
+                targets=_target(),
                 dynamic={
                     TemporalResolution.DAILY: SpatialInputSpec(
                         data={
-                            SpatialResolution.LUMPED: DynamicInputSpec(
+                            SpatialRepresentation.BASIN_AVERAGE: DynamicInputSpec(
                                 past_known={
                                     "obs": {
                                         "q": PastKnownVariable(lookback=1, max_nan=0)
@@ -258,10 +370,21 @@ class TestFullYamlExample:
     @pytest.fixture()
     def full_requirement(self) -> InputRequirement:
         return InputRequirement(
+            targets={
+                "discharge": TargetSpec(
+                    unit=Unit.M3_PER_S,
+                    representations=frozenset(
+                        {
+                            OutputRepresentation.QUANTILES,
+                            OutputRepresentation.TRAJECTORIES,
+                        }
+                    ),
+                )
+            },
             dynamic={
                 TemporalResolution.DAILY: SpatialInputSpec(
                     data={
-                        SpatialResolution.LUMPED: DynamicInputSpec(
+                        SpatialRepresentation.BASIN_AVERAGE: DynamicInputSpec(
                             past_known={
                                 "obs": {
                                     "discharge": PastKnownVariable(
@@ -294,7 +417,7 @@ def full_requirement(self) -> InputRequirement:
                                 },
                             },
                         ),
-                        SpatialResolution.GRIDDED: DynamicInputSpec(
+                        SpatialRepresentation.GRIDDED: DynamicInputSpec(
                             past_known={
                                 "ERA5": {
                                     "swe": PastKnownVariable(lookback=90, max_nan=5),
@@ -304,11 +427,27 @@ def full_requirement(self) -> InputRequirement:
                                 }
                             }
                         ),
+                        SpatialRepresentation.ELEVATION_BAND: DynamicInputSpec(
+                            future_known={
+                                "SnowMapper": {
+                                    "swe": FutureKnownVariable(
+                                        future_steps=10,
+                                        max_nan=0,
+                                        ensemble_mode=EnsembleMode.SINGLE,
+                                    ),
+                                    "rof": FutureKnownVariable(
+                                        future_steps=10,
+                                        max_nan=0,
+                                        ensemble_mode=EnsembleMode.SINGLE,
+                                    ),
+                                }
+                            }
+                        ),
                     }
                 ),
                 TemporalResolution.HOURLY: SpatialInputSpec(
                     data={
-                        SpatialResolution.LUMPED: DynamicInputSpec(
+                        SpatialRepresentation.BASIN_AVERAGE: DynamicInputSpec(
                             past_known={
                                 "obs": {
                                     "discharge": PastKnownVariable(
@@ -338,35 +477,49 @@ def test_construction(self, full_requirement: InputRequirement) -> None:
         assert TemporalResolution.HOURLY in full_requirement.dynamic
         assert len(full_requirement.static) == 3
 
-    def test_daily_lumped_past(self, full_requirement: InputRequirement) -> None:
+    def test_targets(self, full_requirement: InputRequirement) -> None:
+        discharge = full_requirement.targets["discharge"]
+        assert discharge.unit == Unit.M3_PER_S
+        assert OutputRepresentation.TRAJECTORIES in discharge.representations
+
+    def test_daily_basin_average_past(self, full_requirement: InputRequirement) -> None:
         daily = full_requirement.dynamic[TemporalResolution.DAILY]
-        lumped = daily.data[SpatialResolution.LUMPED]
-        obs = lumped.past_known["obs"]
+        basin = daily.data[SpatialRepresentation.BASIN_AVERAGE]
+        obs = basin.past_known["obs"]
         assert obs["discharge"].lookback == 365
         assert obs["precipitation"].max_nan == 5
 
-    def test_daily_lumped_future(self, full_requirement: InputRequirement) -> None:
+    def test_daily_basin_average_future(
+        self, full_requirement: InputRequirement
+    ) -> None:
         daily = full_requirement.dynamic[TemporalResolution.DAILY]
-        lumped = daily.data[SpatialResolution.LUMPED]
-        gfs = lumped.future_known["GFS"]
+        basin = daily.data[SpatialRepresentation.BASIN_AVERAGE]
+        gfs = basin.future_known["GFS"]
         assert gfs["precipitation"].ensemble_mode == EnsembleMode.ENSEMBLE
         assert gfs["temperature"].ensemble_mode == EnsembleMode.SINGLE
-        ecmwf = lumped.future_known["ECMWF"]
+        ecmwf = basin.future_known["ECMWF"]
         assert ecmwf["precipitation"].future_steps == 15
 
     def test_daily_gridded_past(self, full_requirement: InputRequirement) -> None:
         daily = full_requirement.dynamic[TemporalResolution.DAILY]
-        gridded = daily.data[SpatialResolution.GRIDDED]
+        gridded = daily.data[SpatialRepresentation.GRIDDED]
         era5 = gridded.past_known["ERA5"]
         assert era5["swe"].lookback == 90
 
+    def test_daily_elevation_band(self, full_requirement: InputRequirement) -> None:
+        daily = full_requirement.dynamic[TemporalResolution.DAILY]
+        band = daily.data[SpatialRepresentation.ELEVATION_BAND]
+        snow = band.future_known["SnowMapper"]
+        assert "swe" in snow
+        assert "rof" in snow
+
     def test_hourly_block(self, full_requirement: InputRequirement) -> None:
         hourly = full_requirement.dynamic[TemporalResolution.HOURLY]
-        assert SpatialResolution.LUMPED in hourly.data
-        assert SpatialResolution.GRIDDED not in hourly.data
-        lumped = hourly.data[SpatialResolution.LUMPED]
-        assert lumped.past_known["obs"]["discharge"].lookback == 72
-        assert lumped.future_known["INCA"]["precipitation"].future_steps == 48
+        assert SpatialRepresentation.BASIN_AVERAGE in hourly.data
+        assert SpatialRepresentation.GRIDDED not in hourly.data
+        basin = hourly.data[SpatialRepresentation.BASIN_AVERAGE]
+        assert basin.past_known["obs"]["discharge"].lookback == 72
+        assert basin.future_known["INCA"]["precipitation"].future_steps == 48
 
     def test_serialization_roundtrip(self, full_requirement: InputRequirement) -> None:
         json_str = full_requirement.model_dump_json()
diff --git a/tests/test_interface.py b/tests/test_interface.py
index 2b77eed..6780f3d 100644
--- a/tests/test_interface.py
+++ b/tests/test_interface.py
@@ -8,9 +8,11 @@
 from forecast_interface.input import (
     DynamicInputSpec,
     InputRequirement,
+    OutputRepresentation,
     PastKnownVariable,
     SpatialInputSpec,
-    SpatialResolution,
+    SpatialRepresentation,
+    TargetSpec,
     TemporalResolution as InputTemporalResolution,
 )
 from forecast_interface.interface import (
@@ -64,17 +66,23 @@ def _make_model_output() -> ModelOutput:
 
 def _make_input_requirement() -> InputRequirement:
     return InputRequirement(
+        targets={
+            "q": TargetSpec(
+                unit=Unit.M3_PER_S,
+                representations=frozenset({OutputRepresentation.DETERMINISTIC}),
+            )
+        },
         dynamic={
             InputTemporalResolution.DAILY: SpatialInputSpec(
                 data={
-                    SpatialResolution.LUMPED: DynamicInputSpec(
+                    SpatialRepresentation.BASIN_AVERAGE: DynamicInputSpec(
                         past_known={
                             "obs": {"q": PastKnownVariable(lookback=1, max_nan=0)}
                         }
                     )
                 }
             )
-        }
+        },
     )
 
 
diff --git a/uv.lock b/uv.lock
index 66f7497..58a0f40 100644
--- a/uv.lock
+++ b/uv.lock
@@ -85,7 +85,7 @@ wheels = [
 
 [[package]]
 name = "forecastinterface"
-version = "0.1.2"
+version = "0.1.3"
 source = { virtual = "." }
 dependencies = [
     { name = "polars" },

From 22a805b0f4f29bdd60a2210f8974bec0d574bad6 Mon Sep 17 00:00:00 2001
From: Beatrice Marti <mabesa@users.noreply.github.com>
Date: Mon, 15 Jun 2026 11:15:57 +0200
Subject: [PATCH 03/16] feat(output): make ModelOutput station-keyed
 (dict[station][variable])
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Phase 2 — realize the GROUP-path per-station decomposition (Option a), required
for the eastern-group artifact that ships first.

- ModelOutput.variables is now dict[str, dict[str, VariableOutput]]
  (station_id → variable_name). Single-station models return a one-key dict.
- Validators: at least one station; each station's variable map non-empty;
  station-id and variable-name keys must be non-empty strings.
- success now spans all variables across all stations.
- Missing stations must be explicit FAILURE entries, never absent keys — the
  model echoes back every station id it was given.
- Station ids are opaque strings (PROVISIONAL Q1); the SAP3 adapter maps them
  to/from typed StationId.

Updates tests (125 passing), README, and the fi-sap3-mapping output/GROUP-path
sections. BREAKING: ModelOutput.variables shape changed. Patch bump 0.1.3 → 0.1.4.
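
Illustration only, not the shipped code: a minimal sketch of the echo-back and
success rules above, using plain stand-ins rather than the real classes. `Status`
and `Var` are hypothetical substitutes for VariableStatus and VariableOutput, and
`echo_stations`, `success`, and the enum string values are assumptions made for
this example.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):  # hypothetical stand-in for VariableStatus
    SUCCESS = "success"
    FAILURE = "failure"


@dataclass(frozen=True)
class Var:  # hypothetical stand-in for VariableOutput (status only)
    status: Status


def echo_stations(requested, produced, variable="discharge"):
    """Return a station-keyed map with one entry per requested station.

    Stations the model could not forecast get an explicit FAILURE entry
    instead of being left out of the mapping.
    """
    return {
        sid: produced.get(sid) or {variable: Var(Status.FAILURE)}
        for sid in requested
    }


def success(variables):
    """True only when every variable of every station succeeded."""
    return all(
        var.status is Status.SUCCESS
        for per_station in variables.values()
        for var in per_station.values()
    )


out = echo_stations(
    ["st_a", "st_b"],
    {"st_a": {"discharge": Var(Status.SUCCESS)}},
)
```

Here `out` keeps both station ids, `st_b` carries a FAILURE entry, and `success(out)`
is false because one station failed, which is why callers should inspect per-station
status rather than the aggregate flag alone.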

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 README.md                                 |  44 +++++----
 docs/fi-sap3-mapping.md                   |  41 ++++----
 forecast_interface/__init__.py            |   2 +-
 forecast_interface/output/model_output.py |  31 ++++--
 pyproject.toml                            |   4 +-
 tests/test_interface.py                   |  26 ++---
 tests/test_output.py                      | 115 +++++++++++++++++++---
 7 files changed, 191 insertions(+), 72 deletions(-)

diff --git a/README.md b/README.md
index 8fee9d5..8ccd9e6 100644
--- a/README.md
+++ b/README.md
@@ -13,7 +13,9 @@ uv add forecastinterface
 
 ## ModelOutput
 
-Top-level container holding forecast results for one or more variables. Each variable can independently carry deterministic forecasts, quantile forecasts, trajectory ensembles, or any combination.
+Top-level container holding forecast results, keyed by station then variable. Each variable can independently carry deterministic forecasts, quantile forecasts, trajectory ensembles, or any combination.
+
+`variables` is station-keyed: `station_id → variable_name → VariableOutput`. A single-station model returns a one-key outer dict. Missing stations are explicit `FAILURE` entries (a `VariableOutput` with `status == FAILURE`), never absent keys — the model echoes back every station id it was given.
 
 ### Structure
 
@@ -21,8 +23,8 @@ Top-level container holding forecast results for one or more variables. Each var
 ModelOutput
     model_name: str
     issue_datetime: datetime
-    success: bool                          # derived — True when all variables succeeded
-    variables: dict[str, VariableOutput]   # keyed by variable name
+    success: bool                                    # derived — True when all variables (across all stations) succeeded
+    variables: dict[str, dict[str, VariableOutput]]  # station_id → variable_name → VariableOutput
 
 VariableOutput
     metadata: VariableMetadata
@@ -92,24 +94,26 @@ output = ModelOutput(
     model_name="MyModel",
     issue_datetime=issue_dt,
     variables={
-        "streamflow": VariableOutput(
-            metadata=VariableMetadata(
-                name="streamflow",
-                unit=Unit.M3_PER_S,
-                resolution=TemporalResolution.DAILY,
-                timedelta=timedelta(days=1),
-                forecast_horizon=10,
-                offset=0,
-            ),
-            deterministic=DeterministicData(
-                data=pl.DataFrame({
-                    "issue_datetime": [issue_dt, issue_dt],
-                    "datetime": [datetime(2024, 6, 1), datetime(2024, 6, 2)],
-                    "value": [42.0, 43.5],
-                }),
+        "station_1": {
+            "streamflow": VariableOutput(
+                metadata=VariableMetadata(
+                    name="streamflow",
+                    unit=Unit.M3_PER_S,
+                    resolution=TemporalResolution.DAILY,
+                    timedelta=timedelta(days=1),
+                    forecast_horizon=10,
+                    offset=0,
+                ),
+                deterministic=DeterministicData(
+                    data=pl.DataFrame({
+                        "issue_datetime": [issue_dt, issue_dt],
+                        "datetime": [datetime(2024, 6, 1), datetime(2024, 6, 2)],
+                        "value": [42.0, 43.5],
+                    }),
+                ),
+                status=VariableStatus.SUCCESS,
             ),
-            status=VariableStatus.SUCCESS,
-        ),
+        },
     },
 )
 
diff --git a/docs/fi-sap3-mapping.md b/docs/fi-sap3-mapping.md
index 3a7d6cd..edba7d4 100644
--- a/docs/fi-sap3-mapping.md
+++ b/docs/fi-sap3-mapping.md
@@ -99,7 +99,8 @@ divergence table (lines 138–151) on the FI side.
 | `VariableMetadata.forecast_horizon: int` (`output/metadata.py:14`) | `ForecastEnsemble.forecast_horizon_steps: int` (`types/ensemble.py:25`) | **DIRECT** — both int, both step counts. **`forecast_horizon` IS consumed by the adapter** (corrects any prior "never consumed" belief). See note below. |
 | `VariableMetadata.timedelta: timedelta` (`output/metadata.py:13`) | `ForecastEnsemble.time_step: timedelta` (`types/ensemble.py:23`) | **DIRECT** assignment. |
 | `VariableMetadata.resolution: TemporalResolution` (`output/metadata.py:12`; enum `common/resolutions.py:4`) | — (no direct target) | Categorical label only; **cross-validate** against `timedelta`, never the conversion source. |
-| `ModelOutput.variables` key / `VariableMetadata.name` (`output/model_output.py:14`, `output/metadata.py:10`) | `ForecastEnsemble.parameter: str` (`types/ensemble.py:23`) | Validate against `ForecastParameter = Literal["discharge","water_level"]` and `ModelDataRequirements.target_parameters` (`types/model.py:261`). |
+| `ModelOutput.variables` inner key / `VariableMetadata.name` (`output/model_output.py`, `output/metadata.py:10`) | `ForecastEnsemble.parameter: str` (`types/ensemble.py:23`) | Validate against `ForecastParameter = Literal["discharge","water_level"]` and `ModelDataRequirements.target_parameters` (`types/model.py:261`). |
+| `ModelOutput.variables` outer key (`output/model_output.py`) | `StationId` (`types/ids.py`) | Station id (opaque `str` on FI side, Q1); adapter maps str → typed `StationId` per GROUP-path decomposition (§5). |
 | empty `ModelOutput.variables` **or** all-`FAILURE` | `ModelOutputError` (`exceptions.py:17`) | Adapter **raises** — zero usable ensembles (doc 014 lines 160–168, 218–223). |
 
 ### Status & flag mapping
@@ -164,15 +165,16 @@ Two horizon notions coexist and must not be conflated:
 
 ### `success` property caveat
 
-`ModelOutput.success` returns `True` when `variables` is empty, because `all()` over an
-empty iterable is `True` (`output/model_output.py:33–36`). The adapter **must not** rely on
-`success` alone to gate conversion (doc 014 lines 219–223).
+`ModelOutput.success` is computed with `all()`, and `all()` over an empty iterable is
+`True`, so an output with no variables would report success. The adapter **must not**
+rely on `success` alone to gate conversion (doc 014 lines 219–223).
 
 > **Discrepancy vs doc 014:** Current FI now *forbids* empty `variables` at construction —
-> `ModelOutput._at_least_one_variable` raises if the dict is empty
-> (`output/model_output.py:23–31`). So the empty-variables case is no longer constructible
-> through the public API. The all-`FAILURE` → `ModelOutputError` guard remains live and
-> necessary; the empty-variables guard is now defense-in-depth.
+> `ModelOutput._validate_variables` raises if the outer (station) dict is empty, if any
+> station maps to an empty inner (variable) dict, or if any station-id / variable-name key is
+> empty / whitespace (`output/model_output.py`). So the empty-variables case is no longer
+> constructible through the public API. The all-`FAILURE` → `ModelOutputError` guard remains
+> live and necessary; the empty-variables guard is now defense-in-depth.
 
 ---
 
@@ -231,28 +233,33 @@ member names and values (`types/enums.py:73`), so the mapping is **identity**:
 
 ## 5. Station identity & the GROUP path (Option a)
 
-FI's `ModelOutput.variables` is currently `dict[str, VariableOutput]`
-(`output/model_output.py:14`) — keyed by **variable name**, with no station decomposition.
-SAP3's `GroupForecastModel.predict_batch()` requires per-station results
-(`dict[StationId, ...]`, `protocols/forecast_model.py:63`).
+FI's `ModelOutput.variables` is now **station-keyed**,
+`dict[str, dict[str, VariableOutput]]` (`output/model_output.py`) — keyed first by
+`station_id`, then by `variable_name`. SAP3's `GroupForecastModel.predict_batch()` requires
+per-station results (`dict[StationId, ...]`, `protocols/forecast_model.py:63`).
 
-**Decision recorded (doc 014 "Option (a)", lines 208–217):** FI adopts **station-keyed
-output** so the GROUP-path adapter can map per-station 1:1:
+**Decision realized (doc 014 "Option (a)", lines 208–217):** FI adopts **station-keyed
+output** so the GROUP-path adapter can map results per station, 1:1, onto SAP3's
+per-station return shape:
 
 ```
 ModelOutput.variables : dict[station_id, dict[variable, VariableOutput]]
 ```
 
 - Single-station models return a **one-key dict** (one station id → its variable map).
+- Missing stations are **explicit `FAILURE` entries** (the model echoes back every station
+  id it was given), never absent keys.
 - The STATION-path adapter unwraps the single key into
   `tuple[dict[str, ForecastEnsemble], bytes | None]`.
 - The GROUP-path adapter maps each station key → one `(forecast_dict, state)` entry of the
   `dict[StationId, tuple[...]]` return.
+- Station ids are opaque `str` on the FI side (open item Q1); the adapter maps str → typed
+  `StationId` (UUID) at the boundary.
 
 **Cross-repo coordination item (FLAG):** SAP3's adapter design in doc 014 is currently
-**STATION-path-only** for v0b (lines 208–217 explicitly defer GROUP support). When FI moves
-to station-keyed output, SAP3's `ForecastInterfaceAdapter` must be extended to consume it.
-Until both sides land this change, GROUP-path FI wrapping is not possible. This is the
+**STATION-path-only** for v0b (lines 208–217 explicitly defer GROUP support). Now that FI
+has moved to station-keyed output, SAP3's `ForecastInterfaceAdapter` must be extended to
+consume it. Until SAP3 lands its side, GROUP-path FI wrapping is not possible. This is the
 single largest open structural divergence between the two repos.
 
 ---
diff --git a/forecast_interface/__init__.py b/forecast_interface/__init__.py
index b983fcf..abe23e2 100644
--- a/forecast_interface/__init__.py
+++ b/forecast_interface/__init__.py
@@ -1,4 +1,4 @@
-__version__ = "0.1.3"
+__version__ = "0.1.4"
 
 from .input import (
     DynamicInputSpec,
diff --git a/forecast_interface/output/model_output.py b/forecast_interface/output/model_output.py
index a4fc170..d2b6d4d 100644
--- a/forecast_interface/output/model_output.py
+++ b/forecast_interface/output/model_output.py
@@ -7,11 +7,16 @@
 
 
 class ModelOutput(BaseModel):
+    # variables is station-keyed: station_id -> variable_name -> VariableOutput.
+    # The model echoes back EVERY station id it was given: missing stations are
+    # explicit FAILURE entries (a station whose VariableOutputs carry
+    # status=FAILURE), never absent keys.
     model_config = ConfigDict(arbitrary_types_allowed=True)
 
     model_name: str
     issue_datetime: datetime
-    variables: dict[str, VariableOutput]
+    # PROVISIONAL (Q1): station ids are opaque strings; the SAP3 adapter maps them to/from typed StationId (UUID).
+    variables: dict[str, dict[str, VariableOutput]]
 
     @field_validator("model_name")
     @classmethod
@@ -22,15 +27,29 @@ def _non_empty_model_name(cls, v: str) -> str:
 
     @field_validator("variables")
     @classmethod
-    def _at_least_one_variable(
+    def _validate_variables(
         cls,
-        v: dict[str, VariableOutput],
-    ) -> dict[str, VariableOutput]:
+        v: dict[str, dict[str, VariableOutput]],
+    ) -> dict[str, dict[str, VariableOutput]]:
         if not v:
-            raise ValueError("variables must contain at least one entry")
+            raise ValueError("variables must contain at least one station")
+        for station_id, station_vars in v.items():
+            if not station_id or not station_id.strip():
+                raise ValueError("station id keys must be non-empty strings")
+            if not station_vars:
+                raise ValueError(
+                    f"station {station_id!r} must contain at least one variable"
+                )
+            for variable_name in station_vars:
+                if not variable_name or not variable_name.strip():
+                    raise ValueError("variable name keys must be non-empty strings")
         return v
 
     @computed_field  # type: ignore[prop-decorator]
     @property
     def success(self) -> bool:
-        return all(v.status == VariableStatus.SUCCESS for v in self.variables.values())
+        return all(
+            v.status == VariableStatus.SUCCESS
+            for station in self.variables.values()
+            for v in station.values()
+        )
diff --git a/pyproject.toml b/pyproject.toml
index dd72002..ce41905 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -1,6 +1,6 @@
 [project]
 name = "forecastinterface"
-version = "0.1.3"
+version = "0.1.4"
 description = "Add your description here"
 readme = "README.md"
 requires-python = ">=3.11"
@@ -13,7 +13,7 @@ dependencies = [
 pythonpath = ["."]
 
 [tool.bumpversion]
-current_version = "0.1.3"
+current_version = "0.1.4"
 commit = false
 tag = false
 allow_dirty = true
diff --git a/tests/test_interface.py b/tests/test_interface.py
index 6780f3d..504b11a 100644
--- a/tests/test_interface.py
+++ b/tests/test_interface.py
@@ -48,18 +48,20 @@ def _make_model_output() -> ModelOutput:
         model_name="test_model",
         issue_datetime=_ISSUE_DT,
         variables={
-            "discharge": VariableOutput(
-                metadata=VariableMetadata(
-                    name="discharge",
-                    unit=Unit.M3_PER_S,
-                    resolution=TemporalResolution.DAILY,
-                    timedelta=timedelta(days=1),
-                    forecast_horizon=10,
-                    offset=0,
-                ),
-                deterministic=DeterministicData(data=df),
-                status=VariableStatus.SUCCESS,
-            )
+            "station_1": {
+                "discharge": VariableOutput(
+                    metadata=VariableMetadata(
+                        name="discharge",
+                        unit=Unit.M3_PER_S,
+                        resolution=TemporalResolution.DAILY,
+                        timedelta=timedelta(days=1),
+                        forecast_horizon=10,
+                        offset=0,
+                    ),
+                    deterministic=DeterministicData(data=df),
+                    status=VariableStatus.SUCCESS,
+                )
+            }
         },
     )
 
diff --git a/tests/test_output.py b/tests/test_output.py
index b1f41b3..57f82b0 100644
--- a/tests/test_output.py
+++ b/tests/test_output.py
@@ -644,33 +644,47 @@ def _make_variable_output(
             status=status,
         )
 
-    def test_valid_construction(self) -> None:
+    def test_valid_single_station(self) -> None:
         mo = ModelOutput(
             model_name="test_model",
             issue_datetime=datetime.datetime(2024, 1, 1),
-            variables={"discharge": self._make_variable_output()},
+            variables={"station_1": {"discharge": self._make_variable_output()}},
         )
         assert mo.model_name == "test_model"
         assert len(mo.variables) == 1
+        assert len(mo.variables["station_1"]) == 1
 
-    def test_multiple_variables(self) -> None:
+    def test_valid_multi_station(self) -> None:
         mo = ModelOutput(
             model_name="test_model",
             issue_datetime=datetime.datetime(2024, 1, 1),
             variables={
-                "discharge": self._make_variable_output(),
-                "temperature": self._make_variable_output(),
+                "station_1": {"discharge": self._make_variable_output()},
+                "station_2": {"discharge": self._make_variable_output()},
             },
         )
         assert len(mo.variables) == 2
 
+    def test_multiple_variables_per_station(self) -> None:
+        mo = ModelOutput(
+            model_name="test_model",
+            issue_datetime=datetime.datetime(2024, 1, 1),
+            variables={
+                "station_1": {
+                    "discharge": self._make_variable_output(),
+                    "temperature": self._make_variable_output(),
+                },
+            },
+        )
+        assert len(mo.variables["station_1"]) == 2
+
     def test_success_all_success(self) -> None:
         mo = ModelOutput(
             model_name="test_model",
             issue_datetime=datetime.datetime(2024, 1, 1),
             variables={
-                "a": self._make_variable_output(VariableStatus.SUCCESS),
-                "b": self._make_variable_output(VariableStatus.SUCCESS),
+                "station_1": {"a": self._make_variable_output(VariableStatus.SUCCESS)},
+                "station_2": {"b": self._make_variable_output(VariableStatus.SUCCESS)},
             },
         )
         assert mo.success is True
@@ -680,8 +694,8 @@ def test_success_false_when_any_failure(self) -> None:
             model_name="test_model",
             issue_datetime=datetime.datetime(2024, 1, 1),
             variables={
-                "a": self._make_variable_output(VariableStatus.SUCCESS),
-                "b": self._make_variable_output(VariableStatus.FAILURE),
+                "station_1": {"a": self._make_variable_output(VariableStatus.SUCCESS)},
+                "station_2": {"b": self._make_variable_output(VariableStatus.FAILURE)},
             },
         )
         assert mo.success is False
@@ -691,18 +705,51 @@ def test_success_false_when_any_partial(self) -> None:
             model_name="test_model",
             issue_datetime=datetime.datetime(2024, 1, 1),
             variables={
-                "a": self._make_variable_output(VariableStatus.SUCCESS),
-                "b": self._make_variable_output(VariableStatus.PARTIAL),
+                "station_1": {
+                    "a": self._make_variable_output(VariableStatus.SUCCESS),
+                    "b": self._make_variable_output(VariableStatus.PARTIAL),
+                },
+            },
+        )
+        assert mo.success is False
+
+    def test_explicit_failure_station_makes_success_false(self) -> None:
+        # A station with no usable data is echoed back as an explicit FAILURE
+        # entry, never an absent key; this still constructs but flips success.
+        mo = ModelOutput(
+            model_name="test_model",
+            issue_datetime=datetime.datetime(2024, 1, 1),
+            variables={
+                "station_1": {"discharge": self._make_variable_output()},
+                "station_2": {
+                    "discharge": self._make_variable_output(VariableStatus.FAILURE)
+                },
             },
         )
         assert mo.success is False
+        assert mo.variables["station_2"]["discharge"].status == VariableStatus.FAILURE
+
+    def test_roundtrip_preserves_nested_shape(self) -> None:
+        mo = ModelOutput(
+            model_name="test_model",
+            issue_datetime=datetime.datetime(2024, 1, 1),
+            variables={
+                "station_1": {
+                    "discharge": self._make_variable_output(VariableStatus.FAILURE)
+                },
+            },
+        )
+        restored = ModelOutput.model_validate(mo.model_dump())
+        assert set(restored.variables) == {"station_1"}
+        assert set(restored.variables["station_1"]) == {"discharge"}
+        assert restored.success is False
 
     def test_empty_model_name_rejected(self) -> None:
         with pytest.raises(ValueError, match="model_name must be a non-empty string"):
             ModelOutput(
                 model_name="",
                 issue_datetime=datetime.datetime(2024, 1, 1),
-                variables={"discharge": self._make_variable_output()},
+                variables={"station_1": {"discharge": self._make_variable_output()}},
             )
 
     def test_whitespace_model_name_rejected(self) -> None:
@@ -710,13 +757,53 @@ def test_whitespace_model_name_rejected(self) -> None:
             ModelOutput(
                 model_name="   ",
                 issue_datetime=datetime.datetime(2024, 1, 1),
-                variables={"discharge": self._make_variable_output()},
+                variables={"station_1": {"discharge": self._make_variable_output()}},
             )
 
     def test_empty_variables_rejected(self) -> None:
-        with pytest.raises(ValueError, match="at least one entry"):
+        with pytest.raises(ValueError, match="at least one station"):
             ModelOutput(
                 model_name="test_model",
                 issue_datetime=datetime.datetime(2024, 1, 1),
                 variables={},
             )
+
+    def test_empty_station_map_rejected(self) -> None:
+        with pytest.raises(ValueError, match="at least one variable"):
+            ModelOutput(
+                model_name="test_model",
+                issue_datetime=datetime.datetime(2024, 1, 1),
+                variables={"station_1": {}},
+            )
+
+    def test_empty_station_id_rejected(self) -> None:
+        with pytest.raises(ValueError, match="station id keys must be non-empty"):
+            ModelOutput(
+                model_name="test_model",
+                issue_datetime=datetime.datetime(2024, 1, 1),
+                variables={"": {"discharge": self._make_variable_output()}},
+            )
+
+    def test_whitespace_station_id_rejected(self) -> None:
+        with pytest.raises(ValueError, match="station id keys must be non-empty"):
+            ModelOutput(
+                model_name="test_model",
+                issue_datetime=datetime.datetime(2024, 1, 1),
+                variables={"   ": {"discharge": self._make_variable_output()}},
+            )
+
+    def test_empty_variable_name_rejected(self) -> None:
+        with pytest.raises(ValueError, match="variable name keys must be non-empty"):
+            ModelOutput(
+                model_name="test_model",
+                issue_datetime=datetime.datetime(2024, 1, 1),
+                variables={"station_1": {"": self._make_variable_output()}},
+            )
+
+    def test_whitespace_variable_name_rejected(self) -> None:
+        with pytest.raises(ValueError, match="variable name keys must be non-empty"):
+            ModelOutput(
+                model_name="test_model",
+                issue_datetime=datetime.datetime(2024, 1, 1),
+                variables={"station_1": {"   ": self._make_variable_output()}},
+            )

From 4068d14f3a141b722ff614659d062c96cc4d349c Mon Sep 17 00:00:00 2001
From: Beatrice Marti <mabesa@users.noreply.github.com>
Date: Mon, 15 Jun 2026 12:14:32 +0200
Subject: [PATCH 04/16] feat(interface): add training & lifecycle protocol
 (train/serialize, rng, artifact, scope)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Phase 3 — grow ForecastModel from a predict-only protocol into the full
model-author contract, mirroring SAP3's mechanics while keeping FI's ModelResult
→ ModelOutput as the authoritative return.

- ArtifactScope enum (STATION / GROUP; national-group is a GROUP; SAP3's internal
  VIRTUAL scope intentionally omitted).
- TrainedArtifact: opaque, runtime_checkable marker Protocol — the self-contained,
  deployment-portable artifact boundary. Rich provenance metadata + group embedding-key
  contract deferred to Phase 4 (nepal §4/§8).
- ForecastModel now requires: input_requirement, artifact_scope, train, predict,
  hindcast, serialize_artifact, deserialize_artifact. predict/hindcast take the
  artifact as the first positional argument plus a keyword-only, injected rng
  (random.Random) for determinism.
- RetrainableModel(ForecastModel) adds optional warm-start retrain; SAP3 checks
  isinstance to detect support. Cold train remains the required baseline.
- inputs/config typed Any (PROVISIONAL) pending the SAP3→FI input-types PR (doc 014).
- Reconcile model_interface.md with the implementation; restore the targets mention
  on input_requirement. Tests: 131 passing.

BREAKING: predict/hindcast signatures changed; protocol surface expanded.
Patch bump 0.1.4 → 0.1.5.
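
Migration sketch for the signature change, under stated assumptions: the two
Protocols below are trimmed local stand-ins for the ones added in this patch
(real code would import them from forecast_interface), and MeanModel and fit
are hypothetical names used only for illustration. It shows the artifact-first
call shape, the keyword-only rng, and the isinstance-based fallback from
retrain to train.

```python
from datetime import datetime
from random import Random
from typing import Any, Protocol, runtime_checkable


# Local stand-ins for illustration only; real code would import ForecastModel
# and RetrainableModel from forecast_interface. Members are trimmed to the
# ones this sketch exercises.
@runtime_checkable
class ForecastModel(Protocol):
    def train(self, inputs: Any, *, config: Any, rng: Random) -> Any: ...

    def predict(
        self, artifact: Any, *, inputs: Any, issue_datetime: datetime, rng: Random
    ) -> Any: ...


@runtime_checkable
class RetrainableModel(ForecastModel, Protocol):
    def retrain(
        self, base_artifact: Any, inputs: Any, *, config: Any, rng: Random
    ) -> Any: ...


class MeanModel:
    """Hypothetical cold-train-only model: the artifact is a plain dict."""

    def train(self, inputs: Any, *, config: Any, rng: Random) -> Any:
        return {"mean": sum(inputs) / len(inputs)}

    def predict(
        self, artifact: Any, *, inputs: Any, issue_datetime: datetime, rng: Random
    ) -> Any:
        # All randomness comes from the injected rng, never from global state.
        return artifact["mean"] + rng.gauss(0.0, 1.0)


def fit(model: ForecastModel, base: Any, inputs: Any, *, rng: Random) -> Any:
    """Warm-start when supported, else fall back to the required cold train."""
    if base is not None and isinstance(model, RetrainableModel):
        return model.retrain(base, inputs, config=None, rng=rng)
    return model.train(inputs, config=None, rng=rng)


model = MeanModel()
artifact = fit(model, None, [1.0, 2.0, 3.0], rng=Random(0))
issue = datetime(2024, 1, 1)
first = model.predict(artifact, inputs=None, issue_datetime=issue, rng=Random(42))
second = model.predict(artifact, inputs=None, issue_datetime=issue, rng=Random(42))
```

Because the rng is injected, two predict calls seeded identically return the
same value, and a model without retrain is still a valid ForecastModel.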

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 docs/model_interface.md                  |  40 +++---
 forecast_interface/__init__.py           |   8 +-
 forecast_interface/interface/__init__.py |   7 +-
 forecast_interface/interface/artifact.py |  22 ++++
 forecast_interface/interface/protocol.py |  47 ++++++-
 forecast_interface/interface/scope.py    |   9 ++
 pyproject.toml                           |   4 +-
 tests/test_interface.py                  | 148 +++++++++++++++++++++--
 uv.lock                                  |   2 +-
 9 files changed, 252 insertions(+), 35 deletions(-)
 create mode 100644 forecast_interface/interface/artifact.py
 create mode 100644 forecast_interface/interface/scope.py

diff --git a/docs/model_interface.md b/docs/model_interface.md
index a152680..d73b342 100644
--- a/docs/model_interface.md
+++ b/docs/model_interface.md
@@ -2,7 +2,7 @@
 
 The primary goal of this package is to define the interface between any forecasting library and the forecasting model. The forecasting model can be implemented in any package / code base but needs to follow the protocol defined here.
 
-There is **one unified protocol**: `ForecastModel`. The scope of a model (single station vs. group / national) is **declared**, not split into separate protocols. SAP3 consumes the FI protocol through a thin adapter that dispatches to its own `StationForecastModel` / `GroupForecastModel` — see [`docs/fi-sap3-mapping.md`](./fi-sap3-mapping.md). The driving requirements for the first (Nepal v1) integration are in [`docs/nepal-model-requirements.md`](./nepal-model-requirements.md).
+There are **two protocols**: the required `ForecastModel`, and `RetrainableModel` (which extends `ForecastModel`) for the optional warm-start retrain capability. The scope of a model (single station vs. group / national) is **declared** via `artifact_scope`, not split into separate protocols. SAP3 consumes the FI protocol through a thin adapter that dispatches to its own `StationForecastModel` / `GroupForecastModel` — see [`docs/fi-sap3-mapping.md`](./fi-sap3-mapping.md). The driving requirements for the first (Nepal v1) integration are in [`docs/nepal-model-requirements.md`](./nepal-model-requirements.md).
 
 Core functionalities include:
 
@@ -17,36 +17,38 @@ Produce a `TrainedArtifact` from training inputs. See the Training & Lifecycle P
 
 ---
 
-## Training & Lifecycle Protocol (target spec)
+## Training & Lifecycle Protocol
 
-> **Status: target contract.** This section describes the protocol surface FI is committed to, settled by the Nepal v1 decisions. It is **not yet reflected in `forecast_interface/` code**. The current `forecast_interface/interface/protocol.py` exposes only `input_requirement`, `predict(*, inputs, issue_datetime)` and `hindcast(*, inputs, issue_datetime)` — with **no** `TrainedArtifact`, **no** `rng`, and **no** training methods. The implementation lands in a later phase; this is the forward target.
+> **Status: implemented** in `forecast_interface/interface/` (`protocol.py`, `scope.py`, `artifact.py`). The `inputs` and `config` parameters remain **provisional** — typed `Any` until the assembled-input bundle and model-config types are co-designed with SAP3 (doc 014 Task 3, the SAP3→FI input-types PR). Rich `TrainedArtifact` provenance metadata and the group-artifact embedding-key / station-set-mismatch contract are **deferred to Phase 4** (see [`docs/nepal-model-requirements.md`](./nepal-model-requirements.md) §4 and §8).
 
 ### Scope: `ArtifactScope`
 
-A model declares its scope rather than implementing a scope-specific protocol.
+A model declares its scope via the `artifact_scope` attribute rather than implementing a scope-specific protocol.
 
 ```python
 class ArtifactScope(Enum):
-    STATION = auto()  # one artifact per station
-    GROUP = auto()    # one artifact covering multiple stations
+    STATION = "station"  # one artifact per station
+    GROUP = "group"      # one artifact covering multiple stations
 ```
 
-A "national-group" model is a `GROUP` (it is just a group whose station set happens to be national). There is no separate national scope.
+A "national-group" model is a `GROUP` (it is just a group whose station set happens to be national). There is no separate national scope. (SAP3 has an internal `VIRTUAL` scope for combination models; that is SAP3-internal and not model-author-facing, so it is not part of this enum.)
 
 ### The `ForecastModel` protocol surface
 
-| Member | Signature | Required? | Notes |
+`train`, `predict`, `hindcast`, `serialize_artifact` and `deserialize_artifact` are **required**. The optional warm-start `retrain` lives on a **separate** protocol, `RetrainableModel` (which extends `ForecastModel`), so it is not forced on every model. SAP3 checks `isinstance(model, RetrainableModel)` to know whether warm-start is supported; otherwise it falls back to `train`.
+
+| Member | Signature | Protocol | Notes |
 |---|---|---|---|
-| `input_requirement` | `property -> InputRequirement` | required | Declares data needs **and** `target_parameters` (the targets, parallel to features). |
-| `artifact_scope` | `property -> ArtifactScope` | required | Declared scope (`STATION` / `GROUP`). |
-| `train` | `train(inputs, *, config, rng) -> TrainedArtifact` | **required** | Cold, full rebuild from scratch. This is the required baseline every model must support. |
-| `retrain` | `retrain(base_artifact, inputs, *, config, rng) -> TrainedArtifact` | optional | Warm-start from an existing artifact, for models capable of it. Models that cannot warm-start simply do not implement it; callers fall back to `train`. |
-| `predict` | `predict(artifact, *, inputs, issue_datetime, rng) -> ModelResult` | required | Forecast. Returns FI's `ModelResult` → `ModelOutput`. |
-| `hindcast` | `hindcast(artifact, *, inputs, issue_datetime, rng) -> ModelResult` | required | Hindcast. Same return type as `predict`. |
-| `serialize_artifact` | `serialize_artifact(artifact) -> bytes` | required | Opaque byte serialization of a `TrainedArtifact`. |
-| `deserialize_artifact` | `deserialize_artifact(raw: bytes) -> TrainedArtifact` | required | Inverse of `serialize_artifact`. |
+| `input_requirement` | `property -> InputRequirement` | `ForecastModel` | Declares data needs **and forecast targets** — `InputRequirement.targets` (`dict[str, TargetSpec]`) names each target variable with its `unit` and supported output `representations` (see `docs/input_requirement.md`). |
+| `artifact_scope` | `attribute: ArtifactScope` | `ForecastModel` | Declared scope (`STATION` / `GROUP`). |
+| `train` | `train(inputs, *, config, rng) -> TrainedArtifact` | `ForecastModel` | Cold, full rebuild from scratch. The required baseline every model must support. |
+| `predict` | `predict(artifact, *, inputs, issue_datetime, rng) -> ModelResult` | `ForecastModel` | Forecast. Returns FI's `ModelResult` → `ModelOutput`. |
+| `hindcast` | `hindcast(artifact, *, inputs, issue_datetime, rng) -> ModelResult` | `ForecastModel` | Hindcast. Same return type as `predict`. |
+| `serialize_artifact` | `serialize_artifact(artifact) -> bytes` | `ForecastModel` | Opaque byte serialization of a `TrainedArtifact`. |
+| `deserialize_artifact` | `deserialize_artifact(raw: bytes) -> TrainedArtifact` | `ForecastModel` | Inverse of `serialize_artifact`. |
+| `retrain` | `retrain(base_artifact, inputs, *, config, rng) -> TrainedArtifact` | `RetrainableModel` | **Optional.** Warm-start from an existing artifact, for models capable of it. Models that cannot warm-start simply do not implement it; callers fall back to `train`. |
 
-`input_requirement.target_parameters` declares the model's prediction targets alongside its feature requirements. (The current `InputRequirement` has only `dynamic` / `static` feature declarations; `target_parameters` is part of this forward spec.)
+The `inputs` and `config` parameters are typed `Any` (provisional, see status note above).
 
 ### Determinism (dependency injection)
 
@@ -54,13 +56,13 @@ A "national-group" model is a `GROUP` (it is just a group whose station set happ
 
 ### `TrainedArtifact`
 
-A `TrainedArtifact` is an **opaque, self-contained, deployment-portable** object representing everything a model needs to produce forecasts:
+`TrainedArtifact` is implemented as a **marker `Protocol`** (no members) — a semantic boundary type. It is an **opaque, self-contained, deployment-portable** object representing everything a model needs to produce forecasts:
 
 - **Opaque** to FI: FI never inspects its internals. It is produced by `train` / `retrain` and consumed by `predict` / `hindcast`.
 - **Self-contained**: `serialize_artifact` produces `bytes` that embed all weights, scalers, and metadata — **with no absolute filesystem paths** and no machine-local references.
 - **Deployment-portable**: `deserialize_artifact(serialize_artifact(a))` must reconstruct an artifact that runs **unchanged on another SAP3 instance**.
 
-**Group / national artifacts and station identity.** An artifact whose scope is `GROUP` typically embeds the station identifiers it was trained on. Such artifacts **must document their embedding key** (how station IDs are stored and matched). They **must also define behaviour when the station set at predict time differs** from the trained set — either handle the mismatch gracefully (e.g. predict only for known stations, emit explicit `FAILURE` entries for unknown ones) or raise an explicit error. A group artifact must **never silently mis-associate** a prediction with the wrong station.
+**Deferred to Phase 4.** Rich provenance metadata (scope, region, training period, hashes, seed, product versions) and the group-artifact embedding-key / station-set-mismatch contract are **not** part of the marker Protocol yet; they land in Phase 4 (see [`docs/nepal-model-requirements.md`](./nepal-model-requirements.md) §4 and §8). The intended contract: an artifact whose scope is `GROUP` typically embeds the station identifiers it was trained on, must document its embedding key, must define behaviour when the predict-time station set differs from the trained set, and must **never silently mis-associate** a prediction with the wrong station.
 
 ### State-free
 
diff --git a/forecast_interface/__init__.py b/forecast_interface/__init__.py
index abe23e2..0517896 100644
--- a/forecast_interface/__init__.py
+++ b/forecast_interface/__init__.py
@@ -1,4 +1,4 @@
-__version__ = "0.1.4"
+__version__ = "0.1.5"
 
 from .input import (
     DynamicInputSpec,
@@ -12,11 +12,14 @@
     TargetSpec,
 )
 from .interface import (
+    ArtifactScope,
     FailureCause,
     ForecastModel,
     ModelFailure,
     ModelResult,
     ModelSuccess,
+    RetrainableModel,
+    TrainedArtifact,
 )
 from .output import (
     DeterministicData,
@@ -33,6 +36,7 @@
 )
 
 __all__ = [
+    "ArtifactScope",
     "DeterministicData",
     "DynamicInputSpec",
     "EnsembleMode",
@@ -49,10 +53,12 @@
     "OutputRepresentation",
     "PastKnownVariable",
     "QuantileData",
+    "RetrainableModel",
     "SpatialInputSpec",
     "SpatialRepresentation",
     "TargetSpec",
     "TemporalResolution",
+    "TrainedArtifact",
     "TrajectoryData",
     "Unit",
     "VariableMetadata",
diff --git a/forecast_interface/interface/__init__.py b/forecast_interface/interface/__init__.py
index 718bc5c..f5cbdf3 100644
--- a/forecast_interface/interface/__init__.py
+++ b/forecast_interface/interface/__init__.py
@@ -1,11 +1,16 @@
+from .artifact import TrainedArtifact
 from .failure import FailureCause
-from .protocol import ForecastModel
+from .protocol import ForecastModel, RetrainableModel
 from .result import ModelFailure, ModelResult, ModelSuccess
+from .scope import ArtifactScope
 
 __all__ = [
+    "ArtifactScope",
     "FailureCause",
     "ForecastModel",
     "ModelFailure",
     "ModelResult",
     "ModelSuccess",
+    "RetrainableModel",
+    "TrainedArtifact",
 ]
diff --git a/forecast_interface/interface/artifact.py b/forecast_interface/interface/artifact.py
new file mode 100644
index 0000000..94383e6
--- /dev/null
+++ b/forecast_interface/interface/artifact.py
@@ -0,0 +1,22 @@
+from typing import Protocol, runtime_checkable
+
+
+@runtime_checkable
+class TrainedArtifact(Protocol):
+    """Opaque, model-defined trained-state object.
+
+    A ``TrainedArtifact`` is produced by ``train`` / ``retrain`` and consumed by
+    ``predict`` / ``hindcast``. FI never inspects its internals. It MUST be
+    self-contained and deployment-portable: ``serialize_artifact`` produces
+    ``bytes`` that embed all weights, scalers and metadata with **no absolute
+    filesystem paths** and **no dependence on the training environment**, and
+    ``deserialize_artifact`` reconstructs an artifact that runs unchanged on a
+    different SAP3 instance.
+
+    Rich provenance metadata (scope, region, training period, hashes, seed,
+    product versions) and the group-artifact embedding-key / station-set-mismatch
+    contract are deferred to a LATER phase (Phase 4); see
+    ``docs/nepal-model-requirements.md`` §4 and §8.
+
+    This is a marker Protocol (no members) used as a semantic boundary type.
+    """
diff --git a/forecast_interface/interface/protocol.py b/forecast_interface/interface/protocol.py
index d478dbb..98191ad 100644
--- a/forecast_interface/interface/protocol.py
+++ b/forecast_interface/interface/protocol.py
@@ -1,9 +1,12 @@
 from datetime import datetime
+from random import Random
 from typing import Any, Protocol, runtime_checkable
 
 from forecast_interface.input.requirement import InputRequirement
 
+from .artifact import TrainedArtifact
 from .result import ModelResult
+from .scope import ArtifactScope
 
 
 @runtime_checkable
@@ -11,6 +14,46 @@ class ForecastModel(Protocol):
     @property
     def input_requirement(self) -> InputRequirement: ...
 
-    def predict(self, *, inputs: Any, issue_datetime: datetime) -> ModelResult: ...
+    artifact_scope: ArtifactScope
 
-    def hindcast(self, *, inputs: Any, issue_datetime: datetime) -> ModelResult: ...
+    # REQUIRED training contract — cold full rebuild is the baseline.
+    def train(self, inputs: Any, *, config: Any, rng: Random) -> TrainedArtifact: ...
+
+    # ^ PROVISIONAL: `inputs` is the assembled-input bundle, `config` model params;
+    #   both co-designed with SAP3 (doc 014 Task 3). Typed Any until that PR lands.
+
+    def predict(
+        self,
+        artifact: TrainedArtifact,
+        *,
+        inputs: Any,  # PROVISIONAL: assembled-input bundle, co-designed with SAP3.
+        issue_datetime: datetime,
+        rng: Random,
+    ) -> ModelResult: ...
+
+    def hindcast(
+        self,
+        artifact: TrainedArtifact,
+        *,
+        inputs: Any,  # PROVISIONAL: assembled-input bundle, co-designed with SAP3.
+        issue_datetime: datetime,
+        rng: Random,
+    ) -> ModelResult: ...
+
+    def serialize_artifact(self, artifact: TrainedArtifact) -> bytes: ...
+
+    def deserialize_artifact(self, raw: bytes) -> TrainedArtifact: ...
+
+
+@runtime_checkable
+class RetrainableModel(ForecastModel, Protocol):
+    # Warm-start retrain — OPTIONAL. SAP3 checks isinstance(model, RetrainableModel)
+    # to know whether warm-start is supported; otherwise it falls back to `train`.
+    def retrain(
+        self,
+        base_artifact: TrainedArtifact,
+        inputs: Any,  # PROVISIONAL: assembled-input bundle, co-designed with SAP3.
+        *,
+        config: Any,  # PROVISIONAL: model params, co-designed with SAP3.
+        rng: Random,
+    ) -> TrainedArtifact: ...
diff --git a/forecast_interface/interface/scope.py b/forecast_interface/interface/scope.py
new file mode 100644
index 0000000..ce5bff7
--- /dev/null
+++ b/forecast_interface/interface/scope.py
@@ -0,0 +1,9 @@
+from enum import Enum
+
+
+class ArtifactScope(Enum):
+    STATION = "station"
+    GROUP = "group"
+    # A "national-group" model is a GROUP (its station set happens to be national).
+    # SAP3 also has an internal VIRTUAL scope for combination models; that is
+    # SAP3-internal and not model-author-facing, so it is intentionally omitted here.
diff --git a/pyproject.toml b/pyproject.toml
index ce41905..17f4dd4 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -1,6 +1,6 @@
 [project]
 name = "forecastinterface"
-version = "0.1.4"
+version = "0.1.5"
 description = "Add your description here"
 readme = "README.md"
 requires-python = ">=3.11"
@@ -13,7 +13,7 @@ dependencies = [
 pythonpath = ["."]
 
 [tool.bumpversion]
-current_version = "0.1.4"
+current_version = "0.1.5"
 commit = false
 tag = false
 allow_dirty = true
diff --git a/tests/test_interface.py b/tests/test_interface.py
index 504b11a..4e3e80a 100644
--- a/tests/test_interface.py
+++ b/tests/test_interface.py
@@ -1,4 +1,5 @@
 from datetime import datetime, timedelta
+from random import Random
 from typing import Any
 
 import polars as pl
@@ -16,11 +17,14 @@
     TemporalResolution as InputTemporalResolution,
 )
 from forecast_interface.interface import (
+    ArtifactScope,
     FailureCause,
     ForecastModel,
     ModelFailure,
     ModelResult,
     ModelSuccess,
+    RetrainableModel,
+    TrainedArtifact,
 )
 from forecast_interface.output import (
     DeterministicData,
@@ -192,35 +196,161 @@ def test_failure_is_model_result(self) -> None:
 
 
 # ---------------------------------------------------------------------------
-# ForecastModel protocol
+# ArtifactScope
 # ---------------------------------------------------------------------------
 
 
+class TestArtifactScope:
+    def test_member_count(self) -> None:
+        assert len(ArtifactScope) == 2
+
+    def test_members_exist(self) -> None:
+        assert ArtifactScope.STATION is not None
+        assert ArtifactScope.GROUP is not None
+
+
+# ---------------------------------------------------------------------------
+# TrainedArtifact (opaque marker Protocol)
+# ---------------------------------------------------------------------------
+
+
+class TestTrainedArtifact:
+    def test_any_object_satisfies_marker_protocol(self) -> None:
+        # TrainedArtifact is an opaque marker Protocol with no members, so any
+        # object satisfies it via isinstance.
+        assert isinstance(object(), TrainedArtifact)
+
+
+# ---------------------------------------------------------------------------
+# ForecastModel / RetrainableModel protocols
+# ---------------------------------------------------------------------------
+
+
+class _ConformingModel:
+    artifact_scope = ArtifactScope.STATION
+
+    @property
+    def input_requirement(self) -> InputRequirement:
+        return _make_input_requirement()
+
+    def train(self, inputs: Any, *, config: Any, rng: Random) -> TrainedArtifact:
+        return object()
+
+    def predict(
+        self,
+        artifact: TrainedArtifact,
+        *,
+        inputs: Any,
+        issue_datetime: datetime,
+        rng: Random,
+    ) -> ModelResult:
+        return ModelSuccess(output=_make_model_output())
+
+    def hindcast(
+        self,
+        artifact: TrainedArtifact,
+        *,
+        inputs: Any,
+        issue_datetime: datetime,
+        rng: Random,
+    ) -> ModelResult:
+        return ModelSuccess(output=_make_model_output())
+
+    def serialize_artifact(self, artifact: TrainedArtifact) -> bytes:
+        return b""
+
+    def deserialize_artifact(self, raw: bytes) -> TrainedArtifact:
+        return object()
+
+
+class _RetrainableModel(_ConformingModel):
+    def retrain(
+        self,
+        base_artifact: TrainedArtifact,
+        inputs: Any,
+        *,
+        config: Any,
+        rng: Random,
+    ) -> TrainedArtifact:
+        return object()
+
+
 class TestForecastModel:
     def test_conforming_class_satisfies_protocol(self) -> None:
-        class _ConformingModel:
+        assert isinstance(_ConformingModel(), ForecastModel)
+
+    def test_missing_train_fails_protocol(self) -> None:
+        class _NoTrain:
+            artifact_scope = ArtifactScope.STATION
+
             @property
             def input_requirement(self) -> InputRequirement:
                 return _make_input_requirement()
 
             def predict(
-                self, *, inputs: Any, issue_datetime: datetime
+                self,
+                artifact: TrainedArtifact,
+                *,
+                inputs: Any,
+                issue_datetime: datetime,
+                rng: Random,
             ) -> ModelResult: ...
 
             def hindcast(
-                self, *, inputs: Any, issue_datetime: datetime
+                self,
+                artifact: TrainedArtifact,
+                *,
+                inputs: Any,
+                issue_datetime: datetime,
+                rng: Random,
             ) -> ModelResult: ...
 
-        assert isinstance(_ConformingModel(), ForecastModel)
+            def serialize_artifact(self, artifact: TrainedArtifact) -> bytes: ...
+
+            def deserialize_artifact(self, raw: bytes) -> TrainedArtifact: ...
+
+        assert not isinstance(_NoTrain(), ForecastModel)
+
+    def test_missing_serialize_fails_protocol(self) -> None:
+        class _NoSerialize:
+            artifact_scope = ArtifactScope.STATION
 
-    def test_missing_predict_fails_protocol(self) -> None:
-        class _Incomplete:
             @property
             def input_requirement(self) -> InputRequirement:
                 return _make_input_requirement()
 
+            def train(
+                self, inputs: Any, *, config: Any, rng: Random
+            ) -> TrainedArtifact: ...
+
+            def predict(
+                self,
+                artifact: TrainedArtifact,
+                *,
+                inputs: Any,
+                issue_datetime: datetime,
+                rng: Random,
+            ) -> ModelResult: ...
+
             def hindcast(
-                self, *, inputs: Any, issue_datetime: datetime
+                self,
+                artifact: TrainedArtifact,
+                *,
+                inputs: Any,
+                issue_datetime: datetime,
+                rng: Random,
             ) -> ModelResult: ...
 
-        assert not isinstance(_Incomplete(), ForecastModel)
+            def deserialize_artifact(self, raw: bytes) -> TrainedArtifact: ...
+
+        assert not isinstance(_NoSerialize(), ForecastModel)
+
+    def test_conforming_without_retrain_is_not_retrainable(self) -> None:
+        model = _ConformingModel()
+        assert isinstance(model, ForecastModel)
+        assert not isinstance(model, RetrainableModel)
+
+    def test_model_with_retrain_satisfies_both(self) -> None:
+        model = _RetrainableModel()
+        assert isinstance(model, ForecastModel)
+        assert isinstance(model, RetrainableModel)
diff --git a/uv.lock b/uv.lock
index 58a0f40..777d23f 100644
--- a/uv.lock
+++ b/uv.lock
@@ -85,7 +85,7 @@ wheels = [
 
 [[package]]
 name = "forecastinterface"
-version = "0.1.3"
+version = "0.1.5"
 source = { virtual = "." }
 dependencies = [
     { name = "polars" },

From 84a2ec2591999e8ad732db686ed0dd9cecd6d750 Mon Sep 17 00:00:00 2001
From: Beatrice Marti <mabesa@users.noreply.github.com>
Date: Tue, 16 Jun 2026 21:09:57 +0200
Subject: [PATCH 05/16] docs(interface): resolve lifecycle, state, failure,
 hindcast, output-floor, input & station-identity decisions

Grilling session outcomes recorded in model_interface.md and
open_design_questions.md:

- Warm-up/state: state-free v0; warm-up period = lookback; persisted
  state deferred as additive StatefulModel extension (1.3, Q4)
- Failure: structured ModelResult, not raise; whole-run vs per-station
  rule (1.7)
- Hindcast: demoted to optional BatchHindcastModel; SAP3 loops predict
  otherwise (1.8, Q6 scoped)
- Output floors: >=3 quantiles / >=8 trajectories structural; SAP3
  operational floors checked loudly at integration; deterministic valid
  but non-operational (1.5, Q3)
- Input bundle: FI-owned ModelInputs isomorphic to InputRequirement;
  v1 scope = daily/hourly, POINT/BASIN_AVERAGE/ELEVATION_BAND, single
  product; GRIDDED + multi-product deferred (1.9); config open (Q8)
- Station identity: opaque-but-meaningful str keys stored in artifact;
  groups from v1; 1:1 in/out; embedding-key contract pulled to v1 (1.10, Q1)
- VariableMetadata: drop name; keep forecast_horizon + per-issue-block
  validator; offset semantics fixed (Q5)
- SnowMapper SWE/RoF at BASIN_AVERAGE or ELEVATION_BAND; availability
  lag flagged as new open item (Q7, Q9)

Code TODOs noted, not yet applied (validators, ModelInputs type,
protocol split). Authority rule refined: FI may express not-yet-
operational models provided the gap is never silent.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 docs/model_interface.md        |  70 ++++++++++++++-----
 docs/open_design_questions.md  | 122 +++++++++++++++++++++++++++------
 forecast_interface/__init__.py |   2 +-
 pyproject.toml                 |   4 +-
 uv.lock                        |   2 +-
 5 files changed, 158 insertions(+), 42 deletions(-)
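
The protocol split listed under "Code TODOs" in the message above could be sketched as follows. This is a minimal illustration, assuming `typing.Protocol` with `runtime_checkable`; the trimmed one-method `ForecastModel`, the toy `PlainModel` / `BatchModel` classes and the `run_hindcast` helper are hypothetical stand-ins, not the package's actual API.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class ForecastModel(Protocol):
    # Trimmed to a single member for brevity; the documented protocol has more.
    def predict(self, artifact: Any, *, inputs: Any, issue_datetime: Any, rng: Any) -> Any: ...


@runtime_checkable
class BatchHindcastModel(ForecastModel, Protocol):
    # Optional extension: one call covering many issue datetimes.
    def hindcast(self, artifact: Any, *, inputs: Any, issue_datetimes: Any, rng: Any) -> Any: ...


class PlainModel:
    """Hypothetical model that implements only the required surface."""

    def predict(self, artifact, *, inputs, issue_datetime, rng):
        return f"predict@{issue_datetime}"


class BatchModel(PlainModel):
    """Hypothetical model that also offers the batch path."""

    def hindcast(self, artifact, *, inputs, issue_datetimes, rng):
        return f"batch@{len(issue_datetimes)}"


def run_hindcast(model, artifact, inputs, issue_datetimes, rng) -> list:
    """Use the batch path when available, otherwise loop predict."""
    if isinstance(model, BatchHindcastModel):
        # One result that covers every issue datetime in the batch.
        return [model.hindcast(artifact, inputs=inputs, issue_datetimes=issue_datetimes, rng=rng)]
    # Fallback: one predict call, and one result, per issue datetime.
    return [model.predict(artifact, inputs=inputs, issue_datetime=t, rng=rng) for t in issue_datetimes]
```

Note that a `runtime_checkable` `isinstance` check only verifies that the named members exist; it does not validate signatures, so a conforming-looking model can still fail at call time.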

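The validator TODOs named in the message (the >=3 quantile floor and the per-issue-block `forecast_horizon` check) could take roughly this shape. It is a sketch in plain Python under stated assumptions: `check_quantile_levels`, `check_horizon` and `MIN_QUANTILE_LEVELS` are hypothetical names, and the real validators would operate on the package's DataFrame-backed data classes rather than on bare lists.

```python
from collections import Counter
from datetime import datetime, timedelta

MIN_QUANTILE_LEVELS = 3  # structural floor from decision 1.5 (assumed constant name)


def check_quantile_levels(columns: list[str]) -> list[float]:
    """Validate quantile column names such as "0.1", "0.5", "0.9"."""
    levels = [float(name) for name in columns]
    if len(levels) < MIN_QUANTILE_LEVELS:
        raise ValueError(f"need >= {MIN_QUANTILE_LEVELS} quantile levels, got {len(levels)}")
    if any(not 0.0 < q < 1.0 for q in levels):
        raise ValueError("quantile levels must lie strictly in (0, 1)")
    # Strictly ascending implies both sorted and unique.
    if any(a >= b for a, b in zip(levels, levels[1:])):
        raise ValueError("quantile levels must be sorted ascending and unique")
    return levels


def check_horizon(issue_datetimes: list[datetime], forecast_horizon: int) -> None:
    """Each issue_datetime block must hold exactly forecast_horizon rows.

    A forecast has a single block, so this reduces to row count == horizon.
    """
    rows_per_issue = Counter(issue_datetimes)
    bad = {t: n for t, n in rows_per_issue.items() if n != forecast_horizon}
    if bad:
        raise ValueError(f"issue blocks with wrong row count: {bad}")
```

Grouping by `issue_datetime` lets one check serve both the constant-issue forecast case and the batch-hindcast case.
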
diff --git a/docs/model_interface.md b/docs/model_interface.md
index d73b342..f0f077d 100644
--- a/docs/model_interface.md
+++ b/docs/model_interface.md
@@ -2,15 +2,15 @@
 
 The primary goal of this package is to define the interface between any forecasting library and the forecasting model. The forecasting model can be implemented in any package / code base but needs to follow the protocol defined here.
 
-There are **two protocols**: the required `ForecastModel`, and `RetrainableModel` (which extends `ForecastModel`) for the optional warm-start retrain capability. The scope of a model (single station vs. group / national) is **declared** via `artifact_scope`, not split into separate protocols. SAP3 consumes the FI protocol through a thin adapter that dispatches to its own `StationForecastModel` / `GroupForecastModel` — see [`docs/fi-sap3-mapping.md`](./fi-sap3-mapping.md). The driving requirements for the first (Nepal v1) integration are in [`docs/nepal-model-requirements.md`](./nepal-model-requirements.md).
+There are **three protocols**: the required `ForecastModel`, plus two optional extensions — `RetrainableModel` (warm-start `retrain`) and `BatchHindcastModel` (efficient batch `hindcast`), both of which extend `ForecastModel`. A `StatefulModel` extension is **reserved** for future conceptual / hybrid models (see *Warm-up and state* below). The scope of a model (single station vs. group / national) is **declared** via `artifact_scope`, not split into separate protocols. SAP3 consumes the FI protocol through a thin adapter that dispatches to its own `StationForecastModel` / `GroupForecastModel` — see [`docs/fi-sap3-mapping.md`](./fi-sap3-mapping.md). The driving requirements for the first (Nepal v1) integration are in [`docs/nepal-model-requirements.md`](./nepal-model-requirements.md).
 
 Core functionalities include:
 
 **Forecast Function** `predict()`
 Takes as input the `ModelInput` and a trained artifact, and outputs the `ModelOutput` (Forecast).
 
-**Hindcast Function** `hindcast()`
-Takes as input the `ModelInput` and a trained artifact, and outputs the `ModelOutput` (Hindcast).
+**Hindcast Function** `hindcast()` — *optional, strongly recommended*
+Lives on the optional `BatchHindcastModel` extension. Takes a trained artifact and a **batch** of issue datetimes, and outputs the `ModelOutput` (Hindcast) for all of them in one call. Functionally equivalent to looping `predict()` over historical issue times, but vectorized for efficiency. SAP3 uses the batch path whenever the model implements it and falls back to looping `predict()` otherwise — because SAP3 runs hindcasts routinely (skill evaluation), implementing it is strongly recommended.
 
 **Training Functions** `train()` / `retrain()`
 Produce a `TrainedArtifact` from training inputs. See the Training & Lifecycle Protocol below.
@@ -35,7 +35,7 @@ A "national-group" model is a `GROUP` (it is just a group whose station set happ
 
 ### The `ForecastModel` protocol surface
 
-`train`, `predict`, `hindcast`, `serialize_artifact` and `deserialize_artifact` are **required**. The optional warm-start `retrain` lives on a **separate** protocol, `RetrainableModel` (which extends `ForecastModel`), so it is not forced on every model. SAP3 checks `isinstance(model, RetrainableModel)` to know whether warm-start is supported; otherwise it falls back to `train`.
+`train`, `predict`, `serialize_artifact` and `deserialize_artifact` are **required**. Two optional capabilities each live on a **separate** extension protocol, so they are not forced on every model: warm-start `retrain` on `RetrainableModel`, and batch `hindcast` on `BatchHindcastModel`. SAP3 detects each with `isinstance` — `isinstance(model, RetrainableModel)` to decide whether to warm-start (else it falls back to `train`), and `isinstance(model, BatchHindcastModel)` to decide whether to batch-hindcast (else it loops `predict`).
 
 | Member | Signature | Protocol | Notes |
 |---|---|---|---|
@@ -43,7 +43,7 @@ A "national-group" model is a `GROUP` (it is just a group whose station set happ
 | `artifact_scope` | `attribute: ArtifactScope` | `ForecastModel` | Declared scope (`STATION` / `GROUP`). |
 | `train` | `train(inputs, *, config, rng) -> TrainedArtifact` | `ForecastModel` | Cold, full rebuild from scratch. The required baseline every model must support. |
 | `predict` | `predict(artifact, *, inputs, issue_datetime, rng) -> ModelResult` | `ForecastModel` | Forecast. Returns FI's `ModelResult` → `ModelOutput`. |
-| `hindcast` | `hindcast(artifact, *, inputs, issue_datetime, rng) -> ModelResult` | `ForecastModel` | Hindcast. Same return type as `predict`. |
+| `hindcast` | `hindcast(artifact, *, inputs, issue_datetimes, rng) -> ModelResult` | `BatchHindcastModel` | **Optional, strongly recommended.** Batch hindcast over many issue datetimes in one call; same `ModelResult` return as `predict`. Absent it, SAP3 loops `predict`. |
 | `serialize_artifact` | `serialize_artifact(artifact) -> bytes` | `ForecastModel` | Opaque byte serialization of a `TrainedArtifact`. |
 | `deserialize_artifact` | `deserialize_artifact(raw: bytes) -> TrainedArtifact` | `ForecastModel` | Inverse of `serialize_artifact`. |
 | `retrain` | `retrain(base_artifact, inputs, *, config, rng) -> TrainedArtifact` | `RetrainableModel` | **Optional.** Warm-start from an existing artifact, for models capable of it. Models that cannot warm-start simply do not implement it; callers fall back to `train`. |
@@ -62,16 +62,44 @@ The `inputs` and `config` parameters are typed `Any` (provisional, see status no
 - **Self-contained**: `serialize_artifact` produces `bytes` that embed all weights, scalers, and metadata — **with no absolute filesystem paths** and no machine-local references.
 - **Deployment-portable**: `deserialize_artifact(serialize_artifact(a))` must reconstruct an artifact that runs **unchanged on another SAP3 instance**.
 
-**Deferred to Phase 4.** Rich provenance metadata (scope, region, training period, hashes, seed, product versions) and the group-artifact embedding-key / station-set-mismatch contract are **not** part of the marker Protocol yet; they land in Phase 4 (see [`docs/nepal-model-requirements.md`](./nepal-model-requirements.md) §4 and §8). The intended contract: an artifact whose scope is `GROUP` typically embeds the station identifiers it was trained on, must document its embedding key, must define behaviour when the predict-time station set differs from the trained set, and must **never silently mis-associate** a prediction with the wrong station.
+**Partly deferred.** Rich provenance metadata (scope, region, training period, hashes, seed, product versions) is **not** part of the marker Protocol yet and lands in Phase 4 (see [`docs/nepal-model-requirements.md`](./nepal-model-requirements.md) §4).
 
-### State-free
+The **group-artifact embedding-key / station-set-mismatch contract**, however, is **load-bearing from v1**, because `GROUP` artifacts ship from the start (decision 1.10) and east→west transfer is a Nepal v1 target (decision 1.6), so its earlier Phase 4 deferral must be re-evaluated. The contract: a `GROUP` artifact **embeds the meaningful station strings it was trained on** (which the model reads to key per-station state); it must define behaviour when the predict-time station set differs from the trained set — **known** stations use stored state, **unknown** stations are generalized from static attributes or rejected with an explicit error — and it must **never silently mis-associate** a prediction with the wrong station. Station strings are **stable, meaningful identifiers** that round-trip unchanged through `serialize_artifact` / `deserialize_artifact` and across deployments; the model never alters them.
 
-FI's protocol is **state-free**: there is no `state` parameter and no state in the return value. SAP3's `prior_state` bytes are handled entirely inside the SAP3 adapter (see [`docs/fi-sap3-mapping.md`](./fi-sap3-mapping.md)), not by the FI protocol. An optional `dump_state` / `restore_state` pair is noted only as a **possible future extension** and is out of scope for v1.
+### Warm-up and state (state-free in v0)
+
+FI's protocol is **state-free in v0**: `predict` / `hindcast` take no `state` parameter and return no state. This is a deliberate, lightweight default. It rests on separating two things commonly conflated as "warm-up":
+
+- **Warm-up *period* (cold spin-up)** — the run of forcing a model needs *before* the issue time to spin its internal stores up from nothing. This needs **no new channel**: it is declared as `lookback` in `InputRequirement`. Any model that needs a spin-up period simply requires a sufficiently long lookback. Covered for **all** model types in v0.
+- **Persisted warm *state* (cross-cycle snapshot)** — SAP3's `prior_state` / `new_state` bytes, carrying spun-up state from the previous cycle so it need not be re-spun each run. This is purely an **optimization**, and it is **deferred from v0**.
+
+**Why deferred, not dropped.** The model integrating in v1 is **pure ML**: it reconstructs everything from its lookback window each cycle and needs neither a warm-up period nor persisted state. Warm-up state is required for conceptual / distributed models and *may* be required for hybrid models — but none exist in the interface yet. Building a state channel now would tax every (currently stateless) model author with state ceremony they never use, which conflicts with keeping the interface lightweight and easy for humans and machines to adhere to. We therefore ship the single clean signature and **reserve** state as a future extension.
+
+**The reserved extension is additive and non-breaking.** When a conceptual / hybrid model that genuinely needs persisted state arrives, state support is added as an **optional `StatefulModel` sub-protocol** extending `ForecastModel` — exactly as `RetrainableModel` already does for warm-start. SAP3 detects it via `isinstance(model, StatefulModel)` and threads `prior_state` only for those models; existing `ForecastModel` implementations do not change a line. The precise shape of that extension (state as an extra `predict` parameter, a distinct method, or carried inside `ModelResult`) is **deliberately left open** until a real stateful model forces the decision — designing it against a concrete model beats guessing now.
+
+> **Correction to the earlier framing.** A previous note claimed `prior_state` could be "handled entirely inside the SAP3 adapter." That is not implementable: an adapter cannot inject state into a `predict` that has no state parameter. The correct resolution is the additive `StatefulModel` sub-protocol above.
+
+**SAP3 consistency.** A state-free FI model maps onto SAP3 as a model that ignores `prior_state` and always runs `WarmUpSource.FRESH` — already legal SAP3 behaviour for stateless models. The single divergence to record: **FI v0 does not use SAP3's warm-up-snapshot path; FI models warm up from `lookback`.**
 
 ### Output stays FI-authoritative
 
 `predict` / `hindcast` **return** `ModelResult` → `ModelOutput` (defined below). `ModelOutput` is **not** replaced by SAP3's `ForecastEnsemble`: the SAP3 adapter maps `ModelOutput` *into* its own representation, never the other way around. See [`docs/fi-sap3-mapping.md`](./fi-sap3-mapping.md) for the field-level mapping.
 
+### Failure & result model
+
+`predict` / `hindcast` return `ModelResult = ModelSuccess | ModelFailure` rather than raising. In operational forecasting, failure for a given cycle or station is **routine, not exceptional** (gauges offline, degraded inputs), so it is modelled as a typed outcome carrying a structured `FailureCause`, not an exception. SAP3's `except`-and-return path stays only as a **backstop** for unanticipated bugs — *anticipated* failure must be returned, not raised.
+
+Failure is represented at **two levels**, with a strict rule for which to use:
+
+| Level | Type | Means | When |
+|---|---|---|---|
+| Whole-run | `ModelFailure` (the union branch) | the model produced **nothing at all** | no artifact, invalid config, dependency down, whole input bundle malformed |
+| Per-station / variable | `VariableStatus.FAILURE` inside `ModelOutput` | the model ran and produced output for some stations/variables, but **this one** could not | that station's inputs missing or too degraded |
+
+**Rule:** if the model can produce output for **even one** station/variable, it returns `ModelSuccess` with per-entry `FAILURE` / `PARTIAL` status; `ModelFailure` is reserved for **total** inability to produce anything.
+
+`ModelFailure` carries `cause: FailureCause` (`INPUT_DATA` / `RESOURCE` / `MODEL_ERROR` / `CONFIGURATION` / `DEPENDENCY`) and a human-readable `message`. Per-station `FAILURE` entries currently carry only `status` + `flags` (e.g. `DATA_AVAILABILITY`); attaching a per-station `FailureCause` is a **possible future enhancement**, deferred to keep the per-entry surface light.
+
 ---
 
 ## ModelOutput
@@ -102,25 +130,29 @@ All data classes share a unified DataFrame schema with two required datetime col
 | `issue_datetime` | `datetime` (UTC) | When the forecast/hindcast was issued |
 | `datetime` | `datetime` (UTC) | The target valid time of the prediction |
 
-**Forecast**: `issue_datetime` is constant across all rows (single issue time).
-**Hindcast**: `issue_datetime` varies across rows (multiple issue times).
+**Forecast** (`predict`): `issue_datetime` is constant across all rows (a single issue time).
+**Batch hindcast** (`BatchHindcastModel.hindcast`): `issue_datetime` **varies** across rows (one block of rows per issue time in the batch).
 
-`predict` and `hindcast` return the **same** `ModelOutput` type; the only distinction is whether `issue_datetime` is constant (forecast) or varies (hindcast) across rows.
+`predict` and `hindcast` return the **same** `ModelOutput` type; the only distinction is whether `issue_datetime` is constant (forecast) or varies (batch hindcast) across rows. The varying-`issue_datetime` schema therefore only arises from the optional `BatchHindcastModel` path — a plain `ForecastModel` always emits a constant `issue_datetime`.
 
 ### Data Classes
 
 Each data class wraps a DataFrame with the two temporal columns above, plus class-specific value columns:
 
-**DeterministicData** — columns: `[issue_datetime, datetime, value]`
+**DeterministicData** — columns: `[issue_datetime, datetime, value]`. A single point forecast. **Valid output, but not operationally consumable by SAP3 on its own** — SAP3 has no deterministic channel (it is always probabilistic). A deterministic-only model is a legitimate, possibly strong model; to deploy it operationally it must *also* supply forecast uncertainty — quantiles or trajectories it emits itself, or a downstream uncertainty wrapper that produces them. See the operational-floor note below.
 
 **QuantileData** — columns: `[issue_datetime, datetime, <quantile_level>, ...]`
-Quantile columns are named by their level as strings (e.g., `"0.1"`, `"0.5"`, `"0.9"`). Levels must be in (0, 1), sorted ascending, and unique.
+Quantile columns are named by their level as strings (e.g., `"0.1"`, `"0.5"`, `"0.9"`). Levels must be in (0, 1), sorted ascending, and unique. **FI structural minimum: ≥ 3 levels** (a centre plus two tails).
 
 **TrajectoryData** — columns: `[issue_datetime, datetime, "1", "2", ..., "<N>"]`
-Sample columns are named `"1"` through `"<num_samples>"`.
+Sample columns are named `"1"` through `"<num_samples>"`. **FI structural minimum: ≥ 8 samples.** Models typically emit ~50 (deployment-specific, may be fewer).
 
 **EpistemicUncertaintyData** — columns: `[issue_datetime, datetime, std, range]`
-Captures model uncertainty as standard deviation and range.
+Captures model uncertainty as standard deviation and range. (Dropped at the SAP3 boundary in v0b; see the mapping doc.)
+
+#### Operational floor — structural vs. operational
+
+FI enforces only **deployment-independent structural floors** (≥3 quantiles, ≥8 trajectories; a probabilistic representation is required for operational use). The **operational floors are SAP3 deployment config** — `min_operational_quantile_levels` (≥7, with tail coverage ≤0.05 / ≥0.95) and `min_operational_ensemble_size` (≥20 members) — so they live on the SAP3 side, not in FI. To stop this failing *silently* at runtime, the model **declares the representation(s) and the count it will emit** (via `TargetSpec` / metadata), and SAP3 checks compatibility against its deployment floor at **integration / registration time**. Net rule: *valid FI output **and** declared counts ≥ the deployment floor ⟹ operational*; anything short is rejected loudly before any forecast runs, never silently dropped.
 
 ### VariableOutput
 
@@ -128,7 +160,7 @@ Groups data for a single output variable (within a single station):
 
 | Field | Type | Description |
 |---|---|---|
-| `metadata` | `VariableMetadata` | Name, unit, resolution, timedelta, forecast_horizon, offset |
+| `metadata` | `VariableMetadata` | `unit`, `resolution`, `timedelta`, `forecast_horizon`, `offset` (no `name` — the variable name is the dict key). See *Metadata semantics* below. |
 | `deterministic` | `DeterministicData \| None` | Point forecast |
 | `quantiles` | `QuantileData \| None` | Quantile forecast |
 | `trajectories` | `TrajectoryData \| None` | Ensemble trajectories |
@@ -140,6 +172,12 @@ At least one of `deterministic`, `quantiles`, or `trajectories` must be present
 
 `variables` must contain at least one station entry, and each station's inner dict must contain at least one variable. When status is `PARTIAL`, at least one data representation must still be present (same rule as `SUCCESS`). A station that produced no usable data is represented by a `FAILURE` `VariableOutput`, not by an empty or missing entry.
 
+### Metadata semantics
+
+- **`forecast_horizon`** — number of forecast steps; **consumed directly by the SAP3 adapter** (`ForecastEnsemble.forecast_horizon_steps`). A cross-validator enforces it against the data: for `predict`, `forecast_horizon` equals the row count; for batch `hindcast` it equals the rows **per `issue_datetime`** (one block per issue time).
+- **`offset`** — number of steps (each `timedelta` long) between the **last observation and the first forecast step**. `offset = 1` ⇒ the first forecast valid time is `last_obs + 1·timedelta` (the usual next-step case); `offset = 2` ⇒ a one-step gap.
+- **No `name`** — the variable name is the `ModelOutput.variables[station][variable]` dict key; it is not duplicated in metadata, to avoid two disagreeing sources of truth.
+
 ### ForecastFlag
 
 Quality flags that can be attached to a variable output:
diff --git a/docs/open_design_questions.md b/docs/open_design_questions.md
index facf68c..e9f5a77 100644
--- a/docs/open_design_questions.md
+++ b/docs/open_design_questions.md
@@ -46,11 +46,16 @@ A single-station model returns a dict with one key, e.g. `{"station_xyz": {"disc
 
 **Reflected in:** `docs/input_requirement.md` (later phase).
 
-## 1.3 Model state — RESOLVED (already)
+## 1.3 Model state / warm-up — RESOLVED (refined)
 
-**Decision:** FI stays **state-free**. SAP3's `prior_state` bytes are handled entirely by the adapter. An optional `dump_state()` / `restore_state(bytes)` pair on the protocol is a **future extension only**, not part of the current contract.
+**Decision:** FI is **state-free in v0**, separating two things conflated as "warm-up":
 
-**Rationale:** state management is orchestrator-side concern (SAP3's `PgModelStateStore`, `WarmUpSource`, `prior_state` on predict). FI's `predict()` has no state parameter and no state in its return.
+- **Warm-up *period* (cold spin-up)** is declared as `lookback` in `InputRequirement` — no new channel, available to all model types.
+- **Persisted warm *state* (cross-cycle snapshot)** — SAP3's `prior_state` / `new_state` bytes — is **deferred from v0** and **reserved** as an additive, non-breaking `StatefulModel` sub-protocol (the same `isinstance`-detected pattern as `RetrainableModel`), to be designed when a conceptual / hybrid model actually requires it.
+
+**Rationale:** the v1 model is pure ML (see Q4, now answered) — it reconstructs state from its lookback window and needs neither a warm-up period nor persisted state. A state channel built now would tax every stateless author with unused ceremony, against the goal of a lightweight interface. The earlier framing — "`prior_state` handled entirely by the adapter" — was **incorrect**: an adapter cannot inject state into a `predict` that has no state parameter. The correct resolution is the additive sub-protocol above.
+
+**SAP3 consistency:** a state-free FI model maps to a SAP3 model that ignores `prior_state` and always runs `WarmUpSource.FRESH`. The one recorded divergence: FI v0 does not use SAP3's warm-up-snapshot path.
 
 **Reflected in:** `docs/model_interface.md`.
 
@@ -69,11 +74,18 @@ Banded Snowmapper SWE / snowmelt is declared at `ELEVATION_BAND`.
 
 **Reflected in:** `docs/input_requirement.md`, `docs/model_interface.md`.
 
-## 1.5 Quantile floor — RESOLVED (split responsibility)
+## 1.5 Output floors & deterministic output — RESOLVED (refined)
+
+**Decision:** FI enforces **deployment-independent structural floors** only; SAP3's deployment-configurable operational floors are checked **loudly at integration time**, not silently at runtime.
 
-**Decision:** FI proposes a **structural minimum of ≥3 quantiles** (center + two tails). SAP3's operational requirement of **≥7 quantiles with tail coverage** (a level ≤ 0.05 and a level ≥ 0.95) is enforced at the **adapter boundary**, NOT in FI.
+- **FI structural floors (hard validators):** QuantileData **≥ 3** levels (centre + two tails); TrajectoryData **≥ 8** samples. *(Code TODO: validators currently enforce ≥1 quantile / >0 trajectory — tighten to ≥3 / ≥8.)*
+- **SAP3 operational floors (deployment config, SAP3-side):** `min_operational_quantile_levels` ≥ 7 with tail coverage (a level ≤0.05 and a level ≥0.95); `min_operational_ensemble_size` ≥ 20 members. Kept out of FI because they are deployment-specific.
+- **Deterministic-only is allowed, not forbidden.** A deterministic model may be strong; FI accepts deterministic output as structurally valid. But SAP3 has **no deterministic channel**, so deterministic-only is **non-operational** until the model supplies forecast uncertainty (quantiles/trajectories it emits, or a downstream uncertainty wrapper).
+- **No silent non-operational output.** The model declares its representation(s) and emitted count; SAP3 checks them against its deployment floor at **integration/registration time** and rejects incompatibles loudly. Net: *valid FI + declared counts ≥ deployment floor ⟹ operational.*
 
-An FI model emitting fewer than 7 quantiles is **structurally valid but NOT operationally usable in SAP3**.
+**Authority-rule refinement (from Q1):** the Q1 principle — "FI must not express a model SAP3 can't operate" — is refined to: FI **may** express not-yet-operational models (e.g. deterministic-only), provided the non-operational state is **never silent** — it is declared, caught loudly at the SAP3 boundary, and has a documented path to operational.
+
+**Model developer input (Nepal):** the model emits **quantiles** (count configurable at training); **trajectories** typically ~50 (deployment-specific, may be fewer).
 
 **Reflected in:** `docs/model_interface.md`, `docs/fi-sap3-mapping.md`.
 
@@ -82,45 +94,111 @@ An FI model emitting fewer than 7 quantiles is **structurally valid but NOT oper
 **Decisions provided by the model developer:**
 
 - **First artifact scope: the eastern regional group ships first.** The first production artifact is therefore **GROUP-scoped** (`ArtifactScope.GROUP`). This makes the station-keyed output of decision 1.1 and the GROUP adapter path **load-bearing from day one**, not a later concern.
-- **SnowMapper forcing starts with SWE and ROF** (snow water equivalent and runoff), declared as **banded dynamic forcing at `ELEVATION_BAND`** (see decision 1.4). Specific **lead times** are still to be confirmed — see Q7 residual.
+- **SnowMapper forcing starts with SWE and ROF** (snow water equivalent and runoff), declared as dynamic forcing at **`BASIN_AVERAGE` or `ELEVATION_BAND`** (see decision 1.4; Q7 broadened this from ELEVATION_BAND-only). Lead times / resolutions follow the ECMWF forecast and ERA5-Land (Q7), with a possible SnowMapper availability lag (Q9).
 - **Artifact transfer direction is east → west** (an eastern group artifact applied to western gauges). This makes the embedding-key / station-set-mismatch contract (Nepal §8) concrete: the eastern GROUP artifact **must define its behaviour when applied to the western station set** — handle gracefully or raise an explicit error, never silently associate a station with the wrong embedding.
 
 **Reflected in:** `docs/nepal-model-requirements.md`, `docs/model_interface.md` (artifact portability), `docs/fi-sap3-mapping.md` (artifact metadata ownership).
 
+## 1.7 Failure channel — RESOLVED
+
+**Decision:** `predict` / `hindcast` return a structured `ModelResult = ModelSuccess | ModelFailure`; they do **not** raise. Two failure levels with a strict rule: `ModelFailure` = total inability to produce anything; `VariableStatus.FAILURE` = per-station/variable failure within an otherwise-successful run. **If even one station/variable is produced, return `ModelSuccess` with per-entry `FAILURE`/`PARTIAL`; reserve `ModelFailure` for total failure.**
+
+**Rationale:** operational per-cycle / per-station failure is routine, not exceptional; a typed outcome with `FailureCause` beats SAP3's catch-and-stringify and is more type-safe. Divergence from SAP3 (which raises) and from Sandro's original (no failure channel) is justified on these grounds; SAP3's `except`-and-return remains a backstop for unanticipated bugs only.
+
+**Deferred:** per-station `FailureCause` on `VariableOutput` (today only `status` + `flags`) — possible future enhancement, kept out to keep the per-entry surface light.
+
+**Reflected in:** `docs/model_interface.md`.
+
+## 1.8 Hindcast — RESOLVED (demoted to optional)
+
+**Decision:** `hindcast` is **removed from the required `ForecastModel` surface** and moved to an optional `BatchHindcastModel(ForecastModel)` sub-protocol with a batch signature `hindcast(artifact, *, inputs, issue_datetimes, rng) -> ModelResult`. SAP3 detects it via `isinstance` and uses the batch path when present; otherwise it **loops `predict`** (its existing behaviour). It is **optional in the type system but strongly recommended** — SAP3 runs hindcasts routinely for skill evaluation, so batch efficiency matters.
+
+**Rationale:** hindcast is functionally "predict over many historical issue times"; the live-vs-archive forcing difference is an *input* difference, not a method difference. A dedicated method buys only batch efficiency, so it is an optimization, not a capability — requiring it would tax every author with a second method. Demoting it makes the required surface minimal (`train` / `predict` / `serialize` / `deserialize`) and is **more** SAP3-consistent (SAP3 has no required `hindcast`). Divergence from Sandro's original (which treated `hindcast` as core) is justified on these grounds and flagged for the Sandro conversation.
+
+**Knock-on:** the varying-per-row `issue_datetime` output schema now arises **only** from the `BatchHindcastModel` path; a plain `ForecastModel` always emits a constant `issue_datetime` (scopes Q6 — see below).
+
+**Reflected in:** `docs/model_interface.md`.
+
+## 1.10 Station identity & group support — RESOLVED (refined)
+
+**Station key type:** FI station keys are opaque `str` (not typed `StationId` / UUID) — FI stays dependency-free.
+
+**But the string carries meaning and must be stable.** Correcting the initial "echo exactly, never interpret" framing: the model **does** use the station key to look up per-station state in its artifact — *the trained station strings are stored inside the artifact* — so the key is a **meaningful, stable identifier**, not an arbitrary token. Rules:
+
+- The same station string must be used consistently across **train → artifact → predict → output**.
+- The model may **read** the key (look it up in its artifact) but must **never alter** it; output is keyed by exactly the strings received.
+- The string must be **stable across deployments** (staging → prod, east → west), because artifacts embed it and must be portable. This argues for a **deployment-stable human / network station code**, NOT a per-DB UUID. **Open coordination item:** confirm the exact string the modeller's artifacts store; the SAP3 adapter maps `StationId` (UUID) ↔ that string (via `StationConfig.code` if it is the code).
+
+**Group support from v1:** `ArtifactScope.GROUP` is load-bearing from the start (consistent with 1.6). Multiple input stations may **share one group artifact**; output stays **1:1 station-in / station-out** — every input station gets an output entry. Grouping is about *artifact sharing*, not output cardinality.
+
+**Station-set mismatch (known vs unknown stations):** a GROUP artifact stores its trained member stations. At predict time, **known** stations use the stored per-station state; **unknown** stations (e.g. western gauges under east→west transfer) must be handled by **generalizing from static attributes or raising an explicit error — never silently mis-associating** a prediction with the wrong station / embedding. This is the embedding-key contract (Nepal §8). **Timing flag:** decision 1.6 puts east→west in Nepal v1 and groups ship from the start, so this contract is likely **v1, not Phase 4** — re-evaluate the deferral.
+
+**Reflected in:** `docs/model_interface.md`, `docs/fi-sap3-mapping.md` (§5, §7), `docs/nepal-model-requirements.md` (§8).
+
+## 1.9 Input bundle (`inputs`) typing & v1 delivery scope — RESOLVED
+
+**Decision:**
+- Replace `Any` for `inputs` with a **concrete FI-owned input-bundle type** (working name `ModelInputs`), **isomorphic to `InputRequirement`**: addressed by the same `(TemporalResolution → SpatialRepresentation → {past_known | future_known} → product → variable)` keys, leaves being the actual time-series DataFrames, plus `static` and the issue context. It is **station-aware** to match the station-keyed output (decision 1.1) for GROUP models. The SAP3 adapter builds it from SAP3's `StationInputData` / `GroupModelInputs`; FI does **not** import SAP3 types.
+- "Parse, don't validate": the bundle the model receives is **valid-by-construction** against its declared `InputRequirement`.
+
+**v1 delivery scope (model developer input):**
+- **Temporal resolution:** daily, hourly, or daily + hourly — all delivered in v1.
+- **Spatial representation:** POINT (runoff / water level), BASIN_AVERAGE and ELEVATION_BAND (most variables) are delivered in v1. **GRIDDED is declarable-but-not-delivered in v1** (reserved for future models), and the gap is loud: SAP3 rejects a GRIDDED declaration at integration rather than silently delivering wrong data.
+- **Product axis:** structure retained, but **v1 delivers a single product per variable.** Multi-product is a future extension (non-breaking — the product dict simply gains keys).
+- The un-delivered dimensions (GRIDDED, multi-product) are SAP3's delivery growth backlog.
+
+**`config` (train / predict): OPEN — modeller-owned** (see Q8). Until specified, `config` stays `Any`.
+
+**Reflected in:** `docs/input_requirement.md`, `docs/fi-sap3-mapping.md` (§4), and a future `ModelInputs` type (code).
+
 ---
 
 # 2. Open questions for the model developer
 
 A decision-ready list. Each needs the model developer's input before the corresponding spec is frozen.
 
-### Q1 — Station ID typing
+### Q1 — Station ID typing — ANSWERED
 
-SAP3 uses a typed `StationId = NewType(..., UUID)`. Should FI expose **opaque `str` station keys** (the adapter maps them to/from `StationId`), or **adopt typed IDs** directly? And: is there any case where the model defines its own spatial units that do **not** map 1:1 to the input station IDs?
+**Answer:** opaque `str` keys (not typed UUIDs), but they **carry meaning and are stable** across train / predict / deployment — the trained station strings are stored inside the artifact and read by the model for per-station lookup. Group artifacts shared across multiple stations are supported **from v1**; output is **1:1 station-in / station-out**; the station-set-mismatch case (unknown stations) is handled by the embedding-key contract (generalize or raise, never silent). **Residual coordination item:** the exact string identity (human code vs UUID-string) must match what the modeller's artifacts store — deployment portability argues for the code. See decision 1.10.
 
-### Q2 — Past-target availability
+### Q2 — Past-target availability — ANSWERED
 
-Do all your models see the **target's own history**, or do some pure-simulation / process-based models forecast a target **without** any past observations of it? This determines whether past-target is a *required* declared input or an optional one.
+**Answer:** target-history is an **optional** declared input — declared under `past_known` only when the model uses it — so both autoregressive and pure-simulation models are supported. The Nepal model **uses past discharge** as input, so it declares discharge under `past_known` in addition to declaring it as a target (decision 1.2).
 
-### Q3 — Quantile minimum
+### Q3 — Quantile minimum — ANSWERED
 
-Is there any valid use case for emitting **fewer than 3 quantiles**? (Reminder: anything below 7 is non-operational in SAP3.)
+**Answer:** FI structural floors are **≥ 3 quantiles** and **≥ 8 trajectories** (deployment-independent hard validators); there is no use case below those. Nepal emits quantiles with the count configurable at training. SAP3's operational floors (≥ 7 quantiles with tails, ≥ 20 members) stay deployment-side and are checked at integration. Deterministic-only output is allowed but non-operational without supplied uncertainty. See decision 1.5.
 
-### Q4 — State reconstruction
+### Q4 — State reconstruction — ANSWERED
 
-Can every stateful model rebuild its internal state from a **sufficiently long lookback window**, or does any model **strictly require persisted hidden state** between calls? (Confirms decision 1.3 covers all cases.)
+**Answer (model developer):** the current model is **pure ML** — it reconstructs its internal state from the lookback window each cycle and requires **no warm-up period and no persisted state**. Warm-up state is required for conceptual / distributed models and may be required for hybrid models, but none are in scope for v1. This confirms decision 1.3: state-free is correct for v0, with the reserved `StatefulModel` extension covering future conceptual / hybrid models.
 
-### Q5 — `VariableMetadata` fields
+### Q5 — `VariableMetadata` fields — ANSWERED
 
-`VariableMetadata` currently has `name`, `unit`, `resolution`, `timedelta`, `forecast_horizon`, `offset`. Three points to settle:
+- **(a) Drop `name`.** Redundant with the `ModelOutput.variables[station][variable]` key; having two sources of truth that can disagree is forbidden by the type ethos. Field removed. *(Code TODO: remove `name` from `VariableMetadata`; update tests.)*
+- **(b) Keep `forecast_horizon`, add a per-issue-block cross-validator.** Kept because the adapter reads it directly (`ForecastEnsemble.forecast_horizon_steps`). Validator: for `predict`, `forecast_horizon == row count`; for batch `hindcast`, `forecast_horizon == rows per issue_datetime` (one block per issue time, decision 1.8). *(Code TODO: add validator.)*
+- **(c) `offset` semantics confirmed:** number of steps (each `timedelta` long) between the **last observation and the first forecast step**. `offset = 1` ⇒ first forecast valid time is `last_obs + 1·timedelta` (usual next-step case); `offset = 2` ⇒ a one-step gap. Model and adapter both assume this convention.
 
-- **(a) `name`** — redundant with the dict key in `ModelOutput.variables`. Drop `name`, or keep it and add a validator enforcing `key == metadata.name`?
-- **(b) `forecast_horizon`** — this **is consumed** by the (designed) adapter: SAP3 doc 014 (lines 149, 228–229) assigns `ForecastEnsemble.forecast_horizon_steps` directly from `VariableMetadata.forecast_horizon`. The open question is **not** whether to keep it (we do) but whether to add a **cross-validator** that it matches the DataFrame row count.
-- **(c) `offset`** — confirm semantics: number of steps (each of length `timedelta`) between the last observed point and the first forecast step.
+**Reflected in:** `docs/model_interface.md`.
 
 ### Q6 — Per-row `issue_datetime` column
 
 The adapter maps `ModelOutput.issue_datetime` → `ForecastEnsemble.issued_at` and renames the per-row datetime column → `valid_time`. The question is whether to **keep the per-row `issue_datetime` column requirement** — with a cross-validator that it matches the top-level `issue_datetime` for forecasts — or **relax it**. Frame this as a validator question, not a removal: the column is not "dropped", it is renamed and re-used.
 
-### Q7 — SnowMapper lead times (residual)
+**Now scoped by decision 1.8:** a plain `ForecastModel` (`predict`) always emits a constant per-row `issue_datetime` equal to the top-level one — so for the required surface this *can* carry a strict cross-validator. The varying-per-row case exists **only** on the optional `BatchHindcastModel` path, where the validator must instead check that the per-row `issue_datetime` matches the batch's declared `issue_datetimes`. So the answer likely differs by protocol: strict equality for `predict`, set-membership for batch `hindcast`.
+
+### Q8 — `config` contents (train / predict) — OPEN (modeller-owned)
+
+What does the model need in `config` at `train` and `predict` time, beyond `inputs` and the injected `rng`? Candidates: training hyperparameters, target quantile levels / trajectory count, forecast horizon, validation split, early-stopping criteria, seeds beyond `rng`. This is **modeller-owned** and must be specified before `config: Any` can be typed (decision 1.9).
+
+### Q7 — SnowMapper lead times — ANSWERED (with caveat)
+
+**Answer:** SnowMapper **SWE** and **RoF** are available at **BASIN_AVERAGE or ELEVATION_BAND** (the same spatial options as weather forcing — this **broadens** decision 1.6's "ELEVATION_BAND only"). They are derived from ECMWF forecasts, so assume the **same lead times and resolutions as the ECMWF forecast (future-known) and ERA5-Land (past / reanalysis)**.
+
+**Caveat — SnowMapper lag:** because the SnowMapper model runs *after* ECMWF, its outputs may **lag** the ECMWF forecasts — the SnowMapper future-known series can be shorter or offset relative to the driving ECMWF series for the same issue time. See Q9.
+
+**Reflected in:** `docs/input_requirement.md`, decision 1.6.
+
+### Q9 — Per-product availability lag — OPEN
 
-The Nepal deployment specifics are otherwise settled (see decision 1.6): the eastern regional group ships first, SnowMapper forcing starts with **SWE** and **ROF**, and artifact transfer is **east → west**. The one residual: which **lead times** of SWE and ROF will the model consume — past-known lookback, future-known horizon, and how far in each?
+Products derived downstream (e.g. SnowMapper SWE / RoF, which run *after* their driving ECMWF forecast) may become available **later** than their nominal forcing — their future-known series lags the issue time. `InputRequirement`'s variable properties (`lookback`, `future_steps`, `max_nan`, `ensemble_mode`) have **no explicit lag / offset** field. Decide whether to **(a)** add a per-variable `availability_lag` (in steps), or **(b)** absorb it via `max_nan` / a shorter `future_steps`. Needs modeller + data-availability input.
diff --git a/forecast_interface/__init__.py b/forecast_interface/__init__.py
index 0517896..e5ff426 100644
--- a/forecast_interface/__init__.py
+++ b/forecast_interface/__init__.py
@@ -1,4 +1,4 @@
-__version__ = "0.1.5"
+__version__ = "0.1.6"
 
 from .input import (
     DynamicInputSpec,
diff --git a/pyproject.toml b/pyproject.toml
index 17f4dd4..704c520 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -1,6 +1,6 @@
 [project]
 name = "forecastinterface"
-version = "0.1.5"
+version = "0.1.6"
 description = "Add your description here"
 readme = "README.md"
 requires-python = ">=3.11"
@@ -13,7 +13,7 @@ dependencies = [
 pythonpath = ["."]
 
 [tool.bumpversion]
-current_version = "0.1.5"
+current_version = "0.1.6"
 commit = false
 tag = false
 allow_dirty = true
diff --git a/uv.lock b/uv.lock
index 777d23f..87219b4 100644
--- a/uv.lock
+++ b/uv.lock
@@ -85,7 +85,7 @@ wheels = [
 
 [[package]]
 name = "forecastinterface"
-version = "0.1.5"
+version = "0.1.6"
 source = { virtual = "." }
 dependencies = [
     { name = "polars" },

From 4f06b9fbac332e7960b3d760994b111f5e78cf72 Mon Sep 17 00:00:00 2001
From: Beatrice Marti <mabesa@users.noreply.github.com>
Date: Tue, 16 Jun 2026 21:41:15 +0200
Subject: [PATCH 06/16] docs(interface): resolve parameter-identity, time-step,
 max_nan & combinability gaps (SAP3 gap-hunt round 2)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Second grilling session — gaps surfaced by re-checking SAPPHIRE_flow:

- Parameter identity (1.11): canonical vocabulary synced with SAP3
  ParameterDefinition; optional per-variable aggregation (SUM/MEAN);
  units first-class on inputs (declared + delivered-or-rejected),
  Unit enum to expand (PERCENT, M_PER_S, DEGREE, W_PER_M2, MM_PER_HOUR)
- Time step (1.12): timedelta, not TemporalResolution enum — driven by
  3h/6h v1 targets; drop resolution from VariableMetadata; calendar
  resolutions out of v1 scope
- max_nan (1.13): SAP3 enforces as pre-predict gate; breach => FAILURE/
  DATA_AVAILABILITY without calling model; within-tolerance delivered as-is
- Combinability (1.14): derived from TRAJECTORIES representation; no FI
  combination machinery; SAP3 owns POOLED/BMA, remapping, VIRTUAL scope
- Q7 answered (SnowMapper SWE/RoF at BASIN_AVERAGE or ELEVATION_BAND,
  ECMWF lead times); Q9 opened (per-product availability lag)
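
A minimal, hypothetical sketch of the max_nan gate (1.13), for orientation
only. It is not SAP3 code: `gate_station`, `run_cycle`, the tuple result
shape and the `VariableStatus` stand-in are assumptions made for this
illustration, and the whole-run ModelFailure case is left out.

```python
import math
from enum import Enum


class VariableStatus(Enum):  # hypothetical stand-in for the FI enum
    SUCCESS = "success"
    FAILURE = "failure"


def gate_station(series: dict[str, list[float]], max_nan: dict[str, int]) -> list[str]:
    """Return the variables whose NaN count exceeds the declared tolerance."""
    return [
        name
        for name, values in series.items()
        if sum(math.isnan(v) for v in values) > max_nan[name]
    ]


def run_cycle(inputs, max_nan, predict):
    """Call `predict` only for stations whose inputs pass the gate."""
    results = {}
    for station, series in inputs.items():
        breached = gate_station(series, max_nan)
        if breached:
            # The model is not called; the failure is recorded directly.
            results[station] = (VariableStatus.FAILURE, "DATA_AVAILABILITY", breached)
        else:
            # Within tolerance: residual NaNs are passed through untouched.
            results[station] = (VariableStatus.SUCCESS, None, predict(station, series))
    return results
```

The point of the sketch is the ordering: the tolerance check runs before the
model, so a model author never has to re-check `max_nan`, but still has to
cope with the NaNs that remain within tolerance.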

Rejected as orchestrator-only (kept out of FI): forcing provenance /
staleness / warm-up / input-quality passed to model, forcing_type,
forecast lifecycle/version, combination mechanism.

Code TODOs accumulated across 1.5/1.8/1.9/1.11/1.12 + Q5; not yet applied.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 docs/input_requirement.md      | 64 ++++++++++++++++++----------------
 docs/model_interface.md        |  3 +-
 docs/open_design_questions.md  | 52 +++++++++++++++++++++++++++
 forecast_interface/__init__.py |  2 +-
 pyproject.toml                 |  4 +--
 uv.lock                        |  2 +-
 6 files changed, 91 insertions(+), 36 deletions(-)

diff --git a/docs/input_requirement.md b/docs/input_requirement.md
index 446e9e5..d59d18b 100644
--- a/docs/input_requirement.md
+++ b/docs/input_requirement.md
@@ -27,7 +27,7 @@ class TargetSpec(BaseModel):
 - **unit** — the physical unit of the target (e.g. `Unit.M3_PER_S`).
 - **representations** — the output forms the model can produce for this target, a non-empty set of `OutputRepresentation`: `deterministic`, `quantiles`, `trajectories`. A target may support more than one form.
 
-The combinability rule (whether a target's forecasts can be BMA-combined) is derived downstream from whether `TRAJECTORIES` is present; it is not encoded here.
+The combinability rule is **derived** from whether `TRAJECTORIES` is present — trajectory output is combinable (pooled / BMA); quantile-only or deterministic output is not. It is not separately encoded, and the combination mechanism (cross-model compatibility checks, weighting, member-id remapping, sentinel models) is entirely SAP3-side (decision 1.14).
 
 Targets are declared **independently** of inputs. A model that needs the target's own past history simply lists that variable under `past_known` in its dynamic inputs; a pure-simulation model omits it. (See Q2 in `open_design_questions.md`.)
 
@@ -40,7 +40,7 @@ Targets are declared **independently** of inputs. A model that needs the target'
 Dynamic inputs follow a strict hierarchy:
 
 ```
-temporal_resolution
+time_step (timedelta)
   └── spatial_resolution
         └── temporality (past_known / future_known)
               └── product
@@ -50,9 +50,11 @@ temporal_resolution
 
 ### Hierarchy Levels
 
-#### 1. Temporal Resolution
+#### 1. Time step
 
-The time step of the data. One of: `sub_hourly`, `hourly`, `sub_daily`, `daily`, `weekly`, `monthly`, `seasonal`, `annual`.
+The time step of the data, expressed as a **`timedelta`** (the `dynamic` dict is keyed by `timedelta`). This is a precise, fixed duration — e.g. `timedelta(hours=1)`, `timedelta(hours=3)`, `timedelta(hours=6)`, `timedelta(hours=24)` — and is an identity match to SAP3's `time_step`. Sub-daily steps such as 3-hourly and 6-hourly are first-class.
+
+Calendar-based resolutions (monthly / seasonal / annual / decadal) are **out of scope for v1** — they are not fixed durations. If forecasting at those steps is later required, a dedicated `TimeStep = timedelta | CalendarResolution` type will be introduced (see decision 1.12).
 
 #### 2. Spatial Representation
 
@@ -95,11 +97,23 @@ Each variable declares the following properties:
 |----------------|--------|---------------|--------------------------------------------------------------|
 | `lookback`     | `int`  | past_known    | Number of past time steps required (must be > 0)             |
 | `future_steps` | `int`  | future_known  | Number of future time steps required (must be > 0)           |
-| `max_nan`      | `int`  | both          | Maximum allowed NaN values in the time series (must be >= 0) |
+| `max_nan`      | `int`  | both          | The model's **tolerance**: max NaNs it can cope with in the series (must be >= 0). **SAP3 enforces this as a pre-`predict` gate** — if exceeded, the model is not called and the station is failed (`DATA_AVAILABILITY`); within tolerance, residual NaNs are delivered **as-is** for the model to handle (decision 1.13). |
 | `ensemble_mode`| `EnsembleMode` | future_known  | Whether ensemble or single traces are needed (`single` or `ensemble`, default: `single`) |
+| `unit`         | `Unit` | both          | **Required.** The physical unit the model expects this variable in (e.g. `Unit.MM_PER_DAY`). The delivered series is tagged with its unit and delivered **in the declared unit, or rejected loudly at integration** — no data without units. (Automatic unit conversion is a future adapter feature.) |
+| `aggregation`  | `AggregationMethod \| None` | both | **Optional.** `SUM` or `MEAN`, used when the declared resolution is coarser than the delivered data. Defaults to the per-parameter convention (precipitation / reference_et = `SUM`; state variables = `MEAN`); declare only to override. |
 
 ---
 
+## Parameter vocabulary, units & aggregation
+
+These three are a single coherent concern (mirroring SAP3's `ParameterDefinition`, which binds name → unit → aggregation) and each is a **coordination contract with SAP3**.
+
+**Canonical parameter names.** Variable names are free strings, but they **must match SAP3's canonical parameter vocabulary** so the preprocessing pipeline can resolve them. Canonical set: `discharge`, `water_level`, `water_temperature`, `precipitation`, `temperature`, `relative_humidity`, `wind_speed`, `wind_direction`, `global_radiation`, `reference_et`, `snow_water_equivalent`, `runoff` (ROF). SAP3 soft-checks names at integration and rejects unknowns. **Sync obligation:** keep this list aligned with SAP3 and update it whenever a variable is added on either side.
+
+**Units.** Every input variable declares the `unit` it expects (see properties table); outputs declare units via `TargetSpec` / `VariableMetadata`. The `Unit` enum must cover every parameter's unit and is a **sync contract with SAP3's `ParameterDefinition` units** — extended as needed (current additions for forcing: `PERCENT`, `M_PER_S`, `DEGREE`, `W_PER_M2`, `MM_PER_HOUR`).
+
+**Aggregation.** When a model declares a variable at a resolution coarser than the delivered data, SAP3 aggregates with `SUM` or `MEAN`. Default follows the per-parameter convention (precipitation / reference_et = `SUM`; temperature, discharge, SWE and other state variables = `MEAN`); override via the optional `aggregation` property only for a non-default rule.
+
 ## Static Inputs
 
 An unordered set of variable names (`set[str]`). Duplicates are ignored.
@@ -119,7 +133,7 @@ targets:
       - trajectories
 
 dynamic:
-  daily:
+  PT24H:          # timedelta(hours=24) — daily
     data:
       basin_average:
         past_known:
@@ -127,33 +141,24 @@ dynamic:
             discharge:
               lookback: 365
               max_nan: 10
+              unit: "m³/s"
             precipitation:
               lookback: 30
               max_nan: 5
+              unit: "mm/day"
+              aggregation: sum   # sums hourly precip into daily; matches the precipitation default, shown explicitly
         future_known:
-          GFS:
+          ECMWF:
             precipitation:
-              future_steps: 10
+              future_steps: 15
               max_nan: 0
               ensemble_mode: ensemble
+              unit: "mm/day"
             temperature:
-              future_steps: 10
-              max_nan: 0
-              ensemble_mode: single
-          ECMWF:
-            precipitation:
               future_steps: 15
               max_nan: 0
-              ensemble_mode: ensemble
-      gridded:
-        past_known:
-          ERA5:
-            swe:
-              lookback: 90
-              max_nan: 5
-            precipitation:
-              lookback: 30
-              max_nan: 3
+              ensemble_mode: single
+              unit: "°C"
       elevation_band:
         future_known:
           SnowMapper:
@@ -161,11 +166,13 @@ dynamic:
               future_steps: 10
               max_nan: 0
               ensemble_mode: single
-            rof:
+              unit: "mm"
+            runoff:
               future_steps: 10
               max_nan: 0
               ensemble_mode: single
-  hourly:
+              unit: "mm"
+  PT6H:           # timedelta(hours=6) — 6-hourly (sub-daily, now first-class)
     data:
       basin_average:
         past_known:
@@ -173,12 +180,7 @@ dynamic:
             discharge:
               lookback: 72
               max_nan: 2
-        future_known:
-          INCA:
-            precipitation:
-              future_steps: 48
-              max_nan: 0
-              ensemble_mode: single
+              unit: "m³/s"
 
 static:
   - catchment_area
diff --git a/docs/model_interface.md b/docs/model_interface.md
index f0f077d..8a57191 100644
--- a/docs/model_interface.md
+++ b/docs/model_interface.md
@@ -160,7 +160,7 @@ Groups data for a single output variable (within a single station):
 
 | Field | Type | Description |
 |---|---|---|
-| `metadata` | `VariableMetadata` | `unit`, `resolution`, `timedelta`, `forecast_horizon`, `offset` (no `name` — the variable name is the dict key). See *Metadata semantics* below. |
+| `metadata` | `VariableMetadata` | `unit`, `timedelta`, `forecast_horizon`, `offset` (no `name` — variable name is the dict key; no `resolution` enum — `timedelta` is the single time-step source). See *Metadata semantics* below. |
 | `deterministic` | `DeterministicData \| None` | Point forecast |
 | `quantiles` | `QuantileData \| None` | Quantile forecast |
 | `trajectories` | `TrajectoryData \| None` | Ensemble trajectories |
@@ -177,6 +177,7 @@ At least one of `deterministic`, `quantiles`, or `trajectories` must be present
 - **`forecast_horizon`** — number of forecast steps; **consumed directly by the SAP3 adapter** (`ForecastEnsemble.forecast_horizon_steps`). A cross-validator enforces it against the data: for `predict`, `forecast_horizon` equals the row count; for batch `hindcast` it equals the rows **per `issue_datetime`** (one block per issue time).
 - **`offset`** — number of steps (each `timedelta` long) between the **last observation and the first forecast step**. `offset = 1` ⇒ the first forecast valid time is `last_obs + 1·timedelta` (the usual next-step case); `offset = 2` ⇒ a one-step gap.
 - **No `name`** — the variable name is the `ModelOutput.variables[station][variable]` dict key; duplicating it in metadata is omitted to avoid two disagreeing sources of truth.
+- **`timedelta` is the single time-step source** — there is no `resolution` enum (it would be a second source of truth that could disagree with `timedelta`). The time step is a precise `timedelta`, matching the `timedelta`-keyed input requirement (decision 1.12).
 
 ### ForecastFlag
 
diff --git a/docs/open_design_questions.md b/docs/open_design_questions.md
index e9f5a77..aadde73 100644
--- a/docs/open_design_questions.md
+++ b/docs/open_design_questions.md
@@ -119,6 +119,58 @@ Banded Snowmapper SWE / snowmelt is declared at `ELEVATION_BAND`.
 
 **Reflected in:** `docs/model_interface.md`.
 
+## 1.14 Output combinability — RESOLVED
+
+**Decision:** combinability is **derived from the output representation**, with no new FI machinery.
+
+- A target supporting **`TRAJECTORIES`** is combinable; **quantile-only or deterministic** output is **not** (SAP3's combination consumes MEMBERS and skips QUANTILES). No `CombinableModel` flag, no opt-out — eligibility rides on the representation.
+- The model owes only **per-model internal consistency**: all trajectory members share the same valid times, unit, and horizon (already guaranteed by `TrajectoryData` + `VariableMetadata`). **No cross-model stable member ids** — SAP3 remaps ids (POOLED offsets, BMA resamples).
+- **SAP3 owns the combination mechanism entirely:** cross-model compatibility checks (issue / valid times, unit, horizon must agree across models at a station), POOLED/BMA weighting, the `_pooled` / `_bma` sentinel models and `VIRTUAL` scope, and model selection (fallbacks excluded by priority ≥ 90). FI models are unaware they are combined.
+
+**Rationale:** keeps FI minimal; combination is orchestration, not a model-author concern. Confirms FI's existing "combinability derived from TRAJECTORIES" stance and the deliberate omission of `VIRTUAL` scope.
+
+**Reflected in:** `docs/input_requirement.md` (Targets), `docs/model_interface.md` (trajectories).
+
+## 1.13 `max_nan` enforcement — RESOLVED
+
+**Decision:** `max_nan` is the model's **declared per-variable tolerance**, and **SAP3 enforces it as a pre-`predict` gate** — the model is only ever called with inputs within tolerance (the `ModelInputs` bundle is valid-by-construction w.r.t. `max_nan`, per decision 1.9).
+
+- NaNs **exceed** `max_nan` → SAP3 does **not** call the model for that station; it records the failure directly: per-station `VariableStatus.FAILURE` with the `DATA_AVAILABILITY` signal if other stations are serviceable, else whole-run `ModelFailure` (decision 1.7 two-level rule).
+- NaNs **within** tolerance (≤ `max_nan`) → data delivered **as-is**; residual NaNs remain and the model handles them (impute/mask). `max_nan` gates the *unacceptable*; it does not promise zero NaNs.
+
+**Rationale:** keeps models free of defensive checks, puts the gate where the data lives (in SAP3, which today delivers raw NaNs ungated; this decision closes that gap), and reuses the existing failure vocabulary (no new types).
+
+**Reflected in:** `docs/input_requirement.md`.
+
+## 1.12 Time step: `timedelta`, not an enum — RESOLVED
+
+**Decision:** represent time step as **`timedelta`**, not the `TemporalResolution` enum. Driven by the v1 requirement for **3-hourly and 6-hourly** steps, which the coarse enum cannot express (`SUB_DAILY` cannot distinguish 3h from 6h — they would collide under one dict key).
+
+- **Input:** the `dynamic` dict is keyed by **`timedelta`** (e.g. `timedelta(hours=1)`, `timedelta(hours=3)`, `timedelta(hours=6)`, `timedelta(hours=24)`). This is an identity match to SAP3's `time_step` / `supported_time_steps` (already `timedelta`), which *reduces* divergence, since FI's `TemporalResolution` had no SAP3 counterpart.
+- **Output:** **drop the redundant `resolution` enum from `VariableMetadata`**; keep `timedelta` as the single source of truth (same two-sources-of-truth reasoning as dropping `name`, Q5).
+- **Retire `TemporalResolution`** from the v1 contract. **Calendar resolutions** (`MONTHLY` / `SEASONAL` / `ANNUAL`, decadal — genuinely not fixed durations) are **out of v1 scope**; if SAPPHIRE later needs pentadal/decadal/monthly forecasting, introduce a dedicated `TimeStep = timedelta | CalendarResolution` type then.
+
+**Deviation from Sandro:** the input hierarchy's level-1 "temporal resolution" is now keyed by `timedelta` rather than an enum — on the Sandro list — but forced by the 3h/6h requirement and more SAP3-consistent.
+
+*(Code TODO: change `InputRequirement.dynamic` key type to `timedelta`; remove `resolution` from `VariableMetadata`; remove `TemporalResolution` from `common/resolutions.py`; update validators and tests.)*
+
+**Reflected in:** `docs/input_requirement.md`, `docs/model_interface.md`.
+
+## 1.11 Parameter identity: vocabulary, units, aggregation — RESOLVED
+
+**Context:** SAP3 binds parameter name → unit → aggregation in one `ParameterDefinition`. FI had scattered this (free-string names, a closed disconnected `Unit` enum, no aggregation). Closed via three coordinated moves; all three are **sync contracts with SAP3** (keep aligned; update FI lists whenever a variable/unit is added on either side).
+
+**(a) Vocabulary — documented canonical names.** Variable names stay free strings but must match SAP3's canonical set (`discharge`, `water_level`, `water_temperature`, `precipitation`, `temperature`, `relative_humidity`, `wind_speed`, `wind_direction`, `global_radiation`, `reference_et`, `snow_water_equivalent`, `runoff`). SAP3 soft-checks at integration. A documented contract, **not** a hard FI enum (avoids tracking SAP3's evolving set).
+
+**(b) Aggregation — optional per-variable override.** Add optional `aggregation: AggregationMethod` (`SUM`/`MEAN`, mirroring SAP3) to `PastKnownVariable` / `FutureKnownVariable`, used when the declared resolution is coarser than delivered data. Default = per-parameter convention (precip/ref_et = SUM, rest = MEAN); declared only to override. Correctness-critical, hence expressible, but optional. *(Code TODO: add `AggregationMethod` enum + optional field.)*
+
+**(c) Units — first-class on inputs and outputs ("no data without units").**
+- Inputs: model **declares the expected `unit`** per input variable (`PastKnownVariable` / `FutureKnownVariable` gain `unit: Unit`); delivered `ModelInputs` series are **tagged with unit**; SAP3 **delivers in the declared unit or rejects loudly at integration** (auto-conversion deferred to a future adapter feature).
+- Outputs: unchanged — `TargetSpec` / `VariableMetadata` already declare units.
+- **`Unit` enum expansion:** add `PERCENT` (%), `M_PER_S` (m/s), `DEGREE`, `W_PER_M2` (W/m²), `MM_PER_HOUR`. *(Code TODO: expand `common/units.py`; add `unit` field to `input/variable.py`.)*
+
+**Reflected in:** `docs/input_requirement.md`, `docs/model_interface.md`, and code (`common/units.py`, `input/variable.py`).
+
 ## 1.10 Station identity & group support — RESOLVED (refined)
 
 **Station key type:** FI station keys are opaque `str` (not typed `StationId` / UUID) — FI stays dependency-free.
diff --git a/forecast_interface/__init__.py b/forecast_interface/__init__.py
index e5ff426..7fa4927 100644
--- a/forecast_interface/__init__.py
+++ b/forecast_interface/__init__.py
@@ -1,4 +1,4 @@
-__version__ = "0.1.6"
+__version__ = "0.1.7"
 
 from .input import (
     DynamicInputSpec,
diff --git a/pyproject.toml b/pyproject.toml
index 704c520..1a13ff5 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -1,6 +1,6 @@
 [project]
 name = "forecastinterface"
-version = "0.1.6"
+version = "0.1.7"
 description = "Add your description here"
 readme = "README.md"
 requires-python = ">=3.11"
@@ -13,7 +13,7 @@ dependencies = [
 pythonpath = ["."]
 
 [tool.bumpversion]
-current_version = "0.1.6"
+current_version = "0.1.7"
 commit = false
 tag = false
 allow_dirty = true
diff --git a/uv.lock b/uv.lock
index 87219b4..5536764 100644
--- a/uv.lock
+++ b/uv.lock
@@ -85,7 +85,7 @@ wheels = [
 
 [[package]]
 name = "forecastinterface"
-version = "0.1.6"
+version = "0.1.7"
 source = { virtual = "." }
 dependencies = [
     { name = "polars" },

From c3a043486af9468dd2f76d58cbbd8a6bd7c09ab2 Mon Sep 17 00:00:00 2001
From: Beatrice Marti <mabesa@users.noreply.github.com>
Date: Tue, 16 Jun 2026 22:20:41 +0200
Subject: [PATCH 07/16] feat: implement A1 model interface contract

---
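Note on the quantile floor (illustration only, not part of the patch): this
commit raises FI's `QuantileData` floor to at least 3 levels, while
docs/fi-sap3-mapping.md states that SAP3's `from_quantiles()` needs at least 7
levels with tail coverage (min <= 0.05 and max >= 0.95). The sketch below
restates both rules in plain Python to show the gap. The function names
`fi_structurally_valid` and `sap3_operationally_usable` are hypothetical
helpers, not FI or SAP3 API, and the SAP3 rule is taken from the doc text
rather than from SAP3's code.

```python
def fi_structurally_valid(levels: list[float]) -> bool:
    """FI floor as documented: >= 3 levels, each in (0, 1), sorted and unique."""
    return (
        len(levels) >= 3
        and all(0 < q < 1 for q in levels)
        # "sorted & unique" is the same as strictly increasing
        and all(a < b for a, b in zip(levels, levels[1:]))
    )


def sap3_operationally_usable(levels: list[float]) -> bool:
    """SAP3 floor as described in the mapping doc: >= 7 levels with tail coverage."""
    # len() is checked first, so min()/max() never see an empty list
    return len(levels) >= 7 and min(levels) <= 0.05 and max(levels) >= 0.95


three_levels = [0.1, 0.5, 0.9]
seven_levels = [0.05, 0.1, 0.25, 0.5, 0.75, 0.9, 0.95]

for name, levels in (("three", three_levels), ("seven", seven_levels)):
    print(name, fi_structurally_valid(levels), sap3_operationally_usable(levels))
```

A model emitting `three_levels` passes the FI check but would fall into the
"structurally valid, not operationally usable" case described in the doc.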
 README.md                                    |  24 +-
 docs/fi-sap3-mapping.md                      |  17 +-
 docs/model_interface.md                      |   1 +
 forecast_interface/__init__.py               |   6 +-
 forecast_interface/common/__init__.py        |   5 +-
 forecast_interface/common/aggregation.py     |   6 +
 forecast_interface/common/resolutions.py     |  11 -
 forecast_interface/common/units.py           |   5 +
 forecast_interface/input/__init__.py         |   8 +-
 forecast_interface/input/requirement.py      |  20 +-
 forecast_interface/input/variable.py         |   7 +
 forecast_interface/output/__init__.py        |   2 -
 forecast_interface/output/metadata.py        |  12 +-
 forecast_interface/output/variable_output.py |  36 ++-
 pyproject.toml                               |   4 +-
 tests/test_input.py                          | 234 +++++++++++++++----
 tests/test_interface.py                      |  16 +-
 tests/test_output.py                         | 218 +++++++++++++----
 uv.lock                                      |   2 +-
 19 files changed, 458 insertions(+), 176 deletions(-)
 create mode 100644 forecast_interface/common/aggregation.py
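
Note on the `forecast_horizon` cross-validator (illustration only, not part of
the patch): `VariableOutput._validate_forecast_horizon` groups each data
container by `issue_datetime` and requires exactly `forecast_horizon` rows per
group. The same rule can be sketched without polars, which may help when
reading the new tests. `check_rows_per_issue` is a hypothetical stand-in that
takes the `issue_datetime` column as a plain list; it is not the FI
implementation.

```python
from collections import Counter
from datetime import datetime


def check_rows_per_issue(issue_datetimes: list[datetime], forecast_horizon: int) -> None:
    """Raise ValueError unless every issue_datetime block has forecast_horizon rows."""
    if not issue_datetimes:
        raise ValueError("data must not be empty")
    # One count per distinct issue time, i.e. one block per hindcast issue
    rows_per_issue = Counter(issue_datetimes)
    for rows in rows_per_issue.values():
        if rows != forecast_horizon:
            raise ValueError(
                f"data must contain exactly {forecast_horizon} rows "
                f"per issue_datetime (got {rows})"
            )


# Two issue times with two forecast rows each (a small batch hindcast)
first_issue = datetime(2024, 6, 1, 6, 0)
second_issue = datetime(2024, 6, 2, 6, 0)
issue_column = [first_issue, first_issue, second_issue, second_issue]

check_rows_per_issue(issue_column, forecast_horizon=2)  # passes silently
```

With `forecast_horizon=4` the same column would be rejected, because the count
is per issue block and not over the whole frame.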

diff --git a/README.md b/README.md
index 8ccd9e6..fc09ae4 100644
--- a/README.md
+++ b/README.md
@@ -36,11 +36,9 @@ VariableOutput
     flags: frozenset[ForecastFlag]
 
 VariableMetadata
-    name: str
     unit: Unit                             # e.g. Unit.M3_PER_S → "m³/s"
-    resolution: TemporalResolution                 # e.g. TemporalResolution.DAILY
     timedelta: timedelta                   # time step between forecast points
-    forecast_horizon: int                  # number of forecast steps (> 0)
+    forecast_horizon: int                  # forecast steps per issue_datetime block (> 0)
     offset: int                            # offset in steps (>= 0)
 
 DeterministicData
@@ -51,7 +49,7 @@ QuantileData
     data: pl.DataFrame                     # columns: ["issue_datetime", "datetime", "0.1", "0.5", "0.9"]
 
 TrajectoryData
-    num_samples: int                       # number of ensemble members (> 0)
+    num_samples: int                       # number of ensemble members (>= 8)
     data: pl.DataFrame                     # columns: ["issue_datetime", "datetime", "1", "2", ..., "N"]
 
 EpistemicUncertaintyData
@@ -71,9 +69,9 @@ All DataFrames are validated on construction:
 
 ### Enums
 
-**Unit** -- `M3_PER_S`, `MM_PER_DAY`, `MM_PER_S`, `MM`, `CM`, `M`, `DEG_C`, `UNITLESS`
+**Unit** -- `M3_PER_S`, `MM_PER_DAY`, `MM_PER_S`, `MM`, `CM`, `M`, `DEG_C`, `UNITLESS`, `PERCENT`, `M_PER_S`, `DEGREE`, `W_PER_M2`, `MM_PER_HOUR`
 
-**TemporalResolution** -- `SUB_HOURLY`, `HOURLY`, `SUB_DAILY`, `DAILY`, `WEEKLY`, `MONTHLY`, `SEASONAL`, `ANNUAL`
+**AggregationMethod** -- `SUM`, `MEAN`
 
 **VariableStatus** -- `SUCCESS`, `FAILURE`, `PARTIAL`
 
@@ -86,7 +84,7 @@ from datetime import datetime, timedelta
 import polars as pl
 from forecast_interface import (
     ModelOutput, VariableOutput, VariableMetadata,
-    DeterministicData, QuantileData, Unit, TemporalResolution, VariableStatus,
+    DeterministicData, QuantileData, Unit, VariableStatus,
 )
 
 issue_dt = datetime(2024, 6, 1, 6, 0)
@@ -97,11 +95,9 @@ output = ModelOutput(
         "station_1": {
             "streamflow": VariableOutput(
                 metadata=VariableMetadata(
-                    name="streamflow",
                     unit=Unit.M3_PER_S,
-                    resolution=TemporalResolution.DAILY,
                     timedelta=timedelta(days=1),
-                    forecast_horizon=10,
+                    forecast_horizon=2,
                     offset=0,
                 ),
                 deterministic=DeterministicData(
@@ -131,7 +127,7 @@ See [Input Requirement Specification](docs/input_requirement.md) for full docume
 ```
 InputRequirement
     targets: dict[str, TargetSpec]                  # what the model forecasts
-    dynamic: dict[TemporalResolution, SpatialInputSpec]
+    dynamic: dict[timedelta, SpatialInputSpec]
     static: set[str]
 
 TargetSpec
@@ -148,10 +144,14 @@ DynamicInputSpec
 PastKnownVariable
     lookback: int
     max_nan: int
+    unit: Unit
+    aggregation: AggregationMethod | None
 
 FutureKnownVariable
     future_steps: int
     max_nan: int
+    unit: Unit
+    aggregation: AggregationMethod | None
     ensemble_mode: EnsembleMode    # SINGLE or ENSEMBLE
 ```
 
@@ -160,3 +160,5 @@ FutureKnownVariable
 **SpatialRepresentation** -- `POINT`, `BASIN_AVERAGE`, `ELEVATION_BAND`, `GRIDDED`
 
 **OutputRepresentation** -- `DETERMINISTIC`, `QUANTILES`, `TRAJECTORIES`
+
+**AggregationMethod** -- `SUM`, `MEAN`
diff --git a/docs/fi-sap3-mapping.md b/docs/fi-sap3-mapping.md
index edba7d4..da2be3b 100644
--- a/docs/fi-sap3-mapping.md
+++ b/docs/fi-sap3-mapping.md
@@ -87,7 +87,7 @@ divergence table (lines 138–151) on the FI side.
 |---|---|---|
 | `ModelOutput` (`output/model_output.py:9`) | `tuple[dict[str, ForecastEnsemble], bytes \| None]` (`protocols/forecast_model.py:38`) | Convert whole container → forecast dict + state bytes. |
 | `VariableOutput.deterministic` / `.quantiles` / `.trajectories` (`output/variable_output.py:109–111`) | `ForecastEnsemble` (`types/ensemble.py:18`) | Pick whichever is populated; route to the matching factory. |
-| `TrajectoryData` (`output/variable_output.py:64`) | `ForecastEnsemble.from_members()` (`types/ensemble.py:39`) → `MEMBERS` | Reshape member columns → `member_id`/`value`; ≥1 member (SAP3 line 60–62). |
+| `TrajectoryData` (`output/variable_output.py:64`) | `ForecastEnsemble.from_members()` (`types/ensemble.py:39`) → `MEMBERS` | Reshape member columns → `member_id`/`value`; FI enforces ≥8 members. |
 | `QuantileData` (`output/variable_output.py:33`) | `ForecastEnsemble.from_quantiles()` (`types/ensemble.py:76`) → `QUANTILES` | Reshape quantile columns → `quantile`/`value`. **See operational gap below.** |
 | `DeterministicData` (`output/variable_output.py:20`) | single-member `MEMBERS` ensemble (`types/ensemble.py:39`) | Wrap the single `value` column as `member_id=1`; flagged `insufficient_ensemble_size`, skips operational alert thresholds (doc 014 lines 190–196). |
 | `EpistemicUncertaintyData` (`output/variable_output.py:90`) | — (no SAP3 target) | **Dropped at the boundary in v0b** (FI-only; doc 014 lines 197–204). Revisit if models emit it. |
@@ -96,10 +96,9 @@ divergence table (lines 138–151) on the FI side.
 | `VariableMetadata.unit: Unit` (`output/metadata.py:11`; enum `common/units.py:4`) | `ForecastEnsemble.units: str` (`types/ensemble.py:24`) | Map `Unit` enum → SAP3 canonical unit string (table below). |
 | `ModelOutput.issue_datetime` (`output/model_output.py:13`) | `ForecastEnsemble.issued_at: UtcDatetime` (`types/ensemble.py:22`) | Apply `ensure_utc()`. |
 | per-row `datetime` column (all FI data containers) | `valid_time` column (SAP3 factories require it: `types/ensemble.py:54,91`) | Rename `datetime` → `valid_time`. |
-| `VariableMetadata.forecast_horizon: int` (`output/metadata.py:14`) | `ForecastEnsemble.forecast_horizon_steps: int` (`types/ensemble.py:25`) | **DIRECT** — both int, both step counts. **`forecast_horizon` IS consumed by the adapter** (corrects any prior "never consumed" belief). See note below. |
-| `VariableMetadata.timedelta: timedelta` (`output/metadata.py:13`) | `ForecastEnsemble.time_step: timedelta` (`types/ensemble.py:23`) | **DIRECT** assignment. |
-| `VariableMetadata.resolution: TemporalResolution` (`output/metadata.py:12`; enum `common/resolutions.py:4`) | — (no direct target) | Categorical label only; **cross-validate** against `timedelta`, never the conversion source. |
-| `ModelOutput.variables` inner key / `VariableMetadata.name` (`output/model_output.py`, `output/metadata.py:10`) | `ForecastEnsemble.parameter: str` (`types/ensemble.py:23`) | Validate against `ForecastParameter = Literal["discharge","water_level"]` and `ModelDataRequirements.target_parameters` (`types/model.py:261`). |
+| `VariableMetadata.forecast_horizon: int` (`output/metadata.py:11`) | `ForecastEnsemble.forecast_horizon_steps: int` (`types/ensemble.py:25`) | **DIRECT** — both int, both step counts. **`forecast_horizon` IS consumed by the adapter** (corrects any prior "never consumed" belief). See note below. |
+| `VariableMetadata.timedelta: timedelta` (`output/metadata.py:10`) | `ForecastEnsemble.time_step: timedelta` (`types/ensemble.py:23`) | **DIRECT** assignment; the adapter derives the time step from this field. |
+| `ModelOutput.variables` inner key (`output/model_output.py`) | `ForecastEnsemble.parameter: str` (`types/ensemble.py:23`) | Variable name is the dict key; `VariableMetadata.name` was removed. Validate against `ForecastParameter = Literal["discharge","water_level"]` and `ModelDataRequirements.target_parameters` (`types/model.py:261`). |
 | `ModelOutput.variables` outer key (`output/model_output.py`) | `StationId` (`types/ids.py`) | Station id (opaque `str` on FI side, Q1); adapter maps str → typed `StationId` per GROUP-path decomposition (§5). |
 | empty `ModelOutput.variables` **or** all-`FAILURE` | `ModelOutputError` (`exceptions.py:17`) | Adapter **raises** — zero usable ensembles (doc 014 lines 160–168, 218–223). |
 
@@ -143,11 +142,11 @@ ASCII canonical string from its `parameters` table (doc 014 line 147). The adapt
 
 ### `QuantileData` operational gap (FI valid ≠ SAP3 usable)
 
-FI's `QuantileData` requires only **≥1** quantile level in `(0,1)`, sorted & unique
+FI's `QuantileData` requires **≥3** quantile levels in `(0,1)`, sorted & unique
 (`output/variable_output.py:39–51`). SAP3's `from_quantiles()` requires **≥7** quantile
 levels **with tail coverage** (min ≤ 0.05 and max ≥ 0.95) (`types/ensemble.py:98–106`).
 
-Consequently an FI model emitting fewer than 7 quantiles (or without tail coverage) is
+Consequently an FI model emitting 3–6 quantiles (or without tail coverage) is
 **structurally valid FI output but NOT operationally usable by SAP3** — `from_quantiles()`
 raises `ValueError`. State this to model authors explicitly: FI's quantile floor is a
 permissive structural minimum; SAP3's operational floor is stricter.
@@ -193,7 +192,7 @@ discrepancy note at the end of this section.
 | `InputRequirement.static` (`input/requirement.py:37`) | `static_features: frozenset[str]` (line 264) |
 | `PastKnownVariable.lookback` (`input/variable.py:12`) | `lookback_steps: int` (line 266) |
 | `FutureKnownVariable.future_steps` (`input/variable.py:31`) | `forecast_horizon_steps: int` (line 267) |
-| `TemporalResolution` keys + `VariableMetadata.timedelta` | `supported_time_steps: frozenset[timedelta]` (line 265) |
+| `InputRequirement.dynamic` `timedelta` keys + `VariableMetadata.timedelta` | `supported_time_steps: frozenset[timedelta]` (line 265) |
 | `SpatialRepresentation` keys (`input/requirement.py`) | `spatial_input_type: SpatialRepresentation` (line 268) |
 | `InputRequirement.targets` keys + `TargetSpec.unit`/`.representations` (`input/target.py`) | `target_parameters: frozenset[str]` (line 261) |
 | `PastKnownVariable.max_nan` / `FutureKnownVariable.max_nan` (`input/variable.py:13,32`) | Derivable from SAP3 QC config (doc 014 line 273) |
@@ -323,5 +322,5 @@ FI/artifact side answers *"what is this model and how was it built"*; SAP3 side
 | 5 | Epistemic uncertainty | `EpistemicUncertaintyData` is dropped at the boundary in v0b (doc 014 lines 197–204). Revisit (add to `ForecastEnsemble` / store as metadata) if models emit it. |
 | 6 | Interface module now exists | doc 014 assumes FI's `interface/` is unimplemented (lines 80, 287). It is now implemented (`ForecastModel`, `ModelResult`, `FailureCause`). SAP3 should re-evaluate Tasks 4–5 against the real protocol. |
 | 7 | `ModelResult` failure channel | FI now returns `ModelResult = ModelSuccess \| ModelFailure` (`interface/result.py:40`) with a `FailureCause` enum (`interface/failure.py:4`). SAP3's `ModelOutputError` path must account for the `ModelFailure` branch, not only all-`FAILURE` `ModelOutput`. |
-| 8 | Resolution enum split | FI split into `TemporalResolution` + `SpatialRepresentation` (`common/resolutions.py`); doc 014 references a single `Resolution`. Mapping tables above use the current split. |
+| 8 | Resolution enum split | FI now keeps `SpatialRepresentation` (`common/resolutions.py`) and uses `timedelta` for time steps; doc 014 references a single `Resolution`. Mapping tables above use the current FI shape. |
 ```
diff --git a/docs/model_interface.md b/docs/model_interface.md
index 8a57191..c8c2246 100644
--- a/docs/model_interface.md
+++ b/docs/model_interface.md
@@ -175,6 +175,7 @@ At least one of `deterministic`, `quantiles`, or `trajectories` must be present
 ### Metadata semantics
 
 - **`forecast_horizon`** — number of forecast steps; **consumed directly by the SAP3 adapter** (`ForecastEnsemble.forecast_horizon_steps`). A cross-validator enforces it against the data: for `predict`, `forecast_horizon` equals the row count; for batch `hindcast` it equals the rows **per `issue_datetime`** (one block per issue time).
+- `forecast_horizon` equals the actual forecast steps present per issue block, so a `PARTIAL` / short forecast declares a smaller `forecast_horizon` and still satisfies the validator.
 - **`offset`** — number of steps (each `timedelta` long) between the **last observation and the first forecast step**. `offset = 1` ⇒ the first forecast valid time is `last_obs + 1·timedelta` (the usual next-step case); `offset = 2` ⇒ a one-step gap.
 - **No `name`** — the variable name is the `ModelOutput.variables[station][variable]` dict key; duplicating it in metadata is omitted to avoid two disagreeing sources of truth.
 - **`timedelta` is the single time-step source** — there is no `resolution` enum (it would be a second, disagreeable source of truth). The time step is a precise `timedelta`, matching the `timedelta`-keyed input requirement (decision 1.12).
diff --git a/forecast_interface/__init__.py b/forecast_interface/__init__.py
index 7fa4927..d1cb529 100644
--- a/forecast_interface/__init__.py
+++ b/forecast_interface/__init__.py
@@ -1,5 +1,6 @@
-__version__ = "0.1.7"
+__version__ = "0.1.8"
 
+from .common import AggregationMethod
 from .input import (
     DynamicInputSpec,
     EnsembleMode,
@@ -27,7 +28,6 @@
     ForecastFlag,
     ModelOutput,
     QuantileData,
-    TemporalResolution,
     TrajectoryData,
     Unit,
     VariableMetadata,
@@ -36,6 +36,7 @@
 )
 
 __all__ = [
+    "AggregationMethod",
     "ArtifactScope",
     "DeterministicData",
     "DynamicInputSpec",
@@ -57,7 +58,6 @@
     "SpatialInputSpec",
     "SpatialRepresentation",
     "TargetSpec",
-    "TemporalResolution",
     "TrainedArtifact",
     "TrajectoryData",
     "Unit",
diff --git a/forecast_interface/common/__init__.py b/forecast_interface/common/__init__.py
index 4615ca6..f414bf9 100644
--- a/forecast_interface/common/__init__.py
+++ b/forecast_interface/common/__init__.py
@@ -1,8 +1,9 @@
-from .resolutions import SpatialRepresentation, TemporalResolution
+from .aggregation import AggregationMethod
+from .resolutions import SpatialRepresentation
 from .units import Unit
 
 __all__ = [
+    "AggregationMethod",
     "SpatialRepresentation",
-    "TemporalResolution",
     "Unit",
 ]
diff --git a/forecast_interface/common/aggregation.py b/forecast_interface/common/aggregation.py
new file mode 100644
index 0000000..dc6df68
--- /dev/null
+++ b/forecast_interface/common/aggregation.py
@@ -0,0 +1,6 @@
+from enum import Enum
+
+
+class AggregationMethod(Enum):
+    SUM = "sum"
+    MEAN = "mean"
diff --git a/forecast_interface/common/resolutions.py b/forecast_interface/common/resolutions.py
index 95a2a92..f4ba1f6 100644
--- a/forecast_interface/common/resolutions.py
+++ b/forecast_interface/common/resolutions.py
@@ -1,17 +1,6 @@
 from enum import Enum
 
 
-class TemporalResolution(Enum):
-    SUB_HOURLY = "sub_hourly"
-    HOURLY = "hourly"
-    SUB_DAILY = "sub_daily"
-    DAILY = "daily"
-    WEEKLY = "weekly"
-    MONTHLY = "monthly"
-    SEASONAL = "seasonal"
-    ANNUAL = "annual"
-
-
 class SpatialRepresentation(Enum):
     POINT = "point"
     BASIN_AVERAGE = "basin_average"
diff --git a/forecast_interface/common/units.py b/forecast_interface/common/units.py
index fb90766..9c2818c 100644
--- a/forecast_interface/common/units.py
+++ b/forecast_interface/common/units.py
@@ -10,3 +10,8 @@ class Unit(Enum):
     M = "m"
     DEG_C = "°C"
     UNITLESS = "-"
+    PERCENT = "%"
+    M_PER_S = "m/s"
+    DEGREE = "°"
+    W_PER_M2 = "W/m²"
+    MM_PER_HOUR = "mm/hour"
diff --git a/forecast_interface/input/__init__.py b/forecast_interface/input/__init__.py
index baf731d..9359bf5 100644
--- a/forecast_interface/input/__init__.py
+++ b/forecast_interface/input/__init__.py
@@ -1,7 +1,5 @@
-from forecast_interface.common.resolutions import (
-    SpatialRepresentation,
-    TemporalResolution,
-)
+from forecast_interface.common.aggregation import AggregationMethod
+from forecast_interface.common.resolutions import SpatialRepresentation
 
 from .requirement import (
     DynamicInputSpec,
@@ -12,6 +10,7 @@
 from .variable import EnsembleMode, FutureKnownVariable, PastKnownVariable
 
 __all__ = [
+    "AggregationMethod",
     "DynamicInputSpec",
     "EnsembleMode",
     "FutureKnownVariable",
@@ -21,5 +20,4 @@
     "SpatialInputSpec",
     "SpatialRepresentation",
     "TargetSpec",
-    "TemporalResolution",
 ]
diff --git a/forecast_interface/input/requirement.py b/forecast_interface/input/requirement.py
index 44b7b49..fd05694 100644
--- a/forecast_interface/input/requirement.py
+++ b/forecast_interface/input/requirement.py
@@ -1,9 +1,8 @@
+import datetime
+
 from pydantic import BaseModel, field_validator, model_validator
 
-from forecast_interface.common.resolutions import (
-    SpatialRepresentation,
-    TemporalResolution,
-)
+from forecast_interface.common.resolutions import SpatialRepresentation
 
 from .target import TargetSpec
 from .variable import FutureKnownVariable, PastKnownVariable
@@ -40,7 +39,7 @@ class InputRequirement(BaseModel):
     # Targets are declared independently of inputs; a model needing the target's own
     # history lists it under past_known (see Q2 in open_design_questions.md).
     targets: dict[str, TargetSpec]
-    dynamic: dict[TemporalResolution, SpatialInputSpec]
+    dynamic: dict[datetime.timedelta, SpatialInputSpec]
     static: set[str] = set()
 
     @field_validator("targets")
@@ -58,12 +57,15 @@ def _at_least_one_target(
 
     @field_validator("dynamic")
     @classmethod
-    def _at_least_one_resolution(
+    def _validate_dynamic_time_steps(
         cls,
-        v: dict[TemporalResolution, SpatialInputSpec],
-    ) -> dict[TemporalResolution, SpatialInputSpec]:
+        v: dict[datetime.timedelta, SpatialInputSpec],
+    ) -> dict[datetime.timedelta, SpatialInputSpec]:
         if not v:
-            raise ValueError("dynamic must contain at least one temporal resolution")
+            raise ValueError("dynamic must contain at least one time step")
+        for time_step in v:
+            if time_step.total_seconds() <= 0:
+                raise ValueError("dynamic time step keys must be positive")
         return v
 
     @field_validator("static")
diff --git a/forecast_interface/input/variable.py b/forecast_interface/input/variable.py
index 66ce4ad..12c20a0 100644
--- a/forecast_interface/input/variable.py
+++ b/forecast_interface/input/variable.py
@@ -2,6 +2,9 @@
 
 from pydantic import BaseModel, field_validator
 
+from forecast_interface.common.aggregation import AggregationMethod
+from forecast_interface.common.units import Unit
+
 
 class EnsembleMode(Enum):
     SINGLE = "single"
@@ -11,6 +14,8 @@ class EnsembleMode(Enum):
 class PastKnownVariable(BaseModel):
     lookback: int
     max_nan: int
+    unit: Unit
+    aggregation: AggregationMethod | None = None
 
     @field_validator("lookback")
     @classmethod
@@ -30,6 +35,8 @@ def _non_negative_max_nan(cls, v: int) -> int:
 class FutureKnownVariable(BaseModel):
     future_steps: int
     max_nan: int
+    unit: Unit
+    aggregation: AggregationMethod | None = None
     ensemble_mode: EnsembleMode = EnsembleMode.SINGLE
 
     @field_validator("future_steps")
diff --git a/forecast_interface/output/__init__.py b/forecast_interface/output/__init__.py
index 4ba5bce..16842e1 100644
--- a/forecast_interface/output/__init__.py
+++ b/forecast_interface/output/__init__.py
@@ -1,7 +1,6 @@
 from .flags import ForecastFlag
 from .metadata import VariableMetadata
 from .model_output import ModelOutput
-from forecast_interface.common.resolutions import TemporalResolution
 from .status import VariableStatus
 from forecast_interface.common.units import Unit
 from .variable_output import (
@@ -18,7 +17,6 @@
     "ForecastFlag",
     "ModelOutput",
     "QuantileData",
-    "TemporalResolution",
     "TrajectoryData",
     "Unit",
     "VariableMetadata",
diff --git a/forecast_interface/output/metadata.py b/forecast_interface/output/metadata.py
index 4d4d00f..9345106 100644
--- a/forecast_interface/output/metadata.py
+++ b/forecast_interface/output/metadata.py
@@ -3,26 +3,16 @@
 from pydantic import BaseModel, field_validator
 
 from forecast_interface.common.units import Unit
-from forecast_interface.common.resolutions import TemporalResolution
 
 
 class VariableMetadata(BaseModel):
-    name: str  # Name of the variable, e.g. "discharge", "water_level", etc.
     unit: Unit
-    resolution: TemporalResolution
-    timedelta: datetime.timedelta  # Concrete value im minutes for example, but can be any positive timedelta that is consistent with the resolution
+    timedelta: datetime.timedelta
     forecast_horizon: (
         int  # Number of time steps of length timedelta that the forecast is made for
     )
     offset: int  # Number of time steps of length timedelta between the last observed data point and the first forecasted data point
 
-    @field_validator("name")
-    @classmethod
-    def _non_empty_name(cls, v: str) -> str:
-        if not v or not v.strip():
-            raise ValueError("name must be a non-empty string")
-        return v
-
     @field_validator("forecast_horizon")
     @classmethod
     def _positive_horizon(cls, v: int) -> int:
diff --git a/forecast_interface/output/variable_output.py b/forecast_interface/output/variable_output.py
index 5405716..e2b61f9 100644
--- a/forecast_interface/output/variable_output.py
+++ b/forecast_interface/output/variable_output.py
@@ -39,8 +39,8 @@ class QuantileData(BaseModel):
     @field_validator("quantile_levels")
     @classmethod
     def _validate_levels(cls, v: list[float]) -> list[float]:
-        if not v:
-            raise ValueError("quantile_levels must not be empty")
+        if len(v) < 3:
+            raise ValueError("quantile_levels must contain at least 3 levels")
         for level in v:
             if not (0 < level < 1):
                 raise ValueError(f"quantile levels must be in (0, 1), got {level}")
@@ -69,9 +69,9 @@ class TrajectoryData(BaseModel):
 
     @field_validator("num_samples")
     @classmethod
-    def _positive_samples(cls, v: int) -> int:
-        if v <= 0:
-            raise ValueError(f"num_samples must be positive, got {v}")
+    def _minimum_samples(cls, v: int) -> int:
+        if v < 8:
+            raise ValueError(f"num_samples must be at least 8, got {v}")
         return v
 
     @model_validator(mode="after")
@@ -127,3 +127,29 @@ def _validate_data_present(self) -> "VariableOutput":
                     "must be present when status is SUCCESS or PARTIAL"
                 )
         return self
+
+    @model_validator(mode="after")
+    def _validate_forecast_horizon(self) -> "VariableOutput":
+        for representation, data_container in (
+            ("deterministic", self.deterministic),
+            ("quantiles", self.quantiles),
+            ("trajectories", self.trajectories),
+        ):
+            if data_container is None:
+                continue
+            df = data_container.data
+            if df.height == 0:
+                raise ValueError(f"{representation} data must not be empty")
+            mismatches = (
+                df.group_by("issue_datetime")
+                .agg(pl.len().alias("rows"))
+                .filter(pl.col("rows") != self.metadata.forecast_horizon)
+            )
+            if mismatches.height:
+                observed = mismatches["rows"][0]
+                raise ValueError(
+                    f"{representation} data must contain exactly "
+                    f"{self.metadata.forecast_horizon} rows per issue_datetime "
+                    f"(got {observed})"
+                )
+        return self
diff --git a/pyproject.toml b/pyproject.toml
index 1a13ff5..1321ae0 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -1,6 +1,6 @@
 [project]
 name = "forecastinterface"
-version = "0.1.7"
+version = "0.1.8"
 description = "Add your description here"
 readme = "README.md"
 requires-python = ">=3.11"
@@ -13,7 +13,7 @@ dependencies = [
 pythonpath = ["."]
 
 [tool.bumpversion]
-current_version = "0.1.7"
+current_version = "0.1.8"
 commit = false
 tag = false
 allow_dirty = true
diff --git a/tests/test_input.py b/tests/test_input.py
index d4e2944..416bc2d 100644
--- a/tests/test_input.py
+++ b/tests/test_input.py
@@ -1,8 +1,11 @@
+from datetime import timedelta
+
 import pytest
 from pydantic import ValidationError
 
 from forecast_interface.common import Unit
 from forecast_interface.input import (
+    AggregationMethod,
     DynamicInputSpec,
     EnsembleMode,
     FutureKnownVariable,
@@ -12,9 +15,11 @@
     SpatialInputSpec,
     SpatialRepresentation,
     TargetSpec,
-    TemporalResolution,
 )
 
+DAILY = timedelta(days=1)
+HOURLY = timedelta(hours=1)
+
 
 def _target() -> dict[str, TargetSpec]:
     return {
@@ -32,50 +37,83 @@ def _target() -> dict[str, TargetSpec]:
 
 class TestPastKnownVariable:
     def test_valid(self) -> None:
-        v = PastKnownVariable(lookback=30, max_nan=5)
+        v = PastKnownVariable(unit=Unit.M3_PER_S, lookback=30, max_nan=5)
         assert v.lookback == 30
         assert v.max_nan == 5
+        assert v.unit == Unit.M3_PER_S
+        assert v.aggregation is None
+
+    def test_aggregation_override(self) -> None:
+        v = PastKnownVariable(
+            unit=Unit.MM_PER_DAY,
+            lookback=30,
+            max_nan=5,
+            aggregation=AggregationMethod.SUM,
+        )
+        assert v.aggregation == AggregationMethod.SUM
+
+    def test_unit_required(self) -> None:
+        with pytest.raises(ValidationError, match="unit"):
+            PastKnownVariable(lookback=1, max_nan=0)
 
     def test_lookback_zero(self) -> None:
         with pytest.raises(ValidationError, match="lookback must be positive"):
-            PastKnownVariable(lookback=0, max_nan=0)
+            PastKnownVariable(unit=Unit.M3_PER_S, lookback=0, max_nan=0)
 
     def test_lookback_negative(self) -> None:
         with pytest.raises(ValidationError, match="lookback must be positive"):
-            PastKnownVariable(lookback=-1, max_nan=0)
+            PastKnownVariable(unit=Unit.M3_PER_S, lookback=-1, max_nan=0)
 
     def test_max_nan_negative(self) -> None:
         with pytest.raises(ValidationError, match="max_nan must be non-negative"):
-            PastKnownVariable(lookback=1, max_nan=-1)
+            PastKnownVariable(unit=Unit.M3_PER_S, lookback=1, max_nan=-1)
 
     def test_max_nan_zero_allowed(self) -> None:
-        v = PastKnownVariable(lookback=1, max_nan=0)
+        v = PastKnownVariable(unit=Unit.M3_PER_S, lookback=1, max_nan=0)
         assert v.max_nan == 0
 
 
 class TestFutureKnownVariable:
     def test_valid(self) -> None:
         v = FutureKnownVariable(
-            future_steps=10, max_nan=0, ensemble_mode=EnsembleMode.ENSEMBLE
+            unit=Unit.M3_PER_S,
+            future_steps=10,
+            max_nan=0,
+            ensemble_mode=EnsembleMode.ENSEMBLE,
         )
         assert v.future_steps == 10
         assert v.ensemble_mode == EnsembleMode.ENSEMBLE
+        assert v.unit == Unit.M3_PER_S
+        assert v.aggregation is None
+
+    def test_aggregation_override(self) -> None:
+        v = FutureKnownVariable(
+            unit=Unit.MM_PER_DAY,
+            future_steps=10,
+            max_nan=0,
+            aggregation=AggregationMethod.MEAN,
+        )
+        assert v.aggregation == AggregationMethod.MEAN
+
+    def test_unit_required(self) -> None:
+        with pytest.raises(ValidationError, match="unit"):
+            FutureKnownVariable(future_steps=1, max_nan=0)
 
     def test_ensemble_mode_default_single(self) -> None:
-        v = FutureKnownVariable(future_steps=5, max_nan=0)
+        v = FutureKnownVariable(unit=Unit.M3_PER_S, future_steps=5, max_nan=0)
         assert v.ensemble_mode == EnsembleMode.SINGLE
 
     def test_future_steps_zero(self) -> None:
         with pytest.raises(ValidationError, match="future_steps must be positive"):
-            FutureKnownVariable(future_steps=0, max_nan=0)
+            FutureKnownVariable(unit=Unit.M3_PER_S, future_steps=0, max_nan=0)
 
     def test_future_steps_negative(self) -> None:
         with pytest.raises(ValidationError, match="future_steps must be positive"):
-            FutureKnownVariable(future_steps=-3, max_nan=0)
+            FutureKnownVariable(unit=Unit.M3_PER_S, future_steps=-3, max_nan=0)
 
     def test_max_nan_negative(self) -> None:
         with pytest.raises(ValidationError, match="max_nan must be non-negative"):
-            FutureKnownVariable(future_steps=1, max_nan=-1)
+            FutureKnownVariable(unit=Unit.M3_PER_S, future_steps=1, max_nan=-1)
 
 
 # ---------------------------------------------------------------------------
@@ -120,7 +158,13 @@ def test_unit_required(self) -> None:
 class TestDynamicInputSpec:
     def test_past_only(self) -> None:
         spec = DynamicInputSpec(
-            past_known={"obs": {"discharge": PastKnownVariable(lookback=30, max_nan=2)}}
+            past_known={
+                "obs": {
+                    "discharge": PastKnownVariable(
+                        unit=Unit.M3_PER_S, lookback=30, max_nan=2
+                    )
+                }
+            }
         )
         assert "obs" in spec.past_known
         assert spec.future_known == {}
@@ -128,7 +172,11 @@ def test_past_only(self) -> None:
     def test_future_only(self) -> None:
         spec = DynamicInputSpec(
             future_known={
-                "GFS": {"precip": FutureKnownVariable(future_steps=10, max_nan=0)}
+                "GFS": {
+                    "precip": FutureKnownVariable(
+                        unit=Unit.M3_PER_S, future_steps=10, max_nan=0
+                    )
+                }
             }
         )
         assert "GFS" in spec.future_known
@@ -144,7 +192,11 @@ def test_empty_raises(self) -> None:
 class TestSpatialInputSpec:
     def test_basin_average_only(self) -> None:
         dynamic = DynamicInputSpec(
-            past_known={"obs": {"q": PastKnownVariable(lookback=10, max_nan=0)}}
+            past_known={
+                "obs": {
+                    "q": PastKnownVariable(unit=Unit.M3_PER_S, lookback=10, max_nan=0)
+                }
+            }
         )
         spec = SpatialInputSpec(data={SpatialRepresentation.BASIN_AVERAGE: dynamic})
         assert SpatialRepresentation.BASIN_AVERAGE in spec.data
@@ -152,7 +204,11 @@ def test_basin_average_only(self) -> None:
 
     def test_gridded_only(self) -> None:
         dynamic = DynamicInputSpec(
-            past_known={"ERA5": {"swe": PastKnownVariable(lookback=90, max_nan=5)}}
+            past_known={
+                "ERA5": {
+                    "swe": PastKnownVariable(unit=Unit.M3_PER_S, lookback=90, max_nan=5)
+                }
+            }
         )
         spec = SpatialInputSpec(data={SpatialRepresentation.GRIDDED: dynamic})
         assert SpatialRepresentation.GRIDDED in spec.data
@@ -161,7 +217,9 @@ def test_gridded_only(self) -> None:
     def test_elevation_band(self) -> None:
         dynamic = DynamicInputSpec(
             past_known={
-                "SnowMapper": {"swe": PastKnownVariable(lookback=30, max_nan=2)}
+                "SnowMapper": {
+                    "swe": PastKnownVariable(unit=Unit.M3_PER_S, lookback=30, max_nan=2)
+                }
             }
         )
         spec = SpatialInputSpec(data={SpatialRepresentation.ELEVATION_BAND: dynamic})
@@ -169,10 +227,18 @@ def test_elevation_band(self) -> None:
 
     def test_both(self) -> None:
         basin = DynamicInputSpec(
-            past_known={"obs": {"q": PastKnownVariable(lookback=10, max_nan=0)}}
+            past_known={
+                "obs": {
+                    "q": PastKnownVariable(unit=Unit.M3_PER_S, lookback=10, max_nan=0)
+                }
+            }
         )
         gridded = DynamicInputSpec(
-            past_known={"ERA5": {"swe": PastKnownVariable(lookback=90, max_nan=5)}}
+            past_known={
+                "ERA5": {
+                    "swe": PastKnownVariable(unit=Unit.M3_PER_S, lookback=90, max_nan=5)
+                }
+            }
         )
         spec = SpatialInputSpec(
             data={
@@ -210,13 +276,13 @@ def test_minimal(self) -> None:
         req = InputRequirement(
             targets=_target(),
             dynamic={
-                TemporalResolution.DAILY: SpatialInputSpec(
+                DAILY: SpatialInputSpec(
                     data={
                         SpatialRepresentation.BASIN_AVERAGE: DynamicInputSpec(
                             past_known={
                                 "obs": {
                                     "discharge": PastKnownVariable(
-                                        lookback=365, max_nan=10
+                                        unit=Unit.M3_PER_S, lookback=365, max_nan=10
                                     )
                                 }
                             }
@@ -225,7 +291,7 @@ def test_minimal(self) -> None:
                 )
             },
         )
-        assert TemporalResolution.DAILY in req.dynamic
+        assert DAILY in req.dynamic
         assert req.static == set()
         assert "discharge" in req.targets
 
@@ -233,11 +299,15 @@ def test_with_static(self) -> None:
         req = InputRequirement(
             targets=_target(),
             dynamic={
-                TemporalResolution.DAILY: SpatialInputSpec(
+                DAILY: SpatialInputSpec(
                     data={
                         SpatialRepresentation.BASIN_AVERAGE: DynamicInputSpec(
                             past_known={
-                                "obs": {"q": PastKnownVariable(lookback=30, max_nan=0)}
+                                "obs": {
+                                    "q": PastKnownVariable(
+                                        unit=Unit.M3_PER_S, lookback=30, max_nan=0
+                                    )
+                                }
                             }
                         )
                     }
@@ -252,12 +322,14 @@ def test_empty_targets_raises(self) -> None:
             InputRequirement(
                 targets={},
                 dynamic={
-                    TemporalResolution.DAILY: SpatialInputSpec(
+                    DAILY: SpatialInputSpec(
                         data={
                             SpatialRepresentation.BASIN_AVERAGE: DynamicInputSpec(
                                 past_known={
                                     "obs": {
-                                        "q": PastKnownVariable(lookback=1, max_nan=0)
+                                        "q": PastKnownVariable(
+                                            unit=Unit.M3_PER_S, lookback=1, max_nan=0
+                                        )
                                     }
                                 }
                             )
@@ -278,12 +350,14 @@ def test_whitespace_target_key_raises(self) -> None:
                     )
                 },
                 dynamic={
-                    TemporalResolution.DAILY: SpatialInputSpec(
+                    DAILY: SpatialInputSpec(
                         data={
                             SpatialRepresentation.BASIN_AVERAGE: DynamicInputSpec(
                                 past_known={
                                     "obs": {
-                                        "q": PastKnownVariable(lookback=1, max_nan=0)
+                                        "q": PastKnownVariable(
+                                            unit=Unit.M3_PER_S, lookback=1, max_nan=0
+                                        )
                                     }
                                 }
                             )
@@ -293,20 +367,70 @@ def test_whitespace_target_key_raises(self) -> None:
             )
 
     def test_empty_dynamic_raises(self) -> None:
-        with pytest.raises(ValidationError, match="at least one temporal resolution"):
+        with pytest.raises(
+            ValidationError, match="dynamic must contain at least one time step"
+        ):
             InputRequirement(targets=_target(), dynamic={})
 
+    def test_zero_dynamic_time_step_rejected(self) -> None:
+        with pytest.raises(ValidationError, match="time step keys must be positive"):
+            InputRequirement(
+                targets=_target(),
+                dynamic={
+                    timedelta(0): SpatialInputSpec(
+                        data={
+                            SpatialRepresentation.BASIN_AVERAGE: DynamicInputSpec(
+                                past_known={
+                                    "obs": {
+                                        "q": PastKnownVariable(
+                                            unit=Unit.M3_PER_S,
+                                            lookback=1,
+                                            max_nan=0,
+                                        )
+                                    }
+                                }
+                            )
+                        }
+                    )
+                },
+            )
+
+    def test_negative_dynamic_time_step_rejected(self) -> None:
+        with pytest.raises(ValidationError, match="time step keys must be positive"):
+            InputRequirement(
+                targets=_target(),
+                dynamic={
+                    timedelta(days=-1): SpatialInputSpec(
+                        data={
+                            SpatialRepresentation.BASIN_AVERAGE: DynamicInputSpec(
+                                past_known={
+                                    "obs": {
+                                        "q": PastKnownVariable(
+                                            unit=Unit.M3_PER_S,
+                                            lookback=1,
+                                            max_nan=0,
+                                        )
+                                    }
+                                }
+                            )
+                        }
+                    )
+                },
+            )
+
     def test_empty_static_string_raises(self) -> None:
         with pytest.raises(ValidationError, match="non-empty strings"):
             InputRequirement(
                 targets=_target(),
                 dynamic={
-                    TemporalResolution.DAILY: SpatialInputSpec(
+                    DAILY: SpatialInputSpec(
                         data={
                             SpatialRepresentation.BASIN_AVERAGE: DynamicInputSpec(
                                 past_known={
                                     "obs": {
-                                        "q": PastKnownVariable(lookback=1, max_nan=0)
+                                        "q": PastKnownVariable(
+                                            unit=Unit.M3_PER_S, lookback=1, max_nan=0
+                                        )
                                     }
                                 }
                             )
@@ -320,11 +444,15 @@ def test_duplicate_static_deduplicated(self) -> None:
         req = InputRequirement(
             targets=_target(),
             dynamic={
-                TemporalResolution.DAILY: SpatialInputSpec(
+                DAILY: SpatialInputSpec(
                     data={
                         SpatialRepresentation.BASIN_AVERAGE: DynamicInputSpec(
                             past_known={
-                                "obs": {"q": PastKnownVariable(lookback=1, max_nan=0)}
+                                "obs": {
+                                    "q": PastKnownVariable(
+                                        unit=Unit.M3_PER_S, lookback=1, max_nan=0
+                                    )
+                                }
                             }
                         )
                     }
@@ -343,12 +471,14 @@ def test_whitespace_static_string_raises(self) -> None:
             InputRequirement(
                 targets=_target(),
                 dynamic={
-                    TemporalResolution.DAILY: SpatialInputSpec(
+                    DAILY: SpatialInputSpec(
                         data={
                             SpatialRepresentation.BASIN_AVERAGE: DynamicInputSpec(
                                 past_known={
                                     "obs": {
-                                        "q": PastKnownVariable(lookback=1, max_nan=0)
+                                        "q": PastKnownVariable(
+                                            unit=Unit.M3_PER_S, lookback=1, max_nan=0
+                                        )
                                     }
                                 }
                             )
@@ -382,27 +512,29 @@ def full_requirement(self) -> InputRequirement:
                 )
             },
             dynamic={
-                TemporalResolution.DAILY: SpatialInputSpec(
+                DAILY: SpatialInputSpec(
                     data={
                         SpatialRepresentation.BASIN_AVERAGE: DynamicInputSpec(
                             past_known={
                                 "obs": {
                                     "discharge": PastKnownVariable(
-                                        lookback=365, max_nan=10
+                                        unit=Unit.M3_PER_S, lookback=365, max_nan=10
                                     ),
                                     "precipitation": PastKnownVariable(
-                                        lookback=30, max_nan=5
+                                        unit=Unit.M3_PER_S, lookback=30, max_nan=5
                                     ),
                                 }
                             },
                             future_known={
                                 "GFS": {
                                     "precipitation": FutureKnownVariable(
+                                        unit=Unit.M3_PER_S,
                                         future_steps=10,
                                         max_nan=0,
                                         ensemble_mode=EnsembleMode.ENSEMBLE,
                                     ),
                                     "temperature": FutureKnownVariable(
+                                        unit=Unit.M3_PER_S,
                                         future_steps=10,
                                         max_nan=0,
                                         ensemble_mode=EnsembleMode.SINGLE,
@@ -410,6 +542,7 @@ def full_requirement(self) -> InputRequirement:
                                 },
                                 "ECMWF": {
                                     "precipitation": FutureKnownVariable(
+                                        unit=Unit.M3_PER_S,
                                         future_steps=15,
                                         max_nan=0,
                                         ensemble_mode=EnsembleMode.ENSEMBLE,
@@ -420,9 +553,11 @@ def full_requirement(self) -> InputRequirement:
                         SpatialRepresentation.GRIDDED: DynamicInputSpec(
                             past_known={
                                 "ERA5": {
-                                    "swe": PastKnownVariable(lookback=90, max_nan=5),
+                                    "swe": PastKnownVariable(
+                                        unit=Unit.M3_PER_S, lookback=90, max_nan=5
+                                    ),
                                     "precipitation": PastKnownVariable(
-                                        lookback=30, max_nan=3
+                                        unit=Unit.M3_PER_S, lookback=30, max_nan=3
                                     ),
                                 }
                             }
@@ -431,11 +566,13 @@ def full_requirement(self) -> InputRequirement:
                             future_known={
                                 "SnowMapper": {
                                     "swe": FutureKnownVariable(
+                                        unit=Unit.M3_PER_S,
                                         future_steps=10,
                                         max_nan=0,
                                         ensemble_mode=EnsembleMode.SINGLE,
                                     ),
                                     "rof": FutureKnownVariable(
+                                        unit=Unit.M3_PER_S,
                                         future_steps=10,
                                         max_nan=0,
                                         ensemble_mode=EnsembleMode.SINGLE,
@@ -445,19 +582,20 @@ def full_requirement(self) -> InputRequirement:
                         ),
                     }
                 ),
-                TemporalResolution.HOURLY: SpatialInputSpec(
+                HOURLY: SpatialInputSpec(
                     data={
                         SpatialRepresentation.BASIN_AVERAGE: DynamicInputSpec(
                             past_known={
                                 "obs": {
                                     "discharge": PastKnownVariable(
-                                        lookback=72, max_nan=2
+                                        unit=Unit.M3_PER_S, lookback=72, max_nan=2
                                     ),
                                 }
                             },
                             future_known={
                                 "INCA": {
                                     "precipitation": FutureKnownVariable(
+                                        unit=Unit.M3_PER_S,
                                         future_steps=48,
                                         max_nan=0,
                                         ensemble_mode=EnsembleMode.SINGLE,
@@ -473,8 +611,8 @@ def full_requirement(self) -> InputRequirement:
 
     def test_construction(self, full_requirement: InputRequirement) -> None:
         assert len(full_requirement.dynamic) == 2
-        assert TemporalResolution.DAILY in full_requirement.dynamic
-        assert TemporalResolution.HOURLY in full_requirement.dynamic
+        assert DAILY in full_requirement.dynamic
+        assert HOURLY in full_requirement.dynamic
         assert len(full_requirement.static) == 3
 
     def test_targets(self, full_requirement: InputRequirement) -> None:
@@ -483,7 +621,7 @@ def test_targets(self, full_requirement: InputRequirement) -> None:
         assert OutputRepresentation.TRAJECTORIES in discharge.representations
 
     def test_daily_basin_average_past(self, full_requirement: InputRequirement) -> None:
-        daily = full_requirement.dynamic[TemporalResolution.DAILY]
+        daily = full_requirement.dynamic[DAILY]
         basin = daily.data[SpatialRepresentation.BASIN_AVERAGE]
         obs = basin.past_known["obs"]
         assert obs["discharge"].lookback == 365
@@ -492,7 +630,7 @@ def test_daily_basin_average_past(self, full_requirement: InputRequirement) -> N
     def test_daily_basin_average_future(
         self, full_requirement: InputRequirement
     ) -> None:
-        daily = full_requirement.dynamic[TemporalResolution.DAILY]
+        daily = full_requirement.dynamic[DAILY]
         basin = daily.data[SpatialRepresentation.BASIN_AVERAGE]
         gfs = basin.future_known["GFS"]
         assert gfs["precipitation"].ensemble_mode == EnsembleMode.ENSEMBLE
@@ -501,20 +639,20 @@ def test_daily_basin_average_future(
         assert ecmwf["precipitation"].future_steps == 15
 
     def test_daily_gridded_past(self, full_requirement: InputRequirement) -> None:
-        daily = full_requirement.dynamic[TemporalResolution.DAILY]
+        daily = full_requirement.dynamic[DAILY]
         gridded = daily.data[SpatialRepresentation.GRIDDED]
         era5 = gridded.past_known["ERA5"]
         assert era5["swe"].lookback == 90
 
     def test_daily_elevation_band(self, full_requirement: InputRequirement) -> None:
-        daily = full_requirement.dynamic[TemporalResolution.DAILY]
+        daily = full_requirement.dynamic[DAILY]
         band = daily.data[SpatialRepresentation.ELEVATION_BAND]
         snow = band.future_known["SnowMapper"]
         assert "swe" in snow
         assert "rof" in snow
 
     def test_hourly_block(self, full_requirement: InputRequirement) -> None:
-        hourly = full_requirement.dynamic[TemporalResolution.HOURLY]
+        hourly = full_requirement.dynamic[HOURLY]
         assert SpatialRepresentation.BASIN_AVERAGE in hourly.data
         assert SpatialRepresentation.GRIDDED not in hourly.data
         basin = hourly.data[SpatialRepresentation.BASIN_AVERAGE]
diff --git a/tests/test_interface.py b/tests/test_interface.py
index 4e3e80a..5ebc226 100644
--- a/tests/test_interface.py
+++ b/tests/test_interface.py
@@ -14,7 +14,6 @@
     SpatialInputSpec,
     SpatialRepresentation,
     TargetSpec,
-    TemporalResolution as InputTemporalResolution,
 )
 from forecast_interface.interface import (
     ArtifactScope,
@@ -29,7 +28,6 @@
 from forecast_interface.output import (
     DeterministicData,
     ModelOutput,
-    TemporalResolution,
     Unit,
     VariableMetadata,
     VariableOutput,
@@ -55,11 +53,9 @@ def _make_model_output() -> ModelOutput:
             "station_1": {
                 "discharge": VariableOutput(
                     metadata=VariableMetadata(
-                        name="discharge",
                         unit=Unit.M3_PER_S,
-                        resolution=TemporalResolution.DAILY,
                         timedelta=timedelta(days=1),
-                        forecast_horizon=10,
+                        forecast_horizon=1,
                         offset=0,
                     ),
                     deterministic=DeterministicData(data=df),
@@ -79,11 +75,17 @@ def _make_input_requirement() -> InputRequirement:
             )
         },
         dynamic={
-            InputTemporalResolution.DAILY: SpatialInputSpec(
+            timedelta(days=1): SpatialInputSpec(
                 data={
                     SpatialRepresentation.BASIN_AVERAGE: DynamicInputSpec(
                         past_known={
-                            "obs": {"q": PastKnownVariable(lookback=1, max_nan=0)}
+                            "obs": {
+                                "q": PastKnownVariable(
+                                    unit=Unit.M3_PER_S,
+                                    lookback=1,
+                                    max_nan=0,
+                                )
+                            }
                         }
                     )
                 }
diff --git a/tests/test_output.py b/tests/test_output.py
index 57f82b0..d6e0bb4 100644
--- a/tests/test_output.py
+++ b/tests/test_output.py
@@ -4,13 +4,13 @@
 import polars as pl
 import pytest
 
+from forecast_interface.common import AggregationMethod
 from forecast_interface.output import (
     DeterministicData,
     EpistemicUncertaintyData,
     ForecastFlag,
     ModelOutput,
     QuantileData,
-    TemporalResolution,
     TrajectoryData,
     Unit,
     VariableMetadata,
@@ -25,11 +25,9 @@
 
 def _make_metadata(**overrides: object) -> VariableMetadata:
     defaults: dict[str, object] = {
-        "name": "discharge",
         "unit": Unit.M3_PER_S,
-        "resolution": TemporalResolution.DAILY,
         "timedelta": timedelta(days=1),
-        "forecast_horizon": 10,
+        "forecast_horizon": 2,
         "offset": 0,
     }
     defaults.update(overrides)
@@ -60,24 +58,23 @@ def test_members_exist(self) -> None:
         assert Unit.M.value == "m"
         assert Unit.DEG_C.value == "°C"
         assert Unit.UNITLESS.value == "-"
+        assert Unit.PERCENT.value == "%"
+        assert Unit.M_PER_S.value == "m/s"
+        assert Unit.DEGREE.value == "°"
+        assert Unit.W_PER_M2.value == "W/m²"
+        assert Unit.MM_PER_HOUR.value == "mm/hour"
 
     def test_member_count(self) -> None:
-        assert len(Unit) == 8
+        assert len(Unit) == 13
 
 
-class TestTemporalResolution:
+class TestAggregationMethod:
     def test_members_exist(self) -> None:
-        assert TemporalResolution.SUB_HOURLY.value == "sub_hourly"
-        assert TemporalResolution.HOURLY.value == "hourly"
-        assert TemporalResolution.SUB_DAILY.value == "sub_daily"
-        assert TemporalResolution.DAILY.value == "daily"
-        assert TemporalResolution.WEEKLY.value == "weekly"
-        assert TemporalResolution.MONTHLY.value == "monthly"
-        assert TemporalResolution.SEASONAL.value == "seasonal"
-        assert TemporalResolution.ANNUAL.value == "annual"
+        assert AggregationMethod.SUM.value == "sum"
+        assert AggregationMethod.MEAN.value == "mean"
 
     def test_member_count(self) -> None:
-        assert len(TemporalResolution) == 8
+        assert len(AggregationMethod) == 2
 
 
 class TestVariableStatus:
@@ -93,21 +90,11 @@ def test_member_count(self) -> None:
 class TestVariableMetadata:
     def test_valid_construction(self) -> None:
         meta = _make_metadata()
-        assert meta.name == "discharge"
         assert meta.unit == Unit.M3_PER_S
-        assert meta.resolution == TemporalResolution.DAILY
         assert meta.timedelta == timedelta(days=1)
-        assert meta.forecast_horizon == 10
+        assert meta.forecast_horizon == 2
         assert meta.offset == 0
 
-    def test_empty_name_rejected(self) -> None:
-        with pytest.raises(ValueError, match="name must be a non-empty string"):
-            _make_metadata(name="")
-
-    def test_whitespace_name_rejected(self) -> None:
-        with pytest.raises(ValueError, match="name must be a non-empty string"):
-            _make_metadata(name="   ")
-
     def test_zero_forecast_horizon_rejected(self) -> None:
         with pytest.raises(ValueError, match="forecast_horizon must be positive"):
             _make_metadata(forecast_horizon=0)
@@ -328,30 +315,46 @@ def test_empty_levels_rejected(self) -> None:
                 "datetime": [_DT1],
             }
         )
-        with pytest.raises(ValueError, match="must not be empty"):
+        with pytest.raises(ValueError, match="must contain at least 3 levels"):
             QuantileData(quantile_levels=[], data=df)
 
+    def test_less_than_three_levels_rejected(self) -> None:
+        df = pl.DataFrame(
+            {
+                "issue_datetime": [_ISSUE_DT],
+                "datetime": [_DT1],
+                "0.1": [1.0],
+                "0.9": [3.0],
+            }
+        )
+        with pytest.raises(ValueError, match="must contain at least 3 levels"):
+            QuantileData(quantile_levels=[0.1, 0.9], data=df)
+
     def test_level_zero_rejected(self) -> None:
         df = pl.DataFrame(
             {
                 "issue_datetime": [_ISSUE_DT],
                 "datetime": [_DT1],
                 "0.0": [1.0],
+                "0.5": [2.0],
+                "0.9": [3.0],
             }
         )
         with pytest.raises(ValueError, match="must be in \\(0, 1\\)"):
-            QuantileData(quantile_levels=[0.0], data=df)
+            QuantileData(quantile_levels=[0.0, 0.5, 0.9], data=df)
 
     def test_level_one_rejected(self) -> None:
         df = pl.DataFrame(
             {
                 "issue_datetime": [_ISSUE_DT],
                 "datetime": [_DT1],
+                "0.1": [1.0],
+                "0.5": [2.0],
                 "1.0": [1.0],
             }
         )
         with pytest.raises(ValueError, match="must be in \\(0, 1\\)"):
-            QuantileData(quantile_levels=[1.0], data=df)
+            QuantileData(quantile_levels=[0.1, 0.5, 1.0], data=df)
 
     def test_unsorted_levels_rejected(self) -> None:
         df = pl.DataFrame(
@@ -360,21 +363,23 @@ def test_unsorted_levels_rejected(self) -> None:
                 "datetime": [_DT1],
                 "0.9": [1.0],
                 "0.1": [2.0],
+                "0.5": [3.0],
             }
         )
         with pytest.raises(ValueError, match="must be sorted ascending"):
-            QuantileData(quantile_levels=[0.9, 0.1], data=df)
+            QuantileData(quantile_levels=[0.9, 0.1, 0.5], data=df)
 
     def test_duplicate_levels_rejected(self) -> None:
         df = pl.DataFrame(
             {
                 "issue_datetime": [_ISSUE_DT],
                 "datetime": [_DT1],
+                "0.1": [1.0],
                 "0.5": [1.0],
             }
         )
         with pytest.raises(ValueError, match="must not contain duplicates"):
-            QuantileData(quantile_levels=[0.5, 0.5], data=df)
+            QuantileData(quantile_levels=[0.1, 0.5, 0.5], data=df)
 
     def test_column_mismatch_rejected(self) -> None:
         df = pl.DataFrame(
@@ -393,11 +398,13 @@ def test_non_numeric_quantile_column_rejected(self) -> None:
             {
                 "issue_datetime": [_ISSUE_DT],
                 "datetime": [_DT1],
+                "0.1": [1.0],
                 "0.5": ["abc"],
+                "0.9": [3.0],
             }
         )
         with pytest.raises(ValueError, match="must be numeric"):
-            QuantileData(quantile_levels=[0.5], data=df)
+            QuantileData(quantile_levels=[0.1, 0.5, 0.9], data=df)
 
 
 class TestTrajectoryData:
@@ -409,11 +416,16 @@ def test_valid_construction(self) -> None:
                 "1": [10.0],
                 "2": [20.0],
                 "3": [30.0],
+                "4": [40.0],
+                "5": [50.0],
+                "6": [60.0],
+                "7": [70.0],
+                "8": [80.0],
             }
         )
-        td = TrajectoryData(num_samples=3, data=df)
-        assert td.num_samples == 3
-        assert td.data.shape == (1, 5)
+        td = TrajectoryData(num_samples=8, data=df)
+        assert td.num_samples == 8
+        assert td.data.shape == (1, 10)
 
     def test_zero_samples_rejected(self) -> None:
         df = pl.DataFrame(
@@ -422,7 +434,7 @@ def test_zero_samples_rejected(self) -> None:
                 "datetime": [_DT1],
             }
         )
-        with pytest.raises(ValueError, match="num_samples must be positive"):
+        with pytest.raises(ValueError, match="num_samples must be at least 8"):
             TrajectoryData(num_samples=0, data=df)
 
     def test_negative_samples_rejected(self) -> None:
@@ -432,9 +444,26 @@ def test_negative_samples_rejected(self) -> None:
                 "datetime": [_DT1],
             }
         )
-        with pytest.raises(ValueError, match="num_samples must be positive"):
+        with pytest.raises(ValueError, match="num_samples must be at least 8"):
             TrajectoryData(num_samples=-1, data=df)
 
+    def test_less_than_eight_samples_rejected(self) -> None:
+        df = pl.DataFrame(
+            {
+                "issue_datetime": [_ISSUE_DT],
+                "datetime": [_DT1],
+                "1": [10.0],
+                "2": [20.0],
+                "3": [30.0],
+                "4": [40.0],
+                "5": [50.0],
+                "6": [60.0],
+                "7": [70.0],
+            }
+        )
+        with pytest.raises(ValueError, match="num_samples must be at least 8"):
+            TrajectoryData(num_samples=7, data=df)
+
     def test_column_count_mismatch_rejected(self) -> None:
         df = pl.DataFrame(
             {
@@ -445,7 +474,7 @@ def test_column_count_mismatch_rejected(self) -> None:
             }
         )
         with pytest.raises(ValueError, match="Column mismatch"):
-            TrajectoryData(num_samples=3, data=df)
+            TrajectoryData(num_samples=8, data=df)
 
     def test_wrong_column_names_rejected(self) -> None:
         df = pl.DataFrame(
@@ -457,7 +486,7 @@ def test_wrong_column_names_rejected(self) -> None:
             }
         )
         with pytest.raises(ValueError, match="Column mismatch"):
-            TrajectoryData(num_samples=2, data=df)
+            TrajectoryData(num_samples=8, data=df)
 
 
 class TestVariableOutput:
@@ -483,7 +512,7 @@ def test_valid_quantiles_only(self) -> None:
             }
         )
         vo = VariableOutput(
-            metadata=_make_metadata(),
+            metadata=_make_metadata(forecast_horizon=1),
             quantiles=QuantileData(quantile_levels=[0.1, 0.5, 0.9], data=df),
             status=VariableStatus.SUCCESS,
         )
@@ -497,11 +526,17 @@ def test_valid_trajectories_only(self) -> None:
                 "datetime": [_DT1],
                 "1": [10.0],
                 "2": [20.0],
+                "3": [30.0],
+                "4": [40.0],
+                "5": [50.0],
+                "6": [60.0],
+                "7": [70.0],
+                "8": [80.0],
             }
         )
         vo = VariableOutput(
-            metadata=_make_metadata(),
-            trajectories=TrajectoryData(num_samples=2, data=df),
+            metadata=_make_metadata(forecast_horizon=1),
+            trajectories=TrajectoryData(num_samples=8, data=df),
             status=VariableStatus.SUCCESS,
         )
         assert vo.trajectories is not None
@@ -510,22 +545,31 @@ def test_valid_trajectories_only(self) -> None:
     def test_valid_all_three(self) -> None:
         det = _make_deterministic()
         quant = QuantileData(
-            quantile_levels=[0.5],
+            quantile_levels=[0.1, 0.5, 0.9],
             data=pl.DataFrame(
                 {
-                    "issue_datetime": [_ISSUE_DT],
-                    "datetime": [_DT1],
-                    "0.5": [1.0],
+                    "issue_datetime": [_ISSUE_DT, _ISSUE_DT],
+                    "datetime": [_DT1, _DT2],
+                    "0.1": [1.0, 2.0],
+                    "0.5": [2.0, 3.0],
+                    "0.9": [3.0, 4.0],
                 }
             ),
         )
         traj = TrajectoryData(
-            num_samples=1,
+            num_samples=8,
             data=pl.DataFrame(
                 {
-                    "issue_datetime": [_ISSUE_DT],
-                    "datetime": [_DT1],
-                    "1": [1.0],
+                    "issue_datetime": [_ISSUE_DT, _ISSUE_DT],
+                    "datetime": [_DT1, _DT2],
+                    "1": [1.0, 2.0],
+                    "2": [2.0, 3.0],
+                    "3": [3.0, 4.0],
+                    "4": [4.0, 5.0],
+                    "5": [5.0, 6.0],
+                    "6": [6.0, 7.0],
+                    "7": [7.0, 8.0],
+                    "8": [8.0, 9.0],
                 }
             ),
         )
@@ -569,6 +613,80 @@ def test_partial_with_data_accepted(self) -> None:
         )
         assert vo.status == VariableStatus.PARTIAL
 
+    def test_partial_short_forecast_declares_smaller_horizon(self) -> None:
+        df = pl.DataFrame(
+            {
+                "issue_datetime": [_ISSUE_DT],
+                "datetime": [_DT1],
+                "value": [1.0],
+            }
+        )
+        vo = VariableOutput(
+            metadata=_make_metadata(forecast_horizon=1),
+            deterministic=DeterministicData(data=df),
+            status=VariableStatus.PARTIAL,
+        )
+        assert vo.metadata.forecast_horizon == 1
+
+    def test_horizon_validator_passes_single_issue(self) -> None:
+        vo = VariableOutput(
+            metadata=_make_metadata(forecast_horizon=2),
+            deterministic=_make_deterministic(),
+            status=VariableStatus.SUCCESS,
+        )
+        assert vo.deterministic is not None
+
+    def test_horizon_validator_passes_batch_hindcast(self) -> None:
+        issue_2 = datetime.datetime(2024, 1, 2, 6, 0)
+        df = pl.DataFrame(
+            {
+                "issue_datetime": [_ISSUE_DT, _ISSUE_DT, issue_2, issue_2],
+                "datetime": [
+                    _DT1,
+                    _DT2,
+                    datetime.datetime(2024, 1, 3),
+                    datetime.datetime(2024, 1, 4),
+                ],
+                "value": [1.0, 2.0, 3.0, 4.0],
+            }
+        )
+        vo = VariableOutput(
+            metadata=_make_metadata(forecast_horizon=2),
+            deterministic=DeterministicData(data=df),
+            status=VariableStatus.SUCCESS,
+        )
+        assert vo.deterministic is not None
+
+    def test_horizon_validator_rejects_mismatched_group_count(self) -> None:
+        df = pl.DataFrame(
+            {
+                "issue_datetime": [_ISSUE_DT],
+                "datetime": [_DT1],
+                "value": [1.0],
+            }
+        )
+        with pytest.raises(ValueError, match="rows per issue_datetime.*got 1"):
+            VariableOutput(
+                metadata=_make_metadata(forecast_horizon=2),
+                deterministic=DeterministicData(data=df),
+                status=VariableStatus.FAILURE,
+            )
+
+    def test_horizon_validator_rejects_empty_present_representation(self) -> None:
+        df = pl.DataFrame(
+            schema={
+                "issue_datetime": pl.Datetime,
+                "datetime": pl.Datetime,
+                "value": pl.Float64,
+            }
+        )
+        with pytest.raises(ValueError, match="deterministic data must not be empty"):
+            VariableOutput(
+                metadata=_make_metadata(forecast_horizon=1),
+                deterministic=DeterministicData(data=df),
+                status=VariableStatus.FAILURE,
+            )
+
     def test_epistemic_uncertainty_accepted(self) -> None:
         vo = VariableOutput(
             metadata=_make_metadata(),
diff --git a/uv.lock b/uv.lock
index 5536764..f023149 100644
--- a/uv.lock
+++ b/uv.lock
@@ -85,7 +85,7 @@ wheels = [
 
 [[package]]
 name = "forecastinterface"
-version = "0.1.7"
+version = "0.1.8"
 source = { virtual = "." }
 dependencies = [
     { name = "polars" },

From 81415ce48900343d95d400226e22f38986c53191 Mon Sep 17 00:00:00 2001
From: Beatrice Marti <mabesa@users.noreply.github.com>
Date: Tue, 16 Jun 2026 23:02:21 +0200
Subject: [PATCH 08/16] feat(input): add ModelInputs bundle type

---
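Reviewer note (below the three-dash line, so not part of the commit message): the bundle added here nests as station, then time step, then spatial representation, then temporality (`past_known` / `future_known`), then product, then variable, ending in an `InputSeries` leaf. The sketch below only illustrates that shape under loud assumptions: plain dicts stand in for the pydantic models in `bundle.py`, the `data` wrapper on `SpatialInputs` is flattened away, a list stands in for the polars frame, and `iter_series_paths` is a hypothetical helper that does not exist in the package.

```python
from datetime import timedelta
from typing import Any, Iterator

# Plain-dict stand-in for a ModelInputs bundle (assumption: the real types are
# pydantic models holding polars DataFrames; strings and lists are used here
# only to keep the sketch dependency-free).
bundle: dict[str, Any] = {
    "station_1": {
        "dynamic": {
            timedelta(days=1): {
                "BASIN_AVERAGE": {
                    "past_known": {"obs": {"q": [1.0, 2.0]}},
                    "future_known": {"GFS": {"precip": [0.0, 3.5]}},
                }
            }
        },
        "static": {"catchment_area": 42.5},
    }
}

SeriesPath = tuple[str, timedelta, str, str, str, str]


def iter_series_paths(stations: dict[str, Any]) -> Iterator[SeriesPath]:
    """Yield one key path per leaf series (hypothetical helper)."""
    for station_id, station in stations.items():
        for time_step, spatial in station["dynamic"].items():
            for representation, dynamic in spatial.items():
                # Either temporality may be absent; the real DynamicInputs
                # only requires that at least one of them is non-empty.
                for temporality in ("past_known", "future_known"):
                    products = dynamic.get(temporality, {})
                    for product, variables in products.items():
                        for variable in variables:
                            yield (
                                station_id,
                                time_step,
                                representation,
                                temporality,
                                product,
                                variable,
                            )


paths = list(iter_series_paths(bundle))
```

Each yielded tuple names exactly one series, which is why the same bundle type can serve train, predict, and hindcast: nothing in the path depends on an issue time.
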
 docs/open_design_questions.md           |  13 +
 forecast_interface/__init__.py          |  12 +-
 forecast_interface/input/__init__.py    |  12 +
 forecast_interface/input/_validators.py |  47 ++++
 forecast_interface/input/bundle.py      | 100 ++++++++
 pyproject.toml                          |   4 +-
 tests/test_bundle.py                    | 316 ++++++++++++++++++++++++
 uv.lock                                 |   2 +-
 8 files changed, 502 insertions(+), 4 deletions(-)
 create mode 100644 forecast_interface/input/_validators.py
 create mode 100644 forecast_interface/input/bundle.py
 create mode 100644 tests/test_bundle.py

diff --git a/docs/open_design_questions.md b/docs/open_design_questions.md
index aadde73..bb61117 100644
--- a/docs/open_design_questions.md
+++ b/docs/open_design_questions.md
@@ -119,6 +119,19 @@ Banded Snowmapper SWE / snowmelt is declared at `ELEVATION_BAND`.
 
 **Reflected in:** `docs/model_interface.md`.
 
+## 1.15 Forecast horizon ownership & issue context — RESOLVED
+
+**Decision:** the **model owns the forecast horizon** — it is not requested by SAP3.
+
+- Three quantities: **capability** (model's max, from training), **forcing-limited** (bounded by delivered `future_steps`), **actual** (= f(capability, forcing)). Only the model knows capability and sees forcing, so only the model computes the actual horizon.
+- The model **declares the actual horizon in `metadata.forecast_horizon`** (already in the output, per-variable). SAP3 **checks** it against operational need and flags/rejects if too short (loud-at-boundary), but never dictates it.
+- **The bundle (`ModelInputs`) carries no issue scalar at all; it is pure data.** `issue_datetime` (predict) and `issue_datetimes` (batch hindcast) are **separate protocol parameters**, not bundle fields, because training has no single issue time and a batch hindcast has many. The bundle also carries no requested horizon and no output `time_step`: the model forecasts at its own trained step and horizon. `future_steps` describes only the extent of the delivered forcing. `ModelInputs` is therefore uniform across train / predict / hindcast.
+- **Multi-horizon / multi-timestep is already structurally supported** — each `VariableOutput` carries its own `metadata.timedelta` + `forecast_horizon`. "Single horizon + single output step" is a v0 *convention*; going multi is a zero-schema-change additive step.
+
+**Deferred (YAGNI):** an optional declared horizon-*capability* field in the input spec for integration-time validation. The output declaration plus SAP3's check suffice for now; add the field only if SAP3 needs to validate the horizon before the first run.
+
+**Reflected in:** `docs/model_interface.md` (issue context) and the `ModelInputs` type (A2, `forecast_interface/input/bundle.py`).
+
 ## 1.14 Output combinability — RESOLVED
 
 **Decision:** combinability is **derived from the output representation**, with no new FI machinery.
diff --git a/forecast_interface/__init__.py b/forecast_interface/__init__.py
index d1cb529..59c3bb0 100644
--- a/forecast_interface/__init__.py
+++ b/forecast_interface/__init__.py
@@ -1,15 +1,20 @@
-__version__ = "0.1.8"
+__version__ = "0.1.9"
 
 from .common import AggregationMethod
 from .input import (
+    DynamicInputs,
     DynamicInputSpec,
     EnsembleMode,
     FutureKnownVariable,
     InputRequirement,
+    InputSeries,
+    ModelInputs,
     OutputRepresentation,
     PastKnownVariable,
+    SpatialInputs,
     SpatialInputSpec,
     SpatialRepresentation,
+    StationInputs,
     TargetSpec,
 )
 from .interface import (
@@ -39,6 +44,7 @@
     "AggregationMethod",
     "ArtifactScope",
     "DeterministicData",
+    "DynamicInputs",
     "DynamicInputSpec",
     "EnsembleMode",
     "EpistemicUncertaintyData",
@@ -47,7 +53,9 @@
     "ForecastModel",
     "FutureKnownVariable",
     "InputRequirement",
+    "InputSeries",
     "ModelFailure",
+    "ModelInputs",
     "ModelOutput",
     "ModelResult",
     "ModelSuccess",
@@ -55,8 +63,10 @@
     "PastKnownVariable",
     "QuantileData",
     "RetrainableModel",
+    "SpatialInputs",
     "SpatialInputSpec",
     "SpatialRepresentation",
+    "StationInputs",
     "TargetSpec",
     "TrainedArtifact",
     "TrajectoryData",
diff --git a/forecast_interface/input/__init__.py b/forecast_interface/input/__init__.py
index 9359bf5..e128b6c 100644
--- a/forecast_interface/input/__init__.py
+++ b/forecast_interface/input/__init__.py
@@ -1,6 +1,13 @@
 from forecast_interface.common.aggregation import AggregationMethod
 from forecast_interface.common.resolutions import SpatialRepresentation
 
+from .bundle import (
+    DynamicInputs,
+    InputSeries,
+    ModelInputs,
+    SpatialInputs,
+    StationInputs,
+)
 from .requirement import (
     DynamicInputSpec,
     InputRequirement,
@@ -11,13 +18,18 @@
 
 __all__ = [
     "AggregationMethod",
+    "DynamicInputs",
     "DynamicInputSpec",
     "EnsembleMode",
     "FutureKnownVariable",
     "InputRequirement",
+    "InputSeries",
+    "ModelInputs",
     "OutputRepresentation",
     "PastKnownVariable",
+    "SpatialInputs",
     "SpatialInputSpec",
     "SpatialRepresentation",
+    "StationInputs",
     "TargetSpec",
 ]
diff --git a/forecast_interface/input/_validators.py b/forecast_interface/input/_validators.py
new file mode 100644
index 0000000..25c0d77
--- /dev/null
+++ b/forecast_interface/input/_validators.py
@@ -0,0 +1,47 @@
+import polars as pl
+
+
+DATETIME_DTYPE = pl.Datetime
+NUMERIC_DTYPES = (
+    pl.Decimal,
+    pl.Float16,
+    pl.Float32,
+    pl.Float64,
+    pl.Int8,
+    pl.Int16,
+    pl.Int32,
+    pl.Int64,
+    pl.Int128,
+    pl.UInt8,
+    pl.UInt16,
+    pl.UInt32,
+    pl.UInt64,
+    pl.UInt128,
+)
+
+
+def validate_input_series_dataframe(df: pl.DataFrame) -> None:
+    if "datetime" not in df.columns:
+        raise ValueError("DataFrame must contain a 'datetime' column")
+    if not isinstance(df.schema["datetime"], DATETIME_DTYPE):
+        raise ValueError("'datetime' column must be Datetime")
+    if "issue_datetime" in df.columns:
+        raise ValueError("InputSeries data must not contain 'issue_datetime'")
+
+    datetime_values = df["datetime"]
+    if datetime_values.is_null().any():
+        raise ValueError("'datetime' values must not be null")
+    if df.height == 0:
+        raise ValueError("InputSeries data must contain at least one row")
+
+    value_columns = [col for col in df.columns if col != "datetime"]
+    if not value_columns:
+        raise ValueError("InputSeries data must contain at least one value column")
+    for col in value_columns:
+        if not isinstance(df.schema[col], NUMERIC_DTYPES):
+            raise ValueError(f"Column '{col}' must be numeric")
+
+    if datetime_values.n_unique() != df.height:
+        raise ValueError("'datetime' values must be unique")
+    if not datetime_values.is_sorted():
+        raise ValueError("'datetime' values must be sorted ascending")
diff --git a/forecast_interface/input/bundle.py b/forecast_interface/input/bundle.py
new file mode 100644
index 0000000..84193e9
--- /dev/null
+++ b/forecast_interface/input/bundle.py
@@ -0,0 +1,100 @@
+import datetime
+
+import polars as pl
+from pydantic import BaseModel, ConfigDict, field_validator, model_validator
+
+from forecast_interface.common import SpatialRepresentation, Unit
+
+from ._validators import validate_input_series_dataframe
+
+
+class InputSeries(BaseModel):
+    model_config = ConfigDict(arbitrary_types_allowed=True)
+
+    # Deliberately carries only unit + data; declaration-only fields
+    # (lookback/future_steps/max_nan/aggregation/ensemble_mode) do not appear
+    # at the data leaf.
+    unit: Unit
+    data: pl.DataFrame
+
+    @field_validator("data")
+    @classmethod
+    def _validate_data(cls, v: pl.DataFrame) -> pl.DataFrame:
+        validate_input_series_dataframe(v)
+        return v
+
+
+class DynamicInputs(BaseModel):
+    past_known: dict[str, dict[str, InputSeries]] = {}
+    future_known: dict[str, dict[str, InputSeries]] = {}
+
+    @model_validator(mode="after")
+    def _at_least_one_temporality(self) -> "DynamicInputs":
+        if not self.past_known and not self.future_known:
+            raise ValueError(
+                "at least one of past_known or future_known must be non-empty"
+            )
+        return self
+
+
+class SpatialInputs(BaseModel):
+    data: dict[SpatialRepresentation, DynamicInputs]
+
+    @field_validator("data")
+    @classmethod
+    def _at_least_one_spatial(
+        cls,
+        v: dict[SpatialRepresentation, DynamicInputs],
+    ) -> dict[SpatialRepresentation, DynamicInputs]:
+        if not v:
+            raise ValueError("data must contain at least one spatial representation")
+        return v
+
+
+class StationInputs(BaseModel):
+    dynamic: dict[datetime.timedelta, SpatialInputs]
+    static: dict[str, int | float | str] = {}
+
+    @field_validator("dynamic")
+    @classmethod
+    def _validate_dynamic_time_steps(
+        cls,
+        v: dict[datetime.timedelta, SpatialInputs],
+    ) -> dict[datetime.timedelta, SpatialInputs]:
+        if not v:
+            raise ValueError("dynamic must contain at least one time step")
+        for time_step in v:
+            if time_step.total_seconds() <= 0:
+                raise ValueError("dynamic time step keys must be positive")
+        return v
+
+    @field_validator("static")
+    @classmethod
+    def _non_empty_static_entries(
+        cls,
+        v: dict[str, int | float | str],
+    ) -> dict[str, int | float | str]:
+        for entry in v:
+            if not entry or not entry.strip():
+                raise ValueError("static input names must be non-empty strings")
+        return v
+
+
+class ModelInputs(BaseModel):
+    # Intentional drift from InputRequirement: a station level is added because
+    # data is per-station, and static is a per-station dict rather than the
+    # declaration's top-level set[str].
+    stations: dict[str, StationInputs]
+
+    @field_validator("stations")
+    @classmethod
+    def _validate_stations(
+        cls,
+        v: dict[str, StationInputs],
+    ) -> dict[str, StationInputs]:
+        if not v:
+            raise ValueError("stations must contain at least one station")
+        for station_id in v:
+            if not station_id or not station_id.strip():
+                raise ValueError("station keys must be non-empty strings")
+        return v
diff --git a/pyproject.toml b/pyproject.toml
index 1321ae0..893b5e9 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -1,6 +1,6 @@
 [project]
 name = "forecastinterface"
-version = "0.1.8"
+version = "0.1.9"
 description = "Add your description here"
 readme = "README.md"
 requires-python = ">=3.11"
@@ -13,7 +13,7 @@ dependencies = [
 pythonpath = ["."]
 
 [tool.bumpversion]
-current_version = "0.1.8"
+current_version = "0.1.9"
 commit = false
 tag = false
 allow_dirty = true
diff --git a/tests/test_bundle.py b/tests/test_bundle.py
new file mode 100644
index 0000000..5d5c0c9
--- /dev/null
+++ b/tests/test_bundle.py
@@ -0,0 +1,316 @@
+from datetime import datetime, timedelta
+
+import polars as pl
+import pytest
+from pydantic import ValidationError
+
+from forecast_interface.common import Unit
+from forecast_interface.input import (
+    DynamicInputs,
+    InputSeries,
+    ModelInputs,
+    SpatialInputs,
+    SpatialRepresentation,
+    StationInputs,
+)
+
+DAILY = timedelta(days=1)
+HOURLY = timedelta(hours=1)
+DT1 = datetime(2024, 1, 1)
+DT2 = datetime(2024, 1, 2)
+
+
+def _single_df() -> pl.DataFrame:
+    return pl.DataFrame({"datetime": [DT1], "value": [1.0]})
+
+
+def _series() -> InputSeries:
+    return InputSeries(unit=Unit.M3_PER_S, data=_single_df())
+
+
+def _dynamic_inputs() -> DynamicInputs:
+    return DynamicInputs(past_known={"obs": {"q": _series()}})
+
+
+def _spatial_inputs() -> SpatialInputs:
+    return SpatialInputs(data={SpatialRepresentation.BASIN_AVERAGE: _dynamic_inputs()})
+
+
+def _station_inputs() -> StationInputs:
+    return StationInputs(dynamic={DAILY: _spatial_inputs()})
+
+
+class TestInputSeries:
+    def test_valid_single(self) -> None:
+        series = InputSeries(unit=Unit.M3_PER_S, data=_single_df())
+
+        assert series.unit == Unit.M3_PER_S
+        assert series.data.columns == ["datetime", "value"]
+
+    def test_valid_ensemble(self) -> None:
+        df = pl.DataFrame(
+            {
+                "datetime": [DT1, DT2],
+                "1": [1.0, 2.0],
+                "2": [2, 3],
+                "3": [3.0, 4.0],
+            }
+        )
+
+        series = InputSeries(unit=Unit.MM_PER_DAY, data=df)
+
+        assert series.unit == Unit.MM_PER_DAY
+        assert series.data.columns == ["datetime", "1", "2", "3"]
+
+    def test_unit_required(self) -> None:
+        with pytest.raises(ValidationError, match="unit"):
+            InputSeries(data=_single_df())
+
+    def test_nan_in_value_accepted(self) -> None:
+        df = pl.DataFrame({"datetime": [DT1], "value": [float("nan")]})
+
+        series = InputSeries(unit=Unit.M3_PER_S, data=df)
+
+        assert series.data["value"].is_nan().all()
+
+    def test_missing_datetime_rejected(self) -> None:
+        df = pl.DataFrame({"value": [1.0]})
+
+        with pytest.raises(
+            ValidationError, match="DataFrame must contain a 'datetime' column"
+        ):
+            InputSeries(unit=Unit.M3_PER_S, data=df)
+
+    def test_wrong_datetime_dtype_rejected(self) -> None:
+        df = pl.DataFrame({"datetime": ["2024-01-01"], "value": [1.0]})
+
+        with pytest.raises(ValidationError, match="'datetime' column must be Datetime"):
+            InputSeries(unit=Unit.M3_PER_S, data=df)
+
+    def test_issue_datetime_column_rejected_by_name(self) -> None:
+        df = pl.DataFrame(
+            {
+                "datetime": [DT1],
+                "issue_datetime": [1_704_067_200],
+                "value": [1.0],
+            }
+        )
+
+        with pytest.raises(
+            ValidationError, match="InputSeries data must not contain 'issue_datetime'"
+        ):
+            InputSeries(unit=Unit.M3_PER_S, data=df)
+
+    def test_null_datetime_rejected(self) -> None:
+        df = pl.DataFrame(
+            {
+                "datetime": pl.Series("datetime", [DT1, None], dtype=pl.Datetime),
+                "value": [1.0, 2.0],
+            }
+        )
+
+        with pytest.raises(ValidationError, match="'datetime' values must not be null"):
+            InputSeries(unit=Unit.M3_PER_S, data=df)
+
+    def test_empty_rows_rejected(self) -> None:
+        df = pl.DataFrame(
+            schema={"datetime": pl.Datetime, "value": pl.Float64},
+        )
+
+        with pytest.raises(
+            ValidationError, match="InputSeries data must contain at least one row"
+        ):
+            InputSeries(unit=Unit.M3_PER_S, data=df)
+
+    def test_no_value_column_rejected(self) -> None:
+        df = pl.DataFrame({"datetime": [DT1]})
+
+        with pytest.raises(
+            ValidationError,
+            match="InputSeries data must contain at least one value column",
+        ):
+            InputSeries(unit=Unit.M3_PER_S, data=df)
+
+    def test_stray_string_column_rejected(self) -> None:
+        df = pl.DataFrame({"datetime": [DT1], "value": [1.0], "stray": ["abc"]})
+
+        with pytest.raises(ValidationError, match="Column 'stray' must be numeric"):
+            InputSeries(unit=Unit.M3_PER_S, data=df)
+
+    def test_duplicate_datetimes_rejected(self) -> None:
+        df = pl.DataFrame({"datetime": [DT1, DT1], "value": [1.0, 2.0]})
+
+        with pytest.raises(ValidationError, match="'datetime' values must be unique"):
+            InputSeries(unit=Unit.M3_PER_S, data=df)
+
+    def test_unsorted_datetimes_rejected(self) -> None:
+        df = pl.DataFrame({"datetime": [DT2, DT1], "value": [2.0, 1.0]})
+
+        with pytest.raises(
+            ValidationError, match="'datetime' values must be sorted ascending"
+        ):
+            InputSeries(unit=Unit.M3_PER_S, data=df)
+
+
+class TestDynamicInputs:
+    def test_past_only(self) -> None:
+        inputs = DynamicInputs(past_known={"obs": {"q": _series()}})
+
+        assert "obs" in inputs.past_known
+        assert inputs.future_known == {}
+
+    def test_future_only(self) -> None:
+        inputs = DynamicInputs(future_known={"GFS": {"precip": _series()}})
+
+        assert "GFS" in inputs.future_known
+        assert inputs.past_known == {}
+
+    def test_both(self) -> None:
+        inputs = DynamicInputs(
+            past_known={"obs": {"q": _series()}},
+            future_known={"GFS": {"precip": _series()}},
+        )
+
+        assert "obs" in inputs.past_known
+        assert "GFS" in inputs.future_known
+
+    def test_empty_rejected(self) -> None:
+        with pytest.raises(
+            ValidationError, match="at least one of past_known or future_known"
+        ):
+            DynamicInputs()
+
+    def test_product_variable_nesting_depth(self) -> None:
+        series = _series()
+        inputs = DynamicInputs(past_known={"obs": {"q": series}})
+
+        assert inputs.past_known["obs"]["q"] is series
+
+
+class TestSpatialInputs:
+    def test_multiple_spatial_representations_accepted(self) -> None:
+        inputs = SpatialInputs(
+            data={
+                SpatialRepresentation.BASIN_AVERAGE: _dynamic_inputs(),
+                SpatialRepresentation.GRIDDED: _dynamic_inputs(),
+            }
+        )
+
+        assert SpatialRepresentation.BASIN_AVERAGE in inputs.data
+        assert SpatialRepresentation.GRIDDED in inputs.data
+
+    def test_empty_data_rejected(self) -> None:
+        with pytest.raises(
+            ValidationError,
+            match="data must contain at least one spatial representation",
+        ):
+            SpatialInputs(data={})
+
+
+class TestStationInputs:
+    def test_valid_dynamic(self) -> None:
+        inputs = StationInputs(dynamic={DAILY: _spatial_inputs()})
+
+        assert DAILY in inputs.dynamic
+        assert inputs.static == {}
+
+    def test_per_station_static_accepted(self) -> None:
+        inputs = StationInputs(
+            dynamic={DAILY: _spatial_inputs()},
+            static={
+                "catchment_area": 42.5,
+                "elevation": 1200,
+                "land_cover": "alpine",
+            },
+        )
+
+        assert inputs.static["land_cover"] == "alpine"
+        assert inputs.static["elevation"] == 1200
+
+    def test_empty_dynamic_rejected(self) -> None:
+        with pytest.raises(
+            ValidationError, match="dynamic must contain at least one time step"
+        ):
+            StationInputs(dynamic={})
+
+    def test_zero_dynamic_time_step_rejected(self) -> None:
+        with pytest.raises(
+            ValidationError, match="dynamic time step keys must be positive"
+        ):
+            StationInputs(dynamic={timedelta(0): _spatial_inputs()})
+
+    def test_negative_dynamic_time_step_rejected(self) -> None:
+        with pytest.raises(
+            ValidationError, match="dynamic time step keys must be positive"
+        ):
+            StationInputs(dynamic={timedelta(days=-1): _spatial_inputs()})
+
+    @pytest.mark.parametrize("name", ["", "   "])
+    def test_empty_static_name_rejected(self, name: str) -> None:
+        with pytest.raises(
+            ValidationError, match="static input names must be non-empty strings"
+        ):
+            StationInputs(dynamic={DAILY: _spatial_inputs()}, static={name: 1.0})
+
+
+class TestModelInputs:
+    def test_single_station_bundle(self) -> None:
+        inputs = ModelInputs(stations={"station_1": _station_inputs()})
+
+        assert "station_1" in inputs.stations
+
+    def test_multi_station_group_bundle_uses_same_type(self) -> None:
+        inputs = ModelInputs(
+            stations={
+                "station_1": _station_inputs(),
+                "station_2": StationInputs(dynamic={HOURLY: _spatial_inputs()}),
+            }
+        )
+
+        assert len(inputs.stations) == 2
+        assert HOURLY in inputs.stations["station_2"].dynamic
+
+    def test_empty_stations_rejected(self) -> None:
+        with pytest.raises(
+            ValidationError, match="stations must contain at least one station"
+        ):
+            ModelInputs(stations={})
+
+    @pytest.mark.parametrize("station_id", ["", "   "])
+    def test_empty_station_key_rejected(self, station_id: str) -> None:
+        with pytest.raises(
+            ValidationError, match="station keys must be non-empty strings"
+        ):
+            ModelInputs(stations={station_id: _station_inputs()})
+
+
+class TestExports:
+    def test_input_exports(self) -> None:
+        from forecast_interface.input import (
+            DynamicInputs,
+            InputSeries,
+            ModelInputs,
+            SpatialInputs,
+            StationInputs,
+        )
+
+        assert DynamicInputs is not None
+        assert InputSeries is not None
+        assert ModelInputs is not None
+        assert SpatialInputs is not None
+        assert StationInputs is not None
+
+    def test_top_level_exports(self) -> None:
+        from forecast_interface import (
+            DynamicInputs,
+            InputSeries,
+            ModelInputs,
+            SpatialInputs,
+            StationInputs,
+        )
+
+        assert DynamicInputs is not None
+        assert InputSeries is not None
+        assert ModelInputs is not None
+        assert SpatialInputs is not None
+        assert StationInputs is not None
diff --git a/uv.lock b/uv.lock
index f023149..2f4a420 100644
--- a/uv.lock
+++ b/uv.lock
@@ -85,7 +85,7 @@ wheels = [
 
 [[package]]
 name = "forecastinterface"
-version = "0.1.8"
+version = "0.1.9"
 source = { virtual = "." }
 dependencies = [
     { name = "polars" },

From d42b4a859cadc3e794405a3127ef8b24887a6418 Mon Sep 17 00:00:00 2001
From: Beatrice Marti <mabesa@users.noreply.github.com>
Date: Tue, 16 Jun 2026 23:18:18 +0200
Subject: [PATCH 09/16] feat: split batch hindcast protocol

---
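Reviewer note (below the three-dash line, so not part of the commit message): `docs/model_interface.md` says SAP3 uses the batch path whenever the model implements it and falls back to looping `predict()` otherwise. A minimal sketch of that dispatch follows. Everything in it is an assumption made for illustration: the `*Like` protocols, the argument order, the per-issue list returned by the fallback, and the `run_hindcast` helper are hypothetical and are not the signatures in `protocol.py`.

```python
from datetime import datetime
from typing import Any, Protocol, runtime_checkable


class ForecastModelLike(Protocol):
    # Hypothetical, simplified signature; the real one lives in protocol.py.
    def predict(self, inputs: Any, artifact: Any, issue_datetime: datetime) -> Any: ...


@runtime_checkable
class BatchHindcastLike(Protocol):
    # Hypothetical, simplified signature for the optional batch extension.
    def hindcast(
        self, inputs: Any, artifact: Any, issue_datetimes: list[datetime]
    ) -> Any: ...


def run_hindcast(
    model: ForecastModelLike,
    inputs: Any,
    artifact: Any,
    issue_datetimes: list[datetime],
) -> Any:
    """Use the batch path when the model offers it, otherwise loop predict()."""
    # runtime_checkable protocols only check that the method exists,
    # not that its signature matches.
    if isinstance(model, BatchHindcastLike):
        return model.hindcast(inputs, artifact, issue_datetimes)
    return [model.predict(inputs, artifact, dt) for dt in issue_datetimes]


class LoopOnlyModel:
    """Toy model that only implements predict()."""

    def predict(self, inputs: Any, artifact: Any, issue_datetime: datetime) -> Any:
        return ("predict", issue_datetime)


class BatchModel(LoopOnlyModel):
    """Toy model that additionally implements the batch extension."""

    def hindcast(
        self, inputs: Any, artifact: Any, issue_datetimes: list[datetime]
    ) -> Any:
        return [("batch", dt) for dt in issue_datetimes]


issues = [datetime(2024, 1, 1, 6), datetime(2024, 1, 2, 6)]
looped = run_hindcast(LoopOnlyModel(), None, None, issues)
batched = run_hindcast(BatchModel(), None, None, issues)
```

Keeping the batch method on a separate optional protocol means a model that cannot vectorize simply omits it, and callers need no flag to discover that.
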
 docs/model_interface.md                  |  6 +--
 forecast_interface/__init__.py           |  4 +-
 forecast_interface/interface/__init__.py |  3 +-
 forecast_interface/interface/protocol.py | 34 +++++++++-----
 pyproject.toml                           |  4 +-
 tests/test_interface.py                  | 60 +++++++++++-------------
 uv.lock                                  |  2 +-
 7 files changed, 60 insertions(+), 53 deletions(-)

diff --git a/docs/model_interface.md b/docs/model_interface.md
index c8c2246..1b561d1 100644
--- a/docs/model_interface.md
+++ b/docs/model_interface.md
@@ -7,7 +7,7 @@ There are **three protocols**: the required `ForecastModel`, plus two optional e
 Core functionalities include:
 
 **Forecast Function** `predict()`
-Takes as input the `ModelInput` and a trained artifact, and outputs the `ModelOutput` (Forecast).
+Takes as input the `ModelInputs` and a trained artifact, and outputs the `ModelOutput` (Forecast).
 
 **Hindcast Function** `hindcast()` — *optional, strongly recommended*
 Lives on the optional `BatchHindcastModel` extension. Takes a trained artifact and a **batch** of issue datetimes, and outputs the `ModelOutput` (Hindcast) for all of them in one call. Functionally equivalent to looping `predict()` over historical issue times, but vectorized for efficiency. SAP3 uses the batch path whenever the model implements it and falls back to looping `predict()` otherwise — because SAP3 runs hindcasts routinely (skill evaluation), implementing it is strongly recommended.
@@ -19,7 +19,7 @@ Produce a `TrainedArtifact` from training inputs. See the Training & Lifecycle P
 
 ## Training & Lifecycle Protocol
 
-> **Status: implemented** in `forecast_interface/interface/` (`protocol.py`, `scope.py`, `artifact.py`). The `inputs` and `config` parameters remain **provisional** — typed `Any` until the assembled-input bundle and model-config types are co-designed with SAP3 (doc 014 Task 3, the SAP3→FI input-types PR). Rich `TrainedArtifact` provenance metadata and the group-artifact embedding-key / station-set-mismatch contract are **deferred to Phase 4** (see [`docs/nepal-model-requirements.md`](./nepal-model-requirements.md) §4 and §8).
+> **Status: implemented** in `forecast_interface/interface/` (`protocol.py`, `scope.py`, `artifact.py`). Each `inputs` parameter is typed as the FI-owned `ModelInputs`; only `config` remains **provisional**, typed `Any` until the model-config type is co-designed with SAP3 (Q8). Rich `TrainedArtifact` provenance metadata and the group-artifact embedding-key / station-set-mismatch contract are **deferred to Phase 4** (see [`docs/nepal-model-requirements.md`](./nepal-model-requirements.md) §4 and §8).
 
 ### Scope: `ArtifactScope`
 
@@ -48,7 +48,7 @@ A "national-group" model is a `GROUP` (it is just a group whose station set happ
 | `deserialize_artifact` | `deserialize_artifact(raw: bytes) -> TrainedArtifact` | `ForecastModel` | Inverse of `serialize_artifact`. |
 | `retrain` | `retrain(base_artifact, inputs, *, config, rng) -> TrainedArtifact` | `RetrainableModel` | **Optional.** Warm-start from an existing artifact, for models capable of it. Models that cannot warm-start simply do not implement it; callers fall back to `train`. |
 
-The `inputs` and `config` parameters are typed `Any` (provisional, see status note above).
+Each `inputs` parameter is typed `ModelInputs`; only `config` is typed `Any` (provisional, see status note above).
 
 ### Determinism (dependency injection)
 
diff --git a/forecast_interface/__init__.py b/forecast_interface/__init__.py
index 59c3bb0..0f87f9b 100644
--- a/forecast_interface/__init__.py
+++ b/forecast_interface/__init__.py
@@ -1,4 +1,4 @@
-__version__ = "0.1.9"
+__version__ = "0.1.10"
 
 from .common import AggregationMethod
 from .input import (
@@ -19,6 +19,7 @@
 )
 from .interface import (
     ArtifactScope,
+    BatchHindcastModel,
     FailureCause,
     ForecastModel,
     ModelFailure,
@@ -43,6 +44,7 @@
 __all__ = [
     "AggregationMethod",
     "ArtifactScope",
+    "BatchHindcastModel",
     "DeterministicData",
     "DynamicInputs",
     "DynamicInputSpec",
diff --git a/forecast_interface/interface/__init__.py b/forecast_interface/interface/__init__.py
index f5cbdf3..4b33dd2 100644
--- a/forecast_interface/interface/__init__.py
+++ b/forecast_interface/interface/__init__.py
@@ -1,11 +1,12 @@
 from .artifact import TrainedArtifact
 from .failure import FailureCause
-from .protocol import ForecastModel, RetrainableModel
+from .protocol import BatchHindcastModel, ForecastModel, RetrainableModel
 from .result import ModelFailure, ModelResult, ModelSuccess
 from .scope import ArtifactScope
 
 __all__ = [
     "ArtifactScope",
+    "BatchHindcastModel",
     "FailureCause",
     "ForecastModel",
     "ModelFailure",
diff --git a/forecast_interface/interface/protocol.py b/forecast_interface/interface/protocol.py
index 98191ad..03e9d2d 100644
--- a/forecast_interface/interface/protocol.py
+++ b/forecast_interface/interface/protocol.py
@@ -1,8 +1,9 @@
+from collections.abc import Sequence
 from datetime import datetime
 from random import Random
 from typing import Any, Protocol, runtime_checkable
 
-from forecast_interface.input.requirement import InputRequirement
+from forecast_interface.input import InputRequirement, ModelInputs
 
 from .artifact import TrainedArtifact
 from .result import ModelResult
@@ -17,33 +18,40 @@ def input_requirement(self) -> InputRequirement: ...
     artifact_scope: ArtifactScope
 
     # REQUIRED training contract — cold full rebuild is the baseline.
-    def train(self, inputs: Any, *, config: Any, rng: Random) -> TrainedArtifact: ...
+    def train(
+        self, inputs: ModelInputs, *, config: Any, rng: Random
+    ) -> TrainedArtifact: ...
 
-    # ^ PROVISIONAL: `inputs` is the assembled-input bundle, `config` model params;
-    #   both co-designed with SAP3 (doc 014 Task 3). Typed Any until that PR lands.
+    # ^ PROVISIONAL: `config` model params are co-designed with SAP3 (Q8).
+    #   Typed Any until that contract lands.
 
     def predict(
         self,
         artifact: TrainedArtifact,
         *,
-        inputs: Any,  # PROVISIONAL: assembled-input bundle, co-designed with SAP3.
+        inputs: ModelInputs,
         issue_datetime: datetime,
         rng: Random,
     ) -> ModelResult: ...
 
+    def serialize_artifact(self, artifact: TrainedArtifact) -> bytes: ...
+
+    def deserialize_artifact(self, raw: bytes) -> TrainedArtifact: ...
+
+
+@runtime_checkable
+class BatchHindcastModel(ForecastModel, Protocol):
+    # The plural `issue_datetimes: Sequence[datetime]` is a static contract;
+    # runtime_checkable only verifies member presence.
     def hindcast(
         self,
         artifact: TrainedArtifact,
         *,
-        inputs: Any,  # PROVISIONAL: assembled-input bundle, co-designed with SAP3.
-        issue_datetime: datetime,
+        inputs: ModelInputs,
+        issue_datetimes: Sequence[datetime],
         rng: Random,
     ) -> ModelResult: ...
 
-    def serialize_artifact(self, artifact: TrainedArtifact) -> bytes: ...
-
-    def deserialize_artifact(self, raw: bytes) -> TrainedArtifact: ...
-
 
 @runtime_checkable
 class RetrainableModel(ForecastModel, Protocol):
@@ -52,8 +60,8 @@ class RetrainableModel(ForecastModel, Protocol):
     def retrain(
         self,
         base_artifact: TrainedArtifact,
-        inputs: Any,  # PROVISIONAL: assembled-input bundle, co-designed with SAP3.
+        inputs: ModelInputs,
         *,
-        config: Any,  # PROVISIONAL: model params, co-designed with SAP3.
+        config: Any,  # PROVISIONAL: model params are co-designed with SAP3 (Q8).
         rng: Random,
     ) -> TrainedArtifact: ...
diff --git a/pyproject.toml b/pyproject.toml
index 893b5e9..95a414b 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -1,6 +1,6 @@
 [project]
 name = "forecastinterface"
-version = "0.1.9"
+version = "0.1.10"
 description = "Add your description here"
 readme = "README.md"
 requires-python = ">=3.11"
@@ -13,7 +13,7 @@ dependencies = [
 pythonpath = ["."]
 
 [tool.bumpversion]
-current_version = "0.1.9"
+current_version = "0.1.10"
 commit = false
 tag = false
 allow_dirty = true
diff --git a/tests/test_interface.py b/tests/test_interface.py
index 5ebc226..842d207 100644
--- a/tests/test_interface.py
+++ b/tests/test_interface.py
@@ -1,3 +1,4 @@
+from collections.abc import Sequence
 from datetime import datetime, timedelta
 from random import Random
 from typing import Any
@@ -9,6 +10,7 @@
 from forecast_interface.input import (
     DynamicInputSpec,
     InputRequirement,
+    ModelInputs,
     OutputRepresentation,
     PastKnownVariable,
     SpatialInputSpec,
@@ -17,6 +19,7 @@
 )
 from forecast_interface.interface import (
     ArtifactScope,
+    BatchHindcastModel,
     FailureCause,
     ForecastModel,
     ModelFailure,
@@ -235,41 +238,45 @@ class _ConformingModel:
     def input_requirement(self) -> InputRequirement:
         return _make_input_requirement()
 
-    def train(self, inputs: Any, *, config: Any, rng: Random) -> TrainedArtifact:
+    def train(
+        self, inputs: ModelInputs, *, config: Any, rng: Random
+    ) -> TrainedArtifact:
         return object()
 
     def predict(
         self,
         artifact: TrainedArtifact,
         *,
-        inputs: Any,
+        inputs: ModelInputs,
         issue_datetime: datetime,
         rng: Random,
     ) -> ModelResult:
         return ModelSuccess(output=_make_model_output())
 
+    def serialize_artifact(self, artifact: TrainedArtifact) -> bytes:
+        return b""
+
+    def deserialize_artifact(self, raw: bytes) -> TrainedArtifact:
+        return object()
+
+
+class _BatchHindcastModel(_ConformingModel):
     def hindcast(
         self,
         artifact: TrainedArtifact,
         *,
-        inputs: Any,
-        issue_datetime: datetime,
+        inputs: ModelInputs,
+        issue_datetimes: Sequence[datetime],
         rng: Random,
     ) -> ModelResult:
         return ModelSuccess(output=_make_model_output())
 
-    def serialize_artifact(self, artifact: TrainedArtifact) -> bytes:
-        return b""
-
-    def deserialize_artifact(self, raw: bytes) -> TrainedArtifact:
-        return object()
-
 
 class _RetrainableModel(_ConformingModel):
     def retrain(
         self,
         base_artifact: TrainedArtifact,
-        inputs: Any,
+        inputs: ModelInputs,
         *,
         config: Any,
         rng: Random,
@@ -293,16 +300,7 @@ def predict(
                 self,
                 artifact: TrainedArtifact,
                 *,
-                inputs: Any,
-                issue_datetime: datetime,
-                rng: Random,
-            ) -> ModelResult: ...
-
-            def hindcast(
-                self,
-                artifact: TrainedArtifact,
-                *,
-                inputs: Any,
+                inputs: ModelInputs,
                 issue_datetime: datetime,
                 rng: Random,
             ) -> ModelResult: ...
@@ -322,23 +320,14 @@ def input_requirement(self) -> InputRequirement:
                 return _make_input_requirement()
 
             def train(
-                self, inputs: Any, *, config: Any, rng: Random
+                self, inputs: ModelInputs, *, config: Any, rng: Random
             ) -> TrainedArtifact: ...
 
             def predict(
                 self,
                 artifact: TrainedArtifact,
                 *,
-                inputs: Any,
-                issue_datetime: datetime,
-                rng: Random,
-            ) -> ModelResult: ...
-
-            def hindcast(
-                self,
-                artifact: TrainedArtifact,
-                *,
-                inputs: Any,
+                inputs: ModelInputs,
                 issue_datetime: datetime,
                 rng: Random,
             ) -> ModelResult: ...
@@ -351,8 +340,15 @@ def test_conforming_without_retrain_is_not_retrainable(self) -> None:
         model = _ConformingModel()
         assert isinstance(model, ForecastModel)
         assert not isinstance(model, RetrainableModel)
+        assert not isinstance(model, BatchHindcastModel)
 
     def test_model_with_retrain_satisfies_both(self) -> None:
         model = _RetrainableModel()
         assert isinstance(model, ForecastModel)
         assert isinstance(model, RetrainableModel)
+        assert not isinstance(model, BatchHindcastModel)
+
+    def test_batch_hindcast_model_satisfies_batch_protocol(self) -> None:
+        model = _BatchHindcastModel()
+        assert isinstance(model, BatchHindcastModel)
+        assert isinstance(model, ForecastModel)
diff --git a/uv.lock b/uv.lock
index 2f4a420..2d4958c 100644
--- a/uv.lock
+++ b/uv.lock
@@ -85,7 +85,7 @@ wheels = [
 
 [[package]]
 name = "forecastinterface"
-version = "0.1.9"
+version = "0.1.10"
 source = { virtual = "." }
 dependencies = [
     { name = "polars" },

From ede23197e043c1e498f0f280424e5f4d4e5635da Mon Sep 17 00:00:00 2001
From: Beatrice Marti <mabesa@users.noreply.github.com>
Date: Wed, 17 Jun 2026 09:59:18 +0200
Subject: [PATCH 10/16] chore: vendor the grill-me skill; gitignore local
 Claude settings & memory

Add .claude/skills/grill-me/SKILL.md so the design-interview skill is
shared in-repo. Ignore .claude/settings.local.json and .claude/projects/
(machine-local settings and agent session memory).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 .claude/skills/grill-me/SKILL.md | 10 ++++++++++
 .gitignore                       |  4 ++++
 forecast_interface/__init__.py   |  2 +-
 pyproject.toml                   |  4 ++--
 uv.lock                          |  2 +-
 5 files changed, 18 insertions(+), 4 deletions(-)
 create mode 100644 .claude/skills/grill-me/SKILL.md

diff --git a/.claude/skills/grill-me/SKILL.md b/.claude/skills/grill-me/SKILL.md
new file mode 100644
index 0000000..bd04394
--- /dev/null
+++ b/.claude/skills/grill-me/SKILL.md
@@ -0,0 +1,10 @@
+---
+name: grill-me
+description: Interview the user relentlessly about a plan or design until reaching shared understanding, resolving each branch of the decision tree. Use when the user wants to stress-test a plan or get grilled on their design, or mentions "grill me".
+---
+
+Interview me relentlessly about every aspect of this plan until we reach a shared understanding. Walk down each branch of the design tree, resolving dependencies between decisions one-by-one. For each question, provide your recommended answer.
+
+Ask the questions one at a time.
+
+If a question can be answered by exploring the codebase, explore the codebase instead.
diff --git a/.gitignore b/.gitignore
index 270e748..499767b 100644
--- a/.gitignore
+++ b/.gitignore
@@ -11,3 +11,7 @@ wheels/
 # All __pycache__ directories
 **/__pycache__/
 .DS_Store
+
+# Claude Code: track shared skills, ignore machine-local settings and agent memory
+.claude/settings.local.json
+.claude/projects/
diff --git a/forecast_interface/__init__.py b/forecast_interface/__init__.py
index 0f87f9b..552fc61 100644
--- a/forecast_interface/__init__.py
+++ b/forecast_interface/__init__.py
@@ -1,4 +1,4 @@
-__version__ = "0.1.10"
+__version__ = "0.1.11"
 
 from .common import AggregationMethod
 from .input import (
diff --git a/pyproject.toml b/pyproject.toml
index 95a414b..05a6fcb 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -1,6 +1,6 @@
 [project]
 name = "forecastinterface"
-version = "0.1.10"
+version = "0.1.11"
 description = "Add your description here"
 readme = "README.md"
 requires-python = ">=3.11"
@@ -13,7 +13,7 @@ dependencies = [
 pythonpath = ["."]
 
 [tool.bumpversion]
-current_version = "0.1.10"
+current_version = "0.1.11"
 commit = false
 tag = false
 allow_dirty = true
diff --git a/uv.lock b/uv.lock
index 2d4958c..2a8ba92 100644
--- a/uv.lock
+++ b/uv.lock
@@ -85,7 +85,7 @@ wheels = [
 
 [[package]]
 name = "forecastinterface"
-version = "0.1.10"
+version = "0.1.11"
 source = { virtual = "." }
 dependencies = [
     { name = "polars" },

From 6a743ef4937d20babb018e64cd36114561753acf Mon Sep 17 00:00:00 2001
From: Beatrice Marti <mabesa@users.noreply.github.com>
Date: Wed, 17 Jun 2026 11:10:22 +0200
Subject: [PATCH 11/16] build: add strict mypy gate

---
 .github/workflows/checks.yml   |  35 ++++++
 .pre-commit-config.yaml        |  20 ++++
 CLAUDE.md                      |   3 +-
 forecast_interface/__init__.py |   2 +-
 pyproject.toml                 |  17 ++-
 tests/test_bundle.py           |   2 +-
 tests/test_input.py            |  52 ++++-----
 tests/test_interface.py        |   9 +-
 uv.lock                        | 190 ++++++++++++++++++++++++++++++++-
 9 files changed, 297 insertions(+), 33 deletions(-)
 create mode 100644 .github/workflows/checks.yml
 create mode 100644 .pre-commit-config.yaml

diff --git a/.github/workflows/checks.yml b/.github/workflows/checks.yml
new file mode 100644
index 0000000..f3cb7c5
--- /dev/null
+++ b/.github/workflows/checks.yml
@@ -0,0 +1,35 @@
+name: Checks
+
+on:
+  push:
+  pull_request:
+
+jobs:
+  checks:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Check out repository
+        uses: actions/checkout@v4
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: "3.11"
+
+      - name: Set up uv
+        uses: astral-sh/setup-uv@v6
+
+      - name: Install dependencies
+        run: uv sync --dev
+
+      - name: Check formatting
+        run: uv run ruff format --check
+
+      - name: Lint
+        run: uv run ruff check
+
+      - name: Type check
+        run: uv run mypy
+
+      - name: Test
+        run: uv run pytest -q
diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
new file mode 100644
index 0000000..442ce35
--- /dev/null
+++ b/.pre-commit-config.yaml
@@ -0,0 +1,20 @@
+repos:
+  - repo: local
+    hooks:
+      - id: ruff-format-check
+        name: ruff format --check
+        entry: uv run ruff format --check
+        language: system
+        pass_filenames: false
+
+      - id: ruff-check
+        name: ruff check
+        entry: uv run ruff check
+        language: system
+        pass_filenames: false
+
+      - id: mypy
+        name: mypy
+        entry: uv run mypy
+        language: system
+        pass_filenames: false
diff --git a/CLAUDE.md b/CLAUDE.md
index c8fa209..3ee5b8f 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -120,6 +120,7 @@ value: str | None = None
 - Use `ruff` for both linting and formatting:
   - Format: `uv run ruff format`
   - Lint + fix: `uv run ruff check --fix`
+- Required strict type check: `uv run mypy`. Under strict `no_implicit_reexport`, add new public symbols to `__all__`.
 
 ### Version Bumping (mandatory)
 
@@ -462,4 +463,4 @@ If your test breaks after a refactor that doesn't change behavior, the test was
   - Code is ready for production/publishing
   - Public API requires clarification
 
-**Rationale**: During prototyping and development, verbose documentation significantly bloats context. Write clear, readable code first. Documentation can be added later when actually needed.
\ No newline at end of file
+**Rationale**: During prototyping and development, verbose documentation significantly bloats context. Write clear, readable code first. Documentation can be added later when actually needed.
diff --git a/forecast_interface/__init__.py b/forecast_interface/__init__.py
index 552fc61..880083b 100644
--- a/forecast_interface/__init__.py
+++ b/forecast_interface/__init__.py
@@ -1,4 +1,4 @@
-__version__ = "0.1.11"
+__version__ = "0.1.12"
 
 from .common import AggregationMethod
 from .input import (
diff --git a/pyproject.toml b/pyproject.toml
index 05a6fcb..1a4acf8 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -1,6 +1,6 @@
 [project]
 name = "forecastinterface"
-version = "0.1.11"
+version = "0.1.12"
 description = "Add your description here"
 readme = "README.md"
 requires-python = ">=3.11"
@@ -12,8 +12,20 @@ dependencies = [
 [tool.pytest.ini_options]
 pythonpath = ["."]
 
+[tool.mypy]
+python_version = "3.11"
+strict = true
+plugins = ["pydantic.mypy"]
+show_error_codes = true
+files = ["forecast_interface", "tests"]
+
+[tool.pydantic-mypy]
+init_forbid_extra = true
+init_typed = true
+warn_required_dynamic_aliases = true
+
 [tool.bumpversion]
-current_version = "0.1.11"
+current_version = "0.1.12"
 commit = false
 tag = false
 allow_dirty = true
@@ -31,6 +43,7 @@ replace = '__version__ = "{new_version}"'
 [dependency-groups]
 dev = [
     "bump-my-version>=1.3.0",
+    "mypy>=2.1.0",
     "pytest>=9.0.2",
     "ruff>=0.15.7",
 ]
diff --git a/tests/test_bundle.py b/tests/test_bundle.py
index 5d5c0c9..c32c7a8 100644
--- a/tests/test_bundle.py
+++ b/tests/test_bundle.py
@@ -64,7 +64,7 @@ def test_valid_ensemble(self) -> None:
 
     def test_unit_required(self) -> None:
         with pytest.raises(ValidationError, match="unit"):
-            InputSeries(data=_single_df())
+            InputSeries.model_validate({"data": _single_df()})
 
     def test_nan_in_value_accepted(self) -> None:
         df = pl.DataFrame({"datetime": [DT1], "value": [float("nan")]})
diff --git a/tests/test_input.py b/tests/test_input.py
index 416bc2d..20097b6 100644
--- a/tests/test_input.py
+++ b/tests/test_input.py
@@ -54,7 +54,7 @@ def test_aggregation_override(self) -> None:
 
     def test_unit_required(self) -> None:
         with pytest.raises(ValidationError, match="unit"):
-            PastKnownVariable(lookback=1, max_nan=0)
+            PastKnownVariable.model_validate({"lookback": 1, "max_nan": 0})
 
     def test_lookback_zero(self) -> None:
         with pytest.raises(ValidationError, match="lookback must be positive"):
@@ -97,7 +97,7 @@ def test_aggregation_override(self) -> None:
 
     def test_unit_required(self) -> None:
         with pytest.raises(ValidationError, match="unit"):
-            FutureKnownVariable(future_steps=1, max_nan=0)
+            FutureKnownVariable.model_validate({"future_steps": 1, "max_nan": 0})
 
     def test_ensemble_mode_default_single(self) -> None:
         v = FutureKnownVariable(unit=Unit.M3_PER_S, future_steps=5, max_nan=0)
@@ -147,7 +147,9 @@ def test_empty_representations_raises(self) -> None:
 
     def test_unit_required(self) -> None:
         with pytest.raises(ValidationError, match="unit"):
-            TargetSpec(representations=frozenset({OutputRepresentation.DETERMINISTIC}))
+            TargetSpec.model_validate(
+                {"representations": frozenset({OutputRepresentation.DETERMINISTIC})}
+            )
 
 
 # ---------------------------------------------------------------------------
@@ -441,28 +443,30 @@ def test_empty_static_string_raises(self) -> None:
             )
 
     def test_duplicate_static_deduplicated(self) -> None:
-        req = InputRequirement(
-            targets=_target(),
-            dynamic={
-                DAILY: SpatialInputSpec(
-                    data={
-                        SpatialRepresentation.BASIN_AVERAGE: DynamicInputSpec(
-                            past_known={
-                                "obs": {
-                                    "q": PastKnownVariable(
-                                        unit=Unit.M3_PER_S, lookback=1, max_nan=0
-                                    )
+        req = InputRequirement.model_validate(
+            {
+                "targets": _target(),
+                "dynamic": {
+                    DAILY: SpatialInputSpec(
+                        data={
+                            SpatialRepresentation.BASIN_AVERAGE: DynamicInputSpec(
+                                past_known={
+                                    "obs": {
+                                        "q": PastKnownVariable(
+                                            unit=Unit.M3_PER_S, lookback=1, max_nan=0
+                                        )
+                                    }
                                 }
-                            }
-                        )
-                    }
-                )
-            },
-            static=[
-                "area",
-                "area",
-                "slope",
-            ],  # list with duplicates, Pydantic coerces to set
+                            )
+                        }
+                    )
+                },
+                "static": [
+                    "area",
+                    "area",
+                    "slope",
+                ],  # list with duplicates, Pydantic coerces to set
+            }
         )
         assert len(req.static) == 2
 
diff --git a/tests/test_interface.py b/tests/test_interface.py
index 842d207..0aa9d93 100644
--- a/tests/test_interface.py
+++ b/tests/test_interface.py
@@ -303,9 +303,11 @@ def predict(
                 inputs: ModelInputs,
                 issue_datetime: datetime,
                 rng: Random,
-            ) -> ModelResult: ...
+            ) -> ModelResult:
+                raise NotImplementedError
 
-            def serialize_artifact(self, artifact: TrainedArtifact) -> bytes: ...
+            def serialize_artifact(self, artifact: TrainedArtifact) -> bytes:
+                raise NotImplementedError
 
             def deserialize_artifact(self, raw: bytes) -> TrainedArtifact: ...
 
@@ -330,7 +332,8 @@ def predict(
                 inputs: ModelInputs,
                 issue_datetime: datetime,
                 rng: Random,
-            ) -> ModelResult: ...
+            ) -> ModelResult:
+                raise NotImplementedError
 
             def deserialize_artifact(self, raw: bytes) -> TrainedArtifact: ...
 
diff --git a/uv.lock b/uv.lock
index 2a8ba92..c626d1f 100644
--- a/uv.lock
+++ b/uv.lock
@@ -1,6 +1,10 @@
 version = 1
 revision = 3
 requires-python = ">=3.11"
+resolution-markers = [
+    "python_full_version >= '3.15'",
+    "python_full_version < '3.15'",
+]
 
 [[package]]
 name = "annotated-types"
@@ -24,6 +28,46 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/da/42/e921fccf5015463e32a3cf6ee7f980a6ed0f395ceeaa45060b61d86486c2/anyio-4.13.0-py3-none-any.whl", hash = "sha256:08b310f9e24a9594186fd75b4f73f4a4152069e3853f1ed8bfbf58369f4ad708", size = 114353, upload-time = "2026-03-24T12:59:08.246Z" },
 ]
 
+[[package]]
+name = "ast-serialize"
+version = "0.5.0"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/81/9d/09e27731bd5864a9ce04e3244074e674bb8936bf62b45e0357248717adac/ast_serialize-0.5.0.tar.gz", hash = "sha256:5880091bfe6f4f986f22866375c2e884843e7a0b6343ae41aeea659613d879b6", size = 61157, upload-time = "2026-05-17T17:48:29.429Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/c0/9a/13dde51ba9e15f8b97957ab7cb0120d0e381524d651c6bd630b9c359227f/ast_serialize-0.5.0-cp314-cp314t-macosx_10_12_x86_64.whl", hash = "sha256:8f5c14f169eb0972c0c21bada5358b23d6047c76583b005234f865b11f1fa00a", size = 1183520, upload-time = "2026-05-17T17:47:30.831Z" },
+    { url = "https://files.pythonhosted.org/packages/37/de/5a7f0a9fe68944f536632a5af84676739c7d2582be42deb082634bf3a754/ast_serialize-0.5.0-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:7d1a2de9de5be04652f0ed60738356ef94f66db37924a9499fffe98dc491aa0b", size = 1175779, upload-time = "2026-05-17T17:47:32.551Z" },
+    { url = "https://files.pythonhosted.org/packages/9c/81/0bb853e76e4f6e9a1855d569003c59e19ffac45f7079d91505d1bb212f92/ast_serialize-0.5.0-cp314-cp314t-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:be5173fb66f9b49026d9d5a2ff0fc7c7009077107c0eb285b2d60fdf1fe10bd1", size = 1233750, upload-time = "2026-05-17T17:47:34.731Z" },
+    { url = "https://files.pythonhosted.org/packages/e5/d3/4cf705beeccc08754d0bbda99aefff26110e209b9a07ac8a6b60eec48531/ast_serialize-0.5.0-cp314-cp314t-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:f8015cd071ac1339924ee2b8098c93e00e155f30a16f40ec9816fcf84f4753f6", size = 1235942, upload-time = "2026-05-17T17:47:36.287Z" },
+    { url = "https://files.pythonhosted.org/packages/26/c8/ee097e437ea27dd2b8b227865c875492b585650a5802a22d82b304c8201b/ast_serialize-0.5.0-cp314-cp314t-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:5499e8797edff2a9186aa313ed382c6b422e798e9332d9953badcee6e69a88f2", size = 1442517, upload-time = "2026-05-17T17:47:38.17Z" },
+    { url = "https://files.pythonhosted.org/packages/ff/bd/68063442838f1ba68ec72b5436430bc75b3bb17a1a3c3063f09b0c05ae2b/ast_serialize-0.5.0-cp314-cp314t-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:6848f2a093fb5548751a9a09bff8fcd229e2bbeb0e3331f391b6ae6d26cd9903", size = 1254081, upload-time = "2026-05-17T17:47:39.826Z" },
+    { url = "https://files.pythonhosted.org/packages/50/e2/1e520793bc6a4e4524a6ab022391e827825eaa0c3811828bfdc6852eca26/ast_serialize-0.5.0-cp314-cp314t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:832d4c998e0b091fd60a6d6bceee535483c4d490de9ba85003af835225719261", size = 1259910, upload-time = "2026-05-17T17:47:41.369Z" },
+    { url = "https://files.pythonhosted.org/packages/4e/e1/49b60f467979979cfe6913b43948ff25bca971ad0591d181812f163a988e/ast_serialize-0.5.0-cp314-cp314t-manylinux_2_31_riscv64.whl", hash = "sha256:16db7c62ec0b8efe1d7afd283a388d8f74f2605d56032e5a37747d2de8dba027", size = 1250678, upload-time = "2026-05-17T17:47:43.702Z" },
+    { url = "https://files.pythonhosted.org/packages/74/ba/66ab9555de6275677566f6574e5ef6c29cb185ea866f643bc06f8280a8ee/ast_serialize-0.5.0-cp314-cp314t-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:baf5eb061eb5bccade4128ad42da33787d72f6013809cd1b590376ece8b3c937", size = 1301603, upload-time = "2026-05-17T17:47:46.256Z" },
+    { url = "https://files.pythonhosted.org/packages/66/42/6aca9b9abc710014b2be9059689e5dd1679339e78f567ffb4d255a9e2050/ast_serialize-0.5.0-cp314-cp314t-musllinux_1_2_aarch64.whl", hash = "sha256:104e4a35bd7c124173c41760ef9aaea17ddb3f86c65cb643671d59afbe3ee94c", size = 1410332, upload-time = "2026-05-17T17:47:47.899Z" },
+    { url = "https://files.pythonhosted.org/packages/47/68/2f76594432a22581ecf878b5e75a9b8601c24b2241cf0bbeb1e21fcf370c/ast_serialize-0.5.0-cp314-cp314t-musllinux_1_2_armv7l.whl", hash = "sha256:36be371028fc1675acb38a331bde160dbab7ff907fdf00b67eb6911aa106951b", size = 1509979, upload-time = "2026-05-17T17:47:50.942Z" },
+    { url = "https://files.pythonhosted.org/packages/40/ac/a93c9b58292653f6c595752f677a08e608f903b710594909e9231a389b3b/ast_serialize-0.5.0-cp314-cp314t-musllinux_1_2_i686.whl", hash = "sha256:061ee58bdb52341c8201a6df41182a977736bae3b7ded87ca7176ca25a8a47ab", size = 1505002, upload-time = "2026-05-17T17:47:54.093Z" },
+    { url = "https://files.pythonhosted.org/packages/14/2e/b278f68c497ee2f1d1576cbbef8db5281cd4a5f2db040537592ac9c8862e/ast_serialize-0.5.0-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:b15219e9cdc9f53f6f4cb51c009203507228226148c05c5e8fe451c28b435eb3", size = 1456231, upload-time = "2026-05-17T17:47:56.311Z" },
+    { url = "https://files.pythonhosted.org/packages/0b/43/419be1c566a4c504cd8fd60ce2f84e790f295495c0f327cfaeadf3d51012/ast_serialize-0.5.0-cp314-cp314t-win32.whl", hash = "sha256:842d1c004bb466c7df036f95fabef789570541922b10976b12f5592a69cf0b38", size = 1058668, upload-time = "2026-05-17T17:47:58.305Z" },
+    { url = "https://files.pythonhosted.org/packages/03/6f/c9d4d549295ed05111aeb8853232d1afd9d0a179fddb01eeffbb3a4a6842/ast_serialize-0.5.0-cp314-cp314t-win_amd64.whl", hash = "sha256:b0c06d760909b095cc466356dfccd05a1c7233a6ca191c020dca2c6a6f16c24c", size = 1101075, upload-time = "2026-05-17T17:48:00.35Z" },
+    { url = "https://files.pythonhosted.org/packages/d0/8e/d00c5ab30c58222e07d62956fca86c59d91b9ad32997e633c38b526623a3/ast_serialize-0.5.0-cp314-cp314t-win_arm64.whl", hash = "sha256:787baedb0262cc49e8ce37cc15c00ae818e46a165a3b36f5e21ed174998104cb", size = 1075347, upload-time = "2026-05-17T17:48:01.753Z" },
+    { url = "https://files.pythonhosted.org/packages/e0/9e/dc2530acb3a60dc6e46d65abf27d1d9f86721694757906a148d90a6860de/ast_serialize-0.5.0-cp39-abi3-macosx_10_12_x86_64.whl", hash = "sha256:0668aa9459cfa8c9c49ddd2163ebcf43088ba045ef7492af6fe22e0098303101", size = 1191380, upload-time = "2026-05-17T17:48:03.738Z" },
+    { url = "https://files.pythonhosted.org/packages/26/0a/bd3d18a582f273d6c843d16bb9e22e9e16365ff7991e92f18f798e9f1224/ast_serialize-0.5.0-cp39-abi3-macosx_11_0_arm64.whl", hash = "sha256:bf683d6363edf2b39eed6b6d4fe22d34b6203867a67e27134d9e2a2680c4bc4a", size = 1183879, upload-time = "2026-05-17T17:48:05.463Z" },
+    { url = "https://files.pythonhosted.org/packages/40/ae/1f919100f8620887af58fcc381c61a1f218cdf89c6e155f87b213e61010a/ast_serialize-0.5.0-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:9cc22cf0c9be65e71cf88fda130af60d61eb4a79370ad4cfe7900d48a4aa2211", size = 1244529, upload-time = "2026-05-17T17:48:07.008Z" },
+    { url = "https://files.pythonhosted.org/packages/c6/ca/6376559dcce707cdbc1d0d9a13c8d3baaaa501e949ce0ebdc4230cd881aa/ast_serialize-0.5.0-cp39-abi3-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:f66173891548c9f2726bf27957b41cabce12fa679dc6da505ddbde4d4b3b31cf", size = 1240560, upload-time = "2026-05-17T17:48:08.46Z" },
+    { url = "https://files.pythonhosted.org/packages/35/b2/a620e206b5aeb7efbf2710336df57d457cffbb3991076bbcc1147ef9abd4/ast_serialize-0.5.0-cp39-abi3-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:e42d729ef2be96a14efbad355093284739e3670ece3e534f82cc8832790911d9", size = 1451172, upload-time = "2026-05-17T17:48:09.922Z" },
+    { url = "https://files.pythonhosted.org/packages/fa/e0/4ad5c04c24a40481b2935ce9a0ccdb6023dc8b667167d06ae530cc3512f2/ast_serialize-0.5.0-cp39-abi3-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:b725026bafa801dbd7310eb13a75f0a2e370e7e51b2cb225f9d21fcfadf919ee", size = 1265072, upload-time = "2026-05-17T17:48:11.469Z" },
+    { url = "https://files.pythonhosted.org/packages/b2/71/4d1d479aa56d0101c40e17720c3d6ac2af7269ea0487a80b18e7bfd1a5b7/ast_serialize-0.5.0-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:b54f60c1d78767a53b67eaa663f0dfac3afe606aa07f1301572f588b73d64809", size = 1270488, upload-time = "2026-05-17T17:48:13.575Z" },
+    { url = "https://files.pythonhosted.org/packages/6d/4f/0de1bbe06f6edef9fde4ed12ca8e7b3ec7e6e2bd4e672c5af487f7957665/ast_serialize-0.5.0-cp39-abi3-manylinux_2_31_riscv64.whl", hash = "sha256:27d51654fc240a1e87e742d353d98eb45b75f62f129086b3596ab53df2ac2a43", size = 1260702, upload-time = "2026-05-17T17:48:15.141Z" },
+    { url = "https://files.pythonhosted.org/packages/75/61/e00872439cfdddcc3c1b6cdaa6e5d904ba8e26a18807c67c4e14409d0ca8/ast_serialize-0.5.0-cp39-abi3-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:2782c36237c46dd1674542f2109740ea5ea485a169bf1431939ada0434e17934", size = 1311182, upload-time = "2026-05-17T17:48:16.779Z" },
+    { url = "https://files.pythonhosted.org/packages/76/8e/699a5b955f7926956c95e9e1d74132acad73c2fe7a426f94da89123c20aa/ast_serialize-0.5.0-cp39-abi3-musllinux_1_2_aarch64.whl", hash = "sha256:1943db345233cc7194a470f13afa9c59772c0b123dea0c9414c4d4ca54369759", size = 1421410, upload-time = "2026-05-17T17:48:18.527Z" },
+    { url = "https://files.pythonhosted.org/packages/a9/ae/d5b7626874478997adc7a29ab28accf21e596fb590c944290401dfd0b29e/ast_serialize-0.5.0-cp39-abi3-musllinux_1_2_armv7l.whl", hash = "sha256:df1c00022cbbcb064bfaa505aa9c9295362443ce5dacb459d1331d3da353f887", size = 1516587, upload-time = "2026-05-17T17:48:20.133Z" },
+    { url = "https://files.pythonhosted.org/packages/0c/ce/b59e02a82d9c4244d64cde502e0b00e83e38816abe19155ceb5437402c7f/ast_serialize-0.5.0-cp39-abi3-musllinux_1_2_i686.whl", hash = "sha256:cae65289fc456fde04af979a2be09302ef5d8ab92ef23e596d6746dc267ada27", size = 1515171, upload-time = "2026-05-17T17:48:21.921Z" },
+    { url = "https://files.pythonhosted.org/packages/8b/38/d8d90042747d05aa08d4efcf1c99035a5f670a6bf4c214d31644392afbca/ast_serialize-0.5.0-cp39-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:239a4c354e8d676e9d94631d1d4a64edc6b266f86ff3a5a80aedd344f342c01d", size = 1464668, upload-time = "2026-05-17T17:48:23.544Z" },
+    { url = "https://files.pythonhosted.org/packages/dd/51/5b840c4df7334104cecffa28f23904fe81ca89ca223d2450e288de39fd3c/ast_serialize-0.5.0-cp39-abi3-win32.whl", hash = "sha256:143a4ef63285a075871908fda3672dc21864b83a8ec3ee12304aa3e4c5387b9a", size = 1068311, upload-time = "2026-05-17T17:48:25.027Z" },
+    { url = "https://files.pythonhosted.org/packages/41/11/ca5672c7d491825bc4cd6702dea106a6b60d928707712ec257c7833ae476/ast_serialize-0.5.0-cp39-abi3-win_amd64.whl", hash = "sha256:cf25572c526add400f26a4750dc6ce0c3bb93fc1f75e7ae0cad4ce4f2cd5c590", size = 1108931, upload-time = "2026-05-17T17:48:26.591Z" },
+    { url = "https://files.pythonhosted.org/packages/45/19/cc8bd127d28a43da249aa955cfd164cf8fd534e79e42cea96c4854d72fd0/ast_serialize-0.5.0-cp39-abi3-win_arm64.whl", hash = "sha256:92a31c9c20d25a076edaeec76b128a3535d74a24f340b9a8a7e96c9b86dc9642", size = 1081181, upload-time = "2026-05-17T17:48:28.122Z" },
+]
+
 [[package]]
 name = "bracex"
 version = "2.6"
@@ -85,7 +129,7 @@ wheels = [
 
 [[package]]
 name = "forecastinterface"
-version = "0.1.11"
+version = "0.1.12"
 source = { virtual = "." }
 dependencies = [
     { name = "polars" },
@@ -95,6 +139,7 @@ dependencies = [
 [package.dev-dependencies]
 dev = [
     { name = "bump-my-version" },
+    { name = "mypy" },
     { name = "pytest" },
     { name = "ruff" },
 ]
@@ -108,6 +153,7 @@ requires-dist = [
 [package.metadata.requires-dev]
 dev = [
     { name = "bump-my-version", specifier = ">=1.3.0" },
+    { name = "mypy", specifier = ">=2.1.0" },
     { name = "pytest", specifier = ">=9.0.2" },
     { name = "ruff", specifier = ">=0.15.7" },
 ]
@@ -167,6 +213,79 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/cb/b1/3846dd7f199d53cb17f49cba7e651e9ce294d8497c8c150530ed11865bb8/iniconfig-2.3.0-py3-none-any.whl", hash = "sha256:f631c04d2c48c52b84d0d0549c99ff3859c98df65b3101406327ecc7d53fbf12", size = 7484, upload-time = "2025-10-18T21:55:41.639Z" },
 ]
 
+[[package]]
+name = "librt"
+version = "0.11.0"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/40/08/9e7f6b5d2b5bed6ad055cdd5925f192bb403a51280f86b56554d9d0699a2/librt-0.11.0.tar.gz", hash = "sha256:075dc3ef4458a278e0195cbf6ac9d38808d9b906c5a6c7f7f79c3888276a3fb1", size = 200139, upload-time = "2026-05-10T18:17:25.138Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/fe/87/2bf31fe17587b29e3f93ec31421e2b1e1c3e349b8bf6c7c313dbad1d5340/librt-0.11.0-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:93d95bd45b7d58343d8b90d904450a545144eec19a002511163426f8ab1fae29", size = 141092, upload-time = "2026-05-10T18:15:34.795Z" },
+    { url = "https://files.pythonhosted.org/packages/cf/08/5c5bf772920b7ebac6e32bc91a643e0ab3870199c0b542356d3baa83970a/librt-0.11.0-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:4ee278c769a713638cdacd4c0436d72156e75df3ebc0166ab2b9dc43acc386c9", size = 142035, upload-time = "2026-05-10T18:15:36.242Z" },
+    { url = "https://files.pythonhosted.org/packages/06/20/662a03d254e5b000d838e8b345d83303ddb768c080fd488e40634c0fa66b/librt-0.11.0-cp311-cp311-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:f230cb1cbc9faaa616f9a678f530ebcf186e414b6bcbd88b960e4ba1b92428d5", size = 475022, upload-time = "2026-05-10T18:15:37.56Z" },
+    { url = "https://files.pythonhosted.org/packages/de/f3/aa81523e45184c6ec23dc7f63263362ec55f80a09d424c012359ecbe7e35/librt-0.11.0-cp311-cp311-manylinux2014_i686.manylinux_2_17_i686.manylinux_2_28_i686.whl", hash = "sha256:5d63c855d86938d9de93e265c9bd8c705b51ec494de5738340ee93767a686e4b", size = 467273, upload-time = "2026-05-10T18:15:39.182Z" },
+    { url = "https://files.pythonhosted.org/packages/6b/6f/59c74b560ca8853834d5501d589c8a2519f4184f273a085ffd0f37a1cc47/librt-0.11.0-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:993f028be9e96a08d31df3479ac80d99be374d17f3b78e4796b3fd3c913d4e89", size = 497083, upload-time = "2026-05-10T18:15:40.634Z" },
+    { url = "https://files.pythonhosted.org/packages/fe/7b/5aa4d2c9600a719401160bf7055417df0b2a47439b9d88286ce45e56b65f/librt-0.11.0-cp311-cp311-manylinux_2_34_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:258d73a0aa66a055e65b2e4d1b8cdb23b9d132c5bb915d9547d804fcaed116cc", size = 489139, upload-time = "2026-05-10T18:15:41.934Z" },
+    { url = "https://files.pythonhosted.org/packages/d6/31/9143803d7da6856a69153785768c4936864430eec0fd9461c3ea527d9922/librt-0.11.0-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:0827efe7854718f04aaddf6496e96960a956e676fe1d0f04eb41511fd8ad06d5", size = 508442, upload-time = "2026-05-10T18:15:43.206Z" },
+    { url = "https://files.pythonhosted.org/packages/2f/5a/bce08184488426bda4ccc2c4964ac048c8f68ae89bd7120082eef4233cfd/librt-0.11.0-cp311-cp311-musllinux_1_2_i686.whl", hash = "sha256:7753e57d6e12d019c0d8786f1c09c709f4c3fcc57c3887b24e36e6c06ec938b7", size = 514230, upload-time = "2026-05-10T18:15:44.761Z" },
+    { url = "https://files.pythonhosted.org/packages/89/8c/bb5e213d254b7505a0e658da199d8ab719086632ce09eef311ab27976523/librt-0.11.0-cp311-cp311-musllinux_1_2_riscv64.whl", hash = "sha256:11bd19822431cc21af9f27374e7ae2e58103c7d98bda823536a6c47f6bb2bb3d", size = 494231, upload-time = "2026-05-10T18:15:46.308Z" },
+    { url = "https://files.pythonhosted.org/packages/9d/fb/541cdad5b1ab1300398c74c4c9a497b88e5074c21b1244c8f49731d3a284/librt-0.11.0-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:22bdf239b219d3993761a148ffa134b19e52e9989c84f845d5d7b71d70a17412", size = 537585, upload-time = "2026-05-10T18:15:47.629Z" },
+    { url = "https://files.pythonhosted.org/packages/8f/f2/464bb69295c320cb06bddb4f14a4ec67934ee14b2bffb12b19fb7ab287ba/librt-0.11.0-cp311-cp311-win32.whl", hash = "sha256:46c60b61e308eb535fbd6fa622b1ee1bb2815691c1ad9c98bf7b84952ec3bc8d", size = 100509, upload-time = "2026-05-10T18:15:49.157Z" },
+    { url = "https://files.pythonhosted.org/packages/6d/e7/a17ee1788f9e4fbf548c19f4afa07c92089b9e24fef6cb2410863781ef4c/librt-0.11.0-cp311-cp311-win_amd64.whl", hash = "sha256:902e546ff044f579ff1c953ff5fce97b636fe9e3943996b2177710c6ef076f73", size = 118628, upload-time = "2026-05-10T18:15:50.345Z" },
+    { url = "https://files.pythonhosted.org/packages/cc/c7/6c766214f9f9903bcfcfbef97d807af8d8f5aa3502d247858ab17582d212/librt-0.11.0-cp311-cp311-win_arm64.whl", hash = "sha256:65ac3bc20f78aa0ee5ae84baa68917f89fef4af63e941084dd019a0d0e749f0c", size = 103122, upload-time = "2026-05-10T18:15:52.068Z" },
+    { url = "https://files.pythonhosted.org/packages/8b/d0/07c77e067f0838949b43bd89232c29d72efebb9d2801a9750184eb706b71/librt-0.11.0-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:b87504f1690a23b9a2cca841191a04f83895d4fc2dd04df91d82b1a04ca2ad46", size = 144147, upload-time = "2026-05-10T18:15:53.227Z" },
+    { url = "https://files.pythonhosted.org/packages/7a/24/8493538fa4f62f982686398a5b8f68008138a75086abdea19ade64bf4255/librt-0.11.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:40071fc5fe0ce8daa6de616702314a01e1250711682b0523d6ab8d4525910cb3", size = 143614, upload-time = "2026-05-10T18:15:54.657Z" },
+    { url = "https://files.pythonhosted.org/packages/ff/1e/f8bad050810d9171f34a1648ed910e56814c2ba61639f2bd53c6377ae24b/librt-0.11.0-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:137e79445c896a0ea7b265f52d23954e05b64222ee1af69e2cb34219067cbb67", size = 485538, upload-time = "2026-05-10T18:15:56.117Z" },
+    { url = "https://files.pythonhosted.org/packages/c0/fe/3594ebfbaf03084ba4b120c9ba5c3183fd938a48725e9bbe6ff0a5159ad8/librt-0.11.0-cp312-cp312-manylinux2014_i686.manylinux_2_17_i686.manylinux_2_28_i686.whl", hash = "sha256:cca6644054e78746d8d4ef238681f9c34ff8b584fe6b988ecebb8db3b15e622a", size = 479623, upload-time = "2026-05-10T18:15:57.544Z" },
+    { url = "https://files.pythonhosted.org/packages/b0/da/5d1876984b3746c85dbd219dbfcb73c85f54ee263fd32e5b2a632ec14571/librt-0.11.0-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:d5b0eea49f5562861ee8d757a32ef7d559c1d35be2aaaa1ec28941d74c9ffc8a", size = 513082, upload-time = "2026-05-10T18:15:58.805Z" },
+    { url = "https://files.pythonhosted.org/packages/19/6e/55bdf5d5ca00c3e18430690bf2c953d8d3ffd3c337418173d33dec985dc9/librt-0.11.0-cp312-cp312-manylinux_2_34_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:0d1029d7e1ae1a7e647ed6fb5df8c4ce2dffefb7a9f5fd1376a4554d96dac09f", size = 508105, upload-time = "2026-05-10T18:16:00.2Z" },
+    { url = "https://files.pythonhosted.org/packages/07/10/f1f23a7c595ee90ece4d35c851e5d104b1311a887ed1b4ac4c35bbd13da8/librt-0.11.0-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:bc3ce6b33c5828d9e80592011a5c584cb2ce86edbc4088405f70da47dc1d1b3b", size = 522268, upload-time = "2026-05-10T18:16:01.708Z" },
+    { url = "https://files.pythonhosted.org/packages/b6/02/5720f5697a7f54b78b3aefbe20df3a48cedcff1276618c4aa481177942ed/librt-0.11.0-cp312-cp312-musllinux_1_2_i686.whl", hash = "sha256:936c5995f3514a42111f20099397d8177c79b4d7e70961e396c6f5a0a3566766", size = 527348, upload-time = "2026-05-10T18:16:03.496Z" },
+    { url = "https://files.pythonhosted.org/packages/50/db/b4a47c6f91db4ff76348a0b3dd0cc65e090a078b765a810a62ff9434c3d3/librt-0.11.0-cp312-cp312-musllinux_1_2_riscv64.whl", hash = "sha256:9bc0ca6ad9381cbe8e4aa6e5726e4c80c78115a6e9723c599ed1d73e092bc49d", size = 516294, upload-time = "2026-05-10T18:16:05.173Z" },
+    { url = "https://files.pythonhosted.org/packages/9e/58/9384b2f4eb1ed1d273d40948a7c5c4b2360213b402ef3be4641c06299f9c/librt-0.11.0-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:070aa8c26c0a74774317a72df8851facc7f0f012a5b406557ac56992d92e1ec8", size = 553608, upload-time = "2026-05-10T18:16:06.839Z" },
+    { url = "https://files.pythonhosted.org/packages/21/7b/5aa8848a7c6a9278c79375146da1812e695754ceec5f005e6043461a7315/librt-0.11.0-cp312-cp312-win32.whl", hash = "sha256:6bf14feb84b05ae945277395451998c89c54d0def4070eb5c08de544930b245a", size = 101879, upload-time = "2026-05-10T18:16:08.103Z" },
+    { url = "https://files.pythonhosted.org/packages/37/33/8a745436944947575b584231750a41417de1a38cf6a2e9251d1065651c09/librt-0.11.0-cp312-cp312-win_amd64.whl", hash = "sha256:75672f0bc524ede266287d532d7923dbce94c7514ad07627bac3d0c6d92cc4d9", size = 119831, upload-time = "2026-05-10T18:16:09.174Z" },
+    { url = "https://files.pythonhosted.org/packages/59/67/a6739ac96e28b7855808bdb0370e250606104a859750d209e5a0716fe7ab/librt-0.11.0-cp312-cp312-win_arm64.whl", hash = "sha256:2f10cf143e4a9bb0f4f5af568a00df94a2d69ef41c2579584454bb0fe5cc642c", size = 103470, upload-time = "2026-05-10T18:16:10.369Z" },
+    { url = "https://files.pythonhosted.org/packages/82/61/e59168d4d0bf2bf90f4f0caf7a001bfc60254c3af4586013b04dc3ef517b/librt-0.11.0-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:78dc31f7fdfe9c9d0eb0e8f42d139db230e826415bbcabd9f0e9faaaee909894", size = 144119, upload-time = "2026-05-10T18:16:11.771Z" },
+    { url = "https://files.pythonhosted.org/packages/61/fd/caa1d60b12f7dd79ccea23054e06eeaebe266a5f52c40a6b651069200ce5/librt-0.11.0-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:fa475675db22290c3158e1d42326d0f5a65f04f44a0e68c3630a25b53560fb9c", size = 143565, upload-time = "2026-05-10T18:16:13.334Z" },
+    { url = "https://files.pythonhosted.org/packages/b8/a9/dc744f5c2b4978d48db970be29f22716d3413d28b14ad99740817315cf2c/librt-0.11.0-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:621db29691044bdeda22e789e482e1b0f3a985d90e3426c9c6d17606416205ea", size = 485395, upload-time = "2026-05-10T18:16:14.729Z" },
+    { url = "https://files.pythonhosted.org/packages/8f/21/7f8e97a1e4dae952a5a95948f6f8507a173bc1e669f54340bba6ca1ca31b/librt-0.11.0-cp313-cp313-manylinux2014_i686.manylinux_2_17_i686.manylinux_2_28_i686.whl", hash = "sha256:a9010e2ed5b3a9e158c5fd966b3ab7e834bb3d3aacc8f66c91dd4b57a3799230", size = 479383, upload-time = "2026-05-10T18:16:16.321Z" },
+    { url = "https://files.pythonhosted.org/packages/a6/6d/d8ee9c114bebf2c50e29ec2aa940826fccb62a645c3e4c18760987d0e16d/librt-0.11.0-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:7c39513d8b7477a2e1ed8c43fc21c524e8d5a0f8d4e8b7b074dbdbe7820a08e2", size = 513010, upload-time = "2026-05-10T18:16:17.647Z" },
+    { url = "https://files.pythonhosted.org/packages/f0/43/0b5708af2bd30a46400e72ba6bdaa8f066f15fb9a688527e34220e8d6c06/librt-0.11.0-cp313-cp313-manylinux_2_34_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:7aef3cf1d5af86e770ab04bfd993dfc4ae8b8c17f66fb77dd4a7d50de7bbb1a3", size = 508433, upload-time = "2026-05-10T18:16:19.309Z" },
+    { url = "https://files.pythonhosted.org/packages/4a/50/356187247d09013490481033183b3532b58acf8028bcb34b2b56a375c9b2/librt-0.11.0-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:557183ddc36babe46b27dd60facbd5adb4492181a5be887587d57cda6e092f21", size = 522595, upload-time = "2026-05-10T18:16:20.642Z" },
+    { url = "https://files.pythonhosted.org/packages/40/e7/c6ac4240899c7f3248079d5a9900debe0dadb3fdeaf856684c987105ba47/librt-0.11.0-cp313-cp313-musllinux_1_2_i686.whl", hash = "sha256:83d3e1f72bd42f6c5c0b7daec530c3f829bd02db42c70b8ddf0c2d90a2459930", size = 527255, upload-time = "2026-05-10T18:16:22.352Z" },
+    { url = "https://files.pythonhosted.org/packages/eb/b5/a81322dbeedeeaf9c1ee6f001734d28a09d8383ac9e6779bc24bbd0743c6/librt-0.11.0-cp313-cp313-musllinux_1_2_riscv64.whl", hash = "sha256:4ce1f21fbe589bc1afd7872dece84fb0e1144f794a288e58a10d2c54a55c43be", size = 516847, upload-time = "2026-05-10T18:16:23.627Z" },
+    { url = "https://files.pythonhosted.org/packages/ae/66/6e6323787d592b55204a42595ff1102da5115601b53a7e9ddebc889a6da5/librt-0.11.0-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:970b09f7044ea2b64c9da42fd3d335666518cfd1c6e8a182c95da73d0214b41e", size = 553920, upload-time = "2026-05-10T18:16:25.025Z" },
+    { url = "https://files.pythonhosted.org/packages/9c/21/623f8ca230857102066d9ca8c6c1734995908c4d0d1bee7bb2ef0021cb33/librt-0.11.0-cp313-cp313-win32.whl", hash = "sha256:78fddc31cd4d3caa897ad5d31f856b1faadc9474021ad6cb182b9018793e254e", size = 101898, upload-time = "2026-05-10T18:16:26.649Z" },
+    { url = "https://files.pythonhosted.org/packages/b3/1d/b4ebd44dd723f768469007515cb92251e0ae286c94c140f374801140fa74/librt-0.11.0-cp313-cp313-win_amd64.whl", hash = "sha256:8ca8aa88751a775870b764e93bad5135385f563cb8dcee399abf034ea4d3cb47", size = 119812, upload-time = "2026-05-10T18:16:27.859Z" },
+    { url = "https://files.pythonhosted.org/packages/3b/e4/b2f4ca7965ca373b491cdb4bc25cdb30c1649ca81a8782056a83850292a9/librt-0.11.0-cp313-cp313-win_arm64.whl", hash = "sha256:96f044bb325fd9cf1a723015638c219e9143f0dfbc0ca54c565df2b7fc748b44", size = 103448, upload-time = "2026-05-10T18:16:29.066Z" },
+    { url = "https://files.pythonhosted.org/packages/29/eb/dbce197da4e227779e56b5735f2decc3eb36e55a1cdbf1bd65d6639d76c1/librt-0.11.0-cp314-cp314-macosx_10_13_x86_64.whl", hash = "sha256:4a017a95e5837dc15a8c5661d60e05daa96b90908b1aa6b7acdf443cd25c8ebd", size = 143345, upload-time = "2026-05-10T18:16:30.674Z" },
+    { url = "https://files.pythonhosted.org/packages/76/a3/254bebd0c11c8ba684018efb8006ff22e466abce445215cca6c778e7d9de/librt-0.11.0-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:b1ecbd9819deccc39b7542bf4d2a740d8a620694d39989e58661d3763458f8d4", size = 143131, upload-time = "2026-05-10T18:16:32.037Z" },
+    { url = "https://files.pythonhosted.org/packages/f1/3f/f77d6122d21ac7bf6ae8a7dfced1bd2a7ac545d3273ebdcaf8042f6d619f/librt-0.11.0-cp314-cp314-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:7da327dacd7be8f8ec36547373550744a3cc0e536d54665cd83f8bcd961200e8", size = 477024, upload-time = "2026-05-10T18:16:33.493Z" },
+    { url = "https://files.pythonhosted.org/packages/ac/0a/2c996dadebaa7d9bbbd43ef2d4f3e66b6da545f838a41694ef6172cebec8/librt-0.11.0-cp314-cp314-manylinux2014_i686.manylinux_2_17_i686.manylinux_2_28_i686.whl", hash = "sha256:0dc56b1f8d06e60db362cc3fdae206681817f86ce4725d34511473487f12a34b", size = 474221, upload-time = "2026-05-10T18:16:34.864Z" },
+    { url = "https://files.pythonhosted.org/packages/0a/7e/f5d92af8486b8272c23b3e686b46ff72d89c8169585eb61eef01a2ac7147/librt-0.11.0-cp314-cp314-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:05fb8fb2ab90e21c8d12ea240d744ad514da9baf381ebfa70d91d20d21713175", size = 505174, upload-time = "2026-05-10T18:16:36.705Z" },
+    { url = "https://files.pythonhosted.org/packages/af/1a/cb0734fe86398eb33193ab753b7326255c74cac5eb09e76b9b16536e7adb/librt-0.11.0-cp314-cp314-manylinux_2_34_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:cae74872be221df4374d10fec61f93ed1513b9546ea84f2c0bf73ab3e9bd0b03", size = 497216, upload-time = "2026-05-10T18:16:38.418Z" },
+    { url = "https://files.pythonhosted.org/packages/18/06/094820f91558b66e29943c0ec41c9914f460f48dd51fc503c3101e10842d/librt-0.11.0-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:32bcc918c0148eb7e3d57385125bac7e5f9e4359d05f07448b09f6f778c2f31c", size = 513921, upload-time = "2026-05-10T18:16:39.848Z" },
+    { url = "https://files.pythonhosted.org/packages/0b/c2/00de9018871a282f530cacb457d5ec0428f6ac7e6fedde9aff7468d9fb04/librt-0.11.0-cp314-cp314-musllinux_1_2_i686.whl", hash = "sha256:f9743fc99135d5f78d2454435615f6dec0473ca507c26ce9d92b10b562a280d3", size = 520850, upload-time = "2026-05-10T18:16:41.471Z" },
+    { url = "https://files.pythonhosted.org/packages/51/9d/64631832348fd1834fb3a61b996434edddaaf25a31d03b0a76273159d2cf/librt-0.11.0-cp314-cp314-musllinux_1_2_riscv64.whl", hash = "sha256:5ba067f4aadae8fda802d91d2124c90c42195ff32d9161d3549e6d05cfe26f96", size = 504237, upload-time = "2026-05-10T18:16:43.15Z" },
+    { url = "https://files.pythonhosted.org/packages/a5/ec/ae5525eb16edc827a044e7bb8777a455ff95d4bca9379e7e6bddd7383647/librt-0.11.0-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:de3bf945454d032f9e390b85c4072e0a0570bf825421c8be0e71209fa65e1abe", size = 546261, upload-time = "2026-05-10T18:16:44.408Z" },
+    { url = "https://files.pythonhosted.org/packages/5a/09/adce371f27ca039411da9659f7430fcc2ba6cd0c7b3e4467a0f091be7fa9/librt-0.11.0-cp314-cp314-win32.whl", hash = "sha256:d2277a05f6dcb9fd13db9566aac4fabd68c3ea1ea46ee5567d4eef8efa495a2f", size = 96965, upload-time = "2026-05-10T18:16:46.039Z" },
+    { url = "https://files.pythonhosted.org/packages/d6/ee/8ac720d98548f173c7ce2e632a7ca94673f74cacd5c8162a84af5b35958a/librt-0.11.0-cp314-cp314-win_amd64.whl", hash = "sha256:ab73e8db5e3f564d812c1f5c3a175930a5f9bc96ccb5e3b22a34d7858b401cf7", size = 115151, upload-time = "2026-05-10T18:16:47.133Z" },
+    { url = "https://files.pythonhosted.org/packages/94/20/c900cf14efeb09b6bef2b2dff20779f73464b97fd58d1c6bccc379588ae3/librt-0.11.0-cp314-cp314-win_arm64.whl", hash = "sha256:aea3caa317752e3a466fa8af45d91ee0ea8c7fdd96e42b0a8dd9b76a7931eba1", size = 98850, upload-time = "2026-05-10T18:16:48.597Z" },
+    { url = "https://files.pythonhosted.org/packages/0c/71/944bfe4b64e12abffcd3c15e1cce07f72f3d55655083786285f4dedeb532/librt-0.11.0-cp314-cp314t-macosx_10_13_x86_64.whl", hash = "sha256:d1b36540d7aaf9b9101b3a6f376c8d8e9f7a9aec93ed05918f2c69d493ffef72", size = 151138, upload-time = "2026-05-10T18:16:49.839Z" },
+    { url = "https://files.pythonhosted.org/packages/b6/10/99e64a5c86989357fda078c8143c533389585f6473b7439172dd8f3b3b2d/librt-0.11.0-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:efbb343ab2ce3540f4ecbe6315d677ed70f37cd9a72b1e58066c918ca83acbaa", size = 151976, upload-time = "2026-05-10T18:16:51.062Z" },
+    { url = "https://files.pythonhosted.org/packages/21/31/5072ad880946d83e5ea4147d6d018c78eefce85b77819b19bdd0ee229435/librt-0.11.0-cp314-cp314t-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:aa0dd688aab3f7914d3e6e5e3554978e0383312fb8e771d84be008a35b9ee548", size = 557927, upload-time = "2026-05-10T18:16:52.632Z" },
+    { url = "https://files.pythonhosted.org/packages/5e/8d/70b5fb7cfbab60edbe7381614ab985da58e144fbf465c86d44c95f43cdca/librt-0.11.0-cp314-cp314t-manylinux2014_i686.manylinux_2_17_i686.manylinux_2_28_i686.whl", hash = "sha256:f5fb36b8c6c63fdcbb1d526d94c0d1331610d43f4118cc1beb4efef4f3faacb2", size = 539698, upload-time = "2026-05-10T18:16:53.934Z" },
+    { url = "https://files.pythonhosted.org/packages/fa/a3/ba3495a0b3edbd24a4cae0d1d3c64f39a9fc45d06e812101289b50c1a619/librt-0.11.0-cp314-cp314t-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:4a9a237d13addb93715b6fee74023d5ee3469b53fce527626c0e088aa585805f", size = 577162, upload-time = "2026-05-10T18:16:55.589Z" },
+    { url = "https://files.pythonhosted.org/packages/f7/db/36e25fb81f99937ff1b96612a1dc9fd66f039cb9cc3aee12c01fac31aab9/librt-0.11.0-cp314-cp314t-manylinux_2_34_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:5ddd17bd87b2c56ddd60e546a7984a2e64c4e8eab92fb4cf3830a48ad5469d51", size = 566494, upload-time = "2026-05-10T18:16:56.975Z" },
+    { url = "https://files.pythonhosted.org/packages/33/0d/3f622b47f0b013eeb9cf4cc07ae9bfe378d832a4eec998b2b209fe84244d/librt-0.11.0-cp314-cp314t-musllinux_1_2_aarch64.whl", hash = "sha256:bd43992b4473d42f12ff9e68326079f0696d9d4e6000e8f39a0238d482ba6ee2", size = 596858, upload-time = "2026-05-10T18:16:58.374Z" },
+    { url = "https://files.pythonhosted.org/packages/a9/02/71b90bc93039c46a2000651f6ad60122b114c8f54c4ad306e0e96f5b75ad/librt-0.11.0-cp314-cp314t-musllinux_1_2_i686.whl", hash = "sha256:f8e3e8056dd674e279741485e2e512d6e9a751c7455809d0114e6ebf8d781085", size = 590318, upload-time = "2026-05-10T18:16:59.676Z" },
+    { url = "https://files.pythonhosted.org/packages/04/04/418cb3f75621e2b761fb1ab0f017f4d70a1a72a6e7c74ee4f7e8d198c2f3/librt-0.11.0-cp314-cp314t-musllinux_1_2_riscv64.whl", hash = "sha256:c1f708d8ae9c56cf38a903c44297243d2ec83fd82b396b977e0144a3e76217e3", size = 575115, upload-time = "2026-05-10T18:17:01.007Z" },
+    { url = "https://files.pythonhosted.org/packages/cc/2c/5a2183ac58dd911f26b5d7e7d7d8f1d87fcecdddd99d6c12169a258ff62c/librt-0.11.0-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:0add982e0e7b9fc14cf4b33789d5f13f66581889b88c2f58099f6ce8f92617bd", size = 617918, upload-time = "2026-05-10T18:17:02.682Z" },
+    { url = "https://files.pythonhosted.org/packages/15/1f/dc6771a52592a4451be6effa200cbfc9cec61e4393d3033d81a9d307961d/librt-0.11.0-cp314-cp314t-win32.whl", hash = "sha256:2b481d846ac894c4e8403c5fd0e87c5d11d6499e404b474602508a224ff531c8", size = 103562, upload-time = "2026-05-10T18:17:03.99Z" },
+    { url = "https://files.pythonhosted.org/packages/62/4a/7d1415567027286a75ba1093ec4aca11f073e0f559c530cf3e0a757ad55c/librt-0.11.0-cp314-cp314t-win_amd64.whl", hash = "sha256:28edb433edde181112a908c78907af28f964eabc15f4dd16c9d66c834302677c", size = 124327, upload-time = "2026-05-10T18:17:05.465Z" },
+    { url = "https://files.pythonhosted.org/packages/ce/62/b40b382fa0c66fee1478073eb8db352a4a6beda4a1adccf1df911d8c289c/librt-0.11.0-cp314-cp314t-win_arm64.whl", hash = "sha256:dee008f20b542e3cd162ba338a7f9ec0f6d23d395f66fe8aeeec3c9d067ea253", size = 102572, upload-time = "2026-05-10T18:17:06.809Z" },
+]
+
 [[package]]
 name = "markdown-it-py"
 version = "4.2.0"
@@ -188,6 +307,66 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/b3/38/89ba8ad64ae25be8de66a6d463314cf1eb366222074cfda9ee839c56a4b4/mdurl-0.1.2-py3-none-any.whl", hash = "sha256:84008a41e51615a49fc9966191ff91509e3c40b939176e643fd50a5c2196b8f8", size = 9979, upload-time = "2022-08-14T12:40:09.779Z" },
 ]
 
+[[package]]
+name = "mypy"
+version = "2.1.0"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "ast-serialize" },
+    { name = "librt", marker = "platform_python_implementation != 'PyPy'" },
+    { name = "mypy-extensions" },
+    { name = "pathspec" },
+    { name = "typing-extensions" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/82/15/cca9d88503549ed6fedeaa1d448cdddd542ee8a490232d732e278036fbf2/mypy-2.1.0.tar.gz", hash = "sha256:81e76ad12c2d804512e9b13240d1588316531bfba07558286078bfbce9613633", size = 3898359, upload-time = "2026-05-11T18:37:36.237Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/0a/a1/639f3024794a2a15899cb90707fe02e044c4412794c39c5769fd3df2e2ef/mypy-2.1.0-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:a683016b16fe2f572dc04c72be7ee0504ac1605a265d0200f5cea695fb788f41", size = 14691685, upload-time = "2026-05-11T18:33:27.973Z" },
+    { url = "https://files.pythonhosted.org/packages/3b/08/9a585dea4325f20d8b80dc78623fa50d1fd2173b710f6237afd6ba6ab39b/mypy-2.1.0-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:1a293c534adb55271fef24a26da04b855540a8c13cc07bc5917b9fd2c394f2ca", size = 13555165, upload-time = "2026-05-11T18:32:16.107Z" },
+    { url = "https://files.pythonhosted.org/packages/81/dc/7c42cc9c6cb01e8eb09961f1f738741d3e9c7e9d5c5b30ec69222625cd5f/mypy-2.1.0-cp311-cp311-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:7406f4d048e71e576f5356d317e5b0a9e666dfd966bd99f9d14ca06e1a341538", size = 13994376, upload-time = "2026-05-11T18:32:39.256Z" },
+    { url = "https://files.pythonhosted.org/packages/d4/fa/285946c33bce716e082c11dfeee9ee196eaf1f5042efb3581a31f9f205e4/mypy-2.1.0-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:e0210d626fc8b31ccc90233754c7bc90e1f43205e85d96387f7db1285b55c398", size = 14864618, upload-time = "2026-05-11T18:34:49.765Z" },
+    { url = "https://files.pythonhosted.org/packages/2b/83/82397f48af6c27e295d57979ded8490c9829040152cf7571b2f026aeb9a0/mypy-2.1.0-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:3712c20deed54e814eaaa825603bada8ea1c390670a397c95b98405347acc563", size = 15102063, upload-time = "2026-05-11T18:34:05.855Z" },
+    { url = "https://files.pythonhosted.org/packages/40/68/b02dec39057b88eb03dc0aa854732e26e8361f34f9d0e20c7614967d1eba/mypy-2.1.0-cp311-cp311-win_amd64.whl", hash = "sha256:fcaa0e479066e31f7cceb6a3bea39cb22b2ff51a6b2f24f193d19179ba17c389", size = 11060564, upload-time = "2026-05-11T18:35:36.494Z" },
+    { url = "https://files.pythonhosted.org/packages/cf/a8/ea3dcbef31f99b634f2ee23bb0321cbc8c1b388b76a861eb849f13c347dc/mypy-2.1.0-cp311-cp311-win_arm64.whl", hash = "sha256:0b1a5260c95aa443083f9ed3592662941951bca3d4ca224a5dc517c38b7cf666", size = 9966983, upload-time = "2026-05-11T18:37:14.139Z" },
+    { url = "https://files.pythonhosted.org/packages/95/b1/55861beb5c339b44f9a2ba92df9e2cb1eeb4ae1eee674cdf7772c797778b/mypy-2.1.0-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:244358bf1c0da7722230bce60683d52e8e9fd030554926f15b747a84efb5b3af", size = 14874381, upload-time = "2026-05-11T18:37:31.784Z" },
+    { url = "https://files.pythonhosted.org/packages/0b/b3/b7f770114b7d0ac92d0f76e8d93c2780844a70488a90e91821927850da86/mypy-2.1.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:4ec7c57657493c7a75534df2751c8ae2cda383c16ecc55d2106c54476b1b16f6", size = 13665501, upload-time = "2026-05-11T18:34:23.063Z" },
+    { url = "https://files.pythonhosted.org/packages/b6/f3/8ae2037967e2126689a0c11d99e2b707134a565191e92c60ca2572aec60a/mypy-2.1.0-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:d8161b6ff4392410023224f0969d17db93e1e154bc3e4ba62598e720723ae211", size = 14045750, upload-time = "2026-05-11T18:31:48.151Z" },
+    { url = "https://files.pythonhosted.org/packages/a0/32/615eb5911859e43d054941b0d0a7d06cfa2870eba86529cf385b052b111c/mypy-2.1.0-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:bf03e12003084a67395184d3eb8cbd6a489dc3655b5664b28c210a9e2403ab0b", size = 15061630, upload-time = "2026-05-11T18:37:06.898Z" },
+    { url = "https://files.pythonhosted.org/packages/d4/03/4eafbfff8bfab1b87082741eae6e6a624028c984e6708b73bce2a8570c9d/mypy-2.1.0-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:20509760fd791c51579d573153407d226385ec1f8bcce55d730b354f3336bc22", size = 15288831, upload-time = "2026-05-11T18:31:18.07Z" },
+    { url = "https://files.pythonhosted.org/packages/99/ee/919661478e5891a3c96e549c036e467e64563ab85995b10c53c8358e16a3/mypy-2.1.0-cp312-cp312-win_amd64.whl", hash = "sha256:6753d0c1fdd6b1a23b9e4f283ce80b2153b724adcb2653b20b85a8a28ac6436b", size = 11135228, upload-time = "2026-05-11T18:34:31.23Z" },
+    { url = "https://files.pythonhosted.org/packages/24/0a/6a12b9782ca0831a553192f351679f4548abc9d19a7cc93bb7feb02084c7/mypy-2.1.0-cp312-cp312-win_arm64.whl", hash = "sha256:98ebb6589bb3b6d0c6f0c459d53ca55b8091fbc13d277c4041c885392e8195e8", size = 10040684, upload-time = "2026-05-11T18:36:48.199Z" },
+    { url = "https://files.pythonhosted.org/packages/6e/dd/c7191469c777f07689c032a8f7326e393ea34c92d6d76eb7ce5ba57ea66d/mypy-2.1.0-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:35aac3bb114e03888f535d5eb51b8bafbb3266586b599da1940f9b1be3ec5bd5", size = 14852174, upload-time = "2026-05-11T18:31:38.929Z" },
+    { url = "https://files.pythonhosted.org/packages/55/8c/aed55408879043d72bb9135f4d0d19a02b886dd569631e113e3d2706cb8d/mypy-2.1.0-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:8de55a8c861f2a49331f807be98d90caeceeef520bde13d43a160207f8af613e", size = 13651542, upload-time = "2026-05-11T18:36:04.636Z" },
+    { url = "https://files.pythonhosted.org/packages/3a/8e/f371a824b1f1fa8ea6e3dbb8703d232977d572be2329554a3bc4d960302f/mypy-2.1.0-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:5fdf2941a07434af755837d9880f7d7d25f1dacb1af9dcd4b9b66f2220a3024e", size = 14033929, upload-time = "2026-05-11T18:35:55.742Z" },
+    { url = "https://files.pythonhosted.org/packages/94/21/f54be870d6dd53a82c674407e0f8eed7174b05ec78d42e5abd7b42e84fd5/mypy-2.1.0-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:e195b817c13f02352a9c124301f9f30f078405444679b6753c1b96b6eed37285", size = 15039200, upload-time = "2026-05-11T18:33:10.281Z" },
+    { url = "https://files.pythonhosted.org/packages/17/99/bf21748626a40ce59fd29a39386ab46afec88b7bd2f0fa6c3a97c995523f/mypy-2.1.0-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:5431d42af987ebd92ba2f71d45c85ed41d8e6ca9f5fd209a69f68f707d2469e5", size = 15272690, upload-time = "2026-05-11T18:32:07.205Z" },
+    { url = "https://files.pythonhosted.org/packages/d6/d7/9e90d2cf47100bea550ed2bc7b0d4de3a62181d84d5e37da0003e8462637/mypy-2.1.0-cp313-cp313-win_amd64.whl", hash = "sha256:767fe8c66dc3e01e19e1737d4c38ebefead16125e1b8e58ad421903b376f5c65", size = 11147435, upload-time = "2026-05-11T18:33:56.477Z" },
+    { url = "https://files.pythonhosted.org/packages/ec/46/e5c449e858798e35ffc90946282a27c62a77be743fe17480e4977374eb91/mypy-2.1.0-cp313-cp313-win_arm64.whl", hash = "sha256:ecfe70d43775ab99562ab128ce49854a362044c9f894961f68f898c23cb7429d", size = 10035052, upload-time = "2026-05-11T18:32:30.049Z" },
+    { url = "https://files.pythonhosted.org/packages/b0/ca/b279a672e874aedd5498ae25f722dacc8aa86bbffb939b3f97cbb1cf6686/mypy-2.1.0-cp314-cp314-macosx_10_15_x86_64.whl", hash = "sha256:7354c5a7f69d9345c3d6e69921d57088eea3ddeeb6b20d34c1b3855b02c36ec2", size = 14848422, upload-time = "2026-05-11T18:35:45.984Z" },
+    { url = "https://files.pythonhosted.org/packages/27/e6/3efe56c631d959b9b4454e208b0ac4b7f4f58b404c89f8bec7b49efdfc21/mypy-2.1.0-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:49890d4f76ac9e06ec117f9e09f3174da70a620a0c300953d8595c926e80947f", size = 13677374, upload-time = "2026-05-11T18:36:57.188Z" },
+    { url = "https://files.pythonhosted.org/packages/84/7f/8107ea87a44fd1f1b59882442f033c9c3488c127201b1d1d15f1cbd6022e/mypy-2.1.0-cp314-cp314-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:761be68e023ef5d94678772396a8af1220030f80837a3afd8d0aef3b419666f4", size = 14055743, upload-time = "2026-05-11T18:35:18.361Z" },
+    { url = "https://files.pythonhosted.org/packages/51/4d/b6d34db183133b83761b9199a82d31557cdbb70a380d8c3b3438e11882a3/mypy-2.1.0-cp314-cp314-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:c90345fc182dc363b891350457ec69c35140858538f38b4540845afcc32b1aef", size = 15020937, upload-time = "2026-05-11T18:34:59.618Z" },
+    { url = "https://files.pythonhosted.org/packages/ff/d7/f08360c691d758acb02f45022c34d98b92892f4ea756644e1000d4b9f3d8/mypy-2.1.0-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:b84802e7b5a6daf1f5e15bc9fcd7ddae77be13981ffab037f1c67bb84d67d135", size = 15253371, upload-time = "2026-05-11T18:36:41.081Z" },
+    { url = "https://files.pythonhosted.org/packages/67/1b/09460a13719530a19bce27bd3bc8449e83569dd2ba7faf51c9c3c30c0b61/mypy-2.1.0-cp314-cp314-win_amd64.whl", hash = "sha256:022c771234936ceac541ebaf836fe9e2abeb3f5e09aff21588fe543ff006fe21", size = 11326429, upload-time = "2026-05-11T18:34:13.526Z" },
+    { url = "https://files.pythonhosted.org/packages/40/62/75dbf0f82f7b6680340efc614af29dd0b3c17b8a4f1cd09b8bd2fd6bc814/mypy-2.1.0-cp314-cp314-win_arm64.whl", hash = "sha256:498207db725cec88829a6a5c2fc771205fd043719ef98bc49aba8fb9fc4e6d57", size = 10218799, upload-time = "2026-05-11T18:32:23.491Z" },
+    { url = "https://files.pythonhosted.org/packages/b2/66/caca04ed7d972fb6eb6dd1ccd6df1de5c38fae8c5b3dc1c4e8e0d85ee6b9/mypy-2.1.0-cp314-cp314t-macosx_10_15_x86_64.whl", hash = "sha256:7d5e5cad0efeba72b93cd17490cc0d69c5ac9ca132994fe3fb0314808aeeb83e", size = 15923458, upload-time = "2026-05-11T18:35:28.64Z" },
+    { url = "https://files.pythonhosted.org/packages/ed/52/2d90cbe49d014b13ed7ff337930c30bad35893fe38a1e4641e756bb62191/mypy-2.1.0-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:ff715050c127d724fd260a2e666e7747fdd83511c0c47d449d98238970aef780", size = 14757697, upload-time = "2026-05-11T18:36:14.208Z" },
+    { url = "https://files.pythonhosted.org/packages/ac/37/d98f4a14e081b238992d0ed96b6d39c7cc0148c9699eb71eaa68629665ea/mypy-2.1.0-cp314-cp314t-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:82208da9e09414d520e912d3e462d454854bed0810b71540bb016dcbca7308fd", size = 15405638, upload-time = "2026-05-11T18:33:48.249Z" },
+    { url = "https://files.pythonhosted.org/packages/a3/c2/15c46613b24a84fad2aea1248bf9619b99c2767ae9071fe224c179a0b7d4/mypy-2.1.0-cp314-cp314t-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:e79ebc1b904b84f0310dff7469655a9c36c7a68bddb37bdd42b67a332df61d08", size = 16215852, upload-time = "2026-05-11T18:32:50.296Z" },
+    { url = "https://files.pythonhosted.org/packages/5c/90/9c16a57f482c76d25f6379762b56bbf65c711d8158cf271fb2802cfb0640/mypy-2.1.0-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:e583edc957cfb0deb142079162ae826f58449b116c1d442f2d91c69d9fced081", size = 16452695, upload-time = "2026-05-11T18:33:38.182Z" },
+    { url = "https://files.pythonhosted.org/packages/0f/4c/215a4eeb63cacc5f17f516691ea7285d11e249802b942476bff15922a314/mypy-2.1.0-cp314-cp314t-win_amd64.whl", hash = "sha256:b33b6cd332695bba180d55e717a79d3038e479a2c49cc5eb3d53603409b9a5d7", size = 12866622, upload-time = "2026-05-11T18:34:39.945Z" },
+    { url = "https://files.pythonhosted.org/packages/4b/50/1043e1db5f455ffe4c9ab22747cd8ca2bc492b1e4f4e21b130a44ee2b217/mypy-2.1.0-cp314-cp314t-win_arm64.whl", hash = "sha256:4f910fe825376a7b66ef7ca8c98e5a149e8cd64c19ae71d84047a74ee060d4e6", size = 10610798, upload-time = "2026-05-11T18:36:31.444Z" },
+    { url = "https://files.pythonhosted.org/packages/0d/2a/13ca1f292f6db1b98ff495ef3467736b331621c5917cad984b7043e7348d/mypy-2.1.0-py3-none-any.whl", hash = "sha256:a663814603a5c563fb87a4f96fb473eeb30d1f5a4885afcf44f9db000a366289", size = 2693302, upload-time = "2026-05-11T18:31:29.246Z" },
+]
+
+[[package]]
+name = "mypy-extensions"
+version = "1.1.0"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/a2/6e/371856a3fb9d31ca8dac321cda606860fa4548858c0cc45d9d1d4ca2628b/mypy_extensions-1.1.0.tar.gz", hash = "sha256:52e68efc3284861e772bbcd66823fde5ae21fd2fdb51c62a211403730b916558", size = 6343, upload-time = "2025-04-22T14:54:24.164Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/79/7b/2c79738432f5c924bef5071f933bcc9efd0473bac3b4aa584a6f7c1c8df8/mypy_extensions-1.1.0-py3-none-any.whl", hash = "sha256:1be4cccdb0f2482337c4743e60421de3a356cd97508abadd57d47403e94f5505", size = 4963, upload-time = "2025-04-22T14:54:22.983Z" },
+]
+
 [[package]]
 name = "packaging"
 version = "26.0"
@@ -197,6 +376,15 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/b7/b9/c538f279a4e237a006a2c98387d081e9eb060d203d8ed34467cc0f0b9b53/packaging-26.0-py3-none-any.whl", hash = "sha256:b36f1fef9334a5588b4166f8bcd26a14e521f2b55e6b9de3aaa80d3ff7a37529", size = 74366, upload-time = "2026-01-21T20:50:37.788Z" },
 ]
 
+[[package]]
+name = "pathspec"
+version = "1.1.1"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/5a/82/42f767fc1c1143d6fd36efb827202a2d997a375e160a71eb2888a925aac1/pathspec-1.1.1.tar.gz", hash = "sha256:17db5ecd524104a120e173814c90367a96a98d07c45b2e10c2f3919fff91bf5a", size = 135180, upload-time = "2026-04-27T01:46:08.907Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/f1/d9/7fb5aa316bc299258e68c73ba3bddbc499654a07f151cba08f6153988714/pathspec-1.1.1-py3-none-any.whl", hash = "sha256:a00ce642f577bf7f473932318056212bc4f8bfdf53128c78bbd5af0b9b20b189", size = 57328, upload-time = "2026-04-27T01:46:07.06Z" },
+]
+
 [[package]]
 name = "pluggy"
 version = "1.6.0"

From 7c911d42870f64b593d8e84a3a77f6ca3c339fff Mon Sep 17 00:00:00 2001
From: Beatrice Marti <mabesa@users.noreply.github.com>
Date: Wed, 17 Jun 2026 11:19:53 +0200
Subject: [PATCH 12/16] refactor: tighten type-ignore hygiene

- _make_metadata: drop the avoidable arg-type ignore via model_validate
- annotate the two unavoidable pydantic @computed_field prop-decorator
  ignores (success/trusted) with a one-line reason, per the B ignore policy

The only remaining ignores are the two documented pydantic/mypy
computed_field false positives; everything else is ignore-free under strict.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 forecast_interface/__init__.py               | 2 +-
 forecast_interface/output/model_output.py    | 2 +-
 forecast_interface/output/variable_output.py | 2 +-
 pyproject.toml                               | 4 ++--
 tests/test_output.py                         | 2 +-
 uv.lock                                      | 2 +-
 6 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/forecast_interface/__init__.py b/forecast_interface/__init__.py
index 880083b..0b61a5f 100644
--- a/forecast_interface/__init__.py
+++ b/forecast_interface/__init__.py
@@ -1,4 +1,4 @@
-__version__ = "0.1.12"
+__version__ = "0.1.13"
 
 from .common import AggregationMethod
 from .input import (
diff --git a/forecast_interface/output/model_output.py b/forecast_interface/output/model_output.py
index d2b6d4d..ba6f320 100644
--- a/forecast_interface/output/model_output.py
+++ b/forecast_interface/output/model_output.py
@@ -45,7 +45,7 @@ def _validate_variables(
                     raise ValueError("variable name keys must be non-empty strings")
         return v
 
-    @computed_field  # type: ignore[prop-decorator]
+    @computed_field  # type: ignore[prop-decorator]  # pydantic computed_field + property: known mypy false positive
     @property
     def success(self) -> bool:
         return all(
diff --git a/forecast_interface/output/variable_output.py b/forecast_interface/output/variable_output.py
index e2b61f9..536ea4b 100644
--- a/forecast_interface/output/variable_output.py
+++ b/forecast_interface/output/variable_output.py
@@ -113,7 +113,7 @@ class VariableOutput(BaseModel):
     flags: frozenset[ForecastFlag] = frozenset()
     status: VariableStatus
 
-    @computed_field  # type: ignore[prop-decorator]
+    @computed_field  # type: ignore[prop-decorator]  # pydantic computed_field + property: known mypy false positive
     @property
     def trusted(self) -> bool:
         return len(self.flags) == 0
diff --git a/pyproject.toml b/pyproject.toml
index 1a4acf8..5f7406f 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -1,6 +1,6 @@
 [project]
 name = "forecastinterface"
-version = "0.1.12"
+version = "0.1.13"
 description = "Add your description here"
 readme = "README.md"
 requires-python = ">=3.11"
@@ -25,7 +25,7 @@ init_typed = true
 warn_required_dynamic_aliases = true
 
 [tool.bumpversion]
-current_version = "0.1.12"
+current_version = "0.1.13"
 commit = false
 tag = false
 allow_dirty = true
diff --git a/tests/test_output.py b/tests/test_output.py
index d6e0bb4..26df01c 100644
--- a/tests/test_output.py
+++ b/tests/test_output.py
@@ -31,7 +31,7 @@ def _make_metadata(**overrides: object) -> VariableMetadata:
         "offset": 0,
     }
     defaults.update(overrides)
-    return VariableMetadata(**defaults)  # type: ignore[arg-type]
+    return VariableMetadata.model_validate(defaults)
 
 
 def _make_det_df() -> pl.DataFrame:
diff --git a/uv.lock b/uv.lock
index c626d1f..5b1af42 100644
--- a/uv.lock
+++ b/uv.lock
@@ -129,7 +129,7 @@ wheels = [
 
 [[package]]
 name = "forecastinterface"
-version = "0.1.12"
+version = "0.1.13"
 source = { virtual = "." }
 dependencies = [
     { name = "polars" },

From 7713a49342288c593dc5ec488e31def0a32bfcfe Mon Sep 17 00:00:00 2001
From: Beatrice Marti <mabesa@users.noreply.github.com>
Date: Wed, 17 Jun 2026 11:30:30 +0200
Subject: [PATCH 13/16] docs: consolidate and de-stale
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- README: reshape into a gateway — drop the duplicated ModelOutput /
  InputRequirement schemas (authoritative in docs/), keep intro, doc
  links, and the usage example
- delete TODO.md (obsolete two-line stub)
- open_design_questions: clear six done "Code TODO" markers (A1-A3
  implemented them); mark Q6 ANSWERED (scoped by decision 1.8)
- model_interface: fix self-contradiction — embedding-key/station-set
  contract is v1 load-bearing; only rich provenance metadata is deferred to Phase 4
- nepal-model-requirements: replace stale spatial vocab (lumped/HRU)
  with POINT/BASIN_AVERAGE/ELEVATION_BAND/GRIDDED

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 README.md                        | 120 +++----------------------------
 TODO.md                          |   2 -
 docs/model_interface.md          |   2 +-
 docs/nepal-model-requirements.md |   2 +-
 docs/open_design_questions.md    |  14 ++--
 forecast_interface/__init__.py   |   2 +-
 pyproject.toml                   |   4 +-
 uv.lock                          |   2 +-
 8 files changed, 23 insertions(+), 125 deletions(-)
 delete mode 100644 TODO.md

diff --git a/README.md b/README.md
index fc09ae4..b0540e6 100644
--- a/README.md
+++ b/README.md
@@ -8,83 +8,30 @@ uv add forecastinterface
 
 ## Documentation
 
-- [Model Interface Specification](docs/model_interface.md)
-- [Input Requirement Specification](docs/input_requirement.md)
+- [Model Interface Specification](docs/model_interface.md) — the `ForecastModel` protocol, training/lifecycle, and `ModelOutput` types
+- [Input Requirement Specification](docs/input_requirement.md) — the `InputRequirement` declaration and the `ModelInputs` bundle
+- [FI ↔ SAP3 Mapping](docs/fi-sap3-mapping.md) — how FI types map onto the SAPPHIRE_flow adapter boundary
 
 ## ModelOutput
 
-Top-level container holding forecast results, keyed by station then variable. Each variable can independently carry deterministic forecasts, quantile forecasts, trajectory ensembles, or any combination.
+The result of `predict` / `hindcast` (returned via `ModelResult`). A **station-keyed** container — `variables: dict[station_id, dict[variable_name, VariableOutput]]` — where each `VariableOutput` carries any combination of deterministic, quantile, trajectory, and epistemic-uncertainty data, plus a `status` (`SUCCESS`/`FAILURE`/`PARTIAL`) and quality `flags`. A single-station model returns a one-key outer dict; missing stations are explicit `FAILURE` entries, never absent keys.
 
-`variables` is station-keyed: `station_id → variable_name → VariableOutput`. A single-station model returns a one-key outer dict. Missing stations are explicit `FAILURE` entries (a `VariableOutput` with `status == FAILURE`), never absent keys — the model echoes back every station id it was given.
+See the [Model Interface Specification](docs/model_interface.md) for the full schema, DataFrame layouts, and enums.
 
-### Structure
-
-```
-ModelOutput
-    model_name: str
-    issue_datetime: datetime
-    success: bool                                    # derived — True when all variables (across all stations) succeeded
-    variables: dict[str, dict[str, VariableOutput]]  # station_id → variable_name → VariableOutput
-
-VariableOutput
-    metadata: VariableMetadata
-    deterministic: DeterministicData | None
-    quantiles: QuantileData | None
-    trajectories: TrajectoryData | None
-    epistemic_uncertainty: EpistemicUncertaintyData | None
-    status: VariableStatus                 # SUCCESS | FAILURE | PARTIAL
-    flags: frozenset[ForecastFlag]
-
-VariableMetadata
-    unit: Unit                             # e.g. Unit.M3_PER_S → "m³/s"
-    timedelta: timedelta                   # time step between forecast points
-    forecast_horizon: int                  # forecast steps per issue_datetime block (> 0)
-    offset: int                            # offset in steps (>= 0)
-
-DeterministicData
-    data: pl.DataFrame                     # columns: ["issue_datetime", "datetime", "value"]
-
-QuantileData
-    quantile_levels: list[float]           # e.g. [0.1, 0.5, 0.9] — sorted, in (0, 1)
-    data: pl.DataFrame                     # columns: ["issue_datetime", "datetime", "0.1", "0.5", "0.9"]
-
-TrajectoryData
-    num_samples: int                       # number of ensemble members (>= 8)
-    data: pl.DataFrame                     # columns: ["issue_datetime", "datetime", "1", "2", ..., "N"]
-
-EpistemicUncertaintyData
-    data: pl.DataFrame                     # columns: ["issue_datetime", "datetime", "std", "range"]
-```
-
-### DataFrame Schemas
-
-All DataFrames are validated on construction:
-
-| Container | `issue_datetime` column | `datetime` column | Value columns |
-|---|---|---|---|
-| `DeterministicData` | `Datetime` | `Datetime` | `value` (numeric) |
-| `QuantileData` | `Datetime` | `Datetime` | One per level, named as float strings: `"0.1"`, `"0.5"`, ... |
-| `TrajectoryData` | `Datetime` | `Datetime` | One per sample, named `"1"`, `"2"`, ..., `"N"` |
-| `EpistemicUncertaintyData` | `Datetime` | `Datetime` | `std` (numeric), `range` (numeric) |
-
-### Enums
-
-**Unit** -- `M3_PER_S`, `MM_PER_DAY`, `MM_PER_S`, `MM`, `CM`, `M`, `DEG_C`, `UNITLESS`, `PERCENT`, `M_PER_S`, `DEGREE`, `W_PER_M2`, `MM_PER_HOUR`
-
-**AggregationMethod** -- `SUM`, `MEAN`
+## InputRequirement
 
-**VariableStatus** -- `SUCCESS`, `FAILURE`, `PARTIAL`
+Declares what data a model needs: forecast `targets`, `dynamic` inputs nested as `timedelta` time step → spatial representation → past/future → product → variable (each with its `unit`, `lookback`/`future_steps`, `max_nan`, and optional `aggregation`), and `static` attributes. At run time the model receives a `ModelInputs` bundle isomorphic to this declaration.
 
-**ForecastFlag** -- `HIGH_EPISTEMIC_UNCERTAINTY`, `DATA_AVAILABILITY`
+See the [Input Requirement Specification](docs/input_requirement.md) for the full structure and examples.
 
-### Usage
+## Usage
 
 ```python
 from datetime import datetime, timedelta
 import polars as pl
 from forecast_interface import (
     ModelOutput, VariableOutput, VariableMetadata,
-    DeterministicData, QuantileData, Unit, VariableStatus,
+    DeterministicData, Unit, VariableStatus,
 )
 
 issue_dt = datetime(2024, 6, 1, 6, 0)
@@ -115,50 +62,3 @@ output = ModelOutput(
 
 assert output.success is True
 ```
-
-## InputRequirement
-
-Declares what data a forecasting model needs. The preprocessing pipeline reads this spec and provides exactly the required inputs.
-
-See [Input Requirement Specification](docs/input_requirement.md) for full documentation.
-
-### Structure
-
-```
-InputRequirement
-    targets: dict[str, TargetSpec]                  # what the model forecasts
-    dynamic: dict[timedelta, SpatialInputSpec]
-    static: set[str]
-
-TargetSpec
-    unit: Unit
-    representations: frozenset[OutputRepresentation]  # DETERMINISTIC | QUANTILES | TRAJECTORIES
-
-SpatialInputSpec
-    data: dict[SpatialRepresentation, DynamicInputSpec]
-
-DynamicInputSpec
-    past_known: dict[str, dict[str, PastKnownVariable]]
-    future_known: dict[str, dict[str, FutureKnownVariable]]
-
-PastKnownVariable
-    lookback: int
-    max_nan: int
-    unit: Unit
-    aggregation: AggregationMethod | None
-
-FutureKnownVariable
-    future_steps: int
-    max_nan: int
-    unit: Unit
-    aggregation: AggregationMethod | None
-    ensemble_mode: EnsembleMode    # SINGLE or ENSEMBLE
-```
-
-### Enums
-
-**SpatialRepresentation** -- `POINT`, `BASIN_AVERAGE`, `ELEVATION_BAND`, `GRIDDED`
-
-**OutputRepresentation** -- `DETERMINISTIC`, `QUANTILES`, `TRAJECTORIES`
-
-**AggregationMethod** -- `SUM`, `MEAN`
diff --git a/TODO.md b/TODO.md
deleted file mode 100644
index b3d0b46..0000000
--- a/TODO.md
+++ /dev/null
@@ -1,2 +0,0 @@
-- [ ] Differentiate between forecast and hindcast output
-- [ ] 
\ No newline at end of file
diff --git a/docs/model_interface.md b/docs/model_interface.md
index 1b561d1..d8a9bd4 100644
--- a/docs/model_interface.md
+++ b/docs/model_interface.md
@@ -19,7 +19,7 @@ Produce a `TrainedArtifact` from training inputs. See the Training & Lifecycle P
 
 ## Training & Lifecycle Protocol
 
-> **Status: implemented** in `forecast_interface/interface/` (`protocol.py`, `scope.py`, `artifact.py`). The `inputs` parameters use FI-owned `ModelInputs`; only `config` remains **provisional** — typed `Any` until the model-config type is co-designed with SAP3 (Q8). Rich `TrainedArtifact` provenance metadata and the group-artifact embedding-key / station-set-mismatch contract are **deferred to Phase 4** (see [`docs/nepal-model-requirements.md`](./nepal-model-requirements.md) §4 and §8).
+> **Status: implemented** in `forecast_interface/interface/` (`protocol.py`, `scope.py`, `artifact.py`). The `inputs` parameters use FI-owned `ModelInputs`; only `config` remains **provisional** — typed `Any` until the model-config type is co-designed with SAP3 (Q8). Rich `TrainedArtifact` provenance metadata is **deferred to Phase 4** (see [`docs/nepal-model-requirements.md`](./nepal-model-requirements.md) §4); the group-artifact embedding-key / station-set-mismatch contract is **v1 load-bearing** (see the `TrainedArtifact` section below and decision 1.10).
 
 ### Scope: `ArtifactScope`
 
diff --git a/docs/nepal-model-requirements.md b/docs/nepal-model-requirements.md
index dba4fb1..d309413 100644
--- a/docs/nepal-model-requirements.md
+++ b/docs/nepal-model-requirements.md
@@ -177,7 +177,7 @@ ForecastInterface input requirements must be able to express:
 - SnowMapper variables such as SWE and snowmelt as past and/or future dynamic
   features;
 - product/source names and versions;
-- spatial representation: lumped/basin-average, HRU/elevation-band, or gridded;
+- spatial representation: POINT, BASIN_AVERAGE, ELEVATION_BAND, or GRIDDED;
 - static catchment attributes required by the model;
 - allowed missing-data thresholds per variable and product.
 
diff --git a/docs/open_design_questions.md b/docs/open_design_questions.md
index bb61117..cd5c433 100644
--- a/docs/open_design_questions.md
+++ b/docs/open_design_questions.md
@@ -78,7 +78,7 @@ Banded Snowmapper SWE / snowmelt is declared at `ELEVATION_BAND`.
 
 **Decision:** FI enforces **deployment-independent structural floors** only; SAP3's deployment-configurable operational floors are checked **loudly at integration time**, not silently at runtime.
 
-- **FI structural floors (hard validators):** QuantileData **≥ 3** levels (centre + two tails); TrajectoryData **≥ 8** samples. *(Code TODO: validators currently enforce ≥1 quantile / >0 trajectory — tighten to ≥3 / ≥8.)*
+- **FI structural floors (hard validators):** QuantileData **≥ 3** levels (centre + two tails); TrajectoryData **≥ 8** samples. *(Implemented.)*
 - **SAP3 operational floors (deployment config, SAP3-side):** `min_operational_quantile_levels` ≥ 7 with tail coverage (a level ≤0.05 and a level ≥0.95); `min_operational_ensemble_size` ≥ 20 members. Kept out of FI because they are deployment-specific.
 - **Deterministic-only is allowed, not forbidden.** A deterministic model may be strong; FI accepts deterministic output as structurally valid. But SAP3 has **no deterministic channel**, so deterministic-only is **non-operational** until the model supplies forecast uncertainty (quantiles/trajectories it emits, or a downstream uncertainty wrapper).
 - **No silent non-operational output.** The model declares its representation(s) and emitted count; SAP3 checks them against its deployment floor at **integration/registration time** and rejects incompatibles loudly. Net: *valid FI + declared counts ≥ deployment floor ⟹ operational.*
@@ -165,7 +165,7 @@ Banded Snowmapper SWE / snowmelt is declared at `ELEVATION_BAND`.
 
 **Deviation from Sandro:** the input hierarchy's level-1 "temporal resolution" is now keyed by `timedelta` rather than an enum — on the Sandro list — but forced by the 3h/6h requirement and more SAP3-consistent.
 
-*(Code TODO: change `InputRequirement.dynamic` key type to `timedelta`; remove `resolution` from `VariableMetadata`; remove `TemporalResolution` from `common/resolutions.py`; update validators and tests.)*
+*(Implemented.)*
 
 **Reflected in:** `docs/input_requirement.md`, `docs/model_interface.md`.
 
@@ -175,12 +175,12 @@ Banded Snowmapper SWE / snowmelt is declared at `ELEVATION_BAND`.
 
 **(a) Vocabulary — documented canonical names.** Variable names stay free strings but must match SAP3's canonical set (`discharge`, `water_level`, `water_temperature`, `precipitation`, `temperature`, `relative_humidity`, `wind_speed`, `wind_direction`, `global_radiation`, `reference_et`, `snow_water_equivalent`, `runoff`). SAP3 soft-checks at integration. A documented contract, **not** a hard FI enum (avoids tracking SAP3's evolving set).
 
-**(b) Aggregation — optional per-variable override.** Add optional `aggregation: AggregationMethod` (`SUM`/`MEAN`, mirroring SAP3) to `PastKnownVariable` / `FutureKnownVariable`, used when the declared resolution is coarser than delivered data. Default = per-parameter convention (precip/ref_et = SUM, rest = MEAN); declared only to override. Correctness-critical, hence expressible, but optional. *(Code TODO: add `AggregationMethod` enum + optional field.)*
+**(b) Aggregation — optional per-variable override.** Add optional `aggregation: AggregationMethod` (`SUM`/`MEAN`, mirroring SAP3) to `PastKnownVariable` / `FutureKnownVariable`, used when the declared resolution is coarser than delivered data. Default = per-parameter convention (precip/ref_et = SUM, rest = MEAN); declared only to override. Correctness-critical, hence expressible, but optional. *(Implemented.)*
 
 **(c) Units — first-class on inputs and outputs ("no data without units").**
 - Inputs: model **declares the expected `unit`** per input variable (`PastKnownVariable` / `FutureKnownVariable` gain `unit: Unit`); delivered `ModelInputs` series are **tagged with unit**; SAP3 **delivers in the declared unit or rejects loudly at integration** (auto-conversion deferred to a future adapter feature).
 - Outputs: unchanged — `TargetSpec` / `VariableMetadata` already declare units.
-- **`Unit` enum expansion:** add `PERCENT` (%), `M_PER_S` (m/s), `DEGREE`, `W_PER_M2` (W/m²), `MM_PER_HOUR`. *(Code TODO: expand `common/units.py`; add `unit` field to `input/variable.py`.)*
+- **`Unit` enum expansion:** add `PERCENT` (%), `M_PER_S` (m/s), `DEGREE`, `W_PER_M2` (W/m²), `MM_PER_HOUR`. *(Implemented.)*
 
 **Reflected in:** `docs/input_requirement.md`, `docs/model_interface.md`, and code (`common/units.py`, `input/variable.py`).
 
@@ -240,13 +240,13 @@ A decision-ready list. Each needs the model developer's input before the corresp
 
 ### Q5 — `VariableMetadata` fields — ANSWERED
 
-- **(a) Drop `name`.** Redundant with the `ModelOutput.variables[station][variable]` key; two sources of truth that can disagree is forbidden by the type ethos. Field removed. *(Code TODO: remove `name` from `VariableMetadata`; update tests.)*
-- **(b) Keep `forecast_horizon`, add a per-issue-block cross-validator.** Kept because the adapter reads it directly (`ForecastEnsemble.forecast_horizon_steps`). Validator: for `predict`, `forecast_horizon == row count`; for batch `hindcast`, `forecast_horizon == rows per issue_datetime` (one block per issue time, decision 1.8). *(Code TODO: add validator.)*
+- **(a) Drop `name`.** Redundant with the `ModelOutput.variables[station][variable]` key; two sources of truth that can disagree is forbidden by the type ethos. Field removed. *(Implemented.)*
+- **(b) Keep `forecast_horizon`, add a per-issue-block cross-validator.** Kept because the adapter reads it directly (`ForecastEnsemble.forecast_horizon_steps`). Validator: for `predict`, `forecast_horizon == row count`; for batch `hindcast`, `forecast_horizon == rows per issue_datetime` (one block per issue time, decision 1.8). *(Implemented.)*
 - **(c) `offset` semantics confirmed:** number of steps (each `timedelta` long) between the **last observation and the first forecast step**. `offset = 1` ⇒ first forecast valid time is `last_obs + 1·timedelta` (usual next-step case); `offset = 2` ⇒ a one-step gap. Model and adapter both assume this convention.
 
 **Reflected in:** `docs/model_interface.md`.
 
-### Q6 — Per-row `issue_datetime` column
+### Q6 — Per-row `issue_datetime` column — ANSWERED (scoped by decision 1.8)
 
 The adapter maps `ModelOutput.issue_datetime` → `ForecastEnsemble.issued_at` and renames the per-row datetime column → `valid_time`. The question is whether to **keep the per-row `issue_datetime` column requirement** — with a cross-validator that it matches the top-level `issue_datetime` for forecasts — or **relax it**. Frame this as a validator question, not a removal: the column is not "dropped", it is renamed and re-used.
 
diff --git a/forecast_interface/__init__.py b/forecast_interface/__init__.py
index 0b61a5f..bc50ee9 100644
--- a/forecast_interface/__init__.py
+++ b/forecast_interface/__init__.py
@@ -1,4 +1,4 @@
-__version__ = "0.1.13"
+__version__ = "0.1.14"
 
 from .common import AggregationMethod
 from .input import (
diff --git a/pyproject.toml b/pyproject.toml
index 5f7406f..0fd38ab 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -1,6 +1,6 @@
 [project]
 name = "forecastinterface"
-version = "0.1.13"
+version = "0.1.14"
 description = "Add your description here"
 readme = "README.md"
 requires-python = ">=3.11"
@@ -25,7 +25,7 @@ init_typed = true
 warn_required_dynamic_aliases = true
 
 [tool.bumpversion]
-current_version = "0.1.13"
+current_version = "0.1.14"
 commit = false
 tag = false
 allow_dirty = true
diff --git a/uv.lock b/uv.lock
index 5b1af42..a4156fe 100644
--- a/uv.lock
+++ b/uv.lock
@@ -129,7 +129,7 @@ wheels = [
 
 [[package]]
 name = "forecastinterface"
-version = "0.1.13"
+version = "0.1.14"
 source = { virtual = "." }
 dependencies = [
     { name = "polars" },

From 4773949c197826990f3136431af5d497ca55a55b Mon Sep 17 00:00:00 2001
From: Beatrice Marti <mabesa@users.noreply.github.com>
Date: Wed, 17 Jun 2026 11:40:00 +0200
Subject: [PATCH 14/16] =?UTF-8?q?docs:=20remove=20non-contract=20docs=20(N?=
 =?UTF-8?q?epal=20requirements,=20FI=E2=86=94SAP3=20mapping)?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This repo is the model-author contract; trim docs that belong elsewhere:
- delete docs/nepal-model-requirements.md — one deployment's requirements,
  now captured in resolved decisions 1.6/1.10 and the implemented code
- delete docs/fi-sap3-mapping.md — adapter-boundary mapping that belongs
  with the adapter in SAPPHIRE_flow, not the model-author contract

Fix all inbound references in model_interface.md, open_design_questions.md,
and README.md. Remaining docs: model_interface (protocol/output),
input_requirement (inputs), open_design_questions (decisions + open Qs).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 README.md                        |   1 -
 docs/fi-sap3-mapping.md          | 326 -------------------------------
 docs/model_interface.md          |   8 +-
 docs/nepal-model-requirements.md | 251 ------------------------
 docs/open_design_questions.md    |  12 +-
 forecast_interface/__init__.py   |   2 +-
 pyproject.toml                   |   4 +-
 uv.lock                          |   2 +-
 8 files changed, 14 insertions(+), 592 deletions(-)
 delete mode 100644 docs/fi-sap3-mapping.md
 delete mode 100644 docs/nepal-model-requirements.md

diff --git a/README.md b/README.md
index b0540e6..d42c02c 100644
--- a/README.md
+++ b/README.md
@@ -10,7 +10,6 @@ uv add forecastinterface
 
 - [Model Interface Specification](docs/model_interface.md) — the `ForecastModel` protocol, training/lifecycle, and `ModelOutput` types
 - [Input Requirement Specification](docs/input_requirement.md) — the `InputRequirement` declaration and the `ModelInputs` bundle
-- [FI ↔ SAP3 Mapping](docs/fi-sap3-mapping.md) — how FI types map onto the SAPPHIRE_flow adapter boundary
 
 ## ModelOutput
 
diff --git a/docs/fi-sap3-mapping.md b/docs/fi-sap3-mapping.md
deleted file mode 100644
index da2be3b..0000000
--- a/docs/fi-sap3-mapping.md
+++ /dev/null
@@ -1,326 +0,0 @@
-# ForecastInterface ↔ SAPPHIRE Flow (SAP3) Adapter-Boundary Mapping
-
-> **Status:** Forward contract (Phase 0, docs-only). No adapter exists in SAP3 yet.
-> **FI side:** authoritative for output types; co-designs input types; owns the model protocol.
-> **Companion document:** SAP3 `docs/plans/archive/014-forecast-interface-adapter-design.md`
-> (referred to below as **doc 014**). This document is the **FI-side** formalization of doc 014.
-
-This is the single authoritative place that says, type by type, how a ForecastInterface
-(FI) model maps onto SAP3's `StationForecastModel` / `GroupForecastModel` contract across
-the planned `ForecastInterfaceAdapter` boundary.
-
-File references use `path:line`. FI paths are relative to this repo
-(`/Users/bea/Documents/GitHub/ForecastInterface`); SAP3 paths are relative to the sibling
-repo (`/Users/bea/Documents/GitHub/SAPPHIRE_flow`).
-
----
-
-## 1. Purpose & governance
-
-FI is the **model-author-facing contract**: a model developer implements `ForecastModel`
-(`forecast_interface/interface/protocol.py:10`), declares its `InputRequirement`
-(`forecast_interface/input/requirement.py:35`), and returns a `ModelResult`
-(`forecast_interface/interface/result.py:40`) wrapping a `ModelOutput`
-(`forecast_interface/output/model_output.py:9`). FI knows nothing about stations, groups,
-batching, stacking, QC, alerts, or artifact storage.
-
-SAP3 is the **operational system**. It plans to wrap FI models behind a thin
-`ForecastInterfaceAdapter` (doc 014 §A Task 1, lines 157–240; naming per doc 014 line 343)
-that translates between FI's contract and SAP3's `StationForecastModel` /
-`GroupForecastModel` protocols (`src/sapphire_flow/protocols/forecast_model.py:23,49`).
-
-### Governance split (doc 014 lines 300–305)
-
-| Concern | Authority | Mechanism |
-|---|---|---|
-| FI **output** types (`ModelOutput`, `VariableOutput`, data containers, enums) | **FI** | Stable & test-locked; SAP3 adapts to them. |
-| FI **input** types (`InputRequirement` & friends) | **Co-designed** | SAP3 contributes via a SAP3→FI PR (doc 014 Task 3, lines 257–283). |
-| FI **interface / model Protocol** (`ForecastModel`) | **FI** | FI-owned; SAP3 wraps thin (doc 014 Task 4–5, lines 287–305). |
-
-### Current state of the adapter (verified)
-
-As of this writing there is **no adapter implemented in SAP3**:
-
-- No import of the FI package anywhere in `SAPPHIRE_flow/src/`.
-- No `ForecastInterfaceAdapter` class.
-- No use of FI's `ModelOutput`. The only `ModelOutput`-named symbol in SAP3 src is
-  `ModelOutputError` (`src/sapphire_flow/exceptions.py:17`) — the *planned* error subclass,
-  not a use of FI's output type.
-
-This document therefore describes a **forward contract**, not an implemented one. Everything
-below states what the adapter *must* do when built; it is not a description of running code.
-
----
-
-## 2. The two adapter paths
-
-The adapter boundary sits between SAP3's assembly layer and the FI model. SAP3 selects a
-path by `artifact_scope` (doc 014 lines 114–127).
-
-```
-STATION path (v0b — ships first):
-  ModelDataRequirements → assembly → StationModelInputs → adapter → FI input
-    → FI model → ModelOutput → adapter → tuple[dict[str, ForecastEnsemble], bytes | None]
-        (SAP3)                 (boundary)    (FI)        (boundary)        (SAP3)
-
-GROUP path (v1 — multi-station):
-  ModelDataRequirements → assembly → GroupModelInputs → adapter → FI input
-    → FI model → ModelOutput → adapter → dict[StationId, tuple[dict[str, ForecastEnsemble], bytes | None]]
-        (SAP3)                 (boundary)    (FI)        (boundary)            (SAP3)
-```
-
-- **v0b ships the STATION path only.** FI-wrapped models implement
-  `StationForecastModel` (`src/sapphire_flow/protocols/forecast_model.py:23`). Its
-  `predict()` returns `tuple[dict[str, ForecastEnsemble], bytes | None]` (line 32–39).
-- **GROUP / multi-station is v1.** `GroupForecastModel.predict_batch()` must return
-  `dict[StationId, tuple[...]]` (`src/sapphire_flow/protocols/forecast_model.py:58–63`).
-  Decomposing a single `ModelOutput` per station requires station-keyed FI output — see §5.
-
----
-
-## 3. Output mapping (FI → SAP3)
-
-FI's output is authoritative; SAP3 adapts. This table grounds and supersedes doc 014's
-divergence table (lines 138–151) on the FI side.
-
-| FI type / field (`path:line`) | SAP3 target (`path:line`) | Adapter responsibility |
-|---|---|---|
-| `ModelOutput` (`output/model_output.py:9`) | `tuple[dict[str, ForecastEnsemble], bytes \| None]` (`protocols/forecast_model.py:38`) | Convert whole container → forecast dict + state bytes. |
-| `VariableOutput.deterministic` / `.quantiles` / `.trajectories` (`output/variable_output.py:109–111`) | `ForecastEnsemble` (`types/ensemble.py:18`) | Pick whichever is populated; route to the matching factory. |
-| `TrajectoryData` (`output/variable_output.py:64`) | `ForecastEnsemble.from_members()` (`types/ensemble.py:39`) → `MEMBERS` | Reshape member columns → `member_id`/`value`; FI enforces ≥8 members. |
-| `QuantileData` (`output/variable_output.py:33`) | `ForecastEnsemble.from_quantiles()` (`types/ensemble.py:76`) → `QUANTILES` | Reshape quantile columns → `quantile`/`value`. **See operational gap below.** |
-| `DeterministicData` (`output/variable_output.py:20`) | single-member `MEMBERS` ensemble (`types/ensemble.py:39`) | Wrap the single `value` column as `member_id=1`; flagged `insufficient_ensemble_size`, skips operational alert thresholds (doc 014 lines 190–196). |
-| `EpistemicUncertaintyData` (`output/variable_output.py:90`) | — (no SAP3 target) | **Dropped at the boundary in v0b** (FI-only; doc 014 lines 197–204). Revisit if models emit it. |
-| `VariableOutput.flags: frozenset[ForecastFlag]` (`output/variable_output.py:113`; enum `output/flags.py:4`) | `QcFlag.rule_id` strings | Map to `fi_*` rule ids (table below). |
-| `VariableStatus` (`output/status.py:4`) | `QcStatus` (`types/enums.py:4`) | Map per table below. |
-| `VariableMetadata.unit: Unit` (`output/metadata.py:11`; enum `common/units.py:4`) | `ForecastEnsemble.units: str` (`types/ensemble.py:24`) | Map `Unit` enum → SAP3 canonical unit string (table below). |
-| `ModelOutput.issue_datetime` (`output/model_output.py:13`) | `ForecastEnsemble.issued_at: UtcDatetime` (`types/ensemble.py:22`) | Apply `ensure_utc()`. |
-| per-row `datetime` column (all FI data containers) | `valid_time` column (SAP3 factories require it: `types/ensemble.py:54,91`) | Rename `datetime` → `valid_time`. |
-| `VariableMetadata.forecast_horizon: int` (`output/metadata.py:11`) | `ForecastEnsemble.forecast_horizon_steps: int` (`types/ensemble.py:25`) | **DIRECT** — both int, both step counts. **`forecast_horizon` IS consumed by the adapter** (corrects any prior "never consumed" belief). See note below. |
-| `VariableMetadata.timedelta: timedelta` (`output/metadata.py:10`) | `ForecastEnsemble.time_step: timedelta` (`types/ensemble.py:23`) | **DIRECT** assignment; the adapter derives the time step from this field. |
-| `ModelOutput.variables` inner key (`output/model_output.py`) | `ForecastEnsemble.parameter: str` (`types/ensemble.py:23`) | Variable name is the dict key; `VariableMetadata.name` was removed. Validate against `ForecastParameter = Literal["discharge","water_level"]` and `ModelDataRequirements.target_parameters` (`types/model.py:261`). |
-| `ModelOutput.variables` outer key (`output/model_output.py`) | `StationId` (`types/ids.py`) | Station id (opaque `str` on FI side, Q1); adapter maps str → typed `StationId` per GROUP-path decomposition (§5). |
-| empty `ModelOutput.variables` **or** all-`FAILURE` | `ModelOutputError` (`exceptions.py:17`) | Adapter **raises** — zero usable ensembles (doc 014 lines 160–168, 218–223). |
-
-### Status & flag mapping
-
-`VariableStatus` → `QcStatus` (doc 014 lines 247–248):
-
-| FI `VariableStatus` | SAP3 `QcStatus` (`types/enums.py:4`) | Note |
-|---|---|---|
-| `SUCCESS` | `QC_PASSED` (`"qc_passed"`) | — |
-| `FAILURE` | `QC_FAILED` (`"qc_failed"`) | If **all** variables fail → raise `ModelOutputError` instead. |
-| `PARTIAL` | `QC_SUSPECT` (`"qc_suspect"`) + flag | No exact equivalent; attach `fi_partial_output`. |
-
-`ForecastFlag` → `QcFlag.rule_id` (doc 014 lines 250–253; `fi_` prefix marks FI-origin):
-
-| FI flag (`output/flags.py:4`) / status | SAP3 `QcFlag.rule_id` |
-|---|---|
-| `VariableStatus.PARTIAL` | `fi_partial_output` |
-| `ForecastFlag.HIGH_EPISTEMIC_UNCERTAINTY` | `fi_high_epistemic_uncertainty` |
-| `ForecastFlag.DATA_AVAILABILITY` | `fi_data_availability` |
-
-All FI enums use UPPER_CASE member names; SAP3 stores lowercase `.value`. Convert at the
-boundary — never pass FI enum values into the SAP3 domain layer (doc 014 lines 151, 254).
-
-### Unit mapping (`Unit` enum → SAP3 canonical unit string)
-
-FI's `Unit.value` holds a glyph form (e.g. `"m³/s"`, `common/units.py:5`). SAP3 expects an
-ASCII canonical string from its `parameters` table (doc 014 line 147). The adapter maps by
-**enum member**, not by `.value`:
-
-| FI `Unit` member (`common/units.py`) | FI `.value` | SAP3 canonical string |
-|---|---|---|
-| `M3_PER_S` | `m³/s` | `m3/s` |
-| `MM_PER_DAY` | `mm/day` | `mm/day` |
-| `MM_PER_S` | `mm/s` | `mm/s` |
-| `MM` | `mm` | `mm` |
-| `CM` | `cm` | `cm` |
-| `M` | `m` | `m` |
-| `DEG_C` | `°C` | `degC` |
-| `UNITLESS` | `-` | `-` |
-
-### `QuantileData` operational gap (FI valid ≠ SAP3 usable)
-
-FI's `QuantileData` requires **≥3** quantile levels in `(0,1)`, sorted & unique
-(`output/variable_output.py:39–51`). SAP3's `from_quantiles()` requires **≥7** quantile
-levels **with tail coverage** (min ≤ 0.05 and max ≥ 0.95) (`types/ensemble.py:98–106`).
-
-Consequently an FI model emitting 3–6 quantiles (or without tail coverage) is
-**structurally valid FI output but NOT operationally usable by SAP3** — `from_quantiles()`
-raises `ValueError`. State this to model authors explicitly: FI's quantile floor is a
-permissive structural minimum; SAP3's operational floor is stricter.
-
-### `forecast_horizon` consumption note
-
-Two horizon notions coexist and must not be conflated:
-
-- `VariableMetadata.forecast_horizon` (`output/metadata.py:14`) is the **declared** step
-  count. The adapter assigns it directly to `ForecastEnsemble.forecast_horizon_steps`.
-- SAP3's factories also recompute a horizon internally as
-  `values["valid_time"].n_unique()` (`types/ensemble.py:63,107`). The adapter should ensure
-  these agree (declared horizon == distinct `valid_time` count) and treat a mismatch as a
-  structural error.
-
-### `success` property caveat
-
-`ModelOutput.success` returns `True` over an empty iterable, because `all()` of nothing is
-`True`. The adapter **must not** rely on `success` alone to gate conversion
-(doc 014 lines 219–223).
-
-> **Discrepancy vs doc 014:** Current FI now *forbids* empty `variables` at construction —
-> `ModelOutput._validate_variables` raises if the outer (station) dict is empty, if any
-> station maps to an empty inner (variable) dict, or if any station-id / variable-name key is
-> empty / whitespace (`output/model_output.py`). So the empty-variables case is no longer
-> constructible through the public API. The all-`FAILURE` → `ModelOutputError` guard remains
-> live and necessary; the empty-variables guard is now defense-in-depth.
-
----
-
-## 4. Input mapping (SAP3 → FI)
-
-Per doc 014 Task 3 (lines 257–283), SAP3 will PR FI's input types. FI's input contract is
-*already partially implemented* in this repo (`forecast_interface/input/`) — see the
-discrepancy note at the end of this section.
-
-### Concept → field mapping
-
-| FI input concept (`path:line`) | SAP3 `ModelDataRequirements` field (`types/model.py:260`) |
-|---|---|
-| `past_known` temporality (`input/requirement.py:9`) | `past_dynamic_features: frozenset[str]` (line 262) |
-| `future_known` temporality (`input/requirement.py:10`) | `future_dynamic_features: frozenset[str]` (line 263) |
-| `InputRequirement.static` (`input/requirement.py:37`) | `static_features: frozenset[str]` (line 264) |
-| `PastKnownVariable.lookback` (`input/variable.py:12`) | `lookback_steps: int` (line 266) |
-| `FutureKnownVariable.future_steps` (`input/variable.py:31`) | `forecast_horizon_steps: int` (line 267) |
-| `InputRequirement.dynamic` `timedelta` keys + `VariableMetadata.timedelta` | `supported_time_steps: frozenset[timedelta]` (line 265) |
-| `SpatialRepresentation` keys (`input/requirement.py`) | `spatial_input_type: SpatialRepresentation` (line 268) |
-| `InputRequirement.targets` keys + `TargetSpec.unit`/`.representations` (`input/target.py`) | `target_parameters: frozenset[str]` (line 261) |
-| `PastKnownVariable.max_nan` / `FutureKnownVariable.max_nan` (`input/variable.py:13,32`) | Derivable from SAP3 QC config (doc 014 line 273) |
-| `FutureKnownVariable.ensemble_mode` (`input/variable.py:33`) | Derivable from NWP ensemble config (doc 014 line 273) |
-
-### Assembled inputs (4-slot)
-
-After requirement matching, SAP3 assembles concrete DataFrames into a 4-slot shape, passed
-to the FI model via the adapter. The slots are identical for station and group:
-
-| Slot | `StationModelInputs` (`types/model.py:59`, via `StationInputData` line 51) | `GroupModelInputs` (`types/model.py:78`) |
-|---|---|---|
-| `past_targets` | target history | stacked, `station_id`-keyed |
-| `past_dynamic` | past dynamic features | stacked |
-| `future_dynamic` | future dynamic features | stacked |
-| `static` (`pl.DataFrame \| None`) | catchment attributes | stacked, one row per station |
-
-`GroupModelInputs.for_station()` (`types/model.py:89`) slices a group into per-station
-`StationInputData`. Both carry `issue_time`, `forecast_horizon_steps`, `time_step`.
-
-### Spatial enum mapping (FI → SAP3)
-
-As of Phase 1, FI's `SpatialRepresentation` (`common/resolutions.py`) adopts SAP3's exact
-member names and values (`types/enums.py:73`), so the mapping is **identity**:
-
-| FI `SpatialRepresentation` | SAP3 `SpatialRepresentation` |
-|---|---|
-| `POINT` (`"point"`) | `POINT` (`"point"`) |
-| `BASIN_AVERAGE` (`"basin_average"`) | `BASIN_AVERAGE` (`"basin_average"`) |
-| `ELEVATION_BAND` (`"elevation_band"`) | `ELEVATION_BAND` (`"elevation_band"`) |
-| `GRIDDED` (`"gridded"`) | `GRIDDED` (`"gridded"`) |
-
-> The earlier `LUMPED`/`HRU` names were renamed to `BASIN_AVERAGE`/`ELEVATION_BAND` and
-> `POINT` was added, completing the alignment proposed in the SAP3→FI input PR.
-
----
-
-## 5. Station identity & the GROUP path (Option a)
-
-FI's `ModelOutput.variables` is now **station-keyed**,
-`dict[str, dict[str, VariableOutput]]` (`output/model_output.py`) — keyed first by
-`station_id`, then by `variable_name`. SAP3's `GroupForecastModel.predict_batch()` requires
-per-station results (`dict[StationId, ...]`, `protocols/forecast_model.py:63`).
-
-**Decision realized (doc 014 "Option (a)", lines 208–217):** FI adopts **station-keyed
-output** so the GROUP-path adapter can map per-station 1:1. This realizes the GROUP-path
-per-station decomposition (Option a):
-
-```
-ModelOutput.variables : dict[station_id, dict[variable, VariableOutput]]
-```
-
-- Single-station models return a **one-key dict** (one station id → its variable map).
-- Missing stations are **explicit `FAILURE` entries** (the model echoes back every station
-  id it was given), never absent keys.
-- The STATION-path adapter unwraps the single key into
-  `tuple[dict[str, ForecastEnsemble], bytes | None]`.
-- The GROUP-path adapter maps each station key → one `(forecast_dict, state)` entry of the
-  `dict[StationId, tuple[...]]` return.
-- Station ids are opaque `str` on the FI side (open item Q1); the adapter maps str → typed
-  `StationId` (UUID) at the boundary.
-
-**Cross-repo coordination item (FLAG):** SAP3's adapter design in doc 014 is currently
-**STATION-path-only** for v0b (lines 208–217 explicitly defer GROUP support). Now that FI
-has moved to station-keyed output, SAP3's `ForecastInterfaceAdapter` must be extended to
-consume it. Until SAP3 lands its side, GROUP-path FI wrapping is not possible. This is the
-single largest open structural divergence between the two repos.
-
----
-
-## 6. State bridge
-
-FI is **state-free**: the `ForecastModel` protocol (`interface/protocol.py:10`) has no
-state parameter or return, and `ModelOutput` carries no warm-up snapshot.
-
-SAP3 carries warm-up state as `bytes | None`:
-
-- `StationForecastModel.predict(..., prior_state: bytes | None = None)` →
-  `tuple[..., bytes | None]` (`protocols/forecast_model.py:32–39`).
-- The state lifecycle is handled **entirely by the adapter**, not by FI.
-
-**v0b:** FI-wrapped models are stateless (doc 014 lines 205–207). The adapter:
-
-- Ignores `prior_state` (nothing to feed an FI model).
-- Returns `(forecast_dict, None)` — no state to snapshot.
-
-**v1 (future):** optional `dump_state` / `restore_state` methods on the FI `ForecastModel`
-protocol would let conceptual / hybrid models round-trip warm-up state through the adapter
-(doc 014 lines 129–134, 206–207). Not part of the current FI protocol.
-
----
-
-## 7. Artifact metadata ownership
-
-Training provenance and storage metadata are split between FI-declared / artifact-embedded
-fields and SAP3-stored fields (`ModelArtifactRecord`, `types/model.py:290`).
-
-| Field / concept | Owner | Where |
-|---|---|---|
-| Model identity / name | FI / artifact | `ModelOutput.model_name` (`output/model_output.py:12`); embedded in artifact |
-| `interface_version` | FI / artifact | declared by model, embedded in artifact |
-| `model_version` | FI / artifact | declared by model, embedded in artifact |
-| Training provenance hashes | FI / artifact | embedded in artifact |
-| Training seed | FI / artifact | embedded in artifact (deterministic training) |
-| Product / data-source versions | FI / artifact | embedded in artifact |
-| Region scope | FI / artifact | declared by model, embedded in artifact |
-| Embedding-key behaviour when station set differs | FI / artifact | model declares how it keys stations (relevant to GROUP path, §5) |
-| `sha256_hash` | **SAP3** | `ModelArtifactRecord.sha256_hash` (`types/model.py:297`) |
-| `training_period_start` / `_end` | **SAP3** | `types/model.py:298–299` |
-| `trained_at` | **SAP3** | `types/model.py:300` |
-| `status` | **SAP3** | `ModelArtifactStatus` (`types/model.py:295`; enum `types/enums.py:47`) |
-| scope / `group_id` / `station_id` | **SAP3** | `types/model.py:293–294`; scope via `ArtifactScope` (`types/enums.py:41`) |
-
-FI/artifact side answers *"what is this model and how was it built"*; SAP3 side answers
-*"which trained binary is stored, for which scope, in what lifecycle state"*.
-
----
-
-## 8. Open cross-repo items
-
-| # | Item | Detail |
-|---|---|---|
-| 1 | Typed IDs vs str | SAP3 `StationId`/`StationGroupId` are `NewType(UUID)` but `ModelId` is `NewType(str)` (`types/ids.py:4,16,21`). FI uses free-form `str` for variable/model names. Decide whether FI adopts typed ids at the boundary or the adapter parses str→UUID. |
-| 2 | SAP3 input-types PR scope | Whether the SAP3→FI PR lands `target_parameters` and `spatial_input_type`/`POINT` into FI's input spec (doc 014 lines 275–276; §4 above). |
-| 3 | GROUP-path adapter extension | Station-keyed FI output (§5) requires SAP3's STATION-only adapter design to extend to GROUP. Largest structural divergence. |
-| 4 | Quantile floor mismatch | FI requires ≥3 quantiles (`output/variable_output.py:39–51`); SAP3 requires ≥7 with tail coverage (`types/ensemble.py:98–106`). FI output can be valid yet operationally unusable (§3). |
-| 5 | Epistemic uncertainty | `EpistemicUncertaintyData` is dropped at the boundary in v0b (doc 014 lines 197–204). Revisit (add to `ForecastEnsemble` / store as metadata) if models emit it. |
-| 6 | Interface module now exists | doc 014 assumes FI's `interface/` is unimplemented (lines 80, 287). It is now implemented (`ForecastModel`, `ModelResult`, `FailureCause`). SAP3 should re-evaluate Tasks 4–5 against the real protocol. |
-| 7 | `ModelResult` failure channel | FI now returns `ModelResult = ModelSuccess \| ModelFailure` (`interface/result.py:40`) with a `FailureCause` enum (`interface/failure.py:4`). SAP3's `ModelOutputError` path must account for the `ModelFailure` branch, not only all-`FAILURE` `ModelOutput`. |
-| 8 | Resolution enum split | FI now keeps `SpatialRepresentation` (`common/resolutions.py`) and uses `timedelta` for time steps; doc 014 references a single `Resolution`. Mapping tables above use the current FI shape. |
-```
diff --git a/docs/model_interface.md b/docs/model_interface.md
index d8a9bd4..e4323fd 100644
--- a/docs/model_interface.md
+++ b/docs/model_interface.md
@@ -2,7 +2,7 @@
 
 The primary goal of this package is to define the interface between any forecasting library and the forecasting model. The forecasting model can be implemented in any package / code base but needs to follow the protocol defined here.
 
-There are **three protocols**: the required `ForecastModel`, plus two optional extensions — `RetrainableModel` (warm-start `retrain`) and `BatchHindcastModel` (efficient batch `hindcast`), both of which extend `ForecastModel`. A `StatefulModel` extension is **reserved** for future conceptual / hybrid models (see *Warm-up and state* below). The scope of a model (single station vs. group / national) is **declared** via `artifact_scope`, not split into separate protocols. SAP3 consumes the FI protocol through a thin adapter that dispatches to its own `StationForecastModel` / `GroupForecastModel` — see [`docs/fi-sap3-mapping.md`](./fi-sap3-mapping.md). The driving requirements for the first (Nepal v1) integration are in [`docs/nepal-model-requirements.md`](./nepal-model-requirements.md).
+There are **three protocols**: the required `ForecastModel`, plus two optional extensions — `RetrainableModel` (warm-start `retrain`) and `BatchHindcastModel` (efficient batch `hindcast`), both of which extend `ForecastModel`. A `StatefulModel` extension is **reserved** for future conceptual / hybrid models (see *Warm-up and state* below). The scope of a model (single station vs. group / national) is **declared** via `artifact_scope`, not split into separate protocols. SAP3 consumes the FI protocol through a thin adapter (built in SAPPHIRE_flow) that dispatches to its own `StationForecastModel` / `GroupForecastModel`. The first integration target is Nepal v1.
 
 Core functionalities include:
 
@@ -19,7 +19,7 @@ Produce a `TrainedArtifact` from training inputs. See the Training & Lifecycle P
 
 ## Training & Lifecycle Protocol
 
-> **Status: implemented** in `forecast_interface/interface/` (`protocol.py`, `scope.py`, `artifact.py`). The `inputs` parameters use FI-owned `ModelInputs`; only `config` remains **provisional** — typed `Any` until the model-config type is co-designed with SAP3 (Q8). Rich `TrainedArtifact` provenance metadata is **deferred to Phase 4** (see [`docs/nepal-model-requirements.md`](./nepal-model-requirements.md) §4); the group-artifact embedding-key / station-set-mismatch contract is **v1 load-bearing** (see the `TrainedArtifact` section below and decision 1.10).
+> **Status: implemented** in `forecast_interface/interface/` (`protocol.py`, `scope.py`, `artifact.py`). The `inputs` parameters use FI-owned `ModelInputs`; only `config` remains **provisional** — typed `Any` until the model-config type is co-designed with SAP3 (Q8). Rich `TrainedArtifact` provenance metadata is **deferred to Phase 4**; the group-artifact embedding-key / station-set-mismatch contract is **v1 load-bearing** (see the `TrainedArtifact` section below and decision 1.10).
 
 ### Scope: `ArtifactScope`
 
@@ -62,7 +62,7 @@ The `inputs` parameters use `ModelInputs`; only `config` is typed `Any` (provisi
 - **Self-contained**: `serialize_artifact` produces `bytes` that embed all weights, scalers, and metadata — **with no absolute filesystem paths** and no machine-local references.
 - **Deployment-portable**: `deserialize_artifact(serialize_artifact(a))` must reconstruct an artifact that runs **unchanged on another SAP3 instance**.
 
-**Partly deferred.** Rich provenance metadata (scope, region, training period, hashes, seed, product versions) is **not** part of the marker Protocol yet and lands in Phase 4 (see [`docs/nepal-model-requirements.md`](./nepal-model-requirements.md) §4).
+**Partly deferred.** Rich provenance metadata (scope, region, training period, hashes, seed, product versions) is **not** part of the marker Protocol yet and lands in Phase 4.
 
 The **group-artifact embedding-key / station-set-mismatch contract**, however, is **load-bearing from v1**, because `GROUP` artifacts ship from the start (decision 1.10) and east→west transfer is a Nepal v1 target (decision 1.6) — re-evaluate its earlier Phase 4 deferral. The contract: a `GROUP` artifact **embeds the meaningful station strings it was trained on** (which the model reads to key per-station state); it must define behaviour when the predict-time station set differs from the trained set — **known** stations use stored state, **unknown** stations are generalized from static attributes or rejected with an explicit error — and it must **never silently mis-associate** a prediction with the wrong station. Station strings are **stable, meaningful identifiers** that round-trip unchanged through `serialize_artifact` / `deserialize_artifact` and across deployments; the model never alters them.
 
@@ -83,7 +83,7 @@ FI's protocol is **state-free in v0**: `predict` / `hindcast` take no `state` pa
 
 ### Output stays FI-authoritative
 
-`predict` / `hindcast` **return** `ModelResult` → `ModelOutput` (defined below). `ModelOutput` is **not** replaced by SAP3's `ForecastEnsemble`: the SAP3 adapter maps `ModelOutput` *into* its own representation, never the other way around. See [`docs/fi-sap3-mapping.md`](./fi-sap3-mapping.md) for the field-level mapping.
+`predict` / `hindcast` **return** `ModelResult` → `ModelOutput` (defined below). `ModelOutput` is **not** replaced by SAP3's `ForecastEnsemble`: the SAP3 adapter maps `ModelOutput` *into* its own representation, never the other way around. The field-level mapping is implemented by the SAP3 adapter (in SAPPHIRE_flow).
 
 ### Failure & result model
 
diff --git a/docs/nepal-model-requirements.md b/docs/nepal-model-requirements.md
deleted file mode 100644
index d309413..0000000
--- a/docs/nepal-model-requirements.md
+++ /dev/null
@@ -1,251 +0,0 @@
-# Nepal Requirements for ForecastInterface and Model Implementers
-
-**Date:** 2026-06-11
-**Audience:** ForecastInterface maintainer, model implementer, SAPPHIRE Flow
-maintainers
-
-## Goal
-
-ForecastInterface must support a Nepal workflow where SAPPHIRE Flow implements
-and validates the eastern part of the country, while DHM/hydromet staff train
-and configure models for the western part. The same interface must also allow
-operators to test one model across all gauges or keep separate east/west models.
-
-## Current Alignment With SAPPHIRE Flow
-
-SAPPHIRE Flow already has these concepts internally:
-
-- explicit model data requirements:
-  - target parameters;
-  - past dynamic features;
-  - future dynamic features;
-  - static features;
-  - supported time steps;
-  - lookback steps;
-  - forecast horizon;
-  - spatial input type;
-- station-scoped and group-scoped model artifacts;
-- stacked multi-station inputs for group models;
-- per-station output from group models;
-- active/superseded artifact lifecycle;
-- hindcast and skill computation after training;
-- multiple model assignments per station;
-- pooled and BMA forecast combination for member ensembles.
-
-ForecastInterface already has:
-
-- an `InputRequirement` structure with temporal/spatial/product axes;
-- `predict()` and `hindcast()` protocol methods;
-- output containers for deterministic, quantile, trajectory, and epistemic
-  uncertainty data.
-
-The main missing pieces are training/retraining, artifact metadata/provenance,
-target declaration, and the station dimension for multi-station models.
-
-## Required Interface Capabilities
-
-### 1. Explicit target declaration
-
-ForecastInterface must declare what variables the model forecasts, separate
-from predictor variables.
-
-Required:
-
-- `target_variables`, e.g. `{"discharge"}` or `{"water_level", "discharge"}`;
-- units for each target;
-- output representation supported for each target: deterministic, quantiles,
-  trajectories/members, or combinations;
-- validation that output variable keys match declared targets unless a model
-  returns a documented subset with `PARTIAL` status.
-
-Why: SAPPHIRE Flow uses target variables for station compatibility, skill
-scoring, API storage, and deciding which observations are needed.
-
-### 2. Station and group model output shape
-
-ForecastInterface must support models that predict for many gauges in one call.
-
-Required output shape:
-
-```python
-variables: dict[str, dict[str, VariableOutput]]
-#          station_id  variable_name
-```
-
-Rules:
-
-- station IDs in the output must match station IDs in the input;
-- single-station models return a one-station mapping;
-- forecast DataFrames can remain per-station and do not need a `station_id`
-  column if the station dimension is in the outer dict;
-- missing station outputs must be explicit failures, not absent keys.
-
-### 3. Training and retraining protocol
-
-ForecastInterface needs a training API in addition to `predict()` and
-`hindcast()`.
-
-Required operations:
-
-```python
-train(inputs, *, config, rng) -> TrainedArtifact
-retrain(base_artifact, inputs, *, config, rng) -> TrainedArtifact
-serialize_artifact(artifact) -> bytes
-deserialize_artifact(raw: bytes) -> TrainedArtifact
-```
-
-Acceptable naming can be finalized by the ForecastInterface maintainer, but the
-capabilities must exist.
-
-Required behavior:
-
-- `train()` creates a new model artifact from training data.
-- `retrain()` can start from an existing artifact or checkpoint and update it
-  with new data.
-- Retraining must be deterministic when config, input data, and random seed are
-  fixed.
-- The model must fail with a clear error when the provided data does not satisfy
-  declared requirements.
-- Training and retraining must work for station-scoped, regional group-scoped,
-  and national group-scoped models.
-
-**Resolved (SAPPHIRE Flow, 2026-06-11): cold retrain required, warm-start optional.**
-`train()` — a full rebuild on the (updated) dataset — is the **required** contract every
-model must implement; it works for conceptual and ML models alike. `retrain(base_artifact,
-…)` / warm-start (fine-tune from an existing artifact) is an **optional** capability for
-models that support it; a model that does not implement it simply cold-trains. This
-resolves the open question below: full rebuild is the required baseline, warm-start is an
-optional optimisation.
-
-### 4. Artifact metadata and provenance
-
-Every trained artifact must carry metadata that SAPPHIRE Flow can store and use
-for safe promotion/rollback.
-
-Required artifact metadata:
-
-| Field | Purpose |
-|---|---|
-| `model_name` / `model_id` | Stable model identity. |
-| `interface_version` | Compatibility with ForecastInterface. |
-| `model_version` | Code/config version from implementer. |
-| `artifact_scope` | `station`, `group`, or `national`/country-level group. |
-| `region_scope` | `east`, `west`, `national`, or another agreed label. |
-| `station_ids` | Gauges used for training and valid application. |
-| `training_period_start`, `training_period_end` | Reproducibility and skill context. |
-| `input_requirement_hash` | Detects incompatible interface changes. |
-| `training_data_hash` | Detects data changes. |
-| `catchment_package_version` | Links artifact to gateway shapefile/catchment version. |
-| `snowmapper_product_version` | Links artifact to SnowMapper inputs if used. |
-| `weather_product_versions` | Links artifact to NWP/reanalysis inputs. |
-| `random_seed` | Reproducibility. |
-| `created_at_utc` | Audit trail. |
-
-### 5. Region and application constraints
-
-Model artifacts must make their valid application scope explicit.
-
-Required:
-
-- a national model can declare valid application to all listed Nepal gauges;
-- an eastern model can declare valid application to eastern gauges only unless
-  explicitly marked as transferable/test-only;
-- a western model can declare valid application to western gauges only unless
-  explicitly marked as transferable/test-only;
-- applying a model outside its declared region must be possible in test mode but
-  should require an explicit operator choice.
-
-Why: DHM may want to test one model on all gauges, but accidental cross-region
-promotion should be prevented.
-
-### 6. Output requirements for model comparison and combination
-
-If a model should participate in pooled/BMA combination in SAPPHIRE Flow, it
-must output trajectory/member forecasts, not quantiles only.
-
-Required:
-
-- trajectory/member outputs must have stable member IDs;
-- all members must share the same issue time, valid times, units, and horizon;
-- quantile-only outputs are acceptable for primary/fallback operation but should
-  be marked as not combinable for pooled/BMA.
-
-### 7. Input requirements for SnowMapper and catchment data
-
-ForecastInterface input requirements must be able to express:
-
-- SnowMapper variables such as SWE and snowmelt as past and/or future dynamic
-  features;
-- product/source names and versions;
-- spatial representation: POINT, BASIN_AVERAGE, ELEVATION_BAND, or GRIDDED;
-- static catchment attributes required by the model;
-- allowed missing-data thresholds per variable and product.
-
-### 8. Artifact portability across deployments (staging → production)
-
-HSOL trains models on a cloud **staging** instance; trained artifacts are then promoted to
-the on-prem **production** deployment (where DHM also retrains). The interface must make
-artifacts portable across instances.
-
-Required:
-
-- a trained artifact MUST be **self-contained and deployment-independent**: it serializes
-  to bytes with no absolute paths and no dependence on the training environment, and
-  **deserializes and runs unchanged on a different SAPPHIRE Flow instance**;
-- if a group/national artifact embeds station identifiers (e.g. per-station ML
-  embeddings), it MUST document the **embedding key** and its behaviour when the station
-  set at predict time **differs** from training (a new western gauge; an identifier
-  remapped between staging and production) — it must handle this gracefully or raise an
-  **explicit error**, and never silently associate a station with the wrong embedding.
-
-Why: SAPPHIRE Flow promotes the serialized artifact between instances; it cannot reach
-inside the artifact, so the artifact must be honest about its environment and ID
-assumptions. The provenance fields in §4 (`interface_version`, `input_requirement_hash`,
-`station_ids`) support compatibility checks but do not by themselves guarantee runtime
-portability.
-
-## SAPPHIRE Flow Integration Implications
-
-SAPPHIRE Flow can already store model artifacts and supersede old active
-artifacts. It still needs additional work before full Nepal east/west operation:
-
-- implement configurable retrain strategies;
-- audit group-scoped model assignments and artifact lookup;
-- implement merged data requirements before running multiple models with
-  different inputs on one gauge;
-- add operator workflows for test, promote, rollback, and regional assignment.
-
-ForecastInterface should not assume these are solved by the model package; it
-should expose enough metadata for SAPPHIRE Flow to implement them safely.
-
-## Acceptance Criteria
-
-- A model implementer can train an eastern, western, or national model artifact
-  through ForecastInterface.
-- DHM can retrain a model from an existing artifact using new western Nepal data.
-- The artifact metadata tells SAPPHIRE Flow whether the model is valid for east,
-  west, or all Nepal gauges.
-- A group model can forecast multiple gauges in one call and return explicit
-  per-station outputs.
-- SAPPHIRE Flow can reject or warn on missing target variables, incompatible
-  static/dynamic inputs, unsupported time steps, or invalid region application.
-- Models intended for pooled/BMA combination provide trajectory/member outputs.
-
-## Open Questions for the Model Implementer
-
-- ~~Should retraining always start from a previous artifact, or can it also mean a
-  full rebuild on an expanded dataset?~~ **Resolved:** full rebuild (cold) is the required
-  baseline; warm-start from an artifact is optional (see §3).
-- ~~What is the first intended artifact scope for Nepal: station, east/west group,
-  or national group?~~ **Resolved:** the **eastern regional group** ships first, so the
-  first production artifact is **GROUP-scoped**.
-- ~~Which SnowMapper variables and lead times will the model consume?~~ **Resolved
-  (partial):** starts with **SWE** and **ROF**, as banded forcing at elevation-band
-  granularity (see §7). Lead times still to be confirmed.
-- ~~Are western Nepal models expected to transfer to eastern gauges for testing,
-  or only the other way around?~~ **Resolved:** transfer is **east → west** — an
-  eastern group artifact is applied to western gauges. The eastern artifact must
-  define its behaviour when the station set differs (see §8 embedding-key contract).
-- Can all stateful models reconstruct state from lookback data, or do any
-  require persisted hidden state between forecasts?
-
diff --git a/docs/open_design_questions.md b/docs/open_design_questions.md
index cd5c433..507f976 100644
--- a/docs/open_design_questions.md
+++ b/docs/open_design_questions.md
@@ -10,7 +10,7 @@ ForecastInterface (FI) is the contract model authors implement. SAPPHIRE Flow (S
 - **FI INPUT types are co-designed** via a SAP3 → FI PR.
 - **FI's INTERFACE / protocol is FI-owned**, with SAP3 wrapping thin.
 
-These decisions are reflected in `docs/model_interface.md` and the new `docs/fi-sap3-mapping.md` (the FI ↔ SAP3 adapter mapping). This file does not duplicate their content.
+These decisions are reflected in `docs/model_interface.md` and `docs/input_requirement.md`. This file does not duplicate their content. The FI ↔ SAP3 adapter mapping lives with the adapter in SAPPHIRE_flow.
 
 ---
 
@@ -36,7 +36,7 @@ A single-station model returns a dict with one key, e.g. `{"station_xyz": {"disc
 
 **Cross-repo note:** this advances a v1-deferred GROUP-path item and requires SAP3's adapter to extend from STATION-only to GROUP — a cross-repo coordination item.
 
-**Reflected in:** `docs/model_interface.md`, `docs/fi-sap3-mapping.md`.
+**Reflected in:** `docs/model_interface.md`.
 
 ## 1.2 Target declaration — RESOLVED
 
@@ -87,7 +87,7 @@ Banded Snowmapper SWE / snowmelt is declared at `ELEVATION_BAND`.
 
 **Model developer input (Nepal):** the model emits **quantiles** (count configurable at training); **trajectories** typically ~50 (deployment-specific, may be fewer).
 
-**Reflected in:** `docs/model_interface.md`, `docs/fi-sap3-mapping.md`.
+**Reflected in:** `docs/model_interface.md`.
 
 ## 1.6 Nepal v1 deployment specifics — RESOLVED (model developer)
 
@@ -97,7 +97,7 @@ Banded Snowmapper SWE / snowmelt is declared at `ELEVATION_BAND`.
 - **SnowMapper forcing starts with SWE and ROF** (snow water equivalent and runoff), declared as dynamic forcing at **`BASIN_AVERAGE` or `ELEVATION_BAND`** (see decision 1.4; Q7 broadened this from ELEVATION_BAND-only). Lead times / resolutions follow the ECMWF forecast and ERA5-Land (Q7), with a possible SnowMapper availability lag (Q9).
 - **Artifact transfer direction is east → west** (an eastern group artifact applied to western gauges). This makes the embedding-key / station-set-mismatch contract (Nepal §8) concrete: the eastern GROUP artifact **must define its behaviour when applied to the western station set** — handle gracefully or raise an explicit error, never silently associate a station with the wrong embedding.
 
-**Reflected in:** `docs/nepal-model-requirements.md`, `docs/model_interface.md` (artifact portability), `docs/fi-sap3-mapping.md` (artifact metadata ownership).
+**Reflected in:** `docs/model_interface.md` (artifact portability).
 
 ## 1.7 Failure channel — RESOLVED
 
@@ -198,7 +198,7 @@ Banded Snowmapper SWE / snowmelt is declared at `ELEVATION_BAND`.
 
 **Station-set mismatch (known vs unknown stations):** a GROUP artifact stores its trained member stations. At predict time, **known** stations use the stored per-station state; **unknown** stations (e.g. western gauges under east→west transfer) must be handled by **generalizing from static attributes or raising an explicit error — never silently mis-associating** a prediction with the wrong station / embedding. This is the embedding-key contract (Nepal §8). **Timing flag:** decision 1.6 puts east→west in Nepal v1 and groups ship from the start, so this contract is likely **v1, not Phase 4** — re-evaluate the deferral.
 
-**Reflected in:** `docs/model_interface.md`, `docs/fi-sap3-mapping.md` (§5, §7), `docs/nepal-model-requirements.md` (§8).
+**Reflected in:** `docs/model_interface.md`.
 
 ## 1.9 Input bundle (`inputs`) typing & v1 delivery scope — RESOLVED
 
@@ -214,7 +214,7 @@ Banded Snowmapper SWE / snowmelt is declared at `ELEVATION_BAND`.
 
 **`config` (train / predict): OPEN — modeller-owned** (see Q8). Until specified, `config` stays `Any`.
 
-**Reflected in:** `docs/input_requirement.md`, `docs/fi-sap3-mapping.md` (§4), and a future `ModelInputs` type (code).
+**Reflected in:** `docs/input_requirement.md`, and the `ModelInputs` type (code).
 
 ---
 
diff --git a/forecast_interface/__init__.py b/forecast_interface/__init__.py
index bc50ee9..fd2addb 100644
--- a/forecast_interface/__init__.py
+++ b/forecast_interface/__init__.py
@@ -1,4 +1,4 @@
-__version__ = "0.1.14"
+__version__ = "0.1.15"
 
 from .common import AggregationMethod
 from .input import (
diff --git a/pyproject.toml b/pyproject.toml
index 0fd38ab..72cc8fa 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -1,6 +1,6 @@
 [project]
 name = "forecastinterface"
-version = "0.1.14"
+version = "0.1.15"
 description = "Add your description here"
 readme = "README.md"
 requires-python = ">=3.11"
@@ -25,7 +25,7 @@ init_typed = true
 warn_required_dynamic_aliases = true
 
 [tool.bumpversion]
-current_version = "0.1.14"
+current_version = "0.1.15"
 commit = false
 tag = false
 allow_dirty = true
diff --git a/uv.lock b/uv.lock
index a4156fe..31294e2 100644
--- a/uv.lock
+++ b/uv.lock
@@ -129,7 +129,7 @@ wheels = [
 
 [[package]]
 name = "forecastinterface"
-version = "0.1.14"
+version = "0.1.15"
 source = { virtual = "." }
 dependencies = [
     { name = "polars" },

From bbdb17e84997c7c63d9c4d9dd6c4f2cc2369b0b8 Mon Sep 17 00:00:00 2001
From: Beatrice Marti <mabesa@users.noreply.github.com>
Date: Wed, 17 Jun 2026 11:45:30 +0200
Subject: [PATCH 15/16] docs: wrap the model-developer (Sandro) asks into
 open_design_questions
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Restructure section 2 into the single place for what we need from the
model developer:
- "Still open" callout — Q8 (config), Q9 (lag), Q10 (station-string)
- "Deviations from the original proposal — please confirm" — the changes
  from the init proposal (lifecycle ownership, forecast→predict, station
  keying, spatial vocab, ensemble enum, failure channel, hindcast optional,
  timedelta) each with its decision ref
- add Q10 (station-string identity) as a first-class open coordination item

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 docs/open_design_questions.md  | 29 ++++++++++++++++++++++++++++-
 forecast_interface/__init__.py |  2 +-
 pyproject.toml                 |  4 ++--
 uv.lock                        |  2 +-
 4 files changed, 32 insertions(+), 5 deletions(-)

diff --git a/docs/open_design_questions.md b/docs/open_design_questions.md
index 507f976..fc3ede2 100644
--- a/docs/open_design_questions.md
+++ b/docs/open_design_questions.md
@@ -218,7 +218,30 @@ Banded Snowmapper SWE / snowmelt is declared at `ELEVATION_BAND`.
 
 ---
 
-# 2. Open questions for the model developer
+# 2. For the model developer
+
+This section is the single place for what we need from / owe to the model developer: **open questions** still awaiting input, and **deviations** from the original proposal we'd like confirmed. The full Q&A record (answered + open) follows.
+
+## Still open — needs your input
+
+- **`config` contents (Q8)** — what the model needs in `config` at train / predict time.
+- **Per-product availability lag (Q9)** — how to express SnowMapper's lag behind ECMWF.
+- **Station-string identity (Q10)** — the exact string your artifacts store (human code vs UUID).
+
+## Deviations from the original proposal — please confirm
+
+The interface diverged from the original `init` proposal in these ways. Each is justified (see the linked decision / spec), but they change the original design, so we'd like your sign-off:
+
+- **Lifecycle ownership** — artifacts are **framework-owned** (`train` → `serialize_artifact` → SAP3 stores → `deserialize_artifact`) rather than loaded inside `__init__`; the model is artifact-stateless (see `docs/model_interface.md`, *Training & Lifecycle Protocol*).
+- **`forecast()` → `predict()`** — renamed to match SAP3.
+- **Output keying** — `dict[variable]` → **station-keyed** `dict[station][variable]` (decision 1.1).
+- **Spatial vocabulary** — `distributed` / `lumped` → `POINT` / `BASIN_AVERAGE` / `ELEVATION_BAND` / `GRIDDED` (decision 1.4).
+- **Ensemble flag** — `ensemble: bool` → `EnsembleMode` enum.
+- **Failure channel** — `forecast() → ModelOutput` → `predict() → ModelResult` (`Success | Failure`) (decision 1.7).
+- **Hindcast** — demoted from a core method to the optional `BatchHindcastModel` (decision 1.8).
+- **Time step** — `TemporalResolution` enum → `timedelta` keys (decision 1.12).
+
+## Question record
 
 A decision-ready list. Each needs the model developer's input before the corresponding spec is frozen.
 
@@ -267,3 +290,7 @@ What does the model need in `config` at `train` and `predict` time, beyond `inpu
 ### Q9 — Per-product availability lag — OPEN
 
 Products derived downstream (e.g. SnowMapper SWE / RoF, which run *after* their driving ECMWF forecast) may become available **later** than their nominal forcing — their future-known series lags the issue time. `InputRequirement`'s variable properties (`lookback`, `future_steps`, `max_nan`, `ensemble_mode`) have **no explicit lag / offset** field. Decide whether to **(a)** add a per-variable `availability_lag` (in steps), or **(b)** absorb it via `max_nan` / a shorter `future_steps`. Needs modeller + data-availability input.
+
+### Q10 — Station-string identity — OPEN (coordination)
+
+What exact string do your trained artifacts store as the station key — the **human / network station code**, or the **UUID string**? FI station keys are opaque `str` (decision 1.10), but they must match what the artifact embeds and be **stable across deployments** (staging → prod, east → west). Deployment portability argues for the code; the SAP3 adapter then maps `StationId` (UUID) ↔ that string. Confirm so the artifact and the adapter agree.
diff --git a/forecast_interface/__init__.py b/forecast_interface/__init__.py
index fd2addb..4d047d6 100644
--- a/forecast_interface/__init__.py
+++ b/forecast_interface/__init__.py
@@ -1,4 +1,4 @@
-__version__ = "0.1.15"
+__version__ = "0.1.16"
 
 from .common import AggregationMethod
 from .input import (
diff --git a/pyproject.toml b/pyproject.toml
index 72cc8fa..0731393 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -1,6 +1,6 @@
 [project]
 name = "forecastinterface"
-version = "0.1.15"
+version = "0.1.16"
 description = "Add your description here"
 readme = "README.md"
 requires-python = ">=3.11"
@@ -25,7 +25,7 @@ init_typed = true
 warn_required_dynamic_aliases = true
 
 [tool.bumpversion]
-current_version = "0.1.15"
+current_version = "0.1.16"
 commit = false
 tag = false
 allow_dirty = true
diff --git a/uv.lock b/uv.lock
index 31294e2..b6a99d2 100644
--- a/uv.lock
+++ b/uv.lock
@@ -129,7 +129,7 @@ wheels = [
 
 [[package]]
 name = "forecastinterface"
-version = "0.1.15"
+version = "0.1.16"
 source = { virtual = "." }
 dependencies = [
     { name = "polars" },

From 303aa422da45e293070ef1522251c782bbbf2b7b Mon Sep 17 00:00:00 2001
From: Beatrice Marti <mabesa@users.noreply.github.com>
Date: Wed, 17 Jun 2026 11:58:06 +0200
Subject: [PATCH 16/16] docs: progress the open model-developer questions
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Q9 (availability lag): RESOLVED mechanism — reduced per-variable
  future_steps (+ max_nan), no dedicated availability_lag field;
  concrete step-counts still owed by Sandro/data
- Q10 (station-string identity): RESOLVED — station/gauge code (not
  per-DB UUID); UUID stays SAP3-internal, adapter maps code<->UUID;
  Sandro to confirm artifacts key on the code (decision 1.10 updated)
- Q8 (config): sharpened to the two-step process — Sandro enumerates,
  then we partition model-private vs operationally-shared (likely the
  quantile levels); train/retrain-only scoping fixed
- refreshed the "Still open" callout to current state

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
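Note (illustration only): a minimal sketch of the Q9 mechanism, where the
availability lag is carried by a smaller per-variable `future_steps` rather
than a dedicated field. `VariableRequirement`, `implied_lag`, and the variable
names are hypothetical stand-ins for the per-variable properties of
`InputRequirement`; the step counts 15 and 13 are the example from the doc
text, and `max_nan=1` is an assumed value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariableRequirement:
    # Hypothetical stand-in for one variable's entry in InputRequirement.
    name: str
    future_steps: int  # how many future steps the product covers at issue time
    max_nan: int = 0   # tolerated missing values (absorbs a ragged tail)


def implied_lag(driver: VariableRequirement, lagging: VariableRequirement) -> int:
    """Steps by which the lagging product's future coverage falls short."""
    return driver.future_steps - lagging.future_steps


ecmwf_precip = VariableRequirement("ecmwf_precip", future_steps=15)
snowmapper_swe = VariableRequirement("snowmapper_swe", future_steps=13, max_nan=1)

# The lag is never declared; it follows from the two future_steps values.
lag_steps = implied_lag(ecmwf_precip, snowmapper_swe)
```

With the example values, `lag_steps` is 2; no `availability_lag` attribute
exists anywhere in the sketch.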
 docs/open_design_questions.md  | 34 +++++++++++++++++++++++-----------
 forecast_interface/__init__.py |  2 +-
 pyproject.toml                 |  4 ++--
 uv.lock                        |  2 +-
 4 files changed, 27 insertions(+), 15 deletions(-)
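
Note (illustration only): a minimal sketch of the Q10 boundary, assuming the
SAP3 adapter keeps a two-way lookup between its internal UUIDs and the station
codes that FI and the artifacts see. `StationCodeMap` and its methods are
hypothetical; the actual adapter lives in SAPPHIRE_flow and may be structured
differently.

```python
from uuid import UUID, uuid4


class StationCodeMap:
    """Hypothetical adapter-side map: internal UUID <-> portable station code."""

    def __init__(self, code_by_id: dict[UUID, str]) -> None:
        # Codes must be unique, otherwise the reverse lookup is ambiguous.
        if len(set(code_by_id.values())) != len(code_by_id):
            raise ValueError("station codes must be unique")
        self._code_by_id = dict(code_by_id)
        self._id_by_code = {code: sid for sid, code in code_by_id.items()}

    def to_code(self, station_id: UUID) -> str:
        # Used on the way in: FI and the artifact only ever see the code.
        return self._code_by_id[station_id]

    def to_id(self, code: str) -> UUID:
        # Used on the way out: unknown codes raise KeyError instead of guessing.
        return self._id_by_code[code]


east_id = uuid4()
mapping = StationCodeMap({east_id: "code_a"})
```

Because only the code crosses the boundary, a different database can assign a
different UUID to the same gauge without invalidating the artifact.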

diff --git a/docs/open_design_questions.md b/docs/open_design_questions.md
index fc3ede2..12fb62d 100644
--- a/docs/open_design_questions.md
+++ b/docs/open_design_questions.md
@@ -192,7 +192,7 @@ Banded Snowmapper SWE / snowmelt is declared at `ELEVATION_BAND`.
 
 - The same station string must be used consistently across **train → artifact → predict → output**.
 - The model may **read** the key (look it up in its artifact) but must **never alter** it; output is keyed by exactly the strings received.
-- The string must be **stable across deployments** (staging → prod, east → west), because artifacts embed it and must be portable. This argues for a **deployment-stable human / network station code**, NOT a per-DB UUID. **Open coordination item:** confirm the exact string the modeller's artifacts store; the SAP3 adapter maps `StationId` (UUID) ↔ that string (via `StationConfig.code` if it is the code).
+- The string must be **stable across deployments** (staging → prod, east → west), because artifacts embed it and must be portable. **Resolved (Q10):** the key is the **station / gauge code**, not a per-DB UUID; the SAP3 adapter maps `StationId` (UUID) ↔ code (via `StationConfig.code`). Residual: Sandro confirms his trained artifacts key on the code.
 
 **Group support from v1:** `ArtifactScope.GROUP` is load-bearing from the start (consistent with 1.6). Multiple input stations may **share one group artifact**; output stays **1:1 station-in / station-out** — every input station gets an output entry. Grouping is about *artifact sharing*, not output cardinality.
 
@@ -224,9 +224,9 @@ This section is the single place for what we need from / owe to the model develo
 
 ## Still open — needs your input
 
-- **`config` contents (Q8)** — what the model needs in `config` at train / predict time.
-- **Per-product availability lag (Q9)** — how to express SnowMapper's lag behind ECMWF.
-- **Station-string identity (Q10)** — the exact string your artifacts store (human code vs UUID).
+- **`config` contents (Q8)** — Sandro must enumerate the train-time config before we partition ownership and type it. *(The main open item.)*
+- **Availability-lag values (Q9)** — mechanism settled (reduced per-variable `future_steps`); the concrete SnowMapper step-counts are owed by Sandro / data availability.
+- **Station-code confirmation (Q10)** — key decided (station code); Sandro confirms his artifacts key on it (re-key if not).
 
 ## Deviations from the original proposal — please confirm
 
@@ -247,7 +247,7 @@ A decision-ready list. Each needs the model developer's input before the corresp
 
 ### Q1 — Station ID typing — ANSWERED
 
-**Answer:** opaque `str` keys (not typed UUIDs), but they **carry meaning and are stable** across train / predict / deployment — the trained station strings are stored inside the artifact and read by the model for per-station lookup. Group artifacts shared across multiple stations are supported **from v1**; output is **1:1 station-in / station-out**; the station-set-mismatch case (unknown stations) is handled by the embedding-key contract (generalize or raise, never silent). **Residual coordination item:** the exact string identity (human code vs UUID-string) must match what the modeller's artifacts store — deployment portability argues for the code. See decision 1.10.
+**Answer:** opaque `str` keys (not typed UUIDs), but they **carry meaning and are stable** across train / predict / deployment — the trained station strings are stored inside the artifact and read by the model for per-station lookup. Group artifacts shared across multiple stations are supported **from v1**; output is **1:1 station-in / station-out**; the station-set-mismatch case (unknown stations) is handled by the embedding-key contract (generalize or raise, never silent). **Residual:** resolved to the **station code** (Q10 / decision 1.10); Sandro confirms his artifacts key on it.
 
 ### Q2 — Past-target availability — ANSWERED
 
@@ -275,9 +275,15 @@ The adapter maps `ModelOutput.issue_datetime` → `ForecastEnsemble.issued_at` a
 
 **Now scoped by decision 1.8:** a plain `ForecastModel` (`predict`) always emits a constant per-row `issue_datetime` equal to the top-level one — so for the required surface this *can* carry a strict cross-validator. The varying-per-row case exists **only** on the optional `BatchHindcastModel` path, where the validator must instead check that the per-row `issue_datetime` matches the batch's declared `issue_datetimes`. So the answer likely differs by protocol: strict equality for `predict`, set-membership for batch `hindcast`.
 
-### Q8 — `config` contents (train / predict) — OPEN (modeller-owned)
+### Q8 — `config` contents — OPEN (awaiting Sandro, then partition)
 
-What does the model need in `config` at `train` and `predict` time, beyond `inputs` and the injected `rng`? Candidates: training hyperparameters, target quantile levels / trajectory count, forecast horizon, validation split, early-stopping criteria, seeds beyond `rng`. This is **modeller-owned** and must be specified before `config: Any` can be typed (decision 1.9).
+`config` is passed to **`train` / `retrain` only** — `predict` / `hindcast` take no `config`.
+
+**Resolution process (two steps):**
+1. **Sandro enumerates** what the model puts in `config` at train time — hyperparameters, the **quantile levels** it emits, trajectory/sample count, validation-split date, early-stopping criteria, etc. (Not forecast horizon — the model owns that, decision 1.15. Not seeds — `rng` is injected.)
+2. **We partition** each field into **model-private** (opaque to FI/SAP3) vs **operationally-shared** (FI/SAP3 needs to read or set it). The likely shared candidate is the **quantile levels** (SAP3 may need specific quantiles for danger thresholds).
+
+**Interim typing:** `config` stays `Any` until the partition is known; the expected end state is an opaque `dict[str, Any]` for model-private params (mirroring SAP3's `ModelParams`), with any operationally-shared fields lifted into a typed slot (see decision 1.9).
 
 ### Q7 — SnowMapper lead times — ANSWERED (with caveat)
 
@@ -287,10 +293,16 @@ What does the model need in `config` at `train` and `predict` time, beyond `inpu
 
 **Reflected in:** `docs/input_requirement.md`, decision 1.6.
 
-### Q9 — Per-product availability lag — OPEN
+### Q9 — Per-product availability lag — RESOLVED (mechanism); values owed
+
+Products derived downstream (e.g. SnowMapper SWE / ROF, which run *after* their driving ECMWF forecast) become available **later** than their nominal forcing: at issue time T their future-known series reaches fewer steps ahead than the ECMWF series driving them.
+
+**Mechanism:** represent the lag as **reduced per-variable `future_steps`**: the lagging product declares fewer future steps than its driving forcing (e.g. ECMWF precip `future_steps=15` vs SnowMapper SWE `future_steps=13`), with `max_nan` absorbing any residual ragged tail. There is **no dedicated `availability_lag` field**: it would be speculative complexity, because the existing per-variable knobs already express the shorter-future-coverage case. If a genuine *leading-gap / offset* case ever appears that `future_steps` can't express, add the field then (additive, non-breaking).
+
+**Still owed (Sandro / data availability):** the concrete step counts — how many fewer future steps SnowMapper SWE / ROF actually cover vs. the ECMWF horizon.
 
-Products derived downstream (e.g. SnowMapper SWE / RoF, which run *after* their driving ECMWF forecast) may become available **later** than their nominal forcing — their future-known series lags the issue time. `InputRequirement`'s variable properties (`lookback`, `future_steps`, `max_nan`, `ensemble_mode`) have **no explicit lag / offset** field. Decide whether to **(a)** add a per-variable `availability_lag` (in steps), or **(b)** absorb it via `max_nan` / a shorter `future_steps`. Needs modeller + data-availability input.
+### Q10 — Station-string identity — RESOLVED (station code)
 
-### Q10 — Station-string identity — OPEN (coordination)
+**Decision:** the station key is the **station / gauge code** (the deployment-stable human / network identifier), **not** a per-DB UUID. The UUID stays internal to SAP3; its adapter maps `StationId` (UUID) ↔ code (via `StationConfig.code`) at the boundary. Chosen for portability — the artifact embeds these strings, and they must survive staging → prod and east → west transfer, where UUIDs are not stable across databases but codes are.
 
-What exact string do your trained artifacts store as the station key — the **human / network station code**, or the **UUID string**? FI station keys are opaque `str` (decision 1.10), but they must match what the artifact embeds and be **stable across deployments** (staging → prod, east → west). Deployment portability argues for the code; the SAP3 adapter then maps `StationId` (UUID) ↔ that string. Confirm so the artifact and the adapter agree.
+**Residual (Sandro):** confirm his already-trained artifacts key on the station code (and re-key if they currently use a UUID / internal id).
diff --git a/forecast_interface/__init__.py b/forecast_interface/__init__.py
index 4d047d6..f303f40 100644
--- a/forecast_interface/__init__.py
+++ b/forecast_interface/__init__.py
@@ -1,4 +1,4 @@
-__version__ = "0.1.16"
+__version__ = "0.1.17"
 
 from .common import AggregationMethod
 from .input import (
diff --git a/pyproject.toml b/pyproject.toml
index 0731393..e2e9cb5 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -1,6 +1,6 @@
 [project]
 name = "forecastinterface"
-version = "0.1.16"
+version = "0.1.17"
 description = "Add your description here"
 readme = "README.md"
 requires-python = ">=3.11"
@@ -25,7 +25,7 @@ init_typed = true
 warn_required_dynamic_aliases = true
 
 [tool.bumpversion]
-current_version = "0.1.16"
+current_version = "0.1.17"
 commit = false
 tag = false
 allow_dirty = true
diff --git a/uv.lock b/uv.lock
index b6a99d2..c53dd3d 100644
--- a/uv.lock
+++ b/uv.lock
@@ -129,7 +129,7 @@ wheels = [
 
 [[package]]
 name = "forecastinterface"
-version = "0.1.16"
+version = "0.1.17"
 source = { virtual = "." }
 dependencies = [
     { name = "polars" },