Skip to content

Feature/model interface#2

Merged
sandrohuni merged 16 commits into
mainfrom
feature/model-interface
Jun 23, 2026
Merged

Feature/model interface#2
sandrohuni merged 16 commits into
mainfrom
feature/model-interface

Conversation

@mabesa

@mabesa mabesa commented Jun 17, 2026

Copy link
Copy Markdown
Contributor

No description provided.

mabesa and others added 16 commits June 13, 2026 08:01
…FI↔SAP3 mapping

Settle the open design questions where SAP3 behaviour already determines the
answer, and route the rest to the model developer as a decision-ready list.

- open_design_questions.md: split into resolved decisions (station-keyed output
  per Option a, target declaration, state-free, spatial vocab → SpatialRepresentation,
  quantile floor) and open developer questions. Correct two stale premises:
  forecast_horizon IS consumed by the adapter; the issue_datetime column is renamed
  to valid_time, not dropped.
- fi-sap3-mapping.md (new): authoritative FI↔SAP3 adapter-boundary contract —
  governance, the two adapter paths, output/input mapping tables, station identity,
  state bridge, and artifact-metadata ownership split.
- model_interface.md: replace the "for later integration" stub with the Training &
  Lifecycle Protocol target spec (unified ForecastModel, ArtifactScope, train/retrain/
  serialize/deserialize, rng injection, TrainedArtifact, station-keyed ModelOutput).
- nepal-model-requirements.md: record developer answers — eastern group ships first
  (GROUP-scoped), SnowMapper starts with SWE + ROF, artifact transfer is east→west.

Also wire up the mandated bump-my-version workflow (was unconfigured) and add
__version__ to the package. Patch bump 0.1.1 → 0.1.2.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…d target declaration

Phase 1 of the SAP3-alignment work — the settled input-contract changes.

- Rename SpatialResolution → SpatialRepresentation, adopting SAP3's exact member
  names/values (POINT, BASIN_AVERAGE, ELEVATION_BAND, GRIDDED) so the adapter mapping
  is identity. Banded SnowMapper forcing (SWE, ROF) is declared at ELEVATION_BAND.
- Add explicit target declaration: new input/target.py with OutputRepresentation
  (DETERMINISTIC/QUANTILES/TRAJECTORIES) and TargetSpec(unit, representations);
  InputRequirement gains a required, non-empty `targets: dict[str, TargetSpec]`.
  Targets are declared independently of inputs (accommodates Q2 — pure-simulation
  models that lack the target's own history).
- Update tests (117 passing), docs/input_requirement.md, README.md, and the
  fi-sap3-mapping.md spatial/target rows (mapping is now identity).

BREAKING: SpatialResolution renamed and InputRequirement now requires `targets`.
Acceptable pre-1.0 with no external conformers. Patch bump 0.1.2 → 0.1.3.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Phase 2 — realize the GROUP-path per-station decomposition (Option a), required
for the eastern-group artifact that ships first.

- ModelOutput.variables is now dict[str, dict[str, VariableOutput]]
  (station_id → variable_name). Single-station models return a one-key dict.
- Validators: at least one station; each station's variable map non-empty;
  station-id and variable-name keys must be non-empty strings.
- success now spans all variables across all stations.
- Missing stations must be explicit FAILURE entries, never absent keys — the
  model echoes back every station id it was given.
- Station ids are opaque strings (PROVISIONAL Q1); the SAP3 adapter maps them
  to/from typed StationId.

Updates tests (125 passing), README, and the fi-sap3-mapping output/GROUP-path
sections. BREAKING: ModelOutput.variables shape changed. Patch bump 0.1.3 → 0.1.4.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…rng, artifact, scope)

Phase 3 — grow ForecastModel from a predict-only protocol into the full
model-author contract, mirroring SAP3's mechanics while keeping FI's ModelResult
→ ModelOutput as the authoritative return.

- ArtifactScope enum (STATION / GROUP; national-group is a GROUP; SAP3's internal
  VIRTUAL scope intentionally omitted).
- TrainedArtifact: opaque, runtime_checkable marker Protocol — the self-contained,
  deployment-portable artifact boundary. Rich provenance metadata + group embedding-key
  contract deferred to Phase 4 (nepal §4/§8).
- ForecastModel now requires: input_requirement, artifact_scope, train, predict,
  hindcast, serialize_artifact, deserialize_artifact. predict/hindcast take the
  artifact positionally and an injected rng (random.Random) for determinism.
- RetrainableModel(ForecastModel) adds optional warm-start retrain; SAP3 checks
  isinstance to detect support. Cold train remains the required baseline.
- inputs/config typed Any (PROVISIONAL) pending the SAP3→FI input-types PR (doc 014).
- Reconcile model_interface.md with the implementation; restore the targets mention
  on input_requirement. Tests: 131 passing.

BREAKING: predict/hindcast signatures changed; protocol surface expanded.
Patch bump 0.1.4 → 0.1.5.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…floor, input & station-identity decisions

Grilling session outcomes recorded in model_interface.md and
open_design_questions.md:

- Warm-up/state: state-free v0; warm-up period = lookback; persisted
  state deferred as additive StatefulModel extension (1.3, Q4)
- Failure: structured ModelResult, not raise; whole-run vs per-station
  rule (1.7)
- Hindcast: demoted to optional BatchHindcastModel; SAP3 loops predict
  otherwise (1.8, Q6 scoped)
- Output floors: >=3 quantiles / >=8 trajectories structural; SAP3
  operational floors checked loud at integration; deterministic valid
  but non-operational (1.5, Q3)
- Input bundle: FI-owned ModelInputs isomorphic to InputRequirement;
  v1 scope = daily/hourly, POINT/BASIN_AVERAGE/ELEVATION_BAND, single
  product; GRIDDED + multi-product deferred (1.9); config open (Q8)
- Station identity: opaque-but-meaningful str keys stored in artifact;
  groups from v1; 1:1 in/out; embedding-key contract pulled to v1 (1.10, Q1)
- VariableMetadata: drop name; keep forecast_horizon + per-issue-block
  validator; offset semantics fixed (Q5)
- SnowMapper SWE/RoF at BASIN_AVERAGE or ELEVATION_BAND; availability
  lag flagged as new open item (Q7, Q9)

Code TODOs noted, not yet applied (validators, ModelInputs type,
protocol split). Authority rule refined: FI may express not-yet-
operational models provided the gap is never silent.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…binability gaps (SAP3 gap-hunt round 2)

Second grilling session — gaps surfaced by re-checking SAPPHIRE_flow:

- Parameter identity (1.11): canonical vocabulary synced with SAP3
  ParameterDefinition; optional per-variable aggregation (SUM/MEAN);
  units first-class on inputs (declared + delivered-or-rejected),
  Unit enum to expand (PERCENT, M_PER_S, DEGREE, W_PER_M2, MM_PER_HOUR)
- Time step (1.12): timedelta, not TemporalResolution enum — driven by
  3h/6h v1 targets; drop resolution from VariableMetadata; calendar
  resolutions out of v1 scope
- max_nan (1.13): SAP3 enforces as pre-predict gate; breach => FAILURE/
  DATA_AVAILABILITY without calling model; within-tolerance delivered as-is
- Combinability (1.14): derived from TRAJECTORIES representation; no FI
  combination machinery; SAP3 owns POOLED/BMA, remapping, VIRTUAL scope
- Q7 answered (SnowMapper SWE/RoF at BASIN_AVERAGE or ELEVATION_BAND,
  ECMWF lead times); Q9 opened (per-product availability lag)

Rejected as orchestrator-only (kept out of FI): forcing provenance /
staleness / warm-up / input-quality passed to model, forcing_type,
forecast lifecycle/version, combination mechanism.

Code TODOs accumulated across 1.5/1.8/1.9/1.11/1.12 + Q5; not yet applied.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…emory

Add .claude/skills/grill-me/SKILL.md so the design-interview skill is
shared in-repo. Ignore .claude/settings.local.json and .claude/projects/
(machine-local settings and agent session memory).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- _make_metadata: drop the avoidable arg-type ignore via model_validate
- annotate the two unavoidable pydantic @computed_field prop-decorator
  ignores (success/trusted) with a one-line reason, per the B ignore policy

The only remaining ignores are the two documented pydantic/mypy
computed_field false positives; everything else is ignore-free under strict.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- README: reshape into a gateway — drop the duplicated ModelOutput /
  InputRequirement schemas (authoritative in docs/), keep intro, doc
  links, and the usage example
- delete TODO.md (obsolete two-line stub)
- open_design_questions: clear six done "Code TODO" markers (A1-A3
  implemented them); mark Q6 ANSWERED (scoped by decision 1.8)
- model_interface: fix self-contradiction — embedding-key/station-set
  contract is v1 load-bearing, only rich provenance metadata is Phase 4
- nepal-model-requirements: replace stale spatial vocab (lumped/HRU)
  with POINT/BASIN_AVERAGE/ELEVATION_BAND/GRIDDED

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This repo is the model-author contract; trim docs that belong elsewhere:
- delete docs/nepal-model-requirements.md — one deployment's requirements,
  now captured in resolved decisions 1.6/1.10 and the implemented code
- delete docs/fi-sap3-mapping.md — adapter-boundary mapping that belongs
  with the adapter in SAPPHIRE_flow, not the model-author contract

Fixed all inbound references in model_interface.md, open_design_questions.md,
and README.md. Remaining docs: model_interface (protocol/output),
input_requirement (inputs), open_design_questions (decisions + open Qs).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Restructure section 2 into the single place for what we need from the
model developer:
- "Still open" callout — Q8 (config), Q9 (lag), Q10 (station-string)
- "Deviations from the original proposal — please confirm" — the changes
  from the init proposal (lifecycle ownership, forecast→predict, station
  keying, spatial vocab, ensemble enum, failure channel, hindcast optional,
  timedelta) each with its decision ref
- add Q10 (station-string identity) as a first-class open coordination item

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- Q9 (availability lag): RESOLVED mechanism — reduced per-variable
  future_steps (+ max_nan), no dedicated availability_lag field;
  concrete step-counts still owed by Sandro/data
- Q10 (station-string identity): RESOLVED — station/gauge code (not
  per-DB UUID); UUID stays SAP3-internal, adapter maps code<->UUID;
  Sandro to confirm artifacts key on the code (decision 1.10 updated)
- Q8 (config): sharpened to the two-step process — Sandro enumerates,
  then we partition model-private vs operationally-shared (likely the
  quantile levels); train/retrain-only scoping fixed
- refreshed the "Still open" callout to current state

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@mabesa mabesa requested a review from sandrohuni June 17, 2026 12:41
Comment thread docs/model_interface.md

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need >= 8 trajectories - for example deep ensembles often only consist of 5 members.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know about deep ensembles ans was thinking about conventional ensemble forecasts. We can reduce to 5 trajectories if this makes sense for deep ensembles.

@sandrohuni sandrohuni merged commit 7de9dd2 into main Jun 23, 2026
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants