Feature/model interface#2
Merged
Merged
Conversation
…FI↔SAP3 mapping Settle the open design questions where SAP3 behaviour already determines the answer, and route the rest to the model developer as a decision-ready list. - open_design_questions.md: split into resolved decisions (station-keyed output per Option a, target declaration, state-free, spatial vocab → SpatialRepresentation, quantile floor) and open developer questions. Correct two stale premises: forecast_horizon IS consumed by the adapter; the issue_datetime column is renamed to valid_time, not dropped. - fi-sap3-mapping.md (new): authoritative FI↔SAP3 adapter-boundary contract — governance, the two adapter paths, output/input mapping tables, station identity, state bridge, and artifact-metadata ownership split. - model_interface.md: replace the "for later integration" stub with the Training & Lifecycle Protocol target spec (unified ForecastModel, ArtifactScope, train/retrain/ serialize/deserialize, rng injection, TrainedArtifact, station-keyed ModelOutput). - nepal-model-requirements.md: record developer answers — eastern group ships first (GROUP-scoped), SnowMapper starts with SWE + ROF, artifact transfer is east→west. Also wire up the mandated bump-my-version workflow (was unconfigured) and add __version__ to the package. Patch bump 0.1.1 → 0.1.2. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…d target declaration Phase 1 of the SAP3-alignment work — the settled input-contract changes. - Rename SpatialResolution → SpatialRepresentation, adopting SAP3's exact member names/values (POINT, BASIN_AVERAGE, ELEVATION_BAND, GRIDDED) so the adapter mapping is identity. Banded SnowMapper forcing (SWE, ROF) is declared at ELEVATION_BAND. - Add explicit target declaration: new input/target.py with OutputRepresentation (DETERMINISTIC/QUANTILES/TRAJECTORIES) and TargetSpec(unit, representations); InputRequirement gains a required, non-empty `targets: dict[str, TargetSpec]`. Targets are declared independently of inputs (accommodates Q2 — pure-simulation models that lack the target's own history). - Update tests (117 passing), docs/input_requirement.md, README.md, and the fi-sap3-mapping.md spatial/target rows (mapping is now identity). BREAKING: SpatialResolution renamed and InputRequirement now requires `targets`. Acceptable pre-1.0 with no external conformers. Patch bump 0.1.2 → 0.1.3. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Phase 2 — realize the GROUP-path per-station decomposition (Option a), required for the eastern-group artifact that ships first. - ModelOutput.variables is now dict[str, dict[str, VariableOutput]] (station_id → variable_name). Single-station models return a one-key dict. - Validators: at least one station; each station's variable map non-empty; station-id and variable-name keys must be non-empty strings. - success now spans all variables across all stations. - Missing stations must be explicit FAILURE entries, never absent keys — the model echoes back every station id it was given. - Station ids are opaque strings (PROVISIONAL Q1); the SAP3 adapter maps them to/from typed StationId. Updates tests (125 passing), README, and the fi-sap3-mapping output/GROUP-path sections. BREAKING: ModelOutput.variables shape changed. Patch bump 0.1.3 → 0.1.4. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…rng, artifact, scope) Phase 3 — grow ForecastModel from a predict-only protocol into the full model-author contract, mirroring SAP3's mechanics while keeping FI's ModelResult → ModelOutput as the authoritative return. - ArtifactScope enum (STATION / GROUP; national-group is a GROUP; SAP3's internal VIRTUAL scope intentionally omitted). - TrainedArtifact: opaque, runtime_checkable marker Protocol — the self-contained, deployment-portable artifact boundary. Rich provenance metadata + group embedding-key contract deferred to Phase 4 (nepal §4/§8). - ForecastModel now requires: input_requirement, artifact_scope, train, predict, hindcast, serialize_artifact, deserialize_artifact. predict/hindcast take the artifact positionally and an injected rng (random.Random) for determinism. - RetrainableModel(ForecastModel) adds optional warm-start retrain; SAP3 checks isinstance to detect support. Cold train remains the required baseline. - inputs/config typed Any (PROVISIONAL) pending the SAP3→FI input-types PR (doc 014). - Reconcile model_interface.md with the implementation; restore the targets mention on input_requirement. Tests: 131 passing. BREAKING: predict/hindcast signatures changed; protocol surface expanded. Patch bump 0.1.4 → 0.1.5. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…floor, input & station-identity decisions Grilling session outcomes recorded in model_interface.md and open_design_questions.md: - Warm-up/state: state-free v0; warm-up period = lookback; persisted state deferred as additive StatefulModel extension (1.3, Q4) - Failure: structured ModelResult, not raise; whole-run vs per-station rule (1.7) - Hindcast: demoted to optional BatchHindcastModel; SAP3 loops predict otherwise (1.8, Q6 scoped) - Output floors: >=3 quantiles / >=8 trajectories structural; SAP3 operational floors checked loud at integration; deterministic valid but non-operational (1.5, Q3) - Input bundle: FI-owned ModelInputs isomorphic to InputRequirement; v1 scope = daily/hourly, POINT/BASIN_AVERAGE/ELEVATION_BAND, single product; GRIDDED + multi-product deferred (1.9); config open (Q8) - Station identity: opaque-but-meaningful str keys stored in artifact; groups from v1; 1:1 in/out; embedding-key contract pulled to v1 (1.10, Q1) - VariableMetadata: drop name; keep forecast_horizon + per-issue-block validator; offset semantics fixed (Q5) - SnowMapper SWE/RoF at BASIN_AVERAGE or ELEVATION_BAND; availability lag flagged as new open item (Q7, Q9) Code TODOs noted, not yet applied (validators, ModelInputs type, protocol split). Authority rule refined: FI may express not-yet- operational models provided the gap is never silent. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…binability gaps (SAP3 gap-hunt round 2) Second grilling session — gaps surfaced by re-checking SAPPHIRE_flow: - Parameter identity (1.11): canonical vocabulary synced with SAP3 ParameterDefinition; optional per-variable aggregation (SUM/MEAN); units first-class on inputs (declared + delivered-or-rejected), Unit enum to expand (PERCENT, M_PER_S, DEGREE, W_PER_M2, MM_PER_HOUR) - Time step (1.12): timedelta, not TemporalResolution enum — driven by 3h/6h v1 targets; drop resolution from VariableMetadata; calendar resolutions out of v1 scope - max_nan (1.13): SAP3 enforces as pre-predict gate; breach => FAILURE/ DATA_AVAILABILITY without calling model; within-tolerance delivered as-is - Combinability (1.14): derived from TRAJECTORIES representation; no FI combination machinery; SAP3 owns POOLED/BMA, remapping, VIRTUAL scope - Q7 answered (SnowMapper SWE/RoF at BASIN_AVERAGE or ELEVATION_BAND, ECMWF lead times); Q9 opened (per-product availability lag) Rejected as orchestrator-only (kept out of FI): forcing provenance / staleness / warm-up / input-quality passed to model, forcing_type, forecast lifecycle/version, combination mechanism. Code TODOs accumulated across 1.5/1.8/1.9/1.11/1.12 + Q5; not yet applied. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…emory Add .claude/skills/grill-me/SKILL.md so the design-interview skill is shared in-repo. Ignore .claude/settings.local.json and .claude/projects/ (machine-local settings and agent session memory). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- _make_metadata: drop the avoidable arg-type ignore via model_validate - annotate the two unavoidable pydantic @computed_field prop-decorator ignores (success/trusted) with a one-line reason, per the B ignore policy The only remaining ignores are the two documented pydantic/mypy computed_field false positives; everything else is ignore-free under strict. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- README: reshape into a gateway — drop the duplicated ModelOutput / InputRequirement schemas (authoritative in docs/), keep intro, doc links, and the usage example - delete TODO.md (obsolete two-line stub) - open_design_questions: clear six done "Code TODO" markers (A1-A3 implemented them); mark Q6 ANSWERED (scoped by decision 1.8) - model_interface: fix self-contradiction — embedding-key/station-set contract is v1 load-bearing, only rich provenance metadata is Phase 4 - nepal-model-requirements: replace stale spatial vocab (lumped/HRU) with POINT/BASIN_AVERAGE/ELEVATION_BAND/GRIDDED Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This repo is the model-author contract; trim docs that belong elsewhere: - delete docs/nepal-model-requirements.md — one deployment's requirements, now captured in resolved decisions 1.6/1.10 and the implemented code - delete docs/fi-sap3-mapping.md — adapter-boundary mapping that belongs with the adapter in SAPPHIRE_flow, not the model-author contract Fixed all inbound references in model_interface.md, open_design_questions.md, and README.md. Remaining docs: model_interface (protocol/output), input_requirement (inputs), open_design_questions (decisions + open Qs). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Restructure section 2 into the single place for what we need from the model developer: - "Still open" callout — Q8 (config), Q9 (lag), Q10 (station-string) - "Deviations from the original proposal — please confirm" — the changes from the init proposal (lifecycle ownership, forecast→predict, station keying, spatial vocab, ensemble enum, failure channel, hindcast optional, timedelta) each with its decision ref - add Q10 (station-string identity) as a first-class open coordination item Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- Q9 (availability lag): RESOLVED mechanism — reduced per-variable future_steps (+ max_nan), no dedicated availability_lag field; concrete step-counts still owed by Sandro/data - Q10 (station-string identity): RESOLVED — station/gauge code (not per-DB UUID); UUID stays SAP3-internal, adapter maps code<->UUID; Sandro to confirm artifacts key on the code (decision 1.10 updated) - Q8 (config): sharpened to the two-step process — Sandro enumerates, then we partition model-private vs operationally-shared (likely the quantile levels); train/retrain-only scoping fixed - refreshed the "Still open" callout to current state Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
sandrohuni
reviewed
Jun 22, 2026
Collaborator
There was a problem hiding this comment.
Why do we need >= 8 trajectories - for example deep ensembles often only consist of 5 members.
Contributor
Author
There was a problem hiding this comment.
I don't know about deep ensembles ans was thinking about conventional ensemble forecasts. We can reduce to 5 trajectories if this makes sense for deep ensembles.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
No description provided.