Retrieve station observations from MeteoSwiss DWH #183
Validate the DWH prerequisites (binary on PATH, OPR_HOME set, conf file readable) at workflow-build time so a misconfigured environment aborts at launch instead of hours into the run, and again at loader entry for the authoritative job environment. Errors aggregate all problems at once.
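A minimal sketch of the "aggregate all problems at once" behaviour described above. The signature, the `PrerequisiteError` class, and the `conf_path` / `env` parameters are assumptions for illustration; only the name `check_prerequisites` and the three checks (binary on PATH, OPR_HOME set, conf file readable) come from this PR.

```python
import os
import shutil
from pathlib import Path


class PrerequisiteError(RuntimeError):
    """Hypothetical error type: raised once, listing every unmet prerequisite."""


def check_prerequisites(binary="jretrievedwh.py", conf_path=None, env=None):
    """Collect all problems first, then fail with a single combined message."""
    env = os.environ if env is None else env
    problems = []

    # 1. The client script must be discoverable on PATH.
    if shutil.which(binary, path=env.get("PATH", "")) is None:
        problems.append(f"{binary} not found on PATH")

    # 2. OPR_HOME must be set to a non-empty value.
    if not env.get("OPR_HOME"):
        problems.append("OPR_HOME is not set")

    # 3. The conf file (if one is expected) must exist and be readable.
    if conf_path is not None:
        conf = Path(conf_path)
        if not (conf.is_file() and os.access(conf, os.R_OK)):
            problems.append(f"conf file not readable: {conf}")

    if problems:
        details = "\n".join(f"  - {problem}" for problem in problems)
        raise PrerequisiteError(f"DWH prerequisites not met:\n{details}")
```

Because the function only reads the environment, the same call can presumably run both at workflow-build time and at loader entry.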
75c6055 to ba88587
The loader calls check_prerequisites(), which probes for jretrievedwh.py on $PATH and $OPR_HOME. GitHub CI has neither, so the test failed there while passing locally. Mock it like the other DWH calls so the test is environment independent.
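The mocking approach can be sketched as follows. The `dwh` namespace is a hypothetical stand-in for the real loader module, and the returned dict stands in for the real dataset; in the actual test the patch target would be wherever the loader looks up `check_prerequisites`.

```python
import types
from unittest import mock


def _check_prerequisites():
    # Simulates an environment such as CI, where the DWH client is unavailable.
    raise RuntimeError("jretrievedwh.py not found on PATH; OPR_HOME is not set")


def _load_obs_data_from_jretrieve():
    # The loader validates its environment before doing any work.
    dwh.check_prerequisites()
    return {"source": "jretrieve"}


# Hypothetical stand-in for the module that holds the loader.
dwh = types.SimpleNamespace(
    check_prerequisites=_check_prerequisites,
    load_obs_data_from_jretrieve=_load_obs_data_from_jretrieve,
)


def test_loader_is_environment_independent():
    # Replace the prerequisite probe for the duration of the test only.
    with mock.patch.object(dwh, "check_prerequisites") as fake_check:
        result = dwh.load_obs_data_from_jretrieve()
    fake_check.assert_called_once_with()
    assert result == {"source": "jretrieve"}
```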
5d5f8ce to 065a5d9
dnerini left a comment
Fantastic contribution, many thanks for this! This is only a first look, but the PR already looks quite mature. I have a few very minor initial comments. I'm also testing it myself, and so far everything is running very smoothly.
One question: how hard would it be to add another importer for obs later on, maybe something like load_obs_data_from_ogd? Would the current code structure allow for that?
| root, reftime: datetime, steps: list[int], params: list[str], freq: str = "1h" | ||
| ) -> xr.Dataset: | ||
| """Load PeakWeather station observations into an xarray Dataset. | ||
| DWH_PARAM_MAP = { |
The global DWH_ variables are used only by load_obs_data_from_jretrieve; I suggest moving them to the top of that function.
| # root: jretrievedwh:bbox=45.8,47.8,5.9,10.5 # minlat,maxlat,minlon,maxlon | ||
| ``` | ||
| **Prerequisites:** `jretrievedwh.py` must be on `$PATH` (falls back to |
Would it be possible to package this as a Python package that we add as a project dependency?
@clairemerker I think I lack a bit of insight on what this does and how much effort this would be. I agree though that the current solution is 'hacky'
Hm... It's part of this repo: https://service.meteoswiss.ch/git/databrokerandcustomerinterfaces/jretrieve
So if anything, we would need to get it from there? I wouldn't create our own package for it...
| export JRETRIEVE_CLIENT_SECRET=your-client-secret | ||
| ``` | ||
| or place them in a `.env` file at the project root (next to `.jretrievedwh-conf.prod.py`): |
Would it help to include a .env.template?
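For illustration, a rough sketch of what a template plus a small loader could look like. This is not necessarily how the PR reads the `.env` file; `parse_env_text`, `apply_env`, and `ENV_TEMPLATE` are hypothetical names, and only `JRETRIEVE_CLIENT_SECRET` is taken from the docs snippet above.

```python
import os

# Hypothetical contents of a .env.template: placeholders only, no real secrets.
ENV_TEMPLATE = """\
# Copy to .env and fill in; never commit the real file.
JRETRIEVE_CLIENT_SECRET=your-client-secret
"""


def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blanks, comments and a leading 'export'."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("export "):
            line = line[len("export "):].lstrip()
        key, sep, value = line.partition("=")
        if not sep or not key.strip():
            continue  # not a KEY=VALUE line
        values[key.strip()] = value.strip().strip("'\"")
    return values


def apply_env(values, environ=None):
    """Fill in missing variables; values already exported in the shell win."""
    environ = os.environ if environ is None else environ
    for key, value in values.items():
        environ.setdefault(key, value)
    return environ
```

Letting already-exported variables take precedence keeps the `export ...` route and the `.env` route compatible with each other.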
I would say this should work quite seamlessly, as long as the output can be wrangled into the same form as what we get from jretrieve. One issue I see with OGD is that station data is currently accessed by station rather than by time, so it is probably not very useful for our use case, which makes repeated requests per initialization time.
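One way an additional importer could slot in, sketched under assumptions: dispatch on the prefix of `root` (as in `jretrievedwh:bbox=...` from the docs snippet). The registry, the `load_obs` entry point, the `ogd` scheme, and the placeholder return values are all hypothetical; the real loaders return an xr.Dataset.

```python
from typing import Callable

# Hypothetical registry mapping a `root` prefix to its loader function.
LOADERS: dict[str, Callable[..., dict]] = {}


def register_loader(scheme):
    """Decorator that registers a loader under the given root prefix."""
    def decorator(func):
        LOADERS[scheme] = func
        return func
    return decorator


@register_loader("jretrievedwh")
def load_obs_data_from_jretrieve(options, **kwargs):
    # Placeholder body: the real function queries the DWH.
    return {"source": "jretrievedwh", "options": options}


@register_loader("ogd")
def load_obs_data_from_ogd(options, **kwargs):
    # Placeholder body for a possible future OGD importer.
    return {"source": "ogd", "options": options}


def load_obs(root, **kwargs):
    """Split 'scheme:options' and call the loader registered for the scheme."""
    scheme, _, options = root.partition(":")
    try:
        loader = LOADERS[scheme]
    except KeyError:
        known = ", ".join(sorted(LOADERS))
        raise ValueError(f"unknown obs source {scheme!r}; known: {known}") from None
    return loader(options, **kwargs)
```

As long as every registered loader returns data in the same shape, the downstream mapping and verification steps would not need to know which source was used.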
AI-assisted implementation!
Adds a new verification truth source that retrieves SwissMetNet (SMN) surface observations live from the MeteoSwiss data warehouse (DWH) via the jretrievedwh.py client. This lets us verify model forecasts against station observations pulled on demand.
The implementation mirrors the approach already used in anemoi-plugins-meteoswiss add-synop-dwh-source, and produces an xr.Dataset in the exact same shape as load_obs_data_from_peakweather, so the downstream spatial mapping (map_forecast_to_truth) and verify() work unchanged.
Improvements:
_select_valid_times
Summary of changes
Examples
file:///M:/zue-prod/fc_development/seamless/S-RUC/evaluation/MRB-899_dashboard.html