Retrieve station observations from MeteoSwiss DWH#183

Open
jonasbhend wants to merge 22 commits into main from feat/jretrieve
@jonasbhend commented Jun 12, 2026

Contributor

AI-assisted implementation!

Adds a new verification truth source that retrieves SwissMetNet (SMN) surface observations live from the MeteoSwiss data warehouse (DWH) via the jretrievedwh.py client. This lets us verify model forecasts against station observations pulled on demand.

The implementation mirrors the approach already used in anemoi-plugins-meteoswiss add-synop-dwh-source, and produces an xr.Dataset in the exact same shape as load_obs_data_from_peakweather, so the downstream spatial mapping (map_forecast_to_truth) and verify() work unchanged.

Improvements:

  • check all valid times are present in _select_valid_times
  • fail fast when access is not available
  • only retrieve necessary data (increment is hardcoded to 60 min)
  • use varda's own jretrieve credentials
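The valid-time check could look roughly like the sketch below. It assumes lead times (`steps`) are given in hours, in line with the 60 min increment; `expected_valid_times` and `check_valid_times_present` are hypothetical names for illustration and not the actual `_select_valid_times` code.

```python
from datetime import datetime, timedelta


def expected_valid_times(reftime: datetime, steps: list[int]) -> list[datetime]:
    """Valid times implied by a reference time and lead times (assumed hours)."""
    return [reftime + timedelta(hours=step) for step in steps]


def check_valid_times_present(
    available: list[datetime], reftime: datetime, steps: list[int]
) -> None:
    """Raise if any expected valid time is absent from the retrieved data."""
    missing = sorted(set(expected_valid_times(reftime, steps)) - set(available))
    if missing:
        raise ValueError(
            f"Missing {len(missing)} valid time(s), "
            f"first: {missing[0]:%Y-%m-%d %H:%M}"
        )
```

Raising instead of silently returning a shorter time axis keeps gaps in the observations from going unnoticed in the verification scores.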

Summary of changes

  • wire-up data retrieval with jretrievedwh.py client
  • provide authorization with specific credentials via .jretrievedwh-conf.prod.py
  • add jretrieve markers in config for different retrieval options, including:
    • by measurement category group ids (default)
    • by station shortname
    • by bounding box (lon/lat)
  • use DWH metadata to infer station locations for meteogram plotting (e.g. with analysis as truth)
  • remove peakweather from dependencies and as a data source
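To illustrate the config markers, here is a minimal parsing sketch based only on the bounding-box form documented in the README (`jretrievedwh:bbox=minlat,maxlat,minlon,maxlon`). `parse_marker`, `parse_bbox`, and `BBox` are hypothetical names, and the handling of markers without a `key=value` option is an assumption rather than the behaviour of this PR.

```python
from dataclasses import dataclass

PREFIX = "jretrievedwh:"


@dataclass(frozen=True)
class BBox:
    minlat: float
    maxlat: float
    minlon: float
    maxlon: float


def parse_marker(root: str) -> tuple[str, str]:
    """Split 'jretrievedwh:key=value' into (key, value)."""
    if not root.startswith(PREFIX):
        raise ValueError(f"not a jretrievedwh marker: {root!r}")
    key, sep, value = root[len(PREFIX):].partition("=")
    if not sep or not key.strip() or not value.strip():
        # Assumption: this sketch only handles the explicit key=value form.
        raise ValueError(f"expected 'jretrievedwh:key=value', got {root!r}")
    return key.strip(), value.strip()


def parse_bbox(value: str) -> BBox:
    """Parse 'minlat,maxlat,minlon,maxlon' and check the ordering."""
    parts = [float(p) for p in value.split(",")]
    if len(parts) != 4:
        raise ValueError(f"bbox needs 4 comma-separated numbers, got {value!r}")
    box = BBox(*parts)
    if box.minlat > box.maxlat or box.minlon > box.maxlon:
        raise ValueError(f"bbox bounds are out of order: {value!r}")
    return box
```

For example, `parse_marker("jretrievedwh:bbox=45.8,47.8,5.9,10.5")` yields the key `"bbox"` and a value string that `parse_bbox` turns into a `BBox`.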

Examples

[Image: 202503010000_T_2M_BER]

file:///M:/zue-prod/fc_development/seamless/S-RUC/evaluation/MRB-899_dashboard.html

@dnerini left a comment

Member

Fantastic contribution, many thanks for this! This is only a first look, but the PR already looks quite mature. I have a few very minor initial comments. I'm also testing it myself, and so far everything is running very smoothly.

I have perhaps one question: how hard would it be to add another importer for obs later on? Maybe something like load_obs_data_from_ogd? Would the current code structure allow for that?

root, reftime: datetime, steps: list[int], params: list[str], freq: str = "1h"
) -> xr.Dataset:
"""Load PeakWeather station observations into an xarray Dataset.
DWH_PARAM_MAP = {

Member

The global DWH_ variables are used only by load_obs_data_from_jretrieve; I suggest moving them to the top of that function.

Comment thread README.md
# root: jretrievedwh:bbox=45.8,47.8,5.9,10.5 # minlat,maxlat,minlon,maxlon
```

**Prerequisites:** `jretrievedwh.py` must be on `$PATH` (falls back to

Member

Would it be possible to package this as a Python package that we add as a project dependency?

Contributor Author

@clairemerker I think I lack a bit of insight into what this does and how much effort it would be. I agree, though, that the current solution is 'hacky'.

Contributor

Hm... It's part of this repo: https://service.meteoswiss.ch/git/databrokerandcustomerinterfaces/jretrieve
So if anything, we would need to get it from there? I wouldn't create our own package for it...

Comment thread README.md
export JRETRIEVE_CLIENT_SECRET=your-client-secret
```

or place them in a `.env` file at the project root (next to `.jretrievedwh-conf.prod.py`):

Member

Would it help to include a .env.template?

@jonasbhend

Contributor Author

> Fantastic contribution, many thanks for this! This is only a first look, but the PR already looks quite mature. I have a few very minor initial comments. I'm also testing it myself, and so far everything is running very smoothly.
>
> I have perhaps one question: how hard would it be to add another importer for obs later on? Maybe something like load_obs_data_from_ogd? Would the current code structure allow for that?

I would say this should work quite seamlessly, as long as the output can be wrangled into the same form as what we get from jretrieve. One issue I see with OGD is that you currently access station data by station and not by time, so it is probably not very useful for our use case with repeated requests per initialization time.
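As a rough illustration of how a second importer could slot in, the sketch below dispatches on the prefix of `root`. The registry, `register_loader`, and `load_obs_data` are hypothetical and not part of this PR; the loader arguments follow the signature visible in the diff, and the `ogd` loader is only a placeholder.

```python
from datetime import datetime
from typing import Any, Callable

# Loader arguments modelled on the diff: root, reftime, steps, params.
# Real loaders would return an xr.Dataset; Any keeps the sketch dependency-free.
Loader = Callable[[str, datetime, list[int], list[str]], Any]

_LOADERS: dict[str, Loader] = {}


def register_loader(prefix: str) -> Callable[[Loader], Loader]:
    """Register a loader for roots of the form '<prefix>:...' or '<prefix>'."""
    def decorator(func: Loader) -> Loader:
        _LOADERS[prefix] = func
        return func
    return decorator


def load_obs_data(
    root: str, reftime: datetime, steps: list[int], params: list[str]
) -> Any:
    """Pick the loader from the root prefix and delegate to it."""
    prefix = root.split(":", 1)[0]
    try:
        loader = _LOADERS[prefix]
    except KeyError:
        raise ValueError(f"no observation loader registered for {prefix!r}") from None
    return loader(root, reftime, steps, params)


@register_loader("ogd")
def load_obs_data_from_ogd(
    root: str, reftime: datetime, steps: list[int], params: list[str]
) -> Any:
    # Placeholder: would need to return the same Dataset layout as jretrieve.
    raise NotImplementedError("OGD loader is not implemented in this sketch")
```

As long as every registered loader returns the same dataset layout, the downstream mapping and verification steps would not need to know which source was used.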
