Mapping driven metadata (for solar and wind) #54
Codecov Report: ✅ All modified and coverable lines are covered by tests.
generator_to_trace_draft_mapper.py imported extract_solar_trace_metadata and extract_wind_trace_metadata. So maybe we should just delete generator_to_trace_draft_mapper.py along with these functions?
Yeah good point - deleted.
    file_metadata: dict[Path, dict[str, str]] = {}
    for path in files:
        stem, sep, ref = path.stem.rpartition("_RefYear")
Might be worth adding error handling for either a file name that doesn't match the expected `_RefYear` format, or a file name missing from `resource_mapping`. I guess these are errors that should only be found by devs at parser update time, so maybe it's not worth it.
Hmm yes. I think you're right that it should be found at update time (at least with respect to resource_mapping).
That said maybe someone might tinker with filenames or something on their computer (for some reason?) - will add in a little error handling ("Unexpected trace filename") or similar.
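A minimal sketch of the kind of check discussed in this thread, not the code that was actually merged. The helper names `split_trace_stem` and `lookup_resource` are hypothetical, and `resource_mapping` is assumed to be a plain dict keyed by file stem. It relies on `str.rpartition` returning an empty separator when `_RefYear` is absent.

```python
from pathlib import Path


def split_trace_stem(path: Path) -> tuple[str, str]:
    """Split '<stem>_RefYear<year>' into (stem, year), or raise ValueError.

    Hypothetical helper for illustration only.
    """
    stem, sep, ref = path.stem.rpartition("_RefYear")
    # rpartition returns ("", "", original) when the separator is absent,
    # so an empty `sep` means the filename does not follow the convention.
    if not sep or not stem or not ref:
        raise ValueError(f"Unexpected trace filename: {path.name}")
    return stem, ref


def lookup_resource(
    stem: str, resource_mapping: dict[str, dict[str, str]]
) -> dict[str, str]:
    """Look up a stem, turning a vague KeyError into an explicit ValueError."""
    try:
        return resource_mapping[stem]
    except KeyError:
        raise ValueError(
            f"No resource mapping found for trace file stem: {stem!r}"
        ) from None
```

Raising `ValueError` with the offending filename or stem in the message makes the failure easier to diagnose than a bare `KeyError` surfacing from inside the loop.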
Fixed type hints on internal function dict (`file_metadata`) Co-authored-by: nick-gorman <40549624+nick-gorman@users.noreply.github.com>
(No longer needed or used)
Check for incorrect names and unmapped stems (with an explicit ValueError instead of a vague KeyError failure)
Mapping-driven metadata for solar and wind trace parsing. File stems are looked up in `mappings/2024/resources.yaml` (instead of being reverse-engineered from filenames via regex). (Demand parsing is unchanged: separate PR, coming next I think.)
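As a rough illustration of the lookup-based approach (not the actual `resource_trace_metadata.build()` implementation), the sketch below assumes the YAML has already been loaded into a dict keyed by file stem. The mapping entries, the metadata field names, and the function name `build_metadata` are all hypothetical; the two type-code pairs are the ones named in this PR.

```python
from pathlib import Path

# Hypothetical stand-in for the parsed contents of mappings/2024/resources.yaml.
RESOURCE_MAPPING: dict[str, dict[str, str]] = {
    "Example_Solar_Farm": {
        "name": "Example Solar Farm",
        "resource_type": "solar_sat",
    },
    "Example_Wind_Zone": {
        "name": "Example Wind Zone",
        "resource_type": "wind_offshore_floating",
    },
}

# Semantic YAML vocabulary -> legacy short codes (only the pairs named in the PR).
RESOURCE_TYPE_CODES: dict[str, str] = {
    "solar_sat": "SAT",
    "wind_offshore_floating": "WFL",
}


def build_metadata(files: list[Path]) -> dict[Path, dict[str, str]]:
    """Build per-file metadata by looking up each file stem in the mapping."""
    file_metadata: dict[Path, dict[str, str]] = {}
    for path in files:
        # Everything before the last "_RefYear" is the lookup key;
        # everything after it is the reference year.
        stem, _sep, ref = path.stem.rpartition("_RefYear")
        entry = RESOURCE_MAPPING[stem]
        file_metadata[path] = {
            "name": entry["name"],
            "resource_type": RESOURCE_TYPE_CODES[entry["resource_type"]],
            "reference_year": ref,
        }
    return file_metadata
```

With this shape, supporting a new trace file means adding a row to the mapping rather than extending a regex.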
Changes
- `resource_trace_metadata.build()` for building the metadata dict
- `solar_traces.py` / `wind_traces.py` use it instead of the regex extractors.
- `extract_solar_trace_metadata` / `extract_wind_trace_metadata` and their tests deleted (demand extractor kept for now).

Temporary measures
(These measures are basically to keep PRs more manageable and to phase the implementation.)
- The loader still returns the existing `dict[Path, dict[str, str]]` shape, so downstream helpers, filters, and the parquet schema are untouched. Long-term, metadata will probably move to a typed object (maybe a dataclass, but maybe a pydantic model, given pydantic is already a dependency). That would be a later PR (likely after demand traces are dealt with).
- `_RESOURCE_TYPE_CODES` in `resource_trace_metadata.py` translates the YAML's semantic vocabulary (`solar_sat`, `wind_offshore_floating`) to the legacy short codes (`SAT`, `WFL`) that downstream code still expects (this will disappear in the longer term).

Tests
One test in `tests/test_resource_trace_metadata.py`. The function has no branching or differing versions (i.e. extra solar/wind/zone cases would just be testing different YAML rows, not code, unlike in the regex-driven approach). It only tests two files (a solar project and a wind project) that together exercise the main logic of the function.