Mapping driven metadata (for solar and wind)#54

Merged
dylanjmcconnell merged 15 commits into main from map-driven-parsing
Jun 17, 2026

@dylanjmcconnell

Member

Mapping-driven metadata for solar and wind trace parsing. File stems are looked up in mappings/2024/resources.yaml (instead of reverse-engineered from filenames via regex).

(Demand parsing is unchanged; that will be a separate PR, probably coming next.)

Changes

  • New resource_trace_metadata.build() for building metadata dict
  • solar_traces.py / wind_traces.py use it instead of the regex extractors.
  • extract_solar_trace_metadata / extract_wind_trace_metadata and their tests deleted (demand extractor kept for now).
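To illustrate the idea behind the changes above, here is a minimal sketch of a mapping-driven lookup. It is not the actual resource_trace_metadata.build() implementation: the function name build_sketch, the RESOURCE_MAPPING_SKETCH dict (a stand-in for the parsed contents of mappings/2024/resources.yaml), the example stems, and the field names are all assumptions made for illustration.

```python
from pathlib import Path

# Hypothetical stand-in for the parsed YAML mapping. The real file's keys and
# fields are not shown in this PR, so these rows are illustrative only.
RESOURCE_MAPPING_SKETCH: dict[str, dict[str, str]] = {
    "Example_Solar_SAT": {"name": "Example Solar", "resource_type": "solar_sat"},
    "Example_Offshore_WFL": {
        "name": "Example Offshore",
        "resource_type": "wind_offshore_floating",
    },
}


def build_sketch(
    files: list[Path],
    resource_mapping: dict[str, dict[str, str]],
) -> dict[Path, dict[str, str]]:
    """Look up each file stem in the mapping instead of parsing it with regex."""
    file_metadata: dict[Path, dict[str, str]] = {}
    for path in files:
        # Split "<stem>_RefYear<year>" on the last occurrence of "_RefYear".
        stem, _sep, ref = path.stem.rpartition("_RefYear")
        # Copy the mapped row and attach the reference year (assumed field name).
        file_metadata[path] = {**resource_mapping[stem], "reference_year": ref}
    return file_metadata
```

With this shape, supporting a new trace file means adding a row to the mapping rather than touching the parsing code.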

Temporary measures

(These measures are basically to keep the PRs more manageable and to phase the implementation.)

  • The loader still returns the existing dict[Path, dict[str, str]] shape, so downstream helpers, filters, and the parquet schema are untouched. Long term, metadata will probably move to a typed object (maybe a dataclass, or maybe a pydantic model, given pydantic is already a dependency). That will be a later PR, likely after the demand traces are dealt with.

  • _RESOURCE_TYPE_CODES in resource_trace_metadata.py translates the YAML's semantic vocabulary (solar_sat, wind_offshore_floating) to the legacy short codes (SAT, WFL) that downstream code still expects (this will disappear in the longer term).
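As a rough sketch of the translation step described in the second bullet: only the two pairs named above (solar_sat to SAT, wind_offshore_floating to WFL) come from this PR. The dict name with the _SKETCH suffix and the helper to_legacy_code are hypothetical, and the real table presumably has more entries.

```python
# Only these two pairs are taken from the PR description; the real
# _RESOURCE_TYPE_CODES table is assumed to contain further entries.
_RESOURCE_TYPE_CODES_SKETCH: dict[str, str] = {
    "solar_sat": "SAT",
    "wind_offshore_floating": "WFL",
}


def to_legacy_code(resource_type: str) -> str:
    """Translate a semantic YAML resource type to its legacy short code."""
    try:
        return _RESOURCE_TYPE_CODES_SKETCH[resource_type]
    except KeyError:
        # Fail with a descriptive message rather than a bare KeyError.
        raise ValueError(
            f"No legacy code for resource type: {resource_type!r}"
        ) from None
```

Keeping the translation in a single table means it can be removed in one place once downstream code accepts the semantic names.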

Tests

One test in tests/test_resource_trace_metadata.py. The function has no branching or differing versions (i.e. extra solar/wind/zone cases would just be testing different YAML rows, not code, unlike in the regex-driven approach). It only tests two files (a solar project and a wind project) that together exercise the main logic of the function.

@codecov

codecov Bot commented Jun 11, 2026


Codecov Report

✅ All modified and coverable lines are covered by tests.

Files with missing lines Coverage Δ
src/isp_trace_parser/metadata_extractors.py 77.77% <ø> (-7.94%) ⬇️
src/isp_trace_parser/resource_trace_metadata.py 100.00% <100.00%> (ø)
src/isp_trace_parser/solar_traces.py 100.00% <100.00%> (ø)
src/isp_trace_parser/wind_traces.py 100.00% <100.00%> (ø)

@nick-gorman nick-gorman left a comment

Member

Looks nice and clean to me, always nice to say goodbye to some regex. Just a couple of minor comments.

Member

generator_to_trace_draft_mapper.py imported extract_solar_trace_metadata and extract_wind_trace_metadata. So maybe we should just delete generator_to_trace_draft_mapper.py along with these functions?

Member Author

Yeah good point - deleted.


file_metadata: dict[Path, dict[str, str]] = {}
for path in files:
    stem, sep, ref = path.stem.rpartition("_RefYear")

Member

Might be worth adding error handling for either file name not matching expected format with _RefYear, or filename missing from resource_mapping. I guess these are errors which should only be found by devs at parser update time, so maybe it's not worth it.

Member Author

Hmm yes. I think you're right that it should be found at update time (at least with respect to resource_mapping).

That said maybe someone might tinker with filenames or something on their computer (for some reason?) - will add in a little error handling ("Unexpected trace filename") or similar.
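For reference, a minimal sketch of what that error handling could look like, covering the two failure modes raised in the review. The helper name split_trace_stem and the second error message are assumptions; only the "Unexpected trace filename" wording and the rpartition("_RefYear") call come from this thread.

```python
from pathlib import Path


def split_trace_stem(
    path: Path,
    resource_mapping: dict[str, dict[str, str]],
) -> tuple[str, str]:
    """Return (stem, reference_year), raising ValueError on bad input."""
    stem, sep, ref = path.stem.rpartition("_RefYear")
    # rpartition returns ("", "", original) when the separator is absent,
    # so an empty sep means the filename lacks "_RefYear".
    if not sep or not ref:
        raise ValueError(f"Unexpected trace filename: {path.name}")
    # An unmapped stem would otherwise surface later as a vague KeyError.
    if stem not in resource_mapping:
        raise ValueError(f"Trace file stem not found in mapping: {stem}")
    return stem, ref
```

Checking both conditions up front keeps the failure close to the offending file, which is the main benefit over letting a KeyError propagate.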

Comment thread src/isp_trace_parser/resource_trace_metadata.py Outdated
dylanjmcconnell and others added 5 commits June 17, 2026 13:49
Fixed type hints on internal function dict (`file_metadata`)

Co-authored-by: nick-gorman <40549624+nick-gorman@users.noreply.github.com>
(No longer needed or used)
Check for incorrect names and unmapped stems (with an explicit ValueError instead of a vague KeyError failure)
@dylanjmcconnell dylanjmcconnell merged commit 59d4bc4 into main Jun 17, 2026
18 checks passed
@dylanjmcconnell dylanjmcconnell deleted the map-driven-parsing branch June 17, 2026 04:22