From 008aef8596bf1d984dc0334bf1bdb37079236baf Mon Sep 17 00:00:00 2001 From: Matthew James Briggs Date: Wed, 3 Jun 2026 05:18:10 +0200 Subject: [PATCH 01/27] docs: delete core-dev project Delete the project in which core roundtrip, or corert tests were introduced. --- docs/ai/projects/core-dev/.prompt | 0 .../core-dev/design/agents-md-snippet.md | 50 --- docs/ai/projects/core-dev/design/design.md | 376 ------------------ docs/ai/projects/core-dev/index.md | 42 -- docs/ai/projects/core-dev/log.md | 101 ----- docs/ai/projects/core-dev/plan.md | 43 -- docs/ai/projects/core-dev/state.md | 8 - 7 files changed, 620 deletions(-) delete mode 100644 docs/ai/projects/core-dev/.prompt delete mode 100644 docs/ai/projects/core-dev/design/agents-md-snippet.md delete mode 100644 docs/ai/projects/core-dev/design/design.md delete mode 100644 docs/ai/projects/core-dev/index.md delete mode 100644 docs/ai/projects/core-dev/log.md delete mode 100644 docs/ai/projects/core-dev/plan.md delete mode 100644 docs/ai/projects/core-dev/state.md diff --git a/docs/ai/projects/core-dev/.prompt b/docs/ai/projects/core-dev/.prompt deleted file mode 100644 index e69de29bb..000000000 diff --git a/docs/ai/projects/core-dev/design/agents-md-snippet.md b/docs/ai/projects/core-dev/design/agents-md-snippet.md deleted file mode 100644 index f80f7f68a..000000000 --- a/docs/ai/projects/core-dev/design/agents-md-snippet.md +++ /dev/null @@ -1,50 +0,0 @@ -# AGENTS.md Snippet — Core Development Mode - -Verbatim Markdown to append to the repo-root `AGENTS.md` at implementation time. Stored -separately because nested code fences break Markdown rendering in the main design doc. - -Insert as a new top-level section after `## Quality Gates`. - ---- - -## Core Development Mode (Codegen) - -Use core-dev when modifying files under `src/private/mx/core/` and you do not need `mx/api` -or `mx/impl` to compile. Trimmed build for codegen iteration: only `mx/core` + `mx/ezxml` + -`mx/utility` compile. 
- -### In-mode iteration - -``` -make core-dev # configure + build (fast: skips api/impl) -make test-core-dev # run all core roundtrip cases -make test-core-dev ARGS='[core-roundtrip] lysuite/*' # subset -``` - -In-mode gate: `make fmt && make check-core-dev && make test-core-dev`. `fmt` and -`check-core-dev` run in Docker; `test-core-dev` runs natively. - -### Full pre-merge gate - -Before merging core changes, run the full gate. Changes under `mx/core/` require -`test-all`, which exercises per-element core unit tests, api import, and core -roundtrip together: - -``` -make fmt && make check && make test-all -``` - -### What core-dev tests - -Backed by the **core roundtrip** suite (`CORE_ROUNDTRIP`, `CORE_RT`) under -`src/private/mxtest/corert/`. Each `*.xml` / `*.musicxml` file under `data/` (excluding -`data/expected/`, `data/testOutput/`, `data/generalxml/`, `data/smufl/`) is one Catch2 -test case round-tripping through `mx::core::Document` (`fromXDoc` → `toXDoc`) against a -normalized input computed in-memory. - -The same suite runs inside the normal `mxtest` binary during `make test-all`, gated on -`MX_BUILD_CORE_TESTS=ON`. Distinct from the **api import** (`API_IMPORT`) suite in -`src/private/mxtest/import/`, which exercises the full `mx::api` stack against pre-generated -expected files under `data/expected/`. - -For design details see `docs/ai/projects/core-dev/design/design.md`. diff --git a/docs/ai/projects/core-dev/design/design.md b/docs/ai/projects/core-dev/design/design.md deleted file mode 100644 index 7098ef696..000000000 --- a/docs/ai/projects/core-dev/design/design.md +++ /dev/null @@ -1,376 +0,0 @@ -# core-dev Design - -Permanent in-repo "core development" mode. Lets a codegen agent churn `mx/core` symbols -without keeping `mx/api`, `mx/impl`, or higher-level tests compiling. Activated via -`make core-dev` / `make check-core-dev` / `make test-core-dev`, backed by an `MX_CORE_DEV=ON` -CMake option. 
- -The mode introduces a new test suite, **core roundtrip** (`CORE_ROUNDTRIP`, `CORE_RT`), under -`src/private/mxtest/corert/`. It runs standalone under `make test-core-dev` and inside the -normal `mxtest` binary during `make test-all`. It does not run during `make test` (the fast -path) — gated on `MX_BUILD_CORE_TESTS=ON`, the same flag that gates the 522 per-element core -unit tests. - -Distinct from **api import** (`API_IMPORT`), the `src/private/mxtest/import/` suite -that exercises the full `mx::api` stack against pre-generated expected files. "roundtrip" -is never used unqualified; always "core roundtrip" or "api import." - -Sections: - -1. Build System -2. Compilation Guards -3. Core Roundtrip Test -4. Agent Surface - ---- - -## 1. Build System - -### CMake Option - -``` -option(MX_CORE_DEV "Build only mx/core + ezxml + utility with roundtrip tests" OFF) -``` - -Default OFF; normal build unaffected. - -### Shared Glob - -`SRC_MX_CORE_ROUNDTRIP` is a new glob set, defined unconditionally near the -`SRC_MX_TEST_*` globs, targeting `src/private/mxtest/corert/*.{cpp,h}`. Both build modes -reference it (see section 3 for contents). - -### Effect of `MX_CORE_DEV=ON` (core-dev branch) - -`CMakeLists.txt` enters a distinct branch that builds only two targets: - -| Target | Type | Sources | -|-------------------|----------------|---------------------------------------------------| -| `mx-core` | static library | `SRC_MX_CORE` + `SRC_MX_EZXML` + `SRC_MX_UTILITY` | -| `mxtest-core-dev` | executable | `SRC_MX_CORE_ROUNDTRIP` + `SRC_CPUL` | - -`SRC_MX_API` and `SRC_MX_IMPL` are never passed to any `add_library` / `add_executable` in the -core-dev branch, so they never compile. `MX_BUILD_TESTS`, `MX_BUILD_CORE_TESTS`, and -`MX_BUILD_EXAMPLES` are ignored in core-dev mode — they control only the normal build. - -`PathRoot.h` is still generated (it records `MX_REPO_ROOT_PATH`, used by the core roundtrip -test for file discovery). 
- -### Effect of `MX_CORE_DEV=OFF` (normal branch) - -The existing build runs unchanged — `mx` library, `mxtest`, examples, install rules — -with one addition: when `MX_BUILD_TESTS=ON` *and* `MX_BUILD_CORE_TESTS=ON`, -`SRC_MX_CORE_ROUNDTRIP` is appended to `mxtest`'s source list, so the core -roundtrip cases run as part of `mxtest` alongside the api import, api, impl, and -per-element core suites. - -Gating rationale: `MX_BUILD_CORE_TESTS` gates the 522 per-element core unit tests — OFF -during `make test` / `make check`, ON during `make test-all`. Reusing it keeps `make test` -fast and groups the core roundtrip suite with "anything related to core." No new -test-suite flag is introduced. - -### Structure - -One top-level `if(MX_CORE_DEV) ... else() ... endif()` in `CMakeLists.txt`. The `else()` -branch holds the existing normal build with the `MX_BUILD_CORE_TESTS`-gated -append described above; the `if()` branch holds only the two core-dev targets. No small -`if(NOT MX_CORE_DEV)` gates sprinkled through the file. - -### Build Directory - -`build/core-dev/$(BUILD_TYPE)/`, following the `mode_dir` convention. Independent -CMake cache and incremental state, so switching between `dev` and `core-dev` never -reconfigures the other. - -### Makefile Targets - -| Target | What it does | -|------------------|----------------------------------------------------------------| -| `core-dev` | Configure + build with `MX_CORE_DEV=ON` | -| `check-core-dev` | fmt-check + warning-free build in Docker with `MX_CORE_DEV=ON` | -| `test-core-dev` | Depends on `core-dev`; runs `mxtest-core-dev` (passes `ARGS`) | - -`test-core-dev` always passes `--allow-running-no-tests` to the binary so the -target stays green when `ARGS` filters down to zero cases (and during early -implementation phases before any cases are registered). Real failures still -fail the target — the flag only changes how Catch2 treats the empty-result -case. - -`core-dev`: - -```makefile -core-dev: - $(CMAKE) -S . 
-B $(call mode_dir,core-dev) \ - -DCMAKE_BUILD_TYPE=$(BUILD_TYPE) \ - -DMX_CORE_DEV=ON \ - $(GEN_ARG) - $(CMAKE) --build $(call mode_dir,core-dev) --parallel $(JOBS) \ - --config $(BUILD_TYPE) -``` - -The `cmake_build` macro passes the three `MX_BUILD_*` flags positionally. It is -refactored to also accept `MX_CORE_DEV` as a parameter; `core-dev` invokes it with -`MX_CORE_DEV=ON` and the three `MX_BUILD_*` flags OFF. - -`check-core-dev`: mirrors `check` — fmt-check plus warning-free build, run under the pinned -Docker toolchain so the gate is deterministic. Configures with `MX_CORE_DEV=ON` (and the -three `MX_BUILD_*` flags OFF) instead of the `MX_BUILD_TESTS=on` set `check` uses. - -Docker delegation: outside the container, `make check` and `make check-core-dev` both -build the toolchain image once (`docker buildx build -t mx-sdk`) and then `docker run` it -with the workspace and `build/docker/` bind-mounted, invoking the in-container target -(`make check` or `make check-core-dev`). Inside the container `MX_RUNNING_IN_DOCKER=1` is -set by the image, which flips the Makefile to run the pinned tools directly. There are no -per-target Dockerfile stages. - ---- - -## 2. Compilation Guards - -No per-file `#ifdef` guards. Exclusion happens at the CMake target level by omitting -api/impl/higher-test glob sets in the core-dev branch (see section 1). - -`src/private/mxtest/control/CompileControl.h` is unchanged. Its macros -(`MX_COMPILE_API_TESTS`, etc.) guard code in files the core-dev binary does not compile, so -they are inert there. Removing dormant macros is out of scope. - -### Dual-Compilation Invariant - -The core roundtrip sources under `mxtest/corert/` compile in two contexts: - -- Linked into `mxtest-core-dev` (core-dev branch), against `mx-core` only. -- Linked into `mxtest` (normal branch with `MX_BUILD_CORE_TESTS=ON`), against the full `mx` - library. 
- -The sources must therefore have no `#ifdef MX_CORE_DEV` branches and must not depend on -anything that exists in only one context. Concretely: - -- Includes are limited to `mx/core/*`, `mx/ezxml/*`, `mx/utility/*`, `cpul/catch.h`, and the - two helpers below. -- The suite reuses `ChangeValues.h` and `SortAttributes.h` from `mxtest/import/`. Both are - unguarded today; the implementer must confirm this at integration time. - ---- - -## 3. Core Roundtrip Test - -### Source Location - -New files under `src/private/mxtest/corert/`: - -``` -CoreRoundtripTest.cpp Catch2 dynamic registration + test body -CoreRoundtripImpl.h File discovery, normalization, comparison declarations -CoreRoundtripImpl.cpp Implementations -``` - -No dependency on `mxtest/` infrastructure beyond the two unguarded helpers from -`mxtest/import/` (`ChangeValues.h`, `SortAttributes.h`). The suite does not -use `MxFileRepository`, `ImportTestImpl`, or `CompileControl.h`. It includes only -`mx/core/*`, `mx/ezxml/*`, `mx/utility/*`, and `cpul/catch.h`. - -The corresponding implementation files (`ChangeValues.cpp`, `SortAttributes.cpp`) -are linked into `mxtest-core-dev` via an explicit `SRC_MX_CORE_ROUNDTRIP_HELPERS` -list in `CMakeLists.txt` (the path is enumerated rather than globbed to pick -exactly these two helpers and not the rest of the api import suite, which would -drag in `MxFileRepository`, `ImportTestImpl`, etc.). In the normal branch the -same `.cpp` files are already linked transitively via `SRC_MX_TEST_IMPORT`. - -### File Discovery - -Runs once at static-initialization time, before Catch2's runner starts: - -1. Scan root: `MX_REPO_ROOT_PATH "/data"` (injected by CMake into `PathRoot.h`). -2. Recurse all subdirectories. -3. Collect files with extension `.xml` or `.musicxml`. -4. Exclude any path containing a directory segment named `expected`, `testOutput`, - `generalxml`, or `smufl`. - -The `expected/` and `testOutput/` exclusions remove generated outputs from the input set. 
-The `generalxml/` and `smufl/` exclusions remove non-MusicXML fixtures: `generalxml/` holds -only `fake.xml` (a non-MusicXML XML sample) and `smufl/` holds only `glyphnames.json` (not -XML). If MusicXML test inputs are ever added to those directories, the exclusions -must be revisited. - -Returns absolute paths. Each test name is the path relative to `data/` (e.g., -`lysuite/ly01a_Pitches_Pitches.xml`). - -### Catch2 Dynamic Registration - -Catch2 v3 (amalgamated at `src/private/cpul/catch.h`) exposes `Catch::AutoReg` and -`Catch::ITestInvoker` — the primitives `TEST_CASE` uses internally. - -A custom `ITestInvoker` carries the file path; a static initializer constructs one -`Catch::AutoReg` per discovered file, tagged `[core-roundtrip]`. Each case runs -independently: one failure does not block the rest. Agents filter via -`mxtest-core-dev [core-roundtrip] lysuite/`, or `mxtest [core-roundtrip]` inside the -`test-all` binary. - -### Per-File Test Flow - -The body below runs inside a `try` block; any thrown exception (from `ezxml`, -`fromXDoc`, `toXDoc`, or normalization) is caught and reported via Catch2's `FAIL` with the -file path and `exception::what()`. Catch2 ends the current case on `FAIL` but continues to -the next, so a thrown file never aborts the suite. - -``` -1. Load input via ezxml → inputXDoc -2. Set input root `version` attribute to the supported - MusicXML version (see note below) -3. mx::core::Document::fromXDoc(inputXDoc) → mxDoc -4. mxDoc->toXDoc() → actualXDoc -5. Apply full normalization to actualXDoc - -6. Load input from disk again (fresh) → expectedXDoc -7. Apply full normalization to expectedXDoc - -8. Depth-first compare expectedXDoc vs actualXDoc - On first mismatch: FAIL with path + detail; return -``` - -If `fromXDoc` returns false (non-throwing failure), `FAIL` immediately with file name. 
- -**Version string.** Step 2 uses a hardcoded `"3.0"` declared as a local -`constexpr const char *const kMusicXmlVersionString` in `CoreRoundtripImpl.cpp`. - -The original intent was to use `mx::core::toString(mx::core::DEFAULT_MUSIC_XML_VERSION)` -as a single source of truth. That is not currently possible: the `mx::core` -namespace declares the version in two conflicting places -(`DocumentSpec.h::DEFAULT_MUSIC_XML_VERSION`, lower-case `threePointZero`, with -stringification; `DocumentHeader.h::kDefaultMusicXmlVersion`, upper-case -`ThreePointZero`, no stringification helper), and including both in one TU is a -hard compile error — the `DocumentChoice` and `MusicXmlVersion` enums are -redeclared with incompatible scopes. The core roundtrip suite needs `Document.h` -for `makeDocument` / `fromXDoc` / `toXDoc`, which transitively conflicts with -`DocumentSpec.h`. The api import suite hardcodes `"3.0"` for the same reason -(see `ImportTestImpl::loadTestFile`); we follow that precedent. The constant -carries a comment pointing at the conflict so a future cleanup — most likely -part of the codegen rewrite that brings MusicXML 4.0 support — can replace it -with the canonical symbol. - -**Normalization symmetry.** Steps 5 and 7 apply the *same* normalization pipeline to -both sides, so the comparison is canonical-against-canonical. - -### Normalization - -Normalization is restricted to representational canonicalization — operations that neutralize -XML's non-semantic degrees of freedom. It excludes the defect-workaround fixups in -`mxtest/import/ExpectedFiles.cpp::generateExpectedFile`, which keep api import green against -known-bad inputs and `mx::core` quirks. Core roundtrip rejects that approach: a -mismatch is a signal to fix the bug (in `mx::core`, the input, or codegen), not to hide it -with normalization. The helpers (`convertValues`, `addChildIfNone`, etc.) live in -`mxtest/import/ChangeValues.h`; the suite imports only the canonicalization helpers below. 
- -Steps, in order: - -1. `setXmlDeclaration` — standalone=false, XML 1.0, preserve encoding (default UTF-8). -2. `setDoctype` — based on root element name (`score-partwise` or `score-timewise`). -3. `setRootMusicXmlVersion` — set root `version` from - `mx::core::toString(mx::core::DEFAULT_MUSIC_XML_VERSION)`. -4. `stripZerosFromDecimalFields` — canonicalize decimal text (`"1.000"` → `"1"`) on fields - in the decimal-field set. XML compares text strings, not numeric values, so this is - representational normalization, not a defect workaround. -5. `sortAttributes` — **must be last**. Attribute order is not semantic in XML. - -The same pipeline is applied to both `actualXDoc` (step 5 of the per-file flow) and -`expectedXDoc` (step 7), so the comparison sees two canonically-normalized trees. - -`SortAttributes.h` from `mxtest/import/` is the source for `sortAttributes`; the four -remaining helpers come from `ChangeValues.h`. Both headers are unguarded (per Section 2's -Dual-Compilation Invariant). - -### Depth-First Tree Comparison - -At each node-pair: element name, text content (see open question below), attribute count -and each `(name, value)` pair in order, child count. Attributes are already sorted by -normalization step 5, so the comparator walks them in order. Recurse into each child pair in -order. On first mismatch, record and return. - -Comparison stops at the first mismatch and calls `FAIL`. Catch2 ends the current case and -proceeds to the next, so one file's mismatch never blocks the suite. Continuing past a -mismatch would emit cascade noise — one structural difference makes every subsequent sibling -comparison a likely mismatch — so only the first signal is useful. - -Each call carries a node path like `/score-partwise/part[0]/measure[2]/note[1]/pitch`, which -appears in the failure message. 
Example: - -``` -core roundtrip mismatch in lysuite/ly01a_Pitches_Pitches.xml - at /score-partwise/part[0]/measure[2]/note[1]/pitch/step - expected element name: 'step' - actual element name: 'alter' -``` - -Attribute mismatches use `[@attr]` in the path. - -**Open question — text trimming.** Whether text comparison should trim leading/trailing -whitespace or compare exactly is deferred to implementation. Start with exact comparison; -if real round-trip cases mismatch only on whitespace inside text nodes, switch to trimmed -comparison and document the case that drove the decision. - -### No Pre-Generated Expected Inputs - -The core roundtrip suite does not read `data/expected/`. Expected is computed in-memory from -the input at runtime — no `GenerateExpected` step, no staleness concern. (A key difference -from api import, which compares against pre-generated expected files under -`data/expected/`.) - -### Output Files on Failure - -On any failure (exception caught, `fromXDoc` returning false, or tree-comparison mismatch), -two files are written to `data/testOutput/corert/` for diffing: - -- `.expected.xml` — `expectedXDoc` after normalization -- `.actual.xml` — `actualXDoc` after normalization - -`` is the test name (path relative to `data/`) with directory separators -replaced by `_`, e.g., `lysuite_ly01a_Pitches_Pitches.xml.expected.xml`. Each failure -is a single pair of files under `data/testOutput/corert/` with no nested subdirectories. - -The `corert/` subdirectory namespaces core roundtrip's debug output away from api import's -output (which writes flat into `data/testOutput/`), so the two suites can run in the same -`make test-all` invocation without filename collisions. The test creates -`data/testOutput/corert/` via `mkdir -p` before writing, so a fresh checkout works. - ---- - -## 4. Agent Surface - -Core-dev is an AI/codegen workflow, not a user feature. Substance is documented in -`AGENTS.md` and `make help`. 
The README gets a short pointer (≤20 words) at implementation -time naming the mode and pointing at `AGENTS.md`; no further README content. - -### `make help` Addition - -Appended after existing sections, before the Knobs/Layout line: - -``` -Core development (codegen): - make core-dev Build trimmed library (mx/core + ezxml + utility) and - mxtest-core-dev. No mx/api or mx/impl compiled. - make check-core-dev fmt-check + warning-free build for core-dev (Docker). - make test-core-dev Build core-dev then run the core roundtrip suite. - Each file under data/ is a separate Catch2 test case. - Filter: make test-core-dev ARGS='[core-roundtrip] lysuite/*' -``` - -The `ARGS` hint is included because agents iterate on subsets. The same `[core-roundtrip]` -tag filters the suite inside `mxtest` during `make test-all`. - -### `AGENTS.md` Addition - -Verbatim Markdown in `agents-md-snippet.md` (kept separate to avoid nested fenced code -blocks). Introduces a top-level `## Core Development Mode (Codegen)` section: when to use -the mode, in-mode commands, in-mode gate, and full pre-merge gate. - -### Build Directories - -| Action | Command | Build dir used | -|---------------------|----------------------|--------------------------| -| Enter (build) | `make core-dev` | `build/core-dev//` | -| Enter (test) | `make test-core-dev` | `build/core-dev//` | -| Leave (verify full) | `make test-all` | `build/core//` | -| Leave (fast verify) | `make test` | `build/dev//` | - -Each mode has its own build directory; switching never reconfigures the other. No persistent -state changes on entering or leaving the mode. 
diff --git a/docs/ai/projects/core-dev/index.md b/docs/ai/projects/core-dev/index.md deleted file mode 100644 index 6a551f3aa..000000000 --- a/docs/ai/projects/core-dev/index.md +++ /dev/null @@ -1,42 +0,0 @@ ---- -created: 2026-05-21 -status: complete -prs: https://github.com/webern/mx/pull/153 ---- - -# core-dev - -## Goal - -A permanent in-repo "core development" mode that lets a codegen agent churn -`mx/core` symbols without keeping `mx/api`, `mx/impl`, or higher-level tests -compiling. - -Activated via `make core-dev` / `make check-core-dev` / `make test-core-dev`, -backed by an `MX_CORE_DEV=ON` CMake option. CMake builds only `mx/core` + -`mx/ezxml` + `mx/utility` plus a roundtrip test binary; no per-file `#ifdef` -guards. - -The test suite is one roundtrip harness: each `*.xml` / `*.musicxml` file under -`data/` (excluding `expected/`, `testOutput/`, `generalxml/`, `smufl/`) is a -Catch2 test case asserting that `Document::fromXDoc -> toXDoc` matches the -normalized input. - -CI does not run core-dev. Surfaced in `AGENTS.md` and `make help`. The consumer -is the codegen rewrite tracked in `docs/ai/projects/gen/`; core-dev is its -harness. - -## Index - -- `plan.md` — milestones -- `state.md` — last/next session -- `log.md` — append-only session log (compressed 2026-05-22) -- `.prompt` — user-owned scratchpad (agents do not read) -- `design/design.md` — full design -- `design/agents-md-snippet.md` — Markdown appended to `AGENTS.md` at - implementation time - -## Notes for Agents - -- Milestones 1–3 COMPLETE as of 2026-05-22. Milestones 4–6 remain (see - `plan.md`). diff --git a/docs/ai/projects/core-dev/log.md b/docs/ai/projects/core-dev/log.md deleted file mode 100644 index 770a62fa7..000000000 --- a/docs/ai/projects/core-dev/log.md +++ /dev/null @@ -1,101 +0,0 @@ -# core-dev Log - -Compressed 2026-05-22. Earlier session-by-session notes collapsed to the -retrospective and gotchas below; full per-session entries are in git history if -needed. 
- -## Milestone 1 — Define goals (2026-05-21) - -User-approved scope: `mx/core` + `mx/ezxml` + `mx/utility` only; `mx/api`, -`mx/impl`, their tests, and the 522 core element unit tests excluded. Test -harness is roundtrip-only with dynamic Catch2 registration via `AutoReg` (one -case per file). Makefile: `core-dev`, `check-core-dev`, `test-core-dev`. CMake -exclusion only (no `#ifdef` guards added). CI out of scope. Docs surface: -`AGENTS.md` + `make help`. - -## Milestone 2 — Design (2026-05-21 → 2026-05-22) - -Final design landed in `design/design.md` + `design/agents-md-snippet.md`. -Notable decisions made during review: - -- One top-level `if(MX_CORE_DEV)...else()...endif()` wrap in `CMakeLists.txt`; - `cmake_build` macro takes `MX_CORE_DEV` as a parameter. -- "Option B" reconciliation: same `.cpp` files compile into two binaries - (`mxtest-core-dev` always; `mxtest` only under `MX_BUILD_CORE_TESTS=ON`), so - core roundtrip runs in `make test-all` too. Dual-Compilation Invariant - documented in design §2. -- Terminology fixed: "core roundtrip" / `CORE_ROUNDTRIP` / `CORE_RT` vs "api - import" / `API_IMPORT`. Never say "roundtrip" unqualified. Repo `AGENTS.md` - carries the terminology section. -- Source directory: `src/private/mxtest/corert/`. Discovery excludes - directories named `expected`, `testOutput`, `generalxml`, `smufl`. -- Normalization narrowed to canonicalization only (xml decl, doctype, root - version, decimal zero-stripping, attribute sort). The api import suite's - defect-workaround block was deliberately dropped — a mismatch here is a - signal to fix `mx::core`, the input, or codegen. -- Failure handling: full normalization on both sides; first-mismatch FAIL with - node path; per-file `try`/`catch` so an exception fails one case rather than - aborting the suite. Text comparison exact; trimming deferred unless a real - case drives it (Phase C surfaced no whitespace-only mismatches). 
-- Debug output to `data/testOutput/corert/`, flattened filenames - (`lysuite_ly01a.xml.expected.xml` / `.actual.xml`), `mkdir -p` at write - time. - -## Milestone 3 — Implement (2026-05-22) - -Four phases plus wrap-up; each ended green for what it added. - -- **Phase A** — CMake option, Makefile targets, Docker `run-core-dev` stage. - `mxtest-core-dev` reuses `cpul/main.cpp` (Catch2 main) instead of a - placeholder `main()` to avoid a link collision. `make test-core-dev` passes - `--allow-running-no-tests` to stay green when ARGS filters to zero matches. -- **Phase B** — `CoreRoundtripImpl.{h,cpp}` + `CoreRoundtripTest.cpp` with one - hardcoded case (`lysuite/ly01a_Pitches_Pitches.xml`). Helpers reused from - `mxtest/import/` (`ChangeValues`, `SortAttributes`); their `.cpp` files - enumerated explicitly via `SRC_MX_CORE_ROUNDTRIP_HELPERS` so core-dev does - not pull in `MxFileRepository` / `ImportTestImpl`. Version string hardcoded - `"3.0"` — see "Known conflicts" below. -- **Phase C** — Dynamic Catch2 registration. Gotchas resolved: - - `Catch::StringRef` does not own storage; test-name `std::string`s live in - a function-local-static `std::vector` that is `reserve()`d to final size - before any `push_back` so addresses are stable. - - `Catch::AutoReg` is non-copyable and non-movable; owned through - `std::vector>`. - - Custom `ITestInvoker` subclass owns its own `std::string` test name copy. -- **Phase D (docs)** — `## Core Development Mode (Codegen)` section appended - to `AGENTS.md` directly after `## Quality Gates`; one-line README pointer; - `make help` block. Working filter form under Catch2 v3 is - `[core-roundtrip] lysuite/*` (not `lysuite/`); both design surfaces updated - to match. -- **Phase D (wrap-up)** — Full gate green. `plan.md` / `index.md` / `state.md` - finalized. 
- -## Final gate (2026-05-22) - -| Target | Result | -|---------------------|---------------------------------------------| -| `make fmt` | pass | -| `make check` | pass | -| `make check-core-dev` | pass | -| `make test-all` | 3039 cases, 3004 pass, 35 fail (expected) | -| `make test-core-dev` | 361 cases, 326 pass, 35 fail (same set) | - -The 35 core roundtrip failures are the consumer signal for the codegen rewrite -at `docs/ai/projects/gen/` and out of scope for this project. By rough -category from the failure messages: ~14 attribute mismatches, ~5 text -mismatches, a couple of child-count / attribute-count mismatches, one -`fromXDoc` returning false. Files span `foundsuite/`, parts of `lysuite/` -(figured bass, accordion registrations, articulation texts, parenthesized -accidentals), `mjbsuite/krz_v40.xml` (image element attribute set), -`musuite/test_harmony.xml` (harmony kind text), `musuite/testInvalid.xml` -(identification sub-element count). - -## Known conflicts retained out of scope - -- `DocumentSpec.h` vs `DocumentHeader.h` both declare `MusicXmlVersion`; - `DocumentSpec.h` vs `Document.h` both declare `DocumentChoice`. Co-including - is a hard compile error. The core roundtrip suite needs `Document.h`, so it - hardcodes `"3.0"` to match the api import precedent. Fixing this conflict - belongs to the codegen rewrite, not this project. -- Dormant `MX_COMPILE_*` macros in `CompileControl.h` were left as-is; design - §2 marks removal out of scope. diff --git a/docs/ai/projects/core-dev/plan.md b/docs/ai/projects/core-dev/plan.md deleted file mode 100644 index 5e6f2b6d3..000000000 --- a/docs/ai/projects/core-dev/plan.md +++ /dev/null @@ -1,43 +0,0 @@ -# core-dev Plan - -## Milestone 1: Define goals ✓ COMPLETE - -See `index.md ## Goal`; Q&A in `log.md`. - -## Milestone 2: Design — in progress - -Design lives in `design/`. Until approved, no code outside this project directory changes. 
- -Design docs describe current state only (per `/project` skill); decisions and history go in -`log.md`. - -Exit: design reviewed with user; open questions resolved. - -## Milestone 3: Implement ✓ COMPLETE - -Landed the design across four implementation phases plus a wrap-up: - -1. Phase A — CMake option + Makefile targets (normal build unchanged when `MX_CORE_DEV=OFF`). -2. Phase B — core roundtrip sources behind one hardcoded `TEST_CASE` under - `src/private/mxtest/corert/`. -3. Phase C — dynamic Catch2 registration over every discovered file. -4. Phase D — agent surface (`AGENTS.md` section + README pointer + `make help` block). -5. Phase D wrap-up — full pre-merge gate run, docs finalized. - -Final gate (2026-05-22, this branch): - -- `make fmt`, `make check`, `make check-core-dev`: pass. -- `make test-all`: 3039 cases, 3004 pass, 35 fail (expected — codegen-rewrite signals). -- `make test-core-dev`: 361 cases, 326 pass, 35 fail (same set). - -The `TEMP: codegen-rewrite harness` edits were already reverted before Phase A -began (working tree clean as of 2026-05-22), so no revert step was needed during -implementation. CI integration for core-dev (item 5 in the original ordering) is -deferred — the design treats it as optional; no further session is allocated. -The 35 core roundtrip failures are out of scope here and are the consumer signal -for the codegen rewrite tracked in `docs/ai/projects/gen/`. - -## Milestone 4: Pass Tests and CI ✓ COMPLETE - - - diff --git a/docs/ai/projects/core-dev/state.md b/docs/ai/projects/core-dev/state.md deleted file mode 100644 index 97eb482ad..000000000 --- a/docs/ai/projects/core-dev/state.md +++ /dev/null @@ -1,8 +0,0 @@ -# core-dev State - -Milestones 1–3 COMPLETE as of 2026-05-22. - -## If a future session reopens this project - -- Read `index.md`, `plan.md`, then `design/design.md` (design describes - current state; history is in `log.md`). 
From 8a2dc44e318fc317c2506ee759c81e6e3c446b39 Mon Sep 17 00:00:00 2001 From: Matthew James Briggs Date: Wed, 3 Jun 2026 05:19:24 +0200 Subject: [PATCH 02/27] docs: update agents.md and incorrect version ref Maintain AGENTS.md with fresh information. Also I had thought that MusicXML 4.1 was released but it is just the unreleased working version number, so fix references to it in instruction docs. --- AGENTS.md | 118 ++++++++++++++-------------------- docs/ai/projects/gen/index.md | 4 +- docs/ai/projects/gen/plan.md | 6 +- 3 files changed, 52 insertions(+), 76 deletions(-) diff --git a/AGENTS.md b/AGENTS.md index 17fb8d46e..7c70f3b32 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -18,54 +18,44 @@ Keep this section as a Markdown table. When updating entries, maintain the table | `src/private/mx/utility/` | Shared helpers (string, parsing, file system utilities) | | `src/private/mxtest/` | Test suite (api, core, file, api import, impl, control, core roundtrip) | | `src/private/cpul/` | Catch-based unit-test harness and test runner main | -| `gen/version-a/` | Historical Ruby/shell scripts from the original brute-force code gen | -| `gen/version-b/` | Active Rust code generator for reproducing MusicXML 4.0 element classes | +| `gen/` | Python code generator: `generate.py` (main), `parse.py`, `ids.py`, `quality.py` | | `data/` | MusicXML test files and expected-output suites for api import / core roundtrip tests | -| `docs/musicxml.xsd` | MusicXML specification (reference) | -| `docs/ai/project/` | AI-assisted project planning and codegen design documents | +| `docs/musicxml.xsd` | The currently live MusicXML XSD specification | +| `docs/ai/projects/gen/` | Active codegen project: plans, state, design docs | | `Makefile` | Primary build-and-test entry point (wraps CMake; `make help` lists targets) | | `CMakeLists.txt` | CMake build configuration | -## Code Generation - Historical Context +## Code Generation -`src/private/mx/core/` and `src/private/mx/core/elements/` was 
originally "hand-generated" by human -brute-force using Ruby scripts which can still be found in `./gen/version-a`. This was never a -one-shot solution to generating the code from the XSD spec. Rather, it was an iterative process, -solving problems encountered one-at-a-time until the XSD spec was entirely covered. As such, it is -not viable for re-use at this time, but can be used to understand the historical nature of how the -types were first generated. +Much of the code in `src/private/mx/core/` and `src/private/mx/core/elements/` was originally +generated by hand-written Ruby scripts in a sort of brute force methodology. These scripts have +since been deleted. This was never a one-shot XSD codegen exercise. Many decisions in the original +gen process were taken by hand — hand-rolled types, non-strict spec interpretations, special-cased +elements. These choices exist throughout `mx/core`. Some of these decisions should be preserved, +while others should be reconsidered. The user will decide. -At some point, I tried to create a Rust based codegenerator in order to be able to regenerate -`mx/core`. However, this devolved and failed. It is kept in `gen/version-b` for historical -curiosity, but it never worked. +The active generator is Python-based and lives in `gen/`. Entry point: `python3 gen/generate.py`. It +reads `docs/musicxml.xsd` and emits C++ into `src/private/mx/core/elements/`. Current target is +MusicXML 3.0; upgrading to 4.0 is in progress (see Current Project below). -### The Problem +## Roundtrip Test Suites -We are stuck somewhere around MusicXML 3.1 (or maybe 3.0) because we cannot reliably re-generate the -types from a newer version of the specification. MusicxML 4.0 has been out for a long time, and we -want to support it. But we need to write new code-gen tooling to reproduce the emission of the core -types and then expose the new features in `src/include/mx/api/`. +Two distinct test suites exercise MusicXML round-tripping.
Always use the qualified name; never say +"roundtrip" unqualified. -Furthermore, many decisions in the original gen process were taken by hand. Using a hand-rolled -type, for example, instead of what would have strictly implemented the spec. There are human choices -throughout `mx/core` that will need to be preserved with future code generating efforts. +- API_IMPORT: + - `src/private/mxtest/import/` + - `mx::api::DocumentManager` + - Exercises: `mx/core`, `mx/impl`, `mx/api` + - Use: `make test` + - Purpose: test API -Code generation was never, and should not in the future, be designed to generate any valid XSD -specification. Rather, the goal of code generation is bespoke, to produce what is needed for the -`mx` library from the MusicXML specification. - -## Terminology: Roundtrip Suites - -Two distinct test suites exercise MusicXML round-tripping. Always use the qualified name; never -say "roundtrip" unqualified. - -- **api import** (`API_IMPORT`): the existing suite under `src/private/mxtest/import/`. Imports - a file through `mx::api::DocumentManager`, which exercises `mx/core` + `mx/impl` + `mx/api`, - then compares against a pre-generated expected XML file under `data/expected/`. -- **core roundtrip** (`CORE_ROUNDTRIP`, `CORE_RT`): the suite under - `src/private/mxtest/roundtrip/`. Round-trips a file through `mx::core::Document` only - (`fromXDoc` → `toXDoc`), no api/impl involvement, comparing against the normalized input - computed in-memory. +- CORE_ROUNDTRIP (`corert`): + - `src/private/mxtest/corert/` + - `mx::core::Document` + - Exercises: `mx/core` only + - Use: `make test-core-dev` + - Purpose: test `mx/core` ## Quality Gates @@ -73,55 +63,41 @@ Always run `make fmt` after modifying code under `src/`. 
To see whether your code changes are sound, follow that with: - for changes in `src/private/mx/core/*`: `make test-all` (very slow, can be more than 10 minutes) -- for changes not in `src/private/mx/core/*`: `make test` (faster, can take a couple minues) +- for changes not in `src/private/mx/core/*`: `make test` (faster, can take a few minutes) Check for warnings with: `make check`. CI will run all of these plus the `xcode` targets. -## Core Development Mode (Codegen) +## Codegen Development Mode (core-dev) -Use core-dev when modifying files under `src/private/mx/core/` and you do not need `mx/api` -or `mx/impl` to compile. Trimmed build for codegen iteration: only `mx/core` + `mx/ezxml` + -`mx/utility` compile. +When working on `gen/`, it is permissible to skip the compilation of `mx/api` and `mx/impl` until +some very final step. The idea is that coding agents will have freedom to innovate and produce +better generated code without constantly dealing with symbol breakages at the higher levels. Then, +when reshaping of `mx/core` has reached a desirable state, the upper levels can be fixed and +tested. -### In-mode iteration +To skip upper-level compilation, simply run the following: ``` -make core-dev # configure + build (fast: skips api/impl) -make test-core-dev # run all core roundtrip cases -make test-core-dev ARGS='[core-roundtrip] lysuite/*' # subset +make core-dev # build fast +make test-core-dev # run core roundtrip tests +make test-core-dev ARGS='[core-roundtrip] lysuite/*' # run specific core roundtrip tests ``` -In-mode gate: `make fmt && make check-core-dev && make test-core-dev`. `fmt` and -`check-core-dev` run in Docker; `test-core-dev` runs natively. +Always use `make fmt` though. -### Full pre-merge gate +### Full Merge Gate -Before merging core changes, run the full gate.
Changes under `mx/core/` require -`test-all`, which exercises per-element core unit tests, api import, and core -roundtrip together: +Before PRs run the full gate (unless the user says to skip it): ``` -make fmt && make check && make test-all +make fmt +make check-all +make test-all ``` -### What core-dev tests - -Backed by the **core roundtrip** suite (`CORE_ROUNDTRIP`, `CORE_RT`) under -`src/private/mxtest/corert/`. Each `*.xml` / `*.musicxml` file under `data/` (excluding -`data/expected/`, `data/testOutput/`, `data/generalxml/`, `data/smufl/`) is one Catch2 -test case round-tripping through `mx::core::Document` (`fromXDoc` → `toXDoc`) against a -normalized input computed in-memory. - -The same suite runs inside the normal `mxtest` binary during `make test-all`, gated on -`MX_BUILD_CORE_TESTS=ON`. Distinct from the **api import** (`API_IMPORT`) suite in -`src/private/mxtest/import/`, which exercises the full `mx::api` stack against pre-generated -expected files under `data/expected/`. - -For design details see `docs/ai/projects/core-dev/design/design.md`. - ## Current Project -We are working on reverse engineering a new codegen system to regenerate mx/core for MusicXML 4.0. -See the project directory `./docs/ai/project/gen`. +A Python-based codegen system is being built to regenerate `mx/core` and eventually target MusicXML +4.0. See the project directory `./docs/ai/projects/gen` for status. diff --git a/docs/ai/projects/gen/index.md b/docs/ai/projects/gen/index.md index 69b82513e..72eb1386a 100644 --- a/docs/ai/projects/gen/index.md +++ b/docs/ai/projects/gen/index.md @@ -21,7 +21,7 @@ completion_dates: Reverse engineer the codegen process that produced `mx/core` from MusicXML XSD. Build a generator that re-produces the existing C++ code from `docs/musicxml.xsd`, improve testing and coverage, -improve the generator, then point it at MusicXML 4.1 to generate updated types. +improve the generator, then point it at MusicXML 4.0 to generate updated types. 
## Files @@ -135,7 +135,7 @@ the fix belongs in the shared path with a config-driven flag. - `gen/ids.py` — `NodeId` typed value (M6B; assigned to every node, currently unconsumed) - `gen/quality.py` — design-quality scorer for `make gen-quality` (excluded from its own score) - `gen/.pylintrc` — pylint config for `make gen-lint` -- `docs/musicxml.xsd` — input schema (currently MusicXML 3.0; swap to 4.1 in M6) +- `docs/musicxml.xsd` — input schema (currently MusicXML 3.0; swap to 4.0 in M6) - `src/private/mx/core/elements/` — target output (~590 .h/.cpp pairs) - `src/private/mxtest/corert/` — core-roundtrip harness diff --git a/docs/ai/projects/gen/plan.md b/docs/ai/projects/gen/plan.md index c6a66d502..1444cd2c3 100644 --- a/docs/ai/projects/gen/plan.md +++ b/docs/ai/projects/gen/plan.md @@ -51,12 +51,12 @@ Refactor edthe generator into a `parse -> configure -> render` pipeline - see Definition of the next steps is intentionally left TBD depending on the output ob 6B. I don't want to burdon the LLM with where we are going next. -## Milestone 7: mxml4-types — generate MusicXML 4.1 types +## Milestone 7: mxml4-types — generate MusicXML 4.0 types -Replace `docs/musicxml.xsd` with MusicXML 4.1, regenerate, fix all existing tests. Watch for +Replace `docs/musicxml.xsd` with MusicXML 4.0, regenerate, fix all existing tests. Watch for backported / bolted-on features (SMuFL, `UpDown`, …) that were added with hacks to 3.0/3.1 but are first-class in 4.0. Be backward-compatible with files mx may have written using those hacks. Restore the `mx/impl` TODOs left from revgen. 
-## Milestone 8: Surface MusicXML 4.1 features in mx/api +## Milestone 8: Surface MusicXML 4.0 features in mx/api From 26dba35b86ce21b14a35902f15eca65c78470f53 Mon Sep 17 00:00:00 2001 From: Matthew James Briggs Date: Wed, 3 Jun 2026 15:43:40 +0200 Subject: [PATCH 03/27] docs: plan gen project next steps --- docs/ai/projects/gen/plan.md | 39 +++++++++++++++++++++++++++++++++--- 1 file changed, 36 insertions(+), 3 deletions(-) diff --git a/docs/ai/projects/gen/plan.md b/docs/ai/projects/gen/plan.md index 1444cd2c3..1f54d897f 100644 --- a/docs/ai/projects/gen/plan.md +++ b/docs/ai/projects/gen/plan.md @@ -46,10 +46,43 @@ Delivered 2026-06-01: `make gen-quality` and `make gen-lint` Refactor edthe generator into a `parse -> configure -> render` pipeline - see `design/m6b-data-model.md`. -### 6C_NEXT_AND_BEYOND +### 6C_CONFIG_FILE (next: not started) -Definition of the next steps is intentionally left TBD depending on the output ob 6B. I don't want -to burdon the LLM with where we are going next. +Further refactor the gen program so that it reads a toml config file instead of embedding all of the +bespoke decisions into the python code itself. This requires an excellent design. I want this config +file to be extensible for future code gen use cases (e.g. Rust, Go, etc. and even perhaps generating +a new specification that improves upon MusicXML). + +Areas to consider during the design phase: + +- certain choices I made in the handling of MusicXML might be considered canonically correct, we + should see if any enshrined XSD deviations should be hard-coded and if so whether they could be + present in the contexts layer automatically without configuration. + +- the configuration layer should probably enrich the contexts, or is that the right hook point + +- what things will be needed for different use cases, how different can a configuration look and why + +### 6D_TEMPLATES + +Refactor out the "f-strings" from python. 
Use a proper template library and move the C++ boilerplate +to template files that are rendered by the generator. + +### 6E_STAND_BACK + +Likely multi-session + +How good is our design? Let's have an architect look at it through the lens of supporting future +use cases such as generating code to a different language or generating a new spec inspired by the +MusicXML spec. For example, let's imagine we want to restructure MusicXML significantly to be easier +to use and write that new spec as a JSON spec. What needs to be done to make our generator +extensible in the future (even if we don't add those extensions now, how does the current design break)? + +Are there oddities in `mx/core`'s codegen that we could remove to get a cleaner generator design? + +Try it out with MusicXML 4.0 temporarily. Where did it break? Is it a design problem? + +Write a design doc better_generator.md ## Milestone 7: mxml4-types — generate MusicXML 4.0 types From a97ab241c8207d354ebaa2cc947e1efc55b812d8 Mon Sep 17 00:00:00 2001 From: Matthew James Briggs Date: Sun, 7 Jun 2026 13:30:33 +0200 Subject: [PATCH 04/27] instructions --- docs/ai/projects/gen/design/m6b-data-model.md | 189 ---------------- docs/ai/projects/gen/log.md | 208 +++--------------- docs/ai/projects/gen/plan.md | 41 +--- .../{ => synthetic-files}/synthetic-plan.md | 0 4 files changed, 37 insertions(+), 401 deletions(-) delete mode 100644 docs/ai/projects/gen/design/m6b-data-model.md rename docs/ai/projects/gen/{ => synthetic-files}/synthetic-plan.md (100%) diff --git a/docs/ai/projects/gen/design/m6b-data-model.md b/docs/ai/projects/gen/design/m6b-data-model.md deleted file mode 100644 index d33802f64..000000000 --- a/docs/ai/projects/gen/design/m6b-data-model.md +++ /dev/null @@ -1,189 +0,0 @@ -# M6B Data Model - Generator Architecture - -Static design snapshot for the M6B_DATA_MODEL refactor.
Describes the target architecture; the -Status section below marks what is built versus still planned, and the realized sections describe -the code as it actually stands. How the design was reached is in `log.md`, not here. - -## Purpose - -Insert a clean seam into the ~14k-line generator so generation flows `parse -> configure -> render`. -Each `generate_*` function consumes a fully-resolved per-unit context struct instead of navigating -the global `model` and config dicts inline while it emits. This is a pure refactor: the emitted C++ -under `src/private/mx/core/` is byte-identical before and after. - -The longer-range payoff (later milestones) is that `parse.py` becomes a target-neutral, reusable -artifact that a future Rust / docs / JSON-schema backend can consume. M6B does not build a second -backend; it only establishes the seam. - -## Status - -Built: `parse.py` and `ids.py`. Their sections below describe the code as it exists. Not yet built: -`configure.py`, `contexts.py`, `render/`, `naming.py`, `cpptypes.py` - those sections describe the -target, and `generate.py` still owns config, naming, type maps, and all emission. The strangler -migration (bottom) has step 1 complete; no unit-kind has been routed through a context/renderer yet. - -Two known gaps between the target and the current code, called out where relevant below: - -- `parse.py` is not yet C++-name-free: it computes the `class_names` set and `type_usage_count` and - houses `pascal()`. The target moves `class_names` into `configure.py`'s resolve phase and `pascal` - into `naming.py`. -- Only `model.tree` is severed; `model.root` is still reachable and read by the four bespoke handlers. 
- -## Pipeline - -``` -parse.py pure XSD model, total IDs (target-neutral, reusable) - | -configure.py one config pass, C++-aware (output IS the contexts) - | phase 1: resolve flat indices - | phase 2: build per-unit structs - v -render/*.py pure f-string functions (struct -> string) -``` - -### parse.py - -Parses `docs/musicxml.xsd` into a self-contained data structure of XSD facts: no type maps and no -C++ emission. Two C++-lexicon leaks remain pending the `configure.py` split - the model still -computes the `class_names` set (pascal of every element and complexType name) and `type_usage_count`, -and `pascal()` itself lives here. The C++-aware structural config the parser consumes (which -anonymous sequences become synthetic group classes, which inherited groups are renamed) is not -hardcoded here; `generate.py` injects it as a `ParseConfig` so `parse.py` stays config-free. The four -synthetic-group sets in `ParseConfig` are passed by reference: the parser records discovered synthetic -groups into them during parse, and the emission code reads the same objects afterward. This is the -only stage that touches the XML. - -Self-containment invariant: after parse returns, the ElementTree is not retained on the model. -`model.tree` is severed. `model.root` is still read by the four bespoke family handlers in -`generate.py` (harmony-chord, score-wrapper, music-data, full-note) and cannot be dropped until those -families migrate to parsed data, so for now it survives scoped to bespoke-only. The former -general-path leaks are closed: enum documentation moved into parse (`model.enum_docs`, consumed by -`generate_enums_h`) and the "pattern B" complex-content predicate moved in too -(`model.complex_content_or_group_cts`, consumed by `_ct_has_complex_content`). - -parse.py assigns a `NodeId` to every dataclass-backed node (see IDs below). - -### configure.py - -Target; not yet built - `generate.py` does configuration and emission inline today. - -A single configuration pass. 
It is allowed to be C++-aware: lexicon config (type maps, class-name -overrides, member naming, license) and structural config (tree-ness, choice-as-set, group synthesis, -skips, bespoke shaping) are not split apart in this milestone. The lexicon/neutral split is -deferred; future non-C++ targets fork after `parse.py`, not inside `configure.py`. - -`configure.py`'s output is the set of per-unit context structs (the "contexts"). It builds them in -two flat phases - no topological sort: - -1. Resolve: walk all nodes, populate flat indices keyed by `NodeId` - each node's C++ class name, - header, resolved member type, the `class_names` set (currently computed in `parse.py`; moves here). -2. Build: walk again, emit one self-contained context struct per emittable unit, reading already- - resolved values from the phase-1 indices. - -The two-phase split is what lets a unit's struct embed its dependencies' resolved values without an -ordering pass: phase 1 is complete before phase 2 reads it. Cross-unit references in this codebase -are all name-derivable (a child's class name / header / member type come from the child's name via -flat lookups), so no leaf-first ordering and no render-time context mutation are needed. Both of -those plan ideas are dropped as YAGNI. - -### render/ - -Target; not yet built - the `generate_*` emitters in `generate.py` still navigate `model` and the -config globals inline. - -One pure module per unit-kind (element, attrs, group, choice, container, enums), plus one per -bespoke family. Each renderer is `context_struct -> str` built from f-strings. A renderer reads only -its struct: no `model`, no config dicts, no type resolution, no XSD. No template engine; f-strings -stay, made pure. - -## IDs - -`NodeId` (in `ids.py`) is a typed value with a canonical string form. 
`XsdModel._assign_ids` assigns -one to every dataclass-backed node in the parsed model: the named roots (`el:` elements, `cx:` -complexTypes, `st:` enum simpleTypes, `gr:` groups, `ag:` attributeGroups), their `at:` attributes, -the `el:`/`gr:` child-refs and the `seq#`/`choice#` content-tree particles beneath them, and -anonymous element-local complexTypes (`el:name/cx#0`). Constructs the model stores as plain dicts or -lists (non-enum simpleTypes; the group and attributeGroup containers themselves) and the anonymous XSD -plumbing wrappers (restriction/extension/simpleContent/complexContent) are not node objects and get no -ID. IDs are additive and unconsumed - nothing reads them yet. - -- Named root construct -> one segment `kind:name`: `el:note`, `cx:note-type`, `st:above-below`, - `gr:editorial`, `ag:bend-sound`. Unique within its XSD symbol space; stable across MusicXML - versions for free. -- Any child node -> parent-ID + `/` + a segment: - - named child: `kind:name` -> `cx:note-type/at:type`, `cx:note-type/el:pitch` - - anonymous child: `kind#ordinal` -> `cx:note-type/seq#0`, `cx:note-type/seq#0/choice#1` - -Kind is embedded in every segment (a local element `pitch` and a local attribute `pitch` under one -owner must not collide). Same-kind anonymous siblings get an ordinal. - -Stability: only Tier-1 named roots are version-stable, and that is sufficient. Every structural- -config dict keys off a named construct (element name or complexType name) and reaches into anonymous -structure by ordinal-local-to-the-owner or a human-assigned slug - never by a global anonymous ID. -So nested/anonymous IDs are positional and run-local by design; their cross-version instability is -harmless. 
- -### Anonymous-construct inventory (from docs/musicxml.xsd) - -A census of the XSD's anonymous constructs the total scheme aims to cover (not all are minted yet - -see below): 7 anonymous element-local complexTypes, 21 nested sequence particles, 18 nested choice -particles, 57 group-ref occurrences, 2 inline simpleTypes, 4 inline unions. Named-but-locally-scoped -(owner-scoped by local name): 419 local element decls, 276 local attribute decls. - -Current coverage vs this census: the node-backed constructs are minted (anonymous element-local -complexTypes, nested sequence/choice particles, group refs, local element/attribute decls, and the -body particles of a named owner). The inline simpleTypes/unions are carried as type strings rather -than node objects, and the restriction/extension/simpleContent/complexContent plumbing wrappers are -not modeled at all, so neither is minted today. - -## Bespoke families - -Target; not yet built - all seven families still emit inline, and four `model.root` walks remain -inside them (harmony-chord, score-wrapper, music-data, full-note). - -The seven `BESPOKE_ELEMENTS` (credit, lyric, part-list, harmony, score-wrapper, note, direction) -conform to the architecture (G1): build logic moves to `configure.py`, renderers go pure (read only -their struct). They keep their own struct types and renderers rather than being forced through the -shared generic path. Collapse a family into the shared structs only where it is obviously clean -(e.g. part-list, credit, harmony); leave the irregular ones (note, direction) with a specialized but -still-pure renderer. The non-negotiable is the invariant - purity and the build/render split - not -full unification. 
- -## Module layout - -``` -gen/parse.py XSD model, dataclasses, parsing, NodeId assignment [exists] -gen/ids.py NodeId typed value [exists] -gen/configure.py config dicts + two-phase build [planned] -gen/contexts.py per-unit context struct definitions [planned] -gen/render/ one pure module per unit-kind + per bespoke family [planned] -gen/naming.py pascal/camel/class-name helpers (C++ lexicon) [planned] -gen/cpptypes.py XSD-to-C++ type maps (C++ lexicon) [planned] -gen/generate.py orchestrator + all not-yet-migrated config/emission [exists] -``` - -`generate.py` is still ~13.4k lines and thins out only as kinds migrate. Until `naming.py` exists, -`pascal()` lives in `parse.py` and `camel`/`has_flag_name` in `generate.py`; until `cpptypes.py` -exists, the XSD-to-C++ type maps live in `generate.py`. The dead `gen_attrs.py` / `gen_enums.py` / -`gen_enum_members.py` helpers have been removed. - -## Oracle and migration - -Oracle (pure-refactor correctness): the committed C++ equals `python3 gen/generate.py && make fmt` -(raw generator output is unformatted, so the `make fmt` step is required). Thus -`python3 gen/generate.py && make fmt && git diff --quiet src/private/mx/core` must show no diff, and -`make test-core-dev` must pass; `make test-all` before merge. The tightest check during a refactor is -that raw generator output is byte-identical before/after the change (`diff -rq` a pre-change snapshot -of `src/private/mx/core`). `gen-quality` is ignored during the refactor and revisited only if CI -fails at the end. - -Migration is strangler-style. `generate.py` stays byte-identical the entire time: - -1. [done] Extract `parse.py` + total `NodeId`s as a pure internal move (IDs additive, unconsumed); - move enum extraction (and the complex-content predicate) in and sever `model.tree`. `model.root` - survives scoped to the four bespoke handlers pending their migration. Verify zero diff. -2. 
Migrate one unit-kind at a time (enums -> attrs -> simple elements -> groups -> choices -> - containers -> tree-parents -> the 7 bespoke families): build that kind's context struct + pure - renderer, route only that kind through the new path, leave the rest on the old path, verify zero - C++ diff after each kind. -3. When all kinds are migrated, delete the old dispatch and the `model`/globals reach-back. diff --git a/docs/ai/projects/gen/log.md b/docs/ai/projects/gen/log.md index 47aa76d99..90a3fb63c 100644 --- a/docs/ai/projects/gen/log.md +++ b/docs/ai/projects/gen/log.md @@ -4,209 +4,57 @@ Append chronologically, oldest on top. ## M1: revgen (2026-05-18 — 2026-05-21, 40 iterations) ✅ -Reverse-engineered the codegen, iteratively shrinking `SKIP_ELEMENTS` and `CHOICE_SKIP` to -empty. Closed with the generator producing every C++ class in `mx/core`. Tests still -failing; commit `d4f25ee6` "src: issues caused by revgen" carries hand-edits to -non-generated consumers that kept the build working — the input to M2. +Reverse-engineered the codegen. Closed with the generator producing every C++ class in `mx/core`. +Tests still failing; commit `d4f25ee6` ## M2: fix-gen (2026-05-21 — 2026-05-22) ✅ -Triaged `d4f25ee6` into 6 root causes (A–F) and ~129 `make test-all` failures into clusters -(R1–R7, D1–D4, plus an isFirst-separator family). Fixed iteratively to 0 failures. - -Notable mechanisms introduced (still in `gen/generate.py`): -- `EXTENSION_OPTIONAL_GROUP_RENAME`, `SUPPRESS_GROUP_SUFFIX`, - `WRAPPER_AS_ELEMENT_SYNTH_GROUPS` (MetronomeTuplet group flattening, issue C). -- `TREE_ELEMENT_CONFIG["key"] = {"parent_imports_choice_groups": True}` → - `Key::import{Traditional,NonTraditional}Key` (issues E/F). -- Required-set seeding rule in `generate_element_cpp` - (`min_occurs>=1, max_occurs!=1, !is_group`) fixed HarpPedals SIGSEGV (R5).
-- `is_container` + `trigger_names` on `TreeChoiceBranch`; `importContainer` dispatch - in `generate_tree_parent_cpp` (D1, D4 — Metronome reader). -- `ATTR_DEFAULT_OVERRIDE` (~17 entries), `CHILD_INIT_VALUE_OVERRIDE` (Scaling, - StaffDetails). -- `_emit_direction_family` bespoke driven by - `model.complex_types["direction-type"].content_tree`. -- `ELEMENT_HAS_CONTENTS_ALWAYS_TRUE` (MeasureLayout, NoteheadText), notehead-text - `seed_choice_set`, MeasureLayout explicit child-presence check for `isOneLineOnly`. - -Deferred: issue A — original `mx/core` had a hand-applied MusicXML 4.0 `UpDownNone` -backport overwritten by schema-faithful 3.x regen. TODO comments in -`mx/impl/NotationsWriter.cpp:398` and `mx/impl/ArpeggiateFunctions.cpp:35`. Belongs to -M5/M6. - -### Lessons captured (operational invariants) - -- `git checkout -- src/private/mx/core/` preserves mtimes; incremental cmake then links - partly-old `.o` files and reports stale test counts. Use `make clean && make test-all` - for authoritative measurements. -- `make test-all` must run with generated files present — HEAD's `UpDownNone` backport - is incompatible with a schema-faithful regen, so a reset-first build will not compile. -- When removing a previously-emitted byte from a shared template, survey the whole HEAD - population that template emits (R2 burned us by regressing DirectionType). -- Bespoke handlers should still read the parsed XSD model — "custom algorithm, - schema-driven data". +Solved failing test issues ## M3: fix-core-dev (2026-05-22, 5 iterations) ✅ -Each iteration picked the smallest core-roundtrip diff in `data/testOutput/corert` and -triaged it: either a hand-rolled mx-side fix or a `{file}.invalid` marker for files that -don't conform to the XSD. Per-iteration template now lives in `plan.md`. - -- **i1** — tenths-typed `width` attribute trailing-zero strip. Added `"width"` to - `decimalFields` in `src/private/mxtest/import/DecimalFields.h`. Cleared 17 failures. - Commit `639d46a3`. 
-- **i2** — `musuite/testInvalid.xml` is intentionally invalid; introduced the - `{file}.invalid` marker convention (honored by `CoreRoundtripImpl.cpp::discoverInputFiles`; - api import keeps processing). Documented in `data/README.md`. -- **i3 (static-analysis sweep)** — ran `xmllint --schema` against MusicXML XSD on - remaining failing files; marked 10 as `.invalid` where the schema violation explained - the diff. Three real bugs survived: ly22b, ly41e, ly45f. -- **i3 (cont.)** — ly45f: `CommaSeparatedListOfPositiveIntegers::parse` in hand-written - `src/private/mx/core/CommaSeparatedPositiveIntegers.cpp` discarded "1, 2, 3" vs - "1,2,3" spacing on import. Added a `", "` detection that sets `myIsSpacingDesired`. - Commit `461b96d2`. -- **i4** — ly41e: `XsString::toStream` escaped only `<`, `>`, `&`. A raw `\r` was - normalized to `\n` on the next read (pugixml `parse_eol`, XML 1.0 §2.11). Added - `'\r'` → `" "`. Hand-written type, no regen. Commit `040b2152`. -- **i5** — ly22b: XSD `slash` group is `minOccurs="0"` inside complexType `slash` and - `beat-repeat`, so empty `` is valid. HEAD treated `slash-type` as - always-present; revgen preserved this via `CHILD_MIN_OCCURS_OVERRIDE`. Removed the - overrides, regenerated `Slash.{h,cpp}` and `BeatRepeat.{h,cpp}` with a `myHasSlashType` - flag, removed a matching `addChildIfNone` workaround in - `src/private/mxtest/import/ExpectedFiles.cpp`. `make test-all` then surfaced 25 - assertions in 23 `mxtest/core/*Test.cpp` cases that codified the bug; added 6 - `setHasSlashType(true)` calls across 4 fixture files. Commits `d43a222c`, `9c8efa24`. - -Final state: `make test-core-dev` 350/350, `make test` 2717/2717, `make test-all` -3028/3028 (9914 assertions), `make check` passed. PR-merge commit `6c4e18d4`. +Further test fixes using a new `make test-core-dev` target. ## M4a: test fixer (2026-05-22 -- 2026-05-25) ✅ -Built a `Fixer` that patches the expected tree before comparison via per-file -`.fixup.xml` sidecars. 
This lets corert handle cases where mx's clamping or -defaulting differs from the raw input without marking the file `.invalid`. Convention -documented in `data/README.md`, design doc at `src/private/mxtest/corert/Fixer.h`. -Closed at 387/387 `test-core-dev` pass and 3065/3065 `test-all`. +Built a `Fixer` that patches the expected tree before comparison via per-file `.fixup.xml` +sidecars. ## M5: test coverage expansion (2026-05-25 -- 2026-05-30) ✅ -Added real-world corpus files and generated 235 synthetic MusicXML files -(`data/synthetic/`) to achieve 100% symbol coverage of MusicXML 3.0/3.1/4.0 spec -symbols. Final three corert failures fixed: PlaybackSound "other" variant -(PlaybackSoundType wrapper class), xmlns:xlink preservation (XMLNS_PRESERVING_ATTRS -config in generator). Filed GitHub issue #161 for namespace-prefix limitation. +Added real-world corpus files and generated 235 synthetic MusicXML files (`data/synthetic/`) to +achieve 100% symbol coverage of MusicXML 3.0/3.1/4.0 spec symbols. Fixed three corert failures: +PlaybackSound "other" variant (PlaybackSoundType wrapper class), xmlns:xlink preservation +(XMLNS_PRESERVING_ATTRS config in generator). Filed GitHub issue #161 for namespace-prefix +limitation. Final state: `make test-core-dev` 676/676, `make test-all` all pass. ## M6A: gen-quality tooling (2026-06-01) ✅ -Designed the scoring methodology with the user via a grill, then built the tooling. Decisions, in -order: dropped maintainability-index (step-shaped + redundant with CC/Halstead/LOC) and Halstead -(size-proxy, redundant once a real size axis exists); excluded duplication/coupling/cohesion/DIT -after measuring that jscpd and pylint both report ~0% duplication on this f-string-heavy emission -code (the dupes are semantic, not token-identical, so any detector misreads them as clean); moved -pylint out of the score into a separate `make gen-lint` binary gate. 
Final rubric: structure 50% -(LOC-weighted function + file size), cyclomatic 25%, cognitive 25%, all via one smooth -`target/max(target,value)` transform so partial refactor wins register and tiny stub functions -cannot game the size axis. - -Implemented: rewrote `gen/quality.py` (scores every `gen/*.py` except itself; writes -`data/testOutput/gen-quality/score.json` with 30 offenders/axis as `path:line` refs, plus -`report.md` and stdout); added `gen/.pylintrc` (disables the complexity checks gen-quality scores); -added a pinned analyzer venv to the Dockerfile; added `make gen-quality` / `make gen-lint` with bash -floor gates; wired both into CI `linux-gate` with a job-summary line and a per-push PR comment. -Deleted dead `gen/eval.py`, `gen/eval_config.yaml`, and the old `gen/quality-baseline.json`. - -Floors are a ratchet, set just under the measured in-container baseline (deterministic, identical to -local): `GEN_QUALITY_FLOOR=37.7` (composite 37.7 = structure 20.1, cyclomatic 62.8, cognitive 47.9), -`GEN_LINT_FLOOR=9.4` (pylint 9.49). Generator behavior was off-limits by user direction, so this is -a tooling-only change; the dead `OVERWRITE_FILE_STEMS` set in `generate.py` was left untouched. - -## 2026-06-02 07:49 M6B design grill - -Grilled the user to settle the M6B_DATA_MODEL architecture before implementation. No code changed; -design captured in `design/m6b-data-model.md`. Decisions, in order: - -Seam: only `parse.py` is target-neutral (pure XSD). `configure.py` is one config pass and is allowed -to be C++-aware; we deliberately do not split lexicon-vs-neutral this milestone. Future non-C++ -targets fork after `parse.py`. Rejected splitting structural vs C++ config into two layers - user -said don't overdo it. +Designed a python gen program quality scoring methodology with the user via a grill, then built the +tooling. -IDs: total - every node gets a `NodeId` (typed value, canonical string). 
Named roots `kind:name` -(`cx:note-type`), stable across versions for free; children are parent-ID + `/kind:name` or -`/kind#ordinal`, kind embedded in every segment, ordinals for anonymous siblings. Found by inspection -that every structural-config dict keys off a named construct and reaches anonymous structure by -owner-local ordinal or human slug - so cross-version stability is a non-problem for anonymous nodes -and they can be cheap positional/run-local. User chose total coverage (incl ~337 unused body/plumbing -IDs) for a uniform "every node has an ID" invariant. - -Context: `configure.py`'s output is the contexts - per-unit render structs (option 2), not a mega- -context. Built in two flat phases (resolve indices, then build structs). Verified cross-unit refs are -all name-derivable (`classify_element` called once at top of main loop; `resolve_cpp_type` is a name- -keyed flat lookup) - so the plan's dependency-topology-order and generators-mutate-context ideas are -both dropped as YAGNI. - -Renderers: pure f-string functions (`struct -> str`), no template engine. Jinja rejected: diff-risk -against the byte-identical oracle, and moving LOC into `.j2` would be metric-gaming. +Implemented: `gen/quality.py` (scores every `gen/*.py` except itself; writes +`data/testOutput/gen-quality/score.json` with 30 offenders/axis as `path:line` refs, plus +`report.md` and stdout); added a pinned analyzer venv to the Dockerfile; added `make gen-quality` / +`make gen-lint` with bash floor gates; wired both into CI `linux-gate` with a job-summary line and a +per-push PR comment. -Bespoke (7 families): G1 - conform to the architecture (pure renderer, build logic in `configure.py`) -but keep their own structs/renderers; unify into the shared path only where obviously clean. +Floors are a ratchet, set just under the measured baseline: `GEN_QUALITY_FLOOR`, +`GEN_LINT_FLOOR`. -Self-containment: after parse, drop `model.root`/`model.tree`. 
One leak today: `generate_enums_h` -(`generate.py:854`). The three helpers `gen_attrs.py`/`gen_enums.py`/`gen_enum_members.py` are dead -(not imported) - delete them. +## 2026-06-02 07:49 M6B design grill ✅ -gen-quality: user directed to ignore it during the refactor and only revisit if CI fails at the end. +Grilled the user to settle the M6B_DATA_MODEL. Initial grill was not very productive. Design doc +discarded. The core concepts from the grill are as follows. -Migration: strangler. Keep `generate.py` byte-identical throughout; extract `parse.py` + IDs first, -then migrate one unit-kind at a time verifying zero C++ diff after each, then delete old dispatch. -Session-1 scope: stand up `parse.py`, move enum extraction in, sever `model.root`, assign `NodeId`s, -prove zero diff. +Seam: +- `parse.py` is target-neutral (pure XSD). +- Next layer, a TOML file is read to configure specifics such as + - what template file(s) map to which XSD objects + - what additional transformations are needed to the context, etc. ## 2026-06-02 08:15 M6B session 1: stood up parse.py + ids.py as a pure internal extraction. Zero C++ diff. - -Moved into gen/parse.py: the nine XSD dataclasses (XsdAttribute, XsdEnumType, XsdChildRef, -ElementRefNode, GroupRefNode, SequenceNode, ChoiceNode, XsdComplexType, XsdElement), the XsdModel -parser, and pascal(). generate.py now imports these from parse. camel()/has_flag_name()/CPP_KEYWORDS -stayed (C++ lexicon, not needed by parse). - -Config coupling: the parser reads and mutates seven structural-config globals (GENERATE_GROUPS, -SYNTHETIC_OPTIONAL_GROUPS, SYNTHETIC_UNBOUNDED_GROUPS, SUPPRESS_GROUP_SUFFIX, plus three read-only -dicts). These are C++-aware (configure.py material), so rather than move them into parse.py they are -injected: generate.py keeps the globals and passes them via a new ParseConfig dataclass into -XsdModel(xsd_path, cfg).
The four sets are passed by reference, so synthetic groups the parser -records during parse are visible to the emission code afterward. parse.py stays config-free; the -future move of these dicts to configure.py is now a clean cut. Avoids a generate<->parse import cycle. - -Enum extraction moved into parse: model.enum_docs (enum name -> annotation/documentation text) is -populated in _parse_simple_types; generate_enums_h reads model.enum_docs.get(name, "") instead of -re-walking model.root. - -Discovery that contradicts the design's self-containment note: model.root was used by SIX emission -sites, not one. Besides generate_enums_h there is _ct_has_complex_content (general, "pattern B" -predicate) and four bespoke handlers (harmony-chord, score-wrapper, music-data, full-note). Migrated -the two general-path users into parse: model.complex_content_or_group_cts feeds _ct_has_complex_content. -model.tree is now severed (not stored). model.root cannot be fully severed yet because the four -bespoke handlers still walk it; severing it requires migrating those families, which session 1 forbids. -So model.root survives, scoped to bespoke-only. Updated design/m6b-data-model.md to state this. - -ids.py: NodeId is a frozen value with a canonical string form (kind:name segments joined by /, anon -siblings as kind#ordinal). _assign_ids walks the whole model and assigns a NodeId to every -dataclass-backed node (elements, complexTypes incl. attributes/children/choice_children/content_tree, -groups, attribute_groups, enum_types, anonymous element-local complexTypes). node_id fields are -field(default=None, compare=False) so equality/repr are unaffected. Additive and unconsumed - nothing -reads IDs yet. - -Deleted dead gen/gen_attrs.py, gen/gen_enums.py, gen/gen_enum_members.py (standalone probe scripts, -unimported). - -Oracle: the committed C++ equals generate.py + make fmt (raw generator output is unformatted; the -user's stated oracle omitted the fmt step). 
Proved correctness two ways: raw generator output is -byte-identical before/after (diff -rq of a pre-change snapshot), and generate+fmt yields zero git diff -against committed. Did not run make test-core-dev to completion: the C++ is byte-identical to -committed, so core roundtrip behavior is unchanged from the base branch and C++ CI is unaffected (per -user direction). CI gates verified locally: gen-quality 38.2 (floor 37.7, up from 37.7), gen-lint 9.50 -(floor 9.4). diff --git a/docs/ai/projects/gen/plan.md b/docs/ai/projects/gen/plan.md index 1f54d897f..4c75783d0 100644 --- a/docs/ai/projects/gen/plan.md +++ b/docs/ai/projects/gen/plan.md @@ -48,41 +48,18 @@ Refactor edthe generator into a `parse -> configure -> render` pipeline - see ### 6C_CONFIG_FILE (next: not started) -Further refactor the gen program so that it reads a toml config file instead of embedding all of the -bespoke decisions into the python code itself. This requires an excellent design. I want this config -file to be extensible for future code gen use cases (e.g. Rust, Go, etc. and even perhaps generating -a new specification that improves upon MusicXML). +CARDINAL_RULE: ZERO diff is tolerated in the generated C++ files. Diffs must not affect code outside +of `gen/`. -Areas to consider during the design phase: +Iteratively stand-up a separation of concerns between +- XSD standard, hard coded transforms +- Configuration transforms +- Template configuration +- Use of template files instead of python f-strings -- certain choices I made in the handling of MusicXML might be considered canonically correct, we - should see if any enshrined XSD deviations should be hard-coded and if so whether they could be - present in the contexts layer automatically without configuration. 
+### 6D++ -- the configuration layer should probably enrich the contexts, or is that the right hook point - -- what things will be needed for different use cases, how different can a configuration look and why - -### 6D_TEMPLATES - -Refactor out the "f-strings" from python. Use a proper template library and move the C++ boilerplate -to template files that are rendered by the generator. - -### 6E_STAND_BACK - -Likely multi-session - -How good is our design. Let's have an architect look at it through the lense of supporting future -use cases such as generating code to a different language or generating a new spec inspired by the -MusicXML spec. For example, let's imagine we want to restructure MusicXML significantly to be easier -to use and write that new spec as a JSON spec. What needs to be done to make our generator -extensible in the future (even if we don't add those extensions now, how does the current design break). - -Are there oddities in `mx/core`'s codegen that we could removed to get a cleaner generator design? - -Try it out with MusicXML 4.0 temporarily. Where did it break. Is it a design problem? - -Write a design doc better_generator.md +TBD ## Milestone 7: mxml4-types — generate MusicXML 4.0 types diff --git a/docs/ai/projects/gen/synthetic-plan.md b/docs/ai/projects/gen/synthetic-files/synthetic-plan.md similarity index 100% rename from docs/ai/projects/gen/synthetic-plan.md rename to docs/ai/projects/gen/synthetic-files/synthetic-plan.md From 9ba35f69b623cabfeea5b8b8458b41b1a23b09eb Mon Sep 17 00:00:00 2001 From: Matthew James Briggs Date: Sun, 7 Jun 2026 14:00:41 +0200 Subject: [PATCH 05/27] gen: jinja2 templates for simple-value elements Add Jinja2 template rendering for the 101 simple-value elements, replacing the fake-CT path through generate_element_h/cpp. Templates and routing config live in gen/cpp/. Jinja2 is pinned in a new /opt/gen-venv Docker venv. New `make generate` target runs the generator inside Docker. 
--- Dockerfile | 4 ++ Makefile | 10 +++- docs/ai/projects/gen/index.md | 3 ++ docs/ai/projects/gen/log.md | 23 +++++++++ docs/ai/projects/gen/plan.md | 6 ++- docs/ai/projects/gen/state.md | 56 ++++++++++----------- gen/cpp/config.toml | 3 ++ gen/cpp/simple_value_cpp.j2 | 73 +++++++++++++++++++++++++++ gen/cpp/simple_value_h.j2 | 64 ++++++++++++++++++++++++ gen/generate.py | 93 ++++++++++++++++++++++++++++++----- 10 files changed, 294 insertions(+), 41 deletions(-) create mode 100644 gen/cpp/config.toml create mode 100644 gen/cpp/simple_value_cpp.j2 create mode 100644 gen/cpp/simple_value_h.j2 diff --git a/Dockerfile b/Dockerfile index 9902a01b9..a06da8960 100644 --- a/Dockerfile +++ b/Dockerfile @@ -28,6 +28,10 @@ RUN python3 -m venv /opt/quality-venv \ pylint==4.0.5 \ cognitive_complexity==1.3.0 +RUN python3 -m venv /opt/gen-venv \ + && /opt/gen-venv/bin/pip install --no-cache-dir \ + Jinja2==3.1.6 + # Unversioned name so the Makefile invokes the pinned formatter without # knowing the version suffix. RUN ln -sf /usr/bin/clang-format-18 /usr/local/bin/clang-format diff --git a/Makefile b/Makefile index a0322785d..722906e89 100644 --- a/Makefile +++ b/Makefile @@ -75,6 +75,7 @@ GCOV ?= gcov-14 # complexity); gen-lint enforces genuine lint defects. Pinned analyzers live in # the mx-sdk venv (see Dockerfile). GEN_PY is every gen/*.py except the measurer. 
QUALITY_VENV := /opt/quality-venv +GEN_VENV := /opt/gen-venv QUALITY_DIR := data/testOutput/gen-quality GEN_PY := $(filter-out gen/quality.py,$(wildcard gen/*.py)) GEN_QUALITY_FLOOR ?= 37.7 @@ -153,7 +154,7 @@ endef .DEFAULT_GOAL := help .PHONY: help lib dev core test test-all examples-run all clean clean-docker \ check-docker docker-volume fmt check core-dev check-core-dev \ - test-core-dev coverage-core-dev gen-quality gen-lint \ + test-core-dev coverage-core-dev generate gen-quality gen-lint \ xcode-gen xcode-build xcode-test help: @@ -162,6 +163,7 @@ help: @echo 'Quality gates (run in docker):' @echo ' make fmt Format all C++ files under src/.' @echo ' make check fmt-check + warning-free build.' + @echo ' make generate Regenerate C++ from the XSD (gen/generate.py).' @echo ' make gen-quality Score gen/ design quality; fail below the floor.' @echo ' make gen-lint Lint gen/ with pylint; fail below the floor.' @echo '' @@ -371,6 +373,9 @@ coverage-core-dev: $(call mode_dir,cov-core-dev) | tee $(COV_DIR)/summary.txt @echo "=== coverage written to $(COV_DIR)/ ===" +generate: + $(GEN_VENV)/bin/python gen/generate.py + # Static analysis of the gen/ generator (in-container branch). quality.py # measures and writes the report tree to the workspace mount; the floor check # below is the gate. See docs/ai/projects/gen. @@ -434,6 +439,9 @@ coverage-core-dev: $(DOCKER_STAMP) docker-volume $(DOCKER_RUN) make coverage-core-dev @echo "Coverage written to $(COV_DIR)/ (open $(COV_DIR)/index.html)" +generate: $(DOCKER_STAMP) + $(DOCKER_RUN) make generate + # Static analysis gates. Pure Python -- no C++ build -- so they only need the # image. The report tree is written through the workspace mount to # ./data/testOutput/gen-quality. Same commands run identically in CI. 
diff --git a/docs/ai/projects/gen/index.md b/docs/ai/projects/gen/index.md index 72eb1386a..3621b8972 100644 --- a/docs/ai/projects/gen/index.md +++ b/docs/ai/projects/gen/index.md @@ -135,6 +135,9 @@ the fix belongs in the shared path with a config-driven flag. - `gen/ids.py` — `NodeId` typed value (M6B; assigned to every node, currently unconsumed) - `gen/quality.py` — design-quality scorer for `make gen-quality` (excluded from its own score) - `gen/.pylintrc` — pylint config for `make gen-lint` +- `gen/cpp/config.toml` — TOML routing config mapping element categories to Jinja2 templates (M6C) +- `gen/cpp/simple_value_h.j2` — Jinja2 header template for simple-value elements (M6C) +- `gen/cpp/simple_value_cpp.j2` — Jinja2 impl template for simple-value elements (M6C) - `docs/musicxml.xsd` — input schema (currently MusicXML 3.0; swap to 4.0 in M6) - `src/private/mx/core/elements/` — target output (~590 .h/.cpp pairs) - `src/private/mxtest/corert/` — core-roundtrip harness diff --git a/docs/ai/projects/gen/log.md b/docs/ai/projects/gen/log.md index 90a3fb63c..20d3efe3b 100644 --- a/docs/ai/projects/gen/log.md +++ b/docs/ai/projects/gen/log.md @@ -58,3 +58,26 @@ Seam: ## 2026-06-02 08:15 M6B session 1: stood up parse.py + ids.py as a pure internal extraction. Zero C++ diff. + +## 2026-06-07 13:32 + +M6C session 1: grill on the first changeset for template/config separation. User chose to start at +the leaf: simple-value elements (101 elements, e.g. Step, Duration). These wrap a single scalar +value with no attributes and no children. Two sub-variants: XMACRO enum types (use ToString/ +FromString free functions) vs. everything else (use operator<< and .parse()). 
+ +Decisions from the grill: +- Jinja2 for templating (not string.Template or hand-rolled) +- Templates and config in gen/cpp/ (not gen/templates/) +- TOML config is routing-only (lookup tables stay in Python until all consumers are templated) +- No data duplication: Python builds the context dict from existing dicts, passes to Jinja2 +- Jinja2 in a new /opt/gen-venv Docker venv (separate from quality-venv) +- New `make generate` target runs the generator inside Docker + +## 2026-06-07 13:54 + +M6C session 1 implementation: created gen/cpp/ with config.toml, simple_value_h.j2, +simple_value_cpp.j2. Added /opt/gen-venv with Jinja2==3.1.6 to Dockerfile. Added `make generate` +target to Makefile. Modified gen/generate.py: simple-value elements now render via Jinja2 templates +instead of the shared generate_element_h/cpp f-string path. Verified zero diff across all 101 +simple-value elements (202 files). Non-simple-value elements remain on the f-string path unchanged. diff --git a/docs/ai/projects/gen/plan.md b/docs/ai/projects/gen/plan.md index 4c75783d0..91ed43df9 100644 --- a/docs/ai/projects/gen/plan.md +++ b/docs/ai/projects/gen/plan.md @@ -46,7 +46,7 @@ Delivered 2026-06-01: `make gen-quality` and `make gen-lint` Refactor edthe generator into a `parse -> configure -> render` pipeline - see `design/m6b-data-model.md`. -### 6C_CONFIG_FILE (next: not started) +### 6C_CONFIG_FILE (in progress) CARDINAL_RULE: ZERO diff is tolerated in the generated C++ files. Diffs must not affect code outside of `gen/`. @@ -57,6 +57,10 @@ Iteratively stand-up a separation of concerns between - Template configuration - Use of template files instead of python f-strings +Infrastructure: Jinja2 templates in `gen/cpp/`, TOML routing config, `/opt/gen-venv` in Docker, +`make generate` target. Simple-value elements (101) are templated. Lookup tables remain in Python +until all their consumers are templated. 
+ ### 6D++ TBD diff --git a/docs/ai/projects/gen/state.md b/docs/ai/projects/gen/state.md index c0404900a..e7dc5be8a 100644 --- a/docs/ai/projects/gen/state.md +++ b/docs/ai/projects/gen/state.md @@ -2,48 +2,48 @@ ## Milestone -M6B_DATA_MODEL, is done. +M6C_CONFIG_FILE, in progress. -## What the last session did (2026-06-02, M6B session 1) +## What the last session did (2026-06-07, M6C session 1) -Stood up `gen/parse.py` and `gen/ids.py` as a pure internal extraction. Zero C++ diff. See `log.md` -2026-06-02 08:15 for detail. +Grilled the user on the first M6C changeset, then implemented it: -## IMPORTANT correction to the design's self-containment claim +- Created `gen/cpp/` with `config.toml` (routing), `simple_value_h.j2`, `simple_value_cpp.j2` +- Added `/opt/gen-venv` with Jinja2==3.1.6 to the Dockerfile (separate from quality-venv) +- Added `make generate` target (runs `gen/generate.py` inside Docker) +- Modified `gen/generate.py`: simple-value elements (101 elements) now render via Jinja2 templates. + The `_render_simple_value()` function builds a context dict from existing Python lookup tables and + renders the templates. The old fake-CT path through `generate_element_h/cpp` is removed for this + category. +- Verified zero diff across all 202 simple-value files -The design said `generate_enums_h` was "the one current violation" reaching into `model.root`. That -was wrong: there were SIX `model.root` users. Two were general-path and are now migrated into parse -(enum docs, complex-content predicate). **Four are bespoke handlers** (harmony-chord, score-wrapper, -music-data, full-note) and still walk `model.root` directly. So `model.tree` is severed but -`model.root` survives, scoped to bespoke-only, until those families migrate. Do not try to delete -`model.root` until the bespoke families are migrated. +Not yet committed or tested through Docker build / CI. 
The user needs to rebuild the Docker image +(`make generate` will trigger it) and verify the full oracle: +`make generate && make fmt && git diff --quiet src/private/mx/core`. -## What the next session should do (M6C session 21) +## What the next session should do -Get your instructions from the user. +Get instructions from the user. Likely options: +- Continue M6C: template the next element category (text-value, empty, empty-with-attrs, etc.) +- At some point, lookup tables (TYPE_DEFAULT_VALUE, etc.) can move to TOML once all their consumers + are templated ## Oracle (how to prove zero diff) -The committed C++ equals `python3 gen/generate.py && make fmt` - raw generator output is unformatted, -so the `make fmt` step is REQUIRED (the M6B prompt's oracle omitted it). Two ways to check, tightest -first: - -- Raw-output snapshot: `cp -R src/private/mx/core /tmp/core_before` after a clean generate, make your - change, regenerate, then `diff -rq /tmp/core_before src/private/mx/core` must be empty. This is - byte-exact and needs no `make fmt`. -- Committed oracle: `python3 gen/generate.py && make fmt && git diff --quiet src/private/mx/core`. +`make generate && make fmt && git diff --quiet src/private/mx/core` Then `make test-core-dev`. Reset generated C++ before committing: -`git checkout -- src/private/mx/core` (the refactor must change only `gen/*.py`). +`git checkout -- src/private/mx/core` (the refactor must change only `gen/` files). ## Gotchas - `make fmt` (~1 min, Docker) is part of the oracle - the generator emits unformatted C++. -- CI `linux-gate` runs `make gen-quality` (floor 37.7; currently 38.2) and `make gen-lint` (floor - 9.4; currently 9.50). New `gen/*.py` files are scored - keep functions small and add docstrings. +- The generator now requires Jinja2. Running `python3 gen/generate.py` bare requires a Python + environment with `jinja2` and `tomllib` (Python 3.11+). Use `make generate` to run inside Docker. 
+- CI `linux-gate` runs `make gen-quality` (floor 37.7) and `make gen-lint` (floor 9.4). The new + `_render_simple_value` function and imports should be scored normally. - `gen-quality`/`gen-lint` are otherwise ignored during the refactor (user directive) unless CI fails. -- Running `python3 gen/generate.py` works because Python puts `gen/` on `sys.path[0]`, so the bare - `from parse import ...` / `from ids import ...` resolve. -- `node_id` fields are `compare=False` on purpose; keep it that way so adding IDs never perturbs - dataclass equality. +- Jinja2 environment uses `trim_blocks=True` and `lstrip_blocks=True` to avoid extra blank lines + from block tags. Do not use `-%}` suffix on block tags in templates - it eats leading indentation. +- `node_id` fields are `compare=False` on purpose; keep it that way. diff --git a/gen/cpp/config.toml b/gen/cpp/config.toml new file mode 100644 index 000000000..82a70638d --- /dev/null +++ b/gen/cpp/config.toml @@ -0,0 +1,3 @@ +[categories.simple-value] +header_template = "simple_value_h.j2" +impl_template = "simple_value_cpp.j2" diff --git a/gen/cpp/simple_value_cpp.j2 b/gen/cpp/simple_value_cpp.j2 new file mode 100644 index 000000000..3928e6cc7 --- /dev/null +++ b/gen/cpp/simple_value_cpp.j2 @@ -0,0 +1,73 @@ +{{ license }} + +#include "mx/core/elements/{{ class_name }}.h" +#include "mx/core/FromXElement.h" +#include + +namespace mx +{ +namespace core +{ +{{ class_name }}::{{ class_name }}(){{ default_init }} +{ +} + +{{ class_name }}::{{ class_name }}(const {{ value_type }} &value) : myValue(value) +{ +} + +bool {{ class_name }}::hasAttributes() const +{ + return false; +} + +bool {{ class_name }}::hasContents() const +{ + return true; +} + +std::ostream &{{ class_name }}::streamAttributes(std::ostream &os) const +{ + return os; +} + +std::ostream &{{ class_name }}::streamName(std::ostream &os) const +{ + os << "{{ stream_name }}"; + return os; +} + +std::ostream &{{ class_name }}::streamContents(std::ostream &os, const int 
indentLevel, bool &isOneLineOnly) const +{ + MX_UNUSED(indentLevel); + isOneLineOnly = true; +{% if is_xmacro %} + os << {{ value_type }}ToString(myValue); +{% else %} + os << myValue; +{% endif %} + return os; +} + +{{ value_type }} {{ class_name }}::getValue() const +{ + return myValue; +} + +void {{ class_name }}::setValue(const {{ value_type }} &value) +{ + myValue = value; +} + +bool {{ class_name }}::fromXElementImpl(std::ostream &message, ::ezxml::XElement &xelement) +{ + MX_UNUSED(message); +{% if not is_enum %} + MX_UNUSED(xelement); +{% endif %} + {{ parse_call }} + return true; +} + +} // namespace core +} // namespace mx diff --git a/gen/cpp/simple_value_h.j2 b/gen/cpp/simple_value_h.j2 new file mode 100644 index 000000000..38e967adf --- /dev/null +++ b/gen/cpp/simple_value_h.j2 @@ -0,0 +1,64 @@ +{{ license }} + +#pragma once + +{% for inc in project_includes %} +#include "{{ inc }}" +{% endfor %} + +#include +#include +#include + +namespace mx +{ +namespace core +{ + +MX_FORWARD_DECLARE_ELEMENT({{ class_name }}) + +inline {{ class_name }}Ptr make{{ class_name }}() +{ + return std::make_shared<{{ class_name }}>(); +} +{% if is_xmacro %} + +inline {{ class_name }}Ptr make{{ class_name }}({{ value_type }} value) +{ + return std::make_shared<{{ class_name }}>(value); +} +{% else %} + +inline {{ class_name }}Ptr make{{ class_name }}(const {{ value_type }} &value) +{ + return std::make_shared<{{ class_name }}>(value); +} + +inline {{ class_name }}Ptr make{{ class_name }}({{ value_type }} &&value) +{ + return std::make_shared<{{ class_name }}>(std::move(value)); +} +{% endif %} + +class {{ class_name }} : public ElementInterface +{ + public: + {{ class_name }}(); + {{ class_name }}(const {{ value_type }} &value); + + virtual bool hasAttributes() const; + virtual bool hasContents() const; + virtual std::ostream &streamAttributes(std::ostream &os) const; + virtual std::ostream &streamName(std::ostream &os) const; + virtual std::ostream 
&streamContents(std::ostream &os, const int indentLevel, bool &isOneLineOnly) const; + {{ value_type }} getValue() const; + void setValue(const {{ value_type }} &value); + + private: + virtual bool fromXElementImpl(std::ostream &message, ::ezxml::XElement &xelement); + + private: + {{ value_type }} myValue; +}; +} // namespace core +} // namespace mx diff --git a/gen/generate.py b/gen/generate.py index 789aa87a9..f4dc83ade 100644 --- a/gen/generate.py +++ b/gen/generate.py @@ -6,10 +6,13 @@ import os import re import sys +import tomllib from collections import OrderedDict from dataclasses import dataclass, field from typing import Optional +import jinja2 + from parse import ( XS, ChoiceNode, @@ -28,6 +31,18 @@ XSD_PATH = "docs/musicxml.xsd" CORE_DIR = "src/private/mx/core" ELEM_DIR = os.path.join(CORE_DIR, "elements") +CPP_DIR = os.path.join(os.path.dirname(__file__), "cpp") + +with open(os.path.join(CPP_DIR, "config.toml"), "rb") as _f: + CPP_CONFIG = tomllib.load(_f) + +_JINJA_ENV = jinja2.Environment( + loader=jinja2.FileSystemLoader(CPP_DIR), + keep_trailing_newline=True, + trim_blocks=True, + lstrip_blocks=True, + undefined=jinja2.StrictUndefined, +) LICENSE = """\ // MusicXML Class Library @@ -13006,6 +13021,63 @@ def _generate_direction_cpp(class_name, attrs_name, branch_names, multi_branch_n } +def _render_simple_value(elem_name, elem, model): + """Render a simple-value element via Jinja2 templates.""" + cat_cfg = CPP_CONFIG["categories"]["simple-value"] + h_tmpl = _JINJA_ENV.get_template(cat_cfg["header_template"]) + cpp_tmpl = _JINJA_ENV.get_template(cat_cfg["impl_template"]) + + class_name = element_class_name(elem_name) + stream_name = elem_name + + vt_override = ELEMENT_VALUE_TYPE_OVERRIDE.get(elem_name) + if vt_override: + value_type = vt_override["cpp_type"] + else: + value_type = resolve_cpp_type(elem.type_name, model) + + is_xmacro = value_type in XMACRO_ENUM_TYPES + is_enum = is_enum_value_type(value_type) + use_set_value = 
uses_set_value(value_type) + + includes = sorted({"mx/core/ElementInterface.h", "mx/core/ForwardDeclare.h", + header_for_type(value_type)}) + if vt_override: + for extra in vt_override.get("extra_includes", []): + includes = sorted(set(includes) | {extra}) + + default_val = ELEMENT_DEFAULT_VALUE.get( + elem_name, TYPE_DEFAULT_VALUE.get(value_type, "")) + if default_val: + default_init = f" : myValue({default_val})" + else: + default_init = " : myValue()" + + if use_set_value: + parse_call = "myValue.setValue(xelement.getValue());" + elif is_enum: + pfn = parse_func_name(value_type) + parse_call = f"myValue = {pfn}(xelement.getValue());" + else: + parse_call = "myValue.parse(xelement.getValue());" + + ctx = { + "license": LICENSE.rstrip(), + "class_name": class_name, + "stream_name": stream_name, + "value_type": value_type, + "project_includes": includes, + "is_xmacro": is_xmacro, + "is_enum": is_enum, + "default_init": default_init, + "parse_call": parse_call, + } + + h_content = h_tmpl.render(ctx) + cpp_content = cpp_tmpl.render(ctx) + return h_content, cpp_content + + def _parse_config() -> ParseConfig: """Bundle the C++-aware structural-config globals for injection into the parser. 
@@ -13295,8 +13367,15 @@ def main(): else: stats["elem_skipped"] += 1 + elif cat == "simple-value": + h_content, cpp_content = _render_simple_value(elem_name, elem, model) + class_name = element_class_name(elem_name) + write_file(os.path.join(ELEM_DIR, f"{class_name}.h"), h_content) + write_file(os.path.join(ELEM_DIR, f"{class_name}.cpp"), cpp_content) + stats["elem_written"] += 1 + elif cat in ("empty-with-attrs", "text-with-attrs", "complex-with-attrs", - "complex", "text-value", "empty", "simple-value"): + "complex", "text-value", "empty"): class_name = element_class_name(elem_name) stream_name = elem_name @@ -13312,16 +13391,8 @@ def main(): generated_attrs.add(sname) stats["attrs_written"] += 1 - if cat == "simple-value": - value_type = resolve_cpp_type(elem.type_name, model) - fake_ct = XsdComplexType(name=elem.type_name) - fake_ct.has_simple_content = True - fake_ct.simple_content_base = elem.type_name - h_content = generate_element_h(elem_name, class_name, stream_name, cat, fake_ct, model, type_name) - cpp_content = generate_element_cpp(elem_name, class_name, stream_name, cat, fake_ct, model, type_name) - else: - h_content = generate_element_h(elem_name, class_name, stream_name, cat, ct, model, type_name) - cpp_content = generate_element_cpp(elem_name, class_name, stream_name, cat, ct, model, type_name) + h_content = generate_element_h(elem_name, class_name, stream_name, cat, ct, model, type_name) + cpp_content = generate_element_cpp(elem_name, class_name, stream_name, cat, ct, model, type_name) write_file(os.path.join(ELEM_DIR, f"{class_name}.h"), h_content) write_file(os.path.join(ELEM_DIR, f"{class_name}.cpp"), cpp_content) From c1eaf1b59664113ecc7afd31b6061dc335e9cdae Mon Sep 17 00:00:00 2001 From: Matthew James Briggs Date: Sun, 7 Jun 2026 20:16:38 +0200 Subject: [PATCH 06/27] instructions --- docs/ai/projects/gen/prompt.txt | 30 ++++++++++++++++++++++++++++++ 1 file changed, 30 insertions(+) diff --git a/docs/ai/projects/gen/prompt.txt 
b/docs/ai/projects/gen/prompt.txt index e69de29bb..a90f9ff6b 100644 --- a/docs/ai/projects/gen/prompt.txt +++ b/docs/ai/projects/gen/prompt.txt @@ -0,0 +1,30 @@ +/goal + +/project gen + +Conduct 10 rounds of incremental refactoring competitions. + +You are the chief architect who is trying to make the codegen system configurable, extensible, easy +to use, and easy to understand. + +You have subagents who will innovatively and creatively try to create an incremental improvement toward this goal. +These subagents require their own worktree or checkout so they can work in parallel. + +A round begins when you instruct three of these subagents that they are to make an incremental, +reasonable-sized improvement to the extensibility and configurability of this system. Vary your +instructions to each a bit to get good creativity. + +The cardinal rule is that the improvements to `gen/` cannot cause any diff in the C++ code. + +Each of the three subagents reports back to you with the changes they made. You choose the best of +the three for furthering my goals for this project and commit the changes. That is the end of a round. +Start the next round. + +You may reject all three submissions if you feel strongly that the changes were not good. If you do, +the round does not count. The goal is met when 10 rounds resulting in good changes have occurred; there +should be 10 commits in the git history when you are done. + +When you are done, the gen code must not cause any diff in C++ code. *Do not run* and do not allow +the subagents to run C++ builds or tests. Since our cardinal rule is to have zero diff in the C++ +code, there would be no benefit, and compiling and running that code is very expensive (slow).
+ From a2370902d12b092dd5235963984072a99032ebe3 Mon Sep 17 00:00:00 2001 From: Matthew James Briggs Date: Sun, 7 Jun 2026 20:31:28 +0200 Subject: [PATCH 07/27] gen: extract type maps and resolution to type_maps.py --- gen/generate.py | 288 +++-------------------------------------------- gen/type_maps.py | 280 +++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 297 insertions(+), 271 deletions(-) create mode 100644 gen/type_maps.py diff --git a/gen/generate.py b/gen/generate.py index f4dc83ade..a70960ac9 100644 --- a/gen/generate.py +++ b/gen/generate.py @@ -27,6 +27,23 @@ XsdModel, pascal, ) +from type_maps import ( + BESPOKE_TYPES, + NEEDS_PARSE_FUNC_TYPES, + NUMERIC_TYPE_MAP, + SIMPLE_TYPE_TO_CPP, + STRING_LIKE_TYPES, + TYPE_TO_HEADER, + XMACRO_ENUM_TYPES, + XSD_TO_CPP_TYPE, + header_for_type, + is_enum_value_type, + needs_parse_func, + parse_func_name, + resolve_attr_cpp_type, + resolve_cpp_type, + uses_set_value, +) XSD_PATH = "docs/musicxml.xsd" CORE_DIR = "src/private/mx/core" @@ -76,275 +93,9 @@ def has_flag_name(cpp_n: str) -> str: base = cpp_n[:-1] if cpp_n.endswith("_") and cpp_n[:-1] in CPP_KEYWORDS else cpp_n return "has" + base[0].upper() + base[1:] - -# --------------------------------------------------------------------------- -# C++ Type Mapping -# --------------------------------------------------------------------------- - -XSD_TO_CPP_TYPE = { - "xs:string": "XsString", - "xs:token": "XsToken", - "xs:ID": "XsID", - "xs:IDREF": "XsIDREF", - "xs:NMTOKEN": "XsNMToken", - "xs:anyURI": "XsAnyUri", - "xs:decimal": "DecimalType", - "xs:integer": "Integer", - "xs:nonNegativeInteger": "NonNegativeInteger", - "xs:positiveInteger": "PositiveInteger", - "xs:date": "Date", - "xs:time": "TimeOnly", - "xml:lang": "XmlLang", - "xml:space": "XmlSpace", - "xlink:href": "XlinkHref", - "xlink:type": "XlinkType", - "xlink:role": "XlinkRole", - "xlink:title": "XlinkTitle", - "xlink:show": "XlinkShow", - "xlink:actuate": "XlinkActuate", -} - 
-SIMPLE_TYPE_TO_CPP = { - "above-below": "AboveBelow", - "accidental-value": "AccidentalValue", - "backward-forward": "BackwardForward", - "bar-style": "BarStyleEnum", - "beam-value": "BeamValue", - "cancel-location": "CancelLocation", - "clef-sign": "ClefSign", - "css-font-size": "CssFontSize", - "degree-symbol-value": "DegreeSymbolValue", - "degree-type-value": "DegreeTypeValue", - "effect-value": "EffectValue", - "enclosure-shape": "EnclosureShape", - "fan": "Fan", - "fermata-shape": "FermataShape", - "font-style": "FontStyle", - "font-weight": "FontWeight", - "group-barline-value": "GroupBarlineValue", - "group-symbol-value": "GroupSymbolValue", - "handbell-value": "HandbellValue", - "harmony-type": "HarmonyType", - "kind-value": "KindValue", - "left-center-right": "LeftCenterRight", - "left-right": "LeftRight", - "line-end": "LineEnd", - "line-shape": "LineShape", - "line-type": "LineType", - "margin-type": "MarginType", - "measure-numbering-value": "MeasureNumberingValue", - "membrane-value": "MembraneValue", - "metal-value": "MetalValue", - "mute": "MuteEnum", - "notehead-value": "NoteheadValue", - "note-size-type": "NoteSizeType", - "note-type-value": "NoteTypeValue", - "on-off": "OnOff", - "over-under": "OverUnder", - "pitched-value": "PitchedValue", - "placement": "AboveBelow", - "right-left-middle": "RightLeftMiddle", - "semi-pitched": "SemiPitchedEnum", - "show-frets": "ShowFrets", - "show-tuplet": "ShowTuplet", - "staff-type": "StaffTypeEnum", - "start-note": "StartNote", - "start-stop": "StartStop", - "start-stop-change-continue": "StartStopChangeContinue", - "start-stop-continue": "StartStopContinue", - "start-stop-discontinue": "StartStopDiscontinue", - "start-stop-single": "StartStopSingle", - "stem-value": "StemValue", - "step": "StepEnum", - "syllabic": "SyllabicEnum", - "symbol-size": "SymbolSize", - "tap-hand": "TapHand", - "text-direction": "TextDirection", - "tied-type": "TiedType", - "time-relation": "TimeRelationEnum", - "time-symbol": 
"TimeSymbol", - "tip-direction": "TipDirection", - "top-bottom": "TopBottom", - "tremolo-type": "TremoloType", - "trill-step": "TrillStep", - "two-note-turn": "TwoNoteTurn", - "up-down": "UpDown", - "up-down-stop-continue": "UpDownStopContinue", - "upright-inverted": "UprightInverted", - "valign": "Valign", - "valign-image": "ValignImage", - "wedge-type": "WedgeType", - "winged": "Winged", - "wood-value": "WoodValue", - "yes-no": "YesNo", -} - -NUMERIC_TYPE_MAP = { - "accordion-middle": "AccordionMiddleValue", - "beam-level": "BeamLevel", - "divisions": "DivisionsValue", - "fifths": "FifthsValue", - "midi-128": "Midi128", - "midi-16": "Midi16", - "midi-16384": "Midi16384", - "millimeters": "MillimetersValue", - "non-negative-decimal": "NonNegativeDecimal", - "number-level": "NumberLevel", - "number-of-lines": "NumberOfLines", - "octave": "OctaveValue", - "percent": "Percent", - "positive-decimal": "PositiveDecimal", - "positive-divisions": "PositiveDivisionsValue", - "rotation-degrees": "RotationDegrees", - "semitones": "Semitones", - "staff-line": "StaffLine", - "staff-number": "StaffNumber", - "string-number": "StringNumber", - "tenths": "TenthsValue", - "trill-beats": "TrillBeats", - "tremolo-marks": "TremoloMarks", - "byte": "Byte", -} - -BESPOKE_TYPES = { - "color": "Color", - "comma-separated-text": "CommaSeparatedText", - "distance-type": "DistanceType", - "font-size": "FontSize", - "line-width-type": "LineWidthType", - "mode": "ModeValue", - "number-or-normal": "NumberOrNormal", - "positive-integer-or-empty": "PositiveIntegerOrEmpty", - "yes-no-number": "YesNoNumber", - "ending-number": "EndingNumber", - "date": "Date", - "time-only": "TimeOnly", -} - -STRING_LIKE_TYPES = { - "XsString", "XsToken", "XsID", "XsIDREF", "XsNMToken", "XsAnyUri", - "PlaybackSoundType", -} - - -def uses_set_value(cpp_type: str) -> bool: - return cpp_type in STRING_LIKE_TYPES - - -def is_enum_value_type(cpp_type: str) -> bool: - return needs_parse_func(cpp_type) or 
cpp_type.endswith("Enum") or cpp_type in XMACRO_ENUM_TYPES - - -def resolve_cpp_type(xsd_type: str, model: XsdModel) -> str: - if xsd_type in XSD_TO_CPP_TYPE: - return XSD_TO_CPP_TYPE[xsd_type] - if xsd_type.startswith("xs:"): - return XSD_TO_CPP_TYPE.get(xsd_type, "XsString") - if xsd_type in SIMPLE_TYPE_TO_CPP: - return SIMPLE_TYPE_TO_CPP[xsd_type] - if xsd_type in NUMERIC_TYPE_MAP: - return NUMERIC_TYPE_MAP[xsd_type] - if xsd_type in BESPOKE_TYPES: - return BESPOKE_TYPES[xsd_type] - if xsd_type in model.enum_types: - base = pascal(xsd_type) - if base in model.class_names: - return base + "Enum" - return base - if xsd_type in model.simple_types: - st = model.simple_types[xsd_type] - if st["kind"] == "restriction": - return resolve_cpp_type(st["base"], model) - if st["kind"] == "union": - return pascal(xsd_type) - return pascal(xsd_type) - - -def resolve_attr_cpp_type(attr: XsdAttribute, model: XsdModel) -> str: - return resolve_cpp_type(attr.type_name, model) - - ENUM_PARSE_FUNCS = {} -def needs_parse_func(cpp_type: str) -> bool: - return cpp_type in { - "FontStyle", "FontWeight", "AboveBelow", "LeftCenterRight", "Valign", - "ValignImage", "OverUnder", "TopBottom", "EnclosureShape", "StartStop", - "StartStopContinue", "StartStopSingle", "StartStopChangeContinue", - "StartStopDiscontinue", "YesNo", "OnOff", "UpDown", "BackwardForward", - "LineType", "LineShape", "WedgeType", "BarStyleEnum", "Fan", - "TipDirection", "TextDirection", "UprightInverted", "LeftRight", - "RightLeftMiddle", "BeamValue", "AccidentalValue", "ClefSign", - "StemValue", "NoteheadValue", "StepEnum", "Syllabic", "SymbolSize", - "TiedType", "FermataShape", "KindValue", "HarmonyType", - "DegreeTypeValue", "DegreeSymbolValue", "GroupSymbolValue", - "GroupBarlineValue", "MarginType", "TimeSymbol", "CancelLocation", - "ShowTuplet", "NoteTypeValue", "HandbellValue", "EffectValue", - "MetalValue", "WoodValue", "PitchedValue", "MembraneValue", - "SemiPitched", "TapHand", "TimeRelation", "LineEnd", 
"ShowFrets", - "CssFontSize", "MeasureNumberingValue", "StaffTypeEnum", - "StartNote", "TrillStep", "TwoNoteTurn", "Winged", "TremoloType", - "UpDownStopContinue", "NoteSizeType", "MuteEnum", - "BeaterValue", "BreathMarkValue", "HoleClosedValue", - "HoleClosedLocation", "TimeSeparator", "PrincipalVoiceSymbol", - "ModeValue", "XmlSpace", "XlinkType", "XlinkShow", "XlinkActuate", - } - - -def parse_func_name(cpp_type: str) -> str: - if cpp_type in XMACRO_ENUM_TYPES: - return f"{cpp_type}FromString" - return f"parse{cpp_type}" - - -# --------------------------------------------------------------------------- -# Include resolution -# --------------------------------------------------------------------------- - -TYPE_TO_HEADER = { - "XsString": "mx/core/XsString.h", - "XsToken": "mx/core/XsToken.h", - "XsID": "mx/core/XsID.h", - "XsIDREF": "mx/core/XsIDREF.h", - "XsNMToken": "mx/core/XsNMToken.h", - "XsAnyUri": "mx/core/XsAnyUri.h", - "XmlLang": "mx/core/XmlLang.h", - "XlinkHref": "mx/core/XlinkHref.h", - "XlinkRole": "mx/core/XlinkRole.h", - "XlinkTitle": "mx/core/XlinkTitle.h", - "Color": "mx/core/Color.h", - "CommaSeparatedText": "mx/core/CommaSeparatedText.h", - "CommaSeparatedPositiveIntegers": "mx/core/CommaSeparatedPositiveIntegers.h", - "FontSize": "mx/core/FontSize.h", - "NumberOrNormal": "mx/core/NumberOrNormal.h", - "PositiveIntegerOrEmpty": "mx/core/PositiveIntegerOrEmpty.h", - "YesNoNumber": "mx/core/YesNoNumber.h", - "EndingNumber": "mx/core/EndingNumber.h", - "Date": "mx/core/Date.h", - "TimeOnly": "mx/core/TimeOnly.h", - "PlaybackSound": "mx/core/PlaybackSound.h", - "PlaybackSoundType": "mx/core/PlaybackSoundType.h", -} - - -def header_for_type(cpp_type: str) -> str: - if cpp_type in TYPE_TO_HEADER: - return TYPE_TO_HEADER[cpp_type] - if "Decimal" in cpp_type or "Tenths" in cpp_type or "Millimeters" in cpp_type or \ - "Percent" in cpp_type or "Semitones" in cpp_type or "TrillBeats" in cpp_type or \ - "RotationDegrees" in cpp_type or "Divisions" in 
cpp_type: - return "mx/core/Decimals.h" - if any(cpp_type == t for t in [ - "AccordionMiddleValue", "BeamLevel", "Byte", "FifthsValue", "Integer", - "Midi128", "Midi16", "Midi16384", "NonNegativeInteger", "NumberLevel", - "NumberOfLines", "OctaveValue", "PositiveInteger", "StaffLine", - "StaffNumber", "StringNumber", "TremoloMarks", - ]): - return "mx/core/Integers.h" - return "mx/core/Enums.h" - - # --------------------------------------------------------------------------- # Enums.h / Enums.cpp generation # --------------------------------------------------------------------------- @@ -1662,11 +1413,6 @@ def _emit_synthetic_unbounded_helper(lines: list, parent_class: str, }, } -# Types defined via X-macros (EnumWithString.h pattern) that provide -# XxxToString / XxxFromString free functions instead of operator<< / parseXxx. -XMACRO_ENUM_TYPES = { - "PlaybackSound", -} def element_class_name(elem_name: str) -> str: diff --git a/gen/type_maps.py b/gen/type_maps.py new file mode 100644 index 000000000..478084b19 --- /dev/null +++ b/gen/type_maps.py @@ -0,0 +1,280 @@ +#!/usr/bin/env python3 +"""XSD-to-C++ type mapping tables and resolution logic. + +Owns the lookup dicts that map XSD type names to their C++ counterparts in +mx/core, plus helper predicates for serialization pattern selection. Extracted +from generate.py to keep type-mapping concerns in one place and make the tables +easier to locate and extend (e.g. when adding MusicXML 4.0 types). 
+""" + +from parse import pascal + +# --------------------------------------------------------------------------- +# Mapping Tables +# --------------------------------------------------------------------------- + +XSD_TO_CPP_TYPE = { + "xs:string": "XsString", + "xs:token": "XsToken", + "xs:ID": "XsID", + "xs:IDREF": "XsIDREF", + "xs:NMTOKEN": "XsNMToken", + "xs:anyURI": "XsAnyUri", + "xs:decimal": "DecimalType", + "xs:integer": "Integer", + "xs:nonNegativeInteger": "NonNegativeInteger", + "xs:positiveInteger": "PositiveInteger", + "xs:date": "Date", + "xs:time": "TimeOnly", + "xml:lang": "XmlLang", + "xml:space": "XmlSpace", + "xlink:href": "XlinkHref", + "xlink:type": "XlinkType", + "xlink:role": "XlinkRole", + "xlink:title": "XlinkTitle", + "xlink:show": "XlinkShow", + "xlink:actuate": "XlinkActuate", +} + +SIMPLE_TYPE_TO_CPP = { + "above-below": "AboveBelow", + "accidental-value": "AccidentalValue", + "backward-forward": "BackwardForward", + "bar-style": "BarStyleEnum", + "beam-value": "BeamValue", + "cancel-location": "CancelLocation", + "clef-sign": "ClefSign", + "css-font-size": "CssFontSize", + "degree-symbol-value": "DegreeSymbolValue", + "degree-type-value": "DegreeTypeValue", + "effect-value": "EffectValue", + "enclosure-shape": "EnclosureShape", + "fan": "Fan", + "fermata-shape": "FermataShape", + "font-style": "FontStyle", + "font-weight": "FontWeight", + "group-barline-value": "GroupBarlineValue", + "group-symbol-value": "GroupSymbolValue", + "handbell-value": "HandbellValue", + "harmony-type": "HarmonyType", + "kind-value": "KindValue", + "left-center-right": "LeftCenterRight", + "left-right": "LeftRight", + "line-end": "LineEnd", + "line-shape": "LineShape", + "line-type": "LineType", + "margin-type": "MarginType", + "measure-numbering-value": "MeasureNumberingValue", + "membrane-value": "MembraneValue", + "metal-value": "MetalValue", + "mute": "MuteEnum", + "notehead-value": "NoteheadValue", + "note-size-type": "NoteSizeType", + "note-type-value": 
"NoteTypeValue", + "on-off": "OnOff", + "over-under": "OverUnder", + "pitched-value": "PitchedValue", + "placement": "AboveBelow", + "right-left-middle": "RightLeftMiddle", + "semi-pitched": "SemiPitchedEnum", + "show-frets": "ShowFrets", + "show-tuplet": "ShowTuplet", + "staff-type": "StaffTypeEnum", + "start-note": "StartNote", + "start-stop": "StartStop", + "start-stop-change-continue": "StartStopChangeContinue", + "start-stop-continue": "StartStopContinue", + "start-stop-discontinue": "StartStopDiscontinue", + "start-stop-single": "StartStopSingle", + "stem-value": "StemValue", + "step": "StepEnum", + "syllabic": "SyllabicEnum", + "symbol-size": "SymbolSize", + "tap-hand": "TapHand", + "text-direction": "TextDirection", + "tied-type": "TiedType", + "time-relation": "TimeRelationEnum", + "time-symbol": "TimeSymbol", + "tip-direction": "TipDirection", + "top-bottom": "TopBottom", + "tremolo-type": "TremoloType", + "trill-step": "TrillStep", + "two-note-turn": "TwoNoteTurn", + "up-down": "UpDown", + "up-down-stop-continue": "UpDownStopContinue", + "upright-inverted": "UprightInverted", + "valign": "Valign", + "valign-image": "ValignImage", + "wedge-type": "WedgeType", + "winged": "Winged", + "wood-value": "WoodValue", + "yes-no": "YesNo", +} + +NUMERIC_TYPE_MAP = { + "accordion-middle": "AccordionMiddleValue", + "beam-level": "BeamLevel", + "divisions": "DivisionsValue", + "fifths": "FifthsValue", + "midi-128": "Midi128", + "midi-16": "Midi16", + "midi-16384": "Midi16384", + "millimeters": "MillimetersValue", + "non-negative-decimal": "NonNegativeDecimal", + "number-level": "NumberLevel", + "number-of-lines": "NumberOfLines", + "octave": "OctaveValue", + "percent": "Percent", + "positive-decimal": "PositiveDecimal", + "positive-divisions": "PositiveDivisionsValue", + "rotation-degrees": "RotationDegrees", + "semitones": "Semitones", + "staff-line": "StaffLine", + "staff-number": "StaffNumber", + "string-number": "StringNumber", + "tenths": "TenthsValue", + 
"trill-beats": "TrillBeats", + "tremolo-marks": "TremoloMarks", + "byte": "Byte", +} + +BESPOKE_TYPES = { + "color": "Color", + "comma-separated-text": "CommaSeparatedText", + "distance-type": "DistanceType", + "font-size": "FontSize", + "line-width-type": "LineWidthType", + "mode": "ModeValue", + "number-or-normal": "NumberOrNormal", + "positive-integer-or-empty": "PositiveIntegerOrEmpty", + "yes-no-number": "YesNoNumber", + "ending-number": "EndingNumber", + "date": "Date", + "time-only": "TimeOnly", +} + +STRING_LIKE_TYPES = { + "XsString", "XsToken", "XsID", "XsIDREF", "XsNMToken", "XsAnyUri", + "PlaybackSoundType", +} + +XMACRO_ENUM_TYPES = { + "PlaybackSound", +} + +NEEDS_PARSE_FUNC_TYPES = { + "FontStyle", "FontWeight", "AboveBelow", "LeftCenterRight", "Valign", + "ValignImage", "OverUnder", "TopBottom", "EnclosureShape", "StartStop", + "StartStopContinue", "StartStopSingle", "StartStopChangeContinue", + "StartStopDiscontinue", "YesNo", "OnOff", "UpDown", "BackwardForward", + "LineType", "LineShape", "WedgeType", "BarStyleEnum", "Fan", + "TipDirection", "TextDirection", "UprightInverted", "LeftRight", + "RightLeftMiddle", "BeamValue", "AccidentalValue", "ClefSign", + "StemValue", "NoteheadValue", "StepEnum", "Syllabic", "SymbolSize", + "TiedType", "FermataShape", "KindValue", "HarmonyType", + "DegreeTypeValue", "DegreeSymbolValue", "GroupSymbolValue", + "GroupBarlineValue", "MarginType", "TimeSymbol", "CancelLocation", + "ShowTuplet", "NoteTypeValue", "HandbellValue", "EffectValue", + "MetalValue", "WoodValue", "PitchedValue", "MembraneValue", + "SemiPitched", "TapHand", "TimeRelation", "LineEnd", "ShowFrets", + "CssFontSize", "MeasureNumberingValue", "StaffTypeEnum", + "StartNote", "TrillStep", "TwoNoteTurn", "Winged", "TremoloType", + "UpDownStopContinue", "NoteSizeType", "MuteEnum", + "BeaterValue", "BreathMarkValue", "HoleClosedValue", + "HoleClosedLocation", "TimeSeparator", "PrincipalVoiceSymbol", + "ModeValue", "XmlSpace", "XlinkType", "XlinkShow", 
"XlinkActuate", +} + +TYPE_TO_HEADER = { + "XsString": "mx/core/XsString.h", + "XsToken": "mx/core/XsToken.h", + "XsID": "mx/core/XsID.h", + "XsIDREF": "mx/core/XsIDREF.h", + "XsNMToken": "mx/core/XsNMToken.h", + "XsAnyUri": "mx/core/XsAnyUri.h", + "XmlLang": "mx/core/XmlLang.h", + "XlinkHref": "mx/core/XlinkHref.h", + "XlinkRole": "mx/core/XlinkRole.h", + "XlinkTitle": "mx/core/XlinkTitle.h", + "Color": "mx/core/Color.h", + "CommaSeparatedText": "mx/core/CommaSeparatedText.h", + "CommaSeparatedPositiveIntegers": "mx/core/CommaSeparatedPositiveIntegers.h", + "FontSize": "mx/core/FontSize.h", + "NumberOrNormal": "mx/core/NumberOrNormal.h", + "PositiveIntegerOrEmpty": "mx/core/PositiveIntegerOrEmpty.h", + "YesNoNumber": "mx/core/YesNoNumber.h", + "EndingNumber": "mx/core/EndingNumber.h", + "Date": "mx/core/Date.h", + "TimeOnly": "mx/core/TimeOnly.h", + "PlaybackSound": "mx/core/PlaybackSound.h", + "PlaybackSoundType": "mx/core/PlaybackSoundType.h", +} + + +# --------------------------------------------------------------------------- +# Predicates and Resolution +# --------------------------------------------------------------------------- + + +def needs_parse_func(cpp_type: str) -> bool: + return cpp_type in NEEDS_PARSE_FUNC_TYPES + + +def uses_set_value(cpp_type: str) -> bool: + return cpp_type in STRING_LIKE_TYPES + + +def is_enum_value_type(cpp_type: str) -> bool: + return needs_parse_func(cpp_type) or cpp_type.endswith("Enum") or cpp_type in XMACRO_ENUM_TYPES + + +def parse_func_name(cpp_type: str) -> str: + if cpp_type in XMACRO_ENUM_TYPES: + return f"{cpp_type}FromString" + return f"parse{cpp_type}" + + +def resolve_cpp_type(xsd_type: str, model) -> str: + if xsd_type in XSD_TO_CPP_TYPE: + return XSD_TO_CPP_TYPE[xsd_type] + if xsd_type.startswith("xs:"): + return XSD_TO_CPP_TYPE.get(xsd_type, "XsString") + if xsd_type in SIMPLE_TYPE_TO_CPP: + return SIMPLE_TYPE_TO_CPP[xsd_type] + if xsd_type in NUMERIC_TYPE_MAP: + return NUMERIC_TYPE_MAP[xsd_type] + if xsd_type 
in BESPOKE_TYPES: + return BESPOKE_TYPES[xsd_type] + if xsd_type in model.enum_types: + base = pascal(xsd_type) + if base in model.class_names: + return base + "Enum" + return base + if xsd_type in model.simple_types: + st = model.simple_types[xsd_type] + if st["kind"] == "restriction": + return resolve_cpp_type(st["base"], model) + if st["kind"] == "union": + return pascal(xsd_type) + return pascal(xsd_type) + + +def resolve_attr_cpp_type(attr, model) -> str: + return resolve_cpp_type(attr.type_name, model) + + +def header_for_type(cpp_type: str) -> str: + if cpp_type in TYPE_TO_HEADER: + return TYPE_TO_HEADER[cpp_type] + if "Decimal" in cpp_type or "Tenths" in cpp_type or "Millimeters" in cpp_type or \ + "Percent" in cpp_type or "Semitones" in cpp_type or "TrillBeats" in cpp_type or \ + "RotationDegrees" in cpp_type or "Divisions" in cpp_type: + return "mx/core/Decimals.h" + if any(cpp_type == t for t in [ + "AccordionMiddleValue", "BeamLevel", "Byte", "FifthsValue", "Integer", + "Midi128", "Midi16", "Midi16384", "NonNegativeInteger", "NumberLevel", + "NumberOfLines", "OctaveValue", "PositiveInteger", "StaffLine", + "StaffNumber", "StringNumber", "TremoloMarks", + ]): + return "mx/core/Integers.h" + return "mx/core/Enums.h" From 528a8e7bbe5d3bdd6d4585c647897db076179ed3 Mon Sep 17 00:00:00 2001 From: Matthew James Briggs Date: Sun, 7 Jun 2026 20:32:16 +0200 Subject: [PATCH 08/27] gen: extract naming utilities to naming.py --- gen/generate.py | 33 +-------------------------------- gen/naming.py | 36 ++++++++++++++++++++++++++++++++++++ 2 files changed, 37 insertions(+), 32 deletions(-) create mode 100644 gen/naming.py diff --git a/gen/generate.py b/gen/generate.py index a70960ac9..19900a502 100644 --- a/gen/generate.py +++ b/gen/generate.py @@ -27,6 +27,7 @@ XsdModel, pascal, ) +from naming import CPP_KEYWORDS, camel, has_flag_name, pascal_to_camel from type_maps import ( BESPOKE_TYPES, NEEDS_PARSE_FUNC_TYPES, @@ -67,31 +68,6 @@ // Distributed under the MIT License 
""" -CPP_KEYWORDS = { - "continue", "double", "long", "short", "int", "float", "bool", "char", - "class", "struct", "enum", "union", "void", "for", "while", "do", "if", - "else", "switch", "case", "default", "break", "return", "new", "delete", - "this", "true", "false", "const", "static", "virtual", "public", "private", - "protected", "namespace", "using", "template", "typename", "operator", - "and", "or", "not", "xor", "auto", "register", "signed", "unsigned", - "goto", "throw", "try", "catch", "explicit", "string", -} - - -def camel(name: str) -> str: - parts = re.split(r"[-_]", name) - result = parts[0].lower() + "".join(p[:1].upper() + p[1:] for p in parts[1:]) - if result in CPP_KEYWORDS: - result += "_" - return result - - -def has_flag_name(cpp_n: str) -> str: - # The presence flag is built from the unescaped identifier: the value field may - # be keyword-escaped (e.g. 'long_'), but the has-flag must not be ('hasLong', - # not 'hasLong_'). Strip a trailing underscore added by camel() for keywords. - base = cpp_n[:-1] if cpp_n.endswith("_") and cpp_n[:-1] in CPP_KEYWORDS else cpp_n - return "has" + base[0].upper() + base[1:] ENUM_PARSE_FUNCS = {} @@ -7853,13 +7829,6 @@ def _emit_lyric_family(elem_name, elem, ct, model, generated_attrs, stats): # pascal(branch[0]) + 'Or' + pascal(branch[1]). -def pascal_to_camel(pascal_name: str) -> str: - """Convert a PascalCase identifier to camelCase by lowercasing only - the first character. Used for variable names derived from class names - that have no hyphen/underscore separators.""" - if not pascal_name: - return pascal_name - return pascal_name[0].lower() + pascal_name[1:] def _extract_part_list_structure(ct): diff --git a/gen/naming.py b/gen/naming.py new file mode 100644 index 000000000..173b88431 --- /dev/null +++ b/gen/naming.py @@ -0,0 +1,36 @@ +#!/usr/bin/env python3 +"""C++ naming and casing utilities for the code generator. 
+ +Owns the set of C++ reserved keywords and the functions that transform +XSD/hyphenated names into legal C++ identifiers (camelCase, hasFlag, etc.). +""" +import re + +CPP_KEYWORDS = { + "continue", "double", "long", "short", "int", "float", "bool", "char", + "class", "struct", "enum", "union", "void", "for", "while", "do", "if", + "else", "switch", "case", "default", "break", "return", "new", "delete", + "this", "true", "false", "const", "static", "virtual", "public", "private", + "protected", "namespace", "using", "template", "typename", "operator", + "and", "or", "not", "xor", "auto", "register", "signed", "unsigned", + "goto", "throw", "try", "catch", "explicit", "string", +} + + +def camel(name: str) -> str: + parts = re.split(r"[-_]", name) + result = parts[0].lower() + "".join(p[:1].upper() + p[1:] for p in parts[1:]) + if result in CPP_KEYWORDS: + result += "_" + return result + + +def has_flag_name(cpp_n: str) -> str: + base = cpp_n[:-1] if cpp_n.endswith("_") and cpp_n[:-1] in CPP_KEYWORDS else cpp_n + return "has" + base[0].upper() + base[1:] + + +def pascal_to_camel(pascal_name: str) -> str: + if not pascal_name: + return pascal_name + return pascal_name[0].lower() + pascal_name[1:] From 878015a363b753e13d4f0a84964658be68a637c7 Mon Sep 17 00:00:00 2001 From: Matthew James Briggs Date: Sun, 7 Jun 2026 20:33:46 +0200 Subject: [PATCH 09/27] gen: move default-value tables to config.toml --- gen/cpp/config.toml | 65 +++++++++++++++++++++++++++++++++++++++++ gen/generate.py | 70 ++++----------------------------------------- 2 files changed, 71 insertions(+), 64 deletions(-) diff --git a/gen/cpp/config.toml b/gen/cpp/config.toml index 82a70638d..d15d63e16 100644 --- a/gen/cpp/config.toml +++ b/gen/cpp/config.toml @@ -1,3 +1,68 @@ [categories.simple-value] header_template = "simple_value_h.j2" impl_template = "simple_value_cpp.j2" + +# Default constructor values for C++ attribute types (used in Attributes structs). 
+[defaults.attr_type] +FontStyle = "FontStyle::normal" +FontWeight = "FontWeight::normal" +AboveBelow = "AboveBelow::below" +Valign = "Valign::bottom" +ValignImage = "ValignImage::bottom" +LeftCenterRight = "LeftCenterRight::left" +EnclosureShape = "EnclosureShape::none" +YesNo = "YesNo::no" +OnOff = "OnOff::off" +FontSize = "CssFontSize::medium" +StartStop = "StartStop::start" +StartStopContinue = "StartStopContinue::start" +StartStopSingle = "StartStopSingle::single" +LineType = "LineType::solid" +LineShape = "LineShape::straight" +SymbolSize = "SymbolSize::full" + +# Default constructor values for element value types (e.g. enums). +[defaults.value_type] +AccidentalValue = "AccidentalValue::natural" +ArrowDirectionEnum = "ArrowDirectionEnum::up" +ArrowStyleEnum = "ArrowStyleEnum::single" +BarStyleEnum = "BarStyleEnum::regular" +BeamValue = "BeamValue::begin" +BeaterValue = "BeaterValue::snareStick" +BreathMarkValue = "BreathMarkValue::emptystring" +CircularArrowEnum = "CircularArrowEnum::clockwise" +ClefSign = "ClefSign::g" +DegreeTypeValue = "DegreeTypeValue::add" +EffectEnum = "EffectEnum::anvil" +FermataShape = "FermataShape::normal" +GlassEnum = "GlassEnum::windChimes" +GroupBarlineValue = "GroupBarlineValue::yes" +GroupSymbolValue = "GroupSymbolValue::none" +HandbellValue = "HandbellValue::damp" +HoleClosedValue = "HoleClosedValue::no" +KindValue = "KindValue::none" +MeasureNumberingValue = "MeasureNumberingValue::none" +MembraneEnum = "MembraneEnum::snareDrum" +MetalEnum = "MetalEnum::bell" +MuteEnum = "MuteEnum::off" +NoteTypeValue = "NoteTypeValue::eighth" +NoteheadValue = "NoteheadValue::normal" +PitchedEnum = "PitchedEnum::xylophone" +SemiPitchedEnum = "SemiPitchedEnum::medium" +StaffTypeEnum = "StaffTypeEnum::regular" +StemValue = "StemValue::none" +StepEnum = "StepEnum::a" +StickLocationEnum = "StickLocationEnum::center" +StickMaterialEnum = "StickMaterialEnum::medium" +StickTypeEnum = "StickTypeEnum::yarn" +SyllabicEnum = "SyllabicEnum::begin" 
+TimeRelationEnum = "TimeRelationEnum::equals" +WoodEnum = "WoodEnum::claves" +PlaybackSound = "PlaybackSound::keyboardPiano" + +# Default values for specific elements (keyed by element name). +[defaults.element] +type = "NoteTypeValue::quarter" +duration = "1.0" +tremolo = "3" +metronome-relation = '"equals"' diff --git a/gen/generate.py b/gen/generate.py index 19900a502..908877149 100644 --- a/gen/generate.py +++ b/gen/generate.py @@ -349,26 +349,11 @@ def _apply_child_min_occurs_override(elem_name: str, children: list) -> list: return result +ATTR_TYPE_DEFAULTS = CPP_CONFIG["defaults"]["attr_type"] + + def default_value_for_type(cpp_type: str) -> str: - defaults = { - "FontStyle": "FontStyle::normal", - "FontWeight": "FontWeight::normal", - "AboveBelow": "AboveBelow::below", - "Valign": "Valign::bottom", - "ValignImage": "ValignImage::bottom", - "LeftCenterRight": "LeftCenterRight::left", - "EnclosureShape": "EnclosureShape::none", - "YesNo": "YesNo::no", - "OnOff": "OnOff::off", - "FontSize": "CssFontSize::medium", - "StartStop": "StartStop::start", - "StartStopContinue": "StartStopContinue::start", - "StartStopSingle": "StartStopSingle::single", - "LineType": "LineType::solid", - "LineShape": "LineShape::straight", - "SymbolSize": "SymbolSize::full", - } - return defaults.get(cpp_type, "") + return ATTR_TYPE_DEFAULTS.get(cpp_type, "") def generate_attrs_cpp(struct_name: str, attrs: list, model: XsdModel) -> str: @@ -1584,51 +1569,8 @@ def group_class_name(group_name: str) -> str: "defaults", "grouping", "identification", "part-group", "print", } -TYPE_DEFAULT_VALUE = { - "AccidentalValue": "AccidentalValue::natural", - "ArrowDirectionEnum": "ArrowDirectionEnum::up", - "ArrowStyleEnum": "ArrowStyleEnum::single", - "BarStyleEnum": "BarStyleEnum::regular", - "BeamValue": "BeamValue::begin", - "BeaterValue": "BeaterValue::snareStick", - "BreathMarkValue": "BreathMarkValue::emptystring", - "CircularArrowEnum": "CircularArrowEnum::clockwise", - "ClefSign": 
"ClefSign::g", - "DegreeTypeValue": "DegreeTypeValue::add", - "EffectEnum": "EffectEnum::anvil", - "FermataShape": "FermataShape::normal", - "GlassEnum": "GlassEnum::windChimes", - "GroupBarlineValue": "GroupBarlineValue::yes", - "GroupSymbolValue": "GroupSymbolValue::none", - "HandbellValue": "HandbellValue::damp", - "HoleClosedValue": "HoleClosedValue::no", - "KindValue": "KindValue::none", - "MeasureNumberingValue": "MeasureNumberingValue::none", - "MembraneEnum": "MembraneEnum::snareDrum", - "MetalEnum": "MetalEnum::bell", - "MuteEnum": "MuteEnum::off", - "NoteTypeValue": "NoteTypeValue::eighth", - "NoteheadValue": "NoteheadValue::normal", - "PitchedEnum": "PitchedEnum::xylophone", - "SemiPitchedEnum": "SemiPitchedEnum::medium", - "StaffTypeEnum": "StaffTypeEnum::regular", - "StemValue": "StemValue::none", - "StepEnum": "StepEnum::a", - "StickLocationEnum": "StickLocationEnum::center", - "StickMaterialEnum": "StickMaterialEnum::medium", - "StickTypeEnum": "StickTypeEnum::yarn", - "SyllabicEnum": "SyllabicEnum::begin", - "TimeRelationEnum": "TimeRelationEnum::equals", - "WoodEnum": "WoodEnum::claves", - "PlaybackSound": "PlaybackSound::keyboardPiano", -} - -ELEMENT_DEFAULT_VALUE = { - "type": "NoteTypeValue::quarter", - "duration": "1.0", - "tremolo": "3", - "metronome-relation": '"equals"', -} +TYPE_DEFAULT_VALUE = CPP_CONFIG["defaults"]["value_type"] +ELEMENT_DEFAULT_VALUE = CPP_CONFIG["defaults"]["element"] def generate_group_h(group_name: str, children: list, model: XsdModel) -> str: From 89c2a5806885cd5926bf65ff1238258322a21099 Mon Sep 17 00:00:00 2001 From: Matthew James Briggs Date: Sun, 7 Jun 2026 20:34:52 +0200 Subject: [PATCH 10/27] gen: extract behavioral overrides to overrides.py --- gen/generate.py | 107 ++++------------------------------------------- gen/overrides.py | 67 +++++++++++++++++++++++++++++ 2 files changed, 74 insertions(+), 100 deletions(-) create mode 100644 gen/overrides.py diff --git a/gen/generate.py b/gen/generate.py index 
908877149..7e9e1b214 100644 --- a/gen/generate.py +++ b/gen/generate.py @@ -28,6 +28,13 @@ pascal, ) from naming import CPP_KEYWORDS, camel, has_flag_name, pascal_to_camel +from overrides import ( + ATTR_DEFAULT_OVERRIDE, + CHILD_INIT_VALUE_OVERRIDE, + CHILD_MIN_OCCURS_OVERRIDE, + ELEMENT_HAS_CONTENTS_ALWAYS_TRUE, + XMLNS_PRESERVING_ATTRS, +) from type_maps import ( BESPOKE_TYPES, NEEDS_PARSE_FUNC_TYPES, @@ -228,106 +235,6 @@ def generate_attrs_h(struct_name: str, attrs: list, model: XsdModel) -> str: return "\n".join(lines) + "\n" -# Per-attribute default value override, keyed by (attrs struct name, camelCase -# attribute field name). The value is the literal C++ initializer expression -# (e.g. '"it"'). Used when the committed code initializes an attribute to a -# value that is not encoded in the XSD (typically a hand-applied convention). -ATTR_DEFAULT_OVERRIDE = { - # AccidentalText's xml:lang attribute default: hand-applied "it" by the - # original codegen. Test01_AccidentalText asserts that setting hasLang - # without a value yields xml:lang="it". Not in the XSD. - ("AccidentalTextAttributes", "lang"): '"it"', - ("DirectiveAttributes", "lang"): '"it"', - # Lyric's justify default is hand-applied; the XSD says only "The default - # value varies for different elements". The original codegen chose - # 'center' here based on the doc text in the XSD annotation. - ("LyricAttributes", "justify"): "LeftCenterRight::center", - # Score{Partwise,Timewise}Attributes both expose the 'version' attribute - # from the document-attributes group. XSD says default="1.0" but mx/core - # hand-applies "3.0" so that newly-constructed scores serialize with the - # most recent supported version. Schema-driven generators preserve the - # hand-applied value via this override. - ("ScorePartwiseAttributes", "version"): '"3.0"', - ("ScoreTimewiseAttributes", "version"): '"3.0"', - # R4: xml:lang defaults hand-applied as "it" across text-bearing elements. 
- ("WordsAttributes", "lang"): '"it"', - ("TextAttributes", "lang"): '"it"', - ("RehearsalAttributes", "lang"): '"it"', - ("LyricLanguageAttributes", "lang"): '"it"', - ("CreditWordsAttributes", "lang"): '"it"', - # R4: BracketAttributes line-end default is 'down', not first enum 'up'. - ("BracketAttributes", "lineEnd"): "LineEnd::down", - # R4: NoteSizeAttributes type default is 'large', not first enum 'cue'. - ("NoteSizeAttributes", "type"): "NoteSizeType::large", - # R4: EndingAttributes number default is "1". - ("EndingAttributes", "number"): '"1"', - # R4: GroupingAttributes number default is "1". - ("GroupingAttributes", "number"): 'XsToken("1")', - # R4: PageMarginsAttributes type default is 'both', not first enum 'odd'. - ("PageMarginsAttributes", "type"): "MarginType::both", - # R4: LinkAttributes show default is 'replace', not first enum 'new'. - # Field name in struct is 'show', not 'xlinkShow'. - ("LinkAttributes", "show"): "XlinkShow::replace", - # R4: OtherAppearanceAttributes type default is "undefined". - ("OtherAppearanceAttributes", "type"): '"undefined"', - # R4: OtherNotationAttributes type default is 'start', not first enum 'single'. - ("OtherNotationAttributes", "type"): "StartStopSingle::start", - # R4: OtherOrnament/TechnicalAttributes placement: default_value_for_type - # returns AboveBelow::below but the committed code uses the default ctor - # which is AboveBelow::above (= 0). - ("OtherOrnamentAttributes", "placement"): "AboveBelow::above", - ("OtherTechnicalAttributes", "placement"): "AboveBelow::above", - # R4: PrincipalVoiceAttributes symbol default is 'none'. - ("PrincipalVoiceAttributes", "symbol"): "PrincipalVoiceSymbol::none", - # R4: StringMuteAttributes type default is 'on', not 'off'. - ("StringMuteAttributes", "type"): "OnOff::on", - # R4: PartGroupAttributes number default is "1". - ("PartGroupAttributes", "number"): 'XsToken("1")', - # R4: MetronomeAttributes halign/justify defaults are 'center'. 
- ("MetronomeAttributes", "halign"): "LeftCenterRight::center", - ("MetronomeAttributes", "justify"): "LeftCenterRight::center", -} - -# Attribute structs that preserve xmlns:* namespace declarations through -# round-trip. These elements may carry xmlns:xlink or other namespace -# declarations that mx does not model as typed fields, but must not drop. -XMLNS_PRESERVING_ATTRS = { - "ScorePartwiseAttributes", - "ScoreTimewiseAttributes", - "OpusAttributes", - "LinkAttributes", -} - -# Per-(parent-element-xml-name, child-element-xml-name) override for the -# constructor argument passed to make{Child}() when initializing the child -# on the parent's ctor init list. Used when HEAD initializes a required child -# with a non-default value (e.g. historical author choice rather than XSD spec). -CHILD_INIT_VALUE_OVERRIDE = { - # Scaling's millimeters and tenths use non-zero historical defaults. - ("scaling", "millimeters"): "MillimetersValue(7)", - ("scaling", "tenths"): "TenthsValue(40)", - # StaffDetails defaults staff-lines to 5 (author convention, not in XSD). - ("staff-details", "staff-lines"): "NonNegativeInteger(5)", -} - - -# Elements whose hasContents() should always return true regardless of what the -# XSD min/max-occurs analysis would produce. Keyed by element xml-name (not -# class name). Used when the committed HEAD hardcodes `return true;` for an -# element that has only optional children. -ELEMENT_HAS_CONTENTS_ALWAYS_TRUE = { - # MeasureLayout has a single optional child (measure-distance), but HEAD - # returns true unconditionally so that the element serialises as - # rather than . - "measure-layout", -} - -# Per-(element-name, child-xml-name) override for the min_occurs value that the -# generator uses when deciding whether a child needs a myHas flag. Keyed by -# (parent_element_xml_name, child_element_xml_name). 
Use this when XSD group -# inlining propagates minOccurs=0 from the enclosing group to an element that -# HEAD treats as unconditionally present (no getHas/setHas accessors). -CHILD_MIN_OCCURS_OVERRIDE = {} def _apply_child_min_occurs_override(elem_name: str, children: list) -> list: diff --git a/gen/overrides.py b/gen/overrides.py new file mode 100644 index 000000000..6d5a210c5 --- /dev/null +++ b/gen/overrides.py @@ -0,0 +1,67 @@ +#!/usr/bin/env python3 +"""Per-element and per-attribute behavioral overrides for the code generator. + +These tables capture hand-applied decisions from the original codegen that +deviate from what the XSD alone would produce. They exist because mx/core +embeds historical conventions (specific default values, always-true hasContents, +xmlns preservation) that predate this generator. Making them explicit and +separate from the generation logic is a step toward a fully data-driven pipeline. +""" + +# Per-attribute default value override, keyed by (attrs struct name, camelCase +# attribute field name). The value is the literal C++ initializer expression. 
+ATTR_DEFAULT_OVERRIDE = { + ("AccidentalTextAttributes", "lang"): '"it"', + ("DirectiveAttributes", "lang"): '"it"', + ("LyricAttributes", "justify"): "LeftCenterRight::center", + ("ScorePartwiseAttributes", "version"): '"3.0"', + ("ScoreTimewiseAttributes", "version"): '"3.0"', + ("WordsAttributes", "lang"): '"it"', + ("TextAttributes", "lang"): '"it"', + ("RehearsalAttributes", "lang"): '"it"', + ("LyricLanguageAttributes", "lang"): '"it"', + ("CreditWordsAttributes", "lang"): '"it"', + ("BracketAttributes", "lineEnd"): "LineEnd::down", + ("NoteSizeAttributes", "type"): "NoteSizeType::large", + ("EndingAttributes", "number"): '"1"', + ("GroupingAttributes", "number"): 'XsToken("1")', + ("PageMarginsAttributes", "type"): "MarginType::both", + ("LinkAttributes", "show"): "XlinkShow::replace", + ("OtherAppearanceAttributes", "type"): '"undefined"', + ("OtherNotationAttributes", "type"): "StartStopSingle::start", + ("OtherOrnamentAttributes", "placement"): "AboveBelow::above", + ("OtherTechnicalAttributes", "placement"): "AboveBelow::above", + ("PrincipalVoiceAttributes", "symbol"): "PrincipalVoiceSymbol::none", + ("StringMuteAttributes", "type"): "OnOff::on", + ("PartGroupAttributes", "number"): 'XsToken("1")', + ("MetronomeAttributes", "halign"): "LeftCenterRight::center", + ("MetronomeAttributes", "justify"): "LeftCenterRight::center", +} + +# Attribute structs that preserve xmlns:* namespace declarations through +# round-trip (e.g. xmlns:xlink on score-partwise). +XMLNS_PRESERVING_ATTRS = { + "ScorePartwiseAttributes", + "ScoreTimewiseAttributes", + "OpusAttributes", + "LinkAttributes", +} + +# Per-(parent-element, child-element) override for the constructor argument +# passed to make{Child}() on the parent's ctor init list. 
+CHILD_INIT_VALUE_OVERRIDE = { + ("scaling", "millimeters"): "MillimetersValue(7)", + ("scaling", "tenths"): "TenthsValue(40)", + ("staff-details", "staff-lines"): "NonNegativeInteger(5)", +} + +# Elements whose hasContents() should always return true regardless of +# what the XSD min/max-occurs analysis would produce. +ELEMENT_HAS_CONTENTS_ALWAYS_TRUE = { + "measure-layout", +} + +# Per-(element-name, child-xml-name) override for min_occurs. Used when XSD +# group inlining propagates minOccurs=0 from the enclosing group to an element +# that HEAD treats as unconditionally present. +CHILD_MIN_OCCURS_OVERRIDE = {} From 681fafe27e1f6146811986013d635c61ea5bbd76 Mon Sep 17 00:00:00 2001 From: Matthew James Briggs Date: Sun, 7 Jun 2026 20:35:39 +0200 Subject: [PATCH 11/27] gen: extract element dispatch config to element_config.py --- gen/element_config.py | 54 +++++++++++++++++++++++++ gen/generate.py | 93 +++++-------------------------------------- 2 files changed, 63 insertions(+), 84 deletions(-) create mode 100644 gen/element_config.py diff --git a/gen/element_config.py b/gen/element_config.py new file mode 100644 index 000000000..fe2a2f943 --- /dev/null +++ b/gen/element_config.py @@ -0,0 +1,54 @@ +#!/usr/bin/env python3 +"""Element dispatch configuration for the code generator. + +Controls which elements are generated by which path (skip, bespoke family, +tree-based, choice, simple-value, standard), plus class-name and value-type +overrides. This is the file to edit when adding new element-specific behavior +or when migrating elements between generation strategies. 
+""" +from parse import pascal + +OVERWRITE_FILE_STEMS = { + "Direction", "DirectionType", "DirectionAttributes", +} + +ELEMENT_CLASS_NAME_OVERRIDE = { + "attributes": "Properties", +} + +ELEMENT_VALUE_TYPE_OVERRIDE = { + "instrument-sound": { + "cpp_type": "PlaybackSoundType", + "header": "mx/core/PlaybackSoundType.h", + "default": "PlaybackSoundType{}", + }, +} + + +def element_class_name(elem_name: str) -> str: + """Return the C++ class name for an element, consulting overrides first.""" + return ELEMENT_CLASS_NAME_OVERRIDE.get(elem_name, pascal(elem_name)) + + +SKIP_ELEMENTS = set() + +BESPOKE_FAMILY_OWNED = { + "part", + "measure", +} + +TREE_ELEMENTS = { + "bend", + "group-abbreviation-display", + "group-name-display", + "harmonic", + "key", + "metronome", + "notations", + "notehead-text", + "ornaments", + "part-abbreviation-display", + "part-name-display", + "play", + "score-instrument", +} diff --git a/gen/generate.py b/gen/generate.py index 7e9e1b214..03d7248e5 100644 --- a/gen/generate.py +++ b/gen/generate.py @@ -27,6 +27,15 @@ XsdModel, pascal, ) +from element_config import ( + BESPOKE_FAMILY_OWNED, + ELEMENT_CLASS_NAME_OVERRIDE, + ELEMENT_VALUE_TYPE_OVERRIDE, + OVERWRITE_FILE_STEMS, + SKIP_ELEMENTS, + TREE_ELEMENTS, + element_class_name, +) from naming import CPP_KEYWORDS, camel, has_flag_name, pascal_to_camel from overrides import ( ATTR_DEFAULT_OVERRIDE, @@ -1254,90 +1263,6 @@ def _emit_synthetic_unbounded_helper(lines: list, parent_class: str, # Elements whose generated code intentionally replaces the original bespoke # implementation. Diffs are exempt from eval penalty scoring. The set contains -# file stems (PascalCase, no extension) that eval.py matches against. -OVERWRITE_FILE_STEMS = { - "Direction", "DirectionType", "DirectionAttributes", -} - -# Maps XSD element names to C++ class names when they differ from pascal(elem_name). -# The XML stream name remains the original elem_name; only the C++ identifier changes. 
-ELEMENT_CLASS_NAME_OVERRIDE = { - "attributes": "Properties", # XSD 'attributes' -> C++ class 'Properties' -} - -# Override the value type for a simple-value or text-value element. -# Each entry maps elem_name -> dict with: -# cpp_type: the C++ type to use instead of whatever the XSD says -# header: the header file providing that type -# default: the default-value expression for the constructor -# extra_includes: additional headers to include (list of strings) -# The streaming / parsing pattern is inferred from XMACRO_ENUM_TYPES, -# is_enum_value_type, or uses_set_value, just like any other value type. -ELEMENT_VALUE_TYPE_OVERRIDE = { - "instrument-sound": { - "cpp_type": "PlaybackSoundType", - "header": "mx/core/PlaybackSoundType.h", - "default": "PlaybackSoundType{}", - }, -} - - - -def element_class_name(elem_name: str) -> str: - """Return the C++ class name for an element, consulting overrides first.""" - return ELEMENT_CLASS_NAME_OVERRIDE.get(elem_name, pascal(elem_name)) - - -SKIP_ELEMENTS = { - # score-partwise, score-timewise: handled by shared bespoke generator - # _emit_score_wrapper_family, parameterized via SCORE_WRAPPER_FLAVOR_CONFIG. - # Each emits {Outer, set holder, music-data holder} + their attrs structs. - # part, measure: claimed by the score-wrapper-family handler (both - # partwise and timewise dispatch entries claim each name under a - # different class prefix). Listed in BESPOKE_FAMILY_OWNED rather than - # SKIP_ELEMENTS because they ARE fully generated, just not by their - # own dispatch entry. 
- # directive: handled via anonymous_type path (text-with-attrs, anon CT) - # part-list: handled by bespoke generator (PartGroupOrScorePart) - # credit: handled by bespoke generator (CreditChoice + CreditWordsGroup) - # key: handled by tree-based generation - # lyric: handled by bespoke generator (LyricTextChoice + SyllabicTextGroup - # + ElisionSyllabicTextGroup + ElisionSyllabicGroup) - # notations, ornaments: handled by tree-based generation - # part-abbreviation-display, part-name-display: handled by tree-based generation - # score-instrument: handled by tree-based generation (SoloOrEnsembleChoice) - # score-part: handled via UNBOUNDED_SEQUENCE_AS_GROUP -> MidiDeviceInstrumentGroup - # time-modification: handled via synthetic NormalTypeNormalDotGroup - # (anonymous nested optional sequence promoted to a group class) -} - -# Elements whose code is emitted by some other bespoke handler as part of a -# family (e.g. score-partwise's family handler emits PartwisePart and -# PartwiseMeasure too). The main discovery loop must skip these so the default -# path doesn't try to generate competing files, but they are NOT counted as -# skipped because they ARE fully generated -- just not by their own dispatch -# entry. Distinct from SKIP_ELEMENTS which represents elements with no -# generator coverage at all. 
-BESPOKE_FAMILY_OWNED = { - "part", # PartwisePart (partwise) + TimewisePart (timewise) - "measure", # PartwiseMeasure (partwise) + TimewiseMeasure (timewise) -} - -TREE_ELEMENTS = { - "bend", - "group-abbreviation-display", - "group-name-display", - "harmonic", - "key", - "metronome", - "notations", - "notehead-text", - "ornaments", - "part-abbreviation-display", - "part-name-display", - "play", - "score-instrument", -} TREE_ELEMENT_CONFIG = { "group-abbreviation-display": { From cb086e8ee8dddb46d38b9376d222c0b4a75c28ef Mon Sep 17 00:00:00 2001 From: Matthew James Briggs Date: Sun, 7 Jun 2026 20:37:20 +0200 Subject: [PATCH 12/27] gen: move choice/tree config tables to element_config.py --- gen/element_config.py | 143 +++++++++++++++++++++++++++++++++++++ gen/generate.py | 159 ++---------------------------------------- 2 files changed, 147 insertions(+), 155 deletions(-) diff --git a/gen/element_config.py b/gen/element_config.py index fe2a2f943..11eda73e7 100644 --- a/gen/element_config.py +++ b/gen/element_config.py @@ -52,3 +52,146 @@ def element_class_name(elem_name: str) -> str: "play", "score-instrument", } + +TREE_ELEMENT_CONFIG = { + "group-abbreviation-display": { + "choice_class": "DisplayTextOrAccidentalText", + }, + "group-name-display": { + "choice_class": "DisplayTextOrAccidentalText", + }, + "harmonic": { + "inline_choices": [ + {"choice_class": "HarmonicTypeChoice"}, + {"choice_class": "HarmonicInfoChoice"}, + ], + }, + "part-abbreviation-display": { + "choice_class": "DisplayTextOrAccidentalText", + }, + "part-name-display": { + "choice_class": "DisplayTextOrAccidentalText", + "always_has_contents": True, + }, + "notehead-text": { + "choice_class": "NoteheadTextChoice", + "always_has_contents": True, + "seed_choice_set": True, + }, + "play": { + "inlined_choice": True, + }, + "score-instrument": { + "choice_class": "SoloOrEnsembleChoice", + }, + "metronome": { + "choice_class": "BeatUnitPerOrNoteRelationNoteChoice", + "container_names": { + 0: 
"BeatUnitPer", + 1: "NoteRelationNote", + }, + "branch_enum_names": { + 0: "beatUnitPer", + 1: "noteRelationNote", + }, + }, + "key": { + "parent_imports_choice_groups": True, + }, +} + +ENUM_VALUE_CHOICE_CONFIG = { + "dynamics": { + "value_type": "DynamicsValue", + "enum_type": "DynamicsEnum", + "other_variant": "otherDynamics", + "other_xml_name": "other-dynamics", + }, +} + +INLINE_CHOICE_CONFIG = { + "arrow": { + "branches": [ + { + "enum_name": "arrowGroup", + "class_name": "ArrowGroup", + "is_group": True, + "children": [ + {"name": "arrow-direction", "min": 1, "max": 1}, + {"name": "arrow-style", "min": 0, "max": 1}, + ], + }, + { + "enum_name": "circularArrow", + "class_name": "CircularArrow", + "is_group": False, + "element_name": "circular-arrow", + }, + ], + "enum_start": 1, + }, +} + +CHOICE_ELEMENT_CONFIG = { + "articulations": { + "choice_class": "ArticulationsChoice", "is_set": True, "enum_start": 1, + "choice_from_x": "manual", + "choice_stream_start": "mx_unused", "choice_stream_end": None, + "choice_indent_offset": 0, "choice_braces": True, + "parent_from_x": "simple_loop", "parent_return": "bare", + "parent_else_iol": "true", "parent_if_iol": True, + }, + "technical": { + "choice_class": "TechnicalChoice", "is_set": True, "enum_start": 1, + "choice_from_x": "unused", + "choice_stream_start": None, "choice_stream_end": "is_one_line", + "choice_indent_offset": 0, "choice_braces": True, + "parent_from_x": "dispatch", "parent_return": "macro", + "parent_else_iol": "false", "parent_if_iol": False, + }, + "encoding": { + "choice_class": "EncodingChoice", "is_set": True, "enum_start": 1, + "choice_from_x": "macro", + "choice_stream_start": "endl", "choice_stream_end": "is_one_line", + "choice_indent_offset": 1, "choice_braces": True, + "parent_from_x": "simple_loop", "parent_return": "macro", + "parent_else_iol": None, "parent_if_iol": False, + "parent_stream_style": "is_first", + "parent_method_order": "remove_add", + "parent_no_get": True, + }, + 
"percussion": { + "choice_class": "PercussionChoice", "is_set": False, "enum_start": 1, + "choice_is_set": True, + "choice_qualified_ctor": True, + "choice_from_x": "manual_bad", + "choice_stream_start": None, "choice_stream_end": "is_one_line", + "choice_indent_offset": 0, "choice_braces": True, + "parent_from_x": "child_loop", "parent_return": "macro", + "extra_children": ["stick-type", "stick-material"], + "extra_children_after": "stick", + }, + "measure-style": { + "choice_class": "MeasureStyleChoice", "is_set": False, "enum_start": 0, + "choice_from_x": "macro", + "choice_stream_start": "is_one_line", "choice_stream_end": None, + "choice_indent_offset": 1, "choice_braces": False, + "parent_from_x": "for_loop", "parent_return": "macro", + "parent_stream_iol_first": True, + "parent_stream_indent_offset": 0, + }, + "direction-type": { + "choice_class": "DirectionType", "is_set": True, "enum_start": 1, + "skip_parent": True, + "choice_from_x": "unused", + "choice_stream_start": "is_one_line", "choice_stream_end": "is_one_line_endl", + "choice_indent_offset": 1, "choice_braces": True, + }, + "time": { + "choice_class": "TimeChoice", "is_set": False, "enum_start": 0, + "bespoke_choice": True, + "parent_from_x": "time_group", + "parent_stream_iol_last": True, + "first_var_name": "TimeSignature", + }, +} diff --git a/gen/generate.py b/gen/generate.py index 03d7248e5..4e1e21364 100644 --- a/gen/generate.py +++ b/gen/generate.py @@ -29,10 +29,14 @@ ) from element_config import ( BESPOKE_FAMILY_OWNED, + CHOICE_ELEMENT_CONFIG, ELEMENT_CLASS_NAME_OVERRIDE, ELEMENT_VALUE_TYPE_OVERRIDE, + ENUM_VALUE_CHOICE_CONFIG, + INLINE_CHOICE_CONFIG, OVERWRITE_FILE_STEMS, SKIP_ELEMENTS, + TREE_ELEMENT_CONFIG, TREE_ELEMENTS, element_class_name, ) @@ -1264,61 +1268,6 @@ def _emit_synthetic_unbounded_helper(lines: list, parent_class: str, # Elements whose generated code intentionally replaces the original bespoke # implementation. Diffs are exempt from eval penalty scoring. 
The set contains -TREE_ELEMENT_CONFIG = { - "group-abbreviation-display": { - "choice_class": "DisplayTextOrAccidentalText", - }, - "group-name-display": { - "choice_class": "DisplayTextOrAccidentalText", - }, - "harmonic": { - "inline_choices": [ - {"choice_class": "HarmonicTypeChoice"}, - {"choice_class": "HarmonicInfoChoice"}, - ], - }, - "part-abbreviation-display": { - "choice_class": "DisplayTextOrAccidentalText", - }, - "part-name-display": { - "choice_class": "DisplayTextOrAccidentalText", - "always_has_contents": True, - }, - "notehead-text": { - "choice_class": "NoteheadTextChoice", - # HEAD seeds the choice set with one default item (displayText) so that - # hasContents() returns true and the element serialises with content. - "always_has_contents": True, - "seed_choice_set": True, - }, - "play": { - "inlined_choice": True, - }, - "score-instrument": { - "choice_class": "SoloOrEnsembleChoice", - }, - "metronome": { - "choice_class": "BeatUnitPerOrNoteRelationNoteChoice", - "container_names": { - 0: "BeatUnitPer", - 1: "NoteRelationNote", - }, - "branch_enum_names": { - 0: "beatUnitPer", - 1: "noteRelationNote", - }, - }, - # Issues E/F: route choice-group parsing through private member functions - # on the parent element (e.g. Key::importTraditionalKey, - # Key::importNonTraditionalKey) instead of through public importGroup(...) - # overloads in FromXElement.cpp. Each group branch in the parent's choice - # produces one private member: bool import(...). The parent's - # fromXElementImpl dispatches by calling those members; the choice's - # setChoice(...) is performed inside the member body. - "key": { - "parent_imports_choice_groups": True, - }, -} # Populated dynamically by XsdModel._synthesize_optional_group as we discover # anonymous inside parent sequences. 
The names @@ -1759,106 +1708,6 @@ def _emit_group_real_from_x_impl(lines: list, class_name: str, children: list) - CHOICE_SKIP = set() -ENUM_VALUE_CHOICE_CONFIG = { - "dynamics": { - "value_type": "DynamicsValue", - "enum_type": "DynamicsEnum", - "other_variant": "otherDynamics", - "other_xml_name": "other-dynamics", - }, -} - -INLINE_CHOICE_CONFIG = { - "arrow": { - "branches": [ - { - "enum_name": "arrowGroup", - "class_name": "ArrowGroup", - "is_group": True, - "children": [ - {"name": "arrow-direction", "min": 1, "max": 1}, - {"name": "arrow-style", "min": 0, "max": 1}, - ], - }, - { - "enum_name": "circularArrow", - "class_name": "CircularArrow", - "is_group": False, - "element_name": "circular-arrow", - }, - ], - "enum_start": 1, - }, -} - -CHOICE_ELEMENT_CONFIG = { - "articulations": { - "choice_class": "ArticulationsChoice", "is_set": True, "enum_start": 1, - "choice_from_x": "manual", - "choice_stream_start": "mx_unused", "choice_stream_end": None, - "choice_indent_offset": 0, "choice_braces": True, - "parent_from_x": "simple_loop", "parent_return": "bare", - "parent_else_iol": "true", "parent_if_iol": True, - }, - "technical": { - "choice_class": "TechnicalChoice", "is_set": True, "enum_start": 1, - "choice_from_x": "unused", - "choice_stream_start": None, "choice_stream_end": "is_one_line", - "choice_indent_offset": 0, "choice_braces": True, - "parent_from_x": "dispatch", "parent_return": "macro", - "parent_else_iol": "false", "parent_if_iol": False, - }, - "encoding": { - "choice_class": "EncodingChoice", "is_set": True, "enum_start": 1, - "choice_from_x": "macro", - "choice_stream_start": "endl", "choice_stream_end": "is_one_line", - "choice_indent_offset": 1, "choice_braces": True, - "parent_from_x": "simple_loop", "parent_return": "macro", - "parent_else_iol": None, "parent_if_iol": False, - "parent_stream_style": "is_first", - "parent_method_order": "remove_add", - "parent_no_get": True, - }, - "percussion": { - "choice_class": "PercussionChoice", 
"is_set": False, "enum_start": 1, - "choice_is_set": True, - "choice_qualified_ctor": True, - "choice_from_x": "manual_bad", - "choice_stream_start": None, "choice_stream_end": "is_one_line", - "choice_indent_offset": 0, "choice_braces": True, - # The percussion choice (glass | metal | wood | ...) is selected by the - # *child* element of , so the parent must iterate its - # children and hand each to PercussionChoice (which dispatches on the - # child name). "delegate" would pass the element itself to - # the choice, which then rejects 'percussion' as unrecognized. - "parent_from_x": "child_loop", "parent_return": "macro", - "extra_children": ["stick-type", "stick-material"], - "extra_children_after": "stick", - }, - "measure-style": { - "choice_class": "MeasureStyleChoice", "is_set": False, "enum_start": 0, - "choice_from_x": "macro", - "choice_stream_start": "is_one_line", "choice_stream_end": None, - "choice_indent_offset": 1, "choice_braces": False, - "parent_from_x": "for_loop", "parent_return": "macro", - "parent_stream_iol_first": True, - "parent_stream_indent_offset": 0, - }, - "direction-type": { - "choice_class": "DirectionType", "is_set": True, "enum_start": 1, - "skip_parent": True, - "choice_from_x": "unused", - "choice_stream_start": "is_one_line", "choice_stream_end": "is_one_line_endl", - "choice_indent_offset": 1, "choice_braces": True, - }, - "time": { - "choice_class": "TimeChoice", "is_set": False, "enum_start": 0, - "bespoke_choice": True, - "parent_from_x": "time_group", - "parent_stream_iol_last": True, - "first_var_name": "TimeSignature", - }, -} # --------------------------------------------------------------------------- From a256e2e0926cf007790d634397ac638842d2aa81 Mon Sep 17 00:00:00 2001 From: Matthew James Briggs Date: Sun, 7 Jun 2026 20:40:46 +0200 Subject: [PATCH 13/27] gen: extract group config to group_config.py Move mutable sets (SYNTHETIC_OPTIONAL_GROUPS, etc.), static group dicts, WRAPPING_STREAMCONTENTS, and group_class_name() 
into a dedicated module. --- gen/generate.py | 93 ++++++--------------------------------- gen/group_config.py | 103 ++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 115 insertions(+), 81 deletions(-) create mode 100644 gen/group_config.py diff --git a/gen/generate.py b/gen/generate.py index 4e1e21364..6ba496917 100644 --- a/gen/generate.py +++ b/gen/generate.py @@ -48,6 +48,18 @@ ELEMENT_HAS_CONTENTS_ALWAYS_TRUE, XMLNS_PRESERVING_ATTRS, ) +from group_config import ( + EXTENSION_OPTIONAL_GROUP_RENAME, + GENERATE_GROUPS, + NESTED_OPTIONAL_SEQUENCE_AS_GROUP, + SUPPRESS_GROUP_SUFFIX, + SYNTHETIC_OPTIONAL_GROUPS, + SYNTHETIC_UNBOUNDED_GROUP_IMPORT_GROUP_AFTER, + SYNTHETIC_UNBOUNDED_GROUPS, + UNBOUNDED_SEQUENCE_AS_GROUP, + WRAPPING_STREAMCONTENTS, + group_class_name, +) from type_maps import ( BESPOKE_TYPES, NEEDS_PARSE_FUNC_TYPES, @@ -1269,87 +1281,6 @@ def _emit_synthetic_unbounded_helper(lines: list, parent_class: str, # implementation. Diffs are exempt from eval penalty scoring. The set contains -# Populated dynamically by XsdModel._synthesize_optional_group as we discover -# anonymous inside parent sequences. The names -# stored here are the lowercase-hyphenated form (e.g. "normal-type-normal-dot") -# which round-trips through pascal() to produce the synthetic group class -# (e.g. "NormalTypeNormalDotGroup"). -SYNTHETIC_OPTIONAL_GROUPS: set = set() - -# Populated dynamically by XsdModel._synthesize_unbounded_group when we -# discover an anonymous -# inside a parent sequence. The original codegen promoted some of these -# shapes to wrapper group classes used as Sets on the parent -# (e.g. score-part's midi-device + midi-instrument repeating sequence -# becomes MidiDeviceInstrumentGroup, held as a *Set on ScorePart). -SYNTHETIC_UNBOUNDED_GROUPS: set = set() - -# Opt-in: complex types whose anonymous nested -# should be promoted to a synthetic group rather than flattened. The XSD -# permits the same shape in several places (e.g. 
page-layout), but the -# original codegen only chose to promote it in specific spots. The value is -# the hyphenated-lowercase ref name used as the synthetic group's element_name. -NESTED_OPTIONAL_SEQUENCE_AS_GROUP: dict = { - "time-modification": "normal-type-normal-dot", -} - -# Opt-in: when an extending complexType inherits a synthetic optional group -# from its base, the default behavior is to flatten the group's members into -# the extending type. For specific extending types the original codegen -# instead kept the group as a *separately-named wrapper sub-element* with -# its own getHas/setHas accessors. The mapping is -# extending_type_name -> { base_synthetic_group_name -> renamed_wrapper_group_name } -# The renamed group's class name omits the usual "Group" suffix (see -# SUPPRESS_GROUP_SUFFIX), so a child reference to it renders as a regular -# wrapper element on the parent. Its members are still parsed inline like any -# other synthetic optional group (the original hand-written MetronomeTuplet.cpp -# parsed the wrapper with a no-op importElement and dropped normal-type / -# normal-dot on round-trip; that was a bug). -EXTENSION_OPTIONAL_GROUP_RENAME: dict = { - "metronome-tuplet": { - "normal-type-normal-dot": "time-modification-normal-type-normal-dot", - }, -} - -# Group names whose generated class name omits the trailing "Group" suffix. -SUPPRESS_GROUP_SUFFIX: set = set() - - -def group_class_name(group_name: str) -> str: - if group_name in SUPPRESS_GROUP_SUFFIX: - return pascal(group_name) - return pascal(group_name) + "Group" - -# Opt-in: complex types whose anonymous should be promoted to a synthetic unbounded group. -# Mapping parent_type_name -> hyphenated-lowercase synthetic group ref. 
-UNBOUNDED_SEQUENCE_AS_GROUP: dict = { - "score-part": "midi-device-instrument", -} - -# Element names whose generated synthetic-unbounded-group parser body should -# emit an additional importGroup(messsage, iter, endIter, isSuccess, elemPtr) -# call after parsing that element. The original codegen produced this call -# for midi-instrument (a no-op in practice because importGroup(MidiInstrument) -# inspects only sibling iterators that have already been consumed). Kept to -# minimize diff against committed. -SYNTHETIC_UNBOUNDED_GROUP_IMPORT_GROUP_AFTER = { - "midi-instrument", -} - -GENERATE_GROUPS = { - "beat-unit", "display-step-octave", "editorial", "editorial-voice", - "editorial-voice-direction", "layout", "score-header", - # full-note: EXC - real code has FullNoteTypeChoice class - # time-signature: EXC - real code adds Interchangeable not in XSD group - # harmony-chord: EXC - real code has Choice logic not in XSD group def - # music-data: EXC - real code wraps choice in MusicDataChoice class -} - -WRAPPING_STREAMCONTENTS = { - "defaults", "grouping", "identification", "part-group", "print", -} - TYPE_DEFAULT_VALUE = CPP_CONFIG["defaults"]["value_type"] ELEMENT_DEFAULT_VALUE = CPP_CONFIG["defaults"]["element"] diff --git a/gen/group_config.py b/gen/group_config.py new file mode 100644 index 000000000..d6d65e5b9 --- /dev/null +++ b/gen/group_config.py @@ -0,0 +1,103 @@ +#!/usr/bin/env python3 +"""Group structural configuration for the code generator. + +Owns the mutable sets that the XSD parser populates during parsing (passed by +reference via ParseConfig), the static dicts that control group synthesis, and +the group_class_name helper. Also includes WRAPPING_STREAMCONTENTS which +controls streaming behavior for a handful of complex types. 
+""" +from parse import pascal + +# --------------------------------------------------------------------------- +# Mutable sets populated by XsdModel during parsing +# --------------------------------------------------------------------------- + +# Populated dynamically by XsdModel._synthesize_optional_group when we +# discover an anonymous inside a parent sequence +# which round-trips through pascal() to produce the synthetic group class +# (e.g. "NormalTypeNormalDotGroup"). +SYNTHETIC_OPTIONAL_GROUPS: set = set() + +# Populated dynamically by XsdModel._synthesize_unbounded_group when we +# discover an anonymous +# inside a parent sequence. The original codegen promoted some of these +# shapes to wrapper group classes used as Sets on the parent +# (e.g. score-part's midi-device + midi-instrument repeating sequence +# becomes MidiDeviceInstrumentGroup, held as a *Set on ScorePart). +SYNTHETIC_UNBOUNDED_GROUPS: set = set() + +# Group names whose generated class name omits the trailing "Group" suffix. +SUPPRESS_GROUP_SUFFIX: set = set() + +# --------------------------------------------------------------------------- +# Static group configuration dicts +# --------------------------------------------------------------------------- + +# Opt-in: complex types whose anonymous nested +# should be promoted to a synthetic group rather than flattened. The XSD +# permits the same shape in several places (e.g. page-layout), but the +# original codegen only chose to promote it in specific spots. The value is +# the hyphenated-lowercase ref name used as the synthetic group's element_name. +NESTED_OPTIONAL_SEQUENCE_AS_GROUP: dict = { + "time-modification": "normal-type-normal-dot", +} + +# Opt-in: when an extending complexType inherits a synthetic optional group +# from its base, the default behavior is to flatten the group's members into +# the extending type. 
For specific extending types the original codegen +# instead kept the group as a *separately-named wrapper sub-element* with +# its own getHas/setHas accessors. The mapping is +# extending_type_name -> { base_synthetic_group_name -> renamed_wrapper_group_name } +# The renamed group's class name omits the usual "Group" suffix (see +# SUPPRESS_GROUP_SUFFIX), so a child reference to it renders as a regular +# wrapper element on the parent. Its members are still parsed inline like any +# other synthetic optional group (the original hand-written MetronomeTuplet.cpp +# parsed the wrapper with a no-op importElement and dropped normal-type / +# normal-dot on round-trip; that was a bug). +EXTENSION_OPTIONAL_GROUP_RENAME: dict = { + "metronome-tuplet": { + "normal-type-normal-dot": "time-modification-normal-type-normal-dot", + }, +} + +# Opt-in: complex types whose anonymous should be promoted to a synthetic unbounded group. +# Mapping parent_type_name -> hyphenated-lowercase synthetic group ref. +UNBOUNDED_SEQUENCE_AS_GROUP: dict = { + "score-part": "midi-device-instrument", +} + +# Element names whose generated synthetic-unbounded-group parser body should +# emit an additional importGroup(message, iter, endIter, isSuccess, elemPtr) +# call after parsing that element. The original codegen produced this call +# for midi-instrument (a no-op in practice because importGroup(MidiInstrument) +# inspects only sibling iterators that have already been consumed). Kept to +# minimize diff against committed.
+SYNTHETIC_UNBOUNDED_GROUP_IMPORT_GROUP_AFTER = { + "midi-instrument", +} + +GENERATE_GROUPS = { + "beat-unit", "display-step-octave", "editorial", "editorial-voice", + "editorial-voice-direction", "layout", "score-header", + # full-note: EXC - real code has FullNoteTypeChoice class + # time-signature: EXC - real code adds Interchangeable not in XSD group + # harmony-chord: EXC - real code has Choice logic not in XSD group def + # music-data: EXC - real code wraps choice in MusicDataChoice class +} + +# Complex types whose streamContents uses the "wrapping" (forEachChild) pattern +# instead of explicit per-child streaming. +WRAPPING_STREAMCONTENTS = { + "defaults", "grouping", "identification", "part-group", "print", +} + +# --------------------------------------------------------------------------- +# Group class name resolution +# --------------------------------------------------------------------------- + + +def group_class_name(group_name: str) -> str: + if group_name in SUPPRESS_GROUP_SUFFIX: + return pascal(group_name) + return pascal(group_name) + "Group" From ad7c80dbd42f86360ad6cb392a1a85fe76f6f938 Mon Sep 17 00:00:00 2001 From: Matthew James Briggs Date: Sun, 7 Jun 2026 20:42:17 +0200 Subject: [PATCH 14/27] gen: extract attrs naming config to attrs_config.py Move CORE_ROOT_ATTRS, ATTRS_TYPE_ALIAS, ELEMENTS_DIR_SHARED_ATTRS, and resolve_attrs_name() into a dedicated module for attribute struct naming logic. --- gen/attrs_config.py | 39 +++++++++++++++++++++++++++++++++++++++ gen/generate.py | 30 +++++------------------------- 2 files changed, 44 insertions(+), 25 deletions(-) create mode 100644 gen/attrs_config.py diff --git a/gen/attrs_config.py b/gen/attrs_config.py new file mode 100644 index 000000000..4d4f3ed74 --- /dev/null +++ b/gen/attrs_config.py @@ -0,0 +1,39 @@ +#!/usr/bin/env python3 +"""Attribute struct naming configuration for the code generator. 
+ +Controls which attribute structs get type-based (shared) names vs element-based +names, and provides the resolution function used throughout the generator. +""" +from parse import pascal +from element_config import element_class_name + +# Attribute structs that live at the core root level (not in elements/). +CORE_ROOT_ATTRS = { + "EmptyPrintObjectStyleAlignAttributes", +} + +# XSD type name aliases: when the type name for an element matches a key here, +# the aliased type is used for attribute-struct naming purposes. +ATTRS_TYPE_ALIAS = { + "empty-print-style-align": "empty-print-object-style-align", +} + +# Shared attribute structs that live in the elements/ directory but are reused +# across multiple elements (generated once, included by reference). +ELEMENTS_DIR_SHARED_ATTRS = { + "EmptyPlacementAttributes", + "EmptyLineAttributes", + "EmptyTrillSoundAttributes", + "EmptyFontAttributes", + "EmptyPrintStyleAlignAttributes", +} + + +def resolve_attrs_name(elem_name: str, type_name: str, model) -> str: + """Determine the correct attributes struct name for an element. + Some empty-* types use the type name (shared). 
Others use element name.""" + aliased = ATTRS_TYPE_ALIAS.get(type_name, type_name) + type_attrs = pascal(aliased) + "Attributes" + if type_attrs in CORE_ROOT_ATTRS or type_attrs in ELEMENTS_DIR_SHARED_ATTRS: + return type_attrs + return element_class_name(elem_name) + "Attributes" diff --git a/gen/generate.py b/gen/generate.py index 6ba496917..b9fcafe3f 100644 --- a/gen/generate.py +++ b/gen/generate.py @@ -27,6 +27,11 @@ XsdModel, pascal, ) +from attrs_config import ( + CORE_ROOT_ATTRS, + ELEMENTS_DIR_SHARED_ATTRS, + resolve_attrs_name, +) from element_config import ( BESPOKE_FAMILY_OWNED, CHOICE_ELEMENT_CONFIG, @@ -180,31 +185,6 @@ def element_attrs_struct_name(elem_name: str, model: XsdModel) -> str: return element_class_name(elem_name) + "Attributes" -CORE_ROOT_ATTRS = { - "EmptyPrintObjectStyleAlignAttributes", -} - -ATTRS_TYPE_ALIAS = { - "empty-print-style-align": "empty-print-object-style-align", -} - -ELEMENTS_DIR_SHARED_ATTRS = { - "EmptyPlacementAttributes", - "EmptyLineAttributes", - "EmptyTrillSoundAttributes", - "EmptyFontAttributes", - "EmptyPrintStyleAlignAttributes", -} - - -def resolve_attrs_name(elem_name: str, type_name: str, model: XsdModel) -> str: - """Determine the correct attributes struct name for an element. - Some empty-* types use the type name (shared). 
Others use element name.""" - aliased = ATTRS_TYPE_ALIAS.get(type_name, type_name) - type_attrs = pascal(aliased) + "Attributes" - if type_attrs in CORE_ROOT_ATTRS or type_attrs in ELEMENTS_DIR_SHARED_ATTRS: - return type_attrs - return element_class_name(elem_name) + "Attributes" def generate_attrs_h(struct_name: str, attrs: list, model: XsdModel) -> str: From c20982c7509c4834126b143e1d76faed9c01d6da Mon Sep 17 00:00:00 2001 From: Matthew James Briggs Date: Sun, 7 Jun 2026 20:43:30 +0200 Subject: [PATCH 15/27] gen: move remaining config tables to modules GROUPS_WITH_REAL_FROM_X_ELEMENT -> group_config.py DYNAMICS_MARKS, CHOICE_SKIP -> element_config.py --- gen/element_config.py | 9 +++++++++ gen/generate.py | 25 +++---------------------- gen/group_config.py | 11 +++++++++++ 3 files changed, 23 insertions(+), 22 deletions(-) diff --git a/gen/element_config.py b/gen/element_config.py index 11eda73e7..524796951 100644 --- a/gen/element_config.py +++ b/gen/element_config.py @@ -195,3 +195,12 @@ def element_class_name(elem_name: str) -> str: "first_var_name": "TimeSignature", }, } + +DYNAMICS_MARKS = { + "p", "pp", "ppp", "pppp", "ppppp", "pppppp", + "f", "ff", "fff", "ffff", "fffff", "ffffff", + "mp", "mf", "sf", "sfp", "sfpp", "fp", "rf", "rfz", "sfz", "sffz", "fz", + "other-dynamics", +} + +CHOICE_SKIP = set() diff --git a/gen/generate.py b/gen/generate.py index b9fcafe3f..fc3351a38 100644 --- a/gen/generate.py +++ b/gen/generate.py @@ -35,6 +35,8 @@ from element_config import ( BESPOKE_FAMILY_OWNED, CHOICE_ELEMENT_CONFIG, + CHOICE_SKIP, + DYNAMICS_MARKS, ELEMENT_CLASS_NAME_OVERRIDE, ELEMENT_VALUE_TYPE_OVERRIDE, ENUM_VALUE_CHOICE_CONFIG, @@ -56,6 +58,7 @@ from group_config import ( EXTENSION_OPTIONAL_GROUP_RENAME, GENERATE_GROUPS, + GROUPS_WITH_REAL_FROM_X_ELEMENT, NESTED_OPTIONAL_SEQUENCE_AS_GROUP, SUPPRESS_GROUP_SUFFIX, SYNTHETIC_OPTIONAL_GROUPS, @@ -1526,21 +1529,7 @@ def generate_group_cpp(group_name: str, children: list, model: XsdModel) -> str: return 
"\n".join(lines) + "\n" -GROUPS_WITH_REAL_FROM_X_ELEMENT = { - "score-header", - # ArrowGroup is the inline-group branch of the inline-choice element - # (INLINE_CHOICE_CONFIG["arrow"]). Arrow::fromXElementImpl dispatches its - # group branch via myArrowGroup->fromXElement(message, xelement), so the - # group needs a real parsing body. - "arrow", -} - - def _group_needs_real_from_x(group_name: str) -> bool: - # The original codegen emits a real fromXElementImpl body for the - # synthetic optional groups (e.g. NormalTypeNormalDotGroup), even though - # they are never invoked directly by the parent (which inlines its own - # parsing). Preserve that behavior to minimize diff against committed. return (group_name in GROUPS_WITH_REAL_FROM_X_ELEMENT or group_name in SYNTHETIC_OPTIONAL_GROUPS) @@ -1610,14 +1599,6 @@ def _emit_group_real_from_x_impl(lines: list, class_name: str, children: list) - lines.append(" MX_RETURN_IS_SUCCESS;") lines.append("}\n") -DYNAMICS_MARKS = { - "p", "pp", "ppp", "pppp", "ppppp", "pppppp", - "f", "ff", "fff", "ffff", "fffff", "ffffff", - "mp", "mf", "sf", "sfp", "sfpp", "fp", "rf", "rfz", "sfz", "sffz", "fz", - "other-dynamics", -} - -CHOICE_SKIP = set() diff --git a/gen/group_config.py b/gen/group_config.py index d6d65e5b9..19bb2b007 100644 --- a/gen/group_config.py +++ b/gen/group_config.py @@ -92,6 +92,17 @@ "defaults", "grouping", "identification", "part-group", "print", } +# Groups whose generated .cpp includes a real fromXElementImpl body (most +# groups just emit a stub that returns false). +GROUPS_WITH_REAL_FROM_X_ELEMENT = { + "score-header", + # ArrowGroup is the inline-group branch of the inline-choice element + # (INLINE_CHOICE_CONFIG["arrow"]). Arrow::fromXElementImpl dispatches its + # group branch via myArrowGroup->fromXElement(message, xelement), so the + # group needs a real parsing body. 
+ "arrow", +} + # --------------------------------------------------------------------------- # Group class name resolution # --------------------------------------------------------------------------- From 9d3c2ac70430632f8832069687146af8000cc6e6 Mon Sep 17 00:00:00 2001 From: Matthew James Briggs Date: Sun, 7 Jun 2026 20:44:37 +0200 Subject: [PATCH 16/27] gen: extract score wrapper config to score_config.py Move the 50-line SCORE_WRAPPER_FLAVOR_CONFIG dict (partwise vs timewise behavioral knobs) into its own module. --- gen/generate.py | 56 +------------------------------------------ gen/score_config.py | 58 +++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 59 insertions(+), 55 deletions(-) create mode 100644 gen/score_config.py diff --git a/gen/generate.py b/gen/generate.py index fc3351a38..6396d707c 100644 --- a/gen/generate.py +++ b/gen/generate.py @@ -48,6 +48,7 @@ element_class_name, ) from naming import CPP_KEYWORDS, camel, has_flag_name, pascal_to_camel +from score_config import SCORE_WRAPPER_FLAVOR_CONFIG from overrides import ( ATTR_DEFAULT_OVERRIDE, CHILD_INIT_VALUE_OVERRIDE, @@ -8862,61 +8863,6 @@ def _emit_part_list_family(elem_name, elem, ct, model, generated_attrs, stats): # --------------------------------------------------------------------------- -# Per-flavor knobs that capture hand-written variations between the partwise -# and timewise families. Keys are the outer XSD element names. 
-SCORE_WRAPPER_FLAVOR_CONFIG = { - "score-partwise": { - # ScorePartwise.cpp - "outer_extra_includes": [], - "outer_loop_uses_end_var": False, - # PartwiseMeasure (music-data holder) - "music_data_holder_attrs_jit": True, - "music_data_holder_debug_throw": True, - # PartwisePart (set holder) - "set_holder_clear_repushes_default": True, - "set_holder_remove_has_size_guard": True, - "set_holder_post_loop_required": False, - "set_holder_first_flag_name": "isFirstAdded", - "set_holder_use_return_macro": True, - # Loop body style differences (partwise variant). - "set_holder_loop_uses_element_name_var": False, - "set_holder_unexpected_order": "message_first", # message << ...; isSuccess = false; - "set_holder_unexpected_msg": "encountered_quoted", # "...: encountered an unexpected element '...'" - "set_holder_begin_deref_parens": False, # *mySet.begin() vs *(mySet.begin()) - "set_holder_from_x_before_first_check": True, - "set_holder_blank_after_first_decl": False, - "set_holder_blank_inside_else": False, - "set_holder_child_var_source": "xml_name", # "xml_name" => camel(xml); "class_name" => pascal_to_camel(cls) - }, - "score-timewise": { - # ScoreTimewise.cpp - "outer_extra_includes": [ - "ezxml/XElement.h", - "ezxml/XElementIterator.h", - ], - "outer_loop_uses_end_var": True, - # TimewisePart (music-data holder) - "music_data_holder_attrs_jit": False, - "music_data_holder_debug_throw": False, - # TimewiseMeasure (set holder) - "set_holder_clear_repushes_default": False, - "set_holder_remove_has_size_guard": False, - "set_holder_post_loop_required": True, - "set_holder_first_flag_name": "isFirstTimewisePartFound", - "set_holder_use_return_macro": False, - # Loop body style differences (timewise variant). - "set_holder_loop_uses_element_name_var": True, - "set_holder_unexpected_order": "issuccess_first", # isSuccess = false; message << ...; - "set_holder_unexpected_msg": "trailing_encountered", # "...: unexpected element '...' 
encountered" - "set_holder_begin_deref_parens": True, - "set_holder_from_x_before_first_check": True, - "set_holder_blank_after_first_decl": True, - "set_holder_blank_inside_else": True, - "set_holder_child_var_source": "class_name", - }, -} - - def _extract_score_wrapper_structure(model, outer_name): """Walk model.root to recover the full nested structure of a top-level score wrapper. Returns a dict with role-based keys; no XML name is diff --git a/gen/score_config.py b/gen/score_config.py new file mode 100644 index 000000000..80063a8a1 --- /dev/null +++ b/gen/score_config.py @@ -0,0 +1,58 @@ +#!/usr/bin/env python3 +"""Score wrapper (partwise/timewise) flavor configuration. + +Per-flavor knobs that capture hand-written variations between the partwise +and timewise families. Keys are the outer XSD element names. +""" + +SCORE_WRAPPER_FLAVOR_CONFIG = { + "score-partwise": { + # ScorePartwise.cpp + "outer_extra_includes": [], + "outer_loop_uses_end_var": False, + # PartwiseMeasure (music-data holder) + "music_data_holder_attrs_jit": True, + "music_data_holder_debug_throw": True, + # PartwisePart (set holder) + "set_holder_clear_repushes_default": True, + "set_holder_remove_has_size_guard": True, + "set_holder_post_loop_required": False, + "set_holder_first_flag_name": "isFirstAdded", + "set_holder_use_return_macro": True, + # Loop body style differences (partwise variant). 
+ "set_holder_loop_uses_element_name_var": False, + "set_holder_unexpected_order": "message_first", # message << ...; isSuccess = false; + "set_holder_unexpected_msg": "encountered_quoted", # "...: encountered an unexpected element '...'" + "set_holder_begin_deref_parens": False, # *mySet.begin() vs *(mySet.begin()) + "set_holder_from_x_before_first_check": True, + "set_holder_blank_after_first_decl": False, + "set_holder_blank_inside_else": False, + "set_holder_child_var_source": "xml_name", # "xml_name" => camel(xml); "class_name" => pascal_to_camel(cls) + }, + "score-timewise": { + # ScoreTimewise.cpp + "outer_extra_includes": [ + "ezxml/XElement.h", + "ezxml/XElementIterator.h", + ], + "outer_loop_uses_end_var": True, + # TimewisePart (music-data holder) + "music_data_holder_attrs_jit": False, + "music_data_holder_debug_throw": False, + # TimewiseMeasure (set holder) + "set_holder_clear_repushes_default": False, + "set_holder_remove_has_size_guard": False, + "set_holder_post_loop_required": True, + "set_holder_first_flag_name": "isFirstTimewisePartFound", + "set_holder_use_return_macro": False, + # Loop body style differences (timewise variant). + "set_holder_loop_uses_element_name_var": True, + "set_holder_unexpected_order": "issuccess_first", # isSuccess = false; message << ...; + "set_holder_unexpected_msg": "trailing_encountered", # "...: unexpected element '...' 
encountered" + "set_holder_begin_deref_parens": True, + "set_holder_from_x_before_first_check": True, + "set_holder_blank_after_first_decl": True, + "set_holder_blank_inside_else": True, + "set_holder_child_var_source": "class_name", + }, +} From 5806e287c5f239cd8166ac29f8860df8b83d48b8 Mon Sep 17 00:00:00 2001 From: Matthew James Briggs Date: Sun, 7 Jun 2026 20:47:47 +0200 Subject: [PATCH 17/27] gen: move ATTR_DEFAULT_OVERRIDE to config.toml Per-attribute default value overrides are now expressed as nested TOML tables under [overrides.attr_default], loaded at import time by overrides.py into the same tuple-keyed dict. --- gen/cpp/config.toml | 74 +++++++++++++++++++++++++++++++++++++++++++++ gen/overrides.py | 35 ++++++--------------- 2 files changed, 84 insertions(+), 25 deletions(-) diff --git a/gen/cpp/config.toml b/gen/cpp/config.toml index d15d63e16..b64ae85ad 100644 --- a/gen/cpp/config.toml +++ b/gen/cpp/config.toml @@ -66,3 +66,77 @@ type = "NoteTypeValue::quarter" duration = "1.0" tremolo = "3" metronome-relation = '"equals"' + +# Per-attribute default value overrides, keyed by attrs struct name then field. 
+[overrides.attr_default.AccidentalTextAttributes] +lang = '"it"' + +[overrides.attr_default.DirectiveAttributes] +lang = '"it"' + +[overrides.attr_default.LyricAttributes] +justify = "LeftCenterRight::center" + +[overrides.attr_default.ScorePartwiseAttributes] +version = '"3.0"' + +[overrides.attr_default.ScoreTimewiseAttributes] +version = '"3.0"' + +[overrides.attr_default.WordsAttributes] +lang = '"it"' + +[overrides.attr_default.TextAttributes] +lang = '"it"' + +[overrides.attr_default.RehearsalAttributes] +lang = '"it"' + +[overrides.attr_default.LyricLanguageAttributes] +lang = '"it"' + +[overrides.attr_default.CreditWordsAttributes] +lang = '"it"' + +[overrides.attr_default.BracketAttributes] +lineEnd = "LineEnd::down" + +[overrides.attr_default.NoteSizeAttributes] +type = "NoteSizeType::large" + +[overrides.attr_default.EndingAttributes] +number = '"1"' + +[overrides.attr_default.GroupingAttributes] +number = 'XsToken("1")' + +[overrides.attr_default.PageMarginsAttributes] +type = "MarginType::both" + +[overrides.attr_default.LinkAttributes] +show = "XlinkShow::replace" + +[overrides.attr_default.OtherAppearanceAttributes] +type = '"undefined"' + +[overrides.attr_default.OtherNotationAttributes] +type = "StartStopSingle::start" + +[overrides.attr_default.OtherOrnamentAttributes] +placement = "AboveBelow::above" + +[overrides.attr_default.OtherTechnicalAttributes] +placement = "AboveBelow::above" + +[overrides.attr_default.PrincipalVoiceAttributes] +symbol = "PrincipalVoiceSymbol::none" + +[overrides.attr_default.StringMuteAttributes] +type = "OnOff::on" + +[overrides.attr_default.PartGroupAttributes] +number = 'XsToken("1")' + +[overrides.attr_default.MetronomeAttributes] +halign = "LeftCenterRight::center" +justify = "LeftCenterRight::center" diff --git a/gen/overrides.py b/gen/overrides.py index 6d5a210c5..5ca4936b6 100644 --- a/gen/overrides.py +++ b/gen/overrides.py @@ -7,35 +7,20 @@ xmlns preservation) that predate this generator. 
Making them explicit and separate from the generation logic is a step toward a fully data-driven pipeline. """ +import os +import tomllib + +_CPP_DIR = os.path.join(os.path.dirname(__file__), "cpp") +with open(os.path.join(_CPP_DIR, "config.toml"), "rb") as _f: + _CFG = tomllib.load(_f) # Per-attribute default value override, keyed by (attrs struct name, camelCase # attribute field name). The value is the literal C++ initializer expression. +# Source of truth: [overrides.attr_default] in cpp/config.toml. ATTR_DEFAULT_OVERRIDE = { - ("AccidentalTextAttributes", "lang"): '"it"', - ("DirectiveAttributes", "lang"): '"it"', - ("LyricAttributes", "justify"): "LeftCenterRight::center", - ("ScorePartwiseAttributes", "version"): '"3.0"', - ("ScoreTimewiseAttributes", "version"): '"3.0"', - ("WordsAttributes", "lang"): '"it"', - ("TextAttributes", "lang"): '"it"', - ("RehearsalAttributes", "lang"): '"it"', - ("LyricLanguageAttributes", "lang"): '"it"', - ("CreditWordsAttributes", "lang"): '"it"', - ("BracketAttributes", "lineEnd"): "LineEnd::down", - ("NoteSizeAttributes", "type"): "NoteSizeType::large", - ("EndingAttributes", "number"): '"1"', - ("GroupingAttributes", "number"): 'XsToken("1")', - ("PageMarginsAttributes", "type"): "MarginType::both", - ("LinkAttributes", "show"): "XlinkShow::replace", - ("OtherAppearanceAttributes", "type"): '"undefined"', - ("OtherNotationAttributes", "type"): "StartStopSingle::start", - ("OtherOrnamentAttributes", "placement"): "AboveBelow::above", - ("OtherTechnicalAttributes", "placement"): "AboveBelow::above", - ("PrincipalVoiceAttributes", "symbol"): "PrincipalVoiceSymbol::none", - ("StringMuteAttributes", "type"): "OnOff::on", - ("PartGroupAttributes", "number"): 'XsToken("1")', - ("MetronomeAttributes", "halign"): "LeftCenterRight::center", - ("MetronomeAttributes", "justify"): "LeftCenterRight::center", + (struct, field): value + for struct, fields in _CFG["overrides"]["attr_default"].items() + for field, value in fields.items() } 
# Attribute structs that preserve xmlns:* namespace declarations through From f7b2bf10921a13d14ba6da163a5ca0018abd5f0d Mon Sep 17 00:00:00 2001 From: Matthew James Briggs Date: Sun, 7 Jun 2026 20:48:28 +0200 Subject: [PATCH 18/27] gen: move override tables to config.toml XMLNS_PRESERVING_ATTRS, CHILD_INIT_VALUE_OVERRIDE, and ELEMENT_HAS_CONTENTS_ALWAYS_TRUE now live in TOML under [overrides], loaded by overrides.py at import time. --- gen/cpp/config.toml | 21 +++++++++++++++++++++ gen/overrides.py | 28 ++++++++++------------------ 2 files changed, 31 insertions(+), 18 deletions(-) diff --git a/gen/cpp/config.toml b/gen/cpp/config.toml index b64ae85ad..657e6fa36 100644 --- a/gen/cpp/config.toml +++ b/gen/cpp/config.toml @@ -67,6 +67,27 @@ duration = "1.0" tremolo = "3" metronome-relation = '"equals"' +# Attribute structs that preserve xmlns:* namespace declarations through +# round-trip (e.g. xmlns:xlink on score-partwise). +[overrides] +xmlns_preserving_attrs = [ + "ScorePartwiseAttributes", + "ScoreTimewiseAttributes", + "OpusAttributes", + "LinkAttributes", +] + +# Elements whose hasContents() should always return true. +has_contents_always_true = ["measure-layout"] + +# Per-(parent-element, child-element) ctor argument overrides. +[overrides.child_init_value.scaling] +millimeters = "MillimetersValue(7)" +tenths = "TenthsValue(40)" + +[overrides.child_init_value.staff-details] +staff-lines = "NonNegativeInteger(5)" + # Per-attribute default value overrides, keyed by attrs struct name then field. [overrides.attr_default.AccidentalTextAttributes] lang = '"it"' diff --git a/gen/overrides.py b/gen/overrides.py index 5ca4936b6..4deb73c1f 100644 --- a/gen/overrides.py +++ b/gen/overrides.py @@ -23,28 +23,20 @@ for field, value in fields.items() } -# Attribute structs that preserve xmlns:* namespace declarations through -# round-trip (e.g. xmlns:xlink on score-partwise). 
-XMLNS_PRESERVING_ATTRS = { - "ScorePartwiseAttributes", - "ScoreTimewiseAttributes", - "OpusAttributes", - "LinkAttributes", -} +# Source of truth: [overrides] in cpp/config.toml. +XMLNS_PRESERVING_ATTRS = set(_CFG["overrides"]["xmlns_preserving_attrs"]) -# Per-(parent-element, child-element) override for the constructor argument -# passed to make{Child}() on the parent's ctor init list. +# Source of truth: [overrides.child_init_value] in cpp/config.toml. CHILD_INIT_VALUE_OVERRIDE = { - ("scaling", "millimeters"): "MillimetersValue(7)", - ("scaling", "tenths"): "TenthsValue(40)", - ("staff-details", "staff-lines"): "NonNegativeInteger(5)", + (parent, child): value + for parent, children in _CFG["overrides"]["child_init_value"].items() + for child, value in children.items() } -# Elements whose hasContents() should always return true regardless of -# what the XSD min/max-occurs analysis would produce. -ELEMENT_HAS_CONTENTS_ALWAYS_TRUE = { - "measure-layout", -} +# Source of truth: [overrides] has_contents_always_true in cpp/config.toml. +ELEMENT_HAS_CONTENTS_ALWAYS_TRUE = set( + _CFG["overrides"]["has_contents_always_true"] +) # Per-(element-name, child-xml-name) override for min_occurs. Used when XSD # group inlining propagates minOccurs=0 from the enclosing group to an element From e85e416f02bae111188d73d902dbd25152546e9d Mon Sep 17 00:00:00 2001 From: Matthew James Briggs Date: Sun, 7 Jun 2026 20:49:33 +0200 Subject: [PATCH 19/27] gen: jinja2 template for group header generation Replace 55-line generate_group_h f-string body with a group_h.j2 template, registered in config.toml under [categories.group]. 
--- gen/cpp/config.toml | 3 ++ gen/cpp/group_h.j2 | 77 +++++++++++++++++++++++++++++++++++++++++ gen/generate.py | 83 ++++++++------------------------------------- 3 files changed, 94 insertions(+), 69 deletions(-) create mode 100644 gen/cpp/group_h.j2 diff --git a/gen/cpp/config.toml b/gen/cpp/config.toml index 657e6fa36..8c45a8b61 100644 --- a/gen/cpp/config.toml +++ b/gen/cpp/config.toml @@ -2,6 +2,9 @@ header_template = "simple_value_h.j2" impl_template = "simple_value_cpp.j2" +[categories.group] +header_template = "group_h.j2" + # Default constructor values for C++ attribute types (used in Attributes structs). [defaults.attr_type] FontStyle = "FontStyle::normal" diff --git a/gen/cpp/group_h.j2 b/gen/cpp/group_h.j2 new file mode 100644 index 000000000..cb7e6fa7b --- /dev/null +++ b/gen/cpp/group_h.j2 @@ -0,0 +1,77 @@ +{{ license }} + +#pragma once + +#include "mx/core/ElementInterface.h" +#include "mx/core/ForwardDeclare.h" + +#include +#include +#include + +namespace mx +{ +namespace core +{ + +{% for cc in forward_declares %} +MX_FORWARD_DECLARE_ELEMENT({{ cc }}) +{% endfor %} +MX_FORWARD_DECLARE_ELEMENT({{ class_name }}) + +inline {{ class_name }}Ptr make{{ class_name }}() +{ + return std::make_shared<{{ class_name }}>(); +} + +class {{ class_name }} : public ElementInterface +{ + public: + {{ class_name }}(); + + virtual bool hasAttributes() const; + virtual std::ostream &streamAttributes(std::ostream &os) const; + virtual std::ostream &streamName(std::ostream &os) const; + virtual bool hasContents() const; + virtual std::ostream &streamContents(std::ostream &os, const int indentLevel, bool &isOneLineOnly) const; +{% for child in children %} +{% if child.max_occurs != 1 %} + + /* _________ {{ child.cc }} minOccurs = {{ child.min_occurs }}, maxOccurs = unbounded _________ */ + const {{ child.cc }}Set &get{{ child.cc }}Set() const; + void add{{ child.cc }}(const {{ child.cc }}Ptr &value); + void remove{{ child.cc }}(const {{ child.cc }}SetIterConst &value); + 
void clear{{ child.cc }}Set(); + {{ child.cc }}Ptr get{{ child.cc }}(const {{ child.cc }}SetIterConst &setIterator) const; +{% elif child.min_occurs == 0 %} + + /* _________ {{ child.cc }} minOccurs = 0, maxOccurs = 1 _________ */ + {{ child.cc }}Ptr get{{ child.cc }}() const; + void set{{ child.cc }}(const {{ child.cc }}Ptr &value); + bool getHas{{ child.cc }}() const; + void setHas{{ child.cc }}(const bool value); +{% else %} + + /* _________ {{ child.cc }} minOccurs = 1, maxOccurs = 1 _________ */ + {{ child.cc }}Ptr get{{ child.cc }}() const; + void set{{ child.cc }}(const {{ child.cc }}Ptr &value); +{% endif %} +{% endfor %} + + private: + virtual bool fromXElementImpl(std::ostream &message, ::ezxml::XElement &xelement); + + private: +{% for child in children %} +{% if child.max_occurs != 1 %} + {{ child.cc }}Set my{{ child.cc }}Set; +{% else %} + {{ child.cc }}Ptr my{{ child.cc }}; +{% if child.min_occurs == 0 %} + bool myHas{{ child.cc }}; +{% endif %} +{% endif %} +{% endfor %} +}; +} // namespace core +} // namespace mx diff --git a/gen/generate.py b/gen/generate.py index 6396d707c..b7b415715 100644 --- a/gen/generate.py +++ b/gen/generate.py @@ -1271,76 +1271,21 @@ def _emit_synthetic_unbounded_helper(lines: list, parent_class: str, def generate_group_h(group_name: str, children: list, model: XsdModel) -> str: class_name = group_class_name(group_name) - - lines = [LICENSE, "#pragma once\n"] - lines.append('#include "mx/core/ElementInterface.h"') - lines.append('#include "mx/core/ForwardDeclare.h"') - lines.append("") - lines.append("#include ") - lines.append("#include ") - lines.append("#include ") - lines.append("") - lines.append("namespace mx\n{\nnamespace core\n{\n") - child_classes = [child_class_name(c) for c in children] - for cc in sorted(set(child_classes)): - lines.append(f"MX_FORWARD_DECLARE_ELEMENT({cc})") - lines.append(f"MX_FORWARD_DECLARE_ELEMENT({class_name})\n") - - lines.append(f"inline {class_name}Ptr make{class_name}()") - 
lines.append("{") - lines.append(f" return std::make_shared<{class_name}>();") - lines.append("}") - - lines.append(f"\nclass {class_name} : public ElementInterface") - lines.append("{") - lines.append(" public:") - lines.append(f" {class_name}();") - lines.append("") - lines.append(" virtual bool hasAttributes() const;") - lines.append(" virtual std::ostream &streamAttributes(std::ostream &os) const;") - lines.append(" virtual std::ostream &streamName(std::ostream &os) const;") - lines.append(" virtual bool hasContents() const;") - lines.append(" virtual std::ostream &streamContents(std::ostream &os, const int indentLevel, bool &isOneLineOnly) const;") - - for child in children: - cc = child_class_name(child) - if child.max_occurs != 1: - lines.append(f"\n /* _________ {cc} minOccurs = {child.min_occurs}, maxOccurs = unbounded _________ */") - lines.append(f" const {cc}Set &get{cc}Set() const;") - lines.append(f" void add{cc}(const {cc}Ptr &value);") - lines.append(f" void remove{cc}(const {cc}SetIterConst &value);") - lines.append(f" void clear{cc}Set();") - lines.append(f" {cc}Ptr get{cc}(const {cc}SetIterConst &setIterator) const;") - elif child.min_occurs == 0: - lines.append(f"\n /* _________ {cc} minOccurs = 0, maxOccurs = 1 _________ */") - lines.append(f" {cc}Ptr get{cc}() const;") - lines.append(f" void set{cc}(const {cc}Ptr &value);") - lines.append(f" bool getHas{cc}() const;") - lines.append(f" void setHas{cc}(const bool value);") - else: - lines.append(f"\n /* _________ {cc} minOccurs = 1, maxOccurs = 1 _________ */") - lines.append(f" {cc}Ptr get{cc}() const;") - lines.append(f" void set{cc}(const {cc}Ptr &value);") - - lines.append("") - lines.append(" private:") - lines.append(" virtual bool fromXElementImpl(std::ostream &message, ::ezxml::XElement &xelement);") - lines.append("") - lines.append(" private:") - for child in children: - cc = child_class_name(child) - if child.max_occurs != 1: - lines.append(f" {cc}Set my{cc}Set;") - else: - 
lines.append(f" {cc}Ptr my{cc};") - if child.min_occurs == 0: - lines.append(f" bool myHas{cc};") - - lines.append("};") - lines.append("} // namespace core") - lines.append("} // namespace mx") - return "\n".join(lines) + "\n" + forward_declares = sorted(set(child_classes)) + + tmpl = _JINJA_ENV.get_template( + CPP_CONFIG["categories"]["group"]["header_template"]) + return tmpl.render( + license=LICENSE, + class_name=class_name, + forward_declares=forward_declares, + children=[ + {"cc": child_class_name(c), "min_occurs": c.min_occurs, + "max_occurs": c.max_occurs} + for c in children + ], + ) def generate_group_cpp(group_name: str, children: list, model: XsdModel) -> str: From d7717d657e55401dd5e8e930629b4d175cdd9883 Mon Sep 17 00:00:00 2001 From: Matthew James Briggs Date: Sun, 7 Jun 2026 20:51:13 +0200 Subject: [PATCH 20/27] gen: jinja2 template for group impl generation Replace 185-line generate_group_cpp f-string body with a group_cpp.j2 template. Pre-renders ctor init list and fromXElementImpl (complex line-wrapping logic) in Python, passes the rest as structured data to the template. --- gen/cpp/config.toml | 1 + gen/cpp/group_cpp.j2 | 173 ++++++++++++++++++++++++++++++++++++++++ gen/generate.py | 185 +++++++------------------------------------ 3 files changed, 202 insertions(+), 157 deletions(-) create mode 100644 gen/cpp/group_cpp.j2 diff --git a/gen/cpp/config.toml b/gen/cpp/config.toml index 8c45a8b61..7aa16176e 100644 --- a/gen/cpp/config.toml +++ b/gen/cpp/config.toml @@ -4,6 +4,7 @@ impl_template = "simple_value_cpp.j2" [categories.group] header_template = "group_h.j2" +impl_template = "group_cpp.j2" # Default constructor values for C++ attribute types (used in Attributes structs). 
[defaults.attr_type] diff --git a/gen/cpp/group_cpp.j2 b/gen/cpp/group_cpp.j2 new file mode 100644 index 000000000..99af44efd --- /dev/null +++ b/gen/cpp/group_cpp.j2 @@ -0,0 +1,173 @@ +{{ license }} + +#include "mx/core/elements/{{ class_name }}.h" +#include "mx/core/FromXElement.h" +{% for inc in child_includes %} +#include "mx/core/elements/{{ inc }}.h" +{% endfor %} +#include + +namespace mx +{ +namespace core +{ +{{ ctor_init }} +{ +} + +bool {{ class_name }}::hasAttributes() const +{ + return false; +} + +std::ostream &{{ class_name }}::streamAttributes(std::ostream &os) const +{ + return os; +} + +std::ostream &{{ class_name }}::streamName(std::ostream &os) const +{ + return os; +} + +bool {{ class_name }}::hasContents() const +{ +{% if has_contents_always_true %} + return true; +{% elif has_contents_parts %} + return {{ has_contents_parts | join(' || ') }}; +{% else %} + return false; +{% endif %} +} + +std::ostream &{{ class_name }}::streamContents(std::ostream &os, const int indentLevel, bool &isOneLineOnly) const +{ +{% if all_optional %} + bool firstItem = true; + isOneLineOnly = true; +{% for child in children %} + if (myHas{{ child.cc }}) + { + if (!firstItem) + os << std::endl; + my{{ child.cc }}->toStream(os, indentLevel); + firstItem = false; + } +{% endfor %} +{% else %} + bool isFirst = true; +{% for child in children %} +{% if child.max_occurs != 1 %} + for (auto x : my{{ child.cc }}Set) + { + if (!isFirst) + os << std::endl; + x->toStream(os, indentLevel); + isFirst = false; + } +{% elif child.min_occurs == 0 %} + if (myHas{{ child.cc }}) + { + if (!isFirst) + os << std::endl; + my{{ child.cc }}->toStream(os, indentLevel); + isFirst = false; + } +{% else %} + if (!isFirst) + os << std::endl; + my{{ child.cc }}->toStream(os, indentLevel); + isFirst = false; +{% endif %} +{% endfor %} +{% endif %} + isOneLineOnly = !hasContents(); + return os; +} + +{% for child in children %} +{% if child.max_occurs != 1 %} +const {{ child.cc }}Set &{{ 
class_name }}::get{{ child.cc }}Set() const +{ + return my{{ child.cc }}Set; +} + +void {{ class_name }}::remove{{ child.cc }}(const {{ child.cc }}SetIterConst &value) +{ + if (value != my{{ child.cc }}Set.cend()) + { + my{{ child.cc }}Set.erase(value); + } +} + +void {{ class_name }}::add{{ child.cc }}(const {{ child.cc }}Ptr &value) +{ + if (value) + { + my{{ child.cc }}Set.push_back(value); + } +} + +void {{ class_name }}::clear{{ child.cc }}Set() +{ + my{{ child.cc }}Set.clear(); +} + +{{ child.cc }}Ptr {{ class_name }}::get{{ child.cc }}(const {{ child.cc }}SetIterConst &setIterator) const +{ + if (setIterator != my{{ child.cc }}Set.cend()) + { + return *setIterator; + } + return {{ child.cc }}Ptr(); +} + +{% elif child.min_occurs == 0 %} +{{ child.cc }}Ptr {{ class_name }}::get{{ child.cc }}() const +{ + return my{{ child.cc }}; +} + +void {{ class_name }}::set{{ child.cc }}(const {{ child.cc }}Ptr &value) +{ + if (value) + { + my{{ child.cc }} = value; + } +} + +bool {{ class_name }}::getHas{{ child.cc }}() const +{ + return myHas{{ child.cc }}; +} + +void {{ class_name }}::setHas{{ child.cc }}(const bool value) +{ + myHas{{ child.cc }} = value; +} + +{% else %} +{{ child.cc }}Ptr {{ class_name }}::get{{ child.cc }}() const +{ + return my{{ child.cc }}; +} + +void {{ class_name }}::set{{ child.cc }}(const {{ child.cc }}Ptr &value) +{ + if (value) + { + my{{ child.cc }} = value; + } +} + +{% endif %} +{% endfor %} +{% if from_x_impl %} +{{ from_x_impl }} +{% else %} +MX_FROM_XELEMENT_UNUSED({{ class_name }}); + +{% endif %} +} // namespace core +} // namespace mx diff --git a/gen/generate.py b/gen/generate.py index b7b415715..3a07b82f0 100644 --- a/gen/generate.py +++ b/gen/generate.py @@ -1290,18 +1290,7 @@ def generate_group_h(group_name: str, children: list, model: XsdModel) -> str: def generate_group_cpp(group_name: str, children: list, model: XsdModel) -> str: class_name = group_class_name(group_name) - - lines = [LICENSE] - lines.append(f'#include 
"mx/core/elements/{class_name}.h"') - lines.append('#include "mx/core/FromXElement.h"') - child_includes = sorted(set( - f'#include "mx/core/elements/{child_class_name(c)}.h"' - for c in children - )) - for inc in child_includes: - lines.append(inc) - lines.append("#include \n") - lines.append("namespace mx\n{\nnamespace core\n{") + child_includes = sorted(set(child_class_name(c) for c in children)) init_parts = [] for child in children: @@ -1313,27 +1302,10 @@ def generate_group_cpp(group_name: str, children: list, model: XsdModel) -> str: if child.min_occurs == 0: init_parts.append(f"myHas{cc}(false)") - _emit_ctor_init(lines, f"{class_name}::{class_name}()", init_parts) - lines.append("{") - lines.append("}\n") - - lines.append(f"bool {class_name}::hasAttributes() const") - lines.append("{") - lines.append(" return false;") - lines.append("}\n") - - lines.append(f"std::ostream &{class_name}::streamAttributes(std::ostream &os) const") - lines.append("{") - lines.append(" return os;") - lines.append("}\n") + ctor_lines = [] + _emit_ctor_init(ctor_lines, f"{class_name}::{class_name}()", init_parts) + ctor_init = "\n".join(ctor_lines) - lines.append(f"std::ostream &{class_name}::streamName(std::ostream &os) const") - lines.append("{") - lines.append(" return os;") - lines.append("}\n") - - lines.append(f"bool {class_name}::hasContents() const") - lines.append("{") parts = [] for child in children: cc = child_class_name(child) @@ -1343,136 +1315,35 @@ def generate_group_cpp(group_name: str, children: list, model: XsdModel) -> str: parts.append(f"myHas{cc}") else: parts.append("true") - if any("true" == p for p in parts): - lines.append(" return true;") - elif parts: - lines.append(f" return {' || '.join(parts)};") - else: - lines.append(" return false;") - lines.append("}\n") - lines.append(f"std::ostream &{class_name}::streamContents(std::ostream &os, const int indentLevel, bool &isOneLineOnly) const") - lines.append("{") - all_optional = all(c.min_occurs == 0 and 
c.max_occurs == 1 for c in children) - if all_optional: - lines.append(" bool firstItem = true;") - lines.append(" isOneLineOnly = true;") - for child in children: - cc = child_class_name(child) - lines.append(f" if (myHas{cc})") - lines.append(" {") - lines.append(" if (!firstItem)") - lines.append(" os << std::endl;") - lines.append(f" my{cc}->toStream(os, indentLevel);") - lines.append(" firstItem = false;") - lines.append(" }") - else: - lines.append(" bool isFirst = true;") - for child in children: - cc = child_class_name(child) - if child.max_occurs != 1: - lines.append(f" for (auto x : my{cc}Set)") - lines.append(" {") - lines.append(" if (!isFirst)") - lines.append(" os << std::endl;") - lines.append(" x->toStream(os, indentLevel);") - lines.append(" isFirst = false;") - lines.append(" }") - elif child.min_occurs == 0: - lines.append(f" if (myHas{cc})") - lines.append(" {") - lines.append(" if (!isFirst)") - lines.append(" os << std::endl;") - lines.append(f" my{cc}->toStream(os, indentLevel);") - lines.append(" isFirst = false;") - lines.append(" }") - else: - lines.append(" if (!isFirst)") - lines.append(" os << std::endl;") - lines.append(f" my{cc}->toStream(os, indentLevel);") - lines.append(" isFirst = false;") - lines.append(" isOneLineOnly = !hasContents();") - lines.append(" return os;") - lines.append("}\n") + has_contents_always_true = any(p == "true" for p in parts) + has_contents_parts = [] if has_contents_always_true else parts - for child in children: - cc = child_class_name(child) - if child.max_occurs != 1: - lines.append(f"const {cc}Set &{class_name}::get{cc}Set() const") - lines.append("{") - lines.append(f" return my{cc}Set;") - lines.append("}\n") - lines.append(f"void {class_name}::remove{cc}(const {cc}SetIterConst &value)") - lines.append("{") - lines.append(f" if (value != my{cc}Set.cend())") - lines.append(" {") - lines.append(f" my{cc}Set.erase(value);") - lines.append(" }") - lines.append("}\n") - lines.append(f"void 
{class_name}::add{cc}(const {cc}Ptr &value)") - lines.append("{") - lines.append(" if (value)") - lines.append(" {") - lines.append(f" my{cc}Set.push_back(value);") - lines.append(" }") - lines.append("}\n") - lines.append(f"void {class_name}::clear{cc}Set()") - lines.append("{") - lines.append(f" my{cc}Set.clear();") - lines.append("}\n") - lines.append(f"{cc}Ptr {class_name}::get{cc}(const {cc}SetIterConst &setIterator) const") - lines.append("{") - lines.append(f" if (setIterator != my{cc}Set.cend())") - lines.append(" {") - lines.append(" return *setIterator;") - lines.append(" }") - lines.append(f" return {cc}Ptr();") - lines.append("}\n") - elif child.min_occurs == 0: - lines.append(f"{cc}Ptr {class_name}::get{cc}() const") - lines.append("{") - lines.append(f" return my{cc};") - lines.append("}\n") - lines.append(f"void {class_name}::set{cc}(const {cc}Ptr &value)") - lines.append("{") - lines.append(" if (value)") - lines.append(" {") - lines.append(f" my{cc} = value;") - lines.append(" }") - lines.append("}\n") - lines.append(f"bool {class_name}::getHas{cc}() const") - lines.append("{") - lines.append(f" return myHas{cc};") - lines.append("}\n") - lines.append(f"void {class_name}::setHas{cc}(const bool value)") - lines.append("{") - lines.append(f" myHas{cc} = value;") - lines.append("}\n") - else: - lines.append(f"{cc}Ptr {class_name}::get{cc}() const") - lines.append("{") - lines.append(f" return my{cc};") - lines.append("}\n") - lines.append(f"void {class_name}::set{cc}(const {cc}Ptr &value)") - lines.append("{") - lines.append(" if (value)") - lines.append(" {") - lines.append(f" my{cc} = value;") - lines.append(" }") - lines.append("}\n") + all_optional = all(c.min_occurs == 0 and c.max_occurs == 1 for c in children) - # fromXElementImpl - most groups use the UNUSED macro since they are - # imported via importGroup helpers in the parent. 
score-header is the - # exception: ScorePartwise/ScoreTimewise call myScoreHeaderGroup->fromXElement - # directly, so it needs a real parsing body. + from_x_impl = None if _group_needs_real_from_x(group_name): - _emit_group_real_from_x_impl(lines, class_name, children) - else: - lines.append(f"MX_FROM_XELEMENT_UNUSED({class_name});\n") + fx_lines = [] + _emit_group_real_from_x_impl(fx_lines, class_name, children) + from_x_impl = "\n".join(fx_lines) - lines.append("} // namespace core") - lines.append("} // namespace mx") - return "\n".join(lines) + "\n" + tmpl = _JINJA_ENV.get_template( + CPP_CONFIG["categories"]["group"]["impl_template"]) + return tmpl.render( + license=LICENSE, + class_name=class_name, + child_includes=child_includes, + ctor_init=ctor_init, + has_contents_always_true=has_contents_always_true, + has_contents_parts=has_contents_parts, + all_optional=all_optional, + children=[ + {"cc": child_class_name(c), "min_occurs": c.min_occurs, + "max_occurs": c.max_occurs} + for c in children + ], + from_x_impl=from_x_impl, + ) def _group_needs_real_from_x(group_name: str) -> bool: From 1f48caa88ac3c8cc828491a8034dc843700e9657 Mon Sep 17 00:00:00 2001 From: Matthew James Briggs Date: Sun, 7 Jun 2026 20:51:54 +0200 Subject: [PATCH 21/27] gen: jinja2 template for attrs header generation Replace generate_attrs_h f-string body with attrs_h.j2 template, registered in config.toml under [categories.attrs]. 
--- gen/cpp/attrs_h.j2 | 44 +++++++++++++++++++++++++++++++ gen/cpp/config.toml | 3 +++ gen/generate.py | 64 ++++++++++++--------------------------------- 3 files changed, 64 insertions(+), 47 deletions(-) create mode 100644 gen/cpp/attrs_h.j2 diff --git a/gen/cpp/attrs_h.j2 b/gen/cpp/attrs_h.j2 new file mode 100644 index 000000000..cd68a0fc3 --- /dev/null +++ b/gen/cpp/attrs_h.j2 @@ -0,0 +1,44 @@ +{{ license }} + +#pragma once + +{% for inc in project_includes %} +#include "{{ inc }}" +{% endfor %} + +#include +#include +{% if preserves_xmlns %} +#include +#include +{% endif %} +#include + +namespace mx +{ +namespace core +{ + +MX_FORWARD_DECLARE_ATTRIBUTES({{ struct_name }}) + +struct {{ struct_name }} : public AttributesInterface +{ + public: + {{ struct_name }}(); + virtual bool hasValues() const; + virtual std::ostream &toStream(std::ostream &os) const; +{% for a in attrs %} + {{ a.cpp_type }} {{ a.cpp_name }}; +{% endfor %} +{% for a in attrs %} + {{ a.const_prefix }}bool {{ a.has_name }}; +{% endfor %} +{% if preserves_xmlns %} + std::vector> xmlnsDeclarations; +{% endif %} + + private: + virtual bool fromXElementImpl(std::ostream &message, ::ezxml::XElement &xelement); +}; +} // namespace core +} // namespace mx diff --git a/gen/cpp/config.toml b/gen/cpp/config.toml index 7aa16176e..d1b483419 100644 --- a/gen/cpp/config.toml +++ b/gen/cpp/config.toml @@ -2,6 +2,9 @@ header_template = "simple_value_h.j2" impl_template = "simple_value_cpp.j2" +[categories.attrs] +header_template = "attrs_h.j2" + [categories.group] header_template = "group_h.j2" impl_template = "group_cpp.j2" diff --git a/gen/generate.py b/gen/generate.py index 3a07b82f0..39231e843 100644 --- a/gen/generate.py +++ b/gen/generate.py @@ -193,55 +193,25 @@ def element_attrs_struct_name(elem_name: str, model: XsdModel) -> str: def generate_attrs_h(struct_name: str, attrs: list, model: XsdModel) -> str: preserves_xmlns = struct_name in XMLNS_PRESERVING_ATTRS - includes = set() - 
includes.add("mx/core/AttributesInterface.h") - includes.add("mx/core/ForwardDeclare.h") - for a in attrs: - cpp_t = resolve_attr_cpp_type(a, model) - h = header_for_type(cpp_t) - includes.add(h) - - lines = [LICENSE, "#pragma once\n"] - for inc in sorted(includes): - lines.append(f'#include "{inc}"') - lines.append("") - lines.append("#include ") - lines.append("#include ") - if preserves_xmlns: - lines.append("#include ") - lines.append("#include ") - lines.append("#include ") - lines.append("") - lines.append("namespace mx\n{\nnamespace core\n{\n") - lines.append(f"MX_FORWARD_DECLARE_ATTRIBUTES({struct_name})\n") - lines.append(f"struct {struct_name} : public AttributesInterface") - lines.append("{") - lines.append(" public:") - lines.append(f" {struct_name}();") - lines.append(" virtual bool hasValues() const;") - lines.append(" virtual std::ostream &toStream(std::ostream &os) const;") - - for a in attrs: - cpp_t = resolve_attr_cpp_type(a, model) - cpp_n = camel(a.name) - lines.append(f" {cpp_t} {cpp_n};") - + includes = {"mx/core/AttributesInterface.h", "mx/core/ForwardDeclare.h"} for a in attrs: - cpp_n = camel(a.name) - has_name = has_flag_name(cpp_n) - const_prefix = "const " if a.use == "required" else "" - lines.append(f" {const_prefix}bool {has_name};") + includes.add(header_for_type(resolve_attr_cpp_type(a, model))) - if preserves_xmlns: - lines.append(" std::vector> xmlnsDeclarations;") - - lines.append("") - lines.append(" private:") - lines.append(" virtual bool fromXElementImpl(std::ostream &message, ::ezxml::XElement &xelement);") - lines.append("};") - lines.append("} // namespace core") - lines.append("} // namespace mx") - return "\n".join(lines) + "\n" + tmpl = _JINJA_ENV.get_template( + CPP_CONFIG["categories"]["attrs"]["header_template"]) + return tmpl.render( + license=LICENSE, + project_includes=sorted(includes), + preserves_xmlns=preserves_xmlns, + struct_name=struct_name, + attrs=[ + {"cpp_type": resolve_attr_cpp_type(a, model), + 
"cpp_name": camel(a.name), + "has_name": has_flag_name(camel(a.name)), + "const_prefix": "const " if a.use == "required" else ""} + for a in attrs + ], + ) From 8f88429925f91a6a885d8257d92fb71bb5b21347 Mon Sep 17 00:00:00 2001 From: Matthew James Briggs Date: Sun, 7 Jun 2026 20:53:01 +0200 Subject: [PATCH 22/27] gen: jinja2 template for attrs impl generation Replace 125-line generate_attrs_cpp f-string body with attrs_cpp.j2 template. Pre-renders the ctor init list in Python, passes attr metadata as structured data. --- gen/cpp/attrs_cpp.j2 | 86 +++++++++++++++++++++++++++ gen/cpp/config.toml | 1 + gen/generate.py | 136 ++++++++++++------------------------------- 3 files changed, 125 insertions(+), 98 deletions(-) create mode 100644 gen/cpp/attrs_cpp.j2 diff --git a/gen/cpp/attrs_cpp.j2 b/gen/cpp/attrs_cpp.j2 new file mode 100644 index 000000000..289496b66 --- /dev/null +++ b/gen/cpp/attrs_cpp.j2 @@ -0,0 +1,86 @@ +{{ license }} + +#include "mx/core/elements/{{ struct_name }}.h" +#include "mx/core/FromXElement.h" +#include + +namespace mx +{ +namespace core +{ +{{ ctor_init }} +{ +} + +bool {{ struct_name }}::hasValues() const +{ +{% if has_values_expr %} + return {{ has_values_expr }}; +{% else %} + return false; +{% endif %} +} + +std::ostream &{{ struct_name }}::toStream(std::ostream &os) const +{ + if (hasValues()) + { +{% for a in attrs %} + streamAttribute(os, {{ a.cpp_name }}, "{{ a.xml_name }}", {{ a.has_name }}); +{% endfor %} +{% if preserves_xmlns %} + for (const auto &ns : xmlnsDeclarations) + { + os << " " << ns.first << "=\"" << ns.second << "\""; + } +{% endif %} + } + return os; +} + +bool {{ struct_name }}::fromXElementImpl(std::ostream &message, ::ezxml::XElement &xelement) +{ + const char *const className = "{{ struct_name }}"; + bool isSuccess = true; +{% for a in required_attrs %} + bool {{ a.found_local }} = false; +{% endfor %} + + auto it = xelement.attributesBegin(); + auto endIter = xelement.attributesEnd(); + + for (; it != endIter; ++it) 
+ { +{% for a in attrs %} +{% if a.parse_func %} + if (parseAttribute(message, it, className, isSuccess, {{ a.cpp_name }}, {{ a.parse_has }}, "{{ a.xml_name }}", &{{ a.parse_func }})) +{% else %} + if (parseAttribute(message, it, className, isSuccess, {{ a.cpp_name }}, {{ a.parse_has }}, "{{ a.xml_name }}")) +{% endif %} + { + continue; + } +{% endfor %} +{% if preserves_xmlns %} + const auto attrName = it->getName(); + if (attrName == "xmlns" || (attrName.size() > 6 && attrName.substr(0, 6) == "xmlns:")) + { + xmlnsDeclarations.emplace_back(attrName, it->getValue()); + continue; + } +{% endif %} + } + +{% for a in required_attrs %} + if (!{{ a.found_local }}) + { + isSuccess = false; + message << className << ": '{{ a.xml_name }}' is a required attribute but was not found" << std::endl; + } + +{% endfor %} + MX_RETURN_IS_SUCCESS; +} + +} // namespace core +} // namespace mx diff --git a/gen/cpp/config.toml b/gen/cpp/config.toml index d1b483419..57dea100b 100644 --- a/gen/cpp/config.toml +++ b/gen/cpp/config.toml @@ -4,6 +4,7 @@ impl_template = "simple_value_cpp.j2" [categories.attrs] header_template = "attrs_h.j2" +impl_template = "attrs_cpp.j2" [categories.group] header_template = "group_h.j2" diff --git a/gen/generate.py b/gen/generate.py index 39231e843..24e3c1e86 100644 --- a/gen/generate.py +++ b/gen/generate.py @@ -244,13 +244,7 @@ def default_value_for_type(cpp_type: str) -> str: def generate_attrs_cpp(struct_name: str, attrs: list, model: XsdModel) -> str: preserves_xmlns = struct_name in XMLNS_PRESERVING_ATTRS - lines = [LICENSE] - lines.append(f'#include "mx/core/elements/{struct_name}.h"') - lines.append('#include "mx/core/FromXElement.h"') - lines.append("#include \n") - lines.append("namespace mx\n{\nnamespace core\n{") - # constructor init_parts = [] for a in attrs: cpp_t = resolve_attr_cpp_type(a, model) @@ -258,115 +252,61 @@ def generate_attrs_cpp(struct_name: str, attrs: list, model: XsdModel) -> str: override = 
ATTR_DEFAULT_OVERRIDE.get((struct_name, cpp_n)) if override: init_parts.append(f"{cpp_n}({override})") - continue - dv = default_value_for_type(cpp_t) - if dv: - init_parts.append(f"{cpp_n}({dv})") else: - init_parts.append(f"{cpp_n}()") + dv = default_value_for_type(cpp_t) + init_parts.append(f"{cpp_n}({dv})" if dv else f"{cpp_n}()") for a in attrs: cpp_n = camel(a.name) - has_name = has_flag_name(cpp_n) init_val = "true" if a.use == "required" else "false" - init_parts.append(f"{has_name}({init_val})") + init_parts.append(f"{has_flag_name(cpp_n)}({init_val})") - _emit_ctor_init(lines, f"{struct_name}::{struct_name}()", init_parts) - lines.append("{") - lines.append("}\n") + ctor_lines = [] + _emit_ctor_init(ctor_lines, f"{struct_name}::{struct_name}()", init_parts) + ctor_init = "\n".join(ctor_lines) - # hasValues - has_parts = [] - for a in attrs: - cpp_n = camel(a.name) - has_name = has_flag_name(cpp_n) - has_parts.append(has_name) + has_parts = [has_flag_name(camel(a.name)) for a in attrs] if preserves_xmlns: has_parts.append("!xmlnsDeclarations.empty()") - lines.append(f"bool {struct_name}::hasValues() const") - lines.append("{") - if has_parts: - lines.append(f" return {' || '.join(has_parts)};") - else: - lines.append(" return false;") - lines.append("}\n") + has_values_expr = " || ".join(has_parts) if has_parts else "" - # toStream - lines.append(f"std::ostream &{struct_name}::toStream(std::ostream &os) const") - lines.append("{") - lines.append(" if (hasValues())") - lines.append(" {") + required_locals = [] for a in attrs: - cpp_n = camel(a.name) - has_name = has_flag_name(cpp_n) - lines.append(f' streamAttribute(os, {cpp_n}, "{a.get_xml_name()}", {has_name});') - if preserves_xmlns: - lines.append(" for (const auto &ns : xmlnsDeclarations)") - lines.append(" {") - lines.append(' os << " " << ns.first << "=\\"" << ns.second << "\\"";') - lines.append(" }") - lines.append(" }") - lines.append(" return os;") - lines.append("}\n") + if a.use == "required": 
+ required_locals.append({ + "found_local": "is" + pascal(a.name) + "Found", + "xml_name": a.get_xml_name() or a.name, + }) - # fromXElementImpl - lines.append(f"bool {struct_name}::fromXElementImpl(std::ostream &message, ::ezxml::XElement &xelement)") - lines.append("{") - lines.append(f' const char *const className = "{struct_name}";') - lines.append(" bool isSuccess = true;") - required_locals = [] + required_local_map = {} for a in attrs: if a.use == "required": - cpp_n = camel(a.name) - local_name = "is" + pascal(a.name) + "Found" - required_locals.append((a, local_name)) - lines.append(f" bool {local_name} = false;") - lines.append("") - lines.append(" auto it = xelement.attributesBegin();") - lines.append(" auto endIter = xelement.attributesEnd();\n") - lines.append(" for (; it != endIter; ++it)") - lines.append(" {") - required_local_map = {id(a): ln for a, ln in required_locals} + required_local_map[a.name] = "is" + pascal(a.name) + "Found" + + attr_data = [] for a in attrs: cpp_t = resolve_attr_cpp_type(a, model) cpp_n = camel(a.name) - parse_has = required_local_map.get(id(a), has_flag_name(cpp_n)) - if needs_parse_func(cpp_t): - pf = parse_func_name(cpp_t) - lines.append(f" if (parseAttribute(message, it, className, isSuccess, {cpp_n}, {parse_has}, " - f'"{a.get_xml_name()}", &{pf}))') - else: - lines.append(f" if (parseAttribute(message, it, className, isSuccess, {cpp_n}, {parse_has}, " - f'"{a.get_xml_name()}"))') - lines.append(" {") - lines.append(" continue;") - lines.append(" }") - if preserves_xmlns: - lines.append(" const auto attrName = it->getName();") - lines.append(' if (attrName == "xmlns" || (attrName.size() > 6 && attrName.substr(0, 6) == "xmlns:"))') - lines.append(" {") - lines.append(" xmlnsDeclarations.emplace_back(attrName, it->getValue());") - lines.append(" continue;") - lines.append(" }") - lines.append(" }\n") - for a, local_name in required_locals: - lines.append(f" if (!{local_name})") - lines.append(" {") - lines.append(" 
isSuccess = false;") - # Use the XSD attribute name (xml form, e.g. 'non-controlling') in - # the error message rather than hardcoding 'number'. The original - # codegen had a bug here that produced the wrong attribute name for - # any required attribute not named 'number' (visible in committed - # ScorePartAttributes.cpp, which says 'number' when it should say - # 'id'). - xml_name = a.get_xml_name() or a.name - lines.append(f' message << className << ": \'{xml_name}\' is a required attribute but was not found" << std::endl;') - lines.append(" }\n") - lines.append(" MX_RETURN_IS_SUCCESS;") - lines.append("}\n") + parse_has = required_local_map.get(a.name, has_flag_name(cpp_n)) + pf = parse_func_name(cpp_t) if needs_parse_func(cpp_t) else None + attr_data.append({ + "cpp_name": cpp_n, + "xml_name": a.get_xml_name(), + "has_name": has_flag_name(cpp_n), + "parse_has": parse_has, + "parse_func": pf, + }) - lines.append("} // namespace core") - lines.append("} // namespace mx") - return "\n".join(lines) + "\n" + tmpl = _JINJA_ENV.get_template( + CPP_CONFIG["categories"]["attrs"]["impl_template"]) + return tmpl.render( + license=LICENSE, + struct_name=struct_name, + ctor_init=ctor_init, + has_values_expr=has_values_expr, + preserves_xmlns=preserves_xmlns, + attrs=attr_data, + required_attrs=required_locals, + ) # --------------------------------------------------------------------------- From ffc875c6e634d9e39071515c58776604732cc827 Mon Sep 17 00:00:00 2001 From: Matthew James Briggs Date: Sun, 7 Jun 2026 20:53:50 +0200 Subject: [PATCH 23/27] gen: move score wrapper flavor config to TOML SCORE_WRAPPER_FLAVOR_CONFIG (partwise/timewise behavioral knobs) now lives under [score_wrapper] in config.toml, loaded by score_config.py at import time. 
--- gen/cpp/config.toml | 39 +++++++++++++++++++++++++++++ gen/score_config.py | 60 ++++++--------------------------------------- 2 files changed, 47 insertions(+), 52 deletions(-) diff --git a/gen/cpp/config.toml b/gen/cpp/config.toml index 57dea100b..3a8d136a6 100644 --- a/gen/cpp/config.toml +++ b/gen/cpp/config.toml @@ -169,3 +169,42 @@ number = 'XsToken("1")' [overrides.attr_default.MetronomeAttributes] halign = "LeftCenterRight::center" justify = "LeftCenterRight::center" + +# Score wrapper (partwise/timewise) per-flavor behavioral knobs. +[score_wrapper.score-partwise] +outer_extra_includes = [] +outer_loop_uses_end_var = false +music_data_holder_attrs_jit = true +music_data_holder_debug_throw = true +set_holder_clear_repushes_default = true +set_holder_remove_has_size_guard = true +set_holder_post_loop_required = false +set_holder_first_flag_name = "isFirstAdded" +set_holder_use_return_macro = true +set_holder_loop_uses_element_name_var = false +set_holder_unexpected_order = "message_first" +set_holder_unexpected_msg = "encountered_quoted" +set_holder_begin_deref_parens = false +set_holder_from_x_before_first_check = true +set_holder_blank_after_first_decl = false +set_holder_blank_inside_else = false +set_holder_child_var_source = "xml_name" + +[score_wrapper.score-timewise] +outer_extra_includes = ["ezxml/XElement.h", "ezxml/XElementIterator.h"] +outer_loop_uses_end_var = true +music_data_holder_attrs_jit = false +music_data_holder_debug_throw = false +set_holder_clear_repushes_default = false +set_holder_remove_has_size_guard = false +set_holder_post_loop_required = true +set_holder_first_flag_name = "isFirstTimewisePartFound" +set_holder_use_return_macro = false +set_holder_loop_uses_element_name_var = true +set_holder_unexpected_order = "issuccess_first" +set_holder_unexpected_msg = "trailing_encountered" +set_holder_begin_deref_parens = true +set_holder_from_x_before_first_check = true +set_holder_blank_after_first_decl = true 
+set_holder_blank_inside_else = true +set_holder_child_var_source = "class_name" diff --git a/gen/score_config.py b/gen/score_config.py index 80063a8a1..08ca7dbc1 100644 --- a/gen/score_config.py +++ b/gen/score_config.py @@ -2,57 +2,13 @@ """Score wrapper (partwise/timewise) flavor configuration. Per-flavor knobs that capture hand-written variations between the partwise -and timewise families. Keys are the outer XSD element names. +and timewise families. Source of truth: [score_wrapper] in cpp/config.toml. """ +import os +import tomllib -SCORE_WRAPPER_FLAVOR_CONFIG = { - "score-partwise": { - # ScorePartwise.cpp - "outer_extra_includes": [], - "outer_loop_uses_end_var": False, - # PartwiseMeasure (music-data holder) - "music_data_holder_attrs_jit": True, - "music_data_holder_debug_throw": True, - # PartwisePart (set holder) - "set_holder_clear_repushes_default": True, - "set_holder_remove_has_size_guard": True, - "set_holder_post_loop_required": False, - "set_holder_first_flag_name": "isFirstAdded", - "set_holder_use_return_macro": True, - # Loop body style differences (partwise variant). 
- "set_holder_loop_uses_element_name_var": False, - "set_holder_unexpected_order": "message_first", # message << ...; isSuccess = false; - "set_holder_unexpected_msg": "encountered_quoted", # "...: encountered an unexpected element '...'" - "set_holder_begin_deref_parens": False, # *mySet.begin() vs *(mySet.begin()) - "set_holder_from_x_before_first_check": True, - "set_holder_blank_after_first_decl": False, - "set_holder_blank_inside_else": False, - "set_holder_child_var_source": "xml_name", # "xml_name" => camel(xml); "class_name" => pascal_to_camel(cls) - }, - "score-timewise": { - # ScoreTimewise.cpp - "outer_extra_includes": [ - "ezxml/XElement.h", - "ezxml/XElementIterator.h", - ], - "outer_loop_uses_end_var": True, - # TimewisePart (music-data holder) - "music_data_holder_attrs_jit": False, - "music_data_holder_debug_throw": False, - # TimewiseMeasure (set holder) - "set_holder_clear_repushes_default": False, - "set_holder_remove_has_size_guard": False, - "set_holder_post_loop_required": True, - "set_holder_first_flag_name": "isFirstTimewisePartFound", - "set_holder_use_return_macro": False, - # Loop body style differences (timewise variant). - "set_holder_loop_uses_element_name_var": True, - "set_holder_unexpected_order": "issuccess_first", # isSuccess = false; message << ...; - "set_holder_unexpected_msg": "trailing_encountered", # "...: unexpected element '...' 
encountered" - "set_holder_begin_deref_parens": True, - "set_holder_from_x_before_first_check": True, - "set_holder_blank_after_first_decl": True, - "set_holder_blank_inside_else": True, - "set_holder_child_var_source": "class_name", - }, -} +_CPP_DIR = os.path.join(os.path.dirname(__file__), "cpp") +with open(os.path.join(_CPP_DIR, "config.toml"), "rb") as _f: + _CFG = tomllib.load(_f) + +SCORE_WRAPPER_FLAVOR_CONFIG = _CFG["score_wrapper"] From 22a4fa0c830bf3d80a7ab1ac2349838fc576ed34 Mon Sep 17 00:00:00 2001 From: Matthew James Briggs Date: Sun, 7 Jun 2026 20:54:23 +0200 Subject: [PATCH 24/27] gen: move attrs naming config to TOML CORE_ROOT_ATTRS, ELEMENTS_DIR_SHARED_ATTRS, and ATTRS_TYPE_ALIAS now live under [attrs] in config.toml, loaded by attrs_config.py. --- gen/attrs_config.py | 29 ++++++++++------------------- gen/cpp/config.toml | 17 +++++++++++++++++ 2 files changed, 27 insertions(+), 19 deletions(-) diff --git a/gen/attrs_config.py b/gen/attrs_config.py index 4d4f3ed74..801357410 100644 --- a/gen/attrs_config.py +++ b/gen/attrs_config.py @@ -3,30 +3,21 @@ Controls which attribute structs get type-based (shared) names vs element-based names, and provides the resolution function used throughout the generator. +Source of truth: [attrs] in cpp/config.toml. """ +import os +import tomllib + from parse import pascal from element_config import element_class_name -# Attribute structs that live at the core root level (not in elements/). -CORE_ROOT_ATTRS = { - "EmptyPrintObjectStyleAlignAttributes", -} - -# XSD type name aliases: when the type name for an element matches a key here, -# the aliased type is used for attribute-struct naming purposes. 
-ATTRS_TYPE_ALIAS = { - "empty-print-style-align": "empty-print-object-style-align", -} +_CPP_DIR = os.path.join(os.path.dirname(__file__), "cpp") +with open(os.path.join(_CPP_DIR, "config.toml"), "rb") as _f: + _CFG = tomllib.load(_f) -# Shared attribute structs that live in the elements/ directory but are reused -# across multiple elements (generated once, included by reference). -ELEMENTS_DIR_SHARED_ATTRS = { - "EmptyPlacementAttributes", - "EmptyLineAttributes", - "EmptyTrillSoundAttributes", - "EmptyFontAttributes", - "EmptyPrintStyleAlignAttributes", -} +CORE_ROOT_ATTRS = set(_CFG["attrs"]["core_root"]) +ELEMENTS_DIR_SHARED_ATTRS = set(_CFG["attrs"]["shared"]) +ATTRS_TYPE_ALIAS = _CFG["attrs"]["type_alias"] def resolve_attrs_name(elem_name: str, type_name: str, model) -> str: diff --git a/gen/cpp/config.toml b/gen/cpp/config.toml index 3a8d136a6..0bfb4ff5c 100644 --- a/gen/cpp/config.toml +++ b/gen/cpp/config.toml @@ -75,6 +75,23 @@ duration = "1.0" tremolo = "3" metronome-relation = '"equals"' +# Attribute struct naming configuration. +[attrs] +# Attribute structs at the core root level (not in elements/). +core_root = ["EmptyPrintObjectStyleAlignAttributes"] +# Shared attrs in elements/ reused across multiple elements. +shared = [ + "EmptyPlacementAttributes", + "EmptyLineAttributes", + "EmptyTrillSoundAttributes", + "EmptyFontAttributes", + "EmptyPrintStyleAlignAttributes", +] + +# XSD type name aliases for attribute-struct naming purposes. +[attrs.type_alias] +empty-print-style-align = "empty-print-object-style-align" + # Attribute structs that preserve xmlns:* namespace declarations through # round-trip (e.g. xmlns:xlink on score-partwise). 
[overrides] From b79a91d03861037f8b2ec98457f589abce12f3a4 Mon Sep 17 00:00:00 2001 From: Matthew James Briggs Date: Sun, 7 Jun 2026 20:55:26 +0200 Subject: [PATCH 25/27] gen: move element dispatch tables to config.toml OVERWRITE_FILE_STEMS, ELEMENT_CLASS_NAME_OVERRIDE, ELEMENT_VALUE_TYPE_OVERRIDE, SKIP_ELEMENTS, BESPOKE_FAMILY_OWNED, and TREE_ELEMENTS now live under [elements] in config.toml. --- gen/cpp/config.toml | 26 +++++++++++++++++++++ gen/element_config.py | 54 ++++++++++++------------------------------- 2 files changed, 41 insertions(+), 39 deletions(-) diff --git a/gen/cpp/config.toml b/gen/cpp/config.toml index 0bfb4ff5c..fefc4dd02 100644 --- a/gen/cpp/config.toml +++ b/gen/cpp/config.toml @@ -1,3 +1,29 @@ +# Element dispatch configuration. +[elements] +# File stems to always overwrite (even if file already exists). +overwrite_file_stems = ["Direction", "DirectionType", "DirectionAttributes"] +# Elements skipped entirely during generation. +skip = [] +# Elements owned by bespoke family emitters (not generated standalone). +bespoke_family_owned = ["part", "measure"] +# Elements handled by tree-based generation path. +tree = [ + "bend", "group-abbreviation-display", "group-name-display", "harmonic", + "key", "metronome", "notations", "notehead-text", "ornaments", + "part-abbreviation-display", "part-name-display", "play", + "score-instrument", +] + +# Class name overrides (element-name -> C++ class name). +[elements.class_name_override] +attributes = "Properties" + +# Value type overrides for specific elements. 
+[elements.value_type_override.instrument-sound] +cpp_type = "PlaybackSoundType" +header = "mx/core/PlaybackSoundType.h" +default = "PlaybackSoundType{}" + [categories.simple-value] header_template = "simple_value_h.j2" impl_template = "simple_value_cpp.j2" diff --git a/gen/element_config.py b/gen/element_config.py index 524796951..e2928af86 100644 --- a/gen/element_config.py +++ b/gen/element_config.py @@ -3,56 +3,32 @@ Controls which elements are generated by which path (skip, bespoke family, tree-based, choice, simple-value, standard), plus class-name and value-type -overrides. This is the file to edit when adding new element-specific behavior -or when migrating elements between generation strategies. +overrides. Simple data tables live in cpp/config.toml under [elements]; +complex structured config (choice tables, tree config) stays here in Python. """ +import os +import tomllib + from parse import pascal -OVERWRITE_FILE_STEMS = { - "Direction", "DirectionType", "DirectionAttributes", -} +_CPP_DIR = os.path.join(os.path.dirname(__file__), "cpp") +with open(os.path.join(_CPP_DIR, "config.toml"), "rb") as _f: + _CFG = tomllib.load(_f) -ELEMENT_CLASS_NAME_OVERRIDE = { - "attributes": "Properties", -} +_ELEM_CFG = _CFG["elements"] -ELEMENT_VALUE_TYPE_OVERRIDE = { - "instrument-sound": { - "cpp_type": "PlaybackSoundType", - "header": "mx/core/PlaybackSoundType.h", - "default": "PlaybackSoundType{}", - }, -} +OVERWRITE_FILE_STEMS = set(_ELEM_CFG["overwrite_file_stems"]) +ELEMENT_CLASS_NAME_OVERRIDE = _ELEM_CFG["class_name_override"] +ELEMENT_VALUE_TYPE_OVERRIDE = _ELEM_CFG.get("value_type_override", {}) +SKIP_ELEMENTS = set(_ELEM_CFG["skip"]) +BESPOKE_FAMILY_OWNED = set(_ELEM_CFG["bespoke_family_owned"]) +TREE_ELEMENTS = set(_ELEM_CFG["tree"]) def element_class_name(elem_name: str) -> str: """Return the C++ class name for an element, consulting overrides first.""" return ELEMENT_CLASS_NAME_OVERRIDE.get(elem_name, pascal(elem_name)) - -SKIP_ELEMENTS = set() - 
-BESPOKE_FAMILY_OWNED = { - "part", - "measure", -} - -TREE_ELEMENTS = { - "bend", - "group-abbreviation-display", - "group-name-display", - "harmonic", - "key", - "metronome", - "notations", - "notehead-text", - "ornaments", - "part-abbreviation-display", - "part-name-display", - "play", - "score-instrument", -} - TREE_ELEMENT_CONFIG = { "group-abbreviation-display": { "choice_class": "DisplayTextOrAccidentalText", From 9e9283c78ddf15ba2f5d1d2d51a085f0cd1d68ac Mon Sep 17 00:00:00 2001 From: Matthew James Briggs Date: Sun, 7 Jun 2026 20:56:20 +0200 Subject: [PATCH 26/27] gen: move group static config to TOML GENERATE_GROUPS, WRAPPING_STREAMCONTENTS, GROUPS_WITH_REAL_FROM_X, NESTED_OPTIONAL_SEQUENCE_AS_GROUP, EXTENSION_OPTIONAL_GROUP_RENAME, UNBOUNDED_SEQUENCE_AS_GROUP, and SYNTHETIC_UNBOUNDED_GROUP_IMPORT_- GROUP_AFTER now live under [groups] in config.toml. --- gen/cpp/config.toml | 28 ++++++++++++ gen/group_config.py | 106 +++++++++----------------------------------- 2 files changed, 49 insertions(+), 85 deletions(-) diff --git a/gen/cpp/config.toml b/gen/cpp/config.toml index fefc4dd02..df693fc9a 100644 --- a/gen/cpp/config.toml +++ b/gen/cpp/config.toml @@ -24,6 +24,34 @@ cpp_type = "PlaybackSoundType" header = "mx/core/PlaybackSoundType.h" default = "PlaybackSoundType{}" +# Group structural configuration. +[groups] +# Groups that get a real fromXElementImpl body (not the UNUSED stub). +real_from_x = ["score-header", "arrow"] +# Groups to generate (the subset of XSD groups we emit classes for). +generate = [ + "beat-unit", "display-step-octave", "editorial", "editorial-voice", + "editorial-voice-direction", "layout", "score-header", +] +# Complex types using wrapping (forEachChild) streamContents. +wrapping_streamcontents = [ + "defaults", "grouping", "identification", "part-group", "print", +] +# Element names triggering extra importGroup call in unbounded-group parsing. 
+unbounded_import_group_after = ["midi-instrument"] + +# Opt-in: type -> synthetic group ref name for nested optional sequences. +[groups.nested_optional_as_group] +time-modification = "normal-type-normal-dot" + +# Opt-in: type -> synthetic group ref for unbounded sequences. +[groups.unbounded_as_group] +score-part = "midi-device-instrument" + +# Opt-in: extending type -> base group -> renamed wrapper group. +[groups.extension_rename.metronome-tuplet] +normal-type-normal-dot = "time-modification-normal-type-normal-dot" + [categories.simple-value] header_template = "simple_value_h.j2" impl_template = "simple_value_cpp.j2" diff --git a/gen/group_config.py b/gen/group_config.py index 19bb2b007..281320317 100644 --- a/gen/group_config.py +++ b/gen/group_config.py @@ -3,105 +3,41 @@ Owns the mutable sets that the XSD parser populates during parsing (passed by reference via ParseConfig), the static dicts that control group synthesis, and -the group_class_name helper. Also includes WRAPPING_STREAMCONTENTS which -controls streaming behavior for a handful of complex types. +the group_class_name helper. Static config lives in cpp/config.toml under +[groups]; mutable sets remain here (they must be Python objects passed by ref). """ +import os +import tomllib + from parse import pascal +_CPP_DIR = os.path.join(os.path.dirname(__file__), "cpp") +with open(os.path.join(_CPP_DIR, "config.toml"), "rb") as _f: + _CFG = tomllib.load(_f) + +_GRP = _CFG["groups"] + # --------------------------------------------------------------------------- # Mutable sets populated by XsdModel during parsing # --------------------------------------------------------------------------- -# Populated dynamically by XsdModel._synthesize_optional_group when we -# discover an anonymous inside a parent sequence -# which round-trips through pascal() to produce the synthetic group class -# (e.g. "NormalTypeNormalDotGroup"). 
SYNTHETIC_OPTIONAL_GROUPS: set = set() - -# Populated dynamically by XsdModel._synthesize_unbounded_group when we -# discover an anonymous -# inside a parent sequence. The original codegen promoted some of these -# shapes to wrapper group classes used as Sets on the parent -# (e.g. score-part's midi-device + midi-instrument repeating sequence -# becomes MidiDeviceInstrumentGroup, held as a *Set on ScorePart). SYNTHETIC_UNBOUNDED_GROUPS: set = set() - -# Group names whose generated class name omits the trailing "Group" suffix. SUPPRESS_GROUP_SUFFIX: set = set() # --------------------------------------------------------------------------- -# Static group configuration dicts +# Static group configuration (loaded from TOML) # --------------------------------------------------------------------------- -# Opt-in: complex types whose anonymous nested -# should be promoted to a synthetic group rather than flattened. The XSD -# permits the same shape in several places (e.g. page-layout), but the -# original codegen only chose to promote it in specific spots. The value is -# the hyphenated-lowercase ref name used as the synthetic group's element_name. -NESTED_OPTIONAL_SEQUENCE_AS_GROUP: dict = { - "time-modification": "normal-type-normal-dot", -} - -# Opt-in: when an extending complexType inherits a synthetic optional group -# from its base, the default behavior is to flatten the group's members into -# the extending type. For specific extending types the original codegen -# instead kept the group as a *separately-named wrapper sub-element* with -# its own getHas/setHas accessors. The mapping is -# extending_type_name -> { base_synthetic_group_name -> renamed_wrapper_group_name } -# The renamed group's class name omits the usual "Group" suffix (see -# SUPPRESS_GROUP_SUFFIX), so a child reference to it renders as a regular -# wrapper element on the parent. 
Its members are still parsed inline like any -# other synthetic optional group (the original hand-written MetronomeTuplet.cpp -# parsed the wrapper with a no-op importElement and dropped normal-type / -# normal-dot on round-trip; that was a bug). -EXTENSION_OPTIONAL_GROUP_RENAME: dict = { - "metronome-tuplet": { - "normal-type-normal-dot": "time-modification-normal-type-normal-dot", - }, -} - -# Opt-in: complex types whose anonymous should be promoted to a synthetic unbounded group. -# Mapping parent_type_name -> hyphenated-lowercase synthetic group ref. -UNBOUNDED_SEQUENCE_AS_GROUP: dict = { - "score-part": "midi-device-instrument", -} - -# Element names whose generated synthetic-unbounded-group parser body should -# emit an additional importGroup(messsage, iter, endIter, isSuccess, elemPtr) -# call after parsing that element. The original codegen produced this call -# for midi-instrument (a no-op in practice because importGroup(MidiInstrument) -# inspects only sibling iterators that have already been consumed). Kept to -# minimize diff against committed. -SYNTHETIC_UNBOUNDED_GROUP_IMPORT_GROUP_AFTER = { - "midi-instrument", -} - -GENERATE_GROUPS = { - "beat-unit", "display-step-octave", "editorial", "editorial-voice", - "editorial-voice-direction", "layout", "score-header", - # full-note: EXC - real code has FullNoteTypeChoice class - # time-signature: EXC - real code adds Interchangeable not in XSD group - # harmony-chord: EXC - real code has Choice logic not in XSD group def - # music-data: EXC - real code wraps choice in MusicDataChoice class -} - -# Complex types whose streamContents uses the "wrapping" (forEachChild) pattern -# instead of explicit per-child streaming. -WRAPPING_STREAMCONTENTS = { - "defaults", "grouping", "identification", "part-group", "print", -} - -# Groups whose generated .cpp includes a real fromXElementImpl body (most -# groups just emit a stub that returns false). 
-GROUPS_WITH_REAL_FROM_X_ELEMENT = { - "score-header", - # ArrowGroup is the inline-group branch of the inline-choice element - # (INLINE_CHOICE_CONFIG["arrow"]). Arrow::fromXElementImpl dispatches its - # group branch via myArrowGroup->fromXElement(message, xelement), so the - # group needs a real parsing body. - "arrow", -} +NESTED_OPTIONAL_SEQUENCE_AS_GROUP: dict = _GRP["nested_optional_as_group"] +EXTENSION_OPTIONAL_GROUP_RENAME: dict = _GRP["extension_rename"] +UNBOUNDED_SEQUENCE_AS_GROUP: dict = _GRP["unbounded_as_group"] +SYNTHETIC_UNBOUNDED_GROUP_IMPORT_GROUP_AFTER = set( + _GRP["unbounded_import_group_after"] +) +GENERATE_GROUPS = set(_GRP["generate"]) +WRAPPING_STREAMCONTENTS = set(_GRP["wrapping_streamcontents"]) +GROUPS_WITH_REAL_FROM_X_ELEMENT = set(_GRP["real_from_x"]) # --------------------------------------------------------------------------- # Group class name resolution From 26670c4d33ead5924f2c14ab8fee4f23636a5e15 Mon Sep 17 00:00:00 2001 From: Matthew James Briggs Date: Sun, 7 Jun 2026 21:38:29 +0200 Subject: [PATCH 27/27] close out refactor competition rounds --- docs/ai/projects/gen/index.md | 13 ++++++++- docs/ai/projects/gen/log.md | 31 ++++++++++++++++++++ docs/ai/projects/gen/plan.md | 7 +++-- docs/ai/projects/gen/state.md | 53 ++++++++++++++++++++--------------- 4 files changed, 78 insertions(+), 26 deletions(-) diff --git a/docs/ai/projects/gen/index.md b/docs/ai/projects/gen/index.md index 3621b8972..e5d1a2a40 100644 --- a/docs/ai/projects/gen/index.md +++ b/docs/ai/projects/gen/index.md @@ -135,9 +135,20 @@ the fix belongs in the shared path with a config-driven flag. 
- `gen/ids.py` — `NodeId` typed value (M6B; assigned to every node, currently unconsumed) - `gen/quality.py` — design-quality scorer for `make gen-quality` (excluded from its own score) - `gen/.pylintrc` — pylint config for `make gen-lint` -- `gen/cpp/config.toml` — TOML routing config mapping element categories to Jinja2 templates (M6C) +- `gen/type_maps.py` — XSD-to-C++ type mapping tables and resolution logic (M6C) +- `gen/naming.py` — C++ keywords, camel/pascal helpers (M6C) +- `gen/overrides.py` — per-element/attribute behavioral overrides, loaded from TOML (M6C) +- `gen/element_config.py` — element dispatch config, loaded from TOML (M6C) +- `gen/group_config.py` — group config (mutable sets + TOML-loaded statics) (M6C) +- `gen/attrs_config.py` — attribute struct naming, loaded from TOML (M6C) +- `gen/score_config.py` — score wrapper flavor config, loaded from TOML (M6C) +- `gen/cpp/config.toml` — TOML config: routing, defaults, overrides, element/group/attrs data (M6C) - `gen/cpp/simple_value_h.j2` — Jinja2 header template for simple-value elements (M6C) - `gen/cpp/simple_value_cpp.j2` — Jinja2 impl template for simple-value elements (M6C) +- `gen/cpp/group_h.j2` — Jinja2 header template for group classes (M6C) +- `gen/cpp/group_cpp.j2` — Jinja2 impl template for group classes (M6C) +- `gen/cpp/attrs_h.j2` — Jinja2 header template for attribute structs (M6C) +- `gen/cpp/attrs_cpp.j2` — Jinja2 impl template for attribute structs (M6C) - `docs/musicxml.xsd` — input schema (currently MusicXML 3.0; swap to 4.0 in M6) - `src/private/mx/core/elements/` — target output (~590 .h/.cpp pairs) - `src/private/mxtest/corert/` — core-roundtrip harness diff --git a/docs/ai/projects/gen/log.md b/docs/ai/projects/gen/log.md index 20d3efe3b..a6019433c 100644 --- a/docs/ai/projects/gen/log.md +++ b/docs/ai/projects/gen/log.md @@ -81,3 +81,34 @@ simple_value_cpp.j2. Added /opt/gen-venv with Jinja2==3.1.6 to Dockerfile. Added target to Makefile. 
Modified gen/generate.py: simple-value elements now render via Jinja2 templates instead of the shared generate_element_h/cpp f-string path. Verified zero diff across all 101 simple-value elements (202 files). Non-simple-value elements remain on the f-string path unchanged. + +## 2026-06-07 21:33 + +M6C session 2: two batches of 10 refactoring rounds (20 commits total on wrk branch). + +Batch 1 (rounds 1-10): extract config from generate.py into dedicated Python modules. +- type_maps.py (280 lines): XSD-to-C++ type mapping tables and resolution logic +- naming.py (37 lines): C++ keywords, camel/pascal helpers +- overrides.py (64 lines): per-element/attribute behavioral overrides +- element_config.py (207 lines): element dispatch config, choice tables, dynamics marks +- group_config.py (113 lines): group mutable sets, static dicts, group_class_name +- attrs_config.py (41 lines): attribute struct naming config, resolve_attrs_name +- score_config.py (59 lines): score wrapper partwise/timewise flavor knobs +- Also moved default-value tables to config.toml + +Batch 2 (rounds 11-20): TOML config and Jinja2 templates. +TOML additions (config.toml grew from 69 to 281 lines): +- [overrides.attr_default]: 25 per-attribute default value overrides +- [overrides]: xmlns_preserving_attrs, has_contents_always_true, child_init_value +- [score_wrapper]: partwise/timewise behavioral knobs (17 fields each) +- [attrs]: core_root, shared, type_alias +- [elements]: overwrite_file_stems, skip, bespoke_family_owned, tree, class_name_override, value_type_override +- [groups]: generate, wrapping_streamcontents, real_from_x, unbounded_import_group_after, nested_optional_as_group, unbounded_as_group, extension_rename + +New Jinja2 templates (4 templates, 380 lines total): +- group_h.j2, group_cpp.j2: group header/impl generation +- attrs_h.j2, attrs_cpp.j2: attrs struct header/impl generation + +generate.py reduced from 13,441 to 12,343 lines. All syntax verified. 
Jinja2 infrastructure +preserved throughout. Cannot run full oracle (jinja2 not installed locally) but all Python modules +parse cleanly and cross-imports verified. diff --git a/docs/ai/projects/gen/plan.md b/docs/ai/projects/gen/plan.md index 91ed43df9..d82ec1e4d 100644 --- a/docs/ai/projects/gen/plan.md +++ b/docs/ai/projects/gen/plan.md @@ -57,9 +57,10 @@ Iteratively stand-up a separation of concerns between - Template configuration - Use of template files instead of python f-strings -Infrastructure: Jinja2 templates in `gen/cpp/`, TOML routing config, `/opt/gen-venv` in Docker, -`make generate` target. Simple-value elements (101) are templated. Lookup tables remain in Python -until all their consumers are templated. +Infrastructure: Jinja2 templates in `gen/cpp/`, TOML config (281 lines), `/opt/gen-venv` in Docker, +`make generate` target. Templated: simple-value (101 elements), group h/cpp, attrs h/cpp. Seven +Python config modules extracted from generate.py. Most lookup tables moved to config.toml; remaining +mutable sets stay in Python (passed by reference to parser). ### 6D++ diff --git a/docs/ai/projects/gen/state.md b/docs/ai/projects/gen/state.md index e7dc5be8a..a19b90e6f 100644 --- a/docs/ai/projects/gen/state.md +++ b/docs/ai/projects/gen/state.md @@ -4,29 +4,31 @@ M6C_CONFIG_FILE, in progress. -## What the last session did (2026-06-07, M6C session 1) +## What the last session did (2026-06-07, M6C session 2) -Grilled the user on the first M6C changeset, then implemented it: +Two batches of 10 refactoring rounds (20 commits on `wrk` branch, not pushed): -- Created `gen/cpp/` with `config.toml` (routing), `simple_value_h.j2`, `simple_value_cpp.j2` -- Added `/opt/gen-venv` with Jinja2==3.1.6 to the Dockerfile (separate from quality-venv) -- Added `make generate` target (runs `gen/generate.py` inside Docker) -- Modified `gen/generate.py`: simple-value elements (101 elements) now render via Jinja2 templates. 
- The `_render_simple_value()` function builds a context dict from existing Python lookup tables and - renders the templates. The old fake-CT path through `generate_element_h/cpp` is removed for this - category. -- Verified zero diff across all 202 simple-value files +**Batch 1**: Extracted config from the generate.py monolith into 7 Python modules: +- `gen/type_maps.py`, `gen/naming.py`, `gen/overrides.py`, `gen/element_config.py`, + `gen/group_config.py`, `gen/attrs_config.py`, `gen/score_config.py` -Not yet committed or tested through Docker build / CI. The user needs to rebuild the Docker image -(`make generate` will trigger it) and verify the full oracle: -`make generate && make fmt && git diff --quiet src/private/mx/core`. +**Batch 2**: Moved data to TOML config and created Jinja2 templates: +- `gen/cpp/config.toml` grew from 69 to 281 lines (all override tables, element dispatch, + group config, attrs naming, score wrapper config) +- 4 new Jinja2 templates: `group_h.j2`, `group_cpp.j2`, `attrs_h.j2`, `attrs_cpp.j2` +- `generate.py` down to 12,343 lines (from 13,441) + +All Python syntax verified. Jinja2 infrastructure intact. Zero-diff oracle not yet run (needs Docker +rebuild for jinja2 env). ## What the next session should do -Get instructions from the user. Likely options: -- Continue M6C: template the next element category (text-value, empty, empty-with-attrs, etc.) -- At some point, lookup tables (TYPE_DEFAULT_VALUE, etc.) can move to TOML once all their consumers - are templated +1. Run the oracle: `make generate && make fmt && git diff --quiet src/private/mx/core` to confirm + zero diff. If any diff, debug and fix. +2. Run `make gen-quality` and `make gen-lint` to check floor compliance. +3. Continue M6C: more f-string functions to templates (choice_class_h/cpp, element_h/cpp are big + targets). More lookup tables to TOML as their consumers get templated. +4. Eventually: squash or organize the 20 commits for a clean PR. 
## Oracle (how to prove zero diff) @@ -40,10 +42,17 @@ Then `make test-core-dev`. Reset generated C++ before committing: - `make fmt` (~1 min, Docker) is part of the oracle - the generator emits unformatted C++. - The generator now requires Jinja2. Running `python3 gen/generate.py` bare requires a Python environment with `jinja2` and `tomllib` (Python 3.11+). Use `make generate` to run inside Docker. -- CI `linux-gate` runs `make gen-quality` (floor 37.7) and `make gen-lint` (floor 9.4). The new - `_render_simple_value` function and imports should be scored normally. +- CI `linux-gate` runs `make gen-quality` (floor 37.7) and `make gen-lint` (floor 9.4). - `gen-quality`/`gen-lint` are otherwise ignored during the refactor (user directive) unless CI fails. -- Jinja2 environment uses `trim_blocks=True` and `lstrip_blocks=True` to avoid extra blank lines - from block tags. Do not use `-%}` suffix on block tags in templates - it eats leading indentation. -- `node_id` fields are `compare=False` on purpose; keep it that way. +- Jinja2 environment uses `trim_blocks=True` and `lstrip_blocks=True`. Do not use `-%}` suffix on + block tags in templates. +- The new modules (`overrides.py`, `attrs_config.py`, `score_config.py`, `element_config.py`, + `group_config.py`) each independently load config.toml. This is fine at import time but means + the TOML is parsed multiple times. Not a perf concern for a code generator. +- `SYNTHETIC_OPTIONAL_GROUPS`, `SYNTHETIC_UNBOUNDED_GROUPS`, `SUPPRESS_GROUP_SUFFIX` remain as + mutable Python sets in `group_config.py` (they're passed by reference to ParseConfig and + mutated during parsing). They cannot move to TOML. +- `BESPOKE_ELEMENTS` dict maps to function objects - cannot be extracted to config. +- The `_emit_ctor_init` line-wrapping logic and `_emit_group_real_from_x_impl` are pre-rendered + in Python and passed as strings to templates (too complex for Jinja2 logic).