S4b: fair MO cross-library harness — shared ported SBX + aligned NSGA-II (milestone-12a)#16
Merged
…-II (ctrl-freak/pymoo/DEAP), PM fix (milestone-12a)
Adversarial review: APPROVE ✅ (after F1 fix)
Fresh reviewer verified by execution: ported SBX live in both adapters (100 core calls each; F1 (resolved in 144378d): the unplanned full-budget Recommend merge.
s4b-mo-harness (milestone-12a) — fair MO cross-library harness
ctrl-freak `nsga2` vs pymoo `NSGA2` vs DEAP (`selNSGA2`), configured to run the IDENTICAL algorithm. Returns normalized raw output (extracted NON-DOMINATED front objectives + per-generation history); computes NO metrics (s5 does).

Files
- `benchmarks/harness/multi_objective.py` (NEW): the three NSGA-II adapters + `run_mo`. Imports the shared ported SBX from `benchmarks.harness.operators` (created by s4a; read-only here).
- `tests/benchmarks/test_mo_harness.py` (NEW).

Alignment (Option C; see PLAN.md decision)
- `make_pymoo_ctrl_freak_sbx` (custom pymoo Crossover) and `deap_ctrl_freak_sbx` (DEAP mate) carry the shared ported SBX; ctrl-freak uses its native `sbx_crossover`. Identical crossover → the comparison isolates the NSGA-II loop.
- pymoo `PM(prob=1.0, prob_var=1/n_var)` and DEAP `mutPolynomialBounded(indpb=1/n_var)` are applied to every offspring, matching ctrl-freak. (The legacy per-individual `PM(prob=1/n_var)` ≈3% was the real cause of the old "pymoo ZDT3 scatter"; fixed.)
- Identical number of `evaluate()` calls for all three (100 + 250×100 = 25,100), verified by an independent counter; pymoo is driven via the OO `setup`/`NoTermination`/`next()` interface to fix the gen-1 off-by-one.
- Fronts are extracted with the `non_dominated_sort == 0` mask; every reported front is non-dominated; ZDT3 clean.
- DEAP creator classes `FitnessMinMO`/`IndividualMO` (per-`n_obj` weights) coexist with s4a's SO classes.

Parity payoff (IGD+, seeds 0–2 median, full 25,100 budget)
With the identical ported SBX, the three coincide within seed noise on the convex/standard fronts (the ~70% gap under stock-0.5 SBX is gone). zdt4/zdt6 stay far/high-variance for all three (multimodal n=30); s5's 30-seed overlapping-variance test adjudicates.
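For readers unfamiliar with the metric behind these parity numbers, here is a minimal NumPy sketch of IGD+ for minimization problems. It is not the s5 implementation (this harness computes no metrics); the function name `igd_plus` and the toy reference front are assumptions made up for illustration.

```python
import numpy as np


def igd_plus(front, reference):
    """IGD+ (minimization): mean, over reference points, of the smallest
    dominance-aware distance from that reference point to the front.

    Hypothetical helper for illustration; not part of the harness.
    """
    front = np.atleast_2d(np.asarray(front, dtype=float))
    reference = np.atleast_2d(np.asarray(reference, dtype=float))
    # diff[i, j, k] = front[j, k] - reference[i, k]
    diff = front[None, :, :] - reference[:, None, :]
    # Only objectives where the front point is WORSE than the reference count.
    d_plus = np.sqrt((np.maximum(diff, 0.0) ** 2).sum(axis=2))
    return float(d_plus.min(axis=1).mean())


# Toy convex 2-objective reference front (assumed shape: f2 = 1 - sqrt(f1)).
f1 = np.linspace(0.0, 1.0, 101)
reference = np.column_stack([f1, 1.0 - np.sqrt(f1)])

on_front = reference[::10]   # sparse sample lying exactly on the front
shifted = on_front + 0.1     # same sample pushed away from the front

print(igd_plus(on_front, reference))
print(igd_plus(shifted, reference))
```

Lower is better, and a front that covers (or dominates) every reference point scores exactly 0, which is why a gap between libraries collapsing toward seed noise shows up directly in this number.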
Acceptance (verified twice — critic + orchestrator)
- `pytest test_mo_harness.py --no-cov`: 22 passed. Doctests: 5 passed.
- `ruff check`, `ty check src/`: clean.
- Full `uv run pytest`: 506 passed @ 98.89%.

Plan converged via planner↔critic over 2 rounds (critic reproduced the IGD+ collapse, the live ported operator (100 core calls/run), eval counts, extraction, and creator coexistence by execution).
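Two of the checks listed above, evaluation counts and front extraction, can be illustrated without any of the three libraries. The sketch below is a stand-in, not the harness code: `CountingProblem`, `non_dominated_mask`, and the random "offspring" loop are hypothetical, and the split of the budget into an initial population of 100 plus 250 generations of 100 offspring is an assumption read off the "100 + 250×100" figure.

```python
import numpy as np


class CountingProblem:
    """Wraps an objective function and counts every individual it evaluates."""

    def __init__(self, fn):
        self.fn = fn
        self.n_evals = 0

    def __call__(self, x):
        self.n_evals += 1
        return self.fn(x)


def non_dominated_mask(F):
    """True for each row of F (minimization) that no other row dominates."""
    F = np.asarray(F, dtype=float)
    # dominates[i, j]: row i is no worse than row j in every objective
    # and strictly better in at least one.
    no_worse = (F[:, None, :] <= F[None, :, :]).all(axis=2)
    better = (F[:, None, :] < F[None, :, :]).any(axis=2)
    dominates = no_worse & better
    return ~dominates.any(axis=0)


POP_SIZE, N_GEN = 100, 250                     # assumed reading of the budget
expected_budget = POP_SIZE + N_GEN * POP_SIZE  # 100 + 250*100 = 25,100

rng = np.random.default_rng(0)
# Toy 2-objective function (Schaffer-style), purely for illustration.
problem = CountingProblem(lambda x: (x[0] ** 2, (x[0] - 2.0) ** 2))

# Stand-in loop: random sampling replaces the real NSGA-II variation step.
F = [problem(x) for x in rng.uniform(-1.0, 3.0, size=(POP_SIZE, 1))]
for _ in range(N_GEN):
    offspring = rng.uniform(-1.0, 3.0, size=(POP_SIZE, 1))
    F = [problem(x) for x in offspring]

front = np.asarray(F)[non_dominated_mask(F)]
print(problem.n_evals, expected_budget, len(front))
```

Counting inside the wrapper rather than trusting each library's own counter is what makes the check independent: an off-by-one generation in any adapter would show up as a 100-evaluation difference.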