feat(workflow-benchmarks): baseline workflow profiling and bottleneck analysis (#1841) #1848
Pull request overview
Establishes an in-repo, repeatable baseline for profiling CI workflow performance (container + testing), with scripts to capture cold/warm timings, committed evidence artifacts, and a written bottleneck analysis to anchor future optimizations under EPIC #1840.
Changes:
- Added reproducible benchmark scripts for container and testing workflow equivalents, emitting structured timing logs into the issue's `evidence/` directory.
- Added a durable baseline report (`benchmark-results-baseline.md`) with measurements, bottleneck ranking, and linker-heavy target analysis.
- Improved timing visibility and build-context correctness via `time`-wrapped multi-command `RUN` steps, `.dockerignore` updates for `/.tmp/`, and supporting documentation/cspell updates.
Reviewed changes
Copilot reviewed 14 out of 17 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| project-words.txt | Adds British-English spellings used in the new benchmark/report docs. |
| docs/issues/open/1841-1840-workflow-performance-baseline-analysis/ISSUE.md | Marks the sub-issue as implemented and links the new scripts/report/evidence. |
| docs/issues/open/1841-1840-workflow-performance-baseline-analysis/benchmark-results.md | Removes the old placeholder benchmark artifact. |
| docs/issues/open/1841-1840-workflow-performance-baseline-analysis/benchmark-results-baseline.md | Adds the baseline results report, methodology, and bottleneck analysis. |
| docs/issues/open/1841-1840-workflow-performance-baseline-analysis/evidence/container-baseline-20260527T210123Z.log | Adds raw timing evidence for the container baseline run. |
| docs/issues/open/1841-1840-workflow-performance-baseline-analysis/evidence/testing-baseline-20260527T211129Z.log | Adds raw timing evidence for the testing baseline run. |
| docs/issues/open/1841-1840-workflow-performance-baseline-analysis/evidence/cargo-timing-release-20260528T074109Z.html | Adds cargo-timings HTML artifact used for linker/target cost analysis. |
| docs/issues/drafts/1840-workflow-performance-dependency-layer-cache-reuse/ISSUE.md | Updates semantic links to point at the new baseline report filename. |
| docs/issues/drafts/1840-workflow-performance-containerfile-target-scope/ISSUE.md | Updates semantic links to point at the new baseline report filename. |
| docs/issues/drafts/1840-workflow-performance-container-workflow-build-deduplication/ISSUE.md | Updates semantic links to point at the new baseline report filename. |
| docs/issues/drafts/1840-workflow-performance-container-test-gating/ISSUE.md | Updates semantic links to point at the new baseline report filename. |
| cspell.json | Excludes evidence HTML artifacts from cspell checking. |
| contrib/dev-tools/workflow-benchmarks/run-container-baseline.sh | Adds a container workflow baseline runner that logs structured phase timings. |
| contrib/dev-tools/workflow-benchmarks/run-testing-baseline.sh | Adds a testing workflow baseline runner (unit + docker-e2e equivalents) with structured timings. |
| Containerfile | Wraps multi-command `RUN` blocks with `time` for sub-step duration visibility in BuildKit output. |
| .dockerignore | Excludes `/.tmp/` from the Docker build context to avoid cache busting and slow `COPY` steps. |
| AGENTS.md | Documents `.tmp/` as a workspace-local temp dir and notes its use by tooling/benchmarks. |
    t0=$(date +%s)
    "$@"
    rc=$?
    t1=$(date +%s)
    echo "[$scope] ${name}_seconds=$((t1 - t0))"
Valid observation. For the purposes of this baseline (measuring multi-minute build phases), one-second resolution is sufficient, and the `0s` entries honestly reflect phases that completed in under one second. Sub-second precision is noted as a future improvement. Switching to `date +%s%N` would break portability on macOS (BSD `date` has no `%N`), so it warrants its own decision when the need arises.
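For illustration only, one way the sub-second follow-up could be approached without `date +%s%N` is sketched below. It assumes Bash 5.0 or newer for the `EPOCHREALTIME` variable and falls back to whole seconds elsewhere (for example, the Bash 3.2 that ships with macOS). The helper name `now_ms` is hypothetical and is not part of the scripts in this PR.

```shell
#!/usr/bin/env bash
# now_ms (hypothetical helper): print the current Unix time in milliseconds.
now_ms() {
  # EPOCHREALTIME ("seconds.microseconds") only exists in Bash >= 5.0.
  # Read it once so both parts come from the same instant.
  local now="${EPOCHREALTIME:-}"
  if [[ "$now" =~ ^([0-9]+)[.,]([0-9]{6})$ ]]; then
    # 10# forces base 10 so microseconds such as "012345" are not parsed as octal.
    echo $(( ${BASH_REMATCH[1]} * 1000 + 10#${BASH_REMATCH[2]} / 1000 ))
  else
    # Fallback: second resolution scaled to milliseconds (works with BSD date).
    echo $(( $(date +%s) * 1000 ))
  fi
}

t0=$(now_ms)
sleep 1
t1=$(now_ms)
echo "[demo] sleep_ms=$((t1 - t0))"
```

On shells without `EPOCHREALTIME` the reported value is always a multiple of 1000, so the log format stays the same and only the precision differs.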
    time_phase() {
      local scope="$1" name="$2"
      shift 2
      echo "[$scope] ${name}_start"
      local t0 t1 rc
      t0=$(date +%s)
      set +e
      "$@"
      rc=$?
      set -e
      t1=$(date +%s)
      echo "[$scope] ${name}_seconds=$((t1 - t0))"
Same as the corresponding comment on run-container-baseline.sh: for multi-minute build phases, one-second resolution is adequate and the `0s` entries are correct. Sub-second timing is noted as a future improvement (portability concern with `date +%s%N` on BSD/macOS).
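The `time_phase()` excerpt quoted in this thread stops before the end of the function. A self-contained sketch of how such a helper might be completed and called is shown below. The final `return "$rc"` and the demo calls are assumptions added for illustration, not lines taken from the PR.

```shell
#!/usr/bin/env bash
set -euo pipefail

# time_phase SCOPE NAME CMD...: run CMD, log start and elapsed whole seconds.
time_phase() {
  local scope="$1" name="$2"
  shift 2
  echo "[$scope] ${name}_start"
  local t0 t1 rc
  t0=$(date +%s)
  set +e            # do not abort on failure: the failing phase is still timed
  "$@"
  rc=$?
  set -e
  t1=$(date +%s)
  echo "[$scope] ${name}_seconds=$((t1 - t0))"
  return "$rc"      # assumption: propagate the wrapped command's exit status
}

# Demo calls (hypothetical scope and phase names).
time_phase demo short_sleep sleep 1
time_phase demo failing_step false || echo "[demo] failing_step exited with $?"
```

Because the helper re-enables `set -e` before returning, a caller that wants to continue after a failed phase has to invoke it in a conditional context, as the last demo line does.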
Codecov Report
✅ All modified and coverable lines are covered by tests.

Coverage diff (develop vs. #1848):

| Metric | develop | #1848 | +/- |
|---|---|---|---|
| Coverage | 77.81% | 77.78% | -0.03% |
| Files | 380 | 380 | |
| Lines | 28647 | 28647 | |
| Branches | 28647 | 28647 | |
| Hits | 22292 | 22284 | -8 |
| Misses | 6044 | 6057 | +13 |
| Partials | 311 | 306 | -5 |
…r version
- run-container-baseline.sh: wrap the command with set +e/set -e in time_phase() so that failures are timed and logged rather than aborting the script silently (matches the pattern already used in run-testing-baseline.sh)
- run-testing-baseline.sh: pin the linter install to --rev 70f84a29925b16a903110e494c9b8de519633a7f (torrust/torrust-linting main tip, 2026-04-06) for reproducible baseline runs

ACK 4fe1b0e
Summary
Closes #1841 (child of EPIC #1840 — Improve PR workflow performance).
Establishes a reproducible baseline for the container and testing CI workflows,
capturing cold-cache and warm-cache wall-clock timings and identifying the
primary bottlenecks.
Changes
- `contrib/dev-tools/workflow-benchmarks/run-container-baseline.sh`: reproducible cold/warm timing script for the container workflow
- `contrib/dev-tools/workflow-benchmarks/run-testing-baseline.sh`: reproducible cold/warm timing script for the testing workflow
- `docs/issues/open/1841-1840-workflow-performance-baseline-analysis/benchmark-results-baseline.md`: baseline report with measurements, bottleneck analysis, and Docker layer breakdown
- `docs/issues/open/1841-1840-workflow-performance-baseline-analysis/evidence/`: raw timing logs and cargo-timing HTML for the baseline runs
- `Containerfile`: added `time` wrappers to multi-command `RUN` blocks for sub-step timing visibility; added the `time` package to the tester stage apt install; removed `time` wrappers from the gcc stage (gcc:trixie has no `time` binary)
- `.dockerignore`: added `/.tmp/` to exclude AI agent log/cargo isolation dirs from the Docker build context
- `AGENTS.md`: added a `.tmp/` entry to the Key Directories section

Key Findings
All timings are local machine sequential totals (AMD Ryzen 9 7950X, 64 GiB RAM).
CI-equivalent wall time is lower due to parallel matrix execution — see the benchmark report for details.
Top bottlenecks (cold):
- `cargo chef cook`: ~18 min (container) / ~15 min (testing)
- `COPY . /build/src` before `cargo chef cook` in the `recipe` stage
- `.tmp/` directory (AI agent logs + benchmark cargo isolation) included in the Docker build context; added to `.dockerignore`

Testing
All pre-push checks passed:
- `cargo +nightly fmt --check` ✓
- `cargo +nightly check --workspace --all-targets --all-features` ✓
- `cargo +nightly doc --workspace --all-features` ✓
- `cargo +stable test --workspace --all-targets --all-features` ✓
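As a rough sketch, the checks listed above could be chained in a small runner that stops at the first failure. The four command strings are copied from this description; the `PRE_PUSH_CHECKS` array and the `run_checks` function are hypothetical names and do not claim to match the repository's actual pre-push tooling.

```shell
#!/usr/bin/env bash
# Commands copied from the PR description; the runner around them is illustrative.
PRE_PUSH_CHECKS=(
  "cargo +nightly fmt --check"
  "cargo +nightly check --workspace --all-targets --all-features"
  "cargo +nightly doc --workspace --all-features"
  "cargo +stable test --workspace --all-targets --all-features"
)

# run_checks CMD...: run each command string in order; stop at the first failure.
run_checks() {
  local cmd
  for cmd in "$@"; do
    if bash -c "$cmd"; then
      echo "✓ $cmd"
    else
      echo "✗ $cmd" >&2
      return 1
    fi
  done
}

# Usage (not executed here): run_checks "${PRE_PUSH_CHECKS[@]}"
```

Passing the commands as arguments keeps the runner independent of the list, so the same function can be tried with trivial commands before it is pointed at the real checks.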