feat(workflow-benchmarks): baseline workflow profiling and bottleneck analysis (#1841) #1848

Open
josecelano wants to merge 8 commits into torrust:develop from josecelano:1841-1840-workflow-performance-baseline-analysis

Conversation

Member

@josecelano josecelano commented May 28, 2026

Summary

Closes #1841 (child of EPIC #1840 — Improve PR workflow performance).

Establishes a reproducible baseline for the container and testing CI workflows,
capturing cold-cache and warm-cache wall-clock timings and identifying the
primary bottlenecks.

Changes

  • contrib/dev-tools/workflow-benchmarks/run-container-baseline.sh — reproducible cold/warm timing script for the container workflow
  • contrib/dev-tools/workflow-benchmarks/run-testing-baseline.sh — reproducible cold/warm timing script for the testing workflow
  • docs/issues/open/1841-1840-workflow-performance-baseline-analysis/benchmark-results-baseline.md — baseline report with measurements, bottleneck analysis, and Docker layer breakdown
  • docs/issues/open/1841-1840-workflow-performance-baseline-analysis/evidence/ — raw timing logs and cargo-timing HTML for the baseline runs
  • Containerfile — added time wrappers to multi-command RUN blocks for sub-step timing visibility; added time package to tester stage apt install; removed time wrappers from gcc stage (gcc:trixie has no time binary)
  • .dockerignore — added /.tmp/ to exclude AI agent log/cargo isolation dirs from Docker build context
  • AGENTS.md — added .tmp/ entry to the Key Directories section

Key Findings

All timings are local machine sequential totals (AMD Ryzen 9 7950X, 64 GiB RAM).
CI-equivalent wall time is lower due to parallel matrix execution — see the benchmark report for details.

| Workflow | Cold run (sequential) | CI-equivalent cold | Warm run |
|---|---|---|---|
| Container (debug+release) | ~499 s (~8.3 min) | ~260 s (~4.3 min) | ~2 s |
| Testing (unit+e2e) | ~767 s (~12.8 min) | ~510 s (~8.5 min) | ~393 s (~6.6 min) |
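
The gap between the sequential and CI-equivalent columns can be pictured with a small calculation. This is only a sketch under an assumption: that the CI-equivalent figure is bounded by the slowest job when matrix jobs run fully in parallel, while the local figure is the sum of all jobs. The function name `wall_times` and the durations passed to it are hypothetical and are not measurements from the report.

```shell
#!/usr/bin/env bash
# Sketch: compare a sequential total (sum) with a fully parallel
# wall time (slowest job). Inputs are job durations in whole seconds.
# wall_times is a hypothetical helper, not part of the PR's scripts.
wall_times() {
    local sum=0 max=0 d
    for d in "$@"; do
        sum=$((sum + d))
        if [ "$d" -gt "$max" ]; then
            max=$d
        fi
    done
    echo "sequential_seconds=$sum parallel_seconds=$max"
}

# Illustrative durations only (not taken from the baseline report).
wall_times 100 40 25
# prints: sequential_seconds=165 parallel_seconds=100
```

Real CI runs also include queueing and runner start-up time, so the slowest-job figure is a lower bound rather than a prediction.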

Top bottlenecks (cold):

  1. Dependency compilation (cargo chef cook) — ~18 min (container) / ~15 min (testing)
  2. Docker layer rebuild triggered by COPY . /build/src before cargo chef cook in recipe stage
  3. .tmp/ directory (AI agent logs + benchmark cargo isolation) included in Docker build context — added to .dockerignore

Testing

All pre-push checks passed:

  • cargo +nightly fmt --check
  • cargo +nightly check --workspace --all-targets --all-features
  • cargo +nightly doc --workspace --all-features
  • cargo +stable test --workspace --all-targets --all-features

Copilot AI review requested due to automatic review settings May 28, 2026 12:51
@josecelano josecelano self-assigned this May 28, 2026
@josecelano josecelano added the Continuous Integration (Workflows and Automation), - Developer - (Torrust Improvement Experience), and Optimization (Make it Faster) labels May 28, 2026

Copilot AI left a comment

Pull request overview

Establishes an in-repo, repeatable baseline for profiling CI workflow performance (container + testing), with scripts to capture cold/warm timings, committed evidence artifacts, and a written bottleneck analysis to anchor future optimizations under EPIC #1840.

Changes:

  • Added reproducible benchmark scripts for container and testing workflow equivalents, emitting structured timing logs into the issue’s evidence/ directory.
  • Added a durable baseline report (benchmark-results-baseline.md) with measurements, bottleneck ranking, and linker-heavy target analysis.
  • Improved timing visibility and build-context correctness via time-wrapped multi-command RUN steps, .dockerignore updates for /.tmp/, and supporting documentation/cspell updates.

Reviewed changes

Copilot reviewed 14 out of 17 changed files in this pull request and generated 5 comments.

| File | Description |
|---|---|
| project-words.txt | Adds British-English spellings used in the new benchmark/report docs. |
| docs/issues/open/1841-1840-workflow-performance-baseline-analysis/ISSUE.md | Marks the sub-issue as implemented and links the new scripts/report/evidence. |
| docs/issues/open/1841-1840-workflow-performance-baseline-analysis/benchmark-results.md | Removes the old placeholder benchmark artifact. |
| docs/issues/open/1841-1840-workflow-performance-baseline-analysis/benchmark-results-baseline.md | Adds the baseline results report, methodology, and bottleneck analysis. |
| docs/issues/open/1841-1840-workflow-performance-baseline-analysis/evidence/container-baseline-20260527T210123Z.log | Adds raw timing evidence for the container baseline run. |
| docs/issues/open/1841-1840-workflow-performance-baseline-analysis/evidence/testing-baseline-20260527T211129Z.log | Adds raw timing evidence for the testing baseline run. |
| docs/issues/open/1841-1840-workflow-performance-baseline-analysis/evidence/cargo-timing-release-20260528T074109Z.html | Adds cargo-timings HTML artifact used for linker/target cost analysis. |
| docs/issues/drafts/1840-workflow-performance-dependency-layer-cache-reuse/ISSUE.md | Updates semantic links to point at the new baseline report filename. |
| docs/issues/drafts/1840-workflow-performance-containerfile-target-scope/ISSUE.md | Updates semantic links to point at the new baseline report filename. |
| docs/issues/drafts/1840-workflow-performance-container-workflow-build-deduplication/ISSUE.md | Updates semantic links to point at the new baseline report filename. |
| docs/issues/drafts/1840-workflow-performance-container-test-gating/ISSUE.md | Updates semantic links to point at the new baseline report filename. |
| cspell.json | Excludes evidence HTML artifacts from cspell checking. |
| contrib/dev-tools/workflow-benchmarks/run-container-baseline.sh | Adds a container workflow baseline runner that logs structured phase timings. |
| contrib/dev-tools/workflow-benchmarks/run-testing-baseline.sh | Adds a testing workflow baseline runner (unit + docker-e2e equivalents) with structured timings. |
| Containerfile | Wraps multi-command RUN blocks with time for sub-step duration visibility in BuildKit output. |
| .dockerignore | Excludes /.tmp/ from Docker build context to avoid cache busting / slow COPY. |
| AGENTS.md | Documents .tmp/ as a workspace-local temp dir and notes its use by tooling/benchmarks. |

Comment thread contrib/dev-tools/workflow-benchmarks/run-container-baseline.sh
Comment on lines +56 to +60
t0=$(date +%s)
"$@"
rc=$?
t1=$(date +%s)
echo "[$scope] ${name}_seconds=$((t1 - t0))"
Member Author

Valid observation. For the purposes of this baseline (measuring multi-minute build phases), second resolution is sufficient and the 0s entries honestly reflect that the phase completed in under one second. Sub-second precision is noted as a future improvement. Switching to date +%s%N would break portability on macOS (BSD date has no %N), so it warrants its own decision when the need arises.
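
One way the sub-second follow-up could be approached without `date +%s%N` is sketched below. It assumes bash 5.0 or later is available wherever sub-second precision matters, since it relies on the bash variable `EPOCHREALTIME`; on older shells (for example the bash 3.2 shipped with macOS) it falls back to whole seconds. The names `now_ms` and `time_phase_ms` and the `_ms` log suffix are hypothetical and are not part of the PR's scripts.

```shell
#!/usr/bin/env bash
# Sketch only: now_ms and time_phase_ms are hypothetical helpers.

# Print the current time in milliseconds since the epoch.
now_ms() {
    if [ -n "${EPOCHREALTIME:-}" ]; then
        # bash >= 5.0: "seconds<radix>microseconds". Read the variable once,
        # because every expansion samples the clock again.
        local now="$EPOCHREALTIME"
        local sec="${now%[.,]*}"    # part before the radix character
        local usec="${now#*[.,]}"   # six-digit microsecond part
        # 10# forces base 10 so leading zeros are not parsed as octal.
        echo $(( sec * 1000 + 10#$usec / 1000 ))
    else
        # Fallback: second resolution, as in the current scripts.
        echo $(( $(date +%s) * 1000 ))
    fi
}

# Run a command and log its duration in milliseconds.
time_phase_ms() {
    local scope="$1" name="$2"
    shift 2
    local t0 t1 rc
    t0=$(now_ms)
    "$@"
    rc=$?
    t1=$(now_ms)
    echo "[$scope] ${name}_ms=$((t1 - t0))"
    return "$rc"
}

time_phase_ms demo nap sleep 0.2
```

With the fallback branch the reported value is a multiple of 1000, so short phases still show as 0, as they do with second resolution today.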

Comment on lines +51 to +62
time_phase() {
local scope="$1" name="$2"
shift 2
echo "[$scope] ${name}_start"
local t0 t1 rc
t0=$(date +%s)
set +e
"$@"
rc=$?
set -e
t1=$(date +%s)
echo "[$scope] ${name}_seconds=$((t1 - t0))"
Member Author

Same as the corresponding comment on run-container-baseline.sh: for multi-minute build phases second resolution is adequate and the 0s entries are correct. Sub-second timing is noted as a future improvement (portability concern with date +%s%N on BSD/macOS).
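
Because both scripts emit lines of the form `[scope] name_seconds=N`, the evidence logs lend themselves to simple post-processing. The following is a minimal sketch, assuming scope and phase names contain no spaces; `summarize_timings` is a hypothetical helper and the sample lines are illustrative, not copied from the committed logs.

```shell
#!/usr/bin/env bash
# Sketch: sum the "<name>_seconds=N" entries per scope from a timing log
# read on stdin. summarize_timings is a hypothetical helper name.
summarize_timings() {
    awk '
        # Only consider lines such as: [container] build_seconds=12
        $0 ~ /^\[[^] ]+\] [^ =]+_seconds=[0-9]+$/ {
            scope = substr($1, 2, length($1) - 2)   # strip the brackets
            split($2, kv, "=")                      # kv[2] holds the seconds
            total[scope] += kv[2]
        }
        END {
            for (s in total) printf "%s total_seconds=%d\n", s, total[s]
        }
    ' | sort
}

# Illustrative input only.
printf '%s\n' \
    '[container] build_start' \
    '[container] build_seconds=12' \
    '[container] push_seconds=3' | summarize_timings
# prints: container total_seconds=15
```

A real log would be piped in the same way, for example with `summarize_timings < path/to/log`.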

Comment thread contrib/dev-tools/workflow-benchmarks/run-testing-baseline.sh

codecov Bot commented May 28, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 77.78%. Comparing base (633370f) to head (4fe1b0e).
⚠️ Report is 2 commits behind head on develop.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #1848      +/-   ##
===========================================
- Coverage    77.81%   77.78%   -0.03%     
===========================================
  Files          380      380              
  Lines        28647    28647              
  Branches     28647    28647              
===========================================
- Hits         22292    22284       -8     
- Misses        6044     6057      +13     
+ Partials       311      306       -5     

…r version

- run-container-baseline.sh: wrap command with set +e/set -e in time_phase()
  so failures are timed and logged rather than aborting the script silently
  (matches the pattern already used in run-testing-baseline.sh)
- run-testing-baseline.sh: pin linter install to --rev 70f84a29925b16a903110e494c9b8de519633a7f
  (torrust/torrust-linting main tip, 2026-04-06) for reproducible baseline runs
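
The `set +e`/`set -e` pattern described in the first item can be sketched as a complete function. The body up to the `_seconds` line follows the snippet quoted in the review thread; everything after it (the `_exit_code` log line and the `return`) is an assumption about how the function ends, not the script's actual code.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Time a command and log its duration even when the command fails.
time_phase() {
    local scope="$1" name="$2"
    shift 2
    echo "[$scope] ${name}_start"
    local t0 t1 rc
    t0=$(date +%s)
    set +e              # a failing command must not abort before logging
    "$@"
    rc=$?
    set -e
    t1=$(date +%s)
    echo "[$scope] ${name}_seconds=$((t1 - t0))"
    # Assumption: record the exit code and hand it back to the caller.
    echo "[$scope] ${name}_exit_code=$rc"
    return "$rc"
}

# The caller decides how to react to a failed phase.
time_phase demo failing false || echo "[demo] phase failed, continuing"
```

Without the `set +e` guard, a non-zero exit from `"$@"` under `set -e` would end the script before the `_seconds` line is written, which is the silent abort the commit message refers to.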
@josecelano
Member Author

ACK 4fe1b0e

Labels

- Developer - (Torrust Improvement Experience)
Continuous Integration (Workflows and Automation)
Optimization (Make it Faster)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Baseline workflow profiling and bottleneck analysis (EPIC #1840)

2 participants