[codex] Add CollectiveX benchmark dashboard #497

Draft: Oseltamivir wants to merge 7 commits into master from collectivex



Oseltamivir (Contributor) commented Jun 25, 2026

What changed

  • adds the /collectivex dashboard using the shared InferenceX D3 chart stack, responsive controls, theme colors, sidebar legend, zoom, and pinned tooltips
  • aligns the frontend with the updated experimental/CollectiveX/results/plots/collectivex_ep.html report: measured dispatch/combine/round-trip operations, isolated-sum caveat, publication/routing filters, phase-specific token ranges, scaling views, heatmaps, sensitivity, failures, coverage, and provenance
  • separates chart controls from data filters and makes the All / BF16 / FP8 dispatch-precision toggle permanently visible and visually prominent
  • adds an activation-profile selector (Normal, Zeros, Small amplitude, Wide dynamic range, FP8 saturation) across the explorer, overview panels, scaling, heatmaps, and sensitivity table; non-normal legend entries are labeled explicitly
  • uses measured geometric sweep values as explicit log2 X-axis ticks/grid lines, removing generated log subdivisions and fixing the odd token placement
  • simplifies CollectiveX Y axes with domain-aware sparse 1-2-5 log ticks, avoids whole-decade domain expansion, and regenerates ticks while zooming
  • normalizes v4 artifacts without conflating independently measured round trip with isolated dispatch+combine percentiles; preserves workload, routing, activation, quantization, resource, placement, runner, and failure identity
  • commits a generated static snapshot with 255 retained sweeps, 1,494 rows, and 7 quarantined cases from 234 workflow runs / 255 artifacts; CollectiveX does not require a database
  • hardens GitHub artifact ingestion with pagination, bounded retries/timeouts, strict artifact-list failures, and source-run inclusion before workflow completion
  • updates the manual/push-driven ingestion path to use the app repository PAT; the persistent source collectivex branch dispatches updates with INFX_FRONTEND_PAT after successful push or manual benchmark runs
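
The 1-2-5 log tick rule for the Y axes can be sketched as follows. This is a minimal illustration, not the code in this PR: `logTicks125` and its `maxTicks` fallback are hypothetical names and behavior. It only emits ticks that already lie inside the given domain, so the domain is never widened to whole decades, and it can be called again with the visible domain after each zoom.

```typescript
// Hypothetical helper, not taken from the PR: 1-2-5 ticks for a log-scale axis.
function logTicks125(min: number, max: number, maxTicks = 10): number[] {
  if (!(min > 0) || !(max >= min)) {
    throw new RangeError("log ticks need 0 < min <= max");
  }
  const candidates: { value: number; mantissa: number }[] = [];
  const firstDecade = Math.floor(Math.log10(min));
  const lastDecade = Math.ceil(Math.log10(max));
  for (let exp = firstDecade; exp <= lastDecade; exp++) {
    for (const mantissa of [1, 2, 5]) {
      // Divide for negative exponents so values like 0.5 stay exact where possible.
      const value = exp >= 0 ? mantissa * 10 ** exp : mantissa / 10 ** -exp;
      // Keep only ticks inside the domain; the domain itself is not expanded.
      if (value >= min && value <= max) candidates.push({ value, mantissa });
    }
  }
  // Assumed sparsity rule: fall back to powers of ten when 1-2-5 is too dense.
  const kept =
    candidates.length > maxTicks
      ? candidates.filter((tick) => tick.mantissa === 1)
      : candidates;
  return kept.map((tick) => tick.value);
}

console.log(logTicks125(3, 120)); // [5, 10, 20, 50, 100]
```

A domain of 3 to 120 keeps its own bounds and gets five ticks; a domain spanning 1 to 1,000,000 exceeds the assumed limit and falls back to one tick per decade.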

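The isolated-sum caveat mentioned above comes down to percentiles not being additive. The sketch below uses invented latencies and a hypothetical nearest-rank `percentile` helper; the round trip is modeled here as a per-iteration sum purely for illustration, whereas the PR treats round trip as an independently measured operation.

```typescript
// Nearest-rank percentile on a sorted copy of the samples (0 < p <= 100).
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new RangeError("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}

// Illustrative latencies in microseconds, invented for this example.
const dispatchUs = [10, 10, 10, 50];
const combineUs = [40, 10, 10, 10];
// Simplification: each round trip is the sum of that iteration's two phases.
const roundTripUs = dispatchUs.map((d, i) => d + combineUs[i]);

// Adding the isolated p75 values is not the p75 of the round trip.
const isolatedSumP75 = percentile(dispatchUs, 75) + percentile(combineUs, 75);
const roundTripP75 = percentile(roundTripUs, 75);

console.log(isolatedSumP75, roundTripP75); // 20 50
```

Because the slow dispatch and the slow combine fall on different iterations, the summed isolated percentiles understate the round-trip percentile, which is why the two quantities are kept as separate fields rather than derived from one another.
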
Data flow

SemiAnalysisAI/InferenceX (collectivex branch push or workflow dispatch) → repository_dispatch → Update CollectiveX Data in this repository → generated packages/app/public/data/collectivex.json commit.

The page fetches that static JSON through React Query. No Neon table or runtime GitHub API request is involved.
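
Since the snapshot is a plain static file, the client only needs to parse and sanity-check it. The following is a minimal sketch under loud assumptions: the real `collectivex.json` schema is not shown in this description, so `SnapshotSketch`, its `series` field, and `parseSnapshot` are hypothetical.

```typescript
// Hypothetical shape: the actual snapshot schema is not given in this PR description.
interface SnapshotSketch {
  series: unknown[];
}

// Parse the fetched text and reject payloads that do not look like a snapshot.
function parseSnapshot(text: string): SnapshotSketch {
  const body: unknown = JSON.parse(text);
  if (typeof body !== "object" || body === null) {
    throw new TypeError("snapshot must be a JSON object");
  }
  const series = (body as { series?: unknown }).series;
  if (!Array.isArray(series)) {
    throw new TypeError("snapshot.series must be an array");
  }
  // Preserve any other top-level fields untouched.
  return { ...(body as object), series };
}

const demo = parseSnapshot('{"series":[{"id":"a"},{"id":"b"}]}');
console.log(demo.series.length); // 2
```

A function like this could run inside the React Query query function right after the fetch resolves, so a malformed snapshot surfaces as a query error instead of a broken chart.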

Validation

  • pnpm lint
  • pnpm fmt
  • pnpm typecheck
  • pnpm build (Node 24)
  • pnpm test:unit — 2,501 passed across workspaces (Node 24, TZ=UTC)
  • Cypress component suite — 152 passed
  • collectivex.cy.ts — 9 passed; landing-performance.cy.ts — 5 passed
  • visual desktop check with the generated 255-series snapshot
  • actionlint .github/workflows/update-collectivex-data.yml
  • live authenticated snapshot regeneration completed successfully and was deterministic
  • source-report parity: 255/255 series, 1,494/1,494 rows, 7/7 failures, zero field mismatches
  • sensitivity parity: all 7 TypeScript summaries exactly match tests/sensitivity.py
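
A parity check of the kind listed above can be sketched as a keyed, field-by-field comparison. This is an illustration only: `compareRows`, `ParityReport`, and the row keys are hypothetical and do not come from the PR's test code.

```typescript
type Row = Record<string, string | number | boolean | null>;

interface ParityReport {
  missing: string[]; // keys in the reference that the snapshot lacks
  extra: string[]; // keys in the snapshot that the reference lacks
  fieldMismatches: string[]; // "key.field" for every differing value
}

// Hypothetical helper: compare reference rows with snapshot rows by key.
function compareRows(
  expected: Row[],
  actual: Row[],
  keyOf: (row: Row) => string,
): ParityReport {
  const actualByKey = new Map(actual.map((row) => [keyOf(row), row] as const));
  const report: ParityReport = { missing: [], extra: [], fieldMismatches: [] };
  const seen = new Set<string>();
  for (const want of expected) {
    const key = keyOf(want);
    seen.add(key);
    const got = actualByKey.get(key);
    if (got === undefined) {
      report.missing.push(key);
      continue;
    }
    // Check the union of fields so a field present on only one side is reported.
    const fields = new Set([...Object.keys(want), ...Object.keys(got)]);
    for (const field of fields) {
      // Object.is treats NaN as equal to NaN, unlike ===.
      if (!Object.is(want[field], got[field])) {
        report.fieldMismatches.push(`${key}.${field}`);
      }
    }
  }
  for (const key of actualByKey.keys()) {
    if (!seen.has(key)) report.extra.push(key);
  }
  return report;
}

const reference: Row[] = [{ id: "a", p50: 1 }, { id: "b", p50: 2 }];
const snapshot: Row[] = [{ id: "a", p50: 1 }, { id: "b", p50: 3 }, { id: "c", p50: 4 }];
const parity = compareRows(reference, snapshot, (row) => String(row["id"]));
console.log(parity); // missing: [], extra: ["c"], fieldMismatches: ["b.p50"]
```

"Zero field mismatches" then corresponds to all three arrays in the report being empty.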

The repository-wide integration suite was attempted locally; DB-backed specs cannot run without the repository database environment. The CollectiveX spec passes independently against its generated fixture.

Unofficial inference/evaluation overlays are not applicable to this independent static CollectiveX schema.

vercel Bot commented Jun 25, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| inferencemax-app | Ready | Preview, Comment | Jun 27, 2026 2:57am |
