diff --git a/astro.config.mjs b/astro.config.mjs index 6460310..d5331fb 100644 --- a/astro.config.mjs +++ b/astro.config.mjs @@ -45,14 +45,21 @@ export default defineConfig({ { label: '/gaia-plan', slug: 'commands/plan' }, { label: '/gaia-spec', slug: 'commands/spec' }, { label: '/gaia-handoff and pickup', slug: 'commands/handoff-pickup' }, - { label: '/gaia-audit', slug: 'commands/audit' }, - { label: '/gaia-fitness', slug: 'commands/fitness' }, { label: '/gaia-forensics', slug: 'commands/forensics' }, - { label: '/gaia-wiki', slug: 'commands/wiki' }, { label: '/update-deps', slug: 'commands/update-deps' }, { label: '/update-gaia', slug: 'commands/update-gaia' }, ], }, + { + label: 'Maintenance', + items: [ + { label: 'Overview', slug: 'maintenance' }, + { label: '/gaia-fitness', slug: 'commands/fitness' }, + { label: '/gaia-audit', slug: 'commands/audit' }, + { label: '/gaia-harden', slug: 'commands/harden' }, + { label: '/gaia-wiki', slug: 'commands/wiki' }, + ], + }, { label: 'Skills', items: [ diff --git a/src/content/docs/commands/audit.mdx b/src/content/docs/commands/audit.mdx index 307756d..c21cadf 100644 --- a/src/content/docs/commands/audit.mdx +++ b/src/content/docs/commands/audit.mdx @@ -1,31 +1,51 @@ --- title: /gaia-audit -description: Audit memory, rules, and wiki for duplication, stale entries, and over-budget autoloads. +description: Audit memory, rules, and wiki for duplication, contradictions, stale entries, and over-budget autoloads, then apply the fixes behind a single review gate. --- -`/gaia-audit` reviews the project's knowledge stores (machine-local memory, project rules, wiki, autoloaded `CLAUDE.md` files) for duplication, stale entries, and bloated base context noise. It runs in two stages: a research stage produces a report, then an apply stage executes the report mechanically. +`/gaia-audit` reviews the project's knowledge stores (machine-local memory, project rules, the wiki, and autoloaded `CLAUDE.md` files) for duplication, contradictions, stale entries, and bloated base context. It runs in two stages: a research stage writes a report, then a single decision gate asks you what to do before an apply stage executes anything. The wiki is the source of truth. Memory is machine-local only. Autoloaded files carry a token cost on every session, so the audit pushes detail behind lazy wikilinks and keeps the autoloaded surface as a pointer set. ## When to use it -Run `/gaia-audit` periodically: after a stretch of work, or when your base context size feels too large. +Run `/gaia-audit` periodically: after a stretch of work, or when your base context size feels too large. GAIA also nudges you. When debt accrues, the statusline shows `Run /gaia-audit ()`; see [Statusline nudge](#statusline-nudge) for the triggers. ## How to invoke -Default. Research, then apply: +There are three invocations. + +Default. Research, review at a gate, then apply on confirmation: ``` /gaia-audit ``` -Apply only. Re-execute the most recent report: +Scoped. Narrow what the research stage inspects to a hint, then run the same gate: + +``` +/gaia-audit "hot.md and rules only" +``` + +The hint is synthesis guidance. It can narrow which stores or files the research stage looks at, but it cannot change the report schema, the action types, the guardrails, or any specific edit. An empty hint is identical to the default full run. + +Apply only. Re-run the apply stage against the most recent report: ``` /gaia-audit --apply ``` -Use `--apply` when an earlier run produced a report but drift caused some actions to skip; fix the drift, then re-apply. Reports older than 24 hours are refused. Generate a fresh one in that case. +Use `--apply` to resume a report that was interrupted, or to retry after fixing drift that caused some actions to skip. It targets the newest report that is still `draft` or `applied-partial`. See [The re-apply window](#the-re-apply-window) for the grace period. + +## The decision gate + +Calling `/gaia-audit` is the intent to audit, not a commitment to apply. After the research stage writes its report, the main conversation summarizes the findings and asks one question with three options: + +- **Apply.** Run the apply stage now. This is the one-keystroke fast path. +- **Discuss / refine.** Talk the report through. Edits happen in the report in place (it stays a `draft`), then the gate is asked again. +- **Decline.** Delete the report. Nothing changes. + +Nothing in any knowledge store is touched until you choose **Apply**. ## What gets audited @@ -40,13 +60,16 @@ Use `--apply` when an earlier run produced a report but drift caused some action | Project rules | `.claude/rules/*.md` | Yes when `paths:` frontmatter matches | | Nested `CLAUDE.md` | `**/CLAUDE.md` (e.g. monorepo packages) | Yes when cwd matches | -The audit checks each entry for cross-store duplication and classifies it as: +The audit checks each entry against the other stores and classifies it as: -- **DUPLICATE**: fact already canonical in the wiki; mark for deletion. -- **PROMOTE**: durable knowledge only in memory; propose moving to a specific wiki page. +- **DUPLICATE**: fact already canonical in the wiki; mark the local copy for deletion. +- **CONFLICT**: an entry that asserts the opposite of an authoritative source on the same subject. Two scopes resolve differently. A memory entry or rule that contradicts the wiki resolves toward the wiki. Two committed project files that disagree (for example a command and a skill that name different models for the same stage) resolve toward whichever file is canonical for that fact, judged per finding. Resolutions are written as ordinary `replace` or `delete` actions. +- **PROMOTE**: durable knowledge only in memory; propose moving it to a specific wiki page. - **KEEP-LOCAL**: genuinely machine-local (personal preference, machine path, unique dev env); keep as-is. - **STALE**: references a file, branch, or feature no longer present; mark for deletion. +Contradiction detection has two deliberate exclusions: a `paths:`-scoped rule that duplicates the wiki on purpose is sanctioned, not a conflict; and a wiki-page-vs-wiki-page disagreement is out of scope here. For that, use `/gaia-wiki consolidate`. + Then it computes auto-load budgets: | File | Budget | @@ -60,11 +83,11 @@ Anything over budget is flagged with a proposed fix: inline facts moved to the w ## The two-stage flow -**Stage 1: research (Sonnet).** Reads the stores, classifies entries, computes budgets, writes a report at `.gaia/local/audit/KNOWLEDGE-{YYYY-MM-DD-HHMM}.md`. Snapshots `git status --short` and `git rev-parse HEAD` into the report's frontmatter so stage 2 can detect drift. Mutates nothing outside `.gaia/local/audit/`. +**Stage 1: research (Sonnet).** Reads the stores, classifies entries, computes budgets, and writes a report to `.gaia/local/audit/KNOWLEDGE-{YYYY-MM-DD-HHMM}.md` with `status: draft`. It snapshots `git status --short` and `git rev-parse HEAD` into the report's frontmatter so the apply stage can detect drift. It mutates nothing outside `.gaia/local/audit/`. -**Stage 2: apply (Haiku).** Reads the most recent report. Re-resolves project root, memory dir, and agent memory dir; if they differ from the report's frontmatter, stops with a clear error. Verifies each action's drift signal: SHA-256 of the target file, or a verbatim snippet that must appear in current content. Applies the action verbatim, or skips with a recorded reason. Never improvises. +**Stage 2: apply (Sonnet).** Reads the most recent unfinished report, re-resolves the project root and memory directories, and stops with a clear error if they differ from the report's frontmatter (the report was generated on another machine or clone). It verifies each action's drift signal, applies the action verbatim or skips it with a recorded reason, then verifies the result landed. The apply stage runs on Sonnet because it makes coordinated multi-file edits where a wrong move can delete the only copy of an entry. -The split exists for technical reasons (different reasoning loads, drift-check between stages), not as a user-confirmation gate. Calling `/gaia-audit` is the intent to apply. +The two-stage split is technical (the stages carry different reasoning loads, with a drift check between them). The user-confirmation checkpoint is the single decision gate, not the stage boundary. ## Action types @@ -73,29 +96,61 @@ The split exists for technical reasons (different reasoning loads, drift-check b | `delete` | Remove a memory or rules file whose facts are canonical in the wiki, gated on the file's SHA-256 matching what the report saw. | | `delete-entry` | Remove a specific block (e.g. a heading section in `MEMORY.md`) by verbatim match. | | `promote` | Move durable knowledge from memory to a named wiki page. Performs the target action (`append_section`, `insert_after_heading`, or `create_new`), prepends a log entry to `wiki/log.md`, appends an index entry to `wiki/index.md`, then deletes the source if the action says to. | -| `replace` | Shrink inline content in an auto-loaded file to a wikilink. The current block must match byte-for-byte; the replacement is typically a single wikilink line. | +| `replace` | Shrink inline content in an autoloaded file to a wikilink, or swap a contradicting line for the canonical value. The current block must match byte-for-byte. | + +The apply stage runs actions in order: `replace` → `delete-entry` → `promote` → `delete`. Earlier replaces never reference content that later actions touch; deletes come last so pointers do not go stale before they are used. Contradiction findings reuse `replace` and `delete`; there is no separate action type for them. -Stage 2 applies actions in order: `replace` → `delete-entry` → `promote` → `delete`. Earlier shrinks never reference content that later actions touch; deletes come last so pointers do not go stale before they are used. +## Report lifecycle and resume + +Each report carries a `status:` field: + +- **`draft`** is what the research stage writes. A draft survives an interrupted run and is resumable with `/gaia-audit --apply`. It is never pruned automatically. +- **`applied`** means every action landed cleanly. +- **`applied-partial`** means some actions skipped or failed. The report is kept so `--apply` can retry only the remainder. + +The research stage prunes its own report directory before writing a new one: it keeps the newest five `applied` or `applied-partial` reports unconditionally, then deletes anything older than 30 days beyond that floor. Drafts are never pruned. Declining at the gate deletes the report immediately. + +## The re-apply window + +When you run `/gaia-audit --apply`, the apply stage checks the report's age on a tiered grace: + +- Up to 24 hours old: proceed. +- 24 to 72 hours old: proceed with a warning that drift checks will catch any staleness. +- Over 72 hours old: stop, and ask you to run a fresh audit. + +The wall clock is a coarse guard. The real safety is the per-action drift check (SHA-256 plus verbatim snippets), which catches any file that changed since the report regardless of age. ## Drift handling -Each action ships with a drift signal: either `expect_sha256` (file-level) or a verbatim `before:` / `expect:` snippet. Stage 2 checks the signal before applying: +Each action ships with a drift signal: either `expect_sha256` (file-level) or a verbatim `before:` / `expect:` snippet. The apply stage checks the signal before applying: - Match: apply, flip the checkbox to `[x]`. - Mismatch: skip, flip to `[~]` with reason `sha drift` or `snippet drift`. - Error during apply: flip to `[!]` with the error. - Target missing: `[!]` with `target missing`. -Files dirty in git that appear as targets get marked `SKIP (dirty)` before any action runs. Dirty-file protection is preventive, not diagnostic. +Files dirty in git that appear as targets are marked `SKIP (dirty)` before any action runs. Dirty-file protection is preventive, not diagnostic. ## After the run -The output is a working-tree diff, not a commit. Stage 2 never runs `git add` or `git commit`. Review the diff and commit if satisfied. +The apply stage verifies each flipped action mechanically before reporting: a `promote` is confirmed present at its target (and its source gone), a `delete` is confirmed gone. Any action that does not verify is downgraded and the report is marked `applied-partial`, so `--apply` retries only what is left. The run ends with a counted summary. + +The output is a working-tree diff, not a commit. The apply stage never runs `git add` or `git commit`. It prints a `recovery:` line so you can undo any change: `git restore ` for tracked edits, `git checkout -- `, or `git clean` for newly created files. There is no separate snapshot store; git is the snapshot for tracked files. Review the diff and commit if satisfied. + +Reports live under `.gaia/local/audit/` and are gitignored. They are machine-local and never committed. + +## Statusline nudge + +GAIA surfaces `Run /gaia-audit ()` in the statusline when any of three conservative, debounced signals fires: + +- **Memory drift**: machine-local memory has grown by 10 or more entries, or 30 or more days have passed since the last applied audit. +- **Project drift**: an autoloaded file is over budget (`wiki/hot.md` over 200 words, root `CLAUDE.md` over 400 words, or any `.claude/rules/*.md` over 200 lines). +- **Pending draft**: a `draft` report is waiting to be resumed. -The skill prunes its own report directory: keeps the newest five reports unconditionally, then deletes anything older than 30 days beyond that floor. +The nudge is suppressed inside git worktrees and is computed in a cached background refresher, never on the statusline's render path. All thresholds are tunable. ## Related -`/gaia-audit` is distinct from the `code-review-audit` agent. The agent reviews code changes on a branch before merge for security, performance, and correctness. `/gaia-audit` reviews knowledge stores for duplication and bloat. Different inputs, different outputs, different purposes. +`/gaia-audit` is distinct from the `code-review-audit` agent. The agent reviews code changes on a branch before merge for security, performance, and correctness. `/gaia-audit` reviews knowledge stores for duplication, contradictions, and bloat. Different inputs, different outputs, different purposes. For wiki-internal redundancy (multiple wiki pages saying the same thing) and broken wikilinks, use `/gaia-wiki`. diff --git a/src/content/docs/commands/fitness.mdx b/src/content/docs/commands/fitness.mdx index b869797..c6e91a3 100644 --- a/src/content/docs/commands/fitness.mdx +++ b/src/content/docs/commands/fitness.mdx @@ -5,7 +5,7 @@ description: Health check and auto-heal for your project's Claude integration. T `/gaia-fitness` checks the health of your project's Claude integration and repairs what it safely can. One invocation, no flags: calling `/gaia-fitness` is the statement of intent to be fit. It runs three phases in sequence: triage, heal, verify, and reports a per-category and overall grade from F up to A+. -It checks the Claude *surface* (hooks, skill/command/agent frontmatter, rule files, `CLAUDE.md`, `.claude/settings.json`, GAIA itself, and wiki structure), not your application code. For app code review before merge, use the `code-review-audit` agent. For knowledge-store bloat and duplication, use [`/gaia-audit`](/commands/audit/). +It checks the Claude *surface* (hooks, skill/command/agent frontmatter, rule files, `CLAUDE.md`, `.claude/settings.json`, GAIA itself, and wiki structure), not your application code. For app code review before merge, use the `code-review-audit` agent. For knowledge-store bloat, duplication, and contradictions, use [`/gaia-audit`](/commands/audit/). ## When to use it @@ -72,13 +72,17 @@ Before any heal-phase change, `/gaia-fitness` checks whether the repo is safe to ## After the run -`/gaia-fitness` prints a findings list grouped by the seven categories, one line per finding: +`/gaia-fitness` renders the result as a deterministic width-aware ASCII card, produced by `gaia fitness render-card`. The card is pasted directly into the chat reply as a fenced code block. -``` -- [severity] `file:line` — remediation -``` +The card has three sections: + +1. **Header row.** The command name (`/gaia-fitness`), the overall grade, and the `OVERALL` label right-aligned. +2. **Category rows.** One row per category, alphabetically sorted. Each row shows the category name, a compact severity note (for example, `2 errors, 1 warning`), and the category grade. A clean category shows no note. +3. **FINDINGS block** (omitted on a clean run). When there are adjudicated findings, a `FINDINGS` section follows the category rows. Findings are grouped by category, and each entry shows the severity tag (`[error]`, `[warning]`, or `[info]`), the file path, and the remediation text, word-wrapped to fit the terminal width. Unresolved or unfixable findings appear here with their recommended approach. + +On a clean run (zero adjudicated findings, overall A+), the FINDINGS block is omitted. The card shows only the header and category rows, all graded A+. -Unresolved or unfixable findings are included alongside their recommended approach. The findings list is followed by the grades table and the overall grade. A clean run skips the findings list and prints only the grades. +The card self-sizes: the box width is clamped to the longest content line, with a hard ceiling of 120 columns and a floor set by the widest category name. Remediation text wraps to fit. Then, depending on what happened: @@ -89,4 +93,4 @@ Then, depending on what happened: ## Related -`/gaia-fitness` checks the Claude integration surface. For knowledge-store duplication and autoload budgets, use [`/gaia-audit`](/commands/audit/). For wiki-internal redundancy and broken wikilinks, use [`/gaia-wiki`](/commands/wiki/). For your application code on a branch before merge, use the [`code-review-audit` agent](/reference/agents/). +`/gaia-fitness` checks the Claude integration surface. For knowledge-store duplication, contradictions, and autoload budgets, use [`/gaia-audit`](/commands/audit/). For wiki-internal redundancy and broken wikilinks, use [`/gaia-wiki`](/commands/wiki/). For your application code on a branch before merge, use the [`code-review-audit` agent](/reference/agents/). diff --git a/src/content/docs/commands/handoff-pickup.mdx b/src/content/docs/commands/handoff-pickup.mdx index 353ee2d..abb06ec 100644 --- a/src/content/docs/commands/handoff-pickup.mdx +++ b/src/content/docs/commands/handoff-pickup.mdx @@ -63,7 +63,7 @@ Sections present only when there is real content. Empty sections are dropped, no No arguments. Pickup runs through four steps: -1. **Locate.** Find the most recent handoff: `ls -t .gaia/local/handoff/HANDOFF-*.md | head -1`. If no handoff exists, fall back to `wiki/hot.md` and report `No handoff found — resuming from hot cache.` +1. **Locate.** Find the most recent handoff: `ls -t .gaia/local/handoff/HANDOFF-*.md | head -1`. If no handoff exists, fall back to `wiki/hot.md` and report `No handoff found, resuming from hot cache.` 2. **Read.** Load the handoff in full. Run `git rev-parse --abbrev-ref HEAD`, `git status --short`, and `git log -1 --oneline` in parallel. 3. **Report.** Surface a tight status block: current branch (with drift from the handoff if any), handoff filename and date, one-line context, 1 to 3 bullets on what's done or in-flight, 1 to 3 bullets on gaps, and a suggested next action. 4. **Archive.** After you confirm a direction (pick an action, start editing, or say "go"), the consumed handoff moves to `.gaia/local/handoff/archive/`. diff --git a/src/content/docs/commands/harden.mdx b/src/content/docs/commands/harden.mdx new file mode 100644 index 0000000..138bbfa --- /dev/null +++ b/src/content/docs/commands/harden.mdx @@ -0,0 +1,116 @@ +--- +title: /gaia-harden +description: Human-gated Policy-Memory Loop. Reviews recurring code-review-audit findings and, with your approval, drafts the lowest-context-weight enforcement form into the working tree. +--- + +`/gaia-harden` is the human gate on the Policy-Memory Loop. When the same anti-pattern recurs across three or more distinct merged PRs at `warning` or `error` severity within a rolling 90-day window, the statusline nudges you with `Run /gaia-harden (N recurring patterns)`. Calling `/gaia-harden` reads the live candidate queue, judges the lightest durable enforcement form for each finding, and presents a per-candidate approve / decline / defer choice. Nothing is authored or activated without your answer. + +## When to use it + +The statusline tells you. When recurring code-review findings accumulate, you will see: + +``` +Run /gaia-harden (N recurring patterns) +``` + +where `N` is the number of open candidates. Run `/gaia-harden` then to work through them. + +You can also call `/gaia-harden list` at any time to see the current candidates without entering the interactive flow, or `/gaia-harden why ` to inspect a specific one. + +## How to invoke + +Default (interactive review, same as what the statusline nudge points at): + +``` +/gaia-harden +``` + +List the live candidates and their recommended forms without prompting: + +``` +/gaia-harden list +``` + +Explain one candidate: the PRs it recurred on, the recommended form, and the rationale: + +``` +/gaia-harden why rule/use-effect-derived-state +``` + +## What triggers a candidate + +The `code-review-audit` agent tags each eligible finding with a stable `finding_class` and a `severity`. A `finding_class` becomes a candidate when: + +- it appears at `warning` or `error` severity (`suggestion` is ineligible), +- across at least **3 distinct merged PRs**, +- within a rolling **90-day window**, and +- no promoted rule already covers it, and the decline ledger does not suppress it. + +The tally is recomputed from GitHub PR data via `gaia harden-tally` each time you invoke `/gaia-harden`. It is an ephemeral projection: an unacted pattern decays by ageing out of the window. + +## Judge-the-form logic + +For each candidate, `/gaia-harden` judges two axes before recommending exactly one enforcement form. The bias is always to the lowest-context-weight form the pattern admits. + +**Edit vs new (checked first).** Before recommending a new artifact, the command checks whether an existing rule, skill, or hook already covers the finding's territory. If one does, the recommendation is to edit it, not create a new one. + +**Which form (lowest context weight that fits).** The `finding_class` prefix and the pattern's nature determine the form: + +| Pattern type | Recommended form | +| --- | --- | +| Oracle-class (`react-doctor/`, `axe/`, `knip/`, `cve/`) | Tighten the existing deterministic check: make it blocking or add it to the quality gate. Never a new prose rule. | +| Mechanizable pattern (catchable by a lint rule, hook, or test) | A deterministic check. v1 produces a hook+script sketch only; nothing is activated. | +| Correct procedure (a sequence of steps) | A skill via `skill-creator`. v1 produces a scaffold only; nothing is activated. | +| Judgment-based anti-pattern (no reliable mechanization) | A path-scoped prose rule. v1 owns this end to end. | + +For each candidate, `/gaia-harden` presents the `finding_class`, its distinct-PR count and the PR numbers, the recommended form, and a one-line rationale. You then choose: **approve**, **decline**, **defer**, or **redirect** (override the form choice). + +## What gets drafted on approve + +### Prose rule (v1: owned end to end) + +The rule is written to `.claude/rules/.md`. It is always path-scoped: a `paths:` frontmatter glob is derived from the candidate's `area_tags`. The rule body uses present-tense prose naming the anti-pattern and the correct pattern. The first line after the frontmatter carries a provenance marker: + +``` + +``` + +The marker documents the rule's origin and instructs `/gaia-audit` not to treat non-recurrence as a staleness signal. A quiet pattern under a live rule means the rule is working. + +### Deterministic check (v1: scaffold only) + +A hook+script sketch: a proposed hook entry and a script outline. The sketch is never wired into `.claude/settings.json`, never made executable, and no `.claude/rules/` file is written for it. You finish and wire the check yourself. + +### Skill (v1: scaffold only) + +A `skill-creator` invocation with the captured intent (what the skill should enable, when it triggers, the expected output). Nothing is activated. + +### Oracle-class enforcement edit + +An edit to the existing enforcement wiring: the tool's rule file, the `code-review-audit` agent, the quality gate doc, or the CI workflow. A new prose rule is never drafted for an oracle-class finding. + +All approved work lands in the working tree only. It ships through your normal PR review. + +## Decline and defer + +**Decline.** The decline is written to the machine-local, gitignored ledger at `.gaia/local/harden/declines.json`. It is never shared: a teammate still sees the nudge and can approve the same candidate independently. The ledger is evidence-based: the declined class stays suppressed for you until at least **3 more distinct PRs** carrying that `finding_class` merge after the decline. At that point it resurfaces. + +**Defer.** Nothing is persisted. The candidate stays in the next tally pass and ages out naturally if the pattern stops recurring before anyone acts. + +## Guardrails + +- `/gaia-harden` never runs `git add`, `git commit`, or `git push`. +- It never auto-activates a skill or wires a deterministic check. +- Every drafted prose rule has a mandatory `paths:` glob: no always-loaded frontmatter-less rules are ever produced. +- A decline is machine-local only: it never vetoes the candidate for a teammate. +- A defer persists nothing. +- The command never reads the privacy-sealed mentorship event store. The Policy-Memory Loop keys only on `finding_class` recurrence from the PR window. + +## Pruning promoted rules + +Promoted rules are not permanent in the "never touch" sense: `/gaia-audit` is the single pruner. It treats a provenance-marked rule as an ordinary rule and prunes it only on obsolescence, redundancy, supersession, or duplication. Non-recurrence is never a prune signal. + +## Related + +- [`code-review-audit` agent](/reference/agents/): the source of the `finding_class` emissions the loop counts. +- [`/gaia-audit`](/commands/audit/): the single pruner for promoted rules and all other knowledge-store hygiene. diff --git a/src/content/docs/commands/index.mdx b/src/content/docs/commands/index.mdx index 9673a34..8030456 100644 --- a/src/content/docs/commands/index.mdx +++ b/src/content/docs/commands/index.mdx @@ -5,7 +5,7 @@ sidebar: order: 1 --- -GAIA's day-to-day workflows are discrete slash entries that show up in autocomplete as you type `/gaia-`. Five are commands (`/gaia-plan`, `/gaia-spec`, `/gaia-audit`, `/gaia-fitness`, `/gaia-forensics`). Three are skills (`/gaia-wiki`, `/gaia-handoff`, `/gaia-pickup`) that also respond to plain-English requests, not just the slash form. +GAIA's day-to-day workflows are discrete slash entries that show up in autocomplete as you type `/gaia-`. Six are commands (`/gaia-plan`, `/gaia-spec`, `/gaia-audit`, `/gaia-fitness`, `/gaia-forensics`, `/gaia-harden`). Three are skills (`/gaia-wiki`, `/gaia-handoff`, `/gaia-pickup`) that also respond to plain-English requests, not just the slash form. ## Workflow Commands @@ -20,8 +20,9 @@ GAIA's day-to-day workflows are discrete slash entries that show up in autocompl | Command | What it does | | --- | --- | -| [`/gaia-audit`](/commands/audit/) | Audit memory, rules, and wiki for duplication, stale entries, and unnecessary base context overhead. | +| [`/gaia-audit`](/commands/audit/) | Audit memory, rules, and wiki for duplication, contradictions, stale entries, and unnecessary base context overhead. | | [`/gaia-fitness`](/commands/fitness/) | Health check and auto-heal for the Claude integration surface. Triage, heal, verify, with a letter grade. | +| [`/gaia-harden`](/commands/harden/) | Human-gated Policy-Memory Loop. Reviews recurring code-review-audit findings and drafts the lowest-context-weight enforcement form for your approval. | | [`/gaia-wiki`](/commands/wiki/) | Wiki maintenance. With no arguments, runs the full chain. Each can also be invoked individually. | ## Update Commands @@ -39,4 +40,4 @@ GAIA's day-to-day workflows are discrete slash entries that show up in autocompl ## No-argument behavior -Each entry has a sensible default when called with no arguments. `/gaia-wiki` runs the full chain (sync, consolidate, lint). `/gaia-audit` runs research then apply. `/gaia-plan`, `/gaia-spec`, and `/gaia-forensics` prompt for or infer what they need before continuing. +Each entry has a sensible default when called with no arguments. `/gaia-wiki` runs the full chain (sync, consolidate, lint). `/gaia-audit` researches, then asks before it applies. `/gaia-harden` defaults to its interactive review mode. `/gaia-plan`, `/gaia-spec`, and `/gaia-forensics` prompt for or infer what they need before continuing. diff --git a/src/content/docs/commands/plan.mdx b/src/content/docs/commands/plan.mdx index b753a5a..52fd12e 100644 --- a/src/content/docs/commands/plan.mdx +++ b/src/content/docs/commands/plan.mdx @@ -50,7 +50,7 @@ Inside the directory: ## The flow 1. **Description capture.** If `$ARGUMENTS` is empty, the skill asks. If a SPEC reference is detected, the SPEC itself becomes the source of truth and the dispatch summary is just a label. -2. **Model check.** If the current session is not on Opus, the skill offers to spawn the planning agent on Opus 4.7. Plans benefit from the higher-capacity model. +2. **Model check.** If the current session is not on Opus, the skill offers to spawn the planning agent on Opus. Plans benefit from the higher-capacity model. 3. **Plan directory resolution.** Slug is derived (or seeded from the SPEC id), suffixed on collision, then created under `.gaia/local/plans/`. 4. **Planner dispatch.** The skill spawns a `general-purpose` Agent restricted to writing inside the plan directory only. The planner reads `wiki/concepts/Task Orchestration.md` for the orchestration pattern, then writes the plan files directly to disk. 5. **Verification.** The skill confirms `README.md`, `ORCHESTRATOR.md`, `KICKOFF.md`, and at least one `task-*.md` exist. If anything is missing, the failure surfaces; no silent retry. diff --git a/src/content/docs/commands/spec.mdx b/src/content/docs/commands/spec.mdx index 8a66b52..0614499 100644 --- a/src/content/docs/commands/spec.mdx +++ b/src/content/docs/commands/spec.mdx @@ -3,7 +3,7 @@ title: /gaia-spec description: Run a Socratic discovery loop and save an immutable SPEC artifact. --- -`/gaia-spec` produces an immutable SPEC artifact through a guided Socratic conversation. The artifact lands at `.gaia/local/specs/SPEC-NNN.md` and serves as the contract for [`/gaia-plan`](/commands/plan/) to decompose into work. The skill produces an artifact and stops. It does not implement anything. +`/gaia-spec` produces an immutable SPEC artifact through a guided Socratic conversation. The artifact lands at `.gaia/local/specs/SPEC-NNN/SPEC.md` and serves as the contract for [`/gaia-plan`](/commands/plan/) to decompose into work. The skill produces an artifact and stops. It does not implement anything. This is for when you want to implement a large feature or change request with multiple parts, or you want to present an idea or concept and refine it before implementation. @@ -60,12 +60,12 @@ A handful of mechanics are used by multiple steps: 1. **Description capture.** Open-ended prompt if not provided as arguments. 2. **GitHub issue mirror prompt.** Single `AskUserQuestion`: skip (recommended) or mirror to a GitHub issue on save. 3. **Resume-vs-start-new.** If an in-progress SPEC exists, the skill surfaces it (last touched, intent first line, UAT count) and asks: resume, start new, or discard the draft cache. -4. **Initial draft.** `/speckit-specify` runs the GAIA preset's body, allocates a SPEC id, lands the artifact at `.gaia/local/specs/SPEC-NNN.md`. The `before_specify` hook (constitution check) fires automatically. +4. **Initial draft.** `/speckit-specify` runs the GAIA preset's body, allocates a SPEC id, lands the artifact at `.gaia/local/specs/SPEC-NNN/SPEC.md`. The `before_specify` hook (constitution check) fires automatically. 5. **Gate 1.** Plain prompt: confirm intent and UATs, or revise. 6. **Socratic clarify loop.** `/speckit-clarify` runs, capped at five questions. Closed-set questions go through `AskUserQuestion` with the recommended option first; open-ended questions are plain prompts. Per-topic exhaustion checkpoints fire when the natural well runs dry. Research subagents dispatch for any question requiring prior-art lookup. 7. **Self-review.** The `after_clarify` hook fires `/speckit-gaia-self-review` against the gate-1 snapshot. Findings come back classified low, medium, or high. Low and medium auto-apply; high surfaces to the user before applying. Pending clarifications block save until answered or deferred with rationale. 8. **Gate 2.** Plain prompt: confirm the full rendered artifact, or revise. -9. **Canonical save.** Writes `.gaia/local/specs/SPEC-NNN.md` and deletes the working-draft cache. The `after_specify` hook fires `/speckit-gaia-lint` for immutability checks. Cycles 1 and 2 surface failures; cycle 3 prompts to step back to gate 2, defer remaining findings, or push another fix. +9. **Canonical save.** Writes `.gaia/local/specs/SPEC-NNN/SPEC.md` and deletes the working-draft cache. The `after_specify` hook fires `/speckit-gaia-lint` for immutability checks. Cycles 1 and 2 surface failures; cycle 3 prompts to step back to gate 2, defer remaining findings, or push another fix. 10. **Optional GitHub issue mirror.** If opted in at step 2, `gh-mirror.sh` creates the issue and stamps the URL into frontmatter. Skips silently when `gh` is not authenticated, Issues are disabled, or the viewer lacks write permission. 11. **Chain to `/gaia-plan`.** `AskUserQuestion`: trigger `/gaia-plan` now or defer. On yes, the skill enters a multi-plan dispatch loop. Each plan runs in its own `general-purpose` Agent so the wrapper context stays bounded across plan count. diff --git a/src/content/docs/commands/update-deps.mdx b/src/content/docs/commands/update-deps.mdx index f10e16c..bf70024 100644 --- a/src/content/docs/commands/update-deps.mdx +++ b/src/content/docs/commands/update-deps.mdx @@ -3,7 +3,7 @@ title: /update-deps description: Bump outdated packages group-by-group, run migrations for major-version bumps, and validate the full quality gate. --- -`/update-deps` discovers outdated packages, applies migrations for major-version bumps, and validates the project through the full quality gate. It is wired to the statusline: when work is pending, the statusline shows a `Run /update-deps` indicator that loads the skill when clicked. +`/update-deps` discovers outdated packages, shows you a grouped preview so you can snooze anything you are not ready for, applies migrations for major-version bumps, and validates the project through the full quality gate. It is wired to the statusline: when work is pending, the statusline shows a `Run /update-deps` indicator that loads the skill when clicked. ## When to use it @@ -17,6 +17,20 @@ Click the `Run /update-deps` statusline indicator, or trigger by phrasing: - "bump deps" - "run dependabot" +## Interactive preview and snooze + +Before applying anything, the skill shows a grouped preview. Updates are bucketed into four sections: **Major**, **Non-semver**, **Minor**, **Patch**. Companion groups (packages that always move together) appear as a single block under the section of their most-severe member. + +The preview offers three choices: + +- **Update all** (default): apply every group. Any prior snoozes are cleared. +- **Choose what to skip**: name the packages or groups to defer. Each name expands to its whole companion group. The remaining apply set is echoed back for confirmation. If you skip everything, the skill exits without creating a branch. +- **Cancel**: exit with no changes and no ledger write. + +Skipped groups are recorded in `.gaia/local/declined-updates.json` (gitignored, local only). A snoozed group resurfaces automatically when a newer version ships or after 14 days, whichever comes first. Snoozes only quiet the statusline nudge; every future preview still shows snoozed groups as updatable, and CI ignores the ledger entirely. + +`CI=true` and `--scope ` runs skip the preview and apply the full set (or the named group) unattended. + ## How it picks targets Updates are grouped so companion packages move together. When any package in a group is outdated, the entire group ships in the same install. Groups include: @@ -32,9 +46,23 @@ After grouping, updates are classified into two waves: - **Wave A**: minor and patch bumps, bundled into one install. - **Wave B**: major-version bumps, each group processed individually with its own migration guide and code edits. -## The CLI primitive behind it +## The CLI primitives behind it + +The deterministic parts (discovering outdated packages, applying the group rules, and classifying each group into Wave A or B) run through the bundled CLI: `gaia update-deps run --emit-updates ` writes the grouped, wave-classified set to a JSON file. The skill calls it for those steps and layers the preview, migrations, and quality gate on top. The same primitive backs the `gaia-ci-update-deps` workflow that [`/setup-gaia-ci`](/getting-started/setup-gaia-ci/) can install, so scheduled and interactive runs agree on what is outdated. -The deterministic parts (discovering outdated packages, applying the group rules, and classifying each group into Wave A or B) run through the bundled CLI: `gaia update-deps run --emit-updates ` writes the grouped, wave-classified set to a JSON file. The skill calls it for those steps and layers the migrations and quality gate on top. The same primitive backs the `gaia-ci-update-deps` workflow that [`/setup-gaia-ci`](/getting-started/setup-gaia-ci/) can install, so scheduled and interactive runs agree on what is outdated. +`gaia update-deps decline` manages the snooze ledger: + +- `--source --skip `: record the named groups as snoozed. Accepts either a group name (e.g. `react-router`) or any member package name; both resolve to the whole group. +- `--clear`: empty the ledger (used automatically when you choose "Update all"). + +## Override audit + +Before Wave A runs, the skill audits every entry in the `overrides:` map in `pnpm-workspace.yaml` (pnpm 11 reads overrides here; the `pnpm.overrides` field in `package.json` is no longer honored). For each override, two tests determine whether removing it is safe: + +1. **Peer-dep test**: does removing it produce a `pnpm ls` peer-dep error? +2. **Security-floor test**: does removing it introduce any new advisory that was absent from the baseline `pnpm audit` output? + +An override is declared obsolete only when removing it passes both tests. A peer-dep test alone would silently delete security-floor pins (CVE pins never produce a peer-dep error), so both tests are required. The audit runs again after Wave A and Wave B complete, in case a version bump elsewhere resolved the original conflict. ## Pinned versions @@ -52,10 +80,15 @@ pnpm pw pnpm build ``` +## Wave B remediation cap + +For major-version bumps, the skill makes one remediation pass if the quality gate fails after applying migration changes, then re-runs the gate once. If the gate still fails after that single re-run, the entire group is reverted and logged as skipped. There is no unbounded retry loop. + ## Gotchas - Must run from the main checkout, not a linked worktree. The skill rejects worktree invocations early and surfaces the cached outdated count from main so you know whether action is even pending. - Must run from `main` or `master`, or from an existing chore branch. If on `main`, the skill creates `chore/update-deps-`. - The final report is built from agent-returned data only. Anything filtered before installation (the ESLint cap) is silent on purpose. +- `actionable_count` in the statusline already subtracts snoozed groups; the skill uses `total_count` internally for the actual apply set. Source: `.claude/skills/update-deps/SKILL.md`. diff --git a/src/content/docs/commands/update-gaia.mdx b/src/content/docs/commands/update-gaia.mdx index 542b5ea..23e4c76 100644 --- a/src/content/docs/commands/update-gaia.mdx +++ b/src/content/docs/commands/update-gaia.mdx @@ -3,7 +3,7 @@ title: /update-gaia description: Pull the latest GAIA release into your project without overwriting your customizations. --- -`/update-gaia` pulls the latest GAIA release into your project using a three-way merge strategy. It compares your file, the baseline tarball your project was installed from, and the latest tarball — applying updates where GAIA owns content and surfacing conflicts where you do. It is wired to the statusline: when a new release is available, the statusline shows a `Run /update-gaia` indicator that loads the skill when clicked. +`/update-gaia` pulls the latest GAIA release into your project using a three-way merge strategy. It compares your file, the baseline tarball your project was installed from, and the latest tarball. It applies updates where GAIA owns content and surfaces conflicts where you do. It is wired to the statusline: when a new release is available, the statusline shows a `Run /update-gaia` indicator that loads the skill when clicked. ## When to use it @@ -24,7 +24,7 @@ The three-way merge is governed by ownership classes defined in `.gaia/manifest. | Class | Who controls it | On drift | | --- | --- | --- | | `owned` | GAIA fully. Overwritten silently when unchanged from baseline. | Prompts if you have local changes. | -| `shared` | GAIA seeds, you customize. | Emits a `.gaia-merge/.patch` for manual resolution. | +| `shared` | GAIA seeds, you customize. | Emits a `.gaia-merge/.patch` for manual resolution. Exception: `package.json` and `pnpm-workspace.yaml` get field-aware key-level merges instead of whole-file conflict patches (see below). | | `wiki-owned` | GAIA-seeded wiki pages (concepts, decisions, modules). Same as `shared`. | Emits a patch for manual resolution. | | adopter-owned (implicit) | Anything not in the manifest, plus sentinels like `wiki/hot.md`, `wiki/log.md`, `CHANGELOG.md`, `.gaia/VERSION`, `.gaia/manifest.json`. | Never touched. | @@ -34,17 +34,27 @@ The three-way merge is governed by ownership classes defined in `.gaia/manifest. 2. Resolves the latest release tag via `gh release list --repo gaia-react/gaia` (falls back to the GitHub API). 3. Shows the release notes and asks you to confirm. 4. Downloads the baseline and latest tarballs into `.gaia/cache/` (gitignored). -5. Runs `gaia update merge` to compute the three-way diff per file. +5. Runs the three-way merge per file directly (no CLI subcommand). Owned files are overwritten or conflict-patched; shared and wiki-owned files are merged or conflict-patched. `package.json` and `pnpm-workspace.yaml` are both classed `shared` but receive field-aware key-level merges rather than whole-file patches (see below). 6. Walks any conflicts with you one at a time; reads `.gaia-merge/.patch` for each. -7. Prompts before deleting any file that the latest release removed. -8. Writes the new version to `.gaia/VERSION` and copies the latest manifest into `.gaia/manifest.json`. -9. Prints a summary with counts (overwritten, added, skipped, conflicts, deleted, backed up). +7. For `pnpm-workspace.yaml`, uses `gaia update merge-workspace` to compute per-key verdicts. This is a field-aware merge of both the GAIA-managed settings keys and the `overrides` and `allowBuilds` maps. Overrides and build approvals live here (pnpm 11 no longer reads them from `package.json`); an adopter-only entry is never visited or clobbered. +8. Prompts before deleting any file that the latest release removed. +9. Prints a summary with counts (overwritten, added, skipped, conflicts, deleted, backed up) and per-file counts for `package.json` and `pnpm-workspace.yaml` (applied, conflicts, suggestions). Then writes the new version to `.gaia/VERSION` and copies the latest manifest into `.gaia/manifest.json`. The VERSION bump is deferred to this point so an interrupted run stays resumable at the baseline version. Backups land in `.gaia-backup//`. Conflict patches land in `.gaia-merge/`. +## Field-aware merges + +`package.json` and `pnpm-workspace.yaml` are both classed `shared`, but a whole-file three-way merge produces noise for them (every project diverges `package.json` at init; every override addition drifts `pnpm-workspace.yaml`). Both get key-level merges instead. + +**`package.json`** merges at JSON-key granularity across the managed sections: `dependencies`, `devDependencies`, `scripts`, `engines`, and `packageManager`. Identity keys (`name`, `version`, `description`, `author`, etc.) are never compared or patched. For each managed key, the outcome is one of: **apply** (GAIA changed it, you are still at the baseline pin), **conflict** (GAIA changed it, you re-pinned independently, left as yours with a note), or **suggestion** (GAIA added a key or changed one you removed, surfaced opt-in, never auto-inserted). Applied changes are written in place; conflicts and suggestions go to `.gaia-merge/package.json.notes`. + +**`pnpm-workspace.yaml`** merges at key/entry granularity using `gaia update merge-workspace`. The seven GAIA-managed settings keys (`minimumReleaseAge`, `trustPolicy`, etc.) are compared whole-value; the `overrides` and `allowBuilds` maps are compared per entry. An adopter-only override or build approval is never visited. Same apply/conflict/suggestion buckets as `package.json`; notes go to `.gaia-merge/pnpm-workspace.yaml.notes`. If the file is missing in the baseline (pre-pnpm 11 project) or unparseable, the skill falls back to a whole-file conflict patch. + +A release that touches no managed keys in either file produces a clean skip with no notes file. + ## After the run -Review patches in `.gaia-merge/`, run the quality gate, and inspect `git diff` before committing. The skill does not auto-commit. When satisfied: +Review patches in `.gaia-merge/`, reconcile any `.gaia-merge/package.json.notes` or `.gaia-merge/pnpm-workspace.yaml.notes` files, then run `pnpm install` if `package.json` or `pnpm-workspace.yaml` was changed. Run the quality gate and inspect `git diff` before committing. The skill does not auto-commit. When satisfied: ``` git commit -m "chore: update GAIA to " diff --git a/src/content/docs/contributors/ci.mdx b/src/content/docs/contributors/ci.mdx index 0ccb24b..b332e4c 100644 --- a/src/content/docs/contributors/ci.mdx +++ b/src/content/docs/contributors/ci.mdx @@ -35,7 +35,7 @@ PRs that don't touch `.gaia/cli/**` or this workflow file report green without i When the filter triggers, the job: -1. Sets up pnpm (pinned to v4.4.0 because v6 surfaces an `ERR_PNPM_IGNORED_BUILDS` regression on the `pnpm -C .gaia/cli install --frozen-lockfile` invocation). +1. Sets up pnpm via `pnpm/action-setup` (pinned to v6.0.5). 2. Sets up Node from `.node-version` with pnpm cache keyed to `.gaia/cli/pnpm-lock.yaml`. 3. Installs CLI dependencies with `--frozen-lockfile`. 4. Runs `pnpm -C .gaia/cli typecheck`. diff --git a/src/content/docs/contributors/cli.mdx b/src/content/docs/contributors/cli.mdx index f0f05bb..5c22d07 100644 --- a/src/content/docs/contributors/cli.mdx +++ b/src/content/docs/contributors/cli.mdx @@ -67,13 +67,66 @@ gaia setup link-worktree [--json] Source: `.gaia/cli/src/update/index.ts` -Wired by the `/update-gaia` skill for the deterministic byte-level merge step. +Wired by the `/update-gaia` skill for the field-aware `pnpm-workspace.yaml` merge step (Step 7b). ``` -gaia update merge --baseline --latest --manifest [--json] +gaia update merge-workspace --baseline --latest --current [--json] ``` -The skill drives tarball fetching and user-facing prompts; the CLI handles per-manifest-entry classification and three-way file compare so the skill never reads bytes per entry. +A field-aware, read-only verdict oracle for `pnpm-workspace.yaml`. It parses three YAML files (baseline from the prior GAIA release, latest from the incoming tarball, current working-tree copy) and emits a JSON report of `{applied, conflicts, suggestions}`. It never writes the workspace file. The `/update-gaia` skill applies the `applied` entries with the Edit tool so comments, key order, and quote style survive. The `--json` flag switches output from the human summary to the raw JSON report. + +### `gaia update-deps` + +Source: `.gaia/cli/src/update-deps/index.ts` + +Dependency update primitives invoked by the `/update-deps` skill. + +``` +gaia update-deps run --emit-updates +gaia update-deps decline --source --skip +gaia update-deps decline --clear +``` + +`run` discovers outdated packages, classifies them into Wave A (minor/patch) and Wave B (major), and writes a JSON payload to ``. `decline` records the groups a developer skipped in the interactive preview into `.gaia/local/declined-updates.json` (gitignored, local statusline only) so the statusline nudge stops counting them. `--source` points at the payload written by `run --emit-updates`; `--skip` takes a comma-separated list of package or group names, each expanded to its whole companion group. `--clear` empties the ledger (used when the developer chose "update all"). + +### `gaia fitness` + +Source: `.gaia/cli/src/fitness/index.ts` + +Presentation helper invoked by the `/gaia-fitness` skill to render the final report card. + +``` +gaia fitness render-card [--cols N] +``` + +Reads a `/gaia-fitness` report JSON document on stdin and writes a width-aware ASCII report card to stdout. `--cols` sets the target terminal width (default: autodetect from `process.stdout.columns`, fallback 100). The box self-sizes to the longest content line, clamped to 120 columns and to the terminal width, wrapping remediation text to fit. Categories render alphabetically. The FINDINGS block is omitted when the `findings` array is empty (clean run). The skill pipes the assembled report JSON through this command and pastes the stdout verbatim into its chat reply. + +### `gaia harden-tally` + +Source: `.gaia/cli/src/harden/tally.ts` + +Tallies recurring code-review-audit findings for the `/gaia-harden` policy-memory loop. + +``` +gaia harden-tally +``` + +Reads the rolling 90-day merged-PR window via `gh`, extracts each PR's machine-readable findings block, counts distinct PRs per `finding_class` at error/warning severity, drops classes already covered by a promoted rule or suppressed by the decline ledger, self-cleans stale ledger entries, and prints the candidate list as JSON to stdout. A class is a candidate when it recurs across 3 or more distinct PRs. Network failures are non-fatal: a `gh` error yields an empty candidate list rather than aborting. No flags. + +### `gaia harden-ledger` + +Source: `.gaia/cli/src/harden/ledger.ts` + +Machine-local decline ledger for the `/gaia-harden` policy-memory loop. The ledger is gitignored so a decline on one machine never vetoes the rule for a teammate. + +``` +gaia harden-ledger list +gaia harden-ledger record --finding-class --pr-count +gaia harden-ledger is-suppressed --finding-class --current-pr-count +gaia harden-ledger prune --window-classes +``` + +`list` prints the full ledger as JSON. `record` upserts one entry keyed by `finding_class`, snapshotting the current PR count at decline time (re-recording overwrites the timestamp and count). `is-suppressed` exits 0 (suppressed) when an entry exists and fewer than 3 distinct PRs carrying the class have merged since the decline; exits 1 (not suppressed) otherwise. `prune` removes entries whose `finding_class` is no longer in the active window, idempotently. Ledger file: `.gaia/local/harden/declines.json`. ### `gaia wiki` diff --git a/src/content/docs/maintenance.mdx b/src/content/docs/maintenance.mdx new file mode 100644 index 0000000..2d8dd0e --- /dev/null +++ b/src/content/docs/maintenance.mdx @@ -0,0 +1,34 @@ +--- +title: Maintenance +description: The GAIA commands that keep the GAIA setup itself in shape, the Claude integration surface, the loaded context, and the wiki. +--- + +GAIA ships commands for doing work and commands for keeping the GAIA setup itself in shape. This page covers the second kind: the commands that check and repair GAIA's own setup as a project grows. + +As a project accumulates work, three parts of that setup drift independently: + +- the **Claude integration surface**: hooks, command and skill frontmatter, rule wiring, `.claude/settings.json`; +- the **loaded context** Claude reads every session: machine-local memory, project rules, and autoloaded `CLAUDE.md` files; +- the **wiki** itself: pages that overlap or contradict each other, and broken wikilinks. + +Each part has its own command. They run independently and do not call one another. + +| Command | What it checks | The question it answers | +| --- | --- | --- | +| [`/gaia-fitness`](/commands/fitness/) | The Claude integration surface: hooks, frontmatter, rules, `.claude/settings.json` | Does the machine run? | +| [`/gaia-audit`](/commands/audit/) | The loaded context: memory, rules, and autoloaded files, for duplication, contradictions against the source of truth, stale entries, and over-budget autoloads | Is the fuel clean? | +| [`/gaia-wiki`](/commands/wiki/) | The wiki itself: redundant or conflicting pages, and broken wikilinks | Is the knowledge base internally consistent? | + +## Which one to run + +Pick by symptom: + +- The integration feels off, a hook is not firing, a rule is not loading, a command misbehaves: run [`/gaia-fitness`](/commands/fitness/). +- The base context feels heavy, or machine-local memory has grown large: run [`/gaia-audit`](/commands/audit/). +- The wiki has grown and pages overlap, or wikilinks break: run [`/gaia-wiki`](/commands/wiki/). + +The boundary between `/gaia-audit` and `/gaia-wiki` is deliberate. `/gaia-audit` reconciles the loaded context against the wiki (the source of truth) and resolves disagreements between project files; it does not touch wiki-page-vs-wiki-page conflicts. Those belong to `/gaia-wiki consolidate`. + +## Related + +For keeping dependencies and GAIA itself current, a different kind of upkeep, see [`/update-deps`](/commands/update-deps/) and [`/update-gaia`](/commands/update-gaia/). diff --git a/src/content/docs/reference/agents.mdx b/src/content/docs/reference/agents.mdx index 960acfe..8542de2 100644 --- a/src/content/docs/reference/agents.mdx +++ b/src/content/docs/reference/agents.mdx @@ -11,7 +11,7 @@ For Claude Code's own subagent documentation, see [the Subagents reference on do ## code-review-audit -Path: `.claude/agents/code-review-audit.md`. Model: `sonnet`. +Path: `.claude/agents/code-review-audit.md`. Model: `opus`. Comprehensive code review, security audit, performance analysis, and architectural assessment. Goes beyond what ESLint and TypeScript catch by reasoning about intent, data flow, and architectural fit. **Mandatory before any PR merge.** @@ -30,7 +30,11 @@ Comprehensive code review, security audit, performance analysis, and architectur - **Accessibility**: keyboard reachability, semantic HTML, focus management, ARIA usage. - **Maintainability**: magic values, dead code, coupling, comment quality. -The main agent handles cross-cutting reasoning. Specialist subagents (React patterns, TypeScript / architecture, translation) run in parallel for line-level rule compliance, alongside `react-doctor`, `pnpm knip --reporter json`, and `pnpm audit --json`. The `pnpm audit` pass is the Dependency-CVE advisory: it reports high and critical findings for you to decide on and never blocks the audit marker or `gh pr merge`. See the `dep-audit` rule. +The review runs in two phases. In the first phase, the main agent surfaces every candidate finding tagged with severity and confidence (coverage-first: low-confidence candidates are surfaced, not silently dropped). In the second phase, each surviving Critical or Important holistic finding is handed to a fresh-context refuter subagent; the refuter overturns a finding only with concrete counter-evidence (a specific guard, a test, or a demonstration that the failure path is unreachable). The report is not written until the adversarial pass completes. + +The main agent owns every spawn: specialist subagents (React patterns, TypeScript / architecture, translation), oracle tools (`react-doctor`, `pnpm knip --reporter json`, `pnpm audit --json`), and the per-finding refuters. None of these spawn further (depth-1 star topology). The `pnpm audit` pass is the Dependency-CVE advisory: it reports high and critical findings for you to decide on and never blocks the audit marker or `gh pr merge`. See the `dep-audit` rule. + +Per-phase progress breadcrumbs (scope resolved, oracles done, holistic review done, adversarial verify done, report stamped) are written to `.gaia/local/audit/progress.log` and surfaced in the GitHub Actions step summary. ### Output diff --git a/src/content/docs/reference/hooks.mdx b/src/content/docs/reference/hooks.mdx index 17ccbcd..9b6aaa1 100644 --- a/src/content/docs/reference/hooks.mdx +++ b/src/content/docs/reference/hooks.mdx @@ -52,18 +52,18 @@ For Claude Code's own hook documentation, see [the Hooks reference on docs.claud | Hook | What it does | Path | |---|---|---| -| `block-bare-test` | Blocks bare `pnpm test` / `npm test` invocations; requires `--run` so vitest does not start watch mode. | `.claude/hooks/block-bare-test.sh` | +| `block-bare-test` | Blocks bare `pnpm test` / `npm test` invocations; requires `--run` so vitest does not start watch mode. The block is anchored to command position: a `pnpm test` or `npm test` mention inside a commit message, `--body` string, or other quoted text is not an invocation and does not trigger the block. | `.claude/hooks/block-bare-test.sh` | | `block-main-destructive-git` | Blocks commits to `main`/`master` and force-pushes to those branches. | `.claude/hooks/block-main-destructive-git.sh` | | `block-no-verify` | Denies `git commit` / `git push` that carry a hook bypass (`--no-verify`, or commit-only `-n`), so the commit-time floor (typecheck, lint, test) cannot be skipped. | `.claude/hooks/block-no-verify.sh` | | `block-rm-rf` | Denies `rm -rf` against root, `$HOME`, cwd, unscoped globs, `.git`, `node_modules`, and any path outside a small whitelist of scratch dirs. | `.claude/hooks/block-rm-rf.sh` | | `pr-merge-audit-check` | Blocks `gh pr merge` until a code-review-audit marker file exists at `.gaia/local/audit/.ok`. | `.claude/hooks/pr-merge-audit-check.sh` | -| `red-verify-commit-check` | Blocks `git commit` when a new-at-HEAD test that now passes has no recorded failing (RED) run matching its current content. Enforces mechanical TDD: a new test must be observed failing before its passing version can land. | `.claude/hooks/red-verify-commit-check.sh` | +| `red-verify-commit-check` | Blocks `git commit` when a new-at-HEAD test that now passes has no recorded failing (RED) run matching its current content. Enforces mechanical TDD: a new test must be observed failing before its passing version can land. Type-only tests (all assertions are type-level via `expectTypeOf`/`assertType`/`@ts-expect-error`, no runtime `expect`) are exempt: they have no runtime failure mode, so `tsc` enforces them instead. | `.claude/hooks/red-verify-commit-check.sh` | ## PostToolUse (Bash) | Hook | What it does | Path | |---|---|---| -| `capture-red-observations` | After a one-shot vitest run, re-runs the same scope with the JSON reporter and appends each genuinely-failing test to the RED-observation ledger at `.gaia/local/red-ledger/`. Observe-only: never blocks, always exits clean. Pairs with `red-verify-commit-check`. | `.claude/hooks/capture-red-observations.sh` | +| `capture-red-observations` | After a one-shot vitest run, re-runs the same scope with the JSON reporter and appends each genuinely-failing test to the RED-observation ledger at `.gaia/local/red-ledger/`. Observe-only: never blocks, always exits clean. Pairs with `red-verify-commit-check`. Trigger detection is anchored to command position: a `pnpm test --run` mention inside a commit message or `--body` string is not an invocation and does not fire a re-run. | `.claude/hooks/capture-red-observations.sh` | | `wiki-commit-nudge` | After every non-amend `git commit`, injects commit metadata and the current wiki drift count into context. | `.claude/hooks/wiki-commit-nudge.sh` | ## Agent-invoked (no event registration) diff --git a/src/content/docs/reference/rules.mdx b/src/content/docs/reference/rules.mdx index 012a89a..a829a40 100644 --- a/src/content/docs/reference/rules.mdx +++ b/src/content/docs/reference/rules.mdx @@ -13,7 +13,7 @@ A rule's frontmatter `paths:` field is the gate. When Claude edits or reads a fi | Rule | What it enforces | |---|---| -| `coding-guidelines` | File-naming conventions (PascalCase components, camelCase hooks, kebab-case other), simplicity-first, surgical changes, goal-driven execution, mandatory TDD, mandatory quality-gate run before commit. | +| `coding-guidelines` | File-naming conventions (PascalCase components, camelCase hooks, kebab-case other), simplicity-first (no speculative abstractions or impossible-scenario error handling; real failure modes such as non-zero exits, loop non-convergence, and network failures still get handled and bounded), surgical changes, goal-driven execution, mandatory TDD, mandatory quality-gate run before commit. | | `dep-audit` | Dependency-CVE advisory. The audit agent runs `pnpm audit --json` pre-merge and reports high and critical findings for the operator to decide on. Read-only: never blocks the audit marker or `gh pr merge`. Distinct from the blocking CI `pnpm audit` pass. | | `knip` | When and when not to run `pnpm knip`; the audit agent runs it pre-merge, manual runs only after a refactor, deletion, or dependency replacement. | | `pr-merge` | Before any `gh pr merge`, run the audit and marker handshake, then confirm the PR reports `MERGED` before any local branch cleanup; use `--auto` (not `--admin`) when branch protection blocks the merge. | diff --git a/src/content/docs/skills/code.mdx b/src/content/docs/skills/code.mdx index d6b132e..0d75df5 100644 --- a/src/content/docs/skills/code.mdx +++ b/src/content/docs/skills/code.mdx @@ -19,7 +19,7 @@ Source: `.claude/skills/typescript/SKILL.md`. ### React Code -Conventions for React components, pages, routes, hooks, and forms. Includes pre-flight gates for `useEffect`, `useCallback`, and `useState`; a form-element check that swaps native `` for the project's `Form/*` components; and a translation gate that requires every user-visible string to come from `t()`. Covers component extraction thresholds and route/page architecture (`app/routes/` thin shells, `app/pages/` for UI). +Conventions for React components, pages, routes, hooks, and forms. Includes pre-flight gates for `useEffect`, `useCallback`, and `useState`; a form-element check that swaps native `` for the project's `Form/*` components; and a translation gate that requires every user-visible string to come from `t()` (exception: approximate skeleton-loader placeholders standing in for dynamic runtime values stay hardcoded; static skeleton text that mirrors a real `t()` call must still use `t()`). Covers component extraction thresholds and route/page architecture (`app/routes/` thin shells, `app/pages/` for UI). **Triggers** when writing or reviewing React components, hooks (`useEffect`, `useCallback`, `useState`), event handlers, or component extraction decisions. Also triggers when debugging stale closures, infinite re-renders, or unnecessary re-renders caused by memoization issues. @@ -43,6 +43,10 @@ Source: `.claude/skills/tailwind/SKILL.md`. Build skeleton loading states that are pixel-perfect matches of the real content. Uses real HTML elements (`

`, ``, `

`, `