build(docs-gen): add simple extension name extractor#1157

Open: makhnatkin wants to merge 7 commits into main from codex/docs-gen-name-extractor-simple

makhnatkin (Collaborator) commented Jun 23, 2026

What changed

Adds a deliberately small docs-gen extractor that builds raw extension name records for future extension docs pages.

The key idea is that every configured .ts/.tsx source is parsed by TypeScript itself via ts.createSourceFile, and extension export names are detected by inspecting TypeScript AST nodes rather than by regex or string matching.

The output stays minimal:

{
  "extensions": [{"name": "Bold"}]
}

Implementation notes:

  • keeps paths, categories, extra source dirs, blacklist, and AST type names in one small extractor config;
  • scans configured source roots recursively instead of relying on uppercase directory names;
  • extracts TypeScript extension export names from each source file;
  • applies the blacklist to extracted export names, so multiple extensions in one file are filtered independently;
  • keeps filesystem/JSON orchestration in the CLI and TypeScript export detection in a focused AST helper.
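
The per-name blacklist step and the minimal output shape can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: `buildExtensionsManifest`, the file keys, and the blacklisted name `InternalListHelper` are hypothetical, and the sorting follows the reviewer's guide rather than anything stated in the description.

```javascript
// Hypothetical helper (not the PR's API): takes a map of
// source file -> exported extension names, and a Set of blacklisted names.
function buildExtensionsManifest(exportNamesByFile, blacklist) {
    const names = new Set();
    for (const exportNames of Object.values(exportNamesByFile)) {
        for (const name of exportNames) {
            // The blacklist is checked per export name, not per file,
            // so one file can contribute some names and drop others.
            if (!blacklist.has(name)) {
                names.add(name);
            }
        }
    }
    // A Set removes duplicates; sorting keeps the JSON output stable.
    return {extensions: [...names].sort().map((name) => ({name}))};
}

// Illustrative input only; these files and names are made up.
const output = buildExtensionsManifest(
    {
        'Bold/index.ts': ['Bold'],
        'Lists/index.ts': ['Lists', 'InternalListHelper'],
    },
    new Set(['InternalListHelper']),
);
console.log(JSON.stringify(output, null, 2));
```

Here `Lists/index.ts` still contributes `Lists` even though its second export is blacklisted, which is the "filtered independently" behavior described above.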

Validation

  • eslint on the docs-gen extractor files
  • prettier --check on touched docs-gen files
  • smoke-checked the extractor output against the repository tree: 100 extension export names
  • docs-gen build: generated 113 pages, including 100 extension-reference pages

Full project tests were not run locally because AGENTS.md requires Docker/Podman for test runs.

Summary by Sourcery

Add a docs-gen CLI to extract public editor extension names into a JSON file for future documentation generation.

New Features:

  • Introduce a TypeScript-based extractor that scans extension sources and outputs extension name records as JSON.
  • Add a docs-gen CLI entrypoint and script for running the extension data extraction.

Enhancements:

  • Add unit tests validating extension discovery, blacklist handling, and JSON output structure for the extractor.
  • Extend docs-gen package scripts and dependencies to support running the new extractor and tests.

makhnatkin requested a review from d3m1d0v as a code owner on June 23, 2026 07:16

sourcery-ai Bot commented Jun 23, 2026

Reviewer's Guide

Adds a new docs-gen CLI extractor that walks extension directories, uses the TypeScript AST to detect extension-like exports, filters them (including a blacklist and extra references), and writes a minimal extensions.json file, plus corresponding tests and npm scripts/wiring.

File-Level Changes

Change: Introduce a standalone extension metadata extractor CLI that scans extension directories, detects real extensions via TypeScript AST analysis, filters them, and writes a minimal JSON manifest.
Details:
  • Define repository and docs-gen paths, extension category lists, extra extension references, and a blacklist of non-public or non-extension entries
  • Implement directory scanning that only treats PascalCase subdirectories as extension candidates and recursively reads .ts/.tsx source files
  • Use the TypeScript compiler API to parse source, collect exported variable declarations, and detect extension exports by type annotations, initializer shapes, or Object.assign wrappers around known extensions
  • Compose extension name resolution logic that merges discovered extension refs with extra refs, filters via blacklist, and returns a sorted list of valid extension names
  • Generate extension records from names and write them as extensions.json into tmp/docs-gen, with a CLI-style main() entry and shebang for direct execution
Files: infra/docs-gen/src/extract-extension-data.mjs

Change: Add focused tests for the extractor’s name resolution and JSON output behavior using node:test and a temporary filesystem layout.
Details:
  • Create temp extension directory trees with different categories and extension-like/non-extension-like source files to validate AST-based filtering and blacklist behavior
  • Verify that listExtensionNames returns only expected extension names including extra refs and excluding blacklisted or non-extension exports
  • Verify that writeExtensionsJson writes the expected extensions.json structure using createExtensionRecords as the source of record formatting
  • Ensure temporary directories are cleaned up after each test run
Files: infra/docs-gen/src/extract-extension-data.test.mjs

Change: Wire the new extractor into the docs-gen package with dedicated scripts and dependencies.
Details:
  • Add an npm script to run the extension extractor and a test script that runs node --test over src/*.test.mjs
  • Add a TypeScript dependency (via catalog:ts) required by the extractor’s AST usage
  • Update pnpm-lock.yaml to capture the new dependency graph
Files: infra/docs-gen/package.json, pnpm-lock.yaml
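
The temporary-tree test setup and the recursive .ts/.tsx scan mentioned in the guide might look something like the sketch below. It uses only Node built-ins. `withTempTree` and `listSourceFiles` are hypothetical names chosen for illustration, and the fixture paths are made up; the PR's own helpers may be structured differently.

```javascript
import {existsSync, mkdirSync, mkdtempSync, readdirSync, rmSync, writeFileSync} from 'node:fs';
import {tmpdir} from 'node:os';
import {dirname, extname, join, relative, sep} from 'node:path';

// Hypothetical fixture helper: creates a temp directory, writes the given
// files into it, runs the callback, and always removes the directory.
function withTempTree(files, run) {
    const root = mkdtempSync(join(tmpdir(), 'docs-gen-'));
    try {
        for (const [relativePath, content] of Object.entries(files)) {
            const target = join(root, relativePath);
            mkdirSync(dirname(target), {recursive: true});
            writeFileSync(target, content);
        }
        return run(root);
    } finally {
        // Cleanup happens even if the callback throws.
        rmSync(root, {recursive: true, force: true});
    }
}

// Hypothetical scanner: walks a directory recursively and returns
// the paths of all .ts and .tsx files below it.
function listSourceFiles(dir) {
    return readdirSync(dir, {withFileTypes: true}).flatMap((entry) => {
        const entryPath = join(dir, entry.name);
        if (entry.isDirectory()) {
            return listSourceFiles(entryPath);
        }
        return ['.ts', '.tsx'].includes(extname(entry.name)) ? [entryPath] : [];
    });
}

let tempRoot;
const found = withTempTree(
    {
        'markdown/Bold/index.ts': '',
        'markdown/Bold/view.tsx': '',
        'markdown/Bold/README.md': '',
    },
    (root) => {
        tempRoot = root;
        // Normalize to forward slashes so the result is comparable across platforms.
        return listSourceFiles(root)
            .map((file) => relative(root, file).split(sep).join('/'))
            .sort();
    },
);
console.log(found);
console.log('temp directory still exists:', existsSync(tempRoot));
```

Putting the cleanup in `finally` means a failing assertion inside the callback does not leave directories behind, which matches the guide's note that temporary directories are removed after each test run.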

Possibly linked issues

  • #: PR delivers the first deterministic extractor (extension names), directly advancing the requested extension docs pipeline.


sourcery-ai Bot left a comment

Hey - I've found 1 issue

Prompt for AI Agents
Please address the comments from this code review:

## Individual Comments

### Comment 1
<location path="infra/docs-gen/src/extract-extension-data.mjs" line_range="201-41" />
<code_context>
+    writeExtensionsJson();
+}
+
+if (process.argv[1] && fileURLToPath(import.meta.url) === process.argv[1]) {
+    main();
+}
</code_context>
<issue_to_address>
**issue (bug_risk):** The CLI entry-point check may fail depending on how the script is invoked (relative vs absolute path, Windows path differences).

Directly comparing `fileURLToPath(import.meta.url)` with `process.argv[1]` is brittle because `argv[1]` may be relative while `fileURLToPath` is absolute, and path string formats differ across platforms. Consider normalizing both sides, e.g. `if (process.argv[1] && import.meta.url === pathToFileURL(process.argv[1]).href)` or comparing `resolve(process.argv[1])` to `fileURLToPath(import.meta.url)` so `main()` reliably runs when the script is invoked via `node path/to/extract-extension-data.mjs`.
</issue_to_address>
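
One way to apply the first suggestion above is to convert `process.argv[1]` to a file URL and compare URLs. The sketch below is an illustration only: `isMainModule` is a hypothetical helper name, and the sketch does not handle symlinked script paths.

```javascript
import {pathToFileURL} from 'node:url';

// Hypothetical helper: returns true when the module identified by `moduleUrl`
// is the script that Node was started with.
function isMainModule(moduleUrl, scriptPath = process.argv[1]) {
    if (!scriptPath) {
        return false;
    }
    // pathToFileURL resolves relative paths against the current working
    // directory and applies platform-specific path-to-URL conversion,
    // so both sides of the comparison are absolute file URLs.
    return moduleUrl === pathToFileURL(scriptPath).href;
}
```

In the extractor, the guard would then read `if (isMainModule(import.meta.url)) main();`, assuming `main` is the CLI entry function shown in the code context.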


firstChar === firstChar.toUpperCase() &&
firstChar !== firstChar.toLowerCase()
);
}


gravity-ui Bot commented Jun 23, 2026

Storybook Deployed

gravity-ui Bot commented Jun 23, 2026

🎭 Playwright Report

makhnatkin force-pushed the codex/docs-gen-name-extractor-simple branch from 52eac20 to 6de3079 on June 23, 2026 07:55
makhnatkin force-pushed the codex/docs-gen-name-extractor-simple branch from 6de3079 to abe5aee on June 23, 2026 08:03