fix(parsers/zig): canonical call-graph API + correct call extraction and resolution#85

Open
gadievron wants to merge 2 commits into master from fix/parsers-zig-canonical-call-graph-api-correct-call

@gadievron
Collaborator

Six defects in parsers/zig/call_graph_builder.py:

  1. API parity: the zig builder exposed only build()->Dict + save_results(), diverging from the canonical
    CallGraphBuilder (c/php/python/ruby) which has build_call_graph()->None, export()->Dict,
    get_statistics()->Dict, get_dependencies()/get_callers(). Added those methods; build() is retained as
    a back-compat wrapper (build_call_graph()+export()) since the pipeline (zig/test_pipeline.py) consumes
    its return value.

  2. Statistics: get_statistics() now also reports avg_in_degree and max_in_degree (previously only the
    out-degree side was emitted).

  3. @call extraction: @call(.modifier, fn, args) parses as a builtin_function node, so the wrapped
    function fn was never recorded. It is now extracted as the call target (the @call builtin itself
    stays filtered).

  4. Method-call recall + safe resolution: a method call obj.method() parses as a call_expression
    over a field_expression, but at the base commit the extractor only handled the non-existent
    field_access node type, so method calls produced no edges. field_expression callees are now
    extracted. In addition, _resolve_call no longer returns all same-named candidates on a
    bare-name collision (which would link a.method() to every struct's method); when the call is
    genuinely ambiguous it returns nothing. Trade-off: lower recall for ambiguous bare-name calls,
    higher precision (no namespace leak).

  5. Import-file matching: step 2 tested imp in candidate_file, an unanchored substring match, so
    'util.zig' matched 'myutil.zig_x/...'. It now matches the import file name exactly (== or path-suffix).

  6. Stdlib import filter: @import("std")/("builtin")/("root") are not file imports; they are now
    skipped in resolution (previously they substring-matched candidate paths).
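
The resolution rules in items 4 to 6 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in call_graph_builder.py: the helper names `is_file_import`, `import_matches_file`, and `pick_candidate`, and the idea that candidates arrive as a list of file paths defining the called name, are all hypothetical.

```python
from pathlib import PurePosixPath
from typing import List, Optional

# Names accepted by @import that do not refer to a file on disk.
NON_FILE_IMPORTS = {"std", "builtin", "root"}


def is_file_import(imp: str) -> bool:
    """Return False for @import("std"), ("builtin"), and ("root")."""
    return imp not in NON_FILE_IMPORTS


def import_matches_file(imp: str, candidate_file: str) -> bool:
    """Match an import against a candidate path exactly or as a path suffix.

    A plain `imp in candidate_file` test accepts 'myutil.zig_x/...' for
    'util.zig'; comparing whole path components does not.
    """
    imp_parts = PurePosixPath(imp).parts
    cand_parts = PurePosixPath(candidate_file).parts
    if not imp_parts or len(imp_parts) > len(cand_parts):
        return False
    return cand_parts[-len(imp_parts):] == imp_parts


def pick_candidate(imports: List[str], candidates: List[str]) -> Optional[str]:
    """Return the single plausible defining file, or None when ambiguous.

    `candidates` is assumed to list every file that defines the called name.
    """
    if len(candidates) == 1:
        return candidates[0]
    file_imports = [imp for imp in imports if is_file_import(imp)]
    narrowed = [
        c for c in candidates
        if any(import_matches_file(imp, c) for imp in file_imports)
    ]
    # Conservative rule: never fan out to every same-named definition.
    return narrowed[0] if len(narrowed) == 1 else None
```

Returning `None` instead of every same-named candidate is what trades recall for precision: an ambiguous call contributes no edge rather than several wrong ones.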

Scope: zig-specific (the c/php/python/ruby builders are separate implementations and are the parity
baseline, not siblings to change). function_extractor.py is unchanged.
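
For readers unfamiliar with the parity baseline, the API surface from items 1 and 2 might look roughly like the sketch below. The class name `SketchCallGraphBuilder`, the constructor that takes ready-made edges, and every dictionary key other than `avg_in_degree` and `max_in_degree` are assumptions for illustration; the real builder derives its edges from parsed Zig source.

```python
from typing import Dict, List, Set, Tuple


class SketchCallGraphBuilder:
    """Hypothetical stand-in for the zig builder; not the real class."""

    def __init__(self, edges: List[Tuple[str, str]]) -> None:
        # The real builder extracts calls from parsed Zig source. Here the
        # (caller, callee) pairs are passed in directly to keep the sketch small.
        self._edges = edges
        self.call_graph: Dict[str, Set[str]] = {}
        self.reverse_graph: Dict[str, Set[str]] = {}

    def build_call_graph(self) -> None:
        """Populate the forward and reverse adjacency maps; returns nothing."""
        for caller, callee in self._edges:
            for node in (caller, callee):
                self.call_graph.setdefault(node, set())
                self.reverse_graph.setdefault(node, set())
            self.call_graph[caller].add(callee)
            self.reverse_graph[callee].add(caller)

    def get_dependencies(self, name: str) -> List[str]:
        """Functions that `name` calls."""
        return sorted(self.call_graph.get(name, set()))

    def get_callers(self, name: str) -> List[str]:
        """Functions that call `name`."""
        return sorted(self.reverse_graph.get(name, set()))

    def get_statistics(self) -> Dict[str, float]:
        out_degrees = [len(v) for v in self.call_graph.values()]
        in_degrees = [len(v) for v in self.reverse_graph.values()]
        n = len(out_degrees)
        return {
            "total_functions": n,
            "total_edges": sum(out_degrees),
            "avg_out_degree": sum(out_degrees) / n if n else 0.0,
            "max_out_degree": max(out_degrees, default=0),
            "avg_in_degree": sum(in_degrees) / n if n else 0.0,
            "max_in_degree": max(in_degrees, default=0),
        }

    def export(self) -> Dict[str, object]:
        return {
            "call_graph": {k: sorted(v) for k, v in self.call_graph.items()},
            "statistics": self.get_statistics(),
        }

    def build(self) -> Dict[str, object]:
        """Back-compat wrapper: build_call_graph() followed by export()."""
        self.build_call_graph()
        return self.export()
```

Because `build()` only chains the two canonical methods, callers that rely on its return value keep working while new code can use the canonical methods directly.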

Tests: tests/test_zig_call_graph_builder.py (new; the package had no tests, and the tree-sitter-zig
grammar must be installed to run them). Four tests: API parity + in-degree stats, @call extraction,
method-call extraction via field_expression, and exact-import / stdlib-filter / conservative resolution.
RED 4 failed (pre-fix) -> GREEN 4 passed; ruff clean; full suite 180 passed / 63 skipped.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…lean install

The zig parser modules import tree_sitter_zig at module top level, but the dependency was
absent from requirements.txt and pyproject.toml, so a clean install raised
ModuleNotFoundError: No module named 'tree_sitter_zig' and every zig test
errored on collection. This commit adds tree-sitter-zig, pinned to the version the zig tests
pass with.
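
A missing dependency of this kind can be detected before collection with a small standard-library check. The helper name `missing_modules` is hypothetical and is not part of this PR; it is a sketch of one way to surface the problem early.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # On a clean install without the grammar, this would list tree_sitter_zig.
    print(missing_modules(["tree_sitter_zig"]))
```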

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>