Update dependency tree-sitter-language-pack to v1 #34
Open
renovate[bot] wants to merge 1 commit into
This PR contains the following updates:
==0.13.0 → ==1.10.1
Release Notes
kreuzberg-dev/tree-sitter-language-pack (tree-sitter-language-pack)
v1.10.1
Compare Source
Fixed
EXCEPTION_ACCESS_VIOLATION) when traversing a parsed tree via opaque handles (#146). Parser.parse, Tree.walk, Tree.rootNode, Node.parent/child, and TreeCursor.node freed the returned native handle in a finally block immediately after wrapping it, so the returned Tree/Node/TreeCursor referenced already-freed memory and the next native call dereferenced it and crashed. The wrapper now owns the handle and frees it once on close(). Fixed in the alef Java backend and picked up by the 0.25.55 regen; value/DTO returns (byteRange/startPosition/process) still correctly free the FFI temporary after reading it.
Changed
packages/kotlin-android, add --kotlinlang-style to ktfmt, switch alef.toml kotlin format/check from gradle-ktlintFormat to ktfmt so alef and prek agree, and exclude the vendored Gradle wrapper from shellcheck. detekt remains for static analysis. (.pre-commit-config.yaml, alef.toml)
Added
Language passthrough across the C-ABI binding family (#143). get_language now returns each ecosystem's native tree-sitter Language instead of an opaque alef handle, so the result drops straight into the host runtime's parser: Go (*tree_sitter.Language via go-tree-sitter), Zig (?*const tree_sitter.Language via zig-tree-sitter), Java (jtreesitter.Language), C# (TreeSitter.Language), Kotlin Android (ktreesitter.Language), and Swift (SwiftTreeSitter.Language), joining the existing Python and Node passthrough. Each binding gained a dependency on its host tree-sitter runtime, injected into the generated manifest. Configured via [crates.*.capsule_types.Language] in alef.toml; regenerated against alef 0.25.55.
Changed
tree-sitter dependency so the capsule shim can name tree_sitter::ffi::TSLanguage (the pointee it casts value.into_raw() to), and the zig build.zig.zon carries the resolved zig-tree-sitter content hash.
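The use-after-free described in the v1.10.1 Fixed entry above comes down to who owns the native handle. The following is a minimal sketch of that ownership rule in Python, not the project's Java code: `FakeNative`, `NodeWrapper`, `wrap_buggy`, and `wrap_fixed` are hypothetical names, and the "native library" is simulated with an in-memory set so the failure shows up as an exception rather than a crash.

```python
class FakeNative:
    """Hypothetical stand-in for a native library; tracks live handles."""

    def __init__(self):
        self._live = set()
        self._next_id = 1

    def alloc(self):
        handle = self._next_id
        self._next_id += 1
        self._live.add(handle)
        return handle

    def free(self, handle):
        if handle not in self._live:
            raise RuntimeError("double free")
        self._live.discard(handle)

    def use(self, handle):
        # A real native call would dereference freed memory here and crash;
        # the simulation raises instead.
        if handle not in self._live:
            raise RuntimeError("use after free")
        return f"node@{handle}"

    def live_count(self):
        return len(self._live)


class NodeWrapper:
    """Owns one native handle and frees it exactly once, on close()."""

    def __init__(self, native, handle):
        self._native = native
        self._handle = handle

    def describe(self):
        if self._handle is None:
            raise ValueError("wrapper is closed")
        return self._native.use(self._handle)

    def close(self):
        # Idempotent: a second close() is a no-op, never a double free.
        if self._handle is not None:
            self._native.free(self._handle)
            self._handle = None

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        self.close()


def wrap_buggy(native):
    """The pattern from the bug report: free in finally right after wrapping."""
    handle = native.alloc()
    try:
        return NodeWrapper(native, handle)
    finally:
        native.free(handle)  # the returned wrapper now points at freed memory


def wrap_fixed(native):
    """Ownership moves into the wrapper; nothing is freed until close()."""
    return NodeWrapper(native, native.alloc())
```

Calling `describe()` on the result of `wrap_buggy` fails immediately, while a wrapper from `wrap_fixed` stays usable until its `close()` (or the end of a `with` block) releases the handle. Value-style returns that copy data out of a temporary can still free that temporary right away, which matches the distinction the release note draws for byteRange/startPosition/process.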
v1.9.1
Compare Source
Fixed
getLanguage(name:) function in the TreeSitterLanguagePack module. An alef 0.25.38 codegen regression added opaque types to the Swift forwarder exclusion set, dropping get_language (the only free function returning the opaque Language type) from the generated public API in v1.9.0. Regenerated against alef 0.25.43.
v1.9.0
Compare Source
Changed
alef pin 0.25.28 → 0.25.38 and regenerated all bindings. Picks up alef 0.25.29–0.25.38: enum associated (static factory) methods surfaced across backends, the swift opaque no-op shim so $_free is synthesised for handle types with no visible methods (e.g. Language), swift streaming-owner already_declared re-declaration, java marshal_optional_bytes template registration, and java @Nullable type-use placement on qualified types.
task upgrade across every language workspace; lock files regenerated and committed.
packages/kotlin-android/.gradle/ cache and the .basemind/ index (untracking the accidentally-committed Gradle cache files), and exclude the deterministic .ai-rulez/.generated-manifest.json from the oxfmt pre-commit hook so it no longer fights the ai-rulez-generate hook.
Fixed
UnsupportedOperationException stubs for Self-returning DTO/enum methods. There is no JNI/FFM symbol for DTO methods yet, so the throwing stubs compiled but misled callers and broke any path that reached them. The Java backend now skips these methods until marshaling lands.
true default for boxed @Nullable Boolean #[serde(default)] record fields. A non-optional #[serde(default)] bool = true field is boxed to @Nullable Boolean, so JSON that omitted it deserialised to null and the accessor returned null instead of true.
v1.8.1
Compare Source
Added
dart pub add tree_sitter_language_pack. Built with flutter_rust_bridge for isolate-safe Future APIs.
dev.kreuzberg.tslp:tslp-android AAR on Maven Central. JNI-based with per-ABI native libraries (arm64-v8a, armeabi-v7a, x86_64, x86). JVM Kotlin users continue to consume the canonical Java / Panama-FFM package.
TreeSitterLanguagePack via SwiftPM. swift-bridge for macOS, iOS, and Linux.
zig fetch --save <tarball-url> from GitHub Releases. Direct C FFI via @cImport.
tree-sitter-language-pack-dart (FRB bridge) and tree-sitter-language-pack-swift (swift-bridge).
crates/ts-pack-core-jni Rust crate exporting Java_... JNI symbols for the Kotlin-Android binding (excluded from the default workspace build because it cross-compiles via cargo ndk).
ci-zig.yaml, ci-swift.yaml, ci-dart.yaml, plus a combined ci-mobile.yaml covering Android cross-compile + iOS cargo check.
publish-pub), Swift Package Index (publish-swift), Zig (publish-zig → GitHub Release tarball), and Maven Central kotlin-android (publish-kotlin-android).
Fixed
DOWNLOAD_CACHE_LOCK in crates/ts-pack-core/src/lib.rs was a Mutex<()> (intra-process only), so multi-worker servers (gunicorn / Puma / Node cluster), fan-out build pipelines (make -j8, parallel test runners), and the zig e2e suite (zig build test spawns eight test binaries in parallel) all raced on the same ~/.cache/tree-sitter-language-pack/v{version}/ directory. Partial entry.unpack writes were observable to other workers' libloading::open, producing intermittent LanguageNotFound / segfaults on first request for an uncached language; N processes could also each redundantly pull the 50MB platform bundle. Cache writes are now atomic (write to <dest_dir>/.<name>.tmp.<pid>.<seq> then fs::rename; readers see old, new, or nothing, never partial) and the bundle-fetch / extract / clean critical section is serialized across processes with an exclusive fd-lock on <version_cache_dir>/.download.lock. Double-checked locking preserves the lock-free hot path: steady-state is_cached lookups never pay the OS file-lock cost. New Error::CacheLock(String) variant surfaces lock-acquisition failures cleanly. Affects every binding (Python, Node.js, Ruby, PHP, Go, Java, C#, Elixir, WASM, Dart, Swift, Zig, Kotlin-Android) because the fix lives entirely in the shared ts-pack-core Rust crate. New fd-lock = "4" dependency (gated under the download feature). Cross-process safety relies on flock semantics, which are unreliable on NFS; users with XDG_CACHE_HOME on NFS should use a local-FS cache or serialize at the application layer. (crates/ts-pack-core/src/{download.rs,error.rs}, crates/ts-pack-core/Cargo.toml, Cargo.toml, new crates/ts-pack-core/tests/concurrent_download.rs)
65f1a129). Declared [crates.zig].languages = [<curated 18-grammar list>] mirroring the TSLP_LANGUAGES value in [crates.test.zig].before. Alef's new Zig codegen filter consults both input.language and input.config.language and drops fixtures whose target grammar is not in the list (mirroring the WASM f9e0ff50 pattern). Eliminates smoke_bibtex and every other non-static-set test that previously failed at parser-load time. Also reverts the per-fixture skip: { languages: ["zig"] } workaround on fixtures/smoke/actionscript.json since the auto-omit subsumes it. (alef.toml, fixtures/smoke/actionscript.json)
process contains assertions on Vec<DTO> fields aggregate every stringy accessor (regen on alef 857c55d1). testProcessPythonImportsDetail and testProcessRustStructureName previously failed because the codegen relied on result_field_accessor naming a single "primary" accessor per array field (imports → source, structure → kind), which misses values surfaced on sibling fields: "os" against ImportInfo.items, "MyConfig" against StructureItem.name rather than StructureKind. The regenerated tests now emit a contains(where: { item in … }) closure that gathers every text-bearing accessor (String, Option, Vec, serde-enum) into a [String] and substring-matches the expected value, mirroring python's _alef_e2e_item_texts. Swift e2e: 411 tests, 0 failures. (e2e/swift_e2e/Tests/TreeSitterLanguagePackE2ETests/ProcessTests.swift)
natives/native/ (#128). The re-stage loop in build-maven-package walked one dirname too far when extracting the classifier from each lib's path, so all six platform libs landed at natives/native/{lib} instead of natives/{classifier}/{lib}. The Maven Central JAR shipped in v1.8.1 contained only three files (one per .so / .dylib / .dll extension) and TreeSitterLanguagePack.getParser("…") failed with UnsatisfiedLinkError: Expected resource: /natives/windows-x86_64/ts_pack_core_ffi.dll. Fixed the path-walk depth, and hardened both build-side and deploy-side verification steps to require every linux-x86_64 / linux-arm64 / macos-arm64 / macos-x86_64 / windows-x86_64 / windows-arm64 classifier directory is present in the staged JAR so the regression cannot ship again. Additionally corrected the Windows-ARM classifier from windows-aarch64 to windows-arm64: the Java loader (NativeLib.resolveNativesRid) normalizes every ARM architecture to arm64 and resolves to natives/windows-arm64/, so a JAR staged under windows-aarch64 would still UnsatisfiedLinkError on Windows ARM64; the publish matrix and both verification steps now use windows-arm64, consistent with the linux-arm64 / macos-arm64 classifiers and the loader. (.github/workflows/publish.yaml)
[crates.test.wasm].before previously ran wasm-pack build with no TSLP_LANGUAGES set, which triggered a full 305-grammar static build; the 97MB abl/parser.c alone hangs clang at -O2 for tens of minutes. Mirrored the publish-wasm CI environment locally: TSLP_LINK_MODE=static TSLP_LANGUAGES=<curated 31-grammar list> CARGO_PROFILE_RELEASE_LTO=false CARGO_PROFILE_RELEASE_CODEGEN_UNITS=16. Also declared [crates.wasm].languages = [<same list>] so alef's wasm e2e auto-skip path correctly elides 268 of the 302 smoke tests for grammars not in the bundle (with the matching alef f23ae5d3 / f9e0ff50 fixes that teach the wasm filter to look up both input.language and input.config.language). (alef.toml)
4f6a9056 csharp List emission for mock_url_list; 06caa440 go os import include guard for mock_url_list; 1fde7aae PHP deterministic accessor extraction order (HashMap → BTreeMap; resolves the recurring $imports / $structure flip in e2e/php/tests/ProcessTest.php); 13717e24 swift e2e: trailing () on scalar accessors that bridge through opaque structs, drop spurious ?.map ... ?? [] on non-optional RustVec accessors, and camelCase swift-bridge method names (e.g. asStr() not as_str()); plus the wasm input.config.language filter follow-up cited above. (e2e/php/**, e2e/swift_e2e/**, e2e/wasm/**, e2e/zig/**)
crates/ts-pack-core-node/package.json#napi.targets already listed x86_64-apple-darwin, but the build-node-native matrix in .github/workflows/publish.yaml omitted the macos-15-intel runner, so v1.8.0 / v1.8.1 npm tarballs shipped without ts-pack-core-node.darwin-x64.node, breaking require('@kreuzberg/tree-sitter-language-pack') on Intel Macs. Added a macos-15-intel / darwin-x64 / x86_64-apple-darwin row to the matrix, mirroring the parity already present in the Python/Ruby/Java/Go publish matrices. The next published version (≥1.8.2) will include the darwin-x64 binary. (.github/workflows/publish.yaml)
fix(alef-e2e/rust): unwrap Option<scalar> leaf fields in numeric comparison assertions (the three greater_than / less_than / less_than_or_equal operators no longer fail to compile when the leaf field is Option<T>), fix(alef-e2e/rust): use serde_json::from_str instead of json! macro for fixture json_object args (sidesteps the macro recursion-limit on fixtures with large JSON payloads), fix(alef-backend-php): emit Box::default() instead of Box::new(Default::default()) for boxed fallback fields (resolves clippy::box-default -D warnings on the PHP umbrella crate), and feat(alef-core,alef-e2e/wasm,alef-e2e/typescript): auto-skip wasm fixtures outside the static-compiled language set (foundational for tslp's curated wasm32 builds; no-op for now since [crates.wasm].languages is empty, but unlocks the future curated-build flow). Side effects in this regen: a few Rust e2e fixture bodies re-formatted, e2e/c/main.c cosmetic update, and packages/swift/rust/Cargo.toml deps re-ordered. (alef.toml, e2e/{c,php,rust}/**, packages/swift/rust/Cargo.toml)
shellcheck SC2129 flagged four consecutive echo … >> "$GITHUB_ENV" lines in the Set library paths for .NET step; consolidated into a single grouped { … } >> "$GITHUB_ENV" block to keep actionlint clean on the workflow. (.github/workflows/ci-e2e.yaml)
alef_version in alef.toml and the alef pre-commit-hook rev. Lands the Phase-5 leakage-sanitizer chain plus follow-up codegen fixes: v0.17.4 csharp/elixir/kotlin/swift codegen-consumer unblocks; v0.17.5 NAPI/PHP/Java docstring sanitizer wiring; v0.17.7 sanitizer recognises rustdoc test-attribute fences (```no_run, ```ignore, ```should_panic, ```compile_fail, ```edition*) as Rust code (so their bodies are dropped for foreign-language targets); v0.17.8/v0.17.9 csharp U1-bool P/Invoke call-site fix; v0.17.10 Swift free-function forwarder fixes: Option<String> returns now use ?.toString() and host DTO args flow through .intoRust() before the bridge call, so detectLanguageFromExtension/Path/Content, the *Query getters, and process(_:config:) compile and execute against the high-level Swift API. Downstream surface: 61 Rust-code-block leaks in crates/ts-pack-core-node/index.d.ts and 20+ in crates/ts-pack-core-php/src/lib.rs collapse to 0 after this regen.
chunks undefined. e2e/rust/tests/process_test.rs four test_*_chunking_* cases were emitting assert!(chunks.len() >= 2 as usize, ...) where chunks was undeclared (E0425). Same class of bug as the PHP $chunks fix; alef's Rust e2e codegen unconditionally fired the streaming-virtual-field assertion arm for chunks / imports / structure even for non-streaming fixtures. Fix pulled in via alef a32ca2a0 fix(rust-gen): bind fields_array accessor before len() assertion in e2e tests; non-streaming fixtures with a colliding fields_array field now emit a leading let {field} = &{result}.{field}; binding.
e2e/node tree-sitter dev-dep restored (recurring). alef generate strips tree-sitter@^0.25.0 from e2e/node/package.json on every regen, but tests/capsule_passthrough.test.ts imports it to verify FFI capsule type-tag pass-through between our Language object and the upstream tree-sitter Node native module. Hand-restored, alongside the corresponding pnpm-lock.yaml rows.
2eaa260a fix(swift): hide RustVec/RustString/intoRust from public API; convert at forwarder boundaries plus a handful of smaller adapter fixes (fix(alef-backend-pyo3), fix(alef-backend-napi,wasm), fix(alef-backend-ffi) clippy). Public Swift API surface no longer leaks RustVec / RustString / intoRust(); conversion happens at forwarder boundaries inside generated extensions.
pnpm-lock.yaml to drop the stale e2e/node → tree-sitter@^0.25.0 devDependency that broke pnpm install --frozen-lockfile in CI Validate. Regenerated the docs/reference/api-*.md set so committed output matches alef docs (compact Markdown tables) and alef verify stays green on main.
fix(swift): enum intoRust(), Ref→owned init, Vec<RustString> elem type (Swift CI was failing on CommentKind.intoRust(), RustStringRef.toString(), RustVec<String> not conforming to Vectorizable); fix(php-gen): bind fields_array accessor before count() assertion in e2e tests (PHP e2e test_*_chunking_* cases were referencing undefined $chunks); fix(alef-backend-go): null-check and box Option<String> returns instead of dereferencing (generated packages/go/binding.go was returning C.GoString(ptr) where the signature expected *string, breaking golangci-lint and govulncheck). Side-effect: API docstrings now elide Rust-style [Type]-link syntax (e.g. PHP Node.php doc comments now read A single syntax node within a 'Tree' instead of A single syntax node within a [Tree]).
tree-sitter-yuck produces RuntimeError: unreachable when parsing under wasm32 (same class of bug as zig/ziggy, which already skip on wasm). fixtures/smoke/yuck.json now carries skip: { languages: ["wasm"] }; alef e2e generate removed the corresponding test from e2e/wasm/tests/smoke.test.ts. Native bindings remain unaffected.
package.json pnpm-field cleanup. Removed the now-ignored pnpm.onlyBuiltDependencies block from the root package.json. pnpm 11 reads that setting from pnpm-workspace.yaml (which already declares the same allowlist); the duplicate field made pnpm emit a warning on every install.
github.com/kreuzberg-dev/tree-sitter-language-pack/releases/... previously used ureq 3.x's default rustls agent, which trusts only the bundled Mozilla webpki roots and ignores the platform store. On Linux/WSL2 hosts where GitHub HTTPS traffic is presented with a chain rooted in a locally trusted (corp / private) CA (and where curl, pip, and git all succeed against the same URL via the OS trust store), first-use parser downloads failed with DownloadError: ... io: invalid peer certificate: UnknownIssuer. The downloader now constructs a configured ureq::Agent with RootCerts::PlatformVerifier by default (via rustls-platform-verifier), matching the behaviour of every other host-trust-aware HTTP client on the system. Set TREE_SITTER_LANGUAGE_PACK_TLS_ROOTS=webpki to opt back into ureq's bundled Mozilla roots; set TREE_SITTER_LANGUAGE_PACK_TLS_ROOTS=platform to make the default explicit. Affects every binding (Python, Node.js, Ruby, PHP, Go, Java, C#, Elixir, WASM, Dart, Swift, Zig, Kotlin-Android) because the fix lives entirely in the shared ts-pack-core Rust crate. (crates/ts-pack-core/src/{download.rs,pack_config.rs}, workspace Cargo.toml)
Removed
wolfram grammar dropped from the language pack. tree-sitter-wolfram produces glibc heap corruption (free(): invalid next size) when parsing trivial input under serial test execution on Linux; macOS allocator silently tolerated the corruption. The entire upstream ecosystem is unmaintained (canonical bostick/tree-sitter-wolfram last touched 2021-11-11 with 3 stars; every known fork (LumaKernel, LoganAMorrison, JuanG970, jakassebaum) ships the same LANGUAGE_VERSION 13 parser tables and is inactive). Rather than fork-and-maintain a Wolfram grammar in-house for marginal demand, the entry is removed from language_definitions.json, all CI TSLP_LANGUAGES lists, the smoke fixture, the e2e harness, the docs, and the README ecosystem listings. Total supported grammar count drops from 306 to 305, which matches the long-standing "305 languages" marketing copy (previously off-by-one due to the broken wolfram entry).
Changed
publish-pubdev.yaml workflow triggered by push: tags: v*. pub.dev OIDC trusted publishing rejects tokens from release events; only push and workflow_dispatch events are accepted. The new workflow produces an accepted token. One-time setup required: configure pub.dev → tree_sitter_language_pack package → Admin → Automated publishing with workflow path .github/workflows/publish-pubdev.yaml. (.github/workflows/publish-pubdev.yaml, .github/workflows/publish.yaml)
_supported_languages.py to reflect the 305-grammar count.
scripts/generate_grammar_table.py default output path corrected from docs/supported-languages.md to the canonical nav-referenced docs/languages.md; Taskfile docs:generate:languages generates: field updated to match.
v1.8.0
Added
dart pub add tree_sitter_language_pack. Built with flutter_rust_bridge for isolate-safe Future APIs.dev.kreuzberg.tslp:tslp-androidAAR on Maven Central. JNI-based with per-ABI native libraries (arm64-v8a, armeabi-v7a, x86_64, x86). JVM Kotlin users continue to consume the canonical Java / Panama-FFM package.TreeSitterLanguagePackvia SwiftPM. swift-bridge for macOS, iOS, and Linux.zig fetch --save <tarball-url>from GitHub Releases. Direct C FFI via@cImport.tree-sitter-language-pack-dart(FRB bridge) andtree-sitter-language-pack-swift(swift-bridge).crates/ts-pack-core-jniRust crate exportingJava_...JNI symbols for the Kotlin-Android binding (excluded from the default workspace build because it cross-compiles viacargo ndk).ci-zig.yaml,ci-swift.yaml,ci-dart.yaml, plus a combinedci-mobile.yamlcovering Android cross-compile + iOS cargo check.publish-pub), Swift Package Index (publish-swift), Zig (publish-zig→ GitHub Release tarball), and Maven Central kotlin-android (publish-kotlin-android).Fixed
DOWNLOAD_CACHE_LOCKincrates/ts-pack-core/src/lib.rswas aMutex<()>— intra-process only — so multi-worker servers (gunicorn / Puma / Node cluster), fan-out build pipelines (make -j8, parallel test runners), and the zig e2e suite (zig build testspawns eight test binaries in parallel) all raced on the same~/.cache/tree-sitter-language-pack/v{version}/directory. Partialentry.unpackwrites were observable to other workers'libloading::open, producing intermittentLanguageNotFound/ segfaults on first request for an uncached language; N processes could also each redundantly pull the 50MB platform bundle. Cache writes are now atomic (write to<dest_dir>/.<name>.tmp.<pid>.<seq>thenfs::rename— readers see old, new, or nothing, never partial) and the bundle-fetch / extract / clean critical section is serialized across processes with an exclusivefd-lockon<version_cache_dir>/.download.lock. Double-checked locking preserves the lock-free hot path: steady-stateis_cachedlookups never pay the OS file-lock cost. NewError::CacheLock(String)variant surfaces lock-acquisition failures cleanly. Affects every binding (Python, Node.js, Ruby, PHP, Go, Java, C#, Elixir, WASM, Dart, Swift, Zig, Kotlin-Android) because the fix lives entirely in the sharedts-pack-coreRust crate. Newfd-lock = "4"dependency (gated under thedownloadfeature). Cross-process safety relies onflocksemantics, which are unreliable on NFS — users withXDG_CACHE_HOMEon NFS should use a local-FS cache or serialize at the application layer. (crates/ts-pack-core/src/{download.rs,error.rs},crates/ts-pack-core/Cargo.toml,Cargo.toml, newcrates/ts-pack-core/tests/concurrent_download.rs)65f1a129). Declared[crates.zig].languages = [<curated 18-grammar list>]mirroring theTSLP_LANGUAGESvalue in[crates.test.zig].before. Alef's new Zig codegen filter consults bothinput.languageandinput.config.languageand drops fixtures whose target grammar is not in the list (mirroring the WASMf9e0ff50pattern). 
Eliminatessmoke_bibtexand every other non-static-set test that previously failed at parser-load time. Also reverts the per-fixtureskip: { languages: ["zig"] }workaround onfixtures/smoke/actionscript.jsonsince the auto-omit subsumes it. (alef.toml,fixtures/smoke/actionscript.json)processcontainsassertions onVec<DTO>fields aggregate every stringy accessor (regen on alef857c55d1).testProcessPythonImportsDetailandtestProcessRustStructureNamepreviously failed because the codegen relied onresult_field_accessornaming a single "primary" accessor per array field (imports → source,structure → kind), which misses values surfaced on sibling fields —"os"againstImportInfo.items,"MyConfig"againstStructureItem.namerather thanStructureKind. The regenerated tests now emit acontains(where: { item in … })closure that gathers every text-bearing accessor (String, Option, Vec, serde-enum) into a[String]and substring-matches the expected value, mirroring python's_alef_e2e_item_texts. Swift e2e: 411 tests, 0 failures. (e2e/swift_e2e/Tests/TreeSitterLanguagePackE2ETests/ProcessTests.swift)natives/native/(#128). The re-stage loop inbuild-maven-packagewalked onedirnametoo far when extracting the classifier from each lib's path, so all six platform libs landed atnatives/native/{lib}instead ofnatives/{classifier}/{lib}. The Maven Central JAR shipped in v1.8.1 contained only three files (one per.so/.dylib/.dllextension) andTreeSitterLanguagePack.getParser("…")failed withUnsatisfiedLinkError: Expected resource: /natives/windows-x86_64/ts_pack_core_ffi.dll. Fixed the path-walk depth, and hardened both build-side and deploy-side verification steps to require everylinux-x86_64 / linux-arm64 / macos-arm64 / macos-x86_64 / windows-x86_64 / windows-arm64classifier directory is present in the staged JAR so the regression cannot ship again. 
Additionally corrected the Windows-ARM classifier fromwindows-aarch64towindows-arm64: the Java loader (NativeLib.resolveNativesRid) normalizes every ARM architecture toarm64and resolves tonatives/windows-arm64/, so a JAR staged underwindows-aarch64would stillUnsatisfiedLinkErroron Windows ARM64 — the publish matrix and both verification steps now usewindows-arm64, consistent with thelinux-arm64/macos-arm64classifiers and the loader. (.github/workflows/publish.yaml)[crates.test.wasm].beforepreviously ranwasm-pack buildwith noTSLP_LANGUAGESset, which triggered a full 305-grammar static build — the 97MBabl/parser.calone hangs clang at -O2 for tens of minutes. Mirrored the publish-wasm CI environment locally:TSLP_LINK_MODE=static TSLP_LANGUAGES=<curated 31-grammar list> CARGO_PROFILE_RELEASE_LTO=false CARGO_PROFILE_RELEASE_CODEGEN_UNITS=16. Also declared[crates.wasm].languages = [<same list>]so alef's wasm e2e auto-skip path correctly elides 268 of the 302 smoke tests for grammars not in the bundle (with the matching aleff23ae5d3/f9e0ff50fixes that teach the wasm filter to look up bothinput.languageandinput.config.language). (alef.toml)4f6a9056csharp List emission formock_url_list;06caa440goosimport include guard formock_url_list;1fde7aaePHP deterministic accessor extraction order (HashMap→BTreeMap; resolves the recurring$imports/$structureflip ine2e/php/tests/ProcessTest.php);13717e24swift e2e — trailing()on scalar accessors that bridge through opaque structs, drop spurious?.map ... ?? []on non-optionalRustVecaccessors, and camelCase swift-bridge method names (e.g.asStr()notas_str()); plus the wasminput.config.languagefilter follow-up cited above. 
(e2e/php/**,e2e/swift_e2e/**,e2e/wasm/**,e2e/zig/**)crates/ts-pack-core-node/package.json#napi.targetsalready listedx86_64-apple-darwin, but thebuild-node-nativematrix in.github/workflows/publish.yamlomitted themacos-15-intelrunner — so v1.8.0 / v1.8.1 npm tarballs shipped withoutts-pack-core-node.darwin-x64.node, breakingrequire('@​kreuzberg/tree-sitter-language-pack')on Intel Macs. Added amacos-15-intel/darwin-x64/x86_64-apple-darwinrow to the matrix, mirroring the parity already present in the Python/Ruby/Java/Go publish matrices. The next published version (≥1.8.2) will include the darwin-x64 binary. (.github/workflows/publish.yaml)fix(alef-e2e/rust): unwrap Option<scalar> leaf fields in numeric comparison assertions(the threegreater_than/less_than/less_than_or_equaloperators no longer fail to compile when the leaf field isOption<T>),fix(alef-e2e/rust): use serde_json::from_str instead of json! macro for fixture json_object args(sidesteps the macro recursion-limit on fixtures with large JSON payloads),fix(alef-backend-php): emit Box::default() instead of Box::new(Default::default()) for boxed fallback fields(resolvesclippy::box-default-D warnings on the PHP umbrella crate), andfeat(alef-core,alef-e2e/wasm,alef-e2e/typescript): auto-skip wasm fixtures outside the static-compiled language set(foundational for tslp's curated wasm32 builds; no-op for now since[crates.wasm].languagesis empty, but unlocks the future curated-build flow). Side effects in this regen: a few Rust e2e fixture bodies re-formatted,e2e/c/main.ccosmetic update, andpackages/swift/rust/Cargo.tomldeps re-ordered. (alef.toml,e2e/{c,php,rust}/**,packages/swift/rust/Cargo.toml)shellcheck SC2129flagged four consecutiveecho … >> "$GITHUB_ENV"lines in the Set library paths for .NET step; consolidated into a single grouped{ … } >> "$GITHUB_ENV"block to keep actionlint clean on the workflow. (.github/workflows/ci-e2e.yaml)alef_versioninalef.tomland the alef pre-commit-hook rev. 
- Lands the Phase-5 leakage-sanitizer chain plus follow-up codegen fixes: v0.17.4 csharp/elixir/kotlin/swift codegen-consumer unblocks; v0.17.5 NAPI/PHP/Java docstring sanitizer wiring; v0.17.7 sanitizer recognises rustdoc test-attribute fences (```no_run, ```ignore, ```should_panic, ```compile_fail, ```edition*) as Rust code (so their bodies are dropped for foreign-language targets); v0.17.8/v0.17.9 csharp U1-bool P/Invoke call-site fix; v0.17.10 Swift free-function forwarder fixes: Option<String> returns now use ?.toString() and host DTO args flow through .intoRust() before the bridge call, so detectLanguageFromExtension/Path/Content, the *Query getters, and process(_:config:) compile and execute against the high-level Swift API. Downstream surface: 61 Rust-code-block leaks in crates/ts-pack-core-node/index.d.ts and 20+ in crates/ts-pack-core-php/src/lib.rs collapse to 0 after this regen.
- chunks undefined. e2e/rust/tests/process_test.rs four test_*_chunking_* cases were emitting assert!(chunks.len() >= 2 as usize, ...) where chunks was undeclared (E0425). Same class of bug as the PHP $chunks fix; alef's Rust e2e codegen unconditionally fired the streaming-virtual-field assertion arm for chunks/imports/structure even for non-streaming fixtures. Fix pulled in via alef a32ca2a0 fix(rust-gen): bind fields_array accessor before len() assertion in e2e tests; non-streaming fixtures with a colliding fields_array field now emit a leading let {field} = &{result}.{field}; binding.
- e2e/node tree-sitter dev-dep restored (recurring). alef generate strips tree-sitter@^0.25.0 from e2e/node/package.json on every regen, but tests/capsule_passthrough.test.ts imports it to verify FFI capsule type-tag pass-through between our Language object and the upstream tree-sitter Node native module. Hand-restored, alongside the corresponding pnpm-lock.yaml rows.
- 2eaa260a fix(swift): hide RustVec/RustString/intoRust from public API; convert at forwarder boundaries plus a handful of smaller adapter fixes (fix(alef-backend-pyo3), fix(alef-backend-napi,wasm), fix(alef-backend-ffi) clippy). Public Swift API surface no longer leaks RustVec/RustString/intoRust(); conversion happens at forwarder boundaries inside generated extensions.
- pnpm-lock.yaml to drop the stale e2e/node → tree-sitter@^0.25.0 devDependency that broke pnpm install --frozen-lockfile in CI Validate. Regenerated the docs/reference/api-*.md set so committed output matches alef docs (compact Markdown tables) and alef verify stays green on main.
- fix(swift): enum intoRust(), Ref→owned init, Vec<RustString> elem type (Swift CI was failing on CommentKind.intoRust(), RustStringRef.toString(), RustVec<String> not conforming to Vectorizable); fix(php-gen): bind fields_array accessor before count() assertion in e2e tests (PHP e2e test_*_chunking_* cases were referencing undefined $chunks); fix(alef-backend-go): null-check and box Option<String> returns instead of dereferencing (generated packages/go/binding.go was returning C.GoString(ptr) where the signature expected *string, breaking golangci-lint and govulncheck). Side-effect: API docstrings now elide Rust-style [Type]-link syntax (e.g. PHP Node.php doc comments now read A single syntax node within a 'Tree' instead of A single syntax node within a [Tree]).
- tree-sitter-yuck produces RuntimeError: unreachable when parsing under wasm32 (same class of bug as zig/ziggy, which already skip on wasm). fixtures/smoke/yuck.json now carries skip: { languages: ["wasm"] }; alef e2e generate removed the corresponding test from e2e/wasm/tests/smoke.test.ts. Native bindings remain unaffected.
- package.json pnpm-field cleanup. Removed the now-ignored pnpm.onlyBuiltDependencies block from the root package.json. pnpm 11 reads that setting from pnpm-workspace.yaml (which already declares the same allowlist); the duplicate field made pnpm emit a warning on every install.
- github.com/kreuzberg-dev/tree-sitter-language-pack/releases/... previously used ureq 3.x's default rustls agent, which trusts only the bundled Mozilla webpki roots and ignores the platform store. On Linux/WSL2 hosts where GitHub HTTPS traffic is presented with a chain rooted in a locally trusted (corp / private) CA, and where curl, pip, and git all succeed against the same URL via the OS trust store, first-use parser downloads failed with DownloadError: ... io: invalid peer certificate: UnknownIssuer. The downloader now constructs a configured ureq::Agent with RootCerts::PlatformVerifier by default (via rustls-platform-verifier), matching the behaviour of every other host-trust-aware HTTP client on the system. Set TREE_SITTER_LANGUAGE_PACK_TLS_ROOTS=webpki to opt back into ureq's bundled Mozilla roots; set TREE_SITTER_LANGUAGE_PACK_TLS_ROOTS=platform to make the default explicit. Affects every binding (Python, Node.js, Ruby, PHP, Go, Java, C#, Elixir, WASM, Dart, Swift, Zig, Kotlin-Android) because the fix lives entirely in the shared ts-pack-core Rust crate. (crates/ts-pack-core/src/{download.rs,pack_config.rs}, workspace Cargo.toml)

Removed
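As a rough illustration of the TREE_SITTER_LANGUAGE_PACK_TLS_ROOTS switch from the TLS trust-root fix, the selection can be pictured as below. This is not the crate's code (the real logic lives in the Rust ts-pack-core crate); the helper name resolve_tls_roots and the fallback for unrecognised values are assumptions made for this sketch. Only the variable name and the two values platform and webpki come from the release notes.

```python
import os

# Hypothetical helper mirroring the documented behaviour of
# TREE_SITTER_LANGUAGE_PACK_TLS_ROOTS; not the actual Rust implementation.
ENV_VAR = "TREE_SITTER_LANGUAGE_PACK_TLS_ROOTS"
DEFAULT_ROOTS = "platform"  # OS trust store (the new default)
BUNDLED_ROOTS = "webpki"    # ureq's bundled Mozilla roots (opt-in)


def resolve_tls_roots(environ=None):
    """Return which root store a downloader would pick for this environment."""
    env = os.environ if environ is None else environ
    value = env.get(ENV_VAR, "").strip().lower()
    if value == BUNDLED_ROOTS:
        return BUNDLED_ROOTS
    # Unset, "platform", or any other value falls back to the default.
    # (Falling back on unknown values is an assumption of this sketch.)
    return DEFAULT_ROOTS
```

On a host behind a private CA, leaving the variable unset would select the platform store in this sketch, which is the case the fix targets.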
- wolfram grammar dropped from the language pack. tree-sitter-wolfram produces glibc heap corruption (free(): invalid next size) when parsing trivial input under serial test execution on Linux; macOS allocator silently tolerated the corruption. The entire upstream ecosystem is unmaintained (canonical bostick/tree-sitter-wolfram last touched 2021-11-11 with 3 stars; every known fork, namely LumaKernel, LoganAMorrison, JuanG970, and jakassebaum, ships the same LANGUAGE_VERSION 13 parser tables and is inactive). Rather than fork-and-maintain a Wolfram grammar in-house for marginal demand, the entry is removed from language_definitions.json, all CI TSLP_LANGUAGES lists, the smoke fixture, the e2e harness, the docs, and the README ecosystem listings. Total supported grammar count drops from 306 to 305, which matches the long-standing "305 languages" marketing copy (previously off-by-one due to the broken wolfram entry).

Changed
- publish-pubdev.yaml workflow triggered by push: tags: v*. pub.dev OIDC trusted publishing rejects tokens from release events; only push and workflow_dispatch events are accepted. The new workflow produces an accepted token. One-time setup required: configure pub.dev → tree_sitter_language_pack package → Admin → Automated publishing with workflow path .github/workflows/publish-pubdev.yaml. (.github/workflows/publish-pubdev.yaml, .github/workflows/publish.yaml)
- _supported_languages.py to reflect the 305-grammar count.
- scripts/generate_grammar_table.py default output path corrected from docs/supported-languages.md to the canonical nav-referenced docs/languages.md; Taskfile docs:generate:languages generates: field updated to match.

v1.6.3
Fixed
- TSLP_LINK_MODE and TSLP_LANGUAGES env vars to Go task (#102)
- LDFLAGS paths: point to workspace target/release/ instead of crate-local path (#102)
- ffi.go (already in ts_pack.h) (#102)
- Init, Download) (#102)
- cache_dir() to registry on creation (#102)
- additional_dependencies for all textlint plugins (#102)

v1.6.2 Compare Source
Fixed
- -fno-strict-aliasing to prevent undefined behavior (#100)

Changed

v1.6.1 Compare Source
Fixed
- packages/go/v1/ to packages/go/ so the Go module proxy can resolve go.mod at the correct path; go get github.com/kreuzberg-dev/tree-sitter-language-pack/packages/go now works (#97)
- SRCDIR-relative include/lib paths (one fewer ../ after directory restructure)
- features = ["all"] from e2e Rust test Cargo.toml; use download feature for runtime parser fetching
- lang-* features to unblock crates.io publish (300 feature limit)
- rustls-webpki to patch RUSTSEC-2026-0098 and RUSTSEC-2026-0099 (#99)
- language_definitions.json in crate

Changed
- download/TSLP_LANGUAGES documentation in READMEs

v1.6.0 Compare Source
Added
- parse_string(): avoids re-creating parsers on repeated calls for the same language
- run_query(): avoids recompiling tree-sitter queries
- parse_with_language() internal API for callers that already have a Language object
- CompiledExtraction: avoids rebuilding on every extraction call
- type_spec declarations extracted as symbols with correct SymbolKind (struct, interface, type)

Fixed
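The parser reuse listed for parse_string() in v1.6.0 (a later entry in the same release calls it the thread-local parser cache) follows a common pattern that can be sketched as below. This is an illustrative Python analogue, not the Rust implementation; make_parser and cached_parser are hypothetical names, and the dictionary stands in for a real parser object.

```python
import threading


# Hypothetical stand-in for an expensive parser constructor; a real binding
# would build a tree-sitter parser for the language here.
def make_parser(language):
    return {"language": language}


_local = threading.local()


def cached_parser(language, factory=make_parser):
    """Return this thread's parser for `language`, creating it on first use."""
    cache = getattr(_local, "parsers", None)
    if cache is None:
        cache = _local.parsers = {}
    parser = cache.get(language)
    if parser is None:  # cache miss: construct once per thread and language
        parser = cache[language] = factory(language)
    return parser
```

Because each thread keeps its own dictionary, no lock is needed on the lookup path; the trade-off is one parser per language per thread.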
- compiled_query() now propagates Error::LockPoisoned instead of silently ignoring poisoned RwLock
- QueryCursor byte-range no longer leaks between patterns when reusing the cursor in extract_from_tree()
- std::collections::HashMap with ahash::AHashMap in parser cache for consistency
- get_language() call removed from parse_string() hot path; only called on cache miss

Changed
- CompiledExtraction::extract() and intel::parse_source() now use the thread-local parser cache
- QueryCursor reused across patterns within a single extract_from_tree() call
- String allocation removed from node_types.contains() check in chunking

Removed
- lang-* Cargo features and group features (all, web, systems, scripting, data, jvm, functional, wasm). Language selection is now via TSLP_LANGUAGES env var at build time; the download feature (default) fetches parsers at runtime

v1.5.0 Compare Source
Added
- ci-validate.yaml: blocks PRs that introduce non-permissive (GPL/AGPL/LGPL/MPL) grammars

Fixed
- less grammar: regenerated parser from ABI 11 to ABI 14 (was incompatible with tree-sitter 0.26)
- corn smoke fixture: replaced invalid "x" snippet with valid corn syntax

v1.4.2 Compare Source
Fixes
- ts-pack-cli to crates.io; cargo install ts-pack-cli now works (#87)
- install.sh reference from documentation

Install the CLI
or via Homebrew:
v1.4.1 Compare Source
Fixed
- language_definitions.json in the published crate so build.rs can find extension mappings, ambiguity data, and C symbol overrides when installed from crates.io

Changed

v1.4.0 Compare Source
Fixed
- detect_language in Python public API (#85)
- ts-pack-php (hyphens)

Changed

v1.3.3 Compare Source
Fixed
- C_SYMBOL_OVERRIDES table now includes ALL languages from language_definitions.json, not just compiled ones; fixes download and loading of csharp, vb, embeddedtemplate, nushell from PyPI/npm/RubyGems packages
- downloaded_languages() returns canonical names (csharp) instead of c_symbol names (c_sharp)
- JSON.parse on native Hash return from process()
- ProcessResult fields directly (no metadata wrapper)

Changed
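The v1.3.3 distinction between canonical names and c_symbol names can be illustrated with a small two-way lookup. This is a hedged sketch only: the table layout and the helper names to_c_symbol and to_canonical are hypothetical, and the single csharp/c_sharp pair is the only mapping taken from the notes above.

```python
# Hypothetical override table; only the csharp -> c_sharp pair comes from the
# release notes. Languages without an override use their canonical name as-is.
OVERRIDES = {"csharp": "c_sharp"}


def to_c_symbol(canonical):
    """Name used for the compiled library symbol (what a loader looks up)."""
    return OVERRIDES.get(canonical, canonical)


def to_canonical(c_symbol):
    """Name shown to users, e.g. in a list of downloaded languages."""
    for canonical, symbol in OVERRIDES.items():
        if symbol == c_symbol:
            return canonical
    return c_symbol
```

Keeping both directions next to one table is one way to avoid the kind of mismatch the fix describes, where one layer reported c_sharp while callers expected csharp.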
- rustler_precompiled updated to 0.9.0 (Elixir)

v1.3.2 Compare Source
Fixed
- c_symbol overrides (csharp, vb, embeddedtemplate, nushell): build was naming libraries with the raw name but runtime loader expected the c_symbol name (#80)
- tspack import in non-process test files
- extract/2 and validate_extraction/1 NIF declarations
- \n is interpreted correctly
- paranim/tree-sitter-nim (ABI v11) to aMOPel/tree-sitter-nim (MIT, ABI v14)

Added
- c_symbol override languages (csharp, vb, embeddedtemplate, nushell)
- ci-all-grammars.yaml to catch c_symbol naming mismatches

v1.3.1 Compare Source
Fixed
- process(), extract(), validate_extraction() now return native Ruby Hash instead of raw JSON string
- *ProcessResult struct fields instead of invalid json.Unmarshal on non-string return
- load_from loader

v1.3.0 Compare Source
Added
- extract_patterns()/extract() across Python, Node.js, Rust, Ruby, Elixir, PHP, WASM, C FFI
- validate_extraction() for config validation without execution
- CompiledExtraction for pre-compiled query reuse (Rust)
- ProcessConfig.extractions for combining custom queries with standard analysis

Fixed
- process_imports_contains_source assertion uses contains instead of equality
- detectLanguageFromPath and detectLanguageFromExtension exports
- process() result assertions
- crate field resolution with load_from override

v1.2.1 Compare Source
Fixed
- c_symbol override: linker error undefined symbol: tree_sitter_nushell
- .as_deref() on String type (compile error on CI)
- c_symbol_for behind dynamic-loading/download features (dead code warning)
- crate: field with Cargo [lib] name (underscores, not hyphens)
- --cfg flag patch to publish workflow for Rustler 0.37.3 compatibility
- without_gil(): add catch_unwind to ensure GIL is reacquired on panic
- //!, /*!, and doc_comment node types
- text.contains()

Changed
- Arc<Vec<PathBuf>> for extra lib dirs (avoids Vec clone per language lookup)
- AHashSet<&str> in available_languages() (avoids 248+ String allocations)
- NodeInfo.kind uses Cow::Borrowed (zero-copy from tree-sitter's &'static str)
- with_tree()/try_with_tree() helpers replace 9 duplicate lock patterns
- without_gil() helper replaces 5 duplicate GIL release patterns
- extension_ambiguity_json() helper replaces duplicated JSON serialization in 4 bindings
- MetadataCollector struct reduces function from 11 to 7 parameters

v1.2.0 Compare Source
Added
Configuration
📅 Schedule: (UTC)
🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.
♻ Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.
🔕 Ignore: Close this PR and you won't be reminded about this update again.
This PR was generated by Mend Renovate. View the repository job log.