WikiStub-Seed is a multilingual-ready JSON knowledge framework for AI-assisted research, documentation, learning systems and LLM workflows. It ships a curated wikistub_seed.json dataset with 630 compact German/English knowledge stubs across 12 scientific and cultural domains, prepared ES/ZH/JA/RU language slots, plus Python tools for validation, export and translation.
WikiStub-Seed is a knowledge-stub seed library, not a wiki.
- 630 knowledge stubs in `wikistub_seed.json` with DE/EN content and prepared ES/ZH/JA/RU language slots
- 12 top-level domains, including mathematics, physics, chemistry, biology, medicine, psychology, AI, engineering, society, economics, history and culture
- 85 subcategories with short, neutral definitions and relevance notes
- Canonical `definitions.{lang}` and `relevance_i18n.{lang}` maps while retaining legacy `definition_de`, `definition_en` and `relevance`
- Python CLI tooling for statistics, validation, consistency checks and Markdown export
- A documented `wikistub-seed-data-v1` export direction for future static Web/PWA use
- No required external dependencies for core import, export, validation or CLI use
- Seed a local knowledge base for AI-assisted writing or research
- Build documentation glossaries, learning maps or concept catalogs
- Export structured Markdown for Obsidian, GitHub Pages or static sites
- Feed retrieval, embeddings or LLM context pipelines with compact domain stubs
- Translate and extend a domain-neutral knowledge skeleton in a controlled JSON format
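As an illustration of the LLM-context use case, the sketch below joins stub titles and definitions into a compact text block under a character budget. It is not part of the project's tooling: the function name `build_context`, the `max_chars` budget and the bullet layout are assumptions, and it only relies on the `title` and `definitions` fields of the stub format described in this README.

```python
def build_context(stubs, lang="en", max_chars=500):
    """Join '- Title: definition' lines until the character budget is reached.

    Hypothetical helper for illustration; not shipped with WikiStub-Seed.
    """
    lines = []
    used = 0
    for stub in stubs:
        text = (stub.get("definitions") or {}).get(lang, "")
        if not text:
            continue  # skip stubs without content in the requested language
        line = f"- {stub['title']}: {text}"
        if used + len(line) + 1 > max_chars:
            break  # adding this line would exceed the budget
        lines.append(line)
        used += len(line) + 1  # +1 accounts for the joining newline
    return "\n".join(lines)


# Made-up sample data in the shape of the stub format.
sample = [
    {"title": "Entropy", "definitions": {"en": "A measure of disorder."}},
    {"title": "Untranslated", "definitions": {"en": ""}},
]
print(build_context(sample))
```

A character budget is only a rough stand-in for a token budget; a real pipeline would likely count tokens with the tokenizer of the target model.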
Each stub is intentionally small and machine-readable:
```json
{
  "title": "Domain-Driven Design",
  "definition_de": "Ein Ansatz zur Modellierung komplexer Software, der die Fachdomäne in den Mittelpunkt stellt.",
  "definition_en": "An approach to modeling complex software that places the business domain at the center of development.",
  "relevance": "Hilft, komplexe Systeme verständlich und wartbar zu gestalten.",
  "definitions": {
    "de": "Ein Ansatz zur Modellierung komplexer Software, der die Fachdomäne in den Mittelpunkt stellt.",
    "en": "An approach to modeling complex software that places the business domain at the center of development.",
    "es": "",
    "zh": "",
    "ja": "",
    "ru": ""
  },
  "relevance_i18n": {
    "de": "Hilft, komplexe Systeme verständlich und wartbar zu gestalten.",
    "en": "",
    "es": "",
    "zh": "",
    "ja": "",
    "ru": ""
  },
  "tags": ["Informatik", "Software Engineering"]
}
```

The current authoritative source is `wikistub_seed.json`. `EXPORTFORMAT.md` documents the stable wrapper format `wikistub-seed-data-v1` for Web/PWA, API and LLM exports.
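Because each stub carries both the canonical `definitions` map and the legacy `definition_de`/`definition_en` fields, a consumer may want a single lookup that prefers the canonical value and falls back to the legacy one. The following is a minimal sketch under that assumption; the helper name `get_definition` and the fallback order are hypothetical and not taken from the project's code.

```python
from typing import Optional


def get_definition(stub: dict, lang: str, fallback: str = "de") -> Optional[str]:
    """Return the definition for `lang`, or for `fallback` if that slot is empty.

    Hypothetical helper: checks the canonical `definitions` map first and only
    then the legacy `definition_<lang>` field.
    """
    for code in (lang, fallback):
        canonical = (stub.get("definitions") or {}).get(code)
        text = canonical or stub.get(f"definition_{code}")
        if text:
            return text
    return None


# Shortened, made-up strings in the shape of the stub shown above.
stub = {
    "title": "Domain-Driven Design",
    "definition_en": "Legacy English text.",
    "definitions": {"de": "Deutscher Text.", "en": "", "es": ""},
}

print(get_definition(stub, "en"))  # canonical slot is empty, legacy field is used
print(get_definition(stub, "es"))  # empty slot, falls back to the German text
```

Falling back to German is a design choice for this sketch; callers that must never mix languages can pass `fallback=lang` and handle `None` themselves.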
```bash
git clone https://github.com/dev-bricks/WikiStub-Seed.git
cd WikiStub-Seed
python wikistub_seed_cli.py --help
python wikistub_seed_cli.py stats
python wikistub_seed_cli.py check
python wikistub_seed_pipeline.py validate
python wikistub_seed_pipeline.py export --output --english
```

On Windows, `start.bat` opens the CLI entry point. Exported files are written to `output/`; that folder is local and not versioned.
| Command | Purpose |
|---|---|
| `python wikistub_seed_cli.py stats` | Print stub, category and tag statistics |
| `python wikistub_seed_cli.py check` | Run consistency checks over the JSON dataset |
| `python wikistub_seed_pipeline.py validate` | Validate the pipeline input data |
| `python wikistub_seed_pipeline.py export --output --english` | Export the JSON dataset to Markdown |
| `python wikistub_seed_pipeline.py translate` | Optionally translate missing English definitions when configured |
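To get a feel for how much translation work remains, one could count empty language slots per language. The sketch below is an independent illustration, not the logic behind `check` or `translate`; the function name `count_missing_slots` is hypothetical, and it operates on an in-memory list of stub dictionaries because the top-level layout of `wikistub_seed.json` is not described here.

```python
from collections import Counter

# Language codes of the slots the dataset prepares.
LANGS = ("de", "en", "es", "zh", "ja", "ru")


def count_missing_slots(stubs, field="definitions"):
    """Count, per language, how many stubs have an empty or absent slot in `field`."""
    missing = Counter()
    for stub in stubs:
        slots = stub.get(field) or {}
        for lang in LANGS:
            if not (slots.get(lang) or "").strip():
                missing[lang] += 1
    return missing


# Made-up sample stubs for demonstration.
stubs = [
    {"title": "A", "definitions": {"de": "x", "en": "y", "es": ""}},
    {"title": "B", "definitions": {"de": "x", "en": ""}},
]
print(dict(count_missing_slots(stubs)))
```

Passing `field="relevance_i18n"` applies the same count to the relevance notes.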
| Path | Purpose |
|---|---|
| `wikistub_seed.json` | Authoritative bilingual knowledge dataset |
| `01_Mathematik/` ... `12_Kultur_Kunst_Sprache/` | Domain-oriented Markdown source/export structure |
| `wikistub_seed_cli.py` | CLI for stats and checks |
| `wikistub_seed_pipeline.py` | Import, export, validation and optional translation pipeline |
| `md_to_json.py` | Markdown-to-JSON import helper |
| `check_duplicates.py` | Duplicate/consistency helper |
| `EXPORTFORMAT.md` | Stable exchange-format plan |
| `web_publisher/` | Static Web/PWA publisher (PWA, offline cache, search, DE/EN toggle) |
WikiStub-Seed is local-first. Core usage reads and writes local JSON/Markdown files only. There is no telemetry and no automatic network communication.
The optional translation command can call an external API only when ANTHROPIC_API_KEY is set and the optional anthropic package is installed.
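The two preconditions can be checked without importing the package or touching the network. This is a sketch of how such a gate might look, not the pipeline's actual implementation; the function name `translation_available` is an assumption.

```python
import importlib.util
import os


def translation_available() -> bool:
    """Return True only if an API key is set and the optional package is importable.

    Hypothetical gate for illustration; it performs no network access.
    """
    has_key = bool(os.environ.get("ANTHROPIC_API_KEY"))
    has_package = importlib.util.find_spec("anthropic") is not None
    return has_key and has_package


if __name__ == "__main__":
    if translation_available():
        print("Both preconditions for the optional translation step are met.")
    else:
        print("API key or optional package missing; translation stays off.")
```

`importlib.util.find_spec` only locates the package, so the check itself has no side effects beyond reading the environment.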
Completed:
- 12 top-level domains and 85 subcategories
- 630 bilingual stubs in a single JSON master file
- Markdown export and JSON synchronization tooling
- CLI smoke tests in GitHub Actions, plus dedicated macOS/Linux source smoke tests for `wikistub_seed_cli.py check` and `wikistub_seed_pipeline.py validate`
- Static Web/PWA publisher with search and offline cache (`web_publisher/`)
- `wikistub-seed-data-v1` schema wrapper with prepared DE/EN/ES/ZH/JA/RU language slots
Planned:
- Unified tag cleanup
- Obsidian/GitHub Pages export paths
- Optional embeddings and search API
WikiStub-Seed is a multilingual-ready JSON knowledge skeleton for AI-assisted knowledge work. The repository contains 630 compact knowledge stubs with German/English content and prepared language slots for Spanish, Chinese, Japanese and Russian, spread across 12 scientific and cultural domains. The stubs are short, neutral, versionable and suited to automation, documentation, learning systems and LLM contexts.
By default, WikiStub-Seed works locally with `wikistub_seed.json`. The core functions need no external packages. Only the optional translation feature makes external API calls, and only when an API key is set and the optional package is installed.
Key entry points:
- `python wikistub_seed_cli.py stats` shows statistics and categories.
- `python wikistub_seed_cli.py check` checks the dataset.
- `python wikistub_seed_pipeline.py export --output --english` exports Markdown.
- `EXPORTFORMAT.md` describes the planned stable exchange format.
- `web_publisher/` contains the finished static Web/PWA publisher with offline cache and DE/EN toggle.
MIT License. See LICENSE.
This project is an unpaid open-source donation. Liability is limited to intent and gross negligence under Section 521 of the German Civil Code, with the MIT License disclaimers applying as well. Use at your own risk.