docs: cover 0.4.44 — quantize, federation, models redesign, secrets#1
Open
webdevtodayjason wants to merge 1 commit into
Bring the docs site up to the 0.4.44 product surface.

- add guides/quantize.mdx: in-browser AWQ / NVFP4 quantization, idle-node guardrail, HF push, Qwen3.5 multimodal/hybrid handling (AWQ verified; NVFP4-on-Qwen3.5 marked experimental)
- add guides/secrets.mdx: local Secrets store; HF read vs write tokens (write required for pushes, read-only rejected up front); NGC/W&B/OpenAI
- add guides/federation.mdx: master router (route /v1/* by model name), load/unload any node from the UI, model stacking + persistence, mem-util knob
- rewrite guides/models.mdx: two-list Installed/Browse redesign, Unload (was DELETE), serve-from-on-disk-weights
- introduction.mdx: add Quantization / Federated serving / Model stacking cards; add federation + quantization to the architecture box
- docs.json: add the three new pages to the Guides nav

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01JfM4xyZR4DdC3W74ea99Mi
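For reviewers unfamiliar with the federation guide, "route /v1/* by model name" can be pictured roughly as follows. This is only a minimal sketch, not the product's implementation: the `ROUTES` table, the node URLs, and `pick_node` are hypothetical names, and it assumes the model name is read from the `model` field of the JSON request body, which this PR does not confirm.

```python
from typing import Optional

# Hypothetical registry: model name -> base URL of the node serving it.
ROUTES = {
    "model-a": "http://node-1:8000",
    "model-b": "http://node-2:8000",
}


def pick_node(path: str, body: dict, routes: dict = ROUTES) -> Optional[str]:
    """Return the upstream URL for a /v1/* request, or None if it cannot be routed."""
    # Only /v1/* paths are handled by this sketch of the router.
    if not path.startswith("/v1/"):
        return None
    # Assumption: the request body carries the target model name.
    base = routes.get(body.get("model"))
    return None if base is None else base + path
```

A request for a model that no node has loaded returns `None` here; how the real router responds in that case is not stated in this PR.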
Brings the docs site up to the current product surface (code is 0.4.44; the site was last aligned around 0.4.4). Closes the largest gaps found in a README/CHANGELOG/docs gap analysis.
What changed

- `guides/quantize.mdx`: in-browser AWQ (W4A16) / NVFP4 quantization, covering scheme choice, calibration samples, the idle-node guardrail (`409` while a model is loaded; `force:true` override), push-to-HF, and Qwen3.5 multimodal/Gated-DeltaNet handling. AWQ is presented as proven; NVFP4-on-Qwen3.5 is explicitly marked experimental/unverified.
- `guides/secrets.mdx`: the local Secrets store and the read vs write HF token distinction (write required to push; read-only tokens rejected up front), plus NGC / W&B / OpenAI slots.
- `guides/federation.mdx`: federated master router (routes `/v1/*` by model name), load/unload any model on any node from the UI, model stacking + restart persistence, and the per-node memory-utilization knob.
- `guides/models.mdx`: the two-list Installed / Browse redesign and Unload (the old text described the single "Downloads" tab + "Launch Model"); adds serve-from-on-disk-weights.
- `introduction.mdx`: new capability cards (Quantization, Federated serving, Model stacking) and federation + quantization added to the architecture box.
- `docs.json`: the three new pages added to the Guides nav.

Deliberately omitted

- `TRITON_ATTN` (currently a no-op hedge): not documented.

Validation

- `docs.json` parses as valid JSON; every nav-referenced guide page exists on disk.

Do not merge yet: opening for review.
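The validation step could be reproduced with a short script along these lines. It is a sketch under assumptions, not the check that was actually run: `collect_pages` and `missing_pages` are hypothetical helpers, and it assumes nav entries are strings listed under `pages` keys inside a top-level `navigation` object, each mapping to a `<entry>.mdx` file under the docs root.

```python
import json
from pathlib import Path


def collect_pages(node):
    """Recursively yield string entries found under any "pages" key."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "pages" and isinstance(value, list):
                for item in value:
                    if isinstance(item, str):
                        yield item  # a page reference such as "guides/quantize"
                    else:
                        yield from collect_pages(item)  # a nested group
            else:
                yield from collect_pages(value)
    elif isinstance(node, list):
        for item in node:
            yield from collect_pages(item)


def missing_pages(docs_root):
    """Parse docs.json (raises on invalid JSON) and list nav pages with no .mdx file."""
    root = Path(docs_root)
    config = json.loads((root / "docs.json").read_text(encoding="utf-8"))
    return [
        page
        for page in collect_pages(config.get("navigation", {}))
        if not (root / f"{page}.mdx").exists()
    ]
```

Calling `missing_pages(".")` from the docs root should return an empty list if the claim above holds; a `json.JSONDecodeError` would mean `docs.json` itself is malformed.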
🤖 Generated with Claude Code
https://claude.ai/code/session_01JfM4xyZR4DdC3W74ea99Mi