docs(lapis): fix maintainer docs #1703
Pull request overview
Updates LAPIS maintainer reference docs to reflect current SILO/LAPIS configuration and preprocessing expectations, replacing older mixed-format and component-driven documentation with a more direct, SILO-referenced spec.
Changes:
- Rewrite preprocessing reference to emphasize SILO-driven preprocessing, NDJSON ingestion, and updated config keys (lineage/phylo/incremental sections).
- Restructure `database_config.yaml` reference into explicit top-level/schema/metadata/features sections and remove conditional/component-rendered content.
- Remove the docs component previously used to render metadata type documentation (and inline the type list instead).
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| lapis-docs/src/content/docs/maintainer-docs/references/preprocessing.mdx | Rewritten preprocessing reference (NDJSON schema, config keys, lineage/phylo/incremental notes). |
| lapis-docs/src/content/docs/maintainer-docs/references/database-configuration.mdx | Reworked database config reference into a more explicit spec and removed component-based rendering. |
| lapis-docs/src/components/Configuration/MetadataTypesList.astro | Deleted docs component used for listing metadata types (verified no remaining references in repo). |
Pages on database configuration and preprocessing were outdated.
| configuration and input format that maintainers need to know in order to operate LAPIS. | ||
| For the authoritative reference, see the [SILO repository](https://github.com/GenSpectrum/LAPIS-SILO), | ||
| in particular the documents in [`documentation/`](https://github.com/GenSpectrum/LAPIS-SILO/tree/main/documentation) | ||
| (`input_format.md`, `lineage_definitions.md`, `phylogenetic_queries.md`, `incremental_preprocessing.md`). |
I'm not sure whether we should mention specific filenames here. It's prone to becoming stale when someone renames files in SILO.
| - a TSV file with the metadata | ||
| - FASTA files with the sequences | ||
| SILO ingests data in [NDJSON](https://ndjson.org/) format (Newline-Delimited JSON). One JSON object per line | ||
| describes a single sequence record. There is no separate TSV/FASTA input mode. |
| describes a single sequence record. There is no separate TSV/FASTA input mode. | |
| describes a single sequence record. |
or do you consider this especially relevant for the target audience?
There is still the tutorial (/maintainer-docs/tutorials/start-lapis-and-silo) - let's flag it as outdated?
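For context on the NDJSON wording discussed in this thread: the quoted text says one JSON object per line describes a single sequence record. A minimal sketch of what that means for a consumer is below. The `read_ndjson` helper and the field names `id` and `date` are hypothetical and only illustrate the line-per-record idea; they are not SILO's actual schema, which is defined in the SILO repository.

```python
import io
import json


def read_ndjson(stream):
    """Yield one dict per non-empty line of an NDJSON stream."""
    for line_number, line in enumerate(stream, start=1):
        line = line.strip()
        if not line:
            # Skipping blank lines is an assumption; SILO may be stricter.
            continue
        record = json.loads(line)
        if not isinstance(record, dict):
            raise ValueError(f"line {line_number}: expected a JSON object")
        yield record


# Hypothetical records; the field names are illustrative only.
sample = io.StringIO(
    '{"id": "seq1", "date": "2024-01-15"}\n'
    '{"id": "seq2", "date": "2024-02-01"}\n'
)
records = list(read_ndjson(sample))
```

Each line parses independently, which is why a single malformed record can be reported with its line number without reading the rest of the file.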
Do we actually still want to maintain this page here? It's quite unrelated to LAPIS itself. IMO we could move most of this file to the SILO docs and shorten this to briefly explaining the concept and then referring to the SILO docs.
| | --------------------------- | ------ | -------- | ----------------------------------------------------------------------------------------------------- | | ||
| | `schema` | object | true | The [schema object](#the-schema-object). | | ||
| | `defaultNucleotideSequence` | string | false | Name of the default nucleotide sequence segment. Only meaningful when there is more than one segment. | | ||
| | `defaultAminoAcidSequence` | string | false | Name of the default amino acid gene | |
nit: Missing period at end of defaultAminoAcidSequence description — all other rows in this table end with a period.
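To make the quoted table rows easier to check against a real config, here is a rough sketch of the three top-level keys visible in this hunk, with `schema` required and the two default-sequence keys optional. The `check_top_level` helper is hypothetical, the hunk may not show every top-level key, and the config is assumed to be already loaded into a Python dict.

```python
# Keys and "required" flags are copied from the rows quoted in this hunk only.
TOP_LEVEL_KEYS = {
    "schema": (dict, True),
    "defaultNucleotideSequence": (str, False),
    "defaultAminoAcidSequence": (str, False),
}


def check_top_level(config):
    """Return a list of problems found among the known top-level keys."""
    problems = []
    for key, (expected_type, required) in TOP_LEVEL_KEYS.items():
        if key not in config:
            if required:
                problems.append(f"missing required key: {key}")
            continue
        if not isinstance(config[key], expected_type):
            problems.append(f"{key}: expected {expected_type.__name__}")
    return problems


# Hypothetical config content, for illustration only.
ok = check_top_level({"schema": {}, "defaultNucleotideSequence": "main"})
missing = check_top_level({"defaultAminoAcidSequence": "S"})
```

Keys outside the table are deliberately ignored here, since the quoted hunk cannot be assumed to be the complete list.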
| <ul> | ||
| <li> | ||
| <code>string</code>: Arbitrary text values. | ||
| </li> | ||
| <li> | ||
| <code>int</code>: Integer values. | ||
| </li> | ||
| <li> | ||
| <code>float</code>: Floating-point values. | ||
| </li> | ||
| <li> | ||
| <code>boolean</code>: <code>true</code> or <code>false</code>. | ||
| </li> | ||
| <li> | ||
| <code>date</code>: Values must be valid dates in the form <code>YYYY-MM-DD</code>. | ||
| </li> | ||
| </ul> |
nit: Metadata types list uses raw HTML <ul>/<li> while the rest of the page uses markdown tables. Minor inconsistency — could use a markdown list or table instead.
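Whichever markup is chosen, the five types in the quoted list can be read as simple value checks. The sketch below is one possible reading, not LAPIS or SILO code: `is_valid` is a hypothetical helper, and accepting integers for `float` is an assumption.

```python
from datetime import datetime


def is_valid(value, metadata_type):
    """Check a parsed JSON value against one of the listed metadata types."""
    if metadata_type == "string":
        return isinstance(value, str)
    if metadata_type == "int":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, int) and not isinstance(value, bool)
    if metadata_type == "float":
        # Assumption: integer literals are acceptable floating-point values.
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if metadata_type == "boolean":
        return isinstance(value, bool)
    if metadata_type == "date":
        if not isinstance(value, str) or len(value) != 10:
            return False  # YYYY-MM-DD is exactly ten characters
        try:
            datetime.strptime(value, "%Y-%m-%d")
        except ValueError:
            return False
        return True
    raise ValueError(f"unknown metadata type: {metadata_type}")
```

The length check rejects unpadded dates such as `2024-1-5`, which `strptime` would otherwise accept.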
| | ---------------------------- | ------- | ------------------------ | ------------------------ | ---------------------------------------------------------------------------------------------------------------------------------- | | ||
| | `inputDirectory` | path | `./` | `/preprocessing/input/` | Directory containing the input files. | | ||
| | `outputDirectory` | path | `./output/` | `/preprocessing/output/` | Directory where SILO writes the preprocessed database state. | | ||
| | `ndjsonInputFilename` | path | (none — **required**) | | NDJSON file with the input records, relative to `inputDirectory`. SILO will refuse to start preprocessing if this is unset. | |
q: ndjsonInputFilename shown as "(none — required)" in the Default column — slightly confusing since the column header says "Default". Consider just "required" or a footnote to clarify there is no default and the field must be set.
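To illustrate the distinction this comment is about, the sketch below treats `inputDirectory` and `outputDirectory` as having defaults (`./` and `./output/`, taken from the quoted rows) and `ndjsonInputFilename` as having none. The `resolve_paths` helper is hypothetical and only mirrors the described behavior; it is not how SILO implements it.

```python
from pathlib import Path

# Defaults copied from the "Default" column of the quoted table rows.
DEFAULTS = {"inputDirectory": "./", "outputDirectory": "./output/"}


def resolve_paths(config):
    """Return (ndjson_path, output_dir) or raise if the input file is unset."""
    filename = config.get("ndjsonInputFilename")
    if not filename:
        # Mirrors the documented behavior: there is no default to fall back on.
        raise ValueError("ndjsonInputFilename is required and has no default")
    input_dir = Path(config.get("inputDirectory", DEFAULTS["inputDirectory"]))
    output_dir = Path(config.get("outputDirectory", DEFAULTS["outputDirectory"]))
    # The filename is interpreted relative to inputDirectory.
    return input_dir / filename, output_dir


# "input.ndjson" is a placeholder filename chosen for this example.
ndjson_path, output_dir = resolve_paths(
    {"inputDirectory": "/preprocessing/input/", "ndjsonInputFilename": "input.ndjson"}
)
```

Seen this way, "required" alone in the Default column would describe the field accurately, since the other two keys are the only ones with a fallback value.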
| description: Reference on the SILO preprocessing | ||
| --- | ||
|
|
||
| import TsvExample from '../../../../components/TsvExample.astro'; |
nit: The TsvExample.astro component is now orphaned — this was its only import. Consider deleting it in this PR alongside MetadataTypesList.astro.
Pages on database configuration and preprocessing were outdated. This updates the pages, mostly relying on Claude. It would be great if someone more familiar with the setups could review and adapt.
Relevant preview pages:
PR Checklist
- [ ] All necessary documentation has been adapted.
- [ ] All necessary changes are explained in the `llms.txt`.
- [ ] The implemented feature is covered by an appropriate test.