sc2ts

Sc2ts stands for "SARS-CoV-2 tree sequence" (optionally pronounced "scoots") and consists of:

  1. A method to infer Ancestral Recombination Graphs (ARGs) from SARS-CoV-2 genome sequence data at pandemic scale.
  2. A lightweight wrapper around the tskit Python APIs, specialised for the output of sc2ts, which enables efficient access to node metadata.
  3. A lightweight wrapper around Zarr Python, which enables convenient and efficient access to the full Viridian dataset (alignments and metadata) in a single file using the VCF Zarr specification.
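
As background for item 2, tskit tables store node metadata as a flat column of encoded bytes with a companion offsets column, so reading one field across many nodes means slicing and decoding each record. The following is a minimal, standard-library-only sketch of that layout. It is not sc2ts or tskit code: the record fields (`strain`, `date`) and the helper names `pack` and `field_for_all_nodes` are hypothetical and chosen purely for illustration.

```python
import json

# Hypothetical per-node metadata records; field names are made up.
records = [
    {"strain": "sample_A", "date": "2020-01-01"},
    {},  # a node with no metadata fields
    {"strain": "sample_B", "date": "2020-02-15"},
]


def pack(records):
    """Encode records as one flat byte buffer plus a list of offsets."""
    buffer = bytearray()
    offsets = [0]
    for record in records:
        buffer += json.dumps(record).encode("utf-8")
        offsets.append(len(buffer))
    return bytes(buffer), offsets


def field_for_all_nodes(buffer, offsets, key, default=None):
    """Decode each record once and collect a single field per node."""
    values = []
    for start, end in zip(offsets[:-1], offsets[1:]):
        record = json.loads(buffer[start:end])
        values.append(record.get(key, default))
    return values


buffer, offsets = pack(records)
strains = field_for_all_nodes(buffer, offsets, "strain")
print(strains)
```

A wrapper that decodes all records in one pass and returns whole columns avoids repeating this work for every lookup; whether sc2ts takes exactly this approach is not stated here, so please consult the online documentation for the actual interface.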

For details on the software, please see the online documentation. For information on the method and an inferred ARG, please see this preprint:

Shing H. Zhan, Yan Wong, Anastasia Ignatieva, Katherine Eaton, Isobel Guthrie, Benjamin Jeffery, Duncan S. Palmer, Carmen Lia Murall, Sarah P. Otto, and Jerome Kelleher (2025) A Pandemic-Scale Ancestral Recombination Graph for SARS-CoV-2. bioRxiv: 2023.06.08.544212; doi: https://doi.org/10.1101/2023.06.08.544212

