
2023-09-01 Journal article
Lessons learned: overcoming common challenges in reconstructing the SARS-CoV-2 genome from short-read sequencing data via CoVpipe2
dc.contributor.author: Lataretu, Marie
dc.contributor.author: Drechsel, Oliver
dc.contributor.author: Kmiecinski, René
dc.contributor.author: Trappe, Kathrin
dc.contributor.author: Hölzer, Martin
dc.contributor.author: Fuchs, Stephan
dc.date.accessioned: 2026-04-30T08:11:51Z
dc.date.available: 2026-04-30T08:11:51Z
dc.date.issued: 2023-09-01 [none]
dc.identifier.other: 10.12688/f1000research.136683.1
dc.identifier.uri: http://edoc.rki.de/176904/13720
dc.description.abstract: Background: Accurate genome sequences form the basis for genomic surveillance programs, the added value of which was impressively demonstrated during the COVID-19 pandemic by tracing transmission chains, discovering new viral lineages and mutations, and assessing them for infectiousness and resistance to available treatments. Amplicon strategies employing Illumina sequencing have become widely established for variant detection and reference-based reconstruction of SARS-CoV-2 genomes, which are now routine bioinformatics tasks. Yet, specific challenges arise when analyzing amplicon data, for example, when crucial and even lineage-determining mutations occur near primer sites. Methods: We present CoVpipe2, a bioinformatics workflow developed at the Public Health Institute of Germany to accurately reconstruct SARS-CoV-2 genomes from short-read sequencing data. The decisive factor here is the reliable, accurate, and rapid reconstruction of genomes, considering the specifics of the sequencing protocol used. Besides fundamental tasks like quality control, mapping, variant calling, and consensus generation, we also implemented additional features to ease the detection of mixed samples and recombinants. Results: Here, we highlight common pitfalls in primer clipping, detecting heterozygote variants, and dealing with low-coverage regions and deletions. We introduce CoVpipe2 to address the above challenges and have compared and successfully validated the pipeline against selected publicly available benchmark datasets. CoVpipe2 features high usability, reproducibility, and a modular design that specifically addresses the characteristics of short-read amplicon protocols but can also be used for whole-genome short-read sequencing data. Conclusions: CoVpipe2 has seen multiple improvement cycles and is continuously maintained alongside frequently updated primer schemes and new developments in the scientific community. Our pipeline is easy to set up and use and can serve as a blueprint for other pathogens in the future due to its flexibility and modularity, providing a long-term perspective for continuous support. CoVpipe2 is written in Nextflow and is freely accessible from https://github.com/rki-mf1/CoVpipe2 under the GPL3 license. [eng]
dc.language.iso: eng [none]
dc.publisher: Robert Koch-Institut
dc.rights: (CC BY 3.0 DE) Attribution 3.0 Germany (Namensnennung 3.0 Deutschland) [ger]
dc.rights.uri: http://creativecommons.org/licenses/by/3.0/de/
dc.subject: SARS-CoV-2 [eng]
dc.subject: genome reconstruction [eng]
dc.subject: whole-genome sequencing [eng]
dc.subject: short reads [eng]
dc.subject: Illumina [eng]
dc.subject: amplicons [eng]
dc.subject: WGS [eng]
dc.subject: Nextflow pipeline [eng]
dc.subject: virus bioinformatics [eng]
dc.subject.ddc: 610 Medicine and health (Medizin und Gesundheit) [none]
dc.title: Lessons learned: overcoming common challenges in reconstructing the SARS-CoV-2 genome from short-read sequencing data via CoVpipe2 [none]
dc.type: article
dc.identifier.urn: urn:nbn:de:0257-176904/13720-9
dc.type.version: publishedVersion [none]
local.edoc.container-title: F1000Research [none]
local.edoc.type-name: Journal article (Zeitschriftenartikel)
local.edoc.container-type: periodical
local.edoc.container-type-name: Journal (Zeitschrift)
local.edoc.container-publisher-name: F1000 Research Ltd. [none]
local.edoc.container-reportyear: 2023 [none]
local.edoc.container-firstpage: 1 [none]
local.edoc.container-lastpage: 27 [none]
dc.description.version: Peer Reviewed [none]
