
2024-08-08 Journal article
Impact of reference design on estimating SARS-CoV-2 lineage abundances from wastewater sequencing data
dc.contributor.author: Aßmann, Eva
dc.contributor.author: Agrawal, Shelesh
dc.contributor.author: Orschler, Laura
dc.contributor.author: Böttcher, Sindy
dc.contributor.author: Lackner, Susanne
dc.contributor.author: Hölzer, Martin
dc.date.accessioned: 2026-02-12T12:36:08Z
dc.date.available: 2026-02-12T12:36:08Z
dc.date.issued: 2024-08-08
dc.identifier.other: 10.1093/gigascience/giae051
dc.identifier.uri: http://edoc.rki.de/176904/13333
dc.description.abstract: Background: Sequencing of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) RNA from wastewater samples has emerged as a valuable tool for detecting the presence and relative abundances of SARS-CoV-2 variants in a community. By analyzing the viral genetic material present in wastewater, researchers and public health authorities can gain early insights into the spread of virus lineages and emerging mutations. Constructing reference datasets from known SARS-CoV-2 lineages and their mutation profiles has become state-of-the-art for assigning viral lineages and their relative abundances from wastewater sequencing data. However, selecting reference sequences or mutations directly affects the predictive power. Results: Here, we show the impact of a mutation- and sequence-based reference reconstruction for SARS-CoV-2 abundance estimation. We benchmark 3 datasets: (i) synthetic “spike-in” mixtures; (ii) German wastewater samples from early 2021, mainly comprising Alpha; and (iii) samples obtained from wastewater at an international airport in Germany from the end of 2021, including first signals of Omicron. The 2 approaches differ in sublineage detection, with the marker mutation-based method, in particular, being challenged by the increasing number of mutations and lineages. However, the estimations of both approaches depend on selecting representative references and optimized parameter settings. By performing parameter escalation experiments, we demonstrate the effects of reference size and alternative allele frequency cutoffs for abundance estimation. We show how different parameter settings can lead to different results for our test datasets and illustrate the effects of virus lineage composition of wastewater samples and references. Conclusions: Our study highlights current computational challenges, focusing on the general reference design, which directly impacts abundance allocations. We illustrate advantages and disadvantages that may be relevant for further developments in the wastewater community and in the context of defining robust quality metrics. [eng]
dc.language.iso: eng
dc.publisher: Robert Koch-Institut
dc.rights: (CC BY 3.0 DE) Attribution 3.0 Germany
dc.rights.uri: http://creativecommons.org/licenses/by/3.0/de/
dc.subject: SARS-CoV-2 [eng]
dc.subject: wastewater [eng]
dc.subject: sewage [eng]
dc.subject: abundance estimation [eng]
dc.subject: next-generation sequencing [eng]
dc.subject: benchmark [eng]
dc.subject.ddc: 610 Medicine and health
dc.title: Impact of reference design on estimating SARS-CoV-2 lineage abundances from wastewater sequencing data
dc.type: article
dc.identifier.urn: urn:nbn:de:0257-176904/13333-0
dc.type.version: publishedVersion
local.edoc.container-title: GigaScience
local.edoc.type-name: Journal article
local.edoc.container-type: periodical
local.edoc.container-type-name: Journal
local.edoc.container-publisher-name: Oxford University Press
local.edoc.container-reportyear: 2024
local.edoc.container-firstpage: 1
local.edoc.container-lastpage: 16
dc.description.version: Peer Reviewed
