
2021-02-01, Journal article, DOI: 10.25646/9594
Interpretable detection of novel human viruses from genome sequencing data
dc.contributor.author: Bartoszewicz, Jakub M.
dc.contributor.author: Seidel, Anja
dc.contributor.author: Renard, Bernhard Y.
dc.date.accessioned: 2022-01-31T09:38:27Z
dc.date.available: 2022-01-31T09:38:27Z
dc.date.issued: 2021-02-01
dc.identifier.other: 10.1093/nargab/lqab004
dc.identifier.uri: http://edoc.rki.de/176904/9289
dc.description.abstract: Viruses evolve extremely quickly, so reliable methods for viral host prediction are necessary to safeguard biosecurity and biosafety alike. Novel human-infecting viruses are difficult to detect with standard bioinformatics workflows. Here, we predict whether a virus can infect humans directly from next-generation sequencing reads. We show that deep neural architectures significantly outperform both shallow machine learning and standard, homology-based algorithms, cutting the error rates in half and generalizing to taxonomic units distant from those presented during training. Further, we develop a suite of interpretability tools and show that it can be applied also to other models beyond the host prediction task. We propose a new approach for convolutional filter visualization to disentangle the information content of each nucleotide from its contribution to the final classification decision. Nucleotide-resolution maps of the learned associations between pathogen genomes and the infectious phenotype can be used to detect regions of interest in novel agents, for example, the SARS-CoV-2 coronavirus, unknown before it caused a COVID-19 pandemic in 2020. All methods presented here are implemented as easy-to-install packages not only enabling analysis of NGS datasets without requiring any deep learning skills, but also allowing advanced users to easily train and explain new models for genomics.
dc.language.iso: eng
dc.publisher: Robert Koch-Institut
dc.rights: (CC BY 3.0 DE) Attribution 3.0 Germany
dc.rights.uri: http://creativecommons.org/licenses/by/3.0/de/
dc.subject.ddc: 610 Medicine and health
dc.title: Interpretable detection of novel human viruses from genome sequencing data
dc.type: article
dc.identifier.urn: urn:nbn:de:0257-176904/9289-9
dc.identifier.doi: http://dx.doi.org/10.25646/9594
dc.type.version: publishedVersion
local.edoc.container-title: NAR Genomics and Bioinformatics
local.edoc.type-name: Journal article
local.edoc.container-type: periodical
local.edoc.container-type-name: Journal
local.edoc.container-url: https://academic.oup.com/nargab/article/3/1/lqab004/6125551
local.edoc.container-publisher-name: Oxford University Press
local.edoc.container-volume: 3
local.edoc.container-issue: 1
local.edoc.container-year: 2021
local.edoc.container-firstpage: 1
local.edoc.container-lastpage: 14
local.edoc.rki-department: Methodenentwicklung und Forschungsinfrastruktur (Methods Development and Research Infrastructure)
dc.description.version: Peer Reviewed
