SuRankCo: supervised ranking of contigs in de novo assemblies
dc.contributor.author | Kuhring, Mathias | |
dc.contributor.author | Dabrowski, Piotr Wojtek | |
dc.contributor.author | Piro, Vitor C. | |
dc.contributor.author | Nitsche, Andreas | |
dc.contributor.author | Renard, Bernhard Y. | |
dc.date.accessioned | 2018-05-07T18:23:22Z | |
dc.date.available | 2018-05-07T18:23:22Z | |
dc.date.created | 2015-08-06 | |
dc.date.issued | 2015-07-30 | none |
dc.identifier.other | http://edoc.rki.de/oa/articles/reuMFUyptp8s/PDF/26EWOLwHWwpXo.pdf | |
dc.identifier.uri | http://edoc.rki.de/176904/2108 | |
dc.description.abstract | Background: Evaluating the quality and reliability of a de novo assembly and of single contigs in particular is challenging since commonly a ground truth is not readily available and numerous factors may influence results. Currently available procedures provide assembly scores but lack a comparative quality ranking of contigs within an assembly. Results: We present SuRankCo, which relies on a machine learning approach to predict quality scores for contigs and to enable the ranking of contigs within an assembly. The result is a sorted contig set which allows selective contig usage in downstream analysis. Benchmarking on datasets with known ground truth shows promising sensitivity and specificity and favorable comparison to existing methodology. Conclusions: SuRankCo analyzes the reliability of de novo assemblies on the contig level and thereby allows quality control and ranking prior to further downstream and validation experiments. | eng |
dc.language.iso | eng | |
dc.publisher | Robert Koch-Institut | |
dc.subject | Algorithms | eng |
dc.subject | Software | eng |
dc.subject | Escherichia coli/genetics | eng |
dc.subject | Quality control | eng |
dc.subject | De novo assembly | eng |
dc.subject | Genome assembly | eng |
dc.subject | Next generation sequencing | eng |
dc.subject | Contigs | eng |
dc.subject | Machine learning | eng |
dc.subject | Random forest | eng |
dc.subject | Escherichia coli/metabolism | eng |
dc.subject | Contig Mapping/methods | eng |
dc.subject | ROC Curve | eng |
dc.subject.ddc | 610 Medicine | |
dc.title | SuRankCo: supervised ranking of contigs in de novo assemblies | |
dc.type | periodicalPart | |
dc.identifier.urn | urn:nbn:de:0257-10040193 | |
dc.identifier.doi | 10.1186/s12859-015-0644-7 | |
dc.identifier.doi | http://dx.doi.org/10.25646/2033 | |
local.edoc.container-title | BMC Bioinformatics | |
local.edoc.fp-subtype | Article | |
local.edoc.type-name | Journal article | |
local.edoc.container-type | periodical | |
local.edoc.container-type-name | Journal | |
local.edoc.container-url | http://www.biomedcentral.com/1471-2105/16/240 | |
local.edoc.container-publisher-name | BioMed Central | |
local.edoc.container-volume | 16 | |
local.edoc.container-issue | 240 | |
local.edoc.container-year | 2015 |