Theoretical and Practical Analyses in Metagenomic Sequence Classification
Conference proceedings contribution
Publication date:
2019
Abstract:
Metagenomics is the study of genomic sequences in a heterogeneous
microbial sample taken, e.g., from soil, water, human microbiome
and skin. One of the primary objectives of metagenomic studies
is to assign a taxonomic identity to each read sequenced from a sample
and then to estimate the abundance of the known clades. With
ever-increasing metagenomic datasets obtained from high-throughput
sequencing technologies readily available nowadays, several fast and accurate
methods have been developed that can work with reasonable computing
requirements. Here we provide an overview of the state-of-the-art
methods for the classification of metagenomic sequences, especially
highlighting theoretical factors that seem to correlate well with practical
factors, and could therefore be useful in the choice or development of a
new method in experimental contexts. In particular, we emphasize that
the information derived from the known genomes and eventually used in
the learning and classification processes may create several experimental
issues (mostly based on the amount of information used in the processes
and its uniqueness, significance, and redundancy), and some of these
issues are intrinsic both in current alignment-based approaches and in
compositional ones. This entails the need to develop efficient alignment-free
methods that overcome such problems by combining the learning
and classification processes in a single framework.
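As an illustration of the two objectives named in the abstract (assigning a taxonomic identity to each read, then estimating clade abundance), the following is a minimal sketch of a generic alignment-free, k-mer-based classifier. It is not the method of any tool covered by the overview: the function names, the toy reference sequences, the choice of k, and the rule of voting only with k-mers unique to one clade are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def kmers(seq, k):
    """Yield every overlapping substring of length k in seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_index(references, k):
    """Learning step: map each k-mer to the set of clades containing it.

    `references` is a hypothetical dict {clade_name: genome_string}.
    """
    index = defaultdict(set)
    for clade, genome in references.items():
        for kmer in kmers(genome, k):
            index[kmer].add(clade)
    return index


def classify(read, index, k):
    """Classification step: vote with k-mers that occur in exactly one clade.

    Returns the winning clade, or None when the read has no clade-unique
    k-mer or when the two best clades are tied.
    """
    votes = Counter()
    for kmer in kmers(read, k):
        clades = index.get(kmer)  # .get() avoids adding keys to the index
        if clades is not None and len(clades) == 1:
            votes[next(iter(clades))] += 1
    if not votes:
        return None
    ranked = votes.most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


def abundances(reads, index, k):
    """Estimate relative clade abundance from the classified reads only."""
    labels = (classify(read, index, k) for read in reads)
    counts = Counter(label for label in labels if label is not None)
    total = sum(counts.values())
    return {clade: n / total for clade, n in counts.items()} if total else {}
```

In this sketch, k-mers shared by several clades are ignored, which is one simple way to account for the uniqueness and redundancy of the reference information. Reads made up only of shared k-mers stay unclassified and do not contribute to the abundance estimate.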
CRIS type:
04.01 Conference proceedings contribution
Keywords:
Metagenomic sequence classification; Alignment-free algorithms; Genome analysis; Combinatorics; Pattern discovery; Strings
Authors:
Verzotto, Davide
Book title:
Database and Expert Systems Applications (DEXA 2019)
Published in: