UNI-FIND

Matchtigs: minimum plain text representation of k-mer sets

Article
Publication date:
2023
Abstract:
We propose a polynomial algorithm computing a minimum plain-text representation of k-mer sets, as well as an efficient near-minimum greedy heuristic. When compressing read sets of large model organisms or bacterial pangenomes, with only a minor runtime increase, we shrink the representation by up to 59% over unitigs and 26% over previous work. Additionally, the number of strings is decreased by up to 97% over unitigs and 90% over previous work. Finally, a small representation has advantages in downstream applications, as it speeds up SSHash-Lite queries by up to 4.26× over unitigs and 2.10× over previous work.
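
To make the quantities in the abstract concrete, the following minimal sketch shows what it means for a set of strings to be a plain-text representation of a k-mer set, and how its size could be measured by cumulative length and number of strings. It is not the algorithm from the paper: the helper names `kmers` and `representation_size` are hypothetical, the toy strings are invented, and reverse complements are ignored for simplicity.

```python
def kmers(strings, k):
    """Return the set of all length-k substrings occurring in the given strings."""
    return {s[i:i + k] for s in strings for i in range(len(s) - k + 1)}


def representation_size(strings):
    """Return (cumulative length in characters, number of strings)."""
    return sum(len(s) for s in strings), len(strings)


k = 3
# Toy data (assumption): two string sets that spell exactly the same 3-mers.
one_per_kmer = ["ACG", "CGT", "GTT"]  # one string per k-mer
merged = ["ACGTT"]                    # overlapping k-mers glued into one string

# Both are valid representations of the same k-mer set ...
same_set = kmers(one_per_kmer, k) == kmers(merged, k)

# ... but they differ in cumulative length and in number of strings.
size_one_per_kmer = representation_size(one_per_kmer)
size_merged = representation_size(merged)
```

In this toy case the merged representation needs fewer characters and fewer strings for the same k-mer set, which is the kind of saving the abstract reports relative to unitigs.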
CRIS type:
01.01 Journal article
Keywords:
k-mer sets; Plain text compression; Genomic sequences; Graph algorithm; Sequence analysis; Minimum-cost flow; Chinese postman problem
Author list:
Pibiri, Giulio Ermanno
Link to the full record:
https://iris.cnr.it/handle/20.500.14243/464352
Published in:
GENOME BIOLOGY (ONLINE)
Journal
General data

URL

https://genomebiology.biomedcentral.com/articles/10.1186/s13059-023-02968-z