Pan-conserved segment tags identify ultra-conserved sequences across assemblies in the human pangenome
Academic Article
Publication Date:
2023
Abstract:
The human pangenome, a new reference sequence, addresses many limitations of the current GRCh38 reference.
The first release is based on 94 high-quality haploid assemblies from individuals with diverse backgrounds.
We employed a k-mer indexing strategy for comparative analysis across multiple assemblies,
including the pangenome reference, GRCh38, and CHM13, a telomere-to-telomere reference assembly.
Our k-mer indexing approach enabled us to identify a valuable collection of universally conserved sequences
across all assemblies, referred to as "pan-conserved segment tags" (PSTs). By examining intervals between
these segments, we discerned highly conserved genomic segments and those with structurally related polymorphisms.
We found 60,764 polymorphic intervals with unique geo-ethnic features in the pangenome reference.
In this study, we utilized ultra-conserved sequences (PSTs) to forge a link between human pangenome
assemblies and reference genomes. This methodology enables the examination of any sequence of interest
within the pangenome, using the reference genome as a comparative framework.
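
The abstract does not spell out the indexing procedure, so the following is only a minimal sketch of the general idea, under stated assumptions: a tag is taken to be a k-mer that occurs exactly once in every assembly, and an interval is measured as the start-to-start distance between consecutive tags. The function names, the toy sequences, and the tiny value of k are all hypothetical; a real analysis would use much longer k-mers, handle reverse complements, and rely on a disk-backed index rather than in-memory sets.

```python
from __future__ import annotations

from collections import Counter


def unique_kmers(seq: str, k: int) -> set[str]:
    """Return the k-mers that occur exactly once in seq (forward strand only)."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return {kmer for kmer, n in counts.items() if n == 1}


def conserved_tags(assemblies: dict[str, str], k: int) -> set[str]:
    """Return k-mers that are unique within, and shared by, every assembly."""
    per_assembly = [unique_kmers(seq, k) for seq in assemblies.values()]
    return set.intersection(*per_assembly) if per_assembly else set()


def tag_intervals(seq: str, tags: set[str], k: int) -> list[tuple[str, str, int]]:
    """List (left_tag, right_tag, start_to_start_distance) for consecutive tags."""
    hits = [(i, seq[i:i + k]) for i in range(len(seq) - k + 1) if seq[i:i + k] in tags]
    return [(a, b, j - i) for (i, a), (j, b) in zip(hits, hits[1:])]


# Toy data (hypothetical): "alt" carries a 2-base insertion relative to "ref".
assemblies = {"ref": "ACGTTGCA", "alt": "ACGAATTGCA"}
tags = conserved_tags(assemblies, k=3)
for name, seq in assemblies.items():
    print(name, tag_intervals(seq, tags, k=3))
```

In this sketch, an interval whose length is identical in every assembly is a candidate conserved segment, while an interval whose length differs between assemblies (here, the one between the first two shared tags) is a candidate structural polymorphism.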
Iris type:
01.01 Journal article
Keywords:
k-mer; pan-conserved segment; pangenome; reference genome; structural polymorphism; structural variations.
List of contributors:
Colonna, Vincenza
Published in: