Skip to Main Content (Press Enter)

Logo CNR
  • ×
  • Home
  • Persone
  • Pubblicazioni
  • Strutture
  • Competenze

UNI-FIND
Logo CNR

|

UNI-FIND

cnr.it
  • ×
  • Home
  • Persone
  • Pubblicazioni
  • Strutture
  • Competenze
  1. Pubblicazioni

Haplotyping populations by pure parsinomy: complexity, exact and approximation algorithms

Articolo
Data di Pubblicazione:
2004
Abstract:
In this paper we address the Pure Parsimony Haplotyping problem: Given a set of genotypes G, find a minimum cardinality set of haplotypes which explains G. A genotype g is an n-dimensional vector in {0, 1, 2} and a haplotype h is an n-dimensional binary vector. A set of haplotypes H is said to explain G if for every g 2 G there are h1, h2 2 H such that h1 h2 = g. The position-wise sum h1 h2 indicates the genotype which has a 2 in the positions where h1 and h2 disagree, and the same value as h1 and h2 where they agree. In this paper we show the APX-hardness of the problem even when the number of "2" symbols is at most 3 for every g 2 G. We then give two Integer Linear Programming formulations. From the first one (appeared also in [9]) we derive a 2k-1-approximation algorithm when the number of symbols "2" is at most k for every g 2 G. This formulation has an exponential number of variables and constraints. The second formulation is new, and has a polynomial number of variables and constraints. Finally, we give approximation algorithms not based on Linear Programming, such as a trivial p|G|-approximation algorithm for the general case, and an effective probabilistic algorithm with a performance guarantee of 2k+1blog |G|c (1 + dln |G|e) when the number of symbols "2" is at most k for every g 2 G. The expected running time of the algorithm turns out to be almost linear in the input size. 1 Introduction A Single Nucleotide Polymorphism (SNP) is a site of the human genome (i.e., the posi- tion of a specific nucleotide), whose content shows a statistically significant variability within a population. A position is considered a SNP if for the minority of the pop- ulation (as long as it consists of at least 5% of the population) a certain nucleotide
Tipologia CRIS:
01.01 Articolo in rivista
Keywords:
Distributed Systems
Elenco autori:
Pinotti, MARIA CRISTINA
Link alla scheda completa:
https://iris.cnr.it/handle/20.500.14243/39987
Pubblicato in:
INFORMS JOURNAL ON COMPUTING
Journal
  • Dati Generali

Dati Generali

URL

http://joc.journal.informs.org/content/16/4.toc
  • Utilizzo dei cookie

Realizzato con VIVO | Designed by Cineca | 26.5.0.0 | Sorgente dati: PREPROD (Ribaltamento disabilitato)