Improved haplotype assembly using Xor genotypes

Sayyed R. Mousavi

Research output: Contribution to journal › Article › peer-review

2 Citations (Scopus)


Given a set of aligned fragments, haplotype assembly is the problem of reconstructing the haplotypes from which the fragments were read. The problem is important because haplotypes carry SNP information, which is essential to many genomic analyses, such as studies of potential associations between certain diseases and genetic variations. The current state-of-the-art haplotype assembly algorithm, HapSAT, does not exploit genotype information and receives only a read matrix as input. However, the importance of haplotypes and the low cost of genotype information motivate exploiting genotype information to obtain more accurate haplotypes. In this paper, an improved haplotype assembly method, xGenHapSAT, is proposed, which exploits xor genotype information for more accurate haplotype assembly. Xor genotype information is even less expensive to obtain than full genotype information, e.g., using the Denaturing High-Performance Liquid Chromatography (DHPLC) technique. It is shown that using this inexpensively obtainable information significantly improves the accuracy of the assembled haplotypes. In addition, xGenHapSAT adopts a new, more efficient Max-2-SAT formulation, which, on average, increases the speed of the algorithm. Moreover, the proposed xGenHapSAT method supersedes the current state-of-the-art haplotype assembly method based on genotype information. Finally, our state-of-the-art haplotype assembly software, HapSoft, which includes both xGenHapSAT and HapSAT, is made freely available for research purposes.
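To make the notion of a xor genotype concrete, the following minimal sketch assumes the common binary encoding of biallelic SNP sites (each haplotype is a string of `0`s and `1`s). Under that assumption, the xor genotype marks each site as heterozygous (`1`) or homozygous (`0`) without revealing which allele a homozygous site carries. The function names `xor_genotype` and `second_haplotype` are hypothetical and chosen for illustration only; this is not the xGenHapSAT algorithm or its Max-2-SAT formulation, just the constraint that xor genotype information places on a haplotype pair.

```python
def xor_genotype(h1: str, h2: str) -> str:
    """Site-wise xor of two binary haplotypes.

    A '1' marks a heterozygous site (the haplotypes differ);
    a '0' marks a homozygous site (the haplotypes agree).
    """
    if len(h1) != len(h2):
        raise ValueError("haplotypes must have equal length")
    return "".join("1" if a != b else "0" for a, b in zip(h1, h2))


def second_haplotype(h1: str, xg: str) -> str:
    """Recover the second haplotype from the first one and the xor genotype.

    At heterozygous sites the second haplotype is the complement of the
    first; at homozygous sites it is identical to the first.
    """
    if len(h1) != len(xg):
        raise ValueError("haplotype and xor genotype must have equal length")
    return "".join(str(int(a) ^ int(x)) for a, x in zip(h1, xg))


# Illustrative (made-up) haplotype pair over five SNP sites.
h1 = "01101"
h2 = "00111"
xg = xor_genotype(h1, h2)          # heterozygous at the 2nd and 4th sites
recovered = second_haplotype(h1, xg)
```

Because the xor genotype fixes, at every site, whether the two haplotypes agree or differ, an assembler that knows it only has to determine one haplotype; the other follows directly, as `second_haplotype` shows.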

Original language: English
Pages (from-to): 122-130
Number of pages: 9
Journal: Journal of Theoretical Biology
Early online date: 12 Jan 2012
Publication status: Published - 7 Apr 2012
Externally published: Yes


Keywords

  • Computational biology
  • Haplotype assembly
  • Single individual haplotyping
  • SNP
  • Xor genotype

ASJC Scopus subject areas

  • Statistics and Probability
  • Modelling and Simulation
  • Biochemistry, Genetics and Molecular Biology(all)
  • Immunology and Microbiology(all)
  • Agricultural and Biological Sciences(all)
  • Applied Mathematics

