Maximum likelihood model based on minor allele frequencies and weighted Max-SAT formulation for haplotype assembly

Sayyed R. Mousavi, Ilnaz Khodadadi, Hossein Falsafain, Reza Nadimi, Nasser Ghadiri

Research output: Contribution to journal › Article › peer-review

2 Citations (Scopus)

Abstract

Human haplotypes carry essential information about single nucleotide polymorphisms (SNPs), which in turn inform studies that relate diseases to their potential genetic causes, such as Genome-Wide Association Studies. Because determining haplotypes directly is expensive, and because of recent progress in high-throughput sequencing, there is growing motivation for haplotype assembly: the problem of finding a pair of haplotypes from a set of aligned fragments. Although the problem has been studied extensively and a number of algorithms have been proposed, more accurate methods remain valuable given the importance of haplotype information. In this paper, we first develop a probabilistic model that incorporates the Minor Allele Frequency (MAF) of SNP sites, which existing maximum likelihood models omit. We then show that the probabilistic model reduces to the Minimum Error Correction (MEC) model when the MAF information is omitted and some approximations are made. This result provides novel theoretical support for the MEC, despite some criticisms of it in the recent literature. Next, under the same approximations, we simplify the model to an extension of the MEC that uses the MAF information. Finally, we extend the haplotype assembly algorithm HapSAT by developing a weighted Max-SAT formulation for the simplified model, which is evaluated empirically with positive results.
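To make the MEC objective mentioned in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes fragments are strings over `0`, `1`, and `-` (a gap for uncovered SNP sites), all of the same length; the function names and the exhaustive search are illustrative assumptions. The MAF-weighted objective and the weighted Max-SAT encoding are not sketched, because the abstract does not state their form.

```python
from itertools import product

GAP = "-"  # assumed marker for SNP sites a fragment does not cover


def mismatches(fragment, haplotype):
    """Count covered positions where the fragment disagrees with the haplotype."""
    return sum(1 for f, h in zip(fragment, haplotype) if f != GAP and f != h)


def mec_score(fragments, h1, h2):
    """MEC cost: each fragment is corrected toward whichever haplotype is closer."""
    return sum(min(mismatches(f, h1), mismatches(f, h2)) for f in fragments)


def brute_force_mec(fragments):
    """Try every haplotype pair; only feasible for a handful of SNP sites."""
    n = len(fragments[0])
    candidates = ["".join(bits) for bits in product("01", repeat=n)]
    return min(
        (mec_score(fragments, a, b), a, b)
        for a in candidates
        for b in candidates
    )


# Hypothetical toy input: six aligned fragments over four SNP sites.
fragments = ["010-", "0101", "-010", "1010", "101-", "1110"]
score, h1, h2 = brute_force_mec(fragments)
print(score, h1, h2)
```

The exhaustive search is exponential in the number of SNP sites and is included only to show what is being minimized; practical solvers such as the Max-SAT approach named in the abstract exist precisely to avoid it.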

Original language: English
Pages (from-to): 49-56
Number of pages: 8
Journal: Journal of Theoretical Biology
Volume: 350
Early online date: 31 Jan 2014
Publication status: Published - 7 Jun 2014
Externally published: Yes

Keywords

  • Algorithms
  • Haplotype reconstruction
  • Minimum error correction
  • Single individual haplotyping
  • Single nucleotide polymorphism

ASJC Scopus subject areas

  • Statistics and Probability
  • Modelling and Simulation
  • Biochemistry, Genetics and Molecular Biology (all)
  • Immunology and Microbiology (all)
  • Agricultural and Biological Sciences (all)
  • Applied Mathematics
