Identifying subset errors in multiple sequence alignments

Aparna Roy, Bruck Taddese, Shabana Vohra, Phani K. Thimmaraju, Christopher J.R. Illingworth, Lisa M. Simpson, Keya Mukherjee, Christopher A. Reynolds, Sree V. Chintapalli

Research output: Contribution to journal › Article

3 Citations (Scopus)

Abstract

Multiple sequence alignment (MSA) accuracy is important, but there is no widely accepted method of judging the accuracy that different alignment algorithms give. We present a simple approach to detecting two types of error, namely block shifts and the misplacement of residues within a gap. Given an MSA, subsets of very similar sequences are generated through the use of a redundancy filter, typically using a 70-90% sequence identity cut-off. Subsets thus produced are typically small and degenerate, and errors can be easily detected even by manual examination. The errors, albeit minor, are inevitably associated with gaps in the alignment, and so the procedure is particularly relevant to homology modelling of protein loop regions. The usefulness of the approach is illustrated in the context of the universal but little-known [K/R]KLH motif that occurs in intracellular loop 1 of G protein-coupled receptors (GPCRs); other issues relevant to GPCR modelling are also discussed.
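The subset step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the greedy grouping strategy, the identity definition (matches over columns where both sequences carry a residue) and the toy alignment are all assumptions made for illustration. It groups already-aligned sequences whose identity to a subset's first member meets a cut-off, then prints each subset so that gap placements can be compared by eye.

```python
def pairwise_identity(a, b, gap="-"):
    """Fraction of identical residues over columns where both rows have a residue."""
    matches = compared = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if x == y:
            matches += 1
    return matches / compared if compared else 0.0


def similar_subsets(alignment, cutoff=0.8):
    """Greedily group sequences by identity to the first member of each subset.

    `alignment` maps sequence names to equal-length aligned strings.
    The greedy strategy is an assumption; real redundancy filters may differ.
    """
    subsets = []
    for name, seq in alignment.items():
        for subset in subsets:
            representative = alignment[subset[0]]
            if pairwise_identity(representative, seq) >= cutoff:
                subset.append(name)
                break
        else:
            subsets.append([name])  # no close match: start a new subset
    return subsets


def strip_gap_only_columns(alignment, names, gap="-"):
    """Return the rows for `names` with columns that are gaps in every row removed."""
    rows = [alignment[n] for n in names]
    keep = [i for i in range(len(rows[0])) if any(r[i] != gap for r in rows)]
    return {n: "".join(alignment[n][i] for i in keep) for n in names}


# Hypothetical toy alignment (not data from the paper). seqA and seqB have
# identical residues but place the gap on opposite sides of one residue,
# which is the kind of within-gap misplacement a small subset makes visible.
toy = {
    "seqA": "MKT-LLKLHAV",
    "seqB": "MKTL-LKLHAV",
    "seqC": "MRT-LLKLHAI",
    "seqD": "GGSWWPPQ-DE",
}

for subset in similar_subsets(toy, cutoff=0.7):
    for name, row in strip_gap_only_columns(toy, subset).items():
        print(name, row)
    print()
```

In this toy case seqA, seqB and seqC fall into one subset at a 70% cut-off, and the inconsistent gap position between seqA and seqB stands out once the dissimilar seqD is set aside. Deciding which placement is correct still requires inspection, as the abstract notes.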

Original language: English
Pages (from-to): 364-371
Number of pages: 8
Journal: Journal of Biomolecular Structure and Dynamics
Volume: 32
Issue number: 3
Early online date: 25 Mar 2013
DOIs: https://doi.org/10.1080/07391102.2013.770371
Publication status: Published - 4 Mar 2014
Externally published: Yes

Keywords

  • alignment accuracy
  • alignment errors
  • errors
  • homology modelling
  • multiple sequence alignments
  • redundancy

ASJC Scopus subject areas

  • Structural Biology
  • Molecular Biology


  • Cite this

    Roy, A., Taddese, B., Vohra, S., Thimmaraju, P. K., Illingworth, C. J. R., Simpson, L. M., ... Chintapalli, S. V. (2014). Identifying subset errors in multiple sequence alignments. Journal of Biomolecular Structure and Dynamics, 32(3), 364-371. https://doi.org/10.1080/07391102.2013.770371