A neural network based multi-classifier system for gene identification in DNA sequences

Romesh Ranawana, Vasile Palade

Research output: Contribution to journal, Article

43 Citations (Scopus)

Abstract

The paper presents a neural network based multi-classifier system for identifying Escherichia coli promoter sequences in strings of DNA. Because each gene in DNA is preceded by a promoter sequence, successfully locating an E. coli promoter leads to the identification of the corresponding E. coli gene in the DNA sequence. A set of 324 known E. coli promoters and a set of 429 known non-promoter sequences were encoded using four different encoding methods, and the encoded sequences were used to train four different neural networks. The classification results of the four individual networks were then combined through an aggregation function based on a variation of the logarithmic opinion pool method, with the weights of this function determined by a genetic algorithm. The multi-classifier system was tested on 159 known promoter sequences and 171 non-promoter sequences not contained in the training set. The results show that the same data set, when presented to neural networks in different forms, can yield slightly varying results. They also show that when the opinions of several classifiers on the same input data are integrated within a multi-classifier system, the combined results are better than those of the individual neural networks. The multi-classifier system outperforms other prediction systems for E. coli promoters developed so far.
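The aggregation step described in the abstract can be illustrated with a minimal sketch of a standard weighted logarithmic opinion pool for a two-class problem (promoter vs. non-promoter). This is not the authors' implementation: the abstract states only that a variation of the method was used, so the function name `log_opinion_pool`, the example scores, and the example weights are all assumptions made for illustration. In the paper the weights were found by a genetic algorithm; here they are simply supplied by hand.

```python
import math


def log_opinion_pool(probs, weights):
    """Combine per-classifier promoter probabilities with a weighted
    logarithmic opinion pool: P(c) is proportional to prod_i p_i(c) ** w_i.

    probs   -- probability of the "promoter" class from each classifier
    weights -- one non-negative weight per classifier (hypothetical values;
               the paper tunes these with a genetic algorithm)
    """
    if len(probs) != len(weights):
        raise ValueError("one weight per classifier is required")

    eps = 1e-12  # clamp probabilities so that log(0) never occurs
    log_pos = 0.0
    log_neg = 0.0
    for p, w in zip(probs, weights):
        p = min(max(p, eps), 1.0 - eps)
        log_pos += w * math.log(p)        # weighted log-score for "promoter"
        log_neg += w * math.log(1.0 - p)  # weighted log-score for "non-promoter"

    # Normalise over the two classes, subtracting the maximum for stability.
    m = max(log_pos, log_neg)
    pos = math.exp(log_pos - m)
    neg = math.exp(log_neg - m)
    return pos / (pos + neg)


if __name__ == "__main__":
    # Four illustrative network outputs, one per encoding (made-up numbers).
    scores = [0.9, 0.6, 0.8, 0.4]
    weights = [0.4, 0.1, 0.3, 0.2]
    combined = log_opinion_pool(scores, weights)
    print("promoter" if combined >= 0.5 else "non-promoter", round(combined, 3))
```

Because the pool multiplies probabilities rather than averaging them, a single confident classifier with a large weight can dominate the combined decision, which is one reason the choice of weights matters.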

Original language: English
Pages (from-to): 122-131
Number of pages: 10
Journal: Neural Computing and Applications
Volume: 14
Issue number: 2
Early online date: 24 Nov 2004
DOI: 10.1007/s00521-004-0447-7
Publication status: Published - Jul 2005
Externally published: Yes

Keywords

  • Genetic algorithms
  • Multi-classifier systems
  • Neural network optimization
  • Neural networks
  • Promoter recognition

ASJC Scopus subject areas

  • Software
  • Artificial Intelligence

Cite this

A neural network based multi-classifier system for gene identification in DNA sequences. / Ranawana, Romesh; Palade, Vasile.

In: Neural Computing and Applications, Vol. 14, No. 2, 07.2005, p. 122-131.

@article{fe158a0d10ef4b708d558fa51c491c9c,
title = "A neural network based multi-classifier system for gene identification in DNA sequences",
abstract = "The paper presents a neural network based multi-classifier system for the identification of Escherichia coli promoter sequences in strings of DNA. As each gene in DNA is preceded by a promoter sequence, the successful location of an E. coli promoter leads to the identification of the corresponding E. coli gene in the DNA sequence. A set of 324 known E. coli promoters and a set of 429 known non-promoter sequences were encoded using four different encoding methods. The encoded sequences were then used to train four different neural networks. The classification results of the four individual neural networks were then combined through an aggregation function, which used a variation of the logarithmic opinion pool method. The weights of this function were determined by a genetic algorithm. The multi-classifier system was then tested on 159 known promoter sequences and 171 non-promoter sequences not contained in the training set. The results obtained through this study proved that the same data set, when presented to neural networks in different forms, can provide slightly varying results. It also proves that when different opinions of more classifiers on the same input data are integrated within a multi-classifier system, we can obtain results that are better than the individual performances of the neural networks. The performances of our multi-classifier system outperform the results of other prediction systems for E. coli promoters developed so far.",
keywords = "Genetic algorithms, Multi-classifier systems, Neural network optimization, Neural networks, Promoter recognition",
author = "Romesh Ranawana and Vasile Palade",
year = "2005",
month = "7",
doi = "10.1007/s00521-004-0447-7",
language = "English",
volume = "14",
pages = "122--131",
journal = "Neural Computing and Applications",
issn = "0941-0643",
publisher = "Springer Verlag",
number = "2",
}

TY - JOUR
T1 - A neural network based multi-classifier system for gene identification in DNA sequences
AU - Ranawana, Romesh
AU - Palade, Vasile
PY - 2005/7
Y1 - 2005/7
AB - The paper presents a neural network based multi-classifier system for the identification of Escherichia coli promoter sequences in strings of DNA. As each gene in DNA is preceded by a promoter sequence, the successful location of an E. coli promoter leads to the identification of the corresponding E. coli gene in the DNA sequence. A set of 324 known E. coli promoters and a set of 429 known non-promoter sequences were encoded using four different encoding methods. The encoded sequences were then used to train four different neural networks. The classification results of the four individual neural networks were then combined through an aggregation function, which used a variation of the logarithmic opinion pool method. The weights of this function were determined by a genetic algorithm. The multi-classifier system was then tested on 159 known promoter sequences and 171 non-promoter sequences not contained in the training set. The results obtained through this study proved that the same data set, when presented to neural networks in different forms, can provide slightly varying results. It also proves that when different opinions of more classifiers on the same input data are integrated within a multi-classifier system, we can obtain results that are better than the individual performances of the neural networks. The performances of our multi-classifier system outperform the results of other prediction systems for E. coli promoters developed so far.

KW - Genetic algorithms
KW - Multi-classifier systems
KW - Neural network optimization
KW - Neural networks
KW - Promoter recognition
UR - http://www.scopus.com/inward/record.url?scp=22144436656&partnerID=8YFLogxK
U2 - 10.1007/s00521-004-0447-7
DO - 10.1007/s00521-004-0447-7
M3 - Article
VL - 14
SP - 122
EP - 131
JO - Neural Computing and Applications
JF - Neural Computing and Applications
SN - 0941-0643
IS - 2
ER -