Linear dimensionality reduction for classification via a sequential Bayes error minimisation with an application to flow meter diagnostics

Research output: Contribution to journal › Article

7 Citations (Scopus)

Abstract

Supervised linear dimensionality reduction (LDR) performed prior to classification often improves the accuracy of classification by reducing overfitting and removing multicollinearity. If a Bayes classifier is to be used, then reduction to a dimensionality of $K-1$ is necessary and sufficient to preserve the classification information in the original feature space for the $K$-class problem. However, most of the existing algorithms provide no optimal dimensionality to which to reduce the data, thus classification information can be lost in the reduced space if $K-1$ dimensions are used. In this paper, we present a novel LDR technique to reduce the dimensionality of the original data to $K-1$, such that it is well-primed for Bayesian classification. This is done by sequentially constructing linear classifiers that minimise the Bayes error via a gradient descent procedure, under an assumption of normality. We experimentally validate the proposed algorithm on $10$ UCI datasets. Our algorithm is shown to be superior in terms of the classification accuracy when compared to existing algorithms including LDR based on Fisher's criterion and the Chernoff criterion. The applicability of our algorithm is then demonstrated by employing it in diagnosing the health states of $2$ ultrasonic flow meters. As with the UCI datasets, the proposed algorithm is found to have superior performance to the existing algorithms, achieving classification accuracies of $99.4\%$ and $97.5\%$ on the two flow meters. Such high classification accuracies on the flow meters promise significant cost benefits in oil and gas operations.
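The core step named in the abstract, fitting one linear classifier by gradient descent on its error under a normality assumption, can be illustrated with a small sketch. This is not the authors' implementation. It assumes two Gaussian classes with known means and covariances, uses the standard closed-form error of the rule "class 1 if w·x + b > 0", replaces an analytic gradient with finite differences, and does not show the sequential construction that yields $K-1$ dimensions. All function names and parameters (`linear_classifier_error`, `minimise_error`, `lr`, `n_iter`) are hypothetical.

```python
import math
import numpy as np


def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def linear_classifier_error(params, mu1, cov1, mu2, cov2, prior1=0.5):
    """Error of the rule 'class 1 if w.x + b > 0' for two Gaussian classes.

    params holds the weights w followed by the offset b.
    """
    w, b = params[:-1], params[-1]
    s1 = math.sqrt(w @ cov1 @ w)  # std. dev. of w.x under class 1
    s2 = math.sqrt(w @ cov2 @ w)  # std. dev. of w.x under class 2
    miss1 = normal_cdf(-(w @ mu1 + b) / s1)  # class 1 assigned to class 2
    miss2 = normal_cdf((w @ mu2 + b) / s2)   # class 2 assigned to class 1
    return prior1 * miss1 + (1.0 - prior1) * miss2


def numerical_gradient(f, x, eps=1e-6):
    """Central finite-difference gradient (a stand-in for an analytic one)."""
    grad = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        grad[i] = (f(x + step) - f(x - step)) / (2.0 * eps)
    return grad


def minimise_error(mu1, cov1, mu2, cov2, lr=0.5, n_iter=500):
    """Gradient descent on (w, b), halving the step when it fails to improve."""
    w0 = mu1 - mu2                     # start from the mean-difference direction
    b0 = -0.5 * w0 @ (mu1 + mu2)       # boundary through the midpoint of the means
    params = np.append(w0, b0).astype(float)

    def f(p):
        return linear_classifier_error(p, mu1, cov1, mu2, cov2)

    start = f(params)
    for _ in range(n_iter):
        candidate = params - lr * numerical_gradient(f, params)
        if f(candidate) < f(params):
            params = candidate         # accept only improving steps
        else:
            lr *= 0.5                  # otherwise shrink the step size
    return params, start, f(params)


# Illustrative heteroscedastic example (made-up numbers, not from the paper).
mu1, cov1 = np.array([0.0, 0.0]), np.eye(2)
mu2, cov2 = np.array([2.0, 1.0]), np.diag([4.0, 0.25])
params, start_error, final_error = minimise_error(mu1, cov1, mu2, cov2)
print(f"error before: {start_error:.4f}, after: {final_error:.4f}")
```

Because a step is accepted only when it lowers the error, the returned error never exceeds that of the starting direction. In a sequential scheme of the kind the abstract describes, each fitted `w` would supply one direction of the reduced space; how successive directions are chosen is specific to the paper and is not reproduced here.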

NOTICE: this is the author’s version of a work that was accepted for publication in Expert Systems with Applications. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Expert Systems with Applications, [91, (2017)] DOI: 10.1016/j.eswa.2017.09.010

© 2017, Elsevier. Licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International http://creativecommons.org/licenses/by-nc-nd/4.0/
Language: English
Pages: 252-262
Number of pages: 11
Journal: Expert Systems with Applications
Volume: 91
Early online date: 9 Sep 2017
DOI: 10.1016/j.eswa.2017.09.010
Publication status: Published - Jan 2018

Keywords

  • Linear dimensionality reduction
  • LDA
  • Heteroscedasticity
  • Bayes error
  • Flow meter diagnostics

Cite this

@article{d59eae9ba97841edae8b7e12dde8b6f2,
title = "Linear dimensionality reduction for classification via a sequential Bayes error minimisation with an application to flow meter diagnostics",
abstract = "Supervised linear dimensionality reduction (LDR) performed prior to classification often improves the accuracy of classification by reducing overfitting and removing multicollinearity. If a Bayes classifier is to be used, then reduction to a dimensionality of $K-1$ is necessary and sufficient to preserve the classification information in the original feature space for the $K$-class problem. However, most of the existing algorithms provide no optimal dimensionality to which to reduce the data, thus classification information can be lost in the reduced space if $K-1$ dimensions are used. In this paper, we present a novel LDR technique to reduce the dimensionality of the original data to $K-1$, such that it is well-primed for Bayesian classification. This is done by sequentially constructing linear classifiers that minimise the Bayes error via a gradient descent procedure, under an assumption of normality. We experimentally validate the proposed algorithm on $10$ UCI datasets. Our algorithm is shown to be superior in terms of the classification accuracy when compared to existing algorithms including LDR based on Fisher's criterion and the Chernoff criterion. The applicability of our algorithm is then demonstrated by employing it in diagnosing the health states of $2$ ultrasonic flow meters. As with the UCI datasets, the proposed algorithm is found to have superior performance to the existing algorithms, achieving classification accuracies of $99.4\%$ and $97.5\%$ on the two flow meters. Such high classification accuracies on the flow meters promise significant cost benefits in oil and gas operations. NOTICE: this is the author’s version of a work that was accepted for publication in Expert Systems with Applications. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Expert Systems with Applications, [91, (2017)] DOI: 10.1016/j.eswa.2017.09.010. {\textcopyright} 2017, Elsevier. Licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International http://creativecommons.org/licenses/by-nc-nd/4.0/",
keywords = "Linear dimensionality reduction, LDA, Heteroscedasticity, Bayes error, Flow meter diagnostics",
author = "Gyamfi, Kojo Sarfo and Brusey, James and Hunt, Andrew and Gaura, Elena",
year = "2018",
month = "1",
doi = "10.1016/j.eswa.2017.09.010",
language = "English",
volume = "91",
pages = "252--262",
journal = "Expert Systems with Applications",
issn = "0957-4174",
publisher = "Pergamon Press",

}

TY - JOUR

T1 - Linear dimensionality reduction for classification via a sequential Bayes error minimisation with an application to flow meter diagnostics

AU - Gyamfi, Kojo Sarfo

AU - Brusey, James

AU - Hunt, Andrew

AU - Gaura, Elena

PY - 2018/1

Y1 - 2018/1

N2 - Supervised linear dimensionality reduction (LDR) performed prior to classification often improves the accuracy of classification by reducing overfitting and removing multicollinearity. If a Bayes classifier is to be used, then reduction to a dimensionality of $K-1$ is necessary and sufficient to preserve the classification information in the original feature space for the $K$-class problem. However, most of the existing algorithms provide no optimal dimensionality to which to reduce the data, thus classification information can be lost in the reduced space if $K-1$ dimensions are used. In this paper, we present a novel LDR technique to reduce the dimensionality of the original data to $K-1$, such that it is well-primed for Bayesian classification. This is done by sequentially constructing linear classifiers that minimise the Bayes error via a gradient descent procedure, under an assumption of normality. We experimentally validate the proposed algorithm on $10$ UCI datasets. Our algorithm is shown to be superior in terms of the classification accuracy when compared to existing algorithms including LDR based on Fisher's criterion and the Chernoff criterion. The applicability of our algorithm is then demonstrated by employing it in diagnosing the health states of $2$ ultrasonic flow meters. As with the UCI datasets, the proposed algorithm is found to have superior performance to the existing algorithms, achieving classification accuracies of $99.4\%$ and $97.5\%$ on the two flow meters. Such high classification accuracies on the flow meters promise significant cost benefits in oil and gas operations. NOTICE: this is the author’s version of a work that was accepted for publication in Expert Systems with Applications. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Expert Systems with Applications, [91, (2017)] DOI: 10.1016/j.eswa.2017.09.010. © 2017, Elsevier. Licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International http://creativecommons.org/licenses/by-nc-nd/4.0/

KW - Linear dimensionality reduction

KW - LDA

KW - Heteroscedasticity

KW - Bayes error

KW - Flow meter diagnostics

U2 - 10.1016/j.eswa.2017.09.010

DO - 10.1016/j.eswa.2017.09.010

M3 - Article

VL - 91

SP - 252

EP - 262

JO - Expert Systems with Applications

T2 - Expert Systems with Applications

JF - Expert Systems with Applications

SN - 0957-4174

ER -