This research covers the design, development, implementation, and evaluation of a hybrid classifier system that discriminates between three classes of colonic histopathological images: normal, adenomatous polyp, and cancerous lesions. The hybrid classifier combines fuzzy logic, artificial neural networks, and genetic algorithms. The solution is divided into two parts: feature selection and classification. The scope of the study is limited to the textural features introduced by Haralick, which serve as input to the classifier system. Two tools are used and compared for selecting candidate feature sets: variance ratios derived from scatter matrices, and a genetic algorithm whose fitness function uses a Kohonen self-organising map. Results show that the variance ratio derived from scatter matrices is far simpler and faster to use than the genetic algorithm with the Kohonen map. For the classification part, a hybrid neuro-fuzzy adaptive network known as the Adaptive Network-Based Fuzzy Inference System, or ANFIS, is used. The elegance and power of this computational framework are evident in the way the network parameters and fuzzy membership functions are adjusted adaptively, given only the data from the feature sets.
The thesis then argues that the confusion matrix, although an effective format for presenting the performance of a classifier, omits important details about the shortcomings of the classifier being evaluated. This study proposes the Mean Relative Difference Confusion Matrix, or MRDCM, a name coined here. The MRDCM can be thought of as a modified version of the conventional confusion matrix: instead of counting correct classifications and misclassifications, it tabulates the average differences between the expected and the predicted real-valued outputs of the Sugeno-type defuzzification of the ANFIS. A second performance indicator introduced in this research is the Classification Performance Index, or CPI. The advantage of the CPI is that it is a single number, similar to the accuracy percentage obtained by summing the leading diagonal of a confusion matrix and normalising the result. Although the CPI is slightly more complicated to compute, it explicitly accounts for the misclassifications produced by the classifier under scrutiny. The CPI is calculated by multiplying each cell of the confusion matrix by a performance factor that either increases or decreases the cell's value, depending on its location in the confusion matrix. Performance indicators are regarded here as being as important as the classifier algorithms themselves, since they are what allow the success and failure of a solution to be measured.
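As an illustration of the two indicators described above, the following Python sketch computes a conventional confusion matrix, an MRDCM-style table, and a CPI-style score for a toy three-class problem. It is not the thesis's implementation, and several details are assumptions made only for the example: the classes are encoded as the targets 1, 2 and 3; the predicted class is the nearest integer to the real-valued output; the relative difference is taken as |output - target| / target; and the performance factors (+1 on the diagonal, -0.5 and -1.0 off it) are hypothetical values, as are all function and variable names.

```python
import numpy as np

def predict_labels(outputs, n_classes):
    # Assumption: the predicted class is the nearest integer target in 1..n_classes.
    return np.clip(np.rint(outputs), 1, n_classes).astype(int)

def confusion_matrix(expected, predicted, n_classes):
    # Rows index the expected class, columns the predicted class.
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for e, p in zip(expected, predicted):
        cm[e - 1, p - 1] += 1
    return cm

def accuracy(cm):
    # Normalised sum of the leading diagonal.
    return np.trace(cm) / cm.sum()

def mean_relative_difference_matrix(expected, outputs, n_classes):
    # MRDCM-style table: each cell holds the mean relative difference between
    # the expected target and the real-valued output, instead of a count.
    # Assumption: relative difference = |output - target| / target.
    expected = np.asarray(expected)
    outputs = np.asarray(outputs, dtype=float)
    predicted = predict_labels(outputs, n_classes)
    rel = np.abs(outputs - expected) / expected
    table = np.full((n_classes, n_classes), np.nan)  # NaN marks empty cells
    for e in range(1, n_classes + 1):
        for p in range(1, n_classes + 1):
            mask = (expected == e) & (predicted == p)
            if mask.any():
                table[e - 1, p - 1] = rel[mask].mean()
    return table

def classification_performance_index(cm, factors):
    # CPI-style score: weight every cell by a location-dependent factor,
    # then normalise by the number of samples.
    return float((cm * factors).sum() / cm.sum())

# Toy data: targets 1 = normal, 2 = adenomatous polyp, 3 = cancerous.
expected = np.array([1, 1, 2, 2, 3, 3])
outputs = np.array([1.1, 1.8, 2.2, 2.0, 2.9, 1.2])  # real-valued classifier outputs

# Hypothetical performance factors: reward correct cells, penalise
# misclassifications more heavily the further they are from the diagonal.
factors = np.array([[ 1.0, -0.5, -1.0],
                    [-0.5,  1.0, -0.5],
                    [-1.0, -0.5,  1.0]])

predicted = predict_labels(outputs, 3)
cm = confusion_matrix(expected, predicted, 3)
mrdcm = mean_relative_difference_matrix(expected, outputs, 3)
cpi = classification_performance_index(cm, factors)
acc = accuracy(cm)
```

In this toy case the accuracy only counts the four correct classifications, whereas the CPI-style score is pulled down by the two misclassifications, and the MRDCM-style table shows how far the real-valued outputs were from their targets in each cell.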
|Date of Award||2011|
|Supervisor||Raouf Naguib (Supervisor) & Elmer Dadios (Supervisor)|