Al-Rajab, Murad Mustafa Jaber (2019) EFFICIENT ALGORITHMS FOR CANCER GENE SEARCHING AND CLASSIFICATION: COLON CANCER. Doctoral thesis, University of Huddersfield.

Cancer kills millions of people worldwide each year. It is a growing problem and one of the foremost causes of death worldwide. The number of people battling cancer is growing rapidly for various reasons, including lifestyle factors. Clinically, determining the cause of cancer is very challenging and often inaccurate.

This research is motivated by the growing need for efficient and accurate algorithms to detect colon cancer. Two models are proposed, each presented as a case study. The first case study (model) proposes a three-phase method for examining the accuracy and time efficiency of high-performance gene selection and cancer classification algorithms applied to detecting colon cancer cells. The first and second phases examine gene/feature selection algorithms and cancer classification algorithms, respectively, each applied independently across the entire colon dataset. Phase three examines the performance of the first two phases combined. Accuracy and running time are then compared across algorithms. The second case study proposes a model that improves accuracy by using a two-stage hybrid multifilter feature selection method for colon cancer classification. This model demonstrates the benefit of applying gene selection before classification, which improves the accuracy of cancer-cell detection. In the first stage, the model applies a hybrid of a genetic algorithm (GA) and information gain; in the second stage, the minimum redundancy maximum relevance (mRMR) filter-ranking algorithm refines the subset of selected genes. The selected genes are then evaluated with a variety of machine-learning algorithms.
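The two filter steps named above can be illustrated with a minimal sketch. It is not the thesis implementation: the GA search is omitted, expression values are assumed to be already discretised (0 = low, 1 = high), mRMR is written in its common "relevance minus mean redundancy" form, and the gene names and toy data are hypothetical.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for two discrete sequences."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))

def information_gain_top_k(columns, labels, k):
    """Stage 1 (simplified, no GA): keep the k genes with the highest
    information gain, which equals I(gene; class) for discrete data."""
    scores = {g: mutual_information(col, labels) for g, col in columns.items()}
    # Ties are broken by gene name so the result is deterministic.
    return sorted(scores, key=lambda g: (-scores[g], g))[:k]

def mrmr(columns, labels, candidates, k):
    """Stage 2: greedy mRMR. Each step adds the gene that maximises
    relevance to the class minus mean redundancy with genes already chosen."""
    relevance = {g: mutual_information(columns[g], labels) for g in candidates}
    selected, remaining = [], list(candidates)

    def score(g):
        if not selected:
            return relevance[g]
        redundancy = sum(
            mutual_information(columns[g], columns[s]) for s in selected
        ) / len(selected)
        return relevance[g] - redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

# Hypothetical toy data: 8 samples, class 0 = normal, class 1 = tumour.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
columns = {
    "gene_a": [0, 0, 0, 1, 1, 1, 1, 1],  # informative
    "gene_b": [0, 0, 0, 1, 1, 1, 1, 1],  # exact duplicate of gene_a
    "gene_c": [0, 0, 1, 0, 1, 1, 0, 1],  # weaker, but not redundant with gene_a
    "gene_d": [0, 1, 0, 1, 0, 1, 0, 1],  # independent of the class
}

stage1 = information_gain_top_k(columns, labels, k=3)
final = mrmr(columns, labels, stage1, k=2)
print(stage1)
print(final)
```

On this toy data the first stage discards the uninformative gene, and the second stage prefers the weaker but complementary gene over the exact duplicate. The final subset would then be passed to a classifier such as SVM or DT.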

The first case study finds that, in phase 1, GA performs better than the other algorithms for gene selection on the colon dataset. In phase 2, the decision tree (DT) and support vector machine (SVM) classifiers achieve very good accuracy (86%–87%). In phase 3, combining GA as the selector with DT as the classifier outperforms the other algorithms in accuracy (92%) and also performs better in the time-efficiency analysis. In the second case study, SVM classifiers achieve high accuracy (94%) following the proposed two-stage multifilter selection approach. Compared with methods in the literature, the proposed models yield better results.

Al-Rajab THESIS.pdf - Accepted Version
Restricted to Repository staff only
Available under License Creative Commons Attribution Non-commercial No Derivatives.
