Abu Mansour, Hussein Y (2012) Rule pruning and prediction methods for associative classification approach in data mining. Doctoral thesis, University of Huddersfield.
- Accepted Version
Available under License Creative Commons Attribution Non-commercial No Derivatives.
Recent studies in data mining have shown that the Associative Classification (AC) approach builds classifiers that are competitive in accuracy with classic classification approaches, including decision trees and rule-based methods. Nevertheless, AC algorithms suffer from a number of known defects, such as the generation of a large number of rules, which makes it hard for end-users to maintain and understand the outcome, and possible overfitting caused by the confidence-based rule evaluation used by AC.
This thesis attempts to deal with the above problems by presenting five new pruning methods and a prediction method, and employs them in an AC algorithm that significantly reduces the number of generated rules without a large impact on the prediction rate of the classifiers. In particular, the new pruning methods discard redundant and insignificant rules while the classifier is being built. These pruning procedures remove any rule that either has no training case coverage or covers a training case without the requirement of class similarity between the rule class and that of the training case. This enables large coverage for each rule, reduces overfitting, and yields accurate, moderately sized classifiers. In addition, a novel class assignment method based on multiple rules is proposed, which employs a group of rules to make the prediction decision. The pruning and prediction procedures have been integrated to enhance a known AC algorithm called Multiple-class Classification based on Association Rules (MCAR), resulting in a model that is competent with regard to accuracy and classifier size, called "Multiple-class Classification based on Association Rules 2 (MCAR2)". Experimental results on different datasets from the UCI data repository showed that the predictive power of the classifiers produced by MCAR2 slightly increases and the classifier size is reduced compared with other AC algorithms such as MCAR.
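The abstract does not give the details of the five pruning methods or of the prediction method, so the following Python sketch is only an illustration of the general ideas, not the MCAR2 procedure. It assumes one possible reading of coverage-based pruning (a rule is kept only if it matches at least one training case that is still uncovered, with no comparison of class labels) and a simple multiple-rule prediction in which every matching rule votes for its class. The `Rule` type, the ranking by confidence alone, and the confidence-weighted vote are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule representation (not taken from the thesis)."""
    items: frozenset   # attribute values in the rule body
    label: str         # class predicted by the rule
    confidence: float


def matches(rule, case_items):
    # A rule covers a case when its whole body appears in the case.
    return rule.items <= case_items


def prune_by_coverage(rules, training):
    """Keep a rule only if it covers at least one still-uncovered case.

    `training` is a list of (case_items, case_label) pairs. The case label
    is deliberately ignored here: coverage does not require the rule class
    to equal the case class (an assumed reading of the abstract).
    """
    # Assumption: rules are ranked by confidence only.
    ranked = sorted(rules, key=lambda r: r.confidence, reverse=True)
    uncovered = list(training)
    kept = []
    for rule in ranked:
        if any(matches(rule, items) for items, _ in uncovered):
            kept.append(rule)
            uncovered = [(items, label) for items, label in uncovered
                         if not matches(rule, items)]
        if not uncovered:
            break  # every training case is covered; remaining rules are dropped
    return kept


def predict(rules, case_items, default):
    """Let every matching rule vote; the class with the largest summed
    confidence wins (assumed voting scheme). Falls back to `default`."""
    votes = defaultdict(float)
    for rule in rules:
        if matches(rule, case_items):
            votes[rule.label] += rule.confidence
    return max(votes, key=votes.get) if votes else default
```

With this sketch, a rule that matches no training case is never added to the classifier, and a test case matched by several rules is labelled by the group of rules rather than by the single highest-ranked one.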
Item Type: Thesis (Doctoral)
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Schools: School of Computing and Engineering
Depositing User: Gail Hurst
Date Deposited: 14 May 2013 09:36
Last Modified: 04 Nov 2015 19:15