Pruning techniques in associative classification: Survey and comparison

Thabtah, Fadi Abdeljaber (2006) Pruning techniques in associative classification: Survey and comparison. Journal of Digital Information Management, 4 (3). pp. 197-202. ISSN 0972-7272


Association rule discovery and classification
are common data mining tasks. Associative
classification, which integrates association rule discovery
with classification, is a promising approach that derives
classifiers whose accuracy is highly competitive with
that of traditional classification approaches such as rule
induction and decision trees. However, the classifiers
generated by associative classification are often
large, and pruning therefore becomes an essential task.
In this paper, we survey different rule pruning methods
used by current associative classification techniques.
Further, we compare the effect of three pruning methods
(database coverage, pessimistic error estimation, lazy
pruning) on the accuracy rate and the number of rules
derived from different classification data sets. Experiments
on data sets from the UCI data collection indicate that
lazy pruning algorithms may produce classifiers with slightly
higher predictive accuracy than those that utilise database
coverage and pessimistic error pruning methods. However, the
potential use of such classifiers is limited because they are
difficult for the end-user to understand and maintain.
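
As a rough illustration of the first of the three pruning methods compared, database coverage can be sketched as follows. This is a minimal sketch of the general heuristic and not the implementation evaluated in the paper: the `Rule` type, the `database_coverage` function, the ranking order (confidence, then support, then antecedent length) and the toy data are all assumptions made for the example.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    """Hypothetical rule representation, assumed for this sketch."""
    antecedent: frozenset  # items that must all appear in a training row
    label: str             # class predicted by the rule
    confidence: float
    support: float


def database_coverage(rules, rows):
    """Keep a rule only if it correctly classifies at least one uncovered row."""
    # Assumed ranking: higher confidence, then higher support, then fewer items.
    ranked = sorted(
        rules, key=lambda r: (-r.confidence, -r.support, len(r.antecedent))
    )
    remaining = list(rows)  # (items, label) pairs not yet covered by a kept rule
    kept = []
    for rule in ranked:
        if not remaining:
            break  # every training row is covered; later rules are pruned
        matched = [row for row in remaining if rule.antecedent <= row[0]]
        if any(label == rule.label for _, label in matched):
            kept.append(rule)
            # Drop every row the rule's antecedent matches, right or wrong.
            remaining = [row for row in remaining if not rule.antecedent <= row[0]]
    return kept


# Toy training data and candidate rules (illustrative values only).
rows = [
    (frozenset({"a", "b"}), "yes"),
    (frozenset({"a", "c"}), "yes"),
    (frozenset({"b", "c"}), "no"),
]
rules = [
    Rule(frozenset({"a"}), "yes", 1.0, 2 / 3),
    Rule(frozenset({"a", "b"}), "yes", 1.0, 1 / 3),
    Rule(frozenset({"c"}), "no", 0.5, 1 / 3),
]
kept = database_coverage(rules, rows)
```

In this toy run the second rule is discarded because the rows it matches have already been covered by the more general first rule, which is how the heuristic keeps the classifier small.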

Item Type: Article
Additional Information: © Reproduced by permission of the Journal of Digital Information Management. Published by Digital Information Research Foundation.
Uncontrolled Keywords: Associative Classification, Association Rule, Classification, Data Mining, Rule Pruning
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Q Science > QA Mathematics > QA76 Computer software
Schools: School of Computing and Engineering

Depositing User: Sara Taylor
Date Deposited: 05 Jul 2007
Last Modified: 06 Nov 2015 13:07

