A review of associative classification mining

FADI THABTAH

doi:10.1017/S0269888907001026

Published online by Cambridge University Press: 01 March 2007

FADI THABTAH: Affiliation:
Department of Computing and Engineering, University of Huddersfield, HD1 3DH, UK; e-mail: [email protected]

Abstract

Associative classification mining is a promising approach in data mining that uses association rule discovery techniques to construct classification systems, also known as associative classifiers. In the last few years, a number of associative classification algorithms have been proposed, e.g. CPAR, CMAR, MCAR, MMAC and others. These algorithms employ several different rule discovery, rule ranking, rule pruning, rule prediction and rule evaluation methods. This paper surveys and compares the state-of-the-art associative classification techniques with regard to the above criteria. Finally, future directions in associative classification, such as incremental learning and mining low-quality data sets, are also highlighted.
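
To make the pipeline named in the abstract concrete (rule discovery, rule ranking, rule prediction), the following is a minimal sketch only. It is not CPAR, CMAR, MCAR, MMAC or any other algorithm reviewed in the paper. It assumes categorical attribute-value rows, brute-force enumeration of short antecedents, support and confidence thresholds chosen arbitrarily for the toy data, ranking by confidence then support then antecedent length, and prediction by the first matching rule. The function names `mine_rules` and `predict` and the toy data set are hypothetical.

```python
from collections import Counter
from itertools import combinations


def mine_rules(rows, labels, min_support=0.2, min_confidence=0.6, max_len=2):
    """Brute-force discovery of class association rules (itemset -> class).

    Illustrative assumption: every antecedent of up to max_len items is
    enumerated directly; practical algorithms prune this search space.
    """
    n = len(rows)
    itemset_counts = Counter()  # how often each antecedent occurs
    rule_counts = Counter()     # how often antecedent and class co-occur
    for row, label in zip(rows, labels):
        items = sorted(row.items())
        for k in range(1, max_len + 1):
            for combo in combinations(items, k):
                itemset_counts[combo] += 1
                rule_counts[(combo, label)] += 1

    rules = []
    for (combo, label), count in rule_counts.items():
        support = count / n
        confidence = count / itemset_counts[combo]
        if support >= min_support and confidence >= min_confidence:
            rules.append((combo, label, support, confidence))

    # Rule ranking (assumed order): higher confidence first, then higher
    # support, then shorter antecedent.
    rules.sort(key=lambda r: (-r[3], -r[2], len(r[0])))
    return rules


def predict(rules, row, default):
    """Return the class of the highest-ranked rule whose antecedent matches."""
    for combo, label, _support, _confidence in rules:
        if all(row.get(attr) == value for attr, value in combo):
            return label
    return default  # no rule covers this row


# Hypothetical toy training data.
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "sunny", "windy": "no"},
]
labels = ["play", "play", "stay", "stay", "play"]

rules = mine_rules(rows, labels)
prediction = predict(rules, {"outlook": "rainy", "windy": "no"}, default="unknown")
```

In this sketch the rule set doubles as the classifier: an unseen row receives the class of the best-ranked rule it satisfies, and falls back to a default class when no rule applies. Rule pruning and rule evaluation, which the abstract also lists, are deliberately left out.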

Type: Original Article
Information: The Knowledge Engineering Review, Volume 22, Issue 1, March 2007, pp. 37–65

DOI: https://doi.org/10.1017/S0269888907001026
Copyright: © Cambridge University Press 2007
