Hostname: page-component-586b7cd67f-2brh9 Total loading time: 0 Render date: 2024-11-22T04:25:26.615Z Has data issue: false hasContentIssue false

White-box Induction From SVM Models: Explainable AI with Logic Programming

Published online by Cambridge University Press:  21 September 2020

FARHAD SHAKERIN
Affiliation:
The University of Texas at Dallas, Texas, USA (e-mail: [email protected], [email protected])
GOPAL GUPTA
Affiliation:
The University of Texas at Dallas, Texas, USA (e-mail: [email protected], [email protected])

Abstract

We focus on the problem of inducing logic programs that explain models learned by the support vector machine (SVM) algorithm. The top-down sequential covering inductive logic programming (ILP) algorithms (e.g., FOIL) apply hill-climbing search using heuristics from information theory. A major issue with this class of algorithms is getting stuck in local optima. In our new approach, however, the data-dependent hill-climbing search is replaced with a model-dependent search where a globally optimal SVM model is trained first, then the algorithm looks into support vectors as the most influential data points in the model, and induces a clause that would cover the support vector and points that are most similar to that support vector. Instead of defining a fixed hypothesis search space, our algorithm makes use of SHAP, an example-specific interpreter in explainable AI, to determine a relevant set of features. This approach yields an algorithm that captures the SVM model’s underlying logic and outperforms other ILP algorithms in terms of the number of induced clauses and classification evaluation metrics.

Type
Original Article
Copyright
© The Author(s), 2020. Published by Cambridge University Press

Access options

Get access to the full version of this content by using one of the access options below. (Log in options will check for institutional or personal access. Content may require purchase if you do not have access.)

References

Aggarwal, Charu C. and Jiawei (editors) Han. 2014. Frequent Pattern Mining. Springer International Publishing.10.1007/978-3-319-07821-2CrossRefGoogle Scholar
Rakesh, Agrawal and Srikant, Ramakrishnan. 1994. Fast Algorithms for Mining Association Rules in Large Databases. In Proceedings of the 20th International Conference on Very Large Data Bases (VLDB ’94). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 487–499.Google Scholar
Baral, Chitta. 2003. Knowledge representation, reasoning and declarative problem solving. Cambridge University Press.10.1017/CBO9780511543357CrossRefGoogle Scholar
Tianqi, Chen and Guestrin, Carlos. 2016. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD (San Francisco, California, USA) (KDD ’16). 785–794.Google Scholar
Cohen, William W.. 1995. Fast Effective Rule Induction. In Proceedings of the Twelfth International Conference on International Conference on Machine Learning (Tahoe City, California, USA) (ICML’95). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 115–123.Google Scholar
Corinna, Cortes and Vapnik, Vladimir. 1995. Support-Vector Networks. Mach. Learn. 20, 3 (Sept. 1995), 273297. https://doi.org/10.1023/A:1022627411411 CrossRefGoogle Scholar
Craven, Mark W. and Shavlik, Jude W.. 1995. Extracting Tree-structured Representations of Trained Networks. In Proceedings of the 8th International Conference on Neural Information Processing Systems (Denver, Colorado) (NIPS’95). MIT Press, Cambridge, MA, USA, 24–30.Google Scholar
Diederich, Joachim. 2008. Rule Extraction from Support Vector Machines: An Introduction. Springer Berlin Heidelberg, Berlin, Heidelberg, 3–31.Google Scholar
Dheeru, Dua and Graff, Casey. 2017. UCI Machine Learning Repository. http://archive.ics.uci.edu/ml Google Scholar
Philippe Fournier-Viger, Jerry Chun-Wei Lin, Tin Truong-Chi, and Roger Nkambou. 2019. A Survey of High Utility Itemset Mining. In High-Utility Pattern Mining: Theory, Algorithms and Applications. Springer International Publishing, 1–45. https://doi.org/10.1007/978-3-030-04921-8_1 CrossRefGoogle Scholar
Philippe Fournier-Viger, Cheng-Wei Wu, Souleymane Zida, and Vincent S. Tseng. 2014. FHM: Faster High-Utility Itemset Mining Using Estimated Utility Co-occurrence Pruning. In Foundations of Intelligent Systems, Troels, Andreasen, Henning, Christiansen, Juan-Carlos, Cubero, and Raś, Zbigniew W. (Eds.). Springer International Publishing, 8392.Google Scholar
Glenn, Fung, Sathyakama, Sandilya, and Bharat Rao, R.. 2005. Rule Extraction from Linear Support Vector Machines. In Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge Discovery in Data Mining (Chicago, Illinois, USA) (KDD ’05). ACM, New York, NY, USA, 32–40.Google Scholar
Wensheng, Gan, Jerry Chun-Wei, Lin, Philippe Fournier-Viger, Han-Chieh Chao, Tzung-Pei Hong, and Hamido Fujita. 2018. A survey of incremental high-utility itemset mining. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 8, 2 (2018).Google Scholar
Michael, Gelfond and Kahl, Yulia. 2014. Knowledge representation, reasoning, and the design of intelligent agents: The answer-set programming approach. Cambridge University Press.Google Scholar
Gunning, David. 2015. Explainable Artificial Intelligence (XAI), https://www.darpa.mil/program/explainable-artificial-intelligence. Retrieved June 2018.Google Scholar
Mark, Hall, Eibe, Frank, Geoffrey, Holmes, Bernhard, Pfahringer, Peter, Reutemann, and Witten, Ian H.. 2009. The WEKA data mining software: an update. SIGKDD Explorations 11, 1 (2009), 1018.Google Scholar
Johan, Huysmans, Bart, Baesens, and Vanthienen, Jan. 2006. ITER: an algorithm for predictive regression rule extraction. In International Conference on Data Warehousing and Knowledge Discovery. Springer, Krakow, Poland, 270–279.Google Scholar
Johan, Huysmans, Rudy, Setiono, Bart, Baesens, and Vanthienen, Jan. 2008. Minerva: Sequential covering for rule extraction. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics) 38, 2 (2008), 299309.Google Scholar
Niels, Landwehr, Kristian, Kersting, and De Raedt, Luc. 2005. nFOIL: Integrating Nave Bayes and FOIL. In Proceedings, The Twentieth National Conference on Artificial Intelligence and the Seventeenth Innovative Applications of Artificial Intelligence Conference, July 9-13, 2005, Pittsburgh, Pennsylvania, USA. 795–800.Google Scholar
Niels, Landwehr, Andrea, Passerini, Luc De, Raedt, and Frasconi, Paolo. 2006. kFOIL: Learning Simple Relational Kernels. In Proceedings, The Twenty-First National Conference on Artificial Intelligence and the Eighteenth Innovative Applications of Artificial Intelligence Conference, July 16-20, 2006, Boston, Massachusetts, USA. 389–394.Google Scholar
Lundberg, Scott M, Gabriel G Erion, and Su-In Lee. 2018. Consistent individualized feature attribution for tree ensembles. arXiv preprint arXiv:1802.03888 (2018).Google Scholar
Scott M Lundberg and Su-In Lee. 2017. A Unified Approach to Interpreting Model Predictions. In Advances in Neural Information Processing Systems 30. Curran Associates, Inc., Long Beach, CA, 4765–4774.Google Scholar
David Martens, Johan Huysmans, Rudy Setiono, Jan Vanthienen, and Bart Baesens. 2008. Rule Extraction from Support Vector Machines: An Overview of Issues and Application in Credit Scoring. In Rule Extraction from Support Vector Machines, Joachim, Diederich (Ed.). Springer Berlin Heidelberg, Berlin, Heidelberg, 33–63.Google Scholar
Molnar, Christoph. 2019. Interpretable Machine Learning: A Guide for Making Black Box Models Explainable. https://christophm.github.io/interpretable-ml-book/.Google Scholar
Muggleton, Stephen. 1991. Inductive Logic Programming. New Gen. Comput. 8, 4 (Feb. 1991), 295318.Google Scholar
Stephen Muggleton, Luc de Raedt, David Poole, Ivan Bratko, Peter Flach, Katsumi Inoue, and Ashwin Srinivasan. 2012. ILP Turns 20. Mach. Learn. 86, 1 (Jan. 2012), 323.Google Scholar
Stephen Muggleton, Huma Lodhi, Ata Amini, and Michael J. E. Sternberg. 2005. Support Vector Inductive Logic Programming. In Discovery Science, Achim, Hoffmann, Hiroshi, Motoda, and Tobias, Scheffer (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg.CrossRefGoogle Scholar
Haydemar, Nuñez, Cecilio, Angulo, and Català, Andreu. 2002. Rule extraction from SVM. In Proc. The European Symposium on Artificial Neural Networks. Bruges, Belgium, 107–112.Google Scholar
Plotkin, G. D.. 1971. A further note on inductive generalization. In Machine Intelligence (1971), Vol. 6. Edinburgh University Press, pages 101124.Google Scholar
Ross Quinlan, J.. 1990. Learning Logical Definitions from Relations. Machine Learning 5 (1990), 239266.CrossRefGoogle Scholar
Ross Quinlan, J.. 1993. C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA.Google Scholar
Marco Tulio, Ribeiro, Sameer, Singh, and Guestrin, Carlos. 2016. “Why Should I Trust You?”: Explaining the Predictions of Any Classifier. In Proceedings of the 22nd ACM SIGKDD 2016. 1135–1144.Google Scholar
Riguzzi, F.. 2016. ALEPH in SWI-Prolog. https://github.com/friguzzi/aleph Google Scholar
Sakama, Chiaki. 2005. Induction from answer sets in nonmonotonic logic programs. ACM Trans. Comput. Log. 6, 2 (2005), 203231.CrossRefGoogle Scholar
Farhad, Shakerin and Gupta, Gopal. 2020. Whitebox Induction of Default Rules Using High-Utility Itemset Mining. In Practical Aspects of Declarative Languages - 22nd International Symposium, PADL 2020, New Orleans, LA, USA, January 20-21, 2020, Proceedings (Lecture Notes in Computer Science, Vol. 12007), Ekaterina, Komendantskaya and Yanhong Annie, Liu (Eds.). Springer, 168–176. https://doi.org/10.1007/978-3-030-39197-3_11 CrossRefGoogle Scholar
Farhad, Shakerin, Elmer, Salazar, and Gupta, Gopal. 2017. A new algorithm to automate inductive learning of default theories. TPLP 17, 5-6 (2017), 10101026.Google Scholar
Tseng, Vincent S, Cheng-Wei, Wu, Philippe Fournier-Viger, and S Yu Philip. 2016. Efficient algorithms for mining top-k high utility itemsets. IEEE Trans. on Know. and Data Engg. 28, 1 (2016), 54–67.Google Scholar
Wielemaker, Jan. 2020. The SWI-Prolog System. http://www.swi-prolog.org/ Google Scholar
Qiang, Zeng, Patel, Jignesh M., and Page, David. 2014. QuickFOIL: Scalable Inductive Logic Programming. Proc. VLDB Endow. 8, 3 (Nov. 2014), 197208.Google Scholar