White-box Induction From SVM Models: Explainable AI with Logic Programming

FARHAD SHAKERIN; GOPAL GUPTA

doi:10.1017/S1471068420000356

White-box Induction From SVM Models: Explainable AI with Logic Programming

Published online by Cambridge University Press: 21 September 2020

FARHAD SHAKERIN

and

GOPAL GUPTA

Show author details

FARHAD SHAKERIN: Affiliation:
The University of Texas at Dallas, Texas, USA (e-mail: [email protected], [email protected])
GOPAL GUPTA: Affiliation:
The University of Texas at Dallas, Texas, USA (e-mail: [email protected], [email protected])

Article contents

Abstract
References

Get access

Rights & Permissions

Abstract

We focus on the problem of inducing logic programs that explain models learned by the support vector machine (SVM) algorithm. The top-down sequential covering inductive logic programming (ILP) algorithms (e.g., FOIL) apply hill-climbing search using heuristics from information theory. A major issue with this class of algorithms is getting stuck in local optima. In our new approach, however, the data-dependent hill-climbing search is replaced with a model-dependent search where a globally optimal SVM model is trained first, then the algorithm looks into support vectors as the most influential data points in the model, and induces a clause that would cover the support vector and points that are most similar to that support vector. Instead of defining a fixed hypothesis search space, our algorithm makes use of SHAP, an example-specific interpreter in explainable AI, to determine a relevant set of features. This approach yields an algorithm that captures the SVM model’s underlying logic and outperforms other ILP algorithms in terms of the number of induced clauses and classification evaluation metrics.

Keywords

Explainable AI Data Mining Inductive Logic Programming Machine Learning

Type: Original Article
Information: Theory and Practice of Logic Programming , Volume 20 , Issue 5: 36th International Conference on Logic Programming Special Issue I , September 2020 , pp. 656 - 670

DOI: https://doi.org/10.1017/S1471068420000356 [Opens in a new window]
Copyright: © The Author(s), 2020. Published by Cambridge University Press

Access options

Get access to the full version of this content by using one of the access options below. (Log in options will check for institutional or personal access. Content may require purchase if you do not have access.)

Article purchase

Temporarily unavailable

References

Aggarwal, Charu C. and Jiawei (editors) Han. 2014. Frequent Pattern Mining. Springer International Publishing.10.1007/978-3-319-07821-2CrossRef Google Scholar

Rakesh, Agrawal and Srikant, Ramakrishnan. 1994. Fast Algorithms for Mining Association Rules in Large Databases. In Proceedings of the 20th International Conference on Very Large Data Bases (VLDB ’94). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 487–499.Google Scholar

Baral, Chitta. 2003. Knowledge representation, reasoning and declarative problem solving. Cambridge University Press.10.1017/CBO9780511543357CrossRef Google Scholar

Tianqi, Chen and Guestrin, Carlos. 2016. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD (San Francisco, California, USA) (KDD ’16). 785–794.Google Scholar

Cohen, William W.. 1995. Fast Effective Rule Induction. In Proceedings of the Twelfth International Conference on International Conference on Machine Learning (Tahoe City, California, USA) (ICML’95). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 115–123.Google Scholar

Corinna, Cortes and Vapnik, Vladimir. 1995. Support-Vector Networks. Mach. Learn. 20, 3 (Sept. 1995), 273–297. https://doi.org/10.1023/A:1022627411411 CrossRef Google Scholar

Craven, Mark W. and Shavlik, Jude W.. 1995. Extracting Tree-structured Representations of Trained Networks. In Proceedings of the 8th International Conference on Neural Information Processing Systems (Denver, Colorado) (NIPS’95). MIT Press, Cambridge, MA, USA, 24–30.Google Scholar

Diederich, Joachim. 2008. Rule Extraction from Support Vector Machines: An Introduction. Springer Berlin Heidelberg, Berlin, Heidelberg, 3–31.Google Scholar

Dheeru, Dua and Graff, Casey. 2017. UCI Machine Learning Repository. http://archive.ics.uci.edu/ml Google Scholar

Philippe Fournier-Viger, Jerry Chun-Wei Lin, Tin Truong-Chi, and Roger Nkambou. 2019. A Survey of High Utility Itemset Mining. In High-Utility Pattern Mining: Theory, Algorithms and Applications. Springer International Publishing, 1–45. https://doi.org/10.1007/978-3-030-04921-8_1 CrossRef Google Scholar

Philippe Fournier-Viger, Cheng-Wei Wu, Souleymane Zida, and Vincent S. Tseng. 2014. FHM: Faster High-Utility Itemset Mining Using Estimated Utility Co-occurrence Pruning. In Foundations of Intelligent Systems, Troels, Andreasen, Henning, Christiansen, Juan-Carlos, Cubero, and Raś, Zbigniew W. (Eds.). Springer International Publishing, 83–92.Google Scholar

Glenn, Fung, Sathyakama, Sandilya, and Bharat Rao, R.. 2005. Rule Extraction from Linear Support Vector Machines. In Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge Discovery in Data Mining (Chicago, Illinois, USA) (KDD ’05). ACM, New York, NY, USA, 32–40.Google Scholar

Wensheng, Gan, Jerry Chun-Wei, Lin, Philippe Fournier-Viger, Han-Chieh Chao, Tzung-Pei Hong, and Hamido Fujita. 2018. A survey of incremental high-utility itemset mining. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 8, 2 (2018).Google Scholar

Michael, Gelfond and Kahl, Yulia. 2014. Knowledge representation, reasoning, and the design of intelligent agents: The answer-set programming approach. Cambridge University Press.Google Scholar

Gunning, David. 2015. Explainable Artificial Intelligence (XAI), https://www.darpa.mil/program/explainable-artificial-intelligence. Retrieved June 2018.Google Scholar

Mark, Hall, Eibe, Frank, Geoffrey, Holmes, Bernhard, Pfahringer, Peter, Reutemann, and Witten, Ian H.. 2009. The WEKA data mining software: an update. SIGKDD Explorations 11, 1 (2009), 10–18.Google Scholar

Johan, Huysmans, Bart, Baesens, and Vanthienen, Jan. 2006. ITER: an algorithm for predictive regression rule extraction. In International Conference on Data Warehousing and Knowledge Discovery. Springer, Krakow, Poland, 270–279.Google Scholar

Johan, Huysmans, Rudy, Setiono, Bart, Baesens, and Vanthienen, Jan. 2008. Minerva: Sequential covering for rule extraction. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics) 38, 2 (2008), 299–309.Google Scholar

Niels, Landwehr, Kristian, Kersting, and De Raedt, Luc. 2005. nFOIL: Integrating Nave Bayes and FOIL. In Proceedings, The Twentieth National Conference on Artificial Intelligence and the Seventeenth Innovative Applications of Artificial Intelligence Conference, July 9-13, 2005, Pittsburgh, Pennsylvania, USA. 795–800.Google Scholar

Niels, Landwehr, Andrea, Passerini, Luc De, Raedt, and Frasconi, Paolo. 2006. kFOIL: Learning Simple Relational Kernels. In Proceedings, The Twenty-First National Conference on Artificial Intelligence and the Eighteenth Innovative Applications of Artificial Intelligence Conference, July 16-20, 2006, Boston, Massachusetts, USA. 389–394.Google Scholar

Lundberg, Scott M, Gabriel G Erion, and Su-In Lee. 2018. Consistent individualized feature attribution for tree ensembles. arXiv preprint arXiv:1802.03888 (2018).Google Scholar

Scott M Lundberg and Su-In Lee. 2017. A Unified Approach to Interpreting Model Predictions. In Advances in Neural Information Processing Systems 30. Curran Associates, Inc., Long Beach, CA, 4765–4774.Google Scholar

David Martens, Johan Huysmans, Rudy Setiono, Jan Vanthienen, and Bart Baesens. 2008. Rule Extraction from Support Vector Machines: An Overview of Issues and Application in Credit Scoring. In Rule Extraction from Support Vector Machines, Joachim, Diederich (Ed.). Springer Berlin Heidelberg, Berlin, Heidelberg, 33–63.Google Scholar

Molnar, Christoph. 2019. Interpretable Machine Learning: A Guide for Making Black Box Models Explainable. https://christophm.github.io/interpretable-ml-book/.Google Scholar

Muggleton, Stephen. 1991. Inductive Logic Programming. New Gen. Comput. 8, 4 (Feb. 1991), 295–318.Google Scholar

Stephen Muggleton, Luc de Raedt, David Poole, Ivan Bratko, Peter Flach, Katsumi Inoue, and Ashwin Srinivasan. 2012. ILP Turns 20. Mach. Learn. 86, 1 (Jan. 2012), 3–23.Google Scholar

Stephen Muggleton, Huma Lodhi, Ata Amini, and Michael J. E. Sternberg. 2005. Support Vector Inductive Logic Programming. In Discovery Science, Achim, Hoffmann, Hiroshi, Motoda, and Tobias, Scheffer (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg.CrossRef Google Scholar

Haydemar, Nuñez, Cecilio, Angulo, and Català, Andreu. 2002. Rule extraction from SVM. In Proc. The European Symposium on Artificial Neural Networks. Bruges, Belgium, 107–112.Google Scholar

Plotkin, G. D.. 1971. A further note on inductive generalization. In Machine Intelligence (1971), Vol. 6. Edinburgh University Press, pages 101–124.Google Scholar

Ross Quinlan, J.. 1990. Learning Logical Definitions from Relations. Machine Learning 5 (1990), 239–266.CrossRef Google Scholar

Ross Quinlan, J.. 1993. C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA.Google Scholar

Marco Tulio, Ribeiro, Sameer, Singh, and Guestrin, Carlos. 2016. “Why Should I Trust You?”: Explaining the Predictions of Any Classifier. In Proceedings of the 22nd ACM SIGKDD 2016. 1135–1144.Google Scholar

Riguzzi, F.. 2016. ALEPH in SWI-Prolog. https://github.com/friguzzi/aleph Google Scholar

Sakama, Chiaki. 2005. Induction from answer sets in nonmonotonic logic programs. ACM Trans. Comput. Log. 6, 2 (2005), 203–231.CrossRef Google Scholar

Farhad, Shakerin and Gupta, Gopal. 2020. Whitebox Induction of Default Rules Using High-Utility Itemset Mining. In Practical Aspects of Declarative Languages - 22nd International Symposium, PADL 2020, New Orleans, LA, USA, January 20-21, 2020, Proceedings (Lecture Notes in Computer Science, Vol. 12007), Ekaterina, Komendantskaya and Yanhong Annie, Liu (Eds.). Springer, 168–176. https://doi.org/10.1007/978-3-030-39197-3_11 CrossRef Google Scholar

Farhad, Shakerin, Elmer, Salazar, and Gupta, Gopal. 2017. A new algorithm to automate inductive learning of default theories. TPLP 17, 5-6 (2017), 1010–1026.Google Scholar

Srinivasan, Ashwin. 2001. The Aleph Manual. http://web.comlab.ox.ac.uk/oucl/research/areas/machlearn/Aleph/ Google Scholar

Tseng, Vincent S, Cheng-Wei, Wu, Philippe Fournier-Viger, and S Yu Philip. 2016. Efficient algorithms for mining top-k high utility itemsets. IEEE Trans. on Know. and Data Engg. 28, 1 (2016), 54–67.Google Scholar

Wielemaker, Jan. 2020. The SWI-Prolog System. http://www.swi-prolog.org/ Google Scholar

Qiang, Zeng, Patel, Jignesh M., and Page, David. 2014. QuickFOIL: Scalable Inductive Logic Programming. Proc. VLDB Endow. 8, 3 (Nov. 2014), 197–208.Google Scholar

Article contents

White-box Induction From SVM Models: Explainable AI with Logic Programming

Abstract

Keywords

Access options

Article purchase

Temporarily unavailable

References

Save article to Kindle

Save article to Dropbox

Save article to Google Drive

Reply to: Submit a response

Your details

You have entered the maximum number of contributors

Conflicting interests