Clique-based semantic kernel with application to semantic relatedness

A. H. JADIDINEJAD; F. MAHMOUDI; M. R. MEYBODI

doi:10.1017/S135132491500008X


Published online by Cambridge University Press: 14 April 2015


Affiliations:
A. H. JADIDINEJAD: Department of Computer Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran
F. MAHMOUDI: Computer and IT Engineering Faculty, Islamic Azad University, Qazvin Branch, Qazvin, Iran
M. R. MEYBODI: Computer Engineering and Information Technology Department, Amirkabir University of Technology, Tehran, Iran


Abstract

The emergence of knowledge repositories in a variety of domains provides a valuable opportunity for the semantic interpretation of high-dimensional datasets. Previous research has investigated using concepts rather than words as the core semantic features, in order to incorporate semantic knowledge from an ontology into the document representation model. In machine learning and information retrieval, however, data objects are represented as flat feature vectors. This mismatch between the structured nature of knowledge repositories and the flat feature representation used in machine learning has led researchers to neglect the structure of the knowledge base and to treat concepts as isolated semantic features, an approach known as bag-of-concepts. Although concepts have some advantages over words, neglecting the relations between concepts leaves the vocabulary mismatch problem unresolved. This paper proposes a novel semantic kernel that incorporates the relatedness between conceptual features. The kernel uses clique theory to map data objects into a new feature space in which complex data objects become comparable. The proposed kernel applies to any application in which prior knowledge about the relatedness between features is available. We concentrate on representing text documents and words using Wikipedia and WordNet, respectively. Experimental results on a set of benchmark datasets show that the proposed kernel significantly improves the representation of both words and texts for the task of semantic relatedness.
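To make the vocabulary mismatch problem concrete, the following is a minimal illustrative sketch in Python. It is not the authors' kernel, whose details are not given in the abstract. It assumes a hypothetical toy relatedness graph over concepts, enumerates its maximal cliques with a plain Bron-Kerbosch search, and uses one feature per clique (the summed weight of the document's concepts inside that clique). All names (`maximal_cliques`, `clique_features`, `clique_kernel`, the toy concepts) are assumptions made for illustration.

```python
def maximal_cliques(adj):
    """Enumerate maximal cliques with Bron-Kerbosch (no pivoting).

    adj is an adjacency dict {concept: set of related concepts}.
    """
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates, x: already processed nodes
        if not p and not x:
            cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    # Sort for a deterministic feature order.
    return sorted(cliques, key=sorted)


def clique_features(doc, cliques):
    """Map a bag-of-concepts {concept: weight} to one feature per clique."""
    return [sum(doc.get(c, 0.0) for c in clique) for clique in cliques]


def bag_of_concepts_kernel(d1, d2):
    """Plain dot product over isolated concepts (ignores relatedness)."""
    return sum(w * d2.get(c, 0.0) for c, w in d1.items())


def clique_kernel(d1, d2, cliques):
    """Dot product in the clique feature space (an explicit feature map)."""
    f1 = clique_features(d1, cliques)
    f2 = clique_features(d2, cliques)
    return sum(a * b for a, b in zip(f1, f2))


# Hypothetical toy relatedness graph (assumption, not taken from the paper).
edges = [
    ("car", "automobile"),
    ("car", "vehicle"),
    ("automobile", "vehicle"),
    ("banana", "fruit"),
]
adj = {}
for a, b in edges:
    adj.setdefault(a, set()).add(b)
    adj.setdefault(b, set()).add(a)

cliques = maximal_cliques(adj)

doc_a = {"car": 1.0}         # mentions only "car"
doc_b = {"automobile": 1.0}  # mentions only "automobile"
doc_c = {"banana": 1.0}      # unrelated topic

print(bag_of_concepts_kernel(doc_a, doc_b))  # no shared concept
print(clique_kernel(doc_a, doc_b, cliques))  # related via a shared clique
print(clique_kernel(doc_a, doc_c, cliques))  # no shared clique
```

In this toy setting, the bag-of-concepts kernel scores `doc_a` and `doc_b` as unrelated because they share no concept, while the clique-based feature map gives them a positive similarity because both concepts fall inside the same clique. Because the kernel is a dot product of explicit feature vectors, it is a valid (positive semi-definite) kernel by construction.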

Type: Articles
Information: Natural Language Engineering, Volume 21, Special Issue 5: Graphs in NLP, November 2015, pp. 725–742

DOI: https://doi.org/10.1017/S135132491500008X
Copyright: © Cambridge University Press 2015


