
Evaluation of taxonomic and neural embedding methods for calculating semantic similarity

Published online by Cambridge University Press:  28 September 2021

Dongqiang Yang* and Yanqin Yin
Affiliation: School of Computer Science and Technology, Shandong Jianzhu University, Jinan, China
*Corresponding author. E-mail: [email protected]

Abstract

Modelling semantic similarity plays a fundamental role in lexical semantic applications. A natural way of calculating semantic similarity is to access handcrafted semantic networks, but similarity can also be predicted in a distributional vector space. Similarity calculation remains a challenging task, even with the latest breakthroughs in deep neural language models. We first examined popular methodologies for measuring taxonomic similarity, including edge-counting, which relies solely on the semantic relations in a taxonomy, as well as more complex methods that estimate concept specificity. We further distilled three weighting factors for modelling taxonomic similarity. To study the distinct mechanisms of taxonomic and distributional similarity measures, we ran head-to-head comparisons of each measure against human similarity judgements from the perspectives of word frequency, polysemy degree and similarity intensity. Our findings suggest that, without fine-tuning the uniform distance, taxonomic similarity measures can rely on the shortest path length as the prime factor for predicting semantic similarity; that, in contrast to distributional semantics, edge-counting is free from the bias of sense distribution in usage and can measure word similarity both literally and metaphorically; and that the synergy of retrofitting neural embeddings with concept relations in similarity prediction may signal a new trend of leveraging knowledge bases in transfer learning. A large gap still remains in computing semantic similarity across different ranges of word frequency, polysemy degree and similarity intensity.
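To make the two families of measures concrete, the sketch below contrasts a taxonomic edge-counting score computed over WordNet's hierarchy with a distributional cosine score over word vectors, and adds a toy version of the retrofitting update that pulls embeddings towards their neighbours in a semantic lexicon. This is an illustrative sketch only, not the evaluation protocol of the paper: it assumes NLTK's WordNet interface is available, and the embedding vectors and the small lexicon are placeholders standing in for pretrained vectors and a real knowledge base.

```python
# Illustrative sketch (not the paper's exact protocol): taxonomic edge-counting
# similarity over WordNet vs. distributional cosine similarity, plus a toy
# retrofitting update towards lexicon neighbours.
# Assumes NLTK with the WordNet corpus installed (nltk.download('wordnet')).
import numpy as np
from nltk.corpus import wordnet as wn


def taxonomic_similarity(word1, word2):
    """Edge-counting style score: best path-based similarity over all sense
    pairs, where path_similarity = 1 / (shortest path length + 1)."""
    best = 0.0
    for s1 in wn.synsets(word1):
        for s2 in wn.synsets(word2):
            score = s1.path_similarity(s2)
            if score is not None and score > best:
                best = score
    return best


def cosine(v1, v2):
    """Distributional similarity: cosine of the angle between two vectors."""
    return float(np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2)))


def retrofit(vectors, lexicon, iterations=10, alpha=1.0):
    """Nudge each vector towards its lexicon neighbours while staying close
    to its original embedding (here beta_ij = 1 / degree(i))."""
    new = {w: v.copy() for w, v in vectors.items()}
    for _ in range(iterations):
        for word, neighbours in lexicon.items():
            neighbours = [n for n in neighbours if n in new]
            if word not in new or not neighbours:
                continue
            beta = 1.0 / len(neighbours)
            pulled = alpha * vectors[word] + beta * sum(new[n] for n in neighbours)
            new[word] = pulled / (alpha + beta * len(neighbours))
    return new


if __name__ == "__main__":
    print(taxonomic_similarity("car", "automobile"))  # 1.0: the two words share a synset

    # Placeholder vectors; in practice they would come from a pretrained model
    # such as word2vec, GloVe or a static embedding distilled from BERT.
    rng = np.random.default_rng(0)
    vectors = {"car": rng.random(300), "automobile": rng.random(300)}
    print(cosine(vectors["car"], vectors["automobile"]))

    # Toy lexicon linking the two words; retrofitting raises their cosine.
    lexicon = {"car": ["automobile"], "automobile": ["car"]}
    retrofitted = retrofit(vectors, lexicon)
    print(cosine(retrofitted["car"], retrofitted["automobile"]))
```

The taxonomic score depends only on the shortest path between senses and is therefore unaffected by how often each sense occurs in corpora, whereas the cosine score inherits whatever sense distribution the training corpus imposed on the vectors; the retrofitting step illustrates how relational knowledge can be injected back into a distributional space.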

Type: Article
Copyright: © The Author(s), 2021. Published by Cambridge University Press

