
High-dimensional distributed semantic spaces for utterances

Published online by Cambridge University Press: 31 July 2019

Jussi Karlgren*
Affiliation: Gavagai and KTH Royal Institute of Technology, Stockholm, Sweden

Pentti Kanerva
Affiliation: Redwood Center for Theoretical Neuroscience, UC Berkeley, CA, USA

*Corresponding author. Email: [email protected]

Abstract

High-dimensional distributed semantic spaces have proven useful and effective for aggregating and processing visual, auditory and lexical information for many tasks related to human-generated data. Human language makes use of a large and varying number of features: lexical and constructional items as well as contextual and discourse-specific data of various types, all of which interact to convey communicative information. Some of these features are mostly local, useful for organising, for example, the argument structure of a predication; others persist over the course of a discourse and are necessary for achieving a reasonable level of understanding of the content.

This paper describes a model for high-dimensional representation of utterance- and text-level data, including features such as constructions and contextual information, based on a mathematically principled and behaviourally plausible approach to representing linguistic information. The implementation of the representation is a straightforward extension of Random Indexing models previously used for lexical linguistic items. The paper shows how the implemented model is able to represent a broad range of linguistic features in a common integral framework of fixed dimensionality, which is computationally habitable and which is suitable as a bridge between symbolic representations, such as dependency analysis, and the continuous representations used, for example, in classifiers and further machine-learning approaches. This is achieved with operations on vectors that constitute a powerful computational algebra, accompanied by an associative memory for the vectors. The paper provides a technical overview of the framework and a worked, implemented example of how it can be applied to various types of linguistic features.
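As a rough illustration of the kind of machinery the abstract refers to, the sketch below shows, in Python with NumPy, how randomly generated word vectors might be combined by permutation (for sequence position), elementwise binding (for role-filler pairs) and addition (for bundling) into utterance vectors of fixed width, backed by a simple associative memory. This is not the authors' implementation: it uses dense bipolar vectors for simplicity, whereas Random Indexing proper uses sparse ternary index vectors, and names such as `encode_utterance` and the `AGENT` role vector are hypothetical choices made for the example.

```python
import numpy as np

DIM = 10_000  # fixed dimensionality of the semantic space
rng = np.random.default_rng(seed=0)

def random_vector():
    """A dense random bipolar vector (+1/-1 components). Dense bipolar
    vectors make the binding operation below self-inverse; Random
    Indexing itself uses sparse ternary index vectors instead."""
    return rng.choice([-1.0, 1.0], size=DIM)

lexicon = {}
def idx(word):
    """A fixed random vector per word, created on first use."""
    if word not in lexicon:
        lexicon[word] = random_vector()
    return lexicon[word]

def bind(a, b):
    """Bind two vectors by elementwise multiplication; with +1/-1
    components, binding is its own inverse: bind(a, bind(a, b)) == b."""
    return a * b

def permute(v, k):
    """Encode position k in a sequence by rotating the coordinates."""
    return np.roll(v, k)

def encode_utterance(tokens, roles=()):
    """Bundle position-permuted word vectors, plus any role-filler
    bindings, into a single fixed-width utterance vector."""
    u = sum(permute(idx(t), k) for k, t in enumerate(tokens))
    for role, filler in roles:
        u = u + bind(role, idx(filler))
    return u

def cosine(a, b):
    return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

# A minimal associative memory: store labelled vectors and retrieve the
# stored label whose vector is nearest to a (possibly noisy) query.
memory = {}
def store(label, vector):
    memory[label] = vector
def retrieve(query):
    return max(memory, key=lambda label: cosine(memory[label], query))

AGENT = random_vector()  # hypothetical role vector for a grammatical agent
store("cat-sentence", encode_utterance("the cat sat on the mat".split(),
                                       roles=[(AGENT, "cat")]))
store("dog-sentence", encode_utterance("the dog sat on the mat".split(),
                                       roles=[(AGENT, "dog")]))
# A near-paraphrase retrieves its stored neighbour: prints "cat-sentence".
print(retrieve(encode_utterance("the cat sat on a mat".split())))
```

The point the abstract makes is visible even in this toy version: word identity, sequence position and role-filler structure are all expressed as operations within one vector space of fixed dimensionality, so symbolic structure and distributional content can be mixed freely and probed with the same similarity measure.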

Type: Article
Copyright: © Cambridge University Press 2019

