Introduction: Cognitive Issues in Natural Language Processing

doi:10.1017/9781316676974.001

1 - Introduction: Cognitive Issues in Natural Language Processing

from Part I - About This Book

Published online by Cambridge University Press: 30 November 2017

Thierry Poibeau and Aline Villavicencio

Edited by Thierry Poibeau and Aline Villavicencio

Thierry Poibeau: Affiliation:
Lattice laboratory, CNRS, École Normale Supérieure and Université Sorbonne Nouvelle, France
Aline Villavicencio: Affiliation:
Institute of Informatics, Federal University of Rio Grande do Sul, Brazil, and School of Computer Science and Electronic Engineering, University of Essex, UK
Thierry Poibeau: Affiliation:
Centre National de la Recherche Scientifique (CNRS), Paris
Aline Villavicencio: Affiliation:
Universidade Federal do Rio Grande do Sul, Brazil

Summary

On the Relationships between Natural Language Processing and Cognitive Sciences

This introduction gives an overview of the questions and problems addressed jointly in natural language processing and cognitive science. More precisely, the aim of this introduction, and of this book more generally, is to show how the two fields can cross-fertilize, each drawing on recent advances in the other to produce richer studies.

Natural language processing fundamentally deals with semantics and, more generally, with knowledge. Cognitive science also deals mostly with knowledge: how it is acquired and processed in the brain. The two domains have developed largely independently, as we discuss later in this Introduction, but there are obvious links between them, and a large number of researchers have investigated problems that involve both fields, in either the data or the methods used.

A Quick Historical Overview

The landscape of natural language processing (NLP) has changed dramatically in recent decades. Until recently, it was generally assumed that one first needed to adequately formalize an information context (for example, the information contained in a text) before developing applications dealing with semantics (see, for example, Sowa 1991; Allen 1994; Nirenburg and Raskin 2004). This initial step involved manipulating large knowledge bases of hand-crafted rules, and resulted in the new field of “knowledge engineering” (Brachman and Levesque 2004).

Knowledge can be seen as the result of the confrontation of our a priori ideas with the reality of the outside world. Formalizing it therefore raises several difficulties: (1) the task is potentially infinite, since people constantly perceive a multiplicity of things; (2) perception interferes with information already registered in the brain, leading to complex inferences with commonsense knowledge; (3) very little is known about how information is processed in the brain, which makes things even harder to formalize.

To address some of these issues, a common assumption was that knowledge could be disconnected from perception. This led to projects aiming to develop large static databases of “common sense knowledge,” from CYC (Lenat 1995) to more recent general-domain ontologies like ConceptNet (Liu and Singh 2004). However, these projects have always produced databases that, despite their size, were never sufficient to formalize a given domain completely and accurately; domain-independent applications were thus even further out of reach.

Type: Chapter
Information: Language, Cognition, and Computational Models, pp. 3–24

DOI: https://doi.org/10.1017/9781316676974.001

Publisher: Cambridge University Press

Print publication year: 2018

References

Abney, Steven. 1991. Parsing by Chunks. In: Berwick, Robert, Abney, Steven, and Tenny, Carol (eds), Principle-Based Parsing. Dordrecht: Kluwer Academic Press.

Alishahi, Afra. 2010. Computational modeling of human language acquisition. Synthesis Lectures on Human Language Technologies, 3(1), 1–107.

Alishahi, Afra, and Stevenson, Suzanne. 2008. A computational model of early argument structure acquisition. Cognitive Science, 32(5), 789–834.

Allen, James. 1994. Natural Language Understanding. New York: Pearson.

Anderson, John R. 1976. Language, Memory, and Thought. Mahwah, NJ: Lawrence Erlbaum Associates.

Anderson, John R., Bothell, Dan, Byrne, Michael D., Douglass, Scott, Lebiere, Christian, and Qin, Yulin. 2004. An integrated theory of the mind. Psychological Review, 111(4), 1036–1060.

Bergen, Benjamin, and Chang, Nancy. 2003. Embodied Construction Grammar in Simulation-Based Language Understanding. In: Ostman, J.-O., and Fried, M. (eds), Construction Grammar(s): Cognitive and Cross-Language Dimensions. Amsterdam: John Benjamins.

Bertola, Laiss, Mota, Natalia Bezerra, Copelli, Mauro, Rivero, Thiago, Satler, Breno, Diniz, De Oliveira, Romano, Marco, Aurelio, Ribeiro, Sidarta, and Malloy-Diniz, Leandro Fernandes. 2014. Graph analysis of verbal fluency test discriminate between patients with Alzheimer's disease, mild cognitive impairment and normal elderly controls. Frontiers in Aging Neuroscience, 6(185).

Bertolo, Stefano. 2001. A Brief Overview of Learnability. Pages 1–14 of: Bertolo, Stefano (ed), Language Acquisition and Learnability. Cambridge University Press.

Berwick, Robert C. 1985. The Acquisition of Syntactic Knowledge. MIT Press.

Berwick, Robert C., and Chomsky, Noam. 2015. Why Only Us: Language and Evolution. Cambridge, MA: MIT Press.

Berwick, Robert C., and Niyogi, Partha. 1996. Learning from Triggers. Linguistic Inquiry, 27(4), 605–622.

Bickerton, Derek. 2016. Roots of Language. Berlin: Language Science Press.

Blache, Philippe. 2013. Chunks et activation: Un modèle de facilitation du traitement linguistique. In: Proceedings of the Conference Traitement Automatique du Langage Naturel. Les Sables d'Olonnes: ATALA.

Borge-Holthoefer, Javier, Moreno, Yamir, and Arenas, Alex. 2011. Modeling abnormal priming in Alzheimer's patients with a free association network. PLoS ONE, 6(8).

Boston, Marisa F., Hale, John T., Vasishth, Shravan, and Kliegl, Reinhold. 2011. Parallel processing and sentence comprehension difficulty. Language and Cognitive Processes, 26(3), 301–349.

Brachman, Ronald, and Levesque, Hector. 2004. Knowledge Representation and Reasoning. San Francisco: Morgan Kaufmann Publishers Inc.

Brent, Michael R. 1999. An efficient, probabilistically sound algorithm for segmentation and word discovery. Machine Learning, 34(1), 71–105.

Briscoe, Ted. 1997. Co-evolution of language and of the language acquisition device. Pages 418–427 of: Proceedings of the Eighth Conference on European Chapter of the Association for Computational Linguistics. Association for Computational Linguistics.

Briscoe, Ted. 2000. Grammatical acquisition: Inductive bias and coevolution of language and the language acquisition device. Language, 245–296.

Cabana, Álvaro, Valle-Lisboa, Juan C., Elvevåg, Brita, and Mizraji, Eduardo. 2011. Detecting order-disorder transitions in discourse: Implications for schizophrenia. Schizophrenia Research, 131(1–3), 157–164.

Chomsky, Noam. 1965. Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.

Christiansen, Morten H., and Kirby, Simon. 2003. Language Evolution: The Hardest Problem in Science? In: Christiansen, M. H., and Kirby, S. (eds), Language Evolution: The States of the Art. New York: Oxford University Press.

Chrupała, Grzegorz, Kádár, Ákos, and Alishahi, Afra. 2015. Learning language through pictures. Pages 112–118 of: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics. Beijing: ACL.

Coltheart, Max. 1981. The MRC psycholinguistic database. Quarterly Journal of Experimental Psychology, 33A, 497–505.

De Deyne, Simon, and Storms, Gert. 2008. Word associations: Network and semantic properties. Behavior Research Methods, 40(1), 213–231.

Demberg, Vera, and Keller, Frank. 2008. Data from eye-tracking corpora as evidence for theories of syntactic processing complexity. Cognition, 109(2), 193–210.

Elsner, Micha, Goldwater, Sharon, Feldman, Naomi, and Wood, Frank. 2013. A Joint Learning Model of Word Segmentation, Lexical Acquisition, and Phonetic Variability. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing.

Erk, Katrin, and McCarthy, Diana. 2009. Graded word sense assignment. In: Proceedings of the Empirical Methods in Natural Language Processing Conference. ACL.

Fazly, Afsaneh, Alishahi, Afra, and Stevenson, Suzanne. 2010. A probabilistic computational model of cross-situational word learning. Cognitive Science, 34(6), 1017–1063.

Fellbaum, Christiane (ed). 1998. WordNet: An Electronic Lexical Database. Language, Speech, and Communication Series. Cambridge, MA: MIT Press.

Ferrucci, David A. 2012. Introduction to “This is Watson.” IBM J. Res. Dev., 56(3), 235–249.

Francez, Nissim, and Wintner, Shuly. 2012. Unification Grammars. Cambridge, UK: Cambridge University Press.

Fu, Ruiji, Guo, Jiang, Qin, Bing, Che, Wanxiang, Wang, Haifeng, and Liu, Ting. 2014. Learning Semantic Hierarchies via Word Embeddings. Pages 1199–1209 of: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, ACL 2014.

Gee, James P., and Grosjean, François. 2004. Performance structures: A psycholinguistic and linguistic appraisal. Cognitive Psychology, 15(4), 411–458.

Gold, E. Mark. 1967. Language identification in the limit. Information and Control, 10(5), 447–474.

Hale, John T. 2001. A Probabilistic Earley Parser as a Psycholinguistic Model. In: Proceedings of the Second Meeting of the North American Chapter of the Association for Computational Linguistics.

Hauser, Marc D., Yang, Charles, Berwick, Robert C., Tattersall, Ian, Ryan, Michael J., Watumull, Jeffrey, Chomsky, Noam, and Lewontin, Richard C. 2014. The mystery of language evolution. Frontiers in Psychology, 5(May), 1–12.

Hill, Felix, Reichart, Roi, and Korhonen, Anna. 2015. SimLex-999: Evaluating semantic models with (genuine) similarity estimation. Computational Linguistics, 41(4), 665–695.

Holland, John H. 1992. Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control and Artificial Intelligence. Cambridge, MA: MIT Press.

Hsu, Anne S., and Chater, Nick. 2010. The logical problem of language acquisition: A probabilistic perspective. Cognitive Science, 34(6), 972–1016.

Hyams, Nina. 1987. The theory of parameters and syntactic development. Pages 1–22 of: Roeper, Thomas, and Williams, Edwin (eds), Parameter Setting. Dordrecht: Springer Netherlands.

Johansson, Sverker. 2005. Origins of Language: Constraints on Hypotheses. Converging Evidence in Language and Communication Research. Amsterdam: John Benjamins Publishing Company.

Joshi, Aravind K. 1990. Processing crossed and nested dependencies: An automaton perspective on the psycholinguistic results. Language and Cognitive Processes, 5(1), 1–27.

Jurafsky, Daniel, and Martin, James H. 2000. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Upper Saddle River: Pearson Prentice Hall.

Köper, Maximilian, and Schulte im Walde, Sabine. 2016. Automatically Generated Affective Norms of Abstractness, Arousal, Imageability and Valence for 350 000 German Lemmas. Pages 2595–2598 of: Proceedings of the 10th International Conference on Language Resources and Evaluation.

Kwiatkowski, Tom, Goldwater, Sharon, Zettlemoyer, Luke, and Steedman, Mark. 2012. A Probabilistic Model of Syntactic and Semantic Acquisition from Child-directed Utterances and Their Meanings. Pages 234–244 of: Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics. EACL '12.

Le, Quoc, and Mikolov, Tomas. 2014. Distributed Representations of Sentences and Documents. In: 31st International Conference on Machine Learning.

Legate, Julie Anne, and Yang, Charles. 2005. The richness of the poverty of the stimulus. On the Occasion of Happy Golden Anniversary, Generative Syntax: 50 Years Since Logical Structure of Linguistic Theory.

Legate, Julie Anne, and Yang, Charles. 2007. Morphosyntactic learning and the development of tense. Language Acquisition, 14(3), 315–344.

Lenat, Douglas B. 1995. CYC: A large-scale investment in knowledge infrastructure. Communications of the ACM, 38(11), 33–38.

Levy, Roger. 2013. Memory and surprisal in human sentence comprehension. Pages 78–114 of: van Gompel, Roger P. G. (ed), Sentence Processing. Hove, UK: Psychology Press.

Lignos, Constantine. 2011. Modeling Infant Word Segmentation. Pages 29–38 of: Proceedings of the Fifteenth Conference on Computational Natural Language Learning.

Lin, Dekang. 1998. Automatic retrieval and clustering of similar words. Pages 768–774 of: Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics.

Liu, Hugo, and Singh, Push. 2004. ConceptNet – A practical commonsense reasoning tool-kit. BT Technology Journal, 22(4), 211–226.

MacWhinney, Brian. 2000. The CHILDES Project: Tools for Analyzing Talk. Mahwah, NJ: Lawrence Erlbaum Associates.

Mandera, Pawel, Keuleers, Emmanuel, and Brysbaert, Marc. in press. Explaining human performance in psycholinguistic tasks with models of semantic similarity based on prediction and counting: A review and empirical validation. Journal of Memory and Language.

Manning, Christopher D., and Schütze, Hinrich. 1999. Foundations of Statistical Natural Language Processing. Cambridge, MA: MIT Press.

McDonald, Scott, and Brew, Chris. 2004. A Distributional Model of Semantic Context Effects in Lexical Processing. Pages 17–24 of: Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics.

Mikolov, Tomas, Chen, Kai, Corrado, Greg, and Dean, Jeffrey. 2013. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781.

Mitchell, Thomas M. 1997. Machine Learning. New York: McGraw-Hill.

Moor, James H. 2003. Turing test. Pages 1801–1802 of: Encyclopedia of Computer Science. Chichester, UK: John Wiley and Sons.

Mota, Natalia B., Vasconcelos, Nivaldo A. P., Lemos, Nathalia, Pieretti, Ana C., Kinouchi, Osame, Cecchi, Guillermo A., Copelli, Mauro, and Ribeiro, Sidarta. 2012. Speech graphs provide a quantitative measure of thought disorder in psychosis. PLoS ONE, 7(4), 1–9.

Nelson, Douglas L., McEvoy, Cathy L., and Schreiber, Thomas A. 2004. The University of South Florida free association, rhyme, and word fragment norms. Behavior Research Methods, Instruments, & Computers, 36(3), 402–407.

Nematzadeh, Aida, Fazly, Afsaneh, and Stevenson, Suzanne. 2013. Child acquisition of multiword verbs: A computational investigation. Pages 235–256 of: Villavicencio, A., Poibeau, T., Korhonen, A., and Alishahi, A. (eds), Cognitive Aspects of Computational Language Acquisition. Berlin: Springer.

Nirenburg, Sergei, and Raskin, Victor. 2004. Ontological Semantics. Cambridge, MA: MIT Press.

Niyogi, Partha, and Berwick, Robert C. 1996. A language learning model for finite parameter spaces. Cognition, 61(1–2), 161–193.

Padó, Sebastian, and Lapata, Mirella. 2007. Dependency-based construction of semantic space models. Computational Linguistics, 33(2), 161–199.

Pantel, Patrick, and Lin, Dekang. 2002. Discovering Word Senses from Text. Pages 613–619 of: Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.

Parisien, Christopher, and Stevenson, Suzanne. 2010. Learning verb alternations in a usage-based Bayesian model. In: Proceedings of the 32nd Annual Conference of the Cognitive Science Society.

Parisien, Christopher, Fazly, Afsaneh, and Stevenson, Suzanne. 2008. An incremental Bayesian model for learning syntactic categories. Pages 89–96 of: Proceedings of the Twelfth Conference on Computational Natural Language Learning.

Pearl, Lisa. 2005. The Input for Syntactic Acquisition: Solutions from Language Change Modeling. Pages 1–9 of: Proceedings of the Workshop on Psychocomputational Models of Human Language Acquisition.

Pearl, Lisa, Goldwater, Sharon, and Steyvers, Mark. 2010. How ideal are we? Incorporating human limitations into Bayesian models of word segmentation. Pages 315–326 of: Proceedings of the 34th Annual Boston University Conference on Child Language Development. Somerville, MA: Cascadilla Press.

Perfors, Amy, Tenenbaum, Joshua B., and Wonnacott, Elizabeth. 2010. Variability, negative evidence, and the acquisition of verb argument constructions. Journal of Child Language, 37(3), 607–642.

Phillips, Lawrence, and Pearl, Lisa. 2014. Bayesian inference as a viable cross-linguistic word segmentation strategy: It's all about what's useful. In: Proceedings of the 36th Annual Meeting of the Cognitive Science Society.

Rissanen, J. 1989. Stochastic Complexity in Statistical Inquiry. Series in Computer Science, vol. 15. World Scientific.

Rumelhart, D. E., McClelland, J. L., and the PDP Research Group. 1986. Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vol. 2, Psychological and Biological Models. Cambridge, MA: MIT Press.

Sagae, Kenji, MacWhinney, Brian, and Lavie, Alon. 2004. Adding Syntactic Annotations to Transcripts of Parent-Child Dialogs. In: Proceedings of the Fourth International Conference on Language Resources and Evaluation.

Sowa, John (ed). 1991. Principles of Semantic Networks: Explorations in the Representation of Knowledge. San Mateo, CA: Morgan Kaufmann Publishers.

Steedman, Mark. 2012. Probabilistic Models of Grammar Acquisition. Pages 19–29 of: Proceedings of the Workshop on Computational Models of Language Acquisition and Loss.

Steyvers, Mark, and Tenenbaum, Joshua B. 2005. The large-scale structure of semantic networks: Statistical analyses and a model of semantic growth. Cognitive Science, 29, 41–78.

Tiedemann, Jörg. 2011. Bitext Alignment. Synthesis Lectures on Human Language Technologies. Morgan & Claypool Publishers.

Turing, Alan. 1950. Computing machinery and intelligence. Mind, 236, 433–460.

Turney, Peter D., and Pantel, Patrick. 2010. From Frequency to Meaning: Vector Space Models of Semantics. J. Artif. Int. Res., 37(1), 141–188.

Valiant, Leslie. 2013. Probably Approximately Correct: Nature's Algorithms for Learning and Prospering in a Complex World. New York: Basic Books, Inc.

Villavicencio, Aline. 2002. The acquisition of a unification-based generalised categorial grammar. PhD thesis, University of Cambridge.

Villavicencio, Aline. 2011. Language acquisition with a unification-based grammar. In: Borjars, K., and Borsley, R. (eds), Non-transformational Syntax: Formal and Explicit Models of Grammar. Chichester, West Sussex, UK; Malden, MA: Wiley-Blackwell.

Villavicencio, Aline, Yankama, Beracah, Wilkens, Rodrigo, Idiart, Marco, and Berwick, Robert. 2012. An annotated English child language database. Pages 23–25 of: Proceedings of the Workshop on Computational Models of Language Acquisition and Loss.

Villavicencio, Aline, Idiart, Marco, Berwick, Robert, and Malioutov, Igor. 2013. Language Acquisition and Probabilistic Models: Keeping It Simple. Pages 1321–1330 of: Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers).

Wanner, Eric, and Gleitman, Lila R. (eds). 1982. Language Acquisition: The State of the Art. Cambridge, UK: Cambridge University Press.

Watts, Duncan J., and Strogatz, Steven H. 1998. Collective dynamics of ‘small-world’ networks. Nature, 393(June), 440–442.

Weizenbaum, Joseph. 1966. ELIZA – A Computer Program for the Study of Natural Language Communication Between Man and Machine. Commun. ACM, 9(1), 36–45.

Wintner, Shuly. 2010. Computational Models of Language Acquisition. Pages 86–99 of: Gelbukh, Alexander (ed), Computational Linguistics and Intelligent Text Processing: 11th International Conference, CICLing 2010.

Yang, Charles. 2004. Universal grammar, statistics or both? Trends in Cognitive Sciences, 8(10), 451–456.

Yang, Charles. 2013. Ontogeny and phylogeny of language. Proceedings of the National Academy of Sciences, 110(16), 6324–6327.