
Computational cognitive modeling for syntactic acquisition: Approaches that integrate information from multiple places

Published online by Cambridge University Press:  13 June 2023

Lisa PEARL*
Affiliation:
University of California, Irvine

Abstract

Computational cognitive modeling is a tool we can use to evaluate theories of syntactic acquisition. Here, I review several models implementing theories that integrate information from both linguistic and non-linguistic sources to learn different types of syntactic knowledge. Some of these models additionally consider the impact of factors coming from children’s developing non-linguistic cognition. I discuss some existing child behavioral work that can inspire future model-building, and conclude by considering more specifically how to build better models of syntactic acquisition.

Type
Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2023. Published by Cambridge University Press

Introduction

About computational cognitive modeling for syntactic acquisition

One tool we can use to understand how syntactic acquisition works is computational cognitive modeling. The computational part refers to implementing an idea (that is, a theory) very precisely, typically using mathematical techniques that are carried out on computers. The cognitive part refers to what the implemented ideas are about, which is some part of human cognition. The modeling part refers to the theory itself, which captures (i.e., models) some aspect of cognition (here: syntactic acquisition). With this tool of computational cognitive modeling, we can then make a theory about syntactic acquisition concrete enough to evaluate, because the computational cognitive model allows us to generate predictions about children’s syntactic behavior that can be evaluated. That is, when we have a computational cognitive model for syntactic acquisition, we have a theory about syntactic acquisition that is implemented precisely enough to evaluate against empirical data.

Importantly, the computational cognitive model serves as a “proof of concept” for a theory. When the model generates predictions that match human behavior (e.g., children’s syntactic behavior), this is proof there is at least one way the theory could explain human behavior – which is the way the theory was implemented in the computational cognitive model. An important limitation of computational cognitive modeling is that modeling success (or failure) can only be interpreted with respect to the specific theory implemented by the model. That is, if the model succeeds at matching human behavior, we can only interpret this success as success of that specific implementation of that acquisition theory – we have nothing to say about other implementations of this particular theory, or other theories not implemented in the model. The same is true for interpreting model failure: failure is only demonstrated for that specific theory implementation. If we want to evaluate some other theory implementation, we need to build another model and see how it does. See Pearl (2014, in press) for more detailed discussion about how to interpret computational cognitive model success (and failure).

Implementing a theory in a computational cognitive model

When we have a theory of syntactic acquisition, how do we implement it in a computational cognitive model? Implementing the model involves several key aspects. First, the model needs to encode relevant prior knowledge and learning abilities the child is supposed to have at this stage of development. This knowledge and these abilities are often assumed implicitly by the acquisition theory. For instance, a syntactic acquisition theory might assume prior knowledge of individual words in the language and the ability to segment speech reliably from the input.

Second, the model needs to learn from realistic input. For instance, a model meant to capture syntactic acquisition behavior that occurs at age four should ideally learn from input that children encounter by age four.

Third, the model needs to output predictions that connect in some interpretable way to children’s behavior. For instance, a model might predict whether a child at age four would treat two verbs as being syntactically the same (i.e., appearing in the same syntactic contexts and having the same interpretations of their arguments).

Fourth, the model needs to encode learning, which is how the modeled child uses the information from the input to update hypotheses about syntax. Learning is typically the main component specified by the acquisition theory. For instance, a model might attend to the distribution of certain features of the input viewed as relevant (e.g., animacy of verb arguments, syntactic contexts a verb appears in), and then use probabilistic inference to group verbs together that seem similar enough with respect to those relevant features.

So, to sum up, implementing an acquisition theory in a computational cognitive model involves encoding the acquisition theory assumptions (i.e., the prior knowledge assumed, the learning abilities assumed, and how learning proceeds), learning from realistic input estimates, and generating interpretable output that can be evaluated against empirical data from children. This is an approach that the models reviewed below have taken for investigating syntactic acquisition.
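
To make these ingredients concrete, here is a minimal sketch (in Python) of how such a model might be organized. The class and method names are purely illustrative placeholders rather than components of any particular published model.

    # A minimal, illustrative skeleton for an acquisition model (all names are hypothetical).
    from dataclasses import dataclass, field

    @dataclass
    class AcquisitionModel:
        # (1) Prior knowledge and learning abilities the theory assumes the child has.
        prior_knowledge: dict = field(default_factory=dict)   # e.g., known words, segmentation ability
        # Internal hypotheses about syntax, updated as learning proceeds.
        hypotheses: dict = field(default_factory=dict)

        def learn(self, child_directed_input):
            # (2) Learn from realistic input: iterate over estimated child-directed data,
            # and (4) update hypotheses the way the acquisition theory specifies.
            for utterance in child_directed_input:
                features = self.encode(utterance)        # what the theory says the child extracts
                self.update_hypotheses(features)         # e.g., probabilistic inference

        def predict_behavior(self, test_item):
            # (3) Generate interpretable output that can be compared with children's behavior,
            # e.g., whether two verbs are treated as syntactically the same.
            raise NotImplementedError

        def encode(self, utterance):
            raise NotImplementedError

        def update_hypotheses(self, features):
            raise NotImplementedError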

Road map

I will focus on computational cognitive models of syntactic acquisition that integrate information from multiple places, including both linguistic and non-linguistic sources of information. That is, the syntactic acquisition theories implemented by these models assume that syntactic learning proceeds by children attending to information from these different sources, rather than solely syntactic sources. Why discuss this kind of model? To me, these models seem more realistic because children are surrounded by many different types of information and have many different learning goals simultaneously. That is, children do not ever only learn about syntax; instead, they learn about syntax and about who is likely to give them a hug and about how to communicate their desire for more milk, among many other things. So, non-syntactic sources of information may be particularly salient in any given moment while children are learning about syntax; if these sources of information happen to be helpful for learning about syntax, then children may very well be able to harness those sources to do so.

Moreover, children are likely impacted by non-linguistic factors during acquisition. For instance, cognitive limitations on memory, attention, and executive control can affect how children perceive the information in their input, how they update their internal hypotheses, and how they generate their observable syntactic behavior. In addition, children likely rely on non-linguistic learning mechanisms to update their internal hypotheses, such as probabilistic inference. In fact, all the models of syntactic acquisition reviewed below rely on probabilistic inference, and so already incorporate this non-linguistic component into their theories of syntactic acquisition. [Footnote 1]

Here, as mentioned, I focus on syntactic acquisition models that also integrate information from non-syntactic sources. I should note that these are selected case studies in syntactic acquisition modeling from my own work, rather than capturing the full range of computational cognitive models that implement this type of syntactic acquisition theory. I first review three case studies, whose acquisition theories incorporate conceptual information such as the animacy of an event participant, participant event roles more generally, and components of lexical meaning. Some of these theories additionally incorporate non-linguistic cognitive limitations by implementing the impact of those limitations on input perception and hypothesis updating. I note that these theories are agnostic as to the specific source of the cognitive limitations (e.g., whether the source is developing knowledge, developing learning abilities, or something else); instead, the model captures the practical impact of the cognitive limitations on the acquisition process. These case studies involve the acquisition of syntactic knowledge about linking theories, the passive, and pronoun interpretation.

I then briefly review some existing child behavioral work that we can take inspiration from when it comes to building better computational cognitive models of syntactic acquisition. I also discuss more specifically how we can think about building better models, and how we can incorporate the insights from both the behavioral work reviewed and current modeling work. I conclude with a few other ideas for building better models of syntactic acquisition in the future.

Some modeling case studies in syntactic acquisition

For each of the modeling case studies below, I first describe the syntactic knowledge children are trying to acquire. I then describe the relevant aspects of the acquisition theory implemented in the computational cognitive model, including the prior theories the implemented theory builds on, which information sources are used, the form the information sources take, and how those sources are used to update the modeled child’s hypotheses. I explicitly highlight which information sources are non-syntactic, as relevant. I also describe the input to the model, how the model’s output is evaluated against empirical data from children’s behavior, and what we learned by using modeling this way.

Linking theories

The syntactic knowledge

One type of syntactic knowledge is how to interpret a verb’s arguments in context. For instance, consider this sentence: The little girl blicked the kitten on the stairs. Even if we do not know what blick means, we still prefer to interpret this sentence as the little girl doing something (blicking) to the kitten, and that event happening on the stairs. We as adults prefer this interpretation because we have linking theories that link the thematic roles specified by a verb’s lexical semantics (e.g., agent, patient, location) to the syntactic argument positions specified by that verb’s syntactic frame (e.g., subject, direct object, object of a preposition). Moreover, our linking theories are so well-developed that they can impose these links even when we do not know a verb’s specific lexical semantics (like here with blick).

Verbs can be grouped together into classes where the verbs in a class behave the same way with respect to the links between syntactic positions and thematic roles. That is, solving the linking problem (i.e., acquiring linking theories for the verbs of the language) involves learning how to link syntactic positions and thematic roles for different verbs; verb classes are collections of verbs that behave the same way for linking. For example, verbs with “subject-raising” behavior like appear and seem allow their subject to not have a thematic role. So, in Lindy seemed/appeared to hug the kitten, Lindy is not a “seemer” or an “appearer”, but rather a kitten-hugger. As another example, verbs with “unaccusative” behavior like fall and break have a patient in the subject position. So, in The toy kitten fell/broke, falling or breaking is happening to the toy kitten. As a third example, verbs with passivizable behavior like hug and break allow their subject to be a patient in the passive construction, while verbs like appear, seem, and fall do not. That is, The toy kitten was hugged/broken by Lindy, with hugging or breaking happening to the toy kitten, is acceptable. In contrast, The toy kitten was seemed/appeared/fallen by Lindy, with seeming, appearing, or falling happening to the toy kitten, is not acceptable.

These examples demonstrate that a verb class can involve many linking behaviors. Here, one verb class involving fall might be characterized as +unaccusative and -passivizable; another verb class involving break might be characterized as +unaccusative and +passivizable; a third verb class involving seem and appear might be characterized as +subject-raising and -passivizable. To learn what verbs belong together in a class, children must implicitly develop the linking theory for that verb class. This is why acquiring verb classes can be used as a measure of linking theory development. In short, if a child (and therefore a modeled child) can cluster verbs together into classes that behave the same linking-wise, then the child (real or modeled) can be said to have developed the relevant linking theory knowledge that leads to those verb classes.
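
For illustration, the verb classes just described could be summarized as bundles of linking behaviors, along the lines of the sketch below (a hypothetical encoding for exposition, not a representation from any specific model).

    # Illustrative only: verb classes characterized by linking behaviors (True = +, False = -).
    verb_classes = {
        "class_A": {"members": ["fall"],           "unaccusative": True, "passivizable": False},
        "class_B": {"members": ["break"],          "unaccusative": True, "passivizable": True},
        "class_C": {"members": ["seem", "appear"], "subject_raising": True, "passivizable": False},
    }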

The acquisition theory implemented in the model

Pearl and Sprouse (2019) proposed that children can cluster verbs into appropriate verb classes by paying attention to several pieces of information associated with verbs in their input: argument animacy, syntactic context, and link distribution. This verb information has been proposed by prior theories as (potentially) relevant (e.g., Becker, 2009, 2014, 2015; Becker & Estigarribia, 2013; Fisher, Gertner, Scott & Yuan, 2010; Gillette, Gleitman, Gleitman & Lederer, 1999; Gleitman, 1990; Gutman, Dautriche, Crabbé & Christophe, 2015; Harrigan, Hacquard & Lidz, 2016; Hartshorne, Pogue & Snedeker, 2015b; Kirby, 2009a, 2009b; Landau & Gleitman, 1985; Levin, 1993; Scott & Fisher, 2009). To see a concrete example of each information type, consider two of the utterances involving break from our examples: the unaccusative The toy kitten broke and the passive The toy kitten was broken by Lindy. First, the animacy of the verb’s arguments matters. For instance, a child would notice that The toy kitten is inanimate. Second, the syntactic contexts that a verb appears in matter. So, a child would notice that break appeared in an unaccusative context of the form Noun-Phrase Verb and a passive context Noun-Phrase was Verb Preposition Noun-Phrase. Third, the distribution of links between thematic roles and syntactic positions matters. Here, a child would notice that break has the following links in the two utterances above: two instances of Patient in subject position (from The toy kitten in both utterances) and one instance of Agent in the prepositional phrase position (from Lindy in the passive utterance).
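
As a concrete illustration, the three information types extracted from the two break utterances above might be encoded roughly as follows. This is a simplified, hypothetical encoding for exposition, not the exact representation used in the model.

    # Illustrative encoding of the three information types for two uses of "break".
    break_uses = [
        {   # "The toy kitten broke."
            "syntactic_context": "NP V",
            "argument_animacy": {"subject": "inanimate"},
            "links": [("Patient", "subject")],
        },
        {   # "The toy kitten was broken by Lindy."
            "syntactic_context": "NP was V P NP",
            "argument_animacy": {"subject": "inanimate", "prep_object": "animate"},
            "links": [("Patient", "subject"), ("Agent", "prep_object")],
        },
    ]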

Pearl and Sprouse made the idealizing assumption that children would have enough prior knowledge and sufficient learning abilities to accurately extract this information from any particular verb use they encountered. This assumption can be relaxed in future work (i.e., we can assume that children do not accurately extract information due to immature knowledge, immature learning abilities, or cognitive limitations more generally). However, this assumption of accurate extraction provides a simple starting point for theory evaluation via computational cognitive modeling, in the absence of a particular theory about how children may inaccurately extract information.

So, with this information extracted from the input [Footnote 2], children would then create verb classes by using Bayesian inference, a type of probabilistic learning shown to accord with a variety of developmental patterns across cognition (see Pearl, 2021 for a brief review). When using Bayesian inference, a learner updates hypotheses by balancing prior knowledge or biases against fit to the observed data. For learning verb classes, Pearl and Sprouse (2019) built in a standard type of prior knowledge for learning classes of any kind, namely that fewer classes are preferred. The fit to the observed data concerns the child’s input: given that the modeled child assumes a certain set of verb classes, how probable does that hypothesis make the information observed in the input about argument animacy, syntactic context, and link distribution? A verb class hypothesis that makes the observed information more probable is a better fit than a hypothesis that makes it less probable.

To better understand this idea of a hypothesis fitting the observed data, consider two verb class hypotheses involving seem and appear. The first hypothesis $H_1$ puts each verb in its own verb class ($H_1$: $class_1$ = {appear}, $class_2$ = {seem}); the second hypothesis $H_2$ puts both verbs together into one verb class ($H_2$: $class_1$ = {appear, seem}). Suppose the observed information the modeled child learns from comes from this utterance: Lindy appeared to be sad, but then she seemed to be happy.

In this utterance, the information from argument animacy, syntactic contexts, and link distributions is the same for appear and seem. Hypothesis $H_1$, which separates these verbs into different verb classes, views this similarity as a coincidence – similar verb behavior is not expected if verbs are in different classes. In contrast, hypothesis $H_2$, which puts these verbs into the same verb class, expects this similarity in verb behavior precisely because the verbs are in the same verb class. When a hypothesis’s expectations are met, it will find the observed information to be more probable and therefore be a better fit. So, $H_2$ will find the observed information to be more probable, and a modeled learner relying on Bayesian inference will prefer $H_2$ over $H_1$ as a better fit for the observed information.
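
A toy calculation can make this comparison concrete. The sketch below scores $H_1$ and $H_2$ with a prior that penalizes each additional class and a marginal likelihood under which verbs in the same class share one behavior distribution. The numbers, the behavior encoding, and the functional forms are illustrative assumptions, not those of the published model.

    import math
    from collections import Counter

    # Each verb's observed behavior profile in the example utterance (identical for seem/appear).
    observations = {"appear": "raising-frame+animate-subject",
                    "seem":   "raising-frame+animate-subject"}

    K = 10      # toy number of distinct behavior profiles a verb could show
    beta = 0.5  # toy smoothing: pseudo-count for each profile within a class

    def log_prior(classes, alpha=0.5):
        # Toy prior favoring fewer classes: each class costs log(alpha) < 0.
        return len(classes) * math.log(alpha)

    def log_likelihood(classes):
        # Marginal likelihood of the observed profiles, class by class: verbs in the same
        # class share one behavior distribution, so repeated profiles become more probable.
        total = 0.0
        for members in classes:
            counts = Counter()
            for i, verb in enumerate(members):
                profile = observations[verb]
                total += math.log((counts[profile] + beta) / (i + K * beta))
                counts[profile] += 1
        return total

    H1 = [["appear"], ["seem"]]     # each verb in its own class
    H2 = [["appear", "seem"]]       # both verbs in one class

    for name, hypothesis in [("H1", H1), ("H2", H2)]:
        print(name, round(log_prior(hypothesis) + log_likelihood(hypothesis), 3))

Under this toy setup, $H_2$ receives both a higher prior (one class instead of two) and a higher likelihood (the repeated behavior profile is expected within a shared class), mirroring the reasoning just described.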

Information integrated

The acquisition theory implemented in the model involves integrating several types of information: (i) animacy (non-linguistic), (ii) syntactic contexts (syntactic), and (iii) links between thematic roles (semantic) and syntactic positions (syntactic). These information sources are combined using the non-linguistic learning mechanism of Bayesian inference.

Model input

To generate predictions about verb classes that English-learning children would have, the model learned from verb uses in English child-directed speech samples. Pearl and Sprouse estimated how many verb uses children at different ages (three, four, and five) would encounter, and implemented models that learned from these same quantities. So, for instance, the three-year-old modeled child learned from the number of verb uses a three-year-old English-learning child would encounter, distributed according to the samples of speech directed to English-learning children up to age three.

Model output and evaluation

To evaluate a modeled child, Pearl and Sprouse compared the verb classes predicted by the modeled child against verb classes that children of the appropriate age seem to have. More specifically, Pearl and Sprouse used 12 types of syntactic or interpretation behavior surveyed from a large collection of child behavioral studies in order to identify verb classes that three-, four-, and five-year-old English children have. These behaviors included subject-raising, unaccusative, and passivizable, among others. From these verb behaviors at ages three to five, Pearl and Sprouse derived age-specific verb classes that a modeled child should attempt to match when it learns from the same data that three-, four-, or five-year-olds learn from. In particular, verbs in the same class are treated the same by children of that age (i.e., the verbs either have or do not have a specific syntactic or interpretation behavior, such as being passivizable). So, the modeled child of that age should cluster those verbs together if it has learned the way children of that age learn.
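
One simple way to quantify this kind of match is pairwise agreement over verbs: for each pair of verbs, do the model’s classes and the child-derived classes agree on whether the two verbs belong together? The sketch below only illustrates this evaluation logic, with hypothetical class assignments; it is not the evaluation metric reported in the published work.

    from itertools import combinations

    def pairwise_agreement(predicted, target):
        # Fraction of verb pairs on which two clusterings agree about
        # "same class" vs. "different class". Illustrative metric only.
        verbs = sorted(set(predicted) & set(target))
        agree = total = 0
        for v1, v2 in combinations(verbs, 2):
            total += 1
            if (predicted[v1] == predicted[v2]) == (target[v1] == target[v2]):
                agree += 1
        return agree / total if total else float("nan")

    # Hypothetical class assignments, for illustration only.
    predicted = {"hug": 0, "break": 0, "fall": 1, "seem": 2, "appear": 2}
    target    = {"hug": 0, "break": 1, "fall": 1, "seem": 2, "appear": 2}
    print(pairwise_agreement(predicted, target))   # 0.8 for these toy assignments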

Pearl and Sprouse found that their modeled three-, four-, and five-year-olds were able to generate verb classes that matched English-learning children’s verb classes fairly well.

What we learned

The model’s success at matching available empirical data from children supports the acquisition theory implemented in the model, and suggests that children may indeed be learning from these different information types when developing the linking theory knowledge that leads to their observable verb classes. More specifically, the way English-learning children cluster verbs together during syntactic acquisition aligns with them learning not just from syntactic information (e.g., syntactic contexts), but also from non-syntactic information (e.g., animacy and thematic roles).

Passives

The syntactic knowledge

As mentioned above, the passive structure in English allows the subject to be a patient. For instance, in The toy kitten was broken by Lindy, The toy kitten is the one being broken. So, this sentence seems to have a structure more like The toy kitten was broken _The toy kitten by Lindy, where _The toy kitten marks the position where The toy kitten is understood (as the object of break).

Children then need to learn that this interpretation is possible, which involves understanding where the element in the subject position is understood (in this case, a position where it can serve as Patient). Importantly, not all verbs passivize: recall that The toy kitten was fallen is not acceptable to English speakers (i.e., fall doesn’t passivize). So, a key learning problem is to learn which verbs in English can passivize (i.e., which verbs allow the passive structure and related interpretation with the subject as Patient).

Interestingly, there seems to be significant variation in English for when children realize certain verbs are passivizable. Some verbs, such as hug, are recognized as passivizable as young as age three, while others, such as love, appear delayed until after age five. Moreover, verb meaning (i.e., the lexical semantics) seems to matter. For instance, hug is an observable action, and love is not; love is a “psych subject-experiencer” verb where the subject experiences the psychological state described (love), while hug is not a psychological verb at all. These and other lexical semantic features have been proposed to impact when English-learning children learn that specific verbs are passivizable (see Nguyen & Pearl, 2021 for a review of the acquisition trajectory and proposed lexical semantic features).

In addition, the syntactic feature of transitivity has been proposed as a key indicator that a verb is likely passivizable in English (Levin, 1993). A transitive syntactic context has a subject and direct object, as in Lindy broke the toy kitten, with Lindy as the subject and the toy kitten as the direct object. So, verbs that allow a transitive context, like break, are likely to be passivizable in English.

The acquisition theory implemented in the model

Nguyen and Pearl (2019) proposed that children decide whether a verb is passivizable on the basis of two things. First, children consider several of the verb’s lexical semantic features (like being observable or a psych subject-experiencer verb) and potentially the syntactic feature of transitivity, as proposed by prior acquisition theories (Liter, Huelskamp, Weerakoon & Munn, 2015; Maratsos, Fox, Becker & Chalkley, 1985; Pinker, Lebeaux & Frost, 1987; Levin, 1993; Messenger, Branigan, McLean & Sorace, 2012). Second, children consider how often verbs with those features are passivized in their input. Information about a verb’s features is integrated via Bayesian inference.

As with the Pearl and Sprouse model, Nguyen and Pearl made the idealizing assumption that children would have enough prior knowledge and sufficient learning abilities to accurately extract this information from any particular verb use they encountered. As mentioned before, this assumption of accurate extraction provides a simple starting point for theory evaluation via cognitive modeling, in the absence of a particular theory about how children may inaccurately extract information.

As before, Bayesian inference balances prior knowledge or biases against fit to the observed data. Here, the prior captures how easy (or difficult) it is for children to deploy their knowledge of the passive in the moment, which can be impacted by immature cognitive development. That is, even if a child knows a specific verb is passivizable, she might not be able to access the passive structure appropriately in the moment after hearing the verb in the passive. So, she might not use her syntactic knowledge of the passive structure for that verb instance.

The fit to the observed data is again about the child’s input. In particular, the modeled child assumes passivization is based on a verb’s features and the frequencies of those features in passive forms. The question is then how probable that assumption makes the information observed in the input, namely how often verbs with certain features appear in the passive. The more probable a hypothesis makes the verb uses observed in the input, the better that hypothesis fits the observed data.

Importantly, the modeled child can heed or ignore any given feature when deciding if a particular verb is passivizable. So, for instance, a five-year-old might ignore whether a verb is an observable action, and instead key into whether it encodes a psychological state. The acquisition theory implemented in the model of Nguyen and Pearl explored theories of selective learning for the English passive (i.e., selectively ignoring available information when deciding if a verb is passivizable).
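
As a schematic illustration of this selective learning idea (not the implementation of the published model), a learner might track, for just the features it attends to, how often verbs with a given feature profile appear in the passive, and then use a smoothed estimate to decide whether a new verb with that profile is passivizable. All records, feature names, and numbers below are hypothetical.

    from collections import defaultdict

    # Hypothetical input records: each verb use has feature values and a flag for passive use.
    verb_uses = [
        {"verb": "hug",  "transitive": True,  "psych_experiencer_subject": False, "passive": True},
        {"verb": "hug",  "transitive": True,  "psych_experiencer_subject": False, "passive": False},
        {"verb": "love", "transitive": True,  "psych_experiencer_subject": True,  "passive": False},
        {"verb": "fall", "transitive": False, "psych_experiencer_subject": False, "passive": False},
    ]

    attended = ["transitive"]   # selective learning: some available features are ignored

    def profile(use):
        return tuple(use[feature] for feature in attended)

    passive_counts = defaultdict(lambda: [0, 0])   # profile -> [passive uses, total uses]
    for use in verb_uses:
        passive_counts[profile(use)][0] += int(use["passive"])
        passive_counts[profile(use)][1] += 1

    def p_passivizable(use, prior_strength=1.0, prior_rate=0.5):
        # Smoothed estimate balancing a weak prior against the observed passive rate
        # for verbs sharing this feature profile (a Bayesian-flavored, illustrative rule).
        passives, total = passive_counts[profile(use)]
        return (passives + prior_strength * prior_rate) / (total + prior_strength)

    print(p_passivizable({"verb": "break", "transitive": True, "psych_experiencer_subject": False}))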

Information integrated

The information integrated via Bayesian inference is the selected features of a verb (syntactic and lexical semantic), whatever those happen to be. Notably, these features will be the ones children attend to for all the verbs of the language (rather than a feature set for each verb or type of verb). So, the acquisition theory assumes both syntactic and non-syntactic information is relevant. These information sources are then combined using the non-linguistic learning mechanism of Bayesian inference.

Model input

The model learned from verb uses in English child-directed speech samples, both passive uses like The toy kitten was broken and active uses like The toy kitten broke.

Model output and evaluation

To evaluate a modeled learner attending to some set of features, Nguyen and Pearl looked at the age when children have been observed to correctly interpret or produce the passive of a verb more than half the time in previous child behavioral experiments. They called this age the age of acquisition (AoA) for the passive of that verb, and used the AoA of 30 verbs as a model target. They focused on age five, and therefore split the 30 verbs into verbs whose AoA was five or younger versus verbs whose AoA was older.

The modeled learner predicts whether a specific verb is passivizable at a certain age, on the basis of its input. So, the modeled five-year-old learned from the distribution of verb input that English-learning five-year-olds encounter and predicted which verbs would be passivizable. Nguyen and Pearl found that a modeled five-year-old who ignored many of the available features was able to match the behavior of English-learning five-year-olds, and passivize the subset of verbs whose AoA was five or younger. This modeled child instead focused on the syntactic feature of transitivity and a single lexical semantic feature. [Footnote 3]

What we learned

These modeling results suggest that English five-year-old passivization behavior can be captured if five-year-olds selectively attend to these syntactic and lexical semantic features in their input.

Pronoun interpretation

The syntactic knowledge

Consider this English sentence: Lisa sang to the triplets and then Pronoun took a nap. How we interpret Pronoun depends on several factors. One is agreement information: If the pronoun is the singular she, we look for a singular antecedent like Lisa; if the pronoun is the plural they, we look for a plural antecedent like the triplets. Another factor is our discourse-level knowledge about the lexical items that connect the two clauses together, such as and then. In languages like Spanish, the equivalent to and then biases the interpretation towards the subject Lisa rather than the object the triplets. Another factor in languages like Spanish is whether the pronoun is overt (i.e., pronounced) or not. Spanish is a language that allows the pronoun not to be pronounced; when it is not pronounced, the subject (e.g., Lisa) tends to be favored as the pronoun’s antecedent (see Pearl and Forsythe, 2022 for a brief overview of these factors in pronoun interpretation). Children need to learn how to interpret pronouns of their language in context, taking these factors (and others) into account the way adult speakers of their language do.

The acquisition theory implemented in the model

Pearl and Forsythe (Forsythe & Pearl, 2020; Pearl & Forsythe, 2022) proposed that Spanish-learning children decide how to interpret a pronoun in context by potentially considering information from their input about agreement, lexical connective items, and whether the pronoun is overt. Pearl and Forsythe based their proposal on prior theories that highlight the usefulness of this information for pronoun interpretation (e.g., Asher & Lascarides, 2003; Brandt-Kobele & Höhle, 2010; Clahsen, Aveledo & Roca, 2002; González-Gómez, Hsin, Barriere, Nazzi & Legendre, 2017; Hartshorne, Nappa & Snedeker, 2015a; Johnson, de Villiers & Seymore, 2005; Legendre et al., 2014; Pérez-Leroux, 2005; Pyykkönen, Matthews & Järvikivi, 2010; Soderstrom, 2002; Song & Fisher, 2005, 2007). In Pearl and Forsythe’s implementation, these information sources are integrated via Bayesian inference.

Pearl and Forsythe considered two options for how accurately children extract this information from their input. One option was that the modeled child has enough prior knowledge and sufficient learning abilities to accurately extract this information, similar to the two models discussed before. The other option was that the modeled child does not, and in fact would inaccurately represent this information (for whatever reason: immature knowledge, immature learning abilities, and/or cognitive limitations more generally). More specifically, the modeled child would skew the probability distributions observed in the input about these information sources (e.g., how often singular agreement information occurs when the pronoun’s antecedent is singular). In particular, a modeled child with inaccurate representations of the information in the input could flatten a distribution (e.g., turning a 30/70 distribution into a 40/60 distribution) or sharpen a distribution (e.g., turning a 30/70 distribution into a 20/80 distribution).
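
One simple way to implement this kind of skew is a power transform on the observed probabilities, with exponents below 1 flattening the distribution and exponents above 1 sharpening it. This is offered purely as an illustration of the idea, not necessarily the transformation used in the published model.

    def skew(distribution, exponent):
        # Flatten (exponent < 1) or sharpen (exponent > 1) a probability distribution.
        powered = {outcome: p ** exponent for outcome, p in distribution.items()}
        total = sum(powered.values())
        return {outcome: p / total for outcome, p in powered.items()}

    observed = {"subject_antecedent": 0.3, "object_antecedent": 0.7}
    print(skew(observed, 0.5))   # flattened: roughly a 40/60 distribution
    print(skew(observed, 1.5))   # sharpened: roughly a 20/80 distribution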

As before, Bayesian inference balances prior knowledge or biases against fit to the observed data. Here, the prior encodes how often a pronoun preferred a particular antecedent in children’s input, irrespective of any other useful information about how to interpret that pronoun. The fit to the observed data is about how often each information type occurs in children’s input when a pronoun has a particular interpretation. If certain information (e.g., singular agreement information) almost always occurs when a pronoun’s antecedent is interpreted a certain way (e.g., a singular antecedent), then using that highly-reliable information to interpret the pronoun is a good fit.

Pearl and Forsythe also considered two options for how accurately children perform this inference in the moment of deciding a pronoun’s interpretation. One option was that the modeled child would use all the information sources when performing the Bayesian inference calculation. The other option was that the modeled child would ignore one or more information sources when performing that inference calculation (for whatever reason: immature knowledge, immature learning abilities, and/or cognitive limitations more generally).

So, to sum up, Pearl and Forsythe modeled two types of children. The first type was a modeled child without cognitive limitations, able to (i) accurately extract and represent the probability distributions from the information sources in the input, and (ii) always use those represented probabilities during the Bayesian inference calculation. The second type was a modeled child with cognitive limitations (of whatever kind) that affected (i) the accurate representation of information in the input, (ii) the use of all that information in the Bayesian inference calculation, or (iii) both. In particular, irrespective of the source of inaccurate information representations or inaccurate use of those representations, the modeled child could represent information inaccurately, use that information inaccurately, or both. Thus, the models of Pearl and Forsythe considered certain theories for children’s pronoun interpretation behavior that involve cognitive limitations; the effect of those limitations is to impact either the representation of information from the input, the use of that information when deciding a pronoun’s interpretation in context, or both.
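
The sketch below illustrates the shape of this inference: a prior over antecedents is combined with the likelihood of each attended cue, and a limited learner may simply drop some cues from the calculation. The probabilities, cue names, and independence assumptions are illustrative only, not the published model’s actual parameters.

    # Illustrative only: toy probabilities for Spanish-style pronoun interpretation.
    prior = {"subject": 0.6, "object": 0.4}   # baseline antecedent preference in the input

    # P(cue value | antecedent) for three cue types (toy numbers).
    likelihoods = {
        "agreement_singular": {"subject": 0.9, "object": 0.5},
        "connective_y_luego": {"subject": 0.7, "object": 0.3},
        "pronoun_null":       {"subject": 0.8, "object": 0.4},
    }

    def interpret(observed_cues, ignored=()):
        # Posterior over antecedents; cues in `ignored` are dropped, modeling a learner
        # who does not use all of the (accurately represented) information in the moment.
        scores = dict(prior)
        for cue in observed_cues:
            if cue in ignored:
                continue
            for antecedent in scores:
                scores[antecedent] *= likelihoods[cue][antecedent]
        total = sum(scores.values())
        return {antecedent: score / total for antecedent, score in scores.items()}

    cues = ["agreement_singular", "connective_y_luego", "pronoun_null"]
    print(interpret(cues))                                  # uses all cues
    print(interpret(cues, ignored={"connective_y_luego"}))  # selectively ignores one cue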

Information integrated

The information integrated via Bayesian inference is linguistic: agreement information (morphology), the lexical connectives between clauses (lexical), and whether the pronoun is pronounced (syntactic/phonological). These information sources are then combined using the non-linguistic learning mechanism of Bayesian inference. The way the information is combined can be mediated by non-linguistic factors arising from cognitive limitations: misrepresenting the information from the input and/or not using select information during Bayesian inference.

Model input

The modeled child learned from pronoun uses in Spanish speech samples involving children. These pronoun uses involved two clauses and had the pronoun as the subject of the second clause (e.g., [Lisa sang to the triplets]$_{clause_1}$ and then [Pronoun took a nap]$_{clause_2}$).
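
For concreteness, each such input item might be coded along the lines of the hypothetical record below; the field names and coding scheme are my own illustration rather than the scheme used in the published work.

    # Hypothetical coding of one two-clause input item (for illustration only).
    input_item = {
        "clause_1": "Lisa sang to the triplets",
        "clause_2": "(Pronoun) took a nap",
        "pronoun_overt": False,          # Spanish allows the pronoun to be unpronounced
        "connective": "y luego",         # "and then"
        "agreement": "3rd-singular",
        "antecedent": "subject",         # the antecedent the pronoun was used to refer to
    }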

Model output and evaluation

Pearl and Forsythe evaluated modeled children that attended to this set of linguistic features and potentially had cognitive limitations impacting information representation and/or use. The modeled children generated predictions for how to interpret pronouns that Spanish-learning children ages three to five had interpreted in different experimental contexts involving information about agreement, lexical connectives, and whether the pronoun was pronounced.

Pearl and Forsythe found that modeled three-, four-, and five-year-olds were able to best match the interpretation preferences of actual three-, four-, and five-year-olds when cognitive limitations impacting either information representation or information use (but not both) were active. That is, children’s interpretation behavior could be captured by integrating information from agreement, lexical connectives, and whether the pronoun was pronounced, as long as children either (i) always mis-perceived information from these sources in the input, leading to inaccurate information, or (ii) often ignored accurate information from these sources when deciding how to interpret a pronoun in the moment. Importantly, children’s behavior wasn’t captured as well if the modeled child had both effects (inaccurate information that was often ignored) or neither effect (accurate information that was never ignored).

What we learned

These modeling results thus offer specific explanations about how cognitive limitations (whatever their specific source happens to be) could impact children’s pronoun interpretation preferences, if children rely on these linguistic information sources.

Some experimental work to take inspiration from

I now briefly turn to some work from child behavioral experiments that can provide inspiration for other factors we might want to consider (or consider further) for syntactic acquisition. The first set of experiments involves cognitive limitations, while the second involves knowledge about pragmatics and the world more generally.

Cognitive limitations

The model of Forsythe and Pearl highlighted one effect that cognitive limitations could have on children’s acquisition (syntactic or otherwise): children have adult-like knowledge but can’t deploy it effectively in the moment. Several child behavioral experiments have been interpreted as demonstrating this effect for syntactic acquisition [Footnote 4], including Gerard, Lidz, Zuckerman, and Pinto (2018), Ud Deen, Bondoc, Camp, Estioca, Hwang, Shin, Takahashi, Zenker, and Zhong (2018), and Liter, Grolla, and Lidz (2022).

In Gerard et al. (2018), four- and five-year-old English-learning children were asked to interpret utterances with unpronounced subject pronouns in the second clause, like Dora washed Diego before eating a red apple. An adult-like interpretation is that Dora is the one eating a red apple, so the syntactic representation is something like this: Dora washed Diego before Pronoun Dora eating a red apple. Children were asked to interpret this kind of utterance in tasks that were either more or less cognitively-demanding. A more cognitively-demanding task might involve children having to hold additional information in mind and also evaluate whether the utterance itself is true; a less cognitively-demanding task would involve children simply indicating their interpretation by coloring a picture of the appropriate interpretation (i.e., Dora eating the apple, rather than Diego). [Footnote 5] When children had to do the more cognitively-demanding task – and so use up more cognitive resources on something besides interpreting the unpronounced pronoun – they gave more non-adult-like interpretations (e.g., Diego eating the apple). In contrast, when children did the less cognitively-demanding task – and so focused more cognitive resources on interpreting the unpronounced pronoun – they gave more adult-like interpretations (e.g., Dora eating the apple). One way to interpret these results is that four- and five-year-olds have adult-like knowledge of how to interpret these unpronounced pronouns, but cannot always use that knowledge in the moment when their cognitive resources are being used up by other things. This idea aligns broadly with the Forsythe and Pearl modeled children who cannot accurately use their information about pronoun interpretation in the moment.

Another example comes from Ud Deen et al. (2018) on children’s interpretation of the passive. English-learning four-year-olds correctly interpreted passives like Elephant was surprised by Monkey more often when the utterance was simply repeated. One interpretation of this finding is that children can adjust their mistaken expectations about the thematic role associated with the subject (i.e., that Elephant is not the surprise-causer but instead the surprise-experiencer) when they hear the sentence again because they know they made a mistake the first time. That is, children can inhibit the incorrect thematic role assignment of Elephant because they know it will not be correct. However, the first time children hear the utterance, they do not know this and so they make an incorrect assignment (e.g., of Elephant as surprise-causer), which is hard for them to adjust afterwards. In other words, children have adult-like knowledge about how to interpret the passive, but cannot use it effectively when their cognitive inhibition ability is not strong enough. So, more broadly, this child behavior was interpreted as domain-general cognitive factors like immature cognitive inhibition impacting children’s ability to use their knowledge of the passive.

A third example comes from Liter et al. (2022), and also involves immature cognitive inhibition, this time impacting children’s production of questions involving wh-words like where. More specifically, English-learning children will sometimes produce “medial wh” questions that seem to duplicate the wh-word, with an extra copy appearing in the middle, such as Where do you think where they were walking? Liter et al. (2022) found that children’s production of medial-wh questions correlated with a measure of their cognitive inhibition abilities. One way to interpret this is that children do in fact know that English does not allow medial wh, but children simply lack the cognitive control sometimes to inhibit the extra wh-word from being produced in the moment. As with the passive example above, this result highlights that acquisition theories (and therefore the computational cognitive models we build to explain children’s behavior) need to consider the non-linguistic systems controlling cognitive inhibition in children.

Pragmatics and world knowledge

Other sources of information children could harness involve knowledge about how speakers use their language (i.e., pragmatic knowledge) and knowledge about the world more generally. We already have behavioral evidence that children can rely on these information sources during syntactic acquisition, such as when learning to interpret pronouns (e.g., Hartshorne et al., 2015a; Pyykkönen et al., 2010; Song & Fisher, 2005, 2007; Wykes, 1981; among others).

As one example of pragmatic knowledge with pronouns, consider the sentence Lisa sang to Lindy and then she took a nap. The pronoun she could refer to either Lisa or Lindy, but adults know that speakers like to have clauses refer to the same topic (Asher & Lascarides, 2003). This leads to a “first-mention bias”, where the element first mentioned (e.g., the subject Lisa) is the topic and listeners prefer a subsequent pronoun to refer to that first-mentioned element (Crawley, Stevenson & Kleinman, 1990; Arnold, Eisenband, Brown-Schmidt & Trueswell, 2000; Järvikivi, van Gompel, Hyönä & Bertram, 2005). English-learning children ages three to five also seem to have this pragmatic knowledge, leading to a first-mention bias in a variety of contexts (Song & Fisher, 2005, 2007; Pyykkönen et al., 2010; Hartshorne et al., 2015a).

As one example of world knowledge with pronouns, consider this sentence pair: Jane needed Susan’s pencil. She gave it to her. Knowledge about how the world works allows listeners to pick situationally-appropriate interpretations (e.g., Hobbs, 1979; Kehler, Kertz, Rohde & Elman, 2008). Here, if Jane needs a pencil, she cannot already have one, so she cannot be the one to give a pencil away. That means that the one doing the giving (referred to by She in the second sentence) must not be Jane, and instead is probably the other mentioned person Susan. Similarly, if Jane needs a pencil, she is likely to be the one getting a pencil from someone else, i.e., the recipient of giving indicated by her. So, world knowledge allows listeners to interpret She as Susan and her as Jane. English-learning five-year-olds seem able to complete this chain of reasoning and correctly interpret the second sentence (Wykes, 1981).

These are just select examples of pragmatic and world knowledge impacting pronoun interpretation, which of course is simply one aspect of syntactic knowledge. More generally, these examples suggest that future syntactic acquisition theories (and the computational cognitive models implementing them) should consider these information sources.

Moving forward

Computational cognitive modeling is a tool that complements other techniques for investigating language development, providing insight into aspects of language acquisition that can be difficult to investigate otherwise. For instance, the models reviewed here investigated how children might learn certain syntactic knowledge from their input (verb constructions like subject-raising, unaccusatives, and passives) and why child behavior may differ from adult behavior for certain syntactic elements (pronoun interpretation).

In general, I think questions of how acquisition works and why children behave as they do are much easier to investigate with modeling. This is because the underlying factors that impact how acquisition works (and therefore why children behave as they do) can be explicitly defined and manipulated within a computational cognitive model. Such factors include how information from the input is perceived, which information is learned from, and how information is used to update internal hypotheses, as well as which hypotheses are under consideration in the first place. To me, it is not at all obvious how to control these factors (and others) with other techniques commonly used to investigate child language development, such as behavioral techniques.

With that said, informative models typically build on data collected with other techniques. Model input is based on estimates of the information children encounter in their language interactions. Model learning mechanisms are based on ideas of what abilities and learning biases children demonstrate at certain ages. Model output is based on data collected from children (or that can be collected in the future), so that the model can explain children’s observed linguistic behavior.

As we move forward, a basic goal is to build “better” models – that is, models that capture more of the relevant aspects of the acquisition process so that we can better link children’s input to their observable behavior. When we have these better models, we then have better explanations – as implemented in the models – for why acquisition (syntactic or otherwise) proceeds the way it does. So, how do we build better models?

Building better models

To build a computational cognitive model of language acquisition, we need to be very precise about the acquisition process the model is implementing. One concrete proposal for the relevant components of the acquisition process is in Figure 1, adapted from Pearl (in press). This proposal specifies components both external and internal to the child during the acquisition process, and is meant to capture the iterative process of acquisition unfolding over time.

Figure 1. Proposal for the relevant components of the acquisition process that a computational cognitive model of language acquisition should consider. External components (input and behavior) are observable. Internal components are not observable, and include perceptually encoding information from the input signal (yielding the perceptual intake), generating output from the encoded information (yielding observable behavior), and learning from the encoded information (using constraints & filters to yield the acquisitional intake, and doing inference over that intake). The developing systems and developing knowledge (both linguistic and non-linguistic) impact all internal components, while the learning component updates the developing knowledge.

External components are observable. We can observe the input signal available to children (e.g., the child language interactions they experience). For example, consider a version of our utterance from before: “Lisa sang to the triplets and then she took a power nap.” The input signal is the physical signal in the world, such as auditory components like pitch, timbre, and loudness of the utterance. The input can also include other aspects of the environment, such as who said the utterance, where they said it, when they said it, and what people or objects were in the environment at the time.

We can also observe children’s behavior at any stage of development, either through naturalistic productions and behavior or clever experimental designs that elicit productions or behavior. In the example utterance above, we can observe who the child thinks she refers to, Lisa or the triplets. One way to do this is to present the child with two pictures, one of Lisa napping and one of the triplets napping, and ask the child to point to the picture the utterance describes.

The internal components of the acquisition process involve several pieces. The first piece concerns the information the child is able to perceive in the input signal. In particular, perceptual encoding involves extracting information from the input signal to create the perceptual intake. Perceptual encoding draws on the child’s developing knowledge and systems to extract information. For instance, in our example utterance, the child may be able to perceive syllables (e.g., /li/, /sə/, /sεŋ/, etc.), words (e.g., Lisa, sang, etc.), syntactic structure (e.g., [$_{IP}$ Lisa [$_{VP}$ sang [$_{PP}$ to [$_{NP}$ the triplets]]]]), pronoun interpretations (she = Lisa), as well as the event participants (Lisa, the triplets) and properties of the events described (singing, napping), among many other types of information. What children can perceive depends on what they know about their language (e.g., developing linguistic knowledge: Lisa, the triplets, and she are words), what they know about the world (e.g., developing non-linguistic knowledge: who’s likely to take a power nap), and how well they can extract information of different kinds (e.g., developing linguistic systems: speech segmentation, syntactic parsing, pronoun interpretation biases; developing non-linguistic systems: memory, cognitive inhibition). Notably, extracting information from the input signal involves ignoring information present (e.g., where the utterance was spoken) and adding information not explicitly present (e.g., where the words are, how a pronoun is interpreted). What children ignore and add depends on their developing knowledge and developing systems.

The second internal piece concerns how children generate their observable behavior. For this, children rely on the information they have been able to perceptually encode (the perceptual intake) and their developing systems and knowledge. In particular, children apply their production systems to the perceptual intake in order to generate behavior like speaking (which relies on linguistic systems and non-linguistic systems involved in utterance generation). In our example utterance, a child might say “Lisa’s the one napping”. Children can also respond non-verbally (e.g., look at a picture that encodes a scene described by the utterance, which relies on non-linguistic systems like motor control, attention, and decision-making). In our example utterance, a child might look at the picture of Lisa napping.

The last internal piece concerns learning, which is how the child’s developing knowledge (both linguistic and non-linguistic) is updated over time. As with the other internal pieces, the child’s developing systems and knowledge impact this piece. In particular, learning occurs over the part of the perceptual intake the child deems relevant to learn from: this is the acquisitional intake. The acquisitional intake is typically not all of the perceptual intake. That is, it is not everything the child is able to encode. Instead, depending on what the child is trying to learn, what is relevant is likely some subset of the perceptual intake. For instance, in our example utterance, the fact that the pronoun she is singular may be in the acquisitional intake, while the fact that she is a separate word from took may not.

The child’s developing knowledge can filter the perceptual intake down to the relevant information by providing both constraints on possible hypotheses (i.e., what options are worth considering) and attentional filters (i.e., what in the information signal to pay attention to). For instance, in our pronoun interpretation example, a linguistic constraint may limit the possible hypotheses for she’s antecedent to noun phrases, and so the number feature is relevant for choosing among different noun phrases; a non-linguistic constraint may limit potential antecedents to animate participants who are capable of power napping. An attentional filter may focus the child on the pronoun’s interpretation, rather than other aspects of the utterance, because of uncertainty about how to interpret pronouns more generally at the child’s current stage of development.

Inference then operates over the acquisitional intake, and typically involves non-linguistic abilities like probabilistic inference, statistical learning, or hypothesis testing. The result of this inference can be used to update the developing knowledge – potentially both linguistic knowledge and non-linguistic knowledge. For instance, in our pronoun interpretation example, the child might update her hypotheses about how likely it is that she’s antecedent is singular (linguistic knowledge) and how likely adults like Lisa are to take power naps (non-linguistic knowledge).
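
To tie these internal pieces together, the sketch below chains them into one iteration of the acquisition process from Figure 1. Every function here is a schematic placeholder that a real model would fill in according to its acquisition theory.

    # Schematic placeholders; a real model would implement each piece according to its theory.
    def perceptually_encode(input_signal, knowledge, systems):
        return input_signal              # placeholder: perceive the signal as-is

    def generate_behavior(perceptual_intake, knowledge, systems):
        return None                      # placeholder: no behavior generated

    def apply_constraints_and_filters(perceptual_intake, knowledge):
        return perceptual_intake         # placeholder: treat all of the intake as relevant

    def infer_and_update(acquisitional_intake, knowledge):
        return knowledge                 # placeholder: no knowledge update

    def acquisition_step(input_signal, developing_knowledge, developing_systems):
        # One iteration of the acquisition process sketched in Figure 1 (schematic only).
        perceptual_intake = perceptually_encode(input_signal, developing_knowledge, developing_systems)
        behavior = generate_behavior(perceptual_intake, developing_knowledge, developing_systems)
        acquisitional_intake = apply_constraints_and_filters(perceptual_intake, developing_knowledge)
        developing_knowledge = infer_and_update(acquisitional_intake, developing_knowledge)
        return behavior, developing_knowledge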

With this proposal in hand for relevant components of a computational cognitive model of acquisition, we can now think about some of the ideas we might want to incorporate into future models of syntactic acquisition. I briefly discuss some ideas for incorporating non-syntactic components and simultaneous acquisition of different knowledge aspects.

Incorporating non-syntactic components into acquisition models

Prior behavioral work has found that children are sensitive to animacy when learning aspects of syntax (e.g., see Becker, 2015). Pearl and Sprouse (2019) used animacy in their model of linking theory acquisition, allowing the animacy of a verb’s arguments to be part of the acquisitional intake that children learned from.

Prior behavioral work has also found that children can use both pragmatic and world knowledge to help them choose between potential interpretations of pronouns (e.g., Hartshorne et al., 2015a; Pyykkönen et al., 2010; Song & Fisher, 2005, 2007; Wykes, 1981). Some recent computational cognitive modeling work has investigated how children choose between potential interpretations of utterances like Every horse didn’t jump, which can either mean “No horses jumped” or “Not all horses jumped” (Savinelli, Scontras & Pearl, 2017, 2018; Scontras & Pearl, 2021). The modeled children in these studies incorporated both pragmatic knowledge about what speakers think the topic of conversation is and world knowledge about the event described (e.g., how likely horses are to jump) into the perceptual intake. Notably, differences in children’s ability to adjust their expectations about the pragmatics and world of the experiment – due to immature non-linguistic systems – can explain children’s observed non-adult-like behavior, according to these models.

More generally, prior behavioral work (Gerard et al., 2018; Liter et al., 2022; Ud Deen et al., 2018) has noted the impact of immature non-linguistic systems (e.g., cognitive inhibition) on children’s use of their knowledge – that is, on how children generate their observed behavior in experimental contexts. So, I think it is useful for future computational cognitive models to consider the impact of these developing non-linguistic systems when accounting for children’s behavior (i.e., the output generation process).
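One way to encode this in a model is to give the output-generation step its own parameters, separate from the underlying knowledge, as in the sketch below. The idea that an observable response reflects the child’s knowledge only when “deployment” succeeds, and the particular probabilities used, are illustrative assumptions rather than the implementation of any of the cited studies.

```python
# Minimal sketch: separating underlying knowledge from its deployment when generating behavior.
# The deployment-success and fallback-guess probabilities are illustrative assumptions.

import random

def respond(knowledge_correct_prob, deployment_success_prob, rng):
    """One trial: return True if the observable response is adult-like.
    If deployment fails (e.g., cognitive inhibition is overtaxed by the task demands),
    the child falls back on a guess, regardless of what she actually knows."""
    if rng.random() < deployment_success_prob:
        return rng.random() < knowledge_correct_prob   # behavior reflects underlying knowledge
    return rng.random() < 0.5                          # behavior reflects a fallback guess

def simulate(n_trials, knowledge, deployment, seed=42):
    rng = random.Random(seed)
    return sum(respond(knowledge, deployment, rng) for _ in range(n_trials)) / n_trials

# Same underlying knowledge, different task demands on deployment:
print(simulate(10000, knowledge=0.95, deployment=0.9))   # ~0.90: looks near-adult-like
print(simulate(10000, knowledge=0.95, deployment=0.4))   # ~0.68: looks much worse, same knowledge
```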

Moreover, these developing non-linguistic systems may also impact several other pieces of the acquisition process: (i) perceptual encoding, leading to a perceptual intake that captures immature representations of information in the input, (ii) constraints & filters, leading to an acquisitional intake that is inaccurate, and (iii) inference, leading to learning that is non-adult-like. The exact way developing non-linguistic systems impact these pieces depends on what system is developing and how that system is proposed to contribute to the acquisition process. While this is certainly non-trivial to specify for any given non-linguistic system and model piece, the more we can do it, the better we will be able to capture the acquisition process in children and link their input to their observable behavior with a concrete acquisition theory encoded in a model.

Thinking about simultaneous acquisition

Another interesting consideration is simultaneous acquisition, where multiple types of knowledge are learned at the same time. In the case studies discussed here, the acquisition of linking theories from Pearl and Sprouse (2019) was an example of this. More specifically, when learning how to cluster verbs together into classes whose linking theories were similar, the modeled child effectively learned about many different verb constructions simultaneously (e.g., which verbs are subject-raising, which verbs are unaccusative, which verbs are passivizable). The key insight is that the modeled child’s objective was broad – to learn about verbs that “behave” similarly with respect to certain types of information in the acquisitional intake (argument animacy, syntactic contexts, links between thematic roles and syntactic positions), instead of learning about which verbs allow a specific syntactic behavior (e.g., subject-raising). In other words, the specific syntactic knowledge about which constructions any given verb allows is a by-product of trying to learn something else about that verb – namely, which other verbs it behaves similarly to (i.e., which class it belongs to) and what the behavior of that verb class is.
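To illustrate the flavor of this broad clustering objective, the sketch below groups verbs by how similar their usage profiles are over features like argument animacy and syntactic context. The feature values and the simple greedy grouping procedure are made-up stand-ins for illustration; the actual model involves a richer probabilistic formulation over annotated input.

```python
# Minimal sketch: grouping verbs into classes by how similarly they "behave" over
# features like argument animacy and syntactic context. Feature values and the
# grouping procedure are illustrative, not the published model.

# Hypothetical usage profiles: proportion of uses with an animate subject,
# proportion of uses in an "NP __ to VP" frame, proportion of passive uses.
verbs = {
    "seem":   (0.30, 0.90, 0.00),
    "appear": (0.35, 0.85, 0.00),
    "want":   (0.95, 0.80, 0.05),
    "hit":    (0.90, 0.00, 0.40),
    "kick":   (0.88, 0.00, 0.35),
}

def distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def cluster(verbs, threshold=0.3):
    """Greedily assign each verb to an existing class if it is close enough to that
    class's first member; otherwise start a new class."""
    classes = []
    for verb, feats in verbs.items():
        for cls in classes:
            if distance(feats, verbs[cls[0]]) < threshold:
                cls.append(verb)
                break
        else:
            classes.append([verb])
    return classes

print(cluster(verbs))
# e.g., [['seem', 'appear'], ['want'], ['hit', 'kick']] -- class membership (and the behavior
# of each class, like allowing subject-raising) falls out of this broader grouping objective.
```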

I think this may be a more realistic approach to syntactic acquisition (and acquisition more generally), with children trying to learn about their language more broadly and picking up specific linguistic knowledge along the way as part of that broader learning goal. What this means modeling-wise is that the modeled child’s objective – what hypotheses are being considered – would be adjusted. For instance, instead of explicitly learning whether a verb is subject-raising, can children’s observable behavior about which verbs are subject-raising be captured by a modeled child learning about verb classes more generally and implicitly learning which verbs are subject-raising? This approach worked well for Pearl and Sprouse (2019).

Another example of simultaneous syntactic acquisition from my own research (Bates & Pearl, 2019; Dickson, Pearl & Futrell, 2022; Pearl & Bates, in press; Pearl & Sprouse, 2013) is the acquisition of knowledge about “syntactic islands” in children. For example, English-speaking children must learn that Who did Lily think the kitten for ___ was cute? (where ___ marks the gap position associated with who) is not a good wh-question, which draws on their implicit knowledge of syntactic islands. Here, the modeled child’s objective is to learn in general how to represent wh-dependencies like those in wh-questions, rather than learning how good a specific wh-dependency is (or is not). By learning to do this, modeled children come to have adult-like preferences about how good different wh-dependencies are (Bates & Pearl, 2019; Pearl & Bates, in press; Pearl & Sprouse, 2013), especially if the modeled children are trying to represent wh-dependencies in an “efficient” way (Dickson et al., 2022) that makes processing future wh-dependencies easier.
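To sketch the general idea of “learning how to represent wh-dependencies”: in this line of work, a wh-dependency is characterized by the sequence of syntactic containers between the wh-word and its gap, and a dependency’s acceptability falls out of how probable that container sequence is, given the input. The toy code below illustrates the trigram-scoring idea with made-up container labels, counts, and smoothing settings; it is not the published implementation.

```python
# Minimal sketch of the container-node trigram idea: a wh-dependency is represented as the
# sequence of phrase-structure "containers" between the wh-word and its gap, and its score is
# the product of smoothed trigram probabilities over that sequence. Labels and counts are made up.

from collections import Counter

def trigrams(container_seq):
    padded = ["start", "start"] + list(container_seq) + ["end"]
    return [tuple(padded[i:i + 3]) for i in range(len(padded) - 2)]

# Hypothetical container-node sequences for wh-dependencies a child might encounter:
input_dependencies = [
    ["IP"],                                  # e.g., "Who __ was cute?"
    ["IP", "VP"],                            # e.g., "What did Lily see __?"
    ["IP", "VP", "CP-that", "IP", "VP"],     # e.g., "What did Lily think that the kitten liked __?"
] * 100

tri_counts = Counter(t for dep in input_dependencies for t in trigrams(dep))
bi_counts = Counter(t[:2] for dep in input_dependencies for t in trigrams(dep))

def score(dep, alpha=0.5, vocab=50):
    """Product of smoothed conditional trigram probabilities; higher = more acceptable-sounding."""
    p = 1.0
    for t in trigrams(dep):
        p *= (tri_counts[t] + alpha) / (bi_counts[t[:2]] + alpha * vocab)
    return p

print(score(["IP", "VP"]))                # attested container sequence: relatively high score
print(score(["IP", "VP", "NP", "PP"]))    # unattested (island-like) sequence: far lower score
```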

A related approach gaining momentum in syntactic acquisition modeling involves simply learning to predict the next word, with the modeled children implicitly learning whatever knowledge is necessary to make that next word highly probable (and therefore easier to process). Along the way, several models of this type seem to implicitly learn a variety of syntactic knowledge, including knowledge about syntactic islands (e.g., Wilcox, Levy, Morita & Futrell, 2018; Futrell et al., 2019; Chaves, 2020; Warstadt et al., 2020; Wilcox, Futrell & Levy, 2021).
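For readers curious how such models are typically probed for island knowledge, the sketch below lays out the 2x2 “wh-licensing interaction” logic used in several of these studies: surprisal at the material after a potential gap site is compared across conditions with and without a wh-filler and with and without a gap, both inside and outside islands. The lm_surprisal function is a placeholder I am assuming (not a real library call), and the example sentences are illustrative.

```python
# Minimal sketch: probing a next-word-prediction model for filler-gap/island knowledge
# via the 2x2 wh-licensing interaction. `lm_surprisal` is an assumed placeholder, not
# any specific library's API; the items are made up for illustration.

def lm_surprisal(sentence, region):
    """Placeholder: total surprisal (in bits) the language model assigns to `region` in `sentence`."""
    raise NotImplementedError("plug in a trained next-word-prediction model here")

def wh_licensing_interaction(item):
    """item maps each of the four conditions to (sentence, post-gap region).
    A reliably negative interaction outside islands, but not inside them, is the usual
    diagnostic that the model has learned both filler-gap dependencies and island constraints."""
    s = {cond: lm_surprisal(sent, region) for cond, (sent, region) in item.items()}
    return (s["+filler,+gap"] - s["-filler,+gap"]) - (s["+filler,-gap"] - s["-filler,-gap"])

# Illustrative (non-island) item:
item = {
    "+filler,+gap": ("I know what the cat chased yesterday.", "yesterday"),
    "+filler,-gap": ("I know what the cat chased the toy yesterday.", "yesterday"),
    "-filler,+gap": ("I know that the cat chased yesterday.", "yesterday"),
    "-filler,-gap": ("I know that the cat chased the toy yesterday.", "yesterday"),
}
# wh_licensing_interaction(item)  # run once a real model is plugged into lm_surprisal
```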

Conclusion

Here I hope to have shown how computational cognitive modeling can inform our understanding of syntactic acquisition by implementing theories of acquisition precisely enough to evaluate against empirical data from children. I reviewed some previous models that consider information from non-syntactic sources and the impact of non-linguistic cognitive development on syntactic acquisition. I also highlighted some behavioral work that notes the role of other information sources children use and specific cognitive limitations children have during syntactic acquisition. I then discussed how we might build future models that incorporate these insights and so provide better explanations of syntactic acquisition. With this information in mind, I believe we can create, evaluate, and refine better theories of syntactic acquisition through computational cognitive modeling.

Acknowledgements

I am deeply grateful to Elma Blom and Titia Benders for putting together this special issue on a topic near and dear to my heart, and inviting me to be part of it. Both they and an anonymous reviewer have provided very helpful feedback. I am also indebted to the Quantitative Approaches to Language Science (QuantLang) Collective at UC Irvine, who have shaped my thinking on modeling over the years.

Competing interest

The author declares none.

Footnotes

1 See Pearl (in press) for discussion of many other examples of syntactic acquisition models that rely on probabilistic inference, statistical learning, or otherwise “counting things”, even if those models learn only from syntactic information.

2 Pearl and Sprouse’s theory also assumed children could potentially have additional biases about how to interpret the link distribution in their input. See Pearl and Sprouse (2019) for details.

3 This lexical semantic feature was “psych object-experiencer”, where the object of the verb experiences the psychological state. An example is annoy: In The non-stop crying annoyed Lisa, the object Lisa is experiencing the psychological state of being annoyed.

4 I note that other interpretations of these specific results are of course possible.

5 I note that a task can be thought of as more cognitively demanding because it seems to require more cognitive resources of whatever kind (e.g., working memory, attention, executive control, or something else), without specifying exactly what additional resources are required and how those specific resources are drawn upon. Of course, it is more satisfying to have a precise theory of how different cognitive resources interact to produce observable behavior in any given experimental task. See Gerard et al. (2018) for discussion of some of the specific resources that may be involved for this task.

References

Arnold, J., Eisenband, J., Brown-Schmidt, S., & Trueswell, J. C. (2000). The rapid use of gender information: Evidence of the time course of pronoun resolution from eyetracking. Cognition, 76, B13–B26.
Asher, N., & Lascarides, A. (2003). Logics of Conversation. Cambridge: Cambridge University Press.
Bates, A., & Pearl, L. (2019). What do you think that happens? A quantitative and cognitive modeling analysis of linguistic evidence across socioeconomic status for learning syntactic islands. In Brown, M. M., & Dailey, B. (Eds.), Proceedings of the 43rd annual Boston University Conference on Language Development (pp. 42–56). Somerville, MA: Cascadilla Press.
Becker, M. (2009). The role of NP animacy and expletives in verb learning. Language Acquisition, 16(4), 283–296.
Becker, M. (2014). The acquisition of syntactic structure: Animacy and thematic alignment (Vol. 141). Cambridge University Press.
Becker, M. (2015). Animacy and the Acquisition of Tough Adjectives. Language Acquisition, 22(1), 68–103.
Becker, M., & Estigarribia, B. (2013). Harder words: Learning abstract verbs with opaque syntax. Language Learning and Development, 9(3), 211–244.
Brandt-Kobele, O.-C., & Höhle, B. (2010). What asymmetries within comprehension reveal about asymmetries between comprehension and production: The case of verb inflection in language acquisition. Lingua, 120(8), 1910–1925.
Chaves, R. P. (2020). What Don’t RNN Language Models Learn About Filler-Gap Dependencies? Proceedings of the Society for Computation in Linguistics, 3(1), 20–30.
Clahsen, H., Aveledo, F., & Roca, I. (2002). The development of regular and irregular verb inflection in Spanish child language. Journal of Child Language, 29, 591–622.
Crawley, R. A., Stevenson, R., & Kleinman, D. (1990). The Use of Heuristic Strategies in the Interpretation of Pronouns. Journal of Psycholinguistic Research, 19(4), 245–264.
Dickson, N., Pearl, L., & Futrell, R. (2022). Learning constraints on wh-dependencies by learning how to efficiently represent wh-dependencies: A developmental modeling investigation with Fragment Grammars. Proceedings of the Society for Computation in Linguistics, 5(1), 220–224.
Fisher, C., Gertner, Y., Scott, R. M., & Yuan, S. (2010). Syntactic bootstrapping. Wiley Interdisciplinary Reviews: Cognitive Science, 1(2), 143–149.
Forsythe, H., & Pearl, L. (2020). Immature representation or immature deployment? Modeling child pronoun resolution. In Proceedings of the Society for Computation in Linguistics 3 (pp. 22–26).
Futrell, R., Wilcox, E., Morita, T., Qian, P., Ballesteros, M., & Levy, R. (2019). Neural language models as psycholinguistic subjects: Representations of syntactic state. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 1.
Gerard, J., Lidz, J., Zuckerman, S., & Pinto, M. (2018). The acquisition of adjunct control is colored by the task. Glossa: A Journal of General Linguistics, 3(1), 1–22. https://doi.org/10.5334/gjgl.547
Gillette, J., Gleitman, H., Gleitman, L., & Lederer, A. (1999). Human simulations of vocabulary learning. Cognition, 73(2), 135–176.
Gleitman, L. (1990). The structural sources of verb meanings. Language Acquisition, 1(1), 3–55.
González-Gómez, N., Hsin, L., Barriere, I., Nazzi, T., & Legendre, G. (2017). Agarra, agarran: Evidence of early comprehension of subject–verb agreement in Spanish. Journal of Experimental Child Psychology, 160, 33–49.
Gutman, A., Dautriche, I., Crabbé, B., & Christophe, A. (2015). Bootstrapping the syntactic bootstrapper: Probabilistic labeling of prosodic phrases. Language Acquisition, 22(3), 285–309.
Harrigan, K., Hacquard, V., & Lidz, J. (2016). Syntactic Bootstrapping in the Acquisition of Attitude Verbs: think, want and hope. In Proceedings of WCCFL (Vol. 33).
Hartshorne, J. K., Nappa, R., & Snedeker, J. (2015a). Development of the First-Mention Bias. Journal of Child Language, 42(2).
Hartshorne, J. K., Pogue, A., & Snedeker, J. (2015b). Love is hard to understand: The relationship between transitivity and caused events in the acquisition of emotion verbs. Journal of Child Language, 42(3), 467–504.
Hobbs, J. R. (1979). Coherence and coreference. Cognitive Science, 3, 67–90.
Järvikivi, J., van Gompel, R., Hyönä, J., & Bertram, R. (2005). Ambiguous pronoun resolution: Contrasting the first-mention and subject-preference accounts. Psychological Science, 16(4), 260–264.
Johnson, V., de Villiers, J., & Seymore, H. (2005). Agreement without understanding? The case of third person singular /s/. First Language, 25(3), 317–330.
Kehler, A., Kertz, L., Rohde, H., & Elman, J. L. (2008). Coherence and coreference revisited. Journal of Semantics, 25(1), 1–44.
Kirby, S. (2009a). Do what you know: “semantic scaffolding” in biclausal raising and control. In Annual Meeting of the Berkeley Linguistics Society (Vol. 35, pp. 190–201).
Kirby, S. (2009b). Semantic scaffolding in first language acquisition: The acquisition of raising-to-object and object control (Unpublished doctoral dissertation). University of North Carolina at Chapel Hill, Chapel Hill, NC.
Landau, B., & Gleitman, L. R. (1985). Language and experience: Evidence from the blind child. Harvard University Press.
Legendre, G., Culbertson, J., Zaroukian, E., Hsin, L., Barrière, I., & Nazzi, T. (2014). Is children’s comprehension of subject-verb agreement universally late? Comparative evidence from French, English, and Spanish. Lingua.
Levin, B. (1993). English verb classes and alternations: A preliminary investigation. University of Chicago Press.
Liter, A., Grolla, E., & Lidz, J. (2022). Cognitive inhibition explains children’s production of medial wh-phrases. Language Acquisition. https://doi.org/10.1080/10489223.2021.2023813
Liter, A., Huelskamp, T., Weerakoon, S., & Munn, A. (2015). What drives the Maratsos Effect, agentivity or eventivity? In Boston University Conference on Language Development (BUCLD). Boston University, Boston, MA.
Maratsos, M., Fox, D. E., Becker, J. A., & Chalkley, M. A. (1985). Semantic restrictions on children’s passives. Cognition, 19(2), 167–191.
Messenger, K., Branigan, H. P., McLean, J. F., & Sorace, A. (2012). Is young children’s passive syntax semantically constrained? Evidence from syntactic priming. Journal of Memory and Language, 66(4), 568–587.
Nguyen, E., & Pearl, L. (2019). Using Developmental Modeling to Specify Learning and Representation of the Passive in English Children. In Brown, M. M., & Dailey, B. (Eds.), Proceedings of the Boston University Conference on Language Development 43 (pp. 469–482). Somerville, MA: Cascadilla Press.
Nguyen, E., & Pearl, L. (2021). The link between lexical semantic features and children’s comprehension of English verbal be-passives. Language Acquisition, 28(4), 433–450.
Pearl, L. (2014). Evaluating learning strategy components: Being fair. Language, 90(3), e107–e114.
Pearl, L. (2021). Theory and predictions for the development of morphology and syntax: A Universal Grammar + statistics approach. Journal of Child Language, 48(5), 907–936.
Pearl, L. (2023). Modeling syntactic acquisition. In Sprouse, J. (Ed.), Oxford handbook of experimental syntax (pp. 209–270). Oxford University Press.
Pearl, L., & Bates, A. (2022). A new way to identify if variation in children’s input could be developmentally meaningful: Using computational cognitive modeling to assess input across socio-economic status for syntactic islands. Journal of Child Language, 1–34. https://doi.org/10.1017/S0305000922000514
Pearl, L., & Forsythe, H. (2022). Inaccurate representations, inaccurate deployment, or both? Using computational cognitive modeling to investigate the development of pronoun interpretation in Spanish. lingbuzz. https://ling.auf.net/lingbuzz/006141
Pearl, L., & Sprouse, J. (2013). Syntactic islands and learning biases: Combining experimental syntax and computational modeling to investigate the language acquisition problem. Language Acquisition, 20, 19–64.
Pearl, L., & Sprouse, J. (2019). Comparing solutions to the linking problem using an integrated quantitative framework of language acquisition. Language, 95(4), 583–611. https://ling.auf.net/lingbuzz/003913
Pérez-Leroux, A. T. (2005). Number Problems in Children. In Gurski, C. (Ed.), Proceedings of the 2005 Canadian Linguistics Association Annual Conference (pp. 1–12).
Pinker, S., Lebeaux, D. S., & Frost, L. A. (1987). Productivity and constraints in the acquisition of the passive. Cognition, 26(3), 195–267.
Pyykkönen, P., Matthews, D., & Järvikivi, J. (2010). Three-year-olds are sensitive to semantic prominence during online language comprehension: A visual world study of pronoun resolution. Language and Cognitive Processes, 25(1), 115–129.
Savinelli, K., Scontras, G., & Pearl, L. (2017). Modeling scope ambiguity resolution as pragmatic inference: Formalizing differences in child and adult behavior. In Proceedings of the 39th annual meeting of the Cognitive Science Society. London, UK: Cognitive Science Society.
Savinelli, K., Scontras, G., & Pearl, L. (2018). Exactly two things to learn from modeling scope ambiguity resolution: Developmental continuity and numeral semantics. In Proceedings of the 8th workshop on Cognitive Modeling and Computational Linguistics (CMCL 2018) (pp. 67–75).
Scontras, G., & Pearl, L. S. (2021). When pragmatics matters more for truth-value judgments: An investigation of quantifier scope ambiguity. Glossa: A Journal of General Linguistics, 6(1). https://doi.org/10.16995/glossa.5724
Scott, R. M., & Fisher, C. (2009). Two-year-olds use distributional cues to interpret transitivity-alternating verbs. Language and Cognitive Processes, 24(6), 777–803.
Soderstrom, M. (2002). The acquisition of inflection morphology in early perceptual knowledge of syntax (Unpublished doctoral dissertation). Johns Hopkins University, Baltimore, MD.
Song, H.-J., & Fisher, C. (2005). Who’s “She”? Discourse prominence influences preschoolers’ comprehension of pronouns. Journal of Memory and Language, 52(1), 29–57.
Song, H.-J., & Fisher, C. (2007). Discourse prominence effects on 2.5-year-old children’s interpretation of pronouns. Lingua, 117(11), 1959–1987.
Ud Deen, K., Bondoc, I., Camp, A., Estioca, S., Hwang, H., Shin, G.-H., Takahashi, M., Zenker, F., & Zhong, J. (2018). Repetition brings success: Revealing knowledge of the passive voice. In Proceedings of the 42nd Annual Boston University Conference on Language Development (pp. 200–213). Boston, MA.
Warstadt, A., Parrish, A., Liu, H., Mohananey, A., Peng, W., Wang, S.-F., & Bowman, S. R. (2020). BLiMP: The benchmark of linguistic minimal pairs for English. Transactions of the Association for Computational Linguistics, 8, 377–392.
Wilcox, E., Futrell, R., & Levy, R. (2021). Using computational models to test syntactic learnability. https://ling.auf.net/lingbuzz/006327
Wilcox, E., Levy, R., Morita, T., & Futrell, R. (2018). What do RNN language models learn about filler-gap dependencies? In Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP. Association for Computational Linguistics.
Wykes, T. (1981). Inference and Children’s Comprehension of Pronouns. Journal of Experimental Child Psychology, 32, 264–278.

Figure 1. Proposal for the relevant components of the acquisition process that a computational cognitive model of language acquisition should consider. External components (input and behavior) are observable. Internal components are not observable, and include perceptually encoding information from the input signal (yielding the perceptual intake), generating output from the encoded information (yielding observable behavior), and learning from the encoded information (using constraints & filters to yield the acquisitional intake, and doing inference over that intake). The developing systems and developing knowledge (both linguistic and non-linguistic) impact all internal components, while the learning component updates the developing knowledge.