Introduction
An important part of learning a first language is learning to label not only objects and people that exist in a child's environment, but also relations that exist between them. People are often related to objects by actions they perform on them. These actions are typically labelled by verbs. Previous research has shown that young children tend to be more conservative than adults when extending novel verbs to novel exemplars, and incorporate more specific elements of a scene, such as the agent or objects, in their verb interpretations. For instance, Forbes and Poulin-Dubois (1997) reported that very young children view the manner in which an action is performed as a crucial part of the verb's meaning. They found that 20-month-olds are less likely than 26-month-olds to extend familiar verbs (e.g., pick up) to new instances where the manner of the action has changed (e.g., object picked up with foot rather than hand). Also, compared to adults, young children have a strong tendency to view objects or agents as part of a novel verb's meaning. Both Behrend (1990) and Forbes and Farrar (1993) found that children five years and under viewed the instrument with which an action was performed as part of the verb's meaning. They were less likely than adults to extend a novel verb to a new instance of the same action when performed using a different instrument. Kersten and Smith (2002) found that three-and-a-half to four-year-olds consider the agent of an action as part of a novel verb's meaning much more strongly than adults. Children accepted scenes in which the motion (the path travelled) of a bug-like agent had changed but the agent had not as often as scenes in which the motion remained the same but the agent changed.
Finally, Imai, Haryu, and Okada (2005) and Imai, Haryu, Okada, Hirsh-Pasek, Golinkoff, and Shigematsu (2008) showed that young children strongly consider the object that an action is performed on to be part of their interpretation of a novel verb. They found that Japanese, English, and Chinese three-year-olds were unable to extend a newly learned verb to a new instance of the same action when selecting between a scene depicting the same action being performed on a new object or a scene depicting a new action being performed on the same object. In contrast, five-year-old Japanese and English children as well as Chinese adults were able to do this.
The research presented so far concentrated on the fast mapping of verbs from a single exemplar to a new instance of a verb. While this tells us something important about the challenges that children face when first encountering a novel verb and their first (mis-)interpretations, young children do not seem to use verbs inappropriately in everyday speech. The literature on verb learning offers a number of suggestions about what helps young children to correctly map verbs onto actions. For instance, findings by Arunachalam and Waxman (2011) support the notion that extensive linguistic scaffolding can improve verb understanding in two-year-olds. The same is true for rich semantic information and an informative syntactic frame (Arunachalam & Waxman, 2015). Furthermore, Haryu, Imai, and Okada (2011) have shown that object similarity can play a scaffolding role in aiding young children's correct verb extensions.
The present study is concerned with another way that may help children understand the meaning of verbs, namely the opportunity to align simultaneously presented exemplars during the learning phase. It is possible that simultaneous presentation of exemplars may overload children, providing them with too much information to capitalise on. However, the potential benefit of the opportunity to align exemplars in word learning is theoretically supported by the structural alignment theory proposed by Gentner (e.g., Markman & Gentner, 1993; Gentner & Markman, 1997; Gentner, 2003). This theory suggests that simultaneous presentation of multiple exemplars leads to active comparison of the exemplars, which in turn promotes a search for commonalities between the exemplars’ conceptual representations. Even if this comparison is initially prompted by noticing perceptual similarities, it can lead to noticing deeper relational commonalities. Indeed, relational commonalities have been found to be preferentially highlighted by the comparison process (Gentner & Markman, 1997).
As structural alignment theory predicts, making comparisons between exemplars has been shown to lead children to look past attention-grabbing perceptual similarities and to notice deeper semantic commonalities. For instance, Gentner and Namy (1999) found that four-year-old children would extend a novel noun (e.g., kig) used to label an object (e.g., apple) to a perceptually similar object (e.g., balloon) as often as to a semantically similar object (e.g., banana). However, when a noun was used to label two objects instead of one (e.g., apple and pear), children's extensions were predominantly to the semantically similar object (banana). This is particularly remarkable as both the apple and the pear were more perceptually similar to the balloon, effectively providing the child with more evidence for choosing on the basis of perceptual similarity. This experiment shows that providing the opportunity to compare two exemplars (apple and pear) highlights deeper semantic commonalities. Since the two exemplars were presented side-by-side, this result suggests that rather than being overwhelmed by multiple simultaneously presented exemplars, children's learning was facilitated by them.
Only a small number of studies have presented multiple verb exemplars, and not with the purpose of investigating the effect of multiple exposures (Childers, 2011; Maguire, Hirsh-Pasek, Golinkoff, & Brandone, 2008; Waxman, Lidz, Braun, & Lavin, 2009). In a preferential looking paradigm with 24-month-olds, Waxman et al. (2009) demonstrated that the presentation of several consecutive videos showing a particular action (e.g., waving) performed on four different exemplars of the same category (e.g., four differently coloured balloons) allowed children to successfully learn novel verbs. Childers (2011) found that young children associated a novel verb with whatever aspect remained constant across several exemplar events. She presented three consecutive events to 2½-year-olds. If children saw three consecutive scenes which preserved the action but not the result, they were more likely to replicate the action when asked to carry out the verb. If children were shown scenes which preserved the result but not the action, they were more likely to replicate the result.
Maguire et al. (2008) showed that variation across exemplars is not always beneficial for learning. They compared the effect of consecutively seeing the same actor performing a particular novel action against seeing different actors performing the novel action. They found that 2½ to three-year-olds extended novel verbs more successfully when seeing four consecutive identical video clips featuring the same actor performing the same action than when seeing four video clips featuring different actors performing the same action. They argued that the different actors drew too much attention away from the action.
All of the research above focused on the potential benefit of sequentially presented exemplars to young children's verb learning. However, children often encounter verb learning opportunities in the context of multiple, simultaneous events in the real world. For instance, many activities are carried out by several people, with different objects, at the same time (e.g., eating, playing with different things). We are interested in whether young children can capitalise on these events rather than being overloaded by them. The main aim of the current study was therefore to investigate whether presenting multiple exemplars simultaneously would aid young children's verb learning.
It has also been shown that variation in exemplars can promote generalisation in both language development (e.g., Waxman & Klibanoff, 2000; Bowerman & Choi, 2001) and in the non-verbal domain of object categorisation (e.g., Ribar, Oakes, & Spalding, 2004; Vukatana, Graham, Curtin, & Zepeda, 2015). This suggests that verb learning might also benefit from variability, yet a study by Maguire et al. (2008) found that, on the contrary, exemplars without variability supported verb learning better than varying exemplars. However, one possibility is that this was due to the fact that they presented exemplars consecutively, in contrast to the current study where we present them simultaneously. When viewing exemplars simultaneously, exemplar variability might be more beneficial, because a side-by-side presentation can ease comparison and might therefore help children engage in the kind of abstraction processes that have been suggested by Gentner and colleagues. In other words, a simultaneous presentation of varying exemplars might ease abstraction processes and focus children more strongly on the relation between the actor and the object, i.e., the action. The secondary aim of the current study was therefore to determine whether simultaneously presented exemplars should vary to aid correct verb extensions.
To achieve the study aims, we investigated three-year-olds’ extensions of novel verbs to new exemplars in three separate conditions. The first was a baseline condition, based on Imai et al.'s (2005) paradigm, to establish the level of performance with only a single exemplar. Children viewed a novel action being repeatedly performed on a novel object within a single video. In the second condition, children were presented with two videos side-by-side that varied in terms of the objects that the action was performed on. In the third condition, children were again presented with two videos, but these were identical exemplar videos. Better performance with two identical videos (condition 3) than with one video (condition 1) would suggest that children can learn verbs from viewing simultaneous exemplars, even without the need for variability. Better performance when exemplars vary in content (condition 2) would suggest that variability supports comparison and abstraction processes.
Method
Participants
Sixty three-year-olds (mean age 41.9 months, SD = 3.4) participated in the experiment, with 20 children randomly assigned to each condition. Participants were recruited from nurseries in the West Midlands area of the United Kingdom. All were native monolingual speakers of English. All nurseries served families from areas of the same mid-level socioeconomic status. Permission to participate was granted by the owner of the nursery. Parental consent was obtained when requested by the nursery owner. The study had approval from the Ethical Review Committee of the University of Birmingham.
Materials
A laptop computer displayed Microsoft PowerPoint® slides containing either one video in the centre of the screen or two videos playing side-by-side. All videos were the same size regardless of which condition they featured in (15 cm × 11 cm). In line with the action-scenes used in Imai et al. (2005), each video displayed an actor repeatedly performing a novel action on a novel object for a 30-second period. Six novel verbs (blicking, gloobing, rinting, zanging, triting, plewing) were used to label six novel actions (for a description of the actions, see supplementary material, available at <https://doi.org/10.1017/S0305000918000119>). All actions were iterative, durative, and involved direct contact with the object.
Procedure
Participants were told: “We're going to play a game on the computer. We're going to look at some videos of some people doing some funny things.” To begin with, the participant took part in a pair of warm-up trials. These were the same for all three conditions. In the first warm-up trial, a picture of a dog and a picture of a cat were shown side-by-side. The child was asked to point at one of the pictures, e.g., “Can you show me the dog? Which picture is the dog in? Only one, can you show me?” In the second warm-up trial, participants were simultaneously shown a video of an actor jumping up and down and a video of the same actor going from a standing to a sitting position, side-by-side for 30 seconds. The child was asked to point at one of the videos, e.g., “Can you show me jumping? Which video is she jumping in? Only one, can you show me?” Which picture or video the child was asked to label in each of the warm-up trials was randomised across participants, but participants were always asked to point at one picture/video on the left and one on the right. In this way, we ensured that participants were willing and able to point to both sides of the screen. No children failed the warm-up trials.
Participants were presented with six experimental trials. Each trial consisted of a training slide followed by a test slide. Participants viewed training slides associated with the condition they were taking part in, but all groups were presented with the same test slides.
Training – Single action-scene condition
The training slide consisted of a single video in the centre of the screen showing a female actor performing a novel action on a novel object. For instance, a woman was holding a novel object and rolled it backwards and forwards between the palms of her hands (see Panel A of Figure 1). This video was shown for 30 seconds and consisted of the actor repeatedly performing the action. While the video was being shown, the experimenter pointed at the video and labelled the action three times, at 10-second intervals, e.g., “Look she is blicking.”
Training – Multiple action-scene condition (MA): Different action-scenes
This condition was identical to the single action-scene condition with the exception that the participants saw two videos side-by-side, featuring the novel action being performed on two different novel objects, and heard each video labelled with the same novel verb (see Panels A and B of Figure 1). The experimenter labelled and pointed at both videos; e.g., “Look she is blicking, and look she is blicking.” This occurred at 10-second intervals, resulting in each video being labelled three times in total.
Training – Multiple action-scene condition (MA): Same action-scene twice
This condition was identical to that of the MA Different action-scenes condition, with the exception that, rather than seeing two different action-scenes on the training slide, participants saw the same action-scene twice (e.g., Panel A of Figure 1 on both sides of the screen). Note that these were not presented in sync.
Testing
On the test slide, two videos, featuring the same female actor as in the training phase, played side-by-side simultaneously. The foil (same object–different action) video showed the actor using one of the objects used in the corresponding training but with a new novel durative and iterative action (see Panel C of Figure 1). The target (same action–different object) video showed the actor carrying out the same action seen during the corresponding training, but with a new novel object (see Panel D of Figure 1). Which video appeared on which side was randomised across participants. While the videos were playing, the experimenter asked the participant to point to the video that featured the novel verb that they heard during the presentation of the training slide: “Can you show me blicking? Which video is she blicking in? Only one, can you show me?” The videos were 30 seconds long, although no participant required the full 30 seconds in order to produce a response. As soon as the participant pointed to one of the videos, the experimenter moved on to the next trial.
Results
Selecting the target video containing the action originally labelled with the novel verb was considered a correct response. Figure 2 displays the results. The number of correct selections was analysed with a one-way ANOVA with Action-scene condition as the between-participants factor. The test indicated a significant main effect of Action-scene condition (F(2,57) = 5.2, p = .008, partial η² = .154). LSD post-hoc tests revealed a significant difference between the Single action-scene condition and the MA Different action-scenes condition (p = .003), but not between the Single action-scene condition and the MA Same action-scene twice condition (p = .424). There was also a significant difference between the MA Different action-scenes condition and the MA Same action-scene twice condition (p = .025).
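As an illustrative check, not part of the original analysis, the reported effect size can be recovered from the F statistic and its degrees of freedom using the standard identity partial η² = (F × df_effect) / (F × df_effect + df_error). The function name below is hypothetical; the input values are those reported above.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Degrees of freedom implied by the design: 3 conditions, 60 children in total.
n_conditions = 3
n_children = 60
df_effect = n_conditions - 1          # 2
df_error = n_children - n_conditions  # 57

# F value as reported in the text (rounded to one decimal place).
effect_size = partial_eta_squared(5.2, df_effect, df_error)
print(round(effect_size, 3))  # 0.154
```

Because the published F value is rounded, agreement to three decimal places is a consistency check rather than an exact reproduction of the analysis.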
Additionally, the number of correct selections was compared against chance (3 out of 6 responses) for each condition. Children did not make the correct selection any more often than would be expected by chance in the Single action-scene condition (t(19) = 0.3, p = .797) or in the MA Same action-scene twice condition (t(19) = –0.7, p = .463), but they did so in the MA Different action-scenes condition (t(19) = 5.9, p < .001).
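The comparison against chance corresponds to a one-sample t-test, which has n − 1 degrees of freedom (19 for a condition with 20 children). The sketch below shows the calculation using only the Python standard library. The scores are invented purely for illustration and are not the study's data, and the function name is hypothetical.

```python
import math
import statistics


def one_sample_t(scores, chance):
    """Return (t, df) for a one-sample t-test of the mean score against chance."""
    n = len(scores)
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)  # sample SD (n - 1 in the denominator)
    t = (mean - chance) / (sd / math.sqrt(n))
    return t, n - 1


# Chance level: two response options on each of six trials.
n_trials = 6
chance_level = n_trials * 0.5  # 3.0 correct selections

# HYPOTHETICAL number of correct selections (0-6) for 20 children in one condition.
hypothetical_scores = [4, 5, 3, 6, 4, 5, 4, 3, 5, 6,
                       4, 5, 4, 4, 5, 3, 6, 5, 4, 5]

t_value, df = one_sample_t(hypothetical_scores, chance_level)
print(f"t({df}) = {t_value:.2f}")
```

Obtaining a p value additionally requires the t distribution, for which a statistics package would normally be used.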
Discussion
The main aim of the present study was to investigate whether multiple, simultaneously presented exemplars would aid young children's verb learning. The secondary aim was to determine whether simultaneously presented exemplars should vary to aid correct verb extensions. We achieved this by comparing instances where only a single action-scene was presented, instances where multiple identical action-scenes were presented, and instances where multiple varied action-scenes were presented. Using the single action-scene condition as a baseline, we found that three-year-old children showed improved verb extension performance when they were presented with multiple exemplars, but only when exemplars varied. This was also the only condition in which children extended verbs above chance level. These findings suggest that children can learn verbs from simultaneously presented multiple exemplars, but only when exemplars vary in content.
The results of the Single action-scene condition replicate Imai et al.'s (2005, 2008) findings that three-year-olds have difficulty with mapping verbs to actions. Imai et al. (2005) suggest that three-year-olds likely map the verb onto an action–object interaction, believing that both the original action and object need to be present in order to extend the verb. They demonstrated that the actor was not considered to be part of the verb's meaning as three-year-olds were willing to extend verbs when the actor changed, as long as both the action and object stayed the same.
Of particular note is that, in order to succeed in our paradigm and regardless of experimental condition, children only needed to show a preference for the same action performed on a new novel object over a completely different action. Three-year-olds were only able to do this when they had been presented with multiple varied exemplars. It therefore appears that, without seeing the action carried out on multiple objects, three-year-olds did not recognise that the novel action was more similar to the same novel action used with a different object than to a wholly different action. This is striking, but can be understood with Imai et al.'s (2005) suggestion that three-year-olds have a strong tendency to incorporate the object as part of the verb's meaning, and seem to consider the object to be as important as the action. It should be noted that adults, too, sometimes incorporate objects into the meaning of a verb. Object arguments are almost always constrained in some way: food is eaten, relatively small and light things can be thrown, only trees are felled, etc. But for adults it is clear that the action is the central part of a verb's meaning, and therefore the best basis for extension.
One might wonder whether children's improved performance when presented with multiple varying exemplars really reflects that they learned that the verb maps onto the action component of a scene, or whether they might have learned instead that the verb does not apply to the object. But the latter suggestion would presuppose that the children were biased to try and map the novel word to an object. If this were the case, then we would expect performance in the single action-scene condition to be below chance, i.e., children would consistently have chosen the incorrect foil with the same object. Because this was not the case, we can conclude that they indeed learned that the verb mapped onto the action.
Our results provide support for the utility of structural alignment (e.g., Markman & Gentner, 1993; Gentner & Markman, 1997; Gentner, 2003) in aiding verb learning. Our findings further show that variability between exemplars is important for the abstraction of relational commonalities (here the action) when exemplars are presented simultaneously. Multiple exemplar scenes in which no content varies may increase opportunity for reflection on the scenes and the meaning of the novel verb, but this may not necessarily lead to different insights. Because the components of the scene were the same, comparing them may not have prompted the same kind of abstraction as varied exemplars. This is supported by the numerically higher, but not significantly higher, scores in the multiple action-scene condition with identical videos, compared to the single action-scene condition.
Our findings are in accordance with the suggestion of Waxman et al. (2009) and Childers (2011) that multiple exemplars can facilitate verb understanding. They therefore add to previous literature on how young children's verb understanding can be improved (Arunachalam & Waxman, 2011, 2015). We have found that exposure to varied multiple exemplars can enable young children to correctly extend verbs, even without the additional linguistic scaffolding provided by Arunachalam and Waxman (2011), or the rich semantic and syntactic information provided by Arunachalam and Waxman (2015).
More generally, our findings show the potential utility and benefits of alignment opportunities in language learning. In a similar vein to the findings of, for instance, Gentner and Namy (1999) regarding noun extensions to category members, we have found that allowing children to make comparisons across two different scenes which varied in content, labelled with the same novel verb, allows young children to focus on the action featured and thus to understand that the object is not part of the verb's meaning. Therefore, providing young children with conditions that facilitate alignment can bootstrap them up to a level of word understanding more akin to that of older children and adults.
It has previously been argued that less information may be beneficial for the formation of non-linguistic relational categories (Casasola & Cohen, 2002; Kersten & Smith, 2002; Quinn, Poly, Furer, Dobson, & Narter, 2002), on the grounds that children might focus first on objects in a scene and only later on relations. On this view, the use of different objects across action-scenes may keep children's focus on objects. However, our finding that participants did not benefit from seeing the same action-scene twice suggests that less information is not necessarily better for verb learning. Our findings rather fit with the suggestion of Childers and Paik (2009) that highly similar exemplars may lead children to be more conservative with their verb extensions. They support the suggestion by Waxman and Klibanoff (2000) that exemplar variability is important, and their finding that exemplars that did not vary at all failed to enable correct adjective extension in three-year-old children.
Our findings contrast with Maguire et al. (2008), who concluded that seeing the same information/scenes multiple times was more beneficial in enabling correct verb extensions in young children than seeing scenes where the action stayed the same but other components varied. But there are important differences between their study and ours which may explain the differing findings. While in Maguire et al. multiple scenes were consecutive, we presented scenes side-by-side. Variation might be more beneficial when direct alignment opportunities through simultaneous presentation are provided. Also, Maguire et al. varied the actor in their scenes, while we varied the objects acted on. The question arises whether there is something specific about varying objects that aids verb learning, in contrast to varying actors. Future research should aim to determine if varying other aspects of simultaneously presented exemplars, for instance the agent, can also improve verb learning.
More broadly, our findings bear similarity to work on the generalisation of syntactic constructs. Wonnacott, Boyd, Thomson, and Goldberg (2012) investigated five-year-old children's extension of novel verb argument constructions. These were taught to children in the context of either a single verb or multiple verbs. They found that children were less likely to correctly extend the construction to untrained verbs in the single verb condition. As in our study, children were more successful when exposed to multiple different exemplars than with a single exemplar. This, together with similar findings on the role of type frequency in morphology acquisition (Bybee, 1985; Plunkett & Marchman, 1991), demonstrates how our findings fit within a broader, input-driven approach to language development.
In conclusion, we have shown that when generalising from a single exemplar scene, three-year-olds do not appear to recognise that a novel action is more similar to the same novel action used with a different object than to a wholly different action. However, when presented with multiple exemplars, occurring simultaneously, they are able to pick the relevant information from the scenes, i.e., the action, for verb learning. Importantly, in order for these exemplars to lead to learning, the content of exemplars needs to vary.
Acknowledgement
Financial support for this research was provided by a grant from the Economic and Social Research Council.
Supplementary materials
For supplementary material for this paper, please visit <https://doi.org/10.1017/S0305000918000119>.