Highlights
• Semantic richness facilitation in word recognition remains underexplored in second language (L2).
• Megastudy data revealed that L2 speakers relied more on lexical information.
• In first-language (L1) speakers, all aspects of sensorimotor information facilitated performance.
• In L2, taste- and smell-related concepts did not benefit from additional facilitation.
• In L2, auditory processing facilitated responses the most.
1. Introduction
Reading a word in one’s native language activates the orthographic and phonological representation of the word, which further spreads activation to its semantic content. This can feed activation back to the orthographic or phonological representation and facilitate a rapid, efficient decision in a word recognition task (Coltheart et al., 2001; Harm & Seidenberg, 2004), allowing for comprehension of the word’s meaning. This semantic information includes perceptual and motor experience associated with the concept, encoded in the brain areas responsible for perceiving or interacting with the word’s referent (Barsalou, 1999; Goldberg et al., 2006; Vigliocco et al., 2009), and is simulated using similar neural pathways when encountering the concept’s name. For example, motor cortex activation occurs upon reading action-related words, such as “kick” (Hauk et al., 2004). According to the semantic richness theory (Buchanan et al., 2001; Pexman et al., 2008), the stronger and more detailed the activation, and the larger the amount of information about the concept, the faster and more accurate the word recognition (Pexman et al., 2008; Recchia & Jones, 2012; Yap et al., 2011; Zdrazilova & Pexman, 2013).
The facilitation effect of perceptual and motor simulation in word recognition has been demonstrated in many languages (e.g., English: Lynott et al., 2020; Miklashevsky et al., 2024; Dutch: Speed & Brysbaert, 2021; and Italian: Vergallito et al., 2020), but this mechanism is less well understood in L2 word recognition. On the one hand, access to semantic information appears to overlap between the two lexicons when comparing brain activity using electroencephalography (EEG) and functional magnetic resonance imaging (fMRI; Ma et al., 2017; Van de Putte et al., 2017), and L2 automatically activates native language representations in behavioural studies (Brysbaert et al., 1999; Thierry & Wu, 2007; Vukovic & Williams, 2014), which should drive similar semantic facilitation effects in word recognition. Indeed, activation of motor areas appears to be modulated by both native and non-native language processing (De Grauwe et al., 2014; Dudschig et al., 2014; Monaco et al., 2023; see also Kühne & Gianelli, 2019, for review). On the other hand, some evidence suggests reduced grounding of L2 lexical labels in sensorimotor experience. Norman and Peleg (2022) found that participants did not automatically simulate shape information when reading sentences about visual objects in L2.
Access to negative emotional information is usually suppressed in L2 (Foroni, 2015; Jończyk et al., 2016; Thierry & Wu, 2007; Wu & Thierry, 2012), and memory for emotional stimuli in L2 does not benefit from the same facilitation over neutral stimuli as in the native language (L1; Baumeister et al., 2017). The difference likely stems from the way in which the word meaning was acquired (i.e., in classroom settings rather than in the context of emotional situations or interacting with the referent concepts). As a result, one could expect that the links between L2 labels and semantic representation are weaker and less dense, resulting in slower and more effortful access to semantic information (Monaco et al., 2019) and less reliance on semantic information during word recognition.
Recently, psycholinguistic research has turned to megastudy techniques (Balota et al., 2012) to address questions about the semantic effects in language processing (Cortese, 2019; Cortese et al., 2018; Dymarska et al., 2023b). This approach allows for examining multiple item characteristics along the full range of their values, rather than a dichotomous split of predictors. Although megastudies of sensorimotor simulation effects are focused largely on L1 (Cortese et al., 2018; Lynott et al., 2020; Mandera et al., 2020; Speed & Brysbaert, 2021), Brysbaert et al. (2021) collected the first dataset of word knowledge of non-native English speakers, where participants were presented with real words (62,000 in total, spanning the lexicon of an average native English speaker; Brysbaert et al., 2016) and non-words and asked to indicate whether they knew the meaning of the word. This dataset presents a unique opportunity to study the effects of semantic variables (i.e., strength of sensorimotor simulation) on the accuracy and speed of non-native word recognition. Critically, data on the same set of stimuli and the same task was also collected among native speakers of English (Mandera et al., 2020), which allows for a comparison of semantic effects where any differences that emerge cannot be attributed to differences between the stimuli or tasks, but only to the language status of the participants.
In other words, a comparison of the two datasets can reveal to what extent the mechanisms of word recognition in native and non-native speakers of English vary.
The benefits of such analysis are twofold. Firstly, unlike lexical decision or word naming data, the word knowledge task (Mandera et al., 2020) has not been investigated with semantic variables as predictors of performance. Analysis of this dataset of native speakers allows for the semantic richness theory to be tested on a new task, similar to a lexical decision task, but tapping more directly into the semantic aspect of the word, due to the emphasis on knowing the word’s meaning. Namely, participants are not asked to determine whether the displayed stimulus is simply an English word – which requires primarily knowledge of English orthography – but to indicate whether they know the meaning of the word. Here, the effects of semantic strength of a concept should be even more evident, and thus, it is expected that this analysis will lend further support to semantic richness theory. Secondly, and critically, analysing data from non-native speakers, both separately and in comparison with data from native speakers, can shed light on the effects of perceptual and motor simulation on the speed and accuracy of word recognition in L2. It is possible that similar effects will emerge from the two datasets, suggesting that sensorimotor strength elicits similar facilitation regardless of the language status (i.e., using one’s native language or not). However, if the effects in L2 are weaker, whether in accuracy or time of response, then this would have implications for the theories of semantic representations in L2. In other words, it is now possible to test the semantic richness theory in L2 using a large dataset (nearly 20 thousand words, see Methods for details).
Finally, collating the perceptual and motor strength information and the word knowledge data from the two groups of speakers allows us to compare the sensorimotor profile of the words which are known by the two groups. Brysbaert et al. (2021) examined the characteristics of words best known by native and non-native English speakers and reported that L2 speakers tended to have better knowledge of academic words and complex words derived from other languages, with poorer knowledge of words used in everyday life in social or family settings, likely due to differences in exposure. However, they did not compare the perceptual and motor characteristics of best-known words among the two groups of participants. It is likely that high sensorimotor strength words are best known by native speakers of English, since they are most likely to be acquired by interacting with their referent objects. On the other hand, it is also possible that L2 speakers report knowing high sensorimotor strength words more, as they can be easier to remember (Dymarska et al., 2023a), or easy to directly match to referents frequently encountered in everyday life.
2. Current study
In the current study, I use data from existing megastudies of word knowledge in L1 and L2 English, in order to analyse the sensorimotor profile of the words that people tend to know best in their L1 and L2. Employing Bayesian hierarchical linear regressions on datasets from Mandera et al. (2020) and Brysbaert et al. (2021), I then evaluate the effects of perceptual and motor strength ratings on accuracy, response time (RT) and rank order in an English word knowledge task. I first set out to establish to what extent sensorimotor information facilitates L1 word knowledge and whether the pattern of results is comparable to research on lexical decision, validating the Mandera et al. (2020) dataset as a useful source to study word recognition alongside, for example, lexical decision and word naming from the English Lexicon Project (ELP; Balota et al., 2007). Subsequently, I investigate, for the first time, whether strength of perceptual and motor experience predicts word knowledge task performance in L2, in order to shed light on the extent to which sensorimotor information facilitates word processing in L2 compared to L1.
As predictors, I use Lancaster Sensorimotor strength ratings from Lynott et al. (2020), which have proven to be reliable predictors of lexical decision performance (e.g., Dymarska et al., 2023b; Lynott et al., 2020; Speed & Brysbaert, 2021). They include six individual perceptual strength ratings, five motor effector strength ratings and the composite Minkowski3 sensorimotor strength, which combines the multidimensional ratings into a single variable, while attenuating the influence of weaker dimensions.
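To illustrate how a composite of this kind attenuates weaker dimensions, the following minimal sketch computes a Minkowski distance with exponent 3 from the origin over 11 dimension ratings. The function name and the example ratings are hypothetical, and the sketch assumes (rather than confirms) that this matches the exact computation used for the published norms.

```python
def minkowski_strength(ratings, exponent=3):
    """Minkowski distance of a ratings vector from the origin.

    With exponent 3, large ratings dominate the result and small
    ratings contribute very little (assumed form of the composite).
    """
    return sum(abs(r) ** exponent for r in ratings) ** (1.0 / exponent)


# Hypothetical ratings for one word: six perceptual and five motor dimensions.
example = [4.5, 0.5, 0.2, 0.1, 3.8, 0.4, 0.3, 0.2, 0.1, 0.2, 3.0]

composite = minkowski_strength(example)
plain_sum = sum(example)
dominant = max(example)

print(f"composite={composite:.3f}, sum={plain_sum:.3f}, max={dominant:.3f}")
```

The composite always lies between the strongest single dimension and the plain sum of all dimensions, which is one way to see why weak dimensions carry little weight.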
3. Study 1
In this study, I analyse rank order of best-known words, as well as accuracy and speed of responses, in a word knowledge task from Mandera et al. (2020). This dataset was selected to provide direct comparison of performance in L1 and L2 speakers (Study 2), without the confounding effect of task differences. Additionally, as the dataset from Mandera et al. (2020) is more recent than commonly used datasets of word recognition (e.g., British Lexicon Project, Keuleers et al., 2012; English Lexicon Project, Balota et al., 2007), the current study will investigate for the first time whether word knowledge is affected by lexical and semantic word characteristics in the same way as word recognition, which can inform future studies attempting to use word knowledge as both dependent and independent variables. The study will provide a sensorimotor profile of words that are most commonly known by English speakers.
The study also aims to establish baseline lexical effects on word knowledge, as well as to evaluate perceptual and motor strength effects, for further comparison with L2 English speakers in Study 2. It is the first study to investigate whether words better known by the English-speaking population are rated higher on semantic variables, such as sensorimotor strength.
3.1. Method
3.1.1. Materials
The study used data from an online “word knowledge” megastudy, with speakers of English as L1 (English Crowdsourcing Project, Mandera et al., 2020; see Footnote 1), where participants were presented with words and non-words in English and were asked to “indicate whether it is a word you know or not”. The dependent variables (DVs) were as follows: rank order of the words from best known to least known (based on accuracy and RT, as ranked in the original study); accuracy in responding “yes” to words that were true English words, known to the participants; and response time on correct responses to real words.
Based on previously established lexical effects on word recognition, I collated a number of lexical predictors: log word frequency and log contextual diversity, word length in letters, Orthographic Levenshtein Distance (OLD), Phonological Levenshtein Distance (PLD), number of morphemes, number of syllables and Age of Acquisition (AoA; all available from ELP), as well as Zipf frequency (van Heuven et al., 2014). As semantic variables, I used concreteness ratings (available from ELP) and sensorimotor strength ratings in English from Lynott et al. (2020), encompassing six perceptual modalities (vision, hearing, smell, taste, touch and interoception), five motor effectors (head, mouth, torso, leg and arm action) and a composite measure of sensorimotor strength (Minkowski3), which was found by Lynott et al. to be a good predictor of word recognition.
I selected words from Mandera et al. (2020) for which values were available for all nine lexical predictors, as well as concreteness and sensorimotor strength ratings. This resulted in a sample of 19,816 words.
Due to high multicollinearity of the lexical and sensorimotor variables, I opted for a principal components analysis (PCA; parallel analysis, 95th percentile, with orthogonal varimax rotation and pairwise exclusion, using the correlation matrix), reducing the 22 variables to 6 orthogonal components (see supplemental materials for zero-order correlations between predictors and details of the PCA). The components captured 75.7% of the original variance, and the intercorrelation of the components was zero or near zero. Composite Minkowski3 sensorimotor strength was not included in the PCA, in order to compare the total effects of the rotated components with the effects of composite sensorimotor strength and to determine which method of compressing data from the 11 sensorimotor dimensions provides better predictor variables.
The components represented different aspects of information about concepts – two were lexical (capturing length- and frequency-related variables), and the remaining four captured sensorimotor experience of the words. The sensorimotor components concentrated around concrete objects (CO; loading strongly on concreteness, touch and vision and loading most negatively on interoception), physical sensation (PS; loading on interoceptive strength and on arm, foot and torso experience), taste and smell (TS; loading on gustatory and olfactory experience and experience with the mouth) and talking and listening (TL; loading on auditory, head and mouth experience). This was similar to the components in Dymarska et al. (2023b), who analysed over 9,000 words with similar variables, supporting the robustness of the division of the components, and perhaps mirroring the type of concepts we often tend to encounter and label in everyday life. The components also partially overlapped with factors identified in a sample of 6,000 words by Diveica et al. (2024): visuotactile (to do with touch and vision), body action (motor) and social interaction (similar to communication; listening and talking).
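The component-retention step of such a PCA can be sketched as follows. This is a minimal, standard-library illustration of parallel analysis at the 95th percentile on a correlation matrix, using simulated data with two built-in factors. It is not the author's analysis code, it omits the varimax rotation, and all function names and data are hypothetical.

```python
import math
import random


def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equal-length columns."""
    n = len(columns[0])
    z = []
    for col in columns:
        mean = sum(col) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
        z.append([(v - mean) / sd for v in col])
    k = len(columns)
    return [[sum(a * b for a, b in zip(z[i], z[j])) / (n - 1) for j in range(k)]
            for i in range(k)]


def eigenvalues_symmetric(matrix, sweeps=100, tol=1e-18):
    """Eigenvalues of a symmetric matrix via cyclic Jacobi rotations (descending)."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                theta = math.pi / 4 if diff == 0 else 0.5 * math.atan(2 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


def parallel_analysis(columns, n_sims=50, percentile=95, seed=0):
    """Count components whose eigenvalue exceeds the chosen percentile of
    eigenvalues obtained from random normal data of the same shape."""
    rng = random.Random(seed)
    n, k = len(columns[0]), len(columns)
    observed = eigenvalues_symmetric(correlation_matrix(columns))
    sims = [
        eigenvalues_symmetric(correlation_matrix(
            [[rng.gauss(0, 1) for _ in range(n)] for _ in range(k)]))
        for _ in range(n_sims)
    ]
    idx = min(n_sims - 1, math.ceil(percentile / 100 * n_sims) - 1)
    retained = 0
    for i in range(k):
        threshold = sorted(sim[i] for sim in sims)[idx]
        if observed[i] <= threshold:
            break
        retained += 1
    return retained, observed


# Simulated data: six variables driven by two independent latent factors.
data_rng = random.Random(1)
factor_a = [data_rng.gauss(0, 1) for _ in range(300)]
factor_b = [data_rng.gauss(0, 1) for _ in range(300)]
variables = [[f + 0.3 * data_rng.gauss(0, 1) for f in factor_a] for _ in range(3)]
variables += [[f + 0.3 * data_rng.gauss(0, 1) for f in factor_b] for _ in range(3)]

n_components, observed_eigenvalues = parallel_analysis(variables)
print(n_components, [round(e, 2) for e in observed_eigenvalues])
```

In practice a statistics package would handle the eigendecomposition and rotation; the hand-written Jacobi routine is used here only to keep the sketch self-contained.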
The sensorimotor components, together with the lexical components, served as predictors in the analysis. All data, along with the analysis code, can be found at: https://osf.io/wx8fc/
3.1.2. Data and analysis
Data was analysed using Bayesian hierarchical linear regressions in JASP (0.18.1: JASP Team, 2023), with default JZS priors (r = .354) and a Bernoulli distribution (p = 0.5). In Step 1, I entered the two lexical components (Length and Frequency; see Footnote 2). In Step 2, I added sensorimotor strength measures: the four sensorimotor components (Step 2a) and Minkowski3 sensorimotor strength (Step 2b), in order to determine whether sensorimotor strength facilitates performance in the word knowledge task over and above the lexical variables.
I used Bayes Factors (BFs) per model to establish whether the model step with Minkowski 3 sensorimotor strength or with the four sensorimotor components outperformed the lexical model and which model step provided better fit to the data. I report regression coefficients and their BFinclusion (evidence threshold: BFinclusion ≥ 3) to estimate the effect of each predictor on word knowledge.
The analysis was repeated for each of the three dependent variables from Mandera et al. (2020) – rank order, accuracy and response time.
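The logic of the stepwise comparison can be sketched with ordinary least squares on simulated data. Note the substitutions: the sketch uses a BIC approximation to the Bayes factor rather than the JZS prior used in JASP, and all variable names, effect sizes and data are hypothetical.

```python
import math
import random


def ols_r_squared(predictors, y):
    """R^2 of an OLS fit with intercept; predictors is a list of columns."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in predictors] for i in range(n)]
    k = len(rows[0])
    # Augmented normal equations [X'X | X'y], solved by Gauss-Jordan elimination.
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
           + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    beta = [aug[i][k] / aug[i][i] for i in range(k)]
    fitted = [sum(b * x for b, x in zip(beta, row)) for row in rows]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def log_bf10_bic(r2_null, r2_alt, n, extra_params):
    """BIC approximation to the natural log of BF10 for nested linear models.

    This is a stand-in for the JZS Bayes factor, not the same quantity.
    """
    return 0.5 * (n * math.log((1 - r2_null) / (1 - r2_alt))
                  - extra_params * math.log(n))


# Simulated item-level data (hypothetical effect sizes).
rng = random.Random(7)
n_items = 400
frequency = [rng.gauss(0, 1) for _ in range(n_items)]
length = [rng.gauss(0, 1) for _ in range(n_items)]
sensorimotor = [rng.gauss(0, 1) for _ in range(n_items)]
outcome = [0.6 * f - 0.3 * l + 0.2 * s + rng.gauss(0, 0.5)
           for f, l, s in zip(frequency, length, sensorimotor)]

r2_step1 = ols_r_squared([frequency, length], outcome)                # lexical only
r2_step2 = ols_r_squared([frequency, length, sensorimotor], outcome)  # + sensorimotor
delta_r2 = r2_step2 - r2_step1
log_bf10 = log_bf10_bic(r2_step1, r2_step2, n_items, extra_params=1)

print(f"Step 1 R2={r2_step1:.3f}, change in R2={delta_r2:.3f}, log BF10={log_bf10:.1f}")
```

A positive log BF10 favours the larger model; the change in R² gives the additional variance explained by the added step.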
3.2. Results
Overall accuracy was high, with 97% of words recognised correctly by native English speakers (standard deviation (SD) = 4.2%), and average RT on correct responses was 982 ms (SD = 140 ms). The top five words (see Footnote 3), with highest accuracy and fastest response times, were “pet”, “water”, “blood”, “horse” and “hair”. Words ranked the lowest, with poor accuracy of recognition and slowest response times, were “wisenheimer”, “riffle”, “enmesh”, “matchwood” and “zodiacal”. Table 1 shows descriptive statistics of the top 500 words in L1 and L2, based on the lexico-semantic variables entered into the PCA.
Table 1. Descriptive statistics of top 500 words known by L1 and L2 speakers

Figure 1 illustrates how the component scores are represented in the 500 most and least known words. While the two groups appear to vary, and the most known words tend to score higher on sensorimotor components, the CO component did not show significant difference between the two groups of words (p > .1). Native speakers of English tend to report knowing words that are generally higher in sensorimotor strength, but it is possible that some less known words can still refer to highly concrete concepts or reflect specialised knowledge (e.g., obscure objects).

Figure 1. Comparison of the component scores for 500 best and 500 least known words in the L1 group (Study 1).
Note: CO – concrete objects; PS – physical sensation; TS – taste and smell; TL – talking and listening.
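A comparison of this kind (best-known versus least-known words on a component score) can be sketched as below. The data are simulated, the field names are hypothetical, and Welch's t statistic is used here as one reasonable choice; the source does not state which test produced the reported p values.

```python
import math
import random
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic for two independent samples with unequal variances."""
    standard_error = math.sqrt(variance(sample_a) / len(sample_a)
                               + variance(sample_b) / len(sample_b))
    return (mean(sample_a) - mean(sample_b)) / standard_error


def compare_extremes(words, score_key, k):
    """Compare a component score between the k best- and k least-known words.

    `words` is a list of dicts with a 'rank' field (1 = best known).
    """
    ordered = sorted(words, key=lambda w: w["rank"])
    best = [w[score_key] for w in ordered[:k]]
    worst = [w[score_key] for w in ordered[-k:]]
    return mean(best), mean(worst), welch_t(best, worst)


# Simulated items: knowledge depends partly on a hypothetical TL component score.
rng = random.Random(42)
items = []
for _ in range(2000):
    tl_score = rng.gauss(0, 1)
    items.append({"TL": tl_score, "knowledge": 0.5 * tl_score + rng.gauss(0, 1)})
for rank, item in enumerate(sorted(items, key=lambda w: -w["knowledge"]), start=1):
    item["rank"] = rank

best_mean, worst_mean, t_value = compare_extremes(items, "TL", k=500)
print(f"best={best_mean:.2f}, worst={worst_mean:.2f}, t={t_value:.1f}")
```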
The baseline lexical model explained 28.2% of variance in accuracy, 40.9% of variance in L1 rank and 50.5% in zRT (based on total R²; see Table 2 and Figure 2), which was comparable to or slightly higher than lexical decision accuracy and RT (Khanna & Cortese, 2021; Dymarska et al., 2023b). Participants responded faster to shorter and more frequent words. Higher frequency also improved accuracy and led to higher rank. Unexpectedly, accuracy was better for longer words, and there was no effect of word length on rank.
Table 2. Variance in word knowledge explained by each step of the regression models (change in R², with Bayes Factors for each step compared to the previous) and by each lexical and sensorimotor component (mean posterior coefficients of individual predictors) for each DV.

Note: Rank coefficients have large values due to the large units that rank is measured in (1:57,327), relative to the units of PCA (around −3 to 3). Negative coefficients indicate that higher frequency or sensorimotor strength are associated with numerically lower (i.e., better) ranks, meaning values closer to 1.
*BF10 ≥ 3, positive evidence; **BF10 ≥ 20, strong evidence; and ***BF10 ≥ 150, very strong evidence.

Figure 2. Mean of posterior coefficients of effects of lexical (Step 1 model) and sensorimotor components on word knowledge (Step 2a – sensorimotor components, 2b – Minkowski 3). Plain bars show results of Study 1 (effects in L1); shaded bars show results of Study 2 (effects in L2). Error bars represent 95% Credible Intervals. *BF10 ≥ 3; **BF10 ≥ 20; and ***BF10 ≥ 150.
Note: CO – concrete objects; PS – physical sensation; TS – taste and smell; TL – talking and listening.
The Step 2a model with sensorimotor components explained an additional 2.1% of variance in accuracy (BF10 = 1.205 × 10^117), 2.8% of variance in L1 rank (BF10 = 3.507 × 10^198) and 3% in zRT (BF10 = 3.636 × 10^265), which was comparable to previous findings on sensorimotor effects in lexical tasks (Dymarska et al., 2023b).
The Step 2b model with Minkowski3 sensorimotor strength contributed between 1.9% and 3% of variance in dependent variables. The Step 2a model with sensorimotor components offered a better fit than the 2b model with the Minkowski3 aggregate measure for accuracy and rank (BF10 ≥ 5980148.883), but the Step 2b model offered a better fit for zRT (BF01 = 17978.54785).
Within both model 2a and model 2b, each component, as well as the composite measure, contributed to reported word knowledge of native English speakers in a facilitatory manner. All effects were very strong (BFinclusion ≥ 210.391; see Figure 2). Participants were more likely to indicate that the word was known when it was rated highly on sensorimotor strength and were also faster to make the response. Words with stronger sensorimotor strength were ranked as lower numerically, that is, the stronger the sensorimotor representation, the closer to number 1 (best known) the word was.
3.3. Conclusion
The results support the semantic richness theory, whereby visual processing of words is likely to be faster and more accurate when the word is rated as stronger in sensorimotor experience (Connell & Lynott, 2012; Pexman et al., 2008; Speed & Brysbaert, 2021). This was found in a task which was not a typical lexical decision task; that is, participants were asked to determine whether they knew the meaning of a given word, not whether it was a real word. In the current task, participants may have consciously tried to focus more on the meaning of the word, but this did not seem to influence the effect sizes of the semantic variables, which accounted for the same amount of variance as would be expected in a lexical decision task. It appears that semantic activation occurs automatically upon encountering a written word, to the same extent regardless of specific task instructions, or in other words, the two tasks tap into the same word recognition mechanism. Overall, native speakers of English report knowledge of words with strong sensorimotor representations.
4. Study 2
In Study 2, I examined whether systematic and automatic activation of sensorimotor information during word recognition occurs in participants who are not native speakers of English, whether the patterns that emerged in Study 1 extended to non-native speakers, or whether the semantic activation in L2 word processing varies from the activation in L1 to the extent that it affects word recognition performance.
4.1. Method
Materials and data analysis were the same as in Study 1, with one key difference: the dependent variables (rank, accuracy and RT) were taken from an online megastudy where participants were speakers of English as L2 (Brysbaert et al., 2021). Participants of the study came from different countries (more than 150 different languages reported as L1) and varied in their proficiency levels, with the majority self-reporting as fluent in English. Self-reported language proficiency statistics are presented in Table 3.
Table 3. Number of sessions and performance for the four proficiency levels indicated by participants in Brysbaert et al. (2021)

I report regression coefficients and BFinclusion values from the full Step 2a model (including all four sensorimotor components) and the Step 2b model with Minkowski 3 sensorimotor strength.
4.2. Results
Accuracy was somewhat lower and less consistent than among native speakers, with 79% of words recognised correctly (SD = 17.7%), and average RT on correct responses was slower (mean = 1240 ms, SD = 224 ms). The top five words, with highest accuracy and fastest response times, were “help”, “best”, “smile”, “full” and “coffee”. Words ranked the lowest, with poor accuracy of recognition and slowest response times, were “wisenheimer”, “sudsy”, “hibachi”, “cummerbund” and “varmint”. Among non-native speakers of English, the top 500 words generally recognised as best known were slightly more frequent, longer and with denser orthographic and phonological neighbourhoods (see Table 1). On the other hand, they were rated higher on most, though not all, sensorimotor dimensions, compared to the top 500 words recognised as best known by native speakers.
Similar to native English speakers in Study 1, participants tended to report words scoring higher on sensorimotor strength as known (see Figure 3), and the 500 most and least known words generally differed significantly on the sensorimotor dimensions, except for the CO component (p > .1).

Figure 3. Comparison of the component scores for 500 best and 500 least known words in the L2 group (Study 2).
Note: CO – concrete objects; PS – physical sensation; TS – taste and smell; TL – talking and listening.
In the regression analysis of L2 word knowledge, 44.0% of variance in accuracy was explained by the lexical model, which was higher than in Study 1. The baseline model explained 47.3% of variance in L2 rank (higher than in Study 1) and 53.2% of variance in zRT (higher than in Study 1). Critically, both Length and Frequency components elicited much stronger, consistent facilitation effects in L2 than in L1 (see Table 2 and Figure 2).
In Step 2a, the full model explained an additional 2.0% of variance in accuracy (BF10 = 5.897 × 10^147), 2.1% of variance in word rank (BF10 = 792331.784) and 2.2% of variance in response time (BF10 = 1.181 × 10^126), similar to Study 1 for accuracy, but less for rank and zRT. Facilitation was primarily driven by two components: Talking and Listening and Physical Sensation, which elicited very strong effects on all DVs (BFinclusion ≥ 74959.284), with weaker effects of the Concrete Objects component (BFinclusion ≤ 39.694). That is, participants were most likely to report the word as known when it scored highly on the two strongest components. The final component, Taste and Smell, elicited the most variable effects, with a strong facilitation of RT, a weaker effect on rank, and no effect on accuracy. Interestingly, the effects of component TL were much stronger in L2 compared to L1 participants (see Table 2 and Figure 2): its BFinclusion values were consistently larger in Study 2 than in Study 1 for each dependent variable.
The Step 2a model offered a better fit than the 2b model for accuracy and RT (BF10 ≥ 4.710 × 10^66), with the Step 2b model outperforming 2a for rank (BF01 = 6.573 × 10^59). The contribution of Minkowski3 sensorimotor strength was much lower than in Study 1, explaining less than 1.5% of variance (based on total R²), although the regression coefficient indicated a higher contribution of Minkowski3 to accuracy in L2 than in L1.
4.3. Cross-study comparison
As a final comparison, I conducted a combined analysis of response time of the two groups, using full Step 2a model parameters as baseline (all six rotated components), with Study number and its interaction with each sensorimotor component as an additional step of the analysis. Results showed that the addition of the study predictor and its interaction with CO and TL components accounted for an additional 7.4% of variance in the model (log(BF10) = 3060.566). The contribution of the CO component to RT was overall weaker in Study 2 ($\hat{\beta}$ = 0.012, log(BFinclusion) = 26.140), while the contribution of the TL component was stronger in Study 2 ($\hat{\beta}$ = −0.015, log(BFinclusion) = 41.696), and the other two components did not interact with the study number. This is consistent with the results of separate analyses, as illustrated in Figure 2.
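The structure of such a combined model can be sketched as a design matrix with a study indicator and study-by-component interaction terms. The 0/1 coding of study (0 = L1, 1 = L2), the component scores and the main-effect value used below are assumptions for illustration; only the interaction estimate of −0.015 for TL is taken from the text.

```python
LEXICAL = ["Length", "Frequency"]
SENSORIMOTOR = ["CO", "PS", "TS", "TL"]


def design_row(scores, study):
    """One regression row: intercept, six components, study, study x sensorimotor.

    `study` is assumed to be dummy coded: 0 for Study 1 (L1), 1 for Study 2 (L2).
    """
    main_effects = [scores[name] for name in LEXICAL + SENSORIMOTOR]
    interactions = [study * scores[name] for name in SENSORIMOTOR]
    return [1.0] + main_effects + [float(study)] + interactions


def slope_in_study(main_beta, interaction_beta, study):
    """Slope of a component in a given study under 0/1 dummy coding."""
    return main_beta + interaction_beta * study


# Hypothetical component scores for a single word.
word = {"Length": -0.4, "Frequency": 1.2, "CO": 0.3, "PS": -0.1, "TS": 0.0, "TL": 0.8}
row_l1 = design_row(word, study=0)
row_l2 = design_row(word, study=1)

# Hypothetical TL main effect (-0.020) combined with the reported interaction (-0.015).
tl_slope_l1 = slope_in_study(-0.020, -0.015, study=0)
tl_slope_l2 = slope_in_study(-0.020, -0.015, study=1)
print(row_l2, tl_slope_l1, tl_slope_l2)
```

Under this coding, the interaction coefficient is the difference between the two studies in a component's slope, so a negative interaction on RT corresponds to stronger facilitation in Study 2.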
4.4. Conclusions
The results of Study 2 showed a reduced contribution of sensorimotor variables to word processing accuracy in non-native English speakers, with an increased contribution of lexical variables compared to native speakers in Study 1. Non-native speakers of English tended to report knowledge of words that relate to communication with the world (e.g., “concert”, “human” and “chat”), as well as to bodily sensations (e.g., “pain”, “exercise”, “move” and “feeling”). Although their referents are easier to interact with and access in real life, words which referred to Concrete Objects were only somewhat more likely to be known among non-native speakers, perhaps because it was a broad category not fully capturing what kind of objects non-native speakers of English are familiar with or have encountered in a non-native context. Similarly, words describing food and related actions were not more likely to be reported as known, which may be due to the fact that beyond certain basic items, food tends to include specialised vocabulary, which varies between regions of the world, and may only be acquired at a native level when immersed in a given culture for a long time. Unexpectedly, the contribution of the Talking and Listening component was strong, higher than its contribution to L1 processing on all DVs, which may provide some insight into the way L2 speakers process words, with focus on auditory information. I address this in more detail in the general discussion.
Minkowski 3 composite variable explained some of the variance in performance, with weaker effects than the four combined sensorimotor components, but at times better model fit. The contribution of the composite variable was also much higher for accuracy in Study 2 than in Study 1 (see Table 2 for detailed coefficients). Since Lynott et al. (Reference Lynott, Connell, Brysbaert, Brand and Carney2020) found it to be a stronger variable than both summed sensorimotor components and principal components of sensorimotor strength, it is possible that the mixed results in the current study are driven by concreteness which was included in the PCA.
Critically, while the contribution of sensorimotor components was overall reduced compared to Study 1, the increased contribution of the Frequency and Length variables led to a larger total R² for accuracy in L2 participants (see Table 2 and Figure 2). The difference was supported by the BF (also larger for the L2 model). This is in line with previous research on frequency effects in L2 word recognition and can be attributed to lower exposure and smaller vocabulary size in L2, increasing participants’ sensitivity to high-frequency words (Brysbaert et al., 2017; Duyck et al., 2008).
At the same time, in the analysis of response times, all components elicited a facilitation effect on speed of processing, with the effect sizes mostly similar to Study 1, suggesting that just like in L1, processing known written words in L2 (i.e., reading the word, accessing its meaning and making a decision on how to respond) can be faster for words that have richer semantic representations. In line with accuracy results, the contribution of the component CO was reduced and the contribution of the component TL was stronger than in L1, but overall, the results provide evidence that semantic richness can play a role in L2 word processing.
5. General discussion
The study investigated the role of semantic variables in word recognition performance and the sensorimotor profiles of words best known by native and non-native speakers of English. I performed Bayesian hierarchical regressions on megastudy data from a novel “word knowledge” task (Brysbaert et al., 2021; Mandera et al., 2020) and found that in L1 speakers, both lexical and semantic variables elicited the expected facilitation effects. The results of Study 1 supported semantic richness theory predictions, showing that native speakers of English respond faster and more accurately to words with stronger sensorimotor representations. Sensorimotor information was a good measure of semantic content activated upon seeing a word, and it reliably facilitated both accuracy and response times in the word knowledge task, which provided further evidence for the involvement of sensorimotor simulation in language processing. On the other hand, Study 2 found that sensorimotor information plays a reduced role in L2 word recognition and knowledge, with lexical characteristics eliciting stronger effects on performance.
The contribution of sensorimotor components to word knowledge in L1 provides support for the linguistic-sensorimotor theories of language processing and is in line with previous research demonstrating the role of sensorimotor variables in word recognition (e.g., Dymarska et al., 2023b; Speed & Brysbaert, 2021). Despite very high accuracy of performance among participants, with relatively little variance to explain, strong facilitation effects emerged across all types of sensorimotor experience. Although the Taste and Smell component produced a relatively weaker effect on accuracy, it still strongly facilitated response time. As taste and smell have been underexplored in word recognition studies, with mixed results regarding their contribution to task performance (Dymarska et al., in prep), this finding raises the possibility that olfactory and gustatory experience can indeed support word recognition. On the other hand, it is possible that the effect of taste and smell could stem from the emphasis on word meaning in the current word knowledge task. Specifically, the instruction to focus on knowing the word meaning may have led participants to direct their attention to these senses (cf. Connell & Lynott, 2016), which enhanced the relevance and statistical contribution of the TS component to task performance. Notably, most native speakers achieved very high accuracy on the task, and it is possible that the true contribution of spreading semantic activation in performance on the task is underestimated, so further research is needed to unequivocally determine whether the word knowledge task relies on sensorimotor information to the same extent.
The most striking result in Study 2, on the other hand, is the strong contribution of the Talking and Listening component, related to auditory experience. While visual strength is expected to dominate performance in a visual task (Connell & Lynott, 2014), this finding hints at a mechanism or a strategy employed by L2 speakers in the word knowledge task. When asked about the meaning of a given word, it is possible that participants (who performed the study online, likely at home) chose to pronounce the word out loud or mentally focus on the sound of the word, in order to bring to mind the memory of the word’s meaning, and to evaluate whether or not they knew it. This is especially likely given the reduced activation of other aspects of word meaning (as illustrated by weaker CO and TS effects), which provided less support for the task compared to L1 participants. In that case, the task became somewhat auditory in nature and thus activated more of the related auditory simulation where available, resulting in a facilitation for such words (see Connell & Lynott, 2014). For other words, where simulation of auditory experience was not relevant, the sound of the word and its phonological characteristics, as well as the familiarity of its sound form, may have driven participants’ recognition, which is what led to increased frequency effects, and consistent but small effects of the remaining sensorimotor components on response time (especially if words with strong sensorimotor representations are used more often in language and are therefore more familiar, regardless of the content of their representation).
Due to the environment in which L2 is often acquired (i.e., classroom-based learning requiring repetition of word sounds rather than grounded experience of the associated word referents), it is possible that non-native speakers of English are more successful at learning and processing words based on their lexical characteristics, such as what the word sounds like and the frequency of exposure. This can result in a weaker link between the L2 lexical entry and semantic information and can lead non-native speakers to rely on lexical characteristics more than native speakers do, at least until they build up native-like vocabulary (cf. Brysbaert et al., 2017).
Interestingly, a strong effect of the Physical Sensation component also emerged for L2 speakers, whereby participants responded more accurately to words related to bodily movements and experiences (e.g., “workout” and “pain”). Since body- and physical experience-related concepts may capture increased attention due to their importance for survival (Bonin et al., 2024; Dymarska et al., 2023a), it is possible that the activation of this aspect of sensorimotor experience is strong even when encountering the concept in L2, although this conflicts to some extent with the findings that attention to negative information (often associated with survival as well) is reduced in L2 (Jończyk et al., 2016; Wu & Thierry, 2012). On the other hand, Body and Communication components in Dymarska et al. (2023b; similar to PS and TL components in the current analysis) elicited the most consistent facilitation effects on lexical decision across different word samples, so it is possible that they simply have stable enough representations that their activation is more reliably predicted in an L2 context. Further research into the nature of sensorimotor simulation in L2 processing is needed to disentangle these potential mechanisms.
I performed a large-scale analysis of data from a megastudy of a novel word recognition task to shed light on the sensorimotor strength effects in second language users of English. A systematic comparison with native speakers showed that all other variables being equal (i.e., words used as stimuli, instructions in the task, dependent measures and predictor variables), non-native language processing relies on sensorimotor variables to a lesser extent than native language processing. Because the data was collected online, and L2 participants came from different backgrounds, these differences cannot be attributed to the characteristics of a specific language or to low proficiency levels, but rather they shed light on the general process of word recognition in a large, diverse group of non-native English speakers, which is representative of much of the world’s population today. The results indicate that when processing words in L2 English, sensorimotor simulation is reduced. Even when asked explicitly about knowing the words’ meaning, people make the decision primarily based on their knowledge of the word form. Nonetheless, some physical and auditory experience is also used, appearing to have relatively stronger activation, possibly due to the nature of the task (auditory experience) or its importance in interacting with the world (physical and bodily experience).
It is important to note that all the predictor variables, that is, sensorimotor strength ratings, concreteness ratings and age of acquisition ratings, were obtained from L1 speakers, and frequency measures were calculated based on assumed native exposure. Although sensorimotor strength norms from L2 English speakers do exist for a limited sample of words (Lee & Shin, 2023), other variables are not available, and there were many advantages to using norms obtained from native speakers. First, it allowed for a direct comparison of effect sizes between analyses where only participant characteristics vary, thus indicating to what extent and in what ways semantic activation in non-native English speakers affects performance on word recognition differently, compared to a native speaker’s performance. Second, ratings from L2 speakers are likely to be influenced by their native language, where differences in semantic representations may be the result of linguistic or cultural variations, and not general cognitive mechanisms that occur when using a language learned later in life. Nonetheless, it is important to look at the issue more closely and to investigate L2 semantic representations using variables obtained from native speakers of languages other than English, comparing them directly with variables obtained from non-native speakers of English, as well as other languages.
Critically, knowing which aspects of sensorimotor simulation are strongly activated in L2 helps generate further predictions about processes beyond word recognition. For example, memory for words was recently found to be affected by various aspects of sensorimotor information in different ways (Dymarska et al., 2023a), depending on which aspect of sensorimotor information was most relevant for a given concept. I hope that this study will spark curiosity among researchers on bilingualism and semantic processing, leading to future studies addressing the question of embodiment effects in L2 processing.
Data availability statement
The data that supported the findings of this study was collated from Brysbaert et al. (2021); Mandera et al. (2020); Lynott et al. (2020); Balota et al. (2007); and van Heuven et al. (2014). The final dataset, together with the analysis code, is available on the Open Science Framework at https://osf.io/wx8fc/.
Competing interests
The authors declare none.