Introduction
In 1970, Fraser suggested that idioms are not homogeneous when it comes to their transformability. Some idioms are more flexible and can easily undergo transformations (e.g., “the towel was thrown in by him,” the passivized form of “throw in the towel,” retains the idiomatic meaning “he gave up”), while others are inflexible (e.g., “the bucket was kicked by him” is less likely to be interpreted as “he died”) (Gibbs & Gonzales, 1985). However, the diversity of idioms is not limited to their structural flexibility; indeed, idioms may vary along a large number of linguistic dimensions (see Libben & Titone, 2008 and Bulkes & Tanner, 2017 for English idioms; Bonin et al., 2013 for French idioms; Tabossi et al., 2011 for Italian idioms; Li et al., 2016 for Chinese idioms; Hubers et al., 2019 for Dutch idioms). For example, some idioms are more familiar to language users than others (e.g., “is a piece of cake” vs. “go pear-shaped”). Some idioms have literal interpretations but many lack this quality (e.g., “cross one’s fingers” vs. “be the apple of someone’s eye”). Furthermore, the meaning of some idioms can be extracted from the constituents that compose them, while others have low decomposability (e.g., “cover one’s tracks” vs. “go cold turkey”). Multiple studies in psycholinguistics and neurolinguistics have indicated that these dimensions generally affect the processing of idioms (e.g., Cieślicka & Heredia, 2011; Titone et al., 2019; Carrol & Conklin, 2020; Morid et al., 2021).
More recently, researchers have put forward the idea that idioms additionally vary along dimensions of emotional content. In general, it has been well established that the emotional content of stimuli modulates their processing at both the word level and the sentence level (e.g., Arfé et al., 2022). Emotional valence (the extent to which an emotion is pleasant/positive or unpleasant/negative) and arousal (whether the evoked emotion is perceived as exciting or calming) are considered the two main dimensions that define emotions (Russell, 2003). Though researchers have access to databases which specify affective measures (including emotional valence) for single words in various languages (e.g., Warriner et al., 2013; Imbir, 2016; Yao et al., 2017; Stadthagen-Gonzalez et al., 2017), comparable databases for idioms are remarkably scarce (see Citron et al., 2016 for German idioms, and Gavilán et al., 2021 for Spanish idioms). The impact of emotional content on idiom processing is compelling given that these expressions are typically used in emotionally charged conversations (Citron et al., 2016). As such, the primary aim of this study is to create the first known set of affective norms for English idioms.
Previous studies suggest that concreteness (the degree to which the meaning of a word or expression is understood through perception and action) is a crucial variable in the processing of emotional words (Barber et al., 2013). In particular, studies suggest that abstract words are more likely to refer to emotional states than concrete words (Altarriba et al., 1999). Some researchers have proposed that, besides linguistic information, two major sources of experiential information (sensory–motor and affective) are involved in the process of word learning and representation (Vigliocco et al., 2009). These authors also argued that the representation of concrete words relies on sensory–motor information, while emotional information plays a crucial role in the representation and processing of abstract words (see Kousta et al., 2011 and Vigliocco et al., 2014 for behavioural and fMRI support, respectively). Importantly, it seems that the so-called “concreteness effect” (i.e., the fact that participants process concrete words more rapidly than abstract words; Holcomb et al., 1999) can be replaced by an “abstractness effect” (i.e., abstract words are processed more rapidly than concrete words) once researchers control for a large number of lexico-semantic variables (including familiarity and imageability). Given the relation between affective factors, concreteness, and imageability, a database containing norms for all these dimensions would be indispensable for future research.
Thus, the second aim of the current study is to additionally include concreteness and imageability ratings for the same set of English idioms, allowing researchers to control for these factors.
Why is it important to collect these measures for idiomatic expressions? Psycholinguistic approaches to idiom processing have been influenced by the ideas proposed by Lakoff & Johnson (1981) in their book “Metaphors we live by.” This idea, theorized as “conceptual metaphor,” holds that the nature of our conceptual system is metaphorical. For example, in a given culture, argument might be understood as a battle in which the people involved attack each other. The argument is thus conceptualized metaphorically as argument is war. In an imaginary culture, however, argument could be realized as a dance during which the people involved try to create a performance rather than attack each other. Two major conclusions of this view are that metaphorical concepts (e.g., argument is war versus argument is dance) help us understand one kind of experience in terms of another, and that, essentially, a less concrete experience (i.e., argument) is understood through more concrete ones (i.e., war or dance). Based on this view, cognitive embodiment argues that all cognitive processes, including language processing, have links with affective and motor areas of the brain (Vigliocco et al., 2009). More specifically, embodied theories of semantic representation and processing argue that during the processing of meaning, the same neural structures that relate to perception and action get activated automatically.
Recently, and based on this view, researchers have suggested that the emotional content of idioms (valence: the extent to which an emotion is pleasant/positive or unpleasant/negative, and arousal: whether the evoked emotion is perceived as exciting or calming) and concreteness (the degree to which the meaning of a word or expression is understood through perception and action) may impact their processing (e.g., Citron et al., 2016). Given that idioms usually convey meanings that are different from the meanings of their constituents, the individual words that constitute idioms do not necessarily transfer their characteristics to the idiom. For example, the level of concreteness of the individual constituents of the expression “racked her brain” is different from the concreteness level of the idiom as a whole. When participants read or hear the idiom “racked her brain,” if the expression is unfamiliar to them and if there is no supporting context before the idiom, the word “brain” might be analyzed as an individual unit. However, if the language user is familiar with this expression, and more importantly, if the prior context makes the expression more predictable, then by the end of the expression, the meaning of “racked her brain” might get activated. These two scenarios show that the language user is processing two words/concepts with varying levels of concreteness (3.96 and 2.81 for “brain” and “racked her brain,” respectively). The same holds for the valence and arousal of idiomatic expressions. As such, and given the importance of considering the role of affective factors and concreteness mentioned by embodied accounts, this study aims to collect such measures for a set of English idioms.
As mentioned above, valence and arousal are two main dimensions that define the structure of affect. The relation between valence and arousal is typically reported as quadratic (Citron et al., 2016), with both negative and positive words being more arousing than neutral words. Moreover, the literature on single words suggests that words that are highly valenced and arousing tend to be more abstract (Vigliocco et al., 2014). Other studies report a positive correlation between arousal and imageability (Citron et al., 2014) or a negative quadratic correlation between arousal and concreteness (Montefinese et al., 2014; Citron et al., 2016). As such, beyond simply providing a database of ratings for idioms, the third goal of this study is to explore the relation between these collected measures, as motivated by previous research on single words and nonliteral expressions (Footnote 1).
Finally, given that idiomatic expressions are pervasive in everyday conversations (Citron et al., 2016), psycholinguistic researchers have increasingly compared first-language (L1) and second-language (L2) idiom processing. Research shows that L2 learners usually encounter difficulty when learning and comprehending idioms (Abel, 2003; Titone et al., 2015). Though L2 idiom processing appears to bear similarities to L1 idiom processing (Heredia & Cieślicka, 2015), psycholinguistic and affective ratings by L1 and L2 speakers may vary. For example, compared to native speakers, English learners rated English idioms as more decomposable (Abel, 2003). The final goal of this study is thus to provide both L1 and L2 affective and sensory–motor norms for the same English idioms. This will allow second-language researchers to account for these factors in future idiom-processing studies.
Methods
Participants
A total of 555 students from the University of Ottawa (318 women, 216 men, 21 unspecified gender), between 18 and 23 years of age (Mean = 19.48), completed the online survey. Participants were recruited through the university’s Integrated System of Participant Research (ISPR). They received partial course credit as compensation for their participation. Fifteen participants were excluded from the final analysis (see Data analysis section). The final sample thus contained 540 participants (314 women, 208 men, 18 unspecified gender; Mean age = 20.1, range = 18.4 – 23.7).
Participants completed an extensive language background questionnaire (LBQ; Sabourin et al., 2016). Based on the information obtained from this LBQ, participants were divided into two groups (see Table 1): native (L1) English speakers (n = 300) and second-language (L2) English speakers (n = 240). L1 speakers were participants who self-reported that English was their first language (over 90% exposure during infancy) and their current most dominant language. L2 speakers were participants who self-reported another language as their L1 (less than 50% exposure to English during infancy) and indicated that English was currently their second most dominant language. Note that both groups were highly proficient in English (see Table 1). In cases where participants provided potentially inconsistent self-reports (e.g., they indicated that English was their first language but also said they had low English proficiency), the participant’s data were not included in the analysis.
Materials
The experimental materials consisted of 210 idiomatic expressions from Libben and Titone (2008). The idioms from this database had previously been normed for the psycholinguistic dimensions of interest: familiarity, literal plausibility, and decomposability (defined in Table 2, below). As such, we could compare the pre-existing psycholinguistic dimensions with our novel affective and concreteness ratings. All idioms possessed the form of “She/He/It verb (past tense) x noun,” where x was a preposition, an article, or a determiner (e.g., “It slipped his mind,” “She raised the devil,” “He got a toehold”). This uniformity ensured that length and phrasal complexity were well controlled (Libben & Titone, 2008). The full list of experimental materials can be found on the Open Science Framework repository at ASN Idioms.
Procedure
We used Gorilla Experiment Builder (www.gorilla.sc) to create and host our online study (Anwyl-Irvine et al., 2018). Participants gave informed consent before being directed to the LBQ. To ensure that participants started the task with common knowledge about what an idiomatic expression is, they were instructed to read a short description of idioms, accompanied by some examples. Moreover, they were asked to rate the expressions based on their idiomatic meaning and not their literal meaning (e.g., “kick the bucket” should be rated based on the meaning “to die”). Participants were randomly assigned to one of four lists, corresponding to one of the four dimensions of ratings. That is, each participant rated the full set of items (all 210 English idioms) on a single measure only (valence, arousal, imageability, or concreteness; see Table 5 for each list n).
The specific instruction for each measure was adapted from previous studies (Altarriba et al., 1999; Citron et al., 2016; Yao et al., 2017). The instructions were modified and accompanied by appropriate English examples to ensure clarity. Note that ten native English speakers read and assessed the instructions for clarity before participants were recruited. Each instruction contained three main parts: a definition of the measure along with some examples, an explanation of the scale on which the participants were supposed to rate each idiom, and an explanation of the labels for each scale. These are briefly presented in Table 3. A screenshot of all instruction pages and one example of the questionnaire page may be found on the Open Science Framework repository at ASN Idioms.
Data analysis
Data pre-processing
Participants’ data were removed from the analyses based on three exclusion criteria. First, if a participant responded “I do not know the meaning of this idiom” for over 50% of expressions, we did not include their data in the analysis (n = 7). Second, when a participant attributed the same rating to over 85% of idioms or when the responses followed any noticeable pattern (e.g., the first 20 responses were the same and the next 25 responses were the same, etc.), we excluded them from the data analysis (n = 3). Finally, if the participant’s responses on the LBQ did not allow us to confidently group their data into the L1 or the L2 group, we excluded their data from analysis (n = 5). Based on these criteria, 15 participants (2.7%) were excluded from the data analysis.
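As an illustration only, the first two exclusion criteria could be screened automatically along the following lines. This is a minimal sketch and not the procedure used in the study: the function name `should_exclude`, the use of `None` to code the response “I do not know the meaning of this idiom,” and the default cutoffs passed as parameters are assumptions made for the example. Detecting other noticeable response patterns and applying the LBQ-based criterion would still require manual inspection.

```python
from collections import Counter

# Hypothetical coding: None stands for "I do not know the meaning of this idiom".
UNKNOWN = None


def should_exclude(responses, unknown_cutoff=0.50, same_rating_cutoff=0.85):
    """Return a reason for exclusion, or None if the participant is kept.

    responses: one entry per idiom, either a numeric rating or UNKNOWN.
    """
    n_items = len(responses)
    if n_items == 0:
        return None

    # Criterion 1: "unknown" chosen for more than half of the idioms.
    n_unknown = sum(1 for r in responses if r is UNKNOWN)
    if n_unknown / n_items > unknown_cutoff:
        return "too many unknown idioms"

    # Criterion 2 (first half only): one rating given to over 85% of idioms.
    ratings = [r for r in responses if r is not UNKNOWN]
    if ratings:
        most_common_count = Counter(ratings).most_common(1)[0][1]
        if most_common_count / n_items > same_rating_cutoff:
            return "same rating for most idioms"

    return None
```

Both proportions are computed over all idioms presented, which is one possible reading of the criteria; a different denominator (e.g., known idioms only) would be an equally defensible choice.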
Data analysis
The purpose of the current study was (i) to explore the relation between affective variables; (ii) to explore the relation among non-affective variables; and (iii) to examine the relation between affective and sensory–motor variables with the psycholinguistic variables obtained in previous studies (Libben & Titone, 2008). We calculated Pearson partial correlations to explore these relations. Additionally, since previous studies consistently reported a quadratic relation between various affective measures (Ferré et al., 2012; Warriner et al., 2013; Citron et al., 2014; Montefinese et al., 2014), we conducted a quadratic regression predicting (i) arousal from valence and (ii) familiarity from valence.
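For readers less familiar with partial correlations, the sketch below shows the first-order case, in which the correlation between two rating dimensions is computed after removing the linear influence of a single control variable. The helper names `pearson_r` and `partial_r` are hypothetical, and the analyses reported here may have controlled for several variables at once, which this sketch does not cover.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_r(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    r_xy = pearson_r(x, y)
    r_xz = pearson_r(x, z)
    r_yz = pearson_r(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
```

In a toy case where two variables are related only through a shared third variable, `pearson_r` returns a clearly positive value while `partial_r` is approximately zero, which is the kind of spurious association that partialling out is meant to remove.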
The last aim of the current study was to compare the ratings from L1 and L2 speakers. We conducted t-tests to statistically compare these groups. Note that, for both correlation and regression analyses, we compared pre-existing psycholinguistic dimensions with our novel affective and concreteness ratings for our L1 speakers only; this is because the previous literature has not collected the psycholinguistic variable ratings from second-language informants.
Results and discussion
Descriptive statistics
The descriptive statistics for the ratings of L1 and L2 speakers are presented in Table 4. The final column (“Valid response %”) indicates the proportion of obtained ratings for each variable; when participants indicate that an idiom is unknown and thus do not attribute it a rating, this response is not counted toward the “Valid response %.” The raw idiom ratings can be found on the Open Science Framework repository at ASN Idioms.
To inspect whether the idioms’ ratings are correlated with the ratings of their constituents (i.e., the noun and verb of each idiom; for example, “beat” and “breast” in the idiom “he beat his breast”), we calculated partial correlations between the idiom ratings and the ratings of each constituent. The constituent ratings (valence, arousal, concreteness, and imageability) were extracted from available databases (Warriner et al., 2013; Brysbaert et al., 2014). We did not find any significant correlation between the idiom ratings and the respective ratings of the constituents.
Reliability measures
To assess the reliability of the ratings of the four variables that were included in the database, we calculated the intraclass correlation via Cronbach’s alpha (which is based on internal consistency) for each group (L1 and L2). The literature suggests that this analysis represents a more reliable measure than the split-half procedure (Citron et al., 2016). Individual participants were entered as variables, with idioms representing cases. The analysis showed high reliability for all variables for the L1 group. Generally, the reliability values for the L2 group were lower; however, they were all above an acceptable level (α > 0.7). The reliability measures for all variables and for the L1 and L2 groups are reported in Table 5.
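A minimal sketch of this kind of computation is given below, using the standard formula for Cronbach’s alpha with k raters: alpha = k/(k − 1) × (1 − sum of per-rater variances / variance of the summed ratings). It assumes complete data, that is, every participant rated every idiom; the function name and data layout are illustrative assumptions, and statistical packages treat missing responses in ways this sketch does not attempt to reproduce.

```python
from statistics import variance


def cronbach_alpha(ratings_by_rater):
    """Cronbach's alpha with raters as variables and idioms as cases.

    ratings_by_rater: one list per participant, each holding that
    participant's ratings of the same idioms in the same order.
    """
    k = len(ratings_by_rater)
    # Variance of each rater's ratings across idioms.
    rater_variances = [variance(r) for r in ratings_by_rater]
    # Summed rating per idiom across all raters.
    totals = [sum(column) for column in zip(*ratings_by_rater)]
    return (k / (k - 1)) * (1 - sum(rater_variances) / variance(totals))
```

When all raters agree perfectly, the function returns a value of 1 (up to floating-point error); disagreement between raters lowers the value.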
Relations between affective variables
Of the 210 idioms, 103 were rated as positive (valence above zero), 103 as negative (valence below zero), and 4 as neutral (valence equal to zero) by the L1 group. A small number of neutral idioms has repeatedly been reported in previous studies for various languages (3 out of 619 in the case of German idioms, and 11 out of 1252 in the case of Spanish idioms; Citron et al., 2016; Gavilán et al., 2021). The L2 group rated 122 idioms as positive and 88 idioms as negative. The numbers of positive and negative idioms are at odds with previous studies in which most idioms were rated as negative (Citron et al., 2016). A t-test revealed that L1 speakers rated negative idioms as significantly more arousing than positive idioms (t(197) = −2.76, p < 0.01). However, L2 speakers rated negative and positive idioms as similarly arousing (t(156) = 0.43, p = 0.43).
Valence and arousal are plotted against each other in Fig. 1. A visual inspection of these plots suggests that the relation between these variables (for both the L1 and L2 groups) may be quadratic. To verify this impression, we conducted regression analyses wherein mean arousal was the dependent variable. Two regression models were compared. In Model 1, all psycholinguistic and sensory–motor variables were included as predictors (L1 = imageability, concreteness, familiarity, literal plausibility, decomposability, L2 = imageability, concreteness). In Model 2, valence and valence squared were also included as predictors.
For the L1 data, Model 1 accounted for 16% of the variance (R² = 0.16), F(5, 204) = 7.95, p < 0.001. Model 2 accounted for an additional 40% of the variance (R² = 0.56), F(7, 202) = 37.14, p < 0.001, with both valence and valence squared as significant predictors.
For the L2 data, Model 1 accounted for 3% of the variance (R² = 0.03), F(2, 207) = 3.81, p = 0.02. Model 2 accounted for an additional 23% of the variance (R² = 0.26), F(4, 205) = 18.17, p < 0.001, with only valence squared as a significant predictor. This result shows that the typical U-shaped relation between valence and arousal holds for multiword expressions and can be found regardless of whether the language users tested are L1 or L2 speakers.
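The logic of the two-model comparison described above can be sketched as follows. This is an illustration under stated assumptions rather than the analysis code used for the study: the helpers `solve` and `r_squared` are hypothetical, the data are invented toy values in which arousal is exactly U-shaped in valence, and a single control predictor stands in for the full set of covariates in Model 1. The sketch reports only R² and its change, not the significance tests.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def r_squared(predictors, y):
    """R^2 of an ordinary least-squares fit with an intercept.

    predictors: list of columns, one list of values per predictor.
    """
    rows = [[1.0] + [col[i] for col in predictors] for i in range(len(y))]
    p = len(rows[0])
    # Normal equations: (X'X) beta = X'y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Toy data (hypothetical, not the study's ratings).
valence = [-3, -2, -1, 0, 1, 2, 3]
imageability = [4, 2, 5, 3, 6, 1, 4]
arousal = [1 + 0.5 * v * v for v in valence]
valence_sq = [v * v for v in valence]

r2_model1 = r_squared([imageability], arousal)                       # controls only
r2_model2 = r_squared([imageability, valence, valence_sq], arousal)  # plus valence terms
r2_change = r2_model2 - r2_model1
```

In these toy data a purely linear valence term explains essentially none of the variance in arousal, whereas adding the squared term captures the U-shape, which mirrors the pattern of a significant valence-squared predictor.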
Relation between affective and nonaffective variables
Table 6 presents the results of linear partial correlations between affective variables (arousal and valence) and nonaffective (sensory–motor and psycholinguistic) variables for the L1 group, as well as the relation between affective and sensory–motor variables for the L2 group.
The numbers in the columns represent Pearson’s r values; p values are expressed as specified.
*p < 0.05; **p < 0.01; ***p < 0.001.
There was a significant positive correlation between arousal and imageability (for both L1 and L2 groups), indicating that participants found it easier to evoke a mental image for more arousing idioms. Generally, findings on the relation between arousal and imageability for single words are mixed; both negative and positive effects are reported in the literature (Citron et al., 2016), and most studies call for further investigation of this relation.
For the L1 group, familiarity was significantly positively correlated with both arousal and valence. This indicates that idioms which are encountered more frequently were rated as being more arousing and more positive. Positive correlations between arousal and familiarity have indeed previously been reported for both single words and idioms (Montefinese et al., 2014; Citron et al., 2016). One study that considered the relation between affective variables and familiarity for Spanish idioms (Gavilán et al., 2021) did not find a relation between valence and familiarity. These authors assumed that the lack of this relation could be related to the larger size of their database compared to the smaller database in Citron et al.’s (2016) study.
Unlike one previous study which has explored the relation between decomposability and affective variables for German idioms (Citron et al., 2016; see Footnote 2), we did not find any relation between decomposability or literal plausibility and either of the affective variables for the L1 group. These findings suggest that the emotional perception of an idiom is not impacted by the degree to which it possesses a literal interpretation. Similarly, the extent to which individual constituents contribute to the idiomatic meaning does not impact the perceived emotional or sensory–motor status of the idiomatic expressions.
Inspecting the quadratic relation between valence and familiarity
Since valence is a bipolar dimension (i.e., it ranges from positive to negative), and given that we found a significant correlation between valence and familiarity in our previous analysis, we explored whether the relation between these variables is best explained by a linear or a quadratic function. Two regression models were compared wherein familiarity was the dependent variable. In Model 1, all psycholinguistic variables, sensory–motor variables, and arousal were included as predictors. In Model 2, valence and valence squared were also included as predictors. Model 1 predicted 32% of the variance (R² = 0.32), F(5, 204) = 19.91, p < 0.001. Model 2 accounted for an additional 3% of the variance (R² = 0.35), F(7, 202) = 15.87, p < 0.001, with valence only (not valence squared) as an additional significant predictor. Therefore, a linear function best describes the relation between familiarity and valence: Estimated familiarity = 0.90 × valence + 1.20 × decomposability + 0.47 × arousal + 0.21 × imageability + 0.11. Studies on single words show the same linear relation between valence and familiarity (e.g., Citron et al., 2014). However, research also indicates that a quadratic function best explains the relation between valence and familiarity when it comes to idioms (Citron et al., 2016).
Relations between non-affective variables
The linear correlations between nonaffective variables are reported in Table 7. A significant positive correlation between concreteness and imageability was observed for both L1 and L2 groups. For the L1 group, imageability showed a significant positive correlation with familiarity. These findings suggest that it is easier to form a mental image of an idiom as it becomes more concrete and familiar. However, familiarity was not correlated with concreteness. This suggests that, unlike single words (Yao et al., 2017), idioms which are encountered more frequently are not necessarily more concrete.
Img: Imageability; Con.: Concreteness; Fam.: Familiarity; LP: Literal plausibility; Decomp.: Decomposability. The numbers in the columns represent Pearson’s r values; p values are expressed as specified:
*p < 0.05; **p < 0.01; ***p < 0.001.
Comparing L1 and L2 ratings
The ratings produced by L1 and L2 speakers were compared using two-tailed t-tests. L1 speakers rated idioms as generally more arousing (t(316.51) = −7.71, p < 0.001) and less concrete (t(415.63) = 9.54, p < 0.001) compared to L2 speakers. There appeared to be more variance among L2 arousal ratings (also see Table 4) than among L1 ratings. The mean valence and imageability ratings did not differ between the L1 and L2 groups; nevertheless, a visual inspection of Fig. 2 suggests that L1 imageability ratings have less variance than L2 ratings.
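The fractional degrees of freedom reported above are what one would expect from a t-test that does not assume equal variances (Welch’s test), although the text does not name the variant used, so this is an inference. Under that assumption, the statistic and its Welch–Satterthwaite degrees of freedom could be computed as in the following sketch; the function name `welch_t` is hypothetical and the p value is not computed here.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom.

    a, b: two independent samples (e.g., mean idiom ratings per group).
    """
    # Squared standard error contributed by each sample.
    se2_a = variance(a) / len(a)
    se2_b = variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(a) - 1) + se2_b ** 2 / (len(b) - 1)
    )
    return t, df
```

When the two samples have equal size and equal variance, the degrees of freedom reduce to the familiar n₁ + n₂ − 2; unequal variances, such as the larger spread in L2 arousal ratings, pull the value below that.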
Finally, for each of the two measures on which L1 and L2 ratings differed (arousal and concreteness), we fit a separate linear mixed-effects multiple regression model with the difference between L1 and L2 ratings as the dependent variable and Age of Acquisition (AoA) of the L2, proficiency in the L2, and length of exposure to the L2 environment (Footnote 3) as independent variables. However, we did not find any significant effect of these factors on the difference between L1 and L2 ratings.
General discussion
Considering the growing interest in understanding the relation between language and emotion, there is growing demand for reliable experimental material. The current study sought to establish affective and sensory–motor norms (valence, arousal, imageability, and concreteness) for 210 English idioms by both native speakers and highly proficient second-language speakers. We aimed to describe the association between the collected norms (i.e., affective and sensory–motor variables) as well as to delineate the relationship between our novel data and previously collected psycholinguistic norms. Given the expanding literature on second-language acquisition, processing, and production, we collected data from L2 speakers and compared them to the idiom ratings of L1 participants. In the following paragraphs, we summarize the main findings of the current study.
Similar to the findings for single words and idioms (Kuppens et al., 2017; Gavilán et al., 2021), we observed a quadratic relation between valence and arousal. This quadratic relation shows that idioms that are highly valenced (either positive or negative) are perceived as highly arousing. This quadratic relation holds for both L1 and L2 groups (despite differences in variance), supporting the universality of this association.
For the L1 group, the relation between valence and arousal was asymmetrical (i.e., negative items had higher mean arousal than positive items). Similar results were observed by Citron and colleagues (2016), who postulated that this asymmetry was the result of the disproportionate number of negative idioms in their stimulus set. However, since the current study used a balanced set of items, it appears that negative idioms are indeed considered to be more arousing. In contrast, for the L2 group, the mean arousal ratings for negative and positive items were not significantly different. However, considering that the number of negative items (items with valence below zero) was much lower than the number of positive items in the L2 group (negative: 88; positive: 122), the similar arousal for negative and positive items found in the L2 data is consistent with negative items being considered more arousing.
Why might L1 speakers be more likely to rate certain idioms as negative? While we do not have a definite explanation for this difference, we postulate that such differences between L1 and L2 speakers may be related to a number of factors, including the length of immersion in the L2 environment. For instance, Imbault and colleagues (2021) examined whether proficiency and length of immersion in English affected the valence and arousal ratings of English words by L2 speakers of English. These authors found that more proficient L2 speakers and those who had lived in Canada (the L2 environment) for longer periods displayed emotional responses more similar to those of L1 speakers. In our study, L2 participants varied widely in how long they had lived in the L2 environment. These factors merit further investigation in future research.
Studies on single words show that the differences between L1 and L2 emotional ratings depend on the category of words. For example, Garrido and Prada (2021) showed that only taboo words evoked higher emotional ratings from L1 speakers, while the ratings of negative and positive words were not significantly different between L1 speakers and Portuguese–English bilinguals. Further studies may examine whether the sensitivity to taboo stimuli can be extended to idiomatic expressions. Additionally, previous studies show that the AoA of the L2 (Rodriguez-Cuadrado et al., 2022) and the level of proficiency in the L2 (Imbault et al., 2021) are predictors of the differences between L1 and L2 ratings. In our study, we conducted analyses to find out whether the differences between L1 and L2 ratings can be predicted by AoA, proficiency, and length of exposure to the L2 environment. We did not find an impact of these factors on the difference between the L1 and L2 ratings. The lack of such an impact could be related to our L2 group being vastly heterogeneous (having over 20 different L1s, various AoAs, etc.). Considering these factors in future studies is therefore another potentially interesting avenue of research.
Note that only four items possessed neutral valence (valence = 0) in the present study. Similarly, Citron and colleagues (2016) reported that a mere three out of 619 German idioms carried neutral valence. These results support the idea that idioms are generally emotion laden. This is particularly important for researchers who are interested in the affective processing of language. When it comes to idioms, researchers may have difficulty in compiling enough neutral items for their studies and may thus need to consider the emotional content of their stimuli. Aside from the differences in the number of negative and positive items for the two groups, we found no significant difference between valence ratings for the L1 and L2 groups.
Unlike valence, the L1 and L2 groups attributed distinct arousal ratings to the idioms, with L1 speakers generally assigning higher arousal ratings. This result corroborates the findings on single words, which show greater crosslanguage variability for arousal and more generalized ratings for valence across languages (Redondo et al., 2007; Eilola & Havelka, 2010; Montefinese et al., 2014). Indeed, while the dimension of valence is perceived similarly across cultures (Russell, 1991), the variability in arousal ratings can be related to crosscultural diversity (Montefinese et al., 2014). In the current study, L2 speakers originated from a wide range of language backgrounds (French, Persian, Arabic, Chinese, Vietnamese, Russian, etc.) and cultures and tended to assign more variable arousal ratings (SD = 1.05) compared to the more uniform L1 group (SD = 0.55). These findings suggest that second-language researchers may use L1 valence ratings if they do not have access to L2 ratings, but that they should be cautious about using L1 arousal ratings for L2 speakers, as the two groups differ in how arousing they find idioms.
An additional objective of the current study was to explore the relation between affective and nonaffective idiom ratings. We observed positive correlations between arousal and imageability for both the L1 and L2 groups. Furthermore, the relation between arousal and concreteness was negative in both groups, appearing as a trend for L1 speakers and as a significant correlation for L2 speakers. Since prior studies show that the relation between arousal and sensory–motor variables may be nonlinear, we examined whether a quadratic function best describes their association. Our results only marginally supported a quadratic relation between these variables. Given that the idioms in our study were generally attributed low-to-moderate arousal ratings (minimum = 1.40; maximum = 4.28), it is possible that this restricted range may have obscured the relationship between arousal and concreteness. Future studies should consider using a larger set of idioms (including idioms eliciting higher arousal responses) in order to capture the full picture.
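As an illustration of the kind of comparison described above, the following minimal sketch fits a linear and a quadratic least-squares model to the same data and compares them. It is not the analysis pipeline used in this study: the function names (solve_linear_system, fit_polynomial, compare_linear_quadratic) are hypothetical, the use of AIC as the comparison criterion is an assumption, and the concreteness and arousal values are simulated rather than taken from our norms.

```python
import math
import random


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so that row operations apply to both sides at once.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_polynomial(x, y, degree):
    """Ordinary least-squares polynomial fit; returns (coefficients, R^2, AIC)."""
    k = degree + 1  # number of estimated coefficients
    # Normal equations (X^T X) beta = X^T y, with design columns 1, x, x^2, ...
    xtx = [[sum(xi ** (i + j) for xi in x) for j in range(k)] for i in range(k)]
    xty = [sum((xi ** i) * yi for xi, yi in zip(x, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * xi ** p for p, b in enumerate(beta)) for xi in x]
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    mean_y = sum(y) / len(y)
    tss = sum((yi - mean_y) ** 2 for yi in y)
    n = len(x)
    aic = n * math.log(rss / n) + 2 * k  # Gaussian-error AIC, up to a constant
    return beta, 1.0 - rss / tss, aic


def compare_linear_quadratic(x, y):
    """Fit both models and report which one has the lower AIC."""
    _, r2_lin, aic_lin = fit_polynomial(x, y, 1)
    beta_quad, r2_quad, aic_quad = fit_polynomial(x, y, 2)
    return {
        "r2_linear": r2_lin,
        "r2_quadratic": r2_quad,
        "quadratic_coefficient": beta_quad[2],
        "preferred": "quadratic" if aic_quad < aic_lin else "linear",
    }


# Simulated (hypothetical) ratings with a built-in U-shaped relation.
rng = random.Random(0)
concreteness = [rng.uniform(1, 5) for _ in range(200)]
arousal = [2 + 0.4 * (c - 3) ** 2 + rng.gauss(0, 0.2) for c in concreteness]

result = compare_linear_quadratic(concreteness, arousal)
print(result["preferred"])
```

Because the linear model is nested within the quadratic one, the quadratic fit can never have a lower R² on the same data; a penalized criterion such as AIC is therefore used here to decide whether the added curvature term is justified.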
Next, positive correlations between the affective factors and familiarity were observed: as participants rated idioms as more familiar, they tended to also consider them to be more arousing and more positive. This positive relation was previously found for German idioms (Citron et al., 2016). Though these authors found that a quadratic function best represented the relation between valence and familiarity, a linear function was a better fit for our own data. It is worth mentioning that in Citron and colleagues (2016), negative idioms made up more than two-thirds of the items. When valence is more balanced, as in the current study, it appears that a linear relation is favored.
Finally, similar to previous studies (Schwanenflugel & Stowe, 1989; Paivio, 1991), imageability and concreteness yielded a strong positive correlation. The robust association between these variables (as well as their relation to other variables, like arousal) suggests that researchers should take special care while manipulating these factors; there is a danger of collinearity within certain types of analyses. Though imageability and concreteness are strongly related, they should still not be used interchangeably, since they represent two different constructs.
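To make the collinearity concern concrete, the sketch below computes the Pearson correlation between two predictors and the corresponding variance inflation factor (VIF). It assumes a model with exactly two predictors, in which case the VIF of either predictor reduces to 1 / (1 − r²). The function names (pearson_r, vif_two_predictors) are hypothetical and the ratings are simulated; none of the values come from our norms.

```python
import math
import random


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor model: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Simulated (hypothetical) ratings in which imageability tracks concreteness.
rng = random.Random(1)
concreteness = [rng.uniform(1, 5) for _ in range(200)]
imageability = [c + rng.gauss(0, 0.5) for c in concreteness]

r = pearson_r(concreteness, imageability)
vif = vif_two_predictors(concreteness, imageability)
print(round(r, 2), round(vif, 2))
```

A VIF close to 1 indicates little shared variance between the predictors, whereas larger values signal that coefficient estimates for the two variables will be unstable when both are entered into the same regression model.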
Conclusion, limitations, and future directions
In sum, the present descriptive study compiles highly reliable affective and sensory–motor norms for a set of English idioms and describes the relationships among the collected measures, thereby enabling researchers to make systematic decisions about their stimuli in future research. To our knowledge, this is the first database consolidating affective and sensory–motor norms for English idioms. One limitation of the current study is that most participants were young female university students. Considering that previous studies show that both gender (Fischer, 2000) and age (Fairfield et al., 2017) might influence participants’ ratings (especially affective ratings), future studies should seek to account for these potential factors.
Replication package
All supplementary materials are available to the public via the following link: https://osf.io/g5u8h/.
Availability of data and material
The data are available at ASN Idioms.
Funding
This study was partially funded by an Ontario Graduate Scholarship (OGS) to the first author.
Competing interests
The authors have no conflicts of interest to declare.
Consent to participate
All participants provided their written informed consent before completing the survey. The data were collected and analyzed anonymously.
Code availability
Not applicable.
Ethics approval
This study was reviewed and approved by the University of Ottawa’s Social Sciences and Humanities Research Ethics Board. The file number is “S-12-19-5098.”