1. Introduction
Imagine you are walking around your neighbourhood when you find a wallet full of cash and credit cards lying on the ground. From the credit cards and other valuable objects in the wallet, you are convinced that the owner comes from an upper social class. You are on a low income that barely covers your basic needs, and you start wondering whether stealing money (or other objects) is morally justified under specific circumstances. Moral cognition comprises all mental – both cognitive and emotional – processes involved in understanding, reflecting on and making judgements and choices about ethical issues and dilemmas (Baez et al., 2017). Moral cognition within a deontological framework centres on moral rights and principles to distinguish right behaviour from wrong, since some actions can never be considered ethically acceptable even if the reasons underlying them are good (e.g., stealing money to help your family members who are starving). At the other extreme, moral cognition within a utilitarian framework emphasises the outcomes of people’s actions and their impact on individual and community well-being rather than inherent qualities of people and rigid moral principles (e.g., stealing money might be morally right if it produces more benefit than harm) (see Gawronski & Beer, 2017, and Hennig & Hütter, 2020, for reviews).
For decades, psychological research has sought to identify individual-difference variables that influence moral reasoning and decision-making in adults, such as personality traits (Bartels & Pizarro, 2011; Duriez & Soenens, 2006; Lifton, 1985), cognitive abilities (Street et al., 2001; Tinghög et al., 2016) and religious beliefs (Duriez & Soenens, 2006; Shariff, 2015). For example, evidence suggests that males are more likely to choose the utilitarian option when facing emotionally charged moral dilemmas (Arutyunova et al., 2016; Gao & Tang, 2013) [Footnote 1], and that strong religious beliefs (e.g., “I am sure that Christ exists”) are generally associated with deontological responses (Christensen et al., 2014; Szekely et al., 2015).
However, international migration – which reached historical peaks in 2020, outpacing the world population growth rate (Batalova, 2022; Natarajan et al., 2022) – has urged researchers to turn their attention to linguistic factors that may impact moral judgements, such as the language in which a dilemmatic situation is presented and proficiency level in that language, as well as the linguistic proximity (i.e., the degree of resemblance [Footnote 2]) between the first language (L1) and the second language (LX) [Footnote 3] (Białek et al., 2019; Brouwer, 2019, 2021; Costa et al., 2014; Driver, 2022; Dylman & Champoux-Larsson, 2020; Geipel et al., 2015a, 2015b; Hayakawa et al., 2017; Kyriakou et al., 2023; Muda et al., 2020). These studies have found that bilinguals tend to be more deontological in their L1 and more utilitarian in their LX, especially when the LX is linguistically distant from the L1 (see meta-analysis by Circi et al., 2021). This phenomenon is known as the Moral Foreign Language effect (MFLe).
Although the mechanisms that generate the MFLe are still a matter of debate, three main hypotheses have been proposed to explain this phenomenon. The increased deliberation hypothesis suggests that the MFLe occurs because of the additional cognitive effort that processing moral dilemmas in an LX entails. According to this hypothesis, conducting cognitively demanding tasks in an LX increases cognitive load (see, e.g., Plass et al., 2003), prompting LX users to make less automatic and therefore more thoughtful decisions (see Del Maschio et al., 2022, for a review). However, studies that employed a process-dissociation procedure (see Conway & Gawronski, 2013) to measure separately the strength of deontological versus utilitarian processes during moral decision-making found that the moral choice of sacrificing a person causes less aversion in the LX, irrespective of whether this choice improves well-being for the greatest number of people (Hayakawa et al., 2017; Muda et al., 2018). The psychological distance hypothesis postulates that people tend to be more utilitarian in their LX, as opposed to their L1, because they feel more distant from hypothetical situations presented in an LX (see Shin & Kim, 2017). According to the construal level theory (Trope & Liberman, 2010), people are more willing to assess a transgression as being morally acceptable when its perceived believability (i.e., the likelihood of the transgression happening) is weak. In this regard, Körner and Deutsch (2023) indicated that moral dilemmas based on daily situations increase respondents’ engagement because of their realistic nature in comparison with sacrificial-type dilemmas.
Several studies have suggested that the use of an LX may increase psychological distance, prompting people to be more utilitarian and less emotional in their LX (Kyriakou & Mavrou, 2023; Shin & Kim, 2017), thus supporting the psychological distance hypothesis. However, a recent study by Yavuz et al. (2023) failed to demonstrate that the effect of language on moral decision-making is caused by heightened psychological distance. The third hypothesis that has been proposed to explain the MFLe, the reduced emotionality hypothesis, argues that the MFLe is the result of dampened emotional reactions when verbal stimuli (e.g., moral dilemmas) are presented in an LX. This hypothesis has been supported by many studies suggesting that bi-/multilinguals associate their L1 with real-life emotional situations and thus experience reduced affective reactivity to their LX (Dewaele, 2010; Dewaele et al., 2024; Kyriakou et al., 2024; Pavlenko, 2012, 2017).
Although the most widely accepted explanation for the MFLe rests on the assumption that LX stimuli attenuate emotional reactivity and increase emotional distance, previous research has rarely considered how using specific LX emotion words influences moral decision-making. The present study contributes to a better understanding of the relationship between language, emotion and moral decision-making by exploring whether bilinguals’ moral judgements are modulated by the emotiveness of the moral questions accompanying personal and impersonal moral dilemmas. Drawing on the reduced emotionality hypothesis (Dewaele, 2010; Pavlenko, 2012), this study examined the extent to which language (L1 versus LX), the type of moral dilemma (personal/unrealistic versus impersonal/realistic) and the emotiveness of the moral questions accompanying moral dilemmas (emotive versus neutral moral questions) influenced bilinguals’ moral judgements. The emotiveness of the moral questions was manipulated by carefully choosing the emotion words included in these questions. Emotion words are usually characterised in terms of two dimensions, namely valence (a continuum from positive to negative emotions) and arousal (a continuum from calm to excited) (Bradley & Lang, 1999; Compton et al., 2003; Russell, 1980; Stadthagen-Gonzalez et al., 2017). Therefore, we compared bilinguals’ responses to emotive moral questions (i.e., questions of high arousal and negative valence) versus non-emotive moral questions (i.e., questions of low arousal and neutral valence) in both their L1 and LX.
1.1. Emotional reactivity in moral decision-making
Dual-process models of moral decision-making suggest that moral reasoning is the product of two distinct psychological processes: (1) an emotional – fast, intuitive and effortless – process that prompts individuals to act on the basis of universally accepted moral rules and principles regardless of the outcome (deontological responses); and (2) a cognitive – slow, conscious and reflective – process that leads individuals to consider whether an action is morally right or wrong by focusing exclusively on the outcome or the consequences, such as maximising the greatest good for the greatest number of people (utilitarian decisions) (Greene, 2007; Greene et al., 2001; Kahneman, 2003). To develop this model, Greene et al. (2001) presented participants with personal and impersonal moral dilemmas and measured their brain activity via functional magnetic resonance imaging (fMRI). Personal dilemmas imply causing bodily harm to an innocent person or group of people to achieve better outcomes for the greatest number of individuals. In other words, agents in personal dilemmas are forced to injure someone in a direct way to save more lives. For example, in the footbridge dilemma (Thomson, 1985), individuals must choose whether or not to push a man off a bridge onto the train tracks, thus sacrificing him in order to save five people. By contrast, dilemmas that do not comply with the above requirements are considered impersonal dilemmas (e.g., the switch dilemma [Footnote 4]; Foot, 1978). Greene et al. (2001) concluded that emotions dominate over reason when people respond to personal moral dilemmas.
Specifically, participants’ utilitarian judgements were associated with longer reaction times than were their deontological judgements, but this pattern was only observed in the personal dilemmas. Moreover, fMRI results revealed increased activation of brain areas involved in emotional processing when reading personal (but not impersonal) moral dilemmas in the L1. Similar findings were reported in subsequent studies (Ciaramelli et al., 2007; Moore et al., 2011; Suter & Hertwig, 2011). For example, Ciaramelli et al. (2007) found that, contrary to healthy individuals, patients with lesions in the ventromedial prefrontal cortex (a brain area associated with emotional processing) showed a preference for the utilitarian choice in personal moral dilemmas and made this choice faster, but similar results were not observed when they were asked to respond to impersonal moral dilemmas.
Over the last decade, several studies have focused on language as a potential factor influencing bilinguals’ moral judgements. Costa et al. (2014) used both the footbridge and the switch dilemmas to explore whether LX learners change their moral decision depending on the language they use (L1 versus LX). They recruited participants from different ethnic and linguistic backgrounds and asked them to make a moral choice after reading the above dilemmas either in their L1 or in their LX. The results revealed that the participants were more willing to choose the utilitarian option after reading the personal (footbridge) dilemma in their LX than in their L1, whereas no differences were observed between L1 and LX choices in the impersonal (switch) dilemma. Based on the dual-process model of moral decision-making, Costa et al. (2014) argued that the use of an LX reduces emotional reactivity, leading people to make more utilitarian judgements in their LX. These findings were later replicated among bilinguals with different L1–LX combinations (Brouwer, 2019, 2021; Dylman & Champoux-Larsson, 2020; Hayakawa et al., 2017).
However, other studies yielded contradictory findings. For example, Geipel et al.’s (2015b, Study 2) results challenged the reduced emotionality hypothesis, as the presence of the MFLe in the footbridge dilemma was not mediated by emotional blunting in the LX. Furthermore, the MFLe was absent in the personal and highly emotional crying baby dilemma, in which individuals must decide whether or not to smother their child in order to save themselves and other people, but it was observed in the less emotional and more realistic lost wallet dilemma, in which one must decide whether it is morally permissible to keep the money found in a lost wallet: participants were more willing to steal money when they read the dilemma in their LX (Study 3). Likewise, Driver (2022) did not find an association between language, emotion and moral decision-making among English–Spanish and Spanish–English bilinguals. Although Driver’s study provided further evidence for the MFLe in personal moral dilemmas, the level of emotional intensity that the participants reported after reading a personal (the footbridge) and an impersonal (the switch) dilemma did not vary significantly across languages (L1 and LX).
Kyriakou et al. (2023) went a step further by analysing the types of moral arguments underlying Spanish–English bilinguals’ moral judgements after reading the footbridge dilemma, as well as the number of emotional (high-arousal) words used in their responses. Their participants provided more deontological and emotional arguments in their L1. Many of them stated that they did not have the right to decide who lives and who dies, for whatever reason and under whatever circumstances, and that they were not capable of sacrificing or harming an innocent person. By contrast, the most prevalent argument in the LX was that one death is better than five and that the end justifies the means. Moreover, the number of high-arousal words used in bilinguals’ arguments was significantly greater in L1 than in LX and mediated the effect of language on moral judgements.
In a subsequent study that used both sacrificial (unrealistic) and realistic moral dilemmas, Kyriakou and Mavrou (2023) examined whether emotion mediated bilinguals’ moral judgements, as well as the specific emotions these bilinguals experienced during or after making their decision in L1 and LX. Their findings suggested that the MFLe is influenced by two main factors: the type of dilemma and the cultural impact of the LX in the bilinguals’ country of origin. Whereas Spanish–English bilinguals tended to make more utilitarian decisions in response to unrealistic moral dilemmas in their LX, Greek Cypriot–English bilinguals provided very similar moral judgements in their L1 (Greek) and LX (English). According to the authors, this could be attributed to the omnipresence of the English language in the daily life of Greek Cypriot people. Furthermore, the MFLe was observed only in the realistic moral dilemmas in which the agents could easily put themselves in the protagonist’s shoes, and both moral choices (i.e., deontological and selfish) were emotionally charged. The study also revealed that the participants experienced a wide range of moral emotions (such as fear, sadness and guilt) regardless of their country of origin, the language in which they read the dilemmas and the type of these dilemmas, and these emotions did not appear to influence their moral judgements. These results were interpreted in terms of a greater psychological distance in LX leading to lower engagement in moral decision-making.
Taken together, the available evidence on the role of emotion in moral decision-making is far from conclusive. The current study aimed to expand on this line of research by investigating whether any differences in the emotiveness of moral questions influence bilinguals’ moral judgements in their L1 and LX. The next section briefly describes some studies that examined whether people’s moral judgements vary depending on how specific sentences or moral questions are phrased.
1.2. Wording effects in moral decision-making
In order to determine whether the use of different words has an impact on the way people make moral decisions, O’Hara et al. (2010) instructed their participants to read 15 moral vignettes that presented hypothetical moral transgressions in English L1 and to judge 15 statements that expressed disapprobation towards each one of those moral transgressions. Each statement contained one of the following adjectives: wrong, inappropriate, forbidden or blameworthy (e.g., turning the train was wrong, turning the train was forbidden). For some participants, the four adjectives were preceded by the term morally (e.g., turning the train was morally forbidden). The moral vignettes were divided into six categories: (1) three vignettes were variations of the footbridge dilemma; (2) another three vignettes referred to victimless offences (e.g., brother–sister incest [Footnote 5]; Haidt, 2001); (3) two dilemmas described a private and a public taboo transgression, respectively; (4) two dilemmas involved deceiving; (5) another three dilemmas involved moral luck; and (6) finally, two dilemmas elicited disgust. The authors did not find evidence of wording effects for the dilemmas involving private transgressions, public taboo, deceit and moral luck, but a small effect size was revealed for victimless offences, disgust and the three variations of the footbridge dilemma, i.e., participants tended to judge these situations as morally less permissible when the moral questions included the labels wrong or inappropriate than when they included the labels forbidden and blameworthy. Furthermore, moral transgressions leading to feelings of disgust were judged as more permissible when the term morally preceded the four adjectives.
Likewise, Barbosa and Jiménez-Leal (2017) presented their participants with three versions of five sacrificial moral dilemmas (i.e., dilemmas in which the agent must cause harm to a person in order to save more lives) in English L1 and asked them to evaluate, on a 7-point Likert scale, the moral permissibility of the utilitarian action, which was phrased using six different labels: wrong, blame, impermissible, unacceptable, should (whether the agent should choose the utilitarian option) and best action (whether the best decision is the utilitarian one). The three versions of each dilemma were identical, except for the word that described the moral and legal status of the utilitarian action: (1) in the innocent version the utilitarian action was legal and therefore did not have legal implications for the agents; (2) in the guilty version the utilitarian action was illegal, leading to a four-year prison sentence; (3) in the control version the legal outcomes of the action were not specified. According to the results, the use of terms that expressed a purely utilitarian approach (i.e., should and best action) prompted participants to adopt a stronger position against utilitarianism. Similarly, participants tended to advocate more deontological choices when the utilitarian actions entailed judicial consequences (i.e., the guilty version).
To our knowledge, the only study that addressed how moral questions influence bilinguals’ moral judgements in L1 and LX was carried out by Corey et al. (2017). Their participants were Spanish L1–English LX bilinguals and were presented with the footbridge and the switch dilemmas, but the authors modified the moral question accompanying the two moral dilemmas. Specifically, the question “Would you let five people die?” replaced the questions “Would you push the man?” (in the footbridge dilemma) and “Would you change the track?” (in the switch dilemma). The results indicated that the participants were less willing to perform a moral transgression in their L1 than in their LX so as to save a greater number of people, even when the moral question did emphasise the negative outcomes of the deontological choice (i.e., letting five people die). The current study focuses on the level of emotionality of the verbs included in moral questions and how this emotionality influences bilinguals’ moral judgements. In the next section, we summarise previous evidence on emotion word processing in different languages.
1.3. Emotion word processing across languages
A considerable body of behavioural and electrophysiological studies on emotion word processing in different L1 domains indicates that emotional stimuli are processed differently than neutral stimuli (Altarriba & Bauer, 2004; Jay et al., 2008; Kousta et al., 2009; Schacht & Sommer, 2009; Sereno et al., 2015; Zhang et al., 2014). These studies found that words with positive and negative valence tend to be processed more quickly than neutral ones. As Altarriba and Bauer (2004) argued, the processing advantage for emotional over neutral words can be attributed to differences in imageability, concreteness and context availability of these two word categories.
Although this processing advantage has been well established in different L1 domains, the difference between emotion and neutral word processing in LX domains is still unclear. Previous research suggests that LX word processing is less automatic, more effortful and slower than L1 word processing (Segalowitz & Hulstijn, 2005), and that LX emotional words cannot automatically activate affective connotations which are almost always acquired during the early years of life when children are immersed in daily sensory experiences and social interactions in diverse naturalistic contexts (Altarriba, 2003; Pavlenko, 2012). However, the findings are rather contradictory. For example, whereas a growing number of studies (Hahne, 2001; Harris et al., 2003; Lizarazo Pereira et al., 2023; Tang & Ding, 2023) point to the lower emotional intensity when processing LX emotion words as compared to L1 emotion words, other studies revealed that emotion word processing is quite similar across L1 and LX and that bilinguals tend to provide faster and more accurate responses to emotion words as compared to neutral ones, regardless of the language they use (Eilola et al., 2007; Grabovac & Pléh, 2014; Naranowicz et al., 2023; Ponari et al., 2015). Of interest, a comparison of these studies indicates that differences in emotion word processing across L1 and LX are modulated by certain linguistic factors, such as the age of LX acquisition, LX proficiency level and LX exposure (but see Ferré et al., 2010). For example, Harris et al. (2003) measured skin conductance responses of L1 Turkish users who started learning their LX (English) after the age of 12 and found that the use of the LX elicited weaker electrodermal responses to taboo words and childhood reprimands than did the use of the L1. On the other hand, Eilola et al. (2007) presented L1 Finnish speakers, who lived in Finland and acquired their LX (English) after the age of 7, with emotion (positive, negative, taboo) and neutral words in both their L1 and LX and instructed them to indicate the colour in which the words were presented without paying attention to the meaning of these words. Their results revealed that reaction times to negative and taboo words were slower in comparison with neutral words irrespective of the language in which they were written (i.e., L1 versus LX). According to the authors, the absence of language effects on emotion word processing could be due to the advanced LX proficiency level of their participants.
Contrary to the above studies that used single decontextualised words, Iacozza et al. (2017) presented late Spanish–English bilinguals with emotional and neutral sentences either in their L1 or in their LX. The critical words that were included in the emotional and the neutral sentences differed in their valence and arousal. Emotional sentences included high-arousal words (e.g., “at noon the hostile terrorist will bring his toxic bomb to the schizophrenic cannibal”), while neutral sentences included low-arousal words (e.g., “at noon the civil receptionist will bring his toxic bomb to the schizophrenic cannibal”). The authors measured pupil dilation in response to the emotional and the neutral sentences and instructed their participants to indicate on a 7-point Likert scale the emotional resonance of each sentence. Pupillometry responses revealed that the effect of emotion was greater when participants read the emotional sentences in their L1. However, subjective ratings of the emotional resonance of each sentence did not differ significantly between L1 and LX.
Altogether, the above controversies highlight the need to investigate further the reduced emotionality hypothesis in LX. Therefore, the current study sought to explore whether the emotionality of the critical verbs included in the moral questions accompanying moral dilemmas would influence bilinguals’ moral decisions in their L1 and LX.
1.4. The current study
This study aimed to provide evidence on the role of emotions in moral decision-making by manipulating the emotiveness (i.e., high-arousal versus low-arousal words and negative versus neutral words) of the moral questions accompanying different types of moral dilemmas (personal versus impersonal) across languages (Spanish L1 versus English LX). Based on the assumption of a weaker emotional resonance in the LX (Dewaele, 2008, 2010; Harris et al., 2003; Puntoni et al., 2009) and previous findings suggesting that L1 emotional words elicit different response patterns than L1 neutral words (see Caldwell-Harris, 2014, for a review), we hypothesised that any difference in moral judgements between emotive and non-emotive moral questions would be more pronounced in the L1 than in the LX, regardless of the type of moral dilemma (personal versus impersonal). The study was approved by the Research Ethics Committee of Nebrija University (Reference number: UNNE-2023-00010) and followed the principles expressed in the Declaration of Helsinki.
2. Method
2.1. Participants
Two hundred and sixty-two Spanish L1–English LX bilinguals (233 males, 28 females and 1 non-binary), aged between 21 and 60 years (M = 34.47, SD = 7.5), were recruited via social media platforms. Of these, 131 were assigned to the L1 conditions (66 to the L1 emotive condition and 65 to the L1 neutral condition) and the remaining 131 to the LX conditions (65 to the LX emotive condition and 66 to the LX neutral condition). Participants were randomly assigned to one of the four conditions and did not know in advance which condition they would be allocated to. No statistically significant differences were observed in the distribution (χ²(1, 262) = 4.785, p = .57) and education level (χ²(1, 262) = 17.763, p = .27) of men and women across emotive and neutral conditions. In the emotive conditions, participants had to respond to emotive moral questions accompanying personal and impersonal moral dilemmas, while in the neutral conditions they read neutral moral questions accompanying the same moral dilemmas. All the participants had started to learn English after the age of 3 in formal educational settings and self-reported an upper-intermediate proficiency level in English, rating their reading, writing, speaking and listening abilities on 7-point Likert scales (1 = no knowledge, 4 = intermediate, 7 = native-like; see Table 1). Moreover, the number of participants who had lived in an English-speaking country did not vary significantly among the four conditions (χ²(1, 262) = 4.006, p = .26). Participants’ demographic and language data are summarised in Table 1. For comparison purposes, we used one-way analyses of variance.
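As an illustration only, the random allocation of participants to the four conditions could be sketched as follows. This is not the procedure reported by the authors: the function name `assign_conditions`, the shuffle-then-rotate scheme and the `seed` parameter are assumptions introduced here to show one common way of obtaining near-equal group sizes.

```python
import random
from collections import Counter

# The four between-subject conditions named in the text.
CONDITIONS = ["L1 emotive", "L1 neutral", "LX emotive", "LX neutral"]


def assign_conditions(participant_ids, seed=None):
    """Hypothetical helper: shuffle participants, then deal them out
    to the conditions in rotation so group sizes differ by at most one."""
    rng = random.Random(seed)  # seeded generator for reproducibility
    ids = list(participant_ids)
    rng.shuffle(ids)
    return {pid: CONDITIONS[i % len(CONDITIONS)] for i, pid in enumerate(ids)}


if __name__ == "__main__":
    # 262 placeholder participant IDs (the sample size reported above).
    assignment = assign_conditions(range(262), seed=2023)
    print(Counter(assignment.values()))
```

With 262 participants this scheme always yields two groups of 66 and two of 65, although which conditions receive the extra participant may differ from the allocation reported above.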
2.2. Materials
2.2.1. Moral dilemmas and word manipulation
Following Greene et al.’s (2001) criteria about the distinction between personal and impersonal dilemmas, two personal moral dilemmas – the footbridge dilemma (mean emotion rating: 6.0) and the Sophie’s choice dilemma (mean emotion rating: 6.6) – and two impersonal moral dilemmas – the lost wallet dilemma (mean emotion rating: 2.9) and the resume dilemma (mean emotion rating: 2.8), all adapted from Koenigs et al. (2007), were used in this study. [Footnote 6] In the personal-sacrificial moral dilemmas, participants had to decide whether to sacrifice an innocent person in order to save five people (the footbridge dilemma) and whether to bring one of their two children to a laboratory where a doctor performs painful and deadly experiments on humans in order to avoid the death of both their children (the Sophie’s choice dilemma). In the impersonal and more realistic moral dilemmas, participants had to decide whether to keep the money they found in a wallet lying on the ground (the lost wallet dilemma) and whether to lie on their resume in order to find employment more easily (the resume dilemma). Our study focused exclusively on the impact of the emotionality of the moral questions on bilinguals’ moral judgements in both personal and impersonal moral dilemmas. It is important to clarify that although some studies have suggested the MFLe can be found only in emotionally charged (i.e., personal) moral dilemmas, other scholars have argued that the MFLe extends to less emotionally salient and more realistic moral dilemmas (Brouwer, 2019; Geipel et al., 2015b; Kyriakou & Mavrou, 2023). For example, Geipel et al. (2015b) found that bilingual participants were more willing to make a utilitarian choice when responding to the impersonal lost wallet dilemma in their LX, even though this dilemma is supposed to elicit weaker emotional reactions. The four dilemmas used in this study were originally written in English and were translated into Spanish by a bilingual Spanish–English speaker. Back translations of the Spanish versions of the dilemmas into English were also conducted by a second Spanish–English bilingual speaker in order to evaluate the overall quality of the translation and make sure that the exact meaning of each moral dilemma was conveyed correctly.
Two versions of each moral dilemma (emotional versus neutral versions) and in each language (Spanish L1 and English LX) were created by manipulating the emotiveness of the moral questions. Specifically, we manipulated the arousal and the valence of the verbs included in the moral questions that described the action to be judged. In the emotional conditions, we employed high-arousal and negatively valenced verbs, such as kill (e.g., Would you kill that stranger man in order to save five people?; the footbridge dilemma); in the neutral conditions, we used low-arousal and neutral verbs, such as throw (e.g., Would you throw that stranger man (off the bridge) in order to save five people?; the footbridge dilemma). English emotional and neutral words were selected from the Affective Norms for English Words (ANEW; Bradley & Lang, Reference Bradley and Lang1999) and Warriner et al.’s (Reference Warriner, Kuperman and Brysbaert2013) set of norms, while for Spanish words we used Redondo et al.’s (Reference Redondo, Fraga, Padrón and Comesaña2007) and Stadthagen-Gonzalez et al.’s (Reference Stadthagen-Gonzalez, Imbault, Pérez Sánchez and Brysbaert2017) databases. In these databases, arousal and valence are rated on 9-point Likert scales. We followed Guasch et al.’s (Reference Guasch, Ferré and Fraga2016) criteria for the classification of the verbs included in the moral questions into low-arousal (with values ranging between 1.50 and 5.41) and high-arousal (values from 5.43 to 8.40). We also employed Hinojosa et al.’s (Reference Hinojosa, Martínez-García, Villalba-García, Fernández-Folgueiras, Sánchez-Carmona, Pozo and Montoro2016) criteria for classifying these verbs into negative (values of valence below 4.00), neutral (values between 4.00 and 5.99) and positive (values of 6.00 and above). 
The level of emotionality of the verbs significantly varied across the two conditions (i.e., emotional versus neutral) both in L1 (arousal: t = 8.568, p < .001; valence: t = −6.839, p < .001) and in LX (arousal: t = 4.803, p = .003; valence: t = −9.336, p < .001). Furthermore, the level of emotionality of the verbs in L1 and LX did not differ significantly across the two emotional (arousal: t = 2.478, p = .068; valence: t = −1.171, p = .307) and the two neutral conditions (arousal: t = 2.010, p = .091; valence: t = −0.491, p = .641).
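The arousal and valence cut-offs described above can be sketched as a pair of small classification functions. This is only an illustration of the quoted ranges, not the authors’ procedure: the function names are hypothetical, the ratings in the usage loop are invented placeholders rather than values from any of the norming databases, and treating "between 4.00 and 5.99" as "at least 4.00 and below 6.00" is an assumption that matches the text for ratings reported to two decimals.

```python
from typing import Optional

# Arousal ranges quoted in the text (9-point scales).
LOW_AROUSAL = (1.50, 5.41)
HIGH_AROUSAL = (5.43, 8.40)


def classify_arousal(score: float) -> Optional[str]:
    """Return 'low' or 'high', or None if the score lies outside both quoted ranges."""
    if LOW_AROUSAL[0] <= score <= LOW_AROUSAL[1]:
        return "low"
    if HIGH_AROUSAL[0] <= score <= HIGH_AROUSAL[1]:
        return "high"
    return None


def classify_valence(score: float) -> str:
    """Return 'negative' (< 4.00), 'neutral' (4.00 up to, not including, 6.00) or 'positive'."""
    if score < 4.00:
        return "negative"
    if score < 6.00:  # assumption: covers the quoted 4.00-5.99 band
        return "neutral"
    return "positive"


# Hypothetical (arousal, valence) ratings, for illustration only.
for arousal, valence in [(7.20, 2.10), (4.30, 5.00)]:
    print(classify_arousal(arousal), classify_valence(valence))
```

Note that the quoted arousal ranges leave a narrow gap (5.42) and do not cover scores below 1.50 or above 8.40, which is why the arousal function returns None in those cases instead of forcing a label.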
Word frequency was calculated using the Zipf scale (Van Heuven et al., Reference Van Heuven, Mandera, Keuleers and Brysbaert2014). The Zipf values for the English verbs were obtained using SUBTLEX-US (Brysbaert & New, Reference Brysbaert and New2009; Brysbaert et al., Reference Brysbaert, New and Keuleers2012), whilst the Zipf values for the Spanish verbs were extracted from the EsPal database (Duchon et al., Reference Duchon, Perea, Sebastián-Gallés, Martí and Carreiras2013). Following Van Heuven et al.’s (Reference Van Heuven, Mandera, Keuleers and Brysbaert2014) cut-off points, values between 1 and 3 correspond to low-frequency words, whereas values between 4 and 7 refer to high-frequency words. The mean Zipf frequency values for the English emotion verbs were 5.03 (SD = 0.65) and 4.65 (SD = 0.97) for the emotional and the neutral condition, respectively (t = 0.086, p = .936). The mean Zipf frequency values for the Spanish emotion verbs were 4.79 (SD = 0.47) in the emotional condition and 4.17 (SD = 0.35) in the neutral condition (t = −2.025, p = .099). Emotional and neutral verbs were balanced in terms of frequency across L1 and LX (emotional condition: t = −0.515, p = .634; neutral condition: t = −0.930, p = .388).
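For readers unfamiliar with the Zipf scale, the following minimal sketch shows how a Zipf value relates to a raw frequency and how the cut-offs quoted above would be applied. It uses the basic definition of the scale (log10 of the frequency per million words, plus 3); the function names and the "unclassified" label for values outside the two quoted bands are assumptions introduced here, and the databases may apply additional smoothing that is not reproduced.

```python
import math


def zipf_from_per_million(freq_per_million: float) -> float:
    """Zipf value = log10(frequency per million words) + 3."""
    if freq_per_million <= 0:
        raise ValueError("frequency per million must be positive")
    return math.log10(freq_per_million) + 3.0


def frequency_band(zipf: float) -> str:
    """Apply the cut-offs quoted in the text: 1-3 is low, 4-7 is high."""
    if 1.0 <= zipf <= 3.0:
        return "low"
    if 4.0 <= zipf <= 7.0:
        return "high"
    return "unclassified"  # hypothetical label for values outside both bands


# The four mean Zipf values reported in the text.
for mean_zipf in (5.03, 4.65, 4.79, 4.17):
    print(mean_zipf, frequency_band(mean_zipf))
```

Under these cut-offs, all four reported means fall in the high-frequency band, which is in line with the claim that the verbs were balanced in frequency.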
2.3. Procedure
The study was conducted online using the survey platform QuestionPro (Bhaskaran, Reference Bhaskaran2002). After giving their consent, the participants were asked to answer questions about their demographic and language backgrounds and then they had to complete one of the two versions of the questionnaire (emotional versus neutral conditions) either in Spanish L1 or in English LX. Specifically, they had to read four moral dilemmas (two personal and two impersonal dilemmas), whose order was randomised, and to make a yes (utilitarian) or no (deontological) moral decision. The completion of the questionnaire took approximately 10 minutes.
3. Results
The frequencies of the utilitarian decisions according to the type of the dilemma (personal versus impersonal), the language condition (Spanish L1 versus English LX) and the emotiveness of the moral question (emotive versus neutral), as well as the percentage of participants who opted for the utilitarian decision, are summarised in Table 2.
Mixed effects logistic regression models were run in RStudio 2022.02.3 (Posit team, 2023) using the glmer function in the lme4 package (Bates et al., Reference Bates, Mächler, Bolker and Walker2015). The results revealed a main effect of language and type of the dilemma, i.e., participants tended to provide more utilitarian responses in their LX and in the impersonal and more realistic moral dilemmas (see Table 3). A statistically significant interaction of language and dilemma type was also observed in that participants provided more deontological responses in the personal dilemmas in their L1. Furthermore, the interaction term of language and emotiveness of the moral questions was significant at the .10 level (but not at the .05 level). A closer inspection of the descriptive statistics revealed a greater tendency towards the utilitarian option in the neutral condition but only for the participants who read the dilemmas in Spanish L1. Therefore, we ran mixed effects logistic regression models for the L1 and LX groups separately (see Tables 4 and 5). The results indicated that the participants who read the dilemmas in Spanish L1 tended to be more utilitarian in the neutral condition than in the emotive condition; however, the same pattern was not observed in the English LX groups.
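Read as a formula, the full model described above might look like the following. This is a sketch under stated assumptions rather than the authors’ specification: the text does not report the random-effects structure, the coding of the predictors or the complete set of interaction terms, so the by-participant random intercept $u_i$, the 0/1 coding and the two interaction terms shown here are illustrative choices.

```latex
\operatorname{logit} \Pr(y_{ij} = 1)
  = \beta_0 + \beta_1 L_i + \beta_2 D_j + \beta_3 E_i
  + \beta_4 (L_i \times D_j) + \beta_5 (L_i \times E_i) + u_i,
\qquad u_i \sim \mathcal{N}(0, \sigma_u^2)
```

Here $y_{ij} = 1$ denotes a utilitarian (yes) response by participant $i$ to dilemma $j$, $L_i$ codes language (L1 versus LX), $D_j$ codes dilemma type (personal versus impersonal) and $E_i$ codes the emotiveness of the moral question (neutral versus emotive). Under this reading, $\beta_4$ corresponds to the language by dilemma type interaction and $\beta_5$ to the language by emotiveness interaction discussed in the text.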
Note. Dependent variable: Participants’ moral judgements (Yes/No response). * p < .05, ** p < .01, *** p < .001.
Note. Dependent variable: Participants’ moral judgements in Spanish L1 (Yes/No response). * p < .05, ** p < .01, *** p < .001.
Note. Dependent variable: Participants’ moral judgements in English LX (Yes/No response). * p < .05, ** p < .01, *** p < .001.
4. Discussion
The current study explored the influence of language (L1 versus LX), type of moral dilemma (personal versus impersonal) and emotiveness (emotive versus neutral) of the moral questions accompanying moral dilemmas on the moral judgements made by Spanish L1–English LX bilinguals. According to our results, these bilinguals showed a greater preference for the utilitarian choice in the personal moral dilemmas that they read in their LX. This finding is consistent with previous studies suggesting that the impact of language on moral judgements tends to be limited to footbridge-type moral dilemmas (i.e., to sacrifice one person to avoid greater harm) (Białek & Fugelsang, Reference Białek and Fugelsang2019; Brouwer, Reference Brouwer2021; Costa et al., Reference Costa, Foucart, Hayakawa, Aparici, Apesteguia, Heafner and Keysar2014; Hayakawa et al., Reference Hayakawa, Tannenbaum, Costa, Corey and Keysar2017; but see Del Maschio et al., Reference Del Maschio, Crespi, Peressotti, Abutalebi and Sulpizio2022).
Furthermore, we found that the manipulation of the emotional verbs included in the moral questions influenced bilinguals’ moral judgements, but only in the L1 condition. Specifically, participants were more willing to select the utilitarian option in the neutral L1 condition than in the emotive L1 condition, whereas participants in the LX condition provided similar responses regardless of the degree of emotiveness of the moral questions. These results support the view that L1 emotion words elicit stronger emotional reactions than LX emotion words, leading people to adopt a more deontological approach in the emotive condition as compared to the neutral one. As mentioned previously, late bilinguals behave differently when processing emotion versus neutral words (Altarriba & Bauer, Reference Altarriba and Bauer2004; Altarriba et al., Reference Altarriba, Bauer and Benvenuto1999; Talmi & Moscovitch, Reference Talmi and Moscovitch2004), since affective associations of words are usually established throughout childhood (Pavlenko, Reference Pavlenko2012). Therefore, the reduced LX emotionality observed in our study could be the result of a late age of onset of English LX or an instructional setting of LX learning (see Caldwell-Harris, Reference Caldwell-Harris2014, for a review).Footnote 7 Our participants acquired their LX after puberty through formal education and might thus have been less able to perceive the emotional intensity and affective connotations of the LX emotional verbs included in the moral questions, even if they could understand the meaning of those verbs (see, e.g., Ahn & Jiang, Reference Ahn and Jiang2023; Dewaele, Reference Dewaele2010; Ferré et al., Reference Ferré, Guasch, Stadthagen-Gonzalez and Comesaña2022). 
Since emotional resonance of LX decreases as age of LX onset increases (Bromberek-Dyzman et al., Reference Bromberek-Dyzman, Jończyk, Vasileanu, Niculescu-Gorpin and Bąk2021; Ferré et al., Reference Ferré, Anglada-Tort and Guasch2018; Mergen & Kuruoglu, Reference Mergen and Kuruoglu2017; Pavlenko, Reference Pavlenko2012), future follow-up studies should examine more thoroughly both early and late bilinguals’ responses to emotive and neutral moral questions.
Another factor that may have influenced the results of our study is participants’ LX proficiency level. Evidence suggests that low and intermediate LX proficiency levels may lead to more pronounced emotional dampening effects, whilst a high LX proficiency level could promote emotional responses similar to those of L1 speakers due to greater engagement with the emotional content of words (Imbault et al., Reference Imbault, Titone, Warriner and Kuperman2021; Ferré et al., Reference Ferré, García, Fraga, Sánchez-Casas and Molero2010; Ponari et al., Reference Ponari, Rodríguez-Cuadrado, Vinson, Fox, Costa and Vigliocco2015). Consequently, bilinguals are more likely to make less emotional and thus more rational decisions in response to emotionally charged moral dilemmas in their LX, particularly when their LX proficiency level is low to intermediate. This assumption aligns with a recent meta-analysis of 57 studies addressing bilinguals’ decision-making, which revealed that the MFLe has been systematically found in personal moral dilemmas at intermediate LX proficiency levels but tends to diminish as LX proficiency increases (Teitelbaum Dorfman et al., Reference Teitelbaum Dorfman, Kogan, Barttfeld and García2024; but see Del Maschio et al., Reference Del Maschio, Crespi, Peressotti, Abutalebi and Sulpizio2022). Nevertheless, evidence of the interplay between LX proficiency level and moral decision-making across different types of moral dilemmas (e.g., unrealistic/classic versus realistic) is still limited, and the existing findings do not always converge. Therefore, future research should employ more varied types of moral dilemmas and compare bilingual groups at different LX proficiency levels.
Although our results appear to be consistent with the reduced emotionality hypothesis, we do not rule out the possibility that the elevated number of utilitarian responses in the LX could be due to the additional cognitive load that performing demanding LX tasks entails (this is known as the increased deliberation hypothesis). Processing information (e.g., reading a moral dilemma) in an LX may produce cognitive fatigue (Segalowitz, Reference Segalowitz2010), thus leading to less emotional responses. According to the dual process theory (Greene et al., Reference Greene, Sommerville, Nystrom, Darley and Cohen2001), the use of an LX involves more effortful (moral decision-making) processing, increasing people’s willingness to maximise overall well-being. Nevertheless, our study did not measure cognitive load and therefore future studies should include such measurement to empirically test whether there is a relationship between enhanced cognitive load and utilitarian responses in LX.
The second main finding of the current study concerns the influence of the type of moral dilemma on participants’ moral decision-making. In line with previous research (Costa et al., Reference Costa, Foucart, Hayakawa, Aparici, Apesteguia, Heafner and Keysar2014; Greene, Reference Greene, Gazzaniga, Strick, Graybiel, Mink and Shadmehr2009; Greene et al., Reference Greene, Sommerville, Nystrom, Darley and Cohen2001; Koenigs et al., Reference Koenigs, Young, Adolphs, Tranel, Cushman, Hauser and Damasio2007), our participants were more prone to make utilitarian decisions in response to the impersonal dilemmas. This finding could be due to the decreased psychological distance to hypothetical scenarios which are based on common situations that one may experience in everyday life (Trope & Liberman, Reference Trope and Liberman2010). As the construal level theory (Trope & Liberman, Reference Trope and Liberman2010) posits, the information conveyed by a text can be processed in an abstract (abstract construals) or concrete (concrete construals) manner depending on its psychological distance; in turn, the distance of the construals has an impact on people’s moral judgements (see Körner & Volk, Reference Körner and Volk2014). Specifically, people tend to focus on the outcomes of an action when they confront psychologically proximal situations, leading them to make more utilitarian and deliberate decisions (Eyal et al., Reference Eyal, Liberman and Trope2008). 
Personal moral dilemmas – such as the ones used in this study – lack mundane realism because they are based on unusual and absurd contexts that require people to cause permanent and serious harm to other people (Bauman et al., Reference Bauman, McGraw, Bartels and Warren2014; Kahane et al., Reference Kahane, Everett, Earp, Caviola, Faber, Crockett and Savulescu2018; Sommer et al., Reference Sommer, Rothmayr, Döhnel, Meinhardt, Schwerdtner, Sodian and Hajak2010); therefore, these dilemmas are perceived as psychologically more distant, leading people to place more importance on the nature of the action rather than on the overall well-being, as the present study suggested. On the other hand, the impersonal moral dilemmas used in this study were based on realistic and less emotive situations that do not require people to cause bodily harm. Therefore, when responding to these dilemmas, our participants appeared to be more concerned about the outcomes of their actions (Körner & Deutsch, Reference Körner and Deutsch2023), reacting in a rather utilitarian and less aversive way (Körner et al., Reference Körner, Joffe and Deutsch2019; Yavuz et al., Reference Yavuz, Küntay and Brouwer2023).
Nevertheless, the present study has several limitations. First, we used a limited number of emotional and neutral words (specifically verbs) to examine whether the effect of language on moral decision-making is moderated by the emotionality of these words. Future studies should manipulate a larger number of words to draw more solid conclusions on the influence of the emotionality of moral questions on bilinguals’ moral judgements. Second, the effect of language was present in the personal moral dilemmas and absent in the impersonal ones; however, the total number of moral dilemmas in our study was quite limited. Future studies should include a greater number of emotionally charged everyday moral dilemmas that are based on people’s daily life experiences (see, e.g., Starcke et al., Reference Starcke, Polzer, Wolf and Brand2011). Another limitation of this study concerns the use of self-reports to assess LX proficiency, which are not always in agreement with individuals’ performance or scores on objective LX proficiency measures (Sitzmann et al., Reference Sitzmann, Ely, Brown and Bauer2010). Lastly, the unbalanced number of men and women in our study restricts the generalisability of the results. Future studies should address this limitation by using larger sample sizes and more gender-balanced bilingual groups.
5. Conclusions
The present study suggests that bilinguals’ moral judgements may differ depending on the language they use. Based on our findings, we can conclude that emotion words may serve as a significant internal driving force that guides people’s moral reasoning and decision-making, but this phenomenon mainly occurs in the L1. On the contrary, the use of LX emotion words may reduce emotional biases due to the emotional detachment that people experience when using languages that they acquired later in life in non-naturalistic settings. In this context, a reduction in emotional biases is interpreted as a greater willingness to adopt utilitarian solutions to emotionally charged moral dilemmas. From this point of view, the use of an LX could serve as a useful tool to change one’s mindset and decision-making, allowing them to approach dilemmatic situations with a more objective perspective.
These findings have important implications for intrapersonal and interpersonal relationships. For example, when people feel upset, they may find that describing their emotional states in the LX helps them put themselves in a more positive frame of mind, reflect and find ways to move forward. Likewise, people can deliberately switch to their LX when arguing with their bilingual partners; this can serve as a strategy to minimise the likelihood of making impulsive and emotionally driven decisions, which often put romantic relationships at risk. Our findings also have implications for other domains, such as advertising and marketing. For example, advertisers aim to grab customers’ attention and turn them into brand advocates. Promoting emotional messages in publicity by using a large number of positively or negatively valenced words, as well as high-arousal words, could become a powerful tool for manipulating customers’ feelings and opinions and for encouraging them to acquire specific products and services, or even adopt specific points of view, at least when these messages are read or viewed in customers’ L1. On the other hand, advertisements presented in individuals’ LX could be more effective when advertisers appeal to reason and logic, rather than to emotion, to convince customers that one product is superior to others.
Data availability
The data that support the findings of this study are available from the first author of this study upon request.
Declaration of conflicting interests
The authors declare none.