Introduction
The accurate assessment of cognitive functioning is crucial in job selection (Hausdorf, LeBlanc, & Chawla, 2003; Outtz, 2002) and in the prediction of academic success (Peng & Kievit, 2020; Tikhomirova, Malykh, & Malykh, 2020) in healthy subjects, as well as in evaluating cognitive deficits, predicting functional outcomes, and monitoring recovery in patients with various brain injuries (Casaletto & Heaton, 2017; Mansour & Lajiness-O'Neill, 2015). A variety of cognitive skills can be assessed, including learning, verbal and visuo-spatial memory, various types of attention, processing speed, reasoning, judgment, mental flexibility, problem-solving, and spatial and language functions, among others (Kessels & Hendriks, 2016). Language tests are among the most commonly used tests in clinical settings and elsewhere for the differential diagnosis of aphasia and dementia, as well as in evaluating patients with traumatic brain injury (Maseda et al., 2014).
Normative comparisons are a crucial element in any type of standardized assessment procedure (Huizenga, van Rentergem, Grasman, Muslimovic, & Schmand, 2016). Such assessment involves taking the performance of an individual and comparing that performance to reference groups of the same age, sex, race, and educational attainment. It has become evident that these demographic factors impact performance on the different neuropsychological tests and thus necessitate that the evaluation of the performance of a client or patient be based upon comparisons with like individuals. These normative comparisons allow for a determination of whether an individual is performing as would be expected given this person's demographic characteristics or if their performance is poorer than expected.
Most available normative reference data for neuropsychological tests aimed at the language domain, such as the Token Test, Boston Naming Test and Verbal Fluency Test, are derived from exclusively or predominantly WEIRD people (Western, Educated, Industrialized, Rich, and Democratic) – that is, Caucasian samples (Henrich, Heine, & Norenzayan, 2010) and monolingual societies. Only recently has the importance of non-WEIRD neuropsychological research been acknowledged (Pathak, Rijal, & Pathak, 2021). There is also a substantial likelihood that language tests are not culture free and/or that an earlier tested normative group is no longer representative due to changes in health and education level (Siciliano, Chiorri, Battini, Sant'Elia, Altieri, Trojano, & Santangelo, 2018). Moreover, one cannot simply assume that the language tests are valid measures for any population other than the one for which the tests were first developed and normed. Lower test performances may be misinterpreted, for example in ethnic minority subjects or in subjects who were not assessed in their first language (Smith, Ivnik, & Lucas, 2008). If outdated normative scores, or normative scores from other countries or cultures, are used, lower performances may be falsely interpreted as indicative of cognitive impairment, which can lead to disproportionate misdiagnosis. Therefore, it is imperative that each country and culture has recently collected, representative normative data.
Indonesia provides an especially challenging environment for the administration of standardized tests as it is enormously heterogeneous regarding the language spoken daily in public and the language spoken at home. Over 300 different native languages are spoken in Indonesia. Although Bahasa Indonesia is the official language used in education, mass media, public information, most business, and administration, it is not the daily language for most Indonesians. Bahasa Indonesia is the first language of less than 10 percent of the total population; for over 200 million Indonesians it is their second language. Therefore, it is important to investigate whether speaking Bahasa daily in public or at home influences performance on neuropsychological tests, and on language tests in particular. As so many Indonesians are bilingual, it is also important to know the influence this may have when linguistic processes are assessed.
It is widely acknowledged that language comprises receptive and expressive language functions. Receptive language is the capacity to read and to repeat words and sentences, and includes the understanding of spoken language. Oral comprehension, defined as the ability to process and manipulate information received through speech, can be measured by the Token Test (TT). The TT consists of items such as “Touch the second circle and the first red square” with increasing levels of difficulty. Performance on the TT is affected in patients with Wernicke's aphasia (Jahagirdar, 2014). Several studies provide normative data for the TT (Strauss, Sherman & Spreen, 2006) for various Anglo-Saxon and Romance languages (Moreira, Schlottfeldt, de Paula, Daniel, Paiva, Cazita, Coutinho, Salgado, & Malloy-Diniz, 2011; Peña-Casanova, Quiñones-Úbeda, Gramunt-Fombuena, Aguilar, Casas, Molinuevo, Robles, Rodríguez, Barquero, Antúnez, Martínez-Parra, Frank-García, Fernández, Molano, Alfonso, Sol, & Blesa, 2009), but not for Asian languages such as Bahasa Indonesia. The TT is sensitive to left parietal temporal infarcts and moderately sensitive to Alzheimer's disease (Paula, Bertola, Nicolato, Moraes, & Malloy-Diniz, 2012). Effects of demographic factors such as age and education on the TT are modest (Peña-Casanova et al., 2009).
The other main category is expressive language: besides spontaneous speech, often observed by a clinician during an intake, the naming of objects belongs to this category, and the Boston Naming Test (BNT) is often used to assess it. It is one of the two most commonly administered tests in Western oriented countries (Kiran, Cherney, Kagan, Haley, Antonucci, Schwartz, Holland, & Simmons-Mackie, 2018). Normative data have been published for various Western cultures and countries (Mitrushina, Boone, Razani, & D'Elia, 2005; Quiñones-Ubeda, Peña-Casanova, Böhm, Gramunt-Fombuena, & Comas, 2004) including Brazil (Leite, Miotto, Nitrini, & Yassuda, 2017). The BNT has been adapted for the Indonesian population (I-BNT) and preliminary normative scores have been proposed (Sulastri, Utami, Jongsma, Hendriks, & van Luijtelaar, 2019). The scores of the BNT are influenced by the level of education while age effects are predominantly found for people over 60 (Peña-Casanova et al., 2009). Similar results were obtained with the I-BNT (Wahyuningrum, van Luijtelaar, & Sulastri, 2021).
Verbal fluency is another aspect of expressive language: it involves coming up with words starting with an obligatory phonological marker (phonemic verbal fluency test) or with words from a semantic category (e.g., animals, vegetables). Phonemic verbal fluency deficits have been noticed in various patient categories, mainly but not exclusively in those with injury to the frontal lobe or fronto-parietal networks (Lim, Kim, Lee, Yoo, Kim, Kim, & Lee, 2019; Manca, Mitolo, Stabile, Bevilacqua, Sharrack, & Venneri, 2019; Pasquier, Lebert, Grymonprez, & Petit, 1995; Klumpp & Deldin, 2010). This is one of the reasons why phonemic verbal fluency can also be considered a marker of executive functioning (Bialystok, Craik, & Luk, 2012; Luo, Luk, & Bialystok, 2010).
The primary aim of our research is to establish whether the scores of these three language function tests are affected by whether or not people speak Bahasa daily in public, and whether or not they speak Bahasa at home. This distinction between language usage at home and in public was earlier used by Sari, van de Vijver, Chasiotis, and Bender (2018) in the Indonesian context. It can be assumed that there will be a disadvantage for those who do not speak Bahasa daily in public. The type of language spoken is relevant because, if negative effects of not speaking Bahasa daily in public or at home are established, corrections to the normative scores for the language function tests will be required.
Although bilingualism is often considered an advantage in many cognitive tasks (Bialystok et al., 2012; Mindt, Arentoft, Kubo Germano, D'Aquila, Scheiner, Pizzirusso, Sandoval, & Gollan, 2008; Schroeder, Marian, Shook, & Bartolotti, 2016), the verbal production of picture naming such as in the BNT or in other similar tasks that require lexical access has been found to be negatively affected in the form of slower responses and/or more errors in bilinguals compared to monolinguals, even though the naming task was done in their dominant language (Costa, 2005; Gollan, Montoya, Fennema-Notestine, & Morris, 2005; Michael & Gollan, 2005). This slower word retrieval was independent of factors mediating language proficiency for bilinguals – such as order of acquisition and language dominance (Ivanova & Costa, 2008).
Baldo, Shimamura, Delis, Kramer, and Kaplan (2001) proposed that the phonemic VFT should be considered an executive function task, given that neuroimaging studies have shown converging evidence for the involvement of executive control in this task – that is, mediated by the left frontal areas, and specifically the posterior opercular area of Broca's area (Paulesu, Goldacre, Scifo, Cappa, Gilardi, Castiglioni, Perani, & Fazio, 1997), an area that is also recruited in cognitive tasks without language production (Yeung, Nystrom, Aronson, & Cohen, 2006). Therefore, considering that phonemic word production involves executive functioning as well as word production, little or no effect of bilingualism can be expected on the VFT. Minimal effects of bilingualism on the most difficult items of the TT are expected, since language comprehension is not known to be affected by speaking a second or third language. In all, the second aim of this study is to establish the effects of bilingualism on the three language tests, including whether bilinguals will show the expected reduction in performance on the BNT (Kohnert, Hernandez, & Bates, 1998).
The third aim is to explore the effects of the demographics, such as age, education, and sex on the three language tests and to compare the size of these effects with those of language spoken in public and at home.
Methods
Participants
Participants were recruited by research assistants from six different universities: three on Java Island and one each from Bali, East Kalimantan, and South Sulawesi, in all cases in urbanized parts of the four islands. All participants completed a questionnaire regarding demographic variables such as age, sex, place of birth, years of education, marital status, and ethnic group of both parents, and a health questionnaire regarding the use of alcohol, drugs, or medication, current and past neurological or psychiatric issues, and other factors known to influence health status. The data of participants who reported head trauma, drug abuse, or other previous or current illnesses that might have influenced performance on the tests were excluded. Respondents were also asked whether or not they spoke Bahasa Indonesia daily in public; those who did were further asked whether they spoke, in addition to Bahasa, one or two other languages in public. Finally, they were asked whether they spoke Indonesian at home, and whether they spoke any other language besides Bahasa Indonesia at home.
The total sample (n = 840) covered a wide age range (16–80 years, M = 35.5, SD = 15.25), was 61.7% female, and varied in education level from only elementary school (6 years) to more than 17 years (postgraduate, with a maximum of 22 years). Participants were categorized into age decade groups as is often done (Fernández & Marcopulos, 2008): (i) age 20–29 years, (ii) age 30–39 years, (iii) age 40–49 years, (iv) age 50–59 years. The data of the group over 60 years were pooled, as were those of participants 16–19 years of age.
The years of education were categorized as well, and the five categories represented the Indonesian education system: cat 1, less than seven years of education (i.e., Elementary School (ES)); cat 2, 7–9 years of education (Junior High School (JHS)); cat 3, 10–12 years of education (Senior High School (SHS) or equivalent); cat 4, 13–16 years of education (Undergraduate (UG) or equivalent); and cat 5, more than 16 years of education (Graduate and Postgraduate).
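The categorization described above can be sketched as a pair of small helper functions. This is only an illustration of the stated cut-offs, not the authors' actual procedure: the function names are hypothetical, and the boundary handling (for example, placing a 60-year-old in the oldest group) is an assumption.

```python
def education_category(years):
    """Map years of education to the five categories described in the text."""
    if years < 7:
        return 1  # Elementary School (ES)
    if years <= 9:
        return 2  # Junior High School (JHS)
    if years <= 12:
        return 3  # Senior High School (SHS) or equivalent
    if years <= 16:
        return 4  # Undergraduate (UG) or equivalent
    return 5      # Graduate and Postgraduate


def age_group(age):
    """Map age in years to a decade group; youngest and oldest are pooled."""
    if age < 16:
        raise ValueError("sample starts at 16 years of age")
    if age < 20:
        return "16-19"
    if age >= 60:
        return "60+"  # assumption: age 60 itself is placed in the pooled oldest group
    lower = (age // 10) * 10
    return f"{lower}-{lower + 9}"
```

For instance, a 35-year-old participant with 12 years of education would fall in age group "30-39" and education category 3 under these assumptions.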
Table 1 gives an overview of the demographics of the participants. Most of them had at least an undergraduate type of education (53.1%), while 35.7% had completed SHS. A small percentage finished only JHS (6.9%) or ES (4.3%). The age group best represented in our sample was that between 20 and 29 years (35.5%); the three groups between 30 and 59 were about equally represented (from 15 to 16.8%), while the youngest and eldest groups were less well represented. Culturally, our sample represents in large part the urban Javanese population (57%) and the urban parts of Bali, East Kalimantan, and South Sulawesi (each 14%). The sample contained more females than males.
Note: This table demonstrates age, education, and sex of the participants as well as languages spoken daily, in public and at home in numbers and in percentages.
Almost all participants (94.4%) indicated that they spoke Bahasa daily in public; about half of them spoke only Bahasa, and the other half Bahasa plus one or two other languages. These percentages were rather different for the languages spoken at home: here only 64.2% indicated that they spoke Bahasa at home, and more than one third of the sample spoke another language at home.
Four groups (G1 to G4) were formed regarding the in-public daily spoken language(s) and three groups (A1 to A3) regarding the language spoken at home (see Table 1). We used what the participants spoke in public (no Bahasa (G1), only Bahasa (G2), Bahasa plus one other language (G3) or Bahasa plus two or more other languages (G4)) and at home (no Bahasa (A1), only Bahasa (A2), Bahasa plus another language (A3)) as independent factors in the various statistical analyses.
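The group definitions above amount to a simple coding rule. The sketch below shows one way to express it; the function names and the two input fields (whether Bahasa is spoken, and the number of other languages spoken) are hypothetical stand-ins for the questionnaire items, not the authors' actual variables.

```python
def public_group(speaks_bahasa, n_other_languages):
    """Code the language(s) spoken daily in public into G1-G4."""
    if not speaks_bahasa:
        return "G1"  # no Bahasa in public
    if n_other_languages == 0:
        return "G2"  # only Bahasa
    if n_other_languages == 1:
        return "G3"  # Bahasa plus one other language
    return "G4"      # Bahasa plus two or more other languages


def home_group(speaks_bahasa, n_other_languages):
    """Code the language(s) spoken at home into A1-A3."""
    if not speaks_bahasa:
        return "A1"  # no Bahasa at home
    if n_other_languages == 0:
        return "A2"  # only Bahasa
    return "A3"      # Bahasa plus another language
```

Coding the groups in this order (non-Bahasa first, then monolingual, then multilingual) is what later allows ordered contrasts between the groups to map onto the research questions.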
Stimuli and Materials
The data used for the present study were taken from recently collected data of three neuropsychological language tests adapted for Indonesia: the phonemic version of the VFT, for which the letters S, K, and T were chosen, as proposed by Hendrawan and Hatta (2010) and applied by Pesau and van Luijtelaar (2021); the version of the BNT adapted for Indonesia (I-BNT; Sulastri et al., 2019); and a version of the TT translated from English. Data will be available after the normative data of these tests have been published.
Procedure
The tests were administered in Bahasa Indonesia, the official language of Indonesia that is used nationwide in public media, administration, and business. The participants received seventy-five thousand rupiahs (equal to five US dollars) after finishing the series of tests. The current research was conducted in accordance with the Declaration of Helsinki; ethical clearance was provided by the ethics committee of Soegijapranata University (number 001B/B.7.5/FP.KEP/IV/2018). All subjects gave written informed consent.
Research Questions and Data Analyses
We had three major research questions. One: Does the language spoken daily, either in public or at home, affect the performance on the language tests? Two: Are there differences between monolinguals and bilinguals in the performance on the three language tests? Three: Are there effects of sex, age, and education on the three language tests, and what are their effect sizes?
Differences between the four (language usage in public) and three (language spoken at home) groups regarding their level of education (five levels), age (seven categories), as done previously (Wahyuningrum et al., 2021), and sex were analyzed with two ANOVAs with groups as between-subject factor. This was done to establish whether these demographic factors should be additionally included in the analyses of the group differences on the three language tests. The analyses of whether the “language groups” G1 - G4 and A1 - A3 differ regarding their performance on the verbal tests should not be contaminated by other between-group differences such as education, age and sex. Therefore, the analysis of the demographic factors as dependent variables was necessary to establish objectively whether these demographic factors should be included in the ANCOVA or not. Subsequently, the ANCOVA was used to establish the effect of language usage daily in public (with groups as between-subjects factor), while controlling for the significant demographic factors age, education, and sex by using them as co-factors. Since sex was without any significant effect, the ANCOVAs were redone without sex as co-factor.
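To make the first step concrete, the following is a minimal sketch of a one-way between-subjects ANOVA F statistic in plain Python. It illustrates only the preliminary group comparison on a single demographic variable; it does not implement the ANCOVA with covariates, and the function name is hypothetical.

```python
from statistics import fmean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_values = [x for g in groups for x in g]
    grand_mean = fmean(all_values)
    # Between-groups sum of squares: group size times squared deviation
    # of each group mean from the grand mean.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares: squared deviations from own group mean.
    ss_within = sum((x - fmean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within
```

In practice a statistics package would be used for this and for the subsequent ANCOVA; the sketch is only meant to show what the group comparison on age or education computes.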
Helmert contrasts were used as post-hoc tests; they directly answered research questions 1 and 2. The first contrast compared the group that did not speak Bahasa daily in public (G1) with the three groups that did speak Bahasa daily in public (G2, G3, G4) (research question 1). Next, comparisons were made among the three groups speaking Bahasa in public: the group that spoke only Bahasa daily (G2, monolinguals) was compared with those that spoke Bahasa plus at least one other language (bilinguals, G3 and G4) (research question 2). Finally, the group that spoke Bahasa plus one other language in public (G3) was compared with the group that spoke, in addition to Bahasa, two other languages in public (G4). In order to prevent type I errors, the threshold for a significant difference was set at p < .01 for the ANCOVAs.
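The contrast scheme described above, in which each group is compared with the mean of all later groups, can be written out as a set of weights. The sketch below generates these weights for any number of ordered groups; the function name is hypothetical, and note that naming conventions differ between statistical packages (some use "Helmert" for the reverse ordering).

```python
from fractions import Fraction


def helmert_contrasts(k):
    """Weights comparing each of k ordered groups with the mean of later groups."""
    contrasts = []
    for i in range(k - 1):
        remaining = k - i - 1  # number of groups after group i
        row = (
            [Fraction(0)] * i                       # earlier groups are ignored
            + [Fraction(1)]                         # the group being tested
            + [Fraction(-1, remaining)] * remaining  # mean of the later groups
        )
        contrasts.append(row)
    return contrasts


# Four public-language groups (G1..G4) give three contrasts:
# G1 vs (G2, G3, G4); G2 vs (G3, G4); G3 vs G4.
for row in helmert_contrasts(4):
    print([str(w) for w in row])
```

For the three home-language groups, `helmert_contrasts(3)` yields the two comparisons A1 vs (A2, A3) and A2 vs A3.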
A similar statistical analysis approach was followed for the factor “language spoken at home”. In short, first the demographic variables were analyzed to check whether the three groups differed regarding age, sex, and education, followed by an ANCOVA to control for the effects of demographics. Here, too, sex was without any significant effect, so the ANCOVAs were redone without sex as co-factor. The Helmert contrasts compared the non-Bahasa speaking group (A1) with the two Bahasa speaking groups (A2 + A3), and the only-Bahasa speaking group (A2) with the group that speaks at least one other language in addition to Bahasa (A3). MANOVAs were used to get a general impression regarding the effect size (amount of explained variance) of the two factors (daily language spoken in public and language spoken at home) in relation to the demographic factors of age, education, and sex on the performance of the three verbal tests. The interpretation of effect sizes as expressed by η² was according to Richardson (2011).
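As a reading aid for the effect sizes, partial η² for a single effect can be recovered from its F value and degrees of freedom via the standard identity η²p = (F × df_effect) / (F × df_effect + df_error). The sketch below applies this identity; the function name and the example values are purely illustrative and are not taken from the study's results.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic and its degrees of freedom."""
    if df_effect <= 0 or df_error <= 0:
        raise ValueError("degrees of freedom must be positive")
    explained = f_value * df_effect
    return explained / (explained + df_error)


# Illustrative values only (not from the study): F = 10.0 with df = 3 and 836.
print(round(partial_eta_squared(10.0, 3, 836), 3))
```

The identity makes clear that, for a fixed F value, the effect size shrinks as the error degrees of freedom grow, which is why highly significant effects in a large sample can still be small or moderate in size.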
Results
Daily language spoken in public: sample characteristics
The one-factor ANOVA regarding whether the four “daily language spoken in public” groups differed in age, education and sex showed significant medium-sized age (F = 15.48, df 3,84, p < .001, η² = .05) and education effects (F = 11.85, df 3.84, p < .001, η² = .04) and a small sex effect (F = 2.62, df 3.84, p < .001, η² = .01).
Subsequent post-hoc tests showed that the participants who did not speak Bahasa daily in public (G1) were significantly less educated and older compared to those who did (G2-G4). Regarding the participants who spoke Bahasa in public it was found that those who only used Bahasa as their daily public language (G2) were significantly younger than those who used Bahasa in addition to another language (G3, G4). Therefore, it is imperative to control for age and education in the comparisons between the four groups regarding the languages spoken daily in public. Finally, it was found that those who use three languages daily were better educated than those who use two daily languages.
Language spoken at home: sample characteristics
The distribution of the subjects regarding the language spoken at home already showed that more than one third (n = 301, 36%) did not speak Bahasa at home. The ANOVA analyzed whether the three groups differed regarding education, age, and sex. A moderate group effect for education was found (F = 30.38, df 2.84, p < .001, η² = .07): the post-hoc test confirmed that the education level of those who do not speak Bahasa at home (A1) was lower than that of the two Bahasa speaking groups (A2, A3). There was also an age effect (F = 15.41, df 2.84, p < .001, η² = .04): the post-hoc tests confirmed that the non-Bahasa speaking group (A1) was older than both Bahasa speaking groups (A2, A3) and that the only-Bahasa speaking group (A2) was younger than the Bahasa plus another language group (A3). Therefore, it is imperative to control for age and education in the comparisons between the three groups regarding the languages spoken at home. In total, the combined demographics of the two language factors showed that 44 of the 840 probands indicated that they speak Bahasa neither at home nor in public.
The effect sizes of daily spoken language(s) on language tests compared
MANOVAs comparing the effect sizes of daily spoken language in public and language spoken at home were used to get a general impression regarding the effect size of both factors on the performance of the verbal tests as an ensemble: the results were F = 3.49, df 54.25, p < .001, η² = .07 for daily language spoken in public and F = 3.45, df 36.16, p < .001, η² = .07 for languages spoken at home. This demonstrates that the overall contribution of both factors to the performances on all three language tests taken together is significant, moderate in size, and equal.
Another MANOVA was used to compare the effect size for the factor ‘daily language spoken in public’ with those of age, education, and sex on the three language tests taken together. The factor ‘daily in-public spoken language’ was significant with a moderate effect size (F = 2.84, df 54.25, p < .001, η² = .06), the effect of education was rather large (F = 19.89, df 18.82, p < .001, η² = .31), and the effect of age was moderate (F = 3.47, df 18.82, p < .001, η² = .07). The factor sex was not significant. The same analyses were repeated for the comparisons between the effect size of ‘language spoken at home’ and the three demographic factors. The factor language spoken at home was significant (F = 1.63, df 34.14, p < .01, η² = .04), as well as the factor education (F = 2.70, df 68.28, p < .001, η² = .06), and age (F = 1.41, df 85.35, p < .01, η² = .03), while sex was not significant.
The effect of daily spoken language(s) in public on the three language tests
We next analyzed whether the factor ‘daily in-public spoken language’ affected the performance measures of all three language tests separately while controlling for the differences in education and age. The ANCOVA for the factor daily language spoken showed a significant effect for almost all variables of the I-BNT and the most difficult items of the TT. The results are presented in Table 2. The picture emerges that there are quite a few significant group effects, implying that, while controlling for education and age, the language spoken in public affected the scores of two of the three language tests, but in a different manner.
Note: This table shows the effect of daily spoken language(s) in public on the Boston Naming Test (BNT), Token Test (TT), and Verbal Fluency Test (VFT) while controlling for age, and education as covariates (ANCOVA). Given are the Mean and SD of the four groups regarding the spoken language(s). The Mean Square, F (with df's), p-value and partial eta squared (effect size), followed by the outcomes of the Helmert contrasts as post-hoc tests.
The first research question asked whether there is a disadvantage when tested in Bahasa Indonesia for individuals who do not speak Bahasa in public. The results of the ANCOVA showed group effects for subscale F of the TT (F = 8.82, df 3.83, p < .001; η² = .03) and for the Total score of the TT (F = 7.47, df 3.83, p < .001; η² = .03), while the outcomes of the first Helmert contrast (G1 vs combined groups G2, G3 and G4) confirmed the disadvantage for those who do not speak Bahasa Indonesia daily in public (p < .01) for scale F and the total score of the TT. There were no such effects on the two language production tests.
The second research question was answered by the second Helmert contrast, in which G2 was compared with G3 and G4. That is, on the I-BNT, is there an advantage for those who are monolingual in public versus those who speak one or two other languages in public in addition to Bahasa? This ANCOVA also showed significant group effects for almost all variables of the I-BNT: for the number of spontaneous correct items (F = 4.68, df 3.83, p < .001, η² = .02), for the number of correct responses after an a-phonemic cue (F = 9.51, df 3.83, p < .001, η² = .03), for the number of correct responses after the phonemic cue (F = 3.60, df 3.83, p < .01, η² = .01), for the number of total correct responses (F = 4.49, df 3.83, p < .01, η² = .02) and for the total time to complete the BNT (F = 16.19, df 3.83, p < .001; η² = .06). The Helmert contrast confirmed the advantage of monolingualism in the form of a higher number of spontaneous correct items, less use of phonemic and a-phonemic cues, and a much quicker time to complete the BNT (p's < .05). Interestingly, none of the TT and VFT scores were affected. The third contrast, the comparison between the bilingual and trilingual speaking participants, showed two more effects for the BNT: an advantage for the trilinguals compared to the bilinguals for the total number of correct items and for a higher number of a-phonemic cues. The third test, the VFT, did not show differences between any of the groups.
The ANCOVA also confirmed age and education effects, in addition to the mentioned effects of ‘language spoken in public’. The results are presented in Table 3. Age and education effects were found for all three language tests. More specifically, age-related differences were found for all I-BNT variables except the number of correct responses after phonemic cues, for the F-scale and Total score of the TT, and for all scores of the VFT. Effects of education were found for all I-BNT variables except the number of correct responses after a-phonemic cues, for all TT variables except scale A, and for all four variables of the VFT. In fact, more and larger effects (as inferred from η²) of education compared to age were found (see Table 3). Correlation coefficients (not tabulated here) between age and performance on all three language tests were negative, and those between years of education and performance on all three language tests were positive. In general, performance on all three language tests worsened with age and improved with education.
Note: F with df's, and p-value and effect size are given.
The effect of language(s) spoken at home on the three language tests
We next analyzed whether the factor ‘language spoken at home’ affected the performance measures of all three language tests while controlling for the differences in education and age by using them as covariates. The effects of the demographic factors on the language tests, as revealed by the outcomes of the ANOVAs, were rather similar to those found in the analysis of the factor ‘spoken language in public’. In particular, significant effects for the factor ‘language spoken at home’ were found for the I-BNT and the TT, but not for the VFT.
The ANCOVA for the factor ‘daily language spoken at home’ showed significant group effects for most of the I-BNT variables and again on the same TT variables – the score on the F-scale and the total score – and again no effects of this factor were found on the VFT; details of the ANCOVA are presented in Table 4.
Note: This table demonstrates the effect of spoken language at home while controlling for age, and education (ANCOVA) on the Boston Naming Test (BNT), Token Test (TT), and Verbal Fluency Test (VFT). Given are the Mean and SD of the three groups (A1, A2, A3), the MS, F (with df's) p and eta squared, followed by the outcomes of the Helmert contrasts as post-hoc tests.
Significant group effects were found for the I-BNT regarding the number of spontaneous correct answers (F = 9.87, df 2.84, p < .001, η² = .02), the number of correct answers after phonemic cues (F = 5.67, df 2.84, p < .01, η² = .01), the total number of correct responses (F = 6.51, df 2.84, p < .01, η² = .02), as well as the total time to complete the BNT (F = 6.48, df 2.84, p < .01, η² = .02). All these variables showed the disadvantage, as revealed by the first Helmert contrast (A1 vs A2, A3), of not speaking Bahasa at home: that is, a smaller number of spontaneous and correct items, a slower speed to complete the test, and more phonemic cues used. The same disadvantage was also found for the two variables of the TT: the F-values were F = 6.80, df 2.84, p < .01, η² = .02 and F = 6.39, df 2.84, p < .01, η² = .02 for the F-scale and Total score, respectively, and the first contrast confirmed in both cases worsened performance, that is, fewer correct items for those who do not speak Bahasa at home and were assessed in Bahasa. The second contrast concerned the disadvantage of bilingualism: here the total score of the TT was affected (for F see above) and a lower score for the bilingual individuals was obtained. No group effects were found for the VFT.
The ANCOVA showed, in addition to the previously mentioned group effects, age and education effects as well for all three tests. The results are presented in Table 5. It is particularly striking that almost all variables of the three tests show the effects of education, with the size of this effect being larger than that of age. Striking as well is the finding that all subscales of the TT except one are sensitive to education effects. There were also rather large effects for education on the VFT.
Note: F with df's, p-value, and effect size are given.
Discussion
Indonesia is truly a society in which many people use more than one language, and this is reflected in our sample. While almost 95% speak Bahasa Indonesia in public, the percentage of Bahasa-speaking participants at home is much lower (64%). Moreover, many Indonesians (47.5%) speak another language in addition to Bahasa in public. Those who did not speak Bahasa in public were somewhat older and less educated than the Bahasa-speaking groups. Among the Bahasa-speaking groups, it was the younger participants who spoke only Bahasa; the older participants were more often bilingual. The same tendencies were found regarding the language spoken at home: younger and well-educated people more often spoke Bahasa at home than older and less well-educated individuals. This result fits the general tendency that most of the population speak Indonesian alongside their local language and that Indonesian is becoming more common as a first language (Cohn & Ravindranath, 2014). Under the assumption that middle-class families are better educated than their working-class counterparts, the outcomes of our study agree with an earlier Indonesian study showing that middle-class parents and children in Central Java use Bahasa much more than their working-class counterparts (Kurniasih, 2006).
The level of education and the age of the participants differed across the groups. Because it is well documented that age and education have major effects on many cognitive tests, including language tests, it is imperative to control for both when the groups’ linguistic performance is compared. The major outcome of the comparisons between the language groups is that language spoken at home has a moderate effect on the performance of two of the three language tests. More specifically, a clear disadvantage was found for participants who do not speak Bahasa Indonesia at home on a language comprehension test and on a word production test (I-BNT) when assessed in Bahasa Indonesia. The same disadvantage was also found for those who do not speak Bahasa Indonesia in public on the same language comprehension test, but not for the I-BNT.
A second major outcome was that bilingualism is a disadvantage in word production as measured with the I-BNT but not with the VFT. Furthermore, this bilingualism disadvantage effect, or monolingualism advantage, was found for both speaking only Bahasa at home and for only speaking Bahasa in public. Thirdly, large education and moderate age effects were found for all three language tests. Sex was not found to have any influence on any of the performance measures of the three language tests.
The disadvantage of not being tested in the daily language one speaks in public was evident for the language comprehension test, but only for the total score of the TT. This disadvantage of being assessed in Bahasa while Bahasa is not the language one regularly uses in public was confirmed and extended by the analyses regarding the languages spoken at home: again a lower performance was found, not only on the TT's F-scale (the most difficult scale) and total score, but now also on the I-BNT. Almost all I-BNT variables showed this disadvantage: both language comprehension and word production were negatively affected. Both the time to complete the task and the number of correct items showed this effect. Interestingly, such an effect was not found for the other word production test, the VFT. Other research in this area showed that Swedish immigrants, when tested in the Swedish language, performed worse on a phonemic VFT task than age-matched native Swedes, demonstrating that not being tested in one's original language lowers the performance scores in word production (Stålhammar, Hellström, Eckerström, & Wallin, 2020). The reasons for the lack of effects on the VFT in the present study might be that, although Bahasa is not spoken daily, the subjects' familiarity with or proficiency in Bahasa might be high, considering that almost all Indonesians are educated in Bahasa for nine years; that the majority of our subjects finished at least senior high school or college (undergraduate); and that the large exposure to mass media has a large and compensatory impact on their ability to produce words in Bahasa Indonesia. It also seems that the BNT, with its increase in the level of difficulty of the items, is more challenging than the VFT for the detection of word production performance.
The disadvantage in word production of bilinguals found specifically for the I-BNT was found earlier by others for the BNT, either as a decrease in the number of correct items or as a lengthening of the time to complete the BNT (Bialystok, Craik, & Luk, 2008; Luo et al., 2010; Gollan, Fennema-Notestine, Montoya, & Jernigan, 2007; Sandoval, Gollan, Ferreira, & Salmon, 2010). Many other studies found that bilingualism appears to slow down lexical processing as measured in VF tasks, including those constrained by letter cues (Gollan, Montoya, & Werner, 2002; Portocarrero, Burright, & Donovick, 2007). The longer retrieval times needed by bilinguals have been interpreted as a consequence of competition or interference between lexical entries of the bilinguals’ two languages or, alternatively, of lower word usage frequencies. Indeed, there is a considerable body of evidence that, when a bilingual individual is assessed, lexical access is influenced by properties of the lexicon of the other language (Dijkstra & van Heuven, 2002); that the lexicons of the two languages are not accessed separately; and that between-language phonological, semantic, and orthographic similarities can trigger the activation of the other language (Sanoudaki & Thierry, 2015).
Interestingly, here again the second word production test, the VFT, was not sensitive, but this lack of effects was predicted (Bialystok et al., 2012; Gollan et al., 2002; Luo et al., 2010; Rosselli, Ardila, Salvatierra, Marquez, Luis, & Weekes, 2002). The phonemic VFT is also considered an executive function task, and bilinguals should therefore have an advantage over monolinguals here. Therefore, the lack of a bilingual disadvantage in the VFT task as found here may be a consequence of the interplay between the linguistic disadvantages and the executive control advantages in bilinguals (Sandoval et al., 2010). In other words, the bilingual disadvantage in word production might have been compensated for by the bilingual advantage in executive function. The bilingual disadvantage for the I-BNT as found here is not caused by age and education effects, as demonstrated by the outcomes of the ANCOVA (where the effects persisted after controlling for these demographics), although it is plausible that some non-language aspects on which bilingual and monolingual speakers might differ, such as parents’ education level, socioeconomic status, and ethnicity, could also have contributed to a bilingual disadvantage, as was proposed by Antoniou (2018).
A comparison of individuals who speak either one or two languages in addition to Bahasa revealed a significant difference in just one I-BNT variable: the number of a-phonemic cues. This type of cue was used most often by those who speak two more languages in addition to Bahasa, followed by those who speak a single additional language and then those who speak only Bahasa. Young (2016) found that the recall time in a picture-naming task increased gradually as the number of languages increased. This evidence supports the hypothesis that learning more languages increases difficulty with word retrieval: the more languages a person knows, the slower he or she recalls individual words phonetically and semantically. For those who speak additional languages, getting access to a broader lexicon and finding the appropriate word seems more difficult, and cues are more often necessary than for those who speak only Bahasa.
All three language tests were more sensitive to education than to age, with moderate effects, while sex effects were not found on any of the performance measures of the three language tests (Lucas, Ivnik, Smith, Ferman, Willis, Petersen, & Graff-Radford, 2005; Scheuringer, Wittig, & Pletzer, 2017). The relatively small, though significant, age effects on the TT are also in agreement with the literature (e.g., Aranciva, Casals-Coll, Sánchez-Benavides, Quintana, Manero, Rognoni, Calvo, Palomo, Tamayo, & Peña-Casanova, 2012; Lucas et al., 2005; Moreira et al., 2011) showing that verbal comprehension is usually better preserved during ageing than the productive aspects of language, such as word fluency and picture naming, but that some difficulties are encountered when the material is complex (Juncos-Rabadán, Facal, Rodríguez, & Pereiro, 2010), mainly due to a diminishment in the phonemic loop of working memory. Also of interest, in our case the age-related effects on the TT were found for the more difficult C, E, and F scales and the Total score, but not for the easiest ones (A and B).
Corrections for age and education have become common practice for cognitive tests when normative scores are reported. As mentioned, effects of education and age on all three language tests were found, in agreement with a large body of literature. The sizes of the reported effects of the languages spoken at home on the language tests are comparable to those of age and cannot be ignored. Therefore, it is more than reasonable that corrections for language spoken be incorporated in the normative scores, next to those for education and age. The present outcomes have the practical consequence that, when Indonesian language tests are used, different normative data should be applied for participants who do not speak Indonesian at home, as well as, perhaps, for participants who do not use Indonesian daily in public. This is the case for the I-BNT and to a lesser degree for the Token Test, but not for the VFT. No effects of spoken language were found for the VFT when we controlled for age and education effects. The lack of spoken language effects for the VFT may suggest that age and education have mediating effects on the factors of language spoken at home and in public, and indeed those who did not speak Bahasa at home were older and less educated.
In clinical neuropsychological practice, persons may often be tested in a language different from their home language. Those who do not have Bahasa as a home language, but who are assessed in Bahasa, are at an immediate disadvantage. Putatively lower scores on the I-BNT and TT might lead to an underestimation of their linguistic skills, or to an overdiagnosis in cases of pathology. It remains to be determined whether corrections for ethnicity are imperative as well; this awaits further empirical research. For a fair assessment, it is imperative that the diversity of clients and patients be respected by offering them ethnicity-adapted tests, as well as by providing recent normative data from groups that mimic their demographic characteristics as closely as possible. A fair assessment also implies that one is tested in one's primary language and by a representative of one's own culture or ethnic group. This is necessary to prevent neuropsychological health-care disparities and underestimation of a person's cognitive skills (Cory, 2021; Rabin, Brodale, Elbulok-Charcape, & Barr, 2020).
It can be concluded that the two factors “daily language” and “language spoken at home” that were explored in this study have rather similar effects on the three different language tests, although the factor language spoken at home seems more influential. The effects have consequences for clinical practice. First, and this might be preferable, to avoid underdiagnosis or overdiagnosis of language problems due to the language effects found here, the tests should be administered in one's most familiar language. When this is not possible, or when the tests have not been properly adapted for the first language of the testee, the tests can be administered by a neuropsychologist with an interpreter or else directly by multilingual research assistants (Nielsen, Segers, Vanderaspoilden, Bekkhus-Wetterberg, Minthon, Pissiota, Bjørkløf, Beinhoff, Tsolaki, Gkioka, & Waldemar, 2018). Secondly, consideration needs to be given to correcting the normative data for the TT and I-BNT for “language spoken at home.” Such correction is particularly called for if the cognitive assessment is done in Bahasa Indonesia and Bahasa is not the language spoken at home. These corrections of the scores would be similar to the well-known corrections for age and education. For the phonemic VFT, corrections for age and education are all that seem imperative in this Indonesian sample.
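One common way to implement such corrections is regression-based norming, in which an observed score is compared with the score expected for a person's age, education, and home language. The Python sketch below illustrates the idea only; the text does not specify a norming method, the function names are hypothetical, and the coefficients in the example are invented placeholders, not estimates from this study.

```python
def expected_score(age, education_years, speaks_bahasa_at_home, coef):
    """Predicted test score from a linear normative model.

    `coef` holds regression weights that would have to be estimated
    from a normative sample; none are taken from the present study.
    """
    home = 1.0 if speaks_bahasa_at_home else 0.0
    return (coef["intercept"]
            + coef["age"] * age
            + coef["education"] * education_years
            + coef["home_language"] * home)


def corrected_z(observed, age, education_years, speaks_bahasa_at_home,
                coef, residual_sd):
    """Demographically corrected z-score: (observed - expected) / residual SD."""
    if residual_sd <= 0:
        raise ValueError("residual_sd must be positive")
    predicted = expected_score(age, education_years,
                               speaks_bahasa_at_home, coef)
    return (observed - predicted) / residual_sd


# Purely hypothetical coefficients, for illustration only.
EXAMPLE_COEF = {"intercept": 40.0, "age": -0.1,
                "education": 0.5, "home_language": 3.0}
example_z = corrected_z(45.0, 50, 12, False, EXAMPLE_COEF, residual_sd=4.0)
```

With a positive home-language weight, the same raw score yields a higher corrected z-score for someone who does not speak Bahasa at home than it would against norms that ignore home language, which is the kind of adjustment argued for above.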
Acknowledgements
This work was supported by the Directorate of Higher Education General of Indonesia under grant number 010/L6/AK/SP2H.1/PENELITIAN/2019. The authors would like to thank the members of the Indonesian Neuropsychology Consortium, the research assistants, and all those who participated in this research. John V Keller, PhD, provided linguistic support.
Competing interests
The author(s) declare none.