Introduction
The population aged over 80 in France is large and growing. According to Eurostat (2022), there are currently over 4.1 million people over 80 in France, accounting for 6.2% of the total population. The number of people over 80 is projected to double by 2070, making up 1 in 8 people in the country. Neurodegenerative diseases affect 6.4% of the population. Their prevalence increases with age, from 1.2% between 65 and 69 years to 15% after 80 years and almost 30% after 90 years (Gil, 2018).
The importance of recent cognitive function standards in the oldest old population lies in their ability to distinguish between pathological aging and successful or normal aging, as defined by Hartley et al. (2018). A comprehensive neuropsychological assessment also enables better tailoring of interventions to individuals in need.
Normal aging causes changes in cognitive functions (Angel & Isingrini, 2015; Braver & West, 2008; Cabeza et al., 2016; Goh et al., 2012; Park & Reuter-Lorenz, 2009). Salthouse (2019) reported that declines in episodic memory and reasoning skills accelerate around the age of 65. Processing speed declines linearly from the 30s, while vocabulary decline is less pronounced. Procedural memory is resistant to cognitive aging.
Several studies have investigated the impact of individual factors, such as education level and gender, on cognitive performance in old age. Research consistently shows a positive correlation between higher education and better cognitive performance among older individuals (Fletcher et al., 2021; Grasset et al., 2018; Opdebeeck et al., 2015), which supports the cognitive reserve hypothesis (Stern, 2002, 2009). In a recent review, Seblova et al. (2020) concluded that education was strongly correlated with performance levels but that the association between educational level and changes in cognitive performance was not significant. In a meta-analysis, Lövdén et al. (2020) found that education plays a significant role in cognitive function during later life by creating differences in cognitive abilities established in early adulthood that persist into old age. Gender differences in cognitive aging have also been extensively studied. Older men tend to outperform women in visual perceptual and visuoconstructive tasks, while older women show superior performance in the verbal domain (Munro et al., 2012; McCarrey et al., 2016; Proust-Lima et al., 2008). Further, McCarrey et al. (2016) found that men experience more cognitive decline in global efficiency, processing speed, and visuospatial ability, while older women are more resistant to age-related cognitive decline.
These findings partially contrast with a recent study by Levine et al. (2021), which showed that women experience faster cognitive decline than men in global cognition and executive functions.
These different studies clearly demonstrate the impact of education and gender on cognitive function in aging. It is important to consider these factors when establishing standards for cognitive function.
According to Giulioli and Amieva (2016), the oldest old population is at a higher risk of developing a neurodegenerative disease. However, four out of five patients do not receive the recommended diagnostic procedures. Neuropsychological assessment in the oldest old can be challenging due to comorbidities, chronic pain syndromes, sensory deficits, or severe fatigue. Furthermore, there is a lack of cognitive tools and standards specifically designed for this population. Some studies have established French standards, particularly for individuals over 80.
In 2008, Roussel and Godefroy compiled the tests commonly used to assess executive function in the GREFEX battery and published normative scores for the French-speaking population. The age range of the oldest group remains broad, including subjects aged 60 and over, and does not distinguish a category of oldest old subjects. Overall, this 60 and over age group performed less well than younger age groups on all tests, both in terms of accuracy and reaction time. With the exception of the semantic fluency task, a lower level of education had a negative impact on all tests, while gender did not have any effect.
Ferreira et al. (2010) created the RAPID battery, a set of neuropsychological tests previously standardized in different populations. The study found a positive correlation between high educational level and cognitive functioning. Furthermore, scores were generally lower for the oldest age group (80 to 89) compared to the younger age groups, at each level of education.
Established in 1999–2000, the Three-City (3C) cohort is a large prospective cohort that includes French participants aged over 65 (The 3C Study Group, 2003). Amieva et al. (2007) developed detailed norms for an episodic memory test, the Free and Cued Selective Reminding Test (FCSRT), for people aged 65 years and over, with a specific group of subjects aged 78 to 90 years old, distinguished by gender and educational level. The effects of gender, age and education were not tested statistically.
The Aging Multidisciplinary Investigation cohort, as described by Pérès et al. (2012), is a prospective epidemiological study that focuses on health and aging. The study produced normative data on a specific population of retired farmers aged 65 years and older who live in rural areas. Rullier et al. (2014) used this cohort to provide detailed normative data for a visual recognition test (DMS-48). The study showed that test performance declined with age but increased with education. Additionally, women had significantly higher performance levels.
Giulioli et al. (2016) provided normative scores from a sample of healthy subjects aged 85 years and older from the Personnes Agées QUID study, 20 years after their inclusion. The study revealed that cognitive performance was negatively impacted by aging and lower education, except for a subtest assessing semantic ability. Gender did not have a significant effect after adjusting for age and education.
In conclusion, the limited research on French norms of neuropsychological assessment in the oldest old has not consistently considered the impact of gender and education. Furthermore, some cohorts have used tests that are not aligned with current clinical practice. The main aim of this study is to establish French standards, for individuals over 80 years old, based on commonly used neuropsychological assessment tests. These tests should be conducted under conditions that closely resemble clinical practice, taking into account gender and education. Therefore, we propose measuring cognitive performance, including global cognitive function, language, memory, executive function, praxis, visuospatial ability, attention, and processing speed, in a single session on a population aged over 80. We will investigate the effect of gender and educational level variables on the entire group. It is hypothesized that individuals over the age of 80 will perform significantly lower than the existing norms for elderly populations. Furthermore, we anticipate that those with higher levels of education will perform better overall. It is also expected that men will perform better than women on visuoconstructive tasks while women will outperform men on verbal tasks.
Method
Participants
The FIBRATLAS Project (https://anr.fr/Projet-ANR-14-CE17-0015) recruited 134 participants from six French body donation programs. The project aims to validate in vivo and ex vivo MRI tractography of the brain’s white matter using dissection as a gold standard. It has created a database containing in vivo (DWI MRI and neuropsychological assessment) and ex vivo (DWI and dissection) data obtained from the same subjects. The study involved 134 participants aged 82 or older with no history of neurological or neurosurgical disease or major cognitive impairment. Participants had a score of 4 or more on the Instrumental Activities of Daily Living scale.
Of the participants, 126 underwent a comprehensive neuropsychological assessment between 2015 and 2021 (see below). Nineteen participants whose Mini-Mental State Examination (MMSE) scores were at or below the 5th percentile (using the French norms by education level established by Kalafat et al., 2003) were excluded due to the risk of cognitive impairment (see Figure 1 for details).
A total of 107 participants were selected for analysis, consisting of 56 women and 51 men, with an average age of 85.2 (range: 82–94.5). The majority of participants (90%) were between the ages of 82 and 89, as shown in Table 1.
The Educ − group comprised 55 participants with 9 years of education or fewer, corresponding to no diploma or a primary school diploma. The Educ + group comprised 52 participants with more than 9 years of education, corresponding to a secondary school or university diploma (see Table 2). There was no significant age difference between the male and female groups. However, there was a significant and moderate difference in age between the groups divided by level of education (Educ − vs. Educ +), both for the group as a whole (U (107) = 1039; p = .015; r rb = .274) and for the female group (U (56) = 251; p = .025; r rb = .352). The female participants with the lowest level of education were older, with a mean age of 86.6 years (SD = 3.31) for the Educ − group and 84.6 years (SD = 2.85) for the Educ + group, representing a difference of 2 years. Depression was evaluated using the Montgomery and Asberg Depression Rating Scale (MADRS; Montgomery and Asberg, 1979), resulting in a mean score of 5.91 (SD = 6.88). Table 2 also reports the number of participants with cardiovascular risk factors, including diabetes, hypertension, and hypercholesterolemia.
Note: Educ − = less or equal to 9 years of education; Educ + = more than 9 years of education; SD = Standard Deviation; MMSE = Mini-Mental State Examination; MADRS = Montgomery-Asberg Depression Rating Scale.
Neuropsychological assessment
Neuropsychological assessments were conducted by psychologists within 45 days of inclusion, using normative tests commonly proposed in clinical practice. The assessments covered several domains including global cognitive function, language, episodic memory, visuospatial ability, executive function, attention, processing speed and praxis. Table 3 lists the neuropsychological tests, scores and times examined. Detailed procedures and scoring are described in the supplementary material.
Note: MMSE = Mini Mental State Examination, FCSRT-Fr = Free and Cued Selective Reminding Test - French. RCFT = Rey Complex Figure Test, NCE = Non-Corrected Errors, MLB = Mahieux-Laurent Battery.
The study assessed global cognitive function using the French version of the MMSE (Kalafat et al., 2003). Language ability was evaluated using the DO 80 picture naming test (Deloche and Hannequin, 1997), which allowed for spontaneous self-correction and strict marking, without strict time constraints. Episodic memory was evaluated in two modalities. Auditory verbal memory was assessed with the FCSRT, adapted into French by Van der Linden et al. (2004); for clarity, we refer to the French version of the test (RL/RI-16) as FCSRT-Fr. The test assesses immediate recall, free and cued recalls, recognition and 20-minute delayed recall tasks. Visual episodic memory was evaluated using the DMS-48 (Barbeau et al., 2004). The evaluation included an immediate recognition task after an encoding phase, followed by a delayed recognition task presented one hour later. Visuospatial ability was assessed using the Rey Complex Figure Test (RCFT), which includes copying, immediate reproduction, and delayed recall after 20 minutes (see Fastenau et al., 1999 for a description). To assess attention and executive function, tests adapted into French and standardized by GREFEX (Reflection Group on the Evaluation of Executive Function) were used, as documented by Meulemans (2008) and Roussel and Godefroy (2008). The Verbal Fluency Test, Stroop Test and TMT Part A and B were administered, along with the Digit Span subtest from the WAIS IV (Wechsler, 2011). Processing speed was measured using the Coding subtest of the WAIS IV (Wechsler, 2011).
Gestural praxis was assessed using the Mahieux-Laurent Battery (MLB), which includes symbolic and meaningless gestures and action pantomimes (Mahieux-Laurent et al., 2009).
Figure 2 presents the order of the tests. The number of participants varied between 90 and 107, depending on the subtests, as some test data were not available (see Appendix 1 for details). The RCFT Delayed Recall subtest, the Stroop subtests, and TMT Part B were administered in the middle of the neuropsychological evaluation and had the lowest percentage of valid observations. Some participants experienced difficulties with certain subtests due to visual impairments (such as low visual acuity or poor color perception), or academic difficulties (such as poor writing or visuospatial skills). A small number of participants even refused to continue or expressed anxiety.
Ethical issues
Written informed consent was obtained from all participants. This study adhered to the Ethical Principles for Medical Research Involving Human Subjects laid down in the Declaration of Helsinki. Additionally, the FIBRATLAS Project, from which part of the data was obtained, was approved by the Tours Ethics Committee (Comité de Protection des Personnes, 2015-R8) and by the ANSM (Agence Nationale de Sécurité du Médicament et des Produits de Santé, EudraCT/ID RCB: 2015-A00363-46).
Data analysis
The statistical analyses were performed using Jamovi 2.4.14 software. The study examined the impact of gender and education on participants’ cognitive performance. The scores were analyzed by gender (male vs. female) and education (Educ − vs. Educ +).
The study assessed the impact of gender on the entire sample and separately for each education group (Educ + vs. Educ −). Additionally, the effect of education was tested for each score globally (on the whole sample) and separately for each gender group (males vs. females). To select the appropriate statistical tests, the Shapiro-Wilk test was used to assess whether each score followed a normal distribution. Parametric tests, such as Student’s t-test and ANOVA, were used for scores that followed a normal distribution and had equal variances, as determined by Levene’s test. The variables concerned are FCSRT-Fr Free Recalls, Semantic Fluency, Digit Span Total Score, RCFT Immediate Recall and Delayed Recall Scores and Coding Score. For variables with non-normal distributions, a non-parametric test (Mann-Whitney U test) was used. Effect sizes were calculated using Cohen’s d for parametric tests and the rank-biserial correlation for non-parametric tests.
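As an illustration of the two effect sizes named above, the following Python sketch computes Cohen's d with a pooled standard deviation and a rank-biserial correlation from a Mann-Whitney U statistic. It is not the Jamovi implementation: the function names are hypothetical, and the formula r = 1 − 2U / (n1 × n2) assumes that the reported U is the smaller of the two possible U values.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(a, b):
    """Cohen's d for two independent samples, using the pooled SD."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


def mann_whitney_u(a, b):
    """Smaller of the two U statistics; tied pairs count for 0.5."""
    u_a = sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)
    u_b = len(a) * len(b) - u_a
    return min(u_a, u_b)


def rank_biserial(u, n1, n2):
    """Magnitude of the rank-biserial correlation (assumes u is the smaller U)."""
    return 1 - 2 * u / (n1 * n2)


# Age comparison between education groups, using the figures reported
# in the Participants section (U = 1039, group sizes 55 and 52).
r_age = rank_biserial(1039, 55, 52)
```

Under these assumptions, the age comparison between education groups (U = 1039, with 55 and 52 participants) gives a value of about .27, in line with the effect size reported for that comparison.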
Percentile scores (5th, 10th, 25th, 50th, 75th, 90th, and 95th) were provided for each test, stratified by gender and education, as the majority of distributions did not follow a normal distribution.
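The percentile tables described above can be sketched as follows. This is a minimal illustration, not the procedure used in Jamovi: it relies on Python's `statistics.quantiles` with the inclusive interpolation method, which is only one of several percentile conventions, and the record layout (gender, education group, score) is a hypothetical one chosen for the example.

```python
from collections import defaultdict
from statistics import quantiles

PERCENTILES = (5, 10, 25, 50, 75, 90, 95)


def percentile_table(scores):
    """Return the selected percentiles of a list of scores (needs >= 2 values)."""
    cuts = quantiles(scores, n=100, method="inclusive")  # 99 cut points
    return {p: cuts[p - 1] for p in PERCENTILES}


def stratified_percentiles(records):
    """Group (gender, education, score) records and compute one table per subgroup."""
    groups = defaultdict(list)
    for gender, educ, score in records:
        groups[(gender, educ)].append(score)
    # Subgroups with a single observation are skipped: no percentile can be estimated.
    return {key: percentile_table(vals) for key, vals in groups.items() if len(vals) >= 2}
```

Because percentiles make no assumption about the shape of the distribution, the same function applies to the normally and non-normally distributed scores alike.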
Tables 4 to 16 present the results for each test in the studied subgroups, including the number of participants, percentiles 5 to 95, mean, and standard deviation, as well as the comparison of the four subgroups by gender and/or education.
Significance levels (p-values) are reported as follows when the performance of the subgroups differed significantly: * for p ≤ .05, ** for p ≤ .01, and *** for p ≤ .001.
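The star notation can be expressed as a small helper. The sketch below simply encodes the thresholds stated above; the function name is hypothetical.

```python
def significance_stars(p):
    """Map a p-value to the star notation used in the result tables."""
    if p <= 0.001:
        return "***"
    if p <= 0.01:
        return "**"
    if p <= 0.05:
        return "*"
    return ""  # not significant: no marker
```

For example, the age difference between education groups reported in the Participants section (p = .015) would be marked with a single star.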
Results
Global cognitive functions
Table 4 shows that the mean MMSE score for the whole group was 27.36 (SD = 1.85), with poorer performance for participants with a lower level of education among both women (U (56) = 190, p < .001, r rb = .51) and men (U (51) = 213, p = .034, r rb = .34), as well as for the whole group (U (107) = 807, p < .001, r rb = .44). No gender effect was observed.
Language
The mean score on the DO 80 naming test for the whole group was 78.13 (SD = 4.67), with no significant differences observed according to gender and education (see Table 5).
Memory
The mean scores for the entire group on the different subtests of the verbal episodic memory test FCSRT-Fr are presented below (see Tables 6 and 7 for detailed scores): immediate recall (M = 15.07, SD = 1.36), free recalls (M = 22.86, SD = 6.81), total recalls (M = 43.95, SD = 4.91), recognition (M = 15.68, SD = 0.75), delayed free recall (M = 9.07, SD = 3.14) and total delayed recall (M = 15.09, SD = 2.01). The study found that participants with lower educational levels performed worse in both free and total recall tasks for the whole group (t (102) = 2.78, p = .007, d = .54; U (104) = 914.5, p = .004, r rb = .32), as well as for the subgroups of men (t (48) = 2.07, p = .04, d = .59; U (50) = 165, p = .005, r rb = .47), and women (t (52) = 2.27, p = .03, d = .62). No significant interaction between age and education level was observed.
The study found a gender effect in the different recall phases for the whole group, with women performing better than men overall. This effect was observed in immediate recall (U (106) = 940, p = .001, r rb = .33), free recalls (t (102) = 2.44, p = .016, d = .48), total recalls (U (104) = 1005, p = .024, r rb = .26), delayed free recall (U (103) = 979, p = .022, r rb = .26) and total delayed recall (U (103) = 984, p = .011, r rb = .26). Furthermore, the study found that participants with a lower level of education had lower scores in immediate recall (U (54) = 208, p = .005, r rb = .42), total recalls (U (52) = 208, p = .020, r rb = .38), and delayed total recalls (U (51) = 202, p = .011, r rb = .37).
Intrusions after cueing are rare (median between subgroups from 0 to 1). Similarly, false recognitions in the recognition task were rare, with most subjects in the different subgroups making no such errors (see Appendix 2 for details).
The DMS-48 visual memory test yielded two mean scores for the entire group (see Table 8): an immediate recognition score (M = 44.37, SD = 3.42) and a delayed recognition score (M = 43.35, SD = 4.34). Test completion times were also recorded and analyzed. No gender effect was observed on any of the measures (score and time to completion). Regarding the effect of education on completion times, a difference was only found for the immediate recognition subtest in the whole group (U (103) = 1026, p = .048, r rb = .23), with subjects with a lower level of education responding more slowly.
Subjects with a lower level of education performed worse on both immediate and delayed recognition scores (respectively, for the whole group, U (105) = 932, p = .004, r rb = .32 and U (104) = 1047, p = .046, r rb = .23).
Visuospatial abilities
As shown in Tables 9 and 10, on the RCFT, the mean scores for copying, immediate recall and delayed recall for the whole group were 31.44 (SD = 4.11), 13.48 (SD = 6.43), and 13.11 (SD = 6.47), respectively. Males performed faster (U (103) = 974, p = .021, r rb = .26) and more efficiently (U (103) = 984; p = .025, r rb = .26) than females in copying, and participants with higher education performed faster (U (103) = 887, p = .004, r rb = .33). There were no observable effects on completion time for either immediate or delayed recall.
Women were less efficient than men for both immediate and delayed recall (t (99) = −3.70, p < .001, d = −.74; t (88) = −3.54, p < .001, d = −.75). Participants with higher education outperformed those with lower education (t (99) = 2.04, p = .044, d = .41; t (88) = 2.40, p = .018, d = .51).
Attention and executive function
Fluency tasks
The group’s mean scores for literal fluency and semantic fluency were 17.85 (SD = 6.60) and 21.53 (SD = 6.86), respectively. On the literal fluency task, an effect of education was found for the whole group (U (107) = 1002, p = .008, r rb = .30) and, within the gender subgroups, only for males (U (51) = 204, p = .024, r rb = .37) (see Table 11 for details). Participants with a lower level of education, especially men, produced fewer words beginning with the letter P. Most participants did not have any intrusions or item repetitions, with a median of 0 for both variables, regardless of gender or level of education.
In the semantic verbal fluency task, participants with lower education produced fewer animal words (t (105) = 2.68, p = .009, d = .52), and this effect was only observed in the female subgroup (t (54) = 2.01, p = .049, d = .54). No intrusions or item repetitions were found for most participants (with a median of 0 for these two variables).
No gender effect was found for either test, for the whole group or when considering education.
Stroop test
The mean scores for naming, reading, and interference tasks for the entire group were 78.43 (SD = 19.17), 52.28 (SD = 9.29), and 171.34 (SD = 48.68), respectively. Table 12 shows that there were few significant differences between the subgroups, regardless of the variable studied (gender or education level). Nevertheless, the difference in execution times between the interference task and the naming task was more pronounced for women than men, regardless of their level of education (U (94) = 797, p = .020, r rb = .28).
The majority of participants did not make any errors in the naming and reading subtest, while in the interference subtest, most participants made zero or one error (see Appendix 3).
Trail making test
Participants with lower levels of education completed Part A (M = 62.73, SD = 21.28 for the whole group) and Part B (M = 169.68, SD = 70.39 for the whole group) more slowly (U (102) = 896, p = .007, r rb = .31; U (95) = 769, p = .008, r rb = .32, respectively). This effect of educational level was not found for Part A when considering the male and female subgroups separately. For Part B, it was found only for the female subgroup (U (48) = 158, p = .008, r rb = .45). Similarly, when looking at the difference in time between the two parts, a longer time to completion was found for women with a lower level of education (U (48) = 165, p = .011, r rb = .43).
No significant differences were found between the two male subgroups based on their level of education.
Part A had a low error rate, with 93% of participants making no errors. In Part B, participants with a higher level of education made an average of 0.7 errors, with 57% making no errors. In contrast, those with a lower level of education made an average of 1.4 errors, with only 38% making no errors.
Detailed results are given in Table 13.
Digit span
The mean total score for the entire group was 20.69 (SD = 4.48). No significant differences were found between the subgroups based on gender or education for the Digit Span Forward and Sequencing Digit Span memory tasks in either scores or span (see Table 14 for scores and Appendix 4 for span). Participants with a higher level of education, particularly women, performed significantly better on two measures. The Digit Span Backward subtest showed significant results for the entire group (U (106) = 1019, p = .013, r rb = .27) and for the female subgroup (U (55) = 248, p = .030, r rb = .34). The Total Score showed consistent results for the entire group (t (104) = 2.54, p = .012) and for the female subgroup (U (55) = 215, p = .007, r rb = .43).
Speed of processing
Coding test performance (M = 39.03; SD = 11.35 for the whole group) was significantly better for participants with a higher level of education than for those with a lower level of education (U (103) = 795, p < .001, r rb = .40), and this was true for both males (U (50) = 199, p = .031, r rb = .36) and females (t (51) = 3.03, p = .004, d = .84). No significant differences were found between the performance of men and women, regardless of their level of education (see Table 15 for details). Coding errors were rare, with no errors for 79% of the subjects and only one error for 11%.
Gestural praxis
The whole group performed well on the following three subtests, as shown in Table 16: Symbolic Gestures (M = 4.71, SD = 0.58), Action Mimes (M = 9.38, SD = 0.99) and Meaningless Gestures (M = 7.28, SD = 1.04). No significant differences were found between the different subgroups on the Symbolic and Meaningless Gestures subtests. On the Action Mimes subtest, women performed slightly worse than men (U (107) = 1020; p = .003, r rb = .29). This trend was only found for participants with a higher level of education (U (52) = 223, p = .010, r rb = .34).
Discussion
The aim of this study was to provide updated norms, adapted to the oldest old people, for neuropsychological tests frequently used by French clinicians, in conditions close to the classical clinical assessment. We found that although some studies had developed norms for the old French population, the tests were not always adapted to the oldest old individuals (Roussel & Godefroy, 2008) or to the general population (Pérès et al., 2012), or were no longer used in clinical practice (Giulioli et al., 2016).
Our study assessed various domains by administering tests commonly used in current clinical practice in France. We analyzed the impact of gender and education on each test by dividing the participants into four subgroups based on these variables. The majority of the tests demonstrated significant differences in performance depending on gender and/or level of education, confirming the importance of these two factors in establishing normative data. Furthermore, the distinction between two levels of education (primary school level vs. secondary school level) seems to be relevant for the majority of the tests. The study found that participants with a lower level of education scored lower or completed tests more slowly than their more educated peers, confirming previous studies showing an effect of education level on cognitive performance in aging (Fletcher et al., 2021; Grasset et al., 2018; Opdebeeck et al., 2015). The observed differences between genders were minimal. Women outperformed men in verbal episodic memory while men performed better on praxis and visual construction tasks, which is consistent with previous studies (McCarrey et al., 2016; Munro et al., 2012; Proust-Lima et al., 2008), but no gender effect was observed on other tests. As the majority of participants were aged between 82 and 84, it was not possible to conduct a subgroup analysis that included age in addition to gender and level of education.
However, we analyzed the correlation between age and performance for each subgroup, divided by gender and level of education, to determine whether age had an impact on performance on the subtests. The significant correlations are presented in Table 17. An age-related effect was observed mainly in the groups of men and women with a higher level of education, particularly on certain episodic memory subtests. The results suggest that, among participants with a higher level of education, age has a moderate effect on success in certain tests or on the time taken to complete them.
Note. * p < .05, ** p < .01, *** p < .001.
In the study, the one-sample Wilcoxon signed-rank test was used to compare the performance of the participants with the theoretical averages derived from existing standards in the elderly population. The participants’ performance was similar to existing standards in the domains of verbal literal fluency (Roussel & Godefroy, 2008), visual episodic memory (Barbeau et al., 2004), and praxis (Mahieux-Laurent et al., 2009). However, there seemed to be a strong ceiling effect for the proposed tests in the last two domains. This effect could mask differences when controlling for variables such as gender or level of education. Verbal episodic memory performance, on the other hand, was below the norms of Van der Linden et al. (2004), probably due to the wider age range (over 74 vs. over 82 in our study). Additionally, differences were found in executive functions. Specifically, the FIBRATLAS cohort performed worse on the verbal semantic fluency task compared to the results of Roussel and Godefroy (2008) or Cardebat et al. (1990). Our cohort also exhibited slower performance on complex tasks compared to the established norms in previous studies. This could be attributed to the fact that existing norms are based on a population with a wider age range, such as the group of individuals over 60 in GREFEX. Differences between our study and previous ones were observed through a qualitative analysis, particularly in relation to the pathological thresholds corresponding to the tenth percentile. Our study population had higher thresholds than those of similar age groups in the studies conducted by Amieva et al. (2007), Kalafat et al. (2003), and Ferreira et al. (2010) in terms of global cognitive functions and episodic memory retrieval process.
If we compare the standards established in our study for the four subgroups defined by gender and educational level, we still find some differences with previous studies. To illustrate, consider the FCSRT-Fr score of an 86-year-old male with a primary school education. The regression equation proposed in the French princeps study (Van der Linden et al., Reference Van der Linden, Coyette, Poitrenaud, Kalafat, Calicis and Adam2004) would yield an expected free recalls score of 32/48. In contrast, our study yielded a score of 20.58. If we consider the standards and the same groups as those presented by Amieva et al. (Reference Amieva, Carcaillon, Rouze L’Alzit-Schuermans, Millet, Dartigues and Fabrigoule2007), the expected score for free recalls is 17/48, while in our cohort it is 22. Another illustrative example is the RCFT, with the same imaginary case. The mean expected score on the copy task is 30/36 according to the standards of Fastenau et al. (Reference Fastenau, Denburg and Hufford1999), while in our study it is 33/36. However, the cutoff score (−2 SD) is 27.64/36 for Fastenau et al. (Reference Fastenau, Denburg and Hufford1999), whereas it is 21.1/36 in our cohort.
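The RCFT comparison can be made concrete with a small worked calculation. The sketch below back-calculates the standard deviation implied by each cutoff, assuming that the cutoff lies exactly 2 SD below the mean, as the "−2 SD" label suggests. The means and cutoffs are the values quoted above; the function names and the raw score of 25 are hypothetical and serve only as an illustration.

```python
def implied_sd(mean, cutoff, k=2.0):
    """SD implied by a cutoff placed k SDs below the mean (assumption)."""
    return (mean - cutoff) / k


def is_below_cutoff(score, cutoff):
    """True when a raw score falls under the pathological cutoff."""
    return score < cutoff


# RCFT copy task values quoted in the text (scores out of 36).
norms = {
    "Fastenau et al. (1999)": {"mean": 30.0, "cutoff": 27.64},
    "present cohort": {"mean": 33.0, "cutoff": 21.1},
}

# Hypothetical raw score, chosen only to illustrate the discrepancy.
hypothetical_score = 25.0

for name, n in norms.items():
    sd = implied_sd(n["mean"], n["cutoff"])
    flagged = is_below_cutoff(hypothetical_score, n["cutoff"])
    print(f"{name}: implied SD = {sd:.2f}, "
          f"score {hypothetical_score:.0f} below cutoff: {flagged}")
```

Under this assumption, the implied dispersion is about five times larger in the present cohort (roughly 5.95 vs. 1.18 points), so the same raw score can fall below one cutoff and above the other despite the similar means.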
This study has several strengths. Firstly, we propose tests to assess a wide range of cognitive functions commonly tested in older people under assessment conditions close to clinical practice. In addition, we provide detailed normative information in percentiles, controlling for gender and educational level. We have also emphasized the importance of using normative data for a more restricted age range in older age.
However, it is important to acknowledge the limitations of our study. The sample size was relatively small, with only 107 participants. Furthermore, the uneven age distribution of the participants, with 82.2% of the population aged between 82 and 87, prevented the creation of subgroups based on age, gender, and education. Providing information on the participants’ socioeconomic status, in addition to their educational background, would have offered valuable insights, as recently suggested by Migeot et al. (Reference Migeot, Calivar, Granchetti, Ibáñez and Fittipaldi2022). Additionally, the profile of the participants in the FIBRATLAS cohort who chose to donate their bodies after death may limit the study. These individuals may be particularly concerned about the importance of medical and scientific research. Therefore, they may not be representative of the general population.
Regarding the tests, only a few scores were found to follow a normal distribution: free recalls for the FCSRT-Fr; semantic verbal fluency; digit span total score; immediate and delayed recall for the RCFT; and coding score. Ceiling effects were observed for some of the subcomponents of five tests, indicating a lack of sensitivity. These tests include the DO 80, with a median score close to the maximum score and a high cutoff score; the FCSRT-Fr for immediate recall, total recalls, recognition, and delayed total recalls; and the MLB and DMS-48 for all subtests. The use of normative data in percentiles is therefore necessary in clinical practice.
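The reason percentiles are preferable under a ceiling effect can be sketched with a small example. The scores below are invented for illustration (a hypothetical 48-point subtest on which most participants reach or approach the maximum) and are not data from the FIBRATLAS cohort. The sketch compares a mean − 2 SD cutoff with the tenth percentile, using Python's standard `statistics` module with its default quantile method.

```python
import statistics

MAX_SCORE = 48  # hypothetical maximum of the subtest

# Invented, ceiling-skewed scores: most participants at or near the maximum.
scores = [48] * 12 + [47] * 4 + [46] * 2 + [40, 30]

mean = statistics.mean(scores)
sd = statistics.stdev(scores)  # sample standard deviation

# Parametric bounds, which assume a roughly normal distribution.
sd_cutoff = mean - 2 * sd
upper_bound = mean + 2 * sd

# Tenth percentile: first of the nine cut points dividing the data in ten.
p10 = statistics.quantiles(scores, n=10)[0]

print(f"mean - 2 SD cutoff: {sd_cutoff:.2f}")
print(f"mean + 2 SD bound:  {upper_bound:.2f} (maximum is {MAX_SCORE})")
print(f"10th percentile:    {p10:.2f}")
```

With such skewed data, the mean + 2 SD bound exceeds the maximum possible score, a sign that the normal model does not fit, and the mean − 2 SD cutoff no longer matches the tenth percentile. A percentile threshold depends only on the rank order of the observed scores and therefore remains interpretable.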
Our study focused on assessing memory, processing speed, and executive functions, which are typically affected by cognitive aging. A more detailed assessment of language could be proposed, focusing on mnemonic and executive components including comprehension of complex instructions, sentence concatenation, rapid naming, and sentence repetition.
In conclusion, this study aims to contribute to the development of norms for the oldest old population based on gender and level of education. The study focuses on tests commonly used in clinical practice in France, under conditions similar to those of a neuropsychological assessment. However, caution must be exercised when interpreting the results, especially for subjects aged over 90, due to the small size of the subgroups and the uneven age distribution of the participants. To establish reliable norms, it would be valuable to extend these results to a larger population of oldest old adults and to compare them with those of young-old adults.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/S1355617724000390
Acknowledgements
The authors do not have any conflicts of interest to declare. This work was supported by the National Research Agency (ANR-14-CE17-0015). The data used in this article were provided by the FIBRATLAS project. The researchers listed in the FIBRATLAS consortium (see full list with affiliations in the supplementary material) designed and implemented the FIBRATLAS project and/or provided data, but not all were involved in the analysis or writing of this manuscript.
The authors thank the participants and the testers of the FIBRATLAS Project.