Introduction
Dementia is a pressing global issue, affecting 55 million people worldwide, with Alzheimer’s disease (AD) accounting for 60–70% of dementia cases (WHO, 2023). In the United States alone, approximately 6.7 million individuals have been diagnosed with AD (Alzheimer’s Association, 2024). Notably, Black Americans are almost twice as likely as their White counterparts to receive an AD diagnosis (Weuve et al., 2018), and this discrepancy is anticipated to rise. A systematic review of the prevalence of mild cognitive impairment (MCI) suggests that approximately 15.6% of people worldwide meet criteria for MCI, with 10.0% meeting criteria for amnestic MCI (aMCI; Bai et al., 2022). aMCI confers an increased risk for the later development of AD. Traditional in-person neuropsychological assessments have been the gold standard for diagnosing MCI and dementia. Computerized assessments that do not require a neuropsychologist to administer may offer a unique opportunity to extend care to underserved populations, including minorities and those in rural areas who may not be able to easily access in-person services, while also expanding scoring paradigms to consider timing and accuracy-by-timing calculations. However, these computerized assessments might also exacerbate health disparities based on socioeconomic status, access to technology/stable internet connection, and lower overall computer literacy. This is of particular concern in older adults who may not be as comfortable with technology. Still, it is crucial to develop and validate computerized neuropsychological measures, particularly with a focus on aging populations and racial minority populations.
This paper aims to offer psychometric support for the use of the tablet-administered National Institutes of Health (NIH) Toolbox Cognition Battery (NIHTB-CB), a computerized assessment tool for crystallized and fluid cognition, within the diverse and extensive sample of the Advancing Reliable Measurement in Alzheimer’s Disease and cognitive Aging (ARMADA) study.
Because advanced age increases the risk for the development of MCI and dementia, particularly AD, validating measures of cognitive functioning for use in adults over the age of 85 is especially important, as this group has over a 33% risk of developing AD (Rajan et al., 2021). Additionally, research has identified neural dedifferentiation (decreased specialization of brain regions and networks activated during different cognitive tasks; Koen et al., 2020) and cognitive dedifferentiation (i.e., the increased reliance on generalized intelligence to complete disparate cognitive tasks) as factors contributing to abnormal cognitive decline even after age, sex, and education are accounted for (Wallert et al., 2021). Still, the relationship between age and both cognitive dedifferentiation and neural dedifferentiation suggests that the factor structure of our neuropsychological instruments may be different or possibly less discrete (i.e., fewer factors or more cross-loadings) for adults over the age of 85 as compared to the factor structure in older adults less than 85 years of age.
The NIHTB-CB is a module within the larger computerized NIH Toolbox for Assessment of Neurological and Behavioral Function and was developed to establish a common metric for cross-study comparisons of crystallized and fluid cognition (Gershon et al., 2013; Weintraub et al., 2014). While the computerized administration of the NIHTB-CB may enhance the accessibility of neuropsychological assessment for underserved populations, the validity of these scores must be established to determine clinical and research utility.
Prior studies evaluating the NIHTB-CB have demonstrated validity and clinical utility across diverse populations, including adults aged 20–85, healthy older adults aged 60–80 (Scott et al., 2019), children aged 3–15 (Weintraub et al., 2013), stroke patients (Carlozzi et al., 2017), and individuals with MCI and dementia (Hackett et al., 2018). Research has also shown validity for older adults up to age 85, providing evidence of convergent validity of NIHTB-CB index scores with gold standard instruments of cognitive functioning, excellent sensitivity to age-related cognitive changes, and excellent test-retest reliability (Parsey et al., 2021; Zelazo et al., 2014). The NIHTB-CB has demonstrated external validity through strong correlations with factors such as self-reported school difficulties, health status, and disability status (Heaton et al., 2014). However, the measure has not been validated for use in adults over the age of 85.
Regarding the validity of the NIHTB-CB for use in the Black American population, some studies support its use as an assessment of premorbid IQ in Black older adults with and without MCI (Halter et al., 2024). Other studies advocate for the development and use of demographically corrected norms for nonwhite patients (Flores et al., 2017). Though these demographically corrected norms would likely be most appropriate for clinical/diagnostic uses of the NIHTB-CB and would not be necessary in every case, these studies underscore the need for continued research into the validity of the NIHTB-CB for nonwhite patients.
The two-factor (Crystallized and Fluid) structure of the NIHTB-CB proposed in Weintraub et al. (2013) has been investigated in prior studies and has been shown to have good model fit for adults over the age of 65, across diagnostic groups (cognitively unimpaired vs. dementia/aMCI), across sex (male vs. female), and across majority and minority racial groups (underrepresented vs. non-underrepresented groups; Ma et al., 2021). The same study also showed some support for the invariance of this factor structure across diagnosis and demographic factors (i.e., sex, race, and education; Ma et al., 2021). However, a study investigating the factor structure of the NIHTB-CB in cognitively normal older adults over the age of 85 showed support for a six-factor structure (vocabulary, reading, memory, working memory, executive functioning, and speed) rather than the proposed two-factor structure or other structure models (Nolin et al., 2023). An evaluation of the NIHTB-CB factor structure in adults (aged 18–84) with and without acquired brain injury found support for a five-factor model (vocabulary, reading, episodic memory, working memory, and executive functioning/processing speed; Tulsky et al., 2017). This variability underscores the need for continued evaluation of the NIHTB-CB factor structure in different samples.
This study aims to contribute to the existing literature by examining the two-factor structure of the NIHTB-CB (Weintraub et al., 2013) across different stages of cognitive impairment, specifically Cognitively Normal (CN) and aMCI, and in different populations, with special attention to Black American populations and older adults over the age of 85, using the ARMADA dataset. In line with prior literature (Ma et al., 2021), we hypothesized that a two-factor structure (Crystallized and Fluid) would emerge and would remain stable across racial groups, diagnostic groups, and in the population over age 85.
Methods
This study was conducted in accordance with the Helsinki Declaration and the requirements of the Institutional Review Boards of the University of Michigan Medical School.
Participants
A total of 503 community-dwelling older adults were recruited from 2018 to 2022 as part of the ARMADA study. Data were collected across nine sites, with most participants in this study co-enrolled in Alzheimer’s Disease Research Centers (ADRCs). Informed consent was obtained from all participants at the appropriate ADRCs. Participants underwent a comprehensive evaluation using the NACC UDS-3, involving multidomain medical, neurological, social, and neuropsychological assessments. Participants received diagnoses of CN or aMCI through consensus conferences based on NACC criteria (Rahman-Filipiak et al., 2022; Weintraub et al., 2009, 2018) at their specific ADRCs. After receiving a consensus diagnosis, participants completed the tablet-based NIHTB-CB. The average time between NACC evaluation/consensus diagnosis and NIHTB-CB administration was 130 days. Only the baseline NIHTB uncorrected standard score data were used in the analyses reported here, along with consideration of appropriate adjustments based on ARMADA demographic values.
Our sample consisted of two diagnostic groups: 367 CN and 136 participants diagnosed with aMCI. The sample did not include participants with Alzheimer’s disease (AD), as sufficient samples of Black participants were not available. The sample was approximately 59% female and 30% Black. Ages ranged from 65 to 99 years, with an average age of 77.26 (SD = 7.8). Education was between 8 and 20 years, with a mean of 16.3 years (SD = 2.5). Sample demographics by racial and demographic group can be found in Table 1.
Measures
NIH Toolbox-Cognition Battery
The NIHTB-CB is a performance-based mobile application (NIH Toolbox v.2, Toolbox Assessments, Chicago, IL) designed to assess cognitive functioning using an iPad tablet. Internet connectivity is not required for NIHTB-CB administration or scoring. The battery has seven subtests and was designed to be completed within 30 minutes. Two subtests measure crystallized cognition through language-dependent abilities (Oral Reading Recognition [ORR] and Picture Vocabulary [PV]). The remaining five measure fluid cognition: executive function (Dimensional Change Card Sort [DCCS]), attention and executive ability (Flanker Inhibitory Control and Attention [Flanker]), working memory (List Sorting Working Memory [LSWM]), processing speed (Pattern Comparison Processing Speed [PC]), and episodic memory (Picture Sequence Memory [PSM]). Performance is measured with either unadjusted normative scale scores (standard scores) or fully corrected (age and education adjusted) normative scores (T scores). Specific test details, procedures, and extensive psychometrics are found in Weintraub et al. (2013).
Statistical analyses
A series of two-factor confirmatory factor analyses (CFAs) was used to investigate the factor structure of the NIHTB-CB in the entire sample and separately by race (White vs. Black), by diagnosis (aMCI vs. CN), and in participants older than 85. Additional CFAs were run investigating the factor structure in Black and White participants diagnosed as CN vs. aMCI. Following the recommendations of the NIHTB-CB developers, uncorrected standard scores were used in these analyses.
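As a rough illustration of the model specification described above, the following Python sketch encodes the hypothesized two-factor structure, renders it in lavaan-style model syntax, and counts the degrees of freedom of the resulting model. The names MODEL, to_lavaan_syntax, and cfa_degrees_of_freedom are hypothetical, and the identification choices (factor variances fixed to 1, no cross-loadings, uncorrelated residuals) are assumptions made for this sketch; the paper does not report the software, estimator, or identification constraints that were used.

```python
# Hypothetical encoding of the hypothesized two-factor NIHTB-CB model.
# Subtest abbreviations follow the Measures section of the paper.
MODEL = {
    "Crystallized": ["PV", "ORR"],
    "Fluid": ["Flanker", "DCCS", "PC", "LSWM", "PSM"],
}


def to_lavaan_syntax(model):
    """Render the model as lavaan-style text, one factor per line."""
    return "\n".join(
        f"{factor} =~ {' + '.join(indicators)}"
        for factor, indicators in model.items()
    )


def cfa_degrees_of_freedom(model):
    """Degrees of freedom for a simple-structure CFA with correlated factors.

    Assumptions of this sketch (not stated in the paper): factor variances
    are fixed to 1, each indicator loads on exactly one factor, and
    residuals are uncorrelated.
    """
    n_factors = len(model)
    n_indicators = sum(len(indicators) for indicators in model.values())
    # Unique variances and covariances among the observed scores.
    n_moments = n_indicators * (n_indicators + 1) // 2
    # Free parameters: one loading and one residual variance per indicator,
    # plus one covariance per pair of factors.
    n_free = 2 * n_indicators + n_factors * (n_factors - 1) // 2
    return n_moments - n_free


if __name__ == "__main__":
    print(to_lavaan_syntax(MODEL))
    print("df =", cfa_degrees_of_freedom(MODEL))
```

Under these assumptions, seven indicators give 28 unique variances and covariances and 15 free parameters, so the model has 13 degrees of freedom. The two-indicator Crystallized factor is identified here only because the two factors are allowed to correlate.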
Results
Factor analyses
A series of CFAs was conducted to examine the factor structure of the NIHTB-CB for the entire sample, participants diagnosed with aMCI, and participants over the age of 85. Additional factor analyses were conducted to investigate variations in factor structure by race (all Black vs. all White participants) and for Black participants across diagnostic classifications (CN and aMCI). Each group showed a two-factor model with relatively good fit (CFI/TLI > 0.95; RMSEA < 0.07; Browne & Cudeck, 1993; Hu & Bentler, 1999; Jöreskog & Sörbom, 1993) and factor loadings greater than 0.4 (Stevens, 1992). For detailed information on model fit by each subsample, please see Table 2. Nearly all factor analyses substantiated the two anticipated factors: Fluid and Crystallized. The Fluid factor included Flanker, DCCS, PC, LSWM, and PSM, while the Crystallized factor included PV and ORR. This factor structure was also replicated for participants over the age of 85.
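For readers less familiar with the fit indices cited above, the following Python sketch shows how CFI, TLI, and RMSEA are conventionally computed from the chi-square statistics of the fitted model and of the baseline (independence) model, and how the cutoffs reported above could be checked. The function names are hypothetical, the RMSEA formula uses the N - 1 convention (some software uses N), and the input values in the usage example are invented for illustration; they are not results from this study.

```python
import math


def fit_indices(chi2_model, df_model, chi2_base, df_base, n):
    """Compute CFI, TLI, and RMSEA from model and baseline chi-squares."""
    # Estimated noncentrality of each model, floored at zero.
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_base - df_base, 0.0)

    # RMSEA: misfit per degree of freedom, scaled by sample size.
    rmsea = math.sqrt(d_model / (df_model * (n - 1)))

    # CFI: proportional reduction in noncentrality relative to baseline.
    denom = max(d_model, d_base)
    cfi = 1.0 if denom == 0 else 1.0 - d_model / denom

    # TLI: compares chi-square/df ratios, which penalizes model complexity.
    ratio_base = chi2_base / df_base
    ratio_model = chi2_model / df_model
    tli = (ratio_base - ratio_model) / (ratio_base - 1.0)

    return {"cfi": cfi, "tli": tli, "rmsea": rmsea}


def meets_cutoffs(fit, cfi_tli_min=0.95, rmsea_max=0.07):
    """Check the cutoffs cited in the text (CFI/TLI > 0.95, RMSEA < 0.07)."""
    return (
        fit["cfi"] > cfi_tli_min
        and fit["tli"] > cfi_tli_min
        and fit["rmsea"] < rmsea_max
    )


if __name__ == "__main__":
    # Invented values for illustration only, not taken from Table 2.
    example = fit_indices(
        chi2_model=26.0, df_model=13, chi2_base=900.0, df_base=21, n=503
    )
    print(example, meets_cutoffs(example))
```

Because TLI and RMSEA both depend on the model degrees of freedom, two models with the same chi-square can differ on these indices if one estimates more parameters than the other.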
In the full subsample of Black participants, a two-factor model also emerged; however, PSM loaded onto the Crystallized factor rather than the Fluid factor. Similar deviations in factor loadings were observed when examining the factor structure in Black CN participants, where LSWM and PSM loaded onto the Crystallized factor rather than the Fluid factor. Interestingly, in Black participants with aMCI, the original factor structure and loading pattern were successfully replicated. It is also important to note that, of all NIHTB-CB measures, PSM followed by LSWM showed the lowest factor loadings in all subsequent samples. See Table 2 for all factor loadings by analysis.
Discussion
Our study successfully replicated the proposed two-factor structure of the NIHTB-CB across various subgroups, including the full sample, Black and White participants, individuals over the age of 85, and participants diagnosed as CN or with aMCI. These results offer psychometric support for the validity of the NIHTB-CB in these diverse populations. However, when examining the factor structure of the NIHTB-CB in Black participants identified as CN, a slight deviation was observed compared to White participants, with LSWM and PSM loading onto the Crystallized factor rather than the Fluid factor. Interestingly, this difference was not evident when exploring the factor structure for Black participants diagnosed with MCI. This pattern suggests that cultural differences may affect the way CN Black participants interact with NIHTB-CB measures (possibly increased use of verbal mediation or verbal processing strategies) and that these differences are then attenuated by cognitive impairment (i.e., MCI). This finding also suggests potential instability in the factor structure of the NIHTB-CB across different levels of cognitive impairment for Black participants. Furthermore, considering that 10–15% of participants diagnosed with MCI progress to dementia (McGrattan et al., 2022), the generally lower factor loading of PSM in all analyses and the differences in factor structure in Black participants across diagnostic categories over time suggest that the Fluid composite may not be as stable as would be expected within a given individual as cognitive impairment progresses.
Clinicians and researchers should be mindful of this potential instability in the factor structure across different levels of cognitive impairment, especially when using the NIHTB-CB to assess cognitive decline over time in Black individuals (particularly decline in memory compared to executive functioning or other fluid measures). Although it is unlikely to strongly influence diagnostic conclusions, the fact that the relationship between PSM and other fluid measures changes across cognitive diagnoses is an important factor to consider when interpreting patterns of cognitive weakness (i.e., aMCI vs. MCI). As memory is not considered a crystallized ability and is sensitive to cognitive change, the fact that PSM loaded onto the Crystallized factor also raises some concern for the stability of the NIHTB-CB Crystallized composite. Indeed, previous studies identified that measures of crystallized intelligence (ORR and PV) were sensitive in differentiating Black participants with MCI from healthy controls when compared to memory and executive measures (Kairys et al., 2022; Rigby et al., 2024). Taken together, these findings further suggest that NIHTB-CB crystallized measures may be differentially related to cognitive decline and may not represent a cognitive “hold” factor as previously thought (Jutten et al., 2024), particularly in different racial groups.
This study is among the first to explore the factor structure of the NIHTB-CB with regard to adults over the age of 85 and Black participants. Given the higher risk of developing AD among Black older adults (Weuve et al., 2018), validating easily accessible neuropsychological assessments is crucial for early detection of AD and for increasing access to neuropsychological assessment and diagnosis in a historically underserved population. The present findings contribute to research evaluating the utility of the NIHTB-CB in diverse populations.
However, this study is not without limitations. While the ARMADA study enables the collection and comparison of cognitive assessment data over time, we did not specifically investigate the temporal stability of the NIHTB-CB factor structure. Based on our findings, a more thorough investigation of the factor invariance of the NIHTB-CB for different demographic groups and diagnostic categories is also warranted. These represent important avenues for future research, especially in light of observed differences in factor structure between Black participants classified as CN and those with MCI. MCI participants in our study were of the amnestic type, as this confers the highest risk for development of Alzheimer’s disease. However, it is worth noting that non-amnestic MCI may not follow the same pattern as aMCI, and further research should evaluate differences in factor structure in these populations. It is also important to note that the sample gathered through the ARMADA study is highly educated compared to the general population, and these findings may not generalize to populations with lower levels of education. Despite these limitations, our study provides robust validity coefficients, supporting the overall utility of the NIHTB-CB and contributing valuable insights to the ongoing exploration of its validity.
Funding statement
This project was supported by the ‘Advancing Reliable Measurement in Alzheimer’s Disease and cognitive Aging’ project (ARMADA), U2CAG057441, sponsored by the NIA (MPIs Richard Gershon and Sandra Weintraub); NIA/NIH grant R01AG068338 (MPIs Giordani, Persad, Murphey); NIA/NIH grant P30AG053760; and NIA/NIH grant P30AG072931 (PI, Henry Paulson).
Competing interests
The authors have no conflicts of interest to disclose.