Criteria for amnestic mild cognitive impairment (aMCI) have been devised in an attempt to capture the preclinical phase of Alzheimer's dementia. There is evidence to suggest, however, that individuals with aMCI according to the current criteria comprise a heterogeneous group, Reference Blossom, Matthews, McKeith, Bond and Brayne1,Reference Chertkow, Nasreddine, Joanette, Drolet, Kirk and Massoud2 some of whom will progress to dementia with time, while others will not. Reference Visser, Kester, Jolles and Verhey3 To maximise the diagnostic value of aMCI, criteria should identify a homogeneous group of people with preclinical dementia. To this end, cognitive criteria should be defined in a manner that reflects our current knowledge of their predictive value. As the prodromal phase of Alzheimer's disease is likely to extend beyond a 20-year period, Reference Amieva, Jacqmin-Gadda, Orgogozo, Le Carret, Helmer and Letenneur4 studies with shorter follow-up periods are liable to underestimate the risk of conversion. In fact, only 2 of 14 clinic-based longitudinal aMCI studies report follow-up periods beyond 3 years (online Table DS1). Reference Fox, Warrington, Seiffer, Agnew and Rossor5,Reference Tabert, Manly, Liu, Pelton, Rosenblum and Jacobs6 Moreover, low baseline levels of general cognitive functioning among participants with aMCI lead to a greater chance of neuropsychological tasks predicting dementia, because of the more advanced stage of disease in the cohort with aMCI. In 11 of 14 clinic-based longitudinal studies average baseline Mini-Mental State Examination (MMSE) Reference Folstein, Folstein and McHugh7 scores for the group with mild cognitive impairment who converted to dementia (aMCI converters) fell below the higher screening cut-off for dementia (i.e. 27/30). 
In these studies, an underlying dementia might well have been suspected on the basis of such rudimentary cognitive screening instruments alone, raising the question of the ‘added value’ of a fuller cognitive evaluation. The comprehensiveness of the neuropsychological battery employed and the appropriateness (on both theoretical and empirical grounds) of test selection might also be expected to influence predictive power. For example, the combination of the Paired Associate Learning test (PAL), Reference Blackwell, Sahakian, Vesey, Semple, Robbins and Hodges8 age and the Graded Naming Test (GNT) Reference McKenna and Warrington9 gives an overall classification accuracy of 100% over a 2.5-year follow-up interval. Reference Blackwell, Sahakian, Vesey, Semple, Robbins and Hodges8 The same classification accuracy (i.e. 100%) has been reported Reference Ahmed, Mitchell, Arnold, Nestor and Hodges10 for the use of the PAL in combination with the Addenbrooke's Cognitive Examination (ACE) Reference Mathuranath, Nestor, Berrios, Rakowicz and Hodges11 and the Graded Faces Test (GFT) Reference Thompson, Graham, Patterson, Sahakian and Hodges12 over shorter (1 year) intervals. However, these findings have not been replicated outside the test authors' group, in larger numbers of people with aMCI, across follow-up periods extending beyond 2.5 years, and where the mean general level of cognitive functioning at baseline (as indicated by performance on cognitive screening) falls above cut-off points for dementia. If replicable, such measures could be used in the neuropsychological assessment of aMCI within specialist memory clinic settings to provide information about differential diagnosis and prognosis.
Measurement of cognitive function represents just one, albeit an important, approach to detecting and diagnosing Alzheimer's disease at a very early and preclinical stage. Other work has looked at the ability of imaging (for the most part magnetic resonance imaging (MRI)), biomarkers (i.e. total tau, beta amyloid 42 and phosphorylated tau) and behavioural changes to predict the future onset of clinically diagnosable Alzheimer's disease. A recent meta-analysis of imaging and biomarkers for Alzheimer's disease Reference Schmand, Huizenga and van Gool13 indicated some promise for the cerebrospinal fluid (CSF) markers in so far as their overall predictive accuracy levels were similar to that of memory impairment 4 years prior to the point of diagnosis. Furthermore, the effect sizes for the CSF markers were largest when assessed longer before the point of diagnosis. However, atrophy of the hippocampus or other medial temporal lobe structures was found to be a less accurate predictor of future Alzheimer's disease than memory impairment. The largest effect sizes were seen in association with measures of delayed memory recall; these are themselves likely to be underestimates, because the inclusion of memory impairment as a selection criterion in a majority of studies removes variability.
Here we present a detailed neuropsychological, clinic-based cohort study, with an average of 4 years' follow-up from baseline neuropsychological assessment until final review. It is the only study to date of people who can be classified as high-functioning aMCI converters (i.e. MMSE >27/30) extending beyond 3 years' follow-up. Furthermore, it represents the first clinic-based study to investigate the robustness of the GNT, GFT and a combination of the ACE and PAL as predictors of conversion to dementia outside the original authors' research group, Reference Mathuranath, Nestor, Berrios, Rakowicz and Hodges11 and to report the detailed fate of people who are aMCI non-converters in terms of their course of cognitive impairment.
Method
Recruitment
Participants were recruited from the Edinburgh Older Adult Neuropsychology Service, which takes all tertiary referrals of people aged over 60 years from geriatricians and old age psychiatrists in the Lothian Region of Scotland. As there is no substantial private sector, these National Health Service (NHS) referrals are likely to be representative of individuals with memory complaints attending their doctor. During the study period from September 2004 to September 2007, 71 people were referred from old age psychiatry and 16 from geriatric medicine. Of these 87, 41 did not respond to the invitation to attend or refused to participate in the study, leaving 46 participants. Further details regarding the demographic characteristics of these individuals may be found in a previous publication. Reference Lonie, Herrmann, Donaghey and Ebmeier14
Procedure
Individuals who fulfilled criteria for aMCI Reference Petersen, Thomas, Grundman, Bennett, Doody and Ferris15 (objective cognitive impairment was defined by a performance of 1 standard deviation or more below age means on two or more measures assessing a single cognitive domain) undertook an extensive battery of neuropsychological measures at baseline and were followed up annually, regardless of whether or not they received a clinical diagnosis of dementia during the course of the study, over an average 4-year period. A 1 standard deviation cut-off point on two or more episodic memory measures was used, in place of the more commonly applied 1.5 standard deviations on one or more measures, in an attempt to minimise the likelihood of including participants with an unstable aMCI diagnosis, as well as to maximise sensitivity to memory deficits within our sample, whose IQs were higher than those of the average participant with aMCI. At the end of the study period participants with aMCI were grouped in accordance with whether or not they had received a clinical diagnosis of dementia (as documented in their medical file) at any point subsequent to their initial study assessment.
A total of 24 age- and IQ-matched healthy elderly participants also completed the full battery of 18 neuropsychological tasks, providing a normative comparison group. Sixteen of these 24 healthy participants repeated the battery in full, an average of 28 months later. The retest data were used to establish cut-off values and criteria for further classifying the neuropsychological performance of the participants who were aMCI non-converters as ‘stable aMCI’, ‘progressive aMCI’, or ‘normal’ at the study end-point.
Materials, participants and outcome criteria
Participant characteristics (inclusion/exclusion criteria) and neuropsychological measures have been detailed previously. Reference Lonie, Herrmann, Donaghey and Ebmeier14,Reference Lonie, Herrmann, Tierney, Donaghey, O'Carroll and Lee16,Reference Lonie, Tierney, Herrmann, Donaghey, O'Carroll and Lee17 In brief, the neuropsychological battery comprised measures of premorbid IQ (National Adult Reading Test; NART), Reference Nelson and Willison18 episodic memory (PAL, Reference Blackwell, Sahakian, Vesey, Semple, Robbins and Hodges8 Hopkins Verbal Learning Test – Revised (HVLT–R), Reference Benedict, Schretlen, Groninger and Brandt19 Rey Complex Figure Test (RCFT) Reference Rey20 ), semantic memory (GFT, GNT, Category fluency; animals), visuospatial function (RCFT copy), psychomotor processing speed (Trail Making Test Part A) Reference Reitan21 and attention/executive function (Dual Task, Reference Della Sala22 Controlled Oral Word Association Test (COWAT: F, A and S); Reference Benton and Hamsher23 Trail Making Test Part B). Reference Reitan21 Amnestic mild cognitive impairment was defined in accordance with the revised criteria set out by Petersen et al. Reference Petersen, Thomas, Grundman, Bennett, Doody and Ferris15 Demographic characteristics of the aMCI converters group, the aMCI non-converters group and the normative sample group, together with their respective baseline mean performances on cognitive screening and selected neuropsychological measures, are summarised in Table 1.
Variable (maximum score) | Healthy elderly control group, mean (s.e.) | aMCI converter group, mean (s.e.) | aMCI Non-converter group, mean (s.e.) | Converter group v. non-converter group, t-test | d.f. | P |
---|---|---|---|---|---|---|
Demographic information (confounders) | ||||||
Age | 70.8 (7.8) | 76.0 (1.6) | 73.2 (5.4) | –1.53 | 42 | 0.14 |
National Adult Reading Test IQ | 118.5 (3.3) | 116.4 (2.0) | 117.4 (1.3) | 0.44 | 41 | 0.66 |
Months of follow-up | 28.0 (9.1) | 51.4 (3.0) | 52.4 (2.7) | 0.23 | 42 | 0.82 |
Cognitive screening | ||||||
Addenbrooke's Cognitive Examination, total (100) | 94.5 (3.2) | 86.6 (1.3) | 91.3 (0.9) | 2.98 | 42 | 0.006a |
Episodic memory | ||||||
PAL 6 box errorsb | 7.9 (6.7) | 23.2 (3.2) | 13.5 (2.5) | –2.32 | 42 | 0.02 |
HVLT–R delayed recall (12) | 8.1 (2.7) | 3.5 (0.7) | 5.8 (0.7) | 2.38 | 41 | 0.02 |
HVLT–R discrimination index (12) | 9.9 (1.8) | 7.2 (0.7) | 9.2 (0.4) | 2.81 | 41 | 0.008a |
Semantic memory | ||||||
Graded Naming Test (30) | 23.8 (3.1) | 20.3 (1.2) | 21.0 (0.8) | 0.56 | 39 | 0.58 |
Graded Faces Test (30) | 20.7 (3.6) | 15.3 (1.3) | 18.1 (0.9) | 1.84 | 41 | 0.08 |
Attention/executive | ||||||
Trail Making Test Part B b,c | 88.7 (30.7) | 152.7 (19.6) | 100.9 (9.8) | –2.58 | 42 | 0.014 |
Statistical analysis
Independent t-tests were conducted to compare the baseline performances of the aMCI converters group and the aMCI non-converters group on the demographic indices of age, NART full-scale IQ (FSIQ), and years of follow-up and on the neuropsychological measures of ACE total score, PAL, HVLT–R delayed recall and discrimination index (a measure of accuracy of delayed recognition), GFT, Category fluency and Trail Making Test Part B. The alpha level was adjusted to control for multiple comparisons using Holm's sequential Bonferroni correction method. Reference Holm24
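As an illustration of Holm's sequential Bonferroni procedure, the following minimal sketch applies it to the seven neuropsychological P values listed in Table 1. Treating exactly these seven comparisons as the family, with a family-wise alpha of 0.05, is an assumption made for illustration; the function name and dictionary labels are hypothetical.

```python
def holm_reject(p_values, alpha=0.05):
    """Holm's step-down procedure.

    p_values: dict mapping a label to its unadjusted P value.
    Returns a dict mapping each label to True if its null hypothesis is rejected.
    """
    ordered = sorted(p_values.items(), key=lambda item: item[1])
    m = len(ordered)
    rejected = {name: False for name in p_values}
    for rank, (name, p) in enumerate(ordered):
        # The k-th smallest P value (k = rank + 1) is compared with alpha / (m - k + 1).
        threshold = alpha / (m - rank)
        if p > threshold:
            break  # stop at the first failure; all larger P values are retained
        rejected[name] = True
    return rejected


# Unadjusted P values as printed in Table 1 (assumed family of seven comparisons).
table1_p = {
    "ACE total": 0.006,
    "PAL 6 box errors": 0.02,
    "HVLT-R delayed recall": 0.02,
    "HVLT-R discrimination index": 0.008,
    "GNT": 0.58,
    "GFT": 0.08,
    "Trail Making Test Part B": 0.014,
}

result = holm_reject(table1_p)
significant = sorted(name for name, flag in result.items() if flag)
print(significant)
```

Under these assumptions only the ACE total and the HVLT–R discrimination index remain significant, because the third-smallest value (0.014) exceeds its threshold of 0.05/5 = 0.01.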
Baseline age, time of follow-up and premorbid IQ were selected as potential confounders for their established influence on risk of developing late-onset dementia. Reference Mebane-Sims25–Reference Petersen, Smith, Waring, Ivnik and Kokmen27 Seven neuropsychological measures (ACE, PAL, HVLT–R discrimination index and delayed recall, GNT, GFT and Trail Making Test Part B) were selected from a total of 18, on the basis of their known sensitivity to aMCI relative to other measures of cognitive functions established by our own cross-sectional analyses, or of their high levels of predictive validity as established by one or more clinic-based longitudinal studies of neuropsychological predictors of dementia (online data supplement). Participants who were clinically diagnosed as having dementia at study end-point were identified. Participants who had not received a clinical diagnosis of dementia were further classified as ‘normal’ or having ‘persisting aMCI’ based on their neuropsychological profile at end-point, and ‘progressive’ or ‘non-progressive’ based on the longitudinal course of cognitive function during their years of study participation. ‘Abnormal’ neuropsychological performance was defined by a performance at the 7th centile or lower in two or more of the 18 neuropsychological tasks (this would occur by chance in approximately 1 of 22 participants with mild cognitive impairment without a diagnosis of dementia). Mild cognitive impairment decline was defined by cognitive deterioration of a magnitude seen in fewer than 2.5% of a sample of healthy elderly people over an average 28-month period on at least two measures of semantic memory or executive functioning. Cognitive domains other than episodic memory were selected for this criterion, because of the baseline floor-level performances on episodic memory tasks of many participants with aMCI.
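The centile and standard deviation cut-offs used in this study can be related to one another if test scores are assumed to be normally distributed. That assumption is ours, made for illustration; the study's cut-offs were derived from control and normative data rather than from this calculation. A minimal sketch:

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1


def centile_for_sd_cutoff(sd_below_mean):
    """Centile corresponding to a score lying sd_below_mean SDs below the mean."""
    return 100 * standard_normal.cdf(-sd_below_mean)


def sd_cutoff_for_centile(centile):
    """Number of SDs below the mean that corresponds to a given centile."""
    return -standard_normal.inv_cdf(centile / 100)


centile_1sd = centile_for_sd_cutoff(1.0)    # about the 16th centile
centile_1_5sd = centile_for_sd_cutoff(1.5)  # about the 7th centile
sd_for_2_5th = sd_cutoff_for_centile(2.5)   # about 1.96 SDs below the mean

print(f"1.0 SD below the mean: centile {centile_1sd:.1f}")
print(f"1.5 SD below the mean: centile {centile_1_5sd:.1f}")
print(f"2.5th centile: {sd_for_2_5th:.2f} SD below the mean")
```

Under normality, a 1.5 standard deviation cut-off corresponds to roughly the 7th centile, and a 1 standard deviation cut-off to roughly the 16th centile.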
Sensitivity, specificity, positive and negative predictive values, together with the overall percentage of classification accuracy in predicting conversion or non-conversion to dementia, were determined using a combination of a total score of <88/100 on the ACE or a performance of 2 standard deviations or more below controls on the PAL, as this combination of measures has previously been associated with 100% sensitivity and negative predictive values. Reference Ahmed, Mitchell, Arnold, Nestor and Hodges10 These values were also determined for face (GFT) and object (GNT) naming measures.
Neuropsychological measures, for which the baseline performances of those in the aMCI converters group and the aMCI non-converters group were significantly different, were entered simultaneously alongside the putative confounders ‘age’, ‘NART FSIQ’ and ‘years of follow-up’ into a logistic regression analysis. A backward stepwise procedure using the likelihood ratio was applied to determine model content and levels of overall classification accuracy. Criteria for entry and removal were set at P = 0.05 and P = 0.01, respectively, using 20 iterations (SPSS 17 for Windows).
Results
Forty-one percent (18/44, 95% CI 28–56%) of participants with aMCI received a clinical diagnosis of dementia (most often Alzheimer's disease) at some point prior to study end-point (i.e. on average 4.33 years after entry into the study), giving an average annual conversion rate of 11.4% (95% CI 4–23%). Fifty-nine percent (26/44) of participants had not received a clinical diagnosis of dementia. Medical notes were missing or not accessible for the remaining 2/44 individuals. For these participants, the most up-to-date information available at the study end-point was that obtained at final study attendance. Of the participants who had not received a clinical diagnosis of dementia, 8/26 (31%) were stable, 10/26 (38%) progressive and 8/26 (31%) reverted to normal, according to the criteria defined above (Fig. 1).
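The text does not state how the confidence interval for the conversion proportion was obtained. As a hedged illustration, the Wilson score interval for 18 converters out of 44 participants reproduces the reported 28–56% to the nearest percentage point; the choice of the Wilson method here is an assumption, not a description of the original analysis.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 gives about 95%)."""
    p_hat = successes / n
    denominator = 1 + z ** 2 / n
    centre = (p_hat + z ** 2 / (2 * n)) / denominator
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2)) / denominator
    )
    return centre - half_width, centre + half_width


lower, upper = wilson_interval(18, 44)
print(f"Proportion converting: {18 / 44:.0%} (95% CI {lower:.0%} to {upper:.0%})")
```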
Following adjustment of the alpha level, Reference Holm24 significant differences in the baseline performances of the aMCI converters group and the aMCI non-converters group were found on the ACE (t (42) = 2.98, P<0.01) and HVLT–R discrimination index (t (41) = 2.81, P<0.01) (Table 1).
Only one participant obtained a GNT score at baseline below the 2nd centile of our healthy elderly control group. None of the participants with aMCI performed below the 2.5th centile of our control group on the GFT at baseline. Univariate sensitivity, specificity, negative (NPV) and positive (PPV) predictive values for conversion from aMCI to dementia were therefore based on a cut-off performance at the 7th centile (s.d.>1.5 below the mean) of age norms, and can be summarised as follows: GNT sensitivity 38%, specificity 68%, PPV 43%, NPV 63%; GFT 44%, 68%, 50%, 63%. Using a combination cut-off of ACE <88/100 or PAL >14 errors, the overall rate of classification accuracy was 68%, with sensitivity 72%, specificity 65%, PPV 59% and NPV 77%.
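The predictive indices above all derive from a 2×2 table of test result against outcome. The cell counts are not given in the text; the counts in the sketch below are hypothetical, chosen because they are consistent with 18 converters, 26 non-converters and the percentages reported for the ACE <88/100 or PAL >14 errors combination.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Standard indices from a 2x2 table.

    tp: converters flagged by the test      fn: converters missed by the test
    tn: non-converters not flagged          fp: non-converters flagged
    """
    total = tp + fn + tn + fp
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "PPV": tp / (tp + fp),
        "NPV": tn / (tn + fn),
        "accuracy": (tp + tn) / total,
    }


# Hypothetical counts, back-calculated from the reported percentages (not reported data).
metrics = diagnostic_metrics(tp=13, fn=5, tn=17, fp=9)
for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
```

With these assumed counts the function returns sensitivity 72%, specificity 65%, PPV 59%, NPV 77% and overall accuracy 68%, matching the figures quoted for the combined cut-off.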
Backward logistic regression with age, NART FSIQ, years of follow-up and the neuropsychological measures for which baseline performance differentiated converters and non-converters (HVLT–R discrimination index and ACE total score) resulted in a final model, completed after five iterations, including the variables ACE total score and HVLT–R discrimination index score only, yielding an overall classification accuracy (aMCI converters group v. aMCI non-converters group) of 74%, sensitivity 65%, specificity 80%, NPV 77% and PPV 69% (Table 2).
Variable | B | s.e. | Exp(B) | 95% CI for Exp(B), lower limit | 95% CI for Exp(B), upper limit |
---|---|---|---|---|---|
Constant | 15.32* | 6.74 | 4 501 208 | | |
Addenbrooke's Cognitive Examination, total score | –0.15* | 0.74 | 0.86 | 0.75 | 1.00 |
Hopkins Verbal Learning Test – Revised, discrimination index | –0.32* | 0.16 | 0.73 | 0.53 | 1.00 |
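To show how the coefficients in Table 2 translate into a predicted probability, the sketch below evaluates the logistic function at the baseline group means from Table 1. It assumes that the outcome was coded with conversion as the event and uses the rounded coefficients as printed, so the results are approximate and purely illustrative; this is not a validated clinical tool, and the function and constant names are hypothetical.

```python
import math

# Rounded coefficients as printed in Table 2.
INTERCEPT = 15.32
B_ACE = -0.15      # per point of ACE total score
B_HVLT_DI = -0.32  # per point of HVLT-R discrimination index


def conversion_probability(ace_total, hvlt_di):
    """Predicted probability of conversion, assuming conversion is coded as the event."""
    logit = INTERCEPT + B_ACE * ace_total + B_HVLT_DI * hvlt_di
    return 1.0 / (1.0 + math.exp(-logit))


# Exp(B) is the odds ratio per one-point increase in the predictor.
odds_ratio_ace = math.exp(B_ACE)
odds_ratio_hvlt = math.exp(B_HVLT_DI)

# Baseline group means from Table 1.
p_converter_means = conversion_probability(ace_total=86.6, hvlt_di=7.2)
p_non_converter_means = conversion_probability(ace_total=91.3, hvlt_di=9.2)

print(f"Odds ratio per ACE point: {odds_ratio_ace:.2f}")
print(f"Odds ratio per HVLT-R discrimination index point: {odds_ratio_hvlt:.2f}")
print(f"Predicted probability at converter means: {p_converter_means:.2f}")
print(f"Predicted probability at non-converter means: {p_non_converter_means:.2f}")
```

The exponentiated coefficients agree with the Exp(B) column (0.86 and 0.73). At the converter group means the predicted probability is close to 0.5, and at the non-converter group means it is roughly 0.2.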
Discussion
Main findings
Forty-one percent of participants who met criteria for aMCI at study entry received a clinical diagnosis of dementia within the following 4 years, giving an annual conversion rate of 10%, which is almost identical to the mean annual conversion rate (9.7%) obtained by averaging the findings from existing clinic-based aMCI longitudinal studies of a similar (2.5–3.5 year) length.
Baseline performance on the ACE and HVLT–R discrimination index could discriminate between future aMCI converters and non-converters at a time when general levels of cognitive functioning fell above the higher level screening cut-off for dementia (i.e. >27/30 on the MMSE and >88/100 on the ACE), and classified these individuals in accordance with their prognostic fate with a moderate degree (74%) of overall accuracy. Differences in the baseline performances of the aMCI converters and non-converters groups on these measures could not be explained by differences in age, FSIQ or length of follow-up, as the two groups were roughly similar and effects persisted after controlling for each of these variables.
The average score of the converters group on the HVLT–R discrimination index at baseline was equal to performance at the 4th centile of published age- and education-matched control values, Reference Benedict, Schretlen, Groninger and Brandt19 and the 7th centile of our own healthy elderly age- and IQ-matched control sample. The corresponding values for the non-converters group were the 28th and the 36th centile, respectively, implying that there is a greater risk of conversion to dementia among a subset of people with aMCI, who are readily identifiable on the basis of published norms.
For the ACE, average scores of the converters group were equal to the 4th, those of non-converters equal to the 31st centile of published normative values, Reference Mathuranath, Nestor, Berrios, Rakowicz and Hodges11 and the 1st and 16th centile of our matched study control data, providing further support for the designation of 88/100 as a higher cut-off point for dementia. We suggest that use of this score is appropriate to screen for aMCI, despite the younger age group of the original ACE normative sample.
Implications
In clinical practice, the combined performances of people with aMCI on the ACE and HVLT–R discrimination index could be used to inform decisions about the frequency of future contact/monitoring required, or in combination with additional clinical information (i.e. levels of carer-rated depressive symptoms, Reference Lu, Edland, Teng, Ingus, Petersen and Cummings28 ApoE4 carrier status, Reference Petersen, Thomas, Grundman, Bennett, Doody and Ferris15 corroborative history, neuroimaging findings, family history and qualitative aspects of clinical presentation) to decide whether or not to consider pharmacological or other interventions. The relatively small proportion of people with aMCI showing resolution of their cognitive symptoms over the time of the study also has implications for the clinical management of such individuals, as a number of empirically validated methods for the cognitive rehabilitation of early-stage Alzheimer's disease have been described Reference Clare and Woods29 and could theoretically be used to enhance the day-to-day memory functioning of people with aMCI.
Baseline scores of the HVLT–R discrimination index and the ACE were significant independent predictors of conversion to dementia. Closer inspection of the regression analysis reveals that the HVLT–R discrimination index score contributes to the overall classification accuracy of the ACE by increasing negative predictive value. This implies that memory impairment of a consolidation/storage nature is generally present in individuals where a diagnosis of dementia (Alzheimer's disease, vascular dementia or mixed Alzheimer's disease/vascular dementia) follows within 4 years.
It is possible that cued recall impairment arises closer to the point at which Alzheimer's disease can be diagnosed clinically, often after problems with (the more difficult) free recall become apparent. The possibility that cueing may facilitate episodic recall may then disappear with disease progression, giving rise to an encoding/consolidation profile of memory impairment.
This observation has implications for the recently proposed new research criteria for Alzheimer's disease, Reference Dubois, Feldman, Jacova, DeKosky, Barberger-Gateau and Cummings30 in which the requirement for objective evidence of significantly impaired episodic memory has been elaborated. The new criteria emphasise the importance of establishing an encoding and storage deficit on the grounds that reduced benefit from cueing during recall reliably identifies prodromal Alzheimer's disease. Our findings lend support to the specification of episodic memory impairment in this manner. However, the limited range of scores attainable using the HVLT–R discrimination index and the resultant potential for floor effects suggest it may not be well-suited for monitoring significant decline in episodic memory function over time.
The newer version of the ACE–R Reference Mioshi, Dawson, Mitchell, Arnold and Hodges31 incorporates a delayed cued verbal recognition element. In light of the added predictive value of the HVLT–R discrimination index demonstrated in this study, it would seem prudent to evaluate whether or not this measure retains its prognostic contribution alongside the ACE–R.
The mean total baseline score on the ACE (87/100) for future converters fell just below the higher level cut-off point for dementia. For 26% of participants, baseline ACE scores fell above the higher cut-off point for dementia (i.e. 88/100), suggesting that where the ACE is used as the sole means of determining the likelihood of developing dementia over the following 4 years, up to a quarter of all individuals with preclinical dementia receive false reassurance of ‘normality’. The implications of using the ACE as a sole means to determine the presence or absence of clinically significant levels of cognitive impairment are even greater, as the present findings indicate that 62% of people who fulfil criteria for aMCI obtain scores of 88/100 or above on the ACE.
We were unable to replicate the high levels of sensitivity, specificity, positive and negative predictive values that have been previously reported in association with combined PAL and ACE scores, the GNT and the GFT. Reference Blackwell, Sahakian, Vesey, Semple, Robbins and Hodges8,Reference Ahmed, Mitchell, Arnold, Nestor and Hodges10,Reference Thompson, Graham, Patterson, Sahakian and Hodges12 Our need to adopt a more conservative 7th centile (1.5 standard deviation) cut-off for the GFT and GNT naming measures may in part reflect the longer follow-up period in ours as compared with the last study Reference Thompson, Graham, Patterson, Sahakian and Hodges12 (13.7 months from baseline until study end-point). Their shorter interval until diagnosis is consistent with a greater magnitude of impairment on naming tasks in their sample. The predictive value of neuropsychological measures is likely to vary as a function of the number of years prior to diagnosis, underscoring the need for careful consideration of both the length of follow-up and the levels of baseline cognitive functioning of aMCI cohorts in different studies.
Limitations
There are a number of limitations to this study. First, although a mean follow-up period of over 4 years compares well with previous clinic-based studies of longitudinal outcome in people with aMCI, it remains possible that additional participants with aMCI will go on to receive a clinical diagnosis over the longer term. Furthermore, the length of follow-up varied among those with aMCI between 1 and 5 years. Ideally all participants with aMCI would have been followed up for the maximum 5-year interval. Second, the high average premorbid IQ and select nature (i.e. tertiary referral, amnestic single and multidomain and primarily Alzheimer's disease end-point diagnosis) of our aMCI cohort limit generalisation of the study findings beyond groups that are characterised similarly. Third, although the predominant eventual diagnosis of dementia was of Alzheimer (n = 11) or mixed (n = 5) Alzheimer/vascular type, a small proportion of people with aMCI (i.e. 2) were finally diagnosed with vascular dementia. The resultant inclusion of an end-point clinical diagnosis other than that of pure Alzheimer's disease may have influenced the predictive validity of the neuropsychological measures within our battery. It could be argued, in a more practical sense, that exclusion of people with aMCI on grounds of multiple risk factors or even retrospectively does not reflect clinical reality. There is variability in the point at which clinicians arrive at a diagnosis of dementia and the specific criteria they employ, despite a common bias towards avoiding false-positive diagnoses. It remains possible that for some individuals with an aMCI clinical diagnostic status at the study end-point this was in part reliant on the idiosyncrasies of one or more of the six attending consultants. Finally, the relatively small sample size makes independent replication essential.
Acknowledgement
The revised dual task was generously provided by Della Sala and colleagues.