INTRODUCTION
Clinicians conducting assessments of patients with severe brain injury are sometimes presented with clients who fail to apply full effort, are poorly motivated, or are malingering. Accurate measures of effort are therefore required to identify these individuals and ensure that valuable resources such as compensation, rehabilitation and entitlement to benefits are awarded equitably.
Tests such as the Test of Memory Malingering (TOMM) (Tombaugh, 1996) and sub-tests from the Word Memory Test (WMT) (Green et al., 1996) are purported to measure an individual's level of “effort” rather than cognitive “ability.” The WMT “effort” measures have been described as being “virtually insensitive to all but the most extreme forms of impairment of learning and memory” (Green et al., 2002, p. 99). Whereas the test authors have argued that failure on the WMT is caused by lack of “effort”, Merten et al. (2007) have suggested that in patients with clinically obvious symptoms, scores below cut-offs do not always provide information about insufficient “effort” but rather may simply represent false positives.
Studies have attempted to determine the sensitivity and specificity of these tests using litigating or student samples. Both methods are limited, because without a non-litigating brain injury sample the results may be inaccurate. Currently there is very limited research reporting sensitivity and specificity of the WMT (Table 1). Available research suggests the WMT produces a high false-positive rate in severe brain injuries (Bowden et al., 2006). More evidence is available regarding the diagnostic accuracy of the TOMM, and most studies have found good to excellent specificity using the cut-offs provided in the manuals (Table 2).
Note
SP = specificity; T1 = Trial 1; TBI = traumatic brain injury.
a As calculated by O'Bryant & Lucas (2006).
Various studies have used neurological, traumatic brain injury (TBI) and non-head-injury samples in an attempt to determine the diagnostic accuracy of the WMT and TOMM. In summary, three studies have directly compared the WMT and TOMM, each employing a litigating sample (Bauer et al., 2007; Gervais et al., 2004; Green et al., 2000). These studies found higher fail rates on the WMT, ranging from 27% to 66% with an average of 42%, whereas fail rates on the TOMM ranged from 10% to 29% with an average of 17%. Considering these disparities, one must ask whether the tests are assessing different constructs.
The well-replicated difference in fail rates between the WMT and the TOMM may be accounted for by the different nature of the stimulus material used in each test: the WMT uses semantically related words, while the material presented in the TOMM is semantically unrelated and visually distinct. The confusion and increased errors caused by potential “semantic interference” (Wehner et al., 2007) in the WMT would not occur to the same extent in the TOMM. For example, if an individual cannot remember the exact word presented in the WMT (e.g., “fire”) they may remember the category (e.g., “to do with fire”). However, during the testing phase they are confronted by a choice of two strongly related words, both of which belong to the category (e.g., “fire” versus “flame”). Though they may be applying full “effort,” semantic interference would lead to increased errors, whereas such an effect would not occur in the TOMM.
Alternatively, the difference in fail rates may be accounted for by the nature of an individual's brain injury, specifically whether it affects their ability to monitor their performance and learn from feedback. Additionally, because the tests have different presentation and testing structures, the learning and feedback paradigms are different. These two factors (nature of injury and internal test structure) mean that the WMT and TOMM may produce different results for the same person.
Craik (1982) argued that divided attention mimics the effects of aging and that similar patterns of memory deficit could be found under conditions of alcohol intoxication, amnesia and drugs that affect cholinergic and other networks in the brain. It has been argued that distraction or divided attention tasks interfere with encoding and hence the learning of material both in the general population and those with brain injuries (Schmitter-Edgecombe & Nissley, 2000; Watt et al., 1999). These studies have also suggested that a reduction in attentional resources leaves individuals with insufficient resources to complete set tasks. However, this effect was found to be more prominent in individuals with a brain injury as they often have an inability to inhibit distractions (Knight et al., 2006). Therefore, if an individual with an acquired brain injury were distracted while completing a task it could be expected that their performance would be markedly reduced.
This study aimed to determine which test requires more cognitive resources (or “ability”), and whether “effort” alone is sufficient to pass the WMT and TOMM, by reducing available attentional resources with an auditory distraction task. The distraction task was used to increase cognitive demand and reduce available cognitive resources. This would in turn determine whether the WMT and TOMM are tests of effort (as currently believed) or require ability (in which case they would be inappropriate for use with those who have sustained a severe brain injury). It was expected that performance on both tests would decrease with the introduction of the distraction task (Schmitter-Edgecombe & Nissley, 2000; Watt et al., 1999); however, we wished to examine whether or not this difference was significant. It was predicted that the distraction would have a stronger negative effect on the WMT than the TOMM, as the former has been found to be the more difficult test as outlined above.
METHOD
Participants
Sixty-nine individuals with severe brain injuries were recruited and classified as having traumatic (TBI) or non-traumatic brain injuries (nTBI). Nine participants were excluded for the following reasons: current involvement in litigation (n = 3), inability to complete the distraction task (n = 2), non-compliance with the malingering task (n = 2), risk of triggering seizures (n = 1), and a data saving error (n = 1), leaving a final sample of 60. All participants in the TBI group (n = 38) were considered to have suffered a severe brain injury; their injuries were caused by a motor vehicle accident (n = 19), being hit by a car (n = 7), or a fall or other incident (n = 12). Of the TBI group, 31 had a period of post-traumatic amnesia (PTA) of greater than 24 hours (28 of whom had a PTA longer than one week). The remaining 7 had coma durations of greater than one week. All of the nTBI participants (n = 22) were considered severely brain damaged under the Merten et al. (2007) criteria of “clinically obvious symptoms,” including repeated speech, bradyphrenia and word finding difficulties. Aetiology of these injuries included stroke (n = 11), poisoning (n = 3), hypoxia (n = 3), tumor (n = 2) or other causes (n = 3).
The sample consisted of 43 males and 17 females; 34 men and four women had sustained a TBI. The mean age of those with an nTBI was 44 years and the mean age at the time of injury was 34 years, an average of 10 years earlier. The mean education was 12.2 years and the mean estimated premorbid intelligence score was 107. For the TBI participants, the mean age was 39 years and the mean age at the time of injury was 27 years, a mean of 12 years earlier. The mean education was 12.6 years, with a mean estimated premorbid intelligence score of 104. Only age differed significantly between the TBI and nTBI groups (F(1,58) = 4.150, p = .046, ηp² = .067, where ηp² refers to the proportion of variance (or proportion of individual differences on the dependent variable) attributable to the independent variable being analyzed with the respective F-statistic). Injuries recorded in this study were divided into acute (13%), sustained less than two years earlier, and chronic (87%), sustained more than two years earlier. Test performance did not differ significantly between acute and chronic injuries.
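As an illustration of the effect size reported above, partial eta squared can be recovered from an F-statistic and its degrees of freedom using the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The following is a minimal sketch; the function name is an illustrative choice and is not taken from the study's analysis.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F-statistic and its degrees of freedom.

    Uses the identity eta_p^2 = (F * df_effect) / (F * df_effect + df_error).
    """
    explained = f_value * df_effect
    return explained / (explained + df_error)


# Age difference between the TBI and nTBI groups: F(1,58) = 4.150
age_effect = partial_eta_squared(4.150, df_effect=1, df_error=58)
print(round(age_effect, 3))  # 0.067
```

Applied to the age comparison, the identity returns the reported value of .067, meaning that about 7% of the variance in age is attributable to injury classification.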
Participants also completed the Depression Anxiety and Stress Scales (DASS) (Lovibond & Lovibond, 1995), to control for levels of psychological distress between the groups, and the National Adult Reading Test (NART) (Nelson & Willison, 1991), to control for estimated premorbid intelligence. No significant difference was found between groups on these measures. All sample details can be found in Table 3.
a Control versus Distraction groups.
b Control versus Simulated Malingering.
c Distraction versus Simulated Malingering.
d Cohen's Effect Size.
Measures and Procedure
Participants were allocated to the Full Effort and Distraction conditions using restricted randomization to ensure approximately equal numbers of nTBI and TBI in each cell. When a significant effect between these conditions was observed, subsequent participants were then allocated to the simulation condition. The resulting group numbers were (1) Full Effort (n = 25), (2) Distraction (n = 24) and (3) Simulated Malingering (n = 11). All participants completed both the TOMM and the WMT “effort” measures. The WMT “effort” measures (Immediate Recognition, Delayed Recognition and Consistency) and the TOMM (Trials 1 and 2) were administered via computer in counterbalanced order. The design therefore used restricted randomization to assign participants to two of the three levels of the between-subjects factor of task condition (Full Effort, Distraction, or Simulation), with test type as the within-subjects factor.
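The exact randomization scheme is not specified here. As a hedged illustration only, one common form of restricted randomization is permuted-block allocation within strata, which keeps the two conditions balanced within each injury classification. The function name, the block size of two, the seed, and the example counts below are all assumptions made for this sketch.

```python
import random


def blocked_allocation(injury_types, conditions=("Full Effort", "Distraction"), seed=0):
    """Allocate participants to conditions using permuted blocks within each stratum.

    injury_types: injury classifications (e.g. "TBI", "nTBI") in order of enrolment.
    Returns a list of (injury_type, condition) pairs in the same order.
    """
    rng = random.Random(seed)
    pending = {}  # stratum -> conditions still unused in its current block
    allocation = []
    for injury in injury_types:
        block = pending.setdefault(injury, [])
        if not block:
            # Start a new block containing each condition once, in random order.
            block.extend(conditions)
            rng.shuffle(block)
        allocation.append((injury, block.pop()))
    return allocation


# Hypothetical enrolment order (counts are illustrative, not the study's).
enrolled = ["TBI", "nTBI", "TBI", "TBI", "nTBI", "TBI", "nTBI"]
for injury, condition in blocked_allocation(enrolled, seed=1):
    print(injury, condition)
```

Because each block contains every condition exactly once, the cell sizes within a stratum can never differ by more than one participant, which is the property that "approximately equal numbers of nTBI and TBI in each cell" requires.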
The WMT comprises measures of both “effort” and “ability”. The measures of “effort” were designed to assess the individual's willingness to maximize performance and have been considered to be unaffected by all but the most severe brain injury. Participants were shown a list of 20 word pairs, with each pair presented for six seconds; the list was then repeated. Following this, the Immediate Recognition subtest was presented to assess participants' memory for the words using a forced-choice format. The words required a low level of reading ability and feedback was provided after each answer. Thirty minutes later the Delayed Recognition subtest was administered, and performance on Immediate Recognition and Delayed Recognition was compared to achieve a Consistency score (Green et al., 1996). The lowest score of Immediate Recognition, Delayed Recognition or Consistency was used to detect a fail as per the recommended cut-off in the WMT manual (Green et al., 1996).
The TOMM is a commonly used test of “effort” alone in which 50 pictures of common objects are presented for 3 seconds each over two learning trials. Each learning trial is followed by a recognition test, which also uses a forced-choice format and provides feedback (Tombaugh, 1996). TOMM Trials 1 and 2 (T1 and T2) are usually followed by the Retention Trial, which is administered after a further 15 minutes. However, Greve and Bianchini (2006) found only a 3% error rate in a litigating sample when the test was terminated after T2. Therefore the Retention Trial could be considered optional in a bona fide sample where time is restricted. In the present study, the Retention Trial was not administered and the score on T2 was used in the analysis as all participants who failed T1 also failed T2. A fail was defined as per the recommended cut-off in the TOMM manual (Tombaugh, 1996).
The Full Effort and Distraction groups were given standard instructions and told to maximize their performance on the WMT and TOMM. In addition, the Distraction group was asked to complete an auditory distraction task for the full duration of the learning phases of both tests, which was similar to the procedure used by Craik (1982). Participants in this group were required to add three to each number (between 1 and 9) presented via audio recording at three-second intervals, and state the answer aloud. The Malingering group was asked to respond to a typical simulation scenario (reproduced below) adapted from Tombaugh (1997):
“Pretend you are involved in a compensation case and you are seeking damages for your acquired brain injury. The court has required that you undertake a neuropsychological assessment to establish the impact the injury has had on your brain functioning, especially your memory. Your lawyer has informed you that the greater the memory impairment you appear to have, the more compensation you will receive. However, he also warned you that obvious faking during your assessment would be detected. Therefore, you must convince the researcher that you have a memory impairment without being obvious (i.e., fake without being caught).”
During the assessment, participants were reminded of the malingering scenario at the beginning and midway through each test phase to ensure they were answering accordingly. They were also provided with a compliance questionnaire at the end of the assessment.
This study was approved by the Macquarie University Human Ethics Committee.
RESULTS
Because test scores were not normally distributed, the analysis was run with both parametric and nonparametric statistics. Because both approaches to analysis provided the same pattern of results, parametric statistics are reported here. None of the demographic variables (current age, age at time of injury, time since injury, years of education, estimated premorbid intelligence, depression, anxiety or stress levels) were significantly correlated with the dependent variables or were significantly different across groups; they were therefore not included as covariates in subsequent analyses. Performance on the WMT and TOMM did not differ significantly by acute or chronic status of the injury, or by classification of injury.
For the purposes of statistical analysis, scores on TOMM T2 and the lowest of Immediate Recognition, Delayed Recognition or Consistency WMT scores were converted to percentages, as scores on TOMM (range 1–50) and WMT (range 1–40) are on differing metrics. The test scores are shown in Table 4. An overall group by test interaction was found: F(2,57) = 19.88, p < .0005, ηp² = .41. Performance therefore differed significantly between tests and across testing conditions. A significant interaction was also found between the Full Effort and Distraction groups, where performance on the TOMM was markedly better than on the WMT for both the Full Effort (Ms = 94.48, 82.7; SDs = 12.53, 14.05 respectively) and Distraction groups (Ms = 89.5, 64.17; SDs = 14.16, 19.18 respectively); F(1,57) = 15.43, p < .0005, ηp² = .21.
a Control v. Distraction groups.
b Control v. Simulated Malingering.
c Distraction v. Simulated Malingering.
d Cohen's Effect Size.
Using one-way ANOVAs, it was found that TOMM scores did not significantly decrease with the distraction task, F(1,47) = 1.70, p = .20, ηp² = .04 (Fig. 1). In contrast, there was a significant decrease in WMT scores when the distraction was introduced, F(1,47) = 14.98, p < .0005, ηp² = .24 (Fig. 2).
Examination of performance using the recommended cut-offs in the respective manuals found that significantly more participants failed the WMT than the TOMM in both Full Effort and Distraction groups (χ² = 26.97; p < .0005). Forty-four percent of the Full Effort group and 75% of the Distraction group failed the WMT, whereas failure on the TOMM was 16% and 33% respectively. Within the Full Effort group, participants who failed the TOMM were not significantly different from those who passed on any demographic measure using a one-way ANOVA. However, those who failed the WMT had significantly lower estimated premorbid intelligence than those who passed (Ms = 100, 110; SDs = 9, 11 respectively), F(1,23) = 6.425, p = .019, ηp² = .22.
Both tests produced 100% sensitivity to the malingering manipulation. On the other hand, in the Full Effort condition the specificity for the TOMM was less than ideal at 84%, and was unacceptably low for the WMT at 56%. With a 30% base rate the positive predictive value for the TOMM was 73% and for the WMT it was 50%. With lower base rates the predictive values would fall considerably lower (Straus et al., 2005).
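The dependence of positive predictive value (PPV) on base rate follows from Bayes' theorem: PPV = (sensitivity × base rate) / (sensitivity × base rate + (1 − specificity) × (1 − base rate)). The sketch below applies this to the sensitivity and specificity figures reported above. The 10% base rate is an illustrative assumption added here to show how PPV falls, and the function name is likewise hypothetical.

```python
def positive_predictive_value(sensitivity, specificity, base_rate):
    """Probability that a failed test reflects true malingering (Bayes' theorem)."""
    true_positives = sensitivity * base_rate
    false_positives = (1 - specificity) * (1 - base_rate)
    return true_positives / (true_positives + false_positives)


# Sensitivity was 100% for both tests; specificity comes from the Full Effort group.
for test_name, specificity in (("TOMM", 0.84), ("WMT", 0.56)):
    # 0.30 is the base rate used in the text; 0.10 is an assumed lower base rate.
    for base_rate in (0.30, 0.10):
        ppv = positive_predictive_value(1.0, specificity, base_rate)
        print(f"{test_name}: base rate {base_rate:.0%} -> PPV {ppv:.1%}")
```

At the 30% base rate this reproduces the reported values to within about one percentage point. At the assumed 10% base rate the PPV drops to roughly 41% for the TOMM and 20% for the WMT, so most failures on either test would be false positives.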
DISCUSSION
This study found that the distraction task reduced the cognitive capacity of the participants, thereby allowing an assessment of the level of cognitive demand required by the WMT and TOMM. It was found that performance on the TOMM was significantly higher than on the WMT for both Full Effort and Distraction groups, confirming previous findings that the WMT is a more difficult test. Furthermore, the distraction did not significantly affect TOMM performance, suggesting that the TOMM is more a measure of “effort” than cognitive “ability.” However, WMT performance was significantly affected by the distraction, indicating that the WMT is measuring cognitive “ability” as well as “effort,” which does not support the theory that in order to pass the WMT, “effort” and little or no “ability” is required (Green et al., 2002). The finding that those participants who failed the WMT had significantly lower estimated premorbid intelligence than those who passed further supports the notion that WMT results are influenced by cognitive ability.
In addition to this, false positives on the WMT were unacceptably high, which decreased the specificity of the test to an unacceptable level. This finding challenges Green et al.'s (2002) statement that all participants except for 0.02% of patients with “very severe and widespread cognitive impairment” (p. 117) should pass the test. Although the WMT has high sensitivity, the unacceptably high rate of false positives in participants with compromised cognitive capacity produced positive predictive values that may not meet Daubert standards (Mossman, 2003; Vallabhajosula & van Gorp, 2001).
Differences in test performance may be due to semantic interference, as the TOMM provides visually unique and unrelated images, whilst the WMT relies on semantically related words, which may increase errors and hence fail rates (Wehner et al., 2007).
The different administration procedures of the tests may also provide an explanation as to differences in false-positive rates. For each Trial, the TOMM provides a learning phase followed by a test phase, therefore providing the individual with feedback as to how well they learnt the material and an opportunity to increase their “attentional effort” during the second learning phase. It is suggested that for those with damage to the feedback or monitoring systems of the brain, this additional level of feedback would provide no benefit, as they would not have recognized the need to increase their “attentional effort”. For those with intact feedback systems, the additional feedback on the TOMM would allow them to increase their “attentional effort”. In comparison, the WMT presents the two learning phases before the testing phase. Therefore the examinees are not given an opportunity to increase their “attentional effort” and hence their performance. Therefore it is suggested that those that failed both the WMT and TOMM did not benefit from the additional feedback and learning opportunity provided by the TOMM, which may be a result of damage sustained to the feedback and monitoring systems. However, those that failed the WMT but passed the TOMM might have done so as a result of the TOMM's structure, which allowed them to use the additional feedback and learning opportunity to maximize their performance. Presently, this suggestion is confounded by differences in test material, where the WMT provides semantically related words, which may cause semantic interference unlike the TOMM. Therefore, further research should investigate this theory by controlling for these variables.
It could be argued that the meaningfulness of these results would be greater if they could be demonstrated in less severely injured (or non-injured) samples. Such data exist on the effects of distraction on the Immediate Recognition trial of the WMT in a non-injured college sample of 77 participants (Shores & Walker, 2007). In that study, results supported the central finding of the present study that the WMT does not contain “effortless” cognitive tasks. Diagnostic accuracy of the WMT Immediate Recognition showed that 15% of the participants with “mildly” reduced cognitive capacity who were asked to perform to the best of their ability failed. Of those with “mildly” reduced cognitive capacity who were instructed to underperform, 84.2% were detected by the test. This resulted in specificity and sensitivity ratings of 85% and 84.2% respectively. However, of those in the “severely” reduced cognitive capacity condition who performed to the best of their ability, 33.33% failed. Despite this high false-positive rate, 95% of the participants in the “severely” reduced cognitive capacity condition who were instructed to underperform were correctly identified. This resulted in specificity and sensitivity ratings of 66.66% and 95.0% respectively.
As previously outlined, the reasoning behind the distraction task used in this study was to increase cognitive demand, thereby reducing the cognitive resources available to participants. If the task at hand required effort alone and not ability, there should not have been a significant reduction in test performance. Using this reasoning, these results have shown that the WMT clearly requires more than effort to complete the activity, whilst TOMM scores remain robust despite reduced cognitive resources.
Other evidence using a different paradigm has also found that the WMT is not an “effortless” task (Allen et al., 2007). In an fMRI study it was shown that brain activation following WMT performance was found in areas consistently associated with increases in task difficulty, memory load and other forms of cognitive effort. These findings are inconsistent with the notion that the WMT is an “effortless” cognitive task. There is now strong convergent evidence from different paradigms (college students performing under different intensities of distraction, severely brain injured participants who were also placed under the additional burden of distraction, and non-injured participants in an fMRI study) that challenges the notion that the so-called “effort” measures of the WMT are in fact “effortless.”
This study used a sample of non-litigating severely brain injured individuals who had no external incentive to exaggerate their injuries, suggesting that scores and findings are an accurate representation of how this population actually performs on the WMT and TOMM. A limitation of the study was that although the Simulated Malingering group was assessed with a compliance questionnaire, this was not done with the Full Effort and Distraction groups. A further limitation of this study is that although an estimate of premorbid intelligence was obtained, no measures of current intellectual level were administered.
Therefore, the evidence provided in this study calls into question whether the WMT is an appropriate test to assess individuals with a severe brain injury. Further independent research should examine whether this statement extends to all levels of severity. Considering the results, it is recommended that the TOMM be used in conjunction with evidence gathered from other sources as recommended by Aronoff et al. (2007) and Tombaugh (1996). Failure to do so could lead to serious unintended consequences for an individual's compensation, rehabilitation, and entitlement to benefits. The current findings suggest that simply using a test with high sensitivity, such as the WMT, could lead to an unacceptably high rate of false positives in people who have compromised cognitive capacity, thus producing positive predictive values that may not meet Daubert standards.
ACKNOWLEDGMENTS
This work was supported with minor funding from the Macquarie University Psychology Department for the purchase of test equipment. The authors have no financial or other relationships that could be regarded as a conflict of interest.