Using the data on selective serotonin reuptake inhibitors (SSRIs), David Healy illustrates the difficulty of excluding the possibility of an association between a therapy and a serious, but uncommon, event such as suicide (Healy, 2006, this issue). The danger of relying on a statistical significance test to exclude clinically significant effects of treatment is, of course, well known but none the less worth rehearsing using the experience with the SSRIs. The basic issues are well explained with several examples by Altman & Bland (1995) and it is worth remembering the title of their article: ‘Absence of evidence is not evidence of absence’. Interpreting a non-significant hypothesis test as meaning that there is no association is a common error; when a real association exists, it amounts to a type II error (failing to reject a false null hypothesis). Although such a mistake may simply be due to lack of statistical expertise, it is also a way of ‘spinning’ the results in the direction that the authors would prefer them to go. Clinically, the events we are concerned with here will often be unexpected adverse events and the primary randomised evidence will usually have insufficient power to confirm or exclude an association reliably.
This is an increasingly important issue in clinical practice because service users are rightly demanding better information on the potential risks of treatments. So how should the clinician interpret data on risk? Evidence-based practice makes use of tactics derived from clinical epidemiology to identify the most robust research evidence and critically appraise it for its validity and applicability (Sackett et al, 1996). David Healy mentions evidence-based practice but does not really do it justice because he does not refer to the standard evidence-based practice approach to dealing with evidence on the harmful effects of drugs (or other disease risk factors). It may be helpful to review briefly how to use quantitative estimates of harm, derived from an appropriate study, in the clinical consultation (Levine et al, 1994).
Clinical studies of risk aim to estimate both the size of any association and the degree of uncertainty in the central estimate. In the case of very rare events such as suicide, observational studies may be needed because randomised trials may simply lack sufficient power, even combined in meta-analysis. The basic analysis, however, is the same – a comparison of the occurrence of the event in patients exposed to the treatment with the occurrence of the event in a control group who do not receive the treatment. Various metrics are used in these studies, commonly the odds ratio, the risk ratio and the hazard ratio. Each of these has its own properties but all essentially provide an estimate of the risk of the event in the treated group compared with that in the control group. On the basis of the amount of statistical information in the study, a confidence interval can then be constructed around the risk estimate; this interval describes the range of values within which the true value is likely to lie, at a stated level of confidence (conventionally 95%).
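The comparison described above can be sketched in a few lines of Python. This is a minimal illustration only: the function name `risk_ratio_ci` and the event counts are hypothetical and are not taken from any study discussed here, and the interval uses the common normal approximation on the log scale, which is one of several available methods.

```python
import math


def risk_ratio_ci(events_treated, n_treated, events_control, n_control, z=1.96):
    """Return (risk ratio, lower limit, upper limit).

    Uses the normal approximation on the log scale; z=1.96 gives an
    approximate 95% confidence interval. Assumes at least one event in
    each group (zero events would need a different method).
    """
    risk_treated = events_treated / n_treated
    risk_control = events_control / n_control
    rr = risk_treated / risk_control

    # Standard error of log(risk ratio) for two independent groups
    se_log_rr = math.sqrt(
        1 / events_treated - 1 / n_treated
        + 1 / events_control - 1 / n_control
    )

    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical counts chosen purely for illustration:
# 6 events in 1000 treated patients, 4 events in 1000 controls.
rr, lower, upper = risk_ratio_ci(6, 1000, 4, 1000)
print(f"RR {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

With so few events the interval is wide and spans 1, which is the typical situation for rare adverse events.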
An example
Using the data from Wheadon et al presented by Professor Healy, the relative risk of suicide in patients with bulimia treated with fluoxetine was 1.5 (95% CI 0.3–6.9). As Healy states, ‘with this confidence interval the data are potentially consistent with that risk being 6.9 times greater’. A more objective interpretation, however, would be that the data are consistent with the risk being about 7 times greater or 70% less, with the most likely estimate being a 50% increase in the risk. This is clearly a very wide range of possible effects. Such a result is indeterminate – the risk could either be greatly increased or greatly reduced. However, although a potential benefit would be useful, the central estimate is of an increased risk and this would clearly be of clinical concern. The clinical implications of a 50% increase in the relative risk of suicidal acts would largely depend on the absolute risks.
In general, relative risks seem more impressive than absolute risks (Fahey et al, 1995). If, for example, the absolute rate of suicidal acts is high, say 30%, in the control group, then a 50% relative increase would imply an absolute increase of 15 percentage points, to 45%, in the treatment group, clearly a clinically significant difference. On the other hand, if the absolute risk was low, say 2%, then this would mean a rate of 3% in the treatment group – a much smaller absolute difference of 1 percentage point. The size of the absolute risk of harm needs to be considered alongside the chances of benefiting from the therapy. A simple way of doing this quickly in the clinical consultation is to calculate the likelihood of being helped or harmed – a useful decision tool in which absolute risk is combined with patient-derived utilities (a way of rating preferences) of both helpful and harmful outcomes (Straus, 2002).
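The arithmetic behind these two scenarios can be made explicit with a short Python sketch. The function names are hypothetical, and the calculation simply assumes that the relative risk of 1.5 applies unchanged at each baseline risk; it does not attempt the likelihood of being helped or harmed, which would also need estimates of benefit and patient utilities.

```python
def treated_risk(control_risk, relative_risk):
    """Absolute risk in the treated group implied by a relative risk."""
    return control_risk * relative_risk


def absolute_risk_increase(control_risk, relative_risk):
    """Difference in absolute risk between treated and control groups."""
    return treated_risk(control_risk, relative_risk) - control_risk


RELATIVE_RISK = 1.5  # central estimate from the example above

# The two baseline (control group) risks used in the text: 30% and 2%
for control_risk in (0.30, 0.02):
    treated = treated_risk(control_risk, RELATIVE_RISK)
    increase = absolute_risk_increase(control_risk, RELATIVE_RISK)
    print(
        f"control {control_risk:.0%} -> treated {treated:.0%} "
        f"(absolute increase {increase * 100:.0f} percentage points)"
    )
```

The same relative risk therefore corresponds to very different absolute differences depending on the baseline, which is why both figures should be presented to the patient.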
It is clear that even simple statistics can be presented in ways that encourage one particular interpretation or another. The task of the clinician is to provide the patient with an objective summary of the best available evidence, which the patient in turn can integrate with their own values and preferences in reaching a decision.
Declaration of interest
J.G. has received research funding and support from GlaxoSmithKline, Sanofi-Aventis, the UK Department of Health, the Medical Research Council and the Stanley Medical Research Institute.