Healthcare systems around the world have placed increasing emphasis on evidence-based practice, underpinned by the production and dissemination of evidence-based guidelines. Healthcare policy makers require independent and objective advice on both healthcare technologies (devices and diagnostics) and procedures, particularly those that are new. High-quality published evidence on the efficacy and safety of new medical devices and new procedures is typically sparse. Advice from clinical experts is therefore frequently used as an integral part of the process of health technology assessment (HTA) (1–3). Expert opinion may be influential in interpreting the published evidence and may form part of the "evidence" in its own right.
Although advice from experts is widely used, there has been little empirical research on how it influences assessment and what part it plays in the production of guidance for health services. Guidance development is often conceptualized as a rational, repeatable, mechanistic process that orders and synthesizes empirical literature based on a hierarchy of evidence. However, a recent study highlighted how this process is influenced by contextual factors and by subjective judgment based on personal experience (4). That study also raised the question of how expert opinion is conceptualized as "evidence" in this process. Previous work has shown that prior experience and beliefs have a strong influence on the development of clinical guidelines (5), and that the process of guideline production involves the merging of bodies of both scientific and practical knowledge, alongside political and procedural considerations (6). It has even been argued that the involvement of experts may be harmful to evidence synthesis, through introducing bias into the process (7).
The National Institute for Health and Care Excellence (NICE) Interventional Procedures (IP) Programme has produced guidance for the UK NHS since 2002. The IP Programme's objective is to appraise the efficacy and safety of emerging interventional procedures, defined as those involving an incision, puncture, or entry into a body cavity, or the use of ionizing, electromagnetic, or acoustic energy. Detailed descriptions of the process and methods used by the IP Programme can be found elsewhere (8;9). Briefly, overviews of published evidence are provided, alongside solicited written commentary from expert advisers (clinical specialists). These, together with commentary from patients about their experience, are presented to an independent advisory committee (the Interventional Procedures Advisory Committee, IPAC, referred to henceforth as the committee). The committee drafts recommendations, which are presented for public consultation for one month, after which they are reconsidered by the committee in the light of the consultation comments and revised if necessary. Guidance is then ratified by NICE's Guidance Executive and published.
The IP Programme at NICE has developed a process for the collection and submission of expert clinical advice to the committee. Expert advisers (also known as specialist advisers) are nominated by UK professional organizations in response to requests for advice about specific procedures relevant to their specialties. They provide advice in writing, by completing a semi-structured pro-forma. The professional organizations are asked to nominate both specialists who do the procedure and specialists who do not, with the aim of obtaining a range of advice and opinion. Responses are required from a minimum of two experts for each procedure.
This study was designed to determine the aspects of expert advice that decision makers (committee members) find most useful during the development of evidence-based guidance and to identify the characteristics of experts who provide the most useful advice. To our knowledge, there is no previously published study that addresses this topic. We adopted a mixed methods approach, using both semi-structured qualitative interviews with committee members, and a quantitative cross-sectional analysis of a sample of pieces of written advice.
METHODS
Qualitative Analysis
Nineteen members of the NICE Interventional Procedures Advisory Committee were invited to participate in semi-structured interviews in Spring 2013. Members who had joined the committee shortly before the study were excluded as having insufficient experience of the IP process: all other members were included. Seventeen (89 percent) were interviewed; two were unable to participate due to time constraints. The seventeen members of the committee who were interviewed included thirteen consultant physicians, two lay members, one nurse consultant, and one professor of medical statistics. The semi-structured interview schedule was designed to capture the perceived value of expert advice when committee members consider guidance on a procedure.
Contemporaneous verbatim notes were taken during the interviews and these were subsequently sent to each interviewee for review, to provide respondent validation and to give interviewees the opportunity to comment further. Each transcript was then read by the lead researcher (O.O., a Public Health registrar on a 9-month placement at NICE) who identified and coded emerging categories using both the interview schedule and interviewee narratives to develop themes. Transcripts were also read independently by two other researchers (B.C., the chair of the Interventional Procedures Advisory Committee; J.P., a consultant clinical adviser to the IP Programme) who met with the lead researcher to explore and agree the emerging categories and a final thematic framework. The lead researcher returned to the transcripts and applied the agreed thematic framework to the data.
Usefulness Score
To quantify the usefulness of expert advice, a five-item "usefulness score" was developed following the qualitative analysis, incorporating the components of "usefulness" identified by the interview respondents. The score assesses the usefulness of the information provided by an expert adviser on a specialist adviser questionnaire. The five items were: where the procedure might fit into management of the condition/information on patient selection; information on the conduct, results, or interpretation of the findings of published studies that might not be apparent to a nonspecialist; training or expertise required to do the procedure safely and well; informal (collective) views or concerns of the adviser's specialist colleagues about the procedure; and any particular incentives that might lead clinicians to want to do the procedure.
On reading a completed questionnaire, each of the five items was scored between zero (no information included) and two (useful information, not available from published evidence), giving a total score between 0 (not useful) and 10 (very useful). For example, on the patient selection item, a questionnaire containing no information on the subject would receive 0; one line on patient selection that merely reflected and supported the kind of information gleaned from the published literature would receive 1; and a comment describing the type of patient for whom the procedure would be the first choice, with an explanation of why, would receive the full 2 marks.
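To make the scoring rule concrete, the following is a minimal sketch of the five-item rating described above. The item keys and the example questionnaire are hypothetical shorthand for illustration, not the actual pro-forma wording.

```python
# Minimal sketch of the five-item usefulness score described above.
# Item keys are illustrative shorthand; the actual pro-forma wording differs.
# Each item is rated 0 (no information), 1 (information that reflects what is
# available in the published literature), or 2 (useful information not
# available from published evidence).

USEFULNESS_ITEMS = [
    "patient_selection",     # fit into management of the condition / patient selection
    "study_interpretation",  # insight into conduct/results of published studies
    "training_expertise",    # training or expertise needed to do the procedure
    "collegial_views",       # informal views/concerns of specialist colleagues
    "incentives",            # incentives that might lead clinicians to do it
]

def usefulness_score(item_ratings: dict) -> int:
    """Sum the five item ratings into a total score between 0 and 10."""
    total = 0
    for item in USEFULNESS_ITEMS:
        rating = item_ratings.get(item, 0)  # a missing item means no information
        if rating not in (0, 1, 2):
            raise ValueError(f"{item}: rating must be 0, 1, or 2")
        total += rating
    return total

# Example: an adviser who explains patient selection in depth (2) and briefly
# echoes the literature on training (1) scores 3 out of 10.
print(usefulness_score({"patient_selection": 2, "training_expertise": 1}))
```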
The criterion validity of this index was established by having four members of the committee rate 30 questionnaires as "very useful," "useful," or "not useful"; these ratings showed significant correlation (Spearman's rank correlation coefficient 0.68, p < .001) with the usefulness scores applied by the researcher.
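As an illustration of this validation step, the sketch below computes Spearman's rank correlation between index scores and ordinally coded committee ratings. The data values here are invented for illustration; only the rating categories come from the study.

```python
# Sketch of the criterion-validity check: correlate the researcher's 0-10
# index scores with committee members' ordinal ratings of the same
# questionnaires. The numbers below are invented for illustration.
from scipy.stats import spearmanr

# Ordinal coding of the committee rating categories.
RATING_CODES = {"not useful": 0, "useful": 1, "very useful": 2}

index_scores = [2, 9, 5, 7, 1, 8, 4, 6, 3, 10]  # researcher's index scores
committee_ratings = ["not useful", "very useful", "useful", "very useful",
                     "not useful", "very useful", "useful", "useful",
                     "not useful", "very useful"]

rho, p_value = spearmanr(index_scores,
                         [RATING_CODES[r] for r in committee_ratings])
print(f"Spearman's rho = {rho:.2f}, p = {p_value:.3f}")
```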
Quantitative Analysis
The usefulness index was applied to all specialist advice questionnaires for procedures reviewed by the committee between July 2011 and April 2013, where at least four pieces of expert advice per procedure had been obtained. It was decided a priori to extract data from a minimum of 200 pieces of expert advice. Data were extracted by one researcher. Data recorded for each questionnaire included the usefulness score and the characteristics of the expert adviser who completed it (gender, year of qualification, clinical experience with the procedure in question, whether or not they had conducted research on the procedure, whether or not they had declared a conflict of interest, and their opinion of the novelty of the procedure). Finally, univariate and multivariate analyses were conducted using SPSS version 15.0 for Windows to investigate associations between the "usefulness" of advice and the characteristics of the expert providing it. In univariate analyses, mean usefulness scores were tested for significant differences using t-tests where two categories were compared and one-way analysis of variance where more than two categories were compared. Chi-squared tests were used to test associations between categorical variables. Multivariate analyses were conducted using linear regression to examine associations between variables of interest and the usefulness score.
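The study ran these analyses in SPSS; purely as a sketch of the same statistical steps, the snippet below reproduces them with common Python libraries on a hypothetical data frame. The file name and column names are assumptions, not the study's actual variable names.

```python
# Illustrative re-creation of the reported analyses using Python rather than
# SPSS. The data frame and all column names are hypothetical.
import pandas as pd
from scipy import stats
import statsmodels.formula.api as smf

df = pd.read_csv("expert_advice.csv")  # hypothetical extract of the questionnaires

# Univariate: t-test comparing mean usefulness between two groups,
# e.g. advisers who do vs. do not perform the procedure.
does, does_not = (df.loc[df["does_procedure"] == flag, "usefulness"]
                  for flag in (True, False))
t_stat, p_ttest = stats.ttest_ind(does, does_not)

# Univariate: one-way ANOVA across more than two categories,
# e.g. bands of year of qualification.
groups = [g["usefulness"].values for _, g in df.groupby("qualification_band")]
f_stat, p_anova = stats.f_oneway(*groups)

# Chi-squared test of association between two categorical variables.
chi2, p_chi, dof, _ = stats.chi2_contingency(
    pd.crosstab(df["does_procedure"], df["considers_established"]))

# Multivariate: linear regression of usefulness on adviser characteristics.
model = smf.ols(
    "usefulness ~ gender + qualification_year + does_procedure"
    " + does_research + conflict_of_interest + response_format",
    data=df).fit()
print(model.summary())
```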
RESULTS
Qualitative Findings
Usefulness of Expert Advice. All seventeen interviewees agreed that information from expert advisers on issues related to the use of the procedure in clinical practice was important and useful. Many committee members reported that it was difficult to get this information from the empirical literature. Committee members stated that expert advice helped to identify the potential indications and clinical applications of a procedure outside the artificial setting of a research study. A key element of usefulness was an understanding of where the procedure would fit in the overall management of a condition. This included a practical understanding of which patients might particularly benefit from a procedure (patient selection) and whether a procedure had a niche application different from other available procedures. Both of these were particularly valued, as illustrated by these three quotes:
“The question is: how do they decide what the patient gets when there is more than one procedure? Or does everyone going to them with a particular problem get the same treatment? In the uterine example we discussed today, how did they decide a particular patient would get it rather than a hysterectomy or is it a stepped process where they try one thing and if it doesn't work they try the next thing which is more extreme”
“Why would anyone choose to do this particular procedure rather than something else, what this is getting at is the core of the benefit/risk balance we are trying to get to”
“Patient selection - this is rarely in the literature, they only tell us the inclusion criteria for their particular study not why those people were chosen but we want to know who benefits and we can get that clinical context from the expert.”
Two interviewees said that expert comments about patient selection could, for example, directly influence whether or not the committee recommended that selection of patients must be done by a multi-disciplinary team.
The value of the practical know-how of experts (as distinct from empirically-derived knowledge) included their understanding of what particular training or experience would be required to carry out the procedure. Again, this was perceived as being poorly documented in the empirical literature. Four committee members stated that expert adviser comments about the training and expertise needed to carry out a procedure had directly influenced the inclusion of recommendations about these matters in the guidance they produced.
In describing the value of the contextual knowledge and know-how provided by clinical experts, several committee members reported that they did not find advice useful if the experts had never performed the procedure in question. Thus what was valued was not only the specialist training and status of the expert in their field of work, but their practical hands-on experience and the tacit knowledge this could bring to inform decision making by the committee. Interviewees reported valuing the personal insights experts could provide from within their own communities of practice (described—not unfavorably—by one respondent as “corridor gossip”), for example about research study authors, or about the conduct of studies, which could influence the interpretation of the published data. As one interviewee noted:
“In my own area, when I read a paper I start by reading the names of the people who wrote it. They may be people with sensible, thoughtful, well-measured or they may be two standard deviations away from common-sense. So by getting insight from someone in that area, we can understand this”
The value of these “corridor gossip” insights also included the benefits of hearing about concerns with efficacy or safety outcomes that might be being expressed within specialist practice, but not in the published literature, as illustrated in this quote:
“If there is a particular aspect of a safety outcome- because I can guess or I can see a list of all the things that might go wrong- but what I want to know is: This is what is really bothering everybody”
It was clear from the interviews that expert opinion was seen as a category distinct from, and complementary to, the published literature. Seven committee members considered that they should be able to get all the efficacy and safety evidence they needed from the peer-reviewed literature, but that expert advice could help in interpreting that literature. For example, one committee member explained that if the clinical expert contradicted the published evidence, this gave him pause for thought and prompted him to pay attention to why that might be. Another stated:
“If everyone who knows [i.e., experts] thinks it's a no-brainer, we should realise that”.
Others highlighted the potential for expert advisers to identify unusual safety events that would not find their way into the published literature due to their anecdotal nature, or due to the reluctance of authors to report all negative outcomes, but which were useful for the committee to consider in their discussions. These points are illustrated by the following quotes, which also refer to the value of honesty in expert opinion.
“The anecdotal safety outcomes you have to take as anecdote, but these things won't have made it into the literature”
“Even if we have published evidence, there is a reluctance from authors to state all the negative outcomes. Specialist advisers are more honest about these things”
Finally, another benefit of using specialist advice that was identified in the interviews related to the performative aspect of a guidance production program being seen to engage with the clinical community and the relevant specialist societies. The process of capturing and including expert opinion was seen as enhancing the credibility of the program and facilitating engagement with the processes of guidance development and implementation. As one respondent said:
“By engaging with the specialist societies and recognising their comments. . . it builds a relationship and recognised respect between the specialist community and the institute”
Limitations of Expert Advice
It was clear from the interviews that committee members regarded the potential for bias as the main limitation of expert advice. Ten committee members drew attention to the risks of selection bias as well as respondent bias, which may yield respondents who are overly enthusiastic about doing the procedure. It was also noted that experts are often early adopters of a procedure, which may mean that they are natural risk takers. These considerations were stated as possible reasons why experts' views of the risks and benefits posed by a procedure may be skewed, as illustrated in the following quotes:
“People are selected and by default you get a biased subgroup. The people who want to take part are usually believers in it, not the ones who don't believe.”
“Sometimes they think it works because that's what they do”
“It is important to remember this is a unilateral and monocular view.”
Crucially, the interviewees took account of this limitation (of potential bias) in their use of expert advice. Thus bias was not so much an absolute barrier to usefulness as a contextual factor to be accounted for in the interpretation of advice.
A second reported limitation of expert advice related specifically to advice on safety events. While, as noted above, many interviewees reported the benefit of anecdotal safety information, two committee members had concerns. One was concerned that the safety issue may be associated with a particular unit or hospital, perhaps where specialists are not selecting patients appropriately, or where they have not optimized their technique, rather than an issue with the procedure per se. A second committee member was concerned that accepting anecdotal evidence (for safety) leaves guidance open to challenge, saying:
“Anecdotal safety outcomes can cause problems because at public consultation it is open to challenge and people can say: there is nothing in the published evidence that supports that”
Some committee members considered that discordance between the responses of different expert advisers commenting on the same procedure made their advice difficult to interpret. By contrast, one committee member commented that they particularly appreciated this variety of response and thought it was important to find out that not all experts were in agreement.
Quantitative Findings
In the second part of this mixed methods study, data were extracted from all 211 pieces of expert advice relating to forty-one procedures—an average of five pieces of advice (i.e., five questionnaires completed by five different experts) per procedure. At the time, the committee had produced 455 pieces of guidance in total since its inception, so this sample represented 9 percent of the procedures ever examined and included the full range of committee topics. The characteristics of the experts are shown in Table 1: 92.9 percent of nominated experts providing advice were male and 7.1 percent were female. Half (106, 50 percent) of the experts had done the procedure that they were providing advice on; 144 (68 percent) stated that they had done research on the procedure in question; and 146 (69 percent) stated that they had no conflict of interest. One hundred eleven (52 percent) experts considered the procedure they were examining to be novel and of uncertain efficacy and safety, and ninety (43 percent) considered the procedure to be established practice.
Usefulness
As described in the methods, we were able to rate expert advice according to its usefulness using an index developed through the qualitative work with committee members. Here, we report which expert characteristics were associated with giving useful advice.
In univariate analyses, all three variables indicating experience with the procedure were associated with giving more useful advice (Table 2). These were (i) whether the adviser did the procedure, (ii) whether they were involved in research on the procedure, and (iii) whether they declared a conflict of interest. Experts who reported that they had done the procedure were judged to have provided more useful advice than those who reported that they had not (mean score, 4.42; SD, 2.22 versus 3.8; SD, 2.00; p = .036). Experts who reported doing research on the topic were judged to have given more useful advice than those with no involvement in research (mean score, 4.56; SD, 2.09 versus 3.14; SD, 1.93; t-test p < .001). Experts who reported a conflict of interest were also judged to have given more useful advice than those with no conflicts (mean score, 4.71; SD, 2.26 versus 3.85; SD, 2.04; p = .007). In addition, the format of the experts’ responses was associated with their usefulness. Experts who filled in the questionnaire online or who emailed the completed form provided more useful advice than those who handwrote their questionnaire (p = .007).
Analysis of data on how long experts had been qualified as doctors showed no association with the usefulness of advice (all means between 3.9 and 4.08; p = .920). Among the ten professional organizations most frequently providing advice to NICE, there was no significant difference in the usefulness of the advice provided (p = .136).
When linear regression was performed to examine the association between six expert characteristics (gender, year of qualification, operator, researcher, conflict of interest, and questionnaire format) and the usefulness score, the only characteristic that remained significantly associated with usefulness was whether or not the adviser reported having undertaken research on the procedure (p < .001).
DISCUSSION
Despite recurring critiques of evidence-based medicine as a reductionist, scientistic practice (10), evidence-based guideline production has always combined different "repertoires of evaluation" (6, page 1976), including both technical and practical knowledge. Indeed, Berg et al. (11) have argued that the embedding of normative knowledge (in their terms, value judgements influenced by pragmatic and implicit considerations, in contrast to rational facts) is important to support the successful implementation of guidelines—acceptance in clinical practice depends on a recognition that such issues have been considered.
Our interview findings showed that the elements of expert advice the committee members reported to be most useful were those grounded in the experience of actual clinical practice. This practice-based evidence was valued for providing both knowledge and know-how (12). In general, expert opinion was seen as a valued complement to empirical evidence, providing context and tacit knowledge that was not available in the published literature but which was helpful in interpreting it. In this sense, the value of expert opinion was not in forming part of the hierarchy of evidence (where it is generally seen as the bottom rung of the evidence ladder in the doctrine of evidence-based medicine), but in sitting alongside it, as an adjunct to the interpretation of technical "evidence." For example, interviewees valued specialist advice on the training and experience required to perform a procedure, on patient selection criteria, and on the place of a procedure within a clinical management pathway. Such topics are rarely covered in the academic papers selected to provide the "evidence base" for committee discussions. Interviewees also valued hearing tacit insights into the reputation a procedure or a research study had gained within a clinical specialty. Jamous and Peloille (13) distinguished between two forms of professional knowledge: technical (which can be rationalized and codified) and indeterminate (tacit and acquired through experience). Using this distinction, we can say that expert opinion captured some of this indeterminate knowledge, arguably thereby adding further implicit professional authority to the final guideline.
The limitation of potential bias in expert opinion was widely acknowledged, and some skepticism was expressed regarding the "anecdotal" nature of advice that referred to safety or efficacy outcomes. The perceived value of the latter lay in the reporting of safety issues that had not yet come to light in the empirical literature, whether through selective reporting or lack of timeliness. Again, the value of such reports was not so much in forming a lower rung of the evidence hierarchy in a formal, technical assessment of the safety (and efficacy) of a procedure, but in providing "pause for thought" and shaping the interpretation of the published literature. The power of the "adverse anecdote" has been discussed elsewhere (14).
In the second part of this study, we attempted to identify the characteristics of expert advisers associated with providing the most useful advice. This quantitative analysis demonstrated that the most useful advice on a procedure appears to be given, perhaps unsurprisingly, by clinical experts who have direct personal experience of the procedure ("operators"), in particular those with research experience. This may be because those with research experience have more insight into the tacit knowledge that committee members value, or it may be confounded by other qualities (such as attention to detail or eloquence in written text) that are perhaps more often present in research-active physicians than in pure clinicians. Regardless, requesting advice exclusively from those who do a procedure may give a one-sided view: a previous study demonstrated that advisers with experience of doing a procedure, but not those with research experience or conflicts of interest, were more likely to consider a procedure established, safe, and efficacious (15), a finding we confirmed in our data. In multivariate analysis, we found no association between other expert characteristics and the "usefulness" of advice received.
Strengths and Limitations
This study's strength was its conduct in a real-world setting: we interviewed decision makers who use expert advice to assess and produce recommendations on the use of procedures, and we retrospectively examined real advice given by experts. While the emergence of categories in the qualitative analysis was for the most part inductive, we reflect that it was likely also influenced by the researchers' previous experience at NICE and of working in public health and medical roles. A limitation of the quantitative work was the use of a proxy for "usefulness," because it would have been impractical to ask committee members to score more than 200 pieces of expert advice. However, the items in our proxy index were grounded in the qualitative interview data from seventeen members of the committee, and the index was validated by demonstrating the correlation of a pilot set of index scores with the views of four committee members.
Implications
Many HTA programs already use expert advice as part of their formal decision-making processes, and others may do so in less formalized ways. It has previously been suggested that the risk of introducing bias to a systematic review or meta-analysis should preclude the use of experts as authors of these pieces of work (7). However, the same article argues that review teams should have access to expertise in the topic area.
Our findings suggest that HTA programs that include expert advice in their processes should continue to do so, perhaps formalizing this to ensure that research-active experts are included and that a balance is sought between those with experience of the technology under review and those without. HTA programs that do not already use input from experts may wish to develop a mechanism for expert advice to be collected and fed into decision making. We argue that published peer-reviewed evidence cannot answer all the questions that decision makers may have when creating evidence-based guidance. Information and opinions from experts, especially those related to in-practice knowledge and know-how, can complement empirical evidence in assessing and producing recommendations about healthcare interventions. Such contextual and tacit knowledge is rarely found in the empirical literature. The potential biases present in expert advice are generally acknowledged, and allowed for, by those using it.
CONCLUSION
Evidence-based guidance production is often characterized as a rational, pipeline production process. Such a characterization ignores the important role that commentary from experts (and indeed, although not the topic of this study, from patients and their representatives) (16) can play in the construction of guidance. Recent authors have called for a "renaissance" in the evidence-based medicine movement to refocus on useable evidence that incorporates contextual information and expert opinion (17). Our findings suggest that, at the level of guidance production, this is achievable and valued by those constructing the guidance. This study adds to an emerging literature that is beginning to examine the role of the "Ghost" of expert opinion in the "Machine" of evidence-based guidance production.
CONFLICTS OF INTEREST
Competing Interests: J.P. and H.P. work as Consultant Clinical Advisers to the NICE Interventional Procedures Programme. B.C. is Chair of the NICE Interventional Procedures Advisory Committee. O.O. was a specialist registrar in public health on placement with NICE during the study period. The authors declare that they have no other competing interests. Author Contributions: B.C., H.P., J.P., and O.O. designed the study. O.O. and A.W. undertook the fieldwork and analyses. All authors contributed to the interpretation of findings and drafting of the final manuscript. J.P. is guarantor.