The goals of personalised medicine and stratified healthcare carry the potential to improve the outcomes of psychiatric disorders, which to date are often characterised by insufficient treatment response. Defining psychiatric symptoms and establishing diagnoses using a standard assessment tool across major psychiatric disorders is a desirable aim on the path towards translating research findings into clinical practice. Furthermore, a common data-set across many different research and clinical populations facilitates the generation of large data-sets essential to studies of complex disorders. OPCRIT+ is an expansion and redevelopment of the OPCRIT system described by McGuffin et al in 1991. Reference McGuffin, Farmer and Harvey1 The original tool is an electronic checklist of psychopathology items with algorithms for objective diagnosis of psychotic and affective disorders. It has been found to be both reliable and easy to apply in research settings but has not been applied in clinical settings. Reference Cardno, Jones, Murphy, Murphy, Asherson and Scott2,Reference Williams, Farmer, Ackenheil, Kaufmann and McGuffin3 We have redesigned OPCRIT for use in clinical settings and expanded the number of objectively rated items to cover a broader range of diagnostic categories. This is part of a larger drive to improve objectivity in diagnosis in clinical settings, and to provide a mechanism for the routine collection of a core clinical, research and audit data-set. In this paper we describe the development and structure of OPCRIT+ and present the results of an interrater reliability study. Our first aim was a proof of concept: that OPCRIT can be developed into a broad clinical assessment tool for adult mental illness. Our second aim was to discover whether OPCRIT+ could achieve adequate interrater reliability of diagnosis among junior clinical staff, which is a requirement if such a system is to be implemented routinely.
Method
Development of OPCRIT+
As in the OPCRIT system, OPCRIT+ consists of a checklist of items in either electronic or paper format that allows flexible data entry, provides definitions of the items, and applies algorithms based on ICD-10, 4 DSM-IV 5 and other criteria to provide a diagnostic classification. Two identical versions of OPCRIT+ were designed. The first was written as a standalone program running under Windows XP (or later versions of Windows) and the second as a component of the South London and Maudsley NHS Foundation Trust electronic clinical record (ePJS), which has a web-based interface. Each version collected an identical data-set, used identical definitions of items and terms and identical algorithms to make diagnoses (Appendix 1).
Item design and arrangement
The items in OPCRIT+ were structured around the headings of a standard psychiatric history and mental state examination. We retained the original items from OPCRIT (http://sgdp.iop.kcl.ac.uk/opcrit/) but the wording of some items was changed to take account of new items, as well as changes in terminology since publication of the original OPCRIT paper. Reference McGuffin, Farmer and Harvey1 New items were designed based on the objective criteria requirements of the expanded range of diagnoses from the research version of ICD-10 and from DSM-IV. A range of anxiety, substance misuse and personality disorder diagnoses commonly encountered in psychiatric practice were included. Further items salient to the assessment of psychosocial status, risk and prognosis were also added (see online supplement and online Table DS1).
An OPCRIT+ assessment includes a number of screening items. Depending on the answers to these screening items, additional items are activated for rating. This approach reflects the need to create a system that could be rated quickly and efficiently by a busy clinician, while at the same time avoiding a Procrustean approach to diagnosis (the practice of tailoring data to fit a preconceived structure). Reference Farmer, McGuffin and Williams6 Thus, for example, items regarding delusions are only activated if a prior screening item about delusions is rated positively. Free-text boxes are available within all sections, allowing notes specific to each case to be recorded.
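The screening logic described above can be illustrated with a short sketch. This is not the OPCRIT+ implementation (which was written in C++); the item names, the `screen` field and the rule that any rating above 0 counts as positive are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Item:
    """One checklist item; `screen` (hypothetical field) names the screening item that gates it."""
    name: str
    screen: Optional[str] = None  # None means the item is always active


def active_items(items, ratings):
    """Return the names of items that should be presented for rating.

    An item is active when it has no screening item, or when its
    screening item has been rated positively (assumed here: rating > 0).
    """
    active = []
    for item in items:
        if item.screen is None or ratings.get(item.screen, 0) > 0:
            active.append(item.name)
    return active


# Hypothetical item names, for illustration only.
ITEMS = [
    Item("delusions_screen"),
    Item("persecutory_delusions", screen="delusions_screen"),
    Item("grandiose_delusions", screen="delusions_screen"),
    Item("hallucinations_screen"),
    Item("third_person_voices", screen="hallucinations_screen"),
]
```

With `ratings = {"delusions_screen": 1}`, the two delusion items become active while `third_person_voices` stays hidden until its own screening item is rated positively.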
Each item is associated with a help entry. Definitions, suggested questions and other resources within the help were standardised where practical, with reference to the original OPCRIT items and ICD-10/DSM-IV criteria, and were modelled on questions contained within the Schedules for Clinical Assessment in Neuropsychiatry (SCAN) Reference Wing, Babor, Brugha, Burke, Cooper and Giel7 and the Structured Clinical Interview for DSM-IV (SCID). Reference First, Spitzer, Gibbon and Williams8
Algorithm design
The OPCRIT+ checklist retained all algorithms from OPCRIT (v4.0) and added algorithms for the additional diagnoses listed in Appendix 2. These additional algorithms were designed around the ICD-10 and DSM-IV criteria for those diagnoses. All algorithms were coded in the programming language C++ on Windows.
Database design
OPCRIT+ uses the MySQL database platform (v5.1). MySQL is a free, open-source database system that is widely used, flexible and secure. Item ratings are recorded numerically, with a rating of −8 for ‘not applicable’, −9 for ‘unknown’ and a ‘null’ value to represent an unrated item. Ratings are time stamped. OPCRIT+ takes advantage of the internet-enabled features of MySQL so that data can be entered and stored in real time by multiple users in multiple locations over a secure internet connection to a MySQL database server. Using this feature avoids storing potentially sensitive data on local workstations or researchers' laptops.
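The rating codes described above can be sketched as follows. The codes for ‘not applicable’ and ‘unknown’ and the use of a null for unrated items come from the text; the `Rating` class, its field names and the helper functions are hypothetical and are not taken from the OPCRIT+ database schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

NOT_APPLICABLE = -8  # code given in the text
UNKNOWN = -9         # code given in the text


@dataclass
class Rating:
    item: str             # hypothetical item identifier
    value: Optional[int]  # None stands in for the database NULL (unrated)
    rated_at: datetime    # ratings are time stamped


def status(rating):
    """Classify a rating so that special codes are never read as scores."""
    if rating.value is None:
        return "unrated"
    if rating.value == NOT_APPLICABLE:
        return "not applicable"
    if rating.value == UNKNOWN:
        return "unknown"
    return "rated"


def usable_values(ratings):
    """Keep only substantive ratings, dropping unrated, -8 and -9 entries."""
    return [r.value for r in ratings if status(r) == "rated"]
```

Keeping the special codes out of any numeric summary matters because −8 and −9 would otherwise be treated as very low scores.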
Usage
OPCRIT+, like its predecessor, was designed to allow flexibility in the source of ratings. That is, the constituent items can be scored following a standard clinical clerking, following a structured interview, based on a set of case notes, from a prepared case abstract/vignette or from a combination of sources. As with its predecessor, we assume that the rater has a basic knowledge of psychiatric assessment and of eliciting psychopathology. A web page was constructed with instructions for installation and use. Training vignettes were designed and are available on request from the website (http://sgdp.iop.kcl.ac.uk/opcritplus).
Interrater reliability
Junior doctors in psychiatry were recruited by an email advertisement sent to trainees at the South London and Maudsley NHS Foundation Trust. The recruited group of trainees (n = 20) at the time of the study had between 0 and 3 years of clinical psychiatric experience. Ten cases were selected from the clinical case records to represent the broad range of diagnoses that the OPCRIT+ checklist covers. Most of these were clinically complex, with multiple, comorbid diagnoses. Case abstracts were anonymised and prepared by J.R. Each case was described in approximately 500 words, divided by paragraphs but without headings. Each participant received 1 h of training on the use of the OPCRIT+ checklist by J.R., J.G. and C.G. Trainees were then asked to rate the cases using the checklist, on the basis of a ‘lifetime-ever occurrence of symptoms and signs’. Raters were asked to check each item manually, regardless of whether it was nested within a screening item, and to reconsider their screening item rating in light of any positive rating of a nested item. This sought to mitigate the complication of data non-independence when kappa values were calculated. J.R., C.G. and J.G. were available to give technical assistance and advice when it was requested, but did not give guidance on the rating of cases. The OPCRIT+ algorithms were applied to the collected data to generate a list of diagnoses for each rater.
Statistical analysis
We calculated a variant of Fleiss' kappa statistic that allows for multiple raters, multiple categories and no pre-existing assumption that raters are forced to assign a certain number of cases to each category. Reference Randolph9,Reference Randolph10 Values of kappa can range from −1.0 to 1.0, with −1.0 indicating perfect disagreement below chance, 0.0 indicating agreement equal to chance, and 1.0 indicating perfect agreement above chance. Reference Randolph10 The raw diagnostic output from OPCRIT+ was sorted into ordered, diagnostic groups between which the weightings were equal, as detailed in Appendix 3. We statistically analysed the diagnostic data generated from the rated cases by calculating the extent to which raters agreed on diagnoses. More specifically, within each diagnostic spectrum (i.e. psychotic/affective, substance misuse, anxiety and personality disorder spectra) we selected from the ten cases any that might reasonably have attracted a diagnosis along that spectrum along with one ‘null’ case, and calculated the interrater reliability with respect to those cases.
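For orientation, the statistic can be written out as below, on the assumption that the variant used is the free-marginal multirater kappa described in the Randolph references. Here N is the number of cases, n the number of raters per case, k the number of categories and n_ij the number of raters assigning case i to category j; these symbols are introduced for illustration and do not appear in the original text.

```latex
P_o = \frac{1}{N\,n\,(n-1)} \sum_{i=1}^{N} \sum_{j=1}^{k} n_{ij}\,(n_{ij}-1),
\qquad
\kappa_{\text{free}} = \frac{P_o - \tfrac{1}{k}}{1 - \tfrac{1}{k}}
```

Under this assumption, chance agreement is fixed at 1/k rather than estimated from the observed category frequencies. As a worked example, k = 7 categories with an overall agreement of P_o = 0.68 gives κ = (0.68 − 0.143)/(1 − 0.143) ≈ 0.63.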
We further analysed the interrater data by calculating kappa values for all original OPCRIT items, all screening and mental state examination items and any other items that were used by the diagnostic algorithms. We calculated kappa scores from counts of rating scores for each option within each item (i.e. how many out of the 20 raters rated 1, 2, 3, etc.) for each case separately and then combined scores for all cases to give a combined kappa score for that item. Thus we calculated interrater reliability scores for a total of 158 items.
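The count-based calculation described above can be sketched as follows. It assumes the free-marginal multirater kappa and assumes that cases are combined by pooling their agreeing rater pairs; the text does not state exactly how per-case scores were combined, so the pooling step and the function names are illustrative only.

```python
def overall_agreement(counts):
    """Proportion of agreeing rater pairs, pooled over cases.

    `counts` is a list of rows, one per case; each row gives how many
    raters chose each rating option. Every row must sum to the same n >= 2.
    """
    n = sum(counts[0])
    if n < 2 or any(sum(row) != n for row in counts):
        raise ValueError("each case needs the same number (>= 2) of raters")
    agreeing = sum(c * (c - 1) for row in counts for c in row)
    return agreeing / (len(counts) * n * (n - 1))


def free_marginal_kappa(counts):
    """Free-marginal multirater kappa: chance agreement is fixed at 1/k."""
    k = len(counts[0])  # number of rating options for the item
    p_o = overall_agreement(counts)
    p_e = 1.0 / k
    return (p_o - p_e) / (1.0 - p_e)


# Example: 20 raters, two cases, three rating options.
# Case 1: all raters agree; case 2: raters split evenly between two options.
example = [[20, 0, 0], [10, 10, 0]]
kappa = free_marginal_kappa(example)
```

In this example the pooled overall agreement is 14/19 and the resulting kappa is 23/38, roughly 0.61.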
Results
Interrater reliability results are detailed in Table 1. The combined, weighted kappa for diagnostic reliability among all groups of diagnoses was 0.70. For psychotic and affective disorders a kappa of 0.63 was obtained. Kappas for substance use disorders, anxiety disorders and personality disorders were 0.76, 0.71 and 0.83 respectively.
The mean kappa for interrater reliability among items was 0.80. When kappas for individual items were stratified according to whether items were independent (screening items) or dependent (contained under screening items), the kappa scores were 0.73 and 0.84 respectively. Items contained under screening items can be interpreted as partially dependent variables, which artificially inflates the kappa score; detailed results are therefore presented in online Table DS2 and online Fig. DS1.
Discussion
Main findings
OPCRIT+ can be downloaded and installed free of charge from http://sgdp.iop.kcl.ac.uk/opcritplus/. OPCRIT+ has been incorporated within the electronic case record of the South London and Maudsley NHS Foundation Trust, London, UK. Data collected by OPCRIT+ in the electronic case record are subject to automatic anonymisation and categorisation via an electronic search engine called the Case Register Interactive Search (CRIS), recently developed by Stewart et al. Reference Stewart, Soremekun, Perera, Broadbent, Callard and Denis11
The OPCRIT+ checklist is a development and expansion of the OPCRIT checklist, a well-established and well-validated tool in research settings. Reference Cardno, Jones, Murphy, Murphy, Asherson and Scott2,Reference Williams, Farmer, Ackenheil, Kaufmann and McGuffin3 We demonstrate in this study good interrater reliability scores for diagnosis and good overall reliability for individual items. Most items have kappa scores between 0.6 and 0.9 (online Fig. DS1) and all but three diagnostic items have kappa scores of over 0.6 (online Table DS2). The interrater reliability study was performed on the standalone version of OPCRIT+. We would not expect any differences in interrater reliability between the two versions of OPCRIT+ in this study as they collect identical data-sets and run identical algorithms. A unified data-set and applied operationalised criteria for diagnosis in psychiatry is potentially achievable across the majority of patient groups and research groups using this tool. Such a data-set will be invaluable for the continuing research into biomarkers and personalised interventions in psychiatric disorders.
Table 1
Diagnostic spectra | Diagnostic groups, n | Cases, n | Overall agreement, % | κ |
---|---|---|---|---|
Affective/psychotic disorders | 7 | 5 | 68 | 0.63 |
Anxiety disorders | 2 | 3 | 80 | 0.71 |
Substance use disorders | 3 | 10 | 84 | 0.76 |
Personality disorder | 2 | 4 | 92 | 0.83 |
Strengths and limitations
In our study, raters were medically qualified, but had only between 0 and 3 years' postgraduate experience in clinical psychiatry. They received 1 h of training in the use of OPCRIT+. Despite these limitations they achieved good interrater reliability. OPCRIT+ is therefore a robust system and could potentially be used by research workers and members of professions allied to medicine with appropriate training. However, we would also point out that our interrater reliability study was carried out using prepared abstracts and may not fully represent the sort of reliability that might be seen in busy clinical environments. Clearly, reliability may be influenced by many factors, including the time constraints on raters, their experience and knowledge of the case they are rating, their experience of psychiatric diagnosis and eliciting psychopathology in general, and their experience and training in the use of the OPCRIT+ system itself.
Interrater reliability can also be artificially inflated with non-independent data. Reference Fleiss12 OPCRIT+ contains items that are dependent on the ratings of preceding screening items. Thus the data for these items became partially dependent. We tried to minimise this effect by asking raters to manually rate all items, whether or not they had rated a screening item in a certain way, and we also avoided calculating kappa scores for items that were dependent on two or more screening items as such items are, by definition, rarely rated and thus are likely to have high dependency and misleadingly high kappa scores.
Although we have demonstrated reliability for the new items and diagnoses included in OPCRIT+, there remains the question of the validity of the classifications produced by the instrument. The new diagnostic items certainly have face validity and all are taken from established diagnostic criteria 4,5 but the issue of other forms of validity will require further studies.
The advantages and disadvantages of the use of objective criteria for diagnosis in mental health have been questioned and criticised extensively. Reference Kendell and Jablensky13 Nonetheless, no viable alternative has been developed, and the atheoretical objective criteria of the modern versions of the ICD and DSM have undeniably led to a more consistently applied approach to diagnosis. Reference Grove, Andreasen, McDonald-Scott, Keller and Shapiro14 Biomarkers may eventually refine the process but objective symptom assessment and diagnostic stratification are essential prerequisites to this. The OPCRIT+ system places established sets of objective criteria as well as salient historical markers for aetiology, diagnosis and prognosis into an electronic diagnostic system for clinical environments with benefits for both researchers and clinicians.
Finally, we briefly turn to the difficulties of establishing such a standardised computer-based system in routine clinical psychiatric practice. We made considerable effort in the design process of OPCRIT+ to incorporate features that were attractive to rating clinicians (Appendix 1). We collected informal feedback from the participants in the interrater study and subsequently from clinicians at our associated hospital about their experience of using OPCRIT+ and the time taken for completion. Time for completion was influenced primarily by familiarity with the tool and with the case being rated, ranging from 10 to 45 min. Even if it is not seen as time consuming, arguments against such a system may be rooted in an objection to the spirit of the process as a whole; more specifically that the mechanical application of diagnostic criteria with a computer does not include an understanding of the individual case, and therefore has limited utility for conceptualising the individual patient (for a review and commentary in this area, see Bertelsen Reference Bertelsen15 ). Yet, as we have pointed out above, objectivity in psychiatry has led to a more robust stratification of individuals into different diagnostic categories and this is a logical prerequisite to good-quality, replicable research with all the eventual clinical benefits thereof. Moreover, the application of objective criteria should not detract from the clinical skill of eliciting the psychopathology in the first place, or from the need to also form an aetiological theory of each individual case, which we would acknowledge wholeheartedly as being essential in the successful treatment of each individual. We would therefore argue that both approaches to information gathering are needed if individuals with mental disorders are to be successfully treated and clinical research is to progress. 
We have developed OPCRIT+ to provide a framework for the collection of data from both these spheres of approach, which we hope represents a significant advance in this area.
Funding
The authors acknowledge financial support from the National Institute for Health Research (NIHR) Biomedical Research Centre for Mental Health at the South London and Maudsley NHS Foundation Trust (SLaM) and the Institute of Psychiatry, King's College London. J.R. is funded by a research training fellowship grant from the Wellcome Trust and was previously funded by a preparatory clinician scientist fellowship from the NIHR Biomedical Research Centre for Mental Health at SLaM and the Institute of Psychiatry, King's College London.
Appendix 1
Description of differences between each version of OPCRIT+
We developed two identical versions of OPCRIT+ in this project. Each version collects an identical data-set and runs identical algorithms for diagnosis. The differences between the versions are functionally aesthetic and do not affect the core function of the program. One version was built and incorporated into our affiliated hospital's electronic clinical record, which has a web-based interface. The other version was built to be installable on any computer running the Windows operating system (freely downloadable from http://sgdp.iop.kcl.ac.uk/opcritplus/).
OPCRIT+ - installed in our hospital's electronic case record
The challenge in developing this version of OPCRIT+ was making the tool attractive to busy clinicians. We developed three features to make the system more appealing.
(a) We developed a help system linked to each item that provides standardised definitions of items and web links to seminal papers and internet resources, aimed at junior psychiatrists in training.
(b) We included free-text boxes under each heading, so that additional information relevant to each case can be recorded as required.
(c) We designed a function to generate automatic summary text based on rated items. Automatic summary text is freely editable and amalgamated into the free text written by the rater. All text is automatically collated into one summary, with headings, using a button at the end of the form. This core summary can be pasted into any word processing software and used as the basis for written reports, such as discharge summaries or assessment records.
OPCRIT+ - installable on any PC running Microsoft Windows
This version of OPCRIT+ can be installed on any modern PC running Windows XP or later. It includes a similar help system, but without web links. This version does not include algorithms to generate text automatically, but does have free-text boxes to record notes on each case. The user can create multiple databases, and save data either locally or via a networked MySQL database. The advantage of such a feature is that many individuals using different computers running OPCRIT+ can all enter data into the same remote database simultaneously. This greatly facilitates data collection and secure, reliable storage. This version of OPCRIT+ allows export of data into a variety of formats for onward analysis and will import data from the original OPCRIT program and appropriately formatted text files.
Appendix 2
Diagnoses in OPCRIT+ arranged by major class, spectrum and specific disorder, as used in the calculation of the kappa score
- Affective/psychotic disorders
  - Depressive disorders
    - Mild
    - Moderate
    - Moderate with somatic syndrome
    - Severe
    - Severe with psychotic symptoms
  - Manic disorders
    - Hypomania
    - Mania
    - Mania with psychosis
  - Bipolar affective disorders
    - Bipolar I
    - Bipolar II
  - Schizophreniform disorders
    - Schizophrenia
    - Schizoaffective disorder, manic type
    - Schizoaffective disorder, depressed type
    - Schizoaffective disorder, bipolar type
  - Other psychotic disorders
    - Delusional disorder
    - Other non-organic psychotic disorder
- Anxiety disorders
  - Agoraphobia without panic disorder
  - Agoraphobia with panic disorder
  - Social phobia
  - Generalised anxiety disorder
  - Panic disorder
  - Obsessive-compulsive disorder
  - Post-traumatic stress disorder
- Personality disorder
  - Personality disorder, core criteria
- Substance use disorders
  - Alcohol
    - Harmful use
    - Dependence syndrome
  - Cannabis
    - Harmful use
    - Dependence syndrome
  - Opiates
    - Harmful use
    - Dependence syndrome
  - Stimulants
    - Harmful use
    - Dependence syndrome
Appendix 3
Diagnostic grouping
The measurement of interrater reliability is most conveniently performed with an assumption of equal weighting between categories. This is potentially problematic in scenarios such as ours, where the conceptual magnitude of difference between diagnoses has not been defined by objective means. For the purposes of statistical analysis we therefore divided the OPCRIT+ diagnoses into the following groups for which it was more reasonable to make the statistical assumptions required for accurate calculation of the kappa score.
Within the ICD-10 algorithms for affective and psychotic disorder the following subgrouping was used. Depressive episode, mild, moderate, moderate with somatic syndrome and severe were grouped into category 1. Bipolar affective disorder was denoted category 2. Mania and hypomania were grouped into category 3. Psychotic affective disorders were grouped into category 4. Delusional disorder and other non-organic psychotic disorders were grouped into category 5. Schizoaffective disorder, manic, depressed and bipolar type were grouped into category 6. Schizophrenia was denoted under category 7.
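The subgrouping above can be expressed as a simple lookup. The category numbers come from this appendix and the diagnosis labels are adapted from Appendix 2; placing ‘severe with psychotic symptoms’ and ‘mania with psychosis’ in category 4, and both bipolar subtypes in category 2, is an assumption of this sketch, as are the exact label strings and the function name.

```python
from collections import Counter

# Category numbers follow Appendix 3; labels are adapted from Appendix 2.
# The category 4 and category 2 assignments are assumptions of this sketch.
ICD10_GROUP = {
    "Depressive episode, mild": 1,
    "Depressive episode, moderate": 1,
    "Depressive episode, moderate with somatic syndrome": 1,
    "Depressive episode, severe": 1,
    "Bipolar I": 2,
    "Bipolar II": 2,
    "Hypomania": 3,
    "Mania": 3,
    "Depressive episode, severe with psychotic symptoms": 4,
    "Mania with psychosis": 4,
    "Delusional disorder": 5,
    "Other non-organic psychotic disorder": 5,
    "Schizoaffective disorder, manic type": 6,
    "Schizoaffective disorder, depressed type": 6,
    "Schizoaffective disorder, bipolar type": 6,
    "Schizophrenia": 7,
}
N_GROUPS = 7


def group_counts(diagnoses):
    """Count how many raters' diagnoses fall into each of the seven groups.

    `diagnoses` holds one label per rater for a single case. The result is
    a list of length 7 whose first entry is the count for category 1.
    An unrecognised label raises KeyError rather than being silently dropped.
    """
    tally = Counter(ICD10_GROUP[d] for d in diagnoses)
    return [tally.get(g, 0) for g in range(1, N_GROUPS + 1)]
```

A row produced this way for each case has the shape needed for a multirater kappa calculation with seven equally weighted categories.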
Subgrouping of anxiety, substance misuse and personality disorder algorithms was not necessary, as they were developed to be separate dichotomous outcomes. The interrater reliability results presented are therefore based on a ‘diagnosis present/diagnosis absent’ basis. Within anxiety disorders the cases comprised obsessive-compulsive disorder, agoraphobia with panic disorder and post-traumatic stress disorder. Within personality disorders the cases comprised antisocial personality disorder, emotionally unstable personality disorder (borderline type) and narcissistic personality disorder (DSM-IV). Within substance use disorders the cases comprised harmful use of or dependence on one or more of alcohol, cannabis, opioids or stimulants (restricted to cocaine, crack cocaine and amphetamines).
Acknowledgements
The authors would like to acknowledge the contributions of the following people to this work: Penelope Brown, Maite Von Heising, Surya Goudaman, David Polyakov-Nelson, Katya Polyakova-Nelson, Carlogero Longhitano, Wojtek Wojcik, Stephen Ginn, Justin Wakefield and Arun Chindripu.