Since the 1960s, patient-reported outcomes (PROs) have become increasingly popular in the care of patients with psychosis. There is no universally accepted terminology and definition of such outcomes. In the literature the terms ‘PROs’, ‘patient-reported outcome measures’ (PROMs), ‘patient-based outcomes’, ‘patient-driven outcomes’, ‘self-rated outcomes’ and ‘subjective evaluation criteria’ have been used interchangeably. Reference Priebe, Gruyters, Heinze, Hoffmann and Jakel1–3 In recent years the term ‘PRO’ appears to be most widely used. Reference Hansson, Bjorkman and Priebe2 The US Food and Drug Administration (FDA) defined PROs as:
‘any report of the status of a patient's health condition that comes directly from the patient, without interpretation of the patient's response by a clinician or anyone else’ (p. 2). 4
Treatment satisfaction, subjective quality of life (SQoL), needs and the quality of the therapeutic relationship can be considered as four historically rooted, commonly used and important PRO concepts in the care of patients with psychosis. Reference Hansson, Bjorkman and Priebe2,Reference McCabe, Saidi and Priebe5,Reference Kilian and Angermeyer6 Whereas the list of PROs has increased steadily, their popularity has gained momentum over the past decade, partly through their intuitive appeal for stakeholder groups. 3–Reference McCabe, Saidi and Priebe5 In the UK a recent National Health Service (NHS) White Paper announced plans for new outcome assessments, in which PROs were to be used to measure the effectiveness of services. 3 Using PROs in the monitoring of outcomes of individual patients and services can also feed into the patient–clinician communication, reflective practice, quality management and service development. Reference Slade, McCrone, Kuipers, Leese, Cahill and Parabiaghi7,Reference Priebe, McCabe, Bullenkamp, Hansson, Lauber and Martinez-Leal8 However, the selection of appropriate concepts and measures often remains difficult. Further, some authors have questioned the use of PROs in patients with psychosis owing to conceptual and methodological shortcomings, Reference Atkinson, Zibin and Chuang9 with some proposing to discard them entirely. Reference Epstein, Hall, Tognetti, Son and Conant10 Against this background, this review aimed to examine the concepts and measures of four widely used PROs – treatment satisfaction, SQoL, needs for care and the quality of the therapeutic relationship – in the evaluation of care of patients with psychosis.
Method
A review of the conceptual and methodological literature on the four PROs in the care of patients with psychosis was conducted. We searched the literature systematically and also followed the recommendations for conceptual and methodological reviews to search widely in disparate sources and allow for overlap in the various stages (literature search, analysis and writing). Reference Lilford, Richardson, Stevens, Fitzpatrick, Edwards and Rock11,Reference Morgan, Burns, Fitzpatrick, Pinfold and Priebe12
Search strategy and selection criteria
A search of the academic databases EMBASE, Medline and PsycINFO was performed to identify papers that, first, reported the characteristics and psychometric properties of PRO measures to assess treatment satisfaction, SQoL, needs for care and the therapeutic relationship in the care of patients with psychosis, and second, provided definitions of concepts intended to be assessed by at least one of the identified measures. The term ‘PRO’ was used in accordance with the FDA definition given earlier. The literature search combined three groups of keywords in each database:
(a) schizophr*, psychosis OR psychoses;
(b) quality of life, subjective quality of life, treatment satisfaction, patient satisfaction, need*, therapeutic relationship, therapeutic alliance, helping alliance OR working alliance;
(c) psychometric*, validity, reliability OR responsiveness.
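The combination of the three keyword groups can be sketched as a Boolean query. This is a minimal illustration only: it assumes that terms within a group were joined with OR and that the groups were joined with AND (the text states only that the groups were 'combined'), and the quoting of multi-word phrases is generic rather than the syntax of EMBASE, Medline or PsycINFO. The names `KEYWORD_GROUPS` and `build_query` are hypothetical.

```python
# Keyword groups (a), (b) and (c) as listed in the search strategy.
KEYWORD_GROUPS = [
    ["schizophr*", "psychosis", "psychoses"],
    [
        "quality of life", "subjective quality of life",
        "treatment satisfaction", "patient satisfaction", "need*",
        "therapeutic relationship", "therapeutic alliance",
        "helping alliance", "working alliance",
    ],
    ["psychometric*", "validity", "reliability", "responsiveness"],
]


def build_query(groups):
    """Join terms with OR inside a group and groups with AND (assumed)."""
    clauses = []
    for terms in groups:
        # Quote multi-word phrases so they are searched as phrases.
        quoted = [f'"{t}"' if " " in t else t for t in terms]
        clauses.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(clauses)


QUERY = build_query(KEYWORD_GROUPS)
print(QUERY)
```

Each database has its own field tags and truncation rules, so a string of this kind would need adapting before use.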
Titles and abstracts were screened and papers retrieved to assess their relevance. Reference lists of relevant papers were inspected for additional papers. References that cited previously identified papers were searched using the ‘cited by’ option in the electronic database Web of Science. In addition to the search of academic databases, informal networks were used to identify papers.
Data extraction and synthesis
As the conceptual and methodological literature on PROs in the evaluation of treatments for psychosis is vast and disparate, a quantitative synthesis appeared neither appropriate nor feasible. The findings are presented descriptively. Although PRO measures can be distinguished according to various characteristics, we focused on the following ones: concept purported to be measured, number and content of domains, estimated completion time, response options and type (generic, condition- or disease-specific, treatment-specific and utility measures). 4,Reference Fitzpatrick, Davey, Buxton and Jones13 Numerous psychometric properties for evaluating PROs have been proposed in the literature. Reference Mokkink, Terwee, Patrick, Alonso, Stratford and Knol14 We distinguished between reliability (i.e. internal consistency, test–retest reliability, scale information), validity (i.e. content validity, including face validity, and construct validity, including structural, convergent, discriminant, cross-cultural, concurrent and predictive validity) and responsiveness. Reference Mokkink, Terwee, Patrick, Alonso, Stratford and Knol14 Given the lack of consensus on how these psychometric properties are best evaluated and findings synthesised, 4,Reference Fitzpatrick, Davey, Buxton and Jones13 we used a simple, dichotomous rating of whether or not a psychometric property had been examined for a given instrument.
Results
The results of the search strategy are summarised in Fig. 1. The search initially yielded a total of 2181 items (813 duplicates). Titles and abstracts were screened for 1368 references. Based on title and abstract sifts, 1238 references were excluded because they did not focus on the four PROs or psychosis. The number of potentially relevant references increased from 130 to 224 when additional items were added. Of these, 49 references were excluded for different reasons. Hence, from the 2181 initially identified references, only 175 were included in our review.
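The flow of references reported above can be checked with simple arithmetic. The sketch below uses only the counts given in the text; the number of items added from additional sources is not stated and is derived here as the difference between 224 and 130. Variable names are illustrative.

```python
# Counts reported in the text.
identified = 2181
duplicates = 813
excluded_on_title_abstract = 1238
after_additional_items = 224
excluded_for_other_reasons = 49

# Derived quantities.
screened = identified - duplicates                                 # titles and abstracts screened
potentially_relevant = screened - excluded_on_title_abstract       # before additional items
added_from_other_sources = after_additional_items - potentially_relevant  # not stated in the text
included = after_additional_items - excluded_for_other_reasons

print(screened, potentially_relevant, added_from_other_sources, included)
```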
Concepts and definitions
Definitions of concepts to be assessed by the identified PRO measures of treatment satisfaction, SQoL, needs for care and the therapeutic relationship are summarised in online Table DS1. Measures of treatment satisfaction for which a definition of the concept to be measured was provided all purported to assess the multidimensional satisfaction concept of a personal evaluation of healthcare services and providers as proposed by Ware et al and Ruggeri et al. Reference Ware, Snyder, Wright and Davies15,Reference Ruggeri, Dall'Agnola, Agostini and Bisoffi16 The identified SQoL measures were intended to assess a range of concepts (Table DS1). The only measure of needs that provided a definition of the concept to be measured, the Camberwell Assessment of Need (CAN), Reference Phelan, Slade, Thornicroft, Dunn, Holloway and Wykes17 purported to assess a supply and perceived need concept. Pantheoretical, Reference Bordin18 Rogerian, Reference Priebe and Gruyters19 systemic Reference Pinsof and Catherall20 and psychoanalytic Reference Freud and Strachey21,Reference Sterba22 concepts were intended to be assessed by the identified measures of the therapeutic relationship. For each of the four PROs, no single universally accepted definition could be identified. Nevertheless, there were attempts to identify a common conceptual basis. Lauer noted that:
‘There is agreement that quality of life is a multi-dimensional phenomenon and construct, aiming at a holistic or global perspective of individuals in their biopsychosocial nature’ (p. 19). Reference Lauer, Priebe, Oliver and Kaiser23
Similarly, Ware et al emphasised that treatment satisfaction is most widely measured as a multidomain concept. Reference Ware, Snyder, Wright and Davies15 However, this may imply a risk of providing non-specific or overinclusive definitions. Several PRO concepts other than the one to be measured may meet very broad definitions; for example, Stevens & Gabbay defined needs as ‘the ability to benefit in some way from health care’ (p. 21). Reference Stevens and Gabbay24 Others have found a lack of clarity of the precise nature of some PRO concepts:
‘In psychiatry, there is as yet no clearly defined concept of the therapeutic alliance’ (Catty: p. 265). Reference Catty25
A tendency was found to use terms from different theoretical backgrounds and traditions with at least slightly different connotations synonymously. For example, the term ‘therapeutic relationship’ has been used interchangeably with the terms ‘therapeutic alliance’, ‘helping alliance’ or ‘working alliance’, each of which has emerged from different lines of research. Reference Catty25 Similarly, ‘treatment satisfaction’ has been used synonymously with ‘patient satisfaction’, ‘service satisfaction’ and ‘satisfaction with care’, to name a few. Reference Ware, Snyder, Wright and Davies15,Reference Ruggeri, Dall'Agnola, Agostini and Bisoffi16 This may lead to a lack of clarity as to precisely which conceptualisation of PROs is being referred to. Reference Catty, Winfield and Clement26 Several definitions of PRO concepts were found to overlap with others (Table DS1). However, some definitions of PRO concepts did not, and contained specific elements: this applied to definitions of SQoL, Reference Zautra and Goodhard27,Reference Clare, Corney and Cairns28 needs for care, 29 and the therapeutic relationship. Reference Bordin18,Reference Freud and Strachey21,Reference Sterba22 Overall, definitions of PRO concepts were found to vary in the extent to which they included overlapping and specific aspects.
Characteristics of PRO measures
Findings on characteristics and psychometric properties of PRO measures to assess treatment satisfaction, SQoL, needs for care and the therapeutic relationship are summarised respectively in online Tables DS2–DS5. For several measures the concept that the measure was intended to assess was not provided. Most measures were generic in nature and used Likert scales. Short versions have been developed for several measures, based on conceptual and practical rather than empirical considerations. A number of measures were found to be long and time-consuming to administer: several had more than 30 items and a completion time greater than 20 min.
Several PRO measures were intended to assess multidomain concepts, with items being grouped within domains, and domains within more general PRO concepts. An overlap in the content of domains was observed across measures that were intended to assess different PROs. Specifically, the domains of measures to assess SQoL are similar and in part even identical to domains included in measures of needs. This applies likewise to measures of treatment satisfaction and the therapeutic relationship. The content of domains of treatment satisfaction and needs for care measures, and the content of treatment satisfaction and SQoL measures, show substantial overlap (Tables DS2–DS5).
Psychometric properties of PRO measures
The evaluation of the reviewed measures often included only limited information on psychometric properties in patients with psychosis (Tables DS2–DS5). The methods used to assess structural validity were largely not appropriate for ordinal data, as required for the predominantly used Likert scales. Reference Gibbons, Bock, Hedeker, Weiss, Segawa and Bhaumik30 Only for two measures, the Quality of Life Interview (QoLI) and EuroQoL-5D, Reference Lehman31–34 was there evidence on structural validity based on confirmatory factor analysis for ordinal data or item response modelling. Reference Gibbons, Bock, Hedeker, Weiss, Segawa and Bhaumik30,Reference Prieto, Novick, Sacristan and Edgell35,Reference Uttaro and Lehman36 For most measures there was no evidence on their measurement precision throughout the full range of scores; only for the QoLI was this psychometric property examined. Reference Uttaro and Lehman36 For some measures no evidence on their internal consistency, test–retest reliability and scale information as well as content, structural, discriminant, convergent, concurrent, predictive or cross-cultural validity was found in the included studies.
Empirical overlap of PRO measures
Only a few studies assessed more than one outcome at a time. They consistently suggest low discriminant validity due to an empirical overlap of measures designed to assess different outcomes. The outcomes were substantially correlated, Reference De Weert-van Oene, Havenaar and Schrijvers37–Reference Slade, Leese, Cahill, Thornicroft and Kuipers41 and a single general factor explained more than half of the variance in SQoL, needs for care and treatment satisfaction scores. Reference Hansson, Bjorkman and Priebe2,Reference Priebe, Kaiser, Huxley, Roder-Wanner and Rudolf42,Reference Fakhoury, Kaiser, Roeder-Wanner and Priebe43 The general factor has been interpreted as a general appraisal tendency of patients for positive or negative ratings across measures designed to assess different PRO concepts. Reference Priebe, Kaiser, Huxley, Roder-Wanner and Rudolf42 However, this general appraisal tendency left about half of the variance unexplained, which is potentially concept-specific. A recent study suggested a bifactor model which confirms the importance of a general appraisal tendency, but also shows the relevance of concept-specific aspects. The latter provide distinct information that is independent from both the general appraisal tendency and other concepts. Reference Reininghaus, McCabe, Burns, Croudace and Priebe44
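The distinction between a general appraisal tendency and concept-specific aspects can be illustrated with a variance decomposition. The sketch below assumes a bifactor structure with uncorrelated general and specific factors and standardised scores, so that the variance of a score is the sum of the squared general loading, the squared specific loading and a unique part. The loadings are hypothetical and are not taken from the cited studies.

```python
def variance_shares(general_loading, specific_loading):
    """Split the variance of a standardised score into general, specific and unique parts.

    Assumes orthogonal general and specific factors (bifactor structure).
    """
    general = general_loading ** 2
    specific = specific_loading ** 2
    unique = 1.0 - general - specific
    if unique < 0:
        raise ValueError("squared loadings must not sum to more than 1")
    return {"general": general, "specific": specific, "unique": unique}


# Hypothetical loadings, chosen only for illustration.
shares = variance_shares(general_loading=0.75, specific_loading=0.45)
for part, share in shares.items():
    print(f"{part}: {share:.1%}")
```

Under these assumed loadings the general factor accounts for a little over half of the variance, while a smaller share remains that is specific to the concept and independent of the general factor.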
Association with psychiatric symptoms and cognitive deficits
There was also evidence from several studies that less favourable SQoL is related to higher levels of psychopathology, including positive, negative and depressive symptoms. Reference Norholm and Bech45–Reference Vatne and Bjorkly56 For treatment satisfaction, Katsakou & Priebe reported an inverse relationship with psychiatric symptoms, Reference Katsakou and Priebe57 which is in line with other studies. Reference Katsakou, Bowers, Amos, Morriss, Rose and Wykes58 There are also a number of studies suggesting that patients with more severe psychotic symptoms have more unmet and total needs for care. Reference Grinshpoon and Ponizovsky59–Reference Ochoa, Haro, Autonell, Pendas, Teba and Marquez61 However, a more recent pooled analysis of individual patient-level data obtained from 16 studies found that symptom levels were less strongly associated with SQoL in schizophrenia than in other mental disorders. Reference Priebe, Reininghaus, McCabe, Burns, Eklund and Hansson62 A pooled analysis of associations between changes of symptoms and SQoL ratings over time identified an explained variance of only 5.5%. Reference Priebe, McCabe, Junghan, Kallert, Ruggeri and Slade63
With respect to cognitive deficits, evidence on associations with PROs remains inconsistent. Fujii et al found that better cognitive performance was associated with lower SQoL ratings in a prospective study of patients with severe and enduring psychosis, Reference Fujii, Wylie and Nathan64 which is consistent with other studies. Reference Corrigan and Buican50,Reference Addington and Addington65–Reference Wegener, Redoblado-Hodge, Lucas, Fitzgerald, Harris and Brennan70 However, Galletly et al, Ritsner and Sota found the opposite: Reference Galletly, Clark, McFarlane and Weber71–Reference Sota73 in these studies, deficits in executive functioning, attention, memory and motor skills were associated with lower SQoL. One recent study on bias of PRO ratings by psychiatric symptoms and cognitive deficits at the item level identified no effect of cognitive deficits on the responses to single items and an effect of symptoms on the responses to only two single items. The study concluded that the magnitude of any response bias through symptoms or cognitive deficits, if present, is small and unlikely to be of clinical significance. Reference Reininghaus, McCabe, Burns, Croudace and Priebe74
Discussion
Our review examining concepts and measures of four established PROs in the evaluation of treatments for psychosis generated at least three important findings. First, despite the increasing popularity of PROs with numerous concepts and measures, evidence of their methodological quality remains limited. Second, there is a considerable conceptual, operational and empirical overlap across measures designed to assess different PROs, although some concepts and measures also included aspects specific to individual PROs. Last, the influence of (or bias by) cognitive deficits and psychiatric symptoms appears limited and unlikely to be of clinical significance.
Limitations
The review has several limitations. The findings may be biased, as important references on concepts, characteristics and psychometric properties of PRO measures may have been missed. Concepts that might be relevant for one of the four PROs, but were not captured in an existing measure, were not included. The review was selective in examining concepts and measures of only four PROs and only a limited number of psychometric properties. Although Mokkink et al achieved a degree of consensus on the terminology and definitions of psychometric properties and provided guidance on data synthesis for reviews of the methodological quality of studies investigating psychometric properties of PROs, Reference Mokkink, Terwee, Patrick, Alonso, Stratford and Knol14 there is no consensus on how to synthesise findings on psychometric properties per se. We classified PRO measures according to whether or not specific psychometric properties had been examined. Given the absence of a consensus, this did not include ratings of the extent to which these psychometric properties were met. Finally, given the nature of conceptual and methodological reviews, Reference Lilford, Richardson, Stevens, Fitzpatrick, Edwards and Rock11,Reference Mokkink, Terwee, Patrick, Alonso, Stratford and Knol14 there may have been a subjective bias of the authors in the analysis and interpretation of the literature.
Methodological quality of PROs
Over the past decades numerous concepts and measures of PROs have emerged. Reference Cramer, Rosenheck, Xu, Thomas, Henderson and Charney75–Reference Van Nieuwenhuizen, Schene, Boevink and Wolf77 In contrast, our review found only limited evidence of their methodological quality. Several measures were not linked to specific concepts. A number of measures were long and time-consuming to administer. This may imply undue assessment burden on patients with psychosis as well as increased assessment costs. For most measures there was no evidence on their measurement precision throughout the full range of scores, as has been established by a few studies for observer-rated outcome measures in mental health, Reference Uher, Farmer, Maier, Rietschel, Hauser and Marusic78 and, on a larger scale, for PROs in other medical disorders. Reference Cella, Riley, Stone, Rothrock, Reeve and Yount79 The methods used to assess structural validity were largely not appropriate for ordinal data. Reference Flora and Curran80 Only a few of the reviewed studies conducted analyses based on confirmatory factor analysis for ordinal data or item response modelling. Reference Flora and Curran80,Reference Embretson and Reise81 There are several implications of treating ordinal data as continuous, including attenuated relationships among PRO items in the presence of floor or ceiling effects, presence of pseudofactors and incorrect parameter estimates. Reference Brown82 These may challenge findings on the structural validity of PRO measures. In other words, measures using Likert scales, which have not been examined with psychometric methods appropriate for ordinal data, may be impaired in their ability to summarise patients’ item responses into scores that adequately reflect their dimensional structure. This is, however, central for the use of PROs in the evaluation of care, as such scores provide the basis on which value is assigned to treatments.
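The attenuation that arises when ordinal responses are treated as continuous can be illustrated with a small simulation. This is a sketch under stated assumptions, not an analysis from the reviewed studies: two latent continuous variables with a correlation of 0.6 are assumed, each is cut into a five-point Likert-type item using thresholds chosen so that most respondents fall into the top category (a ceiling effect), and a Pearson correlation is then computed on the item scores as if they were continuous. All values and function names are hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def to_likert(value, thresholds):
    """Map a continuous latent value to a category from 1 to len(thresholds) + 1."""
    return 1 + sum(value > t for t in thresholds)


def simulate(rho=0.6, n=20000, seed=1):
    """Return (latent correlation, correlation of the discretised items)."""
    rng = random.Random(seed)
    # Assumed thresholds: most of the distribution lies above the last cut point.
    ceiling_thresholds = [-2.5, -2.0, -1.5, -1.0]
    latent_x, latent_y, item_x, item_y = [], [], [], []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x = z1
        y = rho * z1 + math.sqrt(1 - rho ** 2) * z2  # corr(x, y) = rho
        latent_x.append(x)
        latent_y.append(y)
        item_x.append(to_likert(x, ceiling_thresholds))
        item_y.append(to_likert(y, ceiling_thresholds))
    return pearson(latent_x, latent_y), pearson(item_x, item_y)


r_latent, r_ordinal = simulate()
print(f"latent r = {r_latent:.2f}, Pearson r on Likert items = {r_ordinal:.2f}")
```

In this set-up the correlation computed on the item scores is lower than the latent correlation. Methods for ordinal data, such as factor analysis based on polychoric correlations or item response modelling, are designed to avoid this attenuation.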
Conceptual, operational and empirical overlap
The conceptual, operational and empirical overlap of PROs has several implications for the validity of existing PRO measures. Campbell & Fiske, in their seminal work on discriminant and convergent validity, stated:
‘One cannot define without implying distinctions, and the verification of these distinctions is an important part of the validational process’ (p. 84). Reference Campbell and Fiske83
The verification of distinctions appears to be a part of the validational process that has been neglected by most of the research into PROs. New concepts were often proposed without assessing whether they were sufficiently distinct from existing concepts to warrant them being measured separately. This review suggests that an insufficient distinction between PROs at the conceptual level has led to a considerable overlap in the content of specific domains. This implies that, at both a conceptual and an operational level, the requirements for establishing discriminant validity were not sufficiently considered when developing PROs. Empirically this may limit the ability of established measures to capture variance specific to the given concept. Indeed, the available findings point towards substantial empirical overlap across measures. Although such overlap may reflect real associations between different PROs (e.g. one PRO influencing another), it still impairs the ability of each PRO measure to capture distinct information and, in psychometric terms, their discriminant validity. Reference Campbell and Fiske83 However, some concepts and operationalisations included aspects that were specific to one or more PROs. Recent evidence suggests that PROs may reflect both a general appraisal tendency that uniformly influences all PRO ratings in a positive or negative direction and components that are specific for each PRO. The specific information is independent of the general appraisal tendency. Maximising the specific information may be a challenge for future scale improvements.
Influence of cognitive deficits and psychiatric symptoms
In contrast to the concerns of some authors that the validity of existing PRO measures might be impaired owing to the influence of psychiatric symptoms and cognitive deficits, Reference Atkinson, Zibin and Chuang9,Reference Epstein, Hall, Tognetti, Son and Conant10 findings from our review suggested that the influence of (or bias by) cognitive deficits and psychiatric symptoms is very limited. The identified associations of PROs with symptoms and deficits do not compromise their validity as independent outcome criteria. However, all the evidence was taken from patients who consented to participate in research and were seen as capable of providing reasonable responses. Patients with high symptom levels may have been excluded from such studies, by clinicians or researchers. There is no evidence of a possible threshold of general or specific symptoms above which PROs might yield less reliable results.
Routine use of PROs
The conceptualisation and measurement of PROs in patients with psychosis are of practical relevance. These measures have an intuitive appeal for various stakeholder groups and there are calls to use them routinely across mental health services. 3 Even though evidence on the methodological quality of PROs is limited overall, there are at least five recommendations that can be made about the routine use of PROs in the evaluation of treatments for psychosis.
(a) It should be carefully considered which PRO is relevant to the aim and approach of the given service, and what the implications of its results would be for service delivery and development.
(b) The use of several PRO measures should be avoided unless they address clearly distinct domains.
(c) Measures with evidence of good psychometric properties should be preferred, although the evidence on psychometric properties is limited for most measures. Overall, measures using satisfaction-based concepts (e.g. assessing satisfaction with life domains or with treatment) have been more rigorously studied than others.
(d) In the absence of evidence showing that longer measures have superior properties, shorter measures should be prioritised to minimise the burden and costs of measurement. However, longer measures tend to be more reliable, and there can be a trade-off between brevity and psychometric quality.
(e) The influence of symptoms and cognitive deficits is unlikely to affect findings in small samples (although even a small explained variance may be relevant for research in large samples).
Future research
Despite the popularity of PROs for measuring the quality of routine mental healthcare, there are a number of conceptual and methodological shortcomings. According to our main findings, these include considerable conceptual, operational and empirical overlap across measures designed to assess different PROs, whereas the influence of cognitive deficits and psychiatric symptoms appears limited. There is a need for more rigorous research to identify short measures that assess distinct PROs, with minimal overlap and the highest possible precision. New methods such as item response modelling, item banking and computerised adaptive testing may help move this forward. Reference Reininghaus, McCabe, Burns, Croudace and Priebe74,Reference Cella, Riley, Stone, Rothrock, Reeve and Yount79 Although such methods have been infrequently used in psychiatric studies, Reference Uher, Farmer, Maier, Rietschel, Hauser and Marusic78 they have led to progress in measuring PROs in other medical conditions. A prominent example is the Patient-Reported Outcomes Measurement Information System (PROMIS). Reference Cella, Riley, Stone, Rothrock, Reeve and Yount79 Computerised adaptive testing iteratively selects the item providing the highest precision for a given patient until a desired level of precision is achieved. This minimises the number of items each patient has to complete, Reference Rose, Bjorner, Becker, Fries and Ware84 and can be implemented on handheld electronic devices. Ideally, conceptual and methodological work should be linked in future research to advance the measurement of PROs in patients with psychosis, so that concepts can both inform research and be refined on the basis of empirical data.
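The item selection loop of computerised adaptive testing can be sketched as follows. This is a simplified illustration, not the PROMIS implementation: it assumes dichotomous items following a two-parameter logistic model (Likert-type items would require a polytomous model), a standard normal prior, a grid-based posterior mean as the score estimate and the posterior standard deviation as the measure of precision. The item bank, the stopping threshold and the simulated respondent are hypothetical.

```python
import math

# Hypothetical item bank: (discrimination a, location b) per item.
ITEM_BANK = [(1.2, -2.0), (1.5, -1.0), (1.8, 0.0), (1.5, 1.0),
             (1.2, 2.0), (2.0, 0.5), (2.0, -0.5), (1.0, 0.0)]
GRID = [g / 10.0 for g in range(-40, 41)]  # latent trait grid from -4 to 4


def p_endorse(theta, a, b):
    """Two-parameter logistic probability of endorsing an item."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a two-parameter logistic item at theta."""
    p = p_endorse(theta, a, b)
    return a * a * p * (1.0 - p)


def posterior_mean_sd(responses, items):
    """Posterior mean and SD of theta on the grid, standard normal prior."""
    weights = []
    for theta in GRID:
        w = math.exp(-0.5 * theta * theta)  # unnormalised prior density
        for idx, y in responses:
            p = p_endorse(theta, *items[idx])
            w *= p if y == 1 else 1.0 - p
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(GRID, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(GRID, weights)) / total
    return mean, math.sqrt(var)


def run_cat(items, answer, target_sd=0.5):
    """Administer the most informative remaining item until the target SD is reached."""
    responses, remaining = [], list(range(len(items)))
    theta, sd = posterior_mean_sd(responses, items)
    while remaining and sd > target_sd:
        best = max(remaining, key=lambda i: item_information(theta, *items[i]))
        remaining.remove(best)
        responses.append((best, answer(best)))
        theta, sd = posterior_mean_sd(responses, items)
    return theta, sd, [i for i, _ in responses]


def simulated_patient(i):
    """Hypothetical respondent with latent level 0.8 who endorses every item located below it."""
    return 1 if 0.8 > ITEM_BANK[i][1] else 0


theta_hat, sd_hat, administered = run_cat(ITEM_BANK, simulated_patient)
print(theta_hat, sd_hat, administered)
```

The loop stops either when the posterior standard deviation falls to the target or when the item bank is exhausted, so the number of items administered depends on how informative the bank is around the respondent's level.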
Funding
This work was supported by a research training fellowship funded by the UK National Institute for Health Research to U.R. The report is independent research and the views expressed in this publication are those of the authors and not necessarily those of the NHS, the National Institute for Health Research or the Department of Health.