Kelley's Formula as a Basis for the Assessment of Reliable Change

Gerard H. Maassen

doi:10.1007/BF02294373

Published online by Cambridge University Press: 01 January 2025

Gerard H. Maassen*
Affiliation: Department of Methodology and Statistics, Faculty of Social Sciences, Utrecht University, The Netherlands
* Requests for reprints should be sent to G.H. Maassen, Utrecht University, Faculty of Social Sciences, Department of Methodology and Statistics, Post Box 80140, 3508 TC Utrecht, The Netherlands. E-mail: [email protected].

Abstract

In the literature on the measurement of change, reliable change is usually determined by means of a confidence interval around an observed value of a statistic that estimates the true change. Recent literature on the efficacy of psychotherapies has focused in particular on improving the estimation of the true change. Reliable Change Indices that incorporate the reliability-weighted measure of individual change, also known as Kelley's formula, have been proposed. In current practice, these indices are defined as the ratio of such an estimator to an intuitively appealing criterion and are then treated as statistics with a standard normal distribution. However, because the authors of these indices do not adopt an adequate standard error for the estimator, the statistical properties of the indices are unclear. This article shows that this can lead to paradoxical conclusions and derives the adjusted standard error.
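
As a rough illustration of the quantities named in the abstract, the sketch below computes Kelley's formula (an observed score shrunk toward a group mean in proportion to its reliability) and the classical reliable change index commonly attributed to Jacobson and Truax (the observed difference divided by the standard error of a difference score). It does not implement the adjusted standard error derived in the article, which is not reproduced on this page. The function names, parameter names, and all numbers are hypothetical and chosen only for illustration.

```python
import math


def kelley_estimate(observed, group_mean, reliability):
    """Kelley's formula: weight the observed score by its reliability
    and the group mean by the complement."""
    return reliability * observed + (1.0 - reliability) * group_mean


def classical_rci(pre, post, sd_pre, reliability):
    """Classical reliable change index: observed change divided by the
    standard error of a difference between two equally reliable scores."""
    sem = sd_pre * math.sqrt(1.0 - reliability)  # standard error of measurement
    se_diff = math.sqrt(2.0) * sem               # standard error of the difference
    return (post - pre) / se_diff


# Hypothetical values (assumptions, not taken from the article).
pre, post = 40.0, 30.0
observed_change = post - pre

# Reliability-weighted change: Kelley's formula applied to the difference
# score, using an assumed reliability of the difference and an assumed
# mean change in the group.
weighted_change = kelley_estimate(
    observed_change, group_mean=-4.0, reliability=0.6
)

rci = classical_rci(pre, post, sd_pre=7.5, reliability=0.8)

print(f"reliability-weighted change: {weighted_change:.2f}")
print(f"classical RCI: {rci:.2f}")
```

The abstract's point is that dividing a shrunken estimate such as `weighted_change` by a standard error meant for the raw difference does not yield a statistic with known properties; the sketch keeps the two quantities separate for that reason.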

Keywords

difference scores; reliable change index; Kelley's formula

Type: Original Paper
Information: Psychometrika, Volume 65, Issue 2, June 2000, pp. 187–197

DOI: https://doi.org/10.1007/BF02294373
Copyright: Copyright © 2000 The Psychometric Society
