The reliability of peer review for manuscript and grant submissions: A cross-disciplinary investigation

Domenic V. Cicchetti

doi:10.1017/S0140525X00065675

Published online by Cambridge University Press: 19 May 2011

Domenic V. Cicchetti. Affiliation: VA Medical Center, West Haven, CT 06516

Abstract

The reliability of peer review of scientific documents and the evaluative criteria scientists use to judge the work of their peers are critically reexamined, with special attention to the consistently low levels of reliability that have been reported. Referees of grant proposals agree much more about what is unworthy of support than about what does have scientific value. In the case of manuscript submissions, the pattern of agreement seems to depend on whether a discipline (or subfield) is general and diffuse (e.g., cross-disciplinary physics, general fields of medicine, cultural anthropology, social psychology) or specific and focused (e.g., nuclear physics, medical specialty areas, physical anthropology, and behavioral neuroscience). In the former there is also much more agreement on rejection than on acceptance, but in the latter both the wide differential in manuscript rejection rates and the high correlation between referee recommendations and editorial decisions suggest that reviewers and editors agree more on acceptance than on rejection. Several suggestions are made for improving the reliability and quality of peer review. Further research is needed, especially in the physical sciences.

Keywords

cross-disciplinary comparisons; evaluation; grant review; manuscript reviews; peer review; quality control; reliability

Type: Target Article
Information: Behavioral and Brain Sciences, Volume 14, Issue 1, March 1991, pp. 119–135

DOI: https://doi.org/10.1017/S0140525X00065675
Copyright: © Cambridge University Press 1991

References

Abelson, P. H. (1980) Scientific communication. Science 209:60–62. [aDVC]CrossRef Google Scholar PubMed

Abramowitz, S. I.Gomes, B. & Abrarnowitz, C. V. (1975) Publish or politic: Referee bias in manuscript review. Journal of Applied Social Psychology 5:187–200. [aDVC]CrossRef Google Scholar

Abt, H. A. (1988) What happens to rejected astronomical papers? Publications of the Astronomical Society of the Pacific 100:506–08. [aDVC]CrossRef Google Scholar

Ad Hoc Working Group for Critical Appraisal of the Medical Literature (1987) A proposal for more informative abstracts of clinical articles. Annals of Internal Medicine 106:598–604. [SPL]CrossRef Google Scholar

Adair, R. K. (1981) Anonymous refereeing. Physics Today 34:13–15. [aDVC]CrossRef Google Scholar

Adair, R. K. (1982) A physics editor comments on Peters & Ceci's peer-review study. Behavioral and Brain Sciences 5:196. [aDVC]CrossRef Google Scholar

Adair, R. K. & Trigg, G. L. (1979) Editorial: Should the character of Physical Review Letters be changed? Physical Review Letters 43:1969–74. [aDVC]CrossRef Google Scholar

Allen, E. M. (1960) Why are research grant applications disapproved? Science 132:1532–34. [aDVC]CrossRef Google Scholar PubMed

Amabile, T. M. (1983) Brilliant but cruel: Perceptions of negative evaluators. Journal of Experimental Social Psychology 19:146–56. [RC]CrossRef Google Scholar

American Psychological Association (1983) Publication manual, 3rd ed. [aDVC]Google Scholar

American Psychological Association (1985) Standards for educational and psychological testing. [RFB]Google Scholar

American Psychologist (1989) Members of underrepresented groups: Reviewers for journal manuscripts wanted. American Psychologist 44:1555. [RC]Google Scholar

Anonymous (1987) The publication game: Beyond quality in the search for a lengthy vitae. Journal of Social Behavior and Personality 2:3–12 [RC]Google Scholar

Armstrong, J. S. (1980) Unintelligible management research and academic prestige. Interfaces 10:80–86. [aDVC]CrossRef Google Scholar

Armstrong, J. S. (1982a) Barriers to scientific contributions: The author's formula. Behavioral and Brain Sciences 5:197–99. [aDVC]CrossRef Google Scholar

Armstrong, J. S. (1982b) The ombudsman: Is peer review by peers as fair as it appears? Interfaces 12:62–74. [aDVC, JSA]CrossRef Google Scholar

Armstrong, J. S. (1982c) Research on scientific journals: Implications for editors and authors. Journal of Forecasting 1:83–104. [aDVC, JSA]CrossRef Google Scholar

Bailar, J. C. III & Patterson, K. (1985) Journal peer review: The need for a research agenda. The New England Journal of Medicine 312:654–57. [aDVC]CrossRef Google Scholar PubMed

Baird, J. C.Green, D. M. and Luce, R. D. (1980) Variability and sequential effects in cross-modality matching of area and loudness. Journal of Experimental Psychology: Human Perception and Performance 6:277–89. [DL]Google Scholar PubMed

Bakanic, V.McPhail, C. & Simon, R. J. (1987) The manuscript review and decision-making process. American Sociological Report 52:631–42. [aDVC, LJS]CrossRef Google Scholar

Bartko, J. J. (1966) The intraclass correlation coefficient as a measure of reliability. Psychological Reports 19:3–11. [aDVC]CrossRef Google Scholar PubMed

Bartko, J. J. (1974) Corrective note to: “The Intraclass Correlation Coefficient as a Measure of Reliability.” Psychological Reports 34:418. [aDVC]CrossRef Google Scholar

Bartko, J. J. (1976) On various intraclass correlation reliability coefficients. Psychological Bulletin 83:762–65. [aDVC]CrossRef Google Scholar

Bartko, J. J. & Carpenter, W. T. (1976) On the methods and theory of reliability. Journal of Nervous and Mental Disease 163:307–17. [aDVC]CrossRef Google Scholar PubMed

Beck, A. T. (1976) Cognitive therapy and the emotional disorders. International Universities Press. [aDVC]Google Scholar

Benwell, R. (1979) Authors anonymous? Physics Bulletin 30:288. [aDVC]CrossRef Google Scholar

Berelson, B. (1960) Graduate education in the United States. McGraw-Hill. [aDVC]Google Scholar

Bernstein, G. S. (1984) Scientific rigor, scientific integrity: A comment on Sommer & Sommer. American Psychologist 39:1316. [aDVC]CrossRef Google Scholar

Beyer, J. M. (1978) Editorial policies and practices among leading journals in four scientific fields. The Sociological Quarterly 19:68–88. [aDVC]CrossRef Google Scholar

Blashfield, R. K. (1976) Mixture model tests of cluster analysis: Accuracy of four agglomerative hierarchical methods. Psychological Bulletin 83:377–88. [rDVC]CrossRef Google Scholar

Bloch, D. A. & Kraemer, H. C. (1989) 2X2 kappa coefficients: Measures of agreement or association. Biometrics 45:269–87. [HCK]CrossRef Google Scholar PubMed

Boehme, G. (1977) Models for the development of science. In: Science, technology, and society: A cross-disciplinary perspective, ed. Spiegel-Rosing, I. & de Solla Price, D.. Sage. [aDVC]Google Scholar

Boehme, G.van der Daele, W. & Krohn, W. (1976) Finalization of science. Social Science Information 15:306–30. [aDVC]Google Scholar

Boor, M. (1986) Suggestions to improve manuscripts submitted to professional journals. American Psychologist. 41:721–22. [RC]CrossRef Google Scholar

Bornstein, R. F. (1990) Manuscript review in psychology: An alternative model. American Psychologist 45:672–73. [RFB]CrossRef Google Scholar

Bornstein, R. F. (in press) Publication politics, experimenter bias, and the replication process in social science research. Journal of Social Behavior and Personality. [RFB].Google Scholar

Bowen, D. D.Perloff, R. & Jacoby, J. (1972) Improving manuscript evaluation procedures. American Psychologist 25:221–25. [aDVC]CrossRef Google Scholar

Bozarth, H. D. & Roberts, R. R. Jr. (1972) Signifying significant significance. American Psychologist 27:774–75. [aDVC]CrossRef Google Scholar

Bradley, J. V. (1981) Pernicious publication practices. Bulletin of the Psychonomic Society 18:31–34. [aDVC]CrossRef Google Scholar

Braida, L. D. & Durlach, N. T. (1972) Intensity perception. II. Resolution in one interval paradigms. Journal of the Acoustical Society of America 51:483–502. [DL]CrossRef Google Scholar

Brennan, R. L. & Light, R. J. (1974) Measuring agreement when two observers classify people into categories not defined in advance. British Journal of Mathematical and Statistical Psychology 27:154–63. [rDVC]CrossRef Google Scholar

Broad, W. J. (1988) Science can't keep up with the flood of new journals. The New York Times, Feb. 16:C1, Cll. [JF]Google Scholar

Brook, R. J. & Stirling, W. D. (1984) Agreement between observers when the categories are not specified in advance. British Journal of Mathematical and Statistical Psychology 37:271–82. [rDVC]CrossRef Google Scholar

Byrne, C. (1980) Tutor marked assessments at the Open University: A question of reliability. Assessment in Higher Education 5:104–18. [DL]CrossRef Google Scholar

Campbell, J. P. (1982) Some remarks from the outgoing editor. Journal of Applied Psychology 67:691–700. [LLH]CrossRef Google Scholar

Carsrud, K. B. (1984) Out of the frying pan: A reply to Sommer & Sommer. American Psychologist 31:1317–18. [aDVC]CrossRef Google Scholar

Ceci, S. J. & Peters, D. (1984) How blind is blind review? American Psychologist 39:1491–94. [aDVC]CrossRef Google Scholar

Chalmers, I. (1990) Underreporting research is scientific misconduct. Journal of the American Medical Association 263:1405–08. [SPL]CrossRef Google Scholar PubMed

Chalmers, T. C.Frank, C. S. & Reitman, D. (1990) Minimizing the three stages of publication bias. Journal of the American Medical Association 263:1392–95. [SPL]CrossRef Google Scholar PubMed

Chase, J. M. (1970) Normative criteria for scientific publication. American Sociologist 5:262–65. [aDVC]Google Scholar

Chubin, D. E. (1982) Reform of peer review. Science 215:40. [aDVC]CrossRef Google Scholar PubMed

Cicchetti, D. V. (1976) Assessing interrater reliability for rating scales: Resolving some basic issues. British Journal of Psychiatry 129:452–56. [aDVC].CrossRef Google Scholar PubMed

Cicchetti, D. V. (1980)Reliability of reviews for the American Psychologist: A biostatistical assessment of the data. American Psychologist 35:300–3. [aDVC, LJS]CrossRef Google Scholar

Cicchetti, D. V. (1980)Testing the normal approximation and minimal sample size requirements of weighted kappa when the number of categories is large. Applied Psychological Measurement 5:101–04. [arDVC]CrossRef Google Scholar

Cicchetti, D. V. (1980) On peer review: “We have met the enemy and he is us.” Behavioral and Brain Sciences 5:205. [arDVC]CrossRef Google Scholar

Cicchetti, D. V. (1985) A critique of Whitehurst's “Interrater agreement for journal manuscript reviews:” De omnibus, disputandum est. American Psychologist 40:563–68. [aDVC, MED]CrossRef Google Scholar

Cicchetti, D. V. (1988) When diagnostic agreement is high, but reliability is low: Some paradoxes occurring in independent neuropsychological assessments. Journal of Clinical and Experimental Neuropsychology 10:605–22. [aDVC]CrossRef Google Scholar

Cicchetti, D. V. & Conn, H. O. (1976) A statistical analysis of reviewer agreement and bias in evaluating medical abstracts. Yale Journal of Biology and Medicine 45:373–83. [aDVC]Google Scholar

Cicchetti, D. V. & Conn, H. O. (1978) Reviewer evaluation of manuscripts submitted to medical journals. Paper presented to the American Statistical Association Meetings, san Diego, CA. (also abstracted in Biometrics [1978] 34:728) [aDVC]Google Scholar

Cicchetti, D. V. & Eron, L. D. (1979) The reliability of manuscript reviewing for the Journal of Abnormal Psychology. Proceedings of the American Statistical Association (Social Statistics Section) 22:596–600. [aDVC]Google Scholar

Cicchetti, D. V. & Feinstein, A. R. (1990) High agreement but low kappa: II. Resolving the paradoxes. Journal of Clinical Epidemiology 43:551–68. [arDVC]CrossRef Google Scholar PubMed

Cicchetti, D. V. & Fleiss, J. L. (1977) Comparison of the null distributions of weighted kappa and the C ordinal statistic. Applied Psychological Measurement 1:195–201. [aDVC]CrossRef Google Scholar

Cicchetti, D. V. & Heavens, R. (1979) RATCAT (Rater Agreement/Categorical Data). American Statistician 33:91. [aDVC]CrossRef Google Scholar

Cicchetti, D. V. & Showalter, D. (1988) A computer program for determining the reliability of dimensionally scaled data when the numbers and specific sets of examiners may vary at each assessment. Educational and Psychological Measurement 48:717–20. [aDVC]CrossRef Google Scholar

Cicchetti, D. V. & Sparrow, S. S. (1981) Developing criteria for establishing interrater reliability of specific items: Applications to assessment of adaptive behavior. American Journal of Mental Deficiency 86:127–37. [aDVC]Google Scholar PubMed

Cicchetti, D. V. & Tyrer, P. (1988) Reliability and validity of personality assessment. In: Personality disorders: Diagnosis, management and course, ed. Tyrer, P. L.. Butterworth Scientific Ltd. [rDVC]Google Scholar

Cicchetti, D. V.Aivano, S. L. & Vitale, J. (1976) A computer program for assessing the reliability and systematic bias of individual measurements. Educational and Psychological Measurement 36:761–64. [aDVC]CrossRef Google Scholar

Cicchetti, D. V.Aivano, S. L. & Vitale, J. (1977) Computer programs for assessing rater agreement and rater bias for qualitative data. Educational and Psychological Measurement 37:195–201. [aDVC]CrossRef Google Scholar

Cicchetti, D. V.Lee, C.Fontana, A. F. & Dowds, B. N. (1978) A computer program for assessing specific category-rater agreement for qualitative data. Educational and Psychological Measurement 38:805–13. [aDVC]CrossRef Google Scholar

Cicchetti, D. V.Sharma, Y. & Cotlier, E. (1982) Assessment of observer variability in the classification of human cataracts. Yale Journal of Biology and Medicine 55:81–88. [rDVC]Google Scholar PubMed

Cicchetti, D. V.Showalter, D. & Tyrer, P. (1985) The effect of number of rating-scale categories upon levels of interrater reliability: A Monte Carlo investigation. Applied Psychological Measurement 9:31–36. [aDVC]CrossRef Google Scholar

Cleary, F. R. & Edwards, D. J. (1960) The origins of the contributors to the A.E.R. during the fifties. American Economic Review 50:1011–14. [aDVC]Google Scholar

Cohen, J. (1960) A coefficient of agreement for nominal scales. Educational and Psychological Measurement 20:37–46. [aDVC, RR]CrossRef Google Scholar

Cohen, J. (1968) Weighted kappa: Nominal scale agreement with provision for scaled disagreement or partial credit. Psychological Bulletin 70:213–20. [aDVC]CrossRef Google Scholar PubMed

Cohen, J. (1988) Statistical power analysis for the behavioral sciences, 2nd ed. Lawrence Erlbaum. [rDVC, RR]Google Scholar

Cole, J. & Cole, S. (1973) Social stratification in science. University of Chicago Press. [aDVC]Google Scholar

Cole, J. & Cole, S. (1981) Peer review in the National Science Foundation: Phase U of a study. National Academy of Sciences. [aDVC]Google Scholar

Cole, J. & Cole, S. (1985) Experts' “consensus” and decision-making at the National Science Foundation. In: Selectivity in information systems: Survival of the fittest, ed. Warren, K. S.. Praeger. [aDVC]Google Scholar

Cole, S. (1978) Scientific reward systems: A comparative analysis. In: Research in sociology of knowledge, sciences, and art, ed. Jones, R. A.. JAI Press. [aDVC]Google Scholar

Cole, S. (1983) The hierarchy of the sciences. American Journal of Sociology 89:111–39. [aDVC], SCCrossRef Google Scholar

Cole, S.Cole, J. & Simon, G. A. (1981) Chance and consensus in peer review. Science 214:881–86. [aDVC, SC, LLH]CrossRef Google Scholar PubMed

Cole, S.Cole, J. & Dietrich, L. (1978) Measuring the cognitive state of scientific disciplines. In: Toward a metric of science: The advent of science indicators, ed. Elkana, Y.Lederberg, J.Merton, R. K.Thackray, A. & Zuckerman, J.. Wiley. [SC]Google Scholar

Cole, S.Rubin, L. & Cole, J. (1978) Peer review in the National Science Foundation. National Academy of Sciences. [aDVC]Google Scholar

Cole, S.Simon, G. & Cole, J. (1988) Do journal rejection rates index scientific consensus? American Sociological Review 53:152–56. [SC]CrossRef Google Scholar

Colman, A. M. (1982a) Game theory and experimental games: The study of strategic interaction. Pergamon Press. [AMC]Google Scholar

Colman, A. M. (1982b) Manuscript evaluation by journal referees and editors: Randomness or bias? Behavioral and Brain Sciences 5:205–06. [AMC]CrossRef Google Scholar

Conger, A. J. (1980) Integration and generalization of Kappa for multiple raters. Psychological Bulletin 88:322–28. [rDVC]CrossRef Google Scholar

Conger, A. J. (1985) Kappa reliabilities for continuous behaviors and events. Educational and Psychological Measurement 45:861–68. [rDVC]CrossRef Google Scholar

Conn, H. O. (1974) An experiment in blind program selection. Clinical Research 22:128–34. [aDVC]Google Scholar

Cotlier, E.Fagadau, W. & Cicchetti, D. V. (1982) Methods for evaluation of medical therapy of senile and diabetic cataracts. Transactions of the Opthalmologic Societies of the United Kingdom 102:416–22. [rDVC]Google Scholar PubMed

Cox, R. (1967) Examinations and higher education: A survey of the literature. Universities Quarterly 21:292–340. [DL]Google Scholar

Grandall, R. (1986) Peer review: Improving editorial procedures. Bio Science 36:607–09. [RC]Google Scholar

Grandall, R. (1987a) Gauntlet thrown: Publication procedures are challenged. Dialogue (APA Division 8) 1:5. [RC]Google Scholar

Grandall, R. (1987b) We need research on what constitutes good journal papers - and good editing - not guesswork on how to improve manuscripts! American Psychologist 42:407–08. [RC]CrossRef Google Scholar

Grandall, R. (1990) Improving editorial procedures. American Psychologist 45:665–66. [RC]CrossRef Google Scholar

Crane, D. (1967) The gatekeepers of science: Some factors affecting the selection of articles for scientific journals. American Sociologist 32:195–201. [aDVC]Google Scholar

Crane, D. (1972) Invisible colleges. University of Chicago Press. [DLE]Google Scholar

Cronbach, L. J. (1981) Comment on “Chance and consensus in peer review.” Science 214:1293. [LLH]Google Scholar

Culliton, B. J. (1984) Fine-tuning peer review. Science 226:1401–02. [aDVC, RG]CrossRef Google Scholar PubMed

Darley, J. M. & Latane, B. (1968) Bystander intervention in emergencies: Diffusion of responsibility. Journal of Personality and Social Psychology 8:337–83. [AMC]CrossRef Google Scholar PubMed

Darlington, R. (1980) Another peek in the file drawers (unpublished manuscript). [PHS]Google Scholar

Davies, M. & Fleiss, J. L. (1982) Measuring agreement for multinomial data. Biometrics 38:1047–51. [rDVC]CrossRef Google Scholar

DeBakey, L. & DeBakey, S. (1976) Impartial, signed reviews. New England Journal of Medicine 294:564. [aDVC]Google Scholar PubMed

Delucchi, K. L. (1983) The use and misuse of chi-square: Lewis and Burke revisited. Psychological Bulletin 94:166–76. [rDVC]CrossRef Google Scholar

Diamond, J. (1985) Variations on a theme. Nature 314:222–23. [aDVC]CrossRef Google Scholar

Dickersin, K. (1990) The existence of publication bias and risk factors for its occurrence. Journal of the American Medical Association 263:1385–89. [SPL]CrossRef Google Scholar PubMed

Doherty, M. E. & Tweney, R. D. (1988) The role of data and feedback error in inference and prediction. Final report for ARI Contract MDA903-85-K-0193. [MEG]Google Scholar

Eckberg, D. (1982) Theoretical implications of failure to detect prepublished submissions. Behavioral and Brain Sciences 5:25–26. [DLE]CrossRef Google Scholar

Eells, W. C. (1930) Reliability of reported grading of essay type examinations. Journal of Educational Psychology 21:48–52. [DL]CrossRef Google Scholar

Eichorn, D. H. & VandenBos, G. R. (1985) Dissemination of scientific and professional knowledge. American Psychologist 40:1301–16. [RFB]CrossRef Google Scholar

Eight APA journals initiate controversial blind reviewing (1972) APA Monitor, pp. 1, 5. [aDVC]Google Scholar

Epstein, W. M. (1990) Confirmatory response bias among social work journals. Science, Technology and Human Values 15:9–38. [rDVC, MJM]CrossRef Google Scholar

Estes, W. K. (1975) Some targets for mathematical psychology. Journal of Mathematical Psychology 12:263–82. [PHS]CrossRef Google Scholar

Evans, J. T.Nadjari, H. I. & Burchell, S. A. (1990) Quotational and reference accuracy in surgical journals: A continuing peer-review problem. Journal of the American Medical Association 263:1353–54. [JSA].CrossRef Google Scholar PubMed

Feinstein, A. R. (1987) Clinimetrics. Yale University Press. [rDVC]CrossRef Google Scholar

Feinstein, A. R. & Cicchetti, D. V. (1990) High agreement but low kappa: I. The problems of two paradoxes. Journal of Clinical Epidemiology 43:543–49. [arDVC]CrossRef Google Scholar PubMed

Feynman, R. P. (1985) Surely you are joking, Mr. Feynman. Bantam. [PHS]Google Scholar

Finn, R. H. (1970) A note on estimating the reliability of categorical data. Educational and Psychological Measurement 30:71–76. [aDVC]CrossRef Google Scholar

Fisher, A. (1989) Seeing atoms. Popular Science: 102–07. [JSA]CrossRef Google Scholar

Fiske, D. W. & Fogg, L. (1990) But the reviewers are making different criticisms of my paper!: Diversity and uniqueness in reviewer comments. American Psychologist 45:591–98. [rDVC, JSA]CrossRef Google Scholar

Fleiss, J. L. (1971) Measuring nominal scale agreement among many raters. Psychological Bulletin 76:378–82. [rDVC]CrossRef Google Scholar

Fleiss, J. L. (1975) Measuring agreement between two judges on the presence or absence of a trait. Biometrics 31:651–59. [aDVC]CrossRef Google Scholar PubMed

Fleiss, J. L. (1981) Statistical methods for rates and proportions, 2nd ed. Wiley. [aDVC, RR]Google Scholar

Fleiss, J. L. & Cicchetti, D. V. (1978) Inference about weighted kappa in the non-null case. Applied Psychological Measurement 2:113–17. [rDVC]CrossRef Google Scholar

Fleiss, J. L. & Cohen, J. (1973) The equivalence of weighted kappa and the intraclass correlation coefficient as measures of reliability. Educational and Psychological Measurement 33:613–19. [aDVC]CrossRef Google Scholar

Fleiss, J. L. & Cuzick, J. (1979) The reliability of dichotomous judgments: Unequal numbers of judges per subject. Applied Psychological Measurement 3:537–52. [rDVC]CrossRef Google Scholar

Fleiss, J. L.Cohen, J. & Everitt, B. S. (1969) Large sample standard errors of kappa and weighted kappa. Psychological Bulletin 72:323–37. [rDVC]CrossRef Google Scholar

Fleiss, J. L.Nee, J. C. M. & Landis, J. R. (1979) Large sample variance of kappa in the case of different sets of raters. Psychological Bulletin 86:974–77. [rDVC]CrossRef Google Scholar

Freeman, C. & Tyrer, P., eds. (1989) Research methodology in psychiatry: A beginners guide. Royal College of Psychiatrists/Gaskell Books. [PT]Google Scholar

Fuller, S. (1989). Philosophy of science and its discontents. Westview Press. [MEG]Google Scholar

Furchtgott, E. (1984) Replicate, again and again. American Psychologist 39:1315–16. [aDVC]CrossRef Google Scholar

Garber, H. L. (1984) On Sommer & Sommer. American Psychologist 31:1315. [aDVC]CrossRef Google Scholar

Garcia, C.Rosenfield, N. S.Markowitz, R. K.Seashore, J. H.Touloukian, R. J. & Cicchetti, D. V. (1987) Appendicitis in children: Accuracy of the barium enema. American Journal of Diseases of Children 141:1309–12. [rDVC]CrossRef Google Scholar PubMed

Gardner, M. J.Snee, M. P.Hall, A. J.Powell, C. A.Downes, S. & Terrell, J. D. (1990) Results of case-control study of leukaemia and lymphoma among young people near Sellafield nuclear plant in West Cumbria. British Medical Journal 300:423–29. [SPL]CrossRef Google Scholar

Garfield, E. (1972) Citation analysis as a tool in journal evaluation. Science 178:471–79. [RFB]CrossRef Google Scholar PubMed

Garfunkel, J. M.Ulshen, R. H.Hamrick, H. J. & Lawson, E. E. (1990) Problems identified by secondary review of accepted manuscripts. Journal of the American Medical Association 263:1369–71. [rDVC, SPL]CrossRef Google Scholar PubMed

Garner, W. R. (1962) Uncertainty and structure as psychological concepts. Wiley. [DL]Google Scholar

Garner, W. R. & McGill, W. J. (1956) The relation between information and variance analyses. Psychometrika 21:219–28. [arDVC], JBGCrossRef Google Scholar

Garvey, W. D.Lin, N. & Nelson, C. E. (1970) Some comparisons of communication activities in the physical and social sciences. In: Communication among scientists and engineers, ed. Nelson, C. E. & Pollock, D. K.. Health. [SC]Google Scholar

Garvey, W. D.Lin, N. & Nelson, C. E. (1979) Communication in the physical and social sciences. In: Communication: The essence of science, ed. Garvey, W. D.. Pergamon Press. [aDVC]Google Scholar

Gholson, B. & Barker, B. (1985) Kuhn, Lakatos, and Laudan: Applications in the history of physics and psychology. American Psychologist 40:755–69. [aDVC]CrossRef Google Scholar

Giere, R. N. (1988). Explaining science: A cognitive approach. University of Chicago Press. [MEG]CrossRef Google Scholar

Gillett, R. (1985) Nominal scale response agreement and rater uncertainty. British Journal of Mathematical and Statistical Psychology 38:58–66. [rDVC]CrossRef Google Scholar

Gilmore, J. B. (1979) Illusory reliability in journal reviewing. Canadian Psychological Review 20:157–58. [arDVC, JBG]CrossRef Google Scholar

Glenn, N. D. (1976) The journal article review process: Some proposals for change. American Sociologist 11:179–85. [aDVC]Google Scholar

Goodman, L. A. & Kruskal, W. H. (1954) Measures of association for cross classifications. Journal of the American Statistical Association 49:732–64. [aDVC]Google Scholar

Goodrich, D. W. (1945) An analysis of manuscripts received by the editors of the American Sociological Review from May 1, 1944, to September 1, 1945. American Sociological Review1 10:716–25. [aDVC]CrossRef Google Scholar

Goodstein, L. D. (1982) When will the editors start to edit? Behavioral and Brain Sciences 5:212–13. [LJS]CrossRef Google Scholar

Goodstein, L. D. & Brazis, K. L. (1970) Credibility of psychologists: An empirical study. Psychological Reports 27:835–38. [aDVC, JSA]CrossRef Google Scholar

Gordon, M. D. (1977) Evaluating the evaluators. New Scientist 73:342–43. [aDVC]Google Scholar

Gordon, M. D. (1978) A study of the evaluation of papers by primary journals in the U.K. University of Leicester. [LLH]Google Scholar

Gorman, M. E. (1986) How the possibility of error affects falsification on a task that models scientific problem-solving. British Journal of Psychology 77:65–79. [MEG]CrossRef Google Scholar

Gorman, M. E. (1989) Error, falsification and scientific inference: An experimental investigation. Quarterly Journal of Experimental Psychology, 41A, 385–412. [MEG]CrossRef Google Scholar

Gorman, Michael E. & Gorman, Margaret E. (1984) A comparison of disconfirmatory, confirmatory and a control strategy on Wason–s 2, 4, 6 task. Quarterly Journal of Experimental Psychology 12:129–40. [MEG]Google Scholar

Gottfredson, S. D. (1978) Evaluating psychology research reports: Dimensions, reliability, and correlates of quality judgments. American Psychologist 33:920–34. [aDVC, RFB, JBG]CrossRef Google Scholar

Green, D. M.Luce, R. D. & Duncan, J. E. (1977) Variability and sequential effects in magnitude production and estimation of auditory intensity. Perception & Psychophysics 22:450–56. [DL]CrossRef Google Scholar

Green, D. M.Luce, R. D. & Smith, A. F. (1980) Individual magnitude estimates for various distributions of signal intensity. Perception & Psychophysics 27:483–88. [DL]CrossRef Google Scholar PubMed

Greenwald, A. G. (1975) Consequences of prejudice against the null hypothesis. Psychological Bulletin 82:1–20. [aDVC, PHS]CrossRef Google Scholar

Greenwald, A. G. (1976) An editorial. Journal of Personality and Social Psychology 33:1–7. [aDVC]CrossRef Google Scholar

Greenwald, A. G., Pratkanis, A. R., Leippe, M. R. & Baumgardner, M. H. (1986) Under what conditions does theory obstruct research progress? Psychological Review 93:216–29. [aDVC]CrossRef Google Scholar PubMed

Gross, S. T. (1986) The kappa coefficient of agreement for multiple observers when the number of subjects is small. Biometrics 42:883–93. [rDVC].CrossRef Google Scholar PubMed

Grove, W. M., Andreasen, N. C., McDonald-Scott, P., Keller, M. B. & Shapiro, R. W. (1981) Reliability studies of psychiatric diagnosis: Theory and practice. Archives of General Psychiatry 38:408–13. [rDVC]CrossRef Google Scholar PubMed

Guilford, J. P. (1954) Psychometric methods, 2nd ed. McGraw-Hill. [RR]Google Scholar

Gulliksen, H. O. (1950) Theory of mental tests. Wiley. [DL, LJS]CrossRef Google Scholar

Guyatt, G. H., Townsend, M. & Berman, L. (1987) A comparison of Likert and visual analogue scales for measuring change in function. Journal of Chronic Diseases 40:1129–33. [rDVC]CrossRef Google Scholar PubMed

Hall, J. A. (1979) Author review of reviewers. American Psychologist 34:798. [aDVC]CrossRef Google Scholar

Hargens, L. L. (1988) Scholarly consensus and journal rejection rates. American Sociological Review 53:139–51. [aDVC, SC]CrossRef Google Scholar

Hargens, L. L. (1990) Variation in journal peer-review systems: Possible causes and consequences. Journal of the American Medical Association 263:1348–52. [arDVC, LLH]CrossRef Google Scholar PubMed

Hargens, L. L. & Herting, J. R. (1990a) A new approach to referees' assessments of manuscripts. Social Science Research 19:1–16. [arDVC, LLH]CrossRef Google Scholar

Hargens, L. L. & Herting, J. R. (1990b) Neglected considerations in the analysis of agreement among journal referees. Scientometrics 19:91–106. [aDVC, LLH]CrossRef Google Scholar

Harnad, S. (1979) Creative disagreement. The Sciences 19:18–20. [aDVC]CrossRef Google Scholar

Hamad, S. ed. (1983) Peer commentary on peer review: A case study in scientific quality control. Cambridge University Press (reprinted from Behavioral and Brain Sciences, vol. 5). [aDVC].Google Scholar

Harnad, S. (1985)Rational disagreement in peer review. Science, Technology &; Human Values 10(3):55–62. [aDVC, LJS].CrossRef Google Scholar

Harnad, S. (1986)Policing the paper chase. Nature 322:24–25. [aDVC, JBG].CrossRef Google Scholar

Hartog, P., Rhodes, E. C., and Burt, C. (1936) The marks of examiners. Macmillan. [DL]Google Scholar

Heavens, R. H. Jr. & Cicchetti, D. V. (1978) A computer program for calculating rater agreement and bias statistics using contingency table input. Proceedings of the American Statistical Association (Statistical Computing Section) 21:366–70. [aDVC]Google Scholar

Hendrick, C. (1976) Editorial comment. Personality and Social Psychology Bulletin 2:27–08. [aDVC]CrossRef Google Scholar

Hendrick, C. (1977) Editorial comment. Personality and Social Psychology Bulletin 3:1–2. [aDVC].CrossRef Google Scholar

Hensler, D. (1976) Perceptions of the National Science Foundation peer-review process: A report on a survey of NSF reviewers and applicants. NSF publication 77–33. [aDVC]Google Scholar

Heskin, K. (1984) The Milwaukee Project: A cautionary comment. American Psychologist 39:1316–17. [aDVC]CrossRef Google Scholar

Holt, V. E. (1985) Research briefings: Peer-review appeals system established. American Psychological Association (APA) Monitor 16:18. [aDVC]Google Scholar

Horrobin, D. F. (1990) The philosophical basis of peer review and the suppression of innovation. Journal of the American Medical Association 263:1438–41. [JSA]

Howe, M. J. A. (1982) Peer reviewing: Improve or be rejected. Behavioral and Brain Sciences 5:218–19. [aDVC]

Hubbard, R. & Armstrong, J. S. (1990) Replication and the development of marketing science. Marketing Department working paper, The Wharton School, University of Pennsylvania. [JSA]

Hughes, H. M. (1976) Letter to the editor. American Sociologist 11:178–79. [aDVC]

Hull, D. L. (1988) Science as a process. University of Chicago Press. [LLH]

Hunt, E. (1971) Psychological publications. American Psychologist 26:311. [aDVC]

Hunt, K. (1975) Do we really need more replications? Psychological Reports 36:587–93. [aDVC]

Ingelfinger, F. J. (1974) Peer review in biomedical publication. American Journal of Medicine 56:686–92. [aDVC]

Ingelfinger, F. J. (1975) Charity and peer review in publication. New England Journal of Medicine 293:1371–72. [aDVC]

Ison, J. R. (1985) The granting system and healthy research. Science 230:376. [aDVC]

Iyengar, S. & Greenhouse, J. B. (1988) Selection model and the file drawer hypothesis. Statistical Science 33:109–35. [PHS]

Jesteadt, W., Luce, R. D. & Green, D. M. (1977) Sequential effects in judgment of loudness. Journal of Experimental Psychology: Human Perception and Performance 3:92–104. [DL]

Jesteadt, W., Wier, C. C. & Green, D. M. (1977) Intensity discrimination as a function of frequency and sensation level. Journal of the Acoustical Society of America 61:169–77. [DL]

Jonckheere, A. R. (1970) Techniques for ordered contingency tables. In: Proceedings of the NUFFIC International Summer Session in Science, Het Oude Hof, ed. Riemersma, J. B. & van der Meer, H. C. The Hague. [aDVC]

Jones, R. (1974) Rights, wrongs, and referees. New Scientist 61:758–59. [aDVC]

Kahneman, D., Slovic, P. & Tversky, A., eds. (1982) Judgment under uncertainty: Heuristics and biases. Cambridge University Press. [HLR]

Kamin, L. J. (1981) The intelligence controversy, ed. Eysenck, H. J. Wiley. [PHS]

Kazdin, A. E. (1982) Single-case research designs: Methods for clinical and applied settings. Oxford University Press. [aDVC]

Kerr, S., Tolliver, J. & Petree, D. (1977) Manuscript characteristics which influence acceptance for management and social science journals. Academy of Management Journal 20:132–41. [aDVC]

Klayman, J. & Ha, Y.-W. (1987) Confirmation, disconfirmation, and information in hypothesis testing. Psychological Review 94:211–28. [MEG]

Koran, L. M. (1975a) The reliability of clinical methods, data, and judgments. New England Journal of Medicine 293:642–46. [rDVC]

Koran, L. M. (1975b) The reliability of clinical methods, data, and judgments. New England Journal of Medicine 293:695–701. [rDVC]

Koshland, D. E. Jr. (1985) Peer review of peer review. Science 228:1387. [aDVC]

Kraemer, H. C. (1980) Extension of the kappa coefficient. Biometrics 36:207–16. [rDVC]

Kraemer, H. C. (1982) Estimating false alarms and missed events from interobserver agreement: Comment on Kaye. Psychological Bulletin 92:749–54. [rDVC]

Kraemer, H. C. (1988) Assessment of 2 × 2 associations: Generalization of signal-detection methodology. The American Statistician 42:37–49. [rDVC]

Kraus, C. A. (1950) The present state of academic research. Chemical and Engineering News 28:3203–04. [aDVC]

Krippendorff, K. (1970) Bivariate agreement coefficients for reliability of data. In: Sociological methodology, ed. Borgatta, E. G. Jossey-Bass. [aDVC]

Krystal, J., Giller, E. & Cicchetti, D. V. (1986) Assessment of alexithymia in post-traumatic stress disorder and psychosomatic illness: Introduction of a reliable measure. Psychosomatic Medicine 48:84–94. [rDVC]

Kuhn, T. (1962) The structure of scientific revolutions. University of Chicago Press. [aDVC, LDN]

Lakatos, I. (1972) Falsification and the methodology of scientific research programmes. In: Criticism and the growth of knowledge, ed. Lakatos, I. & Musgrave, A. Cambridge University Press. [aDVC]

Laming, D. (1984) The relativity of “absolute” judgments. British Journal of Mathematical and Statistical Psychology 37:152–83. [DL]

Laming, D. (1990) The reliability of a certain university examination compared with the precision of absolute judgments. Quarterly Journal of Experimental Psychology 42A:239–54. [DL]

Laming, D. (in press) Reconciling Fechner and Stevens? Behavioral and Brain Sciences. [DL]

Landis, J. R. & Koch, G. G. (1977) The measurement of observer agreement for categorical data. Biometrics 33:159–74. [rDVC]

Latane, B., Williams, K. & Harkins, S. (1979) Many hands make light work: The causes and consequences of social loafing. Journal of Personality and Social Psychology 37:822–32. [AMC]

Laudan, L. (1984) Science and values: The aims of science and their role in scientific debate. University of California Press. [aDVC]

Lawlis, G. F. & Lu, E. (1972) Judgment of counseling process: Reliability, agreement, and error. Psychological Bulletin 78:17–20. [aDVC]

Lazarus, D. (1982) Interreferee agreement and acceptance rates in physics. Behavioral and Brain Sciences 5:219. [aDVC]

Lewis, D. & Burke, C. J. (1949) The use and misuse of the chi square test. Psychological Bulletin 46:433–89. [rDVC]

Leach, C. (1979) Introduction to statistics: A nonparametric approach for the social sciences. John Wiley. [aDVC]

Lindsey, D. (1977) Participation and influence in publication review proceedings. American Psychologist 32:379–86. [RFB]

Lindsey, D. (1978) The scientific publication system in social science. Jossey-Bass. [aDVC, RFB]

Lindsey, D. (1988) Assessing precision in the manuscript review process: A little better than a dice roll. Scientometrics 14:75–82. [LLH]

Lock, S. (1985) A difficult balance: Editorial peer review in medicine. ISI Press. [aDVC, JF]

Lodahl, J. B. (1970) Paradigm development as a source of consensus in scientific fields (unpublished master's thesis). [aDVC]

Lord, F. M. & Novick, M. R. (1968) Statistical theories of mental test scores. Addison-Wesley. [LLH]

Luce, R. D. (1989) A history of psychology in autobiography, vol. 8, ed. Lindzey, G. Stanford University Press. [PHS]

Luce, R. D. & Green, D. M. (1978) Two tests of a neural attention hypothesis for auditory psychophysics. Perception & Psychophysics 23:363–71. [DL]

Luce, R. D., Nosofsky, R. M., Green, D. M. & Smith, A. F. (1982) The bow and sequential effects in absolute identification. Perception & Psychophysics 32:397–408. [DL]

Machol, R. (1981) Letter to the editor. The Sciences 21:xxi. [aDVC]

Maher, B. A. (1978) A reader's, writer's, and reviewer's guide to assessing research reports in clinical psychology. Journal of Consulting and Clinical Psychology 46:835–38. [aDVC]

Mahoney, M. J. (1976) Scientist as subject: The psychological imperative. Ballinger. [LLH]

Mahoney, M. J. (1977) Publication prejudices: An experimental study of confirmatory bias in the peer review system. Cognitive Therapy and Research 1:161–75. [aDVC, LDN, PHS, SPL, JSA]

Mahoney, M. J. (1978) Publish and perish. Human Behavior 7:38–41. [aDVC]

Mahoney, M. J. (1982) Publication, politics, and scientific progress. Behavioral and Brain Sciences 5:220–21. [AMC]

Mahoney, M. J. (1985) Open exchange and epistemic progress. American Psychologist 40:29–39. [aDVC, JBG, RFB, JF]

Mahoney, M. J. (1987) Scientific publication and knowledge politics. Journal of Social Behavior and Personality 2:165–76. [RFB]

Mahoney, M. J. (1990) Bias, controversy, and abuse in the study of the scientific publication system. Science, Technology, & Human Values 15:50–55. [MJM]

Mahoney, M. J., Kazdin, A. E. & Kenigsberg, M. (1978) Getting published. Cognitive Therapy and Research 2:69–70. [aDVC]

Margulis, L. (1977) Letter to the editor: Peer review attacked. The Sciences 17:5, 31. [aDVC]

Marsh, H. W. & Ball, S. (1981) Interjudgmental reliability of reviews for the Journal of Educational Psychology. Journal of Educational Psychology 73:872–80. [aDVC, HWM]

Marsh, H. W. & Ball, S. (1989) The peer review process used to evaluate manuscripts submitted to academic journals: Interjudgmental reliability. Journal of Experimental Education 57:151–69. [HWM]

McCarthy, P., Sharpe, M. R., Spiesel, S. Z., Dolan, T. F., Forsyth, B. W., DeWitt, T. G., Fink, H. D., Baron, M. A. & Cicchetti, D. V. (1982) Observation scales to identify serious illness in febrile children. Pediatrics 70:802–09. [rDVC]

McCarthy, P. L., Sznajderman, S. D., Lustman-Findling, K., Baron, M. A., Fink, H. D., Czarkowski, N., Bauchner, H., Forsyth, B. C. & Cicchetti, D. V. (1990) Mothers' clinical judgment: A randomized trial of the acute illness observation scales. Journal of Pediatrics 116:200–06. [rDVC]

McCartney, J. L. (1978) Making sense of reviewers' comments. Paper presented to the Southern Sociological Association Meetings, New Orleans, LA. [aDVC]

McNemar, Q. (1947) Note on the sampling error of the difference between correlated proportions or percentages. Psychometrika 12:153–57. [rDVC]

McNemar, Q. (1955) Psychological statistics. Wiley. [GSW]

McNutt, R. A., Evans, A. T., Fletcher, R. H. & Fletcher, S. W. (1990) The effects of blinding on the quality of peer review. Journal of the American Medical Association 263:1371–76. [rDVC, JSA, SPL]

Merton, R. K. (1973) The sociology of science: Theoretical and empirical investigations. University of Chicago Press. [aDVC, DLE]

Meyer, G. S. (1979) Academic labor and the development of science (unpublished doctoral dissertation). State University of New York at Stony Brook. [SC]

Mezzich, J. E., Kraemer, H. C., Worthington, D. R. L. & Coffman, G. A. (1981) Assessment of agreement among several raters formulating multiple diagnoses. Journal of Psychiatric Research 16:29–39. [rDVC]

Mitroff, I. I. & Chubin, D. E. (1979) Peer review at the NSF: A dialectical policy and analysis. Social Studies of Science 9:199–232. Sage [aDVC]

Mulkay, M. (1977) Sociology of the scientific research community. In: Science, technology, and society, ed. Spiegel-Rosing, I. & de Solla Price, D. Sage. [aDVC]

Mulkay, M. & William, A. T. (1971) A sociological study of a physics department. British Journal of Sociology 22:68–80. [SC]

Murphy, R. J. L. (1978) Reliability of marking in eight GCE examinations. British Journal of Educational Psychology 48:196–200. [DL]

Murphy, R. J. L. (1982) A further report of investigations into the reliability of marking of GCE examinations. British Journal of Educational Psychology 52:58–63. [DL]

National Research Council (1988) The behavioral and social sciences: Achievements and opportunities. National Academy Press. [MJM]

Nelson, L., Satz, P., Mitrushina, M., Van Gorp, W., Cicchetti, D., Lewis, R. & Van Lancker, D. (1989) Development and validation of the Neuropsychology Behavior and Affect Profile. Psychological Assessment: A Journal of Consulting and Clinical Psychology 1:266–72. [rDVC]

Newman, H., Freeman, F. & Holzinger, K. (1937) Twins. A study of heredity and environment. University of Chicago Press. [PHS]

Newman, S. H. (1966) Improving the evaluation of submitted manuscripts. American Psychologist 21:980–81. [aDVC]

Nisbett, R. & Ross, L. (1980) Human inference: Strategies and shortcomings in human judgments. Prentice-Hall. [HLR]

Nisbett, R. E. & Wilson, T. D. (1977) The halo effect: Evidence for unconscious alteration of judgments. Journal of Personality and Social Psychology 35:250–56. [RFB]

Noble, J. H. (1974) Peer review: Quality control of applied social research. Science 185:916–21. [aDVC]

Nunnally, J. C. (1978) Psychometric theory, 2nd ed. McGraw-Hill. [aDVC]

Orlinsky, D. & Howard, K. (1978) The relation of process to outcome in psychotherapy. In: Handbook of psychotherapy and behavior change, ed. Garfield, S. & Bergin, A. John Wiley & Sons. [LDN]

Orr, R. H. & Kassab, J. (1965) Peer group judgments on scientific merit: Editorial refereeing. Paper presented to the Congress of the International Federation of Documentation, Washington, D.C. [aDVC]

Over, R. (1982) What is the source of bias in peer review? Behavioral and Brain Sciences 5:229–30. [aDVC]

Oxman, A. D., Guyatt, G. H., Singer, J., Goldsmith, C. H., Hutchison, B. G., Milner, R. A. & Streiner, D. L. (1991) Agreement among reviewers of review articles. Journal of Clinical Epidemiology 44:91–98. [rDVC]

Patterson, E. H. (1969) Evaluation of manuscripts submitted for publication. American Psychologist 24:73. [aDVC]

Patterson, K. & Bailar, J. C. III (1985) A review of journal peer review. In: Selectivity in information systems: Survival of the fittest, ed. Warren, K. S. Praeger Scientific. [aDVC]

Peters, C. (1976) Multiple submissions: Why not? American Sociologist 11:165–79. [aDVC]

Peters, D. P. & Ceci, S. J. (1982) Peer-review practices of psychological journals: The fate of published articles submitted again. Behavioral and Brain Sciences 5:187–255. [aDVC, AMC, DLE]

Pfeffer, J., Leong, A. & Strehl, K. (1977) Paradigm development and particularism: Journal publication in three scientific disciplines. Social Forces 55:938–51. [aDVC]

Physical Review & Physical Review Letters (1987) Annual report 1986. [rDVC]

Pollack, I. (1952) The information of elementary auditory displays. Journal of the Acoustical Society of America 24:745–49. [DL]

Pollack, I. (1953) The information of elementary auditory displays. II. Journal of the Acoustical Society of America 25:765–69. [DL]

Price, D. de Solla (1963) Little science, big science. Columbia University Press. [aDVC]

Reid, L. N., Soley, L. C. & Wimmer, R. D. (1981) Replications in advertising research: 1977, 1978, 1979. Journal of Advertising 10:3–13. [aDVC]

Relman, A. S. (1978) Are journals really quality filters? Rockefeller Foundation working papers (conference, May 22–23). Coping with the biomedical literature explosion: A qualitative approach. [aDVC]

Remington, M., Tyrer, P. J., Newson-Smith, J. & Cicchetti, D. V. (1979) Comparative reliability of categorical and analogue rating scales in the assessment of psychiatric symptomatology. Psychological Medicine 9:765–70. [aDVC]

Rennie, D. (1986) Guarding the guardians: A conference on editorial peer review. Journal of the American Medical Association 256:2391–92. [MJM]

Roberts, W. A. (1976) Failure to replicate visual discrimination learning with a 1-min delay of reward. Learning and Motivation 7:313–25. [TRZ]

Robertson, P. (1976) Towards open refereeing. New Scientist 71:410. [aDVC]

Robinson, W. S. (1957) The statistical measurement of agreement. American Sociological Review 22:17–25. [aDVC]

Rodman, H. (1970) The moral responsibility of journal editors and referees. American Sociologist 5:351–57. [RC]

Roediger, H. L. III (1987) The role of journal editors in the scientific process. In: Scientific excellence: Origins and assessment, ed. Jackson, D. N. & Rushton, J. P. Sage. [LLH, HLR]

Rogot, E. & Goldberg, I. D. (1966) A proposed index for measuring agreement in test-retest studies. Journal of Chronic Diseases 19:991–1006. [rDVC]

Romanczyk, R. G., Kent, R. N., Diament, C. & O'Leary, K. D. (1973) Measuring the reliability of observational data: A reactive process. Journal of Applied Behavior Analysis 6:175–84. [JDC]

Rosenfield, N. S., Ablow, R. C., Markowitz, R. I., DiPietro, M., Seashore, J. H., Touloukian, R. J. & Cicchetti, D. V. (1984) Hirschsprung Disease: Accuracy of the barium enema examination. Radiology 150:393–400. [rDVC]

Rosenthal, R. (1979) The “file drawer problem” and tolerance for null results. Psychological Bulletin 86:638–41. [PHS]

Rosenthal, R. (1984) Meta-analytic procedures for social research. Sage. [RR]

Rosenthal, R. (1987) Judgment studies: Design, analysis, and meta-analysis. Cambridge University Press. [RR]

Rosenthal, R. & Rosnow, R. L. (1984) Essentials of behavioral research: Methods and data analysis. McGraw-Hill. [RR]

Rosenthal, R. & Rosnow, R. L. (1985) Contrast analysis: Focused comparisons in the analysis of variance. Cambridge University Press. [RR]

Rosenthal, R. & Rubin, D. B. (1978) Interpersonal expectancy effects: The first 345 studies. Behavioral and Brain Sciences 3:377–415. [PHS]

Rosenthal, R. & Rubin, D. B. (1982) A simple, general purpose display of magnitude of experimental effect. Journal of Educational Psychology 74:166–69. [RR]

Rourke, B. P. & Costa, L. (1979) Editorial policy II. Journal of Clinical Neuropsychology 1:93–96. [aDVC]

Rowney, J. A. & Zenisek, T. J. (1980) Manuscript characteristics influencing reviewers' decisions. Canadian Psychology 21:17–21. [aDVC]

Roy, R. (1985) Funding science: The real defects of peer review and an alternative to it. Science, Technology, and Human Values 10:73–81. [aDVC]

Rubin, D. B. (1982) Rejection, rebuttal, revision: Some flexible features of peer review. Behavioral and Brain Sciences 5:236–37. [PHS]

Scarr, S. (1982) Anosmic peer review: A rose by another name is evidently not a rose. Behavioral and Brain Sciences 5:237–38. [aDVC]

Scarr, S. & Weber, B. L. R. (1978) The reliability of reviews for the American Psychologist. American Psychologist 33:935. [aDVC, LJS]

Schönemann, P. H. (1971) The minimum average correlation between equivalent sets of uncorrelated factors. Psychometrika 36:21–30. [PHS]

Schönemann, P. H. (1989) New questions about old heritability estimates. Bulletin of the Psychonomic Society 27:175–78. [PHS]

Schönemann, P. H. & Wang, M. M. (1972) Some new results on factor indeterminacy. Psychometrika 37:61–91. [PHS]

Scott, W. A. (1974) Interreferee agreement on some characteristics of manuscripts submitted to the Journal of Personality and Social Psychology. American Psychologist 29:698–702. [aDVC, CAK]

Sharp, D. W. (1990) What can and should be done to reduce publication bias? Journal of the American Medical Association 263:1390–91. [SPL]

Shrout, P. E., Spitzer, R. L. & Fleiss, J. L. (1987) Quantification of agreement in psychiatric diagnosis revisited. Archives of General Psychiatry 44:172–77. [rDVC]

Smart, R. (1964) The importance of negative results in psychological research. Canadian Psychologist 5a:225–32. [aDVC]

Smigel, E. O. & Ross, H. L. (1970) Factors in the editorial decision. American Sociologist 5:19–21. [aDVC]

Smith, K. (1977) Letter to the editor: Peer review defended. The Sciences 17:5. [aDVC]

Snedecor, G. W. & Cochran, W. G. (1967) Statistical methods, 6th ed. Iowa State University Press. [RR]

Snedecor, G. W. & Cochran, W. G. (1980) Statistical methods, 7th ed. Iowa State University Press. [RR]

Solomon, D. L. (1989) Editorial communication. Biometrics, June 21. [PHS]

Sommer, R. & Sommer, B. A. (1984) Reply from Sommer & Sommer. American Psychologist 39:1318–19. [aDVC]

Soper, H. V., Cicchetti, D. V., Satz, P., Light, R. & Orsini, D. L. (1988) Null hypothesis disrespect in neuropsychology: Dangers of alpha and beta errors. Journal of Clinical and Experimental Neuropsychology 10:255–70. [aDVC]

Sparrow, S. S., Balla, D. A. & Cicchetti, D. V. (1984a) The Vineland Adaptive Behavior Scales: A revision of the Vineland Social Maturity Scale by E. A. Doll. I. Survey form. American Guidance Service. [rDVC]

Sparrow, S. S., Balla, D. A. & Cicchetti, D. V. (1984b) The Vineland Adaptive Behavior Scales: A revision of the Vineland Social Maturity Scale by E. A. Doll. II. Expanded form. American Guidance Service. [rDVC]

Sparrow, S. S., Balla, D. A. & Cicchetti, D. V. (1985) The Vineland Adaptive Behavior Scales: A revision of the Vineland Social Maturity Scale by E. A. Doll. III. Classroom edition. American Guidance Service. [rDVC]

Spearman, C. (1927) The abilities of man. Macmillan. [PHS]

Spitzer, R. L. & Fleiss, J. L. (1974) A reanalysis of the reliability of psychiatric diagnosis. British Journal of Psychiatry 125:341–47. [aDVC]

Steiger, J. H. & Schönemann, P. H. (1976) A history of factor indeterminacy. In: Theory construction and data analysis in the behavioral sciences, ed. Shye, S. Jossey-Bass. [PHS]

Steinberg, M., Rounsaville, B. & Cicchetti, D. V. (1990) Interview for DSM-III-R dissociative disorders: Preliminary report on a new diagnostic instrument. American Journal of Psychiatry 147:76–82. [rDVC]

Sterling, T. D. (1959) Publication decisions and their possible effects on inferences drawn from tests of significance - or vice versa. Journal of the American Statistical Association 54:30–34. [aDVC]

Stevens, J. C. & Tulving, E. (1957) Estimations of loudness by a group of untrained observers. American Journal of Psychology 70:600–05. [DL]

Stevens, S. S. (1971) Issues in psychophysical measurement. Psychological Review 78:426–50. [DL]

Stinchcombe, A. L. & Ofshe, R. (1969) On journal editing as a probabilistic process. American Sociologist 4:116–17. [rDVC, SC]

Stumpf, W. E. (1980) Letters: “Peer” review. Science 207:822–23. [aDVC]

Summary Report of Journal Operations (1989) American Psychologist 44:1070. [aDVC]

Surwillo, W. W. (1986) Anonymous reviewing and the peer review process. American Psychologist 41:218. [aDVC]

Thomas, G. J. (1982) Perhaps it was right to reject the resubmitted manuscripts. Behavioral and Brain Sciences 5:240. [aDVC]

Thomas, H. (1985) On the “file drawer” problem (unpublished manuscript). [PHS]

Tinsley, H. E. A. & Weiss, D. J. (1975) Interrater reliability and agreement of subjective judgments. Journal of Counseling Psychology 22:358–76. [LLH]

Torgerson, W. S. (1958) Theory and methods of scaling. Wiley. [DL]

Tyrer, P., Cicchetti, D. V., Casey, P. R., Fitzpatrick, K., Oliver, R., Baiter, A., Giller, E. & Harkness, L. (1984) Cross-national reliability study of a schedule for assessing personality disorders. The Journal of Nervous and Mental Disease 172:718–21. [rDVC]

Tyrer, P., Owen, R. & Cicchetti, D. V. (1984) The Brief Scale for Anxiety: A subdivision of the Comprehensive Psychopathological Rating Scale. Journal of Neurology, Neurosurgery and Psychiatry 47:970–75. [rDVC]

Tyrer, P., Strauss, J. & Cicchetti, D. V. (1983) Temporal reliability of personality in psychiatric patients. Psychological Medicine 13:393–98. [rDVC]

Uebersax, J. S. (1981) GKAPPA: Generalized kappa coefficient. Applied Psychological Measurement 5:28. [rDVC]

Uebersax, J. S. (1982) A generalized kappa coefficient. Educational and Psychological Measurement 42:181–83. [rDVC]

Uebersax, J. S. (1989) Latent structure modeling of ordered category rating agreement. Paper presented at the annual meeting of the Psychometric Society, UCLA, Los Angeles (A Rand Corp. Note). [rDVC]

Uebersax, J. & Grove, W. (1989) Latent structure agreement analysis. Rand Corp. (A Rand Note). [rDVC]

Volkmar, F. R., Cicchetti, D. V., Dykens, E., Sparrow, S. S., Leckman, J. F. & Cohen, D. J. (1988) An evaluation of the Autism Behavior Checklist. Journal of Autism and Developmental Disorders 18:81–97. [rDVC]

Wason, P. C. (1960) On the failure to eliminate hypotheses in a conceptual task. Quarterly Journal of Experimental Psychology 12:129–40. [MEG]

Watkins, M. W. (1979) Chance and interrater agreement on manuscripts. American Psychologist 34:796–97. [aDVC]

Whitehurst, G. J. (1983) Interrater agreement for reviews for Developmental Review. Developmental Review 3:73–78. [aDVC]

Whitehurst, G. J. (1984) Interrater agreement for journal manuscript reviews. American Psychologist 39:22–28. [aDVC, MED]

Wiener, S. L., Urivetsky, M., Bregman, D., Cohen, J., Eich, R., Gootman, N., Gulotta, S., Taylor, B., Tuttle, R., Webb, W. & Wright, J. (1977) Peer review: Inter-reviewer agreement during evaluation of research grant applications. Clinical Research 25:306–11. [aDVC]

Wilson, E. B. (1928) Review of “The Abilities of Man, Their Nature and Measurement,” by C. Spearman. Science 67:244–48. [PHS]

Wilson, J. D. (1978) Peer review and publication. Journal of Clinical Investigation 61:1697–1701. [PT]

Wolff, W. M. (1970) A study of criteria for journal manuscripts. American Psychologist 25:36–39. [aDVC]

Wolff, W. M. (1973) Publication problems in psychology and an explicit evaluation schema for manuscripts. American Psychologist 28:257–61. [aDVC]

Wright, R. D. (1970) Truth and its keeper. New Scientist 45:402–04. [aDVC]

Wyer, R. S., Greenwald, A. G., Bernard, H. R., Crandall, R. & Anon. (1987) Comments on “The publication game.” Journal of Social Behavior and Personality 2:13–22. [RC]

Yotopoulos, P. A. (1961) Institutional affiliation of the contributors to three professional journals. American Economic Review 5:665–70. [aDVC]

Ziman, J. (1968) Public knowledge: The social dimension of science. Cambridge University Press. [AMC]

Ziman, J. (1976) The force of knowledge: The scientific dimension of society. Cambridge University Press. [AMC]

Zuckerman, H. & Merton, R. K. (1971) Patterns of evaluation in science: Institutionalization, structure, and functions of the referee system. Minerva 9:66–100. [aDVC, LLH]
