Book contents
- Frontmatter
- Contents
- Preface
- 1 Introduction
- 2 Theory of Tests, p-Values, and Confidence Intervals
- 3 From Scientific Theory to Statistical Hypothesis Test
- 4 One-Sample Studies with Binary Responses
- 5 One-Sample Studies with Ordinal or Numeric Responses
- 6 Paired Data
- 7 Two-Sample Studies with Binary Responses
- 8 Assumptions and Hypothesis Tests
- 9 Two-Sample Studies with Ordinal or Numeric Responses
- 10 General Methods for Frequentist Inferences
- 11 k-Sample Studies and Trend Tests
- 12 Clustering and Stratification
- 13 Multiplicity in Testing
- 14 Testing from Models
- 15 Causality
- 16 Censoring
- 17 Missing Data
- 18 Group Sequential and Related Adaptive Methods
- 19 Testing Fit, Equivalence, and Noninferiority
- 20 Power and Sample Size
- 21 Bayesian Hypothesis Testing
- References
- Notation Index
- Concept Index
19 - Testing Fit, Equivalence, and Noninferiority
Published online by Cambridge University Press: 17 April 2022
Summary
This chapter first focuses on goodness-of-fit tests. A simple case is testing for normality (e.g., the Shapiro–Wilk test). We generally recommend against this because with large sample sizes such tests can find statistically significant departures from normality even if those departures are not important, while with small sample sizes they can miss departures that are important. We show Q-Q plots as a graphical check of the size of departures from normality. We discuss the Kolmogorov–Smirnov test for any difference between two distributions. We review goodness-of-fit tests for contingency tables (Pearson’s chi-squared test and Fisher’s exact test) and for logistic regression (the Hosmer–Lemeshow test). The rest of the chapter is devoted to equivalence and noninferiority tests. The margin of equivalence or noninferiority must be prespecified, and for noninferiority tests of a new drug against a standard, the margin should be smaller than the difference between the placebo and the standard. We discuss the constancy assumption and biocreep. We note that while poor design (poor compliance, poor study population choice, poor measurement) generally decreases power in superiority designs, it can lead to high Type I error rates in noninferiority designs.
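As a rough illustration of how a prespecified noninferiority margin enters a test, the sketch below compares the lower limit of a one-sided Wald confidence interval for a difference in proportions against the negative margin. This is one simple approach and not necessarily the method used in the chapter. The function name `noninferiority_wald`, the counts, and the margin of 0.10 are all hypothetical and chosen only for illustration, and the sketch assumes that a higher response proportion is better.

```python
from math import sqrt
from statistics import NormalDist


def noninferiority_wald(x_new, n_new, x_std, n_std, margin, alpha=0.025):
    """One-sided Wald noninferiority check for a difference in proportions.

    Assumes a higher response proportion is better and that `margin` > 0
    was prespecified. Returns (difference, lower confidence limit, decision).
    """
    p_new = x_new / n_new
    p_std = x_std / n_std
    diff = p_new - p_std
    # Unpooled standard error of the difference in proportions.
    se = sqrt(p_new * (1 - p_new) / n_new + p_std * (1 - p_std) / n_std)
    z = NormalDist().inv_cdf(1 - alpha)
    lower = diff - z * se
    # Noninferiority is claimed only if the lower limit clears -margin.
    return diff, lower, lower > -margin


# Hypothetical data: same observed proportions, two different sample sizes.
small = noninferiority_wald(80, 100, 82, 100, margin=0.10)
large = noninferiority_wald(800, 1000, 820, 1000, margin=0.10)
print(small)
print(large)
```

With these made-up counts the observed difference is the same in both calls, but only the larger study has a lower confidence limit above the negative margin, so only it supports a noninferiority claim.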
- Type: Chapter
- Information: Statistical Hypothesis Testing in Context: Reproducibility, Inference, and Science, pp. 359–376. Publisher: Cambridge University Press. Print publication year: 2022