Book contents
- Frontmatter
- Contents
- List of Figures
- List of Tables
- List of Boxes
- Acknowledgments
- 1 Introduction
- PART I R AND BASIC STATISTICS
- 2 Introduction to R
- 3 Looking at Data – Numerical Summaries
- 4 Looking at Data – Tables
- 5 Looking at Data – Graphs
- 6 Transformations
- 7 Missing Values
- 8 Confidence Intervals and Hypothesis Testing
- 9 Relating Variables
- PART II MULTIVARIATE METHODS
- PART III ARCHAEOLOGICAL APPROACHES TO DATA
- References
- Index
7 - Missing Values
from PART I - R AND BASIC STATISTICS
Published online by Cambridge University Press: 22 July 2017
Summary
Given the fragmentary nature of the archaeological record, it should be no surprise that missing data are often an important consideration. Data are missing because specimens are fragmentary, because measurements were incorrectly recorded or not recorded at all, or because data from several different projects, each with a somewhat different recording system, are being combined. Data can also be missing because of our inability to measure values below a certain threshold. These and other factors mean that our data sets have holes, but fortunately R provides several ways of dealing with holes.
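As a minimal sketch of what "dealing with holes" can look like in base R, the example below uses a small, invented data frame (the name pts and the measurements are hypothetical, not taken from the book). It relies only on standard base R functions: is.na(), colSums(), mean() with na.rm, and complete.cases().

```r
# Hypothetical measurements (mm) for six projectile points; NA marks a hole
pts <- data.frame(
  length_mm = c(42.1, NA, 37.5, 51.0, NA, 45.3),
  width_mm  = c(18.2, 20.1, NA, 22.4, 19.0, 21.7)
)

# Count the missing values in each column
n_missing <- colSums(is.na(pts))

# Summary functions propagate NA unless told to drop missing values
mean_all   <- mean(pts$length_mm)                 # NA, because of the holes
mean_known <- mean(pts$length_mm, na.rm = TRUE)   # mean of the 4 known lengths

# Keep only the rows with no missing values (listwise deletion)
complete_rows <- pts[complete.cases(pts), ]
```

Dropping incomplete rows is the simplest option, but whether it is safe depends on why the values are missing.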
Missing data are said to be “missing completely at random” (MCAR) if the probability that a value is missing is unrelated to its value on that variable or to the values on any other variables in the analysis (Allison, 2001; McKnight et al., 2007). Essentially this means that you cannot predict when a value will be missing on a variable. You can create an MCAR data set by randomly selecting values and changing them to missing. You would not normally want to do this, but it makes the point that the data do not contain any information that would allow you to predict that a value would be missing. It is often easier to describe clearly what MCAR is not. For example, in compositional analysis, elemental data can be missing when the concentration is below the detection limits of the equipment. In this case, only small values are missing. This kind of missing data is called left censored because it is the small values that are missing. If larger projectile points are more likely to break, length measurements might be more likely to be missing for larger points. In both cases, the pattern of missing values is not random. The assumption that the data are MCAR is a strong one and will usually be difficult or impossible to confirm for archaeological data.
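The contrast between MCAR and left-censored data can be illustrated with simulated values. The sketch below is only an illustration under stated assumptions: the concentrations are drawn from an arbitrary lognormal distribution, and the detection limit of 5 is invented for the example.

```r
set.seed(42)  # arbitrary seed so the sketch is reproducible

# Simulated concentrations (ppm) for 100 hypothetical specimens
conc <- rlnorm(100, meanlog = 2, sdlog = 0.5)

# MCAR: pick 15 positions at random, ignoring the values themselves
conc_mcar <- conc
conc_mcar[sample(length(conc), 15)] <- NA

# Left censored: every value below an assumed detection limit is missing
detection_limit <- 5
conc_cens <- ifelse(conc < detection_limit, NA, conc)

# Compare the values that were removed under each mechanism
summary(conc[is.na(conc_mcar)])  # spread across the whole range
summary(conc[is.na(conc_cens)])  # all below the detection limit
```

In the MCAR version the removed values look like a random sample of the whole variable, whereas in the censored version the removed values are exactly the smallest ones, so the observed mean is biased upward.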
Missing data are said to be “missing at random” (MAR) if the probability that a value is missing is unrelated to its value on that variable after controlling for the other variables in the analysis. For example, the probability that a point is broken could depend on its thickness. If we have measurements of thickness, the missing values can be treated as MAR because the pattern of missingness is accounted for by a variable that is in the analysis.
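A MAR pattern of this kind can also be simulated. The sketch below makes several assumptions that are not in the text: the relationship between length and thickness, the logistic breakage model, and the choice that thinner points break more often are all invented for illustration.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible
n <- 200

# Thickness (mm) is assumed to be recorded for every point
thickness <- rnorm(n, mean = 6, sd = 1)

# Length (mm) is assumed to be related to thickness
length_mm <- 30 + 3 * thickness + rnorm(n, sd = 4)

# Assumed breakage model: the chance of breaking depends only on thickness
p_broken <- plogis(-1 - 1.5 * (thickness - 6))
broken <- rbinom(n, size = 1, prob = p_broken) == 1

# Length is missing for broken points
length_obs <- ifelse(broken, NA, length_mm)

# Mean thickness for points with observed (FALSE) and missing (TRUE) length
tapply(thickness, is.na(length_obs), mean)
```

Because breakage here depends only on thickness, the missing lengths are MAR by construction: once thickness is taken into account, knowing the true length adds nothing to predicting whether it is missing.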
- Type: Chapter
- Information: Quantitative Methods in Archaeology Using R, pp. 144-158. Publisher: Cambridge University Press. Print publication year: 2017