Book contents
- Frontmatter
- Dedication
- Contents
- For the student
- For the instructor
- 1 Science and statistical data analysis
- 2 Statistical summaries of data
- 3 Simple statistical inferences
- 4 Probability theory
- 5 Random variables
- 6 Estimation and maximum likelihood
- 7 Significance tests and confidence intervals
- 8 Monte Carlo methods
- Appendix A Getting started with statistical computation
- Appendix B Data case studies
- Appendix C Combinations and permutations
- Appendix D More on confidence intervals
- Appendix E Glossary
- Appendix F Notation
- References
- Index
2 - Statistical summaries of data
Published online by Cambridge University Press: 05 June 2014
Summary
The greatest value of a picture is when it forces us to notice what we never expected to see.
John Tukey (1977), statistician and pioneer of exploratory data analysis
How should you summarise a dataset? This is what descriptive statistics and statistical graphics are for. A statistic is just a number computed from a data sample. Descriptive statistics provide a means for summarising the properties of a sample of data (many numbers or values) so that the most important results can be communicated effectively (using few numbers). Numerical and graphical methods, including descriptive statistics, are used in exploratory data analysis (EDA) to simplify the uninteresting and reveal the exceptional or unexpected in data.
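As a rough illustration of reducing many values to a few numbers, the sketch below computes some common descriptive statistics with Python's standard `statistics` module. The sample values and the helper name `summarise` are hypothetical, chosen only for illustration; they do not come from the book.

```python
import statistics

# Hypothetical sample of ten measurements (illustrative values only).
sample = [4.9, 5.1, 5.0, 4.8, 5.3, 5.2, 4.7, 5.0, 5.1, 4.9]

def summarise(values):
    """Reduce a sample to a handful of descriptive statistics."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation (n - 1 divisor)
        "min": min(values),
        "max": max(values),
    }

summary = summarise(sample)
for name, value in summary.items():
    print(f"{name:>6}: {value:.3f}")
```

Six numbers stand in for the ten original values here; with a larger sample the reduction is correspondingly greater.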
Plotting data
One of the basic principles of good data analysis is: always plot the data. The brain–eye system is incredibly good at recognising patterns, identifying outliers and seeing the structure in data. Visualisation is an important part of data analysis, and when confronted with a new dataset the first step in the analysis should be to plot the data. There is a wide array of different types of statistical plot useful in data analysis, and it is important to use a plot type appropriate to the data type. Graphics are usually produced for screen or paper and so are inherently two dimensional, even if the data are not.
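A minimal sketch of the "always plot the data" principle, using only the Python standard library to draw a text histogram. The data values, the bin width and the function name `text_histogram` are assumptions made for illustration; a graphics library would normally be used instead.

```python
from collections import Counter

# Hypothetical measurements; one value (9.8) sits far from the rest.
data = [5.1, 4.9, 5.0, 5.2, 4.8, 5.0, 5.1, 9.8, 4.9, 5.3, 5.0, 4.7]

def text_histogram(values, bin_width=0.5):
    """Return (bin_start, count) pairs covering the full range of the data."""
    # Assign each value to an integer bin index, then count values per bin.
    counts = Counter(int(v // bin_width) for v in values)
    lo, hi = min(counts), max(counts)
    # Include empty bins so that gaps in the data remain visible.
    return [(k * bin_width, counts.get(k, 0)) for k in range(lo, hi + 1)]

bins = text_histogram(data)
for start, count in bins:
    print(f"{start:4.1f} | {'*' * count}")
```

Even in this crude form, the isolated bar at the far end of the axis makes the outlying value obvious at a glance, which a list of raw numbers does not.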
The variables can often be classified as explanatory or response.
- Type: Chapter
- Information: Scientific Inference: Learning from Data, pp. 14-45
- Publisher: Cambridge University Press
- Print publication year: 2013