5 - Preliminary analysis
Published online by Cambridge University Press: 07 September 2011
Summary
An outline is given of some of the steps needed to ensure that the data finally analysed are of appropriate quality. These include data auditing, data screening, and the use of simple graphical and tabular preliminary analyses. No rigid boundary should be drawn between such largely informal procedures and the more formal model-based analyses that are the primary focus of statistical discussion.
Introduction
While it is always preferable to start with a thoughtful and systematic exploration of any new set of data, pressure of time may tempt those analysing such data to launch into the ‘interesting’ aspects straight away. With complicated data, or even just complicated data collection processes, this usually represents a false economy of time, because complications then come to light only at a late stage. As a result, analyses have to be rerun and results adjusted.
In this chapter we consider aspects of data auditing, data screening, data cleaning and preliminary analysis. Much of this work can be described as forms of data exploration, and as such can be regarded as belonging to a continuum that includes, at the other extreme, complex statistical analysis and modelling. Owing to the fundamental importance of data screening and cleaning, guidance on ethical statistical practice, aimed perhaps particularly at official statisticians, has included the recommendation that the data cleaning and screening procedures used should be reported in publications and testimony (American Statistical Association Committee on Professional Ethics, 1999).
- Type: Chapter
- Information: Principles of Applied Statistics, pp. 75-89
- Publisher: Cambridge University Press
- Print publication year: 2011