Book contents
- Frontmatter
- Dedication
- Contents
- Figures
- Preface
- 1 Learning from Data, and Tools for the Task
- 2 Generalizing from Models
- 3 Multiple Linear Regression
- 4 Exploiting the Linear Model Framework
- 5 Generalized Linear Models, and Survival Analysis
- 6 Time Series Models
- 7 Multilevel Models, and Repeated Measures
- 8 Tree-Based Classification and Regression
- 9 Multivariate Data Exploration and Discrimination
- Appendix A The R System: a Brief Overview
- References
- References to R Packages
- Index of R Functions
- Index of Terms
9 - Multivariate Data Exploration and Discrimination
Published online by Cambridge University Press: 11 May 2024
Summary
This chapter moves from regression to methods that focus on the pattern presented by multiple variables, albeit with applications in regression analysis. A strong focus is on finding patterns that invite further investigation, and on replacing many variables with a much smaller number that capture important structure in the data. Methodologies discussed include principal components analysis and, more generally, multidimensional scaling; cluster analysis (the exploratory process that groups “alike” observations) and dendrogram construction; and discriminant analysis. Two sections discuss issues that arise in analyzing data, such as those from high-throughput genomics, where the aim is to determine, from perhaps thousands or tens of thousands of variables, which are shifted in value between groups in the data. A treatment of the role of balance and matching in making inferences from observational data then follows. The chapter ends with a brief introduction to methods for multiple imputation, which uses multivariate relationships to fill in missing values in incomplete observations, allowing those observations to have at least some role in a regression or other further analysis.
- Type: Chapter
- Book: A Practical Guide to Data Analysis Using R: An Example-Based Approach, pp. 400-468
- Publisher: Cambridge University Press
- Print publication year: 2024