8 - Merely two variables
Published online by Cambridge University Press: 05 June 2012
Summary
How wonderful that we have met with a paradox. Now we have some hope of making progress.
Niels Bohr
Introduction
Correlations are basic for statistical understanding. In this chapter we discuss several important examples of correlations, and what they can and cannot do. In Chapter 9 we take up more complex issues, those involving correlations and associations among several variables. We will find that statistical learning machines are an excellent environment in which to learn about connections between variables, and about the strengths and limitations of correlations.
Correlations can be used to estimate the strength of a linear relationship between two continuous variables. There are three ideas here that are central: correlations quantify linear relationships, they do so with only two variables, and the two variables are at least roughly continuous (like rainfall or temperature). If the relationship between the variables is almost linear, and we don't mind restricting ourselves to those two variables, then the prediction problem has at least one familiar solution: linear regression (see Note 1). Usually the relationship between two variables is not linear. Despite this, the relationship could still be strong, and a correlation analysis could be misleading. A statistical learning machine approach may be more likely to uncover the relationship. As a prelude to later results we show this by applying a binary decision tree to one of the examples in this chapter.
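The point that a strong but nonlinear relationship can hide from a correlation analysis can be illustrated with a small sketch. The data below are synthetic and hypothetical (a parabola on a symmetric grid), not one of the chapter's examples, and the tree is a minimal hand-written regression tree on a single predictor, not the authors' implementation. The helper names `pearson`, `fit_tree`, and `predict` are assumptions introduced for illustration only.

```python
import statistics

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def sse(ys):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not ys:
        return 0.0
    m = statistics.fmean(ys)
    return sum((y - m) ** 2 for y in ys)

def fit_tree(xs, ys, depth):
    """Greedy binary regression tree on one predictor.

    Returns ("leaf", mean) or ("split", threshold, left, right).
    """
    if depth == 0 or len(set(xs)) < 2:
        return ("leaf", statistics.fmean(ys))
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical x values
        cost = sse([y for _, y in pairs[:i]]) + sse([y for _, y in pairs[i:]])
        if best is None or cost < best[0]:
            best = (cost, i)
    i = best[1]
    threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
    lx, ly = zip(*pairs[:i])
    rx, ry = zip(*pairs[i:])
    return ("split", threshold,
            fit_tree(lx, ly, depth - 1),
            fit_tree(rx, ry, depth - 1))

def predict(tree, x):
    """Walk the tree down to a leaf and return its mean."""
    while tree[0] == "split":
        _, threshold, left, right = tree
        tree = left if x <= threshold else right
    return tree[1]

# Hypothetical data: y depends on x exactly, but not linearly.
xs = [i / 100 for i in range(-100, 101)]
ys = [x * x for x in xs]

r = pearson(xs, ys)                      # close to zero by symmetry
tree = fit_tree(xs, ys, depth=3)         # at most 8 leaves
residual = sum((y - predict(tree, x)) ** 2 for x, y in zip(xs, ys))
r2 = 1 - residual / sse(ys)              # share of variance the tree explains

print(f"Pearson r = {r:.3f}, tree R^2 = {r2:.3f}")
```

Under these assumptions the correlation is essentially zero even though y is completely determined by x, while a shallow tree recovers most of the variation by splitting the x-axis into intervals. The R^2 here is computed on the training data, so it shows that the structure is detectable rather than how well the tree would predict new observations.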
Type: Chapter
Information: Statistical Learning for Biomedical Data, pp. 157-170
Publisher: Cambridge University Press
Print publication year: 2011