9 - More than two variables
Published online by Cambridge University Press: 05 June 2012
Summary
… beware of mathematicians, and all those who make empty prophecies. The danger already exists that the mathematicians have made a covenant with the devil to darken the spirit and to confine man in the bonds of Hell.
St. Augustine
Introduction
Even when we understand the benefits and hazards of correlations among pairs of variables, we are not guaranteed solid footing when it comes to understanding relationships among slightly more involved sets of variables. Consider a problem involving just three variables, such as studying the connection between the features X and Y and the outcome variable Z. Such data is the primary recurrent example studied in this chapter. Simple correlations do not easily deal with this situation, since sets of features can act in highly coordinated but not obvious ways. A common error here is to assume that such coordination among three (or more) features can be described using a single summary statistic. Moreover, such entangled data occurs frequently in biology and clinical medicine, and complex versions of this problem can involve hundreds or thousands of interacting features. This is especially true for broad-scale -omics data, which includes proteomics, genomics, and other newer biological collections that continue to be organized. Statistical learning machines are well suited to dealing with these coordination problems, and we document their utility by discussing examples of these problems. We conclude with a problem related to lists of predictive variables: how to join multiple lists together into a single, best list.
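To make the three-variable difficulty concrete, here is a minimal sketch in Python. It is not the chapter's own data: the toy dataset, the helper `pearson`, and the variable names are all assumptions chosen for illustration. The features X and Y each take the values -1 and 1 in a balanced design, and the outcome Z is their product, so Z is fully determined by X and Y acting together while every pairwise correlation is zero.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

# Hypothetical toy data (an assumption, not from the chapter):
# every combination of X and Y in {-1, 1} appears equally often.
repeats = 250
X = [-1, -1, 1, 1] * repeats
Y = [-1, 1, -1, 1] * repeats

# The outcome depends on X and Y jointly: Z = 1 when they agree, -1 otherwise.
Z = [x * y for x, y in zip(X, Y)]

r_xy = pearson(X, Y)  # feature vs. feature
r_xz = pearson(X, Z)  # feature vs. outcome
r_yz = pearson(Y, Z)  # feature vs. outcome

# Correlation of the joint term X*Y with the outcome.
r_joint = pearson([x * y for x, y in zip(X, Y)], Z)

print(f"corr(X, Y)   = {r_xy:.3f}")
print(f"corr(X, Z)   = {r_xz:.3f}")
print(f"corr(Y, Z)   = {r_yz:.3f}")
print(f"corr(X*Y, Z) = {r_joint:.3f}")
```

In this construction no single pairwise correlation hints at any relationship, yet the pair (X, Y) predicts Z perfectly. Real biomedical data will rarely be this clean, but the sketch shows why one summary statistic per pair of variables can miss coordinated structure.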
- Type: Chapter
- Information: Statistical Learning for Biomedical Data, pp. 171-197
- Publisher: Cambridge University Press
- Print publication year: 2011