An Introduction to High-Throughput Bioinformatics Data

doi:10.1017/CBO9780511584589.002

1 - An Introduction to High-Throughput Bioinformatics Data

Published online by Cambridge University Press: 23 November 2009

Keith A. Baggerly, Kevin R. Coombes and Jeffrey S. Morris

Edited by Kim-Anh Do, Peter Müller and Marina Vannucci


Kim-Anh Do: University of Texas, MD Anderson Cancer Center
Peter Müller: Swiss Federal Institute of Technology, Zürich
Marina Vannucci: Rice University, Houston

Abstract

High-throughput biological assays supply thousands of measurements per sample, and the sheer amount of related data increases the need for better models to enhance inference. Such models, however, are more effective if they take into account the idiosyncrasies associated with the specific methods of measurement: where the numbers come from. We illustrate this point by describing three different measurement platforms: microarrays, serial analysis of gene expression (SAGE), and proteomic mass spectrometry.

Introduction

In our view, high-throughput biological experiments involve three phases: experimental design, measurement and preprocessing, and postprocessing. These phases are otherwise known as deciding what you want to measure, getting the right numbers and assembling them in a matrix, and mining the matrix for information. Of these, it is primarily the middle step that is unique to the particular measurement technology employed, and it is there that we shall focus our attention. This is not meant to imply that the other steps are less important! It is still a truism that the best analysis may not be able to save you if your experimental design is poor.

We simply wish to emphasize that each type of data has its own quirks associated with the methods of measurement, and understanding these quirks allows us to craft ever more sophisticated probability models to improve our analyses. These probability models should ideally also let us exploit information across measurements made in parallel, and across samples. Crafting these models leads to the development of brand-new statistical methods, many of which are discussed in this volume.

In this chapter, we address the importance of measurement-specific methodology by discussing several approaches in detail. We cannot be all-inclusive, so we shall focus on three.

Type: Chapter
Information: Bayesian Inference for Gene Expression and Proteomics, pp. 1-39

DOI: https://doi.org/10.1017/CBO9780511584589.002

Publisher: Cambridge University Press

Print publication year: 2006
