Book contents
- Frontmatter
- Contents
- List of Contributors
- Preface
- 1 An Introduction to High-Throughput Bioinformatics Data
- 2 Hierarchical Mixture Models for Expression Profiles
- 3 Bayesian Hierarchical Models for Inference in Microarray Data
- 4 Bayesian Process-Based Modeling of Two-Channel Microarray Experiments: Estimating Absolute mRNA Concentrations
- 5 Identification of Biomarkers in Classification and Clustering of High-Throughput Data
- 6 Modeling Nonlinear Gene Interactions Using Bayesian MARS
- 7 Models for Probability of Under- and Overexpression: The POE Scale
- 8 Sparse Statistical Modelling in Gene Expression Genomics
- 9 Bayesian Analysis of Cell Cycle Gene Expression Data
- 10 Model-Based Clustering for Expression Data via a Dirichlet Process Mixture Model
- 11 Interval Mapping for Expression Quantitative Trait Loci
- 12 Bayesian Mixture Models for Gene Expression and Protein Profiles
- 13 Shrinkage Estimation for SAGE Data Using a Mixture Dirichlet Prior
- 14 Analysis of Mass Spectrometry Data Using Bayesian Wavelet-Based Functional Mixed Models
- 15 Nonparametric Models for Proteomic Peak Identification and Quantification
- 16 Bayesian Modeling and Inference for Sequence Motif Discovery
- 17 Identification of DNA Regulatory Motifs and Regulators by Integrating Gene Expression and Sequence Data
- 18 A Misclassification Model for Inferring Transcriptional Regulatory Networks
- 19 Estimating Cellular Signaling from Transcription Data
- 20 Computational Methods for Learning Bayesian Networks from High-Throughput Biological Data
- 21 Bayesian Networks and Informative Priors: Transcriptional Regulatory Network Models
- 22 Sample Size Choice for Microarray Experiments
- Plate section
16 - Bayesian Modeling and Inference for Sequence Motif Discovery
Published online by Cambridge University Press: 23 November 2009
Summary
Abstract
Motif discovery, which focuses on locating short sequence patterns associated with the regulation of genes in a species, leads to a class of statistical missing data problems. These problems are discussed first with reference to a hypothetical model, which serves as a point of departure for more realistic versions of the model. Some general results relating to modeling and inference through the Bayesian and/or frequentist perspectives are presented, and specific problems arising out of the underlying biology are discussed.
Introduction
The goal of motif discovery is to locate short repetitive patterns in DNA that are involved in the regulation of genes of interest. To fix ideas, let us consider the following paragraph modified from Bellhouse [4, Section 3, p. 5]:
Richard Bayes (1596–1675), a great-grandfather of Thomas Bayes, was a successful cutler in Sheffield. In 1643 Richard served in the rotating position of Master of the Company of Cutlers of Hallamshire. Richard was sufficiently well off that he sent one of his sons, Samuel Bayes (1635–1681) to Trinity College Cambridge during the Commonwealth period; Samuel obtained his degree in 1656. Another son, Joshua Bayes (1638–1703) followed in his father's footsteps in the cutlery industry, also serving as Master of the Company in 1679. Evidence of Joshua Bayes's wealth comes from the size of his house, the fact that he employed a servant and the size of the taxes that he paid. Joshua Bayes's influence may be taken from his activities in …
Imagine that a person who has never seen the English language before looks at this paragraph and tries to make sense out of it.
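One way to make this thought experiment concrete is to count how often each short run of letters recurs in the text. The sketch below is an illustration only and is not the chapter's model: the function name `count_kmers`, the window length of 5, and the shortened excerpt (dates removed) are all assumptions made here.

```python
from collections import Counter


def count_kmers(text: str, k: int) -> Counter:
    """Count every length-k run of letters in text.

    Spaces, digits, and punctuation are dropped first, so the reader
    sees only an unbroken stream of letters (hypothetical helper).
    """
    letters = "".join(ch for ch in text if ch.isalpha())
    return Counter(letters[i:i + k] for i in range(len(letters) - k + 1))


# Shortened excerpt of the paragraph above (an assumption for brevity).
excerpt = (
    "Richard Bayes, a great-grandfather of Thomas Bayes, was a successful "
    "cutler in Sheffield. Another son, Joshua Bayes, followed in his "
    "father's footsteps in the cutlery industry."
)

counts = count_kmers(excerpt, 5)

# List the most frequently recurring 5-letter patterns.
for kmer, n in counts.most_common(5):
    print(kmer, n)
```

In this toy setting a pattern stands out simply because it repeats exactly. Regulatory motifs in DNA typically recur with variation and at unknown positions, which is part of why the problem is framed statistically rather than as plain counting.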
- Type: Chapter
- Information: Bayesian Inference for Gene Expression and Proteomics, pp. 309–332
- Publisher: Cambridge University Press
- Print publication year: 2006