Book contents
- Frontmatter
- Contents
- List of contributors
- Preface
- Bibliography of J. F. C. Kingman
- 1 A fragment of autobiography, 1957–1967
- 2 More uses of exchangeability: representations of complex random structures
- 3 Perfect simulation using dominated coupling from the past with application to area-interaction point processes and wavelet thresholding
- 4 Assessing molecular variability in cancer genomes
- 5 Branching out
- 6 Kingman, category and combinatorics
- 7 Long-range dependence in a Cox process directed by an alternating renewal process
- 8 Kernel methods and minimum contrast estimators for empirical deconvolution
- 9 The coalescent and its descendants
- 10 Kingman and mathematical population genetics
- 11 Characterizations of exchangeable partitions and random discrete distributions by deletion properties
- 12 Applying coupon-collecting theory to computer-aided assessments
- 13 Colouring and breaking sticks: random distributions and heterogeneous clustering
- 14 The associated random walk and martingales in random walks with stationary increments
- 15 Diffusion processes and coalescent trees
- 16 Three problems for the clairvoyant demon
- 17 Homogenization for advection-diffusion in a perforated domain
- 18 Heavy traffic on a controlled motorway
- 19 Coupling time distribution asymptotics for some couplings of the Lévy stochastic area
- 20 Queueing with neighbours
- 21 Optimal information feed
- 22 A dynamical-system picture of a simple branching-process phase transition
- Index
13 - Colouring and breaking sticks: random distributions and heterogeneous clustering
Published online by Cambridge University Press: 07 September 2011
Abstract
We begin by reviewing some probabilistic results about the Dirichlet Process and its close relatives, focussing on their implications for statistical modelling and analysis. We then introduce a class of simple mixture models in which clusters are of different ‘colours’, with statistical characteristics that are constant within colours, but different between colours. Thus cluster identities are exchangeable only within colours. The basic form of our model is a variant on the familiar Dirichlet process, and we find that much of the standard modelling and computational machinery associated with the Dirichlet process may be readily adapted to our generalisation. The methodology is illustrated with an application to the partially-parametric clustering of gene expression profiles.
Keywords: Bayesian nonparametrics, gene expression profiles, hierarchical models, loss functions, MCMC samplers, optimal clustering, partition models, Pólya urn, stick breaking
AMS subject classification (MSC2010): 60G09, 62F15, 62G99, 62H30, 62M99
Introduction
The purpose of this note is four-fold: to remind some Bayesian nonparametricians gently that closer study of some probabilistic literature might be rewarded, to encourage probabilists to think that there are statistical modelling problems worthy of their attention, to point out to all another important connection between the work of John Kingman and modern statistical methodology (the role of the coalescent in population genetics approaches to statistical genomics being the most important example; see papers by Donnelly, Ewens and Griffiths in this volume), and finally to introduce a modest generalisation of the Dirichlet process.
- Type: Chapter
- Information: Probability and Mathematical Genetics: Papers in Honour of Sir John Kingman, pp. 319-344
- Publisher: Cambridge University Press
- Print publication year: 2010