There are two main theories of the development of spelling ability: the stage model and the model of overlapping waves. In this paper, exploratory model-based clustering is used to analyze the responses of more than 3500 pupils to subsets of 245 items. To evaluate the two theories, the resulting clusters are ordered along a developmental dimension using an external criterion. Solutions to three statistical problems are given: (1) an algorithm that can handle large data sets and yields only non-degenerate clusters; (2) a goodness-of-fit test that is not affected by the fact that the number of possible response vectors far exceeds the number of observed response vectors; and (3) a new technique, data expunction, that can be used to evaluate goodness-of-fit tests if the missing-data mechanism is known.