Published online by Cambridge University Press: 21 February 2012
Every statistical model is based on explicitly or implicitly formulated assumptions. In this study we address new techniques for calculating variances and confidence intervals, analyse some statistical methods applied to modelling twinning rates, and investigate whether the improvements give more reliable results. For an observed relative frequency, the commonly used variance formula holds exactly under the assumptions that the repetitions are independent and that the probability of success is constant. The probability of a twin maternity depends not only on genetic predisposition, but also on several demographic factors, particularly ethnicity, maternal age and parity. Therefore, the assumption of constancy is questionable. The effect of grouping on the analysis of regression models for twinning rates is also considered. Our results indicate that grouping influences the efficiency of the estimates but not the estimates themselves. Recently, confidence intervals for proportions of low-incidence events have attracted renewed interest, and we present the new alternatives. These confidence intervals are slightly wider and their midpoints do not coincide with the maximum-likelihood estimate of the twinning rate, but their actual coverage is closer to the nominal one than the coverage of the traditional confidence interval. In general, our findings indicate that the traditional methods are, for the most part, satisfactorily robust and give reliable results. However, we propose that new formulae for the confidence intervals should be used. Our results are applied to twin-maternity data from Finland and Denmark.
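As an illustration of the contrast drawn above, the following minimal Python sketch compares the traditional interval, built from the usual variance p(1 - p)/n of a relative frequency, with one well-known alternative. The abstract does not name the specific new formulae, so the Wilson score interval is used here purely as an assumption; the function names and the counts (150 twin maternities among 10,000 maternities) are hypothetical and are not taken from the Finnish or Danish data.

```python
import math
from statistics import NormalDist


def wald_interval(successes, n, level=0.95):
    """Traditional interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n).

    Relies on the usual variance p(1 - p)/n, which assumes independent
    repetitions with a constant probability of success.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)
    p_hat = successes / n
    half = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half


def wilson_interval(successes, n, level=0.95):
    """Wilson score interval (an assumed stand-in for the 'new' formulae)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    p_hat = successes / n
    denom = 1 + z * z / n
    # The midpoint is pulled from p_hat towards 1/2.
    centre = (p_hat + z * z / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


# Hypothetical counts, for illustration only.
twins, maternities = 150, 10_000
wald = wald_interval(twins, maternities)
wilson = wilson_interval(twins, maternities)
print("Wald:  ", wald)
print("Wilson:", wilson)
```

With a low rate such as this one, the Wilson interval in the sketch is slightly wider than the traditional one and its midpoint lies above the observed rate, which matches the qualitative description given in the abstract.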