A detailed study is made of methods available for estimating the significance of differences in the degree of pollution of water by coliform bacteria, and for obtaining practically useful estimates of such differences.
For estimating the significance of a difference in a single pair of samples it was considered preferable to employ as the variate the difference in the total number of fertile tubes, or n.f.t., rather than the difference in the so-called most probable number of bacteria per 100 ml. A table (Table 3) is included by which the significance of any difference in n.f.t. may be determined at a glance. The variate is also useful in determining whether a series of differences may be regarded as homogeneous and such as might be expected to arise by chance in a high proportion of trials if corresponding members in a series of pairs of samples had been taken from sources identical as to their degree of pollution. Table 5 is included to facilitate the application of a comprehensive test of identity, in degree of pollution, of the members of each pair of samples.
The application of the binomial distribution in testing the significance of consistency in sign of a series of differences is explained, and Table 13 is included to facilitate the application of the binomial for a series of pairs twenty or less in number. The use of the normal distribution as an approximation to the binomial for pairs over twenty in number is explained.
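The sign test described above can be sketched as follows. This is an illustrative sketch only, not the paper's Table 13: the function name `sign_test_p`, the two-sided form of the test, the exclusion of zero differences beforehand, and the continuity correction in the normal approximation are all assumptions made here.

```python
from math import comb, erfc, sqrt


def sign_test_p(n_pos, n_neg):
    """Two-sided P for consistency in sign of a series of differences.

    Assumes zero differences were dropped beforehand and that, under
    the null hypothesis, each sign is equally likely (p = 1/2).
    """
    n = n_pos + n_neg
    k = max(n_pos, n_neg)
    if n <= 20:
        # Exact binomial tail: P(X >= k) for X ~ Binomial(n, 1/2).
        tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
        return min(1.0, 2 * tail)
    # Normal approximation with a continuity correction (an assumption).
    z = (k - 0.5 - n / 2) / sqrt(n / 4)
    # erfc(z / sqrt(2)) equals the two-sided normal tail 2 * (1 - Phi(z)).
    return min(1.0, erfc(z / sqrt(2)))
```

For example, ten differences all of one sign give `sign_test_p(10, 0)`, that is 2/1024, while a five-against-five split gives 1.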
The distribution of differences in the logarithms of the ‘most probable number of bacteria per 100 ml.’, or m.p.n., between samples in a certain series, A, and those of another series, B, in an experiment carried out by Dr L. F. L. Clegg, was found to be approximately normal in form. Thus the normal test of the mean logarithmic difference in m.p.n. is applicable and the t-test is applicable to a small series of differences. The advantages of employing the logarithmic difference as the variate are pointed out.
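A minimal sketch of the t-test on logarithmic differences is given below. The function name `log_ratio_t` and the choice of base-10 logarithms are assumptions made for illustration. The code returns only the t statistic and its degrees of freedom, leaving the reader to refer these to a t table.

```python
from math import log10, sqrt
from statistics import mean, stdev


def log_ratio_t(mpn_a, mpn_b):
    """Paired t statistic for log10(m.p.n. B) - log10(m.p.n. A).

    Returns (t, degrees of freedom, back-transformed mean ratio B : A).
    All m.p.n. values must be positive, and at least two pairs are needed.
    """
    d = [log10(b) - log10(a) for a, b in zip(mpn_a, mpn_b)]
    n = len(d)
    d_bar = mean(d)
    se = stdev(d) / sqrt(n)  # standard error of the mean difference
    t = d_bar / se
    # The antilog of the mean log difference is a geometric mean ratio.
    return t, n - 1, 10 ** d_bar
```

One convenience of the logarithmic variate is visible here: the mean difference converts directly into a ratio B : A, which does not depend on the general level of pollution of the samples.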
All tests applied to Dr Clegg's data agree in showing that the degree of pollution estimated from the results of a certain method, B, was higher than that indicated by the results of another method, A. According to method A the tubes were inoculated and incubated at once at 44° C, while according to method B the inoculated tubes were incubated at 37° C, those proving fertile at that temperature being used for inoculating fresh tubes which were then incubated at 44° C. The mean ratio, B : A, was found to be 1·416, and the 0·025 points of the distribution were found to be B : A = 9·6 and A : B = 4·8 respectively.
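The quoted limits can be checked for consistency with a normal distribution of the logarithmic ratio. The sketch below assumes that the 0·025 points lie symmetrically about the mean on the logarithmic scale; the variable names are illustrative, and the implied standard deviation is derived here rather than taken from the paper.

```python
from math import log10
from statistics import NormalDist

upper = 9.6      # B : A at the upper 0.025 point (from the text)
lower = 1 / 4.8  # A : B = 4.8 re-expressed as a ratio B : A

# Standard normal abscissa cutting off 0.025 in the upper tail.
z = NormalDist().inv_cdf(0.975)

# Centre and spread of log10(B : A) implied by the two limits.
centre_log = (log10(upper) + log10(lower)) / 2
sd_log = (log10(upper) - log10(lower)) / (2 * z)

# Back-transformed centre, about 1.414, close to the reported 1.416.
centre_ratio = 10 ** centre_log
```

The close agreement suggests that the reported mean ratio is the antilogarithm of the mean logarithmic difference, although the summary does not say so explicitly.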
The choice of a level of significance is discussed.
The results of the application of some statistical tests of significance to data from a control experiment are discussed. In this experiment the members of each pair of samples were taken from the same water sample and treated by identical methods.
The methods of replacing a difference by the corresponding value of the normal abscissa and of applying the normal test of significance of a mean to the mean of such normal abscissae are explained. These methods are applied in testing the significance of a mean difference in n.f.t. and in testing the significance of a mean error from zero.
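The idea of the normal-abscissa method can be sketched as follows. It is assumed here that each difference has already been converted, from its exact distribution, into a cumulative probability strictly between 0 and 1. How the paper handles the discreteness of n.f.t. in that step is not stated in this summary, and the function name `mean_abscissa_test` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def mean_abscissa_test(cum_probs):
    """Normal test of the mean of normal abscissae.

    cum_probs: cumulative probabilities (each strictly between 0 and 1)
    of the observed differences under the null hypothesis.
    Returns (mean abscissa, standardised mean, two-sided P).
    """
    std = NormalDist()
    # Replace each probability by the corresponding normal abscissa.
    z = [std.inv_cdf(p) for p in cum_probs]
    n = len(z)
    z_mean = sum(z) / n
    # Under the null hypothesis each abscissa is a unit normal deviate,
    # so the mean has standard error 1 / sqrt(n).
    stat = z_mean * sqrt(n)
    p_two = 2 * (1 - std.cdf(abs(stat)))
    return z_mean, stat, p_two
```

Because the abscissae have unit variance under the null hypothesis, no variance need be estimated from the data, which is why the normal test rather than the t-test applies in this sketch.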
The theoretical bases of tests of the significance of differences in m.p.n. and in n.f.t. are compared.
The advantage of knowledge of the exact distribution of a single difference in tests of the significance of a series of differences is explained and examples are given which demonstrate this advantage very clearly.
The combination of results taken at different times and places for testing the significance of time or locality effects is explained, and a suggestion is made as to the use of replicated observations when the value of P for each observation is known.