Published online by Cambridge University Press: 14 March 2018
A systematic and time-saving procedure for the correlation of optical or other physical properties with chemical composition is outlined, and is applicable even where the composition is complex and involves several variables. The procedure is applied to anthophyllite, for which the following partial regression equations are derived:
γ = 1·7249−0·0130Si+0·0140(Ti+Fe‴+Fe″+Mn)±0·0012,
β = 1·7275−0·0142Si+0·024(Ti+Fe‴)+0·0110(Fe″+Mn)±0·0015,
α = 1·6951−0·0117Si+0·040(Ti+Fe‴)+0·0133(Fe″+Mn)±0·0025,
b(Å.) = 16·44+0·28Si−0·13Mg+0·40(Ca+Na+K)±0·04.
The a and c cell-dimensions appear to be constant, within the experimental error of the available data.
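As an illustration of how the four regression equations might be applied, the following minimal sketch evaluates them for a single composition. The function names and the sample composition are hypothetical. The sketch assumes that the composition parameters are numbers of atoms per formula unit (the abstract does not state the units), and it reads the iron term grouped with Mn in the γ equation as ferrous iron, Fe″, as in the β and α equations.

```python
def refractive_indices(si, ti, fe3, fe2, mn):
    """Return (alpha, beta, gamma) from the partial regression equations.

    fe3 is Fe''' (ferric iron) and fe2 is Fe'' (ferrous iron); all arguments
    are assumed to be atoms per formula unit.
    """
    gamma = 1.7249 - 0.0130 * si + 0.0140 * (ti + fe3 + fe2 + mn)
    beta = 1.7275 - 0.0142 * si + 0.024 * (ti + fe3) + 0.0110 * (fe2 + mn)
    alpha = 1.6951 - 0.0117 * si + 0.040 * (ti + fe3) + 0.0133 * (fe2 + mn)
    return alpha, beta, gamma


def cell_b(si, mg, ca, na, k):
    """Return the b cell-dimension in angstroms from the regression equation."""
    return 16.44 + 0.28 * si - 0.13 * mg + 0.40 * (ca + na + k)


# Hypothetical composition, invented for illustration only.
alpha, beta, gamma = refractive_indices(si=7.8, ti=0.0, fe3=0.1, fe2=1.5, mn=0.0)
b = cell_b(si=7.8, mg=5.4, ca=0.05, na=0.05, k=0.0)

# The quoted standard deviations (0.0025, 0.0015, 0.0012, and 0.04 A)
# indicate how closely such calculated values should be read.
print(f"alpha={alpha:.4f} beta={beta:.4f} gamma={gamma:.4f} b={b:.2f}")
```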
page 72 note 1 Analyses 1, 2, 3, 4, 6, 7, 8, 9, 10, 13, 14, 15, 16, 17, 20, 22, 24, 25, 26, 29, 30, 33, 34, 35, 38, 39, 40, 41, 43, 44, 45, 72, 79a, and 85 of J. C. Rabbitt (Amer. Min., 1948, vol. 33, p. 263 [M.A. 10-416]); R. Pirani, Atti (Rend.) Accad. Naz. Lincei, cl. fis. mat. nat., 1952, ser. 8, vol. 13, sem. 2, p. 83 [M.A. 12-30], and p. 170 [M.A. 12-140], and 1953, vol. 15, sem. 2, p. 422 [M.A. 12-374]; G. H. Francis, Min. Mag., 1955, vol. 30, p. 709.
page 72 note 2 J. C. Rabbitt’s nos. 1, 8, 9, 14, 17, 20, 26, 29, 30, and 43.
page 75 note 1 That is, differences between the observed optical data and the values calculated from the regression equation.
page 77 note 1 It will be noticed that the equations are homogeneous, containing no constant term. In general, a regression equation correlating one dependent variable with n independent variables will contain a constant term, making n + 1 constants, and in the derivation of the equation determinants of order n + 1 will be involved. But if all the variables are expressed as differences from the mean, the constant term becomes zero. For if we assume that the constant term is ζx, we have Xi = ζx + axAi + bxBi + cxCi + ...; and summing, ΣXi = Nζx + axΣAi + bxΣBi + cxΣCi + ... (i = 1, 2, ..., N); but if Xi, Ai, &c. are measured from their several means, the sums ΣXi, ΣAi, ... are all zero; hence ζx = 0. This elimination of the constant term reduces the order of the determinants involved in the derivation of the regression equations from n + 1 to n, which amply repays the labour of expressing all the variables as differences from their means. It will also be obvious that if the number of sets of observations, N, is less than the number of independent variables, n, the system of N equations has no definite solution; if N = n a solution is possible; and if, as will normally be the case, N > n, the equations will form an inconsistent system, from which, however, an optimum solution can be derived by the method of least squares; in what follows, we assume that N > n and apply the method of least squares.
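The argument of this note can be sketched for the simplest case of two independent variables. The example below is a minimal illustration, not the paper's own computation: the function name and the data are invented, and the dependent values are generated exactly from an assumed relation so that the recovered constants can be checked. Expressing every variable as a difference from its mean leaves only a second-order system to solve, and the constant term is recovered afterwards from the means.

```python
def fit_centered(X, A, B):
    """Least-squares fit of X = const + a*A + b*B using deviations from means."""
    n = len(X)
    mean_x, mean_a, mean_b = sum(X) / n, sum(A) / n, sum(B) / n
    # Deviations from the means: the constant term drops out of the fit.
    x = [v - mean_x for v in X]
    a_dev = [v - mean_a for v in A]
    b_dev = [v - mean_b for v in B]
    # Sums of squares and products of the deviations.
    s_aa = sum(p * p for p in a_dev)
    s_bb = sum(q * q for q in b_dev)
    s_ab = sum(p * q for p, q in zip(a_dev, b_dev))
    s_ax = sum(p * r for p, r in zip(a_dev, x))
    s_bx = sum(q * r for q, r in zip(b_dev, x))
    # Normal equations of order n = 2, solved by determinants.
    det = s_aa * s_bb - s_ab * s_ab
    coef_a = (s_ax * s_bb - s_ab * s_bx) / det
    coef_b = (s_aa * s_bx - s_ab * s_ax) / det
    # The constant follows from the means once the coefficients are known.
    const = mean_x - coef_a * mean_a - coef_b * mean_b
    return const, coef_a, coef_b


# Invented data generated from X = 1.7 - 0.013*A + 0.014*B (N = 5 > n = 2).
A = [7.5, 7.7, 7.8, 7.9, 8.0]
B = [2.0, 0.5, 1.5, 1.0, 0.2]
X = [1.7 - 0.013 * p + 0.014 * q for p, q in zip(A, B)]
const, coef_a, coef_b = fit_centered(X, A, B)
```

With real observations the fit would not be exact, and the residuals would supply the standard deviations quoted with each equation.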
page 78 note 1 This proviso will usually have been met during the preliminary selection and preparation of the data. If for any reason a different group of independent variables must be used for any particular physical quantity, the whole procedure, including the preparatory expression of the data as differences from their means, will have to be carried out separately for that physical quantity. This is exemplified in the case of anthophyllite by the data for the unit-cell dimension, b (tables I and IIB; compare table IIA).
page 82 note 1 In this connexion, it must not be forgotten that n is the number of independent variables remaining, not necessarily the original number the investigation started with; if any terms have been rejected from the regression equation as not significantly different from zero, n will be reduced accordingly.
page 83 note 1 An alternative test, less rigorously based but quite adequate for most investigations, is to accept any coefficient as probably significant if it is greater than its standard deviation. This test has the advantage of not requiring tables of Student’s ratio.
page 86 note 1 If the covariance matrix is of low order, as in the present case, it may be simpler to recompute the new matrix from the beginning rather than find it by this process. Referring back to the original matrix of sums of squares and products (the fifth-order square matrix on p. 80), the fourth and fifth columns and fourth and fifth rows, containing M and C, are simply suppressed; to unite the second and third rows and columns, they are just added together, ΣAi(Bi + Ci) = ΣAiBi + ΣAiCi, except for the four terms where the second and third columns cross the second and third rows; these four terms, ΣT², ΣF², and ΣTF (twice), are united by the relation Σ(T + F)² = ΣT² + ΣF² + 2ΣTF.
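The bookkeeping described in this note can be checked numerically. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, the five columns of deviations are invented integers, and the labels S, T, F, M, and C simply follow the order of variables implied by the note. It suppresses the rows and columns for M and C, unites those for T and F by addition, and confirms that the result equals the matrix recomputed directly from the combined variable T + F.

```python
def products_matrix(columns):
    """Matrix of sums of squares and products of the given data columns."""
    return [[sum(u * v for u, v in zip(p, q)) for q in columns] for p in columns]


def suppress(matrix, drop):
    """Remove the rows and columns whose indices are in `drop`."""
    keep = [k for k in range(len(matrix)) if k not in drop]
    return [[matrix[r][c] for c in keep] for r in keep]


def unite(matrix, i, j):
    """Add row j to row i and column j to column i, then remove j."""
    n = len(matrix)
    rows = [[matrix[r][c] + (matrix[j][c] if r == i else 0) for c in range(n)]
            for r in range(n)]
    both = [[row[c] + (row[j] if c == i else 0) for c in range(n)] for row in rows]
    return suppress(both, {j})


# Invented deviations from the means for four observations (each sums to zero).
S = [2, -1, 0, -1]
T = [1, 0, -2, 1]
F = [0, 3, -1, -2]
M = [-1, 1, 1, -1]
C = [2, -2, 1, -1]

full = products_matrix([S, T, F, M, C])         # fifth-order matrix
reduced = unite(suppress(full, {3, 4}), 1, 2)   # drop M and C, unite T and F
direct = products_matrix([S, [t + f for t, f in zip(T, F)]])
print(reduced == direct)
```

The diagonal term produced by `unite` is ΣT² + ΣF² + 2ΣTF, which is the relation quoted at the end of the note.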
page 87 note 1 The a and c cell-dimensions appear to be constant, within the experimental error of the available data.
page 87 note 2 These serve as an indication of the amounts by which the several coefficients can be varied without gravely upsetting the agreement between observed and calculated values (after appropriate adjustment of the constant term).
page 89 note 1 Where a large number of observations are available, the neglect of a variable will normally lead to fairly large residuals, which if plotted against the neglected variable will show a distinct trend. But if the neglected variable tends to follow one of those taken into account, its effect will be largely or wholly absorbed by the latter; and if there are only a few observations, false constants will probably be deduced, and the residuals will show only an irregular scatter. It is desirable that there should be at least ten times as many sets of observations as there are variables to be taken into account.
page 89 note 2 As all the observed data have been published before, and all the calculated data may be derived from the regression equations, it seemed unnecessary to print a table of observed and calculated data, but such a table has been drawn up and deposited in the library of the Mineral Department of the British Museum (Natural History), where it may be consulted, together with a full set of graphs of the residuals plotted against the composition parameters, including OH′ and F′.
page 89 note 3 The mean difference, excluding the above five analyses, is only 4° against an expected 10° as calculated from the standard deviations of α, β, and γ. If the cited 2V(+) for R. Pirani’s anthophyllite from Alpe de Brez is a misprint or error for 2V(-), there would be good agreement in this case also.
page 89 note 4 G. H. Francis and M. H. Hey (in the press).
page 91 note 1 F. Hori, Sci. Papers Coll. General Education Univ. Tokyo, 1954, vol. 4, no. 1, p. 71.