Some Thoughts on the Analysis of Numerical Data

Published online by Cambridge University Press: 18 August 2016

L. G. K. Starke
Affiliation:
Government Actuary's Department

Extract

This paper is the record of an attempt by an ordinary actuary, who entered the profession in days when statistics were but a nugatory ingredient in the examination syllabus, to get a little clearer in his mind about some of the similarities and differences between the traditional technique of the actuary and the methods which have been developed for dealing with statistical material in other fields. It seemed that the result might be of some general interest, and it is from this point of view that the paper is submitted to the Institute.

Type
Research Article
Copyright
Copyright © Institute and Faculty of Actuaries 1949

References

page 183 note * See note on p. 217.

page 186 note * Although it may have certain empirical advantages, e.g. in the computation of joint-life functions.

page 190 note * Although it serves the present purpose, I am not sure that this description of a mechanism exactly definable in mathematical terms is really complete. In our present state of knowledge, we find that most of the phenomena which we can measure are subject to change; is it not therefore possible to imagine that under conditions in which everything could be fully explained in terms of cause and effect there might be no such thing as a constant? If so, the algebraical formula expressing the effect in terms of the causes would consist solely of variables and the signs +, −, × and ÷; in other words, such an expression as [expression not reproduced] (the x's being ‘variables’ and the a's ‘constants’ in present nomenclature) might merely represent the best attempt we can yet make at, say,

[equation not reproduced]

or something equally horrific; x₁ being the only factor which we can at present discern as having an influence upon x₂, and x₃, …, x₈ being factors which we have not so far thought of importing into our analysis but which, when blended in the manner indicated within the brackets, always produce (on the scale of accuracy to which we are working) the same result. There can, of course, be no question of ‘solving’ such an equation in the sense in which we would find a₁ and a₂ from [expression not reproduced] with the aid of two or more sets of x₁, x₂; it would seem that both the structure of the right-hand side and the ingredients of which it is composed would have to be determined, if they ever could be determined, by trial and error on the basis of a priori reasoning. The whole notion seems rather fantastic; yet it cannot be denied that when we discover that something, hitherto regarded as invariable, is really the net result of variable factors we consider that we have added to our knowledge of the universe.

page 193 note * It is this purely additive connexion between the terms which, as I understand it, gives rise to the use of terms such as ‘linear’ and ‘plane’ no matter how many variables are involved. Thus log y = log x₁ + 2 log x₂ is describable as a linear regression equation or as representing a plane of regression although the corresponding expression y = x₁x₂² connotes a curved surface.
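
The point of this note can be checked numerically. The following is a minimal sketch, not part of the original paper: it assumes NumPy is available, uses made-up sample values, and fits an ordinary least-squares plane to the logarithms of data generated from the curved surface y = x1 * x2^2. The variable names are illustrative only.

```python
import numpy as np

# Illustrative data (an assumption, not from the paper): points on the
# curved surface y = x1 * x2**2.
rng = np.random.default_rng(0)
x1 = rng.uniform(1.0, 5.0, size=50)
x2 = rng.uniform(1.0, 5.0, size=50)
y = x1 * x2**2

# In the logarithms the relation is purely additive:
#   log y = 0 + 1 * log x1 + 2 * log x2
# so a linear ("plane") regression on the logs recovers it.
design = np.column_stack([np.ones_like(x1), np.log(x1), np.log(x2)])
coef, *_ = np.linalg.lstsq(design, np.log(y), rcond=None)

print(coef)  # intercept, coefficient of log x1, coefficient of log x2
```

Because the data are generated without noise, the fitted coefficients should agree with 0, 1 and 2 up to rounding error; the regression is 'linear' in the transformed variables even though the surface in the original variables is curved.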

page 194 note * We could obviously go a stage further. f₁, f₂, …, need not be explicit functions in x₁, x₂, …, only, if our a priori knowledge enables us to postulate a function such as f(x₁, x₂) and write

[equation not reproduced]

in which case f(x₁, x₂) becomes an independent variable in place of f(x₁) and f(x₂). But whether the f contains x₁ only, or x₁ in combination with other x's, we must form a series for it from our statistical data before we can begin to construct the regression equation. A combination of two or more series of varying degrees of reliability would, however, seem to give rise to difficult questions about the margin of error in an independent variable ‘manufactured’ in this fashion.
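
The procedure described in this note, forming a series for the combined function first and only then constructing the regression, might be sketched as follows. This is an illustration under stated assumptions, not the author's computation: the function `combined` (a simple product), the sample values and the coefficients 3.0 and 0.5 are all hypothetical, and NumPy is assumed.

```python
import numpy as np

def combined(x1, x2):
    # Hypothetical f(x1, x2); the product form is an assumption made
    # purely for illustration.
    return x1 * x2

# Made-up observations of the two original series.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 5.0])

# Step 1: form the 'manufactured' series from the data.
z = combined(x1, x2)

# Dependent variable generated from assumed coefficients 3.0 and 0.5.
y = 3.0 + 0.5 * z

# Step 2: regress y on the manufactured variable alone.
design = np.column_stack([np.ones_like(z), z])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)

print(coef)  # intercept and coefficient of the manufactured variable
```

The sketch treats the manufactured series as exact. The note's closing caution concerns precisely what it leaves out: errors in x1 and x2 of differing reliability would propagate into z, and ordinary least squares makes no allowance for error in an independent variable.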

page 207 note * The war periods themselves were excluded for obvious reasons. The standardized rates for the years 1915–20 and 1940–41 are based on civilian mortality only and are radically affected by the withdrawal of healthy lives from the civilian population for service with the Forces.