A Connection between Correlation and Contingency

Published online by Cambridge University Press:  24 October 2008

H. O. Hirschfeld
Affiliation:
Fitzwilliam House

Extract

Let us consider a discontinuous bivariate distribution. That is, let us consider N × Q non-negative values p_νq (ν = 1, 2, …, N; q = 1, 2, …, Q), being the theoretical probabilities of the νth value of a variate X_μ (μ = 1, 2, …, N) concurring with the qth value of a second variate Y_s (s = 1, 2, …, Q).

Type
Research Article
Copyright
Copyright © Cambridge Philosophical Society 1935

References

* However, at most [min (N, Q) − 1] and at least one possibility is of practical use.

* If λ1 ≠ 0 and λ2 ≠ 0, then in addition to equation (4) a corresponding equation (4′), obtained by interchanging in (4) x νm x and y qm ν, N and Q, and Greek and Latin indices, is found from (2) by eliminating the x νm x instead of the y qm ν.

It will be obvious, by an argument parallel to the following one, that equations (1 b) and (4′) [see footnote * above] would be sufficient conditions for (1) and (2) as well. Thus we may assume from now on that N ≥ Q.

Thus, if ρ² = 0, then all y_q = 0 = m_ν. But we shall show later that we can always find at least one solution of (4) and (1 a) with ρ² > 0 except in the case of absolute non-correlation, i.e. if p_νq = P_ν τ_q. In this case, however, the problem is trivial.

* See p. 521, note †. If N > Q, then at least N − Q of the (ρ^(i))² must vanish. For if (ρ^(i))² > 0, then by (ii) λ_2 = 1, λ_1 = (ρ^(i))² > 0, but there are at most Q (linearly independent) characteristic sets of (4′) (see p. 521, note *). Furthermore it may occur that for some i = 2, …, N two of the coordinates are equal. However, this property is of no statistical relevance, since it depends on the exact values of the p_νq. The same applies if two characteristic values are equal.

* Provided that the variance (7) is 1, which is true for the

Usually Pearson's Mean Square Contingency is only defined for N = Q. But if N > Q (say) there are at least two possibilities of defining a measure having its properties: one is given by (18), another would have Q − 1 instead of N − 1 (as the corresponding argument for the y_q would show).
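
The quantities mentioned in these notes can be illustrated with a small numerical sketch. It is not taken from the paper. It assumes the usual textbook definition of the mean square contingency, φ² = Σ p_νq² / (p_ν· p_·q) − 1, where p_ν· and p_·q are the row and column totals of the table. Equation (18) is not reproduced in this extract, so the division by N − 1 or Q − 1 below is only a guess at the kind of normalisation the note refers to. All function names are hypothetical.

```python
from math import isclose


def marginals(p):
    """Row totals and column totals of a table p[nu][q] of joint probabilities."""
    rows = [sum(r) for r in p]
    cols = [sum(c) for c in zip(*p)]
    return rows, cols


def mean_square_contingency(p):
    """phi^2 = sum of p_nuq^2 / (row total * column total), minus 1.

    Cells in an empty row or column are skipped to avoid dividing by zero.
    """
    rows, cols = marginals(p)
    total = 0.0
    for nu, row in enumerate(p):
        for q, value in enumerate(row):
            if rows[nu] > 0 and cols[q] > 0:
                total += value * value / (rows[nu] * cols[q])
    return total - 1.0


def normalised_contingency(p):
    """ASSUMPTION: phi^2 divided by N - 1 and by Q - 1 (needs N, Q >= 2).

    This only mimics the two choices the note alludes to; it is not
    a transcription of equation (18).
    """
    n, q = len(p), len(p[0])
    phi2 = mean_square_contingency(p)
    return phi2 / (n - 1), phi2 / (q - 1)


def squared_correlation_2x2(p):
    """Squared product-moment correlation of two variates coded 0/1 (2 x 2 table)."""
    rows, cols = marginals(p)
    cov = p[1][1] - rows[1] * cols[1]
    return cov * cov / (rows[0] * rows[1] * cols[0] * cols[1])


# A 2 x 3 table that is the product of its marginals (absolute non-correlation).
independent = [[0.12, 0.18, 0.10],
               [0.18, 0.27, 0.15]]

# A 2 x 2 table with association between the two variates.
dependent = [[0.4, 0.1],
             [0.1, 0.4]]

if __name__ == "__main__":
    print(mean_square_contingency(independent))
    print(mean_square_contingency(dependent), squared_correlation_2x2(dependent))
    print(normalised_contingency(independent))
```

For the independent table φ² vanishes up to rounding, which corresponds to the trivial case noted above. For the 2 × 2 table φ² and the squared correlation of the 0/1-coded variates both come out at about 0.36, a simple instance of the link between contingency and correlation.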