Question:
2.39 Goodman and Kruskal (1954) proposed an association measure (tau) for nominal variables based on the variation measure $V(Y) = \sum_j \pi_{+j}(1 - \pi_{+j}) = 1 - \sum_j \pi_{+j}^2$.
a. Show that $V(Y)$ is the probability that two independent observations on $Y$ fall in different categories. Show that $V(Y) = 0$ when $\pi_{+j} = 1$ for some $j$, and that $V(Y)$ takes its maximum value of $(J-1)/J$ when $\pi_{+j} = 1/J$ for all $j$. This index relates to measures of concentration and diversity proposed for various applications, such as by Corrado Gini (1914a), who was highly influential in the twentieth century in the development of descriptive statistics in Italy, and by E. H. Simpson (1949), who described species diversity (see Exercise 16.13).
b. For the proportional reduction in variation, show that $E[V(Y \mid X)] = 1 - \sum_i \sum_j \pi_{ij}^2 / \pi_{i+}$. [The resulting measure (2.12) is called the concentration coefficient. Like the uncertainty coefficient $U$, $\tau = 0$ is equivalent to independence. Haberman (1982) presented generalized concentration and uncertainty coefficients.]
Step by Step Answer:
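A sketch of the argument, followed by a numerical check. For part (a), two independent observations fall in the same category $j$ with probability $\pi_{+j}^2$, so they fall in different categories with probability $1 - \sum_j \pi_{+j}^2 = V(Y)$. Since $\pi_{+j}^2 \le \pi_{+j}$ with equality only at 0 or 1, $V(Y) = 0$ exactly when some $\pi_{+j} = 1$. By the Cauchy-Schwarz inequality, $\sum_j \pi_{+j}^2 \ge (\sum_j \pi_{+j})^2 / J = 1/J$, with equality when all $\pi_{+j} = 1/J$, which gives the maximum $(J-1)/J$. For part (b), the conditional distribution of $Y$ given $X = i$ is $\pi_{ij}/\pi_{i+}$, so $V(Y \mid X = i) = 1 - \sum_j \pi_{ij}^2 / \pi_{i+}^2$, and averaging over $X$ with weights $\pi_{i+}$ gives $1 - \sum_i \sum_j \pi_{ij}^2 / \pi_{i+}$.

The Python below is only an illustrative check of these identities, not part of the textbook solution. The function names and the example joint table are hypothetical, and tau is computed here as $[V(Y) - E\,V(Y \mid X)] / V(Y)$, which is an assumed reading of "proportional reduction in variation".

```python
def variation(probs):
    """V = 1 - sum of squared probabilities for one distribution."""
    return 1.0 - sum(p * p for p in probs)


def prob_different(probs):
    """P(two independent draws land in different categories), by enumeration."""
    return sum(p * q
               for j, p in enumerate(probs)
               for k, q in enumerate(probs)
               if j != k)


def expected_cond_variation_formula(joint):
    """Closed form from part (b): 1 - sum_i sum_j pi_ij^2 / pi_i+ (rows index X)."""
    total = 0.0
    for row in joint:
        row_sum = sum(row)
        if row_sum > 0:
            total += sum(p * p for p in row) / row_sum
    return 1.0 - total


def expected_cond_variation_direct(joint):
    """Same quantity from the definition: sum_i pi_i+ * V(Y | X = i)."""
    total = 0.0
    for row in joint:
        row_sum = sum(row)
        if row_sum > 0:
            total += row_sum * variation([p / row_sum for p in row])
    return total


def concentration_tau(joint):
    """Assumed form of tau: proportional reduction in variation of Y given X."""
    col_marginals = [sum(col) for col in zip(*joint)]
    v_y = variation(col_marginals)
    return (v_y - expected_cond_variation_formula(joint)) / v_y


# Hypothetical 2 x 3 joint distribution (entries sum to 1).
joint = [[0.2, 0.1, 0.1],
         [0.1, 0.3, 0.2]]
col_marginals = [sum(col) for col in zip(*joint)]
row_marginals = [sum(row) for row in joint]

# Independence table built from the same marginals; tau should vanish here.
independent = [[r * c for c in col_marginals] for r in row_marginals]

tau_dependent = concentration_tau(joint)
tau_independent = concentration_tau(independent)
```

Comparing `variation(col_marginals)` with `prob_different(col_marginals)` checks part (a), and comparing the two `expected_cond_variation_*` functions checks part (b). A numerical check on one table is not a proof; it only guards against algebra slips.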