Question:
This exercise is based on results in McNamee (2003) on the use of two-phase sampling to estimate disease prevalence. An inexpensive, but possibly inaccurate, screening test for the disease is given in the phase I sample, an SRS of size $n^{(1)}$. Let $x_i = 1$ if person $i$ tests positive on the screening test and $x_i = 0$ if person $i$ tests negative on the screening test. Persons are then classified into stratum 1 ($x_i = 0$) and stratum 2 ($x_i = 1$). The persons sampled in phase II are given a test for the presence of the disease that, for purposes of this exercise, is assumed to be 100% accurate: the phase II response is $y_i = 1$ if person $i$ has the disease and $y_i = 0$ otherwise. We can write the population values in a contingency table:
Show that
b. Suppose that the optimal allocation is used (see Section 12.5.1) and that 0 [...] where $R$ is the population Pearson correlation coefficient between $x$ and $y$, given in (4.1). For the second term, first show that $R S_y = p (S_2 - W_2)/\sqrt{W_1 W_2}$.
c. Calculate the ratio of variances in (b) when $S_1 = S_2$ and $R = \min\{S_1 + S_2 - 0.9,\ 0.95\}$, for $S_1 \in \{0.5, 0.6, 0.7, 0.8, 0.9, 0.95\}$ and $c^{(1)}/c^{(2)} \in \{0.0001, 0.01, 0.1, 0.5, 1\}$. Display your results in a table. For which settings would you recommend two-phase sampling to estimate disease prevalence?
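The grid in part (c) can be tabulated with a short script. The following is only a sketch: the ratio of variances from part (b) is not legible in the text above, so the function `assumed_ratio` uses the classical double-sampling form $(\sqrt{1 - R^2} + R\sqrt{c^{(1)}/c^{(2)}})^2$ as a stand-in. That formula, and all function names here, are assumptions rather than part of the exercise; pass your own function from part (b) as `ratio_fn` if it differs. The rule for $R$ is taken as printed in part (c).

```python
import math

# Grid values as listed in part (c).
S1_VALUES = [0.5, 0.6, 0.7, 0.8, 0.9, 0.95]
COST_RATIOS = [0.0001, 0.01, 0.1, 0.5, 1]


def r_from_s(s1, s2):
    """R as printed in part (c): min{S1 + S2 - 0.9, 0.95}."""
    return min(s1 + s2 - 0.9, 0.95)


def assumed_ratio(r, cost_ratio):
    """ASSUMPTION: classical double-sampling variance ratio.

    This is a stand-in for the expression derived in part (b),
    which is not shown in the text. Replace it with your own result.
    """
    return (math.sqrt(1 - r ** 2) + r * math.sqrt(cost_ratio)) ** 2


def ratio_table(ratio_fn=assumed_ratio):
    """Evaluate ratio_fn on the grid with S1 = S2."""
    return {
        (s, c): ratio_fn(r_from_s(s, s), c)
        for s in S1_VALUES
        for c in COST_RATIOS
    }


def format_table(table):
    """Render rows indexed by S1 and columns by c(1)/c(2)."""
    header = "S1    " + "".join(f"{c:>10g}" for c in COST_RATIOS)
    rows = [header]
    for s in S1_VALUES:
        cells = "".join(f"{table[(s, c)]:>10.3f}" for c in COST_RATIOS)
        rows.append(f"{s:<6g}" + cells)
    return "\n".join(rows)


if __name__ == "__main__":
    print(format_table(ratio_table()))
```

Under the assumed formula, a ratio below 1 indicates that two-phase sampling has smaller variance than an SRS of the same total cost, which is the comparison the final question in part (c) asks about.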