Problem 1: (20 points) A health economist plans to evaluate whether screening patients on arrival or spending extra money on cleaning is more effective in reducing the incidence of infections by the MRSA bacterium in hospitals. She hypothesizes the following model:

$$MRSA_i = \beta_1 + \beta_2 S_i + \beta_3 C_i + u_i$$

where, in hospital $i$, $MRSA_i$ is the number of infections per thousand patients, $S_i$ is expenditure per patient on screening, and $C_i$ is expenditure per patient on cleaning. $u_i$ is a disturbance term that satisfies the usual regression model assumptions. In particular, $u_i$ is drawn from a distribution with mean zero and constant variance $\sigma^2$. The researcher would like to fit the relationship using a sample of hospitals. Unfortunately, data for individual hospitals are not available. Instead she has to use regional data to fit

$$\overline{MRSA}_j = \beta_1 + \beta_2 \bar{S}_j + \beta_3 \bar{C}_j + \bar{u}_j$$

where $\overline{MRSA}_j$, $\bar{S}_j$, $\bar{C}_j$, $\bar{u}_j$ are the averages of $MRSA$, $S$, $C$, $u$ for the hospitals in region $j$. The regions contain different numbers of hospitals, with $n_j$ hospitals in region $j$.

1. Show that the variance of $\bar{u}_j$ is equal to $\sigma^2 / n_j$ and that a regression using ordinary least squares (OLS) to fit the second equation will be subject to heteroscedasticity.
2. Assuming that the researcher knows the value of $n_j$ for each region, explain how she could re-specify the regression model to make it homoscedastic. State the revised specification and demonstrate mathematically that it is homoscedastic.
3. Suppose that the researcher did not know the values of $n_j$. Explain in general terms (not mathematically) how, nevertheless, she could perform t tests relating to the regression coefficients, stating any limitations.

Problem 4. Consider the stochastic processes given below, where $\varepsilon_t$ is normally distributed white noise. For each process determine whether it is covariance stationary, strictly stationary, or integrated of order one [i.e. $I(1)$], or none of these:

1. $X_t = 1 + t + \varepsilon_t$
2. $(1 - 1.1L + 0.18L^2) X_t = \varepsilon_t$
3.
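$X_t = t\,\varepsilon_{t-1}$

Problem 3: (10 points) A sample of individuals is analyzed with regard to weight and height. The Stata output shows the results of regressing weight (WEIGHT85, measured in pounds) on height (HEIGHT, measured in inches), first with a linear specification and then with a logarithmic one (LNWEIGHT85 = log(WEIGHT85), LNHEIGHT = log(HEIGHT)), including a dummy variable MALE (MALE = 1 if the observation unit is a man, otherwise 0) in both cases. Interpret all regression coefficients of both regressions.

1. reg WEIGHT85 HEIGHT MALE

      Source |       SS       df       MS              Number of obs =     540
-------------+------------------------------           F(2, 537)     =  226.12
       Model |  288595.144     2  144297.572           Prob > F      =  0.0000
    Residual |  342677.256   537  638.132692           R-squared     =  0.4572
-------------+------------------------------           Adj R-squared =  0.4551
       Total |    631272.4   539  1171.19184           Root MSE      =  25.261

------------------------------------------------------------------------------
    WEIGHT85 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      HEIGHT |   4.155447   .3950937    10.52   0.000     3.379328    4.931565
        MALE |   15.52953   3.197231     4.86   0.000      9.24892    21.81015
       _cons |  -133.8471   25.51672    -5.25   0.000    -183.9719   -83.72223
------------------------------------------------------------------------------

2. reg LNWEIGHT85 LNHEIGHT MALE

      Source |       SS       df       MS              Number of obs =     540
-------------+------------------------------           F(2, 537)     =  263.55
       Model |  12.3281409     2  6.16407045           Prob > F      =  0.0000
    Residual |  12.5598134   537  .023388852           R-squared     =  0.4953
-------------+------------------------------           Adj R-squared =  0.4935
       Total |  24.8879543   539  .046174312           Root MSE      =  .15293

------------------------------------------------------------------------------
  LNWEIGHT85 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    LNHEIGHT |   1.760394   .1611105    10.93   0.000      1.44391    2.076878
        MALE |   .1108935   .0193434     5.73   0.000     .0728955    .1488914
       _cons |  -2.451119   .6711458    -3.65   0.000    -3.769512   -1.132726
------------------------------------------------------------------------------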
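
Before interpreting the coefficients, it can help to check that the reported summary statistics agree with one another. The Python sketch below recomputes a few of them from the figures in the output for Problem 3. The helper names (`r_squared`, `root_mse`, `t_stat`, `dummy_pct_effect`) are illustrative assumptions and are not Stata commands. The last helper applies the standard conversion 100(exp(b) - 1), which turns a dummy coefficient b in a regression with a logged dependent variable into a percentage difference.

```python
import math


def r_squared(ss_model: float, ss_total: float) -> float:
    """Share of the total sum of squares explained by the model."""
    return ss_model / ss_total


def root_mse(ss_resid: float, df_resid: int) -> float:
    """Square root of the residual mean square (Stata's 'Root MSE')."""
    return math.sqrt(ss_resid / df_resid)


def t_stat(coef: float, std_err: float) -> float:
    """t statistic for H0: coefficient = 0."""
    return coef / std_err


def dummy_pct_effect(coef: float) -> float:
    """Percentage difference implied by a 0/1 dummy coefficient
    when the dependent variable is in natural logs."""
    return 100.0 * (math.exp(coef) - 1.0)


if __name__ == "__main__":
    # Figures copied from the linear regression (WEIGHT85 on HEIGHT, MALE).
    print("R-squared:", round(r_squared(288595.144, 631272.4), 4))
    print("Root MSE :", round(root_mse(342677.256, 537), 3))
    print("t (MALE) :", round(t_stat(15.52953, 3.197231), 2))

    # MALE coefficient copied from the logarithmic regression.
    print("MALE effect in log model (%):",
          round(dummy_pct_effect(0.1108935), 2))
```

For small coefficients the conversion stays close to 100 times the coefficient itself, so the usual approximate reading of the dummy coefficient as a proportional difference remains reasonable here.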