Provide solutions for the attached questions.
We study the effect of cigarette smoking on the child's birth weight via the simple linear regression

$$Y_i = \beta_0 + \beta_1 X_i + u_i, \quad i = 1, \dots, n,$$

where $Y$ is the birth weight and $X$ is the number of packs smoked by the mother per day.

(i) In this application, the unobserved term $u_i$ is capturing some latent genetic factor of the mother which affects both the mother's smoking behavior and the infant's birth weight. You know the following identity holds for the OLS estimator $\hat{\beta}_1$:

$$\hat{\beta}_1 = \beta_1 + \frac{\sum_{i=1}^n (X_i - \bar{X})\, u_i}{\sum_{i=1}^n (X_i - \bar{X})^2}.$$

If you suspect the correlation between $u$ and $X$ is negative, is $\hat{\beta}_1$ more likely to over-estimate or under-estimate the true effect $\beta_1$? (1 point)

(ii) Referring to the concern in the previous part, we pick the average price of cigarettes as our instrumental variable $Z_i$. What two properties does $Z$ need to satisfy in order to qualify as an instrumental variable? Suppose that you run a regression of the number of packs on the average price of cigarettes and find an almost-zero R-squared. Which property is $Z$ likely to violate? (1 point)

(iii) Let $\bar{Z} = \frac{1}{n}\sum_{i=1}^n Z_i$. The instrumental variable estimator $\tilde{\beta}_1$ takes the following form:

$$\tilde{\beta}_1 = \frac{\sum_{i=1}^n (Z_i - \bar{Z})\, Y_i}{\sum_{i=1}^n (Z_i - \bar{Z})\, X_i}.$$

Now assume SLR.5 holds so that $\mathrm{Var}(u \mid X) = \sigma^2$, and treat both $X_i$ and $Z_i$ as fixed numbers. Show that $\tilde{\beta}_1$ has a larger variance than the simple OLS estimator $\hat{\beta}_1$, i.e., $\mathrm{Var}(\tilde{\beta}_1) \ge \mathrm{Var}(\hat{\beta}_1)$. (2 points)

1. We use the added-variable technique to derive the variance inflation factor (VIF). Consider a linear model of the form

$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_p x_{ip} + \varepsilon_i, \quad i = 1, \dots, n, \tag{1}$$

where the errors are uncorrelated with mean zero and variance $\sigma^2$. Let $X$ denote the $n \times p'$ predictor matrix and assume $X$ is of full rank. We will derive the VIF for $\hat{\beta}_p$; the same derivation applies to any other coefficient simply by rearranging the columns of $X$. Let $U$ denote the matrix containing the first $p' - 1$ columns of $X$ and let $z$ denote the last column of $X$, so that $X = [U \; z]$. Then we can write the model in (1) as

$$Y = [U \; z] \binom{\alpha}{\beta_p} + \varepsilon = U\alpha + z\beta_p + \varepsilon \quad \text{with} \quad \alpha = (\beta_0, \beta_1, \dots, \beta_{p-1})^{\mathsf{T}}. \tag{2}$$
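The partitioned form $Y = U\alpha + z\beta_p + \varepsilon$ can be explored numerically before working through the derivation. The following is a minimal Python sketch (the simulated data, sample size, and coefficient values are illustrative assumptions of this example, not part of the question), comparing the OLS coefficient on the last column of $X$ with a formula built from the residuals of regressing $z$ on $U$.

```python
import numpy as np

# Illustrative sketch of the partitioned model X = [U z]:
# simulate data, fit the full OLS regression, then regress z on U
# and form the residual vector r = z - z_hat.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)        # correlated predictors
U = np.column_stack([np.ones(n), x1])     # intercept + first predictor
z = x2                                    # last column of X
X = np.column_stack([U, z])
Y = 1.0 + 2.0 * x1 + 3.0 * z + rng.normal(size=n)

# Full OLS fit of Y on X
beta_hat = np.linalg.lstsq(X, Y, rcond=None)[0]

# Regression of z on the columns of U; r is the residual vector
z_hat = U @ np.linalg.lstsq(U, z, rcond=None)[0]
r = z - z_hat

# Coefficient recovered from the residuals alone
beta_p_from_r = (r @ Y) / (r @ r)

print(beta_hat[-1], beta_p_from_r)   # the two agree to numerical precision
```

Running the sketch with different seeds and coefficient values gives the same agreement, which is exactly the structure the parts below ask you to establish algebraically.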
Let $\hat{z}$ denote the vector of fitted values from the least squares regression of $z$ on the columns of $U$ (i.e., the regression of $X_p$ on all the other variables), and let $r = z - \hat{z}$ denote the residuals from that regression. Note that $r$ and $\hat{z}$ are not random; they are constant vectors obtained by linear transformations of $z$.

(a) Show that the regression model in (2) can be rewritten in the form

$$Y = U\theta + r\beta_p + \varepsilon$$

for some constant vector $\theta$ of the same length as $\alpha$. (Hint: $z = \hat{z} + r$ and $\hat{z} = U(U^{\mathsf{T}}U)^{-1}U^{\mathsf{T}}z$.)

(b) Show that $U^{\mathsf{T}} r = 0$, a zero vector.

(c) Obtain simplified expressions for the least squares estimators of $\theta$ and $\beta_p$, showing, in particular, that $\hat{\beta}_p = r^{\mathsf{T}}Y / r^{\mathsf{T}}r$.

(d) Based on part (c) and the model assumptions, show that

$$\mathrm{Var}(\hat{\beta}_p) = \frac{\sigma^2}{\sum_{i=1}^n (x_{ip} - \hat{x}_{ip})^2},$$

where $\hat{x}_{ip}$ is the LS fitted value from the regression of $X_p$ on all the other predictor variables with an intercept.

Simple linear regression is a statistical method that allows us to summarize and study relationships between two continuous (quantitative) variables: one variable, denoted $X$, is regarded as the predictor, explanatory, or independent variable; the other variable, denoted $Y$, is regarded as the response, outcome, or dependent variable. Suppose that we are given $n$ i.i.d. observations $\{(x_i, y_i)\}_{i=1}^n$ from the assumed simple linear regression model $Y = \beta_1 X + \beta_0 + \varepsilon$. Answer the following questions on simple linear regression.

5-a. Denote $\hat{\beta}_1$ and $\hat{\beta}_0$ as the point estimators of $\beta_1$ and $\beta_0$, respectively, obtained through the least squares method. Show, step by step, that the two point estimators are unbiased. Derive the least squares estimator of $\sigma^2$ and determine whether it is unbiased or not. Show your work step by step.

5-b. Calculate $\sum_{i=1}^n (y_i - \hat{\beta}_1 x_i - \hat{\beta}_0)(\hat{\beta}_1 x_i + \hat{\beta}_0)$. Determine whether the point $(\bar{X}, \bar{Y})$ is on the line $\hat{Y} = \hat{\beta}_1 X + \hat{\beta}_0$. Explain your reasoning mathematically.

5-c. Using the maximum likelihood estimation (MLE) technique, derive a point estimator for the coefficient $\beta_1$ and the intercept $\beta_0$, respectively.
Determine whether the point estimators that you obtained via MLE are unbiased or not. Justify your conclusion mathematically.

5-d. Calculate the variance of the four estimators from Questions 5-a and 5-c, respectively. Show your work step by step.

5-e. Suppose that we are using the simple linear regression model $Y = \beta_1 X + \beta_0 + \varepsilon_1$ while the true model is $Y = \beta_1 X_1 + \beta_2 X_2 + \beta_0 + \varepsilon_2$, where $\beta_0$, $\beta_1$, and $\beta_2$ are constants. We assume that the distributions of $\varepsilon_1$ and $\varepsilon_2$ are both $N(0, \sigma^2)$, i.e., normal with mean zero and variance $\sigma^2$. We further assume that the two noise variables are uncorrelated. Find the least squares estimator of $\beta_1$ in this case and determine whether the point estimator that you obtain is biased or not. If it is biased, calculate the bias.
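The setting of Question 5-e can be previewed with a short simulation. The sketch below (the coefficient values, the dependence of $X_2$ on $X_1$, and the use of NumPy are all assumptions of this example) fits the misspecified model by least squares and compares the resulting slope with $\beta_1 + \beta_2\,\mathrm{Cov}(X_1, X_2)/\mathrm{Var}(X_1)$, the quantity that standard omitted-variable algebra suggests the slope concentrates around.

```python
import numpy as np

# Illustrative simulation for 5-e: fit Y on X1 alone while the true
# model also involves X2. Parameter values are assumptions, not from
# the question.
rng = np.random.default_rng(1)
n = 100_000
beta0, beta1, beta2 = 1.0, 2.0, 3.0

x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(size=n)          # X2 correlated with X1
y = beta0 + beta1 * x1 + beta2 * x2 + rng.normal(size=n)

# LS slope from the misspecified model Y = b1*X1 + b0 + e1
b1_hat = np.cov(x1, y)[0, 1] / np.var(x1, ddof=1)

# Comparison quantity: beta1 + beta2 * Cov(X1, X2) / Var(X1)
b1_limit = beta1 + beta2 * np.cov(x1, x2)[0, 1] / np.var(x1, ddof=1)

print(b1_hat, b1_limit)    # both near beta1 + beta2 * 0.8 = 4.4 here
```

With a large sample the fitted slope sits far from the true $\beta_1 = 2$, which is the bias the question asks you to derive analytically.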