Question

1 Approved Answer

Posted on Jun 13, 2024

Solve the problems in the following attachments.

Simple linear regression is a statistical method that allows us to summarize and study relationships between two continuous (quantitative) variables: One variable, denoted X, is regarded as the predictor, explanatory, or independent variable. The other variable, denoted Y, is regarded as the response, outcome, or dependent variable. Suppose that we are given n i.i.d. observations $\{(x_i, y_i)\}_{i=1}^{n}$ from the assumed simple linear regression model $Y = \beta_1 X + \beta_0 + \epsilon$. Answer the following questions on simple linear regression.

5-a. Denote $\hat{\beta}_1$ and $\hat{\beta}_0$ as the point estimators of $\beta_1$ and $\beta_0$, respectively, that are obtained through the least squares method. Show, step by step, that the two point estimators are unbiased. Derive the least squares estimator of $\sigma^2$ and determine whether it is unbiased or not. Show your work step by step.

5-b. Calculate $\sum_{i=1}^{n} (y_i - \hat{\beta}_1 x_i - \hat{\beta}_0)(\hat{\beta}_1 x_i + \hat{\beta}_0)$. Determine whether the point $(\bar{X}, \bar{Y})$ is on the line $Y = \hat{\beta}_1 X + \hat{\beta}_0$. Explain your reasoning mathematically.

5-c. Using the maximum likelihood estimation (MLE) technique, derive a point estimator for the coefficient $\beta_1$ and the intercept $\beta_0$, respectively. Determine whether the point estimators that you obtained via MLE are unbiased or not. Justify your conclusion mathematically.

5-d. Calculate the variance of the four estimators from Questions 5-a and 5-c, respectively. Show your work step by step.

5-e. Suppose that we are using the simple linear regression model $Y = \beta_1 X + \beta_0 + \epsilon_1$ while the true model is $Y = \beta_1 X_1 + \beta_2 X_2 + \beta_0 + \epsilon_2$, where $\beta_0$, $\beta_1$, and $\beta_2$ are constants. We assume that the distributions of $\epsilon_1$ and $\epsilon_2$ are both $N(0, \sigma^2)$, i.e., normal distribution with variance $\sigma^2$. We further assume that the two noise variables are uncorrelated. Find the least squares estimator of $\beta_1$ in this case and determine whether the point estimator that you obtain is biased or not.
If it is biased, calculate the bias.

Question 1 (understanding mean and variance of linear combinations of random variables)

Let $T_{15}$ be the percentage of adult males who used tobacco products in 2015 in a country and $T_{10}$ be this percentage in 2010 in the same country. Define the random variable Z in the following way: $Z = T_{15} - T_{10}$. We do not observe $T_{15}$ and $T_{10}$ for all countries of the world. We can only hope to get data from a random sample of n countries, where n is much smaller than the number of countries in the world. We want to estimate $E(Z)$ for the distribution of countries in the world. Each group member should attempt one of the following questions. The group can consult and improve the answer and only submit the improved answer, but the original person who attempted each part must be named.

1. What does the hypothesis $E(Z) = 0$ mean? After explaining what this hypothesis means, describe whether or not $E(Z) = 0$ implies $T_{15} = T_{10}$ in every country in the world. Then, describe whether or not $E(Z) = 0$ implies $\frac{1}{n}\sum_{i=1}^{n} T_{15,i} = \frac{1}{n}\sum_{i=1}^{n} T_{10,i}$ for the n countries in the sample. [Note that "Yes it does" or "No it doesn't" are not sufficient; you are expected to justify your answer.]

2. Using the result that the sample average is an unbiased estimator of the population mean, show that $\hat{\mu}_Z = \frac{1}{n}\sum_{i=1}^{n} T_{15,i} - \frac{1}{n}\sum_{i=1}^{n} T_{10,i}$ is an unbiased estimator of $E(Z)$.

3. Using the result that the variance of the sample average of a random sample of n observations from a distribution with mean $\mu$ and variance $\sigma^2$ is $\sigma^2/n$, compute the variance of $\hat{\mu}_Z = \frac{1}{n}\sum_{i=1}^{n} T_{15,i} - \frac{1}{n}\sum_{i=1}^{n} T_{10,i}$ for a random sample of n = 40 countries, when $\mathrm{Var}(T_{15}) = \mathrm{Var}(T_{10}) = 100$ and $\rho$, the correlation coefficient between $T_{15}$ and $T_{10}$, is 0.8.

4. Suppose that we have obtained data on $T_{15}$ and $T_{10}$ for a sample of n countries and computed $Z_i = T_{15,i} - T_{10,i}$ for i = 1, ..., n. Using the matrix formula for the OLS estimator, show that if we regress this variable on a constant only, the OLS estimate of the constant will be $\frac{1}{n}\sum_{i=1}^{n} T_{15,i} - \frac{1}{n}\sum_{i=1}^{n} T_{10,i}$.

1. We use the added variable technique to derive the variance inflation factor (VIF). Consider a linear model of the form

$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_p x_{ip} + \epsilon_i, \quad i = 1, \ldots, n, \quad (1)$

where the errors are uncorrelated with mean zero and variance $\sigma^2$. Let X denote the $n \times p'$ predictor matrix and assume X is of full rank. We will derive the VIF for $\hat{\beta}_p$. The same derivation applies to any other coefficient simply by rearranging the columns of X. Let U denote the matrix containing the first $p' - 1$ columns of X and let z denote the last column of X so that $X = [U \; z]$. Then we can write the model in (1) as

$Y = [U \; z] \begin{pmatrix} \alpha \\ \beta_p \end{pmatrix} + \epsilon = U\alpha + z\beta_p + \epsilon \quad \text{with} \quad \alpha = (\beta_0, \beta_1, \ldots, \beta_{p-1})^T. \quad (2)$

Let $\hat{z}$ denote the vector of fitted values from the least squares regression of z on the columns of U (i.e. the regression of $X_p$ on all the other variables), and let $r = z - \hat{z}$ denote the residuals from that regression. Note that $r$ and $\hat{z}$ are not random; they are constant vectors obtained by linear transformations of z.

(a) Show that the regression model in (2) can be rewritten in the form $Y = U\delta + r\beta_p + \epsilon$ for some constant vector $\delta$ of the same length as $\alpha$. (Hint: $z = \hat{z} + r$ and $\hat{z} = U(U^T U)^{-1} U^T z$.)

(b) Show that $U^T r = 0$, a zero vector.

(c) Obtain simplified expressions for the least squares estimators of $\delta$ and $\beta_p$, showing, in particular, that $\hat{\beta}_p = r^T Y / r^T r$.

(d) Based on Part (c) and the model assumptions, show that

$\mathrm{var}(\hat{\beta}_p) = \dfrac{\sigma^2}{\sum_{i=1}^{n} (x_{ip} - \hat{x}_{ip})^2},$

where $\hat{x}_{ip}$ is the LS fitted value from regressing $X_p$ on all the other predictor variables with an intercept.
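The added-variable identities in the last problem can be sanity-checked numerically before attempting the proofs. The sketch below is only an illustration under stated assumptions: the data are hypothetical toy numbers (not from the problem), U is taken to be an intercept column plus a single predictor x1, and the helper names `dot` and `fit_on_u` are invented for this example. It checks that U^T r is zero, computes the last coefficient as r^T Y / r^T r, and confirms that the resulting fit satisfies the normal equations of the full regression of Y on [U z].

```python
# Hypothetical toy data (assumption, not from the problem): n = 6 observations.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]    # a predictor kept in U
z = [2.0, 1.0, 4.0, 3.0, 7.0, 5.0]     # last column of X
y = [3.1, 3.9, 8.2, 8.8, 14.1, 13.0]   # response
ones = [1.0] * len(y)                  # intercept column of U


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))


def fit_on_u(t):
    """Least squares fit of t on U = [1, x1]; returns (intercept, slope)."""
    n = len(t)
    xb, tb = sum(x1) / n, sum(t) / n
    sxx = sum((xi - xb) ** 2 for xi in x1)
    sxt = sum((xi - xb) * (ti - tb) for xi, ti in zip(x1, t))
    slope = sxt / sxx
    return tb - slope * xb, slope


# Regress z on U; r = z - z_hat is the residual vector.
a0, a1 = fit_on_u(z)
r = [zi - (a0 + a1 * xi) for zi, xi in zip(z, x1)]

# Part (b): every column of U is orthogonal to r.
ut_r = (dot(ones, r), dot(x1, r))

# Part (c): coefficient of z from the added-variable formula.
beta_p = dot(r, y) / dot(r, r)

# Remaining coefficients: regress Y - z * beta_p on U.
b0, b1 = fit_on_u([yi - beta_p * zi for yi, zi in zip(y, z)])

# Residuals of the combined fit; if (b0, b1, beta_p) is the full least
# squares solution, they are orthogonal to 1, x1, and z.
e = [yi - b0 - b1 * xi - beta_p * zi for yi, xi, zi in zip(y, x1, z)]
normal_eqs = (dot(ones, e), dot(x1, e), dot(z, e))

print("U^T r:", ut_r)
print("beta_p:", beta_p)
print("normal equations:", normal_eqs)
print("r^T r (denominator of var(beta_p) / sigma^2):", dot(r, r))
```

All entries of `ut_r` and `normal_eqs` should be zero up to floating-point rounding. The last printed quantity is the sum of squared residuals that appears in the denominator of the variance expression in part (d).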