Question
Please show your reasoning for each step.
Suppose you are studying the relationship between income and education, and you try to apply the following OLS regression model (1):

$$\log \text{Income}_i = \beta_0 + \beta_1 \text{College}_i,$$

where $\text{Income}_i$ stands for observation $i$'s yearly income, and $\text{College}_i$ stands for whether observation $i$ goes to college. You also notice that one's family background can be an important factor predicting future income. Hence, you set up another OLS regression model (2):

$$\log \text{Income}_i = \beta_0 + \beta_1 \text{College}_i + \beta_2 \log \text{FamilyIncome}_i,$$

where $\text{FamilyIncome}_i$ stands for observation $i$'s family income when $i$ was 15 years old. You use some statistical software and find the following results using a random sample of 10000 observations.

Outcome Variable: log Income_i

Explanatory Variables        (1)         (2)
College                     0.3140      0.0012
                           (0.0032)    (0.0019)
log of Family Income                    1.0002
                                       (0.0045)
Constant                   11.1323     -0.0269
                           (0.0021)    (0.0514)
N                          10000       10000
R2                          0.4875      0.9125

The numbers without parentheses are regression coefficients in each model. For example, "0.3140" is the estimate of the regression coefficient $\beta_1$ in model (1), and "11.1323" is the estimate of $\beta_0$ in model (1). The numbers in parentheses are the standard errors, or the estimated standard deviations of the regression coefficients. We can use the standard errors to estimate the standard deviation of regression coefficients.

(e) (1 point) Why do the coefficients of $\text{College}_i$ in models (1) and (2) change so much? Propose one possible reason.

You also want to see whether the years of education can make a difference. To test this theory, you apply the following alternative model (3):

$$\log \text{Income}_i = \gamma_0 + \gamma_1 \text{EduYear}_i + \gamma_2 \log \text{FamilyIncome}_i,$$

where $\text{EduYear}_i$ is the number of years observation $i$ was in school. You also know that the observations in your sample either have a high school degree (hence $\text{EduYear}_i = 12$) or have a college degree (hence $\text{EduYear}_i = 16$). Therefore, $\text{EduYear}_i$ equals either 12 or 16.

The following table shows the results (together with models (1) and (2)):

Outcome Variable: log Income_i

Explanatory Variables        (1)         (2)         (3)
College                     0.3140      0.0012
                           (0.0032)    (0.0019)
Years of Education                                  0.0003
                                                   (0.0005)
log of Family Income                    1.0002      1.0002
                                       (0.0045)    (0.0045)
Constant                   11.1323     -0.0269     -0.2515
                           (0.0021)    (0.0514)    (0.0476)
N                          10000       10000       10000
R2                          0.4875      0.9125      0.9125

(f) (2 points) Why are the coefficients of $\log \text{FamilyIncome}_i$ in models (2) and (3) the same? Hint: What is the definition of a regression coefficient in a multi-variable regression?

(g) (2 points) Compare models (2) and (3). Which model do you think explains the effect of education on income better given our data set? Why?
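The patterns the question asks about can be explored on simulated data. The sketch below is illustrative only and does not reproduce the question's data set. It assumes a hypothetical data-generating process in which family income drives both college attendance and income while college has no direct effect. The helper `ols` and the variable names `log_fam`, `college`, `log_inc`, and `edu_year` are made up for this example. Under those assumptions, it fits the three specifications with NumPy least squares. It also uses the fact that, with only two schooling levels, $\text{EduYear}_i = 12 + 4\,\text{College}_i$ is an exact linear recoding of the college dummy.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical data-generating process (an assumption, not the question's data):
# family income raises both the chance of attending college and own income,
# while college itself has no direct effect on income.
log_fam = rng.normal(11.0, 0.5, n)
college = (log_fam + rng.normal(0.0, 0.3, n) > 11.0).astype(float)
log_inc = 1.0 * log_fam + rng.normal(0.0, 0.15, n)

# Only two schooling levels exist, so years of education is an exact
# linear function of the college dummy.
edu_year = 12.0 + 4.0 * college


def ols(y, *regressors):
    """Return (coefficients, R^2) for an OLS fit with an intercept."""
    X = np.column_stack([np.ones_like(y), *regressors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return beta, r2


b1, r2_1 = ols(log_inc, college)            # specification like model (1)
b2, r2_2 = ols(log_inc, college, log_fam)   # specification like model (2)
b3, r2_3 = ols(log_inc, edu_year, log_fam)  # specification like model (3)

print("model (1): college coef =", b1[1], " R2 =", r2_1)
print("model (2): college coef =", b2[1], " log_fam coef =", b2[2], " R2 =", r2_2)
print("model (3): edu_year coef =", b3[1], " log_fam coef =", b3[2], " R2 =", r2_3)
```

In this simulated setting the college coefficient is large when family income is omitted and close to zero once it is included, which is one way omitted-variable bias can show up. The second and third fits span the same column space, so they return the same coefficient on log family income and the same R2, and the slope on years of education equals the college slope divided by 4.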