Question
QM222 Modeling Business Decisions, Spring 2016
Problem Set 7
Due Friday April 22 by 6pm

Question 1:

Air pollution in Beijing has been a very serious problem in recent years. Air pollution has a negative impact on health, particularly the health of infants. As a policymaker in Beijing, you want to investigate if providing air filters to households improves infant health. You randomly sample 1,000 households with an infant in their home.

Experiment A: First, suppose you asked households if they would like to receive an air filter. Of the 1,000 households, 500 households requested and received an air filter, while the other 500 households did not request one. Let's call this "Experiment A".

You collect data on the following variables:

InfantHealth: Health indicator for infants in each household (worst score = 1, best score = 100)
AirFilter: Dummy variable = 1 if the household received an air filter, = 0 if no filter
MothersEducation: Mother's years of schooling (minimum 9 years, maximum 22 years)

a) Analyzing data from "Experiment A," you find the following regression result. Standard errors are in parentheses.

Regression 1:
InfantHealth = 2.2   + 30.5*AirFilter + 3.5*MothersEducation
              (0.8)   (8.4)            (0.9)

Interpret the coefficient on AirFilter in regression #1. (One sentence.)

b) Calculate the t-statistic and 95% confidence interval for the coefficient on AirFilter in regression #1. Is it statistically significant?

t-statistic: _________________
95% confidence interval: ______________
Is the AirFilter coefficient statistically significant? ____________________________

c) Based on this result, your colleague says, "Providing air filters is a good public policy because regression #1 shows that air filters cause better infant health." Do you agree that this evidence shows that air filters cause better infant health? Why or why not?
d) Based upon the data you collect in this experiment, you find that the correlation between AirFilter and MothersEducation is 0.84. You next run the following regression (#2).

Regression #2: InfantHealth = β0 + β1*AirFilter

Which of the following statements is correct? Circle one and explain why.

i) β1 will be larger than 30.5
ii) β1 will be smaller than 30.5
iii) β1 will be very close to 30.5
iv) Cannot tell

Explanation:

Experiment B: Now, instead of asking households to choose whether they want an air filter, you randomly assign air filters among another 1,000 randomly selected households. Let's call this "Experiment B". You provide air filters to 500 randomly selected households while the other 500 households do not receive air filters.

e) Using the Experiment B data, you find the following regression result.

Regression #3:
InfantHealth = 1 + 10.2*AirFilter + 4.5*MothersEducation
                   (8.4)            (0.9)

Based on this result, your colleague says, "Providing air filters is a good public policy because regression (2) shows that air filters cause better infant health." Do you agree that this evidence shows that air filters cause better infant health? Why or why not?

f) Using the Experiment B data, you now run the following regression:

Regression #4: AirFilter = β0 + β1*MothersEducation

Which of the following statements is most likely to be correct? Circle one and explain why.

i) β1 will be positive and statistically significant
ii) β1 will be negative and statistically significant
iii) β1 will be close to zero and not statistically significant
iv) Cannot tell

Explanation:

g) Using the Experiment B data, you now run another regression (#5):

Regression #5: InfantHealth = β0 + β1*AirFilter

Which of the following statements is correct? Circle one and explain why. (Note: 10.2 is the coefficient on AirFilter in Regression 3 above.)
i) β1 will be larger than 10.2
ii) β1 will be smaller than 10.2
iii) β1 will be very close to 10.2
iv) Cannot tell

Explanation:

Question 2:

Executives at a major financial company are trying to model which households own stocks. They collected data for a national sample of households from around the country. The data they collected includes:

own_stock: whether or not the household owns any stocks
college: whether the most educated person in the household completed college
highschool: whether the most educated person in the household completed high school but not college
net_wealth: the total amount of wealth in millions of dollars (includes the value of real assets like houses and of financial assets like bank accounts and bonds, excluding stocks, and subtracts out the amount the household owes, including loans, mortgages, etc.)
house: whether or not the household owns their own house

They ask you to model who owns stocks, so you run a set of regressions with "own_stock" as the left hand side (Y) variable.

1. You first run Regression 1 on the next page. Without any statistics terms or jargon, what does the coefficient 0.1296 tell us? (1-2 sentences.)

2. Would it be possible for Regression #1 to predict a probability greater than 1? Less than 0? Explain. (1-2 sentences.)

3. You then run Regression 2 on the next page. What is one combination of values for the explanatory variables that would lead to a predicted probability of owning stock that is less than zero?
Regression 1:

Regression Statistics
Multiple R           0.33171
R Square             0.110032
Adjusted R Square    0.109733
Standard Error       0.394304
Observations         5962

ANOVA
              df      SS          MS          F           Significance F
Regression    2       114.5458    57.2729     368.3721    1.5E-151
Residual      5959    926.4794    0.155476
Total         5961    1041.025

              Coefficients   Standard Error   t Stat      P-value     Lower 95%   Upper 95%
Intercept     0.108553       0.007151         15.1791     4.35E-51    0.094533    0.122572
highschool    0.129613       0.01289          10.05568    1.34E-23    0.104345    0.154881
college       0.332212       0.012254         27.1094     1E-152      0.308188    0.356235

Regression 2:

Regression Statistics
Multiple R           0.369552405
R Square             0.13656898
Adjusted R Square    0.135989204
Standard Error       0.388445985
Observations         5962

ANOVA
              df      SS          MS          F           Significance F
Regression    4       142.1717    35.54294    235.5548    4.6E-188
Residual      5957    898.8534    0.15089
Total         5961    1041.025

              Coefficients    Standard Error   t Stat      P-value     Lower 95%   Upper 95%
Intercept     -0.040815524    0.016802         -2.42925    0.01516     -0.07375    -0.00788
highschool    0.124457386     0.012707         9.794512    1.76E-22    0.099547    0.149367
college       0.307153043     0.012236         25.1029     2.8E-132    0.283167    0.33114
net_wealth    0.011547481     0.001271         9.081971    1.42E-19    0.009055    0.01404
house         0.163435276     0.017169         9.519454    2.47E-21    0.129779    0.197092

QM222 Problem Set 7: answers

Question 1, Experiment A:
a) Regression 1: InfantHealth = 2.2 + 30.5*AirFilter + 3.5*MothersEducation, with standard errors 0.8, 8.4, and 0.9. Interpret the coefficient on AirFilter in regression #1. (One sentence.)

Answer: Holding mother's education constant, households that received an air filter have an infant health score that is 30.5 points higher on average than households without one.

b) Calculate the t-statistic and 95% confidence interval for the coefficient on AirFilter in regression #1. Is it statistically significant?

t-statistic: 30.5 / 8.4 = 3.631
95% confidence interval: 14.016 to 46.98 (30.5 ± 16.48)
Is the AirFilter coefficient statistically significant? Yes. The t-statistic is well above 2 and the interval does not contain zero.

c) Based on this result, your colleague says, "Providing air filters is a good public policy because regression #1 shows that air filters cause better infant health." Do you agree that this evidence shows that air filters cause better infant health? Why or why not?

Answer: No. In Experiment A households chose whether to request a filter, so the two groups may differ in ways that also affect infant health, such as income or attention to health. The regression controls only for mother's education, so the positive and significant coefficient shows an association, and it may partly reflect those omitted differences rather than the effect of the filter itself.

d) Based upon the data you collect in this experiment, you find that the correlation between AirFilter and MothersEducation is 0.84. You next run regression #2: InfantHealth = β0 + β1*AirFilter. Which of the following statements is correct?

i) β1 will be larger than 30.5
ii) β1 will be smaller than 30.5
iii) β1 will be very close to 30.5
iv) Cannot tell

Answer: i) β1 will be larger than 30.5. MothersEducation has a positive coefficient (3.5) and is strongly positively correlated with AirFilter (0.84). Leaving it out of the regression lets AirFilter pick up part of the education effect, so its coefficient is biased upward.
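The arithmetic in part b) can be checked with a short script. This is a minimal sketch and not part of the original problem set: the function name `t_stat_and_ci` is made up for illustration, and it assumes the large-sample normal critical value (about 1.96) instead of the exact t critical value for this sample, so its interval can differ from a hand-computed one in the second decimal place.

```python
from statistics import NormalDist


def t_stat_and_ci(coef, se, level=0.95):
    """Return (t, lower, upper) for a regression coefficient.

    Assumption: uses the normal critical value, which is a close
    approximation to the t critical value when the sample is large.
    """
    t = coef / se
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    half_width = z * se
    return t, coef - half_width, coef + half_width


# Coefficient and standard error on AirFilter from Regression 1.
t, lower, upper = t_stat_and_ci(30.5, 8.4)
significant = not (lower <= 0 <= upper)  # significant if zero is outside the interval

print(f"t = {t:.3f}")
print(f"95% CI = ({lower:.2f}, {upper:.2f})")
print("significant at the 5% level:", significant)
```

The same function can be applied to any coefficient and standard error pair reported in the problem set; a coefficient is judged significant at the 5% level when the interval excludes zero.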
Question 1, Experiment B:

e) Using the Experiment B data, you find regression #3: InfantHealth = 1 + 10.2*AirFilter + 4.5*MothersEducation, with standard errors 8.4 (AirFilter) and 0.9 (MothersEducation). Based on this result, your colleague says, "Providing air filters is a good public policy because regression (2) shows that air filters cause better infant health." Do you agree that this evidence shows that air filters cause better infant health? Why or why not?

Answer: No. Random assignment removes the self-selection problem of Experiment A, but the 95% confidence interval for the AirFilter coefficient runs from about -6.28 to 26.68 (10.2 ± 16.48). Zero lies inside the interval, so the coefficient is not statistically significant. This evidence cannot rule out that air filters have no effect on infant health, so the colleague's claim is not supported.

f) Using the Experiment B data, you now run regression #4: AirFilter = β0 + β1*MothersEducation. Which of the following statements is most likely to be correct?

i) β1 will be positive and statistically significant
ii) β1 will be negative and statistically significant
iii) β1 will be close to zero and not statistically significant
iv) Cannot tell

Answer: iii) β1 will be close to zero and not statistically significant. Because air filters were randomly assigned in Experiment B, receiving one should be unrelated to the mother's education, so MothersEducation should have almost no power to predict AirFilter.

g) Using the Experiment B data, you now run another regression (#5): InfantHealth = β0 + β1*AirFilter. Which of the following statements is correct? Circle one and explain why. (Note: 10.2 is the coefficient on AirFilter in Regression 3 above.)
i) β1 will be larger than 10.2
ii) β1 will be smaller than 10.2
iii) β1 will be very close to 10.2
iv) Cannot tell

Answer: iii) β1 will be very close to 10.2. With random assignment, AirFilter is essentially uncorrelated with MothersEducation, so dropping MothersEducation creates little or no omitted variable bias and the AirFilter coefficient should barely change.

Question 2:

1. You first run Regression 1. Without any statistics terms or jargon, what does the coefficient 0.1296 tell us? (1-2 sentences.)

Answer: Households whose most educated member finished high school but not college are about 13 percentage points more likely to own stock than households in which no one finished high school.

2. Would it be possible for Regression #1 to predict a probability greater than 1? Less than 0? Explain. (1-2 sentences.)

Answer: No. highschool and college each equal 0 or 1 and cannot both equal 1, so the only possible predictions are 0.109 (neither), 0.238 (high school only), and 0.441 (college), all of which lie between 0 and 1.

3. You then run Regression 2. What is one combination of values for the explanatory variables that would lead to a predicted probability of owning stock that is less than zero?

a.
When all the explanatory variables equal zero (highschool = 0, college = 0, net_wealth = 0, house = 0), the prediction is the intercept, -0.0408.
b. When all the explanatory variables equal zero apart from net_wealth, as long as net_wealth is negative or below about 3.53 million dollars (0.0408 / 0.0115).
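The answers to Question 2, parts 2 and 3, can be checked by plugging values into the two fitted equations. The following is a minimal sketch under stated assumptions: the education and house variables are taken to be coded 0/1, highschool and college are assumed never to both equal 1 (as the variable definitions suggest), and the helper `predict` is a hypothetical name used only for illustration.

```python
# Coefficients copied from the Regression 1 and Regression 2 output tables.
REG1 = {"intercept": 0.108553, "highschool": 0.129613, "college": 0.332212}
REG2 = {
    "intercept": -0.040815524,
    "highschool": 0.124457386,
    "college": 0.307153043,
    "net_wealth": 0.011547481,  # net wealth is measured in millions of dollars
    "house": 0.163435276,
}


def predict(coefs, **values):
    """Predicted probability of owning stock; variables not passed count as 0."""
    return coefs["intercept"] + sum(coefs[name] * v for name, v in values.items())


# Regression 1: with two mutually exclusive 0/1 dummies only three cases exist.
reg1_cases = {
    "no high school": predict(REG1),
    "high school only": predict(REG1, highschool=1),
    "college": predict(REG1, college=1),
}
for label, p in reg1_cases.items():
    print(f"Regression 1, {label}: {p:.3f}")

# Regression 2: a household with every explanatory variable equal to zero.
baseline = predict(REG2)
print(f"Regression 2, all variables zero: {baseline:.3f}")

# Net wealth (in millions) at which that household's prediction reaches zero.
break_even = -REG2["intercept"] / REG2["net_wealth"]
print(f"Prediction stays negative below net_wealth = {break_even:.2f} million")
```

Treating omitted variables as zero keeps each call close to the wording of the answers: `predict(REG2)` is case a, and `predict(REG2, net_wealth=1)` is one instance of case b.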