We continue the analysis begun in Exercises 1.7, 2.22, 3.6, and 4.7. The focus of this exercise is variable selection.

a. Begin with the data from n = 185 countries throughout the world that have valid (nonmissing) life expectancies. Plot life expectancy versus gross domestic product and versus private expenditures on health. From these plots, describe why it is desirable to use the logarithmic transforms lnGDP and lnHEALTH, respectively. Also plot life expectancy versus lnGDP and lnHEALTH to confirm your intuition.

b. Use a stepwise regression algorithm to help you select a model. Do not consider the variables RESEARCHERS, SMOKING, and FEMALEBOSS, as these have many missing values. For the remaining variables, use only the observations without any missing values. Do this twice, with and without the categorical variable REGION.

c. Return to the full dataset of n = 185 countries and run a regression model using FERTILITY, PUBLICEDUCATION, and lnHEALTH as explanatory variables.

c(i). Provide histograms of standardized residuals and leverages.

c(ii). Identify the standardized residual and leverage associated with Lesotho, formerly Basutoland, a kingdom surrounded by South Africa. Is this observation an outlier, a high leverage point, or both?

c(iii). Rerun the regression without Lesotho. Cite any differences in the statistical coefficients between this model and the one in part c(i).
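The diagnostics requested in part c can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the textbook's solution: the data are synthetic stand-ins (the exercise's dataset is not reproduced here), the helper name `ols_diagnostics` is hypothetical, and the cutoffs used (absolute standardized residual above 2, leverage above three times the average leverage) are common rules of thumb rather than values given in the exercise.

```python
import numpy as np

def ols_diagnostics(X, y):
    """Fit OLS with an intercept; return coefficients,
    standardized residuals, and leverages."""
    n = X.shape[0]
    Xd = np.column_stack([np.ones(n), X])        # prepend intercept column
    p = Xd.shape[1]                              # number of fitted coefficients
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    resid = y - Xd @ beta
    # Leverages are the diagonal of the hat matrix H = X (X'X)^{-1} X'.
    XtX_inv = np.linalg.inv(Xd.T @ Xd)
    lev = np.einsum("ij,jk,ik->i", Xd, XtX_inv, Xd)
    s2 = resid @ resid / (n - p)                 # residual variance estimate
    std_resid = resid / np.sqrt(s2 * (1.0 - lev))
    return beta, std_resid, lev

# Synthetic data only: three made-up explanatory variables standing in for
# FERTILITY, PUBLICEDUCATION, and lnHEALTH. These are NOT the exercise's data.
rng = np.random.default_rng(0)
n = 50
X = rng.normal(size=(n, 3))
y = 70 - 4 * X[:, 0] + 1.5 * X[:, 1] + 2 * X[:, 2] + rng.normal(size=n)

# Make observation 0 deliberately unusual in both X and y.
X[0] = [6.0, -5.0, 5.0]
y[0] = 40.0

beta, std_resid, lev = ols_diagnostics(X, y)
p = X.shape[1] + 1

# Rule-of-thumb flags (assumed cutoffs, see the note above).
is_outlier = np.abs(std_resid) > 2
is_high_leverage = lev > 3 * p / n

# Part c(iii) analogue: refit without the unusual observation and compare.
beta_drop, _, _ = ols_diagnostics(X[1:], y[1:])
print("coefficients, all rows:     ", np.round(beta, 3))
print("coefficients, row 0 dropped:", np.round(beta_drop, 3))
```

With the real data, the same function would be applied to the three explanatory columns, the Lesotho row would be looked up in `std_resid` and `lev`, and the two coefficient vectors would be compared as in part c(iii). Histograms for part c(i) could be drawn from `std_resid` and `lev` with any plotting library.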