Question
Data set link:
http://users.stat.ufl.edu/~rrandles/sta4210/Rclassnotes/data/textdatasets/KutnerData/Appendix%20C%20Data%20Sets/APPENC02.txt
This data set provides selected county demographic information (CDI) for 440 of the most populous counties in the United States. Each line of the data set has an identification number with a county name and state abbreviation and provides information on 14 other variables for a single county. Counties with missing data were deleted from the data set. The information generally pertains to the years 1990 and 1992. The 17 variables are:
Identification Number
County
State
Land Area
Total Population
Percent of population aged 18-34
Percent of population 65 or older
Number of active physicians (Y)
Number of hospital beds
Total serious crimes
Percent high school graduates
Percent bachelor's degrees
Percent below poverty level
Percent unemployment
Per capita income
Total personal income
Geographic region (1 = Northeast, 2 = Midwest, 3 = South, 4 = West)
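A minimal sketch of how one record could be read into named fields, in the order of the 17 variables listed above. The column names in `CDI_COLUMNS` and the helper `parse_cdi_line` are hypothetical, and the sketch assumes the file is whitespace-delimited with no spaces inside county names; check a few lines of the actual file before relying on it.

```python
# Hypothetical column names, in the order of the 17 variables listed above.
CDI_COLUMNS = [
    "id", "county", "state", "land_area", "total_pop", "pct_18_34",
    "pct_65_plus", "physicians", "hospital_beds", "crimes", "pct_hs_grad",
    "pct_bachelors", "pct_poverty", "pct_unemployed", "per_capita_income",
    "total_income", "region",
]


def parse_cdi_line(line):
    """Split one whitespace-delimited record into a dict keyed by CDI_COLUMNS.

    Assumption: county names contain no spaces, so every line has 17 fields.
    """
    fields = line.split()
    if len(fields) != len(CDI_COLUMNS):
        raise ValueError(
            f"expected {len(CDI_COLUMNS)} fields, got {len(fields)}"
        )
    record = dict(zip(CDI_COLUMNS, fields))
    # Everything except the county name and state abbreviation is numeric.
    for name in CDI_COLUMNS:
        if name not in ("county", "state"):
            record[name] = float(record[name])
    return record
```

Applied to every line of the file, this yields a list of dicts from which columns such as `physicians` and `total_pop` can be pulled for plotting and regression.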
The goal is to model the number of physicians per 1000 inhabitants, using the other demographic variables.
(1) Plot Number of active physicians against each of Total Population, Total personal income, Per capita income, Total serious crimes, and Percent of population 65 or older (pop65plus).
(2) Plot ln(Number of active physicians) against the same variables (Total Population, Total personal income, Per capita income, Total serious crimes, and pop65plus). Does it seem reasonable to take the log?
(3)
a. Regress ln(number of active physicians) in turn on each of the three predictor variables (total population, number of hospital beds, and total personal income) using simple linear regression (SLR). State the estimated regression functions.
b. Plot the three estimated regression functions and data on separate graphs. Does a linear regression relation appear to provide a good fit for each of the three predictor variables?
c. Calculate s (sqrt(MSE)) for each of the three predictor variables. Which predictor variable leads to the smallest variability around the fitted regression line?
d. Obtain Bonferroni joint confidence intervals for β0 and β1 using a 95 percent family confidence coefficient, and interpret the intervals for all the models.
e. An investigator has suggested that, for the model with total population, β0 should be -100 and β1 should be 0.0028. Do the joint confidence intervals in part (d) support this view? Discuss.
f. Estimate the expected number of active physicians for counties with total population of X = 500, 1000, and 5000 thousand, using a Bonferroni family confidence coefficient of 0.90.
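The computations asked for in parts (a), (c), (d), and (f) can be sketched in plain Python as below. This is only an illustration under stated assumptions, not a worked solution: the function names are hypothetical, and the Bonferroni multiplier uses a standard normal quantile as an approximation to t(1 - alpha/(2g); n - 2), which is close for n = 440 but slightly too small; substitute an exact t quantile if one is available.

```python
import math
from statistics import NormalDist


def fit_slr(x, y):
    """Least-squares fit of y = b0 + b1 * x, with s = sqrt(MSE) and standard errors."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))  # part (c): s = sqrt(MSE), with n - 2 df
    se_b1 = s / math.sqrt(sxx)
    se_b0 = s * math.sqrt(1 / n + x_bar ** 2 / sxx)
    return {"n": n, "x_bar": x_bar, "sxx": sxx, "b0": b0, "b1": b1,
            "s": s, "se_b0": se_b0, "se_b1": se_b1}


def bonferroni_multiplier(family_conf, g):
    """Approximate B for g simultaneous intervals.

    Assumption: normal quantile in place of the t quantile (large n only).
    """
    alpha = 1 - family_conf
    return NormalDist().inv_cdf(1 - alpha / (2 * g))


def joint_intervals(fit, family_conf=0.95):
    """Part (d): Bonferroni joint intervals for the intercept and slope (g = 2)."""
    B = bonferroni_multiplier(family_conf, g=2)
    return {
        "b0": (fit["b0"] - B * fit["se_b0"], fit["b0"] + B * fit["se_b0"]),
        "b1": (fit["b1"] - B * fit["se_b1"], fit["b1"] + B * fit["se_b1"]),
    }


def mean_response_intervals(fit, x_values, family_conf=0.90):
    """Part (f): Bonferroni intervals for the mean response at each x in x_values."""
    B = bonferroni_multiplier(family_conf, g=len(x_values))
    out = []
    for xh in x_values:
        y_hat = fit["b0"] + fit["b1"] * xh
        se = fit["s"] * math.sqrt(
            1 / fit["n"] + (xh - fit["x_bar"]) ** 2 / fit["sxx"]
        )
        out.append((xh, y_hat - B * se, y_hat + B * se))
    return out
```

For part (a), `y` would be the natural log of the physician counts and `x` one of the three predictors; the values in `x_values` must be in the same units as that predictor column. If the response passed in is ln(Y), the intervals come back on the log scale.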