Question
The divorce rate in the United States during the years 1920-1996 can be modeled with the quantitative variables listed below.
In Problem 1 you will examine an assumed linear model of the divorce rate as a function of socio-economic characteristics (as predictors).
We will assume i.i.d. normal errors for the response values, with unknown constant error variance. In the following questions, use the data as-is; do not remove any outliers.
The dataset can be found here: https://drive.google.com/file/d/1CZdb2m_eeY60sw82yGn3MaFNoK7xx5kY/view?usp=sharing
Simply copy the link and paste it into your browser.
Read the data from `divusa.txt` into R using:
`divusa <- read.table("divusa.txt", header = TRUE, sep = ",")`
The data description is as follows:
- `divorce`: divorce per 1000 women aged 15 or more
- `unemployed`: unemployment rate
- `femlab`: percent female participation in labor force aged 16+
- `marriage`: marriages per 1000 unmarried women aged 16+
- `birth`: births per 1000 women aged 15-44
- `military`: military personnel per 1000 population
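One way to load the file and confirm that the columns described above are present is sketched below. The helper name `load_divusa` is hypothetical (it is not part of the assignment), and the sketch assumes `divusa.txt` is comma-separated with a header row, as the `read.table()` call in the question implies.

```r
# Hypothetical helper (not part of the assignment): read the file and
# check that the variables listed in the data description are present.
load_divusa <- function(path = "divusa.txt") {
  divusa <- read.table(path, header = TRUE, sep = ",")
  expected <- c("divorce", "unemployed", "femlab",
                "marriage", "birth", "military")
  missing_cols <- setdiff(expected, names(divusa))
  if (length(missing_cols) > 0) {
    stop("Missing expected columns: ", paste(missing_cols, collapse = ", "))
  }
  divusa
}

# Only runs when the data file is in the working directory.
if (file.exists("divusa.txt")) {
  divusa <- load_divusa()
  str(divusa)
}
```

Extra columns in the file are allowed; the check only fails when one of the six described variables is absent.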
## Part (a)
Provide a numerical summary of the data, and use the function `pairs()` (in base R) to provide a graphical summary of the data. Do you see anything that looks promising for modeling? Do you see anything that may alert you to potential problems? Limit your answer to one or two sentences.
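A minimal sketch of the two summaries, assuming every column in the data frame is numeric. The function name `summarize_divusa` is hypothetical, and returning the correlation matrix is an extra convenience rather than something the question asks for.

```r
# Hypothetical helper: numerical summary plus scatterplot matrix.
summarize_divusa <- function(divusa) {
  print(summary(divusa))   # five-number summary and mean for each column
  pairs(divusa)            # base R scatterplot matrix
  invisible(cor(divusa))   # pairwise correlations (assumes numeric columns)
}

# Only runs when the data file is in the working directory.
if (file.exists("divusa.txt")) {
  divusa <- read.table("divusa.txt", header = TRUE, sep = ",")
  correlations <- summarize_divusa(divusa)
  print(round(correlations, 2))
}
```

Strong correlations between predictors in this matrix are worth noting as a potential collinearity problem, while strong correlations with `divorce` suggest useful predictors.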
## Part (b)
Fit a linear model to predict the variable `divorce` from the variable `femlab`.
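The fit can be sketched with `lm()` as follows. The wrapper name `fit_femlab_model` is hypothetical; the formula `divorce ~ femlab` follows directly from the variable names in the data description.

```r
# Hypothetical wrapper: simple linear regression of divorce on femlab.
fit_femlab_model <- function(divusa) {
  lm(divorce ~ femlab, data = divusa)
}

# Only runs when the data file is in the working directory.
if (file.exists("divusa.txt")) {
  divusa <- read.table("divusa.txt", header = TRUE, sep = ",")
  fit <- fit_femlab_model(divusa)
  print(summary(fit))  # coefficients, standard errors, t-tests, R-squared
  print(nobs(fit))     # number of observations used in the fit
}
```

The `summary()` output contains the slope p-value, the residual degrees of freedom, and the R-squared value referred to in the later parts.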
## Part (c)
What *specific* hypothesis is being tested with the p-value given for the slope coefficient in the output in part (b)? (State the null and alternative hypotheses.) Do you reject or fail to reject the null hypothesis, and on what basis?
## Part (d)
What is the sample size?
## Part (e)
Does the intercept term have a useful interpretation, in terms of the model? Explain in one or two sentences.
## Part (f)
What percentage of variation in the data is not explained by the model?
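Since R-squared is the proportion of variation explained by the model, the unexplained share is one minus R-squared. A minimal sketch, with the hypothetical helper name `unexplained_pct`:

```r
# Hypothetical helper: percentage of variation NOT explained by the model.
unexplained_pct <- function(fit) {
  100 * (1 - summary(fit)$r.squared)
}

# Only runs when the data file is in the working directory.
if (file.exists("divusa.txt")) {
  divusa <- read.table("divusa.txt", header = TRUE, sep = ",")
  fit <- lm(divorce ~ femlab, data = divusa)
  print(unexplained_pct(fit))
}
```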
## Part (g)
Plot the standardized residuals against the response variable and the predictor variable, and produce a Q-Q plot of the standardized residuals. What can we conclude about the normality of the errors, the constancy of the error variance, and the relationship between the errors and the predictor variable?
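The three diagnostic plots might be produced along these lines. This is a sketch that assumes a fitted simple regression such as `lm(divorce ~ femlab)`; the function name `plot_std_residuals` is hypothetical, and `rstandard()` supplies the internally studentized (standardized) residuals.

```r
# Hypothetical helper: standardized residuals vs. response and predictor,
# plus a normal Q-Q plot. Assumes a model with a single predictor.
plot_std_residuals <- function(fit) {
  r <- rstandard(fit)        # standardized residuals
  mf <- model.frame(fit)     # column 1 = response, column 2 = predictor
  op <- par(mfrow = c(1, 3))
  on.exit(par(op))           # restore the plotting layout on exit
  plot(mf[[1]], r, xlab = names(mf)[1], ylab = "Standardized residuals")
  abline(h = 0, lty = 2)
  plot(mf[[2]], r, xlab = names(mf)[2], ylab = "Standardized residuals")
  abline(h = 0, lty = 2)
  qqnorm(r)
  qqline(r)
  invisible(r)
}

# Only runs when the data file is in the working directory.
if (file.exists("divusa.txt")) {
  divusa <- read.table("divusa.txt", header = TRUE, sep = ",")
  plot_std_residuals(lm(divorce ~ femlab, data = divusa))
}
```

A curved or funnel-shaped pattern in the first two panels would point to a misspecified mean or non-constant variance; departures from the line in the Q-Q panel would point to non-normal errors.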
## Part (h)
What is the estimated mean divorce rate when `femlab = 38`?
## Part (i)
Provide a 97% prediction interval around the mean response estimated in part (h).
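Both the point estimate and the interval can be obtained from `predict()`. The sketch below returns the confidence interval for the mean response alongside the prediction interval for a new observation, because the two are easy to confuse; the helper name `intervals_at` is hypothetical.

```r
# Hypothetical helper: fitted value with 97% intervals at a given femlab.
intervals_at <- function(fit, femlab_value, level = 0.97) {
  new_data <- data.frame(femlab = femlab_value)
  list(
    # interval for the MEAN response at this femlab value
    confidence = predict(fit, new_data, interval = "confidence", level = level),
    # interval for a single NEW observation at this femlab value (wider)
    prediction = predict(fit, new_data, interval = "prediction", level = level)
  )
}

# Only runs when the data file is in the working directory.
if (file.exists("divusa.txt")) {
  divusa <- read.table("divusa.txt", header = TRUE, sep = ",")
  fit <- lm(divorce ~ femlab, data = divusa)
  print(intervals_at(fit, 38))
}
```

In each returned matrix the `fit` column is the estimated mean divorce rate, and `lwr` and `upr` are the interval limits.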
## Part (j)
Provide a 97% confidence interval for $\beta_1$, the slope coefficient.
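The interval can be read off `confint()`, or built by hand as the estimate plus or minus a t critical value times the standard error. A sketch showing both routes, with the hypothetical helper name `slope_ci`:

```r
# Hypothetical helper: 97% CI for the femlab slope, two equivalent ways.
slope_ci <- function(fit, level = 0.97) {
  auto <- confint(fit, parm = "femlab", level = level)
  coefs <- coef(summary(fit))
  est <- coefs["femlab", "Estimate"]
  se <- coefs["femlab", "Std. Error"]
  # two-sided critical value with n - 2 residual degrees of freedom
  t_crit <- qt(1 - (1 - level) / 2, df = df.residual(fit))
  manual <- c(lower = est - t_crit * se, upper = est + t_crit * se)
  list(confint = auto, manual = manual)
}

# Only runs when the data file is in the working directory.
if (file.exists("divusa.txt")) {
  divusa <- read.table("divusa.txt", header = TRUE, sep = ",")
  print(slope_ci(lm(divorce ~ femlab, data = divusa)))
}
```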
## Part (k)
Suppose that the percent of female participation in the labor force increased by 13 from one year to the next. What would be the predicted change in the US divorce rate?
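Under the linear model, a change of 13 in `femlab` shifts the predicted divorce rate by 13 times the estimated slope; the intercept cancels out. A minimal sketch, with the hypothetical helper name `predicted_change`:

```r
# Hypothetical helper: predicted change in divorce for a change in femlab.
predicted_change <- function(fit, delta_femlab = 13) {
  unname(coef(fit)["femlab"]) * delta_femlab
}

# Only runs when the data file is in the working directory.
if (file.exists("divusa.txt")) {
  divusa <- read.table("divusa.txt", header = TRUE, sep = ",")
  fit <- lm(divorce ~ femlab, data = divusa)
  print(predicted_change(fit, 13))
}
```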