# Problem 1

The divorce rate in the United States during the years 1920-1996 can be modeled with the quantitative variables listed below.

In Problem 1 you will examine an assumed linear model of the divorce rate as a function of socio-economic characteristics (as predictors).

We will assume i.i.d. normal errors for the response values, with unknown constant error variance. In the following questions, use the data as-is; do not remove any outliers.

The datasets for the problems can be found here: https://drive.google.com/drive/folders/1CI6kWaDpYQBt3V22s7XD8T872lxOPl7I?usp=sharing

Please use R and explain along the way.

Read the data from `divusa.txt` into R using `divusa <- read.table("divusa.txt", header = TRUE, sep = ",")`.

The data description is as follows:

- `divorce`: divorce per 1000 women aged 15 or more

- `unemployed`: unemployment rate

- `femlab`: percent female participation in labor force aged 16+

- `marriage`: marriages per 1000 unmarried women aged 16+

- `birth`: births per 1000 women aged 15-44

- `military`: military personnel per 1000 population

## Part (a)

Create a numerical summary of the data, and use the function `pairs()` (in base R) to create a graphical summary of the data. Do you see anything that looks promising for modeling? Do you see anything that may alert you to potential problems? Limit your answer to one or two sentences.
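As a minimal sketch of the summary step, the block below uses `summary()`, `cor()`, and `pairs()` on a data frame with the column names listed above. The data here are a synthetic stand-in (an assumption, so the snippet runs without the file); the numbers it prints say nothing about the real divorce data.

```r
# Synthetic stand-in so the sketch runs on its own (NOT the real data).
# With the file available, use instead:
#   divusa <- read.table("divusa.txt", header = TRUE, sep = ",")
set.seed(1)
n <- 40
divusa <- data.frame(
  unemployed = runif(n, 2, 20),
  femlab     = runif(n, 20, 60),
  marriage   = runif(n, 50, 110),
  birth      = runif(n, 60, 120),
  military   = runif(n, 1, 40)
)
divusa$divorce <- 2 + 0.3 * divusa$femlab + rnorm(n)

# Numerical summary: min, quartiles, mean, and max of every column
num_summary <- summary(divusa)
print(num_summary)

# Pairwise correlations, useful for spotting strongly related predictors
cor_matrix <- round(cor(divusa), 2)
print(cor_matrix)

# Scatterplot matrix of every pair of variables
pairs(divusa)
```

In the scatterplot matrix, roughly linear panels in the `divorce` row suggest promising predictors, while strong patterns among the predictors themselves hint at collinearity.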

## Part (b)

Fit a linear model to predict the variable `divorce` from the variable `femlab`.
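A simple linear regression of this kind can be sketched with `lm()`. The data frame below is a synthetic stand-in (an assumption), so the printed coefficients are illustrative only.

```r
# Synthetic stand-in (NOT the real data); replace with the data frame
# read from divusa.txt.
set.seed(1)
divusa <- data.frame(femlab = runif(40, 20, 60))
divusa$divorce <- 2 + 0.3 * divusa$femlab + rnorm(40)

# Simple linear regression: divorce = beta_0 + beta_1 * femlab + error
fit <- lm(divorce ~ femlab, data = divusa)

# Coefficient table, residual standard error, R-squared, and F-statistic
print(summary(fit))
```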

## Part (c)

What *specific* hypothesis is being tested with the p-value given for the slope coefficient in the output in part (b)? (State the null and alternative hypotheses.) Do you reject or fail to reject the null hypothesis, and on what basis?
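The p-value that `summary()` reports for a slope comes from a two-sided t-test of $H_0: \beta_1 = 0$ against $H_1: \beta_1 \neq 0$. The sketch below pulls that p-value out of the coefficient table and recomputes it by hand, again on synthetic stand-in data (an assumption).

```r
# Synthetic stand-in (NOT the real data)
set.seed(1)
divusa <- data.frame(femlab = runif(40, 20, 60))
divusa$divorce <- 2 + 0.3 * divusa$femlab + rnorm(40)
fit <- lm(divorce ~ femlab, data = divusa)

coefs  <- coef(summary(fit))
t_stat <- coefs["femlab", "t value"]    # estimate / standard error
p_val  <- coefs["femlab", "Pr(>|t|)"]   # p-value reported by summary()

# Two-sided p-value from the t distribution with n - 2 degrees of freedom
p_manual <- 2 * pt(abs(t_stat), df = df.residual(fit), lower.tail = FALSE)

print(c(t = t_stat, p_reported = p_val, p_manual = p_manual))
```

Comparing the p-value with the chosen significance level gives the basis for the decision.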

## Part (d)

What is the sample size?

## Part (e)

Does the intercept term have a useful interpretation, in terms of the model? Explain in one or two sentences.

## Part (f)

What percentage of variation in the data is not explained by the model?
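The unexplained share of variation is $1 - R^2$, which equals the residual sum of squares divided by the total sum of squares. A short sketch on synthetic stand-in data (an assumption):

```r
# Synthetic stand-in (NOT the real data)
set.seed(1)
divusa <- data.frame(femlab = runif(40, 20, 60))
divusa$divorce <- 2 + 0.3 * divusa$femlab + rnorm(40)
fit <- lm(divorce ~ femlab, data = divusa)

# Percentage of variation NOT explained by the model
r2 <- summary(fit)$r.squared
pct_unexplained <- 100 * (1 - r2)

# Equivalent calculation from the sums of squares
sse <- sum(residuals(fit)^2)
sst <- sum((divusa$divorce - mean(divusa$divorce))^2)

print(c(pct_unexplained = pct_unexplained, check = 100 * sse / sst))
```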

## Part (g)

Plot the standardized residuals against the response variable and the predictor variable, and produce a Q-Q plot of the standardized residuals. What can we conclude about the normality of the errors, the constancy of the error variance, and the relationship between the errors and the predictor variable?
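One way to produce these three plots is with `rstandard()`, `plot()`, and `qqnorm()`. This is a sketch on synthetic stand-in data (an assumption); the patterns it shows are not those of the real data.

```r
# Synthetic stand-in (NOT the real data)
set.seed(1)
divusa <- data.frame(femlab = runif(40, 20, 60))
divusa$divorce <- 2 + 0.3 * divusa$femlab + rnorm(40)
fit <- lm(divorce ~ femlab, data = divusa)

# Standardized residuals: residuals scaled by their estimated standard deviation
r_std <- rstandard(fit)

op <- par(mfrow = c(1, 3))
plot(divusa$divorce, r_std, xlab = "divorce (response)",
     ylab = "Standardized residuals")
abline(h = 0, lty = 2)
plot(divusa$femlab, r_std, xlab = "femlab (predictor)",
     ylab = "Standardized residuals")
abline(h = 0, lty = 2)
qqnorm(r_std)   # points close to the line support normal errors
qqline(r_std)
par(op)
```

A funnel shape in the residual plots points to non-constant variance, and curvature or runs of same-sign residuals against the predictor point to a misspecified mean or dependent errors.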

## Part (h)

What is the estimated mean divorce rate when femlab = 38?

## Part (i)

Create a 97% prediction interval around the mean response estimated in part (h).
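Both the point estimate at `femlab = 38` and the 97% interval can be obtained from `predict()`. The sketch below runs on synthetic stand-in data (an assumption) and also computes the confidence interval for the mean response, for comparison with the wider prediction interval.

```r
# Synthetic stand-in (NOT the real data)
set.seed(1)
divusa <- data.frame(femlab = runif(40, 20, 60))
divusa$divorce <- 2 + 0.3 * divusa$femlab + rnorm(40)
fit <- lm(divorce ~ femlab, data = divusa)

new_obs <- data.frame(femlab = 38)

# Estimated mean response: beta_0_hat + beta_1_hat * 38
point_est <- predict(fit, new_obs)

# 97% prediction interval for a single new observation at femlab = 38
pred_int <- predict(fit, new_obs, interval = "prediction", level = 0.97)

# 97% confidence interval for the mean response (narrower, for comparison)
conf_int <- predict(fit, new_obs, interval = "confidence", level = 0.97)

print(pred_int)
print(conf_int)
```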

## Part (j)

Create a 97% confidence interval for $\beta_1$, the slope coefficient.
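A 97% interval for the slope has the form $\hat\beta_1 \pm t_{0.985,\,n-2} \cdot \mathrm{SE}(\hat\beta_1)$. The sketch below computes it with `confint()` and by hand, on synthetic stand-in data (an assumption).

```r
# Synthetic stand-in (NOT the real data)
set.seed(1)
divusa <- data.frame(femlab = runif(40, 20, 60))
divusa$divorce <- 2 + 0.3 * divusa$femlab + rnorm(40)
fit <- lm(divorce ~ femlab, data = divusa)

# Built-in 97% confidence interval for the slope
ci_slope <- confint(fit, "femlab", level = 0.97)

# Manual version: estimate +/- t critical value * standard error
est    <- coef(summary(fit))["femlab", "Estimate"]
se     <- coef(summary(fit))["femlab", "Std. Error"]
t_crit <- qt(1 - 0.03 / 2, df = df.residual(fit))
ci_manual <- est + c(-1, 1) * t_crit * se

print(ci_slope)
print(ci_manual)
```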

## Part (k)

Suppose that the percent of female participation in the labor force increased by 13 from one year to the next. What would be the predicted change in the US divorce rate?
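Under the linear model, a change of 13 in `femlab` shifts the predicted divorce rate by $13 \hat\beta_1$, regardless of the starting value. A sketch on synthetic stand-in data (an assumption):

```r
# Synthetic stand-in (NOT the real data)
set.seed(1)
divusa <- data.frame(femlab = runif(40, 20, 60))
divusa$divorce <- 2 + 0.3 * divusa$femlab + rnorm(40)
fit <- lm(divorce ~ femlab, data = divusa)

# Predicted change = 13 * estimated slope
slope <- unname(coef(fit)["femlab"])
delta <- 13 * slope

# Same result as the difference of two predictions 13 units apart
delta_check <- unname(diff(predict(fit, data.frame(femlab = c(30, 43)))))

print(c(delta = delta, delta_check = delta_check))
```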

# Problem 2

Download the data set `Tree.txt`. Collected by Bruce and Schumacher, this classic dataset measures the diameter (x, in inches) and volume (y, in cubic feet) of shortleaf pines.

Load the data using `tree <- read.table("Tree.txt", header = TRUE)`.

## Part (a)

Fit a simple linear regression model for predictor diameter and response volume.

## Part (b)

Assess the appropriateness of the model fit using model diagnostics. Limit your response to two or three sentences.
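The fit and two standard diagnostic plots can be sketched as follows. The column names `diameter` and `volume` are assumptions (the file's actual headers are not shown here), and the data are a synthetic stand-in generated from a power-law relationship.

```r
# Synthetic stand-in (NOT the real data); column names are assumed.
# With the file available: tree <- read.table("Tree.txt", header = TRUE)
set.seed(2)
tree <- data.frame(diameter = runif(50, 4, 24))
tree$volume <- 0.002 * tree$diameter^2.5 * exp(rnorm(50, sd = 0.15))

# Straight-line model on the original scale
fit_lin <- lm(volume ~ diameter, data = tree)
r_std   <- rstandard(fit_lin)

op <- par(mfrow = c(1, 2))
# Curvature or a funnel shape here suggests the linear model is inadequate
plot(fitted(fit_lin), r_std, xlab = "Fitted values",
     ylab = "Standardized residuals")
abline(h = 0, lty = 2)
# Departures from the line suggest non-normal errors
qqnorm(r_std)
qqline(r_std)
par(op)
```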

## Part (c)

Fit a simple linear regression to the log-log transformed data (take the natural logarithm of both the response and the predictor variable).
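In R the transformation can be written directly in the model formula, since `log()` is the natural logarithm. A sketch with assumed column names `diameter` and `volume` and synthetic stand-in data:

```r
# Synthetic stand-in (NOT the real data); column names are assumed.
set.seed(2)
tree <- data.frame(diameter = runif(50, 4, 24))
tree$volume <- 0.002 * tree$diameter^2.5 * exp(rnorm(50, sd = 0.15))

# Log-log model: log(volume) = beta_0 + beta_1 * log(diameter) + error
fit_log <- lm(log(volume) ~ log(diameter), data = tree)
print(summary(fit_log))
```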

## Part (d)

Produce a Q-Q plot of the standardized residuals from the transformed model in part (c), and plot the standardized residuals against the response variable and the predictor variable. What can we conclude about the normality of the errors, the constancy of the error variance, and the relationship between the errors and the predictor variable?

## Part (e)

What is the nature of the relationship between the diameter and volume of shortleaf pines? Is there a significant association between the diameter and volume?

## Part (f)

Interpret the coefficient $\beta_1$, in terms of the model.
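For a log-log model, the slope is usually read as an elasticity. The derivation below is a sketch that assumes the log-log model of part (c), with $x$ the diameter and $y$ the volume.

```latex
\begin{align*}
\log(y) &= \beta_0 + \beta_1 \log(x) + \varepsilon
  \quad\Longleftrightarrow\quad
  y = e^{\beta_0}\, x^{\beta_1}\, e^{\varepsilon} \\
x \to c\,x &\;\Longrightarrow\; \text{typical } y \text{ is multiplied by } c^{\beta_1} \\
c = 1.01 &\;\Longrightarrow\; 1.01^{\beta_1} \approx 1 + \tfrac{\beta_1}{100}
\end{align*}
```

So a 1% increase in diameter is associated with roughly a $\beta_1$% increase in volume.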

## Part (g)

What is the expected volume of a tree with an eleven-inch diameter?
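With the log-log model, a prediction is made on the log scale and then back-transformed. The sketch below assumes column names `diameter` and `volume` and uses synthetic stand-in data. Note that `exp()` of the predicted log volume estimates the median volume; under normal errors, the mean on the original scale carries an extra factor of $e^{\sigma^2/2}$.

```r
# Synthetic stand-in (NOT the real data); column names are assumed.
set.seed(2)
tree <- data.frame(diameter = runif(50, 4, 24))
tree$volume <- 0.002 * tree$diameter^2.5 * exp(rnorm(50, sd = 0.15))
fit_log <- lm(log(volume) ~ log(diameter), data = tree)

new_tree <- data.frame(diameter = 11)

# Prediction on the log scale, then back-transformed
log_pred   <- predict(fit_log, new_tree)
vol_median <- exp(log_pred)

# Lognormal mean correction using the residual variance estimate
vol_mean <- exp(log_pred + sigma(fit_log)^2 / 2)

print(c(median = unname(vol_median), mean = unname(vol_mean)))
```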

# Problem 3

Download the semiconductor photomask line-spacing data. The data include measurement errors for measurements taken at different line spacings. It appears that the precision of the line-spacing measurements decreases as the line spacing increases.

- `line_space`: The line spacing for the observation.

- `measurement_error`: The measurement error for that observation.

- `sd`: The standard deviation for the $Y_i$ of each observation.

Read in the data using `photomask <- read.table("measurements.txt", header = TRUE)`.

## Part (a)

Why would the Weighted Least Squares model be appropriate in this situation?

## Part (b)

Fit a weighted least squares regression model to predict the measurement error for a given line spacing by passing the weights directly to the `lm` function via its `weights` argument.
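With a known standard deviation per observation, the usual choice of weights is the inverse variance, $w_i = 1/\mathrm{sd}_i^2$. The sketch below applies that choice (an assumption about the intended weights) to synthetic stand-in data whose noise grows with line spacing.

```r
# Synthetic stand-in (NOT the real data).
# With the file available:
#   photomask <- read.table("measurements.txt", header = TRUE)
set.seed(3)
photomask <- data.frame(line_space = seq(0.5, 3, length.out = 30))
photomask$sd <- 0.005 * photomask$line_space
photomask$measurement_error <- 0.01 + 0.02 * photomask$line_space +
  rnorm(30, sd = photomask$sd)

# Weighted least squares: noisier observations get smaller weights
wls <- lm(measurement_error ~ line_space, data = photomask,
          weights = 1 / sd^2)
print(summary(wls))
```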

## Part (c)

Is this model significant at $\alpha = 0.001$?
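Overall significance can be judged from the model F-test, whose p-value is recomputed below from the `fstatistic` component of `summary()`. This is a sketch on synthetic stand-in data with assumed inverse-variance weights.

```r
# Synthetic stand-in (NOT the real data)
set.seed(3)
photomask <- data.frame(line_space = seq(0.5, 3, length.out = 30))
photomask$sd <- 0.005 * photomask$line_space
photomask$measurement_error <- 0.01 + 0.02 * photomask$line_space +
  rnorm(30, sd = photomask$sd)
wls <- lm(measurement_error ~ line_space, data = photomask,
          weights = 1 / sd^2)

# F-statistic with its numerator and denominator degrees of freedom
fstat <- summary(wls)$fstatistic
p_model <- unname(pf(fstat["value"], fstat["numdf"], fstat["dendf"],
                     lower.tail = FALSE))

# Compare with the significance level alpha = 0.001
significant <- p_model < 0.001
print(c(p_value = p_model, significant = significant))
```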

## Part (d)

Use your model from Part (b) to find a 95% prediction interval for a new measurement taken at a line spacing of 1.99.
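The direct call looks like the sketch below, on synthetic stand-in data with assumed inverse-variance weights. When `predict()` is asked for a prediction interval from a weighted fit without a weight for the new point, it treats the new observation as having weight 1, and R typically emits a warning to that effect.

```r
# Synthetic stand-in (NOT the real data)
set.seed(3)
photomask <- data.frame(line_space = seq(0.5, 3, length.out = 30))
photomask$sd <- 0.005 * photomask$line_space
photomask$measurement_error <- 0.01 + 0.02 * photomask$line_space +
  rnorm(30, sd = photomask$sd)
wls <- lm(measurement_error ~ line_space, data = photomask,
          weights = 1 / sd^2)

new_obs <- data.frame(line_space = 1.99)

# Naive 95% prediction interval: no weight is supplied for the new point,
# so its error variance is taken to be the residual variance of the
# weighted fit rather than a variance specific to line_space = 1.99
pi_naive <- predict(wls, new_obs, interval = "prediction", level = 0.95)
print(pi_naive)
```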

## Part (e)

Why is the prediction interval in part (d) untrustworthy? Is the interval at this location going to be too small or too large?

## Part (f)

Build a new model that incorporates the weights into the variables themselves, so that it can be fit as an ordinary least squares (LS) model.
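One standard way to fold the weights into the variables is to divide the response, the intercept column, and the predictor by each observation's standard deviation, then fit without an automatic intercept. The names `y_star`, `x0_star`, and `x1_star` are hypothetical, and the data are a synthetic stand-in.

```r
# Synthetic stand-in (NOT the real data)
set.seed(3)
photomask <- data.frame(line_space = seq(0.5, 3, length.out = 30))
photomask$sd <- 0.005 * photomask$line_space
photomask$measurement_error <- 0.01 + 0.02 * photomask$line_space +
  rnorm(30, sd = photomask$sd)

# Scale every term by 1 / sd so the transformed errors have constant variance
photomask_t <- with(photomask, data.frame(
  y_star  = measurement_error / sd,
  x0_star = 1 / sd,              # transformed intercept column
  x1_star = line_space / sd      # transformed predictor
))

# Ordinary least squares on the transformed variables, no automatic intercept
ols_t <- lm(y_star ~ 0 + x0_star + x1_star, data = photomask_t)

# For comparison: the weighted fit on the original variables
wls <- lm(measurement_error ~ line_space, data = photomask,
          weights = 1 / sd^2)
print(rbind(transformed = coef(ols_t), weighted = coef(wls)))
```

The two fits give the same coefficient estimates, with `x0_star` playing the role of the intercept.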

## Part (g)

Use your model from Part (f) to find a more accurate prediction interval for the measurement error at `line_space = 1.99`. Use a standard deviation of 0.013.
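Reading the requested quantity as a prediction interval (an assumption), the transformed model can predict on its own scale and the result can be multiplied by the stated standard deviation to return to the original scale. The sketch uses hypothetical names `y_star`, `x0_star`, and `x1_star` and synthetic stand-in data, and cross-checks against the `weights` argument of `predict()`.

```r
# Synthetic stand-in (NOT the real data)
set.seed(3)
photomask <- data.frame(line_space = seq(0.5, 3, length.out = 30))
photomask$sd <- 0.005 * photomask$line_space
photomask$measurement_error <- 0.01 + 0.02 * photomask$line_space +
  rnorm(30, sd = photomask$sd)

photomask_t <- with(photomask, data.frame(
  y_star = measurement_error / sd, x0_star = 1 / sd, x1_star = line_space / sd
))
ols_t <- lm(y_star ~ 0 + x0_star + x1_star, data = photomask_t)

# New point: line_space = 1.99 with standard deviation 0.013
s_new <- 0.013
new_t <- data.frame(x0_star = 1 / s_new, x1_star = 1.99 / s_new)

# Interval on the transformed scale, then back to the original scale
pi_star <- predict(ols_t, new_t, interval = "prediction", level = 0.95)
pi_orig <- pi_star * s_new

# Cross-check: weighted fit with an explicit weight for the new observation
wls <- lm(measurement_error ~ line_space, data = photomask,
          weights = 1 / sd^2)
pi_check <- predict(wls, data.frame(line_space = 1.99),
                    interval = "prediction", level = 0.95,
                    weights = 1 / s_new^2)
print(rbind(pi_orig, pi_check))
```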
