# Problem 1

The divorce rate in the United States during the years 1920-1996 can be modeled with the quantitative variables listed below. In Problem 1 you will examine an assumed linear model of the divorce rate as a function of socio-economic characteristics (as predictors). We will assume i.i.d. normal errors for the response values, and unknown constant variance of the errors. In the following questions, use the data as-is; do not remove any outliers.

The datasets for the problems can be found here: https://drive.google.com/drive/folders/1CI6kWaDpYQBt3V22s7XD8T872lxOPl7I?usp=sharing

Please use R and explain along the way.

Read the data from `divusa.txt` into R. Use `divusa <- read.table("divusa.txt", header = T, sep = ',')`

The data description is as follows:

- `divorce`: divorce per 1000 women aged 15 or more

- `unemployed`: unemployment rate

- `femlab`: percent female participation in labor force aged 16+

- `marriage`: marriages per 1000 unmarried women aged 16+

- `birth`: births per 1000 women aged 15-44

- `military`: military personnel per 1000 population

## Part (a)

Give a numerical summary of the data, and use the function `pairs()` (in base R) to show a graphical summary of the data. Do you see anything that looks promising for modeling? Do you see anything that may alert you to potential problems? Limit your answer to one or two sentences.
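A minimal sketch of the two summaries asked for in part (a). Because the real file may not be at hand, the block builds a simulated stand-in with the same column names; the numbers are made up and say nothing about the actual data.

```r
# Simulated stand-in for divusa (made-up values, illustration only).
# With the real file, use instead:
#   divusa <- read.table("divusa.txt", header = TRUE, sep = ",")
set.seed(1)
n <- 40
divusa <- data.frame(
  divorce    = rnorm(n, 13, 4),
  unemployed = rnorm(n, 7, 3),
  femlab     = rnorm(n, 38, 10),
  marriage   = rnorm(n, 72, 15),
  birth      = rnorm(n, 90, 20),
  military   = rnorm(n, 12, 10)
)

# Numerical summary: min, quartiles, mean and max of every column
num_summary <- summary(divusa)
print(num_summary)

# Graphical summary: scatterplot matrix of all variable pairs
pairs(divusa)

# Pairwise correlations help put numbers on what the plot suggests
print(round(cor(divusa), 2))
```

Strong linear patterns between `divorce` and a predictor look promising for modeling, while strong patterns among the predictors themselves hint at collinearity.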

## Part (b)

Fit a linear model to predict the variable `divorce` from the variable `femlab`.
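The fit in part (b) can be sketched with `lm()`. The data frame below is a simulated stand-in (made-up values), so the printed estimates are not the answers for the real data.

```r
# Simulated stand-in; replace with the data read from divusa.txt
set.seed(1)
femlab <- runif(40, 20, 60)
divusa <- data.frame(femlab  = femlab,
                     divorce = 1 + 0.3 * femlab + rnorm(40))

# Simple linear regression of divorce on femlab
fit <- lm(divorce ~ femlab, data = divusa)

# Coefficient table (estimates, standard errors, t statistics, p-values),
# residual standard error and R-squared
print(summary(fit))
```

The `femlab` row of the coefficient table carries the t statistic and p-value referred to in part (c).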

## Part (c)

What *specific* hypothesis is being tested with the p-value given for the slope coefficient in the output in part (b)? (State the null and alternative hypotheses.) Do you accept or reject the null hypothesis, and on what basis?

## Part (d)

What is the sample size?

## Part (e)

Does the intercept term have a useful interpretation, in terms of the model? Explain in one or two sentences.

## Part (f)

What percentage of variation in the data is not explained by the model?
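For parts (d) and (f), the sample size and the unexplained share of variation can both be read off the fitted model. A sketch on simulated stand-in data (made-up values):

```r
# Simulated stand-in; replace with the data read from divusa.txt
set.seed(1)
femlab <- runif(40, 20, 60)
divusa <- data.frame(femlab  = femlab,
                     divorce = 1 + 0.3 * femlab + rnorm(40))
fit <- lm(divorce ~ femlab, data = divusa)

# Number of observations used in the fit
n_obs <- nobs(fit)

# R-squared is the explained share, so 1 - R-squared is the unexplained share
r2 <- summary(fit)$r.squared
unexplained_pct <- 100 * (1 - r2)

cat("n =", n_obs, "\n")
cat("unexplained variation (%):", round(unexplained_pct, 1), "\n")
```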

## Part (g)

Plot the standardized residuals against the response variable and the predictor variable, and produce a Q-Q plot of the standardized residuals. What can we conclude about the normality of the errors, the constancy of the error variance, and the relationship between the errors and the variable?
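One way to draw the three diagnostic plots from part (g), using `rstandard()` for the standardized residuals. The data are a simulated stand-in (made-up values), so the plots only show the mechanics.

```r
# Simulated stand-in; replace with the data read from divusa.txt
set.seed(1)
femlab <- runif(40, 20, 60)
divusa <- data.frame(femlab  = femlab,
                     divorce = 1 + 0.3 * femlab + rnorm(40))
fit <- lm(divorce ~ femlab, data = divusa)

# Standardized (internally studentized) residuals
r_std <- rstandard(fit)

op <- par(mfrow = c(1, 3))
# Residuals against the response
plot(divusa$divorce, r_std, xlab = "divorce", ylab = "Standardized residuals")
abline(h = 0, lty = 2)
# Residuals against the predictor
plot(divusa$femlab, r_std, xlab = "femlab", ylab = "Standardized residuals")
abline(h = 0, lty = 2)
# Normal Q-Q plot with reference line
qqnorm(r_std)
qqline(r_std)
par(op)
```

Points hugging the Q-Q line support normality; an even band around zero in the scatterplots supports constant variance, while curvature or a funnel shape argues against the model assumptions.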

## Part (h)

What is the estimated mean divorce rate when `femlab` = 38?

## Part (i)

Give a 97% prediction interval around the mean response estimated in part (h).
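The point estimate at `femlab = 38` and the 97% interval around it can be sketched with `predict()`. The data are a simulated stand-in (made-up values); a confidence interval for the mean is computed as well, only to show that the prediction interval is the wider of the two.

```r
# Simulated stand-in; replace with the data read from divusa.txt
set.seed(1)
femlab <- runif(40, 20, 60)
divusa <- data.frame(femlab  = femlab,
                     divorce = 1 + 0.3 * femlab + rnorm(40))
fit <- lm(divorce ~ femlab, data = divusa)

new_x <- data.frame(femlab = 38)

# Estimated mean response at femlab = 38
mean_est <- predict(fit, newdata = new_x)

# 97% prediction interval for a single new observation at femlab = 38
pi97 <- predict(fit, newdata = new_x, interval = "prediction", level = 0.97)

# 97% confidence interval for the mean response, for comparison
ci97 <- predict(fit, newdata = new_x, interval = "confidence", level = 0.97)

print(pi97)
print(ci97)
```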

## Part (j)

Give a 97% confidence interval for $\beta_1$, the slope coefficient.
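A sketch of the slope interval using `confint()`, with the same interval rebuilt by hand from the estimate, its standard error and the t quantile. The data are a simulated stand-in (made-up values).

```r
# Simulated stand-in; replace with the data read from divusa.txt
set.seed(1)
femlab <- runif(40, 20, 60)
divusa <- data.frame(femlab  = femlab,
                     divorce = 1 + 0.3 * femlab + rnorm(40))
fit <- lm(divorce ~ femlab, data = divusa)

# 97% confidence interval for the slope
ci_slope <- confint(fit, parm = "femlab", level = 0.97)
print(ci_slope)

# Same interval by hand: estimate +/- t quantile * standard error
coefs  <- coef(summary(fit))
est    <- coefs["femlab", "Estimate"]
se     <- coefs["femlab", "Std. Error"]
t_crit <- qt(1 - 0.03 / 2, df = df.residual(fit))
manual <- est + c(-1, 1) * t_crit * se
print(manual)
```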

## Part (k)

Suppose that the percent of female participation in the labor force increased by 13 from one year to the next. What would be the predicted change in the US divorce rate?
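In a simple linear model the predicted change is the slope times the change in the predictor, whatever the starting value. A sketch on simulated stand-in data (made-up values; the starting value 40 in the check is arbitrary):

```r
# Simulated stand-in; replace with the data read from divusa.txt
set.seed(1)
femlab <- runif(40, 20, 60)
divusa <- data.frame(femlab  = femlab,
                     divorce = 1 + 0.3 * femlab + rnorm(40))
fit <- lm(divorce ~ femlab, data = divusa)

# Predicted change in divorce for a 13-unit increase in femlab
slope  <- coef(fit)[["femlab"]]
change <- 13 * slope
print(change)

# Check: difference between predictions 13 units apart
check <- diff(predict(fit, newdata = data.frame(femlab = c(40, 53))))
print(unname(check))
```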

# Problem 2

Download the data set `Tree.txt`. Collected by Bruce and Schumacher, this classic dataset measures the diameter (x, in inches) and volume (y, in cubic feet) of shortleaf pines.

Load the data using `tree <- read.table("Tree.txt", header = T)`

## Part (a)

Fit a simple linear regression model for predictor diameter and response volume.

## Part (b)

Assess the appropriateness of the model fit using model diagnostics. Limit your response to two or three sentences.

## Part (c)

Fit a simple linear regression to the log-log transformed data (take the natural logarithm of both the response and predictor variable).
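A sketch of the log-log fit from part (c). The column names `diameter` and `volume` are assumptions (the actual header of `Tree.txt` is not shown here), and the data are simulated, made-up values.

```r
# Simulated stand-in with assumed column names; with the real file use
#   tree <- read.table("Tree.txt", header = TRUE)
# and substitute the actual column names.
set.seed(2)
diameter <- runif(50, 4, 24)
tree <- data.frame(
  diameter = diameter,
  volume   = exp(-2.8 + 2.5 * log(diameter) + rnorm(50, sd = 0.2))
)

# Regress log(volume) on log(diameter); log() is the natural logarithm
fit_log <- lm(log(volume) ~ log(diameter), data = tree)
print(summary(fit_log))
```

On this scale the slope is the exponent of a power law: the model implies volume is proportional to diameter raised to the power $\beta_1$.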

## Part (d)

Produce a Q-Q plot of the standardized residuals from the transformed model in part (c), and plot the standardized residuals against the response variable and the predictor variable. What can we conclude about the normality of the errors, the constancy of the error variance, and the relationship between the errors and the variable?

## Part (e)

What is the nature of the relationship between the diameter and volume of shortleaf pines? Is there a significant association between the diameter and volume?

## Part (f)

Interpret the coefficient $\beta_1$ in terms of the model.

## Part (g)

What is the expected volume of a tree with an eleven-inch diameter?
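A sketch of the prediction at an 11-inch diameter, assuming the log-log model from part (c) is the one intended. The column names `diameter` and `volume` are assumptions and the data are simulated, made-up values.

```r
# Simulated stand-in with assumed column names; replace with Tree.txt data
set.seed(2)
diameter <- runif(50, 4, 24)
tree <- data.frame(
  diameter = diameter,
  volume   = exp(-2.8 + 2.5 * log(diameter) + rnorm(50, sd = 0.2))
)
fit_log <- lm(log(volume) ~ log(diameter), data = tree)

# Prediction on the log scale, then back-transformed to cubic feet
log_pred <- predict(fit_log, newdata = data.frame(diameter = 11))
vol_pred <- exp(log_pred)
print(unname(vol_pred))

# Same value written as a power law: exp(b0) * 11^b1
b <- unname(coef(fit_log))
by_hand <- exp(b[1]) * 11^b[2]
print(by_hand)
```

Exponentiating a log-scale prediction estimates the median volume rather than the mean under normal errors on the log scale, which is worth stating when reporting the answer.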

# Problem 3

Download the semiconductor photomask line-spacing data. The data includes measurement errors for measurements taken at different line spacing. It appears that the precision of the line-spacing measurements decreases as the line spacing increases.

- `line_space`: The line spacing for the observation.

- `measurement_error`: The measurement error for that observation.

- `sd`: The standard deviation for the $Y_i$ of each observation.

Read in the data using `photomask <- read.table("measurements.txt", header = T)`

## Part (a)

Why would the Weighted Least Squares model be appropriate in this situation?

## Part (b)

Fit a weighted least squares regression model to predict the measurement error for a given line spacing by giving the weights directly to the `lm` function as `weights = `.
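A sketch of the weighted fit from part (b). It assumes inverse-variance weights, `1 / sd^2`, which is the usual choice when `sd` is each observation's error standard deviation; the assignment itself does not spell out the weights. The data are simulated, made-up values.

```r
# Simulated stand-in; with the real file use
#   photomask <- read.table("measurements.txt", header = TRUE)
set.seed(3)
line_space <- rep(seq(0.5, 4, by = 0.5), each = 5)
sd_i <- 0.005 * line_space            # spread grows with line spacing
photomask <- data.frame(
  line_space        = line_space,
  sd                = sd_i,
  measurement_error = 0.01 + 0.02 * line_space +
                      rnorm(length(line_space), sd = sd_i)
)

# Weighted least squares: precise observations (small sd) get more weight.
# Inside lm(), `sd` refers to the column of photomask.
fit_wls <- lm(measurement_error ~ line_space, data = photomask,
              weights = 1 / sd^2)
print(summary(fit_wls))
```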

## Part (c)

Is this model significant at $\alpha = 0.001$?

## Part (d)

Use your model from Part (b) to find a 95% prediction interval for a new measurement taken at a line spacing of 1.99.

## Part (e)

Why is the prediction interval in part (d) untrustworthy? Is the interval at this location going to be too small or too large?

## Part (f)

Build a new model that incorporates the weights into the variables for a LS model.

## Part (g)

Use your model from Part (f) to find a more accurate prediction interval for measurement error at `line_space` = 1.99. Use a standard deviation of 0.013.
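One possible reading of parts (f) and (g): divide the response, the intercept column and the predictor by `sd`, fit ordinary least squares without an automatic intercept, predict on the transformed scale, and multiply back by the stated standard deviation of 0.013. The names `y_t`, `int_t` and `x_t` are hypothetical, the 95% level is carried over from part (d) as an assumption, and the data are simulated, made-up values.

```r
# Simulated stand-in; replace with the data read from measurements.txt
set.seed(3)
line_space <- rep(seq(0.5, 4, by = 0.5), each = 5)
sd_i <- 0.005 * line_space
photomask <- data.frame(
  line_space        = line_space,
  sd                = sd_i,
  measurement_error = 0.01 + 0.02 * line_space +
                      rnorm(length(line_space), sd = sd_i)
)

# Fold the weights into the variables: every term is divided by sd
pm_t <- with(photomask, data.frame(
  y_t   = measurement_error / sd,   # transformed response
  int_t = 1 / sd,                   # transformed intercept column
  x_t   = line_space / sd           # transformed predictor
))
fit_t <- lm(y_t ~ 0 + int_t + x_t, data = pm_t)

# New point: line spacing 1.99 with standard deviation 0.013
sd0   <- 0.013
new_t <- data.frame(int_t = 1 / sd0, x_t = 1.99 / sd0)

# Interval on the transformed scale, then back to the original scale
pi_t    <- predict(fit_t, newdata = new_t,
                   interval = "prediction", level = 0.95)
pi_orig <- pi_t * sd0
print(pi_orig)
```

The coefficients of this transformed fit match those of the weighted fit with `weights = 1 / sd^2`; the difference lies in the interval, which now uses the standard deviation supplied for the new observation.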
