Question

Your Code

You have been asked to create regression models in the Module Two Problem Set. Review the Problem Set Report template to see the questions you will be answering about your models.

Use the empty blocks below to write the R code for your models and get outputs. Then use the outputs to answer the questions in your problem set report.

Note: Use the + (plus) button to add new code blocks or the scissor icon to remove empty code blocks, if needed.

In[12]:

# Step 1: Loading the Data Set
# Loading mtcars data set from a mtcars.csv file
mtcars <- read.csv(file='mtcars.csv', header=TRUE, sep=",")

# Converting appropriate variables to factors
mtcars2 <- within(mtcars, {
   vs <- factor(vs)
   am <- factor(am)
   cyl <- factor(cyl)
   gear <- factor(gear)
   carb <- factor(carb)
})

# Print the first six rows
print("head")
head(mtcars2, 6)

# Step 2: Subsetting Data and Correlation Matrix
myvars <- c("mpg","wt","drat")
mtcars_subset <- mtcars2[myvars]

# Print the first six rows
print("head")
head(mtcars_subset, 6)

# Print the correlation matrix
print("cor")
corr_matrix <- cor(mtcars_subset, method = "pearson")
round(corr_matrix, 4)

# Step 3: Multiple Regression With Interaction Term
# Create the multiple regression model and print summary statistics.
# Note that this model includes the interaction term.
model1 <- lm(mpg ~ wt + drat + wt:drat, data=mtcars_subset)
summary(model1)

# Step 4: Adding in a Qualitative Predictor
# Subsetting data to only include the variables that are needed
myvars <- c("mpg","wt","drat","am")
mtcars_subset <- mtcars2[myvars]

# Create the model
model2 <- lm(mpg ~ wt + drat + wt:drat + am, data=mtcars_subset)
summary(model2)

# Step 5: Fitted Values
# Predicted values
print("fitted")
fitted_values <- fitted.values(model2)
fitted_values

# Step 6: Residuals
# Residuals
print("residuals")
residuals <- residuals(model2)
residuals

# Step 7: Diagnostic Plots
# Residuals against Fitted Values
plot(fitted_values, residuals,
     main = "Residuals against Fitted Values",
     xlab = "Fitted Values", ylab = "Residuals",
     col="red", pch = 19, frame = FALSE)

# Step 8: Diagnostic Plots
# Q-Q Plot
qqnorm(residuals, pch = 19, col="red", frame = FALSE)
qqline(residuals, col = "blue", lwd = 2)

# Step 9: Confidence Interval for Parameter Estimates
# Confidence intervals for model parameters
print("confint")
conf_90_int <- confint(model2, level=0.90)
round(conf_90_int, 4)

# Step 10: Predictions, Prediction Interval, and Confidence Interval
newdata <- data.frame(wt=3.88, drat=3.05, am='1')

print("prediction interval")
prediction_pred_int <- predict(model2, newdata, interval="predict", level=0.90)
round(prediction_pred_int, 4)

print("confidence interval")
prediction_conf_int <- predict(model2, newdata, interval="confidence", level=0.90)
round(prediction_conf_int, 4)

[1] "head"

car                 mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4          21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag      21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710         22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive     21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout  18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
Valiant            18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

[1] "head"

mpg wt drat
21.0 2.620 3.90
21.0 2.875 3.90
22.8 2.320 3.85
21.4 3.215 3.08
18.7 3.440 3.15
18.1 3.460 2.76

[1] "cor"

mpg wt drat
mpg 1.0000 -0.8677 0.6812
wt -0.8677 1.0000 -0.7124
drat 0.6812 -0.7124 1.0000

Call:
lm(formula = mpg ~ wt + drat + wt:drat, data = mtcars_subset)

Residuals:
    Min      1Q  Median      3Q     Max
-3.8913 -1.8634 -0.3398  1.3247  6.4730

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)    5.550     12.631   0.439   0.6637
wt             3.884      3.798   1.023   0.3153
drat           8.494      3.321   2.557   0.0162 *
wt:drat       -2.543      1.093  -2.327   0.0274 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.839 on 28 degrees of freedom
Multiple R-squared:  0.7996,  Adjusted R-squared:  0.7782
F-statistic: 37.25 on 3 and 28 DF,  p-value: 6.567e-10

Call:
lm(formula = mpg ~ wt + drat + wt:drat + am, data = mtcars_subset)

Residuals:
    Min      1Q  Median      3Q     Max
-3.6907 -1.4711 -0.2512  0.9344  6.7453

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)    3.247     12.914   0.251   0.8034
wt             4.168      3.822   1.091   0.2851
drat           9.562      3.529   2.710   0.0116 *
am1           -1.464      1.597  -0.917   0.3674
wt:drat       -2.708      1.111  -2.438   0.0216 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.847 on 27 degrees of freedom
Multiple R-squared:  0.8057,  Adjusted R-squared:  0.7769
F-statistic: 27.99 on 4 and 27 DF,  p-value: 2.948e-09

[1] "fitted"

 1  22.3202071012681
 2  20.6895074959298
 3  24.0747502539664
 4  19.2785936167189
 5  18.3566004049746
 6  18.1948543136102
 7  17.7829147976458
 8  19.9455031142101
 9  20.4155783618075
10  18.5453485011573
11  18.5453485011573
12  15.7244271929934
13  17.1343818387994
14  16.9270355673574
15  11.4830468223862
16  10.468474335848
17  9.65079623894033
18  25.654703888853
19  34.0906848946004
20  28.8096807794287
21  24.1983096753214
22  17.9964150947557
23  18.378418393557
24  16.124985574862
25  16.6489676538974
26  27.4785433498266
27  27.3857668975878
28  28.6890131121626
29  19.1154587119623
30  20.7842398534528
31  16.2835591339497
32  21.7238845270116

[1] "residuals"

 1  -1.32020710126808
 2  0.310492504070191
 3  -1.2747502539664
 4  2.12140638328108
 5  0.343399595025413
 6  -0.0948543136101894
 7  -3.48291479764585
 8  4.45449688578988
 9  2.38442163819248
10  0.654651498842696
11  -0.745348501157303
12  0.675572807006601
13  0.16561816120057
14  -1.72703556735736
15  -1.08304682238617
16  -0.068474335848017
17  5.04920376105967
18  6.74529611114699
19  -3.69068489460044
20  5.09031922057126
21  -2.69830967532141
22  -2.49641509475568
23  -3.17841839355699
24  -2.82498557486204
25  2.55103234610262
26  -0.178543349826627
27  -1.38576689758782
28  1.71098688783741
29  -3.31545871196233
30  -1.08423985345285
31  -1.28355913394969
32  -0.323884527011619

[1] "confint"

5 % 95 %
(Intercept) -18.7488 25.2427
wt -2.3414 10.6771
drat 3.5516 15.5725
am1 -4.1845 1.2564
wt:drat -4.6004 -0.8164

[1] "prediction interval"

fit lwr upr
15.0672 9.4501 20.6844

[1] "confidence interval"

fit lwr upr
15.0672 12.2316 17.9029
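
As a sanity check on the output above, the point prediction can be reproduced by hand from the model2 coefficients. The sketch below uses the rounded estimates printed in the summary, so it should agree with the reported fit of 15.0672 only to about two decimal places. The variable names are illustrative and not part of the original script.

```r
# Rounded coefficient estimates copied from the model2 summary output
b <- c(intercept = 3.247, wt = 4.168, drat = 9.562, am1 = -1.464, wt_drat = -2.708)

# Profile used in Step 10: wt = 3.88, drat = 3.05, am = '1'
wt <- 3.88
drat <- 3.05
am1 <- 1

# Prediction equation: mpg = b0 + b1*wt + b2*drat + b3*am1 + b4*wt*drat
pred <- b[["intercept"]] + b[["wt"]] * wt + b[["drat"]] * drat +
  b[["am1"]] * am1 + b[["wt_drat"]] * wt * drat
round(pred, 2)

# With an interaction term, the effect of one more unit of wt depends on drat
slope_wt <- b[["wt"]] + b[["wt_drat"]] * drat
round(slope_wt, 2)
```

The hand calculation gives about 15.07 mpg. At drat = 3.05, the estimated change in mpg per unit increase in wt is about -4.09, which shows why the wt coefficient cannot be read on its own when wt:drat is in the model.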

Can you please arrange the answer concisely as numbered items (1, 2, 3, ...), one per question, starting from the introduction?

1. Introduction

Discuss the statement of the problem about the statistical analyses that are being performed. Address the following questions in your analysis:

  • What is the data set that you are exploring?
  • How might your results be used?
  • What type of analyses will you be running in this problem set?

Answer the questions in a paragraph response. Remove all questions and this note before submitting! Do not include R code in your report.

2. Data Preparation

There are some important variables that you have been asked to analyze in this problem set. Identify and explain these variables. Address the following questions in your analysis:

  • What are the important variables in this data set?
  • How many rows and columns are present in this data set?
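
One way to answer the rows-and-columns question is to inspect the data frame directly. This is a minimal sketch that assumes R's built-in mtcars data set matches the mtcars.csv file used in the problem set. The head() output above suggests the CSV also stores the car name as its own column, in which case the column count from the CSV would be one higher than the count shown here.

```r
# Assumption: the built-in mtcars data set stands in for mtcars.csv
data(mtcars)

n_rows <- nrow(mtcars)
n_cols <- ncol(mtcars)
cat("rows:", n_rows, "columns:", n_cols, "\n")

# Variables named in the report template: mpg, hp, qsec, drat, cyl
str(mtcars[, c("mpg", "hp", "qsec", "drat", "cyl")])
```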

3. Model with Interaction Term

Correlation Analysis

Describe the relationships between variables in the data set. Address the following questions in your analysis:

  • Calculate Pearson Correlation Coefficients between fuel economy (mpg) and horsepower (hp); fuel economy and quarter mile time (qsec); and fuel economy and rear axle ratio (drat). Comment on the strength and direction of these correlation coefficients.
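
The script above computes correlations for mpg, wt, and drat, while this question asks about hp, qsec, and drat. A sketch of the adjusted Step 2, again assuming the built-in mtcars data set matches mtcars.csv:

```r
# Assumption: the built-in mtcars data set stands in for mtcars.csv
data(mtcars)

# Variables requested in the report template
myvars <- c("mpg", "hp", "qsec", "drat")
corr_matrix <- cor(mtcars[, myvars], method = "pearson")

# Correlations of fuel economy with each of the three predictors
round(corr_matrix["mpg", c("hp", "qsec", "drat")], 4)
```

The sign of each coefficient gives the direction of the relationship and its absolute value gives the strength. The mpg and drat correlation should match the 0.6812 reported in the output above.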

Reporting Results

Report the results of the regression model. Address the following questions in your analysis:

  • Write the general form and the prediction equation of the regression model for fuel economy using horsepower, quarter mile time, and rear axle ratio as predictors. Include interaction terms for horsepower and quarter mile time; horsepower and rear axle ratio.
  • Create the regression model for fuel economy using horsepower, quarter mile time, and rear axle ratio as predictors. Include interaction terms for horsepower and quarter mile time; horsepower and rear axle ratio. Write the prediction model equation using outputs obtained from your R script.
  • What are the values of R-squared and Adjusted R-squared for the model? Provide your interpretation of these statistics.
  • For this model, estimate the change in fuel economy of a car with 160 horsepower for each unit increase in quarter mile time. Explain your answer.
  • Now estimate the change in fuel economy of a car with 160 horsepower for each unit increase in rear axle ratio. Explain your answer.
  • Obtain fitted values and residuals using the model for the data set and create the following plots:
  • Residuals against Fitted Values
  • Normal Q-Q plot
  • Based on these plots, what can you say about the assumptions of homoscedasticity and normality of the residuals? Be detailed in your response.
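
The questions above refer to a model with hp, qsec, drat and two interaction terms, which differs from the wt and drat model in the script. The following is a sketch of how that model might be fitted and how the "change per unit increase" questions follow from the interaction terms. It assumes the built-in mtcars data set matches mtcars.csv; names such as hp0, effect_qsec, and effect_drat are illustrative.

```r
# Assumption: the built-in mtcars data set stands in for mtcars.csv
data(mtcars)

# General form:
# mpg = b0 + b1*hp + b2*qsec + b3*drat + b4*hp*qsec + b5*hp*drat + error
model1 <- lm(mpg ~ hp + qsec + drat + hp:qsec + hp:drat, data = mtcars)
s1 <- summary(model1)

# R-squared and Adjusted R-squared
round(c(r_squared = s1$r.squared, adj_r_squared = s1$adj.r.squared), 4)

# Coefficients for the prediction equation
b <- coef(model1)
round(b, 4)

# Change in mpg per unit increase in qsec at a fixed horsepower: b2 + b4*hp
hp0 <- 160
effect_qsec <- b[["qsec"]] + b[["hp:qsec"]] * hp0

# Change in mpg per unit increase in drat at a fixed horsepower: b3 + b5*hp
effect_drat <- b[["drat"]] + b[["hp:drat"]] * hp0

round(c(effect_qsec = effect_qsec, effect_drat = effect_drat), 4)
```

Because hp appears in both interaction terms, the effect of qsec or drat is not a single coefficient. It is the main-effect coefficient plus the interaction coefficient multiplied by the chosen horsepower.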

Evaluating Model Significance

Evaluate model significance for the regression model. Address the following questions in your analysis:

  • Is the model significant at a 5% level of significance? Carry out the overall F-test by identifying the null hypothesis, the alternative hypothesis, the P-value, and the conclusion of the test.
  • Which terms in the model are significant at a 5% level of significance? Carry out individual beta tests by identifying the null hypothesis, the alternative hypothesis, the P-value, and the conclusion of the test.
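
The P-values needed for both tests can be read from summary(), or extracted programmatically as sketched here. The overall F-test has the null hypothesis that all slope coefficients are zero; each individual test has the null hypothesis that one coefficient is zero. The sketch assumes the built-in mtcars data set matches mtcars.csv, and alpha, p_overall, and significant are illustrative names.

```r
# Assumption: the built-in mtcars data set stands in for mtcars.csv
data(mtcars)
model1 <- lm(mpg ~ hp + qsec + drat + hp:qsec + hp:drat, data = mtcars)
s1 <- summary(model1)
alpha <- 0.05

# Overall F-test: H0 is that every slope coefficient equals zero
f <- s1$fstatistic  # named vector: value, numdf, dendf
p_overall <- pf(f[["value"]], f[["numdf"]], f[["dendf"]], lower.tail = FALSE)
cat("F-test P-value:", signif(p_overall, 4),
    "| reject H0:", p_overall < alpha, "\n")

# Individual t-tests: H0 is that the given coefficient equals zero
coef_table <- s1$coefficients
significant <- coef_table[, "Pr(>|t|)"] < alpha
print(cbind(round(coef_table, 4), significant = significant))
```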

Making Predictions Using the Model

Make predictions using the regression model. Address the following questions in your analysis:

  • What is the predicted fuel economy for a car that has 175 horsepower, 14.2 quarter mile time and 3.91 rear-axle ratio?
  • What is the 95% prediction interval for the fuel economy of this car? Interpret the interval.
  • What is the 95% confidence interval for the fuel economy of this car? Interpret the interval.
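
Step 10 of the script uses a 90% level and a wt, drat, am profile, so it needs adjusting for these questions. A sketch with the requested profile and a 95% level, assuming the built-in mtcars data set matches mtcars.csv:

```r
# Assumption: the built-in mtcars data set stands in for mtcars.csv
data(mtcars)
model1 <- lm(mpg ~ hp + qsec + drat + hp:qsec + hp:drat, data = mtcars)

# Profile from the question: 175 hp, 14.2 quarter mile time, 3.91 rear axle ratio
newdata <- data.frame(hp = 175, qsec = 14.2, drat = 3.91)

# Interval for the fuel economy of one individual car with this profile
pred_int <- predict(model1, newdata, interval = "prediction", level = 0.95)
round(pred_int, 4)

# Interval for the mean fuel economy of all cars with this profile
conf_int <- predict(model1, newdata, interval = "confidence", level = 0.95)
round(conf_int, 4)
```

Both calls return the same fit value; only the lwr and upr bounds differ.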

4. Model with Interaction Term and Qualitative Predictor

Reporting Results

Report the results of the regression model. Address the following questions in your analysis:

  • Write the general form and the prediction equation of the regression model for fuel economy using horsepower, quarter mile time, interaction term for horsepower and quarter mile time, and number of cylinders. Note that number of cylinders is a qualitative predictor.
  • Create the regression model for fuel economy using horsepower, quarter mile time, interaction term for horsepower and quarter mile time, and number of cylinders. Note that number of cylinders is a qualitative predictor. Write the prediction model equation using outputs obtained from your R script. Let us call this model 2.
  • What are the values of R-squared and Adjusted R-squared for the model? Provide your interpretation of these statistics.
  • Obtain fitted values and residuals for the data set using model 2 and create the following plots:
  • Residuals against Fitted Values
  • Normal Q-Q plot
  • Based on these plots, what can you say about the assumptions of homoscedasticity and normality of the residuals? Be detailed in your response.
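
A sketch of model 2 as described in this section, with cyl converted to a factor so that R creates indicator variables for 6 and 8 cylinders (4 cylinders is the baseline). It assumes the built-in mtcars data set matches mtcars.csv. The plots are written to a temporary PDF so the sketch also runs outside a notebook; plot_file, fitted_vals, and resid_vals are illustrative names.

```r
# Assumption: the built-in mtcars data set stands in for mtcars.csv
data(mtcars)
mtcars2 <- mtcars
mtcars2$cyl <- factor(mtcars2$cyl)  # qualitative predictor

model2 <- lm(mpg ~ hp + qsec + hp:qsec + cyl, data = mtcars2)
s2 <- summary(model2)
round(c(r_squared = s2$r.squared, adj_r_squared = s2$adj.r.squared), 4)

fitted_vals <- fitted(model2)
resid_vals <- resid(model2)

# Write both diagnostic plots to a temporary PDF file
plot_file <- tempfile(fileext = ".pdf")
pdf(plot_file)

# Homoscedasticity: look for a roughly constant vertical spread around zero
plot(fitted_vals, resid_vals,
     main = "Residuals against Fitted Values",
     xlab = "Fitted Values", ylab = "Residuals",
     col = "red", pch = 19, frame = FALSE)
abline(h = 0, lty = 2)

# Normality: points should fall close to the reference line
qqnorm(resid_vals, pch = 19, col = "red", frame = FALSE)
qqline(resid_vals, col = "blue", lwd = 2)

dev.off()
```

In a notebook, the pdf() and dev.off() calls can be dropped so the plots render inline. A funnel shape in the first plot would suggest non-constant variance, and systematic departures from the line in the Q-Q plot would suggest non-normal residuals.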

Evaluating Model Significance

Evaluate model significance for the regression model. Address the following questions in your analysis:

  • Is the model significant at a 5% level of significance? Carry out the overall F-test by identifying the null hypothesis, the alternative hypothesis, the P-value, and the conclusion of the test.
  • Which terms in the model are significant at a 5% level of significance? Carry out individual beta tests by identifying the null hypothesis, the alternative hypothesis, the P-value, and the conclusion of the test.

Making Predictions Using the Model

Make predictions using the regression model. Address the following questions in your analysis:

  • Using the second model, what is the predicted fuel economy for a car that has 175 horsepower, 14.2 quarter mile time and 6 cylinders? Note that the number of cylinders is a qualitative variable. Therefore, set it equal to '6' (using single quotes).
  • What is the 95% prediction interval for the fuel economy of this car? Interpret the interval.
  • What is the 95% confidence interval for the fuel economy of this car? Interpret the interval.
  • Why are prediction intervals wider than confidence intervals?
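
The last question can be illustrated numerically. A confidence interval reflects only the uncertainty in the estimated mean response, while a prediction interval also includes the residual variance of an individual observation, so its standard error is sqrt(se_mean^2 + sigma^2). The sketch below rebuilds the prediction interval by hand from that formula. It assumes the built-in mtcars data set matches mtcars.csv; names such as se_mean, se_pred, and manual_pi are illustrative.

```r
# Assumption: the built-in mtcars data set stands in for mtcars.csv
data(mtcars)
mtcars2 <- mtcars
mtcars2$cyl <- factor(mtcars2$cyl)
model2 <- lm(mpg ~ hp + qsec + hp:qsec + cyl, data = mtcars2)

# Profile from the question; cyl must be one of the factor levels used in the fit
newdata <- data.frame(hp = 175, qsec = 14.2,
                      cyl = factor("6", levels = levels(mtcars2$cyl)))

ci <- predict(model2, newdata, interval = "confidence", level = 0.95, se.fit = TRUE)
pi <- predict(model2, newdata, interval = "prediction", level = 0.95)

se_mean <- ci$se.fit            # standard error of the estimated mean response
sigma   <- ci$residual.scale    # residual standard error
se_pred <- sqrt(se_mean^2 + sigma^2)
t_crit  <- qt(0.975, df = ci$df)

# Prediction interval rebuilt by hand from the formula above
manual_pi <- ci$fit[1, "fit"] + c(-1, 1) * t_crit * se_pred

round(rbind(confidence = ci$fit[1, c("lwr", "upr")],
            prediction = pi[1, c("lwr", "upr")],
            manual_pi  = manual_pi), 4)
```

Since sigma^2 is added under the square root, se_pred is always larger than se_mean, and the prediction interval is always the wider of the two at the same level.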

5. Conclusion

Describe the results of the statistical analyses and address the following questions:

  • Based on the analysis that you have performed here and assuming that the sample size is sufficiently large, which model would you recommend?
  • Fully describe what these results mean for your scenario using proper descriptions of statistical terms and concepts.
  • What is the practical importance of the analyses that were performed?
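
For the model recommendation, one possible approach is to put a few fit statistics side by side. The two models described in the template are not nested (model 1 contains drat and hp:drat, model 2 contains cyl instead), so a partial F-test does not apply directly. This sketch compares Adjusted R-squared, residual standard error, and AIC as one reasonable choice of criteria, assuming the built-in mtcars data set matches mtcars.csv.

```r
# Assumption: the built-in mtcars data set stands in for mtcars.csv
data(mtcars)
mtcars2 <- mtcars
mtcars2$cyl <- factor(mtcars2$cyl)

model1 <- lm(mpg ~ hp + qsec + drat + hp:qsec + hp:drat, data = mtcars2)
model2 <- lm(mpg ~ hp + qsec + hp:qsec + cyl, data = mtcars2)
s1 <- summary(model1)
s2 <- summary(model2)

comparison <- data.frame(
  model         = c("model1", "model2"),
  adj_r_squared = c(s1$adj.r.squared, s2$adj.r.squared),
  residual_se   = c(s1$sigma, s2$sigma),
  aic           = c(AIC(model1), AIC(model2))
)
print(comparison, digits = 4)
```

A higher Adjusted R-squared together with a lower residual standard error and AIC would favor a model, though the diagnostic plots and the significance of individual terms should also inform the recommendation.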

6. Citations
