Question
Your Code
You have been asked to create regression models in the Module Two Problem Set. Review the Problem Set Report template to see the questions you will be answering about your models.
Use the empty blocks below to write the R code for your models and get outputs. Then use the outputs to answer the questions in your problem set report.
Note: Use the + (plus) button to add new code blocks or the scissor icon to remove empty code blocks, if needed.
    # Step 1: Loading the Data Set
    # Loading mtcars data set from a mtcars.csv file
    mtcars <- read.csv(file='mtcars.csv', header=TRUE, sep=",")

    # Converting appropriate variables to factors
    mtcars2 <- within(mtcars, {
      vs <- factor(vs)
      am <- factor(am)
      cyl <- factor(cyl)
      gear <- factor(gear)
      carb <- factor(carb)
    })

    # Print the first six rows
    print("head")
    head(mtcars2, 6)

    # Step 2: Subsetting Data and Correlation Matrix
    myvars <- c("mpg","wt","drat")
    mtcars_subset <- mtcars2[myvars]

    # Print the first six rows
    print("head")
    head(mtcars_subset, 6)

    # Print the correlation matrix
    print("cor")
    corr_matrix <- cor(mtcars_subset, method = "pearson")
    round(corr_matrix, 4)

    # Step 3: Multiple Regression With Interaction Term
    # Create the multiple regression model and print summary statistics.
    # Note that this model includes the interaction term.
    model1 <- lm(mpg ~ wt + drat + wt:drat, data=mtcars_subset)
    summary(model1)

    # Step 4: Adding in a Qualitative Predictor
    # Subsetting data to only include the variables that are needed
    myvars <- c("mpg","wt","drat","am")
    mtcars_subset <- mtcars2[myvars]

    # Create the model
    model2 <- lm(mpg ~ wt + drat + wt:drat + am, data=mtcars_subset)
    summary(model2)

    # Step 5: Fitted Values
    # Predicted values
    print("fitted")
    fitted_values <- fitted.values(model2)
    fitted_values

    # Step 6: Residuals
    # Residuals
    print("residuals")
    residuals <- residuals(model2)
    residuals

    # Step 7: Diagnostic Plots
    # Residuals against Fitted Values
    plot(fitted_values, residuals,
         main = "Residuals against Fitted Values",
         xlab = "Fitted Values", ylab = "Residuals",
         col="red", pch = 19, frame = FALSE)

    # Step 8: Diagnostic Plots
    # Q-Q Plot
    qqnorm(residuals, pch = 19, col="red", frame = FALSE)
    qqline(residuals, col = "blue", lwd = 2)

    # Step 9: Confidence Interval for Parameter Estimates
    # Confidence intervals for model parameters
    print("confint")
    conf_90_int <- confint(model2, level=0.90)
    round(conf_90_int, 4)

    # Step 10: Predictions, Prediction Interval, and Confidence Interval
    newdata <- data.frame(wt=3.88, drat=3.05, am='1')

    print("prediction interval")
    prediction_pred_int <- predict(model2, newdata, interval="predict", level=0.90)
    round(prediction_pred_int, 4)

    print("confidence interval")
    prediction_conf_int <- predict(model2, newdata, interval="confidence", level=0.90)
    round(prediction_conf_int, 4)
[1] "head"
car | mpg | cyl | disp | hp | drat | wt | qsec | vs | am | gear | carb |
---|---|---|---|---|---|---|---|---|---|---|---|
Mazda RX4 | 21.0 | 6 | 160 | 110 | 3.90 | 2.620 | 16.46 | 0 | 1 | 4 | 4 |
Mazda RX4 Wag | 21.0 | 6 | 160 | 110 | 3.90 | 2.875 | 17.02 | 0 | 1 | 4 | 4 |
Datsun 710 | 22.8 | 4 | 108 | 93 | 3.85 | 2.320 | 18.61 | 1 | 1 | 4 | 1 |
Hornet 4 Drive | 21.4 | 6 | 258 | 110 | 3.08 | 3.215 | 19.44 | 1 | 0 | 3 | 1 |
Hornet Sportabout | 18.7 | 8 | 360 | 175 | 3.15 | 3.440 | 17.02 | 0 | 0 | 3 | 2 |
Valiant | 18.1 | 6 | 225 | 105 | 2.76 | 3.460 | 20.22 | 1 | 0 | 3 | 1 |
[1] "head"
mpg | wt | drat |
---|---|---|
21.0 | 2.620 | 3.90 |
21.0 | 2.875 | 3.90 |
22.8 | 2.320 | 3.85 |
21.4 | 3.215 | 3.08 |
18.7 | 3.440 | 3.15 |
18.1 | 3.460 | 2.76 |
[1] "cor"
| | mpg | wt | drat |
---|---|---|---|
mpg | 1.0000 | -0.8677 | 0.6812 |
wt | -0.8677 | 1.0000 | -0.7124 |
drat | 0.6812 | -0.7124 | 1.0000 |
    Call:
    lm(formula = mpg ~ wt + drat + wt:drat, data = mtcars_subset)

    Residuals:
        Min      1Q  Median      3Q     Max
    -3.8913 -1.8634 -0.3398  1.3247  6.4730

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)    5.550     12.631   0.439   0.6637
    wt             3.884      3.798   1.023   0.3153
    drat           8.494      3.321   2.557   0.0162 *
    wt:drat       -2.543      1.093  -2.327   0.0274 *
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 2.839 on 28 degrees of freedom
    Multiple R-squared:  0.7996,  Adjusted R-squared:  0.7782
    F-statistic: 37.25 on 3 and 28 DF,  p-value: 6.567e-10
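Reading the estimates off this summary, the prediction equation for the first model can be written out as follows. This is only a sketch built from the rounded coefficients printed above, so any arithmetic based on it is approximate; the symbols simply mirror the R variable names.

```latex
\widehat{\mathrm{mpg}} = 5.550 + 3.884\,\mathrm{wt} + 8.494\,\mathrm{drat} - 2.543\,(\mathrm{wt}\times\mathrm{drat})

% Because of the interaction term, each slope depends on the other predictor:
\frac{\partial\,\widehat{\mathrm{mpg}}}{\partial\,\mathrm{wt}} = 3.884 - 2.543\,\mathrm{drat},
\qquad
\frac{\partial\,\widehat{\mathrm{mpg}}}{\partial\,\mathrm{drat}} = 8.494 - 2.543\,\mathrm{wt}
```

For example, at drat = 3.05 the estimated change in mpg per unit increase in wt is about 3.884 - 2.543(3.05) = -3.87, so the positive main-effect coefficient on wt does not by itself mean that heavier cars get better mileage.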
    Call:
    lm(formula = mpg ~ wt + drat + wt:drat + am, data = mtcars_subset)

    Residuals:
        Min      1Q  Median      3Q     Max
    -3.6907 -1.4711 -0.2512  0.9344  6.7453

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)    3.247     12.914   0.251   0.8034
    wt             4.168      3.822   1.091   0.2851
    drat           9.562      3.529   2.710   0.0116 *
    am1           -1.464      1.597  -0.917   0.3674
    wt:drat       -2.708      1.111  -2.438   0.0216 *
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 2.847 on 27 degrees of freedom
    Multiple R-squared:  0.8057,  Adjusted R-squared:  0.7769
    F-statistic: 27.99 on 4 and 27 DF,  p-value: 2.948e-09
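The second summary can be turned into a prediction equation in the same way. In this sketch, the indicator written as am1 is assumed to be the dummy variable R creates for level 1 of the factor am (level 0 being the baseline); the coefficients are the rounded values printed above.

```latex
\widehat{\mathrm{mpg}} = 3.247 + 4.168\,\mathrm{wt} + 9.562\,\mathrm{drat} - 1.464\,\mathrm{am1} - 2.708\,(\mathrm{wt}\times\mathrm{drat}),
\qquad
\mathrm{am1} =
\begin{cases}
1 & \text{if } \mathrm{am} = 1\\
0 & \text{if } \mathrm{am} = 0
\end{cases}
```

The am1 term shifts the intercept by -1.464 for cars with am = 1 and leaves the slopes unchanged, since the model has no interaction between am and the other predictors.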
[1] "fitted"
observation | fitted value |
---|---|
1 | 22.3202071012681 |
2 | 20.6895074959298 |
3 | 24.0747502539664 |
4 | 19.2785936167189 |
5 | 18.3566004049746 |
6 | 18.1948543136102 |
7 | 17.7829147976458 |
8 | 19.9455031142101 |
9 | 20.4155783618075 |
10 | 18.5453485011573 |
11 | 18.5453485011573 |
12 | 15.7244271929934 |
13 | 17.1343818387994 |
14 | 16.9270355673574 |
15 | 11.4830468223862 |
16 | 10.468474335848 |
17 | 9.65079623894033 |
18 | 25.654703888853 |
19 | 34.0906848946004 |
20 | 28.8096807794287 |
21 | 24.1983096753214 |
22 | 17.9964150947557 |
23 | 18.378418393557 |
24 | 16.124985574862 |
25 | 16.6489676538974 |
26 | 27.4785433498266 |
27 | 27.3857668975878 |
28 | 28.6890131121626 |
29 | 19.1154587119623 |
30 | 20.7842398534528 |
31 | 16.2835591339497 |
32 | 21.7238845270116 |
[1] "residuals"
observation | residual |
---|---|
1 | -1.32020710126808 |
2 | 0.310492504070191 |
3 | -1.2747502539664 |
4 | 2.12140638328108 |
5 | 0.343399595025413 |
6 | -0.0948543136101894 |
7 | -3.48291479764585 |
8 | 4.45449688578988 |
9 | 2.38442163819248 |
10 | 0.654651498842696 |
11 | -0.745348501157303 |
12 | 0.675572807006601 |
13 | 0.16561816120057 |
14 | -1.72703556735736 |
15 | -1.08304682238617 |
16 | -0.068474335848017 |
17 | 5.04920376105967 |
18 | 6.74529611114699 |
19 | -3.69068489460044 |
20 | 5.09031922057126 |
21 | -2.69830967532141 |
22 | -2.49641509475568 |
23 | -3.17841839355699 |
24 | -2.82498557486204 |
25 | 2.55103234610262 |
26 | -0.178543349826627 |
27 | -1.38576689758782 |
28 | 1.71098688783741 |
29 | -3.31545871196233 |
30 | -1.08423985345285 |
31 | -1.28355913394969 |
32 | -0.323884527011619 |
[1] "confint"
| | 5 % | 95 % |
---|---|---|
(Intercept) | -18.7488 | 25.2427 |
wt | -2.3414 | 10.6771 |
drat | 3.5516 | 15.5725 |
am1 | -4.1845 | 1.2564 |
wt:drat | -4.6004 | -0.8164 |
[1] "prediction interval"
fit | lwr | upr |
---|---|---|
15.0672 | 9.4501 | 20.6844 |
[1] "confidence interval"
fit | lwr | upr |
---|---|---|
15.0672 | 12.2316 | 17.9029 |
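As a sanity check on these two tables, the point prediction can be reproduced by hand from the rounded model 2 coefficients, and the two interval widths can be compared directly. The sketch below only re-uses numbers printed above; the variable names are illustrative.

```r
# Rounded coefficient estimates copied from the model 2 summary above
b <- c(intercept = 3.247, wt = 4.168, drat = 9.562, am1 = -1.464, wt_drat = -2.708)

# The new car used in Step 10: wt = 3.88, drat = 3.05, am = '1'
wt <- 3.88
drat <- 3.05
am1 <- 1

# Plug the values into the prediction equation, including the interaction
by_hand <- b[["intercept"]] + b[["wt"]] * wt + b[["drat"]] * drat +
  b[["am1"]] * am1 + b[["wt_drat"]] * wt * drat
print(round(by_hand, 2))

# Interval widths taken from the two tables above (90% level)
pred_width <- 20.6844 - 9.4501
conf_width <- 17.9029 - 12.2316
print(c(prediction = pred_width, confidence = conf_width))
```

The hand calculation agrees with the reported fit of 15.0672 to two decimals; the small remaining gap comes from using rounded coefficients. The prediction interval is roughly twice as wide as the confidence interval for this car.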
Can you please arrange the answer concisely as numbered sections (1, 2, 3, and so on), starting from the introduction?
1. Introduction
Discuss the statement of the problem about the statistical analyses that are being performed. Address the following questions in your analysis:
- What is the data set that you are exploring?
- How might your results be used?
- What type of analyses will you be running in this problem set?
Answer the questions in a paragraph response. Remove all questions and this note before submitting! Do not include R code in your report.
2. Data Preparation
There are some important variables that you have been asked to analyze in this problem set. Identify and explain these variables. Address the following questions in your analysis:
- What are the important variables in this data set?
- How many rows and columns are present in this data set?
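The row and column counts can be read directly from the data frame. The sketch below assumes that R's built-in `mtcars` data set stands in for `mtcars.csv`; the name `cars` is just an illustrative copy.

```r
# Assumption: the built-in mtcars data matches the mtcars.csv file used above
cars <- datasets::mtcars

n_rows <- nrow(cars)
n_cols <- ncol(cars)
print(c(rows = n_rows, columns = n_cols))

# Inspect the variables named in the report template
str(cars[, c("mpg", "hp", "qsec", "drat", "cyl")])
```

The built-in data stores the car names as row names rather than as a column, so a CSV that carries a `car` column, like the one printed above, should report one more column when passed to `dim()`.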
3. Model with Interaction Term
Correlation Analysis
Describe the relationships between variables in the data set. Address the following questions in your analysis:
- Calculate Pearson Correlation Coefficients between fuel economy (mpg) and horsepower (hp); fuel economy and quarter mile time (qsec); and fuel economy and rear axle ratio (drat). Comment on the strength and direction of these correlation coefficients.
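The script in the question correlates mpg with wt and drat, whereas this item asks about hp, qsec, and drat. A minimal sketch of the requested correlations, again assuming the built-in `mtcars` data matches the CSV, could look like this:

```r
# Assumption: the built-in mtcars data matches the mtcars.csv file used above
cars <- datasets::mtcars

# Pearson correlation of mpg with each predictor named in the template
predictors <- c("hp", "qsec", "drat")
mpg_cor <- sapply(predictors, function(v) {
  cor(cars$mpg, cars[[v]], method = "pearson")
})
print(round(mpg_cor, 4))
```

The sign of each coefficient gives the direction of the linear relationship and its absolute value gives the strength; the drat entry should reproduce the 0.6812 shown in the correlation matrix above.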
Reporting Results
Report the results of the regression model. Address the following questions in your analysis:
- Write the general form and the prediction equation of the regression model for fuel economy using horsepower, quarter mile time, and rear axle ratio as predictors. Include interaction terms for horsepower and quarter mile time; horsepower and rear axle ratio.
- Create the regression model for fuel economy using horsepower, quarter mile time, and rear axle ratio as predictors. Include interaction terms for horsepower and quarter mile time; horsepower and rear axle ratio. Write the prediction model equation using outputs obtained from your R script.
- What are the values of R-squared and adjusted R-squared for the model? Provide your interpretation of these statistics.
- For this model, estimate the change in fuel economy of a car with 160 horsepower for each unit increase in quarter mile time. Explain your answer.
- Now estimate the change in fuel economy of a car with 160 horsepower for each unit increase in rear axle ratio. Explain your answer.
- Obtain fitted values and residuals using the model for the data set and create the following plots:
- Residuals against Fitted Values
- Normal Q-Q plot
- Based on these plots, what can you say about the assumptions of homoscedasticity and normality of the residuals? Be detailed in your response.
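For the two "change in fuel economy at 160 horsepower" items above, the interaction terms mean the slope of each predictor depends on horsepower. The sketch below fits the model the template describes (not the wt/drat model from the script) on the built-in `mtcars` data, which is assumed to match the CSV; `model_t` and `hp0` are illustrative names.

```r
# Assumption: the built-in mtcars data matches the mtcars.csv file used above
cars <- datasets::mtcars

# Model described in the template: hp, qsec, drat plus hp:qsec and hp:drat
model_t <- lm(mpg ~ hp + qsec + drat + hp:qsec + hp:drat, data = cars)
b <- coef(model_t)

# With interactions, the slope of each predictor depends on hp
hp0 <- 160
effect_qsec <- b[["qsec"]] + b[["hp:qsec"]] * hp0
effect_drat <- b[["drat"]] + b[["hp:drat"]] * hp0
print(c(qsec = effect_qsec, drat = effect_drat))
```

Each value is the estimated change in mpg for a one-unit increase in that predictor for a car with 160 horsepower, holding the other predictor fixed.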
Evaluating Model Significance
Evaluate model significance for the regression model. Address the following questions in your analysis:
- Is the model significant at a 5% level of significance? Carry out the overall F-test by identifying the null hypothesis, the alternative hypothesis, the P-value, and the conclusion of the test.
- Which terms in the model are significant at a 5% level of significance? Carry out individual beta tests by identifying the null hypothesis, the alternative hypothesis, the P-value, and the conclusion of the test.
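The overall F-test and the individual t-tests can both be read from `summary()`. The sketch below demonstrates the calls on the wt/drat model from the script, fitted to the built-in `mtcars` data (assumed to match the CSV); the same calls should apply to the template's hp/qsec/drat model.

```r
# Assumption: the built-in mtcars data matches the mtcars.csv file used above
cars <- datasets::mtcars
model1 <- lm(mpg ~ wt + drat + wt:drat, data = cars)
s <- summary(model1)

# Overall F-test: H0 says all slope coefficients are zero
f <- s$fstatistic  # named vector: value, numdf, dendf
f_p_value <- pf(f[["value"]], f[["numdf"]], f[["dendf"]], lower.tail = FALSE)
print(f_p_value)

# Individual t-tests: H0 says the coefficient of that term is zero
coef_p <- s$coefficients[, "Pr(>|t|)"]
print(names(coef_p)[coef_p < 0.05])
```

A p-value below 0.05 leads to rejecting the corresponding null hypothesis at the 5% level. In the summary printed earlier, the overall model, drat, and wt:drat meet that threshold, while wt alone does not.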
Making Predictions Using the Model
Make predictions using the regression model. Address the following questions in your analysis:
- What is the predicted fuel economy for a car that has 175 horsepower, a quarter mile time of 14.2, and a rear axle ratio of 3.91?
- What is the 95% prediction interval for the fuel economy of this car? Interpret the interval.
- What is the 95% confidence interval for the fuel economy of this car? Interpret the interval.
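Note that the script in the question calls `predict()` with `level=0.90` for a car described by wt, drat, and am, while these items ask for 95% intervals on hp, qsec, and drat. A hedged sketch of the requested calls, using the built-in `mtcars` data as a stand-in for the CSV and illustrative object names, might be:

```r
# Assumption: the built-in mtcars data matches the mtcars.csv file used above
cars <- datasets::mtcars
model_t <- lm(mpg ~ hp + qsec + drat + hp:qsec + hp:drat, data = cars)

# The car described in the template
new_car <- data.frame(hp = 175, qsec = 14.2, drat = 3.91)

# 95% prediction interval (single new car) and confidence interval (mean response)
pred_int <- predict(model_t, new_car, interval = "prediction", level = 0.95)
conf_int <- predict(model_t, new_car, interval = "confidence", level = 0.95)
print(round(rbind(prediction = pred_int[1, ], confidence = conf_int[1, ]), 4))
```

Both rows share the same `fit` value; only the `lwr` and `upr` bounds differ.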
4. Model with Interaction Term and Qualitative Predictor
Reporting Results
Report the results of the regression model. Address the following questions in your analysis:
- Write the general form and the prediction equation of the regression model for fuel economy using horsepower, quarter mile time, interaction term for horsepower and quarter mile time, and number of cylinders. Note that number of cylinders is a qualitative predictor.
- Create the regression model for fuel economy using horsepower, quarter mile time, interaction term for horsepower and quarter mile time, and number of cylinders. Note that number of cylinders is a qualitative predictor. Write the prediction model equation using outputs obtained from your R script. Let us call this model 2.
- What are the values of R-squared and adjusted R-squared for the model? Provide your interpretation of these statistics.
- Obtain fitted values and residuals for the data set using model 2 and create the following plots:
- Residuals against Fitted Values
- Normal Q-Q plot
- Based on these plots, what can you say about the assumptions of homoscedasticity and normality of the residuals? Be detailed in your response.
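Because number of cylinders is qualitative, R expands it into dummy variables, and those dummies are what appear in the general form of model 2. The sketch below shows which terms result; it assumes the built-in `mtcars` data matches the CSV, and `model2_t` is an illustrative name for the template's model rather than the `model2` fitted in the script.

```r
# Assumption: the built-in mtcars data matches the mtcars.csv file used above
cars <- datasets::mtcars
cars$cyl <- factor(cars$cyl)  # treat number of cylinders as qualitative

# Model 2 as described in the template
model2_t <- lm(mpg ~ hp + qsec + hp:qsec + cyl, data = cars)

# The first factor level is the baseline; the others become dummy variables
print(levels(cars$cyl))
print(names(coef(model2_t)))
```

With three cylinder levels, the model carries two dummy terms (`cyl6` and `cyl8`), each shifting the intercept relative to the baseline level.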
Evaluating Model Significance
Evaluate model significance for the regression model. Address the following questions in your analysis:
- Is the model significant at a 5% level of significance? Carry out the overall F-test by identifying the null hypothesis, the alternative hypothesis, the P-value, and the conclusion of the test.
- Which terms in the model are significant at a 5% level of significance? Carry out individual beta tests by identifying the null hypothesis, the alternative hypothesis, the P-value, and the conclusion of the test.
Making Predictions Using the Model
Make predictions using the regression model. Address the following questions in your analysis:
- Using the second model, what is the predicted fuel economy for a car that has 175 horsepower, a quarter mile time of 14.2, and 6 cylinders? Note that the number of cylinders is a qualitative variable. Therefore, set it equal to '6' (using single quotes).
- What is the 95% prediction interval for the fuel economy of this car? Interpret the interval.
- What is the 95% confidence interval for the fuel economy of this car? Interpret the interval.
- Why are prediction intervals wider than confidence intervals?
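The last item can be illustrated numerically: a confidence interval reflects only the uncertainty in the estimated mean response, while a prediction interval also adds the residual variance of an individual car. The sketch below rebuilds both half-widths from the pieces `predict()` returns when `se.fit = TRUE`; it assumes the built-in `mtcars` data matches the CSV, and the object names are illustrative.

```r
# Assumption: the built-in mtcars data matches the mtcars.csv file used above
cars <- datasets::mtcars
cars$cyl <- factor(cars$cyl)
model2_t <- lm(mpg ~ hp + qsec + hp:qsec + cyl, data = cars)

# The car from the template; cyl is passed as a quoted level
new_car <- data.frame(hp = 175, qsec = 14.2, cyl = "6")
p <- predict(model2_t, new_car, interval = "prediction", level = 0.95, se.fit = TRUE)

# Rebuild both half-widths from their ingredients
t_crit <- qt(0.975, df = p$df)
conf_half <- unname(t_crit * p$se.fit)
pred_half <- unname(t_crit * sqrt(p$se.fit^2 + p$residual.scale^2))
print(c(confidence = conf_half, prediction = pred_half))
```

The extra `residual.scale^2` term under the square root is what makes the prediction interval wider, and it does not shrink as the sample size grows.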
5. Conclusion
Describe the results of the statistical analyses and address the following questions:
- Based on the analysis that you have performed here and assuming that the sample size is sufficiently large, which model would you recommend?
- Fully describe what these results mean for your scenario using proper descriptions of statistical terms and concepts.
- What is the practical importance of the analyses that were performed?
6. Citations