
Question


Posted on Jul 05, 2024


Can the following questions be addressed, please?

1. You developed a scatterplot of miles per gallon against weight; check to make sure it was included in your attachment. Does the graph show any trend? If yes, is the trend what you expected? Why or why not? See Step 2 in the Python script.

2. What is the coefficient of correlation between miles per gallon and weight? What is the sign of the correlation coefficient? Does the coefficient of correlation indicate a strong correlation, weak correlation, or no correlation between the two variables? How do you know? See Step 3 in the Python script.

3. Write the simple linear regression equation for miles per gallon as the response variable and weight as the predictor variable. How might the car rental company use this model? See Step 4 in the Python script.

4. What is the slope coefficient? Is this coefficient significant at a 5% level of significance (alpha=0.05)? (Hint: Check the P-value for weight in the Python output.) See Step 4 in the Python script.

Step 1: Generating cars dataset

This block of Python code generates the sample data for you. This week you will not generate the dataset with the numpy module. Instead, the dataset is imported from a CSV file. To make the data unique to you, a random sample of size 30, without replacement, is drawn from the data in the CSV file. The sample is saved into a Python dataframe, which you will use in later calculations.

Click the block of code below and hit the Run button above.

In[1]:

import pandas as pd

from IPython.display import display, HTML

# read data from mtcars.csv data set.

cars_df_orig = pd.read_csv("https://s3-us-west-2.amazonaws.com/data-analytics.zybooks.com/mtcars.csv")

# randomly pick 30 observations without replacement from mtcars dataset to make the data unique to you.

cars_df = cars_df_orig.sample(n=30, replace=False)

# print only the first five observations in the data set.

print("\nCars data frame (showing only the first five observations)")

display(HTML(cars_df.head().to_html()))

Cars data frame (showing only the first five observations)

    Unnamed: 0           mpg   cyl  disp   hp   drat  wt     qsec   vs  am  gear  carb
14  Cadillac Fleetwood   10.4  8    472.0  205  2.93  5.250  17.98  0   0   3     4
13  Merc 450SLC          15.2  8    275.8  180  3.07  3.780  18.00  0   0   3     3
3   Hornet 4 Drive       21.4  6    258.0  110  3.08  3.215  19.44  1   0   3     1
20  Toyota Corona        21.5  4    120.1  97   3.70  2.465  20.01  1   0   3     1
9   Merc 280             19.2  6    167.6  123  3.92  3.440  18.30  1   0   4     4
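Because sample draws a different random subset on every run, the correlation and regression numbers in the later steps change each time Step 1 is re-run. A minimal sketch of one way to keep your own draw repeatable, assuming that is what you want: pass the random_state argument to the dataframe's sample method. The small demo_df frame (built from the five rows shown above) and the seed value 42 are illustrative assumptions, not part of the original script.

```python
import pandas as pd

# Hypothetical stand-in for cars_df_orig so the sketch runs without the CSV.
demo_df = pd.DataFrame({
    "mpg": [10.4, 15.2, 21.4, 21.5, 19.2],
    "wt": [5.250, 3.780, 3.215, 2.465, 3.440],
})

# The same seed selects the same rows on every run (42 is an arbitrary choice).
first_draw = demo_df.sample(n=3, replace=False, random_state=42)
second_draw = demo_df.sample(n=3, replace=False, random_state=42)

# Sampling without replacement never repeats a row within one draw.
print(first_draw.equals(second_draw))
print(first_draw.index.is_unique)
```

In the notebook itself the same idea would be cars_df_orig.sample(n=30, replace=False, random_state=...) with a seed of your choosing.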

Step 2: Scatterplot of miles per gallon against weight

The block of code below will develop a scatterplot of miles per gallon (coded as mpg in the data set) and weight of the car (coded as wt).

Click the block of code below and hit the Run button above.

NOTE: If the plot is not created, click the code section and hit the Run button again.

In[3]:

import matplotlib.pyplot as plt

# create scatterplot of variables mpg against wt.

plt.plot(cars_df["wt"], cars_df["mpg"], 'o', color='red')

# set a title for the plot, x-axis, and y-axis.

plt.title('MPG against Weight')

plt.xlabel('Weight (1000s lbs)')

plt.ylabel('MPG')

# show the plot.

plt.show()
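One way to back up the visual impression from the scatterplot is to check the sign of the least-squares slope. The sketch below is only an illustration: it uses the five (wt, mpg) pairs shown in the Step 1 preview, not the full 30-row sample, so its slope will differ from the Step 4 output. The helper name least_squares_slope is hypothetical.

```python
# Five (wt, mpg) pairs copied from the Step 1 preview; not the full sample.
weights = [5.250, 3.780, 3.215, 2.465, 3.440]
mpgs = [10.4, 15.2, 21.4, 21.5, 19.2]


def least_squares_slope(x, y):
    """Return the least-squares slope of y on x, computed as Sxy / Sxx."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    return s_xy / s_xx


slope = least_squares_slope(weights, mpgs)

# A negative slope means mpg tends to fall as weight rises.
trend = "downward" if slope < 0 else "upward or flat"
print(f"slope = {slope:.3f} mpg per 1000 lbs -> {trend} trend")
```

A downward trend is what one would expect here, since heavier cars generally need more fuel to travel the same distance.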

Step 3: Correlation coefficient for miles per gallon and weight

Now you will calculate the correlation coefficient between the miles per gallon and weight variables. The corr method of a dataframe returns the correlation matrix with correlation coefficients between all variables in the dataframe. You will specify to only return the matrix for the variables "miles per gallon" and "weight".

Click the block of code below and hit the Run button above.

In[4]:

# create correlation matrix for mpg and wt.

# the correlation coefficient between mpg and wt is contained in the cell for mpg row and wt column (or wt row and mpg column)

mpg_wt_corr = cars_df[['mpg','wt']].corr()

print(mpg_wt_corr)

          mpg        wt
mpg  1.000000 -0.865622
wt  -0.865622  1.000000
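To read the matrix above, look at the off-diagonal value: its sign gives the direction of the relationship and its magnitude gives the strength. The sketch below takes the reported value and labels it. The cutoffs 0.7 and 0.3 are a common rule of thumb and are an assumption here, not something the assignment specifies; describe_strength is a hypothetical helper.

```python
# Correlation coefficient reported in the Step 3 output.
r = -0.865622

# In a one-predictor linear model, r squared is the share of the variance
# in mpg that is explained by wt.
r_squared = r ** 2


def describe_strength(r_value, strong=0.7, weak=0.3):
    """Label |r| using rule-of-thumb cutoffs (assumed, not from the source)."""
    magnitude = abs(r_value)
    if magnitude >= strong:
        return "strong"
    if magnitude >= weak:
        return "moderate"
    return "weak or none"


if r < 0:
    direction = "negative"
elif r > 0:
    direction = "positive"
else:
    direction = "none"

print(f"r = {r:.3f} ({describe_strength(r)}, {direction}); r^2 = {r_squared:.3f}")
```

The squared value rounds to 0.749, which agrees with the R-squared reported for the regression in Step 4.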

Step 4: Simple linear regression model to predict miles per gallon using weight

The block of code below produces a simple linear regression model using "miles per gallon" as the response variable and "weight" (of the car) as a predictor variable. The ols method in the statsmodels.formula.api submodule returns all statistics for this simple linear regression model.

Click the block of code below and hit the Run button above.

In[5]:

from statsmodels.formula.api import ols

# create the simple linear regression model with mpg as the response variable and weight as the predictor variable

model = ols('mpg ~ wt', data=cars_df).fit()

# print the model summary

print(model.summary())

                            OLS Regression Results
==============================================================================
Dep. Variable:                    mpg   R-squared:                       0.749
Model:                            OLS   Adj. R-squared:                  0.740
Method:                 Least Squares   F-statistic:                     83.69
Date:                Wed, 02 Jun 2021   Prob (F-statistic):           6.62e-10
Time:                        01:22:43   Log-Likelihood:                -73.298
No. Observations:                  30   AIC:                             150.6
Df Residuals:                      28   BIC:                             153.4
Df Model:                           1
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept     35.9998      1.860     19.351      0.000      32.189      39.811
wt            -5.0134      0.548     -9.148      0.000      -6.136      -3.891
==============================================================================
Omnibus:                        2.902   Durbin-Watson:                   2.477
Prob(Omnibus):                  0.234   Jarque-Bera (JB):                2.013
Skew:                           0.633   Prob(JB):                        0.366
Kurtosis:                       3.095   Cond. No.                         13.0
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
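The summary above contains everything needed for questions 3 and 4. As a rough sketch, the fitted equation can be written as a small function and the slope's significance checked from the reported numbers. The constants are copied from the summary; the function name predict_mpg and the 3000 lb example weight are hypothetical choices for illustration.

```python
# Coefficients, standard error, and interval copied from the Step 4 summary.
INTERCEPT = 35.9998
SLOPE = -5.0134            # change in mpg per additional 1000 lbs
SLOPE_STD_ERR = 0.548
SLOPE_CI_95 = (-6.136, -3.891)
ALPHA = 0.05


def predict_mpg(weight_1000_lbs):
    """Predicted mpg from the fitted line: mpg = 35.9998 - 5.0134 * wt."""
    return INTERCEPT + SLOPE * weight_1000_lbs


# The t statistic for the slope is coef / std err; it matches the reported
# -9.148 up to rounding of the inputs.
t_stat = SLOPE / SLOPE_STD_ERR

# The slope is significant at alpha = 0.05 when its 95% confidence interval
# excludes zero, which is equivalent to a two-sided p-value below 0.05.
slope_is_significant = not (SLOPE_CI_95[0] <= 0 <= SLOPE_CI_95[1])

example_weight = 3.0  # hypothetical car weighing 3000 lbs
print(f"predicted mpg at wt={example_weight}: {predict_mpg(example_weight):.2f}")
print(f"t = {t_stat:.3f}, significant at alpha={ALPHA}: {slope_is_significant}")
```

A car rental company could use a function like this to estimate the fuel economy of a vehicle from its weight, for example when comparing candidate models for its fleet. Predictions are most trustworthy for weights inside the range covered by the sample.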
