Question
In this discussion, you will apply the statistical concepts and techniques covered in this week's reading about multiple regression. Last week's discussion involved a car rental company that wanted to evaluate the premise that heavier cars are less fuel efficient than lighter cars. The company expected fuel efficiency (miles per gallon) and weight of the car (often measured in thousands of pounds) to be correlated. The company also expects cars with higher horsepower to be less fuel efficient than cars with lower horsepower. They would like you to consider this new variable in your analysis.

In this discussion, you will work with a cars data set that includes the three variables used in this discussion:

- Miles per gallon (coded as mpg in the data set)
- Weight of the car (coded as wt in the data set)
- Horsepower (coded as hp in the data set)

The random sample will be drawn from a CSV file. This data will be unique to you, and therefore your answers will be unique as well. Run Step 1 in the Python script to generate your unique sample data.

In your initial post, address the following items:

1. Check to be sure your scatterplots of miles per gallon against horsepower and weight of the car were included in your attachment. Do the plots show any trend? If yes, is the trend what you expected? Why or why not? See Steps 2 and 3 in the Python script.
2. What are the coefficients of correlation between miles per gallon and horsepower? Between miles per gallon and the weight of the car? What are the directions and strengths of these coefficients? Do the coefficients of correlation indicate a strong correlation, weak correlation, or no correlation between these variables? See Step 4 in the Python script.
3. Write the multiple regression equation for miles per gallon as the response variable. Use weight and horsepower as predictor variables. See Step 5 in the Python script.
How might the car rental company use this model?

This block of Python code will generate the sample data for you. You will not be generating the data set using the numpy module this week. Instead, the data set will be imported from a CSV file. To make the data unique to you, a random sample of size 30, without replacement, will be drawn from the data in the CSV file. The data set will be saved in a Python dataframe that will be used in later calculations. Click the block of code below and hit the Run button above.

    import pandas as pd
    from IPython.display import display, HTML

    # read data from mtcars.csv data set.
    cars_df_orig = pd.read_csv("https://s3-us-west-2.amazonaws.com/data-analytics.zybooks.com/mtcars.csv")

    # randomly pick 30 observations from the data set to make the data set unique to you.
    cars_df = cars_df_orig.sample(n=30, replace=False)

    # print only the first five observations in the dataset.
    print("Cars data frame (showing only the first five observations)\n")
    display(HTML(cars_df.head().to_html()))

Output (cells marked ? are illegible in the source):

    Cars data frame (showing only the first five observations)

    Unnamed: 0          mpg   cyl  disp   hp   drat  wt     qsec   vs  am  gear  carb
    Merc 230            22.8  4    140.8  95   3.92  3.150  22.90  1   0   4     2
    Datsun 710          22.8  4    108.0  93   3.85  2.320  18.61  1   1   ?     ?
    Mazda RX4           21.0  6    160.0  110  3.90  2.620  16.46  0   1   4     1
    Mazda RX4 Wag       21.0  6    160.0  110  3.90  2.875  17.02  0   1   ?     ?
    Chrysler Imperial   14.7  8    440.0  230  3.23  5.345  17.42  0   0   3     4

The block of code below will create a scatterplot of the variables "miles per gallon" (coded as mpg in the data set) and "weight" of the car (coded as wt). Click the block of code below and hit the Run button above. NOTE: If the plot is not created, click the code section and hit the Run button again.

    import matplotlib.pyplot as plt

    # create scatterplot of variables mpg against wt.
    plt.plot(cars_df["wt"], cars_df["mpg"], 'o', color='red')

    # set a title for the plot, x-axis, and y-axis.
    plt.title('MPG against Weight')
    plt.xlabel('Weight (1000s lbs)')
    plt.ylabel('MPG')

    # show the plot.
    plt.show()

[Figure: scatterplot titled "MPG against Weight"; x-axis: Weight (1000s lbs), y-axis: MPG]

The block of code below will create a scatterplot of the variables "miles per gallon" (coded as mpg in the data set) and "horsepower" of the car (coded as hp). Click the block of code below and hit the Run button above. NOTE: If the plot is not created, click the code section and hit the Run button again.

    import matplotlib.pyplot as plt

    # create scatterplot of variables mpg against hp.
    plt.plot(cars_df["hp"], cars_df["mpg"], 'o', color='blue')

    # set a title for the plot, x-axis, and y-axis.
    plt.title('MPG against Horsepower')
    plt.xlabel('Horsepower')
    plt.ylabel('MPG')

    # show the plot.
    plt.show()

[Figure: scatterplot titled "MPG against Horsepower"; x-axis: Horsepower, y-axis: MPG]

Now you will calculate the correlation coefficient between the variables "miles per gallon" and "weight". You will also calculate the correlation coefficient between the variables "miles per gallon" and "horsepower". The corr method of a dataframe returns the correlation matrix with the correlation coefficients between all variables in the dataframe. You will specify to only return the matrix for the three variables. Click the block of code below and hit the Run button above.

    # create correlation matrix for mpg, wt, and hp.
    # The correlation coefficient between mpg and wt is contained in the cell for mpg row and wt column (or wt row and mpg column).
    # The correlation coefficient between mpg and hp is contained in the cell for mpg row and hp column (or hp row and mpg column).
    mpg_wt_corr = cars_df[['mpg', 'wt', 'hp']].corr()
    print(mpg_wt_corr)

Output:

              mpg        wt        hp
    mpg  1.000000 -0.866756 -0.772352
    wt  -0.866756  1.000000  0.641831
    hp  -0.772352  0.641831  1.000000

This block of code produces a multiple regression model with "miles per gallon" as the response variable, and "weight" and "horsepower" as predictor variables. The ols method in the statsmodels.formula.api submodule returns all statistics for this multiple regression model. Click the block of code below and hit the Run button above.

    from statsmodels.formula.api import ols

    # create the multiple regression model with mpg as the response variable; weight and horsepower as predictor variables.
    model = ols('mpg ~ wt+hp', data=cars_df).fit()
    print(model.summary())

Output:

    OLS Regression Results

    Dep. Variable:     mpg                R-squared:            0.831
    Model:             OLS                Adj. R-squared:       0.818
    Method:            Least Squares      F-statistic:          66.21
    Date:              Mon, 07 Feb 2022   Prob (F-statistic):   3.88e-11
    Time:              14:34:02           Log-Likelihood:       -67.378
    No. Observations:  30                 AIC:                  140.8
    Df Residuals:      27                 BIC:                  145.0
    Df Model:          2
    Covariance Type:   nonrobust

                   coef   std err        t    P>|t|   [0.025   0.975]
    Intercept   36.0368     1.558   23.131    0.000   32.840   39.234
    wt          -3.6478     0.597   -6.109    0.000   -4.873   -2.423
    hp          -0.0302     0.008   -3.557    0.001   -0.048   -0.013

    Omnibus:         5.966    Durbin-Watson:      1.889
    Prob(Omnibus):   0.051    Jarque-Bera (JB):   4.340
    Skew:            0.876    Prob(JB):           0.114
    Kurtosis:        3.632    Cond. No.           608.

    Warnings:
    [1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
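As an illustration of how the coef column of the summary turns into a regression equation, the fitted model for this particular sample reads mpg = 36.0368 - 3.6478*wt - 0.0302*hp. The sketch below is not part of the course script: the function name predict_mpg and the example car (3,000 lbs, 110 hp) are assumptions made up for the example, and the coefficients are copied from the output shown here, so a different random sample would give different values.

```python
# Coefficients copied from the OLS summary in this post.
# Assumption: they apply only to this random sample of 30 cars.
INTERCEPT = 36.0368
COEF_WT = -3.6478   # change in mpg per additional 1000 lbs, holding hp fixed
COEF_HP = -0.0302   # change in mpg per additional horsepower, holding wt fixed


def predict_mpg(wt, hp):
    """Hypothetical helper: predicted mpg for weight wt (1000s of lbs) and horsepower hp."""
    return INTERCEPT + COEF_WT * wt + COEF_HP * hp


# Example car (made-up values): 3,000 lbs and 110 horsepower.
example_wt, example_hp = 3.0, 110
print(f"Predicted mpg: {predict_mpg(example_wt, example_hp):.2f}")
```

Both slope coefficients are negative, which matches the negative correlations of mpg with wt and hp. A rental company could use a function like this to estimate the fuel efficiency of a car it is considering adding to its fleet, keeping predictions within the range of weights and horsepower seen in the sample.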