- Is at least one of the two variables (weight and horsepower) significant in the model? Run the overall F-test and provide your interpretation at the 5% level of significance. See Step 5 in the Python script. Include the following in your analysis:
  - Define the null and alternative hypotheses in mathematical terms and in words.
  - Report the level of significance.
  - Include the test statistic and the P-value. (Hint: F-statistic and Prob (F-statistic) in the output.)
  - Provide your conclusion and interpretation of the test. Should the null hypothesis be rejected? Why or why not?
- What is the slope coefficient for the weight variable? Is this coefficient significant at the 5% level of significance (alpha = 0.05)? (Hint: Check the P-value for weight in the Python output. Recall that this is the individual t-test for the beta parameter.) See Step 5 in the Python script.
- What is the slope coefficient for the horsepower variable? Is this coefficient significant at the 5% level of significance (alpha = 0.05)? (Hint: Check the P-value for horsepower in the Python output. Recall that this is the individual t-test for the beta parameter.) See Step 5 in the Python script.
- What is the purpose of performing individual t-tests after carrying out the overall F-test? What are the differences in the interpretation of the two tests?
- What is the coefficient of determination of your multiple regression model from Module Six? Provide appropriate interpretation of this statistic.
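
As a rough cross-check for the overall F-test and the coefficient of determination, the sketch below recomputes the adjusted R-squared, the F-statistic, and its P-value from three numbers printed in the Step 5 regression summary (R-squared 0.834, 30 observations, two predictors). The variable names are illustrative and not part of the course script. The P-value line relies on a closed form of the F distribution that holds only when the numerator degrees of freedom equal 2, as they do here, so it is not a general-purpose routine.

```python
# Cross-check of the overall F-test using only values printed in the
# Step 5 regression summary. All names below are illustrative.

n_obs = 30          # "No. Observations" in the summary
k = 2               # number of predictors (wt and hp)
r_squared = 0.834   # "R-squared" as printed (rounded to three decimals)
alpha = 0.05        # level of significance

df_model = k                 # numerator degrees of freedom
df_resid = n_obs - k - 1     # denominator degrees of freedom

# Adjusted R-squared penalizes R-squared for the number of predictors.
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / df_resid

# Overall F-statistic written in terms of R-squared.
f_stat = (r_squared / df_model) / ((1 - r_squared) / df_resid)

# Upper-tail probability of an F(2, df_resid) distribution.
# ASSUMPTION: this closed form is valid only because df_model == 2.
p_value = (1 + df_model * f_stat / df_resid) ** (-df_resid / 2)

# Reject H0 (all slope coefficients are zero) when the P-value is below alpha.
reject_null = p_value < alpha

print(f"Adjusted R-squared: {adj_r_squared:.3f}")
print(f"F-statistic: {f_stat:.2f} on ({df_model}, {df_resid}) degrees of freedom")
print(f"P-value: {p_value:.3g}")
print(f"Reject H0 at alpha = {alpha}: {reject_null}")
```

Because R-squared is rounded in the printout, the recomputed values should land close to, but not exactly on, the printed Adj. R-squared (0.822), F-statistic (67.76), and Prob (F-statistic) (2.99e-11). On a fitted statsmodels results object the same quantities are available as the `rsquared`, `rsquared_adj`, `fvalue`, and `f_pvalue` attributes.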
Step 3: Scatterplot of miles per gallon against horsepower

The block of code below will create a scatterplot of the variables "miles per gallon" (coded as mpg in the data set) and "horsepower" of the car (coded as hp). Click the block of code below and hit the Run button above. NOTE: If the plot is not created, click the code section and hit the Run button again.

In [4]:

    import matplotlib.pyplot as plt

    # create scatterplot of variables mpg against hp.
    plt.plot(cars_df["hp"], cars_df["mpg"], 'o', color='blue')

    # set a title for the plot, x-axis, and y-axis.
    plt.title('MPG against Horsepower')
    plt.xlabel('Horsepower')
    plt.ylabel('MPG')

    # show the plot.
    plt.show()

[Scatterplot: "MPG against Horsepower"; x-axis: Horsepower, y-axis: MPG]

Step 1: Generating cars dataset

This block of Python code will generate the sample data for you. You will not be generating the data set using the numpy module this week. Instead, the data set will be imported from a CSV file. To make the data unique to you, a random sample of size 30, without replacement, will be drawn from the data in the CSV file. The data set will be saved in a Python dataframe that will be used in later calculations. Click the block of code below and hit the Run button above.

In [1]:

    import pandas as pd
    from IPython.display import display, HTML

    # read data from mtcars.csv data set.
    cars_df_orig = pd.read_csv("https://s3-us-west-2.amazonaws.com/data-analytics.zybooks.com/mtcars.csv")

    # randomly pick 30 observations from the data set to make the data set unique to you.
    cars_df = cars_df_orig.sample(n=30, replace=False)

    # print only the first five observations in the dataset.
    print("Cars data frame (showing only the first five observations)\n")
    display(HTML(cars_df.head().to_html()))

Cars data frame (showing only the first five observations)

        Unnamed: 0            mpg  cyl   disp   hp  drat     wt   qsec  vs  am  gear  carb
    14  Cadillac Fleetwood   10.4    8  472.0  205  2.93  5.250  17.98   0   0     3     4
    1   Mazda RX4 Wag        21.0    6  160.0  110  3.90  2.875  17.02   0   1     4     4
    21  Dodge Challenger     15.5    8  318.0  150  2.76  3.520  16.87   0   0     3     2
    28  Ford Pantera L       15.8    8  351.0  264  4.22  3.170  14.50   0   1     5     4
    20  Toyota Corona        21.5    4  120.1   97  3.70  2.465  20.01   1   0     3     1

Step 4: Correlation matrix for miles per gallon, weight and horsepower

Now you will calculate the correlation coefficient between the variables "miles per gallon" and "weight". You will also calculate the correlation coefficient between the variables "miles per gallon" and "horsepower". The corr method of a dataframe returns the correlation matrix with the correlation coefficients between all variables in the dataframe. You will specify to only return the matrix for the three variables. Click the block of code below and hit the Run button above.

In [5]:

    # create correlation matrix for mpg, wt, and hp.
    # The correlation coefficient between mpg and wt is contained in the cell for mpg row and wt column (or wt row and mpg column).
    # The correlation coefficient between mpg and hp is contained in the cell for mpg row and hp column (or hp row and mpg column).
    mpg_wt_corr = cars_df[['mpg', 'wt', 'hp']].corr()
    print(mpg_wt_corr)

Output (values rounded to four decimal places):

             mpg       wt       hp
    mpg   1.0000  -0.8696  -0.7756
    wt   -0.8696   1.0000   0.6474
    hp   -0.7756   0.6474   1.0000

Step 5: Multiple regression model to predict miles per gallon using weight and horsepower

This block of code produces a multiple regression model with "miles per gallon" as the response variable, and "weight" and "horsepower" as predictor variables. The ols method in the statsmodels.formula.api submodule returns all statistics for this multiple regression model. Click the block of code below and hit the Run button above.

In [6]:

    from statsmodels.formula.api import ols

    # create the multiple regression model with mpg as the response variable; weight and horsepower as predictor variables.
    model = ols('mpg ~ wt+hp', data=cars_df).fit()
    print(model.summary())

                                OLS Regression Results
    ==============================================================================
    Dep. Variable:                    mpg   R-squared:                       0.834
    Model:                            OLS   Adj. R-squared:                  0.822
    Method:                 Least Squares   F-statistic:                     67.76
    Date:                Thu, 05 Aug 2021   Prob (F-statistic):           2.99e-11
    Time:                        00:29:21   Log-Likelihood:                -69.894
    No. Observations:                  30   AIC:                             145.8
    Df Residuals:                      27   BIC:                             150.0
    Df Model:                           2
    Covariance Type:            nonrobust
    ==============================================================================
                     coef    std err          t      P>|t|      [0.025      0.975]
    ------------------------------------------------------------------------------
    Intercept     37.7163      1.668     22.611      0.000      34.294      41.139
    wt            -3.9503      0.643     -6.146      0.000      -5.269      -2.631
    hp            -0.0326      0.009     -3.557      0.001      -0.051      -0.014
    ==============================================================================
    Omnibus:                        3.874   Durbin-Watson:                   1.266
    Prob(Omnibus):                  0.144   Jarque-Bera (JB):                2.797
    Skew:                           0.743   Prob(JB):                        0.247
    Kurtosis:                       3.173   Cond. No.                         598.
    ==============================================================================

    Warnings:
    [1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
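
The individual t-tests in the coefficient table above can be checked by hand: each t value is the coefficient divided by its standard error, and the 95% confidence interval is the coefficient plus or minus a critical t value times the standard error. The sketch below does this for wt and hp using the printed (rounded) coefficients and standard errors. The names are illustrative, and the critical value 2.052 is an assumed table value for a two-sided test at alpha = 0.05 with 27 residual degrees of freedom, hard-coded here to avoid extra dependencies.

```python
# Hand check of the individual t-tests from the Step 5 coefficient table.
# All names are illustrative; inputs are the rounded values from the printout.

# predictor -> (coefficient, standard error), as printed in the summary
coef_table = {
    "wt": (-3.9503, 0.643),
    "hp": (-0.0326, 0.009),
}

# ASSUMPTION: approximate two-sided 5% critical value of the t distribution
# with 27 degrees of freedom, taken from a standard t table.
t_crit = 2.052

results = {}
for name, (coef, std_err) in coef_table.items():
    t_stat = coef / std_err                    # test statistic for H0: beta = 0
    ci_low = coef - t_crit * std_err           # lower 95% confidence limit
    ci_high = coef + t_crit * std_err          # upper 95% confidence limit
    significant = abs(t_stat) > t_crit         # same decision as P-value < 0.05
    results[name] = {
        "t_stat": t_stat,
        "ci": (ci_low, ci_high),
        "significant": significant,
    }
    print(f"{name}: t = {t_stat:.3f}, "
          f"95% CI = ({ci_low:.3f}, {ci_high:.3f}), "
          f"significant at 5%: {significant}")
```

Since the printed standard errors are rounded, the recomputed t values differ slightly from the t column in the summary, most visibly for hp. A confidence interval that excludes zero leads to the same decision as a P-value below 0.05. On a fitted statsmodels results object, the unrounded quantities are available through `params`, `bse`, `tvalues`, `pvalues`, and `conf_int()`.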