Question
Part 5: Training Score
Keep getting the wrong values for Pred Y and Error.
Regression models are often scored using the r-squared metric, denoted by $r^2$. This score is a number between 0 and 1, and represents the proportion of the variance in the response variable that has been explained by the model. For example, if $r^2 = 0.9$ for a model, then the model explains 90% of the variability in the values of the response variable, whereas if $r^2 = 0.05$, then the model explains only 5% of the variability in the response.

In this part and the next, we will calculate the model's r-squared score on both the training set and the test set. We will be primarily interested in the test score, but the training score will be useful for the sake of comparison.

To calculate the r-squared score on a particular set, we must first use the model to find estimated values for that set, and then use the observed values to calculate the amount of error involved in each prediction. We will now walk through the steps involved in this process.

Let $x_1, x_2, \ldots, x_n$ denote the predictor values in either the training or test set. For each predictor value, we can calculate an estimated response value, denoted by $\hat{y}_i$, using the formula below.

$$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$$

Let $y_1, y_2, \ldots, y_n$ denote the true response values for the same set, and let $\hat{e}_1, \hat{e}_2, \ldots, \hat{e}_n$ denote the errors involved in each of our predictions. These error values are also called residuals. We can calculate each of the residuals as follows:

$$\hat{e}_i = y_i - \hat{y}_i$$

We will now calculate the estimated response values.

- Create a markdown cell with a level 2 header that reads "Part 5: Training Score". In the same cell, add unformatted text explaining that in this section, we will be calculating the training r-squared score, and that we will start by calculating estimated response values for the training set.
- Use the variables beta_0 and beta_1 as well as the list x_train to calculate the estimated response values for the training set. Store the results in a list named pred_y_train. You can accomplish this task using either a loop or a list comprehension.
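A minimal sketch of the predicted-value step, assuming a simple linear model with one predictor. The numbers assigned to beta_0, beta_1, and x_train here are hypothetical stand-ins; in the actual notebook they come from earlier parts. If the Pred y column looks wrong, one thing worth checking is that the intercept and slope have not been swapped.

```python
# Hypothetical stand-ins for values computed in earlier parts of the notebook.
beta_0 = 1.5               # intercept (assumed value, for illustration only)
beta_1 = 0.5               # slope (assumed value, for illustration only)
x_train = [2.0, 4.0, 6.0]  # assumed predictor values

# Estimated response: intercept plus slope times the predictor value.
pred_y_train = [beta_0 + beta_1 * x for x in x_train]
```

The cell assigns the list without printing it, which matches the requirement that this step produce no output.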
The cell that calculates pred_y_train should not produce any output.

- Create a markdown cell explaining that we will now calculate the residuals for the training set.
- Use the values stored in y_train and pred_y_train to calculate the residuals for the training set. Store the results in a list named error_y_train. You can accomplish this task using either a loop or a list comprehension. This cell should not produce any output.

Before continuing with our calculation of the r-squared score, we will display the true y values, the predicted values, and the residual for each of the first 10 observations in the training set.

- Create a markdown cell explaining that we will be displaying the values mentioned above.
- Use a code cell to print the first 10 values of each of the lists y_train, pred_y_train, and error_y_train. Format the output as follows:
  - The values from each list should be arranged in columns, with each row of output corresponding to a single training observation.
  - The output should include column headers and a dividing line, as shown below.
  - The number of characters reserved for each column, in order, should be 6, 10, and 10.
  - The columns should be right aligned.
  - The values displayed should all be rounded to 4 decimal places.

The first few rows of your output should look exactly as shown below:

True y    Pred y     Error
--------------------------
3.3032    3.0289    0.2743
3.4904    3.4538    0.0366

Before calculating the r-squared score, we must calculate one more intermediate value, named the sum of squared errors and denoted by SSE. The formula for a model's SSE on a given set is provided below.

$$SSE = \sum_{i=1}^{n} \hat{e}_i^2 = \hat{e}_1^2 + \hat{e}_2^2 + \cdots + \hat{e}_n^2$$

- Create a markdown cell explaining that we will now calculate the sum of squared errors score for the training set.
- Use the values stored in the list error_y_train to calculate the training sum of squared errors score, storing the result in a variable named sse_train.
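The residual, table, and SSE steps can be sketched as follows. The two data rows are the rounded true and predicted values from the expected output, used here only to check the arithmetic and the layout; the real lists come from earlier cells. The row of hyphens for the dividing line is an assumption, since the exact character is not recoverable from the question. Note that the residual is the true value minus the predicted value, so reversing the order flips the sign of every entry in the Error column.

```python
# Rounded sample values taken from the expected output (illustration only).
y_train = [3.3032, 3.4904]
pred_y_train = [3.0289, 3.4538]

# Residual = true value minus predicted value (the order sets the sign).
error_y_train = [y - p for y, p in zip(y_train, pred_y_train)]

# Column widths 6, 10, 10; right-aligned; 4 decimal places.
lines = [f"{'True y':>6}{'Pred y':>10}{'Error':>10}", "-" * 26]
for y, p, e in zip(y_train[:10], pred_y_train[:10], error_y_train[:10]):
    lines.append(f"{y:>6.4f}{p:>10.4f}{e:>10.4f}")
print("\n".join(lines))

# Sum of squared errors over these sample rows (not the full training set).
sse_train = sum(e ** 2 for e in error_y_train)
print(f"Training SSE - {sse_train:.4f}")
```

With these two rows the Error column reproduces 0.2743 and 0.0366, which suggests the expected output was generated as true minus predicted.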
- Display the result with text output as shown below, rounding the displayed value to 4 decimal places.

Training SSE - XXXX

We are now ready to calculate the r-squared score for the training set. The formula for this metric is provided below:

$$r^2 = 1 - \frac{SSE}{S_{yy}}$$

- Create a markdown cell explaining that we will now calculate the r-squared score for the training set.
- Calculate the training r-squared score, storing the result in a variable named r2_train. Display the result with text output as shown below, rounding the displayed value to 4 decimal places.

Training r-squared - XXXX
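A short sketch of the r-squared step on made-up data. The lists below are hypothetical, and the definition of Syy as the sum of squared deviations of the true y values from their mean is an assumption, since this part of the assignment does not define it; if an earlier part already stored that quantity in a variable, that variable should be used instead.

```python
# Hypothetical training data; in the notebook these lists come from earlier parts.
y_train = [1.0, 2.0, 3.0, 4.0]
pred_y_train = [1.5, 1.5, 3.5, 3.5]

# Residuals and their sum of squares.
error_y_train = [y - p for y, p in zip(y_train, pred_y_train)]
sse_train = sum(e ** 2 for e in error_y_train)

# Assumption: Syy is the total sum of squares of y about its mean.
mean_y = sum(y_train) / len(y_train)
syy = sum((y - mean_y) ** 2 for y in y_train)

# r-squared = 1 - SSE / Syy
r2_train = 1 - sse_train / syy
print(f"Training r-squared - {r2_train:.4f}")
```

Here SSE is 1.0 and Syy is 5.0, so the sketch gives an r-squared of 0.8.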