Question
Open the Excel workbook. There are several tabs in this workbook. The first two tabs contain historical user-song ratings, randomly partitioned into training data and test data, respectively. The third tab contains a partially populated template for generating kNN predictions. The remaining tabs contain pairwise distance calculations between each test observation and each training data observation. For example, the tab named contains the pairwise distance calculations between test observation and every training data observation.
On the kNN predictions tab, you will find that three sets of predictions have been prepopulated for you. These include (i) a popularity-based predictor, i.e., the average rating that has been provided in the training data for the song ID in question (this is a common, intuitive approach, but it is also unsophisticated); (ii) a continuous k-Nearest Neighbor prediction (i.e., kNN regression); and (iii) a discrete k-Nearest Neighbor prediction (kNN classification). All three sets of predictions are provided for a set of test observations that were randomly drawn from the available rating data. The kNN predictions are based on the k nearest neighbors of each test observation, where k is the number of neighbors to consider and near versus far is defined in terms of Euclidean distance. As you modify k, you will see the kNN predictions change for each test observation. You will also see that the popularity-based predictions remain fixed.
In addition to the predictions, placeholders have been provided for you to capture performance error metrics for all three approaches, including the continuous popularity-based and kNN-based predictions (MAE, RMSE) and the discrete prediction (accuracy, error, and a confusion matrix) implementations.
As you adjust the value of k, you will see the predictions change, as well as the individual error values for each test observation.
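The three predictors described above can be sketched in Python. This is a minimal illustration, not the workbook's formulas: the feature vectors, the function names, and the tie-breaking rule for classification (the first label to reach the top count among the neighbors, taken nearest-first) are all assumptions.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train_X, train_y, query, k):
    """Return (regression, classification) predictions from the k nearest neighbors.

    train_X: hypothetical feature vectors for the training observations
    train_y: the ratings attached to those observations
    """
    # Rank training observations from nearest to farthest (stable sort).
    ranked = sorted(zip(train_X, train_y), key=lambda pair: euclidean(pair[0], query))
    neighbor_ratings = [rating for _, rating in ranked[:k]]
    # kNN regression: the mean rating of the k nearest neighbors.
    regression = sum(neighbor_ratings) / len(neighbor_ratings)
    # kNN classification: the most frequent rating among the k nearest neighbors.
    classification = Counter(neighbor_ratings).most_common(1)[0][0]
    return regression, classification

def popularity_predict(train_song_ids, train_ratings, song_id):
    """Popularity baseline: average training rating for the given song ID."""
    ratings = [r for s, r in zip(train_song_ids, train_ratings) if s == song_id]
    return sum(ratings) / len(ratings)
```

Because the popularity baseline never looks at k, its output stays fixed while the two kNN predictions move as k changes.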
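For reference, the two continuous error metrics named above follow their standard definitions: MAE is the mean of the absolute errors, and RMSE is the square root of the mean squared error. A small sketch, with hypothetical function and argument names:

```python
import math

def mae(actual, predicted):
    """Mean absolute error over paired actual and predicted ratings."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error over paired actual and predicted ratings."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))
```

RMSE penalizes large individual errors more heavily than MAE, so the two metrics can disagree about which setting is best.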
Question
Points
Vary the value of k from through. Based on the continuous kNN regression prediction error measures, what is the optimal number of nearest neighbors to employ?
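One way to approach this kind of search outside Excel is to compute the error for each candidate k and keep the smallest. The sketch below assumes precomputed distance rows (one per test observation, mirroring the workbook's distance tabs) and uses RMSE as the selection criterion. The function names and the choice of RMSE over MAE are assumptions, and the range of k is left to the caller because it is not stated here.

```python
import math

def knn_regress(distances, train_ratings, k):
    """Average the ratings of the k training observations with the smallest distance."""
    order = sorted(range(len(distances)), key=lambda i: distances[i])
    return sum(train_ratings[i] for i in order[:k]) / k

def rmse_for_k(distance_rows, train_ratings, test_ratings, k):
    """RMSE of kNN regression over all test observations for a given k."""
    preds = [knn_regress(row, train_ratings, k) for row in distance_rows]
    squared = sum((a - p) ** 2 for a, p in zip(test_ratings, preds))
    return math.sqrt(squared / len(test_ratings))

def best_k(distance_rows, train_ratings, test_ratings, k_values):
    """Return (k with lowest RMSE, dict of RMSE per k); ties go to the first k tried."""
    scores = {k: rmse_for_k(distance_rows, train_ratings, test_ratings, k)
              for k in k_values}
    return min(scores, key=scores.get), scores
```

The same loop works with MAE in place of RMSE. If the two metrics pick different values of k, that disagreement is worth reporting rather than hiding.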
Question
Points
Based on the confusion matrix you observe when k calculate the prediction accuracy of the kNN classifier in cell B. Hint: overall accuracy is the proportion of the predictions that were correct, i.e., on the diagonal of the confusion matrix.
We cannot answer this question without more information.
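Although the workbook's actual counts are not available here, the calculation in the hint can be sketched generically: sum the diagonal of the confusion matrix and divide by the total number of predictions. The function names and the row/column convention (rows for actual labels, columns for predicted labels) are assumptions.

```python
def confusion_matrix(actual, predicted, labels):
    """Build a square matrix: rows index actual labels, columns index predicted labels."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix

def overall_accuracy(matrix):
    """Proportion of predictions on the diagonal, i.e. the correct ones."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total
```

The corresponding error rate is one minus the accuracy.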