Question:

Reconsider Figure 3.5, which shows the application of the KNN algorithm with k = 3 to the new sales lead (the predictor record) for the case study, using standardized values of the predictor variables. The prediction is that the new sales lead is not likely to have solar installed. However, the estimated probability that it actually will have solar installed is a relatively high 1/3, since one of its three nearest neighbors (among all the historical records) did have solar installed. It is well understood for this application that substantially decreasing the value of one of these predictor variables for the new sales lead should tend to substantially decrease this probability, while substantially increasing that predictor variable should have the opposite effect. This problem explores this further.

a. Without changing the value of household income for the new sales lead, use visual inspection to determine how its estimated probability changes (if at all) over the following standardized values of peak sun hours: −1.5, −1.0, −0.5, 0, +0.5. How much did these major changes in peak sun hours change the estimated probability?

b. Without changing the value of peak sun hours for the new sales lead, use visual inspection to determine how its estimated probability changes (if at all) over the following standardized values of household income: −1.0, −0.5, 0, +0.5, +1.0. How much did these major changes in household income change the estimated probability?

c. Comment on how the randomness of the outcome for individual historical records can lead to the results in parts a and b. 

d. Comment on how greatly increasing the number of historical records and increasing the value of k a little should substantially improve the accuracy of the estimated probability that the new sales lead will have solar installed, including showing more accurately how these probabilities change as the predictor variables change.  
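
For readers who want to check the visual inspection numerically, the following is a minimal Python sketch of how a KNN probability estimate of this kind could be computed: the estimate is the fraction of the k nearest historical records whose outcome was "solar installed". The function name `knn_probability` and the records below are hypothetical and made up for illustration; they are not the data plotted in Figure 3.5, so the printed values will not match the figure. Euclidean distance on standardized predictors is also an assumption.

```python
import math

def knn_probability(records, query, k=3):
    """Estimate P(outcome = 1) as the share of 1-outcomes among the k nearest records.

    records: list of ((peak_sun_hours, household_income), outcome) pairs,
             with predictors already standardized and outcome in {0, 1}.
    query:   (peak_sun_hours, household_income) of the new record, standardized.
    """
    # Rank historical records by Euclidean distance to the query and keep the k closest.
    nearest = sorted(records, key=lambda r: math.dist(r[0], query))[:k]
    return sum(outcome for _, outcome in nearest) / k

# Hypothetical standardized records, NOT the data from Figure 3.5.
records = [
    ((-1.2, -0.8), 0),
    ((-0.6, 0.3), 0),
    ((-0.2, -0.4), 1),
    ((0.1, 0.2), 0),
    ((0.6, 0.9), 1),
    ((1.1, 0.4), 1),
    ((1.4, 1.3), 1),
]

# Sweep peak sun hours as in part a, holding household income fixed (assumed at 0 here).
income = 0.0
for sun in (-1.5, -1.0, -0.5, 0.0, 0.5):
    p = knn_probability(records, (sun, income), k=3)
    print(f"peak sun hours = {sun:+.1f}: estimated probability = {p:.2f}")
```

With k = 3 the estimate can only take the values 0, 1/3, 2/3, or 1, so it moves in coarse steps and depends entirely on the outcomes of three individual records. This is relevant when thinking about parts c and d.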


Data from Figure 3.5.

The shaded region shows where the KNN algorithm would predict that a new sales lead (the predictor record) is likely to install solar when k = 3 for the case study. 
