Question


Posted on Sep 06, 2024


In this exercise, you are to implement only one of two possible classifiers (your choice). Note that you are not to use modules which provide these functions, since that would be too easy (no sklearn.naive_bayes, for example); create them yourselves instead. Students who implement both classifiers will be given bonus credit. Your implementation will be evaluated on classifier functionality (whether you correctly implement kNN or Naive Bayes for the dataset) rather than efficiency.

The data set to use is the digit recognition data set available from the sklearn module; the demonstration linked here should provide some guidance. You are expected to use Jupyter notebooks and Python on this assignment, but can ask for exceptions. Your goal is to use the first half of the data set to train your model and the last half for prediction.

a) k-Nearest Neighbors

Each digit is an 8x8 pixel patch, which when reshaped is a 64-length vector.

Distance metric: the simplest distance metric for k-Nearest Neighbors is the sum of squared differences between pixel values.

For each example in the test set, calculate the distance to every example in the training set. Identify the closest k neighbors (a good use of numpy.argsort). Pick the class which most of the neighbors belong to. Break ties in any way you wish. Compare the true class of each member of the test set to the class predicted by k-Nearest Neighbors. Report the accuracy.

1. Now, change the value of k. Create a table with the accuracy for k=1, k=3, k=5, k=100, and k=500.
2. Show a classification matrix for each value of k. You can use sklearn.metrics.confusion_matrix for this.
3. Note which errors are more common. In what way does that match your intuitions?

b) Gaussian Naive Bayes

x represents the image vector (x1, x2, x3, ..., x64).
ck represents class k, that is, one of the 10 digits for recognition.

Recall, we're looking for the highest p(ck|x) by using this fact:

p(ck|x) = p(x|ck) p(ck) / p(x)

Let's step through the parts:

p(ck) is simply the proportion of that class in the training data. E.g. if there are 20 fives out of 200 digits in the training sample, p(five) = 20/200 = 0.1.

p(x|ck) is more complicated. The main assumption of naive Bayes is that the features should be treated independently (which is why it's naive). This means

p(x|ck) = p(x1|ck) * p(x2|ck) * ... * p(x64|ck)

For each class, k, in the training data: calculate the mean and variance of each pixel location for that class, then use those and the formula for a Gaussian probability to calculate p(xi|ck).

p(x) is the normalization term. You don't need to calculate this, since you just want to pick the largest p(ck|x), and p(x) is the same denominator in calculating p(ck|x) for every class. However, if you want p(ck|x) to provide a true estimate of the probability, you can use the following formula to calculate p(x):

p(x) = Σ_k p(x, ck) = Σ_k p(x|ck) p(ck)

The predicted class is the one with the largest p(ck|x) for each image.

1. Report the overall accuracy of your prediction.
2. Show the classification matrix.
3. Note which errors are more common. In what way does that match your intuitions?
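The Gaussian Naive Bayes steps above might be sketched in NumPy roughly as follows. This is only an illustration under some assumptions that are not in the exercise text: the function names are made up, the per-pixel product is replaced by a sum of log-probabilities to avoid numerical underflow over 64 factors, and a small `var_floor` is added to every variance because some pixels may be constant within a class (which would otherwise give a zero variance).

```python
import numpy as np

def fit_gaussian_nb(X, y, var_floor=1e-3):
    """Estimate p(ck), per-pixel means and per-pixel variances for each class.

    var_floor is an assumption, not part of the exercise: it keeps the
    variance strictly positive for pixels that never change within a class.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    # p(ck): proportion of each class in the training data
    priors = np.array([np.mean(y == c) for c in classes])
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    variances = np.array([X[y == c].var(axis=0) for c in classes]) + var_floor
    return classes, priors, means, variances

def log_joint(X, priors, means, variances):
    """log p(x, ck) = log p(ck) + sum_i log p(xi|ck); shape (n_samples, n_classes)."""
    X = np.asarray(X, dtype=float)
    diff = X[:, None, :] - means[None, :, :]
    # Log of the Gaussian density, evaluated per pixel and per class
    log_pdf = -0.5 * (np.log(2.0 * np.pi * variances)[None, :, :]
                      + diff ** 2 / variances[None, :, :])
    return np.log(priors)[None, :] + log_pdf.sum(axis=2)

def predict(X, classes, priors, means, variances):
    """Pick the class with the largest p(ck|x); p(x) is not needed for the argmax."""
    return classes[np.argmax(log_joint(X, priors, means, variances), axis=1)]

def posterior(X, priors, means, variances):
    """p(ck|x), normalised by p(x) = sum_k p(x, ck), computed stably in log space."""
    lj = log_joint(X, priors, means, variances)
    lj = lj - lj.max(axis=1, keepdims=True)  # shift before exponentiating
    p = np.exp(lj)
    return p / p.sum(axis=1, keepdims=True)
```

With the digits data, one would presumably call `fit_gaussian_nb` on the first half of `sklearn.datasets.load_digits().data` and `.target`, then `predict` on the last half and compare against the true labels. The value chosen for `var_floor` can noticeably affect accuracy, so it is worth reporting alongside the results.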

State which classifier you implemented. Submit a PDF and the Jupyter notebook file.
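For part (a), the k-Nearest Neighbors procedure could be sketched along these lines. It is a minimal illustration rather than a reference solution: the helper names (`split_halves`, `knn_predict`, `accuracy`) are hypothetical, ties are broken in favour of the smallest label (the exercise allows any rule), and for an odd number of samples the extra one is assumed to go to the test half.

```python
import numpy as np

def split_halves(X, y):
    """First half for training, last half for prediction."""
    half = len(X) // 2
    return X[:half], y[:half], X[half:], y[half:]

def knn_predict(X_train, y_train, X_test, k):
    """Majority vote among the k training rows closest to each test row."""
    X_train = np.asarray(X_train, dtype=float)
    X_test = np.asarray(X_test, dtype=float)
    y_train = np.asarray(y_train)
    preds = np.empty(len(X_test), dtype=y_train.dtype)
    for i, x in enumerate(X_test):
        # Sum of squared pixel differences to every training example
        dists = ((X_train - x) ** 2).sum(axis=1)
        # Indices of the k smallest distances
        nearest = np.argsort(dists, kind="stable")[:k]
        labels, counts = np.unique(y_train[nearest], return_counts=True)
        # np.unique sorts labels, so argmax resolves ties to the smallest label
        preds[i] = labels[np.argmax(counts)]
    return preds

def accuracy(y_true, y_pred):
    """Fraction of test examples whose predicted class equals the true class."""
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))
```

To build the accuracy table, one would call `knn_predict` once per value of k (1, 3, 5, 100, 500) on the split returned by `split_halves`, and pass each set of predictions to `accuracy` and to `sklearn.metrics.confusion_matrix`. Looping over test rows keeps memory use small; the exercise explicitly does not grade efficiency.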