Question:
A well-known data set is the MNIST handwritten digit database, containing many thousands of digitized numbers (from 0 to 9), each described by a \(28 \times 28\) matrix of gray scales. A similar but much smaller data set is described in [63]. Here, each handwritten digit is summarized by an \(8 \times 8\) matrix with integer entries from 0 (white) to 15 (black). Figure 7.14 shows the first 50 digitized images. The data set can be accessed with Python using the sklearn package as follows.
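A minimal loading sketch, assuming the data set in question is the one bundled with scikit-learn as `sklearn.datasets.load_digits` (the variable names `x_digits` and `y_digits` are illustrative choices):

```python
from sklearn import datasets

# Assumption: the 8x8 digit data set is the one shipped with scikit-learn.
digits = datasets.load_digits()

x_digits = digits.data    # one row per image, 64 features (flattened 8x8 matrix)
y_digits = digits.target  # the digit label (0 to 9) for each image

print(x_digits.shape, y_digits.shape)
```

Each row of `x_digits` is a flattened image; `digits.images` holds the same data as \(8 \times 8\) arrays if the matrix form is needed for plotting.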
(a) Divide the data into a \(75 \%\) training set and \(25 \%\) test set.
(b) Compare the effectiveness of the \(K\)-nearest neighbors and naïve Bayes methods in classifying the data.
(c) Assess which \(K\) to use in the \(K\)-nearest neighbors classification.
Step by Step Answer:
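One possible way to approach parts (a) to (c) is sketched below; it is not the book's own solution. It assumes the data come from `sklearn.datasets.load_digits`, uses `GaussianNB` as the naïve Bayes variant, starts from \(K = 5\) in part (b), and selects \(K\) in part (c) by 5-fold cross-validation on the training set over \(K = 1, \ldots, 20\). The random seed, the stratified split, and the range of \(K\) are all arbitrary illustrative choices.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB

# Assumption: the 8x8 digit data set is the one shipped with scikit-learn.
X, y = load_digits(return_X_y=True)

# (a) 75% training / 25% test split. The seed and stratification are
# illustrative choices, not part of the exercise statement.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# (b) Fit both classifiers on the training set and compare test accuracy.
# GaussianNB is one naive Bayes variant; K = 5 is an arbitrary starting value.
knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
nb = GaussianNB().fit(X_train, y_train)
knn_acc = knn.score(X_test, y_test)
nb_acc = nb.score(X_test, y_test)
print(f"K-nearest neighbors (K=5) test accuracy: {knn_acc:.3f}")
print(f"Gaussian naive Bayes test accuracy:      {nb_acc:.3f}")

# (c) Assess K using 5-fold cross-validation on the training set only,
# so that the test set is not used for model selection.
k_values = range(1, 21)
cv_scores = {
    k: cross_val_score(
        KNeighborsClassifier(n_neighbors=k), X_train, y_train, cv=5
    ).mean()
    for k in k_values
}
best_k = max(cv_scores, key=cv_scores.get)
print(f"Best K by cross-validation: {best_k} "
      f"(mean CV accuracy {cv_scores[best_k]:.3f})")
```

Choosing \(K\) by cross-validation on the training data keeps the test set untouched until the final comparison; plotting `cv_scores` against \(K\) shows how sensitive the accuracy is to that choice.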
Data Science and Machine Learning: Mathematical and Statistical Methods
ISBN: 9781118710852
1st Edition
Authors: Dirk P. Kroese, Thomas Taimre, Radislav Vaisman, Zdravko Botev