Question

I need Gaussian Naive Bayes Python code for the digits dataset, written without using any library that implements the classifier directly. Please make sure it runs without errors.

In this exercise, you are to implement only one of two possible classifiers (your choice). Note that you are not to use modules which provide these functions (no sklearn.naive_bayes, for example), since that would be too easy; create them yourselves instead. Students who implement both classifiers will be given bonus credit. Your implementation will be evaluated on classifier functionality (whether you correctly implement kNN or Naive Bayes for the dataset) rather than efficiency. The data set to use is the digit recognition data set available from the sklearn module; the demonstration linked here should provide some guidance. You are expected to use Jupyter notebooks and Python on this assignment, but can ask for exceptions. Your goal is to use the first half of the data set to train your model and the last half for prediction.
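The train/test split described above could be sketched as follows. This is only an illustration: the function names `split_first_last_half` and `load_digits_halves` are hypothetical, and it assumes scikit-learn is installed and used solely to fetch the data through `sklearn.datasets.load_digits`, not to classify.

```python
def split_first_last_half(samples, labels):
    """Return (train_x, train_y, test_x, test_y).

    The first half of the data trains the model; the last half is held
    out for prediction. With an odd count the extra sample goes to the
    test half.
    """
    if len(samples) != len(labels):
        raise ValueError("samples and labels must have the same length")
    mid = len(samples) // 2
    return samples[:mid], labels[:mid], samples[mid:], labels[mid:]


def load_digits_halves():
    """Fetch the digits data and split it in order, without shuffling."""
    # Assumption: scikit-learn is available; imported lazily so the
    # splitting helper above works without it.
    from sklearn.datasets import load_digits

    images, targets = load_digits(return_X_y=True)
    # Convert the NumPy arrays to plain lists of floats and ints.
    return split_first_last_half(images.tolist(), targets.tolist())
```

The digits set has 1797 images of 64 pixels each, so this split would give 898 training images and 899 test images.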

b) Gaussian Naive Bayes

$x$ represents the image vector $(x_1, x_2, x_3, \ldots, x_{64})$.
$c_k$ represents class $k$, that is, one of the 10 digits for recognition.

Recall, we're looking for the highest $p(c_k \mid x)$ by using this fact:

$$p(c_k \mid x) = \frac{p(x \mid c_k)\, p(c_k)}{p(x)}$$

Let's step through the parts:

- $p(c_k)$ is simply the proportion of that class in the training data. E.g. if there are 20 fives out of 200 digits in the training sample, $p(\text{five}) = 20/200 = 0.1$.
- $p(x \mid c_k)$ is more complicated.
  - The main assumption of naive Bayes is that the features should be treated independently (which is why it's "naive"). This means $p(x \mid c_k) = p(x_1 \mid c_k)\, p(x_2 \mid c_k) \cdots p(x_{64} \mid c_k)$.
  - For each class $k$ in the training data:
    - Calculate the mean $\mu$ and variance $\sigma^2$ of each pixel location for that class.
    - Use that and the formula for a Gaussian probability to calculate $p(x_i \mid c_k)$:

$$g(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$$

- $p(x)$ is the normalization term. You don't need to calculate this, since you just want to pick the largest $p(c_k \mid x)$, and $p(x)$ is the same denominator in calculating $p(c_k \mid x)$ for every class.
- However, if you want $p(c_k \mid x)$ to provide a true estimate of the probability, you can use the following formula to calculate $p(x)$:

$$p(x) = \sum_k p(x, c_k) = \sum_k p(x \mid c_k)\, p(c_k)$$

The predicted class is the one with the largest $p(c_k \mid x)$ for each image.

1. Report the overall accuracy of your prediction.
2. Show the classification matrix.
3. Note which errors are more common. In what way does that match your intuitions?
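One way the steps above might be implemented with only the Python standard library is sketched below. It is not a reference solution, and two choices in it go beyond the assignment text. First, it sums log-probabilities instead of multiplying 64 densities, because the raw product can underflow to zero. Second, it adds a small constant `VAR_SMOOTHING` to every variance, because a pixel that is constant within a class has zero variance and would cause a division by zero; the value `1e-3` is an arbitrary illustrative pick. All function names are hypothetical.

```python
import math
from collections import defaultdict

# Assumption: not part of the assignment. Keeps every variance positive.
VAR_SMOOTHING = 1e-3


def fit_gaussian_nb(train_x, train_y):
    """Estimate p(c_k) and the per-pixel mean and variance for each class."""
    rows_by_class = defaultdict(list)
    for features, label in zip(train_x, train_y):
        rows_by_class[label].append(features)
    total = len(train_y)
    model = {}
    for label, rows in rows_by_class.items():
        count = len(rows)
        columns = list(zip(*rows))  # one tuple of values per pixel position
        means = [sum(col) / count for col in columns]
        variances = [
            sum((v - mu) ** 2 for v in col) / count + VAR_SMOOTHING
            for col, mu in zip(columns, means)
        ]
        model[label] = (count / total, means, variances)
    return model


def log_joint(model, features):
    """Return {class: log p(c_k) + sum_i log p(x_i | c_k)}."""
    scores = {}
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for x, mu, var in zip(features, means, variances):
            # Log of the Gaussian density g(x) with mean mu, variance var.
            score += -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)
        scores[label] = score
    return scores


def predict(model, features):
    """Pick the class with the largest p(c_k | x); p(x) is not needed."""
    scores = log_joint(model, features)
    return max(scores, key=scores.get)


def posterior(model, features):
    """Normalise by p(x) = sum_k p(x | c_k) p(c_k) via log-sum-exp."""
    scores = log_joint(model, features)
    top = max(scores.values())
    log_px = top + math.log(sum(math.exp(s - top) for s in scores.values()))
    return {label: math.exp(s - log_px) for label, s in scores.items()}


def accuracy(model, test_x, test_y):
    """Fraction of test samples whose predicted class matches the label."""
    hits = sum(predict(model, x) == y for x, y in zip(test_x, test_y))
    return hits / len(test_y)


def confusion_matrix(model, test_x, test_y, labels):
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for x, y in zip(test_x, test_y):
        matrix[index[y]][index[predict(model, x)]] += 1
    return matrix
```

With the digits data held as plain Python lists in `train_x`, `train_y`, `test_x` and `test_y` (names assumed here), `model = fit_gaussian_nb(train_x, train_y)` followed by `accuracy(model, test_x, test_y)` and `confusion_matrix(model, test_x, test_y, list(range(10)))` would cover items 1 and 2. The largest off-diagonal entries of the matrix are the starting point for item 3. The accuracy depends noticeably on the smoothing constant, so it is worth trying a few values.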
