
Question

1. Generalization and risk decomposition. Govinda is designing a classifier to determine whether an email should be sent to the user's spam folder. He collects a dataset of 1000 emails, pays human experts to annotate them with labels (spam or not spam), and splits the dataset into a training set, a validation set, and a test set. He first decides to use the 10000 most common words of English as features for this classifier (let's denote this classifier g_10000). Later, his manager asks him to use a smaller number of features, so he also trains another classifier, g_100, which uses only the 100 most common words of English.

(a) (2 points) Which is more likely to overfit?
A. g_100
B. g_10000
C. They are both equally likely to overfit

(b) (3 points) Is the approximation error higher when Govinda uses 100 features, when he uses 10000 features, or is it equal in both cases? Briefly explain using the definition of approximation error.

(c) (2 points) Is the Bayes error higher when Govinda uses 100 features, when he uses 10000 features, or is it equal in both cases? Briefly explain using the definition of the Bayes error.

(d) (3 points) As the deadline approaches, Govinda decreases the size of the training set to speed up training. Do you expect the approximation error to increase, decrease, or stay the same? What about the estimation error? Explain briefly.

(e) (2 points) Is the approximation error a random variable? What about the estimation error? Briefly explain what it means to say that a given quantity is a random variable.

(f) (3 points) Suppose efficiency isn't a concern anymore, but Govinda is now intrigued and would like to determine empirically how many features he should use in his classifier. He experiments with all g_i for 100
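
Several parts of this question ask for an explanation "using the definition" of approximation error, estimation error, or Bayes error. For reference, one standard way to write the risk decomposition is sketched below. The notation (R, f^*, \mathcal{F}, f_{\mathcal{F}}, \hat{f}_n) is assumed here for illustration and does not appear in the original problem, so the course's own symbols may differ.

```latex
% Assumed notation (not from the original problem):
%   R(f)             risk (expected loss) of a predictor f
%   f^*              Bayes predictor: minimizes R over all functions of the input
%   \mathcal{F}      hypothesis space searched by the learning algorithm
%   f_{\mathcal{F}}  best predictor within \mathcal{F}
%   \hat{f}_n        predictor fitted on a training sample of size n
\begin{align*}
f^* &= \operatorname*{arg\,min}_{f} R(f),
&
f_{\mathcal{F}} &= \operatorname*{arg\,min}_{f \in \mathcal{F}} R(f),
&
\text{Bayes error} &= R(f^*).
\end{align*}

\begin{equation*}
\underbrace{R(\hat{f}_n) - R(f^*)}_{\text{excess risk}}
=
\underbrace{R(\hat{f}_n) - R(f_{\mathcal{F}})}_{\text{estimation error}}
+
\underbrace{R(f_{\mathcal{F}}) - R(f^*)}_{\text{approximation error}}
\end{equation*}
```

In this form, only the estimation-error term involves \hat{f}_n, the predictor fitted on the sampled training data.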
