Question

Only part c)

6 Boosting (12 pts)

Consider a patient classification problem in which we aim to classify patients as low-risk ($y = -1$) or high-risk ($y = 1$) for in-hospital mortality using only medication data. That is, we represent each patient by a binary (0/1) feature vector $x = [x_1, \ldots, x_d]^T$, where $x_i = 1$ if the patient received medication $i$ and $x_i = 0$ otherwise. We define a set of weak classifiers, $h(x; \theta) = z x_i$, parameterized by $\theta = (i, z)$ (the choice of the component, $i \in \{1, \ldots, d\}$, and the class label, $z \in \{-1, 1\}$, that the medication should be associated with). There are exactly $2d$ possible weak classifiers of this type.

Next, we run the AdaBoost algorithm, which aims to minimize the exponential loss on the training data through a learned combination of weak classifiers. We assume that the boosting algorithm finds the best (i.e., minimizes weighted training loss) weak classifier at each iteration.

(a) For the following questions, answer True or False and justify your choice:

(i) (2 pt) Within an iteration, the weights on all the misclassified points go up by the same multiplicative factor.

(ii) (2 pt) The boosting algorithm described above can select the exact same weak classifier more than once.

(b) (4 pt) In running this algorithm, we observe that the weighted error of the $k$-th weak classifier (measured relative to the weights at the beginning of the $k$-th iteration) tends to increase as a function of the iteration $k$. Provide a brief rationale for why this might be the case.

(c) (4 pt) We use the output of this classifier to select features. That is, we select features or medications in the order in which they were identified by the weak learners. Suppose that one of our goals is to remove redundant features. Is the ranking produced by the boosting algorithm likely to be more useful than a ranking based on simple information gain calculations (e.g., $IG(y, x_i)$)? Briefly justify your answer.
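To make the setup in part (c) concrete, here is a minimal Python sketch that compares the two rankings on a tiny invented dataset. Everything in it is an assumption for illustration and not part of the original problem: the function names (`stump_predict`, `adaboost_feature_order`, `information_gain`), the eight toy patients, and in particular the choice to map each 0/1 feature to -1/+1 inside the stump so that every weak classifier outputs a label. Column 1 is an exact copy of column 0, to stand in for a redundant medication.

```python
import numpy as np

def stump_predict(X, i, z):
    # Assumption: map the 0/1 feature to -1/+1 so each stump always outputs a label.
    return z * (2 * X[:, i] - 1)

def adaboost_feature_order(X, y, rounds):
    """Run AdaBoost over all 2d stumps; return features in order of first selection."""
    n, d = X.shape
    w = np.full(n, 1.0 / n)          # uniform starting weights
    order = []
    for _ in range(rounds):
        best = None
        for i in range(d):
            for z in (-1, 1):
                err = np.sum(w[stump_predict(X, i, z) != y])  # weighted error
                if best is None or err < best[0]:
                    best = (err, i, z)
        err, i, z = best
        err = np.clip(err, 1e-12, 1 - 1e-12)   # avoid division by zero / log(0)
        alpha = 0.5 * np.log((1 - err) / err)
        # Misclassified points are scaled by exp(alpha), correct ones by exp(-alpha).
        w = w * np.exp(-alpha * y * stump_predict(X, i, z))
        w /= w.sum()
        if i not in order:
            order.append(i)
    return order

def information_gain(x, y):
    """IG(y, x) in bits for a binary feature x and labels y in {-1, +1}."""
    def entropy(labels):
        if labels.size == 0:
            return 0.0
        p = np.mean(labels == 1)
        if p == 0.0 or p == 1.0:
            return 0.0
        return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))
    gain = entropy(y)
    for v in (0, 1):
        mask = x == v
        gain -= mask.mean() * entropy(y[mask])
    return gain

# Toy data (invented): column 1 duplicates column 0; column 2 is weaker alone
# but is correct on exactly the patients that column 0 gets wrong.
y = np.array([1, 1, 1, 1, -1, -1, -1, -1])
X = np.array([
    [1, 1, 1], [1, 1, 0], [1, 1, 0], [0, 0, 1],
    [0, 0, 0], [0, 0, 0], [0, 0, 1], [1, 1, 0],
])

boost_order = adaboost_feature_order(X, y, rounds=2)
ig = np.array([information_gain(X[:, i], y) for i in range(X.shape[1])])
ig_order = list(np.argsort(-ig, kind="stable"))
print("boosting order:", boost_order)
print("information-gain order:", ig_order)
```

On this toy data, information gain scores the two identical columns equally and ranks both ahead of column 2, while the boosting order moves to column 2 in the second round: after reweighting, the stump already chosen (and its duplicate) has weighted error 1/2 and so offers nothing new. This is only an illustration under the stated assumptions, not a full answer to the question.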
