7.12 Empirical margin loss boosting. As discussed in the chapter, AdaBoost can be viewed as coordinate descent applied to a convex upper bound on the empirical error. Here, we consider an algorithm seeking to minimize the empirical margin loss. For any $0 \leq \rho < 1$, let
$$\widehat{R}_{S,\rho}(f) = \frac{1}{m} \sum_{i=1}^m 1_{y_i f(x_i) \leq \rho}$$
denote the empirical margin loss of a function $f$ of the form $f = \frac{\sum_{t=1}^T \alpha_t h_t}{\sum_{t=1}^T \alpha_t}$ for a labeled sample $S = ((x_1, y_1), \ldots, (x_m, y_m))$.
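As a quick numerical illustration (not part of the original exercise), here is a minimal Python sketch of this quantity, under the assumption that the base hypotheses are supplied as a matrix of $\pm 1$ predictions on the sample; the function name and data layout are illustrative only.

    import numpy as np

    def empirical_margin_loss(preds, alphas, y, rho):
        """Empirical margin loss R_hat_{S,rho}(f) for f = sum_t a_t h_t / sum_t a_t.

        preds:  (T, m) array, preds[t, i] = h_t(x_i) in {-1, +1}
        alphas: (T,) array of non-negative coefficients
        y:      (m,) array of labels in {-1, +1}
        rho:    margin parameter, 0 <= rho < 1
        """
        f = preds.T @ alphas / np.sum(alphas)  # normalized ensemble scores f(x_i)
        margins = y * f                        # pointwise margins y_i f(x_i)
        return np.mean(margins <= rho)         # fraction with margin at most rho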

(a) Show that $\widehat{R}_{S,\rho}(f)$ can be upper bounded as follows:
$$\widehat{R}_{S,\rho}(f) \leq \frac{1}{m} \sum_{i=1}^m \exp\left(-y_i \sum_{t=1}^T \alpha_t h_t(x_i) + \rho \sum_{t=1}^T \alpha_t\right).$$
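One way to see this (a sketch of the intended argument, not the book's worked solution): since $\sum_{t=1}^T \alpha_t > 0$, the condition $y_i f(x_i) \leq \rho$ is equivalent to $-y_i \sum_{t=1}^T \alpha_t h_t(x_i) + \rho \sum_{t=1}^T \alpha_t \geq 0$, and the indicator $1_{u \geq 0}$ is upper bounded by $e^u$, so
$$1_{y_i f(x_i) \leq \rho} \leq \exp\left(-y_i \sum_{t=1}^T \alpha_t h_t(x_i) + \rho \sum_{t=1}^T \alpha_t\right),$$
and averaging over $i$ gives the claimed bound.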

(b) For any $\rho > 0$, let $G$ be the objective function defined for all $\alpha \geq 0$ by
$$G(\alpha) = \frac{1}{m} \sum_{i=1}^m \exp\left(-y_i \sum_{j=1}^N \alpha_j h_j(x_i) + \rho \sum_{j=1}^N \alpha_j\right),$$
with $h_j \in H$ for all $j \in [N]$, with the notation used in class in the boosting lecture. Show that $G$ is convex and differentiable.
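A sketch of the standard argument: for each $i$, the exponent
$$\alpha \mapsto -y_i \sum_{j=1}^N \alpha_j h_j(x_i) + \rho \sum_{j=1}^N \alpha_j = \sum_{j=1}^N \alpha_j\left(\rho - y_i h_j(x_i)\right)$$
is linear in $\alpha$, and $x \mapsto e^x$ is convex and nondecreasing, so each summand is convex and infinitely differentiable; $G$, a finite sum of such terms, inherits both properties.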

(c) Derive a boosting-style algorithm $A_\rho$ by applying (maximum) coordinate descent to $G$. Justify the derivation of the algorithm in detail, in particular the choice of the base classifier selected at each round and that of the step. Compare both to their counterparts in AdaBoost.
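One consistent way the derivation can go (a sketch under the assumption that the distributions $D_t$ are defined as in AdaBoost): the directional derivative of $G$ along coordinate $k$ is proportional to $\rho - (1 - 2\epsilon_k)$, so maximum coordinate descent selects the base classifier with the smallest weighted error $\epsilon_t$, exactly as in AdaBoost; setting the derivative along that coordinate to zero yields the step
$$\alpha_t = \frac{1}{2}\log\frac{1-\epsilon_t}{\epsilon_t} - \frac{1}{2}\log\frac{1+\rho}{1-\rho},$$
which is the AdaBoost step shifted by the constant $\frac{1}{2}\log\frac{1+\rho}{1-\rho}$.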

(d) What is the equivalent of the weak learning assumption for $A_\rho$? (Hint: use the non-negativity of the step value.)
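As a pointer toward where the hint leads (an assumption about the intended answer, not a quoted solution): the step above is non-negative if and only if
$$\frac{1-\epsilon_t}{\epsilon_t} \geq \frac{1+\rho}{1-\rho} \iff \epsilon_t \leq \frac{1-\rho}{2},$$
so the natural analogue of the weak learning assumption requires each base classifier to achieve weighted error at most $\frac{1-\rho}{2}$, rather than at most $\frac{1}{2}$ as in AdaBoost.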

(e) Give the full pseudocode of the algorithm $A_\rho$. What can you say about the $A_0$ algorithm?
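For concreteness, a minimal runnable sketch of what such pseudocode might look like, assuming base hypotheses are supplied as a finite matrix of $\pm 1$ predictions (the function name, this "best of a finite pool" weak learner, and the numerical guard are all illustrative assumptions, not taken from the book):

    import numpy as np

    def boost_rho(preds, y, rho, T):
        """Margin-loss boosting A_rho (sketch).

        preds: (N, m) array, preds[j, i] = h_j(x_i) in {-1, +1}
        y:     (m,) labels in {-1, +1};  0 <= rho < 1;  T rounds.
        Returns the selected hypothesis indices and steps alpha_t.
        """
        N, m = preds.shape
        D = np.full(m, 1.0 / m)              # distribution over training points
        chosen, alphas = [], []
        for _ in range(T):
            # weighted errors of all base hypotheses under D
            errs = np.array([np.sum(D * (preds[j] != y)) for j in range(N)])
            k = int(np.argmin(errs))         # best base classifier, as in AdaBoost
            eps = max(errs[k], 1e-12)        # guard against a perfect hypothesis
            if eps >= (1 - rho) / 2:         # weak learning condition violated
                break
            # AdaBoost step shifted by a rho-dependent constant
            alpha = 0.5 * np.log((1 - eps) / eps) - 0.5 * np.log((1 + rho) / (1 - rho))
            chosen.append(k)
            alphas.append(alpha)
            # multiplicative update and renormalization, as in AdaBoost
            D *= np.exp(-alpha * y * preds[k])
            D /= D.sum()
        return chosen, alphas

Note that with $\rho = 0$ the shift term vanishes and the step reduces to the AdaBoost step, so $A_0$ coincides with AdaBoost.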

(f) Provide a bound on $\widehat{R}_{S,\rho}(f)$.

i. Prove the upper bound
$$\widehat{R}_{S,\rho}(f) \leq \exp\left(\rho \sum_{t=1}^T \alpha_t\right) \prod_{t=1}^T Z_t,$$
where the normalization factors $Z_t$ are defined as in the case of AdaBoost (with $\alpha_t$ the step chosen by $A_\rho$ at round $t$).
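A sketch of the standard telescoping argument, assuming the distributions $D_t$ are updated as in AdaBoost: unrolling the update gives
$$D_{T+1}(i) = \frac{\exp\left(-y_i \sum_{t=1}^T \alpha_t h_t(x_i)\right)}{m \prod_{t=1}^T Z_t},$$
so, combining with the bound of part (a),
$$\widehat{R}_{S,\rho}(f) \leq e^{\rho \sum_{t=1}^T \alpha_t} \cdot \frac{1}{m}\sum_{i=1}^m e^{-y_i \sum_{t=1}^T \alpha_t h_t(x_i)} = e^{\rho \sum_{t=1}^T \alpha_t} \prod_{t=1}^T Z_t \sum_{i=1}^m D_{T+1}(i) = e^{\rho \sum_{t=1}^T \alpha_t} \prod_{t=1}^T Z_t.$$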

ii. Give the expression of Zt as a function of  and t, where t is the weighted error of the hypothesis found by A at round t (de ned in the same way as for AdaBoost in class). Use that to prove the following upper bound bR S;

(f) 

u 1+
2 + u????1????
2 T YT t=1 q 1????
t (1 ???? t)1+;
where u = 1????
1+ .
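For reference, a sketch of the computation (again assuming AdaBoost's definitions): as in AdaBoost, $Z_t = \sum_i D_t(i)\,e^{-\alpha_t y_i h_t(x_i)} = \epsilon_t e^{\alpha_t} + (1-\epsilon_t)e^{-\alpha_t}$, and substituting the step $\alpha_t$ from part (c) gives
$$e^{\rho \alpha_t} Z_t = \left(u^{\frac{1+\rho}{2}} + u^{-\frac{1-\rho}{2}}\right)\sqrt{\epsilon_t^{1-\rho}\,(1-\epsilon_t)^{1+\rho}};$$
taking the product over $t$ in the bound of part i then yields the stated inequality.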
iii. Assume that for all $t \in [T]$, $\frac{1-\rho}{2} - \epsilon_t > \gamma > 0$. Use the result of the previous question to show that
$$\widehat{R}_{S,\rho}(f) \leq \exp\left(-\frac{2\gamma^2 T}{1-\rho^2}\right).$$
(Hint: you can use without proof the following identity:
$$\left(u^{\frac{1+\rho}{2}} + u^{-\frac{1-\rho}{2}}\right)\sqrt{\epsilon_t^{1-\rho}\,(1-\epsilon_t)^{1+\rho}} \leq 1 - \frac{2\left(\frac{1-\rho}{2} - \epsilon_t\right)^2}{1-\rho^2},$$
valid for $\frac{1-\rho}{2} - \epsilon_t > 0$.) Show that for $T > \frac{(\log m)(1-\rho^2)}{2\gamma^2}$, all points of the training data have margin at least $\rho$.
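A sketch of the concluding steps, using the hint together with $1 - x \leq e^{-x}$:
$$\widehat{R}_{S,\rho}(f) \leq \prod_{t=1}^T \left(1 - \frac{2\left(\frac{1-\rho}{2}-\epsilon_t\right)^2}{1-\rho^2}\right) \leq \prod_{t=1}^T e^{-\frac{2\gamma^2}{1-\rho^2}} = \exp\left(-\frac{2\gamma^2 T}{1-\rho^2}\right).$$
When $T > \frac{(\log m)(1-\rho^2)}{2\gamma^2}$, this bound is strictly less than $\frac{1}{m}$; since $\widehat{R}_{S,\rho}(f)$ is an integer multiple of $\frac{1}{m}$, it must equal $0$, i.e., every training point has margin greater than $\rho$.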


Source: Foundations of Machine Learning, 2nd Edition, by Mehryar Mohri and Afshin Rostamizadeh. ISBN 9780262351362.
