Question:

5.4 Sequential minimal optimization (SMO). The SMO algorithm is an optimization algorithm introduced to speed up the training of SVMs. SMO reduces a (potentially) large quadratic programming (QP) optimization problem into a series of small optimizations involving only two Lagrange multipliers. SMO reduces memory requirements, bypasses the need for numerical QP optimization and is easy to implement. In this question, we will derive the update rule for the SMO algorithm in the context of the dual formulation of the SVM problem.

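(a) Assume that we want to optimize equation (5.33) only over $\alpha_1$ and $\alpha_2$. Show that the optimization problem reduces to

$$\max_{\alpha_1, \alpha_2} \; \underbrace{\alpha_1 + \alpha_2 - \frac{1}{2} K_{11} \alpha_1^2 - \frac{1}{2} K_{22} \alpha_2^2 - s K_{12} \alpha_1 \alpha_2 - y_1 \alpha_1 v_1 - y_2 \alpha_2 v_2}_{\Psi_1(\alpha_1, \alpha_2)}$$

$$\text{subject to: } 0 \le \alpha_1, \alpha_2 \le C \;\wedge\; \alpha_1 + s \alpha_2 = \gamma,$$

where $\gamma = y_1 \sum_{i=3}^{m} y_i \alpha_i$, $s = y_1 y_2 \in \{-1, +1\}$, $K_{ij} = (x_i \cdot x_j)$ and $v_i = \sum_{j=3}^{m} \alpha_j y_j K_{ij}$ for $i = 1, 2$.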
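The reduction in part (a) can be sanity-checked numerically: with all other multipliers held fixed, the full dual objective and $\Psi_1$ should differ only by a constant that does not depend on $\alpha_1$ or $\alpha_2$. The following is a minimal sketch under stated assumptions: the dual objective is taken to be the standard SVM dual, $\sum_i \alpha_i - \frac{1}{2} \sum_{i,j} \alpha_i \alpha_j y_i y_j K_{ij}$, with the linear kernel $K_{ij} = (x_i \cdot x_j)$ used in the exercise; the function names and the toy data are hypothetical, and list positions 0 and 1 stand in for indices 1 and 2.

```python
def dot(u, v):
    """Linear kernel K(u, v) = u . v, as in the exercise."""
    return sum(a * b for a, b in zip(u, v))


def dual_objective(alpha, X, y):
    """Standard SVM dual objective (assumed form of equation (5.33))."""
    m = len(alpha)
    quad = sum(
        alpha[i] * alpha[j] * y[i] * y[j] * dot(X[i], X[j])
        for i in range(m)
        for j in range(m)
    )
    return sum(alpha) - 0.5 * quad


def psi1(a1, a2, alpha, X, y):
    """Reduced objective Psi_1; alpha[2:] holds the fixed multipliers."""
    m = len(alpha)
    K11, K22, K12 = dot(X[0], X[0]), dot(X[1], X[1]), dot(X[0], X[1])
    s = y[0] * y[1]
    v1 = sum(alpha[j] * y[j] * dot(X[0], X[j]) for j in range(2, m))
    v2 = sum(alpha[j] * y[j] * dot(X[1], X[j]) for j in range(2, m))
    return (
        a1 + a2
        - 0.5 * K11 * a1 ** 2
        - 0.5 * K22 * a2 ** 2
        - s * K12 * a1 * a2
        - y[0] * a1 * v1
        - y[1] * a2 * v2
    )


def gap(a1, a2, alpha, X, y):
    """Full dual minus Psi_1 when the first two multipliers are (a1, a2)."""
    trial = [a1, a2] + list(alpha[2:])
    return dual_objective(trial, X, y) - psi1(a1, a2, alpha, X, y)


# Hypothetical toy data: four points in the plane with labels in {-1, +1}.
X = [(1.0, 2.0), (2.0, 0.0), (0.0, 1.0), (-1.0, -1.0)]
y = [1, -1, 1, -1]
alpha = [0.2, 0.5, 0.4, 0.1]

# The gap should be the same constant for any choice of (a1, a2).
print(gap(0.3, 0.1, alpha, X, y), gap(0.7, 0.2, alpha, X, y))
```

The check does not use the equality constraint; it only confirms that the terms collected into $\Psi_1$ are exactly those of the dual that involve $\alpha_1$ or $\alpha_2$.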

(b) Substitute the linear constraint $\alpha_1 = \gamma - s \alpha_2$ into $\Psi_1$ to obtain a new objective function $\Psi_2$ that depends only on $\alpha_2$. Show that the $\alpha_2$ that minimizes $\Psi_2$ (without the constraints $0 \le \alpha_1, \alpha_2 \le C$) can be expressed as

$$\alpha_2 = \frac{s (K_{11} - K_{12}) \gamma + y_2 (v_1 - v_2) - s + 1}{\eta},$$

where $\eta = K_{11} + K_{22} - 2 K_{12}$.
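The closed form in part (b) is the stationary point of $\Psi_2$. When $\eta > 0$, $\Psi_2$ is a concave quadratic in $\alpha_2$, so that stationary point is its unconstrained maximizer. The sketch below evaluates $\Psi_2$ by direct substitution and compares it against the closed form; the function names and the numeric values are hypothetical and chosen only for illustration.

```python
def psi2(a2, K11, K22, K12, y1, y2, v1, v2, gamma):
    """Psi_1 with alpha_1 eliminated through alpha_1 = gamma - s * alpha_2."""
    s = y1 * y2
    a1 = gamma - s * a2
    return (
        a1 + a2
        - 0.5 * K11 * a1 ** 2
        - 0.5 * K22 * a2 ** 2
        - s * K12 * a1 * a2
        - y1 * a1 * v1
        - y2 * a2 * v2
    )


def alpha2_unconstrained(K11, K22, K12, y1, y2, v1, v2, gamma):
    """Stationary point of Psi_2, ignoring the box constraints."""
    s = y1 * y2
    eta = K11 + K22 - 2.0 * K12
    if eta <= 0:
        # This sketch only handles the strictly concave case.
        raise ValueError("eta must be positive")
    return (s * (K11 - K12) * gamma + y2 * (v1 - v2) - s + 1.0) / eta


# Hypothetical values for the kernel entries, labels, v_i and gamma.
params = dict(K11=2.0, K22=1.5, K12=0.5, y1=1, y2=-1, v1=0.3, v2=-0.2, gamma=0.4)
a2_star = alpha2_unconstrained(**params)

# Central difference of Psi_2 at the stationary point (exact for a quadratic).
h = 1e-3
slope = (psi2(a2_star + h, **params) - psi2(a2_star - h, **params)) / (2 * h)
print(a2_star, slope)
```

The slope printed at the end should be zero up to rounding, and moving $\alpha_2$ in either direction should not increase $\Psi_2$.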

(c) Show that

$$v_1 - v_2 = f(x_1) - f(x_2) + \alpha_2^* y_2 \eta - s y_2 \gamma (K_{11} - K_{12}),$$

where $f(x) = \sum_{i=1}^{m} \alpha_i^* y_i (x_i \cdot x) + b^*$ and $\alpha_i^*$ are values for the Lagrange multipliers prior to optimization over $\alpha_1$ and $\alpha_2$ (similarly, $b^*$ is the previous value for the offset).

(d) Show that

$$\alpha_2 = \alpha_2^* + y_2 \, \frac{(y_2 - f(x_2)) - (y_1 - f(x_1))}{\eta}.$$
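The update in part (d) needs only the current decision values $f(x_1)$, $f(x_2)$ and $\eta$, rather than $v_1$, $v_2$ and $\gamma$. As a rough consistency check, the sketch below computes the update both ways and compares them. It assumes the pre-update multipliers satisfy $\alpha_1^* + s \alpha_2^* = \gamma$ and uses the linear kernel of the exercise; the function names, the toy data and the offset value are hypothetical, and list positions 0 and 1 stand in for indices 1 and 2.

```python
def dot(u, v):
    """Linear kernel K(u, v) = u . v, as in the exercise."""
    return sum(a * b for a, b in zip(u, v))


def decision_value(x, alpha, X, y, b):
    """f(x) = sum_i alpha_i y_i (x_i . x) + b with the pre-update multipliers."""
    return sum(a * yi * dot(xi, x) for a, yi, xi in zip(alpha, y, X)) + b


def alpha2_from_errors(alpha, X, y, b):
    """Unconstrained alpha_2 from part (d), written with f(x_1) and f(x_2)."""
    K11, K22, K12 = dot(X[0], X[0]), dot(X[1], X[1]), dot(X[0], X[1])
    eta = K11 + K22 - 2.0 * K12
    if eta <= 0:
        raise ValueError("eta must be positive")
    f1 = decision_value(X[0], alpha, X, y, b)
    f2 = decision_value(X[1], alpha, X, y, b)
    return alpha[1] + y[1] * ((y[1] - f2) - (y[0] - f1)) / eta


def alpha2_from_part_b(alpha, X, y):
    """Same quantity through the closed form of part (b)."""
    m = len(alpha)
    K11, K22, K12 = dot(X[0], X[0]), dot(X[1], X[1]), dot(X[0], X[1])
    s = y[0] * y[1]
    gamma = alpha[0] + s * alpha[1]  # equality constraint at the old point
    v1 = sum(alpha[j] * y[j] * dot(X[0], X[j]) for j in range(2, m))
    v2 = sum(alpha[j] * y[j] * dot(X[1], X[j]) for j in range(2, m))
    eta = K11 + K22 - 2.0 * K12
    return (s * (K11 - K12) * gamma + y[1] * (v1 - v2) - s + 1.0) / eta


# Hypothetical toy data; the offset b cancels in f(x_1) - f(x_2).
X = [(1.0, 2.0), (2.0, 0.0), (0.0, 1.0), (-1.0, -1.0)]
y = [1, -1, 1, -1]
alpha = [0.2, 0.5, 0.4, 0.1]
b = 0.3

print(alpha2_from_errors(alpha, X, y, b), alpha2_from_part_b(alpha, X, y))
```

Because only the difference $f(x_1) - f(x_2)$ enters, the result does not depend on the value of the offset.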

(e) For $s = +1$, define $L = \max\{0, \gamma - C\}$ and $H = \min\{C, \gamma\}$ as the lower and upper bounds on $\alpha_2$. Similarly, for $s = -1$, define $L = \max\{0, -\gamma\}$ and $H = \min\{C, C - \gamma\}$. The update rule for SMO involves "clipping" the value of $\alpha_2$, i.e.,

$$\alpha_2^{clip} = \begin{cases} \alpha_2 & \text{if } L < \alpha_2 < H \\ L & \text{if } \alpha_2 \le L \\ H & \text{if } \alpha_2 \ge H. \end{cases}$$

We subsequently solve for $\alpha_1$ such that we satisfy the equality constraint, resulting in $\alpha_1 = \alpha_1^* + s(\alpha_2^* - \alpha_2^{clip})$. Why is "clipping" required? How are $L$ and $H$ derived for the case $s = +1$?
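The clipping step in part (e) can be sketched in a few lines. The unconstrained $\alpha_2$ may fall outside the box $[0, C]$, or force $\alpha_1$ outside it through the equality constraint, so it is projected onto $[L, H]$ before $\alpha_1$ is recomputed. The function names below are hypothetical, and $\gamma$ is taken from the pre-update multipliers as $\alpha_1^* + s \alpha_2^*$.

```python
def clip_bounds(gamma, s, C):
    """Bounds L, H on alpha_2 from part (e)."""
    if s == 1:
        # alpha_1 = gamma - alpha_2 must lie in [0, C], so alpha_2 lies in
        # [gamma - C, gamma]; intersect that interval with [0, C].
        return max(0.0, gamma - C), min(C, gamma)
    # s == -1: alpha_1 = gamma + alpha_2 must lie in [0, C].
    return max(0.0, -gamma), min(C, C - gamma)


def smo_pair_update(alpha1_old, alpha2_old, alpha2_new, s, C):
    """Clip the unconstrained alpha_2, then restore the equality constraint."""
    gamma = alpha1_old + s * alpha2_old
    L, H = clip_bounds(gamma, s, C)
    alpha2_clip = min(max(alpha2_new, L), H)
    alpha1_new = alpha1_old + s * (alpha2_old - alpha2_clip)
    return alpha1_new, alpha2_clip


# Usage with hypothetical values: an unconstrained step that overshoots C.
a1, a2 = smo_pair_update(alpha1_old=0.2, alpha2_old=0.5, alpha2_new=1.7, s=-1, C=1.0)
print(a1, a2)
```

After the update both multipliers stay in $[0, C]$ and $\alpha_1 + s \alpha_2$ keeps its previous value, which is the property the clipping is meant to preserve.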

Related Book:

Foundations Of Machine Learning

ISBN: 9780262351362

2nd Edition

Authors: Mehryar Mohri, Afshin Rostamizadeh, Ameet Talwalkar
