Question:
8.6 Margin Perceptron. Given a training sample S that is linearly separable with a maximum margin ρ > 0, theorem 8.8 states that the Perceptron algorithm run cyclically over S is guaranteed to converge after at most R²/ρ² updates, where R is the radius of the sphere containing the sample points. However, this theorem does not guarantee that the hyperplane solution of the Perceptron algorithm achieves a margin close to ρ. Suppose we modify the Perceptron algorithm to ensure that the margin of the hyperplane solution is at least ρ/2. In particular, consider the algorithm described in figure 8.12. In this problem we show that this algorithm converges after at most 16R²/ρ² updates. Let I denote the set of times t ∈ [T] at which the algorithm makes an update and let M = |I| be the total number of updates.
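Figure 8.12 is not reproduced here. The following is a minimal sketch of a Margin Perceptron of the kind the problem describes, assuming the update rule "update whenever w = 0 or the point's geometric margin falls below ρ/2"; the function name, the epoch cap, and the toy dataset are illustrative, not the book's pseudocode:

```python
import numpy as np

def margin_perceptron(X, y, rho, epochs=100):
    """Sketch of a Margin Perceptron (assumed update rule, not figure 8.12 verbatim).

    Updates on any point whose geometric margin y_t (w . x_t) / ||w||
    is below rho/2 (or while w is still the zero vector).
    Assumes labels y in {-1, +1}.
    """
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        updated = False
        for x_t, y_t in zip(X, y):
            norm = np.linalg.norm(w)
            # update if w = 0 or the point's margin is below rho/2
            if norm == 0 or y_t * np.dot(w, x_t) / norm < rho / 2:
                w = w + y_t * x_t
                updated = True
        if not updated:
            # every point has margin >= rho/2: converged
            break
    return w
```

On a toy sample such as X = [(2, 0), (−2, 0)] with labels (+1, −1) and ρ = 2, the returned hyperplane separates the data with margin at least ρ/2, which is exactly the guarantee this exercise analyzes.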
(a) Using an analysis similar to the one given for the Perceptron algorithm, show that Mρ ≤ ‖w_{T+1}‖. Conclude that if ‖w_{T+1}‖ < 4R²/ρ, then M < 4R²/ρ².
(For the remainder of this problem, we will assume that ‖w_{T+1}‖ ≥ 4R²/ρ.)
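A sketch of the argument for (a), assuming v is a unit vector achieving margin ρ (y_t(v·x_t) ≥ ρ for all t), as in the standard Perceptron analysis; an outline, not the book's worked solution:

```latex
% Cauchy-Schwarz, then each update adds y_t x_t with v \cdot y_t x_t \ge \rho:
\|w_{T+1}\| \;\ge\; v \cdot w_{T+1}
  \;=\; \sum_{t \in I} y_t \,(v \cdot x_t) \;\ge\; M\rho .
% Hence, if \|w_{T+1}\| < 4R^2/\rho,
M \;\le\; \frac{\|w_{T+1}\|}{\rho} \;<\; \frac{4R^2}{\rho^2} .
```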
(b) Show that for any t ∈ I (including t = 0), the following holds: ‖w_{t+1}‖² ≤ (‖w_t‖ + ρ/2)² + R².
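A sketch for (b), assuming the update condition "w_t = 0 or y_t(w_t·x_t)/‖w_t‖ < ρ/2" from figure 8.12; in either case 2y_t(w_t·x_t) ≤ ρ‖w_t‖, and ‖x_t‖ ≤ R:

```latex
\|w_{t+1}\|^2
  = \|w_t + y_t x_t\|^2
  = \|w_t\|^2 + 2\, y_t (w_t \cdot x_t) + \|x_t\|^2
  \le \|w_t\|^2 + \rho \|w_t\| + R^2
  \le \Big(\|w_t\| + \frac{\rho}{2}\Big)^2 + R^2 .
```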
(c) From (b), infer that for any t ∈ I we have ‖w_{t+1}‖ ≤ ‖w_t‖ + ρ/2 + R²/(‖w_t‖ + ‖w_{t+1}‖ + ρ/2).
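A sketch for (c): rearrange (b) and factor the difference of squares (if ‖w_{t+1}‖ ≤ ‖w_t‖ + ρ/2 the claim is trivial; otherwise both factors are positive and we may divide):

```latex
\|w_{t+1}\|^2 - \Big(\|w_t\| + \frac{\rho}{2}\Big)^2 \le R^2
\;\Longleftrightarrow\;
\Big(\|w_{t+1}\| - \|w_t\| - \frac{\rho}{2}\Big)
\Big(\|w_{t+1}\| + \|w_t\| + \frac{\rho}{2}\Big) \le R^2
\;\Longrightarrow\;
\|w_{t+1}\| \le \|w_t\| + \frac{\rho}{2}
  + \frac{R^2}{\|w_t\| + \|w_{t+1}\| + \frac{\rho}{2}} .
```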
(d) Using the inequality from (c), show that for any t ∈ I such that either ‖w_t‖ ≥ 4R²/ρ or ‖w_{t+1}‖ ≥ 4R²/ρ, we have ‖w_{t+1}‖ ≤ ‖w_t‖ + 3ρ/4.
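A sketch for (d): under either hypothesis the denominator in (c) satisfies ‖w_t‖ + ‖w_{t+1}‖ + ρ/2 ≥ 4R²/ρ, so the fraction is at most ρ/4:

```latex
\|w_{t+1}\|
  \le \|w_t\| + \frac{\rho}{2} + \frac{R^2}{4R^2/\rho}
  = \|w_t\| + \frac{\rho}{2} + \frac{\rho}{4}
  = \|w_t\| + \frac{3\rho}{4} .
```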
(e) Show that ‖w_1‖ ≤ R ≤ 4R²/ρ. Since by assumption we have ‖w_{T+1}‖ ≥ 4R²/ρ, conclude that there must exist a largest time t0 ∈ I such that ‖w_{t0}‖ ≤ 4R²/ρ and ‖w_{t0+1}‖ ≥ 4R²/ρ.
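A sketch for (e), assuming w_0 = 0 so the first update gives w_1 = y_0 x_0:

```latex
% First update: \|w_1\| = \|x_0\| \le R, and \rho \le R (the margin
% cannot exceed the radius of the sphere containing the points) gives
\|w_1\| \le R \le \frac{4R^2}{\rho} .
% The norm starts at or below 4R^2/\rho, ends at or above it (since
% \|w_{T+1}\| \ge 4R^2/\rho by assumption), and changes only at update
% times; hence a largest t_0 \in I exists with
\|w_{t_0}\| \le \frac{4R^2}{\rho}
\quad\text{and}\quad
\|w_{t_0+1}\| \ge \frac{4R^2}{\rho} .
```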
(f) Show that ‖w_{T+1}‖ ≤ ‖w_{t0}‖ + (3ρ/4)M. Conclude that M ≤ 16R²/ρ².
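A sketch for (f): by the maximality of t0, every update after t0 (and the update at t0 itself) satisfies the hypothesis of (d), so each of the at most M updates increases the norm by at most 3ρ/4, and w is unchanged between updates:

```latex
\|w_{T+1}\| \le \|w_{t_0}\| + \frac{3\rho}{4} M .
% Combining with (a) and \|w_{t_0}\| \le 4R^2/\rho:
M\rho \;\le\; \frac{4R^2}{\rho} + \frac{3\rho}{4} M
\;\Longrightarrow\;
\frac{\rho}{4} M \le \frac{4R^2}{\rho}
\;\Longrightarrow\;
M \le \frac{16R^2}{\rho^2} .
```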
Source: Foundations of Machine Learning, 2nd Edition, by Mehryar Mohri and Afshin Rostamizadeh. ISBN: 9780262351362.