2.7 Learning in the presence of noise: general case
Question:
2.7 Learning in the presence of noise: general case. In this question, we will seek a result that is more general than in the previous question. We consider a finite hypothesis set $H$, assume that the target concept is in $H$, and adopt the following noise model: the label of a training point received by the learner is randomly changed with probability $\eta \in (0, \frac{1}{2})$. The exact value of the noise rate $\eta$ is not known to the learner, but an upper bound $\eta'$ is supplied to him, with $\eta \le \eta' < \frac{1}{2}$.
(a) For any $h \in H$, let $d(h)$ denote the probability that the label of a training point received by the learner disagrees with the one given by $h$. Let $h^*$ be the target hypothesis; show that $d(h^*) = \eta$.
(b) More generally, show that for any $h \in H$, $d(h) = \eta + (1 - 2\eta)\,R(h)$, where $R(h)$ denotes the generalization error of $h$. (A numerical sanity check of this identity is sketched after the hint below.)
(c) Fix $\epsilon > 0$ for this and all the following questions. Use the previous questions to show that if $R(h) > \epsilon$, then $d(h) - d(h^*) \ge \epsilon'$, where $\epsilon' = \epsilon(1 - 2\eta')$. (A short derivation sketch combining (a) and (b) is given after the hint below.)
(d) For any hypothesis $h \in H$ and sample $S$ of size $m$, let $\hat{d}(h)$ denote the fraction of the points in $S$ whose labels disagree with those given by $h$. We will consider the algorithm $L$ which, after receiving $S$, returns the hypothesis $h_S$ with the smallest number of disagreements (thus $\hat{d}(h_S)$ is minimal); an implementation sketch of $L$ is given after the hint below. To show PAC-learning for $L$, we will show that for any $h$, if $R(h) > \epsilon$, then with high probability $\hat{d}(h) \ge \hat{d}(h^*)$. First, show that for any $\delta > 0$, with probability at least $1 - \delta/2$, for $m \ge \frac{2}{\epsilon'^2} \log\frac{2}{\delta}$, the following holds:
$$\hat{d}(h^*) - d(h^*) \le \frac{\epsilon'}{2}.$$
(e) Second, show that for any $\delta > 0$, with probability at least $1 - \delta/2$, for $m \ge \frac{2}{\epsilon'^2}\left(\log|H| + \log\frac{2}{\delta}\right)$, the following holds for all $h \in H$:
$$d(h) - \hat{d}(h) \le \frac{\epsilon'}{2}.$$
(f) Finally, show that for any $\delta > 0$, with probability at least $1 - \delta$, for $m \ge \frac{2}{\epsilon^2(1 - 2\eta')^2}\left(\log|H| + \log\frac{2}{\delta}\right)$, the following holds for all $h \in H$ with $R(h) > \epsilon$:
$$\hat{d}(h) - \hat{d}(h^*) \ge 0.$$
(Hint: use $\hat{d}(h) - \hat{d}(h^*) = [\hat{d}(h) - d(h)] + [d(h) - d(h^*)] + [d(h^*) - \hat{d}(h^*)]$ and use the previous questions to lower bound each of these three terms.)
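
As an informal aside (not part of the textbook exercise), the identity in part (b) can be checked numerically. The following is a minimal Monte Carlo sketch on a toy finite domain; the uniform input distribution, the particular hypotheses, and the noise rate are illustrative assumptions.

```python
import random

# Toy setup (all concrete choices below are illustrative assumptions):
# inputs are integers 0..9 drawn uniformly, labels are in {0, 1}.
target = lambda x: x % 2            # target concept h*, assumed to be in H
h      = lambda x: int(x < 7)       # some other hypothesis h with R(h) > 0
eta    = 0.2                        # true noise rate, eta in (0, 1/2)

def noisy_label(x):
    """Label given by the target, flipped with probability eta (the noise model)."""
    y = target(x)
    return 1 - y if random.random() < eta else y

m  = 200_000
xs = [random.randrange(10) for _ in range(m)]

# R(h): disagreement of h with the noise-free target labels.
# d(h): disagreement of h with the noisy labels actually seen by the learner.
R_est = sum(h(x) != target(x) for x in xs) / m
d_est = sum(h(x) != noisy_label(x) for x in xs) / m

print(f"empirical d(h)           = {d_est:.4f}")
print(f"eta + (1 - 2*eta) * R(h) = {eta + (1 - 2 * eta) * R_est:.4f}")  # should match closely
```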
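For part (c), here is a hedged derivation sketch showing how (a) and (b) combine; this is informal working under the stated assumptions, not the book's official solution.

```latex
% Sketch for part (c): combine d(h) = \eta + (1 - 2\eta) R(h) from (b)
% with d(h^*) = \eta from (a). Uses the amsmath align* environment.
\begin{align*}
  d(h) - d(h^*) &= (1 - 2\eta)\,R(h) \\
                &> (1 - 2\eta)\,\epsilon
                   && \text{since } R(h) > \epsilon \text{ and } 1 - 2\eta > 0, \\
                &\ge (1 - 2\eta')\,\epsilon = \epsilon'
                   && \text{since } \eta \le \eta' < \tfrac{1}{2}.
\end{align*}
```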
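Finally, a small Python sketch of the disagreement-minimizing algorithm $L$ from part (d), together with the sample size stated in part (f). The threshold class, the input distribution, and the parameter values are assumptions made only for this demo.

```python
import math
import random

def disagreement_minimizer(H, S):
    """Sketch of algorithm L from part (d): return the hypothesis in the finite
    class H with the fewest disagreements on the (noisy) labeled sample S."""
    return min(H, key=lambda h: sum(h(x) != y for x, y in S))

def sample_size(eps, delta, eta_prime, H_size):
    """Sample size stated in part (f):
    m >= 2 / (eps^2 * (1 - 2*eta')^2) * (log|H| + log(2/delta))."""
    return math.ceil(2.0 / (eps ** 2 * (1 - 2 * eta_prime) ** 2)
                     * (math.log(H_size) + math.log(2.0 / delta)))

# Illustrative usage on a tiny threshold class over {0, ..., 10} (assumed setup).
H = [lambda x, t=t: int(x >= t) for t in range(12)]
target = H[5]
eta, eta_prime, eps, delta = 0.1, 0.15, 0.1, 0.05

m = sample_size(eps, delta, eta_prime, len(H))
S = []
for _ in range(m):
    x = random.randrange(11)
    y = target(x)
    if random.random() < eta:        # noise model: flip the label with probability eta
        y = 1 - y
    S.append((x, y))

h_S = disagreement_minimizer(H, S)
err = sum(h_S(x) != target(x) for x in range(11)) / 11   # exact R(h_S) under the uniform law
print(f"m = {m}, R(h_S) = {err:.3f}")
```

Running this typically returns an $h_S$ with error at most $\epsilon$, in line with what the bound in part (f) is designed to guarantee.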
Source: Foundations of Machine Learning, 2nd Edition, by Mehryar Mohri and Afshin Rostamizadeh (ISBN 9780262351362).