Question

5. In this problem we will show that the existence of an efficient mistake-bounded learner for a class C implies an efficient PAC learner for C. Concretely, let C be a function class with domain $X = \{-1,1\}^n$ and binary labels $Y = \{-1,1\}$. Assume that C can be learned by algorithm/learner A with some mistake bound t. You may assume you know the value t. You may also assume that at each iteration, A runs in time polynomial in n and that A only updates its state when it gets an example wrong.

The concrete goal of this problem is to create a PAC-learning algorithm, B, that can PAC-learn concept class C with respect to an arbitrary distribution D over $\{-1,1\}^n$ using algorithm A as a subroutine. In order to prove that learner B can PAC-learn concept class C, we must show that there exists a finite number of examples, m, that we can draw from D such that B produces a hypothesis whose true error is more than $\epsilon$ with probability at most $\delta$.

First, fix some distribution D on X, and assume that the examples are labeled by an unknown $c \in C$. Additionally, for a hypothesis (i.e., function) $h : X \to Y$, let $\mathrm{err}(h) = \Pr_{x \sim D}[h(x) \neq c(x)]$. Formally, we will need to bound m such that the following condition holds:

$$\forall\, \delta, \epsilon \in [0,1],\ \exists\, m \in \mathbb{N} \ \text{such that}\ \Pr_{x \sim D}\big[\mathrm{err}\big(B(\{x\}^m)\big) > \epsilon\big] \le \delta \qquad (1)$$

where $B(\{x\}^m)$ denotes a hypothesis produced by B from m random draws of x from an arbitrary distribution D. To find this m, we will first decompose it into blocks of examples of size k and make use of results based on a single block to find the bound necessary for m that satisfies condition (1).

Note: Using the identity $\Pr[\mathrm{err}(h) > \epsilon] + \Pr[\mathrm{err}(h) \le \epsilon] = 1$, condition (1) is equivalent to $\Pr[\mathrm{err}(h) \le \epsilon] \ge 1 - \delta$, which makes the connection to the definition of PAC-learning discussed in lecture explicit.

(a) Fix a single arbitrary hypothesis $h' : X \to Y$ produced by A and determine a lower bound on the number of examples, k, such that the probability that a hypothesis $h'$ with $\mathrm{err}(h') > \epsilon$ classifies all k examples of a block correctly is at most $\delta'$.

(b) In part (a) we bounded the failure probability of a single hypothesis $h'$ by $\delta'$. However, our bound must apply to every h that our algorithm B could output for an arbitrary distribution D over examples. With this in mind, how large should m be so that we can bound all hypotheses that could be output? Recall that algorithm B will not know the mistake bound during its execution.

(c) Put everything together and fully describe (with proof) a PAC learner that is able to output a hypothesis with true error at most $\epsilon$ with probability at least $1 - \delta$, given a mistake-bounded learner A. To do this you should first describe your pseudocode for algorithm B, which will use A as a subroutine (no need for minute details or code; broad strokes will suffice). Then, prove there exists a finite number of m examples for B to PAC-learn C for all values of $\delta$ and $\epsilon$ by lower-bounding m by a function of $\epsilon$, $\delta$, and t (i.e., finding a finite lower bound for m such that the PAC-learning requirements in (1) are satisfied).
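One standard route to the bounds asked for in parts (a) and (b) is sketched below. The confidence split $\delta' = \delta/(t+1)$ and the count of at most $t+1$ candidate hypotheses are assumptions of this sketch, not facts stated in the problem.

```latex
\documentclass{article}
\usepackage{amsmath,amssymb}
\begin{document}
% (a) If err(h') > eps, each i.i.d. draw from D is labeled correctly by h'
%     with probability less than 1 - eps, so a whole block of k draws is
%     labeled correctly with probability at most (1-eps)^k <= exp(-eps k).
\begin{align*}
\Pr\big[\,h' \text{ correct on all } k \text{ draws}\,\big]
  \le (1-\epsilon)^k \le e^{-\epsilon k} \le \delta'
  \quad\text{whenever}\quad k \ge \frac{1}{\epsilon}\ln\frac{1}{\delta'}.
\end{align*}
% (b) A updates its state only when it errs and errs at most t times, so a
%     run of B encounters at most t+1 distinct hypotheses. Choosing
%     delta' = delta/(t+1) and taking a union bound over all of them suggests
\begin{align*}
m = (t+1)\,k \;\ge\; \frac{t+1}{\epsilon}\,\ln\frac{t+1}{\delta}.
\end{align*}
\end{document}
```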

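For part (c), broad-strokes pseudocode is all the problem asks for, but a concrete sketch can make the block structure explicit. The Python below is a minimal illustration: the function name `pac_learner_B`, the interface for A (`predict`/`update` methods), and the example oracle `draw_example` are hypothetical names invented here, and t is passed in explicitly, using the problem's earlier allowance that its value is known.

```python
import math

def pac_learner_B(A, draw_example, epsilon, delta, t):
    """Sketch of a PAC wrapper B around a mistake-bounded online learner A.

    Hypothetical interface, not given in the problem:
      A.predict(x)   -- label from A's current hypothesis
      A.update(x, y) -- state update (A only changes after a mistake)
      draw_example() -- one labeled pair (x, c(x)) with x drawn from D
    """
    # Block size from part (a), with the split delta' = delta / (t + 1).
    k = math.ceil((1.0 / epsilon) * math.log((t + 1) / delta))

    # A makes at most t mistakes, so among t + 1 blocks of k fresh
    # examples at least one block is processed mistake-free (pigeonhole);
    # B halts and outputs the hypothesis that survived that block.
    for _ in range(t + 1):
        block_clean = True
        for _ in range(k):
            x, y = draw_example()
            if A.predict(x) != y:
                A.update(x, y)       # A only updates when it errs
                block_clean = False
        if block_clean:
            return A.predict         # hypothesis that survived a block

    return A.predict  # unreachable if A's mistake bound t really holds
```

Any hypothesis with true error above $\epsilon$ survives a clean block with probability at most $(1-\epsilon)^k \le \delta/(t+1)$, and at most $t+1$ hypotheses ever begin a block, so a union bound keeps the total failure probability below $\delta$, matching condition (1) with $m = (t+1)k$ total draws.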

