Question
5. In this problem we will show that the existence of an efficient mistake-bounded learner for a class C implies an efficient PAC learner for C. Concretely, let C be a function class with domain X = {-1,1}^n and binary labels Y = {-1,1}. Assume that C can be learned by an algorithm/learner A with some mistake bound t. You may assume you know the value of t. You may also assume that at each iteration A runs in time polynomial in n, and that A only updates its state when it gets an example wrong.

The concrete goal of this problem is to create a PAC-learning algorithm, B, that can PAC-learn the concept class C with respect to an arbitrary distribution D over {-1,1}^n, using algorithm A as a subroutine. In order to prove that learner B can PAC-learn concept class C, we must show that there exists a finite number of examples, m, that we can draw from D such that B produces a hypothesis whose true error is more than ε with probability at most δ.

First, fix some distribution D on X, and assume that the examples are labeled by an unknown c ∈ C. Additionally, for a hypothesis (i.e. function) h : X → Y, let err(h) = P_{x~D}[h(x) ≠ c(x)]. Formally, we will need to bound m such that the following condition holds:

    ∀ ε, δ ∈ [0, 1], ∃ m ∈ ℕ :  P_{x~D}[ err(B({x_i}_{i=1}^m)) > ε ] ≤ δ,        (1)

equivalently, P[err(h) ≤ ε] ≥ 1 − δ, which makes the connection to the definition of PAC-learning discussed in lecture explicit.

(a) Fix a single arbitrary hypothesis h' : X → Y produced by A and determine a lower bound on the number of examples, k, such that P[err(h') > ε and h' correctly classifies all k examples drawn i.i.d. from D] is bounded by δ'.

(b) However, our bound must apply to every h that our algorithm B could output for an arbitrary distribution D over examples. With this in mind, how large should m be so that we can bound all hypotheses that could be output? Recall that algorithm B will not know the mistake bound during its execution.

(c) Put everything together and fully describe (with proof) a PAC learner that is able to output a hypothesis with true error at most ε with probability at least 1 − δ, given a mistake-bounded learner A. To do this you should first describe your pseudocode for algorithm B, which will use A as a subroutine (no need for minute details or code; broad strokes will suffice). Then prove that there exists a finite number of examples m for B to PAC-learn C for all values of ε and δ, by lower bounding m by a function of ε, δ, and t (i.e. finding a finite lower bound for m such that the PAC-learning requirements in (1) are satisfied).
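As a point of reference, one standard calculation for parts (a) and (b) is sketched below; it is not necessarily the intended graded solution, and it relies on the fact, stated in the problem, that A updates its state only on mistakes (so any run of A passes through at most t + 1 distinct hypotheses).

```latex
% Part (a): if err(h') > \epsilon, then h' labels a single fresh example from D
% correctly with probability less than 1 - \epsilon, so it survives k i.i.d.
% examples without a mistake with probability at most
%   (1 - \epsilon)^k \le e^{-\epsilon k}.
% Requiring e^{-\epsilon k} \le \delta' gives
\[
  k \;\ge\; \frac{1}{\epsilon}\,\ln\frac{1}{\delta'} .
\]
% Part (b): A makes at most t mistakes, hence holds at most t + 1 distinct
% hypotheses during any run. Taking \delta' = \delta/(t+1) and union-bounding
% over these hypotheses gives
\[
  k \;\ge\; \frac{1}{\epsilon}\,\ln\frac{t+1}{\delta},
  \qquad
  m \;=\; (t+1)\,k \;\ge\; \frac{t+1}{\epsilon}\,\ln\frac{t+1}{\delta}.
\]
```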
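For part (c), here is a minimal Python-style sketch of the standard "longest survivor" conversion: feed a stream of i.i.d. examples to A and output the first hypothesis that survives k consecutive examples without an error. The interface names (A.predict, A.update, draw_example, label) are hypothetical placeholders for whatever access to A and to D your write-up assumes.

```python
import math

def pac_learner_B(A, draw_example, label, epsilon, delta, t):
    """Convert a mistake-bounded learner A into a PAC learner (sketch).

    A            -- learner with A.predict(x) and A.update(x, y); it changes
                    state only when its prediction was wrong (assumed interface)
    draw_example -- draws one x ~ D
    label        -- returns the true label c(x)
    epsilon, delta -- PAC accuracy / confidence parameters
    t            -- mistake bound of A
    """
    # Parts (a)/(b): a hypothesis with error > epsilon survives k consecutive
    # fresh examples with probability at most delta / (t + 1).
    k = math.ceil((1.0 / epsilon) * math.log((t + 1) / delta))

    streak = 0                    # consecutive examples the current hypothesis got right
    for _ in range((t + 1) * k):  # m = (t + 1) * k examples always suffice
        x = draw_example()
        y = label(x)
        if A.predict(x) == y:
            streak += 1
            if streak == k:       # current hypothesis survived k fresh examples
                return A          # output A's current hypothesis
        else:
            A.update(x, y)        # A updates only on mistakes (at most t times)
            streak = 0            # start validating the new hypothesis
    return A                      # unreachable if A truly respects mistake bound t
```

By a pigeonhole argument, at most t mistakes can split the stream into at most t + 1 runs of correct predictions, so some run reaches length k within m = (t + 1)·k examples; B therefore halts after at most m = (t + 1)·⌈(1/ε)·ln((t + 1)/δ)⌉ draws and stays polynomial in n, t, 1/ε, and ln(1/δ) whenever each step of A is polynomial in n.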