Question
The Occam's razor result proved in class only applies to finite $H$. Suppose now that $H$ is discrete, i.e., either finite or countably infinite. Let $g : H \to (0, 1]$ be any function such that
$$\sum_{h \in H} g(h) \le 1.$$
Although $g$ may look a bit like a probability distribution, you should not think of it as one. It is just a function (any function) whose positive values happen to add up to a number not bigger than one.
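To make the condition on the weights concrete, here is a minimal sketch that checks one hypothetical choice: weight $2^{-i}$ for the $i$-th hypothesis in some enumeration $h_1, h_2, \dots$ of the hypothesis class. The enumeration and the names `g_geometric` and `partial_sum` are assumptions made for illustration only; they are not part of the problem statement.

```python
from fractions import Fraction


def g_geometric(i: int) -> Fraction:
    """Hypothetical weight 2^-i for the i-th hypothesis (i = 1, 2, ...)."""
    if i < 1:
        raise ValueError("hypothesis index must be >= 1")
    return Fraction(1, 2 ** i)


def partial_sum(n: int) -> Fraction:
    """Exact sum of the weights of the first n hypotheses (equals 1 - 2^-n)."""
    return sum((g_geometric(i) for i in range(1, n + 1)), Fraction(0))


if __name__ == "__main__":
    for n in (1, 5, 20):
        print(n, partial_sum(n), float(partial_sum(n)))
```

Every partial sum stays strictly below one, so this choice satisfies the constraint even when the class is countably infinite. Exact fractions are used so the check is not affected by floating-point rounding.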
Let m be the number of given examples (each chosen at random, as usual, from some unknown distribution D).
(a) [10] Prove that, with probability at least $1 - \delta$,
$$\mathrm{err}_D(h) \le \frac{\ln(1/g(h)) + \ln(1/\delta)}{m}$$
for all $h \in H$ that are consistent with the observed data. As usual, $\mathrm{err}_D(h) = \Pr_{x \sim D}[h(x) \ne c(x)]$, and $c$ is the target concept.
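To get a feel for how the quantity in part (a) behaves, the following sketch simply evaluates $(\ln(1/g(h)) + \ln(1/\delta))/m$ for given inputs. It is a numerical illustration only, not a proof; the function name `occam_bound` and its parameter names are hypothetical.

```python
import math


def occam_bound(g_h: float, delta: float, m: int) -> float:
    """Evaluate (ln(1/g(h)) + ln(1/delta)) / m for one hypothesis h."""
    if not 0.0 < g_h <= 1.0:
        raise ValueError("g(h) must lie in (0, 1]")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    if m <= 0:
        raise ValueError("m must be a positive number of examples")
    return (math.log(1.0 / g_h) + math.log(1.0 / delta)) / m


if __name__ == "__main__":
    # Compare a hypothesis given a large weight with one given a tiny weight.
    for g_h in (0.5, 1e-6):
        print(g_h, occam_bound(g_h, delta=0.05, m=1000))
```

For fixed $\delta$ and $m$, a hypothesis assigned a larger weight gets a smaller value, and the value shrinks as $m$ grows.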
(b) [10] Suppose hypotheses in $H$ are represented by bit strings and that $|h|$ denotes the number of bits needed to represent $h$. Show how to choose $g$ to prove that
$$\mathrm{err}_D(h) \le O\!\left(\frac{|h| + \ln(1/\delta)}{m}\right)$$
for all $h \in H$ that are consistent with the observed data (with probability at least $1 - \delta$). Give explicit constants (in other words, give a bound that does not use $O(\cdot)$ notation).
(c) How does the bound in (b) reflect the intuition that simpler hypotheses should be preferred to more complex ones? How does the bound in (a) reflect the intuition that prior knowledge helps learning?
Here is a picture of the problem, just in case this is not clear.
[Image: the problem statement given above]