Question:
It is possible to define a regularizer to minimize ∑_e (loss(Ŷ(e), Y(e)) + λ * regularizer(Ŷ)) rather than formula (7.5) (page 303). How is this different from the existing regularizer? [Hint: Think about how this affects multiple datasets or cross validation.]
Suppose λ is set by k-fold cross validation, and then the model is learned for the whole dataset. How would the algorithm be different for the original way(s) of defining a regularizer and for this alternative way? [Hint: A different number of examples is used for regularization during cross validation than in the full dataset; does this matter?] Which works better in practice?
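The hint can be made concrete with a small numerical sketch (not from the book; all names such as `w`, `X`, `y`, and `lam` are illustrative assumptions). Moving the regularizer inside the sum over examples is equivalent to scaling λ by the number of examples n, so a λ tuned on (k−1)/k of the data implies a different effective penalty on the full dataset under the two formulations:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20                        # number of training examples
X = rng.normal(size=(n, 3))   # features (illustrative)
y = rng.normal(size=n)        # targets (illustrative)
w = rng.normal(size=3)        # some parameter vector
lam = 0.1                     # regularization weight

def loss(w):
    # sum of squared errors over the examples
    return np.sum((X @ w - y) ** 2)

def reg(w):
    # L2 regularizer on the parameters
    return np.sum(w ** 2)

# Formula (7.5)-style objective: regularizer added once per dataset.
obj_original = loss(w) + lam * reg(w)

# Alternative objective: regularizer added inside the sum,
# once per example, i.e. n copies of lam * reg(w).
obj_alternative = loss(w) + n * lam * reg(w)

# The alternative is identical to the original with an
# effective regularization weight of n * lam.
assert np.isclose(obj_alternative, loss(w) + (n * lam) * reg(w))
```

Under the original form, a λ chosen with k-fold cross validation (on (k−1)n/k examples) applies the same absolute penalty on the full dataset; under the alternative form, the penalty automatically grows with the dataset size, so the relative strength of the regularizer per example stays fixed.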
Related Book:
Artificial Intelligence: Foundations of Computational Agents, 3rd Edition
David L. Poole and Alan K. Mackworth
ISBN: 9781009258197