Question
Problem 4. Minimize risk or maximize likelihood?

In our lecture, we argued that one way to make a good prediction $h$ is to minimize the mean absolute error associated with data set $D = \{x_1, \dots, x_n\}$:

$$R(h; D) = \frac{1}{n} \sum_{i=1}^{n} |h - x_i|.$$

We saw that the median of $x_1, \dots, x_n$ is the prediction with the smallest mean error. Your friend Max thinks that instead of minimizing the mean error, it is better to maximize the following quantity:

$$M(h) = \prod_{i=1}^{n} e^{-|h - x_i|}.$$

The above formula is written using product notation, which is similar to summation notation, except terms are multiplied and not added. For example,

$$\prod_{i=1}^{n} a_i = a_1 \cdot a_2 \cdots a_n.$$

Max's reasoning is that for some models, $e^{-|h - x_i|}$ is used to compute how likely the predicted value $h$ will appear given the observation $x_i$, hence it is called "likelihood." Then, we should attempt to maximize the chance of getting the prediction $h$, given the set of observations. In this problem, we'll see if Max has a good idea.

a) For an arbitrary fixed value of $x_1$, sketch a graph of the basic shape of the likelihood function $M(h) = e^{-|h - x_1|}$. Explain, based on the graph, why larger values of $M(h)$ correspond to better predictions $h$.

b) Informally, a minimizer of a function $f$ is an input $x_{\min}$ where $f$ achieves its minimum value. More formally, $x_{\min}$ is a minimizer of $f$ if $f(x_{\min}) \le f(x)$ for all values of $x$. In the same way, $x_{\max}$ is a maximizer of $f$ if $f(x_{\max}) \ge f(x)$ for all values of $x$. Suppose that $f$ is some unknown function which takes in a real number and outputs a real number. Suppose that $c$ is an unknown positive constant, and define the function $g(x) = e^{-c f(x)}$. Prove that if $x_{\min}$ is a minimizer of $f$, then it is also a maximizer of $g$.

c) At what value $h^*$ is $M(h)$ maximized? Is this a reasonable prediction? Discuss the pros and cons of using Max's prediction strategy, and describe scenarios where this gives a good prediction and where this gives a bad prediction, in your opinion.

Problem 5. Empirical risk with quadratic loss

During the group work session (in the last question), you have chosen a likely depiction of the empirical risk function with quadratic loss, $R_{sq}(h, D)$, for a dataset $D = \{100\text{K}, 200\text{K}, 300\text{K}, 400\text{K}\}$. Now let's prove your intuition.

a) Prove that for any data set $D = \{x_1, \dots, x_n\}$, the empirical risk

$$R_{sq}(h, D) = \frac{1}{n} \sum_{i=1}^{n} (h - x_i)^2$$

can be simplified as

$$R_{sq}(h, D) = (h - y)^2 + z.$$

Express quantities $y$ and $z$ in terms of $x_1, \dots, x_n$ and $n$. (Hint: expand the squares in $R_{sq}(h, D)$. Use induction if you find it useful.)

b) Using the above result, minimize $R_{sq}(h, D)$ (write down both $\arg\min_{h \in \mathbb{R}} R_{sq}(h, D)$ and $\min_{h \in \mathbb{R}} R_{sq}(h, D)$) without using calculus. You can use the properties of quadratic functions.

c) For an arbitrary data set $D = \{x_1, \dots, x_n\}$ ($x_i \in \mathbb{R}$ for any $i = 1, \dots, n$), is quantity $y$ generally positive, negative, or zero, and why? How about quantity $z$, and why?
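Before attempting the proofs, it can help to check both claims numerically. The following is a minimal Python sketch, not an official solution. The function names, the sample list `data`, and the search `grid` are all hypothetical choices made for illustration; only the dataset of 100K, 200K, 300K, 400K comes from Problem 5. The sketch assumes that completing the square gives $y$ as the mean of the data and $z$ as its population variance, and it only tests that assumption at a few values of $h$.

```python
import math
import statistics


def mean_absolute_error(h, data):
    """Mean absolute error R(h; D) = (1/n) * sum |h - x_i|."""
    return sum(abs(h - x) for x in data) / len(data)


def likelihood(h, data):
    """Max's quantity M(h) = product over i of exp(-|h - x_i|)."""
    return math.prod(math.exp(-abs(h - x)) for x in data)


def quadratic_risk(h, data):
    """Empirical risk with quadratic loss, (1/n) * sum (h - x_i)^2."""
    return sum((h - x) ** 2 for x in data) / len(data)


# Problem 4: hypothetical sample with an odd number of points,
# so the median (4.0) is the unique minimizer of the mean absolute error.
data = [1.0, 2.0, 4.0, 7.0, 11.0]
grid = [i / 100 for i in range(0, 1201)]  # candidate predictions 0.00 .. 12.00

best_by_risk = min(grid, key=lambda h: mean_absolute_error(h, data))
best_by_likelihood = max(grid, key=lambda h: likelihood(h, data))

print("median:", statistics.median(data))
print("grid minimizer of R:", best_by_risk)
print("grid maximizer of M:", best_by_likelihood)

# Problem 5: dataset from the problem statement.
data_sq = [100_000, 200_000, 300_000, 400_000]
y = statistics.fmean(data_sq)      # assumed value of y: the mean
z = statistics.pvariance(data_sq)  # assumed value of z: the population variance

# Check R_sq(h, D) == (h - y)^2 + z at a few arbitrary values of h.
for h in (0.0, 150_000.0, 250_000.0, 1_000_000.0):
    assert math.isclose(quadratic_risk(h, data_sq), (h - y) ** 2 + z)
```

A grid search only suggests where the optimum lies. It does not replace the argument asked for in Problem 4 b), which explains why the minimizer of the risk and the maximizer of the likelihood coincide.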