In class, we considered the logistic function output as a pseudo-probability and used it to define a likelihood L for the data given the parameters in w (for the following three questions we will assume b is included in w). Note that for binary classification, y = 0 or y = 1. The goal is to find w to maximize L for all data (x, y). Let us say we skip the logistic function and learn the two-way classifier w to maximize a sum S(y|x; w):

8. Replace B with a valid expression such that S(y|x; w) will be largest when w is correctly selected to correctly classify data from class 0 (y = 0) as w'x < 0.
B should be an algebraic expression involving some variables w, x, and/or y. For example: B = |x| or B = -10 + y.

9. Using this "loss function"/"objective function", w will be learned to maximize classification of which training data points:
(a) points closest to the separator
(b) points farthest from the separator
(c) points with the lowest magnitude |x|
(d) all points are weighted equally
(e) another answer: specify alternative answer
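For reference, below is a minimal sketch of the in-class objective these questions build on: the likelihood L formed from logistic pseudo-probabilities for binary labels y in {0, 1}, with the bias b folded into w. The NumPy usage and the toy data are assumptions added for illustration only; the sketch does not answer the blank B in question 8.

import numpy as np

def sigmoid(z):
    """Logistic function: maps a raw score w'x to a pseudo-probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def log_likelihood(w, X, y):
    """Log of the likelihood L from class for binary labels y in {0, 1}.

    Each row of X is one data point x, with a constant 1 appended so that
    the bias b is included in w, as assumed in the questions above.
    """
    p = sigmoid(X @ w)  # pseudo-probability that each point belongs to class 1
    return np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

# Hypothetical toy data: two 1-D clusters, plus a column of ones for the bias.
rng = np.random.default_rng(0)
x0 = rng.normal(-2.0, 1.0, size=20)   # class 0 points
x1 = rng.normal(+2.0, 1.0, size=20)   # class 1 points
X = np.column_stack([np.concatenate([x0, x1]), np.ones(40)])
y = np.concatenate([np.zeros(20), np.ones(20)])

# A w oriented to separate the clusters gives a higher log-likelihood
# than the flipped orientation.
print(log_likelihood(np.array([1.0, 0.0]), X, y))
print(log_likelihood(np.array([-1.0, 0.0]), X, y))

Questions 8 and 9 then ask what happens when this logistic likelihood is replaced by a plain sum of scores, and which training points dominate that alternative objective.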