Question
For binary classification, we assume the training data set matrix D=211421511151, where each sample $x_p \in \mathbb{R}^{2 \times 1}$ is given by the first 2 rows of a column and its label $y_p \in \mathbb{R}^{1 \times 1}$ by the last row.

The model parameter vector is defined as $\hat{w} = [w_1, w_2, b]^T = [w; b]$.

We apply the hinge loss with ridge regularization on the feature-touching weights $w$ as the training objective function for the binary classifier:

$\text{minimize} \quad L(\hat{w}) = \frac{1}{P} \sum_{p=1}^{P} \max(0, 1 - y_p \hat{w}^T \hat{x}_p) + \frac{\lambda}{2} \|w\|_2^2$

a) Write down all the sample-label pairs $\{(x_p, y_p)\}$ in the data matrix D numerically, in the format $(x_1, y_1), \ldots, (x_P, y_P)$.

b) Apply the PRP+ conjugate gradient algorithm to update $\hat{w}^0$ to $\hat{w}^2$ by minimizing the regularized hinge loss $L(\hat{w})$. Use a constant step size parameter of 0.1 in each iteration.

c) Compute the average hinge loss, defined by $L(\hat{w}) = \frac{1}{P} \sum_{p=1}^{P} \max(0, 1 - y_p \hat{w}^T \hat{x}_p)$, with the trained model $\hat{w}^2$ over the training data set.

d) Write down the decision boundary with respect to $\hat{w}^2$ and $\hat{x}$. What is the position of the decision boundary, i.e., the perpendicular distance of the decision boundary from the origin? Calculate the numerical result.

e) What are the predicted results for all the samples in the training data set using $\hat{w}^2$? Which samples are misclassified?
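The procedure in parts b) to e) can be sketched in plain Python. This is only an illustration under stated assumptions, not the worked answer: the entries of D are not legible above, so the three samples below are hypothetical, as are the regularization weight `lam = 1.0` and the zero starting vector. The sketch also assumes $\hat{x}_p = [x_p; 1]$, takes 0 as the subgradient at the hinge kink, and uses the usual PRP+ rule $\beta = \max(0, g_{k+1}^T (g_{k+1} - g_k) / \|g_k\|_2^2)$ with the fixed step size 0.1 in place of a line search.

```python
import math

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def augment(x):
    """Append a constant 1 so that w_hat^T x_hat = w^T x + b."""
    return list(x) + [1.0]

def average_hinge(w_hat, samples):
    """Average hinge loss without the regularization term (part c)."""
    total = sum(max(0.0, 1.0 - y * dot(w_hat, augment(x))) for x, y in samples)
    return total / len(samples)

def subgradient(w_hat, samples, lam):
    """Subgradient of average hinge loss + (lam/2)*||w||^2; b is not regularized."""
    g = [lam * v for v in w_hat[:-1]] + [0.0]
    for x, y in samples:
        xh = augment(x)
        if 1.0 - y * dot(w_hat, xh) > 0.0:  # sample violates the margin
            for i, xi in enumerate(xh):
                g[i] -= y * xi / len(samples)
    return g

def prp_plus_cg(w_hat, samples, lam, step, iters):
    """PRP+ conjugate gradient with a constant step size (no line search)."""
    g = subgradient(w_hat, samples, lam)
    d = [-v for v in g]  # first direction is steepest descent
    for _ in range(iters):
        w_hat = [w + step * di for w, di in zip(w_hat, d)]
        g_new = subgradient(w_hat, samples, lam)
        denom = dot(g, g)
        diff = [a - b for a, b in zip(g_new, g)]
        beta = max(0.0, dot(g_new, diff) / denom) if denom > 0.0 else 0.0
        d = [-gn + beta * di for gn, di in zip(g_new, d)]
        g = g_new
    return w_hat

def boundary_distance(w_hat):
    """Perpendicular distance of the line w^T x + b = 0 from the origin (part d)."""
    return abs(w_hat[-1]) / math.sqrt(dot(w_hat[:-1], w_hat[:-1]))

def predict(w_hat, samples):
    """Predicted labels sign(w_hat^T x_hat) for every sample (part e)."""
    return [1 if dot(w_hat, augment(x)) >= 0.0 else -1 for x, _ in samples]

# Hypothetical data, NOT the matrix D from the question.
samples = [((2.0, 0.0), 1), ((3.0, 0.0), 1), ((0.0, 0.0), -1)]
w2 = prp_plus_cg([0.0, 0.0, 0.0], samples, lam=1.0, step=0.1, iters=2)
print("w_hat^2:", w2)
print("average hinge loss:", average_hinge(w2, samples))
print("distance from origin:", boundary_distance(w2))
print("predictions:", predict(w2, samples))
```

With the actual columns of D substituted for `samples` and the intended regularization weight for `lam`, the same calls give the quantities asked for in b) through e); a sample is misclassified wherever its entry in `predict` differs from its label.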