
Question

Q1. [5 points] The optimization problem of training logistic regression for the binary classification problem, i.e., $Y = \{0, 1\}$, uses the cross-entropy (CE) loss, which is defined as
$$w^*_{\mathrm{CE}} = \arg\min_w \ell_{\mathrm{CE}}(w) = \arg\min_w -\frac{1}{n} \sum_{i=1}^{n} \left[ y_i \log \hat{y}_i + (1 - y_i) \log(1 - \hat{y}_i) \right],$$
where $\hat{y} = \frac{1}{1 + \exp(-w^T x)}$.
One may believe that we could have used mean squared error (MSE) as the loss function instead of the CE loss. In this case, the optimization problem for the logistic regression can be described as
$$w^*_{\mathrm{MSE}} = \arg\min_w \ell_{\mathrm{MSE}}(w) = \arg\min_w \frac{1}{n} \sum_{i=1}^{n} \left\| y_i - \hat{y}_i \right\|_2^2.$$
Here, $\hat{y} = \frac{1}{1 + \exp(-w^T x)}$.
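As a quick numerical sanity check, the two objectives above can be sketched in Python with NumPy. The toy data `X` and `y` below are made up for illustration and are not part of the problem; at $w = 0$ every prediction is $\hat{y} = 0.5$, so the losses have easily verified closed-form values.

```python
import numpy as np

def sigmoid(z):
    """Logistic function: 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def ce_loss(w, X, y):
    """Cross-entropy loss l_CE(w) from the problem statement."""
    y_hat = sigmoid(X @ w)
    return -np.mean(y * np.log(y_hat) + (1.0 - y) * np.log(1.0 - y_hat))

def mse_loss(w, X, y):
    """Mean-squared-error loss l_MSE(w) from the problem statement."""
    y_hat = sigmoid(X @ w)
    return np.mean((y - y_hat) ** 2)

# Made-up toy data: first column is a bias feature.
X = np.array([[1.0, 2.0], [1.0, -1.0], [1.0, 0.5]])
y = np.array([1.0, 0.0, 1.0])

# At w = 0, y_hat = 0.5 for every sample, so l_CE = log 2 and l_MSE = 0.25.
w0 = np.zeros(2)
print(ce_loss(w0, X, y))   # ~0.6931 (= log 2)
print(mse_loss(w0, X, y))  # 0.25
```

This makes it easy to check that your analytic derivatives in (a) and (b) match finite-difference estimates of these functions.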
The goal of this exercise is to compare the CE loss and the MSE loss when training a logistic regression model and to understand why we should use the CE over the MSE.
(a) [2 points] Compute the first-order and second-order derivatives of $\ell_{\mathrm{CE}}(w)$.
(b) [2 points] Compute the first-order and second-order derivatives of $\ell_{\mathrm{MSE}}(w)$.
(c) [1 point] Using the second-order derivatives computed in (a) and (b), explain why we should prefer the cross-entropy loss over the MSE loss for training a logistic regression model. (Hint: use the second derivative and check convexity.)
(d) [1 point] Given the gradients computed in (a), derive the GD algorithm for the optimization problems in equation (1).
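For part (d), a minimal gradient-descent sketch for the CE objective follows. It uses the standard CE gradient $\nabla \ell_{\mathrm{CE}}(w) = \frac{1}{n} X^T (\hat{y} - y)$, which is the result one obtains in part (a) and should be checked against your own derivation. The toy data, learning rate, and iteration count are made-up illustration choices; swapping in the MSE gradient lets you compare how the two losses behave under the same loop.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grad_ce(w, X, y):
    # Standard CE gradient: (1/n) X^T (y_hat - y); verify against part (a).
    return X.T @ (sigmoid(X @ w) - y) / len(y)

def gradient_descent(X, y, lr=0.5, iters=2000):
    """Plain GD: repeat w <- w - lr * grad l_CE(w), starting from w = 0."""
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        w = w - lr * grad_ce(w, X, y)
    return w

# Made-up linearly separable toy data (bias feature in column 0).
X = np.array([[1.0, 2.0], [1.0, -2.0], [1.0, 1.5], [1.0, -1.0]])
y = np.array([1.0, 0.0, 1.0, 0.0])

w = gradient_descent(X, y)
preds = (sigmoid(X @ w) > 0.5).astype(float)
# Because l_CE is convex in w, GD reliably finds weights that
# classify this separable toy data correctly.
```

Note that on separable data the CE-optimal $\|w\|$ grows without bound, so in practice one stops after a fixed number of iterations or adds regularization; the predicted labels stabilize long before that.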
