
Question

Proximal Gradient Descent

The gradient descent algorithm cannot be directly applied since the objective function is non-differentiable. Discuss why the objective function is non-differentiable.

Problem 3. Proximal Gradient Descent

Consider solving the following problem:

\min_w \; \frac{1}{2}\|Xw - y\|_2^2 + \lambda\|w\|_1,

where X \in \mathbb{R}^{n \times d} is the feature matrix (each row is a feature vector), y \in \mathbb{R}^n is the label vector, \|w\|_1 = \sum_{i=1}^d |w_i|, and \lambda > 0 is a constant to balance loss and regularization. This is known as the Lasso regression problem, and our goal is to derive the "proximal gradient method" for solving it.

(10 pt) The gradient descent algorithm cannot be directly applied since the objective function is non-differentiable. Discuss why the objective function is non-differentiable.

(30 pt) In the class we showed that gradient descent is based on the idea of function approximation. To form an approximation for a non-differentiable function, we split it into a differentiable part and a non-differentiable part. Let g(w) = \frac{1}{2}\|Xw - y\|_2^2. As discussed in the gradient descent lecture, we approximate g(w) around the current iterate w^t by

g(w) \approx g(w^t) + \nabla g(w^t)^\top (w - w^t) + \frac{1}{2\eta}\|w - w^t\|_2^2.

In each iteration of proximal gradient descent, we obtain the next iterate w^{t+1} by minimizing the following approximation function:

w^{t+1} = \arg\min_w \; g(w^t) + \nabla g(w^t)^\top (w - w^t) + \frac{1}{2\eta}\|w - w^t\|_2^2 + \lambda\|w\|_1.

Derive the closed-form solution for w^{t+1} given w^t, \nabla g(w^t), \eta, and \lambda. What is the time complexity of one proximal gradient descent iteration?
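For reference, one standard way to see the non-differentiability, sketched here since the expert solution on this page is hidden (this is an illustration, not the verified answer): the quadratic loss \frac{1}{2}\|Xw - y\|_2^2 is smooth, but the regularizer \|w\|_1 = \sum_i |w_i| has a kink wherever a coordinate is zero. At w_i = 0 the one-sided derivatives of |w_i| disagree,

\lim_{h \to 0^-} \frac{|h| - |0|}{h} = -1 \neq +1 = \lim_{h \to 0^+} \frac{|h| - |0|}{h},

so the objective has no gradient at any point where some coordinate w_i = 0, and gradient descent cannot be applied directly.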
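Similarly, a sketch of how the closed form typically falls out under the approximation above (this is the usual soft-thresholding derivation; the instructor's intended steps may differ). Dropping terms that do not depend on w and completing the square, the subproblem becomes

w^{t+1} = \arg\min_w \; \frac{1}{2\eta}\big\|w - \big(w^t - \eta \nabla g(w^t)\big)\big\|_2^2 + \lambda\|w\|_1.

Writing z = w^t - \eta \nabla g(w^t), the objective separates across coordinates, and each one-dimensional problem \min_{w_i} \frac{1}{2\eta}(w_i - z_i)^2 + \lambda|w_i| is solved by the soft-thresholding operator:

w_i^{t+1} = S_{\eta\lambda}(z_i) = \operatorname{sign}(z_i)\,\max(|z_i| - \eta\lambda,\, 0).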
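Below is a runnable Python sketch of one such iteration, assuming the soft-thresholding update above; the variable names (X, y, w, eta, lam) are illustrative, not from the original post. The dominant cost is the two matrix-vector products in the gradient, so one iteration runs in O(nd) time:

import numpy as np

def soft_threshold(z, tau):
    # Coordinate-wise soft-thresholding: sign(z) * max(|z| - tau, 0)
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def prox_grad_step(X, y, w, eta, lam):
    # Gradient of g(w) = 0.5 * ||Xw - y||^2 is X^T (Xw - y):
    # two matrix-vector products, each O(nd)
    grad = X.T @ (X @ w - y)
    # Gradient step on the smooth part, then prox of the l1 term, each O(d)
    return soft_threshold(w - eta * grad, eta * lam)

# Tiny usage example on random data
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 20))
y = rng.standard_normal(100)
w = np.zeros(20)
for _ in range(200):
    w = prox_grad_step(X, y, w, eta=1e-3, lam=0.1)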
