Question
Consider the error function $E(w_1, w_2) = 0.05 + \frac{(w_1 - 3)^2}{4} + \frac{(w_2 - 4)^2}{9} - \frac{(w_1 - 3)(w_2 - 4)}{6}$. Different variants of the gradient descent algorithm can be used to minimize this error function w.r.t. $(w_1, w_2)$. Assume $(w_1, w_2) = (1, 1)$ at time $(t-1)$ and, after the update, $(w_1, w_2) = (1.5, 2.0)$ at time $(t)$. Assume $\alpha = 1.5$, $\beta = 0.6$, $\eta = 0.3$.
1. Compute the value of $(w_1, w_2)$ that minimizes the error. Compute the minimum possible value of the error.
2. What will be the value of $(w_1, w_2)$ at time $(t + 1)$ if standard gradient descent is used?
3. What will be the value of $(w_1, w_2)$ at time $(t + 1)$ if momentum is used?
4. What will be the value of $(w_1, w_2)$ at time $(t + 1)$ if RMSProp is used?
5. What will be the value of $(w_1, w_2)$ at time $(t + 1)$ if Adam is used?
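As a rough illustration of the four update rules named in parts 2 to 5, here is a minimal Python sketch. Several things in it are assumptions rather than part of the question: the error function is taken to be E = 0.05 + (w1 - 3)^2/4 + (w2 - 4)^2/9 - (w1 - 3)(w2 - 4)/6, the function names are hypothetical, the learning rate is taken to be 0.3 and the momentum/decay coefficient 0.6, and the RMSProp and Adam running averages are started from zero because the question does not state their values at time t. The role of the remaining constant 1.5 is not specified in the question, so the sketch does not use it. Course conventions for these rules differ, so the numbers it produces should be checked against the definitions used in class.

```python
import math

def grad_E(w1, w2):
    # Gradient of the ASSUMED error function
    # E = 0.05 + (w1-3)^2/4 + (w2-4)^2/9 - (w1-3)(w2-4)/6
    d1, d2 = w1 - 3.0, w2 - 4.0
    return (d1 / 2.0 - d2 / 6.0, 2.0 * d2 / 9.0 - d1 / 6.0)

def gd_step(w, g, lr):
    # Standard gradient descent: w <- w - lr * g
    return tuple(wi - lr * gi for wi, gi in zip(w, g))

def momentum_step(w, g, prev_delta, lr, beta):
    # Classical momentum: delta <- beta * previous delta - lr * g
    delta = tuple(beta * di - lr * gi for di, gi in zip(prev_delta, g))
    return tuple(wi + di for wi, di in zip(w, delta)), delta

def rmsprop_step(w, g, s, lr, beta, eps=1e-8):
    # Running average of squared gradients scales each coordinate's step
    s = tuple(beta * si + (1.0 - beta) * gi * gi for si, gi in zip(s, g))
    w_new = tuple(wi - lr * gi / (math.sqrt(si) + eps)
                  for wi, gi, si in zip(w, g, s))
    return w_new, s

def adam_step(w, g, m, v, t, lr, beta1, beta2, eps=1e-8):
    # First and second moment estimates with bias correction (step count t >= 1)
    m = tuple(beta1 * mi + (1.0 - beta1) * gi for mi, gi in zip(m, g))
    v = tuple(beta2 * vi + (1.0 - beta2) * gi * gi for vi, gi in zip(v, g))
    m_hat = tuple(mi / (1.0 - beta1 ** t) for mi in m)
    v_hat = tuple(vi / (1.0 - beta2 ** t) for vi in v)
    w_new = tuple(wi - lr * mh / (math.sqrt(vh) + eps)
                  for wi, mh, vh in zip(w, m_hat, v_hat))
    return w_new, m, v

# Values taken from the question
w_t = (1.5, 2.0)           # weights at time t
prev_delta = (0.5, 1.0)    # w(t) - w(t-1) = (1.5, 2.0) - (1, 1)
g = grad_E(*w_t)

# Assumed mapping of the question's constants: lr = 0.3, beta = 0.6
print("gradient descent:", gd_step(w_t, g, lr=0.3))
print("momentum:", momentum_step(w_t, g, prev_delta, lr=0.3, beta=0.6)[0])
```

The momentum step reuses the previous update w(t) - w(t-1), which is why the question supplies the weights at both time (t-1) and time (t).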
Step by Step Solution
Step: 1
To find the minimum, set the partial derivatives of $E(w_1, w_2)$ with respect to $w_1$ and $w_2$ to zero and s...
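The stationary-point condition described in this step can be checked numerically. The sketch below assumes the error function is the quadratic E = 0.05 + (w1 - 3)^2/4 + (w2 - 4)^2/9 ± (w1 - 3)(w2 - 4)/6; the function names and the `cross_sign` parameter are hypothetical and exist only to show that the result does not depend on the sign of the cross term. For a quadratic of this shape the gradient is H (w - c) with c = (3, 4), so setting it to zero is a 2x2 linear system, solved here with Cramer's rule.

```python
def E(w1, w2, cross_sign=-1.0):
    # ASSUMED error function; cross_sign selects the sign of the cross term
    d1, d2 = w1 - 3.0, w2 - 4.0
    return 0.05 + d1 * d1 / 4.0 + d2 * d2 / 9.0 + cross_sign * d1 * d2 / 6.0

def stationary_point(cross_sign=-1.0):
    # Gradient is H @ (w - c) with
    #   H = [[1/2, s/6], [s/6, 2/9]],  c = (3, 4),  s = cross_sign.
    # Setting it to zero means solving H w = H c.
    a, b, d = 0.5, cross_sign / 6.0, 2.0 / 9.0
    det = a * d - b * b                 # determinant of H
    r1 = a * 3.0 + b * 4.0              # first component of H c
    r2 = b * 3.0 + d * 4.0              # second component of H c
    w1 = (r1 * d - b * r2) / det        # Cramer's rule
    w2 = (a * r2 - b * r1) / det
    return w1, w2, det

w1_star, w2_star, det_H = stationary_point()
print("stationary point:", (w1_star, w2_star))
print("error there:", E(w1_star, w2_star))
# det(H) > 0 together with H[0][0] > 0 means H is positive definite,
# so the stationary point is a minimum under the assumed form.
print("det(H):", det_H)
```

Under this assumed form the squared terms and the cross term all vanish at the stationary point, so the error there is just the constant term.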