Question
In Stochastic Gradient Descent (SGD) with batch size and momentum, the step taken at time is:
where is the batch gradient and is the gradient of the th datapoint. Let's now expand the recursion of momentum:
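As a sketch only, one common way to write the momentum step and its expanded recursion is given here. The notation is assumed and not confirmed by the question as shown: learning rate $\eta$, batch size $B$, momentum $m$, step $\Delta w_t$ with $\Delta w_{-1} = 0$, batch gradient $g_t$, and per-datapoint gradient $g^{(i)}$ over the batch $\mathcal{B}_t$.

```latex
\begin{align}
\Delta w_t &= m\,\Delta w_{t-1} - \eta\, g_t,
  \qquad g_t = \frac{1}{B}\sum_{i \in \mathcal{B}_t} g^{(i)} \\
\Delta w_t &= -\eta \sum_{k=0}^{t} m^{k}\, g_{t-k}
\end{align}
```

The second line follows by substituting the first into itself repeatedly: each older batch gradient picks up one extra factor of $m$ per step.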
If we assume that the gradients do not change significantly from one SGD step to the next, we can further simplify this expression to:
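A minimal numerical sketch of this simplification, assuming the heavy-ball form `step_t = m * step_{t-1} - lr * g_t` (an assumption, since the question's own formula is not shown). With gradients held constant, the geometric series sums to `1 / (1 - m)`, so the step should approach `-lr * g / (1 - m)`. The function names and the values of `lr`, `m`, and `grads` are hypothetical, chosen for illustration.

```python
def momentum_steps(grads, lr, m):
    """Return the step taken at each time for heavy-ball momentum.

    Assumed update (not confirmed by the source):
        step_t = m * step_{t-1} - lr * g_t, starting from a zero step.
    """
    steps = []
    prev = 0.0
    for g in grads:
        prev = m * prev - lr * g
        steps.append(prev)
    return steps


def unrolled_step(grads, lr, m):
    """Final step written as the explicit sum -lr * sum_k m**k * g_{t-k}."""
    t = len(grads) - 1
    return -lr * sum(m ** k * grads[t - k] for k in range(t + 1))


lr, m = 0.1, 0.9
grads = [2.0] * 200  # gradients that do not change between steps
steps = momentum_steps(grads, lr, m)

# Geometric-series limit of the unrolled sum for a constant gradient.
steady_state = -lr * grads[-1] / (1.0 - m)

print(steps[-1], unrolled_step(grads, lr, m), steady_state)
```

The recursive and unrolled forms agree up to floating-point error, and both settle near the steady-state value once `m ** t` is negligible. This is why momentum is often described as scaling the effective learning rate by `1 / (1 - m)`.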
One final simplification is to remove the rounding due to the batching:
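One possible reading of this step, offered as a sketch rather than the question's own formula: index the datapoints $i = 0, \dots, N-1$ with $N = (t+1)B$, and assume batches are consecutive disjoint blocks, so that datapoint $i$ belongs to batch $\lfloor i/B \rfloor$. The symbols $\eta$, $B$, $m$, and $g^{(i)}$ are the same assumed notation as in the heavy-ball form. Dropping the floor replaces the stepwise decay with a smooth per-datapoint decay of $m^{1/B}$.

```latex
\begin{align}
\Delta w_t &= -\frac{\eta}{B} \sum_{i=0}^{N-1} m^{\,t - \lfloor i/B \rfloor}\, g^{(i)} \\
           &\approx -\frac{\eta}{B} \sum_{i=0}^{N-1} m^{\,(N-1-i)/B}\, g^{(i)}
\end{align}
```

Under this reading, the two exponents differ by less than one for every datapoint, so the approximation changes each weight by at most a factor of $m$.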