Answered step by step
Verified Expert Solution
Question
1 Approved Answer
Neural Nets 1. (2 pts) Why should one use bipolar encodings of binary values instead of standard binary encoding? 2. (2 pts) Why is momentum
Neural Nets
1. (2 pts) Why should one use bipolar encodings of binary values instead of standard binary encoding? 2. (2 pts) Why is momentum often used in combination with stochastic gradient descent (SGD)? 3. (2 pts) Why does SGD have trouble converging when the gradient has small magnitude values? 4. (2 pts) Why does SGD have trouble converging when the gradient has large magnitude values? 5. (2 pts) What can happen if the learning rate is set too high? 6. (2 pts) What can happen if the learning rate is set too low? 7. (2 pts) What is the name of the matrix of second order partial derivatives of the error function? 8. (2 pts) Why is the matrix of second order partial derivatives not commonly used to help train neural networks? 9. (2 pts) Why is L-BFGS not commonly used for optimization of error functions with neural networks? 10. (2 pts) What is the name of one non-linear activation function which helps prevent gradients from approaching zero magnitude in deep neural networksStep by Step Solution
There are 3 Steps involved in it
Step: 1
Get Instant Access to Expert-Tailored Solutions
See step-by-step solutions with expert insights and AI powered tools for academic success
Step: 2
Step: 3
Ace Your Homework with AI
Get the answers you need in no time with our AI-driven, step-by-step assistance
Get Started