Question
Slides 28-30:

Recall that the sigmoid function has derivative s'(z) = s(z)(1 - s(z)). Moreover, recall that during backpropagation the derivative s'(z) is a factor in the gradient computation used to update the weights of a multilayer perceptron (see slides 28-30 in the neural-nets.pdf slide set). Activation functions like sigmoid have a "saturation" problem: when z is very large or very small, s(z) is close to 1 or 0, respectively, and so s'(z) is close to 0. As a result, the corresponding gradients will be nearly 0, which slows down training. Affine activation functions with positive slope always have a positive derivative and thus will (more or less) not exhibit saturation, but they have other drawbacks (think back to lab 6). Do a little research and find a non-affine activation function that avoids the saturation problem (hint: ReLU). In your own words, describe how this activation is non-affine and also avoids the saturation problem. Briefly discuss any drawbacks your chosen activation function may have, as well as similar alternatives that avoid these drawbacks.