Question
1. (Gradient descent) When the error on a training example d is defined as (see p. 101 in the textbook)

    E_d(w) = (1/2) Σ_{k ∈ outputs} (t_k − o_k)²

then the weights for the output units need to be updated by (see p. 103, formula (4.27))

    Δw_ji = η (t_j − o_j) o_j (1 − o_j) x_ji

One method for preventing a neural network's weights from overfitting is to add a regularization term to the error that increases with the magnitude of the weight vector. This causes the gradient descent search to seek weight vectors with small magnitudes, thereby reducing the risk of overfitting. One way to do this is to redefine E from p. 101 of the textbook as

    E_d(w) = (1/2) Σ_{k ∈ outputs} (t_k − o_k)² + γ Σ_{j,i} w_ji²

(a) Give the new formula for Δw_ji.

(b) Derive the corresponding gradient descent rule for the output units; that is, show how you obtained the formula for Δw_ji in part (a).
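As a sanity check on the update rule quoted from formula (4.27), the sketch below verifies numerically, for a single sigmoid output unit, that η (t − o) o (1 − o) x_i equals −η ∂E_d/∂w_i computed by central finite differences. The weight, input, and target values are hypothetical, chosen only for illustration; the regularized case from the exercise is left to the reader.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def error(w, x, t):
    # E_d(w) = (1/2) (t - o)^2 for one sigmoid output unit, o = sigmoid(w . x)
    o = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
    return 0.5 * (t - o) ** 2

# hypothetical example values (not from the textbook)
w = [0.2, -0.5, 0.1]
x = [1.0, 0.7, -1.2]
t = 1.0
eta = 0.1

o = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))

# analytic update from formula (4.27): delta_w_i = eta * (t - o) * o * (1 - o) * x_i
delta_analytic = [eta * (t - o) * o * (1 - o) * xi for xi in x]

# numerical check: delta_w_i should equal -eta * dE/dw_i
eps = 1e-6
for i in range(len(w)):
    w_plus = list(w); w_plus[i] += eps
    w_minus = list(w); w_minus[i] -= eps
    grad_i = (error(w_plus, x, t) - error(w_minus, x, t)) / (2 * eps)
    assert abs(delta_analytic[i] - (-eta * grad_i)) < 1e-8
```

The same finite-difference check is a convenient way to test a candidate answer for part (a): add the regularization term to `error` and confirm that the proposed Δw_ji still matches −η ∂E/∂w_ji.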