Question
(6 points) We train a neural network on a set of data D, where each training sample di {, t) and input vector is
(6 points) We train a neural network on a set of data D, where each training sample di {, t) and input vector is In, n = - 1... N together with a corresponding set of target vectors tn. We assume that t has Gaussian distribution. We would like to find parameter w for hypothesis h using the maximum likelihood estimation (ML): hML d(tanh(z)) dz argmax p(D/h) hell hML 772 argmaxp(d, h) hell i=1 Show that solving (1) is equivalent to minimizing the sum of squares error function: (1) M72 = arg min (di - h(ri))" i=1 (4 points) Derive the Error Gradient for a linear unit with a tanh activation func- tion. (Hint: =(1tanh(r)?))
Step by Step Solution
There are 3 Steps involved in it
Step: 1
Get Instant Access to Expert-Tailored Solutions
See step-by-step solutions with expert insights and AI powered tools for academic success
Step: 2
Step: 3
Ace Your Homework with AI
Get the answers you need in no time with our AI-driven, step-by-step assistance
Get StartedRecommended Textbook for
Artificial Intelligence A Modern Approach
Authors: Stuart Russell, Peter Norvig
4th Edition
0134610997, 978-0134610993
Students also viewed these Programming questions
Question
Answered: 1 week ago
Question
Answered: 1 week ago
Question
Answered: 1 week ago
Question
Answered: 1 week ago
Question
Answered: 1 week ago
Question
Answered: 1 week ago
Question
Answered: 1 week ago
Question
Answered: 1 week ago
Question
Answered: 1 week ago
Question
Answered: 1 week ago
Question
Answered: 1 week ago
Question
Answered: 1 week ago
Question
Answered: 1 week ago
Question
Answered: 1 week ago
Question
Answered: 1 week ago
Question
Answered: 1 week ago
Question
Answered: 1 week ago
Question
Answered: 1 week ago
Question
Answered: 1 week ago
Question
Answered: 1 week ago
Question
Answered: 1 week ago
Question
Answered: 1 week ago
View Answer in SolutionInn App