Question

(6 points) We train a neural network on a set of data $D$, where each training sample is $d_i = (x_i, t_i)$, with input vectors $x_n$, $n = 1, \dots, N$, and a corresponding set of target vectors $t_n$. We assume that $t$ has a Gaussian distribution. We would like to find the parameter $w$ for hypothesis $h$ using maximum likelihood estimation (ML):

$$h_{ML} = \arg\max_{h \in H} p(D \mid h)$$

$$h_{ML} = \arg\max_{h \in H} \prod_{i=1}^{m} p(d_i \mid h) \qquad (1)$$

Show that solving (1) is equivalent to minimizing the sum-of-squares error function:

$$h_{ML} = \arg\min_{h \in H} \sum_{i=1}^{m} \left(d_i - h(x_i)\right)^2$$

(4 points) Derive the error gradient for a linear unit with a tanh activation function. (Hint: $\frac{d(\tanh(z))}{dz} = 1 - \tanh^2(z)$)
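
A derived gradient for the tanh unit can be sanity-checked numerically. The sketch below is not the page's worked solution. It assumes a single unit with output $o = \tanh(w \cdot x)$ and the error $E(w) = \frac{1}{2}\sum_i (t_i - o_i)^2$ (the factor $\frac{1}{2}$ is a common convention, assumed here). Under those assumptions the hint gives $\partial E / \partial w_j = -\sum_i (t_i - o_i)(1 - o_i^2)\,x_{ij}$. The function names and the toy data are hypothetical and chosen only for illustration.

```python
import math


def unit_output(w, x):
    """Output of a single unit: tanh of the weighted sum w . x."""
    return math.tanh(sum(wi * xi for wi, xi in zip(w, x)))


def sse(w, data):
    """Sum-of-squares error E(w) = 1/2 * sum_i (t_i - o_i)^2 (1/2 is an assumed convention)."""
    return 0.5 * sum((t - unit_output(w, x)) ** 2 for x, t in data)


def sse_gradient(w, data):
    """Analytic gradient: dE/dw_j = -sum_i (t_i - o_i) * (1 - o_i^2) * x_ij."""
    grad = [0.0] * len(w)
    for x, t in data:
        o = unit_output(w, x)
        # (1 - o^2) is the tanh derivative from the hint, evaluated at z = w . x
        delta = (t - o) * (1.0 - o * o)
        for j, xj in enumerate(x):
            grad[j] -= delta * xj
    return grad


def numeric_gradient(w, data, eps=1e-6):
    """Central finite-difference approximation of dE/dw, used only as a check."""
    grad = []
    for j in range(len(w)):
        w_plus = list(w)
        w_minus = list(w)
        w_plus[j] += eps
        w_minus[j] -= eps
        grad.append((sse(w_plus, data) - sse(w_minus, data)) / (2 * eps))
    return grad


# Made-up toy data: (input vector, target); the first input component acts as a bias term.
data = [([1.0, 0.5], 0.3), ([1.0, -1.2], -0.7), ([1.0, 2.0], 0.9)]
w = [0.1, -0.4]

analytic = sse_gradient(w, data)
numeric = numeric_gradient(w, data)
max_diff = max(abs(a - n) for a, n in zip(analytic, numeric))
print("analytic:", analytic)
print("numeric: ", numeric)
print("max abs difference:", max_diff)
```

If the analytic formula is right, the two gradients should agree to within finite-difference error. A large difference usually points to a sign error or a missing $(1 - o^2)$ factor.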

