Question
We'll need the indicator function

def indicator_function(x):
    x_in = np.array(x)
    x_in[x_in >= 0] = 1
    x_in[x_in < 0] = 0
    return x_in

# Main backward pass routine
def backward_pass(all_weights, all_biases, all_f, all_h, y):
    # We'll store the derivatives dl_dweights and dl_dbiases in lists as well
    all_dl_dweights = [None] * (K + 1)
    all_dl_dbiases = [None] * (K + 1)
    # And we'll store the derivatives of the loss with respect to the activations and preactivations in lists
    all_dl_df = [None] * (K + 1)
    all_dl_dh = [None] * (K + 1)
    # Again, for convenience we'll stick with the convention that all_h[0] is the net input and all_f[K] is the net output

    # Compute derivatives of the loss with respect to the network output
    all_dl_df[K] = np.array(d_loss_d_output(all_f[K], y))

    # Now work backwards through the network
    for layer in range(K, -1, -1):
        # TODO: Calculate the derivatives of the loss with respect to the biases at this layer from all_dl_df[layer] (see the corresponding equation)
        # NOTE! To take a copy of matrix X, use Z = np.array(X)
        # REPLACE THIS LINE
        all_dl_dbiases[layer] = np.zeros_like(all_biases[layer])

        # TODO: Calculate the derivatives of the loss with respect to the weights at this layer from all_dl_df[layer] and all_h[layer] (see the corresponding equation)
        # Don't forget to use np.matmul
        # REPLACE THIS LINE
        all_dl_dweights[layer] = np.zeros_like(all_weights[layer])

        # TODO: Calculate the derivatives of the loss with respect to the activations from the weights and the derivatives of the next preactivations (second part of last line of the equation)
        # REPLACE THIS LINE
        all_dl_dh[layer] = np.zeros_like(all_h[layer])

        if layer > 0:
            # TODO: Calculate the derivatives of the loss with respect to the preactivations f (use the derivative of the ReLU function, first part of last line of the equation)
            # REPLACE THIS LINE
            all_dl_df[layer - 1] = np.zeros_like(all_f[layer - 1])

    return all_dl_dweights, all_dl_dbiases

all_dl_dweights, all_dl_dbiases = backward_pass(all_weights, all_biases, all_f, all_h, y)

np.set_printoptions(precision=3)
# Make space for derivatives computed by finite differences
all_dl_dweights_fd = [None] * (K + 1)
all_dl_dbiases_fd = [None] * (K + 1)

# Let's test if we have the derivatives right using finite differences
delta_fd = 0.000001

# Test the derivatives of the bias vectors
for layer in range(K):
    dl_dbias = np.zeros_like(all_dl_dbiases[layer])
    # For every element in the bias
    for row in range(all_biases[layer].shape[0]):
        # Take a copy of the biases; we'll change one element each time
        all_biases_copy = [np.array(x) for x in all_biases]
        all_biases_copy[layer][row] += delta_fd
        network_output_1, *_ = compute_network_output(net_input, all_weights, all_biases_copy)
        network_output_2, *_ = compute_network_output(net_input, all_weights, all_biases)
        dl_dbias[row] = (least_squares_loss(network_output_1, y) - least_squares_loss(network_output_2, y)) / delta_fd
    all_dl_dbiases_fd[layer] = np.array(dl_dbias)
    print()
    print("Bias %d, derivatives from backprop:" % layer)
    print(all_dl_dbiases[layer])
    print("Bias %d, derivatives from finite differences" % layer)
    print(all_dl_dbiases_fd[layer])
    if np.allclose(all_dl_dbiases_fd[layer], all_dl_dbiases[layer], rtol=1e-05, atol=1e-08, equal_nan=False):
        print("Success!  Derivatives match.")
    else:
        print("Failure!  Derivatives different.")

# Test the derivatives of the weight matrices
for layer in range(K):
    dl_dweight = np.zeros_like(all_dl_dweights[layer])
    # For every element in the weight matrix
    for row in range(all_weights[layer].shape[0]):
        for col in range(all_weights[layer].shape[1]):
            # Take a copy of the weights; we'll change one element each time
            all_weights_copy = [np.array(x) for x in all_weights]
            all_weights_copy[layer][row][col] += delta_fd
            network_output_1, *_ = compute_network_output(net_input, all_weights_copy, all_biases)
            network_output_2, *_ = compute_network_output(net_input, all_weights, all_biases)
            dl_dweight[row][col] = (least_squares_loss(network_output_1, y) - least_squares_loss(network_output_2, y)) / delta_fd
    all_dl_dweights_fd[layer] = np.array(dl_dweight)
    print()
    print("Weight %d, derivatives from backprop:" % layer)
    print(all_dl_dweights[layer])
    print("Weight %d, derivatives from finite differences" % layer)
    print(all_dl_dweights_fd[layer])
    if np.allclose(all_dl_dweights_fd[layer], all_dl_dweights[layer], rtol=1e-05, atol=1e-08, equal_nan=False):
        print("Success!  Derivatives match.")
    else:
        print("Failure!  Derivatives different.")
Please finish the TODOs and make sure the outputs match.
Step by Step Solution
There are 3 steps involved:
Step 1: Complete the four TODO lines in backward_pass
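One way to fill the TODOs is sketched below, assuming the notebook's usual conventions: a ReLU network with least-squares loss, all_h[0] the input, all_f[K] the output, and K + 1 weight matrices. The helper names (compute_network_output, d_loss_d_output, least_squares_loss) mirror the exercise, but their bodies here are my own minimal reconstructions, and the layer sizes in any usage are illustrative.

```python
import numpy as np

def ReLU(preactivation):
    # Elementwise max(0, x)
    return np.clip(preactivation, 0.0, None)

def indicator_function(x):
    # Derivative of ReLU: 1 where the input is >= 0, else 0
    x_in = np.array(x)
    x_in[x_in >= 0] = 1
    x_in[x_in < 0] = 0
    return x_in

def least_squares_loss(net_output, y):
    return np.sum((net_output - y) ** 2)

def d_loss_d_output(net_output, y):
    return 2 * (net_output - y)

def compute_network_output(net_input, all_weights, all_biases):
    K = len(all_weights) - 1
    all_f = [None] * (K + 1)      # preactivations
    all_h = [None] * (K + 1)      # activations; all_h[0] is the input
    all_h[0] = net_input
    for layer in range(K):
        all_f[layer] = all_biases[layer] + np.matmul(all_weights[layer], all_h[layer])
        all_h[layer + 1] = ReLU(all_f[layer])
    all_f[K] = all_biases[K] + np.matmul(all_weights[K], all_h[K])
    return all_f[K], all_f, all_h

def backward_pass(all_weights, all_biases, all_f, all_h, y):
    K = len(all_weights) - 1
    all_dl_dweights = [None] * (K + 1)
    all_dl_dbiases = [None] * (K + 1)
    all_dl_df = [None] * (K + 1)
    all_dl_dh = [None] * (K + 1)
    # Derivative of the loss with respect to the network output
    all_dl_df[K] = np.array(d_loss_d_output(all_f[K], y))
    for layer in range(K, -1, -1):
        # TODO 1: the bias derivative is a copy of dl/df at this layer
        all_dl_dbiases[layer] = np.array(all_dl_df[layer])
        # TODO 2: the weight derivative is the outer product dl/df . h^T
        all_dl_dweights[layer] = np.matmul(all_dl_df[layer], all_h[layer].transpose())
        # TODO 3: backpropagate through the weights to the activations
        all_dl_dh[layer] = np.matmul(all_weights[layer].transpose(), all_dl_df[layer])
        if layer > 0:
            # TODO 4: backpropagate through the ReLU (its derivative is the indicator)
            all_dl_df[layer - 1] = indicator_function(all_f[layer - 1]) * all_dl_dh[layer]
    return all_dl_dweights, all_dl_dbiases
```

With these four lines in place, the finite-difference checks in the question should report matching derivatives for every bias vector and weight matrix.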
Step 2: Check the bias derivatives against finite differences
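The bias test in the question perturbs one element at a time and compares the loss difference to the backprop gradient. The same pattern can be sanity-checked on its own with a toy loss; the helper name finite_difference_grad below is hypothetical, not part of the notebook.

```python
import numpy as np

def finite_difference_grad(loss_fn, params, delta=1e-6):
    # Forward-difference estimate of d(loss)/d(params), one element at a time
    grad = np.zeros_like(params)
    base = loss_fn(params)
    for idx in np.ndindex(params.shape):
        params_copy = np.array(params)   # copy, as the notebook's NOTE suggests
        params_copy[idx] += delta
        grad[idx] = (loss_fn(params_copy) - base) / delta
    return grad

# Toy check: loss(w) = sum(w**2) has analytic gradient 2*w
w = np.array([[1.0, -2.0], [0.5, 3.0]])
fd = finite_difference_grad(lambda p: np.sum(p ** 2), w)
```

A forward difference has O(delta) truncation error, so the estimate agrees with 2*w only up to roughly delta; that is why the comparison below uses loose tolerances.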
Step 3: Check the weight derivatives the same way and confirm both tests print "Success!  Derivatives match."
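The final comparison in the question uses np.allclose with explicit tolerances. A minimal illustration of that comparison, with made-up gradient values standing in for real backprop and finite-difference results:

```python
import numpy as np

# Illustrative values only: a backprop gradient and a finite-difference
# estimate that agrees with it up to small truncation error
backprop = np.array([0.123456, -2.345678])
finite_diff = backprop + 1e-7

# Same tolerance settings as the notebook's test
if np.allclose(finite_diff, backprop, rtol=1e-05, atol=1e-08, equal_nan=False):
    print("Success!  Derivatives match.")     # prints this branch
else:
    print("Failure!  Derivatives different.")
```

Note that np.allclose treats each element as close when |a - b| <= atol + rtol * |b|, so very small gradient entries are judged mostly by atol.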