
Question

Complete the TODO lines in the backward_pass routine below (equations 7.21, 7.22, and 7.24 from the text), then verify the resulting derivatives against finite differences.
1 Approved Answer

import numpy as np

# We'll need the indicator function (this is the derivative of ReLU)
def indicator_function(x):
    x_in = np.array(x)
    x_in[x_in >= 0] = 1
    x_in[x_in < 0] = 0
    return x_in
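Since indicator_function is the (sub)derivative of ReLU, it helps to keep the activation itself alongside it. A minimal sketch, assuming the standard ReLU used by the forward pass (the original cell isn't shown in this excerpt):

# ReLU activation; indicator_function above is its derivative.
# This definition is an assumption -- the original cell isn't shown.
def ReLU(preactivation):
    return np.clip(preactivation, 0, None)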
# Main backward pass routine
def backward_pass(all_weights, all_biases, all_f, all_h, y):
    # We'll store the derivatives dl_dweights and dl_dbiases in lists as well
    all_dl_dweights = [None] * (K+1)
    all_dl_dbiases = [None] * (K+1)
    # And we'll store the derivatives of the loss with respect to the activations and preactivations in lists
    all_dl_df = [None] * (K+1)
    all_dl_dh = [None] * (K+1)
    # Again, for convenience, we'll stick with the convention that all_h[0] is the net input and all_f[K] is the net output
    # Compute derivatives of the loss with respect to the network output
    all_dl_df[K] = np.array(d_loss_d_output(all_f[K], y))
    # Now work backwards through the network
    for layer in range(K, -1, -1):
        # Derivatives of the loss with respect to the biases at this layer (eq 7.21):
        # these equal the derivatives with respect to the preactivations
        # (np.array takes a copy so we don't alias all_dl_df)
        all_dl_dbiases[layer] = np.array(all_dl_df[layer])
        # Derivatives of the loss with respect to the weights at this layer (eq 7.22)
        all_dl_dweights[layer] = np.matmul(all_dl_df[layer], all_h[layer].transpose())
        # Derivatives of the loss with respect to the activations, from the weights and
        # the derivatives of the next preactivations (second part of last line of eq 7.24)
        all_dl_dh[layer] = np.matmul(all_weights[layer].transpose(), all_dl_df[layer])
        if layer > 0:
            # Derivatives of the loss with respect to the preactivations f, using the
            # derivative of the ReLU function (first part of last line of eq 7.24)
            all_dl_df[layer-1] = indicator_function(all_f[layer-1]) * all_dl_dh[layer]
    return all_dl_dweights, all_dl_dbiases
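The code below relies on helpers and variables from earlier notebook cells that aren't shown in this excerpt: compute_network_output, least_squares_loss, d_loss_d_output, plus net_input, y, K, all_weights, and all_biases. A minimal sketch of what those helpers are assumed to look like, following the conventions stated in the comments above (all_h[0] is the net input, all_f[K] is the net output, ReLU activations in between):

# Sketch of the assumed helpers -- these definitions are assumptions based on
# the conventions stated in the comments, not part of the original excerpt.
def compute_network_output(net_input, all_weights, all_biases):
    all_f = [None] * (K+1)  # preactivations
    all_h = [None] * (K+1)  # activations
    all_h[0] = net_input
    for layer in range(K):
        all_f[layer] = all_biases[layer] + np.matmul(all_weights[layer], all_h[layer])
        all_h[layer+1] = ReLU(all_f[layer])
    # Final linear layer produces the network output
    all_f[K] = all_biases[K] + np.matmul(all_weights[K], all_h[K])
    return all_f[K], all_f, all_h

def least_squares_loss(net_output, y):
    # Sum of squared differences between output and target
    return np.sum((net_output - y) ** 2)

def d_loss_d_output(net_output, y):
    # Derivative of the least-squares loss with respect to the output
    return 2 * (net_output - y)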
all_dl_dweights, all_dl_dbiases = backward_pass(all_weights, all_biases, all_f, all_h, y)
np.set_printoptions(precision=3)
# Make space for derivatives computed by finite differences
all_dl_dweights_fd = [None] * (K+1)
all_dl_dbiases_fd = [None] * (K+1)
# Let's test if we have the derivatives right using finite differences
delta_fd = 0.000001
# Test the derivatives of the bias vectors
for layer in range(K):
    dl_dbias = np.zeros_like(all_dl_dbiases[layer])
    # For every element in the bias
    for row in range(all_biases[layer].shape[0]):
        # Take a copy of the biases; we'll change one element each time
        all_biases_copy = [np.array(x) for x in all_biases]
        all_biases_copy[layer][row] += delta_fd
        network_output_1, *_ = compute_network_output(net_input, all_weights, all_biases_copy)
        network_output_2, *_ = compute_network_output(net_input, all_weights, all_biases)
        dl_dbias[row] = (least_squares_loss(network_output_1, y) - least_squares_loss(network_output_2, y)) / delta_fd
    all_dl_dbiases_fd[layer] = np.array(dl_dbias)
    print("-----------------------------------------------")
    print("Bias %d, derivatives from backprop:" % (layer))
    print(all_dl_dbiases[layer])
    print("Bias %d, derivatives from finite differences:" % (layer))
    print(all_dl_dbiases_fd[layer])
    if np.allclose(all_dl_dbiases_fd[layer], all_dl_dbiases[layer], rtol=1e-05, atol=1e-08, equal_nan=False):
        print("Success! Derivatives match.")
    else:
        print("Failure! Derivatives different.")
# Test the derivatives of the weight matrices
for layer in range(K):
    dl_dweight = np.zeros_like(all_dl_dweights[layer])
    # For every element in the weight matrix
    for row in range(all_weights[layer].shape[0]):
        for col in range(all_weights[layer].shape[1]):
            # Take a copy of the weights; we'll change one element each time
            all_weights_copy = [np.array(x) for x in all_weights]
            all_weights_copy[layer][row][col] += delta_fd
            network_output_1, *_ = compute_network_output(net_input, all_weights_copy, all_biases)
            network_output_2, *_ = compute_network_output(net_input, all_weights, all_biases)
            dl_dweight[row][col] = (least_squares_loss(network_output_1, y) - least_squares_loss(network_output_2, y)) / delta_fd
    all_dl_dweights_fd[layer] = np.array(dl_dweight)
    print("-----------------------------------------------")
    print("Weight %d, derivatives from backprop:" % (layer))
    print(all_dl_dweights[layer])
    print("Weight %d, derivatives from finite differences:" % (layer))
    print(all_dl_dweights_fd[layer])
    if np.allclose(all_dl_dweights_fd[layer], all_dl_dweights[layer], rtol=1e-05, atol=1e-08, equal_nan=False):
        print("Success! Derivatives match.")
    else:
        print("Failure! Derivatives different.")
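One caveat on the check above: a one-sided (forward) difference has truncation error on the order of delta_fd, which can cause marginal failures even when backprop is correct. A central difference, sketched here as an optional variant (not part of the original answer; central_diff_bias is a hypothetical helper added for illustration), reduces the error to the order of delta_fd squared at the cost of one extra perturbed forward pass per element:

# Optional, more accurate central-difference estimate for a single bias element.
# central_diff_bias is a hypothetical helper added for illustration.
def central_diff_bias(layer, row, delta=1e-6):
    biases_plus = [np.array(b) for b in all_biases]
    biases_minus = [np.array(b) for b in all_biases]
    biases_plus[layer][row] += delta
    biases_minus[layer][row] -= delta
    out_plus, *_ = compute_network_output(net_input, all_weights, biases_plus)
    out_minus, *_ = compute_network_output(net_input, all_weights, biases_minus)
    return (least_squares_loss(out_plus, y) - least_squares_loss(out_minus, y)) / (2 * delta)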
