Question

import numpy as np

# We'll need the indicator function (the derivative of the ReLU activation)
def indicator_function(x):
    x_in = np.array(x)
    x_in[x_in >= 0] = 1
    x_in[x_in < 0] = 0
    return x_in
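# Note (illustrative, values not from the original exercise): indicator_function
# returns 1 where the pre-activation is non-negative and 0 elsewhere, e.g.
#   indicator_function(np.array([[-2.0, 0.0, 3.0]]))  # -> array([[0., 1., 1.]])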
# Main backward pass routine
# (K, d_loss_d_output, and the forward-pass quantities all_f and all_h are
# assumed to have been defined earlier in the notebook)
def backward_pass(all_weights, all_biases, all_f, all_h, y):
    # We'll store the derivatives dl_dweights and dl_dbiases in lists as well
    all_dl_dweights = [None] * (K+1)
    all_dl_dbiases = [None] * (K+1)
    # And we'll store the derivatives of the loss with respect to the activations and pre-activations in lists
    all_dl_df = [None] * (K+1)
    all_dl_dh = [None] * (K+1)
    # Again for convenience we'll stick with the convention that all_h[0] is the net input and all_f[K] is the net output
    # Compute derivatives of the loss with respect to the network output
    all_dl_df[K] = np.array(d_loss_d_output(all_f[K], y))
    # Now work backwards through the network
    for layer in range(K, -1, -1):
        # TODO Calculate the derivatives of the loss with respect to the biases at this layer from all_dl_df[layer] (eq 7.21)
        # NOTE! To take a copy of matrix X, use Z = np.array(X)
        # REPLACE THIS LINE
        all_dl_dbiases[layer] = np.zeros_like(all_biases[layer])
        # TODO Calculate the derivatives of the loss with respect to the weights at this layer from all_dl_df[layer] and all_h[layer] (eq 7.22)
        # Don't forget to use np.matmul
        # REPLACE THIS LINE
        all_dl_dweights[layer] = np.zeros_like(all_weights[layer])
        # TODO Calculate the derivatives of the loss with respect to the activations from the weights and the derivatives of the next pre-activations (second part of last line of eq 7.24)
        # REPLACE THIS LINE
        all_dl_dh[layer] = np.zeros_like(all_h[layer])
        if layer > 0:
            # TODO Calculate the derivatives of the loss with respect to the pre-activations f (use the derivative of the ReLU function, first part of last line of eq 7.24)
            # REPLACE THIS LINE
            all_dl_df[layer-1] = np.zeros_like(all_f[layer-1])
    return all_dl_dweights, all_dl_dbiases
all_dl_dweights, all_dl_dbiases = backward_pass(all_weights, all_biases, all_f, all_h, y)
np.set_printoptions(precision=3)
# Make space for derivatives computed by finite differences
all_dl_dweights_fd = [None] * (K+1)
all_dl_dbiases_fd = [None] * (K+1)
# Let's test if we have the derivatives right using finite differences
delta_fd = 0.000001
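# The checks below use a forward finite-difference approximation: perturb one
# parameter by delta_fd, rerun the network, and estimate the derivative as
#   dL/dtheta ~ (L(theta + delta_fd) - L(theta)) / delta_fd,
# then compare against the analytic derivatives from backward_pass.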
# Test the derivatives of the bias vectors
for layer in range(K):
    dl_dbias = np.zeros_like(all_dl_dbiases[layer])
    # For every element in the bias vector
    for row in range(all_biases[layer].shape[0]):
        # Take a copy of the biases; we'll change one element each time
        all_biases_copy = [np.array(x) for x in all_biases]
        all_biases_copy[layer][row] += delta_fd
        network_output_1, *_ = compute_network_output(net_input, all_weights, all_biases_copy)
        network_output_2, *_ = compute_network_output(net_input, all_weights, all_biases)
        dl_dbias[row] = (least_squares_loss(network_output_1, y) - least_squares_loss(network_output_2, y)) / delta_fd
    all_dl_dbiases_fd[layer] = np.array(dl_dbias)
    print("-----------------------------------------------")
    print("Bias %d, derivatives from backprop:" % (layer))
    print(all_dl_dbiases[layer])
    print("Bias %d, derivatives from finite differences:" % (layer))
    print(all_dl_dbiases_fd[layer])
    if np.allclose(all_dl_dbiases_fd[layer], all_dl_dbiases[layer], rtol=1e-05, atol=1e-08, equal_nan=False):
        print("Success! Derivatives match.")
    else:
        print("Failure! Derivatives different.")
# Test the derivatives of the weight matrices
for layer in range(K):
    dl_dweight = np.zeros_like(all_dl_dweights[layer])
    # For every element in the weight matrix
    for row in range(all_weights[layer].shape[0]):
        for col in range(all_weights[layer].shape[1]):
            # Take a copy of the weights; we'll change one element each time
            all_weights_copy = [np.array(x) for x in all_weights]
            all_weights_copy[layer][row][col] += delta_fd
            network_output_1, *_ = compute_network_output(net_input, all_weights_copy, all_biases)
            network_output_2, *_ = compute_network_output(net_input, all_weights, all_biases)
            dl_dweight[row][col] = (least_squares_loss(network_output_1, y) - least_squares_loss(network_output_2, y)) / delta_fd
    all_dl_dweights_fd[layer] = np.array(dl_dweight)
    print("-----------------------------------------------")
    print("Weight %d, derivatives from backprop:" % (layer))
    print(all_dl_dweights[layer])
    print("Weight %d, derivatives from finite differences:" % (layer))
    print(all_dl_dweights_fd[layer])
    if np.allclose(all_dl_dweights_fd[layer], all_dl_dweights[layer], rtol=1e-05, atol=1e-08, equal_nan=False):
        print("Success! Derivatives match.")
    else:
        print("Failure! Derivatives different.")
