Question

Posted on Sep 26, 2024

Data

Download the MNIST data and construct the following sets:

training set: one example only (you can pick your favourite digit)

test set: one example per digit from the MNIST test dataset

Map the images to $[0, 1]^{28 \times 28}$.
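The set construction described above can be sketched as follows. This is a minimal sketch, not a prescribed solution: `build_sets` is a hypothetical helper, and it assumes the raw MNIST arrays have already been loaded as `uint8` images of shape `(N, 28, 28)` with integer labels (for example via `tf.keras.datasets.mnist.load_data()`). Random stand-in arrays are used here so the snippet runs offline.

```python
import numpy as np

def build_sets(x_train, y_train, x_test, y_test, digit=3):
    """Return (train_x, test_x), both scaled to [0, 1].

    train_x has shape (1, 28, 28): the first training image labelled `digit`.
    test_x has shape (10, 28, 28): the first test image of each digit 0..9.
    """
    train_idx = int(np.flatnonzero(y_train == digit)[0])
    train_x = x_train[train_idx:train_idx + 1].astype(np.float32) / 255.0
    test_idx = [int(np.flatnonzero(y_test == d)[0]) for d in range(10)]
    test_x = x_test[test_idx].astype(np.float32) / 255.0
    return train_x, test_x

# Stand-in arrays with MNIST's dtype and image shape (assumption: not real data).
rng = np.random.default_rng(0)
fake_x = rng.integers(0, 256, size=(50, 28, 28), dtype=np.uint8)
fake_y = np.arange(50) % 10
train_x, test_x = build_sets(fake_x, fake_y, fake_x, fake_y, digit=3)
```

Dividing by 255 is what maps the 8-bit pixel values into $[0, 1]$; the choice of digit 3 is arbitrary.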

1 - MLP

1.1 - Implement a fully connected neural network $h: [0, 1]^{28 \times 28} \to [0, 1]^{28 \times 28}$ that regresses an image into itself. The architecture should have 7 trainable dense layers: the first 6 layers with 4 neurons and ReLU activation, and an output layer with the necessary number of units and activation.
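One way to read this architecture is sketched below in plain NumPy (forward pass only). The 784 output units follow from the 28 x 28 image size; the sigmoid output activation and the He-style initialisation are assumptions of this sketch, chosen because a sigmoid keeps predictions inside $[0, 1]$. The names `init_mlp` and `mlp_forward` are hypothetical.

```python
import numpy as np

def init_mlp(rng, sizes=(784, 4, 4, 4, 4, 4, 4, 784)):
    """Random init; returns a list of (W, b) pairs, one per dense layer."""
    return [(rng.normal(0.0, np.sqrt(2.0 / m), size=(k, m)), np.zeros(k))
            for m, k in zip(sizes[:-1], sizes[1:])]

def mlp_forward(params, image):
    """Map one 28x28 image in [0, 1] to a 28x28 prediction."""
    h = image.reshape(-1)                        # flatten to 784 values
    for W, b in params[:-1]:
        h = np.maximum(W @ h + b, 0.0)           # six hidden layers: 4 units, ReLU
    W, b = params[-1]
    out = 1.0 / (1.0 + np.exp(-(W @ h + b)))     # output layer: 784 units, sigmoid (assumed)
    return out.reshape(28, 28)

params = init_mlp(np.random.default_rng(0))
pred = mlp_forward(params, np.full((28, 28), 0.5))
```

Note how every image has to pass through layers of only 4 units, which is worth keeping in mind when interpreting the predictions later.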

1.2 - Train the model using SGD on the appropriate loss function for $10^3$ epochs on the training data. Plot the training loss over epochs.
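A training loop for this step might look like the sketch below, which uses manual backpropagation in NumPy. It assumes mean squared error as the loss (binary cross-entropy would be another reasonable reading of "appropriate"), a sigmoid output layer, and a learning rate of 0.5; none of these are fixed by the exercise. With a single training example, each epoch is one SGD step. `train_mlp` is a hypothetical name, and the square target stands in for a real MNIST digit.

```python
import numpy as np

def train_mlp(x, epochs=1000, lr=0.5, seed=0):
    """Fit a 784-4-4-4-4-4-4-784 MLP to one flattened image x; return (params, losses)."""
    rng = np.random.default_rng(seed)
    sizes = [x.size, 4, 4, 4, 4, 4, 4, x.size]
    Ws = [rng.normal(0.0, np.sqrt(2.0 / m), size=(k, m))
          for m, k in zip(sizes[:-1], sizes[1:])]
    bs = [np.zeros(k) for k in sizes[1:]]
    losses = []
    for _ in range(epochs):
        # Forward pass, caching each layer's activation for backprop.
        acts = [x]
        for W, b in zip(Ws[:-1], bs[:-1]):
            acts.append(np.maximum(W @ acts[-1] + b, 0.0))
        out = 1.0 / (1.0 + np.exp(-(Ws[-1] @ acts[-1] + bs[-1])))
        losses.append(float(np.mean((out - x) ** 2)))
        # Backward pass: delta is dLoss/d(pre-activation) of the current layer.
        delta = 2.0 * (out - x) / x.size * out * (1.0 - out)
        for i in range(len(Ws) - 1, -1, -1):
            grad_W, grad_b = np.outer(delta, acts[i]), delta
            if i > 0:
                # (acts[i] > 0) is the ReLU derivative of the layer below.
                delta = (Ws[i].T @ delta) * (acts[i] > 0)
            Ws[i] -= lr * grad_W
            bs[i] -= lr * grad_b
    return list(zip(Ws, bs)), losses

# Stand-in target image (assumption: a bright square instead of an MNIST digit).
target = np.zeros((28, 28))
target[8:20, 8:20] = 1.0
params, losses = train_mlp(target.reshape(-1))
```

The returned `losses` list has one entry per epoch, so it can be passed directly to a plotting call such as `matplotlib.pyplot.plot(losses)`.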

1.3 - Plot the prediction over the training set and test set (you should spot a pattern in the predictions, but since there is some randomness associated with using the GPU, we recommend repeating the training 3-5 times to be sure you pick up the right pattern). Which function do you conjecture $h(x)$ has learnt (write it as a formula)?
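Alongside the plots, a numeric check can make a pattern easier to spot. The sketch below is a hypothetical helper, not part of the exercise: for each test image it reports how far the prediction is from the image that was fed in and how far it is from the single training image. The two lambda predictors are stand-ins that only illustrate how the scores read; a trained model's prediction function would be passed in their place.

```python
import numpy as np

def pattern_scores(predict, train_image, test_images):
    """For each test image, return (MSE to its own input, MSE to the training image)."""
    scores = []
    for x in test_images:
        y = predict(x)
        scores.append((float(np.mean((y - x) ** 2)),
                       float(np.mean((y - train_image) ** 2))))
    return scores

# Stand-in data and predictors (assumptions, used only for illustration).
rng = np.random.default_rng(0)
train_image = rng.random((28, 28))
test_images = rng.random((10, 28, 28))
copies_input = pattern_scores(lambda x: x, train_image, test_images)
ignores_input = pattern_scores(lambda x: train_image, train_image, test_images)
```

A first score near zero on every test image suggests the model reproduces its input; a second score near zero suggests it outputs the same image regardless of input.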

2 - CNN

2.1 - Implement a CNN $g: [0, 1]^{28 \times 28} \to [0, 1]^{28 \times 28}$ that regresses an image into itself. The architecture should have 2 convolutional layers: the first with 10 filters, kernel size $5 \times 5$ and the same output size as the input, and the second a convolutional output layer with the necessary number of filters, kernel size and activation.
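A forward pass for such a CNN can be sketched in NumPy as follows. Several choices here are assumptions, since the exercise leaves them open: ReLU after the first layer, a single-filter output layer with a 5 x 5 kernel, zero ("same") padding in both layers, and a sigmoid output to stay inside $[0, 1]$. The helper names are hypothetical.

```python
import numpy as np

def conv2d_same(x, kernels, bias):
    """'Same' cross-correlation. x: (C_in, H, W); kernels: (C_out, C_in, k, k), k odd."""
    c_out, _, k, _ = kernels.shape
    p = k // 2
    xp = np.pad(x, ((0, 0), (p, p), (p, p)))       # zero padding keeps H x W
    H, W = x.shape[1:]
    out = np.empty((c_out, H, W))
    for i in range(H):
        for j in range(W):
            patch = xp[:, i:i + k, j:j + k]         # (C_in, k, k) window
            out[:, i, j] = np.tensordot(kernels, patch, axes=3) + bias
    return out

def cnn_forward(params, image):
    (K1, b1), (K2, b2) = params
    hidden = np.maximum(conv2d_same(image[None], K1, b1), 0.0)   # 10 feature maps, ReLU (assumed)
    logits = conv2d_same(hidden, K2, b2)                          # 1 output map
    return (1.0 / (1.0 + np.exp(-logits)))[0]                     # sigmoid (assumed)

rng = np.random.default_rng(0)
params = [(rng.normal(0.0, 0.1, (10, 1, 5, 5)), np.zeros(10)),
          (rng.normal(0.0, 0.1, (1, 10, 5, 5)), np.zeros(1))]
pred = cnn_forward(params, rng.random((28, 28)))
```

Unlike the dense network, nothing here compresses the image: every output pixel depends only on a local neighbourhood of the input.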

2.2 - Train the model using SGD on the appropriate loss function for $10^3$ epochs on the training data. Plot the training loss over epochs.

2.3 - Plot the prediction over the training set and test set (you should spot a pattern in the predictions, but since there is some randomness associated with using the GPU, we recommend repeating the training 3-5 times to be sure you pick up the right pattern). Which function do you conjecture $g(x)$ has learnt (write it as a formula)?

3 - Learning the identity map

3.1 - Consider a multilayer ReLU network $h: \mathbb{R}^n \to \mathbb{R}^n$ such that

$$h(x) = W_3 \,\mathrm{ReLU}\big(W_2 \,\mathrm{ReLU}(W_1 x + b_1) + b_2\big) + b_3$$

with $W_1 \in \mathbb{R}^{a \times n}$, $W_3 \in \mathbb{R}^{n \times n}$, $b_1 \in \mathbb{R}^a$ and $b_2, b_3 \in \mathbb{R}^n$ (so $W_2 \in \mathbb{R}^{n \times a}$ for the dimensions to match).

Find a possible solution for $W_1, W_2, W_3, b_1, b_2, b_3$ such that $h$ represents the identity function.

What if you want $h$ to represent a constant function that always outputs $x_0$?
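One possible construction, offered as a sketch rather than the intended answer, uses $a = 2n$ and the fact that $x = \mathrm{ReLU}(x) - \mathrm{ReLU}(-x)$. Because the second ReLU has only $n$ non-negative outputs, this sketch assumes inputs bounded by $|x_i| \le M$ and shifts them by $M$ before that ReLU (images in $[0, 1]^n$ satisfy the bound with $M = 1$). For the constant function, zeroing $W_3$ and setting $b_3 = x_0$ is enough. The helper names are hypothetical.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def h(x, W1, b1, W2, b2, W3, b3):
    return W3 @ relu(W2 @ relu(W1 @ x + b1) + b2) + b3

def identity_params(n, M):
    """Weights with a = 2n that give h(x) = x whenever every |x_i| <= M."""
    I = np.eye(n)
    W1, b1 = np.vstack([I, -I]), np.zeros(2 * n)   # first layer: [relu(x), relu(-x)]
    W2, b2 = np.hstack([I, -I]), np.full(n, M)     # pre-activation is x + M >= 0
    W3, b3 = I, np.full(n, -M)                     # undo the shift by M
    return W1, b1, W2, b2, W3, b3

def constant_params(n, a, x0):
    """Weights that give h(x) = x0 for every x: zero W3 and put x0 in b3."""
    return (np.zeros((a, n)), np.zeros(a), np.zeros((n, a)), np.zeros(n),
            np.zeros((n, n)), np.asarray(x0, dtype=float))

rng = np.random.default_rng(0)
x = rng.uniform(-3.0, 3.0, size=5)
x0 = np.arange(5.0)
y_identity = h(x, *identity_params(5, M=3.0))
y_constant = h(x, *constant_params(5, a=7, x0=x0))
```

If the inputs are known to be non-negative, the simpler choice $a = n$ with all weight matrices equal to the identity and all biases zero also works.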

3.2 - Consider a CNN $g: \mathbb{R}^{n \times n} \to \mathbb{R}^{n \times n}$ composed of a first hidden convolutional layer with $c$ filters, a $d \times d$ kernel ($d > 1$, odd), identity activation, and a suitable convolutional output layer. Find a possible architecture for $g$ (i.e. specify the complete architecture, $c$, the values in the filters, padding and stride) such that $g$ represents the identity function.

If instead of the identity activation, we use a ReLU activation, how should the architecture change?
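A candidate architecture can be checked numerically with the sketch below; it is one possible answer under stated assumptions, not necessarily the intended one. With identity activation it uses $c = 1$ filter whose $d \times d$ kernel is 1 at the centre and 0 elsewhere, zero padding $(d-1)/2$, stride 1, no biases, and a $1 \times 1$ output convolution with weight 1. With ReLU it uses $c = 2$ filters (centre $+1$ and centre $-1$) and output weights $(+1, -1)$, relying on $x = \mathrm{ReLU}(x) - \mathrm{ReLU}(-x)$. `conv2d` and `identity_cnn` are hypothetical helpers.

```python
import numpy as np

def conv2d(x, kernels, pad):
    """Stride-1 cross-correlation with zero padding.

    x: (C_in, H, W); kernels: (C_out, C_in, d, d); returns (C_out, H', W').
    """
    d = kernels.shape[-1]
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    H = xp.shape[1] - d + 1
    W = xp.shape[2] - d + 1
    out = np.empty((kernels.shape[0], H, W))
    for i in range(H):
        for j in range(W):
            out[:, i, j] = np.tensordot(kernels, xp[:, i:i + d, j:j + d], axes=3)
    return out

def identity_cnn(x, d=3, use_relu=False):
    """No biases. Identity activation: c = 1 filter; ReLU: c = 2 filters."""
    signs = np.array([1.0, -1.0]) if use_relu else np.array([1.0])
    hidden_k = np.zeros((len(signs), 1, d, d))
    hidden_k[:, 0, d // 2, d // 2] = signs            # +1 (and -1) at the kernel centre
    hidden = conv2d(x[None], hidden_k, pad=d // 2)    # padding (d-1)/2, stride 1
    if use_relu:
        hidden = np.maximum(hidden, 0.0)
    out_k = signs.reshape(1, -1, 1, 1)                # 1x1 output conv: relu(x) - relu(-x)
    return conv2d(hidden, out_k, pad=0)[0]

x = np.random.default_rng(0).normal(size=(6, 6))
y_linear = identity_cnn(x, d=5)
y_relu = identity_cnn(x, d=3, use_relu=True)
```

The input here contains negative values on purpose, since a single centre filter followed by ReLU would clip them; the second, negated filter is what recovers them.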

Note: R denotes the real numbers and ^ means "to the power of".
