
Question


[10+15+10=35 points, implementation] Consider the following RNN models to perform text classification over the dataset
provided in hw1_q2_text_data_train.csv (see hw1_q2_helper.py for sample implementations of these six models).
1. GRU: a single GRU layer with 32 units.
2. GRU_stacked: two GRU layers, the first with 32 units and the second with 16 units.
3. GRU_stacked_bidirectional: same as #2, except that the layers are bidirectional.
4. LSTM: a single LSTM layer with 32 units.
5. LSTM_stacked: two LSTM layers, the first with 32 units and the second with 16 units.
6. LSTM_stacked_bidirectional: same as #5, except that the layers are bidirectional.
For all the models, the preceding Embedding layer should have a vocabulary size (input_dim) of 10,000 and 64-dimensional
embeddings. Set the optimizer to adam, the loss to categorical_crossentropy, and the metric to accuracy, and train with a
batch size of 128. Set the validation split to 0.2 wherever needed.
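To make the architecture concrete, below is a minimal, hypothetical sketch (not the code from hw1_q2_helper.py) of how model #3, GRU_stacked_bidirectional, could be assembled in Keras with the Embedding and compile settings above; num_classes is an assumed placeholder for the number of labels in the dataset.

from tensorflow.keras import layers, models

g_max_words, g_max_text_len = 10000, 200
num_classes = 4  # assumption: replace with the actual number of labels in the dataset

model = models.Sequential([
    # 64-dimensional embeddings over a 10,000-word vocabulary
    layers.Embedding(input_dim=g_max_words, output_dim=64, input_length=g_max_text_len),
    # the first bidirectional GRU layer must return sequences so the second layer can consume them
    layers.Bidirectional(layers.GRU(32, return_sequences=True)),
    layers.Bidirectional(layers.GRU(16)),
    layers.Dense(num_classes, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])

The other five models differ only in the recurrent layers (GRU vs. LSTM, one vs. two layers, with or without the Bidirectional wrapper).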
(a.) Train any given model (from the above list of 6 models) and identify the number of epochs that minimizes the
validation loss.
A sample function call with the required arguments and its output is provided below.
num_epoch_limit = 100
best_epoch_model = get_best_num_epochs('hw1_q2_text_data_train.csv', 'GRU_stacked', num_epoch_limit, g_max_words=10000, g_max_text_len=200)
# g_max_words and g_max_text_len must specify the dimensions of the Embedding layer
# best_epoch_model would be between 1 and num_epoch_limit
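One way get_best_num_epochs() could be structured is sketched below; this is an assumption, not the required implementation. load_and_vectorize() and build_model() are hypothetical helpers standing in for the preprocessing and model-building code in hw1_q2_helper.py; the function trains once for num_epoch_limit epochs with validation_split=0.2 and returns the 1-based epoch with the lowest validation loss.

import numpy as np

def get_best_num_epochs(train_file, model_name, num_epoch_limit,
                        g_max_words=10000, g_max_text_len=200):
    # hypothetical helpers: tokenize/pad the texts and build the requested model
    x_train, y_train = load_and_vectorize(train_file, g_max_words, g_max_text_len)
    model = build_model(model_name, g_max_words, g_max_text_len)
    history = model.fit(x_train, y_train,
                        epochs=num_epoch_limit,
                        batch_size=128,
                        validation_split=0.2,
                        verbose=0)
    # history epochs are 0-indexed, so the best epoch count is argmin + 1
    return int(np.argmin(history.history["val_loss"])) + 1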
(b.) Compare the performance of the given list of models by evaluating them over the test set
hw1_q2_text_data_test.csv. A sample function call with the required arguments and its output is provided below.
Figure 1 shows the format of the output data frame (i.e., df_ress); the generated data frame df_ress should contain the
same set of performance metrics shown in the figure.
g_train_data_file = 'hw1_q2_text_data_train.csv'
g_test_data_file = 'hw1_q2_text_data_test.csv'
g_num_epochs = 5
g_list_models = ['GRU', 'GRU_stacked', 'GRU_stacked_bidirectional', 'LSTM', 'LSTM_stacked', 'LSTM_stacked_bidirectional']
df_ress = run_given_DNNs(g_train_data_file, g_test_data_file, g_list_models, g_num_epochs, g_max_words=10000, g_max_text_len=200)
Figure 1: Sample data frame generated by the run_given_DNNs() function
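A hypothetical sketch of run_given_DNNs() is shown below. Since Figure 1 is not reproduced here, the exact metric columns are unknown; test loss and accuracy from model.evaluate() are used as placeholders, and load_and_vectorize() / build_model() are the same assumed helpers as above.

import pandas as pd

def run_given_DNNs(g_train_data_file, g_test_data_file, g_list_models, g_num_epochs,
                   g_max_words=10000, g_max_text_len=200):
    x_train, y_train = load_and_vectorize(g_train_data_file, g_max_words, g_max_text_len)
    x_test, y_test = load_and_vectorize(g_test_data_file, g_max_words, g_max_text_len)
    rows = []
    for model_name in g_list_models:
        model = build_model(model_name, g_max_words, g_max_text_len)
        model.fit(x_train, y_train, epochs=g_num_epochs, batch_size=128,
                  validation_split=0.2, verbose=0)
        # placeholder metrics; the actual assignment expects the columns shown in Figure 1
        test_loss, test_acc = model.evaluate(x_test, y_test, verbose=0)
        rows.append({"model": model_name, "test_loss": test_loss, "test_accuracy": test_acc})
    return pd.DataFrame(rows)  # df_ress, one row per model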
(c.) Investigate the effectiveness of two regularization methods, namely weight regularization and dropout, on model #6
(LSTM_stacked_bidirectional). You are required to implement a function named run_regularization_comparison()
which generates a plot (saved as a file named text_clf_regularization.pdf) that should look similar to Figure 2.
A sample function call with the required arguments and its output is provided below.
# weight regularization code sample
GRU(32, kernel_regularizer=regularizers.l2(0.001))
# dropout code sample
layers.Dropout(0.25)

max_num_epochs = 25
train_data_file = 'hw1_q2_text_data_train.csv'
avg_loss_orig, avg_loss_wreg, avg_loss_dropout = run_regularization_comparison(train_data_file, max_num_epochs)
# sample output avg_loss_orig: 1.492, wreg: 1.310, dropout: 1.486
# code should automatically generate text_clf_regularization.pdf
Figure 2: Sample regularization plots (validation loss vs. epochs for the original, weight-regularized, and dropout models)
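A hypothetical sketch of run_regularization_comparison() is given below. build_lstm_stacked_bidir() is an assumed helper that builds model #6 with optional L2 weight regularization or Dropout layers, load_and_vectorize() is the same assumed preprocessing helper as above, and averaging the validation loss over all epochs is an assumption about how the three reported numbers are computed.

import numpy as np
import matplotlib.pyplot as plt

def run_regularization_comparison(train_data_file, max_num_epochs):
    x_train, y_train = load_and_vectorize(train_data_file, 10000, 200)  # assumed helper
    variants = {
        "original": build_lstm_stacked_bidir(),                    # plain model #6
        "regularized": build_lstm_stacked_bidir(l2_factor=0.001),  # kernel_regularizer on each layer
        "dropout": build_lstm_stacked_bidir(dropout_rate=0.25),    # Dropout layers between layers
    }
    avg_losses, curves = {}, {}
    for name, model in variants.items():
        history = model.fit(x_train, y_train, epochs=max_num_epochs,
                            batch_size=128, validation_split=0.2, verbose=0)
        curves[name] = history.history["val_loss"]
        avg_losses[name] = float(np.mean(curves[name]))  # assumption: report the mean over epochs
    # plot the three validation-loss curves and save the required PDF
    plt.figure()
    for name, losses in curves.items():
        plt.plot(range(1, max_num_epochs + 1), losses, label=f"Validation loss - {name}")
    plt.xlabel("Epochs")
    plt.ylabel("Loss")
    plt.title("validation losses")
    plt.legend()
    plt.savefig("text_clf_regularization.pdf")
    return avg_losses["original"], avg_losses["regularized"], avg_losses["dropout"]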
