Question
3. Neural sequence models (14 points)
(a) (2 points) Suppose you want to build a French-to-English translation system using an LSTM-based sequence-to-sequence model. What would be fed to the decoder as input at each time step t? Note that there should be two components to the input.
(b) (2 points) Consider the two components of the input to your decoder from part (a). How does each component affect the generalization of the model during inference?
(c) (4 points) Describe two different decoding strategies for generating English translations with your decoder from part (a). For each strategy, explain when you would want to use it and name a possible drawback of that strategy.
(d) (2 points) You are building a text classifier using a simple single-layer, unidirectional RNN. Your friend recommends that you use a GRU cell instead of an LSTM cell. Under what circumstances might the GRU work better than the LSTM?
(e) (4 points) You notice that the performance of your classifier from part (d) is not very good. Describe two extensions that you can make to your GRU model from part (d) to improve the performance of your classifier. For each extension, explain why it might help.
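To make parts (a) and (c) concrete, here is a minimal sketch in Python. It is not the official solution. It assumes that the two decoder inputs are the previously generated token and the previous decoder state, and it takes greedy search and beam search as the two decoding strategies. The `TABLE` of probabilities, the `step` function, and the integer "state" are hypothetical stand-ins for a trained LSTM decoder, chosen so that the two strategies give different outputs.

```python
import math

# Hypothetical next-token probabilities, standing in for a trained decoder.
TABLE = {
    "<s>": {"a": 0.6, "the": 0.4},
    "a": {"cat": 0.55, "dog": 0.45},
    "the": {"cat": 0.9, "dog": 0.1},
    "cat": {"</s>": 1.0},
    "dog": {"</s>": 1.0},
}


def step(prev_token, state):
    """One decoder step: consumes the previous token and the previous state.

    The state here is only a step counter (an assumption for illustration);
    a real LSTM decoder would carry hidden and cell state vectors.
    """
    log_probs = {tok: math.log(p) for tok, p in TABLE[prev_token].items()}
    return log_probs, state + 1


def greedy_decode(max_len=10):
    """Pick the single most likely token at every step."""
    token, state, out, score = "<s>", 0, [], 0.0
    for _ in range(max_len):
        log_probs, state = step(token, state)
        token = max(log_probs, key=log_probs.get)
        score += log_probs[token]
        if token == "</s>":
            break
        out.append(token)
    return out, score


def beam_decode(beam_width=2, max_len=10):
    """Keep the beam_width highest-scoring partial translations at every step."""
    beams = [(0.0, ["<s>"], 0, False)]  # (log-prob, tokens, state, finished)
    for _ in range(max_len):
        candidates = []
        for score, tokens, state, done in beams:
            if done:
                candidates.append((score, tokens, state, True))
                continue
            log_probs, new_state = step(tokens[-1], state)
            for tok, lp in log_probs.items():
                candidates.append(
                    (score + lp, tokens + [tok], new_state, tok == "</s>")
                )
        beams = sorted(candidates, key=lambda b: b[0], reverse=True)[:beam_width]
        if all(b[3] for b in beams):
            break
    best_score, best_tokens, _, _ = beams[0]
    words = [t for t in best_tokens if t not in ("<s>", "</s>")]
    return words, best_score


greedy_words, greedy_score = greedy_decode()
beam_words, beam_score = beam_decode(beam_width=2)
print(greedy_words, round(math.exp(greedy_score), 2))
print(beam_words, round(math.exp(beam_score), 2))
```

In this toy table, greedy search commits to "a" because it is the most likely first word, and ends with a sequence of probability 0.33. Beam search with width 2 keeps "the" alive and finds a sequence of probability 0.36. This shows the usual trade-off: greedy decoding is cheap but can miss the more likely overall sequence, while beam search costs roughly beam_width times more computation per step.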