D. What does 'Multi-Head Attention' in the Transformer model refer to?

1. The model's ability to perform multiple tasks at once.

2. The model's ability to read data word by word.

3. The model's ability to understand context better than its predecessors.

4. The model's ability to scale up by adding more layers or attention heads.
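For reference when weighing the options: in the Transformer architecture, multi-head attention runs several scaled dot-product attention operations ("heads") in parallel, each on its own learned projection of the same input, then concatenates the head outputs and applies a final linear projection. The following is a minimal pure-Python sketch of that mechanism, not a production implementation. All function and variable names are illustrative assumptions, the random matrices stand in for learned weights, and masking, batching, and biases are left out.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(q, k, v):
    """Scaled dot-product attention for a single head."""
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]          # transpose keys
    scores = matmul(q, k_t)                       # query-key similarity
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v)                     # weighted sum of values

def multi_head_attention(x, heads, w_o):
    """Run every head on the same input, concatenate, then project.

    heads is a list of (W_q, W_k, W_v) triples, one per head (hypothetical layout).
    """
    outputs = [attention(matmul(x, w_q), matmul(x, w_k), matmul(x, w_v))
               for w_q, w_k, w_v in heads]
    # Concatenate the per-token outputs of all heads along the feature axis.
    concat = [sum((out[i] for out in outputs), []) for i in range(len(x))]
    return matmul(concat, w_o)

def random_matrix(rows, cols, rng):
    """Stand-in for a learned weight matrix (assumption for illustration)."""
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
seq_len, d_model, num_heads = 3, 4, 2
d_head = d_model // num_heads

x = random_matrix(seq_len, d_model, rng)          # 3 tokens, 4 features each
heads = [tuple(random_matrix(d_model, d_head, rng) for _ in range(3))
         for _ in range(num_heads)]
w_o = random_matrix(num_heads * d_head, d_model, rng)

result = multi_head_attention(x, heads, w_o)
print(len(result), len(result[0]))                # sequence length, model width
```

Each head sees the full sequence but projects it into a smaller subspace of width d_head, so different heads can weight the same tokens differently before their outputs are merged.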
