Question:
Examine how well word embedding models can answer analogy questions of the form “A is to B as C is to [what]?” (e.g. “Athens is to Greece as Oslo is to Norway”) using vector arithmetic. Create your own embeddings or use an online site such as lamyiowce.github.io/word2viz/ or bionlp-www.utu.fi/wv_demo/.
a. Which analogies work well, and which ones fail?
b. In handling Country:Capital analogies, is there a problem with ambiguous country names, like “Turkey”?
c. Do analogies start to fail for rarer words? For example, “one is to 1 as two is to [what]?” will reliably retrieve “2,” but how well does that hold for “ninety” or “thousand”?
d. What research papers can help you understand word embedding analogies?
e. What else can you explore?
Step by Step Answer:
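The analogy “A is to B as C is to [what]?” is typically answered by finding the vocabulary word whose vector is closest to vec(B) − vec(A) + vec(C). The sketch below shows one way to experiment with parts (a)–(c), assuming the gensim library and its downloadable glove-wiki-gigaword-100 vectors; the model name and the test words are illustrative choices, not prescribed by the exercise.

# Minimal sketch: answering "A is to B as C is to ?" by vector arithmetic.
# Assumes gensim and its pretrained GloVe vectors (lowercase vocabulary).
import gensim.downloader as api

wv = api.load("glove-wiki-gigaword-100")  # vectors are downloaded on first use

def analogy(a, b, c, topn=3):
    """Candidates for 'a is to b as c is to ?', i.e. words nearest to b - a + c."""
    return wv.most_similar(positive=[b, c], negative=[a], topn=topn)

# (a) A classic Country:Capital analogy that usually works:
print(analogy("athens", "greece", "oslo"))   # expect 'norway' near the top

# (b) An ambiguous country name: "turkey" is also a bird, so its vector
#     mixes both senses and the capital may rank lower than expected:
print(analogy("france", "paris", "turkey"))

# (c) Rarer words: digits for common number words usually work, but
#     quality tends to degrade as the words get rarer:
print(analogy("one", "1", "two"))       # reliably retrieves '2'
print(analogy("one", "1", "ninety"))
print(analogy("one", "1", "thousand"))

One detail worth noting when interpreting results: gensim's most_similar excludes the query words themselves from the candidates. Without that exclusion, the nearest neighbor of vec(B) − vec(A) + vec(C) is very often B or C itself, which would mask how well the vector arithmetic actually works.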
Source: Stuart Russell and Peter Norvig, Artificial Intelligence: A Modern Approach, 4th Edition. ISBN 9780134610993.