Bilingual Embeddings with Random Walks over Multilingual WordNets

This page contains the relevant material to reproduce the experiments in [1].

Replicating results

Embeddings

Here are the embeddings for all language pairs. The most relevant are MAPtxt (baseline) and JOINTChyb (our best system):

Creating new embeddings from scratch

If you want to build the embeddings above from scratch, or to build embeddings for other language pairs, follow the instructions here. The necessary resources, such as constraints, wordnets and mapping dictionaries, are available here.

Basque monolingual datasets

The Basque monolingual RG and WordSim353 similarity datasets used in [1] are available here.
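As a rough illustration of how similarity datasets of this kind are typically used, the sketch below scores word pairs by cosine similarity between their embeddings and compares the result with the gold scores using Spearman rank correlation. This is a minimal sketch under stated assumptions, not the evaluation code of [1]: this page does not say which correlation measure or file formats [1] uses, so the function names (`cosine`, `ranks`, `spearman`, `evaluate`) are hypothetical, and the input is assumed to be already parsed into a word-to-vector dictionary and a list of (word1, word2, gold score) triples.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if std_x == 0.0 or std_y == 0.0:
        return 0.0
    return cov / (std_x * std_y)


def spearman(x, y):
    """Spearman correlation, computed as Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def evaluate(embeddings, pairs):
    """Return (Spearman correlation, number of pairs actually scored).

    embeddings: dict mapping a word to a list of floats (assumed format).
    pairs: iterable of (word1, word2, gold_score) triples (assumed format).
    Pairs with an out-of-vocabulary word are skipped.
    """
    gold, predicted = [], []
    for word1, word2, score in pairs:
        if word1 in embeddings and word2 in embeddings:
            gold.append(score)
            predicted.append(cosine(embeddings[word1], embeddings[word2]))
    if len(gold) < 2:
        raise ValueError("need at least two in-vocabulary pairs")
    return spearman(gold, predicted), len(gold)
```

Skipping out-of-vocabulary pairs is only one possible policy; reporting the number of scored pairs alongside the correlation makes differences in coverage visible when comparing embedding sets.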

References

[1] Goikoetxea, J., Agirre, E., Soroa, A. Bilingual Embeddings with Random Walks over Multilingual WordNets. Under review.