|Authors:||Jeroen van Paridon, Bill Thompson|
|Updated:||Tue 24 December 2019|
|Keywords:||language, bigram, trigram, lexical norms, psycholinguistics, Afrikaans, Arabic, Bulgarian, Bengali, Breton, Bosnian, Catalan, Czech, Danish, German, Greek, English, Esperanto, Spanish, Estonian, Basque, Farsi, Finnish, French, Galician, Hebrew, Hindi, Croatian, Hungarian, Armenian, Indonesian, Icelandic, Italian, Georgian, Kazakh, Korean, Lithuanian, Latvian, Macedonian, Malayalam, Malay, Dutch, Norwegian, Polish, Portuguese, Romanian, Russian, Sinhala, Slovak, Slovenian, Albanian, Serbian, Swedish, Tamil, Telugu, Tagalog, Turkish, Ukrainian, Urdu, Vietnamese|
|Citation:||van Paridon, J., & Thompson, B. (2019, October 13). subs2vec: Word embeddings from subtitles in 55 languages. https://doi.org/10.31234/osf.io/fcrmy|
Python 3.7 scripts and command-line tools for evaluating a set of word vectors on semantic similarity, semantic and syntactic analogy, and lexical norm prediction tasks.
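
To make two of these tasks concrete, here is a minimal sketch of how a similarity evaluation and an analogy evaluation are commonly set up. It is not the code shipped with this project: the function names (`evaluate_similarity`, `solve_analogy`), the toy vectors, and the toy ratings are all hypothetical and chosen for illustration. It assumes similarity is scored as the Spearman rank correlation between cosine similarities and human ratings, and that analogies are solved by vector offset (often called 3CosAdd), which may differ from what the actual scripts do.

```python
import numpy as np


def cosine(u, v):
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def rank(values):
    """Ranks starting at 1; tied values share their average rank."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    for v in np.unique(values):
        tied = values == v
        ranks[tied] = ranks[tied].mean()
    return ranks


def spearman(x, y):
    """Spearman rank correlation, computed as the Pearson correlation of the ranks."""
    return float(np.corrcoef(rank(x), rank(y))[0, 1])


def evaluate_similarity(vectors, rated_pairs):
    """Correlate model similarities with human ratings, skipping out-of-vocabulary pairs."""
    predicted, human = [], []
    for w1, w2, rating in rated_pairs:
        if w1 in vectors and w2 in vectors:
            predicted.append(cosine(vectors[w1], vectors[w2]))
            human.append(rating)
    return spearman(predicted, human), len(human)


def solve_analogy(vectors, a, b, c):
    """Answer 'a is to b as c is to ?' by vector offset, excluding the query words."""
    target = vectors[b] - vectors[a] + vectors[c]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Toy 2-D vectors and ratings, invented purely for illustration.
toy_vectors = {
    "man": np.array([1.0, 0.0]),
    "woman": np.array([1.0, 1.0]),
    "king": np.array([3.0, 0.0]),
    "queen": np.array([3.0, 3.0]),
    "apple": np.array([-1.0, 2.0]),
}
toy_ratings = [
    ("man", "king", 9.0),
    ("man", "woman", 6.0),
    ("king", "apple", 1.0),
    ("queen", "zebra", 5.0),  # "zebra" is out of vocabulary, so this pair is skipped
]

rho, n_used = evaluate_similarity(toy_vectors, toy_ratings)
answer = solve_analogy(toy_vectors, "man", "woman", "king")
print(rho, n_used, answer)
```

Returning the number of pairs actually scored alongside the correlation is a deliberate choice in this sketch: when vectors for many languages are compared, vocabulary coverage can vary widely, and a correlation is hard to interpret without knowing how many items it was computed over.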