subs2vec
Authors: | Jeroen van Paridon, Bill Thompson |
---|---|
Updated: | Tue 30 April 2019 |
Source: | https://github.com/jvparidon/subs2vec |
Type: | GitHub Repository |
Languages: | cross-linguistic |
Keywords: | language, bigram, trigram, lexical norms, psycholinguistics, Afrikaans, Arabic, Bulgarian, Bengali, Breton, Bosnian, Catalan, Czech, Danish, German, Greek, English, Esperanto, Spanish, Estonian, Basque, Farsi, Finnish, French, Galician, Hebrew, Hindi, Croatian, Hungarian, Armenian, Indonesian, Icelandic, Italian, Georgian, Kazakh, Korean, Lithuanian, Latvian, Macedonian, Malayalam, Malay, Dutch, Norwegian, Polish, Portuguese, Romanian, Russian, Sinhala, Slovak, Slovenian, Albanian, Serbian, Swedish, Tamil, Telugu, Tagalog, Turkish, Ukrainian, Urdu, Vietnamese |
Open Access: | |
License: | |
Citation: | van Paridon, J., & Thompson, B. (2019, October 13). subs2vec: Word embeddings from subtitles in 55 languages. https://doi.org/10.31234/osf.io/fcrmy |
Summary: | Python 3.7 scripts and command-line tools to evaluate a set of word vectors on semantic similarity, semantic and syntactic analogy, and lexical norm prediction tasks. |
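
As a rough illustration of two of the task types named in the summary, the sketch below scores toy word vectors on semantic similarity and solves a single analogy. It is not subs2vec's own code or API: every function name, the toy vectors, and the ratings are hypothetical. It also assumes two conventions that are common in the literature but not confirmed by this record: cosine similarity compared against human ratings via Spearman rank correlation, and the vector-offset method for analogies.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def ranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def similarity_score(vectors, rated_pairs):
    """Correlate cosine similarities with human ratings.

    rated_pairs holds (word1, word2, rating) tuples; pairs containing a
    word that has no vector are skipped.
    """
    predicted, human = [], []
    for w1, w2, rating in rated_pairs:
        if w1 in vectors and w2 in vectors:
            predicted.append(cosine(vectors[w1], vectors[w2]))
            human.append(rating)
    return spearman(predicted, human)


def solve_analogy(vectors, a, b, c):
    """Answer 'a is to b as c is to ?' with the vector-offset method."""
    target = [vb - va + vc
              for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Toy three-dimensional vectors, invented for this example only.
toy_vectors = {
    "king": [1.0, 1.0, 0.0],
    "queen": [1.0, 0.0, 1.0],
    "man": [0.0, 1.0, 0.0],
    "woman": [0.0, 0.0, 1.0],
    "apple": [0.1, 0.1, 0.1],
}

# Made-up similarity ratings; "banana" has no vector, so that pair is skipped.
toy_ratings = [
    ("king", "queen", 8.0),
    ("man", "woman", 2.0),
    ("king", "apple", 3.0),
    ("king", "banana", 1.0),
]

print(similarity_score(toy_vectors, toy_ratings))
print(solve_analogy(toy_vectors, "man", "king", "woman"))
```

With real embeddings, the dictionary would hold one vector per vocabulary word, and the rated pairs would come from a published similarity dataset for the language being evaluated.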