subs2vec
Authors: | Jeroen van Paridon, Bill Thompson |
---|---|
Updated: | Tue 24 December 2019 |
Source: | https://github.com/jvparidon/subs2vec |
Type: | GitHub Repository |
Languages: | cross-linguistic |
Keywords: | language, bigram, trigram, lexical norms, psycholinguistics, Afrikaans, Arabic, Bulgarian, Bengali, Breton, Bosnian, Catalan, Czech, Danish, German, Greek, English, Esperanto, Spanish, Estonian, Basque, Farsi, Finnish, French, Galician, Hebrew, Hindi, Croatian, Hungarian, Armenian, Indonesian, Icelandic, Italian, Georgian, Kazakh, Korean, Lithuanian, Latvian, Macedonian, Malayalam, Malay, Dutch, Norwegian, Polish, Portuguese, Romanian, Russian, Sinhala, Slovak, Slovenian, Albanian, Serbian, Swedish, Tamil, Telugu, Tagalog, Turkish, Ukrainian, Urdu, Vietnamese |
Open Access: | |
License: | |
Citation: | van Paridon, J., & Thompson, B. (2019, October 13). subs2vec: Word embeddings from subtitles in 55 languages. https://doi.org/10.31234/osf.io/fcrmy |
Summary: | Python 3.7 scripts and command-line tools for evaluating a set of word vectors on semantic similarity, semantic and syntactic analogy, and lexical norm prediction tasks. |