Speech and Language Resource Bank
ALICE is a tool for estimating the number of adult-spoken linguistic units from child-centered audio recordings, as captured by microphones worn by children. It is meant as an open-source alternative for LENA adult word count (AWC) estimator.
Authors: Okko Räsänen,
Shreyas Seshadri,
Marvin Lavechin,
Alejandrina Cristia,
Marisa Casillas
Updated: 2021-11-02
Source: https://github.com/orasanen/ALICE
Keywords: language,
linguistics,
phonetics,
speech-production,
Argentinian Spanish,
Tseltal,
Yélî Dnye,
English
Python 3.7 resources to evaluate bigram and trigram frequencies in corpora.
Authors: Jeroen van Paridon,
Bill Thompson
Updated: 2019-12-24
Source: https://github.com/jvparidon/subs2vec
Keywords: language,
bigram,
trigram,
lexical norms,
psycholinguistics,
Afrikaans,
Arabic,
Bulgarian,
Bengali,
Breton,
Bosnian,
Catalan,
Czech,
Danish,
German,
Greek,
English,
Esperanto,
Spanish,
Estonian,
Basque,
Farsi,
Finnish,
French,
Galician,
Hebrew,
Hindi,
Croatian,
Hungarian,
Armenian,
Indonesian,
Icelandic,
Italian,
Georgian,
Kazakh,
Korean,
Lithuanian,
Latvian,
Macedonian,
Malayalam,
Malay,
Dutch,
Norwegian,
Polish,
Portuguese,
Romanian,
Russian,
Sinhala,
Slovak,
Slovenian,
Albanian,
Serbian,
Swedish,
Tamil,
Telugu,
Tagalog,
Turkish,
Ukranian,
Urdu,
Vietnamese