Speech and Language Resource Bank
A corpus of European Parliament speech and tools for machine learning models.
Authors: Changhan Wang,
Morgane Rivière,
Ann Lee,
Anne Wu,
Chaitanya Talnikar,
Daniel Haziza,
Mary Williamson,
Juan Pino,
Emmanuel Dupoux
Updated: 2021-04-30
Source: https://aclanthology.org/2021.acl-long.80/
Keywords: English,
German,
French,
Spanish,
Polish,
Italian,
Romanian,
Hungarian,
Czech,
Dutch,
Finnish,
Slovak,
Slovenian,
Estonian,
Lithuanian,
Portuguese,
Bulgarian,
Greek,
Latvian,
Maltese,
Swedish,
Danish,
speech synthesis,
machine learning,
accented speech
Python 3.7 resources to evaluate bigram and trigram frequencies in corpora.
Authors: Jeroen van Paridon,
Bill Thompson
Updated: 2019-04-30
Source: https://github.com/jvparidon/subs2vec
Keywords: language,
bigram,
trigram,
lexical norms,
psycholinguistics,
Afrikaans,
Arabic,
Bulgarian,
Bengali,
Breton,
Bosnian,
Catalan,
Czech,
Danish,
German,
Greek,
English,
Esperanto,
Spanish,
Estonian,
Basque,
Farsi,
Finnish,
French,
Galician,
Hebrew,
Hindi,
Croatian,
Hungarian,
Armenian,
Indonesian,
Icelandic,
Italian,
Georgian,
Kazakh,
Korean,
Lithuanian,
Latvian,
Macedonian,
Malayalam,
Malay,
Dutch,
Norwegian,
Polish,
Portuguese,
Romanian,
Russian,
Sinhala,
Slovak,
Slovenian,
Albanian,
Serbian,
Swedish,
Tamil,
Telugu,
Tagalog,
Turkish,
Ukrainian,
Urdu,
Vietnamese