
subs2vec

Python 3.7 resources to evaluate bigram and trigram frequencies in corpora.

Authors:  Jeroen van Paridon, Bill Thompson
Updated:  2019-12-24
Source:  https://github.com/jvparidon/subs2vec
Keywords:  language, bigram, trigram, lexical norms, psycholinguistics, Afrikaans, Arabic, Bulgarian, Bengali, Breton, Bosnian, Catalan, Czech, Danish, German, Greek, English, Esperanto, Spanish, Estonian, Basque, Farsi, Finnish, French, Galician, Hebrew, Hindi, Croatian, Hungarian, Armenian, Indonesian, Icelandic, Italian, Georgian, Kazakh, Korean, Lithuanian, Latvian, Macedonian, Malayalam, Malay, Dutch, Norwegian, Polish, Portuguese, Romanian, Russian, Sinhala, Slovak, Slovenian, Albanian, Serbian, Swedish, Tamil, Telugu, Tagalog, Turkish, Ukrainian, Urdu, Vietnamese

open  

SLABank

SLABank is a component of TalkBank dedicated to providing corpora for the study of second language acquisition.

Authors:  Brian MacWhinney
Updated:  2018-05-04
Source:  https://slabank.talkbank.org/
Keywords:  language-acquisition, second-language, Czech, English, French, German, Hungarian, Icelandic, Italian, Mandarin, Spanish

open  

Nijmegen Corpus of Casual Czech

The Nijmegen Corpus of Casual Czech contains 30 hours of high-quality recordings featuring 60 Czech speakers conversing among friends.

Authors:  Mirjam Ernestus, Lucie Kočková-Amortová, Petr Pollak
Updated:  2014-05-24
Source:  https://mirjamernestus.nl/Ernestus/NCCCz/index.php
Keywords:  language, communication, phonetics, Czech
