Speech and Language Resource Bank

subs2vec

Python 3.7 resources to evaluate bigram and trigram frequencies in corpora.

Authors: Jeroen van Paridon, Bill Thompson
Updated: 2019-04-30
Source: https://github.com/jvparidon/subs2vec
Keywords: language, bigram, trigram, lexical norms, psycholinguistics, Afrikaans, Arabic, Bulgarian, Bengali, Breton, Bosnian, Catalan, Czech, Danish, German, Greek, English, Esperanto, Spanish, Estonian, Basque, Farsi, Finnish, French, Galician, Hebrew, Hindi, Croatian, Hungarian, Armenian, Indonesian, Icelandic, Italian, Georgian, Kazakh, Korean, Lithuanian, Latvian, Macedonian, Malayalam, Malay, Dutch, Norwegian, Polish, Portuguese, Romanian, Russian, Sinhala, Slovak, Slovenian, Albanian, Serbian, Swedish, Tamil, Telugu, Tagalog, Turkish, Ukrainian, Urdu, Vietnamese

open documented

PBCM: Code-mixed Hindi-English corpus

A multi-speaker, code-mixed Hindi-English read-speech corpus.

Authors: Ayushi Pandey, Brij Mohan Lal Srivastava, Rohit Kumar, BT Nellore, KS Teja, SV Gangashetty
Updated: 2018-04-30
Source: https://brijmohan.github.io/publication/pbcm-lrec18/
Keywords: code-mixing, hindi-english, hindi, english, multi-speaker

open

Natural Sounds Stimulus Set

The set contains 165 natural sounds, each 2 seconds in duration. It was intended to include many of the sounds people commonly hear in daily life.

Authors: Sam Norman-Haignere, Nancy G. Kanwisher, Josh H. McDermott
Updated: 2015-11-30
Source: http://mcdermottlab.mit.edu/svnh/Natural-Sound/Stimuli.html
Keywords: speech, music, stimuli, frequency, German, French, Italian, Russian, Hindi, Chinese
