Tatool

Tatool (Training and Testing Tool) was developed to assist researchers with programming training software, experiments, and questionnaires.

Authors: Claudia C. von Bastian, André Locher, Michael Ruflin
Updated: 2022-01-30
Source: http://www.tatool.ch/
Keywords: psychology, experiment, Java, software, English, German

Psycholinguistic norms for more than 300 lexical signs in German Sign Language (DGS)

The first psycholinguistic norms for iconicity, age of acquisition, frequency, and transparency for more than 300 lexical signs in German Sign Language (Deutsche Gebärdensprache, DGS).

Authors: Patrick C. Trettenbrein, Nina-Kristin Pendzich, Jens-Michael Cramer, Markus Steinbach, Emiliano Zaccarella
Updated: 2021-11-26
Source: https://osf.io/mz8j4/
Keywords: psycholinguistics, lexicon, database, German-Sign-Language, German, English

LinguaPix

LinguaPix is a database of picture naming norms. 1,620 colour photographs of items spanning 42 semantic categories were named and rated by a group of German speakers (and are currently being evaluated by groups of Dutch, English, Polish, and Cantonese speakers).

Authors: Agnieszka Ewa Krautz, Emmanuel Keuleers, Gabriella Rundblad, Susanna Yeung
Updated: 2021-04-30
Source: https://linguapix.uni-mannheim.de/frontend/web/
Keywords: psychology, linguistics, semantics, audio, picture-naming, English, German, Polish

Vox Populi

A corpus of European Parliamentary speech and tools for machine learning models.

Authors: Changhan Wang, Morgane Riviere, Ann Lee, Anne Wu, Chaitanya Talnikar, Daniel Haziza, Mary Williamson, Juan Pino, Emmanuel Dupoux
Updated: 2021-04-30
Source: https://aclanthology.org/2021.acl-long.80/
Keywords: English, German, French, Spanish, Polish, Italian, Romanian, Hungarian, Czech, Dutch, Finnish, Slovak, Slovenian, Estonian, Lithuanian, Portuguese, Bulgarian, Greek, Latvian, Maltese, Swedish, Danish, speech synthesis, machine learning, Accented Speech

childLex

childLex is based on a corpus of children’s books and comprises 10 million words that were syntactically annotated and lemmatized. childLex reports linguistic norms for lexical, superlexical, and sublexical variables in three different age groups: 6–8 (grades 1–2), 9–10 (grades 3–4), and 11–12 years (grades 5–6).

Authors: Sascha Schroeder, Kay-Michael Würzner, Julian Heister, Alexander Geyken, Reinhold Kliegl
Updated: 2021-03-02
Source: https://osf.io/m59uv/
Keywords: language, lexicon, reading-development, linguistics, German

The Emotion Recognition Task

The ERT is a computerized task to assess the perception of facial expressions. The task presents morphed facial expressions that gradually increase in intensity.

Authors: Barbara Montagne, Roy Kessels, David Perrett, Edward de Haan
Updated: 2020-04-30
Source: https://www.emotionrecognitiontask.com/
Keywords: emotion, psychology, neuropsychology, cognition, Dutch, English, German, French, Spanish, Finnish, Italian, Russian, Lithuanian, Greek, Portuguese, Turkish

The Karl Eberhards Corpus of spontaneously spoken southern German in dialogues: Audio and articulatory recordings

40 one-hour recordings of two friends talking to each other.

Authors: Denis Arnold, Fabian Tomaschek
Updated: 2019-11-29
Source: http://hdl.handle.net/11022/1009-0000-0007-DADB-D
Keywords: spontaneous-speech, spoken-corpus, electromagnetic-articulography, German

CMUSphinx4

CMU Sphinx is a set of speech recognition development libraries and tools that can be linked in to speech-enable applications.

Authors: Evandro Gouvea, Peter Gorniak, Philip Kwok, Paul Lamere, Beth Logan, Pedro Moreno, Bhiksha Raj, Mosur Ravishankar, Bent Schmidt-Nielsen, Rita Singh, JM Van Thong, Willie Walker, Manfred Warmuth, Joe Woelfel, Peter Wolf
Updated: 2019-10-23
Source: https://cmusphinx.github.io/
Keywords: speech, programming, Java, experiment, English, French, Mandarin, German, Dutch, Russian

subs2vec

Python 3.7 resources to evaluate bigram and trigram frequencies in corpora.

Authors: Jeroen van Paridon, Bill Thompson
Updated: 2019-04-30
Source: https://github.com/jvparidon/subs2vec
Keywords: language, bigram, trigram, lexical norms, psycholinguistics, Afrikaans, Arabic, Bulgarian, Bengali, Breton, Bosnian, Catalan, Czech, Danish, German, Greek, English, Esperanto, Spanish, Estonian, Basque, Farsi, Finnish, French, Galician, Hebrew, Hindi, Croatian, Hungarian, Armenian, Indonesian, Icelandic, Italian, Georgian, Kazakh, Korean, Lithuanian, Latvian, Macedonian, Malayalam, Malay, Dutch, Norwegian, Polish, Portuguese, Romanian, Russian, Sinhala, Slovak, Slovenian, Albanian, Serbian, Swedish, Tamil, Telugu, Tagalog, Turkish, Ukrainian, Urdu, Vietnamese

SLABank

SLABank is a component of TalkBank dedicated to providing corpora for the study of second language acquisition.

Authors: Brian MacWhinney
Updated: 2018-05-04
Source: https://slabank.talkbank.org/
Keywords: language-acquisition, second-language, Czech, English, French, German, Hungarian, Icelandic, Italian, Mandarin, Spanish