LinguaPix

LinguaPix is a database of picture naming norms. 1,620 colour photographs of items spanning 42 semantic categories were named and rated by a group of German speakers (and are currently being evaluated by groups of Dutch, English, Polish, and Cantonese speakers).

Authors:  Agnieszka Ewa Krautz, Emmanuel Keuleers, Gabriella Rundblad, Susanna Yeung
Updated:  2021-09-22
Keywords:  psychology, linguistics, semantics, audio, picture-naming, English, German, Polish



ASL-LEX

An online database of the lexical and phonological properties of American Sign Language.

Authors:  Naomi Caselli, Karen Emmorey, Zed Sevcikova Sehyr, Ariel Cohen-Goldberg, Cindy O'Grady Farnady
Updated:  2020-12-22
Keywords:  American-Sign-Language, lexicon, phonology, morpheme, semantics

open   documented  


SemDis

SemDis uses advances in natural language processing to automatically determine how closely related two texts are. Higher SemDis scores indicate that two texts are less related, that is, they express more distant ideas or concepts.

Authors:  Dan Johnson & Roger Beaty
Updated:  2020-09-22
Keywords:  language, semantics, word-recognition, English
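
The idea of a distance score that grows as two texts become less related can be illustrated with cosine distance between vectors. This is only a minimal sketch: the listing does not say how SemDis computes its scores, and the function name `cosine_distance` and the three toy vectors below are hypothetical values made up for illustration, not SemDis output.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


# Hypothetical toy embeddings, invented for this example only.
cat = [0.9, 0.1, 0.0]
dog = [0.8, 0.2, 0.1]
carburetor = [0.0, 0.1, 0.9]

near = cosine_distance(cat, dog)         # related concepts: small distance
far = cosine_distance(cat, carburetor)   # unrelated concepts: larger distance

print(f"cat vs dog:        {near:.3f}")
print(f"cat vs carburetor: {far:.3f}")
```

Under these assumed vectors, the unrelated pair receives the higher score, which matches the direction of the scale described in the entry (higher means more distant).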


Auditory English Lexicon Project

The Auditory English Lexicon Project (AELP) is a multi-talker, multi-region psycholinguistic database of 10,170 spoken words and 10,170 spoken nonwords.

Authors:  Winston D. Goh, Melvin J. Yap, Qian Wen Chee
Updated:  2019-09-22
Keywords:  psycholinguistics, database, lexicon, audition, semantics, English

open   documented  

Linguistic Annotated Bibliography

The Linguistic Annotated Bibliography is a database of linguistic norms, programs, and calculations, sorted by language and by stimulus type (single words and paired words).

Authors:  Erin Buchanan and Addie Wikowsky
Updated:  2018-09-22
Keywords:  linguistics, phonology, morphology, semantics, experiments, database


The Hansard Corpus

A corpus of the speeches given in the British Parliament from 1803-2005.

Authors:  Marc Alexander, Fraser Dallachy, Stephen Wattam, Paul Rayson, Mark Davies
Updated:  2016-09-22
Keywords:  English, semantics, language, linguistics, corpora, collocates, word-frequency


The CSLB concept property norms

The Centre for Speech, Language and the Brain (CSLB) Concept Property Norms are a publicly available resource for researchers, including those interested in semantic feature representations of conceptual knowledge. The resource currently provides semantic properties and associated production frequency data for 638 concrete concepts, with data for each concept collected from 30 participants.

Authors:  Barry J. Devereux, Lorraine K. Tyler, Jeroen Geertzen, Billi Randall
Updated:  2014-09-22
Keywords:  semantics, linguistics, psychology, word-norms, frequency, English


The World Atlas of Language Structures

The World Atlas of Language Structures (WALS) is a large database of structural (phonological, grammatical, lexical) properties of languages gathered from descriptive materials (such as reference grammars) by a team of 55 authors.

Authors:  Matthew S. Dryer, Martin Haspelmath, et al.
Updated:  2013-09-22
Keywords:  phonology, semantics, linguistics, grammar, lexical, database, language-structure


Warriner English Affective Ratings

We have collected affective norms of valence, arousal, and dominance for 13,915 English words (lemmas). They complement our age-of-acquisition ratings and subtitle word frequencies.

Authors:  Marc Brysbaert, Victor Kuperman, and Amy Warriner
Updated:  2013-01-05
Keywords:  semantics, crowdsourcing, word-frequency, English, emotion