Corpus Gesproken Nederlands (The Spoken Dutch Corpus)

Authors: W.J.M. Levelt, S.G. Nooteboom, J. Bil, G.E. Booij, P. Dengis, E. DeWallef, A. Hulk, B. Krekels, C. Lucas, D. Van Compernolle, W. Vonk
Updated: Thu 24 July 2014
Source: https://ivdnt.org/images/stories/producten/documentatie/cgn_website/doc_English/topics/index.htm
Type: database
Languages: Dutch
Keywords: language, phonology, syntax, word-frequency, Dutch
Open Access: yes
License:
Documentation: https://ivdnt.org/images/stories/producten/documentatie/cgn_website/doc_English/topics/index.htm
Citation: Corpus Spoken Dutch - CGN (Version 2.0.3) (2014) [Data set]. Available at the Dutch Language Institute: http://hdl.handle.net/10032/tm-a2-k6
Summary:

A collection of 900 hours (almost 9 million words) of contemporary spoken Dutch from native speakers in Flanders and the Netherlands. The speech recordings are aligned with several transcriptions (e.g. orthographic, phonetic) and annotations (syntax, POS tags). Also included are metadata, lexica, frequency lists, and Corex, a tool for exploring the data.