Corpus Gesproken Nederlands (The Spoken Dutch Corpus)

Authors:	W.J.M. Levelt, S.G. Nooteboom, J. Bil, G.E. Booij, P. Dengis, E. DeWallef, A. Hulk, B. Krekels, C. Lucas, D. Van Compernolle, W. Vonk
Updated:	Wed 30 July 2014
Source:	https://ivdnt.org/images/stories/producten/documentatie/cgn_website/doc_English/topics/index.htm
Type:	database
Languages:	Dutch
Keywords:	language, phonology, syntax, word-frequency, Dutch
Open Access:	yes
License:
Documentation:	https://ivdnt.org/images/stories/producten/documentatie/cgn_website/doc_English/topics/index.htm
Citation:	Corpus Spoken Dutch - CGN (Version 2.0.3) (2014) [Data set]. Available at the Dutch Language Institute: http://hdl.handle.net/10032/tm-a2-k6
Summary:	A collection of 900 hours (almost 9 million words) of contemporary spoken Dutch from native speakers in Flanders and the Netherlands. The speech recordings are aligned with several transcriptions (e.g. orthographic, phonetic) and annotations (syntax, POS tags). Also included are metadata, lexica, frequency lists, and Corex, a tool for exploring the data.