SoNaR
Authors: | Nelleke Oostdijk, Martin Reynaert |
---|---|
Updated: | Thu 01 December 2011 |
Source: | https://lands.let.ru.nl/projects/SoNaR/ |
Type: | text corpus |
Languages: | Dutch |
Keywords: | language, linguistics, speech, Dutch |
Open Access: | yes |
License: | none |
Documentation: | https://portal.clarin.nl/node/4195 |
Publications: | Oostdijk, N., Reynaert, M., Hoste, V., Schuurman, I. (2013). The Construction of a 500 Million Word Reference Corpus of Contemporary Written Dutch. In: P. Spyns, J. Odijk (eds.), Essential Speech and Language Technology for Dutch: Results by the STEVIN-programme. Springer Verlag. |
Citation: | Oostdijk, N. (2011). SoNaR: STEVIN Dutch Reference Corpus. STEVIN Program. https://lands.let.ru.nl/projects/SoNaR/ |
Summary: | The SoNaR project aims to build a large corpus (at least 500 million words) of contemporary written Dutch that can serve as a general reference for all kinds of research into language and language use. This includes descriptive research (as reflected in, e.g., dictionaries and grammars) as well as research in language and speech technology. Such research depends on large amounts of data being available in a form that researchers can process with their own software. |