Corpus of Contemporary American English

Authors: Mark Davies
Updated: Sun 29 March 2020
Type: English corpus
Languages: English
Keywords: English, corpora, linguistics, frequency-data, word-form
Open Access: yes
Publications: Davies, M. (2010). The Corpus of Contemporary American English as the first reliable monitor corpus of English. Literary and Linguistic Computing, 25(4), 447–464.
Citation: Davies, M. (2020). Corpus of Contemporary American English. English-Corpora.

The corpus contains more than one billion words of text (25+ million words each year, 1990-2019) from eight genres: spoken, fiction, popular magazines, newspapers, academic texts, and (with the March 2020 update) TV and movie subtitles, blogs, and other web pages. It differs from most other corpora in the attention it gives to the top 60,000 words in the corpus and in the wide range of information it provides for each word, including frequency information, definitions, synonyms, WordNet entries, related topics, concordances (new display in COCA), clusters, websites that have the word as a “keyword”, and KWIC/concordance lines.