Corpus of Contemporary American English

Authors: Mark Davies
Updated: Mon 30 March 2020
Source: https://www.english-corpora.org/coca/
Type: English corpus
Languages: English
Keywords: English, corpora, linguistics, frequency-data, word-form
Open Access: yes
License:
Publications: Davies, M. (2010). The Corpus of Contemporary American English as the first reliable monitor corpus of English. Literary and Linguistic Computing, 25(4), 447–464. https://doi.org/10.1093/llc/fqq018
Citation: Davies, M. (2020). Corpus of Contemporary American English. English-Corpora. https://www.english-corpora.org/coca/
Summary:

The corpus contains more than one billion words of text (25+ million words for each year from 1990 to 2019) from eight genres: spoken, fiction, popular magazines, newspapers, academic texts, and, since the March 2020 update, TV and movie subtitles, blogs, and other web pages. It differs from most of the other corpora on English-Corpora.org in the attention it gives to the top 60,000 words in the corpus and in the wide range of information it provides for each word: frequency information, definitions, synonyms, WordNet entries, related topics, concordances (new display in COCA), clusters, websites that have the word as a “keyword”, and KWIC/concordance lines.