Corpus of Contemporary American English
Authors: | Mark Davies |
---|---|
Updated: | Mon 30 March 2020 |
Source: | https://www.english-corpora.org/coca/ |
Type: | English corpus |
Languages: | English |
Keywords: | English, corpora, linguistics, frequency-data, word-form |
Open Access: | yes |
License: | |
Publications: | Davies, M. (2010). The Corpus of Contemporary American English as the first reliable monitor corpus of English. Literary and Linguistic Computing, 25(4), 447–464. https://doi.org/10.1093/llc/fqq018 |
Citation: | Davies, M. (2020). Corpus of Contemporary American English. English-Corpora. https://www.english-corpora.org/coca/ |
Summary: | The corpus contains more than one billion words of text (25+ million words per year, 1990–2019) from eight genres: spoken, fiction, popular magazines, newspapers, academic texts, and (since the March 2020 update) TV and movie subtitles, blogs, and other web pages. It differs from most other corpora on English-Corpora.org in the attention it gives to the top 60,000 words in the corpus and in the wide range of information it provides for each word: frequency information, definitions, synonyms, WordNet entries, related topics, concordances (new display in COCA), clusters, websites that have the word as a “keyword”, and KWIC/concordance lines. |