Friday, June 27, 2014

differences - When should I use "corpuses" over "corpora"?



I've run into a situation where I need to use the plural form of corpus, but I'm a bit confused about which plural form to use.



Merriam-Webster says the only plural form is corpora, for all senses of the word. However, Random House/Dictionary.com says it's corpora for every sense except the linguistic sense:




Linguistics. a body of utterances, as words or sentences, assumed to be representative of and used for lexical, grammatical, or other linguistic analysis.





In this sense, it's corpuses.



In my specific situation, I have a collection of data—to be processed—about a class of objects that can be swapped out at will (thus the reference to more than one corpus). To me, this satisfies the first sense of the word ("a large or complete collection of writings")—where the plural is corpora—as well as the linguistic sense of the word—where the plural is corpuses.



So what's the significant difference between these two senses of the word? When is corpora correct, and when is corpuses correct?


Answer



The OED records corpora as the only plural, and that’s all I’ve ever seen in a linguistics context, or in any other for that matter. The entire OED has 71 citations that include corpora (admittedly with various meanings) and only one that includes corpuses. Corpus data also shows that corpora is far more frequent than corpuses. Still, corpuses certainly exists, with no apparent difference in meaning. If you’re conservative, use corpora. If you’re feeling adventurous, use corpuses.

