Monday, July 11, 2016

etymology - Is it true that the 100 most common English words are all Germanic in origin?



There is an oft-quoted statement that the 100 most common (frequently used) words in the English language are entirely Germanic/Anglo-Saxon in origin. (Also sometimes said is that ~80% of the 1000 most common are Germanic in origin.) While this did not surprise me so much, I did recently stumble across this Wikipedia page, which lists the supposed 100 most common words, with an attributed source.



A quick glance suggested (to my surprise) several words of non-Germanic (specifically, Latin) origin:





  • use

  • person

  • just

  • because (the cause part)



There may be others I've missed too? Indeed, perhaps due to the entry of Latin words into the Germanic languages in the proto-Germanic period (and the fact they are both ultimately Indo-European languages) some of the etymologies may be uncertain. Do correct me if that's not the case, as I am no historical linguist.




Clearly, depending on the statistical sample used to compile the list, results can vary. However, is there any accepted/standard list of the 100 most common English words? And moreover, is it a myth that they're all Germanic in origin (as I now doubt)?


Answer




is there any accepted/standard list of the 100 most common English words?




I suppose it all depends on your definition of authoritative, but I think a good start is The Oxford English Corpus, a collection containing over 2 billion words of 21st-century English from around the world. Here's a list of facts about the corpus, including the 100 commonest words in the English language.



Neat facts about distribution: 10 lemmas (base forms of words; is and are are both forms of the lemma be) make up 25% of the corpus, 100 make up 50%, 1000 make up 75%, 7000 make up 90%, 50,000 make up 95%, and you need over a million to get 99% coverage.




So, one quarter of all words used are the, be, to, of, and, a, in, that, have, and I.
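The coverage figures above are cumulative shares: the counts of the n most frequent lemmas divided by the total number of tokens. A minimal sketch of that arithmetic follows, assuming the text has already been reduced to lemmas. The coverage helper and the toy sentence are illustrative only and are not taken from the Oxford English Corpus.

```python
from collections import Counter


def coverage(counts, n):
    """Return the share of all tokens covered by the n most frequent lemmas."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top = sum(count for _, count in counts.most_common(n))
    return top / total


# Toy sample (an assumption for illustration), already lemmatised:
# "is"/"are" would both be counted as "be".
tokens = "the cat be on the mat and the dog be in the house".split()
counts = Counter(tokens)

for n in (1, 2, len(counts)):
    print(f"top {n} lemmas cover {coverage(counts, n):.0%} of the sample")
```

Run against a real lemmatised frequency list, the same function would show how quickly coverage flattens out as n grows.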




Is it a myth that they're all Germanic in origin (as I now doubt)?




Yeah, most of them are Germanic in origin, but not all.



As you noted:




use is of Latin origin (by way of French) and replaced the O.E. verb brucan (which survives as the verb brook "to tolerate, put up with something unpleasant")



because is partly of Latin origin: it comes from the Middle English phrase bi cause "by cause," in which cause derives from Latin causa by way of French.



and



people is also Latin by way of French.



Those are the only words that jumped out at me. Of course, most of the common words are of Indo-European origin, so many of them ultimately share a root with their Latin counterparts anyway. Compare English two and Latin duo.

