|Native speakers||360 million (2010); L2: 375 million; EFL: 750 million
|Writing system||Latin script (English alphabet)
|Official language in||54 countries; 27 non-sovereign entities
Map legend: countries where English is an official or de facto official language, or national language, and is spoken natively by the majority of the population; countries where it is an official but not primary language.
English is a West Germanic language that was first spoken in early medieval England and is now the most widely used language in the world. It is spoken as a first language by the majority populations of several sovereign states, including the United Kingdom, the United States, Canada, Australia, Ireland, New Zealand and a number of Caribbean nations. It is the third-most-common native language in the world, after Mandarin Chinese and Spanish. It is widely learned as a second language and is an official language of the European Union, many Commonwealth countries and the United Nations, as well as of many world organisations.
English arose in the Anglo-Saxon kingdoms of England and what is now southeast Scotland. Following the extensive influence of Great Britain and the United Kingdom from the 17th century to the mid-20th century, through the British Empire, and also of the United States since the mid-20th century, it has been widely propagated around the world, becoming the leading language of international discourse and the lingua franca in many regions.
Historically, English originated from the fusion of closely related dialects, now collectively termed Old English, which were brought to the eastern coast of Great Britain by Germanic settlers (Anglo-Saxons) by the 5th century – with the word English being derived from the name of the Angles, and ultimately from their ancestral region of Angeln (in what is now Schleswig-Holstein). A significant number of English words are constructed on the basis of roots from Latin, because Latin in some form was the lingua franca of the Christian Church and of European intellectual life. The language was further influenced by the Old Norse language because of Viking invasions in the 9th and 10th centuries.
The Norman conquest of England in the 11th century gave rise to heavy borrowings from Norman-French, and vocabulary and spelling conventions began to give the appearance of a close relationship with Romance languages to what had then become Middle English. The Great Vowel Shift that began in the south of England in the 15th century is one of the historical events that mark the emergence of Modern English from Middle English.
Owing to the assimilation of words from many other languages throughout history, modern English contains a very large vocabulary, with complex and irregular spelling, particularly of vowels. Modern English has assimilated words not only from other European languages but from all over the world. The Oxford English Dictionary lists over 250,000 distinct words, not including many technical, scientific, and slang terms.
Modern English, sometimes described as the first global lingua franca, is the dominant language or in some instances even the required international language of communications, science, information technology, business, seafaring, aviation, entertainment, radio and diplomacy. Its spread beyond the British Isles began with the growth of the British Empire, and by the late 19th century its reach was truly global. Following British colonisation from the 16th to 19th centuries, it became the dominant language in the United States, Canada, Australia and New Zealand. The growing economic and cultural influence of the US and its status as a global superpower since World War II have significantly accelerated the language's spread across the planet. English replaced German as the dominant language of science Nobel Prize laureates during the second half of the 20th century. English equalled and may have surpassed French as the dominant language of diplomacy during the last half of the 19th century.
A working knowledge of English has become a requirement in a number of fields, occupations and professions such as medicine and computing; as a consequence over a billion people speak English to at least a basic level (see English as a foreign or second language). It is one of six official languages of the United Nations.
One impact of the growth of English is the reduction of native linguistic diversity in many parts of the world. Its influence continues to play an important role in language attrition. Conversely, the natural internal variety of English, along with creoles and pidgins, has the potential to produce new distinct languages from English over time.
English originated in those dialects of North Sea Germanic that were carried to Britain by Germanic settlers from various parts of what are now the Netherlands, northwest Germany, and Denmark. Up to that point, in Roman Britain the native population is assumed to have spoken the Celtic language Brythonic alongside the acrolectal influence of Latin, from the 400-year Roman occupation.
One of these incoming Germanic tribes was the Angles, whom Bede believed to have relocated entirely to Britain. The names 'England' (from Engla land "Land of the Angles") and English (Old English Englisc) are derived from the name of this tribe—but Saxons, Jutes and a range of Germanic peoples from the coasts of Frisia, Lower Saxony, Jutland and Southern Sweden also moved to Britain in this era.
Initially, Old English was a diverse group of dialects, reflecting the varied origins of the Anglo-Saxon kingdoms of Great Britain. One of these dialects, Late West Saxon, eventually came to dominate, and it is in this dialect that the poem Beowulf is written.
Old English was later transformed by two waves of invasion. The first was by speakers of the North Germanic language branch, when Halfdan Ragnarsson and Ivar the Boneless began the conquest and colonisation of northern parts of the British Isles in the 8th and 9th centuries (see Danelaw). The second was by speakers of the Romance language Old Norman in the 11th century with the Norman conquest of England. Norman developed into Anglo-Norman, and then Anglo-French, and introduced a layer of words, especially via the courts and government. As well as extending the lexicon with Scandinavian and Norman words, these two events also simplified the grammar and transformed English into a borrowing language, one more than normally open to accepting new words from other languages.
Throughout all this period Latin in some form was the lingua franca of European intellectual life, first the Medieval Latin of the Christian Church, but later the humanist Renaissance Latin, and those that wrote or copied texts in Latin commonly coined new terms from Latin to refer to things or concepts for which there was no existing native English word.
Modern English, which includes the works of William Shakespeare and the King James Bible, is generally dated from about 1550, and after the United Kingdom became a colonial power, English served as the lingua franca of the colonies of the British Empire. In the post-colonial period, some of the newly created nations that had multiple indigenous languages opted to continue using English as the lingua franca to avoid the political difficulties inherent in promoting any one indigenous language above the others. As a result of the growth of the British Empire, English was adopted in North America, India, Africa, Australia and many other regions, a trend extended with the emergence of the United States as a superpower in the mid-20th century.
The English language belongs to the Anglo-Frisian sub-group of the West Germanic branch of the Germanic family, a member of the Indo-European languages. Modern English is the direct descendant of Middle English, itself a direct descendant of Old English, a descendant of Proto-Germanic. Typical of most Germanic languages, English is characterised by the use of modal verbs, the division of verbs into strong and weak classes, and common sound shifts from Proto-Indo-European known as Grimm's Law. The closest living relatives of English are Scots (spoken primarily in Scotland and parts of Northern Ireland where Ulster Scots is spoken) and Frisian (spoken on the southern fringes of the North Sea in Denmark, the Netherlands, and Germany).
After Scots and Frisian come those Germanic languages that are more distantly related: the non-Anglo-Frisian West Germanic languages (Dutch, Afrikaans, Low German, High German), and the North Germanic languages (Swedish, Danish, Norwegian, Icelandic, and Faroese). With the (partial) exception of Scots, none of the other languages are mutually intelligible with English, owing in part to the divergences in lexis, syntax, semantics, and phonology, and to the isolation afforded to the English language by the British Isles, although some, such as Dutch, do show strong affinities with English, especially to earlier stages of the language. Isolation has allowed English and Scots (as well as Icelandic and Faroese) to develop independently of the Continental Germanic languages and their influences over time.
In addition to isolation, lexical differences between English and other Germanic languages exist due to diachronic change, semantic drift, and substantial borrowing in English of words from other languages, especially Latin and French (though borrowing is in no way unique to English). For example, compare "exit" (Latin), vs. Dutch uitgang and German Ausgang (literally "out-going", though outgang continues to survive dialectally) and "change" (French) vs. Dutch andering and German Änderung (literally "elsing, othering", i.e. "alteration"); "movement" (French) vs. Dutch beweging and German Bewegung ("beway-ing", i.e. "proceeding along the way"); etc. With the exception of exit (a Modern English borrowing), Middle English had already distanced itself from other Germanic languages, having the terms wharf, schift (="shift"), and wending for "change"; and already by Old English times the word bewegan meant "to cover, envelop", rather than "to move". Preference of one synonym over another also causes differentiation in lexis, even where both words are Germanic, as in English care vs. German Sorge. The two words descend from Proto-Germanic *karō and *surgō respectively, but *karō has become the dominant word in English for "care" while in German, Dutch, and Scandinavian languages, the *surgō root prevailed. *Surgō still survives in English, however, as sorrow.
Despite extensive lexical borrowing, the workings of the English language are resolutely Germanic, and English remains classified as a Germanic language due to its structure and grammar. Borrowed words get incorporated into a Germanic system of conjugation, declension, and syntax, and behave exactly as though they were native Germanic words from Old English. For example, the word reduce is borrowed from Latin redūcere; however, in English one says "I reduce – I reduced – I will reduce" rather than "redūcō – redūxī – redūcam"; likewise, we say "John's life insurance company" (cf. Dutch "Johns levensverzekeringsmaatschappij" [= leven (life) + verzekering (insurance) + maatschappij (company)]) rather than "the company of insurance life of John" (cf. French la compagnie d'assurance-vie de John). Furthermore, in English, all basic grammatical particles added to nouns, verbs, adjectives, and adverbs are Germanic. For nouns, these include the normal plural marker -s/-es (apple – apples; cf. Frisian appel – appels; Dutch appel – appels; Afrikaans appel – appels), and the possessive markers -'s (Brad's hat; German Brads Hut; Danish Brads hat) and -s'. For verbs, these include the third person present ending -s/-es (e.g. he stands/he reaches), the present participle ending -ing (cf. Dutch -ende; German -end(e)), the simple past tense and past participle ending -ed (Swedish -ade/-ad), and the formation of the English infinitive using to (e.g. "to drive"; cf. Old English tō drīfenne; Dutch te drijven; Low German to drieven; German zu treiben). Adverbs generally receive an -ly ending (cf. German -lich; Swedish -ligt), and adjectives and adverbs are inflected for the comparative and superlative using -er and -est (e.g. hard/harder/hardest; cf. Dutch hard/harder/hardst), or through a combination with more and most (cf. Swedish mer and mest).
These particles append freely to all English words regardless of origin (tsunamis; communicates; to buccaneer; during; calmer; bizarrely) and all derive from Old English. Even the lack or absence of affixes, known as zero or null (-Ø) affixes, derives from endings which previously existed in Old English (usually -e, -a, -u, -o, -an, etc.), which later weakened to -e and have since ceased to be pronounced and spelt (e.g. Modern English "I sing" = I sing-Ø < I singe < Old English ic singe; "we thought" = we thought-Ø < we thoughte(n) < Old English wē þōhton).
Due to the Viking colonisation and influence of Old Norse upon Middle English, English syntax follows a pattern similar to that of North Germanic languages (Danish, Swedish, Icelandic, etc.) in contrast to other West Germanic languages, such as Dutch and German. This is especially evident in the order and placement of verbs. For example, English "I will never see you again" = Danish "Jeg vil aldrig se dig igen"; Icelandic "Ég mun aldrei sjá þig aftur", whereas in Dutch and German the main verb is placed at the end (e.g. Dutch "Ik zal je nooit weer zien"; German "Ich werde dich nie wieder sehen", literally, "I will you never again see"). This is also observable in perfect tense constructions, as in English "I have never seen anything in the square" = Danish "Jeg har aldrig set noget på torvet"; Icelandic "Ég hef aldrei séð neitt á torginu", where Dutch and German place the past participle at the end (e.g. Dutch "Ik heb nooit iets op het plein gezien"; German "Ich habe nie etwas auf dem Platz gesehen", literally, "I have never anything in the square seen"). As in most Germanic languages, English adjectives usually come before the noun they modify, even when the adjective is of Latinate origin (e.g. medical emergency, national treasure). Also, English continues to make extensive use of self-explaining compounds (e.g. streetcar, classroom), and nouns which serve as modifiers (e.g. lamp post, life insurance company), traits inherited from Old English (See also Kenning).
The kinship with other Germanic languages can also be seen in the tensing of English verbs (e.g. English fall/fell/fallen/will or shall fall, West Frisian fal/foel/fallen/sil falle, Dutch vallen/viel/gevallen/zullen vallen, German fallen/fiel/gefallen/werden fallen, Norwegian faller/falt/falt or falne/vil or skal falle), the comparatives of adjectives and adverbs (e.g. English good/better/best, West Frisian goed/better/best, Dutch goed/beter/best, German gut/besser/best), the treatment of nouns (English shoemaker, shoemaker's, shoemakers, shoemakers'; Dutch schoenmaker, schoenmakers, schoenmakers, schoenmakeren; Swedish skomakare, skomakares, skomakare, skomakares), and the large number of cognates (e.g. English wet, Scots weet, West Frisian wiet, Swedish våt; English send, Dutch zenden, German senden; English meaning, Swedish mening, Icelandic meining, etc.). This kinship occasionally gives rise to false friends (e.g. English time vs Norwegian time, meaning "hour"; English gift vs German Gift, meaning "poison"), while differences in phonology can obscure words that really are related (tooth vs. German Zahn; compare also Danish tand, North Frisian toth). Sometimes both semantics and phonology are different: German Zeit ("time") is related to English "tide", but the English word, through a transitional phase of meaning "period"/"interval", has come primarily to mean gravitational effects on the ocean by the moon, though the original meaning is preserved in forms like tidings and betide, and phrases such as to tide over.
Many North Germanic words entered English due to the settlement of Viking raiders and Danish invasions which began around the 9th century (see Danelaw). Many of these words are common words, often mistaken for being native, which shows how close-knit the relations between the English and the Scandinavian settlers were (See below: Words of Old Norse origin). Dutch and Low German also had a considerable influence on English vocabulary, contributing common everyday terms and many nautical and trading terms (See below: Words of Dutch and Low German origin).
Finally, English has been forming compound words and affixing existing words separately from the other Germanic languages for over 1500 years and has different habits in that regard. For instance, abstract nouns in English may be formed from native words by the suffixes "-hood", "-ship", "-dom" and "-ness". All of these have cognate suffixes in most or all other Germanic languages, but their usage patterns have diverged, as German "Freiheit" vs. English "freedom" (the suffix "-heit" being cognate with English "-hood", while English "-dom" is cognate with German "-tum"; compare also North Frisian fridoem, Dutch vrijdom, Norwegian fridom, "freedom"). The Germanic languages Icelandic and Faroese also follow English in this respect, since, like English, they developed independently of German influences.
Many French words are also intelligible to an English speaker, especially when they are seen in writing (as pronunciations are often quite different), because English absorbed a large vocabulary from Norman and French, via Anglo-Norman after the Norman Conquest, and directly from French in subsequent centuries. As a result, a large portion of English vocabulary is derived from French, with some minor spelling differences (e.g. inflectional endings, use of old French spellings, lack of diacritics, etc.), as well as occasional divergences in meaning of so-called false friends: for example, compare "library" with the French librairie, which means bookstore; in French, the word for "library" is bibliothèque. The pronunciation of most French loanwords in English (with the exception of a handful of more recently borrowed words such as mirage, genre, café; or phrases like coup d'état, rendez-vous, etc.) has become largely anglicised and follows a typically English phonology and pattern of stress (compare English "nature" vs. French nature, "button" vs. bouton, "table" vs. table, "hour" vs. heure, "reside" vs. résider, etc.).
Approximately 375 million people speak English as their first language. English today is probably the third largest language by number of native speakers, after Mandarin Chinese and Spanish. However, when combining native and non-native speakers it is probably the most commonly spoken language in the world, though possibly second to a combination of the Chinese languages (depending on whether or not distinctions in the latter are classified as "languages" or "dialects").
Estimates that include second language speakers vary greatly from 470 million to over a billion depending on how literacy or mastery is defined and measured. Linguistics professor David Crystal calculates that non-native speakers now outnumber native speakers by a ratio of 3 to 1.
The countries with the highest populations of native English speakers are, in descending order: the United States (215 million), the United Kingdom (61 million), Canada (18.2 million), Australia (15.5 million), Nigeria (4 million), Ireland (3.8 million), South Africa (3.7 million), and New Zealand (3.6 million) in a 2006 Census.
Countries such as the Philippines, Jamaica and Nigeria also have millions of native speakers of dialect continua ranging from an English-based creole to a more standard version of English. Of those nations where English is spoken as a second language, India has the most such speakers (see Indian English). Crystal claims that, combining native and non-native speakers, India now has more people who speak or understand English than any other country in the world.
Countries in order of total speakers
|Country||Total||Percent of population||First language||As an additional language||Population||Comment|
|United States||251,388,301||96%||215,423,557||35,964,744||262,375,152||Source: US Census 2000: Language Use and English-Speaking Ability: 2000, Table 1. Figure for second language speakers are respondents who reported they do not speak English at home but know it "very well" or "well". Note: figures are for population age 5 and older|
|India||125,344,736||12%||226,449||86,125,221 second language speakers; 38,993,066 third language speakers||1,028,737,436||Source: Census 2001. Figures include both those who speak English as a second language and those who speak it as a third language. The figures include English speakers, but not English users.|
|Pakistan||88,690,000||49%|| ||88,690,000||180,440,005||Source: Euromonitor International report 2009, "The Benefits of the English Language for Individuals and Societies: Quantitative Indicators from Cameroon, Nigeria, Rwanda, Bangladesh and Pakistan", a custom report compiled by Euromonitor International for the British Council.|
|Nigeria||79,000,000||53%||4,000,000||>75,000,000||148,000,000||Figures are for speakers of Nigerian Pidgin, an English-based pidgin or creole. Ihemere gives a range of roughly 3 to 5 million native speakers; the midpoint of the range is used in the table. Ihemere, Kelechukwu Uchechukwu (2006). "A Basic Description and Analytic Treatment of Noun Clauses in Nigerian Pidgin". Nordic Journal of African Studies 15 (3): 296–313.|
|United Kingdom||59,600,000||98%||58,100,000||1,500,000||60,000,000||Source: Crystal (2005), p. 109.|
|Philippines||48,800,000||58%||3,427,000||43,974,000||84,566,000||Total speakers: Census 2000, text above Figure 7, 63.71% of the 66.7 million people aged 5 years or more could speak English. Native speakers: Census 1995. Ethnologue lists 3.4 million native speakers with 52% of the population speaking it as an additional language.|
|Canada||25,246,220||85%||17,694,830||7,551,390||29,639,030||Source: 2001 Census – Knowledge of Official Languages and Mother Tongue. The native speakers figure comprises 122,660 people with both French and English as a mother tongue, plus 17,572,170 people with English and not French as a mother tongue.|
|Australia||18,172,989||92%||15,581,329||2,591,660||19,855,288||Source: 2006 Census. The figure shown in the first language English speakers column is actually the number of Australian residents who speak only English at home. The additional language column shows the number of other residents who claim to speak English "well" or "very well". Another 5% of residents did not state their home language or English proficiency.|
|New Zealand||3,673,626||91.2%||3,008,058||665,568||4,027,947||Source: 2006 Census. The figures are people who can speak English with sufficient fluency to hold an everyday conversation. The figure shown in the first language English speakers column is actually the number of New Zealand residents who reported to speak English only, while the additional language column shows the number of New Zealand residents who reported to speak English as one of two or more languages.|
|Note: Total = First language + Additional language; Percent of population = Total / Population|
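The arithmetic in the note can be checked against the rows whose columns are complete. The following is a minimal sketch, not part of the source data: the figures are copied from the table above, and the function name `total_and_percent` is a hypothetical helper introduced only for illustration.

```python
# Figures copied from the table: (first language, additional language, population).
# Only rows whose columns are all plain numbers are included.
rows = {
    "United States": (215_423_557, 35_964_744, 262_375_152),
    "Canada": (17_694_830, 7_551_390, 29_639_030),
    "Australia": (15_581_329, 2_591_660, 19_855_288),
    "New Zealand": (3_008_058, 665_568, 4_027_947),
}


def total_and_percent(first, additional, population):
    """Apply the table note: total = first + additional; percent = total / population."""
    total = first + additional
    percent = 100 * total / population
    return total, percent


for country, (first, additional, population) in rows.items():
    total, percent = total_and_percent(first, additional, population)
    print(f"{country}: total {total:,}, {percent:.1f}% of population")
```

For these four rows the computed totals match the Total column, and the percentages agree with the table once rounded to the precision shown there.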
Countries where English is a major language
English is the primary language in Anguilla, Antigua and Barbuda, Australia, the Bahamas, Barbados, Belize, Bermuda, the British Indian Ocean Territory, the British Virgin Islands, Canada, the Cayman Islands, Dominica, the Falkland Islands, Gibraltar, Grenada, Guam, Guernsey, Guyana, Ireland, the Isle of Man, Jamaica, Jersey, Montserrat, Nauru, New Zealand, Pitcairn Islands, Saint Helena, Ascension and Tristan da Cunha, Saint Kitts and Nevis, Saint Vincent and the Grenadines, Singapore, South Georgia and the South Sandwich Islands, Trinidad and Tobago, the Turks and Caicos Islands, the United Kingdom and the United States.
In some countries where English is not the most spoken language, it is an official language; these countries include Botswana, Cameroon, the Federated States of Micronesia, Fiji, Gambia, Ghana, India, Kenya, Kiribati, Lesotho, Liberia, Malta, the Marshall Islands, Mauritius, Namibia, Nigeria, Pakistan, Palau, Papua New Guinea, the Philippines (Philippine English), Rwanda, Saint Lucia, Samoa, Seychelles, Sierra Leone, the Solomon Islands, Sri Lanka, Sudan, South Sudan, Swaziland, Tanzania, Uganda, Zambia, and Zimbabwe. There are also countries where English became a co-official language in part of the territory, e.g. Colombia's San Andrés y Providencia and Nicaragua's Mosquito Coast, as a result of the influence of British colonisation in the area.
It is also one of the 11 official languages that are given equal status in South Africa (South African English). English is also the official language in current dependent territories of Australia (Norfolk Island, Christmas Island and Cocos Island) and of the United States (American Samoa, Guam, Northern Mariana Islands, Puerto Rico, and the US Virgin Islands), and the former British colony of Hong Kong. (See List of countries where English is an official language for more details.)
Although the United States federal government has no official languages, English has been given official status by 30 of the 50 state governments. Although falling short of official status, English is also an important language in several former colonies and protectorates of the United Kingdom, such as Bahrain, Bangladesh, Brunei, Cyprus, Malaysia, and the United Arab Emirates.
English as a global language
Because English is so widely spoken, it has often been referred to as a "world language", the lingua franca of the modern era. While it is not an official language in most countries, it is currently the language most often taught as a foreign language. It is, by international treaty, the official language for aeronautical and maritime communications. English is an official language of the United Nations and many other international organizations, including the International Olympic Committee.
English is the language most often studied as a foreign language in the European Union, by 89% of schoolchildren, ahead of French at 32%, while the perception of the usefulness of foreign languages among Europeans is 68% in favour of English ahead of 25% for French. Among some non-English-speaking EU countries, a large percentage of the adult population claims to be able to converse in English – in particular: 85% in Sweden, 83% in Denmark, 79% in the Netherlands, 66% in Luxembourg and over 50% in Finland, Slovenia, Austria, Belgium, and Germany.
Books, magazines, and newspapers written in English are available in many countries around the world, and English is the most commonly used language in the sciences with Science Citation Index reporting as early as 1997 that 95% of its articles were written in English, even though only half of them came from authors in English-speaking countries.
This increasing use of the English language globally has had a large impact on many other languages, leading to language shift and even language death, and to claims of linguistic imperialism. English itself has become more open to language shift as multiple regional varieties feed back into the language as a whole.
Dialects and varieties
English has been subject to a large degree of regional dialect variation for many centuries. Its global spread now means that a large number of dialects and English-based creole languages and pidgins can be found all over the world.
Several educated native dialects of English have wide acceptance as standards in much of the world. In the United Kingdom much emphasis is placed on Received Pronunciation, an educated dialect of South East England. General American, which is spread over most of the United States and much of Canada, is more typically the model for the American continents and areas (such as the Philippines) that have had either close association with the United States, or a desire to be so identified. In Oceania, the major native dialect of Australian English is spoken as a first language by the vast majority of the inhabitants of the Australian continent, with General Australian serving as the standard accent. The English of neighbouring New Zealand as well as that of South Africa have to a lesser degree been influential native varieties of the language.
Aside from these major dialects, there are numerous other varieties of English, which include, in most cases, several subvarieties, such as Cockney, Scouse and Geordie within British English; Newfoundland English within Canadian English; and African American Vernacular English ("Ebonics") and Southern American English within American English. English is a pluricentric language, without a central language authority like France's Académie française; and therefore no one variety is considered "correct" or "incorrect" except in terms of the expectations of the particular audience to which the language is directed.
Scots has its origins in early Northern Middle English and developed and changed during its history with influence from other sources. However, following the Acts of Union 1707 a process of language attrition began, whereby successive generations adopted more and more features from Standard English. Whether Scots is now a separate language or is better described as a dialect of English (i.e. part of Scottish English) is in dispute, although the UK government accepts Scots as a regional language and has recognised it as such under the European Charter for Regional or Minority Languages. There are a number of regional dialects of Scots, and pronunciation, grammar and lexis of the traditional forms differ, sometimes substantially, from other varieties of English.
English speakers have many different accents, which often signal the speaker's native dialect or language. For the most distinctive characteristics of regional accents, see Regional accents of English, and for a complete list of regional dialects, see List of dialects of the English language. Within England, variation is now largely confined to pronunciation rather than grammar or vocabulary. At the time of the Survey of English Dialects, grammar and vocabulary differed across the country, but a process of lexical attrition has led most of this variation to die out.
Just as English itself has borrowed words from many different languages over its history, English loanwords now appear in many languages around the world, indicative of the technological and cultural influence of its speakers. Several pidgins and creole languages have been formed on an English base, such as Jamaican Patois, Nigerian Pidgin, and Tok Pisin. There are many words in English coined to describe forms of particular non-English languages that contain a very high proportion of English words.
It is well-established that informal speech registers tend to be made up predominantly of words of Anglo-Saxon or Germanic origin, whereas the Latinate vocabulary is usually reserved for more formal uses such as legal, scientific, and otherwise scholarly or academic texts.
Child-directed speech, which is an informal speech register, also tends to rely heavily on vocabulary rich in words derived from Anglo-Saxon. The speech of mothers to young children has a higher percentage of native Anglo-Saxon verb tokens than speech addressed to adults. In particular, in parents' child-directed speech the clausal core is built for the most part from Anglo-Saxon verbs: almost all tokens of the grammatical relations subject-verb, verb-direct object and verb-indirect object that young children are presented with are constructed with native verbs. The Anglo-Saxon verb vocabulary consists of short verbs, but its grammar is relatively complex. Syntactic patterns specific to this sub-vocabulary in present-day English include periphrastic constructions for tense, aspect, questioning and negation, and phrasal lexemes functioning as complex predicates, all of which also occur in child-directed speech.
The historical origin of vocabulary items affects the order of acquisition of various aspects of language development in English-speaking children. Latinate vocabulary is in general a later acquisition in children than the native Anglo-Saxon one. Young children almost exclusively use the native verb vocabulary in constructing basic grammatical relations, apparently mastering its analytic aspects at an early stage.
Formal written English
A version of the language almost universally agreed upon by educated English speakers around the world is called formal written English. It takes virtually the same form regardless of where it is written, in contrast to spoken English, which differs significantly between dialects, accents, and varieties of slang and of colloquial and regional expressions. Local variations in the formal written version of the language are quite limited, being restricted largely to minor spelling, lexical and grammatical differences between British, American, and other national varieties of English.
Simplified and constructed varieties
Artificially simplified versions of the language have been created that are easier for non-native speakers to read. Basic English is a constructed language, with a restricted number of words, created by Charles Kay Ogden and described in his book Basic English: A General Introduction with Rules and Grammar (1930). Ogden said that it would take seven years to learn English, seven months for Esperanto, and seven weeks for Basic English. Thus, Basic English may be employed by companies that need to make complex books for international use, as well as by language schools that need to impart some knowledge of English in a short time.
Ogden did not include any words in Basic English that could be said instead with a combination of other words already in the Basic English lexicon, and he worked to make the vocabulary suitable for speakers of any other language. He put his vocabulary selections through a large number of tests and adjustments. Ogden also simplified the grammar but tried to keep it normal for English users. Although it was not built into a program, similar simplifications were devised for various international uses.
Simplified English is a controlled language originally developed for aerospace industry maintenance manuals. It employs a carefully limited and standardised subset of English. Simplified English has a lexicon of approved words and those words can only be used in certain ways. For example, the word close can be used in the phrase "Close the door" but not "do not go close to the landing gear".
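The idea of a lexicon whose words are approved only in certain uses can be sketched in a few lines of Python. The sketch is illustrative only: the small APPROVED table, its part-of-speech labels and the function name unapproved_uses are hypothetical, it assumes that the restriction on close is one of part of speech (verb, not adverb), and it expects the words to be tagged by hand rather than by a real tagger.

```python
# Hypothetical mini-lexicon: each approved word maps to the parts of speech
# in which it may be used. This is NOT the real Simplified English dictionary.
APPROVED = {
    "close": {"verb"},
    "do": {"verb"},
    "go": {"verb"},
    "not": {"adverb"},
    "the": {"article"},
    "door": {"noun"},
}


def unapproved_uses(tagged_words):
    """Return the (word, part_of_speech) pairs that the lexicon does not allow."""
    problems = []
    for word, pos in tagged_words:
        allowed = APPROVED.get(word.lower(), set())
        if pos not in allowed:
            problems.append((word, pos))
    return problems


# "Close the door": close is used as a verb, which the lexicon allows.
ok_phrase = [("Close", "verb"), ("the", "article"), ("door", "noun")]
# "do not go close ...": close is used as an adverb, which it does not allow.
bad_phrase = [("do", "verb"), ("not", "adverb"), ("go", "verb"), ("close", "adverb")]

print(unapproved_uses(ok_phrase))   # []
print(unapproved_uses(bad_phrase))  # [('close', 'adverb')]
```

Keying the check on the pair of word and part of speech, rather than on the word alone, is what lets the same word pass in one phrase and fail in another.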
Other constructed varieties of English include:
- E-Prime excludes forms of the verb to be.
- English reform refers to attempts to improve the English language collectively.
- Manually Coded English consists of a variety of systems that have been developed to represent the English language with hand signals, designed primarily for use in deaf education. These should not be confused with true sign languages such as British Sign Language and American Sign Language used in Anglophone countries, which are independent and not based on English.
- Seaspeak and the related Airspeak and PoliceSpeak, all based on restricted vocabularies, were designed by Edward Johnson starting from the 1980s to aid international cooperation and communication in specific areas.
- Special English is a simplified version of English used by the Voice of America. It uses a vocabulary of only 1500 words.
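The E-Prime restriction listed above lends itself to a simple mechanical check. The following Python sketch is only an approximation under stated assumptions: the function name be_forms_in is made up for illustration, the list of forms covers the full words only, and contractions such as "it's" or "isn't" are deliberately not detected.

```python
import re

# Full-word forms of the verb "to be". Contracted forms ('s, 're, 'm, n't)
# are not handled in this sketch.
BE_FORMS = {"be", "being", "been", "am", "is", "are", "was", "were"}


def be_forms_in(text):
    """Return every form of "to be" found in the text, in order of appearance."""
    words = re.findall(r"[a-z']+", text.lower())
    return [word for word in words if word in BE_FORMS]


print(be_forms_in("The cat is on the mat."))    # ['is']
print(be_forms_in("The cat sits on the mat."))  # []
```

A sentence passes this rough E-Prime test when the returned list is empty.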
The phonology (sound system) of English differs between dialects. The descriptions below are most closely applicable to the standard varieties known as Received Pronunciation (RP) and General American. For information concerning a range of other varieties, see IPA chart for English dialects.
The table below shows the system of consonant phonemes that functions in most major varieties of English. The symbols are from the International Phonetic Alphabet (IPA), and are also used in the pronunciation keys of many dictionaries. For more detailed information see English phonology: Consonants.
|Manner||Labial||Dental||Alveolar||Post-alveolar||Velar||Glottal|
|Plosive||p b||–||t d||–||k ɡ||–|
|Fricative||f v||θ ð||s z||ʃ ʒ||(x)||h|
Where consonants are given in pairs (as with "p b"), the first is voiceless, the second is voiced. Most of the symbols represent the same sounds as they normally do when used as letters (see Writing system below), but /j/ represents the initial sound of yacht. The symbol /ʃ/ represents the sh sound, /ʒ/ the middle sound of vision, /tʃ/ the ch sound, /dʒ/ the sound of j in jump, /θ/ and /ð/ the th sounds in thing and this respectively, and /ŋ/ the ng sound in sing. The voiceless velar fricative /x/ is not a regular phoneme in most varieties of English, although it is used by some speakers in Scots/Gaelic words such as loch or in other loanwords such as Chanukah.
Some of the more significant variations in the pronunciation of consonants are these:
- In non-rhotic accents such as Received Pronunciation and Australian English, /r/ can only appear before a vowel (so there is no "r" sound in words like card). The actual pronunciation of /r/ varies between dialects; most common is the alveolar approximant [ɹ].
- In North American English and Australian English, /t/ and /d/ are flapped [ɾ] in many positions between vowels. This means that word pairs such as latter and ladder may become homophones for speakers of these dialects.
- The th sounds /θ/ and /ð/ are sometimes pronounced as /f/ and /v/ in Cockney, and as dental plosives (contrasting with the usual alveolar plosives) in some Irish varieties. In African American Vernacular English, /ð/ has merged with dental /d/.
- A voiceless w, [ʍ], sometimes written /hw/, for the wh in words like when and which, is preserved in Scottish and Irish English and by some speakers elsewhere.
- The voiceless plosives /p/, /t/ and /k/ are frequently aspirated, particularly at the start of stressed syllables, but they are not aspirated after an initial /s/, as in spin.
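The aspiration pattern in the last point above can be written as a small rule over syllable onsets. This Python sketch is a simplification, not a full phonological model: the function name aspirate_onset and the representation of an onset as a list of phoneme strings are assumptions made for illustration, and aspiration is applied only in stressed syllables, although the text says it is merely most frequent there.

```python
VOICELESS_PLOSIVES = {"p", "t", "k"}


def aspirate_onset(onset, stressed):
    """Mark aspiration on a syllable onset given as a list of phonemes.

    Only an onset that begins with /p/, /t/ or /k/ in a stressed syllable is
    aspirated. An onset such as /sp/ begins with /s/, so it is left unchanged.
    """
    if stressed and onset and onset[0] in VOICELESS_PLOSIVES:
        return [onset[0] + "ʰ"] + list(onset[1:])
    return list(onset)


print(aspirate_onset(["p"], stressed=True))       # pin:  ['pʰ']
print(aspirate_onset(["s", "p"], stressed=True))  # spin: ['s', 'p']
```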
The system of vowel phonemes and their pronunciation is subject to significant variation between dialects. The table below lists the vowels found in Received Pronunciation (RP) and General American, with examples of words in which they occur. The vowels are represented with symbols from the International Phonetic Alphabet; those given for RP are in relatively standard use in British dictionaries and other publications. For more detailed information see English phonology: Vowels.
Some points to note:
- For words which in RP have /ɒ/, most North American dialects have /ɑ/ (as in the example of box above) or /ɔ/ (as in cloth). However, some North American varieties do not have the vowel /ɔ/ at all (except before /r/); see cot–caught merger.
- In present-day Received Pronunciation, the realization of the /æ/ phoneme is more open than the symbol suggests, and is closer to [a], as in most other accents in Britain. The sound [æ] is now found only in conservative RP.
- In General American and some other rhotic accents, the combination of vowel+/r/ is often realized as an r-colored vowel. For example, butter /ˈbʌtər/ is pronounced with an r-colored schwa, [ɚ]. Similarly nurse contains the r-colored vowel [ɝ].
- The vowel conventionally written /ʌ/ is actually pronounced more centrally, as [ɐ], in RP. In the northern half of England this vowel is replaced by /ʊ/ (so cut rhymes with put).
- In unstressed syllables there may or may not be a distinction between /ə/ (schwa) and /ɪ/ (/ɨ/). So for some speakers there is no difference between roses and Rosa's. For more information see Reduced vowels in English.
- The diphthongs /eɪ/ and /əʊ/ (/oʊ/) tend towards the monophthongal pronunciations [eː] and [oː] in some dialects, including Canadian, Scottish, Irish and Northern English.
- In parts of North America /aɪ/ is pronounced [ʌɪ] before voiceless consonants. This is particularly true in Canada, where also /aʊ/ is pronounced [ʌʊ] in this position. See Canadian raising.
- The sound /ʊə/ is coming to be replaced by /ɔː/ in many words; for example, sure is often pronounced like shore. See English-language vowel changes before historic r.
Stress, rhythm and intonation
English is a strongly stressed language, in which stress is said to be phonemic, i.e. capable of distinguishing words (such as the noun increase, stressed on the first syllable, and the verb increase, stressed on the second syllable; see also Initial-stress-derived noun). In almost any word of more than one syllable there will be one syllable identified as taking the primary stress, and possibly another taking a secondary stress, as in civilization /ˌsɪvəlaɪˈzeɪʃn̩/, in which the first syllable carries secondary stress, the fourth syllable carries primary stress, and the other syllables are unstressed.
Closely related to stress in English is the process of vowel reduction; for example, in the noun contract the first syllable is stressed and contains the vowel /ɒ/ (in RP), whereas in the verb contract the first syllable is unstressed and its vowel is reduced to /ə/ (schwa). The same process applies to certain common function words like of, which are pronounced with different vowels depending on whether or not they are stressed within the sentence. For more details, see Reduced vowels in English.
English also has strong prosodic stress – the placing of additional emphasis within a sentence on the words to which a speaker wishes to draw attention, and corresponding weaker pronunciation of less important words. As regards rhythm, English is classed as a stress-timed language – one in which there is a tendency for the time intervals between stressed syllables to become equal, with corresponding faster pronunciation of groups of unstressed syllables.
As concerns intonation, the pitch of the voice is used syntactically in English; for example, to convey surprise or irony, or to change a statement into a question. Most dialects of English use falling pitch for definite statements, and rising pitch to express uncertainty, as in questions (particularly yes-no questions). There is also a characteristic change of pitch on strongly stressed syllables, particularly on the "nuclear" (most strongly stressed) syllable in a sentence or intonation group. For more details see Intonation (linguistics): Intonation in English.
English grammar has minimal inflection compared with most other Indo-European languages. For example, Modern English, unlike Modern German or Dutch and the Romance languages, lacks grammatical gender and adjectival agreement. Case marking has almost disappeared from the language and mainly survives in pronouns. The patterning of strong (e.g. speak/spoke/spoken) versus weak verbs (e.g. love/loved or kick/kicked) inherited from its Germanic origins has declined in importance in modern English, and the remnants of inflection (such as plural marking) have become more regular.
At the same time, the language has become more analytic, and has developed features such as modal verbs and word order as resources for conveying meaning. Auxiliary verbs mark constructions such as questions, negative polarity, the passive voice and progressive aspect.
English vocabulary has changed considerably over the centuries.
As in many languages descended from Proto-Indo-European (PIE), many of the most common words in English trace their origin (through the Germanic branch) to PIE. Such words include the basic pronouns I, from Old English ic (cf. German ich, Gothic ik, Latin ego, Greek ego, Sanskrit aham), and me (cf. German mich, mir, Gothic mik, mīs, Latin mē, Greek eme, Sanskrit mam), numbers (e.g. one, two, three; cf. Dutch een, twee, drie, Gothic ains, twai, threis (þreis), Latin ūnus, duo, trēs, Greek oinos "ace (on dice)", duo, treis), common family relationships such as mother, father, brother and sister (e.g. mother; cf. Dutch moeder, Greek meter, Latin mater, Sanskrit matṛ), names of many animals (e.g. mouse; cf. German Maus, Dutch muis, Sanskrit mus, Greek mus, Latin mūs), and many common verbs (e.g. to know; cf. Old High German knājan, Old Norse kná, Greek gignōskō, Latin gnoscere, Hittite kanes).
Germanic words (generally words of Old English or, to a lesser extent, Old Norse origin) tend to be shorter than Latinate words and are more common in ordinary speech. They include nearly all the basic pronouns, prepositions, conjunctions, modal verbs etc. that form the basis of English syntax and grammar. The shortness of the words is generally due to syncope in Middle English (e.g. OldEng hēafod > ModEng head, OldEng sāwol > ModEng soul) and to the loss of final syllables due to stress (e.g. OldEng gamen > ModEng game, OldEng ǣrende > ModEng errand), not to Germanic words being inherently shorter than Latinate words. The lengthier, higher-register words of Old English were largely forgotten following the subjugation of English after the Norman Conquest, and most of the Old English lexis devoted to literature, the arts, and sciences ceased to be productive when it fell into disuse; only the shorter, more direct words of Old English tended to pass into the modern language. Consequently, those words which tend to be regarded as elegant or educated in Modern English are usually Latinate. However, the excessive use of Latinate words is considered at times to be either pretentious or an attempt to obfuscate an issue. George Orwell's essay "Politics and the English Language", considered an important scrutiny of the English language, is critical of this, as well as of other perceived misuses of the language.
An English speaker is in many cases able to choose between Germanic and Latinate synonyms: come or arrive; sight or vision; freedom or liberty. In some cases, there is a choice between a Germanic-derived word (oversee), a Latin-derived word (supervise), and a French word derived from the same Latin word (survey), or between Germanic words that entered through Norman French (e.g., warranty) and Parisian French (guarantee). Choices involving multiple Germanic and Latinate sources are also possible: sickness (Old English), ill (Old Norse), infirmity (French), affliction (Latin). Such synonyms harbour a variety of different meanings and nuances. Yet the ability to choose between multiple synonyms is not a consequence of French and Latin influence, as this same richness existed in English prior to the extensive borrowing of French and Latin terms. Old English was extremely resourceful in its ability to express synonyms and shades of meaning on its own, in many respects rivalling or exceeding Modern English (synonyms numbering in the thirties for certain concepts were not uncommon). Take for instance the various ways to express the word "astronomer" or "astrologer" in Old English: tunglere, tungolcræftiga, tungolwītega, tīdymbwlātend, tīdscēawere. In Modern English, however, the roles of such synonyms have largely been taken over by equivalents borrowed from Latin, French, and Greek, as English has come to rely less on native elements and resources for the creation of new words and terminologies. Familiarity with the etymology of groups of synonyms can give English speakers greater control over their linguistic register. See: List of Germanic and Latinate equivalents in English, Doublet (linguistics).
A commonly noted area where Germanic and French-derived words coexist is that of domestic or game animals and the meats produced from them. The nouns for meats are often different from, and unrelated to, those for the corresponding animals, the animal commonly having a Germanic name and the meat having a French-derived one. Examples include: deer and venison; cow and beef; swine/pig and pork; and sheep/lamb and mutton. This is assumed to be a result of the aftermath of the Norman conquest of England, when an Anglo-Norman-speaking elite were the consumers of the meat, produced by lower classes, which happened to be largely Anglo-Saxon, although a similar duality can also be seen in other languages like French, which did not undergo such linguistic upheaval (e.g. boeuf "beef" vs. vache "cow"). With the exception of beef and pork, the distinction today is gradually becoming less pronounced: venison is commonly referred to simply as deer meat, mutton as lamb, and chicken names both the animal and the meat, in place of the more traditional term poultry. The term mutton remains in use, however, especially for the meat of an older sheep as distinct from lamb, and poultry remains in use for the meat of birds and fowl in general.
There are Latinate words that are used in everyday speech. These words no longer appear Latinate and oftentimes have no Germanic equivalents. For instance, the words mountain, valley, river, aunt, uncle, move, use, and push are Latinate. Likewise, the inverse can occur: acknowledge, meaningful, understanding, mindful, lavish, behaviour, forbearance, behoove, forestall, allay, rhyme, starvation, embodiment come from Anglo-Saxon, and allegiance, abandonment, debutant, feudalism, seizure, guarantee, disregard, wardrobe, disenfranchise, disarray, bandolier, bourgeoisie, debauchery, performance, furniture, gallantry are of Germanic origin, usually through the Germanic element in French, so it is oftentimes impossible to know the origin of a word based on its register.
English easily accepts technical terms into common usage and often imports new words and phrases. Examples of this phenomenon include contemporary words such as cookie, Internet and URL (technical terms), as well as genre, über, lingua franca and amigo (imported words/phrases from French, German, Italian, and Spanish, respectively). In addition, slang often provides new meanings for old words and phrases. In fact, this fluidity is so pronounced that a distinction often needs to be made between formal forms of English and contemporary usage.
Number of words in English
The vocabulary of English is undoubtedly very large, but assigning a specific number to its size is more a matter of definition than of calculation – and there is no official source to define accepted English words and spellings in the way that the French Académie française and similar bodies do for other languages.
Archaic, dialectal, and regional words might or might not be widely considered as "English", and neologisms are continually coined in medicine, science, technology and other fields, along with new slang and adopted foreign words. Some of these new words enter wide usage while others remain restricted to small circles.
The General Explanations at the beginning of the Oxford English Dictionary state:
The Vocabulary of a widely diffused and highly cultivated living language is not a fixed quantity circumscribed by definite limits... there is absolutely no defining line in any direction: the circle of the English language has a well-defined centre but no discernible circumference.
The current FAQ for the OED further states:
How many words are there in the English language? There is no single sensible answer to this question. It's impossible to count the number of words in a language, because it's so hard to decide what actually counts as a word.
The Oxford English Dictionary, 2nd edition (OED2) includes over 600,000 definitions, following a rather inclusive policy:
It embraces not only the standard language of literature and conversation, whether current at the moment, or obsolete, or archaic, but also the main technical vocabulary, and a large measure of dialectal usage and slang (Supplement to the OED, 1933).
The editors of Webster's Third New International Dictionary, Unabridged include 475,000 main headwords, but in their preface they estimate the true number to be much higher.
Comparisons of the vocabulary size of English to that of other languages are generally not taken very seriously by linguists and lexicographers. Besides the fact that dictionaries will vary in their policies for including and counting entries, what is meant by a given language and what counts as a word do not have simple definitions. Also, a definition of word that works for one language may not work well in another, with differences in morphology and orthography making cross-linguistic definitions and word-counting difficult, and potentially giving very different results. Linguist Geoffrey K. Pullum has gone so far as to compare concerns over vocabulary size (and the notion that a supposedly larger lexicon leads to "greater richness and precision") to an obsession with penis length.
In December 2010 a joint Harvard/Google study found the language to contain 1,022,000 words and to expand at the rate of 8,500 words per year. The findings came from a computer analysis of 5,195,769 digitised books. Others have estimated a rate of growth of 25,000 words each year.
One of the consequences of the French influence is that the vocabulary of English is, to a certain extent, divided between those words that are Germanic (mostly West Germanic, with a smaller influence from the North Germanic branch) and those that are "Latinate" (derived directly from Latin, or through Norman French or other Romance languages). The situation is further complicated because French, particularly Old French and Anglo-French, also contributed significant numbers of Germanic words to English, mostly from the Frankish element in French (see List of English Latinates of Germanic origin).
The majority (estimates range from roughly 50% to more than 80%) of the thousand most common English words are Germanic. However, the majority of more advanced words in subjects such as the sciences, philosophy and mathematics come from Latin or Greek, with Arabic also providing many words in astronomy, mathematics, and chemistry.
|1st 100||1st 1,000||2nd 1,000||Subsequent|
|Source: Nation 2001, p. 265|
Numerous sets of statistics have been proposed to demonstrate the proportionate origins of English vocabulary. None, as yet, is considered definitive by most linguists.
A computerised survey of about 80,000 words in the old Shorter Oxford Dictionary (3rd ed.), published in Ordered Profusion by Thomas Finkenstaedt and Dieter Wolff (1973), estimated the origin of English words as follows:
- Langue d'oïl, including French and Old Norman: 28.3%
- Latin, including modern scientific and technical Latin: 28.24%
- Germanic languages (including words directly inherited from Old English; does not include Germanic words coming from the Germanic element in French, Latin or other Romance languages): 25%
- Greek: 5.32%
- No etymology given: 4.03%
- Derived from proper names: 3.28%
- All other languages: less than 1%
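As a quick arithmetic check on the figures just listed, the shares can be tallied in Python. This is a sketch for illustration only: the dictionary keys are shortened labels rather than the survey's own category names, and the "less than 1%" entry is left out because it has no exact value.

```python
# Shares (in percent) copied from the list above. "All other languages"
# is omitted because it is given only as "less than 1%".
shares = {
    "Langue d'oïl": 28.3,
    "Latin": 28.24,
    "Germanic": 25.0,
    "Greek": 5.32,
    "No etymology given": 4.03,
    "Proper names": 3.28,
}

# French-type and Latin sources combined.
latinate = round(shares["Langue d'oïl"] + shares["Latin"], 2)
# Total of every share that has an exact figure.
listed_total = round(sum(shares.values()), 2)

print(latinate)      # 56.54
print(listed_total)  # 94.17
```

On these figures the two Latinate categories together account for more than half of the surveyed words, and the categories with exact values add up to 94.17%, so the list as quoted here does not account for the full 100%.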
A survey by Joseph M. Williams in Origins of the English Language of 10,000 words taken from several thousand business letters gave this set of statistics:
- French (langue d'oïl): 41%
- "Native" English: 33%
- Latin: 15%
- Old Norse: 2%
- Dutch: 1%
- Other: 10%
Words of Old Norse origin
Many words of Old Norse origin have entered the English language, primarily from the Viking colonisation of eastern and northern England between 800 and 1000, during the Danelaw. These include common words such as anger, awe, bag, big, birth, blunder, both, cake, call, cast, cosy, cross, cut, die, dirt, drag, drown, egg, fellow, flat, flounder, gain, get, gift, give, guess, guest, gust, hug, husband, ill, kid, law, leg, lift, likely, link, loan, loose, low, mistake, odd, race (running), raise, root, rotten, same, scale, scare, score, seat, seem, sister, skill, skin, skirt, skull, sky, stain, steak, sway, take, though, thrive, Thursday, tight, till (until), trust, ugly, want, weak, window, wing, wrong, the pronoun they (and its forms), and even the verb are (the present plural form of to be) through a merger of Old English and Old Norse cognates. More recent Scandinavian imports include angstrom, fjord, geyser, kraken, litmus, nickel, ombudsman, saga, ski, slalom, smorgasbord, and tungsten.
Words of French origin
A large portion of English vocabulary is of French or Langues d'oïl origin, and was transmitted to English via the Anglo-Norman language spoken by the upper classes in England in the centuries following the Norman Conquest. Words of Norman-French origin include competition, mountain, art, table, publicity, role, pattern, joust, choice, and force. As a result of the length of time they have been in use in English, these words have been anglicised to fit English rules of phonology, pronunciation and spelling.
Some French words were adopted during the 17th to 19th centuries, when French was the dominant language of Western international politics and trade. These words can normally be distinguished because they retain French rules for pronunciation and spelling, including diacritics, are often phrases rather than single words, and are sometimes written in italics. Examples include police, routine, machine, façade, table d'hôte and affaire de cœur. These words and phrases retain their French spelling and pronunciation because historically their French origin was emphasised to mark the speaker as educated or well-travelled at a time when education and travel were still restricted to the middle and upper classes, and so their use implied a higher social status in the user. (See also: French phrases used by English speakers).
Words of Dutch and Low German origin
Many words describing the navy, types of ships, and other objects or activities on the water are of Dutch origin. Yacht, skipper, cruiser, flag, freight, furlough, breeze, hoist, iceberg, boom, duck ("fabric, cloth"), and maelstrom are examples. Other words pertain to art and daily life: easel, etch, slim, staple (Middle Dutch stapel "market"), slip (Middle Dutch slippen), landscape, cookie, curl, shock, aloof, boss, brawl (brallen "to boast"), smack (smakken "to hurl down"), shudder, scum, peg, coleslaw, waffle, dope (doop "dipping sauce"), slender (Old Dutch slinder), slight, gas, pump. Dutch has also contributed to English slang, e.g. spook, and the now obsolete snyder (tailor) and stiver (small coin).
Words from Low German include bluster, cower, dollar, drum, geek, grab, lazy, mate, monkey, mud, ogle, orlop, paltry, poll, poodle, prong, scurvy, smug, smuggle, trade.
Since around the 9th century, English has been written in the Latin script, which replaced Anglo-Saxon runes. The modern English alphabet contains 26 letters of the Latin script: a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z (which also have majuscule, capital or uppercase forms: A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z). Other symbols used in writing English include the ligatures æ and œ (though these are no longer common). Diacritics are also used to some extent, mainly in foreign loanwords (like the acute accent in café and exposé), and occasionally as a diaeresis to indicate that two vowels are pronounced separately (as in naïve, Zoë). For more information see English terms with diacritical marks.
The spelling system, or orthography, of English is multilayered, with elements of French, Latin and Greek spelling on top of the native Germanic system; further complications have arisen through sound changes with which the orthography has not kept pace. This means that, compared with many other languages, English spelling is not a reliable indicator of pronunciation and vice versa (it is not, generally speaking, a phonemic orthography).
Though letters and sounds may not correspond in isolation, spelling rules that take into account syllable structure, phonetics, and accents are 75% or more reliable. Some phonics spelling advocates claim that English is more than 80% phonetic. However, English has fewer consistent relationships between sounds and letters than many other languages; for example, the letter sequence ough can be pronounced in 10 different ways. The consequence of this complex orthographic history is that reading can be challenging. It takes longer for students to become completely fluent readers of English than of many other languages, including French, Greek, and Spanish. English-speaking children have been found to take up to two years longer to learn to read than children in 12 other European countries.
As regards the consonants, the correspondence between spelling and pronunciation is fairly regular. The letters b, d, f, h, j, k, l, m, n, p, r, s, t, v, w, z represent, respectively, the phonemes /b/, /d/, /f/, /h/, /dʒ/, /k/, /l/, /m/, /n/, /p/, /r/, /s/, /t/, /v/, /w/, /z/ (as tabulated in the Consonants section above). The letters c and g normally represent /k/ and /g/, but there is also a soft c pronounced /s/, and a soft g pronounced /dʒ/. Some sounds are represented by digraphs: ch for /tʃ/, sh for /ʃ/, th for /θ/ or /ð/, ng for /ŋ/ (also ph is pronounced /f/ in Greek-derived words). Doubled consonant letters (and the combination ck) are generally pronounced as single consonants, and qu and x are pronounced as the sequences /kw/ and /ks/. The letter y, when used as a consonant, represents /j/. However, this set of rules has exceptions: many words have silent consonants or other irregular pronunciations.
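The consonant correspondences described above are regular enough to be applied mechanically, which also makes their limits easy to see. The Python sketch below is a rough illustration, not a pronunciation engine: the function name consonant_phonemes is made up, vowel letters are simply skipped, th is always mapped to /θ/, y counts as a consonant only at the start of a word, and the rule that c and g are soft before e, i or y is a common rule of thumb that the text itself does not state.

```python
# Regular single-letter correspondences from the paragraph above.
SINGLE = {
    "b": "b", "d": "d", "f": "f", "h": "h", "j": "dʒ", "k": "k",
    "l": "l", "m": "m", "n": "n", "p": "p", "r": "r", "s": "s",
    "t": "t", "v": "v", "w": "w", "z": "z",
}
# Digraphs and fixed combinations ("th" is simplified to /θ/ only).
COMBINATIONS = {
    "ch": "tʃ", "sh": "ʃ", "th": "θ", "ng": "ŋ", "ph": "f",
    "ck": "k", "qu": "kw",
}
SOFTENING = {"e", "i", "y"}  # assumed triggers for soft c and soft g


def consonant_phonemes(word):
    """Return the consonant phonemes predicted by the regular spelling rules."""
    word = word.lower()
    phonemes = []
    i = 0
    while i < len(word):
        letter = word[i]
        pair = word[i:i + 2]
        following = word[i + 1:i + 2]  # empty string at the end of the word
        if pair in COMBINATIONS:
            phonemes.append(COMBINATIONS[pair])
            i += 2
        elif len(pair) == 2 and pair[0] == pair[1] and letter in SINGLE:
            phonemes.append(SINGLE[letter])  # doubled letter, single sound
            i += 2
        elif letter == "c":
            phonemes.append("s" if following in SOFTENING else "k")
            i += 1
        elif letter == "g":
            phonemes.append("dʒ" if following in SOFTENING else "ɡ")
            i += 1
        elif letter == "x":
            phonemes.append("ks")
            i += 1
        elif letter == "y" and i == 0:
            phonemes.append("j")
            i += 1
        else:
            if letter in SINGLE:
                phonemes.append(SINGLE[letter])
            i += 1  # vowel letters and anything unrecognised are skipped
    return phonemes


print(consonant_phonemes("quick"))   # ['kw', 'k']
print(consonant_phonemes("city"))    # ['s', 't']
print(consonant_phonemes("letter"))  # ['l', 't', 'r']
print(consonant_phonemes("yacht"))   # ['j', 'tʃ', 't']
```

The last line shows where the rules break down: they predict /tʃ/ for the ch of yacht, although those letters are silent in the actual pronunciation.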
With the vowels, however, correspondences between spelling and pronunciation are much less regular. As can be seen under Vowels above, there are many more vowel phonemes in English than there are vowel letters (a, e, i, o, u, y). This means that diphthongs and other long vowels often need to be indicated by combinations of letters (like the oa in boat and the ay in stay), or using a silent e or similar device (as in note and cake). Even these devices are not used consistently, so vowel pronunciation remains the main source of irregularity in English orthography.