The Turkic languages are a language family of at least thirty-five languages, spoken by Turkic peoples from Southeastern Europe and the Mediterranean to Siberia and Western China. The Turkic languages originated in a region of East Asia spanning Western China to Mongolia, where Proto-Turkic is thought to have been spoken, from where they expanded to Central Asia and farther west during the first millennium.
Turkic languages are spoken as a native language by some 170 million people, and the total number of Turkic speakers, including second-language speakers, is over 200 million. The Turkic language with the greatest number of speakers is Turkish, spoken mainly in Anatolia and the Balkans, the native speakers of which account for about 40% of all Turkic speakers.
Characteristic features such as vowel harmony, agglutination, and lack of grammatical gender are almost universal within the Turkic family. There is also a high degree of mutual intelligibility among the various Oghuz languages, which include Turkish, Azerbaijani, Turkmen, Qashqai, Gagauz, Balkan Gagauz Turkish, and Oghuz-influenced Crimean Tatar. Although methods of classification vary, the Turkic languages are usually considered to be divided into two branches: Oghur, the only surviving member of which is Chuvash, and Common Turkic, which includes all other Turkic languages including the Oghuz subbranch.
The characteristics of the Turkic family also show certain similarities to the surrounding East Asian language families of the Mongolic, Tungusic, Koreanic, and Japonic languages, leading to a previously widespread acceptance of an Altaic language family. Apparent similarities with the Uralic language family even caused these families to be regarded as one for a long time under the Ural-Altaic hypothesis. However, there has not been sufficient evidence to conclude that either of these macrofamilies exists, and the shared characteristics between the languages are presently attributed to extensive prehistoric language contact.
Turkic languages are null-subject languages, have vowel harmony and extensive agglutination by means of suffixes, and lack grammatical articles, noun classes, and grammatical gender. Subject–object–verb word order is universal within the family.
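Vowel harmony and suffixing agglutination can be illustrated with a small sketch based on the Turkish plural (-lar/-ler) and locative (-da/-de) suffixes. The function and constant names below are hypothetical, and the sketch rests on simplifying assumptions: lowercase input in Turkish Latin orthography, front/back harmony only, and no handling of loanword exceptions or of the -ta/-te variants used after voiceless consonants.

```python
# Illustrative sketch only: two-way (front/back) vowel harmony as seen in
# Turkish suffixes. All names here are hypothetical.
BACK_VOWELS = set("aıou")
FRONT_VOWELS = set("eiöü")


def add_suffix(word: str, back_form: str, front_form: str) -> str:
    """Append the suffix variant that agrees with the word's last vowel.

    Assumes lowercase input and ignores loanword exceptions and
    consonant assimilation.
    """
    for ch in reversed(word):
        if ch in BACK_VOWELS:
            return word + back_form
        if ch in FRONT_VOWELS:
            return word + front_form
    raise ValueError(f"no vowel found in {word!r}")


def plural(noun: str) -> str:
    return add_suffix(noun, "lar", "ler")


def locative(word: str) -> str:
    # Only the -da/-de variants are modelled here.
    return add_suffix(word, "da", "de")


print(plural("dağ"))           # dağlar  ("mountains")
print(plural("ev"))            # evler   ("houses")
print(locative(plural("ev")))  # evlerde ("in the houses")
```

Because each suffix is chosen from the last vowel of whatever precedes it, suffixes can be stacked one after another, which is the agglutinative pattern described above.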
Since the Ottoman era, the geographical distribution of Turkic-speaking peoples across Eurasia has ranged from the North-East of Siberia to Turkey in the West.
Extensive contact took place between Proto-Turks and Proto-Mongols approximately during the first millennium BCE; the shared cultural tradition between the two Eurasian nomadic groups is called the "Turco-Mongol" tradition. The two groups shared a religion, Tengrism, and there exists a multitude of evident loanwords between Turkic languages and Mongolic languages. Although the loans were bidirectional, today Turkic loanwords constitute the largest foreign component in Mongolian vocabulary. The most famous of these loanwords include "lion" (Turkish: aslan or arslan; Mongolian: arslan), "gold" (Turkish: altın; Mongolian: altan or alt), and "iron" (Turkish: demir; Mongolian: tömör).
The first established records of the Turkic languages are the eighth-century AD Orkhon inscriptions by the Göktürks, which record the Old Turkic language and were discovered in 1889 in the Orkhon Valley in Mongolia. The Compendium of the Turkic Dialects (Divânü Lügati't-Türk), written during the 11th century AD by Kaşgarlı Mahmud of the Kara-Khanid Khanate, constitutes an early linguistic treatment of the family. The Compendium is the first comprehensive dictionary of the Turkic languages and also includes the first known map of the Turkic speakers' geographical distribution. It mainly pertains to the Southwestern branch of the family.
The Codex Cumanicus (12th–13th centuries AD), which concerns the Northwestern branch, is another early linguistic manual. It covers the Kipchak language and Latin and was used by the Catholic missionaries sent to the Western Cumans, who inhabited a region corresponding to present-day Hungary and Romania. The earliest records of the language spoken by the Volga Bulgars, the parent of today's Chuvash language, are dated to the 13th–14th centuries AD.
With the Turkic expansion during the Early Middle Ages (c. 6th–11th centuries AD), Turkic languages spread across Central Asia, from Siberia to the Mediterranean, in the course of just a few centuries. Numerous words from the Turkic languages have passed into Persian, Hindustani, Russian, Chinese, and, to a lesser extent, Arabic.
For centuries, the Turkic-speaking peoples have migrated extensively and intermingled continuously, and their languages have been influenced mutually and through contact with the surrounding languages, especially the Iranian, Slavic, and Mongolic languages.
This has obscured the historical developments within each language and/or language group, and as a result, there exist several systems to classify the Turkic languages. The modern genetic classification schemes for Turkic are still largely indebted to Samoilovich (1922).
The Turkic languages may be divided into six branches. Five of them make up Common Turkic:
Southwestern (Oghuz Turkic)
Northwestern (Kipchak Turkic)
Southeastern (Karluk Turkic)
Northeastern (Siberian Turkic)
Arghu Turkic (Khalaj)
The sixth branch is Oghur Turkic.
In this classification, Oghur Turkic is also referred to as Lir-Turkic, and the other branches are subsumed under the title of Shaz-Turkic or Common Turkic. It is not clear when these two major types of Turkic can be assumed to have actually diverged.
With less certainty, the Southwestern, Northwestern, Southeastern and Oghur groups may further be summarized as West Turkic, and the Northeastern, Kyrgyz-Kipchak and Arghu (Khalaj) groups as East Turkic.
Geographically and linguistically, the languages of the Northwestern and Southeastern subgroups belong to the central Turkic languages, while the Northeastern and Khalaj languages are the so-called peripheral languages.
The following isoglosses are traditionally used in the classification of the Turkic languages:
Rhotacism (or in some views, zetacism), e.g. in the last consonant of the word for "nine" *toqqız. This separates the Oghur branch, which exhibits /r/, from the rest of Turkic, which exhibits /z/. In this case, rhotacism refers to the development of *-/r/, *-/z/, and *-/d/ to /r/, *-/k/, *-/kh/ in this branch. See Antonov and Jacques (2012) on the debate concerning rhotacism and lambdacism in Turkic.
Intervocalic *d, e.g. the second consonant in the word for "foot" *hadaq
Word-final -G, e.g. in the word for "mountain" *tāğ
Suffix-final -G, e.g. in the suffix *lIG, in e.g. *tāğlığ
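As a rough illustration of how the /r/ versus /z/ isogloss works as a diagnostic, the sketch below sorts a language by the final consonant of its word for "nine". It is a toy under stated assumptions: the function name is hypothetical, the word forms are given in an approximate Latin transliteration, and a real classification would weigh many isoglosses rather than a single word.

```python
# Toy sketch: apply the r/z isogloss to the word for "nine".
# The function name and the transliterations are illustrative assumptions.
def classify_by_nine(word_for_nine: str) -> str:
    """Return the branch suggested by the final consonant of 'nine'."""
    final = word_for_nine[-1]
    if final == "r":
        return "Oghur"
    if final == "z":
        return "Common Turkic"
    return "undetermined"


samples = {
    "Turkish": "dokuz",
    "Azerbaijani": "doqquz",
    "Chuvash": "tăhhăr",
}

for language, word in samples.items():
    print(language, "->", classify_by_nine(word))
```

Forms ending in neither consonant are reported as undetermined rather than forced into a branch, since a single reflex can be obscured by later sound changes.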
Additional isoglosses include:
Preservation of word-initial *h, e.g. in the word for "foot" *hadaq. This separates Khalaj as a peripheral language.
Denasalisation of palatal *ń, e.g. in the word for "moon", *ań
*In the standard Istanbul dialect of Turkish, the ğ in dağ and dağlık is not realized as a consonant, but as a slight lengthening of the preceding vowel.
The following table is based upon the classification scheme presented by Lars Johanson (1998).
The following is a brief comparison of cognates among the basic vocabulary across the Turkic language family (about 60 words).
Empty cells do not necessarily imply that a particular language is lacking a word to describe the concept, but rather that the word for the concept in that language may be formed from another stem and is not a cognate with the other words in the row or that a loanword is used in its place.
Also, there may be shifts in meaning from one language to another, so the "Common meaning" given is only approximate. In some cases the form given is found only in some dialects of the language, or a loanword is much more common (e.g. in Turkish, the preferred word for "fire" is the Persian-derived ateş, whereas the native od is obsolete). Forms are given in native Latin orthographies unless otherwise noted.
Turkic speakers often use the same basic roots when forming names, so that names like "Aksu" and "Karakul" appear all over Central Asia. Spellings vary with the language and transliteration system. Meanings can vary slightly with the language.