ISO 639-2 ben
Spoken by Bengalis
Native to Bangladesh, India
ISO 639-1 bn
Native speakers 250 million
Dialects Bengali dialects
Region Bengal, Bangladesh, West Bengal, Barak Valley, Tripura
Early forms Abahatta, Old Bengali, Bengali
Official language in Bangladesh India (in West Bengal, Tripura, Jharkhand and South Assam)
Language family Indo-European languages
Regulated by Bangla Academy, Paschimbanga Bangla Akademi
Writing system Bengali alphabet, Bengali Braille
Bengali (/bɛŋˈɡɔːli/), also known by its endonym Bangla (/bɑːŋlɑː/; বাংলা [ˈbaŋla]), is an Indo-Aryan language spoken in South Asia. It is the national and official language of the People's Republic of Bangladesh, and an official language in several parts of the Republic of India, including West Bengal, Tripura, Assam (Barak Valley) and the Andaman and Nicobar Islands. With over 210 million speakers, Bengali is the seventh most spoken native language in the world.
- Ancient language of Bengal
- Emergence of Bengali
- Middle Bengali
- Modern Bengali
- Geographical distribution
- Official status
- Spoken and literary varieties
- Consonant clusters
- Writing system
- Orthographic depth
- Word order
- Sample text
Although Bengali is an Indo-European language, it has been influenced by other language families prevalent in South Asia, notably the Dravidian, the Austroasiatic, and the Tibeto-Burman families, all of which contributed to Bengali vocabulary and provided the language with some structural forms. Dictionaries from the early 20th century attributed slightly more than half of the Bengali vocabulary to native words (i.e., naturally modified Sanskrit words, corrupted forms of Sanskrit words, and loanwords from non-Indo-European languages), about 45 percent to unmodified Sanskrit words, and the remainder to foreign words. Dominant in the last group was Persian, which was also the source of some grammatical forms. More recent studies suggest that the use of native and foreign words has been increasing, mainly because of the preference of Bengali speakers for the colloquial style. Today, Bengali is the primary language spoken in Bangladesh and the second most spoken language in India.
Bengali literature, with its millennium-old history and folk heritage, has developed extensively since the Bengali renaissance and is one of the most prominent and diverse literary traditions in Asia. The national anthems of both Bangladesh (Amar Sonar Bangla) and India (Jana Gana Mana) were composed in Bengali. In 1952, the Bengali Language Movement successfully pushed for the language's official status in the Dominion of Pakistan. In 1999, UNESCO recognized 21 February as International Mother Language Day in recognition of the language movement in East Pakistan. Language is an important element of Bengali identity and binds together a culturally diverse region.
Ancient language of Bengal
Sanskrit has been spoken in Bengal since the first millennium BCE. During the Gupta Empire, Bengal was a hub of Sanskrit literature. The Middle Indo-Aryan dialects were spoken in Bengal in the first millennium, when the region was a part of the Magadha Realm. These dialects were called Magadhi Prakrit. They eventually evolved into Ardha Magadhi. Ardha Magadhi began to give way to what are called Apabhraṃśa languages at the end of the first millennium.
Emergence of Bengali
Along with other Eastern Indo-Aryan languages, Bengali evolved circa 1000–1200 AD from Sanskrit and Magadhi Prakrit. The local Apabhraṃśa of the eastern subcontinent, Purbi Apabhraṃśa or Abahatta ("Meaningless Sounds"), eventually evolved into regional dialects, which in turn formed three groups: the Bengali–Assamese languages, the Bihari languages, and the Odia language. Some argue that the points of divergence occurred much earlier, going back as far as 500, but the language was not static: different varieties coexisted and authors often wrote in multiple dialects in this period. For example, Ardhamagadhi is believed to have evolved into Abahatta around the 6th century, which competed with the ancestor of Bengali for some time. Proto-Bengali was the language of the Pala Empire and the Sena dynasty.
During the medieval period, Middle Bengali was characterized by the elision of word-final অ ô, the spread of compound verbs and Arabic and Persian influences. Bengali was an official court language of the Sultanate of Bengal. Muslim rulers promoted the literary development of Bengali as part of efforts to Islamize and to check the influence of Sanskrit. Bengali became the most spoken vernacular language in the Sultanate. This period saw borrowing of Perso-Arabic terms into Bengali vocabulary. Major texts of Middle Bengali (1400–1800) include Chandidas' Shreekrishna Kirtana.
The modern literary form of Bengali was developed during the 19th and early 20th centuries based on the dialect spoken in the Nadia region, a west-central Bengali dialect. Bengali presents a strong case of diglossia, with the literary and standard form differing greatly from the colloquial speech of the regions that identify with the language. The modern Bengali vocabulary is built on a base from Magadhi Prakrit and Pali, along with tatsamas and reborrowings from Sanskrit and major borrowings from Persian, Arabic, Austroasiatic languages and other languages with which Bengali has been in contact.
During this period, the চলিতভাষা Chôlitôbhasha form of Bengali, which uses simplified inflections and other changes, was emerging from সাধুভাষা Sadhubhasha (the proper or original form of Bengali) as the form of choice for written Bengali.
In 1948, the Government of Pakistan tried to impose Urdu as the sole state language of Pakistan, which set off the Bengali Language Movement. This was a popular ethno-linguistic movement in the former East Bengal (today Bangladesh), born of the strong linguistic consciousness of the Bengalis, which sought to gain and protect recognition of spoken and written Bengali as a state language of the then Dominion of Pakistan. On 21 February 1952, five students and political activists were killed during protests near the campus of the University of Dhaka. In 1956, Bengali was made a state language of Pakistan. The day has since been observed as Language Movement Day in Bangladesh and was proclaimed International Mother Language Day by UNESCO on 17 November 1999, marking Bengali as a language known for its language movements and for people sacrificing their lives for their mother language.
A Bengali language movement in the Indian state of Assam took place in 1961, a protest against the decision of the Government of Assam to make Assamese the only official language of the state even though a significant proportion of the population were Bengali-speaking, particularly in the Barak Valley.
In 2010, the parliament of Bangladesh and the legislative assembly of West Bengal proposed that Bengali be made an official UN language. Their motions came after Bangladeshi Prime Minister Sheikh Hasina suggested the idea while addressing the UN General Assembly that year.
The Bengali language is native to the region of Bengal, which comprises the Indian states of West Bengal and Tripura, southern Assam, and the present-day nation of Bangladesh.
Beyond its native region, it is also spoken by the majority of the population in the Indian union territory of Andaman and Nicobar Islands. There are sizable Bengali-speaking populations in Odisha, Bihar, Jharkhand, Chhattisgarh and Delhi in India. Bengali-speaking people are also found in cities such as Mumbai, Varanasi and Vrindavan, and in other places in India. There are also significant Bengali-speaking communities in the Middle East, Japan, the United States, Singapore, Malaysia, the Maldives, Australia, Canada and the United Kingdom.
Bengali is the national and official language of Bangladesh, and one of the 23 official languages in India. It is the official language of the Indian states of West Bengal and Tripura and of the Barak Valley of Assam. It is also a major language in the Indian union territory of Andaman and Nicobar Islands.
Bengali has been a second official language of the Indian state of Jharkhand since September 2011. It is also a recognized secondary language in the city of Karachi in Pakistan. The Department of Bengali at the University of Karachi offers regular programs of study in Bengali literature at the bachelor's and master's levels.
The national anthems of both Bangladesh and India were written in Bengali by the Bengali Nobel laureate Rabindranath Tagore. In 2009, elected representatives in both Bangladesh and West Bengal called for the Bengali language to be made an official language of the United Nations.
Regional variation in spoken Bengali constitutes a dialect continuum. Linguist Suniti Kumar Chattopadhyay grouped these dialects into four large clusters (Rarh, Banga, Kamarupa and Varendra), but many alternative grouping schemes have also been proposed. The south-western dialects (Rarh or Nadia dialect) form the basis of modern standard colloquial Bengali. In the dialects prevalent in much of eastern and south-eastern Bangladesh (Barisal, Chittagong, Dhaka and Sylhet Divisions of Bangladesh), many of the stops and affricates heard in West Bengal are pronounced as fricatives. Western alveolo-palatal affricates চ [tɕɔ], ছ [tɕʰɔ], জ [dʑɔ] correspond to eastern চ [tsɔ], ছ [tsʰɔ~sɔ], জ [dzɔ~zɔ]. The influence of Tibeto-Burman languages on the phonology of Eastern Bengali is seen through the lack of nasalized vowels and an alveolar articulation of what are categorised as the "cerebral" consonants (as opposed to the postalveolar articulation of West Bengal). Some variants of Bengali, particularly Chittagonian and Chakma, have contrastive tone; differences in the pitch of the speaker's voice can distinguish words. Rangpuri, Kharia Thar and Mal Paharia are closely related to Western Bengali dialects, but are typically classified as separate languages. Similarly, Hajong is considered a separate language, although it shares similarities with Northern Bengali dialects.
During the standardization of Bengali in the 19th century and early 20th century, the cultural center of Bengal was in the city of Kolkata, founded by the British. What is accepted as the standard form today in both West Bengal and Bangladesh is based on the West-Central dialect of Nadia District, located next to the border of Bangladesh. There are cases where speakers of Standard Bengali in West Bengal will use a different word from a speaker of Standard Bengali in Bangladesh, even though both words are of native Bengali descent. For example, the word for "salt" is নুন nun in the west, which corresponds to লবণ lôbôn in the east.
Spoken and literary varieties
Bengali exhibits diglossia between the written and spoken forms of the language, though the notion is contested: some scholars have proposed triglossia, n-glossia or heteroglossia instead. Two styles of writing, involving somewhat different vocabularies and syntax, have emerged:
- Shadhu-bhasha (সাধুভাষা ← সাধু shadhu "sage" + ভাষা bhasha "language") was the written language, with longer verb inflections and more of a Pali and Sanskrit-derived Tatsama vocabulary. Songs such as India's national anthem Jana Gana Mana (by Rabindranath Tagore) were composed in Shadhubhasha. However, use of Shadhubhasha in modern writing is uncommon, restricted to some official signs and documents in Bangladesh as well as for achieving particular literary effects.
- Cholitobhasha (চলিতভাষা ← চলিত chôlitô "current" + ভাষা bhasha "language"), known by linguists as Standard Colloquial Bengali, is a written Bengali style exhibiting a preponderance of colloquial idiom and shortened verb forms, and is the standard for written Bengali now. This form came into vogue towards the turn of the 19th century, promoted by the writings of Peary Chand Mitra (Alaler Gharer Dulal, 1857), Pramatha Chaudhuri (Sabujpatra, 1914) and in the later writings of Rabindranath Tagore. It is modeled on the dialect spoken in the Shantipur region in Nadia district, West Bengal. This form of Bengali is often referred to as the "Nadia standard", "Nadia dialect", "Southwestern/West-Central dialect" or "Shantipuri Bangla".
While most writing is in Standard Colloquial Bengali, spoken dialects exhibit a greater variety. People in south-eastern West Bengal, including Kolkata, speak Standard Colloquial Bengali. People in other parts of West Bengal and in western Bangladesh speak dialects that are minor variations of it, such as the Midnapore dialect, which is characterised by some unique words and constructions. However, a majority in Bangladesh speak dialects notably different from Standard Colloquial Bengali. Some dialects, particularly those of the Chittagong region, bear only a superficial resemblance to Standard Colloquial Bengali. The dialect of the Chittagong region is the least widely understood by the general body of Bengalis. The majority of Bengalis are able to communicate in more than one variety; often, speakers are fluent in Cholitobhasha (Standard Colloquial Bengali) and one or more regional dialects.
Even in Standard Colloquial Bengali, the words used may differ according to the speaker's religion: Hindus are more likely to use words derived from Sanskrit and of Austroasiatic Deshi origin, whereas Muslims are more likely to use words of Persian and Arabic origin. For example:
The phonemic inventory of standard Bengali consists of 29 consonants and 7 vowels, including 6 nasalized vowels. The inventory is set out below in the International Phonetic Alphabet (upper grapheme in each box) and romanization (lower grapheme).
Bengali is known for its wide variety of diphthongs, combinations of vowels occurring within the same syllable.
In standard Bengali, stress is predominantly initial. Bengali words are virtually all trochaic; the primary stress falls on the initial syllable of the word, while secondary stress often falls on all odd-numbered syllables thereafter, giving strings such as in সহযোগিতা shô-hô-jo-gi-ta "cooperation", where the boldface represents primary and secondary stress.
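The stress pattern described above can be sketched in Python. This is only an illustration under stated assumptions: the function name `mark_stress` and the three labels are invented here, the word is supplied already split into syllables, and the text says secondary stress "often" falls on odd-numbered syllables, so the result is a tendency rather than a fixed rule.

```python
def mark_stress(syllables):
    """Pair each syllable with a stress label.

    Assumption: syllables are counted from 1; the first carries primary
    stress and every later odd-numbered syllable carries secondary stress.
    """
    marked = []
    for position, syllable in enumerate(syllables, start=1):
        if position == 1:
            stress = "primary"
        elif position % 2 == 1:
            stress = "secondary"
        else:
            stress = "unstressed"
        marked.append((syllable, stress))
    return marked


# Romanized example from the text: shô-hô-jo-gi-ta "cooperation"
for syllable, stress in mark_stress(["shô", "hô", "jo", "gi", "ta"]):
    print(f"{syllable}: {stress}")
```

For the example word, the first, third and fifth syllables come out as stressed.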
Native Bengali words do not allow initial consonant clusters; the maximum syllabic structure is CVC (i.e. one vowel flanked by a consonant on each side). Many speakers of Bengali restrict their phonology to this pattern, even when using Sanskrit or English borrowings, such as গেরাম geram (CV.CVC) for গ্রাম gram (CCVC) "village" or ইস্কুল iskul (VC.CVC) for স্কুল skul (CCVC) "school".
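The consonant-vowel patterns quoted above can be checked with a short Python sketch. It assumes a plain romanization in which every letter stands for one sound and the vowels are a, e, i, o, u. That holds for the four example words but not for Bengali romanization in general, and the helper names are invented for illustration. The sketch produces the flat C/V pattern without syllable breaks.

```python
VOWELS = set("aeiou")  # assumption: plain romanization, one letter per sound


def cv_skeleton(word):
    """Map each letter of a romanized word to C (consonant) or V (vowel)."""
    return "".join("V" if letter in VOWELS else "C" for letter in word.lower())


def has_initial_cluster(word):
    """Return True if the word begins with two or more consonants."""
    return cv_skeleton(word).startswith("CC")


# Borrowed form and the adapted form some speakers use, as given in the text.
for borrowed, adapted in [("gram", "geram"), ("skul", "iskul")]:
    print(borrowed, cv_skeleton(borrowed), "->", adapted, cv_skeleton(adapted))
```

In both pairs the adapted form no longer begins with a cluster: geram inserts a vowel inside it, and iskul places a vowel in front of it.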
The Bengali script is an abugida, a script with letters for consonants, diacritics for vowels, and in which an "inherent" vowel (অ ô) is assumed for consonants if no vowel is marked. The Bengali alphabet is used throughout Bangladesh and eastern India (Assam, West Bengal, Tripura). The Bengali alphabet is believed to have evolved from a modified Brahmic script around 1000 CE (or 10th – 11th century). Note that despite Bangladesh being majority Muslim, it uses the Bengali alphabet rather than an Arabic-based one like Pakistan does.
The Bengali script is a cursive script with eleven graphemes or signs denoting nine vowels and two diphthongs, and thirty-nine graphemes representing consonants and other modifiers. There are no distinct upper and lower case letter forms. The letters run from left to right and spaces are used to separate orthographic words. Bengali script has a distinctive horizontal line called মাত্রা matra that runs along the tops of the graphemes and links them together.
Since the Bengali script is an abugida, its consonant graphemes usually do not represent phonetic segments, but carry an "inherent" vowel and thus are syllabic in nature. The inherent vowel is usually a back vowel, either [ɔ] as in মত [mɔt̪] "opinion" or [o], as in মন [mon] "mind", with variants like the more open [ɒ]. To emphatically represent a consonant sound without any inherent vowel attached to it, a special diacritic, called the hôsôntô (্), may be added below the basic consonant grapheme (as in ম্ [m]). This diacritic, however, is not common, and is chiefly employed as a guide to pronunciation. The abugida nature of Bengali consonant graphemes is not consistent, however. Often, syllable-final consonant graphemes, though not marked by a hôsôntô, may carry no inherent vowel sound (as in the final ন in মন [mon] or the medial ম in গামলা [ɡamla]).
A consonant sound followed by some vowel sound other than the inherent [ɔ] is orthographically realized by using a variety of vowel allographs above, below, before, after, or around the consonant sign, thus forming the ubiquitous consonant-vowel typographic ligatures. These allographs, called কার kar, are diacritical vowel forms and cannot stand on their own. For example, the graph মি [mi] represents the consonant [m] followed by the vowel [i], where [i] is represented as the diacritical allograph ি (called ই-কার i-kar) and is placed before the default consonant sign. Similarly, the graphs মা [ma], মী [mi], মু [mu], মূ [mu], মৃ [mri], মে [me~mæ], মৈ [moj], মো [mo] and মৌ [mow] represent the same consonant ম combined with seven other vowels and two diphthongs. It should be noted that in these consonant-vowel ligatures, the so-called "inherent" vowel [ɔ] is first expunged from the consonant before adding the vowel, but this intermediate expulsion of the inherent vowel is not indicated in any visual manner on the basic consonant sign ম [mɔ].
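As an illustration of how these consonant-vowel combinations are stored in digital text, the Python sketch below builds them from Unicode code points. The code points and character names come from the Unicode Bengali block; the helper `attach` and the dictionary keys (taken from the Unicode character names, not from the romanization used here) are choices made for this example.

```python
import unicodedata

MA = "\u09AE"  # ম, BENGALI LETTER MA

# Dependent vowel signs (kar), keyed by the suffix of their Unicode name.
VOWEL_SIGNS = {
    "AA": "\u09BE",
    "I": "\u09BF",   # drawn to the left of the consonant
    "II": "\u09C0",
    "U": "\u09C1",
    "UU": "\u09C2",
    "VOCALIC R": "\u09C3",
    "E": "\u09C7",
    "AI": "\u09C8",
    "O": "\u09CB",
    "AU": "\u09CC",
}


def attach(consonant, sign_key):
    """Append a dependent vowel sign to a consonant in stored order."""
    return consonant + VOWEL_SIGNS[sign_key]


for key in VOWEL_SIGNS:
    syllable = attach(MA, key)
    names = " + ".join(unicodedata.name(ch) for ch in syllable)
    print(syllable, "=", names)
```

In the stored string the vowel sign always follows the consonant, including ি, which is drawn before it. Nothing in the string marks the removal of the inherent vowel.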
The vowel graphemes in Bengali can take two forms: the independent form found in the basic inventory of the script and the dependent, abridged, allograph form (as discussed above). To represent a vowel in isolation from any preceding or following consonant, the independent form of the vowel is used. For example, in মই [moj] "ladder" and in ইলিশ [iliɕ] "Hilsa fish", the independent form of the vowel ই is used (cf. the dependent form ি). A vowel at the beginning of a word is always realized using its independent form.
In addition to the inherent-vowel-suppressing hôsôntô, three more diacritics are commonly used in Bengali. These are the superposed chôndrôbindu (ঁ), denoting a suprasegmental for nasalization of vowels (as in চাঁদ [tɕãd] "moon"), the postposed ônusbar (ং) indicating the velar nasal [ŋ] (as in বাংলা [baŋla] "Bengali") and the postposed bisôrgô (ঃ) indicating the voiceless glottal fricative [h] (as in উঃ! [uh] "ouch!") or the gemination of the following consonant (as in দুঃখ [dukʰːɔ] "sorrow").
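For readers working with Bengali text in software, the four diacritics can be identified by their Unicode code points. The following Python sketch lists them and spells out the word বাংলা character by character. The dictionary keys reuse the romanized names from the paragraph above, and `spell_out` is a helper invented for this example.

```python
import unicodedata

# Code points from the Unicode Bengali block.
DIACRITICS = {
    "hôsôntô": "\u09CD",       # BENGALI SIGN VIRAMA
    "chôndrôbindu": "\u0981",  # BENGALI SIGN CANDRABINDU
    "ônusbar": "\u0982",       # BENGALI SIGN ANUSVARA
    "bisôrgô": "\u0983",       # BENGALI SIGN VISARGA
}


def spell_out(text):
    """List the code point and Unicode name of every character in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


BANGLA = "\u09AC\u09BE\u0982\u09B2\u09BE"  # বাংলা
for line in spell_out(BANGLA):
    print(line)
```

The ônusbar appears as the third stored character, after the vowel sign of the first syllable, which matches its description as postposed.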
The Bengali consonant clusters (যুক্তব্যঞ্জন juktôbênjôn) are usually realized as ligatures, where the consonant which comes first is put on top of or to the left of the one that immediately follows. In these ligatures, the shapes of the constituent consonant signs are often contracted and sometimes even distorted beyond recognition. In the Bengali writing system, there are nearly 285 such ligatures denoting consonant clusters. Although there exist a few visual formulas to construct some of these ligatures, many of them have to be learned by rote. Recently, in a bid to lessen this burden on young learners, efforts have been made by educational institutions in the two main Bengali-speaking regions (West Bengal and Bangladesh) to address the opaque nature of many consonant clusters, and as a result, modern Bengali textbooks are beginning to contain more and more "transparent" graphical forms of consonant clusters, in which the constituent consonants of a cluster are readily apparent from the graphical form. However, since this change is not as widespread and is not being followed as uniformly in the rest of the Bengali printed literature, today's Bengali-learning children will possibly have to learn to recognize both the new "transparent" and the old "opaque" forms, which ultimately amounts to an increase in learning burden.
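In Unicode text, a consonant cluster is typically stored as its constituent consonants joined by the hôsôntô (U+09CD), and the font decides whether to draw an "opaque" or a "transparent" ligature. The Python sketch below shows that encoding for two clusters mentioned in this article; the function name `conjunct` is an assumption made for illustration.

```python
HOSONTO = "\u09CD"  # BENGALI SIGN VIRAMA


def conjunct(*consonants):
    """Join consonant letters with a hôsôntô between each adjacent pair."""
    return HOSONTO.join(consonants)


KA, SSA, TA, RA = "\u0995", "\u09B7", "\u09A4", "\u09B0"  # ক, ষ, ত, র

for cluster in (conjunct(KA, SSA), conjunct(TA, RA)):
    code_points = " ".join(f"U+{ord(ch):04X}" for ch in cluster)
    print(cluster, "=", code_points)
```

Because the stored sequence is the same whichever glyph the font draws, the choice between opaque and transparent forms is a rendering matter and does not change the underlying text.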
Bengali punctuation marks, apart from the downstroke । daṛi – the Bengali equivalent of a full stop – have been adopted from western scripts and their usage is similar.
Unlike in western scripts (Latin, Cyrillic, etc.) where the letter-forms stand on an invisible baseline, the Bengali letter-forms instead hang from a visible horizontal left-to-right headstroke called মাত্রা matra. The presence and absence of this matra can be important. For example, the letter ত tô and the numeral ৩ "3" are distinguishable only by the presence or absence of the matra, as is the case between the consonant cluster ত্র trô and the independent vowel এ e. The letter-forms also employ the concepts of letter-width and letter-height (the vertical space between the visible matra and an invisible baseline).
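Although ত and ৩ look alike on the page, they are separate characters in encoded text. The short Python sketch below, which uses only the standard `unicodedata` module, prints their Unicode names and categories; the variable names are chosen for this example.

```python
import unicodedata

TA = "\u09A4"     # ত, the consonant letter
THREE = "\u09E9"  # ৩, the numeral

for ch in (TA, THREE):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))

# The numeral carries a digit value; the letter does not.
print(unicodedata.digit(THREE))
print(unicodedata.digit(TA, None))
```

The letter is reported with category Lo (other letter) and the numeral with Nd (decimal digit), so software can tell them apart even where a reader depends on the matra.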
There is yet to be a uniform standard collating sequence (sorting order of graphemes to be used in dictionaries, indices, computer sorting programs, etc.) of Bengali graphemes. Experts in both Bangladesh and India are currently working towards a common solution for this problem.
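To show why a collating sequence needs to be defined rather than assumed, the Python sketch below sorts a few words from this article by raw Unicode code point, which is what Python's built-in `sorted` does for strings. This is a naive baseline for illustration only and is not a proposed or official Bengali ordering.

```python
# Words that appear elsewhere in this article: মন, কম, ইলিশ
words = ["\u09AE\u09A8", "\u0995\u09AE", "\u0987\u09B2\u09BF\u09B6"]

by_code_point = sorted(words)  # compares strings code point by code point
print(by_code_point)

# ড় has its own code point (U+09DC), far from ড (U+09A1), so a raw
# code-point sort places it after হ (U+09B9), the last basic consonant.
letters = sorted(["\u09DC", "\u09B9", "\u09A1"])
print([f"U+{ord(ch):04X}" for ch in letters])
```

A dictionary ordering would normally need explicit rules or a collation library for cases like the second one.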
The Bengali script in general has a comparatively shallow orthography, i.e., in most cases there is a one-to-one correspondence between the sounds (phonemes) and the letters (graphemes) of Bengali. But grapheme-phoneme inconsistencies do occur in certain cases.
One kind of inconsistency is due to the presence of several letters in the script for the same sound. In spite of some modifications in the 19th century, the Bengali spelling system continues to be based on the one used for Sanskrit, and thus does not take into account some sound mergers that have occurred in the spoken language. For example, there are three letters (শ, ষ, and স) for the voiceless alveolo-palatal sibilant [ɕɔ], although the letter স retains the voiceless alveolar sibilant [sɔ] sound when used in certain consonant conjuncts as in স্খলন [skʰɔlɔn] "fall", স্পন্দন [spɔndɔn] "beat", etc. The letter ষ also retains the voiceless retroflex sibilant [ʂɔ] sound when used in certain consonant conjuncts as in কষ্ট [kɔʂʈɔ] "suffering", গোষ্ঠী [ɡoʂʈʰi] "clan", etc. Similarly, there are two letters (জ and য) for the voiced alveolo-palatal affricate [dʑɔ]. Moreover, what was once pronounced and written as a retroflex nasal ণ [ɳɔ] is now pronounced as an alveolar [nɔ] in conversation (the difference is heard when reading), unless it is conjoined with another retroflex consonant such as ট, ঠ, ড and ঢ, although the spelling does not reflect this change. The near-open front unrounded vowel [æ] is orthographically realized by multiple means, as seen in the following examples: এত [æt̪ɔ] "so much", এ্যাকাডেমী [ækademi] "academy", অ্যামিবা [æmiba] "amoeba", দেখা [d̪ækʰa] "to see", ব্যস্ত [bæst̪ɔ] "busy", ব্যাকরণ [bækɔrɔn] "grammar".
Another kind of inconsistency is concerned with the incomplete coverage of phonological information in the script. The inherent vowel attached to every consonant can be either [ɔ] or [o] depending on vowel harmony (স্বরসঙ্গতি) with the preceding or following vowel or on the context, but this phonological information is not captured by the script, creating ambiguity for the reader. Furthermore, the inherent vowel is often not pronounced at the end of a syllable, as in কম [kɔm] "less", but this omission is not generally reflected in the script, making it difficult for the new reader.
Many consonant clusters have different sounds than their constituent consonants. For example, the combination of the consonants ক্ [k] and ষ [ʂɔ] is graphically realized as ক্ষ and is pronounced [kkʰɔ] (as in রুক্ষ [rukkʰɔ] "rugged") or [kkʰo] (as in ক্ষতি [kkʰot̪i] "loss") or even [kkʰɔ] (as in ক্ষমতা [kkʰɔmɔt̪a] "power"), depending on the position of the cluster in a word. The Bengali writing system is, therefore, not always a true guide to pronunciation.
The script used for Bengali, Assamese and other languages is known as the Bengali-Assamese or Eastern Nagari script. It is known as the Bengali alphabet when used for Bengali and its dialects, and as the Assamese alphabet, with some minor variations, when used for Assamese. Other related languages in the nearby region also use the Bengali alphabet. One example is the Meitei language in the Indian state of Manipur, which has been written in the Bengali alphabet for centuries, though the Meitei script has been promoted in recent times.
Various romanization systems for Bengali have been created in recent years, but they have failed to represent the true Bengali phonetic sound. The Bengali alphabet has often been grouped with the other Brahmic scripts for romanization, in schemes where the true phonetic value of Bengali is never represented. These include the International Alphabet of Sanskrit Transliteration or IAST system (based on diacritics), "Indian languages Transliteration" or ITRANS (which uses upper-case letters suited to ASCII keyboards), and the National Library at Kolkata romanization.
In the context of Bengali romanization, it is important to distinguish transliteration from transcription. Transliteration is orthographically accurate (i.e. the original spelling can be recovered), whereas transcription is phonetically accurate (the pronunciation can be reproduced).
Although it might be desirable to use a transliteration scheme where the original Bengali orthography is recoverable from the Latin text, Bengali words are currently Romanized on Wikipedia using a phonemic transcription, where the true phonetic pronunciation of Bengali is represented with no reference to how it is written.
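The difference between the two approaches can be sketched with the three sibilant letters discussed earlier. In the Python example below, the transliteration values follow IAST (ś, ṣ, s), while the transcription values use "sh" for all three. That is a simplification made here, since it ignores the conjunct contexts where স and ষ keep distinct sounds. The function `is_reversible` is a name invented for this example.

```python
# শ, ষ, স mapped letter by letter (IAST keeps the three letters distinct).
TRANSLITERATION = {"\u09B6": "ś", "\u09B7": "ṣ", "\u09B8": "s"}

# The same letters mapped by their usual pronunciation (simplified).
TRANSCRIPTION = {"\u09B6": "sh", "\u09B7": "sh", "\u09B8": "sh"}


def is_reversible(mapping):
    """A mapping can be inverted only if no two letters share an output."""
    return len(set(mapping.values())) == len(mapping)


print("transliteration reversible:", is_reversible(TRANSLITERATION))
print("transcription reversible:", is_reversible(TRANSCRIPTION))
```

The first mapping can be inverted to recover the original spelling; the second cannot, which is the trade-off a pronunciation-based romanization accepts.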
Bengali nouns are not assigned gender, which leads to minimal changing of adjectives (inflection). However, nouns and pronouns are moderately declined (altered depending on their function in a sentence) into four cases while verbs are heavily conjugated, and the verbs do not change form depending on the gender of the nouns.
As a head-final language, Bengali follows subject–object–verb word order, although variations to this theme are common. Bengali makes use of postpositions, as opposed to the prepositions used in English and other European languages. Determiners follow the noun, while numerals, adjectives, and possessors precede the noun.
Yes-no questions do not require any change to the basic word order; instead, the low (L) tone of the final syllable in the utterance is replaced with a falling (HL) tone. Additionally, optional particles (e.g. কি -ki, না -na, etc.) are often encliticized onto the first or last word of a yes-no question.
Wh-questions are formed by fronting the wh-word to focus position, which is typically the first or second word in the utterance.
Nouns and pronouns are inflected for case, including nominative, objective, genitive (possessive), and locative. The case marking pattern for each noun being inflected depends on the noun's degree of animacy. When a definite article such as -টা -ṭa (singular) or -গুলা -gula (plural) is added, as in the tables below, nouns are also inflected for number.
When counted, nouns take one of a small set of measure words. Similar to Japanese, the nouns in Bengali cannot be counted by adding the numeral directly adjacent to the noun. The noun's measure word (MW) must be used between the numeral and the noun. Most nouns take the generic measure word -টা -ṭa, though other measure words indicate semantic classes (e.g. -জন -jôn for humans).
Measuring nouns in Bengali without their corresponding measure words (e.g. আট বিড়াল aṭ biṛal instead of আটটা বিড়াল aṭ-ṭa biṛal "eight cats") would typically be considered ungrammatical. However, when the semantic class of the noun is understood from the measure word, the noun is often omitted and only the measure word is used, e.g. শুধু একজন থাকবে। Shudhu êk-jôn thakbe. (lit. "Only one-MW will remain.") would be understood to mean "Only one person will remain.", given the semantic class implicit in -জন -jôn.
In this sense, all nouns in Bengali, unlike most other Indo-European languages, are similar to mass nouns.
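The counting pattern described above (numeral, then measure word, then an optional noun) can be sketched in Python using the romanized forms from the examples. The function name `count_phrase`, its `human` flag and the hyphen between numeral and measure word are conventions assumed for this illustration; only the two measure words named in the text are covered.

```python
GENERIC_MW = "ṭa"  # -টা, the generic measure word
HUMAN_MW = "jôn"   # -জন, used for humans


def count_phrase(numeral, noun=None, human=False):
    """Build a romanized counted phrase: numeral-MW, then the noun if given."""
    measure_word = HUMAN_MW if human else GENERIC_MW
    counted = f"{numeral}-{measure_word}"
    return counted if noun is None else f"{counted} {noun}"


print(count_phrase("aṭ", "biṛal"))     # "eight cats"
print(count_phrase("êk", human=True))  # noun omitted: understood as "one person"
```

Leaving out the noun mirrors the usage noted above, where the measure word alone signals the semantic class.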
There are two classes of verbs: finite and non-finite. Non-finite verbs have no inflection for tense or person, while finite verbs are fully inflected for person (first, second, third), tense (present, past, future), aspect (simple, perfect, progressive), and honor (intimate, familiar, and formal), but not for number. Conditional, imperative, and other special inflections for mood can replace the tense and aspect suffixes. The number of inflections on many verb roots can total more than 200.
Inflectional suffixes in the morphology of Bengali vary from region to region, along with minor differences in syntax.
Bengali differs from most Indo-Aryan languages in its zero copula: the copula or connective be is often missing in the present tense. Thus, "he is a teacher" is সে শিক্ষক se shikkhôk (literally "he teacher"). In this respect, Bengali is similar to Russian and Hungarian. Romani grammar is also said to be the closest to Bengali grammar.
Bengali has as many as 100,000 separate words, of which 50,000 are considered tatsamas, 21,100 are tadbhavas and the remainder loanwords from Austroasiatic and other foreign languages.
However, these figures do not take into account the large proportion of archaic or highly technical words that are little used. The productive vocabulary used in modern literary works is in fact made up mostly (67%) of tadbhavas, while tatsamas comprise only 25% of the total. Loanwords from non-Indic languages comprise the remaining 8% of the vocabulary used in modern Bengali literature.
Because of centuries of contact with Europeans, Turkic peoples, and Persians, the Bengali language has absorbed numerous words from foreign languages, often totally integrating these borrowings into the core vocabulary.
The most common borrowings from foreign languages come from three different kinds of contact. After close contact with several indigenous Austroasiatic languages, and later the Mughal invasion, whose court language was Persian, numerous Chagatai, Arabic, and Persian words were absorbed into the lexicon.
The following is a sample text in Bengali of Article 1 of the Universal Declaration of Human Rights:
Bengali in the Bengali alphabet
Bengali in phonetic Romanization
Bengali in the International Phonetic Alphabet