Hindustani grammar

Updated on Nov 22, 2024

Hindustani is the lingua franca of northern India and Pakistan, and its two standardised registers, Hindi and Urdu, are official languages of India and Pakistan respectively. Grammatical differences between the two standards are minimal but each uses its own script: Hindi uses Devanagari while Urdu uses an extended form of the Perso-Arabic script, typically in the Nastaʿlīq style.

On this grammar page Hindustani is written in "standard orientalist" transcription as outlined in Masica (1991:xv). Being "primarily a system of transliteration from the Indian scripts, [and] based in turn upon Sanskrit" (cf. IAST), these are its salient features: subscript dots for retroflex consonants; macrons for etymologically, contrastively long vowels; h denoting aspirated plosives. Tildes denote nasalized vowels.

Phonology

The vowels used in Hindustani are the following: a, ā, i, ī, u, ū, e, o, ai, au. Note that the vowels a, ai, and au are normally pronounced [ə], [ɛː], and [ɔː] respectively. See Hindustani phonology for the consonant inventory and further clarification.

Nouns

Hindustani distinguishes two genders (masculine and feminine), two noun types (count and non-count), two numbers (singular and plural), and three cases (direct, oblique, and vocative). Nouns may be further divided into two classes based on declension, called type-I (marked) and type-II (unmarked). The basic difference between the two categories is that the former has characteristic terminations in the direct singular while the latter does not.

The table below displays the suffix paradigms. A hyphen (for the marked type-I) denotes change from the original termination to another (for example laṛkā to laṛke in the masculine singular oblique), whereas a plus sign (for the unmarked type-II) denotes an ending which should be added (seb to sebõ in the masculine plural oblique).

The next table of noun declensions, mostly adapted from Shapiro (2003:263), shows the above suffix paradigms in action. Words: laṛkā ('boy'), kuā̃ ('well'), seb ('apple'), vālid ('father'), cākū ('penknife'), ādmī ('man'), mitra ('friend'), laṛkī ('girl'), ciṛiyā ('finch'), kitāb ('book'), bhāṣā ('language'), and aurat ('woman').
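The masculine paradigms can be sketched in code. The following is a minimal illustration, not a full declension engine: the function name `decline_masculine` and the `marked` flag are hypothetical, the endings used (-e and -õ for type-I, +õ for type-II) are the standard ones and are assumed here rather than taken from the table, and nasalized type-I nouns such as kuā̃, feminines, and the vocative are deliberately left out.

```python
import unicodedata


def nfc(text: str) -> str:
    """Normalise to NFC so that vowels with combining tildes compare reliably."""
    return unicodedata.normalize("NFC", text)


def decline_masculine(noun: str, marked: bool) -> dict:
    """Return direct/oblique, singular/plural forms of a masculine noun.

    `marked` selects type-I (marked) or type-II (unmarked); the caller must
    supply it because the final vowel alone does not decide the type.
    """
    noun = nfc(noun)
    if marked:
        if not noun.endswith("ā"):
            raise ValueError("this sketch only handles type-I nouns in plain -ā")
        base = noun[:-1]  # the ending replaces the final -ā
        return {
            ("direct", "sg"): noun,
            ("direct", "pl"): base + "e",
            ("oblique", "sg"): base + "e",
            ("oblique", "pl"): nfc(base + "õ"),
        }
    # Type-II: only the oblique plural adds an ending.
    plural_base = noun
    if noun.endswith("ū"):
        plural_base = noun[:-1] + "u"   # ū shortens before the ending
    elif noun.endswith("ī"):
        plural_base = noun[:-1] + "iy"  # ī shortens and y is inserted
    return {
        ("direct", "sg"): noun,
        ("direct", "pl"): noun,
        ("oblique", "sg"): noun,
        ("oblique", "pl"): nfc(plural_base + "õ"),
    }


for word, is_marked in [("laṛkā", True), ("seb", False), ("ādmī", False)]:
    print(word, decline_masculine(word, is_marked))
```

Passing the type explicitly is a design choice: unmarked masculines that end in ā would otherwise be mis-declined by any rule keyed on the final vowel.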

Notes for noun declension:

^1 This is also the ending used for the vocative masculine singular.

^2 A small number of marked masculines like kuā̃ display nasalization of all terminations.

^3 Some masculines ending in ā don't change in the direct plural and fall in the unmarked category, e.g. pitā "father", cācā "uncle", rājā "king".

^4 Unmarked nouns ending in ū and ī generally shorten this to u and i before the oblique (and vocative) plural termination(s), with the latter also inserting the semivowel y.

^5 Many feminine Sanskrit loanwords such as bhāṣā ('language') and mātā ('mother') end in ā, therefore the ā is not a reliable indicator of noun gender.

The iyā ending is also not a reliable indicator of gender or noun type. Some words such as pahiyā ('wheel') and Persian takiyā ('pillow') are masculine type-I: pahiye ('wheels'), takiye ('pillows'). Feminine loanwords such as Arabic duniyā ('world') and Sanskrit kriyā ('action') use feminine type-II endings: duniyāẽ ('worlds'), kriyāẽ ('actions').

In Urdu, many Arabic words retain their Arabic plurals.

Perso-Arabic loans ending in final unpronounced h are handled as masculine marked nouns. Hence bacca(h) → baccā. The former is the Urdu spelling, the latter the Hindi.

Some Perso-Arabic loans may use their original dual and plural markings, e.g. vālid "father" → vālidain "parents".

Adjectives

Adjectives may be divided into declinable and indeclinable categories. Declinables are marked, through termination, for the gender, number, and case of the nouns they qualify. The set of declinable adjective terminations is similar to, but greatly simplified in comparison with, that of noun terminations.

Indeclinable adjectives are completely invariable, and can end in either consonants or vowels (including ā and ī). A number of declinables display nasalization of all terminations. Dir. masc. sg. (-ā) is the citation form.

Examples of declinable adjectives: baṛā "big", choṭā "small", moṭā "fat", acchā "good", burā "bad", kālā "black", ṭhaṇḍā "cold".

Examples of indeclinable adjectives: xarāb "bad", sāf "clean", bhārī "heavy", murdā "dead", sundar "beautiful", pāgal "crazy", lāl "red".
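The simplified adjective terminations can be illustrated with a short sketch. It assumes the standard three-way pattern (-ā for the masculine direct singular, -e for every other masculine form, -ī for all feminine forms), which is not spelled out in the text above. The function name `decline_adjective` and its parameters are hypothetical, and nasalized declinables are not handled.

```python
import unicodedata


def nfc(text: str) -> str:
    """Normalise to NFC so that macron vowels compare reliably."""
    return unicodedata.normalize("NFC", text)


def decline_adjective(adjective: str, gender: str, number: str, case: str,
                      declinable: bool = True) -> str:
    """Inflect an adjective given in its citation form (dir. masc. sg.).

    gender: "m" or "f"; number: "sg" or "pl"; case: "direct" or "oblique".
    `declinable` must be supplied by the caller, because indeclinables
    such as murdā can also end in -ā.
    """
    adjective = nfc(adjective)
    if not declinable or not adjective.endswith("ā"):
        return adjective  # indeclinables never change
    base = adjective[:-1]
    if gender == "f":
        return base + "ī"
    if number == "sg" and case == "direct":
        return adjective
    return base + "e"


print(decline_adjective("baṛā", "m", "pl", "direct"))    # masculine plural
print(decline_adjective("baṛā", "f", "sg", "oblique"))   # feminine
print(decline_adjective("lāl", "f", "pl", "direct"))     # indeclinable
```

The same three endings appear in the genitive postposition kā/ke/kī, which is described as declining in the manner of an adjective.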

All adjectives can be used either attributively, predicatively, or substantively. Substantively they are of course declined as nouns rather than adjectives.

sā (~ se ~ sī) is a suffix for adjectives, modifying or lightening their meaning; giving them an "-ish" or "quite" sense. e.g. nīlā "blue" → nīlā-sā "bluish". Its emphasis is rather ambiguous, sometimes enhancing, sometimes toning down, the sense of the adjective.

Comparatives and superlatives

Comparisons are made by using "than" (the postposition se; see below), "more" (aur, zyādā), and "less" (kam). The word for "more" is optional, while "less" is required, so that in the absence of either "more" will be inferred.

In the absence of an object of comparison, "more" is of course no longer optional.

Superlatives are made through comparisons with "all" (sab). Comparisons using "least" are rare; it is more common to use an antonym.

In Sanskritized and Persianized registers of Hindustani, comparative and superlative adjectival forms using suffixes derived from those languages can be found.

Numerals

The numeral systems of several of the Indo-Aryan languages, including Hindustani and Nepali, are typical decimal systems, but contracted to the extent that nearly every number 1–99 is irregular. The first four ordinal numbers are also irregular. The suffix -vā̃ marks ordinals beginning at the number five.

Postpositions

The aforementioned inflectional case system only goes so far on its own; it serves as the base upon which is built a system of agglutinative suffixes or particles known as postpositions, which parallel English prepositions. Their use with a noun or verb requires that noun or verb to take the oblique case (though the bare oblique also has a minor adverbial use), and it is with them that the locus of grammatical function or "case-marking" then lies. There are seven such one-word primary postpositions:

kā – genitive marker; variably declinable in the manner of an adjective. X kā/ke/kī Y has the sense "X's Y", with kā/ke/kī agreeing with Y.

ko – marks the indirect object (hence named "dative marker"), or, if definite, the direct object.

ne – ergative marker; applied to subjects of transitive perfective verbs.

se – ablative marker; has a very wide range of uses and meanings:

"from"; dillī se "from Delhi".

"from, of"; tumse ḍarnā "to be afraid of you".

"since"; itvār se "since Sunday".

"by, with"; instrumental marker.

"by, with, -ly"; adverbial marker.

"than"; for comparatives.

a minority of verbs use se rather than ko to mark their patients.

mẽ – "in".

par – "on".

tak – "until, up to".

Beyond these is a large range of compound postpositions, composed of the genitive primary postposition kā in the oblique form (ke, kī) plus an adverb.

kī taraf "towards", ke andar "inside", ke āge "in front of, ahead of", ke ūpar "on top of, above", ke nīce "beneath, below", ke pīche "behind", ke bād "after", ke bāre mẽ "about", ke bāhar "outside", ke liye "for", ke sāmne "facing, opposite", etc.

Pronouns

Hindustani has personal pronouns for the first and second persons, while for the third person demonstratives are used, which can be categorized deictically as proximate and non-proximate. Pronouns distinguish cases of direct, oblique, and dative. The last of these, often called a set of "contracted" forms, is in free variation with the oblique case plus dative postposition. Pronouns do not distinguish gender.

Also displayed in the table below are the genitive pronominal forms, to show that the 1st and 2nd person pronouns have their own distinctive forms of merā, hamārā, terā, tumhārā apart from the regular formula of OBL. + kā; as well as the ergative pronominal forms, to show that the postposition ne does not straightforwardly suffix the oblique bases: rather than *mujh ne and *tujh ne, direct bases are used, giving mãĩ ne and tū ne, and rather than *in ne and *un ne, the forms are inhõ ne and unhõ ne.
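The regular formula and its exceptions can be expressed as a lookup with a fallback. This is only a sketch: the function names `genitive` and `ergative` are hypothetical, pronouns are keyed by their oblique base, and the bases ham, tum, āp, is, and us are assumed from the standard paradigm rather than quoted from the text above. Postpositions are written as separate words, following the Urdu convention.

```python
import unicodedata


def nfc(text: str) -> str:
    """Normalise to NFC so that nasalized vowels compare reliably."""
    return unicodedata.normalize("NFC", text)


# Exceptions named in the text, keyed by oblique base.
IRREGULAR_GENITIVE = {
    nfc("mujh"): nfc("merā"),
    nfc("ham"): nfc("hamārā"),
    nfc("tujh"): nfc("terā"),
    nfc("tum"): nfc("tumhārā"),
}
IRREGULAR_ERGATIVE = {
    nfc("mujh"): nfc("mãĩ ne"),   # direct base instead of the oblique
    nfc("tujh"): nfc("tū ne"),    # direct base instead of the oblique
    nfc("in"): nfc("inhõ ne"),
    nfc("un"): nfc("unhõ ne"),
}


def genitive(oblique_base: str) -> str:
    """Citation (masc. sg.) genitive: an irregular form, else OBL. + kā."""
    base = nfc(oblique_base)
    return IRREGULAR_GENITIVE.get(base, base + " kā")


def ergative(oblique_base: str) -> str:
    """Ergative: an irregular form, else OBL. + ne."""
    base = nfc(oblique_base)
    return IRREGULAR_ERGATIVE.get(base, base + " ne")


for base in ["mujh", "tujh", "ham", "tum", "us", "un"]:
    print(base, "|", genitive(base), "|", ergative(base))
```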

tū, tum, and āp are the three second person pronouns ("you"), constituting a threefold scale of sociolinguistic formality: respectively "intimate", "familiar", and "polite". The "intimate" is grammatically singular while the "familiar" and "polite" are grammatically plural. When being referred to in the third person however, only those of the "polite" level of formality are grammatically plural. The following table is adapted from Shapiro (2003:265).

Notes for pronouns:

Postpositions are treated as bound morphemes after pronouns in Hindi, but as separate words in Urdu.

The varying forms for the 3rd pn. dir. constitute one of the small number of grammatical differences between Hindi and Urdu. yah "this" / ye "these" / vah "that" / ve "those" is the literary set for Hindi while ye "this, these" / vo "that, those" is the set for Urdu and spoken (and also often written) Hindi.

The above section on postpositions noted that ko (the dative case) marks direct objects if definite. As "the most specific thing of all is an individual", persons (or their pronouns) nearly always take the dative case or postposition.

It is very common practice to use plural pronouns (and their accompanying conjugation) in polite situations, thus tum can be used in the second person when referring to one person. Similarly, some speakers prefer plural ham over singular mãĩ. This is not quite the same as the "royal we"; it is rather colloquial.

koī and kuch are indefinite pronouns/quantifiers. As pronouns koī is used for animates ("someone") and kuch for inanimates ("something"). As quantifiers/adjectives koī is used for singular count nouns and kuch for mass nouns and plural count nouns. koī takes the form kisī in the oblique. The form kaī "several" is partially a plural equivalent to koī. kuch can also act as an adverb, qualifying an adjective, meaning "rather". koī preceding a number takes the meaning of "about, approximately". In this usage it does not take the oblique form kisī.

apnā is a (genitive) reflexive pronoun: "my/your/etc. (own)". Using non-reflexive and reflexive together gives emphasis; e.g. merā apnā "my (very) own". xud, āp, and svayam are some (direct; non-genitive) others: "my/your/etc.-self". Bases for oblique usage are usually apne or apne āp. The latter alone can also mean "of one's own accord"; āpas mẽ means "among/between themselves".

Adverbs

Hindustani has few underived adverbs. Adverbs may be derived in ways such as the following:

Simply obliquing some nouns and adjectives: nīcā "low" → nīce "down", sīdhā "straight" → sīdhe "straight", dhīrā "slow" → dhīre "slowly", sawerā "morning" → sawere "in the morning", ye taraf "this direction" → is taraf "in this direction", kalkattā "Calcutta" → kalkatte "to Calcutta".

Nouns using a postposition such as se "by, with, -ly": zor "force" → zor se "forcefully" (lit. "with force"), dhyān "attention" → dhyān se "attentively" (lit. "with attention").

Adjectives using postpositional phrases involving "way, manner": acchā "good" → acchī tarah se "well" (lit. "by/in a good way"), xās "special" → xās taur par "especially" (lit. "on a special way").

Verbs in conjunctive form: hãs "laugh" → hãs kar "laughingly" (lit. "having laughed"), meherbānī kar "do kindness" → meherbānī kar ke "kindly, please" (lit. "having done kindness").

Formative suffixes from Sanskrit or Perso-Arabic in higher registers of Hindi or Urdu. Skt. sambhava "possible" + -taḥ → sambhavataḥ "possibly"; Ar. ittifāq "chance" + -an → ittifāqan "by chance".

Verbs

Overview

The Hindustani verbal system is largely structured around a combination of aspect and tense/mood. Like the nominal system, the Hindustani verb involves successive layers of (inflectional) elements to the right of the lexical base.

Hindustani has three aspects: perfective, habitual, and continuous, each having overt morphological correlates. These are participle forms, inflecting for gender and number by way of a vowel termination, like adjectives. The perfective, though displaying a "number of irregularities and morphophonemic adjustments", is the simplest, being just the verb stem followed by the agreement vowel. The habitual is formed from the imperfective participle: the verb stem, plus -t-, plus the agreement vowel. The continuous is formed periphrastically through compounding (see below) with the perfective of rahnā "to stay".

Derived from honā "to be" are five copula forms: present, past, subjunctive, presumptive, contrafactual (aka "past conditional"). Used both in basic predicative/existential sentences and as verbal auxiliaries to aspectual forms, these constitute the basis of tense and mood.

Non-aspectual forms include the infinitive, the imperative, and the conjunctive. Mentioned morphological conditions such as the subjunctive, "presumptive", etc. are applicable to both copula roots for auxiliary usage with aspectual forms and to non-copula roots directly for often unspecified (non-aspectual) finite forms.

Finite verbal agreement is with the nominative subject, except in the transitive perfective, where it is with the direct object, with the erstwhile subject taking the ergative construction -ne (see postpositions above). The perfective aspect thus displays split ergativity.

Tabled below on the left are the paradigms for adjectival concord (^A), here only slightly different from that introduced previously: the f. pl. can nasalize under certain conditions. To the right are the paradigms for personal concord (^P), used by the subjunctive.

Forms

The sample verb is intransitive dauṛnā "to run", and the sample inflection is 3rd. masc. sg. (^P = e, ^A = ā) where applicable.
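As a rough illustration of how these forms are built from the stem, the sketch below assembles the 3rd masc. sg. forms for a regular consonant-final stem such as dauṛ-. The function name `basic_forms` and the dictionary keys are hypothetical, and the sketch ignores the irregular verbs and the euphonic glide discussed in the notes. The future is written as one word, following the Hindi convention.

```python
import unicodedata


def nfc(text: str) -> str:
    """Normalise to NFC so that dotted and macron letters compare reliably."""
    return unicodedata.normalize("NFC", text)


def basic_forms(stem: str) -> dict:
    """3rd masc. sg. forms of a regular, consonant-final verb stem.

    Uses the agreement vowels given for the sample inflection:
    adjectival concord = ā, personal concord = e.
    """
    stem = nfc(stem)
    subjunctive = stem + "e"                 # stem + personal concord
    return {
        "infinitive": stem + "nā",
        "conjunctive": stem + " kar",
        "perfective": stem + "ā",            # stem + agreement vowel
        "habitual": stem + "tā",             # stem + -t- + agreement vowel
        "continuous": stem + " rahā",        # stem + perfective of rahnā
        "subjunctive": subjunctive,
        "future": subjunctive + "gā",        # subjunctive + gā
    }


for name, form in basic_forms("dauṛ").items():
    print(f"{name:12} {form}")
```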

Notes

Much of the above chart information derives from Masica (1991:292–294, 323–325).

The future tense is formed by adding the suffix gā (~ ge ~ gī) to the subjunctive. The suffix is a contraction of gaā (= gayā, perfective participle of jānā "to go"). The future suffix, conjunctive participle, and suffix vālā are treated as bound morphemes in written Hindi, but as separate words in written Urdu.

^ The present copula (h-?) seems not to follow along the lines of the regular ^P system of terminations; while the subjunctive copula (ho-^P) is thoroughly irregular. So here are all of their forms.

For the 1. subj. sg. copula Schmidt (2003:324) and Snell & Weightman (1989:113, 125) list hū̃ while Shapiro (2003:267) lists hoū̃.

Shapiro (2003:268) lists the polite imperative ending as -iye, while Schmidt (2003:330) lists it as -ie but -iye after ā, o, ū.

The euphonic glide y is inserted in perfective participles between prohibited vowel clusters. It is historically the remnant of the old perfective marker. The clusters are a + ā, ā + ā, o + ā, and ī + ā, resulting in ayā, āyā, oyā, iyā respectively, e.g. khāyā/khāye/khāī/khāī̃ (khā- "eat").

In addition, the combinations ī + ī and i + ī give ī. e.g. piyā/piye/pī/pī̃ (pī- "drink").
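The glide and contraction rules can be sketched as a small function. The name `perfective` is hypothetical. One assumption goes beyond the stated rule: the glide is also inserted before the ending e, which is inferred from the cited forms khāye and piye and is a spelling convention that varies in practice. Irregular perfectives are not covered.

```python
import unicodedata


def nfc(text: str) -> str:
    """Normalise to NFC so that nasalized endings compare reliably."""
    return unicodedata.normalize("NFC", text)


GLIDE_TRIGGERS = ("a", "ā", "o", "ī")  # stem-final vowels that call for y


def perfective(stem: str, ending: str) -> str:
    """Join a verb stem and a perfective agreement ending (ā, e, ī, ī̃)."""
    stem, ending = nfc(stem), nfc(ending)
    if ending in ("ā", "e") and stem.endswith(GLIDE_TRIGGERS):
        if stem.endswith("ī"):
            stem = stem[:-1] + "i"      # ī + ā gives iyā
        return stem + "y" + ending
    if ending.startswith("ī") and stem.endswith(("i", "ī")):
        return stem[:-1] + ending       # ī + ī and i + ī give ī
    return stem + ending


for stem in ["khā", "pī", "dauṛ"]:
    print(stem, [perfective(stem, e) for e in ["ā", "e", "ī", "ī̃"]])
```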

As stated, agreement in the transitive perfective is with the direct object, with the erstwhile subject taking the ergative postposition ne. If however the direct object takes the postposition ko (marking definiteness), or if no direct object is expressed, then agreement neutralizes to default m. sg. -ā.

In this regard, there are a small number of verbs that, while perhaps logically transitive, still do not take ne and continue to agree with the subject in the perfective, e.g. lānā "to bring", bhūlnā "to forget", milnā "to meet", etc.

Besides supplying the copulas, honā "to be" can be used aspectually: huā "happened, became"; hotā "happens, becomes, is"; ho rahā "happening, being".

-ke can be used as a colloquial alternative to -kar for the conjunctive participle of any verb. But for karnā it is the only possible form; karke, not *karkar.

Hindustani displays a very small number of irregular forms, spelled out in the cells below.

^ However, it is jā- that is used as the perfective stem in the rare instance of an intransitive verb like jānā being expressed passively, such as in a passivized imperative/subjunctive construction: ghar jāyā jāe? "Shall [we] go home?" (lit. "Shall home be gone to [by us]?").

Causatives

Transitives or causatives are morphologically contrastive in Hindustani, leading to the existence of related verb sets divisible along such lines. While the derivation of such forms shows patterns, they do reach a level of variegation so as to make it somewhat difficult to outline all-encompassing rules. Furthermore, some sets may have as many as four to five distinct members; also, the meaning of certain members of given sets may be idiosyncratic.

Starting from intransitive or transitive verb stems further transitive/causative stems are produced according to these assorted rules —

1a. Root vowel change: a → ā, u/ū → o, i/ī → e. Sometimes accompanied by root final consonant change: k → c, ṭ → ṛ, l → Ø.
1b. Suffixation of -ā. Often accompanied by:
  Root vowel change: ū/o → u, e/ai/ā/ī → i.
  Insertion of l between such vowel-terminating stems and the suffix.
2. Suffixation of -vā (in place of -ā if and where it'd occur) for a "causative".

The majority of the following are sets culled from Shapiro (2003:270) and Snell & Weightman (1989:243–244). The lack of third members displayed for the ghūmnā to dhulnā sets does not imply that they do not exist but that they were simply not listed in the source literature (Snell & Weightman 1989:243).

girnā "to fall", girānā "to fell", girvānā "to cause to be felled".

bannā "to become", banānā "to make", banvānā "to cause to be made".

khulnā "to open (intr.)", kholnā "to open (tr.)", khulvānā "to cause to be opened".

sīkhnā "to learn", sikhānā "to teach", sikhvānā "to cause to be taught".

khānā "to eat", khilānā "to feed", khilvānā "to cause to be fed".

biknā "to sell (intr.)", becnā "to sell (tr.)", bikvānā "to cause to be sold".

dikhnā/dīkhnā "to seem", dekhnā "to see", dikhānā "to show", dikhvānā "to cause to be shown".

kahnā "to say", kahlānā "to be called".

ghūmnā "to go round", ghumānā "to make go round".

leṭnā "to lie down", liṭānā "to lay down".

baiṭhnā "to sit", biṭhānā "to seat".

sonā "to sleep", sulānā "to make sleep".

dhulnā "to wash (intr.)", dhonā "to wash (tr.)".

ṭūṭnā "to break (intr.)", toṛnā "to break (tr.)", tuṛānā "to cause to be broken".
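Rules 1b and 2 are regular enough to sketch in code. The function `derive` below is a hypothetical helper that applies the vowel shortening and l-insertion unconditionally, whereas the rules say only that they often accompany the suffix. It does not attempt rule 1a (e.g. khul- → khol-) or idiosyncratic sets, and should be read as an illustration rather than a reliable generator.

```python
import unicodedata


def nfc(text: str) -> str:
    """Normalise to NFC so that dotted and macron letters compare reliably."""
    return unicodedata.normalize("NFC", text)


# Root vowel change from rule 1b; "ai" is listed first so that the
# digraph is matched before any single vowel.
SHORTENINGS = (("ai", "i"), ("ū", "u"), ("o", "u"),
               ("e", "i"), ("ā", "i"), ("ī", "i"))
VOWELS = "aāiīuūeo"


def derive(stem: str, suffix: str = "ā") -> str:
    """Return the infinitive of the -ā (rule 1b) or -vā (rule 2) derivative."""
    if suffix not in ("ā", "vā"):
        raise ValueError("suffix must be 'ā' or 'vā'")
    stem = nfc(stem)
    ends_in_vowel = stem[-1] in VOWELS
    for long_vowel, short_vowel in SHORTENINGS:
        if long_vowel in stem:
            stem = stem.replace(long_vowel, short_vowel, 1)
            break
    if ends_in_vowel:
        stem += "l"  # l is inserted after a vowel-final stem
    return stem + suffix + "nā"


for stem in ["gir", "ban", "sīkh", "khā", "baiṭh", "so"]:
    print(stem, derive(stem), derive(stem, "vā"))
```

Run against the sets listed above, this reproduces the pairs girānā/girvānā, banānā/banvānā, sikhānā/sikhvānā, and khilānā/khilvānā, as well as ghumānā, liṭānā, biṭhānā, and sulānā.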

In the causative model of "to cause to be Xed", the agent takes the postposition se. Thus Y se Z banvānā "to cause Z to be made by Y" = "to cause Y to make Z" = "to have Z made by Y" = "to have Y make Z", etc.

Compounds

Compound verbs, a highly visible feature of Hindi–Urdu grammar, consist of a verbal stem plus an auxiliary verb. The auxiliary (variously called "subsidiary", "explicator verb", and "vector") loses its own independent meaning and instead "lends a certain shade of meaning" to the main or stem verb, which "comprises the lexical core of the compound". While most any verb can act as a main verb, there is a limited set of productive auxiliaries. Shown below are prominent such auxiliaries, with their independent meaning first outlined, followed by their semantic contribution as auxiliaries.

jānā "to go"; gives a sense of completeness, finality, or change of state. e.g. ānā "to come" → ā jānā "to come, arrive"; khānā "to eat" → khā jānā "to eat up"; pīnā "to drink" → pī jānā "to drink up"; baiṭhnā "to sit" → baiṭh jānā "to sit down"; samajhnā "to understand" → samajh jānā "to realise"; sonā "to sleep" → so jānā "to go to sleep"; honā "to be" → ho jānā "to become".

lenā "to take"; suggests that the benefit of the action flows towards the doer. e.g. paṛh lenā "to read (to/for oneself)".

denā "to give"; suggests that the benefit of the action flows away from the doer. e.g. paṛh denā "to read (out)".

The above three are the most common of auxiliaries, and the "least marked", or "lexically nearly colourless". The nuance conveyed by an auxiliary can often be very subtle, and need not always be expressed with different words in English translation. lenā and denā, transitive verbs, occur with transitives, while intransitive jānā occurs mostly with intransitives; a compound of a transitive and jānā will be grammatically intransitive as jānā is.

ḍālnā "to throw, pour"; indicates an action done vigorously, decisively, violently or recklessly; it is an intensifier, showing intensity, urgency, completeness, or violence. e.g. mārnā "to strike" → mār ḍālnā "to kill", pīnā "to drink" → pī ḍālnā "to drink down".

baiṭhnā "to sit"; implies an action done foolishly or stubbornly; shows speaker disapproval or an impulsive or involuntary action. kahnā "to say" → kah baiṭhnā "to blurt out", karnā "to do" → kar baiṭhnā "to do (as a blunder)", laṛnā "to fight" → laṛ baiṭhnā "to quarrel (foolishly)".

paṛnā "to fall"; connotes involuntary, sudden, or unavoidable occurrence; adds a sense of suddenness or change of state, with its independent/literal meaning sometimes showing through in a sense of downward movement.

uṭhnā "to rise"; functions like an intensifier; suggests inception of action or feeling, with its independent/literal meaning sometimes showing through in a sense of upward movement. e.g. jalnā "to burn" → jal uṭhnā "to burst into flames", nācnā "to dance" → nāc uṭhnā "to break into dance".

rakhnā "to keep, maintain"; implies a firmness of action, or one with possibly long-lasting results or implications; occurs with lenā and denā, meaning "to give/take (as a loan)", and with other appropriate verbs, showing an action performed beforehand.

The continuous aspect marker rahā apparently originated as a compound verb with rahnā ("remain"): thus mãĩ bol rahā hū̃ = "I have remained speaking" → "I have continued speaking" → "I am speaking". However it has lost the ability to take any form other than the perfective, and is thus considered to have become grammaticalized.

Finally, having to do with the manner of an occurrence, compound verbs are mostly used with completed actions and imperatives, and much less with negatives, conjunctives, and contexts continuous or speculative. This is because non-occurrences cannot be described to have occurred in a particular manner.

Conjuncts

Another notable aspect of Hindi–Urdu grammar is that of "conjunct verbs", composed of a noun or adjective paired up with a general verbalizer, most commonly transitive karnā "to do" or intransitive honā "to be(come)", functioning in the place of what in English would be single unified verb.

In the case of an adjective as the non-verbal element, it often helps to think of karnā "to do" as supplementally having the senses of "to cause to be", "to make", "to render", etc.

In the case of a noun as the non-verbal element, it is treated syntactically as the verb's (direct) object (never taking the ko marker; governing agreement in perfective and infinitival constructions), and the semantic patient (or agent: see gālī khānā below) of the conjunct verbal expression is often expressed/marked syntactically as a genitive adjunct (-kā ~ ke ~ kī) of the noun.

With elements borrowed from English, it is the English verb stems themselves that are used.

Passive

The passive construction is periphrastic. It is formed from the perfective participle by addition of the auxiliary jānā "to go"; i.e. likhnā "to write" → likhā jānā "to be written". The agent is marked by the postposition se. Furthermore, both intransitive and transitive verbs may be grammatically passivized to show physical/psychological incapacity, usually in negative sentences. Lastly, intransitives often have a passive sense, or convey unintentional action.

Syntax

With regard to word order, Hindustani is an SOV language. In terms of branching, it is neither purely left- nor right-branching, and phenomena of both types can be found. The order of constituents in sentences as a whole lacks governing "hard and fast rules", and frequent deviations from normative word position can be found. The normative positions can nonetheless be described in terms of a small number of rules, which account for facts beyond the pale of the label "SOV":

Indirect objects precede direct objects.
Attributive adjectives precede the noun they qualify.
Adverbs precede the adjectives they qualify.
Negative markers (nahī̃, na, mat) and interrogatives precede the verb.
Interrogatives precede negative markers if both are present.
kyā ("what?") as the yes-no question marker occurs at the beginning of a clause.

Possession

Possession, which many other languages express via the verb "to have", is expressed in Hindustani by the genitive kā (inflected appropriately) or the postposition ke pās ("near"), together with the verb honā. Possible objects of possession (nouns) fall into two main categories in Hindustani: one for persons such as family members, or body parts, and the other for most inanimate objects, animals, most abstract ideas, and some persons such as servants.

For indicating possession with objects of the first category, kā appears after the subject of the possession, followed by the object. With personal pronouns, this requires the use of the possessive pronoun (inflected appropriately). Examples: Merī mātā hai ("I have a mother"), Śiv kī tīn ā̃khẽ haĩ ("Shiva has three eyes").

For indicating possession with objects of the second category, the compound postposition ke pās ("near") is used. For example: Mohan ke pās ek bakarī hai ("Mohan has a goat", but also "There is a goat near Mohan").

References

Hindustani grammar Wikipedia

(Text) CC BY-SA
