Code-switching

Updated on Apr 25, 2026

In linguistics, code-switching occurs when a speaker alternates between two or more languages, or language varieties, in the context of a single conversation. Multilinguals, speakers of more than one language, sometimes use elements of multiple languages when conversing with each other. Thus, code-switching is the use of more than one linguistic variety in a manner consistent with the syntax and phonology of each variety.

Code-switching is distinct from other language contact phenomena, such as borrowing, pidgins and creoles, loan translation (calques), and language transfer (language interference). Borrowing affects the lexicon, the words that make up a language, while code-switching takes place in individual utterances. Speakers form and establish a pidgin language when two or more speakers who do not speak a common language form an intermediate, third language. On the other hand, speakers practice code-switching when they are each fluent in both languages. Code-mixing is a thematically related term, but the usage of the terms code-switching and code-mixing varies. Some scholars use either term to denote the same practice, while others apply code-mixing to denote the formal linguistic properties of language-contact phenomena and code-switching to denote the actual, spoken usages by multilingual persons.

In the 1940s and the 1950s, many scholars considered code-switching to be a substandard use of language. Since the 1980s, however, most scholars have come to regard it as a normal, natural product of bilingual and multilingual language use.

The term "code-switching" is also used outside the field of linguistics. Some scholars of literature use the term to describe literary styles that include elements from more than one language, as in novels by Chinese-American, Anglo-Indian, or Latino writers. In popular usage, code-switching is sometimes used to refer to relatively stable informal mixtures of two languages, such as Spanglish, Taglish, or Hinglish. Both in popular usage and in sociolinguistic study, the name code-switching is sometimes used to refer to switching among dialects, styles or registers. This form of switching is practiced, for example, by speakers of African American Vernacular English as they move from less formal to more formal settings. Such shifts, when performed by public figures such as politicians, are sometimes criticized as signalling inauthenticity or insincerity.

Social motivations

Code-switching relates to, and sometimes indexes, social-group membership in bilingual and multilingual communities. Some sociolinguists describe the relationships between code-switching behaviours and class, ethnicity, and other social positions. In addition, scholars in interactional linguistics and conversation analysis have studied code-switching as a means of structuring speech in interaction. Some discourse analysts, including conversation analyst Peter Auer, suggest that code-switching does not simply reflect social situations, but that it is a means to create social situations.

Markedness model

The Markedness Model, developed by Carol Myers-Scotton, is one of the more complete theories of code-switching motivations. It posits that language users are rational and choose to speak a language that clearly marks their rights and obligations, relative to other speakers, in the conversation and its setting. When there is no clear, unmarked language choice, speakers practice code-switching to explore possible language choices. Many sociolinguists, however, object to the Markedness Model’s postulation that language choice is entirely rational.

Sequential analysis

Scholars of conversation analysis such as Peter Auer and Li Wei argue that the social motivation behind code-switching lies in the way code-switching is structured and managed in conversational interaction; in other words, the question of why code-switching occurs cannot be answered without first addressing the question of how it occurs. Using conversation analysis (CA), these scholars focus their attention on the sequential implications of code-switching. That is, whatever language a speaker chooses to use for a conversational turn, or part of a turn, impacts the subsequent choices of language by the speaker as well as the hearer. Rather than focusing on the social values inherent in the languages the speaker chooses ("brought-along meaning"), the analysis concentrates on the meaning that the act of code-switching itself creates ("brought-about meaning").

Communication accommodation theory

The communication accommodation theory (CAT), developed by Howard Giles, professor of communication at the University of California, Santa Barbara, seeks to explain the cognitive reasons for code-switching, and other changes in speech, as a person either emphasizes or minimizes the social differences between himself and the other person(s) in conversation. Giles posits that when speakers seek approval in a social situation they are likely to converge their speech with that of the other speaker. This can include, but is not limited to, the language of choice, accent, dialect, and para-linguistic features used in the conversation. In contrast to convergence, speakers might also engage in divergent speech, in which an individual person emphasizes the social distance between himself and other speakers by using speech with linguistic features characteristic of his own group.

Diglossia

In a diglossic situation, some topics and situations are better suited to the use of one language over another. Joshua Fishman proposes a domain-specific code-switching model (later refined by Blom and Gumperz) wherein bilingual speakers choose which code to speak depending on where they are and what they are discussing. For example, a child who is a bilingual Spanish-English speaker might speak Spanish at home and English in class, but Spanish at recess.

Types of switching

Scholars use different names for various types of code-switching.

Intersentential switching occurs outside the sentence or the clause level (i.e. at sentence or clause boundaries). It is sometimes called "extrasentential" switching. In Assyrian-English switching one could say, "Ani wideili. What happened?" ("Those, I did them. What happened?").

Intra-sentential switching occurs within a sentence or a clause. In Spanish-English switching one could say, "La onda is to fight y jambar." ("The in-thing is to fight and steal.")

Tag-switching is the switching of either a tag phrase or a word, or both, from one language to another (common in intra-sentential switches). In Spanish-English switching one could say, "Él es de México y así los criaron a ellos, you know." ("He's from Mexico, and they raise them like that, you know.")

Intra-word switching occurs within a word itself, such as at a morpheme boundary. In Shona-English switching one could say, "But ma-day-s a-no a-ya ha-ndi-si ku-mu-on-a." ("But these days I don't see him much.") Here the English plural morpheme -s appears alongside the Shona prefix ma-, which also marks plurality.

Most code-switching studies primarily focus on intra-sentential switching, as it creates many hybrid grammar structures that require explanation. The other types involve utterances that simply follow the grammar of one language or the other. Intra-sentential switching can be alternational or insertional. In alternational code-switching, a new grammar emerges that is a combination of the grammars of the two languages involved. Insertional code-switching involves "the insertion of elements from one language into the morphosyntactic frame of the other."
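The difference between intersentential and intra-sentential switching can be illustrated with a minimal Python sketch. It assumes that each word has already been labelled by hand with a language (the labels below are illustrative assumptions, not part of the cited examples), and it only separates the two broad cases; tag-switching and intra-word switching would need finer-grained annotation than this toy uses.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (word, hand-assigned language label)


def classify_switch(sentences: List[List[Token]]) -> str:
    """Return a coarse label for the kind of switching in an utterance."""
    # Intra-sentential: at least one sentence contains more than one language.
    if any(len({lang for _, lang in sentence}) > 1 for sentence in sentences):
        return "intra-sentential"
    # Intersentential: every sentence is monolingual, but the languages
    # differ from one sentence to another.
    languages = {lang for sentence in sentences for _, lang in sentence}
    if len(languages) > 1:
        return "intersentential"
    return "no switch"


# "Ani wideili. What happened?" (language labels are assumptions)
assyrian_english = [
    [("Ani", "Assyrian"), ("wideili", "Assyrian")],
    [("What", "English"), ("happened", "English")],
]

# "La onda is to fight y jambar." (language labels are assumptions)
spanish_english = [
    [("La", "Spanish"), ("onda", "Spanish"), ("is", "English"),
     ("to", "English"), ("fight", "English"), ("y", "Spanish"),
     ("jambar", "Spanish")],
]

print(classify_switch(assyrian_english))  # intersentential
print(classify_switch(spanish_english))   # intra-sentential
```

Because the sketch looks only at language labels, a tag switch such as a sentence-final "you know" would also be reported as intra-sentential.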

Grammatical theories

In studying the syntactic and morphological patterns of language alternation, linguists have postulated specific grammatical rules and specific syntactic boundaries for where code-switching might occur.

Poplack's model

Shana Poplack's model of code-switching is the best-known theory of the underlying grammar of code-switching. In this model, code-switching is subject to two constraints. The free-morpheme constraint stipulates that code-switching cannot occur between a lexical stem and bound morphemes. Essentially, this constraint distinguishes code-switching from borrowing. Generally, borrowing occurs in the lexicon, while code-switching occurs at either the syntax level or the utterance-construction level. The equivalence constraint predicts that switches occur only at points where the surface structures of the languages coincide, or between sentence elements that are normally ordered in the same way by each individual grammar. For example, the sentence "I like you porque eres simpático" ("I like you because you are nice") is allowed because it obeys the syntactic rules of both Spanish and English. Noun phrases such as "the casa white" and "the blanca house" are ruled out because the combinations are ungrammatical in at least one of the languages involved. Spanish noun phrases are made up of determiners, then nouns, then adjectives, while adjectives come before nouns in English noun phrases. "The casa white" is ruled out by the equivalence constraint because it does not obey the syntactic rules of English, and "the blanca house" is ruled out because it does not follow the syntactic rules of Spanish.
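The noun-phrase reasoning above can be sketched in Python as a toy check. This is an illustration under strong simplifying assumptions, not Poplack's formalization: each grammar is reduced to a hypothetical set of permitted adjacent category pairs inside a noun phrase, and a switch point passes only if the pair of categories on either side of it is permitted in both languages.

```python
from typing import Dict, List, Set, Tuple

Word = Tuple[str, str, str]  # (form, language, category)

# Hypothetical, heavily simplified noun-phrase adjacency rules:
# English orders determiner, adjective, noun; Spanish orders
# determiner, noun, adjective.
ALLOWED: Dict[str, Set[Tuple[str, str]]] = {
    "English": {("Det", "Adj"), ("Adj", "N"), ("Det", "N")},
    "Spanish": {("Det", "N"), ("N", "Adj")},
}


def equivalence_ok(phrase: List[Word]) -> bool:
    """Return True if every switch point is licensed by both toy grammars."""
    for (_, lang_a, cat_a), (_, lang_b, cat_b) in zip(phrase, phrase[1:]):
        if lang_a == lang_b:
            continue  # not a switch point
        pair = (cat_a, cat_b)
        if not all(pair in rules for rules in ALLOWED.values()):
            return False
    return True


the_casa_white = [("the", "English", "Det"), ("casa", "Spanish", "N"),
                  ("white", "English", "Adj")]
the_blanca_house = [("the", "English", "Det"), ("blanca", "Spanish", "Adj"),
                    ("house", "English", "N")]
the_casa = [("the", "English", "Det"), ("casa", "Spanish", "N")]

print(equivalence_ok(the_casa_white))    # False: noun-adjective order is not English
print(equivalence_ok(the_blanca_house))  # False: determiner-adjective order is not Spanish
print(equivalence_ok(the_casa))          # True: determiner-noun order is shared
```

The sketch covers only adjacent categories in noun phrases; a fuller treatment would compare complete phrase structures of the two languages.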

Critics cite weaknesses of Sankoff and Poplack's model. The free-morpheme and equivalence constraints are insufficiently restrictive, leaving numerous exceptions. For example, the free-morpheme constraint does not account for why switching is impossible between certain free morphemes. The sentence "The students had visto la película italiana" ("The students had seen the Italian movie") does not occur in Spanish-English code-switching, yet the free-morpheme constraint would seem to posit that it can. The equivalence constraint would also rule out switches that occur commonly in languages, as when Hindi postpositional phrases are switched with English prepositional phrases, as in the sentence "John gave a book ek larakii ko" ("John gave a book to a girl"). The phrase "ek larakii ko" is literally translated as "a girl to", making it ungrammatical in English, and yet this is a sentence that occurs in English-Hindi code-switching despite the requirements of the equivalence constraint. The Sankoff and Poplack model only identifies points at which switching is blocked, as opposed to explaining which constituents can be switched and why.

Matrix language-frame model

Carol Myers-Scotton's Matrix Language-Frame (MLF) model is the dominant model of insertional code-switching. The MLF model posits that there is a Matrix Language (ML) and an Embedded Language (EL). In this case, elements of the Embedded Language are inserted into the morphosyntactic frame of the Matrix Language. The hypotheses are as follows (Myers-Scotton 1993b: 7):

The Matrix Language Hypothesis states that those grammatical procedures in the central structure in the language production system which account for the surface structure of the Matrix Language + Embedded Language constituent are only Matrix Language–based procedures. Further, the hypothesis is intended to imply that frame-building precedes content morpheme insertion. A Matrix Language can be the first language of the speaker or the language in which the morphemes or words are more frequently used in speech, so the dominant language is the Matrix Language and the other is the Embedded Language. A Matrix Language island is a constituent composed entirely of Matrix Language morphemes.

According to the Blocking Hypothesis, in Matrix Language + Embedded Language constituents, a blocking filter blocks any Embedded Language content morpheme which is not congruent with the Matrix Language with respect to three levels of abstraction regarding subcategorization. "Congruence" is used in the sense that two entities, linguistic categories in this case, are congruent if they correspond in respect of relevant qualities.

The three levels of abstraction are:

Even if the Embedded Language realizes a given grammatical category as a content morpheme, if it is realized as a system morpheme in the Matrix Language, the Matrix Language blocks the occurrence of the Embedded Language content morpheme. (Content morphemes are often called "open-class" morphemes because they belong to categories that are open to the invention of arbitrary new items. They can be made-up words like "smurf", "nuke", "byte", etc. and can be nouns, verbs, adjectives, and some prepositions. A system morpheme, e.g. a function word or an inflection, expresses the relation between content morphemes and does not assign or receive thematic roles.)

The Matrix Language also blocks an Embedded Language content morpheme in these constituents if it is not congruent with a Matrix Language content morpheme counterpart in terms of theta role assignment.

Congruence between Embedded Language content morphemes and Matrix Language content morphemes is realized in terms of their discourse or pragmatic functions.

Examples

1. Hindi/English
life ko face kiijiye with himmat and faith in apane aap. (Code-switching)
"Face life with courage and faith in self." (Translation)
2. Swahili/English
hata wengine nasikia washawekwa cell. (Code-switching)
"Even others I heard were put [in] cells." (Translation)

We see that example 1 is consistent with the Blocking Hypothesis and the system/content morpheme criteria, so the prediction is that the Hindi equivalents are also content morphemes. Sometimes non-congruence between counterparts in the Matrix Language and Embedded Language can be circumvented by accessing bare forms. In example 2, "cell" is a bare form, so the thematic role of "cell" is assigned by the verb -wek- 'put in/on'; this means that the verb is a content morpheme.

The Embedded Language Island Trigger Hypothesis states that when an Embedded Language morpheme appears which is not permitted under either the Matrix Language Hypothesis or Blocking Hypothesis, it triggers the inhibition of all Matrix Language accessing procedures and completes the current constituent as an Embedded Language island. Embedded Language islands consist only of Embedded Language morphemes and are well-formed by Embedded Language grammar, but they are inserted in the Matrix Language frame. Therefore, Embedded Language islands are under the constraint of Matrix Language grammar.

Examples

1. Swahili/English
*Sikuona your barua ambayo uliipoteza. (Code-switching, ungrammatical)
"I didn't see your letter which you lost." (Translation)
2. Swahili/English
*Nikamwambia anipe ruhusa niende ni-ka-check for wewe. (Code-switching, ungrammatical)
"And I told him he should give me permission so that I go and check for you." (Translation)
Nikamwambia anipe ruhusa niende ni-ka-check for you. (Code-switching, grammatical)

Example 1 is ungrammatical (indicated by the leading asterisk) because "your" is accessed, so the Embedded Language Island Trigger Hypothesis predicts that it must be followed by an English head (e.g., "your letter") as an Embedded Language island. The reason is that possessive adjectives are system morphemes. We see the same thing happen in example 2, which is therefore ungrammatical. However, the correct way to finish the sentence is not "for wewe", switching back to Swahili; rather, it should end with "for you", which would be an Embedded Language island.
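The prediction described above can be expressed as a small Python sketch. It is a toy reading of the Embedded Language Island Trigger Hypothesis, not Myers-Scotton's own formalization: once an Embedded Language system morpheme appears in a constituent, every later morpheme in that constituent must also come from the Embedded Language. The `Morpheme` class and the `island_ok` function are hypothetical names, and treating "for" as a system morpheme is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Morpheme:
    form: str
    lang: str
    kind: str  # "system" or "content" (hand-assigned for this sketch)


def island_ok(constituent: List[Morpheme], matrix_lang: str) -> bool:
    """Check that an Embedded Language system morpheme is followed only
    by Embedded Language morphemes within the same constituent."""
    triggered = False
    for morpheme in constituent:
        if triggered and morpheme.lang == matrix_lang:
            return False  # constituent falls back to the Matrix Language
        if morpheme.lang != matrix_lang and morpheme.kind == "system":
            triggered = True  # the rest must form an Embedded Language island
    return True


for_wewe = [Morpheme("for", "English", "system"),
            Morpheme("wewe", "Swahili", "content")]
for_you = [Morpheme("for", "English", "system"),
           Morpheme("you", "English", "content")]
ni_ka_check = [Morpheme("ni", "Swahili", "system"),
               Morpheme("ka", "Swahili", "system"),
               Morpheme("check", "English", "content")]

print(island_ok(for_wewe, "Swahili"))     # False
print(island_ok(for_you, "Swahili"))      # True
print(island_ok(ni_ka_check, "Swahili"))  # True: a lone content morpheme triggers nothing
```

The sketch takes constituent boundaries and morpheme types as given; identifying them is the harder analytical step and is left to the annotator.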

The Embedded Language Implicational Hierarchy Hypothesis can be stated as two sub-hypotheses:

The farther a constituent is from the main arguments of the sentence, the freer it is to appear as an Embedded Language island.
The more formulaic in structure a constituent is, the more likely it is to appear as an Embedded Language island. Stated more strongly, choice of any part of an idiomatic expression will result in an Embedded Language island.

The Implicational Hierarchy of Embedded Language Islands:

Formulaic expressions and idioms (especially prepositional phrases expressing time and manner, but also as verb phrase complements)
Other time and manner expressions
Quantifier expressions
Non-quantifier, non-time noun phrases as verb phrase complements
Agent noun phrases
Theme role and case assigners, i.e. main finite verbs (with full inflections)
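The hierarchy can be represented as an ordered enumeration, as in the following Python sketch. It assumes that the list runs from the constituents most likely to appear as Embedded Language islands to those least likely, which is consistent with the two sub-hypotheses but is not stated explicitly in the list itself. The names `IslandRank` and `more_likely_island` are hypothetical.

```python
from enum import IntEnum


class IslandRank(IntEnum):
    """Lower value = assumed more likely to surface as an Embedded Language island."""
    FORMULAIC_OR_IDIOM = 1
    TIME_OR_MANNER = 2
    QUANTIFIER = 3
    OTHER_NP_COMPLEMENT = 4
    AGENT_NP = 5
    MAIN_FINITE_VERB = 6


def more_likely_island(a: IslandRank, b: IslandRank) -> IslandRank:
    """Return whichever constituent type ranks higher in the hierarchy."""
    return min(a, b)


winner = more_likely_island(IslandRank.AGENT_NP, IslandRank.TIME_OR_MANNER)
print(winner.name)  # TIME_OR_MANNER
```

An ordering like this only ranks constituent types against each other; it says nothing about how often islands of a given type occur in practice.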

Examples

1. Wolof/French
Le matin de bonne heure ngay joge Medina pour dem juilli. Suba tee nga fa war a joge. (Code-switching)
"Early in the morning you leave Medina to go to pray. Early in the morning you should leave then." (Translation)
2. Swahili/English
Ulikuwa ukiongea a lot of nonsense. (Code-switching)
"You were talking a lot of nonsense." (Translation)

Example 1 works because the French Embedded Language island "Le matin de bonne heure" ("early in the morning") is a time expression. (It is also repeated in Wolof in the second sentence.) In example 2, the quantifier "a lot of" begins a predicted Embedded Language island: the object complement of a finite verb begins with the quantifier.

Constraint-free

Jeff MacSwan has posited a constraint-free approach to analyzing code-switching. This approach views explicit reference to code-switching in grammatical analysis as tautological, and seeks to explain specific instances of grammaticality in terms of the unique contributions of the grammatical properties of the languages involved. MacSwan characterizes the approach with the refrain, "Nothing constrains code-switching apart from the requirements of the mixed grammars." The approach focuses on the repudiation of any rule or principle which explicitly refers to code-switching itself. This approach does not recognize or accept terms such as "matrix language", "embedded language", or "language frame", which are typical in constraint-based approaches such as the MLF Model.

Rather than posit constraints specific to language alternation, as in traditional work in the field, MacSwan advocates that mixed utterances be analyzed with a focus on the specific and unique linguistic contributions of each language found in a mixed utterance. Because these analyses draw on the full range of linguistic theory, and each data set presents its own unique challenges, a much broader understanding of linguistics is generally needed to understand and participate in this style of code-switching research.

For example, Cantone and MacSwan (2009) analyzed word order differences for nouns and adjectives in Italian-German code-switching using a typological theory of Cinque that had been independently proposed in the syntax literature; their account derives the word order facts of Italian-German code-switching from underlying differences between the two languages, according to Cinque's theory.

Controversies

Much remains to be done before a more complete understanding of code-switching phenomena is achieved. Linguists continue to debate apparent counter-examples to proposed code-switching theories and constraints.

The Closed-class Constraint, developed by Aravind Joshi, posits that closed class items (pronouns, prepositions, conjunctions, etc.) cannot be switched. The Functional Head Constraint developed by Belazi et al. holds that code-switching cannot occur between a functional head (a complementizer, a determiner, an inflection, etc.) and its complement (sentence, noun-phrase, verb-phrase). These constraints, among others like the Matrix Language-Frame model, are controversial among linguists positing alternative theories, as they are seen to claim universality and make general predictions based upon specific presumptions about the nature of syntax.

Myers-Scotton and MacSwan debated the relative merits of their approaches in a series of exchanges published in 2005 in Bilingualism: Language and Cognition, issues 8(1) and 8(2).

Examples

Spanish and English

Researcher Ana Celia Zentella offers this example from her work with Puerto Rican Spanish-English bilingual speakers in New York City. In this example, Marta and her younger sister, Lolita, speak Spanish and English with Zentella outside of their apartment building.

Lolita: Oh, I could stay with Ana?
Marta: — but you could ask papi and mami to see if you could come down.
Lolita: OK.
Marta: Ana, if I leave her here would you send her upstairs when you leave?
Zentella: I’ll tell you exactly when I have to leave, at ten o’clock. Y son las nueve y cuarto. ("And it’s nine fifteen.")
Marta: Lolita, te voy a dejar con Ana. ("I’m going to leave you with Ana.") Thank you, Ana.

Zentella explains that the children of the predominantly Puerto Rican neighbourhood speak both English and Spanish: "Within the children’s network, English predominated, but code-switching from English to Spanish occurred once every three minutes, on average."

French and Tamil

This example of switching from French to Tamil comes from ethnographer Sonia Das's work with immigrants from Jaffna, Sri Lanka, to Quebec.

Selvamani: Parce que n’importe quand quand j’enregistre ma voix ça l’aire d’un garçon. ([in French] "Because whenever I record my voice I sound like a guy.")
[laughter]
Selvamani: ennatā, ennatā, enna romba ciritā? ([in Tamil] "What, what, what's so funny?")

Selvamani, who moved from Sri Lanka to Quebec as a child and now identifies as Québécois, speaks to Das in French. When Selvamani's sister, Mala, laughs, Selvamani switches to Tamil to ask Mala why she is laughing. After this aside, Selvamani continues to speak in French. Selvamani also uses the word tsé ("you know", contraction of tu sais) and the expression je me ferrai pas poigné [sic] ("I will not be handled"), which are not standard French but are typical of the working-class Montreal dialect Joual.

Hopi and Tewa

Researcher Paul Kroskrity offers the following example of code-switching by three elder Arizona Tewa men, who are trilingual in Tewa, Hopi, and English. They are discussing the selection of a site for a new high school in the eastern Hopi Reservation:

Speaker A: Tututqaykit qanaanawakna. ([in Hopi] "Schools were not wanted.")
Speaker B: Wédít’ókánk’egena’adi imbí akhonidi. ([in Tewa] "They didn’t want a school on their land.")
Speaker C: Naembí eeyae nąeląemo díbít’ó’ámmí kąayį’į wédimu::di. ([in Tewa] "It’s better if our children go to school right here, rather than far away.")

In their two-hour conversation, the three men primarily speak Tewa; however, when Speaker A addresses the Hopi Reservation as a whole, he code-switches to Hopi. His speaking Hopi when talking of Hopi-related matters is a conversational norm in the Arizona Tewa speech community. Kroskrity reports that these Arizona Tewa men, who culturally identify themselves as Hopi and Tewa, use the different languages to linguistically construct and maintain their discrete ethnic identities.

References

Code-switching Wikipedia

(Text) CC BY-SA
