Bootstrapping is a term used in language acquisition in the field of linguistics. It refers to the idea that human beings are born innately equipped with a mental faculty that forms the basis of language, and that allows children to effortlessly acquire language. As a process, bootstrapping can be divided into different domains, according to whether it involves semantic bootstrapping, syntactic bootstrapping, prosodic bootstrapping, or pragmatic bootstrapping.
Contents
- Origin of the term bootstrapping
- Bootstrapping and connectionism
- Bootstrapping and innateness
- Semantic bootstrapping
- Acquiring the state/event contrast
- Acquiring the count/mass contrast
- Syntactic bootstrapping
- Acquiring verbs
- Acquiring nouns
- Acquiring adjectives
- Acquiring functional categories
- Evidence
- Prosodic bootstrapping
- Prosodic cues for syntactic structure
- Prosodic cues for clauses and phrases
- Criticism
- Pragmatic bootstrapping
- Gaze following
- Observing adult behavior
- References
Origin of the term "bootstrapping"
In literal terms, a bootstrap is the small strap on a boot that is used to help pull on the entire boot. Similarly, in computer science, booting refers to the startup of an operating system by means of first initiating a smaller program. Therefore, bootstrapping is a general term used to refer to the leveraging of a small action into a more powerful and significant operation.
Bootstrapping in linguistics was first introduced by Steven Pinker as a metaphor for the idea that children are innately equipped with mental processes that help initiate language acquisition. Bootstrapping attempts to identify the language learning processes that enable children to learn about the structure of the target language.
Bootstrapping and connectionism
Bootstrapping has a strong link to connectionist theories, which model human cognition as a system of simple, interconnected networks. In this respect, connectionist approaches view human cognition as a computational algorithm. On this view, in terms of learning, humans have statistical learning capabilities that allow them to solve problems. Proponents of statistical learning believe that it is the basis for higher-level learning, and that humans use statistical information to create a database which allows them to learn higher-order generalizations and concepts.
For a child acquiring language, the challenge is to parse out discrete segments from a continuous speech stream. Research demonstrates that, when exposed to streams of nonsense speech, children use statistical learning to determine word boundaries. In every human language, certain sounds are more likely to occur together than others: for example, in English, the word-initial sequence [st] is attested (stop), but the word-initial sequence *[gb] is not.
It appears that children can detect the statistical probability of certain sounds occurring with one another, and use this to parse out word boundaries. Utilizing these statistical abilities, children appear to be able to form mental representations, or neural networks, of relevant pieces of information. Pieces of relevant information include word classes, which in connectionist theory, are seen as each having an internal representation and transitional links between concepts. Neighbouring words provide concepts and links for children to bootstrap new representations on the basis of their previous knowledge.
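The transitional-probability idea behind this kind of statistical segmentation can be sketched in a few lines of code. The following is a toy illustration, not a model from any particular study: the nonsense words echo the style of stimuli used in statistical-learning experiments, and the 0.8 boundary threshold is an assumption chosen to make the toy stream segment cleanly.

```python
from collections import Counter

def transitional_probabilities(syllables):
    """P(next syllable | current syllable) for each adjacent pair in the stream."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])
    return {(a, b): n / first_counts[a] for (a, b), n in pair_counts.items()}

def segment(syllables, threshold=0.8):
    """Posit a word boundary wherever transitional probability dips below threshold.

    Within a 'word', each syllable reliably predicts the next (TP near 1.0);
    across word boundaries, the next syllable is less predictable (lower TP).
    """
    tp = transitional_probabilities(syllables)
    words, current = [], [syllables[0]]
    for a, b in zip(syllables, syllables[1:]):
        if tp[(a, b)] < threshold:       # low predictability: start a new word
            words.append("".join(current))
            current = []
        current.append(b)
    words.append("".join(current))
    return words

# A continuous stream built from three made-up words: bidaku, golabu, tupiro.
stream = ("bi da ku go la bu tu pi ro bi da ku tu pi ro "
          "go la bu bi da ku go la bu tu pi ro").split()
print(segment(stream))
# → ['bidaku', 'golabu', 'tupiro', 'bidaku', 'tupiro', 'golabu', 'bidaku', 'golabu', 'tupiro']
```

Within-word pairs like bi→da occur with probability 1.0 in this stream, while boundary pairs like ku→go are less predictable, so boundaries fall out of the statistics alone, with no prior lexicon.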
Bootstrapping and innateness
The innateness hypothesis was originally proposed by Noam Chomsky as a means to explain the universality of language acquisition. All normally developing children with adequate exposure to a language will learn to speak and comprehend the language fluently. It is also proposed that despite the apparent variation among languages, they all fall into a very restricted subset of the potential grammars that could be infinitely conceived. Chomsky argued that since all grammars universally deviate very little from the same subset of general structure, and since children so seamlessly acquire language, humans must have some intrinsic language learning capability. This intrinsic capability was hypothesized to be embedded in the brain, earning the title of language acquisition device (LAD). According to this view, the child is equipped with knowledge of grammatical and ungrammatical types, which he then applies to the stream of speech he is hearing in order to determine the grammar this stream is compatible with. The processes underlying the LAD relate to bootstrapping in that once a child has identified the subset of grammar he is learning, he can then apply his knowledge of grammatical types in order to learn the language-specific aspects of its grammar. This relates to the Principles and Parameters theory of linguistics, in which languages universally consist of basic, unbroken principles and vary by specific parameters.
Semantic bootstrapping
Semantic bootstrapping is a linguistic theory of child language acquisition which proposes that children can acquire the syntax of a language by first learning and recognizing semantic elements and building upon, or bootstrapping from, that knowledge.
According to Pinker, semantic bootstrapping requires two critical assumptions to hold true:
- A child must be able to perceive meaning from utterances. That is, the child must associate utterances with, for example, objects and actions in the real world.
- A child must also be able to realize that there are strong correspondences between semantic and syntactic categories. The child can then use the knowledge of these correspondences to create, test, and refine internal grammar rules iteratively as the child gains more knowledge of their language.
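The correspondences in the second assumption are often described as linking rules: certain semantic types predict certain grammatical categories or roles. The lookup table below is a simplified toy paraphrase of such rules, not Pinker's exact formulation:

```python
# Simplified semantic-to-syntactic linking rules (a paraphrase for illustration,
# not Pinker's exact list of correspondences).
SEMANTIC_TO_SYNTACTIC = {
    "person or thing": "noun",
    "action or change of state": "verb",
    "attribute": "adjective",
    "agent of an action": "subject",
    "patient of an action": "object",
}

def bootstrap_category(perceived_meaning: str) -> str:
    """Guess a word's grammatical category from the semantic type the child
    perceives, falling back to 'unknown' for types not covered by the table."""
    return SEMANTIC_TO_SYNTACTIC.get(perceived_meaning, "unknown")

print(bootstrap_category("action or change of state"))  # → verb
print(bootstrap_category("person or thing"))            # → noun
```

On the semantic bootstrapping account, a child who perceives that a new word names an action can use such a correspondence to hypothesize that the word is a verb, then test and refine that hypothesis against further input.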
Acquiring the state/event contrast
When discussing the acquisition of temporal contrasts, the child must first have a concept of time outside of semantics. In other words, the child must have some mental grasp of events, memory, and the general progression of time before attempting to conceive of it semantically. Semantics, especially with regard to events and memory, appears to be largely language-general, with meanings being universal concepts rather than the individual segments used to represent them. For this reason, acquiring semantics relies less on external stimuli and more on cognition and the child's innate capacity for abstraction: the child must first have a mental representation of a concept before attempting to link a word to that meaning. In order to actually learn time events, several processes must occur:
- The child must have a grasp on temporal concepts
- They must learn which concepts are represented in their own language
- They must learn how their experiences are representative of certain event types that are present in the language
- They must learn the different morphological and syntactic representations of these events
Using these basic stepping stones, the child is able to map their internal concept of the meaning of time onto explicit linguistic segments. This bootstrapping allows them to have hierarchical, segmental steps, in which they are able to build upon their previous knowledge in order to aid future learning.
Tomasello argues that in learning linguistic symbols, the child does not need explicit external linguistic contrasts, and instead learns about these concepts via social context and their surroundings. This can be demonstrated with semantic bootstrapping, in that the child does not explicitly receive information on the semantic meaning of temporal events, but learns to apply their internal knowledge of time to the linguistic segments that they are being exposed to.
Acquiring the count/mass contrast
Mapping the semantic relationships for count follows the bootstrapping methods described above. Since the contexts in which children are presented with number quantities usually include visual aids, the child has a relatively easy way to map these number concepts.
For nouns which denote discrete entities, granted that the child already has the mental concept for BOY and THREE in place, she will see the set of animate, young, human males (i.e. boys) and confirm that the set has a cardinality of three.
For mass nouns, which denote non-discrete substances, counting requires first partitioning the substance into units; these units demonstrate the relationship between "atoms" of the word and the substance. However, mass nouns can vary with regard to their sharpness, or the narrowness with which they refer to an entity. For example, a grain of rice has a much narrower quantity definition than a bag of rice.
"Of" is a word that children are thought to learn as something that transforms a substance into a set of atoms. For example, in a phrase like three gallons of water, the word of marks that the mass noun water is partitioned into gallons. The initial substance now denotes a set. The child again uses visual cues to grasp what this relationship is.
Syntactic bootstrapping
Syntactic bootstrapping is a theory about the process of how children identify word meanings based on their syntactic categories. In other words, knowledge of grammatical structure (including how syntactic categories combine into phrases and constituents to form sentences) "bootstraps", or supports, the acquisition of word meaning. Children do not need to rely solely on environmental context to understand meaning or have the words explained to them. On this theory, children infer word meanings from their observations about syntax, and use these observations to comprehend future utterances they hear.
One of the earliest demonstrations of the existence of syntactic bootstrapping is an experiment done by Roger Brown at Harvard University in 1957. Brown's experiment was the beginning of the framework needed in order for the theory to thrive. He took a nonsense word (like sib) and asked children questions built on one of three sentence frames containing the word, as in (1)–(3). Through his experiments, he showed that children acquire grammar and semantics simultaneously. This led linguists like Lila Gleitman, who coined the term syntactic bootstrapping in 1990, to argue that syntax was pivotal for language learning, as it also gives a learner clues about semantics.
(1) Do you see any sib?
(2) What is sibbing?
(3) Do you see a sib?
Acquiring verbs
An early demonstration by Naigles (1990) of syntactic bootstrapping involved showing 2-year-olds a video of a duck using its left hand to push a rabbit down into a squatting position while both the animals wave their right arms in circles.
Initial video: Duck uses left hand to push rabbit into squatting position while both animals wave their right arms in circles.
During the video, children are presented with one of the following two descriptions:
(6) Utterance A: The duck is kradding the rabbit. (describes a situation where the duck does something to the rabbit)
(7) Utterance B: The rabbit and duck are kradding. (describes a situation where the duck and the rabbit perform the same action)
Children were then presented two distinct follow-up videos.
Follow-up video 1: the duck pushing the rabbit.
Follow-up video 2: the duck and the rabbit both waving their arms in the air.
When instructed to "find kradding", children looked to the video that illustrated the utterance they heard during the initial video. Children who heard utterance A interpreted kradding to mean the act of the duck pushing on the rabbit, while children who heard utterance B assumed kradding was the action of arm waving. This indicates that children arrive at interpretations of a novel verb based on the utterance context and the syntactic structure in which it is embedded.
In 1990, Lila Gleitman took this idea further by examining the acquisition of verbs in more detail. In her study, she found that children could differentiate between verbs that take one or more arguments and that this knowledge was used to help them narrow down the potential meanings for the verb in question. This discovery explains how children can learn the meaning of verbs that cannot be observed, like ‘think’.
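The logic of the kradding experiment can be caricatured as frame-based inference. The frame labels and the mapping below are a toy construction of our own, mirroring the two conditions in Naigles (1990) rather than reproducing anything from the study itself:

```python
# Toy mapping from syntactic frames to meaning hypotheses for a novel verb.
# The frame labels are our own shorthand for the two Naigles-style conditions.
FRAME_HYPOTHESES = {
    "NP is V-ing NP": "causative: the first participant does something to the second",
    "NP and NP are V-ing": "joint action: both participants perform the same action",
}

def interpret_novel_verb(frame: str, verb: str) -> str:
    """Return the meaning hypothesis a learner might form for a novel verb
    heard in the given syntactic frame."""
    hypothesis = FRAME_HYPOTHESES.get(frame, "no hypothesis (unfamiliar frame)")
    return f"{verb}: {hypothesis}"

# "The duck is kradding the rabbit" -> transitive frame -> causative reading.
print(interpret_novel_verb("NP is V-ing NP", "kradding"))
# "The rabbit and duck are kradding" -> intransitive frame -> joint-action reading.
print(interpret_novel_verb("NP and NP are V-ing", "kradding"))
```

The point of the sketch is simply that the number and arrangement of arguments in the frame, not the scene alone, selects between candidate meanings, which is why the same video supports two different interpretations of the same nonsense verb.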
Acquiring nouns
The acquisition of nouns is related to the acquisition of the mass/count contrast. In 1969, Willard Van Orman Quine claimed that children cannot learn new nouns unless they have already acquired this semantic distinction. Otherwise, the word "apples" might refer to the individual objects in a pile or the pile itself, and the child would have no way to know without already understanding the difference between a mass and a count noun. Nancy N. Soja argues that Quine is mistaken, and that children can learn new nouns without fully understanding the mass/count distinction. She found in her study that 2-year-old children were able to learn new nouns (some mass, some count nouns) by inferring meaning from the syntactic structure of the sentence the words were introduced in.
Acquiring adjectives
In a 2010 study, Syrett and Lidz show that children learn the meaning of novel gradable adjectives on the basis of the adverbs that modify them. Gradable adjectives have a scale associated with them: for example, the adjective "large" places the noun that it modifies on a size scale, while the adjective "expensive" places the noun that it modifies on a price scale. In addition, gradable adjectives (GA's) subdivide into two classes: relative and maximal GA's.
Relative GA's are words like "big" in (5), and require a reference point: a big mouse is not the same size as a big elephant. As shown in (6) and (7), while relative GAs can be modified by the adverb very, they cannot be modified by the adverb completely.
relative gradable adjectives
(5) a. a big mouse b. a big elephant
(6) a. a very big mouse b. a very big elephant
(7) a. *a completely big mouse b. *a completely big elephant
Maximal GA's are words like "full" in (8); they operate on a close-ended scale. As shown in (9) and (10), while maximal GAs cannot be modified by the adverb very, they can be modified by the adverb completely.
maximal gradable adjectives
(8) a. a full pool b. a full tank
(9) a. ?? a very full pool b. ?? a very full tank
(10) a. a completely full pool b. a completely full tank
In the 2010 study, Syrett and Lidz showed children pictures of objects that could be described in terms of both relative and maximal GA's: for example, a picture of a container that could be described as both tall (a relative GA) and clear (a maximal GA).
When showing these objects to the children, the novel adjective used to describe them was prefaced with either the adverb very (which usually modifies relative GA's) or the adverb completely (which modifies maximal GA's). As a control, in some contexts, no adverb was present. When the novel adjective was presented with the adverb very, the children assigned a relative GA meaning to it, and when it was presented with the adverb completely, a maximal GA meaning. When no adverb was present, the children were unable to assign a meaning to the adjective. This shows that, in order for children to learn the meaning of a new adjective, they depend on grammatical information provided by adverbs about the semantic class of the novel adjective.
Acquiring functional categories
There is a basic contrast between lexical categories (which include open-class items such as verbs, nouns, and adjectives) and functional categories (which include closed-class items such as auxiliary verbs, case markers, complementizers, conjunctions, and determiners). The acquisition of functional categories has been studied significantly less than that of lexical categories, so much remains unknown. A 1998 study led by Rushen Shi shows that, at a very young age, Mandarin and Turkish learners use phonological, acoustic, and distributional cues to distinguish words belonging to lexical categories from words belonging to functional categories. 11- to 20-month-old children were observed speaking with their mothers to evaluate whether speech directed at the children contained clues that they could then use to categorize words as "lexical" or "functional". Compared with lexical category words, functional category words were found to differ systematically in these phonological and acoustic properties.
Evidence
A) Brown (1957) -- Children between the ages of three and five were shown various pictures accompanied by novel English words. The nonsense words included singular nouns, mass nouns, and verbs. Brown showed these pictures to a child and asked them to tell him which specific nonsense word the picture depicted; responses were classified as a singular noun, a mass noun, or a verb. When the novel words were repositioned within the sentence and the children were asked a question, they focused on different aspects of the image shown and adjusted their answer. For example, when Brown wanted the child to identify a mass noun, he would ask the children "do you see any sib", and the child would point at the pictured mass noun or noun indicating quantity. To identify a verb, he would ask "what is sibbing", where sib is just a verb stem. In order to identify a singular noun, he would ask "do you see a sib?" When children made guesses, they were correct more than half of the time. This shows that children are sensitive to the syntactic position of words, and can correctly associate a novel word with its syntactic category.
B) Landau and Gleitman (1985) —Upon experimenting with both blind and sighted children, Landau and Gleitman found that these children all differentiate between look and see versus touch, despite the blind child not being physically capable of looking or seeing. All children were found to associate look and see with perception, and touch with exploration. That blind children were able to learn the meanings of vision-related words even though they do not have vision shows that they used syntax and context to infer the meaning of these verbs.
C) Papafragou, Cassidy, Gleitman (2007) —Participants were asked to identify verbs within the context of a video. Papafragou et al. had children watch 12 videotaped stories: 4 stories about the subject's desires and 8 stories that varied in the subject's beliefs and the framing of a novel verb. At the end of the tape, children would hear a sentence describing the scene, but the sentence's verb was replaced with a novel word. Children were asked to respond with what they thought the word meant. Their responses were categorized 4 ways: Action, Belief, Desire, and Other. The researchers found that action words were easily interpreted by children; however, false-belief scenes with a complementizer phrase caused children to respond with belief words more often. Results showed that participants in the experiment identified the verb most accurately when they could use both the video and sentence contexts. When it comes to attitude verbs, children are sensitive to the syntactic framing of the verb in question.
D) Wellwood, Gagliardi, and Lidz (2016) —Showed that four-year-olds can understand the difference between a quantitative or qualitative word, based on its syntactic position within a sentence. In “Gleebest of the cows are by the barn,” the novel word “gleebest” is in a determiner position, and is inferred to mean “most” or “many.” In “the gleebest cows are by the barn,” “gleebest” is in an adjective position, and children infer it to mean “spotty” or another quality. These results are significant because they show children using syntax to understand word meanings.
E) Gillette et al. (1999) —Researchers tested adults to see what difficulties they would face when asked to identify a word from a muted, videotaped scene. They found that adults had trouble identifying the word, especially verbs, when they could only refer to the scene. Their performance increased once they were given the syntactic context for the mystery word. These results indicate that word learning is aided by the presence of syntactic context.
F) Harrigan, Hacquard, and Lidz (2016) —Found that children's interpretation of a new attitude verb depended on the syntactic frame in which it was introduced. In the experiment, children who heard the word 'hope' presented in the same syntactic frame as 'want' (i.e. followed by an infinitival verb) connected the new verb 'hope' with a meaning of desire. On the other hand, those who heard 'hope' presented in the same frame as 'think' (i.e. followed by a finite verb) made no such association between desire and the new verb, instead associating the novel verb with belief. This provides evidence that children use syntax to some extent in learning the meaning behind these sorts of abstract verbs.
G) Waxman, S. R., & Booth, A. E. (2001) —Children who heard nouns focused on object categories, while children who heard adjectives focused on an object's properties and categories. This shows that children are sensitive to different syntactic categories and can use their observations of syntax to infer word meaning.
Prosodic bootstrapping
Even before infants can comprehend word meaning, prosodic details assist them in discovering syntactic boundaries. Prosodic bootstrapping or phonological bootstrapping investigates how prosodic information — which includes stress, rhythm, intonation, pitch, pausing, as well as dialectal features — can assist a child in discovering the grammatical structure of the language that she or he is acquiring.
In general, prosody introduces features that reflect either attributes of the speaker or of the utterance type. Speaker attributes include emotional state, as well as the presence of irony or sarcasm. Utterance-level attributes are used to mark questions, statements, and commands, and they can also be used to mark contrast.
Similarly, in sign language, prosody includes facial expression, mouthing, and the rhythm, length, and tension of gestures and signs.
In language, words are not only categorized into phrases, clauses, and sentences. Words are also organized into prosodic envelopes. The idea of a prosodic envelope states that words that go together syntactically also form a similar intonation pattern. This explains how children discover syllable and word boundaries through prosodic cues. Overall, prosodic bootstrapping explores determining grammatical groupings in a speech stream rather than learning word meaning.
One of the key components of the prosodic bootstrapping hypothesis is that prosodic cues may aid infants in identifying lexical and syntactical properties. From this, three key elements of prosodic bootstrapping can be proposed:
- The syntax of language is correlated with acoustic properties.
- Infants can detect and are sensitive to these acoustic properties.
- These acoustic properties can be used by infants when processing speech.
There is evidence that the acquisition of language-specific prosodic qualities starts even before an infant is born. This is seen in neonate crying patterns, which have qualities similar to the prosody of the language that they are acquiring. The only way that an infant could be born with this ability is if the prosodic patterns of the target language are learned in utero. Further evidence of young infants using prosodic cues is their ability to discriminate the acoustic property of pitch change by 1–2 months old.
Prosodic cues for syntactic structure
Infants and young children receive much of their language input in the form of infant-directed speech (IDS) and child-directed speech (CDS), which are characterized as having exaggerated prosody and simplification of words and grammar structure. When interacting with infants and children, adults often raise and widen their pitch, and reduce their speech rate. However, these cues vary across cultures and across languages.
There are several ways in which infant- and child-directed speech can facilitate language acquisition. Recent studies show that IDS and CDS contain prosodic information that may help infants and children distinguish between paralinguistic expressions (e.g. gasps, laughs) and informative speech. In Western cultures, mothers speak to their children using exaggerated intonation and pauses, which offer insight into syntactic groupings such as noun phrases, verb phrases, and prepositional phrases. This means that the linguistic input infants and children receive includes some prosodic bracketing around syntactically relevant chunks.
(1) Look the boy is patting the dog with his hand.
(2) *Look the boy ... is ... patting the ... dog with his ... hand.
(3) Look … [DP The boy] ... [VP is patting the dog] ... [PP with his hand].
A sentence like (1) will not typically be produced with the pauses indicated in (2), where the pauses "interrupt" syntactic constituents. For example, pausing between the and dog would interrupt the noun phrase (DP) constituent, as would pausing between his and hand. Most often, pauses are placed so as to group the utterance into chunks that correspond to the beginnings and ends of constituents such as noun phrases (DPs), verb phrases (VPs), and prepositional phrases (PPs). As a result, sentences like (3), where the pauses correspond to syntactic constituents, are much more natural.
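The claim that natural pauses respect constituent edges can be checked mechanically. The snippet below hand-encodes the constituent spans from example (3) as word indices of our own choosing and tests whether a given set of pause positions falls only on constituent edges:

```python
# Words of the example sentence, indexed 0..9:
# Look(0) the(1) boy(2) is(3) patting(4) the(5) dog(6) with(7) his(8) hand(9)
# Constituent spans as half-open (start, end) word-index ranges, per example (3):
constituents = [(1, 3),   # [DP the boy]
                (3, 7),   # [VP is patting the dog]
                (7, 10)]  # [PP with his hand]

# A pause "at position p" means a pause just before word index p.
boundaries = {edge for span in constituents for edge in span}

def pauses_respect_syntax(pause_positions):
    """True iff every pause falls on a constituent edge (as in (3)),
    i.e. no pause interrupts a DP, VP, or PP (as in (2))."""
    return all(p in boundaries for p in pause_positions)

print(pauses_respect_syntax([1, 3, 7]))     # pauses as in (3) → True
print(pauses_respect_syntax([3, 4, 6, 9]))  # pauses as in (2) → False
```

The pause pattern of (2) fails because pauses before patting, dog, and hand land inside the VP and PP, while the pattern of (3) passes because every pause sits exactly where one constituent ends and the next begins.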
Moreover, within these phrases are distinct patterns of stress, which helps to differentiate individual elements within the phrase, such as a noun from an article. Typically, articles and other unbound morphemes are unstressed and are relatively short in duration in contrast to the pronunciation of nouns. Furthermore, in verb phrases, auxiliary verbs are less stressed than main verbs. This can be seen in (4).
(4) They are RUNning.
Prosodic bootstrapping states that these naturally occurring intonation packages help infants and children bracket linguistic input into syntactic groupings. Currently, there is not enough evidence to suggest that prosodic cues in IDS and CDS facilitate the acquisition of more complex syntax; however, IDS and CDS are richer linguistic inputs for infants and children.
Prosodic cues for clauses and phrases
There is continued research into whether infants use prosodic cues – in particular, pauses – when processing clauses and phrases. Clauses are among the largest constituent structures in a sentence and are often produced in isolation in conversation.
Criticism
Critics of prosodic bootstrapping have argued that the reliability of prosodic cues has been overestimated and that prosodic boundaries do not always match up with syntactic boundaries. They argue instead that while prosody does provide infants and children with useful clues about a language, it does not explain how children learn to combine clauses, phrases, and sentences, nor how they learn word meaning. As a result, a comprehensive account of how children learn language must combine prosodic bootstrapping with other types of bootstrapping as well as more general learning mechanisms.
Pragmatic bootstrapping
Pragmatic bootstrapping refers to how pragmatic cues and their use in social context assist language acquisition, and more specifically, word learning. Pragmatic cues are conveyed both verbally and through nonlinguistic means, including hand gestures, eye movement, a speaker's focus of attention, intentionality, and linguistic context. Similarly, the parsimonious model proposes that a child learns word meaning by relating language input to their immediate environment. An example of pragmatic bootstrapping would be a teacher saying the word "dog" while gesturing to a dog in the presence of a child.
Gaze following
Children are able to associate words with actions or objects by following the gaze of their communication partner. Often, this occurs when an adult labels an action or object while looking at it.
In one experiment, children in the Action Highlighted Condition associated the novel word with the novel action, whereas children in the Object Highlighted Condition assumed the novel word referred to the novel object. To understand that the novel word referred to the novel action, children had to infer from the experimenter's nonverbal behavior that the experimenter was requesting the action on the object. This illustrates how non-linguistic context influences novel word learning.
Observing adult behavior
Children also look at the adult's face when learning new words, which can often lead to a better understanding of what the word means. In everyday speech, mistakes are often made, so why do children not end up learning the wrong words for the targeted things? This may be because children are able to tell whether a word was right or wrong for the intended meaning by watching the adult's facial expressions and behavior.
Verb: Plunk. "Can you go plunk Mickey Mouse?"
The adult said this sentence without previously explaining what the verb "plunk" meant. Afterwards, the adult would do one of two things.
Action 1: She performed the target action intentionally, saying "There!", followed immediately by another action on the same apparatus performed "accidentally", in an awkward fashion, saying "Whoops!"
Action 2: Same as Action 1, but with the intentional and accidental actions reversed.
Afterwards, the children were given another apparatus and asked to do the same, to see whether they would perform the targeted action.
The results were that the children understood the intended action for the new word they had just heard, and performed it when asked. By watching the adult's behavior and facial expressions, they were able to understand what the verb "plunk" meant, and to figure out whether the targeted or the accidental action was intended.
In another experiment, the adults would leave the room and, upon returning, ask the child to bring the new object over. In the Language condition, the child would correctly bring the targeted object over. In the No-Language condition, the child would bring an object over at random.
This demonstrates two things:
- The child was aware of which object was new for the adults who had left the room.
- The child knew that the adult was excited because the object was new, and that this was why the adult used a new term the child had never heard before.
The child was able to understand this based on the emotional behavior of the adult.