Controlled grammars are a class of grammars that usually extend the context-free grammars with additional controls on the derivations of a sentence in the language. A number of different kinds of controlled grammars exist, the four main divisions being indexed grammars, grammars with prescribed derivation sequences, grammars with contextual conditions on rule application, and grammars with parallelism in rule application. Because indexed grammars are so well established in the field, this article will address only the latter three kinds of controlled grammars.
Contents
- Control by prescribed sequences
- Language controlled grammars
- Proof-theoretic description
- Example
- Matrix grammars
- Vector grammars
- Programmed grammars
- Control by context conditions
- Conditional grammars
- Proof-theoretic definition
- Semi-conditional grammars
- Random context grammars
- Ordered grammars
- Grammars with parallelism
- Indian parallel grammars
- K-grammars
- Russian parallel grammars
- Scattered context grammars
- References
Control by prescribed sequences
Grammars with prescribed sequences are grammars in which the sequence of rule application is constrained in some way. There are four different versions of prescribed sequence grammars: language controlled grammars (often called just controlled grammars), matrix grammars, vector grammars, and programmed grammars.
In the standard context-free grammar formalism, a grammar itself is viewed as a 4-tuple G = (N, T, S, P), where N is a finite set of non-terminal symbols, T a finite set of terminal symbols, S ∈ N the start symbol, and P a finite set of production rules.
Derivations over such a grammar are sequences of rules in P that, when applied in the order of the sequence, lead to a terminal string. That is, one can view the set of imaginable derivations in G as the set R ⊆ P* of all rule sequences p1p2…pn such that S ⇒^{p1} w1 ⇒^{p2} … ⇒^{pn} wn, where wn ∈ T*.
The set R, due to its infinitude, is almost always (though not necessarily) described via some more convenient mechanism, such as a grammar (as in language controlled grammars), or a set of matrices or vectors (as in matrix and vector grammars). The different variations of prescribed sequence grammars thus differ by how the sequence of derivations is defined on top of the context-free base. Because matrix grammars and vector grammars are essentially special cases of language controlled grammars, examples of the former two will not be provided below.
Language controlled grammars
Language controlled grammars are grammars in which the production sequences constitute a well-defined language of arbitrary nature, usually though not necessarily regular, over a set of (again usually though not necessarily) context-free production rules. They also often have a sixth set in the grammar tuple, making it G = (N, T, S, P, R, F), where R is the control language and F is a set of rules that may be passed over when they cannot apply (used for appearance checking).
Proof-theoretic description
We let a regularly controlled context-free grammar with appearance checking be a 6-tuple G = (N, T, S, P, R, F), where N, T, S, and P are as in an ordinary context-free grammar, R is a regular language over P (the control language), and F is a subset of P.
Given some strings x and y, both in (N ∪ T)*, and some production rule p = A → w in P, the relation x ⇒^{p} y holds if either x = x1 A x2 and y = x1 w x2 (the rule genuinely rewrites an occurrence of A), or A does not occur in x, x = y, and p ∈ F (the rule applies vacuously).
Intuitively, this simply spells out that a rule can apply to a string if the rule's left-hand side appears in that string, or if the rule is in the set of "vacuously applicable" rules which can "apply" to a string without changing anything. This requirement that the non-vacuously applicable rules must apply is the appearance checking aspect of such a grammar. The language for this kind of grammar is then simply the set of terminal strings L(G) = { w ∈ T* : S ⇒^{p1} w1 ⇒^{p2} … ⇒^{pn} w for some control word p1p2…pn ∈ R }.
Example
Let's consider a simple (though not the simplest) context-free grammar that generates the language {a^(2^n)}:
Let
In language controlled form, this grammar is simply
If we choose some arbitrary production sequence
Choosing two random non-terminal-deriving sequences, and one terminal-deriving one, we can see this at work:
Let
Let
Let
Similar derivations with a second cycle of
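The mechanics of a regularly controlled derivation with appearance checking can be made concrete with a short Python sketch. The grammar and labels below (d, c, t, and the trap rules kA, kB) are inventions for this sketch rather than the article's own notation: under one possible control language (d* kA c* kB)* t* kA, with the traps placed in F, the grammar generates {a^(2^n)}.

```python
# A toy simulator for a regularly controlled grammar with appearance
# checking. A control word (a sequence of rule labels) is applied to the
# start string; rules in F may "apply" vacuously when their left-hand
# side is absent, while all other rules must genuinely rewrite.
# Membership of the control word in the control language is not checked.

RULES = {                      # invented labels for a {a^(2^n)} grammar
    "d": ("A", "BB"),          # doubling: an A becomes two Bs
    "c": ("B", "A"),           # conversion: a B becomes an A again
    "t": ("A", "a"),           # termination: an A becomes a terminal a
    "kA": ("A", "#"),          # trap: fires if an A remains
    "kB": ("B", "#"),          # trap: fires if a B remains
}
F = {"kA", "kB"}               # only the traps may apply vacuously

def run(control, start="A"):
    s = start
    for label in control:
        lhs, rhs = RULES[label]
        i = s.find(lhs)
        if i >= 0:                      # rewrite the leftmost occurrence
            s = s[:i] + rhs + s[i + 1:]
        elif label not in F:            # rule must apply but cannot
            return None
    return s if "#" not in s else None  # a fired trap kills the derivation

# One word of the control language (d* kA c* kB)* t* kA, deriving a^4:
control = ["d", "kA", "c", "c", "kB",
           "d", "d", "kA", "c", "c", "c", "c", "kB",
           "t", "t", "t", "t", "kA"]
```

The traps enforce appearance checking: a control word that leaves the doubling or conversion phase incomplete either fails outright or introduces the dead symbol "#".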
Matrix grammars
Matrix grammars (expanded on in their own article) are a special case of regular controlled context-free grammars, in which the production sequence language is of the form (m1 ∪ m2 ∪ … ∪ mk)*, where each mi (called a matrix) is a finite sequence of production rules from P.
The derives relation in a matrix grammar is thus defined simply as:
Given some strings x and y, both in (N ∪ T)*, and some matrix m = p1p2…pn, the relation x ⇒^{m} y holds if there are intermediate strings x = x0 ⇒^{p1} x1 ⇒^{p2} … ⇒^{pn} xn = y, where each individual step either rewrites an occurrence of the rule's left-hand side or, if the rule is in F and its left-hand side is absent, leaves the string unchanged.
Informally, a matrix grammar is simply a grammar in which during each rewriting cycle, a particular sequence of rewrite operations must be performed, rather than just a single rewrite operation, i.e. one rule "triggers" a cascade of other rules. Similar effects can be achieved in the standard context-sensitive idiom, as in rule-based phonology and earlier Transformational grammar, by what are known as "feeding" rules, which alter a derivation in such a way as to provide the environment for a non-optional rule that immediately follows it.
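The cascade behaviour of matrices is easy to simulate. The grammar below is the textbook matrix grammar for {a^n b^n c^n} (an assumed example, not taken from this article); each matrix is a list of context-free rules applied in order, with no appearance checking in this sketch:

```python
def apply_matrix(s, matrix):
    """Apply every rule of the matrix, in order, to the leftmost
    occurrence of its left-hand side; fail if any rule cannot apply."""
    for lhs, rhs in matrix:
        i = s.find(lhs)
        if i < 0:
            return None
        s = s[:i] + rhs + s[i + 1:]
    return s

# A standard matrix grammar for {a^n b^n c^n : n >= 1}:
M_START = [("S", "ABC")]
M_GROW  = [("A", "aA"), ("B", "bB"), ("C", "cC")]
M_END   = [("A", "a"), ("B", "b"), ("C", "c")]

s = "S"
for m in (M_START, M_GROW, M_GROW, M_END):
    s = apply_matrix(s, m)   # S, ABC, aAbBcC, aaAbbBccC, aaabbbccc
```

Because M_GROW must rewrite one A, one B, and one C in every cycle, the three counts can never drift apart, which is exactly the "cascade" the prose describes.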
Vector grammars
Vector grammars are closely related to matrix grammars, and in fact can be seen as a special class of matrix grammars, in which if a sequence of rules occurs as a matrix, then so does every permutation of that sequence.
The derives relation in a vector grammar is then:
Given some strings x and y, both in (N ∪ T)*, and some vector m = p1p2…pn, the relation x ⇒^{m} y holds if y can be derived from x by applying each of the rules p1, …, pn exactly once, in some arbitrary order.
Notice that the number of production rules used in the derivation sequence, n, is the same as the number of production rules in the vector. Informally, then, a vector grammar is one in which a set of productions is applied, each production applied exactly once, in arbitrary order, to derive one string from another. Thus vector grammars are almost identical to matrix grammars, minus the restriction on the order in which the productions must occur during each cycle of rule application.
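The difference from matrix grammars, namely the free order within a cycle, can be sketched directly (the rules below are invented for illustration): a vector step tries every permutation of the vector and keeps each result that a permutation can reach.

```python
from itertools import permutations

def apply_seq(s, seq):
    # sequentially apply rules to leftmost occurrences; None on failure
    for lhs, rhs in seq:
        i = s.find(lhs)
        if i < 0:
            return None
        s = s[:i] + rhs + s[i + 1:]
    return s

def vector_step(s, vector):
    """One vector grammar step: the rules of the vector may be applied
    in any order, each exactly once; collect every reachable result."""
    results = set()
    for seq in permutations(vector):
        t = apply_seq(s, seq)
        if t is not None:
            results.add(t)
    return results
```

Note how, starting from "A" with the vector [A → aB, B → b], only one of the two orders succeeds (B must be created before it can be rewritten), but the step as a whole still goes through, since a vector step succeeds when some permutation does.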
Programmed grammars
Programmed grammars are relatively simple extensions to context-free grammars with rule-by-rule control of the derivation. A programmed grammar is a 4-tuple G = (N, T, S, P), where N, T, and S are as usual and each member of P is a triple (p, σ, φ) consisting of a context-free rewrite rule p, a success field σ ⊆ P, and a failure field φ ⊆ P.
Given two strings x and y, both in (N ∪ T)*, and a production r = (p, σ, φ) with core rule p = A → w, the relation x ⇒^{r} y holds if either x = x1 A x2 and y = x1 w x2, or x = y and A does not occur in x.
The language of a programmed grammar G is defined by constraining the derivation rule-wise, as the set of terminal strings derivable from S by a sequence of rules in which each rule after the first is drawn from the success field of the rule before it if that rule applied successfully, and from its failure field if it applied vacuously.
Intuitively, when applying a rule p in a programmed grammar, the rule can either succeed at rewriting a symbol in the string, in which case the subsequent rule must be in p's success field, or the rule can fail to rewrite a symbol (thus applying vacuously), in which case the subsequent rule must be in p's failure field. The choice of which rule to apply to the start string is arbitrary, unlike in a language controlled grammar, but once a choice is made, the rules that can be applied after it constrain the sequence of rules from that point on.
Example
As with so many controlled grammars, programmed grammars can generate the language {a^(2^n)}:
Let
The derivation for the string aaaa is as follows:
As can be seen from the derivation and the rules, each time
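One way to see the success and failure fields in action is a small simulator. The rule set below is an invented programmed grammar for {a^(2^n)}, not necessarily the article's: r1 doubles As into Bs until it fails, r2 converts Bs back into As until it fails, and r3 emits terminals.

```python
# Each label maps to (lhs, rhs, success field, failure field).
RULES = {
    "r1": ("A", "BB", {"r1"}, {"r2", "r3"}),  # double As into Bs
    "r2": ("B", "A",  {"r2"}, {"r1", "r3"}),  # turn Bs back into As
    "r3": ("A", "a",  {"r3"}, set()),         # emit terminal as
}

def derive(start, labels):
    s, allowed = start, set(RULES)            # the first rule choice is free
    for lab in labels:
        if lab not in allowed:
            return None                       # not permitted by the fields
        lhs, rhs, succ, fail = RULES[lab]
        i = s.find(lhs)
        if i >= 0:                            # success: rewrite leftmost lhs
            s = s[:i] + rhs + s[i + 1:]
            allowed = succ
        else:                                 # failure: vacuous application
            allowed = fail
    return s

# Doubling twice (A -> BB -> AA -> BBBB -> AAAA) and terminating gives a^4;
# note the failing applications that switch phases via the failure fields.
SEQ = ["r1", "r1", "r2", "r2", "r2", "r1", "r1", "r1",
       "r2", "r2", "r2", "r2", "r2", "r3", "r3", "r3", "r3"]
```

A sequence that tries to switch phases without first failing, such as r1 followed immediately by r2, is rejected, because r2 is not in r1's success field.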
Control by context conditions
Unlike grammars controlled by prescribed sequences of production rules, which constrain the space of valid derivations but do not constrain the sorts of sentences that a production rule can apply to, grammars controlled by context conditions have no sequence constraints, but permit constraints of varying complexity on the sentences to which a production rule applies. Similar to grammars controlled by prescribed sequences, there are multiple different kinds of grammars controlled by context conditions: conditional grammars, semi-conditional grammars, random context grammars, and ordered grammars.
Conditional grammars
Conditional grammars are the simplest version of grammars controlled by context conditions. The structure of a conditional grammar is very similar to that of a normal rewrite grammar: a 4-tuple G = (N, T, S, P), except that the members of P are pairs (p, R) consisting of a rewrite rule p and a context language R (usually regular) over N ∪ T.
Proof-theoretic definition
With this definition of a conditional grammar, we can define the derives relation as follows:
Given two strings x and y, both in (N ∪ T)*, and a pair (p, R) ∈ P with p = A → w, the relation x ⇒ y holds if x = x1 A x2, y = x1 w x2, and x ∈ R.
Informally then, the production rule for some pair in P can apply only to strings that are in its context language. Thus, for example, if we had some pair (A → w, R), the rule A → w could rewrite only those strings that belong to the regular language R.
Example
Conditional grammars can generate the context-sensitive language {a^(2^n)}:
Let
We can then generate the sentence aaaa with the following derivation:
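A conditional grammar of this kind can be explored mechanically. The sketch below assumes an invented conditional grammar for {a^(2^n)}, not the article's: each rule's context language is given as a regular expression that the whole current string must match, with "P" standing in for a primed non-terminal. A breadth-first search then enumerates every terminal string derivable up to a length bound.

```python
import re
from collections import deque

# An invented conditional grammar for {a^(2^n)}: the context languages
# confine the derivation to three phases (doubling, conversion back,
# termination), and any rewrite that breaks the phase shape dead-ends.
RULES = [
    ("A", "BB", r"B*A+"),   # doubling phase: string has shape B*A+
    ("B", "P",  r"P*B+"),   # conversion phase: string has shape P*B+
    ("P", "A",  r"A*P+"),   # reconversion phase: string has shape A*P+
    ("A", "a",  r"a*A+"),   # termination phase: string has shape a*A+
]

def successors(s):
    for lhs, rhs, ctx in RULES:
        if re.fullmatch(ctx, s):            # condition on the whole string
            for i in range(len(s)):
                if s[i] == lhs:             # rewrite any one occurrence
                    yield s[:i] + rhs + s[i + 1:]

def language_up_to(start, max_len):
    seen, queue, terminal = {start}, deque([start]), set()
    while queue:
        s = queue.popleft()
        if s.islower():                     # fully terminal string
            terminal.add(s)
            continue
        for t in successors(s):
            if len(t) <= max_len and t not in seen:
                seen.add(t)
                queue.append(t)
    return terminal
```

Up to length 8, the only terminal strings the search finds are a, aa, aaaa, and aaaaaaaa, i.e. a^(2^n) for n = 0..3.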
Semi-conditional grammars
A semi-conditional grammar is very similar to a conditional grammar, and technically the class of semi-conditional grammars is a subset of the conditional grammars. Rather than specifying what the whole of the string must look like for a rule to apply, semi-conditional grammars specify that a string must have as substrings all of some set of strings, and none of another set, in order for a rule to apply. Formally, then, a semi-conditional grammar is a tuple G = (N, T, S, P), where each member of P is a triple (p, R, Q) consisting of a rewrite rule p, a finite set R of strings that must all occur as substrings of the string being rewritten, and a finite set Q of strings none of which may occur as substrings.
For two strings x and y, both in (N ∪ T)*, and a triple (p, R, Q) ∈ P with p = A → w, the relation x ⇒ y holds if x = x1 A x2, y = x1 w x2, every string in R is a substring of x, and no string in Q is a substring of x.
The language of a semi-conditional grammar is then trivially the set of terminal strings w ∈ T* such that S ⇒* w.
An example of a semi-conditional grammar is given below, as part of the example for random context grammars.
Random context grammars
A random context grammar is a semi-conditional grammar in which the R and Q sets are all subsets of N. Because the required and forbidden substrings are then single non-terminal symbols, testing the conditions amounts to checking which non-terminals do and do not occur in the string.
Example
Like conditional grammars, random context grammars (and thus semi-conditional grammars) can generate the language {a^(2^n)}:
Let
Consider now the derivation for aaaa:
The behavior of the R sets here is trivial: any string can be rewritten according to them, because they do not require any substrings to be present. The behavior of the Q sets, however, is more interesting.
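The phase discipline that forbidding sets impose can be seen by enumerating an example grammar's language. The random context grammar below is an invented one for {a^(2^n)}, with "P" and "D" as auxiliary non-terminals; every rule has an empty R set, and the Q sets alone sequence the phases.

```python
from collections import deque

# An invented random context grammar for {a^(2^n)}: each rule carries an
# (empty) permitting set R and a forbidding set Q of non-terminals that
# must be absent from the string for the rule to apply.
RULES = [
    ("A", "BB", set(), {"P", "D"}),  # double, unless converting/finishing
    ("B", "P",  set(), {"A"}),       # convert only once every A is doubled
    ("P", "A",  set(), {"B"}),       # convert back once every B is gone
    ("A", "D",  set(), {"B", "P"}),  # begin finishing only between cycles
    ("D", "a",  set(), {"A"}),       # emit terminals once no A remains
]

def successors(s):
    for lhs, rhs, permit, forbid in RULES:
        if all(n in s for n in permit) and not any(n in s for n in forbid):
            for i in range(len(s)):
                if s[i] == lhs:
                    yield s[:i] + rhs + s[i + 1:]

def language_up_to(start, max_len):
    seen, queue, terminal = {start}, deque([start]), set()
    while queue:
        s = queue.popleft()
        if s.islower():
            terminal.add(s)
            continue
        for t in successors(s):
            if len(t) <= max_len and t not in seen:
                seen.add(t)
                queue.append(t)
    return terminal
```

Enumerating up to length 8 yields exactly a, aa, aaaa, and aaaaaaaa: for instance, B → P is blocked while any A survives, so a doubling cycle cannot be abandoned halfway.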
Ordered grammars
Ordered grammars are perhaps one of the simpler extensions of grammars into the controlled grammar domain. An ordered grammar is simply a tuple G = (N, T, S, P), where N, T, and S are as in a context-free grammar and P is a set of rewrite rules equipped with a partial order.
Given some strings x and y, both in (N ∪ T)*, and a rule p = A → w in P, the relation x ⇒^{p} y holds if x = x1 A x2, y = x1 w x2, and there is no rule p′ greater than p in the partial order whose left-hand side occurs in x.
Example
Like many other contextually controlled grammars, ordered grammars can enforce the application of rules in a particular order. Since this is the essential property of the previous grammars that could generate the language {a^(2^n)}, it should be no surprise that ordered grammars can generate this language as well.
Let
The derivation for the string aaaa is simply:
At each step of the way, the derivation proceeds by rewriting in cycles. Notice that at the fifth step, we had four options:
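The way a partial order blocks a rule can be sketched directly: a rule is applicable only if its left-hand side occurs and no greater rule could rewrite the string. In the toy rule set below (invented for this sketch), a trap rule placed above the terminating rule forbids terminating while a B remains.

```python
# A sketch of the ordered grammar mechanism. A "trap" rule rewriting B
# to the dead symbol "#", placed above "fin" in the partial order,
# makes "fin" inapplicable whenever a B is present in the string.
RULES = {
    "back": ("B", "A"),
    "fin":  ("A", "a"),
    "trap": ("B", "#"),
}
ORDER = {("fin", "trap")}   # fin < trap: trap pre-empts fin

def applicable(label, s):
    lhs = RULES[label][0]
    if lhs not in s:
        return False
    # blocked if any rule above this one could rewrite s
    return not any(lo == label and applicable(hi, s)
                   for lo, hi in ORDER)

def apply_rule(label, s):
    lhs, rhs = RULES[label]
    i = s.find(lhs)
    return s[:i] + rhs + s[i + 1:]
```

So on the string AB, only back (and the trap itself) may fire, while on AA the terminating rule fin is free to apply; this is how ordered grammars emulate the forbidding conditions of random context grammars.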
Grammars with parallelism
A still further class of controlled grammars is the class of grammars with parallelism in the application of a rewrite operation, in which each rewrite step can (or must) rewrite more than one non-terminal simultaneously. These, too, come in several flavors: Indian parallel grammars, k-grammars, scattered context grammars, unordered scattered context grammars, and k-simple matrix grammars. Again, the variants differ in how the parallelism is defined.
Indian parallel grammars
An Indian parallel grammar is simply a CFG in which to use a rewrite rule, all instances of the rule's non-terminal symbol must be rewritten simultaneously. Thus, for example, given the string aXbYcXd, with two instances of X, and some rule X → w, both instances of X must be rewritten to w in the same step.
Indian parallel grammars can easily produce the language {ww : w ∈ {a, b}*}:
Let
Generating aabaab then is quite simple:
The language {a^(2^n)} can also be generated by an Indian parallel grammar:
Let
It should be obvious, just from the first rule, and the requirement that all instances of a non-terminal are rewritten simultaneously with the same rule, that the number of Ss doubles on each rewrite step using the first rule, giving the derivation steps
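Both Indian parallel examples are easy to simulate, since a rewrite step is just a global replacement of every occurrence of the chosen non-terminal with the same right-hand side (the specific rules below are assumptions matching the languages discussed above):

```python
def indian_step(s, lhs, rhs):
    """One Indian parallel step: rewrite ALL occurrences of lhs at once,
    all with the same rule."""
    if lhs not in s:
        return None
    return s.replace(lhs, rhs)

# {ww : w in {a,b}*}: the two copies of X always grow in lockstep.
s = "S"
for lhs, rhs in [("S", "XX"), ("X", "aX"), ("X", "aX"), ("X", "bX"), ("X", "")]:
    s = indian_step(s, lhs, rhs)   # S, XX, aXaX, aaXaaX, aabXaabX, aabaab

# {a^(2^n)}: each application of S -> SS doubles every S in the string.
t = "S"
for lhs, rhs in [("S", "SS"), ("S", "SS"), ("S", "a")]:
    t = indian_step(t, lhs, rhs)   # S, SS, SSSS, aaaa
```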
K-grammars
A k-grammar is yet another kind of parallel grammar, very different from an Indian parallel grammar, but still with a level of parallelism. In a k-grammar, for some number k, exactly k non-terminal symbols must be rewritten at every step (except the first step, where the only symbol in the string is the start symbol). If the string has fewer than k non-terminals, the derivation fails.
A 3-grammar can produce the language {a^n b^n c^n : n ≥ 1}:
Let
With the following derivation for aaabbbccc:
At each step in the derivation except the first and last, we used the self-recursive rules
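A simplified simulator makes the "exactly k at a time" requirement concrete. For brevity this sketch always rewrites the k non-terminal occurrences left to right with the k given rules (a real k-grammar may pick any k occurrences); the rules form an assumed 3-grammar for {a^n b^n c^n}, not necessarily the article's.

```python
def k_step(s, rules, k, nonterminals):
    """One step of a (simplified) k-grammar: exactly k non-terminals are
    rewritten simultaneously, the i-th occurrence by the i-th rule."""
    positions = [i for i, c in enumerate(s) if c in nonterminals]
    if len(positions) != k or len(rules) != k:
        return None                       # fewer (or more) than k: step fails
    out, last = [], 0
    for pos, (lhs, rhs) in zip(positions, rules):
        if s[pos] != lhs:                 # rule must match its occurrence
            return None
        out.append(s[last:pos])
        out.append(rhs)
        last = pos + 1
    out.append(s[last:])
    return "".join(out)

N = {"S", "A", "B", "C"}
GROW = [("A", "aA"), ("B", "bB"), ("C", "cC")]   # self-recursive rules
END  = [("A", "a"), ("B", "b"), ("C", "c")]      # terminating rules

s = "ABC"                      # after the initial step S => ABC
for step in (GROW, GROW, END):
    s = k_step(s, step, 3, N)  # ABC, aAbBcC, aaAbbBccC, aaabbbccc
```

Because every step must rewrite exactly three non-terminals, the A, B, and C counts stay equal throughout, and a string with fewer than three non-terminals admits no further (non-final) step.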
Russian parallel grammars
Russian parallel grammars are somewhere between Indian parallel grammars and k-grammars, defined as grammars in which each rule is applied either to exactly one occurrence of its non-terminal or to all occurrences of it simultaneously.
Scattered context grammars
A scattered context grammar is a 4-tuple G = (N, T, S, P), where N, T, and S are as in a context-free grammar and P is a finite set of matrices, each a sequence of context-free rules (A1 → w1, …, An → wn).
Intuitively, then, the matrices in a scattered context grammar provide a list of rules which must each be applied to non-terminals in a string, where those non-terminals appear in the same linear order as the rules that rewrite them.
An unordered scattered context grammar is a scattered context grammar in which, for every rule in P, each of its permutations is also in P. As such, a rule and its permutations can instead be represented as a set rather than as tuples.
Example
Scattered context grammars are capable of describing the language {a^n b^n c^n : n ≥ 1}:
Let
Deriving aaabbbccc then is trivial:
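The derivation just described can be checked with a short simulator; the matrices below are an assumed scattered context grammar for {a^n b^n c^n}, with each matrix rewriting occurrences of its left-hand sides that appear in left-to-right order in the string (here, the leftmost such occurrences):

```python
def scattered_step(s, matrix):
    """Apply the rules of the matrix to occurrences of their left-hand
    sides appearing in the same left-to-right order in the string."""
    out, search_from = [], 0
    for lhs, rhs in matrix:
        i = s.find(lhs, search_from)
        if i < 0:
            return None                 # the ordered occurrences don't exist
        out.append(s[search_from:i])
        out.append(rhs)
        search_from = i + 1
    out.append(s[search_from:])
    return "".join(out)

START = [("S", "ABC")]
GROW  = [("A", "aA"), ("B", "bB"), ("C", "cC")]
END   = [("A", "a"), ("B", "b"), ("C", "c")]

s = "S"
for m in (START, GROW, GROW, END):
    s = scattered_step(s, m)   # S, ABC, aAbBcC, aaAbbBccC, aaabbbccc
```

Unlike the matrix grammar sketch earlier, the left-to-right search here also enforces the linear-order requirement: a matrix such as (A → a, B → b) fails on a string like CBA, where the B does not follow the A.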