Chomsky–Schützenberger representation theorem - Alchetron, the free social encyclopedia

In formal language theory, the Chomsky–Schützenberger representation theorem is a theorem derived by Noam Chomsky and Marcel-Paul Schützenberger about representing a given context-free language in terms of two simpler languages. These two simpler languages, namely a regular language and a Dyck language, are combined by means of an intersection and a homomorphism.

A few notions from formal language theory are in order. A context-free language is regular, if can be described by a regular expression, or, equivalently, if it is accepted by a finite automaton. A homomorphism is based on a function h which maps symbols from an alphabet Γ to words over another alphabet Σ ; If the domain of this function is extended to words over Γ in the natural way, by letting h ( x y ) = h ( x ) h ( y ) for all words x and y , this yields a homomorphism h : Γ ∗ → Σ ∗ . A matched alphabet T ∪ T ¯ is an alphabet with two equal-sized sets; it is convenient to think of it as a set of parentheses types, where T contains the opening parenthesis symbols, whereas the symbols in T ¯ contains the closing parenthesis symbols. For a matched alphabet T ∪ T ¯ , the Dyck language D T is given by

D T = { w ∈ ( T ∪ T ¯ ) ∗ ∣ w is a correctly nested sequence of parentheses }

words that are well-nested parentheses over T ∪ T ¯ .

Chomsky–Schützenberger theorem. A language L over the alphabet Σ is context-free if and only if there exists

a matched alphabet T ∪ T ¯

a regular language R over T ∪ T ¯ ,

and a homomorphism h : ( T ∪ T ¯ ) ∗ → Σ ∗

such that L = h ( D T ∩ R ) .

Proofs of this theorem are found in several textbooks, e.g. Autebert, Berstel & Boasson (1997) or Davis, Sigal & Weyuker (1994).

References

Chomsky–Schützenberger representation theorem Wikipedia

(Text) CC BY-SA