In type theory and functional programming, Hindley–Milner (HM), also known as Damas–Milner or Damas–Hindley–Milner, is a classical type system for the lambda calculus with parametric polymorphism, first described by J. Roger Hindley and later rediscovered by Robin Milner. Luis Damas contributed a close formal analysis and proof of the method in his PhD thesis.
Contents
- Introduction
- Syntax
- Monotypes
- Polytype
- Free type variables
- Context and typing
- Polymorphic type order
- Deductive system
- Typing rules
- Principal type
- Let polymorphism
- Towards an algorithm
- Degrees of freedom choosing the rules
- Syntax directed rule system
- Degrees of freedom instantiating the rules
- Algorithm W
- Original presentation of Algorithm W
- Recursive definitions
- References
Among HM's more notable properties is completeness and its ability to deduce the most general type of a given program without the need of any type annotations or other hints supplied by the programmer. Algorithm W is a fast algorithm, performing type inference in almost linear time with respect to the size of the source, making it practically usable to type large programs. HM is preferably used for functional languages. It was first implemented as part of the type system of the programming language ML. Since then, HM has been extended in various ways, most notably by type class constraints as used in Haskell.
Introduction
Organizing their original paper, Damas and Milner clearly separated two very different tasks. One is to describe what types an expression can have and another to present an algorithm actually computing a type. Keeping the two aspects separate allows one to focus separately on the logic (i.e. meaning) behind the algorithm, as well as to establish a benchmark for the algorithm's properties.
How expressions and types fit to each other is described by means of a deductive system. Like any proof system, it allows different ways to come to a conclusion and since one and the same expression arguably might have different types, dissimilar conclusions about an expression are possible. Contrary to this, the type inference method itself (Algorithm W) is defined as a deterministic step-by-step procedure, leaving no choice what to do next. Thus clearly, decisions not present in the logic might have been made constructing the algorithm, which demand a closer look and justifications but would perhaps remain non-obvious without the above differentiation.
Syntax
Logic and algorithm share the notions of "expression" and "type", whose form is made precise by the syntax.
The expressions to be typed are exactly those of the lambda calculus, enhanced by a let-expression. These are shown in the adjacent table. For readers unfamiliar with the lambda calculus, here is a brief explanation: The application
Types as a whole are split into two groups, called mono- and polytypes.
The reason why the let-expression appears in the syntax is to allow let polymorphism (see below).
Monotypes
Monotypes
Examples of monotypes include type constants like
Type variables are monotypes. Standing alone, a type variable
Polytype
Polytypes (or type schemes) are types containing variables bound by one or more for-all quantifiers, e.g.
A function with polytype
As another example
Note that quantifiers can only appear top level, i.e. a type
Free type variables
In a type
This is certainly the hardest part of HM, perhaps because polytypes containing free variables are not represented in programming languages like Haskell. Likewise, one does not have clauses with free variables in Prolog. In particular developers experienced with both languages and actually knowing all the prerequisites of HM, are likely to slip this point. In Haskell for example, all type variables implicitly occur quantified, i.e. a Haskell type a -> a
means
So what function can have a type like e.g.
Consider
In this example, the free monotype variable
Context and typing
Consequently, to get the yet disjoint parts of the syntax, expressions and types together meaningfully, a third part, the context is needed. Syntactically, it is a list of pairs
Now having the complete syntax at hand, one can finally make a meaningful statement about the type of
Polymorphic type order
While the equality of monotypes is purely syntactical, polytypes offer a richer structure by being related to other types through a specialization relation
When being applied to a value a polymorphic function has to change its shape specializing to deal with this particular type of values. During this process, it also changes its type to match that of the parameter. If for instance the identity function having type
Now the shape shifting of polymorphic values is not fully arbitrary but rather limited by their pristine polytype. Following what has happened in the example one could paraphrase the rule of specialization, saying, a polymorphic type
The syntactic restriction to allow quantification only top-level is imposed to prevent generalization while specializing. Instead of
One could undo the former specialization by specializing on some value of type
Now focusing only on the question whether a type is more special than another and no longer what the specialized type is used for, one could summarize the specialization as in the box above. Paraphrasing it clockwise, a type
Thus the specialization rules makes sure that no free variable, i.e. monotype in the pristine type becomes unintentionally bound by a quantifier, but originally quantified variable can be replaced with whatever, even with types introducing new quantified or unquantified type variables.
Starting with a polytype
So the specialization introduces no further equivalence on polytype beside the already known renaming. Polytypes are syntactically equal up to renaming their quantified variables. The equality of types is a reflexive, antisymmetric and transitive relation and the remaining specializations of polytypes are transitive and with this the relation
Deductive system
The syntax of HM is carried forward to the syntax of the inference rules that form the body of the formal system, by using the typings as judgments. Each of the rules define what conclusion could be drawn from what premises. Additionally to the judgments, some extra conditions introduced above might be used as premises, too.
A proof using the rules is a sequence of judgments such that all premises are listed before a conclusion. Please see the Examples 2 and 3 below for a possible format of proofs. From left to right, each line shows the conclusion, the
Typing rules
The side box shows the deduction rules of the HM type system. One can roughly divide them into two groups:
The first four rules
The second group is formed by the remaining two rules
The following two examples exercise the rule system in action
Example 2: A proof for
Example 3: To demonstrate generalization,
Principal type
As mentioned in the introduction, the rules allow one to deduce different types for one and the same expression. See for instance, Example 2, steps 1,2 and Example 3, steps 2,3 for three different typings of the same expression. Clearly, the different results are not fully unrelated, but connected by the type order. It is an important property of the rule system and this order that whenever more than one type can be deduced for an expression, among them is (modulo alpha-renaming of the type variables) a unique most general type in the sense, that all others are specialization of it. Though the rule system must allow to derive specialized types, a type inference algorithm should deliver this most general or principal type as its result.
Let-polymorphism
Not visible immediately, the rule set encodes a regulation under which circumstances a type might be generalized or not by a slightly varying use of mono- and polytypes in the rules
In rule
As a consequence of this regulation, no type can be inferred for
Towards an algorithm
Now that the deduction system of HM is at hand, one could present an algorithm and validate it with respect to the rules. Alternatively, it might be possible to derive it by taking a closer look on how the rules interact and proof are formed. This is done in the remainder of this article focusing on the possible decisions one can make while proving a typing.
Degrees of freedom choosing the rules
Isolating the points in a proof, where no decision is possible at all, the first group of rules centered around the syntax leaves no choice since to each syntactical rule corresponds a unique typing rule, which determines a part of the proof, while between the conclusion and the premises of these fixed parts chains of
Because the only choice in a proof with respect of rule selection are the
Syntax-directed rule system
A contemporary treatment of HM uses a purely syntax-directed rule system due to Clement as an intermediate step. In this system, the specialization is located directly after the original
Formally, to validate, that this new rule system
While consistency can be seen by decomposing the rules
implying, one can derive the principal type for an expression in
Comparing
Degrees of freedom instantiating the rules
Within the rules themselves, assuming a given expression, one is free to pick the instances for (rule) variables not occurring in this expression. These are the instances for the type variable in the rules. Working towards finding the most general type, this choice can be limited to picking suitable types for
Therefore, the general strategy for finding a proof would be to make the most general assumption (
To briefly summarize the union-find algorithm, given the set of all types in a proof, it allows one to group them together into equivalence classes by means of a
Algorithm W
The presentation of Algorithm W as shown in the side box does not only deviate significantly from the original but is also a gross abuse of the notation of logical rules, since it includes side effects. It is legitimized here, for allowing a direct comparison with
The procedure
Because the procedures used in the algorithm have nearly O(1) cost, the overall cost of the algorithm is close to linear in the size of the expression for which a type is to be inferred. This is in strong contrast to many other attempts to derive type inference algorithms, which often came out to be NP-hard, if not undecidable with respect to termination. Thus the HM performs as well as the best fully informed type-checking algorithms can. Type-checking here means that an algorithm does not have to find a proof, but only to validate a given one.
Efficiency is slightly reduced because the binding of type variables in the context has to be maintained to allow computation of
Original presentation of Algorithm W
In the original paper, the algorithm is presented more formally using a substitution style instead of side effects in the method above. In the latter form, the side effect invisibly takes care of all places where a type variable is used. Explicitly using substitutions not only makes the algorithm hard to read‹See TfD›, because the side effect occurs virtually everywhere, but also gives the false impression that the method might be costly. When implemented using purely functional means or for the purpose of proving the algorithm to be basically equivalent to the deduction system, full explicitness is of course needed and the original formulation a necessary refinement.
Recursive definitions
A central property of the lambda calculus is, that recursive definitions are non-elemental, but can instead be expressed by a fixed point combinator. The original paper notes that recursion can be realized by this combinator's type
Alternatively an extension of the expression syntax and an extra typing rule is possible as:
where
basically merging