Among Markov chain Monte Carlo (MCMC) algorithms, coupling from the past is a method for sampling from the stationary distribution of a Markov chain. Unlike many MCMC algorithms, coupling from the past gives, in principle, a perfect sample from the stationary distribution. It was invented by James Propp and David Wilson in 1996.
Consider a finite-state, irreducible, aperiodic Markov chain $M$ with state space $S$ and (unique) stationary distribution $\pi$ ($\pi$ is a probability vector). Suppose that we come up with a probability distribution $\mu$ on the set of maps $f \colon S \to S$ with the property that for every fixed $s \in S$, its image $f(s)$ is distributed according to the transition probability of $M$ from state $s$. An example of such a probability distribution is the one where $f(s)$ is independent of $f(s')$ whenever $s \neq s'$, but it is often worthwhile to consider other distributions. Now let $f_j$ for $j \in \mathbb{Z}$ be independent samples from $\mu$.
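To make the distribution $\mu$ concrete, here is a minimal sketch in Python. The chain on three states and its transition matrix `P` are hypothetical, chosen only for illustration; the sampled map couples all states through a single shared uniform variate, so that marginally each $f(s)$ still follows the transition law from $s$.

```python
import random

# A small hypothetical Markov chain on S = {0, 1, 2}.
# P[s] lists the transition probabilities out of state s.
P = [
    [0.5, 0.25, 0.25],
    [0.25, 0.5, 0.25],
    [0.25, 0.25, 0.5],
]

def sample_map(rng=random):
    """Draw one random map f: S -> S from a distribution mu.

    All images f(s) are driven by a single uniform variate u, so the
    coordinates of f are coupled; marginally, each f(s) still follows
    the transition law P[s] of the chain from state s.
    """
    u = rng.random()
    f = []
    for s in range(len(P)):
        cum = 0.0
        for t, p in enumerate(P[s]):
            cum += p
            if u < cum:  # u falls in the interval assigned to target t
                f.append(t)
                break
    return tuple(f)
```

Using one shared `u` (rather than independent variates per state) is one of the "other distributions" the text alludes to; such couplings tend to make trajectories started from different states merge faster.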
Suppose that $x$ is chosen randomly according to $\pi$ and is independent of the sequence $(f_j)$. (We do not worry for now about where this $x$ comes from.) Then $f_{-1}(x)$ is also distributed according to $\pi$, because $\pi$ is $M$-stationary and, by our assumption on the law of $f$, the value $f_{-1}(x)$ has the transition law of $M$ from $x$. Define
$$F_j := f_{-1} \circ f_{-2} \circ \cdots \circ f_{-j}.$$ Then it follows by induction that $F_j(x)$ is also distributed according to $\pi$ for every $j \in \mathbb{N}$. Now here is the main point: it may happen that for some $n \in \mathbb{N}$ the image of the map $F_n$ is a single element of $S$. In other words, $F_n(x) = F_n(y)$ for every $y \in S$. Therefore, we do not need access to $x$ in order to compute $F_n(x)$. The algorithm then consists of finding some $n \in \mathbb{N}$ such that $F_n(S)$ is a singleton, and outputting the element of that singleton. Designing a good distribution $\mu$ for which the task of finding such an $n$ and computing $F_n$ is not too costly is not always obvious, but it has been accomplished successfully in several important instances.
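The search for such an $n$ can be sketched as follows. This is an illustrative implementation, not the authors' original code: `sample_map(rng)` is an assumed callable returning a dict $f$ with each $f[s]$ distributed according to the chain's transition law, and the example chain (a $\pm 1$ walk on $\{0,1,2\}$, clamped at the boundary and driven by one shared coin) is hypothetical.

```python
import random

def cftp(states, sample_map, rng):
    """Coupling from the past over a finite state space (sketch).

    Maps f_{-1}, f_{-2}, ... are sampled once and kept, extending the
    composition further into the past, until the image F_n(S) is a
    singleton; its unique element is then returned.
    """
    maps = []  # maps[j-1] holds f_{-j}
    while True:
        maps.append(sample_map(rng))  # extend one step further back
        # Compute F_n(S) for n = len(maps): apply f_{-n} first, f_{-1} last.
        image = set(states)
        for f in reversed(maps):
            image = {f[s] for s in image}
        if len(image) == 1:
            return image.pop()

def walk_map(rng):
    """Hypothetical coupled map: a +-1 step shared by all states,
    clamped to {0, 1, 2}. Marginally, each state takes a fair step."""
    step = rng.choice((-1, 1))
    return {s: min(max(s + step, 0), 2) for s in range(3)}

state = cftp(range(3), walk_map, random.Random(1))
```

Because every state moves by the same shared step, two consecutive equal steps already force coalescence, so the loop terminates quickly on this toy chain.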
There is a special class of Markov chains in which there are particularly good choices for $\mu$ and a tool for determining whether $|F_n(S)| = 1$. (Here $|\cdot|$ denotes cardinality.) Suppose that $S$ is a partially ordered set with order $\leq$, which has a unique minimal element $s_0$ and a unique maximal element $s_1$; that is, every $s \in S$ satisfies $s_0 \leq s \leq s_1$. Also, suppose that $\mu$ may be chosen to be supported on the set of monotone maps $f \colon S \to S$. Then it is easy to see that $|F_n(S)| = 1$ if and only if $F_n(s_0) = F_n(s_1)$, since $F_n$ is monotone, so checking coalescence becomes rather easy. The algorithm can proceed by choosing $n := n_0$ for some constant $n_0$, sampling the maps $f_{-1}, \dots, f_{-n}$, and outputting $F_n(s_0)$ if $F_n(s_0) = F_n(s_1)$. If $F_n(s_0) \neq F_n(s_1)$, the algorithm proceeds by doubling $n$ and repeating as necessary until an output is obtained. (But the algorithm does not resample the maps $f_{-j}$ which were already sampled; it reuses the previously sampled maps when needed.)
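The monotone variant with the doubling trick can be sketched like this. The helper `walk_map` is again a hypothetical example: a clamped $\pm 1$ walk on $\{0, \dots, 4\}$ driven by one shared coin, which is a monotone map, so tracking only the extremes $s_0 = 0$ and $s_1 = 4$ suffices.

```python
import random

def monotone_cftp(s_min, s_max, sample_map, rng, n0=1):
    """Monotone coupling from the past (sketch).

    Tracks only the minimal and maximal states: since every sampled map
    is monotone, F_n collapses all of S exactly when the two extremes
    meet. The horizon n doubles until coalescence, and maps that were
    already sampled are reused rather than resampled.
    """
    maps = []  # maps[j-1] holds f_{-j}
    n = n0
    while True:
        while len(maps) < n:       # sample only the maps not yet drawn
            maps.append(sample_map(rng))
        lo, hi = s_min, s_max
        for f in reversed(maps[:n]):  # apply f_{-n} first, f_{-1} last
            lo, hi = f(lo), f(hi)
        if lo == hi:               # F_n(s_0) == F_n(s_1): coalesced
            return lo
        n *= 2                     # otherwise push the start further back

def walk_map(rng):
    """Hypothetical monotone map: a +-1 step shared by all states,
    clamped to {0, ..., 4}; s -> clamp(s + step) preserves the order."""
    step = rng.choice((-1, 1))
    return lambda s: min(max(s + step, 0), 4)

state = monotone_cftp(0, 4, walk_map, random.Random(7))
```

Keeping the already-sampled maps is essential for correctness: the output must be $F_n$ evaluated with one fixed realization of $f_{-1}, f_{-2}, \dots$, not a fresh sample at each doubling.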