Among Markov chain Monte Carlo (MCMC) algorithms, coupling from the past is a method for sampling from the stationary distribution of a Markov chain. Unlike many MCMC algorithms, coupling from the past yields, in principle, a perfect sample from the stationary distribution. It was invented by James Propp and David Wilson in 1996.
Consider a finite-state, irreducible, aperiodic Markov chain M with state space S and (unique) stationary distribution π (π is a probability vector). Suppose that we come up with a probability distribution μ on the set of maps f : S → S with the property that for every fixed s ∈ S, its image f(s) is distributed according to the transition probability of M from state s. An example of such a probability distribution is the one where f(s) is independent from f(s′) whenever s ≠ s′, but it is often worthwhile to consider other distributions. Now let fⱼ for j ∈ ℤ be independent samples from μ.
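As a concrete illustration, one such μ draws f(s) independently for each state s from the corresponding row of the transition matrix. The sketch below assumes a hypothetical three-state chain; the matrix P and the helper sample_map are illustrative names, not from the text:

```python
import random

# Hypothetical 3-state chain; P[s][t] is the transition probability s -> t.
P = {
    "a": {"a": 0.5, "b": 0.5, "c": 0.0},
    "b": {"a": 0.25, "b": 0.5, "c": 0.25},
    "c": {"a": 0.0, "b": 0.5, "c": 0.5},
}

def sample_map(rng):
    """Draw one map f : S -> S from mu: f(s) is sampled independently
    for each state s, according to the transition probabilities from s."""
    f = {}
    for s, row in P.items():
        targets, probs = zip(*row.items())
        f[s] = rng.choices(targets, weights=probs)[0]
    return f

rng = random.Random(0)
f = sample_map(rng)  # one sample from mu: a dict acting as a map S -> S
```

By construction each coordinate f(s) has the correct marginal law, so a single application of f advances every possible state of the chain by one step simultaneously.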
Suppose that x is chosen randomly according to π and is independent from the sequence fⱼ. (We do not worry for now about where this x comes from.) Then f₋₁(x) is also distributed according to π, because π is M-stationary and by our assumption on the law of f. Define

Fⱼ := f₋₁ ∘ f₋₂ ∘ ⋯ ∘ f₋ⱼ.

Then it follows by induction that Fⱼ(x) is also distributed according to π for every j ∈ ℕ. Now here is the main point: it may happen that for some n ∈ ℕ the image of the map Fₙ is a single element of S. In other words, Fₙ(x) = Fₙ(y) for each y ∈ S. Therefore, we do not need access to x in order to compute Fₙ(x). The algorithm then consists of finding some n ∈ ℕ such that Fₙ(S) is a singleton and outputting the element of that singleton. Designing a good distribution μ for which the task of finding such an n and computing Fₙ is not too costly is not always obvious, but it has been accomplished successfully in several important instances.
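The procedure just described can be sketched as follows for a hypothetical small chain (the matrix P, the choice of μ with independent coordinates, and all names are illustrative assumptions): maps f₋₁, f₋₂, … are sampled one at a time, and Fₙ is evaluated on every starting state until its image Fₙ(S) is a singleton.

```python
import random

# Hypothetical 3-state birth-death chain; P[s][t] is the probability s -> t.
P = {
    "a": {"a": 0.5, "b": 0.5},
    "b": {"a": 0.25, "b": 0.5, "c": 0.25},
    "c": {"b": 0.5, "c": 0.5},
}
STATES = list(P)

def sample_map(rng):
    # One draw from mu: f(s) sampled independently for each state s.
    return {s: rng.choices(list(P[s]), weights=list(P[s].values()))[0]
            for s in STATES}

def cftp(rng):
    """Coupling from the past: extend f_{-1}, f_{-2}, ... one map at a
    time until the image of F_n = f_{-1} o f_{-2} o ... o f_{-n} is a
    singleton, then return that single state (an exact stationary draw)."""
    maps = []  # maps[j] stores f_{-(j+1)}; maps are reused, never resampled
    while True:
        maps.append(sample_map(rng))
        image = set()
        for s in STATES:          # evaluate F_n on every starting state
            x = s
            for j in reversed(range(len(maps))):
                x = maps[j][x]    # apply f_{-n} first, ..., f_{-1} last
            image.add(x)
        if len(image) == 1:       # F_n(S) is a singleton: coalescence
            return image.pop()

rng = random.Random(1)
sample = cftp(rng)
```

For this toy chain, detailed balance gives the stationary distribution π = (1/4, 1/2, 1/4) on (a, b, c), which repeated calls to cftp reproduce empirically.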
There is a special class of Markov chains for which there are particularly good choices for μ and a tool for determining whether |Fₙ(S)| = 1. (Here |·| denotes cardinality.) Suppose that S is a partially ordered set with order ≤ which has a unique minimal element s₀ and a unique maximal element s₁; that is, every s ∈ S satisfies s₀ ≤ s ≤ s₁. Also, suppose that μ may be chosen to be supported on the set of monotone maps f : S → S. Then it is easy to see that |Fₙ(S)| = 1 if and only if Fₙ(s₀) = Fₙ(s₁), since Fₙ is monotone. Thus, checking for coalescence becomes rather easy. The algorithm can proceed by choosing n := n₀ for some constant n₀, sampling the maps f₋₁, …, f₋ₙ, and outputting Fₙ(s₀) if Fₙ(s₀) = Fₙ(s₁). If Fₙ(s₀) ≠ Fₙ(s₁), the algorithm proceeds by doubling n and repeating as necessary until an output is obtained. (The algorithm does not resample the maps f₋ⱼ that were already sampled; it reuses the previously sampled maps as needed.)
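A minimal sketch of the monotone variant with the doubling schedule, under an assumed toy chain: a reflecting random walk on S = {0, …, K}, whose update (step up or down, clipped at the boundaries, driven by one shared uniform) is monotone in the starting state. By monotonicity it suffices to track the two trajectories started at s₀ = 0 and s₁ = K. The walk and all names here are illustrative, not from the text:

```python
import random

K = 4  # assumed toy state space S = {0, 1, ..., K}, ordered by <=

def apply_map(s, u):
    """Monotone random map driven by the shared uniform u: for each fixed
    u, s -> f(s) is nondecreasing, and f(s) has the law of one step of a
    reflecting random walk."""
    return min(s + 1, K) if u < 0.5 else max(s - 1, 0)

def monotone_cftp(rng, n0=1):
    """Monotone CFTP with doubling: run from time -n starting at the
    extreme states 0 and K; if the two trajectories coalesce by time 0,
    output the common value, otherwise double n.  The uniforms us[j]
    (driving f_{-(j+1)}) are extended but never resampled."""
    us = []
    n = n0
    while True:
        while len(us) < n:
            us.append(rng.random())
        lo, hi = 0, K                    # s_0 and s_1
        for j in reversed(range(n)):     # apply f_{-n} first, f_{-1} last
            lo = apply_map(lo, us[j])
            hi = apply_map(hi, us[j])
        if lo == hi:                     # F_n(s_0) = F_n(s_1): done
            return lo
        n *= 2                           # restart further in the past

rng = random.Random(2)
sample = monotone_cftp(rng)
```

This walk's stationary distribution is uniform on {0, …, K}, so repeated calls should return each state with frequency about 1/(K+1).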