The Swendsen–Wang algorithm is the first nonlocal algorithm for Monte Carlo simulation of large systems near criticality.
The original algorithm was designed for the Ising and Potts models, and was later generalized to other systems as well, such as the XY model (by the Wolff algorithm) and particles of fluids. A key ingredient of the method is the representation of the Ising or Potts model through percolation models of connecting bonds, due to Fortuin and Kasteleyn. It has been generalized by Barbu and Zhu (2005) to sampling arbitrary probabilities by viewing it as a Metropolis–Hastings algorithm and computing the acceptance probability of the proposed Monte Carlo move.
The problem of critical slowing-down affecting local processes is of fundamental importance in the study of second-order phase transitions (like the ferromagnetic transition in the Ising model), as increasing the size of the system in order to reduce finite-size effects has the disadvantage of requiring a far larger number of moves to reach thermal equilibrium. Indeed, the correlation time $\tau$ usually increases as $L^z$ with $z \simeq 2$ or greater; since, to be accurate, the simulation time must satisfy $t \gg \tau$, this is a major limitation on the size of the systems that can be studied through local algorithms. The SW algorithm was the first to produce unusually small values of the dynamical critical exponent: $z = 0.35$ for the 2D Ising model ($z = 2.125$ for standard simulations) and $z = 0.75$ for the 3D Ising model, as opposed to $z = 2.0$ for standard simulations.
The algorithm is nonlocal in the sense that a single sweep updates the spin variables of the system collectively. The key idea is to introduce an additional set of 'bond' variables, as suggested by Fortuin and Kasteleyn, who mapped the Potts model onto a percolation model.
Consider a typical ferromagnetic Ising model with only nearest-neighbour interactions.
Starting from a given configuration of spins, we associate to each pair of nearest-neighbour sites $n$, $m$ a random variable $b_{n,m} \in \{0, 1\}$, interpreted in the following way: if $b_{n,m} = 0$ there is no link between the sites $n$ and $m$; if $b_{n,m} = 1$, $\sigma_n$ and $\sigma_m$ are connected. These values are assigned according to the following probability distribution:
$$P[b_{n,m} = 0 \mid \sigma_n \neq \sigma_m] = 1;$$
$$P[b_{n,m} = 1 \mid \sigma_n \neq \sigma_m] = 0;$$
$$P[b_{n,m} = 0 \mid \sigma_n = \sigma_m] = e^{-2\beta J_{nm}};$$
$$P[b_{n,m} = 1 \mid \sigma_n = \sigma_m] = 1 - e^{-2\beta J_{nm}};$$
where $J_{nm} > 0$ is the ferromagnetic interaction strength.
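As a concrete illustration, the bond assignment above can be sketched as a small Python helper (an illustrative sketch; the function name and interface are my own, not part of the original algorithm description):

```python
import math
import random

def sample_bond(sigma_n, sigma_m, beta, J_nm, rng=random):
    """Sample the bond variable b_{n,m} for one pair of neighbouring spins.

    Antiparallel spins never receive a bond; parallel spins are bonded
    with probability 1 - exp(-2*beta*J_nm).
    """
    if sigma_n != sigma_m:
        return 0  # P[b = 0 | unequal spins] = 1
    # equal spins: activate the bond with probability 1 - e^{-2 beta J}
    return 1 if rng.random() < 1.0 - math.exp(-2.0 * beta * J_nm) else 0
```

Note that at high temperature ($\beta \to 0$) the activation probability vanishes and clusters stay small, while at low temperature parallel neighbours are almost always bonded.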
This probability distribution is derived as follows: the Hamiltonian of the Ising model is
$$H[\sigma] = \sum_{\langle i,j \rangle} -J_{i,j}\,\sigma_i \sigma_j,$$
and the partition function is
$$Z = \sum_{\{\sigma\}} e^{-\beta H[\sigma]}.$$
Consider the interaction between a selected pair of sites $n$ and $m$ and eliminate it from the total Hamiltonian, defining
$$H_{nm}[\sigma] = \sum_{\langle i,j \rangle \neq \langle n,m \rangle} -J_{i,j}\,\sigma_i \sigma_j.$$
Define also the restricted sums:
$$Z_{n,m}^{\text{same}} = \sum_{\{\sigma\}} e^{-\beta H_{nm}[\sigma]}\,\delta_{\sigma_n, \sigma_m}; \qquad Z_{n,m}^{\text{diff}} = \sum_{\{\sigma\}} e^{-\beta H_{nm}[\sigma]}\,\bigl(1 - \delta_{\sigma_n, \sigma_m}\bigr).$$
The full partition function can then be written as
$$Z = e^{\beta J_{nm}}\, Z_{n,m}^{\text{same}} + e^{-\beta J_{nm}}\, Z_{n,m}^{\text{diff}}.$$
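This decomposition can be checked numerically by brute-force enumeration on a small system. The sketch below uses an arbitrarily chosen periodic four-site chain with uniform coupling (both choices are mine, for illustration only):

```python
import itertools
import math

beta, J = 0.5, 1.0
L = 4  # periodic chain of 4 sites with uniform coupling J
pairs = [(i, (i + 1) % L) for i in range(L)]
n, m = 0, 1  # the selected pair whose interaction is singled out

def energy(s, bonds):
    """Ising energy -sum J * s_i * s_j over the given bonds."""
    return -sum(J * s[i] * s[j] for i, j in bonds)

rest = [p for p in pairs if p != (n, m)]  # bonds of H_nm (pair <n,m> removed)

Z = Z_same = Z_diff = 0.0
for s in itertools.product([-1, 1], repeat=L):
    w = math.exp(-beta * energy(s, rest))   # weight under H_nm
    Z += math.exp(-beta * energy(s, pairs)) # weight under full H
    if s[n] == s[m]:
        Z_same += w
    else:
        Z_diff += w

# Z = e^{beta J} Z_same + e^{-beta J} Z_diff
rhs = math.exp(beta * J) * Z_same + math.exp(-beta * J) * Z_diff
assert abs(Z - rhs) < 1e-9
```

The check works because the full Boltzmann weight factorizes as the $H_{nm}$ weight times $e^{\beta J \sigma_n \sigma_m}$, which equals $e^{\beta J}$ for equal spins and $e^{-\beta J}$ for unequal ones.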
Introduce the quantity
$$Z_{nm}^{\text{ind}} = Z_{n,m}^{\text{same}} + Z_{n,m}^{\text{diff}};$$
the partition function can be rewritten as
$$Z = \bigl(e^{\beta J_{nm}} - e^{-\beta J_{nm}}\bigr)\, Z_{n,m}^{\text{same}} + e^{-\beta J_{nm}}\, Z_{nm}^{\text{ind}}.$$
Since the first term contains a restriction on the spin values whereas there is no restriction in the second term, the weighting factors (properly normalized) can be interpreted as probabilities of forming/not forming a link between the sites:
$$P_{\langle n,m \rangle}^{\text{link}} = 1 - e^{-2\beta J_{nm}}.$$
The process can easily be adapted to antiferromagnetic spin systems, as it is sufficient to eliminate $Z_{n,m}^{\text{same}}$ in favor of $Z_{n,m}^{\text{diff}}$, in accordance with the change of sign in the interaction constant.
After assigning the bond variables, we identify the same-spin clusters formed by connected sites and invert all the spin variables in each cluster with probability 1/2. At the following time step we have a new starting Ising configuration, which will produce a new clustering and a new collective spin-flip.
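Putting the two stages together, one full Swendsen–Wang sweep might look like the following sketch. It assumes a square $L \times L$ periodic lattice with uniform coupling $J$ and tracks clusters with a union–find structure; all names and the interface are illustrative choices, not a reference implementation:

```python
import math
import random

def sw_sweep(spins, L, beta, J, rng):
    """One Swendsen-Wang update on an L x L periodic Ising lattice.

    spins: flat list of +/-1 values, site (x, y) stored at index x + y*L.
    Mutates and returns `spins`.
    """
    n_sites = L * L
    parent = list(range(n_sites))

    def find(x):
        # union-find root lookup with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    # stage 1: activate bonds between equal neighbouring spins
    p_link = 1.0 - math.exp(-2.0 * beta * J)
    for i in range(n_sites):
        x, y = i % L, i // L
        right = ((x + 1) % L) + y * L
        down = x + ((y + 1) % L) * L
        for j in (right, down):  # each lattice edge visited exactly once
            if spins[i] == spins[j] and rng.random() < p_link:
                union(i, j)

    # stage 2: flip each cluster independently with probability 1/2
    flip = {}
    for i in range(n_sites):
        root = find(i)
        if root not in flip:
            flip[root] = rng.random() < 0.5
        if flip[root]:
            spins[i] = -spins[i]
    return spins
```

Iterating `sw_sweep` from any starting configuration generates the Markov chain described above, with whole clusters flipped at each step instead of single spins.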
It can be shown that this algorithm leads to equilibrium configurations. The first way to prove it uses the theory of Markov chains: either note that the equilibrium (Boltzmann–Gibbs) distribution maps into itself, or show that in a single sweep of the lattice there is a nonzero probability of going from any state of the Markov chain to any other; the corresponding irreducible, ergodic Markov chain then has an asymptotic probability distribution satisfying detailed balance.
Alternatively, we can show explicitly that detailed balance is satisfied. Every transition between two Ising configurations must pass through some bond configuration in the percolation representation. Fix a particular bond configuration: in comparing the probabilities related to it, what matters is the number of factors $q = e^{-2\beta J}$, one for each missing bond between neighbouring spins with the same value; the probability of going to a given Ising configuration compatible with the bond configuration is uniform (say $p$). So the ratio of the transition probabilities of going from one state to the other is
$$\frac{P(\{\sigma\} \to \{\sigma'\})}{P(\{\sigma'\} \to \{\sigma\})} = \frac{Pr(\{\sigma'\} \mid B.C.)\; Pr(B.C. \mid \{\sigma\})}{Pr(\{\sigma\} \mid B.C.)\; Pr(B.C. \mid \{\sigma'\})} = \frac{p \cdot \exp\left[-2\beta \sum_{\langle l,m \rangle} \delta_{\sigma_l, \sigma_m} J_{lm}\right]}{p \cdot \exp\left[-2\beta \sum_{\langle l,m \rangle} \delta_{\sigma'_l, \sigma'_m} J_{lm}\right]} = e^{-\beta \Delta E},$$
since
$$\Delta E = -\sum_{\langle l,m \rangle} J_{lm}\bigl(\sigma'_l \sigma'_m - \sigma_l \sigma_m\bigr) = -\sum_{\langle l,m \rangle} J_{lm}\Bigl[\delta_{\sigma'_l, \sigma'_m} - \bigl(1 - \delta_{\sigma'_l, \sigma'_m}\bigr) - \delta_{\sigma_l, \sigma_m} + \bigl(1 - \delta_{\sigma_l, \sigma_m}\bigr)\Bigr] = -2 \sum_{\langle l,m \rangle} J_{lm}\bigl(\delta_{\sigma'_l, \sigma'_m} - \delta_{\sigma_l, \sigma_m}\bigr),$$
using the identity $\sigma_l \sigma_m = 2\delta_{\sigma_l, \sigma_m} - 1$ for Ising spins.
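The identity $\sigma'_l \sigma'_m - \sigma_l \sigma_m = 2\bigl(\delta_{\sigma'_l,\sigma'_m} - \delta_{\sigma_l,\sigma_m}\bigr)$ used in the last step can be verified exhaustively for Ising spins (a quick sanity check added here, not part of the original derivation):

```python
import itertools

# Check sigma'_l*sigma'_m - sigma_l*sigma_m == 2*(delta' - delta)
# for all 16 combinations of four Ising spin values.
for sl, sm, tl, tm in itertools.product([-1, 1], repeat=4):
    delta = 1 if sl == sm else 0    # delta_{sigma_l, sigma_m}
    delta_p = 1 if tl == tm else 0  # delta_{sigma'_l, sigma'_m}
    assert tl * tm - sl * sm == 2 * (delta_p - delta)
```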
This is valid for every bond configuration the system can pass through during its evolution, so detailed balance is satisfied for the total transition probability. This proves that the algorithm works.
Although not analytically clear from the original paper, the reason why all the values of $z$ obtained with the SW algorithm are much lower than the exact lower bound for single-spin-flip algorithms ($z \geq \gamma/\nu$) is that the divergence of the correlation length is strictly related to the formation of percolation clusters, which are flipped together. In this way the relaxation time is significantly reduced.
The algorithm is not particularly good for simulating frustrated systems.