In probability theory, a Markov kernel (also known as a stochastic kernel or probability kernel) is a map that plays the role, in the general theory of Markov processes, that the transition matrix does in the theory of Markov processes with a finite state space.
Let $(X, \mathcal{A})$ and $(Y, \mathcal{B})$ be measurable spaces. A Markov kernel with source $(X, \mathcal{A})$ and target $(Y, \mathcal{B})$ is a map $\kappa : X \times \mathcal{B} \to [0, 1]$ with the following properties:
- The map $x \mapsto \kappa(x, B)$ is $\mathcal{A}$-measurable for every $B \in \mathcal{B}$.
- The map $B \mapsto \kappa(x, B)$ is a probability measure on $(Y, \mathcal{B})$ for every $x \in X$.
(In other words, it associates to each point $x \in X$ a probability measure $\kappa(x, \cdot)$ on $(Y, \mathcal{B})$ such that, for every measurable set $B \in \mathcal{B}$, the map $x \mapsto \kappa(x, B)$ is measurable with respect to the $\sigma$-algebra $\mathcal{A}$.)
Simple random walk: Take $X = Y = \mathbb{Z}$ and $\mathcal{A} = \mathcal{B} = \mathcal{P}(\mathbb{Z})$. Then the Markov kernel $\kappa$ with
$$\kappa(x, B) = \tfrac{1}{2}\mathbf{1}_B(x-1) + \tfrac{1}{2}\mathbf{1}_B(x+1), \quad \forall x \in \mathbb{Z},\ \forall B \in \mathcal{P}(\mathbb{Z}),$$
describes the transition rule for the random walk on $\mathbb{Z}$, where $\mathbf{1}$ denotes the indicator function.
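Since the state space is discrete, this kernel can be evaluated exactly. A minimal sketch (representing a measurable set $B$ as a Python set is a choice made here, not part of the definition):

```python
from fractions import Fraction

def kappa(x, B):
    """Simple random walk kernel on Z:
    kappa(x, B) = 1/2 * 1_B(x - 1) + 1/2 * 1_B(x + 1)."""
    # `x - 1 in B` plays the role of the indicator 1_B(x - 1).
    return Fraction(1, 2) * (x - 1 in B) + Fraction(1, 2) * (x + 1 in B)

# From 0 the walk lands in {-1, 1} with certainty:
print(kappa(0, {-1, 1}))  # 1
print(kappa(0, {1}))      # 1/2
print(kappa(5, {0}))      # 0
```

For each fixed $x$, `kappa(x, .)` is indeed a probability measure: it puts mass $1/2$ on each of the two neighbouring integers.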
Galton-Watson process: Take $X = Y = \mathbb{N}_0$ and $\mathcal{A} = \mathcal{B} = \mathcal{P}(\mathbb{N}_0)$. Then
$$\kappa(x, B) = \begin{cases} \mathbf{1}_B(0) & x = 0, \\ \Pr[\xi_1 + \dots + \xi_x \in B] & \text{else,} \end{cases}$$
with i.i.d. random variables $\xi_i$ (the offspring counts).
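This kernel can be approximated by Monte Carlo simulation. A hedged sketch: the offspring law below (0 or 2 children, each with probability $1/2$) is an arbitrary illustrative choice, and `gw_kernel` estimates $\kappa(x, B)$ by sampling the sum $\xi_1 + \dots + \xi_x$:

```python
import random

def gw_kernel(x, B, offspring_sample, n=10_000):
    """Monte Carlo estimate of the Galton-Watson kernel kappa(x, B):
    the probability that x i.i.d. offspring counts sum into B."""
    if x == 0:
        return float(0 in B)  # state 0 is absorbing: kappa(0, B) = 1_B(0)
    hits = sum(
        sum(offspring_sample() for _ in range(x)) in B
        for _ in range(n)
    )
    return hits / n

random.seed(0)
# Illustrative offspring law: 0 or 2 children with probability 1/2 each.
offspring = lambda: random.choice([0, 2])
est = gw_kernel(1, {0}, offspring)  # true value is 1/2
```

With one parent, extinction in the next generation has probability exactly $1/2$ under this offspring law, so `est` should be close to `0.5`.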
General Markov processes with finite state space: Take $X = Y$, $\mathcal{A} = \mathcal{B} = \mathcal{P}(X) = \mathcal{P}(Y)$ and $|X| = |Y| = n$. Then the transition rule can be represented as a stochastic matrix $(K_{ij})_{1 \le i, j \le n}$ with $\sum_{j \in Y} K_{ij} = 1$ for every $i \in X$. In the convention of Markov kernels we write
$$\kappa(i, B) = \sum_{j \in B} K_{ij}, \quad \forall i \in X,\ \forall B \in \mathcal{B}.$$
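The correspondence between a stochastic matrix and its kernel is a one-liner. A minimal sketch with a made-up 3-state chain:

```python
# Hypothetical stochastic matrix on the state space {0, 1, 2};
# each row sums to 1 (state 2 is absorbing in this example).
K = [
    [0.5, 0.5, 0.0],
    [0.2, 0.3, 0.5],
    [0.0, 0.0, 1.0],
]

def kappa(i, B):
    """Kernel induced by K: kappa(i, B) = sum over j in B of K[i][j]."""
    return sum(K[i][j] for j in B)

# kappa(i, .) is a probability measure: mass 1 on the whole space.
assert all(abs(kappa(i, {0, 1, 2}) - 1.0) < 1e-12 for i in range(3))
```

The kernel axioms reduce here to the row-sum condition on $K$: measurability is automatic on a finite power set.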
Construction of a Markov kernel: If $\nu$ is a finite measure on $(Y, \mathcal{B})$ and $k : X \times Y \to \mathbb{R}_+$ is measurable with respect to the product $\sigma$-algebra $\mathcal{A} \otimes \mathcal{B}$ and satisfies
$$\int_Y k(x, y)\, \nu(\mathrm{d}y) = 1 \quad \text{for all } x \in X,$$
then the mapping
$$\kappa : X \times \mathcal{B} \to [0, 1], \qquad \kappa(x, B) = \int_B k(x, y)\, \nu(\mathrm{d}y),$$
defines a Markov kernel.
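A concrete discrete instance of this construction, as a sketch: take $\nu$ to be the counting measure on $Y = \{0, \dots, m\}$ and $k(x, y)$ the binomial probability mass function with success probability $x$ (an illustrative choice of density, normalized by the binomial theorem):

```python
from math import comb

m = 10  # Y = {0, ..., m}, nu = counting measure

def k(x, y):
    """Density with respect to counting measure: Binomial(m, x) pmf,
    so that sum over y in Y of k(x, y) = 1 for every x in [0, 1]."""
    return comb(m, y) * x**y * (1 - x) ** (m - y)

def kappa(x, B):
    """kappa(x, B) = integral over B of k(x, y) nu(dy), here a finite sum."""
    return sum(k(x, y) for y in B)

Y = set(range(m + 1))
assert abs(kappa(0.3, Y) - 1.0) < 1e-12  # normalization: kappa(x, Y) = 1
```

Replacing the counting measure by Lebesgue measure and the binomial pmf by, say, a Gaussian density gives the familiar transition densities of continuous-state chains.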
Let $(X, \mathcal{A}, P)$ be a probability space and $\kappa$ a Markov kernel from $(X, \mathcal{A})$ to some $(Y, \mathcal{B})$.
Then there exists a unique measure $Q$ on $(X \times Y, \mathcal{A} \otimes \mathcal{B})$ such that
$$Q(A \times B) = \int_A \kappa(x, B)\, \mathrm{d}P(x), \quad \forall A \in \mathcal{A},\ \forall B \in \mathcal{B}.$$
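On finite spaces the integral becomes a sum and the construction of $Q$ can be written out directly. A sketch with made-up numbers for $P$ and the kernel:

```python
# X = Y = {0, 1}.  P is a distribution on X; K holds the kernel rows
# kappa(x, .) on Y.  Both are illustrative choices.
P = {0: 0.4, 1: 0.6}
K = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.3, 1: 0.7}}

def Q(A, B):
    """Q(A x B) = sum over x in A of P(x) * kappa(x, B),
    the discrete form of the integral defining the product measure."""
    return sum(P[x] * sum(K[x][y] for y in B) for x in A)

# Q is a probability measure on X x Y: total mass 1.
assert abs(Q({0, 1}, {0, 1}) - 1.0) < 1e-12
```

The marginal of $Q$ on $X$ recovers $P$ (take $B = Y$), which is the standard way a kernel and an initial law together determine a joint distribution.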
Let $(S, \mathcal{Y})$ be a Borel space, $X$ an $(S, \mathcal{Y})$-valued random variable on the probability space $(\Omega, \mathcal{F}, P)$, and $\mathcal{G} \subseteq \mathcal{F}$ a sub-$\sigma$-algebra.
Then there exists a Markov kernel $\kappa$ from $(\Omega, \mathcal{G})$ to $(S, \mathcal{Y})$ such that $\kappa(\cdot, B)$ is a version of the conditional expectation $\mathbb{E}[\mathbf{1}_{\{X \in B\}} \mid \mathcal{G}]$ for every $B \in \mathcal{Y}$, i.e.
$$P[X \in B \mid \mathcal{G}] = \mathbb{E}[\mathbf{1}_{\{X \in B\}} \mid \mathcal{G}] = \kappa(\omega, B), \quad P\text{-a.s.},\ \forall B \in \mathcal{Y}.$$
It is called the regular conditional distribution of $X$ given $\mathcal{G}$ and is not uniquely defined.
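On a finite probability space the regular conditional distribution is elementary: when $\mathcal{G}$ is generated by a partition of $\Omega$, the kernel is constant on each cell. A sketch with an illustrative choice of $\Omega$, $X$, and partition:

```python
# Omega = {0,...,5} with uniform P, X(omega) = omega mod 3, and G generated
# by the partition {evens, odds}.  All choices here are illustrative.
Omega = range(6)
P = {w: 1 / 6 for w in Omega}
X = {w: w % 3 for w in Omega}
cells = [{0, 2, 4}, {1, 3, 5}]  # partition generating G

def kappa(w, B):
    """Regular conditional distribution of X given G:
    kappa(w, B) = P(X in B | cell containing w), constant on each cell."""
    cell = next(c for c in cells if w in c)
    return sum(P[u] for u in cell if X[u] in B) / sum(P[u] for u in cell)

# On the even cell {0, 2, 4}, X takes the values 0, 2, 1 equally often.
assert abs(kappa(0, {0}) - 1 / 3) < 1e-12
```

For each fixed $\omega$, `kappa(w, .)` is a genuine probability measure, and as a function of $\omega$ it is measurable with respect to the partition $\sigma$-algebra, which is exactly what the theorem asserts in general.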