In probability theory, a Markov kernel (also known as a stochastic kernel or probability kernel) is a map that plays the role, in the general theory of Markov processes, that the transition matrix does in the theory of Markov processes with a finite state space.
Let (X, A) and (Y, B) be measurable spaces. A Markov kernel with source (X, A) and target (Y, B) is a map κ : X × B → [0, 1] with the following properties:
- The map x ↦ κ(x, B) is A-measurable for every B ∈ B.
- The map B ↦ κ(x, B) is a probability measure on (Y, B) for every x ∈ X.
In other words, a Markov kernel associates to each point x ∈ X a probability measure κ(x, ·) on (Y, B) such that, for every measurable set B ∈ B, the map x ↦ κ(x, B) is measurable with respect to the σ-algebra A.
Simple random walk: Take X = Y = Z and A = B = P(Z); then the Markov kernel κ with
κ(x, B) = (1/2) 1_B(x − 1) + (1/2) 1_B(x + 1),  for all x ∈ Z and all B ∈ P(Z),
describes the transition rule for the random walk on Z, where 1_B is the indicator function of B.
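This kernel can be written directly as a two-argument function. The sketch below (the name `kappa` is ours) represents a set B as a Python set and uses the fact that a boolean membership test plays the role of the indicator function:

```python
def kappa(x, B):
    # Simple-random-walk kernel on the integers:
    # kappa(x, B) = 1/2 * 1_B(x - 1) + 1/2 * 1_B(x + 1).
    # Booleans coerce to 0/1, so membership tests act as indicators.
    return 0.5 * ((x - 1) in B) + 0.5 * ((x + 1) in B)

# From 0 the walk moves to -1 or +1, each with probability 1/2,
# so kappa(0, {-1, 1}) == 1.0 and kappa(0, {1}) == 0.5.
```

For each fixed x, the map B ↦ kappa(x, B) is indeed a probability measure on P(Z), concentrated on the two points x − 1 and x + 1.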
Galton–Watson process: Take X = Y = N and A = B = P(N); then
κ(x, B) = 1_B(0) if x = 0, and κ(x, B) = P[ξ_1 + ⋯ + ξ_x ∈ B] otherwise,
with i.i.d. random variables ξ_i.
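Rather than evaluating κ(x, B) exactly (which requires convolving the offspring law x times), one can sample from the measure κ(x, ·). A minimal sketch, assuming an illustrative offspring distribution of our own choosing (uniform on {0, 1, 2}):

```python
import random

def gw_kernel_sample(x, offspring):
    """Draw the next state from kappa(x, .) of a Galton-Watson process:
    the sum of x i.i.d. offspring counts. For x == 0 the sum is empty,
    so the state 0 is absorbing, matching kappa(0, B) = 1_B(0)."""
    return sum(offspring() for _ in range(x))

# Hypothetical offspring law: uniform on {0, 1, 2} (mean 1, critical case).
next_pop = gw_kernel_sample(5, lambda: random.choice([0, 1, 2]))
```

Here `offspring` plays the role of the common law of the ξ_i; any sampler of a distribution on N can be plugged in.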
General Markov processes with finite state space: Take X = Y with A = B = P(X) = P(Y) and |X| = |Y| = n; then the transition rule can be represented as a stochastic matrix (K_ij)_{1 ≤ i, j ≤ n} with Σ_{j ∈ Y} K_ij = 1 for every i ∈ X. In the convention of Markov kernels we write
κ(i, B) = Σ_{j ∈ B} K_ij,  for all i ∈ X and all B ∈ B.
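The correspondence between a stochastic matrix and its kernel is just a row sum over B. A small sketch with a hypothetical two-state matrix:

```python
def kernel_from_matrix(K):
    """Turn a row-stochastic matrix K into the Markov kernel
    kappa(i, B) = sum_{j in B} K[i][j], with states 0, ..., n-1."""
    def kappa(i, B):
        return sum(K[i][j] for j in B)
    return kappa

# Hypothetical two-state chain; each row sums to 1.
K = [[0.9, 0.1],
     [0.4, 0.6]]
kappa = kernel_from_matrix(K)
# kappa(i, {0, ..., n-1}) == 1.0 for every i, since the rows are stochastic.
```

Conversely, the matrix is recovered from the kernel via K_ij = κ(i, {j}).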
Construction of a Markov kernel: If ν is a finite measure on (Y, B) and k : X × Y → R₊ is a measurable function with respect to the product σ-algebra A ⊗ B with the property
∫_Y k(x, y) ν(dy) = 1,  for all x ∈ X,
then the mapping κ : X × B → [0, 1] given by
κ(x, B) = ∫_B k(x, y) ν(dy)
defines a Markov kernel.
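When ν lives on a finite set, the integrals become weighted sums, and the construction can be checked concretely. A sketch under that assumption, with a density k of our own choosing against the counting measure on Y = {0, 1}:

```python
def kernel_from_density(k, nu):
    """Build kappa(x, B) = integral over B of k(x, y) nu(dy), where nu is
    a measure on a finite set, given as a dict y -> nu({y}).  Requires
    sum over y of k(x, y) * nu({y}) == 1 for every x."""
    def kappa(x, B):
        return sum(k(x, y) * nu[y] for y in B)
    return kappa

# Illustrative choices: nu = counting measure on Y = {0, 1}, and a
# density whose "rows" integrate to 1 against nu.
nu = {0: 1.0, 1: 1.0}
k = lambda x, y: 0.25 + 0.5 * (y == x % 2)   # k(x,0) + k(x,1) == 1
kappa = kernel_from_density(k, nu)
```

The normalization condition on k is exactly what makes each κ(x, ·) a probability measure.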
Let (X, A, P) be a probability space and κ a Markov kernel from (X, A) to some (Y, B).
Then there exists a unique measure Q on (X × Y, A ⊗ B) such that
Q(A × B) = ∫_A κ(x, B) dP(x),  for all A ∈ A and all B ∈ B.
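On finite spaces this measure Q is an explicit double sum, which makes the defining identity easy to verify. A sketch assuming a hypothetical two-state example (the uniform P and the matrix K below are our choices):

```python
def product_measure(P, kappa):
    """Q(A x B) = sum_{x in A} kappa(x, B) * P({x}) for a finite source
    space, with P given as a dict x -> P({x})."""
    def Q(A, B):
        return sum(kappa(x, B) * P[x] for x in A)
    return Q

# Hypothetical ingredients: uniform P on X = {0, 1} and a 2x2
# stochastic matrix defining the kernel.
P = {0: 0.5, 1: 0.5}
K = [[0.9, 0.1], [0.4, 0.6]]
kappa = lambda x, B: sum(K[x][j] for j in B)
Q = product_measure(P, kappa)
# Q is a probability measure on the product: Q({0, 1}, {0, 1}) == 1.0.
```

The second marginal B ↦ Q(X, B) is the distribution obtained by starting from P and taking one step of the kernel.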
Let (S, Y) be a Borel space, X an (S, Y)-valued random variable on the probability space (Ω, F, P), and G ⊆ F a sub-σ-algebra.
Then there exists a Markov kernel κ from (Ω, G) to (S, Y) such that κ(·, B) is a version of the conditional expectation E[1_{X ∈ B} | G] for every B ∈ Y, i.e.
P[X ∈ B | G] = E[1_{X ∈ B} | G] = κ(ω, B),  P-a.s., for all B ∈ Y.
It is called the regular conditional distribution of X given G; it is unique only up to P-null sets.