In statistics, kernel-independent component analysis (kernel ICA) is an efficient algorithm for independent component analysis that estimates source components by optimizing a generalized-variance contrast function based on representations in a reproducing kernel Hilbert space. These contrast functions use the notion of mutual information as a measure of statistical independence.
Main idea
Kernel ICA is based on the idea that correlations between two random variables can be represented in a reproducing kernel Hilbert space (RKHS), denoted by $\mathcal{F}$, associated with a feature map $L_x \colon \mathcal{F} \to \mathbb{R}$ defined for a fixed $x \in \mathbb{R}$. The $\mathcal{F}$-correlation between two random variables $X$ and $Y$ is defined as
$$\rho_{\mathcal{F}}(X, Y) = \max_{f, g \in \mathcal{F}} \operatorname{corr}\left(\langle L_X, f\rangle, \langle L_Y, g\rangle\right),$$
where the functions $f, g \colon \mathbb{R} \to \mathbb{R}$ range over $\mathcal{F}$ and
$$\operatorname{corr}\left(\langle L_X, f\rangle, \langle L_Y, g\rangle\right) := \frac{\operatorname{cov}(f(X), g(Y))}{\operatorname{var}(f(X))^{1/2}\, \operatorname{var}(g(Y))^{1/2}}$$
for fixed $f, g \in \mathcal{F}$.
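In practice, the maximization over $f, g \in \mathcal{F}$ is carried out on a finite sample through centered Gram matrices, with a small ridge term added because the unregularized finite-sample maximum is degenerate (it equals one whenever the Gram matrices are invertible). The following is a minimal numpy sketch of one common regularized variant, which computes the first kernel canonical correlation; the Gaussian kernel, the regularization scaling `n * kappa`, and the helper names `rbf_gram`, `center_gram`, and `f_correlation` are illustrative choices, not prescribed by the original formulation.

```python
import numpy as np

def rbf_gram(x, sigma=1.0):
    """Gram matrix of a Gaussian RBF kernel for a one-dimensional sample."""
    d = x[:, None] - x[None, :]
    return np.exp(-d ** 2 / (2.0 * sigma ** 2))

def center_gram(K):
    """Center a Gram matrix in feature space: K -> HKH with H = I - (1/n)11'."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def f_correlation(x, y, sigma=1.0, kappa=1e-2):
    """Regularized empirical F-correlation between 1-D samples x and y,
    computed as the first kernel canonical correlation."""
    n = len(x)
    Kx = center_gram(rbf_gram(x, sigma))
    Ky = center_gram(rbf_gram(y, sigma))
    # Rx = (Kx + n*kappa*I)^{-1} Kx, and likewise Ry; the ridge term keeps
    # the finite-sample problem well posed (without it the empirical maximal
    # correlation is trivially 1 when the Gram matrices are invertible).
    Rx = np.linalg.solve(Kx + n * kappa * np.eye(n), Kx)
    Ry = np.linalg.solve(Ky + n * kappa * np.eye(n), Ky)
    # The regularized canonical correlations are the singular values of Rx @ Ry;
    # the largest one is the empirical F-correlation.
    return np.linalg.svd(Rx @ Ry, compute_uv=False)[0]
```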
Note that the reproducing property implies that $f(x) = \langle L_x, f\rangle$ for fixed $x \in \mathbb{R}$ and $f \in \mathcal{F}$. Since independence of $X$ and $Y$ implies $\operatorname{cov}(f(X), g(Y)) = 0$ for every pair of functions $f$ and $g$, it follows that the $\mathcal{F}$-correlation between two independent random variables is zero. For sufficiently rich function spaces, such as the RKHS of a Gaussian kernel, the converse also holds, so that a vanishing $\mathcal{F}$-correlation characterizes independence.
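As an illustrative check of this property, using the `f_correlation` helper sketched above, two independent samples should give a small finite-sample value, while a pair with a purely nonlinear dependence (which ordinary linear correlation would miss) gives a much larger one:

```python
rng = np.random.default_rng(0)
x = rng.standard_normal(300)
y_indep = rng.standard_normal(300)                        # independent of x
y_dep = np.cos(3.0 * x) + 0.1 * rng.standard_normal(300)  # nonlinear dependence

print(f_correlation(x, y_indep))  # small (zero only in the population limit)
print(f_correlation(x, y_dep))    # much larger
```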
This notion of $\mathcal{F}$-correlation is used to define the contrast functions that are optimized in the kernel ICA algorithm. Specifically, if $X := (x_{ij}) \in \mathbb{R}^{n \times m}$ is a prewhitened data matrix, that is, the sample mean of each column is zero and the sample covariance of the rows is the $m \times m$ identity matrix, kernel ICA estimates an $m \times m$ orthogonal matrix $A$ so as to minimize the finite-sample $\mathcal{F}$-correlations between the columns of $S := X A'$.
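To make the estimation step concrete, the following toy sketch (continuing the numpy code above) handles the case $m = 2$, where a rotation matrix is determined by a single angle, so the orthogonal matrix $A$ can be found by a simple grid search. This is only an illustration: the kernel ICA algorithm of Bach and Jordan instead minimizes the contrast by gradient descent over orthogonal matrices and uses low-rank approximations of the Gram matrices for efficiency; `kernel_ica_2d` and `n_angles` are names invented here.

```python
def kernel_ica_2d(X, n_angles=90, **kernel_args):
    """Toy kernel ICA for m = 2: grid-search the rotation angle minimizing
    the F-correlation between the two columns of S = XA'.
    X must be a prewhitened n-by-2 data matrix."""
    best_rho, best_A = np.inf, None
    # The contrast repeats with period pi/2, since rotating by pi/2 merely
    # swaps the estimated sources (up to sign).
    for theta in np.linspace(0.0, np.pi / 2, n_angles, endpoint=False):
        c, s = np.cos(theta), np.sin(theta)
        A = np.array([[c, -s], [s, c]])          # candidate orthogonal matrix
        S = X @ A.T                              # candidate sources S = XA'
        rho = f_correlation(S[:, 0], S[:, 1], **kernel_args)
        if rho < best_rho:
            best_rho, best_A = rho, A
    return best_A, best_rho                      # estimate and its contrast value
```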