In probability and statistics, given two stochastic processes X = (X_{t}) and Y = (Y_{t}), the cross-covariance is a function that gives the covariance of one process with the other at pairs of time points. With the usual notation E for the expectation operator, if the processes have the mean functions μ_{t} = E[X_{t}] and ν_{t} = E[Y_{t}], then the cross-covariance is given by

C_{XY}(t, s) = cov(X_{t}, Y_{s}) = E[(X_{t} − μ_{t})(Y_{s} − ν_{s})] = E[X_{t} Y_{s}] − μ_{t} ν_{s}.
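As a sketch of this definition, the expectations can be estimated by averaging over many simulated realizations of the two processes. The random-walk construction below is an illustrative assumption, not from the source; for it, cov(X_{t}, Y_{s}) works out to min(t, s) + 1 with 0-based indexing.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate many realizations of two toy processes (illustrative choice):
# X_t is a random walk and Y_t = X_t + independent noise, so they co-vary.
n_paths, n_steps = 20000, 10
steps = rng.normal(size=(n_paths, n_steps))
X = np.cumsum(steps, axis=1)          # X_t, shape (paths, time)
Y = X + rng.normal(size=X.shape)      # Y_t correlated with X_t

def cross_cov(X, Y, t, s):
    """Monte Carlo estimate of C_XY(t, s) = E[X_t Y_s] - mu_t nu_s."""
    return np.mean(X[:, t] * Y[:, s]) - X[:, t].mean() * Y[:, s].mean()

# For this random walk, cov(X_t, Y_s) = min(t, s) + 1 (0-based indexing),
# since Y's added noise is independent of X.  Here min(4, 7) + 1 = 5.
est = cross_cov(X, Y, t=4, s=7)
```

The estimate is a sample average over realizations, so it approaches the exact value 5 only as the number of simulated paths grows.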
Cross-covariance is related to the more commonly used cross-correlation of the processes in question.
In the case of two random vectors X = (X_{1}, X_{2}, ..., X_{p}) and Y = (Y_{1}, Y_{2}, ..., Y_{q}), the cross-covariance would be a p × q matrix C_{XY} (often denoted cov(X, Y)) with entries C_{XY}(j, k) = cov(X_{j}, Y_{k}). Thus the term cross-covariance is used in order to distinguish this concept from the covariance of a random vector X, which is understood to be the matrix of covariances between the scalar components of X itself.
In signal processing, the cross-covariance is often called cross-correlation and is a measure of similarity of two signals, commonly used to find features in an unknown signal by comparing it to a known one. It is a function of the relative time between the signals, is sometimes called the sliding dot product, and has applications in pattern recognition and cryptanalysis.
For random vectors X and Y, each containing random elements whose expected value and variance exist, the cross-covariance matrix of X and Y is defined by

cov(X, Y) = E[(X − μ_{X})(Y − μ_{Y})^{T}],

where μ_{X} and μ_{Y} are vectors containing the expected values of X and Y. The vectors X and Y need not have the same dimension, and either might be a scalar value. Any element of the cross-covariance matrix is itself a "cross-covariance".
For example, if X = (X_{1}, X_{2}, X_{3}) and Y = (Y_{1}, Y_{2}) are random vectors, then cov(X, Y) is a 3 × 2 matrix whose (i, j)th entry is cov(X_{i}, Y_{j}).
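The matrix definition above translates directly into a sample estimator: center each vector of observations and take the scaled product of the centered data matrices. The data below (Y_{1} tracking X_{1}, Y_{2} independent) is a hypothetical construction for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical sample data: n observations of X in R^3 and Y in R^2.
n = 5000
X = rng.normal(size=(n, 3))
Y = np.column_stack([X[:, 0] + rng.normal(size=n),   # Y_1 tracks X_1
                     rng.normal(size=n)])            # Y_2 independent of X

def cross_cov_matrix(X, Y):
    """Sample cross-covariance matrix with entries cov(X_j, Y_k)."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    return Xc.T @ Yc / (len(X) - 1)

C = cross_cov_matrix(X, Y)   # shape (3, 2)
```

With this construction, the (1, 1) entry estimates cov(X_{1}, Y_{1}) = 1, while entries pairing independent components are near zero.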
For the cross-covariance matrix, the following basic properties apply:

cov(X, Y) = cov(Y, X)^{T}

cov(X_{1} + X_{2}, Y) = cov(X_{1}, Y) + cov(X_{2}, Y)

cov(AX + a, B^{T}Y + b) = A cov(X, Y) B

If X and Y are independent (or, somewhat less restrictively, if every random variable in X is uncorrelated with every random variable in Y), then cov(X, Y) = 0,

where X, X_{1} and X_{2} are random p × 1 vectors, Y is a random q × 1 vector, a is a q × 1 vector, b is a p × 1 vector, A and B are q × p matrices of constants, and 0 is a p × q matrix of zeroes.
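The affine-transformation property can be checked numerically. Because the sample cross-covariance is bilinear and invariant to constant shifts, the identity cov(AX + a, B^{T}Y + b) = A cov(X, Y) B holds exactly (up to floating point) for sample estimates, not just in expectation. The data-generating choices below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
p, q, n = 3, 2, 1000

# Correlated toy data: each row of X is one p x 1 sample, likewise for Y.
X = rng.normal(size=(n, p))
Y = X[:, :q] + rng.normal(size=(n, q))

def ccov(X, Y):
    """Sample cross-covariance matrix (rows are observations)."""
    Xc, Yc = X - X.mean(axis=0), Y - Y.mean(axis=0)
    return Xc.T @ Yc / (len(X) - 1)

A = rng.normal(size=(q, p))   # q x p constant matrices, as in the text
B = rng.normal(size=(q, p))
a = rng.normal(size=q)        # q x 1 shift
b = rng.normal(size=p)        # p x 1 shift

# cov(AX + a, B^T Y + b) = A cov(X, Y) B; the shifts a, b drop out
# under centering.  Row-wise, (A x)^T = x^T A^T and (B^T y)^T = y^T B.
lhs = ccov(X @ A.T + a, Y @ B + b)
rhs = A @ ccov(X, Y) @ B
```

The transpose property cov(X, Y) = cov(Y, X)^{T} holds exactly for sample estimates in the same way.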
The cross-covariance is also relevant in signal processing, where the cross-covariance between two wide-sense stationary random processes can be estimated by averaging the product of samples measured from one process and samples measured from the other (and its time shifts). The samples included in the average can be an arbitrary subset of all the samples in the signal (e.g., samples within a finite time window or a sub-sampling of one of the signals). For a large number of samples, the average converges to the true covariance.
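A minimal sketch of this time-averaging estimator, assuming a pair of jointly wide-sense stationary signals constructed so that one is a noisy, delayed copy of the other (an illustrative setup, not from the source):

```python
import numpy as np

rng = np.random.default_rng(3)

# One long realization of two jointly wide-sense stationary signals:
# y is x delayed by 3 samples plus noise (illustrative construction).
n, delay = 100_000, 3
x = rng.normal(size=n)
y = np.roll(x, delay) + 0.5 * rng.normal(size=n)

def xcov_est(x, y, lag):
    """Estimate C_xy(lag) = E[(x_t - mu_x)(y_{t+lag} - mu_y)]
    by averaging products of samples over time."""
    if lag < 0:
        return xcov_est(y, x, -lag)
    xc, yc = x - x.mean(), y - y.mean()
    return np.mean(xc[: len(x) - lag] * yc[lag:])

# The estimate peaks at the true delay and is near zero elsewhere.
peak = xcov_est(x, y, delay)
off = xcov_est(x, y, delay + 5)
```

This is how cross-covariance is used to recover an unknown delay between two signals: the lag at which the estimate peaks is the delay.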
Cross-covariance may also refer to a "deterministic" cross-covariance between two signals. This consists of summing over all time indices. For example, for discrete signals f_{i} and g_{i} the cross-covariance is defined as
(f ⋆ g)_{i} ≝ ∑_{j} f*_{j} g_{i+j},

where the asterisk indicates that the complex conjugate is taken when the signals are complex-valued.
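The discrete definition can be written out directly; for finite signals the sum runs over the indices where both terms exist (terms outside either signal are treated as zero). This helper and the example values are illustrative, not from the source.

```python
import numpy as np

def det_xcov(f, g, i):
    """(f * g)_i = sum_j conj(f_j) g_{i+j}, summing over the overlap
    of two finite signals (out-of-range terms contribute zero)."""
    total = 0.0 + 0.0j
    for j in range(len(f)):
        if 0 <= i + j < len(g):
            total += np.conj(f[j]) * g[i + j]
    return total

f = np.array([1.0, 2.0, 3.0])
g = np.array([0.0, 1.0, 0.5, 2.0])
# At i = 0 the signals align: 1*0 + 2*1 + 3*0.5 = 3.5
# At i = 1, g is shifted by one: 1*1 + 2*0.5 + 3*2 = 8
```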
For continuous functions f(x) and g(x) the (deterministic) cross-covariance is defined as

(f ⋆ g)(x) ≝ ∫ f*(t) g(x + t) dt.
The cross-covariance of two signals is related to the convolution by:

f(t) ⋆ g(t) = f*(−t) ∗ g(t).
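In the discrete case this relation can be verified with NumPy: cross-correlating the signals (np.correlate conjugates its second argument) gives the same sequence as convolving g with the time-reversed conjugate of f. The random complex signals are an arbitrary test input.

```python
import numpy as np

rng = np.random.default_rng(4)
f = rng.normal(size=6) + 1j * rng.normal(size=6)
g = rng.normal(size=9) + 1j * rng.normal(size=9)

# Cross-correlation of f with g at all lags...
corr = np.correlate(g, f, mode="full")
# ...equals convolution of g with the time-reversed conjugate of f,
# the discrete analogue of f(t) * g(t) = f*(-t) conv g(t).
conv = np.convolve(np.conj(f[::-1]), g, mode="full")
```

Both outputs have length len(f) + len(g) − 1, one value per relative shift of the two signals.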