The following section derives the matched filter for a discrete-time system. The derivation for a continuous-time system is similar, with summations replaced by integrals.
The matched filter is the linear filter, $h$, that maximizes the output signal-to-noise ratio:
$$y[n] = \sum_{k=-\infty}^{\infty} h[n-k]\, x[k].$$
Though we most often express filters as the impulse response of convolution systems, as above (see LTI system theory), it is easiest to think of the matched filter in the context of the inner product, which we will see shortly.
We can derive the linear filter that maximizes output signal-to-noise ratio by invoking a geometric argument. The intuition behind the matched filter relies on correlating the received signal (a vector) with a filter (another vector) that is parallel with the signal, maximizing the inner product. This enhances the signal. When we consider the additive stochastic noise, we have the additional challenge of minimizing the output due to noise by choosing a filter that is orthogonal to the noise.
Let us formally define the problem. We seek a filter, $h$, such that we maximize the output signal-to-noise ratio, where the output is the inner product of the filter and the observed signal $x$.
Our observed signal consists of the desirable signal $s$ and additive noise $v$:
$$x = s + v.$$
Let us define the covariance matrix of the noise, reminding ourselves that this matrix has Hermitian symmetry, a property that will become useful in the derivation:
$$R_v = E\{ v v^{\mathrm{H}} \},$$
where $v^{\mathrm{H}}$ denotes the conjugate transpose of $v$, and $E$ denotes expectation. Let us call our output, $y$, the inner product of our filter and the observed signal such that
$$y = \sum_{k=-\infty}^{\infty} h^{*}[k]\, x[k] = h^{\mathrm{H}} x = h^{\mathrm{H}} s + h^{\mathrm{H}} v = y_s + y_v.$$
We now define the signal-to-noise ratio, which is our objective function, to be the ratio of the power of the output due to the desired signal to the power of the output due to the noise:
$$\mathrm{SNR} = \frac{|y_s|^2}{E\{|y_v|^2\}}.$$
We rewrite the above:
$$\mathrm{SNR} = \frac{|h^{\mathrm{H}} s|^2}{E\{|h^{\mathrm{H}} v|^2\}}.$$
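We wish to maximize this quantity by choosing $h$. Expanding the denominator of our objective function, we have
$$E\{|h^{\mathrm{H}} v|^2\} = E\{(h^{\mathrm{H}} v)(h^{\mathrm{H}} v)^{\mathrm{H}}\} = h^{\mathrm{H}} E\{v v^{\mathrm{H}}\} h = h^{\mathrm{H}} R_v h.$$
Now, our $\mathrm{SNR}$ becomes
$$\mathrm{SNR} = \frac{|h^{\mathrm{H}} s|^2}{h^{\mathrm{H}} R_v h}.$$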
We will rewrite this expression with some matrix manipulation. The reason for this seemingly counterproductive measure will become evident shortly. Exploiting the Hermitian symmetry of the covariance matrix $R_v$, we can write
$$\mathrm{SNR} = \frac{\left|(R_v^{1/2} h)^{\mathrm{H}} (R_v^{-1/2} s)\right|^2}{(R_v^{1/2} h)^{\mathrm{H}} (R_v^{1/2} h)}.$$
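The rewritten form can be checked term by term. The following is a brief sketch, assuming $R_v$ is positive definite so that a Hermitian square root $R_v^{1/2}$ and its inverse $R_v^{-1/2}$ exist (an assumption the text leaves implicit):
$$\begin{aligned}
(R_v^{1/2} h)^{\mathrm{H}} (R_v^{-1/2} s) &= h^{\mathrm{H}} R_v^{1/2} R_v^{-1/2} s = h^{\mathrm{H}} s, \\
(R_v^{1/2} h)^{\mathrm{H}} (R_v^{1/2} h) &= h^{\mathrm{H}} R_v^{1/2} R_v^{1/2} h = h^{\mathrm{H}} R_v h.
\end{aligned}$$
Both lines use $(R_v^{1/2})^{\mathrm{H}} = R_v^{1/2}$, which is where the Hermitian symmetry enters.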
We would like to find an upper bound on this expression. To do so, we first recognize a form of the Cauchy-Schwarz inequality:
$$|a^{\mathrm{H}} b|^2 \le (a^{\mathrm{H}} a)(b^{\mathrm{H}} b),$$
which is to say that the square of the inner product of two vectors can only be as large as the product of the individual inner products of the vectors. This concept returns to the intuition behind the matched filter: this upper bound is achieved when the two vectors $a$ and $b$ are parallel. We resume our derivation by expressing the upper bound on our $\mathrm{SNR}$ in light of the geometric inequality above:
$$\mathrm{SNR} = \frac{\left|(R_v^{1/2} h)^{\mathrm{H}} (R_v^{-1/2} s)\right|^2}{(R_v^{1/2} h)^{\mathrm{H}} (R_v^{1/2} h)} \le \frac{\left[(R_v^{1/2} h)^{\mathrm{H}} (R_v^{1/2} h)\right]\left[(R_v^{-1/2} s)^{\mathrm{H}} (R_v^{-1/2} s)\right]}{(R_v^{1/2} h)^{\mathrm{H}} (R_v^{1/2} h)}.$$
Our valiant matrix manipulation has now paid off. We see that the expression for our upper bound can be greatly simplified:
$$\mathrm{SNR} = \frac{\left|(R_v^{1/2} h)^{\mathrm{H}} (R_v^{-1/2} s)\right|^2}{(R_v^{1/2} h)^{\mathrm{H}} (R_v^{1/2} h)} \le s^{\mathrm{H}} R_v^{-1} s.$$
We can achieve this upper bound if we choose
$$R_v^{1/2} h = \alpha R_v^{-1/2} s,$$
where $\alpha$ is an arbitrary nonzero real number. To verify this, we plug into our expression for the output $\mathrm{SNR}$:
$$\mathrm{SNR} = \frac{\left|(R_v^{1/2} h)^{\mathrm{H}} (R_v^{-1/2} s)\right|^2}{(R_v^{1/2} h)^{\mathrm{H}} (R_v^{1/2} h)} = \frac{\alpha^2 \left|(R_v^{-1/2} s)^{\mathrm{H}} (R_v^{-1/2} s)\right|^2}{\alpha^2 (R_v^{-1/2} s)^{\mathrm{H}} (R_v^{-1/2} s)} = \frac{\left|s^{\mathrm{H}} R_v^{-1} s\right|^2}{s^{\mathrm{H}} R_v^{-1} s} = s^{\mathrm{H}} R_v^{-1} s.$$
Thus, our optimal matched filter is
$$h = \alpha R_v^{-1} s.$$
We often choose to normalize the expected value of the power of the filter output due to the noise to unity. That is, we constrain
$$E\{|y_v|^2\} = 1.$$
This constraint implies a value of $\alpha$, for which we can solve:
$$E\{|y_v|^2\} = \alpha^2 s^{\mathrm{H}} R_v^{-1} s = 1,$$
yielding
$$\alpha = \frac{1}{\sqrt{s^{\mathrm{H}} R_v^{-1} s}},$$
giving us our normalized filter,
$$h = \frac{1}{\sqrt{s^{\mathrm{H}} R_v^{-1} s}} R_v^{-1} s.$$
If we care to write the impulse response of the filter for the convolution system, it is simply the complex conjugate time reversal of $h$.
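The closed-form result can be checked numerically. The following is a minimal NumPy sketch rather than a reference implementation: the function names `matched_filter` and `output_snr`, the random test signal, and the randomly generated positive-definite covariance are all assumptions made for illustration.

```python
import numpy as np

def matched_filter(s, R_v):
    """Normalized matched filter h = R_v^{-1} s / sqrt(s^H R_v^{-1} s)."""
    w = np.linalg.solve(R_v, s)        # R_v^{-1} s without forming the inverse
    gain = np.real(np.vdot(s, w))      # s^H R_v^{-1} s, real and positive for PD R_v
    return w / np.sqrt(gain)

def output_snr(h, s, R_v):
    """Output SNR |h^H s|^2 / (h^H R_v h) for a given filter h."""
    num = np.abs(np.vdot(h, s)) ** 2   # np.vdot conjugates its first argument
    den = np.real(np.vdot(h, R_v @ h))
    return num / den

rng = np.random.default_rng(0)
n = 8
# Hypothetical complex signal and Hermitian positive-definite noise covariance.
s = rng.standard_normal(n) + 1j * rng.standard_normal(n)
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
R_v = A @ A.conj().T + n * np.eye(n)

h = matched_filter(s, R_v)
snr_bound = np.real(np.vdot(s, np.linalg.solve(R_v, s)))   # s^H R_v^{-1} s
noise_power = np.real(np.vdot(h, R_v @ h))                 # h^H R_v h
snr_opt = output_snr(h, s, R_v)
snr_other = output_snr(rng.standard_normal(n) + 0j, s, R_v)  # arbitrary filter
```

Under these assumptions, `noise_power` comes out as 1 and `snr_opt` agrees with `snr_bound` up to rounding, while an arbitrary filter such as the one behind `snr_other` stays at or below the bound.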
Though we have derived the matched filter in discrete time, we can extend the concept to continuous-time systems if we replace $R_v$ with the continuous-time autocorrelation function of the noise, assuming a continuous signal $s(t)$, continuous noise $v(t)$, and a continuous filter $h(t)$.
Alternatively, we may solve for the matched filter by solving our maximization problem with a Lagrangian. Again, the matched filter endeavors to maximize the output signal-to-noise ratio ($\mathrm{SNR}$) of a filtered deterministic signal in stochastic additive noise. The observed sequence, again, is
$$x = s + v,$$
with the noise covariance matrix,
$$R_v = E\{ v v^{\mathrm{H}} \}.$$
The signal-to-noise ratio is
$$\mathrm{SNR} = \frac{|y_s|^2}{E\{|y_v|^2\}}.$$
Evaluating the expression in the numerator, we have
$$|y_s|^2 = y_s^{\mathrm{H}} y_s = h^{\mathrm{H}} s s^{\mathrm{H}} h,$$
and in the denominator,
$$E\{|y_v|^2\} = E\{y_v^{\mathrm{H}} y_v\} = E\{h^{\mathrm{H}} v v^{\mathrm{H}} h\} = h^{\mathrm{H}} R_v h.$$
The signal-to-noise ratio becomes
$$\mathrm{SNR} = \frac{h^{\mathrm{H}} s s^{\mathrm{H}} h}{h^{\mathrm{H}} R_v h}.$$
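If we now constrain the denominator to be 1, the problem of maximizing $\mathrm{SNR}$ is reduced to maximizing the numerator. We can then formulate the problem using a Lagrange multiplier:
$$\begin{aligned}
h^{\mathrm{H}} R_v h &= 1, \\
\mathcal{L} &= h^{\mathrm{H}} s s^{\mathrm{H}} h + \lambda \left(1 - h^{\mathrm{H}} R_v h\right), \\
\nabla_{h^{*}} \mathcal{L} &= s s^{\mathrm{H}} h - \lambda R_v h = 0, \\
(s s^{\mathrm{H}})\, h &= \lambda R_v h,
\end{aligned}$$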
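which we recognize as a generalized eigenvalue problem. Left-multiplying by $h^{\mathrm{H}}$ gives
$$h^{\mathrm{H}} (s s^{\mathrm{H}}) h = \lambda h^{\mathrm{H}} R_v h,$$
so that, under the constraint $h^{\mathrm{H}} R_v h = 1$, the eigenvalue $\lambda$ equals the output $\mathrm{SNR}$ and we seek the largest one.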
Since $s s^{\mathrm{H}}$ is of unit rank, it has only one nonzero eigenvalue. It can be shown that this eigenvalue equals
$$\lambda_{\max} = s^{\mathrm{H}} R_v^{-1} s,$$
yielding the following optimal matched filter
$$h = \frac{1}{\sqrt{s^{\mathrm{H}} R_v^{-1} s}} R_v^{-1} s.$$
This is the same result found in the previous subsection.
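The claim about the single nonzero eigenvalue can be illustrated numerically. The sketch below is only an assumption-laden check, not part of the derivation: it rewrites the generalized problem as the ordinary eigenproblem $R_v^{-1} s s^{\mathrm{H}} h = \lambda h$ and uses a randomly generated signal and covariance; all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
# Hypothetical signal and Hermitian positive-definite noise covariance.
s = rng.standard_normal(n) + 1j * rng.standard_normal(n)
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
R_v = A @ A.conj().T + n * np.eye(n)

# (s s^H) h = lambda R_v h  is equivalent to  R_v^{-1} (s s^H) h = lambda h.
M = np.linalg.solve(R_v, np.outer(s, s.conj()))
eigvals, eigvecs = np.linalg.eig(M)

idx = int(np.argmax(eigvals.real))
lam_max = eigvals[idx].real
h_eig = eigvecs[:, idx]                  # eigenvector for the largest eigenvalue

lam_closed = np.real(np.vdot(s, np.linalg.solve(R_v, s)))   # s^H R_v^{-1} s
h_closed = np.linalg.solve(R_v, s)                          # R_v^{-1} s

# |cosine| of the angle between the two filters; 1 means they are parallel.
cos_angle = np.abs(np.vdot(h_eig, h_closed)) / (
    np.linalg.norm(h_eig) * np.linalg.norm(h_closed)
)
n_nonzero = int(np.sum(np.abs(eigvals) > 1e-6 * lam_max))
```

Here `lam_max` should agree with `lam_closed`, `n_nonzero` should be 1, and `cos_angle` should be 1 up to rounding, meaning the eigenvector is a scaled copy of $R_v^{-1} s$.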
Matched filtering can also be interpreted as a least-squares estimator for the optimal location and scaling of a given model or template. Once again, let the observed sequence be defined as
$$x_k = s_k + v_k,$$
where $v_k$ is uncorrelated zero-mean noise. The signal $s_k$ is assumed to be a scaled and shifted version of a known model sequence $f_k$:
$$s_k = \mu_0 \cdot f_{k-j_0}.$$
We want to find optimal estimates $j^{*}$ and $\mu^{*}$ for the unknown shift $j_0$ and scaling $\mu_0$ by minimizing the least-squares residual between the observed sequence $x_k$ and a "probing sequence" $h_{j-k}$:
$$j^{*}, \mu^{*} = \arg\min_{j,\mu} \sum_k \left(x_k - \mu \cdot h_{j-k}\right)^2.$$
The appropriate $h_{j-k}$ will later turn out to be the matched filter, but is as yet unspecified. Expanding $x_k$ and the square within the sum yields
$$j^{*}, \mu^{*} = \arg\min_{j,\mu} \left[ \sum_k (s_k + v_k)^2 + \mu^2 \sum_k h_{j-k}^2 - 2\mu \sum_k s_k h_{j-k} - 2\mu \sum_k v_k h_{j-k} \right].$$
The first term in brackets is a constant (since the observed signal is given) and has no influence on the optimal solution. The last term has constant expected value because the noise is uncorrelated and has zero mean. We can therefore drop both terms from the optimization. After reversing the sign, we obtain the equivalent optimization problem
$$j^{*}, \mu^{*} = \arg\max_{j,\mu} \left[ 2\mu \sum_k s_k h_{j-k} - \mu^2 \sum_k h_{j-k}^2 \right].$$
Setting the derivative with respect to $\mu$ to zero gives an analytic solution for $\mu^{*}$:
$$\mu^{*} = \frac{\sum_k s_k h_{j-k}}{\sum_k h_{j-k}^2}.$$
Inserting this into our objective function yields a reduced maximization problem for just $j^{*}$:
$$j^{*} = \arg\max_{j} \frac{\left(\sum_k s_k h_{j-k}\right)^2}{\sum_k h_{j-k}^2}.$$
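The substitution can be written out explicitly. As shorthand introduced only for this sketch, let $A_j = \sum_k s_k h_{j-k}$ and $B_j = \sum_k h_{j-k}^2$, so that $\mu^{*} = A_j / B_j$. The bracketed objective then becomes
$$2\mu^{*} A_j - (\mu^{*})^2 B_j = \frac{2 A_j^2}{B_j} - \frac{A_j^2}{B_j} = \frac{A_j^2}{B_j},$$
which is the ratio being maximized over $j$.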
The numerator can be upper-bounded by means of the Cauchy-Schwarz inequality:
$$\frac{\left(\sum_k s_k h_{j-k}\right)^2}{\sum_k h_{j-k}^2} \le \frac{\sum_k s_k^2 \cdot \sum_k h_{j-k}^2}{\sum_k h_{j-k}^2} = \sum_k s_k^2 = \mathrm{const}.$$
The optimization problem assumes its maximum when equality holds in this expression. According to the properties of the Cauchy-Schwarz inequality, this is only possible when
$$h_{j-k} = \nu \cdot s_k = \kappa \cdot f_{k-j_0}$$
for arbitrary nonzero constants $\nu$ or $\kappa$, and the optimal solution is obtained at $j^{*} = j_0$ as desired. Thus, our "probing sequence" $h_{j-k}$ must be proportional to the signal model $f_{k-j_0}$, and the convenient choice $\kappa = 1$ yields the matched filter
$$h_k = f_{-k}.$$
Note that the filter is the mirrored signal model. This ensures that the operation $\sum_k x_k h_{j-k}$ to be applied in order to find the optimum is indeed the convolution between the observed sequence $x_k$ and the matched filter $h_k$. The filtered sequence assumes its maximum at the position where the observed sequence $x_k$ best matches (in a least-squares sense) the signal model $f_k$.
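The least-squares interpretation can be sketched in a few lines of NumPy. This is an illustrative example under stated assumptions, not a definitive implementation: the template `f`, the true values `mu0` and `j0`, the noise level, and the finite sequence length are all hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical known model sequence f_k, k = 0..7.
f = np.array([0.0, 1.0, 2.0, 3.0, 1.0, -1.0, -2.0, 0.5])
mu0, j0, n = 1.5, 40, 128          # assumed true scaling, shift, and record length

s = np.zeros(n)
s[j0:j0 + len(f)] = mu0 * f        # s_k = mu0 * f_{k - j0}
x = s + 0.1 * rng.standard_normal(n)   # uncorrelated zero-mean noise

# Correlating x with f: c[j] = sum_m x[j + m] * f[m].
c = np.correlate(x, f, mode="valid")

# The same result as a convolution with the mirrored model h_k = f_{-k}.
c_conv = np.convolve(x, f[::-1], mode="valid")

# The denominator sum_k h^2 does not depend on j here, so maximize the numerator.
j_hat = int(np.argmax(c ** 2))
mu_hat = c[j_hat] / np.sum(f ** 2)     # mu* = sum(s h) / sum(h^2), estimated from x
```

With this noise level, `j_hat` recovers the assumed shift of 40 and `mu_hat` lands close to 1.5; `c` and `c_conv` coincide, which is the mirroring property described above.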
The matched filter may be derived in a variety of ways, but as a special case of a least-squares procedure it may also be interpreted as a maximum likelihood method in the context of a (coloured) Gaussian noise model and the associated Whittle likelihood. If the transmitted signal possessed no unknown parameters (like time-of-arrival, amplitude, ...), then the matched filter would, according to the Neyman-Pearson lemma, minimize the error probability. However, since the exact signal generally is determined by unknown parameters that effectively are estimated (or fitted) in the filtering process, the matched filter constitutes a generalized maximum likelihood (test) statistic. The filtered time series may then be interpreted as (proportional to) the profile likelihood, the maximized conditional likelihood as a function of the time parameter. This implies in particular that the error probability (in the sense of Neyman and Pearson, i.e., concerning maximization of the detection probability for a given false-alarm probability) is not necessarily optimal. What is commonly referred to as the signal-to-noise ratio (SNR), which is supposed to be maximized by a matched filter, in this context corresponds to $2\log(\mathcal{L})$, where $\mathcal{L}$ is the (conditionally) maximized likelihood ratio.
The construction of the matched filter is based on a known noise spectrum. In reality, however, the noise spectrum is usually estimated from data and hence only known up to a limited precision. For the case of an uncertain spectrum, the matched filter may be generalized to a more robust iterative procedure with favourable properties also in non-Gaussian noise.
When viewed in the frequency domain, it is evident that the matched filter applies the greatest weighting to spectral components exhibiting the greatest signal-to-noise ratio (i.e., large weight where noise is relatively low, and vice versa). In general this requires a non-flat frequency response, but the associated "distortion" is no cause for concern in situations such as radar and digital communications, where the original waveform is known and the objective is the detection of this signal against the background noise. On the technical side, the matched filter is a weighted least-squares method based on the (heteroscedastic) frequency-domain data (where the "weights" are determined via the noise spectrum, see also previous section), or equivalently, a least-squares method applied to the whitened data.
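The frequency-domain weighting can be made concrete in a simplified setting. The sketch below assumes stationary noise with a circulant covariance matrix, which the DFT diagonalizes; this is a simplifying assumption not made in the text, and the autocovariance $0.5^{|m|}$ (taken circularly) and all variable names are illustrative.

```python
import numpy as np

n = 32
rng = np.random.default_rng(5)
s = rng.standard_normal(n)                      # hypothetical real signal

# Assumed circular noise autocovariance r[m] = 0.5 ** (circular distance m).
m = np.arange(n)
r = 0.5 ** np.minimum(m, n - m)

# Circulant covariance R_v[i, j] = r[(i - j) mod n].
R_v = r[(m[:, None] - m[None, :]) % n]

# The eigenvalues of a circulant matrix are the DFT of its first column,
# i.e. the noise power spectrum in this model.
P_v = np.fft.fft(r).real

# Time-domain solution (unnormalized): h = R_v^{-1} s.
h_time = np.linalg.solve(R_v, s)

# Frequency-domain solution: divide the signal spectrum by the noise spectrum.
h_freq = np.fft.ifft(np.fft.fft(s) / P_v).real
```

In this model `h_time` and `h_freq` agree, and the division by `P_v` shows the weighting directly: bins where the noise power is small receive a large weight, and vice versa.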
Matched filters are often used in signal detection. As an example, suppose that we wish to judge the distance of an object by reflecting a signal off it. We may choose to transmit a pure-tone sinusoid at 1 Hz. We assume that our received signal is an attenuated and phase-shifted form of the transmitted signal with added noise.
To judge the distance of the object, we correlate the received signal with a matched filter, which, in the case of white (uncorrelated) noise, is another pure-tone 1 Hz sinusoid. When the output of the matched filter system exceeds a certain threshold, we conclude with high probability that the received signal has been reflected off the object. Using the speed of propagation and the time that we first observe the reflected signal, we can estimate the distance of the object. If we change the shape of the pulse in a specially designed way, the signal-to-noise ratio and the distance resolution can be improved even further after matched filtering: this is a technique known as pulse compression.
Additionally, matched filters can be used in parameter estimation problems (see estimation theory). To return to our previous example, we may desire to estimate the speed of the object, in addition to its position. To exploit the Doppler effect, we would like to estimate the frequency of the received signal. To do so, we may correlate the received signal with several matched filters of sinusoids at varying frequencies. The matched filter with the highest output will reveal, with high probability, the frequency of the reflected signal and help us determine the speed of the object. This method is, in fact, a simple version of the discrete Fourier transform (DFT). The DFT takes an $N$-valued complex input and correlates it with $N$ matched filters, corresponding to complex exponentials at $N$ different frequencies, to yield $N$ complex-valued numbers corresponding to the relative amplitudes and phases of the sinusoidal components (see Moving target indication).
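The correspondence between the DFT and a bank of matched filters can be sketched as follows. This is an illustration under assumed values: the helper name `dft_by_correlation`, the tone in bin 5, its amplitude and phase, and the noise level are all hypothetical.

```python
import numpy as np

def dft_by_correlation(x):
    """Correlate x with N complex exponentials, one template per frequency bin."""
    n = len(x)
    k = np.arange(n)
    # Row m of the bank is the template e^{+2*pi*i*m*k/n}.
    bank = np.exp(2j * np.pi * np.outer(k, k) / n)
    # Correlation conjugates the template: X[m] = sum_k x[k] * conj(bank[m, k]).
    return bank.conj() @ x

n = 64
t = np.arange(n)
rng = np.random.default_rng(3)

# Assumed received signal: a complex tone in bin 5 plus complex white noise.
tone = 0.7 * np.exp(2j * np.pi * 5 * t / n + 0.3j)
x_noisy = tone + 0.1 * (rng.standard_normal(n) + 1j * rng.standard_normal(n))

X = dft_by_correlation(x_noisy)
k_hat = int(np.argmax(np.abs(X)))   # filter with the largest output magnitude
```

The output `X` matches `np.fft.fft(x_noisy)`, and `k_hat` picks out bin 5, the frequency of the assumed tone. The FFT computes the same correlations far more efficiently than the explicit matrix product used here for clarity.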
The matched filter is also used in communications. In the context of a communication system that sends binary messages from the transmitter to the receiver across a noisy channel, a matched filter can be used to detect the transmitted pulses in the noisy received signal.
Imagine we want to send the sequence "0101100100" coded in non-polar non-return-to-zero (NRZ) through a certain channel.
Mathematically, a sequence in NRZ code can be described as a sequence of unit pulses or shifted rect functions, each pulse being weighted by +1 if the bit is "1" and by −1 if the bit is "0". Formally, the scaling factor for the $k^{\mathrm{th}}$ bit is
$$a_k = \begin{cases} +1, & \text{if bit } k \text{ is 1}, \\ -1, & \text{if bit } k \text{ is 0}. \end{cases}$$
We can represent our message, $M(t)$, as the sum of shifted unit pulses:
$$M(t) = \sum_{k=-\infty}^{\infty} a_k \times \Pi\left(\frac{t - kT}{T}\right),$$
where $T$ is the time length of one bit.
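Thus, the signal to be sent by the transmitter is the waveform $M(t)$ for this bit sequence.
[Figure: original message signal]
If we model our noisy channel as an AWGN channel, white Gaussian noise is added to the signal. At the receiver end, for a signal-to-noise ratio of 3 dB, this may look like:
[Figure: received message signal with noise]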
A first glance will not reveal the original transmitted sequence. There is a high power of noise relative to the power of the desired signal (i.e., there is a low signal-to-noise ratio). If the receiver were to sample this signal at the correct moments, the resulting binary message could still differ from the one originally transmitted.
To increase our signal-to-noise ratio, we pass the received signal through a matched filter. In this case, the filter should be matched to an NRZ pulse (equivalent to a "1" coded in NRZ code). Precisely, the impulse response of the ideal matched filter, assuming white (uncorrelated) noise, should be a time-reversed, complex-conjugated, scaled version of the signal that we are seeking. We choose
$$h(t) = \Pi\left(\frac{t}{T}\right).$$
In this case, due to symmetry, the time-reversed complex conjugate of $h(t)$ is in fact $h(t)$, allowing us to call $h(t)$ the impulse response of our matched filter convolution system.
After convolving with the correct matched filter, the resulting signal, $M_{\mathrm{filtered}}(t)$, is
$$M_{\mathrm{filtered}}(t) = M(t) * h(t),$$
where $*$ denotes convolution. The filtered signal can now be safely sampled by the receiver at the correct sampling instants and compared to an appropriate threshold, resulting in a correct interpretation of the binary message.
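The NRZ example can be simulated in discrete time. The sketch below is illustrative only and rests on several assumptions not stated in the text: 50 samples per bit, a causal rectangular filter (so each bit is sampled at its last sample rather than at its center), a fixed random seed, and the 3 dB figure interpreted as a per-sample power ratio.

```python
import numpy as np

bits = "0101100100"
sps = 50                                    # assumed samples per bit

# NRZ weights a_k: +1 for "1", -1 for "0"; each weight is held for one bit period.
a = np.array([1.0 if b == "1" else -1.0 for b in bits])
M = np.repeat(a, sps)

# AWGN at an assumed per-sample signal-to-noise ratio of 3 dB.
rng = np.random.default_rng(4)
snr_db = 3.0
noise_power = np.mean(M ** 2) / 10 ** (snr_db / 10)
received = M + np.sqrt(noise_power) * rng.standard_normal(M.size)

# Matched filter for white noise: a rectangular pulse one bit long.
h = np.ones(sps)
filtered = np.convolve(received, h)         # full convolution

# At index (k + 1) * sps - 1 the filter window covers exactly bit k.
samples = filtered[sps - 1::sps][:len(bits)]

# Threshold at zero to decide each bit.
decoded = "".join("1" if v > 0 else "0" for v in samples)
```

With these settings each sample carries a signal component of ±50 against noise with a standard deviation of about 5, so `decoded` reproduces the transmitted sequence. Thresholding the raw `received` samples directly would be far less reliable.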
Matched filters play a central role in gravitational-wave astronomy. The first observation of gravitational waves was based on large-scale filtering of each detector's output for signals resembling the expected shape, followed by subsequent screening for coincident and coherent triggers between both instruments. False-alarm rates, and with that, the statistical significance of the detection, were then assessed using resampling methods. Inference on the astrophysical source parameters was completed using Bayesian methods based on parameterized theoretical models for the signal waveform and (again) on the Whittle likelihood.