Given *k+1* data points

$$(x_0, y_0), \ldots, (x_k, y_k)$$

The **forward divided differences** are defined as:

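$$
\begin{aligned}
{}[y_\nu] &:= y_\nu, \qquad \nu \in \{0, \ldots, k\} \\
{}[y_\nu, \ldots, y_{\nu+j}] &:= \frac{[y_{\nu+1}, \ldots, y_{\nu+j}] - [y_\nu, \ldots, y_{\nu+j-1}]}{x_{\nu+j} - x_\nu}, \qquad \nu \in \{0, \ldots, k-j\},\ j \in \{1, \ldots, k\}.
\end{aligned}
$$
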
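The recursive definition translates almost literally into code. The following is a minimal sketch, not a reference implementation: the function name `forward_divided_difference` and the sample data are assumptions made for illustration, and `fractions.Fraction` is used only to keep the arithmetic exact.

```python
from fractions import Fraction


def forward_divided_difference(xs, ys, nu, j):
    """Return [y_nu, ..., y_{nu+j}] by the recursive definition."""
    if j == 0:
        return ys[nu]
    upper = forward_divided_difference(xs, ys, nu + 1, j - 1)  # [y_{nu+1}, ..., y_{nu+j}]
    lower = forward_divided_difference(xs, ys, nu, j - 1)      # [y_nu, ..., y_{nu+j-1}]
    return (upper - lower) / (xs[nu + j] - xs[nu])


# Hypothetical sample data: y = x^2 sampled at the nodes 1, 2, 4.
xs = [Fraction(1), Fraction(2), Fraction(4)]
ys = [x * x for x in xs]

print(forward_divided_difference(xs, ys, 0, 1))  # [y_0, y_1]
print(forward_divided_difference(xs, ys, 1, 1))  # [y_1, y_2]
print(forward_divided_difference(xs, ys, 0, 2))  # [y_0, y_1, y_2]
```

The plain recursion recomputes lower-order differences many times; it is meant to mirror the definition, not to be efficient.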
The **backward divided differences** are defined as:

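$$
\begin{aligned}
{}[y_\nu] &:= y_\nu, \qquad \nu \in \{0, \ldots, k\} \\
{}[y_\nu, \ldots, y_{\nu-j}] &:= \frac{[y_\nu, \ldots, y_{\nu-j+1}] - [y_{\nu-1}, \ldots, y_{\nu-j}]}{x_\nu - x_{\nu-j}}, \qquad \nu \in \{j, \ldots, k\},\ j \in \{1, \ldots, k\}.
\end{aligned}
$$
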
If the data points are given as a function *ƒ*,

$$(x_0, f(x_0)), \ldots, (x_k, f(x_k))$$

one sometimes writes

$$
\begin{aligned}
f[x_\nu] &:= f(x_\nu), \qquad \nu \in \{0, \ldots, k\} \\
f[x_\nu, \ldots, x_{\nu+j}] &:= \frac{f[x_{\nu+1}, \ldots, x_{\nu+j}] - f[x_\nu, \ldots, x_{\nu+j-1}]}{x_{\nu+j} - x_\nu}, \qquad \nu \in \{0, \ldots, k-j\},\ j \in \{1, \ldots, k\}.
\end{aligned}
$$

Several notations for the divided difference of the function *ƒ* on the nodes $x_0, \ldots, x_n$ are used:

$$[x_0, \ldots, x_n]f, \qquad [x_0, \ldots, x_n; f], \qquad D[x_0, \ldots, x_n]f, \qquad \text{etc.}$$

For $\nu = 0$ and the first few values of $j$, this yields:

$$
\begin{aligned}
{}[y_0] &= y_0 \\
{}[y_0, y_1] &= \frac{y_1 - y_0}{x_1 - x_0} \\
{}[y_0, y_1, y_2] &= \frac{[y_1, y_2] - [y_0, y_1]}{x_2 - x_0} = \frac{\frac{y_2 - y_1}{x_2 - x_1} - \frac{y_1 - y_0}{x_1 - x_0}}{x_2 - x_0} = \frac{y_2 - y_1}{(x_2 - x_1)(x_2 - x_0)} - \frac{y_1 - y_0}{(x_1 - x_0)(x_2 - x_0)} \\
{}[y_0, y_1, y_2, y_3] &= \frac{[y_1, y_2, y_3] - [y_0, y_1, y_2]}{x_3 - x_0}
\end{aligned}
$$

To make the recursive process clearer, the divided differences can be put in tabular form:

$$
\begin{matrix}
x_0 & y_0 = [y_0] & & & \\
 & & [y_0, y_1] & & \\
x_1 & y_1 = [y_1] & & [y_0, y_1, y_2] & \\
 & & [y_1, y_2] & & [y_0, y_1, y_2, y_3] \\
x_2 & y_2 = [y_2] & & [y_1, y_2, y_3] & \\
 & & [y_2, y_3] & & \\
x_3 & y_3 = [y_3] & & &
\end{matrix}
$$

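The table can be filled column by column, since each entry depends only on two neighbouring entries of the previous column. A minimal sketch of this scheme follows; the name `divided_difference_table` and the sample data are illustrative assumptions.

```python
from fractions import Fraction


def divided_difference_table(xs, ys):
    """Return the columns of the table: table[j][nu] == [y_nu, ..., y_{nu+j}]."""
    table = [list(ys)]
    for j in range(1, len(xs)):
        prev = table[-1]
        column = [
            (prev[nu + 1] - prev[nu]) / (xs[nu + j] - xs[nu])
            for nu in range(len(xs) - j)
        ]
        table.append(column)
    return table


# Hypothetical sample data: y = x^3 at the nodes 0, 1, 2, 3.
xs = [Fraction(v) for v in (0, 1, 2, 3)]
ys = [x**3 for x in xs]
table = divided_difference_table(xs, ys)
for j, column in enumerate(table):
    print(j, [str(value) for value in column])
```

Each column is one entry shorter than the previous one, which reproduces the triangular shape of the table.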
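Linearity:

$$
\begin{aligned}
(f + g)[x_0, \ldots, x_n] &= f[x_0, \ldots, x_n] + g[x_0, \ldots, x_n] \\
(\lambda \cdot f)[x_0, \ldots, x_n] &= \lambda \cdot f[x_0, \ldots, x_n]
\end{aligned}
$$

Leibniz rule:

$$(f \cdot g)[x_0, \ldots, x_n] = f[x_0] \cdot g[x_0, \ldots, x_n] + f[x_0, x_1] \cdot g[x_1, \ldots, x_n] + \dots + f[x_0, \ldots, x_n] \cdot g[x_n] = \sum_{r=0}^{n} f[x_0, \ldots, x_r] \cdot g[x_r, \ldots, x_n]$$

Divided differences are symmetric: If $\sigma : \{0, \ldots, n\} \to \{0, \ldots, n\}$ is a permutation then

$$f[x_0, \ldots, x_n] = f[x_{\sigma(0)}, \ldots, x_{\sigma(n)}]$$

From the mean value theorem for divided differences it follows that

$$\exists \xi \in (\min\{x_0, \ldots, x_n\}, \max\{x_0, \ldots, x_n\}) \quad f[x_0, \ldots, x_n] = \frac{f^{(n)}(\xi)}{n!}$$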
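These properties can be checked numerically on a small example. The sketch below assumes the standard statements: linearity in $f$, the Leibniz rule $(f \cdot g)[x_0, \ldots, x_n] = \sum_r f[x_0, \ldots, x_r] \cdot g[x_r, \ldots, x_n]$, and invariance under any reordering of the nodes. The helper `dd`, the functions `f` and `g`, and the nodes are chosen here for illustration only.

```python
from fractions import Fraction
from itertools import permutations


def dd(f, xs):
    """Divided difference f[x_0, ..., x_n] by the recursive definition."""
    if len(xs) == 1:
        return f(xs[0])
    return (dd(f, xs[1:]) - dd(f, xs[:-1])) / (xs[-1] - xs[0])


def f(x):
    return x**3 - 2 * x


def g(x):
    return 1 / (x + 5)


xs = [Fraction(v) for v in (0, 1, 3, 4)]

# Symmetry: every reordering of the nodes gives the same value.
symmetric = all(dd(f, list(p)) == dd(f, xs) for p in permutations(xs))

# Linearity: (f + 2 g)[...] equals f[...] + 2 g[...].
linear = dd(lambda x: f(x) + 2 * g(x), xs) == dd(f, xs) + 2 * dd(g, xs)

# Leibniz rule: (f g)[x_0..x_n] equals sum_r f[x_0..x_r] * g[x_r..x_n].
leibniz = dd(lambda x: f(x) * g(x), xs) == sum(
    dd(f, xs[: r + 1]) * dd(g, xs[r:]) for r in range(len(xs))
)

print(symmetric, linear, leibniz)
```

Because `f` is a cubic with leading coefficient 1, its third-order divided difference is $f'''(\xi)/3! = 1$ for every $\xi$, which agrees with the mean value theorem.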
The divided difference scheme can be put into an upper triangular matrix. Let
$$
T_f(x_0, \ldots, x_n) =
\begin{pmatrix}
f[x_0] & f[x_0, x_1] & f[x_0, x_1, x_2] & \ldots & f[x_0, \ldots, x_n] \\
0 & f[x_1] & f[x_1, x_2] & \ldots & f[x_1, \ldots, x_n] \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \ldots & f[x_n]
\end{pmatrix}.
$$

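Then, writing $x$ for the node tuple $(x_0, \ldots, x_n)$, the following identities hold:

$$
\begin{aligned}
T_{f+g}\,x &= T_f\,x + T_g\,x \\
T_{f \cdot g}\,x &= T_f\,x \cdot T_g\,x
\end{aligned}
$$
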
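Both identities can be verified with exact arithmetic on a small example. In this sketch the helper names `dd`, `scheme_matrix`, `mat_add` and `mat_mul`, as well as the functions and nodes, are assumptions made for illustration.

```python
from fractions import Fraction


def dd(f, xs):
    """Divided difference f[x_0, ..., x_n] by the recursive definition."""
    if len(xs) == 1:
        return f(xs[0])
    return (dd(f, xs[1:]) - dd(f, xs[:-1])) / (xs[-1] - xs[0])


def scheme_matrix(f, xs):
    """Upper triangular matrix whose entry (i, j) is f[x_i, ..., x_j]."""
    n = len(xs)
    return [
        [dd(f, xs[i : j + 1]) if i <= j else Fraction(0) for j in range(n)]
        for i in range(n)
    ]


def mat_add(a, b):
    return [[p + q for p, q in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def mat_mul(a, b):
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def f(x):
    return x * x


def g(x):
    return 1 / (x + 3)


xs = [Fraction(v) for v in (0, 1, 2, 5)]
T_f, T_g = scheme_matrix(f, xs), scheme_matrix(g, xs)
sum_ok = scheme_matrix(lambda x: f(x) + g(x), xs) == mat_add(T_f, T_g)
product_ok = scheme_matrix(lambda x: f(x) * g(x), xs) == mat_mul(T_f, T_g)
print(sum_ok, product_ok)
```

The product identity is the Leibniz rule written as a matrix product: entry $(i, j)$ of $T_f\,x \cdot T_g\,x$ is $\sum_k f[x_i, \ldots, x_k] \cdot g[x_k, \ldots, x_j]$.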
Since $T_f\,x$ is a triangular matrix, its eigenvalues are obviously $f(x_0), \ldots, f(x_n)$.
Let $\delta_\xi$ be a Kronecker delta-like function, that is

$$\delta_\xi(t) = \begin{cases} 1 & \text{if } t = \xi, \\ 0 & \text{otherwise.} \end{cases}$$

Obviously $f \cdot \delta_\xi = f(\xi) \cdot \delta_\xi$, thus $\delta_\xi$ is an eigenfunction of the pointwise function multiplication. That is, $T_{\delta_{x_i}}\,x$ is in a sense an "eigenmatrix" of $T_f\,x$: $T_f\,x \cdot T_{\delta_{x_i}}\,x = f(x_i) \cdot T_{\delta_{x_i}}\,x$. However, all columns of $T_{\delta_{x_i}}\,x$ are multiples of each other, so the matrix rank of $T_{\delta_{x_i}}\,x$ is 1. The matrix of all eigenvectors can therefore be composed from the $i$-th column of each $T_{\delta_{x_i}}\,x$. Denote this matrix of eigenvectors by $U\,x$. Example:

$$
U(x_0, x_1, x_2, x_3) =
\begin{pmatrix}
1 & \frac{1}{(x_1 - x_0)} & \frac{1}{(x_2 - x_0) \cdot (x_2 - x_1)} & \frac{1}{(x_3 - x_0) \cdot (x_3 - x_1) \cdot (x_3 - x_2)} \\
0 & 1 & \frac{1}{(x_2 - x_1)} & \frac{1}{(x_3 - x_1) \cdot (x_3 - x_2)} \\
0 & 0 & 1 & \frac{1}{(x_3 - x_2)} \\
0 & 0 & 0 & 1
\end{pmatrix}
$$

The diagonalization of $T_f\,x$ can be written as

$$U\,x \cdot \operatorname{diag}(f(x_0), \ldots, f(x_n)) = T_f\,x \cdot U\,x.$$
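The diagonalization identity can be checked on concrete nodes. The sketch below builds $U$ from the closed form suggested by the four-node example, namely entry $(r, i)$ equal to $1 / \prod_{k=r}^{i-1} (x_i - x_k)$ for $r \le i$. That general formula, the helper names and the test function are assumptions of this illustration.

```python
from fractions import Fraction
from math import prod


def dd(f, xs):
    """Divided difference f[x_0, ..., x_n] by the recursive definition."""
    if len(xs) == 1:
        return f(xs[0])
    return (dd(f, xs[1:]) - dd(f, xs[:-1])) / (xs[-1] - xs[0])


def scheme_matrix(f, xs):
    """Upper triangular matrix whose entry (i, j) is f[x_i, ..., x_j]."""
    n = len(xs)
    return [
        [dd(f, xs[i : j + 1]) if i <= j else Fraction(0) for j in range(n)]
        for i in range(n)
    ]


def eigenvector_matrix(xs):
    """U with entry (r, i) = 1 / prod_{k=r}^{i-1} (x_i - x_k) for r <= i, else 0."""
    n = len(xs)
    return [
        [
            Fraction(1) / prod(xs[i] - xs[k] for k in range(r, i))
            if r <= i
            else Fraction(0)
            for i in range(n)
        ]
        for r in range(n)
    ]


def mat_mul(a, b):
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def f(x):
    return x**3 + 1


xs = [Fraction(v) for v in (0, 1, 3, 4)]
U = eigenvector_matrix(xs)
# U * diag(f(x_0), ..., f(x_n)) scales column i of U by f(x_i).
left = [[U[r][i] * f(xs[i]) for i in range(len(xs))] for r in range(len(xs))]
right = mat_mul(scheme_matrix(f, xs), U)
print(left == right)
```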

$$
\begin{aligned}
f[x_0] &= f(x_0) \\
f[x_0, x_1] &= \frac{f(x_0)}{(x_0 - x_1)} + \frac{f(x_1)}{(x_1 - x_0)} \\
f[x_0, x_1, x_2] &= \frac{f(x_0)}{(x_0 - x_1) \cdot (x_0 - x_2)} + \frac{f(x_1)}{(x_1 - x_0) \cdot (x_1 - x_2)} + \frac{f(x_2)}{(x_2 - x_0) \cdot (x_2 - x_1)} \\
f[x_0, x_1, x_2, x_3] &= \frac{f(x_0)}{(x_0 - x_1) \cdot (x_0 - x_2) \cdot (x_0 - x_3)} + \frac{f(x_1)}{(x_1 - x_0) \cdot (x_1 - x_2) \cdot (x_1 - x_3)} \\
&\quad + \frac{f(x_2)}{(x_2 - x_0) \cdot (x_2 - x_1) \cdot (x_2 - x_3)} + \frac{f(x_3)}{(x_3 - x_0) \cdot (x_3 - x_1) \cdot (x_3 - x_2)} \\
f[x_0, \ldots, x_n] &= \sum_{j=0}^{n} \frac{f(x_j)}{\prod_{k \in \{0, \ldots, n\} \setminus \{j\}} (x_j - x_k)}
\end{aligned}
$$

With the help of a polynomial function $q$ with $q(\xi) = (\xi - x_0) \cdots (\xi - x_n)$ this can be written as

$$f[x_0, \ldots, x_n] = \sum_{j=0}^{n} \frac{f(x_j)}{q'(x_j)}.$$

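The expanded form can be compared directly with the recursive definition. The following sketch uses $q'(x_j) = \prod_{k \ne j} (x_j - x_k)$; the function names `dd_recursive`, `dd_expanded` and the sample function are illustrative assumptions.

```python
from fractions import Fraction
from math import prod


def dd_recursive(f, xs):
    """Divided difference f[x_0, ..., x_n] by the recursive definition."""
    if len(xs) == 1:
        return f(xs[0])
    return (dd_recursive(f, xs[1:]) - dd_recursive(f, xs[:-1])) / (xs[-1] - xs[0])


def dd_expanded(f, xs):
    """Expanded form: sum over j of f(x_j) / q'(x_j)."""
    return sum(
        f(xj) / prod(xj - xk for k, xk in enumerate(xs) if k != j)
        for j, xj in enumerate(xs)
    )


def f(x):
    return x**4 - x


xs = [Fraction(v) for v in (1, 2, 4, 7)]
print(dd_recursive(f, xs), dd_expanded(f, xs))
```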
Alternatively, we can allow counting backwards from the start of the sequence by defining $x_k = x_{k+n+1} = x_{k-(n+1)}$ whenever $k < 0$ or $n < k$. This definition allows $x_{-1}$ to be interpreted as $x_n$, $x_{-2}$ to be interpreted as $x_{n-1}$, $x_{-n}$ to be interpreted as $x_1$, etc. Each product below then runs over all nodes except $x_j$, so the expanded form of the divided difference becomes

$$f[x_0, \ldots, x_n] = \sum_{j=0}^{n} \frac{f(x_j)}{\prod\limits_{k=j-n}^{j-1} (x_j - x_k)} = \sum_{j=0}^{n} \frac{f(x_j)}{\prod\limits_{k=j+1}^{j+n} (x_j - x_k)}$$

Yet another characterization utilizes limits:

$$f[x_0, \ldots, x_n] = \sum_{j=0}^{n} \lim_{x \to x_j} \left[ \frac{f(x_j)(x - x_j)}{\prod\limits_{k=0}^{n} (x - x_k)} \right]$$

You can represent partial fractions using the expanded form of divided differences. (This does not simplify computation, but is interesting in itself.) If $p$ and $q$ are polynomial functions, where $\deg p < \deg q$ and $q$ is given in terms of linear factors by $q(\xi) = (\xi - x_1) \cdots (\xi - x_n)$, then it follows from partial fraction decomposition that

$$\frac{p(\xi)}{q(\xi)} = \left(t \to \frac{p(t)}{\xi - t}\right)[x_1, \ldots, x_n].$$

If limits of the divided differences are accepted, then this connection also holds when some of the $x_j$ coincide.

If $f$ is a polynomial function of arbitrary degree and it is decomposed as $f(x) = p(x) + q(x) \cdot d(x)$ using polynomial division of $f$ by $q$, then

$$\frac{p(\xi)}{q(\xi)} = \left(t \to \frac{f(t)}{\xi - t}\right)[x_1, \ldots, x_n].$$

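Both partial fraction identities can be confirmed at a sample point. In the sketch below the roots, the polynomials `p` and `d`, and the evaluation point `xi` are hypothetical choices; `dd` implements the recursive definition.

```python
from fractions import Fraction
from math import prod


def dd(f, xs):
    """Divided difference f[x_0, ..., x_n] by the recursive definition."""
    if len(xs) == 1:
        return f(xs[0])
    return (dd(f, xs[1:]) - dd(f, xs[:-1])) / (xs[-1] - xs[0])


nodes = [Fraction(v) for v in (1, 2, 3)]  # roots x_1, ..., x_n of q


def q(z):
    return prod(z - x for x in nodes)


def p(z):
    return z * z + 1  # deg p = 2 < 3 = deg q


def d(z):
    return z + 7  # hypothetical quotient of the polynomial division


def f(z):
    return p(z) + q(z) * d(z)


xi = Fraction(5)
lhs = p(xi) / q(xi)
rhs_p = dd(lambda t: p(t) / (xi - t), nodes)
rhs_f = dd(lambda t: f(t) / (xi - t), nodes)
print(lhs, rhs_p, rhs_f)
```

The second identity holds because $q$ vanishes at every node, so $f$ and $p$ take the same values there.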
The divided differences can be expressed as

$$f[x_0, \ldots, x_n] = \frac{1}{n!} \int_{x_0}^{x_n} f^{(n)}(t)\, B_{n-1}(t)\, dt$$

where $B_{n-1}$ is a B-spline of degree $n-1$ for the data points $x_0, \ldots, x_n$ and $f^{(n)}$ is the $n$-th derivative of the function $f$.

This is called the **Peano form** of the divided differences and $B_{n-1}$ is called the Peano kernel for the divided differences, both named after Giuseppe Peano.

If nodes are clustered closely together, the numerical computation of the divided differences is inaccurate, because it divides two quantities that are both nearly zero, each carrying a high relative error from the subtraction of similar values. However, difference quotients approximate the derivative and vice versa:

$$\frac{f(y) - f(x)}{y - x} \approx f'(x) \quad \text{for} \quad x \approx y$$

This approximation can be turned into an identity whenever Taylor's theorem applies.

$$
\begin{aligned}
f(y) &= f(x) + f'(x) \cdot (y - x) + f''(x) \cdot \frac{(y - x)^2}{2!} + f'''(x) \cdot \frac{(y - x)^3}{3!} + \dots \\
\Rightarrow \frac{f(y) - f(x)}{y - x} &= f'(x) + f''(x) \cdot \frac{y - x}{2!} + f'''(x) \cdot \frac{(y - x)^2}{3!} + \dots
\end{aligned}
$$

You can eliminate the odd powers of $y - x$ by expanding the Taylor series at the center between $x$ and $y$:

$$x = m - h, \quad y = m + h, \quad \text{that is} \quad m = \frac{x + y}{2}, \quad h = \frac{y - x}{2}$$

$$
\begin{aligned}
f(m + h) &= f(m) + f'(m) \cdot h + f''(m) \cdot \frac{h^2}{2!} + f'''(m) \cdot \frac{h^3}{3!} + \dots \\
f(m - h) &= f(m) - f'(m) \cdot h + f''(m) \cdot \frac{h^2}{2!} - f'''(m) \cdot \frac{h^3}{3!} + \dots \\
\frac{f(y) - f(x)}{y - x} = \frac{f(m + h) - f(m - h)}{2 \cdot h} &= f'(m) + f'''(m) \cdot \frac{h^2}{3!} + \dots
\end{aligned}
$$

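The benefit of expanding about the midpoint shows up in a small floating-point experiment. This sketch uses $f = \exp$, chosen here because all its derivatives are again $\exp$; the nodes `x` and `y` are arbitrary close values.

```python
import math


def difference_quotient(f, x, y):
    """First-order divided difference f[x, y]."""
    return (f(y) - f(x)) / (y - x)


x, y = 1.0, 1.001
m, h = (x + y) / 2, (y - x) / 2
quotient = difference_quotient(math.exp, x, y)

# exp is its own derivative, so f' = f''' = exp.
error_at_x = abs(quotient - math.exp(x))  # leading term f'(x) only
error_at_m = abs(quotient - math.exp(m))  # leading term f'(m) only
error_with_h2_term = abs(quotient - (math.exp(m) + math.exp(m) * h**2 / 6))

print(error_at_x, error_at_m, error_with_h2_term)
```

The error of $f'(x)$ is proportional to $y - x$, whereas the error of $f'(m)$ is proportional to $h^2$, so the midpoint expansion is considerably more accurate for nearby nodes.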
The Taylor series or any other representation with function series can in principle be used to approximate divided differences. Taylor series are infinite sums of power functions. The mapping from a function $f$ to a divided difference $f[x_0, \ldots, x_n]$ is a linear functional, so it can be applied to each summand of the series separately.

Express power notation with an ordinary function: $p_n(x) = x^n$.

Regular Taylor series is a weighted sum of power functions:

$$f = f(0) \cdot p_0 + f'(0) \cdot p_1 + \frac{f''(0)}{2!} \cdot p_2 + \frac{f'''(0)}{3!} \cdot p_3 + \dots$$

Taylor series for divided differences:

$$f[x_0, \ldots, x_n] = f(0) \cdot p_0[x_0, \ldots, x_n] + f'(0) \cdot p_1[x_0, \ldots, x_n] + \frac{f''(0)}{2!} \cdot p_2[x_0, \ldots, x_n] + \frac{f'''(0)}{3!} \cdot p_3[x_0, \ldots, x_n] + \dots$$

We know that the first $n$ terms vanish, because we have a higher difference order than polynomial order, and in the following term the divided difference is one:

$$
\begin{array}{llcl}
\forall j < n & p_j[x_0, \ldots, x_n] &=& 0 \\
& p_n[x_0, \ldots, x_n] &=& 1 \\
& p_{n+1}[x_0, \ldots, x_n] &=& x_0 + \dots + x_n \\
& p_{n+m}[x_0, \ldots, x_n] &=& \sum\limits_{a \in \{0, \ldots, n\}^m \text{ with } a_1 \le a_2 \le \dots \le a_m} \ \prod\limits_{j \in a} x_j.
\end{array}
$$

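The last formula is a sum over all index tuples with nondecreasing entries, which `itertools.combinations_with_replacement` enumerates directly. The sketch below compares it with the recursive definition; the names `dd` and `sum_over_nondecreasing_tuples` and the nodes are assumptions of this example.

```python
from fractions import Fraction
from itertools import combinations_with_replacement
from math import prod


def dd(f, xs):
    """Divided difference f[x_0, ..., x_n] by the recursive definition."""
    if len(xs) == 1:
        return f(xs[0])
    return (dd(f, xs[1:]) - dd(f, xs[:-1])) / (xs[-1] - xs[0])


def sum_over_nondecreasing_tuples(xs, m):
    """Sum of x_{a_1} * ... * x_{a_m} over all a_1 <= ... <= a_m."""
    return sum(prod(combo) for combo in combinations_with_replacement(xs, m))


xs = [Fraction(v) for v in (1, 2, 4)]  # n = 2
n = len(xs) - 1

low_orders = [dd(lambda x, j=j: x**j, xs) for j in range(n)]
high_orders = [dd(lambda x, m=m: x ** (n + m), xs) for m in range(4)]
closed_form = [sum_over_nondecreasing_tuples(xs, m) for m in range(4)]
print(low_orders, high_orders, closed_form)
```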
It follows that the Taylor series for the divided difference essentially starts with $\frac{f^{(n)}(0)}{n!}$, which is also a simple approximation of the divided difference, according to the mean value theorem for divided differences.

If we had to compute the divided differences for the power functions in the usual way, we would encounter the same numerical problems that we had when computing the divided difference of $f$. Fortunately, there is a simpler way. It holds

$$t^n = (1 - x_0 \cdot t) \cdots (1 - x_n \cdot t) \cdot \left(p_0[x_0, \ldots, x_n] + p_1[x_0, \ldots, x_n] \cdot t + p_2[x_0, \ldots, x_n] \cdot t^2 + \dots\right).$$

Consequently, we can compute the divided differences of $p_n$ by a division of formal power series. See how this reduces to the successive computation of powers when we compute $p_n[h]$ for several $n$.
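The division of formal power series amounts to a short recurrence on the coefficients, and it involves no subtraction of nearly equal function values. A minimal sketch, in which the function name `power_divided_differences` and the sample nodes are assumptions:

```python
def power_divided_differences(xs, count):
    """Return [p_0[x_0..x_n], ..., p_{count-1}[x_0..x_n]] with p_j(x) = x**j.

    The coefficients c_j solve t^n = D(t) * (c_0 + c_1 t + c_2 t^2 + ...),
    where D(t) = (1 - x_0 t) * ... * (1 - x_n t).
    """
    # Coefficients of D(t), built by multiplying in one factor (1 - x t) at a time.
    denom = [1]
    for x in xs:
        denom = [a - x * b for a, b in zip(denom + [0], [0] + denom)]

    n = len(xs) - 1
    coeffs = []
    for j in range(count):
        rhs = 1 if j == n else 0  # coefficient of t^j in t^n
        acc = sum(
            denom[i] * coeffs[j - i]
            for i in range(1, min(j, len(denom) - 1) + 1)
        )
        coeffs.append(rhs - acc)  # denom[0] == 1, so no division is needed
    return coeffs


print(power_divided_differences([1, 2, 4], 5))
print(power_divided_differences([3], 5))  # single node h = 3: successive powers of h
```

For a single node $h$ the recurrence is $c_j = h \cdot c_{j-1}$ with $c_0 = 1$, so it reduces to the successive computation of powers.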

If you need to compute a whole divided difference scheme with respect to a Taylor series, see the section about divided differences of power series.

Divided differences of polynomials are particularly interesting, because they can benefit from the Leibniz rule. The matrix $J$ with

$$
J =
\begin{pmatrix}
x_0 & 1 & 0 & 0 & \cdots & 0 \\
0 & x_1 & 1 & 0 & \cdots & 0 \\
0 & 0 & x_2 & 1 & & 0 \\
\vdots & \vdots & & \ddots & \ddots & \\
0 & 0 & 0 & 0 & & x_n
\end{pmatrix}
$$

contains the divided difference scheme for the identity function with respect to the nodes $x_0, \ldots, x_n$, thus $J^n$ contains the divided differences for the power function with exponent $n$. Consequently, you can obtain the divided differences for a polynomial function $\varphi(p)$ with respect to the polynomial $p$ by applying $p$ (more precisely: its corresponding matrix polynomial function $\varphi_{\mathrm{M}}(p)$) to the matrix $J$.

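$$\varphi(p)(\xi) = a_0 + a_1 \cdot \xi + \dots + a_n \cdot \xi^n$$

$$
\varphi_{\mathrm{M}}(p)(J) = a_0 + a_1 \cdot J + \dots + a_n \cdot J^n =
\begin{pmatrix}
\varphi(p)[x_0] & \varphi(p)[x_0, x_1] & \varphi(p)[x_0, x_1, x_2] & \ldots & \varphi(p)[x_0, \ldots, x_n] \\
0 & \varphi(p)[x_1] & \varphi(p)[x_1, x_2] & \ldots & \varphi(p)[x_1, \ldots, x_n] \\
\vdots & \ddots & \ddots & \ddots & \vdots \\
0 & \ldots & 0 & 0 & \varphi(p)[x_n]
\end{pmatrix}
$$
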
This is known as *Opitz' formula*.
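Opitz' formula can be tried out by evaluating a polynomial at the matrix $J$ and comparing the result with the divided difference scheme computed recursively. In this sketch the constant term $a_0$ is read as $a_0$ times the identity matrix, the matrix polynomial is evaluated by Horner's scheme, and the names `opitz`, `dd`, `mat_mul` and the coefficients are illustrative assumptions.

```python
from fractions import Fraction


def dd(f, xs):
    """Divided difference f[x_0, ..., x_n] by the recursive definition."""
    if len(xs) == 1:
        return f(xs[0])
    return (dd(f, xs[1:]) - dd(f, xs[:-1])) / (xs[-1] - xs[0])


def mat_mul(a, b):
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def opitz(coeffs, xs):
    """Evaluate a_0 I + a_1 J + ... + a_m J^m by Horner's scheme."""
    n = len(xs)
    # J has the nodes on the diagonal and ones on the superdiagonal.
    J = [
        [xs[i] if i == j else Fraction(int(j == i + 1)) for j in range(n)]
        for i in range(n)
    ]
    result = [[Fraction(0)] * n for _ in range(n)]
    for a in reversed(coeffs):
        result = mat_mul(result, J)
        for i in range(n):
            result[i][i] += a
    return result


coeffs = [5, -2, 0, 1]  # hypothetical polynomial 5 - 2 x + x^3
xs = [Fraction(v) for v in (1, 2, 4)]


def poly(x):
    return sum(a * x**k for k, a in enumerate(coeffs))


expected = [
    [dd(poly, xs[i : j + 1]) if i <= j else Fraction(0) for j in range(len(xs))]
    for i in range(len(xs))
]
print(opitz(coeffs, xs) == expected)
```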

Now consider increasing the degree of $p$ to infinity, i.e. turn the Taylor polynomial into a Taylor series. Let $f$ be a function which corresponds to a power series. You can compute a divided difference scheme by computing the corresponding matrix series applied to $J$. If the nodes $x_0, \ldots, x_n$ are all equal, then $J$ is a Jordan block and computation boils down to generalizing a scalar function to a matrix function using Jordan decomposition.

When the data points are equidistantly distributed we get the special case called **forward differences**. They are easier to calculate than the more general divided differences.

Note that the division step of the **forward divided difference** must still be carried out to recover the **forward divided difference** from the **forward difference**.

Given *n* data points

$$(x_0, y_0), \ldots, (x_{n-1}, y_{n-1})$$

with

$$x_\nu = x_0 + \nu h, \quad h > 0, \quad \nu = 0, \ldots, n-1$$

the divided differences can be calculated via **forward differences** defined as

$$
\begin{aligned}
\triangle^{(0)} y_i &:= y_i \\
\triangle^{(k)} y_i &:= \triangle^{(k-1)} y_{i+1} - \triangle^{(k-1)} y_i, \qquad k \ge 1.
\end{aligned}
$$

The relationship between divided differences and forward differences is

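$$f[x_0, x_1, \ldots, x_k] = \frac{1}{k!\,h^k} \triangle^{(k)} f(x_0).$$

$$
\begin{matrix}
y_0 & & & \\
 & \triangle y_0 & & \\
y_1 & & \triangle^2 y_0 & \\
 & \triangle y_1 & & \triangle^3 y_0 \\
y_2 & & \triangle^2 y_1 & \\
 & \triangle y_2 & & \\
y_3 & & &
\end{matrix}
$$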
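The relationship between forward differences and divided differences on an equidistant grid can be checked with exact arithmetic. In the following sketch the helper names `forward_differences` and `dd`, the step size and the sampled function are assumptions chosen for illustration.

```python
from fractions import Fraction
from math import factorial


def forward_differences(ys):
    """Return rows with rows[k][i] equal to the k-th forward difference of y_i."""
    rows = [list(ys)]
    while len(rows[-1]) > 1:
        prev = rows[-1]
        rows.append([b - a for a, b in zip(prev, prev[1:])])
    return rows


def dd(xs, ys):
    """Divided difference [y_0, ..., y_k] by the recursive definition."""
    if len(xs) == 1:
        return ys[0]
    return (dd(xs[1:], ys[1:]) - dd(xs[:-1], ys[:-1])) / (xs[-1] - xs[0])


# Hypothetical equidistant grid: five nodes starting at x_0 = 1 with step h = 1/2.
x0, h, n = Fraction(1), Fraction(1, 2), 5
xs = [x0 + nu * h for nu in range(n)]
ys = [x**3 for x in xs]

deltas = forward_differences(ys)
for k in range(n):
    via_forward = deltas[k][0] / (factorial(k) * h**k)
    print(k, dd(xs[: k + 1], ys[: k + 1]), via_forward)
```

Only subtractions are needed to build the forward difference table; the single division by $k!\,h^k$ is applied at the end.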