![]() | ||
In statistics, an approximate entropy (ApEn) is a technique used to quantify the amount of regularity and the unpredictability of fluctuations over time-series data.
Contents
For example, there are two series of data:
series 1: (10,20,10,20,10,20,10,20,10,20,10,20...), which alternates 10 and 20.series 2: (10,10,20,10,20,20,20,10,10,20,10,20,20...), which has either a value of 10 or 20, chosen randomly, each with probability 1/2.Moment statistics, such as mean and variance, will not distinguish between these two series. Nor will rank order statistics distinguish between these series. Yet series 1 is "perfectly regular"; knowing one term has the value of 20 enables one to predict with certainty that the next term will have the value of 10. Series 2 is randomly valued; knowing one term has the value of 20 gives no insight into what value the next term will have.
Regularity was originally measured by exact regularity statistics, which has mainly centered on various entropy measures. However, accurate entropy calculation requires vast amounts of data, and the results will be greatly influenced by system noise, therefore it is not practical to apply these methods to experimental data. ApEn was developed by Steve M. Pincus to handle these limitations by modifying an exact regularity statistic, Kolmogorov–Sinai entropy. ApEn was initially developed to analyze medical data, such as heart rate, and later spread its applications in finance, psychology, and human factors engineering.
The algorithm
in which
The
where
Parameter selection: typically choose
An implementation on Physionet, which is based on Pincus use
The interpretation
The presence of repetitive patterns of fluctuation in a time series renders it more predictable than a time series in which such patterns are absent. ApEn reflects the likelihood that similar patterns of observations will not be followed by additional similar observations. A time series containing many repetitive patterns has a relatively small ApEn; a less predictable process has a higher ApEn.
One example
Suppose
(i.e., the sequence is periodic with a period of 3). Let's choose
Form a sequence of vectors:
Distance is calculated as follows:
Note
Similarly,
Therefore,
Please note in Step 4, for
Then we repeat the above steps for m=3. First form a sequence of vectors:
By calculating distances between vector
Therefore,
Finally,
The value is very small, so it implies the sequence is regular and predictable, which is consistent with the observation.
Advantages
The advantages of ApEn include:
Applications
ApEn has been applied to classify EEG in psychiatric diseases, such as schizophrenia, epilepsy, and addiction.
Limitations
The ApEn algorithm counts each sequence as matching itself to avoid the occurrence of ln(0) in the calculations. This step might cause bias of ApEn and this bias causes ApEn to have two poor properties in practice:
- ApEn is heavily dependent on the record length and is uniformly lower than expected for short records.
- it lacks relative consistency. That is, if ApEn of one data set is higher than that of another, it should, but does not, remain higher for all conditions tested.