Lee–Carter model

Updated on Nov 22, 2024

The Lee–Carter model is a numerical algorithm used in mortality forecasting and life expectancy forecasting. The input to the model is a matrix of age-specific mortality rates ordered by time, usually with ages in columns and years in rows. The output is a matrix of forecast mortality rates.

The model uses the singular value decomposition (SVD) to find a univariate time series vector "k_t" that captures 80–90% of the mortality trend (the subscript "t" refers to time), a vector "b_x" that describes the amount of mortality change at a given age for a unit of yearly total mortality change (the subscript "x" refers to age), and a scaling constant (referred to here as s₁ but unnamed in the literature). In practice k_t is usually close to linear in time, implying that gains to life expectancy are fairly constant year after year in most populations. Before being input to the SVD, age-specific mortality rates are transformed into "a_x,t" by taking their logarithms and then centering them by subtracting their age-specific means (calculated over time). (The subscript "x,t" indicates that a_x,t spans both age and time.) Many researchers adjust the k_t vector by fitting it to empirical life expectancies for each year, using the age-specific means and the b_x just generated with the SVD; changes to k_t from this adjustment are usually small.
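The transformation and rank-one approximation described above can be written compactly. The symbols m_{x,t} (observed mortality rate at age x in year t), \mu_x (age-specific mean of the log rates), T (number of years) and s_i (the i-th singular value) are notation assumed here for illustration. Reading "80–90% of the mortality trend" as a share of variance is likewise an interpretation, not something the text states.

```latex
\begin{align*}
\mu_x &= \frac{1}{T} \sum_{t=1}^{T} \ln m_{x,t},
  & a_{x,t} &= \ln m_{x,t} - \mu_x, \\
a_{x,t} &\approx s_1 \, b_x \, k_t,
  & \text{share explained} &= \frac{s_1^2}{\sum_i s_i^2}.
\end{align*}
```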

To forecast mortality, the above k_t (either adjusted or not) is projected into the future using ARIMA time series methods. Because k_t is close to linear, it is generally modeled as a random walk with drift. The corresponding future a_x,t+n is recovered by multiplying k_t+n by b_x and the first diagonal element of S, S(1,1) (where [U, S, V] = svd(a) and a is the matrix of centered log rates a_x,t). The actual mortality rates are then recovered by adding back the age-specific means and taking exponentials. Life expectancy and other life table measures can be calculated from the resulting matrix of forecast mortality rates.
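As a minimal sketch of the projection step, the function below extends k_t as a random walk with drift, which is the ARIMA(0,1,0) model with a constant. The function name and the sample k values are hypothetical, and only the point forecast is computed.

```python
def forecast_random_walk_with_drift(k, horizon):
    """Point forecast of k_t for `horizon` future years."""
    # Year-on-year changes in the fitted index.
    steps = [k[i + 1] - k[i] for i in range(len(k) - 1)]
    # The drift is the average annual change.
    drift = sum(steps) / len(steps)
    # Each future value adds one more year of drift to the last observed k_t.
    return [k[-1] + n * drift for n in range(1, horizon + 1)]


# Hypothetical fitted index for five years.
k = [10.0, 8.5, 7.5, 5.0, 4.0]
print(forecast_random_walk_with_drift(k, 3))  # [2.5, 1.0, -0.5]
```

The average of the annual changes equals (last value minus first value) divided by the number of steps, so only the endpoints of k_t determine the drift.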

In most implementations, confidence intervals for the forecasts are generated by simulating multiple mortality forecasts using Monte Carlo methods; the band of mortality between the 5th and 95th percentiles of the simulated results is taken as the forecast interval. These simulations are done by extending k_t into the future using randomization based on the standard error of k_t derived from the input data.
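The simulation idea can be sketched as follows. This version assumes the random shocks are normal, with a standard deviation equal to the sample standard deviation of the annual changes in k_t; that is one plausible reading of "standard error of k_t", not necessarily what a given implementation does. The function names, the nearest-rank percentile rule and the sample data are all hypothetical, and uncertainty in the drift estimate itself is ignored.

```python
import random
import statistics


def simulate_k_paths(k, horizon, n_sims=2000, seed=0):
    """Simulate future k_t paths as a random walk with drift."""
    rng = random.Random(seed)
    steps = [k[i + 1] - k[i] for i in range(len(k) - 1)]
    drift = statistics.mean(steps)
    sigma = statistics.stdev(steps)  # assumed shock size (see lead-in)
    paths = []
    for _ in range(n_sims):
        level, path = k[-1], []
        for _ in range(horizon):
            level += drift + rng.gauss(0.0, sigma)
            path.append(level)
        paths.append(path)
    return paths


def percentile_band(paths, lower=5, upper=95):
    """Return (low, high) per forecast year using nearest-rank percentiles."""
    band = []
    for n in range(len(paths[0])):
        values = sorted(path[n] for path in paths)
        low = values[int(lower / 100 * (len(values) - 1))]
        high = values[int(upper / 100 * (len(values) - 1))]
        band.append((low, high))
    return band


# Hypothetical fitted index for five years.
k = [10.0, 8.5, 7.5, 5.0, 4.0]
band = percentile_band(simulate_k_paths(k, horizon=5))
```

Each simulated k_t path would then be converted to mortality rates in the same way as the point forecast, and the percentiles taken over the resulting rates or life expectancies.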

In outline and Matlab-style pseudocode, the algorithm is as follows:

Create a_x,t by taking logarithms of the mortality rates and centering the results by subtracting the average log mortality at each age.
Derive k_t, a scaling singular value, and b_x from U(:,1), S(1,1), V(:,1), where [U, S, V] = svd(a) and a is the matrix of a_x,t with years in rows and ages in columns.
Forecast k_t with standard univariate ARIMA methods.
Use the forecast k_t with the original b_x, the scaling singular value, and the age-specific means to calculate logged mortality rates for each forecast year.
Recover regular mortality rates by calculating the exponential of the forecasted log mortality rates.
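The steps above can be sketched end to end in plain Python. This is an illustration under stated assumptions, not a reference implementation. Power iteration stands in for a library SVD call and recovers only the leading singular triple. k_t is extended by a drift-only point forecast rather than a fitted ARIMA model. The function names and the toy data are hypothetical.

```python
import math


def fit_lee_carter(rates, iterations=200):
    """Fit the rank-one structure to `rates` (rows = years, columns = ages)."""
    n_years, n_ages = len(rates), len(rates[0])
    log_m = [[math.log(m) for m in row] for row in rates]
    # Age-specific means of log mortality, taken over time.
    means = [sum(row[x] for row in log_m) / n_years for x in range(n_ages)]
    # Centered log rates: the a_x,t matrix from the first step.
    a = [[row[x] - means[x] for x in range(n_ages)] for row in log_m]

    # Power iteration on a'a converges to the leading right singular vector,
    # standing in for V(:,1) from a library SVD (assumes a is not all zeros).
    b = [1.0 / math.sqrt(n_ages)] * n_ages
    for _ in range(iterations):
        u = [sum(row[x] * b[x] for x in range(n_ages)) for row in a]
        w = [sum(a[t][x] * u[t] for t in range(n_years)) for x in range(n_ages)]
        norm = math.sqrt(sum(c * c for c in w))
        b = [c / norm for c in w]

    u = [sum(row[x] * b[x] for x in range(n_ages)) for row in a]
    s1 = math.sqrt(sum(c * c for c in u))  # leading singular value, S(1,1)
    k = [c / s1 for c in u]                # unit-length time index, U(:,1)
    return means, b, s1, k


def forecast_rates(means, b, s1, k, horizon):
    """Extend k_t by its average annual change and rebuild mortality rates."""
    drift = (k[-1] - k[0]) / (len(k) - 1)
    forecast = []
    for n in range(1, horizon + 1):
        k_future = k[-1] + n * drift
        # Add back the age-specific means, then exponentiate.
        forecast.append(
            [math.exp(means[x] + s1 * b[x] * k_future) for x in range(len(b))]
        )
    return forecast


# Hypothetical toy data: 4 years (rows) by 3 age groups (columns).
rates = [
    [0.0100, 0.0200, 0.0800],
    [0.0095, 0.0192, 0.0784],
    [0.0090, 0.0186, 0.0770],
    [0.0086, 0.0180, 0.0755],
]
means, b, s1, k = fit_lee_carter(rates)
next_two_years = forecast_rates(means, b, s1, k, horizon=2)
```

The signs of b_x and k_t are not identified separately, since flipping both leaves the product unchanged. Only the product s₁ · b_x · k_t enters the reconstructed rates.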

Without applying SVD or some other method of dimension reduction, the table of mortality data is a highly correlated multivariate data series; the complexity of these multidimensional time series makes them almost impossible to forecast. SVD has become widely used as a method of dimension reduction in many disparate fields, including by Google in their PageRank algorithm.

The Lee–Carter model was introduced by Ronald D. Lee and Lawrence Carter in 1992 in the article "Modeling and Forecasting U.S. Mortality" (Journal of the American Statistical Association 87 (September): 659–671). The model grew out of their work in the late 1980s and early 1990s attempting to use inverse projection to infer rates in historical demography. The model has been used by the United States Social Security Administration, the US Census Bureau, and the United Nations. It has become one of the most widely used mortality forecasting techniques in the world.

There have been extensions to the Lee–Carter model, most notably to account for missing years, correlated male and female populations, and large-scale coherency in populations that share a mortality regime (western Europe, for example). Many related papers can be found on Professor Ronald Lee's website.

There are surprisingly few software packages for forecasting with the Lee–Carter model. LCFIT is a web-based package with interactive forms. Professor Rob J. Hyndman provides an R package for demography that includes routines for creating and forecasting a Lee–Carter model. Professor Germán Rodríguez provides code for the Lee–Carter model using Stata. For Matlab, Professor Eric Jondeau and Professor Michael Rockinger have put together the Longevity Toolbox for parameter estimation.

References

Lee–Carter model Wikipedia

(Text) CC BY-SA