Puneet Varma (Editor)

Maximum likelihood estimation with flow data

Maximum likelihood estimation with flow data is a parametric approach to handling flow sampling data. Assume that we observe $a_i$, the time at which person $i$ enters the state of interest, together with a vector of observables $x_i$, and that the censoring of the flow data takes a particular form: $t_i = \min(t_i^U, L)$, where $t_i$ is the observed duration, $t_i^U$ is the underlying continuous duration variable, and $L$ is the censoring threshold. For instance, for unemployment spells, $a_i$ is the date of entering unemployment, $x_i$ is a vector of worker characteristics, and $t_i$ is the observed unemployment duration. If workers are followed only for a limited period, $t_i$ is necessarily a censored version of the true unemployment duration.
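The censoring rule above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the latent durations are drawn from an exponential distribution, and the function name `simulate_flow_sample`, the rate of 0.1 and the threshold of 12 are hypothetical choices made for illustration, not part of the method described here.

```python
import random

def simulate_flow_sample(n, rate, censor_at, seed=0):
    """Draw n latent durations and censor them at `censor_at`.

    Assumption for illustration only: latent durations are exponential.
    """
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        latent = rng.expovariate(rate)     # t_i^U, the true (latent) duration
        observed = min(latent, censor_at)  # t_i = min(t_i^U, L)
        completed = latent < censor_at     # True when the spell ended before L
        sample.append((observed, completed))
    return sample

spells = simulate_flow_sample(n=5, rate=0.1, censor_at=12.0)
for observed, completed in spells:
    print(f"{observed:6.2f}  {'completed' if completed else 'censored'}")
```

Every censored spell is recorded with a duration exactly equal to the threshold, which is why the observed durations alone understate the true ones.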

Two key assumptions allow for setting up the log-likelihood. First, a distributional form for the latent variable $t_i^U$ needs to be assumed. Second, independence between the true duration and the starting point of the spell is assumed, i.e.,

$$F(t_i^U \mid x_i, a_i, L) = F(t_i^U \mid x_i)$$

where F is the conditional distribution of the underlying duration variable. This latter assumption allows us to model the probability that the variable is censored, i.e.,

$$\Pr(t_i^U \geq L \mid x_i) = 1 - F(L \mid x_i)$$

which leads to the following log likelihood:

$$\sum_{i=1}^{n} \left[ d_i \log f(t_i \mid x_i) + (1 - d_i) \log\left(1 - F(L \mid x_i)\right) \right]$$

where $f$ is the density associated with the distribution $F$ and $d_i$ is an indicator equal to 1 when the spell is uncensored ($t_i < L$) and 0 when it is censored ($t_i = L$). Additionally, the threshold can vary across observations by replacing $L$ with $L_i$ in the formulas above.
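As a minimal sketch of how this log-likelihood can be evaluated and maximised, the following assumes an exponential distribution for the latent duration and drops the covariates $x_i$ for brevity. Both are simplifying assumptions made here, and the names `censored_loglik` and `fit_rate` are hypothetical. In this special case the maximiser has a closed form: the number of completed spells divided by the total observed time.

```python
import math
import random

def censored_loglik(rate, durations, completed, censor_at):
    """Censored log-likelihood, assuming exponential latent durations."""
    total = 0.0
    for t, d in zip(durations, completed):
        if d:
            # completed spell: log f(t) = log(rate) - rate * t
            total += math.log(rate) - rate * t
        else:
            # censored spell: log(1 - F(L)) = -rate * L
            total += -rate * censor_at
    return total

def fit_rate(durations, completed):
    """Closed-form maximiser for the exponential case."""
    return sum(completed) / sum(durations)

# Simulated flow sample (illustrative values, not real data).
rng = random.Random(42)
true_rate, censor_at = 0.1, 12.0
latent = [rng.expovariate(true_rate) for _ in range(5000)]
durations = [min(t, censor_at) for t in latent]  # observed t_i
completed = [t < censor_at for t in latent]      # d_i = 1 if uncensored

rate_hat = fit_rate(durations, completed)
# Naive estimate that wrongly treats censored spells as completed.
naive_rate = len(durations) / sum(durations)
print(f"censored MLE: {rate_hat:.4f}   naive: {naive_rate:.4f}")
```

The naive estimate counts every spell as an exit and so overstates the exit rate, whereas the censored likelihood lets the censored spells contribute only through the survival term. With covariates or a richer distribution, the same objective would typically be maximised numerically rather than in closed form.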

Specification tests in duration models include tests of the validity of the imposed functional form. Tests of restrictions on the functional form are similar to tests for unobserved heterogeneity, in which the restriction being tested is the absence of such heterogeneity. Testing for such heterogeneity is often desirable, as ignoring it can bias the estimate of the hazard rate. Similarly, tests for censoring exist that compare the distribution of the generalized error under the censored and the uncensored assumption.

References

Maximum likelihood estimation with flow data Wikipedia