The intensity $\lambda$ of a counting process is a measure of the rate of change of its predictable part. If a stochastic process $\{N(t), t \ge 0\}$ is a counting process, then it is a submartingale, and in particular its Doob-Meyer decomposition is

$$N(t) = M(t) + \Lambda(t),$$

where $M(t)$ is a martingale and $\Lambda(t)$ is a predictable increasing process. $\Lambda(t)$ is called the cumulative intensity of $N(t)$, and it is related to $\lambda$ by

$$\Lambda(t) = \int_0^t \lambda(s)\,ds.$$
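As a worked illustration of the decomposition above, consider the simplest case: a homogeneous Poisson process. The constant rate $\lambda_0$ is a symbol introduced here for the example and is not part of the original text. Under that assumption the cumulative intensity is linear, and the compensated process is a martingale because the increments of a Poisson process are independent of the past:

```latex
\begin{align*}
\lambda(s) &\equiv \lambda_0
  && \text{(assumed constant rate)} \\
\Lambda(t) &= \int_0^t \lambda_0 \, ds = \lambda_0 t \\
M(t) &= N(t) - \lambda_0 t \\
\mathbb{E}[M(t) \mid \mathcal{F}_s]
  &= N(s) + \lambda_0 (t - s) - \lambda_0 t = M(s), \qquad s \le t
\end{align*}
```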
Given a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ and a counting process $\{N(t), t \ge 0\}$ which is adapted to the filtration $\{\mathcal{F}_t, t \ge 0\}$, the intensity of $N$ is the process $\{\lambda(t), t \ge 0\}$ defined by the following limit:

$$\lambda(t) = \lim_{h \downarrow 0} \frac{1}{h}\, \mathbb{E}\left[ N(t+h) - N(t) \mid \mathcal{F}_t \right].$$
The right-continuity property of counting processes allows us to take this limit from the right.
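The limit defining the intensity can be checked numerically in a simple case. The sketch below assumes a homogeneous Poisson process, for which increments are independent of the past, so the conditional expectation reduces to an ordinary expectation that can be estimated by Monte Carlo. The function names, the rate of 3, and the step size are illustrative choices, not part of the original text.

```python
import random


def count_events(rate, start, end, rng):
    """Count events of a homogeneous Poisson process in (start, end].

    By memorylessness, the waiting time from `start` to the next
    event is exponential with the given rate.
    """
    t = start + rng.expovariate(rate)
    count = 0
    while t <= end:
        count += 1
        t += rng.expovariate(rate)
    return count


def difference_quotient(rate, t, h, n_paths, rng):
    """Monte Carlo estimate of E[N(t + h) - N(t)] / h."""
    total = sum(count_events(rate, t, t + h, rng) for _ in range(n_paths))
    return total / (n_paths * h)


rng = random.Random(0)  # fixed seed for reproducibility
estimate = difference_quotient(rate=3.0, t=1.0, h=0.05, n_paths=200_000, rng=rng)
print(f"estimated intensity at t = 1: {estimate:.3f}")
```

For this process the difference quotient equals the rate for every $h > 0$, so the estimate should sit close to 3 up to sampling error. For a process whose intensity depends on the history, the conditioning on $\mathcal{F}_t$ cannot be dropped this way.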
In statistical learning, the discrepancy between the true intensity $\lambda$ and its estimator $\hat{\lambda}$ can be bounded with the use of oracle inequalities.
If a counting process $N(t)$ is restricted to $t \in [0,1]$ and $n$ i.i.d. copies $N_1, N_2, \ldots, N_n$ are observed on that interval, then the least squares functional for the intensity is

$$R_n(\lambda) = \int_0^1 \lambda(t)^2\,dt - \frac{2}{n} \sum_{i=1}^n \int_0^1 \lambda(t)\,dN_i(t),$$

which involves an Itô integral. If the assumption is made that $\lambda(t)$ is piecewise constant on $[0,1]$, i.e. it depends on a vector of constants $\beta = (\beta_1, \beta_2, \ldots, \beta_m) \in \mathbb{R}_+^m$ and can be written

$$\lambda_\beta = \sum_{j=1}^m \beta_j \lambda_{j,m}, \qquad \lambda_{j,m} = \sqrt{m}\,\mathbf{1}_{\left(\frac{j-1}{m}, \frac{j}{m}\right]},$$

where the $\lambda_{j,m}$ have a factor of $\sqrt{m}$ so that they are orthonormal under the standard $L^2$ norm, then by choosing appropriate data-driven weights $\hat{w}_j$ which depend on a parameter $x > 0$ and introducing the weighted norm

$$\|\beta\|_{\hat{w}} = \sum_{j=2}^m \hat{w}_j \, |\beta_j - \beta_{j-1}|,$$

the estimator for $\beta$ can be given:

$$\hat{\beta} = \arg\min_{\beta \in \mathbb{R}_+^m} \left\{ R_n(\lambda_\beta) + \|\beta\|_{\hat{w}} \right\}.$$
Then, the estimator $\hat{\lambda}$ is just $\lambda_{\hat{\beta}}$. With these preliminaries, an oracle inequality bounding the $L^2$ norm $\|\hat{\lambda} - \lambda\|$ is as follows: for an appropriate choice of $\hat{w}_j(x)$,

$$\|\hat{\lambda} - \lambda\|^2 \le \inf_{\beta \in \mathbb{R}_+^m} \left\{ \|\lambda_\beta - \lambda\|^2 + 2\,\|\beta\|_{\hat{w}} \right\}$$

with probability greater than or equal to $1 - 12.85\,e^{-x}$.
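The objective minimized above can be evaluated directly once the event times are known. The following is a minimal sketch, not a definitive implementation. It assumes the basis functions are scaled by $\sqrt{m}$ (the scaling that orthonormality in $L^2$ requires), so that the integral of $\lambda_\beta^2$ equals the sum of the $\beta_j^2$ and each integral against $dN_i$ becomes a sum of $\lambda_\beta$ over the event times of $N_i$. The function names and the toy event times are hypothetical. The data-driven weights are not specified in the text, so `tv_penalty` simply takes them as an input. The sketch computes only the unpenalized minimizer in closed form and does not solve the penalized problem.

```python
import math


def bin_index(t, m):
    """0-based index j of the interval (j/m, (j+1)/m] containing t in (0, 1]."""
    return min(m - 1, max(0, math.ceil(t * m) - 1))


def bin_counts(paths, m):
    """Total number of events per interval, pooled over all observed copies."""
    counts = [0] * m
    for events in paths:
        for t in events:
            counts[bin_index(t, m)] += 1
    return counts


def least_squares(beta, paths):
    """R_n(lambda_beta) for the sqrt(m)-scaled piecewise-constant basis."""
    m, n = len(beta), len(paths)
    counts = bin_counts(paths, m)
    quadratic = sum(b * b for b in beta)  # integral of lambda_beta squared
    linear = math.sqrt(m) * sum(b * c for b, c in zip(beta, counts))
    return quadratic - 2.0 * linear / n


def tv_penalty(beta, weights):
    """Weighted sum of |beta_j - beta_{j-1}|; weights[0] is unused."""
    return sum(weights[j] * abs(beta[j] - beta[j - 1]) for j in range(1, len(beta)))


def unpenalized_minimizer(paths, m):
    """Minimizer of R_n alone: setting the gradient to zero gives sqrt(m) * count_j / n."""
    n = len(paths)
    return [math.sqrt(m) * c / n for c in bin_counts(paths, m)]


# Toy data (hypothetical): event times of n = 3 observed copies on (0, 1].
paths = [[0.1, 0.6], [0.3], [0.7, 0.9]]
beta_hat = unpenalized_minimizer(paths, m=2)
print(beta_hat, least_squares(beta_hat, paths))
```

Without the penalty, the minimizer reduces to a histogram estimate of the intensity. The weighted penalty then discourages jumps between neighbouring coefficients, which is why the full estimator needs a numerical solver rather than this closed form.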