Neha Patil (Editor)

Memorylessness

In probability and statistics, memorylessness is a property of certain probability distributions. It usually refers to cases in which the distribution of a "waiting time" until a certain event does not depend on how much time has already elapsed. Only two kinds of distributions are memoryless: the exponential distributions on the non-negative real numbers and the geometric distributions on the non-negative integers.

As an example, suppose that a random variable, X, is defined to be the time elapsed in a shop from 9 am on a certain day until the arrival of the first customer. If X is memoryless, then the distribution of the "time from now to the next customer" is always the same, regardless of how much time has elapsed so far. For example, if no customer has arrived by 10 am, the probability that a customer arrives in the next 10 minutes is the same as the probability, at 9 am, that a customer would arrive in the following 10 minutes.

As a counterexample, suppose X is the lifetime of a car engine, measured in miles driven. If the engine has lasted 200,000 miles, then intuition suggests that the probability that it lasts another 100,000 miles is not the same as the probability that a newly built engine lasts its first 100,000 miles. However, if X were memoryless, the two probabilities would be the same (and if our intuition is right, the distribution that describes the lifetime of a large set of these engines does not have the memorylessness property). In essence, we would 'forget' what state the car is in: the probabilities would not be influenced by how far the engine has already been driven.
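
The contrast between the two examples can be sketched numerically. The following is only an illustration under assumed models: an exponential lifetime stands in for the memoryless case, and a Weibull lifetime with shape parameter 2 stands in for an engine that wears out. The parameter values and function names are hypothetical and chosen for readability, not taken from any data.

```python
import math

def exponential_survival(t, rate):
    """Pr(X > t) for an exponential lifetime with the given rate."""
    return math.exp(-rate * t)

def weibull_survival(t, scale, shape):
    """Pr(X > t) for a Weibull lifetime; shape > 1 models wear-out."""
    return math.exp(-((t / scale) ** shape))

def conditional_survival(survival, elapsed, extra):
    """Pr(X > elapsed + extra | X > elapsed) for a survival function."""
    return survival(elapsed + extra) / survival(elapsed)

# Hypothetical parameters; lifetimes are in units of 100,000 miles.
def memoryless(t):
    return exponential_survival(t, rate=0.5)

def wearing_out(t):
    return weibull_survival(t, scale=2.0, shape=2.0)

for name, survival in [("exponential", memoryless), ("Weibull", wearing_out)]:
    new_engine = conditional_survival(survival, elapsed=0.0, extra=1.0)
    used_engine = conditional_survival(survival, elapsed=2.0, extra=1.0)
    print(f"{name}: new {new_engine:.4f}, after 200,000 miles {used_engine:.4f}")
```

Under the exponential model the two printed probabilities agree, while under the wear-out model the used engine is less likely to last another 100,000 miles than a new one.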

In the context of Markov processes, memorylessness refers to the Markov property, an even stronger assumption which implies that the properties of random variables related to the future depend only on relevant information about the current time, not on information from further in the past. The present article describes the use outside the Markov property.

Discrete memorylessness

Suppose X is a discrete random variable whose values lie in the set { 0, 1, 2, ... }. The probability distribution of X is memoryless precisely if for any m, n in { 0, 1, 2, ... }, we have

Pr ( X > m + n | X > m ) = Pr ( X > n ) .

Here, Pr(X > m + n | X > m) denotes the conditional probability that the value of X is larger than m + n, given that it is larger than m.

The only memoryless discrete probability distributions are the geometric distributions, which describe the number of independent Bernoulli trials needed to get one "success," with a fixed probability p of "success" on each trial. For such a distribution, Pr(X > n) = (1 - p)^n, since X exceeds n exactly when the first n trials all fail. In other words, these are the distributions of the waiting time in a Bernoulli process.
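
The defining identity can be checked exactly for a geometric distribution. The sketch below assumes X counts trials up to and including the first success, so that Pr(X > n) = (1 - p)^n; the success probability and the function names are hypothetical, and rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction

def geometric_survival(p, n):
    """Pr(X > n) when X counts trials up to and including the first success."""
    return (1 - p) ** n

def conditional_survival(p, m, n):
    """Pr(X > m + n | X > m), from the definition of conditional probability."""
    return geometric_survival(p, m + n) / geometric_survival(p, m)

p = Fraction(1, 6)  # hypothetical success probability, e.g. rolling a six

# Compare both sides of the identity over a grid of m and n.
is_memoryless = all(
    conditional_survival(p, m, n) == geometric_survival(p, n)
    for m in range(20)
    for n in range(20)
)
print(is_memoryless)  # prints True
```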

A frequent misunderstanding

"Memorylessness" of the probability distribution of the number of trials X until the first success means that

  Pr ( X > 40 | X > 30 ) = Pr ( X > 10 ) .

It does not mean that

  Pr ( X > 40 | X > 30 ) = Pr ( X > 40 )

which would be true only if the events X > 40 and X > 30 were independent.
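
As a worked illustration, assume a hypothetical success probability p = 0.1 on each trial, so that Pr(X > n) = 0.9^n. The correct and the mistaken readings then differ by more than an order of magnitude:

```latex
\begin{align*}
\Pr(X > 40 \mid X > 30)
  &= \frac{\Pr(X > 40)}{\Pr(X > 30)}
   = \frac{0.9^{40}}{0.9^{30}}
   = 0.9^{10} \approx 0.349
   = \Pr(X > 10), \\
\Pr(X > 40) &= 0.9^{40} \approx 0.0148 .
\end{align*}
```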

Continuous memorylessness

Suppose X is a continuous random variable whose values lie in the non-negative real numbers [0, ∞). The probability distribution of X is memoryless precisely if for any non-negative real numbers t and s, we have

Pr ( X > t + s | X > t ) = Pr ( X > s ) .

This is similar to the discrete version except that s and t are constrained only to be non-negative real numbers instead of integers. Rather than counting trials until the first "success," for example, we may be marking time until the arrival of the first phone call at a switchboard.
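
The continuous property can also be observed in simulated data. The following is a rough sketch rather than a proof: it draws exponential waiting times with Python's standard `random` module and compares the two sides of the identity empirically. The rate, the values of t and s, the sample size, and the helper name are all assumptions made for the illustration.

```python
import random

def empirical_conditional_survival(samples, t, s):
    """Estimate Pr(X > t + s | X > t) as a ratio of sample counts."""
    survived_t = [x for x in samples if x > t]
    return sum(x > t + s for x in survived_t) / len(survived_t)

rng = random.Random(0)  # fixed seed so the run is repeatable
rate = 2.0              # hypothetical rate: two calls per unit of time
samples = [rng.expovariate(rate) for _ in range(200_000)]

t, s = 0.5, 0.3
conditional = empirical_conditional_survival(samples, t, s)
unconditional = sum(x > s for x in samples) / len(samples)

print(f"Pr(X > t + s | X > t) ~ {conditional:.3f}")
print(f"Pr(X > s)             ~ {unconditional:.3f}")
```

Both estimates should lie close to exp(-rate * s), up to sampling error.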

The memoryless distribution is an exponential distribution

The only memoryless continuous probability distributions are the exponential distributions, so memorylessness completely characterizes the exponential distributions among all continuous ones.

To see this, first define the survival function, G, as

G ( t ) = Pr ( X > t ) .

Note that G(t) is then monotonically decreasing. From the relation

Pr ( X > t + s | X > t ) = Pr ( X > s )

and the definition of conditional probability, it follows that

Pr ( X > t + s ) / Pr ( X > t ) = Pr ( X > s ) .

Written in terms of G, this gives the functional equation

G ( t + s ) = G ( t ) G ( s ) ,

which is a direct consequence of the memorylessness property.

From this, we must have:

G ( 2 ) = G ( 1 )^2

G ( 1 / 2 ) = G ( 1 )^( 1 / 2 )

In general:

G ( a ) = G ( 1 )^a

The only continuous function that satisfies this equation for every positive rational a is:

G ( a ) = G ( 1 )^a = e^( a ln G ( 1 ) ) = e^( -λ a ) ,

where λ = -ln ( G ( 1 ) ) .

Since G ( a ) is a probability and decreases as a grows, we must have 0 < G ( 1 ) < 1 and therefore λ > 0, so any memoryless continuous distribution must be exponential.

Put a different way, G is a monotone decreasing function (meaning that for times x ≤ y, we have G ( x ) ≥ G ( y ) ).

The functional equation alone will imply that G restricted to rational multiples of any particular number is an exponential function. Combined with the fact that G is monotone, this implies that G over its whole domain is an exponential function.
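
The step from the functional equation to the values of G at rational arguments can be spelled out as follows. This is a sketch of the standard argument, with m and n denoting arbitrary positive integers (symbols introduced here for illustration):

```latex
\begin{align*}
G(nt) &= G(t)^{n}
  && \text{(apply } G(t+s) = G(t)\,G(s) \text{ repeatedly)} \\
G(1) &= G\!\left(\tfrac{1}{n}\right)^{n}
  \;\Longrightarrow\;
  G\!\left(\tfrac{1}{n}\right) = G(1)^{1/n}
  && \text{(take } t = \tfrac{1}{n} \text{; the root is non-negative since } G \ge 0\text{)} \\
G\!\left(\tfrac{m}{n}\right) &= G\!\left(\tfrac{1}{n}\right)^{m} = G(1)^{m/n}
  && \text{(take } t = \tfrac{1}{n} \text{ and replace } n \text{ by } m \text{ in the first line)}
\end{align*}
```

Monotonicity of G then extends the formula from rational arguments to all non-negative real arguments.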

References

Memorylessness Wikipedia