| Property | Value |
|---|---|
| Support | $k \in \{1, 2, \dotsc\}$ |
| pmf | $\rho \operatorname{B}(k, \rho + 1)$ |
| CDF | $1 - k \operatorname{B}(k, \rho + 1)$ |
| Mean | $\frac{\rho}{\rho - 1}$ for $\rho > 1$ |
| Mode | $1$ |
In probability and statistics, the Yule–Simon distribution is a discrete probability distribution named after Udny Yule and Herbert A. Simon. Simon originally called it the Yule distribution.
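The probability mass function (pmf) of the Yule–Simon($\rho$) distribution is

$$f(k;\rho) = \rho \operatorname{B}(k, \rho + 1),$$

for integer $k \geq 1$ and real $\rho > 0$, where $\operatorname{B}$ is the beta function.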
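As a minimal sketch, the pmf $\rho \operatorname{B}(k, \rho + 1)$ and the CDF $1 - k \operatorname{B}(k, \rho + 1)$ can be evaluated through the log-gamma function, which avoids overflow for large $k$. The function names `yule_simon_pmf` and `yule_simon_cdf` are illustrative and do not come from any particular library.

```python
import math


def log_beta(a: float, b: float) -> float:
    """Logarithm of the beta function B(a, b), computed via log-gamma."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def yule_simon_pmf(k: int, rho: float) -> float:
    """f(k; rho) = rho * B(k, rho + 1) for integer k >= 1 and rho > 0."""
    if k < 1 or rho <= 0:
        raise ValueError("need integer k >= 1 and rho > 0")
    return rho * math.exp(log_beta(k, rho + 1.0))


def yule_simon_cdf(k: int, rho: float) -> float:
    """F(k; rho) = 1 - k * B(k, rho + 1) for integer k >= 1 and rho > 0."""
    if k < 1 or rho <= 0:
        raise ValueError("need integer k >= 1 and rho > 0")
    return 1.0 - k * math.exp(log_beta(k, rho + 1.0))


if __name__ == "__main__":
    rho = 2.0
    for k in range(1, 6):
        print(k, yule_simon_pmf(k, rho), yule_simon_cdf(k, rho))
```

Summing `yule_simon_pmf` over $k = 1, \dotsc, n$ should reproduce `yule_simon_cdf(n, rho)` up to rounding, which is a quick consistency check on both formulas.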
The parameter $\rho$ can be estimated using a fixed-point algorithm.
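The probability mass function $f$ has the property that for sufficiently large $k$ we have

$$f(k;\rho) \approx \frac{\rho\,\Gamma(\rho + 1)}{k^{\rho + 1}} \propto \frac{1}{k^{\rho + 1}},$$

where $\Gamma$ is the gamma function.
This means that the tail of the Yule–Simon distribution is a realization of Zipf's law: $f(k;\rho)$ can be used to model, for example, the relative frequency of the $k$th most frequent word in a large collection of text, which according to Zipf's law is inversely proportional to a (typically small) power of $k$.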
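The power-law tail can be checked numerically. The sketch below compares the exact pmf $\rho \operatorname{B}(k, \rho + 1)$ with the approximation $\rho\,\Gamma(\rho + 1) / k^{\rho + 1}$, working in log space so that large $k$ does not overflow. The helper names are illustrative only.

```python
import math


def log_pmf(k: int, rho: float) -> float:
    """log of the exact pmf rho * B(k, rho + 1)."""
    return (math.log(rho) + math.lgamma(k) + math.lgamma(rho + 1.0)
            - math.lgamma(k + rho + 1.0))


def log_tail_approx(k: int, rho: float) -> float:
    """log of the power-law approximation rho * Gamma(rho + 1) / k**(rho + 1)."""
    return math.log(rho) + math.lgamma(rho + 1.0) - (rho + 1.0) * math.log(k)


def tail_ratio(k: int, rho: float) -> float:
    """Exact pmf divided by its power-law approximation; tends to 1 as k grows."""
    return math.exp(log_pmf(k, rho) - log_tail_approx(k, rho))


if __name__ == "__main__":
    for k in (10, 100, 1000, 10000):
        print(k, tail_ratio(k, 1.5))
```

The printed ratio moves toward 1 as $k$ increases, which is what the asymptotic statement asserts; for small $k$ the two expressions differ noticeably.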
Occurrence
The Yule–Simon distribution arose originally as the limiting distribution of a particular stochastic process studied by Yule as a model for the distribution of biological taxa and subtaxa. Simon dubbed this process the "Yule process", but it is more commonly known today as a preferential attachment process. The preferential attachment process is an urn process in which balls are added to a growing number of urns, each ball being allocated to an urn with probability linear in the number of balls the urn already contains.
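A small simulation can illustrate the urn process. The sketch below assumes one specific variant that the text above does not spell out: each new ball opens a new urn with a fixed probability `p_new` (a hypothetical parameter), and otherwise joins an existing urn chosen with probability proportional to its current size. Under that assumption the urn sizes are expected to approach a Yule–Simon distribution with $\rho = 1/(1 - p_{\text{new}})$.

```python
import random
from collections import Counter


def simulate_preferential_attachment(n_balls: int, p_new: float, seed: int = 0) -> list:
    """Return the list of urn sizes after placing n_balls balls.

    Assumed variant: with probability p_new a ball starts a new urn,
    otherwise it joins an existing urn chosen proportionally to its size.
    """
    rng = random.Random(seed)
    ball_to_urn = [0]  # urn index of every ball placed so far
    urn_sizes = [1]    # start with a single urn holding one ball
    for _ in range(n_balls - 1):
        if rng.random() < p_new:
            urn = len(urn_sizes)
            urn_sizes.append(1)
        else:
            # A uniformly random existing ball belongs to a given urn with
            # probability proportional to that urn's size.
            urn = ball_to_urn[rng.randrange(len(ball_to_urn))]
            urn_sizes[urn] += 1
        ball_to_urn.append(urn)
    return urn_sizes


if __name__ == "__main__":
    sizes = simulate_preferential_attachment(200_000, p_new=0.5, seed=1)
    counts = Counter(sizes)
    rho = 1.0 / (1.0 - 0.5)
    print("empirical share of size-1 urns:", counts[1] / len(sizes))
    print("Yule-Simon f(1; rho) = rho / (rho + 1):", rho / (rho + 1.0))
```

Keeping one list entry per ball makes the size-proportional choice a single uniform draw, at the cost of memory linear in the number of balls.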
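The distribution also arises as a compound distribution, in which the parameter of a geometric distribution is treated as a function of a random variable having an exponential distribution. Specifically, assume that $W$ follows an exponential distribution with rate $\rho$ (scale $1/\rho$):

$$W \sim \operatorname{Exponential}(\rho),$$

with density

$$h(w;\rho) = \rho \exp(-\rho w).$$

Then a Yule–Simon distributed variable $K$ has the following geometric distribution conditional on $W$:

$$K \sim \operatorname{Geometric}(\exp(-W)).$$

The pmf of a geometric distribution is

$$g(k;p) = p (1 - p)^{k - 1}$$

for $k \in \{1, 2, \dotsc\}$. Integrating out $W$ recovers the Yule–Simon pmf:

$$f(k;\rho) = \int_0^\infty g(k; \exp(-w))\, h(w;\rho)\, dw = \rho \operatorname{B}(k, \rho + 1).$$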
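The compound construction suggests a simple sampler: draw $W$ from an exponential distribution with rate $\rho$, then draw $K$ from a geometric distribution on $\{1, 2, \dotsc\}$ with success probability $\exp(-W)$. The sketch below follows that reading; the function name `sample_yule_simon` is illustrative, and the geometric draw uses inverse-transform sampling as one possible choice.

```python
import math
import random


def sample_yule_simon(rho: float, rng: random.Random) -> int:
    """Draw one Yule-Simon(rho) variate via the exponential-geometric mixture."""
    w = rng.expovariate(rho)  # W ~ Exponential(rate = rho)
    p = math.exp(-w)          # success probability of the geometric step
    if p >= 1.0:
        return 1              # success is certain on the first trial
    if p == 0.0:
        # exp(-w) underflowed; can happen for very small rho (heavy tail).
        raise OverflowError("success probability underflowed to zero")
    u = 1.0 - rng.random()    # uniform on (0, 1]
    # Inverse transform: P(K > k) = (1 - p)**k for k = 0, 1, 2, ...
    return 1 + int(math.log(u) / math.log1p(-p))


if __name__ == "__main__":
    rng = random.Random(42)
    rho = 2.0
    draws = [sample_yule_simon(rho, rng) for _ in range(100_000)]
    print("empirical P(K = 1):", draws.count(1) / len(draws))
    print("rho / (rho + 1):   ", rho / (rho + 1.0))
```

Comparing empirical frequencies with $\rho \operatorname{B}(k, \rho + 1)$ is more reliable than comparing sample means, because the variance is infinite for $\rho \leq 2$.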
The following recurrence relation holds:

$$f(k + 1;\rho) = \frac{k}{k + \rho + 1}\, f(k;\rho), \qquad f(1;\rho) = \frac{\rho}{\rho + 1}.$$
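A recurrence of the form $f(k + 1;\rho) = \frac{k}{k + \rho + 1} f(k;\rho)$ with $f(1;\rho) = \frac{\rho}{\rho + 1}$, which follows from the beta-function form of the pmf, gives a cheap way to tabulate the distribution without evaluating gamma functions. The sketch below builds such a table; `yule_simon_pmf_table` is an illustrative name.

```python
def yule_simon_pmf_table(rho: float, k_max: int) -> list:
    """Return [f(1), f(2), ..., f(k_max)] computed by the recurrence."""
    if rho <= 0 or k_max < 1:
        raise ValueError("need rho > 0 and k_max >= 1")
    pmf = [rho / (rho + 1.0)]  # f(1; rho) = rho * B(1, rho + 1)
    for k in range(1, k_max):
        # f(k + 1) = f(k) * k / (k + rho + 1)
        pmf.append(pmf[-1] * k / (k + rho + 1.0))
    return pmf


if __name__ == "__main__":
    table = yule_simon_pmf_table(2.0, 5)
    for k, value in enumerate(table, start=1):
        # For rho = 2 the pmf has the closed form 4 / (k (k + 1) (k + 2)).
        print(k, value, 4.0 / (k * (k + 1) * (k + 2)))
```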
Generalizations
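The two-parameter generalization of the original Yule distribution replaces the beta function with an incomplete beta function. The probability mass function of the generalized Yule–Simon($\rho$, $\alpha$) distribution is defined as

$$f(k;\rho,\alpha) = \frac{\rho}{1 - \alpha^{\rho}}\, \operatorname{B}_{1 - \alpha}(k, \rho + 1),$$

with $0 \leq \alpha < 1$. For $\alpha = 0$ the ordinary Yule–Simon($\rho$) distribution is obtained as a special case.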
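As a rough numerical sketch, the generalized pmf can be evaluated by integrating the incomplete beta function $\operatorname{B}_x(a, b) = \int_0^x t^{a-1} (1 - t)^{b-1}\, dt$ directly. The code below assumes the form $f(k;\rho,\alpha) = \frac{\rho}{1 - \alpha^{\rho}} \operatorname{B}_{1-\alpha}(k, \rho + 1)$ with $0 \leq \alpha < 1$ and uses composite Simpson's rule, chosen here for self-containment rather than accuracy; a library routine for the incomplete beta function would normally be preferable. All function names are illustrative.

```python
def incomplete_beta(x: float, a: float, b: float, n: int = 2000) -> float:
    """B_x(a, b) by composite Simpson's rule; assumes a >= 1, b >= 1, n even."""
    if n % 2:
        raise ValueError("n must be even")
    h = x / n

    def integrand(t: float) -> float:
        return t ** (a - 1.0) * (1.0 - t) ** (b - 1.0)

    total = integrand(0.0) + integrand(x)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0


def generalized_yule_simon_pmf(k: int, rho: float, alpha: float) -> float:
    """f(k; rho, alpha) = rho / (1 - alpha**rho) * B_{1 - alpha}(k, rho + 1)."""
    if k < 1 or rho <= 0 or not 0.0 <= alpha < 1.0:
        raise ValueError("need k >= 1, rho > 0 and 0 <= alpha < 1")
    return rho / (1.0 - alpha ** rho) * incomplete_beta(1.0 - alpha, k, rho + 1.0)


if __name__ == "__main__":
    rho, alpha = 1.5, 0.3
    total = sum(generalized_yule_simon_pmf(k, rho, alpha) for k in range(1, 121))
    print("sum of pmf over k = 1..120:", total)
```

The factor $1/(1 - \alpha^{\rho})$ is what makes the probabilities sum to one, so the printed total serves as a sanity check on the assumed normalization.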