Supriya Ghosh (Editor)

Subset simulation




Subset simulation is a method used in reliability engineering to compute small (i.e., rare event) failure probabilities encountered in engineering systems. The basic idea is to express a small failure probability as a product of larger conditional probabilities by introducing intermediate failure events. This conceptually converts the original rare event problem into a series of frequent event problems that are easier to solve. In the actual implementation, samples conditional on intermediate failure events are generated adaptively so that they gradually populate the space from the frequent event region to the rare event region. These 'conditional samples' provide information for estimating the complementary cumulative distribution function (CCDF) of the quantity of interest (that governs failure), covering the high as well as the low probability regions. They can also be used for investigating the cause and consequence of failure events. The generation of conditional samples is not trivial but can be performed efficiently using Markov chain Monte Carlo (MCMC).


Subset simulation treats the relationship between the (input) random variables and the (output) response quantity of interest as a 'black box'. This can be attractive for complex systems where it is difficult to use other variance reduction or rare event sampling techniques that require prior information about the system behavior. For problems where it is possible to incorporate prior information into the reliability algorithm, it is often more efficient to use other variance reduction techniques such as importance sampling.



Basic idea

Let X be a vector of random variables and Y = h(X) be a scalar (output) response quantity of interest for which the failure probability P(F) = P(Y>b) is to be determined. Each evaluation of h(.) is expensive, so the number of evaluations should be kept as small as possible. Using direct Monte Carlo methods, one can generate i.i.d. (independent and identically distributed) samples of X and then estimate P(F) simply as the fraction of samples with Y>b. However, this is not efficient when P(F) is small, because most samples will not fail (i.e., they have Y≤b) and in many cases the estimate is 0. As a rule of thumb, for small P(F) one requires about 10 failed samples to estimate P(F) with a coefficient of variation of 30% (a moderate requirement). For example, 10000 i.i.d. samples, and hence 10000 evaluations of h(.), would be required for such an estimate if P(F) = 0.001.
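The direct Monte Carlo estimate and the rule of thumb above can be sketched as follows. This is a minimal illustration, not a reference implementation: the response function `h`, the choice of independent standard normal inputs, and the threshold value 3.09 (chosen so that P(F) is roughly 0.001 for this particular `h`) are all assumptions made for the example. The coefficient of variation uses the standard binomial result sqrt((1 - p) / (p N)).

```python
import math
import random


def h(x):
    # Hypothetical response: scaled sum of the inputs, so that Y ~ N(0, 1)
    # when the components of x are independent standard normals.
    return sum(x) / math.sqrt(len(x))


def direct_monte_carlo(b, n_samples, dim=2, seed=0):
    """Estimate P(Y > b) as the fraction of i.i.d. samples with Y > b."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(n_samples):
        x = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # assumed input model
        if h(x) > b:
            failures += 1
    return failures / n_samples, failures


def cov_of_estimator(p, n_samples):
    """Coefficient of variation of the direct Monte Carlo estimator."""
    return math.sqrt((1.0 - p) / (p * n_samples))


p_hat, n_failed = direct_monte_carlo(b=3.09, n_samples=10000)
print(p_hat, n_failed, cov_of_estimator(0.001, 10000))
```

With P(F) = 0.001 and 10000 samples, the expected number of failed samples is 10 and the coefficient of variation is about 1/sqrt(10), i.e. roughly 30%, which matches the rule of thumb.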

Subset simulation attempts to convert a rare event problem into more frequent ones. Let

b_1 < b_2 < ... < b_m = b

be an increasing sequence of intermediate threshold levels. From the basic property of conditional probability,

P(Y>b) = P(Y>b_m | Y>b_{m-1}) P(Y>b_{m-1}) = ... = P(Y>b_1) P(Y>b_2 | Y>b_1) ... P(Y>b_m | Y>b_{m-1})
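As an illustration of what this factorization buys, suppose (purely as an assumption for this example) that m = 3 and that the levels happen to be placed so that each factor equals 0.1. Applying the conditional probability property repeatedly gives:

```latex
\begin{aligned}
P(Y > b_3) &= P(Y > b_1)\, P(Y > b_2 \mid Y > b_1)\, P(Y > b_3 \mid Y > b_2) \\
           &= 0.1 \times 0.1 \times 0.1 = 10^{-3}.
\end{aligned}
```

Each factor is 100 times larger than the target probability, so each one is individually a frequent event problem.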

The 'raw idea' of subset simulation is to estimate P(F) by estimating P(Y>b_1) and the conditional probabilities P(Y>b_i | Y>b_{i-1}) (i=2,...,m), anticipating an efficiency gain when these probabilities are not small. To implement this idea there are two basic issues:

  1. Estimating the conditional probabilities by means of simulation requires the efficient generation of samples of X conditional on the intermediate failure events, i.e., the conditional samples. This is generally non-trivial.
  2. The intermediate threshold levels {b_i} should be chosen so that the intermediate probabilities are not too small (otherwise one ends up with a rare event problem again) but not too large (otherwise too many levels are required to reach the target event). However, this requires information about the CCDF, which is the very quantity to be estimated.

In the standard algorithm of subset simulation, the first issue is resolved by using Markov chain Monte Carlo. The second issue is resolved by choosing the intermediate threshold levels {b_i} adaptively, using samples from the previous simulation level. As a result, subset simulation in fact produces a set of estimates of b that correspond to different fixed values of p = P(Y>b), rather than estimates of probabilities for fixed threshold values.
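The two ingredients just described (MCMC for the conditional samples, adaptive thresholds from the previous level) can be sketched as below. This is a simplified illustration under stated assumptions, not the definitive algorithm: the inputs are assumed to be independent standard normals, the MCMC step is a component-wise ("modified") Metropolis move with a uniform proposal of assumed half-width 1, the level probability `p0 = 0.1` and sample size `n = 1000` are arbitrary choices, and `h_example` is a hypothetical response function.

```python
import math
import random


def mm_step(h, x, y, level, rng, width=1.0):
    """One component-wise Metropolis move that stays inside {Y > level}."""
    cand = list(x)
    for k in range(len(x)):
        xi = x[k] + rng.uniform(-width, width)  # symmetric proposal
        # Accept the component with the ratio of standard normal densities.
        if rng.random() < math.exp(0.5 * (x[k] ** 2 - xi ** 2)):
            cand[k] = xi
    if cand == x:
        return x, y  # nothing changed, so no new evaluation of h is needed
    y_cand = h(cand)
    # Reject the whole candidate if it leaves the intermediate failure event.
    return (cand, y_cand) if y_cand > level else (x, y)


def subset_simulation(h, dim, b, n=1000, p0=0.1, max_levels=20, seed=0):
    """Return (estimate of P(Y > b), list of adaptive thresholds b_i)."""
    rng = random.Random(seed)
    n_seeds = round(n * p0)    # samples kept as chain seeds at each level
    chain_len = n // n_seeds   # states generated from each seed
    # Level 0: direct Monte Carlo from the assumed input distribution.
    xs = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]
    ys = [h(x) for x in xs]
    prob, thresholds = 1.0, []
    for _ in range(max_levels):
        order = sorted(range(len(ys)), key=lambda i: ys[i], reverse=True)
        # Adaptive threshold: a fraction p0 of the current samples exceeds it
        # (ties among repeated chain states are ignored in this sketch).
        b_i = 0.5 * (ys[order[n_seeds - 1]] + ys[order[n_seeds]])
        if b_i >= b:
            break  # the target level lies within the current sample range
        thresholds.append(b_i)
        prob *= p0
        new_xs, new_ys = [], []
        for i in order[:n_seeds]:
            x, y = xs[i], ys[i]
            for _ in range(chain_len):
                x, y = mm_step(h, x, y, b_i, rng)
                new_xs.append(x)
                new_ys.append(y)
        xs, ys = new_xs, new_ys
    # Final factor: fraction of the current conditional samples with Y > b.
    return prob * sum(1 for y in ys if y > b) / len(ys), thresholds


def h_example(x):
    # Hypothetical response with Y ~ N(0, 1) for standard normal inputs.
    return sum(x) / math.sqrt(len(x))


p_hat, levels = subset_simulation(h_example, dim=2, b=3.09)
print(p_hat, levels)
```

The returned `levels` are the threshold estimates corresponding to the fixed probabilities p0, p0^2, and so on, which reflects the remark above that the method estimates b for fixed p rather than p for fixed b.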

There are a number of variations of subset simulation used in different contexts in applied probability and stochastic operations research. For example, in some variations the simulation effort to estimate each conditional probability P(Y>b_i | Y>b_{i-1}) (i=2,...,m) may not be fixed prior to the simulation, but may be random, similar to the splitting method in rare-event probability estimation.
