
Inside–outside algorithm


In computer science, the inside–outside algorithm is a way of re-estimating production probabilities in a probabilistic context-free grammar. It was introduced by James K. Baker in 1979 as a generalization of the forward–backward algorithm for parameter estimation on hidden Markov models to stochastic context-free grammars. It is used to compute expectations, for example as part of the expectation–maximization algorithm (an unsupervised learning algorithm).

Inside and outside probabilities

The inside probability $\beta_j(p,q)$ is the total probability of generating the words $w_p \cdots w_q$, given that the subtree is rooted at the nonterminal $N^j$ and given a grammar $G$:

\beta_j(p,q) = P(w_{pq} \mid N^j_{pq}, G)

The outside probability $\alpha_j(p,q)$ is the total probability of beginning with the start symbol $N^1$ and generating the nonterminal $N^j_{pq}$ and all the words outside $w_p \cdots w_q$, given a grammar $G$:

\alpha_j(p,q) = P(w_{1(p-1)}, N^j_{pq}, w_{(q+1)m} \mid G)

Here $N^j_{pq}$ denotes an occurrence of the nonterminal $N^j$ dominating the words $w_p$ through $w_q$, and $m$ is the length of the sentence.
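Multiplying the two quantities gives a useful check on the definitions. By the context-freeness assumption, the words inside the span are independent of the outside context given $N^j_{pq}$, so the product is the joint probability of the whole sentence together with a constituent $N^j$ spanning positions $p$ through $q$:

\alpha_j(p,q)\,\beta_j(p,q) = P(w_{1(p-1)}, N^j_{pq}, w_{(q+1)m} \mid G)\, P(w_{pq} \mid N^j_{pq}, G) = P(w_{1m}, N^j_{pq} \mid G)

Quantities of this form are the expectations used in the expectation–maximization re-estimation step mentioned above.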

Computing inside probabilities

Base Case:

\beta_j(p,p) = P(w_p \mid N^j, G)

General Case:

Suppose there is a rule $N^j \to N^r N^s$ in the grammar. Then the probability of generating $w_p \cdots w_q$ starting with a subtree rooted at $N^j$ and using that rule is obtained by summing over every split point $k$ at which the span divides between the two children:

\sum_{k=p}^{q-1} P(N^j \to N^r N^s)\, \beta_r(p,k)\, \beta_s(k+1,q)

The inside probability $\beta_j(p,q)$ is just the sum over all such possible rules:

\beta_j(p,q) = \sum_{N^r, N^s} \sum_{k=p}^{q-1} P(N^j \to N^r N^s)\, \beta_r(p,k)\, \beta_s(k+1,q)
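This recursion fills the table for progressively wider spans, exactly as in CKY parsing. Below is a minimal sketch in Python, assuming a grammar in Chomsky normal form with rule probabilities supplied as plain dictionaries; the names inside_probabilities, lexical, and binary are illustrative, not from any particular library.

from collections import defaultdict

def inside_probabilities(words, lexical, binary, nonterminals):
    # words:   the sentence as a list of tokens (0-indexed here)
    # lexical: dict mapping (j, word) to P(N^j -> word)
    # binary:  dict mapping (j, r, s) to P(N^j -> N^r N^s)
    m = len(words)
    beta = defaultdict(float)
    # Base case: beta_j(p, p) = P(w_p | N^j, G)
    for p, w in enumerate(words):
        for j in nonterminals:
            beta[(j, p, p)] = lexical.get((j, w), 0.0)
    # General case: widen the span, summing over rules and split points k
    for width in range(1, m):
        for p in range(m - width):
            q = p + width
            for (j, r, s), rule_prob in binary.items():
                beta[(j, p, q)] += rule_prob * sum(
                    beta[(r, p, k)] * beta[(s, k + 1, q)]
                    for k in range(p, q))
    return beta

For the toy grammar S -> A A and A -> 'a' (all rule probabilities 1), inside_probabilities(["a", "a"], {("A", "a"): 1.0}, {("S", "A", "A"): 1.0}, {"S", "A"}) yields beta[("S", 0, 1)] == 1.0, i.e. $\beta_S(1,2) = 1$ in the 1-indexed notation above. The running time is $O(m^3)$ times the number of binary rules.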

Computing outside probabilities

Base Case:

\alpha_j(1,m) = \begin{cases} 1 & \text{if } j = 1 \\ 0 & \text{otherwise} \end{cases}

Here the start symbol is $N^1$: a span covering the whole sentence has no outside words, and only the start symbol can dominate the entire sentence.
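In code, this base case is just an initialization of the outside table over the full span. A minimal sketch continuing the conventions of the inside sketch above (outside_base_case and the start argument are illustrative names):

from collections import defaultdict

def outside_base_case(m, nonterminals, start):
    # alpha_j(1, m) = 1 if N^j is the start symbol, 0 otherwise
    # (0-indexed: the full span is (0, m - 1))
    alpha = defaultdict(float)
    for j in nonterminals:
        alpha[(j, 0, m - 1)] = 1.0 if j == start else 0.0
    return alpha

The recursive step for narrower spans then works downward from this initialization, combining outside probabilities of parents with inside probabilities of sibling constituents.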
