
Inside–outside algorithm


In computer science, the inside–outside algorithm is a way of re-estimating production probabilities in a probabilistic context-free grammar. It was introduced by James K. Baker in 1979 as a generalization of the forward–backward algorithm for parameter estimation on hidden Markov models to stochastic context-free grammars. It is used to compute expectations, for example as part of the expectation–maximization algorithm (an unsupervised learning algorithm).

Inside and outside probabilities

The inside probability $\beta_j(p,q)$ is the total probability of generating the words $w_p \cdots w_q$, given that the subtree is rooted at the nonterminal $N^j$ and given a grammar $G$:

\beta_j(p,q) = P(w_{pq} \mid N^j_{pq}, G)

The outside probability $\alpha_j(p,q)$ is the total probability of beginning with the start symbol $N^1$ and generating the nonterminal $N^j_{pq}$ and all the words outside $w_p \cdots w_q$, given a grammar $G$:

\alpha_j(p,q) = P(w_{1(p-1)}, N^j_{pq}, w_{(q+1)m} \mid G)

Here $N^j_{pq}$ denotes an occurrence of the nonterminal $N^j$ dominating the words $w_p$ through $w_q$, and $m$ is the length of the sentence.
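Multiplying the two quantities gives a useful check on the definitions. By the context-freeness assumption, the words inside the span are independent of the outside context given $N^j_{pq}$, so the product is the joint probability of the whole sentence together with a constituent $N^j$ spanning positions $p$ through $q$:

\alpha_j(p,q)\,\beta_j(p,q) = P(w_{1(p-1)}, N^j_{pq}, w_{(q+1)m} \mid G)\, P(w_{pq} \mid N^j_{pq}, G) = P(w_{1m}, N^j_{pq} \mid G)

Quantities of this form are the expectations used in the expectation–maximization re-estimation step mentioned above.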

Computing inside probabilities

Base Case:

\beta_j(p,p) = P(w_p \mid N^j, G)

General Case:

Suppose there is a rule $N^j \to N^r N^s$ in the grammar. Then the probability of generating $w_p \cdots w_q$ starting with a subtree rooted at $N^j$ and using that rule is obtained by summing over every split point $k$ at which the span divides between the two children:

\sum_{k=p}^{q-1} P(N^j \to N^r N^s)\, \beta_r(p,k)\, \beta_s(k+1,q)

The inside probability $\beta_j(p,q)$ is just the sum over all such possible rules:

\beta_j(p,q) = \sum_{N^r, N^s} \sum_{k=p}^{q-1} P(N^j \to N^r N^s)\, \beta_r(p,k)\, \beta_s(k+1,q)
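This recursion fills the table for progressively wider spans, exactly as in CKY parsing. Below is a minimal sketch in Python, assuming a grammar in Chomsky normal form with rule probabilities supplied as plain dictionaries; the names inside_probabilities, lexical, and binary are illustrative, not from any particular library.

from collections import defaultdict

def inside_probabilities(words, lexical, binary, nonterminals):
    # words:   the sentence as a list of tokens (0-indexed here)
    # lexical: dict mapping (j, word) to P(N^j -> word)
    # binary:  dict mapping (j, r, s) to P(N^j -> N^r N^s)
    m = len(words)
    beta = defaultdict(float)
    # Base case: beta_j(p, p) = P(w_p | N^j, G)
    for p, w in enumerate(words):
        for j in nonterminals:
            beta[(j, p, p)] = lexical.get((j, w), 0.0)
    # General case: widen the span, summing over rules and split points k
    for width in range(1, m):
        for p in range(m - width):
            q = p + width
            for (j, r, s), rule_prob in binary.items():
                beta[(j, p, q)] += rule_prob * sum(
                    beta[(r, p, k)] * beta[(s, k + 1, q)]
                    for k in range(p, q))
    return beta

For the toy grammar S -> A A and A -> 'a' (all rule probabilities 1), inside_probabilities(["a", "a"], {("A", "a"): 1.0}, {("S", "A", "A"): 1.0}, {"S", "A"}) yields beta[("S", 0, 1)] == 1.0, i.e. $\beta_S(1,2) = 1$ in the 1-indexed notation above. The running time is $O(m^3)$ times the number of binary rules.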

Computing outside probabilities

Base Case:

\alpha_j(1,m) = \begin{cases} 1 & \text{if } j = 1 \\ 0 & \text{otherwise} \end{cases}

Here the start symbol is $N^1$: a span covering the whole sentence has no outside words, and only the start symbol can dominate the entire sentence.
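In code, this base case is just an initialization of the outside table over the full span. A minimal sketch continuing the conventions of the inside sketch above (outside_base_case and the start argument are illustrative names):

from collections import defaultdict

def outside_base_case(m, nonterminals, start):
    # alpha_j(1, m) = 1 if N^j is the start symbol, 0 otherwise
    # (0-indexed: the full span is (0, m - 1))
    alpha = defaultdict(float)
    for j in nonterminals:
        alpha[(j, 0, m - 1)] = 1.0 if j == start else 0.0
    return alpha

The recursive step for narrower spans then works downward from this initialization, combining outside probabilities of parents with inside probabilities of sibling constituents.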
