positive backoff weight

Andreas Stolcke stolcke at speech.sri.com
Mon Apr 5 22:13:26 PDT 2004


In message <40597F52.4050803 at irisa.fr> you wrote:
> Thank you for the past answers to my questions.
> 
> I've got another question. Sometimes, when I use Good-Turing 
> discounting, some of the backoff weights of the unigrams (I compute a 
> bigram model) have a positive log value. How is that possible? Is it 

Backoff weights are not probabilities.  They are normalizing factors.
The backoff weight for a history h is defined as

	BOW(h) = [ 1 - \sum_{w seen after h} p(w|h) ] / [ 1 - \sum_{w seen after h} p'(w|h) ]

where the sums run over the words w that occur after h in the model, and
p'(w|h) is the lower-order probability estimate (e.g., a bigram estimate
in a trigram model).
So, if the trigram probability estimates give lower values than the
corresponding bigram estimates for the words observed after a given
history, then BOW(h) will be > 1 and its log positive.
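
To make this concrete, here is a small illustrative sketch (not SRILM
source code; the function name, variable names, and numbers are made up)
of how BOW(h) follows from the discounted higher-order estimates and the
corresponding lower-order estimates of the same words:

	# Illustrative sketch, not SRILM source: compute BOW(h) from the
	# explicit (discounted) higher-order probabilities p(w|h) and the
	# lower-order probabilities p'(w|h) of the same words.
	def backoff_weight(higher, lower):
	    # higher: dict mapping each word w seen after h to p(w|h)
	    # lower:  dict mapping each such w to its lower-order estimate p'(w|h)
	    num = 1.0 - sum(higher.values())
	    den = 1.0 - sum(lower[w] for w in higher)
	    return num / den

	# Made-up numbers: the discounted higher-order estimates sum to less
	# than the lower-order estimates of the same words, so BOW(h) > 1
	# and log BOW(h) > 0.
	higher = {"cat": 0.4, "dog": 0.3}     # sums to 0.7
	lower  = {"cat": 0.5, "dog": 0.4}     # sums to 0.9
	print(backoff_weight(higher, lower))  # (1-0.7)/(1-0.9) = 3.0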

> because Good-Turing discounting is disabled on unigrams since there are 
> no unigrams whose frequency is 1? And, more generally, how are backoff 
> weights for unigrams computed, in the case of a bigram model?

Backoff weights for unigrams are computed by exactly the same method
(in the formula above, p(w|h) are bigram probabilities and p'(w|h) are
unigram probabilities).

--Andreas 



