[SRILM User List] A confusion of the interpolated language model

Yannick Estève yannick.esteve at lium.univ-lemans.fr
Thu Aug 27 01:19:44 PDT 2009


Hi,

Back-off weights are not probabilities: they are normalization factors
and can be greater than 1, so your values are normal. The computation of
back-off weights, in particular with the modified Kneser-Ney discounting
method, is explained here:
http://www.speech.sri.com/projects/srilm/manpages/pdfs/chen-goodman-tr-10-98.pdf
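
As a rough illustration of why a weight above 1 is possible, here is a
minimal Python sketch. It assumes the usual back-off normalization,
bow(h) = (1 - sum of p(w|h) over listed n-grams) / (1 - sum of p(w|h')
over the same words), where h' is the shortened history. The three-word
vocabulary, the probabilities, and the function name backoff_weight are
invented for the example; they do not come from your model or from SRILM.

```python
def backoff_weight(explicit_probs, lower_order_probs):
    """Return bow(h) for one history h (hypothetical helper, toy data only).

    explicit_probs:    {word: p(word | h)} for n-grams listed in the model.
    lower_order_probs: {word: p(word | h')} for the shortened history h'.
    """
    # Probability mass left over after the explicitly listed n-grams.
    numerator = 1.0 - sum(explicit_probs.values())
    # Lower-order mass of the words that are NOT explicitly listed.
    denominator = 1.0 - sum(lower_order_probs[w] for w in explicit_probs)
    return numerator / denominator


# Toy numbers (assumptions): word "a" is frequent overall but less likely after h.
unigram = {"a": 0.7, "b": 0.2, "c": 0.1}
bigram_after_h = {"a": 0.5}

bow = backoff_weight(bigram_after_h, unigram)

# Full conditional distribution after h: listed n-grams keep their
# probability, all other words get bow * lower-order probability.
full = {w: bigram_after_h.get(w, bow * unigram[w]) for w in unigram}

print(bow)                  # greater than 1 for these toy numbers
print(sum(full.values()))   # the distribution still sums to 1

# The value from the quoted LM line, converted from log10 to linear scale.
print(10 ** 0.1270406)      # about 1.34
```

In this toy case the listed word takes less mass after h (0.5) than it
does in the lower-order distribution (0.7), so the remaining words have
to be scaled up, which gives a weight above 1 while the distribution
still sums to 1.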

Regards,
Yannick Estève
LIUM - University of Le Mans
France

On 27 Aug 2009, at 09:21, 海龙 史 wrote:

> I am a new student user of SRILM from Asia. I used the command below
> to build an interpolated modified Kneser-Ney discounted language model:
> ~ ngram-count -read merge_counts_1994-2003.gz -gt1min 0 -gt2min 0 -gt3min 2 -kndiscount -interpolate -order 3 -vocab ChWord.lexno -lm 1994-2003_lm_all_pruned.lm
>
> However, in my model the back-off weight (bow) of several N-grams
> appears to be greater than 1. That is, in the text LM file I have a line:
> -6.457229    <s> 1635    0.1270406
> (Here we just use an index to represent a Chinese word)
> in which log10(bow) is greater than 0. We don't think a normal
> interpolated discounting method can produce an N-gram bow greater than
> 1. Besides, this happened for only a few (fewer than 5) different
> N-grams. So I am confused and would like to ask whether anyone has
> encountered this or happens to know what is wrong.
> Thank you very much!
>
> 史海龙
> Hailoon Shi
> w63,EE Dpt.Thu Univ.PRC
>
> _______________________________________________
> SRILM-User site list
> SRILM-User at speech.sri.com
> http://www.speech.sri.com/mailman/listinfo/srilm-user
