singleton counts warning

Solen Quiniou solen.quiniou at irisa.fr
Mon Mar 15 00:36:49 PST 2004


Hi!
I am using SRILM to build a language model over letters. I get a warning
that I don't understand: "warning: no singleton counts
GT discounting disabled"
As a result, the computed model is wrong, since some back-off weights are
positive (in log-probability)! Do you know what this warning means? I
thought it meant that no counts for single letters had been computed, but
they were, so I can't find an explanation.

I have another question, about the computation of unigram
log-probabilities. When I use the formula log[P(w)] = log[c(w)] -
log[N], where N is the number of word TOKENS in the training corpus, I
don't get exactly the value given by SRILM. Is there smoothing on
unigrams? If so, how is it done?
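
For reference, the unsmoothed estimate from the formula above can be sketched in Python as follows. This is only a minimal illustration, not SRILM's computation: the function name unigram_log10_mle and the toy letter corpus are made up, and base-10 logarithms are used on the assumption that the values are being compared with those in an ARPA-format model file.

```python
import math
from collections import Counter


def unigram_log10_mle(tokens):
    """Return {token: log10 P(token)} using plain relative frequencies.

    Implements log[P(w)] = log[c(w)] - log[N] with no smoothing,
    where N is the total number of tokens.
    """
    counts = Counter(tokens)
    total = sum(counts.values())  # N: number of tokens in the corpus
    return {w: math.log10(c) - math.log10(total) for w, c in counts.items()}


# Hypothetical toy corpus of letters, for illustration only.
tokens = list("abracadabra")
logprobs = unigram_log10_mle(tokens)

for letter in sorted(logprobs):
    print(letter, round(logprobs[letter], 4))
```

If the values in the model file differ from these, the difference comes from something other than the plain relative frequency, which is what the question is about.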

Thank you for answering.

Solen.




