singleton counts warning

From: Solen Quiniou <solen.quiniou at ADDRESS HIDDEN>
Date: Mon, 15 Mar 2004 09:36:49 +0100

Hi!
I use SRILM to build a language model on letters. I get a warning that
I don't understand: "warning: no singleton counts
GT discounting disabled"
As a result, the computed model is wrong, since some back-off weights are
positive (in log-probability)! Do you know what this warning means? I
thought it meant that no counts were computed for single letters, but they
were, so I can't find an explanation!
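One way to investigate the warning is to look at the count-of-counts of the training data. The sketch below assumes that "singleton counts" refers to the number of distinct n-grams observed exactly once (the n1 term in Good-Turing discounting); that reading is an assumption here, not something stated in the message. The function name `count_of_counts` and the toy letter sequence are hypothetical.

```python
from collections import Counter

def count_of_counts(tokens, order=1):
    """Return a Counter mapping r -> number of distinct n-grams seen exactly r times."""
    ngrams = Counter(
        tuple(tokens[i:i + order]) for i in range(len(tokens) - order + 1)
    )
    return Counter(ngrams.values())

# Toy letter-level corpus (hypothetical): every letter occurs more than once.
letters = list("abababababcabcabc")
coc = count_of_counts(letters, order=1)

# Number of unigrams seen exactly once; Good-Turing estimates rely on this value.
n1 = coc.get(1, 0)
print("n1 =", n1)
```

With a small alphabet and a reasonably large corpus, every letter is likely to occur many times, so the number of n-grams seen exactly once can be zero even though counts for single letters were computed.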

I've got another question, about the computation of unigram
log-probabilities. When I use the formula log[P(w)] = log[c(w)] -
log[N], where c(w) is the count of word w and N is the number of word
TOKENS in the training corpus, I don't get exactly the value given by
SRILM. Is there smoothing on unigrams? And if so, how is it done?
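For reference, the unsmoothed (maximum-likelihood) estimate described by the formula can be sketched as follows. This is a minimal illustration, assuming base-10 logarithms as used in ARPA-format model files; the function name `mle_unigram_logprob` and the toy data are hypothetical, and the code does not reproduce whatever discounting SRILM applies.

```python
import math
from collections import Counter

def mle_unigram_logprob(tokens):
    """Unsmoothed unigram log10 probabilities: log P(w) = log c(w) - log N."""
    counts = Counter(tokens)
    total = len(tokens)  # N: number of tokens in the training data
    return {w: math.log10(c) - math.log10(total) for w, c in counts.items()}

# Toy corpus (hypothetical): 'a' occurs twice, 'b' once, so N = 3.
logprobs = mle_unigram_logprob(list("aab"))
for w, lp in sorted(logprobs.items()):
    print(w, round(lp, 4))
```

Comparing these values with the ones in the model file shows how far the toolkit's estimates depart from the plain relative frequencies.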

Thank you for answering.

Solen.
