Hi,
I have a few questions about the implementation of GT-discounting and
Katz backoff in ngram-count.
1. What are the default values of gtNmin and gtNmax in ngram-count?
2. Is backing off done only for ngrams that don't appear in the language
model at all, or also for ngrams that appear fewer than k>0 times (and if
so, what is this k)? If I want backing off to be done only for counts
below some k, should I set gtNmin to that value?
3. What does the following warning mean:
warning: discount coeff 4 is out of range
Does it mean that the discount for ngrams that appear only 4 times is
very small? Why is it a warning?
Thanks,
Roy.