error in discount estimator for order 3

From: "Rebecca Madsen" <rmadsen at ADDRESS HIDDEN>
Date: Thu, 3 Aug 2006 15:02:46 -0600

Is there a reason why duplicating my data would give me the following error:

using ModKneserNey for 3-grams
Kneser-Ney smoothing 3-grams
n1 = 0
n2 = 94762
n3 = 0
n4 = 37773
one of required modified KneserNey count-of-counts is zero
error in discount estimator for order 3

I can build a language model using the following command line with the
normal data, but concatenating two copies of the data together gives
me the discount estimator error.

$ /home/tools/srilm/bin/i686/ngram-count -text my_data_doubled.txt \
    -interpolate -kndiscount1 -kndiscount2 -kndiscount3 \
    -lm my_data_doubled.lm
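
The zero values for n1 and n3 in the log are what one would expect if every 3-gram count in the doubled file is even: concatenating two copies of a corpus doubles each n-gram count, so no 3-gram can occur exactly once or exactly three times. The sketch below illustrates this on a toy corpus. It is not SRILM's code. The function names and the toy sentences are made up for illustration, the discount formulas are the standard modified Kneser-Ney estimates from Chen and Goodman, and the zero check is an assumption modeled on the error message above.

```python
from collections import Counter


def trigram_counts(sentences):
    """Count 3-grams within each sentence (one sentence per line)."""
    counts = Counter()
    for line in sentences:
        tokens = line.split()
        counts.update(zip(tokens, tokens[1:], tokens[2:]))
    return counts


def count_of_counts(counts, max_r=4):
    """Return {r: number of distinct 3-grams seen exactly r times}."""
    hist = Counter(counts.values())
    return {r: hist.get(r, 0) for r in range(1, max_r + 1)}


def modified_kn_discounts(n):
    """Modified Kneser-Ney discounts (D1, D2, D3+) from n1..n4.

    Assumption: like the error in the log, refuse to estimate when any
    of n1..n4 is zero, since the formulas divide by n1, n2 and n3.
    """
    if any(n[r] == 0 for r in (1, 2, 3, 4)):
        raise ValueError("a required count-of-counts is zero")
    y = n[1] / (n[1] + 2 * n[2])
    d1 = 1 - 2 * y * n[2] / n[1]
    d2 = 2 - 3 * y * n[3] / n[2]
    d3_plus = 3 - 4 * y * n[4] / n[3]
    return d1, d2, d3_plus


# Hypothetical toy corpus; any text behaves the same way when doubled.
toy = ["a b c d", "a b c e", "b c d f"]
single = count_of_counts(trigram_counts(toy))
doubled = count_of_counts(trigram_counts(toy + toy))

print("single :", single)
print("doubled:", doubled)
```

In the doubled corpus the old n1 becomes the new n2 and the old n2 becomes the new n4, while n1 and n3 drop to zero, so the discount estimate has nothing to divide by.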

Thanks for your help,
Rebecca
