For ngram backoff you distribute the probability mass left over by
ngrams of order k in proportion to the probabilities given by ngrams of order k-1.
What the error message is saying is that the (k-1)-grams don't assign any
probability to the words that don't already have k-grams. This can happen
especially when you disable smoothing, as you did.
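
To make the numerator and denominator in the warning concrete, here is a
minimal sketch of the backoff-weight computation in Python. It is not SRILM's
actual code: the function name, the word labels x/y/z, and all probabilities
are made up for illustration, chosen only so that the numerator matches the
0.0909091 in the message. It also simplifies by treating a word missing from
the lower-order table as having probability 0.

```python
def backoff_weight(higher_probs, lower_probs):
    """Return (numerator, denominator, bow) for a single context.

    higher_probs: P(w | context) for words w that have an explicit k-gram.
    lower_probs:  P(w | shortened context) from the (k-1)-gram model.
    Both tables are hypothetical inputs, not SRILM data structures.
    """
    # Mass the k-grams leave over for words they do not cover.
    numerator = 1.0 - sum(higher_probs.values())
    # Mass the (k-1)-grams assign to those same uncovered words.
    denominator = 1.0 - sum(lower_probs.get(w, 0.0) for w in higher_probs)
    if denominator <= 1e-12:
        # Nothing to scale: the lower order gives the uncovered words no mass.
        return numerator, denominator, None
    return numerator, denominator, numerator / denominator


# Made-up trigram estimates: two trigrams kept, 1/11 of the mass left over.
trigram_probs = {"x": 6 / 11, "y": 4 / 11}

# Failing case: the bigrams cover only the words the trigrams already cover.
print(backoff_weight(trigram_probs, {"x": 0.5, "y": 0.5}))

# Healthy case: the bigrams also give mass to an uncovered word "z".
print(backoff_weight(trigram_probs, {"x": 0.5, "y": 0.3, "z": 0.2}))
```

In the first call the left-over mass is positive but the denominator is zero,
which is the situation the warning describes; in the second the weight is
well defined.
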
The problem should go away if you include all trigrams from your training
data. The default minimum count for trigrams is 2, so you need to use
-gt3min 1 in addition to the options you have.
--Andreas
In message <20030419045423.33794.qmail at ADDRESS HIDDEN> you wrote:
>
> I encountered the following problem reported from ngram-count:
>
>   BOW denominator for context "D SMALL" is 0 <= 0,numerator is 0.0909091
>
> The switches I invoked is:
>
>   zcat EN.count.1.gz EN.count.2.gz EN.count.3.gz \
>     | perl -pe 's/<UNK>/<unk>/g' \
>     | ./bin/ngram-count -memuse -read - -vocab ML.vocab -order 3 \
>         -cdiscount3 0 -cdiscount2 0 -cdiscount1 0 -unk -lm - \
>     | ./bin/add-dummy-bows - \
>     | perl -pe 's/<unk>/<UNK>/g' \
>     | gzip >! EN.arpabo.3.gz
>
> Could someone help me to get rid of that warning msg?
>
> Thanks, June
>
>