Re: Where have all the 3-grams gone?

From: Andreas Stolcke <stolcke at ADDRESS HIDDEN>
Date: Tue, 18 Mar 2003 14:43:42 PST

In message <Pine.LNX.4.44.0303182241510.28027-100000 at ADDRESS HIDDEN-muenchen.de> you wrote:
> Hi Andreas,
>
> experimenting a little with SRILM, I found that ngram-count does not enter
> trigrams into the language model, that occur only once, while it does so
> with bigrams. The command
>
> echo "the man hit the ball" | ngram-count -order 3 -text - -cdiscount3 0.5
> -cdiscount2 0.5 -cdiscount1 0.5 -unk -lm test_C3gram.lm

The default minimum counts are as follows:

1grams 1
2grams 1
3grams 2
4grams 2

You can use the -gt1min, -gt2min, etc. options to change these thresholds
at will. (Maybe counter-intuitively, given the "gt" (Good-Turing) prefix,
these options apply to all smoothing schemes.)

--Andreas
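
For illustration, here is a minimal sketch of the command from the question with the trigram threshold lowered to 1 via -gt3min, so that trigrams seen only once should be kept. It is not part of the original reply. The variable names MIN_OPTS and LM and the PATH check are assumptions added for this example, and the sed line is just one way to look at the trigram section of the resulting ARPA file.

```shell
#!/bin/sh
# Sketch: same command as in the question, plus -gt3min 1 so that
# trigrams seen only once are kept.
MIN_OPTS="-gt3min 1"   # lower the trigram minimum count from the default 2 to 1
LM=test_C3gram.lm      # output file name taken from the question

if command -v ngram-count >/dev/null 2>&1; then
    # $MIN_OPTS is left unquoted on purpose so it splits into two arguments.
    echo "the man hit the ball" | ngram-count -order 3 -text - \
        -cdiscount1 0.5 -cdiscount2 0.5 -cdiscount3 0.5 \
        $MIN_OPTS -unk -lm "$LM"
    # Print the trigram section of the resulting ARPA file for inspection.
    sed -n '/^\\3-grams:/,$p' "$LM"
else
    echo "ngram-count not found on PATH; install SRILM to run this" >&2
fi
```

For a 4-gram model the analogous option would be -gt4min 1, since the default minimum count for 4-grams is also 2.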
