FW: A simple question about SRILM

From: "Roy Bar Haim" <barhaim at ADDRESS HIDDEN>
Date: Mon, 17 May 2004 20:25:52 +0200

Hi,

I have the same problem. I want the LM to give maximum-likelihood estimates.
That is, all the backoff weights should be zero.

I applied the solution below, but I still get backoff weights.

For example, when I build the LM like this:
ngram-count -order 3 -gt1max 0 -gt2max 0 -gt3max 0 -text corpus.tags -lm corpus.tags.lm

I found that the once-occurring trigrams DO NOT APPEAR in the LM, so probability mass is still discounted.

When I turned on the debug messages, I saw many messages like:
warning: 0 backoff probability mass left for "AT SCLN" -- incrementing denominator

Does it mean that smoothing is enforced here?

Is there a way to get a pure maximum-likelihood language model, without backoff weights at all, using ngram-count?
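
To make clear what I mean by "pure maximum likelihood", here is a minimal Python sketch (not SRILM code). It assumes the estimate is simply p(w3 | w1 w2) = c(w1 w2 w3) / c(w1 w2 *), computed from trigram counts with no sentence-boundary padding. The function names and the toy tag sequences are made up for illustration; only the tags AT and SCLN come from my real output. Under this definition the seen trigrams use up all the probability mass of each context, so nothing is left for backoff, which may be the situation the warning above is reporting.

```python
from collections import Counter, defaultdict


def ml_trigram_model(sentences):
    """Return {(w1, w2): {w3: p}} with p = c(w1 w2 w3) / c(w1 w2 *).

    Illustrative only: no discounting, no backoff, no <s>/</s> padding.
    """
    trigram_counts = Counter()
    for tokens in sentences:
        for i in range(len(tokens) - 2):
            trigram_counts[tuple(tokens[i:i + 3])] += 1

    # Total count of each two-word context, summed over all continuations.
    context_totals = Counter()
    for (w1, w2, _w3), count in trigram_counts.items():
        context_totals[(w1, w2)] += count

    model = defaultdict(dict)
    for (w1, w2, w3), count in trigram_counts.items():
        model[(w1, w2)][w3] = count / context_totals[(w1, w2)]
    return model


def leftover_mass(model, context):
    """Probability mass not assigned to seen continuations of `context`."""
    return 1.0 - sum(model[context].values())


if __name__ == "__main__":
    # Hypothetical toy corpus of tag sequences.
    corpus = [["AT", "NN", "SCLN"], ["AT", "NN", "VB"], ["AT", "NN", "VB"]]
    model = ml_trigram_model(corpus)
    print(model[("AT", "NN")])
    print(leftover_mass(model, ("AT", "NN")))
```

Note that the trigram seen only once ("AT NN SCLN" in the toy data) keeps its full relative frequency here, which is what I would expect to find in the LM file as well.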

Thanks,
Roy.
> -----Original Message-----
> From: owner-srilm-user at ADDRESS HIDDEN
> [mailto:owner-srilm-user at ADDRESS HIDDEN] On Behalf Of Andreas Stolcke
> Sent: Tuesday, April 06, 2004 6:34 PM
> To: David Picó
> Cc: srilm-user at ADDRESS HIDDEN; Jorge González
> Subject: Re: A simple question about SRILM
>
>
>
> The ngram-count man page says
>
>        -gtnmax count
>               where n is 1, 2, 3, 4, 5, 6, 7, 8, or 9.  Set the
>               maximal count of N-grams of order n that are
>               discounted under Good-Turing.  All N-grams more
>               frequent than that will receive maximum likelihood
>               estimates.  Discounting can be effectively disabled
>               by setting this to 0.
>
> Therefore, you can disable smoothing with
>
> ngram-count -gt1max 0 -gt2max 0 -gt3max 0 ...
>
> --Andreas
>
> In message <40726957.3070101 at ADDRESS HIDDEN> you wrote:
> > Hello,
> >
> > I also have a little question about SRILM. How can I infer a trigram
> > (or bigram, or tetragram...) with no smoothing at all? I need to do some
> > experiments to check the effect of n-gram smoothing in my models and I
> > need a pure trigram with no probability mass derived to lower levels. Is
> > this possible in SRILM? I need to be sure that I really get a trigram
> > (with the whole trigram probabilities).
> >
> > Thank you very much in advance for your help and attention! David
> >
> > --
> > David Picó-Vila
> > Universitat Politècnica de València
> > Departament de Sistemes Informàtics i Computació
> > València, Spain
> >
>
>
