
Re: Help

From: Andreas Stolcke <stolcke at ADDRESS HIDDEN>
Date: Wed, 24 Jul 2002 09:01:30 PDT

Dimitris,

your email to the list did not go through because at the time you
sent it you were not subscribed to the list (to prevent spam we
only allow list members to post).

Regarding your question: indeed the perplexity of the mixed LM should
be much closer to what compute-best-mix outputs.

There are two ways to create an interpolated model:

"on-the-fly": this is the traditional approach. You keep the component
models separate and compute the interpolated probabilities
when you evaluate the model.

The command for this is

ngram -lm ... -mix-lm ... -lambda L -bayes 0

"merged": you create a single static model that implements an
approximation to the on-the-fly method.

The command for this is

ngram -lm ... -mix-lm ... -lambda L

(no -bayes option).
The -write-lm option outputs the merged model if desired.

In the "merged" case you only get an approximation because in general
it is not possible to create a single back-off model that exactly
implements the mixed probabilities of the two component models (without
expanding out all possible N-grams and effectively bypassing the
backoff mechanism).
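
To make the on-the-fly case concrete, here is a minimal sketch of linear
interpolation at the word level, followed by the perplexity of the mixed
probabilities. The probability values, the variable names, and the reading of
lam as the weight of the first model are assumptions for illustration only.
This is not SRILM code, and it ignores details such as end-of-sentence tokens
and OOV handling.

```python
import math

def mix_prob(p1, p2, lam):
    """Linearly interpolate two conditional word probabilities.

    lam is the weight of the first model, (1 - lam) of the second.
    """
    return lam * p1 + (1.0 - lam) * p2

def perplexity(probs):
    """Perplexity of a sequence of per-word probabilities."""
    log_sum = sum(math.log(p) for p in probs)
    return math.exp(-log_sum / len(probs))

# Hypothetical per-word probabilities assigned by two component models
# to the same four held-out words (made-up numbers).
p_word_lm = [0.10, 0.02, 0.30, 0.05]
p_class_lm = [0.05, 0.08, 0.10, 0.04]
lam = 0.85

# On-the-fly interpolation: mix the two probabilities word by word.
mixed = [mix_prob(a, b, lam) for a, b in zip(p_word_lm, p_class_lm)]
ppl_mixed = perplexity(mixed)
```

A merged back-off model stores one set of N-gram probabilities and back-off
weights instead, so in general it can only approximate these per-word mixtures.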

As explained in the ICSLP paper, the "merged" approach is usually slightly
better than the traditional interpolation. However, it only works if
you have two models of the same type (both word-based or both class-based).
When you merge a word-based and a class-based model the approximation
doesn't work anymore. I suspect that's what you did in your experiment.
Rerun ngram with the -bayes 0 option and see if you get the perplexity
you expect.
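
The compute-best-mix output quoted below reports an iteration number, a pair
of weights, and a perplexity, which is what an iterative (EM-style) estimate
of the mixture weight would produce. The following is a minimal sketch of such
an estimate for two models, assuming per-word probabilities from each model on
the same held-out text. The function name and its parameters are hypothetical,
and this is not the actual compute-best-mix implementation.

```python
def best_mix(p1, p2, lam=0.5, iterations=50, tol=1e-6):
    """Estimate the weight of model 1 in a two-way linear mixture by EM.

    p1, p2: per-word probabilities from the two models on the same text.
    Returns the estimated weight lam of model 1 (model 2 gets 1 - lam).
    """
    for _ in range(iterations):
        # E-step: posterior probability that each word came from model 1.
        post = [lam * a / (lam * a + (1.0 - lam) * b) for a, b in zip(p1, p2)]
        # M-step: the new weight is the average posterior.
        new_lam = sum(post) / len(post)
        converged = abs(new_lam - lam) < tol
        lam = new_lam
        if converged:
            break
    return lam
```

The perplexity reported alongside such weights is that of the exact word-level
mixture, so it is the on-the-fly (-bayes 0) perplexity that should be compared
against it.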

--Andreas

In message <002101c2332c$9d6cb540$7b081b93 at ADDRESS HIDDEN> you wrote:
> Hi Andreas,
>
> Five days ago I sent an e-mail to srilm-user at ADDRESS HIDDEN
> and I still haven't received an answer.
> I repeat it here. Please inform me...
>
> Hi
>
> I interpolate a 3-gram with a class 3-gram
>
> The output of compute-best-mix is:
> compute-best-mix debug2-LM1 debug2-LM2
> iteration 19, lambda = (0.849536 0.150464), ppl = 150.787
>
> The PP of the interpolated model on the held-out data I used to produce
> debug2-LM1 and debug2-LM2 is 169.52
>
> Shouldn't this be the same as the output of compute-best-mix, i.e. 150.787?
> Am I doing something wrong?
>
> Regards,
> Dimitris
>
>
