
Re: Interpolation vs ngram-merge

From: ilya oparin <ioparin at ADDRESS HIDDEN>
Date: Mon, 11 Jun 2007 20:11:36 +0100 (BST)

I have experience training LMs on huge data sets
(hundreds of millions of word forms). If this is the case
for you, it can actually be more efficient (or even the
only feasible option) to interpolate separately trained
LMs than to merge the counts and train a single model,
because of the time and memory costs.
Moreover, interpolation lets you give the models different
weights and tune them according to perplexity on some
test data, if the "target speech" for recognition is
already known.
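
To make the weight tuning concrete, here is a minimal sketch of linear interpolation of two models and an EM estimate of the mixture weight that lowers perplexity on held-out data. It is not SRILM code: the function names and the per-word probabilities are made up for illustration, and it assumes you have already obtained, for each held-out word, the probability each model assigns to it.

```python
import math


def interpolate(p1, p2, lam):
    """Linear interpolation of two probabilities: lam * p1 + (1 - lam) * p2."""
    return lam * p1 + (1.0 - lam) * p2


def perplexity(probs1, probs2, lam):
    """Perplexity of the interpolated model over the held-out words."""
    log_sum = sum(math.log(interpolate(a, b, lam)) for a, b in zip(probs1, probs2))
    return math.exp(-log_sum / len(probs1))


def tune_weight(probs1, probs2, lam=0.5, iterations=50):
    """EM estimate of the weight of model 1 that maximizes held-out likelihood."""
    for _ in range(iterations):
        # E-step: posterior probability that model 1 "generated" each word.
        posteriors = [lam * a / interpolate(a, b, lam) for a, b in zip(probs1, probs2)]
        # M-step: the new weight is the average posterior.
        lam = sum(posteriors) / len(posteriors)
    return lam


# Hypothetical per-word probabilities from two LMs on the same held-out text.
probs_a = [0.20, 0.05, 0.10, 0.30, 0.02]
probs_b = [0.01, 0.15, 0.08, 0.02, 0.25]

best = tune_weight(probs_a, probs_b)
print("weight of model A:", best)
print("perplexity at 0.5:", perplexity(probs_a, probs_b, 0.5))
print("perplexity at tuned weight:", perplexity(probs_a, probs_b, best))
```

Each EM iteration cannot decrease the held-out likelihood, so the tuned weight gives a perplexity no worse than the starting value of 0.5. With merged counts there is no such knob: each corpus contributes in proportion to its size.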

--- marco turchi <marco.turchi at ADDRESS HIDDEN> wrote:

> Dear experts,
> I have a question for you.
> I have two datasets, and I want to construct an LM
> that covers both of them.
> SRILM provides two different paths:
> 1) create two different LMs and then interpolate
> them
> 2) count the n-grams for each dataset, merge these
> counts using
> ngram-merge, and finally build the LM.
> What are the differences between these methods?
> Can you suggest a paper or book where I can
> read about these differences?
>
> Thanks a lot
> Marco
>

best regards,
Ilya

