Re: Problems about srilm

From: Andreas Stolcke <stolcke at ADDRESS HIDDEN>
Date: Thu, 19 Apr 2007 13:09:37 -1000

¬x¤j¥° wrote:
> Hello!
> I am a student from Taiwan.
> I have run into difficulties using srilm. The problem is shown in the
> attachment. When I built google n-gram models, I encountered the same
> problem. Would you please tell me what mistake I made? Thank you!
>  
It is impossible to read the entire Google 5-gram corpus into memory,
which is what you are trying to do.
You have to use the count-based LM and estimate the deleted interpolation
weights from a small amount of data, so that only a small portion of the
ngrams needs to be kept in memory.
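
To make the idea concrete, here is a minimal Python sketch of deleted
interpolation: the weights that mix the uniform, unigram and higher-order
relative-frequency estimates are fit by EM on a small held-out text. This
illustrates the general technique only and is not SRILM's own
implementation; the function names, the toy data and the fixed iteration
count are all assumptions made for the example. In a real setup only the
counts for ngrams that occur in the held-out data would need to be loaded.

```python
from collections import Counter

def ngram_counts(tokens, order):
    """Count all n-grams of length 1..order in a token list."""
    counts = Counter()
    for n in range(1, order + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts

def ml_probs(counts, total, vocab_size, history, word):
    """Component estimates for `word`: uniform, unigram, bigram, ...

    Context counts are taken from the n-gram counts themselves, which is
    a simplification that is adequate for a sketch.
    """
    probs = [1.0 / vocab_size, counts[(word,)] / total]
    for k in range(1, len(history) + 1):
        ctx = tuple(history[-k:])
        denom = counts[ctx]
        probs.append(counts[ctx + (word,)] / denom if denom else 0.0)
    return probs

def estimate_weights(counts, total, vocab_size, heldout, order, iters=50):
    """Fit one interpolation weight per component by EM on held-out text."""
    n_comp = order + 1  # uniform component plus orders 1..order
    weights = [1.0 / n_comp] * n_comp
    # Component probabilities are fixed, so compute them once per event.
    events = []
    for i in range(order - 1, len(heldout)):
        history = heldout[i - order + 1:i]
        events.append(ml_probs(counts, total, vocab_size, history, heldout[i]))
    for _ in range(iters):
        expected = [0.0] * n_comp
        for probs in events:
            # The uniform component keeps the mixture strictly positive.
            mix = sum(w * p for w, p in zip(weights, probs))
            for j in range(n_comp):
                expected[j] += weights[j] * probs[j] / mix
        weights = [e / len(events) for e in expected]
    return weights

# Toy data, purely illustrative.
train = "the cat sat on the mat the cat ate the rat".split()
heldout = "the cat sat on the rat".split()
counts = ngram_counts(train, 2)
vocab_size = len(set(train) | set(heldout))
weights = estimate_weights(counts, len(train), vocab_size, heldout, order=2)
print(weights)  # one weight each for uniform, unigram, bigram
```

Because the weights are few and are estimated from a small held-out set,
the large count tables never have to be turned into a full backoff model
in memory.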

I'm sorry there is no good documentation of this process at this point.
You can piece it together by reading the manual pages for ngram-count
and ngram, and by looking at the example in

$SRILM/test/tests/ngram-count-lm-limit-vocab/run-test

We will make complete instructions for google ngram usage available in
the future.

Andreas

> --
> Chaoyang University of Technology
> WebMail http://webmail.cyut.edu.tw
