question about vocabulary

From: lavecchia <Caroline.Lavecchia at ADDRESS HIDDEN>
Date: Tue, 04 May 2004 16:18:11 +0200

Hello everybody,

I would like to know whether the SRILM toolkit can generate a vocabulary
restricted to the most frequent words of a corpus, for example the 20000
most frequent words.

I know that the -write-vocab option of the ngram-count program writes out
a vocabulary, but it contains every word of the corpus.
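
To make the request concrete, here is a minimal Python sketch of the
selection I have in mind. It is not an SRILM feature: the function name
top_k_vocab and the whitespace tokenization are assumptions made only for
this illustration.

```python
from collections import Counter


def top_k_vocab(lines, k):
    """Return the k most frequent whitespace-separated tokens in lines."""
    counts = Counter()
    for line in lines:
        # Assumption: the corpus is already tokenized, one sentence per line.
        counts.update(line.split())
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:k]]


if __name__ == "__main__":
    sample = ["the cat sat on the mat", "the dog sat"]
    # Print one word per line, most frequent first.
    for word in top_k_vocab(sample, 2):
        print(word)
```

For the real corpus the call would be top_k_vocab(open_corpus_file, 20000),
with the result written one word per line. As far as I understand, that is
the kind of word list the -vocab option of ngram-count reads.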

Thanks in advance, and sorry for my bad English,

Caroline L.
