Re: class based SRI LM

From: Yang Liu <yangl at ADDRESS HIDDEN>
Date: Mon, 23 Jun 2003 15:59:05 -0500 (EST)

Hi June,
After you get the automatically induced classes (the class definition in file
text.cls), you can map all the words in your training set to classes using:
replace-words-with-classes classes=text.cls training_set > training_set_classes
Then you can train a class-based LM of any order from that.
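
As a rough sketch of those two steps for a trigram model: the file names text.cls and training_set come from this thread, while the output name class3.lm.gz, the helper function name, and the choice of -order 3 with default smoothing are illustrative assumptions rather than settings anyone here has confirmed.

```shell
#!/bin/sh
# Sketch only: map words to classes, then train a class trigram LM.
# build_class_trigram and class3.lm.gz are hypothetical names.
set -eu

build_class_trigram() {
    classes=$1   # class definitions written by ngram-class (text.cls)
    train=$2     # original word-level training text
    out=$3       # where to write the class-level LM

    # Step 1: replace every word with its class label.
    replace-words-with-classes classes="$classes" "$train" > "$train.classes"

    # Step 2: count class n-grams up to order 3 and estimate the LM.
    # Smoothing is left at the ngram-count default (an assumption; pick
    # whichever discounting works for your class counts).
    ngram-count -order 3 -text "$train.classes" -lm "$out"
}

# Only run when SRILM is on PATH and the class file exists.
if command -v ngram-count >/dev/null 2>&1 && [ -f text.cls ]; then
    build_class_trigram text.cls training_set class3.lm.gz
fi
```

The resulting model is an N-gram over class labels, so to score word-level text you would presumably pass the same class definitions to ngram with its -classes option.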

Hope this helps.
-- Yang

>Hi,
>
>   I tried to build class based LMs in the following way:
>
>   step-1:  ngram-class -text test.in -numclasses 100 -class-counts text.cnt -classes text.cls -save 100
>
>   step-2:  ngram-count -read text.cnt -memuse -kndiscount -kndiscount1 -kndiscount2 -lm text.srilm.gz
>
>   I found that the class count output "text.cnt" from step-1 is only
>   bigram-counts.  Thus the final class-LM text.srilm.gz is also a bigram one.
>
>   Could anyone tell me if I am using the toolkit correctly?  How to build a
>   trigram class-based LM?  Also are there any published paper/document that I
>   can look up for detail information?
>
>   Many thanks,
>
>-June
