class based SRI LM

From: June July <julyjune03 at ADDRESS HIDDEN>
Date: Mon, 23 Jun 2003 10:45:44 -0700 (PDT)

Hi,

   I tried to build a class-based LM in the following way:

   step-1:  ngram-class -text test.in -numclasses 100 -class-counts text.cnt -classes text.cls -save 100

   step-2:  ngram-count -read text.cnt -memuse -kndiscount -kndiscount1 -kndiscount2 -lm text.srilm.gz

   I found that the class-count output "text.cnt" from step-1 contains only bigram counts, so the final class LM, text.srilm.gz, is also a bigram model.
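
   One way to confirm which n-gram orders a counts file contains is to tally the entries by length. The following is only a minimal sketch: it assumes each line of the counts file holds the n-gram tokens followed by a count as the last whitespace-separated field, and the helper name `ngram_orders` and the class labels in the sample are hypothetical, not taken from the toolkit.

```python
from collections import Counter
from typing import Iterable


def ngram_orders(lines: Iterable[str]) -> Counter:
    """Tally how many count entries exist for each n-gram order.

    Assumes each line is "<token_1> ... <token_n> <count>", with the
    count as the final whitespace-separated field.
    """
    orders = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        # Everything before the trailing count field is the n-gram itself.
        orders[len(fields) - 1] += 1
    return orders


# Hypothetical sample in the assumed format (class labels are made up).
sample = [
    "CLASS-00001 CLASS-00002\t12",
    "<s> CLASS-00001\t7",
    "CLASS-00002 </s>\t5",
]
orders = ngram_orders(sample)
max_order = max(orders)  # highest n-gram order present in the sample
```

   To run the same check on a real file, pass an open file handle, for example `ngram_orders(open("text.cnt"))`. If the highest order reported is 2, the counts file holds nothing above bigrams.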

   Could anyone tell me if I am using the toolkit correctly?  How do I build a trigram class-based LM?  Also, are there any published papers or documents that I can consult for detailed information?

   Many thanks,

-June

