<html>

  <head>

    <meta content="text/html; charset=ISO-8859-1"

      http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    On 5/14/2012 4:41 AM, mvp-songyoung wrote:

    <blockquote

      cite="mid:660b26a4.15837.1374a82bf15.Coremail.mvp-songyoung@163.com"

      type="cite">

      <div style="line-height: 1.7; color: rgb(0, 0, 0); font-size:

        14px; font-family: arial;">

        <div style="color: rgb(0, 0, 0); line-height: 1.7; font-family:

          arial; font-size: 14px;">

          <div style="color: rgb(0, 0, 0); line-height: 1.7;

            font-family: arial; font-size: 14px;">

            <div style="color: rgb(0, 0, 0); line-height: 1.7;

              font-family: arial; font-size: 14px;">

              <div style="color: rgb(0, 0, 0); line-height: 1.7;

                font-family: arial; font-size: 14px;">

                <div style="color: rgb(0, 0, 0); line-height: 1.7;

                  font-family: arial; font-size: 14px;">

                  <div style="color: rgb(0, 0, 0); line-height: 1.7;

                    font-family: arial; font-size: 14px;">

                    <div style="color: rgb(0, 0, 0); line-height: 1.7;

                      font-family: arial; font-size: 14px;">

                      <div style="color: rgb(0, 0, 0); line-height: 1.7;

                        font-family: arial; font-size: 14px;">

                        <div>Hi, I have run into a problem when rescoring
                          lattices with an interpolated class-based LM
                          using lattice-tool. This class-based LM was built
                          by interpolating three different
                          class-based LMs: LM1 contains 3500 words
                          merged into 350 classes; LM2 contains 2500 words
                          merged into 250 classes; LM3 contains 110
                          words merged into 10 classes. I
                          renamed the class definitions of the three
                          class-based LMs before training and
                          interpolating them, and I also merged the class
                          definitions into a single file before decoding.
                          My decoding command is as follows:</div>

                        <div> </div>

                        <div>lattice-tool -read-htk -viterbi-decode
                          -order 4 -lm class-4gram.lm -classes
                          &lt;class&gt; -in-lattice-list lattice.scp
                          -htk-wdpenalty $PENALTY -htk-lmscale $LMSCALE</div>

                        <div> </div>

                        <div>However, I found that decoding was
                          very slow and consumed a lot of memory. <font
                            face="Trebuchet MS">Why does this happen,
                            and how can I deal with it? Did I
                            do any of the steps incorrectly? If so,
                            what are the right steps? Thank you.</font></div>


                      </div>

                    </div>

                  </div>

                </div>

              </div>

            </div>

          </div>

        </div>

      </div>

    </blockquote>

    <br>

    The -classes option leads to an LM that no longer uses only a finite

    history to evaluate the probability of the next word.  This means

    that during lattice expansion all histories need to be kept

    distinct.   You should try the -simple-classes option, assuming your

    models satisfy its requirements:<br>

    <blockquote type="cite"><b><dt><b>-classes</b><i> file</i>

        </dt>

        <dd>

          Interpret the LM as an N-gram over word classes.

          The expansions of the classes are given in

          <i>file</i>

          in <a

href="http://www.speech.sri.com/projects/srilm/manpages/classes-format.5.html">classes-format(5)</a>.

          Tokens in the LM that are not defined as classes in

          <i> file </i>

          are assumed to be plain words, so that the LM can contain

          mixed N-grams over

          both words and word classes.

        </dd>

        <dt><b>-simple-classes</b>

        </dt>

        <dd>

          Assume a "simple" class model: each word is member of at most

          one word class,

          and class expansions are exactly one word long.

        </dd>

      </b></blockquote>

    <br>
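    As a rough aid, the two -simple-classes requirements quoted above can be
    checked on the merged class file before rerunning lattice-tool. The
    following is a minimal Python sketch, not part of SRILM. The function
    name check_simple_classes is made up for illustration, and the sketch
    assumes that every non-empty line has the classes-format(5) layout
    "CLASS [prob] word1 word2 ..." and that no real word looks like a number
    (otherwise the optional probability field cannot be told apart from a
    word).<br>
    <pre>
```python
from collections import defaultdict


def _is_probability(token):
    """Return True if token parses as a float (the optional probability field)."""
    try:
        float(token)
        return True
    except ValueError:
        return False


def check_simple_classes(lines):
    """Return a list of violations of the two -simple-classes requirements.

    Assumption: each non-empty line reads "CLASS [prob] word1 word2 ...",
    and no real word looks like a number.
    """
    problems = []
    classes_of = defaultdict(set)  # maps each word to the classes expanding to it
    for lineno, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        cls, rest = fields[0], fields[1:]
        # Drop the optional probability, but only if something follows it.
        if rest[1:] and _is_probability(rest[0]):
            rest = rest[1:]
        # Requirement: class expansions are exactly one word long.
        if len(rest) != 1:
            problems.append(f"line {lineno}: class {cls} expands to {len(rest)} words")
            continue
        classes_of[rest[0]].add(cls)
    # Requirement: each word is a member of at most one class.
    for word, classes in sorted(classes_of.items()):
        if len(classes) != 1:
            problems.append(f"word {word} belongs to several classes: {sorted(classes)}")
    return problems
```
    </pre>
    Calling check_simple_classes(open("merged.classes")), where
    merged.classes is a hypothetical name for the merged class file, returns
    an empty list when neither requirement is violated. In that case it
    should be reasonable to add -simple-classes to the lattice-tool command
    while keeping the -classes argument.<br>
    <br>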

    Hope this helps,<br>

    <br>

    Andreas<br>

    <br>

    <br>

    <br>

  </body>

</html>