<html>

  <head>

    <meta content="text/html; charset=ISO-8859-1"

      http-equiv="Content-Type">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

    <div class="moz-cite-prefix">On 6/13/2013 8:23 AM, Meng CHEN wrote:<br>

    </div>

    <blockquote

      cite="mid:lqi37x5f0utvpqjqx8gpy8kv.1371135103438@email.android.com"

      type="cite">Hi, in make-big-lm command, it specifies

      -read-with-mincounts and -meta-tag by default. In the help page,

      it says "if -meta-tag is defined, these low-count N-grams will be

      converted to count-of-count N-grams, so that smoothing methods

      that need this information still work correctly". However, for

      wbdiscount, we don't need the count-of-count information to compute

      the discounting parameters. So, why does make-big-lm specify

      -meta-tag option for wbdiscount by default? Is that necessary? Can

      I remove it? (I tried that, and found that the n-grams in the model

      are the same, but the probabilities are different.)<br>

      Thanks!<br>

    </blockquote>

    <br>

    WB discounting requires the count of the distinct word types

    following each context.  That information can also be obtained from

    the meta-counts, which is why you're getting different results

    without -meta-tag.<br>
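    <br>
    A toy sketch of the effect, assuming the textbook Witten-Bell estimate
    count / (total + types) for a single context. This is not SRILM's
    actual code; the function name, the dropped_types and dropped_total
    parameters, and the counts are all made up for illustration:<br>

```python
# Illustrative only: textbook Witten-Bell for one context, not SRILM code.
def witten_bell_probs(counts, dropped_types=0, dropped_total=0):
    """Return P(word | context) for the words present in `counts`.

    dropped_types and dropped_total are hypothetical stand-ins for what
    meta-counts could preserve about n-grams removed by a minimum count.
    """
    total = sum(counts.values()) + dropped_total   # tokens seen after the context
    types = len(counts) + dropped_types            # distinct words after the context
    denom = total + types
    return {word: count / denom for word, count in counts.items()}


# All counts observed after some context.
full = {"a": 5, "b": 3, "c": 1, "d": 1}
# The same context after singletons were dropped by a minimum count of 2.
kept = {"a": 5, "b": 3}

p_full = witten_bell_probs(full)
p_pruned = witten_bell_probs(kept)                                    # type information lost
p_meta = witten_bell_probs(kept, dropped_types=2, dropped_total=2)    # type information kept

print(p_full["a"], p_pruned["a"], p_meta["a"])
```

    The retained n-grams are the same in both pruned cases, but "a" gets
    5/10 when the dropped types are forgotten and 5/14 when they are still
    counted, which matches the full-count estimate.<br>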

    <br>

    BTW, I should update the man page to say that WB discounting is also

    supported in make-big-lm.<br>

    <br>

    Andreas<br>

    <br>

    <blockquote

      cite="mid:lqi37x5f0utvpqjqx8gpy8kv.1371135103438@email.android.com"

      type="cite"><br>

      <br>

      Meng CHEN<br>

      <br>

      <br>

      <br>

      Sent from Meizu MX<br>


    </blockquote>

    <br>

  </body>

</html>