<html>

  <head>

    <meta content="text/html; charset=ISO-8859-1"

      http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    <div class="moz-cite-prefix">On 7/18/2012 3:40 AM, Meng Chen wrote:<br>

    </div>

    <blockquote

cite="mid:CA+bc0mpSvZfn2GKt025zs8bR6R=0ECGmLy2myCNQ2v=7OQMnyw@mail.gmail.com"

      type="cite">Hi, I want to ask how to train an N-gram language model

      with SRILM if the corpus is very large (100GB). Should I still use

      the <b>ngram-count</b> command, or use <b>make-big-lm</b>

      instead? I also want to know whether SRILM places any limits on

      the vocabulary or size of the training corpus.

      <div>

        Thanks!</div>

    </blockquote>

    Definitely make-big-lm. Read the FAQ on handling large data. You

    are limited by computer memory, but it is not possible to give a hard

    limit; it depends on the properties of your data.<br>
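
    <br>

    As a rough sketch of the batch-counting workflow that the FAQ

    describes for large corpora, assuming the SRILM scripts are on your

    PATH: the names file-list (a text file listing the training files,

    one path per line), counts, biglm, and big.lm.gz are placeholders,

    and the order and smoothing options are only an example choice.

    Check the training-scripts man page for your SRILM version before

    relying on it.<br>

    <pre>
# Assumption: file-list already exists and names the corpus files, one per line.

# 1. Count N-grams in batches of 10 files each, reading files with /bin/cat
#    and writing the per-batch count files into the directory "counts".
make-batch-counts file-list 10 /bin/cat counts -order 3

# 2. Merge the per-batch count files into a single sorted count file.
merge-batch-counts counts

# 3. Estimate the model from the merged counts. The glob assumes that only
#    the final merged file is left in "counts" after step 2.
make-big-lm -read counts/*.ngrams.gz -name biglm -order 3 \
    -kndiscount -interpolate -lm big.lm.gz
    </pre>

    Smaller batches lower the peak memory of the counting step at the

    cost of more merge passes, so the batch size is the first thing to

    adjust if step 1 runs out of memory.<br>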

    <br>

    Andreas<br>

    <br>

  </body>

</html>