<font color='black' size='2' face='arial'>
<div style="font-family:arial,helvetica;font-size:10pt;color:black">

<div id="AOLMsgPart_1_b384c7bd-1351-4135-918d-8a9739b8ff5e">
<font color="black" size="2" face="arial"><br>

<br>



<div style="color: black; font-family: arial; font-size: 10pt; clear: both;">Thanks!</div>



<div style="color: black; font-family: arial, helvetica; font-size: 10pt;"><br>

</div>



<div>

<div id="AOLMsgPart_1_25179108-7919-487a-8660-3a99d246e8ac">

<div class="aolReplacedBody" bgcolor="#FFFFFF" text="#000000" style="color: black; font-family: arial, helvetica; font-size: 10pt;"><font size="2"><font face="arial">>1) Make sure you're building 64-bit executables.   If "file </font></font><font color="black" face="arial" size="2"><font face="arial"><font face="arial, helvetica">bin/i686/ngram-count" says that it's
          a 32-bit executable, do a "make clean" and rebuild with "make
          MACHINE_TYPE=i686-m64  ..." .<br>


          <br>

This worked, though I had to use "make OPTION=_l". N-grams no longer get positive log probabilities.
          </font></font></font></div>



<div class="aolReplacedBody" bgcolor="#FFFFFF" text="#000000" style="color: black; font-family: arial, helvetica; font-size: 10pt;"><font color="black" face="arial" size="2"><font face="arial"><font face="arial, helvetica"><br>

</font></font></font></div>



<div class="aolReplacedBody" bgcolor="#FFFFFF" text="#000000" style="color: black; font-family: arial, helvetica; font-size: 10pt;"><font color="black" face="arial" size="2"><font face="arial"><font face="arial, helvetica">But when I run the command below:</font></font></font></div>



<div class="aolReplacedBody" bgcolor="#FFFFFF" text="#000000" style="color: black; font-family: arial, helvetica; font-size: 10pt;"><font color="black" face="arial" size="2"><font face="arial"><font face="arial, helvetica"><br>

</font></font></font></div>



<div class="aolReplacedBody" bgcolor="#FFFFFF" text="#000000"><font size="2"><font><font face="arial, helvetica">bin/i686_l/ngram-count -order 1 -vocab wordList -read ngramCounts -lm ug.lm -wbdiscount1</font></font></font></div>



<div class="aolReplacedBody" bgcolor="#FFFFFF" text="#000000" style="color: black; font-family: arial, helvetica; font-size: 10pt;"><font color="black" face="arial" size="2"><font face="arial"><font face="arial, helvetica"><br>

</font></font></font></div>



<div class="aolReplacedBody" bgcolor="#FFFFFF" text="#000000" style="color: black; font-family: arial, helvetica; font-size: 10pt;"><font color="black" face="arial" size="2"><font face="arial"><font face="arial, helvetica">Memory usage is low (~5 MB), but CPU usage is in the high 90s. I tried your suggestion to scale down the data: with just 100 unigrams, the *.lm file was created within minutes. </font></font></font></div>



<div class="aolReplacedBody" bgcolor="#FFFFFF" text="#000000" style="color: black; font-family: arial, helvetica; font-size: 10pt;"><font color="black" face="arial" size="2"><font face="arial"><font face="arial, helvetica"><br>

</font></font></font></div>



<div class="aolReplacedBody" bgcolor="#FFFFFF" text="#000000" style="color: black; font-family: arial, helvetica; font-size: 10pt;"><font color="black" face="arial" size="2"><font face="arial"><font face="arial, helvetica">For the complete data, using -wbdiscount took about 2 hours. </font></font></font></div>




</div>





</div>

</font>
</div>

</div>
</font>