<div dir="ltr">Hi<br>Thanks for your reply.<br>I'm trying to use the ngram program to compute perplexity for several files in a directory. As you suggested, I'm writing a simple shell script for that. ngram prints a lot of output, but I only need the perplexity as a single number, so that I can collect those numbers in a loop over every model and then compare them. Something like this:<br>


<br>for j in $models<br>do <br>echo model: $j<br>ngram -lm $j -ppl $i <br>done<br><br>How can I adjust ngram to print only a number instead of this kind of output:<br><br>file testfiles/test.test: 427 sentences, 2433 words, 1184 OOVs<br>


0 zeroprobs, logprob= -5075.52 ppl= 1067.47 ppl1= 11578.9<br><br>I need only the number 1067.47 in this case!<br>Thanks in advance for your help.<br><br><div class="gmail_quote">On Sat, Jun 2, 2012 at 2:39 AM, Andreas Stolcke <span dir="ltr">&lt;<a href="mailto:stolcke@icsi.berkeley.edu" target="_blank">stolcke@icsi.berkeley.edu</a>&gt;</span> wrote:<br>


<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="HOEnZb"><div class="h5">On 6/1/2012 6:04 AM, Ali Asghar Toraby Parizy wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

Hi<br>

I want to use SRILM for text classification. I've successfully compiled SRILM, and I can access its classes and utilities from my own project by including the header files from the include folder and linking the libraries in the lib folder.<br>


I'm also familiar with the concepts of language modeling and text categorization, but I don't know where to start with using SRILM for this.<br>

I need to create some language models from the corpus that I have and then guess the best model for a new text file using perplexity.<br>

Can anybody give me an overview of the classes and utilities, or point me to a document that explains the class hierarchy? I don't have enough time to explore all the code to find out how to use it!<br>

</blockquote></div></div>

You probably don't need to link into the C++ API to do what you want.<br>

Instead, you can operate at the command line, train your LMs, and postprocess the output of<br>

<br>

ngram -debug 1 -ppl ...<br>

<br>

to obtain the model likelihoods on your test data.<br>
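<br>
A minimal sketch of that postprocessing step, assuming the summary line has the form "... logprob= X ppl= Y ppl1= Z". The helper name extract_ppl and the variables models and testfile are hypothetical, and taking the last match assumes that any per-sentence lines printed at higher -debug levels come before the overall summary:<br>
<pre>
```shell
# Hypothetical helper: reads `ngram -ppl` output on stdin and prints only
# the value that follows "ppl=" on the last line that contains it.
# The leading space in the pattern keeps "ppl1=" from matching.
extract_ppl() {
    sed -n 's/.* ppl= *\([^ ]*\).*/\1/p' | tail -n 1
}

# Sketch of a loop over models (models and testfile are assumed to be
# set by the caller; left commented out so the helper can be tried alone):
# for j in $models; do
#     printf '%s %s\n' "$j" "$(ngram -lm "$j" -ppl "$testfile" | extract_ppl)"
# done
```
</pre>
For a summary line such as "0 zeroprobs, logprob= -5075.52 ppl= 1067.47 ppl1= 11578.9", the helper prints 1067.47, so each loop iteration yields one "model perplexity" pair that can be sorted or compared.<br>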

<br>

The file $SRILM/doc/lm-intro should contain all the info you need to get that going.<span class="HOEnZb"><font color="#888888"><br>

<br>

Andreas<br>

<br>

</font></span></blockquote></div><br></div>