<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">On 9/23/2013 1:33 PM, Md. Akmal Haidar
wrote:<br>
</div>
<blockquote
cite="mid:1379968415.89728.YahooMailNeo@web161005.mail.bf1.yahoo.com"
type="cite">
<div style="color:#000; background-color:#fff; font-family:times
new roman, new york, times, serif;font-size:12pt">
<div style="color: rgb(0, 0, 0); font-size: 16px; font-family:
times new roman,new york,times,serif; background-color:
transparent; font-style: normal;">Hi,</div>
<div style="color: rgb(0, 0, 0); font-size: 16px; font-family:
times new roman,new york,times,serif; background-color:
transparent; font-style: normal;"><br>
</div>
<div style="color: rgb(0, 0, 0); font-size: 16px; font-family:
times new roman,new york,times,serif; background-color:
transparent; font-style: normal;">1. Is it possible to
interpolate some trigram probabilities (say they are in file
t.txt) with an n-gram LM ? </div>
<div style="color: rgb(0, 0, 0); font-size: 16px; font-family:
times new roman,new york,times,serif; background-color:
transparent; font-style: normal;">SRILM gives results with the
warning (no bow for prefix of trigram of t.txt).</div>
<div style="color: rgb(0, 0, 0); font-size: 16px; font-family:
times new roman,new york,times,serif; background-color:
transparent; font-style: normal;">-lm n-gram.lm -lambda .9
-mix-lm t.txt -ppl test.txt <br>
</div>
<div style="color: rgb(0, 0, 0); font-size: 16px; font-family:
times new roman,new york,times,serif; background-color:
transparent; font-style: normal;">2. When the trigram
probabilities in t.txt change (newt.txt), the results are
exactly the same as above. </div>
<div style="color: rgb(0, 0, 0); font-size: 16px; font-family:
times new roman,new york,times,serif; background-color:
transparent; font-style: normal;">-lm n-gram.lm -lambda .9
-mix-lm newt.txt -ppl test.txt</div>
<div style="color: rgb(0, 0, 0); font-size: 16px; font-family:
times new roman,new york,times,serif; background-color:
transparent; font-style: normal;"><br>
</div>
<div style="color: rgb(0, 0, 0); font-size: 16px; font-family:
times new roman,new york,times,serif; background-color:
transparent; font-style: normal;">Is the above interpolation
OK? Are any other steps required to interpolate
these trigram probabilities with an n-gram LM?<br>
</div>
</div>
</blockquote>
<br>
The above would be fine if newt.txt contained a well-formed LM.
The format you generated is incomplete. <br>
As implied by the warning message, for each trigram "a b c" you also
need the history portion ("a b") to be included as a bigram.<br>
Therefore, you should include a line <br>
<br>
-99 a b 0<br>
<br>
for every such history (plus the appropriate ngram count information
in the header). You also need a unigram section containing every
word of your vocabulary, one line per word:<br>
<br>
-99 a 0<br>
<br>
(the final 0's are the log backoff weights).<br>
<br>
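As a rough illustration of the layout described above, here is a minimal Python sketch that pads a trigram-only model with placeholder unigram and bigram entries. The function name complete_arpa and its parameters are hypothetical (they are not part of SRILM), and the vocabulary is assumed to be the words occurring in the trigrams plus anything passed in extra_vocab.<br>
<br>
<pre>
```python
def complete_arpa(trigrams, extra_vocab=(), floor=-99.0):
    """Build ARPA-format text from a dict {(a, b, c): log10 prob}.

    Hypothetical helper: adds a placeholder line (log prob = floor,
    log backoff weight 0) for every word and every trigram history.
    """
    # Unigram section: every word seen in a trigram, plus extra vocabulary.
    unigrams = sorted(set(extra_vocab) | {w for tri in trigrams for w in tri})
    # Bigram section: the history "a b" of every trigram "a b c".
    bigrams = sorted({tri[:2] for tri in trigrams})

    lines = ["", "\\data\\",
             f"ngram 1={len(unigrams)}",
             f"ngram 2={len(bigrams)}",
             f"ngram 3={len(trigrams)}",
             "", "\\1-grams:"]
    lines += [f"{floor:g}\t{w}\t0" for w in unigrams]
    lines += ["", "\\2-grams:"]
    lines += [f"{floor:g}\t{a} {b}\t0" for a, b in bigrams]
    lines += ["", "\\3-grams:"]
    # Highest-order n-grams carry no backoff weight.
    lines += [f"{lp:g}\t{' '.join(tri)}" for tri, lp in sorted(trigrams.items())]
    lines += ["", "\\end\\"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Toy input; real probabilities would come from the trigram file.
    print(complete_arpa({("a", "b", "c"): -0.5, ("a", "b", "d"): -0.3}))
```
</pre>
The -99 entries are placeholders only; they make the file well-formed but carry no probability mass.<br>
<br>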
Now, giving probability 0 (log = -99) to all your unigrams and
bigrams is suboptimal: whenever there is no matching trigram,
the backoff will result in probability 0. This is not the end of
the world, since you presumably are interpolating with another model
that will yield a non-zero probability, but it would be better to
estimate non-zero probabilities for those unigrams and bigrams. If
you do, then run the
resulting model through <br>
<br>
ngram -lm newt.txt -renorm -write-lm newt-norm.txt <br>
<br>
to recompute the backoff weights. Finally, interpolate.<br>
<br>
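To make the effect of the zero-probability entries concrete, the following sketch mixes two log10 probabilities the way linear interpolation does. It assumes that -lambda is the weight of the -lm model and that -99 stands for probability zero; mix_log10 is a hypothetical helper for illustration, not an SRILM function.<br>
<br>
<pre>
```python
import math


def mix_log10(logp_main, logp_mix, lam=0.9, zero=-99.0):
    """Linearly interpolate two log10 probabilities.

    Assumption: lam weights the main model, (1 - lam) the mixture
    model, and a log prob equal to `zero` means probability 0.
    """
    p_main = 0.0 if logp_main == zero else 10.0 ** logp_main
    p_mix = 0.0 if logp_mix == zero else 10.0 ** logp_mix
    p = lam * p_main + (1.0 - lam) * p_mix
    return math.log10(p) if p != 0.0 else zero


if __name__ == "__main__":
    # Mixture model backs off to probability 0: only the main model contributes.
    print(mix_log10(-1.0, -99.0))
    # Mixture model has a non-zero estimate: both models contribute.
    print(mix_log10(-1.0, -2.0))
```
</pre>
When the mixture model contributes 0, the result is simply the main model's probability scaled by lambda, which is why the zero entries are tolerable but add nothing.<br>
<br>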
Andreas <br>
<br>
<blockquote
cite="mid:1379968415.89728.YahooMailNeo@web161005.mail.bf1.yahoo.com"
type="cite">
<div style="color:#000; background-color:#fff; font-family:times
new roman, new york, times, serif;font-size:12pt">
<div style="color: rgb(0, 0, 0); font-size: 16px; font-family:
times new roman,new york,times,serif; background-color:
transparent; font-style: normal;"><br>
</div>
<div style="color: rgb(0, 0, 0); font-size: 16px; font-family:
times new roman,new york,times,serif; background-color:
transparent; font-style: normal;">Format of t.txt/newt.txt</div>
<div style="color: rgb(0, 0, 0); font-size: 16px; font-family:
times new roman,new york,times,serif; background-color:
transparent; font-style: normal;">\data\</div>
<div style="color: rgb(0, 0, 0); font-size: 16px; font-family:
times new roman,new york,times,serif; background-color:
transparent; font-style: normal;">ngram 3=242</div>
<div style="color: rgb(0, 0, 0); font-size: 16px; font-family:
times new roman,new york,times,serif; background-color:
transparent; font-style: normal;">\3-grams:</div>
<div style="color: rgb(0, 0, 0); font-size: 16px; font-family:
times new roman,new york,times,serif; background-color:
transparent; font-style: normal;">....</div>
<div style="color: rgb(0, 0, 0); font-size: 16px; font-family:
times new roman,new york,times,serif; background-color:
transparent; font-style: normal;">\end\</div>
<div style="color: rgb(0, 0, 0); font-size: 16px; font-family:
times new roman,new york,times,serif; background-color:
transparent; font-style: normal;"><br>
</div>
<div style="color: rgb(0, 0, 0); font-size: 16px; font-family:
times new roman,new york,times,serif; background-color:
transparent; font-style: normal;">Thanks</div>
<div style="color: rgb(0, 0, 0); font-size: 16px; font-family:
times new roman,new york,times,serif; background-color:
transparent; font-style: normal;">Best Regards</div>
<div style="color: rgb(0, 0, 0); font-size: 16px; font-family:
times new roman,new york,times,serif; background-color:
transparent; font-style: normal;">Akmal<br>
</div>
<div style="color: rgb(0, 0, 0); font-size: 16px; font-family:
times new roman,new york,times,serif; background-color:
transparent; font-style: normal;"><br>
</div>
</div>
<pre wrap="">_______________________________________________
SRILM-User site list
<a class="moz-txt-link-abbreviated" href="mailto:SRILM-User@speech.sri.com">SRILM-User@speech.sri.com</a>
<a class="moz-txt-link-freetext" href="http://www.speech.sri.com/mailman/listinfo/srilm-user">http://www.speech.sri.com/mailman/listinfo/srilm-user</a></pre>
</blockquote>
<br>
</body>
</html>