<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">On 10/29/2012 9:15 AM, Stefy D. wrote:<br>
</div>
<blockquote
cite="mid:1351527325.44053.YahooMailNeo@web112503.mail.gq1.yahoo.com"
type="cite">
<div style="color:#000; background-color:#fff; font-family:times
new roman, new york, times, serif;font-size:12pt">
<div>Hello everyone,</div>
<div><br>
</div>
<div style="color: rgb(0, 0, 0); font-size: 16px; font-family:
times new roman,new york,times,serif; background-color:
transparent; font-style: normal;">I am trying to interpolate 2
language models because I want to do an experiment in domain
adaptation. Below are the commands that I used. When I try to
compute lambda, I get the error "mismatch in number of samples
(60001 != 67708)". I don't know what to fix. Please help me.</div>
<div style="color: rgb(0, 0, 0); font-size: 16px; font-family:
times new roman,new york,times,serif; background-color:
transparent; font-style: normal;"><br>
</div>
<div style="color: rgb(0, 0, 0); font-size: 16px; font-family:
times new roman,new york,times,serif; background-color:
transparent; font-style: normal;">~/local/tools/srilm/bin/i686/ngram
-order 3 -unk -lm ~/local/test1/lm/lm1.lm -ppl
~/local/test1/lm/de-en_corpus1.lowercased.en -debug 2 >
ppl1.ppl</div>
<div style="color: rgb(0, 0, 0); font-size: 16px; font-family:
times new roman,new york,times,serif; background-color:
transparent; font-style: normal;">~/local/tools/srilm/bin/i686/ngram
-order 3 -unk -lm ~/local/test2/lm/lm2.lm -ppl
~/local/test2/lm/de-en_corpus2.lowercased.en -debug 2 >
ppl2.ppl<br>
~/local/tools/srilm/bin/i686/compute-best-mix
~/local/test1/ppl1.ppl ~/local/test2/ppl2.ppl<br>
</div>
</div>
</blockquote>
<br>
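You need to compute ppl1.ppl and ppl2.ppl on EXACTLY THE SAME DATA:
same data, different models. The sample counts differ (60001 vs.
67708) because each model was scored on a different corpus.
compute-best-mix will then find the interpolation weights that
minimize the perplexity of the combined model on that data.<br>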
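<br>
To see why the two files must line up token for token, here is a
minimal Python sketch of the kind of EM iteration that can estimate a
mixture weight from per-word probabilities. It is an illustration
only, not SRILM's actual code: the function names best_mix_weight and
perplexity are hypothetical, and it assumes every probability is
strictly positive and that p1[i] and p2[i] are the two models'
probabilities for the same i-th token.<br>

```python
import math


def best_mix_weight(p1, p2, iterations=100, tol=1e-9):
    """Estimate the weight lam on model 1 that maximizes the likelihood of
    lam * p1[i] + (1 - lam) * p2[i] over all tokens, using EM.

    Hypothetical helper: p1 and p2 are per-token probabilities assigned by
    two different models to the SAME sequence of tokens.
    """
    if len(p1) != len(p2):
        # The two lists can only be paired up if they score the same data.
        raise ValueError(
            f"mismatch in number of samples ({len(p1)} != {len(p2)})"
        )
    if not p1:
        raise ValueError("no samples")

    lam = 0.5  # start from an even mixture
    for _ in range(iterations):
        # E-step: posterior probability that model 1 generated each token.
        post = [lam * a / (lam * a + (1 - lam) * b) for a, b in zip(p1, p2)]
        # M-step: the new weight is the average posterior.
        new_lam = sum(post) / len(post)
        converged = abs(new_lam - lam) < tol
        lam = new_lam
        if converged:
            break
    return lam


def perplexity(p1, p2, lam):
    """Perplexity of the interpolated model on the shared data."""
    log_sum = sum(math.log(lam * a + (1 - lam) * b) for a, b in zip(p1, p2))
    return math.exp(-log_sum / len(p1))
```

Each update pairs the i-th probability from one model with the i-th
probability from the other, so the sketch refuses inputs of different
lengths, which is the same condition the error message above reports.<br>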
<br>
Andreas<br>
<br>
</body>
</html>