<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">On 1/8/2013 6:07 PM, Marta Ruiz wrote:<br>
</div>
<blockquote
cite="mid:CABEBqHJ798PkMfSe_DJ9YLFASabk1S8Wk65nJHZgdtfoJ8tSpQ@mail.gmail.com"
type="cite">Thanks Andreas, two more questions<br>
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
1. Create a word-based version of each model. For example,
you can construct a POS-based LM and combine it with a class
membership mapping (in classes-format, see man page) to get a
word-level POS-based model. The same works for lemma-based LMs
(the lemmas are effectively word classes).<br>
<br>
</blockquote>
<div><br>
What is the command to do this?<br>
</div>
</div>
</blockquote>
<br>
1. You create the class-to-word mapping file (in the format
described <a
href="http://www.speech.sri.com/projects/srilm/manpages/classes-format.5.html">here</a>)
to reflect either your POS-to-word or lemma-to-word mapping.<br>
2. Process the training data to replace the words with POS or
lemmas, as appropriate.<br>
3. Train the ngram portion of the LM by running ngram-count on the
training data represented as a sequence of POS tags / lemmas (from
step 2).<br>
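<br>
As a rough illustration of steps 1 and 2 (not part of SRILM), the
Python sketch below derives both files from a POS-tagged corpus. The
function name, the in-memory (word, tag) input, and the use of relative
frequencies as class-membership probabilities are assumptions made for
the example. Check the classes-format(5) page for the exact layout you
need, and make sure tag names do not collide with real words.<br>
<pre>
```python
from collections import Counter, defaultdict


def build_class_model_inputs(tagged_sentences):
    """Return (class_lines, tag_lines) for a POS-tagged corpus.

    tagged_sentences: list of sentences, each a list of (word, tag) pairs.
    class_lines: one "CLASS PROB WORD" line per (tag, word) pair, with PROB
                 set to the relative frequency of the word within the tag.
    tag_lines:   the corpus with every word replaced by its tag.
    """
    counts = defaultdict(Counter)
    tag_lines = []
    for sentence in tagged_sentences:
        # Step 2: replace each word by its tag.
        tag_lines.append(" ".join(tag for _, tag in sentence))
        for word, tag in sentence:
            counts[tag][word] += 1

    # Step 1: class-to-word mapping with membership probabilities.
    class_lines = []
    for tag in sorted(counts):
        total = sum(counts[tag].values())
        for word in sorted(counts[tag]):
            prob = counts[tag][word] / total
            class_lines.append(f"{tag} {prob:.6g} {word}")
    return class_lines, tag_lines


if __name__ == "__main__":
    # Hypothetical toy corpus, for illustration only.
    corpus = [
        [("the", "DT"), ("dog", "NN"), ("barks", "VBZ")],
        [("the", "DT"), ("cat", "NN"), ("sleeps", "VBZ")],
    ]
    classes, tags = build_class_model_inputs(corpus)
    print("\n".join(classes))
    print("\n".join(tags))
```
</pre>
Writing tag_lines to a file gives the input for step 3, for example
ngram-count -text TAGS -order 3 -lm POS-LM, where TAGS and POS-LM are
placeholder file names.<br>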
<br>
<br>
<blockquote
cite="mid:CABEBqHJ798PkMfSe_DJ9YLFASabk1S8Wk65nJHZgdtfoJ8tSpQ@mail.gmail.com"
type="cite">
<div class="gmail_quote">
<div> </div>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
2. Then interpolate the models using<br>
<br>
ngram -bayes 0 -lm LM1 -mix-lm LM2 -mix-lm2 LM3 ....
-lambda ... -mix-lambda2 ... -classes CLASSES<br>
<br>
where CLASSES is a classes-format(5) file defining the union
of all the word classes used in the various component models.<br>
<br>
</blockquote>
<div><br>
To find the lambdas I can use compute-best-mix, can't I?<br>
</div>
</div>
</blockquote>
Exactly.<br>
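<br>
For intuition only, here is a small Python sketch of the kind of EM
re-estimation that finds mixture weights maximizing held-out
likelihood. It is not the compute-best-mix script itself. It assumes
you have already extracted, for each model, the probability it assigns
to every held-out token (for instance from ngram -debug 2 -ppl output),
and the function name is made up for the example.<br>
<pre>
```python
def estimate_mixture_weights(prob_streams, iterations=50):
    """Estimate interpolation weights by EM.

    prob_streams[m][t] is the probability that model m assigns to
    held-out token t. All streams must cover the same tokens.
    Returns one weight per model; the weights sum to 1.
    """
    n_models = len(prob_streams)
    n_tokens = len(prob_streams[0])
    lambdas = [1.0 / n_models] * n_models  # start from uniform weights

    for _ in range(iterations):
        posteriors = [0.0] * n_models
        for t in range(n_tokens):
            mix = sum(lambdas[m] * prob_streams[m][t] for m in range(n_models))
            if mix == 0.0:
                continue  # no model explains this token; skip it
            # E-step: share of this token's probability owed to each model.
            for m in range(n_models):
                posteriors[m] += lambdas[m] * prob_streams[m][t] / mix
        total = sum(posteriors)
        if total == 0.0:
            break  # nothing to re-estimate from
        # M-step: new weights are the normalized posterior counts.
        lambdas = [p / total for p in posteriors]
    return lambdas


if __name__ == "__main__":
    # Hypothetical per-token probabilities from two models.
    word_lm = [0.5, 0.4, 0.6]
    pos_lm = [0.1, 0.1, 0.1]
    print(estimate_mixture_weights([word_lm, pos_lm]))
```
</pre>
The resulting weights are what you would pass as -lambda,
-mix-lambda2, and so on in the ngram command quoted above.<br>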
<br>
Andreas<br>
<br>
</body>
</html>