<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">On 12/14/2012 1:21 PM, Mohammed Mediani
wrote:<br>
</div>
<blockquote
cite="mid:20121214222159.c7gbsnki880oc8kk@webmail.ira.uni-karlsruhe.de"
type="cite">Could anybody please tell me how the discounting
parameters for modified Kneser-Ney smoothing (D1, D2, D3+) are
computed when the gtmin parameter is greater than 1?
<br>
In that case, the corresponding n_i would be zero, and we would
eventually have to divide by this n_i to get one of the D_i's.
<br>
Many thanks,
<br>
</blockquote>
<br>
The gtmin cutoff is applied (i.e., ngrams with frequency below the
threshold are omitted from the model) AFTER the discounting
constants have been computed, so the gtmin options do not affect
the computation of D1, D2, and D3+.<br>
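<br>
To make the order of operations concrete, here is a minimal Python
sketch. It is not SRILM's code: the function names are hypothetical,
and it assumes the standard Chen &amp; Goodman estimates for the modified
Kneser-Ney discounts, computed from the count-of-counts n1..n4 of the
full, unfiltered ngram counts. The cutoff is applied only afterwards.<br>
<pre>
```python
from collections import Counter


def count_of_counts(ngram_counts, max_count=4):
    """Return [n1, ..., n_max]: number of distinct ngrams seen exactly k times."""
    histogram = Counter(ngram_counts.values())
    return [histogram.get(k, 0) for k in range(1, max_count + 1)]


def modified_kn_discounts(n1, n2, n3, n4):
    """Estimate (D1, D2, D3+) from count-of-counts (Chen and Goodman formulas)."""
    if 0 in (n1, n2, n3):
        # This is the division-by-zero situation raised in the question.
        raise ValueError("n1, n2 and n3 must be nonzero to estimate the discounts")
    y = n1 / (n1 + 2 * n2)
    d1 = 1 - 2 * y * n2 / n1
    d2 = 2 - 3 * y * n3 / n2
    d3_plus = 3 - 4 * y * n4 / n3
    return d1, d2, d3_plus


def apply_cutoff(ngram_counts, gtmin):
    """Drop ngrams whose count is below gtmin (hypothetical helper)."""
    return {ng: c for ng, c in ngram_counts.items() if c >= gtmin}


# Toy data, made up for illustration only.
counts = {"a b": 1, "b c": 1, "c d": 1, "d e": 1,
          "e f": 2, "f g": 2, "g h": 3, "h i": 4}

# Step 1: discounts come from the complete counts.
discounts = modified_kn_discounts(*count_of_counts(counts))

# Step 2: only then is the cutoff applied to decide which ngrams are kept.
kept = apply_cutoff(counts, gtmin=2)
```
</pre>
If the two steps were swapped, n1 would be zero after the cutoff and
the discount estimate would fail, which is the situation described in
the question.<br>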
<br>
A problem does arise when frequency cutoffs have been applied to the
ngram data BEFORE SRILM gets to see it, as is the case, e.g., with
the Google N-gram data. In that situation, if you use the
make-big-lm wrapper script, it will attempt to extrapolate the
missing low count-of-counts from the higher ones, according to an
empirical law described in Figure 1 / Equation 1 of <a
href="http://www.speech.sri.com/cgi-bin/run-distill?papers/asru2007-mt-lm.ps.gz">this
paper</a>.<br>
<br>
Andreas<br>
<br>
</body>
</html>