<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
</head>
<body bgcolor="#FFFFFF" text="#000000">
Hi, I have two questions:<br>
<br>
1. If I generate the language model with Kneser-Ney smoothing (or
Modified Kneser-Ney), why does the parameter "-gtnmin" apply to the
already modified counts? <br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div>For example, if in the training data the 2-gram "markov model"
occurs only in the context "hidden markov model" and gt2min = 2,
then the modified count for "markov model" is n(* markov model) =
1 &lt; gt2min, and <br>
prob("markov model") = bow("markov") * prob("model"), <br>
instead of prob("markov model") = (n(* markov model) - D) /
n(* markov *). <br>
<br>
2. Let's say I use ngram-count to generate the language model as
follows: <br>
ngram-count -text text.txt -vocab vocab.txt -gt1min 5 -lm sri.lm<br>
Suppose the word "hello" exists in "vocab.txt" but occurs only 4 times
in "text.txt". Then the probability of "hello" is calculated as the
zeroton probability. Is that correct?<br>
</div>
</blockquote>
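To make question 1 concrete, here is a tiny Python sketch (toy data, not SRILM code) of the modified-count computation I have in mind: the modified count of a bigram is the number of distinct left contexts it appears in, and the gt2min cutoff is then checked against that modified count rather than the raw count.<br>

```python
from collections import defaultdict

# Toy corpus of trigrams: "markov model" occurs only after "hidden",
# while "model is" occurs after two distinct words.
trigrams = [
    ("hidden", "markov", "model"),
    ("hidden", "markov", "model"),
    ("the", "model", "is"),
    ("a", "model", "is"),
]

# Modified (continuation) count of a bigram (w2, w3):
# the number of distinct left contexts w1 it follows.
left_contexts = defaultdict(set)
for w1, w2, w3 in trigrams:
    left_contexts[(w2, w3)].add(w1)
modified_count = {bg: len(cs) for bg, cs in left_contexts.items()}

# Applying the cutoff to the modified counts, as in the question:
gt2min = 2
for bg, c in sorted(modified_count.items()):
    status = "kept" if c >= gt2min else "backed off"
    print(bg, "modified count =", c, "->", status)
```

Here "markov model" gets modified count 1 (only the context "hidden"), so with gt2min = 2 it is backed off even though its raw count is 2, which is exactly the behavior I am asking about.<br>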
<br>
<pre class="moz-signature" cols="72">Thanks
Anna Bulusheva
</pre>
</body>
</html>