Hi!<br><br>When I calculate the perplexity of my POS-based class model (a word can belong to several classes; I create the class-definition file myself from POS-tagged data) with "-debug 2", I get output that I cannot fully understand. For testing purposes I measure perplexity on the same data I trained the class model on (i.e. there should not be any OOVs). However, in the debug output, for every N-gram there is a string of the format<br>P(w| w...) = [OOV][n-gram][n-gram]...[OOV][n-gram][n-gram]...<br>As far as I understand, the [n-gram]s refer to the different combinations of assigning words to classes. But why do those [OOV]s appear (and they appear at equal intervals between the strings of [n-gram]s for each word)?<br><br>I have only one guess: since the [OOV]s are missing only for the last (&lt;/s&gt;| ...) n-gram, they may correspond to a check of whether a word is present in an implicit stop-word vocabulary or something similar...<br><br>It would be great if anybody could comment on that.<br><br><br>Best regards,<br>Ilya<p>
