<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<html>

<head>


<meta http-equiv="content-type" content="text/html; charset=Big5">

</head>

<body bgcolor="#ffffff" text="#000000">


<span class="Apple-style-span"

 style="border-collapse: separate; color: rgb(0, 0, 0); font-family: Times; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; font-size: medium;"><span

 class="Apple-style-span"

 style="border-collapse: collapse; font-family: arial,helvetica,sans-serif; font-size: 13px;">Dear

Dr. Andreas<br>

<br>

I have a question regarding the perplexity reported by ngram-class.<br>

<br>

The command I used was: ngram-class -debug 2 -text TEXT -vocab VOCAB

-numclasses NUM -classes OUTPUT<br>

<br>

The output contains a perplexity and a PPL1 value. What does this
perplexity stand for in class induction? It seems to be calculated
during the class clustering (merging) process, but which parameters
does it use (e.g. -text and -lm)?<br>

<br>

The manual says "...minimize perplexity of a class-based
N-gram model given the provided word N-gram count". But to my
understanding, several steps are needed to use a class-based N-gram
model:<br>

<br>

(a) use ngram-class to induce classes<br>

(b) use replace-words-with-classes to replace the words in both TEXT and VOCAB with their classes<br>
(c) estimate the class-based LM in the same way as a word-based n-gram
LM, which gives us P(C_i | C_i-2 C_i-1 ...)<br>
(d) use this LM to calculate the perplexity: ngram -ppl TEST_SET -lm LM
-class CLASS_DEFINITION, where the class definitions give us P( wi | ci )<br>

<br>
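To make my understanding of steps (c) and (d) concrete, here is a minimal Python sketch of the perplexity I would expect from a class-based bigram model. It is not SRILM code: the function name class_bigram_perplexity, the toy sentences and the word-to-class map are all made up for illustration, and it assumes unsmoothed maximum-likelihood estimates with every test word and class bigram already seen in training.<br>
<pre>
```python
import math
from collections import Counter


def class_bigram_perplexity(train, test, word2class):
    """Perplexity of `test` under P(c_i | c_i-1) * P(w_i | c_i).

    Hypothetical helper, not part of SRILM. Both factors are plain
    maximum-likelihood estimates from `train`, with no smoothing, so
    every test word and class bigram must occur in the training data.
    """
    classes = [word2class[w] for w in train]
    class_count = Counter(classes)                    # denominator of P(w | c)
    history_count = Counter(classes[:-1])             # denominator of P(c | c_prev)
    class_bigram = Counter(zip(classes, classes[1:]))
    word_count = Counter(train)

    log_prob = 0.0
    for prev, word in zip(test, test[1:]):
        c_prev, c = word2class[prev], word2class[word]
        p_class = class_bigram[(c_prev, c)] / history_count[c_prev]
        p_word = word_count[word] / class_count[c]
        log_prob += math.log(p_class * p_word)

    # The first token has no history, so len(test) - 1 tokens are scored.
    return math.exp(-log_prob / (len(test) - 1))


# Toy data (assumed for illustration only).
word2class = {"the": "DET", "cat": "N", "dog": "N", "saw": "V"}
train = "the cat saw the dog".split()
test = "the dog saw the cat".split()

ppl = class_bigram_perplexity(train, test, word2class)
print(ppl)
```
</pre>
On this toy data the result is sqrt(2), about 1.414, because "dog" and "cat" each get P(w | N) = 1/2 while every other factor is 1.<br>
<br>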

Does the perplexity reported by ngram-class correlate with the perplexity in step
(d)? Or where could I find a more detailed definition of it?<br>

<br>

Thanks for your help in advance.<br>

<br>

Best Regards<br>

<br>

Tzu-Chiang<br>

</span></span>

</body>

</html>