<html>
<head>
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">On 3/31/2014 4:26 AM, Laatar Rim wrote:<br>
</div>
<blockquote
cite="mid:1396265179.10448.YahooMailNeo@web173202.mail.ir2.yahoo.com"
type="cite">
<div style="color:#000; background-color:#fff; font-family:times
new roman, new york, times, serif;font-size:12pt">
<div>Dear Andreas, </div>
<div><br>
</div>
<div style="color: rgb(0, 0, 0); font-size: 16px; font-family:
'times new roman', 'new york', times, serif; background-color:
transparent; font-style: normal;">Please, I have a question:</div>
<div style="color: rgb(0, 0, 0); font-size: 16px; font-family:
'times new roman', 'new york', times, serif; background-color:
transparent; font-style: normal;">You said: <span
style="font-size: 12pt;">Knowing which words should be in a
class should be considered part of the training process, or
comes from prior knowledge.</span></div>
If your application gives you the class membership of the words
in the test data then you can add it, otherwise it would be
"training on test data".<br clear="none">
<br clear="none">
Do you mean that my "IN_SRILM: my classes-format - File format for
word class definitions ( <i>class</i> [<i>p</i>] <i>word1</i> <i>word2</i> ...
)" file should contain words from both my training data
and my test data, or should it contain only words from the training
data?
<div> </div>
</div>
</blockquote>
You should only use words in the training data, plus any other
knowledge sources or databases that are separate from the test data.<br>
In many application domains that involve semantic knowledge you have
additional information about the task domain from which you can
infer class membership.<br>
For example, if you are working in the air travel domain, you probably have a
list of all airport cities, and you can create a word class from that.<br>
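<br>
As a rough illustration of the airport-city example, here is a minimal Python sketch that builds class definitions from training-data words plus an external list, without ever reading the test data. The function name, the CITY class name, and the sample words are hypothetical. It also assumes that each classes-format line defines one expansion of the class (so every member gets its own line) and that a uniform probability is acceptable; check both against the classes-format man page.<br>
<pre>
```python
def build_class_lines(class_name, training_members, external_members):
    """Return classes-format lines ("class p word ..."), one per member.

    training_members: words seen in the TRAINING data that belong to the class.
    external_members: entries from a separate knowledge source (e.g. a city list).
    Test data is deliberately not an input to this function.
    """
    # Merge both sources and drop duplicates; sort for a stable file layout.
    members = sorted(set(training_members) | set(external_members))
    if not members:
        return []
    # Assumption: uniform membership probability over all members.
    p = 1.0 / len(members)
    return [f"{class_name} {p:.6g} {member}" for member in members]


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    from_training = ["boston", "denver"]
    from_city_database = ["denver", "new york", "atlanta"]
    for line in build_class_lines("CITY", from_training, from_city_database):
        print(line)
```
</pre>
The resulting lines could be written to a file and used as the class definitions. A multiword entry such as "new york" simply becomes a multiword expansion of the class.<br>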
<br>
Andreas<br>
<br>
</body>
</html>