Naive question about unknown words

gaudinat arnaud.gaudinat at healthonnet.org
Tue Oct 11 08:14:39 PDT 2005


Sorry for this naive question:

I create my LM with this command:
ngram-count -text learningdb.txt -lm GT -unk

I evaluate a sentence with the following command:
ngram -lm GT -ppl sentence.txt

I obtain coherent results, but I also get the following warning message:
"warning: non-zero probability for <unk> in closed-vocabulary LM"

Can anyone explain what this warning means and how to avoid it?
Of course, I do need to assign a weight to unknown words.
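
One thing that may be worth trying, assuming the warning appears because ngram treats the model as closed-vocabulary when -unk is not given at evaluation time, is to pass -unk to ngram as well as to ngram-count. A minimal sketch reusing the file names above (the variable names and the availability guard are only for illustration, not part of SRILM):

```shell
#!/bin/sh
# Training: -unk keeps <unk> as a regular word, giving an open-vocabulary LM.
TRAIN_CMD="ngram-count -text learningdb.txt -lm GT -unk"
# Evaluation: -unk tells ngram that the LM is open-vocabulary too.
EVAL_CMD="ngram -lm GT -unk -ppl sentence.txt"

# Run the two steps only if SRILM and the input files are present.
if command -v ngram-count >/dev/null 2>&1 \
   && [ -f learningdb.txt ] && [ -f sentence.txt ]; then
    $TRAIN_CMD && $EVAL_CMD
else
    echo "SRILM tools or input files not found; commands not run"
fi
```

If that assumption holds, the only change from the commands above is the extra -unk in the ngram call.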

Thanks in advance,

Arnaud.


