Naive question about unknown words

From: gaudinat <arnaud.gaudinat at ADDRESS HIDDEN>
Date: Tue, 11 Oct 2005 17:14:39 +0200

Sorry for this naive question:

I create my LM with this command:
ngram-count -text learningdb.txt -lm GT -unk

I evaluate a sentence with the following command:
ngram -lm GT -ppl sentence.txt

I obtain coherent results, but I also get the following warning message:
"warning: non-zero probability for <unk> in closed-vocabulary LM"

Can anyone explain what this warning means and how to avoid it?
Of course, I do need to give some weight to unknown words.

Thanks in advance,

Arnaud.
