Dear SRILM mailing list,
When I try to train a language model with ngram-count and
the -tolower option,
I get the following error:
assertion "i < maxWordLength" failed: file "Vocab.cc", line 97
The input corpus (-text) is a UTF-8 file. Might this cause the problem?
I would be grateful for any suggestions.
Barbara