unicode & many files

From: Alexy Khrabrov <deliverable at ADDRESS HIDDEN>
Date: Wed, 12 Sep 2007 17:50:50 +0200

How good is the Unicode support, e.g. for UTF-8?  I fed it some
UTF-8 Cyrillic text and it did fine.  How does it know whether we're
using multibyte or single-byte characters?
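
My guess at why this works, which I have not verified against the SRILM source: if a tool simply splits its input on ASCII whitespace bytes and treats each word as an opaque byte string, it never needs to know the encoding. In UTF-8 every byte of a multibyte character is 0x80 or above, so it can never be mistaken for a space or newline. A minimal Python sketch of that property (the function name is hypothetical, purely for illustration):

```python
def split_on_ascii_whitespace(raw: bytes) -> list:
    """Split raw bytes on ASCII whitespace without decoding them.

    This is an illustration of encoding-agnostic tokenization,
    not a copy of what SRILM actually does internally.
    """
    return raw.split()  # bytes.split() with no argument splits on ASCII whitespace


text = "привет мир\nязыковая модель"
raw = text.encode("utf-8")

# Tokens obtained from the undecoded bytes...
byte_tokens = split_on_ascii_whitespace(raw)

# ...decode back to the same words we would get by splitting the text itself.
decoded_tokens = [t.decode("utf-8") for t in byte_tokens]

# Every byte belonging to a Cyrillic character is >= 0x80,
# so none of them collides with an ASCII whitespace byte.
high_bytes = [b for b in raw if b >= 0x80]
```

If that is how it works, single-byte and multibyte text would be handled identically, as long as words are separated by plain ASCII whitespace.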

Another question: how do I feed in many text files from a directory?
Should I pass multiple -text options after preprocessing the files
somehow, or use -read on an accumulating count file?
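
For the "cooking" route, what I have in mind is merging the directory into one corpus file first. A rough sketch, assuming the files are plain text with one sentence per line; the function name, the `*.txt` pattern and the paths are my own placeholders, not anything SRILM defines:

```python
from pathlib import Path


def merge_text_files(src_dir, out_path, pattern="*.txt"):
    """Concatenate all files matching `pattern` in `src_dir` into `out_path`.

    Files are copied as raw bytes, so the encoding is left untouched.
    Returns the number of files merged.
    """
    out_path = Path(out_path)
    # Build the file list up front, and skip the output file in case
    # it lives in the same directory and matches the pattern.
    files = [f for f in sorted(Path(src_dir).glob(pattern))
             if f.resolve() != out_path.resolve()]
    with open(out_path, "wb") as out:
        for f in files:
            data = f.read_bytes()
            out.write(data)
            # Make sure the last line of one file does not run into
            # the first line of the next.
            if data and not data.endswith(b"\n"):
                out.write(b"\n")
    return len(files)
```

The merged file could then be given to a single -text option. Whether that is preferable to accumulating counts with -read is exactly what I am unsure about.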

Cheers,
Alexy
