[SRILM User List] SRI LM toolkit: ngram-count

Andreas Stolcke stolcke at speech.sri.com
Thu Feb 18 12:33:40 PST 2010


On 2/9/2010 1:15 AM, 이일빈 wrote:
>
> ----------------------------
> ngram -debug 2 -order 3 -count-lm -lm google.countlm -vocab vocab.txt
> -vocab-aliases google.aliases -limit-vocab -write-lm google.lm
> ----------------------------
> But an error message came out.
> ----------------------------
> assertion "body != 0" failed: file "../../include/LHash.cc", line 138
> 3 [sig] ngram 21852 winpids::enumNT: error 0xC0000005 reading system process information
> Aborted (core dumped)
> ----------------------------
> However, I monitored the committed memory size and it reached only 900MB.
> So I'm wondering whether there is a memory usage limit in the toolkit.
>
In case this is helpful: I wrote a small program called "maxalloc" that
determines empirically how much data your system allows a program to
allocate (via malloc/new). The answer is a function of swap space,
resource limits, and system configuration; you might not have control
over some of these and may need help from a system administrator.
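
If you want a rough idea of the mechanism without building the beta,
here is a minimal sketch of such a probe (this is not the actual
maxalloc source; the 1 MB block size and the page-touch stride are
arbitrary choices). It keeps allocating fixed-size blocks, touching
each page so the memory is really committed, and reports the total
when malloc first fails:

----------------------------
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    const size_t chunk = 1 << 20;   /* allocate in 1 MB steps */
    size_t total = 0;
    char *p;

    /* Grab blocks until malloc refuses, touching one byte per
       4 KB page so the memory is committed, not just reserved. */
    while ((p = malloc(chunk)) != NULL) {
        size_t i;
        for (i = 0; i < chunk; i += 4096)
            p[i] = 1;
        total += chunk;
    }
    /* The blocks are deliberately leaked; the process exits anyway. */
    printf("allocated %lu MB before malloc failed\n",
           (unsigned long)(total >> 20));
    return 0;
}
----------------------------

If the number this reports is well below your physical memory, the
limit is coming from the environment, not from the toolkit itself.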

For what it's worth, on my Windows XP system with Cygwin, I can only
malloc about 1GB of data, although I have 4GB of memory installed. Surely
there is some Windows configuration thingy somewhere that will change
this. Maybe someone with more Windows expertise can help!

Andreas

PS. maxalloc is in the latest SRILM beta version.

