ngram-merge

From: Mirjam Sepesy Maucec <mirjam.sepesy at ADDRESS HIDDEN>
Date: Tue, 11 Mar 2003 18:36:34 +0100

Hi.

I have a problem with ngram-merge when I try to merge two huge sorted
6-gram count files (the first is about 2G and contains 61M counts, the
second is 700M and contains 21M counts).
At some point ngram-merge gets stuck. The output file stops growing, but
ngram-merge is still doing something: when I look at the info of the
output file, I see that its last-modification time keeps changing, and
there is still space on the disk.
When I split both input files at the critical 6-gram and merge the top
parts and the bottom parts of the two files separately, it works well,
but I don't think that is how it is supposed to work. I have to do this
kind of merging many times :-(
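
To make clear what I expect from the merge, here is a minimal Python sketch of a streaming merge of two sorted count files. It is not how ngram-merge is implemented. It assumes one N-gram per line with the count as the last whitespace-separated field, and it assumes both inputs are sorted by comparing the word sequences as tuples of strings. The names read_counts, merge_counts and merge_files are made up for this illustration.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def read_counts(lines):
    """Yield (ngram_tuple, count) from lines like 'w1 ... wN<TAB>count'."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Assumption: the count is the last whitespace-separated field.
        words, count = line.rsplit(None, 1)
        yield tuple(words.split()), int(count)


def merge_counts(*streams):
    """Merge sorted (ngram, count) streams, summing counts of equal N-grams.

    Only one pending entry per input stream is held in memory at a time.
    """
    merged = heapq.merge(*streams, key=itemgetter(0))
    for ngram, group in groupby(merged, key=itemgetter(0)):
        yield ngram, sum(count for _, count in group)


def merge_files(path_a, path_b, path_out):
    """Hypothetical helper: merge two sorted count files into path_out."""
    with open(path_a, encoding="utf-8") as a, \
         open(path_b, encoding="utf-8") as b, \
         open(path_out, "w", encoding="utf-8") as out:
        for ngram, count in merge_counts(read_counts(a), read_counts(b)):
            out.write(" ".join(ngram) + "\t" + str(count) + "\n")
```

If the two files are sorted in a different order than the tuple comparison used here, the result of such a merge is not sorted and equal N-grams may not be combined.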

One more question: if my count file contains 4-grams and 6-grams and I
use the -recompute option of ngram-count, are the 5-grams recomputed
from the 6-grams and the 3-grams from the 4-grams?
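
To make the question concrete, the following Python sketch shows one possible reading, in which every lower-order count is regenerated only from the highest-order counts by summing over each prefix. This is an assumption made for illustration, not a statement of what ngram-count actually does, and recompute_lower_orders is a made-up name.

```python
from collections import Counter


def recompute_lower_orders(counts):
    """Regenerate lower-order counts from the highest-order N-grams only.

    counts maps N-gram tuples to integer counts; mixed orders are allowed.
    Assumption: entries below the highest order are ignored, and each
    prefix count is the sum of the highest-order counts sharing that prefix.
    """
    if not counts:
        return Counter()
    top = max(len(ngram) for ngram in counts)
    result = Counter()
    for ngram, count in counts.items():
        if len(ngram) != top:
            continue  # under this reading, lower orders do not contribute
        for k in range(1, top + 1):
            result[ngram[:k]] += count
    return result
```

Under this reading the 4-grams in my file would not contribute at all, and the 3-grams would be derived from the 6-grams as well.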

Regards,

Mirjam.

