From noah at saros.cs.colorado.edu Tue Apr 2 23:08:19 2002
From: noah at saros.cs.colorado.edu (Noah Coccaro)
Date: Wed, 03 Apr 2002 00:08:19 -0700
Subject: linux + alpha???
Message-ID: <200204030708.g3378Ja05200@saros.cs.colorado.edu>

I would like to run the toolkit on dec alpha hardware that is running
linux. (Redhat, I believe).

Does anyone have a makefile appropriate for this blend? Any tips on
how to create an appropriate Makefile.machine.linux+alpha ???

In theory, these machines should be able to run binaries compiled on
alphas with digital unix, and that used to work, but no longer does,
exiting with:

(ds20-31: bin/alpha) ./ngram -help
exception system: exiting due to internal error: out of memory trying to allocate exception system resources
Abort
(ds20-31: bin/alpha)

From stolcke at speech.sri.com Wed Apr 3 15:59:46 2002
From: stolcke at speech.sri.com (Andreas Stolcke)
Date: Wed, 03 Apr 2002 15:59:46 PST
Subject: linux + alpha???
In-Reply-To: Your message of Wed, 03 Apr 2002 00:08:19 -0700. <200204030708.g3378Ja05200@saros.cs.colorado.edu>
Message-ID: <200204032359.PAA19296@zap.speech.sri.com>

Noah,

I would proceed like this: first, modify bin/machine-type to return
a string that identifies the platform ("alpha-linux" or something).
Then copy the x86 Linux makefile (common/Makefile.machine.i686) to
common/Makefile.machine.alpha-linux and see if that works.
Since the compiler (gcc) and OS are the same, I expect that to work,
possibly with minor tweaks. Remove any gcc flags that are specific to
the Intel target.

If you get it working, please mail me the changes you made.

--Andreas

In message <200204030708.g3378Ja05200 at saros.cs.colorado.edu> you wrote:
>
> I would like to run the toolkit on dec alpha hardware that is running
> linux. (Redhat, I believe).
>
> Does anyone have a makefile appropriate for this blend? Any tips on
> how to create an appropriate Makefile.machine.linux+alpha ???
>
> In theory, these machines should be able to run binaries compiled on
> alphas with digital unix, and that used to work, but no longer does,
> exiting with:
>
> (ds20-31: bin/alpha) ./ngram -help
> exception system: exiting due to internal error: out of memory trying to allocate exception system resources
> Abort
> (ds20-31: bin/alpha)

From huan0010 at infoeng.flinders.edu.au Fri Apr 5 00:27:57 2002
From: huan0010 at infoeng.flinders.edu.au (Jin Hu Huang)
Date: Fri, 5 Apr 2002 17:57:57 +0930 (CST)
Subject: help
Message-ID:

Hi,

I just downloaded the toolkit but I cannot install it. I tried to install
it on a Sparc machine with SunOS 5.8. It could not handle the following
statement in the Makefile.

MACHINE_TYPE := $(shell $(SRILM)/bin/machine-type)

Did anyone meet the same problem? Could anyone please help me and let me know
how to fix it? Same problem in the Makefile.common.variables file? I tried
to make Makefile.machine.sparc-elf or sparc directly, but it did not work.

Thanks

Jin

From stolcke at speech.sri.com Sat Apr 6 14:18:58 2002
From: stolcke at speech.sri.com (Andreas Stolcke)
Date: Sat, 06 Apr 2002 14:18:58 PST
Subject: SRILM help needed
Message-ID: <200204062218.OAA09035@huge>

Zhu,

the default smoothing algorithm in ngram-count is Good-Turing.
The default parameters (as displayed by ngram-count -help) are:

-gt1min: lower 1gram discounting cutoff
        Default value: 1
-gt1max: upper 1gram discounting cutoff
        Default value: 1
-gt2min: lower 2gram discounting cutoff
        Default value: 1
-gt2max: upper 2gram discounting cutoff
        Default value: 7
-gt3min: lower 3gram discounting cutoff
        Default value: 2
-gt3max: upper 3gram discounting cutoff
        Default value: 7
-gt4min: lower 4gram discounting cutoff
        Default value: 2
-gt4max: upper 4gram discounting cutoff
        Default value: 7
-gt5min: lower 5gram discounting cutoff
        Default value: 2
-gt5max: upper 5gram discounting cutoff
        Default value: 7
-gt6min: lower 6gram discounting cutoff
        Default value: 2
-gt6max: upper 6gram discounting cutoff
        Default value: 7

So all unigrams and bigrams are kept, but singleton ngrams of higher orders
are discarded (which is a pretty standard choice).

I'm not sure I understand your question about hidden-ngram.
It doesn't use any "cut-offs". Cut-offs apply in N-gram model training;
hidden-ngram only uses the model as it is produced by ngram-count
(or some other program).

--Andreas

PS. Your message to srilm-user didn't make it to the list because you are
not a subscriber. As a way to control junk mail, only subscribers can post
to the list. To join, send a message containing "subscribe srilm-user"
to majordomo at speech.sri.com.

------- Forwarded Message

Date: Thu, 4 Apr 2002 20:55:13 -0500 (EST)
From: Zhu Zhang
X-X-Sender: zhuzhang at mspacman.gpcc.itd.umich.edu
To: srilm-user at speech.sri.com
Subject: SRILM help needed
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Content-Length: 313

Hi,

Could anybody provide the following info about SRILM, which doesn't seem
to be very clear from the documentation:

- What is the default smoothing algorithm for ngram-count?
- What are the smoothing parameters?
- In hidden-ngram, what are the event cut-off frequencies?

Thanks in advance for any help!

------- End of Forwarded Message

From jill at cs.huji.ac.il Sun Apr 7 17:43:16 2002
From: jill at cs.huji.ac.il (Gill Bejerano)
Date: Mon, 8 Apr 2002 03:43:16 +0300 (IDT)
Subject: some tech queries
Message-ID:

Hi,

I am new to SRILM, and quite new to language modelling at large
(coming from other domains of n-gram models usage).

I have run some preliminary probes with SRILM (on linux, smooth install)
and have the following questions:

1. in ngram-count:
   when using -lm with the default -order 3, i had expected -text
   to yield the same model as -read -read -read
   where order{1-3} have been obtained through ngram-count -write{1-3}
   (all other parameters being equal). and yet the two LM files differ.
   how come?

2. in ngram-count:
   i'm not quite clear about the multiple -cdiscount flags.
   suppose i want a default -order 3 LM.
   mustn't i give all three D's and have the model interpolate over all
   of these, as eq. (18) in Chen&Goodman (p.15) implies?
   in practice it seems one can specify any subset of the 3 and get
   different models. (are there default Ds?)

3. in ngram-count:
   probably closely related to question 2.
   (and prob. due to some confusion i have between backoff & interpolation)
   why are there multiple -interpolate flags.
   again, eq. (18) in C&G appears to imply a recursive all levels
   interpolation. and yet ngram-count appears to take any subset of
   -interpolate{1-3} (in the above example) and yield different LMs.

4. combining 2+3:
   if i want an absolute discount model of order, say 3,
   "by the book" C&G eq. (18), what is the proper way to run it?
   assume i have run ngram-count => get-gt-counts => make-abs-discount
   and obtained .
   a command line example will be highly appreciated.

5. ngram-count vs.
   ngram:
   if i use ngram-count with some combination of -prune and -minprune
   to obtain a model and then use ngram -ppl, will the result be identical
   to running ngram-count without the pruning flags, and running ngram -ppl
   on the new model with -prune -minprune as was previously done for model
   building?

6. for ngram -ppl:
   in -debug 1, i believe, two measures are given per sentence, ppl and ppl1.
   how are they defined?
   is one C&G's $PP_p(T)$ (p.9,top)? then, what is the other?

Help would be highly appreciated,
-Gill

From stolcke at speech.sri.com Wed Apr 10 14:06:32 2002
From: stolcke at speech.sri.com (Andreas Stolcke)
Date: Wed, 10 Apr 2002 14:06:32 PDT
Subject: help
In-Reply-To: Your message of Fri, 05 Apr 2002 17:57:57 +0930.
Message-ID: <200204102106.OAA12510@zap.speech.sri.com>

I believe you are not using GNU make. Grab gnumake from the web.
The SRILM web page has links to all the extra software you need to
build and run SRILM.

--Andreas

In message you wrote:
> Hi,
>
> I just downloaded the toolkit but I cannot install it. I tried to install
> it on a Sparc machine with SunOS 5.8. It could not handle the following
> statement in the Makefile.
>
> MACHINE_TYPE := $(shell $(SRILM)/bin/machine-type)
>
> Did anyone meet the same problem? Could anyone please help me and let me know
> how to fix it? Same problem in the Makefile.common.variables file? I tried
> to make Makefile.machine.sparc-elf or sparc directly, but it did not work.
>
> Thanks
>
> Jin
>

From stolcke at speech.sri.com Wed May 1 09:53:35 2002
From: stolcke at speech.sri.com (Andreas Stolcke)
Date: Wed, 01 May 2002 09:53:35 PDT
Subject: some tech queries
In-Reply-To: Your message of Mon, 08 Apr 2002 03:43:16 +0300.
Message-ID: <200205011653.JAA09151@huge>

In message you wrote:
>
> Hi,
>
> I am new to SRILM, and quite new to language modelling at large
> (coming from other domains of n-gram models usage).
>
> I have run some preliminary probes with SRILM (on linux, smooth install)
> and have the following questions:

Sorry for taking a while, but I hope the answers are still useful.

>
> 1. in ngram-count:
> when using -lm with the default -order 3, i had expected -text
> to yield the same model as -read -read -read
> where order{1-3} have been obtained through ngram-count -write{1-3}
> (all other parameters being equal). and yet the two LM files differ.
> how come?

You are right, the two methods of getting the counts should be equivalent.
You can test this by doing

        ngram-count -text TEST -write NEWCOUNTS

and

        ngram-count -read COUNTS -write NEWCOUNTS

and comparing the output. If you find a discrepancy then there might be a
bug, and I'd like you to send me a small test case that shows the problem.

BTW, there is no reason to use -write1 -write2 -write3 together if you
are going to combine the counts later. Just -write will do the job.

>
> 2. in ngram-count:
> i'm not quite clear about the multiple -cdiscount flags.
> suppose i want a default -order 3 LM.
> mustn't i give all three D's and have the model interpolate over all
> of these, as eq. (18) in Chen&Goodman (p.15) implies?
> in practice it seems one can specify any subset of the 3 and get
> different models. (are there default Ds?)

The way it is implemented, you have complete freedom to use different
discounting methods for different orders of N-grams. The default is
Good-Turing, so

        -cdiscount1 D1 -cdiscount3 D3

would use absolute discounting for orders 1 and 3, but GT for bigrams.
(There is no default D value for absolute discounting.)

Also, whether or not higher-order estimates use interpolation with
lower-order estimates can be chosen separately for each order.
Not all possible combinations make sense from a theoretical point of view,
so it's up to you not to abuse this flexibility.

> 3. in ngram-count:
> probably closely related to question 2.
> (and prob.
> due to some confusion i have between backoff & interpolation)
> why are there multiple -interpolate flags.
> again, eq. (18) in C&G appears to imply a recursive all levels
> interpolation. and yet ngram-count appears to take any subset of
> -interpolate{1-3} (in the above example) and yield different LMs.

See above. -interpolateN estimates order-N N-gram probabilities by
interpolating with order-(N-1) estimates. The latter could themselves be
interpolated or not, so you control how far the recursion goes.

> 4. combining 2+3:
> if i want an absolute discount model of order, say 3,
> "by the book" C&G eq. (18), what is the proper way to run it?
> assume i have run ngram-count => get-gt-counts => make-abs-discount
> and obtained .
> a command line example will be highly appreciated.

Correct.

>
> 5. ngram-count vs. ngram:
> if i use ngram-count with some combination of -prune and -minprune
> to obtain a model and then use ngram -ppl, will the result be identical
> to running ngram-count without the pruning flags, and running ngram -ppl
> on the new model with -prune -minprune as was previously done for model
> building?

Correct (again, barring any bugs...).

> 6. for ngram -ppl:
> in -debug 1, i believe, two measures are given per sentence, ppl and ppl1.
> how are they defined?
> is one C&G's $PP_p(T)$ (p.9,top)? then, what is the other?

I get a lot of questions about this because it's not documented, except
in the code. ppl1 is the perplexity computed without counting the
end-of-sentence tokens in the denominator (the end-of-sentence log
probabilities are still included in the total log probability).
ppl1 can be more meaningful for comparing perplexities on test sets that
have been segmented in different ways.

--Andreas
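
[Editorial sketch] The ppl / ppl1 distinction described in the last answer
can be written out as a few lines of Python. This is an illustration under
stated assumptions, not code taken from SRILM: the function name and its
arguments are hypothetical, the total log probability is assumed to be
base 10, and out-of-vocabulary words are assumed to be left out of both
denominators. Only the treatment of end-of-sentence tokens comes from the
message above.

```python
def perplexities(logprob10, num_words, num_sentences, num_oovs=0):
    """Return (ppl, ppl1) for a test set.

    logprob10     -- total log probability of the test set (ASSUMED base 10),
                     including the end-of-sentence log probabilities
    num_words     -- number of word tokens, not counting end-of-sentence
    num_sentences -- number of sentences, i.e. end-of-sentence tokens
    num_oovs      -- words skipped as out-of-vocabulary (ASSUMED handling)
    """
    scored_words = num_words - num_oovs
    if scored_words <= 0:
        raise ValueError("no scored words; perplexity is undefined")

    # ppl: end-of-sentence tokens are counted in the denominator.
    ppl = 10.0 ** (-logprob10 / (scored_words + num_sentences))

    # ppl1: same total log probability, but end-of-sentence tokens
    # are NOT counted in the denominator.
    ppl1 = 10.0 ** (-logprob10 / scored_words)
    return ppl, ppl1


if __name__ == "__main__":
    ppl, ppl1 = perplexities(-6.0, num_words=4, num_sentences=2)
    print("ppl =", ppl, "ppl1 =", ppl1)
```

Because the numerator is the same in both cases and only the token count
changes, ppl1 is never smaller than ppl, and it does not shift when the
same words are cut into more or fewer sentences in the denominator.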