From noah at saros.cs.colorado.edu  Tue Apr  2 23:08:19 2002
From: noah at saros.cs.colorado.edu (Noah Coccaro)
Date: Wed, 03 Apr 2002 00:08:19 -0700
Subject: linux + alpha???
Message-ID: <200204030708.g3378Ja05200@saros.cs.colorado.edu>


I would like to run the toolkit on DEC Alpha hardware that is running
Linux (Red Hat, I believe).

Does anyone have a makefile appropriate for this blend? Any tips on
how to create an appropriate Makefile.machine.linux+alpha ??? 

In theory, these machines should be able to run binaries compiled on
Alphas with Digital UNIX, and that used to work, but no longer does,
exiting with: 

(ds20-31: bin/alpha) ./ngram -help
exception system: exiting due to internal error: out of memory trying to allocate exception system resources
Abort
(ds20-31: bin/alpha) 


From stolcke at speech.sri.com  Wed Apr  3 15:59:46 2002
From: stolcke at speech.sri.com (Andreas Stolcke)
Date: Wed, 03 Apr 2002 15:59:46 PST
Subject: linux + alpha??? 
In-Reply-To: Your message of Wed, 03 Apr 2002 00:08:19 -0700.
             <200204030708.g3378Ja05200@saros.cs.colorado.edu> 
Message-ID: <200204032359.PAA19296@zap.speech.sri.com>


Noah,

I would proceed like this: first, modify bin/machine-type to return a string that
identifies the platform ("alpha-linux" or something).

Then copy the x86 Linux makefile 
(common/Makefile.machine.i686) to common/Makefile.machine.alpha-linux and see 
if that works.  Since the compiler (gcc) and OS are the same I expect that to work,
possibly with minor tweaks.  Remove any gcc flags that are specific to the 
Intel target.

If you get it working please mail me the changes you made.

--Andreas



From huan0010 at infoeng.flinders.edu.au  Fri Apr  5 00:27:57 2002
From: huan0010 at infoeng.flinders.edu.au (Jin Hu Huang)
Date: Fri, 5 Apr 2002 17:57:57 +0930 (CST)
Subject: help
Message-ID: <Pine.GSO.4.10.10204051735570.8467-100000@trojan>

Hi,

I just downloaded the toolkit but I cannot install it. I tried to install
it on a Sparc machine with SunOS 5.8. It could not handle the following
statement in the Makefile.

MACHINE_TYPE := $(shell $(SRILM)/bin/machine-type)

Has anyone met the same problem? Could anyone please help me and let me know
how to fix it? Same problem in the Makefile.common.variables file? I tried
to make Makefile.machine.sparc-elf or sparc directly, but it did not work.

Thanks

Jin


From stolcke at speech.sri.com  Sat Apr  6 14:18:58 2002
From: stolcke at speech.sri.com (Andreas Stolcke)
Date: Sat, 06 Apr 2002 14:18:58 PST
Subject: SRILM help needed
Message-ID: <200204062218.OAA09035@huge>


Zhu,

the default smoothing algorithm in ngram-count is Good-Turing.
The default parameters (as displayed by ngram-count -help) are:

 -gt1min:       lower 1gram discounting cutoff
                Default value: 1
 -gt1max:       upper 1gram discounting cutoff
                Default value: 1
 -gt2min:       lower 2gram discounting cutoff
                Default value: 1
 -gt2max:       upper 2gram discounting cutoff
                Default value: 7
 -gt3min:       lower 3gram discounting cutoff
                Default value: 2
 -gt3max:       upper 3gram discounting cutoff
                Default value: 7
 -gt4min:       lower 4gram discounting cutoff
                Default value: 2
 -gt4max:       upper 4gram discounting cutoff
                Default value: 7
 -gt5min:       lower 5gram discounting cutoff
                Default value: 2
 -gt5max:       upper 5gram discounting cutoff
                Default value: 7
 -gt6min:       lower 6gram discounting cutoff
                Default value: 2
 -gt6max:       upper 6gram discounting cutoff
                Default value: 7

So all unigrams and bigrams are kept, but singleton N-grams of higher orders
are discarded (which is a pretty standard choice).
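
As an illustration of the effect of the minimum cutoffs listed above, here
is a minimal Python sketch. It is not SRILM code: the dictionary layout and
the function name apply_min_cutoffs are hypothetical, and only the lower
(-gtNmin) cutoffs are modeled, not the upper (-gtNmax) ones.

```python
# Default minimum counts per N-gram order, copied from the -gtNmin
# defaults shown above (orders 1 through 6).
DEFAULT_GT_MIN = {1: 1, 2: 1, 3: 2, 4: 2, 5: 2, 6: 2}


def apply_min_cutoffs(counts, gt_min=DEFAULT_GT_MIN):
    """Keep only N-grams whose count reaches the minimum for their order.

    `counts` maps a tuple of words to its observed count. This layout is
    an assumption made for the sketch, not SRILM's internal format.
    """
    kept = {}
    for ngram, count in counts.items():
        order = len(ngram)
        if count >= gt_min[order]:
            kept[ngram] = count
    return kept


example = {
    ("the",): 1,                # singleton unigram: kept (minimum is 1)
    ("the", "cat"): 1,          # singleton bigram: kept (minimum is 1)
    ("the", "cat", "sat"): 1,   # singleton trigram: dropped (minimum is 2)
    ("a", "cat", "sat"): 2,     # trigram seen twice: kept
}
kept = apply_min_cutoffs(example)
```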

I'm not sure I understand your question about hidden-ngram.
It doesn't use any "cut-offs".   Cut-offs apply in N-gram model
training; hidden-ngram only uses the model as it is produced by 
ngram-count (or some other program).

--Andreas

PS.  Your message to srilm-user didn't make it to the list because you are
not a subscriber.  As a way to control junk mail, only subscribers can post
to the list.  To join, send a message containing "subscribe srilm-user"
to majordomo at speech.sri.com.

------- Forwarded Message

Date: Thu, 4 Apr 2002 20:55:13 -0500 (EST)
From: Zhu Zhang <zhuzhang at umich.edu>
X-X-Sender: zhuzhang at mspacman.gpcc.itd.umich.edu
To: srilm-user at speech.sri.com
Subject: SRILM help needed
Message-ID: <Pine.SOL.4.44.0204042045410.16911-100000 at mspacman.gpcc.itd.umich.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Content-Length: 313


Hi,

Could anybody provide the following info about SRILM, which doesn't seem
to be very clear from the documentation:

-  What is the default smoothing algorithm for ngram-count?
-  What are the smoothing parameters?
-  In hidden-ngram, what are the event cut-off frequencies?

Thanks in advance for any help!

------- End of Forwarded Message


From jill at cs.huji.ac.il  Sun Apr  7 17:43:16 2002
From: jill at cs.huji.ac.il (Gill Bejerano)
Date: Mon, 8 Apr 2002 03:43:16 +0300 (IDT)
Subject: some tech queries
Message-ID: <Pine.LNX.4.20_heb2.08.0204080255320.20347-100000@pogo.cs.huji.ac.il>


Hi,

I am new to SRILM, and quite new to language modelling at large
(coming from other domains of n-gram models usage).

I have run some preliminary probes with SRILM (on linux, smooth install)
and have the following questions:

1. in ngram-count:
   when using -lm with the default -order 3, i had expected -text <textfile>
   to yield the same model as -read <order1> -read <order2> -read <order3>
   where order{1-3} have been obtained through ngram-count -write{1-3}
   (all other parameters being equal). and yet the two LM files differ.
   how come?

2. in ngram-count:
   i'm not quite clear about the multiple -cdiscount flags.
   suppose i want a default -order 3 LM.
   mustn't i give all three D's and have the model interpolate over all
   of these, as eq. (18) in Chen&Goodman (p.15) implies?
   in practice it seems one can specify any subset of the 3 and get
   different models. (are there default Ds?)

3. in ngram-count:
   probably closely related to question 2.
   (and prob. due to some confusion i have between backoff & interpolation)
   why are there multiple -interpolate flags.
   again, eq. (18) in C&G appears to imply a recursive all levels
   interpolation. and yet ngram-count appears to take any subset of
   -interpolate{1-3} (in the above example) and yield different LMs.

4. combining 2+3:
   if i want an absolute discount model of order, say 3, 
   "by the book" C&G eq. (18), what is the proper way to run it? 
   assume i have run ngram-count => get-gt-counts => make-abs-discount
   and obtained <D1> <D2> <D3>.
   a command line example will be highly appreciated.

5. ngram-count vs. ngram:
   if i use ngram-count with some combination of -prune and -minprune 
   to obtain a model and then use ngram -ppl, will the result be identical
   to running ngram-count without the pruning flags, and running ngram -ppl
   on the new model with -prune -minprune as was previously done for model
   building?

6. for ngram -ppl:
   in -debug 1, i believe, two measures are given per sentence, ppl and ppl1.
   how are they defined? 
   is one C&G's $PP_p(T)$ (p.9,top)? then, what is the other?

Help would be highly appreciated,
-Gill


From stolcke at speech.sri.com  Wed Apr 10 14:06:32 2002
From: stolcke at speech.sri.com (Andreas Stolcke)
Date: Wed, 10 Apr 2002 14:06:32 PDT
Subject: help 
In-Reply-To: Your message of Fri, 05 Apr 2002 17:57:57 +0930.
             <Pine.GSO.4.10.10204051735570.8467-100000@trojan> 
Message-ID: <200204102106.OAA12510@zap.speech.sri.com>


I believe you are not using GNU make.  Grab GNU make from the web.
The SRILM web page has links to all the extra software you need to 
build and run SRILM.

--Andreas



From stolcke at speech.sri.com  Wed May  1 09:53:35 2002
From: stolcke at speech.sri.com (Andreas Stolcke)
Date: Wed, 01 May 2002 09:53:35 PDT
Subject: some tech queries 
In-Reply-To: Your message of Mon, 08 Apr 2002 03:43:16 +0300.
             <Pine.LNX.4.20_heb2.08.0204080255320.20347-100000@pogo.cs.huji.ac.il> 
Message-ID: <200205011653.JAA09151@huge>


In message <Pine.LNX.4.20_heb2.08.0204080255320.20347-100000 at pogo.cs.huji.ac.il> you wrote:
> 
> Hi,
> 
> I am new to SRILM, and quite new to language modelling at large
> (coming from other domains of n-gram models usage).
> 
> I have run some preliminary probes with SRILM (on linux, smooth install)
> and have the following questions:

Sorry for taking a while, but I hope the answers are still useful.

> 
> 1. in ngram-count:
>    when using -lm with the default -order 3, i had expected -text <textfile>
>    to yield the same model as -read <order1> -read <order2> -read <order3>
>    where order{1-3} have been obtained through ngram-count -write{1-3}
>    (all other parameters being equal). and yet the two LM files differ.
>    how come?

You are right, the two methods of getting the counts should be equivalent.
You can test this by doing 

	ngram-count -text TEST -write NEWCOUNTS

and
	ngram-count -read COUNTS -write NEWCOUNTS

and comparing the output.  If you find a discrepancy then there might
be a bug and I'd like you to send me a small test case that shows the problem.

BTW, there is no reason to use -write1 -write2 -write3 together if you 
are going to combine the counts later. Just -write will do the job.
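
For comparing two count files without depending on line order, a small
Python sketch may help. It assumes each line holds the words of an N-gram
followed by its count as the last whitespace-separated field; that layout
and the helper names parse_counts and diff_counts are assumptions made
for this illustration, not part of SRILM.

```python
def parse_counts(lines):
    """Read N-gram counts from lines of the form 'w1 w2 ... count'.

    The count is assumed to be the last whitespace-separated field.
    Repeated N-grams are summed.
    """
    counts = {}
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        ngram = tuple(fields[:-1])
        counts[ngram] = counts.get(ngram, 0) + int(fields[-1])
    return counts


def diff_counts(a, b):
    """Return {ngram: (count_in_a, count_in_b)} for every N-gram that differs.

    A missing N-gram is reported with None on the side where it is absent.
    """
    differences = {}
    for ngram in set(a) | set(b):
        if a.get(ngram) != b.get(ngram):
            differences[ngram] = (a.get(ngram), b.get(ngram))
    return differences


# Same counts in a different order: no differences are reported.
from_text = parse_counts(["a b\t2\n", "a\t3\n"])
from_counts = parse_counts(["a\t3\n", "a b\t2\n"])
same = diff_counts(from_text, from_counts)
```

With real files one would pass open file objects, for example
diff_counts(parse_counts(open(path1)), parse_counts(open(path2))), where
path1 and path2 are whatever names the two runs wrote to.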

> 
> 2. in ngram-count:
>    i'm not quite clear about the multiple -cdiscount flags.
>    suppose i want a default -order 3 LM.
>    mustn't i give all three D's and have the model interpolate over all
>    of these, as eq. (18) in Chen&Goodman (p.15) implies?
>    in practice it seems one can specify any subset of the 3 and get
>    different models. (are there default Ds?)

The way it is implemented you have complete freedom to use 
different discounting methods for different orders of N-grams.
The default is Good-Turing, so

	-cdiscount1 D1 -cdiscount3 D3

would use absolute discounting for orders 1 and 3, but GT for bigrams.
(There is no default D value for absolute discounting).

Also, whether or not higher-order estimates use interpolation with 
lower-order estimates can be chosen separately for each order.

Not all possible combinations make sense from a theoretical point of 
view, so it's up to you to not abuse this flexibility.

> 3. in ngram-count:
>    probably closely related to question 2.
>    (and prob. due to some confusion i have between backoff & interpolation)
>    why are there multiple -interpolate flags.
>    again, eq. (18) in C&G appears to imply a recursive all levels
>    interpolation. and yet ngram-count appears to take any subset of
>    -interpolate{1-3} (in the above example) and yield different LMs.

See above.  -interpolate<N> estimates order-N N-gram probabilities by 
interpolating with order-(N-1) estimates.  The latter could themselves
be interpolated or not, so you control how far the recursion goes.
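
The recursion can be sketched in Python for the case where every order is
interpolated, using absolute discounting with one constant D per order.
This is a textbook-style illustration and not SRILM's implementation: it
assumes 0 <= D <= 1, bottoms out in a uniform distribution over the
vocabulary, ignores sentence-boundary tokens, and all function names are
hypothetical.

```python
from collections import Counter, defaultdict


def ngram_counts(sentences, max_order):
    """Count all N-grams up to max_order in a list of word lists."""
    counts = Counter()
    for words in sentences:
        for n in range(1, max_order + 1):
            for i in range(len(words) - n + 1):
                counts[tuple(words[i:i + n])] += 1
    return counts


def interpolated_abs_discount(counts, vocab, discounts):
    """Return prob(word, history) using interpolated absolute discounting.

    `discounts` maps N-gram order to its constant D (assumed 0 <= D <= 1).
    """
    # followers[history][word] = count of the N-gram history + (word,)
    followers = defaultdict(Counter)
    for ngram, count in counts.items():
        followers[ngram[:-1]][ngram[-1]] = count

    def prob(word, history):
        history = tuple(history)
        # Lower-order estimate: drop the oldest history word; below the
        # unigram level fall back to a uniform distribution.
        lower = prob(word, history[1:]) if history else 1.0 / len(vocab)
        seen = followers.get(history)
        if not seen:
            return lower  # unseen history: use the lower order only
        total = sum(seen.values())
        d = discounts[len(history) + 1]
        discounted = max(seen.get(word, 0) - d, 0) / total
        # Mass removed by discounting is given to the lower-order estimate.
        backoff_mass = d * len(seen) / total
        return discounted + backoff_mass * lower

    return prob


sentences = [["a", "b", "a", "c"], ["a", "b", "b"]]
vocab = {"a", "b", "c"}
prob = interpolated_abs_discount(
    ngram_counts(sentences, 3), vocab, {1: 0.5, 2: 0.5, 3: 0.5})
```

Because the mass removed from the seen N-grams equals the weight given to
the lower-order estimate, the probabilities for any fixed history sum to 1.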


> 4. combining 2+3:
>    if i want an absolute discount model of order, say 3, 
>    "by the book" C&G eq. (18), what is the proper way to run it? 
>    assume i have run ngram-count => get-gt-counts => make-abs-discount
>    and obtained <D1> <D2> <D3>.
>    a command line example will be highly appreciated.

Correct.

> 
> 5. ngram-count vs. ngram:
>    if i use ngram-count with some combination of -prune and -minprune 
>    to obtain a model and then use ngram -ppl, will the result be identical
>    to running ngram-count without the pruning flags, and running ngram -ppl
>    on the new model with -prune -minprune as was previously done for model
>    building?

Correct (again, barring any bugs...).

> 6. for ngram -ppl:
>    in -debug 1, i believe, two measures are given per sentence, ppl and ppl1.
>    how are they defined? 
>    is one C&G's $PP_p(T)$ (p.9,top)? then, what is the other?

I get a lot of questions about this because it's not documented,
except in the code.  ppl1 is the perplexity computed without counting 
the end-of-sentence tokens in the denominator (the end-of-sentence 
log probabilities are still included in the total log probability).
ppl1 can be more meaningful for comparing perplexities on test sets that
have been segmented in different ways.
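
The difference between the two measures can be written down as a short
Python sketch. It assumes the total log probability is in base 10 and
that there are no out-of-vocabulary words to exclude; the function name
and arguments are hypothetical and chosen only for this illustration.

```python
def perplexities(total_log10_prob, num_words, num_sentences):
    """Return (ppl, ppl1) from a total base-10 log probability.

    ppl counts one end-of-sentence token per sentence in the denominator;
    ppl1 counts only the words. Both use the same total log probability,
    which still includes the end-of-sentence terms.
    """
    ppl = 10 ** (-total_log10_prob / (num_words + num_sentences))
    ppl1 = 10 ** (-total_log10_prob / num_words)
    return ppl, ppl1


# One sentence of two words with a total log10 probability of -6:
# three tokens for ppl, two for ppl1.
ppl, ppl1 = perplexities(-6.0, num_words=2, num_sentences=1)
```

Since the denominator for ppl1 is smaller, ppl1 is never lower than ppl
for the same data.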

--Andreas