From stolcke at speech.sri.com Thu Jan 20 17:27:09 2011
From: stolcke at speech.sri.com (Andreas Stolcke)
Date: Thu, 20 Jan 2011 17:27:09 -0800
Subject: [SRILM User List] SRILM 1.5.12 released
Message-ID: <201101210127.p0L1R9M12432@huge>
The latest version of SRILM is now available from
http://www.speech.sri.com/projects/srilm/download.html .
A list of changes appears below.
Enjoy,
Andreas
1.5.12 20 Jan 2011
Functionality:
* Enable lattice-tool -old-decoding if -nbest-duplicates is specified (and
warn about it).
* Support make-big-lm -wbdiscount option.
* New option ngram -prune-history-lm, for specifying a separate LM that computes
the history marginal probabilities needed for N-gram pruning purposes.
Inspired by C. Chelba et al., "Study on Interaction Between Entropy Pruning and
Kneser-Ney Smoothing", Proc. Interspeech-2010.
* Added optional limitVocab argument to VocabMultiMap::read() function. This
is now used by lattice-tool -limit-vocab to avoid reading parts of the dictionary
that are not used in the input.
* Added an option -zeroprob-word to ngram and lattice-tool. It specifies a
word that should be used as a replacement if the current word has probability
zero. This is different from -map-unk which only applies to OOV words and
actually replaces the word label in the output lattice, if any.
* Added new wrapper LM class NonzeroLM, to implement the above.
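A toy sketch of the fallback behavior these two items describe (a hypothetical dict-based LM stand-in; the real NonzeroLM is a C++ wrapper class around another LM):

```python
# Sketch of the -zeroprob-word idea: if the requested word has zero
# probability under the wrapped LM, substitute the probability of a
# designated fallback word instead. The dict-based "LM" here is a
# made-up stand-in, not the actual NonzeroLM API.

class NonzeroLM:
    def __init__(self, lm, zeroprob_word):
        self.lm = lm                        # maps (word, history) -> prob
        self.zeroprob_word = zeroprob_word

    def word_prob(self, word, history):
        p = self.lm.get((word, history), 0.0)
        if p == 0.0:
            # fall back to the designated word's probability
            p = self.lm.get((self.zeroprob_word, history), 0.0)
        return p

lm = {("hello", "<s>"): 0.1, ("<unk>", "<s>"): 0.001}
wrapped = NonzeroLM(lm, "<unk>")
print(wrapped.word_prob("hello", "<s>"))   # 0.1 (nonzero, used as-is)
print(wrapped.word_prob("zzz", "<s>"))     # 0.001 (fallback applied)
```

Note that, unlike -map-unk, this substitutes only the probability, not the word label itself.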
Portability:
* New MACHINE_TYPE values for Android-ARM platform: android-armeabi and
android-armeabi-v7a (from Mike Frandsen).
* Deleted the htk directory from distribution; it was obsolete and not
documented.
Bug fixes:
* Prob.h: guard against under/overflow in intlog and bytelog conversions.
* Replaced gunzip with gzip -d in all scripts (for efficiency).
* Better option checking in make-big-lm, disallowing mixing of discounting
methods and use of discounting flags that are not supported.
* Undefine max() macro in Trellis.h to avoid conflict with some system
header files.
* Better support for recent MSVC versions in common/Makefile.machine.msvc
(from Mike Frandsen).
* add-pauses-to-pfsg: prevent existing pause nodes from being processed.
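The intlog/bytelog fix above amounts to saturating the fixed-point conversion instead of letting it wrap; a rough sketch of the idea (the 10000*log10 scale and the integer bounds are illustrative assumptions, not the exact constants from Prob.h):

```python
import math

# Saturating float-log -> fixed-point conversion, sketching the kind of
# guard added to Prob.h. The scale and bounds below are assumptions for
# illustration; SRILM's actual constants live in Prob.h.
INTLOG_SCALE = 10000
INT_MIN, INT_MAX = -(2**31), 2**31 - 1

def prob_to_intlog(p):
    if p <= 0.0:
        return INT_MIN                        # log(0): saturate at the floor
    v = round(math.log10(p) * INTLOG_SCALE)
    return max(INT_MIN, min(INT_MAX, v))      # clamp instead of overflowing

print(prob_to_intlog(0.001))   # -30000
print(prob_to_intlog(1.0))     # 0
print(prob_to_intlog(0.0))     # saturates at INT_MIN
```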
From suzuki at ks.cs.titech.ac.jp Mon Jan 24 00:10:50 2011
From: suzuki at ks.cs.titech.ac.jp (suzuki yasuo)
Date: Mon, 24 Jan 2011 17:10:50 +0900
Subject: [SRILM User List] Question about output of "ngram -ppl -debug 2"
for class-based LM model
Message-ID: <20110124171050.d87347e1.suzuki@ks.cs.titech.ac.jp>
Hello, all.
I built a class-based LM (bigram) and calculated the perplexity of some test data with this command in a shell script:
"ngram -order 2 -lm ${CLASS_LM_NAME} -ppl ${TEST} -debug 2 -classes ${CLASS_FILE}".
The -debug 2 output looks, in part, like this:
The term is generally applied to behavior within civil governments , but politics has been observed in other group interactions , including corporate , academic , and religious institutions .
p( The | <s> ) = [OOV][2gram] 0.00520962 [ -2.28319 ]
p( term | The ...) = [OOV][1gram][OOV][2gram] 0.000536365 [ -3.27054 ]
p( is | term ...) = [OOV][1gram][OOV][2gram] 0.0139987 [ -1.85391 ]
p( generally | is ...) = [OOV][1gram][OOV][2gram] 0.000171588 [ -3.76551 ]
p( applied | generally ...) = [OOV][1gram][OOV][2gram] 0.000122932 [ -3.91033 ]
p( to | applied ...) = [OOV][1gram][OOV][2gram] 0.0811208 [ -1.09087 ]
p( behavior | to ...) = [OOV][1gram][OOV][2gram] 6.12967e-05 [ -4.21256 ]
p( within | behavior ...) = [OOV][1gram][OOV][2gram] 0.000763519 [ -3.11718 ]
p( civil | within ...) = [OOV][1gram][OOV][2gram] 4.96081e-05 [ -4.30445 ]
p( governments | civil ...) = [1gram][1gram] 0.0156937 [ -1.80427 ]
p( , | governments ...) = [OOV][1gram] 0.0149661 [ -1.82489 ]
p( but | , ...) = [OOV][1gram][OOV][2gram] 0.00500311 [ -2.30076 ]
p( politics | but ...) = [OOV][1gram][OOV][2gram] 4.8048e-05 [ -4.31833 ]
p( has | politics ...) = [OOV][1gram][OOV][1gram] 0.000661878 [ -3.17922 ]
p( been | has ...) = [OOV][1gram][OOV][2gram] 0.00721624 [ -2.14169 ]
p( observed | been ...) = [OOV][1gram][OOV][1gram] 1.12884e-05 [ -4.94737 ]
p( in | observed ...) = [OOV][1gram][1gram][OOV][2gram][1gram] 0.0144335 [ -1.84063 ]
p( other | in ...) = [OOV][1gram][OOV][2gram][OOV][2gram] 0.00162061 [ -2.79032 ]
p( group | other ...) = [OOV][1gram][OOV][2gram] 0.000567602 [ -3.24596 ]
p( interactions | group ...) = [1gram][1gram] 0.0150167 [ -1.82343 ]
p( , | interactions ...) = [OOV][1gram] 0.0149661 [ -1.82489 ]
p( including | , ...) = [OOV][1gram][OOV][2gram] 0.000755534 [ -3.12175 ]
p( corporate | including ...) = [OOV][1gram][OOV][2gram] 5.59105e-05 [ -4.25251 ]
p( , | corporate ...) = [OOV][1gram][OOV][1gram] 0.0222226 [ -1.65321 ]
p( academic | , ...) = [OOV][1gram][OOV][2gram] 4.36976e-05 [ -4.35954 ]
p( , | academic ...) = [OOV][1gram][OOV][1gram] 0.0222226 [ -1.65321 ]
p( and | , ...) = [OOV][1gram][OOV][2gram] 0.0787025 [ -1.10401 ]
p( religious | and ...) = [OOV][1gram][OOV][2gram] 6.80949e-05 [ -4.16689 ]
p( institutions | religious ...) = [OOV][1gram][OOV][2gram] 0.000141801 [ -3.84832 ]
p( . | institutions ...) = [OOV][1gram][OOV][2gram] 0.0110882 [ -1.95514 ]
p( </s> | . ...) = [1gram][2gram] 0.979002 [ -0.00921631 ]
1 sentences, 30 words, 0 OOVs
0 zeroprobs, logprob= -85.9741 ppl= 593.414 ppl1= 734.18
I can understand how these probabilities were calculated for most of the lines, but I can't analyze this line:
p( in | observed ...) = [OOV][1gram][1gram][OOV][2gram][1gram] 0.0144335 [ -1.84063 ]
Could you tell me the meaning of this line? How was this probability calculated from my class-based LM?
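For reference, the ppl and ppl1 figures in the summary line follow directly from the reported logprob, assuming SRILM's usual normalization (ppl divides by words plus sentence-end events, ppl1 by words only):

```python
# Reproduce the figures from the -ppl summary above:
#   1 sentences, 30 words, 0 OOVs
#   0 zeroprobs, logprob= -85.9741 ppl= 593.414 ppl1= 734.18
logprob, words, sentences, oovs, zeroprobs = -85.9741, 30, 1, 0, 0
denom = words - oovs - zeroprobs
ppl = 10 ** (-logprob / (denom + sentences))   # counts the </s> events too
ppl1 = 10 ** (-logprob / denom)                # words only
print(ppl, ppl1)                               # approx. 593.4 and 734.2
```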
--
Yasuo Suzuki
4th year undergrad at Shinoda Laboratory
Department of Computer Science
Tokyo Institute of Technology
suzuki at ks.cs.titech.ac.jp
From pawang.iitk at gmail.com Mon Jan 24 11:12:12 2011
From: pawang.iitk at gmail.com (Pawan Goyal)
Date: Mon, 24 Jan 2011 19:12:12 +0000
Subject: [SRILM User List] error message while running make World
Message-ID:
Hi all,
uname -a
Linux pawan-laptop 2.6.32-28-generic #55-Ubuntu SMP Mon Jan 10 23:42:43 UTC
2011 x86_64 GNU/Linux
gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5)
error message while
make World
.........................
/usr/bin/ld: skipping incompatible
/usr/lib/gcc/x86_64-linux-gnu/4.4.3/libstdc++.so when searching for -lstdc++
/usr/bin/ld: skipping incompatible
/usr/lib/gcc/x86_64-linux-gnu/4.4.3/libstdc++.a when searching for -lstdc++
/usr/bin/ld: cannot find -lstdc++
collect2: ld returned 1 exit status
/home/pawan/Documents/PhD/summarization/srilm/sbin/decipher-install 0555
../bin/i686/maxalloc ../../bin/i686
ERROR: File to be installed (../bin/i686/maxalloc) does not exist.
ERROR: File to be installed (../bin/i686/maxalloc) is not a plain file.
Usage: decipher-install ...
mode: file permission mode, in octal
file1 ... fileN: files to be installed
directory: where the files should be installed
..................................................................................
Thanks in advance
Pawan
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From stolcke at speech.sri.com Mon Jan 24 14:07:57 2011
From: stolcke at speech.sri.com (Andreas Stolcke)
Date: Mon, 24 Jan 2011 14:07:57 -0800
Subject: [SRILM User List] error message while running make World
In-Reply-To:
References:
Message-ID: <4D3DF83D.30307@speech.sri.com>
Pawan Goyal wrote:
> Hi all,
>
> uname -a
> Linux pawan-laptop 2.6.32-28-generic #55-Ubuntu SMP Mon Jan 10
> 23:42:43 UTC 2011 x86_64 GNU/Linux
>
> gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5)
The probable reason is that you're trying to compile 32-bit binaries
(the default for the i686 platform), but your Ubuntu system doesn't have
the required libraries installed (only the 64-bit ones are installed on
most systems).
Two solutions: 1) install the optional 32-bit libraries using commands
such as
apt-get install ia32-libs
(if you want to build with Tcl support you'd also need the 32-bit version
of libtcl -- I don't know the name of the package).
2) Compile 64-bit binaries. You can copy
common/Makefile.machine.i686-ubuntu to common/Makefile.machine.i686, or
edit the file by hand.
Andreas
>
> error message while
>
> make World
>
> .........................
> /usr/bin/ld: skipping incompatible
> /usr/lib/gcc/x86_64-linux-gnu/4.4.3/libstdc++.so when searching for
> -lstdc++
> /usr/bin/ld: skipping incompatible
> /usr/lib/gcc/x86_64-linux-gnu/4.4.3/libstdc++.a when searching for
> -lstdc++
>
> /usr/bin/ld: cannot find -lstdc++
> collect2: ld returned 1 exit status
> /home/pawan/Documents/PhD/summarization/srilm/sbin/decipher-install
> 0555 ../bin/i686/maxalloc ../../bin/i686
> ERROR: File to be installed (../bin/i686/maxalloc) does not exist.
> ERROR: File to be installed (../bin/i686/maxalloc) is not a plain file.
> Usage: decipher-install ...
> mode: file permission mode, in octal
> file1 ... fileN: files to be installed
> directory: where the files should be installed
>
> ..................................................................................
>
> Thanks in advance
> Pawan
> ------------------------------------------------------------------------
>
> _______________________________________________
> SRILM-User site list
> SRILM-User at speech.sri.com
> http://www.speech.sri.com/mailman/listinfo/srilm-user
From stolcke at speech.sri.com Mon Jan 24 14:21:50 2011
From: stolcke at speech.sri.com (Andreas Stolcke)
Date: Mon, 24 Jan 2011 14:21:50 -0800
Subject: [SRILM User List] Question about output of "ngram -ppl -debug
2" for class-based LM model
In-Reply-To: <20110124171050.d87347e1.suzuki@ks.cs.titech.ac.jp>
References: <20110124171050.d87347e1.suzuki@ks.cs.titech.ac.jp>
Message-ID: <4D3DFB7E.4030603@speech.sri.com>
suzuki yasuo wrote:
> Hello, all.
>
> I made a class LM(bigram) and caluculated ppl of some testdata by this command in shell script,
>
> "ngram -order 2 -lm ${CLASS_LM_NAME} -ppl ${TEST} -debug 2 -classes ${CLASS_FILE}".
>
> I can get output of -debug 2. A part of that is like this..
>
>
> The term is generally applied to behavior within civil governments , but politics has been observed in other group interactions , including corporate , academic , and religious institutions .
> p( The | <s> ) = [OOV][2gram] 0.00520962 [ -2.28319 ]
> p( term | The ...) = [OOV][1gram][OOV][2gram] 0.000536365 [ -3.27054 ]
> p( is | term ...) = [OOV][1gram][OOV][2gram] 0.0139987 [ -1.85391 ]
> p( generally | is ...) = [OOV][1gram][OOV][2gram] 0.000171588 [ -3.76551 ]
> p( applied | generally ...) = [OOV][1gram][OOV][2gram] 0.000122932 [ -3.91033 ]
> p( to | applied ...) = [OOV][1gram][OOV][2gram] 0.0811208 [ -1.09087 ]
> p( behavior | to ...) = [OOV][1gram][OOV][2gram] 6.12967e-05 [ -4.21256 ]
> p( within | behavior ...) = [OOV][1gram][OOV][2gram] 0.000763519 [ -3.11718 ]
> p( civil | within ...) = [OOV][1gram][OOV][2gram] 4.96081e-05 [ -4.30445 ]
> p( governments | civil ...) = [1gram][1gram] 0.0156937 [ -1.80427 ]
> p( , | governments ...) = [OOV][1gram] 0.0149661 [ -1.82489 ]
> p( but | , ...) = [OOV][1gram][OOV][2gram] 0.00500311 [ -2.30076 ]
> p( politics | but ...) = [OOV][1gram][OOV][2gram] 4.8048e-05 [ -4.31833 ]
> p( has | politics ...) = [OOV][1gram][OOV][1gram] 0.000661878 [ -3.17922 ]
> p( been | has ...) = [OOV][1gram][OOV][2gram] 0.00721624 [ -2.14169 ]
> p( observed | been ...) = [OOV][1gram][OOV][1gram] 1.12884e-05 [ -4.94737 ]
> p( in | observed ...) = [OOV][1gram][1gram][OOV][2gram][1gram] 0.0144335 [ -1.84063 ]
> p( other | in ...) = [OOV][1gram][OOV][2gram][OOV][2gram] 0.00162061 [ -2.79032 ]
> p( group | other ...) = [OOV][1gram][OOV][2gram] 0.000567602 [ -3.24596 ]
> p( interactions | group ...) = [1gram][1gram] 0.0150167 [ -1.82343 ]
> p( , | interactions ...) = [OOV][1gram] 0.0149661 [ -1.82489 ]
> p( including | , ...) = [OOV][1gram][OOV][2gram] 0.000755534 [ -3.12175 ]
> p( corporate | including ...) = [OOV][1gram][OOV][2gram] 5.59105e-05 [ -4.25251 ]
> p( , | corporate ...) = [OOV][1gram][OOV][1gram] 0.0222226 [ -1.65321 ]
> p( academic | , ...) = [OOV][1gram][OOV][2gram] 4.36976e-05 [ -4.35954 ]
> p( , | academic ...) = [OOV][1gram][OOV][1gram] 0.0222226 [ -1.65321 ]
> p( and | , ...) = [OOV][1gram][OOV][2gram] 0.0787025 [ -1.10401 ]
> p( religious | and ...) = [OOV][1gram][OOV][2gram] 6.80949e-05 [ -4.16689 ]
> p( institutions | religious ...) = [OOV][1gram][OOV][2gram] 0.000141801 [ -3.84832 ]
> p( . | institutions ...) = [OOV][1gram][OOV][2gram] 0.0110882 [ -1.95514 ]
> p( </s> | . ...) = [1gram][2gram] 0.979002 [ -0.00921631 ]
> 1 sentences, 30 words, 0 OOVs
> 0 zeroprobs, logprob= -85.9741 ppl= 593.414 ppl1= 734.18
>
> I can understand how these probs were caluculated for most of the lines, but I can't analyze this line
>
> p( in | observed ...) = [OOV][1gram][1gram][OOV][2gram][1gram] 0.0144335 [ -1.84063 ]
>
> Will you tell me the meaning of this line? How this prob were caluculated from my class-based LM?
>
Each term in brackets [OOV] [1gram] ... corresponds to one way to parse
the word, either as part of a class expansion or as a plain word.
For example, you see
p( The | <s> ) = [OOV][2gram] 0.00520962 [ -2.28319 ]
because the first word could be generated by the LM as the bigram <s> The, or
as <s> CLASS with "The" being a member of CLASS.
I suspect your LM doesn't contain "The" as a vocabulary item independent
of CLASS, hence the first parse yields the [OOV] label.
Once you get to the second word, you have more ways to predict the next
word, because now the history also has multiple parses.
In general, the predicted probabilities for all parses are added up to
arrive at the total conditional probability.
To disable this type of processing (multiple parses) you can use the
-simple-classes option, but that only works if word-class membership is
unambiguous.
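The summing-over-parses computation can be sketched as follows (the class, memberships, and probabilities are invented for illustration and are not from an actual LM):

```python
# Total conditional probability of a word under a class bigram LM:
# sum, over every way of parsing the word (as a plain vocabulary item,
# or as a member of each class containing it), of that parse's
# probability. All names and numbers below are invented examples.

def word_prob(word, history, ngram, classes):
    total = 0.0
    # parse 1: the word as a plain vocabulary item
    total += ngram.get((word, history), 0.0)   # 0.0 plays the [OOV] role
    # parses 2..n: the word as a member of each class it belongs to
    for cls, members in classes.items():
        if word in members:
            # p(CLASS | history) * p(word | CLASS)
            total += ngram.get((cls, history), 0.0) * members[word]
    return total

ngram = {("NOUN", "the"): 0.4}                        # p(NOUN | the)
classes = {"NOUN": {"term": 0.05, "politics": 0.01}}  # p(word | NOUN)
print(word_prob("term", "the", ngram, classes))       # 0.4 * 0.05 = 0.02
```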
Andreas
>
>
>
From pawang.iitk at gmail.com Mon Jan 24 14:45:36 2011
From: pawang.iitk at gmail.com (Pawan Goyal)
Date: Mon, 24 Jan 2011 22:45:36 +0000
Subject: [SRILM User List] error message while running make World
In-Reply-To: <4D3DF83D.30307@speech.sri.com>
References:
<4D3DF83D.30307@speech.sri.com>
Message-ID:
Hi Andreas,
Thanks for pointing out the incompatibility problem. I had the 32-bit binaries
installed already, so tried the second option. I am not using tcl support,
i.e.
NO_TCL = X
TCL_INCLUDE =
TCL_LIBRARY =
I am still getting the problems and, sorry, I am not able to figure out
the solution. Part of the error message during make World:
..................................................................................................................
/usr/bin/g++ -march=athlon64 -m64 -Wall -Wno-unused-variable
-Wno-uninitialized -DINSTANTIATE_TEMPLATES -D_FILE_OFFSET_BITS=64
-I. -I../../include -L../../lib/i686 -g -O3 -o ../bin/i686/maxalloc
../obj/i686/maxalloc.o ../obj/i686/libdstruct.a -lm -ldl
../../lib/i686/libmisc.a -lm 2>&1 | c++filt
/usr/bin/ld: i386 architecture of input file `../obj/i686/maxalloc.o'
is incompatible with i386:x86-64 output
/usr/bin/ld: i386 architecture of input file
`../../lib/i686/libmisc.a(option.o)' is incompatible with i386:x86-64
output
collect2: ld returned 1 exit status
/home/pawan/Documents/PhD/summarization/srilm/sbin/decipher-install
0555 ../bin/i686/maxalloc ../../bin/i686
ERROR: File to be installed (../bin/i686/maxalloc) does not exist.
ERROR: File to be installed (../bin/i686/maxalloc) is not a plain file.
Usage: decipher-install ...
mode: file permission mode, in octal
file1 ... fileN: files to be installed
directory: where the files should be installed
files = ../bin/i686/maxalloc
directory = ../../bin/i686
mode = 0555
make[2]: [../../bin/i686/maxalloc] Error 1 (ignored)
make[2]: Leaving directory
`/home/pawan/Documents/PhD/summarization/srilm/dstruct/src'
make[2]: Entering directory
`/home/pawan/Documents/PhD/summarization/srilm/lm/src'
/usr/bin/g++ -march=athlon64 -m64 -Wall -Wno-unused-variable
-Wno-uninitialized -DINSTANTIATE_TEMPLATES -D_FILE_OFFSET_BITS=64
-I. -I../../include -u matherr -L../../lib/i686 -g -O3 -o
../bin/i686/ngram ../obj/i686/ngram.o ../obj/i686/liboolm.a -lm -ldl
../../lib/i686/libflm.a ../../lib/i686/libdstruct.a
../../lib/i686/libmisc.a -lm 2>&1 | c++filt
collect2: ld terminated with signal 11 [Segmentation fault]
/usr/bin/ld: i386 architecture of input file `../obj/i686/ngram.o' is
incompatible with i386:x86-64 output
/usr/bin/ld: i386 architecture of input file
`../obj/i686/liboolm.a(matherr.o)' is incompatible with i386:x86-64
output
......................................................................................
/home/pawan/Documents/PhD/summarization/srilm/sbin/decipher-install
0555 ../bin/i686/ngram ../../bin/i686
ERROR: File to be installed (../bin/i686/ngram) does not exist.
ERROR: File to be installed (../bin/i686/ngram) is not a plain file.
Usage: decipher-install ...
mode: file permission mode, in octal
file1 ... fileN: files to be installed
directory: where the files should be installed
files = ../bin/i686/ngram
directory = ../../bin/i686
mode = 0555
make[2]: [../../bin/i686/ngram] Error 1 (ignored)
...............................................................................
Thanks
Pawan
On Mon, Jan 24, 2011 at 10:07 PM, Andreas Stolcke wrote:
> Pawan Goyal wrote:
>
>> Hi all,
>>
>> uname -a
>> Linux pawan-laptop 2.6.32-28-generic #55-Ubuntu SMP Mon Jan 10 23:42:43
>> UTC 2011 x86_64 GNU/Linux
>>
>> gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5)
>>
> The probable reason is that you're trying to compile 32-bit binaries (the
> default for the i686 platform), but you Ubuntu system doesn't have the
> required libraries installed (only the 64bit ones are installed on most
> systems).
>
> Two solutions: 1) install the optional 32bit binaries using commands such
> as
>
> apt-get install ia32-libs
>
> (if you want to build with Tcl support you'd also need the 32bit version of
> libtcl -- don't know the name of the package).
>
> 2) Compile 64bit binaries. You can copy
> common/Makefile.machine.i686-ubuntu to common/Makefile.machine.i686, or edit
> the file by hand.
>
> Andreas
>
>
>
>> error message while
>> make World
>>
>> .........................
>> /usr/bin/ld: skipping incompatible
>> /usr/lib/gcc/x86_64-linux-gnu/4.4.3/libstdc++.so when searching for -lstdc++
>> /usr/bin/ld: skipping incompatible
>> /usr/lib/gcc/x86_64-linux-gnu/4.4.3/libstdc++.a when searching for -lstdc++
>>
>> /usr/bin/ld: cannot find -lstdc++
>> collect2: ld returned 1 exit status
>> /home/pawan/Documents/PhD/summarization/srilm/sbin/decipher-install 0555
>> ../bin/i686/maxalloc ../../bin/i686
>> ERROR: File to be installed (../bin/i686/maxalloc) does not exist.
>> ERROR: File to be installed (../bin/i686/maxalloc) is not a plain file.
>> Usage: decipher-install ...
>> mode: file permission mode, in octal
>> file1 ... fileN: files to be installed
>> directory: where the files should be installed
>>
>>
>> ..................................................................................
>>
>> Thanks in advance
>> Pawan
>> ------------------------------------------------------------------------
>>
>> _______________________________________________
>> SRILM-User site list
>> SRILM-User at speech.sri.com
>> http://www.speech.sri.com/mailman/listinfo/srilm-user
>>
>
>
From stolcke at speech.sri.com Mon Jan 24 17:27:20 2011
From: stolcke at speech.sri.com (Andreas Stolcke)
Date: Mon, 24 Jan 2011 17:27:20 -0800
Subject: [SRILM User List] error message while running make World
In-Reply-To:
References:
<4D3DF83D.30307@speech.sri.com>
Message-ID: <4D3E26F8.6070307@speech.sri.com>
Pawan Goyal wrote:
> Hi Andreas,
>
> Thanks for pointing out the incompatibly problem. I had the 32-bit
> binaries installed already, so tried the second option. I am not using
> tcl support, i.e.
> NO_TCL = X
> TCL_INCLUDE =
> TCL_LIBRARY =
> I am still getting the problems and sorry but not able to figure out the solution. Part of the error message during make World:
But now you are trying a 64-bit compile! (Look at your g++ options.)
That is fine, but you need to completely remove all old .o files, because
you cannot mix 32-bit and 64-bit .o files and libraries.
Andreas
>
> ..................................................................................................................
> /usr/bin/g++ -march=athlon64 -m64 -Wall -Wno-unused-variable -Wno-uninitialized -DINSTANTIATE_TEMPLATES -D_FILE_OFFSET_BITS=64 -I. -I../../include -L../../lib/i686 -g -O3 -o ../bin/i686/maxalloc ../obj/i686/maxalloc.o ../obj/i686/libdstruct.a -lm -ldl ../../lib/i686/libmisc.a -lm 2>&1 | c++filt
> /usr/bin/ld: i386 architecture of input file `../obj/i686/maxalloc.o' is incompatible with i386:x86-64 output
> /usr/bin/ld: i386 architecture of input file `../../lib/i686/libmisc.a(option.o)' is incompatible with i386:x86-64 output
> collect2: ld returned 1 exit status
> /home/pawan/Documents/PhD/summarization/srilm/sbin/decipher-install 0555 ../bin/i686/maxalloc ../../bin/i686
> ERROR: File to be installed (../bin/i686/maxalloc) does not exist.
> ERROR: File to be installed (../bin/i686/maxalloc) is not a plain file.
> Usage: decipher-install ...
> mode: file permission mode, in octal
> file1 ... fileN: files to be installed
> directory: where the files should be installed
>
> files = ../bin/i686/maxalloc
> directory = ../../bin/i686
> mode = 0555
>
> make[2]: [../../bin/i686/maxalloc] Error 1 (ignored)
> make[2]: Leaving directory `/home/pawan/Documents/PhD/summarization/srilm/dstruct/src'
> make[2]: Entering directory `/home/pawan/Documents/PhD/summarization/srilm/lm/src'
> /usr/bin/g++ -march=athlon64 -m64 -Wall -Wno-unused-variable -Wno-uninitialized -DINSTANTIATE_TEMPLATES -D_FILE_OFFSET_BITS=64 -I. -I../../include -u matherr -L../../lib/i686 -g -O3 -o ../bin/i686/ngram ../obj/i686/ngram.o ../obj/i686/liboolm.a -lm -ldl ../../lib/i686/libflm.a ../../lib/i686/libdstruct.a ../../lib/i686/libmisc.a -lm 2>&1 | c++filt
> collect2: ld terminated with signal 11 [Segmentation fault]
> /usr/bin/ld: i386 architecture of input file `../obj/i686/ngram.o' is incompatible with i386:x86-64 output
> /usr/bin/ld: i386 architecture of input file `../obj/i686/liboolm.a(matherr.o)' is incompatible with i386:x86-64 output
> ......................................................................................
> /home/pawan/Documents/PhD/summarization/srilm/sbin/decipher-install 0555 ../bin/i686/ngram ../../bin/i686
> ERROR: File to be installed (../bin/i686/ngram) does not exist.
> ERROR: File to be installed (../bin/i686/ngram) is not a plain file.
> Usage: decipher-install ...
> mode: file permission mode, in octal
> file1 ... fileN: files to be installed
> directory: where the files should be installed
>
> files = ../bin/i686/ngram
> directory = ../../bin/i686
> mode = 0555
>
> make[2]: [../../bin/i686/ngram] Error 1 (ignored)
>
> ...............................................................................
>
> Thanks
> Pawan
>
> On Mon, Jan 24, 2011 at 10:07 PM, Andreas Stolcke
> > wrote:
>
> Pawan Goyal wrote:
>
> Hi all,
>
> uname -a
> Linux pawan-laptop 2.6.32-28-generic #55-Ubuntu SMP Mon Jan 10
> 23:42:43 UTC 2011 x86_64 GNU/Linux
>
> gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5)
>
> The probable reason is that you're trying to compile 32-bit
> binaries (the default for the i686 platform), but you Ubuntu
> system doesn't have the required libraries installed (only the
> 64bit ones are installed on most systems).
>
> Two solutions: 1) install the optional 32bit binaries using
> commands such as
>
> apt-get install ia32-libs
>
> (if you want to build with Tcl support you'd also need the 32bit
> version of libtcl -- don't know the name of the package).
>
> 2) Compile 64bit binaries. You can copy
> common/Makefile.machine.i686-ubuntu to
> common/Makefile.machine.i686, or edit the file by hand.
>
> Andreas
>
>
>
> error message while
> make World
>
> .........................
> /usr/bin/ld: skipping incompatible
> /usr/lib/gcc/x86_64-linux-gnu/4.4.3/libstdc++.so when
> searching for -lstdc++
> /usr/bin/ld: skipping incompatible
> /usr/lib/gcc/x86_64-linux-gnu/4.4.3/libstdc++.a when searching
> for -lstdc++
>
> /usr/bin/ld: cannot find -lstdc++
> collect2: ld returned 1 exit status
> /home/pawan/Documents/PhD/summarization/srilm/sbin/decipher-install
> 0555 ../bin/i686/maxalloc ../../bin/i686
> ERROR: File to be installed (../bin/i686/maxalloc) does not
> exist.
> ERROR: File to be installed (../bin/i686/maxalloc) is not a
> plain file.
> Usage: decipher-install ...
> mode: file permission mode, in octal
> file1 ... fileN: files to be installed
> directory: where the files should be installed
>
> ..................................................................................
>
> Thanks in advance
> Pawan
> ------------------------------------------------------------------------
>
> _______________________________________________
> SRILM-User site list
> SRILM-User at speech.sri.com
> http://www.speech.sri.com/mailman/listinfo/srilm-user
>
>
>
From mshamsuddeen2 at gmail.com Mon Jan 24 23:57:05 2011
From: mshamsuddeen2 at gmail.com (Muhammad Shamsuddeen Muhammad)
Date: Tue, 25 Jan 2011 15:57:05 +0800
Subject: [SRILM User List] ngram-count missing
Message-ID:
I get this error when running the following command:
integ at integ-desktop:~/tools/demo$ ../../tools/srilm/bin/i686-gcc4/ngram-count -order 3 -interpolate
-kndiscount -unk -text work/lm/news-commentary.lowercased.en -lm
work/lm/news-commentary.lowercased.lm
bash: ../../tools/srilm/bin/i686-gcc4/ngram-count: No such file or directory
Up until this point, the installation process had been smooth. I have
checked the directory and the file is nowhere to be found. Can anyone shed
light on the situation and possibly guide me on how to fix it?
Thanks in advance.
--
Muhammad Shamsuddeen Muhammad
"There is no knowledge that is not power".
From pawang.iitk at gmail.com Tue Jan 25 03:03:58 2011
From: pawang.iitk at gmail.com (Pawan Goyal)
Date: Tue, 25 Jan 2011 11:03:58 +0000
Subject: [SRILM User List] error message while running make World
In-Reply-To: <4D3E26F8.6070307@speech.sri.com>
References:
<4D3DF83D.30307@speech.sri.com>
<4D3E26F8.6070307@speech.sri.com>
Message-ID:
Thanks, Andreas. I did everything from the start again and it was successful!
Regards
Pawan
On Tue, Jan 25, 2011 at 1:27 AM, Andreas Stolcke wrote:
> Pawan Goyal wrote:
>
>> Hi Andreas,
>>
>> Thanks for pointing out the incompatibly problem. I had the 32-bit
>> binaries installed already, so tried the second option. I am not using tcl
>> support, i.e. NO_TCL = X
>> TCL_INCLUDE = TCL_LIBRARY = I am still getting the problems and
>> sorry but not able to figure out the solution. Part of the error message
>> during make World:
>>
> But now you are trying a 64bit compile! (look at your g++ options.
>
> That is fine, but you need to completely remove all old .o files because
> you cannot mix 32bit and 64bit .o files and libraries.
>
> Andreas
>
>>
>> ..................................................................................................................
>> /usr/bin/g++ -march=athlon64 -m64 -Wall -Wno-unused-variable
>> -Wno-uninitialized -DINSTANTIATE_TEMPLATES -D_FILE_OFFSET_BITS=64 -I.
>> -I../../include -L../../lib/i686 -g -O3 -o ../bin/i686/maxalloc
>> ../obj/i686/maxalloc.o ../obj/i686/libdstruct.a -lm -ldl
>> ../../lib/i686/libmisc.a -lm 2>&1 | c++filt
>> /usr/bin/ld: i386 architecture of input file `../obj/i686/maxalloc.o' is
>> incompatible with i386:x86-64 output
>> /usr/bin/ld: i386 architecture of input file
>> `../../lib/i686/libmisc.a(option.o)' is incompatible with i386:x86-64 output
>> collect2: ld returned 1 exit status
>> /home/pawan/Documents/PhD/summarization/srilm/sbin/decipher-install 0555
>> ../bin/i686/maxalloc ../../bin/i686
>> ERROR: File to be installed (../bin/i686/maxalloc) does not exist.
>> ERROR: File to be installed (../bin/i686/maxalloc) is not a plain file.
>> Usage: decipher-install ...
>> mode: file permission mode, in octal
>> file1 ... fileN: files to be installed
>> directory: where the files should be installed
>>
>> files = ../bin/i686/maxalloc
>> directory = ../../bin/i686
>> mode = 0555
>>
>> make[2]: [../../bin/i686/maxalloc] Error 1 (ignored)
>> make[2]: Leaving directory
>> `/home/pawan/Documents/PhD/summarization/srilm/dstruct/src'
>> make[2]: Entering directory
>> `/home/pawan/Documents/PhD/summarization/srilm/lm/src'
>> /usr/bin/g++ -march=athlon64 -m64 -Wall -Wno-unused-variable
>> -Wno-uninitialized -DINSTANTIATE_TEMPLATES -D_FILE_OFFSET_BITS=64 -I.
>> -I../../include -u matherr -L../../lib/i686 -g -O3 -o ../bin/i686/ngram
>> ../obj/i686/ngram.o ../obj/i686/liboolm.a -lm -ldl ../../lib/i686/libflm.a
>> ../../lib/i686/libdstruct.a ../../lib/i686/libmisc.a -lm 2>&1 | c++filt
>> collect2: ld terminated with signal 11 [Segmentation fault]
>> /usr/bin/ld: i386 architecture of input file `../obj/i686/ngram.o' is
>> incompatible with i386:x86-64 output
>> /usr/bin/ld: i386 architecture of input file
>> `../obj/i686/liboolm.a(matherr.o)' is incompatible with i386:x86-64 output
>>
>> ......................................................................................
>> /home/pawan/Documents/PhD/summarization/srilm/sbin/decipher-install 0555
>> ../bin/i686/ngram ../../bin/i686
>> ERROR: File to be installed (../bin/i686/ngram) does not exist.
>> ERROR: File to be installed (../bin/i686/ngram) is not a plain file.
>> Usage: decipher-install ...
>> mode: file permission mode, in octal
>> file1 ... fileN: files to be installed
>> directory: where the files should be installed
>>
>> files = ../bin/i686/ngram
>> directory = ../../bin/i686
>> mode = 0555
>>
>> make[2]: [../../bin/i686/ngram] Error 1 (ignored)
>>
>>
>> ...............................................................................
>>
>> Thanks Pawan
>>
>> On Mon, Jan 24, 2011 at 10:07 PM, Andreas Stolcke > stolcke at speech.sri.com>> wrote:
>>
>> Pawan Goyal wrote:
>>
>> Hi all,
>>
>> uname -a
>> Linux pawan-laptop 2.6.32-28-generic #55-Ubuntu SMP Mon Jan 10
>> 23:42:43 UTC 2011 x86_64 GNU/Linux
>>
>> gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5)
>>
>> The probable reason is that you're trying to compile 32-bit
>> binaries (the default for the i686 platform), but you Ubuntu
>> system doesn't have the required libraries installed (only the
>> 64bit ones are installed on most systems).
>>
>> Two solutions: 1) install the optional 32bit binaries using
>> commands such as
>>
>> apt-get install ia32-libs
>>
>> (if you want to build with Tcl support you'd also need the 32bit
>> version of libtcl -- don't know the name of the package).
>>
>> 2) Compile 64bit binaries. You can copy
>> common/Makefile.machine.i686-ubuntu to
>> common/Makefile.machine.i686, or edit the file by hand.
>>
>> Andreas
>>
>>
>>
>> error message while
>> make World
>>
>> .........................
>> /usr/bin/ld: skipping incompatible
>> /usr/lib/gcc/x86_64-linux-gnu/4.4.3/libstdc++.so when
>> searching for -lstdc++
>> /usr/bin/ld: skipping incompatible
>> /usr/lib/gcc/x86_64-linux-gnu/4.4.3/libstdc++.a when searching
>> for -lstdc++
>>
>> /usr/bin/ld: cannot find -lstdc++
>> collect2: ld returned 1 exit status
>> /home/pawan/Documents/PhD/summarization/srilm/sbin/decipher-install
>> 0555 ../bin/i686/maxalloc ../../bin/i686
>> ERROR: File to be installed (../bin/i686/maxalloc) does not
>> exist.
>> ERROR: File to be installed (../bin/i686/maxalloc) is not a
>> plain file.
>> Usage: decipher-install ...
>> mode: file permission mode, in octal
>> file1 ... fileN: files to be installed
>> directory: where the files should be installed
>>
>>
>> ..................................................................................
>>
>> Thanks in advance
>> Pawan
>>
>> ------------------------------------------------------------------------
>>
>> _______________________________________________
>> SRILM-User site list
>> SRILM-User at speech.sri.com
>>
>> http://www.speech.sri.com/mailman/listinfo/srilm-user
>>
>>
>>
>>
>
From mehdi_hoseini at comp.iust.ac.ir Wed Jan 26 05:47:51 2011
From: mehdi_hoseini at comp.iust.ac.ir (Mehdi hoseini)
Date: Wed, 26 Jan 2011 17:17:51 +0330
Subject: [SRILM User List] Problem in using Language model
Message-ID:
hi all
I made a simple trigram in ARPA format with SRILM and I made an ASR with HTK,
but I have problems using this trigram model in HTK. Does anybody here
use language models in HTK? If so, I have some questions to ask.
best regards
Mehdi Hoseini
From zeeshankhans at gmail.com Thu Jan 27 13:59:59 2011
From: zeeshankhans at gmail.com (zeeshan khan)
Date: Thu, 27 Jan 2011 22:59:59 +0100
Subject: [SRILM User List] dynamic Loglinear mix for lattice rescoring
Message-ID:
Hi all,
Is there a way to rescore HTK lattices using dynamic log-linear
interpolation of more than one language model, using SRILM?
Ideally, the command should look like
lattice-tool -read-htk -in-lattice $SRC_LATTICE -lm $LM_FILE -order
$LM_ORDER -bayes 0 -lambda $LAMBDA -mix-lm $LM2_FILE -loglinear-mix
-write-htk -out-lattice $TMP_TRG_LATTIC -unk -map-unk $UNK_WORD
-keep-unk
but there is no -loglinear-mix option in lattice-tool, like there is in ngram.
Thanks in advance,
Zeeshan.
From stolcke at speech.sri.com Fri Jan 28 21:49:48 2011
From: stolcke at speech.sri.com (Andreas Stolcke)
Date: Sat, 29 Jan 2011 00:49:48 -0500
Subject: [SRILM User List] lattice-tool related issues
In-Reply-To:
References: <201008032329.o73NTtj15149@huge>
Message-ID: <4D43AA7C.6010106@speech.sri.com>
Sorry for not responding earlier to this.
The latest version has a new lattice-tool option: -zeroprob-word . It
allows you to avoid assigning zero probabilities to OOV words without
mapping them to <unk>.
Andreas
Anoop Deoras wrote:
>
> On Aug 3, 2010, at 7:29 PM, Andreas Stolcke wrote:
>
>>
>> In message <5D1CA95A-F9E4-417E-B276-DE8056B3F254 at jhu.edu>you wrote:
>>> Hello,
>>>
>>> I am trying to rescore htk lattices using lattice-tool and am
>>> running into following issues:
>>>
>>> 1. I pass a 3gm language model and a vocabulary file to rescore the
>>> lattice (encoding bigram information) and
>>> then write back the updated and expanded lattice back in the htk
>>> format.
>>>
>>> However, when I specify -unk and -keep-unk flags, the OOV words get
>>> mapped to <unk> without preserving the
>>> original label. I was under the impression that -keep-unk would
>>> preserve the label of the OOV word, but it does not do so.
>>
>> I just looked at the code, and it seems that -keep-unk is only
>> implemented
>> when reading HTK format lattices, not for PFSGs.
>> Is that what you are using?
>>
>> If you are using HTK lattices then please prepare some small input data
>> files that demonstrate the problem, and I can look into it when I get
>> a chance.
>>
>
> Hi Andreas,
>
> I am, in fact, using HTK lattices. I was doing some debugging myself
> and noticed
> that when the rescoring LM is of the same order as that of the lattice
> (i.e. if the
> lattice expansion is not required), then -keep-unk works fine. When I
> use a higher
> order LM, it fails. I have uploaded the data at:
>
>
>
> Please run RescoreLattice.sh to process the HTK lattice file. I have
> kept the
> necessary vocabulary and trigram and bigram LM files too (Note: input
> lattices
> encodes bigram history and hence a trigram rescoring LM expands the
> lattice)
>
> The word 'slash' is out of vocabulary. A bigram rescoring keeps it intact
> while trigram rescoring maps it to <unk>.
>
>
>>>
>>> 2. Before I rescore the lattice, I want to split some words (multiword
>>> units). The multiwords are connected by an
>>> underscore character. I hence use the flags, -split-multiwords -multi-
>>> char _
>>>
>>> All goes well, as long as I do not use -unk -keep-unk flag in
>>> conjunction with -split-multiwords . If I use -unk -keep-unk flag
>>> (for point 1 above) and also use -split-multiwords flags, then the
>>> multiword functionality does not work; moreover, the OOV
>>> words get mapped to <unk>.
>>>
>>> I should point out that the multi-word unit is NOT in my vocabulary
>>> but after the split, all the individual words are found
>>> in the vocabulary. Hence, I am suspecting that the functionality for
>>> the flag -unk takes place before the splitting
>>> and since no multiword unit is in the vocabulary, the -split-
>>> multiwords functionality does not have
>>> anything to split.
>>>
>>> I was wondering if there is any way we can invoke the split-multiword
>>> functionality before mapping
>>> unk words?
>>
>> The way it works is that upon reading the lattices (before any operation
>> on them), word labels are converted to integers. Normally a new word
>> generates a new integer automatically, but with -unk and -keep-unk
>> unknown words are mapped to the <unk> integer code.
>>
>> Therefore, the splitting won't work if the multiwords themselves
>> are not in the vocabulary.
>>
>> A workaround is to do the multiword splitting in a separate processing
>> pass, where lattice-tool is invoked WITHOUT -unk.
>>
>> Andreas
>
> Yes, that makes sense. Thank you.
>
> -Anoop
From stolcke at speech.sri.com Fri Jan 28 22:38:24 2011
From: stolcke at speech.sri.com (Andreas Stolcke)
Date: Fri, 28 Jan 2011 22:38:24 -0800
Subject: [SRILM User List] dynamic Loglinear mix for lattice rescoring
In-Reply-To: Your message of Thu, 27 Jan 2011 22:59:59 +0100.
Message-ID: <201101290638.p0T6cOM06163@huge>
>
> Hi all,
>
> Is there a way to rescore htk lattices using dynamic log-linear
> interpolation of more than one language models, using SRILM.
>
> Ideally, the command should look like
>
> lattice-tool -read-htk -in-lattice $SRC_LATTICE -lm $LM_FILE -order
> $LM_ORDER -bayes 0 -lambda $LAMBDA -mix-lm $LM2_FILE -loglinear-mix
> -write-htk -out-lattice $TMP_TRG_LATTIC -unk -map-unk $UNK_WORD
> -keep-unk
>
> but there is no loglinear-mix option in lattice-tool, like in ngram.
>
> Thanks in advance,
The patch below will add the -loglinear-mix option to lattice-tool.
Andreas
Index: lattice/src/lattice-tool.cc
===================================================================
RCS file: /home/srilm/CVS/srilm/lattice/src/lattice-tool.cc,v
retrieving revision 1.154
retrieving revision 1.155
diff -c -r1.154 -r1.155
*** lattice/src/lattice-tool.cc 14 Jan 2011 01:07:54 -0000 1.154
--- lattice/src/lattice-tool.cc 29 Jan 2011 05:56:35 -0000 1.155
***************
*** 5,11 ****
#ifndef lint
static char Copyright[] = "Copyright (c) 1997-2011 SRI International. All Rights Reserved.";
! static char RcsId[] = "@(#)$Id: lattice-tool.cc,v 1.154 2011/01/14 01:07:54 stolcke Exp $";
#endif
#ifdef PRE_ISO_CXX
--- 5,11 ----
#ifndef lint
static char Copyright[] = "Copyright (c) 1997-2011 SRI International. All Rights Reserved.";
! static char RcsId[] = "@(#)$Id: lattice-tool.cc,v 1.155 2011/01/29 05:56:35 stolcke Exp $";
#endif
#ifdef PRE_ISO_CXX
***************
*** 43,48 ****
--- 43,49 ----
#include "SimpleClassNgram.h"
#include "ProductNgram.h"
#include "BayesMix.h"
+ #include "LoglinearMix.h"
#include "RefList.h"
#include "LatticeLM.h"
#include "WordMesh.h"
***************
*** 138,143 ****
--- 139,145 ----
static double mixLambda7 = 0.0;
static double mixLambda8 = 0.0;
static double mixLambda9 = 0.0;
+ static int loglinearMix = 0;
static char *inLattice = 0;
static char *inLattice2 = 0;
static char *inLatticeList = 0;
***************
*** 231,236 ****
--- 233,239 ----
{ OPT_FLOAT, "mix-lambda8", &mixLambda8, "mixture weight for -mix-lm8" },
{ OPT_STRING, "mix-lm9", &mixFile9, "ninth LM to mix in" },
{ OPT_FLOAT, "mix-lambda9", &mixLambda9, "mixture weight for -mix-lm9" },
+ { OPT_TRUE, "loglinear-mix", &loglinearMix, "use log-linear mixture LM" },
{ OPT_INT, "order", &order, "ngram order used for expansion or bigram weight substitution" },
{ OPT_TRUE, "no-expansion", &noExpansion, "do not apply expansion with LM" },
{ OPT_STRING, "ref-list", &refList, "reference file used for computing WER (lines starting with utterance id)" },
***************
*** 1090,1095 ****
--- 1093,1144 ----
}
}
+ LM *
+ makeLoglinearMixLM(Array<const char *> filenames, Vocab &vocab,
+ SubVocab *classVocab, unsigned order,
+ LM *oldLM, Array<double> lambdas)
+ {
+ Array<LM *> allLMs;
+ allLMs[0] = oldLM;
+
+ for (unsigned i = 1; i < filenames.size(); i++) {
+ const char *filename = filenames[i];
+ File file(filename, "r");
+
+ /*
+ * create factored LM if -factored was specified,
+ * class-ngram if -classes were specified,
+ * and otherwise a regular ngram
+ */
+ Ngram *lm = factored ?
+ new ProductNgram((ProductVocab &)vocab, order) :
+ (classVocab != 0) ?
+ (simpleClasses ?
+ new SimpleClassNgram(vocab, *classVocab, order) :
+ new ClassNgram(vocab, *classVocab, order)) :
+ new Ngram(vocab, order);
+ assert(lm != 0);
+
+ if (!lm->read(file, limitVocab)) {
+ cerr << "format error in mix-lm file " << filename << endl;
+ exit(1);
+ }
+
+ /*
+ * Each class LM needs to read the class definitions
+ */
+ if (classesFile != 0) {
+ File file(classesFile, "r");
+ ((ClassNgram *)lm)->readClasses(file);
+ }
+ allLMs[i] = lm;
+ }
+
+ LM *newLM = new LoglinearMix(vocab, allLMs, lambdas);
+ assert(newLM != 0);
+
+ return newLM;
+ }
int
main (int argc, char *argv[])
{
***************
*** 1310,1316 ****
useLM = ngram;
}
! if (mixFile) {
/*
* create a Bayes mixture LM
*/
--- 1359,1365 ----
useLM = ngram;
}
! if (mixFile && !loglinearMix) {
/*
* create a Bayes mixture LM
*/
***************
*** 1370,1375 ****
--- 1419,1476 ----
useLM = makeMixLM(mixFile9, *vocab, classVocab, order, useLM,
mixLambda9, 1.0);
}
+ } else if (mixFile && loglinearMix) {
+ /*
+ * Create log-linear mixture LM
+ */
+ double mixLambda1 = 1.0 - mixLambda - mixLambda2 - mixLambda3
+ - mixLambda4 - mixLambda5 - mixLambda6 - mixLambda7
+ - mixLambda8 - mixLambda9;
+
+ Array<const char *> filenames;
+ Array<double> lambdas;
+
+ /* Add redundant filename entry for base LM to make filenames array
+ * symmetric with lambdas */
+ filenames[0] = "";
+ filenames[1] = mixFile;
+ lambdas[0] = mixLambda;
+ lambdas[1] = mixLambda1;
+
+ if (mixFile2) {
+ filenames[2] = mixFile2;
+ lambdas[2] = mixLambda2;
+ }
+ if (mixFile3) {
+ filenames[3] = mixFile3;
+ lambdas[3] = mixLambda3;
+ }
+ if (mixFile4) {
+ filenames[4] = mixFile4;
+ lambdas[4] = mixLambda4;
+ }
+ if (mixFile5) {
+ filenames[5] = mixFile5;
+ lambdas[5] = mixLambda5;
+ }
+ if (mixFile6) {
+ filenames[6] = mixFile6;
+ lambdas[6] = mixLambda6;
+ }
+ if (mixFile7) {
+ filenames[7] = mixFile7;
+ lambdas[7] = mixLambda7;
+ }
+ if (mixFile8) {
+ filenames[8] = mixFile8;
+ lambdas[8] = mixLambda8;
+ }
+ if (mixFile9) {
+ filenames[9] = mixFile9;
+ lambdas[9] = mixLambda9;
+ }
+ useLM = makeLoglinearMixLM(filenames, *vocab, classVocab, order,
+ useLM, lambdas);
}
/*
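As an aside (a toy illustration, not part of the patch or of SRILM itself): a log-linear mixture raises each component probability to its weight, multiplies, and renormalizes over the vocabulary, unlike the linear interpolation done by -bayes mixing. The distributions and weights below are made up:

```shell
awk 'BEGIN {
  # Two toy word distributions over a 3-word vocabulary
  p1["a"] = 0.5; p1["b"] = 0.3; p1["c"] = 0.2
  p2["a"] = 0.2; p2["b"] = 0.2; p2["c"] = 0.6
  l1 = 0.7; l2 = 0.3                  # log-linear weights
  # Unnormalized product p1^l1 * p2^l2, then normalize by Z
  Z = 0
  for (w in p1) { q[w] = p1[w]^l1 * p2[w]^l2; Z += q[w] }
  for (w in p1) printf "%s %.4f\n", w, q[w] / Z
}' | sort
```

The three normalized probabilities sum to 1; unlike linear interpolation, the renormalization must run over the whole vocabulary, which is part of what makes log-linear mixing more expensive.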
From mshamsuddeen2 at gmail.com Mon Jan 31 17:57:59 2011
From: mshamsuddeen2 at gmail.com (Muhammad Shamsuddeen Muhammad)
Date: Tue, 1 Feb 2011 09:57:59 +0800
Subject: [SRILM User List] SRILM Missing FIles
Message-ID:
I compiled SRILM but faced an error while trying to build a language model
using a tutorial. The error was that the file "ngram-count" was missing
while running this command:
$ $SRILM_HOME/bin/i686/ngram-count -order 3 -interpolate -kndiscount -unk \
  -text lm/corpus.lowercased.en -lm lm/corpus.lm
I tried compiling from scratch all over again and it is still missing.
According to another tutorial though, if the following files
liboolm.a
libdstruct.a
libflm.a
liblattice.a
libmisc.a
are created then the installation was successful, and they are all present
in my installation.
So what may be the issue here? Since "ngram-count" is missing from the
$SRILM_HOME/bin/i686/ directory, there could be other missing files. Could
someone possibly send me a list of all the files present in that directory
of a working installation, so that I can compare?
Best Regards
--
Muhammad Shamsuddeen Muhammad
"There is no knowledge that is not power".
From stolcke at speech.sri.com Tue Feb 1 11:48:36 2011
From: stolcke at speech.sri.com (Andreas Stolcke)
Date: Tue, 01 Feb 2011 11:48:36 -0800
Subject: [SRILM User List] SRILM Missing FIles
In-Reply-To:
References:
Message-ID: <4D486394.5000102@speech.sri.com>
Muhammad Shamsuddeen Muhammad wrote:
> I compiled srilm but faced an error while trying to build a language
> model using a tutorial. The error was that i had the file
> "ngram-count" missing, while running this command >>
>
> $ $SRILM_HOME/bin/i686/ngram-count -order 3 -interpolate -kndiscount
> -unk
> -text lm/corpus.lowercased.en -lm lm/corpus.lm
>
> I tried compiling from scratch all over again and it still is missing.
> According to another tutorial thou, if the following files >>
>
> liboolm.a
> libdstruct.a
> libflm.a
> liblattice.a
> libmisc.a
>
> are created then the installation was successful, and they are all
> present in my installation.
> So what may be the issue here, and since "ngram-count" is missing from
> the $SRILM_HOME/bin/i686/ directory there could be other missing
> files. Can someone possibly send me a list of all the files present in
> that directory of a working installation so that i could compare also.
If there are no binaries generated in $SRILM/bin/$MACHINE_TYPE (with
ngram-count being one of them) then you need to follow the checklist
under frequently asked question A1) at
http://www.speech.sri.com/projects/srilm/manpages/srilm-faq.7.html
Andreas
>
> Best Regards
>
> --
> Muhammad Shamsuddeen Muhammad
>
> "There is no knowledge that is not power".
>
From mshamsuddeen2 at gmail.com Tue Feb 1 20:04:36 2011
From: mshamsuddeen2 at gmail.com (Muhammad Shamsuddeen Muhammad)
Date: Wed, 2 Feb 2011 12:04:36 +0800
Subject: [SRILM User List] make release not working
Message-ID:
Upon trying to 'compile moses support scripts' after editing the Makefile to
point to the relevant directories, when I enter the "make release" command, I get
a response saying 'make: release is up to date' and the time-stamped folder
is not created when I look into the directory. Here is the output of the
command...
integ at integ-desktop:~/mosesdecoder/trunk/scripts$ make release
make: `release' is up to date.
Any suggestions as to what I'm doing wrong?
Regards
--
Muhammad Shamsuddeen Muhammad
"There is No Knowledge That is Not Power".
From marco.turchi at gmail.com Sun Feb 6 15:32:20 2011
From: marco.turchi at gmail.com (marco turchi)
Date: Mon, 7 Feb 2011 00:32:20 +0100
Subject: [SRILM User List] Query srilm from Java
Message-ID:
Dear All,
I need to query srilm from java, do you know any free available wrappers?
Best Regards
Marco
From zeeshankhans at gmail.com Mon Feb 7 15:24:39 2011
From: zeeshankhans at gmail.com (zeeshan khan)
Date: Tue, 8 Feb 2011 00:24:39 +0100
Subject: [SRILM User List] effect of ngram -vocab and -limit-vocab on ppl
calculations
Message-ID:
Hi all,
I wanted to share my observation regarding the SRILM toolkit's calculation
of perplexities and the effect of -vocab and -limit-vocab on it, and wanted
to know why this happens.
SRILM toolkit's ngram tool gives 3 different perplexities for the SAME text
if these options are used as follows.
P1: ngram -unk -map-unk '[UNKNOWN]' -order 4 -lm <LM> -ppl <TEXT>
: gives the highest perplexity value.
P2: ngram -unk -map-unk '[UNKNOWN]' -vocab <VOCAB> -order 4 -lm <LM>
-ppl <TEXT> : gives a perplexity value less than P1 and
greater than P3.
P3: ngram -unk -map-unk '[UNKNOWN]' -vocab <VOCAB> -limit-vocab -order
4 -lm <LM> -ppl <TEXT> : gives a perplexity value smaller than both
P1 and P2.
Can anyone tell me why this happens? I thought the effect of -vocab and
-limit-vocab options is only on memory usage.
Just for information, the VOCAB files are generated from lattice files
generated during a recognition process.
Thanks and Regards,
Zeeshan.
From stolcke at speech.sri.com Mon Feb 7 23:26:02 2011
From: stolcke at speech.sri.com (Andreas Stolcke)
Date: Mon, 07 Feb 2011 23:26:02 -0800
Subject: [SRILM User List] effect of ngram -vocab and -limit-vocab on
ppl calculations
In-Reply-To:
References:
Message-ID: <4D50F00A.7030807@speech.sri.com>
zeeshan khan wrote:
> Hi all,
>
> I wanted to share my observation regarding the SRILM toolkit's
> calculation of perplexities and the effect of -vocab and -limit-vocab
> on it, and wanted to know why this happens.
>
>
> SRILM toolkit's ngram tool gives 3 different perplexities of the SAME
> text if these options are used as follows.
>
> P1: ngram -unk -map-unk '[UNKNOWN]' -order 4 -lm -ppl
> : gives the highest perplexity value
>
> P2: ngram -unk -map-unk '[UNKNOWN]' -vocab -order 4 -lm
> -ppl : gives perplexity value lesser than P1 and
> greater than P3.
That's probably because your <VOCAB> contains more words than the
LM itself. That means fewer words are mapped to '[UNKNOWN]' and this
changes which probabilities are looked up in the LM. If however your
<VOCAB> contains a subset of the vocabulary in the LM itself then
there should be no change in perplexity.
>
> P3: ngram -unk -map-unk '[UNKNOWN]' -vocab -limit-vocab
> -order 4 -lm -ppl : gives perplexity value
> smaller than both P1 and P2.
This has the effect that only ngrams covered by the words in <VOCAB>
are read from the LM.
Presumably more words are now mapped to [UNKNOWN], but it's hard to
predict what happens to perplexity because you don't say what the
relationship is between the vocabulary and the data in <TEXT>.
The purpose of -limit-vocab is to read all and only the portions of the LM
that are needed by the input data. Therefore, to make meaningful use of
this option you need to generate the vocabulary from the <TEXT> in
this case.
>
> Can anyone tell me why this happens ? I thought the effect of -vocab
> and -limit-vocab options is only on memory usage.
A good way to track down the differences is to use -debug 2, capture the
output in files, and use diff to see where they differ.
Andreas
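A small aside on reading that -debug 2 output: ngram reports per-word log10 probabilities, and perplexity is then 10^(-sum/N), ignoring details such as how end-of-sentence tokens are counted. A toy awk sketch with three made-up log probabilities:

```shell
# ppl = 10^( -(total log10 prob) / (number of tokens) )
printf -- '-1.0\n-2.0\n-3.0\n' \
  | awk '{ s += $1; n++ } END { printf "ppl %.1f\n", 10^(-s / n) }'
# prints: ppl 100.0
```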
>
>
> Just for information, the VOCAB files are generated from lattice files
> generated during a recognition process.
>
>
> Thanks and Regards,
>
>
> Zeeshan.
From mehdi_hoseini at comp.iust.ac.ir Tue Feb 8 04:03:20 2011
From: mehdi_hoseini at comp.iust.ac.ir (Mehdi hoseini)
Date: Tue, 08 Feb 2011 15:33:20 +0330
Subject: [SRILM User List] Variable N-grams
Message-ID:
hi all,
I read a paper titled "Variable N-grams and Extensions for Conversational
Speech Language Modeling". Is there any option in SRILM that would help
me build a variable N-gram language model?
thanks.
M. Hoseini
From fabian_in_hongkong at hotmail.com Wed Feb 9 03:05:11 2011
From: fabian_in_hongkong at hotmail.com (Fabian -)
Date: Wed, 9 Feb 2011 12:05:11 +0100
Subject: [SRILM User List] Expand class-based LM PPL
Message-ID:
Hi,
I have a language model interpolated from a class LM and a word LM. If I compute the PPL on my dev set with ngram -classes ... it gives a reasonable PPL; if I expand the interpolated LM and compute the PPL (without the -classes parameter) I get a very high PPL. Can anyone tell me why?
Best,
Fabian
From stolcke at speech.sri.com Wed Feb 9 09:13:27 2011
From: stolcke at speech.sri.com (Andreas Stolcke)
Date: Wed, 09 Feb 2011 09:13:27 -0800
Subject: [SRILM User List] Expand class-based LM PPL
In-Reply-To:
References:
Message-ID: <4D52CB37.2010401@speech.sri.com>
Fabian - wrote:
> Hi,
>
> I have a language model interpolated from a class LM and a word LM. If
> I compute the PPL on my dev set with ngram -classes .. it gives a
> reasonable PPL, if I expand the interpolated LM and compute the PPL
> (without the -classes parameter) I get a very high PPL. Can anyone
> tell me why?
Have you tried expanding the class LM before interpolation, and
verifying that it has a reasonable PPL ?
Andreas
>
> Best,
> Fabian
From lfu20 at hotmail.com Sun Feb 13 13:05:51 2011
From: lfu20 at hotmail.com (Luis Uebel)
Date: Sun, 13 Feb 2011 21:05:51 +0000
Subject: [SRILM User List] Compacting language models
In-Reply-To: <4D52CB37.2010401@speech.sri.com>
References: ,
<4D52CB37.2010401@speech.sri.com>
Message-ID:
I am using SRILM to produce some reverse language models and they are quite big.
Stats: training data: 1.1G words
88M sentences
but system was limited to 39k words (wordlist.txt) by:
ngram-count -memuse -order 3 -interpolate -kndiscount -unk -vocab ../lang-data/wordlist.txt -limit-vocab -text ../lang-data/${training}-${reverse}.xml -lm ${training}-reverse-lm${trigram}
Are there other options to reduce the LM size, since the trigram model is 1.7G (without losing too much performance)?
Thanks,
Luis
From ammansik at cis.hut.fi Mon Feb 14 03:36:53 2011
From: ammansik at cis.hut.fi (Andre Mansikkaniemi)
Date: Mon, 14 Feb 2011 13:36:53 +0200
Subject: [SRILM User List] Python wrapper
Message-ID: <4D5913D5.4010404@cis.hut.fi>
Hi!
I'm trying to compile a Python wrapper for SRILM as described in [1]. I
run into problems when compiling the Python shared library module:
g++ -shared srilm.o srilm_wrap.o -loolm -ldstruct -lmisc
-L/home/ammansik/srilm-1.5.12/lib/i686-m64 -o _srilm.so
I get the following error messages:
/usr/bin/ld: /home/andre/srilm-1.5.12/lib/i686-m64/liboolm.a(Prob.o):
relocation R_X86_64_32 against `a local symbol' can not be used when
making a shared object; recompile with -fPIC
/home/andre/srilm-1.5.12/lib/i686-m64/liboolm.a: could not read symbols:
Bad value
collect2: ld returned 1 exit status
The system I'm using is x86_64 GNU/Linux.
Any ideas how to solve this?
André
[1] Nitin Madnani. Source Code: Querying and Serving N -gram Language
Models with Python
From stolcke at icsi.berkeley.edu Tue Feb 15 15:35:50 2011
From: stolcke at icsi.berkeley.edu (Andreas Stolcke)
Date: Tue, 15 Feb 2011 15:35:50 -0800
Subject: [SRILM User List] Python wrapper
In-Reply-To: <4D5913D5.4010404@cis.hut.fi>
References: <4D5913D5.4010404@cis.hut.fi>
Message-ID: <4D5B0DD6.60005@icsi.berkeley.edu>
Andre Mansikkaniemi wrote:
> Hi!
> I'm trying to compile a Python wrapper for SRILM as described in [1]. I
> run into problems when compiling the Python shared library module:
> g++ -shared srilm.o srilm_wrap.o -loolm -ldstruct -lmisc
> -L/home/ammansik/srilm-1.5.12/lib/i686-m64 -o _srilm.so
> I get the following error messages:
> /usr/bin/ld: /home/andre/srilm-1.5.12/lib/i686-m64/liboolm.a(Prob.o):
> relocation R_X86_64_32 against `a local symbol' can not be used when
> making a shared object; recompile with -fPIC
> /home/andre/srilm-1.5.12/lib/i686-m64/liboolm.a: could not read symbols:
> Bad value
> collect2: ld returned 1 exit status
> The system I'm using is x86_64 GNU/Linux.
> Any ideas how to solve this?
>
PIC compilation should be the default in recent releases.
Run "make cleanest", then rebuild and make sure -fPIC appears in all
compiler commands. Make sure the PIC_FLAG variable is not modified in
either common/Makefile.machine.i686-m64 or Makefile.site.i686-m64 .
Andreas
> André
>
> [1] Nitin Madnani. Source Code: Querying and Serving N -gram Language
> Models with Python
>
From ammansik at cis.hut.fi Wed Feb 16 00:13:15 2011
From: ammansik at cis.hut.fi (Andre Mansikkaniemi)
Date: Wed, 16 Feb 2011 10:13:15 +0200
Subject: [SRILM User List] Python wrapper
In-Reply-To: <4D5B0DD6.60005@icsi.berkeley.edu>
References: <4D5913D5.4010404@cis.hut.fi> <4D5B0DD6.60005@icsi.berkeley.edu>
Message-ID: <4D5B871B.4020002@cis.hut.fi>
Andreas Stolcke wrote:
> Andre Mansikkaniemi wrote:
>> Hi!
>> I'm trying to compile a Python wrapper for SRILM as described in [1]. I
>> run into problems when compiling the Python shared library module:
>> g++ -shared srilm.o srilm_wrap.o -loolm -ldstruct -lmisc
>> -L/home/ammansik/srilm-1.5.12/lib/i686-m64 -o _srilm.so
>> I get the following error messages:
>> /usr/bin/ld: /home/andre/srilm-1.5.12/lib/i686-m64/liboolm.a(Prob.o):
>> relocation R_X86_64_32 against `a local symbol' can not be used when
>> making a shared object; recompile with -fPIC
>> /home/andre/srilm-1.5.12/lib/i686-m64/liboolm.a: could not read symbols:
>> Bad value
>> collect2: ld returned 1 exit status
>> The system I'm using is x86_64 GNU/Linux.
>> Any ideas how to solve this?
>>
> PIC compilation should be the default in recent releases.
> Run "make cleanest" , then rebuild and make sure -fPIC appears inn all
> compiler commands. Make sure the PIC_FLAG variable is not modified in
> any in common/Makefile.machine.i686-m64 or Makefile.site.i686-m64 .
>
> Andreas
Hi,
I got it working now.
Many thanks!
André
>
>> André
>>
>> [1] Nitin Madnani. Source Code: Querying and Serving N -gram Language
>> Models with Python
>>
From mehdi_hoseini at comp.iust.ac.ir Wed Feb 16 03:04:52 2011
From: mehdi_hoseini at comp.iust.ac.ir (Mehdi hoseini)
Date: Wed, 16 Feb 2011 14:34:52 +0330
Subject: [SRILM User List] I have problem in building new version
Message-ID:
hi
I built the new version successfully, except for ngram and lattice-tool. I
don't know how to deal with this. If I use ngram.exe and lattice-tool.exe from
the previous version (1.5.11), do I lose very much?
thanks
From stolcke at icsi.berkeley.edu Wed Feb 23 06:28:56 2011
From: stolcke at icsi.berkeley.edu (Andreas Stolcke)
Date: Wed, 23 Feb 2011 06:28:56 -0800
Subject: [SRILM User List] Compacting language models
In-Reply-To:
References: ,
<4D52CB37.2010401@speech.sri.com>
Message-ID: <4D6519A8.4090901@icsi.berkeley.edu>
Luis Uebel wrote:
> I am using SRI to produce some reverse language models and are quite big.
> Stats: training data: 1.1G words
> 88M sentences
>
> but system was limited to 39k words (wordlist.txt) by:
> ngram-count -memuse -order 3 -interpolate -kndiscount -unk -vocab
> ../lang-data/wordlist.txt -limit-vocab -text
> ../lang-data/${training}-${reverse}.xml -lm
> ${training}-reverse-lm${trigram}
>
>
> Is there other options to reduce LM size since trigrams are 1.7G?
> (without so much lost in performance)?
Luis,
if the issue is that training takes too much memory, please see the FAQ
on memory issues.
If you already have a (large) LM and want to reduce its size for test
purposes, use the ngram -prune option. You want to read the following
papers to understand how LM pruning works:
A. Stolcke,'' Entropy-based Pruning of Backoff Language
Models,'' Proc. DARPA Broadcast News Transcription
and Understanding Workshop, pp. 270-274, Lansdowne, VA, 1998.
C. Chelba, T. Brants, W. Neveitt, and P. Xu, ''Study on
Interaction Between Entropy Pruning and Kneser-Ney
Smoothing,'' Proc. Interspeech, pp. 2422-2425, Makuhari, Japan, 2010.
Andreas
>
> Thanks,
>
>
> Luis
>
From mehdi_hoseini at comp.iust.ac.ir Mon Mar 7 12:03:23 2011
From: mehdi_hoseini at comp.iust.ac.ir (Mehdi hoseini)
Date: Mon, 07 Mar 2011 23:33:23 +0330
Subject: [SRILM User List] Error Compiling SRILM
Message-ID:
hi all
I use Cygwin to compile SRILM, but I ran into the errors shown in the attached picture.
regards
Mehdi
From stolcke at icsi.berkeley.edu Tue Mar 22 10:37:46 2011
From: stolcke at icsi.berkeley.edu (Andreas Stolcke)
Date: Tue, 22 Mar 2011 10:37:46 -0700
Subject: [SRILM User List] threshold on maximal counts for LM estimation
In-Reply-To:
References:
Message-ID: <4D88DE6A.1030402@icsi.berkeley.edu>
zeeshan khan wrote:
> Thanks a lot Andreas for your answers!
> I have another question.
>
> Using the ngram-count tool, is there a way to generate a count file
> which contains only counts lower than a certain limit?
> For example, if I want to generate a count file which contains only
> those N-grams which occurred fewer than 50 times in a corpus, how can I
> do it with ngram-count? Maybe it is very simple to do, but I
> couldn't find it. Currently, I do it manually, but it is cumbersome and
> time-consuming.
>
> There is a way to set the maximal count of N-grams of an order n
> that are discounted under Good-Turing, but I couldn't find a way to set
> a maximal count limit for all N-grams to be considered at all.
There is no way to do it using existing functions in ngram-count.
Even if there were a way to do it with a built-in function you would not
really gain any efficiency, because to know whether something occurs more than
N times you need to keep track of all counts to begin with. So you're
not going to be able to do much better than
ngram-count -text .... -write - | gawk '$NF < 50' | gzip >
counts-less-than-50.gz
Andreas
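A runnable toy version of that pipeline's filtering step (the file contents here are made up; in ngram-count's -write output the count is the last whitespace-separated field on each line):

```shell
# Toy counts file: each line is an N-gram followed by its count
printf 'the quick fox\t60\nthe slow fox\t3\na b c\t49\n' > counts.txt
# Keep only N-grams occurring fewer than 50 times ($NF is the last field)
awk '$NF < 50' counts.txt
```

This keeps the lines with counts 3 and 49 and drops the one with count 60.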
>
> Best Regards,
> Zeeshan.
>
>
>
>
>
>
> On Tue, Feb 8, 2011 at 8:26 AM, Andreas Stolcke
> > wrote:
>
> zeeshan khan wrote:
>
> Hi all,
> I wanted to share my observation regarding the SRILM toolkit's
> calculation of perplexities and the effect of -vocab and
> -limit-vocab on it, and wanted to know why this happens.
>
>
> SRILM toolkit's ngram tool gives 3 different perplexities of
> the SAME text if these options are used as follows.
> P1: ngram -unk -map-unk '[UNKNOWN]' -order 4 -lm
> -ppl : gives the highest perplexity value
>
> P2: ngram -unk -map-unk '[UNKNOWN]' -vocab -order
> 4 -lm -ppl : gives perplexity value
> lesser than P1 and greater than P3.
>
> That's probably because your contains more words than
> the LM itself. That means fewer words are mapped to '[UNKNOWN]'
> and this changes which probabilities are looked up in the LM. If
> however your contains a subset of the vocabulary in
> the LM itself then there should be no change in perplexity.
>
>
> P3: ngram -unk -map-unk '[UNKNOWN]' -vocab
> -limit-vocab -order 4 -lm -ppl : gives
> perplexity value smaller than both P1 and P2.
>
> This has the effect that only N-grams covered by the words in
> the vocabulary file are read from the LM.
> Presumably more words are now mapped to [UNKNOWN], but it's hard
> to predict what happens to perplexity because you don't say what
> the relationship between the vocabulary and the data in the test
> file is.
> The purpose of -limit-vocab is to read all and only the portions of
> the LM that are needed by the input data. Therefore, to make
> meaningful use of this option you need to generate the vocabulary
> from the test data in this case.
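To make that concrete, a sketch of the recommended workflow (file names
here are hypothetical; -write-vocab makes ngram-count emit the word list
observed in the text):

```shell
# Derive the vocabulary from the test data itself ...
ngram-count -text test.txt -write-vocab test.vocab

# ... then load only the portions of the LM covered by that vocabulary.
ngram -order 4 -unk -map-unk '[UNKNOWN]' \
      -vocab test.vocab -limit-vocab \
      -lm big.lm.gz -ppl test.txt
```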
>
>
> Can anyone tell me why this happens? I thought the -vocab and
> -limit-vocab options only affected memory usage.
>
> A good way to track down the differences is to use -debug 2,
> capture the output in files, and use diff to see where they differ.
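For example (hypothetical file names; -debug 2 prints the probability of
every word, so the diff pinpoints exactly where the two runs diverge):

```shell
ngram -order 4 -lm big.lm.gz -ppl test.txt -debug 2 > ppl.plain.log
ngram -order 4 -vocab vocab.txt -lm big.lm.gz -ppl test.txt -debug 2 > ppl.vocab.log
diff ppl.plain.log ppl.vocab.log | head
```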
>
> Andreas
>
>
>
> Just for information: the VOCAB files are generated from lattices
> produced during a recognition run.
>
>
> Thanks and Regards,
>
>
> Zeeshan.
> ------------------------------------------------------------------------
>
> _______________________________________________
> SRILM-User site list
> SRILM-User at speech.sri.com
> http://www.speech.sri.com/mailman/listinfo/srilm-user
>
>
>
>
From andersson at disi.unitn.it Thu Mar 31 06:01:14 2011
From: andersson at disi.unitn.it (Simon Andersson)
Date: Thu, 31 Mar 2011 15:01:14 +0200 (CEST)
Subject: [SRILM User List] fngram-count test doesn't terminate
Message-ID: <53728.127.0.0.1.1301576474.squirrel@mail.disi.unitn.it>
Hello,
I just installed SRILM 1.5.12 and ran
make test
The test results were O.K. for all modules except flm (I removed flm from
the test and reran it). The fngram-count test doesn't terminate (I
terminated it after an hour). Has somebody experienced this?
Thanks,
- Simon Andersson
University of Trento, Italy
From stolcke at icsi.berkeley.edu Thu Mar 31 18:04:35 2011
From: stolcke at icsi.berkeley.edu (Andreas Stolcke)
Date: Thu, 31 Mar 2011 18:04:35 -0700
Subject: [SRILM User List] fngram-count test doesn't terminate
In-Reply-To: <53728.127.0.0.1.1301576474.squirrel@mail.disi.unitn.it>
References: <53728.127.0.0.1.1301576474.squirrel@mail.disi.unitn.it>
Message-ID: <4D9524A3.6090401@icsi.berkeley.edu>
Simon Andersson wrote:
> Hello,
>
> I just installed SRILM 1.5.12 and ran
>
> make test
>
> The test results were O.K. for all modules except flm (I removed flm from
> the test and reran it). The fngram-count test doesn't terminate (I
> terminated it after an hour). Has somebody experienced this?
>
Two ideas:
- try a different compiler (e.g., a different version of gcc, or update
the version installed by default on your system)
- disable optimization
I have heard about issues specifically with fngram-count.
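For the second idea, something along these lines might work (a sketch
only: OPTIMIZE_FLAGS is the variable used in some common/Makefile.machine.*
files, but check the one for your platform, as the name may differ):

```shell
cd $SRILM
make cleanest                       # remove previously built objects
make OPTIMIZE_FLAGS="-g -O0" World  # rebuild without optimization
make test                           # rerun the test suite
```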
Andreas
> Thanks,
>
> - Simon Andersson
> University of Trento, Italy
> _______________________________________________
> SRILM-User site list
> SRILM-User at speech.sri.com
> http://www.speech.sri.com/mailman/listinfo/srilm-user
>