Another lattice rescoring problem

From: Teemu Hirsimaki <teemu.hirsimaki at ADDRESS HIDDEN>
Date: Tue, 18 Oct 2005 15:02:36 +0300

I ran into another problem with the lattice rescoring.  I have two
simple HTK lattices (acoustic log-probabilities in parentheses):

test0.htk:

   a(-1) --+--> c(-2) ------+--> b(-3)
           |                |
           +--> !NULL(-2) --+

test1.htk:

   a(-1) -----> !NULL(-2) -----> b(-3)

If I rescore the above lattices with a simple 2-gram language model
test.arpa (see the end of the mail for the example files), the
language model probability of the path "a b" is computed incorrectly
for the first lattice.  In the second case, the probability is
correct:

$ echo "a b" | lattice-tool -in-lattice test0.htk -read-htk \
   -lm test.arpa -ppl - -debug 2
...
         p( a | <s> )    = [10] 7.43548e-13 [ -12.1287 ]
         p( b | a ...)   = [16] 8.00959e-07 [ -6.09639 ]
         p( </s> | b ...)        = [9] 6.88685e-14 [ -13.162 ]
0 zeroprobs, logprob= -31.3871 ppl= 2.8997e+10 ppl1= 4.93776e+15

$ echo "a b" | lattice-tool -in-lattice test1.htk -read-htk \
   -lm test.arpa -ppl - -debug 2
...
         p( a | <s> )    = [9] 2.57573e-17 [ -16.5891 ]
         p( b | a ...)   = [13] 8.00959e-07 [ -6.09639 ]
         p( </s> | b ...)        = [8] 6.88685e-14 [ -13.162 ]
0 zeroprobs, logprob= -35.8475 ppl= 8.89522e+11 ppl1= 8.38947e+17

It seems that the backoff probability BO(a) is missing from the first
case.
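
As a quick sanity check of that observation, here is a small Python sketch that compares the figures quoted above. All numbers are copied from the lattice-tool output and from test.arpa; the variable names are only illustrative.

```python
# Values copied from the lattice-tool output and test.arpa in this message.
logprob_test0 = -31.3871     # total logprob reported for test0.htk
logprob_test1 = -35.8475     # total logprob reported for test1.htk
first_word_test0 = -12.1287  # p( a | <s> ) reported for test0.htk
first_word_test1 = -16.5891  # p( a | <s> ) reported for test1.htk
bo_a = -4.46041              # backoff weight of "a" in test.arpa

# How much lower the test1.htk figures are than the test0.htk figures.
total_gap = logprob_test1 - logprob_test0
first_word_gap = first_word_test1 - first_word_test0

print(f"gap in total logprob : {total_gap:.4f}")
print(f"gap in p( a | <s> )  : {first_word_gap:.4f}")
print(f"BO(a) from test.arpa : {bo_a:.4f}")

# The printed output is rounded, so compare with a small tolerance.
tolerance = 1e-3
assert abs(total_gap - bo_a) < tolerance
assert abs(first_word_gap - bo_a) < tolerance
```

Both gaps agree with BO(a) to the precision of the printed output, which is consistent with the backoff weight of "a" being absent from the test0.htk result.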

Next I tried to use the -no-nulls flag.  Then I get correct language
model probabilities for both lattices, but the acoustic probability is
incorrect, as the acoustic probability of the !NULL edge is discarded.
Should the general LM expansion handle !NULL edges correctly?
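
To make the acoustic side concrete, here is a Python sketch that enumerates the paths of test0.htk and sums the acoustic log-probabilities along each one. The "dropped" column simply skips the score on !NULL edges; that is an assumption modelling the behaviour described above, not something taken from the SRILM sources. The helper names parse_links and paths are hypothetical.

```python
from collections import defaultdict

# Link lines copied from test0.htk (see the example files below).
TEST0_LINKS = """\
J=0 S=0 E=1 W=a   a=-1
J=1 S=1 E=2 W=!NULL a=-2
J=2 S=1 E=2 W=c   a=-2
J=3 S=2 E=3 W=b   a=-3
"""


def parse_links(text):
    """Return {start_node: [(end_node, word, acoustic_logprob), ...]}."""
    links = defaultdict(list)
    for line in text.splitlines():
        fields = dict(field.split("=", 1) for field in line.split())
        if "J" not in fields:
            continue  # not a link line
        links[int(fields["S"])].append(
            (int(fields["E"]), fields["W"], float(fields["a"])))
    return links


def paths(links, node, end):
    """Yield every path from node to end as a list of (word, score)."""
    if node == end:
        yield []
        return
    for next_node, word, score in links[node]:
        for rest in paths(links, next_node, end):
            yield [(word, score)] + rest


links = parse_links(TEST0_LINKS)
scores = {}
for path in paths(links, 0, 3):
    words = " ".join(w for w, _ in path if w != "!NULL")
    all_edges = sum(s for _, s in path)
    # Assumption: "discarded" means the !NULL edge contributes nothing.
    null_dropped = sum(s for w, s in path if w != "!NULL")
    scores[words] = (all_edges, null_dropped)
    print(f"{words!r}: all edges {all_edges}, !NULL dropped {null_dropped}")
```

Under that assumption the path "a b" goes from an acoustic total of -6 to -4 once the !NULL score is dropped, while "a c b" stays at -6.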

I also tried changing the !NULL words to a distinct word symbol and
specifying it with the -ignore-vocab flag to lattice-tool (tried
versions 1.4.5 and 1.4.6 beta).  Then the acoustic probabilities are
preserved nicely, but again the backoff probability BO(a) is missing
from the first rescored lattice.

Did I miss something again, or is the above expected behaviour?

-Teemu

Here are the example files:

test0.htk:

VERSION=1.1
base=10
dir=f
lmscale=1 wdpenalty=0
start=0 end=3
N=4 L=4
I=0
I=1
I=2
I=3
J=0 S=0 E=1 W=a   a=-1
J=1 S=1 E=2 W=!NULL a=-2
J=2 S=1 E=2 W=c   a=-2
J=3 S=2 E=3 W=b   a=-3

test1.htk:

VERSION=1.1
base=10
dir=f
lmscale=1 wdpenalty=0
start=0 end=3
N=4 L=3
I=0
I=1
I=2
I=3
J=0 S=0 E=1 W=a   a=-1
J=1 S=1 E=2 W=!NULL a=-2
J=2 S=2 E=3 W=b   a=-3

test.arpa:

\data\
ngram 1=5
ngram 2=5

\1-grams:
-99 <s> -7.34882
-2.10718 c -4.28966
-4.77987 a -4.46041
-5.81316 </s> -7.34882
-4.02326 b -2.07313

\2-grams:
-3.33947 c a
-1.08518 c </s>
-4.58511 c b
-0.000484286 a c
-1.67833 b c

\end\
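
For reference, here is a Python sketch of where BO(a) enters under the usual back-off rule for an ARPA model: if the bigram "a b" is not listed, log p(b | a) = BO(a) + log p(b). The functions parse_arpa and bigram_logprob are hypothetical helpers written for this message, not part of SRILM, and the sketch makes no claim about how lattice-tool distributes these terms over lattice transitions.

```python
# Contents of test.arpa, copied from above.
ARPA_TEXT = r"""
\data\
ngram 1=5
ngram 2=5

\1-grams:
-99 <s> -7.34882
-2.10718 c -4.28966
-4.77987 a -4.46041
-5.81316 </s> -7.34882
-4.02326 b -2.07313

\2-grams:
-3.33947 c a
-1.08518 c </s>
-4.58511 c b
-0.000484286 a c
-1.67833 b c

\end\
"""


def parse_arpa(text):
    """Return (logprob, backoff) dicts keyed by word tuples."""
    logprob, backoff = {}, {}
    order = 0
    for line in text.splitlines():
        line = line.strip()
        if line.endswith("-grams:"):
            order = int(line[1:].split("-")[0])  # "\2-grams:" -> 2
            continue
        if not line or line.startswith("\\") or order == 0:
            continue  # blank line, section marker, or header counts
        fields = line.split()
        ngram = tuple(fields[1:1 + order])
        logprob[ngram] = float(fields[0])
        if len(fields) > 1 + order:
            backoff[ngram] = float(fields[1 + order])
    return logprob, backoff


def bigram_logprob(logprob, backoff, history, word):
    """Back-off bigram log10 probability of word given history."""
    if (history, word) in logprob:
        return logprob[(history, word)]
    # Bigram not listed: back off to the unigram, weighted by BO(history).
    return backoff.get((history,), 0.0) + logprob[(word,)]


logprob, backoff = parse_arpa(ARPA_TEXT)
print(f"BO(a)          = {backoff[('a',)]}")
print(f"log p(b)       = {logprob[('b',)]}")
print(f"log p(b | a)   = {bigram_logprob(logprob, backoff, 'a', 'b'):.5f}")
```

Since "a b" is not among the 2-grams, the result is BO(a) + log p(b) = -4.46041 + -4.02326.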

©2006 SRI International, 333 Ravenswood Avenue, Menlo Park, CA 94025-3493
SRI International is an independent, nonprofit corporation. Privacy policy

Last modified Nov 21, 2008