Rescore HTK lattice

Sopheap SENG sopheap.seng at gmail.com
Mon Mar 3 04:54:15 PST 2008


Hello,

I need HTK lattices for my experiments, but the sphinx3 decoder I use cannot
generate them, so I have to convert the Sphinx lattices to HTK lattices.

My problem is that the lattice generated by the sphinx3 decoder provides only
the acoustic score of word transitions; I did not find an option to obtain
the lmscore in the Sphinx lattice.

In order to obtain an HTK lattice with lmscore, I first converted the Sphinx
lattice to the HTK SLF lattice format (I added l=0 as the lmscore; the
acoustic score is kept as it is).
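
For illustration, a link (arc) line of such a converted lattice could be
written as follows. This is only a minimal sketch: the helper name
slf_link_line and the example score are hypothetical, and it assumes the
usual SLF link fields J= (link id), S= and E= (start and end node),
a= (acoustic score) and l= (LM score).

```python
def slf_link_line(link_id, start_node, end_node, acoustic, lm=0.0):
    """Format one SLF link line; lm defaults to 0 as a placeholder LM score."""
    return (
        f"J={link_id} S={start_node} E={end_node} "
        f"a={acoustic:.2f} l={lm:.2f}"
    )


# Hypothetical arc from node 0 to node 1 with a made-up acoustic score.
print(slf_link_line(0, 0, 1, -12345.0))
```

The l=0 placeholder is what the later rescoring step is expected to replace.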

Then I used lattice-tool (SRILM v1.5.2) to rescore the lattice with an LM:

> lattice-tool -in-lattice in.slf -read-htk -lm LM.BO -htk-lmscale 9.5 \
    -htk-wdpenalty 0.7 -htk-logbase 1.0003 -out-lattice out.slf -write-htk

(The lmscale, wdpenalty and logbase are the values that I used during
lattice generation with sphinx3; the LM is the same as in sphinx3.)

The output is a lattice with acoustic scores and new lmscores. I observed
that the acoustic scores in the output lattice are recalculated using the
logbase.
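
As a rough illustration of what a recalculation with a log base amounts to,
the sketch below converts a score expressed in log base 1.0003 to a natural
logarithm. The function name to_natural_log and the example value are
hypothetical, and whether lattice-tool applies exactly this conversion is an
assumption, not something confirmed here.

```python
import math


def to_natural_log(score, logbase=1.0003):
    """Convert a log-domain score from the given base to natural log.

    If score = log_b(p), then ln(p) = score * ln(b).
    """
    return score * math.log(logbase)


# Hypothetical acoustic score in log base 1.0003.
print(to_natural_log(-10000))
```

Comparing a few arcs of in.slf and out.slf against such a conversion would
show whether the change in the acoustic scores is only a change of base.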

In order to verify that the output lattice in HTK format is equivalent to
the original Sphinx lattice, I generated 200-best lists from the two
lattices.

- for the Sphinx lattice, I used sphinx3_astar to generate the N-best list
- for the rescored HTK lattice, I used lattice-tool:

    > lattice-tool -in-lattice out.slf -read-htk -lm lm.BO -htk-lmscale 9.5 \
        -htk-wdpenalty 0.7 -htk-logbase 1.0003 -nbest-decode 200 \
        -out-nbest-dir OUT/

The problem is that the order of the hypotheses in the two N-best lists is
not the same. The 1-best given by sphinx3_astar can be found in the 200-best
list given by lattice-tool, but at a much lower rank, or sometimes not at
all. However, I always find the 1-best of sphinx3_astar in a larger N-best
list from lattice-tool (N=2000).
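
The rank comparison described above can be sketched as follows. This is a
minimal example under assumptions: the helper name rank_of is hypothetical,
and the N-best lists are assumed to be already loaded as plain word strings,
best hypothesis first (parsing the actual N-best files is not shown).

```python
def rank_of(hypothesis, nbest):
    """Return the 1-based rank of hypothesis in nbest, or None if absent.

    Hypotheses are compared word by word, so extra whitespace is ignored.
    """
    target = hypothesis.split()
    for rank, candidate in enumerate(nbest, start=1):
        if candidate.split() == target:
            return rank
    return None


# Made-up example lists, only to show the call.
sphinx_1best = "the cat sat"
lattice_tool_nbest = ["the cat sad", "a cat sat", "the cat sat"]
print(rank_of(sphinx_1best, lattice_tool_nbest))
```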

I am convinced that this is a problem of score normalization between Sphinx
and lattice-tool. If the scores were correctly normalized, I should get the
same N-best list on both sides.
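
One way to check the normalization would be to recompute the combined score
of a single path by hand on both sides. The sketch below assumes the common
log-linear combination (scaled acoustic score, plus scaled LM score, plus a
word penalty added once per word); the function name combined_score and the
example numbers are hypothetical, and how each tool actually interprets
lmscale and wdpenalty is an assumption that would need to be checked against
its documentation.

```python
def combined_score(acoustic, lm, n_words,
                   lmscale=9.5, wdpenalty=0.7, acscale=1.0):
    """Log-linear path score: acscale*a + lmscale*l + wdpenalty per word.

    All scores are assumed to be in the same log base.
    """
    return acscale * acoustic + lmscale * lm + wdpenalty * n_words


# Made-up path: total acoustic score -100, total LM score -10, two words.
print(combined_score(-100.0, -10.0, 2))
```

If the same path gets different combined scores from the two tools, the
difference points at the term (log base, LM scale or word penalty) that is
interpreted differently.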

Could you please give me any clues on this issue?

Thanks in advance.

Sopheap

-- 
---------------------------------------------
Sopheap SENG

Laboratoire d'Informatique de Grenoble (LIG)
Equipe GETALP Bureau C118
220, avenue de la Chimie
Campus Scientifique, BP53
38041 GRENOBLE Cedex 9, FRANCE
Tél : (33)-4-76-63-55-81
Télécopie : (33)-4-76-63-55-52
Courriel : sopheap.seng at imag.f
URL : http://www-geod.imag.fr
---------------------------------------------
Enseignant
Institut de Technologie du Cambodge
BP 86, Bd de Pochentong
Phnom Penh - Cambodge
Tél : (855)-23-88-03-70/98-24-45
Télécopie : (855)-23-88-03-69
Courriel : sopheap.seng at itc.edu.kh
URL : http://www.itc.edu.kh
---------------------------------------------