Fwd: Bug in lattice-tool?

From: "Tom Murray" <yozhik at ADDRESS HIDDEN>
Date: Thu, 18 Jan 2007 09:55:01 -0800

Thanks, Andreas. I'm forwarding this to the list because I think it
may be quite useful to a number of people.

---------- Forwarded message ----------
From: Andreas Stolcke <stolcke at ADDRESS HIDDEN>
Date: Jan 17, 2007 10:57 PM
Subject: Re: Bug in lattice-tool?
To: Tom Murray <yozhik at ADDRESS HIDDEN>

Tom,

what you are trying to do can be done with lattice-tool as it is,
but it requires two passes.  That's how we rescore lattices ourselves.

step 1: expand lattice with new LM, write new lattices
step 2: read rescored lattices, choosing scaling factors and decoding
        1-best or n-best.

You are trying to combine these steps into one, and it fails because
the LM rescoring function overrides the combined scores.
This behavior is by design and some other functions depend on it,
but it needs to be better documented.

BTW, I don't think your patch will necessarily do the right thing.
It simply adds the new LM score to the old combined score, instead
of replacing the old LM score in the combination of scores.
There are ways to fix this, but it would require more extensive code
changes.
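
To make the difference concrete, here is a small arithmetic sketch. The
scores and scaling factors are invented for illustration, and the
combination is reduced to acoustic and LM terms only (no word penalty or
other HTK scores); it is not taken from the lattice-tool source.

```python
# Hypothetical log scores for one lattice transition (made-up values).
ac_score = -100.0      # acoustic score from the HTK edge
old_lm_score = -20.0   # LM score originally stored in the lattice
new_lm_score = -15.0   # score assigned by the external LM

# Hypothetical scaling factors, in the role of -htk-acscale / -htk-lmscale.
acscale, lmscale = 1.0, 8.0


def combined_score(ac, lm, acscale, lmscale):
    """Weighted sum of acoustic and LM scores (simplified assumption)."""
    return acscale * ac + lmscale * lm


# Score computed when the lattice is loaded.
old_combined = combined_score(ac_score, old_lm_score, acscale, lmscale)

# What the patch is described as doing: add the new LM score on top of
# the old combined score, so the old LM score is still counted.
patched = old_combined + new_lm_score

# What replacing the LM score inside the combination would give instead.
replaced = combined_score(ac_score, new_lm_score, acscale, lmscale)

print("old combined:", old_combined)   # -260.0
print("patched     :", patched)        # -275.0
print("replaced    :", replaced)       # -220.0
```

With these numbers the two results differ by 55, which would be enough to
reorder an n-best list.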

I would recommend the 2-step approach.  It also has the advantage
that you can rerun step 2 (n-best decoding) multiple times to try different
scaling factors.

One more thing:  since your LM does not contain multiwords you need
to split the multiwords prior to LM expansion. Simply add the -split-multiwords
option in step 1.
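
The two passes described above, with -split-multiwords in step 1, might
be scripted roughly as follows. This is a sketch only: the file names,
the LM order, the n-best size, and the scale values are placeholders. Of
the flags, only -split-multiwords, -htk-lmscale and -htk-acscale are
named in this thread; the rest (-read-htk, -write-htk, -in-lattice,
-out-lattice, -lm, -order, -nbest-decode, -out-nbest-dir) are as I recall
them from the lattice-tool man page and should be checked against your
SRILM version. The script only prints the commands, it does not run them.

```python
import shlex


def step1_expand(in_lattice, lm_file, out_lattice, order=3):
    """Step 1: expand the lattice with the new LM and write it back out."""
    return [
        "lattice-tool",
        "-read-htk", "-in-lattice", in_lattice,
        "-lm", lm_file, "-order", str(order),
        "-split-multiwords",            # the LM contains no multiwords
        "-write-htk", "-out-lattice", out_lattice,
    ]


def step2_decode(rescored_lattice, nbest_dir, n, lmscale, acscale):
    """Step 2: read the rescored lattice, apply scales, decode n-best."""
    return [
        "lattice-tool",
        "-read-htk", "-in-lattice", rescored_lattice,
        "-htk-lmscale", str(lmscale),
        "-htk-acscale", str(acscale),
        "-nbest-decode", str(n),
        "-out-nbest-dir", nbest_dir,
    ]


# Placeholder file names, not the ones from the attached test directory.
expand_cmd = step1_expand("in.htk", "new.lm", "rescored.htk")
print(shlex.join(expand_cmd))

# Step 2 can be repeated with different scaling factors without
# redoing the LM expansion.
decode_cmds = [
    step2_decode("rescored.htk", f"nbest.lm{lmscale}", 10, lmscale, 1.0)
    for lmscale in (6.0, 8.0, 10.0)
]
for cmd in decode_cmds:
    print(shlex.join(cmd))
```

To actually run a command, pass the list to subprocess.run(cmd, check=True).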

Andreas

In message <39abe3570701171423p4bb5d962qf6dbed50cca8aeda at ADDRESS HIDDEN> you wrote:
> Hi, Andreas--
>
> What we want to do with lattice-tool is this: generate an n-best list
> from a lattice using an external LM, where the path scores are a
> weighted sum of the AM and LM scores in the lattice and the scores of
> the external LM.
>
> Attached is a tarred directory with an HTK lattice, an LM, and a test
> script test-lattice.sh. Also included is the output of v1.5.1
> lattice-tool, compared with my patched version which adds the
> transition log weights as I described.
>
> The script runs lattice-tool three times, first with default
> -htk-lmscale and -htk-acscale, and then with the lmscale and the
> acscale zeroed out. You can see that the n-best list is the same for
> all three for the v1.5.1 output. For mine it differs.
>
> To give a little more detail of where I think the bug is, according to
> my understanding of what's going on:
>
> When you load the HTK file, you create a node for each HTK edge, and
> then connect this new node from the start node and to the end node.
> The weight of the connection from the start to the new node is the
> weighted sum (according to lmscale, acscale, etc.) of the various
> scores from the HTK edge.
>
> Now, during expansion, old nodes and transitions are replaced by new
> ones, with the old nodes deleted. I printed out all the node indices,
> and the initial nodes corresponding to the HTK edges are deleted
> during this stage. I became convinced of this when I added a line to
> zero out the probs from the external LM, and all the hyp scores during
> n-best output had score = 0.
>
> Please let me know if I'm misunderstanding something. Thanks for your help,
>
> tm
