decode lattice

Andreas Stolcke stolcke at speech.sri.com
Fri Mar 5 08:24:38 PST 2004


In message <20040305115330.newington+john at force9> you wrote:
> 
> Andreas,
> 
> I changed our lattice files so that the words were not enclosed in double
> quotes. This fixed the initial problem and enabled me to get an output from
> lattice-tool. However, I then realised that I needed to scale the output from
> my classifier by subtracting the log prior probabilities for each class before
> building the lattice. Now, when I try the rescaling and decoding using
> lattice-tool it predicts the same (low frequency) label for almost every token.

I'm a little confused by your description.  I gather you have a classifier that
operates on word hypotheses and outputs posterior probabilities, which you
scale by the priors to obtain pseudo-likelihoods, giving you your acoustic 
scores.  That part sounds reasonable (correct me if I got it wrong).
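
To pin down what I understand by that scaling, here is a minimal Python
sketch.  It assumes the classifier outputs posteriors P(c|x) and that the
class priors P(c) are known; the function name, labels and numbers are made
up for illustration and are not taken from your setup.

```python
import math

def pseudo_log_likelihoods(posteriors, priors):
    """Scale posteriors by priors in the log domain.

    Returns natural-log scores log P(c|x) - log P(c), which are
    proportional to the likelihoods P(x|c) by Bayes' rule.
    """
    return {c: math.log(p) - math.log(priors[c])
            for c, p in posteriors.items()}

# Hypothetical two-class example: "question" is the rare class.
posteriors = {"statement": 0.6, "question": 0.4}
priors = {"statement": 0.9, "question": 0.1}

scores = pseudo_log_likelihoods(posteriors, priors)
for label, score in scores.items():
    print(label, round(score, 3))
```

Note that dividing by the prior raises the score of rare classes relative to
frequent ones, so the priors have to come back in through the LM.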

Does the unigram LM you are using encode the priors ?

What do you mean by "token" in your last sentence? 

> 
> Am I wrong to scale my 'acoustic' probabilities before building the lattice?
> Does lattice-tool do this for me when I call:
> 
> ./lattice-tool -read-htk -in-lattice lattice.slf -write-htk \
>     -out-lattice lattice.out -lm DAgrammar -no-nulls

lattice-tool only performs global scaling of the scores in the lattice.
By default the scores are interpreted as being natural logs (base e).
If you add a header field

	base=B

then scores are taken to be logs base B.
So, if your acoustic scores are not natural logs you should either convert
them, or insert the "base=" spec in the lattice header.
(You can also use straight probabilities as scores by setting base=0.)
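
If you prefer to convert the scores yourself, the change of base is a single
multiplication.  The following Python sketch illustrates the arithmetic only;
the helper name is hypothetical and this is not code from lattice-tool.

```python
import math

def to_natural_log(score, base):
    """Convert a lattice score to a natural log.

    base is the value of the "base=" header field: a score given as a
    log to base B is multiplied by ln(B); base 0 means the score is a
    plain probability, so we simply take its natural log.
    """
    if base == 0:
        return math.log(score)
    return score * math.log(base)

# A log10 score of -2.0 is a probability of 0.01.
print(to_natural_log(-2.0, 10))
# A plain probability of 0.5, as with base=0.
print(to_natural_log(0.5, 0))
```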

The default log scale for output lattices is 10 (so that LM scores can 
be more easily inspected and compared to LM files),
so the header of an output lattice will contain "base=10"
regardless of the input.  However, you can choose a different base with the
"-htk-logbase" option.  That won't change your result, though, because when
the lattice is read back in everything is converted to log base 10 internally.
The important thing is that the acoustic scores have the right base 
in the original lattice so that the LM scores generated by rescoring
are compatible.

When you decode from the lattice (lattice-tool -viterbi-decode) you can
choose to scale the acoustic and LM scores differently to give different weights
to these knowledge sources.  This is controlled by the options

 -htk-acscale
 -htk-lmscale

So you might want to play with those.
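
To show why the weights matter, here is a toy Python sketch.  It assumes the
two scores are combined as a weighted sum of log scores, which is my
simplification for illustration and not the actual lattice-tool code; the
labels and numbers are invented.

```python
def best_label(hyps, acscale, lmscale):
    """Pick the label with the highest weighted sum of log scores.

    hyps maps each label to a pair (acoustic_log_score, lm_log_score).
    """
    def combined(label):
        ac, lm = hyps[label]
        return acscale * ac + lmscale * lm
    return max(hyps, key=combined)

# Hypothetical scores: "A" is favoured acoustically, "B" by the LM.
hyps = {"A": (-1.0, -3.0), "B": (-2.0, -0.5)}

print(best_label(hyps, acscale=1.0, lmscale=1.0))  # LM weighted fully
print(best_label(hyps, acscale=1.0, lmscale=0.1))  # LM weighted down
```

With equal weights the LM preference wins; with the LM weighted down the
acoustic preference wins.  If one knowledge source dominates you can end up
with the same label almost everywhere.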

--Andreas 



