[SRILM User List] Rescoring lattices
Dmytro Prylipko
dmytro.prylipko at ovgu.de
Mon May 28 04:15:36 PDT 2012

Dear Andreas,
As far as I know, a convenient way to combine HTK and SRILM for speech
recognition is:
1. Generate lattices with HTK.
2. Rescore them with lattice-tool and an LM built with the SRILM toolkit.
3. Decode the rescored lattices using, e.g., the Viterbi decoding procedure,
and finally obtain the most likely hypothesis under the new language model.
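
The three steps above can be sketched as command lines. This is only a minimal sketch, not my exact setup: the file names (train.txt, lattices.list, trigram.lm) are hypothetical placeholders, the lattices are assumed to have been written by HTK already, and the script only builds the argument lists without running anything.

```python
from typing import List


def build_rescoring_commands(
    train_text: str,
    lattice_list: str,
    lm_path: str = "trigram.lm",   # hypothetical output file name
    order: int = 3,
    lm_scale: float = 10.0,
) -> List[List[str]]:
    """Return argv lists for LM training and lattice rescoring (not executed)."""
    # Step 2a: train a back-off N-gram LM with SRILM's ngram-count.
    train_lm = [
        "ngram-count",
        "-order", str(order),
        "-text", train_text,
        "-lm", lm_path,
    ]
    # Steps 2b and 3: read the HTK lattices, apply the new LM, and print
    # the Viterbi (1-best) path for each lattice.
    rescore = [
        "lattice-tool",
        "-read-htk",
        "-in-lattice-list", lattice_list,
        "-order", str(order),
        "-lm", lm_path,
        "-htk-lmscale", str(lm_scale),
        "-viterbi-decode",
    ]
    return [train_lm, rescore]


if __name__ == "__main__":
    for argv in build_rescoring_commands("train.txt", "lattices.list"):
        print(" ".join(argv))
```

Each list can be passed to subprocess.run once the paths point to real files.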
In the first step I also get the 1-best hypothesis for each utterance,
decoded with the HTK bigram model (built using HLStats and HBuild).
I found that the recognition accuracy with the SRILM trigram is much
worse than the accuracy of the initial output obtained with the HTK bigram:
for instance, 75.99% vs. 83.14%, or 65.51% vs. 71.58%.
The reason for this is the different training sets for the bigrams and
trigrams. For decoding, HVite uses not a real back-off N-gram but a word
network, and this network contains just those word sequences found in the
training data. In order to ensure that each test utterance is present in
the initial lattice (and thus can be recognized with 100% accuracy), I
cheated: I built the network (initial lattice) using the training as well
as the test data.
In contrast, the trigrams were trained on the training material only, but
with the full vocabulary.
However, the gap seems too large to me. Could you tell me whether the
scheme I described is correct?
Another strange thing is the behavior of the models during incremental
adaptation. In supervised mode the accuracy (%) progresses as follows:
HTK            SRILM
81.07          75.99
82.61          75.72
82.61          76.32
83.14          75.99
83.21          75.52
82.74          75.85
83.41          76.39
83.55          76.79
83.68          76.59
That is 6 improvements and 1 worsening (compared with the previous result)
with HTK, versus 4 improvements and 4 worsenings with SRILM.
The absolute improvement is 2.61% vs. 0.8%, and the relative improvement
3.12% vs. 1.04%. This looks pretty strange to me: I expected the dynamics
to be roughly the same.
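
These figures can be reproduced with a short script. It is a sketch under one assumption: the improvement is measured from the first to the best result, and the relative value is taken with respect to the best result, which is what matches the numbers quoted here.

```python
htk = [81.07, 82.61, 82.61, 83.14, 83.21, 82.74, 83.41, 83.55, 83.68]
srilm = [75.99, 75.72, 76.32, 75.99, 75.52, 75.85, 76.39, 76.79, 76.59]


def summarize(acc):
    """Count step-to-step changes and the gain from the first to the best result."""
    steps = list(zip(acc, acc[1:]))
    improvements = sum(1 for prev, cur in steps if cur > prev)
    worsenings = sum(1 for prev, cur in steps if cur < prev)
    best = max(acc)
    absolute = round(best - acc[0], 2)
    # Assumption: the relative gain is taken with respect to the best accuracy.
    relative = round(100.0 * (best - acc[0]) / best, 2)
    return improvements, worsenings, absolute, relative


print(summarize(htk))    # -> (6, 1, 2.61, 3.12)
print(summarize(srilm))  # -> (4, 4, 0.8, 1.04)
```

The unchanged step in the HTK column (82.61 twice) counts as neither an improvement nor a worsening.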
Do you have any suggestions as to why the trigram performs so badly? Maybe
the language model scale factor is too large (I use 10.0)?
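
To illustrate why the scale factor matters, here is a toy sketch. It assumes the usual combination of scores, acoustic log score plus scale times LM log score, and ignores any word insertion penalty. The two hypotheses and their scores are invented for illustration and are not taken from my experiments.

```python
def best_hypothesis(hyps, lm_scale):
    """Pick the hypothesis maximizing acoustic + lm_scale * LM log score."""
    return max(hyps, key=lambda h: h["ac"] + lm_scale * h["lm"])["words"]


# Hypothetical log scores for two competing hypotheses (not real data).
hyps = [
    {"words": "A", "ac": -1000.0, "lm": -12.0},  # better acoustics
    {"words": "B", "ac": -1030.0, "lm": -8.0},   # better LM score
]

for scale in (5.0, 10.0):
    print(scale, best_hypothesis(hyps, scale))
# 5.0 A
# 10.0 B
```

With a larger scale the LM dominates the decision, so it may be worth trying several scale values on held-out data rather than fixing 10.0.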
I would greatly appreciate any help.
Yours,
Dmytro Prylipko.