Vocabularies when interpolation

Andreas Stolcke stolcke at speech.sri.com
Thu Apr 10 09:15:07 PDT 2003


In message <20030410105631.B11073 at luistervink.cs.utwente.nl> you wrote:
> Hello Andreas,
> 
> When performing interpolation with 
> 
> ngram -lm .. -mix-lm .. -lambda ...
> 
> the vocabularies of the LMs being mixed get merged, if I understand it
> correctly (from doing some test runs). Is there a way to force the resulting
> output LM to have a predefined vocabulary (e.g. the vocab of one of the LMs
> being mixed)?

No, but you can limit the LM vocabulary either before or after the 
merging.  The proper way to do this is to specify the same vocabulary
when building the various LM components. If that is not possible
(e.g., you got the LMs from someone else) you can modify the 
LM vocabulary post-training using the "change-lm-vocab" script.
Check the "lm-scripts" man page.

--Andreas 
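
A minimal sketch of the post-merging route described above, for the case
where the target vocabulary is that of one of the component LMs. It assumes
the LMs are in ARPA text format and that the word list can be taken from the
\1-grams: section. All file names (tiny.arpa, a.vocab, mixed.lm,
mixed.a-vocab.lm) are hypothetical, and the toy LM exists only so the
extraction step can be run on its own. The change-lm-vocab options shown are
the ones I understand the lm-scripts man page to document; please check them
against your SRILM version.

```shell
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)

# Toy ARPA-format LM (hypothetical content, for illustration only).
cat > "$workdir/tiny.arpa" <<'EOF'

\data\
ngram 1=3
ngram 2=1

\1-grams:
-0.5 <s> -0.3
-0.5 hello -0.3
-0.5 </s>

\2-grams:
-0.2 <s> hello

\end\
EOF

# Print the word field of every entry in the \1-grams: section.
# Any other line starting with a backslash ends the section.
awk '/^\\1-grams:/  {in1 = 1; next}
     /^\\/          {in1 = 0}
     in1 && NF >= 2 {print $2}' "$workdir/tiny.arpa" > "$workdir/a.vocab"

# With SRILM installed and an interpolated LM at hand (mixed.lm is a
# hypothetical name), restrict that LM to the extracted vocabulary.
if command -v change-lm-vocab >/dev/null 2>&1 && [ -f mixed.lm ]; then
    change-lm-vocab -vocab "$workdir/a.vocab" -lm mixed.lm \
        -write-lm mixed.a-vocab.lm
fi
```

The same word list could instead be given to each component LM at build time,
which is the route recommended above when you train the LMs yourself.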



