[SRILM User List] rescoring with fngram and separate probabilities

Andreas Stolcke stolcke at icsi.berkeley.edu
Fri Jun 22 20:17:40 PDT 2012


On 6/22/2012 5:50 PM, Gregor Donaj wrote:
> Hi,
>
> I have two questions about the fngram tool. I used it to re-score 
> n-best lists of factored sentences. I took a look at the man pages, 
> but I couldn't find my answers.
> 1)
> After taking a close look at the probabilities I realized that the 
> scores already seem to be weighted by a factor of 8. Is there any option 
> to change this factor? How about the ngram tool?
I would not use fngram for nbest rescoring; lack of documentation is 
one problem, and the program has not been updated in a while.
The better approach is to use fngram-count to train FLMs, but then use 
ngram -factored to apply the LM to data.  So you would use ngram 
-factored with -nbest, -nbest-files, or -rescore (see ngram(1) man page).
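
As a rough illustration of what nbest rescoring amounts to, here is a 
minimal Python sketch. It is not SRILM code: the tuple layout, the 
function name rerank, and the toy scores are all assumptions, and the LM 
weight of 8 simply mirrors the factor mentioned in the question.

```python
# Hypothetical illustration, not SRILM code: rerank an N-best list after
# replacing each hypothesis's LM score with a newly computed one.

def rerank(nbest, new_lm_scores, lm_weight=8.0, wt_weight=0.0):
    """Return (total, text) pairs sorted by combined score, best first.

    nbest: list of (am_score, old_lm_score, num_words, text) tuples.
    new_lm_scores: one new LM log score per hypothesis, in the same order.
    """
    rescored = []
    for (am, _old_lm, nwords, text), lm in zip(nbest, new_lm_scores):
        # Weighted sum of the three standard scores: AM, LM, word count.
        total = am + lm_weight * lm + wt_weight * nwords
        rescored.append((total, text))
    rescored.sort(key=lambda item: item[0], reverse=True)
    return rescored


# Toy data (made up): two hypotheses with log scores.
nbest = [
    (-100.0, -12.0, 3, "a b c"),
    (-101.0, -11.0, 3, "a b d"),
]
new_lm = [-12.5, -10.0]

best_total, best_text = rerank(nbest, new_lm)[0]
print(best_text, best_total)
```

Changing lm_weight in this sketch changes how strongly the new LM score 
influences the ranking relative to the acoustic score.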
> 2)
> Can fngram also write the original language model score to the output? 
> I mean not just replace the original language model score with the 
> re-scored probability, but write both to the output file?
Well, you can always save the original nbest lists and use their LM scores 
as an additional input to the score combination.
Using more than the standard three scores (AM, LM, and word count) 
requires extra work, some of which is supported by the wrapper scripts 
described in the nbest-scripts(5) man page.

The typical way to do this would be:

1) Use the rescore-decipher wrapper script with the -lm-only option (in 
addition to -factored -lm ...) to produce score files that contain only 
the FLM scores.
2) Use nbest-optimize (on a held-out tuning set) to determine the 
optimal score weightings (see man page)
3) Use rescore-reweight to combine all scores and output new 1-best hyps.
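
The idea behind steps 2 and 3 can be sketched in a few lines of Python. 
This is only a toy under stated assumptions, not what the SRILM scripts 
do internally: it swaps in an exhaustive grid search for the weight 
tuning, fixes the AM weight at 1, counts exact sentence matches instead 
of word errors, and uses made-up data. The names best_hyp and tune are 
hypothetical.

```python
# Hypothetical sketch, not the SRILM tools: tune weights for four scores
# (AM, original LM, FLM, word count) on held-out data, then pick 1-best.
from itertools import product


def best_hyp(hyps, weights):
    """hyps: list of (scores, text); scores is a tuple aligned with weights."""
    return max(hyps, key=lambda h: sum(w * s for w, s in zip(weights, h[0])))[1]


def tune(heldout, grid):
    """Grid-search the LM, FLM, and word-count weights (AM weight fixed at 1).

    heldout: list of (hyps, reference_text) pairs.
    Returns the first weight tuple with the most exact 1-best matches.
    """
    def num_correct(weights):
        return sum(best_hyp(hyps, weights) == ref for hyps, ref in heldout)

    candidates = [(1.0,) + combo for combo in product(grid, repeat=3)]
    return max(candidates, key=num_correct)


# Toy held-out set (made up). Scores are (AM, original LM, FLM, word count).
heldout = [
    ([((-100.0, -12.0, -12.5, 3), "a b c"),
      ((-101.0, -11.0, -10.0, 3), "a b d")], "a b d"),
    ([((-50.0, -6.0, -7.0, 2), "x y"),
      ((-50.5, -7.0, -6.0, 2), "x z")], "x z"),
]

weights = tune(heldout, grid=(0.0, 4.0, 8.0))
print(weights, best_hyp(heldout[0][0], weights))
```

Keeping the original LM score as its own column, next to the FLM score, 
is what lets the tuning decide how much each one should count.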

Andreas




