pfsg-format

Andreas Stolcke stolcke at speech.sri.com
Thu Mar 25 09:33:16 PST 2004


In message <4063152D.3060201 at irisa.fr> you wrote:
> Hi!
> I have one question about the pfsg format: is the transition cost
> between 2 states 10000.5 times the log-probability of the bigram
> corresponding to the 2 states?

Correct.
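
A minimal sketch of that scaling, for readers comparing values by hand. It assumes that the ARPA file stores base-10 log probabilities and that the 10000.5 factor is applied to the natural log before rounding to an integer; the second assumption and the function name are illustrative, so check them against the pfsg-format documentation.

```python
import math

# Scale factor named in the question above.
PFSG_SCALE = 10000.5


def log10_to_pfsg_cost(log10_prob: float) -> int:
    """Convert a base-10 log probability to an integer scaled cost.

    Assumption: the scale is applied to the natural log of the probability.
    """
    natural_log = log10_prob * math.log(10.0)
    # Rounding to an integer is where precision is lost.
    return round(natural_log * PFSG_SCALE)


if __name__ == "__main__":
    for lp in (0.0, -0.5, -1.0, -2.75):
        print(lp, log10_to_pfsg_cost(lp))
```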

> I ask because I get different log-probabilities for a word (my language
> model is based on letters) from two representations of the same model:
> one built from an ARPA file (using the NgramLM class), and one built
> from a PFSG file (I convert the ARPA file with the make-ngram-pfsg
> script and then use the LatticeLM class). Why is there a difference?
> Since I convert the ARPA file into a PFSG file, the results should be
> the same.

How big are the differences?  There will be some discrepancy due to
rounding the scaled log probabilities to integers, but the error
should be small.
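
To get a feel for how small, here is a hedged sketch that rounds a scaled log probability to an integer and converts it back. It assumes the 10000.5 scale is applied to the natural log and that the ARPA values are base-10 logs; the helper names are hypothetical. Under those assumptions the rounding error is at most about 2.2e-5 in log10 units per transition, so over a word of N letters it can add up to at most N times that. It only illustrates the rounding component of a mismatch.

```python
import math

SCALE = 10000.5  # scale factor from the question; applied to ln(p) by assumption

# Worst case: the scaled value lands halfway between two integers.
MAX_LOG10_ERROR = 0.5 / SCALE / math.log(10.0)


def roundtrip_error(log10_prob: float) -> float:
    """Absolute log10 error after scaling, rounding, and unscaling."""
    scaled = log10_prob * math.log(10.0) * SCALE
    cost = round(scaled)  # integer cost as it would be stored
    recovered = cost / SCALE / math.log(10.0)
    return abs(recovered - log10_prob)


if __name__ == "__main__":
    print("bound per transition:", MAX_LOG10_ERROR)
    for lp in (-0.1234, -1.0, -3.14159):
        print(lp, roundtrip_error(lp))
```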

--Andreas 



