
Re: Problems finding best path (to choose synonym)

From: Andreas Stolcke <stolcke at ADDRESS HIDDEN>
Date: Tue, 06 Sep 2005 09:24:24 PDT

In message <6.0.1.1.1.20050906134955.036d7238 at ADDRESS HIDDEN> you wrote:
> I'm trying to use srilm for a Natural Language Generation
> application, to choose between synonyms of a word.  The input
> to the system is a structure such as
>
>   you OR[answered,got] 4 questions OR[correctly,correct,right]
>
> The system needs to make a choice at each OR point, with the
> goal of producing the easiest-to-read final sentence.  There are
> preference weights for the choices; for example, "answered"
> gets a preference weight of 0.2 and "got" gets 0.8.  This reflects
> the fact that, even ignoring LM issues, we expect "got"
> to be easier to read (shorter, simple phoneme->letter mapping).
>
> I represent the above as a "wlat" format file, which I convert
> to pfsg and then run lattice-tool on.  However, I can't get
> lattice-tool to find the best path through the mesh taking into
> account both the language model and the preference weights.
> If I specify -viterbi-decode  I get the best path based on the
> LM (but ignoring the preference scores), while if I specify
> -posterior-decode I get the best path based on preference scores
> (but ignoring the LM).  I'd also like to see the actual scores;
> I thought I would get these with -nbest-decode, but the nbest file
> has 0 for all the scores.
>
> Is there any way to find the best path taking both LM and
> preference weights into consideration, and giving actual
> scores?

I think you would have to directly encode your problem as an HTK-style
lattice, where you can have a number of scores associated with each word.
The HTK format is not documented as part of SRILM, but as part of
the HTK documentation (which is available online at
http://htk.eng.cam.ac.uk/).

That said, it seems like your problem is more straightforwardly encoded
as an HMM tagging problem.  Have a look at the disambig tool, especially
the -text-map option.  The preference values would be encoded in the
map file, and the unambiguous words would be mapped to themselves.
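
As a rough illustration of that encoding, here is a minimal Python sketch
that turns the example structure into lines for a combined text/map file.
It assumes the layout described in the disambig manual page: one token per
line, followed by its candidate words, each optionally followed by a
weight, with sentence boundaries given explicitly as "<s> <s>" and
"</s> </s>".  Please check that layout against your SRILM version.  The
function name build_text_map, the OR1/OR2 placeholder tokens, and the
weights for the second choice point are made up for illustration; only the
0.2/0.8 weights come from the original message.

```python
def build_text_map(slots):
    """Build lines for a combined text/map file (assumed disambig -text-map layout).

    Each slot is either a plain string (an unambiguous word) or a dict
    mapping candidate words to preference weights.
    """
    lines = ["<s> <s>"]  # explicit sentence start (assumed to be required)
    choice_count = 0
    for slot in slots:
        if isinstance(slot, str):
            # Unambiguous word: mapped to itself, weight left at its default.
            lines.append(f"{slot} {slot}")
        else:
            # Choice point: a placeholder token followed by "word weight" pairs.
            choice_count += 1
            alternatives = " ".join(f"{word} {weight}" for word, weight in slot.items())
            lines.append(f"OR{choice_count} {alternatives}")
    lines.append("</s> </s>")  # explicit sentence end
    return lines


slots = [
    "you",
    {"answered": 0.2, "got": 0.8},  # weights from the original message
    "4",
    "questions",
    {"correctly": 0.3, "correct": 0.3, "right": 0.4},  # hypothetical weights
]
text_map_lines = build_text_map(slots)
print("\n".join(text_map_lines))
```

The resulting file would then be passed to disambig with -text-map,
together with the language model given via -lm.  Note that the manual page
describes the numbers as conditional probabilities of the observed token
given the hidden word, so how closely raw preference weights fit that
interpretation (and whether they need rescaling against the LM scores) is
something to experiment with.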

--Andreas

>
> Many thanks
> Ehud Reiter
>
