Re: Problems finding best path (to choose synonym)

From: Andreas Stolcke <stolcke at ADDRESS HIDDEN>
Date: Tue, 06 Sep 2005 09:24:24 PDT

In message <6.0.1.1.1.20050906134955.036d7238 at ADDRESS HIDDEN> you wrote:
> I'm trying to use srilm for a Natural Language Generation
> application, to choose between synonyms of a word.  The input
> to a system is a structure such as
>
>   you OR[answered,got] 4 questions OR[correctly,correct,right]
>
> The system needs to make a choice at each OR point, with the
> goal of producing the easiest-to-read final sentence.  There are
> preference weights for the choices, for example, "answered"
> gets a preference weight of 0.2 and "got" gets 0.8.  This reflects
> the fact that, even ignoring LM issues, we expect "got"
> to be easier to read (shorter, simpler phoneme->letter mapping).
>
> I represent the above as a "wlat" format file, which I convert
> to pfsg and then run lattice-tool on.  However, I can't get
> lattice-tool to find the best path through the mesh taking into
> account both the language model and the preference weights.
> If I specify -viterbi-decode  I get the best path based on the
> LM (but ignoring the preference scores), while if I specify
> -posterior-decode I get the best path based on preference scores
> (but ignoring the LM).  I'd also like to see the actual scores,
> I thought I would get this with -nbest-decode but the nbest file
> has 0 for all the scores.
>
> Is there any way to find the best path taking both LM and
> preference weights into consideration, and giving actual
> scores?

I think you would have to directly encode your problem as an HTK-style
lattice, where you can have a number of scores associated with each word.
The HTK format is not documented as part of SRILM, but as part of
the HTK documentation, which is available online at
http://htk.eng.cam.ac.uk/

That said, it seems like your problem is more straightforwardly encoded
as an HMM tagging problem.  Have a look at the disambig tool, especially
the -text-map option.  The preference values would be encoded in the
map file, and the unambiguous words are mapped to themselves.
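
A rough sketch of how such a combined text/map file might be generated
from the OR[...] notation in the question.  The assumptions here: each
line holds a token followed by its candidate words and optional weights,
sentence boundaries are marked explicitly with "<s> <s>" and "</s> </s>"
lines (my reading of the disambig manual page; check it against your
SRILM version), the CHOICEn placeholder names are hypothetical, and the
weights for correctly/correct/right are made up since the question does
not give them.

```python
import re

# Matches a choice point such as OR[answered,got] (notation from the question).
CHOICE = re.compile(r"^OR\[(.+)\]$")

def to_text_map(template, weights):
    """Turn a template sentence into lines for a combined text/map file.

    template: string such as "you OR[answered,got] 4 questions"
    weights:  dict mapping each alternative word to its preference weight
    """
    lines = ["<s> <s>"]  # explicit sentence start (assumed to be required)
    n_choices = 0
    for token in template.split():
        m = CHOICE.match(token)
        if m is None:
            # Unambiguous word: mapped to itself, no weight given.
            lines.append(f"{token} {token}")
        else:
            n_choices += 1
            alts = m.group(1).split(",")
            entries = " ".join(f"{w} {weights[w]}" for w in alts)
            # CHOICEn is a hypothetical placeholder for the token column.
            lines.append(f"CHOICE{n_choices} {entries}")
    lines.append("</s> </s>")  # explicit sentence end (same assumption)
    return lines

weights = {
    "answered": 0.2, "got": 0.8,                     # from the question
    "correctly": 0.2, "correct": 0.3, "right": 0.5,  # made up for illustration
}
template = "you OR[answered,got] 4 questions OR[correctly,correct,right]"
text_map = to_text_map(template, weights)
print("\n".join(text_map))
```

The resulting lines would be written to a file and passed to disambig
with -text-map, together with a language model given via -lm.  Whether
the weights need rescaling relative to the LM scores is something to
check in the disambig documentation.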

--Andreas

>
> Many thanks
> Ehud Reiter
>


Last modified Nov 21, 2008