Re: Disambig n-best scores

From: Andreas Stolcke <stolcke at ADDRESS HIDDEN>
Date: Tue, 30 Mar 2004 15:58:02 PST

In message <009501c4166e$a0b50cd0$34284484 at ADDRESS HIDDEN> you wrote:
> Hi,
>
> How is path score in disambig with n-best option calculated?
>
> For example, suppose that I have the sentence:
>
> W1 W2
> Which is tagged with T1 T2
>
> Then I calculated the path probability as follows:
>
> Log10 [ P(T1|<s>)*P(T2|T1)*P(</s>|T2)*P(W1|T1)*P(W2|T2) ]
>
> I got it "almost right". I checked for two paths:
> For one I got -20.549 (while disambig returned -120.549)
> For the other I got -20.837 (while disambig returned -120.837)
>
> What is the reason for this difference? Should I always ignore the "1"
> after the "-"?

The -100 comes from an OOV word.  When the LM returns a probability of 0
AND the word is not in the LM, it is considered an OOV.  To allow the
probability computation to proceed, a large negative but finite log
probability of -100 is substituted (cf. the constant LogP_PseudoZero in
disambig.cc).

--Andreas
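
The substitution described above can be illustrated with a minimal Python
sketch. This is not the disambig source: the function name path_logprob and
the probability values are hypothetical, and the sketch simplifies by
treating every zero-probability factor as an OOV, whereas the reply says
disambig also checks that the word is missing from the LM. Only the value
-100 and the constant name LogP_PseudoZero come from the message.

```python
import math

# Value quoted in the reply for LogP_PseudoZero (disambig.cc).
LOGP_PSEUDO_ZERO = -100.0


def path_logprob(probs):
    """Sum log10 of each factor on a path.

    Assumption of this sketch: any factor with probability 0 is treated
    as an OOV and contributes LOGP_PSEUDO_ZERO instead of -infinity.
    """
    total = 0.0
    for p in probs:
        if p > 0.0:
            total += math.log10(p)
        else:
            total += LOGP_PSEUDO_ZERO
    return total


# Hypothetical factors, in the order used in the question:
# P(T1|<s>), P(T2|T1), P(</s>|T2), P(W1|T1), P(W2|T2)
factors = [0.1, 0.2, 0.5, 0.01, 0.0]  # last factor is an OOV (probability 0)

score_known_only = path_logprob(factors[:4])  # the four nonzero factors
score_full_path = path_logprob(factors)       # includes the OOV factor

# The OOV factor shifts the path score by exactly the pseudo-zero.
print(score_known_only, score_full_path)
```

Under these assumptions, a score computed by hand without the OOV factor
and the score of the full path differ by exactly 100, which matches the
pattern of -20.549 versus -120.549 reported in the question.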


Last modified Nov 21, 2008