SRILM-USER Archives

Re: LM missing back-off probabilities

From: Yannick Estève - LIUM <yannick.esteve at ADDRESS HIDDEN>
Date: Wed, 25 May 2005 22:57:50 +0200

I hope this message can help you.

To use CMU Sphinx with an LM estimated with SRILM, you have to use two tools
provided with the SRILM toolkit:

- add-dummy-bows: this program adds the 'missing' back-off weights (in
  fact, when these weights are equal to 0, ngram-count doesn't print them).
- sort-lm: this program sorts the n-grams in lexical order (lm3gdmp works
  only if the n-grams are sorted; in fact, the 2-grams, 3-grams, ...,
  k-grams all have to be sorted in the same order).

These two tools are written in awk (awk or gawk has to be installed
on your computer).
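
To make the two steps concrete, here is a minimal Python sketch of what they
do to an ARPA-format LM file. It is not the code of the SRILM awk scripts:
the function names add_dummy_bows and sort_ngrams are hypothetical, the
sketch assumes whitespace-separated fields (log probability, words, optional
back-off weight), and it sorts by plain code-point order, which may differ
from the order the real sort-lm produces.

```python
import re

SECTION = re.compile(r"\\(\d+)-grams:")   # section headers such as \2-grams:
COUNT = re.compile(r"ngram\s+(\d+)\s*=")  # header lines such as ngram 2=15


def add_dummy_bows(lines):
    """Append an explicit 0 back-off weight to entries that lack one."""
    lines = [line.rstrip("\n") for line in lines]
    # The "ngram N=count" header lines give the highest order in the model.
    orders = [int(m.group(1)) for m in map(COUNT.match, lines) if m]
    max_order = max(orders, default=0)
    out = []
    order = 0  # 0 means "not inside an N-grams section"
    for line in lines:
        section = SECTION.match(line)
        if section:
            order = int(section.group(1))
        elif line.startswith("\\"):
            order = 0  # \data\ or \end\
        elif 0 < order < max_order and len(line.split()) == order + 1:
            # Only logprob + words are present: add the dummy weight.
            # Highest-order entries never carry a back-off weight.
            line += "\t0"
        out.append(line)
    return out


def sort_ngrams(lines):
    """Sort the entries of every N-grams section by their word sequence."""
    out = []
    block = []  # entries of the section currently being collected
    order = 0
    for line in lines:
        line = line.rstrip("\n")
        if order and line.strip() and not line.startswith("\\"):
            block.append(line)
            continue
        # Any non-entry line (blank line, header, \end\) closes the block.
        out.extend(sorted(block, key=lambda e: e.split()[1:order + 1]))
        block = []
        section = SECTION.match(line)
        if section:
            order = int(section.group(1))
        elif line.startswith("\\"):
            order = 0
        out.append(line)
    out.extend(sorted(block, key=lambda e: e.split()[1:order + 1]))
    return out
```

With the LM read into a list of lines, sort_ngrams(add_dummy_bows(lines))
applies both steps in the order listed above. Every section is sorted with
the same key, so all orders end up in the same ordering.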

-- Yannick

Goldee Udani wrote:

> Hi there,
>
> I am sorry if this problem has already been addressed before on this
> forum.
>
> I am trying to generate a small LM for use in the Sphinx speech
> recognition system, but the back-off probabilities for every n-gram
> occurring at the end of a sentence are missing.
> For example:
>
> <s> we cannot afford to fight the war against poverty with accounting
> tricks </s>
>
> For a trigram LM, it doesn't generate back-off probabilities for
> "tricks" (unigram) and "accounting tricks" (bigram). This tends to
> happen for all the sentences in the test set taken from the corpus.
>
> I am using the "ngram-count" program with Witten-Bell
> discounting applied to all n-grams in a trigram model.
>
> If any of you have faced a similar problem before, I would appreciate
> it if you could help me out here.
>
> Thanks,
> Goldee

Last modified Nov 21, 2008