Re: general help

From: Andreas Stolcke <stolcke at ADDRESS HIDDEN>
Date: Tue, 03 Sep 2002 08:45:06 PDT

Hongqin,

Two suggestions:

- interpolate your class-based LM with the word-based one
(class-based LMs alone usually don't give an improvement over word-based ones
  except in very limited domains).

- use Kneser-Ney smoothing (with interpolation) for the 4gram LM:

   -kndiscount1 -interpolate1 -kndiscount2 -interpolate2
   -kndiscount3 -interpolate3 -kndiscount4 -interpolate4

  You should see a perplexity reduction over the 3gram, and over GT
  (Good-Turing) discounting.  Of course, a lower perplexity does not
  guarantee a lower WER...
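
  As a rough sketch of how the Kneser-Ney options above could be put
  together, the Python snippet below assembles an ngram-count command
  line with one -kndiscountN/-interpolateN pair per order. The file
  names train.txt and kn4.lm and the helper kn_ngram_count_args are
  made up for illustration, and the snippet only prints the command; it
  assumes ngram-count would be on your PATH if you chose to run it.

```python
def kn_ngram_count_args(order, train_text, lm_out):
    """Build an ngram-count argument list that requests interpolated
    Kneser-Ney discounting for every n-gram order from 1 to `order`."""
    args = ["ngram-count", "-order", str(order),
            "-text", train_text, "-lm", lm_out]
    for n in range(1, order + 1):
        # One discounting flag and one interpolation flag per order.
        args += [f"-kndiscount{n}", f"-interpolate{n}"]
    return args


# Hypothetical file names; substitute your own training text and output LM.
cmd = kn_ngram_count_args(4, "train.txt", "kn4.lm")
print(" ".join(cmd))
```

  The resulting list can be passed to subprocess.run(cmd, check=True)
  once the paths point at real files.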

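For the first suggestion, here is a minimal sketch of what linear
interpolation of two models amounts to: each token's probability is a
weighted sum of the word-based and class-based probabilities, and the
weight is picked to minimize perplexity. The per-token probabilities
below are invented numbers, not output from any real model, and the
function names are illustrative only. In practice the weight should be
tuned on held-out data rather than on the test set.

```python
import math


def interpolate(p_word, p_class, lam):
    """Per-token linear interpolation: lam * P_word + (1 - lam) * P_class."""
    return [lam * pw + (1.0 - lam) * pc for pw, pc in zip(p_word, p_class)]


def perplexity(probs):
    """exp of the negative mean log probability over the scored tokens."""
    return math.exp(-sum(math.log(p) for p in probs) / len(probs))


def best_lambda(p_word, p_class, grid):
    """Pick the weight from `grid` that gives the lowest perplexity."""
    return min(grid,
               key=lambda lam: perplexity(interpolate(p_word, p_class, lam)))


# Invented per-token probabilities that each model assigns to the same
# four held-out tokens (illustration only).
p_word = [0.20, 0.01, 0.10, 0.05]
p_class = [0.10, 0.08, 0.05, 0.06]

grid = [i / 10 for i in range(11)]  # 0.0, 0.1, ..., 1.0
lam = best_lambda(p_word, p_class, grid)
mixed_ppl = perplexity(interpolate(p_word, p_class, lam))
print(lam, mixed_ppl, perplexity(p_word), perplexity(p_class))
```

Because the grid includes the weights 0 and 1, the selected mixture can
never have a higher perplexity on this data than either model alone.
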
--Andreas

In message <3D74D4C4.2A5F65BB at ADDRESS HIDDEN> you wrote:
> Hi, Crouching tigers & hidden dragons:
>
> I am using a word-based trigram (GT backoff) for an application and am
> trying to make further improvements. I tried a class-based LM, but it
> did not seem as good as the word-based one. A higher-order n-gram (4gram)
> also seems worse than the 3gram. The WER (word error rate) I get now is
> about 8-10%, so it seems that there is still some room for improvement.
> Does anyone have good ideas, staying within n-grams? Thanks in advance.
>
> Hongqin Liu
>
>
