<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#ffffff" text="#000000">
Dear all,<br>
<br>
I am trying to implement the confusion network (CN) based misspelling
correction method published by Bertoldi et al. 2010 (the full citation
is at the end of this e-mail). However, I am stuck at step 4,
which involves generating a word-based CN with the lattice-tool
program from the SRILM toolkit.<br>
<br>
Once I have put the unifilar word lattices together in SLF
format, I call lattice-tool with this command:<br>
<br>
lattice-tool -in-lattice wordlattice.slf -read-htk -lm lm/en.lm
-write-mesh wordlattice.cn<br>
<br>
However, including the language model can completely destroy
the original CN structure if the input lattice is fairly
long (&gt;15 nodes). I have tried to scale down the language model's
impact through the -htk-scale and -htk-wdpenalty options, but even
when I set both of them to 0 the CN still gets
destroyed. The only way I can preserve the CN structure is to omit
the -lm option entirely, but then the BLEU score of the
translations decreases considerably.<br>
<br>
Could anyone give me some pointers for tracking down the
problem? I can provide a sample SLF lattice along with
dot-generated images of intact and destroyed CNs.<br>
<br>
Regards,<br>
<br>
Lluís Formiga<br>
<br>
<br>
<br>
[Bertoldi et al. 2010] Nicola Bertoldi, Mauro Cettolo,
and Marcello Federico. 2010. Statistical machine translation of
texts with misspelled words. In <em>Human Language
Technologies: The 2010 Annual Conference of the North American
Chapter of the Association for Computational Linguistics</em>
(HLT '10). Association for Computational Linguistics,
Stroudsburg, PA, USA, 412-419.<br>
</body>
</html>