<html>
<head>
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix"><br>
</div>
<blockquote
cite="mid:60dd107450de493fb644f40672957b0e@BL2PR03MB193.namprd03.prod.outlook.com"
type="cite">
<div class="WordSection1">
<b><span
style="font-size:11.0pt;font-family:'Calibri','sans-serif'">From:</span></b>
K. Richardson [<a class="moz-txt-link-freetext" href="mailto:kazimir.richardson@gmail.com">mailto:kazimir.richardson@gmail.com</a>]
<br>
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:'Calibri','sans-serif'"><b>Sent:</b>
Monday, June 9, 2014 3:56 AM<br>
<b>To:</b> Andreas Stolcke<br>
<b>Subject:</b> question about SRILM non-events feature</span></p>
<div>
<p class="MsoNormal">Hi Andreas,</p>
<div>
<p class="MsoNormal">I apologize if there is some other
official channel for asking SRILM technical questions (I
tried writing to the srilm mailing list, but it bounced).
<br>
</p>
</div>
</div>
</div>
</blockquote>
You need to join the mailing list to be able to post questions.<br>
<br>
<blockquote
cite="mid:60dd107450de493fb644f40672957b0e@BL2PR03MB193.namprd03.prod.outlook.com"
type="cite">
<div class="WordSection1">
<div>
<div>
<p class="MsoNormal">I am using SRILM as a black box in an
MT system. I am trying to build a LM that enforces that
every sequence start with some default value, e.g.
&lt;s&gt; X, such that X never occurs elsewhere in some
other n-gram. <br>
</p>
</div>
</div>
</div>
</blockquote>
Do you want to (1) force X to always occur after &lt;s&gt;, (2)
prevent it from occurring anywhere else, or both?<br>
<br>
You can do (1) by setting the conditional probability of the bigram
&lt;s&gt; X to 1, and to 0 for all other bigrams starting with
&lt;s&gt;.<br>
<br>
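The edit for (1) could be scripted roughly as follows. This is only a sketch, not part of SRILM: it assumes the model is stored in the ARPA text format (log10 probabilities, whitespace-separated fields), that the bigram is already listed in the model, and that -99 is accepted as the stand-in for log10(0). The function name and its arguments are made up for illustration.<br>
<pre>
```python
def force_start_bigram(arpa_lines, start, x):
    """Return ARPA lines in which the bigram (start, x) has probability 1
    and every other bigram beginning with `start` has probability 0.

    `arpa_lines` are lines without trailing newlines, e.g. from
    text.splitlines(). `start` is the sentence-start token.
    """
    LOG_ONE = "0"      # log10(1)
    LOG_ZERO = "-99"   # assumed stand-in for log10(0)
    out, in_bigrams, found = [], False, False
    for line in arpa_lines:
        stripped = line.strip()
        if stripped.startswith("\\"):
            # Section markers (\data\, \1-grams:, \2-grams:, \end\).
            in_bigrams = stripped == "\\2-grams:"
        elif in_bigrams and stripped:
            # Layout: logprob, word1, word2, optional backoff weight.
            fields = stripped.split()
            if len(fields) >= 3 and fields[1] == start:
                is_target = fields[2] == x
                found = found or is_target
                fields[0] = LOG_ONE if is_target else LOG_ZERO
                line = fields[0] + "\t" + " ".join(fields[1:3])
                if len(fields) > 3:
                    line += "\t" + fields[3]
        out.append(line)
    if not found:
        raise ValueError("bigram not present; add it to the model first")
    return out
```
</pre>
The returned lines can be joined with newlines and written to a new model file. Backoff weights are left untouched here and still need to be recomputed.<br>
<br>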
You can do (2) by giving X a unigram probability of 0 and making sure
it does not occur in any other n-grams (other than those starting with
&lt;s&gt;). The zero unigram probability prevents X from receiving
probability mass via backoff.<br>
<br>
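A possible way to script (2), again only as a sketch under assumptions: the model is an ARPA text file, -99 stands in for log10(0), and "starting with &lt;s&gt;" is taken literally (any n-gram whose first word is the start token is kept). The helper name is hypothetical. Because dropping n-grams changes how many entries each section holds, the sketch also rewrites the "ngram N=count" header lines.<br>
<pre>
```python
import re

def isolate_word(arpa_lines, start, x):
    """Set the unigram log10 probability of x to -99 and drop every
    higher-order n-gram containing x unless its first word is `start`.

    Expects lines without trailing newlines; returns new lines.
    """
    order, kept, counts = 0, [], {}
    for line in arpa_lines:
        s = line.strip()
        m = re.fullmatch(r"\\(\d+)-grams:", s)
        if m:
            order = int(m.group(1))      # entering an N-gram section
            counts[order] = 0
        elif s.startswith("\\"):
            order = 0                    # \data\ or \end\
        elif order and s:
            fields = s.split()
            words = fields[1:1 + order]
            if x in words:
                if order == 1:
                    # Keep the unigram entry but give it zero probability.
                    line = "\t".join(["-99", words[0]] + fields[2:])
                elif words[0] != start:
                    continue             # drop this n-gram entirely
            counts[order] += 1
        kept.append(line)

    def fix_header(line):
        # Refresh "ngram N=count" so it matches the surviving entries.
        m = re.fullmatch(r"ngram (\d+)=(\d+)", line.strip())
        if m and int(m.group(1)) in counts:
            return "ngram %s=%d" % (m.group(1), counts[int(m.group(1))])
        return line

    return [fix_header(l) for l in kept]
```
</pre>
As with any manual edit of the probabilities, the result is not normalized until the backoff weights are recomputed.<br>
<br>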
After editing the probabilities, run ngram -renorm to recompute the
backoff weights.<br>
<br>
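The renormalization step can be driven from a script as well. The sketch below just builds and runs an ngram command line; it assumes the SRILM binaries are on PATH, and the file names and the default order are placeholders, not values from this thread.<br>
<pre>
```python
import subprocess

def renorm_command(in_lm, out_lm, order=3):
    """Build the argument list for SRILM's ngram tool with -renorm."""
    return ["ngram", "-order", str(order), "-lm", in_lm,
            "-renorm", "-write-lm", out_lm]

def renormalize(in_lm, out_lm, order=3):
    # Runs ngram; raises CalledProcessError if it exits with an error.
    subprocess.run(renorm_command(in_lm, out_lm, order), check=True)
```
</pre>
For example, renormalize("edited.lm", "renormed.lm", order=2) would read the hand-edited bigram model and write a copy with recomputed backoff weights.<br>
<br>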
<blockquote
cite="mid:60dd107450de493fb644f40672957b0e@BL2PR03MB193.namprd03.prod.outlook.com"
type="cite">
<div class="WordSection1">
<div>
<div>
<p class="MsoNormal">Is it possible to enforce this? Is this
within the purview of what the -nonevents option does? I
have been having a hard time understanding how this option
works, and specifically how you specify the associated
non-events file. <br>
</p>
</div>
</div>
</div>
</blockquote>
Non-events are tags like &lt;s&gt; that are not predicted by the LM
but can occur in the history (context) portion of an N-gram to
condition the next word.<br>
It doesn't sound like that's what you want here.<br>
<br>
Andreas<br>
<br>
</body>
</html>