The Park Plaza Hotel, Boston, Massachusetts
Friday, May 7, 2004 (New Date)
Because of the number of papers received and accepted, both for this and other HLT-NAACL workshops, we have decided to merge our workshop with the Workshop on Spoken Language Understanding for Conversational Systems. The two workshops will be combined into a single one-day event, to be held on Friday, May 7, 2004. We believe the change in date will benefit potential attendees, since there are fewer conflicts with other workshops on that day.
The combined workshop will feature invited talks by
The theme of this workshop is the use of higher-level linguistic and other types of knowledge for automatic speech processing, especially, but not limited to, automatic speech recognition (ASR). Most current state-of-the-art speech recognizers do not explicitly use linguistic information (with the exception of pronunciation dictionaries), relying mainly on information encoded in statistical N-gram language models. Higher-level linguistic processes such as prosody, syntax, semantics, and pragmatics are obviously important, but such information is typically harder to label, model, and integrate into the standard computational frameworks (such as hidden Markov models). In addition, high-level meta-information, such as personal information stored in a database or dialogue and pragmatic coherence constraints, can also play an important role. All these sources of information can potentially compensate for acoustic confusability resulting from noisy environments and unexpected channel and speaker mismatch, which are very challenging issues for automatic speech recognizers. Furthermore, high-level information is typically crucial when the ultimate goal is to interpret the spoken input (e.g., the same sequence of words can mean different things depending on prosodic and syntactic features, as well as pragmatic constraints). Speaker recognition is another field that has recently recognized the importance of higher-level linguistic features, because speakers exhibit idiosyncratic prosodic, lexico-syntactic, and pragmatic patterns ("conversational biometrics").
This workshop seeks to bring together researchers in speech, NLP, and linguistics who are exploring novel ideas on the use of information beyond the low-level approaches traditionally used in speech processing (frame-level acoustic modeling and N-gram based language modeling). Many vigorous research efforts in this direction are well established, and some have proven to be very successful, such as structured/dependency language models for speech recognition, or prosodic information for speaker recognition. For limited domains (e.g., travel reservation and financial transactions), semantic information has clearly been useful for improving speech recognition. Recently, more human knowledge resources that encode different aspects of syntax, semantics, ontology, and common-sense knowledge have become available, and could well be used to augment language models to improve speech recognition. Such resources may include, but are not limited to, annotated corpora such as the Penn Treebank and PropBank, as well as FrameNet, WordNet, OpenCyc, etc. One challenge is that conditioning a language model on such information typically leads to data sparseness/fragmentation, so a proper representation of such knowledge is critical to success. The workshop also aims to improve the dissemination and exchange of ideas, methods, and data resources that are relevant to further progress. We seek papers that present novel ideas on how higher-level linguistic and other types of information can be utilized for automatic speech processing, as well as experimental results.
|Fri, Jan 30, 2004||Submissions due (Extended deadline)|
|Fri, Feb 20, 2004||Acceptance/rejection notification|
|Mon, Mar 17, 2004||Camera ready copy due (Extended deadline)|
|Fri, May 7, 2004||Workshop (New date!)|