International Workshop on Using Linguistic Information for Hybrid Machine Translation (LIHMT-2011)

Technical University of Catalonia, Barcelona, Spain.
Friday, November 18, 2011

In conjunction with the Shared Task on Applying Machine Learning techniques to optimising the division of labour in Hybrid MT (ML4HMT-2011)



Invited Speakers

The programme will include three invited plenary talks, each addressing one of the central issues above, and the presentation of a number of refereed contributions on related topics. The invited speakers include:

Rich Morphology and What Can We Expect from Hybrid Approaches to MT
The talk will consist of two parts: a summary of problems caused by rich morphology and a speculation as to which of these problems can be mitigated by hybrid approaches to MT.
In the first part, I will give an overview of the most important steps in the MT pipeline (training, tuning, evaluation) and the counterplaying effects of rich source- and/or target-side morphology on achievable MT quality. In the second part, I will relate the problems to some hybrid MT techniques (ROVER system combination, two-step translation and grammatical post-processing), including their inherent limitations in solving the problems.
  • Alon Lavie (Carnegie Mellon University, Pennsylvania)
Statistical MT with Syntax and Morphology: Challenges and Some Solutions
Phrase-based Statistical Machine Translation has been the dominant approach to MT in recent years. Its linguistic shallowness, however, limits its capabilities when applied to morphologically rich languages and to language pairs with highly divergent syntax. The integration of morphological analysis and syntactic modeling within statistical MT is currently at the forefront of MT research. This talk will give an overview of recent work within my research group on hybrid MT frameworks that incorporate syntax and morphology into statistical translation.
The talk will focus on three main lines of work: (1) Morphological segmentation of Arabic and its impact on English-to-Arabic phrase-based SMT; (2) Learning of syntax-based synchronous context-free grammars from large volumes of parsed parallel corpora; and (3) Automatic Category Label Coarsening for Syntax-based MT.
Linguistic Indicators for Quality Estimation of Machine Translation
Although significant progress has been observed in the field of Machine Translation (MT) in recent years, the quality of a given MT system can vary across translated segments. As MT becomes more popular among several types of users, an increasingly relevant problem is that of automatically assessing the quality of translations at the segment level to inform such users. In this talk I will present work on modelling the problem of quality estimation for different applications, focusing on the use of linguistic indicators contrasting the input and translation segments in order to complement shallow, language-independent and confidence-based indicators.

Important Dates

Paper submission deadline: Sept. 9 2011 Sept. 16 2011
Notification of acceptance: Oct. 7 2011 Oct. 18 2011
Final version of paper: Oct. 21 2011 Oct. 28 2011
Workshop: Nov. 18 2011


Papers should be in English and at most 8 pages long. Please follow the ACL HLT 2011 formatting requirements for long papers.

To submit contributions, please follow the instructions at the EasyChair conference management system submission website.

The deadline for submission is September 9, 2011.

The contributions will undergo a double-blind review by members of the programme committee.

Please address queries to the programme chairs.

Venue, Travel & Accommodation

The workshop will be held at Campus Nord of the Technical University of Catalonia (Barcelona), on November 18th.

Conference Venue:

Room "Sala de Juntes".
Building Rectorat
Campus Nord
C/ Jordi Girona, 1-3
08034 Barcelona


Programme Committee

  • Co-Chair: David Farwell (Technical University of Catalonia, TALP, Barcelona)
  • Co-Chair: Gorka Labaka (University of the Basque Country, Donostia)
  • Iñaki Alegria (University of the Basque Country, Donostia)
  • Ondřej Bojar (Charles University, Czech Republic)
  • Josep M. Crego (LIMSI/CNRS, France)
  • Arantza Díaz de Ilarraza (University of the Basque Country, Donostia)
  • Chris Dyer (Carnegie Mellon University, US)
  • Cristina España (Technical University of Catalonia, TALP, Barcelona)
  • Marcello Federico (Fondazione Bruno Kessler, Italy)
  • Mikel Forcada (University of Alacant, Alicante)
  • Adrià de Gispert (University of Cambridge, UK)
  • Kevin Knight (Information Sciences Institute, US)
  • Philipp Koehn (University of Edinburgh, UK)
  • Patrik Lambert (Université du Maine, France)
  • José B. Mariño (Technical University of Catalonia, TALP, Barcelona)
  • Lluís Màrquez (Technical University of Catalonia, TALP, Barcelona)
  • Hermann Ney (RWTH-Aachen, Germany)
  • Daniele Pighin (Technical University of Catalonia, TALP, Barcelona)
  • Aarne Ranta (Chalmers University of Technology, Gothenburg, Sweden)
  • Marta R. Costa-jussà (Barcelona Media, Barcelona)
  • Felipe Sánchez-Martínez (University of Alacant, Alicante)
  • Kepa Sarasola (University of the Basque Country, Donostia)
  • Lucia Specia (University of Wolverhampton, UK)
  • Dekai Wu (Hong Kong University of Science and Technology, China)

Local organization

Centre for Speech and Language Applications and Technologies (TALP), Technical University of Catalonia (UPC)

Committee members: David Farwell (Chair), Lluís Màrquez, Cristina España, Daniele Pighin, Meritxell González, Amarin Deemagarn.

Co-located Shared Task

Co-located with LIHMT, the ML4HMT-2011 workshop will explore alternatives in order to provide optimal support for Hybrid MT design, using sophisticated machine-learning techniques. One further important objective of the workshop is to build bridges from the MT community to the ML community to systematically and jointly explore the choice space for Hybrid MT.

The "Shared Task on Optimising the Division of Labour in Hybrid MT" is an effort to trigger systematic investigation into improving state-of-the-art Hybrid MT, using advanced machine-learning (ML) methodologies. Participants are requested to build Hybrid/System Combination systems by combining the output of several systems of different types, which is provided by the organizers.

Call for Papers

Second Call for Papers