Best of MobileHCI 2012

Best Papers

  • An investigation into the use of tactile instructions in snowboarding. Daniel Spelmezan
  • MemReflex: Adaptive Flashcards for Mobile Microlearning. Darren Edge, Stephen Fitchett, Michael Whitney, James Landay

Best Demo

  • Tilt Displays: Display Surfaces with Multi-Axis Tilt and Actuation. Jason Alexander, Andrés Lucero, Sriram Subramanian

Demo Honorable Mention

  • PoI Poi: Point-of-Interest Poi for Multimodal Tethered Whirling. Michael Cohen


Speech-based Interaction: Myths, Challenges, and Opportunities

Organizers

  • Cosmin Munteanu, National Research Council Canada
  • Gerald Penn, University of Toronto

Summary

HCI research has long been dedicated to facilitating information transfer between humans and machines in better and more natural ways. Unfortunately, humans' most natural form of communication, speech, is also one of the most difficult modalities for machines to understand, despite, and perhaps because of, its being the highest-bandwidth communication channel we possess. While significant research effort, from engineering to linguistics to the cognitive sciences, has been spent on improving machines' ability to understand speech, the MobileHCI community has been relatively timid in embracing this modality as a central focus of research. This can be attributed in part to the relatively discouraging levels of accuracy in understanding speech, in contrast with often-unfounded claims of success from industry, but also to the intrinsic difficulty of designing and especially evaluating speech and natural language interfaces. The goal of this course is to inform the MobileHCI community of the current state of speech and natural language research, to dispel some of the myths surrounding speech-based interaction, and to provide an opportunity for researchers and practitioners to learn more about how speech recognition works, what its limitations are, and how it could be used to enhance current interaction paradigms.

Overview

  • How does Automatic Speech Recognition (ASR) work, and why is it such a computationally difficult problem?
  • Where is ASR used in current commercial mobile applications?
  • What are the usability issues surrounding speech-based interaction systems, particularly in mobile and pervasive computing?
  • What are the challenges in enabling speech as a modality for mobile interaction?
  • What is the current state of the art in ASR research?
  • What are the differences between the commercial ASR systems' accuracy claims and the needs of mobile interactive applications?
  • What opportunities exist for HCI researchers in terms of enhancing systems' interactivity by enabling speech?

Bios

  • Cosmin Munteanu

Cosmin Munteanu is a Research Officer with the National Research Council Canada – Institute for Information Technology, where he leads several research projects exploring speech and natural language interaction for mobile devices and mixed reality systems. His area of expertise is at the intersection of Automatic Speech Recognition (ASR) and Human-Computer Interaction: he has extensively studied the human factors of using imperfect speech recognition systems, and has designed and evaluated systems that consider humans as an important part of the ASR process. He has authored numerous publications in HCI, ASR, and Computational Linguistics. http://nrc-ca.academia.edu/CosminMunteanu

  • Gerald Penn

Gerald Penn is an Associate Professor of Computer Science at the University of Toronto. His area of expertise is the study of human languages from both a mathematical and a computational perspective. Gerald is one of the leading scholars in Computational Linguistics, with significant contributions to the formal study of natural languages. His publications cover many areas, from Theoretical Linguistics to Mathematics to Automatic Speech Recognition, as well as Human-Computer Interaction. http://www.cs.toronto.edu/~gpenn/