Multimodal Man-Machine Interaction. By: Rajesh P. Barnwal, School of Information Technology, Indian Institute of Technology, Kharagpur. (Title slide shows the disciplines contributing to HCI: Sociology, Language, Design, Engineering, Ethnography, Psychology, Human Factors, Computer Science.)
Outline: Introduction; What is Man-Machine Interaction; Interaction Modalities; Unimodal vs Multimodal HCI; Limitations of Unimodal HCI; Multimodal Interaction; Various HCI Modalities; Architecture of Multimodal HCI; Issues and Challenges; Application Areas; Case Studies; Future Scope; Conclusion; References
Introduction: Humans interact with other humans (Human-Human Interaction) and with machines (Man-Machine Interaction). Human-Human Interaction is natural by default; Human-Machine Interaction works better when it, too, is natural.
What is Man-Machine Interaction? A process of information transfer from user to machine and from machine to user. Man-Machine Interaction is also referred to as Human-Computer Interaction (HCI) or Computer-Human Interaction (CHI). As per ACM SIGCHI, HCI is “a discipline concerned with the design, evaluation and implementation of interactive computing systems for human use and with the study of major phenomena surrounding them”.
Human Interaction Modalities: Humans interact with the outside world using sensory organs for input (sight, touch, hearing, taste, smell) and effectors for output (limbs, eyes, fingers, head, vocal organs). Picture courtesy: Google
Computer Interaction Modalities: The computer interacts with the outside world through its input and output media. Picture courtesy: Google
HCI System Architecture: The architecture of an HCI system is characterized by the number of inputs and outputs in the system, the diversity of those inputs and outputs in terms of modality, and how these diverse inputs and outputs work together during interaction. Based on the configuration and design of the interface, HCI systems can be divided into Unimodal HCI Systems and Multimodal HCI Systems.
Unimodal vs Multimodal. Unimodal HCI System: a system based on a single channel of input, restricted to one mode of human-computer interaction; examples are text-based user interfaces, graphical user interfaces, pointer-based interfaces, touch-based interfaces, etc. Multimodal HCI System: a system based on a combination of multiple interaction modalities used simultaneously over different channels, motivated by the natural way humans interact (see the sketch below).
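To make the contrast concrete, here is a minimal Python sketch, assuming hypothetical event sources and a hypothetical handle() callback (none of this comes from the slides themselves): a unimodal loop listens on one channel only, while a multimodal loop merges several channels into a single time-ordered stream so later stages can interpret them jointly.

```python
import heapq
from dataclasses import dataclass

@dataclass(order=True)
class InputEvent:
    timestamp: float   # seconds since session start
    modality: str      # e.g. "touch", "speech", "gaze"
    payload: str       # recognizer output for that channel

def handle(ev: InputEvent) -> None:
    # Stand-in for the application's interpretation/dispatch step.
    print(f"{ev.timestamp:5.2f}s [{ev.modality}] {ev.payload}")

def unimodal_loop(touch_events):
    # Unimodal HCI: every interaction must be expressed through one channel.
    for ev in touch_events:
        handle(ev)

def multimodal_loop(*event_streams):
    # Multimodal HCI: merge several (time-sorted) channels into one
    # time-ordered stream so later stages can interpret them together.
    for ev in heapq.merge(*event_streams):
        handle(ev)

if __name__ == "__main__":
    touch  = [InputEvent(1.00, "touch",  "tap(120, 88)")]
    speech = [InputEvent(1.10, "speech", "open that folder")]
    gaze   = [InputEvent(0.95, "gaze",   "fixation(folder_icon)")]
    multimodal_loop(touch, speech, gaze)
```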
Limitations of Unimodal Interaction: not a natural way of human interaction; usually designed for the ‘average’ user; fails to cater to the needs of diverse categories of people; difficult to use for disabled, illiterate and untrained people; cannot provide a universal interface; more error prone.
Multimodal Interaction addresses the limitations of unimodal interaction. It is based on two views. Human-centered: multiple and simultaneous use of human input/output channels for perception and control. System-centered: multiple input/output modalities for better accuracy, naturalness, redundancy and efficiency.
Human-Computer Interaction Modalities. Sensor-based: mouse, keyboard, joystick; pen-based sensors; motion-tracking sensors; haptic/touch sensors; pressure sensors; smell/taste sensors. Visual-based: facial expression analysis; body movement tracking (large-scale); gesture recognition; gaze detection (eye movement tracking).
Human-Computer Interaction Modalities (continued). Audio-based: speech recognition; speaker recognition; auditory emotion analysis; human noise/sign detection (gasp, sigh, laugh, cry, etc.).
Architecture of a Multimodal User Interface (picture from Maybury and Wahlster, 1998): Inputs (text, speech, vision, motor, …) are processed by Media Analysis (language, gesture, gaze, …); Interaction Management (media fusion, discourse management, plan recognition and generation, user modeling, presentation design) mediates between analysis and Media Design (language, gesture, …), which drives the Outputs (graphics, animation, speech, sound, …). A minimal code sketch of this pipeline follows.
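The sketch below is only an illustration of how these stages could be wired together in Python; the analyze(), interaction_manager() and design_output() functions are hypothetical stubs for this presentation, not components described by Maybury and Wahlster.

```python
from typing import Dict, List

def analyze(inputs: Dict[str, str]) -> List[dict]:
    # Media Analysis: hypothetical per-modality analyzers; a real system
    # would call speech/gesture/gaze recognizers here.
    return [{"modality": m, "meaning": raw} for m, raw in inputs.items()]

def interaction_manager(interpretations: List[dict], state: dict) -> dict:
    # Interaction Management: naive media fusion plus a toy discourse history.
    fused = " + ".join(i["meaning"] for i in interpretations)
    state.setdefault("history", []).append(fused)
    return {"act": "respond", "content": f"request: {fused}"}

def design_output(response: dict) -> Dict[str, str]:
    # Media Design: choose how the planned response is rendered as output.
    return {"graphics": f"show({response['content']!r})",
            "speech":   f"say({response['content']!r})"}

if __name__ == "__main__":
    state: dict = {}
    turn = {"speech": "weather in Kharagpur", "gesture": "points at map"}
    print(design_output(interaction_manager(analyze(turn), state)))
```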
Need for a Multimodal HCI System: to enhance error avoidance and ease of error resolution; to accommodate a wider range of users, tasks and environmental situations; to cater to the needs of individuals with differences, such as permanent or temporary handicaps; to prevent overuse of any individual mode during extended computer usage; to permit flexible and improved use of input modes, including alternation and integrated use.
Issues and Challenges: perfecting the underlying technology; lack of a universal model for interface design; simultaneous tracking of multiple modes; unambiguous interpretation; multimodal information fusion; realizing a Natural User Interface; cost of hardware.
Application Areas: computing devices and applications for physically handicapped people; universal interface design for the elderly, children and novices; robotic interaction; the gaming industry; the medical industry; smart surveillance.
Bharati – A Multimodal Web Interface: an IIT Kharagpur initiative. The objective of the project is an Internet user interface for both language- and computer-illiterate people, combining text, speech and icons.
Bharati – Chitra: an iconic module for people unable to read or write in their mother tongue.
Bharati – Dhwani: a speech-based module for those who can speak, but not read or write, their mother language.
Bharati – Akshar: a text-based module for users unable to use English.
Multimodal Framework of Bharati (picture adapted from the NID (IIT Kharagpur) website). Input side: a Text Analyzer, a Speech Recognizer with speech-to-text converter, and a Visual Language Manager accept text, speech and WIMP input (WIMP: Window, Icon, Menu, Pointing device); a Hindi/Bengali-to-English language translator and a keyword extractor turn the strings of text into query strings for a search engine on the Internet. Output side: a content receiver takes the returned HTML page, and an English-to-Hindi/Bengali language translator, information visualization and a text-to-speech converter present the results back to the user.
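As a rough illustration only, the Python sketch below walks one spoken query through a Bharati-like round trip; every function in it (speech_to_text, translate, extract_keywords, web_search, text_to_speech) is a hypothetical stub, not the project's actual components or APIs.

```python
def speech_to_text(audio: bytes) -> str:
    # Stand-in recognizer for Dhwani-style spoken input.
    return "kharagpur ka mausam"

def translate(text: str, src: str, dst: str) -> str:
    # Stand-in for the Hindi/Bengali <-> English translators in the diagram.
    return f"[{dst}] {text}"

def extract_keywords(text: str) -> str:
    # Stand-in keyword extractor producing the query string.
    return " ".join(w for w in text.split() if w not in {"ka", "in", "the"})

def web_search(query: str) -> str:
    # Stand-in for the search engine returning an HTML page.
    return f"<html>results for '{query}'</html>"

def text_to_speech(text: str) -> bytes:
    # Stand-in synthesizer for the spoken reply.
    return text.encode("utf-8")

def handle_spoken_query(audio: bytes) -> bytes:
    native_text  = speech_to_text(audio)                 # speech -> strings of text
    english_text = translate(native_text, "hi", "en")    # Hindi/Bengali -> English
    query        = extract_keywords(english_text)        # -> query strings
    page         = web_search(query)                     # Internet -> HTML page
    native_reply = translate(page, "en", "hi")           # English -> Hindi/Bengali
    return text_to_speech(native_reply)                  # reply delivered as speech

if __name__ == "__main__":
    print(handle_spoken_query(b"...audio..."))
```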
ITR – A Multimodal Lab Project: an initiative of the Beckman Institute, University of Illinois. Multimodal interaction scenario (adapted from the ITR website).
ITR – Multimodal Framework (Picture adapted from ITR Website)
ITR Project – Demo Video (Video from ITR Website)
Future of Human-Computer Interaction: Future HCI will focus on realizing the Natural User Interface. Interaction will be possible through combinations of gesture, speech, facial expression, vision, gaze, touch and brain waves. As predicted in Microsoft’s HCI 2020 report, physical objects such as walls, floors and furniture will, in future, be able to interact with human beings in a distributed and very natural way. As per an article that appeared on 3rd September 2010 in the online magazine “Network World”, future computers could be operated using brainwaves in combination with other modalities such as gesture, and could be accessed anywhere, even just through movements of body parts in the air – Pranav Mistry, MIT Media Lab.
Conclusion: Multimodal HCI (MHCI) offers a great many advantages over unimodal interfaces. MHCI is capable of providing a natural user interface to human beings. MHCI can pave the way for universal design across diverse applications and people. MHCI has great potential in terms of application areas and thus needs extensive interdisciplinary research to address its issues and challenges.
References
ACM SIGCHI. Curricula for Human-Computer Interaction. http://old.sigchi.org/cdg/cdg2.html#2_1 [Online; accessed 28 September 2010].
Alan Dix, Janet Finlay, Gregory D. Abowd, Russell Beale (2004). Human-Computer Interaction. Pearson Education (Singapore) Pte. Ltd., ISBN 81-297-0409-9.
Being Human: Human-Computer Interaction in the Year 2020. http://research.microsoft.com/en-us/um/cambridge/projects/hci2020/downloads/Being Human_A3.pdf [Online; accessed 22 September 2010].
Bharati: Internet for All. http://www.nid.iitkgp.ernet.in/Bharati/ [Online; accessed 26 September 2010].
Fakhreddine Karray, Milad Alemzadeh, Jamil Abou Saleh and Mo Nours Arab. Human-Computer Interaction: Overview on State of the Art. International Journal on Smart Sensing and Intelligent Systems, Vol. 1, No. 1, March 2008, pp. 137-159.
Human–computer interaction - Wikipedia, the free encyclopedia. http://en.wikipedia.org/wiki/Human–computer_interaction [Online; accessed 1 September 2010].
References (continued)
Jiagen Jin, Wenfeng Li. A Survey of the Information Fusion in MMHCI. 2010 International Conference on Machine Vision and Human-Machine Interface, April 24-25, 2010, pp. 509-513.
Jon Brodkin. The Future of Human-Computer Interaction. Network World online magazine, September 03, 2010. http://www.networkworld.com/news/2010/090210-human-computer-interaction.html [Online; accessed 29 September 2010].
Mark T. Maybury and Wolfgang Wahlster (Eds.). Readings in Intelligent User Interfaces. Morgan Kaufmann Publishers, 1998 [referenced in Multimodal Human-Computer Interaction: A Constructive and Empirical Study, academic dissertation by Roope Raisamo, University of Tampere, 1999].
Multimodal Human Computer Interaction: Toward a Proactive Computer. http://itr.beckman.uiuc.edu/index.html [Online; accessed 22 September 2010].
S.K. Card, T.P. Moran and A. Newell (1983). The Psychology of Human-Computer Interaction. Lawrence Erlbaum Associates, ISBN 0-89859-859-1.
Thank You! Questions, if any?
Thank You!
Typical Information Flow in a Basic Multimodal Interaction (diagram): input devices (camera, glove, laser, touch, microphone) feed gesture recognition and gesture understanding on one path and speech recognition and natural language processing on another; both paths emit feature/frame structures that are combined by multimodal integration and context management under a dialogue manager, which handles application invocation and coordination (App1, App2, App3) and response planning, with output rendered through graphics, VR and TTS.
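As an illustration of what combining those feature/frame structures can look like, here is a minimal Python sketch of late fusion; the frame layout and the fuse() function are assumptions made for the example, not the actual components of the diagram.

```python
import copy
from typing import Optional

def fuse(speech_frame: dict, gesture_frame: dict,
         max_skew: float = 1.0) -> Optional[dict]:
    """Resolve a deictic slot ('that') in a speech frame using the target of a
    temporally aligned pointing gesture."""
    if abs(speech_frame["time"] - gesture_frame["time"]) > max_skew:
        return None                      # too far apart in time to co-refer
    fused = copy.deepcopy(speech_frame)  # keep the input frames untouched
    for slot, value in fused["slots"].items():
        if value == "that":              # deictic placeholder left by NLP
            fused["slots"][slot] = gesture_frame["target"]
    return fused

if __name__ == "__main__":
    speech  = {"time": 4.2, "intent": "move",
               "slots": {"object": "that", "destination": "recycle_bin"}}
    gesture = {"time": 4.3, "type": "point", "target": "report.pdf"}
    print(fuse(speech, gesture))
    # {'time': 4.2, 'intent': 'move',
    #  'slots': {'object': 'report.pdf', 'destination': 'recycle_bin'}}
```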
Human-Computer Interaction (Classical Model) (picture adapted from ACM SIGCHI)
Why Man-Machine Interaction? Fast and information-centered life; an increasing tendency toward self-service; getting work done with the help of machines; getting an easier and better life with the help of machines. All of this requires interaction that is as natural as possible.
