Automatic speech recognition using audio visual cues

Please use this identifier to cite or link to this item: https://idr.l2.nitk.ac.in/jspui/handle/123456789/7432

Title:	Automatic speech recognition using audio visual cues
Authors:	Yashwanth, H.; Mahendrakar, H.; Sumam, David S.
Issue Date:	2004
Citation:	Proceedings of the IEEE INDICON 2004 - 1st India Annual Conference, 2004, pp. 166-169
Abstract:	Automatic speech recognition (ASR) systems have gained popularity because many multimedia applications require robust speech recognition algorithms. Using both audio and visual information in speaker-independent continuous speech recognition improves system performance over using audio information alone. Adding visual data to the available audio data yields a marked increase in recognition rates, because video information is less susceptible to ambient noise than audio information. This paper presents a robust Audio-Video Speech Recognition (AVSR) system that uses the Coupled Hidden Markov Model (CHMM) to fuse the audio and video modalities. The application records the input data and recognizes the isolated words in the input file over a wide range of Signal to Noise Ratio (SNR). The experimental results show an increase of about 10% in the recognition rate of AVSR compared to audio-only ASR, and 20% compared to video-only ASR, at an SNR of 5 dB. ©2004 IEEE.
URI:	https://idr.nitk.ac.in/jspui/handle/123456789/7432
Appears in Collections:	2. Conference Papers

Files in This Item:

File	Description	Size	Format
7432.pdf		776.69 kB	Adobe PDF