Please use this identifier to cite or link to this item: https://idr.l2.nitk.ac.in/jspui/handle/123456789/15055
Full metadata record
DC Field | Value | Language
dc.contributor.author | Thomas T.
dc.contributor.author | Spoorthy
dc.contributor.author | Sobhana N.V.
dc.contributor.author | Koolagudi S.G.
dc.date.accessioned | 2021-05-05T10:16:18Z | -
dc.date.available | 2021-05-05T10:16:18Z | -
dc.date.issued | 2020
dc.identifier.citation | Proceedings of the 2020 3rd International Conference on Advances in Electronics, Computers and Communications (ICAECC 2020) | en_US
dc.identifier.uri | https://doi.org/10.1109/ICAECC50550.2020.9339501
dc.identifier.uri | http://idr.nitk.ac.in/jspui/handle/123456789/15055 | -
dc.description.abstract | Speaker recognition is the task of recognizing the person speaking from his/her speech. It has many applications, including transaction authentication, access control, voice dialing, and web services. Emotive speaker recognition is important because, in real life, human beings extensively express emotions during conversations, and emotions alter the human voice. A text-independent speaker recognition system for emotional environments is proposed in this work. The proposed system is trained using speech samples recorded in a neutral environment, and its evaluation is performed in an emotional environment. Here, excitation source features are used to represent the speaker-specific details contained in the speech signal. The excitation source signal is obtained after separating the segmental-level features from the voice samples. Because the excitation source signal is often treated as little more than noise, identifying a speaker in an emotive environment is a challenging task. Excitation features include the Linear Prediction (LP) residual, Glottal Closure Instant (GCI), LP residual phase, residual cepstrum, and Residual Mel-Frequency Cepstral Coefficients (R-MFCC). A decrease in performance is observed when the system is trained with neutral speech samples and tested with emotional speech samples. The emotions considered for emotional speaker identification are happy, sad, anger, fear, neutral, surprise, disgust, and sarcastic. For the classification of speakers, the algorithms used are the Gaussian Mixture Model (GMM), Support Vector Machine (SVM), K-Nearest Neighbor (KNN), Random Forest, and Naive Bayes. © 2020 IEEE. | en_US
dc.title | Speaker Recognition in Emotional Environment using Excitation Features | en_US
dc.type | Conference Paper | en_US
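
The excitation-feature pipeline summarised in the abstract above (inverse-filtering speech to obtain the LP residual, computing residual-based features such as R-MFCC, and scoring speakers with models such as GMMs) could be sketched roughly as below. This is an illustrative reconstruction, not the authors' implementation: the library choices (librosa, SciPy, scikit-learn), the LP order, the sampling rate, the number of GMM components, and the file-path/dictionary layout are all assumptions made for this example.

# Illustrative sketch (not the paper's code): LP-residual extraction,
# residual MFCC (R-MFCC) features, and per-speaker GMM scoring.
import numpy as np
import librosa
from scipy.signal import lfilter
from sklearn.mixture import GaussianMixture

def lp_residual(y, order=12):
    # Inverse-filter the signal with its own LP coefficients; the output
    # approximates the excitation source signal.
    a = librosa.lpc(y, order=order)      # a[0] == 1.0
    return lfilter(a, [1.0], y)

def r_mfcc(path, sr=16000, n_mfcc=13):
    # Residual MFCC: MFCCs computed on the LP residual instead of the raw signal.
    y, _ = librosa.load(path, sr=sr)
    return librosa.feature.mfcc(y=lp_residual(y), sr=sr, n_mfcc=n_mfcc).T

def train_speaker_models(train_files, n_components=16):
    # train_files: dict mapping speaker id -> list of neutral-speech file paths
    # (a hypothetical data layout chosen for this sketch).
    models = {}
    for speaker, files in train_files.items():
        feats = np.vstack([r_mfcc(f) for f in files])
        models[speaker] = GaussianMixture(n_components=n_components,
                                          covariance_type="diag").fit(feats)
    return models

def identify_speaker(models, test_file):
    # Pick the speaker whose GMM gives the highest average log-likelihood
    # for the (possibly emotional) test utterance.
    feats = r_mfcc(test_file)
    return max(models, key=lambda s: models[s].score(feats))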
Appears in Collections: 2. Conference Papers

Files in This Item:
There are no files associated with this item.


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.