Please use this identifier to cite or link to this item:
https://idr.l2.nitk.ac.in/jspui/handle/123456789/10210
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Koolagudi, S.G. | |
dc.contributor.author | Murthy, Y.V.S. | |
dc.contributor.author | Bhaskar, S.P. | |
dc.date.accessioned | 2020-03-31T08:18:44Z | - |
dc.date.available | 2020-03-31T08:18:44Z | - |
dc.date.issued | 2018 | |
dc.identifier.citation | International Journal of Speech Technology, 2018, Vol.21, 1, pp.167-183 | en_US |
dc.identifier.uri | http://idr.nitk.ac.in/jspui/handle/123456789/10210 | - |
dc.description.abstract | In this paper, the process of selecting a classifier based on the properties of dataset is designed since it is very difficult to experiment the data on n number of classifiers. As a case study speech emotion recognition is considered. Different combinations of spectral and prosodic features relevant to emotions are explored. The best subset of the chosen set of features is recommended for each of the classifiers based on the properties of chosen dataset. Various statistical tests have been used to estimate the properties of dataset. The nature of dataset gives an idea to select the relevant classifier. To make it more precise, three other clustering and classification techniques such as K-means clustering, vector quantization and artificial neural networks are used for experimentation and results are compared with the selected classifier. Prosodic features like pitch, intensity, jitter, shimmer, spectral features such as mel frequency cepstral coefficients (MFCCs) and formants are considered in this work. Statistical parameters of prosody such as minimum, maximum, mean (μ) and standard deviation (σ) are extracted from speech and combined with basic spectral (MFCCs) features to get better performance. Five basic emotions namely anger, fear, happiness, neutral and sadness are considered. For analysing the performance of different datasets on different classifiers, content and speaker independent emotional data is used, collected from Telugu movies. Mean opinion score of fifty users is collected to label the emotional data. To make it more accurate, one of the benchmark IIT-Kharagpur emotional database is used to generalize the conclusions. © 2018, Springer Science+Business Media, LLC, part of Springer Nature. | en_US |
dc.title | Choice of a classifier, based on properties of a dataset: case study-speech emotion recognition | en_US |
dc.type | Article | en_US |
Appears in Collections: | 1. Journal Articles |