Machine Learning Approach for Emotion Recognition in Speech

Authors

  • Martin Gjoreski
  • Hristijan Gjoreski
  • Andrea Kulakov

Abstract

This paper presents a machine learning approach to automatic recognition of human emotions from speech. The approach consists of three steps. First, numerical features are extracted from the sound database using an audio feature extractor. Then, a feature selection method is used to select the most relevant features. Finally, a machine learning model is trained to recognize seven universal emotions: anger, fear, sadness, happiness, boredom, disgust and neutral. A thorough ML experimental analysis is performed for each step. The results showed that 300 (out of 1582) features, as ranked by the gain ratio, are sufficient for achieving 86% accuracy when evaluated with 10-fold cross-validation. SVM achieved the highest accuracy compared to KNN and Naive Bayes. We additionally compared the accuracy of the standard SVM (with default parameters) and of the SVM enhanced by Auto-WEKA (with optimized algorithm parameters) using the leave-one-speaker-out technique. The results showed that the SVM enhanced with Auto-WEKA achieved significantly better accuracy than the standard SVM, i.e., 77% versus 73%, respectively. Finally, the results achieved with 10-fold cross-validation are comparable to those achieved by human listeners, i.e., 86% accuracy in both cases. Moreover, low-energy emotions (boredom, sadness and disgust) are recognized better by our machine learning approach than by humans.
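
To make the three-step pipeline concrete, the sketch below shows a minimal version of it in Python with scikit-learn: keep the 300 top-ranked features, train an SVM, and evaluate it with both 10-fold cross-validation and leave-one-speaker-out. This is an illustrative assumption, not the authors' pipeline: the paper ranks features by gain ratio and tunes the SVM with Auto-WEKA, whereas here mutual information serves as a stand-in ranking, and the SVM parameters, placeholder data and speaker ids are invented for the example.

  # Minimal sketch of the three-step approach (illustrative only):
  # feature selection -> SVM -> two evaluation protocols.
  import numpy as np
  from sklearn.feature_selection import SelectKBest, mutual_info_classif
  from sklearn.model_selection import (LeaveOneGroupOut, StratifiedKFold,
                                       cross_val_score)
  from sklearn.pipeline import make_pipeline
  from sklearn.preprocessing import StandardScaler
  from sklearn.svm import SVC

  # Placeholder data: one row of acoustic features per utterance (the paper
  # extracts 1582), one of seven emotion labels, and a speaker id per row.
  rng = np.random.default_rng(0)
  X = rng.normal(size=(535, 1582))
  y = rng.integers(0, 7, size=535)
  speakers = rng.integers(0, 10, size=535)

  pipeline = make_pipeline(
      # Keep the 300 top-ranked features; mutual information stands in for
      # the gain ratio used in the paper.
      SelectKBest(mutual_info_classif, k=300),
      StandardScaler(),
      # Assumed RBF SVM; the paper optimizes the parameters with Auto-WEKA.
      SVC(kernel="rbf", C=1.0, gamma="scale"),
  )

  # Speaker-dependent evaluation: 10-fold cross-validation.
  cv10 = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
  acc_cv10 = cross_val_score(pipeline, X, y, cv=cv10).mean()

  # Speaker-independent evaluation: leave-one-speaker-out.
  acc_loso = cross_val_score(pipeline, X, y, groups=speakers,
                             cv=LeaveOneGroupOut()).mean()

  print(f"10-fold CV: {acc_cv10:.2f}, leave-one-speaker-out: {acc_loso:.2f}")

Placing the feature selector inside the pipeline means the ranking is recomputed on each training fold, so no information from the held-out fold or speaker leaks into the selected feature set.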

How to Cite

Gjoreski, M., Gjoreski, H., & Kulakov, A. (2014). Machine Learning Approach for Emotion Recognition in Speech. Informatica, 38(4). Retrieved from https://puffbird.ijs.si/index.php/informatica/article/view/719

Issue

Vol. 38 No. 4 (2014)

Section

Regular papers