Show simple item record

dc.contributor.author: Uddin, Md Zia
dc.contributor.author: Nilsson, Erik Gøsta
dc.date.accessioned: 2022-02-04T12:54:21Z
dc.date.available: 2022-02-04T12:54:21Z
dc.date.created: 2020-09-01T17:30:16Z
dc.date.issued: 2020
dc.identifier.issn: 0952-1976
dc.identifier.uri: https://hdl.handle.net/11250/2977200
dc.description.abstract: Emotions play an important role in our daily communication, and recent years have seen substantial research on reliable emotion recognition systems based on various types of data sources, such as audio and video. Since audio carries no visual information about the human face, emotion analysis based on audio data alone is a very challenging task. In this work, a novel emotion recognition approach based on robust features and machine learning from audio speech is proposed. For a person-independent emotion recognition system, audio data is used as input, from which Mel-Frequency Cepstral Coefficients (MFCC) are calculated as features. The MFCC features then undergo discriminant analysis to minimize the inner-class scatter while maximizing the inter-class scatter. The robust discriminant features are then fed to Neural Structured Learning (NSL), an efficient and fast deep learning approach, for emotion training and recognition. In experiments on an emotion dataset of audio speech, the proposed combination of MFCC, discriminant analysis, and NSL achieved superior recognition rates compared to traditional approaches such as MFCC-DBN, MFCC-CNN, and MFCC-RNN. The system can be adopted in smart environments such as homes or clinics to provide affective healthcare. Since NSL is fast and easy to implement, it can be tried on edge devices with limited datasets collected from edge sensors; the decision-making step can thus be pushed towards where the data resides rather than conventionally processing data and making decisions far away from the data sources. The proposed approach can be applied in practical applications such as understanding people's emotions in their daily life and detecting stress from the voices of pilots or air traffic controllers in air traffic management systems.
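The discriminant-analysis step described in the abstract (minimizing inner-class scatter while maximizing inter-class scatter over MFCC features) can be sketched with classical Fisher LDA in NumPy. This is an illustrative sketch, not the paper's implementation: the synthetic data, the ridge term, and the function name are assumptions.

```python
import numpy as np

def lda_projection(X, y, n_components):
    """Fisher discriminant analysis: find a projection that minimizes
    within-class scatter while maximizing between-class scatter."""
    classes = np.unique(y)
    mean_total = X.mean(axis=0)
    d = X.shape[1]
    Sw = np.zeros((d, d))  # within-class (inner-class) scatter
    Sb = np.zeros((d, d))  # between-class (inter-class) scatter
    for c in classes:
        Xc = X[y == c]
        mean_c = Xc.mean(axis=0)
        Sw += (Xc - mean_c).T @ (Xc - mean_c)
        diff = (mean_c - mean_total).reshape(-1, 1)
        Sb += len(Xc) * (diff @ diff.T)
    # Solve the generalized eigenproblem Sb w = lambda Sw w;
    # a small ridge keeps Sw invertible if features are collinear.
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(Sw + 1e-6 * np.eye(d), Sb))
    order = np.argsort(eigvals.real)[::-1]
    return eigvecs.real[:, order[:n_components]]

# Synthetic stand-in for per-utterance 13-dimensional MFCC feature vectors,
# three hypothetical emotion classes with 20 utterances each:
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=i, size=(20, 13)) for i in range(3)])
y = np.repeat(np.arange(3), 20)

W = lda_projection(X, y, n_components=2)
Z = X @ W  # discriminant features, fed to the classifier (NSL in the paper)
print(Z.shape)  # (60, 2)
```

The projected features `Z` would then be used to train the NSL classifier; the NSL stage itself is omitted here since it depends on the TensorFlow Neural Structured Learning framework and graph-regularization settings not detailed in this record.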
dc.language.iso: eng
dc.publisher: Elsevier
dc.rights: Attribution 4.0 International
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/deed.no
dc.subject: Audio
dc.subject: Emotion
dc.subject: MFCC
dc.subject: LDA
dc.subject: NSL
dc.title: Emotion recognition using speech and neural structured learning to facilitate edge intelligence
dc.type: Peer reviewed
dc.type: Journal article
dc.description.version: publishedVersion
dc.rights.holder: © 2020 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
dc.source.pagenumber: 11
dc.source.volume: 94
dc.source.journal: Engineering Applications of Artificial Intelligence
dc.source.issue: 103775
dc.identifier.doi: 10.1016/j.engappai.2020.103775
dc.identifier.cristin: 1826584
cristin.ispublished: true
cristin.fulltext: original
cristin.qualitycode: 1


Associated file(s)


This item appears in the following collection(s)


Attribution 4.0 International
Except where otherwise noted, this item's license is described as Attribution 4.0 International