Analysis of architectures based on deep learning methods to evaluate and recognize traits in speech signals
Detecting human emotions from speech signals is a field that has attracted the attention of the research community in recent years. Several situations where human integrity and security are at risk have been addressed; the analysis of speech in emergency calls or in call centers is a particularly interesting scenario. This project aimed to develop a methodology to classify different types of emotions, such as anger, anxiety, disgust, and desperation, in scenarios where the speech signal is contaminated with noise or coded by telephone channels.
To solve the task of surgical mask detection from audio recordings in the scope of Interspeech’s ComParE challenge, we introduce a phonetic recognizer which is able to differentiate between clear and mask samples. A deep recurrent phoneme recognition …
In recent years, there has been great progress in automatic speech recognition. The challenge now is to recognize not only the semantic content of speech but also its so-called 'paralinguistic' aspects, including the emotions, and the …
A new set of features based on non-linear dynamics measures obtained from the wavelet packet transform for the automatic recognition of “fear-type” emotions in speech is proposed. The experiments are carried out using three different databases with a …
The interest in emotion recognition from speech has increased in the last decade. Emotion recognition can improve the quality of services and the quality of life of people. One of the main problems in emotion recognition from speech is to find …
Automatic recognition of emotions in speech has attracted the attention of the research community in recent years. Some of its most relevant proposed applications are in call centers. In these scenarios the speech is distorted by compression …
Automatic emotion recognition from speech signals has attracted the attention of the research community in recent years. One of the main challenges is to find suitable features to represent the affective state of the speaker. In this paper, …
Speech signals are non-stationary processes with changes in time and frequency. The structure of a speech signal is also affected by the presence of several paralinguistic phenomena such as emotions, pathologies, cognitive impairments, among …
Detection of emotion in humans from speech signals is a recent research field. One of the scenarios where this field has been applied is in situations where human integrity and security are at risk. In this paper we propose a set of …