Informacja

Drogi użytkowniku, aplikacja do prawidłowego działania wymaga obsługi JavaScript. Proszę włącz obsługę JavaScript w Twojej przeglądarce.

Wyszukujesz frazę "speech emotion recognition" wg kryterium: Temat


Wyświetlanie 1-14 z 14
Tytuł:
Speech emotion recognition under white noise
Autorzy:
Huang, C.
Chen, G.
Yu, H.
Bao, Y.
Zhao, L.
Powiązania:
https://bibliotekanauki.pl/articles/177301.pdf
Data publikacji:
2013
Wydawca:
Polska Akademia Nauk. Czytelnia Czasopism PAN
Tematy:
speech emotion recognition
speech enhancement
emotion model
Gaussian mixture model
Opis:
Speaker‘s emotional states are recognized from speech signal with Additive white Gaussian noise (AWGN). The influence of white noise on a typical emotion recogniztion system is studied. The emotion classifier is implemented with Gaussian mixture model (GMM). A Chinese speech emotion database is used for training and testing, which includes nine emotion classes (e.g. happiness, sadness, anger, surprise, fear, anxiety, hesitation, confidence and neutral state). Two speech enhancement algorithms are introduced for improved emotion classification. In the experiments, the Gaussian mixture model is trained on the clean speech data, while tested under AWGN with various signal to noise ratios (SNRs). The emotion class model and the dimension space model are both adopted for the evaluation of the emotion recognition system. Regarding the emotion class model, the nine emotion classes are classified. Considering the dimension space model, the arousal dimension and the valence dimension are classified into positive regions or negative regions. The experimental results show that the speech enhancement algorithms constantly improve the performance of our emotion recognition system under various SNRs, and the positive emotions are more likely to be miss-classified as negative emotions under white noise environment.
Źródło:
Archives of Acoustics; 2013, 38, 4; 457-463
0137-5075
Pojawia się w:
Archives of Acoustics
Dostawca treści:
Biblioteka Nauki
Artykuł
Tytuł:
Speech emotion recognition system for social robots
Autorzy:
Juszkiewicz, Ł.
Powiązania:
https://bibliotekanauki.pl/articles/384511.pdf
Data publikacji:
2013
Wydawca:
Sieć Badawcza Łukasiewicz - Przemysłowy Instytut Automatyki i Pomiarów
Tematy:
speech emotion recognition
prosody
machine learning
Emo-DB
intonation
social robot
Opis:
The paper presents a speech emotion recognition system for social robots. Emotions are recognised using global acoustic features of the speech. The system implements the speech parameters calculation, features extraction, features selection and classification. All these phases are described. The system was verified using the two emotional speech databases: Polish and German. Perspectives for using such system in the social robots are presented.
Źródło:
Journal of Automation Mobile Robotics and Intelligent Systems; 2013, 7, 4; 59-65
1897-8649
2080-2145
Pojawia się w:
Journal of Automation Mobile Robotics and Intelligent Systems
Dostawca treści:
Biblioteka Nauki
Artykuł
Tytuł:
Multi-objective heuristic feature selection for speech-based multilingual emotion recognition
Autorzy:
Brester, C.
Semenkin, E.
Sidorov, M.
Powiązania:
https://bibliotekanauki.pl/articles/91588.pdf
Data publikacji:
2016
Wydawca:
Społeczna Akademia Nauk w Łodzi. Polskie Towarzystwo Sieci Neuronowych
Tematy:
multi-objective optimization
feature selection
speech-based emotion recognition
Opis:
If conventional feature selection methods do not show sufficient effectiveness, alternative algorithmic schemes might be used. In this paper we propose an evolutionary feature selection technique based on the two-criterion optimization model. To diminish the drawbacks of genetic algorithms, which are applied as optimizers, we design a parallel multicriteria heuristic procedure based on an island model. The performance of the proposed approach was investigated on the Speech-based Emotion Recognition Problem, which reflects one of the most essential points in the sphere of human-machine communications. A number of multilingual corpora (German, English and Japanese) were involved in the experiments. According to the results obtained, a high level of emotion recognition was achieved (up to a 12.97% relative improvement compared with the best F-score value on the full set of attributes).
Źródło:
Journal of Artificial Intelligence and Soft Computing Research; 2016, 6, 4; 243-253
2083-2567
2449-6499
Pojawia się w:
Journal of Artificial Intelligence and Soft Computing Research
Dostawca treści:
Biblioteka Nauki
Artykuł
Tytuł:
Speech emotion recognition based on sparse representation
Autorzy:
Yan, J.
Wang, X.
Gu, W.
Ma, L.
Powiązania:
https://bibliotekanauki.pl/articles/177778.pdf
Data publikacji:
2013
Wydawca:
Polska Akademia Nauk. Czytelnia Czasopism PAN
Tematy:
speech emotion recognition
sparse partial least squares regression SPLSR
SPLSR
feature selection and dimensionality reduction
Opis:
Speech emotion recognition is deemed to be a meaningful and intractable issue among a number of do- mains comprising sentiment analysis, computer science, pedagogy, and so on. In this study, we investigate speech emotion recognition based on sparse partial least squares regression (SPLSR) approach in depth. We make use of the sparse partial least squares regression method to implement the feature selection and dimensionality reduction on the whole acquired speech emotion features. By the means of exploiting the SPLSR method, the component parts of those redundant and meaningless speech emotion features are lessened to zero while those serviceable and informative speech emotion features are maintained and selected to the following classification step. A number of tests on Berlin database reveal that the recogni- tion rate of the SPLSR method can reach up to 79.23% and is superior to other compared dimensionality reduction methods.
Źródło:
Archives of Acoustics; 2013, 38, 4; 465-470
0137-5075
Pojawia się w:
Archives of Acoustics
Dostawca treści:
Biblioteka Nauki
Artykuł
Tytuł:
Automatic speech based emotion recognition using paralinguistics features
Autorzy:
Hook, J.
Noroozi, F.
Toygar, O.
Anbarjafari, G.
Powiązania:
https://bibliotekanauki.pl/articles/200261.pdf
Data publikacji:
2019
Wydawca:
Polska Akademia Nauk. Czytelnia Czasopism PAN
Tematy:
random forests
speech emotion recognition
machine learning
support vector machines
lasy
rozpoznawanie emocji mowy
nauczanie maszynowe
Opis:
Affective computing studies and develops systems capable of detecting humans affects. The search for universal well-performing features for speech-based emotion recognition is ongoing. In this paper, a?small set of features with support vector machines as the classifier is evaluated on Surrey Audio-Visual Expressed Emotion database, Berlin Database of Emotional Speech, Polish Emotional Speech database and Serbian emotional speech database. It is shown that a?set of 87 features can offer results on-par with state-of-the-art, yielding 80.21, 88.6, 75.42 and 93.41% average emotion recognition rate, respectively. In addition, an experiment is conducted to explore the significance of gender in emotion recognition using random forests. Two models, trained on the first and second database, respectively, and four speakers were used to determine the effects. It is seen that the feature set used in this work performs well for both male and female speakers, yielding approximately 27% average emotion recognition in both models. In addition, the emotions for female speakers were recognized 18% of the time in the first model and 29% in the second. A?similar effect is seen with male speakers: the first model yields 36%, the second 28% a?verage emotion recognition rate. This illustrates the relationship between the constitution of training data and emotion recognition accuracy.
Źródło:
Bulletin of the Polish Academy of Sciences. Technical Sciences; 2019, 67, 3; 479-488
0239-7528
Pojawia się w:
Bulletin of the Polish Academy of Sciences. Technical Sciences
Dostawca treści:
Biblioteka Nauki
Artykuł
Tytuł:
Speech emotion recognition using wavelet packet reconstruction with attention-based deep recurrent neutral networks
Autorzy:
Meng, Hao
Yan, Tianhao
Wei, Hongwei
Ji, Xun
Powiązania:
https://bibliotekanauki.pl/articles/2173587.pdf
Data publikacji:
2021
Wydawca:
Polska Akademia Nauk. Czytelnia Czasopism PAN
Tematy:
speech emotion recognition
voice activity detection
wavelet packet reconstruction
feature extraction
LSTM networks
attention mechanism
rozpoznawanie emocji mowy
wykrywanie aktywności głosowej
rekonstrukcja pakietu falkowego
wyodrębnianie cech
mechanizm uwagi
sieć LSTM
Opis:
Speech emotion recognition (SER) is a complicated and challenging task in the human-computer interaction because it is difficult to find the best feature set to discriminate the emotional state entirely. We always used the FFT to handle the raw signal in the process of extracting the low-level description features, such as short-time energy, fundamental frequency, formant, MFCC (mel frequency cepstral coefficient) and so on. However, these features are built on the domain of frequency and ignore the information from temporal domain. In this paper, we propose a novel framework that utilizes multi-layers wavelet sequence set from wavelet packet reconstruction (WPR) and conventional feature set to constitute mixed feature set for achieving the emotional recognition with recurrent neural networks (RNN) based on the attention mechanism. In addition, the silent frames have a disadvantageous effect on SER, so we adopt voice activity detection of autocorrelation function to eliminate the emotional irrelevant frames. We show that the application of proposed algorithm significantly outperforms traditional features set in the prediction of spontaneous emotional states on the IEMOCAP corpus and EMODB database respectively, and we achieve better classification for both speaker-independent and speaker-dependent experiment. It is noteworthy that we acquire 62.52% and 77.57% accuracy results with speaker-independent (SI) performance, 66.90% and 82.26% accuracy results with speaker-dependent (SD) experiment in final.
Źródło:
Bulletin of the Polish Academy of Sciences. Technical Sciences; 2021, 69, 1; art. no. e136300
0239-7528
Pojawia się w:
Bulletin of the Polish Academy of Sciences. Technical Sciences
Dostawca treści:
Biblioteka Nauki
Artykuł
Tytuł:
Speech emotion recognition using wavelet packet reconstruction with attention-based deep recurrent neutral networks
Autorzy:
Meng, Hao
Yan, Tianhao
Wei, Hongwei
Ji, Xun
Powiązania:
https://bibliotekanauki.pl/articles/2090711.pdf
Data publikacji:
2021
Wydawca:
Polska Akademia Nauk. Czytelnia Czasopism PAN
Tematy:
speech emotion recognition
voice activity detection
wavelet packet reconstruction
feature extraction
LSTM networks
attention mechanism
rozpoznawanie emocji mowy
wykrywanie aktywności głosowej
rekonstrukcja pakietu falkowego
wyodrębnianie cech
mechanizm uwagi
sieć LSTM
Opis:
Speech emotion recognition (SER) is a complicated and challenging task in the human-computer interaction because it is difficult to find the best feature set to discriminate the emotional state entirely. We always used the FFT to handle the raw signal in the process of extracting the low-level description features, such as short-time energy, fundamental frequency, formant, MFCC (mel frequency cepstral coefficient) and so on. However, these features are built on the domain of frequency and ignore the information from temporal domain. In this paper, we propose a novel framework that utilizes multi-layers wavelet sequence set from wavelet packet reconstruction (WPR) and conventional feature set to constitute mixed feature set for achieving the emotional recognition with recurrent neural networks (RNN) based on the attention mechanism. In addition, the silent frames have a disadvantageous effect on SER, so we adopt voice activity detection of autocorrelation function to eliminate the emotional irrelevant frames. We show that the application of proposed algorithm significantly outperforms traditional features set in the prediction of spontaneous emotional states on the IEMOCAP corpus and EMODB database respectively, and we achieve better classification for both speaker-independent and speaker-dependent experiment. It is noteworthy that we acquire 62.52% and 77.57% accuracy results with speaker-independent (SI) performance, 66.90% and 82.26% accuracy results with speaker-dependent (SD) experiment in final.
Źródło:
Bulletin of the Polish Academy of Sciences. Technical Sciences; 2021, 69, 1; e136300, 1--12
0239-7528
Pojawia się w:
Bulletin of the Polish Academy of Sciences. Technical Sciences
Dostawca treści:
Biblioteka Nauki
Artykuł
Tytuł:
Analysis of Features and Classifiers in Emotion Recognition Systems : Case Study of Slavic Languages
Autorzy:
Nedeljković, Željko
Milošević, Milana
Ðurović, Željko
Powiązania:
https://bibliotekanauki.pl/articles/176678.pdf
Data publikacji:
2020
Wydawca:
Polska Akademia Nauk. Czasopisma i Monografie PAN
Tematy:
emotion recognition
speech processing
classification algorithms
Opis:
Today’s human-computer interaction systems have a broad variety of applications in which automatic human emotion recognition is of great interest. Literature contains many different, more or less successful forms of these systems. This work emerged as an attempt to clarify which speech features are the most informative, which classification structure is the most convenient for this type of tasks, and the degree to which the results are influenced by database size, quality and cultural characteristic of a language. The research is presented as the case study on Slavic languages.
Źródło:
Archives of Acoustics; 2020, 45, 1; 129-140
0137-5075
Pojawia się w:
Archives of Acoustics
Dostawca treści:
Biblioteka Nauki
Artykuł
Tytuł:
The Acoustic Cues of Fear : Investigation of Acoustic Parameters of Speech Containing Fear
Autorzy:
Özseven, T.
Powiązania:
https://bibliotekanauki.pl/articles/178133.pdf
Data publikacji:
2018
Wydawca:
Polska Akademia Nauk. Czytelnia Czasopism PAN
Tematy:
emotion recognition
acoustic analysis
fear
speech processing
Opis:
Speech emotion recognition is an important part of human-machine interaction studies. The acoustic analysis method is used for emotion recognition through speech. An emotion does not cause changes on all acoustic parameters. Rather, the acoustic parameters affected by emotion vary depending on the emotion type. In this context, the emotion-based variability of acoustic parameters is still a current field of study. The purpose of this study is to investigate the acoustic parameters that fear affects and the extent of their influence. For this purpose, various acoustic parameters were obtained from speech records containing fear and neutral emotions. The change according to the emotional states of these parameters was analyzed using statistical methods, and the parameters and the degree of influence that the fear emotion affected were determined. According to the results obtained, the majority of acoustic parameters that fear affects vary according to the used data. However, it has been demonstrated that formant frequencies, mel-frequency cepstral coefficients, and jitter parameters can define the fear emotion independent of the data used.
Źródło:
Archives of Acoustics; 2018, 43, 2; 245-251
0137-5075
Pojawia się w:
Archives of Acoustics
Dostawca treści:
Biblioteka Nauki
Artykuł
Tytuł:
Speech Emotion Recognition Based on Voice Fundamental Frequency
Autorzy:
Dimitrova-Grekow, Teodora
Klis, Aneta
Igras-Cybulska, Magdalena
Powiązania:
https://bibliotekanauki.pl/articles/177227.pdf
Data publikacji:
2019
Wydawca:
Polska Akademia Nauk. Czasopisma i Monografie PAN
Tematy:
emotion recognition
speech signal analysis
voice analysis
fundamental frequency
speech corpora
Opis:
The human voice is one of the basic means of communication, thanks to which one also can easily convey the emotional state. This paper presents experiments on emotion recognition in human speech based on the fundamental frequency. AGH Emotional Speech Corpus was used. This database consists of audio samples of seven emotions acted by 12 different speakers (6 female and 6 male). We explored phrases of all the emotions – all together and in various combinations. Fast Fourier Transformation and magnitude spectrum analysis were applied to extract the fundamental tone out of the speech audio samples. After extraction of several statistical features of the fundamental frequency, we studied if they carry information on the emotional state of the speaker applying different AI methods. Analysis of the outcome data was conducted with classifiers: K-Nearest Neighbours with local induction, Random Forest, Bagging, JRip, and Random Subspace Method from algorithms collection for data mining WEKA. The results prove that the fundamental frequency is a prospective choice for further experiments.
Źródło:
Archives of Acoustics; 2019, 44, 2; 277-286
0137-5075
Pojawia się w:
Archives of Acoustics
Dostawca treści:
Biblioteka Nauki
Artykuł
Tytuł:
Acoustic Methods in Identifying Symptoms of Emotional States
Autorzy:
Piątek, Zuzanna
Kłaczyński, Maciej
Powiązania:
https://bibliotekanauki.pl/articles/1953482.pdf
Data publikacji:
2021
Wydawca:
Polska Akademia Nauk. Czasopisma i Monografie PAN
Tematy:
emotion recognition
speech signal processing
clustering analysis
Sammon mapping
Opis:
The study investigates the use of speech signal to recognise speakers’ emotional states. The introduction includes the definition and categorization of emotions, including facial expressions, speech and physiological signals. For the purpose of this work, a proprietary resource of emotionally-marked speech recordings was created. The collected recordings come from the media, including live journalistic broadcasts, which show spontaneous emotional reactions to real-time stimuli. For the purpose of signal speech analysis, a specific script was written in Python. Its algorithm includes the parameterization of speech recordings and determination of features correlated with emotional content in speech. After the parametrization process, data clustering was performed to allows for the grouping of feature vectors for speakers into greater collections which imitate specific emotional states. Using the t-Student test for dependent samples, some descriptors were distinguished, which identified significant differences in the values of features between emotional states. Some potential applications for this research were proposed, as well as other development directions for future studies of the topic.
Źródło:
Archives of Acoustics; 2021, 46, 2; 259-269
0137-5075
Pojawia się w:
Archives of Acoustics
Dostawca treści:
Biblioteka Nauki
Artykuł
Tytuł:
Zastosowanie multimodalnej klasyfikacji w rozpoznawaniu stanów emocjonalnych na podstawie mowy spontanicznej
Spontaneus emotion redognition from speech signal using multimodal classification
Autorzy:
Kamińska, D.
Pelikant, A.
Powiązania:
https://bibliotekanauki.pl/articles/408014.pdf
Data publikacji:
2012
Wydawca:
Politechnika Lubelska. Wydawnictwo Politechniki Lubelskiej
Tematy:
rozpoznawanie emocji
sygnał mowy
algorytm kNN
emotion recognition
speech signal
k-NN algorithm
Opis:
Artykuł prezentuje zagadnienie związane z rozpoznawaniem stanów emocjonalnych na podstawie analizy sygnału mowy. Na potrzeby badań stworzona została polska baza mowy spontanicznej, zawierająca wypowiedzi kilkudziesięciu osób, w różnym wieku i różnej płci. Na podstawie analizy sygnału mowy stworzono przestrzeń cech. Klasyfikację stanowi multimodalny mechanizm rozpoznawania, oparty na algorytmie kNN. Średnia poprawność: rozpoznawania wynosi 83%.
The article presents the issue of emotion recognition from a speech signal. For this study, a Polish spontaneous database, containing speech from people of different age and gender, was created. Features were determined from the speech signal. The process of recognition was based on multimodal classification, related to kNN algorithm. The average of accuracy performance was up to 83%.
Źródło:
Informatyka, Automatyka, Pomiary w Gospodarce i Ochronie Środowiska; 2012, 3; 36-39
2083-0157
2391-6761
Pojawia się w:
Informatyka, Automatyka, Pomiary w Gospodarce i Ochronie Środowiska
Dostawca treści:
Biblioteka Nauki
Artykuł
Tytuł:
Rozumienie mowy jako moderator związku rozpoznawania emocji z nasileniem symptomów zaburzeń ze spektrum autyzmu (ASD)
Autorzy:
Krzysztofik, Karolina
Powiązania:
https://bibliotekanauki.pl/articles/2054372.pdf
Data publikacji:
2021
Wydawca:
Uniwersytet Marii Curie-Skłodowskiej. Wydawnictwo Uniwersytetu Marii Curie-Skłodowskiej
Tematy:
autism spectrum disorder
ASD
emotion recognition
speech comprehension
zaburzenia ze spektrum autyzmu
rozpoznawanie emocji
rozumienie mowy
Opis:
Współcześni badacze podkreślają konsekwencje trudności osób z zaburzeniami ze spektrum autyzmu (Autism Spectrum Disorder, ASD) w rozpoznawaniu emocji dla nasilenia symptomów tego zaburzenia. Jednocześnie wiele z osób z ASD potrafi rozpoznawać emocje innych osób dzięki strategiom kompensacyjnym opartym na relatywnie dobrze rozwiniętych kompetencjach poznawczych i językowych. Wydaje się zatem, że umiejętności językowe osób z ASD mogą moderować związek rozpoznawania emocji z nasileniem symptomów ASD. Celem prezentowanych badań było ustalenie, czy poziom rozumienia mowy osób z ASD moderuje związek rozpoznawania emocji z nasileniem symptomów ASD. Przebadano grupę 63 dzieci z ASD w wieku od 3 lat i 7 miesięcy do 9 lat i 3 miesiący, wykorzystując następujące narzędzia: Skalę Nasilenia Symptomów ASD, podskalę Rozpoznawanie Emocji ze Skali Mechanizmu Teorii Umysłu oraz podskalę Rozumienie Mowy ze skali Iloraz Inteligencji i Rozwoju dla Dzieci w Wieku Przedszkolnym (IDS-P). Uzyskane wyniki wskazują, że poziom rozumienia mowy moderuje związek poziomu rozwoju rozpoznawania emocji z nasileniem symptomów ASD w zakresie deficytów w komunikowaniu i interakcjach. Wyniki te znajdują swoje implikacje dla włączenia terapii rozumienia mowy w proces rehabilitacji osób z ASD, a także dla teoretycznej refleksji nad uwarunkowaniami nasilenia symptomów ASD.
Contemporary researchers underline consequences of difficulties in emotion recognition experienced by persons with autism spectrum disorder (ASD) for severity of symptoms of this disorder. Individuals with ASD, when trying to recognize the emotional states of others, often use compensatory strategies based on relatively well-developed cognitive and linguistic competences. Thus, the relationship between the recognition of emotions and the severity of ASD symptoms may be moderated by linguistic competencies. Own research was aimed at determining if the level of speech comprehension moderates the relationship between emotion recognition and ASD symptom severity. Participants were 63 children with ASD aged from 3 years and 7 months to 9 years and 3 months. The following tools were used: ASD Symptom Severity Scale, the Emotion Recognition subscale of the Theory of Mind Scale and the Speech Comprehension subscale from the Intelligence and Development Scales – Preschool (IDS-P). The results indicate that the level of speech comprehension moderates the relationship between the level of emotion recognition and ASD symptom severity in the range of deficits in communication and interaction. These results have implications for integrating speech comprehension therapy into the process of the rehabilitation of individuals with ASD, as well as for theoretical reflection concerning the determinants of ASD symptom severity.
Źródło:
Annales Universitatis Mariae Curie-Skłodowska, sectio J – Paedagogia-Psychologia; 2021, 34, 3; 199-219
0867-2040
Pojawia się w:
Annales Universitatis Mariae Curie-Skłodowska, sectio J – Paedagogia-Psychologia
Dostawca treści:
Biblioteka Nauki
Artykuł
Tytuł:
Comparison of speaker dependent and speaker independent emotion recognition
Autorzy:
Rybka, J.
Janicki, A.
Powiązania:
https://bibliotekanauki.pl/articles/330055.pdf
Data publikacji:
2013
Wydawca:
Uniwersytet Zielonogórski. Oficyna Wydawnicza
Tematy:
speech processing
emotion recognition
EMO-DB
support vector machines
artificial neural network
przetwarzanie mowy
rozpoznawanie emocji
maszyna wektorów wspierających
sztuczna sieć neuronowa
Opis:
This paper describes a study of emotion recognition based on speech analysis. The introduction to the theory contains a review of emotion inventories used in various studies of emotion recognition as well as the speech corpora applied, methods of speech parametrization, and the most commonly employed classification algorithms. In the current study the EMO-DB speech corpus and three selected classifiers, the k-Nearest Neighbor (k-NN), the Artificial Neural Network (ANN) and Support Vector Machines (SVMs), were used in experiments. SVMs turned out to provide the best classification accuracy of 75.44% in the speaker dependent mode, that is, when speech samples from the same speaker were included in the training corpus. Various speaker dependent and speaker independent configurations were analyzed and compared. Emotion recognition in speaker dependent conditions usually yielded higher accuracy results than a similar but speaker independent configuration. The improvement was especially well observed if the base recognition ratio of a given speaker was low. Happiness and anger, as well as boredom and neutrality, proved to be the pairs of emotions most often confused.
Źródło:
International Journal of Applied Mathematics and Computer Science; 2013, 23, 4; 797-808
1641-876X
2083-8492
Pojawia się w:
International Journal of Applied Mathematics and Computer Science
Dostawca treści:
Biblioteka Nauki
Artykuł
    Wyświetlanie 1-14 z 14

    Ta witryna wykorzystuje pliki cookies do przechowywania informacji na Twoim komputerze. Pliki cookies stosujemy w celu świadczenia usług na najwyższym poziomie, w tym w sposób dostosowany do indywidualnych potrzeb. Korzystanie z witryny bez zmiany ustawień dotyczących cookies oznacza, że będą one zamieszczane w Twoim komputerze. W każdym momencie możesz dokonać zmiany ustawień dotyczących cookies