Subject: MFCC - OPAC collections catalogue

Title: System rozpoznawania mowy z ograniczonym słownikiem
Speech recognition system with limited dictionary
Authors: Grabowski, D.
Kwiatkowska, M.
Świerczewski, Ł.
Links: https://bibliotekanauki.pl/articles/131953.pdf
Publication date: 2014
Publisher: Wrocławska Wyższa Szkoła Informatyki Stosowanej Horyzont
Subjects: speech recognition
ASR
MFCC
Description: The aim of this paper is to discuss and compare popular speech recognition algorithms on different systems. The collected information is presented in a relatively concise form, without detailed analysis of the mathematical proofs, which would in any case require reference to separate specialist sources. The paper discusses selected problems associated with ASR (Automatic Speech Recognition) and the prospects for solving them. Based on available solutions, an application module was developed that compares collected recordings for similarity of the speech signal and presents the results in tabular form. For demonstration purposes, the resulting library was used in a complete application that executes commands based on words spoken into a microphone. The results are intended not so much as final conclusions on speech recognition as pointers for further analysis and research. Despite advances in ASR research, there are still no algorithms with effectiveness exceeding 95%. One motivation for further work is, for example, the social exclusion of people who cannot use communication that relies on sight.
Source: Biuletyn Naukowy Wrocławskiej Wyższej Szkoły Informatyki Stosowanej. Informatyka; 2014, 4; 44-53
2082-9892
Appears in: Biuletyn Naukowy Wrocławskiej Wyższej Szkoły Informatyki Stosowanej. Informatyka
Content provider: Biblioteka Nauki

Title: Automatic Genre Classification Using Fractional Fourier Transform Based Mel Frequency Cepstral Coefficient and Timbral Features
Authors: Bhalke, D. G.
Rajesh, B.
Bormane, D. S.
Links: https://bibliotekanauki.pl/articles/177599.pdf
Publication date: 2017
Publisher: Polska Akademia Nauk. Czytelnia Czasopism PAN
Subjects: feature extraction
Timbral features
MFCC
Mel Frequency Cepstral Coefficient
FrFT
fractional Fourier transform
Fractional MFCC
Tamil Carnatic music
Description: This paper presents the Automatic Genre Classification of Indian Tamil Music and Western Music using Timbral and Fractional Fourier Transform (FrFT) based Mel Frequency Cepstral Coefficient (MFCC) features. The classifier model for the proposed system has been built using K-NN (K-Nearest Neighbours) and Support Vector Machine (SVM). In this work, the performance of various features extracted from music excerpts has been analysed to identify the appropriate feature descriptors for the two major genres of Indian Tamil music, namely Classical music (Carnatic-based devotional hymn compositions) and Folk music, and for the western genres of Rock and Classical music from the GTZAN dataset. The results for Tamil music show that the feature combination of Spectral Roll-off, Spectral Flux, Spectral Skewness and Spectral Kurtosis, combined with Fractional MFCC features, outperforms all other feature combinations, yielding a classification accuracy of 96.05%, compared to 84.21% with conventional MFCC. It has also been observed that the FrFT-based MFCC efficiently classifies the two western genres of Rock and Classical music from the GTZAN dataset, with a classification accuracy of 96.25% compared to 80% with MFCC.
Source: Archives of Acoustics; 2017, 42, 2; 213-222
0137-5075
Appears in: Archives of Acoustics
Content provider: Biblioteka Nauki

Title: Sterowanie głosem urządzeniami mechatronicznymi koncepcja stanowiska dydaktycznego
The voice control of mechatronic devices the concept of didactic station
Authors: Idziak, P.
Kmieć, A.
Links: https://bibliotekanauki.pl/articles/377555.pdf
Publication date: 2017
Publisher: Politechnika Poznańska. Wydawnictwo Politechniki Poznańskiej
Subjects: Raspberry Pi
MFCC algorithm
Jasper software
speech recognition
Description: The article presents algorithms for converting the human voice into digital form and, on that basis, recognizing spoken commands. It describes the MFCC algorithm and its implementation running on the Raspberry Pi platform. Open-source speech recognition programs that run in the Linux environment are described. The article presents the concept of a didactic station that executes simple voice commands using the Jasper program, considers its possible future use, and reports the results of verification tests.
Source: Poznan University of Technology Academic Journals. Electrical Engineering; 2017, 92; 375-386
1897-0737
Appears in: Poznan University of Technology Academic Journals. Electrical Engineering
Content provider: Biblioteka Nauki

Title: Navigation security module with real-time voice command recognition system
Authors: Yagimli, M.
Kursat-Tezer, H.
Links: https://bibliotekanauki.pl/articles/258920.pdf
Publication date: 2017
Publisher: Politechnika Gdańska. Wydział Inżynierii Mechanicznej i Okrętownictwa
Subjects: maritime navigation
LPC
MFCC
DTW
voice command recognition
Description: The real-time voice command recognition system used in this study aims to increase situational awareness, and therefore the safety of navigation, especially in the close manoeuvres of warships and the courses of commercial vessels in narrow waters. With the developed system, navigation, whose safety has become especially important in precision manoeuvres, can be controlled with voice command recognition-based software. The system was observed to work with 90.6% accuracy using Mel Frequency Cepstral Coefficients (MFCC) and Dynamic Time Warping (DTW) parameters and with 85.5% accuracy using Linear Predictive Coding (LPC) and DTW parameters.
Source: Polish Maritime Research; 2017, 2; 17-26
1233-2585
Appears in: Polish Maritime Research
Content provider: Biblioteka Nauki
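
Note: the record above pairs per-frame features (MFCC or LPC) with Dynamic Time Warping. As a minimal sketch of the general DTW template-matching idea, and not the authors' implementation, the Python below aligns two sequences of feature vectors and picks the closest stored template. The function names, the command labels and the one-dimensional toy vectors are all hypothetical; real input would be a sequence of MFCC frames.

```python
import math


def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of feature vectors.

    Each element of a and b is one frame (a list of floats); frames are
    compared with the Euclidean distance.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            # extend the cheapest of the three allowed predecessor alignments
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def recognise(query, templates):
    """Return the label of the template with the smallest DTW cost."""
    return min(templates, key=lambda label: dtw_distance(query, templates[label]))


# Toy one-dimensional "frames"; hypothetical command labels.
templates = {"port": [[0.0], [1.0]], "starboard": [[5.0], [6.0]]}
print(recognise([[0.0], [0.0], [1.0]], templates))
```

Because DTW allows frames to be repeated during alignment, a slower or faster utterance of the same command can still match its template with low cost.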

Title: Diagnostyka silnika synchronicznego oparta na analizie sygnałów akustycznych z zastosowaniem MFCC i GSDM
Diagnostics of synchronous motor based on analysis of acoustic signals with application of MFCC and GSDM
Authors: Głowacz, A.
Głowacz, W.
Głowacz, Z.
Links: https://bibliotekanauki.pl/articles/1373298.pdf
Publication date: 2010
Publisher: Sieć Badawcza Łukasiewicz - Instytut Napędów i Maszyn Elektrycznych Komel
Subjects: electrical machine
synchronous motor
diagnostics of electric motors
acoustic signal
GSDM
MFCC
Description: The paper presents a method for diagnosing imminent failure conditions of a synchronous motor. The method is based on a study of the acoustic signals generated by the motor. The sound recognition system is based on data processing algorithms such as MFCC and GSDM. Software to recognize the sounds of a synchronous motor was implemented. The studies were carried out for four imminent failure conditions of a synchronous motor. The results confirm that the system can be useful for detecting damage and protecting motors.
Source: Maszyny Elektryczne: zeszyty problemowe; 2010, 87; 185-190
0239-3646
2084-5618
Appears in: Maszyny Elektryczne: zeszyty problemowe
Content provider: Biblioteka Nauki

Title: CNN and LSTM for the classification of parkinsons disease based on the GTCC and MFCC
Authors: Boualoulou, Nouhaila
Drissi, Taoufiq Belhoussine
Nsiri, Benayad
Links: https://bibliotekanauki.pl/articles/30148250.pdf
Publication date: 2023
Publisher: Polskie Towarzystwo Promocji Wiedzy
Subjects: Parkinson's disease
voice signal
GTCC
MFCC
DWT
EMD
CNN and LSTM
Description: Parkinson's disease is a recognizable clinical syndrome with a variety of causes and clinical presentations; it represents a rapidly growing neurodegenerative disorder. Since about 90 percent of Parkinson's disease sufferers have some form of early speech impairment, recent studies on telediagnosis of Parkinson's disease have focused on the recognition of voice impairments from vowel phonations or the subjects' discourse. This paper presents a new approach to Parkinson's disease detection from speech sounds that is based on CNN and LSTM and uses two categories of characteristics. These are Mel Frequency Cepstral Coefficients (MFCC) and Gammatone Cepstral Coefficients (GTCC) obtained from noise-removed speech signals with comparative EMD-DWT and DWT-EMD analysis. The proposed model is divided into three stages. In the first step, noise is removed from the signals using the EMD-DWT and DWT-EMD methods. In the second step, the GTCC and MFCC are extracted from the enhanced audio signals. The classification process is carried out in the third step by feeding these features into the LSTM and CNN models, which are designed to define sequential information from the extracted features. The experiments are performed using the PC-GITA and Sakar datasets and the 10-fold cross-validation method. The highest classification accuracy for the Sakar dataset reached 100% for both EMD-DWT-GTCC-CNN and DWT-EMD-GTCC-CNN; for the PC-GITA dataset, the accuracy reached 100% for EMD-DWT-GTCC-CNN and 96.55% for DWT-EMD-GTCC-CNN. The results of this study indicate that the characteristics of GTCC are more appropriate and accurate for the assessment of PD than MFCC.
Source: Applied Computer Science; 2023, 19, 2; 1-24
1895-3735
2353-6977
Appears in: Applied Computer Science
Content provider: Biblioteka Nauki

Title: Rozpoznawanie wieku i płci na podstawie analizy głosu
Age and gender recognition based on analysis of voice
Authors: Gabryś, J.
Gil, G.
Kiszka, P.
Links: https://bibliotekanauki.pl/articles/261820.pdf
Publication date: 2015
Publisher: Politechnika Wrocławska. Wydział Podstawowych Problemów Techniki. Katedra Inżynierii Biomedycznej
Subjects: automatic speech recognition
age
gender
MFCC coefficients
classification of speaker
support vector machine (SVM)
Description: Methods for automatic recognition of age and gender make it possible to identify characteristics of a speaker solely from a recording of that person's speech. Human speech, beyond its verbal message, carries information about the person speaking. A speech recording allows the extraction of information such as the speaker's gender, age, and emotions. The paper presents an overview of methods for recognizing the age and gender of people from their speech. A combination of methods for determining MFCC (Mel-frequency Cepstral Coefficients) parameters and the pitch of the voice (f0), together with the SVM (Support Vector Machines) algorithm for classifying voice samples, was implemented and tested. The tests of the implemented solution show that the method is effective in the majority of test cases.
Source: Acta Bio-Optica et Informatica Medica. Inżynieria Biomedyczna; 2015, 21, 3; 165-169
1234-5563
Appears in: Acta Bio-Optica et Informatica Medica. Inżynieria Biomedyczna
Content provider: Biblioteka Nauki

Title: A novel Parkinsons disease detection algorithm combined EMD, BFCC, and SVM classifier
Authors: Boualoulou, Nouhaila
Mounia, Miyara
Nsiri, Benayad
Behoussine Drissi, Taoufiq
Links: https://bibliotekanauki.pl/articles/27313826.pdf
Publication date: 2023
Publisher: Polska Akademia Nauk. Polskie Towarzystwo Diagnostyki Technicznej PAN
Subjects: EMD
BFCC
MFCC
SVM
Parkinson's disease
artificial neural network
Description: Identifying and assessing Parkinson's disease in its early stages is critical to effectively monitoring the disease's progression. Methodologies based on machine-learning-enhanced speech analysis are gaining popularity as the potential of this field is revealed. Acoustic features, in particular, are used in a variety of machine learning algorithms and could serve as indicators of the general health of subjects' voices. In this research paper, a novel method is introduced for the automated detection of Parkinson's disease through speech signal analysis. A support vector machine (SVM) classifier and an Artificial Neural Network (ANN) are used to evaluate and classify the data based on two acoustic features: Bark Frequency Cepstral Coefficients (BFCC) and Mel Frequency Cepstral Coefficients (MFCC). These features are extracted from signals denoised using Empirical Mode Decomposition (EMD). The most relevant results, obtained for a dataset of 38 participants, are those of the BFCC coefficients, with an accuracy of up to 92.10%. These results confirm that the EMD-BFCC-SVM method can contribute to the detection of Parkinson's disease.
Source: Diagnostyka; 2023, 24, 4; art. no. 2023404
1641-6414
2449-5220
Appears in: Diagnostyka
Content provider: Biblioteka Nauki

Title: Genetic Algorithm for Combined Speaker and Speech Recognition using Deep Neural Networks
Authors: Kaur, G.
Srivastava, M.
Kumar, A.
Links: https://bibliotekanauki.pl/articles/958089.pdf
Publication date: 2018
Publisher: Instytut Łączności - Państwowy Instytut Badawczy
Subjects: deep neural networks
genetic algorithm
LPCC
MFCC
PLP
RASTA-PLP
speaker recognition
speech recognition
Description: Huge growth has been observed in the speech and speaker recognition field due to the many artificial intelligence algorithms being applied. Speech is used to convey messages via the language being spoken, emotions, gender and speaker identity. Many real applications in healthcare are based upon speech and speaker recognition, e.g. a voice-controlled wheelchair lets the user control the chair by voice. In this paper, we use a genetic algorithm (GA) for combined speaker and speech recognition, relying on optimized Mel Frequency Cepstral Coefficient (MFCC) speech features, and classification is performed using a Deep Neural Network (DNN). In the first phase, feature extraction using MFCC is executed. Then, feature optimization is performed using GA. In the second phase, training is conducted using DNN. Evaluation and validation of the proposed work model is done by setting up a real environment, and efficiency is calculated on the basis of such parameters as accuracy, precision rate, recall rate, sensitivity, and specificity. This paper also presents an evaluation of such feature extraction methods as linear predictive coding coefficients (LPCC), perceptual linear prediction (PLP), mel frequency cepstral coefficients (MFCC) and relative spectra filtering (RASTA), all of them used for combined speaker and speech recognition systems. A comparison of different methods based on existing techniques for both clean and noisy environments is made as well.
Source: Journal of Telecommunications and Information Technology; 2018, 2; 23-31
1509-4553
1899-8852
Appears in: Journal of Telecommunications and Information Technology
Content provider: Biblioteka Nauki

Title: Visualization of stages of determining cepstral factors in speech recognition systems
Authors: Proksa, R.
Links: https://bibliotekanauki.pl/articles/333103.pdf
Publication date: 2009
Publisher: Uniwersytet Śląski. Wydział Informatyki i Nauki o Materiałach. Instytut Informatyki. Zakład Systemów Komputerowych
Subjects: speech recognition
LPCC
MFCC
isolated word
speech signals
cepstral coefficients
Description: The article presents two methods of determining cepstral parameters commonly applied in digital signal processing, in particular in speech recognition systems. The solutions presented are part of a project aimed at developing applications that allow the Windows operating system to be controlled by voice with the use of MSAA (Microsoft Active Accessibility). The analysed voice signal is presented visually at each of the crucial stages of computing the cepstral coefficients.
Source: Journal of Medical Informatics & Technologies; 2009, 13; 121-128
1642-6037
Appears in: Journal of Medical Informatics & Technologies
Content provider: Biblioteka Nauki

Title: Voice pathology assessment using x-vectors approach
Authors: Kotarba, Katarzyna
Kotarba, Michał
Links: https://bibliotekanauki.pl/articles/2146638.pdf
Publication date: 2021
Publisher: Politechnika Poznańska. Instytut Mechaniki Stosowanej
Subjects: x-vectors
speaker embeddings
voice pathology
MFCC
GFCC
Description: Voice pathology assessment using sustained vowels has proven to be effective and reliable. However, only a few studies on the detection of pathological speech based on continuous speech are available. In this study we evaluate the usefulness of various regression models trained on continuous speech recordings from the Saarbruecken Voice Database in the detection of voice pathologies. The recordings were used to extract speaker embeddings called x-vectors, based on mel-frequency cepstral coefficients and gammatone frequency cepstral coefficients. Since the dataset used in this study is imbalanced, various over- and undersampling techniques were applied to the training set to ensure robustness of the models' decision boundaries. The models were trained on both imbalanced and resampled training sets using 5-fold cross-validation. The best results were obtained for a Multi Layer Perceptron trained on GFCC-based x-vectors, achieving an accuracy of 0.8184, an F1-score of 0.8212, and a ROC AUC score of 0.8810 on the testing set.
Source: Vibrations in Physical Systems; 2021, 32, 1; art. no. 2021108
0860-6897
Appears in: Vibrations in Physical Systems
Content provider: Biblioteka Nauki

Title: Heart Rate Detection and Classification from Speech Spectral Features Using Machine Learning
Authors: Usman, Mohammed
Zubair, Mohammed
Ahmad, Zeeshan
Zaidi, Monji
Ijyas, Thafasal
Parayangat, Muneer
Wajid, Mohd
Shiblee, Mohammad
Ali, Syed Jaffar
Links: https://bibliotekanauki.pl/articles/1953514.pdf
Publication date: 2021
Publisher: Polska Akademia Nauk. Czasopisma i Monografie PAN
Subjects: heart rate from speech
machine learning
MFCC
regression
classification
speech as a biomedical signal
Description: Measurement of vital signs of the human body such as heart rate, blood pressure, body temperature and respiratory rate is an important part of diagnosing medical conditions, and these are usually measured using medical equipment. In this paper, we propose to estimate an important vital sign, heart rate, from speech signals using machine learning algorithms. Existing literature, observation and experience suggest the existence of a correlation between speech characteristics and physiological, psychological as well as emotional conditions. In this work, we estimate the heart rate of individuals by applying machine-learning-based regression algorithms to Mel frequency cepstrum coefficients, which represent speech features in the spectral domain as well as the temporal variation of spectral features. The estimated heart rate is compared with the actual measurement made using a conventional medical device at the time of recording speech. We obtain estimation accuracy close to 94% between the estimated and actual measured heart rate values. Binary classification of heart rate as 'normal' or 'abnormal' is also achieved with 100% accuracy. A comparison of machine learning algorithms in terms of heart rate estimation and classification accuracy is also presented. Heart rate measurement using speech has applications in remote monitoring of patients and professional athletes, and can facilitate telemedicine.
Source: Archives of Acoustics; 2021, 46, 1; 41-53
0137-5075
Appears in: Archives of Acoustics
Content provider: Biblioteka Nauki

Title: Ekstrakcja parametrów z próbek danych biometrycznych
Extraction of parameters from biometric data samples
Authors: Danek, Paweł
Ćwirta, Krzysztof
Kopniak, Piotr
Links: https://bibliotekanauki.pl/articles/98390.pdf
Publication date: 2019
Publisher: Politechnika Lubelska. Instytut Informatyki
Subjects: biometrics
fingerprint
voice
authorization
normalization
gabor
descriptor
lpc
mfcc
Description: This article describes possible ways to extract parameters from biometric data samples, such as a fingerprint or a voice recording. The influence of particular processing approaches on the effectiveness of algorithms for processing and comparing biometric samples was verified. Experiments were performed in which a large number of samples were processed using selected algorithms. For fingerprints, image normalization, Gabor filtering and descriptor-based comparison were used. For voice authorization, the LPC and MFCC algorithms were analysed. For both kinds of authorization, a satisfactory accuracy of 60-80% was obtained.
Source: Journal of Computer Sciences Institute; 2019, 13; 323-331
2544-0764
Appears in: Journal of Computer Sciences Institute
Content provider: Biblioteka Nauki

Title: Discrimination between patients with CVDs and healthy people by voiceprint using the MFCC and pitch
Authors: Bourouhou, Abdelhamid
Jilbab, Abdelilah
Cherti, Mohammed
Bourouhou, Zaineb
Nacir, Chafik
Links: https://bibliotekanauki.pl/articles/2096170.pdf
Publication date: 2021
Publisher: Polska Akademia Nauk. Polskie Towarzystwo Diagnostyki Technicznej PAN
Subjects: cardiovascular diseases
speech analysis
voiceprint
MFCC
K-near-neighbor classifier
Description: Heart diseases cause many deaths around the world every year, and this death rate makes them the leading killer among diseases. Early diagnosis can help to reduce these deaths and save lives. To obtain a good diagnosis, people must undergo a series of clinical examinations and analyses, which makes the diagnostic procedure expensive and not accessible to everyone. Speech analysis is a strong tool that can address this task and offers a new way to discriminate between healthy people and people with cardiovascular diseases. Our previous paper addressed this task using a dysphonia measurement to differentiate between people with cardiovascular disease and healthy people, and reached 81.5% prediction accuracy. This time we changed the method to increase the accuracy, extracting the voiceprint using 13 Mel-Frequency Cepstral Coefficients and the pitch from voices provided by a database of 75 subjects (35 with cardiovascular diseases, 40 healthy); three recordings of sustained vowels (aaaaa…, ooooo… and iiiiiiii…) were collected from each subject. We used the k-near-neighbor classifier to train a model and to classify the test entities. We were able to outperform the previous results, reaching 95.55% prediction accuracy.
Source: Diagnostyka; 2021, 22, 4; 9-16
1641-6414
2449-5220
Appears in: Diagnostyka
Content provider: Biblioteka Nauki

Title: Hybridisation of Mel Frequency Cepstral Coefficient and Higher Order Spectral Features for Musical Instruments Classification
Authors: Bhalke, D. G.
Rama Rao, C. B.
Bormane, D.
Links: https://bibliotekanauki.pl/articles/176497.pdf
Publication date: 2016
Publisher: Polska Akademia Nauk. Czytelnia Czasopism PAN
Subjects: feature extraction
MFCC
HOS
bispectrum
bicoherence
non-linearity
non-Gaussianity
CPNN
zero crossing rate (ZCR)
Description: This paper presents the classification of musical instruments using Mel Frequency Cepstral Coefficients (MFCC) and Higher Order Spectral features. MFCC, cepstral, temporal, spectral, and timbral features have been widely used in the task of musical instrument classification. As a music sound signal is generated by non-linear dynamics, the non-linearity and non-Gaussianity of musical instruments are important features which have not been considered in the past. In this paper, a hybridisation of MFCC and Higher Order Spectral (HOS) based features is used in the task of musical instrument classification. HOS-based features are used to provide instrument-specific information such as the non-Gaussianity and non-linearity of the musical instruments. The extracted features are presented to a Counter Propagation Neural Network (CPNN) to identify the instruments and their family. For experimentation, isolated sounds of 19 musical instruments from the McGill University Master Sample (MUMS) sound database were used. The proposed features show a significant improvement in the classification accuracy of the system.
Source: Archives of Acoustics; 2016, 41, 3; 427-436
0137-5075
Appears in: Archives of Acoustics
Content provider: Biblioteka Nauki

Information

You are searching for the phrase "MFCC" by criterion: Subject
