Subject: cepstral coefficients - OPAC collection catalog

Title:: Determination of Input Parameters of the Neural Network Model, Intended for Phoneme Recognition of a Voice Signal in the Systems of Distance Learning
Authors:: Akhmetov, B.
Tereykovsky, I.
Doszhanova, A.
Tereykovskaya, L.
Links:: https://bibliotekanauki.pl/articles/226378.pdf
Publication date:: 2018
Publisher:: Polska Akademia Nauk. Czytelnia Czasopism PAN
Subjects:: neural networks
phonemes
recognition of a voice signal
system of distance learning
mel-cepstral coefficients
spectral analysis
Description:: The article addresses the introduction of voice signal recognition into distance learning systems. The results of the research indicate that neural network methods of phoneme recognition are promising. It is also shown that the main difficulties in creating a neural network model for phoneme recognition in a distance learning system stem from the uncertain duration of a phoneme-like element. For this reason, phoneme recognition cannot use the most effective type of neural network model, the multilayer perceptron, in which the number of input parameters is fixed. To mitigate this shortcoming, a procedure is developed that transforms the non-stationary digitized voice signal into a fixed number of mel-cepstral coefficients, which serve as the basis for calculating the input parameters of the neural network model. In contrast to known procedures, it allows linear scaling of phoneme-like elements. A number of computer experiments confirmed that the proposed coding procedure for input parameters provides acceptable accuracy of neural network phoneme recognition under near-natural conditions of a distance learning system. Prospects for further research on neural network phoneme recognition of a voice signal in distance learning systems are connected with increasing the admissible noise level. Adapting the proposed procedure to various natural languages, as well as to other applied tasks such as biometric authentication in the banking sector, is also of great interest.
Source:: International Journal of Electronics and Telecommunications; 2018, 64, 4; 425-432
2300-1933
Appears in:: International Journal of Electronics and Telecommunications
Content provider:: Biblioteka Nauki

Article
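
Illustrative note on the record above: the description mentions linear scaling of phoneme-like elements so that a segment of any duration yields a fixed number of inputs. The record does not give the authors' procedure, so the following is only a minimal sketch of one simple way to get that effect, assuming the segment is cut into a fixed number of frames whose width grows linearly with segment length. The function name `fixed_frame_bounds` and the sample counts are hypothetical.

```python
def fixed_frame_bounds(n_samples, n_frames):
    """Return (start, end) sample indices of n_frames contiguous frames
    that together cover a segment of n_samples samples.

    Assumption (not taken from the article): frame width scales linearly
    with segment length, so the frame count stays constant.
    """
    if n_frames <= 0 or n_samples < n_frames:
        raise ValueError("need at least one sample per frame")
    # Integer edges spaced as evenly as possible across the segment.
    edges = [(i * n_samples) // n_frames for i in range(n_frames + 1)]
    return list(zip(edges[:-1], edges[1:]))


# Two hypothetical phoneme-like segments of different duration.
short_segment = fixed_frame_bounds(400, 8)
long_segment = fixed_frame_bounds(1300, 8)
```

If a fixed number of mel-cepstral coefficients were then computed per frame, both segments would produce feature vectors of the same length, which is what a multilayer perceptron with a fixed input size requires.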

Title:: Effect of Time-domain Windowing on Isolated Speech Recognition System Performance
Authors:: Ananthakrishna, Thalengala
Anitha, H.
Girisha, T.
Links:: https://bibliotekanauki.pl/articles/2055228.pdf
Publication date:: 2022
Publisher:: Polska Akademia Nauk. Czytelnia Czasopism PAN
Subjects:: hidden Markov model
HMM
isolated speech recognition system
ISR
Kannada language
mono-phone model
Mel frequency cepstral coefficients
MFCC
Description:: Speech recognition systems extract textual data from the speech signal. Research in the speech recognition domain is challenging because of the large variability of the speech signal. A variety of signal processing and machine learning techniques have been explored to achieve better recognition accuracy. Speech is highly non-stationary, so analysis is carried out over a short time-domain window, or frame. In speech recognition, cepstral features (Mel frequency cepstral coefficients, MFCC) are commonly used and are extracted for each short time frame. The effectiveness of the features depends on the duration of the time window chosen. The present study investigates the optimal time-window duration for extracting cepstral features in the context of speech recognition. A speaker-independent speech recognition system for the Kannada language is considered. Speech utterances from a Kannada news corpus, recorded by different speakers, were used to create the speech database. The Hidden Markov Model Toolkit (HTK) was used to implement the speech recognition system. The MFCC, along with their first and second derivative coefficients, are used as feature vectors. The pronunciation dictionary required for the study was built manually for the mono-phone system. Experiments were carried out and results analyzed for different time-window lengths. An overlapping Hamming window was used in this study. The best average word recognition accuracy of 61.58% was obtained for a window length of 110 msec. This recognition accuracy is comparable with similar work found in the literature. The experiments show that the best word recognition performance can be achieved by tuning the window length to its optimum value.
Source:: International Journal of Electronics and Telecommunications; 2022, 68, 1; 161-166
2300-1933
Appears in:: International Journal of Electronics and Telecommunications
Content provider:: Biblioteka Nauki

Article
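
Illustrative note on the record above: the description refers to splitting speech into overlapping Hamming-windowed frames before cepstral features are extracted, with 110 msec reported as the best window length. The sketch below shows only that framing step in plain Python. The 16 kHz sampling rate, the 50% overlap, and the synthetic test tone are assumptions made for illustration; the record does not state them, and the study itself used HTK rather than code like this.

```python
import math


def hamming(n_samples):
    """Hamming window: w[k] = 0.54 - 0.46 * cos(2*pi*k / (N - 1))."""
    if n_samples == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * k / (n_samples - 1))
            for k in range(n_samples)]


def frame_signal(signal, sample_rate, window_ms, overlap=0.5):
    """Split a signal into overlapping frames and apply a Hamming window.

    overlap is the fraction of a frame shared with the next one
    (0.5 is an assumed value, not taken from the article).
    """
    win_len = int(round(sample_rate * window_ms / 1000.0))
    hop = max(1, int(round(win_len * (1.0 - overlap))))
    window = hamming(win_len)
    frames = []
    # Only full-length frames are kept; a trailing partial frame is dropped.
    for start in range(0, len(signal) - win_len + 1, hop):
        chunk = signal[start:start + win_len]
        frames.append([s * w for s, w in zip(chunk, window)])
    return frames


# Hypothetical input: 1 second of a 100 Hz tone sampled at 16 kHz.
sample_rate = 16000
signal = [math.sin(2 * math.pi * 100 * t / sample_rate)
          for t in range(sample_rate)]
frames = frame_signal(signal, sample_rate, window_ms=110)
```

Changing `window_ms` is the knob the study varies: a longer window gives fewer, longer frames per utterance, and each frame would then be passed to MFCC extraction.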
