Towards spike-based speech processing: a biologically plausible approach to simple acoustic classification

Title:: Towards spike-based speech processing: a biologically plausible approach to simple acoustic classification
Authors:: Uysal, I.
Sathyendra, H.
Harris, J. G.
Links:: https://bibliotekanauki.pl/articles/907947.pdf
Publication date:: 2008
Publisher:: Uniwersytet Zielonogórski. Oficyna Wydawnicza
Subjects:: spike coding
synchrony coding
phase locking
speech perception
psychoacoustics
speech recognition
Source:: International Journal of Applied Mathematics and Computer Science; 2008, 18, 2; 129-137
1641-876X
2083-8492
Language:: English
Rights:: All rights reserved. User freedom is limited to the statutory scope of permitted use
Content provider:: Biblioteka Nauki
Document type:: Article

The shortcomings of automatic speech recognition (ASR) applications are becoming more evident as they are used more widely in real life. The inherent non-stationarity in the timing of speech signals, together with dynamic changes in the environment, makes the ensuing analysis and recognition extremely difficult. Researchers often turn to biology for clues to building better engineered systems, and ASR is no exception: feature sets such as Mel-frequency cepstral coefficients employ filter banks that resemble cochlear filter banks in frequency distribution and bandwidth. In this paper, we delve deeper into the mechanics of the human auditory system to take this biological inspiration to the next level. The main goal of this research is to investigate the computational potential of spike trains produced at the early stages of the auditory system for a simple acoustic classification task. First, spike coding schemes ranging from temporal to rate coding are explored, together with spike-based encoders of varying complexity, such as rank order coding and the liquid state machine. Based on these findings, a biologically plausible system architecture is proposed for the recognition of phonetically simple acoustic signals, one that makes exclusive use of spikes for computation. Performance tests show that the proposed system outperforms a conventional ASR system on a noisy vowel data set.
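To make the contrast between rate coding and rank order coding concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical vector of per-channel filter-bank energies as input; the function names, the Poisson firing model, the 0.1 s window, the 200 Hz peak rate, and the decay factor of 0.8 are all illustrative assumptions. Rate coding keeps how often each channel fires, while rank order coding keeps only the order in which channels fire first.

```python
import numpy as np


def rate_code(energies, window=0.1, max_rate=200.0, rng=None):
    """Rate coding (assumed Poisson model): spike count per channel in one window.

    Channels with more energy fire at a higher mean rate.
    Assumes at least one energy value is positive.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    x = np.asarray(energies, dtype=float)
    rates = max_rate * x / x.max()      # strongest channel fires at max_rate (Hz)
    return rng.poisson(rates * window)  # expected count = rate * window length


def rank_order_code(energies):
    """Rank order coding: channel indices sorted by first-spike time.

    The strongest channel is assumed to fire first, so only the order
    of the channels is kept and the exact energies are discarded.
    """
    x = np.asarray(energies, dtype=float)
    return np.argsort(-x, kind="stable")


def rank_order_score(order, template_order, decay=0.8):
    """Score an observed firing order against a stored template order.

    Each channel contributes decay**(observed rank) * decay**(template rank),
    so the score is largest when the two orders agree.
    """
    n = len(order)
    rank_obs = np.empty(n, dtype=int)
    rank_tpl = np.empty(n, dtype=int)
    rank_obs[np.asarray(order)] = np.arange(n)           # rank of each channel, observed
    rank_tpl[np.asarray(template_order)] = np.arange(n)  # rank of each channel, template
    return float(np.sum(decay ** rank_obs * decay ** rank_tpl))


# Hypothetical energies from a four-channel filter bank for one frame.
energies = [0.2, 0.9, 0.5, 0.1]
order = rank_order_code(energies)   # channel 1 fires first, channel 3 last
counts = rate_code(energies)        # one spike count per channel
same = rank_order_score(order, order)
reversed_score = rank_order_score(order, order[::-1])
```

In this sketch a classifier could store one template order per vowel and pick the template with the highest score. The matching order always scores at least as high as any permutation of it, which is why `same` exceeds `reversed_score` here.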
