The shortcomings of automatic speech recognition (ASR) applications are becoming more evident as they are more widely used in real life. The inherent non-stationarity in the timing of speech signals, together with dynamic changes in the environment, makes the ensuing analysis and recognition extremely difficult. Researchers often turn to biology for clues to building better engineered systems, and ASR is no exception: feature sets such as Mel-frequency cepstral coefficients employ filter banks that resemble cochlear filter banks in frequency distribution and bandwidth. In this paper, we delve deeper into the mechanics of the human auditory system to take this biological inspiration to the next level. The main goal of this research is to investigate the computational potential of spike trains produced at the early stages of the auditory system for a simple acoustic classification task. First, spike coding schemes ranging from temporal to rate coding are explored, together with spike-based encoders of varying complexity, such as rank order coding and the liquid state machine. Based on these findings, a biologically plausible system architecture that uses spikes exclusively for computation is proposed for the recognition of phonetically simple acoustic signals. In tests on a noisy vowel data set, the proposed system outperforms a conventional ASR system.