- Title:
- Hybrid CNN-LiGRU acoustic modeling using SincNet raw waveform for Hindi ASR
- Authors:
-
Kumar, Ankit
Aggarwal, Rajesh Kumar
- Links:
- https://bibliotekanauki.pl/articles/1839250.pdf
- Publication date:
- 2020
- Publisher:
- Akademia Górniczo-Hutnicza im. Stanisława Staszica w Krakowie. Wydawnictwo AGH
- Subjects:
-
automatic speech recognition
CNN
CNN-LiGRU
DNN
- Description:
- Deep neural networks (DNNs) currently play a vital role in automatic speech recognition (ASR). The convolutional neural network (CNN) and the recurrent neural network (RNN) are advanced variants of the DNN. They are well suited to modeling the spatial and temporal properties of a speech signal, and both properties strongly affect recognition accuracy. A CNN operating directly on the raw speech signal has shown its superiority over systems built on precomputed acoustic features. Recently, a novel first convolution layer named SincNet was proposed to increase interpretability and system performance. In this work, we propose combining SincNet-CNN with a light gated recurrent unit (LiGRU) to reduce the computational load and increase interpretability while maintaining high accuracy. Different configurations of the hybrid model are extensively examined to achieve this goal. All of the experiments were conducted using the Kaldi and PyTorch-Kaldi toolkits with a Hindi speech dataset. The proposed model achieves a word error rate (WER) of 8.0%.
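The WER figure quoted in the description follows the standard definition: the number of word substitutions, deletions, and insertions needed to turn the recognized text into the reference, divided by the number of reference words. The sketch below illustrates that definition with a word-level edit distance. It is not the scoring code used in the paper; the function name `word_error_rate` and the sample sentences are hypothetical and chosen only for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference length.

    Hypothetical helper for illustration; words are split on whitespace.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Placeholder sentences, not data from the paper.
    print(word_error_rate("a b c d", "a x c"))
```

Under this definition, a WER of 8.0% corresponds to about 8 word errors per 100 reference words.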
- Source:
-
Computer Science; 2020, 21 (4); 397-417
1508-2806
2300-7036
- Appears in:
- Computer Science
- Content provider:
- Biblioteka Nauki