- Title:
- Performance evaluation of deep neural networks applied to speech recognition: RNN, LSTM and GRU
- Authors:
- Shewalkar, Apeksha
- Relations:
- https://bibliotekanauki.pl/articles/91735.pdf
- Publication date:
- 2019
- Publisher:
- Społeczna Akademia Nauk w Łodzi. Polskie Towarzystwo Sieci Neuronowych
- Topics:
-
spectrogram
connectionist temporal classification
TED-LIUM data set
- Description:
- Deep Neural Networks (DNNs) are neural networks with many hidden layers. DNNs are becoming popular in automatic speech recognition tasks, which combine a good acoustic model with a language model. Standard feedforward neural networks cannot handle speech data well, since they have no way to feed information from a later layer back to an earlier layer. Thus, Recurrent Neural Networks (RNNs) have been introduced to take temporal dependencies into account. However, the shortcoming of RNNs is that they cannot handle long-term dependencies because of the vanishing/exploding gradient problem. Therefore, Long Short-Term Memory (LSTM) networks were introduced; these are a special case of RNNs that take long-term dependencies in speech into account in addition to short-term dependencies. Similarly, Gated Recurrent Unit (GRU) networks are an improvement of LSTM networks that also take long-term dependencies into consideration. Thus, in this paper, we evaluate RNN, LSTM, and GRU to compare their performance on a reduced TED-LIUM speech data set. The results show that LSTM achieves the best word error rates; however, the GRU optimization is faster while achieving word error rates close to those of LSTM.
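To make the comparison concrete, the following is a minimal PyTorch sketch (an illustration only, not the authors' experimental setup; the feature dimension, hidden size, and dummy input shape are assumed values) that instantiates the three recurrent layer types on a spectrogram-like input and prints their parameter counts. The count differences reflect the gating structure (one transform for a vanilla RNN, three gates for GRU, four for LSTM), which is one reason GRU optimization tends to run faster than LSTM while behaving similarly.

import torch
import torch.nn as nn

INPUT_DIM = 26    # spectrogram/filterbank features per frame (assumed value)
HIDDEN_DIM = 128  # recurrent hidden-state size (assumed value)

# Dummy batch: 8 utterances, 100 frames each.
features = torch.randn(8, 100, INPUT_DIM)

for name, cls in [("RNN", nn.RNN), ("LSTM", nn.LSTM), ("GRU", nn.GRU)]:
    layer = cls(INPUT_DIM, HIDDEN_DIM, batch_first=True)
    outputs, _ = layer(features)  # outputs: (batch, time, HIDDEN_DIM)
    n_params = sum(p.numel() for p in layer.parameters())
    print(f"{name}: output {tuple(outputs.shape)}, {n_params:,} parameters")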
- Source:
-
Journal of Artificial Intelligence and Soft Computing Research; 2019, 9, 4; 235-245
ISSN: 2083-2567
e-ISSN: 2449-6499
- Appears in:
- Journal of Artificial Intelligence and Soft Computing Research
- Content provider:
- Biblioteka Nauki