Phonetic Segmentation using a Wavelet-based Speech Cepstral Features and Sparse Representation Classifier

Title: Phonetic Segmentation using a Wavelet-based Speech Cepstral Features and Sparse Representation Classifier
Authors: Al-Hassani, Ihsan
Al-Dakkak, Oumayma
Assami, Abdlnaser
Links: https://bibliotekanauki.pl/articles/2058484.pdf
Publication date: 2021
Publisher: Instytut Łączności - Państwowy Instytut Badawczy
Subjects: Arabic speech corpus
ASR
F1-score
phonetic segmentation
sparse representation classifier
TTS
wavelet packet
Source: Journal of Telecommunications and Information Technology; 2021, 4; 12-22
1509-4553
1899-8852
Language: English
Rights: All rights reserved. User freedom is limited to the statutory scope of permitted use
Content provider: Biblioteka Nauki
Type: Article

Speech segmentation is the process of dividing a speech signal into distinct acoustic blocks, which may be words, syllables, or phonemes. Phonetic segmentation aims to find the exact boundaries of the phonemes that compose a given speech signal. This problem is crucial for many applications, e.g. automatic speech recognition (ASR). In this paper we propose a new model-based, text-independent phonetic segmentation method based on wavelet packet speech parametrization features and the sparse representation classifier (SRC). Experiments were performed on two datasets: an English one derived from the TIMIT corpus and an Arabic one derived from the Arabic speech corpus. The results show that the proposed wavelet packet decomposition features outperform MFCC features in the speech segmentation task in terms of both F1-score and R-measure on both datasets. The results also indicate that the SRC gives a higher hit rate than the well-known k-Nearest Neighbors (k-NN) classifier on the TIMIT dataset.
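
The evaluation measures named in the abstract (hit rate, F1-score, R-measure) can be illustrated with a short sketch. This is not the authors' evaluation code: the 20 ms tolerance window, the greedy one-to-one boundary matching, and the function names `count_hits` and `segmentation_scores` are assumptions made for illustration. The R-value is computed from hit rate and over-segmentation in the form commonly used in the segmentation literature.

```python
import math


def count_hits(reference, detected, tolerance=0.020):
    """Count reference boundaries matched by a detected boundary.

    Times are in seconds. Each detected boundary may be used only once;
    the closest unused boundary within the tolerance is chosen (greedy).
    """
    used = set()
    hits = 0
    for ref in reference:
        best = None
        for i, det in enumerate(detected):
            if i in used or abs(det - ref) > tolerance:
                continue
            if best is None or abs(det - ref) < abs(detected[best] - ref):
                best = i
        if best is not None:
            used.add(best)
            hits += 1
    return hits


def segmentation_scores(reference, detected, tolerance=0.020):
    """Return hit rate, precision, F1, over-segmentation and R-value."""
    if not reference:
        raise ValueError("reference boundary list must not be empty")
    hits = count_hits(reference, detected, tolerance)
    n_ref, n_det = len(reference), len(detected)

    hit_rate = hits / n_ref                      # recall
    precision = hits / n_det if n_det else 0.0
    f1 = (2 * precision * hit_rate / (precision + hit_rate)) if hits else 0.0

    # Over-segmentation: relative excess of detected boundaries.
    over_seg = n_det / n_ref - 1.0

    # R-value: distance from the ideal point (hit rate 1, over-segmentation 0),
    # combined with a penalty for boundary insertions.
    r1 = math.sqrt((1.0 - hit_rate) ** 2 + over_seg ** 2)
    r2 = (hit_rate - over_seg - 1.0) / math.sqrt(2.0)
    r_value = 1.0 - (abs(r1) + abs(r2)) / 2.0

    return {
        "hit_rate": hit_rate,
        "precision": precision,
        "f1": f1,
        "over_segmentation": over_seg,
        "r_value": r_value,
    }


if __name__ == "__main__":
    # Hypothetical boundary times in seconds, for illustration only.
    ref = [0.10, 0.25, 0.40, 0.60]
    det = [0.11, 0.26, 0.50, 0.61]
    print(segmentation_scores(ref, det))
```

Reporting the R-value alongside F1 is useful because a detector can raise its hit rate simply by inserting many boundaries; the over-segmentation term penalizes that behavior.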
