You are searching for the phrase "audio features" by the criterion: Subject


Displaying 1-5 of 5
Title:
Machine learning-based analysis of English lateral allophones
Authors:
Piotrowska, Magdalena
Korvel, Gražina
Kostek, Bożena
Ciszewski, Tomasz
Czyżewski, Andrzej
Links:
https://bibliotekanauki.pl/articles/908115.pdf
Publication date:
2019
Publisher:
Uniwersytet Zielonogórski. Oficyna Wydawnicza
Subjects:
allophone
audio features
artificial neural network
k-nearest neighbor
self organizing map
Description:
Automatic classification methods, such as artificial neural networks (ANNs), the k-nearest neighbor (kNN) algorithm and self-organizing maps (SOMs), are applied to allophone analysis based on recorded speech. A list of 650 words containing positionally and/or contextually conditioned allophones was created for that purpose. For each word, a group of 16 native and non-native speakers was audio-video recorded, from which the speech of seven native speakers and phonology experts was selected for analysis. For the purposes of the present study, a sub-list of 103 words containing the English alveolar lateral phoneme /l/ was compiled. The list includes ‘dark’ (velarized) allophonic realizations (which occur before a consonant or at the end of a word before silence) and 52 ‘clear’ allophonic realizations (which occur before a vowel), as well as voicing variants. The recorded signals were segmented into allophones and parametrized using a set of descriptors originating from the MPEG-7 standard, plus dedicated time-based parameters and modified MFCC features proposed by the authors. The ANN, kNN and SOM classifiers were employed to automatically detect the two types of allophones, and various feature sets were tested to achieve the best performance of the automatic methods. In the final experiment, a selected feature set was used for automatic evaluation of the pronunciation of dark /l/ by non-native speakers.
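As a rough illustration of the classification step, here is a minimal sketch, not the authors' pipeline. It assumes librosa and scikit-learn; the MFCC helper averages frames into one vector per pre-segmented allophone, and random stand-in vectors replace the recordings, which are unavailable here.

```python
# Minimal sketch (not the authors' pipeline): kNN over per-segment MFCC
# vectors for dark vs. clear /l/. librosa and scikit-learn are assumed.
import numpy as np
import librosa
from sklearn.neighbors import KNeighborsClassifier

def mfcc_vector(path, n_mfcc=13):
    """Average the MFCC frames of one pre-segmented allophone recording."""
    y, sr = librosa.load(path, sr=None)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).mean(axis=1)

# Stand-in data in place of real MFCC vectors; 0 = 'clear' /l/, 1 = 'dark' /l/.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(40, 13))
y_train = rng.integers(0, 2, size=40)
X_test = rng.normal(size=(10, 13))
y_test = rng.integers(0, 2, size=10)

knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
print("accuracy on stand-in data:", knn.score(X_test, y_test))
```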
Source:
International Journal of Applied Mathematics and Computer Science; 2019, 29, 2; 393-405
1641-876X
2083-8492
Appears in:
International Journal of Applied Mathematics and Computer Science
Content provider:
Biblioteka Nauki
Article
Title:
A Study of Music Features Derived from Audio Recordings Examples – a Quantitative Analysis
Authors:
Dorochowicz, A.
Kostek, B.
Links:
https://bibliotekanauki.pl/articles/178092.pdf
Publication date:
2018
Publisher:
Polska Akademia Nauk. Czytelnia Czasopism PAN
Subjects:
music genre
audio parametrization
music features
Description:
The paper presents a comparative study of music features derived from audio recordings: the same music pieces rendered in different music genres, excerpts performed by different musicians, and songs by a performer whose style evolved over time. Firstly, the origin and background of the division into music genres are briefly presented. Then, several objective parameters of an audio signal that have a straightforward interpretation in terms of perceptual relevance are recalled. Within the study, parameter values were extracted from the music excerpts, gathered, and compared to determine to what extent they are similar across songs of the same performer or across samples representing the same piece.
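As a loose illustration of this kind of parameter comparison (the paper's actual parameter set is not reproduced), the sketch below computes three common objective descriptors per excerpt with librosa and measures their Euclidean distance; the synthetic excerpts are stand-ins for real recordings.

```python
# Minimal sketch (the paper's exact parameter set is not reproduced here):
# a few common objective descriptors per excerpt, compared by distance.
import numpy as np
import librosa

def descriptors(y, sr):
    """Three widely used descriptors of one excerpt, averaged over frames."""
    return np.array([
        librosa.feature.spectral_centroid(y=y, sr=sr).mean(),  # brightness
        librosa.feature.zero_crossing_rate(y).mean(),          # noisiness
        librosa.feature.rms(y=y).mean(),                       # energy
    ])

# Stand-in excerpts; a real run would load recordings with librosa.load.
sr = 22050
t = np.linspace(0, 2.0, 2 * sr, endpoint=False)
excerpt_a = 0.5 * np.sin(2 * np.pi * 440 * t)           # plain tone
excerpt_b = 0.5 * np.sign(np.sin(2 * np.pi * 440 * t))  # square wave, brighter

d_a, d_b = descriptors(excerpt_a, sr), descriptors(excerpt_b, sr)
print("descriptor distance:", np.linalg.norm(d_a - d_b))
```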
Source:
Archives of Acoustics; 2018, 43, 3; 505-516
0137-5075
Appears in:
Archives of Acoustics
Content provider:
Biblioteka Nauki
Article
Title:
Change Point Determination in Audio Data Using Auditory Features
Authors:
Maka, T.
Links:
https://bibliotekanauki.pl/articles/226762.pdf
Publication date:
2015
Publisher:
Polska Akademia Nauk. Czytelnia Czasopism PAN
Subjects:
audio change point detection
auditory features
gammatone filter bank
Description:
The study investigates the properties of auditory-based features for the audio change point detection process. Two popular techniques were used in the analysis: a metric-based approach and the ∆BIC scheme. Since the efficiency of change point detection depends on the type and size of the feature space, two auditory-based feature sets (MFCC and GTEAD) were compared in both detection schemes. We also propose a new technique, based on multiscale analysis, for determining content changes in audio data. The two typical change point detection techniques with the two feature spaces were compared on a set of acoustic scenes containing a single change point each. As the results show, the accuracy of the detected positions depends on the feature type, the feature space dimensionality, the detection technique and the type of audio data. For the ∆BIC approach, better accuracy was obtained with the MFCC feature space in most cases; however, detection with this feature set yields a lower detection ratio than the GTEAD features. The proposed multiscale metric-based technique was then executed under the same criteria as for ∆BIC; in this case, the GTEAD feature space led to better accuracy. We show that the proposed multiscale change point detection scheme is competitive with the ∆BIC scheme using the MFCC feature space.
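The ∆BIC criterion has a standard single-Gaussian-versus-two-Gaussians formulation that can be sketched compactly. The example below is a minimal illustration, not the paper's multiscale scheme; the synthetic features and the penalty weight lam=1.0 are assumptions.

```python
# Minimal sketch of the ∆BIC criterion (not the paper's multiscale scheme):
# model a feature sequence as one Gaussian vs. two Gaussians split at a
# candidate frame, and accept the split where ∆BIC is maximal and positive.
import numpy as np

def logdet_cov(X):
    # A small diagonal load keeps short segments' covariances non-singular.
    c = np.cov(X, rowvar=False) + 1e-6 * np.eye(X.shape[1])
    return np.linalg.slogdet(c)[1]

def delta_bic(X, i, lam=1.0):
    n, d = X.shape
    penalty = 0.5 * (d + 0.5 * d * (d + 1)) * np.log(n)
    return 0.5 * (n * logdet_cov(X)
                  - i * logdet_cov(X[:i])
                  - (n - i) * logdet_cov(X[i:])) - lam * penalty

def best_change_point(X, margin=30):
    # The margin keeps both segments long enough for a stable covariance.
    scores = [delta_bic(X, i) for i in range(margin, len(X) - margin)]
    i = int(np.argmax(scores))
    return (margin + i) if scores[i] > 0 else None

# Synthetic 12-dim features with a mean shift at frame 100.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (100, 12)), rng.normal(2, 1, (80, 12))])
print("detected change point near:", best_change_point(X))
```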
Source:
International Journal of Electronics and Telecommunications; 2015, 61, 2; 185-190
2300-1933
Appears in:
International Journal of Electronics and Telecommunications
Content provider:
Biblioteka Nauki
Article
Title:
An Expert System for Automatic Classification of Sound Signals
Authors:
Tyburek, Krzysztof
Kotlarz, Piotr
Links:
https://bibliotekanauki.pl/articles/307799.pdf
Publication date:
2020
Publisher:
Instytut Łączności - Państwowy Instytut Badawczy
Subjects:
audio descriptors
bird species
fuzzy classification of audio signals
MPEG-7
spectral features of sound
Description:
In this paper, we present the results of research focusing on methods for the recognition and classification of audio signals. We consider these results to serve as a basis for the main module of a hybrid expert system currently under development. In our earlier studies, we investigated the effectiveness of three classifiers on reference data: a fuzzy classifier, a neural classifier and the WEKA system. In this project, particular emphasis was placed on fine-tuning the fuzzy classifier model and on identifying neural classifier applications, taking into account new neural networks that we had not previously studied in connection with sound classification.
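As a loose sketch of rule-style fuzzy classification over audio descriptors (the paper's actual model, rules and descriptors are not reproduced here), the example below scores two invented spectral descriptors against per-class triangular memberships and picks the best-matching class.

```python
# Loose sketch of rule-style fuzzy classification (illustrative only; not
# the paper's model): triangular memberships over two invented descriptors.
import numpy as np

def tri(x, a, b, c):
    """Triangular membership function peaking at b on support [a, c]."""
    return float(np.clip(min((x - a) / (b - a), (c - x) / (c - b)), 0.0, 1.0))

# Per-class (a, b, c) triangles for [spectral centroid kHz, bandwidth kHz];
# all numbers are invented for illustration, not taken from the paper.
RULES = {
    "species_A": [(2.0, 3.5, 5.0), (0.5, 1.0, 1.5)],
    "species_B": [(4.0, 6.0, 8.0), (1.0, 2.0, 3.0)],
}

def classify(features):
    # Rule firing strength = AND (minimum) over the per-feature memberships.
    scores = {name: min(tri(x, *t) for x, t in zip(features, tris))
              for name, tris in RULES.items()}
    return max(scores, key=scores.get), scores

print(classify([3.2, 1.1]))  # leans toward species_A on these made-up rules
```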
Source:
Journal of Telecommunications and Information Technology; 2020, 2; 86-90
1509-4553
1899-8852
Appears in:
Journal of Telecommunications and Information Technology
Content provider:
Biblioteka Nauki
Article
Title:
Audio-Visual Speech Processing System for Polish Applicable to Human-Computer Interaction
Authors:
Jadczyk, T.
Links:
https://bibliotekanauki.pl/articles/305828.pdf
Publication date:
2018
Publisher:
Akademia Górniczo-Hutnicza im. Stanisława Staszica w Krakowie. Wydawnictwo AGH
Subjects:
audio-visual speech recognition
visual features extraction
human-computer interaction
Description:
This paper describes an audio-visual speech recognition system for the Polish language and a set of performance tests under various acoustic conditions. We first present the overall structure of AVASR systems, with their three main areas: audio feature extraction, visual feature extraction and, subsequently, audio-visual speech integration. We present MFCC features for the audio stream with the standard HMM modeling technique, then describe appearance- and shape-based visual features. Subsequently, we present two feature integration techniques: feature concatenation and model fusion. We also discuss the results of a set of experiments conducted to select the best system setup for Polish under noisy audio conditions. The experiments simulate human-computer interaction in a computer control scenario, with voice commands issued in difficult audio environments. With an Active Appearance Model (AAM) and a multistream Hidden Markov Model (HMM), system accuracy improves, reducing the Word Error Rate by more than 30% compared to audio-only speech recognition when the Signal-to-Noise Ratio drops to 0 dB.
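The feature-concatenation fusion mentioned above can be sketched simply: upsample the slower visual stream to the audio frame rate and stack the two streams per frame. The frame rates and dimensions below are illustrative assumptions, not the paper's configuration.

```python
# Minimal sketch of feature concatenation (frame rates and dimensions are
# illustrative, not the paper's configuration): upsample the slower visual
# stream to the audio frame rate, then stack the streams per frame.
import numpy as np

def concatenate_av(audio_feats, video_feats):
    """audio_feats: (Ta, Da) array; video_feats: (Tv, Dv) array, Tv <= Ta."""
    ta, tv = len(audio_feats), len(video_feats)
    # Nearest-neighbour mapping of video frames onto the audio timeline.
    idx = np.minimum((np.arange(ta) * tv) // ta, tv - 1)
    return np.hstack([audio_feats, video_feats[idx]])

audio = np.random.randn(100, 13)  # 1 s of 13-dim MFCCs at 100 fps
video = np.random.randn(25, 20)   # 1 s of 20-dim AAM-style features at 25 fps
av = concatenate_av(audio, video)
print(av.shape)  # (100, 33): one fused observation vector per audio frame
```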
Source:
Computer Science; 2018, 19 (1); 41-63
1508-2806
2300-7036
Appears in:
Computer Science
Content provider:
Biblioteka Nauki
Article