Subject: automatic recognition - OPAC collection catalogue


Title: Behavioral features of the speech signal as part of improving the effectiveness of the automatic speaker recognition system
Authors: Mały, Dominik
Dobrowolski, Andrzej
Links: https://bibliotekanauki.pl/articles/27323689.pdf
Publication date: 2023
Publisher: Centrum Rzeczoznawstwa Budowlanego Sp. z o.o.
Subjects: automatic speaker recognition
automatic speaker recognition systems
physical features
behavioral features
speech signal
automatyczne rozpoznawanie mówiącego
sygnał mowy
system automatycznego rozpoznawania mówiącego
cecha behawioralna
cecha fizyczna
Description: The current reality is saturated with intelligent telecommunications solutions, and automatic speaker recognition systems are an integral part of many of them. They are widely used in sectors such as banking, telecommunications and forensics. The ease of automatic analysis and the efficient extraction of the distinctive characteristics of the human voice make it possible to identify, verify, and authorize the speaker under investigation. Currently, the vast majority of speaker recognition systems are based on distinctive features resulting from the structure of the speaker's vocal tract (laryngeal sound analysis), called physical features of the voice. Despite the high efficiency of such systems, at more than 95%, their further development is already very difficult because the possibilities of distinctive physical features have been exhausted. Further opportunities to increase the effectiveness of ASR systems based on physical features appear when the behavioral features of the speech signal are additionally considered in the system, which is the subject of this article.
Source: Inżynieria Bezpieczeństwa Obiektów Antropogenicznych; 2023, 4; 26-34
2450-1859
2450-8721
Appears in: Inżynieria Bezpieczeństwa Obiektów Antropogenicznych
Content provider: Biblioteka Nauki


Title: Rozpoznawanie i pomiar emocji w badaniach doświadczeń klienta
Recognition and Measurement of Emotions in Customer Experience Research
Authors: Budzanowska-Drzewiecka, Małgorzata
Lubowiecki-Vikuk, Adrian
Links: https://bibliotekanauki.pl/articles/27839578.pdf
Publication date: 2023
Publisher: Wydawnictwo Uniwersytetu Ekonomicznego we Wrocławiu
Subjects: doświadczenie klienta
rozpoznawanie emocji
automatyczna analiza ekspresji twarzy
FaceReader
pomiar emocji
customer experience
emotion recognition
automatic facial expression analysis
measuring emotions
Description: The study of customer experience requires the development of methodologies which measure such experience and account for its complexity. One important component of customer experience is emotion, the recognition and measurement of which is still a challenge for researchers. The purpose of this article is to discuss methods and techniques used to recognise and measure emotions in customer experience research. Particular attention is paid to the use of techniques derived from consumer neuroscience, including the dilemmas associated with the use of automatic analysis of facial expressions. The literature review made it possible to discuss the benefits and limitations of using automatic analysis of facial expressions in measuring customer experience. Despite its limitations, this technique can be an attractive complement to methods and techniques used to capture the emotional components of customer experience at different stages (before, during, and after purchase).
Source: Prace Naukowe Uniwersytetu Ekonomicznego we Wrocławiu; 2023, 67, 5; 67-77
1899-3192
Appears in: Prace Naukowe Uniwersytetu Ekonomicznego we Wrocławiu
Content provider: Biblioteka Nauki


Title: Using SVM Classifier and Micro-Doppler Signature for Automatic Recognition of Sonar Targets
Authors: Saffari, Abbas
Zahiri, Seye Hamid
Khozein Ghanad, Navid
Links: https://bibliotekanauki.pl/articles/31339922.pdf
Publication date: 2023
Publisher: Polska Akademia Nauk. Czasopisma i Monografie PAN
Subjects: sonar micro-Doppler
automatic recognition
SVM
RBF kernel
linear kernel
polynomial kernel
Description: In this paper, we propose using propeller modulation on the transmitted signal (called sonar micro-Doppler) and different support vector machine (SVM) kernels for automatic recognition of moving sonar targets. In general, the main challenge for researchers and practitioners working in the field of sonar target recognition is the lack of access to a valid and comprehensive database. Using a comprehensive mathematical model to simulate the signal received from the target can therefore address this challenge. The mathematical model used in this paper simulates the return signal of moving sonar targets well. The resulting signals have unique properties and are known as frequency signatures. To reduce the complexity of the model, a 128-point fast Fourier transform (FFT) is used. The SVM classifier, a widely used machine learning algorithm, was tested with three main kernel functions: the RBF kernel, the linear kernel, and the polynomial kernel. Recognition accuracy was evaluated for different signal-to-noise ratios (SNR: −20, −15, −10, −5, 0, 5, 10, 15, 20) and different viewing angles (10, 20, 30, 40, 50, 60, 70, 80). For a fairer comparison, a multilayer perceptron neural network with two training methods, back-propagation (MLP-BP) and gray wolf optimization (MLP-GWO), was also used. However, considering the number of classes, its performance was not satisfactory. The results showed that the RBF kernel is more capable for high SNRs (SNR = 20, viewing angle = 10), with an accuracy of 98.528%.
Source: Archives of Acoustics; 2023, 48, 1; 49-61
0137-5075
Appears in: Archives of Acoustics
Content provider: Biblioteka Nauki


Title: Preliminary Evaluation of Convolutional Neural Network Acoustic Model for Iban Language Using NVIDIA NeMo
Authors: Michael, Steve Olsen
Juan, Sarah Samson
Mit, Edwin
Links: https://bibliotekanauki.pl/articles/2058507.pdf
Publication date: 2022
Publisher: Instytut Łączności - Państwowy Instytut Badawczy
Subjects: acoustic modeling
automatic speech recognition
convolutional neural network
CNN
under-resourced language
NVIDIA NeMo
Description: For the past few years, artificial neural networks (ANNs) have been one of the most common solutions relied upon while developing automated speech recognition (ASR) acoustic models. There are several variants of ANNs, such as deep neural networks (DNNs), recurrent neural networks (RNNs), and convolutional neural networks (CNNs). CNN models are widely used to improve image processing performance. In recent years, CNNs have also been utilized in ASR techniques, and this paper investigates the preliminary results of an end-to-end CNN-based ASR using NVIDIA NeMo on a corpus of Iban, an under-resourced language. Studies have shown that CNNs have also managed to produce excellent word error rates (WER) for the acoustic model in ASR for speech data. In contrast, results and studies concerned with under-resourced languages remain unsatisfactory. Hence, by using NVIDIA NeMo, a new ASR engine developed by NVIDIA, the viability and the potential of this alternative approach are evaluated in this paper. Two experiments were conducted: the amount of resources used in training our ASR was varied, as was an internal parameter of the engine, namely the number of epochs. The results of those experiments are then analyzed and compared with the results reported in existing papers.
Source: Journal of Telecommunications and Information Technology; 2022, 1; 43-53
1509-4553
1899-8852
Appears in: Journal of Telecommunications and Information Technology
Content provider: Biblioteka Nauki


Title: Recognition of speaker’s age group and gender for a large database of telephone-recorded voices
Authors: Staroniewicz, Piotr
Links: https://bibliotekanauki.pl/articles/2202432.pdf
Publication date: 2022
Publisher: Politechnika Poznańska. Instytut Mechaniki Stosowanej
Subjects: speech processing
automatic age recognition
przetwarzanie mowy
automatyczne rozpoznawanie wieku
Description: The paper presents the results of the automatic recognition of age group and gender of speakers performed for the large SpeechDAT(E) acoustic database for the Polish language, containing recordings of 1000 speakers (486 males/514 females) aged 12 to 73, recorded in telephone conditions. Three age groups were recognised for each gender. Mel Frequency Cepstral Coefficients (MFCC) were used to describe the recognised signals parametrically. Among the classification methods tested in this study, the best results were obtained for the SVM (Support Vector Machines) method.
Source: Vibrations in Physical Systems; 2022, 33, 2; art. no. 2022203
0860-6897
Appears in: Vibrations in Physical Systems
Content provider: Biblioteka Nauki


Title: Application of Intelligent Transportation Systems in Analyses of Human Spatial Mobility in Cities
Zastosowanie inteligentnych systemów transportowych w analizach mobilności przestrzennej ludzi w miastach
Authors: Borowska-Stefańska, Marta
Kowalski, Michał
Kurzyk, Paulina
Mikušová, Miroslava
Wiśniewski, Szymon
Links: https://bibliotekanauki.pl/articles/2089596.pdf
Publication date: 2021
Publisher: Uniwersytet Gdański. Komisja Geografii Komunikacji Polskiego Towarzystwa Geograficznego
Subjects: ITS
intelligent transportation system
spatial mobility
transport geography
induction loop
ANPR
automatic number plate recognition
Inteligentny system transportowy
mobilność przestrzenna
geografia transportu
Description: This article presents the results of studies on the applicability of data from Intelligent Transportation Systems (ITS) for geographical studies of the spatial mobility of inhabitants within a big city. The article focuses on two types of sub-systems, induction loops and automatic number-plate recognition (ANPR), and includes examples of analyses based on the resulting data, which can serve as a basis for mobility studies. The capabilities of ITS data are illustrated using the example of Lodz, a large city in central Poland. The research shows that ITS offer enormous potential for providing data for spatial mobility studies. To fully exploit this potential, however, it is imperative to expand the research procedure by including, for instance, the results of qualitative research. The interpretation of results obtained from ITS data should also be performed with an awareness of the numerous significant preliminary and simplifying assumptions.
Source: Prace Komisji Geografii Komunikacji PTG; 2021, 24(1); 7-30
1426-5915
2543-859X
Appears in: Prace Komisji Geografii Komunikacji PTG
Content provider: Biblioteka Nauki


Title: Detection of fillers in the speech by people who stutter
Authors: Suszyński, Waldemar
Charytanowicz, Małgorzata
Rosa, Wojciech
Koczan, Leopold
Stęgierski, Rafał
Links: https://bibliotekanauki.pl/articles/1956029.pdf
Publication date: 2021
Publisher: Polskie Towarzystwo Promocji Wiedzy
Subjects: stuttering
fillers disfluency
automatic recognition
fillers detection
jąkanie
dysfluencja
automatyczne rozpoznawanie
wykrywanie
Description: Stuttering is a speech impediment and a very complex disorder. It is difficult to diagnose and treat, and its origin remains unknown despite the large number of studies in this field. Stuttering can take many forms, varies from person to person, and can change under the influence of external factors. Diagnosing and treating speech disorders such as stuttering requires of a speech therapist not only good professional preparation but also experience gained through research and practice in the field. The use of acoustic methods in combination with elements of artificial intelligence makes it possible to objectively assess the disorder, as well as to monitor the effects of treatment. The main aim of the study was to present an algorithm for automatic recognition of filler disfluencies in the utterances of people who stutter. Recognition is based on their parameterized features in the amplitude-frequency space. The work also provides example results demonstrating the feasibility and effectiveness of the approach. To verify and optimize the procedures, utterances of seven stutterers, 2 to 4 minutes in duration, were selected. Over 70% efficiency and predictability of automatic detection of these disfluencies was achieved. Using an automatic method in conjunction with therapy for a person who stutters offers the opportunity to objectively assess the disorder and to evaluate the progress of therapy.
Source: Applied Computer Science; 2021, 17, 4; 45-54
1895-3735
Appears in: Applied Computer Science
Content provider: Biblioteka Nauki


Title: CAI – Narzędzia informatyczne wspierające tłumaczy konsekutywnych. Stan badań oraz perspektywy rozwoju
CAI Tools for Consecutive Interpreters. Present Solutions and Development Perspectives
Authors: Sitkowski, Krzysztof
Links: https://bibliotekanauki.pl/articles/1193010.pdf
Publication date: 2020
Publisher: Krakowskie Towarzystwo TERTIUM
Subjects: Narzędzie CAI
oprogramowanie do rozpoznawania mowy (ASR)
tłumaczenie konsekutywne
kompresja tłumaczeniowa
narzędzie kompresujące
CAI tools
automatic speech recognition (ASR)
consecutive interpreting
compression
compression tools
Description: The aim of the article is to present CAI tools, i.e. Computer-Assisted Interpreting tools supporting the interpreter during consecutive interpreting. The first part presents the definition of CAI, together with a description and examples of the first, second and third generation of this tool. Based on the analysis of the subject, it was found that the existing solutions, either in commercial or test form, are limited mainly to terminology management. The author then addresses the possibility of using translation memories in consecutive interpreting. The following part describes the two most important components of CAI, namely speech recognition software (ASR) and a compression tool. The next part presents possible development issues of the tool and describes compression in consecutive interpreting. In the last part of the article, the author describes a comprehensive CAI tool, its components, and a scenario for its use during consecutive interpreting.
Source: Półrocznik Językoznawczy Tertium; 2020, 5, 2; 166-182
2543-7844
Appears in: Półrocznik Językoznawczy Tertium
Content provider: Biblioteka Nauki


Title: Hybrid CNN-LiGRU acoustic modeling using SincNet raw waveform for Hindi ASR
Authors: Kumar, Ankit
Aggarwal, Rajesh Kumar
Links: https://bibliotekanauki.pl/articles/1839250.pdf
Publication date: 2020
Publisher: Akademia Górniczo-Hutnicza im. Stanisława Staszica w Krakowie. Wydawnictwo AGH
Subjects: automatic speech recognition
CNN
CNN-LiGRU
DNN
Description: Deep neural networks (DNN) currently play a vital role in automatic speech recognition (ASR). The convolutional neural network (CNN) and the recurrent neural network (RNN) are advanced versions of DNN. They are well suited to handling the spatial and temporal properties of a speech signal, and both properties have a strong impact on accuracy. Applied to the raw speech signal, a CNN shows its superiority over precomputed acoustic features. Recently, a novel first convolution layer named SincNet was proposed to increase interpretability and system performance. In this work, we propose combining SincNet-CNN with a light-gated recurrent unit (LiGRU) to help reduce the computational load and increase interpretability while maintaining high accuracy. Different configurations of the hybrid model are extensively examined to achieve this goal. All of the experiments were conducted using the Kaldi and PyTorch-Kaldi toolkits with the Hindi speech dataset. The proposed model reports an 8.0% word error rate (WER).
Source: Computer Science; 2020, 21 (4); 397-417
1508-2806
2300-7036
Appears in: Computer Science
Content provider: Biblioteka Nauki


Title: Music Genre Recognition Using Convolutional Neural Networks
Rozpoznawanie gatunków muzycznych z użyciem splotowych sieci neuronowych
Authors: Matocha, M.
Zieliński, S. K.
Links: https://bibliotekanauki.pl/articles/88408.pdf
Publication date: 2018
Publisher: Politechnika Białostocka. Oficyna Wydawnicza Politechniki Białostockiej
Subjects: automatyczne rozpoznawanie gatunków muzycznych
splotowe sieci neuronowe
pozyskiwanie informacji w muzyce
automatic music genre recognition
convolutional neural networks
music information retrieval
Description: The aim of this study was to develop a music genre classifier using convolutional neural networks and to compare its performance with a traditional algorithm based on support vector machines. A distinct feature of the proposed approach was the use of two-channel stereo signals at the input of the convolutional network. The proposed method yielded results similar to those obtained with the traditional approach, demonstrating the potential of the proposed method and indicating the need for its further optimization. Using two-channel stereo signals at the input of the algorithm showed no improvement over the baseline method exploiting single-channel recordings, suggesting that monaural signals fed to the convolutional network might be sufficient for the task of music genre recognition. According to the results, the network ‘prioritized’ the temporal changes over the frequency variations of the signals. This observation tentatively implies that classifiers specifically designed to account for temporal changes might serve the task of music genre recognition better than convolutional neural networks.
Source: Advances in Computer Science Research; 2018, 14; 125-142
2300-715X
Appears in: Advances in Computer Science Research
Content provider: Biblioteka Nauki


Title: An algorithm for vehicle identification by on-board Bluetooth devices exploiting Big-Data tools
Algorytm identyfikacji pojazdów poprzez urządzenia Bluetooth wykorzystujący narzędzia Big Data
Authors: Bazan, M.
Janiczek, T.
Kurda, R.
Matusiak, K.
Sak, Ł.
Links: https://bibliotekanauki.pl/articles/115307.pdf
Publication date: 2017
Publisher: Wyższa Szkoła Techniczna w Katowicach
Subjects: automatic number plate recognition
Bluetooth devices
Hadoop
Spark
user identification
identyfikacja użytkownika
Description: Nowadays, vehicles are equipped with various on-board Bluetooth devices that log on to the ITS infrastructure whenever they pass Bluetooth readers. The location of Bluetooth readers is an important issue for travel time prediction in urban areas. Bluetooth technology is used to enhance travel time prediction accuracy and complements vehicle identification by license plate number. Travel time prediction algorithms are used by technologies such as TRAX to offer road users an alternative route for traversing the most congested regions of the city in the most efficient way. In this paper we present the implementation of an algorithm that matches Bluetooth on-board devices, as well as cell phones mounted or simply present in vehicles, with the vehicles themselves. Since the ITS is a source of an enormous and growing amount of data, we use Big Data tools such as Apache Hadoop and Apache Spark for this purpose. To build Map-Reduce tasks we use Hive-SQL. The algorithm was tested on ITS data from the city of Wroclaw. The results of the algorithm may be used to locate stolen vehicles.
Source: Zeszyty Naukowe Wyższej Szkoły Technicznej w Katowicach; 2017, 9; 7-21
2082-7016
2450-5552
Appears in: Zeszyty Naukowe Wyższej Szkoły Technicznej w Katowicach
Content provider: Biblioteka Nauki


Title: An Effective Speaker Clustering Method using UBM and Ultra-Short Training Utterances
Authors: Hossa, R.
Makowski, R.
Links: https://bibliotekanauki.pl/articles/176593.pdf
Publication date: 2016
Publisher: Polska Akademia Nauk. Czytelnia Czasopism PAN
Subjects: automatic speech recognition
interindividual difference compensation
speaker clustering
universal background model
GMM weighting factor adaptation
Description: The same speech sounds (phones) produced by different speakers can sometimes exhibit significant differences. Therefore, it is essential to use algorithms that compensate for these differences in ASR systems. Speaker clustering is an attractive solution to the compensation problem, as it does not require long utterances or high computational effort at the recognition stage. The report proposes a clustering method based solely on adaptation of UBM model weights. This solution has turned out to be effective even when using a very short utterance. The obtained improvement in frame recognition quality, measured by means of frame error rate, is over 5%. It is noteworthy that this improvement concerns all vowels, even though the clustering discussed in this report was based only on the phoneme a. This indicates a strong correlation between the articulation of different vowels, which is probably related to the size of the vocal tract.
Source: Archives of Acoustics; 2016, 41, 1; 107-118
0137-5075
Appears in: Archives of Acoustics
Content provider: Biblioteka Nauki


Title: Selected aspects of road vehicle localisation
Wybrane aspekty lokalizacji pojazdów drogowych
Authors: Jackowski, S.
Górska, M.
Links: https://bibliotekanauki.pl/articles/310877.pdf
Publication date: 2016
Publisher: Instytut Naukowo-Wydawniczy "SPATIUM"
Subjects: license plate recognition
automatic vehicle localization
GPS
rozpoznawanie tablic rejestracyjnych
automatyczna lokalizacja pojazdów
Description: The article discusses possibilities of integrating techniques for optical recognition of vehicles, e.g. by automatic number plate analysis, with data transmission via wireless mobile networks in order to create mobility patterns for objects of interest. The collected data may be interpolated to fill gaps and to prove the coincidence of the observed vehicle’s presence at selected places with other events. Further, course prediction by means of extrapolation may be attempted. Several theoretical and practical aspects of data acquisition, transmission and analysis are studied in this article.
Source: Autobusy : technika, eksploatacja, systemy transportowe; 2016, 17, 12; 195-198
1509-5878
2450-7725
Appears in: Autobusy : technika, eksploatacja, systemy transportowe
Content provider: Biblioteka Nauki


Title: Selekcja cech osobniczych sygnału mowy z wykorzystaniem algorytmów genetycznych
Selection of individual features of a speech signal using genetic algorithms
Authors: Kamiński, K.
Dobrowolski, A. P.
Majda-Zdancewicz, E.
Links: https://bibliotekanauki.pl/articles/949807.pdf
Publication date: 2016
Publisher: Wojskowa Akademia Techniczna im. Jarosława Dąbrowskiego
Subjects: biometria
automatyczne rozpoznawanie mówcy
algorytmy genetyczne
selekcja cech
biometrics
automatic speaker recognition
genetic algorithms
feature selection
Description: The paper presents an automatic speaker recognition system implemented in the Matlab environment and shows how the individual elements of the system were implemented and optimized. The main emphasis was put on selecting the distinctive features of the speaker's voice using a genetic algorithm, which makes it possible to take the synergy of features into account during selection. The results of optimizing selected elements of the classifier are also shown, including the number of Gaussian distributions used to model each of the voices. In addition, a universal voice model was used when creating the models of individual voices.
Source: Biuletyn Wojskowej Akademii Technicznej; 2016, 65, 1; 147-158
1234-5865
Appears in: Biuletyn Wojskowej Akademii Technicznej
Content provider: Biblioteka Nauki


Title: System rozpoznawania mowy polskiej dla robota społecznego
Automatic Speech Recognition System for Polish Dedicated for a Social Robot
Authors: Zygadło, A.
Janicki, A.
Dąbek, P.
Links: https://bibliotekanauki.pl/articles/277843.pdf
Publication date: 2016
Publisher: Sieć Badawcza Łukasiewicz - Przemysłowy Instytut Automatyki i Pomiarów
Subjects: automatyczne rozpoznawanie mowy
command and control
robot społeczny
automatic speech recognition
social robot
Description: An automatic speech recognition system for Polish, dedicated to social robotics applications, is presented. The system is based on the free and open-source software library pocketsphinx (CMU Sphinx). Training and test databases were prepared with transcriptions; the training database comprised the voices of 10 women and 10 men and was prepared from audiobooks, whereas the test database comprised the voices of 3 women and 3 men recorded in laboratory conditions as a part of the present work. A phoneme set for Polish consisting of 39 phonemes was prepared, based on two popular available sets. The phonetic dictionary was obtained using grapheme-to-phoneme conversion from the eSpeak library. The statistical language model for the reference text, consisting of 76 commands, was generated using the cmuclmtk tool (CMU Sphinx). Training of the acoustic model and testing of speech recognition quality were conducted using the sphinxtrain tool (CMU Sphinx). The following error rates were obtained in laboratory conditions: 4% word error rate (WER) and 9% sentence error rate (SER). The system was also tested in real conditions on a test group of 2 women and 3 men. The initial, tentative results are about 10% (SER) at a close speaker-microphone distance and about 60% (SER) at a distance of 3 m. Directions of future work are formulated.
Source: Pomiary Automatyka Robotyka; 2016, 20, 4; 27-36
1427-9126
Appears in: Pomiary Automatyka Robotyka
Content provider: Biblioteka Nauki


Searching for the phrase "automatic recognition" by criterion: Subject
