Subject: rozpoznawanie mowy automatyczne (automatic speech recognition) - OPAC collection catalog

Title:: System rozpoznawania mowy polskiej dla robota społecznego
Automatic Speech Recognition System for Polish Dedicated for a Social Robot
Authors:: Zygadło, A.
Janicki, A.
Dąbek, P.
Links:: https://bibliotekanauki.pl/articles/277843.pdf
Publication date:: 2016
Publisher:: Sieć Badawcza Łukasiewicz - Przemysłowy Instytut Automatyki i Pomiarów
Subjects:: automatyczne rozpoznawanie mowy
command and control
robot społeczny
automatic speech recognition
social robot
Description:: The article presents an automatic speech recognition system for Polish dedicated to a social robot. The system is based on the free and open-source pocketsphinx library (CMU Sphinx). Training and test sets of recordings were prepared together with transcriptions. The training set comprised the voices of 10 women and 10 men and was built from audiobooks, whereas the test set comprised the voices of 3 women and 3 men recorded in laboratory conditions specifically for this work. A set of 39 phonemes for Polish was prepared on the basis of two popular sets from other researchers. The phonetic dictionary was produced using the grapheme-to-phoneme conversion of the eSpeak speech synthesis tool. A statistical language model for a reference text of 76 commands was generated with the cmuclmtk tool (CMU Sphinx). Acoustic model training and the recognition quality test were carried out with the sphinxtrain tool (CMU Sphinx). In laboratory conditions the word error rate (WER) was 4% and the sentence error rate (SER) was 9%. The system was then tested in real conditions on a group of 2 women and 3 men, giving initial, tentative results of about 10% SER at close speaker-microphone distance and about 60% SER at a distance of 3 m. Directions for future work are identified.
Source:: Pomiary Automatyka Robotyka; 2016, 20, 4; 27-36
1427-9126
Appears in:: Pomiary Automatyka Robotyka
Content provider:: Biblioteka Nauki
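
Note on the WER and SER figures quoted in the description above: the sketch below shows how the two metrics are conventionally computed from reference and recognized transcripts. It is a minimal illustration, not the evaluation code used by the authors or by sphinxtrain; the function names and the four sample commands are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two lists of words."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion of a reference word
                            curr[j - 1] + 1,      # insertion of an extra word
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer_and_ser(pairs):
    """Return (WER, SER) for a list of (reference, hypothesis) strings."""
    total_words = sum(len(ref.split()) for ref, _ in pairs)
    word_errors = sum(edit_distance(ref.split(), hyp.split())
                      for ref, hyp in pairs)
    sentence_errors = sum(1 for ref, hyp in pairs
                          if ref.split() != hyp.split())
    return word_errors / total_words, sentence_errors / len(pairs)


# Hypothetical commands, invented for illustration only.
pairs = [
    ("jedź do przodu", "jedź do przodu"),
    ("zatrzymaj się", "zatrzymaj"),            # one word deleted
    ("obróć się w lewo", "obróć się w prawo"),  # one word substituted
    ("podaj rękę", "podaj rękę"),
]
wer, ser = wer_and_ser(pairs)
print(f"WER = {wer:.1%}, SER = {ser:.1%}")  # WER = 18.2%, SER = 50.0%
```

Because a single wrong word makes the whole sentence count as an error, SER is never lower than WER when all sentences have similar length, which is consistent with the 4% and 9% reported above.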

Title:: Recognition of speaker’s age group and gender for a large database of telephone-recorded voices
Authors:: Staroniewicz, Piotr
Links:: https://bibliotekanauki.pl/articles/2202432.pdf
Publication date:: 2022
Publisher:: Politechnika Poznańska. Instytut Mechaniki Stosowanej
Subjects:: speech processing
automatic age recognition
przetwarzanie mowy
automatyczne rozpoznawanie wieku
Description:: The paper presents the results of automatic recognition of the age group and gender of speakers, performed on the large SpeechDAT(E) acoustic database for the Polish language, which contains recordings of 1000 speakers (486 males/514 females) aged 12 to 73, recorded in telephone conditions. Three age groups were recognized for each gender. Mel Frequency Cepstral Coefficients (MFCC) were used to describe the recognized signals parametrically. Among the classification methods tested in this study, the best results were obtained with the SVM (Support Vector Machines) method.
Source:: Vibrations in Physical Systems; 2022, 33, 2; art. no. 2022203
0860-6897
Appears in:: Vibrations in Physical Systems
Content provider:: Biblioteka Nauki

Title:: Percepcja audytywna, właściwości akustyczne oraz cechy dystrybucyjne sylab w języku polskim
Auditive perception, acoustic and distributional properties of syllables in Polish
Authors:: Śledziński, Daniel
Links:: https://bibliotekanauki.pl/articles/916843.pdf
Publication date:: 2015-12-31
Publisher:: Uniwersytet im. Adama Mickiewicza w Poznaniu
Subjects:: syllable
speech perception
artificial neural networks
automatic speech recognition
sylaba
percepcja mowy
sztuczne sieci neuronowe
automatyczne rozpoznawanie mowy
Description:: This paper presents experiments concerning the properties of selected CV syllables. The acoustic speech signal related to particular syllables was analyzed using artificial neural networks. The goal of the analyses was to investigate whether realizations of particular syllables retain acoustic features distinctive of these syllables. Additionally, a perception test aimed at identifying the same set of syllables was carried out. The test examined to what degree syllables isolated from their linguistic context can be identified. The paper also discusses results on the distributional properties of syllables, which indicate that such properties may play a significant role in speech perception.
Source:: Investigationes Linguisticae; 2015, 32; 106-123
1426-188X
1733-1757
Appears in:: Investigationes Linguisticae
Content provider:: Biblioteka Nauki

Title:: Building compact language models for medical speech recognition in mobile devices with limited amount of memory
Authors:: Sas, J.
Links:: https://bibliotekanauki.pl/articles/332971.pdf
Publication date:: 2012
Publisher:: Uniwersytet Śląski. Wydział Informatyki i Nauki o Materiałach. Instytut Informatyki. Zakład Systemów Komputerowych
Subjects:: automatyczne rozpoznawanie mowy
medyczne systemy informacyjne
modelowanie języka
automatic speech recognition
medical information systems
language modeling
Description:: The article presents a method of building a compact language model for speech recognition in devices with a limited amount of memory. The most commonly used word-based bigram language models allow for highly accurate speech recognition but need a large amount of memory, mainly because of the large number of word bigrams. The proposed method ranks bigrams according to their importance in speech recognition and replaces the explicit probability estimates of less important bigrams with probabilities derived from a class-based model. The class-based model is created by assigning words appearing in the corpus to classes corresponding to their syntactic properties. The classes represent various combinations of part of speech and inflectional features such as number, case, tense and person. To reduce the memory needed to store the class-based model as far as possible, a method is applied that reduces the number of part-of-speech classes by merging classes that appear in stochastically similar contexts in the corpus. Experiments carried out in selected domains of medical speech show that the method allows a 75% reduction in model size without significant loss of speech recognition accuracy.
Source:: Journal of Medical Informatics & Technologies; 2012, 20; 111-119
1642-6037
Appears in:: Journal of Medical Informatics & Technologies
Content provider:: Biblioteka Nauki
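
Note on the class-based replacement described above: the sketch below illustrates the textbook class-based bigram estimate, P(w2 | w1) ≈ P(c2 | c1) · P(w2 | c2), with explicit probabilities kept only for a chosen set of bigrams. It is an assumption-laden illustration, not the author's method: the bigram ranking, class merging and renormalization steps are omitted, and the corpus, the word-to-class mapping and all function names are hypothetical.

```python
from collections import Counter


def train_class_model(corpus, word_class):
    """Collect the counts needed for a class-based bigram estimate."""
    word_counts = Counter()     # c(w)
    class_counts = Counter()    # c(class)
    history_counts = Counter()  # how often a class starts a bigram
    class_bigrams = Counter()   # c(class1, class2)
    for sentence in corpus:
        classes = [word_class[w] for w in sentence]
        word_counts.update(sentence)
        class_counts.update(classes)
        history_counts.update(classes[:-1])
        class_bigrams.update(zip(classes, classes[1:]))
    return word_counts, class_counts, history_counts, class_bigrams


def class_bigram_prob(w1, w2, word_class, model):
    """P(w2 | w1) approximated as P(c2 | c1) * P(w2 | c2)."""
    word_counts, class_counts, history_counts, class_bigrams = model
    c1, c2 = word_class[w1], word_class[w2]
    if history_counts[c1] == 0 or class_counts[c2] == 0:
        return 0.0
    p_class = class_bigrams[(c1, c2)] / history_counts[c1]
    p_word = word_counts[w2] / class_counts[c2]
    return p_class * p_word


def bigram_prob(w1, w2, kept, word_class, model):
    """Use the explicit estimate for kept bigrams, the class model otherwise."""
    if (w1, w2) in kept:
        return kept[(w1, w2)]
    return class_bigram_prob(w1, w2, word_class, model)


# Hypothetical toy corpus and part-of-speech classes.
corpus = [["patient", "has", "fever"],
          ["patient", "has", "cough"],
          ["doctor", "sees", "patient"]]
word_class = {"patient": "NOUN", "doctor": "NOUN", "fever": "NOUN",
              "cough": "NOUN", "has": "VERB", "sees": "VERB"}
model = train_class_model(corpus, word_class)
kept = {("patient", "has"): 1.0}  # one "important" bigram stored explicitly

p_kept = bigram_prob("patient", "has", kept, word_class, model)
p_unseen = bigram_prob("sees", "cough", kept, word_class, model)
print(p_kept, p_unseen)
```

The word pair "sees cough" never occurs in the toy corpus, yet it receives a non-zero probability from the class counts alone, which is why only a small table of class statistics has to be stored for the bigrams that are not kept explicitly.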

Title:: Optimal spoken dialog control in hands-free medical information systems
Authors:: Sas, J.
Links:: https://bibliotekanauki.pl/articles/333081.pdf
Publication date:: 2009
Publisher:: Uniwersytet Śląski. Wydział Informatyki i Nauki o Materiałach. Instytut Informatyki. Zakład Systemów Komputerowych
Subjects:: rozpoznawanie mowy automatyczne
optymalizacja genetyczna
systemy informacji medycznej
automatic speech recognition
genetic optimization
medical information systems
Description:: The paper presents a method for the optimal selection of utterances used as command entry-words in a voice-controlled application. Voice-controlled programs seem particularly useful in medical informatics, where a physician interacts with a program by voice while operating a medical device or performing examinations that require manual activity. The proposed method selects command words from sets of proposals defined for each command so as to minimize the overall probability of incorrect command recognition. First, the entry-word dissimilarity matrix is calculated. Word dissimilarities are evaluated using HMMs built from appropriately trained acoustic models of the phonemes that make up the words. The trained HMM is used as a generator of sample utterances for the word. The artificially created utterance samples are then recognized by speech recognizers built for pairs of words, and the estimated probability of correct recognition is used as the word dissimilarity measure. The word dissimilarities are then used to compute an average assessment of candidate word selections, where a selection is created by choosing a single word from the set of candidates defined for each command. Finally, a suboptimal selection is found using a genetic algorithm. The experiments show that suboptimal selection of command entry-words can noticeably increase the accuracy of spoken command recognition in many cases.
Source:: Journal of Medical Informatics & Technologies; 2009, 13; 113-120
1642-6037
Appears in:: Journal of Medical Informatics & Technologies
Content provider:: Biblioteka Nauki

Title:: Pipelined language model construction for Polish speech recognition
Authors:: Sas, J.
Żołnierek, A.
Links:: https://bibliotekanauki.pl/articles/329841.pdf
Publication date:: 2013
Publisher:: Uniwersytet Zielonogórski. Oficyna Wydawnicza
Subjects:: automatic speech recognition
hidden Markov model
adaptive language model
automatyczne rozpoznawanie mowy
model Markova ukryty
model językowy adaptacyjny
Description:: The aim of the work described in this article is to develop and experimentally evaluate a consistent method of language model (LM) construction for Polish speech recognition. The proposed method takes into account the features and specific problems encountered in practical applications of speech recognition in Polish: rich inflection, a loose word order and the tendency for short word deletion. The LM is created in five stages. Each successive stage takes the model prepared at the previous stage and modifies or extends it to improve its properties. At the first stage, typical LM smoothing methods are used to create the initial model; the four most frequently used methods of LM construction are considered here. At the second stage, the model is extended to take into account words indirectly co-occurring in the corpus. At the third stage, LM modifications aim to reduce short word deletion errors, which occur frequently in Polish speech recognition. The fourth stage extends the model by inserting words that were not observed in the corpus. Finally, the model is modified to ensure highly accurate recognition of very important utterances. The performance of the methods applied is tested in four language domains.
Source:: International Journal of Applied Mathematics and Computer Science; 2013, 23, 3; 649-668
1641-876X
2083-8492
Appears in:: International Journal of Applied Mathematics and Computer Science
Content provider:: Biblioteka Nauki

Title:: Sterowanie głosowe w systemach obróbkowych
Voice control in machining systems
Authors:: Rogowski, A.
Links:: https://bibliotekanauki.pl/articles/404531.pdf
Publication date:: 2017
Publisher:: Wydawnictwo AWART
Subjects:: voice recognizing
voice control
work Centre
automatyczne rozpoznawanie mowy
algorytm
systemy obróbkowe
Description:: The article discusses the possibilities of using voice control to operate machining cells composed of CNC machine tools. The problem is presented against the background of existing research on automatic speech recognition, in particular its application in broadly understood manufacturing. The general scheme of the speech recognition algorithm is discussed, along with the features of this algorithm that depend on the complexity of the commands used. On this basis, possible variants of command types for operating automated machining systems are given, together with variants of their processing depending on the tasks to be carried out as a result of these commands. As an example of a working solution, the voice control system for the EMCO training robotized machining cell, developed at the Warsaw University of Technology, is presented.
Source:: Obróbka Metalu; 2017, 3; 36-42
2081-7002
Appears in:: Obróbka Metalu
Content provider:: Biblioteka Nauki

Title:: Zastosowania systemów rozpoznawania mowy do sterowania i komunikacji głosowej z urządzeniami mechatronicznymi
Applications of speech recognition systems to control and voice communication with mechatronic devices
Authors:: Regulski, R.
Nowak, A.
Links:: https://bibliotekanauki.pl/articles/276751.pdf
Publication date:: 2013
Publisher:: Sieć Badawcza Łukasiewicz - Przemysłowy Instytut Automatyki i Pomiarów
Subjects:: automatyczne rozpoznawanie mowy
sterowanie głosowe
interfejs człowiek-maszyna
sterownik pralki
automatic speech recognition
voice control
human machine interface
washing machine controller
Description:: The article presents examples of the use of automatic speech recognition systems to build human-machine voice interfaces. It briefly describes how such applications work in terms of voice control and voice communication. The remainder of the article presents the concept and construction of a speech recognition system for communication with a 32-bit modular washing machine controller.
Source:: Pomiary Automatyka Robotyka; 2013, 17, 2; 467-474
1427-9126
Appears in:: Pomiary Automatyka Robotyka
Content provider:: Biblioteka Nauki

Title:: Automatic recognition of voice commands in a car cabin
Automatyczne rozpoznawanie komend głosowych w kabinie pojazdu
Authors:: Mięsikowska, M.
Ruiter de, E.
Links:: https://bibliotekanauki.pl/articles/156597.pdf
Publication date:: 2014
Publisher:: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Subjects:: car cabin
in-car speech recognition
acoustics
speech intelligibility
kabina pojazdu
automatyczne rozpoznawanie mowy
warunki akustyczne
zrozumiałość mowy
Description:: Automatic speech recognition systems are used in vehicles. Voice commands make it possible to control a navigation system, an air conditioning system and a media player, and to make phone calls. The effectiveness of speech recognition systems depends largely on the acoustic conditions in the vehicle cabin. Recognition accuracy, in turn, determines whether the use of speech recognition can be extended beyond the basic functions listed above to the control of systems that affect the movement of the vehicle. The work presents preliminary results of research on speech recognition and on the evaluation of speech intelligibility in the vehicle cabin in the presence of acoustic screens (noise barriers). These results may be helpful in assessing speech intelligibility and the results of automatic speech recognition in the vehicle cabin.
Source:: Pomiary Automatyka Kontrola; 2014, R. 60, nr 8, 8; 652-654
0032-4140
Appears in:: Pomiary Automatyka Kontrola
Content provider:: Biblioteka Nauki

Title:: Behavioral features of the speech signal as part of improving the effectiveness of the automatic speaker recognition system
Authors:: Mały, Dominik
Dobrowolski, Andrzej
Links:: https://bibliotekanauki.pl/articles/27323689.pdf
Publication date:: 2023
Publisher:: Centrum Rzeczoznawstwa Budowlanego Sp. z o.o.
Subjects:: automatic speaker recognition
automatic speaker recognition systems
physical features
behavioral features
speech signal
automatyczne rozpoznawanie mówiącego
sygnał mowy
system automatycznego rozpoznawania mówiącego
cecha behawioralna
cecha fizyczna
Description:: Intelligent telecommunications solutions are now ubiquitous, and automatic speaker recognition systems are an integral part of many of them. They are widely used in sectors such as banking, telecommunications and forensics. The ease of automatic analysis and the efficient extraction of the distinctive characteristics of the human voice make it possible to identify, verify and authorize the speaker under investigation. Currently, the vast majority of speaker recognition systems are based on distinctive features resulting from the structure of the speaker's vocal tract (laryngeal sound analysis), called physical features of the voice. Despite the high effectiveness of such systems, exceeding 95%, their further development is already very difficult because the possibilities of distinctive physical features have been exhausted. Further opportunities to increase the effectiveness of automatic speaker recognition (ASR) systems based on physical features arise when the behavioral features of the speech signal are additionally taken into account, which is the subject of this article.
Source:: Inżynieria Bezpieczeństwa Obiektów Antropogenicznych; 2023, 4; 26-34
2450-1859
2450-8721
Appears in:: Inżynieria Bezpieczeństwa Obiektów Antropogenicznych
Content provider:: Biblioteka Nauki

Title:: Rozpoznawanie wieku i płci na podstawie analizy głosu
Age and gender recognition based on analysis of voice
Authors:: Gabryś, J.
Gil, G.
Kiszka, P.
Links:: https://bibliotekanauki.pl/articles/261820.pdf
Publication date:: 2015
Publisher:: Politechnika Wrocławska. Wydział Podstawowych Problemów Techniki. Katedra Inżynierii Biomedycznej
Subjects:: automatyczne rozpoznawanie mowy
wiek
płeć
współczynniki MFCC
klasyfikacja mówcy
maszyna wektorów nośnych
automatic speech recognition
age
gender
MFCC coefficients
classification of speaker
support vector machine (SVM)
Description:: Methods for automatic recognition of age and gender make it possible to determine characteristics of a speaker solely from a recording of their speech. Beyond the verbal message, human speech carries information about the speaker; a speech recording allows information such as gender, age and emotions to be extracted. The paper presents an overview of methods for recognizing the age and gender of people from their speech. A combination of methods for determining MFCC parameters (Mel-frequency Cepstral Coefficients) and voice pitch (f0) with the SVM (Support Vector Machines) algorithm for classifying voice samples was implemented and tested. Tests of the implemented solution indicate that the method is effective in the majority of test cases.
Source:: Acta Bio-Optica et Informatica Medica. Inżynieria Biomedyczna; 2015, 21, 3; 165-169
1234-5563
Appears in:: Acta Bio-Optica et Informatica Medica. Inżynieria Biomedyczna
Content provider:: Biblioteka Nauki

Title:: Analiza obwiedni jako parametr wspomagający automatyczną identyfikację wyrażeń
The envelope analysis as a useful parameter in automatic phrase identification
Authors:: Dulas, J.
Links:: https://bibliotekanauki.pl/articles/156853.pdf
Publication date:: 2009
Publisher:: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Subjects:: automatyczne rozpoznawanie sygnałów mowy
analiza obwiedni
automatic speech recognition
envelope analysis
Description:: Research on automatic speech signal recognition is making steady progress, although the diversity of languages makes it difficult to apply uniform solutions. One example of the development and spread of speech identification methods is the Windows XP operating system, which includes tools for controlling applications by voice, though not in Polish. Solutions for Polish are still lacking, so research aimed at reliable identification and control algorithms is needed. The paper presents the results of investigations of the envelopes of speech signals for the digits 0-9, recorded for a group of 50 speakers of different sex and age. The aim was to determine whether envelope analysis can serve as a parameter in automatic speech recognition and whether envelope models common to all 50 speakers can be built for each digit. Based on analysis of the time waveform, each digit was divided into parts (from 2 to 5) with a similar envelope, and the minimum duration and amplitude range were found for each part. The results are given in Table 1; Table 2 contains the results of fitting the envelope to each digit. Envelope patterns common to all the speakers were found for the digits. Although envelope analysis alone is not sufficient for automatic speech recognition (some digit patterns fit others), it can be used as one of the parameters employed for this purpose.
Source:: Pomiary Automatyka Kontrola; 2009, R. 55, nr 5, 5; 308-309
0032-4140
Appears in:: Pomiary Automatyka Kontrola
Content provider:: Biblioteka Nauki

Title:: Parametry identyfikacyjne umożliwiające automatyczne rozpoznawanie cyfr wypowiadanych w języku polskim
Identification parameters enabling automatic recognition of digits spoken in Polish
Authors:: Dulas, J.
Links:: https://bibliotekanauki.pl/articles/157420.pdf
Publication date:: 2011
Publisher:: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Subjects:: automatyczne rozpoznawanie sygnału mowy
fonemy
automatic speech recognition
phonemes
Description:: The paper presents the author's latest results in automatic speech signal recognition and describes a new method for automatic recognition of digits spoken in Polish. Unlike typical systems of this kind, the approach uses no frequency analysis; instead, image recognition of the time waveform is applied. Investigations performed on a set of 500 recordings of digits spoken in Polish by 50 speakers of different sex and age showed that an automatic recognition system can be built on a few parameters. The first is the number of voiced phonemes in the recognized word (Tab. 1). This group includes all vowels and some consonants, whose time waveforms contain basic periods. This parameter is obtained using the grid method designed by the author (Fig. 3). The second is the number and position of noisy phonemes, that is, phonemes without basic periods but with large signal variability. It is calculated from the number of local extrema and the signal amplitude level, and by checking that there are no basic periods. The third parameter is the shape of the signal envelope (Tab. 2). The investigations showed that an envelope pattern common to all tested speakers can be found for each Polish digit. These parameters proved sufficient for automatic recognition of digits spoken in Polish. The method can also be applied to other systems with a small number of recognized words. It is fast, and the lack of frequency analysis means that it has low hardware demands.
Source:: Pomiary Automatyka Kontrola; 2011, R. 57, nr 3, 3; 308-311
0032-4140
Appears in:: Pomiary Automatyka Kontrola
Content provider:: Biblioteka Nauki
