Subject: visual speech - OPAC collection catalog

Title: Speech Signal Measurement with 2D Microphone Array for Audio Visual Robot Control
Pomiar sygnału głosowego za pomocą matrycy mikrofonowej dwuwymiarowej przeznaczonej do audio-wizyjnego sterowania robota
Authors: Bekiarski, A.
Pleshkova-Bekiarska, S. G.
Link: https://bibliotekanauki.pl/articles/153173.pdf
Publication date: 2008
Publisher: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Subjects: speech signal measurement
audiovisual robot control
audio visual robot sensors
microphone arrays
speech processing
Description: Speech signals are one of the essential sources of information for modern intelligent robots equipped with microphone arrays as audio sensors. Applications of microphone arrays are well known: they are used to collect and measure audio information in the audio processing system of a robot. The audio information can be of different kinds: music, speech, noise, etc. This paper considers only speech signals, which are used for robot control. Many microphone array structures (linear, planar, circular, etc.) can be used for collecting and measuring speech signals with the audio system of an audio-visual robot. Linear microphone arrays are used most often, mainly because of their simplicity. They are also used for robot orientation and movement control in simple room situations by detecting the direction of speech arrival. The goal of this paper is to present the use of a 2D microphone array for speech signal measurement and the application of space-time filtering optimized to find the speech direction of arrival (DOA). The detected and calculated direction of arrival can be combined with coordinate information from the video sensor to effectively control the movement of the mobile robot in a specified direction.
Source: Pomiary Automatyka Kontrola; 2008, Vol. 54, No. 10; 741-743
0032-4140
Appears in: Pomiary Automatyka Kontrola
Content provider: Biblioteka Nauki

Article

Title: Opieka logopedyczna dla uczniów ze specjalnymi potrzebami edukacyjnymi w regionie śląsko-morawskim
Speech therapy for pupils with special educational needs in Moravian Silesian Region
Authors: Javorčeková, Gerlinda
Tesárková, Eva
Link: https://bibliotekanauki.pl/articles/667437.pdf
Publication date: 2012-01-01
Publisher: Wydawnictwo Uniwersytetu Śląskiego
Subjects: speech therapy
children with special educational needs
visual communication system PECS
Description: Speech is very important for communication between people. In the Moravian-Silesian Region, speech therapy is provided free of charge and is covered under the contractual relationship between medical facilities and health insurance providers. Speech therapy is available to every child and adult.
Source: Logopedia Silesiana; 2012, No. 1; 15-18
2300-5246
2391-4297
Appears in: Logopedia Silesiana
Content provider: Biblioteka Nauki

Article

Title: Extending Visual Speech Synthesis for Polish with basic emotion model
Authors: Bloch, J.
Link: https://bibliotekanauki.pl/articles/115798.pdf
Publication date: 2013
Publisher: Fundacja na Rzecz Młodych Naukowców
Subjects: Visual Speech Synthesis
emotion
Xface
Ekman
Description: Expressing emotions is a very important feature of Visual Speech Synthesis systems. In 1972 the first “basic emotions” list was introduced by Paul Ekman. Since then, a few different classifications have been published. The best-known “basic emotion” models are briefly described in this paper. In a previous publication, a new Visual Speech Synthesis system for Polish was presented. The system was based on the Xface toolkit and the “Karol” face model. The aim of this paper is to add a “basic emotion” model, following Paul Ekman’s classification, to the “Karol” face model. To achieve this goal, new emotional keyframes are proposed. This new functionality of the “Karol” face model makes it possible to generate talking human face animations that express emotions. Subjective tests of the new functionality are also included in the paper. The results show that more information about a speaker’s emotions is read from the facial expression than from the speech signal. People can recognize a speaker’s emotion more easily when they see the speaker’s facial expression.
Source: Challenges of Modern Technology; 2013, 4, 2; 19-22
2082-2863
2353-4419
Appears in: Challenges of Modern Technology
Content provider: Biblioteka Nauki

Article

Title: Koncepcja zdalnej, głosowej i wizualnej komunikacji operatora i systemu monitorowania i optymalizacji procesów mikro- i nanoobróbki
A concept of distant voice and visual communication between the operator and a system for monitoring and optimization of micro- and nano-machining processes
Authors: Lipiński, D.
Majewski, M.
Link: https://bibliotekanauki.pl/articles/156999.pdf
Publication date: 2013
Publisher: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Subjects: monitoring and optimization of micro- and nano-machining processes
intelligent systems of monitoring and optimization
interaction between operators and systems
intelligent interface for system control
voice and visual communication
speech interface
process quality
measurement data analysis
expert systems
artificial intelligence
Description: The paper presents a new concept of distant voice and visual communication between the operator and a system for monitoring and optimization of micro- and nano-machining processes. The distant system for monitoring and optimization of process quality, equipped with a visual and vocal interface, is presented in an exemplary application to precision grinding. The proposed architecture comprises a data analysis layer, a process supervision layer, a decision layer, a communication subsystem using speech and natural language, and a visual communication subsystem using vocal descriptions. Because the system is equipped with several intelligent layers, it is capable of control, supervision and optimization of micro- and nano-machining processes. In the proposed system, computational intelligence methods allow for real-time data analysis of the monitored process, configuration of the system, and process supervision based on process features and quality models. The system is also capable of detecting inaccuracies, estimating and compensating for the results of inaccuracies, and selecting machining parameters and conditions. In addition, it assesses the operator's decisions. The system also includes mechanisms for analyzing the meaning of the operator's messages and commands given by voice in natural language, as well as various forms of visual communication with the operator using vocal descriptions. The layer for data presentation and communication provides data and information about the machining parameters and conditions, tool condition, process condition, estimation of the process quality, and process variables. The interaction between the operator and the system by speech and natural language contains intelligent mechanisms for operator biometric identification, speech recognition, word recognition, recognition of messages and commands, syntax analysis of messages, analysis of the effects of commands, and safety assessment of commands. The interaction between the system and the operator using visual messages with vocal descriptions includes intelligent mechanisms for generation of graphical and textual reports, classification of message forms, generation of messages in graphical and textual forms, consolidation and analysis of message contents, and synthesis of multimedia messages. In the paper, Fig. 1 presents the concept of distant voice and visual communication between the operator and a system for monitoring and optimization of micro- and nano-machining processes. The concept of the system for distant monitoring and optimization of precision grinding processes using voice and visual communication between the operator and the system is shown in Fig. 2, while the complete structure of the system is depicted in Fig. 3. The paper also presents a concept of intelligent methods and algorithms (Fig. 4) for describing the machining process quality on the basis of measurement data analysis using an expert system equipped with regression neural networks.
Source: Pomiary Automatyka Kontrola; 2013, Vol. 59, No. 7; 648-651
0032-4140
Appears in: Pomiary Automatyka Kontrola
Content provider: Biblioteka Nauki

Article

Title: Audio-Visual Speech Processing System for Polish Applicable to Human-Computer Interaction
Authors: Jadczyk, T.
Link: https://bibliotekanauki.pl/articles/305828.pdf
Publication date: 2018
Publisher: Akademia Górniczo-Hutnicza im. Stanisława Staszica w Krakowie. Wydawnictwo AGH
Subjects: audio-visual speech recognition
visual features extraction
human-computer interaction
Description: This paper describes an audio-visual speech recognition system for the Polish language and a set of performance tests under various acoustic conditions. We first present the overall structure of AVASR systems with three main areas: audio feature extraction, visual feature extraction and, subsequently, audio-visual speech integration. We present MFCC features for the audio stream with a standard HMM modeling technique, then we describe appearance-based and shape-based visual features. Subsequently we present two feature integration techniques: feature concatenation and model fusion. We also discuss the results of a set of experiments conducted to select the best system setup for Polish under noisy audio conditions. The experiments simulate human-computer interaction in a computer-control scenario with voice commands in difficult audio environments. With an Active Appearance Model (AAM) and a multistream Hidden Markov Model (HMM), we can improve system accuracy by reducing the Word Error Rate by more than 30% compared to audio-only speech recognition when the Signal-to-Noise Ratio drops to 0 dB.
Source: Computer Science; 2018, 19 (1); 41-63
1508-2806
2300-7036
Appears in: Computer Science
Content provider: Biblioteka Nauki

Article

Title: A Study on the Impact of Lombard Effect on Recognition of Hindi Syllabic Units Using CNN Based Multimodal ASR Systems
Authors: Uma Maheswari, Sadasivam
Shahina, A.
Rishickesh, Ramesh
Nayeemulla Khan, A.
Link: https://bibliotekanauki.pl/articles/176415.pdf
Publication date: 2020
Publisher: Polska Akademia Nauk. Czasopisma i Monografie PAN
Subjects: Lombard speech
multimodal ASR
throat microphone
visual speech
Convolutional Neural Network
Hidden Markov Model
late fusion
intermediate fusion
Description: Research on the design of robust multimodal speech recognition systems that use acoustic and visual cues, extracted with relatively noise-robust alternate speech sensors, has recently been gaining interest in the speech processing research community. The primary objective of this work is to study the exclusive influence of the Lombard effect on the automatic recognition of confusable syllabic consonant-vowel units of the Hindi language, as a step towards building robust multimodal ASR systems for adverse environments in the context of Indian languages, which are syllabic in nature. The dataset comprises 145 confusable consonant-vowel (CV) syllabic units of Hindi recorded simultaneously using three modalities that capture the acoustic and visual speech cues: a normal acoustic microphone (NM), a throat microphone (TM), and a camera that captures the associated lip movements. The Lombard effect is induced by feeding crowd noise into the speaker’s headphones while recording. Convolutional Neural Network (CNN) models are built to categorise the CV units based on their place of articulation (POA), manner of articulation (MOA), and vowels (under clean and Lombard conditions). For validation purposes, corresponding Hidden Markov Models (HMM) are also built and tested. Unimodal Automatic Speech Recognition (ASR) systems built using each of the three speech cues from Lombard speech show a loss in recognition of MOA and vowels, while POA recognition improves in all the systems due to the Lombard effect. Combining the three complementary speech cues to build bimodal and trimodal ASR systems reduces the recognition loss due to the Lombard effect for MOA and vowels compared to the unimodal systems, while POA recognition remains better due to the Lombard effect. A bimodal system is proposed using only alternate acoustic and visual cues, which gives better discrimination of the place and manner of articulation than even the standard ASR system. Among the multimodal ASR systems studied, the proposed trimodal system based on Lombard speech gives the best recognition accuracy of 98%, 95%, and 76% for the vowels, MOA and POA, respectively, with an average improvement of 36% over the unimodal ASR systems and a 9% improvement over the bimodal ASR systems.
Source: Archives of Acoustics; 2020, 45, 3; 419-431
0137-5075
Appears in: Archives of Acoustics
Content provider: Biblioteka Nauki

Article

Title: An exploratory case study to investigate perceived pronunciation errors in Thai primary school students using audio-visual speech recognition
Authors: Graham, Steven
Link: https://bibliotekanauki.pl/articles/2087181.pdf
Publication date: 2021
Publisher: Uniwersytet Marii Curie-Skłodowskiej w Lublinie. IATEFL Poland Computer Special Interest Group
Subjects: Speech recognition software
audio visual
English
computer assisted language learning
Description: An exploratory case study was conducted at a small rural school in the northeast of Thailand to investigate the pronunciation errors that primary school students make when reading English aloud. This paper illustrates the opportunities and challenges of employing speech recognition software in rural classrooms by using it with specifically designed audio-visual materials based on the Thai curriculum to identify English language reading and pronunciation difficulties. A comparison is made between this study and published literature.
Source: Teaching English with Technology; 2021, 21, 4; 3-18
1642-1027
Appears in: Teaching English with Technology
Content provider: Biblioteka Nauki

Article

Title: Role of Visual Speech Cues (Cued Speech) in Foreign Language Learning by Hearing School-Age Children
Authors: Grabowska-Chenczke, Olga
Francuz, Piotr
Bałaj, Bibianna
Link: https://bibliotekanauki.pl/articles/31343465.pdf
Publication date: 2023
Publisher: Katolicki Uniwersytet Lubelski Jana Pawła II. Towarzystwo Naukowe KUL
Subjects: speech perception
foreign language learning
auditory distraction
visual speech cues
Cued Speech
Description: In this study, we aimed to determine the role of visual speech cues in the process of foreign language learning by hearing school-age children. Our experiments used Cued Speech, a method designed for people who are deaf or hard of hearing. We expected that the principles of the method might also be beneficial for people with normal hearing because they may help distinguish the sounds of foreign speech that are difficult to hear. This study mainly focused on the effects of speech perception. We tested 126 Polish junior high school students (66 girls and 60 boys) with a normal range of phonemic hearing and language aptitude. We envisaged that foreign language learners using visual speech cues would achieve a higher score on a test of foreign language than learners who had studied the language in the traditional manner. We also formulated a hypothesis concerning the interaction of training type and training conditions on the effectiveness of foreign language learning: that the difference in the effects of foreign language learning between participants who received visual or executive training and typical training would be more significant in the presence of auditory distractors than in their absence. We observed interactions between conditions and types of training for speech sound identification. Under conditions of auditory distraction, foreign language learners using Cued Speech scored significantly higher than learners who had traditional training.
Source: Roczniki Psychologiczne; 2023, 26, 3; 215-240
1507-7888
Appears in: Roczniki Psychologiczne
Content provider: Biblioteka Nauki

Article
