Search results for the phrase "attention mechanism" by criterion: Subject

Displaying 1-9 of 9
Title:
Attention-based deep learning model for Arabic handwritten text recognition
Authors:
Gader, Takwa Ben Aïcha
Echi, Afef Kacem
Links:
https://bibliotekanauki.pl/articles/2201264.pdf
Publication date:
2022
Publisher:
Szkoła Główna Gospodarstwa Wiejskiego w Warszawie. Instytut Informatyki Technicznej
Subjects:
Arabic handwriting recognition
attention mechanism
BLSTM
CNN
CTC
RNN
Description:
This work proposes a segmentation-free approach to Arabic Handwritten Text Recognition (AHTR): an attention-based Convolutional Neural Network - Recurrent Neural Network - Connectionist Temporal Classification (CNN-RNN-CTC) deep learning architecture. The model receives an image as input and provides, through a CNN, a sequence of essential features, which are transferred to an attention-based Bidirectional Long Short-Term Memory network (BLSTM). The BLSTM outputs the feature sequence in order, and the attention mechanism selects relevant information from the feature sequences. The selected information is then fed to the CTC, enabling the loss calculation and the transcription prediction. The contribution lies in extending the CNN with dropout layers, batch normalization, and dropout regularization parameters to prevent over-fitting. The output of the RNN block is passed through an attention mechanism to utilize the most relevant parts of the input sequence in a flexible manner. This solution enhances previous methods by improving the CNN speed and performance and controlling model over-fitting. The proposed system achieves a best accuracy of 97.1% on the IFN-ENIT Arabic script database, which is competitive with the current state of the art. It was also tested on the modern English handwriting of the IAM database, attaining a Character Error Rate of 2.9%, which confirms the model's script independence.
Source:
Machine Graphics & Vision; 2022, 31, 1/4; 49-73
1230-0535
2720-250X
Appears in:
Machine Graphics & Vision
Content provider:
Biblioteka Nauki
Article
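Note on the Character Error Rate quoted in the record above: CER is commonly defined as the edit (Levenshtein) distance between the predicted and reference transcriptions divided by the reference length. The sketch below illustrates that common definition only; the record does not say how the authors computed their figure, and the function names are hypothetical.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: insertions, deletions and substitutions each cost 1."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete r from the reference
                            curr[j - 1] + 1,      # insert h into the reference
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_error_rate(ref: str, hyp: str) -> float:
    """CER = edit distance / number of reference characters (assumed definition)."""
    if not ref:
        raise ValueError("reference transcription must be non-empty")
    return edit_distance(ref, hyp) / len(ref)
```

For example, one wrong character in a four-character reference gives a CER of 0.25.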
Title:
Mask face inpainting based on improved generative adversarial network
Authors:
Liu, Qingyu
Juanatas, Roben A.
Links:
https://bibliotekanauki.pl/articles/30148248.pdf
Publication date:
2023
Publisher:
Polskie Towarzystwo Promocji Wiedzy
Subjects:
face inpainting
generative adversarial network
residual network
attention mechanism
Description:
Face recognition technology has been widely used in all aspects of people's lives. However, the accuracy of face recognition is greatly reduced when faces are obscured by objects such as masks and sunglasses. Wearing masks in public has been a crucial approach to preventing illness, especially since the Covid-19 outbreak. This poses challenges to applications such as face recognition. Therefore, the removal of masks via image inpainting has become a hot topic in the field of computer vision. Deep learning-based image inpainting techniques have achieved notable results, but the restored images still have problems such as blurring and inconsistency. To address such problems, this paper proposes an improved inpainting model based on a generative adversarial network: building on the pix2pix network, the model adds attention mechanisms to the sampling module, and the residual module is improved by adding convolutional branches. The improved inpainting model can not only effectively restore faces obscured by face masks, but also inpaint randomly obscured images of human faces. To further validate the generality of the inpainting model, tests are conducted on the CelebA, Paris Street and Place2 datasets, and the experimental results show that both SSIM and PSNR have improved significantly.
Source:
Applied Computer Science; 2023, 19, 2; 25-42
1895-3735
2353-6977
Appears in:
Applied Computer Science
Content provider:
Biblioteka Nauki
Article
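Note on the PSNR metric named in the record above: peak signal-to-noise ratio is usually computed as 10 * log10(MAX^2 / MSE) between an original and a restored image. The sketch below shows that standard formula on flat lists of pixel intensities. The function name, the flat-list representation and the 8-bit peak value of 255 are assumptions made for illustration, not details taken from the paper.

```python
import math


def psnr(original, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    if len(original) != len(restored) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(original, restored)) / len(original)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher values indicate a restored image closer to the original; SSIM, the other metric in the abstract, measures structural similarity and is not covered by this sketch.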
Title:
Aircraft Bleed Air System Fault Prediction based on Encoder-Decoder with Attention Mechanism
Authors:
Su, Siyu
Sun, Youchao
Peng, Chong
Wang, Yifan
Links:
https://bibliotekanauki.pl/articles/27312776.pdf
Publication date:
2023
Publisher:
Polska Akademia Nauk. Polskie Naukowo-Techniczne Towarzystwo Eksploatacyjne PAN
Subjects:
bleed air system
fault prediction
attention mechanism
deep learning
EWMA control chart
Description:
The engine bleed air system (BAS) is one of the important systems for civil aircraft, and fault prediction of BAS is necessary to improve aircraft safety and the operator's profit. A dual-stage two-phase attention-based encoder-decoder (DSTP-ED) prediction model is proposed for BAS normal state estimation. Unlike traditional ED networks, the DSTP-ED combines spatial and temporal attention to better capture spatiotemporal relationships and achieve higher prediction accuracy. Five data-driven algorithms, autoregressive integrated moving average (ARIMA), support vector regression (SVR), long short-term memory (LSTM), ED, and DSTP-ED, are applied to build prediction models for BAS. The comparison experiments show that the DSTP-ED model outperforms the other four data-driven models. An exponentially weighted moving average (EWMA) control chart is used as the evaluation criterion for the BAS failure warning. An empirical study based on Quick Access Recorder (QAR) data from Airbus A320 series aircraft demonstrates that the proposed method can effectively predict failures.
Source:
Eksploatacja i Niezawodność; 2023, 25, 3; art. no. 167792
1507-2711
Appears in:
Eksploatacja i Niezawodność
Content provider:
Biblioteka Nauki
Article
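Note on the EWMA control chart mentioned in the record above: a textbook EWMA chart smooths a monitored series with z_t = lam * x_t + (1 - lam) * z_(t-1) and raises a warning when z_t leaves time-varying control limits around the in-control mean. The sketch below follows that textbook form. Treating the monitored series as prediction residuals, the parameter values `lam=0.2` and `width=3.0`, and the function name are all assumptions; the abstract does not give the authors' settings.

```python
import math


def ewma_chart(values, mean, sigma, lam=0.2, width=3.0):
    """Return one (z, lower, upper, out_of_control) tuple per sample.

    mean and sigma describe the in-control behaviour of the monitored series.
    """
    z = mean  # the EWMA statistic starts at the in-control mean
    rows = []
    for t, x in enumerate(values, start=1):
        z = lam * x + (1 - lam) * z
        # Exact (time-varying) half-width of the control band at step t.
        half = width * sigma * math.sqrt(
            lam / (2 - lam) * (1 - (1 - lam) ** (2 * t))
        )
        rows.append((z, mean - half, mean + half, abs(z - mean) > half))
    return rows
```

A smaller `lam` weights history more heavily and makes the chart more sensitive to small, sustained drifts than to single spikes.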
Title:
Intelligent fault diagnosis of rolling bearings based on continuous wavelet transform-multiscale feature fusion and improved channel attention mechanism
Authors:
Zhang, Jiqiang
Kong, Xiangwei
Cheng, Liu
Qi, Haochen
Yu, Mingzhu
Links:
https://bibliotekanauki.pl/articles/24200817.pdf
Publication date:
2023
Publisher:
Polska Akademia Nauk. Polskie Naukowo-Techniczne Towarzystwo Eksploatacyjne PAN
Subjects:
deep learning
continuous wavelet transform
improved channel attention mechanism
multi-conditions
convolutional neural network
Description:
Accurate fault diagnosis is critical to operating rotating machinery safely and efficiently. Traditional fault information description methods rely on experts to extract statistical features, which inevitably leads to information loss. This paper therefore proposes an intelligent fault diagnosis method for rolling bearings based on continuous wavelet transform (CWT), multiscale feature fusion, and an improved channel attention mechanism. Unlike traditional CNN approaches, the CWT converts the 1-D signals into 2-D images and extracts the wavelet power spectrum, which aids model recognition. Multiscale feature fusion is implemented with parallel 2-D convolutional neural networks to achieve deeper feature fusion. Meanwhile, the channel attention mechanism is improved by changing the excitation block from a compressing to an expanding design to better obtain the evaluation score of each channel. The proposed model has been validated using two bearing datasets, and the results show that it has excellent accuracy compared with existing methods.
Source:
Eksploatacja i Niezawodność; 2023, 25, 1; art. no. 16
1507-2711
Appears in:
Eksploatacja i Niezawodność
Content provider:
Biblioteka Nauki
Article
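Note on the channel attention described in the record above: the abstract's "excitation block" suggests a squeeze-and-excitation style design, in which each channel is pooled to one number, passed through two small fully connected layers, and rescaled by the resulting score. The sketch below is a generic version of that idea, not the authors' network. Reading "expanding" as a hidden layer wider than the number of channels is an assumption, and the function name and weight matrices `w1` and `w2` are hypothetical.

```python
import math


def channel_attention(feature_maps, w1, w2):
    """Squeeze-and-excitation style channel reweighting.

    feature_maps: list of C channels, each a list of rows of floats.
    w1: H x C weights (H > C gives an expanding hidden layer).
    w2: C x H weights mapping the hidden layer back to one score per channel.
    """
    # Squeeze: global average pooling yields one descriptor per channel.
    squeezed = [
        sum(sum(row) for row in ch) / sum(len(row) for row in ch)
        for ch in feature_maps
    ]
    # Excitation: fully connected layer with ReLU, then one with sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    scores = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's score.
    scaled = [
        [[v * s for v in row] for row in ch]
        for ch, s in zip(feature_maps, scores)
    ]
    return scores, scaled
```

In a trained network `w1` and `w2` are learned; here they are plain nested lists so the data flow stays visible.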
Title:
MFFNet: A multi-frequency feature extraction and fusion network for visual processing
Authors:
Deng, Jinsheng
Zhang, Zhichao
Yin, Xiaoqing
Links:
https://bibliotekanauki.pl/articles/2173678.pdf
Publication date:
2022
Publisher:
Polska Akademia Nauk. Czytelnia Czasopism PAN
Subjects:
deblurring
multi-feature fusion
deep learning
attention mechanism
Description:
Convolutional neural networks have achieved tremendous success in the areas of image processing and computer vision. However, they experience problems with low-frequency information such as semantic and category content and background color, and high-frequency information such as edge and structure. We propose an efficient and accurate deep learning framework called the multi-frequency feature extraction and fusion network (MFFNet) to perform image processing tasks such as deblurring. MFFNet is aided by edge and attention modules to restore high-frequency information and overcomes the multiscale parameter problem and the low-efficiency issue of recurrent architectures. It handles information from multiple paths and extracts features such as edges, colors, positions, and differences. Then, edge detectors and attention modules are aggregated into units to refine and learn knowledge, and efficient multi-learning features are fused into a final perception result. Experimental results indicate that the proposed framework achieves state-of-the-art deblurring performance on benchmark datasets.
Source:
Bulletin of the Polish Academy of Sciences. Technical Sciences; 2022, 70, 3; art. no. e140466
0239-7528
Appears in:
Bulletin of the Polish Academy of Sciences. Technical Sciences
Content provider:
Biblioteka Nauki
Article
Title:
Spatiotemporal attention mechanism-based multistep traffic volume prediction model for highway toll stations
Authors:
Huang, Zijing
Lin, Peiqun
Lin, Xukun
Zhou, Chuhao
Huang, Tongge
Links:
https://bibliotekanauki.pl/articles/2124715.pdf
Publication date:
2022
Publisher:
Polska Akademia Nauk. Czytelnia Czasopism PAN
Subjects:
ITS
traffic volume forecasting
attention mechanism
highway toll station
model interpretation
Description:
As a fundamental part of other Intelligent Transportation Systems (ITS) applications, short-term traffic volume prediction plays an important role in various intelligent transportation tasks, such as traffic management, traffic signal control and route planning. Although neural-network-based traffic prediction methods can produce good results, most of the models cannot be explained in an intuitive way. In this paper, we propose a spatiotemporal attention mechanism-based multistep traffic volume prediction model (SAMM) that not only increases the short-term prediction accuracy of the traffic volume but also improves interpretability through analysis of the internal attention scores learned by the model. Inside the model, an LSTM-based Encoder-Decoder network with a hybrid attention mechanism is introduced, which consists of spatial attention and temporal attention. In the first level, the local and global spatial attention mechanisms, which consider the micro traffic evolution and macro pattern similarity, respectively, are applied to capture and amplify the features from the highly correlated entrance stations. In the second level, a temporal attention mechanism is employed to amplify the features from the time steps identified as contributing more to the future exit volume. Considering the time-dependent characteristics and the continuity of the recent evolutionary traffic volume trend, the timestamp features and historical exit volume series of target stations are included as external inputs. An experiment is conducted using data from the highway toll collection system of Guangdong Province, China. By extracting and analyzing the weights of the spatial and temporal attention layers, the contributions of the intermediate parameters are revealed and explained with knowledge acquired from historical statistics. The results show that the proposed model outperforms the state-of-the-art model by 29.51% in terms of MSE, 13.93% in terms of MAE, and 5.69% in terms of MAPE. The effectiveness of the Encoder-Decoder framework and the attention mechanism is also verified.
Source:
Archives of Transport; 2022, 61, 1; 21-38
0866-9546
2300-8830
Appears in:
Archives of Transport
Content provider:
Biblioteka Nauki
Article
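Note on the temporal attention described in the record above: a common way to "amplify" some time steps is to score each encoder hidden state against a query, turn the scores into weights with a softmax, and take the weighted sum as a context vector. Those weights are what can later be inspected for interpretation. The sketch below uses plain dot-product scoring as an assumption; the abstract does not specify SAMM's scoring function, and the function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_attention(hidden_states, query):
    """Weight T hidden-state vectors by their dot product with a query vector.

    Returns (weights, context): one weight per time step, and the weighted
    sum of the hidden states.
    """
    scores = [sum(h * q for h, q in zip(state, query)) for state in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [
        sum(w * state[d] for w, state in zip(weights, hidden_states))
        for d in range(dim)
    ]
    return weights, context
```

Because the weights sum to 1, a time step with a larger weight contributes proportionally more to the context vector, which is the property an attention-score analysis relies on.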
Title:
Specific emitter identification based on one-dimensional complex-valued residual networks with an attention mechanism
Authors:
Qu, Lingzhi
Yang, Junan
Huang, Keju
Liu, Hui
Links:
https://bibliotekanauki.pl/articles/2086889.pdf
Publication date:
2021
Publisher:
Polska Akademia Nauk. Czytelnia Czasopism PAN
Subjects:
complex-valued residual network
specific emitter identification
fingerprint characteristic
attention mechanism
one-dimensional convolution
Description:
Specific emitter identification (SEI) can distinguish single-radio transmitters using the subtle features of the received waveform. Therefore, it is used extensively in both military and civilian fields. However, the traditional identification method requires extensive prior knowledge and is time-consuming. Furthermore, it suffers from various adverse effects when identifying communication radiation source signals in complex environments. To solve the problem of the weak robustness of the hand-crafted feature method, many researchers have used deep learning for image identification in the field of radiation source identification. However, the classification method based on a real-valued neural network cannot extract In-phase/Quadrature (I/Q)-related information from electromagnetic signals. To address these shortcomings, this paper proposes a new SEI framework for deep learning structures. In the proposed framework, a complex-valued residual network structure is first used to mine the relevant information between the in-phase and quadrature components of the radio frequency baseband signal. Then, a one-dimensional convolution layer is used to a) directly extract the features of a specific one-dimensional time-domain signal sequence, b) use the attention mechanism unit to identify the extracted features, and c) weight them according to their importance. Experiments show that the proposed framework, which combines complex-valued residual networks with an attention mechanism, offers high accuracy and superior performance in identifying communication radiation source signals.
Source:
Bulletin of the Polish Academy of Sciences. Technical Sciences; 2021, 69, 5; art. no. e138814, 1-10
0239-7528
Appears in:
Bulletin of the Polish Academy of Sciences. Technical Sciences
Content provider:
Biblioteka Nauki
Article
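Note on why complex-valued layers can capture I/Q-related information, as claimed in the record above: a complex multiplication mixes the in-phase (real) and quadrature (imaginary) parts of a sample, whereas a real-valued layer that treats I and Q as separate inputs does not have this coupling built in. The sketch below shows a one-dimensional complex-valued convolution on I/Q samples using Python's built-in complex type. It illustrates the general operation only; the function names are hypothetical and nothing here reproduces the authors' network.

```python
def complex_conv1d(signal, kernel):
    """Valid-mode 1-D sliding dot product on complex samples.

    As in most deep learning layers, the kernel is not flipped
    (this is cross-correlation without conjugation).
    """
    n, k = len(signal), len(kernel)
    if k == 0 or k > n:
        raise ValueError("kernel must be non-empty and no longer than the signal")
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(n - k + 1)
    ]


def iq_product(x, w):
    """The same complex product written out in real arithmetic on I/Q parts."""
    real = x.real * w.real - x.imag * w.imag  # I*I' - Q*Q'
    imag = x.real * w.imag + x.imag * w.real  # I*Q' + Q*I'
    return complex(real, imag)
```

`iq_product` makes the cross terms explicit: each output component depends on both the I and the Q component of the input.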
Title:
Speech emotion recognition using wavelet packet reconstruction with attention-based deep recurrent neutral networks
Authors:
Meng, Hao
Yan, Tianhao
Wei, Hongwei
Ji, Xun
Links:
https://bibliotekanauki.pl/articles/2173587.pdf
Publication date:
2021
Publisher:
Polska Akademia Nauk. Czytelnia Czasopism PAN
Subjects:
speech emotion recognition
voice activity detection
wavelet packet reconstruction
feature extraction
LSTM networks
attention mechanism
Description:
Speech emotion recognition (SER) is a complicated and challenging task in human-computer interaction because it is difficult to find the best feature set to discriminate the emotional state entirely. The FFT is commonly used to process the raw signal when extracting low-level descriptor features, such as short-time energy, fundamental frequency, formant, MFCC (mel frequency cepstral coefficient) and so on. However, these features are built in the frequency domain and ignore information from the temporal domain. In this paper, we propose a novel framework that combines a multi-layer wavelet sequence set from wavelet packet reconstruction (WPR) with a conventional feature set to form a mixed feature set for emotion recognition with recurrent neural networks (RNN) based on the attention mechanism. In addition, silent frames have a detrimental effect on SER, so we adopt autocorrelation-based voice activity detection to eliminate the emotionally irrelevant frames. We show that the proposed algorithm significantly outperforms traditional feature sets in the prediction of spontaneous emotional states on the IEMOCAP corpus and EMODB database respectively, and we achieve better classification in both speaker-independent and speaker-dependent experiments. Notably, we obtain accuracies of 62.52% and 77.57% in the speaker-independent (SI) experiments and 66.90% and 82.26% in the speaker-dependent (SD) experiments.
Source:
Bulletin of the Polish Academy of Sciences. Technical Sciences; 2021, 69, 1; art. no. e136300
0239-7528
Appears in:
Bulletin of the Polish Academy of Sciences. Technical Sciences
Content provider:
Biblioteka Nauki
Article
Title:
Speech emotion recognition using wavelet packet reconstruction with attention-based deep recurrent neutral networks
Authors:
Meng, Hao
Yan, Tianhao
Wei, Hongwei
Ji, Xun
Links:
https://bibliotekanauki.pl/articles/2090711.pdf
Publication date:
2021
Publisher:
Polska Akademia Nauk. Czytelnia Czasopism PAN
Subjects:
speech emotion recognition
voice activity detection
wavelet packet reconstruction
feature extraction
LSTM networks
attention mechanism
Description:
Speech emotion recognition (SER) is a complicated and challenging task in human-computer interaction because it is difficult to find the best feature set to discriminate the emotional state entirely. The FFT is commonly used to process the raw signal when extracting low-level descriptor features, such as short-time energy, fundamental frequency, formant, MFCC (mel frequency cepstral coefficient) and so on. However, these features are built in the frequency domain and ignore information from the temporal domain. In this paper, we propose a novel framework that combines a multi-layer wavelet sequence set from wavelet packet reconstruction (WPR) with a conventional feature set to form a mixed feature set for emotion recognition with recurrent neural networks (RNN) based on the attention mechanism. In addition, silent frames have a detrimental effect on SER, so we adopt autocorrelation-based voice activity detection to eliminate the emotionally irrelevant frames. We show that the proposed algorithm significantly outperforms traditional feature sets in the prediction of spontaneous emotional states on the IEMOCAP corpus and EMODB database respectively, and we achieve better classification in both speaker-independent and speaker-dependent experiments. Notably, we obtain accuracies of 62.52% and 77.57% in the speaker-independent (SI) experiments and 66.90% and 82.26% in the speaker-dependent (SD) experiments.
Source:
Bulletin of the Polish Academy of Sciences. Technical Sciences; 2021, 69, 1; art. no. e136300, 1-12
0239-7528
Appears in:
Bulletin of the Polish Academy of Sciences. Technical Sciences
Content provider:
Biblioteka Nauki
Article