Subject: Enhancement - OPAC collection catalog

Title: Speech Enhancement Based on the Multi-Scales and Multi-Thresholds of the Auditory Perception Wavelet Transform
Authors: Tao, Z.
Zhao, H. M.
Zhang, X-J.
Wu, D.
Links: https://bibliotekanauki.pl/articles/177021.pdf
Publication date: 2011
Publisher: Polska Akademia Nauk. Czytelnia Czasopism PAN
Subjects: speech enhancement
low SNR
auditory perception wavelet transform
unvoiced enhancement
masking effect
Abstract: This paper proposes a speech enhancement method using the multi-scales and multi-thresholds of the auditory perception wavelet transform, which is suitable for a low SNR (signal-to-noise ratio) environment. The method reduces noise by thresholding the auditory perception wavelet transform parameters of a speech signal according to the auditory masking effect of the human ear. At the same time, to prevent high-frequency loss during noise suppression, we first make a voicing decision on the speech signal. We then process the unvoiced and voiced segments with different thresholds and different judgments. Lastly, we perform objective and subjective tests on the enhanced speech. The results show that, compared with other spectral subtraction methods, our method keeps the components of unvoiced sound intact while suppressing the residual noise and the background noise. Thus, the enhanced speech has better clarity and intelligibility.
Source: Archives of Acoustics; 2011, 36, 3; 519-532
0137-5075
Appears in: Archives of Acoustics
Content provider: Biblioteka Nauki

Article


Title: A robust generalized sidelobe canceller employing speech leakage masking
Skuteczny tłumik listków bocznych z wykorzystaniem maskowania przecieku mowy
Authors: Borowicz, A.
Links: https://bibliotekanauki.pl/articles/88400.pdf
Publication date: 2014
Publisher: Politechnika Białostocka. Oficyna Wydawnicza Politechniki Białostockiej
Subjects: GSC
psychoacoustics
speech enhancement
Abstract: A novel speech enhancement method based on the generalized sidelobe canceller (GSC) structure is presented. We show that it is possible to reduce audible speech distortions and preserve a constant residual noise level under acoustic model uncertainties. This can be done by constraining the speech leakage power according to masking phenomena and conditionally minimizing the residual noise power. We implemented the proposed approach using a simple delay-and-sum beamformer model. Finally, a comparative evaluation of the selected methods is performed using objective speech quality measures. The results show that the novel method outperforms the conventional one, providing lower speech distortions.
Source: Advances in Computer Science Research; 2014, 11; 17-29
2300-715X
Appears in: Advances in Computer Science Research
Content provider: Biblioteka Nauki

Article


Title: Speech Intelligibility in Rooms with and without an Induction Loop for Hearing Aid Users
Authors: Kociński, J.
Ozimek, E.
Links: https://bibliotekanauki.pl/articles/177905.pdf
Publication date: 2015
Publisher: Polska Akademia Nauk. Czytelnia Czasopism PAN
Subjects: speech intelligibility
induction loop
hearing impairment
speech enhancement
Abstract: The paper presents the results of sentence and logatome speech intelligibility measured in rooms with an induction loop for hearing aid users. Two rooms with different acoustic parameters were chosen. Twenty-two subjects with mild, moderate and severe hearing impairment using hearing aids took part in the experiment. The intelligibility tests, composed of sentences or logatomes, were presented to the subjects at fixed measurement points of an enclosure. It was shown that a sentence test is a more useful tool for speech intelligibility measurements in a room than a logatome test. It was also shown that an induction loop is a very efficient system for improving speech intelligibility. Additionally, the questionnaire data showed that the induction loop, apart from improving speech intelligibility, increased the subjects’ general satisfaction with speech perception.
Source: Archives of Acoustics; 2015, 40, 1; 51-58
0137-5075
Appears in: Archives of Acoustics
Content provider: Biblioteka Nauki

Article


Title: Two-Microphone Dereverberation for Automatic Speech Recognition of Polish
Authors: Kundegorski, M.
Jackson, P. J. B.
Ziółko, B.
Links: https://bibliotekanauki.pl/articles/176431.pdf
Publication date: 2014
Publisher: Polska Akademia Nauk. Czytelnia Czasopism PAN
Subjects: speech enhancement
reverberation
automatic speech recognition
ASR
Polish
Abstract: Reverberation is a common problem for many speech technologies, such as automatic speech recognition (ASR) systems. This paper investigates the novel combination of precedence, binaural and statistical independence cues for enhancing reverberant speech, prior to ASR, under these adverse acoustical conditions when two microphone signals are available. Results of the enhancement are evaluated in terms of relevant signal measures and accuracy for both English and Polish ASR tasks. These show inconsistencies between the signal and recognition measures, although in recognition the proposed method consistently outperforms all other combinations and the spectral-subtraction baseline.
Source: Archives of Acoustics; 2014, 39, 3; 411-420
0137-5075
Appears in: Archives of Acoustics
Content provider: Biblioteka Nauki

Article


Title: Speech Enhancement Based on Constrained Low-rank Sparse Matrix Decomposition Integrated with Temporal Continuity Regularisation
Authors: Sun, Chengli
Yuan, Conglin
Links: https://bibliotekanauki.pl/articles/178075.pdf
Publication date: 2019
Publisher: Polska Akademia Nauk. Czasopisma i Monografie PAN
Subjects: speech enhancement
temporal continuity
low-rank decomposition
sparse decomposition
Abstract: Speech enhancement in strong noise conditions is a challenging problem. Low-rank and sparse matrix decomposition (LSMD) theory has recently been applied to speech enhancement with good performance. Existing LSMD algorithms consider each frame as an individual observation. However, real-world speech usually has a temporal structure, and its acoustic characteristics vary slowly as a function of time. In this paper, we propose a temporal continuity constrained low-rank sparse matrix decomposition (TCCLSMD) based speech enhancement method. In this method, speech separation is formulated as a TCCLSMD problem and temporal continuity constraints are imposed in the LSMD process. We develop an alternative optimisation algorithm for noisy spectrogram decomposition. By means of TCCLSMD, the recovered speech spectrogram is more consistent with the structure of the clean speech spectrogram, which can lead to more stable and reasonable results than the existing LSMD algorithm. Experiments with various types of noise show that the proposed algorithm can achieve better performance than traditional speech enhancement algorithms, yielding less residual noise and lower speech distortion.
Source: Archives of Acoustics; 2019, 44, 4; 681-692
0137-5075
Appears in: Archives of Acoustics
Content provider: Biblioteka Nauki

Article


Title: Speech emotion recognition under white noise
Authors: Huang, C.
Chen, G.
Yu, H.
Bao, Y.
Zhao, L.
Links: https://bibliotekanauki.pl/articles/177301.pdf
Publication date: 2013
Publisher: Polska Akademia Nauk. Czytelnia Czasopism PAN
Subjects: speech emotion recognition
speech enhancement
emotion model
Gaussian mixture model
Abstract: Speakers’ emotional states are recognized from speech signals corrupted by additive white Gaussian noise (AWGN). The influence of white noise on a typical emotion recognition system is studied. The emotion classifier is implemented with a Gaussian mixture model (GMM). A Chinese speech emotion database is used for training and testing, which includes nine emotion classes (happiness, sadness, anger, surprise, fear, anxiety, hesitation, confidence and neutral state). Two speech enhancement algorithms are introduced for improved emotion classification. In the experiments, the Gaussian mixture model is trained on the clean speech data and tested under AWGN with various signal-to-noise ratios (SNRs). The emotion class model and the dimension space model are both adopted for the evaluation of the emotion recognition system. For the emotion class model, the nine emotion classes are classified. For the dimension space model, the arousal dimension and the valence dimension are classified into positive or negative regions. The experimental results show that the speech enhancement algorithms consistently improve the performance of our emotion recognition system under various SNRs, and that positive emotions are more likely to be misclassified as negative emotions in a white-noise environment.
Source: Archives of Acoustics; 2013, 38, 4; 457-463
0137-5075
Appears in: Archives of Acoustics
Content provider: Biblioteka Nauki

Article


Title: Incoherent Discriminative Dictionary Learning for Speech Enhancement
Authors: Shaheen, D.
Dakkak, O. A.
Wainakh, M.
Links: https://bibliotekanauki.pl/articles/308116.pdf
Publication date: 2018
Publisher: Instytut Łączności - Państwowy Instytut Badawczy
Subjects: ADMM
l1 minimization algorithms
sparse coding
speech enhancement
supervised dictionary learning
Abstract: Speech enhancement is one of the many challenging tasks in signal processing, especially in the case of nonstationary speech-like noise. In this paper a new incoherent discriminative dictionary learning algorithm is proposed to model both speech and noise, where the cost function accounts for both “source confusion” and “source distortion” errors, with a regularization term that penalizes the coherence between speech and noise sub-dictionaries. At the enhancement stage, we use sparse coding on the learnt dictionary to find an estimate for both clean speech and noise amplitude spectrum. In the final phase, the Wiener filter is used to refine the clean speech estimate. Experiments on the NOIZEUS dataset, using two objective speech enhancement measures, frequency-weighted segmental SNR and Perceptual Evaluation of Speech Quality (PESQ), demonstrate that the proposed algorithm outperforms the other speech enhancement methods tested.
Source: Journal of Telecommunications and Information Technology; 2018, 3; 42-54
1509-4553
1899-8852
Appears in: Journal of Telecommunications and Information Technology
Content provider: Biblioteka Nauki

Article


Title: Speech Enhancement Based on Discrete Wavelet Packet Transform and Itakura-Saito Nonnegative Matrix Factorisation
Authors: Liu, Houguang
Wang, Wenbo
Xue, Lin
Yang, Jianhua
Wang, Zhihua
Hua, Chunli
Links: https://bibliotekanauki.pl/articles/1448505.pdf
Publication date: 2020
Publisher: Polska Akademia Nauk. Czasopisma i Monografie PAN
Subjects: speech enhancement
discrete wavelet packet transform
nonnegative matrix factorisation
Itakura-Saito divergence
Abstract: Nonnegative matrix factorisation (NMF) is one of the most popular machine learning tools for speech enhancement (SE). However, two problems reduce the performance of traditional NMF-based SE algorithms. One is related to the overlap-and-add operation used in short-time Fourier transform (STFT) based signal reconstruction, and the other is the Euclidean distance commonly used as an objective function; both can cause distortion in the SE process. To overcome these shortcomings, we propose a novel SE joint framework which combines the discrete wavelet packet transform (DWPT) and the Itakura-Saito nonnegative matrix factorisation (ISNMF). In this approach, the speech signal is first split into a series of subband signals using the DWPT. Then, the ISNMF is used to enhance the speech in each subband signal. Finally, the inverse DWPT (IDWT) is utilised to reconstruct these enhanced speech subband signals. The experimental results show that the proposed joint framework effectively improves speech enhancement performance and performs better in the unseen noise case than traditional NMF methods.
Source: Archives of Acoustics; 2020, 45, 4; 565-572
0137-5075
Appears in: Archives of Acoustics
Content provider: Biblioteka Nauki

Article


Title: Deep Neural Network for Supervised Single-Channel Speech Enhancement
Authors: Saleem, Nasir
Irfan Khattak, Muhammad
Ali, Muhammad Yousaf
Shafi, Muhammad
Links: https://bibliotekanauki.pl/articles/177497.pdf
Publication date: 2019
Publisher: Polska Akademia Nauk. Czasopisma i Monografie PAN
Subjects: deep neural network
intelligibility
speech enhancement
speech quality
supervised learning
Wiener filtering
Abstract: Speech enhancement is fundamental for various real-time speech applications, and it is a challenging task in the single-channel case because only one data channel is available in practice. In this paper, we propose a supervised single-channel speech enhancement algorithm based on a deep neural network (DNN) with less aggressive Wiener filtering as an additional DNN layer. During the training stage the network learns and predicts the magnitude spectra of the clean and noise signals from the acoustic features of the input noisy speech. Relative spectral transform-perceptual linear prediction (RASTA-PLP) is used in the proposed method to extract the acoustic features at the frame level. An autoregressive moving average (ARMA) filter is applied to smooth the temporal curves of the extracted features. The trained network predicts the coefficients to construct a ratio mask based on a mean square error (MSE) cost function. The less aggressive Wiener filter is placed as an additional layer on top of the DNN to produce an enhanced magnitude spectrum. Finally, the noisy speech phase is used to reconstruct the enhanced speech. The experimental results demonstrate that the proposed DNN framework with less aggressive Wiener filtering outperforms the competing speech enhancement methods in terms of speech quality and intelligibility.
Source: Archives of Acoustics; 2019, 44, 1; 3-12
0137-5075
Appears in: Archives of Acoustics
Content provider: Biblioteka Nauki

Article


Title: Wavelet Packet Transform based Speech Enhancement via Two-Dimensional SPP Estimator with Generalized Gamma Priors
Authors: Sun, P.
Qin, J.
Links: https://bibliotekanauki.pl/articles/177782.pdf
Publication date: 2016
Publisher: Polska Akademia Nauk. Czytelnia Czasopism PAN
Subjects: speech enhancement
speech presence probability
wavelet packet transform
two-dimensional Teager energy operator
Abstract: Although various speech enhancement techniques have been developed for different applications, existing methods are limited in noisy environments with high ambient noise levels. Speech presence probability (SPP) estimation is a speech enhancement technique for reducing speech distortions, especially in low signal-to-noise ratio (SNR) scenarios. In this paper, we propose a new SPP estimator, improved with a two-dimensional (2D) Teager energy operator (TEO), for speech enhancement in the time-frequency (T-F) domain. The wavelet packet transform (WPT), a multiband decomposition technique, is used to concentrate the energy distribution of speech components. A minimum mean-square error (MMSE) estimator is obtained based on the generalized gamma distribution speech model in the WPT domain. In addition, speech samples corrupted by environmental and occupational noise (i.e., machine shop, factory and station) at different input SNRs are used to validate the proposed algorithm. Results suggest that the proposed method achieves a significant improvement in perceptual quality compared with four conventional speech enhancement algorithms (i.e., MMSE-84, MMSE-04, Wiener-96, and BTW).
Source: Archives of Acoustics; 2016, 41, 3; 579-590
0137-5075
Appears in: Archives of Acoustics
Content provider: Biblioteka Nauki

Article


Title: Speech Enhancement Using Sliding Window Empirical Mode Decomposition and Hurst-based Technique
Authors: Poovarasan, Selvaraj
Chandra, Eswaran
Links: https://bibliotekanauki.pl/articles/176311.pdf
Publication date: 2019
Publisher: Polska Akademia Nauk. Czasopisma i Monografie PAN
Subjects: speech enhancement
Empirical Mode Decomposition
EMD
Intrinsic Mode Functions
hurst exponent
Sliding Window
SW
Abstract: The most challenging task in speech enhancement is tracking non-stationary noise over long speech segments and at low Signal-to-Noise Ratio (SNR). Different speech enhancement techniques have been proposed, but those techniques were inaccurate in tracking highly non-stationary noise. As a result, the Empirical Mode Decomposition and Hurst-based (EMDH) approach was proposed to enhance signals corrupted by non-stationary acoustic noise. Hurst exponent statistics were adopted for identifying and selecting the set of Intrinsic Mode Functions (IMF) that are most affected by the noise components. Moreover, the speech signal was reconstructed by considering the least corrupted IMF. Though this increases SNR, the time and resource consumption were high. It also requires a significant improvement under non-stationary noise scenarios. Hence, in this article, the EMDH approach is enhanced by using the Sliding Window (SW) technique. In this SWEMDH approach, the EMD is computed on a small window sliding along the time axis. The sliding window depends on the signal frequency band. Possible discontinuities in the IMF between windows are prevented by setting the total number of modes and the number of sifting iterations a priori. For each mode, the number of sifting iterations is determined by decomposing many signal windows with the standard algorithm and calculating the average number of sifting steps for that mode. Based on this approach, the time complexity is reduced significantly with suitable quality of decomposition. Finally, the experimental results show considerable improvements in speech enhancement under non-stationary noise environments.
Source: Archives of Acoustics; 2019, 44, 3; 429-437
0137-5075
Appears in: Archives of Acoustics
Content provider: Biblioteka Nauki

Article


Title: Adaptive Algorithms for Enhancement of Speech Subject to a High-Level Noise
Authors: Latos, M.
Pawełczyk, M.
Links: https://bibliotekanauki.pl/articles/178055.pdf
Publication date: 2010
Publisher: Polska Akademia Nauk. Czytelnia Czasopism PAN
Subjects: speech enhancement
adaptive system
line enhancer
LMS algorithm
high-level noise
nonstationary noise
earplug
active noise control
Abstract: Many industrial environments are exposed to high-level noise, sometimes much higher than the level of speech. Verbal communication is then practically unfeasible. To increase speech intelligibility, appropriate speech enhancement algorithms can be used. It is impossible to filter the noise completely out of the acquired signal with a conventional filter, for two reasons. First, the frequency contents of the speech and the noise overlap. Second, the noise properties are subject to change. An adaptive realisation of the Wiener-based approach can, however, be applied. Two structures are possible. One is the line enhancer, where the predictive realisation of the Wiener approach is used. The benefit of this structure is that it does not require additional apparatus. The second structure takes advantage of the high level of noise. Under such conditions, placing another microphone, even close to the primary one, can provide a reference signal that is well correlated with the noise disturbing the speech and lacks information about the speech. The classical Wiener filter can then be used to produce an estimate of the noise based on the reference signal. That noise estimate can then be subtracted from the disturbed speech. Both algorithms are verified on data obtained from a real industrial environment. For laboratory experiments, the G.R.A.S. artificial head and two microphones, one at the back side of an earplug and another at the mouth, are used.
Source: Archives of Acoustics; 2010, 35, 2; 203-212
0137-5075
Appears in: Archives of Acoustics
Content provider: Biblioteka Nauki

Article


Title: Enhancing speech signals based on an mems microphone array and temporal differences in the incoming signal
Authors: Felcyn, Jan
Raszewski, Michał
Links: https://bibliotekanauki.pl/articles/2202430.pdf
Publication date: 2022
Publisher: Politechnika Poznańska. Instytut Mechaniki Stosowanej
Subjects: microphone array
speech enhancement
direction of arrival
signal processing
Abstract: The development of the Internet of Things and automation in everyday life also influences our houses. There are more and more devices on the market which can be controlled remotely. One kind of such control involves the use of voice signals. This method tends to use microphone arrays and dedicated algorithms to enhance the speech signal and recognize the words in it. In this project, a small 5-microphone array was developed. To enhance the quality of the signal, dedicated software was written. It consists of several modules, including direction of arrival estimation, denoising, and differentiation between adults and children. The results showed that the custom algorithm can increase the signal-to-noise ratio by up to 6 dB.
Source: Vibrations in Physical Systems; 2022, 33, 2; art. no. 2022202
0860-6897
Appears in: Vibrations in Physical Systems
Content provider: Biblioteka Nauki

Article


Information

You are searching for the phrase "Enhancement" by criterion: Subject
