Similar literature
 20 similar documents found
1.
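Speech enhancement based on the auditory masking effect and the Bark wavelet transform   Total citations: 22 (self-citations: 3, citations by others: 19)
Tao Zhi, Zhao Heming, Gong Chenghui. Acta Acustica, 2005, 30(4): 367-372
A speech enhancement method is proposed that improves the perceived quality of speech at low signal-to-noise ratios (SNRs). Building on spectral subtraction, the method has two distinguishing features: first, the subtraction parameter is derived from the masking effect of human hearing and is adaptive; second, the speech before and after enhancement is analyzed with the Bark wavelet transform, which matches the characteristics of the human auditory system more closely. Objective and subjective tests show that, compared with spectral subtraction on low-SNR speech signals, the method (1) suppresses residual and background noise more effectively and (2) yields enhanced speech with better clarity and intelligibility.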
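As a rough illustration of the spectral-subtraction step this abstract builds on, the sketch below processes the power spectrum of one frame with an over-subtraction factor that shrinks where the masking threshold is high. The function names, the linear mapping from masking threshold to `alpha`, and the default parameter ranges are assumptions made for illustration. They are not taken from the paper, and the Bark wavelet analysis is not shown.

```python
def adaptive_alpha(mask, mask_min, mask_max, alpha_min=1.0, alpha_max=6.0):
    """Map a masking threshold to an over-subtraction factor (assumed linear rule)."""
    if mask_max <= mask_min:
        # No spread in the thresholds: fall back to the strongest subtraction.
        return alpha_max
    # Position of this threshold inside [mask_min, mask_max], clamped to [0, 1].
    t = min(max((mask - mask_min) / (mask_max - mask_min), 0.0), 1.0)
    # A high threshold means the noise is already masked, so subtract less.
    return alpha_max - t * (alpha_max - alpha_min)


def spectral_subtract(noisy_power, noise_power, masks, beta=0.01):
    """Power spectral subtraction for one frame with a spectral floor beta."""
    lo, hi = min(masks), max(masks)
    cleaned = []
    for y, n, m in zip(noisy_power, noise_power, masks):
        alpha = adaptive_alpha(m, lo, hi)
        # Never let a bin fall below a small fraction of the noisy power.
        cleaned.append(max(y - alpha * n, beta * y))
    return cleaned
```

The floor `beta * y` keeps every bin positive, which is the usual way to limit the isolated spectral peaks heard as musical noise.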

2.
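Wang Yue, Li Ping, Cui Jie. Acta Acustica, 2013, 38(4): 501-508
To find the best balance between noise suppression and speech distortion, an adaptive β-order Bayesian perceptual estimation speech enhancement algorithm based on the frequency-domain auditory masking effect is proposed, with the aim of improving overall enhancement performance. The algorithm exploits the masking effect of human hearing: the β value in the β-order Bayesian perceptual estimator is adjusted adaptively according to the computed frequency-domain masking threshold, so that noise is suppressed only to just below the masking threshold, more speech information is retained, and speech distortion is reduced. The algorithm was evaluated both objectively and subjectively and compared with the original SNR-based adaptive β-order Bayesian perceptual estimation algorithm. The results show that the overall objective scores of the frequency-domain masking method are higher than those of the SNR-based algorithm at SNRs from -10 dB to 5 dB. The subjective evaluation also shows that the method suppresses background noise well while retaining as much speech information as possible.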

3.
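Min Shujun, Tian Lan. Acta Acustica, 2011, 36(3): 332-338
Because conventional spectral-subtraction speech enhancement leaves residual "musical noise", the sound quality perceived through a cochlear implant (CI) that relies on conventional spectral subtraction for noise reduction is also degraded. To improve the noise robustness of CIs, this paper proposes an adaptive variable-order spectral subtraction algorithm and applies it to speech enhancement for cochlear implants. Following the frequency bands assigned to the CI electrodes, the algorithm first divides the power spectrum of the captured noisy signal into Bark sub-bands, then adapts the subtraction order and coefficient in each Bark sub-band according to changes in SNR, so that noise is removed more evenly across sub-bands and the "musical noise" of the conventional method is largely eliminated. ACE simulation experiments and listening tests for cochlear implants based on the algorithm show that, compared with conventional spectral subtraction, the improved algorithm suppresses background and residual noise more effectively, and the simulated CI-synthesized sound is perceived as better and clearer.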
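The Bark sub-band division mentioned in this abstract can be sketched as follows. The sketch uses the widely quoted Zwicker-Terhardt approximation of the Bark scale and assigns each FFT bin to an integer Bark band. The helper names are hypothetical, and the band layout actually used by the authors (matched to the CI electrodes) is not given in the abstract, so this is only one plausible grouping.

```python
import math


def hz_to_bark(f_hz):
    """Zwicker-Terhardt approximation of the Bark scale."""
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)


def bark_band_of_bins(n_fft, sample_rate):
    """Return the 0-based Bark band index of each FFT bin from 0 to n_fft/2."""
    return [int(hz_to_bark(k * sample_rate / n_fft)) for k in range(n_fft // 2 + 1)]
```

With the bins grouped this way, a per-band SNR can be computed by summing power inside each band, and the subtraction order and coefficient can then be chosen band by band.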

4.
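The mean squared error (MSE) function is the most commonly used cost function in deep-learning-based single-channel speech enhancement. However, the MSE value is not fully correlated with speech quality. To improve performance, this paper introduces two classes of cost functions related to human hearing into deep neural network training. The first class is the weighted Euclidean distance cost function, which accounts for the auditory masking effect. The second class comprises the Itakura-Saito cost function, the COSH cost function, and the weighted likelihood ratio cost function, which emphasize the importance of speech spectral peaks and focus on recovering the peaks of the clean speech spectrum. Using a long short-term memory (LSTM) network, the two classes of cost functions are analyzed and compared in deep-learning-based single-channel speech enhancement and contrasted with the MSE cost function. The experimental results show that the deep neural network single-channel speech enhancement algorithm based on the weighted Euclidean distance cost function achieves better speech quality and lower residual noise.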
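For reference, two of the peak-emphasizing measures named in this abstract have simple closed forms on power spectra. The sketch below computes the Itakura-Saito distance and its symmetrized COSH variant as plain per-frame averages. The function names and the averaging over bins are assumptions for illustration; how the paper integrates these measures into network training is not described in the abstract.

```python
import math


def itakura_saito(clean, estimate):
    """Mean Itakura-Saito distance between two strictly positive power spectra."""
    total = 0.0
    for x, y in zip(clean, estimate):
        r = x / y
        total += r - math.log(r) - 1.0
    return total / len(clean)


def cosh_distance(clean, estimate):
    """COSH distance: the symmetrized Itakura-Saito distance."""
    return 0.5 * (itakura_saito(clean, estimate) + itakura_saito(estimate, clean))
```

The Itakura-Saito distance is asymmetric: underestimating a bin by a factor of two costs more than overestimating it by the same factor, which is why it weights spectral peaks more heavily than valleys.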

5.
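Whispered speech enhancement based on a modified Mel-domain masking model and speech absence probability   Total citations: 1 (self-citations: 0, citations by others: 1)
A whispered speech enhancement method based on a modified Mel-domain auditory masking model and speech absence probability is proposed. The method modifies the Mel frequency scale according to the phonation characteristics of whispered speech and applies Mel-domain band filtering to every frame of the whispered signal. At the same time, the auditory masking threshold of each band is determined dynamically from the speech absence probability (SAP), and the spectral subtraction coefficients are adjusted adaptively for the different masking thresholds to enhance the whispered speech. Objective and subjective tests on the enhanced whispered speech show that, compared with other spectral subtraction methods, the method keeps residual and background noise below the masking threshold of the human ear, achieves lower speech distortion, and greatly improves subjective listening quality.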

6.
In this paper, two speech enhancement algorithms (SEAs) based on the spectral subtraction (SS) principle are evaluated for bilateral cochlear implant (BCI) users. Specifically, a dual-channel noise power spectral estimation algorithm using the power spectral densities (PSD) and cross power spectral density (CPSD) of the observed signals was studied. The enhanced speech signals were obtained using either the Dual Channel Non Linear Spectral Subtraction ‘DC-NLSS’ or the Dual-Channel Multi-Band Spectral Subtraction ‘DC-MBSS’ algorithm. For performance evaluation, objective speech assessment tests relying on the Perceptual Evaluation of Speech Quality (PESQ) score and the speech Itakura-Saito (IS) distortion measure were performed to fix the optimal number of frequency bands needed in the DC-MBSS algorithm. To evaluate speech intelligibility, subjective listening tests were conducted with 50 normal-hearing listeners using a specific BCI simulator and with three deafened BCI patients. Experimental results, obtained using the French Lafon database corrupted by additive babble noise at different Signal-to-Noise Ratios (SNR), showed that the DC-MBSS algorithm improves speech understanding more than the DC-NLSS algorithm for single and multiple interfering noise sources.

7.
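To address the sharp performance degradation of earlier speech enhancement algorithms in non-stationary noise, a new single-channel speech enhancement algorithm based on time-frequency dictionary learning is proposed. First, time-frequency dictionary learning is used to model prior information about the spectral structure of the noise, and this model is incorporated into the convolutive non-negative matrix factorization framework. Next, with the noise time-frequency dictionary held fixed, multiplicative iterative update rules are derived for the time-varying gains and the speech time-frequency dictionary. Finally, these update rules are used to update the time-varying gain coefficients of speech and noise and the speech time-frequency dictionary; the speech magnitude spectrum is reconstructed by convolving the speech time-frequency dictionary with the time-varying gains, and binary time-frequency masking is applied to remove noise interference. Experimental results show that the proposed algorithm achieves better results on several speech quality metrics. In non-stationary noise and at low SNR, it removes noise more effectively than multi-band spectral subtraction and non-negative sparse coding denoising, and the enhanced speech has better quality.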

8.
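To study the application of psychoacoustics to speech enhancement, this paper proposes a multi-sub-band noise-robust spectral subtraction algorithm for speech signals based on a partition along the equivalent rectangular bandwidth (ERB) scale. The algorithm divides the spectrum of the noisy signal into several sub-bands according to the ERB scale, computes the spectral subtraction parameters of each sub-band from its segmental SNR and psychoacoustic masking principles, and finally applies spectral subtraction within each sub-band. Experimental results show that the new algorithm outperforms the previously proposed multi-sub-band spectral subtraction algorithm in SNR, IS distortion, and PESQ.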
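One way to obtain an ERB-scale sub-band partition like the one described here is to place band edges at equal steps on the ERB-rate scale. The sketch below uses the Glasberg and Moore formula for the ERB-rate; the function names, the frequency range, and the number of bands are illustrative assumptions, since the abstract does not state the partition the authors used.

```python
import math


def hz_to_erb_rate(f_hz):
    """ERB-rate (number of ERBs below f_hz), Glasberg and Moore formula."""
    return 21.4 * math.log10(4.37 * f_hz / 1000.0 + 1.0)


def erb_rate_to_hz(erb_rate):
    """Inverse of hz_to_erb_rate."""
    return (10.0 ** (erb_rate / 21.4) - 1.0) * 1000.0 / 4.37


def erb_band_edges(f_low, f_high, n_bands):
    """Band edges in Hz, equally spaced on the ERB-rate scale."""
    e_low, e_high = hz_to_erb_rate(f_low), hz_to_erb_rate(f_high)
    step = (e_high - e_low) / n_bands
    return [erb_rate_to_hz(e_low + i * step) for i in range(n_bands + 1)]
```

Bands built this way are narrow at low frequencies and wide at high frequencies, so each sub-band covers roughly the same span on the auditory scale.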

9.
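Li Yinan, Zhang Xiongwei, Jia Chong, Chen Liang, Zeng Li. Acta Acustica, 2015, 40(4): 607-614
Existing dictionary-learning-based enhancement algorithms require prior information and are difficult to run in real time. To address this, an unsupervised single-channel speech enhancement algorithm suited to real-time processing is proposed. First, the algorithm recasts the modeling of background noise under unsupervised conditions as a sparse low-rank noise decomposition of the noisy speech magnitude spectrum. Next, an incremental non-negative subspace method is used to learn the background-noise dictionary online, yielding an adaptive noise dictionary that reflects the time-varying characteristics of the background noise. Finally, the noisy speech is processed with the resulting noise dictionary in a frame-by-frame iterative manner that lends itself to real-time operation. Experimental results show that, compared with multi-band spectral subtraction and the enhancement algorithm based on low-rank sparse matrix decomposition, the proposed algorithm performs particularly well in noise suppression and achieves better results on several performance metrics.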

10.
Speech signals recorded with a distant microphone are usually corrupted by room reverberation, which severely degrades the clarity and intelligibility of speech. A speech dereverberation method based on spectral subtraction and spectral line enhancement is proposed in this paper. Following the generalized statistical reverberation model, the power spectrum of late reverberation is estimated and removed from the reverberant speech by the spectral subtraction method. Then, according to the human auditory model, a spectral line enhancement technique based on adaptive post-filtering is adopted to further eliminate the reverberant components between adjacent speech formants. The proposed method can effectively suppress the spatial reverberation and improve the auditory perception of speech. The subjective and objective evaluation results reveal that the perceptual quality of speech is greatly improved by the proposed method.

11.
Whispered speech enhancement using an auditory masking model in the modified Mel domain and Speech Absence Probability (SAP) is proposed. In light of the phonation characteristics of whisper, we modify the Mel-frequency scaling model. Whispered speech is filtered by the proposed model. Meanwhile, the masking threshold for each frequency band is dynamically determined by the speech absence probability. Whispered speech enhancement is then conducted by adaptively adjusting the spectral subtraction coefficients according to the different masking threshold values. Results of objective and subjective tests on the enhanced whispered signal show that, compared with other methods, the proposed method enhances the whispered signal with better subjective auditory quality and less distortion by keeping the musical noise and background noise below the masking threshold.

12.
In this paper, a single-channel speech enhancement algorithm based on non-linear and multi-band Adaptive Gain Control (AGC) is proposed. The algorithm requires neither Signal-to-Noise Ratio (SNR) nor noise parameters estimation. It reduces the background noise in the temporal domain rather than the spectral domain using a non-linear and automatically adjustable gain function for multi-band AGC. The gain function varies in time and is deduced from the temporal envelope of each frequency band to highly compress the frequency regions where noise is present and lightly compress the frequency regions where speech is present. Objective evaluation using the PESQ (Perceptual Evaluation of Speech Quality) metric shows that the proposed algorithm performs better than three benchmarks, namely: the spectral subtraction, the Wiener filter based on a priori SNR estimation and a band-pass modulation filtering algorithm. In addition, blind subjective tests show that the proposed algorithm introduces less musical noise compared to the benchmark algorithms and was preferred 78.8% of the time in terms of signal quality. The proposed algorithm is implemented in a miniature low power digital signal processor to validate its feasibility and complexity for smart hearing protection in noisy environments.

13.
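Taking full account of the characteristics of human hearing and the statistical properties of noise, a speech denoising algorithm that combines the time and frequency domains with a Bark-scale adaptive threshold is proposed. Adaptively adjusting the enhancement coefficients in the Bark frequency domain allows the threshold decision to be made more accurately. Simulation experiments verify that, at low input SNR, the combined time-frequency algorithm has a clear advantage over conventional speech denoising methods: while removing white Gaussian noise it effectively reduces speech loss, attains the highest SNR and the smallest spectral distortion measure, and markedly improves the MOS (Mean Opinion Score) of the enhanced speech, giving good perceived quality.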

14.
Modifications to improve the performance of the improved multiband excitation (IMBE) model in coding narrow-band speech are presented in this paper. The first modification is based on the phenomenon of auditory masking which helps the model to focus on the perceptual aspect of the coder by eliminating components that are inaudible to human ears during the analysis process of the IMBE model. In addition, a spectral amplitude enhancement stage is added. Preliminary results indicate that the proposed modifications improve the objective and subjective performances of the IMBE model in coding narrow-band speech at 4.15 kbps.

15.
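Tian Yujing, Zuo Hongwei, Wang Chao. Journal of Applied Acoustics, 2020, 39(6): 932-939
In a speech communication system, transmission over the channel inevitably introduces intersymbol interference (ISI) and signal distortion, and the speech is also contaminated by noise. After analyzing the adaptive blind equalization algorithm CMA (constant modulus algorithm) and improved blind equalization algorithms, and given that adaptive blind equalization has limited ability to control speech noise, this paper combines adaptive blind equalization with a wavelet-packet masking-threshold denoising algorithm to form a new baseband speech enhancement method. Simulation results show that adaptive blind equalization makes the constellation diagram clear and compact and effectively reduces the bit error rate. The study confirms that, under severe ISI and distortion of the speech signal, the method has stable denoising capability in both white and colored noise environments and, while removing noise, yields good perceived quality for Mandarin Chinese.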

16.
The speech understanding of persons with "flat" hearing loss (HI) was compared to a normal-hearing (NH) control group to examine how hearing loss affects the contribution of speech information in various frequency regions. Speech understanding in noise was assessed at multiple low- and high-pass filter cutoff frequencies. Noise levels were chosen to ensure that the noise, rather than quiet thresholds, determined audibility. The performance of HI subjects was compared to a NH group listening at the same signal-to-noise ratio and a comparable presentation level. Although absolute speech scores for the HI group were reduced, performance improvements as the speech and noise bandwidth increased were comparable between groups. These data suggest that the presence of hearing loss results in a uniform, rather than frequency-specific, deficit in the contribution of speech information. Measures of auditory thresholds in noise and speech intelligibility index (SII) calculations were also performed. These data suggest that differences in performance between the HI and NH groups are due primarily to audibility differences between groups. Measures of auditory thresholds in noise showed the "effective masking spectrum" of the noise was greater for the HI than the NH subjects.

17.
A significant and often unavoidable problem in bioacoustic signal processing is the presence of background noise due to an adverse recording environment. This paper proposes a new bioacoustic signal enhancement technique which can be used on a wide range of species. The technique is based on a perceptually scaled wavelet packet decomposition using a species-specific Greenwood scale function. Spectral estimation techniques, similar to those used for human speech enhancement, are used for estimation of clean signal wavelet coefficients under an additive noise model. The new approach is compared to several other techniques, including basic bandpass filtering as well as classical speech enhancement methods such as spectral subtraction, Wiener filtering, and Ephraim-Malah filtering. Vocalizations recorded from several species are used for evaluation, including the ortolan bunting (Emberiza hortulana), rhesus monkey (Macaca mulatta), and humpback whale (Megaptera novaeangliae), with both additive white Gaussian noise and environment recording noise added across a range of signal-to-noise ratios (SNRs). Results, measured by both SNR and segmental SNR of the enhanced waveforms, indicate that the proposed method outperforms other approaches for a wide range of noise conditions.

18.
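A study of speech enhancement using the adaptive smoothed periodogram   Total citations: 2 (self-citations: 0, citations by others: 2)
An inter-band adaptive smoothed periodogram based on the structural features of the power spectrum is proposed to resolve the trade-off between frequency resolution and variance in periodogram estimation, and it is applied to magnitude spectral subtraction for speech enhancement. Test results show that, for noise with various power-spectrum structures, spectral subtraction with the adaptive smoothed periodogram outperforms spectral subtraction with other periodogram estimators on measures such as average segmental SNR improvement and average log-spectral distance.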

19.
The pattern of auditory masking derived from Gaussian noise is often cited and used to predict the detrimental effects of masking noise on marine mammals. However, environmental noise (both anthropogenic and natural) may not always be Gaussian distributed. Some noise sources are highly structured with complex amplitude fluctuations that extend across frequency regions, which are often termed comodulated noise. Recent evidence with bottlenose dolphins using comodulated noise demonstrated a significant release from masking compared to Gaussian maskers of the same bandwidth and pressure spectral density level, a result known as comodulation masking release. The present study demonstrates a pattern of masking where both temporally fluctuating comodulated noise and environmental noise produce lower masked thresholds compared to Gaussian noise of the same spectral density level and bandwidth. Furthermore, a threshold reduction or "masking release" occurred when the environmental noise bandwidth increased beyond a critical band. These results provide further evidence that conventional models of auditory masking using Gaussian maskers (i.e., the power spectrum model) do not fully describe the masking effects that occur in realistic environments.

20.
As a fundamental part of speech enhancement, noise estimation is particularly challenging in highly non-stationary noise environments. In this work, we propose an effective algorithm on the basis of the “Improved Minima Controlled Recursive Averaging (IMCRA)” with the objective to improve the performance of noise estimation. The main contributions of this work are: (i) in the algorithm, a rough decision about speech presence is proposed by calculating the autocorrelation and cross-channel correlation of the T–F (Time–Frequency) units; (ii) with this decision, we refine the smoothing parameters for the smoothing of noisy power spectrum and the recursive averaging in noise spectrum estimation as well as the weighting factor for the a priori SNR (Signal to Noise Ratio) estimation in the IMCRA; (iii) we improve the search of local minima during spectral bursts by adding a minimum search with a shorter window. Extensive experiments are carried out to evaluate the performance of our proposed algorithm. The experimental results illustrate that, compared with the IMCRA, the proposed approach significantly improves the accuracy of noise spectrum estimation and the quality of enhanced speech in the typical noise situations.
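The recursive averaging that minima-controlled estimators of this family rely on can be sketched in a few lines: the noise estimate in each frequency bin is updated quickly when speech is judged absent and frozen when speech is judged present. The function name and the default `alpha_d` below are assumptions for illustration, and the speech presence probabilities are taken as given. The correlation-based presence decision and the short-window minimum search proposed in the paper are not reproduced here.

```python
def update_noise(noise_prev, noisy_power, speech_prob, alpha_d=0.85):
    """One frame of speech-presence-controlled recursive noise averaging."""
    updated = []
    for lam, y2, p in zip(noise_prev, noisy_power, speech_prob):
        # Effective smoothing factor: equals alpha_d when speech is absent (p = 0)
        # and approaches 1 (no update) when speech is present (p = 1).
        alpha = alpha_d + (1.0 - alpha_d) * p
        updated.append(alpha * lam + (1.0 - alpha) * y2)
    return updated
```

Calling `update_noise` once per frame, with `speech_prob` supplied by whatever presence detector is in use, yields a running noise power estimate for each bin.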
