Similar literature
 20 similar articles found
1.
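王玥, 李平, 崔杰. 《声学学报》, 2013, 38(4): 501-508
To find the best balance between noise suppression and speech distortion, an adaptive β-order Bayesian perceptual estimation speech enhancement algorithm based on the auditory frequency-domain masking effect is proposed, with the aim of improving overall speech enhancement performance. The algorithm exploits the masking effect of human hearing: the β value of the β-order Bayesian perceptual estimator is adjusted adaptively according to the computed frequency-domain masking threshold, so that noise is suppressed only to just below the masking threshold, which retains more speech information and reduces speech distortion. The performance of the proposed algorithm was evaluated with objective and subjective measures and compared with the original SNR-based adaptive β-order Bayesian perceptual estimation speech enhancement algorithm. The results show that the overall objective score of the frequency-domain masking method is higher than that of the SNR-based algorithm throughout the SNR range from -10 dB to 5 dB. The subjective evaluation also shows that the frequency-domain masking method suppresses background noise well while retaining as much speech information as possible.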

2.
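At low input SNR, speech enhancement tends to lose the weak components of unvoiced speech, which distorts the envelope of the reconstructed signal. To overcome this problem, a new speech enhancement method is proposed. Based on a speech perception model, the method uses an incomplete wavelet packet decomposition to approximate the critical bands of speech and processes voiced and unvoiced speech separately according to sub-band energy. For threshold calculation, a wavelet packet adaptive threshold algorithm is proposed that separates voiced from unvoiced speech and is based on sub-band signal energy. Simulation experiments, objective evaluation, and listening tests show that, at low input SNR, the algorithm reduces envelope distortion of the reconstructed signal more effectively than traditional algorithms and markedly raises the output SNR without impairing speech clarity or naturalness. Combining the algorithm with energy spectral subtraction as a second enhancement stage further improves the quality of the denoised speech.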

3.
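王辉, 张玲华. 《声学学报》, 2012, 37(5): 534-538
Adaptive beamforming is one of the core algorithms of digital hearing aids. To address the speech leakage that is unavoidable in adaptive beamforming, this paper first studies the traditional GSC-structured adaptive beamforming algorithm theoretically and then proposes a Chinese speech processing technique that compensates for the leaked speech. The technique uses the fundamental frequency information characteristic of Chinese speech to adjust the envelope of the speech magnitude spectrum, increasing the similarity between the spectral envelope and the shape of the fundamental frequency contour in order to improve intelligibility. Because leaked speech suffers large losses in high-frequency unvoiced consonant segments, unvoiced consonants are amplified in the frequency domain, which raises their energy without altering the formant structure. At the same time, the noise energy leaked by the GSC algorithm during speech pauses is reduced, which makes the speech easier to discriminate. Simulation results show that this Chinese speech processing compensates for the speech leakage caused by adaptive beamforming and improves speech intelligibility.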

4.
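To improve the speech enhancement performance and computational speed of the traditional orthogonal matching pursuit (OMP) algorithm, this study proposes a speech enhancement algorithm that uses an improved OMP algorithm, based on sparse coding theory. First, the K-singular value decomposition (K-SVD) algorithm is combined with OMP, and an energy threshold is set to improve the enhancement performance of OMP. Second, the computation of the sparse signal approximation in traditional OMP is modified to increase speed. The improved-OMP speech enhancement algorithm is compared with the traditional K-SVD speech enhancement algorithm, using PESQ to assess the quality of the enhanced speech and NCM to assess its intelligibility. With NCM essentially unchanged, PESQ rises by about 12.47% on average, which indicates a better enhancement result. The improved OMP algorithm runs nearly twice as fast as the traditional OMP algorithm.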

5.
In this paper, a novel single-microphone speech enhancement technique is presented. While most conventional approaches based on nonnegative matrix factorization focus on generating a basis matrix of speech and noise for enhancement, the proposed algorithm performs an additional step to reconstruct speech from noisy speech when the two overlap heavily in selected spectral bands. This step involves a log-spectral amplitude based estimator, which provides the spectrotemporal speech presence probability to obtain a more accurate reconstruction. Moreover, the proposed algorithm applies an unsupervised learning method to the input noise, so it is adaptable to any type of environmental noise without a pre-trained dictionary. The experimental results demonstrate that the proposed algorithm achieves better speech enhancement performance than conventional single-channel approaches.

6.
It is well known that non-stationary wideband noise is the most difficult noise to remove in speech enhancement. In this paper, a novel speech enhancement algorithm based on the dyadic wavelet transform and the simplified Karhunen-Loeve transform (KLT) is proposed to suppress non-stationary wideband noise. The noisy speech is decomposed into components in the wavelet space and the KLT-based vector space, and the components are processed and reconstructed separately for voiced and unvoiced speech. Neither noise whitening nor SNR pre-calculation is required. To evaluate the performance of the algorithm in more detail, a three-dimensional spectral distortion measure is introduced. Experiments and comparisons between different speech enhancement systems using this distortion measure show that the proposed method avoids the drawbacks of previous methods and shapes and suppresses non-stationary wideband noise better.

7.
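Because effective voice conversion is difficult to achieve with noisy speech, this paper proposes a noise-robust voice conversion algorithm that uses joint dictionary optimization. In constructing the joint dictionary, the speech dictionary is optimized with the backward elimination (BE) algorithm, and a noise dictionary is introduced so that noisy speech matches the joint dictionary. Experimental results show that backward elimination reduces the number of dictionary frames and the computational load while preserving conversion quality. At low SNR and under several noise types, the proposed algorithm converts better than the traditional NMF algorithm and the NMF conversion algorithm with spectral-subtraction denoising, and introducing the noise dictionary improves the noise robustness of the voice conversion system.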

8.
Pitch detection is an important part of speech recognition and speech processing. In this paper, a pitch detection algorithm based on the second generation wavelet transform is developed. The proposed algorithm has a lower computational load than algorithms based on the classical wavelet transform. It was tested on both real and synthetic speech signals. Experiments were carried out under noisy conditions to evaluate its accuracy and robustness. The results show that the proposed algorithm is robust to noise and provides accurate estimates of the pitch period for both low-pitched and high-pitched speakers. Moreover, different wavelet filters obtained with the second generation wavelet transform were considered to examine their effect on the proposed algorithm. The Haar filter performed well compared with the other wavelet filters.

9.
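马震, 吴殿红. 《应用声学》, 2016, 35(2): 137-143
Building on multi-pulse linear predictive coding, this paper proposes a position-independent pulse search algorithm. The algorithm does not search for pulse positions. Instead, it solves for the pulse amplitude vector in a single step from given pulse positions. This guarantees that the resulting pulse combination is optimal in the least-squares sense and provides a theoretical basis for improving synthesized speech quality. On the theoretical basis that the excitation pulses are independent of position, a fixed-position pulse linear predictive coding method is then proposed. The proposed algorithms were simulated in MATLAB. The simulations show that the position-independent pulse search algorithm yields better synthesized speech quality than the sequential method and also takes less encoding time. The fixed-position pulse linear predictive coding method produces synthesized speech close to that of G.729 at a bit rate of 2.7 kbps.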
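The single-step amplitude solution described in the abstract above can be illustrated with a small least-squares sketch. This is not the authors' implementation: it assumes the synthesis filter is represented by a short impulse response `h`, that the pulse positions are given and distinct, and that the amplitudes are found from the normal equations. The function names `shifted`, `solve`, and `pulse_amplitudes` are hypothetical.

```python
def shifted(h, pos, n):
    """Impulse response h delayed by pos samples, truncated to length n."""
    out = [0.0] * n
    for i, v in enumerate(h):
        if pos + i < n:
            out[pos + i] = v
    return out


def solve(a, b):
    """Gaussian elimination with partial pivoting for a small square system."""
    m = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, m):
            f = aug[r][col] / aug[col][col]
            for c in range(col, m + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * m
    for r in range(m - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, m))
        x[r] = (aug[r][m] - s) / aug[r][r]
    return x


def pulse_amplitudes(target, h, positions):
    """Least-squares amplitudes for pulses at fixed, distinct positions.

    Assumption: the synthesized signal is sum_k a_k * h[n - positions[k]].
    """
    n = len(target)
    cols = [shifted(h, p, n) for p in positions]
    # Normal equations: (H^T H) a = H^T t
    gram = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    rhs = [sum(u * t for u, t in zip(c, target)) for c in cols]
    return solve(gram, rhs)
```

Because all amplitudes are solved jointly, overlapping pulse responses are accounted for together rather than one pulse at a time, which is the property the abstract attributes to the position-independent search.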

10.
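To address the sharp performance degradation of earlier speech enhancement algorithms in non-stationary noise, a new single-channel speech enhancement algorithm based on time-frequency dictionary learning is proposed. First, time-frequency dictionary learning is used to model prior information about the spectral structure of the noise, and this model is incorporated into the convolutive non-negative matrix factorization framework. Then, with the noise time-frequency dictionary held fixed, multiplicative iterative update formulas are derived for the time-varying gains and the speech time-frequency dictionary. Finally, these formulas are used to update the time-varying gain coefficients of speech and noise and the speech time-frequency dictionary. The speech magnitude spectrum is reconstructed by convolving the speech time-frequency dictionary with the time-varying gains, and binary time-frequency masking is applied to remove noise interference. Experimental results show that the proposed algorithm performs better on several speech quality measures. In non-stationary noise and at low SNR, it removes noise more effectively than multi-band spectral subtraction and non-negative sparse coding denoising, and the enhanced speech is of better quality.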

11.
In this paper, two speech enhancement algorithms (SEAs) based on the spectral subtraction (SS) principle are evaluated for bilateral cochlear implant (BCI) users. Specifically, a dual-channel noise power spectral estimation algorithm using the power spectral densities (PSD) and cross power spectral density (CPSD) of the observed signals was studied. The enhanced speech signals were obtained using either the Dual-Channel Non-Linear Spectral Subtraction (DC-NLSS) or the Dual-Channel Multi-Band Spectral Subtraction (DC-MBSS) algorithm. For performance evaluation, objective speech assessment tests relying on the Perceptual Evaluation of Speech Quality (PESQ) score and the Itakura-Saito (IS) distortion measure were performed to fix the optimal number of frequency bands needed in the DC-MBSS algorithm. To evaluate speech intelligibility, subjective listening tests were conducted with 50 normal-hearing listeners using a specific BCI simulator and with three deafened BCI patients. Experimental results, obtained using the French Lafon database corrupted by additive babble noise at different signal-to-noise ratios (SNR), showed that the DC-MBSS algorithm improves speech understanding more than the DC-NLSS algorithm for single and multiple interfering noise sources.
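The spectral subtraction principle that both algorithms above build on can be sketched for a single frame. This is a generic single-channel power spectral subtraction with an over-subtraction factor and a spectral floor, not the DC-NLSS or DC-MBSS algorithm. The function name and the parameters `alpha` and `beta` are assumptions chosen for illustration.

```python
def spectral_subtract(noisy_power, noise_power, alpha=1.0, beta=0.1):
    """Subtract a noise power estimate from a noisy power spectrum, bin by bin.

    alpha: over-subtraction factor (assumed parameter).
    beta:  spectral floor as a fraction of the noise power (assumed parameter),
           which prevents negative power estimates.
    """
    return [
        max(y - alpha * n, beta * n)
        for y, n in zip(noisy_power, noise_power)
    ]
```

A multi-band variant would apply a different `alpha` to each frequency band. How the bands and factors are chosen in DC-MBSS is not stated in the abstract.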

12.
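田玉静, 左红伟, 王超. 《应用声学》, 2020, 39(6): 932-939
In a voice communication system, transmission through the channel inevitably introduces intersymbol interference (ISI) and signal distortion, and the speech is also corrupted by noise. After analyzing the adaptive blind equalization algorithm CMA (constant modulus algorithm) and improved blind equalization algorithms, and considering that adaptive blind equalization has limited ability to control speech noise, this paper combines adaptive blind equalization with a wavelet packet masking-threshold denoising algorithm to form a new baseband speech enhancement method. Simulation results show that adaptive blind equalization makes the constellation diagram clear and compact and effectively reduces the bit error rate. The study confirms that, when the speech signal suffers severe ISI and distortion, the method denoises stably in both white-noise and colored-noise environments and yields good listening quality for Mandarin Chinese.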

13.
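To give wearers of binaural hearing devices better speech intelligibility, a near-field speech enhancement algorithm is proposed that uses the interaural time difference and interaural level difference. The method first uses these two differences to estimate the speech power spectrum and the speech coherence function, then computes the ratio of the head-related transfer functions of the interfering noise between the left and right ears, and finally constructs two Wiener filters. Objective measures show that the algorithm denoises better than the comparison algorithms, while its time-difference and level-difference errors for the target speech are lower. Subjective speech reception threshold tests show that the method effectively improves speech intelligibility. The results indicate that the algorithm removes interfering noise effectively while preserving the spatial information of the target speech.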

14.
In this paper, a fundamental frequency (F0) tracking algorithm is presented that is extremely robust for both high quality and telephone speech, at signal-to-noise ratios ranging from clean speech to very noisy speech. The algorithm is named "YAAPT," for "yet another algorithm for pitch tracking." It is based on a combination of time domain processing, using the normalized cross correlation, and frequency domain processing. Major steps include processing of the original acoustic signal and a nonlinearly processed version of the signal, the use of a new method for computing a modified autocorrelation function that incorporates information from multiple spectral harmonic peaks, peak picking to select multiple F0 candidates and associated figures of merit, and extensive use of dynamic programming to find the "best" track among the multiple F0 candidates. The algorithm was evaluated on three databases and compared with three other published F0 tracking algorithms, using both high quality and telephone speech under various noise conditions. For clean speech, the error rates obtained are comparable to the best results reported for any other algorithm. For noisy telephone speech, the error rates obtained are lower than those obtained with other methods.
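The time-domain step named in the abstract above, the normalized cross correlation, can be sketched as follows. This covers only the selection of a single F0 candidate for one analysis frame. It is not YAAPT itself, which also uses spectral information, several candidates per frame, and dynamic programming. The function names, the search range of 60 to 400 Hz, and the window length are assumptions.

```python
import math


def nccf(x, lag, window):
    """Normalized cross correlation between x[0:window] and x[lag:lag+window]."""
    a = x[:window]
    b = x[lag:lag + window]
    num = sum(p * q for p, q in zip(a, b))
    den = math.sqrt(sum(p * p for p in a) * sum(q * q for q in b))
    return num / den if den > 0 else 0.0


def estimate_f0(x, sample_rate, f_min=60.0, f_max=400.0, window=200):
    """Return the F0 (Hz) whose lag maximizes the NCCF within [f_min, f_max]."""
    min_lag = int(sample_rate / f_max)
    max_lag = int(sample_rate / f_min)
    if len(x) < window + max_lag:
        raise ValueError("signal too short for the requested lag range")
    best_lag = max(range(min_lag, max_lag + 1), key=lambda k: nccf(x, k, window))
    return sample_rate / best_lag
```

Because each lag is normalized by the energy of both windows, the peak height stays comparable across loud and quiet frames, which is one reason the normalized form is commonly preferred over a plain autocorrelation.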

15.
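To solve the problem of segmenting noisy utterances, and to address the poor segmentation, low accuracy, and weak adaptivity of traditional air-conducted speech segmentation algorithms in some low-SNR environments, a segmented double-threshold speech segmentation method based on adaptive bone-conducted speech is proposed. Bone-conducted and air-conducted speech are recorded synchronously to obtain the more noise-resistant bone-conducted speech. An adaptive method with a random dynamic threshold is then introduced into the fusion of zero-crossing rate and short-time energy for endpoint detection. Finally, segmented double thresholds and speech clustering are used to segment the speech, which improves the robustness of the segmentation algorithm. Experiments verify the effectiveness and feasibility of the proposed algorithm, and a comparison with other speech segmentation algorithms shows that it is more accurate and performs better.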
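For readers unfamiliar with double-threshold endpoint detection, the following is a minimal sketch of the classic form that combines short-time energy with zero-crossing rate. It uses fixed thresholds supplied by the caller and does not implement the random dynamic threshold, the segmentation stage, or the clustering described in the abstract above. All function names and parameters are hypothetical.

```python
def short_time_energy(frame):
    return sum(x * x for x in frame)


def zero_crossing_rate(frame):
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def detect_segments(samples, frame_len, high, low, zcr_thr, max_ext=3):
    """Return (first_frame, last_frame) index pairs of detected speech.

    A segment is seeded by a frame whose energy exceeds `high`, grown while
    neighbouring frames exceed `low`, then extended by at most `max_ext`
    frames on each side while the zero-crossing rate exceeds `zcr_thr`
    (intended to catch weak unvoiced onsets and offsets).
    """
    frames = [samples[i:i + frame_len]
              for i in range(0, len(samples) - frame_len + 1, frame_len)]
    energy = [short_time_energy(f) for f in frames]
    zcr = [zero_crossing_rate(f) for f in frames]
    n = len(frames)
    segments = []
    i = 0
    while i < n:
        if energy[i] <= high:
            i += 1
            continue
        start = end = i
        while start > 0 and energy[start - 1] > low:
            start -= 1
        while end + 1 < n and energy[end + 1] > low:
            end += 1
        ext = 0
        while start > 0 and ext < max_ext and zcr[start - 1] > zcr_thr:
            start -= 1
            ext += 1
        ext = 0
        while end + 1 < n and ext < max_ext and zcr[end + 1] > zcr_thr:
            end += 1
            ext += 1
        if segments and start <= segments[-1][1]:
            segments[-1] = (segments[-1][0], end)  # merge overlapping segments
        else:
            segments.append((start, end))
        i = end + 1
    return segments
```

In practice the thresholds are usually derived from a stretch of leading background noise. Here they are left to the caller so that the sketch makes no claim about how the paper sets them.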

16.
In this paper, an accurate pitch and voiced/unvoiced determination algorithm for speech analysis is described. The algorithm is called AMPEX (auditory model-based pitch extractor) and it performs a temporal analysis of the outputs emerging from a new auditory model. However, in spite of its use of an auditory model, AMPEX should not be regarded as a substitute for any psychophysical theory of human auditory pitch perception. What is mainly described is the design of a computationally efficient auditory model, the perceptually motivated determination of the model parameters, the conception of a reliable pitch extractor for speech analysis, and the elaboration of an experimental procedure for evaluating the performance of such a pitch extractor. In the course of the evaluation experiment several kinds of speech stimuli including clean speech, bandpass-filtered speech, and noisy speech were presented to three different pitch extractors. The experimental results clearly indicate that AMPEX outperforms the best algorithms available.

17.
An algorithm to measure speech-to-noise ratios has been implemented on a minicomputer. The algorithm attributes the energy within each consecutive 20-ms frame of a speech-plus-noise waveform to either a speech or a noise source. This discrimination process is based upon the known characteristics of frame energy histograms of such waveforms. In response to observed inaccuracies of this discrimination process in cases of low speech versus noise separation, a method of estimating the speech Vrms of the signal is incorporated which attempts to recover speech energy "masked" by noise. Experiments have demonstrated the algorithm's ability to track known speech-to-noise ratios on a decibel-for-decibel basis down to a ratio of approximately 5 dB.
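The frame-based idea in the abstract above can be sketched roughly as follows. The 20-ms frame length comes from the abstract. The classification rule is an assumption: where the original uses histogram characteristics, this sketch uses a single threshold placed halfway between the lowest and highest frame levels in dB. Subtracting the noise power from the speech-frame power is likewise a simple stand-in for the masked-energy recovery step, and all names are hypothetical.

```python
import math


def frame_energies(samples, sample_rate, frame_ms=20):
    """Mean power of each consecutive, non-overlapping frame."""
    n = int(sample_rate * frame_ms / 1000)
    return [
        sum(x * x for x in samples[i:i + n]) / n
        for i in range(0, len(samples) - n + 1, n)
    ]


def speech_to_noise_db(samples, sample_rate, threshold_db=None):
    """Estimate the speech-to-noise ratio in dB from frame energies."""
    energies = frame_energies(samples, sample_rate)
    levels = [10 * math.log10(e + 1e-12) for e in energies]
    if threshold_db is None:
        # Assumption: midpoint of the observed level range separates the classes.
        threshold_db = (min(levels) + max(levels)) / 2
    speech = [e for e, l in zip(energies, levels) if l > threshold_db]
    noise = [e for e, l in zip(energies, levels) if l <= threshold_db]
    if not speech or not noise:
        raise ValueError("could not separate speech frames from noise frames")
    noise_power = sum(noise) / len(noise)
    # Speech frames contain speech plus noise, so remove the noise share.
    speech_power = max(sum(speech) / len(speech) - noise_power, 1e-12)
    return 10 * math.log10(speech_power / noise_power)
```

A midpoint threshold becomes unreliable when the two frame populations overlap, which matches the abstract's remark that discrimination degrades when speech and noise are poorly separated.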

18.
A noise-robust voice conversion algorithm based on joint dictionary optimization is proposed to effectively convert noisy source speech into target speech. In constructing the joint dictionary, the speech dictionary is optimized using the backward elimination algorithm. At the same time, a noise dictionary is introduced to match the noisy speech. The experimental results show that the backward elimination algorithm can reduce the number of dictionary frames and the amount of calculation while preserving the conversion effect. At low SNR and in multiple noise environments, the algorithm converts better than both the traditional NMF algorithm and the NMF conversion algorithm with spectral subtraction denoising. The proposed algorithm improves the robustness of the voice conversion system.

19.
Musical residual noise is a major problem for a speech enhancement system. This noise is very annoying to the human ear and can significantly deteriorate the perception quality of enhanced speech. In this study, we aim at reducing the quantity of musical residual noise by a two-stage speech enhancement approach. In the first stage a preprocessor enhances noisy speech using an algorithm which combines the two-step-decision-directed and the Virag methods. In the second stage the enhanced speech signal is post-processed by an iterative-directional-median filter to significantly reduce the quantity of residual noise, while maintaining the harmonic spectra. Experimental results show that the proposed approach can significantly improve the performance of a speech enhancement system by reducing the quantity of residual noise.
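To illustrate why median filtering suits musical noise, which appears as isolated spectral peaks, here is a plain one-dimensional sliding median. It is a simpler stand-in and not the iterative-directional-median filter of the abstract above, whose details are not given here. The function name and the default width are assumptions.

```python
from statistics import median


def median_filter(values, width=3):
    """Sliding median over `values`; the window shrinks at the edges."""
    half = width // 2
    return [
        median(values[max(0, i - half): i + half + 1])
        for i in range(len(values))
    ]
```

An isolated spike is replaced by the value of its neighbours, while a run of similar values passes through unchanged.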

20.
In this paper, we address the problem of noise reduction and speech enhancement by adaptive filtering. Recently, the well-known forward blind source separation (FBSS) structure has been widely studied and used to reduce acoustic noise components and to enhance the speech signal. The FBSS structure is often combined with adaptive algorithms to accelerate the adaptation of the cross-filters and to improve noise suppression at the output. In this paper, we propose to use a wavelet transform decomposition in the FBSS structure by using a two-channel forward wavelet symmetric adaptive decorrelating (WFSAD) algorithm. The proposed WFSAD algorithm provides a better compromise between time and frequency resolution and improves the robustness of the noise reduction process when compared with the classical two-channel forward symmetric adaptive decorrelating (FSAD) algorithm. Simulation results demonstrate the efficiency of the proposed WFBSS algorithm in comparison with conventional ones in terms of several objective and subjective criteria.
