1.

A noise cross PSD estimator based on improved minimum statistics method for two-microphone speech enhancement dedicated to a bilateral cochlear implant

Fathi Kallel Mohamed Ghorbel Mondher Frikha Christian Berger-Vachon Ahmed Ben Hamida 《Applied Acoustics》2012,73(3):256-264

Coherence-based methods have been successfully applied to dual-microphone noise reduction systems. These techniques showed good results when the noise signals on the two microphones were uncorrelated, but their performance decreased with correlated noise. Performance could be improved when the cross power spectral density (CPSD) of the received noise is available. In this paper, an improved minimum tracking (IMT) technique for noise CPSD estimation was proposed. The performance of this technique was compared to two other noise CPSD estimators based on voice activity detection (VAD) and minimum tracking (MT) approaches. Evaluation was performed at four signal-to-noise ratios (SNR) and two interfering noise source configurations. Results showed the superiority of the IMT approach in terms of low computing time and of quality as indicated by the perceptual evaluation of speech quality (PESQ) scores. Then, subjective listening tests were carried out with 50 normal-hearing listeners using a specific bilateral cochlear implant (BCI) simulator and the French Lafon database corrupted by additional babble noise. Results obtained with the proposed technique were better than those of the two previously mentioned noise CPSD estimators.
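As a rough illustration of the minimum tracking (MT) idea that this abstract uses as a baseline, the sketch below recursively smooths the per-frame power of one frequency bin and takes the minimum over a sliding window as the noise estimate. It is a generic single-channel simplification, not the IMT estimator of the paper; the name `track_minimum`, the smoothing factor `alpha` and the window length `window` are assumptions chosen for illustration.

```python
from collections import deque


def track_minimum(frame_powers, alpha=0.5, window=3):
    """Hypothetical helper: per-frame noise estimate for one frequency bin.

    The power sequence is smoothed recursively, and the estimate is the
    minimum of the smoothed values over the last `window` frames.
    """
    if not frame_powers:
        return []
    recent = deque(maxlen=window)   # sliding window of smoothed powers
    smoothed = frame_powers[0]      # start from the first observation
    estimates = []
    for power in frame_powers:
        smoothed = alpha * smoothed + (1.0 - alpha) * power
        recent.append(smoothed)
        estimates.append(min(recent))
    return estimates
```

A short speech burst (the two frames with power 16) raises the estimate only after the burst has filled the whole window, which is why the window in practice has to be longer than typical speech segments.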

2.

Noise reduction of speech signals using time-varying and multi-band adaptive gain control for smart digital hearing protectors

Narimene Lezzoum Ghyslain Gagnon Jérémie Voix 《Applied Acoustics》2016

In this paper, a single-channel speech enhancement algorithm based on non-linear and multi-band Adaptive Gain Control (AGC) is proposed. The algorithm requires neither Signal-to-Noise Ratio (SNR) estimation nor noise parameter estimation. It reduces the background noise in the temporal domain rather than the spectral domain, using a non-linear and automatically adjustable gain function for multi-band AGC. The gain function varies in time and is deduced from the temporal envelope of each frequency band, so that frequency regions where noise is present are highly compressed and frequency regions where speech is present are lightly compressed. Objective evaluation using the PESQ (Perceptual Evaluation of Speech Quality) metric shows that the proposed algorithm performs better than three benchmarks, namely spectral subtraction, the Wiener filter based on a priori SNR estimation, and a band-pass modulation filtering algorithm. In addition, blind subjective tests show that the proposed algorithm introduces less musical noise than the benchmark algorithms and was preferred 78.8% of the time in terms of signal quality. The proposed algorithm is implemented in a miniature low-power digital signal processor to validate its feasibility and complexity for smart hearing protection in noisy environments.

3.

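A binaural speech enhancement algorithm for suppressing directional noise

方义 冯海泓 陈友元 胡晓城 《声学学报》2016,41(6):897-904

To give wearers of binaural hearing devices better speech intelligibility, a near-field speech enhancement algorithm that exploits interaural time differences and interaural level differences is proposed. The method first uses these two differences to estimate the power spectrum and the coherence function of the speech, then computes the ratio of the head-related transfer functions of the interfering noise between the left and right ears, and finally constructs two Wiener filters. Objective measures show that the algorithm removes noise better than the comparison algorithms, while its time-difference and level-difference errors for the target speech are lower. Subjective speech reception threshold tests show that the method effectively improves speech intelligibility. The results indicate that the algorithm removes interfering noise effectively while preserving the spatial information of the target speech.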

4.

Application of spectral subtraction method on enhancement of electrolarynx speech

Liu H Zhao Q Wan M Wang S 《The Journal of the Acoustical Society of America》2006,120(1):398-406

Although the electrolarynx (EL) serves as an important method of phonation for laryngectomees, the resulting speech has poor intelligibility because of a steady background noise caused by the instrument, and intelligibility is even worse in the presence of additive noise. This paper investigates the problem of EL speech enhancement by taking into account the frequency-domain masking properties of the human auditory system. One approach incorporates an auditory masking threshold (AMT) for parametric adaptation in a subtractive-type enhancement process. The other is the supplementary AMT (SAMT) algorithm, which applies a cross-correlation spectral subtraction (CCSS) approach as a post-processing scheme to EL speech already processed by the AMT method. The performance of these two algorithms was compared with that of the power spectral subtraction (PSS) algorithm. The best EL speech enhancement performance was obtained with the SAMT algorithm, followed by the AMT algorithm and the PSS algorithm. Acoustic and perceptual analyses indicated that the AMT and SAMT algorithms achieved better noise reduction than the PSS algorithm and that the enhanced EL speech was more pleasant to human listeners.

5.

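A single-channel speech enhancement algorithm combining a deep neural network and convex optimization (total citations: 1, self-citations: 1, citations by others: 0)

张晓艳 张天骐 葛宛营 白杨柳 《声学学报》2021,46(3):471-480

The accuracy of noise estimation directly affects the quality of a speech enhancement algorithm. To improve the noise suppression of current speech enhancement algorithms and to solve the unconstrained optimization problem effectively, a time-frequency mask optimization algorithm combining a deep neural network (DNN) and convex optimization is proposed for single-channel speech enhancement. First, the energy spectrum of the noisy speech is extracted as the DNN input feature. Next, the intra-band cross-correlation coefficient (ICC factor) between the noise and the noisy speech is used as the DNN training target. Then, the cross-correlation coefficients produced by the DNN model are used to construct the objective function of the convex optimization. Finally, combining the DNN and convex optimization, a new hybrid conjugate gradient method iteratively refines the initial mask, and the enhanced speech is synthesized with the new mask. Simulation experiments show that, at low SNRs under different background noises, the new mask yields better log-spectral distance (LSD), perceptual evaluation of speech quality (PESQ), short-time objective intelligibility (STOI), and segmental SNR (segSNR) scores than before the improvement, raising overall speech quality and suppressing noise effectively.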

6.

A spectral/temporal method for robust fundamental frequency tracking

Zahorian SA Hu H 《The Journal of the Acoustical Society of America》2008,123(6):4559-4571

In this paper, a fundamental frequency (F0) tracking algorithm is presented that is extremely robust for both high quality and telephone speech, at signal to noise ratios ranging from clean speech to very noisy speech. The algorithm is named "YAAPT," for "yet another algorithm for pitch tracking." The algorithm is based on a combination of time domain processing, using the normalized cross correlation, and frequency domain processing. Major steps include processing of the original acoustic signal and a nonlinearly processed version of the signal, the use of a new method for computing a modified autocorrelation function that incorporates information from multiple spectral harmonic peaks, peak picking to select multiple F0 candidates and associated figures of merit, and extensive use of dynamic programming to find the "best" track among the multiple F0 candidates. The algorithm was evaluated by using three databases and compared to three other published F0 tracking algorithms by using both high quality and telephone speech for various noise conditions. For clean speech, the error rates obtained are comparable to those obtained with the best results reported for any other algorithm; for noisy telephone speech, the error rates obtained are lower than those obtained with other methods.
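The time-domain ingredient named in this abstract, the normalized cross correlation, can be sketched on its own. The code below is not YAAPT: it omits the nonlinear processing, the spectral harmonic analysis and the dynamic programming, and simply picks the lag with the highest correlation inside one frame. The function names `nccf` and `best_lag` and their parameters are assumptions made for illustration.

```python
import math


def nccf(signal, lag, frame_len):
    """Normalized cross correlation between a frame and its copy shifted by `lag`."""
    a = signal[:frame_len]
    b = signal[lag:lag + frame_len]
    if len(a) < frame_len or len(b) < frame_len:
        raise ValueError("signal too short for this lag and frame length")
    numerator = sum(x * y for x, y in zip(a, b))
    denominator = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return numerator / denominator if denominator > 0 else 0.0


def best_lag(signal, min_lag, max_lag, frame_len):
    """Return the lag in [min_lag, max_lag] with the highest correlation."""
    return max(range(min_lag, max_lag + 1),
               key=lambda lag: nccf(signal, lag, frame_len))
```

For a signal with a period of four samples, `best_lag` returns 4; dividing the sampling rate by that lag would give an F0 candidate.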

7.

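A speech enhancement method combining magnitude-spectrum and power-spectrum dictionaries (total citations: 1, self-citations: 0, citations by others: 1)

聂玲子 陈雪勤 赵鹤鸣 《声学学报》2021,46(1):81-91

An improved speech enhancement method based on sparse representation of spectral features is proposed from three angles: dual-path dictionary learning, noise power spectrum estimation, and speech magnitude spectrum reconstruction. In the dictionary learning stage, power-spectrum and magnitude-spectrum features are fused, and discriminative dictionaries are used to reduce the coherence between the speech dictionary and the noise dictionary. In the enhancement stage, a noise power spectrum estimation method is proposed to track non-stationary noise. Because magnitude-spectrum and power-spectrum features suit different noises to different degrees, a table of speech reconstruction weights is designed. The two signals recovered from the magnitude spectrum and from the power spectrum are reconstructed with adaptive weighting and combined with a phase compensation function to obtain the enhanced speech signal. Experimental results show that, in stationary and non-stationary noise, the method achieves an average improvement of 31.6% over speech enhancement methods that use a single spectral feature, improving speech enhancement performance.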

8.

A cross-spectrum weighting algorithm for speech enhancement and array processing: combining phase-shift information and stationary signal properties

Schwetz I Gruhler G Obermayer K 《The Journal of the Acoustical Society of America》2006,119(2):952-964

In this paper, a gain function for noise cancellation with a two-channel microphone array is presented. This gain function combines ideas from one- and multichannel algorithms. It is developed using a minimum mean square error estimator for the amplitude of the speech signal from the cross spectrum between two microphone signals. To consider speech pauses and the absence of spectral components of the speech, an extension of this gain function is presented. The performance of the overall gain function is shown in terms of the cancellation of (diffuse) driving noise as well as the cancellation of an interfering speech signal, both recorded in a car.

9.

Comparative intelligibility investigation of single-channel noise-reduction algorithms for Chinese, Japanese, and English

Li J Yang L Zhang J Yan Y Hu Y Akagi M Loizou PC 《The Journal of the Acoustical Society of America》2011,129(5):3291-3301

A large number of single-channel noise-reduction algorithms have been proposed based largely on mathematical principles. Most of these algorithms, however, have been evaluated with English speech. Given the different perceptual cues used by native listeners of different languages, including tonal languages, it is of interest to examine whether there are any language effects when the same noise-reduction algorithm is used to process noisy speech in different languages. This study undertakes a comparative evaluation of various single-channel noise-reduction algorithms applied to noisy speech from three languages: Chinese, Japanese, and English. Clean speech signals (Chinese words and Japanese words) were first corrupted by three types of noise at two signal-to-noise ratios and then processed by five single-channel noise-reduction algorithms. The processed signals were finally presented to normal-hearing listeners for recognition. Intelligibility evaluation showed that the majority of noise-reduction algorithms did not improve speech intelligibility. Consistent with a previous study with the English language, the Wiener filtering algorithm produced small, but statistically significant, improvements in intelligibility for car and white noise conditions. Significant differences between the performances of noise-reduction algorithms across the three languages were observed.

10.

A comparative intelligibility study of single-microphone noise reduction algorithms (total citations: 1, self-citations: 0, citations by others: 1)

Hu Y Loizou PC 《The Journal of the Acoustical Society of America》2007,122(3):1777

The evaluation of intelligibility of noise reduction algorithms is reported. IEEE sentences and consonants were corrupted by four types of noise (babble, car, street and train) at two signal-to-noise ratio levels (0 and 5 dB), and then processed by eight speech enhancement methods encompassing four classes of algorithms: spectral subtractive, subspace, statistical model based and Wiener-type algorithms. The enhanced speech was presented to normal-hearing listeners for identification. With the exception of a single noise condition, no algorithm produced significant improvements in speech intelligibility. Information transmission analysis of the consonant confusion matrices indicated that no algorithm significantly improved the place feature score, which is critically important for speech recognition. The algorithms that were found in previous studies to perform the best in terms of overall quality were not the same algorithms that performed the best in terms of speech intelligibility. The subspace algorithm, for instance, was previously found to perform the worst in terms of overall quality, but performed well in the present study in terms of preserving speech intelligibility. Overall, the analysis of consonant confusion matrices suggests that in order for noise reduction algorithms to improve speech intelligibility, they need to improve the place and manner feature scores.

11.

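Speech enhancement at low signal-to-noise ratios (total citations: 1, self-citations: 0, citations by others: 1)

李国锋 《应用声学》1995,14(5):13-16

This paper presents a method based on power spectral subtraction for enhancing speech signals corrupted by white noise. Over-subtraction of the power spectrum is an effective speech enhancement method, and the tonal noise it leaves after processing can be well suppressed by center clipping.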
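The two operations named in this abstract can be sketched as follows, as a minimal illustration rather than the paper's implementation. Over-subtraction removes `alpha` times the noise power estimate from each bin of the noisy power spectrum and clamps negative results to zero; center clipping zeroes values whose magnitude is below a threshold and shrinks the rest toward zero. The function names, the factor `alpha`, the threshold and the choice to demonstrate the clipping on a plain list of values are all assumptions; the abstract does not say at which point of the processing chain the clipping is applied.

```python
def over_subtract(noisy_power, noise_power, alpha=2.0):
    """Per-bin power spectral over-subtraction with clamping at zero."""
    return [max(p - alpha * n, 0.0) for p, n in zip(noisy_power, noise_power)]


def center_clip(values, threshold):
    """Zero small values and shrink the remaining ones by `threshold`."""
    clipped = []
    for x in values:
        if x > threshold:
            clipped.append(x - threshold)
        elif x < -threshold:
            clipped.append(x + threshold)
        else:
            clipped.append(0.0)
    return clipped
```

With `alpha` greater than 1 the residual noise floor is lower than with plain subtraction, at the price of removing some low-level speech energy.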

12.

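A voice-operated switch (VOX) algorithm based on digital signal processing

张天骐 李伟 林孝康 刘林 《应用声学》2005,24(3):157-163

This paper proposes a new voice-operated switch (VOX, Voice-Operated Transmit) algorithm based on digital spectral analysis. The algorithm is simple and practical; to some extent it overcomes limitations of traditional VOX algorithms such as complex structure and hard-to-tune parameters, it is fairly robust to noise, and it is easy to implement with digital signal processing. First, the signal power spectrum is processed a second time to extract the average amplitude envelope of the speech; the envelope is then thresholded and amplified with limiting, which yields the VOX function. Theoretical analysis and computer simulations show that the algorithm not only extracts the average amplitude envelope of the speech waveform fairly accurately but also works at low signal-to-noise ratios.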
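The envelope-then-threshold structure described in this abstract can be sketched in a few lines. Note the substitution: the paper extracts the envelope by a second processing pass over the power spectrum, whereas the sketch below uses a plain moving average of the absolute sample values, which is a different and much simpler technique. The names `envelope` and `vox_gate`, the window length and the threshold are assumptions for illustration.

```python
def envelope(samples, window):
    """Moving average of |x| over the last `window` samples (stand-in envelope)."""
    values = []
    running_sum = 0.0
    for i, x in enumerate(samples):
        running_sum += abs(x)
        if i >= window:
            running_sum -= abs(samples[i - window])  # drop the oldest sample
        values.append(running_sum / min(i + 1, window))
    return values


def vox_gate(samples, window, threshold):
    """Return 1 where the envelope exceeds the threshold, else 0."""
    return [1 if e > threshold else 0 for e in envelope(samples, window)]
```

The gate stays open slightly after the last loud sample because the moving average still contains it, which acts as a short hangover time.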

13.

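Noise reduction algorithms for dual-microphone Bluetooth headsets

严馨叶 邱小军 卢晶 《应用声学》2014,33(4):313-323

Speech enhancement algorithms for hands-free communication devices have long been an active research topic, and the sound quality of their output has also drawn considerable attention in recent years. For Bluetooth headset systems with dual-microphone noise reduction, commonly used multichannel microphone noise reduction algorithms are grouped into two classes, coherence-function-based methods and spatial-pre-separation-based methods, which are analyzed and compared. Coherence-function-based methods filter the noisy signal with the coherence function between the two channels to reduce noise, whereas spatial-pre-separation-based methods use spatial characteristics to separate a noise reference signal from the noisy signal and use it to cancel the noise. The analysis is based on three measures (amount of noise reduction, speech quality, and overall performance); the form of the optimal solution is analyzed from the viewpoint of constraining speech distortion, and the practical performance of the two classes is compared. The results show that choosing a suitable algorithm makes it possible to trade off noise reduction against speech distortion and achieve good overall performance.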
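The quantity at the heart of the coherence-function-based class can be sketched as follows. For one frequency bin, the magnitude-squared coherence between two channels is estimated by averaging over several frames; values near 1 suggest a single coherent source, values near 0 suggest uncorrelated noise. The function name `magnitude_squared_coherence` and the frame-averaging estimate are assumptions for illustration; how the coherence is mapped to a filter gain differs between algorithms and is not specified here.

```python
def magnitude_squared_coherence(x1, x2):
    """Coherence estimate for one bin from per-frame complex spectra x1, x2."""
    cross = sum(a * b.conjugate() for a, b in zip(x1, x2))   # cross PSD
    power1 = sum(abs(a) ** 2 for a in x1)                    # auto PSD, channel 1
    power2 = sum(abs(b) ** 2 for b in x2)                    # auto PSD, channel 2
    if power1 == 0 or power2 == 0:
        return 0.0
    return abs(cross) ** 2 / (power1 * power2)
```

Averaging over more than one frame is essential: with a single frame the estimate is always 1 regardless of the signals.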

14.

An iterative noise cross-PSD estimation for two-microphone speech enhancement (total citations: 2, self-citations: 0, citations by others: 2)

Mohsen Rahmani Ahmad Akbari Beghdad Ayad 《Applied Acoustics》2009,70(3):514-521

Among various speech enhancement methods, dual-microphone methods are of great importance because of their low implementation cost and because they exploit the spatial-filtering benefits of microphone arrays. Coherence-based methods are well known as efficient two-microphone noise reduction techniques. These techniques do not work well when the received noise signals are correlated, but they can be improved when the cross power spectral density (CPSD) of the noise is available. In this paper, we propose an iterative approach for estimating the noise CPSD to be employed in coherence-based methods. We compare the proposed iterative noise CPSD estimation with a noise CPSD estimation technique based on a voice activity detector (VAD), each employed separately in a two-microphone speech enhancement system. Evaluation results show that the two-microphone speech enhancement scheme using the proposed noise CPSD estimation technique outperforms the enhancement system using the VAD-based noise CPSD estimation.

15.

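A broadband high-resolution bearing spectrum estimation algorithm using the conditional wavenumber spectral density function (total citations: 1, self-citations: 0, citations by others: 1)

李学敏 黄海宁 李宇 叶青华 张扬帆 《声学学报》2019,44(4):585-594

To address the large bearing estimation bias and high algorithmic complexity of broadband high-resolution bearing estimation, a broadband high-resolution bearing spectrum estimation algorithm based on the conditional wavenumber spectral density (CWSD-based) is proposed. The algorithm uses the conditional wavenumber spectral density to transform the array signals into frequency-wavenumber space, where the coordinates of the broadband signal energy follow a linear distribution related to the angle of incidence. By borrowing the principle of straight-line detection, it achieves high-resolution bearing estimation of closely spaced targets without needing prior estimates of the angles or the number of sources. Simulation results show that the theoretical resolution of the algorithm is inversely proportional to the highest processed frequency, that the mean-square estimation error is about 0.1°, and that the algorithm is robust to array shape distortion and computationally efficient. Sea trial data show that the method outperforms conventional broadband beamforming and the minimum variance distortionless response algorithm in bearing resolution, weak target detection, suppression of noise from non-target directions, and robustness, and that it can achieve high-resolution direction-of-arrival estimation with ultra-low sidelobes in a real ocean environment.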

16.

Spectral and temporal cues for phoneme recognition in noise

Xu L Zheng Y 《The Journal of the Acoustical Society of America》2007,122(3):1758

Cochlear implant users receive limited spectral and temporal information. Their speech recognition deteriorates dramatically in noise. The aim of the present study was to determine the relative contributions of spectral and temporal cues to speech recognition in noise. Spectral information was manipulated by varying the number of channels from 2 to 32 in a noise-excited vocoder. Temporal information was manipulated by varying the low-pass cutoff frequency of the envelope extractor from 1 to 512 Hz. Ten normal-hearing, native speakers of English participated in tests of phoneme recognition using vocoder processed consonants and vowels under three conditions (quiet, and +6 and 0 dB signal-to-noise ratios). The number of channels required for vowel-recognition performance to plateau increased from 12 in quiet to 16-24 in the two noise conditions. However, for consonant recognition, no further improvement in performance was evident when the number of channels was ≥12 in any of the three conditions. The contribution of temporal cues for phoneme recognition showed a similar pattern in both quiet and noise conditions. Similar to the quiet conditions, there was a trade-off between temporal and spectral cues for phoneme recognition in noise.

17.

Relationship between perception of spectral ripple and speech recognition in cochlear implant and vocoder listeners

Litvak LM Spahr AJ Saoji AA Fridman GY 《The Journal of the Acoustical Society of America》2007,122(2):982-991

Spectral resolution has been reported to be closely related to vowel and consonant recognition in cochlear implant (CI) listeners. One measure of spectral resolution is spectral modulation threshold (SMT), which is defined as the smallest detectable spectral contrast in the spectral ripple stimulus. SMT may be determined by the activation pattern associated with electrical stimulation. In the present study, broad activation patterns were simulated using a multi-band vocoder to determine if similar impairments in speech understanding scores could be produced in normal-hearing listeners. Tokens were first decomposed into 15 logarithmically spaced bands and then re-synthesized by multiplying the envelope of each band by matched filtered noise. Various amounts of current spread were simulated by adjusting the drop-off of the noise spectrum away from the peak (40-5 dB/octave). The average SMT (0.25 and 0.5 cycles/octave) increased from 6.3 to 22.5 dB, while average vowel identification scores dropped from 86% to 19% and consonant identification scores dropped from 93% to 59%. In each condition, the impairments in speech understanding were generally similar to those found in CI listeners with similar SMTs, suggesting that variability in spread of neural activation largely accounts for the variability in speech perception of CI listeners.

18.

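A wavelet packet denoising algorithm with time-frequency combined adaptive thresholding

田玉静 董玉民 左红伟 《应用声学》2010,29(4):256-262

Taking full account of the auditory characteristics of the human ear and the statistical characteristics of noise, a speech denoising algorithm with time-frequency combined, Bark-scale adaptive thresholding is proposed; adaptively adjusting the enhancement coefficient in the Bark frequency domain allows the threshold decision to be made fairly accurately. Simulation experiments confirm that, for low-SNR input, the time-frequency combined algorithm has a clear advantage over traditional speech noise reduction methods: while removing Gaussian white noise it effectively reduces speech loss, attaining the highest SNR and the smallest spectral distortion measure, and the MOS (Mean Opinion Score) of the enhanced speech rises markedly, giving good listening quality.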

19.

Performance of time- and frequency-domain binaural beamformers based on recorded signals from real rooms (total citations: 1, self-citations: 0, citations by others: 1)

Lockwood ME Jones DL Bilger RC Lansing CR O'Brien WD Wheeler BC Feng AS 《The Journal of the Acoustical Society of America》2004,115(1):379-391

Extraction of a target sound source amidst multiple interfering sound sources is difficult when there are fewer sensors than sources, as is the case for human listeners in the classic cocktail-party situation. This study compares the signal extraction performance of five algorithms using recordings of speech sources made with three different two-microphone arrays in three rooms of varying reverberation time. Test signals, consisting of two to five speech sources, were constructed for each room and array. The signals were processed with each algorithm, and the signal extraction performance was quantified by calculating the signal-to-noise ratio of the output. A frequency-domain minimum-variance distortionless-response beamformer outperformed the time-domain based Frost beamformer and generalized sidelobe canceler for all tests with two or more interfering sound sources, and performed comparably or better than the time-domain algorithms for tests with one interfering sound source. The frequency-domain minimum-variance algorithm offered performance comparable to that of the Peissig-Kollmeier binaural frequency-domain algorithm, but with much less distortion of the target signal. Comparisons were also made to a simple beamformer. In addition, computer simulations illustrate that, when processing speech signals, the chosen implementation of the frequency-domain minimum-variance technique adapts more quickly and accurately than time-domain techniques.
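For readers unfamiliar with the minimum-variance distortionless-response (MVDR) idea, the textbook weight formula for a single frequency bin, w = R⁻¹d / (dᴴR⁻¹d), can be sketched for the two-microphone case. This is the generic formula, not the specific adaptive implementation evaluated in the study; the function name `mvdr_weights_2mic`, the nested-list covariance layout and the singularity tolerance are assumptions for illustration.

```python
def mvdr_weights_2mic(R, d):
    """MVDR weights for one bin: R is a 2x2 covariance, d a 2-element steering vector."""
    (a, b), (c, e) = R
    det = a * e - b * c
    if abs(det) < 1e-12:
        raise ValueError("covariance matrix is (near) singular")
    # Closed-form inverse of the 2x2 covariance matrix.
    inv = [[e / det, -b / det], [-c / det, a / det]]
    # u = R^{-1} d
    u = [inv[0][0] * d[0] + inv[0][1] * d[1],
         inv[1][0] * d[0] + inv[1][1] * d[1]]
    # Normalize so that the response toward d is exactly one.
    denom = d[0].conjugate() * u[0] + d[1].conjugate() * u[1]
    return [u[0] / denom, u[1] / denom]
```

The normalization is what makes the beamformer distortionless: the weighted sum of a signal arriving along `d` passes with unit gain, while the inverse covariance steers nulls toward whatever else dominates `R`.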

20.

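Stereophonic acoustic echo cancellation with a two-stage complex-spectrum convolutional recurrent network

程琳娟 彭任华 郑成诗 李晓东 《声学学报》2023,48(1):199-214

A stereophonic acoustic echo cancellation (SAEC) algorithm based on a two-stage complex-spectrum convolutional recurrent network (CRN) is proposed. The algorithm does not need to decorrelate the stereo signals, so it preserves stereo sound quality and spatial impression while resolving the non-uniqueness problem of adaptive-filter SAEC algorithms. Echo is cancelled in two stages: the first stage estimates the echo signal from the microphone signal and the reference signals; the second stage takes the estimated echo as prior information, together with the microphone signal, as input features and estimates the near-end speech. Compared with a single-stage CRN algorithm, this approach improves the network's ability to discriminate between echo and near-end speech, which helps extract the near-end speech. In addition, both the input features and the training target of the network use complex spectra, which reduces the phase estimation error of the near-end speech and further improves performance. Experiments show that the SAEC algorithm based on the two-stage complex-spectrum CRN clearly outperforms traditional algorithms and the single-stage CRN algorithm in both echo suppression during single talk and speech quality during double talk.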