Similar literature
20 similar articles found
1.
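A speech enhancement method combining magnitude-spectrum and power-spectrum dictionaries   Cited by: 1 (self-citations: 0, citations by others: 1)
An improved sparse-representation speech enhancement method based on spectral features is proposed, addressing dual-path dictionary learning, noise power spectrum estimation, and speech magnitude spectrum reconstruction. In the dictionary learning stage, power-spectrum and magnitude-spectrum features are fused, and discriminative dictionaries are used to reduce the coherence between the speech dictionary and the noise dictionary. In the enhancement stage, a noise power spectrum estimation method is proposed to track non-stationary noise. Because magnitude-spectrum and power-spectrum features suit different noises to different degrees, a table of speech reconstruction weights is designed. The two signals recovered from the magnitude spectrum and the power spectrum are combined by adaptive weighting, and a phase compensation function is applied to obtain the enhanced speech signal. Experimental results show that, in stationary and non-stationary noise, the method improves on speech enhancement methods that use a single spectral feature by 31.6% on average.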

2.
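Because conventional spectral-subtraction speech enhancement leaves residual "musical noise", the sound quality perceived through a cochlear implant (CI) that relies on conventional spectral subtraction for noise reduction also suffers. To improve the noise robustness of CIs, this paper proposes an adaptive variable-order spectral subtraction algorithm and applies it to speech enhancement for cochlear implants. Following the frequency bands assigned to the CI electrodes, the algorithm first divides the power spectrum of the captured noisy signal into Bark subbands, then adaptively adjusts the spectral-subtraction order and coefficient in each Bark subband according to changes in the signal-to-noise ratio, so that noise is removed more evenly across subbands and the "musical noise" of the conventional method is largely eliminated. CI ACE simulation experiments and listening tests based on the algorithm show that, compared with conventional spectral subtraction, the improved algorithm suppresses background noise and residual noise better, and the simulated CI synthesized sound is perceived as better and clearer.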
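The abstract above gives no formulas, so the following is only a minimal sketch of generalized spectral subtraction applied per subband, where both the exponent ("order") and the subtraction coefficient depend on the subband SNR. The function names, the SNR-to-parameter rule in `subband_params`, the spectral floor, and the band edges passed in by the caller are all assumptions made for illustration, not the paper's actual rule or its Bark band layout.

```python
import math

def subband_params(snr_db):
    """Hypothetical rule mapping a subband SNR in dB to (order, factor)."""
    s = min(max(snr_db, -5.0), 20.0)   # clamp to an assumed working range
    t = (s + 5.0) / 25.0               # 0.0 at -5 dB, 1.0 at 20 dB
    order = 1.0 + t                    # assumed: 1 (magnitude) up to 2 (power)
    factor = 4.0 - 3.0 * t             # assumed: subtract more at low SNR
    return order, factor

def spectral_subtract(noisy_mag, noise_mag, bands, floor=0.02):
    """Subtract a noise estimate from one frame of magnitudes, band by band.

    noisy_mag, noise_mag: per-bin magnitudes of the noisy frame and the
    noise estimate. bands: list of (start, stop) bin ranges, one per subband.
    """
    enhanced = list(noisy_mag)
    for start, stop in bands:
        noisy_pow = sum(m * m for m in noisy_mag[start:stop])
        noise_pow = sum(m * m for m in noise_mag[start:stop])
        # Noisy-to-noise power ratio of the subband, in dB.
        snr_db = 10.0 * math.log10(max(noisy_pow, 1e-12) / max(noise_pow, 1e-12))
        order, factor = subband_params(snr_db)
        for k in range(start, stop):
            diff = noisy_mag[k] ** order - factor * noise_mag[k] ** order
            # A small floor keeps bins from collapsing to zero, a common
            # way to limit isolated spectral peaks ("musical noise").
            spectral_floor = floor * noise_mag[k] ** order
            enhanced[k] = max(diff, spectral_floor) ** (1.0 / order)
    return enhanced
```

With a zero noise estimate the frame passes through unchanged; when the noise estimate dominates a bin, the result falls back to the spectral floor.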

3.
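Enhanced relative spectral (RASTA) filtering of speech signals   Cited by: 3 (self-citations: 0, citations by others: 3)
A new enhanced RASTA filtering method (ERASTA) is proposed that filters sequentially in the log power spectral domain and the power spectral domain of the speech signal. Speech recognition and speaker recognition experiments show that ERASTA filtering effectively removes interference from additive and convolutional noise, and that the ERASTA algorithm is independent of the distortion process of the speech signal and of the noise power spectrum. ERASTA performs similarly to or better than the JRASTA algorithm and does not need the real-time speech SNR estimate that JRASTA requires. The design of the ERASTA filter shows that low-frequency spectral modulation components can degrade speech recognition and speaker recognition performance, and that speaker recognition requires a narrower spectro-temporal modulation bandwidth than speech recognition.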

4.
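CHENG Ning, LIU Wenju, Acta Acustica (《声学学报》), 2009, 34(6): 554-565
To address subspace selection in signal-subspace speech enhancement, and the estimation of the noise power spectrum and the Lagrange multiplier in the linear filter, the distribution of speech is described with Gaussian, Laplacian, and Gamma models, and a method is proposed that determines the dimension of the signal subspace by maximizing the probability of the target speech. In the noise subspace, the noise power spectrum is estimated using conditional probabilities. Then, to trade off residual noise against speech distortion in the enhanced speech, a Lagrange multiplier estimation method based on the auditory masking effect of the human ear is proposed. Experiments show that the proposed algorithm achieves better results on several speech quality measures. It removes noise more effectively than conventional signal-subspace algorithms and yields recovered speech of better quality.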

5.
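Speech enhancement using an adaptive smoothed periodogram   Cited by: 2 (self-citations: 0, citations by others: 2)
An adaptive smoothed periodogram is proposed that smooths across frequency bands according to the structural features of the power spectrum. It resolves the conflict between frequency resolution and variance in periodogram estimation and is applied to magnitude spectral subtraction for speech enhancement. Test results show that, for noises with various power-spectrum structures, spectral subtraction with the adaptive smoothed periodogram outperforms spectral subtraction with other periodogram estimators on measures such as average segmental SNR improvement and average log-spectral distance.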

6.
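Psychoacoustic speech enhancement based on the multitaper spectrum   Cited by: 5 (self-citations: 2, citations by others: 5)
WU Hongwei, WU Zhenyang, ZHAO Li, Acta Acustica (《声学学报》), 2007, 32(3): 275-281
Compared with the conventional periodogram, the multitaper spectrum has a smaller estimation variance. The noise and the noise-to-noisy-speech ratio (NNSR) are estimated from the multitaper spectrum of the noisy speech. NNSR-based magnitude spectral subtraction produces a pre-enhanced speech signal that is used to compute the masking threshold of the human ear, and a psychoacoustic weighting rule that incorporates the masking threshold produces the final enhanced speech. The masking offset is corrected to account for the properties of the multitaper spectrum; after the correction, the modified Bark spectral distortion of the reconstructed speech, an objective measure, improves somewhat over the uncorrected version. When the psychoacoustic weighting rule is additionally constrained so that its maximum is less than 1, the higher the input SNR (above 0 dB), the greater the improvement in segmental SNR and overall SNR. Informal listening shows that the reconstructed speech has little distortion, greatly reduced background noise, and no musical noise.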

7.
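Whispered speech enhancement based on a modified Mel-domain masking model and speech absence probability   Cited by: 1 (self-citations: 0, citations by others: 1)
A whispered speech enhancement method based on a modified Mel-domain auditory masking model and the speech absence probability is proposed. The method modifies the Mel frequency scale according to the articulation characteristics of whispered speech, applies Mel-domain band filtering to each frame of the whispered signal, dynamically determines the auditory masking threshold of each band from the speech absence probability (SAP), and adaptively adjusts the spectral subtraction coefficients according to the masking thresholds to enhance the whispered speech. Objective and subjective tests of the enhanced whispered speech show that, compared with other spectral subtraction methods, the method keeps residual and background noise below the masking threshold of the human ear, yields less speech distortion, and greatly improves subjective listening quality.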

8.
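Speech enhancement based on the auditory masking effect and the Bark wavelet transform   Cited by: 22 (self-citations: 3, citations by others: 19)
A speech enhancement method is proposed that improves the perceived quality of speech at low signal-to-noise ratios. Building on spectral subtraction, it has two features. First, the subtraction parameters are derived from the auditory masking effect of the human ear and are adaptive. Second, the Bark wavelet transform, which matches the characteristics of the human auditory system more closely, is used to analyze the speech before and after enhancement. Objective and subjective tests show that, compared with spectral subtraction, for low-SNR speech the method (1) suppresses residual noise and background noise better and (2) yields enhanced speech with better clarity and intelligibility.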

9.
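A maximum-likelihood method for estimating the noise log power spectrum is proposed. A Gaussian mixture model is used to build a statistical model of the power-spectrum envelope in each frequency band. The temporal envelope is divided into speech and non-speech classes, which correspond to the two Gaussian components of the mixture and describe the statistical distributions of speech and non-speech; the mean of the non-speech Gaussian component is the optimal estimate of the noise power spectrum. Using sequential learning, the model parameters are updated frame by frame under the maximum-likelihood criterion, and an optimal noise power spectrum estimate is produced for every frame. In addition, because a long absence of speech during sequential updating can easily destabilize the model, an online minimum description length (MDL) criterion is proposed to detect whether speech has been absent for a long time, which keeps the model stable. Experiments show that the algorithm overall outperforms the classic MS and IMCRA algorithms.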

10.
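Existing supervised speech enhancement overlooks how the magnitude-spectrum similarity among clean speech, noise, and noisy speech affects enhancement. To address this, a speech enhancement method that combines an accurate ratio mask (ARM) with a deep neural network (DNN) is proposed. Using the normalized cross-correlation coefficients of the magnitude spectra of clean speech and noisy speech, and of noise and noisy speech, the method designs an accurate ratio mask, built on the time-frequency ideal ratio mask, as the target mask. A DNN whose training targets are the clean speech and noise magnitude spectra serves as the baseline; the target mask is estimated from the output of this DNN, the baseline DNN and the target mask are jointly optimized, and the enhanced speech is estimated from the noisy speech by the target mask. In addition, to exploit the discriminative information between clean speech and noise, a discriminative training function replaces the mean squared error (MSE) function as the objective of the baseline DNN so that the network output is more accurate. Experiments show that the discriminative training function improves the enhancement performance of both the baseline DNN and the whole jointly optimized network. Under matched and mismatched noise, the proposed method achieves higher average Perceptual Evaluation of Speech Quality (PESQ) and Short-Time Objective Intelligibility (STOI) scores than other common DNN methods, and the enhanced speech retains more speech components while suppressing noise more strongly.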
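The abstract does not define the accurate ratio mask itself, so the sketch below only illustrates the two ingredients it names: the standard ideal ratio mask, taken here as sqrt(S^2 / (S^2 + N^2)) per time-frequency bin, and a normalized cross-correlation coefficient between two magnitude spectra. The function names are hypothetical, the uncentered form of the correlation is an assumption, and how the paper combines the two into the ARM is not reproduced here.

```python
import math

def ideal_ratio_mask(speech_mag, noise_mag):
    """Ideal ratio mask per bin from clean-speech and noise magnitudes."""
    mask = []
    for s, n in zip(speech_mag, noise_mag):
        total = s * s + n * n
        # A bin with no energy at all gets a mask value of zero.
        mask.append(math.sqrt(s * s / total) if total > 0.0 else 0.0)
    return mask

def normalized_cross_correlation(a, b):
    """Uncentered normalized cross-correlation of two magnitude spectra.

    Assumption: no mean removal; the paper may define it differently.
    """
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den > 0.0 else 0.0

def apply_mask(mask, noisy_mag):
    """Estimate the enhanced magnitudes by scaling the noisy magnitudes."""
    return [m * y for m, y in zip(mask, noisy_mag)]
```

In a supervised system the mask would be predicted by the network from noisy features; the oracle version above needs the clean speech and noise, so it is useful only as a training target or an upper bound.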

11.
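CHEN Liwei, ZHANG Ye, Applied Acoustics (《应用声学》), 2006, 25(2): 90-95
An inhomogeneous hidden Markov model (HMM) is studied and then combined with a self-organizing feature map neural network to train a noise-robust HMM, and the hybrid model is applied to speech recognition. Experimental results show that the model is suited to recognizing speech in background noise. It is more robust to noise, and at low SNR (5 dB to 10 dB) the recognition rate improves by about 5%.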

12.
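For regional active noise control, and to obtain better noise reduction, an active control system with optimized secondary-source placement is proposed based on the measured secondary-path transfer functions, and the actual noise reduction of two placement optimization algorithms is compared in detail with that of uniform placement. The first optimization algorithm is a constrained matching pursuit algorithm with an l2-norm constraint; the second is a sparse regularization method with an l1-norm constraint. Multichannel active noise control experiments were carried out with a linear loudspeaker array in a fully anechoic chamber. The results show that from 200 to 1000 Hz the average noise reduction of the system with optimized placement is about 5 dB greater than that of the system with uniform placement. From 1100 to 1900 Hz it is roughly 11 to 13 dB greater, the noise reduction is distributed more evenly, and the secondary sources output less energy. Of the two optimization algorithms, the sparse regularization method reduces noise better.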

13.
Thresholds for 10-ms sinusoids simultaneously masked by bursts of bandpass noise centered on the signal frequency were measured for a wide range of signal frequencies and noise levels. Thresholds were defined as the signal power relative to the masker power at the output of an auditory filter centered on the signal frequency. It was found that the presentation of a continuous random noise, with a spectral notch centered on the signal frequency, produced a reduction in signal thresholds of up to 11 dB. A notched noise spectrum level of 0-5 dB above that of the masker proved most effective in producing a masking release, as measured by a reduction in masked threshold. A release from masking of up to 7 dB could be obtained with a continuous bandpass noise. The most effective spectrum level of this noise was 5 dB below that of the masker. The effect of the continuous notched noise was to reduce signal-to-masker ratios at threshold to about 0 dB, regardless of the threshold in the absence of continuous noise. Thus the greatest release from masking occurred when "unreleased" thresholds were highest. The release from masking is almost complete within 320 ms of notched noise onset, and persists for about 160 ms after notched noise offset, regardless of notched noise level. The phenomenon is similar in many ways to the "overshoot" effect reported by Zwicker [J. Acoust. Soc. Am. 37, 653-663 (1965)]. It is argued that both effects can be largely attributed to peripheral short-term adaptation, a mechanism which is also believed to be involved in forward masking.

14.
It can be difficult for the voice clinician to observe or measure how a patient uses his voice in a noisy environment. We consider here a novel method for obtaining this information in the laboratory. Worksite noise and filtered white noise were reproduced over high-fidelity loudspeakers. In this noise, 11 subjects read an instructional text of 1.5 to 2 minutes duration, as if addressing a group of people. Using channel estimation techniques, the site noise was suppressed from the recording, and the voice signal alone was recovered. The attainable noise rejection is limited only by the precision of the experimental setup, which includes the need for the subject to remain still so as not to perturb the estimated acoustic channel. This feasibility study, with 7 female and 4 male subjects, showed that small displacements of the speaker's body, even breathing, impose a practical limit on the attainable noise rejection. The noise rejection was typically 30 dB and maximally 40 dB down over the entire voice spectrum. Recordings thus processed were clean enough to permit voice analysis with the long-time average spectrum and the computerized phonetogram. The effects of site noise on voice sound pressure level, fundamental frequency, long-term average spectrum centroid, phonetogram area, and phonation time were much as expected, but with some interesting differences between females and males.

15.
An ocean surface wave spectrum suited to low-frequency ambient noise in deep water is proposed. It explains the mechanism of low-frequency ambient noise through the theoretical relation between the sound pressure spectrum and the wave spectrum. Combining the surface wave spectrum with the local wind speed in deep water, a theoretical expression for low-frequency ambient noise is obtained from wave-generated noise theory. Simulation results show that the wave spectrum is crucial to the intensity and the spectral slope of the radiated noise spectrum, and that the theoretical noise spectrum can be used to predict the ambient noise in deep water. The predictions are verified against experimental data recorded by an ocean bottom seismometer deployed on the deep-water seafloor in April 2016. The statistical noise levels from the experimental data for frequencies from 1 Hz to 100 Hz are larger than 70 dB, and the low-frequency ambient noise spectrum follows the shape of an inverted "N": the valley of the noise spectrum is at 3-4 Hz, where the noise intensity is 70 dB, and the peak is at 50 Hz, where the noise intensity is 92 dB. The correlation coefficient between the model spectrum and the measured data is 0.95.

16.
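XU Dong, LI Fenghua, GUO Yonggang, WANG Yuan, Acta Acustica (《声学学报》), 2018, 43(2): 137-144
A wave spectrum suited to low-frequency ambient noise in the deep ocean is proposed. Using the theoretical relation between the sound pressure spectrum and the wave spectrum, the spectral characteristics of deep-ocean low-frequency noise below 100 Hz are analyzed, and the main generation mechanisms of the noise spectrum in different frequency bands are explained. Combining the sea-surface wave spectrum under deep-ocean propagation conditions with the sea-surface wind speed, a theoretical representation of low-frequency ocean ambient noise is obtained from wave-generated noise theory. Simulations show that the wave spectrum determines the intensity and slope of the radiated noise spectrum, and that the theoretical noise spectrum given by the model can be used to predict low-frequency ocean ambient noise. Analysis of data from a 2016 deep-ocean experiment shows that the statistical ambient noise spectrum level exceeds 70 dB from 1 Hz to 100 Hz and that the noise spectrum has an inverted-"N" shape at low frequencies, with a valley at 3-4 Hz, where the noise level is 70 dB, and a peak at 50 Hz, where the noise level is 92 dB. The correlation coefficient between the theoretical calculation and the experiment is 0.95, indicating good agreement between theory and measurement.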

17.
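Experimental study of the phase dependence of non-stationary shot noise   Cited by: 1 (self-citations: 1, citations by others: 0)
To suppress the shot noise caused by phase modulation, the power spectrum of non-stationary shot noise at different demodulation phases was studied experimentally by modulating the output intensity of a diode laser and applying phase demodulation. At the shot-noise minimum, a sensitive measurement was obtained with shot noise 3 dB lower than in the unmodulated case. The experimental results show that choosing a suitable modulation phase can suppress shot noise to some extent, in agreement with theoretical analysis.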

18.
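During fast or low-power acquisition of Raman spectra from biological samples, the spectra obtained often have a low signal-to-noise ratio. For such low-SNR data, a method is proposed that decomposes the Raman spectrum with complementary ensemble empirical mode decomposition (CEEMD) and removes the noise component proportionally according to the normalized permutation entropy (NPE) of each intrinsic mode function; it is called local complementary ensemble empirical mode decomposition (LCEEMD). LCEEMD not only resolves the mode mixing between high-frequency signal and noise in empirical mode decomposition (EMD) but also effectively reduces the residual noise of ensemble empirical mode decomposition (EEMD). In simulated-data experiments, LCEEMD processing of a simulated spectrum with a 10 dB SNR gave an SNR of 39.6150 dB, a standard deviation of 0.00117, and a correlation coefficient of 0.9999. In a Raman experiment on human skin, the LCEEMD-filtered data accurately showed the strong Raman peak excited by the amide I band of stratum corneum lipids and the weak (CO) ester peak of triglycerides. In a quantitative prediction model for soluble sugar in rice leaves, LCEEMD achieved a prediction correlation coefficient of 0.8717 and a standard error of prediction of 0.9120, better than EMD and EEMD soft-threshold denoising (0.5114, 1.6478 and 0.6382, 1.5088). When LCEEMD is applied, the normalized permutation entropy threshold is adjusted by feedback from denoising performance metrics until the best denoising is obtained; the filtering process requires no parameter setting and can be carried out adaptively.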
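The normalized permutation entropy mentioned above can be sketched as follows. This is the usual Bandt-Pompe definition (entropy of ordinal patterns divided by the logarithm of the number of possible patterns), written as an illustration; the function name, the default `order` and `delay`, and the tie-breaking by position are assumptions, and the CEEMD decomposition and the proportional noise removal of the paper are not shown.

```python
import math
from collections import Counter

def normalized_permutation_entropy(x, order=3, delay=1):
    """Normalized permutation entropy of a sequence, in the range [0, 1].

    Values near 1 indicate a noise-like sequence; values near 0 indicate
    a highly regular one.
    """
    n = len(x) - (order - 1) * delay
    if order < 2 or n <= 0:
        raise ValueError("sequence too short for the chosen order and delay")
    counts = Counter()
    for i in range(n):
        window = [x[i + j * delay] for j in range(order)]
        # Ordinal pattern: the indices that sort the window (ties keep
        # their original order because Python's sort is stable).
        pattern = tuple(sorted(range(order), key=window.__getitem__))
        counts[pattern] += 1
    entropy = 0.0
    for c in counts.values():
        p = c / n
        entropy -= p * math.log(p)
    return entropy / math.log(math.factorial(order))
```

Computed for each intrinsic mode function, such a value gives a single number by which noise-dominated components can be ranked against a threshold.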

19.
A 24 dB gain bismuth-doped fiber amplifier at 1430 nm pumped by a 65 mW commercial laser diode at 1310 nm is reported for the first time (to our knowledge). A 3 dB bandwidth of about 40 nm, a noise figure of 6 dB, and a power conversion efficiency of about 60% are demonstrated. The temperature behavior of the gain spectrum is examined.

20.
The effective internal level of a 1-kHz tone at 50 dB SPL was estimated by measuring the forward masking produced on a 10-ms signal tone of the same frequency. Noise containing a spectral notch was then added to the masker tone, and its influence on the effective level of the tone was measured with a variety of noise levels, notch widths, and notch shapes. In experiment 1, the masker tone was centered in the spectral notch, itself centered in a 2-kHz band of noise. As the spectrum level in the noise passbands increased from 6 dB/Hz to 36 dB/Hz, signal threshold decreased, indicating a decrease in masking by the masker tone. This "unmasking" effect of the noise was attributed to suppression of the masker tone by the components in the noise. Unmasking was greatest with the narrowest spectral notch (250 Hz), and decreased to zero as the notch widened to 1500 Hz. Compared to its level when presented alone, the effective internal level of the masker tone could be reduced by up to 30 dB (250-Hz notch, 36 dB/Hz). The relative suppressive strength of individual noise components was estimated in experiment 2, in which the 1-kHz masker tone was located at one edge of a spectral notch, rather than in the center. Noise spectrum level was fixed at 16 dB/Hz. As notch width decreased to zero, on either the high-frequency or low-frequency side of the masker tone, its effective internal level was again reduced by approximately 30 dB. In a tentative analysis, the first derivative of the smoothed threshold function was taken, to provide an estimate of the relative contributions to suppression at 1 kHz of noise components between 250 and 1740 Hz. (ABSTRACT TRUNCATED AT 250 WORDS)
