Similar literature
20 similar documents found (search time: 171 ms)
1.
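Conventional speech enhancement algorithms can only raise the signal-to-noise ratio (SNR) in strong noise; they cannot improve intelligibility. This paper proposes a new intelligibility enhancement algorithm that replaces the non-speech portions of the signal with adjustable white noise. Experiments show that the method clearly improves speech intelligibility in strong noise and can effectively enhance the intelligibility of noisy speech at SNRs as low as -10 dB.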

2.
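This work studies a method for identifying aircraft type from the cabin noise carried in shortwave voice communications. The physical characteristics of aircraft cabin noise under the shortwave channel and voice-communication interference are analyzed, a formula for estimating the aircraft-noise SNR of speech segments is defined, and an adaptive model that suppresses speech and enhances aircraft noise is proposed. Power spectral density level features of the target signal in different frequency bands are extracted with the chirp-Z transform (CZT), and a binary classification tree built from support vector machines is designed. In experiments on 8 classes of field-measured data, the average SNR of the speech segments improved by about 22 dB after enhancement, and the classification tree achieved average recognition rates of 82.79%, 15.25%, and 50.18% for inter-response interval noise, speech-segment signals, and enhanced signals, respectively. The experiments show that the noise in the intervals between voice responses can be used for aircraft type identification, and that the speech suppression algorithm yields large gains in SNR and recognition rate, demonstrating that speech segments carry information important for aircraft type identification and laying a foundation for further research.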

3.
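Through research on a noise-resistant high-frequency-speech communication system, experiments demonstrate that high-frequency-speech communication with this system is highly resistant to ambient noise and largely solves the problem of conveying speech in noisy environments. The work covers the mathematical expression for generating high-frequency speech; the research and design of a high-frequency-speech communication device; laboratory intelligibility experiments using the device at speech levels of 85, 90, and 95 dB(A) in noise environments above those levels; and intelligibility tests with and without earplugs at a high-frequency-speech level of 90 dB(A) in a 105 dB(A) ship main engine room. With this system, speech intelligibility exceeds 90% at speech-to-ambient-noise ratios of -10 to -15 dB. Progressing from theory to a successfully developed device, the work overturns the claim that conventional speech transmission, which follows the place mechanism, requires an SNR of +5 dB.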

4.
王炜宇  马蕙  王超 《应用声学》2023,42(4):844-852
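Voice alarm broadcasts are an effective means of improving evacuation efficiency in buildings during emergencies. Using laboratory experiments, this study examined how speech characteristics and sound-field factors affect alarm speech intelligibility and subjective perception among older adults. Subjective perception was assessed along two dimensions: listening ease and perceived urgency. The results show that older adults' alarm speech intelligibility and listening-ease ratings are mainly affected by speech rate, SNR, and reverberation time, and follow the same trend: both scores rise as speech rate and reverberation time decrease and as SNR increases, whereas sound pressure level (lowest setting 60 dB), the presence or absence of an alarm bell, and noise type have no significant effect. Perceived urgency increases significantly with speech rate and sound pressure level, while SNR, reverberation time, and the presence or absence of an alarm bell have no significant effect on it. Alarm speech delivered by a human voice has significantly higher intelligibility and perceived urgency than synthesized speech. Comparing the older and younger groups shows significant differences in how speech rate, sound pressure level, and noise type affect subjective ratings. To create an ideal and safe acoustic environment for older adults, alarms should use a human voice at a suitably reduced speech rate to ensure intelligibility, and both the reverberation and the SNR conditions need to be improved.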

5.
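Psychoacoustic experiments were used to study spatial release from masking for English speech in native Chinese listeners, under two types of interfering sound arriving from different directions at different SNRs. Loudspeakers placed at specified positions in an anechoic chamber played the target and interfering sounds, and recognition accuracy was obtained from the listeners' identification of the target sound. The results show that when only the target speech was played from directly in front, recognition accuracy exceeded 99%; when the target and interfering speech were both directly in front of the listener, accuracy was 57%; and when the target and interfering speech were randomly located at ±60°, accuracy was 96%. In particular, when the target speech and the interferer were both directly in front of the listener and the interferer was noise, accuracy fell from 96% to 34% as the SNR decreased from 0 dB to -12 dB; when the interferer was speech, accuracy first fell as the SNR decreased from 0 dB to -12 dB, then rose markedly by an average of 27%, and then fell again. When the noise and speech interferers were located at 60°, accuracy fell from 99% to 80% and from 98% to 91%, respectively, as the SNR decreased from -4 dB to -16 dB. The study shows that spatial separation gives a clear gain in English speech intelligibility for native Chinese listeners, and that in most cases the recognition accuracy for English speech falls as the SNR decreases. This agrees with the conclusions of related studies of listeners with other native languages.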

6.
周健  郑文明  王青云  赵力 《声学学报》2014,39(4):501-508
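Two whispered-speech enhancement algorithms based on asymmetric cost functions are proposed, which treat the amplification distortion and the compression distortion that arise during speech enhancement differently. The Modified Itakura-Saito (MIS) algorithm penalizes amplification distortion more heavily, whereas the Kullback-Leibler (KL) algorithm penalizes compression distortion more heavily. Experimental results show that at low SNRs below -6 dB, whispered speech enhanced by the MIS algorithm is significantly more intelligible than with conventional algorithms, while the KL algorithm achieves an intelligibility improvement similar to that of the minimum mean-square error speech enhancement algorithm. This confirms that amplification distortion and compression distortion affect whispered-speech intelligibility differently: at low SNR, larger compression distortion helps improve whispered-speech intelligibility, whereas at high SNR, compression distortion has little effect on it.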

7.
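This paper presents spectral subtraction, autocorrelation, and linear filtering methods for extracting speech signals from background noise, together with the results of applying them to Chinese vowels, words, and sentences at SNRs from +6 dB to -6 dB. After processing, the SNR improves by about 6 dB in every case, but speech intelligibility does not improve noticeably. The paper also compares how well several fundamental-frequency extraction methods recover the fundamental frequency of speech in background noise.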
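
The spectral subtraction idea named in this abstract can be illustrated with a minimal sketch. It assumes magnitude-domain subtraction on a single frame and a spectral floor; the function name `spectral_subtract` and the `floor` parameter are hypothetical and are not taken from the paper.

```python
def spectral_subtract(noisy_mag, noise_mag, floor=0.02):
    """Subtract an estimated noise magnitude spectrum from a noisy one, bin by bin.

    noisy_mag: magnitude spectrum of one noisy frame
    noise_mag: noise magnitude estimate (for example, averaged over speech pauses)
    floor:     assumed fraction of the noisy magnitude kept as a lower bound
    """
    if len(noisy_mag) != len(noise_mag):
        raise ValueError("spectra must have the same number of bins")
    cleaned = []
    for y, n in zip(noisy_mag, noise_mag):
        # Clamp to a small fraction of the noisy magnitude so no bin goes negative.
        cleaned.append(max(y - n, floor * y))
    return cleaned


if __name__ == "__main__":
    print(spectral_subtract([1.0, 0.5, 0.1], [0.2, 0.2, 0.2]))
```

In a complete system the cleaned magnitudes would be recombined with the noisy phase and transformed back to the time domain; that step is omitted here.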

8.
蓝天  惠国强  李萌  吕忆蓝  刘峤 《声学学报》2020,45(6):897-905
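A speech enhancement method using a context-dependent attention mechanism and recurrent neural networks is proposed. In the training stage, the method jointly trains a multilayer perceptron that computes attention scores and a deep recurrent network that enhances speech; in the test stage, it computes an attention vector for each speech frame and concatenates it with that frame as input to the deep recurrent network for enhancement. In experiments at different SNRs, the method improves speech quality and intelligibility more than the baseline models: at -6 dB, short-time objective intelligibility (STOI) and perceptual evaluation of speech quality (PESQ) improve by 0.16 and 0.77, respectively, relative to the noisy speech, and under unseen noise conditions the method still performs best or close to best. The attention mechanism therefore effectively strengthens the model's use of contextual information and thus improves enhancement performance.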

9.
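A speech endpoint detection algorithm using the perceptual spectrogram structure boundary (PSSB) parameter is proposed for preprocessing speech signals in low-SNR environments. After the noisy speech is enhanced on the basis of auditory perceptual properties, the time-frequency spectrogram of the enhanced speech is further enhanced in two dimensions, exploiting the difference between the continuous distribution of speech and the random distribution of residual noise; this further highlights the spectrogram structure of the continuously distributed clean speech. The PSSB parameter is derived from two-dimensional boundary detection on the spectrogram structure of the enhanced speech and is used for endpoint detection. Experimental results show that, in white noise at SNRs from -10 dB to 10 dB, the PSSB-based algorithm detects speech endpoints more effectively than other endpoint detection algorithms. At the very low SNR of -10 dB, the proposed method still achieves 75.2% accuracy. The PSSB-based endpoint detection algorithm is therefore better suited to speech endpoint detection in low-SNR white-noise environments.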

10.
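Speech enhancement based on the auditory masking effect and the Bark wavelet transform   Cited by: 22 (self-citations: 3, citations by others: 19)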
陶智  赵鹤鸣  龚呈卉 《声学学报》2005,30(4):367-372
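A speech enhancement method that improves the perceived quality of speech at low SNR is proposed. Building on spectral subtraction, the method has two features: first, the subtraction parameter is derived from the masking effect of human hearing and is adaptive; second, the Bark wavelet transform, which better matches the characteristics of the human auditory system, is used to analyze the speech before and after enhancement. Objective and subjective tests of the algorithm show that, compared with spectral subtraction on low-SNR speech signals, it (1) suppresses residual noise and background noise better and (2) yields enhanced speech with better clarity and intelligibility.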

11.
The effect of ambient noise on vocal output and the preferred listening level of conversational speech was investigated under conditions typical of everyday speech communication. For a speaker-listener distance of 1 m, vocal output and the preferred listening level in quiet were both about 50 dB(A). Deviations from this value were observed when the noise level exceeded a level of about 40 dB(A). The regression lines for the data points above this level showed a 3 dB rise for a 10 dB rise in noise level. The experiments further suggest that both speaker and listener (when the latter is able to control the playback level of recorded speech) try to compensate for the noise interference by raising the level of speech in order to keep the (subjective) loudness of speech in noise equal to the loudness of speech in quiet.
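
The level relationship reported in this abstract can be written as a small piecewise-linear model. This is only a sketch: it assumes the regression line meets the quiet level of 50 dB(A) exactly at the 40 dB(A) knee, which the abstract does not state, and the function and parameter names are hypothetical.

```python
def preferred_speech_level(noise_dba, quiet_level=50.0, knee=40.0, slope=0.3):
    """Approximate speech level in dB(A) for a given ambient noise level.

    Below the knee the level stays at the quiet value; above it the level
    rises by 3 dB for every 10 dB of additional noise (slope 0.3).
    Continuity at the knee is an assumption made for illustration.
    """
    if noise_dba <= knee:
        return quiet_level
    return quiet_level + slope * (noise_dba - knee)


if __name__ == "__main__":
    for noise in (30.0, 40.0, 60.0, 80.0):
        print(noise, preferred_speech_level(noise))
```

Under these assumptions the speech-to-noise ratio shrinks by 7 dB for every 10 dB of added noise, so the compensation described in the abstract is only partial in level terms.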

12.
In the n-of-m strategy, the signal is processed through m bandpass filters from which only the n maximum envelope amplitudes are selected for stimulation. While this maximum selection criterion, adopted in the advanced combination encoder strategy, works well in quiet, it can be problematic in noise as it is sensitive to the spectral composition of the input signal and does not account for situations in which the masker completely dominates the target. A new selection criterion is proposed based on the signal-to-noise ratio (SNR) of individual channels. The new criterion selects target-dominated (SNR ≥ 0 dB) channels and discards masker-dominated (SNR < 0 dB) channels. Experiment 1 assessed cochlear implant users' performance with the proposed strategy assuming that the channel SNRs are known. Results indicated that the proposed strategy can restore speech intelligibility to the level attained in quiet independent of the type of masker (babble or continuous noise) and SNR level (0-10 dB) used. Results from experiment 2 showed that a 25% error rate can be tolerated in channel selection without compromising speech intelligibility. Overall, the findings from the present study suggest that the SNR criterion is an effective selection criterion for n-of-m strategies with the potential of restoring speech intelligibility.
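
The two selection criteria contrasted in this abstract can be sketched side by side. The sketch assumes the per-channel envelope amplitudes and SNRs are already available (as in Experiment 1, where channel SNRs are assumed known); the function names are hypothetical.

```python
def select_max_n(envelopes, n):
    """Conventional n-of-m rule: keep the n channels with the largest envelopes."""
    ranked = sorted(range(len(envelopes)), key=lambda i: envelopes[i], reverse=True)
    return sorted(ranked[:n])


def select_by_snr(channel_snr_db, threshold_db=0.0):
    """SNR rule: keep target-dominated channels (SNR at or above the threshold)."""
    return [i for i, snr in enumerate(channel_snr_db) if snr >= threshold_db]


if __name__ == "__main__":
    envelopes = [0.9, 0.1, 0.7, 0.4]   # illustrative envelope amplitudes
    snr_db = [-3.0, 5.0, 0.0, -0.5]    # illustrative per-channel SNRs
    print(select_max_n(envelopes, 2))  # [0, 2]
    print(select_by_snr(snr_db))       # [1, 2]
```

In this made-up frame the maximum rule picks channel 0 because it is loud even though the masker dominates it, whereas the SNR rule discards it. Note that the SNR rule selects a variable number of channels per frame.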

13.
Aiming at the poor detection rate of multi-frequency weak signals under a strong background of noise, a novel method based on adaptive stochastic resonance (SR) theory is proposed in this paper. The optimal parameters can be obtained automatically via measurement by establishing an adaptive SR system model and using the reverse location method. After passing through the adaptive SR system, the spectrum values of all eight signals greatly improve, the largest spectrum value gain increases from 12.41 to 2033 when the frequency is 0.01 Hz, which is an improvement of a factor of 162.8, and the signal-to-noise ratio (SNR) gain of the whole system is 10.3134 dB. Under the condition of different input noise intensities and signal amplitudes, the mean SNR of the system increases from –13.1136 to –2.7614 dB, which is a 78.9% increase, and the largest SNR gain is 13.4702 dB when the noise intensity D = 1.2 and signal amplitude A = 0.11. Compared to the single optimal spectrum value, when defining multiple optimum spectrum values as the SNR criterion, the detection sensitivity is less than 0.35 when the input noise intensity is between 0.5 and 2.5, and the sensitivity value is 6.29 times higher when D = 2.5. The system successfully realizes the adaptive detection of twelve weak signals, and the SNR gain is 7.9743 dB, which improves the channel capacity of signal detection. The experimental results demonstrate the high efficiency and strong applicability of the system, improving the signal processing efficiency and speed of signal transmission.

14.
The signals of running speech and sustained vowels of normals and subjects suffering from dysphonia were analyzed statistically with respect to the signal-to-noise ratio (SNR). The distribution of the SNR measured in multiple overlapping frames in the speech signal was described by a linear combination of the distribution frequencies for SNR = 0 dB, 0 dB < SNR < 15 dB, and SNR ≥ 15 dB. The values of the linear combination, the SNR of the vowels, and clinical assignment of the voices to normal and pathologic populations based on laryngoscopic and stroboscopic investigation parameters were used to compare the different evaluations of the voices. The SNR distribution in speech remained stable over signal lengths of more than 30 s. The correlation coefficient between the SNR measure for running speech and the SNR of sustained vowels amounted to only 0.63. The error rate in the discrimination between normal and dysphonic voices amounted to 22.6% in application to sustained vowels and 5.6% when the SNR distribution was used. Possible reasons for the observed discrepancies are discussed, and the results are compared to those of other studies.
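
The three-class SNR distribution described in this abstract can be sketched as a simple histogram over per-frame SNR values. The sketch assumes the frame SNRs have already been measured and that the measure is clipped at 0 dB, so any value at or below 0 dB falls into the first class; both points are assumptions, and the function name is hypothetical. The weights of the linear combination are not given in the abstract and are not modelled.

```python
def snr_distribution(frame_snrs_db):
    """Relative frequencies of frames with SNR = 0 dB, 0 < SNR < 15 dB, SNR >= 15 dB."""
    if not frame_snrs_db:
        raise ValueError("at least one frame is required")
    counts = [0, 0, 0]
    for snr in frame_snrs_db:
        if snr <= 0.0:
            counts[0] += 1      # assumed: values at or below 0 dB count as SNR = 0 dB
        elif snr < 15.0:
            counts[1] += 1
        else:
            counts[2] += 1
    total = len(frame_snrs_db)
    return [c / total for c in counts]


if __name__ == "__main__":
    print(snr_distribution([0.0, 3.0, 14.9, 15.0]))  # [0.25, 0.5, 0.25]
```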

15.
王玥  李平  崔杰 《声学学报》2013,38(4):501-508
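To find the best balance between noise suppression and speech distortion, an adaptive β-order Bayesian perceptual estimation speech enhancement algorithm based on the auditory frequency-domain masking effect is proposed, with the aim of improving overall enhancement performance. The algorithm exploits the masking effect of human hearing: it adaptively adjusts the value of β in the β-order Bayesian perceptual estimator according to the computed frequency-domain masking threshold, so that noise is suppressed only to just below the masking threshold, more speech information is retained, and speech distortion is reduced. The proposed algorithm was evaluated both objectively and subjectively and compared with the original SNR-based adaptive β-order Bayesian perceptual estimation algorithm. The results show that the composite objective scores of the frequency-masking-based method are higher than those of the SNR-based algorithm at every SNR from -10 dB to 5 dB. Subjective evaluation likewise shows that the frequency-masking-based method suppresses background noise well while preserving as much speech information as possible.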

16.
Speech recognition in noise improves with combined acoustic and electric stimulation compared to electric stimulation alone [Kong et al., J. Acoust. Soc. Am. 117, 1351-1361 (2005)]. Here the contribution of fundamental frequency (F0) and low-frequency phonetic cues to speech recognition in combined hearing was investigated. Normal-hearing listeners heard vocoded speech in one ear and low-pass (LP) filtered speech in the other. Three listening conditions (vocode-alone, LP-alone, combined) were investigated. Target speech (average F0=120 Hz) was mixed with a time-reversed masker (average F0=172 Hz) at three signal-to-noise ratios (SNRs). LP speech aided performance at all SNRs. Low-frequency phonetic cues were then removed by replacing the LP speech with a LP equal-amplitude harmonic complex, frequency and amplitude modulated by the F0 and temporal envelope of voiced segments of the target. The combined hearing advantage disappeared at 10 and 15 dB SNR, but persisted at 5 dB SNR. A similar finding occurred when, additionally, F0 contour cues were removed. These results are consistent with a role for low-frequency phonetic cues, but not with a combination of F0 information between the two ears. The enhanced performance at 5 dB SNR with F0 contour cues absent suggests that voicing or glimpsing cues may be responsible for the combined hearing benefit.

17.
The purpose of this study was to determine the influence of hearing protection devices (HPDs) on the understanding of speech in young adults with normal hearing, both in a silent situation and in the presence of ambient noise. The experimental research was carried out with the following variables: five different conditions of HPD use (without protectors, with two earplugs and with two earmuffs); a type of noise (pink noise); 4 test levels (60, 70, 80 and 90 dB[A]); 6 signal/noise ratios (without noise, +5, +10, zero, −5 and −10 dB); 5 repetitions for each case, totalling 600 tests with 10 monosyllables in each one. The measured variable was the percentage of correctly heard monosyllabic words in the test. The results revealed that, at the lowest levels (60 and 70 dB), the protectors reduced the intelligibility of speech (compared to the tests without protectors) while, in the presence of ambient noise levels of 80 and 90 dB and unfavourable signal/noise ratios (0, −5 and −10 dB), the HPDs improved the intelligibility. A comparison of the effectiveness of earplugs versus earmuffs showed that the former offer greater efficiency with respect to the recognition of speech, providing a 30% improvement over situations in which no protection is used. As might be expected, this study confirmed that the protectors' influence on speech intelligibility is related directly to the spectral curve of the protector's attenuation.

18.
A Chinese word recognition (CWR) test was conducted with grade 3 and grade 5 children under different conditions of reverberation time (RT), background noise level (BNL) and speech sound pressure level (SSPL) in three primary-school classrooms. The CWR scores and signal-to-noise ratios (SNRs) were obtained at the listening positions. Results show that the CWR score for grade 3 and grade 5 children increases with increasing SSPL, decreasing RT or increasing age, but decreases with increasing BNL under otherwise identical conditions. For a mixed noise of 56 dBA (speech-spectrum-like noise and ambient noise), the CWR scores in the classroom reach a peak at an SNR of 15–20 dBA for a given RT and age. For natural ambient noise, the CWR score gradually increases with increasing SNR. A high SSPL alone does not guarantee good CWR for children in a classroom; CWR also depends on the RT and BNL of the classroom. When the classroom has a long RT or a high BNL, increasing the SSPL does not necessarily achieve better CWR. The novelty of the present study is that it evaluates and confirms these results in real classrooms rather than in simulated rooms in a laboratory.

19.
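To extract acoustic features of abnormal sounds in public places at low SNR, an empirical wavelet filter bank is proposed. First, the center frequency of each filter is obtained from the equivalent rectangular bandwidth (ERB) characteristics of human hearing, and the boundaries of the empirical wavelet filter bank are computed. The boundaries are then substituted into the empirical wavelet detail and scaling functions to form the filter bank. Finally, the filter bank decomposes abnormal sounds recorded in public places at low SNR, and the normalized log energy of each decomposed mode serves as the acoustic feature for classification. Experiments show that, compared with typical speech signal processing and time-frequency signal processing methods, the proposed filter bank raises the average recognition rate for abnormal sounds by 4.75% to 37.92% at low SNR (0 dB) in shop, bank, office, and automated teller machine environments, confirming the effectiveness of the proposed method.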
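
The first step of this abstract, deriving filter center frequencies from the ERB scale and then band boundaries, can be sketched as follows. The sketch assumes the Glasberg and Moore ERB-rate formula, centers equally spaced on that scale, and boundaries at the midpoints between adjacent centers. None of these choices is confirmed by the abstract, and all function names are hypothetical.

```python
import math


def hz_to_erb_rate(f_hz):
    """Glasberg-Moore ERB-rate scale: number of ERBs below frequency f_hz."""
    return 21.4 * math.log10(4.37 * f_hz / 1000.0 + 1.0)


def erb_rate_to_hz(erb_rate):
    """Inverse of hz_to_erb_rate."""
    return (10.0 ** (erb_rate / 21.4) - 1.0) * 1000.0 / 4.37


def erb_center_frequencies(f_low, f_high, num_bands):
    """Center frequencies equally spaced on the ERB-rate scale (assumed spacing)."""
    if num_bands < 2:
        raise ValueError("num_bands must be at least 2")
    e_low, e_high = hz_to_erb_rate(f_low), hz_to_erb_rate(f_high)
    step = (e_high - e_low) / (num_bands - 1)
    return [erb_rate_to_hz(e_low + k * step) for k in range(num_bands)]


def band_boundaries(centers):
    """Assumed boundary rule: midpoint between each pair of adjacent centers."""
    return [(a + b) / 2.0 for a, b in zip(centers, centers[1:])]


if __name__ == "__main__":
    centers = erb_center_frequencies(100.0, 4000.0, 8)
    print([round(c, 1) for c in centers])
    print([round(b, 1) for b in band_boundaries(centers)])
```

The resulting boundaries would then parameterize the empirical wavelet scaling and detail functions; that construction is not shown here.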

20.
Previous research has demonstrated reduced speech recognition when speech is presented at higher-than-normal levels (e.g., above conversational speech levels), particularly in the presence of speech-shaped background noise. Persons with hearing loss frequently listen to speech-in-noise at these levels through hearing aids, which incorporate multiple-channel, wide dynamic range compression. This study examined the interactive effects of signal-to-noise ratio (SNR), speech presentation level, and compression ratio on consonant recognition in noise. Nine subjects with normal hearing identified CV and VC nonsense syllables in a speech-shaped noise at two SNRs (0 and +6 dB), three presentation levels (65, 80, and 95 dB SPL) and four compression ratios (1:1, 2:1, 4:1, and 6:1). Stimuli were processed through a simulated three-channel, fast-acting, wide dynamic range compression hearing aid. Consonant recognition performance decreased as compression ratio increased and presentation level increased. Interaction effects were noted between SNR and compression ratio, as well as between presentation level and compression ratio. Performance decrements due to increases in compression ratio were larger at the better (+6 dB) SNR and at the lowest (65 dB SPL) presentation level. At higher levels (95 dB SPL), such as those experienced by persons with hearing loss, increasing compression ratio did not significantly affect speech intelligibility.
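
To make the meaning of the compression ratios in this abstract concrete, here is a minimal static input/output sketch. It is not the study's simulated three-channel hearing aid: the 45 dB SPL compression threshold, the single channel, and the absence of gain and time constants are all assumptions, and the function name is hypothetical.

```python
def compressed_level(input_db, ratio, threshold_db=45.0):
    """Static compression curve: above the threshold, every `ratio` dB of
    input change produces 1 dB of output change. Threshold is an assumed value."""
    if ratio < 1.0:
        raise ValueError("compression ratio must be at least 1")
    if input_db <= threshold_db:
        return input_db
    return threshold_db + (input_db - threshold_db) / ratio


if __name__ == "__main__":
    for ratio in (1.0, 2.0, 4.0, 6.0):
        levels = [compressed_level(level, ratio) for level in (65.0, 80.0, 95.0)]
        print(ratio, levels)
```

With these assumed settings, a 30 dB range of input levels (65 to 95 dB SPL) is squeezed into 7.5 dB at 4:1, which illustrates how higher ratios flatten level differences in the signal.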
