1.
2.
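Cognitive psychology has found that when visual and auditory information conflict, cortical potentials are perturbed, which makes it possible to probe the "stimulus-response" mechanism of cognitive conflict control. Visual cognitive-conflict experiments are numerous and have yielded rich results, whereas the corresponding auditory experiments are few and have reached inconsistent conclusions. This study uses conflicting and non-conflicting speech stimuli, analyzes the resulting EEG signals, and proposes a time-domain feature model based on three-stage auditory cognitive control. It investigates a feature extraction method for single-trial EEG data that follows the regularities of cognitive control in the auditory channel when a speech cognitive conflict occurs. According to these regularities, each single-trial EEG sample is divided into three parts. For each stage, the time-domain mean amplitude and the Lempel-Ziv complexity (LZC) are computed, and the features of the three stages are combined to form the feature vector of the auditory cognitive EEG sample. The results show that: (1) the conflict-related mixed EEG components identified earlier, "N1-P2 & N2 & Late-SW", reflect the three stages of auditory cognitive control respectively; (2) a more complete auditory cognitive control process should include time-domain features from three stages: perception stage, 110 to 140 ms; identification stage, 260 to 320 ms; resolution stage, 500 to 700 ms; (3) a feature extraction method for single-trial auditory cognitive control EEG samples is proposed, and combining mean amplitude with LZC yields the best recognition rate (99.33%). The experimental results show that the proposed method can effectively detect auditory cognitive control EEG data and thereby provides an acoustic method for evaluating the cognitive control ability of the human brain.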
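The stage-wise features described in this abstract (mean amplitude plus Lempel-Ziv complexity over the three time windows) can be sketched roughly as follows. This is only an illustration under assumptions that the abstract does not state: the trial is taken to start at stimulus onset, the sampling rate `fs_hz` is assumed to be 1000 Hz, each segment is binarized around its median before the LZ76 phrase count, and the function names are hypothetical.

```python
from statistics import mean, median

# Stage windows in milliseconds, taken from the abstract.
STAGES_MS = {
    "perception": (110, 140),
    "identification": (260, 320),
    "resolution": (500, 700),
}


def lz_complexity(bits):
    """Count LZ76 phrases in a binary string (exhaustive-history parsing)."""
    i, count, n = 0, 0, len(bits)
    while i < n:
        length = 1
        # Grow the phrase while it can still be copied from earlier text.
        while i + length <= n and bits[i:i + length] in bits[:i + length - 1]:
            length += 1
        count += 1
        i += length
    return count


def stage_features(trial, fs_hz=1000):
    """Return [mean amplitude, LZC] for each stage, concatenated.

    Assumption: trial[0] is the sample at stimulus onset.
    """
    features = []
    for start_ms, end_ms in STAGES_MS.values():
        seg = trial[start_ms * fs_hz // 1000:end_ms * fs_hz // 1000]
        threshold = median(seg)  # assumed binarization rule
        bits = "".join("1" if v > threshold else "0" for v in seg)
        features += [mean(seg), lz_complexity(bits)]
    return features
```

The resulting six-element vector could then be passed to any classifier; the abstract does not say which classifier was used, so that step is left out here.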
3.
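Two whispered-speech enhancement algorithms based on asymmetric cost functions are proposed; they treat amplification distortion and attenuation distortion differently during enhancement. The Modified Itakura-Saito (MIS) algorithm penalizes amplification distortion more heavily, whereas the Kullback-Leibler (KL) algorithm penalizes attenuation distortion more heavily. Experimental results show that at low signal-to-noise ratios below -6 dB, whispered speech enhanced by the MIS algorithm is significantly more intelligible than speech enhanced by conventional algorithms, while the KL algorithm achieves an intelligibility improvement similar to that of the minimum mean square error speech enhancement algorithm. This confirms that amplification and attenuation distortions do not affect whispered-speech intelligibility in the same way: at low SNR, larger attenuation distortion helps intelligibility, whereas at high SNR attenuation distortion has little effect on it.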
4.
5.
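To address the difficulty a single beamformer has in deeply suppressing spatially coherent interference, a speech enhancement method is proposed that combines a minimum variance distortionless response (MVDR) beamformer with a symmetric-subarray delay-and-sum beamformer. A beam output ratio factor is defined. Based on how the magnitude of this factor varies between the target-sound region and the interference region, a method is given for adjusting the diagonal loading level of the sample covariance matrix, and the factor is further used in the post-filtering stage for decision-based filtering of spatial interference. The paper also describes how the upper and lower thresholds used in decision filtering are updated in real time. The proposed algorithm further suppresses spatial interference and noise and meets real-time requirements. Experiments on a circular microphone array show that the method outperforms both the classical diagonal loading algorithm and the sample covariance matrix scanning-reconstruction algorithm in output signal-to-interference-plus-noise ratio and in speech quality.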
6.
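This paper applies Kalman filtering, a filtering method for dynamic systems, to enhance speech signals corrupted by additive noise. The additive noise considered includes wideband white noise and colored noise; the noisy speech has a low signal-to-noise ratio, and no reference noise source is available. After enhancement, the signal-to-noise ratio of the speech improves by about 7 to 10 dB.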
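As a rough illustration of the predict/update recursion behind Kalman-filter enhancement, here is a minimal scalar sketch. It assumes a first-order autoregressive signal model with known coefficient `a`, process-noise variance `q`, and white measurement-noise variance `r`; these names and defaults are hypothetical, and the abstract does not give the paper's actual speech model, which is likely of higher order and must also handle colored noise.

```python
def kalman_ar1(noisy, a=0.95, q=0.1, r=1.0, x0=0.0, p0=1.0):
    """Scalar Kalman filter for x[k] = a*x[k-1] + w,  y[k] = x[k] + v.

    q is the variance of w, r the variance of v (both assumed known).
    Returns the filtered estimate of x for every sample of `noisy`.
    """
    x, p = x0, p0
    estimates = []
    for y in noisy:
        # Predict the state and its error variance one step ahead.
        x_pred = a * x
        p_pred = a * a * p + q
        # Update with the new noisy observation.
        gain = p_pred / (p_pred + r)
        x = x_pred + gain * (y - x_pred)
        p = (1.0 - gain) * p_pred
        estimates.append(x)
    return estimates
```

In practice the model parameters would have to be estimated from the noisy speech itself, since no reference noise source is available.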
7.
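To find the best balance between noise suppression and speech distortion, an adaptive β-order Bayesian perceptual estimation speech enhancement algorithm based on the auditory frequency-domain masking effect is proposed, with the aim of improving overall enhancement performance. The algorithm exploits the masking effect of human hearing: the β value of the β-order Bayesian perceptual estimator is adapted according to the computed frequency-domain masking threshold, so that noise is suppressed only to below the masking threshold, more speech information is retained, and speech distortion is reduced. The proposed algorithm was evaluated with both objective and subjective measures and compared with the original SNR-based adaptive β-order Bayesian perceptual estimation algorithm. The results show that, for SNRs from -10 dB to 5 dB, the composite objective scores of the frequency-domain masking method are consistently higher than those of the SNR-based adaptive algorithm. Subjective evaluation likewise shows that the frequency-domain masking method suppresses background noise well while retaining as much speech information as possible.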
8.
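Experimental study on the auditory perception of phase in speech  Total citations: 2, self-citations: 0, citations by others: 2
Human hearing is relatively insensitive to phase in speech signals, so phase distortion is often ignored in speech processing and coding. In fact, once phase distortion reaches a certain level, it noticeably degrades speech quality. To obtain a high-quality vocoder, the phase information of the speech spectral components cannot be ignored. This paper uses subjective listening tests to study how the short-time Fourier transform phase spectrum of speech affects auditory perception. The test results show that: (1) if the original phase information is discarded entirely, the reconstructed speech contains strong noise and sounds very unnatural; (2) discarding the phase information of either the high-frequency band or the low-frequency band produces a perceptible difference; (3) when the phase quantization step is smaller than π/7, the human auditory system cannot distinguish the reconstructed speech from the original speech.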
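To make finding (3) concrete, the sketch below quantizes the phase of a single short-time Fourier coefficient with a step of π/7 while leaving its magnitude untouched. A uniform quantizer is an assumption here; the abstract does not say which quantization scheme was used, and the function name is hypothetical.

```python
import cmath
import math


def quantize_phase(coeff, step=math.pi / 7):
    """Round the phase of a complex STFT coefficient to a multiple of `step`.

    The magnitude is preserved; the phase error is at most step / 2.
    """
    magnitude, phase = cmath.polar(coeff)
    quantized_phase = round(phase / step) * step
    return cmath.rect(magnitude, quantized_phase)
```

Applying this function to every bin of every frame and then inverting the transform would give a reconstructed signal of the kind compared against the original in such listening tests.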
9.
10.
11.
The last decade has seen increasing interest in techniques for the enhancement of digital speech signals. Significant gains have been made in terms of signal-to-noise ratio (SNR) and quality, but few techniques have produced improvements in intelligibility. A method for speech enhancement based on nonlinear expansion of the spectral envelope is presented. The expansion is consistent with both the long-term spectrum of the speech and with the probability that speech is present in a given sample. Objective SNR measures are used to compare this algorithm with the well-known spectral subtraction method, with an alternative expansion scheme, and with limiting SNRs resulting from perfect recovery of the amplitude spectrum. For the purpose of intelligibility assessments, a simplified version of the algorithm has been implemented on a Texas Instruments TMS320-C25 system. Listening trials with this real-time system, conducted using a modified rhyme test, have produced small, but consistent, improvements in articulation scores.
12.
Auditory discontinuities interact with categorization: implications for speech perception  Total citations: 2, self-citations: 0, citations by others: 2
Behavioral experiments with infants, adults, and nonhuman animals converge with neurophysiological findings to suggest that there is a discontinuity in auditory processing of stimulus components differing in onset time by about 20 ms. This discontinuity has been implicated as a basis for boundaries between speech categories distinguished by voice onset time (VOT). Here, it is investigated how this discontinuity interacts with the learning of novel perceptual categories. Adult listeners were trained to categorize nonspeech stimuli that mimicked certain temporal properties of VOT stimuli. One group of listeners learned categories with a boundary coincident with the perceptual discontinuity. Another group learned categories defined such that the perceptual discontinuity fell within a category. Listeners in the latter group required significantly more experience to reach criterion categorization performance. Evidence of interactions between the perceptual discontinuity and the learned categories extended to generalization tests as well. It has been hypothesized that languages make use of perceptual discontinuities to promote distinctiveness among sounds within a language inventory. The present data suggest that discontinuities interact with category learning. As such, "learnability" may play a predictive role in selection of language sound inventories.
13.
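A single-channel speech enhancement method for user-defined (custom) voice wake-up is proposed. The method stores the phoneme information of the keyword in a text encoding matrix in advance and adds an attention-based phoneme bias module to a conventional speech enhancement model. The module uses intermediate features of the enhancement model to retrieve the phoneme information for the current frame from the text encoding matrix and feeds it into the model's subsequent computation, which improves the enhancement of keyword-related phonemes. Experiments in different noise environments show that the method suppresses noise in the keyword segments more effectively. Compared with conventional speech enhancement methods and with other text-dependent speech enhancement methods, the proposed method achieves relative improvements of 14.3% and 7.6%, respectively, in custom voice wake-up performance.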
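The lookup performed by an attention-based phoneme bias module can be pictured with the sketch below, in which a frame-level feature acts as the query and the rows of the text encoding matrix act as both keys and values. Scaled dot-product attention is an assumption here, since the abstract only says "attention mechanism", and the function name, the shared feature dimension, and the absence of learned projections are all simplifications for illustration.

```python
import math


def phoneme_bias(frame_feat, text_matrix):
    """Attend from one frame feature over the phoneme rows of text_matrix.

    Assumes every row has the same length as frame_feat.
    Returns (attention weights, weighted sum of phoneme rows).
    """
    scale = math.sqrt(len(frame_feat))
    scores = [
        sum(f * t for f, t in zip(frame_feat, row)) / scale
        for row in text_matrix
    ]
    # Numerically stable softmax over the phoneme axis.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # The bias vector is the attention-weighted mix of phoneme encodings.
    bias = [
        sum(w * row[d] for w, row in zip(weights, text_matrix))
        for d in range(len(frame_feat))
    ]
    return weights, bias
```

The returned bias vector would then be combined with the frame feature (for example by addition or concatenation) before the remaining layers of the enhancement model; how the paper performs that fusion is not stated in the abstract.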
14.
Auditory enhancement of changes in spectral amplitude  Total citations: 1, self-citations: 0, citations by others: 1
Q Summerfield A Sidwell T Nelson 《The Journal of the Acoustical Society of America》1987,81(3):700-708
An auditory enhancement effect occurs when one component of a harmonic series is omitted for a few hundred milliseconds and then reintroduced: The reintroduced harmonic stands out perceptually. Three experiments are reported that studied a version of this effect in which several components of a harmonic series are enhanced to define the formants of a vowel. Using the accuracy of vowel identification to measure the prominence of the formant peaks in the effective auditory representation, forms of the effect were identified that are qualitatively similar to the incremental and decremental responses seen in primary auditory-nerve fibers. These results are compatible with an origin for the enhancement effect in peripheral auditory adaptation. However, an additional mechanism is required to account for the demonstration [Viemeister and Bacon, J. Acoust. Soc. Am. 71, 1502-1507 (1982)] that enhancement can involve a true gain in the frequency region of the reintroduced component. These effects demonstrate one way in which the auditory system may attenuate the prominence of background noises while preserving the ability to represent changes in spectral amplitude produced by newly arriving signals.
15.
16.
Among the various speech enhancement methods, dual-microphone methods are of great importance because they are inexpensive to implement and exploit the spatial-filtering benefits of microphone arrays. Coherence-based methods are well known as efficient two-microphone noise reduction techniques. These techniques do not work well when the received noise signals are correlated, but they can be improved when the cross power spectral density (CPSD) of the noise is available. In this paper, we propose an iterative approach for estimating the noise CPSD for use in coherence-based methods. We compare the proposed iterative noise CPSD estimation with a noise CPSD estimation technique based on a voice activity detector (VAD), each employed separately in a two-microphone speech enhancement system. Evaluation results show that the two-microphone speech enhancement scheme using the proposed noise CPSD estimation technique outperforms the enhancement system using the VAD-based noise CPSD estimation.
17.
Siren noise often severely degrades the intelligibility of voice communication inside the cabs of police, paramedic, and fire vehicles, and it is desirable to remove this unwanted noise from the speech signal. In this paper, a new method is proposed to adaptively cancel siren noise and enhance speech signals. Based on the characteristics of siren noise, an anti-speech filter and a time delayer are employed in single-channel and dual-channel noise cancellation systems to reduce the siren noise. Experimental results demonstrate the effectiveness of the proposed method in canceling siren noise, and the quality of the enhanced speech signal is satisfactory.
18.
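Speech presence probability estimation is one of the core techniques of speech enhancement. Traditional estimators are heuristic, do not place the estimation within a unified theoretical framework, and cannot guarantee an optimal estimate. To address this, an estimation method based on a sequential hidden Markov model (SHMM) is proposed. An SHMM is built in each subband to describe the time series of the log-power spectral envelope; the envelope sequence is treated as a dynamic first-order Markov chain that switches between a speech state and a noise state, a single Gaussian function models the probability of each state, and the posterior probability of the speech state is taken as the speech presence probability. To meet real-time requirements, SHMM parameter estimation is simplified to a first-order recursive process, and the model parameters are updated frame by frame under the maximum-likelihood criterion. Experiments show that the temporal correlation captured by the SHMM plays a key role in estimating the presence probability and that the method outperforms general heuristic estimators; the segmental signal-to-noise ratio (SegSNR) and log-spectral distortion (LSD) of SHMM-based speech enhancement are better than those of the classical improved minima controlled recursive averaging (IMCRA) algorithm.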
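The core idea (a two-state Markov chain with one Gaussian per state, whose speech-state posterior serves as the presence probability) can be sketched for a single subband as below. This is a simplified forward filter with fixed, hand-picked parameters; the frame-by-frame maximum-likelihood parameter update described in the abstract is omitted, and all parameter names and default values are assumptions made for illustration.

```python
import math


def gauss(x, mu, sigma):
    """Gaussian probability density."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def speech_presence(log_power, mu_n=0.0, mu_s=10.0, sigma_n=2.0,
                    sigma_s=4.0, stay=0.9, prior=0.5):
    """Frame-wise posterior of the speech state in one subband.

    log_power: log-power envelope values, one per frame.
    stay: assumed probability of remaining in the same state.
    """
    p_speech = prior
    posteriors = []
    for y in log_power:
        # Markov prediction: probability of speech before seeing frame y.
        pred = stay * p_speech + (1.0 - stay) * (1.0 - p_speech)
        # Bayes update with one Gaussian per state.
        like_s = gauss(y, mu_s, sigma_s) * pred
        like_n = gauss(y, mu_n, sigma_n) * (1.0 - pred)
        p_speech = like_s / (like_s + like_n)
        posteriors.append(p_speech)
    return posteriors
```

Because the prediction step carries the previous posterior forward, an isolated ambiguous frame is pulled toward the state of its neighbors, which is the kind of temporal correlation the abstract credits for the improvement over heuristic estimators.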
19.
In this paper, we propose two whispered-speech enhancement methods based on asymmetric cost functions, which treat the amplification and attenuation distortions of whispered speech differently. The modified Itakura-Saito (MIS) distance function penalizes speech amplification distortion more heavily, whereas the Kullback-Leibler (KL) divergence function penalizes speech attenuation distortion more heavily. The experimental results show that the MIS-based method achieves a significant improvement in intelligibility over conventional speech enhancement algorithms when the signal-to-noise ratio (SNR) falls below -6 dB, whereas the KL-based method achieves results similar to those of the minimum mean square error (MMSE) speech enhancement method. The results show that amplification and attenuation distortions affect the intelligibility of enhanced whispered speech differently: larger attenuation distortion may improve intelligibility at low SNR, whereas attenuation distortion has little effect on intelligibility at high SNR.
20.
This paper addresses the problem of speech quality improvement using adaptive filtering algorithms. Recently, in Djendi and Bendoumia (2014) [1], we proposed a new two-channel backward algorithm for noise reduction and speech intelligibility enhancement. The main drawback of the proposed two-channel subband algorithm is its poor performance when the number of subbands is high, which is clearly visible in the steady-state values. The source of this problem is the fixed step sizes of the cross-adaptive filtering algorithms, which distort the speech signal when they are large and slow the convergence when they are small. In this paper, we propose four modifications of this algorithm that improve both the convergence speed and the steady-state values, even in very noisy conditions and with a high number of subbands. To confirm the good performance of the four proposed variable-step-size SBBSS algorithms, we carried out several simulations in various noisy environments. In these simulations, we evaluated objective and subjective criteria, namely the system mismatch, the cepstral distance, the output signal-to-noise ratio, and the mean opinion score (MOS), to compare the four proposed variable-step-size versions of the SBBSS algorithm with their original versions and with the two-channel fullband backward (2CFB) least mean square algorithm.