Similar literature
 19 similar articles found (search time: 122 ms)
1.
戴明扬  徐柏龄 《应用声学》2001,20(6):6-12,44
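Based on a model of human hearing, this paper proposes a robust method for extracting speaker features. A Gammatone auditory filterbank and the Meddis inner hair cell firing model first produce an auditory correlogram that represents auditory nerve activity. Exploiting the phase-locking and two-tone suppression properties of auditory nerve firing, the frequency component with the largest amplitude in each band of the correlogram is taken as the feature parameter for that band, and the parameters of all bands together form the feature vector of the current speech segment. A DCT is then applied to remove the correlation among the feature parameters and reduce the dimensionality of the feature vector. Validity tests show that the feature vector largely reflects the spectral envelope of the input speech. Noise-robustness experiments show that under Gaussian white noise and car noise the features suffer smaller relative distortion than LPCC and MFCC. Text-independent speaker identification based on vector quantization shows that, for three types of noise interference, the features give good recognition results at low signal-to-noise ratios.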
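
The DCT step described in this abstract can be illustrated with a minimal sketch. It assumes the per-band feature parameters are already available as a list of floats; the function names `dct_ii` and `compress_features` and the orthonormal scaling are illustrative assumptions, not details taken from the paper.

```python
import math

def dct_ii(x):
    """Orthonormal DCT-II of a sequence of floats."""
    n = len(x)
    out = []
    for k in range(n):
        # Project x onto the k-th cosine basis function.
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def compress_features(band_features, keep):
    """Decorrelate per-band features with a DCT and keep the first `keep` coefficients.

    `keep` is a hypothetical parameter; the abstract does not state the final dimension.
    """
    return dct_ii(band_features)[:keep]
```

With this scaling the transform preserves energy, so discarding the trailing coefficients removes only the energy they carry.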

2.
杜衣杭  方卫宁 《声学学报》2019,44(5):945-950
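Auditory training can improve speech recognition performance in noisy environments. An auditory tracking task using a stable sound source as the stimulus was designed. After 20 training units, the method was validated with a speech recognition test under speech-type noise masking, built as a 3×5 design from two factors: type of interfering speech and signal-to-noise ratio. The training group achieved a significantly higher speech recognition rate than the control group, showing that auditory attention can be improved by training on a sound source tracking task. The results also indicate that sound source tracking training stabilizes listeners' auditory attention under speech-type noise masking.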

3.
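Auditory perceptual robust principal component analysis for unsupervised speech denoising   Total citations: 1, self-citations: 0, citations by others: 1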
闵刚  邹霞  韩伟  张雄伟  谭薇 《声学学报》2017,42(2):246-256
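Existing sparse low-rank decomposition methods for speech denoising make insufficient use of human auditory perception, so speech distortion is easily perceived. To address this, an auditory perceptual robust principal component analysis method for speech denoising is proposed. Because the cochlear basilar membrane perceives frequency nonlinearly, the method uses the cochleagram as the basis for separating speech from noise. In addition, the Itakura-Saito distance measure, which matches human auditory perception, is chosen as the optimization objective; a non-negativity constraint is introduced into the sparse low-rank modeling so that the decomposed components are physically meaningful; and an iterative optimization algorithm with closed-form updates is derived within the alternating direction method of multipliers (ADMM) framework. The method is fully unsupervised for speech denoising and needs no pre-trained speech or noise model. Simulations with several noise types and signal-to-noise ratios verify its effectiveness: noise suppression is more pronounced than with current comparable algorithms, and the intelligibility and overall quality of the denoised speech are improved or at least comparable.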
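
For orientation, standard robust PCA splits a matrix into a low-rank part and a sparse part, and ADMM solvers for it typically update the sparse part with an element-wise soft-thresholding (shrinkage) step. The sketch below shows that generic operator and a non-negative variant. It is not the paper's algorithm: the abstract does not give the closed-form updates derived for the Itakura-Saito objective, and the names `soft_threshold`, `soft_threshold_nonneg` and `shrink_matrix` are hypothetical.

```python
def soft_threshold(x, tau):
    """Proximal operator of tau * |x|: shrink x toward zero by tau."""
    if x > tau:
        return x - tau
    if x < -tau:
        return x + tau
    return 0.0

def soft_threshold_nonneg(x, tau):
    """Proximal operator of tau * |x| restricted to x >= 0."""
    return max(x - tau, 0.0)

def shrink_matrix(matrix, tau, nonneg=False):
    """Apply the chosen shrinkage operator to every entry of a list-of-lists matrix."""
    op = soft_threshold_nonneg if nonneg else soft_threshold
    return [[op(v, tau) for v in row] for row in matrix]
```

Entries whose magnitude is below `tau` become exactly zero, which is what makes the recovered component sparse.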

4.
Initial-final segmentation of whispered speech based on an auditory model   Total citations: 5, self-citations: 0, citations by others: 5
丁慧  栗学丽  徐柏龄 《应用声学》2004,23(2):20-25,44
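This paper analyzes the characteristics of whispered speech and, drawing on basic theory and experimental data from physiological and psychological acoustics, proposes a method that uses an auditory model to segment whispered speech into initials and finals. The auditory perception model suited to this task is organized into four levels: the frequency decomposition mechanism of the cochlea; the nonlinear transformations of the auditory system in the time domain and in the frequency domain; and the lateral inhibition mechanism of the central nervous system. The model reflects how humans perceive low-energy speech in noisy environments and is therefore suited to whispered speech recognition; it gave satisfactory results in experiments on segmenting whispered speech into initials and finals.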

5.
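Subjective evaluation and analysis of the auditory attributes of underwater noise   Total citations: 3, self-citations: 0, citations by others: 3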
王娜  陈克安  黄凰 《物理学报》2009,58(10):7330-7338
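To identify the acoustic factors by which human listeners perceive underwater target types, the number of dimensions of the auditory attribute space of underwater noise and the physical interpretation of each dimension were studied. First, the Chinese descriptors used to evaluate the auditory attributes of underwater noise were determined through lexical cluster analysis and a questionnaire survey. Subjective evaluation experiments based on the paired comparison method and the semantic differential method were then carried out, yielding a dissimilarity matrix of the auditory attributes and subjective ratings of each sample under the different attributes. Finally, multidimensional scaling showed that the auditory attribute space of underwater noise has five dimensions; principal component analysis gave five independent principal components; correlation coefficients and stress values were used to establish that the five principal components represent the five dimensions of the space; and each component was interpreted physically according to the auditory attributes reflected by its Chinese descriptors. Keywords: auditory attributes; multidimensional scaling; principal component analysis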

6.
谢菠荪  孟庆林 《应用声学》2018,37(5):607-613
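Spatial hearing is the subjective perception of the spatial attributes or characteristics of sound, including localization of sound sources and the subjective perception of environmental reflections. Speech acquisition in complex acoustic environments is also closely related to spatial hearing. Hearing impairment usually involves reduced or even absent spatial hearing, which impairs the ability to acquire speech. Artificial hearing is a means of treating hearing impairment and should ideally restore or improve the patient's spatial hearing. This paper reviews research on spatial hearing in hearing-impaired patients and its artificial restoration, including progress and open problems.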

7.
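To improve speech intelligibility for hearing-impaired patients in complex scenes, this paper proposes a binaural sound source localization algorithm for hearing aids that mimics human hearing. Drawing on the frequency-division property of the cochlea and on auditory masking, the algorithm first decomposes the sound signal into multiple channels and extracts the signals in the frequency bands to which the ear is most sensitive to estimate the interaural time difference (ITD). It then extracts valid ITD information based on the Haas effect, and finally uses a head-related model to convert the ITD into source direction. To improve localization under reverberation and multiple interfering sources, a multi-channel weighted combination strategy is also proposed. Simulations and field tests show that the algorithm is robust to interference and localizes accurately. In intelligibility tests with 7 subjects, a speech enhancement algorithm combined with the localization algorithm improved performance by 3 to 5 dB over existing hearing-aid enhancement algorithms.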
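
The ITD estimation and ITD-to-direction steps can be sketched as follows. This is a generic illustration, not the paper's algorithm: it estimates the ITD from the peak of a plain cross-correlation and converts it with a simple sine-law head model. The ear distance of 0.18 m, the speed of sound of 343 m/s and the function names are all assumptions; the abstract does not specify its head-related model.

```python
import math

def estimate_itd(left, right, fs, max_lag):
    """Return the delay (in seconds) of `right` relative to `left`.

    The delay is the lag, within +/- max_lag samples, that maximizes the
    cross-correlation. A positive value means the right channel lags,
    i.e. the source is nearer the left ear.
    """
    best_lag, best_val = 0, float("-inf")
    n = len(left)
    for lag in range(-max_lag, max_lag + 1):
        val = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                val += left[i] * right[j]
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag / fs

def itd_to_azimuth_deg(itd, ear_distance=0.18, speed_of_sound=343.0):
    """Sine-law model (assumed): itd = ear_distance * sin(azimuth) / speed_of_sound."""
    s = max(-1.0, min(1.0, itd * speed_of_sound / ear_distance))
    return math.degrees(math.asin(s))
```

In a multi-channel scheme like the one the abstract describes, an estimate of this kind would be computed per frequency band and the per-band results combined with weights.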

8.
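Noise quality evaluation based on masking properties   Total citations: 1, self-citations: 0, citations by others: 1
Based on the spectral characteristics of broadband noise and the masking properties of human hearing, an equivalence principle for noise masking and a new method for calculating an annoyance index are proposed. After time-frequency analysis combined with statistical regularities identifies the annoying frequency components of a noise, the annoyance index measures how annoying each component is, which yields a noise quality evaluation model. Experimental results show that the masking-based model identifies the annoying frequency components of a noise quickly and accurately and agrees well with subjective identification by listeners.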

9.
梁瑞宇  奚吉  赵力  邹采荣  黄程韦 《物理学报》2012,61(13):134305-134305
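Frequency-lowering hearing-aid algorithms are the safest and most effective way to improve sound recognition in hearing-impaired patients. Using subjective tests, this paper analyzes the shortcomings of current algorithms in sound recognition and proposes an adaptive slow-playback frequency-lowering algorithm. The algorithm combines the advantages of slow playback and frequency shifting, and adjusts the slow-playback factor adaptively according to the spectral structure of the signal to reduce temporal asynchrony. In addition, by analyzing the spectral relationship between the noisy signal and the noise, a method for estimating the slow-playback factor in noise is proposed. Experiments show that the algorithm improves the recognition rate by 15% to 20% compared with other frequency-lowering algorithms. In tests with hearing-impaired patients, the average recognition rate also improved significantly compared with conventional hearing aids.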

10.
李薇  孟子厚 《应用声学》2014,33(1):45-52
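A set of pure-tone discrimination training experiments under noise masking was designed to investigate how listening training affects listeners' ability to discriminate a target signal against background noise. The results show that listening training improves target signal identification; the curve of the listeners' target identification ability over training time was analyzed statistically. The paper also examines how listeners' discrimination learning varies with signal-to-noise ratio, and the differences in discrimination ability between individuals before and after training.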

11.
Auditory functional magnetic resonance imaging (fMRI) requires quantification of sound stimuli in the magnetic environment and adequate isolation of background noise. We report the development of two novel sound measurement systems that accurately measure the sound intensity inside the ear and simultaneously provide a similar or greater amount of scanner-noise protection than earmuffs. First, we placed a 2.6 x 2.6-mm microphone in an insert phone that was connected to a headphone [microphone-integrated, foam-tipped insert-phone with a headphone (MIHP)]. This attenuated scanner noise by 37.8±4.6 dB, a level better than the reference amount obtained using earmuffs. Second, a nonmetallic optical microphone was integrated with a headphone [optical microphone in a headphone (OMHP)]; it effectively detected the change of sound intensity caused by variable compression on the cushions of the headphone. Wearing the OMHP reduced the noise by 28.5±5.9 dB and did not affect echoplanar magnetic resonance images. We also performed an auditory fMRI study using the MIHP system and found an increase in auditory cortical activation following a 10-dB increment in the intensity of sound stimulation. These two newly developed sound measurement systems successfully achieved accurate quantification of sound stimuli while maintaining a level of noise protection similar to that of wearing earmuffs in the auditory fMRI experiment.

12.
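Semantic description of noise based on auditory perception is a fundamental problem in noise sound quality research, but existing studies have not linked semantic descriptions to physical information such as noise source, spectral characteristics and product operating state. Subjective evaluation experiments were carried out on three typical groups of noise: aircraft cabin noise, vehicle noise and air purifier noise. The semantic spaces of the three groups were described by multidimensional scaling and principal component analysis, the descriptors of the different noise types were analyzed systematically, and the links between descriptors and the physical attributes of the noise were explained. The study found that aircraft cabin noise, vehicle noise and air purifier noise can be described by 4-, 4- and 3-dimensional semantic spaces, respectively. The semantic descriptions of different noise types have both common and type-specific features: the main dimensions for all three groups are related to noisiness, while type-specific descriptors are closely tied to the physical attributes of the source. Sound quality modeling and its applications should consider the effect of both common and type-specific descriptors on auditory perception and adopt targeted measures to improve product sound quality. By describing and analyzing noise characteristics semantically from the standpoint of auditory perception, the study can support research on product sound quality and noise control.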

13.
Performance on 19 auditory discrimination and identification tasks was measured for 340 listeners with normal hearing. Test stimuli included single tones, sequences of tones, amplitude-modulated and rippled noise, temporal gaps, speech, and environmental sounds. Principal components analysis and structural equation modeling of the data support the existence of a general auditory ability and four specific auditory abilities. The specific abilities are (1) loudness and duration (overall energy) discrimination; (2) sensitivity to temporal envelope variation; (3) identification of highly familiar sounds (speech and nonspeech); and (4) discrimination of unfamiliar simple and complex spectral and temporal patterns. Examination of Scholastic Aptitude Test (SAT) scores for a large subset of the population revealed little or no association between general or specific auditory abilities and general intellectual ability. The findings provide a basis for research to further specify the nature of the auditory abilities. Of particular interest are results suggestive of a familiar sound recognition (FSR) ability, apparently specialized for sound recognition on the basis of limited or distorted information. This FSR ability is independent of normal variation in both spectral-temporal acuity and general intellectual ability.

14.
The spectral envelope is a major determinant of the perceptual identity of many classes of sound including speech. When sounds are transmitted from the source to the listener, the spectral envelope is invariably and diversely distorted, by factors such as room reverberation. Perceptual compensation for spectral-envelope distortion was investigated here. Carrier sounds were distorted by spectral envelope difference filters whose frequency response is the spectral envelope of one vowel minus the spectral envelope of another. The filter /I/ minus /e/ and its inverse were used. Subjects identified a test sound that followed the carrier. The test sound was drawn from an /Itch/ to /etch/ continuum. Perceptual compensation produces a phoneme boundary difference between /I/ minus /e/ and its inverse. Carriers were the phrase "the next word is" spoken by the same (male) speaker as the test sounds, signal-correlated noise derived from this phrase, the same phrase spoken by a female speaker, male and female versions played backwards, and a repeated end-point vowel. The carrier and test were presented to the same ear, to different ears, and from different apparent directions (by varying interaural time delay). The results show that compensation is unlike peripheral phenomena, such as adaptation, and unlike phonetic perceptual phenomena. The evidence favors a central, auditory mechanism.

15.
The spectral properties of a complex stimulus (rippled noise) were varied over time, and listeners were asked to discriminate between this stimulus and a flat-spectrum, stationary noise. The spacing between the spectral peaks of rippled noise was changed sinusoidally as a function of time, or the location of the spectral peaks of rippled noise was moved up and down the spectrum as a sinusoidal function of time. In most conditions, listeners were able to make the discriminations up to rates of temporal modulation of 5-10 cycles per second. Beyond 5-10 cps the rippled noise with the temporally varying peaks was indiscriminable from a flat (nonrippled) noise. The results suggest that for temporal changes in the spectral peaks of rippled noise, listeners cannot monitor the output of a single (or small number of) auditory channel(s) (critical bands), or that the mechanism used to extract the perceptual information from these stimuli is slow. Temporal variations in the spectral properties of rippled noise may relate to temporal changes in the repetition pitch of complex sounds, the temporal properties of the coloration added to sound in a reverberant environment, and the nature of spectral peak changes such as those that occur in speech-formant transitions. The results are relevant to the general issue of the auditory system's ability to extract information from a complex spectral profile.
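
Rippled noise is commonly produced by adding a delayed copy of a noise to itself, which places spectral peaks at integer multiples of the reciprocal of the delay. The abstract does not state how its stimuli were generated, so the sketch below only illustrates that delay-and-add construction and its power response; `delay_and_add` and `ripple_power_gain` are hypothetical helper names.

```python
import math

def delay_and_add(x, delay, gain=1.0):
    """Return y[n] = x[n] + gain * x[n - delay]; samples before the start count as 0."""
    return [x[n] + (gain * x[n - delay] if n >= delay else 0.0) for n in range(len(x))]

def ripple_power_gain(freq_hz, delay_s, gain=1.0):
    """Power response |H(f)|^2 of the delay-and-add network at frequency freq_hz."""
    return 1.0 + gain ** 2 + 2.0 * gain * math.cos(2.0 * math.pi * freq_hz * delay_s)
```

With a 2 ms delay and unit gain, for example, the response peaks at multiples of 500 Hz and has nulls midway between them; varying the delay over time changes the peak spacing.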

16.
When a test sound consisting of pure tones with equal intensities is preceded by a precursor sound identical to the test sound except for a reduction in the intensity of one tone, an auditory "enhancement" phenomenon occurs: In the test sound, the tone which was previously softer stands out perceptually. Here, enhancement was investigated using inharmonic sounds made up of five pure tones well resolved in the auditory periphery. It was found that enhancement can be elicited not only by increases in intensity but also by shifts in frequency. In both cases, when the precursor and test sounds are separated by a 500-ms delay, inserting a burst of pink noise during the delay has little effect on enhancement. Presenting the precursor and test sounds to opposite ears rather than to the same ear significantly reduces the enhancement resulting from increases in intensity, but not the enhancement resulting from shifts in frequency. This difference suggests that the mechanisms of enhancement are not identical for the two types of change. For frequency shifts, enhancement may be partly based on the existence of automatic "frequency-shift detectors" [Demany and Ramos, J. Acoust. Soc. Am. 117, 833-841 (2005)].

17.
In a test sound consisting of a burst of pink noise, an arbitrarily selected target frequency band can be "enhanced" by the previous presentation of a similar noise with a spectral notch in the target frequency region. As a result of the enhancement, the test sound evokes a pitch sensation corresponding to the pitch of the target band. Here, a pitch comparison task was used to assess enhancement. In the first experiment, a stronger enhancement effect was found when the test sound and its precursor had the same interaural time difference (ITD) than when they had opposite ITDs. Two subsequent experiments were concerned with the audibility of an instance of dichotic pitch in binaural test sounds preceded by precursors. They showed that it is possible to enhance a frequency region on the sole basis of ITD manipulations, using spectrally identical test sounds and precursors. However, the observed effects were small. A major goal of this study was to test the hypothesis that enhancement originates at least in part from neural adaptation processes taking place at a central level of the auditory system. The data failed to provide strong support for this hypothesis.

18.
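The mean-square error (MSE) function is the most commonly used cost function in deep-learning single-channel speech enhancement. However, the MSE value is not fully correlated with speech quality. To improve performance, this paper introduces two classes of cost functions related to human hearing into deep neural network training. The first class is the weighted Euclidean distance cost function, which accounts for auditory masking. The second class comprises the Itakura-Saito cost function, the COSH cost function and the weighted likelihood ratio cost function, which emphasize the importance of speech spectral peaks and focus on recovering the spectral peak information of clean speech. Using a long short-term memory network, the two classes are analyzed and compared in deep-learning single-channel speech enhancement, and compared with the MSE cost function. Experiments show that the deep neural network single-channel speech enhancement algorithm with the weighted Euclidean distance cost function achieves better speech quality and lower residual noise.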
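
Several of the cost functions named in this abstract have standard textbook forms, sketched below for a single frame of spectral values. This is an illustration under assumptions rather than the paper's implementation: the perceptual weights are left as a caller-supplied hypothetical input, the distances are averaged over frequency bins, the argument order (clean first, estimate second) is a chosen convention, and the weighted likelihood ratio is omitted.

```python
import math

def mse(clean, est):
    """Mean-square error between clean and estimated spectra."""
    return sum((c - e) ** 2 for c, e in zip(clean, est)) / len(clean)

def weighted_euclidean(clean, est, weights):
    """Squared error with one weight per bin (the weights here are hypothetical)."""
    return sum(w * (c - e) ** 2 for c, e, w in zip(clean, est, weights)) / len(clean)

def itakura_saito(clean_pow, est_pow):
    """Itakura-Saito distance between strictly positive power spectra."""
    total = 0.0
    for p, q in zip(clean_pow, est_pow):
        r = p / q
        total += r - math.log(r) - 1.0
    return total / len(clean_pow)

def cosh_distance(clean_pow, est_pow):
    """COSH distance: the symmetrized Itakura-Saito distance."""
    return 0.5 * (itakura_saito(clean_pow, est_pow) + itakura_saito(est_pow, clean_pow))
```

The Itakura-Saito distance is asymmetric and depends only on the ratio of the two spectra in each bin, whereas the COSH distance treats over- and under-estimation equally.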

19.
The prevalence of noise in the riding of motorcycles has been a source of concern to both riders and researchers in recent times. Detailed flow field information will allow insight into the flow mechanisms responsible for the production of sound within motorcycle helmets. Flow field surveys of this nature are not found in the available literature, which has tended to focus on sound pressure levels at the ear as these are of interest for noise exposure legislation. A detailed flow survey of a commercial motorcycle helmet has been carried out in combination with surface pressure measurements and at-ear acoustics. Three potential noise source regions are investigated, namely, the helmet wake, the surface boundary layer and the cavity under the helmet at the chin bar. Extensive information is provided on the structure of the helmet wake including its frequency content. While the wake and boundary layer flows showed negligible contributions to at-ear sound, the cavity region around the chin bar was identified as a key noise source. The contribution of the cavity region was investigated as a function of flow speed and helmet angle, both of which are shown to be key factors governing the sound produced by this region.
