Similar Articles
20 similar articles found (search time: 234 ms)
1.
This study systematically investigated place-pitch perception under electrical stimulation in cochlear implant (CI) users and comprehensively examined its relationship with speech recognition and music perception. Four postlingually deafened adult CI users participated. Place-pitch perception was measured with an electrode pitch-ranking test; speech recognition and music perception were assessed with a speech test and a musical pitch-discrimination test, respectively. The results showed that as the stimulation site moved from the apex toward the base of the cochlea, all subjects perceived a pitch change from "low" to "high", though with large individual differences. Subjects' speech recognition scores correlated with their place-pitch perception ability, but a ceiling effect obscured the correspondence. Musical pitch-discrimination scores corresponded well with place-pitch perception ability. The results indicate that the acoustic features delivered by current CI sound-coding strategies already support good speech recognition, and that speech recognition in quiet places only modest demands on place-pitch perception. Current sound-coding strategies, however, do not encode music effectively; when CI users listen to complex signals such as music, their place-pitch perception ability largely determines their listening performance.

2.
孟庆林  原猛  夏洋  冯海泓 《声学学报》2015,40(2):300-306
The influence of amplitude modulation (AM) information on human musical-instrument identification was studied through automatic instrument-identification experiments. The procedure was as follows: based on an auditory model, AM information was extracted from several frequency bands of instrument sounds; statistical features were then computed from the AM information, and machine identification was performed with a pairwise support vector machine. Five numbers of frequency bands (2, 4, 8, 16, and 32) and four AM computation methods were used. The results show that increasing the number of bands improves identification, but performance levels off between 16 and 32 bands; the AM extraction method also affects the results, with the analytic-signal method providing the best instrument identification. Analysis shows that the automatic results exceed human identification performance based on similar AM information. The system provides a computational model of instrument identification for cochlear implants and vocoder simulations, and offers a reference for CI instrument-identification experiments and training.
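The analytic-signal method named above can be sketched with a numpy-only frequency-domain Hilbert transform. This is a hedged illustration of the general technique; the paper's auditory-model filterbank and statistical features are not reproduced, and the test signal is hypothetical:

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal via the FFT: zero the negative frequencies
    and double the positive ones (frequency-domain Hilbert method)."""
    n = len(x)
    X = np.fft.fft(x)
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(X * h)

def am_envelope(x):
    """AM information as the magnitude of the analytic signal."""
    return np.abs(analytic_signal(x))

# Example: a 50 Hz carrier modulated at 5 Hz; for this band-limited
# case the analytic-signal method recovers the modulator exactly.
fs = 1000
t = np.arange(fs) / fs
env_true = 1 + 0.5 * np.cos(2 * np.pi * 5 * t)
x = env_true * np.sin(2 * np.pi * 50 * t)
env = am_envelope(x)
```

In a multi-band system this extraction would be applied per channel after band-pass filtering, with statistics of `env` serving as classifier features.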

3.
余紫莹  许勇  杨军  沈钧贤 《应用声学》2013,32(6):501-507
Bone-conducted ultrasonic hearing is a special phenomenon in which ultrasonic vibration transmitted through the skull produces auditory perception. This paper first reviews the state of research on bone-conducted ultrasonic hearing at home and abroad, then designs a bone-conducted ultrasonic hearing test system and implements its software and hardware platforms. Using this system, subjective tests were conducted on normal-hearing and hearing-impaired subjects, examining the perception of single-frequency ultrasound as well as of single-frequency audible sounds and Mandarin speech modulated onto an ultrasonic carrier. The subjective percepts under several different modulation schemes were compared, and the influence of air-conducted sound on the experimental results was analyzed.

4.
Using glass microelectrodes, this experiment recorded changes in the AC receptor potential (cochlear microphonic, CM) in the scala media of the third cochlear turn of guinea pigs under single-tone and two-tone stimulation, before and after white-noise exposure; the degree of acoustic trauma was monitored by CAP thresholds recorded from a round-window electrode. Comparison of CM amplitudes evoked by single tones and tone pairs showed that two-tone suppression is closely tied to cochlear nonlinearity, and that both depend on a normal cochlear physiological environment; the bidirectional transduction of outer hair cells is a prerequisite for cochlear nonlinear mechanics and for two-tone suppression. The experiment also ruled out the middle ear and stimulus harmonics as sources of the observed two-tone suppression, and describes the computer-controlled stimulation, recording, and analysis system used, including the measurement of individual frequency components by fast Fourier transform (FFT).

5.
Application of auditory-model output spectral features to acoustic target recognition   (Cited by: 6; self-citations: 0, other citations: 6)
马元锋  陈克安  王娜  郑文 《声学学报》2009,34(2):142-150
Using the CcGC filterbank model, which simulates acoustic signal processing in the human ear, this study examined the advantages of auditory features over conventional features for acoustic target recognition. The results show that as the signal-to-noise ratio decreases, auditory features gradually outperform conventional ones, reflecting the auditory system's excellent noise robustness. Starting from four main properties of the auditory system captured by the CcGC filterbank model, the paper then investigates through simulation the mechanism by which the cochlea suppresses noise; the results show that the critical-band partitioning and nonlinear compression of the auditory system play key roles in cochlear noise suppression.

6.
Cochlear implantation has shown that electrical stimulation of the auditory nerve can restore some hearing sensation to the profoundly deaf. Earlier implants gave patients only rudimentary auditory ability, useful for general sound awareness and fragmentary speech cues. Results obtained with new multi-electrode implants show that some patients can achieve high-level speech recognition from the implant's sound input alone. A single-electrode cochlear implant delivers the processed sound waveform to an electrode placed in the cochlea or at the round-window niche; such an implant has no spatial selectivity over the stimulated nerve, so all nerve cells are excited simultaneously. With a single electrode, patients… (abstract truncated in source)

7.
梁雍  陈克安 《声学学报》2018,43(4):708-718
For the task of fine-grained identification of sound-source material types at low signal-to-noise ratios (SNRs), sparse representation was applied to source-type identification of impact-sound signals; the extracted sparse features improved recognition performance over conventional MFCC features. Using three predefined dictionaries and a dictionary learned from training signals, orthogonal matching pursuit (OMP) was used to sparsely represent recorded impact sounds, and the resulting sparse features were used for source identification at various SNRs, in comparison with MFCC features. Classification results on an impact-sound database covering 12 material types show that in almost all conditions the sparse features achieve better recognition than MFCC features; in particular, the sparse features are more noise-robust at low SNRs.
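The OMP step at the core of the sparse-feature extraction above can be sketched in a few lines of numpy. This is a minimal illustration with a random toy dictionary; the paper's predefined and learned dictionaries and the downstream classifier are not reproduced:

```python
import numpy as np

def omp(D, x, n_nonzero):
    """Orthogonal Matching Pursuit: greedily pick the atom most
    correlated with the residual, then re-fit all selected atoms
    to the signal by least squares."""
    idx = []
    coef = np.zeros(0)
    residual = x.copy()
    for _ in range(n_nonzero):
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k not in idx:
            idx.append(k)
        coef, *_ = np.linalg.lstsq(D[:, idx], x, rcond=None)
        residual = x - D[:, idx] @ coef
    s = np.zeros(D.shape[1])
    s[idx] = coef
    return s

# Toy example: recover a 3-sparse code over a random unit-norm dictionary.
rng = np.random.default_rng(0)
D = rng.standard_normal((64, 128))
D /= np.linalg.norm(D, axis=0)          # unit-norm atoms
s_true = np.zeros(128)
s_true[[5, 40, 99]] = [1.0, -2.0, 1.5]
x = D @ s_true
s_hat = omp(D, x, n_nonzero=3)
```

The sparse coefficient vector `s_hat` (or statistics of it) would then serve as the feature passed to the classifier in place of MFCCs.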

8.
Hearing allows us to engage in speech communication, music appreciation, environmental awareness, and other sound-perception activities. In the normal auditory pathway, the mechanical vibration of sound is conducted into the cochlea mainly through the pinna, ear canal, eardrum, and ossicular chain; in the cochlea the vibration is transduced into neural electrical signals, which ascend through the auditory nervous system to the central auditory centers, where hearing is formed. When the auditory pathway is damaged, auditory function is impaired. For most hearing-impaired people the problem lies in the mechanical-conduction stage, and they can obtain acoustic amplification, and thus hearing compensation, by wearing hearing aids. However, for many people with severe or worse sensorineural… (abstract truncated in source)

9.
杨琳  张建平  王迪  颜永红 《声学学报》2009,34(2):151-157
Building on the conventional continuous interleaved sampling (CIS) strategy for cochlear implants, a CI speech-processing algorithm based on fine structure (frequency-modulation information) is proposed, which substantially improves speech recognition without introducing excessively high frequency components and while remaining feasible to implement. Auditory simulation experiments show that, compared with the conventional envelope-based CIS algorithm, the fine-structure CIS algorithm improves vowel intelligibility by up to 28%; tone recognition improves by more than 20% under various noise conditions; and under typical noise conditions consonant and sentence intelligibility improve by 22.9% and 28.3%, respectively.
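The envelope-based CIS baseline referred to above is commonly studied via a noise-vocoder simulation. The following is a hedged numpy-only sketch using brick-wall FFT filters for brevity; the band edges, 160 Hz envelope cutoff, and channel count are illustrative assumptions, and the paper's fine-structure extension is not reproduced:

```python
import numpy as np

def fft_bandpass(x, fs, lo, hi):
    """Brick-wall band-pass via the FFT (keeps lo <= f < hi)."""
    X = np.fft.rfft(x)
    f = np.fft.rfftfreq(len(x), 1 / fs)
    X[(f < lo) | (f >= hi)] = 0
    return np.fft.irfft(X, n=len(x))

def noise_vocoder(x, fs, edges, rng):
    """Envelope vocoder: in each channel, the band envelope
    (half-wave rectification + low-pass) modulates band-limited noise."""
    out = np.zeros_like(x)
    for lo, hi in zip(edges[:-1], edges[1:]):
        band = fft_bandpass(x, fs, lo, hi)
        # 160 Hz envelope cutoff is an illustrative choice
        env = fft_bandpass(np.maximum(band, 0.0), fs, 0.0, 160.0)
        noise = fft_bandpass(rng.standard_normal(len(x)), fs, lo, hi)
        out += env * noise
    return out

# Example: vocode a 1 kHz tone through four illustrative channels.
fs = 16000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 1000 * t)
edges = [300, 600, 1200, 2400, 4800]
y = noise_vocoder(x, fs, edges, np.random.default_rng(0))
```

A fine-structure variant would additionally carry slowly varying frequency-modulation information per channel instead of only the envelope; that extension is beyond this sketch.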

10.
Carrier frequency offset (CFO) and impulsive interference are common impairments in underwater acoustic communication systems. To improve system robustness and remove both types of interference, null subcarriers are used to estimate the CFO and the impulsive interference. Two null-subcarrier estimation schemes were adopted: stepwise optimization and joint optimization. Stepwise optimization first estimates the CFO using resampling and the null subcarriers; after compensation, it then uses the null subcarriers to cancel the impulsive noise. Because the CFO and impulsive-noise estimates interact, a joint estimate of the two yields the optimized solution. Simulation and water-tank experiments show that both optimization schemes perform well, and that joint optimization yields a larger system performance gain than stepwise optimization: at a bit error rate of 10⁻², the joint solution gains about 3 dB. By estimating and removing the CFO and impulsive interference, the robustness of the underwater acoustic communication system is greatly improved.
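A standard null-subcarrier CFO estimator (one common realization of the idea above, not the paper's exact stepwise or joint algorithm) grid-searches for the offset whose compensation leaves the least energy on the null subcarriers:

```python
import numpy as np

def estimate_cfo(y, null_idx, candidates, n_fft):
    """Grid-search CFO estimate: try each candidate offset (in units of
    the subcarrier spacing), de-rotate the received symbol, and keep the
    candidate leaving the least energy on the null subcarriers."""
    n = np.arange(len(y))
    def null_energy(eps):
        Y = np.fft.fft(y * np.exp(-2j * np.pi * eps * n / n_fft), n_fft)
        return np.sum(np.abs(Y[null_idx]) ** 2)
    return min(candidates, key=null_energy)

# Toy example: a 64-subcarrier OFDM symbol with every 8th subcarrier
# null, received with a normalized CFO of 0.3 (hypothetical parameters).
rng = np.random.default_rng(1)
n_fft = 64
null_idx = np.arange(0, n_fft, 8)
S = rng.choice([-1.0, 1.0], n_fft) + 0j      # BPSK data
S[null_idx] = 0                              # null subcarriers carry nothing
x = np.fft.ifft(S)
cfo_true = 0.3
y = x * np.exp(2j * np.pi * cfo_true * np.arange(n_fft) / n_fft)
cfo_hat = estimate_cfo(y, null_idx, np.linspace(-0.5, 0.5, 201), n_fft)
```

In a stepwise scheme, the symbol would be de-rotated by `cfo_hat` before the null subcarriers are reused to estimate and subtract the impulsive noise.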

11.
Temporal information provided by cochlear implants enables successful speech perception in quiet, but limited spectral information precludes comparable success in voice perception. Talker identification and speech decoding by young hearing children (5-7 yr), older hearing children (10-12 yr), and hearing adults were examined by means of vocoder simulations of cochlear implant processing. In Experiment 1, listeners heard vocoder simulations of sentences from a man, woman, and girl and were required to identify the talker from a closed set. Younger children identified talkers more poorly than older listeners, but all age groups showed similar benefit from increased spectral information. In Experiment 2, children and adults provided verbatim repetition of vocoded sentences from the same talkers. The youngest children had more difficulty than older listeners, but all age groups showed comparable benefit from increasing spectral resolution. At comparable levels of spectral degradation, performance on the open-set task of speech decoding was considerably more accurate than on the closed-set task of talker identification. Hearing children's ability to identify talkers and decode speech from spectrally degraded material sheds light on the difficulty of these domains for child implant users.

12.
For normal-hearing (NH) listeners, masker energy outside the spectral region of a target signal can improve target detection and identification, a phenomenon referred to as comodulation masking release (CMR). This study examined whether, for cochlear implant (CI) listeners and for NH listeners presented with a "noise vocoded" CI simulation, speech identification in modulated noise is improved by a co-modulated flanking band. In Experiment 1, NH listeners identified noise-vocoded speech in a background of on-target noise with or without a flanking narrow band of noise outside the spectral region of the target. The on-target noise and flanker were either 16-Hz square-wave modulated with the same phase or were unmodulated; the speech was taken from a closed-set corpus. Performance was better in modulated than in unmodulated noise, and this difference was slightly greater when the comodulated flanker was present, consistent with a small CMR of about 1.7 dB for noise-vocoded speech. Experiment 2, which tested CI listeners using the same speech materials, found no advantage for modulated versus unmodulated maskers and no CMR. Thus although NH listeners can benefit from CMR even for speech signals with reduced spectro-temporal detail, no CMR was observed for CI users.

13.

Background

Emotionally salient information in spoken language can be provided by variations in speech melody (prosody) or by emotional semantics. Emotional prosody is essential to convey feelings through speech. In sensorineural hearing loss, impaired speech perception can be improved by cochlear implants (CIs). The aim of this study was to investigate the performance of normal-hearing (NH) participants on the perception of emotional prosody with vocoded stimuli. Semantically neutral sentences with emotional (happy, angry, and neutral) prosody were used. Sentences were manipulated to simulate two CI speech-coding strategies: the Advanced Combination Encoder (ACE) and the newly developed Psychoacoustic Advanced Combination Encoder (PACE). Twenty NH adults were asked to recognize emotional prosody from ACE and PACE simulations. Performance was assessed using behavioral tests and event-related potentials (ERPs).

Results

Behavioral data revealed superior performance with original stimuli compared to the simulations. For the simulations, better recognition was observed for happy and angry prosody than for neutral prosody. Irrespective of simulated or unsimulated stimulus type, a significantly larger P200 event-related potential was observed after sentence onset for happy prosody than for the other two emotions. Further, the amplitude of P200 was significantly more positive for the PACE strategy than for the ACE strategy.

Conclusions

Results suggested the P200 peak as an indicator of active differentiation and recognition of emotional prosody. The larger P200 peak amplitude for happy prosody indicated the importance of fundamental frequency (F0) cues in prosody processing. The advantage of PACE over ACE highlighted a privileged role of the psychoacoustic masking model in improving prosody perception. Taken together, the study emphasizes the importance of vocoded simulation for better understanding the prosodic cues which CI users may be utilizing.

14.
Nonlinear sensory and neural processing mechanisms have been exploited to enhance spectral contrast for improvement of speech understanding in noise. The "companding" algorithm employs both two-tone suppression and adaptive gain mechanisms to achieve spectral enhancement. This study implemented a 50-channel companding strategy and evaluated its efficiency as a front-end noise suppression technique in cochlear implants. The key parameters were identified and evaluated to optimize the companding performance. Both normal-hearing (NH) listeners and cochlear-implant (CI) users performed phoneme and sentence recognition tests in quiet and in steady-state speech-shaped noise. Data from the NH listeners showed that for noise conditions, the implemented strategy improved vowel perception but not consonant and sentence perception. However, the CI users showed significant improvements in both phoneme and sentence perception in noise. Maximum average improvement for vowel recognition was 21.3 percentage points (p<0.05) at 0 dB signal-to-noise ratio (SNR), followed by 17.7 percentage points (p<0.05) at 5 dB SNR for sentence recognition and 12.1 percentage points (p<0.05) at 5 dB SNR for consonant recognition. While the observed results could be attributed to the enhanced spectral contrast, it is likely that the corresponding temporal changes caused by companding also played a significant role and should be addressed by future studies.

15.
This study evaluated the effects of time compression and expansion on sentence recognition by normal-hearing (NH) listeners and cochlear-implant (CI) recipients of the Nucleus-22 device. Sentence recognition was measured in five CI users using custom 4-channel continuous interleaved sampler (CIS) processors and five NH listeners using either 4-channel or 32-channel noise-band processors. For NH listeners, recognition was largely unaffected by time expansion, regardless of spectral resolution. However, recognition of time-compressed speech varied significantly with spectral resolution. When fine spectral resolution (32 channels) was available, speech recognition was unaffected even when the duration of sentences was shortened to 40% of their original length (equivalent to a mean duration of 40 ms/phoneme). However, a mean duration of 60 ms/phoneme was required to achieve the same level of recognition when only coarse spectral resolution (4 channels) was available. Recognition patterns were highly variable across CI listeners. The best CI listener performed as well as NH subjects listening to corresponding spectral conditions; however, three out of five CI listeners performed significantly poorer in recognizing time-compressed speech. Further investigation revealed that these three poorer-performing CI users also had more difficulty with simple temporal gap-detection tasks. The results indicate that limited spectral resolution reduces the ability to recognize time-compressed speech. Some CI listeners have more difficulty with time-compressed speech, as produced by rapid speakers, because of reduced spectral resolution and deficits in auditory temporal processing.

16.
The differences in spectral shape resolution abilities among cochlear implant (CI) listeners, and between CI and normal-hearing (NH) listeners, when listening with the same number of channels (12), were investigated. In addition, the effect of the number of channels on spectral shape resolution was examined. The stimuli were rippled noise signals with various ripple frequency-spacings. An adaptive 4IFC procedure was used to determine the threshold for resolvable ripple spacing, which was the spacing at which an interchange in peak and valley positions could be discriminated. The results showed poorer spectral shape resolution in CI compared to NH listeners (average thresholds of approximately 3000 and 400 Hz, respectively), and wide variability among CI listeners (range of approximately 800 to 8000 Hz). There was a significant relationship between spectral shape resolution and vowel recognition. The spectral shape resolution thresholds of NH listeners increased as the number of channels increased from 1 to 16, while the CI listeners showed a performance plateau at 4-6 channels, which is consistent with previous results using speech recognition measures. These results indicate that this test may provide a measure of CI performance which is time efficient and non-linguistic, and therefore, if verified, may provide a useful contribution to the prediction of speech perception in adults and children who use CIs.
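Rippled-noise stimuli of the kind described above can be sketched by imposing a raised-cosine spectral envelope on random-phase noise. This is a hedged frequency-domain construction under assumed parameters; the study's actual ripple depth, bandwidth, and calibration are not reproduced:

```python
import numpy as np

def rippled_noise(fs, dur, ripple_spacing_hz, phase=0.0, rng=None):
    """Random-phase noise whose magnitude spectrum is a raised cosine
    with peaks every `ripple_spacing_hz`; phase=np.pi interchanges the
    peak and valley positions (the cue used in ripple-reversal tests)."""
    rng = np.random.default_rng() if rng is None else rng
    n = int(fs * dur)
    f = np.fft.rfftfreq(n, 1 / fs)
    envelope = 1 + np.cos(2 * np.pi * f / ripple_spacing_hz + phase)
    spectrum = envelope * np.exp(2j * np.pi * rng.random(len(f)))
    spectrum[0] = 0  # remove DC
    return np.fft.irfft(spectrum, n=n)

# Standard and peak/valley-interchanged stimuli, 400 Hz ripple spacing.
fs, dur = 16000, 0.5
x = rippled_noise(fs, dur, 400.0, phase=0.0, rng=np.random.default_rng(2))
x_rev = rippled_noise(fs, dur, 400.0, phase=np.pi, rng=np.random.default_rng(2))
```

A 4IFC trial would present three intervals of `x` and one of `x_rev` (or vice versa) and adaptively vary the ripple spacing to find the discrimination threshold.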

17.
Speech recognition was measured as a function of spectral resolution (number of spectral channels) and speech-to-noise ratio in normal-hearing (NH) and cochlear-implant (CI) listeners. Vowel, consonant, word, and sentence recognition were measured in five normal-hearing listeners, ten listeners with the Nucleus-22 cochlear implant, and nine listeners with the Advanced Bionics Clarion cochlear implant. Recognition was measured as a function of the number of spectral channels (noise bands or electrodes) at signal-to-noise ratios of +15, +10, +5, and 0 dB, and in quiet. Performance with three different speech processing strategies (SPEAK, CIS, and SAS) was similar across all conditions, and improved as the number of electrodes increased (up to seven or eight) for all conditions. For all noise levels, vowel and consonant recognition with the SPEAK speech processor did not improve with more than seven electrodes, while for normal-hearing listeners, performance continued to increase up to at least 20 channels. Speech recognition on more difficult speech materials (word and sentence recognition) showed a marginally significant increase in Nucleus-22 listeners from seven to ten electrodes. The average implant score on all processing strategies was poorer than scores of NH listeners with similar processing. However, the best CI scores were similar to the normal-hearing scores for that condition (up to seven channels). CI listeners with the highest performance level increased in performance as the number of electrodes increased up to seven, while CI listeners with low levels of speech recognition did not increase in performance as the number of electrodes was increased beyond four. These results quantify the effect of the number of spectral channels on speech recognition in noise and demonstrate that most CI subjects are not able to fully utilize the spectral information provided by the number of electrodes used in their implant.

18.
Although in a number of experiments noise-band vocoders have been shown to provide acoustic models for speech perception in cochlear implants (CI), the present study assesses in four experiments whether and under what limitations noise-band vocoders can be used as an acoustic model for pitch perception in CI. The first two experiments examine the effect of spectral smearing on simulated electrode discrimination and fundamental frequency (F0) discrimination. The third experiment assesses the effect of spectral mismatch in an F0-discrimination task with two different vocoders. The fourth experiment investigates the effect of amplitude compression on modulation rate discrimination. For each experiment, the results obtained from normal-hearing subjects presented with vocoded stimuli are compared to results obtained directly from CI recipients. The results show that place pitch sensitivity drops with increased spectral smearing and that place pitch cues for multi-channel stimuli can adequately be mimicked when the discriminability of adjacent channels is adjusted by varying the spectral slopes to match that of CI subjects. The results also indicate that temporal pitch sensitivity is limited for noise-band carriers with low center frequencies and that the absence of a compression function in the vocoder might alter the saliency of the temporal pitch cues.

19.
Although some cochlear implant (CI) listeners can show good word recognition accuracy, it is not clear how they perceive and use the various acoustic cues that contribute to phonetic perceptions. In this study, the use of acoustic cues was assessed for normal-hearing (NH) listeners in optimal and spectrally degraded conditions, and also for CI listeners. Two experiments tested the tense/lax vowel contrast (varying in formant structure, vowel-inherent spectral change, and vowel duration) and the word-final fricative voicing contrast (varying in F1 transition, vowel duration, consonant duration, and consonant voicing). Identification results were modeled using mixed-effects logistic regression. These experiments suggested that under spectrally-degraded conditions, NH listeners decrease their use of formant cues and increase their use of durational cues. Compared to NH listeners, CI listeners showed decreased use of spectral cues like formant structure and formant change and consonant voicing, and showed greater use of durational cues (especially for the fricative contrast). The results suggest that although NH and CI listeners may show similar accuracy on basic tests of word, phoneme or feature recognition, they may be using different perceptual strategies in the process.

20.
The addition of low-passed (LP) speech or even a tone following the fundamental frequency (F0) of speech has been shown to benefit speech recognition for cochlear implant (CI) users with residual acoustic hearing. The mechanisms underlying this benefit are still unclear. In this study, eight bimodal subjects (CI users with acoustic hearing in the non-implanted ear) and eight simulated bimodal subjects (using vocoded and LP speech) were tested on vowel and consonant recognition to determine the relative contributions of acoustic and phonetic cues, including F0, to the bimodal benefit. Several listening conditions were tested (CI/Vocoder, LP, T(F0-env), CI/Vocoder + LP, CI/Vocoder + T(F0-env)). Compared with CI/Vocoder performance, LP significantly enhanced both consonant and vowel perception, whereas a tone following the F0 contour of target speech and modulated with an amplitude envelope of the maximum frequency of the F0 contour (T(F0-env)) enhanced only consonant perception. Information transfer analysis revealed a dual mechanism in the bimodal benefit: The tone representing F0 provided voicing and manner information, whereas LP provided additional manner, place, and vowel formant information. The data in actual bimodal subjects also showed that the degree of the bimodal benefit depended on the cutoff and slope of residual acoustic hearing.

