Similar Articles
20 similar articles found.
1.
The benefits of combined electric and acoustic stimulation (EAS) in terms of speech recognition in noise are well established; however, the underlying factors responsible for this benefit are not clear. The present study tests the hypothesis that having access to acoustic information in the low frequencies makes it easier for listeners to glimpse the target. Normal-hearing listeners were presented with vocoded speech alone (V), low-pass (LP) filtered speech alone, combined vocoded and LP speech (LP+V), and with vocoded stimuli constructed so that the low-frequency envelopes were easier to glimpse. Target speech was mixed with two types of maskers (steady-state noise and competing talker) at -5 to 5 dB signal-to-noise ratios. Results indicated no advantage of LP+V in steady noise, but a significant advantage over V in the competing-talker background, an outcome consistent with the notion that it is easier for listeners to glimpse the target in fluctuating maskers. A significant improvement in performance was noted with the modified glimpsed stimuli over the original vocoded stimuli. Taken together, these findings suggest that a significant factor contributing to the EAS advantage is the enhanced ability to glimpse the target.
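To make the stimulus construction concrete, here is a minimal sketch (not the study's actual code) of the two core operations: mixing target and masker at a fixed SNR, and building the LP+V stimulus by summing vocoded wideband speech with its low-pass portion. The 600-Hz cutoff and filter order are illustrative assumptions.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def mix_at_snr(target, masker, snr_db):
    """Scale the masker so the target-to-masker RMS ratio equals snr_db, then add."""
    rms = lambda x: np.sqrt(np.mean(x ** 2))
    masker = masker[:len(target)] * (rms(target) / rms(masker)) / 10 ** (snr_db / 20)
    return target + masker

def lowpass(x, fs, cutoff_hz=600):
    """Low-pass filter approximating the residual-acoustic-hearing portion."""
    sos = butter(6, cutoff_hz, btype="low", fs=fs, output="sos")
    return sosfiltfilt(sos, x)

# LP+V: vocoded wideband speech plus the low-pass acoustic portion; `vocode`
# stands in for any envelope vocoder (a sketch appears under entry 19 below).
# lp_plus_v = mix_at_snr(vocode(speech, fs) + lowpass(speech, fs), masker, -5)
```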

2.
Speech recognition in noise improves with combined acoustic and electric stimulation compared to electric stimulation alone [Kong et al., J. Acoust. Soc. Am. 117, 1351-1361 (2005)]. Here the contribution of fundamental frequency (F0) and low-frequency phonetic cues to speech recognition in combined hearing was investigated. Normal-hearing listeners heard vocoded speech in one ear and low-pass (LP) filtered speech in the other. Three listening conditions (vocode-alone, LP-alone, combined) were investigated. Target speech (average F0=120 Hz) was mixed with a time-reversed masker (average F0=172 Hz) at three signal-to-noise ratios (SNRs). LP speech aided performance at all SNRs. Low-frequency phonetic cues were then removed by replacing the LP speech with an LP equal-amplitude harmonic complex, frequency and amplitude modulated by the F0 and temporal envelope of voiced segments of the target. The combined hearing advantage disappeared at 10 and 15 dB SNR, but persisted at 5 dB SNR. A similar finding occurred when, additionally, F0 contour cues were removed. These results are consistent with a role for low-frequency phonetic cues, but not with a combination of F0 information between the two ears. The enhanced performance at 5 dB SNR with F0 contour cues absent suggests that voicing or glimpsing cues may be responsible for the combined hearing benefit.
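The harmonic-complex replacement can be sketched as follows: harmonics of a time-varying F0 are summed up to the LP cutoff and amplitude-modulated by the target's voiced-segment envelope. The F0 track and envelope are assumed to come from a pitch tracker; the cutoff value is an illustrative assumption.

```python
import numpy as np

def lp_harmonic_complex(f0_track, envelope, fs, cutoff_hz=600):
    """Equal-amplitude harmonic complex FM'd by f0_track (Hz per sample,
    0 where unvoiced) and AM'd by the target's temporal envelope."""
    phase = 2 * np.pi * np.cumsum(f0_track) / fs            # running F0 phase
    n_harm = max(int(cutoff_hz / max(f0_track.max(), 1.0)), 1)
    complex_tone = sum(np.sin(k * phase) for k in range(1, n_harm + 1))
    return envelope * complex_tone * (f0_track > 0)         # voiced segments only
```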

3.
Cochlear implant users report difficulty understanding speech in both noisy and reverberant environments. Electric-acoustic stimulation (EAS) is known to improve speech intelligibility in noise. However, little is known about the potential benefits of EAS in reverberation, or about how such benefits relate to those observed in noise. The present study used EAS simulations to examine these questions. Sentences were convolved with impulse responses from a model of a room whose estimated reverberation times were varied from 0 to 1 s. These reverberated stimuli were then vocoded to simulate electric stimulation, or presented as a combination of vocoder plus low-pass filtered speech to simulate EAS. Monaural sentence recognition scores were measured in two conditions: reverberated speech and speech in reverberated noise. The long-term spectrum and amplitude modulations of the noise were equated to the reverberant energy, allowing a comparison of the effects of the interferer (speech vs noise). Results indicate that, at least in simulation, (1) EAS provides significant benefit in reverberation; (2) the benefits of EAS in reverberation may be underestimated by those in a comparable noise; and (3) the EAS benefit in reverberation likely arises from partially preserved cues in this background accessible via the low-frequency acoustic component.
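The reverberant stimuli can be approximated by convolving dry sentences with a room impulse response. Lacking the paper's room model, this sketch uses the classic exponentially decaying Gaussian-noise RIR for a chosen reverberation time (T60); the subsequent vocoder/EAS processing would then operate on the reverberated signal.

```python
import numpy as np
from scipy.signal import fftconvolve

def synthetic_rir(t60, fs, length_s=None):
    """Exponentially decaying noise RIR; amplitude falls 60 dB at t = t60."""
    length_s = length_s if length_s is not None else 1.5 * t60
    t = np.arange(int(length_s * fs)) / fs
    return np.random.randn(t.size) * 10 ** (-3 * t / t60)

# reverberant = fftconvolve(dry_sentence, synthetic_rir(1.0, fs))[:len(dry_sentence)]
```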

4.
The purpose of this study was to explore the potential advantages, both theoretical and applied, of preserving low-frequency acoustic hearing in cochlear implant patients. Several hypotheses are presented that predict that residual low-frequency acoustic hearing along with electric stimulation for high frequencies will provide an advantage over traditional long-electrode cochlear implants for the recognition of speech in competing backgrounds. A simulation experiment in normal-hearing subjects demonstrated a clear advantage for preserving low-frequency residual acoustic hearing for speech recognition in a background of other talkers, but not in steady noise. Three subjects with an implanted "short-electrode" cochlear implant and preserved low-frequency acoustic hearing were also tested on speech recognition in the same competing backgrounds and compared to a larger group of traditional cochlear implant users. Each of the three short-electrode subjects performed better than any of the traditional long-electrode implant subjects for speech recognition in a background of other talkers, but not in steady noise, in general agreement with the simulation studies. When compared to a subgroup of traditional implant users matched according to speech recognition ability in quiet, the short-electrode patients showed a 9-dB advantage in the multitalker background. These experiments provide strong preliminary support for retaining residual low-frequency acoustic hearing in cochlear implant patients. The results are consistent with the idea that better perception of voice pitch, which can aid in separating voices in a background of other talkers, was responsible for this advantage.

5.
Speech reception thresholds (SRTs) were measured with a competing talker background for signals processed to contain variable amounts of temporal fine structure (TFS) information, using nine normal-hearing and nine hearing-impaired subjects. Signals (speech and background talker) were bandpass filtered into channels. Channel signals for channel numbers above a "cut-off channel" (CO) were vocoded to remove TFS information, while channel signals for channel numbers of CO and below were left unprocessed. Signals from all channels were combined. As a group, hearing-impaired subjects benefited less than normal-hearing subjects from the additional TFS information that was available as CO increased. The amount of benefit varied between hearing-impaired individuals, with some showing no improvement in SRT and one showing an improvement similar to that for normal-hearing subjects. The reduced ability to take advantage of TFS information in speech may partially explain why subjects with cochlear hearing loss get less benefit from listening in a fluctuating background than normal-hearing subjects. TFS information may be important in identifying the temporal "dips" in such a background.
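The cut-off-channel manipulation might look like the following sketch: the signal is split into bandpass channels; channels up to CO keep their TFS, while channels above CO are replaced by their envelopes imposed on noise carriers. Channel edges and filter orders are illustrative assumptions, not the study's exact values.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def cutoff_channel_process(x, fs, edges, co):
    """Keep TFS in channels 1..co; envelope-vocode channels above co."""
    out = np.zeros_like(x)
    for ch, (lo, hi) in enumerate(zip(edges[:-1], edges[1:]), start=1):
        sos = butter(4, [lo, hi], btype="band", fs=fs, output="sos")
        band = sosfiltfilt(sos, x)
        if ch > co:                                    # remove TFS above cut-off
            env = np.abs(hilbert(band))                # channel temporal envelope
            band = sosfiltfilt(sos, env * np.random.randn(x.size))
        out += band
    return out

# edges = np.geomspace(100, 8000, 17)                  # e.g., 16 log-spaced channels
```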

6.

Background

Emotionally salient information in spoken language can be provided by variations in speech melody (prosody) or by emotional semantics. Emotional prosody is essential to convey feelings through speech. In sensorineural hearing loss, impaired speech perception can be improved by cochlear implants (CIs). The aim of this study was to investigate the performance of normal-hearing (NH) participants on the perception of emotional prosody with vocoded stimuli. Semantically neutral sentences with emotional (happy, angry, and neutral) prosody were used. Sentences were manipulated to simulate two CI speech-coding strategies: the Advanced Combination Encoder (ACE) and the newly developed Psychoacoustic Advanced Combination Encoder (PACE). Twenty NH adults were asked to recognize emotional prosody from ACE and PACE simulations. Performance was assessed using behavioral tests and event-related potentials (ERPs).

Results

Behavioral data revealed superior performance with original stimuli compared to the simulations. For the simulations, better recognition was observed for happy and angry prosody than for neutral prosody. Irrespective of stimulus type (simulated or unsimulated), a significantly larger P200 event-related potential was observed after sentence onset for happy prosody than for the other two emotions. Further, the P200 amplitude was significantly more positive for the PACE strategy than for the ACE strategy.

Conclusions

The results suggest that the P200 peak serves as an indicator of the active differentiation and recognition of emotional prosody. The larger P200 amplitude for happy prosody indicates the importance of fundamental frequency (F0) cues in prosody processing. The advantage of PACE over ACE highlights the role of its psychoacoustic masking model in improving prosody perception. Taken together, the study emphasizes the importance of vocoded simulations for better understanding the prosodic cues that CI users may be utilizing.

7.
Due to the limited number of cochlear implantees speaking Mandarin Chinese, it is extremely difficult to evaluate new speech-coding algorithms designed for tonal languages. Access to an intelligibility index that could reliably predict the intelligibility of vocoded (and non-vocoded) Mandarin Chinese is a viable solution to address this challenge. The speech-transmission index (STI) and coherence-based intelligibility measures, among others, have been examined extensively for predicting the intelligibility of English speech, but have not been evaluated for vocoded or wideband (non-vocoded) Mandarin speech despite the perceptual differences between the two languages. Here, these measures were evaluated against listeners' recognition of vocoded and wideband Mandarin speech. The results indicated that the coherence-based measures seem to be influenced by the characteristics of the spoken language. The highest correlation (r = 0.91-0.97) was obtained in Mandarin Chinese with a weighted coherence measure that included primarily information from high-intensity voiced segments (e.g., vowels) containing F0 information, known to be important for lexical tone recognition. In contrast, in English, the highest correlation was obtained with a coherence measure that included information from weak consonants and vowel/consonant transitions. A band-importance function was proposed that captured information about the amplitude envelope contour. A higher modulation rate (100 Hz) was found necessary for the STI-based measures to reach maximum correlation (r = 0.94-0.96) with vocoded Mandarin and English recognition.
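This is not the paper's weighted coherence metric, but a minimal illustration of the quantity such measures build on: magnitude-squared coherence between clean and degraded speech, averaged across frequency. The paper's best-performing Mandarin measure additionally weights toward high-intensity voiced segments carrying F0.

```python
import numpy as np
from scipy.signal import coherence

def mean_msc(clean, degraded, fs, nperseg=512):
    """Mean magnitude-squared coherence: 1.0 = perfect transmission, 0.0 = none."""
    f, cxy = coherence(clean, degraded, fs=fs, nperseg=nperseg)
    return float(np.mean(cxy))
```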

8.
Previous studies have assessed the importance of temporal fine structure (TFS) for speech perception in noise by comparing the performance of normal-hearing listeners in two conditions. In one condition, the stimuli have useful information in both their temporal envelopes and their TFS. In the other condition, stimuli are vocoded and contain useful information only in their temporal envelopes. However, these studies have confounded differences in TFS with differences in the temporal envelope. The present study manipulated the analytic signal of stimuli to preserve the temporal envelope between conditions with different TFS. The inclusion of informative TFS improved speech-reception thresholds for sentences presented in steady and modulated noise, demonstrating that there are significant benefits of including informative TFS even when the temporal envelope is controlled. It is likely that the results of previous studies largely reflect the benefits of TFS, rather than uncontrolled effects of changes in the temporal envelope.
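The envelope/TFS decomposition via the analytic signal can be sketched per band as below; holding one signal's envelope fixed while carrying it on another signal's TFS is then a simple product. Filterbank details are omitted and assumed.

```python
import numpy as np
from scipy.signal import hilbert

def env_and_tfs(band):
    """Decompose a bandpass signal into temporal envelope and TFS."""
    analytic = hilbert(band)
    return np.abs(analytic), np.cos(np.angle(analytic))

# env_a, _ = env_and_tfs(band_a)        # envelope to preserve
# _, tfs_b = env_and_tfs(band_b)        # TFS to test
# hybrid = env_a * tfs_b                # same envelope, different TFS
```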

9.
For normal-hearing (NH) listeners, masker energy outside the spectral region of a target signal can improve target detection and identification, a phenomenon referred to as comodulation masking release (CMR). This study examined whether, for cochlear implant (CI) listeners and for NH listeners presented with a "noise vocoded" CI simulation, speech identification in modulated noise is improved by a co-modulated flanking band. In Experiment 1, NH listeners identified noise-vocoded speech in a background of on-target noise with or without a flanking narrow band of noise outside the spectral region of the target. The on-target noise and flanker were either 16-Hz square-wave modulated with the same phase or were unmodulated; the speech was taken from a closed-set corpus. Performance was better in modulated than in unmodulated noise, and this difference was slightly greater when the comodulated flanker was present, consistent with a small CMR of about 1.7 dB for noise-vocoded speech. Experiment 2, which tested CI listeners using the same speech materials, found no advantage for modulated versus unmodulated maskers and no CMR. Thus, although NH listeners can benefit from CMR even for speech signals with reduced spectro-temporal detail, no CMR was observed for CI users.
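A sketch of the masker construction: on-target and flanking noise bands, both gated by the same 16-Hz square wave so that their envelopes are comodulated. Band edges and duration are illustrative assumptions.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, square

def gated_noise_band(fs, dur_s, lo, hi, rate_hz=16.0):
    """Bandpass noise, square-wave (on/off) modulated at rate_hz with zero phase."""
    n = int(dur_s * fs)
    t = np.arange(n) / fs
    sos = butter(4, [lo, hi], btype="band", fs=fs, output="sos")
    noise = sosfiltfilt(sos, np.random.randn(n))
    return noise * (square(2 * np.pi * rate_hz * t) > 0)

# on_target = gated_noise_band(fs, 2.0, 500, 4000)    # overlaps the speech band
# flanker   = gated_noise_band(fs, 2.0, 6000, 7000)   # spectrally remote, comodulated
```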

10.
Chinese sentence recognition strongly relates to the reception of tonal information. For cochlear implant (CI) users with residual acoustic hearing, tonal information may be enhanced by restoring low-frequency acoustic cues in the nonimplanted ear. The present study investigated the contribution of low-frequency acoustic information to Chinese speech recognition in Mandarin-speaking normal-hearing subjects listening to acoustic simulations of bilaterally combined electric and acoustic hearing. Subjects listened to a 6-channel CI simulation in one ear and low-pass filtered speech in the other ear. Chinese tone, phoneme, and sentence recognition were measured in steady-state, speech-shaped noise, as a function of the cutoff frequency for low-pass filtered speech. Results showed that low-frequency acoustic information below 500 Hz contributed most strongly to tone recognition, while low-frequency acoustic information above 500 Hz contributed most strongly to phoneme recognition. For Chinese sentences, speech reception thresholds (SRTs) improved with increasing amounts of low-frequency acoustic information, and significantly improved when low-frequency acoustic information above 500 Hz was preserved. SRTs were not significantly affected by the degree of spectral overlap between the CI simulation and low-pass filtered speech. These results suggest that, for CI patients with residual acoustic hearing, preserving low-frequency acoustic information can improve Chinese speech recognition in noise.

11.
This study investigated which acoustic cues within the speech signal are responsible for bimodal speech perception benefit. Seven cochlear implant (CI) users with usable residual hearing at low frequencies in the non-implanted ear participated. Sentence tests were performed in near-quiet (some noise on the CI side to reduce scores from ceiling) and in a modulated noise background, with the implant alone and with the addition, in the hearing ear, of one of four types of acoustic signals derived from the same sentences: (1) a complex tone modulated by the fundamental frequency (F0) and amplitude envelope contours; (2) a pure tone modulated by the F0 and amplitude contours; (3) a noise-vocoded signal; (4) unprocessed speech. The modulated tones provided F0 information without spectral shape information, whilst the vocoded signal presented spectral shape information without F0 information. For the group as a whole, only the unprocessed speech condition provided significant benefit over implant-alone scores, in both near-quiet and noise. This suggests that, on average, F0 or spectral cues in isolation provided limited benefit for these subjects in the tested listening conditions, and that the significant benefit observed in the full-signal condition was derived from implantees' use of a combination of these cues.

12.
The speech understanding of persons with sloping high-frequency (HF) hearing impairment (HI) was compared to that of normal-hearing (NH) controls and to previous research on persons with "flat" losses [Hornsby and Ricketts (2003). J. Acoust. Soc. Am. 113, 1706-1717] to examine how hearing loss configuration affects the contribution of speech information in various frequency regions. Speech understanding was assessed at multiple low- and high-pass filter cutoff frequencies. Crossover frequencies, defined as the cutoff frequencies at which low- and high-pass filtering yielded equivalent performance, were significantly lower for the sloping-HI group than for the NH group, suggesting that HF HI limits the utility of HF speech information. Speech intelligibility index calculations suggest this limited utility was not due simply to reduced audibility but also to the negative effects of high presentation levels and a poorer-than-normal use of speech information in the frequency region with the greatest hearing loss (the HF regions). This deficit was comparable, however, to that seen in the low-frequency regions of persons with similar HF thresholds and "flat" hearing losses, suggesting that sensorineural HI results in a "uniform," rather than frequency-specific, deficit in speech understanding, at least for persons with HF thresholds up to 60-80 dB HL.

13.
To examine spectral effects on declines in speech recognition in noise at high levels, word recognition for 18 young adults with normal hearing was assessed for low-pass-filtered speech and speech-shaped maskers or high-pass-filtered speech and speech-shaped maskers at three speech levels (70, 77, and 84 dB SPL) for each of three signal-to-noise ratios (+8, +3, and -2 dB). An additional low-level noise produced equivalent masked thresholds for all subjects. Pure-tone thresholds were measured in quiet and in all maskers. If word recognition were determined entirely by signal-to-noise ratio, and were independent of signal levels and the spectral content of speech and maskers, scores should remain constant with increasing level for both low- and high-frequency speech and maskers. Recognition of low-frequency speech in low-frequency maskers and high-frequency speech in high-frequency maskers decreased significantly with increasing speech level when signal-to-noise ratio was held constant. For low-frequency speech and speech-shaped maskers, the decline was attributed to nonlinear growth of masking, which reduced the "effective" signal-to-noise ratio at high levels, similar to previous results for broadband speech and speech-shaped maskers. Masking growth and reduced "effective" signal-to-noise ratio accounted for some, but not all, of the decline in recognition of high-frequency speech in high-frequency maskers.

14.
The addition of low-passed (LP) speech or even a tone following the fundamental frequency (F0) of speech has been shown to benefit speech recognition for cochlear implant (CI) users with residual acoustic hearing. The mechanisms underlying this benefit are still unclear. In this study, eight bimodal subjects (CI users with acoustic hearing in the non-implanted ear) and eight simulated bimodal subjects (using vocoded and LP speech) were tested on vowel and consonant recognition to determine the relative contributions of acoustic and phonetic cues, including F0, to the bimodal benefit. Several listening conditions were tested (CI/Vocoder, LP, T(F0-env), CI/Vocoder + LP, CI/Vocoder + T(F0-env)). Compared with CI/Vocoder performance, LP significantly enhanced both consonant and vowel perception, whereas a tone following the F0 contour of target speech and modulated with an amplitude envelope of the maximum frequency of the F0 contour (T(F0-env)) enhanced only consonant perception. Information transfer analysis revealed a dual mechanism in the bimodal benefit: The tone representing F0 provided voicing and manner information, whereas LP provided additional manner, place, and vowel formant information. The data in actual bimodal subjects also showed that the degree of the bimodal benefit depended on the cutoff and slope of residual acoustic hearing.

15.
Temporal information provided by cochlear implants enables successful speech perception in quiet, but limited spectral information precludes comparable success in voice perception. Talker identification and speech decoding by young hearing children (5-7 yr), older hearing children (10-12 yr), and hearing adults were examined by means of vocoder simulations of cochlear implant processing. In Experiment 1, listeners heard vocoder simulations of sentences from a man, woman, and girl and were required to identify the talker from a closed set. Younger children identified talkers more poorly than older listeners, but all age groups showed similar benefit from increased spectral information. In Experiment 2, children and adults provided verbatim repetition of vocoded sentences from the same talkers. The youngest children had more difficulty than older listeners, but all age groups showed comparable benefit from increasing spectral resolution. At comparable levels of spectral degradation, performance on the open-set task of speech decoding was considerably more accurate than on the closed-set task of talker identification. Hearing children's ability to identify talkers and decode speech from spectrally degraded material sheds light on the difficulty of these domains for child implant users.

16.
Three experiments measured constancy in speech perception, using natural-speech messages or noise-band vocoder versions of them. The eight vocoder-bands had equally log-spaced center-frequencies and the shapes of corresponding "auditory" filters. Consequently, the bands had the temporal envelopes that arise in these auditory filters when the speech is played. The "sir" or "stir" test-words were distinguished by degrees of amplitude modulation, and played in the context: "next you'll get _ to click on." Listeners identified test-words appropriately, even in the vocoder conditions where the speech had a "noise-like" quality. Constancy was assessed by comparing the identification of test-words with low or high levels of room reflections across conditions where the context had either a low or a high level of reflections. Constancy was obtained with both the natural and the vocoded speech, indicating that the effect arises through temporal-envelope processing. Two further experiments assessed perceptual weighting of the different bands, both in the test word and in the context. The resulting weighting functions both increase monotonically with frequency, following the spectral characteristics of the test-word's [s]. It is suggested that these two weighting functions are similar because they both come about through the perceptual grouping of the test-word's bands.

17.
These experiments examined how high presentation levels influence speech recognition for high- and low-frequency stimuli in noise. Normally hearing (NH) and hearing-impaired (HI) listeners were tested. In Experiment 1, high- and low-frequency bandwidths yielding 70%-correct word recognition in quiet were determined at levels associated with broadband speech at 75 dB SPL. In Experiment 2, broadband and band-limited sentences (based on passbands measured in Experiment 1) were presented at this level in speech-shaped noise filtered to the same frequency bandwidths as targets. Noise levels were adjusted to produce approximately 30%-correct word recognition. Frequency bandwidths and signal-to-noise ratios supporting criterion performance in Experiment 2 were tested at 75, 87.5, and 100 dB SPL in Experiment 3. Performance tended to decrease as levels increased. For NH listeners, this "rollover" effect was greater for high-frequency and broadband materials than for low-frequency stimuli. For HI listeners, the 75- to 87.5-dB increase improved signal audibility for high-frequency stimuli and rollover was not observed. However, the 87.5- to 100-dB increase produced qualitatively similar results for both groups: scores decreased most for high-frequency stimuli and least for low-frequency materials. Predictions of speech intelligibility by quantitative methods such as the Speech Intelligibility Index may be improved if rollover effects are modeled as frequency dependent.

18.
Across-frequency processing by common interaural time delay (ITD) in spatial unmasking was investigated by measuring speech reception thresholds (SRTs) for high- and low-frequency bands of target speech presented against concurrent speech or a noise masker. Experiment 1 indicated that presenting one of these target bands with an ITD of +500 μs and the other with zero ITD (like the masker) provided some release from masking, but full binaural advantage was only measured when both target bands were given an ITD of +500 μs. Experiment 2 showed that full binaural advantage could also be achieved when the high- and low-frequency bands were presented with ITDs of equal but opposite magnitude (±500 μs). In experiment 3, the masker was also split into high- and low-frequency bands with ITDs of equal but opposite magnitude (±500 μs). The ITD of the low-frequency target band matched that of the high-frequency masking band and vice versa. SRTs indicated that, as long as the target and masker differed in ITD within each frequency band, full binaural advantage could be achieved. These results suggest that the mechanism underlying spatial unmasking exploits differences in ITD independently within each frequency channel.
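Applying the ITDs in this design reduces to delaying band-limited copies between the ears; at a 48-kHz rate, 500 μs is a whole-sample (24-sample) shift, so no fractional-delay interpolation is needed. This sketch shows the equal-but-opposite condition of experiment 2; the sample rate and band edges are illustrative assumptions.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

fs = 48000
itd = int(round(500e-6 * fs))            # 500 us = 24 samples at 48 kHz

def band(x, lo, hi):
    return sosfiltfilt(butter(4, [lo, hi], btype="band", fs=fs, output="sos"), x)

def delayed(x, n):
    """Delay x by n samples (this ear lags by n/fs seconds)."""
    return np.concatenate([np.zeros(n), x])[:len(x)]

# low, high = band(target, 100, 1500), band(target, 1500, 8000)
# left  = delayed(low, itd) + high      # low band leads in the right ear
# right = low + delayed(high, itd)      # high band leads in the left ear
```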

19.
This study investigated the benefits of adding unprocessed low-frequency information to acoustic simulations of cochlear-implant processing in normal-hearing listeners. Implant processing was simulated using an eight-channel noise-excited envelope vocoder, and low-frequency information was added by replacing the lower frequency channels of the processor with a low-pass-filtered version of the original stimulus. Experiment 1 measured sentence-level speech reception as a function of target-to-masker ratio, with either steady-state speech-shaped noise or single-talker maskers. Experiment 2 measured listeners' ability to identify two vowels presented simultaneously, as a function of the F0 difference between the two vowels. In both experiments low-frequency information was added below either 300 or 600 Hz. The introduction of the additional low-frequency information led to substantial and significant improvements in performance in both experiments, with a greater improvement observed for the higher (600 Hz) than for the lower (300 Hz) cutoff frequency. However, performance never equaled performance in the unprocessed conditions. The results confirm other recent demonstrations that added low-frequency information can provide significant benefits in intelligibility, which may at least in part be attributed to improvements in F0 representation. The findings provide further support for efforts to make use of residual acoustic hearing in cochlear-implant users.
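A hedged sketch of the simulation: an eight-channel noise-excited envelope vocoder whose lowest channels are dropped and replaced with low-pass-filtered original speech below the 300- or 600-Hz cutoff. Band edges, filter orders, and envelope extraction are illustrative assumptions, not the study's exact parameters.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocoder_plus_lp(x, fs, edges, lp_cutoff_hz=None):
    """Noise-excited envelope vocoder; channels below lp_cutoff_hz are replaced
    by the unprocessed low-pass portion of the input."""
    out = np.zeros_like(x)
    for lo, hi in zip(edges[:-1], edges[1:]):
        if lp_cutoff_hz and hi <= lp_cutoff_hz:
            continue                              # channel replaced by LP speech
        sos = butter(4, [lo, hi], btype="band", fs=fs, output="sos")
        env = np.abs(hilbert(sosfiltfilt(sos, x)))        # channel envelope
        out += sosfiltfilt(sos, env * np.random.randn(x.size))
    if lp_cutoff_hz:
        lp = butter(6, lp_cutoff_hz, btype="low", fs=fs, output="sos")
        out += sosfiltfilt(lp, x)                 # unprocessed low-frequency speech
    return out

# edges = np.geomspace(80, 8000, 9)               # 8 channels; e.g., lp_cutoff_hz=600
```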

20.
An environmental sound recognition experiment with cochlear implant (CI) acoustic simulations was conducted with 14 normal-hearing listeners, comparing recognition under two types of vocoder simulation (sine-wave carriers and noise carriers); an environmental sound recognition experiment was then conducted with 9 Mandarin-speaking adult CI users. The test materials were 67 environmental sounds collected from the Internet and screened through subjective validation by 12 normal-hearing listeners. The results showed that carrier type had no significant effect on average recognition across the 67 environmental sounds, but differences in acoustic features made the recognition of individual sounds dependent on carrier type. In addition, the CI users' environmental sound recognition was relatively poor and awaits improvement through advances in signal-processing strategies, neural interfaces, and rehabilitation methods. The environmental sound materials developed in this study can be used to evaluate environmental sound recognition with cochlear implants.
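The sine- vs noise-carrier contrast amounts to swapping the carrier inside the vocoder's channel loop: the channel envelope modulates a tone at the channel centre frequency instead of band-limited noise. A minimal sketch, reusing the channel envelope from the entry-19 vocoder; the geometric-mean centre frequency is an assumption.

```python
import numpy as np

def sine_carrier_channel(env, lo, hi, fs):
    """Sine-carrier vocoding of one channel: the envelope modulates a tone at
    the channel's (geometric-mean) centre frequency."""
    fc = np.sqrt(lo * hi)
    t = np.arange(env.size) / fs
    return env * np.sin(2 * np.pi * fc * t)

# Noise carrier (cf. entry 19): bandpass-filtered  env * np.random.randn(env.size)
```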
