期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Effects of reverberation and masking on speech intelligibility in cochlear implant simulations

Poissant SF Whitmal NA Freyman RL 《The Journal of the Acoustical Society of America》2006,119(3):1606-1615

Two experiments investigated the impact of reverberation and masking on speech understanding using cochlear implant (CI) simulations. Experiment 1 tested sentence recognition in quiet. Stimuli were processed with reverberation simulation (T=0.425, 0.266, 0.152, and 0.0 s) and then either processed with vocoding (6, 12, or 24 channels) or were subjected to no further processing. Reverberation alone had only a small impact on perception when as few as 12 channels of information were available. However, when the processing was limited to 6 channels, perception was extremely vulnerable to the effects of reverberation. In experiment 2, subjects listened to reverberated sentences, through 6- and 12-channel processors, in the presence of either speech-spectrum noise (SSN) or two-talker babble (TTB) at various target-to-masker ratios. The combined impact of reverberation and masking was profound, although there was no interaction between the two effects. This differs from results obtained in subjects listening to unprocessed speech where interactions between reverberation and masking have been shown to exist. A speech transmission index (STI) analysis indicated a reasonably good prediction of speech recognition performance. Unlike previous investigations, the SSN and TTB maskers produced equivalent results, raising questions about the role of informational masking in CI processed speech. 相似文献

2.

Channel weights for speech recognition in cochlear implant users

Mehr MA Turner CW Parkinson A 《The Journal of the Acoustical Society of America》2001,109(1):359-366

The purpose of this study was to develop and validate a method of estimating the relative "weight" that a multichannel cochlear implant user places on individual channels, indicating its contribution to overall speech recognition. The correlational method as applied to speech recognition was used both with normal-hearing listeners and with cochlear implant users fitted with six-channel speech processors. Speech was divided into frequency bands corresponding to the bands of the processor and a randomly chosen level of corresponding filtered noise was added to each channel on each trial. Channels in which the signal-to-noise ratio was more highly correlated with performance have higher weights, and conversely, channels in which the correlations were smaller have lower weights. Normal-hearing listeners showed approximately equal weights across frequency bands. In contrast, cochlear implant users showed unequal weighting across bands, and varied from individual to individual with some channels apparently not contributing significantly to speech recognition. To validate these channel weights, individual channels were removed and speech recognition in quiet was tested. A strong correlation was found between the relative weight of the channel removed and the decrease in speech recognition, thus providing support for use of the correlational method for cochlear implant users. 相似文献

3.

Effects of short-term auditory deprivation on speech production in adult cochlear implant users.

M A Svirsky H Lane J S Perkell J Wozniak 《The Journal of the Acoustical Society of America》1992,92(3):1284-1300

Speech production parameters of three postlingually deafened adults who use cochlear implants were measured: after 24 h of auditory deprivation (which was achieved by turning the subject's speech processor off); after turning the speech processor back on; and after turning the speech processor off again. The measured parameters included vowel acoustics [F1, F2, F0, sound-pressure level (SPL), duration and H1-H2, the amplitude difference between the first two spectral harmonics, a correlate of breathiness] while reading word lists, and average airflow during the reading of passages. Changes in speech processor state (on-to-off or vice versa) were accompanied by numerous changes in speech production parameters. Many changes were in the direction of normalcy, and most were consistent with long-term speech production changes in the same subjects following activation of the processors of their cochlear implants [Perkell et al., J. Acoust. Soc. Am. 91, 2961-2978 (1992)]. Changes in mean airflow were always accompanied by H1-H2 (breathiness) changes in the same direction, probably due to underlying changes in laryngeal posture. Some parameters (different combinations of SPL, F0, H1-H2 and formants for different subjects) showed very rapid changes when turning the speech processor on or off. Parameter changes were faster and more pronounced, however, when the speech processor was turned on than when it was turned off. The picture that emerges from the present study is consistent with a dual role for auditory feedback in speech production: long-term calibration of articulatory parameters as well as feedback mechanisms with relatively short time constants. 相似文献

4.

Beneficial acoustic speech cues for cochlear implant users with residual acoustic hearing

Visram AS Azadpour M Kluk K McKay CM 《The Journal of the Acoustical Society of America》2012,131(5):4042-4050

This study investigated which acoustic cues within the speech signal are responsible for bimodal speech perception benefit. Seven cochlear implant (CI) users with usable residual hearing at low frequencies in the non-implanted ear participated. Sentence tests were performed in near-quiet (some noise on the CI side to reduce scores from ceiling) and in a modulated noise background, with the implant alone and with the addition, in the hearing ear, of one of four types of acoustic signals derived from the same sentences: (1) a complex tone modulated by the fundamental frequency (F0) and amplitude envelope contours; (2) a pure tone modulated by the F0 and amplitude contours; (3) a noise-vocoded signal; (4) unprocessed speech. The modulated tones provided F0 information without spectral shape information, whilst the vocoded signal presented spectral shape information without F0 information. For the group as a whole, only the unprocessed speech condition provided significant benefit over implant-alone scores, in both near-quiet and noise. This suggests that, on average, F0 or spectral cues in isolation provided limited benefit for these subjects in the tested listening conditions, and that the significant benefit observed in the full-signal condition was derived from implantees' use of a combination of these cues. 相似文献

5.

Effect of spectral normalization on different talker speech recognition by cochlear implant users

Liu C Galvin J Fu QJ Narayanan SS 《The Journal of the Acoustical Society of America》2008,123(5):2836-2847

In cochlear implants (CIs), different talkers often produce different levels of speech understanding because of the spectrally distorted speech patterns provided by the implant device. A spectral normalization approach was used to transform the spectral characteristics of one talker to those of another talker. In Experiment 1, speech recognition with two talkers was measured in CI users, with and without spectral normalization. Results showed that the spectral normalization algorithm had small but significant effect on performance. In Experiment 2, the effects of spectral normalization were measured in CI users and normal-hearing (NH) subjects; a pitch-stretching technique was used to simulate six talkers with different fundamental frequencies and vocal tract configurations. NH baseline performance was nearly perfect with these pitch-shift transformations. For CI subjects, while there was considerable intersubject variability in performance with the different pitch-shift transformations, spectral normalization significantly improved the intelligibility of these simulated talkers. The results from Experiments 1 and 2 demonstrate that spectral normalization toward more-intelligible talkers significantly improved CI users' speech understanding with less-intelligible talkers. The results suggest that spectral normalization using optimal reference patterns for individual CI patients may compensate for some of the acoustic variability across talkers. 相似文献

6.

Talker intelligibility differences in cochlear implant listeners

Green T Katiri S Faulkner A Rosen S 《The Journal of the Acoustical Society of America》2007,121(6):EL223-EL229

People vary in the intelligibility of their speech. This study investigated whether across-talker intelligibility differences observed in normally-hearing listeners are also found in cochlear implant (CI) users. Speech perception for male, female, and child pairs of talkers differing in intelligibility was assessed with actual and simulated CI processing and in normal hearing. While overall speech recognition was, as expected, poorer for CI users, differences in intelligibility across talkers were consistent across all listener groups. This suggests that the primary determinants of intelligibility differences are preserved in the CI-processed signal, though no single critical acoustic property could be identified. 相似文献

7.

Understanding speech in modulated interference: cochlear implant users and normal-hearing listeners 总被引：9，自引：0，他引：9

Nelson PB Jin SH Carney AE Nelson DA 《The Journal of the Acoustical Society of America》2003,113(2):961-968

Many competing noises in real environments are modulated or fluctuating in level. Listeners with normal hearing are able to take advantage of temporal gaps in fluctuating maskers. Listeners with sensorineural hearing loss show less benefit from modulated maskers. Cochlear implant users may be more adversely affected by modulated maskers because of their limited spectral resolution and by their reliance on envelope-based signal-processing strategies of implant processors. The current study evaluated cochlear implant users' ability to understand sentences in the presence of modulated speech-shaped noise. Normal-hearing listeners served as a comparison group. Listeners repeated IEEE sentences in quiet, steady noise, and modulated noise maskers. Maskers were presented at varying signal-to-noise ratios (SNRs) at six modulation rates varying from 1 to 32 Hz. Results suggested that normal-hearing listeners obtain significant release from masking from modulated maskers, especially at 8-Hz masker modulation frequency. In contrast, cochlear implant users experience very little release from masking from modulated maskers. The data suggest, in fact, that they may show negative effects of modulated maskers at syllabic modulation rates (2-4 Hz). Similar patterns of results were obtained from implant listeners using three different devices with different speech-processor strategies. The lack of release from masking occurs in implant listeners independent of their device characteristics, and may be attributable to the nature of implant processing strategies and/or the lack of spectral detail in processed stimuli. 相似文献

8.

Effects of cochlear implant processing and fundamental frequency on the intelligibility of competing sentences

Stickney GS Assmann PF Chang J Zeng FG 《The Journal of the Acoustical Society of America》2007,122(2):1069-1078

Speech perception in the presence of another competing voice is one of the most challenging tasks for cochlear implant users. Several studies have shown that (1) the fundamental frequency (F0) is a useful cue for segregating competing speech sounds and (2) the F0 is better represented by the temporal fine structure than by the temporal envelope. However, current cochlear implant speech processing algorithms emphasize temporal envelope information and discard the temporal fine structure. In this study, speech recognition was measured as a function of the F0 separation of the target and competing sentence in normal-hearing and cochlear implant listeners. For the normal-hearing listeners, the combined sentences were processed through either a standard implant simulation or a new algorithm which additionally extracts a slowed-down version of the temporal fine structure (called Frequency-Amplitude-Modulation-Encoding). The results showed no benefit of increasing F0 separation for the cochlear implant or simulation groups. In contrast, the new algorithm resulted in gradual improvements with increasing F0 separation, similar to that found with unprocessed sentences. These results emphasize the importance of temporal fine structure for speech perception and demonstrate a potential remedy for difficulty in the perceptual segregation of competing speech sounds. 相似文献

9.

The impact of reverberant self-masking and overlap-masking effects on speech intelligibility by cochlear implant listeners (L)

Kokkinakis K Loizou PC 《The Journal of the Acoustical Society of America》2011,130(3):1099-1102

The purpose of this study is to determine the relative impact of reverberant self-masking and overlap-masking effects on speech intelligibility by cochlear implant listeners. Sentences were presented in two conditions wherein reverberant consonant segments were replaced with clean consonants, and in another condition wherein reverberant vowel segments were replaced with clean vowels. The underlying assumption is that self-masking effects would dominate in the first condition, whereas overlap-masking effects would dominate in the second condition. Results indicated that the degradation of speech intelligibility in reverberant conditions is caused primarily by self-masking effects that give rise to flattened formant transitions. 相似文献

10.

Recognition of spectrally asynchronous speech by normal-hearing listeners and Nucleus-22 cochlear implant users

Fu QJ Galvin JJ 《The Journal of the Acoustical Society of America》2001,109(3):1166-1172

This experiment examined the effects of spectral resolution and fine spectral structure on recognition of spectrally asynchronous sentences by normal-hearing and cochlear implant listeners. Sentence recognition was measured in six normal-hearing subjects listening to either full-spectrum or noise-band processors and five Nucleus-22 cochlear implant listeners fitted with 4-channel continuous interleaved sampling (CIS) processors. For the full-spectrum processor, the speech signals were divided into either 4 or 16 channels. For the noise-band processor, after band-pass filtering into 4 or 16 channels, the envelope of each channel was extracted and used to modulate noise of the same bandwidth as the analysis band, thus eliminating the fine spectral structure available in the full-spectrum processor. For the 4-channel CIS processor, the amplitude envelopes extracted from four bands were transformed to electric currents by a power function and the resulting electric currents were used to modulate pulse trains delivered to four electrode pairs. For all processors, the output of each channel was time-shifted relative to other channels, varying the channel delay across channels from 0 to 240 ms (in 40-ms steps). Within each delay condition, all channels were desynchronized such that the cross-channel delays between adjacent channels were maximized, thereby avoiding local pockets of channel synchrony. Results show no significant difference between the 4- and 16-channel full-spectrum speech processor for normal-hearing listeners. Recognition scores dropped significantly only when the maximum delay reached 200 ms for the 4-channel processor and 240 ms for the 16-channel processor. When fine spectral structures were removed in the noise-band processor, sentence recognition dropped significantly when the maximum delay was 160 ms for the 16-channel noise-band processor and 40 ms for the 4-channel noise-band processor. There was no significant difference between implant listeners using the 4-channel CIS processor and normal-hearing listeners using the 4-channel noise-band processor. The results imply that when fine spectral structures are not available, as in the implant listener's case, increased spectral resolution is important for overcoming cross-channel asynchrony in speech signals. 相似文献

11.

Effect of talker and speaking style on the speech transmission index

van Wijngaarden SJ Houtgast T 《The Journal of the Acoustical Society of America》2004,115(1):38-41

相似文献

12.

The ability of cochlear implant users to use temporal envelope cues recovered from speech frequency modulation

JH Won C Lorenzi K Nie X Li EM Jameyson WR Drennan JT Rubinstein 《The Journal of the Acoustical Society of America》2012,132(2):1113-1119

Previous studies have demonstrated that normal-hearing listeners can understand speech using the recovered "temporal envelopes," i.e., amplitude modulation (AM) cues from frequency modulation (FM). This study evaluated this mechanism in cochlear implant (CI) users for consonant identification. Stimuli containing only FM cues were created using 1, 2, 4, and 8-band FM-vocoders to determine if consonant identification performance would improve as the recovered AM cues become more available. A consistent improvement was observed as the band number decreased from 8 to 1, supporting the hypothesis that (1) the CI sound processor generates recovered AM cues from broadband FM, and (2) CI users can use the recovered AM cues to recognize speech. The correlation between the intact and the recovered AM components at the output of the sound processor was also generally higher when the band number was low, supporting the consonant identification results. Moreover, CI subjects who were better at using recovered AM cues from broadband FM cues showed better identification performance with intact (unprocessed) speech stimuli. This suggests that speech perception performance variability in CI users may be partly caused by differences in their ability to use AM cues recovered from FM speech cues. 相似文献

13.

Effects of hearing protector devices on speech intelligibility

João Candido Fernandes 《Applied Acoustics》2003,64(6):581-590

The purpose of this study was to determine the influence of hearing protection devices (HPDs) on the understanding of speech in young adults with normal hearing, both in a silent situation and in the presence of ambient noise. The experimental research was carried out with the following variables: five different conditions of HPD use (without protectors, with two earplugs and with two earmuffs); a type of noise (pink noise); 4 test levels (60, 70, 80 and 90 dB[A]); 6 signal/noise ratios (without noise, +5, +10, zero, −5 and −10 dB); 5 repetitions for each case, totalling 600 tests with 10 monosyllables in each one. The variable measure was the percentage of correctly heard words (monosyllabic) in the test. The results revealed that, at the lowest levels (60 and 70 dB), the protectors reduced the intelligibility of speech (compared to the tests without protectors) while, in the presence of ambient noise levels of 80 and 90 dB and unfavourable signal/noise ratios (0, −5 and −10 dB), the HPDs improved the intelligibility. A comparison of the effectiveness of earplugs versus earmuffs showed that the former offer greater efficiency in respect to the recognition of speech, providing a 30% improvement over situations in which no protection is used. As might be expected, this study confirmed that the protectors' influence on speech intelligibility is related directly to the spectral curve of the protector's attenuation. 相似文献

14.

The effect of reduced vowel working space on speech intelligibility in Mandarin-speaking young adults with cerebral palsy

Liu HM Tsao FM Kuhl PK 《The Journal of the Acoustical Society of America》2005,117(6):3879-3889

The purpose of this study was to examine the effect of reduced vowel working space on dysarthric talkers' speech intelligibility using both acoustic and perceptual approaches. In experiment 1, the acoustic-perceptual relationship between vowel working space area and speech intelligibility was examined in Mandarin-speaking young adults with cerebral palsy. Subjects read aloud 18 bisyllabic words containing the vowels /i/, /a/, and /u/ using their normal speaking rate. Each talker's words were identified by three normal listeners. The percentage of correct vowel and word identification were calculated as vowel intelligibility and word intelligibility, respectively. Results revealed that talkers with cerebral palsy exhibited smaller vowel working space areas compared to ten age-matched controls. The vowel working space area was significantly correlated with vowel intelligibility (r=0.632, p<0.005) and with word intelligibility (r=0.684, p<0.005). Experiment 2 examined whether tokens of expanded vowel working spaces were perceived as better vowel exemplars and represented with greater perceptual spaces than tokens of reduced vowel working spaces. The results of the perceptual experiment support this prediction. The distorted vowels of talkers with cerebral palsy compose a smaller acoustic space that results in shrunken intervowel perceptual distances for listeners. 相似文献

15.

Comparing spatial tuning curves, spectral ripple resolution, and speech perception in cochlear implant users

Anderson ES Nelson DA Kreft H Nelson PB Oxenham AJ 《The Journal of the Acoustical Society of America》2011,130(1):364-375

Spectral ripple discrimination thresholds were measured in 15 cochlear-implant users with broadband (350-5600 Hz) and octave-band noise stimuli. The results were compared with spatial tuning curve (STC) bandwidths previously obtained from the same subjects. Spatial tuning curve bandwidths did not correlate significantly with broadband spectral ripple discrimination thresholds but did correlate significantly with ripple discrimination thresholds when the rippled noise was confined to an octave-wide passband, centered on the STC's probe electrode frequency allocation. Ripple discrimination thresholds were also measured for octave-band stimuli in four contiguous octaves, with center frequencies from 500 Hz to 4000 Hz. Substantial variations in thresholds with center frequency were found in individuals, but no general trends of increasing or decreasing resolution from apex to base were observed in the pooled data. Neither ripple nor STC measures correlated consistently with speech measures in noise and quiet in the sample of subjects in this study. Overall, the results suggest that spectral ripple discrimination measures provide a reasonable measure of spectral resolution that correlates well with more direct, but more time-consuming, measures of spectral resolution, but that such measures do not always provide a clear and robust predictor of performance in speech perception tasks. 相似文献

16.

Forward-masked spatial tuning curves in cochlear implant users

Nelson DA Donaldson GS Kreft H 《The Journal of the Acoustical Society of America》2008,123(3):1522-1543

Forward-masked psychophysical spatial tuning curves (fmSTCs) were measured in twelve cochlear-implant subjects, six using bipolar stimulation (Nucleus devices) and six using monopolar stimulation (Clarion devices). fmSTCs were measured at several probe levels on a middle electrode using a fixed-level probe stimulus and variable-level maskers. The average fmSTC slopes obtained in subjects using bipolar stimulation (3.7 dBmm) were approximately three times steeper than average slopes obtained in subjects using monopolar stimulation (1.2 dBmm). Average spatial bandwidths were about half as wide for subjects with bipolar stimulation (2.6 mm) than for subjects with monopolar stimulation (4.6 mm). None of the tuning curve characteristics changed significantly with probe level. fmSTCs replotted in terms of acoustic frequency, using Greenwood's [J. Acoust. Soc. Am. 33, 1344-1356 (1961)] frequency-to-place equation, were compared with forward-masked psychophysical tuning curves obtained previously from normal-hearing and hearing-impaired acoustic listeners. The average tuning characteristics of fmSTCs in electric hearing were similar to the broad tuning observed in normal-hearing and hearing-impaired acoustic listeners at high stimulus levels. This suggests that spatial tuning is not the primary factor limiting speech perception in many cochlear implant users. 相似文献

17.

Effects of intermodulation distortion on the intelligibility of speech

S.G. Beristain 《Journal of sound and vibration》1977,54(3):453-454

相似文献

18.

Effects of the salience of pitch and periodicity information on the intelligibility of four-channel vocoded speech: implications for cochlear implants

Faulkner A Rosen S Smith C 《The Journal of the Acoustical Society of America》2000,108(4):1877-1887

Recent simulations of continuous interleaved sampling (CIS) cochlear implant speech processors have used acoustic stimulation that provides only weak cues to pitch, periodicity, and aperiodicity, although these are regarded as important perceptual factors of speech. Four-channel vocoders simulating CIS processors have been constructed, in which the salience of speech-derived periodicity and pitch information was manipulated. The highest salience of pitch and periodicity was provided by an explicit encoding, using a pulse carrier following fundamental frequency for voiced speech, and a noise carrier during voiceless speech. Other processors included noise-excited vocoders with envelope cutoff frequencies of 32 and 400 Hz. The use of a pulse carrier following fundamental frequency gave substantially higher performance in identification of frequency glides than did vocoders using envelope-modulated noise carriers. The perception of consonant voicing information was improved by processors that preserved periodicity, and connected discourse tracking rates were slightly faster with noise carriers modulated by envelopes with a cutoff frequency of 400 Hz compared to 32 Hz. However, consonant and vowel identification, sentence intelligibility, and connected discourse tracking rates were generally similar through all of the processors. For these speech tasks, pitch and periodicity beyond the weak information available from 400 Hz envelope-modulated noise did not contribute substantially to performance. 相似文献

19.

A concept for a research tool for experiments with cochlear implant users

Geurts L Wouters J 《The Journal of the Acoustical Society of America》2000,108(6):2949-2956

APEX, an acronym for computer Application for Psycho-Electrical eXperiments, is a user friendly tool used to conduct psychophysical experiments and to investigate new speech coding algorithms with cochlear implant users. Most common psychophysical experiments can be easily programmed and all stimuli can be easily created without any knowledge of computer programing. The pulsatile stimuli are composed off-line using custom-made MATLAB (Registered trademark of The Mathworks, Inc., http://www.mathworks.com) functions and are stored on hard disk or CD ROM. These functions convert either a speech signal into a pulse sequence or generate any sequence of pulses based on the parameters specified by the experimenter. The APEX personal computer (PC) software reads a text file which specifies the experiment and the stimuli, controls the experiment, delivers the stimuli to the subject through a digital signal processor (DSP) board, collects the responses via a computer mouse or a graphics tablet, and writes the results to the same file. At present, the APEX system is implemented for the LAURA (Registered trademark of Philips Hearing Implants) cochlear implant. However, the concept-and many parts of the system-is portable to any other device. Also, psycho-acoustical experiments can be conducted by presenting the stimuli acoustically through a sound card. 相似文献

20.

Sensitivity to binaural timing in bilateral cochlear implant users 总被引：2，自引：0，他引：2

van Hoesel RJ 《The Journal of the Acoustical Society of America》2007,121(4):2192-2206

Various measures of binaural timing sensitivity were made in three bilateral cochlear implant users, who had demonstrated moderate-to-good interaural time delay (ITD) sensitivity at 100 pulses-per-second (pps). Overall, ITD thresholds increased at higher pulse rates, lower levels, and shorter durations, although intersubject differences were evident. Monaural rate-discrimination thresholds, using the same stimulation parameters, showed more substantial elevation than ITDs with increased rate. ITD sensitivity with 6000 pps stimuli, amplitude-modulated at 100 Hz, was similar to that with unmodulated pulse trains at 100 pps, but at 200 and 300 Hz performance was poorer than with unmodulated signals. Measures of sensitivity to binaural beats with unmodulated pulse-trains showed that all three subjects could use time-varying ITD cues at 100 pps, but not 300 pps, even though static ITD sensitivity was relatively unaffected over that range. The difference between static and dynamic ITD thresholds is discussed in terms of relative contributions from initial and later arriving cues, which was further examined in an experiment using two-pulse stimuli as a function of interpulse separation. In agreement with the binaural-beat data, findings from that experiment showed poor discrimination of ITDs on the second pulse when the interval between pulses was reduced to a few milliseconds. 相似文献