Similar Literature (20 results found)
1.
2.
This study investigated the effect of pulsatile stimulation rate on medial vowel and consonant recognition in cochlear implant listeners. Experiment 1 measured phoneme recognition as a function of stimulation rate in six Nucleus-22 cochlear implant listeners using an experimental four-channel continuous interleaved sampler (CIS) speech processing strategy. Results showed that all stimulation rates from 150 to 500 pulses/s/electrode produced equally good performance, while stimulation rates lower than 150 pulses/s/electrode produced significantly poorer performance. Experiment 2 measured phoneme recognition by implant listeners and normal-hearing listeners as a function of the low-pass cutoff frequency for envelope information. Results from both acoustic and electric hearing showed no significant difference in performance for all cutoff frequencies higher than 20 Hz. Both vowel and consonant scores dropped significantly when the cutoff frequency was reduced from 20 Hz to 2 Hz. The results of these two experiments suggest that temporal envelope information can be conveyed by relatively low stimulation rates. The pattern of results for both electrical and acoustic hearing is consistent with a simple model of temporal integration with an equivalent rectangular duration (ERD) of the temporal integrator of about 7 ms.
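A minimal sketch of the Experiment 2 manipulation: extract channel envelopes and lowpass-filter them at a chosen cutoff (e.g., 2 vs. 20 Hz). Filter types, orders, and band edges here are illustrative assumptions, not the study's exact processor settings.

```python
import numpy as np
from scipy.signal import butter, sosfilt, hilbert

def cis_envelopes(signal, fs, band_edges, env_cutoff_hz):
    """Split `signal` into analysis bands and lowpass-filter the
    envelope of each band at `env_cutoff_hz` (e.g., 2 or 20 Hz)."""
    envelopes = []
    sos_env = butter(2, env_cutoff_hz, btype="lowpass", fs=fs, output="sos")
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band = sosfilt(sos, signal)
        env = np.abs(hilbert(band))              # instantaneous envelope
        envelopes.append(sosfilt(sos_env, env))  # temporal detail above cutoff removed
    return np.array(envelopes)

# Example: four log-spaced channels, 20-Hz envelope cutoff
fs = 16000
t = np.arange(fs) / fs
speech = np.random.randn(fs) * (1 + np.sin(2 * np.pi * 4 * t))  # stand-in signal
edges = np.geomspace(300, 6000, 5)
envs = cis_envelopes(speech, fs, edges, env_cutoff_hz=20)
```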

3.
Spectral resolution has been reported to be closely related to vowel and consonant recognition in cochlear implant (CI) listeners. One measure of spectral resolution is spectral modulation threshold (SMT), which is defined as the smallest detectable spectral contrast in the spectral ripple stimulus. SMT may be determined by the activation pattern associated with electrical stimulation. In the present study, broad activation patterns were simulated using a multi-band vocoder to determine if similar impairments in speech understanding scores could be produced in normal-hearing listeners. Tokens were first decomposed into 15 logarithmically spaced bands and then re-synthesized by multiplying the envelope of each band by matched filtered noise. Various amounts of current spread were simulated by adjusting the drop-off of the noise spectrum away from the peak (40 to 5 dB/octave). The average SMT (0.25 and 0.5 cycles/octave) increased from 6.3 to 22.5 dB, while average vowel identification scores dropped from 86% to 19% and consonant identification scores dropped from 93% to 59%. In each condition, the impairments in speech understanding were generally similar to those found in CI listeners with similar SMTs, suggesting that variability in spread of neural activation largely accounts for the variability in speech perception of CI listeners.
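The current-spread simulation can be sketched as spectrally shaping each band's noise carrier so that it falls off at a chosen slope away from the band centre. The FFT-domain shaping below is one plausible implementation, not the paper's code; the full 15-band decomposition is omitted for brevity.

```python
import numpy as np

def sloped_noise(fs, n, f_peak, slope_db_per_oct, rng):
    """White noise spectrally shaped to drop `slope_db_per_oct` per
    octave on either side of `f_peak`."""
    spec = np.fft.rfft(rng.standard_normal(n))
    f = np.fft.rfftfreq(n, 1 / fs)
    octaves = np.abs(np.log2(np.maximum(f, 1.0) / f_peak))
    gain_db = -slope_db_per_oct * octaves
    spec *= 10 ** (gain_db / 20)
    return np.fft.irfft(spec, n)

rng = np.random.default_rng(0)
carrier_broad = sloped_noise(16000, 16000, f_peak=1000, slope_db_per_oct=5, rng=rng)   # broad spread
carrier_sharp = sloped_noise(16000, 16000, f_peak=1000, slope_db_per_oct=40, rng=rng)  # sharp spread
# In the full simulation, 15 such carriers (one per log-spaced band)
# would each be multiplied by that band's speech envelope and summed.
```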

4.
On the role of spectral transition for speech perception
This paper examines the relationship between dynamic spectral features and the identification of Japanese syllables modified by initial and/or final truncation. The experiments confirm several main points. "Perceptual critical points," where the percent correct identification of the truncated syllable as a function of the truncation position changes abruptly, are related to maximum spectral transition positions. A speech wave of approximately 10 ms in duration that includes the maximum spectral transition position bears the most important information for consonant and syllable perception. Consonant and vowel identification scores simultaneously change as a function of the truncation position in the short period, including the 10-ms period for final truncation. This suggests that crucial information for both vowel and consonant identification is contained across the same initial part of each syllable. The spectral transition is more crucial than unvoiced and buzz bar periods for consonant (syllable) perception, although the latter features are of some perceptual importance. Also, vowel nuclei are not necessary for either vowel or syllable perception.
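A hedged sketch of how a "maximum spectral transition position" might be located: frame the waveform, compute a log-magnitude spectrum per frame, and take the frame-to-frame spectral difference. The 20-ms window, 10-ms hop, and Euclidean distance are assumptions for illustration, not the paper's exact analysis.

```python
import numpy as np

def max_spectral_transition(x, fs, frame=0.020, hop=0.010):
    n, h = int(frame * fs), int(hop * fs)
    frames = [x[i:i + n] * np.hanning(n) for i in range(0, len(x) - n, h)]
    logspec = np.log10(np.abs(np.fft.rfft(frames, axis=1)) + 1e-10)
    diff = np.linalg.norm(np.diff(logspec, axis=0), axis=1)  # transition size per hop
    return (np.argmax(diff) + 1) * h                         # sample index of max transition

fs = 16000
syllable = np.random.randn(fs // 2)          # stand-in for a recorded syllable
k = max_spectral_transition(syllable, fs)
segment = syllable[max(0, k - int(0.005 * fs)):k + int(0.005 * fs)]  # ~10 ms around it
```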

5.
Nonlinear sensory and neural processing mechanisms have been exploited to enhance spectral contrast for improvement of speech understanding in noise. The "companding" algorithm employs both two-tone suppression and adaptive gain mechanisms to achieve spectral enhancement. This study implemented a 50-channel companding strategy and evaluated its efficiency as a front-end noise suppression technique in cochlear implants. The key parameters were identified and evaluated to optimize the companding performance. Both normal-hearing (NH) listeners and cochlear-implant (CI) users performed phoneme and sentence recognition tests in quiet and in steady-state speech-shaped noise. Data from the NH listeners showed that for noise conditions, the implemented strategy improved vowel perception but not consonant and sentence perception. However, the CI users showed significant improvements in both phoneme and sentence perception in noise. Maximum average improvement for vowel recognition was 21.3 percentage points (p<0.05) at 0 dB signal-to-noise ratio (SNR), followed by 17.7 percentage points (p<0.05) at 5 dB SNR for sentence recognition and 12.1 percentage points (p<0.05) at 5 dB SNR for consonant recognition. While the observed results could be attributed to the enhanced spectral contrast, it is likely that the corresponding temporal changes caused by companding also played a significant role and should be addressed by future studies.
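A single channel of a companding architecture can be sketched as compression driven by a broad filter's envelope followed by re-expansion driven by a narrower filter's envelope, which emphasises spectrally dominant channels. The bandwidth ratios and exponent n below are illustrative assumptions; the study's 50-channel implementation and parameter choices differ.

```python
import numpy as np
from scipy.signal import butter, sosfilt, hilbert

def _bandpass(x, fs, fc, bw):
    sos = butter(2, [fc - bw / 2, fc + bw / 2], btype="bandpass", fs=fs, output="sos")
    return sosfilt(sos, x)

def compand_channel(x, fs, fc, n=0.3):
    broad = _bandpass(x, fs, fc, bw=fc / 2)        # broad analysis filter
    e1 = np.abs(hilbert(broad)) + 1e-8
    y1 = broad * e1 ** (n - 1)                     # compress: envelope -> e1**n
    narrow = _bandpass(y1, fs, fc, bw=fc / 8)      # narrower filter between stages
    e2 = np.abs(hilbert(narrow)) + 1e-8
    return narrow * e2 ** (1.0 / n - 1)            # expand: envelope -> e2**(1/n)

# A 50-channel system would sum compand_channel outputs over 50 centre
# frequencies; this example processes one channel of a stand-in signal.
out = compand_channel(np.random.randn(16000), 16000, fc=1000.0)
```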

6.
Cochlear implants provide users with limited spectral and temporal information. In this study, the amount of spectral and temporal information was systematically varied through simulations of cochlear implant processors using a noise-excited vocoder. Spectral information was controlled by varying the number of channels between 1 and 16, and temporal information was controlled by varying the lowpass cutoff frequencies of the envelope extractors from 1 to 512 Hz. Consonants and vowels processed using those conditions were presented to seven normal-hearing native-English-speaking listeners for identification. The results demonstrated that both spectral and temporal cues were important for consonant and vowel recognition with the spectral cues having a greater effect than the temporal cues for the ranges of numbers of channels and lowpass cutoff frequencies tested. The lowpass cutoff for asymptotic performance in consonant and vowel recognition was 16 and 4 Hz, respectively. The number of channels at which performance plateaued for consonants and vowels was 8 and 12, respectively. Within the above-mentioned ranges of lowpass cutoff frequency and number of channels, the temporal and spectral cues showed a tradeoff for phoneme recognition. Information transfer analyses showed different relative contributions of spectral and temporal cues in the perception of various phonetic/acoustic features.  相似文献   
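A compact noise-excited vocoder parameterised by the study's two variables, number of channels and envelope lowpass cutoff, might look like the following; the filter designs and the 100-6000 Hz analysis range are assumptions.

```python
import numpy as np
from scipy.signal import butter, sosfilt, hilbert

def noise_vocoder(x, fs, n_channels, env_cutoff_hz, f_lo=100.0, f_hi=6000.0):
    edges = np.geomspace(f_lo, f_hi, n_channels + 1)
    sos_env = butter(2, env_cutoff_hz, btype="lowpass", fs=fs, output="sos")
    rng = np.random.default_rng(0)
    out = np.zeros_like(x)
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(3, [lo, hi], btype="bandpass", fs=fs, output="sos")
        env = sosfilt(sos_env, np.abs(hilbert(sosfilt(sos, x))))   # smoothed envelope
        carrier = sosfilt(sos, rng.standard_normal(len(x)))        # band-limited noise
        out += np.maximum(env, 0.0) * carrier
    return out

fs = 16000
vocoded = noise_vocoder(np.random.randn(fs), fs, n_channels=8, env_cutoff_hz=16)
```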

7.
Spectral peak resolution was investigated in normal hearing (NH), hearing impaired (HI), and cochlear implant (CI) listeners. The task involved discriminating between two rippled noise stimuli in which the frequency positions of the log-spaced peaks and valleys were interchanged. The ripple spacing was varied adaptively from 0.13 to 11.31 ripples/octave, and the minimum ripple spacing at which a reversal in peak and trough positions could be detected was determined as the spectral peak resolution threshold for each listener. Spectral peak resolution was best, on average, in NH listeners, poorest in CI listeners, and intermediate for HI listeners. There was a significant relationship between spectral peak resolution and both vowel and consonant recognition in quiet across the three listener groups. The results indicate that the degree of spectral peak resolution required for accurate vowel and consonant recognition in quiet backgrounds is around 4 ripples/octave, and that spectral peak resolution poorer than around 1-2 ripples/octave may result in highly degraded speech recognition. These results suggest that efforts to improve spectral peak resolution for HI and CI users may lead to improved speech recognition.
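One way to generate such a ripple pair is to impose a sinusoidal (in log frequency) gain on a noise spectrum and shift the ripple phase by pi for the inverted stimulus; the bandwidth, ripple depth, and FFT synthesis below are assumptions.

```python
import numpy as np

def ripple_noise(fs, n, ripples_per_oct, depth_db=30.0, phase=0.0,
                 f_lo=100.0, f_hi=5000.0, seed=0):
    rng = np.random.default_rng(seed)
    spec = np.fft.rfft(rng.standard_normal(n))
    f = np.fft.rfftfreq(n, 1 / fs)
    band = (f >= f_lo) & (f <= f_hi)
    octs = np.log2(f[band] / f_lo)
    ripple_db = (depth_db / 2) * np.sin(2 * np.pi * ripples_per_oct * octs + phase)
    gains = np.zeros_like(f)
    gains[band] = 10 ** (ripple_db / 20)
    return np.fft.irfft(spec * gains, n)

standard = ripple_noise(16000, 16000, ripples_per_oct=2.0, phase=0.0)
inverted = ripple_noise(16000, 16000, ripples_per_oct=2.0, phase=np.pi)  # peaks <-> valleys
```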

8.
Several studies have demonstrated that when talkers are instructed to speak clearly, the resulting speech is significantly more intelligible than speech produced in ordinary conversation. These speech intelligibility improvements are accompanied by a wide variety of acoustic changes. The current study explored the relationship between acoustic properties of vowels and their identification in clear and conversational speech, for young normal-hearing (YNH) and elderly hearing-impaired (EHI) listeners. Monosyllabic words excised from sentences spoken either clearly or conversationally by a male talker were presented in 12-talker babble for vowel identification. While vowel intelligibility was significantly higher in clear speech than in conversational speech for the YNH listeners, no clear speech advantage was found for the EHI group. Regression analyses were used to assess the relative importance of spectral target, dynamic formant movement, and duration information for perception of individual vowels. For both listener groups, all three types of information emerged as primary cues to vowel identity. However, the relative importance of the three cues for individual vowels differed greatly for the YNH and EHI listeners. This suggests that hearing loss alters the way acoustic cues are used for identifying vowels.

9.
The hypothesis was investigated that selectively increasing the discrimination of low-frequency information (below 2600 Hz) by altering the frequency-to-electrode allocation would improve speech perception by cochlear implantees. Two experimental conditions were compared, both utilizing ten electrode positions selected based on maximal discrimination. A fixed frequency range (200-10513 Hz) was allocated either relatively evenly across the ten electrodes, or so that nine of the ten positions were allocated to the frequencies up to 2600 Hz. Two additional conditions utilizing all available electrode positions (15-18 electrodes) were assessed: one with each subject's usual frequency-to-electrode allocation; and the other using the same analysis filters as the other experimental conditions. Seven users of the Nucleus CI22 implant wore processors mapped with each experimental condition for 2-week periods away from the laboratory, followed by assessment of perception of words in quiet and sentences in noise. Performance with both ten-electrode maps was significantly poorer than with both full-electrode maps on at least one measure. Performance with the map allocating nine out of ten electrodes to low frequencies was equivalent to that with the full-electrode maps for vowel perception and sentences in noise, but was worse for consonant perception. Performance with the evenly allocated ten-electrode map was equivalent to that with the full-electrode maps for consonant perception, but worse for vowel perception and sentences in noise. Comparison of the two full-electrode maps showed that subjects could fully adapt to frequency shifts up to ratio changes of 1.3, given 2 weeks' experience. Future research is needed to investigate whether speech perception may be improved by the manipulation of frequency-to-electrode allocation in maps which have a full complement of electrodes in Nucleus implants.
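The two ten-electrode allocations can be sketched as band-edge tables over the fixed 200-10513 Hz range; log-even spacing within each region is an assumption about how the bands were distributed.

```python
import numpy as np

f_lo, f_hi, f_split = 200.0, 10513.0, 2600.0

# Allocation A: ten bands spread log-evenly across the full range
even_edges = np.geomspace(f_lo, f_hi, 11)

# Allocation B: nine bands below 2600 Hz, one band for the remainder
low_edges = np.geomspace(f_lo, f_split, 10)
low_weighted_edges = np.append(low_edges, f_hi)

print(np.round(even_edges))
print(np.round(low_weighted_edges))
```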

10.
Synthesis (carrier) signals in acoustic models embody assumptions about perception of auditory electric stimulation. This study compared speech intelligibility of consonants and vowels processed through a set of nine acoustic models that used Spectral Peak (SPEAK) and Advanced Combination Encoder (ACE)-like speech processing, using synthesis signals which were representative of signals used previously in acoustic models as well as two new ones. Performance of the synthesis signals was determined in terms of correspondence with cochlear implant (CI) listener results for 12 attributes of phoneme perception (consonant and vowel recognition; F1, F2, and duration information transmission for vowels; voicing, manner, place of articulation, affrication, burst, nasality, and amplitude envelope information transmission for consonants) using four measures of performance. Modulated synthesis signals produced the best correspondence with CI consonant intelligibility, while sinusoids, narrow noise bands, and varying noise bands produced the best correspondence with CI vowel intelligibility. The signals that performed best overall (in terms of correspondence with both vowel and consonant attributes) were modulated and unmodulated noise bands of varying bandwidth that corresponded to a linearly varying excitation width of 0.4 mm at the apical to 8 mm at the basal channels.
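The mm-to-Hz conversion implied by "excitation width" is commonly done with the Greenwood map (human constants A=165.4, a=0.06/mm, k=0.88 over a 35-mm cochlea); applying it to the 0.4-8 mm widths here is this sketch's assumption, not a detail stated in the abstract.

```python
import numpy as np

A, a, k, COCHLEA_MM = 165.4, 0.06, 0.88, 35.0

def greenwood_hz(x_mm):
    """Characteristic frequency at `x_mm` from the apex."""
    return A * (10 ** (a * x_mm) - k)

def band_for_width(center_mm, width_mm):
    """Noise-band edges (Hz) for an excitation `width_mm` wide."""
    return greenwood_hz(center_mm - width_mm / 2), greenwood_hz(center_mm + width_mm / 2)

print(band_for_width(10.0, 0.4))   # narrow, apical-style band
print(band_for_width(25.0, 8.0))   # broad, basal-style band
```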

11.
The addition of low-passed (LP) speech or even a tone following the fundamental frequency (F0) of speech has been shown to benefit speech recognition for cochlear implant (CI) users with residual acoustic hearing. The mechanisms underlying this benefit are still unclear. In this study, eight bimodal subjects (CI users with acoustic hearing in the non-implanted ear) and eight simulated bimodal subjects (using vocoded and LP speech) were tested on vowel and consonant recognition to determine the relative contributions of acoustic and phonetic cues, including F0, to the bimodal benefit. Several listening conditions were tested (CI/Vocoder, LP, T(F0-env), CI/Vocoder + LP, CI/Vocoder + T(F0-env)). Compared with CI/Vocoder performance, LP significantly enhanced both consonant and vowel perception, whereas a tone following the F0 contour of target speech and modulated with an amplitude envelope of the maximum frequency of the F0 contour (T(F0-env)) enhanced only consonant perception. Information transfer analysis revealed a dual mechanism in the bimodal benefit: The tone representing F0 provided voicing and manner information, whereas LP provided additional manner, place, and vowel formant information. The data in actual bimodal subjects also showed that the degree of the bimodal benefit depended on the cutoff and slope of residual acoustic hearing.
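The T(F0-env) signal can be sketched as a tone that follows an F0 contour via phase accumulation and is then amplitude-modulated; the F0 and envelope arrays below are synthetic placeholders, whereas the study derived them from the target speech.

```python
import numpy as np

def f0_tone(f0_hz, amp_env, fs):
    """Tone tracking `f0_hz` (per-sample contour, Hz; 0 = unvoiced),
    amplitude-modulated by `amp_env` (per-sample, 0..1)."""
    phase = 2 * np.pi * np.cumsum(f0_hz) / fs       # integrate frequency -> phase
    tone = np.sin(phase) * (f0_hz > 0)              # silence unvoiced samples
    return tone * amp_env

fs = 16000
t = np.arange(fs) / fs
f0 = 120 + 20 * np.sin(2 * np.pi * 2 * t)           # illustrative F0 contour
env = 0.5 * (1 + np.sin(2 * np.pi * 4 * t))         # illustrative amplitude envelope
signal = f0_tone(f0, env, fs)
```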

12.
Speech recognition was measured as a function of spectral resolution (number of spectral channels) and speech-to-noise ratio in normal-hearing (NH) and cochlear-implant (CI) listeners. Vowel, consonant, word, and sentence recognition were measured in five normal-hearing listeners, ten listeners with the Nucleus-22 cochlear implant, and nine listeners with the Advanced Bionics Clarion cochlear implant. Recognition was measured as a function of the number of spectral channels (noise bands or electrodes) at signal-to-noise ratios of +15, +10, +5, and 0 dB, and in quiet. Performance with three different speech processing strategies (SPEAK, CIS, and SAS) was similar across all conditions, and improved as the number of electrodes increased (up to seven or eight) for all conditions. For all noise levels, vowel and consonant recognition with the SPEAK speech processor did not improve with more than seven electrodes, while for normal-hearing listeners, performance continued to increase up to at least 20 channels. Speech recognition on more difficult speech materials (word and sentence recognition) showed a marginally significant increase in Nucleus-22 listeners from seven to ten electrodes. The average implant score on all processing strategies was poorer than scores of NH listeners with similar processing. However, the best CI scores were similar to the normal-hearing scores for that condition (up to seven channels). CI listeners with the highest performance level increased in performance as the number of electrodes increased up to seven, while CI listeners with low levels of speech recognition did not increase in performance as the number of electrodes was increased beyond four. These results quantify the effect of number of spectral channels on speech recognition in noise and demonstrate that most CI subjects are not able to fully utilize the spectral information provided by the number of electrodes used in their implant.
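Setting a speech-to-noise ratio of the kind used here reduces to scaling the noise relative to the speech RMS before mixing; a minimal sketch (nothing in it is specific to the study beyond the SNR values themselves):

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Mix `noise` into `speech` so their RMS levels differ by `snr_db`."""
    rms = lambda x: np.sqrt(np.mean(x ** 2))
    noise = noise[:len(speech)] * (rms(speech) / rms(noise[:len(speech)]))
    return speech + noise * 10 ** (-snr_db / 20)

rng = np.random.default_rng(1)
speech, noise = rng.standard_normal(16000), rng.standard_normal(16000)
for snr in (15, 10, 5, 0):
    mixed = mix_at_snr(speech, noise, snr)
```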

13.
The present study evaluated auditory-visual speech perception in cochlear-implant users as well as normal-hearing and simulated-implant controls to delineate relative contributions of sensory experience and cues. Auditory-only, visual-only, or auditory-visual speech perception was examined in the context of categorical perception, in which an animated face mouthing ba, da, or ga was paired with synthesized phonemes from an 11-token auditory continuum. A three-alternative, forced-choice method was used to yield percent identification scores. Normal-hearing listeners showed sharp phoneme boundaries and strong reliance on the auditory cue, whereas actual and simulated implant listeners showed much weaker categorical perception but stronger dependence on the visual cue. The implant users were able to integrate both congruent and incongruent acoustic and optical cues to derive relatively weak but significant auditory-visual integration. This auditory-visual integration was correlated with the duration of the implant experience but not the duration of deafness. Compared with the actual implant performance, acoustic simulations of the cochlear implant could predict the auditory-only performance but not the auditory-visual integration. These results suggest that both altered sensory experience and impoverished acoustic cues contribute to the auditory-visual speech perception in cochlear-implant users.

14.
The multidimensional phoneme identification model is applied to consonant confusion matrices obtained from 28 postlingually deafened cochlear implant users. This model predicts consonant matrices based on these subjects' ability to discriminate a set of postulated spectral, temporal, and amplitude speech cues as presented to them by their device. The model produced confusion matrices that matched many aspects of individual subjects' consonant matrices, including information transfer for the voicing, manner, and place features, despite individual differences in age at implantation, implant experience, device and stimulation strategy used, as well as overall consonant identification level. The model was able to match the general pattern of errors between consonants, but not the full complexity of all consonant errors made by each individual. The present study represents an important first step in developing a model that can be used to test specific hypotheses about the mechanisms cochlear implant users employ to understand speech.

15.
Three studies are reported on the speech perception of normally hearing and hearing-impaired adults using combinations of visual, auditory, and tactile input. In study 1, mean scores for four normally hearing subjects showed that addition of tactile information, provided through the multichannel electrotactile speech processor, to either audition alone (300-Hz low-pass-filtered speech) or lipreading plus audition resulted in significant improvements in phoneme and word discrimination scores. Information transmission analyses demonstrated the effectiveness of the tactile aid in providing cues to duration, F1 and F2 features for vowels, and manner of articulation features for consonants, especially features requiring detection and discrimination of high-frequency information. In study 2, six different cutoff frequencies were used for a low-pass-filtered auditory signal. Mean scores for vowel and consonant identification were significantly higher with the addition of tactile input to audition alone at each cutoff frequency up to 1500 Hz. The mean speech-tracking rate was also significantly increased by the additional tactile input up to 1500 Hz. Study 3 examined speech discrimination of three hearing-impaired adults. Additional information available through the tactile aid was shown to improve speech discrimination scores; however, the degree of increase was inversely related to the level of residual hearing. Results indicate that the electrotactile aid may be useful for patients with little residual hearing and for the severely to profoundly hearing impaired, who could benefit from the high-frequency information presented through the tactile modality, but unavailable through hearing aids.

16.
This study investigated the effect of five speech processing parameters, currently employed in cochlear implant processors, on speech understanding. Experiment 1 examined speech recognition as a function of stimulation rate in six Med-El/CIS-Link cochlear implant listeners. Results showed that higher stimulation rates (2100 pulses/s) produced a significantly higher performance on word and consonant recognition than lower stimulation rates (<800 pulses/s). The effect of stimulation rate on consonant recognition was highly dependent on the vowel context. The largest benefit was noted for consonants in the /uCu/ and /iCi/ contexts, while the smallest benefit was noted for consonants in the /aCa/ context. This finding suggests that the /aCa/ consonant test, which is widely used today, is not sensitive enough to parametric variations of implant processors. Experiment 2 examined vowel and consonant recognition as a function of pulse width for low-rate (400 and 800 pps) implementations of the CIS strategy. For the 400-pps condition, wider pulse widths (208 μs/phase) produced significantly higher performance on consonant recognition than shorter pulse widths (40 μs/phase). Experiments 3-5 examined vowel and consonant recognition as a function of the filter overlap in the analysis filters, shape of the amplitude mapping function, and signal bandwidth. Results showed that the amount of filter overlap (ranging from -20 to -60 dB/oct) and the signal bandwidth (ranging from 6.7 to 9.9 kHz) had no effect on phoneme recognition. The shape of the amplitude mapping functions (ranging from strongly compressive to weakly compressive) had only a minor effect on performance, with the lowest performance obtained for nearly linear mapping functions. Of the five speech processing parameters examined in this study, the pulse rate and the pulse width had the largest (positive) effect on speech recognition. For a fixed pulse width, higher rates (2100 pps) of stimulation provided a significantly better performance on word recognition than lower rates (<800 pps) of stimulation. High performance was also achieved by jointly varying the pulse rate and pulse width. The above results indicate that audiologists can optimize the implant listener's performance either by increasing the pulse rate or by jointly varying the pulse rate and pulse width.
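The "strongly compressive to weakly compressive" amplitude mapping family can be illustrated with the logarithmic map often used in CIS fitting, where a constant c sets the shape (large c = strongly compressive, c near 0 = nearly linear). Treating this exact map as the study's implementation is an assumption, and the THR/MCL values are placeholders.

```python
import numpy as np

def amplitude_map(env, thr, mcl, c=500.0):
    """Map a 0..1 acoustic envelope onto [thr, mcl] current units."""
    env = np.clip(env, 0.0, 1.0)
    compressed = np.log(1 + c * env) / np.log(1 + c)   # shape of the mapping
    return thr + (mcl - thr) * compressed

x = np.linspace(0, 1, 5)
print(amplitude_map(x, thr=100, mcl=200, c=500.0))  # strongly compressive
print(amplitude_map(x, thr=100, mcl=200, c=0.01))   # nearly linear
```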

17.
Although some cochlear implant (CI) listeners can show good word recognition accuracy, it is not clear how they perceive and use the various acoustic cues that contribute to phonetic perceptions. In this study, the use of acoustic cues was assessed for normal-hearing (NH) listeners in optimal and spectrally degraded conditions, and also for CI listeners. Two experiments tested the tense/lax vowel contrast (varying in formant structure, vowel-inherent spectral change, and vowel duration) and the word-final fricative voicing contrast (varying in F1 transition, vowel duration, consonant duration, and consonant voicing). Identification results were modeled using mixed-effects logistic regression. These experiments suggested that under spectrally-degraded conditions, NH listeners decrease their use of formant cues and increase their use of durational cues. Compared to NH listeners, CI listeners showed decreased use of spectral cues like formant structure and formant change and consonant voicing, and showed greater use of durational cues (especially for the fricative contrast). The results suggest that although NH and CI listeners may show similar accuracy on basic tests of word, phoneme or feature recognition, they may be using different perceptual strategies in the process.
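A simplified version of the cue-weighting analysis: simulate identification responses driven by two cues, fit a logistic regression, and read the coefficients as relative cue use. The study's mixed-effects structure (per-listener random effects) is omitted in this sketch, and all data below are synthetic.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
n = 500
spectral = rng.standard_normal(n)      # e.g., normalized formant cue
duration = rng.standard_normal(n)      # e.g., normalized vowel duration

# Simulated listener who relies mostly on the spectral cue
p = 1 / (1 + np.exp(-(2.0 * spectral + 0.5 * duration)))
response = rng.random(n) < p

model = LogisticRegression().fit(np.column_stack([spectral, duration]), response)
print(model.coef_)   # larger coefficient = heavier reliance on that cue
```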

18.
The purpose of this study was to determine the role of static, dynamic, and integrated cues for perception in three adult age groups, and to determine whether age has an effect on both consonant and vowel perception, as predicted by the "age-related deficit hypothesis." Eight adult subjects in each of the age ranges of young (ages 20-26), middle aged (ages 52-59), and old (ages 70-76) listened to synthesized syllables composed of combinations of [b d g] and [i u a]. The synthesis parameters included manipulations of the following stimulus variables: formant transition (moving or straight), noise burst (present or absent), and voicing duration (10, 30, or 46 ms). Vowel perception was high across all conditions and there were no significant differences among age groups. Consonant identification showed a definite effect of age. Young and middle-aged adults were significantly better than older adults at identifying consonants from secondary cues only. Older adults relied on the integration of static and dynamic cues to a greater extent than younger and middle-aged listeners for identification of place of articulation of stop consonants. Duration facilitated correct stop-consonant identification in the young and middle-aged groups for the no-burst conditions, but not in the old group. These findings for the duration of stop-consonant transitions indicate reductions in processing speed with age. In general, the results did not support the age-related deficit hypothesis for adult identification of vowels and consonants from dynamic spectral cues.

19.
There is limited documentation available on how sensorineurally hearing-impaired listeners use the various sources of phonemic information that are known to be distributed across time in the speech waveform. In this investigation, a group of normally hearing listeners and a group of sensorineurally hearing-impaired listeners (with and without the benefit of amplification) identified various consonant and vowel productions that had been systematically varied in duration. The consonants (presented in a /haCa/ environment) and the vowels (presented in a /bVd/ environment) were truncated in steps to eliminate various segments from the end of the stimulus. The results indicated that normally hearing listeners could extract more phonemic information, especially cues to consonant place, from the earlier occurring portions of the stimulus waveforms than could the hearing-impaired listeners. The use of amplification partially decreased the performance differences between the normally hearing listeners and the unaided hearing-impaired listeners. The results are relevant to current models of normal speech perception that emphasize the need for the listener to make phonemic identifications as quickly as possible.

20.
Two related studies investigated the relationship between place-pitch sensitivity and consonant recognition in cochlear implant listeners using the Nucleus MPEAK and SPEAK speech processing strategies. Average place-pitch sensitivity across the electrode array was evaluated as a function of electrode separation, using a psychophysical electrode pitch-ranking task. Consonant recognition was assessed by analyzing error matrices obtained with a standard consonant confusion procedure to obtain relative transmitted information (RTI) measures for three features: stimulus (RTI stim), envelope (RTI env[plc]), and place-of-articulation (RTI plc[env]). The first experiment evaluated consonant recognition performance with MPEAK and SPEAK in the same subjects. Subjects were experienced users of the MPEAK strategy who used the SPEAK strategy on a daily basis for one month and were tested with both processors. It was hypothesized that subjects with good place-pitch sensitivity would demonstrate better consonant place-cue perception with SPEAK than with MPEAK, by virtue of their ability to make use of SPEAK's enhanced representation of spectral speech cues. Surprisingly, all but one subject demonstrated poor consonant place-cue performance with both MPEAK and SPEAK even though most subjects demonstrated good or excellent place-pitch sensitivity. Consistent with this, no systematic relationship between place-pitch sensitivity and consonant place-cue performance was observed. Subjects' poor place-cue perception with SPEAK was subsequently attributed to the relatively short period of experience that they were given with the SPEAK strategy. The second study reexamined the relationship between place-pitch sensitivity and consonant recognition in a group of experienced SPEAK users. For these subjects, a positive relationship was observed between place-pitch sensitivity and consonant place-cue performance, supporting the hypothesis that good place-pitch sensitivity facilitates subjects' use of spectral cues to consonant identity. A strong, linear relationship was also observed between measures of envelope- and place-cue extraction, with place-cue performance increasing as a constant proportion (approximately 0.8) of envelope-cue performance. To the extent that the envelope-cue measure reflects subjects' abilities to resolve amplitude fluctuations in the speech envelope, this finding suggests that both envelope- and place-cue perception depend strongly on subjects' envelope-processing abilities. Related to this, the data suggest that good place-cue perception depends both on envelope-processing abilities and place-pitch sensitivity, and that either factor may limit place-cue perception in a given cochlear implant listener. Data from both experiments indicate that subjects with small electric dynamic ranges (< 8 dB for 125-Hz, 205-μs/phase pulse trains) are more likely to demonstrate poor electrode pitch-ranking skills and poor consonant recognition performance than subjects with larger electric dynamic ranges.
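The RTI measures can be sketched as Miller-and-Nicely transmitted information normalised by stimulus entropy; collapsing the confusion matrix by a feature (voicing, manner, place) before calling this yields per-feature values. The toy matrix below is illustrative.

```python
import numpy as np

def relative_transmitted_info(confusions):
    """Mutual information between stimulus and response, divided by
    stimulus entropy, from a (stimulus x response) count matrix."""
    p = confusions / confusions.sum()              # joint p(stimulus, response)
    px, py = p.sum(axis=1), p.sum(axis=0)
    mask = p > 0
    mi = np.sum(p[mask] * np.log2(p[mask] / np.outer(px, py)[mask]))
    hx = -np.sum(px[px > 0] * np.log2(px[px > 0]))
    return mi / hx

conf = np.array([[18, 2], [5, 15]], dtype=float)   # toy voiced/voiceless matrix
print(relative_transmitted_info(conf))
```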
