Similar Articles
20 similar articles found.
1.
Standard continuous interleaved sampling processing and a modified processing strategy designed to enhance temporal cues to voice pitch were compared on tests of intonation perception and vowel perception, both in implant users and in acoustic simulations. In standard processing, 400 Hz low-pass envelopes modulated either pulse trains (implant users) or noise carriers (simulations). In the modified strategy, slow-rate envelope modulations, which convey the dynamic spectral variation crucial for speech understanding, were extracted by low-pass filtering (32 Hz). In addition, during voiced speech, higher-rate temporal modulation in each channel was provided by 100% amplitude modulation with a sawtooth-like waveform whose periodicity followed the fundamental frequency (F0) of the input. Channel levels were determined by the product of the lower- and higher-rate modulation components. Both in acoustic simulations and in implant users, the ability to use intonation information to identify sentences as questions or statements was significantly better with the modified processing. However, while there was no difference in vowel recognition in the acoustic simulation, implant users performed worse with the modified processing in both vowel recognition and formant frequency discrimination. It appears that, while enhancing pitch perception, the modified processing harmed the transmission of spectral information.
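
As a rough illustration of the modified channel processing described above, the sketch below combines a 32 Hz low-pass envelope with a sawtooth-like F0-rate modulator and takes their product as the channel level. The F0 track, voicing decisions, and filter order are assumptions made for illustration; they are not specified by the abstract.

    import numpy as np
    from scipy.signal import butter, filtfilt

    def modified_channel_envelope(band_signal, f0_track, voiced, fs):
        # Slow-rate component: rectify the band signal and low-pass at 32 Hz
        b, a = butter(2, 32 / (fs / 2))            # filter order is an assumption
        slow_env = filtfilt(b, a, np.abs(band_signal))

        # Higher-rate component: sawtooth-like waveform following the F0 track,
        # applied as 100% amplitude modulation only during voiced frames
        phase = 2 * np.pi * np.cumsum(f0_track) / fs
        saw = (phase % (2 * np.pi)) / (2 * np.pi)  # ramps 0..1 once per F0 period
        fast_mod = np.where(voiced, saw, 1.0)

        # Channel level is the product of the two modulation components
        return slow_env * fast_mod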

2.
Multichannel cochlear implant users vary greatly in their word-recognition abilities. This study examined whether their word recognition was related to the use of either highly dynamic or relatively steady-state vowel cues contained in /bVb/ and /wVb/ syllables. Nine conditions were created containing different combinations of formant transition, steady-state, and duration cues. Because processor strategies differ, the ability to perceive static and dynamic information may depend on the type of cochlear implant used. Ten Nucleus and ten Ineraid subjects participated, along with 12 normal-hearing control subjects. Vowel identification did not differ between the implanted groups, but both were significantly poorer at identifying vowels than the normal-hearing group. Vowel identification was best when at least two kinds of cues were available. When only one type of cue was available, performance was better with excised vowels containing steady-state formants than with "vowelless" syllables, in which the center vocalic portion was deleted and the transitions were joined. In the latter syllable type, Nucleus subjects identified vowels significantly better when /b/ was the initial consonant; the other two groups were not affected by the specific consonantal context. Cochlear implant subjects' word recognition was positively correlated with the use of dynamic vowel cues, but not with steady-state cues.

3.
Two experiments investigating the effects of auditory stimulation delivered via a Nucleus multichannel cochlear implant upon vowel production in adventitiously deafened adult speakers are reported. The first experiment contrasts vowel formant frequencies produced without auditory stimulation (implant processor OFF) to those produced with auditory stimulation (processor ON). Significant shifts in second formant frequencies were observed for intermediate vowels produced without auditory stimulation; however, no significant shifts were observed for the point vowels. Higher first formant frequencies occurred in five of eight vowels when the processor was turned ON versus OFF. A second experiment contrasted productions of the word "head" produced with a FULL map, OFF condition, and a SINGLE channel condition that restricted the amount of auditory information received by the subjects. This experiment revealed significant shifts in second formant frequencies between FULL map utterances and the other conditions. No significant differences in second formant frequencies were observed between SINGLE channel and OFF conditions. These data suggest auditory feedback information may be used to adjust the articulation of some speech sounds.

4.
This study investigated the integration of place- and temporal-pitch cues in pitch contour identification (PCI), in which cochlear implant (CI) users were asked to judge the overall pitch-change direction of stimuli. Falling and rising pitch contours were created by continuously steering current between adjacent electrodes (place pitch), by continuously changing amplitude modulation (AM) frequency (temporal pitch), or by both. The percentage of rising responses was recorded as a function of current steering or AM frequency change, with single or combined pitch cues. A significant correlation was found between subjects' sensitivity to current steering and to AM frequency change. The integration of place- and temporal-pitch cues was most effective when the two cues were similarly discriminable in isolation. Adding the other (place or temporal) pitch cue shifted the temporal- or place-pitch psychometric functions horizontally without changing their slopes. PCI was significantly better with consistent place- and temporal-pitch cues than with inconsistent cues. PCI with single cues and the integration of pitch cues were similar across electrodes. The results suggest that CI users effectively integrate place- and temporal-pitch cues in relative pitch perception tasks. Current steering and AM frequency change should be coordinated to better transmit dynamic pitch information to CI users.
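
For a concrete sense of the temporal-pitch manipulation described above, the sketch below builds an amplitude-modulation envelope whose rate sweeps between two values, one way to impose a rising or falling pitch contour on a fixed electrode. The sweep shape, rates, and 100% modulation depth are illustrative assumptions rather than the study's parameters.

    import numpy as np

    def am_rate_sweep_envelope(dur_s, fs, fm_start_hz, fm_end_hz):
        # Instantaneous AM rate sweeps linearly from fm_start_hz to fm_end_hz
        n = int(dur_s * fs)
        fm = np.linspace(fm_start_hz, fm_end_hz, n)
        phase = 2 * np.pi * np.cumsum(fm) / fs
        return 0.5 * (1 + np.sin(phase))           # 100% modulation depth, 0..1

    # Example: a rising temporal-pitch contour from 100 to 200 Hz over 500 ms
    envelope = am_rate_sweep_envelope(0.5, 16000, 100.0, 200.0)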

5.
The abilities to hear changes in pitch for sung vowels and to understand speech using an experimental sound coding strategy (eTone) that enhanced the coding of temporal fundamental frequency (F0) information were tested in six cochlear implant users and compared with performance using their clinical (ACE) strategy. In addition, rate- and modulation rate-pitch difference limens (DLs) were measured using synthetic stimuli with F0s below 300 Hz to determine the psychophysical abilities of each subject and to provide experience in attending to rate cues for the judgment of pitch. Sung-vowel pitch ranking tests for stimuli separated by three semitones, presented across an F0 range of one octave (139-277 Hz), showed a significant benefit for the experimental strategy compared to ACE. Average d-prime (d') values for eTone (d' = 1.05) were approximately three times larger than for ACE (d' = 0.35). Similar scores for both strategies in the speech recognition tests showed that coding of segmental speech information by the experimental strategy was not degraded. Average F0 DLs were consistent with results from previous studies and for all subjects were less than or equal to approximately three semitones for F0s of 125 and 200 Hz.
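
To relate the d' values quoted above to performance levels, the sketch below applies the standard unbiased two-interval conversion PC = Phi(d'/sqrt(2)); whether this exact model matches how the study derived d' from its ranking data is an assumption.

    from math import sqrt
    from scipy.stats import norm

    def percent_correct_from_dprime(d):
        # Unbiased two-interval (ranking) model: PC = Phi(d' / sqrt(2))
        return norm.cdf(d / sqrt(2))

    print(percent_correct_from_dprime(1.05))  # ~0.77 (eTone)
    print(percent_correct_from_dprime(0.35))  # ~0.60 (ACE)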

6.
Three experiments studied the effect of pulse rate on temporal pitch perception by cochlear implant users. Experiment 1 measured rate discrimination for pulse trains presented in bipolar mode to either an apical, middle, or basal electrode and for standard rates of 100 and 200 pps. In each block of trials the signals could have a level of -0.35, 0, or +0.35 dB re the standard, and performance for each signal level was recorded separately. Signal level affected performance for just over half of the combinations of subject, electrode, and standard rate studied. Performance was usually, but not always, better at the higher signal level. Experiment 2 showed that, for a given subject and condition, the direction of the effect was similar in monopolar and bipolar modes. Experiment 3 employed a pitch comparison procedure without feedback, and showed that the signal levels in experiment 1 that produced the best performance for a given subject and condition also led to the signal having a higher pitch. It is concluded that small level differences can have a robust and substantial effect on pitch judgments, and it is argued that these effects are not entirely due to response biases or to co-variation of place of excitation with level.

7.
The purpose of this study was to determine whether children give more perceptual weight than adults do to dynamic spectral cues versus static cues. Listeners were ten children between the ages of 3;8 and 4;1 (mean 3;11) and ten adults between the ages of 23;10 and 32;0 (mean 25;11). Three experimental stimulus conditions were presented, each containing stimuli of 30 ms duration. The first condition consisted of unchanging formant onset frequencies ranging in value from frequencies for [i] to those for [a], appropriate for a bilabial stop consonant context. The other two conditions consisted of either an [i] or [a] onset frequency with a 25 ms portion of a formant transition whose trajectory was toward one of a series of target frequencies ranging from those for [i] to those for [a]. Results indicated that the children attended to both the [a]- and [i]-onset formant frequency cues differently than the adults did when identifying the vowels. The adults gave more equal weight to the [i]-onset and [a]-onset dynamic cues, as reflected in category boundaries, than the children did. For the [i]-onset condition, children were less confident than adults in vowel perception, as reflected in slope analyses.

8.
Previous studies have demonstrated that normal-hearing listeners can understand speech using recovered "temporal envelopes," i.e., amplitude modulation (AM) cues recovered from frequency modulation (FM). This study evaluated this mechanism in cochlear implant (CI) users for consonant identification. Stimuli containing only FM cues were created using 1-, 2-, 4-, and 8-band FM vocoders to determine whether consonant identification performance would improve as the recovered AM cues become more available. A consistent improvement was observed as the band number decreased from 8 to 1, supporting the hypotheses that (1) the CI sound processor generates recovered AM cues from broadband FM, and (2) CI users can use the recovered AM cues to recognize speech. The correlation between the intact and the recovered AM components at the output of the sound processor was also generally higher when the band number was low, supporting the consonant identification results. Moreover, CI subjects who were better at using AM cues recovered from broadband FM showed better identification performance with intact (unprocessed) speech stimuli. This suggests that variability in speech perception performance among CI users may be partly caused by differences in their ability to use AM cues recovered from the FM cues in speech.
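
A minimal sketch of an FM-only vocoder of the kind described above: each band keeps its frequency modulation (the unwrapped analytic phase) but has its amplitude envelope flattened. The filter design and band edges are assumptions; the study's exact processing is not specified here.

    import numpy as np
    from scipy.signal import butter, sosfiltfilt, hilbert

    def fm_only_vocoder(x, fs, band_edges):
        # band_edges: list of (low_hz, high_hz) pairs, one per vocoder band
        out = np.zeros(len(x))
        for lo, hi in band_edges:
            sos = butter(4, [lo, hi], btype='bandpass', fs=fs, output='sos')
            band = sosfiltfilt(sos, x)
            phase = np.unwrap(np.angle(hilbert(band)))   # keeps the FM cue
            out += np.cos(phase)                          # flattened (unit) envelope
        return out / len(band_edges)

    # Example call with illustrative 4-band edges:
    # y = fm_only_vocoder(x, 16000, [(80, 500), (500, 1500), (1500, 3500), (3500, 7000)])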

9.
Vowel perception strategies were assessed for two "average" and one "star" single-channel 3M/House and three "average" and one "star" Nucleus 22-channel cochlear implant patients, and for six normal-hearing control subjects. All subjects were tested by computer with real and synthetic speech versions of /hVd/ syllables, presented randomly. Duration, fundamental frequency, and first, second, and third formant frequency cues to the vowels were systematically manipulated. Results showed high accuracy for the normal-hearing subjects in all conditions but that of the first formant alone. "Average" single-channel patients classified only real-speech [hVd] syllables differently from synthetic steady-state syllables. The "star" single-channel patient identified the vowels at much better than chance levels, with a pattern of results suggesting effective use of first-formant and duration information. Both "star" and "average" Nucleus users showed similar response patterns, performing better than chance in most conditions and identifying the vowels using duration and some frequency information from all three formants.

10.
The hypothesis was investigated that selectively increasing the discrimination of low-frequency information (below 2600 Hz) by altering the frequency-to-electrode allocation would improve speech perception by cochlear implantees. Two experimental conditions were compared, both utilizing ten electrode positions selected based on maximal discrimination. A fixed frequency range (200-10513 Hz) was allocated either relatively evenly across the ten electrodes, or so that nine of the ten positions were allocated to the frequencies up to 2600 Hz. Two additional conditions utilizing all available electrode positions (15-18 electrodes) were assessed: one with each subject's usual frequency-to-electrode allocation; and the other using the same analysis filters as the other experimental conditions. Seven users of the Nucleus CI22 implant wore processors mapped with each experimental condition for 2-week periods away from the laboratory, followed by assessment of perception of words in quiet and sentences in noise. Performance with both ten-electrode maps was significantly poorer than with both full-electrode maps on at least one measure. Performance with the map allocating nine out of ten electrodes to low frequencies was equivalent to that with the full-electrode maps for vowel perception and sentences in noise, but was worse for consonant perception. Performance with the evenly allocated ten-electrode map was equivalent to that with the full-electrode maps for consonant perception, but worse for vowel perception and sentences in noise. Comparison of the two full-electrode maps showed that subjects could fully adapt to frequency shifts up to ratio changes of 1.3, given 2 weeks' experience. Future research is needed to investigate whether speech perception may be improved by the manipulation of frequency-to-electrode allocation in maps which have a full complement of electrodes in Nucleus implants.

11.
The present study systematically manipulated three acoustic cues--fundamental frequency (f0), amplitude envelope, and duration--to investigate their contributions to tonal contrasts in Mandarin. Simplified stimuli with all possible combinations of these three cues were presented for identification to eight normal-hearing listeners, all native speakers of Mandarin from Taiwan. The f0 information was conveyed either by an f0-controlled sawtooth carrier or by a modulated noise, so as to compare the performance achievable with a clear indication of voice f0 against what is possible with purely temporal coding of f0. Tone recognition performance with explicit f0 was much better than that with any combination of the other acoustic cues (consistently greater than 90% correct, compared to 33%-65%; chance is 25%). In the absence of explicit f0, the temporal coding of f0 and the amplitude envelope both contributed somewhat to tone recognition, while duration had only a marginal effect. Performance based on these secondary cues varied greatly across listeners. These results explain the relatively poor perception of tone in cochlear implant users, given that cochlear implants currently provide only weak cues to f0, so that users must rely upon the purely temporal (and secondary) features for the perception of tone.
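
The sketch below shows the two carrier types described above for a given f0 contour: an f0-controlled sawtooth that conveys f0 explicitly, and a noise carrier amplitude-modulated at f0 so that only temporal cues remain. The sampling rate, contour values, and modulation depth are illustrative assumptions.

    import numpy as np

    def tone_carriers(f0_contour_hz, fs):
        # f0_contour_hz: per-sample fundamental frequency of the tone contour
        phase = 2 * np.pi * np.cumsum(f0_contour_hz) / fs
        sawtooth = (phase % (2 * np.pi)) / np.pi - 1.0          # explicit f0 carrier
        noise = np.random.randn(len(f0_contour_hz))
        am_noise = noise * 0.5 * (1 + np.sin(phase))            # purely temporal f0 cue
        return sawtooth, am_noise

    # Example: a rising (Tone 2-like) contour from 120 to 180 Hz over 400 ms
    fs = 16000
    f0 = np.linspace(120.0, 180.0, int(0.4 * fs))
    saw, am = tone_carriers(f0, fs)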

12.
Five bilateral cochlear implant users were tested for their localization abilities and speech understanding in noise, in both monaural and binaural listening conditions. They also participated in lateralization tasks to assess the impact of variations in interaural time delays (ITDs) and interaural level differences (ILDs) for electrical pulse trains under direct computer control. The localization task used pink noise bursts presented from an eight-loudspeaker array spanning an arc of approximately 108 degrees in front of the listeners at ear level (0-degree elevation). Subjects showed large benefits from bilateral device use compared to either side alone. Typical root-mean-square (rms) averaged errors across all eight loudspeakers in the array were about 10 degrees for bilateral device use and ranged from 20 degrees to 60 degrees using either ear alone. Speech reception thresholds (SRTs) were measured for sentences presented from directly in front of the listeners (0 degrees) in spectrally matched speech-weighted noise at either 0 degrees, +90 degrees, or -90 degrees for the four subjects out of the five tested who could perform the task. For noise to either side, bilateral device use showed a substantial benefit over unilateral device use when the noise was ipsilateral to the unilateral device. This was primarily because of monaural head-shadow effects, which resulted in robust SRT improvements (P<0.001) of about 4 to 5 dB when ipsilateral and contralateral noise positions were compared. The additional benefit of using both ears compared to the shadowed ear (i.e., binaural unmasking) was only 1 or 2 dB and less robust (P = 0.04). Results from the lateralization studies showed consistently good sensitivity to ILDs, better than the smallest level adjustment available in the implants (0.17 dB) for some subjects. Sensitivity to ITDs, on the other hand, was moderate, typically of the order of 100 microseconds. ITD sensitivity deteriorated rapidly when stimulation rates for unmodulated pulse trains increased above a few hundred Hz, but at 800 pps sensitivity was comparable to that for 50-pps pulse trains when a 50-Hz modulation was applied. In our opinion, these results clearly demonstrate that important benefits are available from bilateral implantation, both for localizing sounds (in quiet) and for listening in noise when signal and noise sources are spatially separated. The data do indicate, however, that the effects of interaural timing cues are weaker than those of interaural level cues and, according to our psychophysical findings, rely on the availability of low-rate information below a few hundred Hz.

13.
Japanese 5- to 13-year-olds who used cochlear implants (CIs) and a comparison group of normally hearing (NH) Japanese children were tested on their perception and production of speech prosody. For the perception task, they were required to judge whether semantically neutral utterances that were normalized for amplitude were spoken in a happy, sad, or angry manner. The performance of NH children was error-free. By contrast, child CI users performed well below ceiling but above chance levels on happy- and sad-sounding utterances, but not on angry-sounding utterances. For the production task, children were required to imitate stereotyped Japanese utterances expressing disappointment and surprise, as well as culturally typical representations of crow and cat sounds. NH 5- and 6-year-olds produced significantly poorer imitations than older hearing children, but age was unrelated to the imitation quality of child CI users. Overall, child CI users' imitations were significantly poorer than those of NH children, but they did not differ significantly from the imitations of the youngest NH group. Moreover, there was a robust correlation between the performance of child CI users on the perception and production tasks; this implies that difficulties with prosodic perception underlie their difficulties with prosodic imitation.

14.
Many studies have noted great variability in speech perception ability among postlingually deafened adults with cochlear implants. This study examined phoneme misperceptions for 30 cochlear implant listeners using either the Nucleus-22 or the Clarion version 1.2 device, to determine whether listeners with better overall speech perception differed qualitatively from poorer listeners in their perception of vowel and consonant features. In the first analysis, simple regressions were used to predict the mean percent-correct scores for consonants and vowels of the better group of listeners from those of the poorer group. A strong relationship between the two groups was found for consonant identification, and a weak, nonsignificant relationship was found for vowel identification. In the second analysis, it was found that less information was transmitted for consonant and vowel features to the poorer listeners than to the better listeners; however, the pattern of information transmission was similar across groups. Taken together, the results suggest that the performance difference between the two groups is primarily quantitative. The results underscore the importance of examining individuals' perception of individual phoneme features when attempting to relate speech perception to other predictor variables.

15.
Cochlear implant (CI) users in tone language environments report great difficulty in perceiving lexical tone. This study investigated the augmentation of simulated cochlear implant audio by visual (facial) speech information for tone. Native speakers of Mandarin and of Australian English were asked to discriminate between minimal pairs of Mandarin tones in five conditions: Auditory-Only, Auditory-Visual, CI-simulated Auditory-Only, CI-simulated Auditory-Visual, and Visual-Only (silent video). Discrimination in the CI-simulated audio conditions was poor compared with normal audio and varied by tone pair, with tone pairs having strong non-F0 cues discriminated most easily. The availability of visual speech information also improved discrimination in the CI-simulated audio conditions, particularly for tone pairs with strong durational cues. In the silent Visual-Only condition, both Mandarin and Australian English speakers discriminated tones above chance levels. Interestingly, tone-naïve listeners outperformed native listeners in the Visual-Only condition, suggesting firstly that visual speech information for tone is available, and may in fact be under-used by normal-hearing tone language perceivers, and secondly that the perception of such information may be language-general rather than the product of language-specific learning. This may find application in the development of methods to improve tone perception in CI users in tone language environments.

16.
This study compared how normal-hearing listeners (NH) and listeners with moderate to moderately severe cochlear hearing loss (HI) use and combine information within and across frequency regions in the perceptual separation of competing vowels with fundamental frequency differences (ΔF0) ranging from 0 to 9 semitones. Following the procedure of Culling and Darwin [J. Acoust. Soc. Am. 93, 3454-3467 (1993)], eight NH listeners and eight HI listeners identified competing vowels with either a consistent or an inconsistent harmonic structure. Vowels were amplified to assure audibility for HI listeners. The contribution of frequency region depended on the value of ΔF0 between the competing vowels. When ΔF0 was small, both groups of listeners effectively utilized ΔF0 cues in the low-frequency region. In contrast, HI listeners derived significantly less benefit than NH listeners from ΔF0 cues conveyed by the high-frequency region at small ΔF0s. At larger ΔF0s, both groups combined ΔF0 cues from the low and high formant-frequency regions. Cochlear impairment appears to negatively impact the ability to use F0 cues for within-formant grouping in the high-frequency region. However, cochlear loss does not appear to disrupt the ability to use within-formant F0 cues in the low-frequency region or to group F0 cues across formant regions.
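
For reference, the semitone separations above map onto frequency ratios via 2^(n/12); the sketch below shows the competing-vowel F0 implied by a given ΔF0. The 100 Hz base value is illustrative, not the study's stimulus set.

    def shifted_f0(base_f0_hz, delta_semitones):
        # A pitch difference of n semitones corresponds to a ratio of 2**(n/12)
        return base_f0_hz * 2 ** (delta_semitones / 12)

    print(shifted_f0(100.0, 9))   # ~168.2 Hz for a 9-semitone delta F0
    print(shifted_f0(100.0, 1))   # ~105.9 Hz for a 1-semitone delta F0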

17.
The goals of the present study were to measure acoustic temporal modulation transfer functions (TMTFs) in cochlear implant listeners and to examine the relationship between modulation detection and speech recognition abilities. The effects of automatic gain control, presentation level, and number of channels on modulation detection thresholds (MDTs) were examined using the listeners' clinical sound processors. The general form of the TMTF was low-pass, consistent with previous studies. The operation of automatic gain control had no effect on MDTs when the stimuli were presented at 65 dBA. MDTs depended neither on presentation level (ranging from 50 to 75 dBA) nor on the number of channels. Significant correlations were found between MDTs and speech recognition scores. The rates of decay of the TMTFs were predictive of speech recognition abilities. Spectral-ripple discrimination was evaluated to examine the relationship between temporal and spectral envelope sensitivities. No correlation was found between the two measures, and 56% of the variance in speech recognition was predicted jointly by the two tasks. The present study suggests that temporal modulation detection measured with the sound processor can serve as a useful measure of the ability of clinical sound processing strategies to deliver clinically pertinent temporal information.
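
As a sketch of the kind of stimulus behind an acoustic TMTF measurement, the code below generates sinusoidally amplitude-modulated noise with a modulation depth specified in dB (20*log10(m)); the MDT is the smallest depth at which the modulation is detectable. The carrier type, duration, and example values are illustrative assumptions, not the study's exact stimuli.

    import numpy as np

    def sam_noise(duration_s, fs, mod_rate_hz, depth_db):
        # depth_db is 20*log10(m); e.g. -20 dB corresponds to m = 0.1
        n = int(duration_s * fs)
        t = np.arange(n) / fs
        m = 10 ** (depth_db / 20)
        carrier = np.random.randn(n)
        return carrier * (1 + m * np.sin(2 * np.pi * mod_rate_hz * t))

    # Example probe: 8 Hz modulation at -15 dB depth, 500 ms of noise
    probe = sam_noise(0.5, 16000, 8.0, -15.0)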

18.
A sound-coding strategy for users of cochlear implants, named enhanced-envelope-encoded tone (eTone), was developed to improve the coding of fundamental frequency (F0) in the temporal envelopes of the electrical stimulus signals. It is based on the advanced combination encoder (ACE) strategy and includes additional processing that explicitly applies F0 modulation to channel envelope signals that contain harmonics of prominent complex tones. Channels that contain only inharmonic signals retain the envelopes normally produced by ACE. The strategy incorporates an F0 estimator to determine the frequency of modulation and a harmonic probability estimator to control the amount of modulation enhancement applied to each channel. The F0 estimator was designed to provide an accurate estimate of F0 with minimal processing lag and with robustness to the effects of competing noise. Error rates for the F0 estimator and the accuracy of the harmonic probability estimator were compared with previous approaches, and the outcomes demonstrated that the strategy operates effectively across a range of signals and conditions that are relevant to cochlear implant users.
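
The sketch below illustrates one plausible form of the enhancement stage described above (it is not the published implementation): each channel's ACE envelope is blended with an F0-rate modulated version in proportion to that channel's harmonic probability, so channels judged inharmonic keep their ACE envelopes. The modulator shape and blending rule are assumptions.

    import numpy as np

    def etone_enhance(ace_envelopes, f0_hz, harmonic_prob, fs):
        # ace_envelopes: (channels x samples) envelope array from the ACE strategy
        # harmonic_prob: per-channel weight in [0, 1] from a harmonic probability estimator
        n_ch, n = ace_envelopes.shape
        t = np.arange(n) / fs
        f0_mod = 0.5 * (1 + np.sin(2 * np.pi * f0_hz * t))   # assumed modulator shape
        enhanced = np.empty_like(ace_envelopes)
        for ch in range(n_ch):
            p = harmonic_prob[ch]
            # p = 0 leaves the ACE envelope unchanged; p = 1 applies full F0 modulation
            enhanced[ch] = ace_envelopes[ch] * ((1 - p) + p * f0_mod)
        return enhanced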

19.
This study investigates the effects of speaking condition and auditory feedback on vowel production by postlingually deafened adults. Thirteen cochlear implant users produced repetitions of nine American English vowels prior to implantation, and at one month and one year after implantation. There were three speaking conditions (clear, normal, and fast), and two feedback conditions after implantation (implant processor turned on and off). Ten normal-hearing controls were also recorded once. Vowel contrasts in the formant space (expressed in mels) were larger in the clear than in the fast condition, both for controls and for implant users at all three time samples. Implant users also produced differences in duration between clear and fast conditions that were in the range of those obtained from the controls. In agreement with prior work, the implant users had contrast values lower than did the controls. The implant users' contrasts were larger with hearing on than off and improved from one month to one year postimplant. Because the controls and implant users responded similarly to a change in speaking condition, it is inferred that auditory feedback, although demonstrably important for maintaining normative values of vowel contrasts, is not needed to maintain the distinctiveness of those contrasts in different speaking conditions.

20.
To provide a perceptual framework for the objective evaluation of durational rules in speech synthesis, two experiments were conducted to investigate the differences between vowel (V) onsets and V-offsets in their functions of marking the perceived temporal structure of speech. The first experiment measured the detectability of temporal modifications applied to four-mora (CVCVCVCV) Japanese words. In the V-onset condition, the inter-onset intervals of the vowels were uniformly changed (either expanded or reduced) while their inter-offset intervals were preserved. In the V-offset condition, this was reversed. These manipulations did not change the duration of the entire word. Each of the modified words was paired with its unmodified counterpart, and the pair was presented to listeners, who were asked to rate the difference between the paired words. The results show that there were no significant differences in the listeners' abilities to detect the temporal modification between the V-onset and V-offset conditions. In the second experiment, the listeners were asked to estimate the differences they perceived in speaking rate for the same stimulus set as in the first experiment. Interestingly, the results show a clear difference in the listeners' performance between the V-onset and V-offset conditions. Specifically, changing the V-onset intervals changed the perceived speaking rate, with a linear relation (r = -0.9), despite the fact that the duration of the entire word remained unchanged. In contrast, modifying the V-offset intervals produced no clear relation with the perceived speaking rate. The second experiment also showed that the listeners performed well in speaking-rate discrimination (change ratios of 3.5%-5%). These results are discussed in relation to the differences in the listeners' temporal processing range (local or global) between the two experiments.

