Similar literature
1.
The effect of the filter bank on fundamental frequency (F0) discrimination was examined in four Nucleus CI24 cochlear implant subjects for synthetic stylized vowel-like stimuli. The four tested filter banks differed in cutoff frequencies, amount of overlap between filters, and shape of the filters. To assess the effects of temporal pitch cues on F0 discrimination, temporal fluctuations were removed above 10 Hz in one condition and above 200 Hz in another. Results indicate that F0 discrimination based upon place pitch cues is possible, but just-noticeable differences exceed 1 octave or more depending on the filter bank used. Increasing the frequency resolution in the F0 range improves the F0 discrimination based upon place pitch cues. The results of F0 discrimination based upon place pitch agree with a model that compares the centroids of the electrical excitation pattern. The addition of temporal fluctuations up to 200 Hz significantly improves F0 discrimination. Just-noticeable differences using both place and temporal pitch cues range from 6% to 60%. Filter banks that do not resolve the higher harmonics provided the best temporal pitch cues, because temporal pitch cues are clearest when the fluctuation on all channels is at F0 and preferably in phase.
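
The centroid comparison mentioned in this abstract can be illustrated with a minimal sketch. The function name, the list-of-levels representation, and the two example patterns below are assumptions made for illustration; they are not taken from the study, and the actual model may weight or scale the excitation pattern differently.

```python
def excitation_centroid(levels):
    """Return the level-weighted mean electrode index of an excitation pattern.

    `levels` holds one non-negative stimulation level per electrode,
    indexed from 0. This representation is hypothetical.
    """
    total = sum(levels)
    if total <= 0:
        raise ValueError("excitation pattern must contain some positive level")
    return sum(index * level for index, level in enumerate(levels)) / total


# Two made-up patterns: the second is the first moved along the array by one electrode.
pattern_a = [0.0, 1.0, 2.0, 1.0, 0.0]
pattern_b = [0.0, 0.0, 1.0, 2.0, 1.0]

# A centroid-based comparison would treat this difference as the place-pitch cue.
centroid_shift = excitation_centroid(pattern_b) - excitation_centroid(pattern_a)
```

Under this reading, two stimuli with different F0s would be discriminable by place pitch only to the extent that their centroids differ.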

2.
Pitch ranking of sung vowel stimuli, separated in fundamental frequency (F0) by half an octave, was measured with a group of eleven Nucleus 24 cochlear implant recipients using different sound coding strategies. In three consecutive studies, either two or three different sound coding strategies were compared to the Advanced Combinational Encoder (ACE) strategy. These strategies included Continuous Interleaved Sampling (CIS), Peak Derived Timing (PDT), Modulation Depth Enhancement (MDE), F0 Synchronized ACE (F0Sync), and Multi-channel Envelope Modulation (MEM), the last four being experimental strategies. While pitch ranking results on average were poor compared to those expected for most normal hearing listeners, significantly higher scores were obtained using the MEM, MDE, and F0Sync strategies compared to ACE. These strategies enhanced coding of temporal F0 cues by providing deeper modulation cues to F0 coincidentally in time across all activated electrodes. In the final study, speech recognition tests were also conducted using ACE, CIS, MDE, and MEM. Similar results among all strategies were obtained for word tests in quiet and between ACE and MEM for sentence tests in noise. These findings demonstrate that strategies such as MEM may aid perception of pitch and still adequately code segmental speech features as per existing coding strategies.

3.
Temporal models of pitch and harmonic segregation call for delays of up to 30 ms to cover the full range of existence of musical pitch. To date there is little anatomical or physiological evidence for delays that long. We propose a mechanism by which delays may be synthesized from cross-channel phase interaction. Phases of adjacent cochlear filter channels are shifted by an amount proportional to frequency and then combined as a weighted sum to approximate a delay. Synthetic delays may be used by pitch perception models such as autocorrelation, segregation models such as harmonic cancellation, and binaural processing models to explain sensitivity to large interaural delays. The maximum duration of synthetic delays is limited by the duration of the impulse responses of cochlear filters, itself inversely proportional to cochlear filter bandwidth. Maximum delay is thus frequency dependent. This may explain the fact, puzzling for temporal pitch models such as autocorrelation, that pitch is more salient and easy to discriminate for complex tones that contain resolved partials.
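
The idea that a phase shift proportional to frequency acts as a delay rests on a standard identity: delaying a sinusoid of frequency f by τ is the same as subtracting 2πfτ from its phase. The sketch below only checks that identity on a sum of cosines. It does not implement the cross-channel weighted sum proposed in the abstract, and the function names and example values are illustrative assumptions.

```python
import math


def sum_of_cosines(freqs_hz, amps, t):
    """Evaluate a sum of cosine components at time t (seconds)."""
    return sum(a * math.cos(2 * math.pi * f * t) for f, a in zip(freqs_hz, amps))


def delayed_by_phase(freqs_hz, amps, delay_s, t):
    """Evaluate the same signal after shifting each component's phase by -2*pi*f*delay."""
    return sum(
        a * math.cos(2 * math.pi * f * t - 2 * math.pi * f * delay_s)
        for f, a in zip(freqs_hz, amps)
    )


# Hypothetical three-component signal and a 30 ms delay (the upper figure quoted above).
freqs = [200.0, 400.0, 600.0]
amps = [1.0, 0.5, 0.25]
delay = 0.030

# The phase-shifted signal at t = 50 ms matches the original at t = 20 ms.
shifted_value = delayed_by_phase(freqs, amps, delay, 0.050)
direct_value = sum_of_cosines(freqs, amps, 0.020)
```

Because the required phase shift grows with frequency, a fixed phase shift applied within one narrow channel can only approximate a delay over that channel's bandwidth, which is consistent with the bandwidth limit described in the abstract.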

4.
Four multiple-channel cochlear implant patients were tested with synthesized versions of the words "hid, head, had, hud, hod, hood" containing 1, 2, or 3 formants, and with a natural 2-formant version of the same words. The formant frequencies were encoded in terms of the positions of electrical stimulation in the cochlea. Loudness, duration, and fundamental frequency were kept fixed within the synthetic stimulus sets. The average recognition scores were 47%, 61%, 62%, and 79% for the synthesized 1-, 2-, and 3-formant vowels and the natural vowels, respectively. These scores showed that the place coding of the first and second formant frequencies accounted for a large part of the vowel recognition of cochlear implant patients using these coding schemes. The recognition of the natural stimuli was significantly higher than recognition of the synthetic stimuli, indicating that extra cues such as loudness, duration, and fundamental frequency contributed to recognition of the spoken words.

5.
Vowels are mainly classified by the positions of peaks in their frequency spectra, the formants. For normal-hearing subjects, change detection and direction discrimination were measured for linear glides in the center frequency (CF) of formantlike sounds. A CF rove was used to prevent subjects from using either the start or end points of the glides as cues. In addition, change detection and starting-phase (start-direction) discrimination were measured for similar stimuli with a sinusoidal 5-Hz formant-frequency modulation. The stimuli consisted of single formants generated using a number of different stimulus parameters including fundamental frequency, spectral slope, frequency region, and position of the formant relative to the harmonic spectrum. The change detection thresholds were in good agreement with the predictions of a model which analyzed and combined the effects of place-of-excitation and temporal cues. For most stimuli, thresholds were approximately equal for change detection and start-direction discrimination. Exceptions were found for stimuli that consisted of only one or two harmonics. In a separate experiment, it was shown that change detection and start-direction discrimination of linear and sinusoidal formant-frequency modulations were impaired by off-frequency frequency-modulated interferers. This frequency modulation detection interference was larger for formants with shallow than for those with steep spectral slopes.

6.
Previous studies have demonstrated that normal-hearing listeners can understand speech using the recovered "temporal envelopes," i.e., amplitude modulation (AM) cues from frequency modulation (FM). This study evaluated this mechanism in cochlear implant (CI) users for consonant identification. Stimuli containing only FM cues were created using 1-, 2-, 4-, and 8-band FM-vocoders to determine if consonant identification performance would improve as the recovered AM cues become more available. A consistent improvement was observed as the band number decreased from 8 to 1, supporting the hypothesis that (1) the CI sound processor generates recovered AM cues from broadband FM, and (2) CI users can use the recovered AM cues to recognize speech. The correlation between the intact and the recovered AM components at the output of the sound processor was also generally higher when the band number was low, supporting the consonant identification results. Moreover, CI subjects who were better at using recovered AM cues from broadband FM cues showed better identification performance with intact (unprocessed) speech stimuli. This suggests that speech perception performance variability in CI users may be partly caused by differences in their ability to use AM cues recovered from FM speech cues.

7.
8.
These experiments measure the ability to detect a change in the relative phase of a single component in a harmonic complex tone. Complex tones containing the first 20 harmonics of 50, 100, or 200 Hz, all at equal amplitude, were used. All of the harmonics except one started in cosine phase. The remaining harmonic started in cosine phase, but was shifted in phase half-way through either the first or the second of the two stimuli comprising a trial. The subject had to identify the stimulus containing the phase-shifted component. For normally hearing subjects tested at a level of 70 dB SPL per component, thresholds for detecting the phase shift [i.e., phase difference limens (DLs)] were smallest (2 degrees-4 degrees) for harmonics above the eighth and for the lowest fundamental frequency (F0). Changes in phase were not detectable for harmonic numbers below three or four at the lowest F0 and below 5-13 at the highest F0. The DLs increased slightly for the highest harmonics in the complexes. The DLs increased markedly with decreasing level, except for the highest harmonic, where only a small effect of level was found. Subjects reported that the phase-shifted harmonic appeared to "pop out" and was heard with a pure-tone quality. A pitch-matching experiment demonstrated that the pitch of this tone corresponded to the frequency of the phase-shifted component. For the highest harmonic, the phase shift was associated with a downward shift of the edge pitch heard in the reference (all cosine phase) stimulus. When the phases of the components in the reference stimulus were randomized, phase DLs were much higher (and often impossible to measure), the pop-out phenomenon was not observed, and no edge pitch was heard. Subjects with unilateral cochlear hearing impairment generally showed poorer phase sensitivity in their impaired than in their normal ears, when the two ears were compared at equal sound-pressure levels. However, at comparable sensation levels, the impaired ears sometimes showed lower phase DLs. The results are explained by considering the waveforms that would occur at the outputs of the auditory filters in response to these stimuli.
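
The stimulus described in this abstract (20 equal-amplitude harmonics in cosine phase, with one harmonic shifted in phase half-way through) can be sketched as follows. The sample rate, duration, function name, and the choice of an abrupt rather than smoothed phase change are assumptions for illustration; the abstract does not specify them.

```python
import math


def complex_tone(f0_hz, shifted_harmonic, shift_deg, duration_s,
                 fs_hz=16000, n_harmonics=20):
    """Sum of equal-amplitude cosine-phase harmonics of f0_hz.

    From the midpoint onward, the harmonic numbered `shifted_harmonic`
    has `shift_deg` degrees added to its phase. All defaults are hypothetical.
    """
    n_samples = int(round(duration_s * fs_hz))
    midpoint = n_samples // 2
    shift_rad = math.radians(shift_deg)
    samples = []
    for n in range(n_samples):
        t = n / fs_hz
        value = 0.0
        for h in range(1, n_harmonics + 1):
            # Only the target harmonic changes phase, and only in the second half.
            phase = shift_rad if (h == shifted_harmonic and n >= midpoint) else 0.0
            value += math.cos(2 * math.pi * h * f0_hz * t + phase)
        samples.append(value)
    return samples


# Reference (no shift) and test stimulus (10th harmonic shifted by 4 degrees) at F0 = 100 Hz.
reference = complex_tone(100.0, 10, 0.0, 0.02)
test_tone = complex_tone(100.0, 10, 4.0, 0.02)
```

The two waveforms are identical up to the midpoint and differ only slightly afterwards, which gives a sense of how small the reported 2 to 4 degree thresholds are.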

9.
The abilities to hear changes in pitch for sung vowels and understand speech using an experimental sound coding strategy (eTone) that enhanced coding of temporal fundamental frequency (F0) information were tested in six cochlear implant users, and compared with performance using their clinical (ACE) strategy. In addition, rate- and modulation rate-pitch difference limens (DLs) were measured using synthetic stimuli with F0s below 300 Hz to determine psychophysical abilities of each subject and to provide experience in attending to rate cues for the judgment of pitch. Sung-vowel pitch ranking tests for stimuli separated by three semitones presented across an F0 range of one octave (139-277 Hz) showed a significant benefit for the experimental strategy compared to ACE. Average d-prime (d') values for eTone (d' = 1.05) were approximately three times larger than for ACE (d' = 0.35). Similar scores for both strategies in the speech recognition tests showed that coding of segmental speech information by the experimental strategy was not degraded. Average F0 DLs were consistent with results from previous studies and for all subjects were less than or equal to approximately three semitones for F0s of 125 and 200 Hz.
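
The semitone and octave figures in this abstract follow from the equal-tempered relation of 12 semitones per doubling of frequency. A minimal sketch of that arithmetic is given below; the helper names are illustrative and not part of the study.

```python
import math


def semitones_between(f_low_hz, f_high_hz):
    """Interval between two frequencies in equal-tempered semitones."""
    return 12 * math.log2(f_high_hz / f_low_hz)


def shift_by_semitones(f_hz, semitones):
    """Frequency reached by moving `semitones` up (or down, if negative) from f_hz."""
    return f_hz * 2 ** (semitones / 12)


# The quoted F0 range of 139-277 Hz spans very nearly 12 semitones (one octave).
range_in_semitones = semitones_between(139.0, 277.0)

# A three-semitone step up from the bottom of that range.
three_up = shift_by_semitones(139.0, 3)
```

The ratio of the reported d' values, 1.05 / 0.35, is exactly 3, in line with the "approximately three times" statement.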

10.
Natural spoken language processing includes not only speech recognition but also identification of the speaker's gender, age, emotional, and social status. Our purpose in this study is to evaluate whether temporal cues are sufficient to support both speech and speaker recognition. Ten cochlear-implant and six normal-hearing subjects were presented with vowel tokens spoken by three men, three women, two boys, and two girls. In one condition, the subject was asked to recognize the vowel. In the other condition, the subject was asked to identify the speaker. Extensive training was provided for the speaker recognition task. Normal-hearing subjects achieved nearly perfect performance in both tasks. Cochlear-implant subjects achieved good performance in vowel recognition but poor performance in speaker recognition. The level of the cochlear implant performance was functionally equivalent to normal performance with eight spectral bands for vowel recognition but only to one band for speaker recognition. These results show a disassociation between speech and speaker recognition with primarily temporal cues, highlighting the limitation of current speech processing strategies in cochlear implants. Several methods, including explicit encoding of fundamental frequency and frequency modulation, are proposed to improve speaker recognition for current cochlear implant users.

11.
Speech perception in the presence of another competing voice is one of the most challenging tasks for cochlear implant users. Several studies have shown that (1) the fundamental frequency (F0) is a useful cue for segregating competing speech sounds and (2) the F0 is better represented by the temporal fine structure than by the temporal envelope. However, current cochlear implant speech processing algorithms emphasize temporal envelope information and discard the temporal fine structure. In this study, speech recognition was measured as a function of the F0 separation of the target and competing sentence in normal-hearing and cochlear implant listeners. For the normal-hearing listeners, the combined sentences were processed through either a standard implant simulation or a new algorithm which additionally extracts a slowed-down version of the temporal fine structure (called Frequency-Amplitude-Modulation-Encoding). The results showed no benefit of increasing F0 separation for the cochlear implant or simulation groups. In contrast, the new algorithm resulted in gradual improvements with increasing F0 separation, similar to that found with unprocessed sentences. These results emphasize the importance of temporal fine structure for speech perception and demonstrate a potential remedy for difficulty in the perceptual segregation of competing speech sounds.

12.
The temporal representation of speechlike stimuli in the auditory-nerve output of a guinea pig cochlea model is described. The model consists of a bank of dual resonance nonlinear filters that simulate the vibratory response of the basilar membrane followed by a model of the inner hair cell/auditory nerve complex. The model is evaluated by comparing its output with published physiological auditory nerve data in response to single and double vowels. The evaluation includes analyses of individual fibers, as well as ensemble responses over a wide range of best frequencies. In all cases the model response closely follows the patterns in the physiological data, particularly the tendency for the temporal firing pattern of each fiber to represent the frequency of a nearby formant of the speech sound. In the model this behavior is largely a consequence of filter shapes; nonlinear filtering has only a small contribution at low frequencies. The guinea pig cochlear model produces a useful simulation of the measured physiological response to simple speech sounds and is therefore suitable for use in more advanced applications including attempts to generalize these principles to the response of the human auditory system, both normal and impaired.

13.
A sound-coding strategy for users of cochlear implants, named enhanced-envelope-encoded tone (eTone), was developed to improve coding of fundamental frequency (F0) in the temporal envelopes of the electrical stimulus signals. It is based on the advanced combinational encoder (ACE) strategy and includes additional processing that explicitly applies F0 modulation to channel envelope signals that contain harmonics of prominent complex tones. Channels that contain only inharmonic signals retain envelopes normally produced by ACE. The strategy incorporates an F0 estimator to determine the frequency of modulation and a harmonic probability estimator to control the amount of modulation enhancement applied to each channel. The F0 estimator was designed to provide an accurate estimate of F0 with minimal processing lag and robustness to the effects of competing noise. Error rates for the F0 estimator and accuracy of the harmonic probability estimator were compared with previous approaches and outcomes demonstrated that the strategy operates effectively across a range of signals and conditions that are relevant to cochlear implant users.

14.
Two experiments investigated the ability of 17 school-aged children to process purely temporal and spectro-temporal cues that signal changes in pitch. Percentage correct was measured for the discrimination of sinusoidal amplitude modulation rate (AMR) of broadband noise in experiment 1 and for the discrimination of fundamental frequency (F0) of broadband sine-phase harmonic complexes in experiment 2. The reference AMR was 100 Hz as was the reference F0. A child-friendly interface helped listeners to remain attentive to the task. Data were fitted using a maximum-likelihood technique that extracted threshold, slope, and lapse rate. All thresholds were subsequently standardized to a common d' value equal to 0.77. There were relatively large individual differences across listeners: eight had relatively adult-like thresholds in both tasks and nine had higher thresholds. However, these individual differences did not vary systematically with age, over the span of 6-16 yr. Thresholds were correlated across the two tasks and were about nine times finer for F0 discrimination than for AMR discrimination as has been previously observed in adults.

15.
The present study systematically manipulated three acoustic cues--fundamental frequency (f0), amplitude envelope, and duration--to investigate their contributions to tonal contrasts in Mandarin. Simplified stimuli with all possible combinations of these three cues were presented for identification to eight normal-hearing listeners, all native speakers of Mandarin from Taiwan. The f0 information was conveyed either by an f0-controlled sawtooth carrier or a modulated noise so as to compare the performance achievable by a clear indication of voice f0 and what is possible with purely temporal coding of f0. Tone recognition performance with explicit f0 was much better than that with any combination of other acoustic cues (consistently greater than 90% correct compared to 33%-65%; chance is 25%). In the absence of explicit f0, the temporal coding of f0 and amplitude envelope both contributed somewhat to tone recognition, while duration had only a marginal effect. Performance based on these secondary cues varied greatly across listeners. These results explain the relatively poor perception of tone in cochlear implant users, given that cochlear implants currently provide only weak cues to f0, so that users must rely upon the purely temporal (and secondary) features for the perception of tone.

16.
The four experiments reported here measure listeners' accuracy and consistency in adjusting a formant frequency of one- or two-formant complex sounds to match the timbre of a target sound. By presenting the target and the adjustable sound on different fundamental frequencies, listeners are prevented from performing the task by comparing the absolute or relative levels of resolved spectral components. Experiment 1 uses two-formant vowellike sounds. When the two sounds have the same F0, the variability of matches (within-subject standard deviation) for either the first or the second formant is around 1%-3%, which is comparable to existing data on formant frequency discrimination thresholds. With a difference in F0, variability increases to around 8% for first-formant matches, but to only about 4% for second-formant matches. Experiment 2 uses sounds with a single formant at 1100 or 1200 Hz with both sounds on either low or high fundamental frequencies. The increase in variability produced by a difference in F0 is greater for high F0's (where the harmonics close to the formant peak are resolved) than it is for low F0's (where they are unresolved). Listeners also showed systematic errors in their mean matches to sounds with different high F0's. The direction of the systematic errors was towards the most intense harmonic. Experiments 3 and 4 showed that introduction of a vibratolike frequency modulation (FM) on F0 reduces the variability of matches, but does not reduce the systematic error. The experiments demonstrate, for the specific frequencies and FM used, that there is a perceptual cost to interpolating a spectral envelope across resolved harmonics.

17.
The ability to segregate two spectrally and temporally overlapping signals based on differences in temporal envelope structure and binaural cues was investigated. Signals were a harmonic tone complex (HTC) with 20 Hz fundamental frequency and a bandpass noise (BPN). Both signals had interaural differences of the same absolute value, but with opposite signs to establish lateralization to different sides of the medial plane, such that their combination yielded two different spatial configurations. As an indication for segregation ability, threshold interaural time and level differences were measured for discrimination between these spatial configurations. Discrimination based on interaural level differences was good, although absolute thresholds depended on signal bandwidth and center frequency. Discrimination based on interaural time differences required the signals' temporal envelope structures to be sufficiently different. Long-term interaural cross-correlation patterns or long-term averaged patterns after equalization-cancellation of the combined signals did not provide information for the discrimination. The binaural system must, therefore, have been capable of processing changes in interaural time differences within the period of the harmonic tone complex, suggesting that monaural information from the temporal envelopes influences the use of binaural information in the perceptual organization of signal components.

18.
The ability of normally hearing and hearing-impaired subjects to use temporal fine structure information in complex tones was measured. Subjects were required to discriminate a harmonic complex tone from a tone in which all components were shifted upwards by the same amount in Hz, in a three-alternative, forced-choice task. The tones either contained five equal-amplitude components (non-shaped stimuli) or contained many components, but were passed through a fixed bandpass filter to reduce excitation pattern changes (shaped stimuli). Components were centered at nominal harmonic numbers (N) 7, 11, and 18. For the shaped stimuli, hearing-impaired subjects performed much more poorly than normally hearing subjects, with most of the former scoring no better than chance when N=11 or 18, suggesting that they could not access the temporal fine structure information. Performance for the hearing-impaired subjects was significantly improved for the non-shaped stimuli, presumably because they could benefit from spectral cues. It is proposed that normal-hearing subjects can use temporal fine structure information provided the spacing between fine structure peaks is not too small relative to the envelope period, but subjects with moderate cochlear hearing loss make little use of temporal fine structure information for unresolved components.
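
The harmonic and frequency-shifted complexes compared in this abstract can be sketched by listing their component frequencies. The F0 of 100 Hz, the 25 Hz shift, and the function name are hypothetical values chosen for illustration; the abstract gives only the number of components (five, for the non-shaped stimuli) and the nominal centre harmonics.

```python
def component_frequencies(f0_hz, center_harmonic, shift_hz=0.0, n_components=5):
    """Frequencies of consecutive components centred on a nominal harmonic number.

    With shift_hz == 0 the components are exact harmonics of f0_hz; a non-zero
    shift moves every component up by the same number of Hz.
    """
    first = center_harmonic - n_components // 2
    return [(first + k) * f0_hz + shift_hz for k in range(n_components)]


# Hypothetical example: F0 = 100 Hz, centred on N = 11, shifted by 25 Hz.
harmonic_tone = component_frequencies(100.0, 11)
shifted_tone = component_frequencies(100.0, 11, shift_hz=25.0)

# The spacing between neighbouring components is unchanged by the shift.
harmonic_spacing = [b - a for a, b in zip(harmonic_tone, harmonic_tone[1:])]
shifted_spacing = [b - a for a, b in zip(shifted_tone, shifted_tone[1:])]
```

Because the component spacing, which sets the envelope repetition rate, is identical in both tones, the envelope offers no cue; the shift changes only the temporal fine structure and, for non-shaped stimuli, the excitation pattern.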

19.
Gap detection as a measure of electrode interaction in cochlear implants.
Gap detection thresholds were measured as an indication of the amount of interaction between electrodes in a cochlear implant. The hypothesis in this study was as follows: when the two stimuli that bound the gap stimulate the same electrode, and thus the same neural population, the gap detection threshold will be short. As two stimuli are presented to two electrodes that are more widely separated, the amount of neural overlap of the two stimuli decreases, the stimuli sound more dissimilar, and the gap thresholds increase. Gap detection thresholds can thus be used to infer the amount of overlap in neural populations stimulated by two electrodes. Three users of the Nucleus cochlear implant participated in this study. Gap detection thresholds were measured as a function of the distance between the two electrode pairs and as a function of the spacing between the two electrodes of a bipolar pair (i.e., using different modes of stimulation). The results indicate that measuring gap detection thresholds may provide an estimate of the amount of electrode interaction. Gap detection thresholds were a function of the physical separation of the electrode pairs used for the two stimuli that bound the gap. Lower gap thresholds were observed when the two electrode pairs were closely spaced, and gap thresholds increased as the separation increased, resulting in a "psychophysical tuning curve" as a function of electrode separation. The sharpness of tuning varied across subjects, and for the three subjects in this study, the tuning was generally sharper for the subjects with better speech recognition. The data also indicate that increasing the separation between active and reference electrodes has limited effect on spatial selectivity (or tuning) as measured perceptually.

20.
This study investigated the integration of place- and temporal-pitch cues in pitch contour identification (PCI), in which cochlear implant (CI) users were asked to judge the overall pitch-change direction of stimuli. Falling and rising pitch contours were created either by continuously steering current between adjacent electrodes (place pitch), by continuously changing amplitude modulation (AM) frequency (temporal pitch), or both. The percentage of rising responses was recorded as a function of current steering or AM frequency change, with single or combined pitch cues. A significant correlation was found between subjects' sensitivity to current steering and AM frequency change. The integration of place- and temporal-pitch cues was most effective when the two cues were similarly discriminable in isolation. Adding the other (place or temporal) pitch cues shifted the temporal- or place-pitch psychometric functions horizontally without changing the slopes. PCI was significantly better with consistent place- and temporal-pitch cues than with inconsistent cues. PCI with single cues and integration of pitch cues were similar on different electrodes. The results suggest that CI users effectively integrate place- and temporal-pitch cues in relative pitch perception tasks. Current steering and AM frequency change should be coordinated to better transmit dynamic pitch information to CI users.
