Similar literature
 Found 20 similar documents (search time: 31 ms)
1.
When listeners hear a target signal in the presence of competing sounds, they are quite good at extracting information at instances when the local signal-to-noise ratio of the target is most favorable. Previous research suggests that listeners can easily understand a periodically interrupted target when it is interleaved with noise. It is not clear if this ability extends to the case where an interrupted target is alternated with a speech masker rather than noise. This study examined speech intelligibility in the presence of noise or speech maskers, which were either continuous or interrupted at one of six rates between 4 and 128 Hz. Results indicated that with noise maskers, listeners performed significantly better with interrupted rather than continuous maskers. With speech maskers, however, performance was better in continuous rather than interrupted masker conditions. Presumably the listeners used continuity as a cue to distinguish the continuous masker from the interrupted target. Intelligibility in the interrupted masker condition was improved by introducing a pitch difference between the target and speech masker. These results highlight the role that target-masker differences in continuity and pitch play in the segregation of competing speech signals.
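The alternation of an interrupted target with an interrupted masker can be sketched as a square-wave gate. This is only an illustration: the abstract gives six rates between 4 and 128 Hz but not the individual values, so the octave spacing, the 50% duty cycle, and all function names below are assumptions.

```python
def interruption_rates(lowest_hz=4, count=6):
    """Six rates in octave steps from 4 Hz (assumed spacing, not stated in the abstract)."""
    return [lowest_hz * 2 ** k for k in range(count)]


def target_is_on(t_s, rate_hz, duty=0.5):
    """True while the target occupies the current gating cycle (assumed 50% duty cycle)."""
    phase = (t_s * rate_hz) % 1.0  # position within the current cycle, 0 <= phase < 1
    return phase < duty


def masker_is_on(t_s, rate_hz, duty=0.5):
    """An interrupted masker fills the gaps left by the target."""
    return not target_is_on(t_s, rate_hz, duty)
```

At a 4 Hz rate each cycle lasts 250 ms, so under these assumptions the target and masker swap every 125 ms.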

2.
Normal-hearing (NH) listeners maintain robust speech understanding in modulated noise by "glimpsing" portions of speech from a partially masked waveform, a phenomenon known as masking release (MR). Cochlear implant (CI) users, however, generally lack such resiliency. In previous studies, temporal masking of speech by noise occurred randomly, obscuring to what degree MR is attributable to the temporal overlap of speech and masker. In the present study, masker conditions were constructed to either promote (+MR) or suppress (-MR) masking release by controlling the degree of temporal overlap. Sentence recognition was measured in 14 CI subjects and 22 young-adult NH subjects. Normal-hearing subjects showed large amounts of masking release in the +MR condition and a marked difference between +MR and -MR conditions. In contrast, CI subjects demonstrated less effect of MR overall, and some displayed modulation interference as reflected by poorer performance in modulated maskers. These results suggest that the poor performance of typical CI users in noise might be accounted for by factors that extend beyond peripheral masking, such as reduced segmental boundaries between syllables or words. Encouragingly, the best CI users tested here could take advantage of masker fluctuations to better segregate the speech from the background.

3.
Although some cochlear implant (CI) listeners can show good word recognition accuracy, it is not clear how they perceive and use the various acoustic cues that contribute to phonetic perceptions. In this study, the use of acoustic cues was assessed for normal-hearing (NH) listeners in optimal and spectrally degraded conditions, and also for CI listeners. Two experiments tested the tense/lax vowel contrast (varying in formant structure, vowel-inherent spectral change, and vowel duration) and the word-final fricative voicing contrast (varying in F1 transition, vowel duration, consonant duration, and consonant voicing). Identification results were modeled using mixed-effects logistic regression. These experiments suggested that under spectrally degraded conditions, NH listeners decrease their use of formant cues and increase their use of durational cues. Compared to NH listeners, CI listeners showed decreased use of spectral cues such as formant structure, formant change, and consonant voicing, and showed greater use of durational cues (especially for the fricative contrast). The results suggest that although NH and CI listeners may show similar accuracy on basic tests of word, phoneme, or feature recognition, they may be using different perceptual strategies in the process.

4.
This study compared how normal-hearing listeners (NH) and listeners with moderate to moderately severe cochlear hearing loss (HI) use and combine information within and across frequency regions in the perceptual separation of competing vowels with fundamental frequency differences (ΔF0) ranging from 0 to 9 semitones. Following the procedure of Culling and Darwin [J. Acoust. Soc. Am. 93, 3454-3467 (1993)], eight NH listeners and eight HI listeners identified competing vowels with either a consistent or inconsistent harmonic structure. Vowels were amplified to ensure audibility for HI listeners. The contribution of frequency region depended on the value of ΔF0 between the competing vowels. When ΔF0 was small, both groups of listeners effectively utilized ΔF0 cues in the low-frequency region. In contrast, HI listeners derived significantly less benefit than NH listeners from ΔF0 cues conveyed by the high-frequency region at small ΔF0s. At larger ΔF0s, both groups combined ΔF0 cues from the low and high formant-frequency regions. Cochlear impairment appears to negatively impact the ability to use F0 cues for within-formant grouping in the high-frequency region. However, cochlear loss does not appear to disrupt the ability to use within-formant F0 cues in the low-frequency region or to group F0 cues across formant regions.
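For readers less used to semitone units, the F0 differences of 0 to 9 semitones can be converted to frequency ratios with the standard equal-tempered relation, ratio = 2^(semitones/12). The sketch below is illustrative only: the 100 Hz base F0 in the usage note is an assumed value, not one reported in the abstract, and the function names are hypothetical.

```python
def semitones_to_ratio(semitones):
    """Frequency ratio for an interval in equal-tempered semitones: 2 ** (n / 12)."""
    return 2.0 ** (semitones / 12.0)


def shifted_f0(base_f0_hz, semitones):
    """F0 of the second vowel, given the first vowel's F0 and the F0 difference."""
    return base_f0_hz * semitones_to_ratio(semitones)
```

With an assumed base of 100 Hz, `shifted_f0(100.0, 9)` gives roughly 168 Hz, so the largest difference in the study is well under an octave.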

5.
For normal-hearing (NH) listeners, masker energy outside the spectral region of a target signal can improve target detection and identification, a phenomenon referred to as comodulation masking release (CMR). This study examined whether, for cochlear implant (CI) listeners and for NH listeners presented with a "noise vocoded" CI simulation, speech identification in modulated noise is improved by a comodulated flanking band. In Experiment 1, NH listeners identified noise-vocoded speech in a background of on-target noise with or without a flanking narrow band of noise outside the spectral region of the target. The on-target noise and flanker were either 16-Hz square-wave modulated with the same phase or were unmodulated; the speech was taken from a closed-set corpus. Performance was better in modulated than in unmodulated noise, and this difference was slightly greater when the comodulated flanker was present, consistent with a small CMR of about 1.7 dB for noise-vocoded speech. Experiment 2, which tested CI listeners using the same speech materials, found no advantage for modulated versus unmodulated maskers and no CMR. Thus, although NH listeners can benefit from CMR even for speech signals with reduced spectro-temporal detail, no CMR was observed for CI users.

6.
When listening to natural speech, listeners are fairly adept at using cues such as pitch, vocal tract length, prosody, and level differences to extract a target speech signal from an interfering speech masker. However, little is known about the cues that listeners might use to segregate synthetic speech signals that retain the intelligibility characteristics of speech but lack many of the features that listeners normally use to segregate competing talkers. In this experiment, intelligibility was measured in a diotic listening task that required the segregation of two simultaneously presented synthetic sentences. Three types of synthetic signals were created: (1) sine-wave speech (SWS); (2) modulated noise-band speech (MNB); and (3) modulated sine-band speech (MSB). The listeners performed worse for all three types of synthetic signals than they did with natural speech signals, particularly at low signal-to-noise ratio (SNR) values. Of the three synthetic signals, the results indicate that SWS signals preserve more of the voice characteristics used for speech segregation than MNB and MSB signals. These findings have implications for cochlear implant users, who rely on signals very similar to MNB speech and thus are likely to have difficulty understanding speech in cocktail-party listening environments.

7.
This study examined the sensitivity of four cochlear implant (CI) listeners to interaural time difference (ITD) in different portions of four-pulse sequences in lateralization discrimination. ITD was present either in all the pulses (referred to as condition Wave), the two middle pulses (Ongoing), the first pulse (Onset), the last pulse (Offset), or both the first and last pulse (Gating). All ITD conditions were tested at different pulse rates (100, 200, 400, and 800 pulses per second, pps). Also, five normal-hearing (NH) subjects were tested, listening to an acoustic simulation of CI stimulation. All CI and NH listeners were sensitive in condition Gating at all pulse rates for which they showed sensitivity in condition Wave. The sensitivity in condition Onset increased with the pulse rate for three CI listeners as well as for all NH listeners. The performance in condition Ongoing varied over the subjects. One CI listener showed sensitivity up to 800 pps, two up to 400 pps, and one at 100 pps only. The group of NH listeners showed sensitivity up to 200 pps. The result that CI listeners detect ITD from the middle pulses of short trains indicates the relevance of fine timing of stimulation pulses in lateralization and therefore in CI stimulation strategies.

8.
Spoken communication in a non-native language is especially difficult in the presence of noise. This study compared English and Spanish listeners' perceptions of English intervocalic consonants as a function of masker type. Three maskers (stationary noise, multitalker babble, and competing speech) provided varying amounts of energetic and informational masking. Competing English and Spanish speech maskers were used to examine the effect of masker language. Non-native performance fell short of that of native listeners in quiet, but a larger performance differential was found for all masking conditions. Both groups performed better in competing speech than in stationary noise, and both suffered most in babble. Since babble is a less effective energetic masker than stationary noise, these results suggest that non-native listeners are more adversely affected by both energetic and informational masking. A strong correlation was found between non-native performance in quiet and degree of deterioration in noise, suggesting that non-native phonetic category learning can be fragile. A small effect of language background was evident: English listeners performed better when the competing speech was Spanish.

9.
Harmonic complex tones comprising components in different spectral regions may differ considerably in timbre. While the pitch of "residue" tones of this type has been studied extensively, their timbral properties have received little attention. Discrimination of F0 for such tones is typically poorer than for complex tones with "corresponding" harmonics [A. Faulkner, J. Acoust. Soc. Am. 78, 1993-2004 (1985)]. The F0 DLs may be higher because timbre differences impair pitch discrimination. The present experiment explores effects of changes in spectral locus and F0 of harmonic complex tones on both pitch and timbre. Six normally hearing listeners indicated if the second tone of a two-tone sequence was: (1) same, (2) higher in pitch, (3) lower in pitch, (4) same in pitch but different in "something else," (5) higher in pitch and different in "something else," or (6) lower in pitch and different in "something else" than the first. ("Something else" is assumed to represent timbre.) The tones varied in spectral loci of four equal-amplitude harmonics m, m + 1, m + 2, and m + 3 (m = 1,2,3,4,5,6) and ranged in F0 from 200 to 200 ± 2n Hz (n = 0,1,2,4,8,16,32). Results show that changes in F0 primarily affect pitch, and changes in spectral locus primarily affect timbre. However, a change in spectral locus can also influence pitch. The direction of locus change was reported as the direction of pitch change, despite no change in F0 or changes in F0 in the opposite direction for ΔF0 ≤ 0-2%. This implies that listeners may be attending to the "spectral pitch" of components, or to changes in a timbral attribute like "sharpness," which are construed as changes in overall pitch in the absence of strong F0 cues. For ΔF0 ≥ 2%, the direction of reported pitch change accorded with the direction of F0 change, but the locus change continued to be reported as a timbre change. Rather than spectral-pitch matching of corresponding components, a context-dependent spectral evaluation process is thus implied in discernment of changes in pitch and timbre. Relative magnitudes of change in derived features of the spectrum such as harmonic number and F0, and absolute features such as spectral frequencies are compared. What is called "spectral pitch" contributes to the overall pitch, but also appears to be an important dimension of the multidimensional percept, timbre.
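The stimulus grid in this abstract can be enumerated in a few lines. The sketch reads the stated F0 range literally as 200 Hz plus or minus 2·n Hz, which is an assumption about a formula that may have been altered in extraction; the function names are hypothetical.

```python
def harmonic_frequencies(f0_hz, m, count=4):
    """Frequencies of harmonics m, m+1, ..., m+count-1 of the fundamental f0_hz."""
    return [f0_hz * k for k in range(m, m + count)]


def f0_values(base_hz=200.0, steps=(0, 1, 2, 4, 8, 16, 32)):
    """All F0s of the form base +/- 2*n Hz (literal reading of the abstract, an assumption)."""
    values = set()
    for n in steps:
        values.add(base_hz + 2 * n)
        values.add(base_hz - 2 * n)
    return sorted(values)
```

Under this reading, n = 2 corresponds to a 4 Hz change, which is the 2% boundary the abstract refers to, and moving the lowest harmonic from m = 1 to m = 6 at 200 Hz shifts the spectral locus from 200-800 Hz to 1200-1800 Hz without changing F0.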

10.
Traditionally, timbre has been defined as that perceptual attribute that differentiates two sounds when pitch and loudness are equal, and thus is a measure of dissimilarity. By such a definition, each voice possesses a set of timbres, and the ability to identify any voice or voice category across different pitch-loudness-vowel combinations must be due to an ability to "link" these timbres by abstracting the "timbre transformation," the manner in which timbre subtly changes across pitch and loudness for a specific voice or voice category. Using stimuli produced across the singing range by singers from different voice categories, this study sought to examine how timbre and pitch interact in the perception of dissimilarity in male singing voices. This study also investigated whether or not listener experience affects the perception of timbre as a function of pitch. The resulting multidimensional scaling (MDS) representations showed that for all stimuli and listeners, dimension 1 correlated with pitch, while dimension 2 correlated with spectral centroid and separated vocal stimuli into the categories baritone and tenor. Dimension 3 appeared highly idiosyncratic depending on the nature of the stimuli and on the experience of the listener. Inexperienced listeners appeared to rely more heavily on pitch in making dissimilarity judgments than did experienced listeners. The resulting MDS representations of dissimilarity across pitch provide a glimpse of the timbre transformation of voice categories across pitch.

11.
Traditionally, timbre has been defined as that perceptual attribute that differentiates two sounds when pitch and loudness are equal and thus is a measure of dissimilarity. By such a definition, each voice possesses a set of timbres, and the identity of any voice or voice category across different pitch-loudness-vowel combinations must be due to an abstraction of the pattern of timbre transformation. Using stimuli produced across the singing range by singers from different voice categories, this study sought to examine how timbre and pitch interact in the perception of dissimilarity. This study also investigated whether listener experience affects the perception of timbre as a function of pitch. The resulting multidimensional scaling (MDS) representations showed that for all stimuli and listeners, dimension 1 correlated with pitch, whereas dimension 2 correlated with spectral centroid and separated vocal stimuli into the categories mezzo-soprano and soprano. Dimension 3 appeared highly idiosyncratic depending on the nature of the stimuli and on the experience of the listener. Inexperienced listeners appeared to rely more heavily on pitch in making dissimilarity judgments than did experienced listeners. The resulting MDS representations of dissimilarity across pitch provide a glimpse of the timbre transformation of voice categories across pitch.

12.
This study investigated the integration of place- and temporal-pitch cues in pitch contour identification (PCI), in which cochlear implant (CI) users were asked to judge the overall pitch-change direction of stimuli. Falling and rising pitch contours were created either by continuously steering current between adjacent electrodes (place pitch), by continuously changing amplitude modulation (AM) frequency (temporal pitch), or both. The percentage of rising responses was recorded as a function of current steering or AM frequency change, with single or combined pitch cues. A significant correlation was found between subjects' sensitivity to current steering and AM frequency change. The integration of place- and temporal-pitch cues was most effective when the two cues were similarly discriminable in isolation. Adding the other (place or temporal) pitch cues shifted the temporal- or place-pitch psychometric functions horizontally without changing the slopes. PCI was significantly better with consistent place- and temporal-pitch cues than with inconsistent cues. PCI with single cues and integration of pitch cues were similar on different electrodes. The results suggest that CI users effectively integrate place- and temporal-pitch cues in relative pitch perception tasks. Current steering and AM frequency change should be coordinated to better transmit dynamic pitch information to CI users.

13.
Three experiments investigated the role of pre/post exposure to a masker in a detection task with complex, random, spectro-temporal maskers. In the first experiment, the masker was either continuously presented or pulsed on and off with the signal. For most listeners, thresholds were lower when the masker was continuously presented, despite the fact that there was more uncertainty about the timing of the signal. In the second experiment, the signal-bearing portion of the masker was preceded and followed by masker "fringes" of different durations. Consistent with the findings of Experiment 1, for some listeners shorter-duration fringes led to higher thresholds than long-duration fringes. In the third experiment, the masker fringe (a) preceded, (b) followed, or (c) both preceded and followed, the signal. Relative to the middle signal conditions, a late signal yielded lower thresholds and the early signal yielded higher thresholds. These results indicate that listeners can use features of an ongoing sound to extract an added signal and that listeners differ in the importance of pre-exposure for efficient signal extraction. However, listeners do not appear to perform this comparison retrospectively after the signal, potentially indicating a form of backward masking.

14.
Speech recognition in noise improves with combined acoustic and electric stimulation compared to electric stimulation alone [Kong et al., J. Acoust. Soc. Am. 117, 1351-1361 (2005)]. Here the contribution of fundamental frequency (F0) and low-frequency phonetic cues to speech recognition in combined hearing was investigated. Normal-hearing listeners heard vocoded speech in one ear and low-pass (LP) filtered speech in the other. Three listening conditions (vocode-alone, LP-alone, combined) were investigated. Target speech (average F0 = 120 Hz) was mixed with a time-reversed masker (average F0 = 172 Hz) at three signal-to-noise ratios (SNRs). LP speech aided performance at all SNRs. Low-frequency phonetic cues were then removed by replacing the LP speech with an LP equal-amplitude harmonic complex, frequency and amplitude modulated by the F0 and temporal envelope of voiced segments of the target. The combined hearing advantage disappeared at 10 and 15 dB SNR, but persisted at 5 dB SNR. A similar finding occurred when, additionally, F0 contour cues were removed. These results are consistent with a role for low-frequency phonetic cues, but not with a combination of F0 information between the two ears. The enhanced performance at 5 dB SNR with F0 contour cues absent suggests that voicing or glimpsing cues may be responsible for the combined hearing benefit.
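Mixing a target with a masker at a fixed SNR, as in the 5, 10, and 15 dB conditions above, usually comes down to scaling the masker relative to the target's RMS level. The sketch below shows that arithmetic on plain lists of samples. It is a generic illustration with hypothetical function names, not the procedure used in the study.

```python
import math


def rms(samples):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))


def snr_db(target, masker):
    """SNR in dB, defined here as 20 * log10(rms(target) / rms(masker))."""
    return 20.0 * math.log10(rms(target) / rms(masker))


def scale_masker_for_snr(target, masker, desired_snr_db):
    """Return the masker rescaled so that snr_db(target, result) equals desired_snr_db."""
    gain = rms(target) / (rms(masker) * 10.0 ** (desired_snr_db / 20.0))
    return [gain * v for v in masker]
```

For example, a masker with twice the target's RMS level needs a gain of about 0.28 to reach 5 dB SNR.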

15.
A triadic comparisons task and an identification task were used to evaluate normally hearing listeners' and hearing-impaired listeners' perceptions of synthetic CV stimuli in the presence of competition. The competing signals included multitalker babble, continuous speech spectrum noise, a CV masker, and a brief noise masker shaped to resemble the onset spectrum of the CV masker. All signals and maskers were presented monotically. Interference by competition was assessed by comparing multidimensional scaling solutions derived from each masking condition to that derived from the baseline (quiet) condition. Analysis of the effects of continuous maskers revealed that multitalker babble and continuous noise caused the same amount of change in performance, as compared to the baseline condition, for all listeners. CV masking changed performance significantly more than did brief noise masking, and the hearing-impaired listeners experienced more degradation in performance than the normally hearing listeners. Finally, the velar CV maskers (/gɛ/ and /kɛ/) caused significantly greater masking effects than the bilabial CV maskers (/bɛ/ and /pɛ/), and were most resistant to masking by other competing stimuli. The results suggest that speech intelligibility difficulties in the presence of competing segments of speech are primarily attributable to phonetic interference rather than to spectral masking. Individual differences in hearing-impaired listeners' performances are also discussed.

16.
Although most recent multitalker research has emphasized the importance of binaural cues, monaural cues can play an equally important role in the perception of multiple simultaneous speech signals. In this experiment, the intelligibility of a target phrase masked by a single competing masker phrase was measured as a function of signal-to-noise ratio (SNR) with same-talker, same-sex, and different-sex target and masker voices. The results indicate that informational masking, rather than energetic masking, dominated performance in this experiment. The amount of masking was highly dependent on the similarity of the target and masker voices: performance was best when different-sex talkers were used and worst when the same talker was used for target and masker. Performance did not, however, improve monotonically with increasing SNR. Intelligibility generally plateaued at SNRs below 0 dB and, in some cases, intensity differences between the target and masking voices produced substantial improvements in performance with decreasing SNR. The results indicate that informational and energetic masking play substantially different roles in the perception of competing speech messages.

17.
Melodic contour identification was measured in cochlear implant (CI) and normal-hearing (NH) subjects for piano samples processed by four bandpass filters: low (310-620 Hz), middle (620-2480 Hz), high (2480-4960 Hz), and full (310-4960 Hz). NH performance was near-perfect for all filter ranges and much higher than CI performance. The best mean CI performance was with the middle frequency range; performance was much better for some CI subjects with the middle rather than the full filter. These results suggest that acoustic filtering may reduce potential mismatches between fundamental frequencies and harmonic components, thereby improving CI users' melodic pitch perception.
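The four passbands are easier to compare in octaves than in hertz. The short sketch below computes each band's width as log2(high/low), using only the cutoff frequencies quoted in the abstract; the dictionary and function names are hypothetical.

```python
import math

# Passband edges in Hz, as listed in the abstract.
BANDS_HZ = {
    "low": (310.0, 620.0),
    "middle": (620.0, 2480.0),
    "high": (2480.0, 4960.0),
    "full": (310.0, 4960.0),
}


def bandwidth_octaves(low_hz, high_hz):
    """Width of a passband in octaves: log2 of the ratio of its edges."""
    return math.log2(high_hz / low_hz)


def all_bandwidths():
    """Map each band name to its width in octaves."""
    return {name: bandwidth_octaves(lo, hi) for name, (lo, hi) in BANDS_HZ.items()}
```

The low and high bands each span one octave, the middle band two, and the full band four, so the middle band that gave the best mean CI performance covers half of the full range on a log-frequency scale.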

18.
A masker can reduce target intelligibility by interfering with the target's peripheral representation ("energetic masking"), by causing more central interference ("informational masking"), or both. Intelligibility generally improves with increasing spatial separation between two sources, an effect known as spatial release from masking (SRM). Here, SRM was measured using two concurrent sine-vocoded talkers. Target and masker were each composed of eight different narrowbands of speech (with little spectral overlap). The broadband target-to-masker energy ratio (TMR) was varied, and response errors were used to assess the relative importance of energetic and informational masking. Performance improved with increasing TMR. SRM occurred at all TMRs; however, the pattern of errors suggests that spatial separation affected performance differently, depending on the dominant type of masking. Detailed error analysis suggests that informational masking occurred due to failures in either across-time linkage of target segments (streaming) or top-down selection of the target. Specifically, differences in the spatial cues in target and masker improved streaming and target selection. In contrast, level differences helped listeners select the target, but had little influence on streaming. These results demonstrate that at least two mechanisms (differentially affected by spatial and level cues) influence informational masking.

19.
Sequences of rapidly occurring sounds that differ from each other are often perceptually segregated into "streams" within which the range of differences is smaller [Bregman and Campbell, J. Exp. Psychol. 89, 244-249 (1971)]. Early research on streaming implied it to be pitch dominated, but Wessel [Comput. Music J. 3, 45-52 (1979)] demonstrated that timbre differences could also bring about segregation. In the present study, pitch and timbre attributes were put in competition in four-tone sequences of the form: T2P1-TmP1-T2Pn-TmPn, with the first pair assigned pitch P1 but different timbres T2 and Tm, and the second pair pitch Pn, and similarly contrasted timbres. Six listeners were asked to indicate whether perceived grouping of 49 such sequences was based on pitch proximity, timbre similarity, or ambiguous percepts not dominated by either cue. Results confirm that timbre can segregate sequences and imply that timbre and pitch compete in perceptually organizing complex sequences. Because timbre differences were provided by varying the locus of four equal-amplitude harmonics, and pitch differences were provided by varying their relative spacing, it is suggested that the tradeoffs observed may actually arise due to differences in perceived salience of "spectral pitch" and "virtual pitch" [Terhardt, J. Acoust. Soc. Am. 55, 1061-1069 (1974)] dependent on relative changes in spectral locus and spectral spacing over time.

20.
Detection performance for a masked auditory signal of fixed frequency can be substantially degraded if there is uncertainty about the frequency content of the masker. A quasimolecular psychophysical approach was used to examine response strategies in masker-uncertainty conditions, and to investigate the influence of uncertainty when the number of different masker samples was limited to ten or fewer. The task of the four listeners was to detect a 1000-Hz signal that was presented simultaneously with one of ten ten-tone masker samples. The masker sample was either fixed throughout a block of two-interval forced-choice trials or was randomized across or within trials. The primary results showed that: (1) When the signal level was low and the masker sample differed between the two intervals of a trial, most listeners based their responses more on the presence of specific masker samples than on the signal. (2) The detrimental effect of masker uncertainty was clearly evident when only four maskers were randomly presented, and grew as the size of the masker set was increased from two to ten. (3) The slopes of psychometric functions measured with the same masker samples differed among the fixed and two random-masker conditions. (4) There were large differences in the influence of masker uncertainty across masker samples and listeners. These data demonstrate the great susceptibility of human listeners to the influence of masker uncertainty and the ability of quasimolecular investigations to reveal important aspects of behavior in uncertainty conditions.
