Similar literature
Found 20 similar articles; search took 31 ms
1.
Introduction of masker amplitude modulation (AM) can improve signal detection in a number of paradigms. In some cases this advantage depends on the coherence of modulation across a relatively wide frequency range. In the experiments described below, observers were asked to identify masked spondee words produced by a single male talker. The target spondees and masking noise were filtered into nine narrow bands, and the coherence of AM of either the speech signal or noise masker was manipulated. Inherent modulation of the masker bands was manipulated via assignment of real and imaginary values to the associated components of each band in the frequency domain, and AM of speech bands was achieved via multiplication with envelopes extracted from these maskers. Responses were based on two alternatives, four alternatives, or open response sets. The effect of masker AM coherence was highly dependent upon the size of the response set: coherent AM was associated with better thresholds in a two-alternative response set, but poorer thresholds in an open response set. Results with AM speech did not depend critically upon the across-frequency temporal synchrony of AM imposed on the speech material.

2.
The present study investigated the effect of envelope modulations in a background masker on consonant recognition by normal hearing listeners. It is well known that listeners understand speech better under a temporally modulated masker than under a steady masker at the same level, due to masking release. The possibility of an opposite phenomenon, modulation interference, whereby speech recognition could be degraded by a modulated masker due to interference with auditory processing of the speech envelope, was hypothesized and tested under various speech and masker conditions. It was of interest whether modulation interference for speech perception, if it were observed, could be predicted by modulation masking, as found in psychoacoustic studies using nonspeech stimuli. Results revealed that masking release measurably occurred under a variety of conditions, especially when the speech signal maintained a high degree of redundancy across several frequency bands. Modulation interference was also clearly observed under several circumstances when the speech signal did not contain a high redundancy. However, the effect of modulation interference did not follow the expected pattern from psychoacoustic modulation masking results. In conclusion, (1) both factors, modulation interference and masking release, should be accounted for whenever a background masker contains temporal fluctuations, and (2) caution needs to be taken when psychoacoustic theory on modulation masking is applied to speech recognition.

3.
Under certain conditions, speech recognition in noise decreases above conversational levels when signal-to-noise ratio is held constant. The current study was undertaken to determine if nonlinear growth of masking and the subsequent reduction in "effective" signal-to-noise ratio accounts for this decline. Nine young adults with normal hearing listened to monosyllabic words at three levels in each of three levels of a masker shaped to match the speech spectrum. An additional low-level noise equated audibility by producing equivalent masked thresholds for all subjects. If word recognition was determined entirely by signal-to-noise ratio and was independent of overall speech and masker levels, scores at a given signal-to-noise ratio should remain constant with increasing level. Masked pure-tone thresholds measured in the speech-shaped maskers increased linearly with increasing masker level at lower frequencies but nonlinearly at higher frequencies, consistent with nonlinear growth of upward spread of masking that followed the peaks in the spectrum of the speech-shaped masker. Word recognition declined significantly with increasing level when signal-to-noise ratio was held constant, which was attributed to nonlinear growth of masking and reduced "effective" signal-to-noise ratio at high speech-shaped masker levels, as indicated by audibility estimates based on the Articulation Index.

4.
The idea that listeners are able to "glimpse" the target speech in the presence of competing noise has been supported by many studies, and is based on the assumption that listeners are able to glimpse pieces of the target speech occurring at different times and somehow patch them together to hear out the target speech. The factors influencing glimpsing in noise are not well understood and are examined in the present study. Specifically, the effects of the frequency location, spectral width, and duration of the glimpses are examined. Stimuli were constructed using an ideal time-frequency (T-F) masking technique that ensures that the target is stronger than the masker in certain T-F regions of the mixture, thereby rendering certain regions easier to glimpse than others. Sentences were synthesized using this technique with glimpse information placed in several frequency regions while varying the glimpse window duration and total duration of glimpsing. Results indicated that the frequency location and total duration of the glimpses had a significant effect on speech recognition, with the highest performance obtained when the listeners were able to glimpse information in the F1/F2 frequency region (0-3 kHz) for at least 60% of the utterance.
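
As a rough illustration of the ideal T-F masking idea mentioned in this abstract, the Python sketch below marks the time-frequency units in which the target is stronger than the masker. It is only a sketch under stated assumptions: the function name `ideal_binary_mask`, the power-array inputs, and the 0 dB criterion are hypothetical choices for illustration and are not taken from the study.

```python
import numpy as np

def ideal_binary_mask(target_power, masker_power, threshold_db=0.0):
    """Return a 0/1 mask marking T-F units where the target dominates.

    target_power, masker_power: arrays of shape (n_freq, n_frames) holding
    the power of each signal in every time-frequency unit. These inputs are
    hypothetical; the study's analysis front end is not described here.
    """
    eps = 1e-12  # avoids division by zero and log of zero
    local_snr_db = 10.0 * np.log10((target_power + eps) / (masker_power + eps))
    return (local_snr_db > threshold_db).astype(float)

# Toy example: 2 frequency channels x 3 time frames (made-up numbers).
target = np.array([[4.0, 1.0, 0.5],
                   [0.1, 9.0, 2.0]])
masker = np.array([[1.0, 1.0, 2.0],
                   [1.0, 1.0, 2.0]])
mask = ideal_binary_mask(target, masker)

# Fraction of T-F units available for glimpsing in this toy mixture.
glimpsed_fraction = mask.mean()
```

Multiplying a mixture's T-F representation by such a mask keeps only the target-dominated units, which is one way the location and duration of glimpses could be controlled.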

5.
This study examined combinations of energetic and informational maskers in speech identification. Speech targets and maskers (speech or noise) were processed and filtered into sets of 15 narrow frequency bands. The target was the sum of eight randomly selected bands. More masking occurred for speech maskers than for spectrally matched noise maskers regardless of whether the masker bands overlapped the target bands. The greater effect of the speech maskers was interpreted as due to informational masking. When the masker was comprised of nonoverlapping bands of speech, the addition of bands of noise overlapping the speech masker, but not the speech target, reduced the overall amount of masking. Surprisingly, presenting the noise to the ear contralateral to the target and masker produced an even greater release from masking. The contralateral noise was apparently sufficient to cause a slight change in the image of the ipsilateral speech masker, possibly pulling it away from the target enough to allow the focus of attention on the target. This finding is consistent with the interpretation that in some conditions small binaural differences may be sufficient to cause, or significantly strengthen, the perceptual segregation of sounds.

6.
The present study examined the relative influence of the off- and on-frequency spectral components of modulated and unmodulated maskers on consonant recognition. Stimuli were divided into 30 contiguous equivalent rectangular bandwidths. The temporal fine structure (TFS) in each "target" band was either left intact or replaced with tones using vocoder processing. Recognition scores for 10, 15 and 20 target bands randomly located in frequency were obtained in quiet and in the presence of all 30 masker bands, only the off-frequency masker bands, or only the on-frequency masker bands. The amount of masking produced by the on-frequency bands was generally comparable to that produced by the broadband masker. However, the difference between these two conditions was often significant, indicating an influence of the off-frequency masker bands, likely through modulation interference or spectral restoration. Although vocoder processing systematically led to poorer consonant recognition scores, the deficit observed in noise could often be attributed to that observed in quiet. These data indicate that (i) speech recognition is affected by the off-frequency components of the background and (ii) the nature of the target TFS does not systematically affect speech recognition in noise, especially when energetic masking and/or the number of target bands is limited.

7.
Normal-hearing (NH) listeners maintain robust speech understanding in modulated noise by "glimpsing" portions of speech from a partially masked waveform--a phenomenon known as masking release (MR). Cochlear implant (CI) users, however, generally lack such resiliency. In previous studies, temporal masking of speech by noise occurred randomly, obscuring to what degree MR is attributable to the temporal overlap of speech and masker. In the present study, masker conditions were constructed to either promote (+MR) or suppress (-MR) masking release by controlling the degree of temporal overlap. Sentence recognition was measured in 14 CI subjects and 22 young-adult NH subjects. Normal-hearing subjects showed large amounts of masking release in the +MR condition and a marked difference between +MR and -MR conditions. In contrast, CI subjects demonstrated less effect of MR overall, and some displayed modulation interference as reflected by poorer performance in modulated maskers. These results suggest that the poor performance of typical CI users in noise might be accounted for by factors that extend beyond peripheral masking, such as reduced segmental boundaries between syllables or words. Encouragingly, the best CI users tested here could take advantage of masker fluctuations to better segregate the speech from the background.

8.
Across-frequency processing by common interaural time delay (ITD) in spatial unmasking was investigated by measuring speech reception thresholds (SRTs) for high- and low-frequency bands of target speech presented against concurrent speech or a noise masker. Experiment 1 indicated that presenting one of these target bands with an ITD of +500 μs and the other with zero ITD (like the masker) provided some release from masking, but full binaural advantage was only measured when both target bands were given an ITD of +500 μs. Experiment 2 showed that full binaural advantage could also be achieved when the high- and low-frequency bands were presented with ITDs of equal but opposite magnitude (±500 μs). In experiment 3, the masker was also split into high- and low-frequency bands with ITDs of equal but opposite magnitude (±500 μs). The ITD of the low-frequency target band matched that of the high-frequency masking band and vice versa. SRTs indicated that, as long as the target and masker differed in ITD within each frequency band, full binaural advantage could be achieved. These results suggest that the mechanism underlying spatial unmasking exploits differences in ITD independently within each frequency channel.
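
To give a sense of scale for the 500 μs ITD used in this abstract, the Python sketch below imposes an ITD on a mono signal by delaying one channel by a whole number of samples. The sampling rate of 48 kHz, the helper name `apply_itd`, the sign convention (a positive ITD delays the left channel), and the rounding to whole samples are all assumptions made for illustration; the abstract does not state how the delay was implemented.

```python
import numpy as np

def apply_itd(mono, itd_seconds, sample_rate):
    """Return (left, right) channels with an interaural time delay.

    Assumed convention for this sketch: a positive ITD delays the left
    channel, a negative ITD delays the right channel. The delay is rounded
    to a whole number of samples.
    """
    delay = int(round(abs(itd_seconds) * sample_rate))
    delayed = np.concatenate([np.zeros(delay), mono])  # lagging ear
    direct = np.concatenate([mono, np.zeros(delay)])   # leading ear
    if itd_seconds >= 0:
        return delayed, direct
    return direct, delayed

sample_rate = 48000        # assumed; not stated in the abstract
click = np.zeros(100)
click[10] = 1.0            # unit impulse as a simple test signal

left, right = apply_itd(click, 500e-6, sample_rate)

# Lag of the left channel relative to the right, in samples.
lag_samples = int(np.argmax(left) - np.argmax(right))
```

At the assumed 48 kHz rate, 500 μs corresponds to 24 samples. Applying the function separately to band-filtered signals, with opposite signs, would mimic the opposite-ITD conditions in outline.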

9.
Similarity between the target and masking voices is known to have a strong influence on performance in monaural and binaural selective attention tasks, but little is known about the role it might play in dichotic listening tasks with a target signal and one masking voice in the one ear and a second independent masking voice in the opposite ear. This experiment examined performance in a dichotic listening task with a target talker in one ear and same-talker, same-sex, or different-sex maskers in both the target and the unattended ears. The results indicate that listeners were most susceptible to across-ear interference with a different-sex within-ear masker and least susceptible with a same-talker within-ear masker, suggesting that the amount of across-ear interference cannot be predicted from the difficulty of selectively attending to the within-ear masking voice. The results also show that the amount of across-ear interference consistently increases when the across-ear masking voice is more similar to the target speech than the within-ear masking voice is, but that no corresponding decline in across-ear interference occurs when the across-ear voice is less similar to the target than the within-ear voice. These results are consistent with an "integrated strategy" model of speech perception where the listener chooses a segregation strategy based on the characteristics of the masker present in the target ear and the amount of across-ear interference is determined by the extent to which this strategy can also effectively be used to suppress the masker in the unattended ear.

10.
A series of four experiments was undertaken to ascertain whether signal threshold in frequency-modulated noise bands is dependent upon the coherence of modulation. The specific goal was to determine whether a masking release could be obtained with frequency modulation (FM), analogous to the comodulation masking release (CMR) phenomenon observed with amplitude modulation (AM). It was hypothesized that an across-frequency grouping process might give rise to such an effect. In experiments 1-3, maskers were composed of three noise bands centered on 1600, 2000, and 2400 Hz; these were either comodulated or noncomodulated with respect to both FM and AM. In experiment 1, the modulation was sinusoidal, and the signal was a 2000-Hz pure tone; in experiment 2, the modulation was random, and the signal was an FM noise band centered on 2000 Hz. The results obtained showed that, given sufficient width of modulation, thresholds were lower in a coherent FM masker than in an incoherent FM masker, regardless of the pattern of AM or signal type. However, thresholds in multiband maskers were usually elevated relative to that in a single-band masker centered on the signal. Experiment 3 demonstrated that coherent FM could be discriminated from incoherent FM. Experiment 4 gave similar patterns of results to the respective conditions of experiments 2 and 3, but for an inharmonic masker with bands centered on 1580, 2000, and 2532 Hz. While within-channel processes could not be entirely excluded from contributing to the present results, the experimental conditions were designed to be minimally conducive to such processes.

11.
Comodulation masking release (CMR) refers to an improvement in the detection threshold of a signal masked by noise with coherent amplitude fluctuation across frequency, as compared to noise without the envelope coherence. The present study tested whether such an advantage for signal detection would facilitate the identification of speech phonemes. Consonant identification of bandpass speech was measured under the following three masker conditions: (1) a single band of noise in the speech band ("on-frequency" masker); (2) two bands of noise, one in the on-frequency band and the other in the "flanking band," with coherence of temporal envelope fluctuation between the two bands (comodulation); and (3) two bands of noise (on-frequency band and flanking band), without the coherence of the envelopes (noncomodulation). A pilot experiment with a small number of consonant tokens was followed by the main experiment with 12 consonants and the following masking conditions: three frequency locations of the flanking band and two masker levels. Results showed that in all conditions, the comodulation condition provided higher identification scores than the noncomodulation condition, and the difference in score was 3.5% on average. No significant difference was observed between the on-frequency only condition and the comodulation condition, i.e., an "unmasking" effect by the addition of a comodulated flanking band was not observed. The positive effect of CMR on consonant recognition found in the present study endorses a "cued-listening" theory, rather than an envelope correlation theory, as a basis of CMR in a suprathreshold task.

12.
The speech perception of two multiple-channel cochlear implant patients was compared with that of three normally hearing listeners using an acoustic model of the implant for 22 different speech tests. The tests used included a minimal auditory capabilities battery, both closed-set and open-set word and sentence tests, speech tracking and a 12-consonant confusion study using nonsense syllables. The acoustic model represented electrical current pulses by bursts of noise and the effects of different electrodes were represented by using bandpass filters with different center frequencies. All subjects used a speech processor that coded the fundamental voicing frequency of speech as a pulse rate and the second formant frequency of speech as the electrode position in the cochlea, or the center frequency of the bandpass filter. Very good agreement was found for the two groups of subjects, indicating that the acoustic model is a useful tool for the development and evaluation of alternative cochlear implant speech processing strategies.

13.
This experiment assessed the benefits of suppression and the impact of reduced or absent suppression on speech recognition in noise. Psychophysical suppression was measured in forward masking using tonal maskers and suppressors and band-limited noise maskers and suppressors. Subjects were 10 younger and 10 older adults with normal hearing, and 10 older adults with cochlear hearing loss. For younger subjects with normal hearing, suppression measured with noise maskers increased with masker level and was larger at 2.0 kHz than at 0.8 kHz. Less suppression was observed for older than younger subjects with normal hearing. There was little evidence of suppression for older subjects with cochlear hearing loss. Suppression measured with noise maskers and suppressors was larger in magnitude and more prevalent than suppression measured with tonal maskers and suppressors. The benefit of suppression to speech recognition in noise was assessed by obtaining scores for filtered consonant-vowel syllables as a function of the bandwidth of a forward masker. Speech-recognition scores in forward maskers should be higher than those in simultaneous maskers given that forward maskers are less effective than simultaneous maskers. If suppression also mitigated the effects of the forward masker and resulted in an improved signal-to-noise ratio, scores should decrease less in forward masking as forward-masker bandwidth increased, and differences between scores in forward and simultaneous maskers should increase, as was observed for younger subjects with normal hearing. Less or no benefit of suppression to speech recognition in noise was observed for older subjects with normal hearing or hearing loss. In general, as suppression measured with tonal signals increased, the combined benefit of forward masking and suppression to speech recognition in noise also increased.

14.
This study investigated whether speech-like maskers without linguistic content produce informational masking of speech. The target stimuli were nonsense Chinese Mandarin sentences. In experiment I, the masker contained harmonics the fundamental frequency (F0) of which was sinusoidally modulated and the mean F0 of which was varied. The magnitude of informational masking was evaluated by measuring the change in intelligibility (releasing effect) produced by inducing a perceived spatial separation of the target speech and masker via the precedence effect. The releasing effect was small and was only clear when the target and masker had the same mean F0, suggesting that informational masking was small. Performance with the harmonic maskers was better than with a steady speech-shaped noise (SSN) masker. In experiments II and III, the maskers were speech-like synthesized signals, alternating between segments with harmonic structure and segments composed of SSN. Performance was much worse than for experiment I, and worse than when an SSN masker was used, suggesting that substantial informational masking occurred. The similarity of the F0 contours of the target and masker had little effect. The informational masking effect was not influenced by whether or not the noise-like segments of the masker were synchronous with the unvoiced segments of the target speech.

15.
Masking might be due either to the spread of the excitation produced by the masker to the place of the tone signal along the cochlea or to the suppression of the response to the signal by the masker. In order to identify the contributions of these two mechanisms to tone-on-tone masking, masked thresholds of auditory-nerve fibers were measured in anesthetized cats using the same stimulus paradigms and detection criteria as in psychophysics. Suppressive masking was identified by comparing thresholds for simultaneous masking with those for a nonsimultaneous masking technique resembling pulsation thresholds. These nonsimultaneous thresholds do not include the contribution of suppression to masking because suppression only occurs for stimuli that overlap in time. For each masker and signal frequency, the fibers with the lowest (or "best") masked thresholds had characteristic frequencies (CF) slightly on the opposite side of the masker frequency with respect to the signal frequency, consistent with the psychophysical phenomenon of off-frequency listening. Patterns of best masked thresholds against signal frequency resembled psychophysical masking patterns in that they showed a maximum for signal frequencies close to the masker, and a skew toward high frequencies. Masking was found to be both excitatory and suppressive, with the relative contribution of the two mechanisms depending on the frequency separation between signal and masker. Suppressive masking was large for signal frequencies well above the masker. For these conditions, simultaneous thresholds grew more rapidly with masker level than did nonsimultaneous thresholds, suggesting that the upward spread of masking is largely due to the growth of suppression rather than to that of excitation.

16.
Although informational masking is thought to reflect central mechanisms, the effects are generally much stronger when the target and masker are presented to the same ear than when they are presented to different ears. However, the results of a recent study by Brungart and Simpson [J. Acoust. Soc. Am. 112, 2985-2995 (2002)] indicated that a speech masker that is presented contralateral to a speech signal can produce substantial amounts of informational masking when a second speech masker is played simultaneously in the same ear as the signal. In this study, we conducted a series of experiments that paralleled those of Brungart and Simpson but used a pure-tone signal and multitone informational maskers in a detection task. Both the signal and the maskers were played as sequences of short bursts in each observation interval. The maskers were arranged in two types of spectrotemporal patterns. One type of pattern, called "multiple-bursts same" (MBS), has previously been shown to produce very large amounts of informational masking while the other type of pattern, called "multiple-bursts different" (MBD), has been shown to produce very small amounts of informational masking. Several conditions of ipsilateral, contralateral, and combined presentation of these maskers were tested. The results showed that presentation of the MBS masker in the contralateral ear produced a substantial amount of informational masking when the MBD masker was simultaneously presented to the ipsilateral ear. The results supported the earlier findings of Brungart and Simpson indicating that listeners are unable to selectively focus their attention on a single ear in some complex dichotic listening conditions. These results suggest that this contralateral masking effect is not restricted to speech and may reflect more general limitations on processing capacity. 
Further, it was concluded that the magnitude of the contralateral masking effect was related both to the informational masking value of the contralateral masker and the complexity of the stimulus and/or task in the ear in which the signal was presented.

17.
The Tickle Talker is an electrotactile speech perception device. Subjects were evaluated using the device in various tactile-alone and tactile-visual contexts to assess the generalization to other contexts of tactile-alone perceptual skills. The subjects were from a group of six normally hearing subjects who had previously received 12 to 33 h of tactile-alone word recognition training and had learned an average vocabulary of 50 words [Galvin et al., J. Acoust. Soc. Am. 106, 1084-1089 (1999)]. The tactile-alone evaluation contexts were sentences, unfamiliar talkers, and untrained words. The tactile-visual evaluation contexts were closed-set words, open-set words, and open-set sentences. Tactile-alone perceptual skills were generalized to unfamiliar speakers, sentences, and untrained words, though scores indicated that generalization was not complete. In contrast, the generalization of skills to tactile-visual contexts was minimal or absent. The potential value of tactile-alone training for hearing-impaired users of the Tickle Talker is discussed.

18.
This study examined whether speech-on-speech masking is sensitive to variation in the degree of similarity between the target and the masker speech. Three experiments investigated whether speech-in-speech recognition varies across different background speech languages (English vs Dutch) for both English and Dutch targets, as well as across variation in the semantic content of the background speech (meaningful vs semantically anomalous sentences), and across variation in listener status vis-à-vis the target and masker languages (native, non-native, or unfamiliar). The results showed that the more similar the target speech is to the masker speech (e.g., same vs different language, same vs different levels of semantic content), the greater the interference on speech recognition accuracy. Moreover, the listener's knowledge of the target and the background language modulates the size of the release from masking. These factors had an especially strong effect on masking effectiveness in highly unfavorable listening conditions. Overall this research provided evidence that the degree of target-masker similarity plays a significant role in speech-in-speech recognition. The results also give insight into how listeners assign their resources differently depending on whether they are listening to their first or second language.

19.
Release from masking caused by envelope fluctuations   Cited by: 1 (self-citations: 0, citations by others: 1)
This paper examines how short-term energy fluctuations in a masker affect the thresholds for tones at frequencies above those of the masker. Two equally intense tones at 1060 and 1075 Hz produce up to 25 dB less masking than does a 1075-Hz tone set to the overall level of the two-tone complex. At wider frequency separations, two-tone complexes also produce less masking than the pure tone. These results indicate that envelope fluctuations in a masker, whose spectrum is confined to a single critical band, may result in release from masking. The release from masking probably is related to the comodulation masking release reported by Hall et al. [J. Acoust. Soc. Am. 76, 50-56 (1984b)] for modulated-noise maskers with bandwidths greater than one critical band. Further measurements with maskers, whose intensity level in the critical band around 1 kHz was 90 dB SPL, show similar masking by a pure tone and a 625- to 1075-Hz bandpass noise, but less masking by narrow-band noises. These results are inconsistent with a simple frequency selective energy-detector model and indicate that the auditory system can use periods of low masker energy as brief as a few ms to enhance detection of a tone. The results also imply that the upward spread of excitation is best represented by masking patterns for noises with bandwidths of several critical bands.
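
The envelope fluctuation of the two-tone masker in this abstract can be made concrete with a short Python sketch. Two equal-amplitude tones at 1060 and 1075 Hz sum to a carrier at their mean frequency multiplied by a slow cosine, so the envelope dips toward zero 15 times per second. The sampling rate, the unit amplitudes, and the zero starting phases are assumptions for illustration only; the paper's actual stimulus parameters beyond the two frequencies are not given here.

```python
import numpy as np

f1, f2 = 1060.0, 1075.0      # tone frequencies from the abstract (Hz)
sample_rate = 48000          # assumed for this sketch
t = np.arange(sample_rate) / sample_rate   # one second of time samples

# Two equally intense tones (unit amplitude, zero phase assumed).
two_tone = np.sin(2 * np.pi * f1 * t) + np.sin(2 * np.pi * f2 * t)

# Sum-to-product identity: the pair equals a carrier at the mean frequency
# multiplied by a slow cosine at half the difference frequency.
carrier = np.sin(2 * np.pi * 0.5 * (f1 + f2) * t)
slow = 2 * np.cos(2 * np.pi * 0.5 * (f2 - f1) * t)
envelope = np.abs(slow)      # instantaneous amplitude of the complex

beat_rate_hz = f2 - f1                   # envelope minima per second
beat_period_ms = 1000.0 / beat_rate_hz   # time between successive minima
```

With these frequencies the envelope minima recur roughly every 67 ms, and the envelope stays near zero only briefly around each minimum, which fits the abstract's point that short periods of low masker energy are available.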

20.
This study investigated the role of uncertainty in masking of speech by interfering speech. Target stimuli were nonsense sentences recorded by a female talker. Masking sentences were recorded from ten female talkers and combined into pairs. Listeners' recognition performance was measured with both target and masker presented from a front loudspeaker (nonspatial condition) or with a masker presented from two loudspeakers, with the right leading the front by 4 ms (spatial condition). In Experiment 1, the sentences were presented in blocks in which the masking talkers, spatial configuration, and signal-to-noise (S-N) ratio were fixed. Listeners' recognition performance varied widely among the masking talkers in the nonspatial condition, much less so in the spatial condition. This result was attributed to variation in effectiveness of informational masking in the nonspatial condition. The second experiment increased uncertainty by randomizing masking talkers and S-N ratios across trials in some conditions, and reduced uncertainty by presenting the same token of masker across trials in other conditions. These variations in masker uncertainty had relatively small effects on speech recognition.
