1.
Ihlefeld A, Shinn-Cunningham B. The Journal of the Acoustical Society of America, 2008, 123(6): 4369-4379
A masker can reduce target intelligibility by interfering with the target's peripheral representation ("energetic masking"), by causing more central interference ("informational masking"), or both. Intelligibility generally improves with increasing spatial separation between two sources, an effect known as spatial release from masking (SRM). Here, SRM was measured using two concurrent sine-vocoded talkers. Target and masker were each composed of eight different narrow bands of speech (with little spectral overlap). The broadband target-to-masker energy ratio (TMR) was varied, and response errors were used to assess the relative importance of energetic and informational masking. Performance improved with increasing TMR. SRM occurred at all TMRs; however, the pattern of errors suggests that spatial separation affected performance differently, depending on the dominant type of masking. Detailed error analysis suggests that informational masking occurred due to failures in either across-time linkage of target segments (streaming) or top-down selection of the target. Specifically, differences in the spatial cues in target and masker improved streaming and target selection. In contrast, level differences helped listeners select the target, but had little influence on streaming. These results demonstrate that at least two mechanisms (differentially affected by spatial and level cues) influence informational masking.
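As an illustration of the broadband target-to-masker energy ratio (TMR) mentioned in this abstract, the following is a minimal sketch of how such a ratio could be computed in dB and how a masker could be rescaled to reach a desired TMR. The function names (`energy`, `tmr_db`, `scale_masker_for_tmr`) and the toy sample values are assumptions made for illustration; the abstract does not describe the authors' actual signal processing.

```python
import math

def energy(samples):
    """Sum of squared sample values (broadband energy of a signal)."""
    return sum(s * s for s in samples)

def tmr_db(target, masker):
    """Target-to-masker energy ratio in dB (hypothetical helper)."""
    return 10.0 * math.log10(energy(target) / energy(masker))

def scale_masker_for_tmr(target, masker, desired_tmr_db):
    """Return a copy of the masker scaled so that tmr_db equals desired_tmr_db.

    Scaling amplitude by g changes masker energy by g**2, i.e. the TMR
    changes by -20*log10(g) dB.
    """
    current = tmr_db(target, masker)
    gain = 10.0 ** ((current - desired_tmr_db) / 20.0)
    return [gain * s for s in masker]

# Toy signals (assumed values, not data from the study).
target = [1.0, -1.0, 1.0, -1.0]
masker = [0.5, -0.5, 0.5, -0.5]
louder_masker = scale_masker_for_tmr(target, masker, -10.0)
print(round(tmr_db(target, masker), 2), round(tmr_db(target, louder_masker), 2))
```

A negative TMR means the masker carries more broadband energy than the target; varying this value is what "the TMR was varied" refers to in the abstract.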
2.
Ihlefeld A, Shinn-Cunningham B. The Journal of the Acoustical Society of America, 2008, 123(6): 4380-4392
When listening selectively to one talker in a two-talker environment, performance generally improves with spatial separation of the sources. The current study explores the role of spatial separation in divided listening, when listeners reported both of two simultaneous messages processed to have little spectral overlap (limiting "energetic masking" between the messages). One message was presented at a fixed level, while the other message level varied from equal to 40 dB less than that of the fixed-level message. Results demonstrate that spatial separation of the competing messages improved divided-listening performance. Most errors occurred because listeners failed to report the content of the less-intense talker. Moreover, performance generally improved as the broadband energy ratio of the variable-level to the fixed-level talker increased. The error patterns suggest that spatial separation improves the intelligibility of the less-intense talker by improving the ability to (1) hear portions of the signal that would otherwise be masked, (2) segregate the two talkers properly into separate perceptual streams, and (3) selectively focus attention on the less-intense talker. Spatial configuration did not noticeably affect the ability to report the more-intense talker, suggesting that it was processed differently than the less-intense talker, which was actively attended.
3.
Two experiments compared the effect of supplying visual speech information (e.g., lipreading cues) on the ability to hear one female talker's voice in the presence of steady-state noise or a masking complex consisting of two other female voices. In the first experiment intelligibility of sentences was measured in the presence of the two types of maskers with and without perceived spatial separation of target and masker. The second study tested detection of sentences in the same experimental conditions. Results showed that visual cues provided more benefit for both recognition and detection of speech when the masker consisted of other voices (versus steady-state noise). Moreover, visual cues provided greater benefit when the target speech and masker were spatially coincident versus when they appeared to arise from different spatial locations. The data obtained here are consistent with the hypothesis that lipreading cues help to segregate a target voice from competing voices, in addition to the established benefit of supplementing masked phonetic information.
4.
Scott SK, Rosen S, Wickham L, Wise RJ. The Journal of the Acoustical Society of America, 2004, 115(2): 813-821
Positron emission tomography (PET) was used to investigate the neural basis of the comprehension of speech in unmodulated noise ("energetic" masking, dominated by effects at the auditory periphery), and when presented with another speaker ("informational" masking, dominated by more central effects). Each type of signal was presented at four different signal-to-noise ratios (SNRs) (+3, 0, -3, -6 dB for the speech-in-speech, +6, +3, 0, -3 dB for the speech-in-noise), with listeners instructed to listen for meaning to the target speaker. Consistent with behavioral studies, there was SNR-dependent activation associated with the comprehension of speech in noise, with no SNR-dependent activity for the comprehension of speech-in-speech (at low or negative SNRs). There was, in addition, activation in bilateral superior temporal gyri which was associated with the informational masking condition. The extent to which this activation of classical "speech" areas of the temporal lobes might delineate the neural basis of the informational masking is considered, as is the relationship of these findings to the interfering effects of unattended speech and sound on more explicit working memory tasks. This study is a novel demonstration of candidate neural systems involved in the perception of speech in noisy environments, and of the processing of multiple speakers in the dorso-lateral temporal lobes.
5.
Although many studies have shown that intelligibility improves when a speech signal and an interfering sound source are spatially separated in azimuth, little is known about the effect that spatial separation in distance has on the perception of competing sound sources near the head. In this experiment, head-related transfer functions (HRTFs) were used to process stimuli in order to simulate a target talker and a masking sound located at different distances along the listener's interaural axis. One of the signals was always presented at a distance of 1 m, and the other signal was presented 1 m, 25 cm, or 12 cm from the center of the listener's head. The results show that distance separation has very different effects on speech segregation for different types of maskers. When speech-shaped noise was used as the masker, most of the intelligibility advantages of spatial separation could be accounted for by spectral differences in the target and masking signals at the ear with the higher signal-to-noise ratio (SNR). When a same-sex talker was used as the masker, the intelligibility advantages of spatial separation in distance were dominated by binaural effects that produced the same performance improvements as a 4-5-dB increase in the SNR of a diotic stimulus. These results suggest that distance-dependent changes in the interaural difference cues of nearby sources play a much larger role in the reduction of the informational masking produced by an interfering speech signal than in the reduction of the energetic masking produced by an interfering noise source.
6.
The detection of a tone added to a random-frequency, multitone masker can be very poor even when the maskers have little energy in the frequency region of the signal. This paper examines the effects of adding a pretrial cue to reduce uncertainty for the masker or the signal. The first two experiments examined the effect of cuing a fixed-frequency signal as the number of masker components and presentation methods were manipulated. Cue effectiveness varied across observers, but could reduce thresholds by as much as 20 dB. Procedural comparisons indicated observers benefited more from having two masker samples to compare, with or without a signal cue, than having a single interval with one masker sample and a signal cue. The third experiment used random-frequency signals and compared no-cue, signal-cue, and masker-cue conditions, and also systematically varied the time interval between cue offset and trial onset. Thresholds with a cued random-frequency signal remained higher than for a cued fixed-frequency signal. For time intervals between the cue and trial of 50 ms or longer, thresholds were approximately the same with a signal or a masker cue and lower than when there was no cue. Without a cue or with a masker cue, analyses of possible decision strategies suggested observers attended to the potential signal frequencies, particularly the highest signal frequency. With a signal cue, observers appeared to attend to the frequency of the subsequent signal.
7.
8.
How much masking is informational masking?
Lutfi RA. The Journal of the Acoustical Society of America, 1990, 88(6): 2607-2610
9.
Lutfi RA, Chang AC, Stamas J, Gilbertson L. The Journal of the Acoustical Society of America, 2012, 132(2): EL109-EL113
There has been growing interest in recent years in masking that appears to have its origin at a central level of the auditory nervous system, so-called informational masking (IM). Masker uncertainty and target-masker similarity have been identified as the two major factors affecting IM; however, no theoretical framework currently exists that would give these terms the precise meaning necessary to evaluate their relative importance or model their effects. The present paper offers a first attempt at such a framework constructed within the doctrines of the theory of signal detection.
10.
Talkers change the way they speak in noisy conditions. For energetic maskers, speech production changes are relatively well-understood, but less is known about how informational maskers such as competing speech affect speech production. The current study examines the effect of energetic and informational maskers on speech production by talkers speaking alone or in pairs. Talkers produced speech in quiet and in backgrounds of speech-shaped noise, speech-modulated noise, and competing speech. Relative to quiet, speech output level and fundamental frequency increased and spectral tilt flattened in proportion to the energetic masking capacity of the background. In response to modulated backgrounds, talkers were able to reduce substantially the degree of temporal overlap with the noise, with greater reduction for the competing speech background. Reduction in foreground-background overlap can be expected to lead to a release from both energetic and informational masking for listeners. Passive changes in speech rate, mean pause length or pause distribution cannot explain the overlap reduction, which appears instead to result from a purposeful process of listening while speaking. Talkers appear to monitor the background and exploit upcoming pauses, a strategy which is particularly effective for backgrounds containing intelligible speech.
11.
Cooke M, Garcia Lecumberri ML, Barker J. The Journal of the Acoustical Society of America, 2008, 123(1): 414-427
Studies comparing native and non-native listener performance on speech perception tasks can distinguish the roles of general auditory and language-independent processes from those involving prior knowledge of a given language. Previous experiments have demonstrated a performance disparity between native and non-native listeners on tasks involving sentence processing in noise. However, the effects of energetic and informational masking have not been explicitly distinguished. Here, English and Spanish listener groups identified keywords in English sentences in quiet and masked by either stationary noise or a competing utterance, conditions known to produce predominantly energetic and informational masking, respectively. In the stationary noise conditions, non-native listeners suffered more from increasing levels of noise for two of the three keywords scored. In the competing talker condition, the performance differential also increased with masker level. A computer model of energetic masking in the competing talker condition ruled out the possibility that the native advantage could be explained wholly by energetic masking. Both groups drew equal benefit from differences in mean F0 between target and masker, suggesting that processes which make use of this cue do not engage language-specific knowledge.
12.
13.
Ihlefeld A, Shinn-Cunningham BG, Carlyon RP. The Journal of the Acoustical Society of America, 2012, 131(2): 1315-1324
For normal-hearing (NH) listeners, masker energy outside the spectral region of a target signal can improve target detection and identification, a phenomenon referred to as comodulation masking release (CMR). This study examined whether, for cochlear implant (CI) listeners and for NH listeners presented with a "noise vocoded" CI simulation, speech identification in modulated noise is improved by a co-modulated flanking band. In Experiment 1, NH listeners identified noise-vocoded speech in a background of on-target noise with or without a flanking narrow band of noise outside the spectral region of the target. The on-target noise and flanker were either 16-Hz square-wave modulated with the same phase or were unmodulated; the speech was taken from a closed-set corpus. Performance was better in modulated than in unmodulated noise, and this difference was slightly greater when the comodulated flanker was present, consistent with a small CMR of about 1.7 dB for noise-vocoded speech. Experiment 2, which tested CI listeners using the same speech materials, found no advantage for modulated versus unmodulated maskers and no CMR. Thus although NH listeners can benefit from CMR even for speech signals with reduced spectro-temporal detail, no CMR was observed for CI users.
14.
The degree of similarity between signal and masker in informational masking paradigms has been hypothesized to contribute to informational masking. The present study attempted to quantify "similarity" using a discrimination task. Listeners discriminated various signal stimuli from a multitone complex and then detected the presence of those signals embedded in a multitone informational masker. Discriminability negatively correlated with detection threshold in an informational masking experiment, indicating that similarity in quality between the signal and the masker contributed to informational masking. These results suggest a method for specifying relevant signal attributes in informational masking paradigms involving similarity manipulations.
15.
Detection thresholds for a tone in an unfamiliar tonal pattern can be greatly elevated under conditions of masker uncertainty [Neff and Green, Percept. Psychophys. 41, 409-415 (1987); Oh and Lutfi, J. Acoust. Soc. Am. 101, 3148 (1997)]. The present experiment was undertaken to determine whether harmonicity of masker tones can reduce the detrimental effect of masker uncertainty. Inharmonic maskers consisted of m = 2-49 frequency components selected at random on each presentation within 100-10000 Hz, excluding frequencies between 920 and 1080 Hz. Harmonic maskers consisted of frequency components selected at random within this same range, but constrained to have a fundamental frequency of 200 Hz. For inharmonic maskers the signal was a 1000-Hz tone. For harmonic maskers the signal was a tone whose frequency was either harmonically (1000 Hz) or inharmonically (1047 Hz) related to the masker. In all conditions the amount of masking was greatest for m = 20-40 components. At this point, harmonic maskers with a harmonic signal produced an average of 9-12 dB less masking than inharmonic maskers. Harmonic maskers with an inharmonic signal produced an average of 16-20 dB less masking.
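The masker construction described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not say how frequencies were distributed, so uniform sampling on a linear frequency axis is assumed here, and the function names (`inharmonic_masker`, `harmonic_masker`) are hypothetical.

```python
import random

# Values taken from the abstract: 100-10000 Hz range, 920-1080 Hz protected
# region around the 1000-Hz signal, 200-Hz fundamental for harmonic maskers.
LO_HZ, HI_HZ = 100.0, 10000.0
GAP_HZ = (920.0, 1080.0)
F0_HZ = 200.0

def in_gap(f):
    """True if frequency f falls inside the excluded region."""
    return GAP_HZ[0] <= f <= GAP_HZ[1]

def inharmonic_masker(m, rng):
    """Draw m random component frequencies, rejecting any in the gap.

    ASSUMPTION: uniform sampling on a linear Hz axis.
    """
    freqs = []
    while len(freqs) < m:
        f = rng.uniform(LO_HZ, HI_HZ)
        if not in_gap(f):
            freqs.append(f)
    return sorted(freqs)

def harmonic_masker(m, rng):
    """Draw m distinct harmonics of F0_HZ within range and outside the gap."""
    candidates = [k * F0_HZ for k in range(1, int(HI_HZ // F0_HZ) + 1)
                  if LO_HZ <= k * F0_HZ <= HI_HZ and not in_gap(k * F0_HZ)]
    return sorted(rng.sample(candidates, m))

rng = random.Random(0)  # seeded only to make the sketch repeatable
print(len(inharmonic_masker(20, rng)), harmonic_masker(5, rng))
```

Under these assumptions there are exactly 49 eligible harmonics of 200 Hz (200 to 10000 Hz, minus 1000 Hz), which is consistent with the upper limit of m = 49 stated in the abstract.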
16.
This study sought to determine whether speech recognition in a modulating noise background can be facilitated by a process attributable to comodulation masking release (CMR). Experiment 1 examined the masked identification of six filtered vowels as a function of the number of comodulated noise bands present. A benefit of increased number was observed, consistent with an interpretation in terms of CMR, although it could not be established with certainty that the basis of the discrimination was word recognition in the semantic sense. Experiment 2 made use of a forced-choice rhyming test in which the response foils differed only in a single filtered consonant; again, the measure of interest was performance as a function of the number of comodulated noise bands present. No evidence for a suprathreshold CMR was observed. Experiment 3 made use of open-set sentence material and employed a different paradigm, which allowed a measure of CMR in terms of the difference between thresholds in correlated and uncorrelated noise to be determined. While a CMR for speech detection was observed, no CMR for speech recognition was found. It was concluded that CMR is most evident in masked detection tasks and that diminishing returns are encountered as the signal-to-masker ratio is raised.
17.
Auditory spatial attention is one mechanism that may contribute to the ability to identify one sound source in a multi-source environment. The role of auditory spatial attention in a multi-source environment was investigated using the probe-signal method. The experiment took place in a quiet room with seven speakers arranged in a semi-circle in front of the listener. The speakers were placed at 30-degree intervals at a distance of 5 ft from the listener. The signal consisted of eight contiguous, 60-ms pure-tone bursts arranged in either a rising or falling frequency pattern. Masker components also consisted of eight contiguous pure-tone bursts but with durations that varied randomly from 20 to 100 ms. The six maskers were played with the signal and were constructed in order to result in informational rather than energetic masking. The frequency of each masker component was chosen randomly on each burst from a narrow frequency band, independent of the signal frequency band. The task was 1I-2AFC fixed-level identification with response time measurement. The listener was instructed to focus attention on a specified speaker (expected location) for a block of trials. Accuracy and response time were compared across two conditions: (1) signal presented at the expected location and (2) signal presented at an unexpected location. Results indicate a significant increase in accuracy and faster response time when the signal was presented at the expected location as compared to an unexpected location. These results suggest that auditory spatial attention plays an important role in multi-source listening, especially when the listening environment is complex and uncertain.
18.
19.
20.
When multitone maskers are used in a two-interval, forced choice experiment, the amount of masking is larger when the masker is randomly chosen on each presentation interval rather than on each trial (the same masker in the two listening intervals). These conditions are referred to as having within- versus between-trial randomization. If it is assumed that an observer's ultimate detection decision depends on a single decision variable (DV), it is probable that the DV's variance will be substantially larger in the within-trial randomization condition compared to the between-trial randomization condition. The goal of the current experiment is to evaluate the degree to which this stimulus-based change in DV variance can account for the difference in thresholds in the within- versus between-trial randomization conditions. Thresholds are measured for the detection of a tone added to a six-component masker in between- and within-trial randomization conditions. The slopes of the psychometric functions provide an estimate of the variance in the DV for the between- and within-trial randomization conditions. Additionally, a channel model is fitted to the psychophysical results in the within-trial randomization condition. The resulting model is then used to predict the value of the DV for each trial, and ultimately to estimate the proportion of the total variance in the within-trial randomization condition that is attributable to changes in maskers across intervals. The variance of the DV in the between-trial randomization condition accounted for approximately 65% of the total variance in the DV in the within-trial randomization condition. Stimulus-based interval-by-interval masker randomization accounted for approximately 20% of the total variance of the within-trial randomization DV. The remaining 15% of the DV variance in the within-trial randomization condition remained unaccounted for. This result is fairly stable whether the maskers are drawn from a small or a large pool of potential maskers.
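The variance bookkeeping reported in this abstract can be checked with a short sketch. The percentages (65% and 20%) come from the abstract; the function name `partition_variance`, the component labels, and the normalization of the total variance to 1.0 are assumptions made for illustration and say nothing about how the authors estimated these quantities.

```python
def partition_variance(total, shares):
    """Split a total variance into named components plus an unaccounted part.

    `shares` maps component names to fractions of the total (0..1).
    """
    accounted = sum(shares.values())
    if accounted > 1.0:
        raise ValueError("shares exceed 100% of the total variance")
    parts = {name: share * total for name, share in shares.items()}
    parts["unaccounted"] = (1.0 - accounted) * total
    return parts

# Fractions from the abstract; total variance normalized to 1.0 (assumption).
parts = partition_variance(1.0, {
    "between_trial_dv_variance": 0.65,
    "interval_by_interval_masker_randomization": 0.20,
})
for name, value in parts.items():
    print(f"{name}: {value:.2f}")
```

The residual share that falls out of this arithmetic matches the roughly 15% the abstract describes as unaccounted for.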