Similar articles
 20 similar articles found; search took 31 ms
1.
This study examined combinations of energetic and informational maskers in speech identification. Speech targets and maskers (speech or noise) were processed and filtered into sets of 15 narrow frequency bands. The target was the sum of eight randomly selected bands. More masking occurred for speech maskers than for spectrally matched noise maskers regardless of whether the masker bands overlapped the target bands. The greater effect of the speech maskers was interpreted as due to informational masking. When the masker consisted of nonoverlapping bands of speech, the addition of bands of noise overlapping the speech masker, but not the speech target, reduced the overall amount of masking. Surprisingly, presenting the noise to the ear contralateral to the target and masker produced an even greater release from masking. The contralateral noise was apparently sufficient to cause a slight change in the image of the ipsilateral speech masker, possibly pulling it away from the target enough to allow the focus of attention on the target. This finding is consistent with the interpretation that in some conditions small binaural differences may be sufficient to cause, or significantly strengthen, the perceptual segregation of sounds.
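The band-selection step described above can be sketched as follows. This is only an illustration: the abstract gives the 15 bands and the 8-band target, while the function name `choose_bands`, the masker band count, and the all-or-nothing overlap rule are assumptions made here, not the authors' procedure.

```python
import random

N_BANDS = 15         # number of narrow frequency bands (from the abstract)
N_TARGET_BANDS = 8   # bands summed to form the target (from the abstract)

def choose_bands(n_masker_bands, overlap, rng):
    """Pick target bands at random, then masker bands that either all
    coincide with target bands or all avoid them (hypothetical scheme)."""
    all_bands = list(range(N_BANDS))
    target = sorted(rng.sample(all_bands, N_TARGET_BANDS))
    # Masker bands come from the target set or from its complement.
    pool = target if overlap else [b for b in all_bands if b not in target]
    if n_masker_bands > len(pool):
        raise ValueError("not enough bands available for the masker")
    masker = sorted(rng.sample(pool, n_masker_bands))
    return target, masker
```

With 8 of 15 bands assigned to the target, at most 7 bands remain for a nonoverlapping masker, which is why the sketch rejects larger requests.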

2.
This study examined whether increasing the similarity between informational maskers and signals would increase the amount of masking obtained in a nonspeech pattern identification task. The signals were contiguous sequences of pure-tone bursts arranged in six narrow-band spectro-temporal patterns. The informational maskers were sequences of multitone bursts played synchronously with the signal tones. The listener's task was to identify the patterns in a 1-interval 6-alternative forced-choice procedure. Three types of multitone maskers were generated according to different randomization rules. For the least signal-like informational masker, the components in each multitone burst were chosen at random within the frequency range of 200-6500 Hz, excluding a "protected region" around the signal frequencies. For the intermediate masker, the frequency components in the first burst were chosen quasirandomly, but the components in successive bursts were constrained to fall in narrow frequency bands around the frequencies of the components in the initial burst. Within the narrow bands the frequencies were randomized. This masker was considered to be more similar to the signal patterns because it consisted of a set of narrow-band sequences any one of which might be mistaken for a signal pattern. The most signal-like masker was similar to the intermediate masker in that it consisted of a set of synchronously played narrow-band sequences, but the variation in frequency within each sequence was sinusoidal, completing roughly one period in a sequence. This masker consisted of discernible patterns but not patterns that were part of the set of signals. In addition, masking produced by Gaussian noise bursts--thought to produce primarily peripherally based "energetic masking"--was measured and compared to the informational masking results. 
For the three informational maskers, more masking was produced by the maskers composed of narrow-band sequences than by the masker in which the frequencies were not constrained to narrow bands. Also, the slopes of the performance-level functions for the three informational maskers were much shallower than for the Gaussian noise masker or for no masker. The findings provided qualified support for the hypothesis that increasing the similarity between signals and maskers, or parts of the maskers, causes greater informational masking. However, it is also possible that the greater masking was a consequence of increasing the number of perceptual "streams" that had to be evaluated by the listener.
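The least signal-like masker, with components drawn at random between 200 and 6500 Hz outside a protected region, might be generated along the following lines. The frequency range is from the abstract; the number of components, the edges of the protected region, and the choice of a uniform draw on a log-frequency axis are assumptions for illustration only.

```python
import math
import random

F_LO, F_HI = 200.0, 6500.0   # masker frequency range in Hz (from the abstract)

def draw_masker_burst(n_components, protected_lo, protected_hi, rng):
    """Draw component frequencies for one multitone burst by rejection
    sampling: candidates inside the protected region are discarded.
    The log-uniform draw is an assumption, not stated in the abstract."""
    freqs = []
    while len(freqs) < n_components:
        f = math.exp(rng.uniform(math.log(F_LO), math.log(F_HI)))
        if not (protected_lo <= f <= protected_hi):
            freqs.append(f)
    return sorted(freqs)
```

Calling `draw_masker_burst` once per burst with fresh random draws gives a masker whose spectrum changes from burst to burst; the intermediate and most signal-like maskers would need additional constraints that this sketch does not implement.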

3.
In Experiment 1, the validity of parameters associated with the roex(p, r) auditory filter shape was examined for three different types of maskers: (a) a noise masker, (b) a random 12-tone masker whose frequencies varied on a burst-by-burst basis [multiple-burst different (MBD)], and (c) a random 12-tone masker whose frequencies were the same across bursts [multiple-burst same (MBS)]. First, the power spectrum model of masking was used to estimate auditory filter shapes for four observers. Second, the resulting auditory filter shapes were used in a computer simulation that provided an estimate of internal noise for each observer. Third, relative weights across frequency were estimated for each observer and each masker type. For the noise masker, these analyses provided predictions and relative weights that were consistent across the three analyses. For the MBD and MBS maskers, there was little consistency; neither the estimated internal noise nor the estimated relative weights reliably supported a single-filter model of detection. In Experiment 2, the time course for the detection of a tone added to an MBD masker was evaluated by estimating relative weights jointly in time and frequency. The relative weights at the signal frequency formed a rough inverted "U" across time.
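For reference, the roex(p, r) weighting function is commonly written as W(g) = (1 - r)(1 + pg)exp(-pg) + r, where g is the deviation from the filter's center frequency divided by the center frequency. A minimal sketch, assuming this standard form is the one meant; the parameter values used in the example call are illustrative and not taken from the study:

```python
import math

def roex_weight(f, fc, p, r):
    """Rounded-exponential filter weight at frequency f (Hz) for a filter
    centered at fc (Hz). p sets the sharpness of the passband and r sets
    the floor reached far from the center frequency."""
    g = abs(f - fc) / fc   # normalized distance from the center frequency
    return (1.0 - r) * (1.0 + p * g) * math.exp(-p * g) + r

# Illustrative values only: weight 200 Hz above a 1000-Hz center frequency.
w = roex_weight(1200.0, 1000.0, p=25.0, r=0.001)
```

The weight equals 1 at the center frequency and falls toward r as g grows, so r bounds how much a remote masker component can be attenuated.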

4.
This study investigated whether speech-like maskers without linguistic content produce informational masking of speech. The target stimuli were nonsense Chinese Mandarin sentences. In experiment I, the masker contained harmonics the fundamental frequency (F0) of which was sinusoidally modulated and the mean F0 of which was varied. The magnitude of informational masking was evaluated by measuring the change in intelligibility (releasing effect) produced by inducing a perceived spatial separation of the target speech and masker via the precedence effect. The releasing effect was small and was only clear when the target and masker had the same mean F0, suggesting that informational masking was small. Performance with the harmonic maskers was better than with a steady speech-shaped noise (SSN) masker. In experiments II and III, the maskers were speech-like synthesized signals, alternating between segments with harmonic structure and segments composed of SSN. Performance was much worse than for experiment I, and worse than when an SSN masker was used, suggesting that substantial informational masking occurred. The similarity of the F0 contours of the target and masker had little effect. The informational masking effect was not influenced by whether or not the noise-like segments of the masker were synchronous with the unvoiced segments of the target speech.

5.
Detection was measured for a 500 Hz tone masked by noise (an "energetic" masker) or sets of ten randomly drawn tones (an "informational" masker). Presenting the maskers diotically and the target tone with a variety of interaural differences (interaural amplitude ratios and/or interaural time delays) resulted in reduced detection thresholds relative to when the target was presented diotically ("binaural release from masking"). Thresholds observed when time and amplitude differences applied to the target were "reinforcing" (favored the same ear, resulting in a lateralized position for the target) were not significantly different from thresholds obtained when differences were "opposing" (favored opposite ears, resulting in a centered position for the target). This irrelevance of differences in the perceived location of the target is a classic result for energetic maskers but had not previously been shown for informational maskers. However, this parallelism between the patterns of binaural release for energetic and informational maskers was not accompanied by high correlations between the patterns for individual listeners, supporting the idea that the mechanisms for binaural release from energetic and informational masking are fundamentally different.

6.
In the many studies done on informational masking, interfering speech reduces speech intelligibility. This effect is often used to secure privacy in public spaces. These applications require estimates of how much masking is required. In general, masking effects are estimated by using spectrum information as excitation patterns. However, estimates of informational masking can hardly be obtained by only using spectrum information. Therefore, we estimated the effects of informational masking using time-domain information. Then, we calculated the cepstra of the envelopes’ magnitude histograms. If these cepstra are different between the target and the masker, the signals are not similar in the time domain. Furthermore, the effect of informational masking would be low. Therefore, we considered the histograms’ cepstra distances (HCD) to estimate signal similarities. The signal similarities in our first experiment were estimated using five maskers by utilizing the HCD. These maskers were random noise, music, female speech, male speech, and target speaker’s speech. Male and female speech were more similar to the target speech than music and noise. Also, the same speaker’s speech was the most similar in the set of maskers. A listening test was carried out in the second experiment to verify the HCD. A double masker was used in this experiment as an effective informational masker. It has similar characteristics to reversed speech. The listening test results suggest the double masker’s masking effects have the same relation to the HCD. This suggests informational masking can be estimated by signal similarity using the HCD.

7.
Spoken communication in a non-native language is especially difficult in the presence of noise. This study compared English and Spanish listeners' perceptions of English intervocalic consonants as a function of masker type. Three maskers (stationary noise, multitalker babble, and competing speech) provided varying amounts of energetic and informational masking. Competing English and Spanish speech maskers were used to examine the effect of masker language. Non-native performance fell short of that of native listeners in quiet, but a larger performance differential was found for all masking conditions. Both groups performed better in competing speech than in stationary noise, and both suffered most in babble. Since babble is a less effective energetic masker than stationary noise, these results suggest that non-native listeners are more adversely affected by both energetic and informational masking. A strong correlation was found between non-native performance in quiet and degree of deterioration in noise, suggesting that non-native phonetic category learning can be fragile. A small effect of language background was evident: English listeners performed better when the competing speech was Spanish.

8.
The effect of perceived spatial differences on masking release was examined using a 4AFC speech detection paradigm. Targets were 20 words produced by a female talker. Maskers were recordings of continuous streams of nonsense sentences spoken by two female talkers and mixed into each of two channels (two talker, and the same masker time reversed). Two masker spatial conditions were employed: "RF" with a 4 ms time lead to the loudspeaker 60 degrees horizontally to the right, and "FR" with the time lead to the front (0 degrees) loudspeaker. The reference nonspatial "F" masker was presented from the front loudspeaker only. Target presentation was always from the front loudspeaker. In Experiment 1, target detection threshold for both natural and time-reversed spatial maskers was 17-20 dB lower than that for the nonspatial masker, suggesting that significant release from informational masking occurs with spatial speech maskers regardless of masker understandability. In Experiment 2, the effectiveness of the FR and RF maskers was evaluated as the right loudspeaker output was attenuated until the two-source maskers were indistinguishable from the F masker, as measured independently in a discrimination task. Results indicated that spatial release from masking can be observed with barely noticeable target-masker spatial differences.

9.
Masked thresholds for a 1000-Hz sinusoidal signal were measured as a function of masker level in both forward and simultaneous masking for two types of maskers: a 1000-Hz sinusoid and a narrowband noise, 60-Hz wide, centered at 1000 Hz. In forward masking, the noise masker produced much steeper growth-of-masking functions than the sinusoid. Presenting a contralateral broadband noise "cue" with the forward masker dramatically reduced the slope of masking for the noise masker but did not influence results for the sinusoidal masker. The noise remained the more effective masker. The amount of masking produced by combinations of equally effective narrowband-noise and sinusoidal maskers was compared to that produced by each masker individually with and without the contralateral cue. No additional masking beyond that predicted by energy summation was measured for forward masking. Additional masking beyond energy-sum predictions was measured for analogous conditions in simultaneous masking. Comparisons of results obtained with and without the contralateral cue suggest that signal thresholds in the presence of narrowband-noise forward maskers can reflect nonperipheral auditory processes.
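One common way to express an energy-summation prediction is to add the individual masked thresholds in intensity units and convert the sum back to decibels. The sketch below assumes that formulation; the abstract does not state the exact computation used, and the threshold values in the example are made up for illustration.

```python
import math

def energy_sum_threshold_db(thresholds_db):
    """Predicted threshold (dB) for a combination of maskers, assuming the
    masking effects of the individual maskers add in intensity units."""
    total_intensity = sum(10.0 ** (t / 10.0) for t in thresholds_db)
    return 10.0 * math.log10(total_intensity)

# Two equally effective maskers, each giving a 40-dB threshold (made-up value):
predicted = energy_sum_threshold_db([40.0, 40.0])   # about 43 dB
```

Under this assumption two equally effective maskers raise the predicted threshold by about 3 dB over either masker alone, so measured thresholds well above that would count as additional masking.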

10.
In the simultaneous multitone masking paradigm introduced by Neff and Green [Percept. Psychophys. 41, 409-415 (1987)] the masker typically is a small number of tones having frequencies and levels that are randomly drawn on every presentation. Large amounts of masking for a pure-tone signal often occur that are thought to reflect central, rather than peripheral, limitations on processing. Previous work from this laboratory has indicated that playing a rapid succession of randomly drawn multitone maskers in each observation interval dramatically reduces the amount of masking that is observed relative to a single burst (SB). In this multiple-bursts-different (MBD) procedure, the signal tone is the only constant frequency component during the sequence of bursts and tends to perceptually segregate from the masker. In this study, the number of masker bursts and the interburst interval (IBI) were varied. The goals were to determine how the release from masking relative to the SB condition depends on the number of bursts and to examine whether increasing the IBI would cause each burst to be processed independently. If the latter were true, it might disrupt the perception of signal stream coherence, thereby diminishing the MBD advantage. However, multiple independent looks could also lead to an improvement in performance. For those subjects showing large amounts of informational masking in the SB condition, substantial reduction in masked thresholds occurred as the number of masker bursts increased, while masking increased as IBI lengthened. The results were not consistent with a simple version of a multiple-look model in which the information from each burst was combined optimally, but instead appear to be attributable to mechanisms involved in the perceptual organization of sounds.
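In the textbook form of a multiple-look model, independent looks are combined optimally by summing squared sensitivities, so that N equally informative looks improve d' by a factor of the square root of N. The sketch below shows that rule only; it is an assumption that this is the "simple version" the abstract refers to, and the per-burst d' values are hypothetical.

```python
import math

def combined_dprime(dprimes):
    """Sensitivity predicted from optimally combining independent looks:
    the square root of the sum of the squared per-look d' values."""
    return math.sqrt(sum(d * d for d in dprimes))

# Four bursts, each with a hypothetical per-burst d' of 1.0:
print(combined_dprime([1.0] * 4))
```

Such a model predicts a gradual improvement that depends only on the number of looks, so it does not by itself predict an effect of interburst interval like the one reported above.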

11.
When a masking sound is spatially separated from a target speech signal, substantial releases from masking typically occur both for speech and noise maskers. However, when a delayed copy of the masker is also presented at the location of the target speech (a condition that has been referred to as the front target, right-front masker or F-RF configuration), the advantages of spatial separation vanish for noise maskers but remain substantial for speech maskers. This effect has been attributed to precedence, which introduces an apparent spatial separation between the target and masker in the F-RF configuration that helps the listener to segregate the target from a masking voice but not from a masking noise. In this study, virtual synthesis techniques were used to examine variations of the F-RF configuration in an attempt to more fully understand the stimulus parameters that influence the release from masking obtained in that condition. The results show that the release from speech-on-speech masking caused by the addition of the delayed copy of the masker is robust across a wide variety of source locations, masker locations, and masker delay values. This suggests that the speech unmasking that occurs in the F-RF configuration is not dependent on any single perceptual cue and may indicate that F-RF speech segregation is only partially based on the apparent left-right location of the RF masker.

12.
Similarity between the target and masking voices is known to have a strong influence on performance in monaural and binaural selective attention tasks, but little is known about the role it might play in dichotic listening tasks with a target signal and one masking voice in one ear and a second independent masking voice in the opposite ear. This experiment examined performance in a dichotic listening task with a target talker in one ear and same-talker, same-sex, or different-sex maskers in both the target and the unattended ears. The results indicate that listeners were most susceptible to across-ear interference with a different-sex within-ear masker and least susceptible with a same-talker within-ear masker, suggesting that the amount of across-ear interference cannot be predicted from the difficulty of selectively attending to the within-ear masking voice. The results also show that the amount of across-ear interference consistently increases when the across-ear masking voice is more similar to the target speech than the within-ear masking voice is, but that no corresponding decline in across-ear interference occurs when the across-ear voice is less similar to the target than the within-ear voice. These results are consistent with an "integrated strategy" model of speech perception where the listener chooses a segregation strategy based on the characteristics of the masker present in the target ear and the amount of across-ear interference is determined by the extent to which this strategy can also effectively be used to suppress the masker in the unattended ear.

13.
The across-trial effect of maskers in conditions of informational masking was evaluated from performance on occasional trials in which the signal was presented alone. For 6 of 12 listeners participating in the study, a significant number of errors were obtained on signal-alone trials, in some cases equivalent to that on signal-plus-masker trials. On immediately preceding trial blocks for which there were no intervening maskers, performance for these signals was perfect. The results indicate that informational maskers can have a significant effect on signal threshold, both within and across trials.

14.
These experiments investigated whether perceptual cueing plays a role in the "unmasking" effects which have been observed in forward masking for narrow-band noise maskers and brief signals. The forward masking produced by a 100-Hz-wide noise masker at a level of 60 dB SPL was measured for a 1-kHz sinusoidal signal with a raised-cosine envelope and a duration of 10 ms at the 6-dB-down points, both for the masker alone, and with various components added to the masker (and gated synchronously with the masker). Unmasking was found to occur even for components which were extremely unlikely to produce a significant suppression of the masker: these included a 75-dB SPL 4-kHz sinusoid, a 50-dB SPL 1.4-kHz sinusoid, a noise low-pass filtered at 4 kHz with a spectrum level of 0 dB, and a noise low-pass filtered at 4 kHz with a spectrum level of 20 dB presented in the opposite ear to the masker-plus-signal. It is concluded that perceptual cueing can play a significant role in producing unmasking for brief signals following narrow-band noise maskers, and that it is unwise to interpret the unmasking solely in terms of suppression.
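The signal description (a raised-cosine envelope lasting 10 ms between its 6-dB-down points) can be illustrated with a short sketch. It assumes the envelope is a single raised-cosine window with no steady-state portion, in which case the half-amplitude (about -6 dB) points are separated by half the total duration, giving a 20-ms signal overall. That reading, and the function names, are assumptions rather than details confirmed by the abstract.

```python
import math

def raised_cosine_envelope(t, total_dur):
    """Raised-cosine (Hann-shaped) amplitude envelope on 0 <= t <= total_dur."""
    if t < 0.0 or t > total_dur:
        return 0.0
    return 0.5 * (1.0 - math.cos(2.0 * math.pi * t / total_dur))

def signal_sample(t, freq_hz=1000.0, half_amp_dur=0.010):
    """One sample of the gated sinusoid at time t (seconds). The total
    duration is twice the duration between the half-amplitude points."""
    total_dur = 2.0 * half_amp_dur
    carrier = math.sin(2.0 * math.pi * freq_hz * t)
    return raised_cosine_envelope(t, total_dur) * carrier
```

Evaluating the envelope at 5 ms and 15 ms of a 20-ms window gives an amplitude of 0.5, which is 20·log10(0.5), or roughly -6 dB, relative to the peak at 10 ms.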

15.
Although many studies have shown that intelligibility improves when a speech signal and an interfering sound source are spatially separated in azimuth, little is known about the effect that spatial separation in distance has on the perception of competing sound sources near the head. In this experiment, head-related transfer functions (HRTFs) were used to process stimuli in order to simulate a target talker and a masking sound located at different distances along the listener's interaural axis. One of the signals was always presented at a distance of 1 m, and the other signal was presented 1 m, 25 cm, or 12 cm from the center of the listener's head. The results show that distance separation has very different effects on speech segregation for different types of maskers. When speech-shaped noise was used as the masker, most of the intelligibility advantages of spatial separation could be accounted for by spectral differences in the target and masking signals at the ear with the higher signal-to-noise ratio (SNR). When a same-sex talker was used as the masker, the intelligibility advantages of spatial separation in distance were dominated by binaural effects that produced the same performance improvements as a 4-5-dB increase in the SNR of a diotic stimulus. These results suggest that distance-dependent changes in the interaural difference cues of nearby sources play a much larger role in the reduction of the informational masking produced by an interfering speech signal than in the reduction of the energetic masking produced by an interfering noise source.

16.
In tone-on-tone masking, thresholds often decrease as the onset of the signal is delayed relative to the onset of the masker, especially when the frequency of the masker is higher than the frequency of the signal. This temporal effect was studied here by using a tonal "precursor," whose offset preceded the onset of the tonal masker (and signal). Under the right conditions, the precursor can reduce or eliminate the temporal effect by decreasing the threshold for a signal at masker onset, presumably for the same reason that the threshold decreases as a signal is delayed relative to the onset of a masker. In the present study, the frequency of the signal was 4000 Hz, and the frequency of the masker and precursor was typically 5000 Hz. In experiment 1, the precursor was presented to the ear receiving the masker and signal (ipsilateral precursor); in experiment 2, it was presented to the opposite ear (contralateral precursor). The results from experiment 1 can be summarized as follows: the ipsilateral precursor (a) reaches its maximum effectiveness (in reducing the temporal effect) for precursor durations of 200-400 ms; (b) is ineffective once the delay between its offset and the onset of the masker reaches about 50-100 ms; (c) is generally ineffective when its level is 10 or more dB lower than the level of the masker, but is effective when its level is equal to or greater than the level of the masker; and (d) becomes progressively less effective as its frequency is either increased or decreased relative to the frequency of the masker. The results from experiment 2 can be summarized simply by stating that the contralateral precursor is ineffective in reducing the temporal effect. These results suggest that the effect of the precursor may be mediated peripherally.

17.
In forward masking, performance may be affected by confusion, that is, by the difficulty of discriminating a suprathreshold signal from the preceding masker. This study investigated confusion effects for forward maskers composed of repeated bursts of a 100-Hz sinusoid followed by sinusoidal signals; such "pulsing" maskers produce confusion when the properties of the signal are identical to those of an individual masker "pulse." The level, frequency, and duration of the signal relative to an individual masker pulse, as well as offset-onset delay, were varied to determine the minimum change necessary to eliminate confusion. For maskers composed of 20-ms pulses, confusion was eliminated by changes in signal level of 5 dB or changes in signal frequency of 30 to 40 Hz. For maskers composed of 10-, 20-, or 40-ms pulses, confusion was eliminated by signal delays of 8 to 16 ms or by signal durations less than half or greater than twice the masker-pulse duration. Results with adaptive procedures designed to measure confusion-free or confusion-determined thresholds suggest that confusion effects can be minimized or avoided by extensive listener training with a procedure in which the signal and masker are not presented at similar intensities.

18.
Speech recognition performance was measured in normal-hearing and cochlear-implant listeners with maskers consisting of either steady-state speech-spectrum-shaped noise or a competing sentence. Target sentences from a male talker were presented in the presence of one of three competing talkers (same male, different male, or female) or speech-spectrum-shaped noise generated from this talker at several target-to-masker ratios. For the normal-hearing listeners, target-masker combinations were processed through a noise-excited vocoder designed to simulate a cochlear implant. With unprocessed stimuli, a normal-hearing control group maintained high levels of intelligibility down to target-to-masker ratios as low as 0 dB and showed a release from masking, producing better performance with single-talker maskers than with steady-state noise. In contrast, no masking release was observed in either implant or normal-hearing subjects listening through an implant simulation. The performance of the simulation and implant groups did not improve when the single-talker masker was a different talker compared to the same talker as the target speech, as was found in the normal-hearing control. These results are interpreted as evidence for a significant role of informational masking and modulation interference in cochlear implant speech recognition with fluctuating maskers. This informational masking may originate from increased target-masker similarity when spectral resolution is reduced.
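Setting a target-to-masker ratio usually means scaling the masker so that the ratio of target to masker level, expressed in dB, equals the desired value. The sketch below assumes the ratio is defined on overall RMS level, which the abstract does not specify; the helper names and sample values are hypothetical.

```python
import math

def rms(samples):
    """Root-mean-square level of a non-empty sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def masker_gain_for_tmr(target, masker, tmr_db):
    """Linear gain to apply to the masker so that
    20*log10(rms(target) / rms(gain * masker)) equals tmr_db."""
    return rms(target) / (rms(masker) * 10.0 ** (tmr_db / 20.0))

# Toy waveforms: the masker starts at twice the target's RMS level.
gain = masker_gain_for_tmr([1.0, -1.0, 1.0, -1.0], [2.0, -2.0, 2.0, -2.0], 0.0)
```

At a target-to-masker ratio of 0 dB the scaled masker has the same RMS level as the target, which for the toy waveforms above corresponds to a gain of 0.5.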

19.
Spatial unmasking describes the improvement in the detection or identification of a target sound afforded by separating it spatially from simultaneous masking sounds. This effect has been studied extensively for speech intelligibility in the presence of interfering sounds. In the current study, listeners identified zebra finch song, which shares many acoustic properties with speech but lacks semantic and linguistic content. Three maskers with the same long-term spectral content but different short-term statistics were used: (1) chorus (combinations of unfamiliar zebra finch songs), (2) song-shaped noise (broadband noise with the average spectrum of chorus), and (3) chorus-modulated noise (song-shaped noise multiplied by the broadband envelope from a chorus masker). The amount of masking and spatial unmasking depended on the masker and there was evidence of release from both energetic and informational masking. Spatial unmasking was greatest for the statistically similar chorus masker. For the two noise maskers, there was less spatial unmasking and it was wholly accounted for by the relative target and masker levels at the acoustically better ear. The results share many features with analogous results using speech targets, suggesting that spatial separation aids in the segregation of complex natural sounds through mechanisms that are not specific to speech.

20.
This study investigated the benefit of a priori cues in a masked nonspeech pattern identification experiment. Targets were narrowband sequences of tone bursts forming six easily identifiable frequency patterns selected randomly on each trial. The frequency band containing the target was randomized. Maskers were also narrowband sequences of tone bursts chosen randomly on every trial. Targets and maskers were presented monaurally in mutually exclusive frequency bands, producing large amounts of informational masking. Cuing the masker produced a significant improvement in performance, while holding the target frequency band constant provided no benefit. The cue providing the greatest benefit was a copy of the masker presented ipsilaterally before the target-plus-masker. The masker cue presented contralaterally and a notched-noise cue produced smaller benefits. One possible mechanism underlying these findings is auditory "enhancement" in which the neural response to the target is increased relative to the masker by differential prior stimulation of the target and masker frequency regions. A second possible mechanism provides a benefit to performance by comparing the spectrotemporal correspondence of the cue and target-plus-masker and is effective for either ipsilateral or contralateral cue presentation. These effects improve identification performance by emphasizing spectral contrasts in sequences or streams of sounds.
