Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
These experiments examine how comodulation masking release (CMR) varies with masker bandwidth, modulator bandwidth, and signal duration. In experiment 1, thresholds were measured for a 400-ms, 2000-Hz signal masked by continuous noise varying in bandwidth from 50-3200 Hz in 1-oct steps. In one condition, using random noise maskers, thresholds increased with increasing bandwidth up to 400 Hz and then remained approximately constant. In another set of conditions, the masker was multiplied (amplitude modulated) by a low-pass noise (bandwidth varied from 12.5-400 Hz in 1-oct steps). This produced correlated envelope fluctuations across frequency. Thresholds were generally lower than for random noise maskers with the same bandwidth. For maskers less than one critical band wide, the release from masking was largest (about 5 dB) for maskers with low rates of modulation (12.5-Hz-wide low-pass modulator). It is argued that this release from masking is not a "true" CMR but results from a within-channel cue. For broadband maskers (greater than 400 Hz), the release from masking increased with increasing masker bandwidth and decreasing modulator bandwidth, reaching an asymptote of 12 dB for a masker bandwidth of 800 Hz and a modulator bandwidth of 50 Hz. Most of this release from masking can be attributed to a CMR. In experiment 2, the modulator bandwidth was fixed at 12.5 Hz and the signal duration was varied. For masker bandwidths greater than 400 Hz, the CMR decreased from 12 to 5 dB as the signal duration was decreased from 400 to 25 ms. (ABSTRACT TRUNCATED AT 250 WORDS)

2.
Spectrally shaped steady noise is commonly used as a masker of speech. The effects of inherent random fluctuations in amplitude of such a noise are typically ignored. Here, the importance of these random fluctuations was assessed by comparing two cases. For one, speech was mixed with steady speech-shaped noise and N-channel tone vocoded, a process referred to as signal-domain mixing (SDM); this preserved the random fluctuations of the noise. For the second, the envelope of speech alone was extracted for each vocoder channel and a constant was added corresponding to the root-mean-square value of the noise envelope for that channel. This is referred to as envelope-domain mixing (EDM); it removed the random fluctuations of the noise. Sinusoidally modulated noise and a single talker were also used as backgrounds, with both SDM and EDM. Speech intelligibility was measured for N = 12, 19, and 30, with the target-to-background ratio fixed at -7 dB. For SDM, performance was best for the speech background and worst for the steady noise. For EDM, this pattern was reversed. Intelligibility with steady noise was consistently very poor for SDM, but near-ceiling for EDM, demonstrating that the random fluctuations in steady noise have a large effect.
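The envelope-domain mixing step described above can be sketched as follows. This is a minimal illustration, assuming each vocoder channel's envelope is already available as a list of samples; the function names `rms` and `edm_envelope` are hypothetical and not taken from the study.

```python
import math


def rms(values):
    """Root-mean-square value of a sequence of samples."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def edm_envelope(speech_env, noise_env):
    """Envelope-domain mixing for one channel (illustrative sketch).

    The noise envelope is reduced to a single constant (its RMS value),
    which is added to the speech envelope. The sample-to-sample
    fluctuations of the noise envelope are therefore discarded.
    """
    constant = rms(noise_env)
    return [s + constant for s in speech_env]
```

In signal-domain mixing, by contrast, the speech and noise waveforms would be summed before the vocoder extracts envelopes, so the noise fluctuations would survive in the result.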

3.
Background noise reduces the depth of the low-frequency envelope modulations known to be important for speech intelligibility. The relative strength of the target and masker envelope modulations can be quantified using a modulation signal-to-noise ratio measure, (S/N)_mod. Such a measure can be used in noise-suppression algorithms to extract target-relevant modulations from the corrupted (target + masker) envelopes for potential improvement in speech intelligibility. In the present study, envelopes are decomposed in the modulation spectral domain into a number of channels spanning the range of 0-30 Hz. Target-dominant modulations are identified and retained in each channel based on the (S/N)_mod selection criterion, while modulations which potentially interfere with perception of the target (i.e., those dominated by the masker) are discarded. The impact of modulation-selective processing on the speech-reception threshold for sentences in noise is assessed with normal-hearing listeners. Results indicate that the intelligibility of noise-masked speech can be improved by as much as 13 dB when preserving target-dominant modulations, present up to a modulation frequency of 18 Hz, while discarding masker-dominant modulations from the mixture envelopes.

4.
This study investigated the effects of simulated cochlear-implant processing on speech reception in a variety of complex masking situations. Speech recognition was measured as a function of target-to-masker ratio, processing condition (4, 8, 24 channels, and unprocessed) and masker type (speech-shaped noise, amplitude-modulated speech-shaped noise, single male talker, and single female talker). The results showed that simulated implant processing was more detrimental to speech reception in fluctuating interference than in steady-state noise. Performance in the 24-channel processing condition was substantially poorer than in the unprocessed condition, despite the comparable representation of the spectral envelope. The detrimental effects of simulated implant processing in fluctuating maskers, even with large numbers of channels, may be due to the reduction in the pitch cues used in sound source segregation, which are normally carried by the peripherally resolved low-frequency harmonics and the temporal fine structure. The results suggest that using steady-state noise to test speech intelligibility may underestimate the difficulties experienced by cochlear-implant users in fluctuating acoustic backgrounds.

5.
In the n-of-m strategy, the signal is processed through m bandpass filters from which only the n maximum envelope amplitudes are selected for stimulation. While this maximum selection criterion, adopted in the advanced combination encoder strategy, works well in quiet, it can be problematic in noise as it is sensitive to the spectral composition of the input signal and does not account for situations in which the masker completely dominates the target. A new selection criterion is proposed based on the signal-to-noise ratio (SNR) of individual channels. The new criterion selects target-dominated (SNR ≥ 0 dB) channels and discards masker-dominated (SNR < 0 dB) channels. Experiment 1 assessed cochlear implant users' performance with the proposed strategy assuming that the channel SNRs are known. Results indicated that the proposed strategy can restore speech intelligibility to the level attained in quiet independent of the type of masker (babble or continuous noise) and SNR level (0-10 dB) used. Results from experiment 2 showed that a 25% error rate can be tolerated in channel selection without compromising speech intelligibility. Overall, the findings from the present study suggest that the SNR criterion is an effective selection criterion for n-of-m strategies with the potential of restoring speech intelligibility.
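The two selection criteria contrasted above can be sketched side by side. This is a minimal illustration, assuming per-channel envelope amplitudes and per-channel target and masker powers are known and strictly positive (as experiment 1 assumes the channel SNRs are known); the function names and the `threshold_db` parameter are hypothetical.

```python
import math


def select_n_of_m(envelopes, n):
    """Maximum selection: indices of the n largest envelope amplitudes."""
    ranked = sorted(range(len(envelopes)),
                    key=lambda i: envelopes[i], reverse=True)
    return sorted(ranked[:n])


def select_by_snr(target_power, masker_power, threshold_db=0.0):
    """SNR criterion: keep channels whose SNR is at or above the threshold.

    Powers are assumed to be strictly positive. With the default
    threshold, target-dominated channels (SNR >= 0 dB) are kept and
    masker-dominated channels (SNR < 0 dB) are discarded.
    """
    selected = []
    for i, (t, m) in enumerate(zip(target_power, masker_power)):
        snr_db = 10.0 * math.log10(t / m)
        if snr_db >= threshold_db:
            selected.append(i)
    return selected
```

Note that maximum selection always returns exactly n channels, whereas the SNR criterion returns a variable number, including none when the masker dominates every channel.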

6.
Modulation masking: effects of modulation frequency, depth, and phase   (Total citations: 1; self-citations: 0; citations by others: 1)
Modulation thresholds were measured for a sinusoidally amplitude-modulated (SAM) broadband noise in the presence of a SAM broadband background noise with a modulation depth (mm) of 0.00, 0.25, or 0.50, where the condition mm = 0.00 corresponds to standard (unmasked) modulation detection. The modulation frequency of the masker was 4, 16, or 64 Hz; the modulation frequency of the signal ranged from 2-512 Hz. The greatest amount of modulation masking (masked threshold minus unmasked threshold) typically occurred when the signal frequency was near the masker frequency. The modulation masking patterns (amount of modulation masking versus signal frequency) for the 4-Hz masker were low pass, whereas the patterns for the 16- and 64-Hz maskers were somewhat bandpass (although not strictly so). In general, the greater the modulation depth of the masker, the greater the amount of modulation masking (although this trend was reversed for the 4-Hz masker at high signal frequencies). These modulation-masking data suggest that there are channels in the auditory system which are tuned for the detection of modulation frequency, much like there are channels (critical bands or auditory filters) tuned for the detection of spectral frequency.
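The quantities named above can be made concrete with a short sketch. It assumes the conventional SAM envelope 1 + m·sin(2πf_m t), which the abstract does not spell out, and the function names are hypothetical. The amount of modulation masking follows the abstract's definition: masked threshold minus unmasked threshold.

```python
import math


def sam_envelope(depth, mod_freq_hz, duration_s, sample_rate_hz):
    """Sampled SAM envelope 1 + m*sin(2*pi*f_m*t) (assumed standard form)."""
    n = int(round(duration_s * sample_rate_hz))
    return [1.0 + depth * math.sin(2.0 * math.pi * mod_freq_hz * i / sample_rate_hz)
            for i in range(n)]


def modulation_depth(envelope):
    """Estimate depth m from envelope extremes: (max - min) / (max + min)."""
    hi, lo = max(envelope), min(envelope)
    return (hi - lo) / (hi + lo)


def modulation_masking(masked_threshold, unmasked_threshold):
    """Amount of modulation masking: masked minus unmasked threshold."""
    return masked_threshold - unmasked_threshold
```

A depth of 0.00 yields a flat envelope, which corresponds to the unmasked reference condition in the abstract.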

7.
These experiments on across-channel masking (ACM) and comodulation masking release (CMR) were designed to extend the work of Grose and Hall [J. Acoust. Soc. Am. 85, 1276-1284 (1989)] on CMR. They investigated the effect of the temporal position of a brief 700-Hz signal relative to the modulation cycle of a 700-Hz masker 100% sinusoidally amplitude modulated (SAM) at a 10-Hz rate, which was either presented alone (reference masker) or formed part of a masker consisting of the 3rd to 11th harmonics of a 100-Hz fundamental. In the harmonic maskers, each harmonic was either SAM with the same 10-Hz modulator phase (comodulated masker) or with a shift in modulator phase of 90 degrees for each successive harmonic (phase-incoherent masker). When the signal was presented at the dips of the envelope of the 700-Hz component, the comodulated masker gave lower thresholds than the reference masker, while the phase-incoherent masker gave higher thresholds, i.e., a CMR was observed. No CMR was found when the signal was presented at the peaks of the envelope. In experiment 1, we replicated the experiment of Grose and Hall, but with an additional condition in which the 600- and 800-Hz components were removed from the masker, in order to investigate the role of within-channel masking effects. The results were similar to those of Grose and Hall. In experiment 2, the signal was added at the peaks of the envelope of the 700-Hz component, but in antiphase to the carrier of that component and at a level chosen to transform the peaks into dips. No CMR was found. Rather, performance was worse for both the comodulated and phase-incoherent maskers than for the reference masker. This was true even when the flanking components in the maskers were all remote in frequency from 700 Hz. In experiment 3, the masker components were all 50% SAM and the signal was added in antiphase at a dip of the envelope of the 700-Hz component, thus making the dip deeper. Performance was worse for the phase-incoherent than for the reference masker and was worse still for the comodulated masker. The results of all three experiments indicate strong ACM effects. CMR was found only when the signal was placed in the dips of the masker envelope and when it produced an increase in level relative to that in adjacent bands.
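The difference between the comodulated and phase-incoherent harmonic maskers can be sketched as below. This is an illustration only: the envelope form 1 + m·sin(2πf_m t + φ), the zero carrier starting phases, and assigning a modulator phase of zero to the lowest harmonic are all assumptions, and the function names are hypothetical.

```python
import math


def modulator_phases(num_components, phase_step_deg):
    """Modulator phase (radians) per harmonic, stepping by a fixed amount.

    A step of 0 degrees gives a comodulated masker; a step of 90 degrees
    gives the phase-incoherent masker.
    """
    return [math.radians((k * phase_step_deg) % 360.0)
            for k in range(num_components)]


def component_envelopes(t, phases, mod_rate_hz=10.0, depth=1.0):
    """Instantaneous SAM envelope of every component at time t (seconds)."""
    return [1.0 + depth * math.sin(2.0 * math.pi * mod_rate_hz * t + p)
            for p in phases]


def masker_sample(t, phases, f0_hz=100.0, first_harmonic=3,
                  mod_rate_hz=10.0, depth=1.0):
    """One waveform sample: sum of SAM harmonics (carrier phases assumed 0)."""
    envs = component_envelopes(t, phases, mod_rate_hz, depth)
    return sum(env * math.sin(2.0 * math.pi * (first_harmonic + k) * f0_hz * t)
               for k, env in enumerate(envs))
```

With nine components (harmonics 3 to 11), every envelope of the comodulated masker reaches its dip at the same instant, whereas in the phase-incoherent masker some flanking components are near their peaks at that moment.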

8.
Lutfi [J. Acoust. Soc. Am. 73, 262-267 (1983)] compared simultaneous masking functions (signal threshold versus masker level) for individual sinusoidal and narrow-band noise maskers, and for those maskers presented in pairs. Lutfi found that the pairs of maskers produced 10-17 dB "excess" masking over that predicted from the linear sum of their individual masking and explained the results in terms of a model in which the effects of the maskers are summed after undergoing independent compressive transformations. This paper describes experiments similar to those of Lutfi, and presents evidence suggesting that Lutfi's results may have been influenced by two factors: (1) combination-product detection, and (2) the use of different detection cues for single maskers and for pairs of maskers. Experiment I showed that when the stimulus conditions were chosen so as to minimize the likelihood of combination-product detection, "excess" masking was only 3-5 dB. Experiment II supported the idea that for a single narrow-band noise masker, subjects make use of the relatively slow envelope fluctuations to enhance performance. When two independent narrow-band noise maskers are added, the effectiveness of this cue is reduced, and between 3 and 9 dB of "excess" masking occurs. When the two noises are derived from the same source, and have correlated envelope fluctuations, no "excess" masking occurs. The results indicate that Lutfi's compressive-nonlinearity model clearly fails in some situations.

9.
The purpose of this investigation was to examine two stimulus parameters that were reasoned to be of importance to comodulation masking release (CMR). The first was the degree of fluctuation, or depth of modulation, in the masker bands, and the second was the temporal position of the signal with respect to the modulations of the masker. The investigation began by demonstrating the efficacy of sinusoidally amplitude-modulated (SAM) tonal complex maskers in eliciting CMR. "Nine-band" maskers, 650 ms in duration, were constructed by adding together nine SAM tones spaced at 100-Hz intervals from 300 to 1100 Hz. The rate of modulation for each SAM tone was 10 Hz, and the depth of modulation was 100%. Using such maskers, it was shown that when the on-frequency SAM tone had a modulation depth of 100%, the threshold for a 250-ms, 700-Hz tone improved monotonically as the modulation depths of the flanking SAM tones increased from 0% to 100%. When the on-frequency SAM tone had a modulation depth of 63%, some listeners performed optimally when the flanking SAM tones also exhibited a modulation depth of 63%, whereas others performed best when the flankers had modulation depths of 100%. With regard to signal position, a typical CMR effect was observed when the signal, consisting of a train of three 50-ms, 700-Hz tone bursts, was placed in the dips of the on-frequency masker. However, when the signal was placed at the peaks of the envelope, an increase in masking was observed for a comodulated masker. (ABSTRACT TRUNCATED AT 250 WORDS)

10.
A mathematical formula for estimating spatial release from masking (SRM) in a cocktail party environment would be useful as a simpler alternative to computationally intensive algorithms and may enhance understanding of underlying mechanisms. The experiment presented herein was designed to provide a strong test of a model that divides SRM into contributions of asymmetry and angular separation [Bronkhorst (2000). Acustica 86, 117-128] and to examine whether that model can be extended to include speech maskers. Across masker types the contribution to SRM of angular separation of maskers from the target was found to grow at a diminishing rate as angular separation increased within the frontal hemifield, contrary to predictions of the model. Speech maskers differed from noise maskers in the overall magnitude of SRM and in the contribution of angular separation (both greater for speech). These results were used to develop a modified model that achieved good fits to data for noise maskers (ρ=0.93) and for speech maskers (ρ=0.94) while using the same functions to describe separation and asymmetry components of SRM for both masker types. These findings suggest that this approach can be used to accurately model SRM for speech maskers in addition to primarily "energetic" noise maskers.

11.
This study examined combinations of energetic and informational maskers in speech identification. Speech targets and maskers (speech or noise) were processed and filtered into sets of 15 narrow frequency bands. The target was the sum of eight randomly selected bands. More masking occurred for speech maskers than for spectrally matched noise maskers regardless of whether the masker bands overlapped the target bands. The greater effect of the speech maskers was interpreted as due to informational masking. When the masker consisted of nonoverlapping bands of speech, the addition of bands of noise overlapping the speech masker, but not the speech target, reduced the overall amount of masking. Surprisingly, presenting the noise to the ear contralateral to the target and masker produced an even greater release from masking. The contralateral noise was apparently sufficient to cause a slight change in the image of the ipsilateral speech masker, possibly pulling it away from the target enough to allow the focus of attention on the target. This finding is consistent with the interpretation that in some conditions small binaural differences may be sufficient to cause, or significantly strengthen, the perceptual segregation of sounds.

12.
This study investigated whether speech-like maskers without linguistic content produce informational masking of speech. The target stimuli were nonsense Chinese Mandarin sentences. In experiment I, the masker contained harmonics the fundamental frequency (F0) of which was sinusoidally modulated and the mean F0 of which was varied. The magnitude of informational masking was evaluated by measuring the change in intelligibility (releasing effect) produced by inducing a perceived spatial separation of the target speech and masker via the precedence effect. The releasing effect was small and was only clear when the target and masker had the same mean F0, suggesting that informational masking was small. Performance with the harmonic maskers was better than with a steady speech-shaped noise (SSN) masker. In experiments II and III, the maskers were speech-like synthesized signals, alternating between segments with harmonic structure and segments composed of SSN. Performance was much worse than for experiment I, and worse than when an SSN masker was used, suggesting that substantial informational masking occurred. The similarity of the F0 contours of the target and masker had little effect. The informational masking effect was not influenced by whether or not the noise-like segments of the masker were synchronous with the unvoiced segments of the target speech.

13.
Spatial unmasking describes the improvement in the detection or identification of a target sound afforded by separating it spatially from simultaneous masking sounds. This effect has been studied extensively for speech intelligibility in the presence of interfering sounds. In the current study, listeners identified zebra finch song, which shares many acoustic properties with speech but lacks semantic and linguistic content. Three maskers with the same long-term spectral content but different short-term statistics were used: (1) chorus (combinations of unfamiliar zebra finch songs), (2) song-shaped noise (broadband noise with the average spectrum of chorus), and (3) chorus-modulated noise (song-shaped noise multiplied by the broadband envelope from a chorus masker). The amount of masking and spatial unmasking depended on the masker and there was evidence of release from both energetic and informational masking. Spatial unmasking was greatest for the statistically similar chorus masker. For the two noise maskers, there was less spatial unmasking and it was wholly accounted for by the relative target and masker levels at the acoustically better ear. The results share many features with analogous results using speech targets, suggesting that spatial separation aids in the segregation of complex natural sounds through mechanisms that are not specific to speech.

14.
When listeners hear a target signal in the presence of competing sounds, they are quite good at extracting information at instances when the local signal-to-noise ratio of the target is most favorable. Previous research suggests that listeners can easily understand a periodically interrupted target when it is interleaved with noise. It is not clear if this ability extends to the case where an interrupted target is alternated with a speech masker rather than noise. This study examined speech intelligibility in the presence of noise or speech maskers, which were either continuous or interrupted at one of six rates between 4 and 128 Hz. Results indicated that with noise maskers, listeners performed significantly better with interrupted, rather than continuous maskers. With speech maskers, however, performance was better in continuous, rather than interrupted masker conditions. Presumably the listeners used continuity as a cue to distinguish the continuous masker from the interrupted target. Intelligibility in the interrupted masker condition was improved by introducing a pitch difference between the target and speech masker. These results highlight the role that target-masker differences in continuity and pitch play in the segregation of competing speech signals.
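The alternation of an interrupted target with an interrupted masker can be sketched with a simple on/off gate. This is an illustration under stated assumptions: a square-wave gate with a 50% duty cycle and abrupt onsets, neither of which is specified in the abstract, and hypothetical function names.

```python
def interruption_gate(rate_hz, duration_s, sample_rate_hz, duty=0.5):
    """Square-wave gate: 1.0 during the 'on' part of each cycle, else 0.0.

    The 50% default duty cycle is an assumption for illustration.
    """
    n = int(round(duration_s * sample_rate_hz))
    gate = []
    for i in range(n):
        phase = (i * rate_hz / sample_rate_hz) % 1.0  # position within cycle
        gate.append(1.0 if phase < duty else 0.0)
    return gate


def interleave(target, masker, gate):
    """Pass the target while the gate is on and the masker while it is off."""
    return [g * t + (1.0 - g) * m for t, m, g in zip(target, masker, gate)]
```

A continuous masker would correspond to adding the ungated masker to the gated target instead of alternating the two.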

15.
A triadic comparisons task and an identification task were used to evaluate normally hearing listeners' and hearing-impaired listeners' perceptions of synthetic CV stimuli in the presence of competition. The competing signals included multitalker babble, continuous speech spectrum noise, a CV masker, and a brief noise masker shaped to resemble the onset spectrum of the CV masker. All signals and maskers were presented monotically. Interference by competition was assessed by comparing Multidimensional Scaling solutions derived from each masking condition to that derived from the baseline (quiet) condition. Analysis of the effects of continuous maskers revealed that multitalker babble and continuous noise caused the same amount of change in performance, as compared to the baseline condition, for all listeners. CV masking changed performance significantly more than did brief noise masking, and the hearing-impaired listeners experienced more degradation in performance than normals. Finally, the velar CV maskers (/gɛ/ and /kɛ/) caused significantly greater masking effects than the bilabial CV maskers (/bɛ/ and /pɛ/), and were most resistant to masking by other competing stimuli. The results suggest that speech intelligibility difficulties in the presence of competing segments of speech are primarily attributable to phonetic interference rather than to spectral masking. Individual differences in hearing-impaired listeners' performances are also discussed.

16.
Speech recognition performance was measured in normal-hearing and cochlear-implant listeners with maskers consisting of either steady-state speech-spectrum-shaped noise or a competing sentence. Target sentences from a male talker were presented in the presence of one of three competing talkers (same male, different male, or female) or speech-spectrum-shaped noise generated from this talker at several target-to-masker ratios. For the normal-hearing listeners, target-masker combinations were processed through a noise-excited vocoder designed to simulate a cochlear implant. With unprocessed stimuli, a normal-hearing control group maintained high levels of intelligibility down to target-to-masker ratios as low as 0 dB and showed a release from masking, producing better performance with single-talker maskers than with steady-state noise. In contrast, no masking release was observed in either implant or normal-hearing subjects listening through an implant simulation. The performance of the simulation and implant groups did not improve when the single-talker masker was a different talker compared to the same talker as the target speech, as was found in the normal-hearing control. These results are interpreted as evidence for a significant role of informational masking and modulation interference in cochlear implant speech recognition with fluctuating maskers. This informational masking may originate from increased target-masker similarity when spectral resolution is reduced.

17.
This experiment assessed the benefits of suppression and the impact of reduced or absent suppression on speech recognition in noise. Psychophysical suppression was measured in forward masking using tonal maskers and suppressors and band-limited noise maskers and suppressors. Subjects were 10 younger and 10 older adults with normal hearing, and 10 older adults with cochlear hearing loss. For younger subjects with normal hearing, suppression measured with noise maskers increased with masker level and was larger at 2.0 kHz than at 0.8 kHz. Less suppression was observed for older than younger subjects with normal hearing. There was little evidence of suppression for older subjects with cochlear hearing loss. Suppression measured with noise maskers and suppressors was larger in magnitude and more prevalent than suppression measured with tonal maskers and suppressors. The benefit of suppression to speech recognition in noise was assessed by obtaining scores for filtered consonant-vowel syllables as a function of the bandwidth of a forward masker. Speech-recognition scores in forward maskers should be higher than those in simultaneous maskers given that forward maskers are less effective than simultaneous maskers. If suppression also mitigated the effects of the forward masker and resulted in an improved signal-to-noise ratio, scores should decrease less in forward masking as forward-masker bandwidth increased, and differences between scores in forward and simultaneous maskers should increase, as was observed for younger subjects with normal hearing. Less or no benefit of suppression to speech recognition in noise was observed for older subjects with normal hearing or hearing loss. In general, as suppression measured with tonal signals increased, the combined benefit of forward masking and suppression to speech recognition in noise also increased.

18.
These experiments investigated whether perceptual cueing plays a role in the "unmasking" effects which have been observed in forward masking for narrow-band noise maskers and brief signals. The forward masking produced by a 100-Hz-wide noise masker at a level of 60 dB SPL was measured for a 1-kHz sinusoidal signal with a raised-cosine envelope and a duration of 10 ms at the 6-dB-down points, both for the masker alone, and with various components added to the masker (and gated synchronously with the masker). Unmasking was found to occur even for components which were extremely unlikely to produce a significant suppression of the masker: these included a 75-dB SPL 4-kHz sinusoid, a 50-dB SPL 1.4-kHz sinusoid, a noise low-pass filtered at 4 kHz with a spectrum level of 0 dB, and a noise low-pass filtered at 4 kHz with a spectrum level of 20 dB presented in the opposite ear to the masker-plus-signal. It is concluded that perceptual cueing can play a significant role in producing unmasking for brief signals following narrow-band noise maskers, and that it is unwise to interpret the unmasking solely in terms of suppression.

19.
Spatial unmasking of speech has traditionally been studied with target and masker at the same, relatively large distance. The present study investigated spatial unmasking for configurations in which the simulated sources varied in azimuth and could be either near or far from the head. Target sentences and speech-shaped noise maskers were simulated over headphones using head-related transfer functions derived from a spherical-head model. Speech reception thresholds were measured adaptively, varying target level while keeping the masker level constant at the "better" ear. Results demonstrate that small positional changes can result in very large changes in speech intelligibility when sources are near the listener as a result of large changes in the overall level of the stimuli reaching the ears. In addition, the difference in the target-to-masker ratios at the two ears can be substantially larger for nearby sources than for relatively distant sources. Predictions from an existing model of binaural speech intelligibility are in good agreement with results from all conditions comparable to those that have been tested previously. However, small but important deviations between the measured and predicted results are observed for other spatial configurations, suggesting that current theories do not accurately account for speech intelligibility for some of the novel spatial configurations tested.

20.
Vibrotactile masking of Pacinian and non-Pacinian channels   (Total citations: 1; self-citations: 0; citations by others: 1)
Vibrotactile masking functions were determined using sinusoidal and noise maskers. Results were nearly identical within the Pacinian (P) and non-Pacinian (NP) channels. At low masker SLs there was a substantial amount of negative masking which proved not to be an artifact of stimulus definition. The critical parameters for successful prediction of the data were a peripheral threshold and internal Gaussian noise. Threshold shifts in cross-channel stimulation can be attributed to the masker exceeding the detection threshold of the signal channel.
