Similar Literature
Found 20 similar articles; search took 15 ms
1.
For normal-hearing (NH) listeners, masker energy outside the spectral region of a target signal can improve target detection and identification, a phenomenon referred to as comodulation masking release (CMR). This study examined whether, for cochlear implant (CI) listeners and for NH listeners presented with a "noise vocoded" CI simulation, speech identification in modulated noise is improved by a co-modulated flanking band. In Experiment 1, NH listeners identified noise-vocoded speech in a background of on-target noise with or without a flanking narrow band of noise outside the spectral region of the target. The on-target noise and flanker were either 16-Hz square-wave modulated with the same phase or were unmodulated; the speech was taken from a closed-set corpus. Performance was better in modulated than in unmodulated noise, and this difference was slightly greater when the comodulated flanker was present, consistent with a small CMR of about 1.7 dB for noise-vocoded speech. Experiment 2, which tested CI listeners using the same speech materials, found no advantage for modulated versus unmodulated maskers and no CMR. Thus although NH listeners can benefit from CMR even for speech signals with reduced spectro-temporal detail, no CMR was observed for CI users.

2.
Many competing noises in real environments are modulated or fluctuating in level. Listeners with normal hearing are able to take advantage of temporal gaps in fluctuating maskers. Listeners with sensorineural hearing loss show less benefit from modulated maskers. Cochlear implant users may be more adversely affected by modulated maskers because of their limited spectral resolution and their reliance on the envelope-based signal-processing strategies of implant processors. The current study evaluated cochlear implant users' ability to understand sentences in the presence of modulated speech-shaped noise. Normal-hearing listeners served as a comparison group. Listeners repeated IEEE sentences in quiet, steady noise, and modulated noise maskers. Maskers were presented at varying signal-to-noise ratios (SNRs) at six modulation rates varying from 1 to 32 Hz. Results suggested that normal-hearing listeners obtain significant release from masking from modulated maskers, especially at 8-Hz masker modulation frequency. In contrast, cochlear implant users experience very little release from masking from modulated maskers. The data suggest, in fact, that they may show negative effects of modulated maskers at syllabic modulation rates (2-4 Hz). Similar patterns of results were obtained from implant listeners using three different devices with different speech-processor strategies. The lack of release from masking occurs in implant listeners independent of their device characteristics, and may be attributable to the nature of implant processing strategies and/or the lack of spectral detail in processed stimuli.

3.
The present study investigated the effect of envelope modulations in a background masker on consonant recognition by normal-hearing listeners. It is well known that listeners understand speech better under a temporally modulated masker than under a steady masker at the same level, due to masking release. The possibility of an opposite phenomenon, modulation interference, whereby speech recognition could be degraded by a modulated masker due to interference with auditory processing of the speech envelope, was hypothesized and tested under various speech and masker conditions. It was of interest whether modulation interference for speech perception, if it were observed, could be predicted by modulation masking, as found in psychoacoustic studies using nonspeech stimuli. Results revealed that masking release measurably occurred under a variety of conditions, especially when the speech signal maintained a high degree of redundancy across several frequency bands. Modulation interference was also clearly observed under several circumstances when the speech signal did not contain a high redundancy. However, the effect of modulation interference did not follow the expected pattern from psychoacoustic modulation masking results. In conclusion, (1) both factors, modulation interference and masking release, should be accounted for whenever a background masker contains temporal fluctuations, and (2) caution needs to be taken when psychoacoustic theory on modulation masking is applied to speech recognition.

4.
The first part of this paper presents several experiments on signal detection in temporally modulated noise, yielding a general approach toward the concept of comodulation masking release (CMR). Measurements were made on masked thresholds of both long- and short-duration, narrow-band signals presented in a 100% sinusoidally amplitude-modulated (SAM) noise masker (modulation frequency 32 Hz), as a function of masker bandwidth from 1/3 oct up to 13/3 octs, while the masker band was geometrically centered at signal frequency. With the short-duration signals placed in the valley of the masker, a substantial CMR (i.e., a decrease of masked threshold with increasing masker bandwidth) was found, whereas for the long-duration signals CMR was smaller. Furthermore, investigations were carried out to determine whether CMR changes when the bandwidth of the signals, consisting of bandpass impulse responses, is increased. The data indicate that substantial CMR remains even when all masker bands contain a signal component, thus minimizing across-channel differences. This finding is not in line with current models accounting for the CMR phenomenon. The second part of this paper concerns signal detection in spectrally shaped noise. Also investigated was whether release from masking occurs for the detection of a pure-tone signal at a valley or a peak of a simultaneously presented masking noise with a sinusoidally rippled power spectrum, when this masker was preceded and followed by a second noise (temporal flanking burst) with an identical spectral shape as the on-signal noise. Similar to CMR effects for temporal modulations, the data indicate that coshaping masking release (CSMR) occurs when the signal is placed in a valley of the spectral envelope of the masker, whereas no release from masking is found when the signal is placed at a peak of the spectral envelope of the masker. The implications of these experiments for measures of spectral and temporal resolution are discussed.  
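
As a reading aid for the masker described in this abstract, a sinusoidally amplitude-modulated noise is commonly written as below. This is a sketch under assumptions not stated in the abstract: $n(t)$ denotes the unmodulated noise carrier, the modulator starts at zero phase, and the symbols $m$ and $f_m$ are introduced here only for illustration.

```latex
\[
  x(t) = \bigl[\, 1 + m \sin(2\pi f_m t) \,\bigr]\, n(t),
  \qquad m = 1 \ (100\%\ \text{modulation}), \quad f_m = 32\ \text{Hz}
\]
% With m = 1 the envelope falls to zero once per modulation period,
% at sin(2*pi*f_m*t) = -1:
\[
  t_k = \frac{3}{4 f_m} + \frac{k}{f_m}
      \approx 23.4\ \text{ms} + k \cdot 31.25\ \text{ms},
  \qquad k = 0, 1, 2, \dots
\]
```

Under these assumptions the masker "valleys" recur every 31.25 ms, which gives a sense of the time window available to a short-duration signal placed in a valley.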

5.
Talkers change the way they speak in noisy conditions. For energetic maskers, speech production changes are relatively well-understood, but less is known about how informational maskers such as competing speech affect speech production. The current study examines the effect of energetic and informational maskers on speech production by talkers speaking alone or in pairs. Talkers produced speech in quiet and in backgrounds of speech-shaped noise, speech-modulated noise, and competing speech. Relative to quiet, speech output level and fundamental frequency increased and spectral tilt flattened in proportion to the energetic masking capacity of the background. In response to modulated backgrounds, talkers were able to reduce substantially the degree of temporal overlap with the noise, with greater reduction for the competing speech background. Reduction in foreground-background overlap can be expected to lead to a release from both energetic and informational masking for listeners. Passive changes in speech rate, mean pause length or pause distribution cannot explain the overlap reduction, which appears instead to result from a purposeful process of listening while speaking. Talkers appear to monitor the background and exploit upcoming pauses, a strategy which is particularly effective for backgrounds containing intelligible speech.

6.
When a masking sound is spatially separated from a target speech signal, substantial releases from masking typically occur both for speech and noise maskers. However, when a delayed copy of the masker is also presented at the location of the target speech (a condition that has been referred to as the front target, right-front masker or F-RF configuration), the advantages of spatial separation vanish for noise maskers but remain substantial for speech maskers. This effect has been attributed to precedence, which introduces an apparent spatial separation between the target and masker in the F-RF configuration that helps the listener to segregate the target from a masking voice but not from a masking noise. In this study, virtual synthesis techniques were used to examine variations of the F-RF configuration in an attempt to more fully understand the stimulus parameters that influence the release from masking obtained in that condition. The results show that the release from speech-on-speech masking caused by the addition of the delayed copy of the masker is robust across a wide variety of source locations, masker locations, and masker delay values. This suggests that the speech unmasking that occurs in the F-RF configuration is not dependent on any single perceptual cue and may indicate that F-RF speech segregation is only partially based on the apparent left-right location of the RF masker.

7.
This study examined combinations of energetic and informational maskers in speech identification. Speech targets and maskers (speech or noise) were processed and filtered into sets of 15 narrow frequency bands. The target was the sum of eight randomly selected bands. More masking occurred for speech maskers than for spectrally matched noise maskers regardless of whether the masker bands overlapped the target bands. The greater effect of the speech maskers was interpreted as due to informational masking. When the masker consisted of nonoverlapping bands of speech, the addition of bands of noise overlapping the speech masker, but not the speech target, reduced the overall amount of masking. Surprisingly, presenting the noise to the ear contralateral to the target and masker produced an even greater release from masking. The contralateral noise was apparently sufficient to cause a slight change in the image of the ipsilateral speech masker, possibly pulling it away from the target enough to allow the focus of attention on the target. This finding is consistent with the interpretation that in some conditions small binaural differences may be sufficient to cause, or significantly strengthen, the perceptual segregation of sounds.

8.
To assess age-related differences in benefit from masker modulation, younger and older adults with normal hearing but not identical audiograms listened to nonsense syllables in each of two maskers: (1) a steady-state noise shaped to match the long-term spectrum of the speech, and (2) this same noise modulated by a 10-Hz square wave, resulting in an interrupted noise. An additional low-level broadband noise was always present which was shaped to produce equivalent masked thresholds for all subjects. This minimized differences in speech audibility due to differences in quiet thresholds among subjects. An additional goal was to determine if age-related differences in benefit from modulation could be explained by differences in thresholds measured in simultaneous and forward maskers. Accordingly, thresholds for 350-ms pure tones were measured in quiet and in each masker; thresholds for 20-ms signals in forward and simultaneous masking were also measured at selected signal frequencies. To determine if benefit from modulated maskers varied with masker spectrum and to provide a comparison with previous studies, a subgroup of younger subjects also listened in steady-state and interrupted noise that was not spectrally shaped. Articulation index (AI) values were computed and speech-recognition scores were predicted for steady-state and interrupted noise; predicted benefit from modulation was also determined. Masked thresholds of older subjects were slightly higher than those of younger subjects; larger age-related threshold differences were observed for short-duration than for long-duration signals. In steady-state noise, speech recognition for older subjects was poorer than for younger subjects, which was partially attributable to older subjects' slightly higher thresholds in these maskers. 
In interrupted noise, although predicted benefit was larger for older than younger subjects, scores improved more for younger than for older subjects, particularly at the higher noise level. This may be related to age-related increases in thresholds in steady-state noise and in forward masking, especially at higher frequencies. Benefit of interrupted maskers was larger for unshaped than for speech-shaped noise, consistent with AI predictions.
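
The "interrupted noise" in this abstract is a steady noise multiplied by a 10-Hz square wave. A minimal sketch of that gating step follows. Several details are assumptions made only for illustration and are not taken from the study: the function name `interrupted_noise`, the 16-kHz sample rate, the 50% duty cycle, and the use of white Gaussian noise in place of the speech-shaped noise the authors describe.

```python
import random


def interrupted_noise(duration_s, sample_rate_hz=16000, rate_hz=10, duty=0.5, seed=0):
    """Return Gaussian noise gated on and off by a square wave.

    Assumptions (hypothetical, for illustration): white noise carrier,
    50% duty cycle, gate starts in the "on" phase.
    """
    rng = random.Random(seed)
    n_samples = int(round(duration_s * sample_rate_hz))
    period = int(round(sample_rate_hz / rate_hz))   # samples per square-wave cycle
    on_samples = int(round(duty * period))          # samples with the gate open
    samples = []
    for i in range(n_samples):
        gate = 1.0 if (i % period) < on_samples else 0.0
        samples.append(gate * rng.gauss(0.0, 1.0))
    return samples


# 0.2 s of noise at 16 kHz: two full 100-ms cycles, each 50 ms on and 50 ms off.
noise = interrupted_noise(0.2)
```

The silent half of each cycle is the temporal gap in which a listener can "glimpse" the speech; the benefit from modulation discussed in the abstract is the score difference between this masker and its ungated counterpart.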

9.
Spatial unmasking describes the improvement in the detection or identification of a target sound afforded by separating it spatially from simultaneous masking sounds. This effect has been studied extensively for speech intelligibility in the presence of interfering sounds. In the current study, listeners identified zebra finch song, which shares many acoustic properties with speech but lacks semantic and linguistic content. Three maskers with the same long-term spectral content but different short-term statistics were used: (1) chorus (combinations of unfamiliar zebra finch songs), (2) song-shaped noise (broadband noise with the average spectrum of chorus), and (3) chorus-modulated noise (song-shaped noise multiplied by the broadband envelope from a chorus masker). The amount of masking and spatial unmasking depended on the masker and there was evidence of release from both energetic and informational masking. Spatial unmasking was greatest for the statistically similar chorus masker. For the two noise maskers, there was less spatial unmasking and it was wholly accounted for by the relative target and masker levels at the acoustically better ear. The results share many features with analogous results using speech targets, suggesting that spatial separation aids in the segregation of complex natural sounds through mechanisms that are not specific to speech.

10.
The amount of masking exerted by one speech sound on another can be reduced by presenting the masker twice, from two different locations in the horizontal plane, with one of the presentations delayed to simulate an acoustical reflection. Three experiments were conducted on various aspects of this phenomenon. Experiment 1 varied the number of masking talkers from one to three and the signal-to-noise (S/N) ratio from -12 to +4 dB. Evidence of masking release was found for every combination of these variables tested. For the most difficult conditions (multiple maskers and negative S/N) the amount of release was approximately 10 dB. Experiment 2 varied the timing of leading and lagging masker presentations over a broad range, to include shorter delay times where room reflections of speech are rarely noticed by listeners and longer delays where reflections can become disruptive. Substantial masking release was found for all of the shorter delay times tested, and negligible release was found at the longer delays. Finally, Experiment 3 used speech-spectrum noise as a masker and searched for possible energetic masking release as a function of the lead-lag time delay. Release of up to 4 dB was found whenever delays were 2 ms or less. No energetic masking release was found at longer delays.

11.
Listeners with sensorineural hearing loss are poorer than listeners with normal hearing at understanding one talker in the presence of another. This deficit is more pronounced when competing talkers are spatially separated, implying a reduced "spatial benefit" in hearing-impaired listeners. This study tested the hypothesis that this deficit is due to increased masking specifically during the simultaneous portions of competing speech signals. Monosyllabic words were compressed to a uniform duration and concatenated to create target and masker sentences with three levels of temporal overlap: 0% (non-overlapping in time), 50% (partially overlapping), or 100% (completely overlapping). Listeners with hearing loss performed particularly poorly in the 100% overlap condition, consistent with the idea that simultaneous speech sounds are most problematic for these listeners. However, spatial release from masking was reduced in all overlap conditions, suggesting that increased masking during periods of temporal overlap is only one factor limiting spatial unmasking in hearing-impaired listeners.

12.
This study investigated whether speech-like maskers without linguistic content produce informational masking of speech. The target stimuli were nonsense Mandarin Chinese sentences. In experiment I, the masker contained harmonics the fundamental frequency (F0) of which was sinusoidally modulated and the mean F0 of which was varied. The magnitude of informational masking was evaluated by measuring the change in intelligibility (releasing effect) produced by inducing a perceived spatial separation of the target speech and masker via the precedence effect. The releasing effect was small and was only clear when the target and masker had the same mean F0, suggesting that informational masking was small. Performance with the harmonic maskers was better than with a steady speech-shaped noise (SSN) masker. In experiments II and III, the maskers were speech-like synthesized signals, alternating between segments with harmonic structure and segments composed of SSN. Performance was much worse than for experiment I, and worse than when an SSN masker was used, suggesting that substantial informational masking occurred. The similarity of the F0 contours of the target and masker had little effect. The informational masking effect was not influenced by whether or not the noise-like segments of the masker were synchronous with the unvoiced segments of the target speech.

13.
The speech-reception threshold (SRT) for sentences presented in a fluctuating interfering background sound of 80 dBA SPL is measured for 20 normal-hearing listeners and 20 listeners with sensorineural hearing impairment. The interfering sounds range from steady-state noise, via modulated noise, to a single competing voice. Two voices are used, one male and one female, and the spectrum of the masker is shaped according to these voices. For both voices, the SRT is measured both in noise spectrally shaped according to the target voice and in noise shaped according to the other voice. The results show that, for normal-hearing listeners, the SRT for sentences in modulated noise is 4-6 dB lower than for steady-state noise; for sentences masked by a competing voice, this difference is 6-8 dB. For listeners with moderate sensorineural hearing loss, elevated thresholds are obtained without an appreciable effect of masker fluctuations. The implications of these results for estimating a hearing handicap in everyday conditions are discussed. By using the articulation index (AI), it is shown that hearing-impaired individuals perform more poorly than suggested by the loss of audibility for some parts of the speech signal. Finally, three mechanisms are discussed that contribute to the absence of unmasking by masker fluctuations in hearing-impaired listeners. The low sensation level at which the impaired listeners receive the masker seems a major determinant. The second and third factors are reduced temporal resolution and a reduction in comodulation masking release, respectively.
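
The "4-6 dB lower" figure in this abstract is a masking release: the SRT in steady-state noise minus the SRT in the fluctuating masker. A minimal sketch of that arithmetic follows. The function name `masking_release_db` and the two SRT values are hypothetical and chosen only to illustrate the sign convention; they are not data from the study.

```python
def masking_release_db(srt_steady_db, srt_fluctuating_db):
    """Masking release in dB: positive when the fluctuating masker is easier.

    A lower (more negative) SRT means the listener tolerates more noise.
    """
    return srt_steady_db - srt_fluctuating_db


# Hypothetical SRTs (dB SNR), for illustration only.
release = masking_release_db(srt_steady_db=-5.0, srt_fluctuating_db=-10.0)
```

With this convention, a listener who gains nothing from masker fluctuations, as reported here for listeners with moderate sensorineural hearing loss, has a release near 0 dB.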

14.
Across-frequency processing by common interaural time delay (ITD) in spatial unmasking was investigated by measuring speech reception thresholds (SRTs) for high- and low-frequency bands of target speech presented against concurrent speech or a noise masker. Experiment 1 indicated that presenting one of these target bands with an ITD of +500 μs and the other with zero ITD (like the masker) provided some release from masking, but full binaural advantage was only measured when both target bands were given an ITD of +500 μs. Experiment 2 showed that full binaural advantage could also be achieved when the high- and low-frequency bands were presented with ITDs of equal but opposite magnitude (±500 μs). In experiment 3, the masker was also split into high- and low-frequency bands with ITDs of equal but opposite magnitude (±500 μs). The ITD of the low-frequency target band matched that of the high-frequency masking band and vice versa. SRTs indicated that, as long as the target and masker differed in ITD within each frequency band, full binaural advantage could be achieved. These results suggest that the mechanism underlying spatial unmasking exploits differences in ITD independently within each frequency channel.
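
To make the 500-microsecond ITD in this abstract concrete, the sketch below converts an ITD to a whole-sample delay and applies it to one channel of a mono signal. It rests on assumptions not found in the abstract: the helper names `itd_to_samples` and `apply_itd`, a 48-kHz sample rate, rounding to whole samples, and the convention that a positive ITD means the left channel leads.

```python
def itd_to_samples(itd_us, sample_rate_hz=48000):
    """Convert an ITD in microseconds to the nearest whole number of samples."""
    return int(round(itd_us * sample_rate_hz / 1_000_000))


def apply_itd(mono, itd_us, sample_rate_hz=48000):
    """Return (left, right) channels with the given ITD applied.

    Assumed convention: positive ITD delays the right channel (left leads),
    negative ITD delays the left channel. Both outputs have equal length.
    """
    shift = itd_to_samples(abs(itd_us), sample_rate_hz)
    leading = list(mono) + [0.0] * shift   # padded at the end
    lagging = [0.0] * shift + list(mono)   # padded at the start
    if itd_us >= 0:
        return leading, lagging
    return lagging, leading


# A 3-sample click with +500 microseconds ITD: 24 samples of delay at 48 kHz.
left, right = apply_itd([1.0, 2.0, 3.0], 500)
```

Applying `apply_itd` separately to band-filtered versions of a signal, with opposite signs per band, would correspond in spirit to the opposite-ITD conditions of Experiments 2 and 3, although the band filtering itself is not shown here.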

15.
The present study sought to clarify the role of non-simultaneous masking in the binaural masking level difference for maskers that fluctuate in level. In the first experiment the signal was a brief 500-Hz tone, and the masker was a bandpass noise (100-2000 Hz), with the initial and final 200-ms bursts presented at 40-dB spectrum level and the inter-burst gap presented at 20-dB spectrum level. Temporal windows were fitted to thresholds measured for a range of gap durations and signal positions within the gap. In the second experiment, individual differences in out of phase (NoSπ) thresholds were compared for a brief signal in a gapped bandpass masker, a brief signal in a steady bandpass masker, and a long signal in a narrowband (50-Hz-wide) noise masker. The third experiment measured brief tone detection thresholds in forward, simultaneous, and backward masking conditions for a 50- and for a 1900-Hz-wide noise masker centered on the 500-Hz signal frequency. Results are consistent with comparable temporal resolution in the in phase (NoSo) and NoSπ conditions and no effect of temporal resolution on individual observers' ability to utilize binaural cues in narrowband noise. The large masking release observed for a narrowband noise masker may be due to binaural masking release from non-simultaneous, informational masking.

16.
The present study examined the relative influence of the off- and on-frequency spectral components of modulated and unmodulated maskers on consonant recognition. Stimuli were divided into 30 contiguous equivalent rectangular bandwidths. The temporal fine structure (TFS) in each "target" band was either left intact or replaced with tones using vocoder processing. Recognition scores for 10, 15 and 20 target bands randomly located in frequency were obtained in quiet and in the presence of all 30 masker bands, only the off-frequency masker bands, or only the on-frequency masker bands. The amount of masking produced by the on-frequency bands was generally comparable to that produced by the broadband masker. However, the difference between these two conditions was often significant, indicating an influence of the off-frequency masker bands, likely through modulation interference or spectral restoration. Although vocoder processing systematically led to poorer consonant recognition scores, the deficit observed in noise could often be attributed to that observed in quiet. These data indicate that (i) speech recognition is affected by the off-frequency components of the background and (ii) the nature of the target TFS does not systematically affect speech recognition in noise, especially when energetic masking and/or the number of target bands is limited.

17.
Spoken communication in a non-native language is especially difficult in the presence of noise. This study compared English and Spanish listeners' perceptions of English intervocalic consonants as a function of masker type. Three maskers (stationary noise, multitalker babble, and competing speech) provided varying amounts of energetic and informational masking. Competing English and Spanish speech maskers were used to examine the effect of masker language. Non-native performance fell short of that of native listeners in quiet, but a larger performance differential was found for all masking conditions. Both groups performed better in competing speech than in stationary noise, and both suffered most in babble. Since babble is a less effective energetic masker than stationary noise, these results suggest that non-native listeners are more adversely affected by both energetic and informational masking. A strong correlation was found between non-native performance in quiet and degree of deterioration in noise, suggesting that non-native phonetic category learning can be fragile. A small effect of language background was evident: English listeners performed better when the competing speech was Spanish.

18.
These experiments examine how comodulation masking release (CMR) varies with masker bandwidth, modulator bandwidth, and signal duration. In experiment 1, thresholds were measured for a 400-ms, 2000-Hz signal masked by continuous noise varying in bandwidth from 50-3200 Hz in 1-oct steps. In one condition, using random noise maskers, thresholds increased with increasing bandwidth up to 400 Hz and then remained approximately constant. In another set of conditions, the masker was multiplied (amplitude modulated) by a low-pass noise (bandwidth varied from 12.5-400 Hz in 1-oct steps). This produced correlated envelope fluctuations across frequency. Thresholds were generally lower than for random noise maskers with the same bandwidth. For maskers less than one critical band wide, the release from masking was largest (about 5 dB) for maskers with low rates of modulation (12.5-Hz-wide low-pass modulator). It is argued that this release from masking is not a "true" CMR but results from a within-channel cue. For broadband maskers (greater than 400 Hz), the release from masking increased with increasing masker bandwidth and decreasing modulator bandwidth, reaching an asymptote of 12 dB for a masker bandwidth of 800 Hz and a modulator bandwidth of 50 Hz. Most of this release from masking can be attributed to a CMR. In experiment 2, the modulator bandwidth was fixed at 12.5 Hz and the signal duration was varied. For masker bandwidths greater than 400 Hz, the CMR decreased from 12 to 5 dB as the signal duration was decreased from 400 to 25 ms. (ABSTRACT TRUNCATED AT 250 WORDS)

19.
This study investigated comodulation detection differences (CDD) for fixed- and roved-frequency maskers. The objective was to determine whether CDD could be accounted for better in terms of energetic masking or in terms of perceptual fusion/segregation related to comodulation. Roved-frequency maskers were used in order to minimize the role of energetic masking, allowing possible effects related to perceptual fusion/segregation to be revealed. The signals and maskers were composed of 30-Hz-wide noise bands. The signal was either comodulated with the masker (A/A condition) or had a temporal envelope that was independent (A/B condition). The masker was either gated synchronously with the signal or had a leading temporal fringe of 200 ms. In the fixed-frequency masker conditions, listeners with low A/A thresholds showed little masking release due to masker temporal fringe and had CDDs that could be accounted for by energetic masking. Listeners with higher A/A thresholds in the fixed-frequency masker conditions showed relatively large CDDs and large masking release due to a masker temporal fringe. The CDDs of these listeners may have arisen, at least in part, from processes related to perceptual segregation. Some listeners in the roved masker conditions also had large CDDs that appeared to be related to perceptual segregation.

20.
The Speech Reception Threshold for sentences in stationary noise and in several amplitude-modulated noises was measured for 8 normal-hearing listeners, 29 sensorineural hearing-impaired listeners, and 16 normal-hearing listeners with simulated hearing loss. This approach makes it possible to determine whether the reduced benefit from masker modulations, as often observed for hearing-impaired listeners, is due to a loss of signal audibility, or due to suprathreshold deficits, such as reduced spectral and temporal resolution, which were measured in four separate psychophysical tasks. Results show that the reduced masking release can only partly be accounted for by reduced audibility, and that, when considering suprathreshold deficits, the normal effects associated with a raised presentation level should be taken into account. In this perspective, reduced spectral resolution does not appear to qualify as an actual suprathreshold deficit, while reduced temporal resolution does. Temporal resolution and age are shown to be the main factors governing masking release for speech in modulated noise, accounting for more than half of the intersubject variance. Their influence appears to be related to the processing of mainly the higher stimulus frequencies. Results based on calculations of the Speech Intelligibility Index in modulated noise confirm these conclusions.
