Similar literature
20 similar documents found (search time: 15 ms)
1.
Speech reception thresholds (SRTs) were measured for target speech presented concurrently with interfering speech (spoken by a different speaker). In experiment 1, the target and interferer were divided spectrally into high- and low-frequency bands and presented over headphones in three conditions: monaural, dichotic (target and interferer to different ears), and swapped (the low-frequency target band and the high-frequency interferer band were presented to one ear, while the high-frequency target band and the low-frequency interferer band were presented to the other ear). SRTs were highest in the monaural condition and lowest in the dichotic condition; SRTs in the swapped condition were intermediate. In experiment 2, two new conditions were devised such that one target band was presented in isolation to one ear while the other band was presented at the other ear with the interferer. The pattern of SRTs observed in experiment 2 suggests that performance in the swapped condition reflects the intelligibility of the target frequency bands at just one ear; the auditory system appears unable to exploit advantageous target-to-interferer ratios at different ears when segregating target speech from a competing speech interferer.

2.
This study examined combinations of energetic and informational maskers in speech identification. Speech targets and maskers (speech or noise) were processed and filtered into sets of 15 narrow frequency bands. The target was the sum of eight randomly selected bands. More masking occurred for speech maskers than for spectrally matched noise maskers regardless of whether the masker bands overlapped the target bands. The greater effect of the speech maskers was interpreted as due to informational masking. When the masker consisted of nonoverlapping bands of speech, the addition of bands of noise overlapping the speech masker, but not the speech target, reduced the overall amount of masking. Surprisingly, presenting the noise to the ear contralateral to the target and masker produced an even greater release from masking. The contralateral noise was apparently sufficient to cause a slight change in the image of the ipsilateral speech masker, possibly pulling it away from the target enough to allow the focus of attention on the target. This finding is consistent with the interpretation that in some conditions small binaural differences may be sufficient to cause, or significantly strengthen, the perceptual segregation of sounds.

3.
A study was made of the effect of interaural time delay (ITD) and acoustic headshadow on binaural speech intelligibility in noise. A free-field condition was simulated by presenting recordings, made with a KEMAR manikin in an anechoic room, through earphones. Recordings were made of speech, reproduced in front of the manikin, and of noise, emanating from seven angles in the azimuthal plane, ranging from 0 degrees (frontal) to 180 degrees in steps of 30 degrees. From this noise, two signals were derived, one containing only ITD, the other containing only interaural level differences (ILD) due to headshadow. Using this material, speech reception thresholds (SRT) for sentences in noise were determined for a group of normal-hearing subjects. Results show that (1) for noise azimuths between 30 degrees and 150 degrees, the gain due to ITD lies between 3.9 and 5.1 dB, while the gain due to ILD ranges from 3.5 to 7.8 dB, and (2) ILD decreases the effectiveness of binaural unmasking due to ITD (on average, the threshold shift drops from 4.6 to 2.6 dB). In a second experiment, also conducted with normal-hearing subjects, similar stimuli were used, but now presented monaurally or with an overall 20-dB attenuation in one channel, in order to simulate hearing loss. In addition, SRTs were determined for noise with fixed ITDs, for comparison with the results obtained with head-induced (frequency dependent) ITDs. (ABSTRACT TRUNCATED AT 250 WORDS)

4.
This study investigates whether binaural signal detection is improved by the listener's previous knowledge about the interaural phase relations of masker and test signal. Binaural masked thresholds were measured for a 500-ms dichotic noise masker that had an interaural phase difference of 0 below 500 Hz and of π above 500 Hz. The thresholds for two different 20-ms test signals were determined within the same measurement using an interleaved adaptive 3-interval forced-choice (3IFC) procedure. In each 3IFC trial, both signals could occur with equal probability (uncertainty). The two signals differed in frequency and interaural phase in such a way that one signal always had a frequency above the masker edge frequency (500 Hz) and no interaural phase difference (So), whereas the other signal frequency was below 500 Hz and the interaural phase difference was π (Sπ). The frequencies of a signal pair remained fixed during the whole 3IFC track. These two signals thus lead to two different binaural conditions, i.e., NoSπ for the low-frequency signal and NπSo for the high-frequency signal. For comparison, binaural masked thresholds were measured with the same masker for fixed signal frequency and phase. The binaural masking level differences (BMLDs) resulting from the two experimental conditions show no significant difference. This indicates that the binaural system is able to apply different internal transformations or processing strategies simultaneously in different critical bands and even within the same critical band.
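
The masker described in this abstract (interaural phase difference of 0 below 500 Hz and of π above) can be illustrated with a minimal Python sketch. It is not the authors' stimulus generation: it approximates noise as a sum of equal-amplitude sinusoids with random starting phases, and the function name `dichotic_noise`, the component spacing, the upper frequency limit, and the sample rate are all assumptions chosen for illustration. A phase shift of π for a sinusoid is the same as inverting its sign, which is what the sketch does for the right ear.

```python
import math
import random


def dichotic_noise(duration_s=0.05, fs_hz=8000, edge_hz=500.0,
                   spacing_hz=20.0, max_hz=1000.0, seed=0):
    """Return (left, right) sample lists for a sum-of-sinusoids noise.

    Components below edge_hz are identical in both ears (interaural
    phase 0); components at or above edge_hz are sign-inverted in the
    right ear (interaural phase pi). All defaults are illustrative
    assumptions, not values taken from the study.
    """
    rng = random.Random(seed)
    n = int(duration_s * fs_hz)
    left = [0.0] * n
    right = [0.0] * n
    n_components = int(max_hz / spacing_hz)
    for k in range(1, n_components + 1):
        freq = k * spacing_hz
        start_phase = rng.uniform(0.0, 2.0 * math.pi)
        # A component exactly at the edge is treated as "above" here.
        sign = 1.0 if freq < edge_hz else -1.0
        for i in range(n):
            s = math.sin(2.0 * math.pi * freq * i / fs_hz + start_phase)
            left[i] += s
            right[i] += sign * s
    return left, right


left, right = dichotic_noise()
```

Under these assumptions, the half-sum of the two channels contains only the components below the edge frequency and the half-difference contains only those above it, which is one way to check the interaural configuration.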

5.
The effect of onset interaural time differences (ITDs) on lateralization and detection was investigated for broadband pulse trains 250 ms long with a binaural fundamental frequency of 250 Hz. Within each train, ITDs of successive binaural pulse pairs alternated between two of three values (0 µs, 500 µs left-leading, and 500 µs right-leading) or were invariant. For the alternating conditions, the experimental manipulation was the choice of which of two ITDs was presented first (i.e., at stimulus onset). Lateralization, which was estimated using a broadband noise pointer with a listener-adjustable interaural delay, was determined largely by the onset ITD. However, detection thresholds for the signals in left-leading or diotic continuous broadband noise were not affected by where the signals were lateralized. A quantitative analysis suggested that binaural masked thresholds for the pulse trains were well accounted for by the level and phase of harmonic components at 500 and 750 Hz. Detection thresholds obtained for brief stimuli (two binaural pulse or noise burst pairs) were also independent of which of two ITDs was presented first. The control of lateralization by onset cues appears to be based on mechanisms not essential for binaural detection.

6.
Three experiments investigated the roles of interaural time differences (ITDs) and level differences (ILDs) in spatial unmasking in multi-source environments. In experiment 1, speech reception thresholds (SRTs) were measured in virtual-acoustic simulations of an anechoic environment with three interfering sound sources of either speech or noise. The target source lay directly ahead, while three interfering sources were (1) all at the target's location (0°, 0°, 0°), (2) at locations distributed across both hemifields (-30°, 60°, 90°), (3) at locations in the same hemifield (30°, 60°, 90°), or (4) co-located in one hemifield (90°, 90°, 90°). Sounds were convolved with head-related impulse responses (HRIRs) that were manipulated to remove individual binaural cues. Three conditions used HRIRs with (1) both ILDs and ITDs, (2) only ILDs, and (3) only ITDs. The ITD-only condition produced the same pattern of results across spatial configurations as the combined cues, but with smaller differences between spatial configurations. The ILD-only condition yielded similar SRTs for the (-30°, 60°, 90°) and (0°, 0°, 0°) configurations, as expected for best-ear listening. In experiment 2, pure-tone BMLDs were measured at third-octave frequencies against the ITD-only, speech-shaped noise interferers of experiment 1. These BMLDs were 4-8 dB at low frequencies for all spatial configurations. In experiment 3, SRTs were measured for speech in diotic, speech-shaped noise. Noises were filtered to reduce the spectrum level at each frequency according to the BMLDs measured in experiment 2. SRTs were as low as or lower than those of the corresponding ITD-only conditions from experiment 1. Thus, an explanation of speech understanding in complex listening environments based on the combination of best-ear listening and binaural unmasking (without involving sound-localization) cannot be excluded.

7.
Sound source localization on the horizontal plane is primarily determined by interaural time differences (ITDs) for low-frequency stimuli and by interaural level differences (ILDs) for high-frequency stimuli, but ITDs in high-frequency complex stimuli can also be used for localization. Of interest here is the relationship between the processing of high-frequency ITDs and that of low-frequency ITDs and high-frequency ILDs. A few similarities in human performance with high- and low-frequency ITDs have been taken as evidence for similar ITD processing across frequency regions. However, such similarities, unless accompanied by differences between ITD and ILD performance on the same measure, could potentially reflect processing attributes common to both ITDs and ILDs rather than to ITDs only. In the present experiment, both learning and variability patterns in human discrimination of ITDs in high-frequency amplitude-modulated tones were examined and compared to previously obtained data with low-frequency ITDs and high-frequency ILDs. Both patterns for high-frequency ITDs were more similar to those for low-frequency ITDs than for high-frequency ILDs. These results thus add to the evidence supporting similar ITD processing across frequency regions, and further suggest that both high- and low-frequency ITD processing is less modifiable and more noisy than ILD processing.

8.
Comodulation masking release (CMR) refers to an improvement in the detection threshold of a signal masked by noise with coherent amplitude fluctuation across frequency, as compared to noise without the envelope coherence. The present study tested whether such an advantage for signal detection would facilitate the identification of speech phonemes. Consonant identification of bandpass speech was measured under the following three masker conditions: (1) a single band of noise in the speech band ("on-frequency" masker); (2) two bands of noise, one in the on-frequency band and the other in the "flanking band," with coherence of temporal envelope fluctuation between the two bands (comodulation); and (3) two bands of noise (on-frequency band and flanking band), without the coherence of the envelopes (noncomodulation). A pilot experiment with a small number of consonant tokens was followed by the main experiment with 12 consonants and the following masking conditions: three frequency locations of the flanking band and two masker levels. Results showed that in all conditions, the comodulation condition provided higher identification scores than the noncomodulation condition, and the difference in score was 3.5% on average. No significant difference was observed between the on-frequency only condition and the comodulation condition, i.e., an "unmasking" effect by the addition of a comodulated flanking band was not observed. The positive effect of CMR on consonant recognition found in the present study endorses a "cued-listening" theory, rather than an envelope correlation theory, as a basis of CMR in a suprathreshold task.

9.
Spatial unmasking describes the improvement in the detection or identification of a target sound afforded by separating it spatially from simultaneous masking sounds. This effect has been studied extensively for speech intelligibility in the presence of interfering sounds. In the current study, listeners identified zebra finch song, which shares many acoustic properties with speech but lacks semantic and linguistic content. Three maskers with the same long-term spectral content but different short-term statistics were used: (1) chorus (combinations of unfamiliar zebra finch songs), (2) song-shaped noise (broadband noise with the average spectrum of chorus), and (3) chorus-modulated noise (song-shaped noise multiplied by the broadband envelope from a chorus masker). The amount of masking and spatial unmasking depended on the masker and there was evidence of release from both energetic and informational masking. Spatial unmasking was greatest for the statistically similar chorus masker. For the two noise maskers, there was less spatial unmasking and it was wholly accounted for by the relative target and masker levels at the acoustically better ear. The results share many features with analogous results using speech targets, suggesting that spatial separation aids in the segregation of complex natural sounds through mechanisms that are not specific to speech.

10.
The masking level difference (MLD) for a narrowband noise masker is associated with marked individual differences. This pair of studies examines factors that might account for these individual differences. Experiment 1 estimated the MLD for a 50 Hz wide band of masking noise centered at 500 or 2000 Hz, gated on for 400 ms. Tonal signals were either brief (15 ms) or long (200 ms), and brief signals were coincident with either a dip or peak in the masker envelope. Experiment 2 estimated the MLD for both signal and masker consisting of a 50 Hz wide bandpass noise centered on 500 Hz. Signals were generated to provide only interaural phase cues, only interaural level cues, or both. The pattern of individual differences was dominated by variability in NoSπ thresholds, and NoSπ thresholds were highly correlated across all conditions. Results suggest that the individual differences observed in Experiment 1 were not primarily driven by differences in the use of binaural fine structure cues or in binaural temporal resolution. The range of thresholds obtained for a brief NoSπ tonal signal at 500 Hz was consistent with a model based on normalized interaural correlation. This model was not consistent for analogous conditions at 2000 Hz.

11.
When a masking sound is spatially separated from a target speech signal, substantial releases from masking typically occur both for speech and noise maskers. However, when a delayed copy of the masker is also presented at the location of the target speech (a condition that has been referred to as the front target, right-front masker or F-RF configuration), the advantages of spatial separation vanish for noise maskers but remain substantial for speech maskers. This effect has been attributed to precedence, which introduces an apparent spatial separation between the target and masker in the F-RF configuration that helps the listener to segregate the target from a masking voice but not from a masking noise. In this study, virtual synthesis techniques were used to examine variations of the F-RF configuration in an attempt to more fully understand the stimulus parameters that influence the release from masking obtained in that condition. The results show that the release from speech-on-speech masking caused by the addition of the delayed copy of the masker is robust across a wide variety of source locations, masker locations, and masker delay values. This suggests that the speech unmasking that occurs in the F-RF configuration is not dependent on any single perceptual cue and may indicate that F-RF speech segregation is only partially based on the apparent left-right location of the RF masker.

12.
Similarity between the target and masking voices is known to have a strong influence on performance in monaural and binaural selective attention tasks, but little is known about the role it might play in dichotic listening tasks with a target signal and one masking voice in one ear and a second independent masking voice in the opposite ear. This experiment examined performance in a dichotic listening task with a target talker in one ear and same-talker, same-sex, or different-sex maskers in both the target and the unattended ears. The results indicate that listeners were most susceptible to across-ear interference with a different-sex within-ear masker and least susceptible with a same-talker within-ear masker, suggesting that the amount of across-ear interference cannot be predicted from the difficulty of selectively attending to the within-ear masking voice. The results also show that the amount of across-ear interference consistently increases when the across-ear masking voice is more similar to the target speech than the within-ear masking voice is, but that no corresponding decline in across-ear interference occurs when the across-ear voice is less similar to the target than the within-ear voice. These results are consistent with an "integrated strategy" model of speech perception where the listener chooses a segregation strategy based on the characteristics of the masker present in the target ear and the amount of across-ear interference is determined by the extent to which this strategy can also effectively be used to suppress the masker in the unattended ear.

13.
The present study examined the relative influence of the off- and on-frequency spectral components of modulated and unmodulated maskers on consonant recognition. Stimuli were divided into 30 contiguous equivalent rectangular bandwidths. The temporal fine structure (TFS) in each "target" band was either left intact or replaced with tones using vocoder processing. Recognition scores for 10, 15 and 20 target bands randomly located in frequency were obtained in quiet and in the presence of all 30 masker bands, only the off-frequency masker bands, or only the on-frequency masker bands. The amount of masking produced by the on-frequency bands was generally comparable to that produced by the broadband masker. However, the difference between these two conditions was often significant, indicating an influence of the off-frequency masker bands, likely through modulation interference or spectral restoration. Although vocoder processing systematically led to poorer consonant recognition scores, the deficit observed in noise could often be attributed to that observed in quiet. These data indicate that (i) speech recognition is affected by the off-frequency components of the background and (ii) the nature of the target TFS does not systematically affect speech recognition in noise, especially when energetic masking and/or the number of target bands is limited.

14.
Experiment 1 examined detection and discrimination of monaural four-tone sequences composed of 400-, 500-, and 625-Hz sinusoids. In the baseline conditions, the masker was monaural and composed of 25-Hz-wide bands of random noise centered on 320, 400, 500, 625, and 781 Hz. In the binaural masking release conditions, the noise was presented diotically. In the monaural masking release conditions, the noise was presented to the same ear as the signal, but it was comodulated. Tones had half-amplitude durations of 30, 60, or 150 ms. There was no delay between successive tones, so the rate of frequency change depended on tone duration. Listeners discriminated between sequences composed of 500-400-625-500 Hz and 500-625-400-500 Hz. Discrimination results were poor for rapid sequences in both monaural and binaural masking release conditions relative to baseline conditions. Results from experiment 2 indicated that poor discrimination for rapid sequences could also occur in the baseline conditions, provided that the frequency separation among tonal components was small. Sluggish processing in the present paradigm was not restricted to conditions relying on binaural cues. It is argued that sluggishness may reflect a long temporal window in monaural and binaural masking release conditions or an interaction between poor cue quality and task difficulty.

15.
Modulation thresholds were measured in three subjects for a sinusoidally amplitude-modulated (SAM) wideband noise (the signal) in the presence of a second amplitude-modulated wideband noise (the masker). In monaural conditions (Mm-Sm) masker and signal were presented to only one ear; in binaural conditions (M0-Sπ) the masker was presented diotically while the phase of modulation of the SAM noise signal was inverted in one ear relative to the other. In experiment 1 masker modulation frequency (fm) was fixed at 16 Hz, and signal modulation frequency (fs) was varied from 2 to 512 Hz. For monaural presentation, masking generally decreased as fs diverged from fm, although there was a secondary increase in masking for very low signal modulation frequencies, as reported previously [Bacon and Grantham, J. Acoust. Soc. Am. 85, 2575-2580 (1989)]. The binaural masking patterns did not show this low-frequency upturn: binaural thresholds continued to improve as fs decreased from 16 to 2 Hz. Thus, comparing masked monaural and masked binaural thresholds, there was an average binaural advantage, or masking-level difference (MLD), of 9.4 dB at fs = 2 Hz and 5.3 dB at fs = 4 Hz. In addition, there were positive MLDs for the on-frequency condition (fm = fs = 16 Hz: average MLD = 4.4 dB) and for the highest signal frequency tested (fs = 512 Hz: average MLD = 7.3 dB). In experiment 2 the signal was a SAM noise (fs = 16 Hz), and the masker was a wideband noise, amplitude-modulated by a narrow band of noise centered at fs. There was no effect on monaural or binaural thresholds as masker modulator bandwidth was varied from 4 to 20 Hz (the average MLD remained constant at 8.0 dB), which suggests that the observed "tuning" for modulation may be based on temporal pattern discrimination and not on a critical-band-like filtering mechanism. In a final condition the masker modulator was a 10-Hz-wide band of noise centered at the 64-Hz signal modulation frequency. The average MLD in this case was 7.4 dB. The results are discussed in terms of various binaural capacities that probably play a role in binaural release from modulation masking, including detection of varying interaural intensity differences (IIDs) and discrimination of interaural correlation.
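
The Sπ signal described in this abstract (a SAM noise whose modulation phase is inverted in one ear) can be sketched in a few lines of Python. This is a hedged illustration only: the function name `sam_noise_pair`, the use of the same Gaussian carrier in both ears, the modulation depth, and the sample rate are assumptions not stated in the abstract, and the diotic masker is omitted.

```python
import math
import random


def sam_noise_pair(duration_s=0.5, fs_hz=16000, mod_hz=16.0,
                   depth=0.5, seed=0):
    """Return (left, right) SAM noise with inverted modulation phase.

    The same Gaussian carrier sample is used in both ears (an
    assumption); only the sinusoidal envelope differs, being shifted
    by pi in the right ear relative to the left.
    """
    rng = random.Random(seed)
    n = int(duration_s * fs_hz)
    left, right = [], []
    for i in range(n):
        carrier = rng.gauss(0.0, 1.0)
        phase = 2.0 * math.pi * mod_hz * i / fs_hz
        left.append(carrier * (1.0 + depth * math.sin(phase)))
        right.append(carrier * (1.0 + depth * math.sin(phase + math.pi)))
    return left, right


def masking_level_difference(monaural_db, binaural_db):
    """MLD in dB: monaural masked threshold minus binaural masked threshold."""
    return monaural_db - binaural_db


left, right = sam_noise_pair()
```

With this construction the two envelopes sum to a constant, so the interaural level difference fluctuates at the modulation rate, which relates to the varying IID cue the abstract mentions.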

16.
Howard-Jones and Rosen [(1993). J. Acoust. Soc. Am. 93, 2915-2922] investigated the ability to integrate glimpses of speech that are separated in time and frequency using a "checkerboard" masker, with asynchronous amplitude modulation (AM) across frequency. Asynchronous glimpsing was demonstrated only for spectrally wide frequency bands. It is possible that the reduced evidence of spectro-temporal integration with narrower bands was due to spread of masking at the periphery. The present study tested this hypothesis with a dichotic condition, in which the even- and odd-numbered bands of the target speech and asynchronous AM masker were presented to opposite ears, minimizing the deleterious effects of masking spread. For closed-set consonant recognition, thresholds were 5.1-8.5 dB better for dichotic than for monotic asynchronous AM conditions. Results were similar for closed-set word recognition, but for open-set word recognition the benefit of dichotic presentation was more modest and level dependent, consistent with the effects of spread of masking being level dependent. There was greater evidence of asynchronous glimpsing in the open-set than closed-set tasks. Presenting stimuli dichotically supported asynchronous glimpsing with narrower frequency bands than previously shown, though the magnitude of glimpsing was reduced for narrower bandwidths even in some dichotic conditions.

17.
Four adult bilateral cochlear implant users, with good open-set sentence recognition, were tested with three different sound coding strategies for binaural speech unmasking and their ability to localize 100 and 500 Hz click trains in noise. Two of the strategies tested were envelope-based strategies that are clinically widely used. The third was a research strategy that additionally preserved fine-timing cues at low frequencies. Speech reception thresholds were determined in diotic noise for diotic and interaurally time-delayed speech using direct audio input to a bilateral research processor. Localization in noise was assessed in the free field. Overall results, for both speech and localization tests, were similar with all three strategies. None provided a binaural speech unmasking advantage due to the application of 700 µs interaural time delay to the speech signal, and localization results showed similar response patterns across strategies that were well accounted for by the use of broadband interaural level cues. The data from both experiments combined indicate that, in contrast to normal hearing, timing cues available from natural head-width delays do not offer binaural advantages with present methods of electrical stimulation, even when fine-timing cues are explicitly coded.

18.
Introduction of masker amplitude modulation (AM) can improve signal detection in a number of paradigms. In some cases this advantage depends on the coherence of modulation across a relatively wide frequency range. In the experiments described below, observers were asked to identify masked spondee words produced by a single male talker. The target spondees and masking noise were filtered into nine narrow bands, and the coherence of AM of either the speech signal or noise masker was manipulated. Inherent modulation of the masker bands was manipulated via assignment of real and imaginary values to the associated components of each band in the frequency domain, and AM of speech bands was achieved via multiplication with envelopes extracted from these maskers. Responses were based on two alternatives, four alternatives, or open response sets. The effect of masker AM coherence was highly dependent upon the size of the response set: coherent AM was associated with better thresholds in a two-alternative response set, but poorer thresholds in an open response set. Results with AM speech did not depend critically upon the across-frequency temporal synchrony of AM imposed on the speech material.

19.
This and two accompanying articles [Breebaart et al., J. Acoust. Soc. Am. 110, 1074-1088 (2001); 110, 1105-1117 (2001)] describe a computational model for the signal processing in the binaural auditory system. The model consists of several stages of monaural and binaural preprocessing combined with an optimal detector. In the present article the model is tested and validated by comparing its predictions with experimental data for binaural discrimination and masking conditions as a function of the spectral parameters of both masker and signal. For this purpose, the model is used as an artificial observer in a three-interval, forced-choice adaptive procedure. All model parameters were kept constant for all simulations described in this and the subsequent article. The effects of the following experimental parameters were investigated: center frequency of both masker and target, bandwidth of masker and target, the interaural phase relations of masker and target, and the level of the masker. Several phenomena that occur in binaural listening conditions can be accounted for. These include the wider effective binaural critical bandwidth observed in band-widening NoSπ conditions, the different masker-level dependence of binaural detection thresholds for narrow- and for wide-band maskers, the unification of IID and ITD sensitivity with binaural detection data, and the dependence of binaural thresholds on frequency.

20.
In this paper previous experiments on auditory filter shapes in binaural masking experiments [A. Kohlrausch, J. Acoust. Soc. Am. 84, 573-583 (1988)] are extended to a wider range of masker and signal durations. The masker was a dichotic broadband noise with frequency-dependent interaural parameters. The interaural phase difference of the masker was 0 below 500 Hz and π above 500 Hz. Signal frequency varied between 200 and 800 Hz, and the signal was presented either monaurally (Sm) or binaurally in antiphase (Sπ). In the first experiment, the masker duration was fixed at 500 ms and signals of 250 and 20 ms were used. In the second experiment, the signal duration was fixed at 20 ms, and the masker duration was reduced to 25 ms. The results from both experiments are consistent with studies using No or Nπ maskers: the binaural masking level difference (BMLD) increases slightly for shorter test signals and decreases strongly for short maskers. The BMLD patterns of the first experiment are well described by the auditory-filter model derived for stationary test signals, if the additional influence of "off-frequency listening" for the short test signal is taken into account. The BMLDs resulting from the second experiment (25-ms masker), however, are much lower than predicted by this filter model. This outcome supports previous observations that binaural unmasking becomes less effective for very short masker durations and indicates that this effect is even stronger for maskers with a complex structure of interaural parameters.

