
1.

Speech detection in spatial and nonspatial speech maskers

Balakrishnan U, Freyman RL. The Journal of the Acoustical Society of America, 2008, 123(5): 2680-2691

The effect of perceived spatial differences on masking release was examined using a 4AFC speech detection paradigm. Targets were 20 words produced by a female talker. Maskers were recordings of continuous streams of nonsense sentences spoken by two female talkers and mixed into each of two channels (two talker, and the same masker time reversed). Two masker spatial conditions were employed: "RF" with a 4 ms time lead to the loudspeaker 60 degrees horizontally to the right, and "FR" with the time lead to the front (0 degrees) loudspeaker. The reference nonspatial "F" masker was presented from the front loudspeaker only. Target presentation was always from the front loudspeaker. In Experiment 1, target detection threshold for both natural and time-reversed spatial maskers was 17-20 dB lower than that for the nonspatial masker, suggesting that significant release from informational masking occurs with spatial speech maskers regardless of masker understandability. In Experiment 2, the effectiveness of the FR and RF maskers was evaluated as the right loudspeaker output was attenuated until the two-source maskers were indistinguishable from the F masker, as measured independently in a discrimination task. Results indicated that spatial release from masking can be observed with barely noticeable target-masker spatial differences.
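
The three masker configurations named in this abstract (RF, FR, F) can be illustrated with a minimal sketch. It assumes a mono masker waveform and a two-column output (front loudspeaker, right loudspeaker); the function name `lead_lag_masker` and its parameters are hypothetical and do not come from the paper, and level calibration is not modeled.

```python
import numpy as np

def lead_lag_masker(masker, fs, lead_ms=4.0, config="RF"):
    """Build a two-channel masker: column 0 = front feed, column 1 = right feed.

    config "RF": right loudspeaker leads the front by lead_ms.
    config "FR": front loudspeaker leads the right by lead_ms.
    config "F":  front loudspeaker only (nonspatial reference).
    """
    masker = np.asarray(masker, dtype=float)
    delay = int(round(fs * lead_ms / 1000.0))       # lead time in samples
    leading = np.concatenate([masker, np.zeros(delay)])
    lagging = np.concatenate([np.zeros(delay), masker])
    if config == "RF":
        front, right = lagging, leading
    elif config == "FR":
        front, right = leading, lagging
    elif config == "F":
        front, right = leading, np.zeros_like(leading)
    else:
        raise ValueError("config must be 'RF', 'FR', or 'F'")
    return np.column_stack([front, right])
```

In this sketch the target would simply be added to column 0, since the abstract states that the target always came from the front loudspeaker.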

2.

Contributions of talker characteristics and spatial location to auditory streaming

Allen K, Carlile S, Alais D. The Journal of the Acoustical Society of America, 2008, 123(3): 1562-1570

To examine whether auditory streaming contributes to unmasking, intelligibility of target sentences against two competing talkers was measured using the coordinate response measure (CRM) [Bolia et al., J. Acoust. Soc. Am. 107, 1065-1066 (2000)] corpus. In the control condition, the speech reception threshold (50% correct) was measured when the target and two maskers were collocated straight ahead. Separating maskers from the target by +/-30 degrees resulted in spatial release from masking of 12 dB. CRM sentences involve an identifier in the first part and two target words in the second part. In experimental conditions, masking talkers started spatially separated at +/-30 degrees but became collocated with the target before the scoring words. In one experiment, one target and two different maskers were randomly selected from a mixed-sex corpus. Significant unmasking of 4 dB remained despite the absence of persistent location cues. When same-sex talkers were used as maskers and target, unmasking was reduced. These data suggest that initial separation may permit confident identification and streaming of the target and masker speech where significant differences between target and masker voice characteristics exist, but where target and masker characteristics are similar, listeners must rely more heavily on continuing spatial cues.

3.

Cochlear implant speech recognition with speech maskers

Stickney GS, Zeng FG, Litovsky R, Assmann P. The Journal of the Acoustical Society of America, 2004, 116(2): 1081-1091

Speech recognition performance was measured in normal-hearing and cochlear-implant listeners with maskers consisting of either steady-state speech-spectrum-shaped noise or a competing sentence. Target sentences from a male talker were presented in the presence of one of three competing talkers (same male, different male, or female) or speech-spectrum-shaped noise generated from this talker at several target-to-masker ratios. For the normal-hearing listeners, target-masker combinations were processed through a noise-excited vocoder designed to simulate a cochlear implant. With unprocessed stimuli, a normal-hearing control group maintained high levels of intelligibility down to target-to-masker ratios as low as 0 dB and showed a release from masking, producing better performance with single-talker maskers than with steady-state noise. In contrast, no masking release was observed in either implant or normal-hearing subjects listening through an implant simulation. The performance of the simulation and implant groups did not improve when the single-talker masker was a different talker compared to the same talker as the target speech, as was found in the normal-hearing control. These results are interpreted as evidence for a significant role of informational masking and modulation interference in cochlear implant speech recognition with fluctuating maskers. This informational masking may originate from increased target-masker similarity when spectral resolution is reduced.

4.

Effect of target-masker similarity on across-ear interference in a dichotic cocktail-party listening task

Brungart DS, Simpson BD. The Journal of the Acoustical Society of America, 2007, 122(3): 1724

Similarity between the target and masking voices is known to have a strong influence on performance in monaural and binaural selective attention tasks, but little is known about the role it might play in dichotic listening tasks with a target signal and one masking voice in one ear and a second independent masking voice in the opposite ear. This experiment examined performance in a dichotic listening task with a target talker in one ear and same-talker, same-sex, or different-sex maskers in both the target and the unattended ears. The results indicate that listeners were most susceptible to across-ear interference with a different-sex within-ear masker and least susceptible with a same-talker within-ear masker, suggesting that the amount of across-ear interference cannot be predicted from the difficulty of selectively attending to the within-ear masking voice. The results also show that the amount of across-ear interference consistently increases when the across-ear masking voice is more similar to the target speech than the within-ear masking voice is, but that no corresponding decline in across-ear interference occurs when the across-ear voice is less similar to the target than the within-ear voice. These results are consistent with an "integrated strategy" model of speech perception where the listener chooses a segregation strategy based on the characteristics of the masker present in the target ear and the amount of across-ear interference is determined by the extent to which this strategy can also effectively be used to suppress the masker in the unattended ear.

5.

Stimulus factors influencing spatial release from speech-on-speech masking

Kidd G, Mason CR, Best V, Marrone N. The Journal of the Acoustical Society of America, 2010, 128(4): 1965-1978

This study examined spatial release from masking (SRM) when a target talker was masked by competing talkers or by other types of sounds. The focus was on the role of interaural time differences (ITDs) and time-varying interaural level differences (ILDs) under conditions varying in the strength of informational masking (IM). In the first experiment, a target talker was masked by two other talkers that were either colocated with the target or were symmetrically spatially separated from the target with the stimuli presented through loudspeakers. The sounds were filtered into different frequency regions to restrict the available interaural cues. The largest SRM occurred for the broadband condition followed by a low-pass condition. However, even the highest frequency bandpass-filtered condition (3-6 kHz) yielded a significant SRM. In the second experiment the stimuli were presented via earphones. The listeners identified the speech of a target talker masked by one or two other talkers or noises when the maskers were colocated with the target or were perceptually separated by ITDs. The results revealed a complex pattern of masking in which the factors affecting performance in colocated and spatially separated conditions are to a large degree independent.

6.

Precedence-based speech segregation in a virtual auditory environment

Brungart DS, Simpson BD, Freyman RL. The Journal of the Acoustical Society of America, 2005, 118(5): 3241-3251

When a masking sound is spatially separated from a target speech signal, substantial releases from masking typically occur both for speech and noise maskers. However, when a delayed copy of the masker is also presented at the location of the target speech (a condition that has been referred to as the front target, right-front masker or F-RF configuration), the advantages of spatial separation vanish for noise maskers but remain substantial for speech maskers. This effect has been attributed to precedence, which introduces an apparent spatial separation between the target and masker in the F-RF configuration that helps the listener to segregate the target from a masking voice but not from a masking noise. In this study, virtual synthesis techniques were used to examine variations of the F-RF configuration in an attempt to more fully understand the stimulus parameters that influence the release from masking obtained in that condition. The results show that the release from speech-on-speech masking caused by the addition of the delayed copy of the masker is robust across a wide variety of source locations, masker locations, and masker delay values. This suggests that the speech unmasking that occurs in the F-RF configuration is not dependent on any single perceptual cue and may indicate that F-RF speech segregation is only partially based on the apparent left-right location of the RF masker.

7.

The effects of spatial separation in distance on the informational and energetic masking of a nearby speech signal

Brungart DS, Simpson BD. The Journal of the Acoustical Society of America, 2002, 112(2): 664-676

Although many studies have shown that intelligibility improves when a speech signal and an interfering sound source are spatially separated in azimuth, little is known about the effect that spatial separation in distance has on the perception of competing sound sources near the head. In this experiment, head-related transfer functions (HRTFs) were used to process stimuli in order to simulate a target talker and a masking sound located at different distances along the listener's interaural axis. One of the signals was always presented at a distance of 1 m, and the other signal was presented 1 m, 25 cm, or 12 cm from the center of the listener's head. The results show that distance separation has very different effects on speech segregation for different types of maskers. When speech-shaped noise was used as the masker, most of the intelligibility advantages of spatial separation could be accounted for by spectral differences in the target and masking signals at the ear with the higher signal-to-noise ratio (SNR). When a same-sex talker was used as the masker, the intelligibility advantages of spatial separation in distance were dominated by binaural effects that produced the same performance improvements as a 4-5-dB increase in the SNR of a diotic stimulus. These results suggest that distance-dependent changes in the interaural difference cues of nearby sources play a much larger role in the reduction of the informational masking produced by an interfering speech signal than in the reduction of the energetic masking produced by an interfering noise source.

8.

The role of perceived spatial separation in the unmasking of speech (total citations: 12; self-citations: 0; citations by others: 12)

Freyman RL, Helfer KS, McCall DD, Clifton RK. The Journal of the Acoustical Society of America, 1999, 106(6): 3578-3588

Spatial separation of speech and noise in an anechoic space creates a release from masking that often improves speech intelligibility. However, the masking release is severely reduced in reverberant spaces. This study investigated whether the distinct and separate localization of speech and interference provides any perceptual advantage that, due to the precedence effect, is not degraded by reflections. Listeners' identification of nonsense sentences spoken by a female talker was measured in the presence of either speech-spectrum noise or other sentences spoken by a second female talker. Target and interference stimuli were presented in an anechoic chamber from loudspeakers directly in front and 60 degrees to the right in single-source and precedence-effect (lead-lag) conditions. For speech-spectrum noise, the spatial separation advantage for speech recognition (8 dB) was predictable from articulation index computations based on measured release from masking for narrow-band stimuli. The spatial separation advantage was only 1 dB in the lead-lag condition, despite the fact that a large perceptual separation was produced by the precedence effect. For the female talker interference, a much larger advantage occurred, apparently because informational masking was reduced by differences in perceived locations of target and interference.

9.

The influence of non-spatial factors on measures of spatial release from masking

Best V, Marrone N, Mason CR, Kidd G. The Journal of the Acoustical Society of America, 2012, 131(4): 3103-3110

This study tested the hypothesis that the reduction in spatial release from masking (SRM) resulting from sensorineural hearing loss in competing speech mixtures is influenced by the characteristics of the interfering speech. A frontal speech target was presented simultaneously with two intelligible or two time-reversed (unintelligible) speech maskers that were either colocated with the target or were symmetrically separated from the target in the horizontal plane. The difference in SRM between listeners with hearing impairment and listeners with normal hearing was substantially larger for the forward maskers (deficit of 5.8 dB) than for the reversed maskers (deficit of 1.6 dB). This was driven by the fact that all listeners, regardless of hearing abilities, performed similarly (and poorly) in the colocated condition with intelligible maskers. The same conditions were then tested in listeners with normal hearing using headphone stimuli that were degraded by noise vocoding. Reducing the number of available spectral channels systematically reduced the measured SRM, and again, more so for forward (reduction of 3.8 dB) than for reversed speech maskers (reduction of 1.6 dB). The results suggest that non-spatial factors can strongly influence both the magnitude of SRM and the apparent deficit in SRM for listeners with impaired hearing.
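
The SRM and "deficit" quantities in this abstract follow from simple threshold arithmetic, sketched below. The sketch assumes thresholds are expressed in dB with lower values meaning better performance, and that SRM is the colocated threshold minus the separated threshold; the function names and all numbers used with them are hypothetical illustrations, not data from the study.

```python
def spatial_release(colocated_db, separated_db):
    """SRM in dB: how much the threshold improves (drops) with spatial separation."""
    return colocated_db - separated_db

def srm_deficit(normal_hearing, hearing_impaired):
    """Difference in SRM between two groups.

    Each argument is a (colocated_db, separated_db) pair of group thresholds.
    A positive result means the hearing-impaired group obtained less SRM.
    """
    return spatial_release(*normal_hearing) - spatial_release(*hearing_impaired)

# Hypothetical thresholds, chosen only to show the arithmetic.
example_deficit = srm_deficit(normal_hearing=(0.0, -12.0),
                              hearing_impaired=(0.5, -6.0))
```

Because the colocated threshold enters the subtraction directly, a group difference in the colocated condition alone changes the computed deficit, which is the kind of non-spatial influence the abstract describes.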

10.

Informational masking caused by contralateral stimulation

Kidd G, Mason CR, Arbogast TL, Brungart DS, Simpson BD. The Journal of the Acoustical Society of America, 2003, 113(3): 1594-1603

Although informational masking is thought to reflect central mechanisms, the effects are generally much stronger when the target and masker are presented to the same ear than when they are presented to different ears. However, the results of a recent study by Brungart and Simpson [J. Acoust. Soc. Am. 112, 2985-2995 (2002)] indicated that a speech masker that is presented contralateral to a speech signal can produce substantial amounts of informational masking when a second speech masker is played simultaneously in the same ear as the signal. In this study, we conducted a series of experiments that paralleled those of Brungart and Simpson but used a pure-tone signal and multitone informational maskers in a detection task. Both the signal and the maskers were played as sequences of short bursts in each observation interval. The maskers were arranged in two types of spectrotemporal patterns. One type of pattern, called "multiple-bursts same" (MBS), has previously been shown to produce very large amounts of informational masking while the other type of pattern, called "multiple-bursts different" (MBD), has been shown to produce very small amounts of informational masking. Several conditions of ipsilateral, contralateral, and combined presentation of these maskers were tested. The results showed that presentation of the MBS masker in the contralateral ear produced a substantial amount of informational masking when the MBD masker was simultaneously presented to the ipsilateral ear. The results supported the earlier findings of Brungart and Simpson indicating that listeners are unable to selectively focus their attention on a single ear in some complex dichotic listening conditions. These results suggest that this contralateral masking effect is not restricted to speech and may reflect more general limitations on processing capacity. Further, it was concluded that the magnitude of the contralateral masking effect was related both to the informational masking value of the contralateral masker and the complexity of the stimulus and/or task in the ear in which the signal was presented.

11.

Comodulation masking release (CMR) as a function of masker bandwidth, modulator bandwidth, and signal duration

Schooneveldt GP, Moore BC. The Journal of the Acoustical Society of America, 1989, 85(1): 273-281

These experiments examine how comodulation masking release (CMR) varies with masker bandwidth, modulator bandwidth, and signal duration. In experiment 1, thresholds were measured for a 400-ms, 2000-Hz signal masked by continuous noise varying in bandwidth from 50-3200 Hz in 1-oct steps. In one condition, using random noise maskers, thresholds increased with increasing bandwidth up to 400 Hz and then remained approximately constant. In another set of conditions, the masker was multiplied (amplitude modulated) by a low-pass noise (bandwidth varied from 12.5-400 Hz in 1-oct steps). This produced correlated envelope fluctuations across frequency. Thresholds were generally lower than for random noise maskers with the same bandwidth. For maskers less than one critical band wide, the release from masking was largest (about 5 dB) for maskers with low rates of modulation (12.5-Hz-wide low-pass modulator). It is argued that this release from masking is not a "true" CMR but results from a within-channel cue. For broadband maskers (greater than 400 Hz), the release from masking increased with increasing masker bandwidth and decreasing modulator bandwidth, reaching an asymptote of 12 dB for a masker bandwidth of 800 Hz and a modulator bandwidth of 50 Hz. Most of this release from masking can be attributed to a CMR. In experiment 2, the modulator bandwidth was fixed at 12.5 Hz and the signal duration was varied. For masker bandwidths greater than 400 Hz, the CMR decreased from 12 to 5 dB as the signal duration was decreased from 400 to 25 ms. (ABSTRACT TRUNCATED AT 250 WORDS)
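
The comodulated masker described here, a noise band multiplied by a low-pass noise, might be generated roughly as follows. This is a sketch under several assumptions that the abstract does not state: the masker band is centred on the signal frequency, band limiting is done by zeroing FFT bins, and no filtering or level normalization is applied after the multiplication. The names `band_noise` and `comodulated_masker` are hypothetical.

```python
import numpy as np

def band_noise(n, fs, f_lo, f_hi, rng):
    """Gaussian noise restricted to [f_lo, f_hi] Hz by zeroing FFT bins."""
    spectrum = np.fft.rfft(rng.standard_normal(n))
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    spectrum[(freqs < f_lo) | (freqs > f_hi)] = 0.0
    return np.fft.irfft(spectrum, n=n)

def comodulated_masker(n, fs, centre_hz, masker_bw_hz, modulator_bw_hz, seed=0):
    """Multiply a noise band by a low-pass noise so that all frequency
    components of the band share the same slow amplitude fluctuations."""
    rng = np.random.default_rng(seed)
    carrier = band_noise(n, fs,
                         centre_hz - masker_bw_hz / 2.0,
                         centre_hz + masker_bw_hz / 2.0, rng)
    modulator = band_noise(n, fs, 0.0, modulator_bw_hz, rng)
    return carrier * modulator

# Example: 1 s at 16 kHz, 800-Hz-wide masker around 2000 Hz, 50-Hz modulator.
masker = comodulated_masker(16000, 16000, 2000.0, 800.0, 50.0)
```

Note that the multiplication widens the masker spectrum by up to the modulator bandwidth on each side, so the nominal masker bandwidth in this sketch refers to the band before modulation.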

12.

Voice gender differences and separation of simultaneous talkers in cochlear implant users with residual hearing

Visram AS, Kluk K, McKay CM. The Journal of the Acoustical Society of America, 2012, 132(2): EL135-EL141

Perception of a target voice in the presence of a competing talker, of same or different gender as the target, was investigated in cochlear implant users, in implant-alone and bimodal (acoustic hearing in the non-implanted ear) conditions. Recordings of two male and two female talkers acted as targets and maskers, to investigate whether bimodal benefit increased for different compared to same gender target/maskers due to increased ability to perceive and utilize fundamental frequency and spectral-shape differences. In both listening conditions participants showed benefit of target/masker gender difference. There was an overall bimodal benefit, which was independent of target/masker gender difference.

13.

The extent to which a position-based explanation accounts for binaural release from informational masking

Gallun FJ, Durlach NI, Colburn HS, Shinn-Cunningham BG, Best V, Mason CR, Kidd G. The Journal of the Acoustical Society of America, 2008, 124(1): 439-449

Detection was measured for a 500 Hz tone masked by noise (an "energetic" masker) or sets of ten randomly drawn tones (an "informational" masker). Presenting the maskers diotically and the target tone with a variety of interaural differences (interaural amplitude ratios and/or interaural time delays) resulted in reduced detection thresholds relative to when the target was presented diotically ("binaural release from masking"). Thresholds observed when time and amplitude differences applied to the target were "reinforcing" (favored the same ear, resulting in a lateralized position for the target) were not significantly different from thresholds obtained when differences were "opposing" (favored opposite ears, resulting in a centered position for the target). This irrelevance of differences in the perceived location of the target is a classic result for energetic maskers but had not previously been shown for informational maskers. However, this parallelism between the patterns of binaural release for energetic and informational maskers was not accompanied by high correlations between the patterns for individual listeners, supporting the idea that the mechanisms for binaural release from energetic and informational masking are fundamentally different.

14.

Combining energetic and informational masking for speech identification

Kidd G, Mason CR, Gallun FJ. The Journal of the Acoustical Society of America, 2005, 118(2): 982-992

This study examined combinations of energetic and informational maskers in speech identification. Speech targets and maskers (speech or noise) were processed and filtered into sets of 15 narrow frequency bands. The target was the sum of eight randomly selected bands. More masking occurred for speech maskers than for spectrally matched noise maskers regardless of whether the masker bands overlapped the target bands. The greater effect of the speech maskers was interpreted as due to informational masking. When the masker was composed of nonoverlapping bands of speech, the addition of bands of noise overlapping the speech masker, but not the speech target, reduced the overall amount of masking. Surprisingly, presenting the noise to the ear contralateral to the target and masker produced an even greater release from masking. The contralateral noise was apparently sufficient to cause a slight change in the image of the ipsilateral speech masker, possibly pulling it away from the target enough to allow the focus of attention on the target. This finding is consistent with the interpretation that in some conditions small binaural differences may be sufficient to cause, or significantly strengthen, the perceptual segregation of sounds.
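
The band-assignment step in this abstract (15 bands, a target built from eight randomly chosen bands, and a masker that may or may not share bands with the target) can be sketched as an index-selection routine. The number of masker bands is not given in the abstract, so `n_masker=6` is a placeholder, and the function name `choose_bands` is hypothetical.

```python
import random

def choose_bands(n_bands=15, n_target=8, n_masker=6, allow_overlap=False, seed=None):
    """Pick band indices for target and masker.

    With allow_overlap=False the masker is drawn only from bands the target
    does not use; with allow_overlap=True it is drawn from all bands, so
    shared bands are permitted but not guaranteed.
    """
    rng = random.Random(seed)
    all_bands = list(range(n_bands))
    target = sorted(rng.sample(all_bands, n_target))
    pool = all_bands if allow_overlap else [b for b in all_bands if b not in target]
    masker = sorted(rng.sample(pool, n_masker))
    return target, masker
```

With the default sizes, the nonoverlapping case leaves seven bands free for the masker, so `n_masker` must not exceed seven when `allow_overlap` is false.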

15.

Within-ear and across-ear interference in a dichotic cocktail party listening task: effects of masker uncertainty

Brungart DS, Simpson BD. The Journal of the Acoustical Society of America, 2004, 115(1): 301-310

Increases in masker variability have been shown to increase the effects of informational masking in non-speech listening tasks, but relatively little is known about the influence that masker uncertainty has on the informational components of speech-on-speech masking. In this experiment, listeners were asked to extract information from a target phrase that was presented in their right ear while ignoring masking phrases that were presented in the same ear as the target phrase and in the ear opposite the target phrase. The level of masker uncertainty was varied by holding constant or "freezing" the talkers speaking the masking phrases, the semantic content used in the masking phrases, or both the talkers and the semantic content in the masking phrases within each block of 120 trials. The results showed that freezing the semantic content of the masking phrase in the target ear was the only reduction in masker uncertainty that ever resulted in a significant improvement in performance. Providing feedback after each trial improved performance overall, but did not prevent the listeners from making incorrect responses that matched the content of the frozen target-ear masking phrase. However, removing the target-ear contents corresponding to the masking phrase from the response set resulted in a dramatic improvement in performance. This suggests that the listeners were generally able to understand both of the phrases presented to the target ear, and that their incorrect responses in the task were almost entirely a result of their inability to determine which words were spoken by the target talker.

16.

Spatial unmasking of birdsong in human listeners: energetic and informational factors

Best V, Ozmeral E, Gallun FJ, Sen K, Shinn-Cunningham BG. The Journal of the Acoustical Society of America, 2005, 118(6): 3766-3773

Spatial unmasking describes the improvement in the detection or identification of a target sound afforded by separating it spatially from simultaneous masking sounds. This effect has been studied extensively for speech intelligibility in the presence of interfering sounds. In the current study, listeners identified zebra finch song, which shares many acoustic properties with speech but lacks semantic and linguistic content. Three maskers with the same long-term spectral content but different short-term statistics were used: (1) chorus (combinations of unfamiliar zebra finch songs), (2) song-shaped noise (broadband noise with the average spectrum of chorus), and (3) chorus-modulated noise (song-shaped noise multiplied by the broadband envelope from a chorus masker). The amount of masking and spatial unmasking depended on the masker and there was evidence of release from both energetic and informational masking. Spatial unmasking was greatest for the statistically similar chorus masker. For the two noise maskers, there was less spatial unmasking and it was wholly accounted for by the relative target and masker levels at the acoustically better ear. The results share many features with analogous results using speech targets, suggesting that spatial separation aids in the segregation of complex natural sounds through mechanisms that are not specific to speech.

17.

A cocktail party model of spatial release from masking by both noise and speech interferers

Jones GL, Litovsky RY. The Journal of the Acoustical Society of America, 2011, 130(3): 1463-1474

A mathematical formula for estimating spatial release from masking (SRM) in a cocktail party environment would be useful as a simpler alternative to computationally intensive algorithms and may enhance understanding of underlying mechanisms. The experiment presented herein was designed to provide a strong test of a model that divides SRM into contributions of asymmetry and angular separation [Bronkhorst (2000). Acustica 86, 117-128] and to examine whether that model can be extended to include speech maskers. Across masker types the contribution to SRM of angular separation of maskers from the target was found to grow at a diminishing rate as angular separation increased within the frontal hemifield, contrary to predictions of the model. Speech maskers differed from noise maskers in the overall magnitude of SRM and in the contribution of angular separation (both greater for speech). These results were used to develop a modified model that achieved good fits to data for noise maskers (ρ=0.93) and for speech maskers (ρ=0.94) while using the same functions to describe separation and asymmetry components of SRM for both masker types. These findings suggest that this approach can be used to accurately model SRM for speech maskers in addition to primarily "energetic" noise maskers.

18.

The benefit of binaural hearing in a cocktail party: effect of location and type of interferer

Hawley ML, Litovsky RY, Culling JF. The Journal of the Acoustical Society of America, 2004, 115(2): 833-843

The "cocktail party problem" was studied using virtual stimuli whose spatial locations were generated using anechoic head-related impulse responses from the AUDIS database [Blauert et al., J. Acoust. Soc. Am. 103, 3082 (1998)]. Speech reception thresholds (SRTs) were measured for Harvard IEEE sentences presented from the front in the presence of one, two, or three interfering sources. Four types of interferer were used: (1) other sentences spoken by the same talker, (2) time-reversed sentences of the same talker, (3) speech-spectrum shaped noise, and (4) speech-spectrum shaped noise, modulated by the temporal envelope of the sentences. Each interferer was matched to the spectrum of the target talker. Interferers were placed in several spatial configurations, either coincident with or separated from the target. Binaural advantage was derived by subtracting SRTs from listening with the "better monaural ear" from those for binaural listening. For a single interferer, there was a binaural advantage of 2-4 dB for all interferer types. For two or three interferers, the advantage was 2-4 dB for noise and speech-modulated noise, and 6-7 dB for speech and time-reversed speech. These data suggest that the benefit of binaural hearing for speech intelligibility is especially pronounced when there are multiple voiced interferers at different locations from the target, regardless of spatial configuration; measurements with fewer or with other types of interferers can underestimate this benefit.

19.

Aging, spatial cues, and single- versus dual-task performance in competing speech perception

Helfer KS, Chevalier J, Freyman RL. The Journal of the Acoustical Society of America, 2010, 128(6): 3625-3633

Older individuals often report difficulty coping in situations with multiple conversations in which they at times need to "tune out" the background speech and at other times seek to monitor competing messages. The present study was designed to simulate this type of interaction by examining the cost of requiring listeners to perform a secondary task in conjunction with understanding a target talker in the presence of competing speech. The ability of younger and older adults to understand a target utterance was measured with and without requiring the listener to also determine how many masking voices were presented time-reversed. Also of interest was how spatial separation affected the ability to perform these two tasks. Older adults demonstrated slightly reduced overall speech recognition and obtained less spatial release from masking, as compared to younger listeners. For both younger and older listeners, spatial separation increased the costs associated with performing both tasks together. The meaningfulness of the masker had a greater detrimental effect on speech understanding for older participants than for younger participants. However, the results suggest that the problems experienced by older adults in complex listening situations are not necessarily due to a deficit in the ability to switch and/or divide attention among talkers.

20.

Informational masking of speech produced by speech-like sounds without linguistic content

Chen J, Li H, Li L, Wu X, Moore BC. The Journal of the Acoustical Society of America, 2012, 131(4): 2914-2926

This study investigated whether speech-like maskers without linguistic content produce informational masking of speech. The target stimuli were nonsense Chinese Mandarin sentences. In experiment I, the masker contained harmonics the fundamental frequency (F0) of which was sinusoidally modulated and the mean F0 of which was varied. The magnitude of informational masking was evaluated by measuring the change in intelligibility (releasing effect) produced by inducing a perceived spatial separation of the target speech and masker via the precedence effect. The releasing effect was small and was only clear when the target and masker had the same mean F0, suggesting that informational masking was small. Performance with the harmonic maskers was better than with a steady speech-shaped noise (SSN) masker. In experiments II and III, the maskers were speech-like synthesized signals, alternating between segments with harmonic structure and segments composed of SSN. Performance was much worse than for experiment I, and worse than when an SSN masker was used, suggesting that substantial informational masking occurred. The similarity of the F0 contours of the target and masker had little effect. The informational masking effect was not influenced by whether or not the noise-like segments of the masker were synchronous with the unvoiced segments of the target speech.