Similar literature
 Found 20 similar articles (search time: 31 ms)
1.
Studies comparing native and non-native listener performance on speech perception tasks can distinguish the roles of general auditory and language-independent processes from those involving prior knowledge of a given language. Previous experiments have demonstrated a performance disparity between native and non-native listeners on tasks involving sentence processing in noise. However, the effects of energetic and informational masking have not been explicitly distinguished. Here, English and Spanish listener groups identified keywords in English sentences in quiet and masked by either stationary noise or a competing utterance, conditions known to produce predominantly energetic and informational masking, respectively. In the stationary noise conditions, non-native listeners suffered more from increasing levels of noise for two of the three keywords scored. In the competing talker condition, the performance differential also increased with masker level. A computer model of energetic masking in the competing talker condition ruled out the possibility that the native advantage could be explained wholly by energetic masking. Both groups drew equal benefit from differences in mean F0 between target and masker, suggesting that processes which make use of this cue do not engage language-specific knowledge.

2.
A triadic comparisons task and an identification task were used to evaluate normally hearing listeners' and hearing-impaired listeners' perceptions of synthetic CV stimuli in the presence of competition. The competing signals included multitalker babble, continuous speech spectrum noise, a CV masker, and a brief noise masker shaped to resemble the onset spectrum of the CV masker. All signals and maskers were presented monotically. Interference by competition was assessed by comparing Multidimensional Scaling solutions derived from each masking condition to that derived from the baseline (quiet) condition. Analysis of the effects of continuous maskers revealed that multitalker babble and continuous noise caused the same amount of change in performance, as compared to the baseline condition, for all listeners. CV masking changed performance significantly more than did brief noise masking, and the hearing-impaired listeners experienced more degradation in performance than normals. Finally, the velar CV maskers (/gɛ/ and /kɛ/) caused significantly greater masking effects than the bilabial CV maskers (/bɛ/ and /pɛ/), and were most resistant to masking by other competing stimuli. The results suggest that speech intelligibility difficulties in the presence of competing segments of speech are primarily attributable to phonetic interference rather than to spectral masking. Individual differences in hearing-impaired listeners' performances are also discussed.

3.
Talkers change the way they speak in noisy conditions. For energetic maskers, speech production changes are relatively well-understood, but less is known about how informational maskers such as competing speech affect speech production. The current study examines the effect of energetic and informational maskers on speech production by talkers speaking alone or in pairs. Talkers produced speech in quiet and in backgrounds of speech-shaped noise, speech-modulated noise, and competing speech. Relative to quiet, speech output level and fundamental frequency increased and spectral tilt flattened in proportion to the energetic masking capacity of the background. In response to modulated backgrounds, talkers were able to reduce substantially the degree of temporal overlap with the noise, with greater reduction for the competing speech background. Reduction in foreground-background overlap can be expected to lead to a release from both energetic and informational masking for listeners. Passive changes in speech rate, mean pause length or pause distribution cannot explain the overlap reduction, which appears instead to result from a purposeful process of listening while speaking. Talkers appear to monitor the background and exploit upcoming pauses, a strategy which is particularly effective for backgrounds containing intelligible speech.

4.
Speech recognition performance was measured in normal-hearing and cochlear-implant listeners with maskers consisting of either steady-state speech-spectrum-shaped noise or a competing sentence. Target sentences from a male talker were presented in the presence of one of three competing talkers (same male, different male, or female) or speech-spectrum-shaped noise generated from this talker at several target-to-masker ratios. For the normal-hearing listeners, target-masker combinations were processed through a noise-excited vocoder designed to simulate a cochlear implant. With unprocessed stimuli, a normal-hearing control group maintained high levels of intelligibility down to target-to-masker ratios as low as 0 dB and showed a release from masking, producing better performance with single-talker maskers than with steady-state noise. In contrast, no masking release was observed in either implant or normal-hearing subjects listening through an implant simulation. The performance of the simulation and implant groups did not improve when the single-talker masker was a different talker compared to the same talker as the target speech, as was found in the normal-hearing control. These results are interpreted as evidence for a significant role of informational masking and modulation interference in cochlear implant speech recognition with fluctuating maskers. This informational masking may originate from increased target-masker similarity when spectral resolution is reduced.

5.
Studies of speech perception in various types of background noise have shown that noise with linguistic content affects listeners differently than nonlinguistic noise [e.g., Simpson, S. A., and Cooke, M. (2005). "Consonant identification in N-talker babble is a nonmonotonic function of N," J. Acoust. Soc. Am. 118, 2775-2778; Sperry, J. L., Wiley, T. L., and Chial, M. R. (1997). "Word recognition performance in various background competitors," J. Am. Acad. Audiol. 8, 71-80] but few studies of multi-talker babble have employed background babble in languages other than the target speech language. To determine whether the adverse effect of background speech is due to the linguistic content or to the acoustic characteristics of the speech masker, this study assessed speech-in-noise recognition when the language of the background noise was either the same or different from the language of the target speech. Replicating previous findings, results showed poorer English sentence recognition by native English listeners in six-talker babble than in two-talker babble, regardless of the language of the babble. In addition, our results showed that in two-talker babble, native English listeners were more adversely affected by English babble than by Mandarin Chinese babble. These findings demonstrate informational masking on sentence-in-noise recognition in the form of "linguistic interference." Whether this interference is at the lexical, sublexical, and/or prosodic levels of linguistic structure and whether it is modulated by the phonetic similarity between the target and noise languages remains to be determined.

6.
Speech produced in the presence of noise (Lombard speech) is more intelligible in noise than speech produced in quiet, but the origin of this advantage is poorly understood. Some of the benefit appears to arise from auditory factors such as energetic masking release, but a role for linguistic enhancements similar to those exhibited in clear speech is possible. The current study examined the effect of Lombard speech in noise and in quiet for Spanish learners of English. Non-native listeners showed a substantial benefit of Lombard speech in noise, although not quite as large as that displayed by native listeners tested on the same task in an earlier study [Lu and Cooke (2008), J. Acoust. Soc. Am. 124, 3261-3275]. The difference between the two groups is unlikely to be due to energetic masking. However, Lombard speech was less intelligible in quiet for non-native listeners than normal speech. The relatively small difference in Lombard benefit in noise for native and non-native listeners, along with the absence of Lombard benefit in quiet, suggests that any contribution of linguistic enhancements in the Lombard benefit for natives is small.

7.
When listeners hear a target signal in the presence of competing sounds, they are quite good at extracting information at instances when the local signal-to-noise ratio of the target is most favorable. Previous research suggests that listeners can easily understand a periodically interrupted target when it is interleaved with noise. It is not clear if this ability extends to the case where an interrupted target is alternated with a speech masker rather than noise. This study examined speech intelligibility in the presence of noise or speech maskers, which were either continuous or interrupted at one of six rates between 4 and 128 Hz. Results indicated that with noise maskers, listeners performed significantly better with interrupted, rather than continuous maskers. With speech maskers, however, performance was better in continuous, rather than interrupted masker conditions. Presumably the listeners used continuity as a cue to distinguish the continuous masker from the interrupted target. Intelligibility in the interrupted masker condition was improved by introducing a pitch difference between the target and speech masker. These results highlight the role that target-masker differences in continuity and pitch play in the segregation of competing speech signals.
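
The alternation of an interrupted target with an interrupted masker described in abstract 7 can be illustrated with a simple square-wave gate. This is only a sketch: the 50% duty cycle, the abrupt (unramped) gating, and the names `square_gate` and `alternate` are assumptions made for illustration, not details taken from the study.

```python
def square_gate(n_samples, fs, rate_hz, invert=False):
    """Return a 0/1 gate that is on for the first half of each period.

    Assumes a 50% duty cycle and no onset/offset ramps (hypothetical choices).
    With invert=True the gate is on during the second half instead.
    """
    gate = []
    for i in range(n_samples):
        phase = (i * rate_hz / fs) % 1.0  # position within the current period
        on = phase < 0.5
        gate.append(0.0 if on == invert else 1.0)
    return gate


def alternate(target, masker, fs, rate_hz):
    """Interleave target and masker so that only one is audible at a time."""
    n = min(len(target), len(masker))
    t_gate = square_gate(n, fs, rate_hz)
    m_gate = square_gate(n, fs, rate_hz, invert=True)
    return [target[i] * t_gate[i] + masker[i] * m_gate[i] for i in range(n)]
```

Calling `alternate(target, masker, fs=16000, rate_hz=4)` would switch between the two signals every 125 ms; real stimuli would normally add short ramps at each transition to avoid audible clicks.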

8.
This study examined combinations of energetic and informational maskers in speech identification. Speech targets and maskers (speech or noise) were processed and filtered into sets of 15 narrow frequency bands. The target was the sum of eight randomly selected bands. More masking occurred for speech maskers than for spectrally matched noise maskers regardless of whether the masker bands overlapped the target bands. The greater effect of the speech maskers was interpreted as due to informational masking. When the masker consisted of nonoverlapping bands of speech, the addition of bands of noise overlapping the speech masker, but not the speech target, reduced the overall amount of masking. Surprisingly, presenting the noise to the ear contralateral to the target and masker produced an even greater release from masking. The contralateral noise was apparently sufficient to cause a slight change in the image of the ipsilateral speech masker, possibly pulling it away from the target enough to allow the focus of attention on the target. This finding is consistent with the interpretation that in some conditions small binaural differences may be sufficient to cause, or significantly strengthen, the perceptual segregation of sounds.

9.
Spatial unmasking describes the improvement in the detection or identification of a target sound afforded by separating it spatially from simultaneous masking sounds. This effect has been studied extensively for speech intelligibility in the presence of interfering sounds. In the current study, listeners identified zebra finch song, which shares many acoustic properties with speech but lacks semantic and linguistic content. Three maskers with the same long-term spectral content but different short-term statistics were used: (1) chorus (combinations of unfamiliar zebra finch songs), (2) song-shaped noise (broadband noise with the average spectrum of chorus), and (3) chorus-modulated noise (song-shaped noise multiplied by the broadband envelope from a chorus masker). The amount of masking and spatial unmasking depended on the masker and there was evidence of release from both energetic and informational masking. Spatial unmasking was greatest for the statistically similar chorus masker. For the two noise maskers, there was less spatial unmasking and it was wholly accounted for by the relative target and masker levels at the acoustically better ear. The results share many features with analogous results using speech targets, suggesting that spatial separation aids in the segregation of complex natural sounds through mechanisms that are not specific to speech.

10.
Detection was measured for a 500 Hz tone masked by noise (an "energetic" masker) or sets of ten randomly drawn tones (an "informational" masker). Presenting the maskers diotically and the target tone with a variety of interaural differences (interaural amplitude ratios and/or interaural time delays) resulted in reduced detection thresholds relative to when the target was presented diotically ("binaural release from masking"). Thresholds observed when time and amplitude differences applied to the target were "reinforcing" (favored the same ear, resulting in a lateralized position for the target) were not significantly different from thresholds obtained when differences were "opposing" (favored opposite ears, resulting in a centered position for the target). This irrelevance of differences in the perceived location of the target is a classic result for energetic maskers but had not previously been shown for informational maskers. However, this parallelism between the patterns of binaural release for energetic and informational maskers was not accompanied by high correlations between the patterns for individual listeners, supporting the idea that the mechanisms for binaural release from energetic and informational masking are fundamentally different.

11.
This study investigated whether speech-like maskers without linguistic content produce informational masking of speech. The target stimuli were nonsense Chinese Mandarin sentences. In experiment I, the masker contained harmonics the fundamental frequency (F0) of which was sinusoidally modulated and the mean F0 of which was varied. The magnitude of informational masking was evaluated by measuring the change in intelligibility (releasing effect) produced by inducing a perceived spatial separation of the target speech and masker via the precedence effect. The releasing effect was small and was only clear when the target and masker had the same mean F0, suggesting that informational masking was small. Performance with the harmonic maskers was better than with a steady speech-shaped noise (SSN) masker. In experiments II and III, the maskers were speech-like synthesized signals, alternating between segments with harmonic structure and segments composed of SSN. Performance was much worse than for experiment I, and worse than when an SSN masker was used, suggesting that substantial informational masking occurred. The similarity of the F0 contours of the target and masker had little effect. The informational masking effect was not influenced by whether or not the noise-like segments of the masker were synchronous with the unvoiced segments of the target speech.

12.
This study investigated comodulation detection differences (CDD) for fixed- and roved-frequency maskers. The objective was to determine whether CDD could be accounted for better in terms of energetic masking or in terms of perceptual fusion/segregation related to comodulation. Roved-frequency maskers were used in order to minimize the role of energetic masking, allowing possible effects related to perceptual fusion/segregation to be revealed. The signals and maskers were composed of 30-Hz-wide noise bands. The signal was either comodulated with the masker (A/A condition) or had a temporal envelope that was independent (A/B condition). The masker was either gated synchronously with the signal or had a leading temporal fringe of 200 ms. In the fixed-frequency masker conditions, listeners with low A/A thresholds showed little masking release due to masker temporal fringe and had CDDs that could be accounted for by energetic masking. Listeners with higher A/A thresholds in the fixed-frequency masker conditions showed relatively large CDDs and large masking release due to a masker temporal fringe. The CDDs of these listeners may have arisen, at least in part, from processes related to perceptual segregation. Some listeners in the roved masker conditions also had large CDDs that appeared to be related to perceptual segregation.

13.
English consonant recognition in undegraded and degraded listening conditions was compared for listeners whose primary language was either Japanese or American English. There were ten subjects in each of the two groups, termed the non-native (Japanese) and the native (American) subjects, respectively. The Modified Rhyme Test was degraded either by a babble of voices (S/N = -3 dB) or by a room reverberation (reverberation time, T = 1.2 s). The Japanese subjects performed at a lower level than the American subjects in both noise and reverberation, although the performance difference in the undegraded, quiet condition was relatively small. There was no difference between the scores obtained in noise and in reverberation for either group. A limited-error analysis revealed some differences in type of errors for the groups of listeners. Implications of the results are discussed in terms of the effects of degraded listening conditions on non-native listeners' speech perception.

14.
Although informational masking is thought to reflect central mechanisms, the effects are generally much stronger when the target and masker are presented to the same ear than when they are presented to different ears. However, the results of a recent study by Brungart and Simpson [J. Acoust. Soc. Am. 112, 2985-2995 (2002)] indicated that a speech masker that is presented contralateral to a speech signal can produce substantial amounts of informational masking when a second speech masker is played simultaneously in the same ear as the signal. In this study, we conducted a series of experiments that paralleled those of Brungart and Simpson but used a pure-tone signal and multitone informational maskers in a detection task. Both the signal and the maskers were played as sequences of short bursts in each observation interval. The maskers were arranged in two types of spectrotemporal patterns. One type of pattern, called "multiple-bursts same" (MBS), has previously been shown to produce very large amounts of informational masking while the other type of pattern, called "multiple-bursts different" (MBD), has been shown to produce very small amounts of informational masking. Several conditions of ipsilateral, contralateral, and combined presentation of these maskers were tested. The results showed that presentation of the MBS masker in the contralateral ear produced a substantial amount of informational masking when the MBD masker was simultaneously presented to the ipsilateral ear. The results supported the earlier findings of Brungart and Simpson indicating that listeners are unable to selectively focus their attention on a single ear in some complex dichotic listening conditions. These results suggest that this contralateral masking effect is not restricted to speech and may reflect more general limitations on processing capacity. Further, it was concluded that the magnitude of the contralateral masking effect was related both to the informational masking value of the contralateral masker and the complexity of the stimulus and/or task in the ear in which the signal was presented.

15.
Many studies of informational masking have shown that interfering speech reduces speech intelligibility. This effect is often used to secure privacy in public spaces, and such applications require estimates of how much masking is needed. In general, masking effects are estimated from spectral information in the form of excitation patterns. However, informational masking can hardly be estimated from spectral information alone. We therefore estimated the effects of informational masking using time-domain information, calculating the cepstra of the envelopes' magnitude histograms. If these cepstra differ between the target and the masker, the signals are dissimilar in the time domain, and the effect of informational masking is expected to be low. We therefore used the histograms' cepstra distances (HCD) to estimate signal similarities. In the first experiment, signal similarities were estimated with the HCD for five maskers: random noise, music, female speech, male speech, and the target speaker's speech. Male and female speech were more similar to the target speech than music and noise, and the same speaker's speech was the most similar in the set of maskers. In the second experiment, a listening test was carried out to verify the HCD. A double masker, which has characteristics similar to reversed speech, was used in this experiment as an effective informational masker. The listening test results suggest that the double masker's masking effects have the same relation with the HCD. This suggests that informational masking can be estimated from signal similarity using the HCD.
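
One way the histogram-cepstrum comparison in abstract 15 might be computed is sketched below. The abstract does not give the exact definitions, so every detail here is an assumption: the envelope is a rectify-and-smooth approximation, the histogram uses 32 bins on [0, 1], the cepstrum is the real cepstrum of the histogram, and the distance is Euclidean. The function names are hypothetical.

```python
import cmath
import math


def envelope(signal, win=32):
    """Crude amplitude envelope: rectify, then apply a causal moving average."""
    rect = [abs(x) for x in signal]
    out, acc = [], 0.0
    for i, v in enumerate(rect):
        acc += v
        if i >= win:
            acc -= rect[i - win]
        out.append(acc / min(i + 1, win))
    return out


def histogram(values, bins=32, vmax=1.0):
    """Normalized histogram of envelope magnitudes on [0, vmax]."""
    counts = [0] * bins
    for v in values:
        counts[min(int(v / vmax * bins), bins - 1)] += 1
    return [c / len(values) for c in counts]


def real_cepstrum(hist, eps=1e-6):
    """Inverse DFT of the log magnitude of the DFT of the histogram."""
    n = len(hist)
    spectrum = [sum(hist[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)) for k in range(n)]
    log_mag = [math.log(abs(s) + eps) for s in spectrum]
    return [sum(log_mag[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n for t in range(n)]


def hcd(sig_a, sig_b):
    """Euclidean distance between the two histogram cepstra (assumed metric)."""
    ca = real_cepstrum(histogram(envelope(sig_a)))
    cb = real_cepstrum(histogram(envelope(sig_b)))
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(ca, cb)))
```

Under these assumptions a signal compared with itself gives a distance of zero, and signals whose envelope statistics differ (for example steady noise versus slowly modulated noise) give a larger value. Because the cepstrum here depends only on the magnitude spectrum of the histogram, this particular sketch cannot distinguish histograms that are circular shifts of one another.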

16.
Although most recent multitalker research has emphasized the importance of binaural cues, monaural cues can play an equally important role in the perception of multiple simultaneous speech signals. In this experiment, the intelligibility of a target phrase masked by a single competing masker phrase was measured as a function of signal-to-noise ratio (SNR) with same-talker, same-sex, and different-sex target and masker voices. The results indicate that informational masking, rather than energetic masking, dominated performance in this experiment. The amount of masking was highly dependent on the similarity of the target and masker voices: performance was best when different-sex talkers were used and worst when the same talker was used for target and masker. Performance did not, however, improve monotonically with increasing SNR. Intelligibility generally plateaued at SNRs below 0 dB and, in some cases, intensity differences between the target and masking voices produced substantial improvements in performance with decreasing SNR. The results indicate that informational and energetic masking play substantially different roles in the perception of competing speech messages.
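
Several of the abstracts in this list vary the signal-to-noise ratio between target and masker. A minimal sketch of how a masker could be rescaled to reach a requested SNR is shown below, assuming the SNR is defined from overall RMS levels (the individual studies may define level differently). The function names are hypothetical.

```python
import math


def rms(x):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(v * v for v in x) / len(x))


def snr_db(target, masker):
    """SNR in dB from overall RMS levels: 20 * log10(rms_target / rms_masker)."""
    return 20.0 * math.log10(rms(target) / rms(masker))


def scale_masker_to_snr(target, masker, wanted_snr_db):
    """Return a copy of the masker scaled so the target/masker SNR is as requested."""
    gain = rms(target) / (rms(masker) * 10.0 ** (wanted_snr_db / 20.0))
    return [gain * m for m in masker]
```

For example, `scale_masker_to_snr(target, masker, -6.0)` makes the masker 6 dB more intense than the target; the mixture is then the sample-by-sample sum of the target and the scaled masker.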

17.
Many competing noises in real environments are modulated or fluctuating in level. Listeners with normal hearing are able to take advantage of temporal gaps in fluctuating maskers. Listeners with sensorineural hearing loss show less benefit from modulated maskers. Cochlear implant users may be more adversely affected by modulated maskers because of their limited spectral resolution and their reliance on envelope-based signal-processing strategies of implant processors. The current study evaluated cochlear implant users' ability to understand sentences in the presence of modulated speech-shaped noise. Normal-hearing listeners served as a comparison group. Listeners repeated IEEE sentences in quiet, steady noise, and modulated noise maskers. Maskers were presented at varying signal-to-noise ratios (SNRs) at six modulation rates varying from 1 to 32 Hz. Results suggested that normal-hearing listeners obtain significant release from masking from modulated maskers, especially at 8-Hz masker modulation frequency. In contrast, cochlear implant users experience very little release from masking from modulated maskers. The data suggest, in fact, that they may show negative effects of modulated maskers at syllabic modulation rates (2-4 Hz). Similar patterns of results were obtained from implant listeners using three different devices with different speech-processor strategies. The lack of release from masking occurs in implant listeners independent of their device characteristics, and may be attributable to the nature of implant processing strategies and/or the lack of spectral detail in processed stimuli.

18.
This study examined whether increasing the similarity between informational maskers and signals would increase the amount of masking obtained in a nonspeech pattern identification task. The signals were contiguous sequences of pure-tone bursts arranged in six narrow-band spectro-temporal patterns. The informational maskers were sequences of multitone bursts played synchronously with the signal tones. The listener's task was to identify the patterns in a 1-interval 6-alternative forced-choice procedure. Three types of multitone maskers were generated according to different randomization rules. For the least signal-like informational masker, the components in each multitone burst were chosen at random within the frequency range of 200-6500 Hz, excluding a "protected region" around the signal frequencies. For the intermediate masker, the frequency components in the first burst were chosen quasirandomly, but the components in successive bursts were constrained to fall in narrow frequency bands around the frequencies of the components in the initial burst. Within the narrow bands the frequencies were randomized. This masker was considered to be more similar to the signal patterns because it consisted of a set of narrow-band sequences any one of which might be mistaken for a signal pattern. The most signal-like masker was similar to the intermediate masker in that it consisted of a set of synchronously played narrow-band sequences, but the variation in frequency within each sequence was sinusoidal, completing roughly one period in a sequence. This masker consisted of discernible patterns but not patterns that were part of the set of signals. In addition, masking produced by Gaussian noise bursts (thought to produce primarily peripherally based "energetic masking") was measured and compared to the informational masking results. For the three informational maskers, more masking was produced by the maskers composed of narrow-band sequences than by the masker in which the frequencies were not constrained to narrow bands. Also, the slopes of the performance-level functions for the three informational maskers were much shallower than for the Gaussian noise masker or for no masker. The findings provided qualified support for the hypothesis that increasing the similarity between signals and maskers, or parts of the maskers, causes greater informational masking. However, it is also possible that the greater masking was a consequence of increasing the number of perceptual "streams" that had to be evaluated by the listener.

19.
To examine spectral and threshold effects for speech and noise at high levels, recognition of nonsense syllables was assessed for low-pass-filtered speech and speech-shaped maskers and high-pass-filtered speech and speech-shaped maskers at three speech levels, with signal-to-noise ratio held constant. Subjects were younger adults with normal hearing and older adults with normal hearing but significantly higher average quiet thresholds. A broadband masker was always present to minimize audibility differences between subject groups and across presentation levels. For subjects with lower thresholds, the declines in recognition of low-frequency syllables in low-frequency maskers were attributed to nonlinear growth of masking which reduced "effective" signal-to-noise ratio at high levels, whereas the decline for subjects with higher thresholds was not fully explained by nonlinear masking growth. For all subjects, masking growth did not entirely account for declines in recognition of high-frequency syllables in high-frequency maskers at high levels. Relative to younger subjects with normal hearing and lower quiet thresholds, older subjects with normal hearing and higher quiet thresholds had poorer consonant recognition in noise, especially for high-frequency speech in high-frequency maskers. Age-related effects on thresholds and task proficiency may be determining factors in the recognition of speech in noise at high levels.

20.
Normal-hearing (NH) listeners maintain robust speech understanding in modulated noise by "glimpsing" portions of speech from a partially masked waveform, a phenomenon known as masking release (MR). Cochlear implant (CI) users, however, generally lack such resiliency. In previous studies, temporal masking of speech by noise occurred randomly, obscuring to what degree MR is attributable to the temporal overlap of speech and masker. In the present study, masker conditions were constructed to either promote (+MR) or suppress (-MR) masking release by controlling the degree of temporal overlap. Sentence recognition was measured in 14 CI subjects and 22 young-adult NH subjects. Normal-hearing subjects showed large amounts of masking release in the +MR condition and a marked difference between +MR and -MR conditions. In contrast, CI subjects demonstrated less effect of MR overall, and some displayed modulation interference as reflected by poorer performance in modulated maskers. These results suggest that the poor performance of typical CI users in noise might be accounted for by factors that extend beyond peripheral masking, such as reduced segmental boundaries between syllables or words. Encouragingly, the best CI users tested here could take advantage of masker fluctuations to better segregate the speech from the background.
