Similar literature
 Found 20 similar articles (search time: 15 ms)
1.
2.
Speech intelligibility was investigated by varying the number of interfering talkers, the level and mean pitch differences between target and interfering speech, and the presence of tactile support. In a first experiment the speech-reception threshold (SRT) for sentences was measured for a male talker against a background of one to eight interfering male talkers or speech noise. Speech was presented diotically and vibro-tactile support was given by presenting the low-pass-filtered signal (0-200 Hz) to the index finger. The benefit in the SRT resulting from tactile support ranged from 0 to 2.4 dB and was largest for one or two interfering talkers. A second experiment focused on masking effects of one interfering talker. The interference was the target talker's own voice with its mean pitch increased by 2, 4, 8, or 12 semitones. Level differences between target and interfering speech ranged from -16 to +4 dB. Results from measurements of correctly perceived words in sentences show an intelligibility increase of up to 27% due to tactile support. Performance gradually improves with increasing pitch difference. Louder target speech generally helps perception, but results for level differences depend considerably on pitch differences. Differences in performance between noise and speech maskers and between speech maskers with various mean pitches are explained by the effect of informational masking.
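The pitch shifts in the second experiment are stated in semitones. As a rough illustration of what those steps mean in frequency terms, the sketch below converts a semitone shift into a frequency ratio under the usual equal-temperament convention (a factor of 2 per 12 semitones). The function names and the 100 Hz base pitch are hypothetical and are not taken from the abstract.

```python
def semitone_ratio(semitones: float) -> float:
    """Frequency ratio for a shift of `semitones` (12 semitones = one octave)."""
    return 2.0 ** (semitones / 12.0)


def shifted_pitch(f0_hz: float, semitones: float) -> float:
    """Mean pitch after raising `f0_hz` by the given number of semitones."""
    return f0_hz * semitone_ratio(semitones)


# Hypothetical mean pitch for a male talker; the abstract gives no value.
BASE_F0_HZ = 100.0

for n in (2, 4, 8, 12):  # the shifts listed in the abstract
    print(f"+{n} semitones -> {shifted_pitch(BASE_F0_HZ, n):.1f} Hz")
```

Under this convention the largest shift in the study, 12 semitones, corresponds to a doubling of the mean pitch.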

3.
Speech intelligibility (PB words) in traffic-like noise was investigated in a laboratory situation simulating three common listening situations, indoors at 1 and 4 m and outdoors at 1 m. The maximum noise levels still permitting 75% intelligibility of PB words in these three listening situations were also defined. A total of 269 persons were examined. Forty-six had normal hearing, 90 a presbycusis-type hearing loss, 95 a noise-induced hearing loss and 38 a conductive hearing loss. In the indoor situation the majority of the groups with impaired hearing retained good speech intelligibility in 40 dB(A) masking noise. Lowering the noise level to less than 40 dB(A) resulted in a minor, usually insignificant, improvement in speech intelligibility. Listeners with normal hearing maintained good speech intelligibility in the outdoor listening situation at noise levels up to 60 dB(A), without lip-reading (i.e., without using non-auditory information). For groups with impaired hearing due to age and/or noise, representing 8% of the population in Sweden, the noise level outdoors had to be lowered to less than 50 dB(A) in order to achieve good speech intelligibility at 1 m without lip-reading.

4.
Binaural speech intelligibility of individual listeners under realistic conditions was predicted using a model consisting of a gammatone filter bank, an independent equalization-cancellation (EC) process in each frequency band, a gammatone resynthesis, and the speech intelligibility index (SII). Hearing loss was simulated by adding uncorrelated masking noises (according to the pure-tone audiogram) to the ear channels. Speech intelligibility measurements were carried out with 8 normal-hearing and 15 hearing-impaired listeners, collecting speech reception threshold (SRT) data for three different room acoustic conditions (anechoic, office room, cafeteria hall) and eight directions of a single noise source (speech in front). Artificial EC processing errors derived from binaural masking level difference data using pure tones were incorporated into the model. Except for an adjustment of the SII-to-intelligibility mapping function, no model parameter was fitted to the SRT data of this study. The overall correlation coefficient between predicted and observed SRTs was 0.95. The dependence of the SRT of an individual listener on the noise direction and on room acoustics was predicted with a median correlation coefficient of 0.91. The effect of individual hearing impairment was predicted with a median correlation coefficient of 0.95. However, for mild hearing losses the release from masking was overestimated.

5.
The Signal-to-Noise Ratio devised by Lochner and Burger contributed an objective design index for predicting speech intelligibility. Their index provided a measure of useful and detrimental reflected speech energy according to the integration and masking characteristics of hearing, and enabled predictions to be made from impulse measurements in models. However, it was found necessary to extend the Signal-to-Noise Ratio theory to account for the effect of fluctuating ambient background noise on speech intelligibility. A modified Signal-to-Noise Ratio was derived from a best-fitting empirical correlation with speech intelligibility in a series of measurements in existing auditoria. In the modified Signal-to-Noise Ratio, ambient background noise is no longer considered in terms of its steady-state characteristics but more specifically in terms of its transient and spectral characteristics, given by the concept of the L10 PNC level. The index has been applied as a design criterion in prediction and evaluation techniques.

6.
The effects of intensity on monosyllabic word recognition were studied in adults with normal hearing and mild-to-moderate sensorineural hearing loss. The stimuli were bandlimited NU#6 word lists presented in quiet and talker-spectrum-matched noise. Speech levels ranged from 64 to 99 dB SPL and S/N ratios from 28 to -4 dB. In quiet, the performance of normal-hearing subjects remained essentially constant; in noise, at a fixed S/N ratio, it decreased as a linear function of speech level. Hearing-impaired subjects performed like normal-hearing subjects tested in noise when the data were corrected for the effects of audibility loss. From these and other results, it was concluded that: (1) speech intelligibility in noise decreases when speech levels exceed 69 dB SPL and the S/N ratio remains constant; (2) the effects of speech and noise level are synergistic; (3) the deterioration in intelligibility can be modeled as a relative increase in the effective masking level; (4) normal-hearing and hearing-impaired subjects are affected similarly by increased signal level when differences in speech audibility are considered; (5) the negative effects of increasing speech and noise levels on speech recognition are similar for all adult subjects, at least up to 80 years; and (6) the effective dynamic range of speech may be larger than the commonly assumed value of 30 dB.

7.
A model for predicting the intelligibility of processed noisy speech is proposed. The speech-based envelope power spectrum model has a structure similar to that of the model of Ewert and Dau [(2000). J. Acoust. Soc. Am. 108, 1181-1196], developed to account for modulation detection and masking data. The model estimates the speech-to-noise envelope power ratio, SNR(env), at the output of a modulation filterbank and relates this metric to speech intelligibility using the concept of an ideal observer. Predictions were compared to data on the intelligibility of speech presented in stationary speech-shaped noise. The model was further tested in conditions with noisy speech subjected to reverberation and spectral subtraction. Good agreement between predictions and data was found in all cases. For spectral subtraction, an analysis of the model's internal representation of the stimuli revealed that the predicted decrease of intelligibility was caused by the estimated noise envelope power exceeding that of the speech. The classical concept of the speech transmission index fails in this condition. The results strongly suggest that the signal-to-noise ratio at the output of a modulation-frequency-selective process provides a key measure of speech intelligibility.

8.
When a target-speech/masker mixture is processed with the ideal binary mask (IBM) signal-separation technique, intelligibility of target speech is remarkably improved in both normal-hearing listeners and hearing-impaired listeners. Intelligibility of speech can also be improved by filling in speech gaps with unmodulated broadband noise. This study investigated whether intelligibility of target speech in the IBM-treated target-speech/masker mixture can be further improved by adding a broadband-noise background. The results of this study show that following the IBM manipulation, which remarkably released target speech from speech-spectrum noise, foreign-speech, or native-speech masking (experiment 1), adding a broadband-noise background with a signal-to-noise ratio of no less than 4 dB significantly improved intelligibility of target speech when the masker was either noise (experiment 2) or speech (experiment 3). The results suggest that since adding the noise background makes the areas of silence in the time-frequency domain of the IBM-treated target-speech/masker mixture shallower, the abruptness of transient changes in the mixture is smoothed and the perceived continuity of target-speech components is enhanced, leading to improved target-speech intelligibility. The findings are useful for advancing computational auditory scene analysis, hearing-aid/cochlear-implant designs, and understanding of speech perception under "cocktail-party" conditions.
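The abstract does not spell out how the ideal binary mask is built. A minimal sketch, assuming the commonly used definition (keep a time-frequency unit when the local target-to-masker ratio exceeds a local criterion, discard it otherwise), is given below. The 0 dB default criterion, the function names, and the nested-list energy representation are illustrative assumptions, not details from the study.

```python
import math


def ideal_binary_mask(target_energy, masker_energy, local_criterion_db=0.0):
    """Return a 0/1 mask over time-frequency units.

    A unit is kept (1) when the local target-to-masker ratio in dB exceeds
    `local_criterion_db` (an assumed default of 0 dB), else dropped (0).
    Inputs are equally shaped nested lists of non-negative energies.
    """
    mask = []
    for t_row, m_row in zip(target_energy, masker_energy):
        row = []
        for t, m in zip(t_row, m_row):
            if t <= 0.0:
                keep = False          # no target energy in this unit
            elif m <= 0.0:
                keep = True           # target present, masker absent
            else:
                keep = 10.0 * math.log10(t / m) > local_criterion_db
            row.append(1 if keep else 0)
        mask.append(row)
    return mask


def apply_mask(mixture_energy, mask):
    """Zero out the mixture units that the mask discards."""
    return [[e * k for e, k in zip(e_row, k_row)]
            for e_row, k_row in zip(mixture_energy, mask)]
```

The discarded units become silent gaps in the time-frequency plane; these are the regions that the added broadband-noise background in experiments 2 and 3 partially fills.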

9.
This study demonstrates a new possibility of estimating intelligibility of speech in informational maskers. The temporal and spectral properties of sound maskers are investigated to achieve acoustic privacy in public spaces. Speech intelligibility (SI) tests were conducted using Japanese sentences in daily use with energetic (white noise) or informational (reversed speech) maskers. We found that the masking effects on SI, including informational masking, might not be estimated by analyzing the narrow-band temporal envelopes, which is a common way of predicting SI under noisy conditions. The masking effects might instead be visualized by spectral auto-correlation analysis on a frame-by-frame basis, for the series of dominant spectral peaks of the masked target in the frequency domain. Consequently, we found that dissimilarity in frame-based spectral-auto-correlation sequences between the original and masked targets was the key to evaluating maskers, including informational masking effects on SI.

10.
An adaptive test has been developed to determine the minimum bandwidth of speech that a listener needs to reach 50% intelligibility. Measuring this speech-reception bandwidth threshold (SRBT), in addition to the more common speech-reception threshold (SRT) in noise, may be useful in investigating the factors underlying impaired suprathreshold speech perception. Speech was bandpass filtered (center frequency: 1 kHz) and complementary bandstop-filtered noise was added. To obtain reference values, the SRBT was measured in 12 normal-hearing listeners at four sound-pressure levels, in combination with three overall spectral tilts. Plotting SRBT as a function of sound-pressure level resulted in U-shaped curves. The narrowest SRBT (1.4 octaves) was obtained at an A-weighted sound-pressure level of 55 dB. The required bandwidth increases with increasing level, probably due to upward spread of masking. At a lower level (40 dBA) listeners also need a broader band, because parts of the speech signal will be below threshold. The SII (Speech Intelligibility Index) model reasonably predicts the data, although it seems to underestimate upward spread of masking.

11.
Time-reversed speech is known to mask information effectively in speech privacy applications. However, the annoyance and distraction caused by a time-reversed, speech-like masking sound are higher than for other masking sounds. This study investigates the effects of adding artificial reverberation to the time-reversed speech. Subjective listening tests were conducted to measure the intelligibility of target speech and the annoyance and distraction caused by the masking sound. The experimental results suggest that adding artificial reverberation to a speech-like masking sound significantly reduces the annoyance level while maintaining the masking effectiveness of the original masking sound. A trend was also observed that the addition of artificial reverberation could reduce the level of distraction caused by the masking sound.

12.
Low-frequency masking by intense high-frequency noise bands, referred to as remote masking (RM), was the first evidence to challenge energy-detection models of signal detection. Its underlying mechanisms remain unknown. RM was measured in five normal-hearing young adults at 250, 350, 500, and 700 Hz using equal-power, spectrally matched random-phase noise (RPN) and low-noise noise (LNN) narrowband maskers. RM was also measured using an equal-power two-tone complex (TC2) and eight-tone complex (TC8). Maskers were centered at 3000 Hz with a bandwidth of one or two equivalent rectangular bandwidths (ERBs). Masker levels varied from 80 to 95 dB sound pressure level in 5 dB steps. LNN produced negligible masking for all conditions. An increase in bandwidth in RPN yielded greater masking over a wider frequency region. Masking for TC2 was limited to 350 and 700 Hz for one ERB but shifted to only 700 Hz for two ERBs. A spread of masking to 500 and 700 Hz was observed for TC8 when the bandwidth was increased from one to two ERBs. Results suggest that high-frequency noise bands at high levels could generate significant low-frequency masking. It is possible that listeners experience significant RM due to the amplification of various competing noises, which might have significant implications for speech perception in noise.

13.
The role of perceived spatial separation in the unmasking of speech   (Total citations: 12; self-citations: 0; citations by others: 12)
Spatial separation of speech and noise in an anechoic space creates a release from masking that often improves speech intelligibility. However, the masking release is severely reduced in reverberant spaces. This study investigated whether the distinct and separate localization of speech and interference provides any perceptual advantage that, due to the precedence effect, is not degraded by reflections. Listeners' identification of nonsense sentences spoken by a female talker was measured in the presence of either speech-spectrum noise or other sentences spoken by a second female talker. Target and interference stimuli were presented in an anechoic chamber from loudspeakers directly in front and 60 degrees to the right in single-source and precedence-effect (lead-lag) conditions. For speech-spectrum noise, the spatial separation advantage for speech recognition (8 dB) was predictable from articulation index computations based on measured release from masking for narrow-band stimuli. The spatial separation advantage was only 1 dB in the lead-lag condition, despite the fact that a large perceptual separation was produced by the precedence effect. For the female talker interference, a much larger advantage occurred, apparently because informational masking was reduced by differences in perceived locations of target and interference.

14.
The word recognition ability of 4 normal-hearing and 13 cochlearly hearing-impaired listeners was evaluated. Filtered and unfiltered speech in quiet and in noise were presented monaurally through headphones. The noise varied over listening situations with regard to spectrum, level, and temporal envelope. Articulation index theory was applied to predict the results. Two calculation methods were used, both based on the ANSI S3.5-1969 20-band method [S3.5-1969 (American National Standards Institute, New York)]. Method I was almost identical to the ANSI method. Method II included a level- and hearing-loss-dependent calculation of masking of stationary and on-off gated noise signals and of self-masking of speech. Method II provided the best prediction capability, and it is concluded that speech intelligibility of cochlearly hearing-impaired listeners may also, to a first approximation, be predicted from articulation index theory.

15.
Relations between perception of suprathreshold speech and auditory functions were examined in 24 hearing-impaired listeners and 12 normal-hearing listeners. The speech intelligibility index (SII) was used to account for audibility. The auditory functions included detection efficiency, temporal and spectral resolution, temporal and spectral integration, and discrimination of intensity, frequency, rhythm, and spectro-temporal shape. All auditory functions were measured at 1 kHz. Speech intelligibility was assessed with the speech-reception threshold (SRT) in quiet and in noise, and with the speech-reception bandwidth threshold (SRBT), previously developed for investigating speech perception in a limited frequency region around 1 kHz. The results showed that the elevated SRT in quiet could be explained on the basis of audibility. Audibility could only partly account for the elevated SRT values in noise and the deviant SRBT values, suggesting that suprathreshold deficits affected intelligibility in these conditions. SII predictions for the SRBT improved significantly by including the individually measured upward spread of masking in the SII model. Reduced spectral resolution, reduced temporal resolution, and reduced frequency discrimination appeared to be related to speech perception deficits. Loss of peripheral compression appeared to have the smallest effect on the intelligibility of suprathreshold speech.

16.
The speech understanding of persons with "flat" hearing loss (HI) was compared to a normal-hearing (NH) control group to examine how hearing loss affects the contribution of speech information in various frequency regions. Speech understanding in noise was assessed at multiple low- and high-pass filter cutoff frequencies. Noise levels were chosen to ensure that the noise, rather than quiet thresholds, determined audibility. The performance of HI subjects was compared to a NH group listening at the same signal-to-noise ratio and a comparable presentation level. Although absolute speech scores for the HI group were reduced, performance improvements as the speech and noise bandwidth increased were comparable between groups. These data suggest that the presence of hearing loss results in a uniform, rather than frequency-specific, deficit in the contribution of speech information. Measures of auditory thresholds in noise and speech intelligibility index (SII) calculations were also performed. These data suggest that differences in performance between the HI and NH groups are due primarily to audibility differences between groups. Measures of auditory thresholds in noise showed the "effective masking spectrum" of the noise was greater for the HI than the NH subjects.

17.
Speech intelligibility in classrooms directly affects the learning efficiency of students, especially those who are using a second language. Speech intelligibility is determined by many factors, such as speech level, signal-to-noise ratio, and reverberation time in the room. This paper investigates the contributions of these factors with subjective tests, especially speech level, which is required for designing the optimal gain for sound amplification systems in classrooms. The test material was generated by convolving the English Coordinate Response Measure corpus with room impulse responses and mixing the result with background noise. The subjects were all Chinese students who use English as a second language. It is found that speech intelligibility first increases and then decreases as speech level increases, and that the optimal English speech level is about 71 dBA in classrooms for Chinese listeners when the signal-to-noise ratio and the reverberation time are held constant. Finally, a regression equation is proposed to predict speech intelligibility from speech level, signal-to-noise ratio, and reverberation time.
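The stimulus generation described above (speech convolved with a room impulse response, then mixed with background noise) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how the signal-to-noise ratio was defined or set, so the RMS-based scaling here is an assumption, and all function names are hypothetical.

```python
import math


def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def rms(samples):
    """Root-mean-square amplitude of a sample list."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))


def mix_at_snr(speech, room_ir, noise, snr_db):
    """Reverberate `speech` with `room_ir`, then add `noise` scaled so that
    the RMS ratio of reverberant speech to noise equals `snr_db`
    (an assumed SNR definition). Returns (mixture, noise_gain)."""
    reverberant = convolve(speech, room_ir)
    if len(noise) < len(reverberant):
        raise ValueError("noise must be at least as long as the reverberant speech")
    noise = noise[:len(reverberant)]
    gain = rms(reverberant) / (rms(noise) * 10.0 ** (snr_db / 20.0))
    mixture = [r + gain * n for r, n in zip(reverberant, noise)]
    return mixture, gain
```

Varying the overall playback level of the mixture while holding `snr_db` and the impulse response fixed corresponds to the condition in which the abstract reports an optimum near 71 dBA.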

18.
Although many studies have shown that intelligibility improves when a speech signal and an interfering sound source are spatially separated in azimuth, little is known about the effect that spatial separation in distance has on the perception of competing sound sources near the head. In this experiment, head-related transfer functions (HRTFs) were used to process stimuli in order to simulate a target talker and a masking sound located at different distances along the listener's interaural axis. One of the signals was always presented at a distance of 1 m, and the other signal was presented 1 m, 25 cm, or 12 cm from the center of the listener's head. The results show that distance separation has very different effects on speech segregation for different types of maskers. When speech-shaped noise was used as the masker, most of the intelligibility advantages of spatial separation could be accounted for by spectral differences in the target and masking signals at the ear with the higher signal-to-noise ratio (SNR). When a same-sex talker was used as the masker, the intelligibility advantages of spatial separation in distance were dominated by binaural effects that produced the same performance improvements as a 4-5-dB increase in the SNR of a diotic stimulus. These results suggest that distance-dependent changes in the interaural difference cues of nearby sources play a much larger role in the reduction of the informational masking produced by an interfering speech signal than in the reduction of the energetic masking produced by an interfering noise source.

19.
In the many studies done on informational masking, interfering speech reduces speech intelligibility. This effect is often used to secure privacy in public spaces. These applications require estimates of how much masking is required. In general, masking effects are estimated by using spectrum information as excitation patterns. However, estimates of informational masking can hardly be obtained by only using spectrum information. Therefore, we estimated the effects of informational masking using time-domain information. To do so, we calculated the cepstra of the envelopes’ magnitude histograms. If these cepstra differ between the target and the masker, the signals are not similar in the time domain, and the effect of informational masking would be low. Therefore, we considered the histograms’ cepstra distances (HCD) to estimate signal similarities. The signal similarities in our first experiment were estimated for five maskers by utilizing the HCD. These maskers were random noise, music, female speech, male speech, and the target speaker’s speech. Male and female speech were more similar to the target speech than music and noise. Also, the same speaker’s speech was the most similar in the set of maskers. A listening test was carried out in the second experiment to verify the HCD. A double masker was used in this experiment as an effective informational masker. It has characteristics similar to those of reversed speech. The listening test results suggest that the double masker’s masking effects have the same relation with the HCD. This suggests informational masking can be estimated by signal similarity using the HCD.

20.
Spatial unmasking describes the improvement in the detection or identification of a target sound afforded by separating it spatially from simultaneous masking sounds. This effect has been studied extensively for speech intelligibility in the presence of interfering sounds. In the current study, listeners identified zebra finch song, which shares many acoustic properties with speech but lacks semantic and linguistic content. Three maskers with the same long-term spectral content but different short-term statistics were used: (1) chorus (combinations of unfamiliar zebra finch songs), (2) song-shaped noise (broadband noise with the average spectrum of chorus), and (3) chorus-modulated noise (song-shaped noise multiplied by the broadband envelope from a chorus masker). The amount of masking and spatial unmasking depended on the masker and there was evidence of release from both energetic and informational masking. Spatial unmasking was greatest for the statistically similar chorus masker. For the two noise maskers, there was less spatial unmasking and it was wholly accounted for by the relative target and masker levels at the acoustically better ear. The results share many features with analogous results using speech targets, suggesting that spatial separation aids in the segregation of complex natural sounds through mechanisms that are not specific to speech.
