Similar literature
20 similar documents found
1.
The Speech Transmission Index (STI) is a physical metric that is well correlated with the intelligibility of speech degraded by additive noise and reverberation. The traditional STI uses modulated noise as a probe signal and is valid for assessing degradations that result from linear operations on the speech signal. Researchers have attempted to extend the STI to predict the intelligibility of nonlinearly processed speech by proposing variations that use speech as a probe signal. This work considers four previously proposed speech-based STI methods and four novel methods, studied under conditions of additive noise, reverberation, and two nonlinear operations (envelope thresholding and spectral subtraction). Analyzing intermediate metrics in the STI calculation reveals why some methods fail for nonlinear operations. Results indicate that none of the previously proposed methods is adequate for all of the conditions considered, while four proposed methods produce qualitatively reasonable results and warrant further study. The discussion considers the relevance of this work to predicting the intelligibility of cochlear-implant processed speech.
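As a rough illustration of the "intermediate metrics" mentioned in this abstract, the sketch below shows the commonly described mapping from a modulation index to an apparent signal-to-noise ratio and then to a transmission index. It is a simplified sketch, not the method of the paper: the function names are hypothetical, and the unweighted average over bands is an assumption (the standardized STI applies band weighting and further corrections).

```python
import math

def transmission_index(m, snr_limit=15.0):
    """Map one modulation index m (0 < m < 1) to a transmission index in [0, 1]."""
    # Guard against log of zero or division by zero at the extremes.
    m = min(max(m, 1e-6), 1.0 - 1e-6)
    # Apparent signal-to-noise ratio implied by the modulation reduction.
    snr_apparent = 10.0 * math.log10(m / (1.0 - m))
    # Clip to the +/- 15 dB range, then rescale linearly to 0..1.
    snr_apparent = max(-snr_limit, min(snr_limit, snr_apparent))
    return (snr_apparent + snr_limit) / (2.0 * snr_limit)

def simple_sti(mtf):
    """Unweighted mean transmission index (assumption: no band weighting).

    mtf is a list of bands, each a list of modulation indices.
    """
    values = [transmission_index(m) for band in mtf for m in band]
    return sum(values) / len(values)
```

A modulation index of 0.5 corresponds to an apparent signal-to-noise ratio of 0 dB, which this mapping places at the midpoint of the scale.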

2.
Speech intelligibility studies in classrooms
Speech intelligibility tests and acoustical measurements were made in ten occupied classrooms. Octave-band measurements of background noise levels, early decay times, and reverberation times, as well as various early/late sound ratios, and the center time were obtained. Various octave-band useful/detrimental ratios were calculated along with the speech transmission index. The interrelationships of these measures were considered to evaluate which were most appropriate in classrooms, and the best predictors of speech intelligibility scores were identified. From these results ideal design goals for acoustical conditions for classrooms were determined either in terms of the 50-ms useful/detrimental ratios or from combinations of the reverberation time and background noise level.

3.
A corpus of 1000 consonant–vowel–consonant (CVC) logatoms, grouped into 10 phonetically balanced lists of 100 words each, was developed to meet the need for subjective assessment of speech intelligibility in American Spanish-speaking environments. This corpus was tested and correlated with Speech Transmission Index (STI) measurements to compare its articulation intelligibility score with the scores of other lists. In the course of this work it was determined that, for two different acoustically poor rooms with the same STI (STI < 0.50), the intelligibility score is lower when the articulation test is performed in a quiet room with a high reverberation time than when it is performed in a very noisy room with a low reverberation time. The final correlation curve of the American Spanish CVC corpus was around 10 percentage points higher than the CVCEQB curve obtained by Steeneken and Houtgast in 2002.

4.
While the Speech Transmission Index (STI) is widely applied for prediction of speech intelligibility in room acoustics and telecommunication engineering, it is unclear how to interpret STI values when non-native talkers or listeners are involved. Based on subjectively measured psychometric functions for sentence intelligibility in noise, for populations of native and non-native communicators, a correction function for the interpretation of the STI is derived. This function is applied to determine the appropriate STI ranges with qualification labels ("bad"-"excellent"), for specific populations of non-natives. The correction function is derived by relating the non-native psychometric function to the native psychometric function by a single parameter (nu). For listeners, the nu parameter is found to be highly correlated with linguistic entropy. It is shown that the proposed correction function is also valid for conditions featuring bandwidth limiting and reverberation.

5.
Speech intelligibility metrics that take into account sound reflections in the room and the background noise have been compared, assuming diffuse sound field. Under this assumption, sound decays exponentially with a decay constant inversely proportional to reverberation time. Analytical formulas were obtained for each speech intelligibility metric providing a common basis for comparison. These formulas were applied to three sizes of rectangular classrooms. The sound source was the human voice without amplification, and background noise was taken into account by a noise-to-signal ratio. Correlations between the metrics and speech intelligibility are presented and applied to the classrooms under study. Relationships between some speech intelligibility metrics were also established. For each noise-to-signal ratio, the value of each speech intelligibility metric is maximized for a specific reverberation time. For quiet classrooms, the reverberation time that maximizes these speech intelligibility metrics is between 0.1 and 0.3 s. Speech intelligibility of 100% is possible with reverberation times up to 0.4-0.5 s and this is the recommended range. The study suggests "ideal" and "acceptable" maximum background-noise level for classrooms of 25 and 20 dB, respectively, below the voice level at 1 m in front of the talker.
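The exponential-decay assumption described in this abstract can be sketched numerically. The example below computes the fraction of energy arriving in the first 50 ms of an ideal exponential decay and a 50-ms useful/detrimental ratio. It is only a sketch under stated assumptions: the function names are hypothetical, the whole response (including the direct sound) is treated as one exponential decay, and the noise-to-signal ratio is taken relative to the total speech energy, which may differ from the definitions used in the paper.

```python
import math

# Energy falls by 60 dB (a factor of 10**6) over one reverberation time,
# so the energy decay is exp(-DECAY * t / rt) with DECAY = ln(10**6), about 13.8.
DECAY = 6.0 * math.log(10.0)

def early_fraction(rt, t_early=0.05):
    """Fraction of the total decay energy arriving before t_early seconds."""
    return 1.0 - math.exp(-DECAY * t_early / rt)

def u50(rt, noise_to_signal_db):
    """Useful/detrimental ratio in dB: early energy over late energy plus noise.

    Assumption: noise_to_signal_db is relative to the total speech energy.
    """
    early = early_fraction(rt)
    late = 1.0 - early
    noise = 10.0 ** (noise_to_signal_db / 10.0)
    return 10.0 * math.log10(early / (late + noise))
```

With these simplifications the ratio falls steadily as the reverberation time grows at a fixed noise-to-signal ratio. The optimum reverberation time reported in the abstract depends on modeling details this sketch does not include.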

6.
Speech intelligibility in classrooms directly affects the learning efficiency of students, especially students who are using a second language. Speech intelligibility is determined by many factors, such as speech level, signal-to-noise ratio, and reverberation time in the room. This paper investigates the contributions of these factors with subjective tests, especially speech level, which is required for designing the optimal gain for sound amplification systems in classrooms. The test material was generated by convolving the English Coordinate Response Measure corpus with the room impulse responses and mixing the result with the background noise. The subjects are all Chinese students who use English as a second language. It is found that speech intelligibility first increases and then decreases as speech level increases, and that the optimal English speech level is about 71 dBA in classrooms for Chinese listeners when the signal-to-noise ratio and the reverberation time are held constant. Finally, a regression equation is proposed to predict speech intelligibility based on speech level, signal-to-noise ratio, and reverberation time.
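The stimulus-generation step described in this abstract (convolving speech with a room impulse response, then adding background noise) might look roughly like the following. This is a minimal sketch, not the authors' procedure: the function names are hypothetical, signals are plain lists of samples, and the signal-to-noise ratio is assumed to be defined on RMS levels of the reverberant speech and the noise.

```python
import math

def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out

def rms(samples):
    """Root-mean-square level of a sample list."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))

def mix_at_snr(speech, impulse_response, noise, snr_db):
    """Reverberate speech, then add noise scaled to the requested RMS SNR (dB)."""
    reverberant = convolve(speech, impulse_response)
    if len(noise) < len(reverberant):
        raise ValueError("noise must be at least as long as the reverberant speech")
    noise = noise[:len(reverberant)]
    # Gain that places the noise snr_db below the reverberant speech level.
    gain = rms(reverberant) / (rms(noise) * 10.0 ** (snr_db / 20.0))
    return [r + gain * n for r, n in zip(reverberant, noise)]
```

For real recordings an FFT-based convolution would be much faster. The direct form is used here only to keep the sketch dependency-free.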

7.
Annoyance ratings in speech intelligibility tests at 45 dB(A) and 55 dB(A) traffic noise were investigated in a laboratory study. Subjects were chosen according to their hearing acuity to be representative of 70-year-old men and women, and of noise-induced hearing losses typical for a great number of industrial workers. These groups were compared with normal hearing subjects of the same sex and, when possible, the same age. The subjects rated their annoyance on an open 100 mm scale. Significant correlations were found between annoyance expressed in millimetres and speech intelligibility in percent when all subjects were taken as one sample. Speech intelligibility was also calculated from physical measurements of speech and noise by using the articulation index method. Observed and calculated speech intelligibility scores are compared and discussed. Also treated is the estimation of annoyance by traffic noise at moderate noise levels via speech intelligibility scores.

8.
Reverberation interferes with the ability to understand speech in rooms. Overlap-masking explains this degradation by assuming reverberant phonemes endure in time and mask subsequent reverberant phonemes. Most listeners benefit from binaural listening when reverberation exists, indicating that the listener's binaural system processes the two channels to reduce the reverberation. This paper investigates the hypothesis that the binaural word intelligibility advantage found in reverberation is a result of binaural overlap-masking release with the reverberation acting as masking noise. The tests utilize phonetically balanced word lists (ANSI-S3.2 1989) that are presented diotically and binaurally with recorded reverberation and reverberation-like noise. A small room, 62 m³, reverberates the words. These are recorded using two microphones without additional noise sources. The reverberation-like noise is a modified form of these recordings and has a similar spectral content. It does not contain binaural localization cues due to a phase randomization procedure. Listening to the reverberant words binaurally improves the intelligibility by 6.0% over diotic listening. The binaural intelligibility advantage for reverberation-like noise is only 2.6%. This indicates that binaural overlap-masking release is insufficient to explain the entire binaural word intelligibility advantage in reverberation.

9.
A Chinese word recognition (CWR) test was conducted with grade 3 and grade 5 children under different conditions of reverberation time (RT), background noise level (BNL), and speech sound pressure level (SSPL) in three primary-school classrooms. The CWR scores and signal-to-noise ratios (SNRs) were obtained at the listening positions. Results show that the CWR score for grade 3 and grade 5 children increases with increasing SSPL, decreasing RT, or increasing age, but decreases with increasing BNL under otherwise identical conditions. For a mixed noise of 56 dBA (speech-spectrum-like noise and ambient noise), the CWR scores in the classroom reach a peak at an SNR of 15–20 dBA for a given RT and age. For natural ambient noise, the CWR score gradually increases with increasing SNR. A high SSPL alone cannot guarantee good CWR for children in a classroom, which also depends on the RT and BNL of the classroom. When the classroom has a long RT or a high BNL, increasing the SSPL does not necessarily yield better CWR. The novelty of the present study is that it further evaluates and confirms the results in real classrooms rather than in a simulated room in a laboratory.

10.
Speech reception thresholds were measured to investigate the influence of a room on speech segregation between a spatially separated target and interferer. The listening tests were realized under headphones. A room simulation allowed selected positioning of the interferer and target, as well as varying the absorption coefficient of the room internal surfaces. The measurements involved target sentences and speech-shaped noise or 2-voice interferers. Four experiments revealed that speech segregation in rooms was not only dependent on the azimuth separation of sound sources, but also on their direct-to-reverberant energy ratio at the listening position. This parameter was varied for interferer and target independently. Speech intelligibility decreased as the direct-to-reverberant ratio of sources was degraded by sound reflections in the room. The influence of the direct-to-reverberant ratio of the interferer was in agreement with binaural unmasking theories, through its effect on interaural coherence. The effect on the target occurred at higher levels of reverberation and was explained by the intrinsic degradation of speech intelligibility in reverberation.

11.
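This paper studies the relationship between objective metrics of speech intelligibility and work efficiency in open-plan offices where speech is masked by stationary noise. Three objective metrics, the Speech Transmission Index (STI), Perceptual Evaluation of Speech Quality (PESQ), and the modified Normalized Covariance Method (mNCM), are compared with the results of specially designed subjective experiments to obtain the relationship between the objective metrics and both subjective annoyance and work efficiency under these conditions. The results show that the objective metrics all correlate highly with the subjective results, which indicates that it is feasible to use objective metrics to predict and evaluate work efficiency. The experimental results also give a preliminary picture of how work efficiency varies with speech intelligibility in noise: in the middle range of intelligibility, work efficiency changes markedly, but once intelligibility exceeds a certain value, work efficiency levels off.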

12.
The effects of intensity on monosyllabic word recognition were studied in adults with normal hearing and mild-to-moderate sensorineural hearing loss. The stimuli were bandlimited NU#6 word lists presented in quiet and talker-spectrum-matched noise. Speech levels ranged from 64 to 99 dB SPL and S/N ratios from 28 to -4 dB. In quiet, the performance of normal-hearing subjects remained essentially constant; in noise, at a fixed S/N ratio, it decreased as a linear function of speech level. Hearing-impaired subjects performed like normal-hearing subjects tested in noise when the data were corrected for the effects of audibility loss. From these and other results, it was concluded that: (1) speech intelligibility in noise decreases when speech levels exceed 69 dB SPL and the S/N ratio remains constant; (2) the effects of speech and noise level are synergistic; (3) the deterioration in intelligibility can be modeled as a relative increase in the effective masking level; (4) normal-hearing and hearing-impaired subjects are affected similarly by increased signal level when differences in speech audibility are considered; (5) the negative effects of increasing speech and noise levels on speech recognition are similar for all adult subjects, at least up to 80 years; and (6) the effective dynamic range of speech may be larger than the commonly assumed value of 30 dB.

13.
This is the second of two papers describing the results of acoustical measurements and speech intelligibility tests in elementary school classrooms. The intelligibility tests were performed in 41 classrooms in 12 different schools evenly divided among grades 1, 3, and 6 students (nominally 6, 8, and 11 year olds). Speech intelligibility tests were carried out on classes of students seated at their own desks in their regular classrooms. Mean intelligibility scores were significantly related to signal-to-noise ratios and to the grade of the students. While the results are different than those from some previous laboratory studies that included less realistic conditions, they agree with previous in-classroom experiments. The results indicate that +15 dB signal-to-noise ratio is not adequate for the youngest children. By combining the speech intelligibility test results with measurements of speech and noise levels during actual teaching situations, estimates of the fraction of students experiencing near-ideal acoustical conditions were made. The results are used as a basis for estimating ideal acoustical criteria for elementary school classrooms.

14.
15.
A number of objective evaluation methods are currently used to quantify the speech intelligibility in a built environment, including the speech transmission index (STI), rapid speech transmission index (RASTI), articulation index (AI), and the percent articulation loss of consonants (%ALCons). Certain software programs can quickly evaluate STI, RASTI, and %ALCons from a measured room impulse response. In this project, two impulse-response-based software packages (WinMLS and SIA-Smaart Acoustic Tools) were evaluated for their ability to determine intelligibility accurately. In four different spaces with background noise levels less than NC 45, speech intelligibility was measured via three methods: (1) with WinMLS 2000; (2) with SIA-Smaart Acoustic Tools (v4.0.2); and (3) from listening tests with humans. The study found that WinMLS measurements of speech intelligibility based on STI, RASTI, and %ALCons corresponded well with performance on the listening tests. SIA-Smaart results were correlated to human responses, but tended to under-predict intelligibility based on STI and RASTI, and over-predict intelligibility based on %ALCons.

16.
Intelligibility tests were performed by teachers and pupils in classrooms under a variety of (road traffic) noise conditions. The intelligibility scores are found to deteriorate at (indoor) noise levels exceeding a critical value of -15 dB with regard to a teacher's long-term (reverberant) speech level. The implications for external noise levels are discussed: typically, an external noise level of 50 dB(A) would imply that the critical indoor level is exceeded for about 20 per cent of teachers.

17.
Although cochlear implant (CI) users have enjoyed good speech recognition in quiet, they still have difficulties understanding speech in noise. We conducted three experiments to determine whether a directional microphone and an adaptive multichannel noise reduction algorithm could enhance CI performance in noise and whether Speech Transmission Index (STI) can be used to predict CI performance in various acoustic and signal processing conditions. In Experiment I, CI users listened to speech in noise processed by 4 hearing aid settings: omni-directional microphone, omni-directional microphone plus noise reduction, directional microphone, and directional microphone plus noise reduction. The directional microphone significantly improved speech recognition in noise. Both directional microphone and noise reduction algorithm improved overall preference. In Experiment II, normal hearing individuals listened to the recorded speech produced by 4- or 8-channel CI simulations. The 8-channel simulation yielded similar speech recognition results as in Experiment I, whereas the 4-channel simulation produced no significant difference among the 4 settings. In Experiment III, we examined the relationship between STIs and speech recognition. The results suggested that STI could predict actual and simulated CI speech intelligibility with acoustic degradation and the directional microphone, but not the noise reduction algorithm. Implications for intelligibility enhancement are discussed.

18.
Stone et al. [J. Acoust. Soc. Am. 130, 2874-2881 (2011)], using vocoder processing, showed that the envelope modulations of a notionally steady noise were more effective than the envelope energy as a masker of speech. Here the same effect is demonstrated using non-vocoded signals. Speech was filtered into 28 channels. A masker centered on each channel was added to the channel signal at a target-to-background ratio of -5 or -10 dB. Maskers were sinusoids or noise bands with bandwidth 1/3 or 1 ERB(N) (ERB(N) being the bandwidth of "normal" auditory filters), synthesized with Gaussian (GN) or low-noise (LNN) statistics. To minimize peripheral interactions between maskers, odd-numbered channels were presented to one ear and even to the other. Speech intelligibility was assessed in the presence of each "steady" masker and that masker 100% sinusoidally amplitude modulated (SAM) at 8 Hz. Intelligibility decreased with increasing envelope fluctuation of the maskers. Masking release, the difference in intelligibility between the SAM and its "steady" counterpart, increased with bandwidth from near-zero to around 50 percentage points for the 1-ERB(N) GN. It is concluded that the sinusoidal and GN maskers behaved primarily as energetic and modulation maskers, respectively.
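The "100% sinusoidally amplitude modulated (SAM) at 8 Hz" condition in this abstract can be illustrated with a short sketch. The function name and parameters are hypothetical, and level equalization between the modulated and steady maskers is not handled here, since the abstract does not say how the study treated it.

```python
import math

def sam(signal, sample_rate, mod_rate_hz=8.0, depth=1.0):
    """Apply sinusoidal amplitude modulation to a list of samples.

    depth=1.0 corresponds to 100% modulation: the envelope swings
    between 0 and 2 times the original amplitude.
    """
    return [
        x * (1.0 + depth * math.sin(2.0 * math.pi * mod_rate_hz * n / sample_rate))
        for n, x in enumerate(signal)
    ]
```

At full depth the envelope reaches zero once per modulation cycle, which creates the temporal dips associated with masking release.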

19.
Although the speech transmission index (STI) is a well-accepted and standardized method for objective prediction of speech intelligibility in a wide range of environments and applications, it is essentially a monaural model. Advantages of binaural hearing in speech intelligibility are disregarded. In specific conditions, this leads to considerable mismatches between subjective intelligibility and the STI. A binaural version of the STI was developed based on interaural cross correlograms, which shows a considerably improved correspondence with subjective intelligibility in dichotic listening conditions. The new binaural STI is designed to be a relatively simple model, which adds only a few parameters to the original standardized STI and changes none of the existing model parameters. For monaural conditions, the outcome is identical to the standardized STI. The new model was validated on a set of 39 dichotic listening conditions, featuring anechoic, classroom, listening room, and strongly echoic environments. For these 39 conditions, speech intelligibility [consonant-vowel-consonant (CVC) word score] and binaural STI were measured. On the basis of these conditions, the relation between binaural STI and CVC word scores closely matches the STI reference curve (standardized relation between STI and CVC word score) for monaural listening. A better-ear STI appears to perform quite well in relation to the binaural STI model; the monaural STI performs poorly in these cases.

20.
English consonant recognition in undegraded and degraded listening conditions was compared for listeners whose primary language was either Japanese or American English. There were ten subjects in each of the two groups, termed the non-native (Japanese) and the native (American) subjects, respectively. The Modified Rhyme Test was degraded either by a babble of voices (S/N = -3 dB) or by a room reverberation (reverberation time, T = 1.2 s). The Japanese subjects performed at a lower level than the American subjects in both noise and reverberation, although the performance difference in the undegraded, quiet condition was relatively small. There was no difference between the scores obtained in noise and in reverberation for either group. A limited-error analysis revealed some differences in type of errors for the groups of listeners. Implications of the results are discussed in terms of the effects of degraded listening conditions on non-native listeners' speech perception.
