Similar articles
1.
The speech level of verbal information in public spaces should be determined to make it acceptable to as many listeners as possible, while simultaneously maintaining maximum intelligibility and considering the variation in the hearing levels of listeners. In the present study, the universally acceptable range of speech level in reverberant and quiet sound fields for both young listeners with normal hearing and aged listeners with hearing loss due to aging was investigated. Word intelligibility scores and listening difficulty ratings as a function of speech level were obtained by listening tests. The results of the listening tests clarified that (1) the universally acceptable ranges of speech level are from 60 to 70 dBA, from 56 to 61 dBA, from 52 to 67 dBA and from 58 to 63 dBA for the test sound fields with the reverberation times of 0.0, 0.5, 1.0 and 2.0 s, respectively, and (2) there is a speech level that falls within all of the universally acceptable ranges of speech level obtained in the present study; that speech level is around 60 dBA.
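The second finding can be checked by intersecting the four reported ranges. The sketch below is only an arithmetic illustration using the figures quoted in the abstract; the function name `common_range` is hypothetical and not taken from the paper.

```python
# Acceptable speech-level ranges (dBA) quoted in the abstract,
# keyed by reverberation time in seconds.
ranges_dba = {0.0: (60, 70), 0.5: (56, 61), 1.0: (52, 67), 2.0: (58, 63)}


def common_range(ranges):
    """Return the (low, high) interval shared by all ranges, or None if none exists."""
    ranges = list(ranges)
    low = max(lo for lo, _ in ranges)    # highest lower bound
    high = min(hi for _, hi in ranges)   # lowest upper bound
    return (low, high) if low <= high else None


print(common_range(ranges_dba.values()))  # (60, 61)
```

The overlap of 60 to 61 dBA agrees with the abstract's statement that a level of around 60 dBA lies inside every range.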

2.
Listening difficulty ratings [Morimoto et al., J. Acoust. Soc. Am. 116, 1607-1613 (2004)] were obtained for 20 young adult listeners and 34 elderly listeners in reverberant and noisy sound fields simulated in an anechoic room. The listening difficulty ratings were compared with acoustical objective measures. The results and analyses showed the following: (i) The correlation between listening difficulty ratings and the revised speech transmission index (STI_r), and that for the useful-detrimental ratio (U_50), were high, regardless of the age of the listeners. (ii) STI_r and U_50 need to be increased by 0.12 and 4.2 dB, respectively, to equalize the listening difficulty ratings for the elderly listeners with those for the young listeners. (iii) The estimation accuracies for STI_r and U_50 can be improved by calculating them with the L_eq of background noise linearly increased by 4 to 10 dB, which depends on the age of the listeners and the objective measures. However, the improvement was not statistically significant for the elderly listeners.

3.
This paper presents an accurate speech detection algorithm for improving the performance of speech recognition systems working in noisy environments. The proposed method is based on a hard decision clustering approach where a set of prototypes is used to characterize the noisy channel. Detecting the presence of speech is enabled by a decision rule formulated in terms of an averaged distance between the observation vector and a cluster-based noise model. The algorithm benefits from using contextual information, a strategy that considers not only a single speech frame but also a neighborhood of data in order to smooth the decision function and improve speech detection robustness. The proposed scheme exhibits a reduced computational cost, making it suitable for real-time applications, e.g., automated speech recognition systems. An exhaustive analysis is conducted on the AURORA 2 and AURORA 3 databases in order to assess the performance of the algorithm and to compare it to existing standard voice activity detection (VAD) methods. The results show significant improvements in detection accuracy and speech recognition rate over standard VADs such as ITU-T G.729, ETSI GSM AMR, and ETSI AFE for distributed speech recognition and a representative set of recently reported VAD algorithms.
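The decision rule described above can be sketched roughly as follows. This is not the authors' implementation: the Euclidean distance, the nearest-prototype choice, the symmetric context window and the fixed `threshold` are all assumptions made for illustration, and every function name is hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def noise_distance(frame, prototypes):
    """Distance from a frame to its nearest noise prototype (hard assignment)."""
    return min(euclidean(frame, p) for p in prototypes)


def detect_speech(frames, prototypes, threshold, context=1):
    """Flag frame t as speech when the distance to the noise model,
    averaged over frames t-context .. t+context, exceeds the threshold."""
    dists = [noise_distance(f, prototypes) for f in frames]
    decisions = []
    for t in range(len(dists)):
        window = dists[max(0, t - context): t + context + 1]
        decisions.append(sum(window) / len(window) > threshold)
    return decisions


# Hypothetical two-dimensional features: three noise-like frames, then three distant ones.
prototypes = [(0.0, 0.0), (1.0, 0.0)]
frames = [(0.0, 0.0)] * 3 + [(10.0, 10.0)] * 3
print(detect_speech(frames, prototypes, threshold=5.0))
```

Averaging over neighboring frames smooths the decision, so an isolated outlier frame is less likely to flip it; the price is a small delay at speech onsets and offsets.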

4.
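田斌, 易克初. Acta Acustica (声学学报), 2003, 28(1): 28-32
This paper studies the Lombard and Loud effects caused by strong noise in speech recognition and proposes a joint compensation method, based on training data, for additive noise and the Lombard and Loud effects. For additive noise, spectral addition is applied to the training data in the spectral domain, which is the inverse of spectral subtraction. For Lombard and Loud speech, training-data compensation based on hidden Markov model (HMM) state labeling is used. This method accounts both for the various cepstral-domain changes in the different states of different acoustic units of Lombard and Loud speech, and for the changes in duration and relative duration of different acoustic units under multiple kinds of speech variation. This data-based, multi-mode compensation lets the model adapt automatically to a variety of noise and speech-variation conditions. It is highly robust in strong noise and does not degrade the performance of the recognition system in normal environments or with normal articulation. In addition, because the compensation is obtained during training, it does not increase the computational complexity at recognition time.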

5.
Speech-intelligibility tests auralized in a virtual classroom were used to investigate the optimal reverberation times for verbal communication for normal-hearing and hearing-impaired adults. The idealized classroom had simple geometry, uniform surface absorption, and an approximately diffuse sound field. It contained a speech source, a listener at a receiver position, and a noise source located at one of two positions. The relative output levels of the speech and noise sources were varied, along with the surface absorption and the corresponding reverberation time. The binaural impulse responses of the speech and noise sources in each classroom configuration were convolved with Modified Rhyme Test (MRT) and babble-noise signals. The resulting signals were presented to normal-hearing and hearing-impaired adult subjects to identify the configurations that gave the highest speech intelligibilities for the two groups. For both subject groups, when the speech source was closer to the listener than the noise source, the optimal reverberation time was zero. When the noise source was closer to the listener than the speech source, the optimal reverberation time included both zero and nonzero values. The results generally support previous theoretical results.

6.
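方元. Acta Acustica (声学学报), 2001, 26(4): 324-328
A method for synchronizing two signal channels is proposed. Compared with simply balancing the delays of the signals, it cancels the influence of the room impulse response more effectively in rooms with an appreciable reverberation time. Experimental results show that the method still performs well at low signal-to-noise ratios; when the synchronized signals are summed, the improvement in signal-to-noise ratio almost reaches its limit.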

7.
Three experiments investigated the roles of interaural time differences (ITDs) and level differences (ILDs) in spatial unmasking in multi-source environments. In experiment 1, speech reception thresholds (SRTs) were measured in virtual-acoustic simulations of an anechoic environment with three interfering sound sources of either speech or noise. The target source lay directly ahead, while three interfering sources were (1) all at the target's location (0°, 0°, 0°), (2) at locations distributed across both hemifields (-30°, 60°, 90°), (3) at locations in the same hemifield (30°, 60°, 90°), or (4) co-located in one hemifield (90°, 90°, 90°). Sounds were convolved with head-related impulse responses (HRIRs) that were manipulated to remove individual binaural cues. Three conditions used HRIRs with (1) both ILDs and ITDs, (2) only ILDs, and (3) only ITDs. The ITD-only condition produced the same pattern of results across spatial configurations as the combined cues, but with smaller differences between spatial configurations. The ILD-only condition yielded similar SRTs for the (-30°, 60°, 90°) and (0°, 0°, 0°) configurations, as expected for best-ear listening. In experiment 2, pure-tone binaural masking level differences (BMLDs) were measured at third-octave frequencies against the ITD-only, speech-shaped noise interferers of experiment 1. These BMLDs were 4-8 dB at low frequencies for all spatial configurations. In experiment 3, SRTs were measured for speech in diotic, speech-shaped noise. Noises were filtered to reduce the spectrum level at each frequency according to the BMLDs measured in experiment 2. SRTs were as low as or lower than those of the corresponding ITD-only conditions from experiment 1. Thus, an explanation of speech understanding in complex listening environments based on the combination of best-ear listening and binaural unmasking (without involving sound localization) cannot be excluded.

8.
The rationale for a method to quantify the information content of linguistic stimuli, i.e., the linguistic entropy, is developed. The method is an adapted version of the letter-guessing procedure originally devised by Shannon [Bell Syst. Tech. J. 30, 50-64 (1951)]. It is applied to sentences included in a widely used test to measure speech-reception thresholds and originally selected to be approximately equally redundant. Results of a first experiment reveal that this method enables one to detect subtle differences between sentences and sentence lists with respect to linguistic entropy. Results of a second experiment show that (1) in young listeners and with the sentences employed, manipulating linguistic entropy can result in an effect on SRT of approximately 4 dB in terms of signal-to-noise ratio; (2) the range of this effect is approximately the same in elderly listeners.
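For orientation, Shannon's original letter-guessing procedure yields an upper bound on entropy: record how many guesses a reader needs for each letter, then take the entropy of the distribution of those guess counts. The sketch below implements only that original bound. The abstract does not specify how the adapted version differs, and the function name and sample data are hypothetical.

```python
import math
from collections import Counter


def guess_entropy_upper_bound(guess_counts):
    """Entropy, in bits per letter, of the distribution of guess numbers.

    guess_counts[i] is the number of guesses needed to identify letter i.
    """
    total = len(guess_counts)
    freq = Counter(guess_counts)
    return -sum((n / total) * math.log2(n / total) for n in freq.values())


# Hypothetical record: half the letters guessed first time, the rest in 2 or 3 guesses.
print(guess_entropy_upper_bound([1, 1, 2, 3]))  # 1.5
```

A highly predictable sentence, where nearly every letter is guessed at the first attempt, gives a value near zero; a less redundant sentence spreads the guess counts out and raises the bound.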

9.
Upward spreading of masking, measured in terms of absolute masked threshold, is greater in hearing-impaired listeners than in listeners with normal hearing. The purpose of this study was to make further observations on upward-masked thresholds and speech recognition in noise in elderly listeners. Two age groups were used: One group consisted of listeners who were more than 60 years old, and the second group consisted of listeners who were less than 36 years old. Both groups had listeners with normal hearing as well as listeners with mild to moderate sensorineural loss. The masking paradigm consisted of a continuous low-pass-filtered (1000-Hz) noise, which was mixed with the output of a self-tracking, sweep-frequency Bekesy audiometer. Thresholds were measured in quiet and with maskers at 70 and 90 dB SPL. The upward-masked thresholds were similar for young and elderly hearing-impaired listeners. A few elderly listeners had lower upward-masked thresholds compared with the young control group; however, their on-frequency masked thresholds were nearly identical to the control group. A significant correlation was found between upward-masked thresholds and the Speech Perception in Noise (SPIN) test in elderly listeners.

10.
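Performance of time-reversal-based microphone arrays for sound pickup in complex sound fields   Total citations: 1, self-citations: 0, citations by others: 1
蔡野锋, 邱小军, 杨军. Acta Acustica (声学学报), 2010, 35(6): 593-600
The feasibility and mechanism of applying the time-reversal technique to sound pickup with microphone arrays in complex sound fields are explored, and its general behavior and performance are given. The study shows that, in free space, pickup performance depends on frequency, array shape and array radius: the higher the frequency and the larger the radius, the better the pickup. In an ordinary room, within the speech frequency band, the array gain of a circular-arc array at the intended target point is more than 5 dB higher than at positions about 25 cm away from the target point. Experiments in an ordinary room and in a reverberation chamber verified these conclusions.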

11.
Reinforcing speech levels and controlling noise and reverberation are the ultimate acoustical goals of lecture-room design to achieve high speech intelligibility. The effects of sound absorption on these factors have opposite consequences for speech intelligibility. Here, novel ceiling baffles and reflectors were evaluated as a sound-control measure, using computer and 1/8-scale models of a lecture room with hard surfaces and excessive reverberation. Parallel ceiling baffles running front to back were investigated. They were expected to absorb reverberation incident on the ceiling from many angles, while leaving speech signals, reflecting from the ceiling to the back of the room, unaffected. Various baffle spacings and absorptions, central and side speaker positions, and receiver positions throughout the room, were considered. Reflective baffles controlled reverberation, with a minimum decrease of sound levels. Absorptive baffles reduced reverberation, but reduced speech levels significantly. Ceiling reflectors, in the form of obstacles of semicircular cross section, suspended below the ceiling, were also tested. These were either 7 m long and in parallel, front-to-back lines, or 0.8 m long and randomly distributed, with flat side up or down, and reflective or absorptive top surfaces. The long reflectors with flat side down and no absorption were somewhat effective; the other configurations were not.

12.
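周健, 郑文明, 王青云, 赵力. Acta Acustica (声学学报), 2014, 39(4): 501-508
Two whispered-speech enhancement algorithms based on asymmetric cost functions are proposed, which treat amplification distortion and compression distortion differently during enhancement. The Modified Itakura-Saito (MIS) algorithm penalizes amplification distortion more heavily, whereas the Kullback-Leibler (KL) algorithm penalizes compression distortion more heavily. Experimental results show that at low signal-to-noise ratios below -6 dB, the intelligibility of whispered speech enhanced by the MIS algorithm is significantly higher than with conventional algorithms, while the KL algorithm achieves an intelligibility improvement similar to that of the minimum mean-square error speech enhancement algorithm. This confirms that amplification distortion and compression distortion affect the intelligibility of whispered speech differently: at low signal-to-noise ratios, larger compression distortion helps improve intelligibility, whereas at high signal-to-noise ratios compression distortion has little effect on intelligibility.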

13.
Planar nearfield acoustic holography (NAH) is extended to identify the sound source in a noisy environment. The extended method requires the knowledge of the pressures on two closely spaced parallel hologram planes and the plane wave reflection coefficient on the target source surface. First, the incoming field coming from the back side of the microphone array and the scattered field due to the incoming wave falling on the target source are correlated through the plane wave reflection coefficient on the target source surface. Then, the mixed field on the hologram plane can be represented by the field that would be radiated by the target source into free space and the incoming field. Finally, the field that would be radiated by the target source into free space can be extracted by using the pressures measured on two hologram planes, which will be further used to accurately identify the sound source via planar NAH. The validity of the proposed method is demonstrated by simulations and experiment, and the influence of the relative strength of the disturbing source to the target source is also investigated.

14.
Physics Letters A, 2019, 383(17): 2004-2010
In this work we consider bipartite noisy bound entangled states with positive partial transpose, that is, states that can be written as a convex combination of an edge state and a separable state. In particular, we present schemes to construct distinct classes of noisy bound entangled states which satisfy the range criterion. As a consequence of the present study we also identify noisy bound entangled states which do not satisfy the range criterion. All of the present states are constructed by exploring different types of product bases.

15.
16.
Existing objective speech-intelligibility measures are suitable for several types of degradation; however, they turn out to be less appropriate in cases where noisy speech is processed by a time-frequency weighting. To this end, an extensive evaluation is presented of objective measures for intelligibility prediction of noisy speech processed with a technique called ideal time-frequency (TF) segregation. In total 17 measures are evaluated, including four advanced speech-intelligibility measures (CSII, CSTI, NSEC, DAU), the advanced speech-quality measure (PESQ), and several frame-based measures (e.g., SSNR). Furthermore, several additional measures are proposed. The study comprised a total of 168 different TF-weightings, including unprocessed noisy speech. Out of all measures, the proposed frame-based measure MCC gave the best results (ρ = 0.93). An additional experiment shows that the well-performing measures in this study also show high correlation with the intelligibility of single-channel noise-reduced speech.

17.
This paper reports on an evaluation of ratings of the sound insulation of simulated walls in terms of the intelligibility of speech transmitted through the walls. Subjects listened to speech modified to simulate transmission through 20 different walls with a wide range of sound insulation ratings, with constant ambient noise. The subjects' mean speech intelligibility scores were compared with various physical measures to test the success of the measures as sound insulation ratings. The standard Sound Transmission Class (STC) and Weighted Sound Reduction Index ratings were only moderately successful predictors of intelligibility scores, and eliminating the 8 dB rule from STC led to very modest improvements. Various previously established speech intelligibility measures (e.g., Articulation Index or Speech Intelligibility Index) and measures derived from them, such as the Articulation Class, were all relatively strongly related to speech intelligibility scores. In general, measures that involved arithmetic averages or summations of decibel values over frequency bands important for speech were most strongly related to intelligibility scores. The two most accurate predictors of the intelligibility of transmitted speech were an arithmetic average transmission loss over the frequencies from 200 Hz to 2.5 kHz and the addition of a new spectrum weighting term to R_w that included frequencies from 400 Hz to 2.5 kHz.
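The first of those two predictors is a plain arithmetic mean of band transmission-loss values. A minimal sketch follows, assuming one-third-octave band data keyed by center frequency; the band resolution, the function name and the transmission-loss numbers are all hypothetical, since the abstract does not give them.

```python
def average_transmission_loss(tl_by_band, f_low=200, f_high=2500):
    """Arithmetic mean of transmission loss (dB) over bands whose center
    frequency (Hz) lies between f_low and f_high, inclusive."""
    selected = [tl for f, tl in tl_by_band.items() if f_low <= f <= f_high]
    if not selected:
        raise ValueError("no bands fall inside the requested frequency range")
    return sum(selected) / len(selected)


# Hypothetical wall: transmission loss in dB at a few band center frequencies in Hz.
wall = {100: 20.0, 200: 30.0, 1000: 40.0, 2500: 50.0, 4000: 60.0}
print(average_transmission_loss(wall))  # 40.0
```

Note that the decibel values are averaged directly, not converted to energy first, which matches the abstract's description of "arithmetic averages ... of decibel values".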

18.
19.
Annoyance ratings in speech intelligibility tests at 45 dB(A) and 55 dB(A) traffic noise were investigated in a laboratory study. Subjects were chosen according to their hearing acuity to be representative of 70-year-old men and women, and of noise-induced hearing losses typical for a great number of industrial workers. These groups were compared with normal hearing subjects of the same sex and, when possible, the same age. The subjects rated their annoyance on an open 100 mm scale. Significant correlations were found between annoyance expressed in millimetres and speech intelligibility in percent when all subjects were taken as one sample. Speech intelligibility was also calculated from physical measurements of speech and noise by using the articulation index method. Observed and calculated speech intelligibility scores are compared and discussed. Also treated is the estimation of annoyance by traffic noise at moderate noise levels via speech intelligibility scores.

20.
Speech intelligibility in classrooms can be influenced by background-noise level, speech sound pressure level (SSPL), reverberation time and signal-to-noise ratio (SNR). The relationship between SSPL and subjective Chinese Mandarin speech intelligibility, and the effect of different SNRs on Chinese Mandarin speech intelligibility, were investigated in simulated classrooms through room acoustical simulation, auralisation and subjective evaluation. Chinese speech intelligibility test signals recorded in an anechoic chamber were convolved with the simulated binaural room impulse responses and then reproduced over headphones at different SSPLs and SNRs. The results show that Chinese Mandarin speech intelligibility scores increase with increasing SSPL and SNR within a certain range in simulated classrooms. Chinese Mandarin speech intelligibility scores show no significant difference for SNRs of no less than 15 dBA under the same reverberation time condition.
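The auralisation chain described above (convolve a dry recording with a room impulse response, then present it against noise at a chosen SNR) can be sketched as follows. This is an illustration only: it handles a single channel where the study used binaural responses, it computes an unweighted RMS-based SNR where the abstract reports A-weighted values, and all names and sample values are hypothetical.

```python
import math


def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sample sequences."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def rms(x):
    """Root-mean-square value of a sample sequence."""
    return math.sqrt(sum(v * v for v in x) / len(x))


def noise_gain_for_snr(speech, noise, snr_db):
    """Gain for the noise so that 20*log10(rms(speech) / rms(gain*noise)) equals snr_db."""
    return rms(speech) / (rms(noise) * 10 ** (snr_db / 20))


# Hypothetical dry signal, toy impulse response (direct sound plus one reflection), and noise.
dry = [1.0, -1.0, 0.5, -0.5]
rir = [1.0, 0.0, 0.3]
noise = [0.2, -0.1, 0.3, -0.2, 0.1, -0.3]

reverberant = convolve(dry, rir)
gain = noise_gain_for_snr(reverberant, noise, snr_db=15.0)
mixture = [s + gain * n for s, n in zip(reverberant, noise)]
```

Setting the noise gain after convolution keeps the SNR defined at the listener's position, which is where the reverberant speech and the noise are actually compared.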
