Similar literature
 Found 20 similar articles; search took 15 ms
1.
吴礼福  王华  程义  郭业才 《应用声学》2016,35(4):288-293
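Reverberation is an important phenomenon in room acoustics, and both room design and audio signal processing require reverberation time to be measured or estimated. This paper improves a blind reverberation-time estimation method based on maximum likelihood estimation, that is, a method that estimates reverberation time from the reverberant speech produced when a talker speaks naturally in a room. The method first determines the optimal boundaries of each speech decay segment, then computes two additional parameters for that segment and uses them to select the segments that meet the criteria, and finally applies maximum likelihood estimation to the selected segments to obtain the reverberation-time estimate. Simulations under five different reverberation-time conditions show that, compared with existing methods, the improved method yields estimates with a smaller bias relative to the true reverberation time and a lower variance, and hence higher estimation accuracy.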

2.
A blind method for suppressing late reverberation from speech and audio signals is presented. The proposed technique operates in both the spectral and the sub-band domains and uses a single input channel. First, a rough preliminary estimate of the clean signal is required; any standard technique may be applied for this, but here the estimate is obtained through spectral subtraction. Then, an auditory masking model is employed in sub-bands to extract the reverberation masking index (RMI), which identifies signal regions with perceived alterations due to late reverberation. Using a selective signal processing technique, only these regions are suppressed through sub-band temporal envelope filtering based on analytical expressions. Objective and subjective measures indicate that the proposed method achieves significant late reverberation suppression for both speech and music signals over a wide range of reverberation time (RT) scenarios.

3.
Room impulse responses (RIRs) are widely used to characterize the acoustic conditions of rooms, for example in the derivation of reverberation time, early decay time and clarity index. This study investigates the subjective decay rate (or reverberance) of RIRs when they are listened to directly (rather than convolved with a dry signal such as speech or music). Through a subjective experiment, it investigates the effects of gain (or listening level) and background noise level on the reverberance of RIRs that had been measured in three concert auditoria. The task of the experiment was to match the decay rate of RIRs to that of a reference RIR by ear, by adjusting the RIRs' exponential decay rate. Based on objective loudness modeling, gain should have a positive effect on reverberance, and background noise a negative effect. This is confirmed by the results of the experiment. Furthermore, the objectively calculated loudness decay function provides an effective predictor of subjective decay rate, which performs better than conventional early decay time or reverberation time for the RIRs tested.

4.
Measurements have been carried out on furnished orchestra platforms in four concert halls in Italy in order to describe the sound field perceived by musicians. The heterogeneous nature of the orchestra suggested a procedure able to take into account the mutual hearing between instrumental sections. The measured parameters were the early, late and total support, the reverberation time, the early decay time and the clarity index. Part of the study was devoted to estimating the measurement uncertainty. The source directivity and small displacements of the microphone influence the early decay time to a great extent, while the on-platform spatial variability affects both the early decay time and the clarity index. Per-section early support shows differences that render the overall spatial mean inappropriate for describing the stage as a whole. For the other parameters, an overall mean platform value can instead be suitable, even though a more evident group variability is observed for clarity. The values of late support, reverberation time, early decay time and clarity index, proposed in the literature as suitable measures of reverberance for musicians, are not all intercorrelated, indicating that not all of these parameters can be associated with the same subjective impression.

5.
This study examines the auditory attribute that describes the perceived amount of reverberation, known as "reverberance." Listening experiments were performed using two signals commonly heard in auditoria: excerpts of orchestral music and western classical singing. Listeners adjusted the decay rate of room impulse responses prior to convolution with these signals, so as to match the reverberance of each stimulus to that of a reference stimulus. The analysis examines the hypothesis that reverberance is related to the loudness decay rate of the underlying room impulse response. This hypothesis is tested using computational models of time-varying (dynamic) loudness, from which parameters analogous to conventional reverberation parameters (early decay time and reverberation time) are derived. The results show that listening level significantly affects reverberance, and that the loudness-based parameters outperform the related conventional parameters. The results support the proposed relationship between reverberance and the computationally predicted loudness decay function of sound in rooms.

6.
The possibility of eliminating reverberation from a speech signal is investigated by applying a method that determines the parameters of the reverberation frequency response from the cepstrum of the reverberation-distorted signal. The delays of the reverberating signals and, in the case of weak reverberation, their amplitudes are determined from the cepstrum of the signal with reverberation. For medium and strong reverberation, the levels of the reverberating signals are refined by adjusting a certain factor. The criterion used to adjust the factor is based on the shape of the amplitude distribution of the speech signal. Numerical modeling demonstrates that the proposed method can reduce the reverberation level by 30 dB.

7.
8.
Speech intelligibility metrics that take into account sound reflections in the room and the background noise have been compared, assuming a diffuse sound field. Under this assumption, sound decays exponentially with a decay constant inversely proportional to reverberation time. Analytical formulas were obtained for each speech intelligibility metric, providing a common basis for comparison. These formulas were applied to three sizes of rectangular classrooms. The sound source was the human voice without amplification, and background noise was taken into account by a noise-to-signal ratio. Correlations between the metrics and speech intelligibility are presented and applied to the classrooms under study. Relationships between some speech intelligibility metrics were also established. For each noise-to-signal ratio, the value of each speech intelligibility metric is maximized for a specific reverberation time. For quiet classrooms, the reverberation time that maximizes these speech intelligibility metrics is between 0.1 and 0.3 s. Speech intelligibility of 100% is possible with reverberation times up to 0.4-0.5 s and this is the recommended range. The study suggests "ideal" and "acceptable" maximum background-noise levels for classrooms of 25 and 20 dB, respectively, below the voice level at 1 m in front of the talker.
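As an illustration of the diffuse-field assumption in this abstract, the sketch below shows how an exponential energy decay ties the decay constant to reverberation time, and what early-to-late energy ratio follows from it. It is a minimal sketch, not the authors' analytical formulas: the function names and the 50 ms split are assumptions, and direct sound and background noise are ignored.

```python
import math

def decay_constant(rt60_s: float) -> float:
    """Energy decay constant k (1/s) such that exp(-k * rt60_s) is a 60 dB drop."""
    return 6.0 * math.log(10.0) / rt60_s

def early_to_late_ratio_db(rt60_s: float, split_s: float = 0.05) -> float:
    """Early-to-late energy ratio (dB) of a purely exponential decay.

    The total decaying energy is normalised to 1, so the energy arriving
    after split_s is exp(-k * split_s) and the rest arrives before it.
    """
    k = decay_constant(rt60_s)
    late = math.exp(-k * split_s)
    early = 1.0 - late
    return 10.0 * math.log10(early / late)

if __name__ == "__main__":
    for rt in (0.3, 0.5, 1.0):
        print(rt, round(decay_constant(rt), 2), round(early_to_late_ratio_db(rt), 2))
```

The decay constant is inversely proportional to reverberation time, as the abstract states, so a shorter reverberation time always raises the early-to-late ratio in this simplified model.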

9.
An algorithm for blind estimation of reverberation time (RT) in speech signals is proposed. Analysis is restricted to the free-decaying regions of the signal, where the reverberation effect dominates, yielding a more accurate RT estimate at a reduced computational cost. A spectral decomposition is performed on the reverberant signal and partial RT estimates are determined in all signal subbands, providing more data to the statistical-analysis stage of the algorithm, which yields the final RT estimate. Algorithm performance is assessed using two distinct speech databases, achieving 91% and 97% correlation with the RTs measured by a standard nonblind method, indicating that the proposed method blindly estimates the RT in a reliable and consistent manner.
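To make the idea of reading RT off a free-decaying region concrete, the sketch below fits a straight line to the level (in dB) of a decay segment and converts its slope to a 60 dB decay time. It is only an assumed, simplified stand-in: the abstract's algorithm works per subband and adds a statistical-analysis stage, neither of which is reproduced here, and the function name and frame-energy input are hypothetical.

```python
import math

def rt_from_free_decay(frame_energies, frame_s):
    """Estimate RT (s) from frame energies of one free-decay segment.

    A least-squares line is fitted to the level in dB against time;
    RT is the time the fitted line needs to fall by 60 dB.
    """
    if len(frame_energies) < 2:
        raise ValueError("need at least two frames")
    levels_db = [10.0 * math.log10(e) for e in frame_energies]
    times = [i * frame_s for i in range(len(levels_db))]
    mean_t = sum(times) / len(times)
    mean_l = sum(levels_db) / len(levels_db)
    num = sum((t - mean_t) * (l - mean_l) for t, l in zip(times, levels_db))
    den = sum((t - mean_t) ** 2 for t in times)
    slope_db_per_s = num / den
    if slope_db_per_s >= 0.0:
        raise ValueError("segment is not decaying")
    return -60.0 / slope_db_per_s

if __name__ == "__main__":
    # Synthetic decay with a known RT of 0.6 s, sampled every 10 ms.
    k = 6.0 * math.log(10.0) / 0.6
    segment = [math.exp(-k * i * 0.01) for i in range(40)]
    print(rt_from_free_decay(segment, 0.01))
```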

10.
Speech signals recorded with a distant microphone are usually degraded by the spatial reverberation in the room, which severely reduces the clarity and intelligibility of speech. A speech dereverberation method based on spectral subtraction and spectral line enhancement is proposed in this paper. Following the generalized statistical reverberation model, the power spectrum of late reverberation is estimated and removed from the reverberant speech by spectral subtraction. Then, according to the human auditory model, a spectral line enhancement technique based on adaptive post-filtering is adopted to further eliminate the reverberant components between adjacent speech formants. The proposed method can effectively suppress the spatial reverberation and improve the auditory perception of speech. The subjective and objective evaluation results reveal that the perceptual quality of speech is greatly improved by the proposed method.

11.
Speech intelligibility in classrooms directly affects the learning efficiency of students, especially those who are using a second language. Speech intelligibility is determined by many factors, such as speech level, signal-to-noise ratio, and reverberation time in the room. This paper investigates the contributions of these factors with subjective tests, with particular attention to speech level, which is needed to design the optimal gain for sound amplification systems in classrooms. The test material was generated by convolving the English Coordinate Response Measure corpus with room impulse responses and mixing the result with background noise. The subjects were all Chinese students who use English as a second language. It is found that speech intelligibility first increases and then decreases as speech level increases, and that the optimal English speech level is about 71 dBA in classrooms for Chinese listeners when the signal-to-noise ratio and the reverberation time are held constant. Finally, a regression equation is proposed to predict speech intelligibility from speech level, signal-to-noise ratio, and reverberation time.

12.
In several auditoria, it has been observed that the reverberation time is longer than expected and that the cause is a horizontal reverberant field established in the region near the ceiling, a field which is remote from the sound-absorbing audience. This has been observed in the Boston Symphony Hall, Massachusetts, and the Stadthalle Göttingen, Germany. Subjective remarks on their acoustics suggest that there are no unfavourable comments linked to the secondary sound field. Two acoustic scale models are considered here. In a generic rectangular concert hall model, the walls and ceiling contained openings in which either plane or scattering panels could be placed. With plane panels, the model reverberation time (RT) was measured as 53% higher than the Sabine prediction (frequency 500/1000 Hz), compared with 8% higher with scattering panels. The second model, of a 300-seat lecture theatre with a 6 m or 8 m high ceiling, had raked seating. In this case, the amount of absorption in the model was increased until the point was reached where speech had acceptable intelligibility, with the early energy fraction, D ? 0.5. For this acceptable speech condition with the 6 m ceiling, the measured mid-frequency T15 was 1.47 s, whereas the Sabine predicted RT was 1.06 s. The sound decay was basically non-linear, with T30 > T15 > EDT. Exploiting a high-level horizontal reverberant field offers the possibility of acoustics that are better suited to both speech and unamplified music, without any physical change in the auditorium. Using secondary reverberation in an auditorium for a wide variety of music might also be beneficial.
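For readers who want to see what "higher than the Sabine prediction" means in numbers, the sketch below evaluates Sabine's formula and the relative excess of a measured value over a prediction. The only figures taken from the abstract are the measured T15 of 1.47 s and the predicted 1.06 s; the volume and absorption area in the usage lines are hypothetical, and the function names are assumptions.

```python
def sabine_rt(volume_m3: float, absorption_m2: float) -> float:
    """Sabine reverberation time in seconds (metric units).

    volume_m3 is the room volume and absorption_m2 the total
    equivalent absorption area (sum of surface area times absorption
    coefficient over all surfaces).
    """
    return 0.161 * volume_m3 / absorption_m2

def percent_above(measured_s: float, predicted_s: float) -> float:
    """How far a measured RT lies above a predicted RT, in percent."""
    return 100.0 * (measured_s - predicted_s) / predicted_s

if __name__ == "__main__":
    # Hypothetical room: 1000 m^3 with 152 m^2 of equivalent absorption.
    print(sabine_rt(1000.0, 152.0))
    # Values quoted in the abstract for the 6 m ceiling lecture theatre model.
    print(percent_above(1.47, 1.06))
```

With the abstract's figures, the measured T15 is roughly 39% above the Sabine value for the lecture theatre model.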

13.
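A preliminary experiment on subjective preference for reverberance in Chinese folk music excerpts   Total citations: 3, self-citations: 1, citations by others: 3
To investigate how different listener groups and different test materials affect the subjective preference for reverberance in Chinese folk music excerpts, excerpts of solo works for typical plucked, bowed, and wind instruments of Chinese traditional music were used as test material, and experienced sound engineers and ordinary university students served as subjects. Subjective preference for the reverberation processing applied to the material was measured using the paired-comparison method. The experiment yielded the reverberation times preferred by the two groups for the different materials, confirmed that a given group prefers different reverberance for different materials, and found no significant difference between the two groups' results for the same material.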

14.
The paper presents the speech transmission index (STI) as a function of reverberation time. With this function, the STI can be estimated quickly from the room reverberation time alone. For that purpose, a known method was applied in which speech intelligibility is estimated physically from the modulation transfer function (MTF) determined in a room. The STI was then described by a logarithmic function whose argument is the room reverberation time. To verify the model, the reverberation times of six rooms were measured. The rooms were deliberately chosen to be very different in volume and shape; the selection included a small cuboid room, lecture halls and a church. The same rooms were then modeled in ODEON version 11.23 and their reverberation times were determined. Furthermore, the STI was determined in ODEON and compared with the value obtained from the fast estimation. Statistical verification using the correlation index and a regression equation demonstrated that the fast estimation yields results close to those obtained in the ODEON computer simulation. The correlation index obtained was close to 1. Furthermore, a test probability lower than 0.05 indicates a statistically significant linear relation at the 95% confidence level.
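The link between reverberation time and STI described in this abstract can be illustrated with the textbook modulation transfer function for a purely exponential decay. The sketch below is an assumption-laden simplification, not the paper's fitted logarithmic function (which the abstract does not give): it uses a single band, no background noise, no octave-band weighting, and the function names are hypothetical.

```python
import math

# The 14 modulation frequencies (Hz) conventionally used for STI.
MOD_FREQS_HZ = [0.63, 0.8, 1.0, 1.25, 1.6, 2.0, 2.5,
                3.15, 4.0, 5.0, 6.3, 8.0, 10.0, 12.5]

def mtf_reverb(mod_freq_hz: float, rt60_s: float) -> float:
    """Modulation reduction caused by exponential reverberant decay only."""
    return 1.0 / math.sqrt(1.0 + (2.0 * math.pi * mod_freq_hz * rt60_s / 13.8) ** 2)

def sti_from_rt(rt60_s: float) -> float:
    """Simplified single-band, noise-free STI derived from RT alone."""
    total = 0.0
    for f in MOD_FREQS_HZ:
        m = mtf_reverb(f, rt60_s)
        if m >= 1.0:
            snr_db = 15.0  # no modulation loss: apparent SNR saturates
        else:
            snr_db = 10.0 * math.log10(m / (1.0 - m))
        snr_db = max(-15.0, min(15.0, snr_db))  # clip to +/-15 dB
        total += (snr_db + 15.0) / 30.0         # map to a 0..1 index
    return total / len(MOD_FREQS_HZ)

if __name__ == "__main__":
    for rt in (0.3, 0.6, 1.0, 2.0, 4.0):
        print(rt, round(sti_from_rt(rt), 3))
```

Because the index in this sketch falls steadily as RT grows, a smooth curve such as the logarithmic fit mentioned in the abstract is a plausible way to summarise it.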

15.
The main drawback of the minimum variance distortionless response (MVDR) beamformer is that it cancels and degrades the desired speech signal in multi-path wave propagation environments. To make the adaptive algorithm robust against room reverberation and to prevent cancellation of the desired signal, an estimate of the unknown transfer function of the desired speaker is proposed. The estimate is based on the signal and interference covariance matrices. The estimated transfer function is then applied to the MVDR beamformer. The proposed algorithm was tested in a simulated room with reverberation. The results showed better quality of the restored speech compared with some typical adaptive algorithms.
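For reference, the standard MVDR weight vector into which such a transfer-function estimate would be substituted is shown below. The notation is an assumption, since the abstract gives no symbols: $\mathbf{R}$ is the interference (plus noise) covariance matrix and $\mathbf{h}$ the vector of transfer functions from the desired speaker to the microphones, per frequency bin.

```latex
\mathbf{w}_{\mathrm{MVDR}}
  = \frac{\mathbf{R}^{-1}\,\mathbf{h}}{\mathbf{h}^{H}\,\mathbf{R}^{-1}\,\mathbf{h}},
\qquad \text{so that} \qquad
\mathbf{w}_{\mathrm{MVDR}}^{H}\,\mathbf{h} = 1 .
```

The distortionless constraint holds only for the $\mathbf{h}$ that is actually used, which is why a free-field steering vector in a reverberant room can lead to the signal cancellation the abstract describes.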

16.
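A fast automatic music transcription method   Total citations: 1, self-citations: 0, citations by others: 1
Automatic music transcription is a key technology in music signal processing. This paper describes a fast method for the automatic transcription of polyphonic music. The method uses resonator time-frequency image (RTFI) analysis as its time-frequency analysis tool and consists of two main stages: energy-based note segmentation and multiple fundamental frequency estimation. The multiple fundamental frequency estimation first converts the RTFI energy spectrum into a pitch energy spectrum according to the harmonic grouping principle and makes a preliminary estimate of the fundamental frequencies by simple peak picking on the pitch energy spectrum. It then removes erroneous estimates from the preliminary result, based on spectral irregularity and basic assumptions about the harmonic structure of musical tones.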
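The first step of the multiple fundamental frequency estimation (harmonic grouping followed by peak picking) can be sketched as follows. This is a toy illustration under stated assumptions, not the paper's method: it uses a linearly spaced energy spectrum instead of an RTFI, a fixed number of harmonics, and a hypothetical threshold, and it omits the later pruning step based on spectral irregularity.

```python
def pitch_energy_spectrum(energy, n_harmonics=4):
    """Group harmonics: for each candidate bin b, average the energy
    found at bins b, 2b, ..., n_harmonics * b (missing bins count as 0)."""
    pitch = [0.0] * len(energy)
    for b in range(1, len(energy)):
        total = 0.0
        for h in range(1, n_harmonics + 1):
            idx = b * h
            if idx < len(energy):
                total += energy[idx]
        pitch[b] = total / n_harmonics
    return pitch

def pick_peaks(values, threshold):
    """Return indices of strict local maxima that exceed the threshold."""
    peaks = []
    for i in range(1, len(values) - 1):
        if values[i] > threshold and values[i] > values[i - 1] and values[i] > values[i + 1]:
            peaks.append(i)
    return peaks

if __name__ == "__main__":
    # Toy spectrum: one tone with fundamental at bin 10 and three overtones.
    spectrum = [0.0] * 100
    for b in (10, 20, 30, 40):
        spectrum[b] = 1.0
    print(pick_peaks(pitch_energy_spectrum(spectrum), threshold=0.6))
```

Candidates at half or double the true fundamental still collect part of the harmonic energy, which is why a pruning stage like the one in the abstract is needed after the preliminary estimate.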

17.
Perceptual distances among single tokens of American English vowels were established for nonreverberant and reverberant conditions. Fifteen vowels in the phonetic context (b-t), embedded in the sentence "Mark the (b-t) again" were recorded by a male talker. For the reverberant condition, the sentences were played through a room with a reverberation time of 1.2 s. The CVC syllables were removed from the sentences and presented in pairs to ten subjects with audiometrically normal hearing, who judged the similarity of the syllable pairs separately for the nonreverberant and reverberant conditions. The results were analyzed by multidimensional scaling procedures, which showed that the perceptual data were accounted for by a three-dimensional vowel space. Correlations were obtained between the coordinates of the vowels along each dimension and selected acoustic parameters. For both conditions, dimensions 1 and 2 were highly correlated with formant frequencies F2 and F1, respectively, and dimension 3 was correlated with the product of the duration of the vowels and the difference between F3 and F1 expressed on the Bark scale. These observations are discussed in terms of the influence of reverberation on speech perception.

18.
Speech intelligibility studies in classrooms   Total citations: 2, self-citations: 0, citations by others: 2
Speech intelligibility tests and acoustical measurements were made in ten occupied classrooms. Octave-band measurements of background noise levels, early decay times, and reverberation times, as well as various early/late sound ratios, and the center time were obtained. Various octave-band useful/detrimental ratios were calculated along with the speech transmission index. The interrelationships of these measures were considered to evaluate which were most appropriate in classrooms, and the best predictors of speech intelligibility scores were identified. From these results ideal design goals for acoustical conditions for classrooms were determined either in terms of the 50-ms useful/detrimental ratios or from combinations of the reverberation time and background noise level.

19.
For a mixture of target speech and noise in anechoic conditions, the ideal binary mask is defined as follows: it selects the time-frequency units where target energy exceeds noise energy by a certain local threshold and cancels the other units. In this study, the definition of the ideal binary mask is extended to reverberant conditions. Given the division between early and late reflections in terms of speech intelligibility, three ideal binary masks can be defined: an ideal binary mask that uses the direct path of the target as the desired signal, an ideal binary mask that uses the direct path and early reflections of the target as the desired signal, and an ideal binary mask that uses the reverberant target as the desired signal. The effects of these ideal binary mask definitions on speech intelligibility are compared across two types of interference: speech-shaped noise and concurrent female speech. As suggested by psychoacoustical studies, the ideal binary mask based on the direct path and early reflections of target speech outperforms the other masks as reverberation time increases and produces substantial reductions in speech reception threshold for normal-hearing listeners.
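The anechoic definition quoted in this abstract translates almost directly into code. The sketch below is a minimal illustration: the function name, the nested-list layout of the time-frequency energies, and the default threshold of 0 dB are assumptions.

```python
import math

def ideal_binary_mask(target_energy, noise_energy, lc_db=0.0):
    """Return a 0/1 mask over time-frequency units.

    A unit is kept (1) when the target-to-noise energy ratio in dB
    exceeds the local threshold lc_db, and cancelled (0) otherwise.
    Both inputs are lists of rows with matching shapes.
    """
    mask = []
    for t_row, n_row in zip(target_energy, noise_energy):
        row = []
        for t, n in zip(t_row, n_row):
            if t <= 0.0:
                keep = False              # no target energy in this unit
            elif n <= 0.0:
                keep = True               # target present, no noise at all
            else:
                keep = 10.0 * math.log10(t / n) > lc_db
            row.append(1 if keep else 0)
        mask.append(row)
    return mask

if __name__ == "__main__":
    target = [[4.0, 1.0], [0.5, 9.0]]
    noise = [[1.0, 1.0], [2.0, 1.0]]
    print(ideal_binary_mask(target, noise))
```

In this sketch, the three reverberant variants described in the abstract would differ only in what is passed as `target_energy`: the direct path alone, the direct path plus early reflections, or the full reverberant target.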

20.
In 1965, the Catholic Church liturgy changed to allow priests to face the congregation. Whereas Church tradition, teaching, and participation have been much discussed with respect to priest orientation at Mass, the acoustical changes in this regard have not yet been examined scientifically. To discuss the acoustics desired within churches, it is necessary to know the acoustical characteristics appropriate for each phase of the liturgy. In this study, acoustic measurements were taken at various source locations and directions using both old and new liturgies performed in Japanese churches. A directional loudspeaker was used as the source to provide vocal and organ acoustic fields, and impulse responses were measured. Various acoustical parameters such as reverberation time and early decay time were analyzed. The speech transmission index was higher for the new Catholic liturgy, suggesting that the change in liturgy has improved speech intelligibility. Moreover, the interaural cross-correlation coefficient and early lateral energy fraction were higher and lower, respectively, suggesting that the change in liturgy has made the apparent source width smaller.
