Similar Literature
Found 20 similar articles; search took 31 ms.
1.
Ambient sound can impair verbal short-term memory performance. This finding is relevant to the acoustic optimization of open-plan offices. Two algorithmic approaches claim to model the impairment during a given sound condition. One model is based on the Speech Transmission Index (STI). The other approach relies on the hearing sensation fluctuation strength (F). Within the scope of our consulting activities, the approach based on F can hardly be applied, and the model based on the STI is often misinterpreted in terms of semanticity. We therefore put the two models to the test and elucidate the relevance of temporal–spectral variability and semanticity of background sound with regard to impairment of performance. A group of 24 subjects performed a short-term memory task and rated perceived annoyance during eight different speech and speech-like noise conditions, which varied with regard to STI and F. The empirical data are compared to the model predictions, which only partly cover the experimental results. Speech impairs performance more than all other sound conditions, and variable speech-like noise is more impairing than continuous speech-like noise. Sound masking with continuous speech-like noise provides relief from the negative effect of background speech. This positive effect is more pronounced if the signal-to-noise ratio is −3 dB(A) or lower.

2.
Although many researchers have shown that listeners are able to selectively attend to a target speech signal when a masking talker is present in the same ear as the target speech or when a masking talker is present in a different ear than the target speech, little is known about selective auditory attention in tasks with a target talker in one ear and independent masking talkers in both ears at the same time. In this series of experiments, listeners were asked to respond to a target speech signal spoken by one of two competing talkers in their right (target) ear while ignoring a simultaneous masking sound in their left (unattended) ear. When the masking sound in the unattended ear was noise, listeners were able to segregate the competing talkers in the target ear nearly as well as they could with no sound in the unattended ear. When the masking sound in the unattended ear was speech, however, speech segregation in the target ear was substantially worse than with no sound in the unattended ear. When the masking sound in the unattended ear was time-reversed speech, speech segregation was degraded only when the target speech was presented at a lower level than the masking speech in the target ear. These results show that within-ear and across-ear speech segregation are closely related processes that cannot be performed simultaneously when the interfering sound in the unattended ear is qualitatively similar to speech.

3.
Recent results have shown that listeners attending to the quieter of two speech signals in one ear (the target ear) are highly susceptible to interference from normal or time-reversed speech signals presented in the unattended ear. However, speech-shaped noise signals have little impact on the segregation of speech in the opposite ear. This suggests that there is a fundamental difference between the across-ear interference effects of speech and nonspeech signals. In this experiment, the intelligibility and contralateral-ear masking characteristics of three synthetic speech signals with parametrically adjustable speech-like properties were examined: (1) a modulated noise-band (MNB) speech signal composed of fixed-frequency bands of envelope-modulated noise; (2) a modulated sine-band (MSB) speech signal composed of fixed-frequency amplitude-modulated sinewaves; and (3) a "sinewave speech" signal composed of sine waves tracking the first four formants of speech. In all three cases, a systematic decrease in performance in the two-talker target-ear listening task was found as the number of bands in the contralateral speech-like masker increased. These results suggest that speech-like fluctuations in the spectral envelope of a signal play an important role in determining the amount of across-ear interference that a signal will produce in a dichotic cocktail-party listening task.

4.
Speech intelligibility metrics that take into account sound reflections in the room and the background noise have been compared, assuming a diffuse sound field. Under this assumption, sound decays exponentially with a decay constant inversely proportional to reverberation time. Analytical formulas were obtained for each speech intelligibility metric, providing a common basis for comparison. These formulas were applied to three sizes of rectangular classrooms. The sound source was the human voice without amplification, and background noise was taken into account by a noise-to-signal ratio. Correlations between the metrics and speech intelligibility are presented and applied to the classrooms under study. Relationships between some speech intelligibility metrics were also established. For each noise-to-signal ratio, the value of each speech intelligibility metric is maximized for a specific reverberation time. For quiet classrooms, the reverberation time that maximizes these speech intelligibility metrics is between 0.1 and 0.3 s. Speech intelligibility of 100% is possible with reverberation times up to 0.4-0.5 s, and this is the recommended range. The study suggests "ideal" and "acceptable" maximum background-noise levels for classrooms of 25 and 20 dB, respectively, below the voice level at 1 m in front of the talker.
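The diffuse-field assumption in the abstract above can be illustrated with a minimal sketch. It assumes the standard definition of reverberation time as the time taken for sound energy to decay by 60 dB; the function names are hypothetical and are not taken from the paper.

```python
import math

def decay_constant(rt60):
    """Energy decay constant in 1/s; inversely proportional to reverberation time.

    Assumes rt60 is the time for a 60 dB (factor 10**-6) drop in energy.
    """
    return 6.0 * math.log(10.0) / rt60

def level_drop_db(t, rt60):
    """Drop in sound level (dB) after t seconds of exponential decay."""
    energy_ratio = math.exp(-decay_constant(rt60) * t)
    return -10.0 * math.log10(energy_ratio)

# Illustrative values only: after one full reverberation time the level
# has dropped by 60 dB, and the drop scales linearly with elapsed time.
drop_full = level_drop_db(0.5, 0.5)
drop_partial = level_drop_db(0.1, 0.4)
```

Halving the reverberation time doubles the decay constant, which is the inverse proportionality the abstract refers to.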

5.
In this study, total annoyance caused by different simultaneous environmental sounds is investigated. Despite some puzzling data in the literature, it is fairly well established that in combinations in which the annoyance of one source is considerably higher than that of another source, total annoyance is equal to the maximum annoyance of the separate sources. For combinations in which both sounds are about equally annoying, total annoyance seems to be higher than the maximum source-specific annoyance. The available data, however, are too rough to model total annoyance in these conditions. The present laboratory studies were therefore designed to explore further possible procedures to quantify total annoyance. Subjects rated the (total) annoyance caused by various combinations of impulse, road-traffic, and aircraft sounds. The results support a simple model which predicts the overall or total rating sound level $L_t$ for combinations of several types of sounds. Here, $L_t$ is numerically equal to the A-weighted equivalent sound level $L_{eq}$ of road-traffic sound with the same annoyance as caused by the combination of sounds. In the model, the sound exposure caused by the impulse and/or aircraft sounds is first expressed in the $L_{eq}$ of equally annoying road-traffic sound. With the help of source-specific dose-effect relationships, this is achieved by adding level-dependent penalties to the $L_{eq}$ of the respective sources. Weighted summation of the corrected $L_{eq}$ values of the various sources then results in $L_t$. An optimal overall fit of the data from two separate experiments was obtained when the weighted summation was performed with the parameter $k$ in $L_t = k \log_{10} \sum_j 10^{L_{eq,j}/k}$, where $L_{eq,j}$ is the corrected $L_{eq}$ of source $j$, set to 15. The standard deviation of the differences between the experimental results and the model predictions with $k = 15$ was equivalent to the small change in annoyance produced by a 1.5-dB shift in the $L_{eq}$ of road-traffic sound. Adoption of $k = 15$ implies that after correction, two equal $L_{eq}$ values yield a total rating sound level which is 4.5 dB higher than each single-source corrected $L_{eq}$.
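The weighted summation described in this abstract can be checked numerically. The following is a minimal sketch that assumes a base-10 logarithm (the reading consistent with the 4.5 dB figure quoted above); the function name and the example levels are hypothetical.

```python
import math

def total_rating_level(corrected_leqs, k=15.0):
    """Total rating level L_t = k * log10(sum_j 10**(L_j / k)).

    corrected_leqs: corrected L_eq values in dB, one per source.
    k = 15 is the best-fit value reported in the abstract;
    k = 10 would correspond to ordinary energy summation.
    """
    return k * math.log10(sum(10.0 ** (level / k) for level in corrected_leqs))

# Two sources with equal corrected levels (60 dB is an illustrative value).
increase_k15 = total_rating_level([60.0, 60.0]) - 60.0
increase_k10 = total_rating_level([60.0, 60.0], k=10.0) - 60.0
```

With k = 15 the increase for two equal sources is 15 log10(2), about 4.5 dB, compared with about 3 dB for plain energy summation (k = 10). When one corrected level is far above the other, the sum stays close to the larger level, in line with the dominant-source behaviour described at the start of the abstract.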

6.
Unattended background speech is a known source of cognitive and subjective distraction in open-plan offices. This study investigated whether the deleterious effects of background speech can be affected by room acoustic design that decreases speech intelligibility, as measured by the Speech Transmission Index (STI). The experiment was conducted in an open-plan office laboratory (84 m²) in which four acoustic conditions were physically built. Three conditions contained background speech. A quiet condition was included for comparison. The speech conditions differed in terms of the degree of absorption, screen height, desk isolation, and the level of masking sound. The speech sounds simulated an environment where phone conversations are heard from different locations varying in distance. Ninety-eight volunteers were tested. The presence of background speech had detrimental effects on the subjective perceptions of noise effects and on cognitive performance in short-term memory and working memory tasks. These effects were neither attenuated nor amplified within a three-hour working period. The reduction of the STI by room acoustic means decreased subjective disturbance, whereas the effects on cognitive performance were somewhat smaller than expected. The effects of room acoustic design on subjective distraction were stronger among noise-sensitive subjects, suggesting that they benefited more from acoustic improvements than non-sensitive subjects. The results imply that reducing the STI is beneficial for performance and acoustic satisfaction, especially regarding speech coming from more distant desks. However, acoustic design does not sufficiently decrease the distraction caused by speech from adjacent desks.

7.
Two experiments investigated the impact of reverberation and masking on speech understanding using cochlear implant (CI) simulations. Experiment 1 tested sentence recognition in quiet. Stimuli were processed with reverberation simulation (T=0.425, 0.266, 0.152, and 0.0 s) and then either processed with vocoding (6, 12, or 24 channels) or were subjected to no further processing. Reverberation alone had only a small impact on perception when as few as 12 channels of information were available. However, when the processing was limited to 6 channels, perception was extremely vulnerable to the effects of reverberation. In experiment 2, subjects listened to reverberated sentences, through 6- and 12-channel processors, in the presence of either speech-spectrum noise (SSN) or two-talker babble (TTB) at various target-to-masker ratios. The combined impact of reverberation and masking was profound, although there was no interaction between the two effects. This differs from results obtained in subjects listening to unprocessed speech where interactions between reverberation and masking have been shown to exist. A speech transmission index (STI) analysis indicated a reasonably good prediction of speech recognition performance. Unlike previous investigations, the SSN and TTB maskers produced equivalent results, raising questions about the role of informational masking in CI processed speech.

8.
Speech recognition in noisy environments improves when the speech signal is spatially separated from the interfering sound. This effect, known as spatial release from masking (SRM), was recently shown in young children. The present study compared SRM in children of ages 5-7 with adults for interferers introducing energetic, informational, and/or linguistic components. Three types of interferers were used: speech, reversed speech, and modulated white noise. Two female voices with different long-term spectra were also used. Speech reception thresholds (SRTs) were compared for: Quiet (target 0 degrees front, no interferer), Front (target and interferer both 0 degrees front), and Right (interferer 90 degrees right, target 0 degrees front). Children had higher SRTs and greater masking than adults. When spatial cues were not available, adults, but not children, were able to use differences in interferer type to separate the target from the interferer. Both children and adults showed SRM. Children, unlike adults, demonstrated large amounts of SRM for a time-reversed speech interferer. In conclusion, masking and SRM vary with the type of interfering sound, and this variation interacts with age; SRM may not depend on the spectral peculiarities of a particular type of voice when the target speech and interfering speech are different sex talkers.

9.
This study investigated whether speech-like maskers without linguistic content produce informational masking of speech. The target stimuli were nonsense Mandarin Chinese sentences. In experiment I, the masker contained harmonics whose fundamental frequency (F0) was sinusoidally modulated and whose mean F0 was varied. The magnitude of informational masking was evaluated by measuring the change in intelligibility (releasing effect) produced by inducing a perceived spatial separation of the target speech and masker via the precedence effect. The releasing effect was small and was only clear when the target and masker had the same mean F0, suggesting that informational masking was small. Performance with the harmonic maskers was better than with a steady speech-shaped noise (SSN) masker. In experiments II and III, the maskers were speech-like synthesized signals, alternating between segments with harmonic structure and segments composed of SSN. Performance was much worse than for experiment I, and worse than when an SSN masker was used, suggesting that substantial informational masking occurred. The similarity of the F0 contours of the target and masker had little effect. The informational masking effect was not influenced by whether or not the noise-like segments of the masker were synchronous with the unvoiced segments of the target speech.

10.
A laboratory study was designed in which the annoyance was investigated for 14 different impulse sound types produced by various firearms ranging in caliber from 7.62 to 155 mm. Sixteen subjects rated the annoyance for the simulated conditions of (1) being outdoors, and (2) being indoors with the windows closed. In the latter case, a representative outdoor-to-indoor reduction in sound level was applied. It was anticipated that the presumed additional annoyance caused by the "heaviness" of the impulse sounds might be predicted from the difference between the C-weighted sound exposure level (CSEL; $L_{CE}$) and the A-weighted sound exposure level (ASEL; $L_{AE}$). In the outdoor rating conditions, the annoyance was almost entirely determined by ASEL. The explained variance, $r^2$, in the mean ratings by ASEL was 0.95. In the indoor rating conditions, however, the explained variance in the annoyance ratings by (outdoor) ASEL was significantly increased from $r^2 = 0.87$ to $r^2 = 0.97$ by adding the product $(L_{CE} - L_{AE})(L_{AE} - \alpha)$ as a second variable. In combination with a 12-dB adjustment for small firearms, the present results showed that for the entire set of impulse sounds rated indoors with windows closed, the rating sound level, $L_r$, is given by $L_r = L_{AE} + 12\ \mathrm{dB} + \beta (L_{CE} - L_{AE})(L_{AE} - \alpha)$, with $\alpha = 45$ dB and $\beta = 0.015\ \mathrm{dB}^{-1}$. For the outdoor rating condition, the optimal parameter values were equal to $\alpha = 57$ dB and, again, $\beta = 0.015\ \mathrm{dB}^{-1}$. In validation studies, in which the effects of the present rating procedure will be compared to field data, it has to be determined to what extent the constants $\alpha$ and $\beta$ have to be adjusted.
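As a worked illustration of the rating formula quoted in this abstract, the sketch below evaluates the rating sound level for one pair of exposure levels. The function name and the input levels are hypothetical; only the formula and the constants for the indoor (alpha = 45 dB) and outdoor (alpha = 57 dB) conditions come from the abstract.

```python
def rating_sound_level(lae, lce, alpha=45.0, beta=0.015, adjustment=12.0):
    """Rating level L_r = L_AE + 12 dB + beta * (L_CE - L_AE) * (L_AE - alpha).

    lae, lce: A- and C-weighted sound exposure levels in dB.
    alpha (dB) and beta (1/dB) default to the indoor values in the abstract.
    """
    return lae + adjustment + beta * (lce - lae) * (lae - alpha)

# Hypothetical shot: L_AE = 65 dB and L_CE = 85 dB, a 20 dB "heaviness" difference.
indoor_lr = rating_sound_level(65.0, 85.0)                # alpha = 45 dB
outdoor_lr = rating_sound_level(65.0, 85.0, alpha=57.0)   # alpha = 57 dB
```

For these illustrative inputs the heaviness term adds 0.015 × 20 × 20 = 6 dB with the indoor constants and 0.015 × 20 × 8 = 2.4 dB with the outdoor constants. When $L_{CE}$ equals $L_{AE}$ the term vanishes and only the 12-dB adjustment remains.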

11.
When a masking sound is spatially separated from a target speech signal, substantial releases from masking typically occur both for speech and noise maskers. However, when a delayed copy of the masker is also presented at the location of the target speech (a condition that has been referred to as the front target, right-front masker or F-RF configuration), the advantages of spatial separation vanish for noise maskers but remain substantial for speech maskers. This effect has been attributed to precedence, which introduces an apparent spatial separation between the target and masker in the F-RF configuration that helps the listener to segregate the target from a masking voice but not from a masking noise. In this study, virtual synthesis techniques were used to examine variations of the F-RF configuration in an attempt to more fully understand the stimulus parameters that influence the release from masking obtained in that condition. The results show that the release from speech-on-speech masking caused by the addition of the delayed copy of the masker is robust across a wide variety of source locations, masker locations, and masker delay values. This suggests that the speech unmasking that occurs in the F-RF configuration is not dependent on any single perceptual cue and may indicate that F-RF speech segregation is only partially based on the apparent left-right location of the RF masker.

12.
Numerous studies have shown that task-irrelevant background speech impairs performance of verbal short-term memory. This well-established effect is relevant to practice in open-plan offices, where employees are potentially disturbed by the speech of their colleagues. One option to reduce the disruptive effect is to mask the speech, for example, using random noise. Based on past research by Jones and Macken (1995), the ISO Standard 3382-3 (2012) assumes that multiple background speakers in open-plan offices may mask each other in a natural way, consequently reducing the disruptive effect of speech. The aim of this study was to check this assumption using a realistic acoustical simulation of an open-plan office situation. A combination of a nearby speaker and a varying number of background speakers was played to 26 participants while they performed a verbal short-term memory task. Additionally, the intelligibility of the presented speaker sentences, levels of annoyance, and workload were checked. The results show a significant trend towards an improvement of short-term memory performance when the number of babble voices grows from one to six. However, performance levels are far from those reached under silent conditions. Moreover, annoyance and measures of subjective workload did not diminish due to babble masking.

13.
This paper focuses on masking speech with meaningless steady noise as a way of realizing a comfortable sound environment. As a basis for research, meaningless steady noise at minimum sound pressure levels for masking of male or female meaningful speech is considered, based on psychological experiments using a method of adjustment. From the results, band-limited pink noise can be selected as the most effective noise for masking of speech. In the case of speech with a lower sound pressure level, the sound pressure level of the meaningless steady noise needs to be a little higher.

14.
A blind method for suppressing late reverberation from speech and audio signals is presented. The proposed technique operates in both the spectral and the sub-band domains, employing a single input channel. First, a preliminary rough estimate of the clean signal is required; any standard technique may be applied for this, but here the estimate is obtained through spectral subtraction. Then, an auditory masking model is employed in sub-bands to extract the reverberation masking index (RMI), which identifies signal regions with perceived alterations due to late reverberation. Using a selective signal processing technique, only these regions are suppressed through sub-band temporal envelope filtering based on analytical expressions. Objective and subjective measures indicate that the proposed method achieves significant late reverberation suppression for both speech and music signals over a wide range of reverberation time (RT) scenarios.

15.
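Noise and reverberation in real environments degrade the performance of speech recognition systems. Reverberation in an enclosed space comprises three parts: direct sound, early reflections, and late reverberation, each of which affects speech recognition systems differently. We studied different methods of dividing early reflections from late reverberation, took the early reflections as the target speech, computed the corresponding ideal ratio masks, and examined their effect on speech recognition performance. On this basis, a bidirectional long short-term memory (BLSTM) network was used to estimate the ideal ratio mask, and the effect of the estimated masks on speech recognition performance was tested. The experimental results show that, with the Abel method of dividing early reflections from late reverberation, the ideal ratio mask reduces the word error rate by about 2.8%; the BLSTM-based estimation method underestimated the ideal ratio mask and failed to improve speech recognition performance effectively.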

16.
The acoustical characteristics of 14 university classrooms at the University of British Columbia were measured before and after renovation; seven of these are discussed in detail here. From these measurements, and theoretical considerations, values of quantities used to assess each classroom configuration were predicted, and used to evaluate renovation quality. Information on each renovation was determined with the help of the university campus-planning office and/or the project acoustical consultant. These were related to the evaluation results in order to determine the relationship between design and acoustical quality. The criteria focused on the quality of verbal communication in the classrooms. Room-average Speech Intelligibility (SI) and its physical correlate, Speech Transmission Index (STI), were used to quantify verbal-communication quality. A simplified STI-calculation procedure was applied. The results indicate that some renovations were beneficial, others were not. Verbal-communication quality varied from ‘poor’ to ‘good’. The effect of a renovation depends on a complex interplay between changes in the reverberation and changes in the signal-to-noise level difference, as affected by sound absorption and the source outputs. Renovations which reduce noise are beneficial unless signal-to-noise level differences remain optimal. Renovations often put too much emphasis on adding sound absorption to control reverberation, at the expense of lower speech levels, particularly at the backs of classrooms. The absorption and noise contributed by room occupants have apparently often been neglected.

17.
Speech intelligibility in classrooms directly affects the learning efficiency of students, especially those who are using a second language. The speech intelligibility value is determined by many factors, such as speech level, signal-to-noise ratio, and reverberation time in the rooms. This paper investigates the contributions of these factors with subjective tests, especially speech level, which is required for designing the optimal gain for sound amplification systems in classrooms. The test material was generated by mixing the convolution output of the English Coordinate Response Measure corpus and the room impulse responses with the background noise. The subjects are all Chinese students who use English as a second language. It is found that speech intelligibility first increases and then decreases as speech level increases, and that the optimal English speech level is about 71 dBA in classrooms for Chinese listeners when the signal-to-noise ratio and the reverberation time are held constant. Finally, a regression equation is proposed to predict the speech intelligibility based on speech level, signal-to-noise ratio, and reverberation time.

18.
Reverberation interferes with the ability to understand speech in rooms. Overlap-masking explains this degradation by assuming reverberant phonemes endure in time and mask subsequent reverberant phonemes. Most listeners benefit from binaural listening when reverberation exists, indicating that the listener's binaural system processes the two channels to reduce the reverberation. This paper investigates the hypothesis that the binaural word intelligibility advantage found in reverberation is a result of binaural overlap-masking release with the reverberation acting as masking noise. The tests utilize phonetically balanced word lists (ANSI-S3.2 1989), which are presented diotically and binaurally with recorded reverberation and reverberation-like noise. A small room, 62 m³, reverberates the words. These are recorded using two microphones without additional noise sources. The reverberation-like noise is a modified form of these recordings and has a similar spectral content. It does not contain binaural localization cues due to a phase randomization procedure. Listening to the reverberant words binaurally improves the intelligibility by 6.0% over diotic listening. The binaural intelligibility advantage for reverberation-like noise is only 2.6%. This indicates that binaural overlap-masking release is insufficient to explain the entire binaural word intelligibility advantage in reverberation.

19.
It is known that the sound field in a long space is not diffuse, and that the classic theory of room acoustics is not applicable. A theoretical model is developed for the prediction of reverberation time and speech transmission index in rectangular long enclosures, such as corridors and train stations, where the acoustic quality is important for speech. The model is based on an image-source method, and both acoustically hard and impedance boundaries are investigated. An approximate analytical solution is used to predict the frequency response of the sound field. The reverberation time is determined from the decay curve which is computed by a reverse-time integration of the squared impulse response. The angle-dependence of reflection coefficients of the boundaries and the change of phase upon reflection are incorporated in this model. Due to the relatively long distance of sound propagation, the effect of atmospheric absorption is also considered. Measurements of reverberation time and speech transmission index taken from a real tunnel, a corridor, and a model tunnel are presented. The theoretical predictions are found to agree well with the experimental data. An application of the proposed model has been suggested.
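The reverse-time integration of the squared impulse response mentioned in this abstract (commonly known as Schroeder integration) can be sketched as follows. This is not the paper's model: the impulse response here is a synthetic exponential decay, and the fitting range of −5 to −35 dB, the sampling rate, and all function names are assumptions made for illustration.

```python
import math

def schroeder_decay_db(impulse_response):
    """Energy decay curve in dB from reverse-time integration of h(t)**2."""
    energy = [h * h for h in impulse_response]
    remaining = 0.0
    tail = []
    for e in reversed(energy):          # accumulate energy from the end backwards
        remaining += e
        tail.append(remaining)
    tail.reverse()                      # tail[n] = energy remaining from sample n on
    total = tail[0]
    return [10.0 * math.log10(v / total) for v in tail if v > 0.0]

def reverberation_time(decay_db, fs, start_db=-5.0, end_db=-35.0):
    """Least-squares slope of the decay curve, extrapolated to a 60 dB drop."""
    points = [(i / fs, d) for i, d in enumerate(decay_db) if end_db <= d <= start_db]
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_d = sum(d for _, d in points) / n
    num = sum((t - mean_t) * (d - mean_d) for t, d in points)
    den = sum((t - mean_t) ** 2 for t, _ in points)
    return -60.0 / (num / den)

# Synthetic impulse response whose energy decays by 60 dB in 0.5 s.
fs = 1000
true_rt = 0.5
ir = [math.exp(-3.0 * math.log(10.0) * (n / fs) / true_rt) for n in range(2 * fs)]
curve = schroeder_decay_db(ir)
estimated_rt = reverberation_time(curve, fs)
```

For this idealized decay the estimate recovers the reverberation time used to build the impulse response. In a long enclosure of the kind the abstract describes, the decay curve is generally not a straight line in dB, so the choice of fitting range would matter more than it does here.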

20.