Similar literature
1.
Speech-intelligibility tests auralized in a virtual classroom were used to investigate the optimal reverberation times for verbal communication for normal-hearing and hearing-impaired adults. The idealized classroom had simple geometry, uniform surface absorption, and an approximately diffuse sound field. It contained a speech source, a listener at a receiver position, and a noise source located at one of two positions. The relative output levels of the speech and noise sources were varied, along with the surface absorption and the corresponding reverberation time. The binaural impulse responses of the speech and noise sources in each classroom configuration were convolved with Modified Rhyme Test (MRT) and babble-noise signals. The resulting signals were presented to normal-hearing and hearing-impaired adult subjects to identify the configurations that gave the highest speech intelligibilities for the two groups. For both subject groups, when the speech source was closer to the listener than the noise source, the optimal reverberation time was zero. When the noise source was closer to the listener than the speech source, the optimal reverberation time included both zero and nonzero values. The results generally support previous theoretical results.

2.
Quantifying the intelligibility of speech in noise for non-native listeners (total citations: 3; self-citations: 0; citations by others: 3)
When listening to languages learned at a later age, speech intelligibility is generally lower than when listening to one's native language. The main purpose of this study is to quantify speech intelligibility in noise for specific populations of non-native listeners, only broadly addressing the underlying perceptual and linguistic processing. An easy method is sought to extend these quantitative findings to other listener populations. Dutch subjects listening to German and English speech, ranging from reasonable to excellent proficiency in these languages, were found to require a 1-7 dB better speech-to-noise ratio to obtain 50% sentence intelligibility than native listeners. Also, the psychometric function for sentence recognition in noise was found to be shallower for non-native than for native listeners (worst-case slope around the 50% point of 7.5%/dB, compared to 12.6%/dB for native listeners). Differences between native and non-native speech intelligibility are largely predicted by linguistic entropy estimates as derived from a letter guessing task. Less effective use of context effects (especially semantic redundancy) explains the reduced speech intelligibility for non-native listeners. While measuring speech intelligibility for many different populations of listeners (languages, linguistic experience) may be prohibitively time consuming, obtaining predictions of non-native intelligibility from linguistic entropy may help to extend the results of this study to other listener populations.
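The slope figures quoted in abstract 2 can be made concrete with a small sketch. It assumes a logistic psychometric function, which the abstract does not specify; the function name and the SNR offsets are hypothetical, and only the two slopes (12.6%/dB and 7.5%/dB at the 50% point) come from the abstract.

```python
import math

# Slopes at the 50% point, as proportions per dB (values from abstract 2).
NATIVE_SLOPE = 0.126
NON_NATIVE_SLOPE = 0.075


def logistic_intelligibility(snr_db, srt_db, slope_per_db):
    """Proportion of sentences understood at snr_db.

    Assumed logistic shape: the curve passes through 0.5 at srt_db and
    has derivative slope_per_db there (hence the factor of 4).
    """
    return 1.0 / (1.0 + math.exp(-4.0 * slope_per_db * (snr_db - srt_db)))


if __name__ == "__main__":
    # Offsets are relative to each group's own 50% threshold (SRT = 0 dB).
    for offset_db in (-4, -2, 0, 2, 4):
        native = logistic_intelligibility(offset_db, 0.0, NATIVE_SLOPE)
        non_native = logistic_intelligibility(offset_db, 0.0, NON_NATIVE_SLOPE)
        print(f"{offset_db:+d} dB: native {native:.2f}, non-native {non_native:.2f}")
```

Under this assumed shape, a shallower slope means that a given improvement in speech-to-noise ratio above threshold buys less intelligibility for the non-native group, in addition to the 1-7 dB threshold shift the abstract reports.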

3.
Reinforcing speech levels and controlling noise and reverberation are the ultimate acoustical goals of lecture-room design to achieve high speech intelligibility. The effects of sound absorption on these factors have opposite consequences for speech intelligibility. Here, novel ceiling baffles and reflectors were evaluated as a sound-control measure, using computer and 1/8-scale models of a lecture room with hard surfaces and excessive reverberation. Parallel ceiling baffles running front to back were investigated. They were expected to absorb reverberation incident on the ceiling from many angles, while leaving speech signals, reflecting from the ceiling to the back of the room, unaffected. Various baffle spacings and absorptions, central and side speaker positions, and receiver positions throughout the room, were considered. Reflective baffles controlled reverberation, with a minimum decrease of sound levels. Absorptive baffles reduced reverberation, but reduced speech levels significantly. Ceiling reflectors, in the form of obstacles of semicircular cross section, suspended below the ceiling, were also tested. These were either 7 m long and in parallel, front-to-back lines, or 0.8 m long and randomly distributed, with flat side up or down, and reflective or absorptive top surfaces. The long reflectors with flat side down and no absorption were somewhat effective; the other configurations were not.

4.
Spectral peak resolution was investigated in normal hearing (NH), hearing impaired (HI), and cochlear implant (CI) listeners. The task involved discriminating between two rippled noise stimuli in which the frequency positions of the log-spaced peaks and valleys were interchanged. The ripple spacing was varied adaptively from 0.13 to 11.31 ripples/octave, and the minimum ripple spacing at which a reversal in peak and trough positions could be detected was determined as the spectral peak resolution threshold for each listener. Spectral peak resolution was best, on average, in NH listeners, poorest in CI listeners, and intermediate for HI listeners. There was a significant relationship between spectral peak resolution and both vowel and consonant recognition in quiet across the three listener groups. The results indicate that the degree of spectral peak resolution required for accurate vowel and consonant recognition in quiet backgrounds is around 4 ripples/octave, and that spectral peak resolution poorer than around 1-2 ripples/octave may result in highly degraded speech recognition. These results suggest that efforts to improve spectral peak resolution for HI and CI users may lead to improved speech recognition.

5.
Speech-reception thresholds (SRT) were measured for 17 normal-hearing and 17 hearing-impaired listeners in conditions simulating free-field situations with between one and six interfering talkers. The stimuli, speech and noise with identical long-term average spectra, were recorded with a KEMAR manikin in an anechoic room and presented to the subjects through headphones. The noise was modulated using the envelope fluctuations of the speech. Several conditions were simulated with the speaker always in front of the listener and the maskers either also in front, or positioned in a symmetrical or asymmetrical configuration around the listener. Results show that the hearing impaired have significantly poorer performance than the normal hearing in all conditions. The mean SRT differences between the groups range from 4.2 to 10 dB. It appears that the modulations in the masker act as an important cue for the normal-hearing listeners, who experience up to 5-dB release from masking, while being hardly beneficial for the hearing-impaired listeners. The gain occurring when maskers are moved from the frontal position to positions around the listener varies from 1.5 to 8 dB for the normal hearing, and from 1 to 6.5 dB for the hearing impaired. It depends strongly on the number of maskers and their positions, but less on hearing impairment. The difference between the SRTs for binaural and best-ear listening (the "cocktail party effect") is approximately 3 dB in all conditions for both the normal-hearing and the hearing-impaired listeners.

6.
In part I of this paper a general model was developed of sound propagation between adjacent rectangular workstations in a conventional open-plan office. In this paper, the new model is used to investigate the importance of various office design parameters on calculated speech privacy. The additional effects of the side and back panels of complete workstations are examined in detail. Calculations for systematic variations of the principal design parameters show that the separating screen height and the ceiling absorption have the largest effects on expected speech privacy. High speech privacy can only be achieved with the combination of high screens, high ceiling absorption, and high panel absorption. Empirical corrections are developed to estimate how the presence of ceiling lights reduces the effective ceiling absorption. The complete model is shown to accurately predict speech privacy for a range of office design configurations with an RMS error in predicted SII values 0.02.

7.
The question of what is the optimal reverberation time for speech intelligibility in an occupied classroom has been studied recently in two different ways, with contradictory results. Experiments have been performed under various conditions of speech-signal to background-noise level difference and reverberation time, finding an optimal reverberation time of zero. Theoretical predictions of appropriate speech-intelligibility metrics, based on diffuse-field theory, found nonzero optimal reverberation times. These two contradictory results are explained by the different ways in which the two methods account for background noise, both of which are unrealistic. To obtain more realistic and accurate predictions, noise sources inside the classroom are considered. A more realistic treatment of noise is incorporated into diffuse-field theory by considering both speech and noise sources and the effects of reverberation on their steady-state levels. The model shows that the optimal reverberation time is zero when the speech source is closer to the listener than the noise source, and nonzero when the noise source is closer than the speech source. Diffuse-field theory is used to determine optimal reverberation times in unoccupied classrooms given optimal values for the occupied classroom. Resulting times can be as high as several seconds in large classrooms; in some cases, optimal values are unachievable, because the occupants contribute too much absorption.

8.
Many hearing-impaired listeners suffer from distorted auditory processing capabilities. This study examines which aspects of auditory coding (i.e., intensity, time, or frequency) are distorted and how this affects speech perception. The distortion-sensitivity model is used: The effect of distorted auditory coding of a speech signal is simulated by an artificial distortion, and the sensitivity of speech intelligibility to this artificial distortion is compared for normal-hearing and hearing-impaired listeners. Stimuli (speech plus noise) are wavelet coded using a complex sinusoidal carrier with a Gaussian envelope (1/4 octave bandwidth). Intensity information is distorted by multiplying the modulus of each wavelet coefficient by a random factor. Temporal and spectral information are distorted by randomly shifting the wavelet positions along the temporal or spectral axis, respectively. Measured were (1) detection thresholds for each type of distortion, and (2) speech-reception thresholds for various degrees of distortion. For spectral distortion, hearing-impaired listeners showed increased detection thresholds and were also less sensitive to the distortion with respect to speech perception. For intensity and temporal distortion, this was not observed. Results indicate that a distorted coding of spectral information may be an important factor underlying reduced speech intelligibility for the hearing impaired.

9.
Spatial unmasking of speech has traditionally been studied with target and masker at the same, relatively large distance. The present study investigated spatial unmasking for configurations in which the simulated sources varied in azimuth and could be either near or far from the head. Target sentences and speech-shaped noise maskers were simulated over headphones using head-related transfer functions derived from a spherical-head model. Speech reception thresholds were measured adaptively, varying target level while keeping the masker level constant at the "better" ear. Results demonstrate that small positional changes can result in very large changes in speech intelligibility when sources are near the listener as a result of large changes in the overall level of the stimuli reaching the ears. In addition, the difference in the target-to-masker ratios at the two ears can be substantially larger for nearby sources than for relatively distant sources. Predictions from an existing model of binaural speech intelligibility are in good agreement with results from all conditions comparable to those that have been tested previously. However, small but important deviations between the measured and predicted results are observed for other spatial configurations, suggesting that current theories do not accurately account for speech intelligibility for some of the novel spatial configurations tested.

10.
The acoustical characteristics of 14 university classrooms at the University of British Columbia were measured before and after renovation; seven of these are discussed in detail here. From these measurements, and theoretical considerations, values of quantities used to assess each classroom configuration were predicted, and used to evaluate renovation quality. Information on each renovation was determined with the help of the university campus-planning office and/or the project acoustical consultant. These were related to the evaluation results in order to determine the relationship between design and acoustical quality. The criteria focused on the quality of verbal communication in the classrooms. Room-average Speech Intelligibility (SI) and its physical correlate, Speech Transmission Index (STI), were used to quantify verbal-communication quality. A simplified STI-calculation procedure was applied. The results indicate that some renovations were beneficial, others were not. Verbal-communication quality varied from ‘poor’ to ‘good’. The effect of a renovation depends on a complex interplay between changes in the reverberation and changes in the signal-to-noise level difference, as affected by sound absorption and the source outputs. Renovations which reduce noise are beneficial unless signal-to-noise level differences remain optimal. Renovations often put too much emphasis on adding sound absorption to control reverberation, at the expense of lower speech levels, particularly at the backs of classrooms. The absorption and noise contributed by room occupants has apparently often been neglected.

11.
Previous research has shown that speech recognition differences between native and proficient non-native listeners emerge under suboptimal conditions. Current evidence has suggested that the key deficit that underlies this disproportionate effect of unfavorable listening conditions for non-native listeners is their less effective use of compensatory information at higher levels of processing to recover from information loss at the phoneme identification level. The present study investigated whether this non-native disadvantage could be overcome if enhancements at various levels of processing were presented in combination. Native and non-native listeners were presented with English sentences in which the final word varied in predictability and which were produced in either plain or clear speech. Results showed that, relative to the low-predictability-plain-speech baseline condition, non-native listener final word recognition improved only when both semantic and acoustic enhancements were available (high-predictability-clear-speech). In contrast, the native listeners benefited from each source of enhancement separately and in combination. These results suggest that native and non-native listeners apply similar strategies for speech-in-noise perception: The crucial difference is in the signal clarity required for contextual information to be effective, rather than in an inability of non-native listeners to take advantage of this contextual information per se.

12.
The objectives of this prospective and exploratory study are to determine: (1) naïve listener preference for gender in tracheoesophageal (TE) speech when speech severity is controlled; (2) the accuracy of identifying TE speaker gender; (3) the effects of gender identification on judgments of speech acceptability (ACC) and naturalness (NAT); and (4) the acoustic basis of ACC and NAT judgments. Six male and six female adult TE speakers were matched for speech severity. Twenty naïve listeners made auditory-perceptual judgments of speech samples in three listening sessions. First, listeners performed preference judgments using a paired comparison paradigm. Second, listeners made judgments of speaker gender, speech ACC, and NAT using rating scales. Last, listeners made ACC and NAT judgments when speaker gender was provided coincidentally. Duration, frequency, and spectral measures were performed. No significant differences were found for preference of male or female speakers. All male speakers were accurately identified, but only two of six female speakers were accurately identified. Significant interactions were found between gender and listening condition (gender known) for NAT and ACC judgments. Males were judged more natural when gender was known; female speakers were judged less natural and less acceptable when gender was known. Regression analyses revealed that judgments of female speakers were best predicted with duration measures when gender was unknown, but with spectral measures when gender was known; judgments of males were best predicted with spectral measures. Naïve listeners have difficulty identifying the gender of female TE speakers. Listeners show no preference for speaker gender, but when gender is known, female speakers are least acceptable and natural. The nature of the perceptual task may affect the acoustic basis of listener judgments.

13.
Journal of Voice, 2020, 34(5): 806.e7-806.e18
There is a high prevalence of dysphonia among professional voice users and the impact of the disordered voice on the speaker is well documented. However, there is minimal research on the impact of the disordered voice on the listener. Considering that professional voice users include teachers and air-traffic controllers, among others, it is imperative to determine the impact of a disordered voice on the listener. To address this, the objectives of the current study included: (1) determine whether there are differences in speech intelligibility between individuals with healthy voices and those with dysphonia; (2) understand whether cognitive-perceptual strategies increase speech intelligibility for dysphonic speakers; and (3) determine the relationship between subjective voice quality ratings and speech intelligibility. Sentence stimuli were recorded from 12 speakers with dysphonia and four age- and gender-matched typical, healthy speakers and presented to 129 healthy listeners divided into one of three strategy groups (ie, control, acknowledgement, and listener strategies). Four expert raters also completed a perceptual voice assessment using the Consensus Assessment Perceptual Evaluation of Voice for each speaker. Results indicated that dysphonic voices were significantly less intelligible than healthy voices (P < 0.001) and the use of cognitive-perceptual strategies provided to the listener did not significantly improve speech intelligibility scores (P = 0.602). Using the subjective voice quality ratings, regression analysis found that breathiness was able to predict 41% of the variance associated with number of errors (P = 0.008). Overall results of the study suggest that speakers with dysphonia demonstrate reduced speech intelligibility and that providing the listener with specific strategies may not result in improved intelligibility.

14.
The purpose of this experiment was to determine the applicability of the Articulation Index (AI) model for characterizing the speech recognition performance of listeners with mild-to-moderate hearing loss. Performance-intensity functions were obtained from five normal-hearing listeners and 11 hearing-impaired listeners using a closed-set nonsense syllable test for two frequency responses (uniform and high-frequency emphasis). For each listener, the fitting constant Q of the nonlinear transfer function relating AI and speech recognition was estimated. Results indicated that the function mapping AI onto performance was approximately the same for normal and hearing-impaired listeners with mild-to-moderate hearing loss and high speech recognition scores. For a hearing-impaired listener with poor speech recognition ability, the AI procedure was a poor predictor of performance. The AI procedure as presently used is inadequate for predicting performance of individuals with reduced speech recognition ability and should be used conservatively in applications predicting optimal or acceptable frequency response characteristics for hearing-aid amplification systems.

15.
Speakers may adapt the phonetic details of their productions when they anticipate perceptual difficulty or comprehension failure on the part of a listener. Previous research suggests that a speaking style known as clear speech is more intelligible overall than casual, conversational speech for a variety of listener populations. However, it is unknown whether clear speech improves the intelligibility of fricative consonants specifically, or how its effects on fricative perception might differ depending on listener population. The primary goal of this study was to determine whether clear speech enhances fricative intelligibility for normal-hearing listeners and listeners with simulated impairment. Two experiments measured babble signal-to-noise ratio thresholds for fricative minimal pair distinctions for 14 normal-hearing listeners and 14 listeners with simulated sloping, recruiting impairment. Results indicated that clear speech helped both groups overall. However, for impaired listeners, reliable clear speech intelligibility advantages were not found for non-sibilant pairs. Correlation analyses comparing acoustic and perceptual data indicated that a shift of energy concentration toward higher frequency regions and greater source strength contributed to the clear speech effect for normal-hearing listeners. Correlations between acoustic and perceptual data were less consistent for listeners with simulated impairment, and suggested that lower-frequency information may play a role.

16.
Binaural speech intelligibility of individual listeners under realistic conditions was predicted using a model consisting of a gammatone filter bank, an independent equalization-cancellation (EC) process in each frequency band, a gammatone resynthesis, and the speech intelligibility index (SII). Hearing loss was simulated by adding uncorrelated masking noises (according to the pure-tone audiogram) to the ear channels. Speech intelligibility measurements were carried out with 8 normal-hearing and 15 hearing-impaired listeners, collecting speech reception threshold (SRT) data for three different room acoustic conditions (anechoic, office room, cafeteria hall) and eight directions of a single noise source (speech in front). Artificial EC processing errors derived from binaural masking level difference data using pure tones were incorporated into the model. Except for an adjustment of the SII-to-intelligibility mapping function, no model parameter was fitted to the SRT data of this study. The overall correlation coefficient between predicted and observed SRTs was 0.95. The dependence of the SRT of an individual listener on the noise direction and on room acoustics was predicted with a median correlation coefficient of 0.91. The effect of individual hearing impairment was predicted with a median correlation coefficient of 0.95. However, for mild hearing losses the release from masking was overestimated.

17.
Recent papers have discussed the optimal reverberation times in classrooms for speech intelligibility, based on the assumption of a diffuse sound field. Here this question was investigated for more ‘typical’ classrooms with non-diffuse sound fields. A ray-tracing model was modified to predict speech-intelligibility metric U50. It was used to predict U50 in various classroom configurations for various values of the room absorption, allowing the optimal absorption (that predicting the highest U50), and the corresponding optimal reverberation time, to be identified in each case. The range of absorptions and reverberation times corresponding to high speech intelligibility were also predicted in each case. Optimal reverberation times were also predicted from the optimal surface-absorption coefficients using Sabine and Eyring versions of diffuse-field theory, and using the diffuse-field expression of Hodgson and Nosal. In order to validate the ray-tracing model, predictions were made for three classrooms with highly diffuse sound fields; these were compared to values obtained by the diffuse-field models, with good agreement. The methods were then applied to three ‘typical’ classrooms with non-diffuse fields. Optimal reverberation times increased with room volume and noise level to over 1 s. The accuracy of the Hodgson and Nosal expression varied with classroom size and noise level. The optimal average surface-absorption coefficients varied from 0.19 to 0.83 in the different classroom configurations tested. High speech intelligibility was, in general, predicted for a wide range of coefficients, but could not be obtained in a large, noisy classroom.
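As a rough illustration of the Sabine and Eyring diffuse-field formulas named in abstract 17, the sketch below converts an average surface-absorption coefficient into a reverberation time. The room dimensions are hypothetical, air absorption is ignored, and the metric constant 0.161 s/m is the usual textbook value; only the coefficient range 0.19 to 0.83 comes from the abstract. This is not the paper's ray-tracing model or the Hodgson and Nosal expression.

```python
import math

SABINE_CONSTANT = 0.161  # s/m, metric units, air absorption neglected


def sabine_rt(volume_m3, surface_m2, alpha):
    """Sabine reverberation time (s) for average absorption coefficient alpha."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    return SABINE_CONSTANT * volume_m3 / (surface_m2 * alpha)


def eyring_rt(volume_m3, surface_m2, alpha):
    """Eyring reverberation time (s); replaces alpha with -ln(1 - alpha)."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be in (0, 1)")
    return SABINE_CONSTANT * volume_m3 / (-surface_m2 * math.log(1.0 - alpha))


if __name__ == "__main__":
    # Hypothetical 10 m x 8 m x 3 m classroom (not taken from the paper).
    length, width, height = 10.0, 8.0, 3.0
    volume = length * width * height
    surface = 2.0 * (length * width + length * height + width * height)
    for alpha in (0.19, 0.83):  # extremes of the range quoted in the abstract
        print(f"alpha={alpha}: Sabine {sabine_rt(volume, surface, alpha):.2f} s, "
              f"Eyring {eyring_rt(volume, surface, alpha):.2f} s")
```

The two formulas agree closely for small absorption coefficients and diverge as the coefficient grows, with Eyring always giving the shorter time.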

18.
Nonoptimal classroom acoustical conditions directly affect speech perception and, thus, learning by students. Moreover, they may lead to voice problems for the instructor, who is forced to raise his/her voice when lecturing to compensate for poor acoustical conditions. The project applied previously developed simplified methods to predict speech intelligibility in occupied classrooms from measurements in unoccupied and occupied university classrooms. The methods were used to predict the speech intelligibility at various positions in 279 University of British Columbia (UBC) classrooms, when 70% occupied, and for four instructor voice levels. Classrooms were classified and rank ordered by acoustical quality, as determined by the room-average speech intelligibility. This information was used by UBC to prioritize classrooms for renovation. Here, the statistical results are reported to illustrate the range of acoustical qualities found at a typical university. Moreover, the variations of quality with relevant classroom acoustical parameters were studied to better understand the results. In particular, the factors leading to the best and worst conditions were studied. It was found that 81% of the 279 classrooms have "good," "very good," or "excellent" acoustical quality with a "typical" (average-male) instructor. However, 50 (18%) of the classrooms had "fair" or "poor" quality, and two had "bad" quality, due to high ventilation-noise levels. Most rooms were "very good" or "excellent" at the front, and "good" or "very good" at the back. Speech quality varied strongly with the instructor voice level. In the worst case considered, with a quiet female instructor, most of the classrooms were "bad" or "poor." Quality also varies with occupancy, with decreased occupancy resulting in decreased quality. The research showed that a new classroom acoustical design and renovation should focus on limiting background noise. They should promote high instructor speech levels at the back of the classrooms. This involves, in part, limiting the amount of sound absorption that is introduced into classrooms to control reverberation. Speech quality is not very sensitive to changes in reverberation, so controlling it for its own sake should not be a design priority.

19.
Improved acoustical privacy is the principal goal of the acoustical design of open plan offices. As the replacement of the Articulation Index (AI), the Speech Intelligibility Index (SII) can be used as a single-number measure of the speech privacy in open-plan offices. In this paper, a mathematical model of the speech propagation over single screens in a large open-plan office space is presented. The calculated effects of the office parameters, such as the screen height, ceiling and floor absorption, etc. on the SII behind the screen are discussed and are compared with measured results. To facilitate the practical use of the model, an empirical correction is derived from a wide range of ceiling tiles to provide values of the effective sound absorption of typical suspended ceilings in open offices. Compared to measured results, SII can be predicted with an RMS error of 0.03.

20.
To examine spectral and threshold effects for speech and noise at high levels, recognition of nonsense syllables was assessed for low-pass-filtered speech and speech-shaped maskers and high-pass-filtered speech and speech-shaped maskers at three speech levels, with signal-to-noise ratio held constant. Subjects were younger adults with normal hearing and older adults with normal hearing but significantly higher average quiet thresholds. A broadband masker was always present to minimize audibility differences between subject groups and across presentation levels. For subjects with lower thresholds, the declines in recognition of low-frequency syllables in low-frequency maskers were attributed to nonlinear growth of masking which reduced "effective" signal-to-noise ratio at high levels, whereas the decline for subjects with higher thresholds was not fully explained by nonlinear masking growth. For all subjects, masking growth did not entirely account for declines in recognition of high-frequency syllables in high-frequency maskers at high levels. Relative to younger subjects with normal hearing and lower quiet thresholds, older subjects with normal hearing and higher quiet thresholds had poorer consonant recognition in noise, especially for high-frequency speech in high-frequency maskers. Age-related effects on thresholds and task proficiency may be determining factors in the recognition of speech in noise at high levels.
