Similar literature
 Found 20 similar documents (search time: 45 ms)
1.
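This paper studies the relationship between objective measures of the speech intelligibility of noise and work performance in open-plan offices where speech is masked by stationary noise. Three objective measures, the Speech Transmission Index (STI), Perceptual Evaluation of Speech Quality (PESQ), and the modified Normalized Covariance Method (mNCM), were compared with the results of a specially designed subjective experiment to obtain the relationship between the objective measures and both subjective annoyance and work performance under these conditions. The results show that all of the objective measures correlate highly with the subjective results, indicating that it is feasible to use objective measures to predict and assess work performance. The results also give a preliminary picture of how work performance varies with the speech intelligibility of the noise: in the middle range of intelligibility, work performance changes markedly, but once intelligibility exceeds a certain value, work performance levels off.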

2.
Acoustic cues related to the voice source, including harmonic structure and spectral tilt, were examined for relevance to prosodic boundary detection. The measurements considered here comprise five categories: duration, pitch, harmonic structure, spectral tilt, and amplitude. Distributions of the measurements and statistical analysis show that the measurements may be used to differentiate between prosodic categories. Detection experiments on the Boston University Radio Speech Corpus show equal error detection rates around 70% for accent and boundary detection, using only the acoustic measurements described, without any lexical or syntactic information. Further investigation of the detection results shows that duration and amplitude measurements, and, to a lesser degree, pitch measurements, are useful for detecting accents, while all voice source measurements except pitch measurements are useful for boundary detection.

3.
The acoustical characteristics of 14 university classrooms at the University of British Columbia were measured before and after renovation; seven of these are discussed in detail here. From these measurements, and theoretical considerations, values of quantities used to assess each classroom configuration were predicted, and used to evaluate renovation quality. Information on each renovation was determined with the help of the university campus-planning office and/or the project acoustical consultant. These were related to the evaluation results in order to determine the relationship between design and acoustical quality. The criteria focused on the quality of verbal communication in the classrooms. Room-average Speech Intelligibility (SI) and its physical correlate, Speech Transmission Index (STI), were used to quantify verbal-communication quality. A simplified STI-calculation procedure was applied. The results indicate that some renovations were beneficial, others were not. Verbal-communication quality varied from ‘poor’ to ‘good’. The effect of a renovation depends on a complex interplay between changes in the reverberation and changes in the signal-to-noise level difference, as affected by sound absorption and the source outputs. Renovations which reduce noise are beneficial unless signal-to-noise level differences remain optimal. Renovations often put too much emphasis on adding sound absorption to control reverberation, at the expense of lower speech levels, particularly at the backs of classrooms. The absorption and noise contributed by room occupants have apparently often been neglected.

4.
In this study, the calculations and results of acoustic voice analysis as calculated by two different analysis systems (Doctor Speech (DRS), Tiger Electronics, Neu-Anspach, Germany, and Computerized Speech Lab (CSL), Kay Elemetrics Corporation, Lincoln Park, NJ) are compared. A group of 120 normal voices was selected for analysis of the objective parameters: fundamental frequency (F(0)), variation of F(0) (F(0)SD), jitter, shimmer, and harmonics-to-noise ratio (HNR). The subject group was a random selection of normal voices of adults. The aim of this comparison was to identify differences and similarities in the measurements of the two systems in order to make data transfer possible. A significant correlation was found for F(0), HNR, and relative shimmer. The correlation for jitter (relative and absolute) and F(0)SD was weak. DRS and CSL are not comparable in absolute figures, but their judgment against normative data is identical. Further research is necessary to explore the effect on pathological voices or child voices.

5.
Speech reception thresholds were measured to investigate the influence of a room on speech segregation between a spatially separated target and interferer. The listening tests were realized under headphones. A room simulation allowed selected positioning of the interferer and target, as well as varying the absorption coefficient of the room internal surfaces. The measurements involved target sentences and speech-shaped noise or 2-voice interferers. Four experiments revealed that speech segregation in rooms was not only dependent on the azimuth separation of sound sources, but also on their direct-to-reverberant energy ratio at the listening position. This parameter was varied for interferer and target independently. Speech intelligibility decreased as the direct-to-reverberant ratio of sources was degraded by sound reflections in the room. The influence of the direct-to-reverberant ratio of the interferer was in agreement with binaural unmasking theories, through its effect on interaural coherence. The effect on the target occurred at higher levels of reverberation and was explained by the intrinsic degradation of speech intelligibility in reverberation.

6.
Speech intelligibility was investigated by varying the number of interfering talkers, level, and mean pitch differences between target and interfering speech, and the presence of tactile support. In a first experiment the speech-reception threshold (SRT) for sentences was measured for a male talker against a background of one to eight interfering male talkers or speech noise. Speech was presented diotically and vibro-tactile support was given by presenting the low-pass-filtered signal (0-200 Hz) to the index finger. The benefit in the SRT resulting from tactile support ranged from 0 to 2.4 dB and was largest for one or two interfering talkers. A second experiment focused on masking effects of one interfering talker. The interference was the target talker's own voice with an increased mean pitch by 2, 4, 8, or 12 semitones. Level differences between target and interfering speech ranged from -16 to +4 dB. Results from measurements of correctly perceived words in sentences show an intelligibility increase of up to 27% due to tactile support. Performance gradually improves with increasing pitch difference. Louder target speech generally helps perception, but results for level differences are considerably dependent on pitch differences. Differences in performance between noise and speech maskers and between speech maskers with various mean pitches are explained by the effect of informational masking.

7.
Speech intelligibility studies in classrooms   Cited by: 2 (self-citations: 0, citations by others: 2)
Speech intelligibility tests and acoustical measurements were made in ten occupied classrooms. Octave-band measurements of background noise levels, early decay times, and reverberation times, as well as various early/late sound ratios, and the center time were obtained. Various octave-band useful/detrimental ratios were calculated along with the speech transmission index. The interrelationships of these measures were considered to evaluate which were most appropriate in classrooms, and the best predictors of speech intelligibility scores were identified. From these results ideal design goals for acoustical conditions for classrooms were determined either in terms of the 50-ms useful/detrimental ratios or from combinations of the reverberation time and background noise level.

8.
This is the second of two papers describing the results of acoustical measurements and speech intelligibility tests in elementary school classrooms. The intelligibility tests were performed in 41 classrooms in 12 different schools evenly divided among grades 1, 3, and 6 students (nominally 6, 8, and 11 year olds). Speech intelligibility tests were carried out on classes of students seated at their own desks in their regular classrooms. Mean intelligibility scores were significantly related to signal-to-noise ratios and to the grade of the students. While the results are different from those of some previous laboratory studies that included less realistic conditions, they agree with previous in-classroom experiments. The results indicate that +15 dB signal-to-noise ratio is not adequate for the youngest children. By combining the speech intelligibility test results with measurements of speech and noise levels during actual teaching situations, estimates of the fraction of students experiencing near-ideal acoustical conditions were made. The results are used as a basis for estimating ideal acoustical criteria for elementary school classrooms.

9.
A new approach is described for the design of speech materials used in subjective speech quality evaluation. Speech sounds are classified by their acoustic properties, and sentences are composed so as to concentrate all sounds with similar properties within one sentence. As a test of the method, subjective quality data were collected, using both a rank ordering and a rating task, from a set of 12 linear predictive vocoders, whose parameters were chosen so as to equate their bit rates at 2600 bps. The results show that the method can reliably reveal small differences in quality, and also yields information that can be of diagnostic help in determining the causes of quality degradation by a particular vocoder. A set of phoneme-specific sentences is appended.

10.
This work concerns speech intelligibility tests and measurements in three primary schools in Italy, in one of which the measurements were conducted before and after an acoustical treatment. Speech intelligibility scores (IS) with different reverberation times (RT) and types of noise were obtained using diagnostic rhyme tests on 983 pupils from grades 2-5 (nominally 7-10 year olds), and these scores were then correlated with the Speech Transmission Index (STI). The grade 2 pupils understood fewer words in the lower STI range than the pupils in the higher grades, whereas an IS of ~97% was achieved by all the grades with an STI of 0.9. In the presence of traffic noise, which proved to be the most interfering noise, a decrease in RT from 1.6 to 0.4 s produced an IS increase, at equal A-weighted speech-to-noise level difference, S/N(A), that varied from 13% to 6% over the S/N(A) range of -15 to +6 dB, respectively. In the case of babble noise, whose source was located in the middle of the classroom, the same decrease in reverberation time led to a negligible variation in IS over a similar S/N(A) range.

11.
A time-domain model of sound wave propagation in the branching airways of the subglottal system is presented. The model is formulated as an extension to an acoustic transmission-line modeling scheme originally developed for simulating the supraglottal system in the time-domain during speech production [Maeda (1982). Speech Commun. 1, 199-229; Mokhtari et al. (2008). Speech Commun. 50, 179-190]. The approach allows for predictions of time-varying acoustic pressure and volume velocity at any point along the various generations of subglottal airways from trachea to alveoli. In addition, the model can be configured so that its overall structure simulates different geometric forms, including airways that branch in a symmetric or asymmetric pattern. Three subglottal configurations, two symmetric and one asymmetric, were represented based on reported anatomical dimensions of the subglottal airways. Estimates of the acoustic input impedances of these subglottal configurations revealed resonant characteristics similar to those found in the previous studies. Simulations of voiced sound propagation into the subglottal airways, achieved by coupling the subglottal model to a two-mass vocal fold model and a supraglottal tract configured for different vowels, yielded predictions of time-domain sound pressure waveforms below the vocal folds that compare favorably to previous measurements in human subjects.

12.
Speech signals recorded with a distant microphone are usually degraded by room reverberation, which severely reduces the clarity and intelligibility of the speech. A speech dereverberation method based on spectral subtraction and spectral line enhancement is proposed in this paper. Following the generalized statistical reverberation model, the power spectrum of the late reverberation is estimated and removed from the reverberant speech by spectral subtraction. Then, according to the human auditory model, a spectral line enhancement technique based on adaptive post-filtering is adopted to further eliminate the reverberant components between adjacent speech formants. The proposed method can effectively suppress room reverberation and improve the auditory perception of speech. The subjective and objective evaluation results reveal that the perceptual quality of speech is greatly improved by the proposed method.

13.
Annoyance ratings in speech intelligibility tests at 45 dB(A) and 55 dB(A) traffic noise were investigated in a laboratory study. Subjects were chosen according to their hearing acuity to be representative of 70-year-old men and women, and of noise-induced hearing losses typical for a great number of industrial workers. These groups were compared with normal hearing subjects of the same sex and, when possible, the same age. The subjects rated their annoyance on an open 100 mm scale. Significant correlations were found between annoyance expressed in millimetres and speech intelligibility in percent when all subjects were taken as one sample. Speech intelligibility was also calculated from physical measurements of speech and noise by using the articulation index method. Observed and calculated speech intelligibility scores are compared and discussed. Also treated is the estimation of annoyance by traffic noise at moderate noise levels via speech intelligibility scores.

14.
The modern theory of hoarseness is that there are multifactorial etiologies contributing to the voice problem. The hypothesis of this study is that muscle tension dysphonia is multifactorial with various contributing etiologies. METHODS: This project is a retrospective chart review of all patients seen in the Voice Speech and Language Service and Swallowing Center at our institution with a diagnosis of muscle tension (functional hypertensive) dysphonia over a 30-month period. A literature search and review is also performed regarding current and emerging concepts of muscle tension dysphonia. RESULTS: One hundred fifty subjects were identified (60% female, 40% male, with a mean age of 42.3 years). Significant factors in patient history believed to contribute to abnormal voice production were gastroesophageal reflux in 49%, high stress levels in 18%, excessive amounts of voice use in 63%, and excessive loudness demands on voice use in 23%. Otolaryngologic evaluation was performed in 82% of patients, in whom lesions, significant vocal fold edema, or paralysis/paresis was identified in 52.3%. Speech pathology assessment revealed poor breath support, inappropriately low pitch, and visible cervical neck tension in the majority of patients. Inappropriate intensity was observed in 23.3% of patients. This set of multiple contributing factors is discussed in the context of current and emerging understanding of muscle tension dysphonia. CONCLUSIONS: Results confirm multifactorial etiologies contributing to hoarseness in the patients identified with muscle tension dysphonia. An interdisciplinary approach to treating all contributing factors portends the best prognosis.

15.
People working in noisy environments often complain of difficulty communicating when they wear hearing protection. It was hypothesized that part of the workers' communication difficulties stem from changes in speech production that occur when hearing protectors are worn. To address this possibility, overall and one-third-octave-band SPL measurements were obtained for 16 men and 16 women as they produced connected speech while wearing foam, flange, or no earplugs (open ears) in quiet and in pink noise at 60, 70, 80, 90, and 100 dB SPL. The attenuation and the occlusion effect produced by the earplugs were measured. The Speech Intelligibility Index (SII) was also calculated for each condition. The talkers produced lower overall speech levels, speech-to-noise ratios, and SII values, and less high-frequency speech energy, when they wore earplugs compared with the open-ear condition. Small differences in the speech measures between the talkers wearing foam and flange earplugs were observed. Overall, the results of the study indicate that talkers wearing earplugs (and consequently their listeners) are at a disadvantage when communicating in noise.

16.
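Speech perception is a field of psychology whose development is linked to many disciplines, including phonetics, speech engineering, and artificial intelligence. This paper briefly describes the main lines of development of speech perception, its principal open problems, and the current state of research, as driven by cognitive psychology and other related disciplines.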

17.
There is a need, both for speech theory and for many practical applications, to know the intelligibilities of individual passbands that span the speech spectrum when they are heard singly and in combination. While indirect procedures have been employed for estimating passband intelligibilities (e.g., the Speech Intelligibility Index), direct measurements have been blocked by the confounding contributions from transition band slopes that accompany filtering. A recent study has reported that slopes of several thousand dBA/octave produced by high-order finite impulse response filtering were required to produce the effectively rectangular bands necessary to eliminate appreciable contributions from transition bands [Warren et al., J. Acoust. Soc. Am. 115, 1292-1295 (2004)]. Using such essentially vertical slopes, the present study employed sentences, and reports the intelligibilities of their six 1-octave contiguous passbands having center frequencies from 0.25 to 8 kHz when heard alone, and for each of their 15 possible pairings.

18.
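Acoustic modeling of Mandarin Chinese speech based on articulatory features   Cited by: 3 (self-citations: 0, citations by others: 3)
Articulatory features that characterize Mandarin Chinese speech are introduced into acoustic modeling for Mandarin speech recognition. Based on the characteristics of Mandarin pronunciation, nine articulatory features are defined to distinguish Mandarin vowels, consonants, and tone information. A neural network is trained with these features as targets to obtain the posterior probability that a speech signal belongs to each articulatory feature class, and these probabilities are used as input features for building the acoustic model. The approach was validated on a speaker-independent, large-vocabulary recognition system for spontaneous Mandarin conversational speech and compared with an acoustic model based on spectral features. At the same decoding speed, the acoustic model built with this method achieved a 6.8% relative reduction in character error rate. In a fusion experiment combining articulatory and spectral features, the fused system achieved a 10.1% relative reduction in character error rate compared with the spectral-feature system. These results indicate that the articulatory-feature acoustic model represents speech characteristics more effectively, and that exploiting the complementarity of articulatory and spectral features can further improve recognition performance.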

19.
This paper establishes a clear relation between the photocount distribution (PCD) of a low light level signal and its autocorrelation function (AC). The conclusion is that evaluating the AC from a measurement of the PCD is equivalent to measuring the AC directly. Although the signal-to-noise ratio decreases when the AC is evaluated from PCD measurements, this technique allows AC values to be obtained more quickly and easily. Theoretical evaluation of the variances involved in these measurements and a simulation checking procedure are also described.

20.
Peri- and intraoral devices are often used to obtain measurements concerning articulator motions and placements. Surprisingly, there are few formal evaluations of the potential influence of these devices on speech production behavior. In particular, the potential effects of lingual pellets or coils used in x-ray or electromagnetic studies of tongue motion have never been evaluated formally, even though a large x-ray database exists and electromagnetic systems are commercially available. The x-ray microbeam database [Westbury, J. "X-ray Microbeam Speech Production Database User's Handbook, version 1" (1994)] includes several utterances produced with pellets-off and -on, which allowed us to evaluate effects of pellets for the utterance, She had your dark suit in greasy wash water all year, using acoustic and perceptual measures. Overall, there were no acoustic or perceptual measures that showed consistent effects of pellets across speakers, but certain effects were consistent either within a given speaker or in direction across a subgroup of the speakers. The results are discussed in terms of the general goodness of the assumption that point parameterization of lingual motion does not interfere with normal articulatory behaviors. A brief screening procedure is suggested to protect articulatory kinematic experiments from those individuals who may show consistent effects of having devices placed on perioral structures.
