Similar literature
20 similar documents found (search time: 31 ms)
1.
2.
3.
Operations such as taking the logarithm, summation, and time delay can be combined in a simple operator which, for different values of its frequency and time parameters, produces different transformations of the current spectrum. This operator simulates such properties of auditory perception as temporal and frequency masking, the effects of turning a stimulus on and off, selective response to transients with different rates, detection of amplitude and frequency modulations, and adaptation to the mean signal level.

4.
Fundamental frequency (F0) perturbation has been found to be useful as an acoustic correlate of the perception of dysphonia in adult voices. In a previous investigation, we showed that hoarseness in children's voices is a stable concept composed mainly of three predictors: hyperfunction, breathiness, and roughness. In the present investigation, the relation between F0 perturbation and hoarseness as well as its predictors was analyzed in running speech of six children representing different degrees of hoarseness. Two perturbation measures were used: the standard deviation of the distribution of perturbation data and the mean of the absolute value of perturbation. The results revealed no clear relation.
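
The two perturbation measures named in this abstract can be sketched as follows. This is only an illustration: the abstract does not define "perturbation", so the sketch assumes it is the difference between consecutive F0 values, and the function name `perturbation_measures` and the F0 track are hypothetical.

```python
import statistics


def perturbation_measures(f0_hz):
    """Return (standard deviation, mean absolute value) of F0 perturbation.

    Assumption (not stated in the abstract): perturbation is the
    difference between consecutive F0 values, in Hz.
    """
    if len(f0_hz) < 3:
        raise ValueError("need at least three F0 values")
    perturbation = [b - a for a, b in zip(f0_hz, f0_hz[1:])]
    sd = statistics.stdev(perturbation)  # sample standard deviation
    mean_abs = statistics.fmean([abs(p) for p in perturbation])
    return sd, mean_abs


# Hypothetical F0 values (Hz) for five consecutive cycles.
f0_track = [250.0, 252.0, 249.0, 251.0, 250.0]
sd, mean_abs = perturbation_measures(f0_track)
```

The two measures respond differently to outliers: the standard deviation weights large excursions more heavily than the mean absolute value does.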

5.
6.
Using a computer simulation, the author examined the theoretically expected discrepancy between the obtained H/N ratio and the real H/N value, the effects of pitch and amplitude perturbations on the H/N ratio, and other factors. Regarding the averaged wave as an estimate of the harmonic component resulted in an estimated harmonic energy greater than its real value. The H/N ratio method detected perturbations of pitch and amplitude as a quantity of noise energy. The effect of pitch perturbation on the H/N ratio was much greater than that of amplitude perturbation. The error range of the measurement system did not contribute significantly to the calculated H/N ratios.
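
A minimal sketch of how amplitude perturbation shows up as noise energy in an averaged-wave H/N estimate. The abstract gives no formula, so the sketch assumes one common formulation: the harmonic energy is the energy of the averaged period times the number of periods, and the noise energy is the summed squared deviation of each period from that average. All periods are assumed to have equal length, and `hn_ratio_db` and `make_periods` are hypothetical names.

```python
import math


def hn_ratio_db(periods):
    """H/N ratio in dB from equal-length pitch periods (assumed formulation)."""
    n = len(periods)
    length = len(periods[0])
    # Averaged wave, used as the estimate of the harmonic component.
    avg = [sum(p[k] for p in periods) / n for k in range(length)]
    harmonic = n * sum(a * a for a in avg)
    # Everything that deviates from the average is counted as noise.
    noise = sum((p[k] - avg[k]) ** 2 for p in periods for k in range(length))
    return 10.0 * math.log10(harmonic / noise)


def make_periods(gains, length=64):
    """One sine cycle per period, scaled by a per-period gain."""
    return [
        [g * math.sin(2 * math.pi * k / length) for k in range(length)]
        for g in gains
    ]


small_shimmer = hn_ratio_db(make_periods([1.00, 1.01, 0.99, 1.00]))
large_shimmer = hn_ratio_db(make_periods([1.00, 1.10, 0.90, 1.00]))
```

The input contains no additive noise, yet the ratio is finite and falls as the gain variation grows, which matches the abstract's statement that perturbation is detected as noise energy. Identical periods would give zero noise energy and make the ratio undefined.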

7.
Running speech and sustained vowels from normal speakers and from subjects suffering from dysphonia were analyzed statistically with respect to the signal-to-noise ratio (SNR). The distribution of the SNR measured in multiple overlapping frames of the speech signal was described by a linear combination of the distribution frequencies for SNR = 0 dB, 0 dB < SNR < 15 dB, and SNR ≥ 15 dB. The values of the linear combination, the SNR of the vowels, and the clinical assignment of the voices to normal and pathologic populations based on laryngoscopic and stroboscopic investigation parameters were used to compare the different evaluations of the voices. The SNR distribution in speech remained stable over signal lengths of more than 30 s. The correlation coefficient between the SNR measure for running speech and the SNR of sustained vowels amounted to only 0.63. The error rate in the discrimination between normal and dysphonic voices amounted to 22.6% when applied to sustained vowels and 5.6% when the SNR distribution was used. Possible reasons for the observed discrepancies are discussed, and the results are compared to those of other studies.
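
The three distribution frequencies described in this abstract could be computed from per-frame SNR values roughly as below. This is a sketch only: the function name, the class labels, and the frame values are hypothetical, values at or below 0 dB are assumed to fall in the "SNR = 0 dB" class, and the weights of the linear combination are not given in the abstract, so they are not modelled.

```python
def snr_class_frequencies(frame_snrs_db):
    """Relative frequency of frames in each of the three SNR classes."""
    counts = {"zero": 0, "mid": 0, "high": 0}
    for snr in frame_snrs_db:
        if snr >= 15.0:
            counts["high"] += 1      # SNR >= 15 dB
        elif snr > 0.0:
            counts["mid"] += 1       # 0 dB < SNR < 15 dB
        else:
            counts["zero"] += 1      # assumed: SNR clipped to 0 dB
    total = len(frame_snrs_db)
    return {name: count / total for name, count in counts.items()}


# Hypothetical per-frame SNR values in dB.
frames = [0.0, 0.0, 3.5, 12.0, 15.0, 21.4, 18.0, 9.0]
frequencies = snr_class_frequencies(frames)
```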

8.
A method for studying speech rhythm is presented, using Fourier analysis of the amplitude envelope of bandpass-filtered speech. Rather than quantifying rhythm with time-domain measurements of interval durations, a frequency-domain representation is used: the rhythm spectrum. This paper describes the method in detail, and discusses approaches to characterizing rhythm with low-frequency spectral information.
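
The idea of a rhythm spectrum can be illustrated with a toy signal. The sketch below is not the paper's procedure: it assumes the input is already band-limited (the bandpass filter is omitted), uses full-wave rectification as a crude amplitude envelope, and evaluates a direct DFT at a few low frequencies. The name `rhythm_spectrum` and the synthetic 4 Hz modulation are hypothetical.

```python
import math


def rhythm_spectrum(signal, sample_rate, freqs_hz):
    """DFT magnitude of the rectified signal at the requested frequencies."""
    envelope = [abs(x) for x in signal]  # crude envelope: full-wave rectification
    n = len(envelope)
    spectrum = {}
    for f in freqs_hz:
        w = 2 * math.pi * f / sample_rate
        re = sum(e * math.cos(w * i) for i, e in enumerate(envelope))
        im = sum(e * math.sin(w * i) for i, e in enumerate(envelope))
        spectrum[f] = math.hypot(re, im) / n
    return spectrum


# Synthetic stand-in for speech: a 100 Hz carrier whose amplitude
# fluctuates 4 times per second (a syllable-like rate).
sample_rate = 1000
duration_s = 2
signal = [
    (1 + 0.5 * math.cos(2 * math.pi * 4 * i / sample_rate))
    * math.sin(2 * math.pi * 100 * i / sample_rate)
    for i in range(sample_rate * duration_s)
]
freqs = [float(f) for f in range(1, 11)]  # 1 to 10 Hz
spectrum = rhythm_spectrum(signal, sample_rate, freqs)
peak_hz = max(spectrum, key=spectrum.get)
```

In this toy case the largest low-frequency component sits at the modulation rate, which is the kind of information a rhythm spectrum is meant to expose without measuring interval durations.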

9.
10.
In stuttered repetitions of a syllable, the vowel that occurs often sounds like schwa even when schwa is not intended. In this article, acoustic analyses are reported which show that the spectral properties of stuttered vowels are similar to the following fluent vowel, so it would appear that the stutterers are articulating the vowel appropriately. Though spectral properties of the stuttered vowels are normal, others are unusual: The stuttered vowels are low in amplitude and short in duration. In two experiments, the effects of amplitude and duration on perception of these vowels are examined. It is shown that, if the amplitude of stuttered vowels is made normal and their duration is lengthened, they sound more like the intended vowels. These experiments lead to the conclusion that low amplitude and short duration are the factors that cause stuttered vowels to sound like schwa. This differs from the view of certain clinicians and theorists who contend that stutterers actually articulate /schwa/'s when these are heard in stuttered speech. Implications for stuttering therapy are considered.

11.
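The Institute of Acoustics, Chinese Academy of Sciences, has built a speech database of Mandarin Chinese (Putonghua). The database consists of initials, finals, 1282 monosyllables, several hundred disyllabic and trisyllabic words, phonetic test sentences, short passages, and the digits 0 to 9. It has six speakers (three male, three female), who are teachers at the Broadcasting Institute and professional announcers and who speak standard Mandarin. The speech material was recorded on high-quality magnetic tape, and part of it has been digitized. Many Chinese speech research groups have already used this database.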

12.
13.
The wave function of Dirac particles in a longitudinal electric field running with the velocity of light is investigated, and the condition for particle capture by this field is presented. Dirac particles with mass $m_0$ and charge $e$ that were previously stationary are captured by a repulsive field of strength $E$ if the longitudinal field extension exceeds $l = m_0 c^2/(eE)$. The captured Dirac particles concentrate, like classical particles, at the distance $l$ from the forefront of the running field, but unlike classical particles, a fraction $n$ of the Dirac particles is not captured by the field, where $n = \exp(-l/2\lambda_0)$ and $\lambda_0$ is the Compton wavelength of the particle. Translated from Izvestiya Vysshikh Uchebnykh Zavedenii, Fizika, No. 10, pp. 25–33, October, 2006.

14.
15.
Although listeners routinely perceive both the sex and individual identity of talkers from their speech, explanations of these abilities are incomplete. Here, variation in vocal production-related anatomy was assumed to affect vowel acoustics thought to be critical for indexical cueing. Integrating this approach with source-filter theory, patterns of acoustic parameters that should represent sex and identity were identified. Due to sexual dimorphism, the combination of fundamental frequency (F0, reflecting larynx size) and vocal tract length cues (VTL, reflecting body size) was predicted to provide the strongest acoustic correlates of talker sex. Acoustic measures associated with presumed variations in supralaryngeal vocal tract-related anatomy occurring within sex were expected to be prominent in individual talker identity. These predictions were supported by results of analyses of 2500 tokens of the /ɛ/ phoneme, extracted from the naturally produced speech of 125 subjects. Classification by talker sex was virtually perfect when F0 and VTL were used together, whereas talker classification depended primarily on the various acoustic parameters associated with vocal-tract filtering.

16.
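Analysis of the nonlinear dynamic characteristics of Chinese speech   Total citations: 2, self-citations: 0, citations by others: 2
Chinese speech at a normal speaking rate was studied in a preliminary way using nonlinear dynamics methods, and the correlation dimension algorithm was modified to suit the characteristics of speech signals. The paper presents reconstructed phase-space plots and correlation dimension curves for Chinese fricatives and monophthongs. Because the two classes differ in their production mechanisms, the correlation dimension algorithm can distinguish fricatives from monophthongs. Preliminary results also indicate that the correlation dimension algorithm can provide information for distinguishing the four tones of Chinese.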

17.
A glimpsing model of speech perception in noise   Total citations: 5, self-citations: 0, citations by others: 5
Do listeners process noisy speech by taking advantage of "glimpses", that is, spectrotemporal regions in which the target signal is least affected by the background? This study used an automatic speech recognition system, adapted for use with partially specified inputs, to identify consonants in noise. Twelve masking conditions were chosen to create a range of glimpse sizes. Several different glimpsing models were employed, differing in the local signal-to-noise ratio (SNR) used for detection, the minimum glimpse size, and the use of information in the masked regions. Recognition results were compared with behavioral data. A quantitative analysis demonstrated that the proportion of the time-frequency plane glimpsed is a good predictor of intelligibility. Recognition scores in each noise condition confirmed that sufficient information exists in glimpses to support consonant identification. Close fits to listeners' performance were obtained at two local SNR thresholds: one at around 8 dB and another in the range -5 to -2 dB. A transmitted information analysis revealed that cues to voicing are degraded more in the model than in human auditory processing.
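
The quantity this abstract calls "the proportion of the time-frequency plane glimpsed" can be sketched as a count of time-frequency cells whose local SNR exceeds a threshold. The sketch is a simplification under stated assumptions: the target and masker levels are given as hypothetical dB grids, a cell counts as glimpsed when target minus masker level is strictly above the threshold, and the minimum glimpse size used in the study is ignored. The name `glimpse_proportion` is hypothetical.

```python
def glimpse_proportion(target_db, masker_db, local_snr_threshold_db):
    """Fraction of time-frequency cells whose local SNR exceeds the threshold."""
    glimpsed = 0
    total = 0
    for target_row, masker_row in zip(target_db, masker_db):
        for target, masker in zip(target_row, masker_row):
            total += 1
            if target - masker > local_snr_threshold_db:
                glimpsed += 1
    return glimpsed / total


# Hypothetical 2 x 3 grids of levels in dB (rows: frequency channels).
target = [[60.0, 55.0, 40.0], [50.0, 45.0, 62.0]]
masker = [[50.0, 58.0, 45.0], [49.0, 50.0, 50.0]]

# The two thresholds echo the values mentioned in the abstract.
at_8_db = glimpse_proportion(target, masker, 8.0)
at_minus_5_db = glimpse_proportion(target, masker, -5.0)
```

Lowering the threshold admits more cells as glimpses, so the proportion can only stay the same or grow.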

18.
A method for computing the speech transmission index (STI) using real speech stimuli is presented and evaluated. The method reduces the effects of some of the artifacts that can be encountered when speech waveforms are used as probe stimuli. Speech-based STIs are computed for conversational and clearly articulated speech in several noisy, reverberant, and noisy-reverberant environments and compared with speech intelligibility scores. The results indicate that, for each speaking style, the speech-based STI values are monotonically related to intelligibility scores for the degraded speech conditions tested. Therefore, the STI can be computed using speech probe waveforms, and the values of the resulting indices are as good predictors of intelligibility scores as those derived from modulation transfer functions (MTFs) by theoretical methods.

19.
Hands-free speech input is required in many modern telecommunication applications that employ autoregressive (AR) techniques such as linear predictive coding. When the hands-free input is obtained in enclosed reverberant spaces such as typical office rooms, the speech signal is distorted by the room transfer function. This paper utilizes theoretical results from statistical room acoustics to analyze the AR modeling of speech under these reverberant conditions. Three cases are considered: (i) AR coefficients calculated from a single observation; (ii) AR coefficients calculated jointly from an M-channel observation (M > 1); and (iii) AR coefficients calculated from the output of a delay-and-sum beamformer. The statistical analysis, with supporting simulations, shows that the spatial expectations of the AR coefficients for cases (i) and (ii) are approximately equal to those from the original speech, while for case (iii) there is a discrepancy due to spatial correlation between the microphones, which can be significant. It is subsequently demonstrated that at each individual source-microphone position (without spatial expectation), the M-channel AR coefficients from case (ii) provide the best approximation to the clean speech coefficients when microphones are closely spaced (<0.3 m).

20.
Computer models of the process of speech articulation require a detailed knowledge of the vocal tract configurations employed in speech and the application of acoustic theory to calculate the sound waveform. Almost all currently available data on vocal tract dimensions come from x-ray films and are severely limited in quantity and coherence due to restrictions on radiation dosage and intersubject differences. We are using MRI techniques to obtain the pharyngeal dimensions of speakers producing sustained vowels. The fact that MRI does not employ ionizing radiation provides speech research with the opportunity to obtain comprehensive bodies of much-needed data on the articulatory characteristics of single subjects.

