Similar literature
1.
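王叶斌, 赵鹤鸣. 《声学学报》, 2009, 34(3): 275-280
This paper proposes a method for tracking the vocal tract resonance (VTR) trajectories of continuous speech using auxiliary-variable particle filtering. The method builds on a state-space model that describes the features of the speech signal and uses particle filtering to track the VTR trajectories. The speech model consists of a target-directed dynamic equation and a nonlinear mapping from VTRs to linear prediction cepstral coefficients (LPCC). The method has two distinguishing features. First, particle filtering handles the nonlinearity of the speech model. Second, an auxiliary variable embedded in the state equation of the speech model indicates how the VTRs are distributed in the frequency domain and gives target guidance for particle sampling during filtering. Experimental results show that the method tracks the VTR trajectories of continuous speech correctly with only a small number of particles and avoids interference from spurious peaks and merged peaks during tracking.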
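As a rough illustration of the nonlinear VTR-to-LPCC mapping mentioned in this abstract, the sketch below uses the closed-form all-pole relation that is common in the VTR tracking literature: each resonance with frequency f_k and bandwidth b_k contributes (2/n)·exp(-π n b_k / f_s)·cos(2π n f_k / f_s) to cepstral coefficient c_n. Whether the authors use exactly this form is an assumption, and the function name, parameters and example values are hypothetical.

```python
import math


def vtr_to_lpcc(freqs_hz, bandwidths_hz, fs_hz, order):
    """Map resonance frequencies and bandwidths to cepstral coefficients c_1..c_order.

    Assumed mapping (not confirmed by the abstract): resonance k adds
    (2/n) * exp(-pi*n*b_k/fs) * cos(2*pi*n*f_k/fs) to coefficient c_n.
    """
    if len(freqs_hz) != len(bandwidths_hz):
        raise ValueError("one bandwidth is required per resonance frequency")
    coeffs = []
    for n in range(1, order + 1):
        c_n = 0.0
        for f, b in zip(freqs_hz, bandwidths_hz):
            decay = math.exp(-math.pi * n * b / fs_hz)        # bandwidth term
            phase = math.cos(2.0 * math.pi * n * f / fs_hz)   # frequency term
            c_n += (2.0 / n) * decay * phase
        coeffs.append(c_n)
    return coeffs


# Hypothetical example: three resonances of a neutral vowel, 8 kHz sampling.
lpcc = vtr_to_lpcc([500.0, 1500.0, 2500.0], [60.0, 90.0, 120.0], 8000.0, 12)
```

In a particle filter, a mapping of this kind would serve as the observation function: each particle's hypothesised VTR state is mapped to cepstra and compared with the cepstra measured from the speech frame.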

2.
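Minimum temporal perception threshold of Mandarin triphthong syllables and its acoustic characteristics
祖漪清. 《应用声学》, 1994, 12(2): 27-34
The experimental material was taken from the speech database of the Institute of Linguistics, Chinese Academy of Social Sciences. The database holds speech from 15 young male speakers, giving 15 × 15 = 225 triphthong syllables in total. Starting from Mandarin triphthongs and based on statistics over the 15 speakers' material, this study measures and analyses the minimum temporal perception threshold Tlim in order to identify, from acoustic and perceptual perspectives, the information that is indispensable to a triphthong and the information that is redundant. The results show that formant behaviour within Tlim falls into two categories. The first is dynamic characteristics: (a) ΔF1 > 90% and ΔF2 is about 50%; (b) Tlim contains at least one of the two inflection points of F1 and F3; (c) Tlim contains the portion where F2 changes most sharply. These points hold consistently for the four triphthongs. The second is boundary conditions: Tlim is constrained in both position and size, which shows that the formant frequencies at its boundaries are very important.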

3.
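Resonance scattering spectroscopy of CoFe2O4 nanoparticles
Liquid-phase CoFe2O4 nanoparticles produce five resonance scattering peaks at 400, 470, 510, 800 and 940 nm. They are a nonlinear light-scattering medium. With an excitation wavelength of 330 nm, CoFe2O4 nanoparticles produce a resonance scattering peak, a 1/2 fractional-frequency scattering peak and a 1/3 fractional-frequency scattering peak at 330, 660 and 990 nm, respectively. With an excitation wavelength of 800 nm, they produce a resonance scattering peak at 800 nm and a frequency-doubled scattering peak at 400 nm that is stronger than the resonance scattering peak. Fractional-frequency scattering and frequency-multiplied scattering behave similarly to resonance scattering. The resonance scattering spectrum of the CoFe2O4 nanoparticle system is interpreted qualitatively using an established principle of resonance scattering spectra for grey-white particle systems.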
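The peak positions quoted in this abstract are consistent with simple wavelength arithmetic: scattering at a fraction or multiple of the excitation frequency appears at the inverse multiple of the excitation wavelength. A worked check, using only the wavelengths given in the abstract (the symbols are introduced here for illustration):

```latex
\lambda_{\mathrm{scat}} = \frac{c}{\nu_{\mathrm{scat}}}
  = \lambda_{\mathrm{ex}} \, \frac{\nu_{\mathrm{ex}}}{\nu_{\mathrm{scat}}}
\\[4pt]
\nu_{\mathrm{scat}} = \tfrac{1}{2}\,\nu_{\mathrm{ex}}
  \;\Rightarrow\; \lambda_{\mathrm{scat}} = 2 \times 330\ \mathrm{nm} = 660\ \mathrm{nm}
\\[4pt]
\nu_{\mathrm{scat}} = \tfrac{1}{3}\,\nu_{\mathrm{ex}}
  \;\Rightarrow\; \lambda_{\mathrm{scat}} = 3 \times 330\ \mathrm{nm} = 990\ \mathrm{nm}
\\[4pt]
\nu_{\mathrm{scat}} = 2\,\nu_{\mathrm{ex}}
  \;\Rightarrow\; \lambda_{\mathrm{scat}} = \tfrac{1}{2} \times 800\ \mathrm{nm} = 400\ \mathrm{nm}
```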

4.
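俞振利, 程伯中. 《声学学报》, 2000, 25(5): 455-462
A new method is proposed for speech synthesis that targets formant trajectories, based on a speech production model and the RTLA synthesis mode of an articulatory model. The method implements the speech synthesizer with a reflection-type transmission line model grounded in articulatory acoustics. The vocal tract area function parameters that control the synthesizer are obtained from an inverse solution of speech production that takes three formant trajectories as its target. The method yields synthetic speech with good dynamic transitions and naturalness, allows voice timbre to be controlled or altered conveniently and flexibly, and requires few input control parameters and a low parameter update rate.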

5.
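A multi-array passive tracking algorithm for maneuvering targets
许兆鹏, 韩树平. 《应用声学》, 2011, 30(4): 282-287
Tracking a maneuvering underwater target from bearing information acquired by several sonar arrays is essentially a nonlinear state estimation problem. The paper first uses the bearing information from each array to obtain a preliminary least-squares estimate of the target position at each sampling instant. These estimates are then fed as measurements into the interacting multiple model (IMM) algorithm combined with a linear Kalman filter (KF) to obtain the target's velocity and trajectory. This avoids the problems that arise when nonlinear estimation algorithms are applied directly to fuse multiple bearing measurements. Simulation results show that the algorithm is simple and that, compared with two-array bearings-only passive tracking of maneuvering targets, it converges faster and tracks more accurately.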
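The first step described in this abstract, a least-squares position fix from several bearings, can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes a 2-D plane, known array positions, and bearings measured counter-clockwise from the +x axis in radians; the function name and example geometry are hypothetical.

```python
import math


def triangulate_bearings(sensors, bearings_rad):
    """Least-squares intersection of bearing lines from several sensors.

    Each sensor (xi, yi) with bearing theta defines the line
    sin(theta)*(x - xi) - cos(theta)*(y - yi) = 0. The 2x2 normal
    equations of the stacked system are solved in closed form.
    """
    s_ss = s_sc = s_cc = b_x = b_y = 0.0
    for (xi, yi), theta in zip(sensors, bearings_rad):
        s, c = math.sin(theta), math.cos(theta)
        d = s * xi - c * yi          # right-hand side of this line equation
        s_ss += s * s
        s_sc += s * c
        s_cc += c * c
        b_x += s * d
        b_y -= c * d
    det = s_ss * s_cc - s_sc * s_sc
    if abs(det) < 1e-12:
        raise ValueError("bearings are (nearly) parallel; position is not observable")
    # Invert [[s_ss, -s_sc], [-s_sc, s_cc]] applied to [b_x, b_y].
    x = (s_cc * b_x + s_sc * b_y) / det
    y = (s_sc * b_x + s_ss * b_y) / det
    return x, y


# Hypothetical geometry: three arrays observing a target at (4, 3).
arrays = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
target = (4.0, 3.0)
bearings = [math.atan2(target[1] - ay, target[0] - ax) for ax, ay in arrays]
estimate = triangulate_bearings(arrays, bearings)
```

Because the position fix is linear in the unknowns, its output can be treated as a Cartesian measurement by a linear Kalman filter, which is the motivation the abstract gives for this two-stage design.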

6.
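To improve the real-time performance and robustness of target tracking with convolutional features, a real-time tracking method with adaptive switching between two models is proposed. It exploits the fact that features from different convolutional layers differ in how well they represent different targets. For the two selected convolutional layers, the method evaluates each convolutional feature by the ratio of its mean energy in the target region to its mean energy in the tracking search region, and selects the channels whose mean energy ratio exceeds a given threshold to train two correlation filter classifiers. It then uses the peak-to-sidelobe ratio of the target's correlation filter response map to switch adaptively between the two classifiers when predicting the target position, and finally updates the classifiers with a sparse model update strategy. In tests on a standard dataset, the algorithm reaches a mean distance precision of 89.3%, close to that of the continuous convolution tracking algorithm, and a mean tracking speed of 25.8 frames/s, 25 times that of the continuous convolution tracking algorithm. Its overall performance is better than that of the trackers compared in the experiments.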
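The peak-to-sidelobe ratio used for switching can be sketched as below. The abstract does not give the exact definition or the switching rule, so this assumes the common form (peak minus sidelobe mean, divided by sidelobe standard deviation, with a small window around the peak excluded). The `pick_classifier` rule, which simply prefers the response map with the higher ratio, is a hypothetical stand-in.

```python
import math


def peak_to_sidelobe_ratio(response, exclude_radius=1):
    """PSR of a 2-D response map given as a list of rows.

    Assumed definition: (peak - mean(sidelobe)) / std(sidelobe), where the
    sidelobe is every cell outside a square window around the peak.
    """
    rows, cols = len(response), len(response[0])
    peak, pr, pc = max(
        (response[r][c], r, c) for r in range(rows) for c in range(cols)
    )
    sidelobe = [
        response[r][c]
        for r in range(rows)
        for c in range(cols)
        if abs(r - pr) > exclude_radius or abs(c - pc) > exclude_radius
    ]
    if not sidelobe:
        raise ValueError("response map is too small for the exclusion window")
    mean = sum(sidelobe) / len(sidelobe)
    std = math.sqrt(sum((v - mean) ** 2 for v in sidelobe) / len(sidelobe))
    if std == 0.0:
        return math.inf  # perfectly flat sidelobe
    return (peak - mean) / std


def pick_classifier(response_a, response_b):
    """Hypothetical rule: return 0 or 1 for the map with the higher PSR."""
    psr_a = peak_to_sidelobe_ratio(response_a)
    psr_b = peak_to_sidelobe_ratio(response_b)
    return 0 if psr_a >= psr_b else 1


def make_map(peak_value):
    """Synthetic 7x7 response: checkerboard background with a central peak."""
    return [
        [peak_value if (r, c) == (3, 3) else 0.1 * ((r + c) % 2) for c in range(7)]
        for r in range(7)
    ]


chosen = pick_classifier(make_map(1.0), make_map(0.3))
```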

7.
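Seal identification is an important part of forensic science. To better combat criminal offences, this article proposes a rapid, non-destructive way to examine how a cross-page seal (a seal impressed across adjoining pages) was formed, using a Fourier transform infrared imaging system. Combining area scanning with reflection mode, visible and infrared images of different parts of the cross-page seal are scanned first. Whether the seal was formed in a single stamping is then determined from the positions of the absorption peaks in the infrared spectrum of the stamp ink, or from the absorbance ratios of its characteristic absorption peaks. The infrared spectra show that for a seal formed in a single stamping, the absorption peak positions or characteristic absorbance ratios of the ink are essentially the same across the parts, whereas for a seal not formed in a single stamping they differ greatly. The method is fast, accurate, objective and non-destructive.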

8.
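Effect of the shape of three-layer nanoparticles on their extinction properties
Three-layer composite nanoparticles of different shapes were fabricated by high-resolution electron-beam lithography, and the effect of shape variation on their extinction properties was studied. Measurements show that when the incident polarization is parallel to the short axis, the resonance peak blue-shifts as the aspect ratio increases; when the polarization is parallel to the long axis, the resonance peak red-shifts as the aspect ratio increases. The extinction properties were also computed numerically using the finite-difference time-domain (FDTD) method and a Lorentz model for surface plasmons. The computed extinction spectra and the trend of the resonance peak position agree well with experiment. In addition, the effect of the thickness of the main material layer was studied: as the thickness varies from 20 to 90 nm, the resonance peak blue-shifts by 3 to 115 nm.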

9.
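赵毅, 尹雪飞, 陈克安. 《应用声学》, 2010, 29(6): 416-424
Formant frequency is an important parameter of the speech signal. Traditional formant detection algorithms based on linear prediction are limited by their computational cost and are difficult to run in real time. This paper proposes a formant frequency detection algorithm based on the cepstral transform. A post-processing stage compares the results detected from the second derivative of the log-magnitude response and from the first derivative of the phase response of the vocal tract impulse response, removes spurious peaks and separates merged formants, which improves detection accuracy. Simulations show that the algorithm is computationally efficient and maintains good detection performance at low signal-to-noise ratios.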

10.
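Baseline correction is a common way to remove fluorescence interference from spectra and is a necessary step in Raman spectral data processing. The traditional polynomial-fitting baseline correction algorithm is simple and easy to implement, but the fitting order is hard to determine and the method is inflexible. Here non-uniform B-splines replace the polynomial in the fit. While retaining the advantages of the original algorithm, the method uses the peak positions of the raw Raman spectrum to determine the knot vector of the non-uniform B-spline adaptively, and then fits the spectral baseline with a fixed order. B-splines are piecewise smooth by nature, and the peak positions used by the adaptive knot-vector selection are obtained by applying the continuous wavelet transform (CWT) twice with different mother wavelets. This strengthens the link between the raw spectral data and the B-spline algorithm and overcomes the shortcomings of traditional polynomial fitting. To validate the algorithm, two analytes, methyl parathion and a commercial brand of rapeseed oil, were measured and baseline-corrected with it, and the results were compared with those of two other baseline correction algorithms. The experiments show that the method achieves good correction with a fixed fitting order, needs few parameters, and neither over-fits nor under-fits, making it an effective baseline correction algorithm for Raman spectra.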

11.
The speech perception of two multiple-channel cochlear implant patients was compared with that of three normally hearing listeners using an acoustic model of the implant for 22 different speech tests. The tests used included a minimal auditory capabilities battery, both closed-set and open-set word and sentence tests, speech tracking and a 12-consonant confusion study using nonsense syllables. The acoustic model represented electrical current pulses by bursts of noise and the effects of different electrodes were represented by using bandpass filters with different center frequencies. All subjects used a speech processor that coded the fundamental voicing frequency of speech as a pulse rate and the second formant frequency of speech as the electrode position in the cochlea, or the center frequency of the bandpass filter. Very good agreement was found for the two groups of subjects, indicating that the acoustic model is a useful tool for the development and evaluation of alternative cochlear implant speech processing strategies.

12.
This study concerned the effect of the first subglottal formant (F1') on the modal-falsetto register transition in males and females. Phonations using air and a helium-oxygen mixture (helox) were used in a comparative study to tease apart possible acoustic and myoelastic contributions to involuntary register transitions. Recordings of the first subglottal formant and its accompanying bandwidths, and the lower and upper shift point marking the outer boundaries of abrupt register transitions, were obtained via a neck-mounted accelerometer, and analyzed using spectrograms and power spectra on a K-5500 Sona-Graph. The four subjects had their hearing masked bilaterally with speech level noise to increase the likelihood of involuntary register transition via minimized auditory feedback. In three of the four test subjects registration was surmised to be primarily a laryngeal event, as evidenced by the similar frequency dependency of voice breaks in both air and helox. It may be hypothesized that subglottal resonance influenced register transition in the fourth subject, as voice breaks rose with helox-induced phonation; however, this result did not reach statistical significance. Therefore, in this experiment subglottal resonance was not found to have a significant influence on register transition as originally hypothesized.

13.
An extensive developmental acoustic study of the speech patterns of children and adults was reported by Lee and colleagues [Lee et al., J. Acoust. Soc. Am. 105, 1455-1468 (1999)]. This paper presents a reexamination of selected fundamental frequency and formant frequency data presented in their report for ten monophthongs by investigating sex-specific and developmental patterns using two different approaches. The first of these is the investigation of age- and sex-specific formant frequency patterns in the monophthongs. The second is the investigation of fundamental frequency and formant frequency data using the critical band rate (bark) scale and a number of acoustic-phonetic dimensions of the monophthongs from an age- and sex-specific perspective. These acoustic-phonetic dimensions include: vowel spaces and distances from speaker centroids; frequency differences between the formant frequencies of males and females; vowel openness/closeness and frontness/backness; the degree of vocal effort; and formant frequency ranges. Both approaches reveal age- and sex-specific development patterns which also appear to be dependent on whether vowels are peripheral or nonperipheral. The developmental emergence of these sex-specific differences is discussed with reference to anatomical, physiological, sociophonetic, and culturally determined factors. Some directions for further investigation into the age-linked sex differences in speech across the lifespan are also proposed.
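For readers unfamiliar with the critical band rate (bark) scale mentioned in this abstract, the sketch below converts frequencies in Hz to bark using Traunmüller's widely used approximation, z = 26.81 f / (1960 + f) - 0.53. The abstract does not say which conversion the paper uses, so this choice is an assumption, and the function names and formant values are hypothetical.

```python
def hz_to_bark(f_hz):
    """Approximate critical band rate in bark (Traunmueller-style formula).

    Assumption: the abstract does not state which Hz-to-bark conversion
    was used; this is one common approximation.
    """
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53


def bark_difference(f_low_hz, f_high_hz):
    """Distance between two frequencies on the bark scale."""
    return hz_to_bark(f_high_hz) - hz_to_bark(f_low_hz)


# Hypothetical formant values for one vowel token (Hz).
f1_hz, f2_hz = 500.0, 1500.0
f2_minus_f1_bark = bark_difference(f1_hz, f2_hz)
```

Differences such as F2 minus F1 in bark are one common way to express dimensions like frontness/backness on a perceptually motivated scale.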

14.
Responses of auditory-nerve fibers in anesthetized cats to nine different spoken stop- and nasal-consonant/vowel syllables presented at 70 dB SPL in various levels of speech-shaped noise [signal-to-noise (S/N) ratios of 30, 20, 10, and 0 dB] are reported. The temporal aspects of speech encoding were analyzed using spectrograms. The responses of the "lower-spontaneous-rate" fibers (less than 20/s) were found to be more limited than those of the high-spontaneous-rate fibers. The lower-spontaneous-rate fibers did not encode noise-only portions of the stimulus at the lowest noise level (S/N = 30 dB) and only responded to the consonant if there was a formant or major spectral peak near its characteristic frequency. The fibers' responses at the higher noise levels were compared to those obtained at the lowest noise level using the covariance as a quantitative measure of signal degradation. The lower-spontaneous-rate fibers were found to preserve more of their initial temporal encoding than high-spontaneous-rate fibers of the same characteristic frequency. The auditory-nerve fibers' responses were also analyzed for rate-place encoding of the stimuli. The results are similar to those found for temporal encoding.

15.
In this paper, a fundamental frequency (F0) tracking algorithm is presented that is extremely robust for both high quality and telephone speech, at signal-to-noise ratios ranging from clean speech to very noisy speech. The algorithm is named "YAAPT," for "yet another algorithm for pitch tracking." The algorithm is based on a combination of time domain processing, using the normalized cross correlation, and frequency domain processing. Major steps include processing of the original acoustic signal and a nonlinearly processed version of the signal, the use of a new method for computing a modified autocorrelation function that incorporates information from multiple spectral harmonic peaks, peak picking to select multiple F0 candidates and associated figures of merit, and extensive use of dynamic programming to find the "best" track among the multiple F0 candidates. The algorithm was evaluated by using three databases and compared to three other published F0 tracking algorithms by using both high quality and telephone speech for various noise conditions. For clean speech, the error rates obtained are comparable to those obtained with the best results reported for any other algorithm; for noisy telephone speech, the error rates obtained are lower than those obtained with other methods.

16.
These studies investigated formant frequency discrimination by Japanese macaques (Macaca fuscata) using an AX discrimination procedure and techniques of operant conditioning. Nonhuman subjects were significantly more sensitive to increments in the center frequency of either the first (F1) or second (F2) formant of single-formant complexes than to corresponding pure-tone frequency shifts. Furthermore, difference limens (DLs) for multiformant signals were not significantly different than those for single-formant stimuli. These results suggest that Japanese monkeys process formant and pure-tone frequency increments differentially and that the same mechanisms mediate formant frequency discrimination in single-formant and vowel-like complexes. The importance of two of the cues available to mediate formant frequency discrimination, changes in the phase and the amplitude spectra of the signals, was investigated by independently manipulating these two parameters. Results of the studies indicated that phase cues were not a significant feature of formant frequency discrimination by Japanese macaques. Rather, subjects attended to relative level changes in harmonics within a narrow frequency range near F1 and F2 to detect formant frequency increments. These findings are compared to human formant discrimination data and suggest that both species rely on detecting alterations in spectral shape to discriminate formant frequency shifts. Implications of the results for animal models of speech perception are discussed.

17.
Three alternative speech coding strategies suitable for use with cochlear implants were compared in a study of three normally hearing subjects using an acoustic model of a multiple-channel cochlear implant. The first strategy (F2) presented the amplitude envelope of the speech and the second formant frequency. The second strategy (F0 F2) included the voice fundamental frequency, and the third strategy (F0 F1 F2) presented the first formant frequency as well. Discourse level testing with the speech tracking method showed a clear superiority of the F0 F1 F2 strategy when the auditory information was used to supplement lipreading. Tracking rates averaged over three subjects for nine 10-min sessions were 40 wpm for F2, 52 wpm for F0 F2, and 66 wpm for F0 F1 F2. Vowel and consonant confusion studies and a test of prosodic information were carried out with auditory information only. The vowel test showed a significant difference between the strategies, but no differences were found for the other tests. It was concluded that the amplitude and duration cues common to all three strategies accounted for the levels of consonant and prosodic information received by the subjects, while the different tracking rates were a consequence of the better vowel recognition and the more natural quality of the F0 F1 F2 strategy.

18.
Recent studies have shown that time-varying changes in formant pattern contribute to the phonetic specification of vowels. This variation could be especially important in children's vowels, because children have higher fundamental frequencies (f0's) than adults, and formant-frequency estimation is generally less reliable when f0 is high. To investigate the contribution of time-varying changes in formant pattern to the identification of children's vowels, three experiments were carried out with natural and synthesized versions of 12 American English vowels spoken by children (ages 7, 5, and 3 years) as well as adult males and females. Experiment 1 showed that (i) vowels generated with a cascade formant synthesizer (with hand-tracked formants) were less accurately identified than natural versions; and (ii) vowels synthesized with steady-state formant frequencies were harder to identify than those which preserved the natural variation in formant pattern over time. The decline in intelligibility was similar across talker groups, and there was no evidence that formant movement plays a greater role in children's vowels compared to adults. Experiment 2 replicated these findings using a semi-automatic formant-tracking algorithm. Experiment 3 showed that the effects of formant movement were the same for vowels synthesized with noise excitation (as in whispered speech) and pulsed excitation (as in voiced speech), although, on average, the whispered vowels were less accurately identified than their voiced counterparts. Taken together, the results indicate that the cues provided by changes in the formant frequencies over time contribute materially to the intelligibility of vowels produced by children and adults, but these time-varying formant frequency cues do not interact with properties of the voicing source.

19.
Auditory feedback influences human speech production, as demonstrated by studies using rapid pitch and loudness changes. Feedback has also been investigated using the gradual manipulation of formants in adaptation studies with whispered speech. In the work reported here, the first formant of steady-state isolated vowels was unexpectedly altered within trials for voiced speech. This was achieved using a real-time formant tracking and filtering system developed for this purpose. The first formant of vowel /ɛ/ was manipulated 100% toward either /æ/ or /ɪ/, and participants responded by altering their production with average F1 compensation as large as 16.3% and 10.6% of the applied formant shift, respectively. Compensation was estimated to begin <460 ms after stimulus onset. The rapid formant compensations found here suggest that auditory feedback control is similar for both F0 and formants.

20.
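A pitch detection algorithm based on the lifting-wavelet-weighted autocorrelation function
王晨, 章小兵, 刘美娟. 《应用声学》, 2018, 37(2): 201-207
With the development of computer technology, speech signal processing has become an important channel for human-computer interaction, and the algorithms used to detect speech features in complex noise directly affect computational efficiency. The pitch period is one of the important parameters in speech feature extraction. To address the low accuracy of traditional pitch detection algorithms in noise, a pitch detection algorithm is proposed that is based on the autocorrelation function of the linear prediction error, weighted by an adaptive lifting wavelet transform. The method uses a weighted sum of multi-level lifting-wavelet approximation coefficients to compensate for the decay of the autocorrelation amplitude as the time lag increases, and uses the autocorrelation of the linear prediction error to suppress formant interference. The two are then combined to emphasize the peak at the pitch period. Experimental results show that, compared with the traditional autocorrelation method and the wavelet-weighting method, the proposed method effectively weakens the influence of formants, emphasizes the peak at the pitch period, improves pitch period detection accuracy, and is more robust.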
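For context, the traditional autocorrelation method that this abstract uses as a baseline can be sketched as follows. This is the plain textbook approach, not the proposed lifting-wavelet-weighted algorithm; the function name, search range and test signal are assumptions made for illustration.

```python
import math


def autocorr_pitch_hz(signal, fs_hz, f_min_hz=60.0, f_max_hz=400.0):
    """Estimate pitch as fs / lag, where lag maximises the raw autocorrelation.

    Only lags that correspond to pitches between f_min_hz and f_max_hz
    are searched.
    """
    lag_min = int(fs_hz / f_max_hz)
    lag_max = int(fs_hz / f_min_hz)
    if len(signal) <= lag_max:
        raise ValueError("frame is too short for the requested pitch range")
    best_lag, best_value = lag_min, -math.inf
    for lag in range(lag_min, lag_max + 1):
        # Unnormalised autocorrelation over the overlapping part of the frame.
        r = sum(signal[n] * signal[n + lag] for n in range(len(signal) - lag))
        if r > best_value:
            best_lag, best_value = lag, r
    return fs_hz / best_lag


# Hypothetical test frame: 100 ms of a 200 Hz sine sampled at 8 kHz.
fs = 8000.0
frame = [math.sin(2.0 * math.pi * 200.0 * n / fs) for n in range(800)]
pitch = autocorr_pitch_hz(frame, fs)
```

Because the sum runs over fewer samples at larger lags, the raw autocorrelation decays with lag, which is the weakness the abstract says the wavelet weighting is meant to compensate for.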
