Similar articles
 20 similar articles found
1.
The order parameter dynamics of a mean-field model are frequently investigated through macroscopic cumulant dynamics, from which a bifurcation can be predicted qualitatively. In this Letter, to investigate the long-time order parameter dynamics quantitatively, a semi-analytic method is proposed based on approximate nonlinear Fokker-Planck equations. Applying the new method to the mean-field model of periodically driven overdamped bistable oscillators with colored noise, we exhibit the bifurcation behavior and the nonlinear stochastic resonance of the order parameter by tuning the noise intensity or the coupling coefficient, and the accuracy of the new method is verified by direct simulation. Our observations disclose some new properties of the order parameter dynamics of the mean-field model. For example, the periodic signal shifts the critical coupling coefficient to a larger value, while the nonzero correlation time of the colored noise shifts it to a lower value. Our observations also disclose that there is no quantitative correspondence between the resonant peak and the critical bifurcation parameter of the Gaussian moment system.

2.
When speech is in competition with interfering sources in rooms, monaural indicators of intelligibility fail to take account of the listener's abilities to separate target speech from interfering sounds using the binaural system. In order to incorporate these segregation abilities and their susceptibility to reverberation, Lavandier and Culling [J. Acoust. Soc. Am. 127, 387-399 (2010)] proposed a model which combines effects of better-ear listening and binaural unmasking. A computationally efficient version of this model is evaluated here under more realistic conditions that include head shadow, multiple stationary noise sources, and real-room acoustics. Three experiments are presented in which speech reception thresholds were measured in the presence of one to three interferers using real-room listening over headphones, simulated by convolving anechoic stimuli with binaural room impulse-responses measured with dummy-head transducers in five rooms. Without fitting any parameter of the model, there was close correspondence between measured and predicted differences in threshold across all tested conditions. The model's components of better-ear listening and binaural unmasking were validated both in isolation and in combination. The computational efficiency of this prediction method allows the generation of complex "intelligibility maps" from room designs.

3.
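To address the structural complexity of existing sub-Nyquist-sampling approaches to the joint estimation of frequency and two-dimensional direction of arrival (2-D DOA), this paper proposes an L-shaped delayed-array receiver structure based on the modulated wideband converter (MWC). The phase difference between the samples of the delayed and undelayed channels directly yields the carrier frequency, from which the 2-D DOA is then computed. No additional parameter-pairing operation is needed, which avoids the error and added complexity that a pairing step introduces. Exploiting the characteristics of the proposed L-shaped delayed-array structure, a correlation matrix and a trilinear model are constructed, and two parameter-estimation algorithms are proposed. One is based on the estimation of signal parameters via rotational invariance techniques (ESPRIT) algorithm; it has a low computational cost and suits scenarios that require real-time processing. The other is based on canonical decomposition; it is more robust and suits applications with a low signal-to-noise ratio. Simulations show that the method estimates the carrier frequency and 2-D DOA of targets well from sub-Nyquist samples.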
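The phase-difference step mentioned in abstract 3 can be illustrated with a minimal sketch. This is not the paper's MWC-based L-shaped array algorithm. It shows only the underlying principle for a single noise-free complex tone: a channel delayed by τ carries an extra phase of −2πfτ, so f can be read off the phase difference. The function names, the 50 Hz sampling rate, the 1 ms delay, and the 120 Hz tone are all hypothetical choices for illustration, and the estimate is unambiguous only while |f|·τ < 1/2.

```python
import cmath
import math


def estimate_carrier_frequency(undelayed, delayed, delay_s):
    """Estimate a single carrier frequency (Hz) from two sampled channels.

    Assumption: `delayed` is the same complex tone as `undelayed`,
    delayed by `delay_s` seconds, with |f| * delay_s < 0.5.
    """
    # Each product delayed * conj(undelayed) has phase -2*pi*f*delay_s;
    # summing the products averages that phase over all samples.
    acc = sum(d * u.conjugate() for u, d in zip(undelayed, delayed))
    phase_diff = cmath.phase(acc)  # lies in [-pi, pi]
    return -phase_diff / (2.0 * math.pi * delay_s)


def make_tone(freq_hz, sample_times, delay_s=0.0):
    """Hypothetical helper: sample exp(j*2*pi*f*(t - delay)) at given times."""
    return [cmath.exp(2j * math.pi * freq_hz * (t - delay_s))
            for t in sample_times]


# Illustrative values only: a 120 Hz tone sampled at 50 Hz, well below
# its Nyquist rate, with a 1 ms delay between the two channels.
true_freq_hz = 120.0
delay_s = 1e-3
sample_times = [n / 50.0 for n in range(64)]
undelayed = make_tone(true_freq_hz, sample_times)
delayed = make_tone(true_freq_hz, sample_times, delay_s)
estimated_freq_hz = estimate_carrier_frequency(undelayed, delayed, delay_s)
```

The sampling rate never enters the estimate, so the frequency is recovered even though the tone is aliased in each channel. With noise or several simultaneous carriers this simple estimator is not enough, which is presumably where the subspace and decomposition methods named in the abstract come in.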

4.
梁瑞宇  周健  王青云  奚吉  赵力 《声学学报》2015,40(3):446-454
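To improve speech comprehension for hearing-impaired listeners in complex scenes, this paper proposes a binaural sound-source localization algorithm for hearing aids that imitates human hearing. Drawing on the frequency-division property of the cochlea and on auditory masking, the algorithm first decomposes the sound signal into multiple channels and extracts the signals in the frequency bands to which the human ear is most sensitive for interaural time difference (ITD) estimation. It then extracts the valid ITD information based on the Haas effect. Finally, it uses a head-related model to convert the ITD into source-direction information. In addition, to improve localization in reverberant and multi-interferer scenes, a multichannel weighted combination strategy is proposed. Simulations and field tests show that the algorithm is strongly resistant to interference and localizes accurately. Moreover, in intelligibility tests with 7 subjects, the speech-enhancement algorithm combined with the localization algorithm achieved a 3-5 dB improvement over existing hearing-aid enhancement algorithms.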
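As a rough illustration of the ITD estimation step in abstract 4, the sketch below finds the lag that maximizes the cross-correlation between the two ear signals. This is a common textbook estimator swapped in for illustration. The abstract does not state which estimator the authors use, and the cochlear filter bank, the Haas-effect selection, and the head-related model are all omitted. The function name, the 16 kHz sampling rate, and the test signal are hypothetical.

```python
import math


def estimate_itd(left, right, sample_rate_hz, max_lag):
    """Return the ITD in seconds; positive means the right ear lags the left.

    Brute-force cross-correlation over lags in [-max_lag, max_lag] samples.
    """
    n = len(left)
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += left[i] * right[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag / sample_rate_hz


# Illustrative signal: a 500 Hz tone burst with a Gaussian envelope,
# reaching the right ear 5 samples later than the left ear.
sample_rate_hz = 16000
left = [math.sin(2 * math.pi * 500 * i / sample_rate_hz)
        * math.exp(-((i - 100) / 20.0) ** 2) for i in range(256)]
right = [left[i - 5] if i >= 5 else 0.0 for i in range(256)]
itd_s = estimate_itd(left, right, sample_rate_hz, max_lag=16)
```

The `max_lag` bound plays the role of the largest physically plausible ITD. At 16 kHz, 16 samples corresponds to 1 ms, which is an assumed margin rather than a value taken from the paper.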

5.
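This paper proposes a time-frequency-energy representation of speech signals, together with an algorithm for computing it, which can be used for isolated-word speech recognition. The representation has two features: its nonlinear time normalization, based on the short-time energy gradient, retains the transitional characteristics of the speech signal in the frequency domain while discarding its steady-state characteristics; and its computational cost is low, making it suitable for real-time applications.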

6.
The binaural system is well known for its sluggish response to changes in the interaural parameters to which it is sensitive. Theories of binaural unmasking have suggested that detection of signals in noise is mediated by detection of differences in interaural correlation. If these theories are correct, improvements in the intelligibility of speech in favorable binaural conditions are most likely mediated by spectro-temporal variations in interaural correlation of the stimulus which mirror the spectro-temporal amplitude modulations of the speech. However, binaural sluggishness should limit the temporal resolution of the representation of speech recovered by this means. The present study tested this prediction in two ways. First, listeners' masked discrimination thresholds for ascending vs descending pure-tone arpeggios were measured as a function of rate of frequency change in the NoSo and NoSpi binaural configurations. Three-tone arpeggios were presented repeatedly and continuously for 1.6 s, masked by a 1.6-s burst of noise. In a two-interval task, listeners determined the interval in which the arpeggios were ascending. The results showed a binaural advantage of 12-14 dB for NoSpi at 3.3 arpeggios per s (arp/s), which reduced to 3-5 dB at 10.4 arp/s. This outcome confirmed that the discrimination of spectro-temporal patterns in noise is susceptible to the effects of binaural sluggishness. Second, listeners' masked speech-reception thresholds were measured in speech-shaped noise using speech which was 1, 1.5, and 2 times the original articulation rate. The articulation rate was increased using a phase-vocoder technique which increased all the modulation frequencies in the speech without altering its pitch. Speech-reception thresholds were, on average, 5.2 dB lower for the NoSpi than for the NoSo configuration, at the original articulation rate. This binaural masking release was reduced to 2.8 dB when the articulation rate was doubled, but the most notable effect was a 6-8 dB increase in thresholds with articulation rate for both configurations. These results suggest that higher modulation frequencies in masked signals cannot be temporally resolved by the binaural system, but that the useful modulation frequencies in speech are sufficiently low (<5 Hz) that they are invulnerable to the effects of binaural sluggishness, even at elevated articulation rates.

7.
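This paper proposes a time-frequency-energy representation of speech signals, together with an algorithm for computing it, which can be used for isolated-word speech recognition. The representation has two features: its nonlinear time normalization, based on the short-time energy gradient, retains the transitional characteristics of the speech signal in the frequency domain while discarding its steady-state characteristics; and its computational cost is low, making it suitable for real-time applications.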

8.
This study focuses on correlating speech confusion patterns, defined as consonant-vowel confusion as a function of the speech-to-noise ratio, and a model acoustic feature (AF) representation called the AI gram, defined as the articulation index density in the spectrotemporal domain. By collecting many responses from many talkers and listeners, the AF and psychophysical feature (event) are shown to be correlated via the AI-gram model and the confusion matrices at the utterance level, thereby explaining the listener confusion. Consonant /t/ is used as an example to identify its primary robust-to-noise feature, and a precise correlation of the acoustic information with the listeners' confusions is used to label the event. The main spectrotemporal cue defining the /t/ event is an across-frequency temporal coincidence, wherein frequency spread and robustness vary across utterances, while the event remains invariant. The cross-frequency timing event is shown to be the key perceptual feature for consonants in a vowel following context. Coincidences are found to form the basic element of the auditory object. Neural circuits used for coincidence in binaural processing for localization across ears are proposed to be used within one ear across channels. It is further concluded that the event is based on the audibility of the /t/ burst rather than on any superthreshold property.

9.
Animals live in cluttered auditory environments, where sounds arrive at the two ears through several paths. Reflections make sound localization difficult, and it is thought that the auditory system deals with this issue by isolating the first wavefront and suppressing later signals. However, in many situations, reflections arrive too early to be suppressed, for example, reflections from the ground in small animals. This paper examines the implications of these early reflections on binaural cues to sound localization, using realistic models of reflecting surfaces and a spherical model of diffraction by the head. The fusion of direct and reflected signals at each ear results in interference patterns in binaural cues as a function of frequency. These cues are maximally modified at frequencies related to the delay between direct and reflected signals, and therefore to the spatial location of the sound source. Thus, natural binaural cues differ from anechoic cues. In particular, the range of interaural time differences is substantially larger than in anechoic environments. Reflections may potentially contribute binaural cues to distance and polar angle when the properties of the reflecting surface are known and stable, for example, for reflections on the ground.

10.
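To address the loss of robustness of missing-data feature methods in speaker recognition at low signal-to-noise ratio (SNR), a missing-data feature extraction method based on perceptual auditory scene analysis is proposed. First, the missing-data feature spectrum of the speech is computed, and the perceptual speech content is derived from the perceptual characteristics of speech. After perceptual speech enhancement of the noisy speech and two-dimensional enhancement of its spectrogram, the distribution of the speech is obtained, and a perceptual auditory factor is extracted by combining the perceptual speech content with a missing-intensity parameter. Then, together with the missing-data feature spectrum, feature extraction is decomposed into distinct auditory scenes that are analyzed and processed separately, to strengthen the robustness of the speaker recognition system. Experiments show that, in low-SNR environments from -10 dB to 10 dB and for four different types of noise, the proposed method is more robust than all five comparison methods, raising the average recognition rate by 26.0%, 19.6%, 12.7%, 4.6%, and 6.5%, respectively. The proposed method seeks robust speech features in the time-frequency domain and is better suited to speaker recognition in low-SNR environments.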

11.
Although the speech transmission index (STI) is a well-accepted and standardized method for objective prediction of speech intelligibility in a wide range of environments and applications, it is essentially a monaural model. Advantages of binaural hearing in speech intelligibility are disregarded. In specific conditions, this leads to considerable mismatches between subjective intelligibility and the STI. A binaural version of the STI was developed based on interaural cross correlograms, which shows a considerably improved correspondence with subjective intelligibility in dichotic listening conditions. The new binaural STI is designed to be a relatively simple model, which adds only a few parameters to the original standardized STI and changes none of the existing model parameters. For monaural conditions, the outcome is identical to the standardized STI. The new model was validated on a set of 39 dichotic listening conditions, featuring anechoic, classroom, listening room, and strongly echoic environments. For these 39 conditions, speech intelligibility [consonant-vowel-consonant (CVC) word score] and binaural STI were measured. On the basis of these conditions, the relation between binaural STI and CVC word scores closely matches the STI reference curve (standardized relation between STI and CVC word score) for monaural listening. A better-ear STI appears to perform quite well in relation to the binaural STI model; the monaural STI performs poorly in these cases.

12.
A fluid-type floating vibration isolation system was developed based on an anti-resonance mechanism. The mathematical model was derived for theoretical analysis. The system can completely isolate vibration at any specific frequency when the anti-resonance frequency of the floating vibration isolation system is adjusted to the vibration frequency by tuning the added mass of the flowing fluid. Since the approach only alters the inertial force of the added mass rather than changing the entire system stiffness, the system's static stability is preserved during the tuning process, and the system can isolate vibration very effectively at very low frequencies. A prototype of the fluid-type floating vibration isolation system was designed, built, and tested to validate the mathematical model. The experimental results showed good agreement with the theoretical analysis.

13.
Traditional methods often use only monaural masking models to decorrelate input signals for stereo acoustic echo cancellation. However, it seems more reasonable to use binaural masking models, for the following two reasons. First, stereo signals are heard by two ears rather than just one. Second, psychoacoustic researchers have already shown that there are obvious masking level differences between binaural masking models and monaural masking models. By studying binaural masking level difference models, we first introduce a simplified binaural masking model for stereo acoustic echo cancellation. Considering that the interaural time difference is dominant at low frequencies (below about 1.5 kHz) and the interaural level difference is a major cue at higher frequencies, we propose to use different signal decorrelation schemes in these two frequency bands. In the low-frequency band, a pitch-driven sinusoidal injection scheme is proposed to maintain the interaural time difference, where the amount of injection is determined by the proposed binaural masking model. In the high-frequency band, a modified sinusoidal phase modulation scheme is applied to make a trade-off between preserving the interaural level difference and decorrelating the stereophonic input signals. Assessment results show that the proposed method can effectively mitigate the non-uniqueness problem and retain good speech quality.

14.
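In industrial production, especially in chemical engineering, many units are open-loop unstable processes, and many also exhibit time delay. For this situation, a two-degree-of-freedom control structure is proposed. Its advantage is that set-point tracking and disturbance response are completely decoupled: the parameters of the set-point tracking controller and of the disturbance controller are tuned independently, with no need for compromise, which ensures the stability and robustness of the system. Finally, simulation results for first-order and second-order unstable time-delay systems show that the proposed two-degree-of-freedom control structure effectively achieves robust stability and disturbance rejection.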

15.
This paper proposes an adaptive filter-based method for detection and frequency estimation of whistle calls, such as the calls of birds and marine mammals, which are typically analyzed in the time-frequency domain using a spectrogram. The approach taken here is based on adaptive notch filtering, which is an established technique for frequency tracking. For application to automatic whistle processing, methods for detection and improved frequency tracking through frequency crossings as well as interfering transients are developed and coupled to the frequency tracker. Background noise estimation and compensation is accomplished using order statistics and pre-whitening. Using simulated signals as well as recorded calls of marine mammals and a human whistled speech utterance, it is shown that the proposed method can detect more simultaneous whistles than two competing spectrogram-based methods while not reporting any false alarms on the example datasets. In one example, it extracts complete 1.4 and 1.8 s bottlenose dolphin whistles successfully through frequency crossings. The method performs detection and estimates frequency tracks even at high sweep rates. The algorithm is also shown to be effective on human whistled utterances.

16.
唐友福  刘树林  雷娜  姜锐红  刘颖慧 《物理学报》2012,61(17):170504-170504
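To address the limitations of the traditional power spectrum in its notion of frequency and the inherent shortcomings of the Fourier transform, a new concept of generalized local frequency is proposed. Based on the adaptive peak decomposition method, the frequency-domain dynamics of the periodically driven Duffing system are studied as the damping parameter r varies. A frequency bifurcation phenomenon is found, and the chaotic time series for different values of r exhibit continuous frequency bands of similar shape near the center frequency. Through Hermite demodulation analysis, it is concluded that chaotic time series show frequency modulation and similarity of frequency modulation. These results show that the proposed generalized local frequency method based on adaptive peak decomposition can effectively extract the frequency-domain characteristics of the Duffing system, providing a new way to observe the continuous distribution of frequency in the chaotic state of nonlinear systems.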

17.
Auditory filter bandwidths and time constants were obtained with five normal-hearing subjects for different masker configurations both in the frequency and time domain for monaural and binaural listening conditions. Specifically, the masking level in the monaural condition and the interaural correlation in the binaural conditions, respectively, were changed in a sinusoidal, stepwise, and rectangular way in the frequency domain. In the corresponding experiments in the time domain, a sinusoidal and stepwise change of the masker was performed. From these results, a comparison was made across conditions to evaluate the influence of the factors "shape of transition," "monaural versus binaural," "frequency domain versus time domain," and "subject." Also, the respective data from the literature were considered using the same model assumptions and fitting strategy as used for the current data. The results indicate that the monaural auditory filter bandwidths and time constants fitted to the data are consistent across conditions both for the data included in this study and the data from the literature. No consistent relation between individual auditory filter bandwidths and time constants was found across subjects. For the binaural conditions, however, considerable differences were found in estimates of the bandwidths and time constants, respectively, across conditions. The reason for this mismatch seems to be the different detection strategies employed for the various tasks that are affected by the consistency of binaural information across frequency and time. While monaural detection performance appears to be modeled quite well with a linear filter or temporal integration window, this does not hold for the binaural conditions where both larger bandwidth and time constant estimates are found.

18.
Binaural speech intelligibility of individual listeners under realistic conditions was predicted using a model consisting of a gammatone filter bank, an independent equalization-cancellation (EC) process in each frequency band, a gammatone resynthesis, and the speech intelligibility index (SII). Hearing loss was simulated by adding uncorrelated masking noises (according to the pure-tone audiogram) to the ear channels. Speech intelligibility measurements were carried out with 8 normal-hearing and 15 hearing-impaired listeners, collecting speech reception threshold (SRT) data for three different room acoustic conditions (anechoic, office room, cafeteria hall) and eight directions of a single noise source (speech in front). Artificial EC processing errors derived from binaural masking level difference data using pure tones were incorporated into the model. Except for an adjustment of the SII-to-intelligibility mapping function, no model parameter was fitted to the SRT data of this study. The overall correlation coefficient between predicted and observed SRTs was 0.95. The dependence of the SRT of an individual listener on the noise direction and on room acoustics was predicted with a median correlation coefficient of 0.91. The effect of individual hearing impairment was predicted with a median correlation coefficient of 0.95. However, for mild hearing losses the release from masking was overestimated.

19.
Room response equalization systems are used for improving the listening experience in cinema theatres, home theatres, and car hi-fi systems. In this paper, an adaptive multichannel and multiple position room response equalization system and its real-time implementation are described. An adaptive and accurate estimation of the room responses is provided by introducing a normalized least mean square optimization approach with a variable step-size, and by taking advantage of an interchannel coherence reduction technique based on the missing fundamental phenomenon. Then, the equalizer is designed in the warped frequency domain to improve equalization in the low frequency region, reduce the computational cost of the design procedure, and derive an algorithm capable of working in real time. Indeed, a real-time implementation of the proposed adaptive equalizer has been obtained on the NU-Tech framework and has been used to provide an in-depth objective and subjective evaluation of the equalization system. The results of these evaluations illustrate the effectiveness of the proposed approach, also in comparison with other state-of-the-art techniques.
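The normalized least mean square (NLMS) estimation mentioned in abstract 19 can be sketched as a plain system-identification loop. This is a generic single-channel NLMS with a fixed step size, written only to show the update rule. The paper's variable step-size rule, its multichannel handling, and its coherence reduction stage are not described in the abstract and are not reproduced here. The function name, the three-tap "room response", and the seeded test signal are hypothetical.

```python
import random


def nlms_identify(x, d, num_taps, step=0.5, eps=1e-8):
    """Estimate FIR coefficients w so that the filtered input x tracks d.

    x: input samples, d: desired (measured) samples of the same length.
    step: fixed NLMS step size (0 < step < 2); eps avoids division by zero.
    """
    w = [0.0] * num_taps
    buf = [0.0] * num_taps  # most recent input sample first
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))   # filter output
        e = dn - y                                   # estimation error
        norm = eps + sum(b * b for b in buf)         # input power
        w = [wi + step * e * bi / norm for wi, bi in zip(w, buf)]
    return w


# Illustrative setup: identify an assumed 3-tap response from noise-free data.
rng = random.Random(0)
true_response = [0.5, -0.3, 0.1]
x = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
d = [sum(h * x[n - k] for k, h in enumerate(true_response) if n - k >= 0)
     for n in range(len(x))]
estimated_response = nlms_identify(x, d, num_taps=3)
```

Dividing the update by the input power is what distinguishes NLMS from plain LMS: convergence speed then depends much less on the signal level. A variable step size, as in the paper, would typically trade fast initial convergence against low steady-state error.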

20.
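Whispered-speech speaker identification based on multiband demodulation analysis and instantaneous frequency estimation   Total citations: 4, self-citations: 0, citations by others: 4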
王敏  赵鹤鸣 《声学学报》2010,35(4):471-476
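To improve the robustness of speaker identification from whispered speech, a whispered-speech feature based on the amplitude modulation-frequency modulation (AM-FM) model, the instantaneous frequency estimate (IFE), is proposed. Following the formant modulation theory of speech production, multiband demodulation analysis (MDA) is used to obtain the instantaneous envelope and frequency of the speech. The IFE feature is then obtained as a weighted estimate of envelope amplitude and frequency, and describes the frequency structure of the speech. The feature was applied to whispered-speech speaker identification and compared with conventional Mel-frequency cepstral coefficients (MFCC). Experiments show that, as the number of test speakers increases, IFE performs slightly better than MFCC, and that when the test channel changes, IFE is considerably more robust than MFCC.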
