1.

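Yan Wei, Zhu Yongfa, Xie Weiyu, Song Hongwei, Zhang Chaoyang, Yang Minghui 《Chinese Journal of Chemical Physics》2021,34(6):717-727

Although the many-body expansion method has been widely used to estimate the energies of weakly interacting systems, it is not suitable for computing the energies of covalent or metallic clusters. This paper proposes an interaction many-body expansion (IMBE) method suited to computing the energies of covalent systems. In IMBE, the energy of a system is expressed as the sum of the energies of the isolated atoms and the interactions between each atom and the atoms surrounding it. The method is first applied to the energies of nitrogen clusters, with the expansion truncated at the four-body term. The results show that IMBE significantly reduces the energy error compared with the conventional many-body expansion. In addition, taking density functional theory results as the reference, the error of the IMBE energy estimate does not depend on the size or structure of the system, which suggests that IMBE is well suited to estimating the energies of large covalently bonded systems.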
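
For orientation, the conventional many-body expansion that this abstract compares against can be written as below, truncated at the four-body term. This is the textbook form only; the abstract does not give the IMBE working equations, so they are not reproduced here. The symbols $E_i$, $E_{ij}$, $E_{ijk}$ (energies of an isolated unit, a pair, a triple) are notation assumed for illustration.

```latex
E \approx \sum_{i} E_i
  + \sum_{i<j} \Delta E_{ij}
  + \sum_{i<j<k} \Delta E_{ijk}
  + \sum_{i<j<k<l} \Delta E_{ijkl},
\qquad
\Delta E_{ij} = E_{ij} - E_i - E_j,
\qquad
\Delta E_{ijk} = E_{ijk} - \Delta E_{ij} - \Delta E_{ik} - \Delta E_{jk} - E_i - E_j - E_k .
```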

2.

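Position encoding selection for a Transformer-based Mandarin speech recognition model

Xu Dongdong 《Journal of Applied Acoustics》2021,40(2):194-199

Transformer networks with self-attention have steadily gained attention in speech recognition research. Focusing on how position information embeddings are combined with speech features, this paper investigates position encoding methods better suited to Mandarin speech recognition models. The experiments show that replacing sinusoidal position encoding with a convolutional encoding of the input representation better fuses the contextual relations of the speech features with relative position information and yields better recognition results. The trained speech recognition system is in Tr...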
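
The sinusoidal position encoding that this abstract uses as its baseline is usually defined as in the original Transformer formulation. The sketch below builds that table in plain Python. The function name and the base of 10000 follow common convention and are assumptions here, since the abstract does not state the configuration. The convolutional encoding the paper prefers is not shown because the abstract gives no details of it.

```python
import math


def sinusoidal_position_encoding(num_positions: int, d_model: int) -> list[list[float]]:
    """Return a num_positions x d_model table of sinusoidal position encodings."""
    table = []
    for pos in range(num_positions):
        row = []
        for k in range(d_model):
            # Dimensions 2i and 2i+1 share one frequency; even uses sin, odd uses cos.
            angle = pos / (10000 ** (2 * (k // 2) / d_model))
            row.append(math.sin(angle) if k % 2 == 0 else math.cos(angle))
        table.append(row)
    return table
```

Each row would normally be added to the feature vector of the frame at that position before the first attention layer.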

3.

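Intelligibility of segment-reversed Chinese speech

Jiang Bin, Kuang Zheng, Wu Ming, Yang Jun 《Acta Acustica》2012,37(6):659-666

The effect of frame length on the intelligibility of segment-reversed Chinese speech was studied experimentally. The results show that intelligibility is high for frame lengths below 64 ms, decreases gradually as the frame length increases from 64 to 203 ms, and is 0 for frame lengths above 203 ms. At a frame length of 8 ms, distortion of the Chinese tones reduces intelligibility. Analysis of the modulation spectra of the original and segment-reversed speech shows that the amount of modulation-spectrum distortion is closely related to intelligibility. The normalized correlation between the narrow-band envelopes of the original and segment-reversed speech can therefore be used to measure the modulation-spectrum distortion, and the objective values computed with the speech-based speech transmission index method correlate significantly with the experimental results (r=0.876, p<0.01). The study indicates that speech intelligibility is related to the narrow-band envelopes, and that the intelligibility of segment-reversed speech is closely related to how well the narrow-band envelopes of the original speech are preserved.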
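
A minimal sketch of how a segment-reversed stimulus of this kind could be produced: the signal is cut into consecutive frames of fixed length and each frame is reversed in time. The function name, the handling of the last partial frame, and the absence of any windowing or cross-fading are assumptions; the abstract does not describe the stimulus generation in that detail.

```python
def reverse_segments(samples: list[float], sample_rate_hz: int, frame_ms: float) -> list[float]:
    """Reverse the signal in time within consecutive frames of frame_ms milliseconds."""
    frame_len = max(1, round(sample_rate_hz * frame_ms / 1000))
    out: list[float] = []
    for start in range(0, len(samples), frame_len):
        # The final frame may be shorter than frame_len; it is reversed as-is.
        out.extend(reversed(samples[start:start + frame_len]))
    return out
```

Because the frame boundaries are fixed, applying the function twice with the same parameters returns the original signal.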

4.

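A Volterra prediction model for speech signal series

Zhang Yumei, Hu Xiaojun, Wu Xiaojun, Bai Shulin, Lu Gang 《Acta Physica Sinica》2015,64(20):200507-200507

Given English phonemes, words, and sentences were recorded and preprocessed. The mutual information method and Cao's method were used to determine the delay time and the embedding dimension, respectively, of the recorded speech signal series in order to reconstruct the phase space of the series. The largest Lyapunov exponent of the recorded speech series was computed to test for chaotic behavior, and the series was found to be chaotic. A Volterra series was introduced to build a nonlinear prediction model for speech signals with an explicit structure. To overcome the inherent shortcomings of the least mean square algorithm when updating Volterra model coefficients, a second-order Volterra model based on the Davidon-Fletcher-Powell algorithm (DFPSOVF) was constructed on the basis of the least squares method, using a variable convergence factor technique based on an a posteriori error assumption, and was applied to predicting chaotic speech signal series. Simulation results show that the DFPSOVF nonlinear prediction model achieves better prediction accuracy than a linear prediction model for both single-frame and multi-frame speech signals, reflects the trends and regularities of the speech series well, and meets the requirements of speech prediction. The memory length of the prediction model can be chosen according to the embedding dimension of the speech signal series. The proposed model may open a new route to speech signal reconstruction and compression coding, improving the complexity and performance of speech signal processing methods.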
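
Two building blocks named in this abstract can be sketched briefly: phase-space reconstruction by delay embedding, and the output of a second-order Volterra filter for one input window. Both functions are illustrative assumptions about form only. The names are hypothetical, and the DFP-based coefficient update with variable convergence factor is not implemented because the abstract does not specify it.

```python
def delay_embed(series: list[float], dim: int, tau: int) -> list[list[float]]:
    """Build delay vectors [x(i), x(i+tau), ..., x(i+(dim-1)*tau)]."""
    n_vectors = len(series) - (dim - 1) * tau
    if n_vectors <= 0:
        raise ValueError("series too short for this dim and tau")
    return [[series[i + j * tau] for j in range(dim)] for i in range(n_vectors)]


def volterra2_output(window: list[float], h0: float,
                     h1: list[float], h2: list[list[float]]) -> float:
    """Second-order Volterra output: constant + linear kernel + quadratic kernel."""
    m = len(window)
    linear = sum(h1[i] * window[i] for i in range(m))
    quadratic = sum(h2[i][j] * window[i] * window[j]
                    for i in range(m) for j in range(m))
    return h0 + linear + quadratic
```

In a predictor of this type, `window` would hold the most recent samples, with its length (the memory length) chosen from the embedding dimension as the abstract suggests.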

5.

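A speech enhancement algorithm based on improved orthogonal matching pursuit

Wu Zhengping, Ma Jianfen, Zhang Zhaoxia, Yang Dongdong 《Journal of Applied Acoustics》2018,37(6):934-939

To improve the speech enhancement performance and computational speed of the conventional orthogonal matching pursuit (OMP) algorithm, this study proposes a speech enhancement algorithm based on an improved OMP algorithm, grounded in sparse coding theory. First, the K-singular value decomposition (K-SVD) algorithm is combined with OMP, and an energy threshold is set to improve the enhancement performance of OMP. Second, the computation of the sparse signal approximation in conventional OMP is modified to increase speed. The improved algorithm is compared with the conventional K-SVD speech enhancement algorithm, using PESQ to assess the quality of the enhanced speech and NCM to assess its intelligibility. With NCM essentially unchanged, PESQ improves by about 12.47% on average, indicating better enhancement. The improved OMP algorithm runs nearly twice as fast as conventional OMP.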
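
As background for this abstract, here is a minimal sketch of conventional OMP in plain Python, not the paper's improved variant. The residual-energy tolerance `tol` stands in for an energy-based stopping rule and is an assumption, as are the function names. Atoms are assumed to be unit-norm and the selected atoms linearly independent.

```python
def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def _solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def omp(atoms, signal, max_atoms, tol=1e-10):
    """Greedy sparse coding: returns {atom index: coefficient}."""
    residual = list(signal)
    support, coeffs = [], []
    for _ in range(min(max_atoms, len(atoms))):
        if _dot(residual, residual) <= tol:
            break  # residual energy is below the threshold
        candidates = [k for k in range(len(atoms)) if k not in support]
        best = max(candidates, key=lambda k: abs(_dot(atoms[k], residual)))
        if abs(_dot(atoms[best], residual)) < 1e-12:
            break  # no remaining atom correlates with the residual
        support.append(best)
        # Least-squares refit on all selected atoms via the normal equations.
        gram = [[_dot(atoms[i], atoms[j]) for j in support] for i in support]
        rhs = [_dot(atoms[i], signal) for i in support]
        coeffs = _solve_linear(gram, rhs)
        residual = [s - sum(c * atoms[k][t] for c, k in zip(coeffs, support))
                    for t, s in enumerate(signal)]
    return dict(zip(support, coeffs))
```

The refit over the whole support at every iteration is what distinguishes OMP from plain matching pursuit, and it is also the step whose cost a faster variant would most plausibly target.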

6.

Effects of degradation of intensity, time, or frequency content on speech intelligibility for normal-hearing and hearing-impaired listeners

van Schijndel NH Houtgast T Festen JM 《The Journal of the Acoustical Society of America》2001,110(1):529-542

Many hearing-impaired listeners suffer from distorted auditory processing capabilities. This study examines which aspects of auditory coding (i.e., intensity, time, or frequency) are distorted and how this affects speech perception. The distortion-sensitivity model is used: The effect of distorted auditory coding of a speech signal is simulated by an artificial distortion, and the sensitivity of speech intelligibility to this artificial distortion is compared for normal-hearing and hearing-impaired listeners. Stimuli (speech plus noise) are wavelet coded using a complex sinusoidal carrier with a Gaussian envelope (1/4 octave bandwidth). Intensity information is distorted by multiplying the modulus of each wavelet coefficient by a random factor. Temporal and spectral information are distorted by randomly shifting the wavelet positions along the temporal or spectral axis, respectively. Measured were (1) detection thresholds for each type of distortion, and (2) speech-reception thresholds for various degrees of distortion. For spectral distortion, hearing-impaired listeners showed increased detection thresholds and were also less sensitive to the distortion with respect to speech perception. For intensity and temporal distortion, this was not observed. Results indicate that a distorted coding of spectral information may be an important factor underlying reduced speech intelligibility for the hearing impaired.

7.

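A study of fixed-position pulse linear predictive coding

Ma Zhen, Wu Dianhong 《Journal of Applied Acoustics》2016,35(2):137-143

Building on multi-pulse linear predictive coding, this paper proposes a position-independent pulse search algorithm. The algorithm does not search for pulse positions; instead, it solves for the pulse amplitude vector in a single step from the given pulse positions. This guarantees that the resulting pulse combination is optimal in the least squares sense and provides a theoretical basis for improving synthesized speech quality. Based on the theory that the excitation pulses are independent of position, a fixed-position pulse linear predictive coding method is then proposed. The algorithms were simulated in MATLAB; the simulations show that the position-independent pulse search algorithm yields better synthesized speech quality than the sequential method and also requires less encoding time. The fixed-position pulse linear predictive coding method can produce synthesized speech comparable to G.729 at a coding rate of 2.7 kbps.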
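
The idea of solving all pulse amplitudes at once for given positions can be illustrated as an ordinary least-squares problem. In this sketch each pulse contributes a shifted copy of the synthesis filter's impulse response, and the amplitudes minimize the squared error against the target frame. The function names, the omission of perceptual weighting, and the truncation at the frame boundary are assumptions; the abstract does not give the paper's exact formulation.

```python
def _solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def pulse_amplitudes(target, impulse_response, positions):
    """Jointly solve the amplitudes of pulses at fixed positions (least squares)."""
    n = len(target)
    columns = []
    for p in positions:
        # Column = impulse response shifted to position p, cut at the frame end.
        col = [0.0] * n
        for t, h in enumerate(impulse_response):
            if p + t < n:
                col[p + t] = h
        columns.append(col)
    gram = [[sum(x * y for x, y in zip(ci, cj)) for cj in columns] for ci in columns]
    rhs = [sum(x * y for x, y in zip(c, target)) for c in columns]
    return _solve_linear(gram, rhs)
```

A sequential search, by contrast, fixes each amplitude before choosing the next pulse, so earlier amplitudes are not revised when later pulses overlap them.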

8.

Nonlinear dynamics of the voice: Signal analysis and biomechanical modeling

Herzel H Berry D Titze I Steinecke I 《Chaos (Woodbury, N.Y.)》1995,5(1):30-34

Irregularities in voiced speech are often observed as a consequence of vocal fold lesions, paralyses, and other pathological conditions. Many of these instabilities are related to the intrinsic nonlinearities in the vibrations of the vocal folds. In this paper, bifurcations in voice signals are analyzed using narrow-band spectrograms. We study sustained phonation of patients with laryngeal paralysis and data from an excised larynx experiment. These spectrograms are compared with computer simulations of an asymmetric 2-mass model of the vocal folds. (c) 1995 American Institute of Physics.

9.

An objective quality assessment method for bit-reduction coding of wideband speech.

S Hayashi N Kitawaki 《The Journal of the Acoustical Society of America》1992,92(1):106-113

This paper proposes a new objective quality assessment method for bit-reduction coding of wideband speech taking into account the masking effect of quantizing noise. First, this paper analyzes the reliability and sensitivity of the speech quality assessment method, based on a paired-comparison test with a modulated noise reference signal, for the bit-reduction coding of high-quality wideband speech. Then, the perception of quantizing noise is studied using speech with noise synthesized similar to the quantizing noise. The detection of quantizing noise is found to be influenced by masking by the source signal. This leads to a new method of objectively estimating the quality of coding speech by multiple regression analysis. The factors for the estimation are segmental signal-to-noise ratio, spectrum envelope distance between source signal and quantizing noise, and the similarity of the noise power envelope to the source signal in the time domain. This estimation method is applied to the parameter optimization of wideband coding systems.
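
Of the three regression factors listed in this abstract, segmental signal-to-noise ratio is the simplest to illustrate. The sketch below averages per-frame SNR in dB over non-overlapping frames. The function name, the frame handling, and the choice to skip frames with zero signal or zero noise energy are assumptions, not details from the paper.

```python
import math


def segmental_snr_db(reference: list[float], degraded: list[float], frame_len: int) -> float:
    """Mean per-frame SNR in dB between a reference signal and its coded version."""
    frame_snrs = []
    n = min(len(reference), len(degraded))
    for start in range(0, n - frame_len + 1, frame_len):
        ref = reference[start:start + frame_len]
        deg = degraded[start:start + frame_len]
        signal_energy = sum(x * x for x in ref)
        noise_energy = sum((x - y) ** 2 for x, y in zip(ref, deg))
        if signal_energy == 0 or noise_energy == 0:
            continue  # the ratio is undefined for this frame
        frame_snrs.append(10 * math.log10(signal_energy / noise_energy))
    if not frame_snrs:
        raise ValueError("no frame with both signal and noise energy")
    return sum(frame_snrs) / len(frame_snrs)
```

Averaging in the dB domain gives quiet frames the same weight as loud ones, which is the usual reason for preferring the segmental measure over a single global SNR.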

10.

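Compressive sampling recovery method for narrow-band hyperspectral interference imaging

Meng Xin, Li Jianxin, Zhu Rihong, Zhou Wei, Cheng Jingjing 《Acta Optica Sinica》2013,33(1):130001-273

Narrow-band hyperspectral imaging of targets with an interference imaging spectrometer offers high optical throughput, high spectral resolution, and high target resolution. Sampling the narrow-band spectral interference information according to the Nyquist theorem produces substantial data redundancy, which increases the amount of data to be processed in the subsequent Fourier transform and lowers the efficiency of spectral recovery. Based on an analysis of the Fourier transform properties of narrow-band spectra, a compressive sampling method for narrow-band spectra is proposed that relies on the spectral transmittance function of the filter. By introducing a filter parameter and an aliasing parameter, narrow-band spectral information can be recovered at different accuracies. With a suitable multi-bandpass narrow-band filter, compressive sampling of the target yields narrow-band spectral information in several bands at once, which avoids band-by-band detection and improves detection efficiency. The method was analyzed in simulation and verified experimentally, and the recovered narrow-band spectra agree with the target spectra.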

11.

Effects of temporal and spectral factors of maskers on speech intelligibility

Yoshifumi Hara Mikio Tohyama Kazunori Miyoshi 《Applied Acoustics》2012,73(9):893-899

This study demonstrates a new possibility of estimating intelligibility of speech in informational maskers. The temporal and spectral properties of sound maskers are investigated to achieve acoustic privacy in public spaces. Speech intelligibility (SI) tests were conducted using Japanese sentences in daily use for energy (white noise) or informational (reversed speech) maskers. We found that the masking effects including informational masking on SI might not be estimated by analyzing the narrow-band temporal envelopes, which is a common way of predicting SI under noisy conditions. The masking effects might instead be visualized by spectral auto-correlation analysis on a frame-by-frame basis, for the series of dominant-spectral peaks of the masked target in the frequency domain. Consequently, we found that dissimilarity in frame-based spectral-auto-correlation sequences between the original and masked targets was the key to evaluating maskers including informational masking effects on SI.

12.

Speech dereverberation method based on spectral subtraction and spectral line enhancement

Zhe Chen Rui Wang Fuliang Yin Bingqian Wang Wenwen Peng 《Applied Acoustics》2016

Speech signals recorded with a distant microphone are usually corrupted by spatial reverberation in the room, which severely degrades the clarity and intelligibility of speech. A speech dereverberation method based on spectral subtraction and spectral line enhancement is proposed in this paper. Following the generalized statistical reverberation model, the power spectrum of late reverberation is estimated and removed from the reverberant speech by the spectral subtraction method. Then, according to the human auditory model, a spectral line enhancement technique based on adaptive post-filtering is adopted to further eliminate the reverberant components between adjacent speech formants. The proposed method can effectively suppress the spatial reverberation and improve the auditory perception of speech. The subjective and objective evaluation results reveal that the perceptual quality of speech is greatly improved by the proposed method.
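
The subtraction step in a method of this kind can be sketched per frequency bin as follows. The late-reverberation power estimate is taken as given, because the abstract does not detail the statistical model used to obtain it. The over-subtraction factor and spectral floor are common safeguards in spectral subtraction and are assumptions here, as are the parameter names.

```python
def spectral_subtract(power: list[float], interference_power: list[float],
                      over_subtraction: float = 1.0, floor: float = 0.01) -> list[float]:
    """Subtract an interference power estimate bin by bin, with a spectral floor."""
    return [
        # The floor keeps a small fraction of the input power so that no bin
        # goes negative or drops to exactly zero.
        max(p - over_subtraction * q, floor * p)
        for p, q in zip(power, interference_power)
    ]
```

The enhanced magnitude would then be the square root of each result, recombined with the phase of the reverberant frame before inverse transformation.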

13.

Predicted effects of sensorineural hearing loss on across-fiber envelope coding in the auditory nerve

Swaminathan J Heinz MG 《The Journal of the Acoustical Society of America》2011,129(6):4001-4013

Cross-channel envelope correlations are hypothesized to influence speech intelligibility, particularly in adverse conditions. Acoustic analyses suggest speech envelope correlations differ for syllabic and phonemic ranges of modulation frequency. The influence of cochlear filtering was examined here by predicting cross-channel envelope correlations in different speech modulation ranges for normal and impaired auditory-nerve (AN) responses. Neural cross-correlation coefficients quantified across-fiber envelope coding in syllabic (0-5 Hz), phonemic (5-64 Hz), and periodicity (64-300 Hz) modulation ranges. Spike trains were generated from a physiologically based AN model. Correlations were also computed using the model with selective hair-cell damage. Neural predictions revealed that envelope cross-correlation decreased with increased characteristic-frequency separation for all modulation ranges (with greater syllabic-envelope correlation than phonemic or periodicity). Syllabic envelope was highly correlated across many spectral channels, whereas phonemic and periodicity envelopes were correlated mainly between adjacent channels. Outer-hair-cell impairment increased the degree of cross-channel correlation for phonemic and periodicity ranges for speech in quiet and in noise, thereby reducing the number of independent neural information channels for envelope coding. In contrast, outer-hair-cell impairment was predicted to decrease cross-channel correlation for syllabic envelopes in noise, which may partially account for the reduced ability of hearing-impaired listeners to segregate speech in complex backgrounds.

14.

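Establishment and application of a combined narrow-band thin-layer chromatography and surface-enhanced Raman spectroscopy method

Zhang Binbin, Shi Yi, Zhu Qingxia, Lu Feng 《The Journal of Light Scattering》2017,29(2):129-132

A narrow-band thin-layer chromatography and surface-enhanced Raman spectroscopy (narrow-band TLC-SERS) method was established to address two shortcomings of conventional TLC-SERS, namely lateral spreading of spots during chromatographic development and secondary spreading of spots caused by dropping on the SERS substrate, and thereby to further improve the detection sensitivity of TLC-SERS. Chromatographic silica gel GF254 was used to prepare and optimize narrow-band TLC plates 2 mm wide. After the sample was separated on the narrow-band TLC plate, silver colloid was sprayed onto the separated spots, which were then measured by SERS. The results show that the narrow-band TLC-SERS method is simple, rapid, and sensitive; it reduces the lateral spot spreading of conventional TLC-SERS and also lowers stationary-phase consumption and cost, giving it broad application prospects.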

15.

Spectral shape discrimination of narrow-band sounds.

D M Green B G Berg H Dai D A Eddins Z Onsan Q Nguyen 《The Journal of the Acoustical Society of America》1992,92(5):2586-2597

Measurements are reported on the detectability of signals added to narrow-band sounds. The narrow-band sounds had a bandwidth of 20 Hz and were either Gaussian noise with flat amplitude spectra or sets of equal-amplitude sinusoidal components whose phases were chosen at random. Four different kinds of sinusoidal signals were used. Two signals produced symmetric changes in the audio spectrum adding a component either at the center of the spectrum or at both ends. The other two signals produced asymmetric changes adding a component at either end of the spectrum. The overall level of the sound was randomly varied on each presentation, so that the presence of a signal was largely unrelated to the absolute level of the signal component(s). A model is proposed that assumes the detection of the symmetric signals is based on changes in the shape of the power spectrum of the envelope. Such changes in the envelope power spectrum are probably heard as changes in the "roughness" or "smoothness" of the narrow-band sound. The predictions of this model were obtained from computer simulations. For the asymmetric signals, the most probable detection cues were changes in the pitch of the narrow-band sound. Results from a variety of different experiments using three listeners support these conjectures.

16.

The role of perceived spatial separation in the unmasking of speech (total citations: 12; self-citations: 0; citations by others: 12)

Freyman RL Helfer KS McCall DD Clifton RK 《The Journal of the Acoustical Society of America》1999,106(6):3578-3588

Spatial separation of speech and noise in an anechoic space creates a release from masking that often improves speech intelligibility. However, the masking release is severely reduced in reverberant spaces. This study investigated whether the distinct and separate localization of speech and interference provides any perceptual advantage that, due to the precedence effect, is not degraded by reflections. Listeners' identification of nonsense sentences spoken by a female talker was measured in the presence of either speech-spectrum noise or other sentences spoken by a second female talker. Target and interference stimuli were presented in an anechoic chamber from loudspeakers directly in front and 60 degrees to the right in single-source and precedence-effect (lead-lag) conditions. For speech-spectrum noise, the spatial separation advantage for speech recognition (8 dB) was predictable from articulation index computations based on measured release from masking for narrow-band stimuli. The spatial separation advantage was only 1 dB in the lead-lag condition, despite the fact that a large perceptual separation was produced by the precedence effect. For the female talker interference, a much larger advantage occurred, apparently because informational masking was reduced by differences in perceived locations of target and interference.

17.

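A new wavelet packet adaptive-threshold algorithm for speech denoising

Tian Yujing, Zuo Hongwei, Dong Yumin, Wang Chao 《Journal of Applied Acoustics》2011,30(1):72-80

At low input signal-to-noise ratios, speech enhancement tends to lose the weak components of unvoiced speech and thus distort the envelope of the reconstructed signal. To overcome this problem, this paper proposes a new speech enhancement method. Following a model of speech perception, the method uses an incomplete wavelet packet decomposition to approximate the critical bands of hearing and processes voiced and unvoiced speech separately according to subband energy. For the threshold calculation, it proposes a wavelet packet adaptive threshold algorithm that separates voiced from unvoiced speech and is based on subband signal energy. Simulation experiments, objective evaluation, and listening tests show that, at low input signal-to-noise ratios, the algorithm reduces envelope distortion of the reconstructed signal more effectively than conventional algorithms and clearly raises the output signal-to-noise ratio without harming speech clarity or naturalness. Combining the algorithm with power spectral subtraction for a second enhancement stage further improves the quality of the denoised speech.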
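
Threshold-based wavelet denoising of this kind shrinks the coefficients of each subband toward zero. The sketch below shows only the generic soft-thresholding step for one subband, with the threshold passed in as a parameter. How the paper derives its adaptive threshold from subband energy and the voiced/unvoiced decision is not stated in the abstract, so that part is left to the caller. The function name is hypothetical.

```python
def soft_threshold(coefficients: list[float], threshold: float) -> list[float]:
    """Shrink each coefficient toward zero by threshold; zero out the small ones."""
    shrunk = []
    for c in coefficients:
        if abs(c) <= threshold:
            shrunk.append(0.0)  # treated as noise
        elif c > 0:
            shrunk.append(c - threshold)
        else:
            shrunk.append(c + threshold)
    return shrunk
```

A lower threshold for subbands classified as unvoiced would retain more of the weak consonant energy, which is the loss the abstract sets out to avoid.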

18.

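Lie detection in Chinese speech based on sparse decomposition of cepstral parameters

Fan Xiaohe, Zhao Heming, Chen Xueqin, Zhou Yan 《Acta Acustica》2018,43(1):121-128

To improve the accuracy of lie detection in Chinese speech, a method based on sparse decomposition of the signal's cepstral parameters is proposed. First, a wavelet packet filter bank divides the speech signal into multiple frequency bands; the log energies of the subbands are computed and a discrete cosine transform is applied to extract wavelet packet band cepstral coefficients, which are combined with Mel-frequency cepstral coefficients to form the cepstral parameters. Next, using the K-singular value decomposition method, an overcomplete mixed dictionary is trained on the cepstral parameter sets of speech in the lying and non-lying states, and the parameter sets are sparse-coded over this dictionary with the orthogonal matching pursuit algorithm to extract sparse features. Finally, recognition experiments are carried out with several classification models. The results show that sparse decomposition optimizes the features better than conventional parameter dimensionality reduction methods. The best recognition rate of the sparse spectral features recommended in this paper reaches 78.34%, higher than that of the other feature parameters, a marked improvement in lie detection accuracy.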

19.

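An auditory periphery model for narrow-band noise recognition of underwater targets

Lin Zhengqing, Qiu Mengran 《Acta Acustica》2016,41(6):881-890

Auditory periphery model features lose recognition accuracy when used to classify underwater target acoustic signals in an engineering setting. To address this, a method for modifying the Gammatone filter bank of the periphery model is proposed, and the resulting narrow-band noise features markedly improve underwater target recognition. First, the cause of the accuracy loss was analyzed: multichannel data acquisition in acoustic engineering applications narrows the frequency range of the signal, which changes the time-frequency features of the acoustic signal. Second, given that the auditory model uses a Gammatone filter bank to simulate the frequency decomposition of the basilar membrane of the human ear, and that low-frequency information carries important class features of underwater target noise, the original auditory model features were interpolated and the number of channels and center frequencies of the filter bank were adapted, yielding a 27-dimensional feature of the target noise in the narrower band; the modified model reflects the time-frequency characteristics of the target in finer detail. Finally, experiments were run with a neural network classifier. The results show that the modified auditory model retains the main information of the original wider-band features and further improves classification of real targets, raising the recognition rate from 82.59% to 88.80%. The paper proposes optimizing the design of the Gammatone filter bank of the auditory periphery model according to the effective receiving band of the engineering platform and analyzes element-level multichannel data, with an emphasis on engineering application. This addresses the problem that, in multichannel data acquisition, band narrowing reduces the feature information in the signal and hence degrades acoustic feature recognition; the modified auditory model features effectively improve the recognition of noise radiated by underwater targets. The work may serve as a reference for researchers working on passive sonar target recognition, active sonar target recognition, and time-frequency analysis of bandwidth-limited multichannel acoustic data acquisition.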
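
Adapting a Gammatone filter bank to a narrower band largely comes down to choosing the channel count and center frequencies for that band. The sketch below spaces center frequencies uniformly on the ERB-rate scale of Glasberg and Moore, a common choice for Gammatone banks. Whether the paper uses this spacing is not stated in the abstract, and the band edges in the usage note are hypothetical; only the channel count of 27 comes from the abstract.

```python
import math


def hz_to_erb_rate(f_hz: float) -> float:
    """Glasberg-Moore ERB-rate (number of ERBs below f_hz)."""
    return 21.4 * math.log10(4.37 * f_hz / 1000 + 1)


def erb_rate_to_hz(erb_rate: float) -> float:
    """Inverse of hz_to_erb_rate."""
    return (10 ** (erb_rate / 21.4) - 1) * 1000 / 4.37


def gammatone_center_frequencies(f_low_hz: float, f_high_hz: float,
                                 n_channels: int) -> list[float]:
    """Center frequencies equally spaced on the ERB-rate scale, edges included."""
    if n_channels < 1:
        raise ValueError("n_channels must be at least 1")
    lo, hi = hz_to_erb_rate(f_low_hz), hz_to_erb_rate(f_high_hz)
    if n_channels == 1:
        return [erb_rate_to_hz((lo + hi) / 2)]
    step = (hi - lo) / (n_channels - 1)
    return [erb_rate_to_hz(lo + k * step) for k in range(n_channels)]
```

For example, `gammatone_center_frequencies(100.0, 2000.0, 27)` would place 27 channels in an illustrative 100 Hz to 2 kHz band, packed more densely at low frequencies.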

20.

Steady-spectrum contexts and perceptual compensation for reverberation in speech identification

Watkins AJ Makin SJ 《The Journal of the Acoustical Society of America》2007,121(1):257-266

Perceptual compensation for reverberation was measured by embedding test words in contexts that were either spoken phrases or processed versions of this speech. The processing gave steady-spectrum contexts with no changes in the shape of the short-term spectral envelope over time, but with fluctuations in the temporal envelope. Test words were from a continuum between "sir" and "stir." When the amount of reverberation in test words was increased, to a level above the amount in the context, they sounded more like "sir." However, when the amount of reverberation in the context was also increased, to the level present in the test word, there was perceptual compensation in some conditions so that test words sounded more like "stir" again. Experiments here found compensation with speech contexts and with some steady-spectrum contexts, indicating that fluctuations in the context's temporal envelope can be sufficient for compensation. Other results suggest that the effectiveness of speech contexts is partly due to the narrow-band "frequency-channels" of the auditory periphery, where temporal-envelope fluctuations can be more pronounced than they are in the sound's broadband temporal envelope. Further results indicate that for compensation to influence speech, the context needs to be in a broad range of frequency channels.