Similar literature
1.
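A speech enhancement method based on auditory simulation   Total citations: 2, self-citations: 0, citations by others: 2
This paper analyzes how the auditory system extracts signals, introduces the concept of dynamic multiple thresholds suited to signal extraction, and builds a speech enhancement method on it. Experimental results show that the auditory-simulation method enhances speech better than traditional speech enhancement methods.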

2.
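Feature extraction is a key step in classifying and recognizing underwater passive sonar targets. A tensor feature extraction method in the auditory domain is proposed, based on the auditory Patterson-Holdsworth cochlear model. The filter impulse responses of the cochlear model are treated as basis functions for signal decomposition. The center frequency of each channel is set on either the nonlinear scale of the auditory model or a conventional linear scale, the gain and bandwidth of each channel are then computed, and the order and phase parameters of the impulse response are quantized to obtain the decomposition basis. Signal decomposition then yields a third-order tensor feature of size channels × orders × phases, and passive sonar targets are classified by computing the similarity between the tensor features of test samples and those of training samples. Classification experiments on passive sonar targets recorded at sea show that the extracted tensor features classify well, and that setting center frequencies on the equivalent rectangular bandwidth (ERB) scale of the auditory model outperforms the linear scale, which can improve the target indication capability of passive sonar.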
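The two ways of placing channel center frequencies mentioned in this abstract can be sketched as follows. The abstract does not give the formulas, so this sketch assumes the widely used Glasberg-Moore ERB expressions; the function names and the frequency range in the usage note are hypothetical, and the paper itself may use different constants.

```python
import math


def hz_to_erb_rate(f_hz):
    """Map a frequency in Hz to the ERB-rate scale (Glasberg-Moore form, assumed)."""
    return 21.4 * math.log10(1.0 + 0.00437 * f_hz)


def erb_rate_to_hz(erb_rate):
    """Inverse of hz_to_erb_rate."""
    return (10.0 ** (erb_rate / 21.4) - 1.0) / 0.00437


def erb_bandwidth(f_hz):
    """Equivalent rectangular bandwidth in Hz at center frequency f_hz."""
    return 24.7 * (1.0 + 0.00437 * f_hz)


def center_frequencies(f_low, f_high, n_channels, scale="erb"):
    """Return n_channels center frequencies between f_low and f_high.

    scale="erb" spaces them evenly on the ERB-rate scale (nonlinear in Hz);
    scale="linear" spaces them evenly in Hz.
    """
    if n_channels < 2:
        raise ValueError("need at least two channels")
    if scale == "erb":
        lo, hi = hz_to_erb_rate(f_low), hz_to_erb_rate(f_high)
        step = (hi - lo) / (n_channels - 1)
        return [erb_rate_to_hz(lo + i * step) for i in range(n_channels)]
    if scale == "linear":
        step = (f_high - f_low) / (n_channels - 1)
        return [f_low + i * step for i in range(n_channels)]
    raise ValueError("scale must be 'erb' or 'linear'")
```

Calling `center_frequencies(100.0, 4000.0, 8)` gives channels that crowd together at low frequencies and spread out at high frequencies, whereas `scale="linear"` gives a constant spacing in Hz.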

3.
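A speech distortion measure based on auditory characteristics   Total citations: 3, self-citations: 0, citations by others: 3
A speech distortion measure based on auditory characteristics is proposed: the Perceptual Spectrum Distortion (PSD) measure. It models human hearing to convert the short-time speech spectrum into a perceptual spectrum that matches auditory characteristics, then measures speech distortion on that perceptual spectrum. Simulation experiments on speech of varying quality, and comparison experiments against the Itakura measure, show that the PSD measure agrees well with subjective evaluations of speech quality.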

4.
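An experimental study of the auditory perception of phase in speech   Total citations: 2, self-citations: 0, citations by others: 2
Human hearing is relatively insensitive to phase in speech signals, so phase distortion is often ignored when speech is processed and coded. In practice, once phase distortion reaches a certain level it clearly degrades speech quality, and a high-quality vocoder cannot ignore the phase of the speech spectral components. This paper uses subjective listening tests to study how the phase spectrum of the short-time Fourier transform of speech affects human auditory perception. The tests show that: (1) if the original phase information is discarded entirely, the reconstructed speech is very noisy and sounds highly unnatural; (2) discarding the phase information of either the high-frequency band or the low-frequency band produces a perceptible difference; (3) when the phase quantization step is smaller than π/7, the human auditory system cannot distinguish the reconstructed speech from the original.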
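Finding (3) concerns quantizing the short-time Fourier phase. The abstract does not describe the quantizer, so the sketch below assumes plain uniform rounding of each bin's phase to a multiple of a step while keeping the magnitude; `quantize_phase` is a hypothetical name.

```python
import cmath


def quantize_phase(spectrum, step):
    """Round the phase of each complex STFT bin to a multiple of `step`.

    Magnitudes are left untouched, so only phase information is coarsened.
    Uniform rounding is an assumption; the paper's quantizer is not specified.
    """
    quantized = []
    for z in spectrum:
        magnitude, phase = cmath.polar(z)
        quantized.append(cmath.rect(magnitude, round(phase / step) * step))
    return quantized
```

With this rounding the phase error of any bin is at most half the step, so a step of π/8, for instance, stays below the π/7 threshold reported in the abstract.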

5.
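Initial/final segmentation of whispered speech based on an auditory model   Total citations: 5, self-citations: 0, citations by others: 5
丁慧, 栗学丽, 徐柏龄. 《应用声学》 (Applied Acoustics), 2004, 23(2): 20-25, 44
This paper analyzes the characteristics of whispered speech and, drawing on basic theory and experimental data from physiological acoustics and psychoacoustics, proposes a method that uses an auditory model to segment whispered speech into initials and finals. The auditory perception model suited to this task has four main levels: the cochlea's decomposition of sound by frequency; the nonlinear transformations of the auditory system in the time domain and in the frequency domain; and the lateral inhibition mechanism of the central nervous system. The model reflects how humans perceive low-energy speech in noisy environments, which makes it suitable for whispered speech recognition, and it gave satisfactory results in initial/final segmentation experiments on whispered speech.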

6.
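Initial/final segmentation of Mandarin speech based on auditory event detection   Total citations: 2, self-citations: 0, citations by others: 2
张宝奇, 张连海, 屈丹. 《声学学报》 (Acta Acustica), 2010, 35(6): 701-707
A method for segmenting Mandarin speech into initials and finals based on auditory event detection is proposed. The method first filters the speech with a cochlear filterbank, then detects in each frequency band the auditory events that correspond to abrupt energy changes, and finally fuses the auditory events across different frequency ranges to determine the initial/final boundaries. Experimental results show a segmentation accuracy of 88.9% on clean speech sampled at 8 kHz, and above 82.9% on speech with a signal-to-noise ratio of 10 dB.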
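The detect-then-fuse idea in this abstract can be illustrated with a minimal sketch. It assumes the filterbank output has already been reduced to per-band frame energies, and it uses a fixed dB jump threshold and simple cross-band voting; the threshold values, the voting rule, and the function names are all illustrative assumptions, not the paper's method.

```python
import math


def band_events(energies, jump_db=6.0):
    """Return frame indices where the band energy jumps by at least jump_db.

    `energies` is a list of non-negative frame energies for one frequency band.
    """
    events = []
    for t in range(1, len(energies)):
        previous = 10.0 * math.log10(energies[t - 1] + 1e-12)
        current = 10.0 * math.log10(energies[t] + 1e-12)
        if abs(current - previous) >= jump_db:
            events.append(t)
    return events


def fuse_events(per_band_events, n_frames, min_bands=2):
    """Keep frames flagged as events in at least `min_bands` bands."""
    votes = [0] * n_frames
    for events in per_band_events:
        for t in events:
            votes[t] += 1
    return [t for t, count in enumerate(votes) if count >= min_bands]
```

A frame is reported as a candidate boundary only when several bands agree, which is one simple way to make the decision less sensitive to a spurious energy jump in a single band.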

7.
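Research on the auditory perception of vowels and consonants in speech   Total citations: 1, self-citations: 0, citations by others: 1
This paper reviews research on the auditory perception of vowels and consonants in speech. More than 80 years ago, authoritative experiments based on nonsense syllables showed that consonants matter more to human auditory perception. Owing to the experimenters' academic achievements and authority, this conclusion became common knowledge, until experiments based on natural sentences challenged it nearly 20 years ago and set off a new round of research. The paper gives a fairly systematic account of the relative importance of vowels and consonants to speech perception, the effect of their steady-state information and boundary dynamic information on speech perception, and the potential applications of this research, and closes with a summary and outlook.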

8.
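Pitch feature extraction from speech signals based on mathematical morphological filtering   Total citations: 3, self-citations: 1, citations by others: 3
蒋刚毅, 郑义. 《声学学报》 (Acta Acustica), 1998, 23(6): 522-528
Mathematical morphological filtering is a nonlinear transform that operates on the shape of a signal: it simplifies the signal and removes small components while preserving its basic shape features. Based on morphological filtering, this paper proposes two schemes that extract the pitch period of a speech signal in the time domain and in the frequency domain respectively; the frequency-domain scheme also extracts the spectral envelope of the speech signal while extracting the pitch period. The schemes are simple, intuitive, and computationally efficient. Because morphological filtering operations are parallel and local, the new schemes suit parallel processing and are easy to implement in hardware. Experimental results show that accurate pitch features can be obtained when the morphological filtering parameters and the linear predictive coding parameters are chosen appropriately.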
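The claim that morphological filtering removes small components while keeping the basic shape can be seen in a minimal 1-D sketch. It assumes a flat structuring element of odd width and windows that are clipped at the signal edges; it illustrates the generic opening and closing operators only, not the paper's pitch extraction schemes.

```python
def erode(signal, width):
    """Sliding minimum over a centered window of `width` samples (clipped at edges)."""
    half = width // 2
    n = len(signal)
    return [min(signal[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]


def dilate(signal, width):
    """Sliding maximum over a centered window of `width` samples (clipped at edges)."""
    half = width // 2
    n = len(signal)
    return [max(signal[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]


def opening(signal, width):
    """Erosion followed by dilation: removes peaks narrower than the window."""
    return dilate(erode(signal, width), width)


def closing(signal, width):
    """Dilation followed by erosion: fills valleys narrower than the window."""
    return erode(dilate(signal, width), width)
```

For example, `opening([0, 0, 5, 0, 0, 3, 3, 3, 0, 0], 3)` returns `[0, 0, 0, 0, 0, 3, 3, 3, 0, 0]`: the one-sample spike is removed and the three-sample plateau survives. Each output sample depends only on a local window, which is the locality the abstract cites as the reason the schemes parallelize well.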

9.
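This paper discusses the application of iterative Wiener filtering that incorporates human auditory characteristics to speech separation: a codebook built by vector quantization captures the speech characteristics of the target speaker, and the degree of match between the filtering output and those characteristics is computed to mimic the attention mechanism of the human ear in the "cocktail party effect". Experimental results show that the method works well.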

10.
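This study reports the changes in MHD potential between crack electrodes, measured during the experimental study of MHD pressure drop with simulated cracked or uncracked coatings. The test section has the structure shown in Fig. 1 of Ref. (1). ΔV14 and ΔV24 denote the experimentally measured potential differences between the crack electrodes D1 and D2, respectively, and the reference electrode D4. Only representative curves of inter-electrode potential difference versus Reynolds number Re are selected here, namely the Re-ΔV14 and Re-ΔV24 experimental curves; the results are shown in Figs. 1-3.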

11.
Primary auditory cortex (PAC), located in Heschl's gyrus (HG), is the earliest cortical level at which sounds are processed. Standard theories of speech perception assume that signal components are given representations in PAC, which are then matched to speech templates in auditory association cortex. An alternative holds that speech activates a specialized system in cortex that does not use the primitives of PAC. Functional magnetic resonance imaging revealed different brain activation patterns in listening to speech and nonspeech sounds across different levels of complexity. Sensitivity to speech was observed in association cortex, as expected. Further, activation in HG increased with increasing levels of complexity with added fundamentals for both nonspeech and speech stimuli, but only for nonspeech when separate sources (release bursts/fricative noises or their nonspeech analogs) were added. These results are consistent with the existence of a specialized speech system which bypasses more typical processes at the earliest cortical level.

12.

Background  

The speech signal contains information about both phonological features such as place of articulation and non-phonological features such as speaker identity. These are different aspects of the 'what'-processing stream (speaker vs. speech content), and here we show that they can be further segregated as they may occur in parallel but within different neural substrates. Subjects listened to two different vowels, each spoken by two different speakers. During one block, they were asked to identify a given vowel irrespective of the speaker (phonological categorization), while during the other block the speaker had to be identified irrespective of the vowel (speaker categorization). Auditory evoked fields were recorded using 148-channel magnetoencephalography (MEG), and magnetic source imaging was obtained for 17 subjects.

13.
The existence of auditory cues such as intonation, rhythm, and pausing that facilitate end-of-utterance detection is by now well established. It has been argued repeatedly that speakers may also employ visual cues to indicate that they are at the end of their utterance. This raises at least two questions, which are addressed in the current paper. First, which modalities do speakers use for signalling finality and nonfinality, and second, how sensitive are observers to these signals. Our goal is to investigate the relative contribution of three different conditions to end-of-utterance detection: the two unimodal ones, vision only and audio only, and their bimodal combination. Speaker utterances were collected via a novel semicontrolled production experiment, in which participants provided lists of words in an interview setting. The data thus collected were used in two perception experiments, which systematically compared responses to unimodal (audio only and vision only) and bimodal (audio-visual) stimuli. Experiment I is a reaction time experiment, which revealed that humans are significantly quicker in end-of-utterance detection when confronted with bimodal or audio-only stimuli, than for vision-only stimuli. No significant differences in reaction times were found between the bimodal and audio-only condition, and therefore a second experiment was conducted. Experiment II is a classification experiment, and showed that participants perform significantly better in the bimodal condition than in the two unimodal ones. Both the first and the second experiment revealed interesting differences between speakers in the various conditions, which indicates that some speakers are more expressive in the visual and others in the auditory modality.

14.
Feature extraction is a key step for underwater passive sonar target classification and recognition. A kind of tensor feature extraction method based on auditory Patterson-Holdsworth cochlear model is proposed. First, the filter impulse response of the cochlear model is regarded as the basis function of signal decomposition, and the center frequency of different channels is determined according to the nonlinear scale or conventional linear scale of the auditory model. Then, the gain and bandwidth of th...

15.
A systematic improvement in auditory performance over time, following a change in the acoustic information available to the listener (that cannot be attributed to task, procedural or training effects) is known as auditory acclimatization. However, there is conflicting evidence concerning the existence of auditory acclimatization; some studies show an improvement in performance over time while other studies show no change. In an attempt to resolve this conflict, speech recognition abilities of 16 subjects with bilateral sensorineural hearing impairments were measured over a 12-week period following provision of a monaural hearing instrument for the first time. The not-fitted ear was used as the control. Three presentation levels were used representing quiet, normal, and raised speech. The results confirm the presence of acclimatization. In addition, the results show that acclimatization is evident at the higher presentation levels but not at the lowest.

16.
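In recognizing underwater targets from Inverse Synthetic Aperture Sonar (ISAS) images, the observation angle varies randomly and target structures occlude one another. To address this, an ISAS underwater target recognition method based on a multi-highlight topology vector feature is proposed. An analysis of how scatterer positions project from three-dimensional space onto the two-dimensional imaging plane during ISAS imaging shows that, in a sonar image after cross-range scaling, the distances between strong highlights are determined solely by the physical distances between the target's scattering structures. A topology vector feature built from the mutual distances between strong highlights can therefore describe the target stably across observation angles. K-means clustering is then used to obtain multiple cluster centers, which overcomes the loss of highlights caused by mutual occlusion of target structures. Finally, a nearest-neighbor classifier performs the recognition. Experiments with scaled models in a water tank show that the method achieves a recognition rate of 84.0% for underwater targets.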
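Why inter-highlight distances give an angle-stable description can be shown with a small sketch. The abstract does not define the topology vector precisely, so this sketch assumes it is simply the sorted list of pairwise distances between highlight coordinates, and it classifies with a plain Euclidean nearest-neighbor rule; the K-means step for missing highlights is omitted, and all names are hypothetical.

```python
import math
from itertools import combinations


def topology_vector(points):
    """Sorted pairwise distances between 2-D highlight positions.

    Pairwise distances do not change when the whole point set is rotated
    or shifted, which is the property the abstract relies on.
    """
    return sorted(
        math.hypot(x1 - x2, y1 - y2)
        for (x1, y1), (x2, y2) in combinations(points, 2)
    )


def nearest_neighbor(feature, templates):
    """Return the label whose template vector is closest to `feature`.

    Assumes all vectors have the same length, i.e. the same highlight count.
    """
    def distance(a, b):
        return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

    return min(templates, key=lambda label: distance(feature, templates[label]))
```

Three highlights at `(0, 0)`, `(3, 0)`, `(0, 4)` and the same highlights rotated by 90 degrees both yield the distances 3, 4 and 5, so a template stored for one view still matches the other.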

17.
Submarine warfare continues to pose a threat in present-day military operations. Visual displays play a dominant role for operator detection and classification of underwater and surface targets. However, the visual modality is ineffective for the detection of transient signals. In spite of quieter submarines, transient sounds such as hull popping are difficult to disguise, which makes them more likely to be detected via an auditory display. Operators tend to use auditory displays less often because several factors can impede effective aural processing. In this paper, the sonar problem is reviewed followed by some proposed techniques for making more effective use of the auditory modality for the presentation of sonar signals as a means of further improving operator detection and classification of targets. Some recommendations for augmenting the aural presentation of sonar signals over headphones are then discussed. Key research areas include: (1) a reduction of the sound level of the ambient noise in noisy environments should improve the likelihood that the operator will detect weak signals; (2) the provision to replay sound bites of interest and to compare these against a library of known archetypes should lead to increased accuracy in target classification; (3) the ability to present sonar beams in a three-dimensional auditory display where the spatial position of each sonar beam corresponds to the actual position of the source in the ocean should enable the operator to monitor multiple beams and increase his/her situational awareness. Ultimately, the viability of an auditory display is dependent on operator hearing acuity.

18.
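杨丽梅, 郭立红. 《光学技术》 (Optical Technique), 2007, 33(3): 406-408
For aircraft with a low signal-to-noise ratio against a sky background, a new method for extracting aircraft structural features is proposed that combines the SUSAN algorithm, grey system theory, and mathematical morphology. On the Visual C++ 6.0 platform, the SUSAN algorithm first extracts the structural edge information of the aircraft from the background, and this is added to the original image to enhance the target; grey system theory then detects the structural feature edges of the aircraft; finally, conditional dilation and a reconstruction algorithm suppress the clouds and reconstruct the aircraft target. Experimental results show that the method is of value for aircraft tracking, structural feature extraction, and post-event interpretation, and confirm its feasibility.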

19.
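Using admittance features, which characterize physical properties, as an intermediate quantity, impact-sound features related to the material properties of stiffened plates are extracted. Correlation analysis is first used to obtain an admittance-feature representation of the physical properties of metal stiffened plates and the relationship between admittance features and impact-sound features, which indirectly yields impact-sound features that characterize the physical properties of the sound source. A support vector machine classifier then evaluates how well the different features perform in classifying the material of metal stiffened plates. The results show that the four groups of impact-sound features obtained identify the different materials accurately, that the recognition rate of a single feature depends on how separable the corresponding material property is, and that the ideal impact-sound features achieve a higher average recognition rate than timbre features. Using admittance features to extract material-related impact-sound features is therefore effective, and the proposed features reflect the material properties of the sound source well.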
