Similar articles
 19 similar articles found
1.
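Speech recognition control systems need to record and play back speech. Implementing this function on a microcontroller overcomes the drawbacks of traditional speech recording and playback systems, which require an external speech processing module, are bulky, and are relatively complicated to use. The SPR4096 memory is therefore chosen as the storage device for the digitized speech signal, and a hardware system for speech recording and playback is designed and implemented with a Sunplus 16-bit microcontroller. The results show that the hardware system reduces circuit complexity and production cost, is simple to build, and has high practical value.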

2.
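Compression coding is one of the key technologies of wireless voice communication. The paper introduces the basic concepts and classification of speech coding, selects the AMBE (multi-band excitation) compression coding algorithm, and, using a microcontroller to control a dedicated speech compression DSP chip, proposes a system solution suited to low-rate, real-time wireless voice and data communication. The hardware and software designs were completed, and synchronous over-the-air transmission of voice and data at low rates was achieved. Test results show that speech of good quality is still obtained at speech coding rates of 2.4 kbps and below.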

3.
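The pyroelectric infrared detection wireless remote alarm system consists of a pyroelectric infrared detection module, radio transmitter and receiver modules, digital encoder/decoder integrated circuits, and a voice recording and playback module. When a person enters the monitored area, the infrared sensor first converts the received infrared radiation into an electrical signal, which is amplified by the internal circuitry; the resulting control signal starts the transmitter, and after coded pulse modulation the transmitter module radiates a coded radio remote-control signal. On receiving the signal, the receiver demodulates, amplifies, and shapes it, and the decoder outputs the coded pulses, which trigger the voice recording and playback system to play a pre-recorded warning message that alerts the staff on duty. The system is built around the RDP-18 pyroelectric infrared detection module, which provides all functions from signal reception to control output. The device as a whole is modular, offers stable frequency and reliable operation, requires no tuning, has a remote-control range of 1,000 m, and suits a wide variety of applications.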

4.
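Xiao Dong, Mo Fuyuan, Chen Geng, Guo Shengming, Ma Li. Acta Acustica (《声学学报》), 2013, 38(5): 589-596
In medium- and long-range (>10 km) underwater acoustic speech communication, adverse factors such as the narrow usable bandwidth and the complex, time-varying channel constrain the information rate, so the speech coding rate should be made as low as possible. Exploiting the long propagation delay of the underwater acoustic channel and the perceptual characteristics of human hearing, and following an in-depth study of the mixed excitation linear prediction (MELP) coding standard, a variable-bit-rate speech coding algorithm with an adjustable coding rate is proposed. Its average bit rate is about 600 bps, and its mean PESQ MOS score is about 2.8. The performance of the algorithm was verified by computer simulation and sea trials. Experiments and simulations show that at bit error rates no higher than 10^-3 the algorithm performs well and stably: the synthesized speech is clear and intelligible, and the speaker is easy to identify.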

5.
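Xiao Dong, Mo Fuyuan, Chen Geng, Ma Li. Applied Acoustics (《应用声学》), 2016, 35(1): 77-83
Transition segments play a non-negligible role in speech clarity, intelligibility, and auditory perception. In parametric speech coding, whether frames containing transition segments are handled properly is key to whether the synthesized speech is clear and intelligible. Taking mixed excitation linear prediction coding as the reference, this paper classifies speech frames into four categories (silence, unvoiced, voiced, and transition) and processes each separately. Building on earlier work on low-bit-rate speech coding (1 kbps), it compares the effect of eight transition-frame classification methods on the PESQ MOS of the synthesized speech. The analysis shows that different transition frames contribute differently to PESQ MOS: frames that move from unvoiced speech or silence to voiced speech contribute the most, and the contribution of transition frames between voiced consonants and vowels should not be ignored either.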

6.
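Zhang Yumei, Hu Xiaojun, Wu Xiaojun, Bai Shulin, Lu Gang. Acta Physica Sinica (《物理学报》), 2015, 64(20): 200507-200507
Given English phonemes, words, and sentences were recorded and preprocessed. The mutual information method and Cao's method were used to determine the delay time and the embedding dimension, respectively, of the recorded speech signal series in order to reconstruct its phase space. By computing the largest Lyapunov exponent of the recorded series, the speech signals were identified as chaotic. A Volterra series was introduced to build a nonlinear prediction model of speech signals with an explicit structure. To overcome the inherent shortcomings of the least mean square algorithm in updating the Volterra model coefficients, a second-order Volterra model based on the Davidon-Fletcher-Powell algorithm (DFPSOVF) was constructed on the basis of the least squares method, using a variable convergence factor technique based on an a posteriori error assumption, and was applied to predicting chaotic speech signal series. Simulation results show that the DFPSOVF nonlinear prediction model achieves better prediction accuracy than the linear prediction model for both single-frame and multi-frame speech signals, reflects the trends and regularities of the speech series well, and fully meets the requirements of speech prediction; the memory length of the prediction model can be chosen according to the embedding dimension of the speech signal series. The proposed model may open a new route to speech signal reconstruction and compression coding, improving the complexity and performance of speech signal processing methods.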
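
To make the structure of a second-order Volterra predictor concrete, the sketch below evaluates one prediction from a constant term, a linear kernel, and a quadratic kernel. It is only an illustration: the function name, the argument layout, and the kernel values in the usage note are assumptions, and the DFP-based coefficient update described in the abstract is not implemented here.

```python
def volterra2_predict(history, h0, h1, h2):
    """One-step prediction from a second-order Volterra series.

    history: most recent samples first, history[0] = x(n-1), history[1] = x(n-2), ...
    h0: constant term; h1: linear kernel of length m; h2: m x m quadratic kernel.
    The memory length m is taken from len(h1) (hypothetical layout).
    """
    m = len(h1)
    if len(history) < m:
        raise ValueError("history is shorter than the memory length")
    x = history[:m]
    # Linear part: sum_i h1[i] * x(n-1-i)
    linear = sum(h1[i] * x[i] for i in range(m))
    # Quadratic part: sum_i sum_j h2[i][j] * x(n-1-i) * x(n-1-j)
    quadratic = sum(h2[i][j] * x[i] * x[j] for i in range(m) for j in range(m))
    return h0 + linear + quadratic
```

With a memory length of 2, `volterra2_predict([1.0, 2.0], 0.5, [1.0, -1.0], [[0.0, 0.5], [0.0, 0.0]])` combines a constant of 0.5, a linear term of -1.0, and a quadratic term of 1.0. In line with the abstract, the memory length would be chosen from the embedding dimension of the series.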

7.
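Built around a microcontroller, the system captures the analog speech signal through an AGC circuit, converts it to a digital signal, stores it in the microcontroller, and plays the signal back under program control. The digital speech storage and playback system provides digital control of recording and playback and increases speech storage capacity: the captured speech is compressed with a lossless compression algorithm before storage, which ensures reliable playback quality.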

8.
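Xiao Dong, Mo Fuyuan, Chen Geng, Ma Li. Applied Acoustics (《应用声学》), 2012, 31(2): 109-117
Line spectral frequencies (LSF) are an effective coding representation of linear prediction coefficients (LPC). In the linear prediction model of speech, the LPC describe the vocal tract filter and are among the parameters that most affect the auditory perception of speech. In the mixed excitation linear prediction (MELP) speech coding standard, the LSF are quantized by multi-stage vector quantization with a 4-stage codebook. First, to reduce quantization redundancy and thereby lower the coding rate, this paper proposes an improved selection algorithm and generates a 2-stage codebook to replace it. Second, to improve synthesized speech quality, and based on experimental results on the relationship between LSF vector quantization accuracy and synthesized speech quality, a method for quantizing and evaluating LSF according to the perceptual characteristics of human hearing is proposed and verified experimentally.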
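
The idea of multi-stage vector quantization mentioned above can be sketched as follows: each stage quantizes the residual left by the previous stage, and the decoder sums the selected codewords. This is a simplified greedy search with made-up two-dimensional codebooks, intended only as an illustration; it is not the MELP codebook or the selection algorithm proposed in the paper, and all function names are hypothetical.

```python
def nearest(codebook, vec):
    """Index of the codeword with the smallest squared Euclidean distance."""
    return min(range(len(codebook)),
               key=lambda i: sum((c - v) ** 2 for c, v in zip(codebook[i], vec)))

def msvq_encode(vec, stages):
    """Quantize `vec` stage by stage; each stage codes the previous residual."""
    residual, indices = list(vec), []
    for codebook in stages:
        i = nearest(codebook, residual)
        indices.append(i)
        # Subtract the chosen codeword so the next stage sees what is left.
        residual = [r - c for r, c in zip(residual, codebook[i])]
    return indices

def msvq_decode(indices, stages):
    """Reconstruct the vector by summing the selected codeword of every stage."""
    out = [0.0] * len(stages[0][0])
    for i, codebook in zip(indices, stages):
        out = [o + c for o, c in zip(out, codebook[i])]
    return out
```

With `stages = [[[0.0, 0.0], [1.0, 1.0]], [[0.0, 0.0], [0.25, -0.25]]]`, encoding `[1.2, 0.8]` selects the second codeword in both stages, and decoding those indices yields `[1.25, 0.75]`. Fewer stages mean fewer indices to transmit, which is the kind of rate saving a move from four stages to two aims at.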

9.
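A brief discussion of IP telephony
 IP telephony, short for Internet phone and also known as VoIP (voice over Internet protocol, that is, voice communication based on the IP protocol), is a new mode of communication that uses computers and the Internet to convert and transmit voice information; it is a new integrated application of computer network technology and voice communication technology. 1. The development of IP telephony. As early as the 1970s, some far-sighted scientists proposed integrating the computer and the telephone through hardware and software so that voice and data would merge and be handled on a single terminal, but because the technology was still immature at the time, it did not succeed in experiments or in practical use.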

10.
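Implementation of a laser communication system for simultaneous data and voice transmission (cited 1 time: 1 self-citation, 0 by others)
 To address the susceptibility to interference and the poor confidentiality of traditional communication modes, a wireless laser communication system that transmits voice and data simultaneously was designed. While keeping the speech undistorted, the AMBE speech compression algorithm compresses the speech to a data rate of 4,800 bit/s; the speech is then coded together with the data and drives the laser. The receiver recovers the speech and data from the received laser light through photoelectric conversion and decoding. A communication trial over 2.5 km shows that the system offers long transmission range, clear speech, and stable data, and it is expected to find wide use in future air-to-air, air-to-ground, ground-to-air, and ground-to-ground communication.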

11.
To assist physically disabled people in their movements, we developed an embedded isolated-word automatic speech recognition (ASR) system for voice control of smart wheelchairs. Although several kinds of electric wheelchairs are commercially available, they must still be controlled by hand with a joystick, which limits their use, especially by people with severe disabilities. As a result, a significant number of disabled people cannot use a standard electric wheelchair or can drive one only with difficulty. The proposed solution is to use the voice, instead of a conventional joystick, to control and drive the wheelchair. The intelligent chair is equipped with an obstacle detection system consisting of ultrasonic sensors, a navigation algorithm, and a speech acquisition and recognition module for voice control embedded in a DSP card. The ASR architecture consists of two main modules: the speech parameterization (feature extraction) module, and the classifier, which identifies the speech and generates the control word for the motor power unit. The training and recognition phases are based on hidden Markov models (HMM) and the K-means, Baum-Welch, and Viterbi algorithms. The database consists of 39 isolated spoken words (13 words, each pronounced 3 times under different environments and conditions). The simulations were run in MATLAB, and the real-time implementation was written in C with Code Composer Studio on a TMS320C6416 DSP kit. The experiments gave a promising recognition accuracy of around 99% in a clean environment. However, accuracy decreases considerably in noisy environments, especially for SNR values below 5 dB (in the street: 78%; in a factory: 52%).
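
As an illustration of the Viterbi step named in this abstract, the sketch below finds the most likely state path of a small discrete HMM in the log domain. The function signature and the toy two-state model in the usage note are assumptions for illustration; the paper's feature extraction, training (K-means, Baum-Welch), and DSP implementation are not reproduced.

```python
import math

def viterbi(obs, start_p, trans_p, emit_p):
    """Most likely state path and its log-probability for a discrete HMM.

    obs: list of observation symbol indices.
    start_p[s], trans_p[q][s], emit_p[s][o]: model probabilities.
    """
    n_states = len(start_p)

    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    # Initialisation with the first observation.
    score = [log(start_p[s]) + log(emit_p[s][obs[0]]) for s in range(n_states)]
    back = []
    for o in obs[1:]:
        prev, score, ptr = score, [], []
        for s in range(n_states):
            # Best predecessor state for arriving in state s.
            best = max(range(n_states), key=lambda q: prev[q] + log(trans_p[q][s]))
            ptr.append(best)
            score.append(prev[best] + log(trans_p[best][s]) + log(emit_p[s][o]))
        back.append(ptr)
    # Backtrack from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1], max(score)
```

In an isolated-word recognizer of the kind described, one such model would typically be trained per vocabulary word, and the word whose model gives the highest score for the observed feature sequence would be selected. For the toy model `viterbi([0, 1, 2], [0.6, 0.4], [[0.7, 0.3], [0.4, 0.6]], [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])`, the best path is `[0, 0, 1]`.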

12.
Customarily, speaking and singing have tended to be regarded as two completely separate sets of behaviors in clinical and educational settings. The treatment of speech and voice disorders has focused on the client's speaking ability, as this is perceived to be the main vocal behavior of concern. However, according to a broader voice-science perspective, given that the same vocal structure is used for speaking and singing, it may be possible to include singing in speech and voice therapy. In this article, a theoretical framework is proposed that indicates possible benefits from the inclusion of singing in such therapeutic settings. Based on a literature review, it is demonstrated theoretically why singing activities can potentially be exploited in the treatment of prepubertal children suffering from speech and voice disorders. Based on this theoretical framework, implications for further empirical research and practice are suggested.

13.
SUMMARY: The aim of this study was to investigate how different acoustic parameters, extracted both from speech pressure waveforms and from glottal flows, can be used to measure vocal loading in modern working environments and how these parameters reflect possible changes in vocal function during a working day. In addition, correlations between objective acoustic parameters and subjective voice symptoms were addressed. The subjects were 24 female and 8 male customer-service advisors who mainly use the telephone during their working hours. Speech samples were recorded from continuous speech four times during a working day, and voice symptom questionnaires were completed at the same times. Among the various objective parameters, only F0 showed a statistically significant increase for both genders. No correlations between the changes in objective and subjective parameters appeared. However, the results encourage researchers in the field of occupational voice use to apply versatile measurement techniques in studying occupational voice loading.

14.
BACKGROUND: After total laryngectomy, the interruption of the upper digestive tube and the section of the cricopharyngeal segment alter the high-pressure zone of the pharyngoesophageal transition, which not only retains a digestive function but is also stimulated to take on the production of voice and speech. The pressure observed in the cricopharyngeal segment seems to act as a critical factor for the development of esophageal sound production, and manometry is the procedure capable of quantifying the pressure in this region.
OBJECTIVE: The objective of the current study was to use manometry to assess the upper esophageal sphincter pressure in laryngectomized patients who are either successful or unsuccessful esophageal speakers, both at rest and during esophageal phonation.
METHODS: Twenty laryngectomized persons aged 32 to 83 years (mean, 44.2 years) were evaluated by a speech pathologist and divided into two groups, i.e., successful esophageal speakers (N=12) and unsuccessful esophageal speakers (N=8), according to a scale validated by Wepman et al (1953). The upper esophageal sphincter (UES) pressure was assessed by manometry both at rest and during the following voice emissions in Portuguese: the vowel "a," the monosyllable "pa," and the sentence "papai papou pipoca." The amplitude, the duration of the pressure wave, and the area under the curve were measured.
RESULTS: At rest, the mean UES pressure was 11.83 mm Hg for successful esophageal speakers and 9.92 mm Hg for unsuccessful esophageal speakers, with no significant difference between groups; the mean for the two groups as a whole was 11.06 mm Hg. During the voice and speech sequence tests, no significant difference was observed when the emissions in Portuguese of "a," "pa," and the sentence were analyzed separately.
CONCLUSION: As the pressure observed at rest did not differ between the successful and the unsuccessful esophageal speakers, and the amplitude, the duration of the pressure wave, and the area under the amplitude × duration curve were also equal for both groups, we conclude that the cricopharyngeal segment pressure is not a preponderant factor for the acquisition of esophageal voice and speech.

15.
16.
A voice conversion algorithm that uses compressed sensing to exploit the information shared between consecutive speech frames is proposed in this paper. Exploiting the sparsity, in the discrete cosine transform domain, of the vector formed by concatenating several consecutive line spectrum pairs (LSP), the paper applies compressed sensing to extract a compressed vector from the concatenated LSPs and uses it as the feature vector to train the conversion function. Evaluation results show that, when an appropriate number of speech frames is chosen, this approach improves performance by 3.21% on average over the conventional algorithm based on weighted frequency warping. The experimental results also illustrate that the performance of a voice conversion system can be improved by taking full advantage of inter-frame information, because that information helps the converted speech retain the more stable acoustic properties inherent across frames.

17.
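Most current deep learning methods for voice conversion rely on large amounts of training data to generate high-quality speech. This paper proposes a voice conversion framework based on an average model and an error reduction network that works with a limited amount of training data. First, an average model based on the CBHG network is trained on multi-speaker speech data that excludes the source and target speakers; next, the average model is adapted with the limited amount of target speech data; finally, an error reduction network is proposed to further improve the quality of the converted speech. Experiments show that the proposed framework handles limited training data flexibly and outperforms conventional frameworks in both objective and subjective evaluations.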

18.
There is general agreement that postural alignment is important in optimizing voice function. A number of articles have illuminated the way in which posture, particularly of the cervical spine, is directly related to vocal resonance and pitch control. Despite frequent involvement in muscle training, few speech pathologists have the background in exercise physiology necessary to appreciate the contribution of muscular length-tension relationships to postural alignment. The purpose of this article is to provide voice therapists with information to help them formulate appropriate recommendations for improving postural alignment. This article synthesizes information from the literature regarding the role of muscular length-tension balance in the attainment and maintenance of postural alignment. Important considerations in the assessment of muscle tension and weakness are presented along with advice regarding application to the treatment of voice-disordered patients. Concepts detailed include agonist/antagonist relationships, the biomechanics of stretching, postural assessment, and the relationship between muscle tension and muscle weakness. The role of both stretching and strength-based training is discussed. Specific exercises with emphasis on altering the alignment of the cervical and thoracic spine are presented with suggestions for their use in the clinic. There is growing understanding of the physiology behind recommendations of voice teachers and therapists to maintain optimal alignment. To effectively mediate postural misalignment, clinicians must have knowledge of the length-tension relationships between muscles. This understanding will lead to better interventions for postural alignment.

19.
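Acoustics experiments on a virtual instrument platform (cited 1 time: 1 self-citation, 0 by others)
An acoustic virtual instrument was built from a computer sound card and Adobe Audition software. It provides the functions of an oscilloscope, a signal generator, and a frequency counter. The paper describes its use in the study of dual-tone multi-frequency (DTMF) signals, the measurement of the speed of sound, and the pitch-changing bell experiment.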
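
For readers unfamiliar with DTMF, the sketch below generates the two-tone signal for a keypad key and identifies it again with the Goertzel algorithm. It uses the standard DTMF row and column frequencies; the sample rate, tone duration, and function names are assumptions chosen for illustration, and the code stands in for, rather than reproduces, the sound-card and Adobe Audition workflow of the paper.

```python
import math

# Standard DTMF keypad: each key is the sum of one row (low) and one column (high) tone.
ROW_HZ = [697, 770, 852, 941]
COL_HZ = [1209, 1336, 1477, 1633]
KEYS = ["123A", "456B", "789C", "*0#D"]

def dtmf_tone(key, sample_rate=8000, duration=0.05):
    """Return the samples of the two-tone signal for a single key."""
    for r, row in enumerate(KEYS):
        c = row.find(key)
        if len(key) == 1 and c >= 0:
            f_low, f_high = ROW_HZ[r], COL_HZ[c]
            break
    else:
        raise ValueError(f"not a DTMF key: {key!r}")
    n = int(sample_rate * duration)
    return [0.5 * math.sin(2 * math.pi * f_low * t / sample_rate)
            + 0.5 * math.sin(2 * math.pi * f_high * t / sample_rate)
            for t in range(n)]

def goertzel_power(samples, freq, sample_rate=8000):
    """Signal power at `freq`, computed with the Goertzel recurrence."""
    coeff = 2 * math.cos(2 * math.pi * freq / sample_rate)
    s1 = s2 = 0.0
    for x in samples:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2

def detect_key(samples, sample_rate=8000):
    """Pick the strongest row and column frequency and map the pair to a key."""
    r = max(range(4), key=lambda i: goertzel_power(samples, ROW_HZ[i], sample_rate))
    c = max(range(4), key=lambda i: goertzel_power(samples, COL_HZ[i], sample_rate))
    return KEYS[r][c]
```

For example, `detect_key(dtmf_tone("5"))` recovers the key from the 770 Hz and 1336 Hz components. The Goertzel recurrence is used here because only eight frequencies need to be examined, so a full FFT is unnecessary.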
