Similar literature
1.
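Effects of Chinese sentence stress on pitch and duration   Cited 4 times in total (self-citations: 1; citations by others: 3)
The key to improving the naturalness of synthesized Chinese speech is to build a sound prosodic model of Chinese. Taking continuous broadcast speech as the object of study, this paper makes a preliminary investigation of how sentence stress in Chinese affects prosodic feature parameters, and analyzes the changes in duration and pitch under different sentence-stress conditions and the relationship between them. It finds that: (1) Pitch is the basic means of expressing sentence stress; as the level of sentence stress rises, the pitch distribution curve shifts toward higher frequencies. (2) When words in continuous speech are "stressed" or "unstressed", the duration distribution is bimodal, indicating that the duration of some words is affected by sentence stress while that of others is not. (3) Under the three conditions "normal", "stressed", and "unstressed", pitch and duration are, respectively, uncorrelated, negatively correlated, and positively correlated, which confirms the complementary relationship between pitch and duration in Chinese sentence stress. These results provide a basis for building prosodic models in Chinese speech synthesis systems. On this basis, the paper also uses a neural network to partially label sentence stress in continuous speech; classification accuracy on the open set is 63%. This is an exploration of methods for automatically labeling stress levels in speech data.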

2.
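Stress is an important intonational feature, and stress synthesis can improve the naturalness and expressiveness of speech. To address the local prominence of stress, this paper proposes a representation of acoustic-feature prominence and analyzes the acoustic-feature prominence of stressed syllables in different prosodic positions (initial, medial, and final in the prosodic word; initial, medial, and final in the prosodic phrase; and so on). It finds that stress at the end of a prosodic unit (the final syllable of a prosodic word, or the final prosodic word of a prosodic phrase) has a lower prominence of maximum fundamental frequency than stress elsewhere. It proposes a nonlinear algorithm, based on acoustic-feature prominence, for generating the acoustic parameters of stress, which solves the problem that traditional linear modification algorithms change these parameters too little or too much. Using this algorithm, a hidden Markov model based speech synthesis system that supports stress synthesis was built. Experiments show that the system can effectively synthesize stressed speech and that it improves the naturalness and expressiveness of the synthesized speech.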

4.
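Automatic discrimination of Chinese sentence stress in natural-style speech   Cited 6 times in total (self-citations: 2; citations by others: 6)
Stress is an important parameter for prosody processing in speech synthesis. This paper analyzes how neutral-tone syllables and strongly stressed syllables differ from normally stressed syllables in acoustic parameters including fundamental frequency (F0), syllable duration, intensity, and pause length. It also examines the relationship between duration and F0 parameters, and between the third tone (shangsheng) and F0. Three stress prediction models based on artificial neural networks were built, namely an acoustic model, a linguistic model, and a mixed model, to automatically discriminate Chinese sentence stress (neutral tone, normal stress, and strong stress). The results show that the mixed model outperforms the other two. In addition, to account for the diversity of stress annotation, the paper designs an evaluation method based on support ratio.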

5.
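Using psychostatistical methods to analyze a medium-sized corpus, this work explores the relationships among syntax, prosody, and their acoustic correlates. Based on the regularities of normal stress distribution in spoken Chinese, it studies the rules of normal stress distribution in Putonghua and the order in which they apply in actual utterances, and finally establishes a system of normal stress distribution rules suitable for Chinese text-to-speech systems.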

6.
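Analysis of read speech from Putonghua news discourse shows that when a clause is placed within a discourse segment, the pitch and duration that mark its stress change. Compared with an isolated clause, the pitch change of a clause within a segment is concentrated on the clause nucleus and takes the form of an overall lowering of the high pitch points; the degree of lowering differs for different kinds of stress. Within a segment, clause stresses that are not the segment stress are clearly weakened, showing lowered pitch and shortened syllable duration. In a segment made up of several clauses, the speaker can vary the strength of the individual clause stresses to adjust the prosody of the segment, and thereby control the prosody of the whole discourse and express meaning smoothly. The study of segment stress and clause stress brings experimental phonetics into the teaching of broadcast speech, and also helps with prosody control in Chinese speech synthesis.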

7.
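Based on 30 natural narrative discourses and taking the prosodic word as the unit of analysis, this study examines the roles of pitch and duration in sentence accent in discourse. The main findings are: (1) The relative width of the prosodic word's pitch range plays the dominant role in sentence accent. (2) The roles of pitch and duration in sentence accent are affected interactively by the pitch range width of the clause and the rank of the prosodic word. In normal prosodic words, primary accent is realized by pitch and duration acting together, whereas secondary accent relies mainly on pitch. In strengthened prosodic words, the narrower the clause's pitch range, the more important the role of duration in sentence accent. (3) The correlation between pitch and duration is mainly affected by the strength of the prosodic word: in weakened, normal, and strengthened prosodic words, pitch and duration are in general positively correlated, uncorrelated, and negatively correlated, respectively.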

8.
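Pitch variation patterns of stressed syllables in Chinese sentences   Cited 8 times in total (self-citations: 0; citations by others: 8)
This work studies the pitch cues to the perception of stressed syllables in Chinese and the pitch variation patterns of stressed syllables in sentences. The paper has three parts: a stress perception experiment, a question-answer matching experiment, and a corpus analysis. The stress perception experiment examines the pitch cues to stress perception, chiefly the contributions of the high pitch point and the low pitch point. The pitch variation patterns of stressed syllables are studied in two ways. From the speaker's side, a question-answer matching experiment takes /DAO4/ as the representative syllable: a small number of test sentences are read aloud by several speakers, the position of /DAO4/ in the sentence is varied systematically, and questions are used to elicit /DAO4/ naturally as stressed or unstressed, so that the two cases can be compared. From the listener's side, a corpus analysis annotates a large corpus for stress and pause by means of perception experiments and compares the pitch values of syllables labeled heavy with those labeled light. The stress perception experiment shows that both a shift of the pitch range and a raising of the high pitch point are cues to stress perception, but raising the high pitch point has the more marked effect on the perception of word stress. The two studies of pitch variation show that the pitch of a stressed syllable varies within a Chinese intonation pattern in which the topline and the baseline both decline gradually. Raising of the high pitch point is the main acoustic manifestation of a stressed syllable; the low pitch point is constrained more by the declining baseline, changes less markedly, and need not be raised. In the two-line (topline and baseline) intonation model, the rise and fall of the topline and the contrast between the high pitch points of adjacent syllables indicate how strongly each syllable in the sentence is stressed.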

9.
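张璐 (Zhang Lu), 祖漪清 (Zu Yiqing), 闫润强 (Yan Runqiang). 《声学学报》 (Acta Acustica), 2012, 37(4): 448-456
This work studies how focus, word stress position, and a rising boundary tone at an intonational phrase boundary affect the F0 pattern of the phrase-final word. By analyzing the acoustic features of intonational-phrase-final words in two American English corpora, we found that when the word is in focus, the F0 peak of the stress is higher than the end value of the boundary tone; the boundary tone is fully expressed only after the stress has been realized; and a later position of word stress in the syllable structure compresses the F0 range after the word stress. When the phrase-final word is not in focus, the rising trend of the boundary tone appears from the start and suppresses the F0 prominence of word stress. We conclude that focus can be realized by raising the F0 peak of word stress. The strength with which focus and boundary tone are realized is limited by the position of word stress, and in the extreme case the boundary tone can only be realized at the tail of the last syllable of the intonational phrase. Within a limited stretch of segments, each of these prosodic features has a position where its function is expressed most fully; they compete for expression, one gaining as the other yields.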

10.
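Phonetic labeling and manual segmentation of a Chinese speech database   Cited 2 times in total (self-citations: 0; citations by others: 2)
This paper introduces CAS-SYL, a sub-database of a comprehensive Chinese speech database. The database contains all 1267 tonal syllables of Chinese, produced by 10 speakers; segmentation and phonetic labeling of all the speech data were done manually. Given the initial-final structure of Chinese syllables, phonetic labeling is set at the demisyllable level. The labeling symbols use a machine-readable phonetic alphabet, Chinese SAMPA-X (extended SAM Phonetic Alphabet). The paper also describes the labeling strategy, the principles for locating segment boundaries, and acoustic cues based on the GCI (Glottal Closed Instant) in the speech waveform. It also summarizes the acoustic manifestations of coarticulation between initials and finals. Finally, it analyzes the instability introduced by manual segmentation.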

11.
This study investigates the effects of accent and prosodic boundaries on the production of English vowels (/a,i/) by concurrently examining acoustic vowel formants and articulatory maxima of the tongue, jaw, and lips obtained with EMA (Electromagnetic Articulography). The results demonstrate that prosodic strengthening (due to accent and/or prosodic boundaries) has differential effects depending on the source of prominence (in accented syllables versus at edges of prosodic domains; domain-initially versus domain-finally). The results are interpreted in terms of how the prosodic strengthening is related to phonetic realization of vowel features. For example, when accented, /i/ was fronter in both acoustic and articulatory vowel spaces (enhancing [-back]), accompanied by an increase in both lip and jaw openings (enhancing sonority). By contrast, at edges of prosodic domains (especially domain-finally), /i/ was not necessarily fronter, but higher (enhancing [+high]), accompanied by an increase only in the lip (not jaw) opening. This suggests that the two aspects of prosodic structure (accent versus boundary) are differentiated by distinct phonetic patterns. Further, it implies that prosodic strengthening, though manifested in fine-grained phonetic details, is not simply a low-level phonetic event but a complex linguistic phenomenon, closely linked to the enhancement of phonological features and positional strength that may license phonological contrasts.

12.
Relying on a corpus of thirty narrative discourses, the roles of pitch and duration of prosodic words in sentence accent were studied in discourse context. First, the pitch was normalized. Then, according to pitch range, sentences and prosodic words were each classified into three ranks: strengthened, normal, and weakened. At the same time, sentence accent was classified into two levels, primary and secondary, by perceptual evaluation. The results showed that the pitch range of a prosodic word relative to that of its sentence contributed dominantly to sentence accent. Furthermore, the roles of pitch and duration in sentence accent were affected interactively by the rank of the sentence and of the prosodic word. In normal prosodic words, primary sentence accents were realized by pitch and duration acting together, while secondary sentence accents depended mainly on the variation of pitch. In strengthened prosodic words, the role of duration in sentence accent was more significant when the pitch range of the sentence was more compressed. Finally, it was found that the correlation between pitch and duration was influenced primarily by the strength of prosodic words: in weakened, normal, and strengthened prosodic words, the correlations between pitch and duration were positive, null, and negative, respectively.
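
The kind of measurement summarized in this abstract can be illustrated with a minimal sketch. The abstract does not say which pitch normalization or correlation statistic was used, so both choices here are assumptions: pitch range is expressed in semitones, and the pitch-duration relation is measured with Pearson's r. The function names and the toy values are hypothetical and are not data from the study.

```python
import math


def hz_to_semitones(f0_hz, ref_hz):
    """Express an F0 value in semitones relative to ref_hz (assumed normalization)."""
    if f0_hz <= 0 or ref_hz <= 0:
        raise ValueError("frequencies must be positive")
    return 12.0 * math.log2(f0_hz / ref_hz)


def pitch_range_semitones(f0_max_hz, f0_min_hz):
    """Pitch range of one prosodic word: distance from F0 minimum to F0 maximum."""
    return hz_to_semitones(f0_max_hz, f0_min_hz)


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical toy measurements for one class of prosodic words:
# (F0 maximum in Hz, F0 minimum in Hz, duration in ms). Not real data.
toy_words = [
    (240.0, 120.0, 310.0),
    (220.0, 130.0, 350.0),
    (200.0, 140.0, 390.0),
    (190.0, 150.0, 420.0),
]

ranges = [pitch_range_semitones(hi, lo) for hi, lo, _ in toy_words]
durations = [dur for _, _, dur in toy_words]
r = pearson_r(ranges, durations)
print(f"pitch range vs. duration: r = {r:.2f}")
```

In the toy data the words with a wider pitch range are shorter, so r comes out negative. Computing r separately for weakened, normal, and strengthened prosodic words would be one way to compare the three cases described in the abstract.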

13.
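Prosodic realization and perception of focus in Uyghur   Cited 1 time in total (self-citations: 0; citations by others: 1)
Through tightly controlled speech experiments, this work studies how focus modulates pitch and duration in Uyghur declarative sentences. Two target sentences were designed, and speakers were asked to emphasize the relevant word naturally according to context; the perception of focus was then examined. The results show: (1) With sentence-final focus as the baseline, the prosodic encoding of focus in Uyghur resembles the "tri-zone" adjustment pattern of Beijing Mandarin and English: pitch is raised and pitch range expanded on the focused word, pitch drops sharply after focus (with a narrowed pitch range), and pre-focus pitch changes little. (2) The focused word and the pre-focus words are both lengthened, whereas post-focus words show no clear change. (3) Focus is perceived correctly about 90% of the time on average, showing that the prosodic encoding of focus is an effective perceptual cue. (4) The perception experiment and intonation analysis also show that the intonation of "neutral focus" in Uyghur differs from that of English and Chinese: it is close to sentence-initial focus rather than sentence-final focus. The paper also discusses the distribution and origin of the post-focus pitch drop among the languages of China.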

14.
Acoustic cues related to the voice source, including harmonic structure and spectral tilt, were examined for relevance to prosodic boundary detection. The measurements considered here comprise five categories: duration, pitch, harmonic structure, spectral tilt, and amplitude. Distributions of the measurements and statistical analysis show that the measurements may be used to differentiate between prosodic categories. Detection experiments on the Boston University Radio Speech Corpus show equal error detection rates around 70% for accent and boundary detection, using only the acoustic measurements described, without any lexical or syntactic information. Further investigation of the detection results shows that duration and amplitude measurements, and, to a lesser degree, pitch measurements, are useful for detecting accents, while all voice source measurements except pitch measurements are useful for boundary detection.

15.
Downstep in the pitch contour of Chinese Putonghua is examined using carefully designed sentences with controlled tone combinations. The results show that both automatic and non-automatic downstep exist in Chinese. In non-automatic downstep, low tones compress the pitch range of the following syllables downwards, and the main influence of downstep is on the topline. Low tones not only lower the topline after them but also raise the high tones before them, and the two effects are compatible with each other. In automatic downstep, the topline of the pitch contour in an intonational phrase shows a linear downtrend, but it differs among speakers due to the effect of personal stress practice. In comparison with downstep in other tone or non-tone languages, the downstep ratio in Chinese is not constant, and the domain of downstep is not limited to adjacent tones.

16.
The differences in pitch and duration of Chinese syllables between Putonghua (PTH) and Taiwan Mandarin (TM) were studied. The speech materials included both isolated syllables and sentences. The results reveal the following. For isolated syllables, T1 and T2 in TM are influenced by the Minnan dialect, so their pitch is lower than in PTH. T3 is fall-rise in PTH, while it is falling in TM. Moreover, the order of syllable duration across tones is T3 > T2 > T1 > T4 in PTH, while it is T1 > T2 > T3 > T4 in TM. For syllables in sentences, T2 is mid-rise in PTH, while it is mid-level in TM. T3 is longer than T4 but shorter than T1 or T2 in PTH, while it is the shortest in TM. Furthermore, the effects of the prosodic phrase boundary on duration are almost the same for the different tones in PTH, but the lengthened part of T1 or T2 is longer than that of T3 or T4 in TM.

17.
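Uyghur dialect identification and related acoustic analysis   Cited 1 time in total (self-citations: 0; citations by others: 1)
Motivated by the practical needs of speech applications such as speech recognition and voiceprint recognition, this work studies the acoustic characteristics and identification of the Hotan dialect for the first time. First, Hotan dialect speech was selected and manually annotated at multiple levels, and the formants, duration, and intensity of vowels were analyzed statistically to describe the overall vowel pattern of the Hotan dialect and the pronunciation characteristics of male and female speakers. Then, analysis of variance and nonparametric tests were applied to formant samples from three Uyghur dialects; the results show significant differences among the three dialects in the formant distribution patterns of male vowels, female vowels, and vowels overall. Finally, Uyghur dialect identification models were built on GMM-UBM (Gaussian Mixture Model-Universal Background Model), DNN-UBM (Deep Neural Networks-Universal Background Model), and LSTM-UBM (Long Short Term Memory-Universal Background Model), and comparison experiments were run on the discriminability of dialect i-vectors extracted with two kinds of input features: Mel-frequency cepstral coefficients alone, and Mel-frequency cepstral coefficients combined with formant frequencies. The results show that the combined features incorporating formant coefficients improve dialect discriminability, and that the LSTM-UBM model extracts more discriminative dialect i-vectors than GMM-UBM and DNN-UBM.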

18.
Stress is an important parameter for prosody processing in speech synthesis. In this paper, we compare the acoustic features of neutral tone syllables and strong stress syllables with moderate stress syllables, including pitch, syllable duration, intensity and pause length after syllable. The relation between duration and pitch, as well as the Third Tone (T3) and pitch are also studied. Three stress prediction models based on ANN, i.e. the acoustic model, the linguistic model and the mixed model, are presented for predicting Chinese sentential stress. The results show that the mixed model performs better than the other two models. In order to solve the problem of the diversity of manual labeling, an evaluation index of support ratio is proposed.

19.
The effects of the prosodic phrase (PP) boundary on the pitch lowering of downstep and focus, as well as their domains, were investigated in Chinese Putonghua using designed sentences consisting of two prosodic phrases (PP1 and PP2). The results showed that: (1) The PP boundary blocked the downstep effect in the preceding phrase, indicating that PP is the domain of downstep. (2) The post-focus F0 lowering effect in PP1 spread across the PP boundary and lowered the F0 contour of PP2. If there is a downstep effect in PP2, the post-boundary compression effect of the prior focus accumulates with the downstep, producing a further lowered contour. Therefore, the domain of focus is an intonational phrase (IP). (3) When there is one contrastive focus in each phrase, the outstanding pitch reset elicited by the second focus blocks the F0 lowering effect of PP1 onto PP2, and the two foci are realized independently.

20.
An automatic detection and evaluation method of the Erhua (also called r-retroflexion or retroflex suffixation) in the Putonghua proficiency test (PSC) is proposed. Based on the framework of the computer assisted pronunciation evaluation system, the present authors made an in-depth analysis of phonologic rules and acoustic characteristics of the Erhua, and solved the detection and evaluation of the Erhua as a typical classification problem. Then more representative acoustic features were selected and a variety of different classification algorithms were used. The results showed that the boosting classification and regression tree (Boosting CART) could make full use of the characteristics of the Erhua, and the classification accuracy was 92.41%. Based on further analysis of the acoustic feature group, it was found that formant, pronunciation confidence and duration were the most important clues of the Erhua, and these clues could effectively realize the automatic detection and evaluation of the Erhua.
