Similar literature
1.
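Stress is an important intonational feature, and stress synthesis can improve the naturalness and expressiveness of synthesized speech. To address the local prominence of stress, this paper proposes a representation of acoustic-feature prominence and analyzes the acoustic-feature prominence of stressed syllables at different prosodic positions (initial, medial, and final in the prosodic word; initial, medial, and final in the prosodic phrase; and so on). It finds that stress at the end of a prosodic unit (the final syllable of a prosodic word, or the final prosodic word of a prosodic phrase) has lower maximum-F0 prominence than stress that is not at the end of a prosodic unit. The paper then proposes a nonlinear algorithm, based on acoustic-feature prominence, for generating the acoustic parameters of stress, which resolves the problem that conventional linear modification algorithms change the parameters too little or too much. Using this algorithm, a hidden Markov model (HMM) based speech synthesis system that supports stress synthesis was built. Experiments show that the system can effectively synthesize stressed speech and improves the naturalness and expressiveness of the synthesized speech.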

3.
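Through phonetic analysis at three levels (tone category, position in the sentence, and stress level), this study examines the pitch realization of the four Mandarin tones under different intonational conditions. Target words were placed in three different focus positions (i.e., the positions of strongest sentence stress) and two different non-focus positions (i.e., positions without sentence stress), and the pitch range of the target words and the high and low pitch points of the target tones were observed and analyzed. The results show that: (1) under both focus and non-focus conditions, the pitch of the rising tone (yangping) lies in the mid-to-low region of the pitch range; although the theoretical tone value of the low point of the falling tone (qusheng) is lower than that of the yangping low point, in actual pitch realization the qusheng low point is often close to, or even higher than, the yangping low point; (2) focus in sentence-initial position expands the pitch range both upward and downward, whereas focus in sentence-final position raises the pitch range as a whole, but the high points of different tones do not all change in proportion to the upper limit of the pitch range, nor do the low points all change in proportion to its lower limit; (3) the dependence of the post-stress syllable's pitch on the focused syllable is constrained by foot structure: if the focused and post-focus syllables are in the same foot, their pitch relation resembles that between a neutral-tone syllable and the non-neutral-tone syllable preceding it; if a foot boundary lies between them, the pitch of the post-focus syllable shows a degree of independence. These results illustrate the complexity of tonal pitch realization in sentences: building a Mandarin intonation model with good predictive power requires many detailed rules drawing on information at different levels, including focus structure, prosodic structure, coarticulation, and tone-category characteristics.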

5.
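An experimental study of prosodic phrase stress in Mandarin   Cited by: 6 (self-citations: 2, citations by others: 4)
Three logically closely connected experiments demonstrate that prosodic phrase stress exists in Mandarin Chinese and that it falls on the word carrying the semantic focus of the phrase; lengthening of the word's stressed syllable is an important acoustic manifestation of phrase stress.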

6.
Stress perception of disyllabic prosodic words in continuous speech   Cited by: 5 (self-citations: 1, citations by others: 4)
王韫佳  初敏  贺琳  冯勇强 《声学学报》2003,28(6):534-539
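A stress perception experiment was conducted on 1,898 disyllabic prosodic words from 300 sentences drawn from the Mandarin speech corpus of Microsoft Research Asia. The results show that stress perception of disyllabic words in continuous speech differs from that of isolated words and is significantly affected by the prosodic boundary at which the word is located. In the perception experiment, the difference in stress scores between the two syllables of a word correlated positively with both their difference in high-point pitch and their difference in duration, but the correlation with the high-point pitch difference was stronger than that with the duration difference. The high-point pitch difference and the duration difference were uncorrelated in non-pre-pausal position and weakly positively correlated in pre-pausal position. The results also show that syllable stress perception is significantly affected by tone contour.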

7.
Prosodic manifestations of syntactic boundaries   Cited by: 7 (self-citations: 1, citations by others: 6)
杨玉芳 《声学学报》1997,22(5):414-421
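This paper studies the systematic relationship between syntactic boundaries of different levels in read sentences, the prosodic parameters of nearby syllables, and the pause duration at the boundary. The results show that the time-domain and frequency-domain parameters of the pre-boundary syllable vary systematically with boundary level. In the time domain, the sum of the pre-boundary syllable duration and the pause increases almost linearly as the boundary level rises; the pre-boundary syllable duration itself changes in both directions with boundary level, reaching its maximum at phrase boundaries; the pause grows rapidly at major syntactic boundaries; and within the syllable, the consonant duration and the normalized position of the energy peak also vary systematically with boundary level. In the frequency domain, the mean F0 of the pre-boundary syllable falls as the boundary level rises, and the pitch range gradually narrows. These results provide an experimental basis for handling the relationship between syntactic structure and speech in continuous speech synthesis, recognition, and understanding systems.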

8.
Effects of boundary strength on the realization of focus   Cited by: 1 (self-citations: 0, citations by others: 1)
刘璐  王蓓 《声学学报》2020,45(3):289-298
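In Mandarin Chinese, single focus is mainly realized as pitch raising on the focused word and post-focus compression (PFC) of pitch, whereas in double-focus sentences pitch compression after the first focus is limited. Does prosodic boundary strength affect how focus is realized, in particular post-focus compression? Using the syntactic categories of word, phrase, clause, and sentence, this experiment set up four prosodic boundary strengths after a key word (X) in the sentence. Four focus conditions were elicited by questions: focus on key word X, focus on the sentence-final word Y, focus on both X and Y (double focus), and neutral focus. Acoustic analysis shows that: (1) focused words all show pitch raising and lengthening; the amount of increase does not differ significantly between single and double focus and is not affected by the strength of the boundary after the focused word; (2) in double-focus sentences, pitch compression after the first focus is weakened by a boundary of medium strength, whereas only a very strong boundary weakens pitch compression after a single focus; (3) as prosodic boundary strength increases, the duration of the pre-boundary word increases, but the lengthening has an upper limit and is not affected by focus position. Overall, prosodic boundaries and focus are encoded in parallel in intonation.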

9.
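Using laboratory sentences designed with specific tone combinations and contexts, this study examined how prosodic phrase boundaries affect downstep and the post-focus pitch drop in sentences, as well as the domains over which downstep and focus operate. The results show that, in sentences composed of two prosodic phrases, the prosodic phrase boundary blocks the downstep effect of the preceding phrase, so the domain of downstep is the prosodic phrase. Focus is realized differently: the forward pitch-lowering effect after focus crosses the prosodic phrase boundary and markedly lowers the topline of the following prosodic phrase; if the following prosodic phrase contains downstep, the cross-boundary pitch lowering from focus accumulates with the downstep effect to produce an even lower topline, indicating that the domain of focus is the intonational phrase. However, when the following prosodic phrase also contains a focus, pitch reset blocks the forward pitch-lowering effect of the focus in the preceding phrase, and the two foci are then realized independently.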

10.
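Acoustic cues to the boundaries of large-scale information units in discourse   Cited by: 3 (self-citations: 2, citations by others: 1)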
王蓓  杨玉芳  吕士楠 《声学学报》2005,30(2):177-183
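This study investigates the prosodic levels of the boundaries of large-scale information units in discourse, such as sentences and paragraphs, and the acoustic cues at those boundaries. A corpus of 10 discourses was annotated for prosodic level and analyzed acoustically. The main conclusions are as follows. (1) The prosodically meaningful large-scale information units in discourse are the clause (corresponding to the intonational phrase), the sentence (including simple and complex sentences), and the paragraph. Simple-sentence and complex-sentence boundaries do not differ significantly in perceived level or acoustic features and correspond to the same prosodic unit. (2) The pitch cue to the level of a large-scale prosodic boundary is realized through the pitch contrast between the syllables before and after the boundary, that is, the degree of pitch reset. A single acoustic cue at only the initial or only the final syllable is not sufficient to distinguish boundary levels. (3) Intonational phrases within paragraphs and complex sentences are arranged essentially in parallel, with no obvious, regular overall declination. (4) The larger the information unit, the longer the silent interval and the greater its freedom to vary. In addition, at clause boundaries the silent interval correlates significantly and positively with the degree of pitch reset.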

11.
There is a tendency across languages to use a rising pitch contour to convey question intonation and a falling pitch contour to convey a statement. In a lexical tone language such as Mandarin Chinese, rising and falling pitch contours are also used to differentiate lexical meaning. How, then, does the multiplexing of the F(0) channel affect the perception of question and statement intonation in a lexical tone language? This study investigated the effects of lexical tones and focus on the perception of intonation in Mandarin Chinese. The results show that lexical tones and focus impact the perception of sentence intonation. Question intonation was easier for native speakers to identify on a sentence with a final falling tone and more difficult to identify on a sentence with a final rising tone, suggesting that tone identification intervenes in the mapping of F(0) contours to intonational categories and that tone and intonation interact at the phonological level. In contrast, there is no evidence that the interaction between focus and intonation goes beyond the psychoacoustic level. The results provide insights that will be useful for further research on tone and intonation interactions in both acoustic modeling studies and neurobiological studies.

12.
Speech intonation and focus location in matched statements and questions   Cited by: 3 (self-citations: 0, citations by others: 3)
An acoustical study of speech production was conducted to determine the manner in which the location of linguistic focus influences intonational attributes of duration and fundamental voice frequency (F0) in matched statements and questions. Speakers orally read sentences that were preceded by aurally presented stimuli designed to elicit either no focus or focus on the first or last noun phrase of the target sentences. Computer-aided acoustical analysis of word durations showed a localized, large magnitude increase in the duration of the focused word for both statements and questions. Analysis of F0 revealed a more complex pattern of results, with the shape of the F0 topline dependent on sentence type and focus location. For sentences with neutral or sentence-final focus, the difference in the F0 topline between questions and statements was evident only on the last key word, where the F0 peak of questions was considerably higher than that of statements. For sentences with focus on the first key word, there was no difference in peak F0 on the focused item itself, but the F0 toplines of questions and statements diverged quite dramatically following the initial word. The statement contour dropped to a low F0 value for the remainder of the sentence, whereas the question remained quite high in F0 for all subsequent words. In addition, the F0 contour on the focused word was rising in questions and falling in statements, regardless of focus location. The results provide a basis for work on the perception of linguistic focus.

13.
Recent research has found that while speaking, subjects react to perturbations in pitch of voice auditory feedback by changing their voice fundamental frequency (F0) to compensate for the perceived pitch-shift. The long response latencies (150-200 ms) suggest they may be too slow to assist in on-line control of the local pitch contour patterns associated with lexical tones on a syllable-to-syllable basis. In the present study, we introduced pitch-shifted auditory feedback to native speakers of Mandarin Chinese while they produced disyllabic sequences /ma ma/ with different tonal combinations at a natural speaking rate. Voice F0 response latencies (100-150 ms) to the pitch perturbations were shorter than syllable durations reported elsewhere. Response magnitudes increased from 50 cents during static tone to 85 cents during dynamic tone productions. Response latencies and peak times decreased in phrases involving a dynamic change in F0. The larger response magnitudes and shorter latency and peak times in tasks requiring accurate, dynamic control of F0, indicate this automatic system for regulation of voice F0 may be task-dependent. These findings suggest that auditory feedback may be used to help regulate voice F0 during production of bi-tonal Mandarin phrases.
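
The response magnitudes in the abstract above are given in cents, a logarithmic pitch unit in which one octave equals 1200 cents. The following is a minimal sketch of the standard Hz-to-cents conversion, not code from the study; the 200 Hz baseline and the function names are illustrative assumptions.

```python
import math


def cents_between(f_ref_hz: float, f_hz: float) -> float:
    """Signed interval from f_ref_hz to f_hz in cents (1200 cents = 1 octave)."""
    return 1200.0 * math.log2(f_hz / f_ref_hz)


def shift_by_cents(f_hz: float, cents: float) -> float:
    """Frequency obtained by shifting f_hz by the given number of cents."""
    return f_hz * 2.0 ** (cents / 1200.0)


if __name__ == "__main__":
    baseline_hz = 200.0  # hypothetical speaker F0, not a value from the study
    for magnitude in (50, 85):  # response magnitudes quoted in the abstract
        shifted = shift_by_cents(baseline_hz, magnitude)
        print(f"{magnitude} cents above {baseline_hz:.0f} Hz is {shifted:.1f} Hz")
```

Because the scale is logarithmic, the same number of cents corresponds to a larger change in Hz for a speaker with a higher baseline F0.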

14.
Post-low bouncing is a phenomenon whereby after reaching a very low pitch in a low lexical tone, F(0) bounces up and then gradually drops back in the following syllables. This paper reports the results of an acoustic analysis of the phenomenon in two Mandarin Chinese corpora and presents a simple mechanical model that can effectively simulate this bouncing effect. The acoustic analysis shows that most of the F(0) dynamic features profiling the bouncing effect strongly correlate with the amount of F(0) lowering in the preceding low-tone syllable, and that the additional F(0) raising commences at the onset of the first post-low syllable. Using the quantitative Target Approximation model, this bouncing effect was simulated by adding an acceleration adjustment to the initial F(0) state of the first post-low syllable. A highly linear relation between F(0) lowering and estimated acceleration adjustment was found. This relation was then used to effectively simulate the bouncing effect in both the neutral tone and the full tones. The results of the analysis and simulation are consistent with the hypothesis that the bouncing effect is due to a temporary perturbation of the balance between antagonistic forces in the laryngeal control in producing a very low pitch.

15.
In tone languages there are potential conflicts in the perception of lexical tone and intonation, as both depend mainly on the differences in fundamental frequency (F0) patterns. The present study investigated the acoustic cues associated with the perception of sentences as questions or statements in Cantonese, as a function of the lexical tone in sentence-final position. Cantonese listeners performed intonation identification tasks involving complete sentences, isolated final syllables, and sentences without the final syllable (carriers). Sensitivity (d' scores) was similar for complete sentences and final syllables but was significantly lower for carriers. Sensitivity was also affected by tone identity. These findings show that the perception of questions and statements relies primarily on the F0 characteristics of the final syllables (local F0 cues). A measure of response bias (c) provided evidence for a general bias toward the perception of statements. Logistic regression analyses showed that utterances were accurately classified as questions or statements by using average F0 and F0 interval. Average F0 of carriers (global F0 cue) was also found to be a reliable secondary cue. These findings suggest that the use of F0 cues for the perception of question intonation in tonal languages is likely to be language-specific.
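
The sensitivity (d') and response bias (c) measures mentioned in the abstract above come from signal detection theory. The sketch below shows the textbook definitions, d' = z(H) - z(F) and c = -(z(H) + z(F)) / 2, and is not the authors' analysis code. It assumes that "question" is treated as the signal category, applies a log-linear correction (adding 0.5 to each cell) so that the z-scores stay finite, and uses made-up counts.

```python
from statistics import NormalDist


def dprime_and_criterion(hits: int, misses: int,
                         false_alarms: int, correct_rejections: int):
    """Return (d', c) from a 2x2 table of response counts.

    Assumption: a "hit" is a question identified as a question and a
    "false alarm" is a statement identified as a question.
    """
    # Log-linear correction (an assumption of this sketch) avoids rates of 0 or 1.
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf  # inverse CDF of the standard normal distribution
    d_prime = z(hit_rate) - z(fa_rate)
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))
    return d_prime, criterion


if __name__ == "__main__":
    # Hypothetical counts for one listener, not data from the study.
    d, c = dprime_and_criterion(hits=30, misses=20,
                                false_alarms=5, correct_rejections=45)
    print(f"d' = {d:.2f}, c = {c:.2f}")
```

Under this coding, a positive c means the listener responds "question" less readily than an unbiased observer would, which is the direction of the statement bias the abstract reports.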

16.
Segmental duration patterns have long been used to support the proposal that syllables are basic speech planning units, but production experiments almost always confound syllable and word boundaries. The current study tried to remedy this problem by comparing word-internal and word-peripheral consonantal duration patterns. Stress and sequencing were used to vary the nominal location of word-internal boundaries in American English productions of disyllabic nonsense words with medial consonant sequences. The word-internal patterns were compared to those that occurred at the edges of words, where boundary location was held constant and only stress and sequence order were varied. The English patterns were then compared to patterns from Russian and Finnish. All three languages showed similar effects of stress and sequencing on consonantal duration, but an independent effect of syllable position was observed only in English and only at a word boundary. English also showed stronger effects of stress and sequencing across a word boundary than within a word. Finnish showed the opposite pattern, whereas Russian showed little difference between word-internal and word-peripheral patterns. Overall, the results suggest that the suprasegmental units of motor planning are language-specific and that the word may be a more relevant planning unit in English.

17.
Quadrisyllabic Mandarin words and phrases with normal stress were used to study tonal coarticulation. It was first found that the F0 perturbation caused by tonal coarticulation at the starting point and the ending point of the F0 curve in each syllable is larger than the intrinsic F0 difference of vowels at those points. As for tonal coarticulation itself, it was found that tonal coarticulation in words and phrases with normal stress differs from that in nonsense sequences with even stress. In words and phrases with normal stress, the tonal coarticulatory effects are unidirectional: the carryover effect does not extend to the ending point of the tone section of the following syllable, and the anticipatory effect does not extend to the starting point of the tone section of the preceding one. The F0 perturbation caused by tonal coarticulation follows a regular pattern.

18.
It is hypothesized that in sine-wave replicas of natural speech, lexical tone recognition would be severely impaired due to the loss of F0 information, but the linguistic information at the sentence level could be retrieved even with limited tone information. Forty-one native Mandarin-Chinese-speaking listeners participated in the experiments. Results showed that sine-wave tone-recognition performance was on average only 32.7% correct. However, sine-wave sentence-recognition performance was very accurate, approximately 92% correct on average. Therefore the functional load of lexical tones on sentence recognition is limited, and the high-level recognition of sine-wave sentences is likely attributed to the perceptual organization that is influenced by top-down processes.
