Similar literature
1.
This study presents EMA (electromagnetic articulography) data on articulation of the vowel /a/ at different prosodic boundaries in French. Three speakers of metropolitan French produced utterances containing the vowel /a/, preceded by /t/ and followed by one of six consonants /b d g f s S/ (three stops and three fricatives), with different prosodic boundaries intervening between the /a/ and the six different consonants. The prosodic boundaries investigated are the Utterance, the Intonational phrase, the Accentual phrase, and the Word. Data for the Tongue Tip, Tongue Body, and Jaw are presented. The articulatory data presented here were recorded at the same time as the acoustic data presented in Tabain [J. Acoust. Soc. Am. 113, 516-531 (2003)]. Analyses show that there is a strong effect on peak displacement of the vowel according to the prosodic hierarchy, with the stronger prosodic boundaries inducing a much lower Tongue Body and Jaw position than the weaker prosodic boundaries. Durations of both the opening movement into and the closing movement out of the vowel are also affected. Peak velocity of the articulatory movements is also examined, and, contrary to results for phrase-final lengthening, it is found that peak velocity of the opening movement into the vowel tends to increase with the higher prosodic boundaries, together with the increased magnitude of the movement between the consonant and the vowel. Results for the closing movement out of the vowel and into the consonant are not so clear. Since one speaker shows evidence of utterance-level articulatory declension, it is suggested that the competing constraints of articulatory declension and prosodic effects might explain some previous results on phrase-final lengthening.

2.
In this article, we examine the effects of changing speaking rate and syllable stress on the space-time structure of articulatory gestures. Lip and jaw movements of four subjects were monitored during production of selected bisyllabic utterances in which stress and rate were orthogonally varied. Analysis of the relative timing of articulatory movements revealed that the time of onset of gestures specific to consonant articulation was tightly linked to the timing of gestures specific to the flanking vowels. The observed temporal stability was independent of large variations in displacement, duration, and velocity of individual gestures. The kinematic results are in close agreement with our previously reported EMG findings [B. Tuller et al., J. Exp. Psychol. 8, 460-472 (1982)] and together provide evidence for relational invariants in articulation.

3.
Acoustic analysis of boundaries in the Chinese prosodic hierarchy   Cited by: 11 (self-citations: 5, citations by others: 6)
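Based on a large-scale corpus, speech material at slower and faster speaking rates was compared in order to study the acoustic manifestations of boundaries in the prosodic hierarchy. The main results are as follows. (1) Pitch declination and pitch reset in Chinese utterances are realized by movement of the lower limit of the pitch range. (2) The acoustic cues to a prosodic word boundary are a discontinuity in the bottom line (low pitch line) and lengthening of the pre-boundary syllable, generally without a silent pause. (3) The acoustic cues to prosodic phrase and intonational phrase boundaries are bottom-line reset and a silent pause; the higher the boundary level, the greater the bottom-line reset and the longer the silent pause. (4) The perceived boundary level increases logarithmically with the duration of the silent pause.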
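Result (4) can be written out as a formula. The form below is only a sketch of what a logarithmic relation implies; the symbols L (perceived boundary level), d (silent pause duration), and the fit constants a and b are assumptions introduced here for illustration, and the abstract reports no values for them.

```latex
\[
  L(d) \approx a + b \ln d, \qquad b > 0
\]
\[
  L(2d) - L(d) = b \ln 2
\]
```

Under this assumed form, doubling the pause duration raises the perceived level by the same amount b ln 2 whether the pause was short or long to begin with, so equal increments in pause length matter less and less as pauses get longer.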

4.
This study explores the effects of prosodic boundaries on nasality at intonational phrase, word, and syllable boundaries. The subjects were recorded saying phrases that contained a syllable-final nasal consonant followed by a syllable-initial stop. The timing, duration, and magnitude of the nasal airflows measured were used to determine the extent of nasality across boundaries. Nasal amplitudes were found to vary in a speaker-dependent manner among boundary types. However, the patterns of nasal contours and temporal aspects of the airflow parameters consistently varied with boundary type across all the speakers. In general, the duration of nasal airflow and nasal plateau were the longest at the intonational phrase boundary, followed by word boundary and then syllable boundary. In addition to the hierarchical influence of boundary strength, there were unique phonetic markings associated with individual boundaries. In particular, two nasal rises interrupted by nasal inhalation occurred only across an intonational phrase boundary. Also, unexpectedly, a word boundary was marked by the longest postboundary vowel, whereas a syllable boundary was marked with the shortest nasal duration. The results here support the hierarchical effect of boundary on both domain-edge strengthening and cross-boundary coarticulation.

5.
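A method for predicting prosodic parameters in Chinese speech synthesis based on data mining algorithms   Cited by: 8 (self-citations: 0, citations by others: 8)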
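The prosody module is an important component of a speech synthesis system, and whether the prosodic feature parameters are described correctly directly affects the system's output. Because current speech synthesis systems lack an effective description of the relationships between the prosodic parameters of neighbouring syllables, a new method for predicting prosodic parameters is proposed: data mining is used to discover the relationships among syllable prosodic parameters, these relationships are described with an association rule model, and rules governing the variation of the F0 and duration parameters of Chinese prosody are obtained with an association discovery algorithm. The study shows that these rules can provide useful guidance for unit selection in multi-sample concatenative synthesis systems.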
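As a rough illustration of what an association rule between neighbouring-syllable attributes looks like, the sketch below computes the two standard rule statistics, support and confidence. It is not the paper's algorithm: the function name `rule_stats`, the attribute labels such as `prev=T3` and `f0=low`, and the four toy records are all hypothetical.

```python
def rule_stats(transactions, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent -> consequent.

    Each transaction is a set of discretized attributes describing one
    syllable and its neighbour (hypothetical labels, for illustration only).
    """
    antecedent, consequent = set(antecedent), set(consequent)
    n_total = len(transactions)
    # Records that contain every item of the antecedent.
    n_ante = sum(1 for t in transactions if antecedent <= set(t))
    # Records that contain both antecedent and consequent.
    n_both = sum(1 for t in transactions if (antecedent | consequent) <= set(t))
    support = n_both / n_total
    confidence = n_both / n_ante if n_ante else 0.0
    return support, confidence


# Toy records: tone of the previous syllable and a discretized F0 class.
toy = [
    {"prev=T3", "f0=low"},
    {"prev=T3", "f0=low"},
    {"prev=T3", "f0=high"},
    {"prev=T1", "f0=high"},
]
support, confidence = rule_stats(toy, {"prev=T3"}, {"f0=low"})
```

A rule-mining algorithm would enumerate many candidate rules and keep those whose support and confidence exceed chosen thresholds; the retained rules could then be consulted when ranking candidate units.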

6.
Finding the control parameters of an articulatory model that result in given acoustics is an important problem in speech research. However, one should also be able to derive the same parameters from measured articulatory data. In this paper, a method to estimate the control parameters of the model by Maeda from electromagnetic articulography (EMA) data, which allows the derivation of full sagittal vocal tract slices from sparse flesh-point information, is presented. First, the articulatory grid system involved in the model's definition is adapted to the speaker involved in the experiment, and EMA data are registered to it automatically. Then, articulatory variables that correspond to measurements defined by Maeda on the grid are extracted. An initial solution for the articulatory control parameters is found by a least-squares method, under constraints ensuring vocal tract shape naturalness. Dynamic smoothness of the parameter trajectories is then imposed by a variational regularization method. Generated vocal tract slices for vowels are compared with slices appearing in magnetic resonance images of the same speaker or found in the literature. Formants synthesized on the basis of these generated slices are adequately close to those tracked in real speech recorded concurrently with EMA.
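The smoothing step can be illustrated with a simple discrete stand-in. The sketch below assumes a first-difference penalty, minimizing sum (x_t - y_t)^2 + lam * sum (x_{t+1} - x_t)^2 for one parameter trajectory y; the abstract does not state the actual functional, so the penalty, the weight `lam`, and the function name `smooth_trajectory` are assumptions.

```python
def smooth_trajectory(values, lam):
    """Smooth one parameter trajectory with a first-difference penalty.

    Setting the gradient of the objective to zero gives a symmetric
    tridiagonal linear system, solved here with the Thomas algorithm.
    """
    n = len(values)
    if n < 2:
        return list(values)
    # Main diagonal: 1 + 2*lam inside, 1 + lam at both ends.
    diag = [1.0 + 2.0 * lam] * n
    diag[0] = diag[-1] = 1.0 + lam
    off = -lam  # sub- and super-diagonal entries
    c_prime = [0.0] * n
    d_prime = [0.0] * n
    c_prime[0] = off / diag[0]
    d_prime[0] = values[0] / diag[0]
    # Forward elimination.
    for i in range(1, n):
        denom = diag[i] - off * c_prime[i - 1]
        c_prime[i] = off / denom
        d_prime[i] = (values[i] - off * d_prime[i - 1]) / denom
    # Back substitution.
    out = [0.0] * n
    out[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        out[i] = d_prime[i] - c_prime[i] * out[i + 1]
    return out


def roughness(xs):
    """Sum of squared first differences, the quantity being penalized."""
    return sum((b - a) ** 2 for a, b in zip(xs, xs[1:]))


noisy = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
smoothed = smooth_trajectory(noisy, lam=2.0)
```

With `lam = 0` the input is returned unchanged, and larger values trade fidelity to the frame-wise least-squares estimates for smoother trajectories.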

7.
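Using laboratory sentences designed with specific tone combinations and contexts, this study examined how prosodic phrase boundaries affect downstep and post-focus pitch lowering within an utterance, and identified the domains of downstep and focus. In utterances composed of two prosodic phrases, the prosodic phrase boundary blocks the downstep effect of the preceding phrase, so the domain of downstep is the prosodic phrase. Focus is realized differently: the forward pitch-lowering effect after focus crosses the prosodic phrase boundary and markedly lowers the top line of the following prosodic phrase. If the following prosodic phrase contains downstep, the cross-boundary pitch lowering of focus accumulates with the downstep effect and produces an even lower top line, which indicates that the domain of focus is the intonational phrase. When the following prosodic phrase also contains a focus, however, pitch reset blocks the forward pitch-lowering effect of the focus in the preceding phrase, and the two foci are realized independently.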

8.
The purpose of this study is to test a methodology for describing the articulation of vowels. High front vowels are a test case because some theories suggest that high front vowels have little cross-linguistic variation. Acoustic studies appear to show counterexamples to these predictions, but purely acoustic studies are difficult to interpret because of the many-to-one relation between articulation and acoustics. In this study, vocal tract dimensions, including constriction degree and position, are measured from cinéradiographic and x-ray data on high front vowels from three different languages (North American English, French, and Mandarin Chinese). Statistical comparisons find several significant articulatory differences between North American English /i/ and Mandarin Chinese and French /i/. In particular, differences in constriction degree were found, but not constriction position. Articulatory synthesis is used to model the acoustic consequences of some of the significant articulatory differences, finding that the articulatory differences may have the acoustic consequences of making the latter languages' /i/ perceptually sharper by shifting the frequencies of F2 and F3 upwards. In addition, the vowel /y/ has specific articulations that differ from those for /i/, including a wider tongue constriction, and substantially different acoustic sensitivity functions for F2 and F3.

9.
The effects of prosodic phrase (PP) boundaries on the pitch lowering of downstep and focus, as well as the domains of both, were investigated in Chinese Putonghua using designed sentences consisting of two prosodic phrases (PP1 and PP2). The results showed that: (1) The PP boundary blocked the downstep effect of the preceding phrase, indicating that the PP is the domain of downstep. (2) The post-focus F0 lowering effect in PP1 spread across the PP boundary and lowered the F0 contour of PP2. If there is a downstep effect in PP2, the postboundary compression effect of the prior focus accumulates with the downstep, producing a further lowered contour. Therefore, the domain of focus is the intonational phrase (IP). (3) When there is one contrastive focus in each phrase, the prominent pitch reset elicited by the second focus blocks the F0 lowering effect of PP1 on PP2, and the two foci are realized independently.

10.
Acoustic lengthening at prosodic boundaries is well explored, and the articulatory bases for this lengthening are becoming better understood. However, the temporal scope of prosodic boundary effects has not been examined in the articulatory domain. The few acoustic studies examining the distribution of lengthening indicate that boundary effects extend from one to three syllables before the boundary, and that effects diminish as distance from the boundary increases. This diminishment is consistent with the pi-gesture model of prosodic influence [Byrd and Saltzman, J. Phonetics 31, 149-180 (2003)]. The present experiment tests the preboundary and postboundary scope of articulatory lengthening at an intonational phrase boundary. Movement-tracking data are used to evaluate durations of consonant closing and opening movements, acceleration durations, and consonant spatial magnitude. Results indicate that prosodic boundary effects exist locally near the phrase boundary in both directions, diminishing in magnitude more remotely for those subjects who exhibit extended effects. Small postboundary effects that are compensatory in direction are also observed.

11.
An original three-dimensional (3D) linear articulatory model of the velum and nasopharyngeal wall has been developed from magnetic resonance imaging (MRI) and computed tomography images of a French subject sustaining a set of 46 articulations, covering his articulatory repertoire. The velum and nasopharyngeal wall are represented by generic surface triangular meshes fitted to the 3D contours extracted from MRI for each articulation. Two degrees of freedom were uncovered by principal component analysis: first, VL accounts for 83% of the velum variance, corresponding to an oblique vertical movement seemingly related to the levator veli palatini muscle; second, VS explains another 6% of the velum variance, controlling a mostly horizontal movement possibly related to the sphincter action of the superior pharyngeal constrictor. The nasopharyngeal wall is also controlled by VL for 47% of its variance. Electromagnetic articulographic data recorded on the velum fitted these parameters exactly, and may serve to recover dynamic velum 3D shapes. The main oral and nasopharyngeal area functions controlled by the articulatory model, complemented by the area functions derived from the complex geometry of each nasal passage extracted from coronal MRIs, were fed to an acoustic model and gave promising results about the influence of velum movements on the spectral characteristics of nasals.

12.
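Prosodic feature analysis and emotion recognition for multilingual emotional speech   Cited by: 3 (self-citations: 1, citations by others: 2)
Jiang Xiaoqing, Tian Lan, Cui Guohui. Acta Acustica (《声学学报》), 2006, 31(3): 217-221
Changes in prosodic feature parameters are the main carriers of emotional information in the speech signal. To investigate whether emotion can be recognized in multilingual speech samples from a small number of prosodic features, and thereby to make emotion recognition systems more robust to language, seven typical emotional states were selected, and the dynamic characteristics of prosodic parameters such as fundamental frequency, energy, and timing were analyzed statistically in Chinese, English, and Japanese speech samples produced by the same speaker with specified sentence patterns. The statistics show that the patterns of change in the prosodic feature parameters of emotional speech are fairly consistent across languages. On this basis, a preliminary emotion recognition experiment was carried out on mixed multilingual samples using principal component analysis (PCA), giving an average error rate of 27.74% and a lowest error rate of 11%. Basic prosodic parameters therefore allow preliminary, effective, language-independent recognition of several basic emotions.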
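To make the PCA step more concrete, the sketch below extracts the first principal component of a small feature matrix by power iteration on the covariance matrix, and reports the share of variance it explains. It is only an illustration of the general technique: the function name, the two-column toy features, and the use of power iteration are assumptions, and the paper's actual feature set and classifier are not described in the abstract.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (unit direction, explained-variance ratio) of the first PC."""
    n = len(rows)
    dim = len(rows[0])
    # Centre each feature column on its mean.
    means = [sum(r[j] for r in rows) / n for j in range(dim)]
    centered = [[r[j] - means[j] for j in range(dim)] for r in rows]
    # Sample covariance matrix.
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(dim)]
        for i in range(dim)
    ]
    # Power iteration: repeated multiplication converges to the leading
    # eigenvector when the start vector is not orthogonal to it.
    vec = [1.0] * dim
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    # Rayleigh quotient gives the eigenvalue; the trace is the total variance.
    eigenvalue = sum(
        vec[i] * sum(cov[i][j] * vec[j] for j in range(dim)) for i in range(dim)
    )
    total = sum(cov[i][i] for i in range(dim))
    return vec, eigenvalue / total


# Hypothetical two-feature samples lying exactly on the line y = 2x.
toy_features = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
direction, ratio = first_principal_component(toy_features)
```

In a recognition setting, samples would be projected onto the leading components and classified in that reduced space; a library eigensolver would normally replace the hand-written power iteration.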

13.
X‐ray free‐electron laser (XFEL) pulses from SPring‐8 Ångstrom Compact free‐electron LAser (SACLA) with a temporal duration of <10 fs have provided a variety of benefits in scientific research. In a previous study, an arrival‐timing monitor was developed to improve the temporal resolution in pump–probe experiments at beamline 3 by rearranging data in the order of the arrival‐timing jitter between the XFEL and the synchronized optical laser pulses. This paper presents Timing Monitor Analyzer (TMA), a software package by which users can conveniently obtain arrival‐timing data in the analysis environment at SACLA. The package is composed of offline tools that pull stored data from cache storage, and online tools that pull data from a data‐handling server in semi‐real time during beam time. Users can select the most suitable tool for their purpose, and share the results through a network connection between the offline and online analysis environments.

14.
Standard continuous interleaved sampling processing, and a modified processing strategy designed to enhance temporal cues to voice pitch, were compared on tests of intonation perception, and vowel perception, both in implant users and in acoustic simulations. In standard processing, 400 Hz low-pass envelopes modulated either pulse trains (implant users) or noise carriers (simulations). In the modified strategy, slow-rate envelope modulations, which convey dynamic spectral variation crucial for speech understanding, were extracted by low-pass filtering (32 Hz). In addition, during voiced speech, higher-rate temporal modulation in each channel was provided by 100% amplitude-modulation by a sawtooth-like wave form whose periodicity followed the fundamental frequency (F0) of the input. Channel levels were determined by the product of the lower- and higher-rate modulation components. Both in acoustic simulations and in implant users, the ability to use intonation information to identify sentences as question or statement was significantly better with modified processing. However, while there was no difference in vowel recognition in the acoustic simulation, implant users performed worse with modified processing both in vowel recognition and in formant frequency discrimination. It appears that, while enhancing pitch perception, modified processing harmed the transmission of spectral information.
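The product rule for channel levels in the modified strategy can be sketched in a few lines. The exact shape of the "sawtooth-like" waveform is not given in the abstract, so the falling ramp used here, the function names, and the choice to leave unvoiced frames unmodulated are all assumptions made for illustration.

```python
def sawtooth(t, f0):
    """Falling ramp in (0, 1] with period 1/f0, i.e. 100% modulation depth.

    The ramp direction is an assumption; the source only says "sawtooth-like".
    """
    phase = (t * f0) % 1.0
    return 1.0 - phase


def channel_level(slow_envelope, t, f0=None):
    """Channel level as slow envelope times the F0-rate modulator.

    f0=None marks an unvoiced frame; passing the envelope through
    unmodulated in that case is an assumption, not stated in the source.
    """
    modulator = sawtooth(t, f0) if f0 is not None else 1.0
    return slow_envelope * modulator


# One channel with a slow-envelope value of 0.8 and a 100 Hz voice pitch.
at_period_start = channel_level(0.8, 0.0, 100.0)
quarter_period = channel_level(0.8, 0.0025, 100.0)
unvoiced = channel_level(0.8, 0.0025)
```

Because the modulator repeats once per pitch period, the channel output carries F0 as a temporal cue while the slow envelope still carries the spectral variation extracted by the 32 Hz low-pass filter.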

15.
The experiments examined age-related changes in temporal sensitivity to increments in the interonset intervals (IOI) of components in tonal sequences. Discrimination was examined using reference sequences consisting of five 50-ms tones separated by silent intervals; tone frequencies were either fixed at 4 kHz or varied within a 2-4-kHz range to produce spectrally complex patterns. The tonal IOIs within the reference sequences were either equal (200 or 600 ms) or varied individually with an average value of 200 or 600 ms to produce temporally complex patterns. The difference limen (DL) for increments of IOI was measured. Comparison sequences featured either equal increments in all tonal IOIs or increments in a single target IOI, with the sequential location of the target changing randomly across trials. Four groups of younger and older adults with and without sensorineural hearing loss participated. Results indicated that DLs for uniform changes of sequence rate were smaller than DLs for single target intervals, with the largest DLs observed for single targets embedded within temporally complex sequences. Older listeners performed more poorly than younger listeners in all conditions, but the largest age-related differences were observed for temporally complex stimulus conditions. No systematic effects of hearing loss were observed.

16.
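This paper studies the use of prosodic features in speaker verification. The entire prosodic contour is divided into segments with a fixed segment length and segment shift, and each segment is fitted with Legendre polynomials to obtain continuous prosodic features. The features are mapped into the total variability factor space, and probabilistic linear discriminant analysis is used to compensate for speaker and session differences. A noisy test set was constructed by adding noise to extended core test condition 5 of the 2010 NIST Speaker Recognition Evaluation, and the prosodic features and conventional Mel-frequency cepstral coefficients (MFCCs) were tested separately. The results show that as the signal-to-noise ratio decreases, MFCC performance degrades sharply while the performance of the prosodic features remains relatively stable. Fusing the two feature types improves system performance further: relative to the MFCC-only system, the equal error rate and the minimum detection cost fall by up to 9% and 11%, respectively. The experiments indicate that prosodic features are strongly robust to noise in speaker recognition and strongly complementary to conventional MFCCs.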
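The segmentation and Legendre fitting step can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions: the function names, the polynomial degree, the mapping of each segment onto [-1, 1], and the least-squares solver are all choices made here, and none of the paper's actual settings are given in the abstract.

```python
def legendre_basis(x, degree):
    """Return [P_0(x), ..., P_degree(x)] using Bonnet's recurrence."""
    values = [1.0, x]
    for n in range(1, degree):
        values.append(((2 * n + 1) * x * values[n] - n * values[n - 1]) / (n + 1))
    return values[: degree + 1]


def solve_linear(matrix, rhs):
    """Solve a small dense system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - tail) / aug[r][r]
    return solution


def fit_legendre(segment, degree):
    """Least-squares Legendre coefficients of one segment (len >= 2)."""
    m = len(segment)
    # Map sample positions onto [-1, 1], where the polynomials are defined.
    xs = [-1.0 + 2.0 * i / (m - 1) for i in range(m)]
    basis = [legendre_basis(x, degree) for x in xs]
    k = degree + 1
    # Normal equations (B^T B) c = B^T y.
    gram = [[sum(b[i] * b[j] for b in basis) for j in range(k)] for i in range(k)]
    rhs = [sum(b[i] * y for b, y in zip(basis, segment)) for i in range(k)]
    return solve_linear(gram, rhs)


def segment_features(contour, seg_len, seg_shift, degree):
    """Slide a fixed-length window over the contour and fit each window."""
    return [
        fit_legendre(contour[start:start + seg_len], degree)
        for start in range(0, len(contour) - seg_len + 1, seg_shift)
    ]


# Hypothetical contour: a straight line 2 + 3x sampled at 11 points on [-1, 1].
line = [2.0 + 3.0 * (-1.0 + 0.2 * i) for i in range(11)]
coeffs = segment_features(line, seg_len=11, seg_shift=5, degree=2)[0]
```

The low-order coefficients have a direct reading: the first tracks the segment's mean level, the second its overall slope, and the third its curvature, which is why a short coefficient vector can summarize the shape of a pitch or energy contour.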

18.
To assess the ability of projective phase sensitive magnetic resonance (MR) angiography to visualize the aortoiliac vascular segment, and to determine the effects of triggering and timing of data acquisition on image quality, we studied 18 healthy volunteers, mean age 33.3 +/- 11 years, by color Doppler imaging and by MR angiography. MR angiography was performed at 1.5 T using a flow-adjustable gradient-echo (FLAG) sequence operated in both ECG-triggered and non-triggered acquisition modes. The images were graded in a blinded fashion by two independent observers. The data were analyzed using Pearson's chi-square analysis. Eighteen triggered time-resolved and 17 non-triggered, time-averaged MR angiograms consisting of 252 and 17 angiographic images (AI), respectively, were analyzed. In the triggered mode 69 (27.4%) AI and in the non-triggered mode 2 (11.8%) AI were diagnostic. At least one triggered diagnostic AI was obtained in each subject. The image grades were not statistically different between observers (kappa = 0.6686). In the triggered mode diagnostic images were acquired within +/- 90 msec of the peak systolic flow velocity determined by Doppler. The proportion of diagnostic images in the triggered mode was highest (73.3%) within a 30-msec interval before the peak flow. In healthy subjects the aortoiliac segment is reliably visualized by FLAG MR angiography. The optimum results are achieved using the triggered acquisition mode and timing acquisition to the initial 180 msec of the abdominal aortic systolic flow pulse.

19.
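An F0 model of Chinese prosodic words based on hidden Markov models   Cited by: 3 (self-citations: 1, citations by others: 2)
A statistical F0 model of Chinese prosodic words based on hidden Markov models (HMMs) is proposed. The model captures the mapping between prosodic context and F0 contour parameters: it can be used to estimate how well a stretch of F0 contour matches a stretch of text, and it can also generate the corresponding F0 contour from text. The method uses the HMM as its basic framework and therefore inherits the advantages that HMM theory provides. Taking prosodic units as the model units also allows the model to reflect continuous tone sandhi at the level of the prosodic hierarchy. Finally, experimental results are given and the prospects for applying the model are discussed.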
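Scoring how well a contour matches a text with an HMM usually comes down to computing the likelihood of the observations under the model built for that text. The sketch below shows the standard forward algorithm on a deliberately simplified case with discrete observation symbols; the abstract does not describe the paper's state topology or output distributions, so the two-state model and every probability in it are hypothetical.

```python
def forward_likelihood(initial, transition, emission, observations):
    """Likelihood P(observations | model) via the forward algorithm.

    initial[s]        probability of starting in state s
    transition[p][s]  probability of moving from state p to state s
    emission[s][o]    probability that state s emits symbol o
    """
    n_states = len(initial)
    # Initialisation with the first observation.
    alpha = [initial[s] * emission[s][observations[0]] for s in range(n_states)]
    # Recursion over the remaining observations.
    for obs in observations[1:]:
        alpha = [
            sum(alpha[p] * transition[p][s] for p in range(n_states))
            * emission[s][obs]
            for s in range(n_states)
        ]
    # Termination: sum over all final states.
    return sum(alpha)


# Hypothetical two-state model; symbols 0 and 1 could stand for
# quantized "low" and "high" F0 values.
initial = [0.6, 0.4]
transition = [[0.7, 0.3], [0.4, 0.6]]
emission = [[0.5, 0.5], [0.1, 0.9]]
likelihood = forward_likelihood(initial, transition, emission, [0, 1])
```

Comparing this likelihood across the models associated with different candidate texts gives a relative match score; a model of real F0 parameters would use continuous output densities and log probabilities to avoid underflow on long sequences.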

20.
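Yang Yufang. Acta Acustica (《声学学报》), 1998, 23(2): 163-169
Based on the results of psychophysical experiments on discriminating the perceptual distance between syllables in sentences, the perceived prosodic structure of sentences is constructed using multidimensional scaling. On the basis of these structures, the paper discusses listeners' ability to use local prosodic cues to perceive the prosodic structure of a whole sentence, and the interface between syntax and phonetics in Chinese.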
