Similar literature
Found 20 similar articles (search time: 15 ms)
1.
Durations of the vocalic portions of speech are influenced by a large number of linguistic and nonlinguistic factors (e.g., stress and speaking rate). However, each factor affecting vowel duration may influence articulation in a unique manner. The present study examined the effects of stress and final-consonant voicing on the detailed structure of articulatory and acoustic patterns in consonant-vowel-consonant (CVC) utterances. Jaw movement trajectories and F1 trajectories were examined for a corpus of utterances differing in stress and final-consonant voicing. Jaw lowering and raising gestures were more rapid, longer in duration, and spatially more extensive for stressed versus unstressed utterances. At the acoustic level, stressed utterances showed more rapid initial F1 transitions and more extreme F1 steady-state frequencies than unstressed utterances. In contrast to the results obtained in the analysis of stress, decreases in vowel duration due to devoicing did not result in a reduction in the velocity or spatial extent of the articulatory gestures. Similarly, at the acoustic level, the reductions in formant transition slopes and steady-state frequencies demonstrated by the shorter, unstressed utterances did not occur for the shorter, voiceless utterances. The results demonstrate that stress-related and voicing-related changes in vowel duration are accomplished by separate and distinct changes in speech production with observable consequences at both the articulatory and acoustic levels.

2.
A significant body of evidence has accumulated indicating that vowel identification is influenced by spectral change patterns. For example, a large-scale study of vowel formant patterns showed substantial improvements in category separability when a pattern classifier was trained on multiple samples of the formant pattern rather than a single sample at steady state [J. Hillenbrand et al., J. Acoust. Soc. Am. 97, 3099-3111 (1995)]. However, in the earlier study all utterances were recorded in a constant /hVd/ environment. The main purpose of the present study was to determine whether a close relationship between vowel identity and spectral change patterns is maintained when the consonant environment is allowed to vary. Recordings were made of six men and six women producing eight vowels (see text) in isolation and in CVC syllables. The CVC utterances consisted of all combinations of seven initial consonants (/h,b,d,g,p,t,k/) and six final consonants (/b,d,g,p,t,k/). Formant frequencies for F1-F3 were measured every 5 ms during the vowel using an interactive editing tool. Results showed highly significant effects of phonetic environment. As with an earlier study of this type, particularly large shifts in formant patterns were seen for rounded vowels in alveolar environments [K. Stevens and A. House, J. Speech Hear. Res. 6, 111-128 (1963)]. Despite these context effects, substantial improvements in category separability were observed when a pattern classifier incorporated spectral change information. Modeling work showed that many aspects of listener behavior could be accounted for by a fairly simple pattern classifier incorporating F0, duration, and two discrete samples of the formant pattern.

3.
4.
Fundamental frequency (F0) and voice onset time (VOT) were measured in utterances containing voiceless aspirated [ph, th, kh], voiceless unaspirated [sp, st, sk], and voiced [b, d, g] stop consonants produced in the context of [i, e, u, o, a] by 8- to 9-year-old subjects. The results revealed that VOT reliably differentiated voiceless aspirated from voiceless unaspirated and voiced stops, whereas F0 significantly contrasted voiced with voiceless aspirated and unaspirated stops, except for the first glottal period, where voiceless unaspirated stops contrasted with the other two categories. Fundamental frequency consistently differentiated vowel height in alveolar and velar stop consonant environments only. In comparing the results of these children and of adults, it was observed that the acoustic correlates of stop consonant voicing and vowel quality were different not only in absolute values, but also in terms of variability. Further analyses suggested that children were more variable in production due to inconsistency in achieving specific targets. The findings also suggest that, of the acoustic correlates of the voicing feature, the primary distinction of VOT is strongly developed by 8-9 years of age, whereas the secondary distinction of F0 is still in an emerging state.

5.
Speech perception requires the integration of information from multiple phonetic and phonological dimensions. A sizable literature exists on the relationships between multiple phonetic dimensions and single phonological dimensions (e.g., spectral and temporal cues to stop consonant voicing). A much smaller body of work addresses relationships between phonological dimensions, and much of this has focused on sequences of phones. However, strong assumptions about the relevant set of acoustic cues and/or the (in)dependence between dimensions limit previous findings in important ways. Recent methodological developments in the general recognition theory framework enable tests of a number of these assumptions and provide a more complete model of distinct perceptual and decisional processes in speech sound identification. A hierarchical Bayesian Gaussian general recognition theory model was fit to data from two experiments investigating identification of English labial stop and fricative consonants in onset (syllable initial) and coda (syllable final) position. The results underscore the importance of distinguishing between conceptually distinct processing levels and indicate that, for individual subjects and at the group level, integration of phonological information is partially independent with respect to perception and that patterns of independence and interaction vary with syllable position.

6.
This study assessed the acoustic and perceptual effect of noise on vowel and stop-consonant spectra. Multi-talker babble and speech-shaped noise were added to vowel and stop stimuli at -5 to +10 dB S/N, and the effect of noise was quantified in terms of (a) spectral envelope differences between the noisy and clean spectra in three frequency bands, (b) presence of reliable F1 and F2 information in noise, and (c) changes in burst frequency and slope. Acoustic analysis indicated that F1 was detected more reliably than F2 and the largest spectral envelope differences between the noisy and clean vowel spectra occurred in the mid-frequency band. This finding suggests that in extremely noisy conditions listeners must be relying on relatively accurate F1 frequency information along with partial F2 information to identify vowels. Stop consonant recognition remained high even at -5 dB despite the disruption of burst cues due to additive noise, suggesting that listeners must be relying on other cues, perhaps formant transitions, to identify stops.

7.
8.
This paper describes acoustic cues for classification of consonant voicing in a distinctive feature-based speech recognition system. Initial acoustic cues are selected by studying consonant production mechanisms. Spectral representations, band-limited energies, and correlation values, along with Mel-frequency cepstral coefficient (MFCC) features, are also examined. Analysis of variance is performed to assess the relative significance of features. Overall, 82.2%, 80.6%, and 78.4% classification rates are obtained on the TIMIT database for stops, fricatives, and affricates, respectively. Combining acoustic parameters with MFCCs shows performance improvement in all cases. Also, performance on NTIMIT telephone-channel speech shows that acoustic parameters are more robust than MFCCs.

9.
10.
A model of the vocal-tract area function is described that consists of four tiers. The first tier is a vowel substrate defined by a system of spatial eigenmodes and a neutral area function determined from MRI-based vocal-tract data. The input parameters to the first tier are coefficient values that, when multiplied by the appropriate eigenmode and added to the neutral area function, construct a desired vowel. The second tier consists of a consonant shaping function defined along the length of the vocal tract that can be used to modify the vowel substrate such that a constriction is formed. Input parameters consist of the location, area, and range of the constriction. Location and area roughly correspond to the standard phonetic specifications of place and degree of constriction, whereas the range defines the amount of vocal-tract length over which the constriction will influence the tract shape. The third tier allows length modifications for articulatory maneuvers such as lip rounding/spreading and larynx lowering/raising. Finally, the fourth tier provides control of the level of acoustic coupling of the vocal tract to the nasal tract. All parameters can be specified either as static or time varying, which allows for multiple levels of coarticulation or coproduction.
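The first two tiers described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the toy section values, and in particular the Gaussian weighting used to form the constriction are hypothetical and are not taken from the paper, which does not give the shaping function in the abstract.

```python
import math


def vowel_substrate(neutral, modes, coeffs):
    """Tier 1: neutral area function plus a weighted sum of eigenmodes.

    neutral: list of areas along the tract (one value per section).
    modes:   list of eigenmodes, each the same length as `neutral`.
    coeffs:  one coefficient per eigenmode.
    """
    if len(modes) != len(coeffs):
        raise ValueError("need exactly one coefficient per eigenmode")
    area = list(neutral)
    for mode, q in zip(modes, coeffs):
        if len(mode) != len(neutral):
            raise ValueError("eigenmode length must match the neutral area function")
        for i, value in enumerate(mode):
            area[i] += q * value
    return area


def apply_constriction(area, location, target_area, extent):
    """Tier 2 (assumed form): pull the substrate toward `target_area`.

    The weight is a Gaussian centred on section index `location`;
    `extent` plays the role of the constriction "range". The Gaussian
    is an illustrative choice, not the paper's shaping function.
    """
    shaped = []
    for i, a in enumerate(area):
        w = math.exp(-((i - location) / extent) ** 2)
        shaped.append((1.0 - w) * a + w * target_area)
    return shaped


# Toy example: four tract sections, one eigenmode.
neutral = [2.0, 2.0, 2.0, 2.0]
modes = [[1.0, 0.0, -1.0, 0.0]]
vowel = vowel_substrate(neutral, modes, [0.5])
constricted = apply_constriction(vowel, location=2, target_area=0.1, extent=1.0)
```

Making any of the inputs a function of time (for example, recomputing `coeffs` or `target_area` per frame) would correspond to the time-varying parameters the abstract mentions.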

11.
Phonation threshold pressures were directly measured in five normal subjects in a variety of voicing conditions. The effects of fundamental frequency, intensity, closure speed of the vocal folds, and laryngeal airway resistance on phonation threshold pressures were determined. Subglottic air pressures were measured using percutaneous puncture of the cricothyroid membrane. Both onset and offset of phonation were studied to see if a hysteresis effect produced lower offset pressures than onset pressures. Univariate analysis showed that phonation threshold pressure was influenced most strongly by fundamental frequency and intensity. Multiple linear regression showed that these two variables, as well as laryngeal airway resistance, most strongly predicted phonation threshold pressure. Two of the five subjects demonstrated a significant hysteresis effect, but one subject actually had higher offset pressures than onset pressures.

12.
The purpose of this study was to examine the contribution of information provided by vowels versus consonants to sentence intelligibility in young normal-hearing (YNH) and typical elderly hearing-impaired (EHI) listeners. Sentences were presented in three conditions: unaltered, or with either the vowels or the consonants replaced with speech-shaped noise. Sentences from male and female talkers in the TIMIT database were selected. Baseline performance was established at a 70 dB SPL level using YNH listeners. Subsequently, EHI and YNH participants listened at 95 dB SPL. Participants listened to each sentence twice and were asked to repeat the entire sentence after each presentation. Words were scored correct if identified exactly. Average performance for unaltered sentences was greater than 94%. Overall, EHI listeners performed more poorly than YNH listeners. However, vowel-only sentences were always significantly more intelligible than consonant-only sentences, usually by a ratio of 2:1 across groups. In contrast to written English or words spoken in isolation, these results demonstrated that for spoken sentences, vowels carry more information about sentence intelligibility than consonants for both young normal-hearing and elderly hearing-impaired listeners.

13.
14.
The effect on gap detectability of varying noise fall time (FT) and rise time (RT) of the gap boundary ramps was examined in mice using reflex modification audiometry, measuring inhibition of acoustic startle reflexes by variously shaped gaps just preceding reflex expression. In experiment 1 (n = 12) inhibition increased up to near-asymptotic values with longer FT (0, 1, 2, 3, 5, or 10 ms) and QT (quiet time, 0 to 13 ms), with a 2:1 trade-off between FT and QT. In experiment 2 (n = 24) inhibition increased for any RT above 0 ms (2, 3, 5, or 7 ms) if QT = 1 ms, but diminished with increased RT when QT = 3 or 8 ms. Enhanced detectability for subthreshold gaps by longer ramps results from their extending the apparent gap duration. The negative effect of increased RT for threshold gaps suggests the importance for gap detection of the stronger neural responses to sharp edges at the end of the gap shown previously in the mouse inferior colliculus. These effects are specific to gaps: inhibition for fixed (70-dB SPL) or varied level pulses (30 to 60 dB) was unaffected by varying the ramped edges (experiments 3 and 4, n = 9).

15.
李亚东, 贾晓鹏, 颜丙敏, 陈宁, 房超, 李勇, 马红安. Chinese Physics B, 2016, 25(4): 48103-048103
This article investigates the effect of catalyst height on the morphology of diamond crystals grown by the temperature gradient growth (TGG) method under high pressure and high temperature (HPHT) conditions with a Ni-based catalyst. The experimental results show that the morphology of diamond changes from an octahedral shape to a cuboctahedral shape as the catalyst height rises. Moreover, the finite element method (FEM) is used to simulate the temperature field of the melted catalyst/solvent. The results show that the temperature at the location of the seed diamond continues to decrease as the catalyst height increases, which is conducive to changing the morphology of diamond. This work provides a new way to change the diamond crystal morphology.

16.
Vocal perturbation, harmonics-to-noise, and intensity measures were obtained for 10 subjects during three experimental tasks: (a) prolonged /a/, (b) /pa/ with vowel prolonged, and (c) same as (b) with subjects wearing a pneumotachographic mask and an oral pressure tube inserted between the lips. There were no statistically significant differences among the experimental conditions for any of the measures. The findings suggest that a single task may be used to obtain airflow, oral pressure, and acoustic measures of vocal performance. Observed differences in jitter and harmonics-to-noise means for the male and female speakers are discussed.

17.
Thresholds of vowel formant discrimination for F1 and F2 of isolated vowels with full and partial vowel spectra were measured for normal-hearing listeners at fixed and roving speech levels. Performance of formant discrimination was significantly better for fixed levels than for roving levels with both full and partial spectra. The effect of vowel spectral range was present only for roving levels, but not for fixed levels. These results, consistent with studies of profile analysis, indicated different perceptual mechanisms for listeners to discriminate vowel formant frequency at fixed and roving levels.

18.
A comparison of the available concepts on the effect of spontaneous polarization on the height of the Schottky barrier at the metal-ferroelectric contact with the corresponding experimental data has been used as a basis for setting up an alternative model of this phenomenon, which draws on the dependence of the electron work function of a ferroelectric on the magnitude and orientation of the spontaneous polarization vector.

19.
20.
The formation and propagation of a depression wave upon spontaneous contact of cold liquid and saturated vapor are investigated in a gas dynamics approach. Wave processes were modeled using Godunov’s method, which is based on solving the Riemann problem of the decay of an arbitrary discontinuity. The influence of various factors, namely, the kinetics of condensation and intensification of heat-transfer processes, on the pressure pulse form is investigated. Results of the investigation are in good agreement with results of modeling based on numerically solving the Boltzmann kinetic equation.
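The idea behind Godunov's method, solving a Riemann problem at every cell interface and using the resulting flux in a conservative update, can be sketched on a much simpler model. The example below uses the inviscid Burgers equation with periodic boundaries as a stand-in; it is an assumption made for illustration only and is not the gas dynamics system with condensation kinetics studied in the paper. All function names are hypothetical.

```python
def burgers_flux(u):
    """Physical flux f(u) = u^2 / 2 of the inviscid Burgers equation."""
    return 0.5 * u * u


def godunov_flux(u_left, u_right):
    """Flux at one interface from the exact Riemann solution for Burgers."""
    if u_left > u_right:
        # Shock: it moves with the average speed of the two states.
        speed = 0.5 * (u_left + u_right)
        return burgers_flux(u_left) if speed > 0 else burgers_flux(u_right)
    # Rarefaction fan (or constant state).
    if u_left > 0:
        return burgers_flux(u_left)
    if u_right < 0:
        return burgers_flux(u_right)
    return 0.0  # the fan straddles the interface, where u = 0


def godunov_step(u, dt, dx):
    """One conservative update of cell averages with periodic boundaries."""
    n = len(u)
    # flux[i] is the flux through the interface between cell i-1 and cell i;
    # u[-1] wraps around to the last cell, which gives periodicity.
    flux = [godunov_flux(u[i - 1], u[i]) for i in range(n)]
    return [u[i] - dt / dx * (flux[(i + 1) % n] - flux[i]) for i in range(n)]


# A step profile: a shock forms at the right edge of the high region and a
# rarefaction at its left edge. dt / dx * max|u| = 0.5 satisfies the CFL limit.
initial = [1.0, 1.0, 0.0, 0.0]
after_one_step = godunov_step(initial, dt=0.5, dx=1.0)
```

Because the update only moves flux between neighbouring cells, the total of the cell averages is conserved on a periodic grid, which is a quick sanity check for any implementation of this kind.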
