Found 20 similar articles (search time: 46 ms)
1.
Nieto-Castanon A Guenther FH Perkell JS Curtin HD 《The Journal of the Acoustical Society of America》2005,117(5):3196-3212
This paper investigates the functional relationship between articulatory variability and stability of acoustic cues during American English /r/ production. The analysis of articulatory movement data on seven subjects shows that the extent of intrasubject articulatory variability along any given articulatory direction is strongly and inversely related to a measure of acoustic stability (the extent of acoustic variation that displacing the articulators in this direction would produce). The presence and direction of this relationship is consistent with a speech motor control mechanism that uses a third formant frequency (F3) target; i.e., the final articulatory variability is lower for those articulatory directions most relevant to determining the F3 value. In contrast, no consistent relationship across speakers and phonetic contexts was found between hypothesized vocal-tract target variables and articulatory variability. Furthermore, simulations of two speakers' productions using the DIVA model of speech production, in conjunction with a novel speaker-specific vocal-tract model derived from magnetic resonance imaging data, mimic the observed range of articulatory gestures for each subject, while exhibiting the same articulatory/acoustic relations as those observed experimentally. Overall these results provide evidence for a common control scheme that utilizes an acoustic, rather than articulatory, target specification for American English /r/.
2.
F H Guenther C Y Espy-Wilson S E Boyce M L Matthies M Zandipour J S Perkell 《The Journal of the Acoustical Society of America》1999,105(5):2854-2865
The American English phoneme /r/ has long been associated with large amounts of articulatory variability during production. This paper investigates the hypothesis that the articulatory variations used by a speaker to produce /r/ in different contexts exhibit systematic tradeoffs, or articulatory trading relations, that act to maintain a relatively stable acoustic signal despite the large variations in vocal tract shape. Acoustic and articulatory recordings were collected from seven speakers producing /r/ in five phonetic contexts. For every speaker, the different articulator configurations used to produce /r/ in the different phonetic contexts showed systematic tradeoffs, as evidenced by significant correlations between the positions of transducers mounted on the tongue. Analysis of acoustic and articulatory variabilities revealed that these tradeoffs act to reduce acoustic variability, thus allowing relatively large contextual variations in vocal tract shape for /r/ without seriously degrading the primary acoustic cue. Furthermore, some subjects appeared to use completely different articulatory gestures to produce /r/ in different phonetic contexts. When viewed in light of current models of speech movement control, these results appear to favor models that utilize an acoustic or auditory target for each phoneme over models that utilize a vocal tract shape target for each phoneme.
3.
Espy-Wilson CY Boyce SE Jackson M Narayanan S Alwan A 《The Journal of the Acoustical Society of America》2000,108(1):343-356
Recent advances in physiological data collection methods have made it possible to test the accuracy of predictions against speaker-specific vocal tracts and acoustic patterns. Vocal tract dimensions for /r/ derived via magnetic-resonance imaging (MRI) for two speakers of American English [Alwan, Narayanan, and Haker, J. Acoust. Soc. Am. 101, 1078-1089 (1997)] were used to construct models of the acoustics of /r/. Because previous models have not sufficiently accounted for the very low F3 characteristic of /r/, the aim was to match formant frequencies predicted by the models to the full range of formant frequency values produced by the speakers in recordings of real words containing /r/. In one set of experiments, area functions derived from MRI data were used to argue that the Perturbation Theory of tube acoustics cannot adequately account for /r/, primarily because predicted locations did not match speakers' actual constriction locations. Different models of the acoustics of /r/ were tested using the Maeda computer simulation program [Maeda, Speech Commun. 1, 199-299 (1982)]; the supralingual vocal-tract dimensions reported in Alwan et al. were found to be adequate at predicting only the highest of attested F3 values. By using (1) a recently developed adaptation of the Maeda model that incorporates the sublingual space as a side branch from the front cavity, and by including (2) the sublingual space as an increment to the dimensions of the front cavity, the mid-to-low values of the speakers' F3 range were matched. Finally, a simple tube model with dimensions derived from MRI data was developed to account for cavity affiliations. This confirmed F3 as a front cavity resonance, and variations in F1, F2, and F4 as arising from mid- and back-cavity geometries. Possible trading relations for F3 lowering based on different acoustic mechanisms for extending the front cavity are also proposed.
4.
The purpose of this study is to test a methodology for describing the articulation of vowels. High front vowels are a test case because some theories suggest that high front vowels have little cross-linguistic variation. Acoustic studies appear to show counterexamples to these predictions, but purely acoustic studies are difficult to interpret because of the many-to-one relation between articulation and acoustics. In this study, vocal tract dimensions, including constriction degree and position, are measured from cinéradiographic and x-ray data on high front vowels from three different languages (North American English, French, and Mandarin Chinese). Statistical comparisons find several significant articulatory differences between North American English /i/ and Mandarin Chinese and French /i/. In particular, differences in constriction degree were found, but not constriction position. Articulatory synthesis is used to model the acoustic consequences of some of the significant articulatory differences, finding that the articulatory differences may have the acoustic consequences of making the latter languages' /i/ perceptually sharper by shifting the frequencies of F2 and F3 upwards. In addition, the vowel /y/ has specific articulations that differ from those for /i/, including a wider tongue constriction, and substantially different acoustic sensitivity functions for F2 and F3.
5.
Steiner I Richmond K Marshall I Gray CD 《The Journal of the Acoustical Society of America》2012,131(2):EL106-EL111
This paper announces the availability of the magnetic resonance imaging (MRI) subset of the mngu0 corpus, a collection of articulatory speech data from one speaker containing different modalities. This subset comprises volumetric MRI scans of the speaker's vocal tract during sustained production of vowels and consonants, as well as dynamic mid-sagittal scans of repetitive consonant-vowel (CV) syllable production. For reference, high-quality acoustic recordings of the speech material are also available. The raw data are made freely available for research purposes.
6.
Cho T 《The Journal of the Acoustical Society of America》2005,117(6):3867-3878
This study investigates the effects of accent and prosodic boundaries on the production of English vowels (/a, i/) by concurrently examining acoustic vowel formants and articulatory maxima of the tongue, jaw, and lips obtained with EMA (Electromagnetic Articulography). The results demonstrate that prosodic strengthening (due to accent and/or prosodic boundaries) has differential effects depending on the source of prominence (in accented syllables versus at edges of prosodic domains; domain initially versus domain finally). The results are interpreted in terms of how the prosodic strengthening is related to phonetic realization of vowel features. For example, when accented, /i/ was fronter in both acoustic and articulatory vowel spaces (enhancing [-back]), accompanied by an increase in both lip and jaw openings (enhancing sonority). By contrast, at edges of prosodic domains (especially domain-finally), /i/ was not necessarily fronter, but higher (enhancing [+high]), accompanied by an increase only in the lip (not jaw) opening. This suggests that the two aspects of prosodic structure (accent versus boundary) are differentiated by distinct phonetic patterns. Further, it implies that prosodic strengthening, though manifested in fine-grained phonetic details, is not simply a low-level phonetic event but a complex linguistic phenomenon, closely linked to the enhancement of phonological features and positional strength that may license phonological contrasts.
7.
Pruitt JS Jenkins JJ Strange W 《The Journal of the Acoustical Society of America》2006,119(3):1684-1696
Perception of second language speech sounds is influenced by one's first language. For example, speakers of American English have difficulty perceiving dental versus retroflex stop consonants in Hindi although English has both dental and retroflex allophones of alveolar stops. Japanese, unlike English, has a contrast similar to Hindi, specifically, the Japanese /d/ versus the flapped /r/ which is sometimes produced as a retroflex. This study compared American and Japanese speakers' identification of the Hindi contrast in CV syllable contexts where C varied in voicing and aspiration. The study then evaluated the participants' increase in identifying the distinction after training with a computer-interactive program. Training sessions progressively increased in difficulty by decreasing the extent of vowel truncation in stimuli and by adding new speakers. Although all participants improved significantly, Japanese participants were more accurate than Americans in distinguishing the contrast on pretest, during training, and on posttest. Transfer was observed to three new consonantal contexts, a new vowel context, and a new speaker's productions. Some abstract aspect of the contrast was apparently learned during training. It is suggested that allophonic experience with dental and retroflex stops may be detrimental to perception of the new contrast.
8.
Purpose
The purpose of this study was to compare histologically determined cellularity and extracellular space to dynamic contrast-enhanced magnetic resonance imaging (DCE MRI)-based maps of a two-compartment model's parameters describing tumor contrast agent extravasation, specifically tumor extravascular extracellular space (EES) volume fraction (ve), tumor plasma volume fraction (vp) and volume-normalized contrast agent transfer rate between tumor plasma and interstitium (KTRANS/VT).
Materials and Methods
The ve, vp and KTRANS/VT maps were estimated from gadolinium diethylenetriamine penta-acetic acid DCE T1-weighted gradient-echo images at resolutions of 469, 938 and 2500 μm. These parameter maps were compared at each resolution to histologically determined tumor type, and the high-resolution 469-μm maps were compared with automated cell counting using Otsu's method and a color-thresholding method for estimated intracellular (Vintracellular) and extracellular (Vextracellular) space fractions.
Results
The top five KTRANS/VT values obtained from each tumor at 469 and 938 μm resolutions are significantly different from those obtained at 2500 μm (P<.0001) and from one another (P=.0014). Using these top five KTRANS/VT values and the corresponding tumor EES volume fractions ve, we can statistically differentiate invasive ductal carcinomas from noninvasive papillary carcinomas for the 469- and 938-μm resolutions (P=.0017 and P=.0047, respectively), but not for the 2500-μm resolution (P=.9008). The color-thresholding method demonstrated that ve measured by DCE MRI is statistically similar to histologically determined EES. The Vextracellular obtained from the color-thresholding method was statistically similar to the ve measured with DCE MRI for the top 10 KTRANS/VT values (P>.05). DCE MRI-based KTRANS/VT estimates are not statistically correlated with histologically determined cellularity.
Conclusion
DCE MRI estimates of tumor physiology are a limited representation of tumor histological features. Extracellular spaces measured by both DCE MRI and microscopic analysis are statistically similar. Tumor typing by DCE MRI is spatial resolution dependent, as lower resolutions average out contributions to voxel-based estimates of KTRANS/VT. Thus, an appropriate resolution window is essential for DCE MRI tumor diagnosis. Within this resolution window, the top KTRANS/VT values with corresponding ve are diagnostic for the tumor types analyzed in this study.
9.
The production of the lateral sounds involves airflow paths around the tongue produced by the laterally inward movement of the tongue toward the midsagittal plane. If contact is made with the palate, a closure is formed in the flow path along the midsagittal line. The effects of the lateral channels on the sound spectrum are not clear. In this study, a vocal-tract model with parallel lateral channels and a supralingual cavity was developed. Analysis shows that the lateral channels with dimensions derived from magnetic resonance images of an American English /l/ are able to produce a pole-zero pair in the frequency range of 2-5 kHz. This pole-zero pair, together with an additional pole-zero pair due to the supralingual cavity, results in a low-amplitude and relatively flat spectral shape in the F3-F5 frequency region of the /l/ sound spectrum.
10.
Story BH 《The Journal of the Acoustical Society of America》2008,123(1):327-335
A new set of area functions for vowels has been obtained with magnetic resonance imaging from the same speaker as that previously reported in 1996 [Story et al., J. Acoust. Soc. Am. 100, 537-554 (1996)]. The new area functions were derived from image data collected in 2002, whereas the previously reported area functions were based on magnetic resonance images obtained in 1994. When compared, the new area function sets indicated a tendency toward a constricted pharyngeal region and expanded oral cavity relative to the previous set. Based on calculated formant frequencies and sensitivity functions, these morphological differences were shown to have the primary acoustic effect of systematically shifting the second formant (F2) downward in frequency. Multiple instances of target vocal tract shapes from a specific speaker provide additional sampling of the possible area functions that may be produced during speech production. This may be of benefit for understanding intraspeaker variability in vowel production and for further development of speech synthesizers and speech models that utilize area function information.
11.
The Urey-Bradley force constants of the fluorosulphate radical in the ground ²A₂ and the excited ²E electronic states and the fluorosulphate anion in the ground ¹A₁ electronic state were calculated using published fundamental frequencies. The analysis was carried out within Wilson's FG formalism and the constants were evaluated by a computer program based on the least-squares-fit method. The normal coordinates and the potential energy distributions were also determined. Results support the assignments of the fundamental frequencies; the ground state values for the radical have so far been obtained only from the analysis of its electronic spectrum.
12.
13.
Optical methods of recording ESR, which were developed in the early 1950s to record magnetic resonance of excited atoms, are extensively used at present in investigations of ESR of the ground and excited states of atoms and paramagnetic centers in condensed media [1]. Attention is called in the present communication to additional capabilities of optical ESR and of paramagnetic relaxation methods, which are realizable through the use of laser sources.
S. I. Vavilov State Optical Institute, Leningrad. Translated from Primenenie Lazerov v Atomnoi, Molekulyarnoi i Yadernoi Fiziki — Trudy II Vsesoyuznoi Shkoly, pp. 3–11, 1981.
14.
Karl J. Saldanha Ryan P. Doan Kristy M. Ainslie Tejal A. Desai Sharmila Majumdar 《Magnetic resonance imaging》2011,29(1):40-49
Purpose
To examine mesenchymal stem cell (MSC) labeling with micrometer-sized iron oxide particles (MPIOs) for magnetic resonance imaging (MRI)-based tracking and its application to monitoring articular cartilage regeneration.
Methods
Rabbit MSCs were labeled using commercial MPIOs. In vitro MRI was performed with gradient echo (GRE) and spin echo (SE) sequences at 3T and quantitatively characterized using line profile and region of interest analysis. Ex vivo MRI of hydrogel-encapsulated labeled MSCs implanted within a bovine knee was performed with spoiled GRE (SPGR) and T1ρ sequences. Fluorescence microscopy, labeling efficiency, and chondrogenesis of MPIO-labeled cells were also examined.
Results
MPIO labeling results in efficient contrast uptake and signal loss that can be visualized and quantitatively characterized via MRI. SPGR imaging of implanted cells results in ex vivo detection within native tissue, and T1ρ imaging is unaffected by the presence of labeled cells immediately following implantation. MPIO labeling does not affect quantitative glycosaminoglycan production during chondrogenesis, but iron aggregation hinders extracellular matrix visualization. This aggregation may result from excess unincorporated particles following labeling and is an issue that necessitates further investigation.
Conclusion
This study demonstrates the promise of MPIO labeling for monitoring cartilage regeneration and highlights its potential in the development of cell-based tissue engineering strategies.
15.
Stavness I Gick B Derrick D Fels S 《The Journal of the Acoustical Society of America》2012,131(5):EL355-EL360
This study reports an investigation of the well-known context-dependent variation in English /r/ using a biomechanical tongue-jaw-hyoid model. The simulation results show that preferred /r/ variants require less volume displacement, relative strain, and relative muscle stress than variants that are not preferred. This study also uncovers a previously unknown mechanism in tongue biomechanics for /r/ production: Torque in the sagittal plane about the mental spine. This torque enables raising of the tongue anterior for retroflexed [Symbol: see text] by activation of hyoglossus and relaxation of anterior genioglossus. The results provide a deeper understanding of the articulatory factors that govern contextual phonetic variation.
16.
The perceptual effects of orthogonal variations in two acoustic parameters which differentiate American English prevocalic /r/ and /l/ were examined. A spectral cue (frequency onset and transition of F2 and F3) and a temporal cue (relative duration of initial steady state and transition of F1) were varied in synthetic versions of "rock" and "lock." Four temporal variations in each of ten stimuli of a spectral-cue continuum were generated. Phonetic identification and oddity discrimination tasks with the four series showed systematic displacement of perceptual boundaries and discrimination peaks, thus reflecting a trading relation between the two cues. The perceptual equivalence of spectral and temporal cues was investigated by comparing the accuracy of discrimination of three types of stimulus comparisons: phonetically facilitating two-cue pairs, one-cue pairs, and phonetically conflicting two-cue pairs. As predicted, discrimination accuracy was ordered: Facilitating cues greater than one-cue greater than conflicting cues, indicating that perceivers discriminated on the basis of an integrated phonetic percept.
17.
A langatate crystal was studied using the nuclear magnetic resonance method. The temperature dependence of the spin-lattice relaxation rate of 71Ga nuclei was measured in a single-crystal sample in the range 294–500 K. It was shown that the relaxation rate depends linearly on the square of the temperature. The shape of the powder spectrum obtained under static conditions was found to correspond to large values of the quadrupole coupling constant of gallium nuclei. The measurements of the powder spectra obtained upon magic-angle spinning made it possible to estimate the quadrupole coupling constant for gallium in the tetrahedral and octahedral oxygen coordinations.
18.
Magnetic resonance imaging (MRI) enables non-invasive analysis of the human vocal tract during phonation. Creation of MR images of the vocal tract is accompanied by simultaneous recording of the produced speech. The paper analyzes and compares spectral properties of the acoustic noise produced by mechanical vibration of the gradient coils during scanning in open-air MRI equipment working in a weak magnetic field (B0 up to 0.2 T). This noise has a harmonic character, so its properties are well suited to analysis in the spectral domain. The results of the spectral analysis will be used to devise a new cepstral-based filtering method for noise suppression in the recorded speech.
19.
W V Summers 《The Journal of the Acoustical Society of America》1987,82(3):847-863
Durations of the vocalic portions of speech are influenced by a large number of linguistic and nonlinguistic factors (e.g., stress and speaking rate). However, each factor affecting vowel duration may influence articulation in a unique manner. The present study examined the effects of stress and final-consonant voicing on the detailed structure of articulatory and acoustic patterns in consonant-vowel-consonant (CVC) utterances. Jaw movement trajectories and F1 trajectories were examined for a corpus of utterances differing in stress and final-consonant voicing. Jaw lowering and raising gestures were more rapid, longer in duration, and spatially more extensive for stressed versus unstressed utterances. At the acoustic level, stressed utterances showed more rapid initial F1 transitions and more extreme F1 steady-state frequencies than unstressed utterances. In contrast to the results obtained in the analysis of stress, decreases in vowel duration due to devoicing did not result in a reduction in the velocity or spatial extent of the articulatory gestures. Similarly, at the acoustic level, the reductions in formant transition slopes and steady-state frequencies demonstrated by the shorter, unstressed utterances did not occur for the shorter, voiceless utterances. The results demonstrate that stress-related and voicing-related changes in vowel duration are accomplished by separate and distinct changes in speech production with observable consequences at both the articulatory and acoustic levels.
20.
Iverson P Hazan V Bannister K 《The Journal of the Acoustical Society of America》2005,118(5):3267-3278
Recent work [Iverson et al. (2003) Cognition, 87, B47-57] has suggested that Japanese adults have difficulty learning English /r/ and /l/ because they are overly sensitive to acoustic cues that are not reliable for /r/-/l/ categorization (e.g., F2 frequency). This study investigated whether cue weightings are altered by auditory training, and compared the effectiveness of different training techniques. Separate groups of subjects received High Variability Phonetic Training (natural words from multiple talkers), and 3 techniques in which the natural recordings were altered via signal processing (All Enhancement, with F3 contrast maximized and closure duration lengthened; Perceptual Fading, with F3 enhancement reduced during training; and Secondary Cue Variability, with variation in F2 and durations increased during training). The results demonstrated that all of the training techniques improved /r/-/l/ identification by Japanese listeners, but there were no differences between the techniques. Training also altered the use of secondary acoustic cues; listeners became biased to identify stimuli as English /l/ when the cues made them similar to the Japanese /r/ category, and reduced their use of secondary acoustic cues for stimuli that were dissimilar to Japanese /r/. The results suggest that both category assimilation and perceptual interference affect English /r/ and /l/ acquisition.