期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Molis MR 《The Journal of the Acoustical Society of America》2005,118(2):1062-1071

相似文献

2.

Effects of consonantal context on perceptual assimilation of American English vowels by Japanese listeners

Strange W Akahane-Yamada R Kubo R Trent SA Nishi K 《The Journal of the Acoustical Society of America》2001,109(4):1691-1704

This study investigated the extent to which adult Japanese listeners' perceived phonetic similarity of American English (AE) and Japanese (J) vowels varied with consonantal context. Four AE speakers produced multiple instances of the 11 AE vowels in six syllabic contexts /b-b, b-p, d-d, d-t, g-g, g-k/ embedded in a short carrier sentence. Twenty-four native speakers of Japanese were asked to categorize each vowel utterance as most similar to one of 18 Japanese categories [five one-mora vowels, five two-mora vowels, plus/ei, ou/ and one-mora and two-mora vowels in palatalized consonant CV syllables, C(j)a(a), C(j)u(u), C(j)o(o)]. They then rated the "category goodness" of the AE vowel to the selected Japanese category on a seven-point scale. None of the 11 AE vowels was assimilated unanimously to a single J response category in all context/speaker conditions; consistency in selecting a single response category ranged from 77% for /eI/ to only 32% for /ae/. Median ratings of category goodness for modal response categories were somewhat restricted overall, ranging from 5 to 3. Results indicated that temporal assimilation patterns (judged similarity to one-mora versus two-mora Japanese categories) differed as a function of the voicing of the final consonant, especially for the AE vowels, /see text/. Patterns of spectral assimilation (judged similarity to the five J vowel qualities) of /see text/ also varied systematically with consonantal context and speakers. On the basis of these results, it was predicted that relative difficulty in the identification and discrimination of AE vowels by Japanese speakers would vary significantly as a function of the contexts in which they were produced and presented. 相似文献

3.

A study of jaw coarticulatory resistance and aggressiveness for Catalan consonants and vowels

D Recasens 《The Journal of the Acoustical Society of America》2012,132(1):412-420

The goal of this study is to investigate coarticulatory resistance and aggressiveness for the jaw in Catalan consonants and vowels and, more specifically, for the alveolopalatal nasal //[symbol see text]/ and for dark /l/ for which there is little or no data on jaw position and coarticulation. Jaw movement data for symmetrical vowel-consonant-vowel sequences with the consonants /p, n, l, s, ∫, [ symbol see text], k/ and the vowels /i, a, u/ were recorded by three Catalan speakers with a midsagittal magnetometer. Data reveal that jaw height is greater for /s, ∫/ than for /p, [see text]/, which is greater than for /n, l, k/ during the consonant, and for /i, u/ than for /a/ during the vowel. Differences in coarticulatory variability among consonants and vowels are inversely related to differences in jaw height, i.e., fricatives and high vowels are most resistant, and /n, l, k/ and the low vowel are least resistant. Moreover, coarticulation resistant phonetic segments exert more prominent effects and, thus, are more aggressive than segments specified for a lower degree of coarticulatory resistance. Data are discussed in the light of the degree of articulatory constraint model of coarticulation. 相似文献

4.

Context-specific acoustic differences between Peruvian and Iberian Spanish vowels

Chládková K Escudero P Boersma P 《The Journal of the Acoustical Society of America》2011,130(1):416-428

This paper examines four acoustic properties (duration F0, F1, and F2) of the monophthongal vowels of Iberian Spanish (IS) from Madrid and Peruvian Spanish (PS) from Lima in various consonantal contexts (/s/, /f/, /t/, /p/, and /k/) and in various phrasal contexts (in isolated words and sentence-internally). Acoustic measurements on 39 speakers, balanced by dialect and gender, can be generalized to the following differences between the two dialects. The vowel /a/ has a lower first formant in PS than in IS by 6.3%. The vowels /e/ and /o/ have more peripheral second-formant (F2) values in PS than in IS by about 4%. The consonant /s/ causes more centralization of the F2 of neighboring vowels in IS than in PS. No dialectal differences are found for the effect of phrasal context. Next to the between-dialect differences in the vowels, the present study finds that /s/ has a higher spectral center of gravity in PS than in IS by about 10%, that PS speakers speak slower than IS speakers by about 9%, and that Spanish-speaking women speak slower than Spanish-speaking men by about 5% (irrespective of dialect). 相似文献

5.

Cross-linguistic studies of children's and adults' vowel spaces

Chung H Kong EJ Edwards J Weismer G Fourakis M Hwang Y 《The Journal of the Acoustical Society of America》2012,131(1):442-454

This study examines cross-linguistic variation in the location of shared vowels in the vowel space across five languages (Cantonese, American English, Greek, Japanese, and Korean) and three age groups (2-year-olds, 5-year-olds, and adults). The vowels /a/, /i/, and /u/ were elicited in familiar words using a word repetition task. The productions of target words were recorded and transcribed by native speakers of each language. For correctly produced vowels, first and second formant frequencies were measured. In order to remove the effect of vocal tract size on these measurements, a normalization approach that calculates distance and angular displacement from the speaker centroid was adopted. Language-specific differences in the location of shared vowels in the formant values as well as the shape of the vowel spaces were observed for both adults and children. 相似文献

6.

Production of the word-final English /t/-/d/ contrast by native speakers of English, Mandarin, and Spanish.

J E Flege M J Munro L Skelton 《The Journal of the Acoustical Society of America》1992,92(1):128-143

The primary aim of this study was to determine if adults whose native language permits neither voiced nor voiceless stops to occur in word-final position can master the English word-final /t/-/d/ contrast. Native English-speaking listeners identified the voicing feature in word-final stops produced by talkers in five groups: native speakers of English, experienced and inexperienced native Spanish speakers of English, and experienced and inexperienced native Mandarin speakers of English. Contrary to hypothesis, the experienced second language (L2) learners' stops were not identified significantly better than stops produced by the inexperienced L2 learners; and their stops were correctly identified significantly less often than stops produced by the native English speakers. Acoustic analyses revealed that the native English speakers made vowels significantly longer before /d/ than /t/, produced /t/-final words with a higher F1 offset frequency than /d/-final words, produced more closure voicing in /d/ than /t/, and sustained closure longer for /t/ than /d/. The L2 learners produced the same kinds of acoustic differences between /t/ and /d/, but theirs were usually of significantly smaller magnitude. Taken together, the results suggest that only a few of the 40 L2 learners examined in the present study had mastered the English word-final /t/-/d/ contrast. Several possible explanations for this negative finding are presented. Multiple regression analyses revealed that the native English listeners made perceptual use of the small, albeit significant, vowel duration differences produced in minimal pairs by the nonnative speakers. A significantly stronger correlation existed between vowel duration differences and the listeners' identifications of final stops in minimal pairs when the perceptual judgments were obtained in an "edited" condition (where post-vocalic cues were removed) than in a "full cue" condition. This suggested that listeners may modify their identification of stops based on the availability of acoustic cues. 相似文献

7.

Acoustic and perceptual similarity of Japanese and American English vowels

Nishi K Strange W Akahane-Yamada R Kubo R Trent-Brown SA 《The Journal of the Acoustical Society of America》2008,124(1):576-588

Acoustic and perceptual similarities between Japanese and American English (AE) vowels were investigated in two studies. In study 1, a series of discriminant analyses were performed to determine acoustic similarities between Japanese and AE vowels, each spoken by four native male speakers using F1, F2, and vocalic duration as input parameters. In study 2, the Japanese vowels were presented to native AE listeners in a perceptual assimilation task, in which the listeners categorized each Japanese vowel token as most similar to an AE category and rated its goodness as an exemplar of the chosen AE category. Results showed that the majority of AE listeners assimilated all Japanese vowels into long AE categories, apparently ignoring temporal differences between 1- and 2-mora Japanese vowels. In addition, not all perceptual assimilation patterns reflected context-specific spectral similarity patterns established by discriminant analysis. It was hypothesized that this incongruity between acoustic and perceptual similarity may be due to differences in distributional characteristics of native and non-native vowel categories that affect the listeners' perceptual judgments. 相似文献

8.

Multimodal Standardization of Voice Among Four Multicultural Populations Formant Structures

Mary V. Andrianopoulos Keith Darrow Jie Chen 《Journal of voice》2001,15(1):61-77

A stratified random sample of 20 males and 20 females matched for physiologic factors and cultural-linguistic markers was examined to determine differences in formant frequencies during prolongation of three vowels: [a], [i], and [u]. The ethnic and gender breakdown included four sets of 5 male and 5 female subjects comprised of Caucasian and African American speakers of Standard American English, native Hindi Indian speakers, and native Mandarin Chinese speakers. Acoustic measures were analyzed using the Computerized Speech Lab (4300B) from which formant histories were extracted from a 200-ms sample of each vowel token to obtain first formant (F1), second formant (F2), and third formant (F3) frequencies. Significant group differences for the main effect of culture and race were found. For the main effect gender, sexual dimorphism in vowel formants was evidenced for all cultures and races across all three vowels. The acoustic differences found are attributed to cultural-linguistic factors. 相似文献

9.

Acoustic correlates of the front/back vowel distinction: a comparison of transition onset versus "steady state"

H M Sussman 《The Journal of the Acoustical Society of America》1990,88(1):87-96

This study investigated whether F2 and F3 transition onsets could encode the vowel place feature as well as F2 and F3 "steady-state" measures [Syrdal and Gopal, J. Acoust. Soc. Am. 79, 1086-1100 (1986)]. Multiple comparisons were made using (a) scatterplots in multidimensional space, (b) critical band differences, and (c) linear discriminant functional analyses. Four adult male speakers produced /b/(v)/t/, /d/(v)/t/, and /g/(v)/t/ tokens with medial vowel contexts /i,I, E, ey, ae, a, v, c, o, u/. Each token was repeated in a random order five times, yielding a total of 150 tokens per subject. Formant measurements were taken at four loci: F2 onset, F2 vowel, F3 onset, and F3 vowel. Onset points coincided with the first glottal pulse following the release burst and steady-state measures were taken approximately 60-70 ms post-onset. Graphic analyses revealed two distinct, minimally overlapping subsets grouped by front versus back. This dichotomous grouping was also seen in two-dimensional displays using only "onset" data as coordinates. Conversion to a critical band (bark) scale confirmed that front vowels were characterized by F3-F2 bark differences within a critical 3-bark distance, while back vowels exceeded the 3-bark critical distance. Using the critical distance metric onset values categorized front vowels as well as steady-state measures, but showed a 20% error rate for back vowels. Front vowels had less variability than back vowels. Statistical separability was quantified with linear discriminant function analysis. Percent correct classification into vowel place groups was 87.5% using F2 and F3 onsets as input variables, and 95.7% using F2 and F3 vowel. Acoustic correlates of the vowel place feature are already present at second and third formant transition onsets. 相似文献

10.

A study of high front vowels with articulatory data and acoustic simulations

Jackson MT McGowan RS 《The Journal of the Acoustical Society of America》2012,131(4):3017-3035

The purpose of this study is to test a methodology for describing the articulation of vowels. High front vowels are a test case because some theories suggest that high front vowels have little cross-linguistic variation. Acoustic studies appear to show counterexamples to these predictions, but purely acoustic studies are difficult to interpret because of the many-to-one relation between articulation and acoustics. In this study, vocal tract dimensions, including constriction degree and position, are measured from cinéradiographic and x-ray data on high front vowels from three different languages (North American English, French, and Mandarin Chinese). Statistical comparisons find several significant articulatory differences between North American English /i/ and Mandarin Chinese and French /i/. In particular, differences in constriction degree were found, but not constriction position. Articulatory synthesis is used to model the acoustic consequences of some of the significant articulatory differences, finding that the articulatory differences may have the acoustic consequences of making the latter languages' /i/ perceptually sharper by shifting the frequencies of F(2) and F(3) upwards. In addition, the vowel /y/ has specific articulations that differ from those for /i/, including a wider tongue constriction, and substantially different acoustic sensitivity functions for F(2) and F(3). 相似文献

11.

An acoustic description of the vowels of Northern and Southern Standard Dutch

Adank P van Hout R Smits R 《The Journal of the Acoustical Society of America》2004,116(3):1729-1738

A database is presented of measurements of the fundamental frequency, the frequencies of the first three formants, and the duration of the 15 vowels of Standard Dutch as spoken in the Netherlands (Northern Standard Dutch) and in Belgium (Southern Standard Dutch). The speech material consisted of read monosyllabic utterances in a neutral consonantal context (i.e., /sVs/). Recordings were made for 20 female talkers and 20 male talkers, who were stratified for the factors age, gender, and region. Of the 40 talkers, 20 spoke Northern Standard Dutch and 20 spoke Southern Standard Dutch. The results indicated that the nine monophthongal Dutch vowels /a [see symbol in text] epsilon i I [see symbol in text] u y Y/ can be separated fairly well given their steady-state characteristics, while the long mid vowels /e o ?/ and three diphthongal vowels /epsilon I [see symbol in text]u oey/ also require information about their dynamic characteristics. The analysis of the formant values indicated that Northern Standard Dutch and Southern Standard Dutch differ little in the formant frequencies at steady-state for the nine monophthongal vowels. Larger differences between these two language varieties were found for the dynamic specifications of the three long mid vowels, and, to a lesser extent, of the three diphthongal vowels. 相似文献

12.

Acoustic characteristics of English lexical stress produced by native Mandarin speakers

Zhang Y Nissen SL Francis AL 《The Journal of the Acoustical Society of America》2008,123(6):4498-4513

Native speakers of Mandarin Chinese have difficulty producing native-like English stress contrasts. Acoustically, English lexical stress is multidimensional, involving manipulation of fundamental frequency (F0), duration, intensity and vowel quality. Errors in any or all of these correlates could interfere with perception of the stress contrast, but it is unknown which correlates are most problematic for Mandarin speakers. This study compares the use of these correlates in the production of lexical stress contrasts by 10 Mandarin and 10 native English speakers. Results showed that Mandarin speakers produced significantly less native-like stress patterns, although they did use all four acoustic correlates to distinguish stressed from unstressed syllables. Mandarin and English speakers' use of amplitude and duration were comparable for both stressed and unstressed syllables, but Mandarin speakers produced stressed syllables with a higher F0 than English speakers. There were also significant differences in formant patterns across groups, such that Mandarin speakers produced English-like vowel reduction in certain unstressed syllables, but not in others. Results suggest that Mandarin speakers' production of lexical stress contrasts in English is influenced partly by native-language experience with Mandarin lexical tones, and partly by similarities and differences between Mandarin and English vowel inventories. 相似文献

13.

Formant discrimination in noise for isolated vowels

Liu C Kewley-Port D 《The Journal of the Acoustical Society of America》2004,116(5):3119-3129

Formant discrimination for isolated vowels presented in noise was investigated for normal-hearing listeners. Discrimination thresholds for F1 and F2, for the seven American English vowels /i, I, epsilon, ae, [symbol see text], a, u/, were measured under two types of noise, long-term speech-shaped noise (LTSS) and multitalker babble, and also under quiet listening conditions. Signal-to-noise ratios (SNR) varied from -4 to +4 dB in steps of 2 dB. All three factors, formant frequency, signal-to-noise ratio, and noise type, had significant effects on vowel formant discrimination. Significant interactions among the three factors showed that threshold-frequency functions depended on SNR and noise type. The thresholds at the lowest levels of SNR were highly elevated by a factor of about 3 compared to those in quiet. The masking functions (threshold vs SNR) were well described by a negative exponential over F1 and F2 for both LTSS and babble noise. Speech-shaped noise was a slightly more effective masker than multitalker babble, presumably reflecting small benefits (1.5 dB) due to the temporal variation of the babble. 相似文献

14.

Target spectral, dynamic spectral, and duration cues in infant perception of German vowels

Bohn OS Polka L 《The Journal of the Acoustical Society of America》2001,110(1):504-515

Previous studies of vowel perception have shown that adult speakers of American English and of North German identify native vowels by exploiting at least three types of acoustic information contained in consonant-vowel-consonant (CVC) syllables: target spectral information reflecting the articulatory target of the vowel, dynamic spectral information reflecting CV- and -VC coarticulation, and duration information. The present study examined the contribution of each of these three types of information to vowel perception in prelingual infants and adults using a discrimination task. Experiment 1 examined German adults' discrimination of four German vowel contrasts (see text), originally produced in /dVt/ syllables, in eight experimental conditions in which the type of vowel information was manipulated. Experiment 2 examined German-learning infants' discrimination of the same vowel contrasts using a comparable procedure. The results show that German adults and German-learning infants appear able to use either dynamic spectral information or target spectral information to discriminate contrasting vowels. With respect to duration information, the removal of this cue selectively affected the discriminability of two of the vowel contrasts for adults. However, for infants, removal of contrastive duration information had a larger effect on the discrimination of all contrasts tested. 相似文献

15.

Acoustic and perceptual similarity of North German and American English vowels

Strange W Bohn OS Trent SA Nishi K 《The Journal of the Acoustical Society of America》2004,115(4):1791-1807

Current theories of cross-language speech perception claim that patterns of perceptual assimilation of non-native segments to native categories predict relative difficulties in learning to perceive (and produce) non-native phones. Cross-language spectral similarity of North German (NG) and American English (AE) vowels produced in isolated hVC(a) (di)syllables (study 1) and in hVC syllables embedded in a short sentence (study 2) was determined by discriminant analyses, to examine the extent to which acoustic similarity was predictive of perceptual similarity patterns. The perceptual assimilation of NG vowels to native AE vowel categories by AE listeners with no German language experience was then assessed directly. Both studies showed that acoustic similarity of AE and NG vowels did not always predict perceptual similarity, especially for "new" NG front rounded vowels and for "similar" NG front and back mid and mid-low vowels. Both acoustic and perceptual similarity of NG and AE vowels varied as a function of the prosodic context, although vowel duration differences did not affect perceptual assimilation patterns. When duration and spectral similarity were in conflict, AE listeners assimilated vowels on the basis of spectral similarity in both prosodic contexts. 相似文献

16.

Across-talker effects on non-native listeners' vowel perception in noise

Bent T Kewley-Port D Ferguson SH 《The Journal of the Acoustical Society of America》2010,128(5):3142-3151

This study explored how across-talker differences influence non-native vowel perception. American English (AE) and Korean listeners were presented with recordings of 10 AE vowels in /bVd/ context. The stimuli were mixed with noise and presented for identification in a 10-alternative forced-choice task. The two listener groups heard recordings of the vowels produced by 10 talkers at three signal-to-noise ratios. Overall the AE listeners identified the vowels 22% more accurately than the Korean listeners. There was a wide range of identification accuracy scores across talkers for both AE and Korean listeners. At each signal-to-noise ratio, the across-talker intelligibility scores were highly correlated for AE and Korean listeners. Acoustic analysis was conducted for 2 vowel pairs that exhibited variable accuracy across talkers for Korean listeners but high identification accuracy for AE listeners. Results demonstrated that Korean listeners' error patterns for these four vowels were strongly influenced by variability in vowel production that was within the normal range for AE talkers. These results suggest that non-native listeners are strongly influenced by across-talker variability perhaps because of the difficulty they have forming native-like vowel categories. 相似文献

17.

Effects of masking noise on vowel and sibilant contrasts in normal-hearing speakers and postlingually deafened cochlear implant users

Perkell JS Denny M Lane H Guenther F Matthies ML Tiede M Vick J Zandipour M Burton E 《The Journal of the Acoustical Society of America》2007,121(1):505-518

The role of auditory feedback in speech production was investigated by examining speakers' phonemic contrasts produced under increases in the noise to signal ratio (N/S). Seven cochlear implant users and seven normal-hearing controls pronounced utterances containing the vowels /i/, /u/, /e/ and /ae/ and the sibilants /s/ and /I/ while hearing their speech mixed with noise at seven equally spaced levels between their thresholds of detection and discomfort. Speakers' average vowel duration and SPL generally rose with increasing N/S. Average vowel contrast was initially flat or rising; at higher N/S levels, it fell. A contrast increase is interpreted as reflecting speakers' attempts to maintain clarity under degraded acoustic transmission conditions. As N/S increased, speakers could detect the extent of their phonemic contrasts less effectively, and the competing influence of economy of effort led to contrast decrements. The sibilant contrast was more vulnerable to noise; it decreased over the entire range of increasing N/S for controls and was variable for implant users. The results are interpreted as reflecting the combined influences of a clarity constraint, economy of effort and the effect of masking on achieving auditory phonemic goals-with implant users less able to increase contrasts in noise than controls. 相似文献

18.

Detection and discrimination of synthetic English vowels by Old World monkeys (Cercopithecus, Macaca) and humans

J M Sinnott 《The Journal of the Acoustical Society of America》1989,86(2):557-565

Abilities to detect and discriminate ten synthetic steady-state English vowels were compared in Old World monkeys (Cercopithecus, Macaca) and humans using standard animal psychophysical procedures and positive-reinforcement operant conditioning techniques. Monkeys' detection thresholds were close to humans' for the front vowels /i-I-E-ae-E), but 10-20 dB higher for the back vowels /V-D-C-U-u/. Subjects were subsequently presented with groups of vowels to discriminate. All monkeys experienced difficulty with spectrally similar pairs such as /V-D/, /E-ae/, and /U-u/, but macaques were superior to Cercopithecus monkeys. Humans discriminated all vowels at 100% correct levels, but their increased response latencies reflected spectral similarity and correlated with higher error rates by monkeys. Varying the intensity level of the vowel stimuli had little effect on either monkey or human discrimination, except at the lowest levels tested. These qualitative similarities in monkey and human vowel discrimination suggest that some monkey species may provide useful models of human vowel processing at the sensory level. 相似文献

19.

Acoustic analysis of monophthong and diphthong production in acquired severe to profound hearing loss

Palethorpe S Watson CI Barker R 《The Journal of the Acoustical Society of America》2003,114(2):1055-1068

The effect of diminished auditory feedback on monophthong and diphthong production was examined in postlingually deafened Australian-English speaking adults. The participants were 4 female and 3 male speakers with severe to profound hearing loss, who were compared to 11 age- and accent-matched normally hearing speakers. The test materials were 5 repetitions of hVd words containing 18 vowels. Acoustic measures that were studied included F1, F2, discrete cosine transform coefficients (DCTs), and vowel duration information. The durational analyses revealed increased total vowel durations with a maintenance of the tense/lax vowel distinctions in the deafened speakers. The deafened speakers preserved a differentiated vowel space, although there were some gender-specific differences seen. For example, there was a retraction of F2 in the front vowels for the female speakers that did not occur in the males. However, all deafened speakers showed a close correspondence between the monophthong and diphthong formant movements that did occur. Gaussian classification highlighted vowel confusions resulting from changes in the deafened vowel space. The results support the view that postlingually deafened speakers maintain reasonably good speech intelligibility, in part by employing production strategies designed to bolster auditory feedback. 相似文献

20.

Multimodal Standardization of Voice Among Four Multicultural Populations: Fundamental Frequency and Spectral Characteristics

Mary V. Andrianopoulos Keith N. Darrow Jie Chen 《Journal of voice》2001,15(2):194-219

A stratified random sample of 20 males and 20 females matched for physiological factors and cultural-linguistic markers were examined to determine differences in fundamental frequency and spectral characteristics during prolongation of three vowels: [a], [i], and [u]. The ethnic-gender breakdown included four sets of five male and five female subjects comprised of Caucasian and African-American speakers of standard American English, native Hindi Indian speakers, and native Mandarin Chinese speakers. Acoustic measures were analyzed using the Multidimensional Voice Program (Kay Elemetrics, Lincoln Park, NJ) (Model 4305) from which fundamental frequency and associated acoustic spectra were extracted from a 200-ms sample of each vowel token. Statistically significant group differences for the main effects of culture, race, and gender were found. The acoustic differences found are attributed to biomechanical, physiological, cultural, and linguistic factors. 相似文献