Similar literature
 20 similar documents found (search time: 31 ms)
1.
Measurements of the neck frequency response function (NFRF), defined as the ratio of the spectrum of the estimated volume velocity that excites the vocal tract to the spectrum of the acceleration delivered to the neck wall, were made at three different positions on the necks of nine laryngectomized subjects (five males and four females) and four normal laryngeal speakers (two males and two females). A minishaker driven by broadband noise provided excitation to the necks of subjects as they configured their vocal tracts to mimic the production of the vowels /a/, /ae/, and /I/. The sound pressure at the lips was measured with a microphone, and an impedance head mounted on the shaker measured the acceleration. The neck wall passed low-frequency sound energy better than high-frequency sound energy, and thus the NFRF was accurately modeled as a low-pass filter. The NFRFs of the different subject groups (female laryngeal speakers, male laryngeal speakers, laryngectomized males, and laryngectomized females) differed from each other in terms of corner frequency and gain, with both types of male subjects presenting NFRFs with larger overall gains. In addition, there was a notable amount of intersubject variability within groups. Because the NFRF is an estimate of how sound energy passes through the neck wall, these results should aid in the design of improved neck-type electrolarynx devices.
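
The low-pass description in this abstract can be illustrated with a short sketch. It assumes a first-order low-pass shape, which the abstract does not specify; the function names and the example corner frequency and gain are hypothetical and are not values from the study.

```python
import math


def nfrf_db(volume_velocity_mag, acceleration_mag):
    """NFRF magnitude in dB at one frequency: the ratio of the estimated
    volume-velocity spectrum to the neck-wall acceleration spectrum."""
    return 20.0 * math.log10(volume_velocity_mag / acceleration_mag)


def lowpass_gain_db(freq_hz, corner_hz, gain_db):
    """Magnitude (dB) of a first-order low-pass filter at freq_hz.

    Assumption: first order is used only for illustration. The abstract
    reports a low-pass shape with a corner frequency and a gain, not an order.
    """
    ratio = freq_hz / corner_hz
    return gain_db - 10.0 * math.log10(1.0 + ratio * ratio)


# Hypothetical parameters, not values from the study.
for f in (100.0, 500.0, 2000.0):
    print(f, round(lowpass_gain_db(f, corner_hz=500.0, gain_db=0.0), 2))
```

In this form, group differences such as those reported would appear as different `corner_hz` and `gain_db` values per subject group.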

2.
It was hypothesized that native English adults would be more skillful in producing word-final English /p/ and /b/ than native English children who, in turn, would be more skillful in doing so than adult native speakers of a language (Mandarin Chinese) that does not possess word-final stops. A video tracking system was used to monitor lip and jaw movements. The subjects in all three groups made vowels significantly longer before /b/ than /p/, but the effect seen for the English subjects was three times as large as that seen for the Chinese subjects and depended less on differences in lip closing velocity for /b/ and /p/. The English subjects also showed a difference in duration between /a/ and /i/ that was twice as large as the difference seen for the Chinese subjects. Of the three groups, only the English adults showed significantly greater displacement and peak movement velocity for the final stop consonant of /bap/ than /bab/. This suggested that their central phonetic representations specified a more forceful constriction of the lips for /p/ than /b/. The English adults seemed to compensate more effectively for a bite block in producing the final stops in /bip/ and /bib/. The results obtained for the English children were intermediate to those obtained for the English and Chinese adults, which is consistent with the hypothesized experience-based differences in level of skill.

3.
The primary aim of this study was to determine if adults whose native language permits neither voiced nor voiceless stops to occur in word-final position can master the English word-final /t/-/d/ contrast. Native English-speaking listeners identified the voicing feature in word-final stops produced by talkers in five groups: native speakers of English, experienced and inexperienced native Spanish speakers of English, and experienced and inexperienced native Mandarin speakers of English. Contrary to hypothesis, the experienced second language (L2) learners' stops were not identified significantly better than stops produced by the inexperienced L2 learners; and their stops were correctly identified significantly less often than stops produced by the native English speakers. Acoustic analyses revealed that the native English speakers made vowels significantly longer before /d/ than /t/, produced /t/-final words with a higher F1 offset frequency than /d/-final words, produced more closure voicing in /d/ than /t/, and sustained closure longer for /t/ than /d/. The L2 learners produced the same kinds of acoustic differences between /t/ and /d/, but theirs were usually of significantly smaller magnitude. Taken together, the results suggest that only a few of the 40 L2 learners examined in the present study had mastered the English word-final /t/-/d/ contrast. Several possible explanations for this negative finding are presented. Multiple regression analyses revealed that the native English listeners made perceptual use of the small, albeit significant, vowel duration differences produced in minimal pairs by the nonnative speakers. A significantly stronger correlation existed between vowel duration differences and the listeners' identifications of final stops in minimal pairs when the perceptual judgments were obtained in an "edited" condition (where post-vocalic cues were removed) than in a "full cue" condition. This suggested that listeners may modify their identification of stops based on the availability of acoustic cues.

4.
Recent advances in physiological data collection methods have made it possible to test the accuracy of predictions against speaker-specific vocal tracts and acoustic patterns. Vocal tract dimensions for /r/ derived via magnetic-resonance imaging (MRI) for two speakers of American English [Alwan, Narayanan, and Haker, J. Acoust. Soc. Am. 101, 1078-1089 (1997)] were used to construct models of the acoustics of /r/. Because previous models have not sufficiently accounted for the very low F3 characteristic of /r/, the aim was to match formant frequencies predicted by the models to the full range of formant frequency values produced by the speakers in recordings of real words containing /r/. In one set of experiments, area functions derived from MRI data were used to argue that the Perturbation Theory of tube acoustics cannot adequately account for /r/, primarily because predicted locations did not match speakers' actual constriction locations. Different models of the acoustics of /r/ were tested using the Maeda computer simulation program [Maeda, Speech Commun. 1, 199-299 (1982)]; the supralingual vocal-tract dimensions reported in Alwan et al. were found to be adequate at predicting only the highest of attested F3 values. By using (1) a recently developed adaptation of the Maeda model that incorporates the sublingual space as a side branch from the front cavity, and by including (2) the sublingual space as an increment to the dimensions of the front cavity, the mid-to-low values of the speakers' F3 range were matched. Finally, a simple tube model with dimensions derived from MRI data was developed to account for cavity affiliations. This confirmed F3 as a front cavity resonance, and variations in F1, F2, and F4 as arising from mid- and back-cavity geometries. Possible trading relations for F3 lowering based on different acoustic mechanisms for extending the front cavity are also proposed.

5.
The present study investigated the effect of tonal changes on voice onset time (VOT) between normal laryngeal (NL) and superior esophageal (SE) speakers of Mandarin Chinese. VOT values were measured from the syllables /pha/, /tha/, and /kha/ produced at four tone levels by eight NL and seven SE speakers who were native speakers of Mandarin. Results indicated that Mandarin tones were associated with significantly different VOT values for NL speakers, in which high-falling tone was associated with significantly shorter VOT values than mid-rising tone and falling-rising tone. Regarding speaker group, SE speakers showed significantly shorter VOT values than NL speakers across all tone levels. This may be related to their use of the pharyngoesophageal (PE) segment as another sound source. SE speakers appear to take a shorter time to start PE segment vibration compared to NL speakers using the vocal folds for vibration.

6.
The purpose of this study was (1) to determine the psychophysical character of auditory-perceptual ratings of voice pleasantness (VP) and voice acceptability (VA) for tracheoesophageal (TE) speakers using direct magnitude estimation (DME) and equal-appearing interval (EAI) scaling procedures and (2) to determine the relationship between listeners' ratings of VP and VA. Ten adult listeners judged overall VP and VA from connected speech samples produced by 20 adult male TE speakers. Although results yielded a prothetic continuum for VP and a metathetic continuum for VA, the amount of variance accounted for by a curvilinear model of VP was minimally more than that accounted for by a linear model. Results also revealed a significant relationship between VP and VA (r = 0.939). Findings from this study do not suggest any greater validity associated with VP and VA ratings obtained by the DME than the EAI method. Given the significant relationship between these ratings and the ease of applying EAI scales, it is recommended that VA be used as a current clinical outcome measure. These data illustrate the need to identify attributes that best describe TE speech, that are measured appropriately, and that are clinically useful.

7.
The timing of upper lip protrusion movements and accompanying acoustic events was examined for multiple repetitions of word pairs such as "lee coot" and "leaked coot" by four speakers of American English. The duration of the intervocalic consonant string was manipulated by using various combinations of /s/, /t/, /k/, /h/, and /#/. Pairwise comparisons were made of consonant string duration (acoustic /i/ offset to acoustic /u/ onset) with durations of: protrusion movement beginning to acoustic /u/ onset, maximum acceleration of the movement to acoustic /u/ onset, and acoustic /u/ onset to movement end. There were some consonant-specific protrusion effects, primarily on the movement beginning event for /s/. Inferences from measures of the maximum acceleration and movement end events for the non-/s/ subset suggested the simultaneous and variable expression of three competing constraints: (1) end the protrusion movement during the voiced part of the /u/; (2) use a preferred movement duration; and (3) begin the /u/-related protrusion movement when permitted by relaxation of the perceptually motivated constraint that the preceding /i/ be unrounded. The subjects differed in the degree of expression of each constraint, but the results generally indicate that anticipatory coarticulation of lip protrusion is influenced both by acoustic-phonetic context dependencies and dynamical properties of movements. Because of the extensive variation in the data and the small number of subjects, these ideas are tentative; additional work is needed to explore them further.

8.
Vocal fold contact behavior was examined in separate groups of boys and girls through application of an electroglottograph (EGG). In general, a contact quotient (EGG duty cycle) showed minimal differences within and between boys and girls during sustained production of the vowels /i/, /u/, and /a/. The findings are discussed with respect to the laryngeal behavior of prepubescent children as well as the clinical utility and applicability of the EGG for examining phonatory behavior among young children.
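
As a rough illustration of the contact quotient (EGG duty cycle) mentioned in this abstract, the sketch below computes the fraction of one glottal cycle during which the EGG signal exceeds a threshold. The 50% peak-to-peak criterion, the signal polarity (larger values meaning more contact), and the function name are all assumptions; the abstract does not say how the quotient was computed.

```python
def contact_quotient(cycle, level=0.5):
    """Fraction of one EGG cycle spent above a threshold.

    cycle: samples covering exactly one glottal period.
    level: threshold as a fraction of peak-to-peak amplitude (assumed 0.5).
    Assumes larger sample values correspond to greater vocal fold contact.
    """
    lowest, highest = min(cycle), max(cycle)
    if highest == lowest:
        raise ValueError("flat cycle: no contact phase can be identified")
    threshold = lowest + level * (highest - lowest)
    contacted = sum(1 for sample in cycle if sample > threshold)
    return contacted / len(cycle)


# Hypothetical ten-sample cycle with three samples in the contact phase.
example_cycle = [0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
print(contact_quotient(example_cycle))
```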

9.
The first aim of this study was to present an odor provocation/challenge test for laryngeal hypersensitivity in a patient with suspected odor-induced dysphonia. The second aim was to distinguish secondary gain from organic laryngeal hypersensitivity. Two steps were taken for this purpose. First, because the evaluation of hypersensitivity may be affected by the perception of odor, the study investigated laryngeal hypersensitivity during nasal and oral breathing separately to disentangle possible cognitive reactions to odors. Second, a healthy control (HC) participant was tested with the identical protocol for nasal breathing to minimize biased results. The HC showed no response to any of the odors during nasal breathing. The participant with possible secondary gain issues responded differently to the odors when presented nasally versus orally. Oral breathing showed less severe and less frequent laryngeal hypersensitive reactions. This suggests that laryngeal hypersensitivity was due to the odor, cognitive information, sensory changes in olfaction leading to psychological conditioning, or possible secondary gain. Hence, it is difficult to indicate the precise reason (cause and effect) for the participant's laryngeal hypersensitivity; however, this study describes the first structured, controlled, repeatable, and randomized design to investigate odor-induced laryngeal hypersensitivity and to separate possible secondary gain from true laryngeal hypersensitivity.

10.
Until recently, speech analysis techniques have been built around the all-pole linear predictive model. This study examines the effectiveness of using the perceptual linear predictive method for analyzing nasal consonants. Six speakers (three men and three women) produced 300 CV syllables with initial nasal consonants /m/ and /n/. A threshold-based boundary detection algorithm was developed to extract nasal segments from the CV contexts. Poles of a fifth-order perceptual linear predictive model were calculated and the frequency of the second pole was used to characterize the place of articulation of nasal consonants. Results indicated that the frequency of the second transformed pole was significantly lower for /m/ than for /n/ and was independent of factors such as vowel context and gender of the speaker. A nasal identification rate of 86% was obtained based on the frequency of the second pole. The use of the perceptual linear predictive method may thus overcome some difficulties associated with analyzing nasal consonants.
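
The step of reading a frequency off a model pole can be sketched generically: for any all-pole model, a complex pole's angle maps to a frequency on the model's frequency axis. The sketch below shows only that mapping. In a perceptual linear predictive model the axis is perceptually warped, so the value would be a warped frequency rather than Hz; that detail, the rule used here to pick the "second" pole (second lowest among poles with positive imaginary part), and all names and pole values are assumptions, not details taken from the abstract.

```python
import cmath
import math


def pole_frequency(pole, sample_rate):
    """Frequency implied by a pole's angle, in the units of sample_rate."""
    return abs(cmath.phase(pole)) * sample_rate / (2.0 * math.pi)


def second_pole_frequency(poles, sample_rate):
    """Second-lowest frequency among poles in the upper half-plane.

    Assumption: this ordering rule is illustrative; the abstract does not
    define how poles were ranked.
    """
    freqs = sorted(pole_frequency(p, sample_rate) for p in poles if p.imag > 0)
    if len(freqs) < 2:
        raise ValueError("need at least two complex pole pairs")
    return freqs[1]


# Hypothetical fifth-order pole set: two conjugate pairs and one real pole.
p1 = cmath.rect(0.9, math.pi / 8)
p2 = cmath.rect(0.8, math.pi / 4)
poles = [p1, p1.conjugate(), p2, p2.conjugate(), complex(0.5, 0.0)]
print(second_pole_frequency(poles, sample_rate=8000.0))
```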

11.
This study addresses three issues that are relevant to coarticulation theory in speech production: whether the degree of articulatory constraint model (DAC model) accounts for patterns of the directionality of tongue dorsum coarticulatory influences; the extent to which those patterns in tongue dorsum coarticulatory direction are similar to those for the tongue tip; and whether speech motor control and phonemic planning use a fixed or a context-dependent temporal window. Tongue dorsum and tongue tip movement data on vowel-to-vowel coarticulation are reported for Catalan VCV sequences with vowels /i/, /a/, and /u/, and consonants /p/, /n/, dark /l/, /s/, /S/, alveolopalatal /n/ and /k/. Electromidsagittal articulometry recordings were carried out for three speakers using the Carstens articulograph. Trajectory data are presented for the vertical dimension for the tongue dorsum, and for the horizontal dimension for tongue dorsum and tip. In agreement with predictions of the DAC model, results show that directionality patterns of tongue dorsum coarticulation can be accounted for to a large extent based on the articulatory requirements on consonantal production. While dorsals exhibit analogous trends in coarticulatory direction for all articulators and articulatory dimensions, this is mostly so for the tongue dorsum and tip along the horizontal dimension in the case of lingual fricatives and apicolaminal consonants. This finding results from different articulatory strategies: while dorsal consonants are implemented through homogeneous tongue body activation, the tongue tip and tongue dorsum act more independently for more anterior consonantal productions. Discontinuous coarticulatory effects reported in the present investigation suggest that phonemic planning is adaptive rather than context independent.

12.
Five commonly used methods for determining the onset of voicing of syllable-initial stop consonants were compared. The speech and glottal activity of 16 native speakers of Cantonese with normal voice quality were investigated during the production of consonant-vowel (CV) syllables in Cantonese. Syllables consisted of the initial consonants /ph/, /th/, /kh/, /p/, /t/, and /k/ followed by the vowel /a/. All syllables had a high level tone, and were all real words in Cantonese. Measurements of voicing onset were made based on the onset of periodicity in the acoustic waveform, and on spectrographic measures of the onset of a voicing bar (f0), the onset of the first formant (F1), second formant (F2), and third formant (F3). These measurements were then compared against the onset of glottal opening as determined by electroglottography. Both accuracy and variability of each measure were calculated. Results suggest that the presence of aspiration in a syllable decreased the accuracy and increased the variability of spectrogram-based measurements, but did not strongly affect measurements made from the acoustic waveform. Overall, the acoustic waveform provided the most accurate estimate of voicing onset; measurements made from the amplitude waveform were also the least variable of the five measures. These results can be explained as a consequence of differences in spectral tilt of the voicing source in breathy versus modal phonation.

13.
Electropalatography was used to monitor linguapalatal contact patterns in /s/ and /t/. Talkers often compensated incompletely for a bite block, both immediately after its insertion (sample B1) and after 10 min of practice (sample B2). Significant differences in the number of sensors contacted were noted between normal and bite-block samples for both /s/ and /t/. Differences in length of constriction in /t/, and the A-P location and width of the groove in /s/ were also noted. The two native English subjects compensated better than three Arabic subjects, perhaps because English /s/ and /t/ are formed more posteriorly and with a smaller contact area than their Arabic counterparts. A significant correlation existed between the area and A-P location of linguapalatal contact. All five subjects formed a groove for /s/ in sample B2, but two often did not produce /t/ with complete constriction. This suggests a groove is critical for /s/, but complete constriction is not critical for /t/. The contact patterns in sample B2 more closely resembled normal speech than those in sample B1 in some instances, while in other instances the reverse was true. The conclusion that subjects sometimes overcompensated in sample B2 was supported by the results of detailed acoustic and perceptual analyses for one subject. Taken together, the results suggest that compensation for a bite block is not instantaneous, and that specific parameter values may be encoded in central phonetic representations.

14.
This paper presents the results of a closed-set recognition task for 64 consonant-vowel sounds (16 C × 4 V, spoken by 18 talkers) in speech-weighted noise (-22, -20, -16, -10, and -2 dB) and in quiet. The confusion matrices were generated using responses of a homogeneous set of ten listeners and the confusions were analyzed using a graphical method. In speech-weighted noise the consonants separate into three sets: a low-scoring set C1 (/f/, /theta/, /v/, /d/, /b/, /m/), a high-scoring set C2 (/t/, /s/, /z/, /S/, /Z/) and set C3 (/n/, /p/, /g/, /k/, /d/) with intermediate scores. The perceptual consonant groups are C1: {/f/-/theta/, /b/-/v/-/d/, /theta/-/d/}, C2: {/s/-/z/, /S/-/Z/}, and C3: /m/-/n/, while the perceptual vowel groups are /a/-/ae/ and /epsilon/-/iota/. The exponential articulation index (AI) model for consonant score works for 12 of the 16 consonants, using a refined expression of the AI. Finally, a comparison with past work shows that white noise masks the consonants more uniformly than speech-weighted noise, and shows that the AI, because it can account for the differences in noise spectra, is a better measure than the wideband signal-to-noise ratio for modeling and comparing the scores with different noise maskers.
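
The idea of an exponential AI model for consonant score can be sketched as follows. The form used here, P_c = 1 - e_chance * e_min^AI with e_chance = 1 - 1/N for an N-way closed set, is one commonly cited exponential form and is an assumption: the abstract does not give the equation or the refined AI expression it refers to. The function name and the `e_min` value are hypothetical.

```python
def consonant_score(ai, e_min, n_choices=16):
    """Predicted proportion correct under an assumed exponential AI model.

    ai: articulation index in [0, 1].
    e_min: error rate at AI = 1 (hypothetical value in the example below).
    n_choices: size of the closed response set (16 consonants here).
    """
    if not 0.0 <= ai <= 1.0:
        raise ValueError("ai must lie in [0, 1]")
    e_chance = 1.0 - 1.0 / n_choices  # error rate when guessing at random
    return 1.0 - e_chance * (e_min ** ai)


# Score rises from chance (1/16) at AI = 0 toward 1 - e_chance * e_min at AI = 1.
for ai in (0.0, 0.25, 0.5, 1.0):
    print(ai, round(consonant_score(ai, e_min=0.02), 3))
```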

15.
In sequences such as "law and order," speakers of British English often insert /r/ between "law" and "and." Acoustic analyses revealed such "intrusive" /r/ to be significantly shorter than canonical /r/. In a 2AFC experiment, native listeners heard British English sentences in which /r/ duration was manipulated across a word boundary [e.g., saw (r)ice], and orthographic and semantic factors were varied. These listeners responded categorically on the basis of acoustic evidence for /r/ alone, reporting "ice" after short /r/s and "rice" after long /r/s; orthographic and semantic factors had no effect. Dutch listeners proficient in English who heard the same materials relied less on durational cues than the native listeners, and were affected by both orthography and semantic bias. American English listeners produced intermediate responses to the same materials, being sensitive to duration (less so than native, more so than Dutch listeners), and to orthography (less so than the Dutch), but insensitive to the semantic manipulation. Listeners from language communities without common use of intrusive /r/ may thus interpret intrusive /r/ as canonical /r/, with a language difference increasing this propensity more than a dialect difference. Native listeners, however, efficiently distinguish intrusive from canonical /r/ by exploiting the relevant acoustic variation.

16.
This study investigates cross-speaker differences in the factors that predict voicing thresholds during abduction-adduction gestures in six normal women. Measures of baseline airflow, pulse amplitude, subglottal pressure, and fundamental frequency were made at voicing offset and onset during intervocalic /h/, produced in varying vowel environments and at different loudness levels, and subjected to relational analyses to determine which factors were most strongly related to the timing of voicing cessation or initiation. The data indicate that (a) all speakers showed differences between voicing offsets and onsets, but the degree of this effect varied across speakers; (b) loudness and vowel environment have speaker-specific effects on the likelihood of devoicing during /h/; and (c) baseline flow measures significantly predicted times of voicing offset and onset in all participants, but other variables contributing to voice timing differed across speakers. Overall, the results suggest that individual speakers have unique methods of achieving phonatory goals during running speech. These data contribute to the literature on individual differences in laryngeal function, and serve as a means of evaluating how well laryngeal models can reproduce the range of voicing behavior used by speakers during running speech tasks.

17.
Past studies have shown that when formants are perturbed in real time, speakers spontaneously compensate for the perturbation by changing their formant frequencies in the opposite direction to the perturbation. Further, the pattern of these results suggests that the processing of auditory feedback error operates at a purely acoustic level. This hypothesis was tested by comparing the response of three language groups to real-time formant perturbations: (1) native English speakers producing an English vowel /ε/, (2) native Japanese speakers producing a Japanese vowel (/e̞/), and (3) native Japanese speakers learning English, producing /ε/. All three groups showed similar production patterns when F1 was decreased; however, when F1 was increased, the Japanese groups did not compensate as much as the native English speakers. Due to this asymmetry, the hypothesis that the compensatory production for formant perturbation operates at a purely acoustic level was rejected. Rather, some level of phonological processing influences the feedback processing behavior.

18.
The purpose of the present study was to compare the speech performance of four types of alaryngeal phonation, namely electrolaryngeal (EL), pneumatic artificial laryngeal (PA), tracheoesophageal (TE), and standard esophageal (SE) speech, by adult Cantonese-speaking laryngectomees. Subjective ratings of (1) voice quality, (2) articulation proficiency, (3) quietness of speech, (4) pitch variability, and (5) overall speech intelligibility were given by eight naive individuals who had no prior experience with any form of alaryngeal speech. Results indicated that SE and TE speech was perceived to be more hoarse than PA and EL speech. EL speech was associated with significantly less pitch variability, and PA speakers produced speech with the least amount of perceived noise. However, articulation proficiency and overall speech intelligibility were found to be comparable in all four types of alaryngeal speakers.

19.
This paper examines four acoustic properties (duration, F0, F1, and F2) of the monophthongal vowels of Iberian Spanish (IS) from Madrid and Peruvian Spanish (PS) from Lima in various consonantal contexts (/s/, /f/, /t/, /p/, and /k/) and in various phrasal contexts (in isolated words and sentence-internally). Acoustic measurements on 39 speakers, balanced by dialect and gender, can be generalized to the following differences between the two dialects. The vowel /a/ has a lower first formant in PS than in IS by 6.3%. The vowels /e/ and /o/ have more peripheral second-formant (F2) values in PS than in IS by about 4%. The consonant /s/ causes more centralization of the F2 of neighboring vowels in IS than in PS. No dialectal differences are found for the effect of phrasal context. In addition to the between-dialect differences in the vowels, the present study finds that /s/ has a higher spectral center of gravity in PS than in IS by about 10%, that PS speakers speak more slowly than IS speakers by about 9%, and that Spanish-speaking women speak more slowly than Spanish-speaking men by about 5% (irrespective of dialect).

20.
The present study explored significant differences between male-to-female transgendered speakers perceived as male and those perceived as female in terms of speaking fundamental frequency (SFF) and its variability, vowel formants for /a/ and /i/, and intonation measures. Fifteen individuals who identified themselves as male-to-female transsexuals served as speaker subjects, in addition to 6 biological female control subjects and 3 biological male control subjects. Each subject was recorded reading the Rainbow Passage and producing the isolated vowels /a/ and /i/. Twenty undergraduate psychology students served as listeners. Results indicated that subjects perceived as female had a higher mean SFF and higher upper limit of SFF than subjects perceived as male. A significant correlation between upper limit of SFF and ratings of femininity was achieved.
