Similar Articles
 20 similar articles found (search time: 31 ms)
1.
The current investigation studied whether adults, children with normally developing language aged 4-5 years, and children with specific language impairment aged 5-6 years identified vowels on the basis of steady-state or transitional formant frequencies. Four types of synthetic tokens, created with a female voice, served as stimuli: (1) steady-state centers for the vowels [i] and [æ]; (2) vowelless tokens with transitions appropriate for [bib] and [bæb]; (3) "congruent" tokens that combined the first two types of stimuli into [bib] and [bæb]; and (4) "conflicting" tokens that combined the transitions from [bib] with the vowel from [bæb] and vice versa. Results showed that children with language impairment identified the [i] vowel more poorly than the other subjects for both the vowelless and congruent tokens. Overall, children identified vowels most accurately in steady-state centers and congruent stimuli (between 94% and 96% accuracy). They identified vowels on the basis of transitions alone, from the vowelless tokens, with 89% and 83.5% accuracy for the normally developing and language-impaired groups, respectively. Children with normally developing language used steady-state cues to identify vowels in 87% of the conflicting stimuli, whereas children with language impairment did so for 79% of the stimuli. Adults were equally accurate for vowelless, steady-state, and congruent tokens (99%-100% accuracy) and used both steady-state and transition cues for vowel identification. Results suggest that most listeners prefer the steady state for vowel identification but are capable of using the onglide/offglide transitions instead. Results are discussed with regard to Nittrouer's developmental weighting shift hypothesis and Strange and Jenkins's dynamic specification theory.

2.
Listeners are more likely to hear a synthetic fricative ambiguous between /s/ and /ʃ/ as /ʃ/ if it is appended to a woman's voice than to a man's voice [Strand and Johnson, in Natural Language Processing and Speech Technology: Results of the 3rd KONVENS Conference (Mouton de Gruyter, Berlin, 1996), pp. 14-26]. This study expanded on that finding by replicating the result with a much larger group of male and female talkers than had been examined previously, by examining whether phonetic context mediates the influence of talker sex on fricative identification, and by examining whether talkers' perceived sexual orientation influences fricative identification. Stimuli were created by pairing a synthetic nine-step /s/-/ʃ/ continuum with tokens of /æk/ and /ɪp/ taken from productions of shack and ship by 44 talkers whose perceived sexual orientation had been reported previously [Munson et al., J. Phonetics (in press)]. Listeners participated in a series of two-alternative sack-shack and sip-ship identification experiments. Listeners identified more /ʃ/ tokens for women's voices than for men's voices for both continua. Lesbian/bisexual-sounding women elicited more sack and sip responses than heterosexual-sounding women. No consistent influence of perceived sexual orientation on fricative identification was noted for men's voices. Results suggest that listeners are sensitive to the association between fricatives' center frequencies and perceived sexual orientation in women's voices, but not in men's voices.

3.
The purpose of this study was to determine the accuracy with which listeners could identify the gender of a speaker from a synthesized isolated vowel based on the natural production of that speaker when (1) the fundamental frequency was consistent with the speaker's gender, (2) the fundamental frequency was inconsistent with the speaker's gender, and (3) the speaker was transgendered. Ten male-to-female transgendered persons, 10 men, and 10 women served as subjects. Each speaker produced the vowels /i/, /u/, and //. These vowels were analyzed for fundamental frequency and the first three formant frequencies and bandwidths. The formant frequency and bandwidth information was used to synthesize two vowel tokens for each speaker, one at a fundamental frequency of 120 Hz and one at 240 Hz. Listeners were asked to listen to these tokens and determine whether the original speaker was male or female. Listeners were not aware of the use of transgendered speakers. Results showed that, in all cases, gender identifications were based on fundamental frequency, even when fundamental frequency and formant frequency information were contradictory.

4.
Monolingual Peruvian Spanish listeners identified natural tokens of the Canadian French (CF) and Canadian English (CE) /ɛ/ and /æ/, produced in five consonantal contexts. The results demonstrate that while the CF vowels were mapped to two different native vowels, /e/ and /a/, in all consonantal contexts, the CE contrast was mapped to the single native vowel /a/ in four out of five contexts. Linear discriminant analysis revealed that acoustic similarity between native and target-language vowels was a very good predictor of context-specific perceptual mappings. Predictions are made for Spanish learners of the /ɛ/-/æ/ contrast in CF and CE.

5.
Moderately to profoundly hearing-impaired (n = 30) and normal-hearing (n = 6) listeners identified [p, k, t, f, θ, s] in [symbol; see text], and [symbol; see text]s tokens extracted from spoken sentences. The [symbol; see text]s were also identified in the sentences. The hearing-impaired group distinguished stop/fricative manner more poorly for [symbol; see text] in sentences than when extracted. Further, the group's performance for extracted [symbol; see text] was poorer than for extracted [symbol; see text] and [symbol; see text]. For the normal-hearing group, consonant identification was similar across the syllable and sentence contexts.

6.
Four normal-hearing young adults were extensively trained in the use of a tactile speech-transmission system. Subjects were tested on the recognition of various phonetic elements, including vowels and stop, nasal, and fricative consonants, under three receiving conditions: visual reception alone (lipreading), tactile reception alone, and tactile plus visual reception. Subjects were artificially deafened using earplugs and white noise, and all speech tokens were presented live voice. Analysis of the data demonstrates that the tactile transform enables receivers to achieve excellent recognition of vowels in CVC context and of the consonantal features of voicing and nasality. This, in combination with high recognition of vowels and the consonantal feature of place of articulation through visual reception, leads to recognition performance in the combined condition (visual plus tactile) that far exceeds either reception condition in isolation.

7.
Ten American English vowels were sung in a /b/-vowel-/d/ consonantal context by a professional countertenor in full voice (at F0 = 130, 165, 220, 260, and 330 Hz) and in head voice (at F0 = 220, 260, 330, 440, and 520 Hz). Four identification tests were prepared using either the entire syllable or the center 200-ms portion of the full-voice or head-voice tokens. Listeners attempted to identify each vowel by circling the appropriate word on their answer sheets. Errors were more frequent when the vowels were sung at higher F0. In addition, removal of the consonantal context markedly increased identification errors for both the head-voice and full-voice conditions. Back vowels were misidentified significantly more often than front vowels. At equal F0 values, listeners were significantly more accurate in identifying the head-voice stimuli. Acoustical analysis suggests that the difference in intelligibility between head and full voice may have been due to the head voice having more energy in the first harmonic than the full voice.

8.
The purpose of this study was to examine the effect of reduced vowel working space on dysarthric talkers' speech intelligibility using both acoustic and perceptual approaches. In experiment 1, the acoustic-perceptual relationship between vowel working space area and speech intelligibility was examined in Mandarin-speaking young adults with cerebral palsy. Subjects read aloud 18 bisyllabic words containing the vowels /i/, /a/, and /u/ at their normal speaking rate. Each talker's words were identified by three normal listeners. The percentages of correctly identified vowels and words were calculated as vowel intelligibility and word intelligibility, respectively. Results revealed that talkers with cerebral palsy exhibited smaller vowel working space areas than ten age-matched controls. The vowel working space area was significantly correlated with vowel intelligibility (r=0.632, p<0.005) and with word intelligibility (r=0.684, p<0.005). Experiment 2 examined whether tokens from expanded vowel working spaces were perceived as better vowel exemplars and represented with greater perceptual spaces than tokens from reduced vowel working spaces. The results of the perceptual experiment support this prediction. The distorted vowels of talkers with cerebral palsy occupy a smaller acoustic space, which results in shrunken intervowel perceptual distances for listeners.
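The vowel working space described in this abstract is conventionally quantified as the area of the polygon spanned by the corner vowels /i/, /a/, and /u/ in the F1-F2 plane. A minimal sketch of that computation, with illustrative placeholder formant values rather than data from the study:

```python
# Vowel working space area via the shoelace formula, using (F1, F2)
# coordinates of the corner vowels. Formant values are illustrative
# placeholders, not measurements from the study.

def vowel_space_area(points):
    """Area (Hz^2) of the polygon whose vertices are (F1, F2) pairs."""
    n = len(points)
    s = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        s += x1 * y2 - x2 * y1  # signed cross-product term
    return abs(s) / 2.0

corner_vowels = [(300, 2300),  # /i/: low F1, high F2
                 (800, 1300),  # /a/: high F1, mid F2
                 (350, 900)]   # /u/: low F1, low F2

area = vowel_space_area(corner_vowels)  # Hz^2
```

With three corner vowels this reduces to the triangle area; studies that use a vowel quadrilateral simply pass a fourth (F1, F2) point.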

9.
This study explored how across-talker differences influence non-native vowel perception. American English (AE) and Korean listeners were presented with recordings of 10 AE vowels in /bVd/ context. The stimuli were mixed with noise and presented for identification in a 10-alternative forced-choice task. The two listener groups heard recordings of the vowels produced by 10 talkers at three signal-to-noise ratios. Overall, the AE listeners identified the vowels 22% more accurately than the Korean listeners. There was a wide range of identification accuracy scores across talkers for both AE and Korean listeners. At each signal-to-noise ratio, the across-talker intelligibility scores for AE and Korean listeners were highly correlated. Acoustic analysis was conducted for two vowel pairs that exhibited variable accuracy across talkers for Korean listeners but high identification accuracy for AE listeners. Results demonstrated that Korean listeners' error patterns for these four vowels were strongly influenced by variability in vowel production that was within the normal range for AE talkers. These results suggest that non-native listeners are strongly influenced by across-talker variability, perhaps because of the difficulty they have forming native-like vowel categories.
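Presenting stimuli "mixed with noise" at a fixed signal-to-noise ratio, as in the identification task above, comes down to scaling the noise so that the RMS ratio of signal to noise matches the target SNR in dB. A rough sketch under that assumption; the signals below are synthetic, not the study's stimuli:

```python
import math

def rms(x):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(v * v for v in x) / len(x))

def mix_at_snr(signal, noise, snr_db):
    """Scale `noise` so 20*log10(rms(signal)/rms(scaled noise)) == snr_db,
    then add it to `signal` sample by sample."""
    gain = rms(signal) / (rms(noise) * 10 ** (snr_db / 20))
    return [s + gain * n for s, n in zip(signal, noise)]

# Synthetic stand-ins for a speech token and a noise masker.
signal = [math.sin(i / 10.0) for i in range(1000)]
noise = [((i * 37) % 97 - 48) / 48.0 for i in range(1000)]
mixed = mix_at_snr(signal, noise, 6.0)  # mix at +6 dB SNR
```

The same helper, called with decreasing `snr_db` values, would generate the multiple SNR conditions this kind of study uses.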

10.
Cross-generational and cross-dialectal variation in vowels among speakers of American English was examined in terms of vowel identification by listeners and vowel classification using pattern recognition. Listeners from Western North Carolina and Southeastern Wisconsin identified 12 vowel categories produced by 120 speakers stratified by age (old adults, young adults, and children), gender, and dialect. The vowels /ʌ, o, ʊ, u/ were well identified by both groups of listeners. The majority of confusions were for the front vowels /i, ɪ, e, ɛ, æ/, the low back vowels /ɑ, ɔ/, and the monophthongal North Carolina /aɪ/. For selected vowels, generational differences in acoustic vowel characteristics were perceptually salient, suggesting listeners' responsiveness to sound change. Female exemplars and native-dialect variants produced higher identification rates. Linear discriminant analyses examining dialect and generational classification accuracy showed that sampling the formant pattern at vowel midpoint only is insufficient to separate the vowels. Two sample points, near onset and offset, provided enough information for successful classification. Models trained on one dialect classified the vowels of the other dialect with much lower accuracy. The results strongly support the importance of dynamic information in accurate classification of cross-generational and cross-dialectal variation.

11.
Speaker variability and noise are two common sources of acoustic variability. The goal of this study was to examine whether these two sources of acoustic variability affected native and non-native perception of Mandarin fricatives to different degrees. Multispeaker Mandarin fricative stimuli were presented to 40 native and 52 non-native listeners in two presentation formats (blocked by speaker and mixed across speakers). The stimuli were also mixed with speech-shaped noise to create five levels of signal-to-noise ratio. The results showed that noise affected non-native identification disproportionately. By contrast, the effect of speaker variability was comparable between the native and non-native listeners. Confusion patterns were interpreted with reference to the results of acoustic analysis, suggesting that native and non-native listeners used distinct acoustic cues for fricative identification. It was concluded that not all sources of acoustic variability are treated equally by native and non-native listeners: whereas noise compromised non-native fricative perception disproportionately, speaker variability did not pose a special challenge to the non-native listeners.

12.
Earlier work [Nittrouer et al., J. Speech Hear. Res. 32, 120-132 (1989)] demonstrated greater evidence of coarticulation in the fricative-vowel syllables of children than in those of adults when measured by anticipatory vowel effects on the resonant frequency of the fricative back cavity. In the present study, three experiments showed that this increased coarticulation led to improved vowel recognition from the fricative noise alone: vowel identification by adult listeners was better overall for children's productions and was successful earlier in the fricative noise. This enhanced vowel recognition for children's samples was obtained even though children's and adults' samples were randomized together, indicating that listeners were able to normalize the vowel information within a fricative noise where there often was acoustic evidence of only one formant associated primarily with the vowel. Correct vowel judgments were found to be largely independent of fricative identification. However, when another coarticulatory effect, the lowering of the main spectral prominence of the fricative noise for /u/ versus /i/, was taken into account, vowel judgments were found to interact with fricative identification. The results show that listeners are sensitive to the greater coarticulation in children's fricative-vowel syllables, and that, in some circumstances, they do not need to make a correct identification of the most prominently specified phone in order to correctly identify a coarticulated one.

13.
This study compared how normal-hearing (NH) listeners and listeners with moderate to moderately severe cochlear hearing loss (HI) use and combine information within and across frequency regions in the perceptual separation of competing vowels with fundamental frequency differences (ΔF0) ranging from 0 to 9 semitones. Following the procedure of Culling and Darwin [J. Acoust. Soc. Am. 93, 3454-3467 (1993)], eight NH listeners and eight HI listeners identified competing vowels with either a consistent or an inconsistent harmonic structure. Vowels were amplified to assure audibility for HI listeners. The contribution of each frequency region depended on the value of ΔF0 between the competing vowels. When ΔF0 was small, both groups of listeners effectively utilized ΔF0 cues in the low-frequency region. In contrast, HI listeners derived significantly less benefit than NH listeners from ΔF0 cues conveyed by the high-frequency region at small ΔF0s. At larger ΔF0s, both groups combined ΔF0 cues from the low and high formant-frequency regions. Cochlear impairment appears to negatively impact the ability to use F0 cues for within-formant grouping in the high-frequency region. However, cochlear loss does not appear to disrupt the ability to use within-formant F0 cues in the low-frequency region or to group F0 cues across formant regions.
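The F0 differences in this abstract are expressed in semitones: a semitone is a frequency ratio of 2^(1/12), so converting between Hz and semitones is a one-line formula. A small sketch (generic conversion helpers, not code from the study):

```python
import math

def semitones(f1_hz, f2_hz):
    """F0 difference from f1_hz to f2_hz in semitones.
    12 semitones = one octave (a 2:1 frequency ratio)."""
    return 12 * math.log2(f2_hz / f1_hz)

def shift_by_semitones(f_hz, n):
    """Frequency obtained by shifting f_hz upward by n semitones."""
    return f_hz * 2 ** (n / 12)
```

For example, a 9-semitone ΔF0 above a 100 Hz vowel puts the competing vowel near `shift_by_semitones(100, 9)` ≈ 168 Hz, which is why even the largest ΔF0 in the study keeps both vowels in a plausible voice range.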

14.
Speakers may adapt the phonetic details of their productions when they anticipate perceptual difficulty or comprehension failure on the part of a listener. Previous research suggests that a speaking style known as clear speech is more intelligible overall than casual, conversational speech for a variety of listener populations. However, it is unknown whether clear speech improves the intelligibility of fricative consonants specifically, or how its effects on fricative perception might differ depending on listener population. The primary goal of this study was to determine whether clear speech enhances fricative intelligibility for normal-hearing listeners and listeners with simulated impairment. Two experiments measured babble signal-to-noise ratio thresholds for fricative minimal pair distinctions for 14 normal-hearing listeners and 14 listeners with simulated sloping, recruiting impairment. Results indicated that clear speech helped both groups overall. However, for impaired listeners, reliable clear speech intelligibility advantages were not found for non-sibilant pairs. Correlation analyses comparing acoustic and perceptual data indicated that a shift of energy concentration toward higher frequency regions and greater source strength contributed to the clear speech effect for normal-hearing listeners. Correlations between acoustic and perceptual data were less consistent for listeners with simulated impairment, and suggested that lower-frequency information may play a role.

15.
Listeners' auditory discrimination of vowel sounds depends in part on the order in which stimuli are presented. Such presentation order effects have been argued to be language independent, and to result from psychophysical (not speech- or language-specific) factors such as the decay of memory traces over time or increased weighting of later-occurring stimuli. In the present study, native Cantonese speakers' discrimination of a linguistic tone continuum is shown to exhibit order of presentation effects similar to those shown for vowels in previous studies. When presented with two successive syllables differing in fundamental frequency by approximately 4 Hz, listeners were significantly more sensitive to this difference when the first syllable was higher in frequency than the second. However, American English-speaking listeners with no experience listening to Cantonese showed no such contrast effect when tested in the same manner using the same stimuli. Neither English nor Cantonese listeners showed any order of presentation effects in the discrimination of a nonspeech continuum in which tokens had the same fundamental frequencies as the Cantonese speech tokens but had a qualitatively non-speech-like timbre. These results suggest that tone presentation order effects, unlike vowel effects, may be language specific, possibly resulting from the need to compensate for utterance-related pitch declination when evaluating fundamental frequency for tone identification.

16.
Vowel intelligibility during singing is an important aspect of communication in performance. The intelligibility of isolated vowels sung by Western classically trained singers has been found to be relatively low; it decreases as pitch rises and is lower for women than for men. The lack of contextual cues significantly degrades vowel intelligibility. This study postulated that the reduced intelligibility of isolated sung vowels may stem partly from the vowels singers use in their daily vocalises. More specifically, if classically trained singers sang only a few American English vowels during their vocalises, their intelligibility for American English vowels would be lower than that of classically trained singers who usually vocalize on most American English vowels. The 21 subjects (15 women, 6 men) were all Western classically trained performers as well as teachers of classical singing. They sang 11 words containing 11 different American English vowels, on two pitches a musical fifth apart. Subjects were divided into two groups: those who normally vocalize on 4, 5, or 6 vowels, and those who sing all 11 vowels during their daily vocalises. The sung words were cropped to isolate the vowels, and listening tapes were created. Two listening groups, four singing teachers and five speech-language pathologists, were asked to identify the vowels intended by the singers. Results suggest that singing fewer vowels during daily vocalises does not decrease intelligibility compared with singing all 11 American English vowels. In general, vowel intelligibility was lower at the higher pitch, and vowels sung by the women were less intelligible than those sung by the men. Identification accuracy was about the same for the singing-teacher listeners and the speech-language pathologist listeners, except at the lower pitch, where the singing teachers were more accurate.

17.
The purpose of this study was to examine the contribution of information provided by vowels versus consonants to sentence intelligibility in young normal-hearing (YNH) and typical elderly hearing-impaired (EHI) listeners. Sentences were presented in three conditions: unaltered, or with either the vowels or the consonants replaced with speech-shaped noise. Sentences from male and female talkers in the TIMIT database were selected. Baseline performance was established with YNH listeners at a level of 70 dB SPL. Subsequently, EHI and YNH participants listened at 95 dB SPL. Participants listened to each sentence twice and were asked to repeat the entire sentence after each presentation. Words were scored correct if identified exactly. Average performance for unaltered sentences was greater than 94%. Overall, EHI listeners performed more poorly than YNH listeners. However, vowel-only sentences were always significantly more intelligible than consonant-only sentences, usually by a ratio of 2:1, across groups. In contrast to written English or words spoken in isolation, these results demonstrate that in spoken sentences, vowels carry more information about sentence intelligibility than consonants for both young normal-hearing and elderly hearing-impaired listeners.

18.
The amount of acoustic information that native and non-native listeners need for syllable identification was investigated by comparing the performance of monolingual English speakers and native Spanish speakers with either an earlier or a later age of immersion in an English-speaking environment. Duration-preserved silent-center syllables retaining 10, 20, 30, or 40 ms of the consonant-vowel and vowel-consonant transitions were created for the target vowels /i, ɪ, eɪ, ɛ, æ/ and /ɑ/, spoken by two males in /bVb/ context. Duration-neutral syllables were created by editing the silent portion to equate the duration of all vowels. Listeners identified the syllables in a six-alternative forced-choice task. The earlier learners identified the whole-word and 40 ms duration-preserved syllables as accurately as the monolingual listeners, but identified the silent-center syllables significantly less accurately overall. Only the monolingual listener group identified syllables significantly more accurately in the duration-preserved than in the duration-neutral condition, suggesting that the non-native listeners were unable to recover from the syllable disruption sufficiently to access the duration cues in the silent-center syllables. This effect was most pronounced for the later learners, who also showed the most vowel confusions and the greatest decrease in performance from the whole-word to the 40 ms transition condition.

19.
This study examined vowel perception by young normal-hearing (YNH) adults in various listening conditions designed to simulate mild-to-moderate sloping sensorineural hearing loss. YNH listeners were individually age- and gender-matched to young hearing-impaired (YHI) listeners tested in a previous study [Richie et al., J. Acoust. Soc. Am. 114, 2923-2933 (2003)]. YNH listeners were tested in three conditions designed to create audibility equal to that of the YHI listeners: a low signal level with and without a simulated hearing loss, and a high signal level with a simulated hearing loss. Listeners discriminated changes in the synthetic vowel tokens /ɪ, e, ɛ, ɑ, æ/ when F1 or F2 varied in frequency. Comparison of YNH with YHI results failed to reveal significant differences between groups in vowel discrimination performance under conditions of similar audibility, achieved by masking the YNH listeners with noise to elevate their hearing thresholds and by applying frequency-specific gain for the YHI listeners. Further, analysis of learning curves suggests that while the YHI listeners completed an average of 46% more test blocks than YNH listeners, the YHI listeners achieved a level of discrimination similar to that of the YNH listeners within the same number of blocks. Apparently, when age and gender are closely matched between young hearing-impaired and normal-hearing adults, performance on vowel tasks may be explained by audibility alone.

20.
Speech samples of 12 speakers (8 children and 4 adults) producing the fricatives /s/ and /ʃ/ followed by the vowels /i/ and /u/ were analyzed to locate the major spectral prominences. Results showed that the fricative low-frequency prominences in children's samples differed from those of adults in three important ways: (1) they were generally higher in frequency; (2) they were greater in amplitude relative to higher frequency regions; and (3) they showed greater effects of vowel context. The first finding can be explained by a simple scaling of adult models of fricative production to accommodate children's smaller vocal tracts. The other two findings suggest, however, that there are other anatomical and articulatory differences between children and adults affecting fricative production. The data presented here suggest that one important difference may be the relative sizes of the fricative constriction and the glottal opening.
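The "simple scaling" explanation above follows from basic tube acoustics: the resonance frequencies of a uniform tube vary inversely with its length, so a shorter (child-sized) tract shifts every resonance upward by the same factor. A hedged sketch of the quarter-wave (closed-open tube) approximation; the tract lengths below are illustrative assumptions, not measurements from the study:

```python
# Resonances of a uniform tube closed at one end (quarter-wave model),
# a first approximation for vocal-tract/back-cavity resonances.
# Tube lengths are illustrative assumptions, not study data.

SPEED_OF_SOUND = 35000.0  # cm/s, approximate value in warm, moist air

def tube_resonances(length_cm, n=3):
    """First n resonances (Hz) of the tube: F_k = (2k - 1) * c / (4 * L)."""
    return [(2 * k - 1) * SPEED_OF_SOUND / (4 * length_cm)
            for k in range(1, n + 1)]

adult = tube_resonances(17.5)  # ~adult male vocal-tract length
child = tube_resonances(12.0)  # ~young child's shorter tract
```

Every resonance of the shorter tube is higher by the same length ratio (17.5/12 ≈ 1.46 here), which accounts for finding (1) but not for the amplitude and context effects in findings (2) and (3).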


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号