Similar Documents
 Found 20 similar documents (search time: 390 ms)
1.
This study investigates the effects of speaking condition and auditory feedback on vowel production by postlingually deafened adults. Thirteen cochlear implant users produced repetitions of nine American English vowels prior to implantation, and at one month and one year after implantation. There were three speaking conditions (clear, normal, and fast), and two feedback conditions after implantation (implant processor turned on and off). Ten normal-hearing controls were also recorded once. Vowel contrasts in the formant space (expressed in mels) were larger in the clear than in the fast condition, both for controls and for implant users at all three time samples. Implant users also produced differences in duration between clear and fast conditions that were in the range of those obtained from the controls. In agreement with prior work, the implant users had contrast values lower than did the controls. The implant users' contrasts were larger with hearing on than off and improved from one month to one year postimplant. Because the controls and implant users responded similarly to a change in speaking condition, it is inferred that auditory feedback, although demonstrably important for maintaining normative values of vowel contrasts, is not needed to maintain the distinctiveness of those contrasts in different speaking conditions.
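The abstract expresses vowel contrasts in mels. As a minimal sketch, the snippet below converts formant frequencies with the common O'Shaughnessy mel approximation (the study's exact mel formula is not stated, so the constants are an assumption) and computes an average contrast as the mean pairwise distance between vowel means in the mel-scaled F1 × F2 plane; the formant values are hypothetical.

```python
import numpy as np

def hz_to_mel(f_hz):
    """O'Shaughnessy mel approximation (assumed; the study's exact formula is not stated)."""
    return 2595.0 * np.log10(1.0 + np.asarray(f_hz) / 700.0)

def average_vowel_contrast(formants_hz):
    """Mean pairwise Euclidean distance between vowel means in the mel-scaled F1 x F2 plane.

    formants_hz: (n_vowels, 2) array-like of [F1, F2] values in Hz.
    """
    mels = hz_to_mel(formants_hz)
    n = len(mels)
    dists = [np.linalg.norm(mels[i] - mels[j])
             for i in range(n) for j in range(i + 1, n)]
    return float(np.mean(dists))

# Hypothetical vowel means ([F1, F2] in Hz) for three corner vowels:
print(average_vowel_contrast([[300, 2300], [750, 1200], [320, 870]]))
```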

2.
Recent studies have demonstrated that mothers exaggerate phonetic properties of infant-directed (ID) speech. However, these studies focused on a single acoustic dimension (frequency), whereas speech sounds are composed of multiple acoustic cues. Moreover, little is known about how mothers adjust phonetic properties of speech to children with hearing loss. This study examined mothers' production of frequency and duration cues to the American English tense/lax vowel contrast in speech to profoundly deaf (N = 14) and normal-hearing (N = 14) infants, and to an adult experimenter. First and second formant frequencies and vowel duration of tense (/i/, /u/) and lax (/ɪ/, /ʊ/) vowels were measured. Results demonstrated that for both infant groups mothers hyperarticulated the acoustic vowel space and increased vowel duration in ID speech relative to adult-directed speech. Mean F2 values were decreased for the /u/ vowel and increased for the /ɪ/ vowel, and vowel duration was longer for the /i/, /u/, and /ɪ/ vowels in ID speech. However, neither acoustic cue differed in speech to hearing-impaired or normal-hearing infants. These results suggest that both formant frequencies and vowel duration that differentiate American English tense/lax vowel contrasts are modified in ID speech regardless of the hearing status of the addressee.
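As a sketch of how the two reported cues could be quantified for one tense/lax pair, the snippet below computes the spectral separation (Euclidean distance in the F1 × F2 plane, in Hz) and the duration ratio; all values are hypothetical, not the study's data.

```python
import numpy as np

def tense_lax_cues(tense_f1f2, lax_f1f2, tense_dur_ms, lax_dur_ms):
    """Spectral separation (Hz) and duration ratio for one tense/lax vowel pair."""
    spectral_sep = np.linalg.norm(np.asarray(tense_f1f2, float) - np.asarray(lax_f1f2, float))
    duration_ratio = tense_dur_ms / lax_dur_ms
    return spectral_sep, duration_ratio

# Hypothetical /i/ vs /ɪ/ measurements in infant-directed speech:
sep, ratio = tense_lax_cues([300, 2500], [430, 2000], tense_dur_ms=180.0, lax_dur_ms=120.0)
print(f"F1xF2 separation: {sep:.0f} Hz, duration ratio: {ratio:.2f}")
```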

3.
The timing of changes in parameters of speech production was investigated in six cochlear implant users by switching their implant microphones off and on a number of times in a single experimental session. The subjects repeated four short, two-word utterances, /dV1n#SV2d/ (S = /s/ or /ʃ/), in quasi-random order. The changes between hearing and nonhearing states were introduced by a voice-activated switch at V1 onset. "Postural" measures were made of vowel sound pressure level (SPL), duration, F0; contrast measures were made of vowel separation (distance between pair members in the formant plane) and sibilant separation (difference in spectral means). Changes in parameter values were averaged over multiple utterances, lined up with respect to the switch. No matter whether prosthetic hearing was blocked or restored, contrast measures for vowels and sibilants did not change systematically. Some changes in duration, SPL and F0 were observed during the vowel within which hearing state was changed, V1, as well as during V2 and subsequent utterance repetitions. Thus, sound segment contrasts appear to be controlled differently from the postural parameters of speaking rate and average SPL and F0. These findings are interpreted in terms of the function of hypothesized feedback and feedforward mechanisms for speech motor control.
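The sibilant contrast measure here is a difference in spectral means. A minimal sketch of a spectral mean (first spectral moment) computed from a magnitude spectrum follows; whether the study weighted by magnitude or power is not stated, so that choice is an assumption.

```python
import numpy as np

def spectral_mean(segment, fs):
    """First spectral moment (magnitude-weighted mean frequency, Hz) of a fricative segment."""
    mag = np.abs(np.fft.rfft(segment))
    freqs = np.fft.rfftfreq(len(segment), d=1.0 / fs)
    return np.sum(freqs * mag) / np.sum(mag)

# Sibilant separation = spectral mean of /s/ minus spectral mean of /ʃ/.
# White-noise stand-ins for the extracted fricative frames:
fs = 16000
rng = np.random.default_rng(0)
print(spectral_mean(rng.standard_normal(4096), fs) - spectral_mean(rng.standard_normal(4096), fs))
```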

4.
This study examined the ability of cochlear implant users and normal-hearing subjects to perform auditory stream segregation of pure tones. An adaptive, rhythmic discrimination task was used to assess stream segregation as a function of frequency separation of the tones. The results for normal-hearing subjects were consistent with previously published observations (L. P. A. S. van Noorden, Ph.D. dissertation, Eindhoven University of Technology, Eindhoven, The Netherlands, 1975), suggesting that auditory stream segregation increases with increasing frequency separation. For cochlear implant users, there appeared to be a range of pure-tone streaming abilities, with some subjects demonstrating streaming comparable to that of normal-hearing individuals, and others possessing much poorer streaming abilities. The variability in pure-tone streaming of cochlear implant users was correlated with speech perception in both steady-state noise and multi-talker babble. Moderate, statistically significant correlations between streaming and both measures of speech perception in noise were observed, with better stream segregation associated with better understanding of speech in noise. These results suggest that auditory stream segregation is a contributing factor in the ability to understand speech in background noise. The inability of some cochlear implant users to perform stream segregation may therefore contribute to their difficulties in noise backgrounds.
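A sketch of the classic van Noorden streaming stimulus such a task builds on: repeating A-B-A tone triplets whose frequency separation is the independent variable. Tone and gap durations below are assumptions, not the study's parameters.

```python
import numpy as np

def aba_sequence(fa=500.0, delta_semitones=6.0, tone_ms=100, gap_ms=20,
                 n_triplets=5, fs=44100):
    """Repeating A-B-A- tone triplets; larger delta_semitones favors hearing
    the B tones as a separate stream (the galloping rhythm breaks apart)."""
    fb = fa * 2.0 ** (delta_semitones / 12.0)          # B-tone frequency
    t = np.arange(int(fs * tone_ms / 1000)) / fs
    tone = lambda f: np.sin(2 * np.pi * f * t) * np.hanning(t.size)
    gap = np.zeros(int(fs * gap_ms / 1000))
    triplet = np.concatenate([tone(fa), gap, tone(fb), gap, tone(fa), gap, gap])
    return np.tile(triplet, n_triplets)

seq = aba_sequence(delta_semitones=9.0)   # a fairly large frequency separation
```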

5.
Frequency resolution was evaluated for two normal-hearing and seven hearing-impaired subjects with moderate, flat sensorineural hearing loss by measuring percent correct detection of a 2000-Hz tone as the width of a notch in band-reject noise increased. The level of the tone was fixed for each subject at a criterion performance level in broadband noise. Discrimination of synthetic speech syllables that differed in spectral content in the 2000-Hz region was evaluated as a function of the notch width in the same band-reject noise. Recognition of natural speech consonant/vowel syllables in quiet was also tested; results were analyzed for percent correct performance and relative information transmitted for voicing and place features. In the hearing-impaired subjects, frequency resolution at 2000 Hz was significantly correlated with the discrimination of synthetic speech information in the 2000-Hz region and was not related to the recognition of natural speech nonsense syllables unless (a) the speech stimuli contained the vowel /i/ rather than /a/, and (b) the score reflected information transmitted for place of articulation rather than percent correct.
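The feature analysis mentioned here (relative information transmitted for voicing and place) is conventionally computed Miller-and-Nicely style from a confusion matrix collapsed by feature; a sketch follows, with a hypothetical four-consonant matrix since the study's data are not given in the abstract.

```python
import numpy as np

def relative_info_transmitted(confusions, feature):
    """Relative information transmitted for one feature (Miller-Nicely style).

    confusions: (n, n) count matrix, rows = stimuli, columns = responses.
    feature: length-n label per consonant (e.g., 0/1 for voicing).
    """
    feature = np.asarray(feature)
    labels = np.unique(feature)
    k = len(labels)
    # Collapse the consonant confusion matrix into a feature confusion matrix.
    m = np.zeros((k, k))
    for i, a in enumerate(labels):
        for j, b in enumerate(labels):
            m[i, j] = confusions[np.ix_(feature == a, feature == b)].sum()
    p = m / m.sum()
    px, py = p.sum(axis=1), p.sum(axis=0)
    t = sum(p[i, j] * np.log2(p[i, j] / (px[i] * py[j]))
            for i in range(k) for j in range(k) if p[i, j] > 0)
    h = -sum(q * np.log2(q) for q in px if q > 0)      # stimulus feature entropy
    return t / h

# Hypothetical 4-consonant matrix with voicing labels [0, 0, 1, 1]:
c = np.array([[20, 5, 2, 1], [6, 18, 1, 3], [1, 2, 22, 4], [2, 1, 5, 20]])
print(relative_info_transmitted(c, [0, 0, 1, 1]))
```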

6.
This paper investigates the perception of non-native phoneme contrasts which exist in the native language, but not in the position tested. Like English, Dutch contrasts voiced and voiceless obstruents. Unlike English, Dutch allows only voiceless obstruents in word-final position. Dutch and English listeners' accuracy on English final voicing contrasts and their use of preceding vowel duration as a voicing cue were tested. The phonetic structure of Dutch should provide the necessary experience for a native-like use of this cue. Experiment 1 showed that Dutch listeners categorized English final /z/-/s/, /v/-/f/, /b/-/p/, and /d/-/t/ contrasts in nonwords as accurately as initial contrasts, and as accurately as English listeners did, even when release bursts were removed. In experiment 2, English listeners used vowel duration as a cue for one final contrast, although it was uninformative and sometimes mismatched other voicing characteristics, whereas Dutch listeners did not. Although it should be relatively easy for them, Dutch listeners did not use vowel duration. Nevertheless, they attained native-like accuracy, and sometimes even outperformed the native listeners who were liable to be misled by uninformative vowel duration information. Thus, native-like use of cues for non-native but familiar contrasts in unfamiliar positions may hardly ever be attained.

7.
This study was designed to test the hypothesis that the kinematic manipulations used by speakers in different speaking conditions are influenced by kinematic performance limits. A range of kinematic parameter values was elicited by having seven subjects produce cyclical CV movements of lips, tongue blade and tongue dorsum (/ba/, /da/, /ga/), at rates ranging from 1 to 6 Hz. The resulting measures were used to establish speaker- and articulator-specific kinematic performance spaces, defined by movement duration, displacement and peak speed. These data were compared with speech movement data produced by the subjects in several different speaking conditions in the companion study (Perkell et al., 2002). The amount of overlap of the speech data and cyclical data varied across speakers, from almost no overlap to complete overlap. Generally, for a given movement duration, speech movements were larger than cyclical movements, indicating that the speech movements were faster and were produced with greater effort, according to the performance space analysis. It was hypothesized that the cyclical movements of the tongue and lips were slower than the speech movements because they were more constrained by (coupled to) the relatively massive mandible. To test this hypothesis, a comparison was made of cyclical movements in maxillary versus mandibular frames of reference. The results indicate that the cyclical movements were not strongly constrained by mandible movements. The overall results generally indicate that the cyclical task did not succeed in defining the upper limits of kinematic performance spaces within which the speech data were confined. Thus, the hypothesis that performance limits influence speech kinematics could not be tested effectively. The differences between the speech and cyclical movements may be due to other factors, such as differences in speakers' "skill" with the two types of movement, or the size of the movements--the speech movements were larger, probably because of a well-defined target for the primary, stressed vowel.
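A sketch of the three kinematic measures that define the performance spaces (movement duration, displacement, and peak speed), extracted from a sampled position trajectory by numerical differentiation; the trajectory below is synthetic, not articulometer data.

```python
import numpy as np

def kinematic_measures(position_mm, fs):
    """Duration (s), displacement (mm), and peak speed (mm/s) of one movement stroke."""
    duration = (len(position_mm) - 1) / fs
    displacement = np.ptp(position_mm)                    # peak-to-peak excursion
    speed = np.abs(np.gradient(position_mm, 1.0 / fs))    # numerical differentiation
    return duration, displacement, speed.max()

# Synthetic half-cycle of a 4 Hz, 10 mm cyclical movement sampled at 500 Hz:
fs = 500
t = np.arange(0, 0.125, 1.0 / fs)
pos = 10.0 * (1.0 - np.cos(2 * np.pi * 4.0 * t)) / 2.0
print(kinematic_measures(pos, fs))
```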

8.
Standard continuous interleaved sampling processing, and a modified processing strategy designed to enhance temporal cues to voice pitch, were compared on tests of intonation perception and vowel perception, both in implant users and in acoustic simulations. In standard processing, 400 Hz low-pass envelopes modulated either pulse trains (implant users) or noise carriers (simulations). In the modified strategy, slow-rate envelope modulations, which convey dynamic spectral variation crucial for speech understanding, were extracted by low-pass filtering (32 Hz). In addition, during voiced speech, higher-rate temporal modulation in each channel was provided by 100% amplitude-modulation by a sawtooth-like waveform whose periodicity followed the fundamental frequency (F0) of the input. Channel levels were determined by the product of the lower- and higher-rate modulation components. Both in acoustic simulations and in implant users, the ability to use intonation information to identify sentences as question or statement was significantly better with modified processing. However, while there was no difference in vowel recognition in the acoustic simulation, implant users performed worse with modified processing both in vowel recognition and in formant frequency discrimination. It appears that, while enhancing pitch perception, modified processing harmed the transmission of spectral information.
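A sketch of the modified strategy's channel-level computation as described: a 32 Hz low-pass envelope of the rectified channel signal, 100% amplitude-modulated by a sawtooth at the input F0, with the channel level given by their product. The filter order and the fixed F0 below are assumptions; the actual strategy tracked the time-varying F0 of the input.

```python
import numpy as np
from scipy.signal import butter, filtfilt, sawtooth

def modified_channel_level(band_signal, f0_hz, fs, env_cutoff=32.0):
    """Slow (32 Hz low-pass) envelope times a 100%-depth F0-rate sawtooth modulator."""
    b, a = butter(4, env_cutoff / (fs / 2))                     # assumed filter order
    slow_env = np.maximum(filtfilt(b, a, np.abs(band_signal)), 0.0)
    t = np.arange(len(band_signal)) / fs
    modulator = 0.5 * (1.0 + sawtooth(2 * np.pi * f0_hz * t))   # ranges 0..1
    return slow_env * modulator

# Synthetic band signal with a slow amplitude fluctuation:
fs = 16000
t = np.arange(fs) / fs
band = np.sin(2 * np.pi * 1000 * t) * (0.5 + 0.5 * np.sin(2 * np.pi * 3 * t))
level = modified_channel_level(band, f0_hz=120.0, fs=fs)
```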

9.
Two experiments investigating the effects of auditory stimulation delivered via a Nucleus multichannel cochlear implant upon vowel production in adventitiously deafened adult speakers are reported. The first experiment contrasts vowel formant frequencies produced without auditory stimulation (implant processor OFF) to those produced with auditory stimulation (processor ON). Significant shifts in second formant frequencies were observed for intermediate vowels produced without auditory stimulation; however, no significant shifts were observed for the point vowels. Higher first formant frequencies occurred in five of eight vowels when the processor was turned ON versus OFF. A second experiment contrasted productions of the word "head" produced with a FULL map, OFF condition, and a SINGLE channel condition that restricted the amount of auditory information received by the subjects. This experiment revealed significant shifts in second formant frequencies between FULL map utterances and the other conditions. No significant differences in second formant frequencies were observed between SINGLE channel and OFF conditions. These data suggest auditory feedback information may be used to adjust the articulation of some speech sounds.

10.
This study examined intraproduction variability in jitter measures from elderly speakers' sustained vowel productions and tried to determine whether mean jitter levels (percent) and intraspeaker variability on jitter measures are affected significantly by the segment of the vowel selected for measurement. Twenty-eight healthy elderly men (mean age 75.6 years) and women (mean age 72.0 years) were tape-recorded producing 25 repeat trials of the vowels /i/, /a/, and /u/, as steadily as possible. Jitter was analyzed from two segments of each vowel production: (a) the initial 100 cycles after 1 s of phonation, and (b) 100 cycles from the most stable-appearing portion of the production. Results indicated that the measurement point selected for jitter analysis was a significant factor both in the mean jitter level obtained and in the variability of jitter observed across repeat productions.
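A sketch of a local jitter measure over a run of pitch periods; the percent form below (mean absolute difference between consecutive periods, relative to the mean period) is one common definition, since the abstract does not give the study's exact formula.

```python
import numpy as np

def jitter_percent(periods_s):
    """Local jitter (%): mean absolute difference between consecutive pitch
    periods, relative to the mean period (one common definition)."""
    p = np.asarray(periods_s, dtype=float)
    return 100.0 * np.mean(np.abs(np.diff(p))) / np.mean(p)

# 100 synthetic cycles of ~125 Hz phonation with small period perturbations:
rng = np.random.default_rng(1)
periods = 0.008 + rng.normal(0.0, 4e-5, size=100)
print(f"{jitter_percent(periods):.2f}%")
```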

11.

Background  

The cortical activity underlying the perception of vowel identity has typically been addressed by manipulating the first and second formant frequency (F1 & F2) of the speech stimuli. These two values, originating from articulation, are already sufficient for the phonetic characterization of vowel category. In the present study, we investigated how the spectral cues caused by articulation are reflected in cortical speech processing when combined with phonation, the other major part of speech production manifested as the fundamental frequency (F0) and its harmonic integer multiples. To study the combined effects of articulation and phonation we presented vowels with either high (/a/) or low (/u/) formant frequencies which were driven by three different types of excitation: a natural periodic pulseform reflecting the vibration of the vocal folds, an aperiodic noise excitation, or a tonal waveform. The auditory N1m response was recorded with whole-head magnetoencephalography (MEG) from ten human subjects in order to resolve whether brain events reflecting articulation and phonation are specific to the left or right hemisphere of the human brain.

12.
Acoustic analyses and perception experiments were conducted to determine the effects of brief deprivation of auditory feedback on fricatives produced by cochlear implant users. The words /si/ and /ʃi/ were recorded by four children and four adults with their cochlear implant speech processor turned on or off. In the processor-off condition, word durations increased significantly for a majority of talkers. These increases were greater for children compared to adults, suggesting that children may rely on auditory feedback to a greater extent than adults. Significant differences in spectral measures of /ʃ/ were found between processor-on and processor-off conditions for two of the four children and for one of the four adults. These talkers also demonstrated a larger /s/-/ʃ/ contrast in centroid values compared to the other talkers within their respective groups. This finding may indicate that talkers who produce fine spectral distinctions are able to perceive these distinctions through their implants and to use this feedback to fine-tune their speech. Two listening experiments provided evidence that some of the acoustic changes were perceptible to normal-hearing listeners. Taken together, these experiments indicate that for certain cochlear-implant users the brief absence of auditory feedback may lead to perceptible modifications in fricative consonants.

13.
The purpose of this study was to examine the effect of reduced vowel working space on dysarthric talkers' speech intelligibility using both acoustic and perceptual approaches. In experiment 1, the acoustic-perceptual relationship between vowel working space area and speech intelligibility was examined in Mandarin-speaking young adults with cerebral palsy. Subjects read aloud 18 bisyllabic words containing the vowels /i/, /a/, and /u/ using their normal speaking rate. Each talker's words were identified by three normal listeners. The percentages of correct vowel and word identification were calculated as vowel intelligibility and word intelligibility, respectively. Results revealed that talkers with cerebral palsy exhibited smaller vowel working space areas compared to ten age-matched controls. The vowel working space area was significantly correlated with vowel intelligibility (r=0.632, p<0.005) and with word intelligibility (r=0.684, p<0.005). Experiment 2 examined whether tokens of expanded vowel working spaces were perceived as better vowel exemplars and represented with greater perceptual spaces than tokens of reduced vowel working spaces. The results of the perceptual experiment support this prediction. The distorted vowels of talkers with cerebral palsy compose a smaller acoustic space that results in shrunken intervowel perceptual distances for listeners.
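A sketch of the vowel working space computation: the area of the /i/-/a/-/u/ triangle in the F1 × F2 plane via the shoelace formula, followed by a Pearson correlation against intelligibility in the spirit of the reported r values. The talker data below are hypothetical.

```python
import numpy as np

def vowel_space_area(i_f1f2, a_f1f2, u_f1f2):
    """Area (Hz^2) of the /i/-/a/-/u/ triangle in the F1 x F2 plane (shoelace formula)."""
    (x1, y1), (x2, y2), (x3, y3) = i_f1f2, a_f1f2, u_f1f2
    return abs(x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2)) / 2.0

# Hypothetical talkers: vowel space area vs word intelligibility, Pearson r:
areas = [vowel_space_area(*v) for v in [
    ([300, 2300], [750, 1200], [320, 870]),    # larger working space
    ([360, 2050], [640, 1250], [380, 1000]),
    ([420, 1800], [580, 1300], [430, 1100]),   # more reduced working space
]]
intelligibility = [0.92, 0.74, 0.58]
print(np.corrcoef(areas, intelligibility)[0, 1])
```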

14.
Gap detection thresholds for speech and analogous nonspeech stimuli were determined in younger and older adults with clinically normal hearing in the speech range. Gap detection thresholds were larger for older than for younger listeners in all conditions, with the size of the age difference increasing with stimulus complexity. For both ages, gap detection thresholds were far smaller when the markers before and after the gap were the same (spectrally symmetrical) compared to when they were different (spectrally asymmetrical) for both speech and nonspeech stimuli. Moreover, gap detection thresholds were smaller for nonspeech than for speech stimuli when the markers were spectrally symmetrical but the opposite was observed when the markers were spectrally asymmetrical. This pattern of results may reflect the benefit of activating well-learned gap-dependent phonemic contrasts. The stimulus-dependent age effects were interpreted as reflecting the differential effects of age-dependent losses in temporal processing ability on within- and between-channel gap detection.

15.
Humans and monkeys were compared in their differential sensitivity to various acoustic cues underlying voicing contrasts specified by voice-onset time (VOT) in utterance-initial stop consonants. A low-uncertainty repeating standard AX procedure and positive-reinforcement operant conditioning techniques were used to measure difference limens (DLs) along a VOT continuum from -70 ms (prevoiced /ba/) to 0 ms (/ba/) to +70 ms (/pa/). For all contrasts tested, human sensitivity was more acute than that of monkeys. For voicing lag, which spans a phonemic contrast in English, human DLs for a /ba/ (standard)-to-/pa/ (target) continuum averaged 8.3 ms compared to 17 ms for monkeys. Human DLs for a /pa/-to-/ba/ continuum averaged 11 ms compared to 25 ms for monkeys. Larger species differences occurred for voicing lead, which is phonemically nondistinctive in English. Human DLs for a /ba/-to-prevoiced /ba/ continuum averaged 8.2 ms and were four times lower than those of monkeys (35 ms). Monkeys did not reliably discriminate prevoiced /ba/ from /ba/, whereas human DLs averaged 18 ms. The effects of eliminating cues in the English VOT contrasts were also examined. Removal of the aspiration noise in /pa/ greatly increased the DLs and reaction times for both humans and monkeys, but straightening out the F1 transition in /ba/ had only minor effects. Results suggest that quantitative differences in sensitivity should be considered when using monkeys to model the psychoacoustic level of human speech perception.

16.
This study addresses three issues that are relevant to coarticulation theory in speech production: whether the degree of articulatory constraint model (DAC model) accounts for patterns of the directionality of tongue dorsum coarticulatory influences; the extent to which those patterns in tongue dorsum coarticulatory direction are similar to those for the tongue tip; and whether speech motor control and phonemic planning use a fixed or a context-dependent temporal window. Tongue dorsum and tongue tip movement data on vowel-to-vowel coarticulation are reported for Catalan VCV sequences with vowels /i/, /a/, and /u/, and consonants /p/, /n/, dark /l/, /s/, /ʃ/, alveolopalatal /ɲ/ and /k/. Electromidsagittal articulometry recordings were carried out for three speakers using the Carstens articulograph. Trajectory data are presented for the vertical dimension for the tongue dorsum, and for the horizontal dimension for tongue dorsum and tip. In agreement with predictions of the DAC model, results show that directionality patterns of tongue dorsum coarticulation can be accounted for to a large extent based on the articulatory requirements on consonantal production. While dorsals exhibit analogous trends in coarticulatory direction for all articulators and articulatory dimensions, this is mostly so for the tongue dorsum and tip along the horizontal dimension in the case of lingual fricatives and apicolaminal consonants. This finding results from different articulatory strategies: while dorsal consonants are implemented through homogeneous tongue body activation, the tongue tip and tongue dorsum act more independently for more anterior consonantal productions. Discontinuous coarticulatory effects reported in the present investigation suggest that phonemic planning is adaptive rather than context independent.

17.
This paper presents the results of a closed-set recognition task for 64 consonant-vowel sounds (16 C × 4 V, spoken by 18 talkers) in speech-weighted noise (-22, -20, -16, -10, and -2 dB) and in quiet. The confusion matrices were generated using responses of a homogeneous set of ten listeners and the confusions were analyzed using a graphical method. In speech-weighted noise the consonants separate into three sets: a low-scoring set C1 (/f/, /θ/, /v/, /ð/, /b/, /m/), a high-scoring set C2 (/t/, /s/, /z/, /ʃ/, /ʒ/) and set C3 (/n/, /p/, /g/, /k/, /d/) with intermediate scores. The perceptual consonant groups are C1: {/f/-/θ/, /b/-/v/-/ð/, /θ/-/ð/}, C2: {/s/-/z/, /ʃ/-/ʒ/}, and C3: /m/-/n/, while the perceptual vowel groups are /ɑ/-/æ/ and /ɛ/-/ɪ/. The exponential articulation index (AI) model for consonant score works for 12 of the 16 consonants, using a refined expression of the AI. Finally, a comparison with past work shows that white noise masks the consonants more uniformly than speech-weighted noise, and shows that the AI, because it can account for the differences in noise spectra, is a better measure than the wideband signal-to-noise ratio for modeling and comparing the scores with different noise maskers.
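The "exponential articulation index (AI) model" is not spelled out in the abstract. As a heavily hedged sketch, one classical form is the Fletcher relation, in which the recognition error decays exponentially with AI; both the formula variant and the e_min value below are assumptions, not the paper's fitted model.

```python
import numpy as np

def pc_from_ai(ai, e_min=0.015):
    """One classical exponential AI model (Fletcher form): P_c(AI) = 1 - e_min**AI.
    e_min is the residual error at AI = 1; 0.015 is a placeholder, not the
    paper's fitted value, and chance level is ignored here."""
    return 1.0 - e_min ** np.asarray(ai, dtype=float)

print(pc_from_ai([0.25, 0.5, 1.0]))   # score rises toward 1 - e_min as AI -> 1
```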

18.
The present study evaluated auditory-visual speech perception in cochlear-implant users as well as normal-hearing and simulated-implant controls to delineate relative contributions of sensory experience and cues. Auditory-only, visual-only, or auditory-visual speech perception was examined in the context of categorical perception, in which an animated face mouthing ba, da, or ga was paired with synthesized phonemes from an 11-token auditory continuum. A three-alternative, forced-choice method was used to yield percent identification scores. Normal-hearing listeners showed sharp phoneme boundaries and strong reliance on the auditory cue, whereas actual and simulated implant listeners showed much weaker categorical perception but stronger dependence on the visual cue. The implant users were able to integrate both congruent and incongruent acoustic and optical cues to derive relatively weak but significant auditory-visual integration. This auditory-visual integration was correlated with the duration of the implant experience but not the duration of deafness. Compared with the actual implant performance, acoustic simulations of the cochlear implant could predict the auditory-only performance but not the auditory-visual integration. These results suggest that both altered sensory experience and impoverished acoustic cues contribute to the auditory-visual speech perception in cochlear-implant users.
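A sketch of how the sharpness of a phoneme boundary along an 11-token continuum can be quantified: fit a logistic psychometric function and read off the boundary location and slope (a steeper slope means more categorical perception). The response proportions below are invented for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(x, boundary, slope):
    """Psychometric function: probability of one response category along the continuum."""
    return 1.0 / (1.0 + np.exp(-slope * (x - boundary)))

# Invented identification data along an 11-token continuum:
tokens = np.arange(1, 12, dtype=float)
p_da = np.array([0.02, 0.03, 0.05, 0.10, 0.25, 0.55, 0.85, 0.95, 0.97, 0.98, 0.99])
(boundary, slope), _ = curve_fit(logistic, tokens, p_da, p0=[6.0, 1.0])
print(f"boundary at token {boundary:.2f}, slope {slope:.2f}")
```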

19.
The speech of a postlingually deafened preadolescent was recorded and analyzed while a single-electrode cochlear implant (3M/House) was in operation, on two occasions after it failed (1 day and 18 days) and on three occasions after stimulation of a multichannel cochlear implant (Nucleus 22) (1 day, 6 months, and 1 year). Listeners judged 3M/House tokens to be the most normal until the subject had one year's experience with the Nucleus device. Spectrograms showed less aspiration, better formant definition and longer final frication and closure duration post-Nucleus stimulation (6 MO. NUCLEUS and 1 YEAR NUCLEUS) relative to the 3M/House and no auditory feedback conditions. Acoustic measurements after loss of auditory feedback (1 DAY FAIL and 18 DAYS FAIL) indicated a constriction of vowel space. Appropriately higher fundamental frequency for stressed than unstressed syllables, an expansion of vowel space and improvement in some aspects of production of voicing, manner and place of articulation were noted one year post-Nucleus stimulation. Loss of auditory feedback results are related to the literature on the effects of postlingual deafness on speech. Nucleus and 3M/House effects on speech are discussed in terms of speech production studies of single-electrode and multichannel patients.

20.
Speech recognition was measured as a function of spectral resolution (number of spectral channels) and speech-to-noise ratio in normal-hearing (NH) and cochlear-implant (CI) listeners. Vowel, consonant, word, and sentence recognition were measured in five normal-hearing listeners, ten listeners with the Nucleus-22 cochlear implant, and nine listeners with the Advanced Bionics Clarion cochlear implant. Recognition was measured as a function of the number of spectral channels (noise bands or electrodes) at signal-to-noise ratios of +15, +10, +5, and 0 dB, and in quiet. Performance with three different speech processing strategies (SPEAK, CIS, and SAS) was similar across all conditions, and improved as the number of electrodes increased (up to seven or eight) for all conditions. For all noise levels, vowel and consonant recognition with the SPEAK speech processor did not improve with more than seven electrodes, while for normal-hearing listeners, performance continued to increase up to at least 20 channels. Speech recognition on more difficult speech materials (word and sentence recognition) showed a marginally significant increase in Nucleus-22 listeners from seven to ten electrodes. The average implant score on all processing strategies was poorer than scores of NH listeners with similar processing. However, the best CI scores were similar to the normal-hearing scores for that condition (up to seven channels). CI listeners with the highest performance level increased in performance as the number of electrodes increased up to seven, while CI listeners with low levels of speech recognition did not increase in performance as the number of electrodes was increased beyond four. These results quantify the effect of number of spectral channels on speech recognition in noise and demonstrate that most CI subjects are not able to fully utilize the spectral information provided by the number of electrodes used in their implant.
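A sketch of the noise-band vocoder used for the normal-hearing simulations: the signal is split into log-spaced bands, each band's envelope is extracted and re-imposed on band-limited noise, and the bands are summed. Band edges, filter orders, and the envelope cutoff below are assumptions, not the studies' exact parameters.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def noise_vocoder(x, fs, n_channels=8, lo=200.0, hi=7000.0, env_cutoff=160.0):
    """Noise-band vocoder: log-spaced analysis bands, per-band envelope extraction,
    envelopes re-imposed on same-band noise carriers, bands summed."""
    edges = np.geomspace(lo, hi, n_channels + 1)
    b_env, a_env = butter(2, env_cutoff / (fs / 2))
    rng = np.random.default_rng(0)
    out = np.zeros(len(x))
    for k in range(n_channels):
        wn = [edges[k] / (fs / 2), edges[k + 1] / (fs / 2)]
        b, a = butter(3, wn, btype="band")
        band = filtfilt(b, a, x)
        env = np.maximum(filtfilt(b_env, a_env, np.abs(band)), 0.0)   # envelope
        carrier = filtfilt(b, a, rng.standard_normal(len(x)))         # same-band noise
        out += env * carrier
    return out

fs = 16000
x = np.random.default_rng(1).standard_normal(fs)   # stand-in for a speech waveform
y = noise_vocoder(x, fs, n_channels=4)
```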
