Similar Articles
Found 20 similar articles (search time: 31 ms)
1.
Both English and Japanese have two voiceless sibilant fricatives, an anterior fricative /s/ contrasting with a more posterior fricative /ʃ/. When children acquire sibilant fricatives, English children typically substitute [s] for /ʃ/, whereas Japanese children typically substitute [ʃ] for /s/. This study examined English- and Japanese-speaking adults' perception of children's productions of voiceless sibilant fricatives to investigate whether the apparent asymmetry in the acquisition of voiceless sibilant fricatives reported previously in the two languages was due in part to how adults perceive children's speech. The results of this study show that adult speakers of English and Japanese weighed acoustic parameters differently when identifying fricatives produced by children and that these differences explain, in part, the apparent cross-language asymmetry in fricative acquisition. This study shows that generalizations about universal and language-specific patterns in speech-sound development cannot be determined without considering all sources of variation, including speech perception.

2.
Modifying the vocal tract alters a speaker's previously learned acoustic-articulatory relationship. This study investigated the contribution of auditory feedback to the process of adapting to vocal-tract modifications. Subjects said the word /tas/ while wearing a dental prosthesis that extended the length of their maxillary incisor teeth. The prosthesis affected /s/ productions and the subjects were asked to learn to produce "normal" /s/'s. They alternately received normal auditory feedback and noise that masked their natural feedback during productions. Acoustic analysis of the speakers' /s/ productions showed that the distribution of energy across the spectra moved toward that of normal, unperturbed production with increased experience with the prosthesis. However, the acoustic analysis did not show any significant differences in learning dependent on auditory feedback. By contrast, when naive listeners were asked to rate the quality of the speakers' utterances, productions made when auditory feedback was available were evaluated to be closer to the subjects' normal productions than when feedback was masked. The perceptual analysis showed that speakers were able to use auditory information to partially compensate for the vocal-tract modification. Furthermore, utterances produced during the masked conditions also improved over a session, demonstrating that the compensatory articulations were learned and available after auditory feedback was removed.

3.
Speech production by children with cochlear implants (CIs) is generally less intelligible and less accurate on a phonemic level than that of normally hearing children. Research has reported that children with CIs produce less acoustic contrast between phonemes than normally hearing children, but these studies have included correct and incorrect productions. The present study compared the extent of contrast between correct productions of /s/ and /ʃ/ by children with CIs and two comparison groups: (1) normally hearing children of the same chronological age as the children with CIs and (2) normally hearing children with the same duration of auditory experience. Spectral peaks and means were calculated from the frication noise of productions of /s/ and /ʃ/. Results showed that the children with CIs produced less contrast between /s/ and /ʃ/ than normally hearing children of the same chronological age and normally hearing children with the same duration of auditory experience, due to production of /s/ with spectral peaks and means at lower frequencies. The results indicate that there may be differences between the speech sounds produced by children with CIs and their normally hearing peers even for sounds that adults judge as correct.
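
The spectral peak and spectral mean used to quantify the /s/ vs /ʃ/ contrast can be computed from a frication-noise segment with a simple FFT analysis. A minimal sketch in NumPy; the window type, FFT length, and segment selection here are illustrative assumptions, not the study's actual analysis settings:

```python
import numpy as np

def spectral_peak_and_mean(signal, sr, nfft=1024):
    """Spectral peak (frequency of maximum magnitude) and spectral mean
    (magnitude-weighted average frequency) of a frication-noise segment.
    Window and FFT length are assumptions for illustration."""
    n = min(len(signal), nfft)
    windowed = signal[:n] * np.hamming(n)          # taper the segment
    spectrum = np.abs(np.fft.rfft(windowed, nfft)) # magnitude spectrum
    freqs = np.fft.rfftfreq(nfft, d=1.0 / sr)
    peak = freqs[np.argmax(spectrum)]
    mean = np.sum(freqs * spectrum) / np.sum(spectrum)
    return peak, mean
```

For adult /s/ the peak typically falls well above that of /ʃ/; the abstract's finding corresponds to CI children's /s/ peaks and means sitting at lower frequencies than their peers'.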

4.
Sibilant groove place and width were initially examined during [s] and [ʃ] in isolation and in CV and VC syllables. The [s] was found to be produced through a 6- to 8-mm-wide groove near the front of the alveolar ridge by one talker and near the back of the ridge by the other. [ʃ] was produced through a 10- to 12-mm groove behind the posterior border of the alveolar ridge by both. In the second experiment three subjects used visual articulatory feedback to vary sibilant groove width and place systematically. One subject was able to do this with comparatively few retrials; one had difficulty with certain targeted grooves; one had difficulty with many targeted grooves. The noises generated were replayed to 14 listeners who labeled them as "s," "probably s," "probably sh," or "sh." They usually heard the sound as [s] when the grooves were narrow and near the front of the alveolar process, and as [ʃ] when the groove was wider and behind the alveolar process. Noise through grooves that matched natural speech places and widths usually produced higher listener recognition scores. Exceptions were found when the subjects had unusual difficulty in achieving stipulated groove widths and places.

5.
The role of auditory feedback in speech production was investigated by examining speakers' phonemic contrasts produced under increases in the noise-to-signal ratio (N/S). Seven cochlear implant users and seven normal-hearing controls pronounced utterances containing the vowels /i/, /u/, /e/, and /æ/ and the sibilants /s/ and /ʃ/ while hearing their speech mixed with noise at seven equally spaced levels between their thresholds of detection and discomfort. Speakers' average vowel duration and SPL generally rose with increasing N/S. Average vowel contrast was initially flat or rising; at higher N/S levels, it fell. A contrast increase is interpreted as reflecting speakers' attempts to maintain clarity under degraded acoustic transmission conditions. As N/S increased, speakers could detect the extent of their phonemic contrasts less effectively, and the competing influence of economy of effort led to contrast decrements. The sibilant contrast was more vulnerable to noise; it decreased over the entire range of increasing N/S for controls and was variable for implant users. The results are interpreted as reflecting the combined influences of a clarity constraint, economy of effort, and the effect of masking on achieving auditory phonemic goals, with implant users less able to increase contrasts in noise than controls.
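
Presenting speech mixed with noise at a controlled noise-to-signal ratio amounts to scaling a masker relative to the speech RMS. A minimal sketch of that mixing step, assuming a flat Gaussian masker (the study's actual masker and calibration are not specified in the abstract):

```python
import numpy as np

def mix_at_ns_ratio(speech, ns_db, rng=None):
    """Add white noise to speech at a target noise-to-signal ratio in dB.
    ns_db = 0 means noise RMS equals speech RMS; larger values mean
    proportionally more noise. Gaussian masker is an assumption."""
    rng = np.random.default_rng(0) if rng is None else rng
    rms_speech = np.sqrt(np.mean(speech ** 2))
    noise = rng.standard_normal(len(speech))
    # Scale noise so that rms(noise) / rms(speech) hits the target ratio.
    noise *= rms_speech * 10 ** (ns_db / 20) / np.sqrt(np.mean(noise ** 2))
    return speech + noise
```

Stepping ns_db through seven equally spaced values between a listener's detection and discomfort levels would reproduce the kind of level ladder the abstract describes.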

6.
Two experiments investigating the effects of auditory stimulation delivered via a Nucleus multichannel cochlear implant upon vowel production in adventitiously deafened adult speakers are reported. The first experiment contrasts vowel formant frequencies produced without auditory stimulation (implant processor OFF) to those produced with auditory stimulation (processor ON). Significant shifts in second formant frequencies were observed for intermediate vowels produced without auditory stimulation; however, no significant shifts were observed for the point vowels. Higher first formant frequencies occurred in five of eight vowels when the processor was turned ON versus OFF. A second experiment contrasted productions of the word "head" produced with a FULL map, OFF condition, and a SINGLE channel condition that restricted the amount of auditory information received by the subjects. This experiment revealed significant shifts in second formant frequencies between FULL map utterances and the other conditions. No significant differences in second formant frequencies were observed between SINGLE channel and OFF conditions. These data suggest auditory feedback information may be used to adjust the articulation of some speech sounds.

7.
Temporal auditory acuity, the ability to discriminate rapid changes in the envelope of a sound, is essential for speech comprehension. Human envelope following responses (EFRs) recorded from scalp electrodes were evaluated as an objective measurement of temporal processing in the auditory nervous system. The temporal auditory acuity of older and younger participants was measured behaviorally using both gap and modulation detection tasks. These findings were then related to EFRs evoked by white noise that was amplitude modulated (25% modulation depth) with a sweep of modulation frequencies from 20 to 600 Hz. The frequency at which the EFR was no longer detectable was significantly correlated with behavioral measurements of gap detection (r = -0.43), and with the maximum perceptible modulation frequency (r = 0.72). The EFR techniques investigated here might be developed into a clinically useful objective estimate of temporal auditory acuity for subjects who cannot provide reliable behavioral responses.
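
The stimulus class described, white noise amplitude modulated at 25% depth with the modulation frequency swept from 20 to 600 Hz, can be synthesized by integrating an instantaneous modulation frequency into a phase track. A sketch under assumptions: the sweep shape (linear) and sample rate are not given in the abstract.

```python
import numpy as np

def efr_sweep_stimulus(sr=24000, dur=10.0, f_lo=20.0, f_hi=600.0, depth=0.25):
    """White noise amplitude-modulated by a sinusoid whose frequency
    sweeps linearly from f_lo to f_hi Hz at the given modulation depth.
    Sweep shape and sample rate are illustrative assumptions."""
    t = np.arange(int(sr * dur)) / sr
    inst_freq = f_lo + (f_hi - f_lo) * t / dur      # linear sweep
    phase = 2 * np.pi * np.cumsum(inst_freq) / sr   # integrate to phase
    envelope = 1.0 + depth * np.sin(phase)          # 25% depth by default
    noise = np.random.default_rng(0).standard_normal(len(t))
    return envelope * noise
```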

8.
This study investigated noise-induced changes in suppression growth (SG) of distortion product otoacoustic emissions (DPOAEs). Detailed measurements of SG were obtained in rabbits as a function of f2 frequencies at four primary-tone levels. SG measures were produced by using suppressor tones (STs) presented at two fixed distances from f2. The magnitude of suppression was calculated for each ST level and depicted as contour plots showing the amount of suppression as a function of the f2 frequency. At each f2, SG indices included slope, suppression threshold, and an estimate of the tip-to-tail value. All suppression measures were obtained before and after producing a cochlear dysfunction using a monaural exposure to a 2-h, 110-dB SPL octave-band noise centered at 2 kHz. The noise exposure produced varying amounts of cochlear damage as revealed by changes in DP-grams and auditory brainstem responses. However, average measures of SG slopes, suppression thresholds, and tip-to-tail values failed to mirror the mean DP-gram loss patterns. When suppression-based parameters were correlated with the amount of DPOAE loss, small but significant correlations were observed for some measures. Overall, the findings suggest that measures derived from DPOAE SG are limited in their ability to detect noise-induced cochlear damage.

9.
The role of auditory feedback in speech motor control was explored in three related experiments. Experiment 1 investigated auditory sensorimotor adaptation: the process by which speakers alter their speech production to compensate for perturbations of auditory feedback. When the first formant frequency (F1) was shifted in the feedback heard by subjects as they produced vowels in consonant-vowel-consonant (CVC) words, the subjects' vowels demonstrated compensatory formant shifts that were maintained when auditory feedback was subsequently masked by noise, evidence of adaptation. Experiment 2 investigated auditory discrimination of synthetic vowel stimuli differing in F1 frequency, using the same subjects. Those with more acute F1 discrimination had compensated more to F1 perturbation. Experiment 3 consisted of simulations with the Directions Into Velocities of Articulators (DIVA) model of speech motor planning, which showed that the model can account for key aspects of compensation. In the model, movement goals for vowels are regions in auditory space; perturbation of auditory feedback invokes auditory feedback control mechanisms that correct for the perturbation, which in turn causes updating of feedforward commands to incorporate these corrections. The relation between speaker acuity and amount of compensation to auditory perturbation is mediated by the size of speakers' auditory goal regions, with more acute speakers having smaller goal regions.
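
The loop the abstract describes, feedback errors outside an auditory goal region driving corrections that update feedforward commands, can be illustrated with a deliberately simplified toy model. This is not the actual DIVA implementation; every parameter value below is an illustrative assumption.

```python
def simulate_adaptation(target=600.0, shift=100.0, goal_radius=20.0,
                        n_trials=100, fb_gain=0.5, ff_rate=0.2):
    """Toy sketch of sensorimotor adaptation to an F1 shift (Hz).
    Errors outside the goal region trigger feedback corrections,
    which are partially folded into the feedforward command.
    All numeric values are illustrative assumptions."""
    feedforward = target
    for _ in range(n_trials):
        heard = feedforward + shift                  # perturbed feedback
        error = heard - target
        # No correction once feedback lands inside the goal region.
        correction = -fb_gain * error if abs(error) > goal_radius else 0.0
        feedforward += ff_rate * correction          # update feedforward
    return feedforward
```

Running the sketch shows the two qualitative results: the final feedforward F1 moves opposite to the shift (compensation), and a smaller goal_radius, the toy analogue of higher acuity, yields more compensation.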

10.
Scientists have made great strides toward understanding the mechanisms of speech production and perception. However, the complex relationships between the acoustic structures of speech and the resulting psychological percepts have yet to be fully and adequately explained, especially in speech produced by younger children. Thus, this study examined the acoustic structure of voiceless fricatives (/f, θ, s, ʃ/) produced by adults and typically developing children from 3 to 6 years of age in terms of multiple acoustic parameters (durations, normalized amplitude, spectral slope, and spectral moments). It was found that the acoustic parameters of spectral slope and variance (commonly excluded from previous studies of child speech) were important acoustic parameters in the differentiation and classification of the voiceless fricatives, with spectral variance being the only measure to separate all four places of articulation. It was further shown that the sibilant contrast between /s/ and /ʃ/ was less distinguished in children than adults, characterized by a dramatic change in several spectral parameters at approximately five years of age. Discriminant analysis revealed evidence that classification models based on adult data were sensitive to these spectral differences in the five-year-old age group.
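
Spectral moments treat the power spectrum as a probability distribution and summarize it by its mean, variance, skewness, and kurtosis. A minimal sketch of that computation; the windowing and any preemphasis used in the study are assumptions here:

```python
import numpy as np

def spectral_moments(signal, sr, nfft=1024):
    """First four spectral moments of a power spectrum: mean (Hz),
    variance (Hz^2), skewness, and excess kurtosis. Preprocessing
    choices (window, FFT length) are illustrative assumptions."""
    n = min(len(signal), nfft)
    spectrum = np.abs(np.fft.rfft(signal[:n] * np.hamming(n), nfft)) ** 2
    freqs = np.fft.rfftfreq(nfft, d=1.0 / sr)
    p = spectrum / np.sum(spectrum)        # normalize to a distribution
    mean = np.sum(freqs * p)
    var = np.sum((freqs - mean) ** 2 * p)
    skew = np.sum((freqs - mean) ** 3 * p) / var ** 1.5
    kurt = np.sum((freqs - mean) ** 4 * p) / var ** 2 - 3.0
    return mean, var, skew, kurt
```

The abstract's finding that spectral variance alone separated all four places of articulation corresponds to the second moment returned here.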

11.
This paper presents an experiment where participants were asked to adjust, while walking, the spectral content and the amplitude of synthetic footstep sounds in order to match the sounds of their own footsteps. The sounds were interactively generated by means of a shoe-based system capable of tracking footfalls and delivering real-time auditory feedback via headphones. Results allowed identification of the mean value and the range of variation of spectral centroid and peak level of footstep sounds simulating various combinations of shoe type and ground material. Results showed that the effect of ground material on centroid and peak level depended on the type of shoe. Similarly, the effect of shoe type on the two variables depended on the type of ground material. In particular, participants produced greater amplitudes for hard-sole shoes than for soft-sole shoes in the presence of solid surfaces, while similar amplitudes for both types of shoes were found for aggregate, hybrid, and liquid surfaces. No significant correlations were found between each of the two acoustic features and participants' body size. This result might be explained by the fact that while adjusting the sounds participants did not primarily focus on the acoustic rendering of their body. In addition, no significant differences were found between the values of the two acoustic features selected by the experimenters and those adjusted by participants. This result can therefore be considered as a measure of the goodness of the design choices to synthesize the involved footstep sounds for a generic walker. More importantly, this study showed that the relationships between the ground-shoe combinations are not changed when participants are actively walking. This represents the first active-listening confirmation of this result, which had previously only been shown in passive listening studies.
The results of this research can be used to design ecologically valid auditory rendering of foot-floor interactions in virtual environments.

12.
Speech duration characteristics of phrase-level utterances produced by 26 severely and profoundly hearing-impaired adults were examined acoustically using relative timing measures. The measures were then compared to the same utterances produced by 13 normal-hearing adults. Although absolute speech durations of the hearing-impaired subjects were significantly longer than their normal-hearing counterparts, relative timing did not differ between groups. Findings are discussed in relation to the biological constraint hypothesis associated with speech timing, as well as the role of auditory feedback in models of speech production.

13.
Within-subject variation of three vocal frequency perturbation indices was compared across multiple sessions. The magnitude of jitter factor (JF), pitch perturbation quotient (PPQ), and directional perturbation factor (DPF) was measured every other day for 33 consecutive days for ten female and five male normal young adult speakers. Perturbation measures were calculated using a zero-crossing analysis of taped [i] and [u] productions. Pearson product-moment correlations among the three perturbation indices were calculated to examine their relation over time. Coefficients of variation for JF, PPQ, and DPF were considered indicative of the temporal stability of the three measures. JF and PPQ provided redundant information about laryngeal behaviors in steady-state productions. DPF, however, appeared to measure different laryngeal behaviors. Also, JF and PPQ varied considerably within individuals across sessions while DPF was the more temporally stable measure. Multiple sampling sessions and measurement of both the magnitude and direction of period differences are advised for future investigations of vocal frequency perturbation.
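
The three indices can be computed directly from a sequence of consecutive pitch periods. A sketch using common definitions (JF as mean absolute cycle-to-cycle period difference relative to mean period; PPQ with a 5-point local average; DPF as the percentage of sign changes in successive period differences); the study's exact smoothing window for PPQ is an assumption:

```python
import numpy as np

def perturbation_indices(periods_ms):
    """Jitter factor (JF), 5-point pitch perturbation quotient (PPQ),
    and directional perturbation factor (DPF), all in percent, from a
    sequence of consecutive pitch periods. Common textbook definitions;
    the original study's exact PPQ window is an assumption."""
    p = np.asarray(periods_ms, dtype=float)
    jf = 100.0 * np.mean(np.abs(np.diff(p))) / np.mean(p)
    # PPQ: deviation of each period from its 5-point local average.
    smooth = np.convolve(p, np.ones(5) / 5, mode="valid")
    ppq = 100.0 * np.mean(np.abs(p[2:-2] - smooth)) / np.mean(p)
    # DPF: percentage of sign changes among successive period differences.
    signs = np.sign(np.diff(p))
    dpf = 100.0 * np.mean(signs[1:] * signs[:-1] < 0)
    return jf, ppq, dpf
```

JF and PPQ both index the magnitude of perturbation, which is consistent with the abstract's observation that they carry redundant information, whereas DPF ignores magnitude entirely and tracks only direction.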

14.
This study investigated the role of sensory feedback during the production of front vowels. A temporary aftereffect induced by tongue loading was employed to modify the somatosensory-based perception of tongue height. Following the removal of tongue loading, tongue height during vowel production was estimated by measuring the frequency of the first formant (F1) from the acoustic signal. In experiment 1, the production of front vowels following tongue loading was investigated either in the presence or absence of auditory feedback. With auditory feedback available, the tongue height of front vowels was not modified by the aftereffect of tongue loading. By contrast, speakers did not compensate for the aftereffect of tongue loading when they produced vowels in the absence of auditory feedback. In experiment 2, the characteristics of the masking noise were manipulated such that it masked energy either in the F1 region or in the region of the second and higher formants. The results showed that the adjustment of tongue height during the production of front vowels depended on information about F1 in the auditory feedback. These findings support the idea that speech goals include both auditory and somatosensory targets and that speakers are able to make use of information from both sensory modalities to maximize the accuracy of speech production.

15.
For all but the most profoundly hearing-impaired (HI) individuals, auditory-visual (AV) speech has been shown consistently to afford more accurate recognition than auditory (A) or visual (V) speech. However, the amount of AV benefit achieved (i.e., the superiority of AV performance in relation to unimodal performance) can differ widely across HI individuals. To begin to explain these individual differences, several factors need to be considered. The most obvious of these are deficient A and V speech recognition skills. However, large differences in individuals' AV recognition scores persist even when unimodal skill levels are taken into account. These remaining differences might be attributable to differing efficiency in the operation of a perceptual process that integrates A and V speech information. There is at present no accepted measure of the putative integration process. In this study, several possible integration measures are compared using both congruent and discrepant AV nonsense syllable and sentence recognition tasks. Correlations were tested among the integration measures, and between each integration measure and independent measures of AV benefit for nonsense syllables and sentences in noise. Integration measures derived from tests using nonsense syllables were significantly correlated with each other; on these measures, HI subjects show generally high levels of integration ability. Integration measures derived from sentence recognition tests were also significantly correlated with each other, but were not significantly correlated with the measures derived from nonsense syllable tests. Similarly, the measures of AV benefit based on nonsense syllable recognition tests were found not to be significantly correlated with the benefit measures based on tests involving sentence materials. 
Finally, there were significant correlations between AV integration and benefit measures derived from the same class of speech materials, but nonsignificant correlations between integration and benefit measures derived from different classes of materials. These results suggest that the perceptual processes underlying AV benefit and the integration of A and V speech information might not operate in the same way on nonsense syllable and sentence input.

16.
Speech reception thresholds (SRTs) for sentences were determined in stationary and modulated background noise for two age-matched groups of normal-hearing (N = 13) and hearing-impaired listeners (N = 21). Correlations were studied between the SRT in noise and measures of auditory and nonauditory performance, after which stepwise regression analyses were performed within both groups separately. Auditory measures included the pure-tone audiogram and tests of spectral and temporal acuity. Nonauditory factors were assessed by measuring the text reception threshold (TRT), a visual analogue of the SRT, in which partially masked sentences were adaptively presented. Results indicate that, for the normal-hearing group, the variance in speech reception is mainly associated with nonauditory factors, both in stationary and in modulated noise. For the hearing-impaired group, speech reception in stationary noise is mainly related to the audiogram, even when audibility effects are accounted for. In modulated noise, both auditory (temporal acuity) and nonauditory factors (TRT) contribute to explaining interindividual differences in speech reception. Age was not a significant factor in the results. It is concluded that, under some conditions, nonauditory factors are relevant for the perception of speech in noise. Further evaluation of nonauditory factors might enable adapting the expectations from auditory rehabilitation in clinical settings.

17.
Perception of second language speech sounds is influenced by one's first language. For example, speakers of American English have difficulty perceiving dental versus retroflex stop consonants in Hindi, although English has both dental and retroflex allophones of alveolar stops. Japanese, unlike English, has a contrast similar to Hindi, specifically the Japanese /d/ versus the flapped /r/, which is sometimes produced as a retroflex. This study compared American and Japanese speakers' identification of the Hindi contrast in CV syllable contexts where C varied in voicing and aspiration. The study then evaluated participants' improvement in identifying the distinction after training with a computer-interactive program. Training sessions progressively increased in difficulty by decreasing the extent of vowel truncation in stimuli and by adding new speakers. Although all participants improved significantly, Japanese participants were more accurate than Americans in distinguishing the contrast on pretest, during training, and on posttest. Transfer was observed to three new consonantal contexts, a new vowel context, and a new speaker's productions. Some abstract aspect of the contrast was apparently learned during training. It is suggested that allophonic experience with dental and retroflex stops may be detrimental to perception of the new contrast.

18.
Recent studies have demonstrated the use of manganese ion (Mn2+) as an in vivo neuronal tract tracer. In contrast to histological approaches, manganese tracing can be performed repeatedly on the same living animal. In this study, we describe the neuroaxonal tracing of the auditory pathway in the living guinea pig, relying on the fact that the Mn2+ ion enters excitable cells through voltage-gated calcium channels and is an excellent MRI paramagnetic tract-tracing agent. Small focal injections of Mn2+ ion into the cochlea produced significant contrast enhancement along the known neuronal circuitry. This in vivo approach, allowing repeated measures, is expected to open new vistas to study auditory physiology and to provide new insights on in vivo axonal transport and neuronal activity in the central auditory system.

19.
Estimates of the ability to make use of sentence context in 34 postlingually hearing-impaired (HI) individuals were obtained using formulas developed by Boothroyd and Nittrouer [Boothroyd and Nittrouer, J. Acoust. Soc. Am. 84, 101-114 (1988)] which relate scores for isolated words to words in meaningful sentences. Sentence materials were constructed by concatenating digitized productions of isolated words to ensure physical equivalence among the test items in the two conditions. Isolated words and words in sentences were tested at three levels of intelligibility (targeting 29%, 50%, and 79% correct). Thus, for each subject, three estimates of context ability, or k factors, were obtained. In addition, auditory, visual, and auditory-visual sentence recognition was evaluated using natural productions of sentence materials. Two main questions were addressed: (1) Is context ability constant for speech materials produced with different degrees of clarity? and (2) What are the relations between individual estimates of k and sentence recognition as a function of presentation modality? Results showed that estimates of k were not constant across different levels of intelligibility: k was greater for the more degraded condition relative to conditions of higher word intelligibility. Estimates of k also were influenced strongly by the test order of isolated words and words in sentences. That is, prior exposure to words in sentences improved later recognition of the same words when presented in isolation (and vice versa), even though the 1500 key words comprising the test materials were presented under degraded (filtered) conditions without feedback. The impact of this order effect was to reduce individual estimates of k for subjects exposed to sentence materials first and to increase estimates of k for subjects exposed to isolated words first.
Finally, significant relationships were found between individual k scores and sentence recognition scores in all three presentation modalities, suggesting that k is a useful measure of individual differences in the ability to use sentence context.
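
The k factor comes from the Boothroyd and Nittrouer relation p_sentence = 1 - (1 - p_isolated)^k, where p_isolated and p_sentence are the proportions of words recognized in isolation and in meaningful sentences. Solving for k gives the context measure directly. A minimal transcription of that formula (ignoring floor and ceiling scores, where the logarithms are undefined):

```python
import math

def context_factor_k(p_isolated, p_sentence):
    """k factor from the Boothroyd and Nittrouer (1988) relation
    p_sentence = 1 - (1 - p_isolated)**k. Assumes both proportions
    are strictly between 0 and 1."""
    return math.log(1.0 - p_sentence) / math.log(1.0 - p_isolated)
```

For example, if a listener scores 50% on isolated words and 75% on the same words in sentences, k = log(0.25)/log(0.5) = 2, indicating substantial use of sentence context; k = 1 means sentences confer no benefit.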

20.
This study used ultrasound imaging to examine the cross-sectional shape of the tongue during the production of the ten English vowels (see text) in two consonant contexts--/p/ and /s/--and at two scan angles--anterior and posterior. Results were compared with models of sagittal tongue shape. A newly built transducer holder and head restraint maintained the ultrasound transducer in a fixed position inferior to the mandible at a chosen location and angle. The transducer was free to move only in a superior/inferior direction, and demonstrated reliable tracking of the jaw. Since the tongue is anisotropic along its length, anterior and posterior scan angles were examined to identify differences in tongue shape. Similarly, the coarticulatory effects of the sibilant /s/ versus the bilabial /p/ were examined, to assess variability of intrinsic tongue shape for the vowels. Results showed that the subject's midsagittal tongue grooving was almost universal for the vowels. Posterior grooves were deeper than anterior grooves. In /s/ context, posterior tongue grooves were shallower than in /p/ context. Anteriorly, /s/ context caused deeper grooves for low vowels. Cross-sectional tongue shape varied with tongue position similarly to sagittal tongue shape.
