Similar literature
 Found 20 similar records (search time: 15 ms)
1.
2.
An important speech cue is voice onset time (VOT), a cue for the perception of voicing and aspiration in word-initial stops. Preaspiration, an [h]-like sound between a vowel and the following stop, can be cued by voice offset time (VOffT), a cue which in most respects mirrors VOT. In Icelandic, VOffT is much more sensitive to the duration of the preceding vowel than VOT is to the duration of the following vowel. This has been explained by noting that preaspiration can only follow a phonemically short vowel. Lengthening of the vowel, either by changing its duration or by moving the spectrum towards that appropriate for a long vowel, will thus demand a longer VOffT to cue preaspiration. An experiment is reported showing that the greater effect of vowel quantity on the perception of VOffT than on the perception of VOT cannot be explained by the effect of F1 frequency at vowel offset.

3.
Perception of sine-wave analogs of voice onset time stimuli (cited by: 1; self-citations: 0; citations by others: 1)
It has been argued that perception of stop consonant voicing contrasts is based on auditory mechanisms responsible for the resolution of temporal order. As one source of evidence, category boundaries for nonspeech stimuli whose components vary in relative onset time are reasonably close to the labeling boundary for a labial stop voiced-voiceless continuum. However, voicing boundaries change considerably when the onset frequency of the first formant (F1) is varied, either directly or as a side effect of a change in F1 transition duration. Stimuli consisted of a midfrequency sinusoid that was initiated 0-50 ms prior to the onset of a low-frequency sinusoid. Results showed that the labeling boundary for relative onset time increased for longer durations of a low-frequency tone sweep. This effect is analogous to the F1 transition duration effect with synthetic speech. Further, the discrimination of differences in relative onset time was poorer for stimuli with longer frequency sweeps. However, unlike synthetic speech, there were no systematic effects when the frequency of a transitionless lower sinusoid was varied. These findings are discussed in relation to the potential contributions of auditory mechanisms and speech-specific processes in the perception of the voicing contrast.

4.
The voice onset time (VOT) of a stop consonant is the interval between its burst onset and voicing onset. Among a variety of research topics on VOT, one that has been studied for years is how VOTs can be measured efficiently. Manual annotation is feasible, but it becomes time-consuming when the corpus is large. This paper proposes an automatic VOT estimation method based on an onset detection algorithm. First, forced alignment is applied to identify the locations of stop consonants. Then a random-forest-based onset detector searches each stop segment for its burst and voicing onsets to estimate a VOT. The proposed onset detection can detect the onsets efficiently and accurately with only a small amount of training data. The evaluation data extracted from the TIMIT corpus were 2344 words with a word-initial stop. The experimental results showed that 83.4% of the estimations deviate by less than 10 ms from their manually labeled values, and 96.5% of the estimations deviate by less than 20 ms. Some factors that influence the proposed estimation method, such as place of articulation, voicing of a stop consonant, and quality of the succeeding vowel, were also investigated.
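The definition and the tolerance-based evaluation described in this abstract can be illustrated with a minimal sketch. It is not the paper's method: the function names `vot_ms` and `fraction_within` are hypothetical, the onset times are assumed to be given in seconds, and no forced alignment or random-forest detector is modeled here.

```python
def vot_ms(burst_onset_s, voicing_onset_s):
    """Return VOT in milliseconds: voicing onset minus burst onset."""
    return (voicing_onset_s - burst_onset_s) * 1000.0


def fraction_within(estimated_ms, reference_ms, tolerance_ms):
    """Share of estimates deviating by less than tolerance_ms from the reference."""
    if len(estimated_ms) != len(reference_ms) or not estimated_ms:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        1 for est, ref in zip(estimated_ms, reference_ms)
        if abs(est - ref) < tolerance_ms
    )
    return hits / len(estimated_ms)


if __name__ == "__main__":
    # Illustrative values only, not data from the paper.
    estimates = [10.0, 25.0, 60.0]
    manual = [12.0, 40.0, 61.0]
    print(fraction_within(estimates, manual, 10.0))
    print(fraction_within(estimates, manual, 20.0))
```

The percentages quoted in the abstract (83.4% under 10 ms, 96.5% under 20 ms) correspond to this kind of ratio computed over the 2344 evaluation words.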

5.
Learning to speak involves both mastering the requisite articulatory gestures of one's native language and learning to coordinate those gestures according to the rules of the language. Voice onset time (VOT) acquisition illustrates this point: The child must learn to produce the necessary upper vocal tract and laryngeal gestures and to coordinate them with very precise timing. This longitudinal study examined the acquisition of English VOT by audiotaping seven children at 2-month intervals from first words (around 15 months) to the appearance of three-word sentences (around 30 months) in spontaneous speech. Words with initial stops were excerpted, and (1) the numbers of words produced with intended voiced and voiceless initial stops were counted; (2) VOT was measured; and (3) within-child standard deviations of VOT were measured. Results showed that children (1) initially avoided saying words with voiceless initial stops, (2) initially did not delay the onset of the laryngeal adduction relative to the release of closure as long as adults do for voiceless stops, and (3) were more variable in VOT for voiceless than for voiced stops. Overall these results support a model of acquisition that focuses on the mastery of gestural coordination as opposed to the acquisition of segmental contrasts.

6.
7.
Responses of chinchilla auditory-nerve fibers to synthesized stop consonants differing in voice onset time (VOT) were obtained. The syllables, heard as /ga/-/ka/ or /da/-/ta/, were similar to those previously used by others in psychophysical experiments with human and with chinchilla subjects. Average discharge rates of neurons tuned to the frequency region near the first formant generally increased at the onset of voicing, for VOTs longer than 20 ms. These rate increases were closely related to spectral amplitude changes associated with the onset of voicing and with the activation of the first formant; as a result, they provided accurate information about VOT. Neurons tuned to frequency regions near the second and third formants did not encode VOT in their average discharge rates. Modulations in the average rates of these neurons reflected spectral variations that were independent of VOT. The results are compared to other measurements of the peripheral encoding of speech sounds and to psychophysical observations suggesting that syllables with large variations in VOT are heard as belonging to one of only two phonemic categories.

8.
A previous experiment demonstrated age-related differences in voice-onset-time (VOT) discrimination when an adaptive procedure was used and trials were concentrated among pairs of stimuli that were discriminated 50% of the time. The major purpose of this experiment was to determine whether the same types of age effects would be replicated for new groups of subjects and a different task in which all stimuli were presented equal numbers of times. An eight-item, five-formant consonant-vowel (CV) continuum in which VOT ranged from 0 to 35 ms was used. The same-different task presented all possible pairs of CV syllables in which VOT differed by 10 and 20 ms and an equal number of catch trials that contained identical CVs. Results showed that children displayed poorer discrimination than adults for CV pairs differing by both time intervals. Adults displayed a somewhat greater tendency to respond "same" than children. The outcomes supported results of the previous study and were interpreted as representing true age-related differences in VOT discrimination.
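To make the stimulus design concrete, the following sketch enumerates the "different" pairs of the same-different task. It assumes the eight items are equally spaced (5-ms steps, which is what eight items spanning 0 to 35 ms would imply) and counts each pair once regardless of presentation order; both points are assumptions, and the helper names `continuum` and `pairs_with_difference` are hypothetical.

```python
def continuum(start_ms, stop_ms, n_items):
    """Equally spaced VOT values from start_ms to stop_ms (assumed spacing)."""
    step = (stop_ms - start_ms) / (n_items - 1)
    return [start_ms + i * step for i in range(n_items)]


def pairs_with_difference(vots, diff_ms):
    """Unordered stimulus pairs whose VOTs differ by exactly diff_ms."""
    return [
        (a, b)
        for i, a in enumerate(vots)
        for b in vots[i + 1:]
        if abs((b - a) - diff_ms) < 1e-9
    ]


if __name__ == "__main__":
    vots = continuum(0.0, 35.0, 8)
    for diff in (10.0, 20.0):
        pairs = pairs_with_difference(vots, diff)
        print(diff, len(pairs), pairs)
```

Under these assumptions there are six pairs at a 10-ms difference and four at 20 ms, so the "equal number of catch trials" would amount to ten identical-stimulus trials per block of unordered pairs.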

9.
Voice onset time (VOT) is a temporal cue that can distinguish consonants such as /d/ from /t/. It has previously been shown that neurons' responses to the onset of voicing are strongly dependent on their static spectral sensitivity. This study examined the relation between temporal resolution, determined from responses to sinusoidally amplitude-modulated (SAM) tones, and responses to syllables with different VOTs. Responses to syllables and SAM tones were obtained from low-frequency neurons in the inferior colliculus (IC) of the chinchilla. VOT and modulation period varied from 10 to 70 ms in 10-ms steps, and discharge rates elicited by stimuli whose amplitude envelopes were modulated over the same temporal interval were compared. Neurons that respond preferentially to syllables with particular VOTs might be expected to respond best to the SAM tones with comparable modulation periods. However, no consistent agreement between responses to VOT syllables and to SAM tones was obtained. These results confirm the previous suggestion that IC neurons' selectivity for VOT is determined by spectral rather than temporal sensitivity.
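For readers unfamiliar with SAM tones, the sketch below generates one using the standard textbook form s(t) = [1 + m sin(2π f_m t)] sin(2π f_c t), with the modulation frequency derived from the modulation period. The carrier frequency, modulation depth, duration, and sample rate are illustrative assumptions that the abstract does not specify, and the function names are hypothetical.

```python
import math


def modulation_periods_ms(first=10, last=70, step=10):
    """Modulation periods matching the 10 to 70 ms range in 10-ms steps."""
    return list(range(first, last + 1, step))


def modulation_frequency_hz(period_ms):
    """Convert a modulation period in ms to a modulation frequency in Hz."""
    return 1000.0 / period_ms


def sam_tone(carrier_hz, period_ms, depth=1.0, duration_s=0.2, sample_rate_hz=16000):
    """Sinusoidally amplitude-modulated tone as a list of samples."""
    fm = modulation_frequency_hz(period_ms)
    n_samples = int(round(duration_s * sample_rate_hz))
    samples = []
    for i in range(n_samples):
        t = i / sample_rate_hz
        envelope = 1.0 + depth * math.sin(2.0 * math.pi * fm * t)
        samples.append(envelope * math.sin(2.0 * math.pi * carrier_hz * t))
    return samples


if __name__ == "__main__":
    for period in modulation_periods_ms():
        print(period, modulation_frequency_hz(period))
```

A 10-ms period corresponds to 100-Hz modulation, while a 70-ms period corresponds to roughly 14.3 Hz, which is the range of envelope rates being compared with the VOT syllables.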

10.
Voice onset time (VOT) data for the plosives /p b t d k g/ in two vowel contexts (/i a/) for 5 groups of 46 boys and girls aged 5;8 (5 years, 8 months) to 13;2 were investigated to examine patterns of sex differences. Results indicated some evidence that females displayed longer VOT values than males. In addition, these differences were most marked in the data of the 13;2-year-olds. Furthermore, the sex differences in the VOT values displayed phonetic context effects. For example, the greatest sex differences were observed for the voiceless plosives, and within the context of the vowel /i/.

11.
12.
This study examined whether vocal fold kinematics prior to phonation differed among hard (glottal), normal, and breathy onsets in men and women. Glottal landmarks were identified and digitized from videotape recorded with a rigid laryngoscope during different voice onset types. Significant linear relationships (p 0.0055) were found among onset types on measures of (a) gesture duration when moving from 80% to 20% of maximum distance during adduction, (b) maximum velocity, (c) duration between the completion of adduction and phonation onset, and (d) ratios of maximum velocity to maximum distance between the vocal processes, an estimate of stiffness. The gesture duration was greatest for breathy onsets and least for hard onsets, while the maximum velocity, latency between adduction and phonation onset, and estimated stiffness were greatest for hard onsets and least for breathy onsets. The results suggest that one trajectory seems to be used, with increases in gesture duration being accompanied by decreases in articulator stiffness when moving from hard to normal to breathy voice onset types.
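Measures (a), (b), and (d) can be sketched from a sampled distance trajectory as follows. This is only an illustration under stated assumptions: the function name `adduction_measures` is hypothetical, threshold crossings are taken at the first sample at or below the threshold (no interpolation), and velocity is a simple finite difference between consecutive frames. The study's actual digitization procedure is not described in the abstract.

```python
def adduction_measures(times_s, distances):
    """Gesture duration (80% to 20% of max distance), max velocity, stiffness ratio."""
    if len(times_s) != len(distances) or len(distances) < 2:
        raise ValueError("need matching time and distance samples")
    peak = distances.index(max(distances))
    d_max = distances[peak]

    def first_time_at_or_below(threshold):
        # Search from the point of maximum distance onward (the adduction phase).
        for i in range(peak, len(distances)):
            if distances[i] <= threshold:
                return times_s[i]
        raise ValueError("trajectory never reaches the threshold")

    t80 = first_time_at_or_below(0.8 * d_max)
    t20 = first_time_at_or_below(0.2 * d_max)
    velocities = [
        abs(distances[i + 1] - distances[i]) / (times_s[i + 1] - times_s[i])
        for i in range(peak, len(distances) - 1)
    ]
    v_max = max(velocities)
    return {
        "gesture_duration_s": t20 - t80,
        "max_velocity": v_max,
        "stiffness": v_max / d_max,  # ratio of max velocity to max distance
    }


if __name__ == "__main__":
    # Illustrative trajectory in arbitrary distance units, not study data.
    print(adduction_measures([0.0, 0.01, 0.02, 0.03, 0.04], [10.0, 8.0, 5.0, 2.0, 0.0]))
```

On this reading, a breathy onset would show a longer `gesture_duration_s` and a lower `stiffness` than a hard onset, which matches the direction of the reported results.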

13.
14.
Responses of chinchilla auditory-nerve fibers were measured for stimulus conditions analogous to those in which psychophysical release from masking has been observed in humans. The maskers were two equal power, narrow-band noise stimuli with different amplitude envelopes. The neurons in the sample fell into three groups that resolved the maskers' envelopes with varying degrees of accuracy. The boundaries of these groups were not sharply delineated by characteristic frequency (CF) but were dependent on the relationship between the masker level and the neurons' thresholds at the masker frequency. For the neurons that best preserved the maskers' envelope fluctuations, a neural release from masking was observed; rate-based neural masked thresholds were higher for the masker with the least fluctuating envelope. The results suggest that neural and psychophysical release from masking arises because the probe evokes larger rate changes, relative to the background response to the masker, during periods of low masker energy. Between two otherwise equivalent maskers, the one with the periods of lowest energy will produce the lower masked thresholds because rate changes are larger and more detectable.

15.
A nonlinear optical processor that is capable of true real-time conversion of spatial-domain images to ultrafast time-domain optical waveforms is presented. The method is based on four-wave mixing between the optical waves of spectrally decomposed ultrashort pulses and spatially Fourier-transformed quasi-monochromatic images. To achieve efficient wave mixing at a femtosecond rate we utilize a cascaded second-order nonlinearity arrangement in a beta-barium borate crystal with type II phase matching. We use this ultrafast technique to experimentally generate several complex-amplitude temporal waveforms, with efficiency as high as 10%, by virtue of the cascaded nonlinearity arrangement.

16.
Voice onset time (VOT) signifies the interval between consonant onset and the start of rhythmic vocal-cord vibrations. Differential perception of consonants such as /d/ and /t/ is categorical in American English, with the boundary generally lying at a VOT of 20-40 ms. This study tests whether previously identified response patterns that differentially reflect VOT are maintained in large-scale population activity within primary auditory cortex (A1) of the awake monkey. Multiunit activity and current source density patterns evoked by the syllables /da/ and /ta/ with variable VOTs are examined. Neural representation is determined by the tonotopic organization. Differential response patterns are restricted to lower best-frequency regions. Response peaks time-locked to both consonant and voicing onsets are observed for syllables with a 40- and 60-ms VOT, whereas syllables with a 0- and 20-ms VOT evoke a single response time-locked only to consonant onset. Duration of aspiration noise is represented in higher best-frequency regions. Representation of VOT and aspiration noise in discrete tonotopic areas of A1 suggests that integration of these phonetic cues occurs in secondary areas of auditory cortex. Findings are consistent with the evolving concept that complex stimuli are encoded by synchronized activity in large-scale neuronal ensembles.

17.
Functional magnetic resonance imaging (fMRI) was performed in 30 healthy adults to identify the location, magnitude, and extent of activation in brain regions that are engaged during the performance of Conners' Continuous Performance Test (CPT). Performance on the task during fMRI was highly correlated with performance on the standard Conners' CPT in the behavioral testing laboratory. An extensive neural network was activated during the task that included the frontal, cingulate, parietal, temporal, and occipital cortices, the cerebellum, and the basal ganglia. There was also a network of brain regions that were more active during fixation than during the task. The magnitude of activation in several regions was correlated with reaction time. Among regions that were more active during the task, the overall volume of supratentorial activation and cerebellar activation was greater in the left hemisphere. Frontal activation was greater in dorsal than in ventral regions, and dorsal frontal activation was bilateral. Ventral frontal region and parietal lobe activation were greater in the right hemisphere. The volume of clusters of activation in the extrastriate ventral visual pathway was greater in the left hemisphere. This network is consistent with existing models of motor control, visual object processing, and attentional control and may serve as a basis for hypothesis-driven fMRI studies in clinical populations with deficits in Conners' CPT performance.

18.
Neural Volterra filter for chaotic time series prediction (cited by: 1; self-citations: 0; citations by others: 1)
李恒超, 张家树, 肖先赐. Chinese Physics (《中国物理》), 2005, 14(11): 2181-2188
A new second-order neural Volterra filter (SONVF) trained with the conjugate gradient (CG) algorithm is proposed in this paper to predict chaotic time series, based on phase-space delay-coordinate reconstruction of chaotic dynamical systems. Neuron activation functions are introduced to constrain the Volterra series terms in order to improve the nonlinear approximation of the second-order Volterra filter (SOVF). The SONVF with the CG algorithm improves prediction accuracy without increasing computational complexity. Meanwhile, the difficulty of determining the number of neurons does not arise here. Experimental results show that the proposed filter can predict chaotic time series effectively, and that its one-step and multi-step prediction performance is clearly superior to that of the SOVF, which demonstrates that the proposed SONVF is feasible and effective.
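The two building blocks named in this abstract, delay-coordinate reconstruction and a second-order Volterra expansion, can be sketched as below. This shows only the plain SOVF structure in its usual textbook form; the abstract does not say where the activation functions enter the SONVF or how the CG training is set up, so neither is modeled. All function names, the embedding dimension, and the delay are illustrative assumptions.

```python
def delay_embed(series, dim, tau):
    """Delay vectors [x(n), x(n - tau), ..., x(n - (dim - 1) * tau)]."""
    start = (dim - 1) * tau
    return [
        [series[n - k * tau] for k in range(dim)]
        for n in range(start, len(series))
    ]


def second_order_terms(x):
    """Constant, linear, and quadratic (i <= j) Volterra terms of vector x."""
    terms = [1.0] + list(x)
    for i in range(len(x)):
        for j in range(i, len(x)):
            terms.append(x[i] * x[j])
    return terms


def volterra_predict(x, weights):
    """Second-order Volterra filter output for one delay vector."""
    terms = second_order_terms(x)
    if len(weights) != len(terms):
        raise ValueError("expected %d weights" % len(terms))
    return sum(w * t for w, t in zip(weights, terms))


if __name__ == "__main__":
    vectors = delay_embed([1.0, 2.0, 3.0, 4.0, 5.0], dim=2, tau=2)
    print(vectors)
    print(second_order_terms(vectors[0]))
```

For an embedding dimension m, the expansion has 1 + m + m(m + 1)/2 weights; these are the parameters that a training algorithm such as CG would fit.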

19.
The goal of this study was to examine the neural encoding of voice-onset time distinctions that indicate the phonetic categories /da/ and /ta/ for human listeners. Cortical Auditory Evoked Potentials (CAEP) were measured in conjunction with behavioral perception of a /da/-/ta/ continuum. Sixteen subjects participated in identification and discrimination experiments. A sharp category boundary was revealed between /da/ and /ta/ around the same location for all listeners. Subjects' discrimination of a VOT change of equal magnitude was significantly more accurate across the /da/-/ta/ categories than within the /ta/ category. Neurophysiologic correlates of VOT encoding were investigated using the N1 CAEP, which reflects sensory encoding of stimulus features, and the MMN CAEP, which reflects sensory discrimination. The MMN elicited by the across-category pair was larger and more robust than the MMN which occurred in response to the within-category pair. Distinct changes in N1 morphology were related to VOT encoding. For stimuli that were behaviorally identified as /da/, a single negativity (N1) was apparent; however, for stimuli identified as /ta/, two distinct negativities (N1 and N1') were apparent. Thus the enhanced MMN responses and the discontinuity in N1 morphology observed in the region of the /da/-/ta/ phonetic boundary appear to provide neurophysiologic correlates of categorical perception for VOT.

20.
High-speed filming is one of the most informative methods for assessing voice physiology data. Tracing high-speed images of the glottis provides quantitative parameters such as the glottal area and the glottal width function. By way of example, a number of studies are discussed which extract quantitative data from high-speed images showing voice onsets. Furthermore, a new computer system (MVAS; multi-dimensional voice analysis system) is presented that synchronously displays a laryngoscopic high-speed film, the electroglottographical signal, and several acoustic analyses of the recorded voice sample. The automatic measurement of glottal width and glottal area from the laryngoscopic images is also provided. Former studies and our own analyses of voice onsets reveal tremendous intersubject and even intrasubject variability (different prephonatory closure, different time span until full amplitude is reached, different open quotient).
