Similar literature
Found 20 similar articles (search time: 15 ms)
1.
Natural speech consonant-vowel (CV) syllables ([f, s, θ, ʃ, v, z, ð] followed by [i, u, a]) were computer edited to include 20-70 ms of their frication noise in 10-ms steps as measured from their onset, as well as the entire frication noise. These stimuli, and the entire syllables, were presented to 12 subjects for consonant identification. Results show that the listener does not require the entire fricative-vowel syllable in order to correctly perceive a fricative. The required frication duration depends on the particular fricative, ranging from approximately 30 ms for [s, z] to 50 ms for [f, s, v], while [θ, ð] are identified with reasonable accuracy in only the full frication and syllable conditions. Analysis in terms of the linguistic features of voicing, place, and manner of articulation revealed that fricative identification in terms of place of articulation is much more affected by a decrease in frication duration than identification in terms of voicing and manner of articulation.

2.
Speech perception requires the integration of information from multiple phonetic and phonological dimensions. A sizable literature exists on the relationships between multiple phonetic dimensions and single phonological dimensions (e.g., spectral and temporal cues to stop consonant voicing). A much smaller body of work addresses relationships between phonological dimensions, and much of this has focused on sequences of phones. However, strong assumptions about the relevant set of acoustic cues and/or the (in)dependence between dimensions limit previous findings in important ways. Recent methodological developments in the general recognition theory framework enable tests of a number of these assumptions and provide a more complete model of distinct perceptual and decisional processes in speech sound identification. A hierarchical Bayesian Gaussian general recognition theory model was fit to data from two experiments investigating identification of English labial stop and fricative consonants in onset (syllable initial) and coda (syllable final) position. The results underscore the importance of distinguishing between conceptually distinct processing levels and indicate that, for individual subjects and at the group level, integration of phonological information is partially independent with respect to perception and that patterns of independence and interaction vary with syllable position.

3.
This study presents various acoustic measures used to examine the sequence /a # C/, where "#" represents different prosodic boundaries in French. The 6 consonants studied are /b d g f s S/ (3 stops and 3 fricatives). The prosodic units investigated are the utterance, the intonational phrase, the accentual phrase, and the word. It is found that vowel target values, formant transitions into the stop consonant, and the rate of change in spectral tilt into the fricative, are affected by the strength of the prosodic boundary. F1 becomes higher for /a/ the stronger the prosodic boundary, with the exception of one speaker's utterance data, which show the effects of articulatory declension at the utterance level. Various effects of the stop consonant context are observed, the most notable being a tendency for the vowel /a/ to be displaced in the direction of the F2 consonant "locus" for /d/ (the F2 consonant values for which remain relatively stable across prosodic boundaries) and for /g/ (the F2 consonant values for which are displaced in the direction of the velar locus in weaker prosodic boundaries, together with those of the vowel). Velocity of formant transition may be affected by prosodic boundary (with greater velocity at weaker boundaries), though results are not consistent across speakers. There is also a tendency for the rate of change in spectral tilt moving from the vowel to the fricative to be affected by the presence of a prosodic boundary, with a greater rate of change at the weaker prosodic boundaries. It is suggested that spectral cues, in addition to duration, amplitude, and F0 cues, may alert listeners to the presence of a prosodic boundary.

4.
Past studies have shown that when formants are perturbed in real time, speakers spontaneously compensate for the perturbation by changing their formant frequencies in the opposite direction to the perturbation. Further, the pattern of these results suggests that the processing of auditory feedback error operates at a purely acoustic level. This hypothesis was tested by comparing the response of three language groups to real-time formant perturbations: (1) native English speakers producing an English vowel /ε/, (2) native Japanese speakers producing a Japanese vowel (/e̞/), and (3) native Japanese speakers learning English, producing /ε/. All three groups showed similar production patterns when F1 was decreased; however, when F1 was increased, the Japanese groups did not compensate as much as the native English speakers. Due to this asymmetry, the hypothesis that the compensatory production for formant perturbation operates at a purely acoustic level was rejected. Rather, some level of phonological processing influences the feedback processing behavior.

5.
Several types of measurements were made to determine the acoustic characteristics that distinguish between voiced and voiceless fricatives in various phonetic environments. The selection of measurements was based on a theoretical analysis that indicated the acoustic and aerodynamic attributes at the boundaries between fricatives and vowels. As expected, glottal vibration extended over a longer time in the obstruent interval for voiced fricatives than for voiceless fricatives, and there were more extensive transitions of the first formant adjacent to voiced fricatives than for the voiceless cognates. When two fricatives with different voicing were adjacent, there were substantial modifications of these acoustic attributes, particularly for the syllable-final fricative. In some cases, these modifications lead to complete assimilation of the voicing feature. Several perceptual studies with synthetic vowel-consonant-vowel stimuli and with edited natural stimuli examined the role of consonant duration, extent and location of glottal vibration, and extent of formant transitions on the identification of the voicing characteristics of fricatives. The perceptual results were in general consistent with the acoustic observations and with expectations based on the theoretical model. The results suggest that listeners base their voicing judgments of intervocalic fricatives on an assessment of the time interval in the fricative during which there is no glottal vibration. This time interval must exceed about 60 ms if the fricative is to be judged as voiceless, except that a small correction to this threshold is applied depending on the extent to which the first-formant transitions are truncated at the consonant boundaries.

6.
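An acoustic analysis was carried out of glottal-stop compensation for alveolar stops in cleft palate speech. Spectral moments, the multi-order statistics of the spectral distribution, were computed, and compensatory stops were compared with normal stops. The results show that the first spectral moment of the glottal-stop burst, i.e., the frequency of the spectral centroid, is lower than that of normal stops: because the constriction of a glottal stop is at the glottis, the vocal-tract cavity is longer and its resonance frequencies are lower. The second spectral moment of the glottal stop, i.e., the standard deviation, was also observed to be higher, indicating that its spectral energy is more dispersed than that of normal stops. The third spectral moment of the glottal stop, i.e., the skewness, is mostly positive, reflecting an asymmetric power spectrum whose bulk lies toward low frequencies and whose long tail extends toward high frequencies. A logistic regression model was used to classify the samples, with the optimal fourth-order spectral moments selected by cross-validation as the model's independent variables; the classification accuracy was 89.7%. Combined with automatic detection of the stop burst instant, objective identification of glottal stops in the syllable /di/ was achieved.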
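The spectral moments named in the abstract above can be illustrated with a minimal sketch. It assumes the common definition in which the normalized power spectrum of a windowed segment is treated as a probability distribution over frequency. The function name `spectral_moments`, the Hann window, and the use of excess kurtosis for the fourth moment are assumptions made for illustration, not details taken from the study.

```python
import numpy as np

def spectral_moments(signal, sample_rate):
    """Return (centroid_hz, std_hz, skewness, excess_kurtosis) of a segment.

    Assumption: moments are taken over the normalized power spectrum of a
    Hann-windowed segment; the study's exact windowing is not known here.
    """
    signal = np.asarray(signal, dtype=float)
    windowed = signal * np.hanning(len(signal))
    power = np.abs(np.fft.rfft(windowed)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)

    # Normalize so the power spectrum sums to 1, like a probability mass.
    p = power / power.sum()

    centroid = np.sum(freqs * p)                      # first moment
    std = np.sqrt(np.sum((freqs - centroid) ** 2 * p))  # second moment
    skewness = np.sum((freqs - centroid) ** 3 * p) / std ** 3
    kurtosis = np.sum((freqs - centroid) ** 4 * p) / std ** 4 - 3.0
    return centroid, std, skewness, kurtosis


# Usage: a strong low-frequency component with a weak high-frequency one,
# which puts the bulk of the power at low frequencies and a tail higher up.
sr = 16000
t = np.arange(1024) / sr
segment = np.sin(2 * np.pi * 500 * t) + 0.3 * np.sin(2 * np.pi * 4000 * t)
centroid, std, skew, kurt = spectral_moments(segment, sr)
```

In this toy segment the skewness comes out positive, which matches the abstract's description of a spectrum whose bulk sits at low frequencies with a tail toward high frequencies.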

7.
The purpose of this investigation was to study the effects of consonant environment on vowel duration for normally hearing males, hearing-impaired males with intelligible speech, and hearing-impaired males with semi-intelligible speech. The results indicated that the normally hearing and intelligible hearing-impaired speakers exhibited similar trends with respect to consonant influence on vowel duration; i.e., vowels were longer in a voiced environment than in a voiceless one, and longer in a fricative environment than in a plosive one. The semi-intelligible hearing-impaired speakers, however, failed to demonstrate a consonant effect on vowel duration, and produced the vowels with significantly longer durations when compared with the other two groups of speakers. These data provide information regarding temporal conditions which may contribute to the decreased intelligibility of hearing-impaired persons.

8.
Previous experimental evidence has been interpreted as support for regulation of both acoustics and aerodynamics during speech production. One recent perspective is that although speech acoustics may be manipulated, regulation of aerodynamics is a central component of the processes that produce speech. From this perspective, it has been suggested that aerodynamic regulation is given priority over perceptual accuracy. The experiment attempted to test this hypothesis by forcing speakers into a choice between aerodynamic and acoustic regulation. The intensity level of frication (embedded in a carrier phrase) was selectively amplified or attenuated and fed back to the speaker on line. Intraoral air pressure was recorded in order to assess whether or not perturbed auditory feedback would result in aerodynamic compensation. Although compensatory changes in peak intraoral air pressure, pressure duration, and pressure curve area were seen in response to 30-dB alterations of frication, no systematic effects were seen for smaller auditory manipulations. Further, the compensations were less than what one might expect from a system controlling auditory output. Explanations of these findings and their implications for the control of speech production are offered.

9.
The attenuation of sound due to the interaction between a low Mach number turbulent boundary layer and acoustic waves can be significant at low frequencies or in narrow tubes. In a recent publication by the present authors the acoustics of charge air coolers for passenger cars has been identified as an interesting application where turbulence attenuation can be of importance. Favourable low-frequency damping has been observed that could be used for control of the in-duct sound that is created by the engine gas exchange process. Analytical frequency-dependent models for the eddy viscosity that controls the momentum and thermal boundary layers are available but are restricted to thin acoustic boundary layers. For cases with cross-sections of a few millimetres a model based on thin acoustic boundary layers will not be applicable in the frequency range of interest. In the present paper a frequency-dependent axis-symmetric numerical model for interaction between turbulence and acoustic waves is proposed. A finite element scheme is used to formulate the time harmonic linearized convective equations for conservation of mass, momentum and energy into one coupled system of equations. The turbulence is introduced with a linear model for the eddy viscosity that is added to the shear viscosity. The proposed model is validated by comparison with experimental data from the literature.

10.
Closants, or consonantlike sounds in infant vocalizations, were described acoustically using 16-kHz spectrograms and LPC or FFT analyses based on waveforms sampled at 20 or 40 kHz. The two major closant types studied were fricatives and trills. Compared to similar fricative sounds in adult speech, the fricative sounds of the 3-, 6-, 9-, and 12-month-old infants had primary spectral components at higher frequencies, i.e., to and above 14 kHz. Trill rate varied from 16-180 Hz with a mean of about 100, approximately four times the mean trill rate reported for adult talkers. Acoustic features are described for various places of articulation for fricatives and trills. The discussion of the data emphasizes dimensions of acoustic contrast that appear in infant vocalizations during the first year of life, and implications of the spectral data for auditory and motor self-stimulation by normal-hearing and hearing-impaired infants.

11.
Previous work has established that speakers have difficulty making rapid compensatory adjustments in consonant production (especially in fricatives) for structural perturbations of the vocal tract induced by artificial palates with thicker-than-normal alveolar regions. The present study used electromagnetic articulography and simultaneous acoustic recordings to estimate tongue configurations during production of [s š t k] in the presence of a thin and a thick palate, before and after a practice period. Ten native speakers of English participated in the study. In keeping with previous acoustic studies, fricatives were more affected by the palate than were the stops. With the thick palate, the center of gravity was lower, the jaw was lower, and the tongue moved further backwards and downwards. Center of gravity measures revealed complete adaptation after training, and with practice, subjects decreased interlabial distance. The fact that adaptation effects were found for [k], which is produced with an articulatory gesture not directly impeded by the palatal perturbation, suggests a more global sensorimotor recalibration that extends beyond the specific articulatory target.

12.
Acoustic analyses were undertaken to explore the durational characteristics of the fricatives [f, θ, s, v, ð, z] as cues to initial consonant voicing in English. Based on reports on the perception of voiced-voiceless fricatives, it was expected that there would be clear-cut duration differences distinguishing voiced and voiceless fricatives. Preliminary results for three speakers indicate that, although differences emerged in the overall mean duration of voiced and voiceless fricatives, contrary to expectations, there was a great deal of overlap in the duration distribution of voiced and voiceless fricative tokens. Further research is needed to examine the role of duration as a cue to syllable-initial fricative consonant voicing in English.

13.
Weak consonants (e.g., stops) are more susceptible to noise than vowels, owing partially to their lower intensity. This raises the question whether hearing-impaired (HI) listeners are able to perceive (and utilize effectively) the high-frequency cues present in consonants. To answer this question, HI listeners were presented with clean (noise absent) weak consonants in otherwise noise-corrupted sentences. Results indicated that HI listeners received significant benefit in intelligibility (4 dB decrease in speech reception threshold) when they had access to clean consonant information. At extremely low signal-to-noise ratio (SNR) levels, however, HI listeners received only 64% of the benefit obtained by normal-hearing listeners. This lack of equitable benefit was investigated in Experiment 2 by testing the hypothesis that the high-frequency cues present in consonants were not audible to HI listeners. This was tested by selectively amplifying the noisy consonants while leaving the noisy sonorant sounds (e.g., vowels) unaltered. Listening tests indicated small (~10%), but statistically significant, improvements in intelligibility at low SNR conditions when the consonants were amplified in the high-frequency region. Selective consonant amplification provided reliable low-frequency acoustic landmarks that in turn facilitated a better lexical segmentation of the speech stream and contributed to the small improvement in intelligibility.

14.
Traditionally, the average professional musician has owned numerous acoustic musical instruments, many of them having distinctive acoustic qualities. However, a modern musician could prefer to have a single musical instrument whose acoustics are programmable by feedback control, where acoustic variables are estimated from sensor measurements in real time and then fed back in order to influence the controlled variables. In this paper, theory is presented that describes stable feedback control of an acoustic musical instrument. The presentation should be accessible to members of the musical acoustics community who may have limited or no experience with feedback control. First, the only control strategy guaranteed to be stable subject to any musical instrument mobility is described: the sensors and actuators must be collocated, and the controller must emulate a physical analog system. Next, the most fundamental feedback controllers and the corresponding physical analog systems are presented. The effects that these controllers have on acoustic musical instruments are described. Finally, practical design challenges are discussed. A proof explains why changing the resonance frequency of a musical resonance requires much more control power than changing the decay time of the resonance.
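The idea of a controller that emulates a physical analog system can be sketched for a single resonance. The following is a minimal illustration, assuming the resonance is modeled as a mass-spring-damper and that collocated feedback of velocity acts like an added damper while collocated feedback of displacement acts like an added spring. The function name `controlled_resonance` and all parameter values are hypothetical, and the sketch does not reproduce the paper's proof about control power.

```python
import math

def controlled_resonance(mass, damping, stiffness,
                         velocity_gain=0.0, displacement_gain=0.0):
    """Return (frequency_hz, decay_time_s) of a controlled single resonance.

    Assumed model: mass*x'' + damping*x' + stiffness*x = F, with the
    collocated feedback force F = -velocity_gain*x' - displacement_gain*x.
    Non-negative gains emulate a passive damper and spring.
    """
    total_damping = damping + velocity_gain
    total_stiffness = stiffness + displacement_gain
    if total_damping <= 0:
        raise ValueError("total damping must be positive for a decaying mode")

    decay_rate = total_damping / (2.0 * mass)          # envelope ~ exp(-rate*t)
    omega_squared = total_stiffness / mass - decay_rate ** 2
    if omega_squared <= 0:
        raise ValueError("system is not underdamped; no oscillation frequency")

    frequency_hz = math.sqrt(omega_squared) / (2.0 * math.pi)
    decay_time_s = 1.0 / decay_rate
    return frequency_hz, decay_time_s


# Usage with hypothetical values: velocity feedback shortens the decay,
# displacement feedback raises the resonance frequency.
f0, tau0 = controlled_resonance(0.001, 0.01, 1000.0)
f1, tau1 = controlled_resonance(0.001, 0.01, 1000.0, velocity_gain=0.01)
f2, tau2 = controlled_resonance(0.001, 0.01, 1000.0, displacement_gain=3000.0)
```

In this model, doubling the total damping halves the decay time while barely moving the frequency, whereas roughly doubling the frequency requires quadrupling the total stiffness.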

15.
Variations in the loop response of hearing aids caused by jaw movements, variations in acoustics outside the ear, and variations of vent size have been identified. Behind The Ear (BTE) and In The Ear Canal (ITEC) hearing aids were considered. Among the variations in the acoustics outside the ear, the largest, except when the hearing aid was partly removed, were found with the ITEC when a telephone set was placed by the ear. The variations of the loop response caused by changes in vent size were compared with the variations of a theoretical model of the feedback path. The theoretical model was also used to compare the feedback of different vent designs that give the same acoustic impedance at low frequencies. The calculated feedback was less with the short vents (12 mm) than the long vents (24 mm).

16.
Among the most influential publications in speech perception is Liberman, Delattre, and Cooper's [Am. J. Psychol. 65, 497-516 (1952)] report on the identification of synthetic, voiceless stops generated by the Pattern Playback. Their map of stop consonant identification shows a highly complex relationship between acoustics and perception. This complex mapping poses a challenge to many classes of relatively simple pattern recognition models which are unable to capture the original finding of Liberman et al. that identification of /k/ was bimodal for bursts preceding front vowels but otherwise unimodal. A replication of this experiment was conducted in an attempt to reproduce these identification patterns using a simulation of the Pattern Playback device. Examination of spectrographic data from stimuli generated by the Pattern Playback revealed additional spectral peaks that are consistent with harmonic distortion characteristic of tube amplifiers of that era. Only when harmonic distortion was introduced did bimodal /k/ responses in front-vowel context emerge. The acoustic consequence of this distortion is to add, e.g., a high-frequency peak to midfrequency bursts or a midfrequency peak to a low-frequency burst. This likely resulted in additional /k/ responses when the second peak approximated the second formant of front vowels. Although these results do not challenge the main observations made by Liberman et al. that perception of stop bursts is context dependent, they do show that the mapping from acoustics to perception is much less complex without these additional distortion products.

17.
The aim of the study was to establish whether /u/-fronting, a sound change in progress in standard southern British, could be linked synchronically to the fronting effects of a preceding anterior consonant both in speech production and speech perception. For the production study, which consisted of acoustic analyses of isolated monosyllables produced by two different age groups, it was shown for younger speakers that /u/ was phonetically fronted and that the coarticulatory influence of consonants on /u/ was less than in older speakers. For the perception study, responses were elicited from the same subjects to two minimal word-pair continua that differed in the direction of the consonants' coarticulatory fronting effects on /u/. Consistent with their speech production, young listeners' /u/ category boundary was shifted toward /i/ and they compensated perceptually less for the fronting effects of the consonants on /u/ than older listeners. The findings support Ohala's model in which certain sound changes can be linked to the listener's failure to compensate for coarticulation. The results are also shown to be consistent with episodic models of speech perception in which phonological frequency effects bring about a realignment of the variants of a phonological category in speech production and perception.

18.
Although some cochlear implant (CI) listeners can show good word recognition accuracy, it is not clear how they perceive and use the various acoustic cues that contribute to phonetic perceptions. In this study, the use of acoustic cues was assessed for normal-hearing (NH) listeners in optimal and spectrally degraded conditions, and also for CI listeners. Two experiments tested the tense/lax vowel contrast (varying in formant structure, vowel-inherent spectral change, and vowel duration) and the word-final fricative voicing contrast (varying in F1 transition, vowel duration, consonant duration, and consonant voicing). Identification results were modeled using mixed-effects logistic regression. These experiments suggested that under spectrally-degraded conditions, NH listeners decrease their use of formant cues and increase their use of durational cues. Compared to NH listeners, CI listeners showed decreased use of spectral cues like formant structure and formant change and consonant voicing, and showed greater use of durational cues (especially for the fricative contrast). The results suggest that although NH and CI listeners may show similar accuracy on basic tests of word, phoneme or feature recognition, they may be using different perceptual strategies in the process.

19.
Exploring the compensatory responses of the speech production system to perturbation has provided valuable insights into speech motor control. The present experiment was conducted to examine compensation for one such perturbation: a palatal perturbation in the production of the fricative /s/. Subjects wore a specially designed electropalatographic (EPG) appliance with a buildup of acrylic over the alveolar ridge as well as a normal EPG palate. In this way, compensatory tongue positioning could be assessed during a period of target-specific and intense practice and compared to nonperturbed conditions. Electropalatographic, acoustic, and perceptual analyses of productions of /asa/ elicited from nine speakers over the course of a one-hour practice period were conducted. Acoustic and perceptual results confirmed earlier findings, which showed improvement in production with a thick artificial palate in place over the practice period; the EPG data showed overall increased maximum contact as well as increased medial and posterior contact for speakers with the thick palate in place, but little change over time. Negative aftereffects were observed in the productions with the thin palate, indicating recalibration of sensorimotor processes in the face of the oral-articulatory perturbation. Findings are discussed with regard to the nature of adaptive articulatory skills.

20.
时洁, 杨德森, 张昊阳, 时胜国, 李松, 胡博. Chinese Physics B, 2017, 26(7): 074301
The acoustical scattering cross section is usually employed to evaluate the scattering ability of bubbles when they are excited by incident acoustic waves. This parameter is strongly related to many important applications: performance prediction for search sonar or underwater telemetry, acoustical oceanography, acoustic cavitation, volcanology, and medical and industrial ultrasound. In the present paper, both analytical and numerical results for the acoustical scattering cross section of a single bubble under multi-frequency excitation are obtained. The nonlinear characteristics (e.g., harmonics, subharmonics, and ultraharmonics) of the scattering cross section curve under multi-frequency excitation are investigated and compared with single-frequency excitation. The influence of several paramount parameters (e.g., bubble equilibrium radius, acoustic pressure amplitude, and acoustic frequencies) in the multi-frequency system on the predictions of scattering cross section is discussed. It is shown that the combination resonances become significant in the multi-frequency system when the acoustic power is large enough, and the acoustical scattering cross section is promoted significantly within a much broader range of bubble sizes and acoustic frequencies due to the generation of more resonances.
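For orientation, the classical linear, single-frequency scattering cross section of a bubble can be sketched as a baseline. This is the textbook small-amplitude result, not the nonlinear multi-frequency analysis of the paper above. The function names, the default water and air properties, and the constant damping value are assumptions chosen for illustration.

```python
import math

def minnaert_frequency(radius_m, pressure_pa=101325.0,
                       density_kg_m3=998.0, gamma=1.4):
    """Linear resonance frequency (Hz) of a gas bubble, ignoring surface tension."""
    return (math.sqrt(3.0 * gamma * pressure_pa / density_kg_m3)
            / (2.0 * math.pi * radius_m))

def linear_scattering_cross_section(radius_m, frequency_hz, damping=0.1):
    """Linear scattering cross section (m^2) under single-frequency excitation.

    sigma = 4*pi*R0^2 / (((f0/f)^2 - 1)^2 + damping^2)
    Assumption: `damping` is a constant dimensionless damping term; in
    practice it depends on frequency and bubble size.
    """
    f0 = minnaert_frequency(radius_m)
    detuning = (f0 / frequency_hz) ** 2 - 1.0
    return 4.0 * math.pi * radius_m ** 2 / (detuning ** 2 + damping ** 2)


# Usage: a 1 mm bubble in water driven at and away from its resonance.
radius = 1.0e-3
f_res = minnaert_frequency(radius)
sigma_at_resonance = linear_scattering_cross_section(radius, f_res)
sigma_off_resonance = linear_scattering_cross_section(radius, 3.0 * f_res)
```

The single sharp peak of this linear model is the reference against which the additional harmonic, subharmonic, and combination resonances described in the abstract would appear.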
