Similar articles
Found 10 similar articles (search time: 126 ms)
1.
2.
Frequency modulation coherence was investigated as a possible cue for the perceptual segregation of concurrent sound sources. Synthesized chords of 2-s duration and comprising six permutations of three sung vowels (/a/, /i/, /o/) at three fundamental frequencies (130.8, 174.6, and 233.1 Hz) were constructed. In one condition, no vowels were modulated, and, in a second, all three were modulated coherently such that the ratio relations among all frequency components were maintained. In a third group of conditions, one vowel was modulated, while the other two remained steady. In a fourth group, one vowel was modulated independently of the two other vowels, which were modulated coherently with one another. Subjects were asked to judge the perceived prominence of each of the three vowels in each chord. Judged prominence increased significantly when the target vowel was modulated compared to when it was not, with the greatest increase being found for higher fundamental frequencies. The increase in prominence with modulation was unaffected by whether the target was modulated coherently or not with nontarget vowels. The modulation and pitch position of nontarget vowels had no effect on target vowel prominence. These results are discussed in terms of possible concurrent auditory grouping principles.

3.
Speech coding in the auditory nerve: V. Vowels in background noise   Total citations: 1 (self-citations: 0, citations by others: 1)
Responses of auditory-nerve fibers to steady-state, two-formant vowels in low-pass background noise (S/N = 10 dB) were obtained in anesthetized cats. For fibers over a wide range of characteristic frequencies (CFs), the peaks in discharge rate at the onset of the vowel stimuli were nearly eliminated in the presence of noise. In contrast, strong effects of noise on fine time patterns of discharge were limited to CF regions that are far from the formant frequencies. One effect is a reduction in the amplitude of the response component at the fundamental frequency in the high-CF regions and for CFs between F1 and F2 when the formants are widely separated. A reduction in the amplitude of the response components at the formant frequencies, with concomitant increase in components near CF or low-frequency components, occurs in CF regions where the signal-to-noise ratio is particularly low. The processing schemes that were effective for estimating the formant frequencies and fundamental frequency of vowels in quiet generally remain adequate in moderate-level background noise. Overall, the discharge patterns contain many cues for distinctions among the vowel stimuli, so that the central processor should be able to identify the different vowels, consistent with psychophysical performance at moderate signal-to-noise ratios.

4.
The purpose of this study was to determine the accuracy with which listeners could identify the gender of a speaker from a synthesized isolated vowel based on the natural production of that speaker when (1) the fundamental frequency was consistent with the speaker's gender, (2) the fundamental frequency was inconsistent with the speaker's gender, and (3) the speaker was transgendered. Ten male-to-female transgendered persons, 10 men and 10 women, served as subjects. Each speaker produced the vowels /i/, /u/, and //. These vowels were analyzed for fundamental frequency and the first three formant frequencies and bandwidths. Formant frequency and bandwidth information was used to synthesize two vowel tokens for each speaker, one at a fundamental frequency of 120 Hz and one at 240 Hz. Listeners were asked to listen to these tokens and determine whether the original speaker was male or female. Listeners were not aware of the use of transgendered speakers. Results showed that, in all cases, gender identifications were based on fundamental frequency, even when fundamental frequency and formant frequency information was contradictory.

5.
An experiment investigated the effects of amplitude ratio (-35 to 35 dB in 10-dB steps) and fundamental frequency difference (0%, 3%, 6%, and 12%) on the identification of pairs of concurrent synthetic vowels. Vowels as weak as -25 dB relative to their competitor were easier to identify in the presence of a fundamental frequency difference (delta F0). Vowels as weak as -35 dB were not. Identification was generally the same at delta F0 = 3%, 6%, and 12% for all amplitude ratios: unfavorable amplitude ratios could not be compensated by larger delta F0's. Data for each vowel pair and each amplitude ratio, at delta F0 = 0%, were compared to the spectral envelope of the stimulus at the same ratio, in order to determine which spectral cues determined identification. This information was then used to interpret the pattern of improvement with delta F0 for each vowel pair, to better understand mechanisms of F0-guided segregation. Identification of a vowel was possible in the presence of strong cues belonging to its competitor, as long as cues to its own formants F1 and F2 were prominent. delta F0 enhanced the prominence of a target vowel's cues, even when the spectrum of the target was up to 10 dB below that of its competitor at all frequencies. The results are incompatible with models of segregation based on harmonic enhancement, beats, or channel selection.

6.
Imitations of ten synthesized vowels were recorded from 33 speakers including men, women, and children. The first three formant frequencies of the imitations were estimated from spectrograms and considered with respect to developmental patterns in vowel formant structure, uniform scale factors for vowel normalization, and formant variability. Strong linear effects were observed in the group data for imitations of most of the English vowels studied, and straight lines passing through the origin provided a satisfactory fit to linear F1-F2 plots of the English vowel data. Logarithmic transformations of the formant frequencies helped substantially to equalize the dispersion of the group data for different vowels, but formant scale factors were observed to vary somewhat with both formant number and vowel identity. Variability of formant frequency was least for F1 (s.d. of 60 Hz or less for English vowels of adult males) and about equal for F2 and F3 (s.d. of 100 Hz or less for English vowels of adult males).

7.
Covariation among vowel height effects on vowel intrinsic fundamental frequency (IF(0)), voice onset time (VOT), and voiceless interval duration (VID) is analyzed to assess the plausibility of a common physiological mechanism underlying variation in these measures. Phrases spoken by 20 young adults, containing words composed of initial voiceless stops or /s/ and high or low vowels, were produced in habitual and voluntarily increased F(0) conditions. High vowels were associated with increased IF(0) and longer VIDs. VOT and VID exhibited significant covariation with IF(0) only for males at habitual F(0). The lack of covariation for females and at increased F(0) is discussed.

8.
Ten American English vowels were sung in a /b/-vowel-/d/ consonantal context by a professional countertenor in full voice (at F0 = 130, 165, 220, 260, and 330 Hz) and in head voice (at F0 = 220, 260, 330, 440, and 520 Hz). Four identification tests were prepared using the entire syllable or the center 200-ms portion of either the full-voice tokens or the head-voice tokens. Listeners attempted to identify each vowel by circling the appropriate word on their answer sheets. Errors were more frequent when the vowels were sung at higher F0. In addition, removal of the consonantal context markedly increased identification errors for both the head-voice and full-voice conditions. Back vowels were misidentified significantly more often than front vowels. For equal F0 values, listeners were significantly more accurate in identifying the head-voice stimuli. Acoustical analysis suggests that the difference of intelligibility between head and full voice may have been due to the head voice having more energy in the first harmonic than the full voice.

9.
The perception of voicing in final velar stop consonants was investigated by systematically varying vowel duration, change in offset frequency of the final first formant (F1) transition, and rate of frequency change in the final F1 transition for several vowel contexts. Consonant-vowel-consonant (CVC) continua were synthesized for each of three vowels, [i,I,ae], which represent a range of relatively low to relatively high-F1 steady-state values. Subjects responded to the stimuli under both an open- and closed-response condition. Results of the study show that both vowel duration and F1 offset properties influence perception of final consonant voicing, with the salience of the F1 offset property higher for vowels with high-F1 steady-state frequencies than low-F1 steady-state frequencies, and the opposite occurring for the vowel duration property. When F1 onset and offset frequencies were controlled, rate of the F1 transition change had inconsistent and minimal effects on perception of final consonant voicing. Thus the findings suggest that it is the termination value of the F1 offset transition, rather than rate and/or duration of frequency change, which cues voicing in final velar stop consonants during the transition period preceding closure.

10.

Background  

The cortical activity underlying the perception of vowel identity has typically been addressed by manipulating the first and second formant frequency (F1 & F2) of the speech stimuli. These two values, originating from articulation, are already sufficient for the phonetic characterization of vowel category. In the present study, we investigated how the spectral cues caused by articulation are reflected in cortical speech processing when combined with phonation, the other major part of speech production manifested as the fundamental frequency (F0) and its harmonic integer multiples. To study the combined effects of articulation and phonation we presented vowels with either high (/a/) or low (/u/) formant frequencies which were driven by three different types of excitation: a natural periodic pulseform reflecting the vibration of the vocal folds, an aperiodic noise excitation, or a tonal waveform. The auditory N1m response was recorded with whole-head magnetoencephalography (MEG) from ten human subjects in order to resolve whether brain events reflecting articulation and phonation are specific to the left or right hemisphere of the human brain.
