Similar articles
20 similar articles found (search time: 0 ms)
1.
The present study was undertaken to examine if a subject's voice F0 responded not only to perturbations in pitch of voice feedback but also to changes in pitch of a side tone presented congruent with voice feedback. Small magnitude brief duration perturbations in pitch of voice or tone auditory feedback were randomly introduced during sustained vowel phonations. Results demonstrated a higher rate and larger magnitude of voice F0 responses to changes in pitch of the voice compared with a triangular-shaped tone (experiment 1) or a pure tone (experiment 2). However, response latencies did not differ across voice or tone conditions. Data suggest that subjects responded to the change in F0 rather than harmonic frequencies of auditory feedback because voice F0 response prevalence, magnitude, or latency did not statistically differ across triangular-shaped tone or pure-tone feedback. Results indicate the audio-vocal system is sensitive to the change in pitch of a variety of sounds, which may represent a flexible system capable of adapting to changes in the subject's voice. However, lower prevalence and smaller responses to tone pitch-shifted signals suggest that the audio-vocal system may resist changes to the pitch of other environmental sounds when voice feedback is present.

2.
Listeners' auditory discrimination of vowel sounds depends in part on the order in which stimuli are presented. Such presentation order effects have been argued to be language independent, and to result from psychophysical (not speech- or language-specific) factors such as the decay of memory traces over time or increased weighting of later-occurring stimuli. In the present study, native Cantonese speakers' discrimination of a linguistic tone continuum is shown to exhibit order of presentation effects similar to those shown for vowels in previous studies. When presented with two successive syllables differing in fundamental frequency by approximately 4 Hz, listeners were significantly more sensitive to this difference when the first syllable was higher in frequency than the second. However, American English-speaking listeners with no experience listening to Cantonese showed no such contrast effect when tested in the same manner using the same stimuli. Neither English nor Cantonese listeners showed any order of presentation effects in the discrimination of a nonspeech continuum in which tokens had the same fundamental frequencies as the Cantonese speech tokens but had a qualitatively non-speech-like timbre. These results suggest that tone presentation order effects, unlike vowel effects, may be language specific, possibly resulting from the need to compensate for utterance-related pitch declination when evaluating fundamental frequency for tone identification.

3.
The discrimination of the fundamental frequency (f0) of pairs of complex tones with no common harmonics is worse than the discrimination of f0 for tones with all harmonics in common. These experiments were conducted to assess whether this effect is a result of pitch shifts between pairs of tones without common harmonics or whether it reflects influences of spectral differences (timbre) on the accuracy of pitch perception. In experiment 1, pitch matches were obtained between sounds drawn from the following types: (1) pure tones (P) with frequencies 100, 200, or 400 Hz; (2) a multiple-component complex tone, designated A, with harmonics 3, 4, 8, 9, 10, 14, 15, and f0 = 100, 200, or 400 Hz; (3) a multiple-component complex tone, designated B, with harmonics 5, 6, 7, 11, 12, 13, 16, and with f0 = 100, 200, or 400 Hz. The following matches were made: A vs A, B vs B, A vs P, B vs P and P vs P. Pitch shifts were found between the pure tones and the complex tones (A vs P and B vs P), but not between the A and B tones (A vs B). However, the variability of the A vs B matches was significantly greater than that of the A vs A or B vs B matches. Also, the variability of the A vs P and B vs P matches was greater than that for the A vs B matches. In a second experiment, frequency difference limens (DLCs) were measured for the A vs A, B vs B, and A vs B pairs of sounds. The DLCs were larger for the A vs B pair than for A vs A or B vs B. The results suggest that the poor frequency discrimination of tones with no common harmonics does not result from pitch shifts between the tones. Rather, it seems that spectral differences between tones interfere with judgements of their relative pitch.

4.
Learning to perceive pitch differences (total citations: 2; self-citations: 0; citations by others: 2)
This paper reports two experiments concerning the stimulus specificity of pitch discrimination learning. In experiment 1, listeners were initially trained, during ten sessions (about 11,000 trials), to discriminate a monaural pure tone of 3000 Hz from ipsilateral pure tones with slightly different frequencies. The resulting perceptual learning (improvement in discrimination thresholds) appeared to be frequency-specific since, in subsequent sessions, new learning was observed when the 3000-Hz standard tone was replaced by a standard tone of 1200 Hz, or 6500 Hz. By contrast, a subsequent presentation of the initial tones to the contralateral ear showed that the initial learning was not, or was only weakly, ear-specific. In experiment 2, training in pitch discrimination was initially provided using complex tones that consisted of harmonics 3-7 of a missing fundamental (near 100 Hz for some listeners, 500 Hz for others). Subsequently, the standard complex was replaced by a standard pure tone with a frequency which could be either equal to the standard complex's missing fundamental or remote from it. In the former case, the two standard stimuli were matched in pitch. However, this perceptual relationship did not appear to favor the transfer of learning. Therefore, the results indicated that pitch discrimination learning is, at least to some extent, timbre-specific, and cannot be viewed as a reduction of an internal noise which would affect directly the output of a neural device extracting pitch from both pure tones and complex tones including low-rank harmonics.

5.
Psychophysical experiments show that the pitch of a short sine wave tone depends upon the amplitude envelope of the tone. Subjects find that the pitch of an exponentially decaying tone (1 dB/ms) is higher than the pitch of a (20-ms) rectangularly gated tone of equal frequency. The percentage difference in frequency required to produce equal pitches with the two envelopes depends upon frequency f0: 2.6% at f0 = 412 Hz, 1.4% at f0 = 825 Hz, 1% at f0 = 1650 Hz, and 0.7% at f0 = 3300 Hz. The pitch change is insensitive to the relative intensities of the two tones. The spectra of tones with the two different envelopes suggest no obvious explanation for the pitch change. However, the weighted time-varying spectra for tones with two different envelopes evolve differently with time. Alternatively, the pitch change can be derived from a modified version of the auditory phase theory of Huggins.

6.
The aim of this paper is to determine whether "perception-action" dissociation, which is well documented in vision, may also be found in auditory information processing. Trained singers were asked to produce vowel sounds into a microphone. The sound that each singer produced was fed back to their ears via headphones. Two seconds after the sound production had begun, the auditory feedback was shifted in pitch by a certain degree (9, 19, 50, or 99 cents in either direction). In every set of sounds, instances without any pitch shifts also appeared. After each trial, participants reported whether they were aware of a pitch change or not. Even though the participants were unaware of subtle pitch changes, the fundamental frequency of their vowel production shifted slightly in the opposite direction to the pitch shift. These results show that auditory information is processed by two separate systems: one for perception and one for action. They also show that the function of the auditory control system differs from the visual control system. The latter is used to control bodily movements while the function of the former is a nonconscious, instant control of vocalization.
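To give a sense of scale for the shift sizes quoted in this abstract, the sketch below converts a shift in cents to a frequency using the standard relation f' = f · 2^(c/1200). It is an illustration only and not taken from the paper: the function name `shift_by_cents` and the 220 Hz starting frequency are assumptions.

```python
def shift_by_cents(freq_hz: float, cents: float) -> float:
    """Return freq_hz shifted by `cents` (100 cents = 1 equal-tempered semitone)."""
    return freq_hz * 2.0 ** (cents / 1200.0)


if __name__ == "__main__":
    base_f0 = 220.0  # hypothetical sung F0; the abstract does not give one
    for cents in (9, 19, 50, 99):  # shift magnitudes listed in the abstract
        down = shift_by_cents(base_f0, -cents)
        up = shift_by_cents(base_f0, cents)
        print(f"{cents:>2} cents: {down:.2f} Hz (down), {up:.2f} Hz (up)")
```

Under these assumptions, the smallest shift (9 cents) moves a 220 Hz tone by roughly 1 Hz, which is consistent with the abstract's description of such changes as subtle.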

7.
A pitch-synchronous analysis of hoarseness in running speech (total citations: 3; self-citations: 0; citations by others: 3)
A method of pitch-synchronous acoustic analysis of hoarseness requiring a voice sample of only four fundamental periods is presented. This method calculates a noise-to-signal (N/S) ratio, which indicates the depth of valleys between harmonic peaks in the power spectrum. The spectrum is calculated pitch synchronously from a Fourier transform of the signal, windowed through a continuously variable Hanning window spanning exactly four fundamental periods. A two-stage procedure is used to determine the exact duration of the four fundamental periods. An initial estimate is obtained using autocorrelation in the time domain. A more precise estimate is obtained in the frequency domain by minimizing the errors between the preliminary calculated power spectrum and the predicted spectrum spread of a windowed harmonic signal. Analysis of synthesized voices showed that the N/S ratio is sensitive to additive noise, jitter, and shimmer, and is insensitive to slow (8 Hz) modulation in fundamental frequency and amplitude. An analysis of pre- and postoperative voices of six patients with benign laryngeal disease showed that the N/S ratio for vowel /u/ in running speech consistently improved after surgery for all subjects, in agreement with their successful therapeutic results.
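The first stage described in this abstract (a time-domain autocorrelation estimate of the fundamental period, followed by a Hanning window spanning four periods) can be sketched roughly as follows. This is a simplified illustration, not the authors' implementation: the function names, the 60-500 Hz search range, and the plain peak-picking rule are assumptions, and the frequency-domain refinement and the N/S ratio itself are not reproduced.

```python
import math


def estimate_period(signal, sample_rate, f_min=60.0, f_max=500.0):
    """Return the lag (in samples) with the largest mean autocorrelation.

    The search range f_min..f_max is an assumed plausible voice F0 range.
    """
    lag_min = int(sample_rate / f_max)
    lag_max = int(sample_rate / f_min)
    best_lag, best_val = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        n = len(signal) - lag
        # Mean product of the signal with a copy of itself delayed by `lag`.
        val = sum(signal[i] * signal[i + lag] for i in range(n)) / n
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag


def hann_window(length):
    """Hann (Hanning) window of `length` samples, periodic form."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / length) for i in range(length)]


def four_period_window(signal, sample_rate):
    """Window sized to four estimated fundamental periods, as in the abstract."""
    return hann_window(4 * estimate_period(signal, sample_rate))
```

For a clean 100 Hz signal sampled at 8000 Hz, the estimated period would be 80 samples and the window 320 samples long. Real voice signals would need the more precise second-stage estimate that the abstract describes.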

8.
Three experiments examined the ability of listeners to identify steady-state synthetic vowel-like sounds presented concurrently in pairs to the same ear. Experiment 1 confirmed earlier reports that listeners identify the constituents of such pairs more accurately when they differ in fundamental frequency (f0) by about a half semitone or more, compared to the condition where they have the same f0. When the constituents have different f0's, corresponding harmonics of the two vowels are misaligned in frequency and corresponding pitch periods are asynchronous in time. These differences provide cues that might aid identification. Experiments 2 and 3 determined whether listeners can use these cues, divorced from a difference in f0, to improve their accuracy of identification. Harmonic misalignment was beneficial when the constituents had an f0 of 200 Hz so that the harmonics of each constituent were well separated in frequency. Pitch-period asynchrony was beneficial when the constituents had an f0 of 50 Hz so that the onsets of the pitch periods of each constituent were well separated in time. Neither cue was beneficial when both constituents had an f0 of 100 Hz. It is unlikely, therefore, that either cue contributed to the improvement in performance found in Experiment 1 where the constituents were given different f0's close to 100 Hz. Rather, it is argued that performance improved in Experiment 1 primarily because the two f0's specified two pitches that could be used to segregate the contributions of each vowel in the composite waveform.

9.
The preferences of experienced listeners for pitch and formant frequency dispersion in unison choir sounds were explored using synthesized stimuli. Two types of dispersion were investigated: (a) pitch scatter, which arises when voices in an ensemble exhibit small differences in mean fundamental frequency, and (b) spectral smear, defined as such dispersion of formants 3 to 5 as arises from differences in vocal tract length. Each stimulus represented a choir section of five bass, tenor, alto, or soprano voices, producing the vowel [u], [a], or [w]. Subjects chose one dispersion level out of six available, selecting the "maximum tolerable" in a first run and the "preferred" in a second run. The listeners differed widely in their tolerance for dispersion. Typical scatter choices were a standard deviation of 14 cents for "tolerable" and 0 or 5 cents for "preferred." The smear choices were less consistent; the standard deviations were 12 and 7%, respectively. In all modes of assessment, the largest dispersion was chosen for the vowel [u] on a bass tone. There was a vowel effect on the smear choices. The effects of voice category were not significant.

10.
The perceptual fusion of harmonics is often assumed to result from the operation of a template mechanism that is also responsible for computing global pitch. This dual-role hypothesis was tested using frequency-shifted complexes. These sounds are inharmonic, but preserve a regular pattern of equal component spacing. The stimuli had a nominal fundamental (F0) frequency of 200 Hz (±20%), and were frequency shifted either by 25.0% or 37.5% of F0. Three consecutive components (6-8) were removed and replaced with a sinusoidal probe, located at one of a set of positions spanning the gap. On any trial, subjects heard a complex tone followed by an adjustable pure tone in a continuous loop. Subjects were well able to match the pitch of the probe unless it corresponded with a position predicted by the spectral pattern of the complex. Peripheral factors could not account for this finding. In contrast, hit rates were not depressed for probes positioned at integer multiples of the F0(s) corresponding to the global pitch(es) of the complex, predicted from previous data [Patterson, J. Acoust. Soc. Am. 53, 1565-1572 (1973)]. These findings suggest that separate central mechanisms are responsible for computing global pitch and for the perceptual grouping of partials.
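A frequency-shifted complex of the kind described in this abstract can be written down directly: each component sits at (n + s) · F0, where s is the shift as a fraction of F0, so the spacing stays equal to F0 while the components stop being integer multiples of it. The sketch below is illustrative only; the function name and the choice of components 1 to 12 are assumptions, since the abstract does not state the full component range.

```python
def shifted_complex(f0, shift_fraction, component_numbers, removed=()):
    """Component frequencies (Hz) of a frequency-shifted complex.

    Each component n is placed at (n + shift_fraction) * f0, so neighbouring
    components remain f0 apart but are no longer harmonics of f0.
    """
    return [(n + shift_fraction) * f0 for n in component_numbers if n not in removed]


if __name__ == "__main__":
    # Nominal F0 of 200 Hz shifted by 25% of F0, with components 6-8 removed
    # as in the abstract; the 1..12 range is a hypothetical choice.
    freqs = shifted_complex(200.0, 0.25, range(1, 13), removed=(6, 7, 8))
    print(freqs)
```

With these assumed values the lowest component lands at 250 Hz, and the gap left by components 6 to 8 spans 1050 Hz to 1850 Hz, which is the region in which a probe would be placed.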

11.
Frequency modulation coherence was investigated as a possible cue for the perceptual segregation of concurrent sound sources. Synthesized chords of 2-s duration and comprising six permutations of three sung vowels (/a/, /i/, /o/) at three fundamental frequencies (130.8, 174.6, and 233.1 Hz) were constructed. In one condition, no vowels were modulated, and, in a second, all three were modulated coherently such that the ratio relations among all frequency components were maintained. In a third group of conditions, one vowel was modulated, while the other two remained steady. In a fourth group, one vowel was modulated independently of the two other vowels, which were modulated coherently with one another. Subjects were asked to judge the perceived prominence of each of the three vowels in each chord. Judged prominence increased significantly when the target vowel was modulated compared to when it was not, with the greatest increase being found for higher fundamental frequencies. The increase in prominence with modulation was unaffected by whether the target was modulated coherently or not with nontarget vowels. The modulation and pitch position of nontarget vowels had no effect on target vowel prominence. These results are discussed in terms of possible concurrent auditory grouping principles.

12.
Three experiments investigated how the onset asynchrony and ear of presentation of a single mistuned frequency component influence its contribution to the pitch of an otherwise harmonic complex tone. Subjects matched the pitch of the target complex by adjusting the pitch of a second similar but strictly periodic complex tone. When the mistuned component (the 4th harmonic of a 155 Hz fundamental) started 160 ms or more before the remaining harmonics but stopped simultaneously with them, it made a reduced contribution to the pitch of the complex. It made no contribution if it started more than 300 ms before. Pitch shifts and their reduction with onset time were larger for short (90 ms) sounds than for long (410 ms). Pitch shifts were slightly larger when the mistuned component was presented to the same ear as the remaining 11 in-tune harmonics than to the opposite ear. Adding a "captor" complex tone with a fundamental of 200 Hz and a missing 3rd harmonic to the contralateral ear did not augment the effect of onset time, even though the captor was synchronous with the mistuned harmonic, the mistuned component was equal in frequency to the missing 3rd harmonic of the captor complex tone and it was played to the same ear as the captor. The results show that a difference in onset time can prevent a resolved frequency component from contributing to the pitch of a complex tone even though it is present throughout that complex tone.

13.
The contribution of extraneous sounds to the perceptual estimation of the first-formant (F1) frequency of voiced vowels was investigated using a continuum of vowels perceived as changing from /I/ to /ɛ/ as F1 was increased. Any phonetic effects of adding extraneous sounds were measured as a change in the position of the phoneme boundary on the continuum. Experiments 1-5 demonstrated that a pair of extraneous tones, mistuned from harmonic values of the fundamental frequency of the vowel, could influence perceived vowel quality when added in the F1 region. Perceived F1 frequency was lowered when the tones were added on the lower skirt of F1, and raised when they were added on the upper skirt. Experiments 6 and 7 demonstrated that adding a narrow-band noise in the F1 region could produce a similar pattern of boundary shifts, despite the differences in temporal properties and timbre between a noise band and a voiced vowel. The data are interpreted using the concept of the harmonic sieve [Duifhuis et al., J. Acoust. Soc. Am. 71, 1568-1580 (1982)]. The results imply a partial failure of the harmonic sieve to exclude extraneous sounds from the perceptual estimation of F1 frequency. Implications for the nature of the hypothetical harmonic sieve are discussed.

14.
The ability of baboons to discriminate changes in the formant structures of a synthetic baboon grunt call and an acoustically similar human vowel (/ɛ/) was examined to determine how comparable baboons are to humans in discriminating small changes in vowel sounds, and whether or not any species-specific advantage in discriminability might exist when baboons discriminate their own vocalizations. Baboons were trained to press and hold down a lever to produce a pulsed train of a standard sound (e.g., /ɛ/ or a baboon grunt call), and to release the lever only when a variant of the sound occurred. Synthetic variants of each sound had the same first and third through fifth formants (F1 and F3-5), but varied in the location of the second formant (F2). Thresholds for F2 frequency changes were 55 and 67 Hz for the grunt and vowel stimuli, respectively, and were not statistically different from one another. Baboons discriminated changes in vowel formant structures comparable to those discriminated by humans. No distinct advantages in discrimination performances were observed when the baboons discriminated these synthetic grunt vocalizations.

15.
Two sounds with the same pitch may differ from each other in the saliency of their pitch sensation. This perceptual attribute is called "pitch strength." The study of voice pitch strength may be important in quantifying normal and pathological qualities. The present study investigated how pitch strength varies across normal and dysphonic voices. A set of voices (vowel /a/) selected from the Kay Elemetrics Disordered Voice Database served as the stimuli. These stimuli demonstrated a wide range of voice quality. Ten listeners judged the pitch strength of these stimuli in an anchored magnitude estimation task. On a given trial, listeners heard three different stimuli. The first stimulus represented very low pitch strength (wide-band noise), the second stimulus consisted of the target voice and the third stimulus represented very high pitch strength (pure tone). Listeners estimated pitch strength of the target voice by positioning a continuous slider labeled with values between 0 and 1, reflecting the two anchor stimuli. Results revealed that listeners can judge pitch strength reliably in dysphonic voices. Moderate to high correlations with perceptual judgments of voice quality suggest that pitch strength may contribute to voice quality judgments.

16.
It is difficult to hear out individually the components of a "chord" of equal-amplitude pure tones with synchronous onsets and offsets. In the present study, this was confirmed using 300-ms random (inharmonic) chords with components at least 1/2 octave apart. Following each chord, after a variable silent delay, listeners were presented with a single pure tone which was either identical to one component of the chord or halfway in frequency between two components. These two types of sequence could not be reliably discriminated from each other. However, it was also found that if the single tone following the chord was instead slightly (e.g., 1/12 octave) lower or higher in frequency than one of its components, the same listeners were sensitive to this relation. They could perceive a pitch shift in the corresponding direction. Thus, it is possible to perceive a shift in a nonperceived frequency/pitch. This paradoxical phenomenon provides psychophysical evidence for the existence of automatic "frequency-shift detectors" in the human auditory system. The data reported here suggest that such detectors operate at an early stage of auditory scene analysis but can be activated by a pair of sounds separated by a few seconds.

17.
Recent studies have demonstrated that mothers exaggerate phonetic properties of infant-directed (ID) speech. However, these studies focused on a single acoustic dimension (frequency), whereas speech sounds are composed of multiple acoustic cues. Moreover, little is known about how mothers adjust phonetic properties of speech to children with hearing loss. This study examined mothers' production of frequency and duration cues to the American English tense/lax vowel contrast in speech to profoundly deaf (N = 14) and normal-hearing (N = 14) infants, and to an adult experimenter. First and second formant frequencies and vowel duration of tense (/i/, /u/) and lax (/I/, /ʊ/) vowels were measured. Results demonstrated that for both infant groups mothers hyperarticulated the acoustic vowel space and increased vowel duration in ID speech relative to adult-directed speech. Mean F2 values were decreased for the /u/ vowel and increased for the /I/ vowel, and vowel duration was longer for the /i/, /u/, and /I/ vowels in ID speech. However, neither acoustic cue differed in speech to hearing-impaired or normal-hearing infants. These results suggest that both formant frequencies and vowel duration that differentiate American English tense/lax vowel contrasts are modified in ID speech regardless of the hearing status of the addressee.

18.
Two experiments investigated the role of the regularity of the frequency spacing of harmonics, as a separate factor from harmonicity, on the perception of the virtual pitch of a harmonic series. The first experiment compared the shifts produced by mistuning the 3rd, 4th, and 5th harmonics in the pitch of two harmonic series: the odd-H and the all-H tones. The odd-H tone contained odd harmonics 1 to 11, plus the 4th harmonic; the all-H tone contained harmonics 1 to 12. Both tones had a fundamental frequency of 155 Hz. Pitch shifts produced by mistuning the 3rd harmonic, but not the 4th and 5th harmonics, were found to be significantly larger for the odd-H tone than for the all-H tone. This finding was consistent with the idea that grouping by spectral regularity affects pitch perception since an odd harmonic made a larger contribution than an adjacent even harmonic to the pitch of the odd-H tone. However, an alternative explanation was that the 3rd mistuned harmonic produced larger pitch shifts within the odd-H tone than the 4th mistuned harmonic because of differences in the partial masking of these harmonics by adjacent harmonics. The second experiment tested these explanations by measuring pitch shifts for a modified all-H tone in which each mistuned odd harmonic was tested in the presence of the 4th harmonic, but in the absence of its other even-numbered neighbor. The results showed that, for all mistuned harmonics, pitch shifts for the modified all-H tone were not significantly different from those for the odd-H tone. These findings suggest that the harmonic relations among frequency components, rather than the regularity of their frequency spacing, are the primary factor for the perception of the virtual pitch of complex sounds.

19.
While numerous studies on infant perception have demonstrated the infant's ability to discriminate sounds having different frequencies, little research has evaluated more sophisticated pitch perception abilities such as perceptual constancy and perception of the missing fundamental. In the present study 7-8-month-old infants demonstrated the ability to discriminate harmonic complexes from two pitch categories that differed in pitch by approximately 20% (e.g., 160 vs 200 Hz). Using a visually reinforced conditioned head-turning paradigm, a number of spectrally different tonal complexes that contained varying harmonic components but signaled the same two pitch categories were presented. After learning the basic pitch discrimination, the same infants learned to categorize spectrally different tonal complexes according to the pitches signaled by their fundamental frequencies. That is, the infants showed evidence of perceptual constancy for the pitch of harmonic complexes. Finally, infants heard tonal complexes that signaled the same pitch categories but for which the fundamental frequency was removed. Infants were still able to categorize the harmonic complexes according to their pitch categories. These results suggest that by 7 months of age infants show fairly sophisticated pitch perception abilities similar to those demonstrated by adults.

20.
Recent simulations of continuous interleaved sampling (CIS) cochlear implant speech processors have used acoustic stimulation that provides only weak cues to pitch, periodicity, and aperiodicity, although these are regarded as important perceptual factors of speech. Four-channel vocoders simulating CIS processors have been constructed, in which the salience of speech-derived periodicity and pitch information was manipulated. The highest salience of pitch and periodicity was provided by an explicit encoding, using a pulse carrier following fundamental frequency for voiced speech, and a noise carrier during voiceless speech. Other processors included noise-excited vocoders with envelope cutoff frequencies of 32 and 400 Hz. The use of a pulse carrier following fundamental frequency gave substantially higher performance in identification of frequency glides than did vocoders using envelope-modulated noise carriers. The perception of consonant voicing information was improved by processors that preserved periodicity, and connected discourse tracking rates were slightly faster with noise carriers modulated by envelopes with a cutoff frequency of 400 Hz compared to 32 Hz. However, consonant and vowel identification, sentence intelligibility, and connected discourse tracking rates were generally similar through all of the processors. For these speech tasks, pitch and periodicity beyond the weak information available from 400 Hz envelope-modulated noise did not contribute substantially to performance.
