Similar literature
1.
This study quantifies sex differences in the acoustic structure of vowel-like grunt vocalizations in baboons (Papio spp.) and tests the basic perceptual discriminability of these differences to baboon listeners. Acoustic analyses were performed on 1028 grunts recorded from 27 adult baboons (11 males and 16 females) in southern Africa, focusing specifically on the fundamental frequency (F0) and formant frequencies. The mean F0 and the mean frequencies of the first three formants were all significantly lower in males than they were in females, more dramatically so for F0. Experiments using standard psychophysical procedures subsequently tested the discriminability of adult male and adult female grunts. After learning to discriminate the grunt of one male from that of one female, five baboon subjects subsequently generalized this discrimination both to new call tokens from the same individuals and to grunts from novel males and females. These results are discussed in the context of both the possible vocal anatomical basis for sex differences in call structure and the potential perceptual mechanisms involved in their processing by listeners, particularly as these relate to analogous issues in human speech production and perception.  相似文献   

2.
The four experiments reported here measure listeners' accuracy and consistency in adjusting a formant frequency of one- or two-formant complex sounds to match the timbre of a target sound. By presenting the target and the adjustable sound on different fundamental frequencies, listeners are prevented from performing the task by comparing the absolute or relative levels of resolved spectral components. Experiment 1 uses two-formant vowellike sounds. When the two sounds have the same F0, the variability of matches (within-subject standard deviation) for either the first or the second formant is around 1%-3%, which is comparable to existing data on formant frequency discrimination thresholds. With a difference in F0, variability increases to around 8% for first-formant matches, but to only about 4% for second-formant matches. Experiment 2 uses sounds with a single formant at 1100 or 1200 Hz with both sounds on either low or high fundamental frequencies. The increase in variability produced by a difference in F0 is greater for high F0's (where the harmonics close to the formant peak are resolved) than it is for low F0's (where they are unresolved). Listeners also showed systematic errors in their mean matches to sounds with different high F0's. The direction of the systematic errors was towards the most intense harmonic. Experiments 3 and 4 showed that introduction of a vibratolike frequency modulation (FM) on F0 reduces the variability of matches, but does not reduce the systematic error. The experiments demonstrate, for the specific frequencies and FM used, that there is a perceptual cost to interpolating a spectral envelope across resolved harmonics.  相似文献   

3.
Fundamental frequency difference limens (F0DLs) were measured for a target harmonic complex tone with nominal fundamental frequency (F0) of 200 Hz, in the presence and absence of a harmonic masker with overlapping spectrum. The F0 of the masker was 0, ± 3, or ± 6 semitones relative to 200 Hz. The stimuli were bandpass filtered into three regions: 0-1000 Hz (low, L), 1600-2400 Hz (medium, M), and 2800-3600 Hz (high, H), and a background noise was used to mask combination tones and to limit the audibility of components falling on the filter skirts. The components of the target or masker started either in cosine or random phase. Generally, the effect of F0 difference between target and masker was small. For the target alone, F0DLs were larger for random than cosine phase for region H. For the target plus masker, F0DLs were larger when the target had random phase than cosine phase for regions M and H. F0DLs increased with increasing center frequency of the bandpass filter. Modeling using excitation patterns and "summary autocorrelation" and "stabilized auditory image" models suggested that use of temporal fine structure information can account for the small F0DLs obtained when harmonics are barely, if at all, resolved.  相似文献   

4.
The two experiments described here use a formant-matching task to investigate what abstract representations of sound are available to listeners. The first experiment examines how veridically and reliably listeners can adjust the formant frequency of a single-formant sound to match the timbre of a target single-formant sound that has a different bandwidth and either the same or a different fundamental frequency (F0). Comparison with previous results [Dissard and Darwin, J. Acoust. Soc. Am. 106, 960-969 (2000)] shows that (i) for sounds on the same F0, introducing a difference in bandwidth increases the variability of matches regardless of whether the harmonics close to the formant are resolved or unresolved; (ii) for sounds on different F0's, introducing a difference in bandwidth only increases variability for sounds that have unresolved harmonics close to the formant. The second experiment shows that match variability for sounds differing in F0, but with the same bandwidth and with resolved harmonics near the formant peak, is not influenced by the harmonic spacing or by the alignment of harmonics with the formant peak. Overall, these results indicate that match variability increases when the match cannot be made on the basis of the excitation pattern, but match variability does not appear to depend on whether ideal matching performance requires simply interpolation of a spectral envelope or also the extraction of the envelope's peak frequency.  相似文献   

5.
Vocal vibrato and tremor are characterized by oscillations in voice fundamental frequency (F0). These oscillations may be sustained by a control loop within the auditory system. One component of the control loop is the pitch-shift reflex (PSR). The PSR is a closed loop negative feedback reflex that is triggered in response to discrepancies between intended and perceived pitch with a latency of approximately 100 ms. Consecutive compensatory reflexive responses lead to oscillations in pitch every approximately 200 ms, resulting in approximately 5-Hz modulation of F0. Pitch-shift reflexes were elicited experimentally in six subjects while they sustained /u/ vowels at a comfortable pitch and loudness. Auditory feedback was sinusoidally modulated at discrete integer frequencies (1 to 10 Hz) with +/- 25 cents amplitude. Modulated auditory feedback induced oscillations in voice F0 output of all subjects at rates consistent with vocal vibrato and tremor. Transfer functions revealed peak gains at 4 to 7 Hz in all subjects, with an average peak gain at 5 Hz. These gains occurred in the modulation frequency region where the voice output and auditory feedback signals were in phase. A control loop in the auditory system may sustain vocal vibrato and tremorlike oscillations in voice F0.  相似文献   

6.
Three experiments examined the ability of listeners to identify steady-state synthetic vowel-like sounds presented concurrently in pairs to the same ear. Experiment 1 confirmed earlier reports that listeners identify the constituents of such pairs more accurately when they differ in fundamental frequency (f0) by about a half semitone or more, compared to the condition where they have the same f0. When the constituents have different f0's, corresponding harmonics of the two vowels are misaligned in frequency and corresponding pitch periods are asynchronous in time. These differences provide cues that might aid identification. Experiments 2 and 3 determined whether listeners can use these cues, divorced from a difference in f0, to improve their accuracy of identification. Harmonic misalignment was beneficial when the constituents had an f0 of 200 Hz so that the harmonics of each constituent were well separated in frequency. Pitch-period asynchrony was beneficial when the constituents had an f0 of 50 Hz so that the onsets of the pitch periods of each constituent were well separated in time. Neither cue was beneficial when both constituents had an f0 of 100 Hz. It is unlikely, therefore, that either cue contributed to the improvement in performance found in Experiment 1 where the constituents were given different f0's close to 100 Hz. Rather, it is argued that performance improved in Experiment 1 primarily because the two f0's specified two pitches that could be used to segregate the contributions of each vowel in the composite waveform.  相似文献   

7.
Experiment 1 measured frequency modulation detection thresholds (FMTs) for harmonic complex tones as a function of modulation rate. Six complexes were used, with fundamental frequencies (F0s) of either 88 or 250 Hz, bandpass filtered into a LOW (125-625 Hz), MID (1375-1875 Hz) or HIGH (3900-5400 Hz) frequency region. The FMTs were about an order of magnitude greater for the three complexes whose harmonics were unresolved by the peripheral auditory system (F0 = 88 Hz in the MID region and both F0s in the HIGH region) than for the other three complexes, which contained some resolved harmonics. Thresholds increased with increases in FM rate above 2 Hz for all conditions. The increase was larger when the F0 was 88 Hz than when it was 250 Hz, and was also larger in the LOW than in the MID and HIGH regions. Experiment 2 measured thresholds for detecting mistuning produced by modulating the F0s of two simultaneously presented complexes out of phase by 180 degrees. The size of the resulting mistuning oscillates at a rate equal to the rate of FM applied to the two carriers. At low FM rates, thresholds were lowest when the harmonics were either resolved for both complexes or unresolved for both complexes, and highest when resolvability differed across complexes. For pairs of complexes with resolved harmonics, mistuning thresholds increased dramatically as the FM rate was increased above 2-5 Hz, in a way which could not be accounted for by the effect of modulation rate on the FMTs for the individual complexes. A third experiment, in which listeners detected constant ("static") mistuning between pairs of frequency-modulated complexes, provided evidence that this deterioration was due to the harmonics in one of the two "resolved" complexes becoming unresolved at high FM rates, when analyzed over some finite time window. It is concluded that the detection of time-varying mistuning between groups of harmonics is limited by factors that are not apparent in FM detection data.

8.
The purpose of this investigation was to study voice changes during a working day. The subjects consisted of 33 female primary and secondary schoolteachers who recorded their first and last lessons during one school day. The subjects were studied both as one group and as two subgroups (those with many and those with few voice complaints). Estimates of fundamental frequency (F0), sound pressure level (SPL), the standard deviations of these values (F0 SD; SPL SD) and F0 time (vibration time of vocal folds) were made. The most obvious change due to loading was the rise of F0 that was 9.7 Hz between the first and last lesson (P = 0.00). F0 increased more (12.8 Hz, P = 0.006) in the subgroup with few complaints.

9.
Frequency and intensity discrimination in humans and monkeys (total citations: 1; self-citations: 0; citations by others: 1)
Frequency and intensity DLs were compared in humans and monkeys using a repeating standard "yes-no" procedure in which subjects reported frequency increments, frequency decrements, intensity increments, or intensity decrements in an ongoing train of 1.0-kHz tone bursts. There was only one experimental condition (intensity increments) in which monkey DLs (1.5-2.0 dB) overlapped those of humans (1.0-1.8 dB). For discrimination of both increments and decrements in frequency, monkey DLs (16-33 Hz) were approximately seven times larger than those of humans (2.4-4.8 Hz), and for discrimination of intensity decrements, monkey DLs (4.4-7.0 dB) were very unstable and larger than those of humans (1.0-1.8 dB). For intensity increment discrimination, humans and monkeys also exhibited similar DLs as SL was varied. However, for frequency increment discrimination, best DLs for humans occurred at a high (50 dB) SL, whereas best DLs for monkeys occurred at a moderate (30 dB) SL. Results are discussed in terms of various neural mechanisms that might be differentially engaged by humans and monkeys in performing these tasks; for example, different amounts of temporal versus rate coding in frequency discrimination, and different mechanisms for monitoring rate decreases in intensity discrimination. The implications of these data for using monkeys as models of human speech sound discrimination are also discussed.  相似文献   

10.
The effect of the filter bank on fundamental frequency (F0) discrimination was examined in four Nucleus CI24 cochlear implant subjects for synthetic stylized vowel-like stimuli. The four tested filter banks differed in cutoff frequencies, amount of overlap between filters, and shape of the filters. To assess the effects of temporal pitch cues on F0 discrimination, temporal fluctuations were removed above 10 Hz in one condition and above 200 Hz in another. Results indicate that F0 discrimination based upon place pitch cues is possible, but just-noticeable differences exceed 1 octave or more depending on the filter bank used. Increasing the frequency resolution in the F0 range improves the F0 discrimination based upon place pitch cues. The results of F0 discrimination based upon place pitch agree with a model that compares the centroids of the electrical excitation pattern. The addition of temporal fluctuations up to 200 Hz significantly improves F0 discrimination. Just-noticeable differences using both place and temporal pitch cues range from 6% to 60%. Filter banks that do not resolve the higher harmonics provided the best temporal pitch cues, because temporal pitch cues are clearest when the fluctuation on all channels is at F0 and preferably in phase.  相似文献   

11.
The purpose of this study was to compare the role of frequency selectivity in measures of auditory and vibrotactile temporal resolution. In the first experiment, temporal modulation transfer functions for a sinusoidally amplitude modulated (SAM) 250-Hz carrier revealed auditory modulation thresholds significantly lower than corresponding vibrotactile modulation thresholds at SAM frequencies greater than or equal to 100 Hz. In the second experiment, auditory and vibrotactile gap detection thresholds were measured by presenting silent gaps bounded by markers of the same or different frequency. The marker frequency F1 = 250 Hz preceded the silent gap and marker frequencies after the silent gap included F2 = 250, 255, 263, 310, and 325 Hz. Auditory gap detection thresholds were lower than corresponding vibrotactile thresholds for F2 markers less than or equal to 263 Hz, but were greater than the corresponding vibrotactile gap detection thresholds for F2 markers greater than or equal to 310 Hz. When the auditory gap detection thresholds were transformed into filter attenuation values, the results were modeled well by a constant-percentage (10%) bandwidth filter centered on F1. The vibrotactile gap detection thresholds, however, were independent of marker frequency separation. In a third experiment, auditory and vibrotactile rate difference limens (RDLs) were measured for a 250-Hz carrier at SAM rates less than or equal to 100 Hz. Auditory RDLs were lower than corresponding vibrotactile RDLs for standard rates greater than 10 Hz. Combination tones may have confounded auditory performance for standard rates of 80 and 100 Hz. The results from these experiments revealed that frequency selectivity influences auditory measures of temporal resolution, but there was no evidence of frequency selectivity affecting vibrotactile temporal resolution.  相似文献   

12.
In a two-interval, two-alternative, forced-choice (2I-2AFC) adaptive procedure, listeners discriminated between the fundamental frequencies (F0s) of two 100-ms harmonic target complexes. This ability can be impaired substantially by the presence of another complex (the "fringe") immediately before and after each target complex. It has been shown that for the impairment to occur (i) target and fringes have to be in the same frequency region; (ii) if all harmonics of target and fringes are unresolved then they may differ in F0; otherwise, they have to be similar [C. Micheyl and R. P. Carlyon, J. Acoust. Soc. Am. 104, 3006-3018 (1998)]. These findings have been discussed in terms of information about the fringe's F0 being included in the estimate of the F0 of the target, and in terms of auditory streaming. The present study investigated the role of perceived location and ipsilateral versus contralateral presentation of the fringes on F0 discrimination of the target. Experiment 1 used interaural level differences (ILDs), and experiment 2 used interaural time differences (ITDs) to create a range of lateralized perceptions of the 200-ms harmonic fringes. Difference limens for the F0 of the monaural target complex were measured in the presence and absence of the fringes. The nominal F0 was 88 or 250 Hz and could be the same or different for target and fringes. Stimuli were bandpass filtered between 125-625, 1375-1875, or 3900-5400 Hz. In both experiments, the effect of the fringes was reduced when their subjective location differed from that of the target. This reduction depended on the resolvability of both the fringes and the target. The effect of the fringes was reduced most (but still present), when fringes were presented purely contralaterally to the target. 
The results are consistent with the idea that the fringes produce interference when the listeners have difficulty segregating the target from the fringes, and that a difference in perceived location enhances segregation of the sequentially presented stimuli.  相似文献   

13.
Carlyon and Shackleton [J. Acoust. Soc. Am. 95, 3541-3554 (1994)] presented an influential study supporting the existence of two pitch mechanisms, one for complex tones containing resolved and one for complex tones containing only unresolved components. The current experiments provide an alternative explanation for their finding, namely the existence of across-frequency interference in fundamental frequency (F0) discrimination. Sensitivity (d') was measured for F0 discrimination between two sequentially presented 400 ms complex (target) tones containing only unresolved components. In experiment 1, the target was filtered between 1375 and 15,000 Hz, had a nominal F0 of 88 Hz, and was presented either alone or with an additional complex tone ("interferer"). The interferer was filtered between 125-625 Hz, and its F0 varied between 88 and 114.4 Hz across blocks. Sensitivity was significantly reduced in the presence of the interferer, and this effect decreased as its F0 was moved progressively further from that of the target. Experiment 2 showed that increasing the level of a synchronously gated lowpass noise that spectrally overlapped with the interferer reduced this "pitch discrimination interference (PDI)". In experiment 3A, the target was filtered between 3900 and 5400 Hz and had an F0 of either 88 or 250 Hz. It was presented either alone or with an interferer, filtered between 1375 and 1875 Hz with an F0 corresponding to the nominal target F0. PDI was larger in the presence of the resolved (250 Hz F0) than in the presence of the unresolved (88 Hz F0) interferer, presumably because the pitch of the former was more salient than that of the latter. Experiments 4A and 4B showed that PDI was reduced but not eliminated when the interferer was gated on 200 ms before and off 200 ms after the target, and that some PDI was observed with a continuous interferer. 
The current findings provide an alternative interpretation of a study supposedly providing strong evidence for the existence of two pitch mechanisms.  相似文献   

14.
Normal-hearing listeners' ability to "hear out" the pitch of a target harmonic complex tone (HCT) was tested with simultaneous HCT or noise maskers, all bandpass-filtered into the same spectral region (1200-3600 Hz). Target-to-masker ratios (TMRs) necessary to discriminate fixed fundamental-frequency (F0) differences were measured for target F0s between 100 and 400 Hz. At high F0s (400 Hz), asynchronous gating of masker and signal, presenting the masker in a different F0 range, and reducing the F0 rove of the masker, all resulted in improved performance. At the low F0s (100 Hz), none of these manipulations improved performance significantly. The findings are generally consistent with the idea that the ability to segregate sounds based on cues such as F0 differences and onset/offset asynchronies can be strongly limited by peripheral harmonic resolvability. However, some cases were observed where perceptual segregation appeared possible, even when no peripherally resolved harmonics were present in the mixture of target and masker. A final experiment, comparing TMRs necessary for detection and F0 discrimination, showed that F0 discrimination of the target was possible with noise maskers at only a few decibels above detection threshold, whereas similar performance with HCT maskers was only possible 15-25 dB above detection threshold.  相似文献   

15.
This 12-month prospective longitudinal study used acoustic analysis to identify phonational gaps in the vocal range of adolescent boys undergoing voice change and to investigate the relationship between the appearance of phonational gaps, weight gain, and changes in speaking fundamental frequency (SF0). Eighteen pubescent boys were recorded producing three descending and three ascending glides over their physiological voice range using the vowel "ah." Recordings were digitized over the range 0-16 kHz and then analyzed to determine both the frequency range and appearance and frequency characteristics of the phonational gaps. Data were plotted against changes in weight and SF0 both as an indicator of pubertal development and to test the hypothesis that changes in weight and SF0 were related to the appearance of phonational gaps. Results indicated that minimum F0 decreased significantly over the time period and phonational gaps increased significantly, but there were no significant changes in maximum F0 or range. Individual data indicated the initial appearance of a lower-frequency gap followed by a higher-frequency gap before the long-term establishment of a midrange gap. At time 5, all boys in the weight range 42.7-44.9 kg had either low- or high-range gaps. The SF0 for this group varied from 117 to 216 Hz. All boys heavier than 54.8 kg had highly variable phonational gaps. SF0 range for this group was 99.5-151 Hz. Transitory low- then high-frequency phonational gaps appeared before the establishment of a midrange phonational gap. In this study, these phonational gaps were associated with certain weight ranges and rapid weight gain, with changes to boys' speaking voices, and with loss of ability to use the mid- and falsetto vocal range.  相似文献   

16.
Thresholds (F0DLs) were measured for discrimination of the fundamental frequency (F0) of a group of harmonics (group B) embedded in harmonics with a fixed F0. Miyazono and Moore [(2009). Acoust. Sci. & Tech. 30, 383-386] found a large training effect for tones with high harmonics in group B, when the harmonics were added in cosine phase. It is shown here that this effect was due to use of a cue related to pitch pulse asynchrony (PPA). When PPA cues were disrupted by introducing a temporal offset between the envelope peaks of the harmonics in group B and the remaining harmonics, F0DLs increased markedly. Perceptual learning was examined using a training stimulus with cosine-phase harmonics, F0 = 50 Hz, and high harmonics in group B, under conditions where PPA was not useful. Learning occurred, and it transferred to other cosine-phase tones, but not to random-phase tones. A similar experiment with F0 = 100 Hz showed a learning effect which transferred to a cosine-phase tone with mainly high unresolved harmonics, but not to cosine-phase tones with low harmonics, and not to random-phase tones. The learning found here appears to be specific to tones for which F0 discrimination is based on distinct peaks in the temporal envelope.

17.
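Objective: To explore how the perceived quality of speech changes when it is produced under random and sinusoidal vibration. Methods: Random vibration covered the frequency range 2-20 Hz at accelerations of 0.3 G, 0.5 G, and 0.7 G (rms values, here and below); sinusoidal vibration used frequencies of 4, 6, 8, 10, and 12 Hz at accelerations of 0.3 G and 0.5 G. Intelligibility tests were run on the speech material of three groups (random vibration, sinusoidal vibration, and control) under three listening conditions: quiet, and signal-to-noise ratios of 0 dB and -6 dB. Results: Compared with the control group, intelligibility was almost unchanged in the random-vibration group. In the sinusoidal-vibration group, speech intelligibility dropped markedly at 4 Hz with 0.3 G and at 6 Hz and 8 Hz with 0.5 G, and the differences were highly significant. The drop in intelligibility also became more severe as the signal-to-noise ratio of the listening environment decreased. Conclusion: The effect of sinusoidal vibration on a talker's speech production degrades communication, and the degradation is especially pronounced in adverse listening conditions.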

18.
Speech recognition in noise improves with combined acoustic and electric stimulation compared to electric stimulation alone [Kong et al., J. Acoust. Soc. Am. 117, 1351-1361 (2005)]. Here the contribution of fundamental frequency (F0) and low-frequency phonetic cues to speech recognition in combined hearing was investigated. Normal-hearing listeners heard vocoded speech in one ear and low-pass (LP) filtered speech in the other. Three listening conditions (vocode-alone, LP-alone, combined) were investigated. Target speech (average F0=120 Hz) was mixed with a time-reversed masker (average F0=172 Hz) at three signal-to-noise ratios (SNRs). LP speech aided performance at all SNRs. Low-frequency phonetic cues were then removed by replacing the LP speech with a LP equal-amplitude harmonic complex, frequency and amplitude modulated by the F0 and temporal envelope of voiced segments of the target. The combined hearing advantage disappeared at 10 and 15 dB SNR, but persisted at 5 dB SNR. A similar finding occurred when, additionally, F0 contour cues were removed. These results are consistent with a role for low-frequency phonetic cues, but not with a combination of F0 information between the two ears. The enhanced performance at 5 dB SNR with F0 contour cues absent suggests that voicing or glimpsing cues may be responsible for the combined hearing benefit.  相似文献   

19.
Ten American English vowels were sung in a /b/-vowel-/d/ consonantal context by a professional countertenor in full voice (at F0 = 130, 165, 220, 260, and 330 Hz) and in head voice (at F0 = 220, 260, 330, 440, and 520 Hz). Four identification tests were prepared using the entire syllable or the center 200-ms portion of either the full-voice tokens or the head-voice tokens. Listeners attempted to identify each vowel by circling the appropriate word on their answer sheets. Errors were more frequent when the vowels were sung at higher F0. In addition, removal of the consonantal context markedly increased identification errors for both the head-voice and full-voice conditions. Back vowels were misidentified significantly more often than front vowels. For equal F0 values, listeners were significantly more accurate in identifying the head-voice stimuli. Acoustical analysis suggests that the difference of intelligibility between head and full voice may have been due to the head voice having more energy in the first harmonic than the full voice.  相似文献   

20.
The principal resonance frequency in the driving-point impedance of the human body decreases with increasing vibration magnitude—a nonlinear response. An understanding of the nonlinearities may advance understanding of the mechanisms controlling body movement and improve anthropodynamic modelling of responses to vibration at various magnitudes. This study investigated the effects of vibration magnitude and voluntary periodic muscle activity on the apparent mass resonance frequency using vertical random vibration in the frequency range 0.5-20 Hz. Each of 14 subjects was exposed to 14 combinations of two vibration magnitudes (0.25 and 2.0 m s−2 root-mean square (rms)) in seven sitting conditions: two without voluntary periodic movement (A: upright; B: upper-body tensed), and five with voluntary periodic movement (C: back-abdomen bending; D: folding-stretching arms from back to front; E: stretching arms from rest to front; F: folding arms from elbow; G: deep breathing). Three conditions with voluntary periodic movement significantly reduced the difference in resonance frequency at the two vibration magnitudes compared with the difference in a static sitting condition. Without voluntary periodic movement (condition A: upright), the median apparent mass resonance frequency was 5.47 Hz at the low vibration magnitude and 4.39 Hz at the high vibration magnitude. With voluntary periodic movement (C: back-abdomen bending), the resonance frequency was 4.69 Hz at the low vibration magnitude and 4.59 Hz at the high vibration magnitude. It is concluded that back muscles, or other muscles or tissues in the upper body, influence biodynamic responses of the human body to vibration and that voluntary muscular activity or involuntary movement of these parts can alter their equivalent stiffness.  相似文献   
