Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
The effect of the filter bank on fundamental frequency (F0) discrimination was examined in four Nucleus CI24 cochlear implant subjects for synthetic stylized vowel-like stimuli. The four tested filter banks differed in cutoff frequencies, amount of overlap between filters, and shape of the filters. To assess the effects of temporal pitch cues on F0 discrimination, temporal fluctuations were removed above 10 Hz in one condition and above 200 Hz in another. Results indicate that F0 discrimination based upon place pitch cues is possible, but just-noticeable differences are 1 octave or more, depending on the filter bank used. Increasing the frequency resolution in the F0 range improves the F0 discrimination based upon place pitch cues. The results of F0 discrimination based upon place pitch agree with a model that compares the centroids of the electrical excitation pattern. The addition of temporal fluctuations up to 200 Hz significantly improves F0 discrimination. Just-noticeable differences using both place and temporal pitch cues range from 6% to 60%. Filter banks that do not resolve the higher harmonics provided the best temporal pitch cues, because temporal pitch cues are clearest when the fluctuation on all channels is at F0 and preferably in phase.

2.
Pitch, timbre, and/or timing cues may be used to stream and segregate competing musical melodies and instruments. In this study, melodic contour identification was measured in cochlear implant (CI) and normal-hearing (NH) listeners, with and without a competing masker; timing, pitch, and timbre cues were varied between the masker and target contour. NH performance was near-perfect across different conditions. CI performance was significantly poorer than that of NH listeners. While some CI subjects were able to use or combine timing, pitch and/or timbre cues, most were not, reflecting poor segregation due to poor spectral resolution.

3.
Harmonic complex tones comprising components in different spectral regions may differ considerably in timbre. While the pitch of "residue" tones of this type has been studied extensively, their timbral properties have received little attention. Discrimination of F0 for such tones is typically poorer than for complex tones with "corresponding" harmonics [A. Faulkner, J. Acoust. Soc. Am. 78, 1993-2004 (1985)]. The F0 DLs may be higher because timbre differences impair pitch discrimination. The present experiment explores effects of changes in spectral locus and F0 of harmonic complex tones on both pitch and timbre. Six normally hearing listeners indicated if the second tone of a two-tone sequence was: (1) same, (2) higher in pitch, (3) lower in pitch, (4) same in pitch but different in "something else," (5) higher in pitch and different in "something else," or (6) lower in pitch and different in "something else" than the first. ("Something else" is assumed to represent timbre.) The tones varied in spectral loci of four equal-amplitude harmonics m, m + 1, m + 2, and m + 3 (m = 1,2,3,4,5,6) and ranged in F0 from 200 to 200 ± 2n Hz (n = 0,1,2,4,8,16,32). Results show that changes in F0 primarily affect pitch, and changes in spectral locus primarily affect timbre. However, a change in spectral locus can also influence pitch. The direction of locus change was reported as the direction of pitch change, despite no change in F0 or changes in F0 in the opposite direction for ΔF0 ≤ 0-2%. This implies that listeners may be attending to the "spectral pitch" of components, or to changes in a timbral attribute like "sharpness," which are construed as changes in overall pitch in the absence of strong F0 cues. For ΔF0 ≥ 2%, the direction of reported pitch change accorded with the direction of F0 change, but the locus change continued to be reported as a timbre change. Rather than spectral-pitch matching of corresponding components, a context-dependent spectral evaluation process is thus implied in discernment of changes in pitch and timbre. Relative magnitudes of change in derived features of the spectrum such as harmonic number and F0, and absolute features such as spectral frequencies are compared. What is called "spectral pitch" contributes to the overall pitch, but also appears to be an important dimension of the multidimensional percept, timbre.
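
The stimulus construction described in this abstract (four equal-amplitude harmonics m to m + 3 of a given F0) can be illustrated with a small sketch. The function names below are hypothetical and chosen only for illustration; the abstract does not describe how the authors generated their tones.

```python
def harmonic_complex(f0_hz, lowest_harmonic, n_components=4):
    """Frequencies (Hz) of n consecutive harmonics of f0_hz,
    starting at harmonic number lowest_harmonic (m in the abstract)."""
    return [f0_hz * (lowest_harmonic + k) for k in range(n_components)]


def spectral_locus(frequencies_hz):
    """Mean component frequency, used here as a simple stand-in for
    'spectral locus' with equal-amplitude components (an assumption)."""
    return sum(frequencies_hz) / len(frequencies_hz)


# Same F0, different spectral locus: m = 1 versus m = 3 at F0 = 200 Hz.
low = harmonic_complex(200.0, 1)    # harmonics 1-4
high = harmonic_complex(200.0, 3)   # harmonics 3-6
print(low, spectral_locus(low))
print(high, spectral_locus(high))
```

Both tones share the same F0, so any reported pitch difference between them would have to come from the shift in spectral locus rather than from the harmonic spacing.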

4.
Recent simulations of continuous interleaved sampling (CIS) cochlear implant speech processors have used acoustic stimulation that provides only weak cues to pitch, periodicity, and aperiodicity, although these are regarded as important perceptual factors of speech. Four-channel vocoders simulating CIS processors have been constructed, in which the salience of speech-derived periodicity and pitch information was manipulated. The highest salience of pitch and periodicity was provided by an explicit encoding, using a pulse carrier following fundamental frequency for voiced speech, and a noise carrier during voiceless speech. Other processors included noise-excited vocoders with envelope cutoff frequencies of 32 and 400 Hz. The use of a pulse carrier following fundamental frequency gave substantially higher performance in identification of frequency glides than did vocoders using envelope-modulated noise carriers. The perception of consonant voicing information was improved by processors that preserved periodicity, and connected discourse tracking rates were slightly faster with noise carriers modulated by envelopes with a cutoff frequency of 400 Hz compared to 32 Hz. However, consonant and vowel identification, sentence intelligibility, and connected discourse tracking rates were generally similar through all of the processors. For these speech tasks, pitch and periodicity beyond the weak information available from 400 Hz envelope-modulated noise did not contribute substantially to performance.
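
To make the role of the envelope cutoff frequency concrete, the following is a minimal sketch of envelope extraction by full-wave rectification followed by low-pass smoothing. It assumes a simple one-pole low-pass filter, which is almost certainly cruder than the filters used in the study; the function name and parameters are hypothetical.

```python
import math


def extract_envelope(signal, sample_rate_hz, cutoff_hz):
    """Full-wave rectify, then smooth with a one-pole low-pass filter.

    The one-pole filter is an illustrative assumption, not the
    filter design used in the study.
    """
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate_hz)
    state = 0.0
    envelope = []
    for sample in signal:
        state += alpha * (abs(sample) - state)
        envelope.append(state)
    return envelope


# A 100 Hz sinusoid stands in for the periodicity of voiced speech.
fs = 16000
tone = [math.sin(2.0 * math.pi * 100.0 * n / fs) for n in range(fs)]
for cutoff in (32.0, 400.0):
    tail = extract_envelope(tone, fs, cutoff)[-1600:]
    print(cutoff, max(tail) - min(tail))  # residual periodic ripple
```

With the higher cutoff more of the periodic fluctuation survives in the envelope, which is the weak periodicity cue the abstract attributes to the 400 Hz condition.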

5.
This study investigated the integration of place- and temporal-pitch cues in pitch contour identification (PCI), in which cochlear implant (CI) users were asked to judge the overall pitch-change direction of stimuli. Falling and rising pitch contours were created either by continuously steering current between adjacent electrodes (place pitch), by continuously changing amplitude modulation (AM) frequency (temporal pitch), or both. The percentage of rising responses was recorded as a function of current steering or AM frequency change, with single or combined pitch cues. A significant correlation was found between subjects' sensitivity to current steering and AM frequency change. The integration of place- and temporal-pitch cues was most effective when the two cues were similarly discriminable in isolation. Adding the other (place or temporal) pitch cues shifted the temporal- or place-pitch psychometric functions horizontally without changing the slopes. PCI was significantly better with consistent place- and temporal-pitch cues than with inconsistent cues. PCI with single cues and integration of pitch cues were similar on different electrodes. The results suggest that CI users effectively integrate place- and temporal-pitch cues in relative pitch perception tasks. Current steering and AM frequency change should be coordinated to better transmit dynamic pitch information to CI users.

6.
The dependency of the timbre of musical sounds on their fundamental frequency (F0) was examined in three experiments. In experiment I subjects compared the timbres of stimuli produced by a set of 12 musical instruments with equal F0, duration, and loudness. There were three sessions, each at a different F0. In experiment II the same stimuli were rearranged in pairs, each with the same difference in F0, and subjects had to ignore the constant difference in pitch. In experiment III, instruments were paired both with and without an F0 difference within the same session, and subjects had to ignore the variable differences in pitch. Experiment I yielded dissimilarity matrices that were similar at different F0's, suggesting that instruments kept their relative positions within timbre space. Experiment II found that subjects were able to ignore the salient pitch difference while rating timbre dissimilarity. Dissimilarity matrices were symmetrical, suggesting further that the absolute displacement of the set of instruments within timbre space was small. Experiment III extended this result to the case where the pitch difference varied from trial to trial. Multidimensional scaling (MDS) of dissimilarity scores produced solutions (timbre spaces) that varied little across conditions and experiments. MDS solutions were used to test the validity of signal-based predictors of timbre, and in particular their stability as a function of F0. Taken together, the results suggest that timbre differences are perceived independently from differences of pitch, at least for F0 differences smaller than an octave. Timbre differences can be measured between stimuli with different F0's.

7.
Studies of pitch perception often involve measuring difference limens for complex tones (DLCs) that differ in fundamental frequency (F0). These measures are thought to reflect F0 discrimination and to provide an indirect measure of subjective pitch strength. However, in many situations discrimination may be based on cues other than the pitch or the F0, such as differences in the frequencies of individual components or timbre (brightness). Here, DLCs were measured for harmonic and inharmonic tones under various conditions, including a randomized or fixed lowest harmonic number, with and without feedback. The inharmonic tones were produced by shifting the frequencies of all harmonics upwards by 6.25%, 12.5%, or 25% of F0. It was hypothesized that, if DLCs reflect residue-pitch discrimination, these frequency-shifted tones, which produced a weaker and more ambiguous pitch than the harmonic tones, would yield larger DLCs than the harmonic tones. However, if DLCs reflect comparisons of component pitches, or timbre, they should not be systematically influenced by frequency shifting. The results showed larger DLCs and more scattered pitch matches for inharmonic than for harmonic complexes, confirming that the inharmonic tones produced a less consistent pitch than the harmonic tones, and consistent with the idea that DLCs reflect F0 pitch discrimination.
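
The frequency-shifting manipulation described above can be written out as a short sketch: every component of a harmonic complex is moved upward by the same fraction of F0, so the component spacing is preserved while the components stop being integer multiples of F0. The function name and the chosen harmonic numbers are illustrative assumptions.

```python
def shifted_components(f0_hz, harmonic_numbers, shift_fraction):
    """Component frequencies (Hz) after shifting every harmonic upward
    by shift_fraction * f0_hz (e.g. 0.0625, 0.125 or 0.25)."""
    shift_hz = shift_fraction * f0_hz
    return [k * f0_hz + shift_hz for k in harmonic_numbers]


harmonics = [3, 4, 5, 6]  # illustrative choice, not taken from the study
for fraction in (0.0, 0.0625, 0.125, 0.25):
    freqs = shifted_components(200.0, harmonics, fraction)
    spacing = [b - a for a, b in zip(freqs, freqs[1:])]
    print(fraction, freqs, spacing)
```

The spacing between adjacent components stays at F0 for every shift, which is why such tones keep an envelope repetition rate of F0 while their pitch becomes weaker and more ambiguous.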

8.
The ability of absolute-pitch (AP) musicians to identify or produce virtual pitch from harmonic structures without feedback or an external acoustic referent was examined in three experiments. Stimuli consisted of pure tones, missing-fundamental harmonic complexes, or piano notes highpass filtered to remove their fundamental frequency and lower harmonics. Results of Experiment I showed that relative to control (non-AP) musicians, AP subjects easily (>90%) identified pitch of harmonic complexes in a 12-alternative forced-choice task. Increasing harmonic order (i.e., lowest harmonic number in the complex), however, resulted in a monotonic decline in performance. Results suggest that AP musicians use two pitch cues from harmonic structures: 1) spectral spacing between harmonic components, and 2) octave-related cues to note identification in individually resolved harmonics. Results of Experiment II showed that highpass filtered piano notes are identified by AP subjects at better than 75% accuracy even when the note’s energy is confined to the 4th and higher harmonics. Identification of highpass piano notes also appears to be better than that expected from pure or complex tones, possibly due to contributions from familiar timbre cues to note identity. Results of Experiment III showed that AP subjects can adjust the spectral spacing between harmonics of a missing-fundamental complex to accurately match the expected spacing from a target musical note. Implications of these findings for mechanisms of AP encoding are discussed. The text was submitted by the authors in English.

9.
Standard continuous interleaved sampling processing, and a modified processing strategy designed to enhance temporal cues to voice pitch, were compared on tests of intonation perception, and vowel perception, both in implant users and in acoustic simulations. In standard processing, 400 Hz low-pass envelopes modulated either pulse trains (implant users) or noise carriers (simulations). In the modified strategy, slow-rate envelope modulations, which convey dynamic spectral variation crucial for speech understanding, were extracted by low-pass filtering (32 Hz). In addition, during voiced speech, higher-rate temporal modulation in each channel was provided by 100% amplitude-modulation by a sawtooth-like wave form whose periodicity followed the fundamental frequency (F0) of the input. Channel levels were determined by the product of the lower- and higher-rate modulation components. Both in acoustic simulations and in implant users, the ability to use intonation information to identify sentences as question or statement was significantly better with modified processing. However, while there was no difference in vowel recognition in the acoustic simulation, implant users performed worse with modified processing both in vowel recognition and in formant frequency discrimination. It appears that, while enhancing pitch perception, modified processing harmed the transmission of spectral information.
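
The modified strategy's rule that channel level is the product of a slow envelope and an F0-rate modulator can be sketched as follows. The exact sawtooth-like shape is not given in the abstract, so a falling linear ramp is assumed here, and the handling of voiceless speech (modulator fixed at 1) is likewise an assumption. All names are hypothetical.

```python
def sawtooth_modulator(f0_hz, t_s):
    """Falling ramp from 1 to 0 once per F0 period (100% modulation depth).
    The ramp shape is an assumption; the abstract only says 'sawtooth-like'."""
    phase = (f0_hz * t_s) % 1.0
    return 1.0 - phase


def channel_level(slow_envelope, f0_hz, t_s, voiced):
    """Product of the slow (32 Hz low-pass) envelope value and the
    F0-rate modulator; unmodulated when voiceless (assumed)."""
    modulator = sawtooth_modulator(f0_hz, t_s) if voiced else 1.0
    return slow_envelope * modulator


# One 10 ms period at F0 = 100 Hz with a constant slow envelope of 0.8.
for ms in range(0, 10, 2):
    print(ms, round(channel_level(0.8, 100.0, ms / 1000.0, True), 3))
```

Because the modulator multiplies every channel in the same way, the slow spectral shape is preserved on average while each channel carries a strong periodicity at F0.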

10.
The abilities to hear changes in pitch for sung vowels and understand speech using an experimental sound coding strategy (eTone) that enhanced coding of temporal fundamental frequency (F0) information were tested in six cochlear implant users, and compared with performance using their clinical (ACE) strategy. In addition, rate- and modulation rate-pitch difference limens (DLs) were measured using synthetic stimuli with F0s below 300 Hz to determine psychophysical abilities of each subject and to provide experience in attending to rate cues for the judgment of pitch. Sung-vowel pitch ranking tests for stimuli separated by three semitones presented across an F0 range of one octave (139-277 Hz) showed a significant benefit for the experimental strategy compared to ACE. Average d-prime (d') values for eTone (d' = 1.05) were approximately three times larger than for ACE (d' = 0.35). Similar scores for both strategies in the speech recognition tests showed that coding of segmental speech information by the experimental strategy was not degraded. Average F0 DLs were consistent with results from previous studies and for all subjects were less than or equal to approximately three semitones for F0s of 125 and 200 Hz.
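
Since this abstract expresses F0 separations in semitones, a short sketch of the standard equal-tempered conversion may help: an interval in semitones is 12 times the base-2 logarithm of the frequency ratio. The helper names are hypothetical.

```python
import math


def semitones_between(f_low_hz, f_high_hz):
    """Interval size in equal-tempered semitones: 12 * log2(f_high / f_low)."""
    return 12.0 * math.log2(f_high_hz / f_low_hz)


def shift_by_semitones(f_hz, semitones):
    """Frequency reached by moving f_hz up (or down) by a number of semitones."""
    return f_hz * 2.0 ** (semitones / 12.0)


# The 139-277 Hz range quoted in the abstract is close to one octave.
print(round(semitones_between(139.0, 277.0), 2))
# A three-semitone step upward from 139 Hz.
print(round(shift_by_semitones(139.0, 3.0), 1))
```

A three-semitone step corresponds to a frequency ratio of about 1.19, so the pitch-ranking pairs in this study differed in F0 by roughly 19%.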

11.
Tone languages differ from English in that the pitch pattern of a single-syllable word conveys lexical meaning. In the present study, dependence of tonal-speech perception on features of the stimulation was examined using an acoustic simulation of a CIS-type speech-processing strategy for cochlear prostheses. Contributions of spectral features of the speech signals were assessed by varying the number of filter bands, while contributions of temporal envelope features were assessed by varying the low-pass cutoff frequency used for extracting the amplitude envelopes. Ten normal-hearing native Mandarin Chinese speakers were tested. When the low-pass cutoff frequency was fixed at 512 Hz, consonant, vowel, and sentence recognition improved as a function of the number of channels and reached a plateau at 4 to 6 channels. Subjective judgments of sound quality continued to improve as the number of channels increased to 12, the highest number tested. Tone recognition, i.e., recognition of the four Mandarin tone patterns, depended on both the number of channels and the low-pass cutoff frequency. The trade-off between the temporal and spectral cues for tone recognition indicates that temporal cues can compensate for diminished spectral cues for tone recognition and vice versa. An additional tone recognition experiment using syllables of equal duration showed a marked decrease in performance, indicating that duration cues contribute to tone recognition. A third experiment showed that recognition of processed FM patterns that mimic Mandarin tone patterns was poor when temporal envelope and duration cues were removed.

12.
Sequences of rapidly occurring sounds that differ from each other are often perceptually segregated into "streams" within which the range of differences is smaller [Bregman and Campbell, J. Exp. Psychol. 89, 244-249 (1971)]. Early research on streaming implied it to be pitch dominated, but Wessel [Comput. Music J. 3, 45-52 (1979)] demonstrated that timbre differences could also bring about segregation. In the present study, pitch and timbre attributes were put in competition in four-tone sequences of the form: T2P1-TmP1-T2Pn-TmPn, with the first pair assigned pitch P1 but different timbres T2 and Tm, and the second pair pitch Pn, and similarly contrasted timbres. Six listeners were asked to indicate whether perceived grouping of 49 such sequences was based on pitch proximity, timbre similarity, or ambiguous percepts not dominated by either cue. Results confirm that timbre can segregate sequences and imply that timbre and pitch compete in perceptually organizing complex sequences. Because timbre differences were provided by varying the locus of four equal-amplitude harmonics, and pitch differences were provided by varying their relative spacing, it is suggested that the tradeoffs observed may actually arise due to differences in perceived salience of "spectral pitch" and "virtual pitch" [Terhardt, J. Acoust. Soc. Am. 55, 1061-1069 (1974)] dependent on relative changes in spectral locus and spectral spacing over time.

13.
Two experiments evaluated discrimination of simulated single-formant frequency transitions. In the first experiment, listeners received practice with trial-by-trial feedback in discriminating either rising or falling frequency transitions of three different durations (30, 60, and 120 ms). Transitions either occurred in isolation or were followed by a steady-state sound matched in frequency to the transition end point. Some improvement in discrimination over practice runs occurred for the shortest transitions. Whether performance was evaluated at the beginning or end of practice, there were no differences attributable to transition direction or to whether transitions were followed by steady-state sound. Discrimination, however, was significantly better for the longest transitions. Just noticeable differences (jnd's) for the longest transitions, measured in Hz at transition onsets, were of approximately the same magnitude as jnd's for steady-state sounds that were equal in frequency to the midpoints of the transitions. Subjects of the second experiment discriminated the longer rising and falling transitions, but did not receive extensive practice. Results of experiment 2 replicated results of experiment 1 in showing similar jnd's. Experiment 2 also showed no differences attributable to transition direction or to the presence of the steady-state sound following transitions.

14.
Amplitude modulations of pulsatile stimulation can be used to convey pitch information to cochlear implant users. One variable in designing cochlear implant speech processors is the choice of modulation waveform used to convey pitch information. Modulation frequency discrimination thresholds were measured for 100 Hz modulations with four waveforms (sine, sawtooth, a sharpened sawtooth, and square). Just-noticeable differences (JNDs) were similar for all but the square waveform, which often produced larger JNDs. The results suggest that sine, sawtooth, and sharpened sawtooth waveforms are likely to provide similar pitch discrimination within a speech processing strategy.

15.
Chinese sentence recognition strongly relates to the reception of tonal information. For cochlear implant (CI) users with residual acoustic hearing, tonal information may be enhanced by restoring low-frequency acoustic cues in the nonimplanted ear. The present study investigated the contribution of low-frequency acoustic information to Chinese speech recognition in Mandarin-speaking normal-hearing subjects listening to acoustic simulations of bilaterally combined electric and acoustic hearing. Subjects listened to a 6-channel CI simulation in one ear and low-pass filtered speech in the other ear. Chinese tone, phoneme, and sentence recognition were measured in steady-state, speech-shaped noise, as a function of the cutoff frequency for low-pass filtered speech. Results showed that low-frequency acoustic information below 500 Hz contributed most strongly to tone recognition, while low-frequency acoustic information above 500 Hz contributed most strongly to phoneme recognition. For Chinese sentences, speech reception thresholds (SRTs) improved with increasing amounts of low-frequency acoustic information, and significantly improved when low-frequency acoustic information above 500 Hz was preserved. SRTs were not significantly affected by the degree of spectral overlap between the CI simulation and low-pass filtered speech. These results suggest that, for CI patients with residual acoustic hearing, preserving low-frequency acoustic information can improve Chinese speech recognition in noise.

16.
The purpose of this study was to determine whether individuals show differences in speech and voice during reading of the same news before and after attending a radio announcing course. Twenty-five students of a Radio Announcing Course in Sao Paulo city, 17 men and 8 women, aged 19 to 55 years, participated in this study. The readings were recorded in a professional audio studio, and the speech samples were submitted to perceptual and acoustic analysis. For the perceptual analysis, the samples were randomly presented in pairs and five trained speech pathologists identified each recording as pre- or posttraining, and also justified their choices by indicating which parameters best supported their judgment: type of voice, articulation and pronunciation, loudness, pitch, resonance, speech rate, respiratory coordination, and use of emphasis. The acoustic parameters analyzed were mean, minimum, and maximum fundamental frequency, frequency range, text duration, and pause duration. The perceptual analysis showed that the posttraining speech samples were considered the best productions in 80% of the evaluations. Emphasis characterized the readings (70.4%), followed by type of voice (44.8%) and pitch (40.8%). Acoustic analysis showed higher mean fundamental frequency and increase of frequency range posttraining. These results indicated richer modulation in the posttraining readings. There are differences in the readings of the same news pre- and posttraining in a radio announcing course, and the posttraining reading was considered the best production, indicating the positive effect of the training.

17.
This study investigated age-related differences in sensitivity to temporal cues in modified natural speech sounds. Listeners included young noise-masked subjects, elderly normal-hearing subjects, and elderly hearing-impaired subjects. Four speech continua were presented to listeners, with stimuli from each continuum varying in a single temporal dimension. The acoustic cues varied in separate continua were voice-onset time, vowel duration, silence duration, and transition duration. In separate conditions, the listeners identified the word stimuli, discriminated two stimuli in a same-different paradigm, and discriminated two stimuli in a 3-interval, 2-alternative forced-choice procedure. Results showed age-related differences in the identification function crossover points for the continua that varied in silence duration and transition duration. All listeners demonstrated shorter difference limens (DLs) for the three-interval paradigm than the two-interval paradigm, with older hearing-impaired listeners showing larger DLs than the other listener groups for the silence duration cue. The findings support the general hypothesis that aging can influence the processing of specific temporal cues that are related to consonant manner distinctions.

18.
Two experiments investigated the effects of critical bandwidth and frequency region on the use of temporal envelope cues for speech. In both experiments, spectral details were reduced using vocoder processing. In experiment 1, consonant identification scores were measured in a condition for which the cutoff frequency of the envelope extractor was half the critical bandwidth (HCB) of the auditory filters centered on each analysis band. Results showed that performance was similar to that obtained in conditions for which the envelope cutoff was set to 160 Hz or above. Experiment 2 evaluated the impact of setting the cutoff frequency of the envelope extractor to values of 4, 8, and 16 Hz or to HCB in one or two contiguous bands for an eight-band vocoder. The cutoff was set to 16 Hz for all the other bands. Overall, consonant identification was not affected by removing envelope fluctuations above 4 Hz in the low- and high-frequency bands. In contrast, speech intelligibility decreased as the cutoff frequency was decreased in the midfrequency region from 16 to 4 Hz. The behavioral results were fairly consistent with a physical analysis of the stimuli, suggesting that clearly measurable envelope fluctuations cannot be attenuated without affecting speech intelligibility.

19.
The dependency of the brightness dimension of timbre on fundamental frequency (F0) was examined experimentally. Subjects compared the timbres of 24 synthetic stimuli, produced by the combination of six values of spectral centroid to obtain different values of expected brightness, and four F0's, ranging over 18 semitones. Subjects were instructed to ignore pitch differences. Dissimilarity scores were analyzed by both ANOVA and multidimensional scaling (MDS). Results show that timbres can be compared between stimuli with different F0's over the range tested, and that differences in F0 affect timbre dissimilarity in two ways. First, dissimilarity scores reveal a term proportional to F0 difference that shows up in the MDS solution as a dimension correlated with F0 and orthogonal to other timbre dimensions. Second, F0 affects systematically the timbre dimension (brightness) correlated with spectral centroid. Interestingly, both terms covaried with differences in F0 rather than chroma or consonance. The first term probably corresponds to pitch. The second can be eliminated if the formula for spectral centroid is modified by introducing a corrective factor dependent on F0.
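
For readers unfamiliar with the spectral centroid, the following sketch shows the usual amplitude-weighted mean frequency, which is a common definition and may differ in detail from the one used in the study. The F0-dependent corrective factor mentioned in the abstract is not specified there and is deliberately not modelled. The function name is hypothetical.

```python
def spectral_centroid(frequencies_hz, amplitudes):
    """Amplitude-weighted mean frequency of a set of spectral components."""
    total = sum(amplitudes)
    if total == 0:
        raise ValueError("amplitudes must not sum to zero")
    return sum(f * a for f, a in zip(frequencies_hz, amplitudes)) / total


# Same relative spectrum (harmonics 1-4, falling amplitudes) at two F0s.
amplitudes = [1.0, 0.5, 0.25, 0.125]
for f0 in (200.0, 400.0):
    harmonics = [f0 * k for k in range(1, 5)]
    print(f0, spectral_centroid(harmonics, amplitudes))
```

With a fixed pattern of harmonic amplitudes the centroid in Hz scales with F0, which illustrates why a brightness predictor based on the raw centroid can interact with F0.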

20.
Speech recognition in noise improves with combined acoustic and electric stimulation compared to electric stimulation alone [Kong et al., J. Acoust. Soc. Am. 117, 1351-1361 (2005)]. Here the contribution of fundamental frequency (F0) and low-frequency phonetic cues to speech recognition in combined hearing was investigated. Normal-hearing listeners heard vocoded speech in one ear and low-pass (LP) filtered speech in the other. Three listening conditions (vocode-alone, LP-alone, combined) were investigated. Target speech (average F0=120 Hz) was mixed with a time-reversed masker (average F0=172 Hz) at three signal-to-noise ratios (SNRs). LP speech aided performance at all SNRs. Low-frequency phonetic cues were then removed by replacing the LP speech with a LP equal-amplitude harmonic complex, frequency and amplitude modulated by the F0 and temporal envelope of voiced segments of the target. The combined hearing advantage disappeared at 10 and 15 dB SNR, but persisted at 5 dB SNR. A similar finding occurred when, additionally, F0 contour cues were removed. These results are consistent with a role for low-frequency phonetic cues, but not with a combination of F0 information between the two ears. The enhanced performance at 5 dB SNR with F0 contour cues absent suggests that voicing or glimpsing cues may be responsible for the combined hearing benefit.
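
Mixing a target with a masker at a stated SNR, as in this study, amounts to scaling the masker so that the ratio of RMS levels matches the desired value in decibels. The sketch below assumes an RMS-based SNR over the whole signal, which the abstract does not confirm; the function names and the toy sample values are hypothetical.

```python
import math


def rms(samples):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def masker_gain_for_snr(target, masker, snr_db):
    """Gain to apply to the masker so that
    20 * log10(rms(target) / rms(gain * masker)) equals snr_db."""
    return rms(target) / (rms(masker) * 10.0 ** (snr_db / 20.0))


def mix_at_snr(target, masker, snr_db):
    """Sample-by-sample sum of the target and the rescaled masker."""
    gain = masker_gain_for_snr(target, masker, snr_db)
    return [t + gain * m for t, m in zip(target, masker)]


target = [1.0, -1.0, 1.0, -1.0]   # toy stand-in for target speech
masker = [2.0, 2.0, -2.0, -2.0]   # toy stand-in for the masker
for snr in (5.0, 10.0, 15.0):     # the three SNRs named in the abstract
    print(snr, round(masker_gain_for_snr(target, masker, snr), 4))
```

Scaling the masker rather than the target keeps the target level constant across SNR conditions, which is one common design choice among several.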
