期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Pure-tone auditory stream segregation and speech perception in noise in cochlear implant recipients

Hong RS Turner CW 《The Journal of the Acoustical Society of America》2006,120(1):360-374

This study examined the ability of cochlear implant users and normal-hearing subjects to perform auditory stream segregation of pure tones. An adaptive, rhythmic discrimination task was used to assess stream segregation as a function of frequency separation of the tones. The results for normal-hearing subjects were consistent with previously published observations (L.P.A.S van Noorden, Ph.D. dissertation, Eindhoven University of Technology, Eindhoven, The Netherlands 1975), suggesting that auditory stream segregation increases with increasing frequency separation. For cochlear implant users, there appeared to be a range of pure-tone streaming abilities, with some subjects demonstrating streaming comparable to that of normal-hearing individuals, and others possessing much poorer streaming abilities. The variability in pure-tone streaming of cochlear implant users was correlated with speech perception in both steady-state noise and multi-talker babble. Moderate, statistically significant correlations between streaming and both measures of speech perception in noise were observed, with better stream segregation associated with better understanding of speech in noise. These results suggest that auditory stream segregation is a contributing factor in the ability to understand speech in background noise. The inability of some cochlear implant users to perform stream segregation may therefore contribute to their difficulties in noise backgrounds. 相似文献

2.

Sensorimotor adaptation to feedback perturbations of vowel acoustics and its relation to perception

Villacorta VM Perkell JS Guenther FH 《The Journal of the Acoustical Society of America》2007,122(4):2306-2319

The role of auditory feedback in speech motor control was explored in three related experiments. Experiment 1 investigated auditory sensorimotor adaptation: the process by which speakers alter their speech production to compensate for perturbations of auditory feedback. When the first formant frequency (F1) was shifted in the feedback heard by subjects as they produced vowels in consonant-vowel-consonant (CVC) words, the subjects' vowels demonstrated compensatory formant shifts that were maintained when auditory feedback was subsequently masked by noise-evidence of adaptation. Experiment 2 investigated auditory discrimination of synthetic vowel stimuli differing in F1 frequency, using the same subjects. Those with more acute F1 discrimination had compensated more to F1 perturbation. Experiment 3 consisted of simulations with the directions into velocities of articulators model of speech motor planning, which showed that the model can account for key aspects of compensation. In the model, movement goals for vowels are regions in auditory space; perturbation of auditory feedback invokes auditory feedback control mechanisms that correct for the perturbation, which in turn causes updating of feedforward commands to incorporate these corrections. The relation between speaker acuity and amount of compensation to auditory perturbation is mediated by the size of speakers' auditory goal regions, with more acute speakers having smaller goal regions. 相似文献

3.

Sequential streaming due to manipulation of interaural time differences

Stainsby TH Fullgrabe C Flanagan HJ Waldman SK Moore BC 《The Journal of the Acoustical Society of America》2011,130(2):904-914

The effect of apparent spatial location on sequential streaming was investigated by manipulating interaural time differences (ITDs). The degree of obligatory stream segregation was inferred indirectly from the threshold for detecting a rhythmic irregularity in an otherwise isochronous sequence of interleaved "A" and "B" tones. Stimuli were bandpass-filtered harmonic complexes with a 100-Hz fundamental. The A and B tones had equal but opposite ITDs of 0, 0.25, 0.5, 1, or 2 ms and had the same or different passbands. The passband ranges were 1250-2500 Hz and 1768-3536 Hz in experiment 1, and 353-707 Hz and 500-1000 Hz in experiment 2. In both experiments, increases in ITD led to increases in threshold, mainly when the passbands of A and B were the same. The effects were largest for ITDs above 0.5 ms, for which rhythmic irregularities in the timing of the A or B tones alone may have disrupted performance. It is concluded that the differences in apparent spatial location produced by ITD have only weak effects on obligatory streaming. 相似文献

4.

Adaptive auditory feedback control of the production of formant trajectories in the Mandarin triphthong /iau/ and its pattern of generalization

Cai S Ghosh SS Guenther FH Perkell JS 《The Journal of the Acoustical Society of America》2010,128(4):2033-2048

In order to test whether auditory feedback is involved in the planning of complex articulatory gestures in time-varying phonemes, the current study examined native Mandarin speakers' responses to auditory perturbations of their auditory feedback of the trajectory of the first formant frequency during their production of the triphthong /iau/. On average, subjects adaptively adjusted their productions to partially compensate for the perturbations in auditory feedback. This result indicates that auditory feedback control of speech movements is not restricted to quasi-static gestures in monophthongs as found in previous studies, but also extends to time-varying gestures. To probe the internal structure of the mechanisms of auditory-motor transformations, the pattern of generalization of the adaptation learned on the triphthong /iau/ to other vowels with different temporal and spatial characteristics (produced only under masking noise) was tested. A broad but weak pattern of generalization was observed; the strength of the generalization diminished with increasing dissimilarity from /iau/. The details and implications of the pattern of generalization are examined and discussed in light of previous sensorimotor adaptation studies of both speech and limb motor control and a neurocomputational model of speech motor control. 相似文献

5.

Effects of frequency modulated tones and vowel formants on perioral muscle activity during isometric lip rounding

Shimon Sapir Elizabeth DeRosier Andrea M. Simonson Amy Wohlert 《Journal of voice》1990,4(2)

Two studies were conducted to assess the sensitivity of perioral muscles to vowel-like auditory stimuli. In one study, normal young adults produced an isometric lip rounding gesture while listening to a frequency modulated tone (FMT). The fundamental of the FMT was modulated over time in a sinusoidal fashion near the frequency ranges of the first and second formants of the vowels /u/ and /i/ (rate of modulation = 4.5 or 7 Hz). In another study, normal young adults produced an isometric lip rounding gesture while listening to synthesized vowels whose formant frequencies were modulated over time in a sinusoidal fashion to simulate repetitive changes from the vowel /u/ to /i/ (rate of modulation = 2 or 4 Hz). The FMTs and synthesized vowels were presented binaurally via headphones at 75 and 60 dB SL, respectively. Muscle activity from the orbicularis oris superior and inferior and from lip retractors was recorded with surface electromyography (EMG). Signal averaging and spectral analysis of the rectified and smoothed EMG failed to show perioral muscle responses to the auditory stimuli. Implications for auditory feedback theories of speech control are discussed. 相似文献

6.

Binding and unbinding the auditory and visual streams in the McGurk effect

O Nahorna F Berthommier JL Schwartz 《The Journal of the Acoustical Society of America》2012,132(2):1061-1077

Subjects presented with coherent auditory and visual streams generally fuse them into a single percept. This results in enhanced intelligibility in noise, or in visual modification of the auditory percept in the McGurk effect. It is classically considered that processing is done independently in the auditory and visual systems before interaction occurs at a certain representational stage, resulting in an integrated percept. However, some behavioral and neurophysiological data suggest the existence of a two-stage process. A first stage would involve binding together the appropriate pieces of audio and video information before fusion per se in a second stage. Then it should be possible to design experiments leading to unbinding. It is shown here that if a given McGurk stimulus is preceded by an incoherent audiovisual context, the amount of McGurk effect is largely reduced. Various kinds of incoherent contexts (acoustic syllables dubbed on video sentences or phonetic or temporal modifications of the acoustic content of a regular sequence of audiovisual syllables) can significantly reduce the McGurk effect even when they are short (less than 4?s). The data are interpreted in the framework of a two-stage "binding and fusion" model for audiovisual speech perception. 相似文献

7.

Auditory stream segregation on the basis of amplitude-modulation rate

Grimault N Bacon SP Micheyl C 《The Journal of the Acoustical Society of America》2002,111(3):1340-1348

In this study, auditory stream segregation based on differences in the rate of envelope fluctuations--in the absence of spectral and temporal fine structure cues--was tested. The temporal sequences to segregate were composed of fully amplitude-modulated (AM) bursts of broadband noises A and B. All sequences were built by the reiteration of a ABA triplet where A modulation rate was fixed at 100 Hz and B modulation rate was variable. The first experiment was devoted to measuring the threshold difference in AM rate leading subjects to perceive the sequence as two streams as opposed to just one. The results of this first experiment revealed that subjects generally perceived the sequences as a single perceptual stream when the difference in AM rate between the A and B noises was smaller than 0.75 oct, and as two streams when the difference was larger than about 1.00 oct. These streaming thresholds were found to be substantially larger than, and not related to, the subjects' modulation-rate discrimination thresholds. The results of a second experiment demonstrated that AM-rate-based streaming was adversely affected by decreases in AM depth, but that segregation remained possible as long as the AM of either the A or B noises was above the subject's AM-detection threshold. The results of a third experiment indicated that AM-rate-based streaming effects were still observed when the modulations applied to the A and B noises were set individually, either at a constant level in dB above AM-detection threshold, or at levels at which they were of the same perceived strength. This finding suggests that AM-rate-based streaming is not necessarily mediated by perceived differences in AM depth. Altogether, the results of this study indicate that sequential sounds can be segregated on the sole basis of differences in the rate of their temporal fluctuations in the absence of other temporal or spectral cues. 相似文献

8.

Comparing perceived auditory width to the visual image of a performing ensemble in contrasting bi-modal environments

Valente DL Braasch J Myrbeck SA 《The Journal of the Acoustical Society of America》2012,131(1):205-217

Despite many studies investigating auditory spatial impressions in rooms, few have addressed the impact of simultaneous visual cues on localization and the perception of spaciousness. The current research presents an immersive audiovisual environment in which participants were instructed to make auditory width judgments in dynamic bi-modal settings. The results of these psychophysical tests suggest the importance of congruent audio visual presentation to the ecological interpretation of an auditory scene. Supporting data were accumulated in five rooms of ascending volumes and varying reverberation times. Participants were given an audiovisual matching test in which they were instructed to pan the auditory width of a performing ensemble to a varying set of audio and visual cues in rooms. Results show that both auditory and visual factors affect the collected responses and that the two sensory modalities coincide in distinct interactions. The greatest differences between the panned audio stimuli given a fixed visual width were found in the physical space with the largest volume and the greatest source distance. These results suggest, in this specific instance, a predominance of auditory cues in the spatial analysis of the bi-modal scene. 相似文献

9.

Effects of inducer continuity on auditory stream segregation: comparison of physical and perceived continuity in different contexts

Haywood NR Roberts B 《The Journal of the Acoustical Society of America》2011,130(5):2917-2927

The factors influencing the stream segregation of discrete tones and the perceived continuity of discrete tones as continuing through an interrupting masker are well understood as separate phenomena. Two experiments tested whether perceived continuity can influence the build-up of stream segregation by manipulating the perception of continuity during an induction sequence and measuring streaming in a subsequent test sequence comprising three triplets of low and high frequency tones (LHL-[ellipsis (horizontal)]). For experiment 1, a 1.2-s standard induction sequence comprising six 100-ms L-tones strongly promoted segregation, whereas a single extended L-inducer (1.1 s plus 100-ms silence) did not. Segregation was similar to that following the single extended inducer when perceived continuity was evoked by inserting noise bursts between the individual tones. Reported segregation increased when the noise level was reduced such that perceived continuity no longer occurred. Experiment 2 presented a 1.3-s continuous inducer created by bridging the 100-ms silence between an extended L-inducer and the first test-sequence tone. This configuration strongly promoted segregation. Segregation was also increased by filling the silence after the extended inducer with noise, such that it was perceived like a bridging inducer. Like physical continuity, perceived continuity can promote or reduce test-sequence streaming, depending on stimulus context. 相似文献

10.

Spectral enhancement of Polish vowels to improve their identification by hearing impaired listeners

E Ozimek A S?k A Wicher E Skrodzka J Konieczny 《Applied Acoustics》2004,65(5):473-483

Abnormalities in the cochlear function usually cause broadening of the auditory filters which reduces the speech intelligibility. An attempt to apply a spectral enhancement algorithm has been undertaken to improve the identification of Polish vowels by subjects with cochlear-based hearing-impairment. The identification scores of natural (unprocessed) vowels and spectrally enhanced (processed) vowels has been measured for hearing-impaired subjects. It has been found that spectral enhancement improves vowel scores by about 10% for those subjects, however, a wide variation in individual performance among subjects has been observed. The overall vowels identification scores obtained were 85% for natural vowels and 96% for spectrally enhanced vowels. 相似文献

11.

Build-up of the tendency to segregate auditory streams: resetting effects evoked by a single deviant tone

Haywood NR Roberts B 《The Journal of the Acoustical Society of America》2010,128(5):3019-3031

The tendency to hear a sequence of alternating low (L) and high (H) frequency tones as two streams can be increased by a preceding induction sequence, even one composed only of same-frequency tones. Four experiments used such an induction sequence (10 identical L tones) to promote segregation in a shorter test sequence comprising L and H tones. Previous studies have shown that the build-up of stream segregation is usually reduced greatly when a sudden change in acoustic properties distinguishes all of the induction tones from their test-sequence counterparts. Experiment 1 showed that a single deviant tone, created by altering the final inducer (in frequency, level, duration, or replacement with silence) reduced reported segregation, often substantially. Experiment 2 partially replicated this finding, using changes in temporal discrimination as a measure of streaming. Experiments 3 and 4 varied the size of a frequency change applied to the deviant tone; the extent of resetting varied with size only gradually. The results suggest that resetting begins to occur once the change is large enough to be noticeable. Since the prior inducers always remained unaltered in the deviant-tone conditions, it is proposed that a single change actively resets the build-up evoked by the induction sequence. 相似文献

12.

Bandwidth of spectral resolution for two-formant synthetic vowels and two-tone complex signals

Xu Q Jacewicz E Feth LL Krishnamurthy AK 《The Journal of the Acoustical Society of America》2004,115(4):1653-1664

Spectral integration refers to the summation of activity beyond the bandwidth of the peripheral auditory filter. Several experimental lines have sought to determine the bandwidth of this "supracritical" band phenomenon. This paper reports on two experiments which tested the limit on spectral integration in the same listeners. Experiment I verified the critical separation of 3.5 bark in two-formant synthetic vowels as advocated by the center-of-gravity (COG) hypothesis. According to the COG effect, two formants are integrated into a single perceived peak if their separation does not exceed approximately 3.5 bark. With several modifications to the methods of a classic COG matching task, the present listeners responded to changes in pitch in two-formant synthetic vowels, not estimating their phonetic quality. By changing the amplitude ratio of the formants, the frequency of the perceived peak was closer to that of the stronger formant. This COG effect disappeared with larger formant separation. In a second experiment, auditory spectral resolution bandwidths were measured for the same listeners using common-envelope, two-tone complex signals. Results showed that the limits of spectral averaging in two-formant vowels and two-tone spectral resolution bandwidth were related for two of the three listeners. The third failed to perform the discrimination task. For the two subjects who completed both tasks, the results suggest that the critical region in vowel task and the complex-tone discriminability estimates are linked to a common mechanism, i.e., to an auditory spectral resolving power. A signal-processing model is proposed to predict the COG effect in two-formant synthetic vowels. The model introduces two modifications to Hermansky's [J. Acoust. Soc. Am. 87, 1738-1752 (1990)] perceptual linear predictive (PLP) model. The model predictions are generally compatible with the present experimental results and with the predictions of several earlier models accounting for the COG effect. 相似文献

13.

Articulatory correlates of stress and speaking rate in Swedish VCV utterances

O Engstrand 《The Journal of the Acoustical Society of America》1988,83(5):1863-1875

Articulatory activity underlying changes in stress and speaking rate was studied by means of x-ray cinefilm and acoustic speech records. Two Swedish subjects produced vowel-consonant-vowel (VCV) utterances under controlled rate-stress conditions. The vowels were tense (i a u), and the consonants were the voiceless stops, notably (p). The spectral characteristics of the vowels were not significantly influenced by changes in the speaking rate. They were, however, significantly emphasized under stress. At the articulatory level, stressed vowels displayed narrower oral tract constrictions than unstressed vowels at the two speaking rates studied. At the faster speaking rate, vowel- and consonant-related gestures were coproduced to a greater extent than at the slower rate. The data, failing to produce evidence for an "undershoot" mechanism, support the view that dialect-specific correlates of stress are actively safeguarded by means of articulatory reorganization. 相似文献

14.

Effect of different types of auditory stimulation on vowel formant frequencies in multichannel cochlear implant users 总被引：2，自引：0，他引：2

M A Svirsky E A Tobey 《The Journal of the Acoustical Society of America》1991,89(6):2895-2904

Two experiments investigating the effects of auditory stimulation delivered via a Nucleus multichannel cochlear implant upon vowel production in adventitiously deafened adult speakers are reported. The first experiment contrasts vowel formant frequencies produced without auditory stimulation (implant processor OFF) to those produced with auditory stimulation (processor ON). Significant shifts in second formant frequencies were observed for intermediate vowels produced without auditory stimulation; however, no significant shifts were observed for the point vowels. Higher first formant frequencies occurred in five of eight vowels when the processor was turned ON versus OFF. A second experiment contrasted productions of the word "head" produced with a FULL map, OFF condition, and a SINGLE channel condition that restricted the amount of auditory information received by the subjects. This experiment revealed significant shifts in second formant frequencies between FULL map utterances and the other conditions. No significant differences in second formant frequencies were observed between SINGLE channel and OFF conditions. These data suggest auditory feedback information may be used to adjust the articulation of some speech sounds. 相似文献

15.

Coarticulatory organization for lip rounding in Turkish and English

S E Boyce 《The Journal of the Acoustical Society of America》1990,88(6):2584-2595

A number of studies, involving English, Swedish, French, and Spanish, have shown that, for sequences of rounded vowels separated by nonlabial consonants, both EMG activity and lip protrusion diminish during the intervocalic consonant interval, producing a "trough" pattern. A two-part study was conducted to (a) compare patterns of protrusion movement (upper and lower lip) and EMG activity (orbicularis oris) for speakers of English and Turkish, a language where phonological rules constrain vowels within a word to agree in rounding and (b) determine which of two current models of coarticulation, the "look-ahead" and "coproduction" models, best explained the data. Results showed Turkish speakers producing "plateau" patterns of movement rather than troughs, and unimodal rather than bimodal patterns of EMG activity. In the second part of the study, one prediction of the coproduction model, that articulatory gestures have stable profiles across contexts, was tested by adding and subtracting movement data signals to synthesize naturally occurring patterns. Results suggest English and Turkish may have different modes of coarticulatory organization. 相似文献

16.

Kinematic, acoustic, and perceptual analyses of connected speech produced by parkinsonian and normal geriatric adults 总被引：4，自引：0，他引：4

K Forrest G Weismer G S Turner 《The Journal of the Acoustical Society of America》1989,85(6):2608-2622

Acoustic and kinematic analyses, as well as perceptual evaluation, were conducted on the speech of Parkinsonian and normal geriatric adults. As a group, the Parkinsonian speakers had very limited jaw movement compared to the normal geriatrics. For opening gestures, jaw displacements and velocities produced by the Parkinsonian subjects were about half those produced by the normal geriatrics. Lower lip movement amplitude and velocity also were reduced for the Parkinsonian speakers relative to the normal geriatrics, but the magnitude of the reduction was not as great as that seen in the jaw. Lower lip closing velocities expressed as a function of movement amplitude were greater for the Parkinsonian speakers than for the normal geriatrics. This increased velocity of lower lip movement may reflect a difference in the control of lip elevation for the Parkinsonian speakers, an effect that increased with the severity of dysarthria. Acoustically, the Parkinsonian subjects had reduced durations of vocalic segments, reduced formant transitions, and increased voice onset time compared to the normal geriatrics. These effects were greater for the more severe, compared to the milder, dysarthrics and were most apparent in the more complex, vocalic gestures. 相似文献

17.

Different phase-stable relationships of the upper lip and jaw for production of vowels and diphthongs.

S Shaiman R J Porter 《The Journal of the Acoustical Society of America》1991,90(6):3000-3007

Relational invariants have been reported in the timing of articulatory gestures across suprasegmental changes, such as rate and stress. In the current study, the relative timing of the upper lip and jaw was investigated across changes in both suprasegmental and segmental characteristics of speech. The onset of upper lip movement relative to the vowel-to-vowel jaw cycle during intervocalic bilabial production was represented as a phase angle, and analyzed across changes in stress, vowel height, and vowel/diphthong identity. Results indicated that the relative timing of the upper lip and jaw varied systematically with changes in stress and vowel/diphthong identity, while remaining constant across changes in vowel height. It appears that modifications in relative timing may be due to adjustments in the jaw cycle as a result of the compound nature of jaw movement for diphthongs as compared to vowels, with further modifications due to the effect of stress on these compound movements. 相似文献

18.

The timing of articulatory gestures: evidence for relational invariants

B Tuller J A Kelso 《The Journal of the Acoustical Society of America》1984,76(4):1030-1036

In this article, we examine the effects of changing speaking rate and syllable stress on the space-time structure of articulatory gestures. Lip and jaw movements of four subjects were monitored during production of selected bisyllabic utterances in which stress and rate were orthogonally varied. Analysis of the relative timing of articulatory movements revealed that the time of onset of gestures specific to consonant articulation was tightly linked to the timing of gestures specific to the flanking vowels. The observed temporal stability was independent of large variations in displacement, duration, and velocity of individual gestures. The kinematic results are in close agreement with our previously reported EMG findings [B. Tuller et al., J. Exp. Psychol. 8, 460-472 (1982)] and together provide evidence for relational invariants in articulation. 相似文献

19.

Sensorineural hearing loss and the discrimination of vowel-like stimuli

C W Turner D J Van Tasell 《The Journal of the Acoustical Society of America》1984,75(2):562-565

It has been hypothesized that the wider-than-normal auditory bandwidths attributed to sensorineural hearing loss lead to a reduced ability to discriminate spectral characteristics in speech signals. In order to investigate this possibility, the minimum detectable depth of a spectral "notch" between the second (F2) and third (F3) formants of a synthetic vowel-like stimulus was determined for normal and hearing-impaired subjects. The minimum detectable notch for all subjects was surprisingly small; values obtained were much smaller than those found in actual vowels. An analysis of the stimuli based upon intensity discrimination within a single critical band predicted only small differences in performance on this task for rather large differences in the size of the auditory bandwidth. These results suggest that impairments of auditory frequency resolution in sensorineural hearing loss may not be critical in the perception of steady-state vowels. 相似文献

20.

Segregation of concurrent sounds. I: Effects of frequency modulation coherence

S McAdams 《The Journal of the Acoustical Society of America》1989,86(6):2148-2159

Frequency modulation coherence was investigated as a possible cue for the perceptual segregation of concurrent sound sources. Synthesized chords of 2-s duration and comprising six permutations of three sung vowels (/a/, /i/, /o/) at three fundamental frequencies (130.8, 174.6, and 233.1 Hz) were constructed. In one condition, no vowels were modulated, and, in a second, all three were modulated coherently such that the ratio relations among all frequency components were maintained. In a third group of conditions, one vowel was modulated, while the other two remained steady. In a fourth group, one vowel was modulated independently of the two other vowels, which were modulated coherently with one another. Subjects were asked to judge the perceived prominence of each of the three vowels in each chord. Judged prominence increased significantly when the target vowel was modulated compared to when it was not, with the greatest increase being found for higher fundamental frequencies. The increase in prominence with modulation was unaffected by whether the target was modulated coherently or not with nontarget vowels. The modulation and pitch position of nontarget vowels had no effect on target vowel prominence. These results are discussed in terms of possible concurrent auditory grouping principles. 相似文献