Similar articles
 20 similar articles found
1.
The three experiments reported here compare the effectiveness of natural prosodic and vocal-tract size cues at overcoming spatial cues in selective attention. Listeners heard two simultaneous sentences and decided which of two simultaneous target words came from the attended sentence. Experiment 1 used sentences that had natural differences in pitch and in level caused by a change in the location of the main sentence stress. The sentences' pitch contours were moved apart or together in order to separate out effects due to pitch and those due to other prosodic factors such as intensity. Both pitch and the other prosodic factors had an influence on which target word was reported, but the effects were not strong enough to override the spatial difference produced by an interaural time difference of ±91 microseconds. In experiment 2, a large (±15%) difference in apparent vocal-tract size between the speakers of the two sentences had an additional and strong effect, which, in conjunction with the original prosodic differences, overrode an interaural time difference of ±181 microseconds. Experiment 3 showed that vocal-tract size differences of ±4% or less had no detectable effect. Overall, the results show that prosodic and vocal-tract size cues can override spatial cues in determining which target word belongs in an attended sentence.

2.
Three experiments used the Coordinated Response Measure task to examine the roles that differences in F0 and differences in vocal-tract length have on the ability to attend to one of two simultaneous speech signals. The first experiment asked how increases in the natural F0 difference between two sentences (originally spoken by the same talker) affected listeners' ability to attend to one of the sentences. The second experiment used differences in vocal-tract length, and the third used both F0 and vocal-tract length differences. Differences in F0 greater than 2 semitones produced systematic improvements in performance. Differences in vocal-tract length produced systematic improvements in performance when the ratio of lengths was 1.08 or greater, particularly when the shorter vocal tract belonged to the target talker. Neither of these manipulations produced improvements in performance as great as those produced by a different-sex talker. Systematic changes in both F0 and vocal-tract length that simulated an incremental shift in gender produced substantially larger improvements in performance than did differences in F0 or vocal-tract length alone. In general, shifting one of two utterances spoken by a female voice towards a male voice produces a greater improvement in performance than shifting male towards female. The increase in performance varied with the intonation patterns of individual talkers, being smallest for those talkers who showed most variability in their intonation patterns between different utterances.
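
The abstract above states F0 differences in semitones. As a reading aid, the conversion between a semitone interval and a frequency ratio can be sketched as follows. This is a minimal sketch using the standard equal-tempered definition (12 semitones per octave); the function names are hypothetical and are not taken from the study.

```python
import math


def semitone_difference(f0_a_hz: float, f0_b_hz: float) -> float:
    """Signed interval from f0_a_hz to f0_b_hz in equal-tempered semitones."""
    if f0_a_hz <= 0 or f0_b_hz <= 0:
        raise ValueError("F0 values must be positive")
    return 12.0 * math.log2(f0_b_hz / f0_a_hz)


def scale_f0(f0_hz: float, semitones: float) -> float:
    """Shift an F0 value (Hz) by the given number of semitones."""
    return f0_hz * 2.0 ** (semitones / 12.0)


if __name__ == "__main__":
    # Illustrative values only; the study's actual F0 values are not given in the abstract.
    print(semitone_difference(100.0, 200.0))  # one octave
    print(scale_f0(100.0, 2.0))               # 100 Hz raised by 2 semitones
```

Under this definition, the 2-semitone threshold mentioned in the abstract corresponds to a frequency ratio of 2^(2/12), roughly 1.12.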

3.
4.
Selective attention describes individuals' preference for certain information according to their motivation. Building on results from social psychology, we propose an opinion interaction model to improve the modeling of individuals' interacting behaviors. Two parameters govern the probability that an agent interacts with an opponent: individual relevance and time-openness. We find that large individual relevance combined with large time-openness promotes the emergence of large clusters, whereas large individual relevance combined with small time-openness favors a reduction in extremism. We also apply the new model to identify factors that lead to a successful product. Numerical simulations show that selective attention, especially individual relevance, cannot be ignored by launching firms and information spreaders seeking the most successful promotion.

5.
In many experiments on comodulation masking release (CMR), both across- and within-channel cues may be available. This makes it difficult to determine the mechanisms underlying CMR. The present study compared CMR in a flanking-band (FB) paradigm for a situation in which only across-channel cues were likely to be available [FBs placed distally from the on-frequency band (OFB)] and a situation where both across- and within-channel cues might have been available (proximally spaced FBs, for which larger CMRs have previously been observed). The use of across-channel cues was selectively disrupted using a manipulation of auditory grouping factors, following Dau et al. [J. Acoust. Soc. Am. 125, 2182-2188 (2009)], and the use of within-channel cues was selectively disrupted using a manipulation called "OFB reversal," following Goldman et al. [J. Acoust. Soc. Am. 129, 3181-3193 (2011)]. The auditory grouping manipulation eliminated CMR for the distal-FB configuration and reduced CMR for the proximal-FB configuration. This may indicate that across-channel cues are available for proximal FB placement. CMR for the proximal-FB configuration persisted when both manipulations were used together, which suggests that OFB reversal does not entirely eliminate within-channel cues.

6.
Glottal-pulse rate (GPR) and vocal-tract length (VTL) are related to the size, sex, and age of the speaker but it is not clear how the two factors combine to influence our perception of speaker size, sex, and age. This paper describes experiments designed to measure the effect of the interaction of GPR and VTL upon judgements of speaker size, sex, and age. Vowels were scaled to represent people with a wide range of GPRs and VTLs, including many well beyond the normal range of the population, and listeners were asked to judge the size and sex/age of the speaker. The judgements of speaker size show that VTL has a strong influence upon perceived speaker size. The results for the sex and age categorization (man, woman, boy, or girl) show that, for vowels with GPR and VTL values in the normal range, judgements of speaker sex and age are influenced about equally by GPR and VTL. For vowels with abnormal combinations of low GPRs and short VTLs, the VTL information appears to decide the sex/age judgement.

7.
A recent study [Smith and Patterson, J. Acoust. Soc. Am. 118, 3177-3186 (2005)] demonstrated that both the glottal-pulse rate (GPR) and the vocal-tract length (VTL) of vowel sounds have a large effect on the perceived sex and age (or size) of a speaker. The vowels for all of the "different" speakers in that study were synthesized from recordings of the sustained vowels of one adult male speaker. This paper presents a follow-up study in which a range of vowels were synthesized from recordings of four different speakers (an adult man, an adult woman, a young boy, and a young girl) to determine whether the sex and age of the original speaker would have an effect upon listeners' judgments of whether a vowel was spoken by a man, woman, boy, or girl, after they were equated for GPR and VTL. The sustained vowels of the four speakers were scaled to produce the same combinations of GPR and VTL, which covered the entire range normally encountered in everyday life. The results show that listeners readily distinguish children from adults based on their sustained vowels but that they struggle to distinguish the sex of the speaker.

8.
Standard continuous interleaved sampling processing, and a modified processing strategy designed to enhance temporal cues to voice pitch, were compared on tests of intonation perception and vowel perception, both in implant users and in acoustic simulations. In standard processing, 400 Hz low-pass envelopes modulated either pulse trains (implant users) or noise carriers (simulations). In the modified strategy, slow-rate envelope modulations, which convey dynamic spectral variation crucial for speech understanding, were extracted by low-pass filtering (32 Hz). In addition, during voiced speech, higher-rate temporal modulation in each channel was provided by 100% amplitude modulation by a sawtooth-like waveform whose periodicity followed the fundamental frequency (F0) of the input. Channel levels were determined by the product of the lower- and higher-rate modulation components. Both in acoustic simulations and in implant users, the ability to use intonation information to identify sentences as question or statement was significantly better with modified processing. However, while there was no difference in vowel recognition in the acoustic simulation, implant users performed worse with modified processing both in vowel recognition and in formant frequency discrimination. It appears that, while enhancing pitch perception, modified processing harmed the transmission of spectral information.
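
The product rule described in the abstract above (channel level = slow envelope × F0-rate modulator) can be illustrated with a short sketch. This is not the authors' implementation: the exact modulator shape, the handling of unvoiced samples, and all function and parameter names are assumptions made for illustration. The abstract says only that the waveform is "sawtooth-like" and applied during voiced speech.

```python
def sawtooth_modulator(f0_hz, sample_rate_hz):
    """Per-sample modulator with a sawtooth shape whose period follows F0.

    f0_hz: list of per-sample F0 values in Hz; values <= 0 mark unvoiced samples.
    Assumption: unvoiced samples receive no higher-rate modulation (gain 1.0).
    """
    phase = 0.0  # position within the current F0 period, in [0, 1)
    modulator = []
    for f0 in f0_hz:
        if f0 <= 0:
            modulator.append(1.0)
            phase = 0.0  # restart the period at the next voiced sample
            continue
        modulator.append(1.0 - phase)  # falls from 1 toward 0 over each period
        phase = (phase + f0 / sample_rate_hz) % 1.0
    return modulator


def channel_level(slow_envelope, f0_hz, sample_rate_hz):
    """Product of the slow-rate envelope and the F0-rate modulator, per sample."""
    modulator = sawtooth_modulator(f0_hz, sample_rate_hz)
    return [env * mod for env, mod in zip(slow_envelope, modulator)]


if __name__ == "__main__":
    # Hypothetical values: constant envelope, steady 125 Hz voice, 1 kHz sampling.
    levels = channel_level([2.0] * 16, [125.0] * 16, 1000.0)
    print(levels)
```

In this sketch the slow envelope is passed in already filtered; the 32 Hz low-pass stage mentioned in the abstract is outside its scope.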

9.
Two experiments investigated the effect of reverberation on listeners' ability to perceptually segregate two competing voices. Culling et al. [Speech Commun. 14, 71-96 (1994)] found that for competing synthetic vowels, masked identification thresholds were increased by reverberation only when combined with modulation of fundamental frequency (F0). The present investigation extended this finding to running speech. Speech reception thresholds (SRTs) were measured for a male voice against a single interfering female voice within a virtual room with controlled reverberation. The two voices were either (1) co-located in virtual space at 0 degrees azimuth or (2) separately located at ±60 degrees azimuth. In experiment 1, target and interfering voices were either normally intonated or resynthesized with a fixed F0. In anechoic conditions, SRTs were lower for normally intonated and for spatially separated sources, while, in reverberant conditions, the SRTs were all the same. In experiment 2, additional conditions employed inverted F0 contours. Inverted F0 contours yielded higher SRTs in all conditions, regardless of reverberation. The results suggest that reverberation can seriously impair listeners' ability to exploit differences in F0 and spatial location between competing voices. The levels of reverberation employed had no effect on speech intelligibility in quiet.

10.
Older individuals often report difficulty coping in situations with multiple conversations in which they at times need to "tune out" the background speech and at other times seek to monitor competing messages. The present study was designed to simulate this type of interaction by examining the cost of requiring listeners to perform a secondary task in conjunction with understanding a target talker in the presence of competing speech. The ability of younger and older adults to understand a target utterance was measured with and without requiring the listener to also determine how many masking voices were presented time-reversed. Also of interest was how spatial separation affected the ability to perform these two tasks. Older adults demonstrated slightly reduced overall speech recognition and obtained less spatial release from masking, as compared to younger listeners. For both younger and older listeners, spatial separation increased the costs associated with performing both tasks together. The meaningfulness of the masker had a greater detrimental effect on speech understanding for older participants than for younger participants. However, the results suggest that the problems experienced by older adults in complex listening situations are not necessarily due to a deficit in the ability to switch and/or divide attention among talkers.

11.
XW Zhou, RE Jones. J Phys Condens Matter, 2012, 24(32): 325804 (pp. 1-15)
The thermal conductivity of a crystal is sensitive to the presence of surfaces and nanoscale defects. While this opens tremendous opportunities to tailor thermal conductivity, true 'phonon engineering' of nanocrystals for a specific electronic or thermoelectric application can only be achieved when the dependence of thermal conductivity on the defect density, size and spatial population is understood and quantified. Unfortunately, experimental studies of the effects of nanoscale defects are quite challenging. While molecular dynamics simulations are effective in calculating thermal conductivity, the defect density range that can be explored with feasible computing resources is unrealistically high. As a result, previous work has not generated a fully detailed understanding of the dependence of thermal conductivity on nanoscale defects. Using GaN as an example, we have combined a physically motivated analytical model and highly converged large-scale molecular dynamics simulations to study the effects of defects on thermal conductivity. An analytical expression for thermal conductivity as a function of void density, size, and population has been derived and corroborated with the model, simulations, and experiments.

12.
Spatial impression perceived in a listening space comprises at least two components: one is auditory (apparent) source width (ASW) and the other is listener envelopment (LEV). Both ASW and LEV are affected not only by temporal but also by spatial structures of reflections. It has been clarified that ASW for symphony music is significantly affected by low-frequency components of source signals and reflections, but not by their high-frequency components. The objective of this work is to investigate whether LEV is affected by the frequency characteristics of source signals and reverberation sounds, which are known to contribute to the creation of LEV. In this study, three experiments were performed to clarify the effects of reverberation time (RT) and its frequency characteristics on LEV. In contrast to the case of ASW, the experimental results show that RTs both at high and low frequencies affect LEV.

13.
Binaural speech intelligibility of individual listeners under realistic conditions was predicted using a model consisting of a gammatone filter bank, an independent equalization-cancellation (EC) process in each frequency band, a gammatone resynthesis, and the speech intelligibility index (SII). Hearing loss was simulated by adding uncorrelated masking noises (according to the pure-tone audiogram) to the ear channels. Speech intelligibility measurements were carried out with 8 normal-hearing and 15 hearing-impaired listeners, collecting speech reception threshold (SRT) data for three different room acoustic conditions (anechoic, office room, cafeteria hall) and eight directions of a single noise source (speech in front). Artificial EC processing errors derived from binaural masking level difference data using pure tones were incorporated into the model. Except for an adjustment of the SII-to-intelligibility mapping function, no model parameter was fitted to the SRT data of this study. The overall correlation coefficient between predicted and observed SRTs was 0.95. The dependence of the SRT of an individual listener on the noise direction and on room acoustics was predicted with a median correlation coefficient of 0.91. The effect of individual hearing impairment was predicted with a median correlation coefficient of 0.95. However, for mild hearing losses the release from masking was overestimated.

14.
We investigate the role of the spatial pattern and temporal dynamics on the population properties of a diffusive Gause–Lotka–Volterra system. The average total population size is insensitive to the temporal dynamics, whereas a significant decrease of the population size can be found as the spatial diffusion effect is increased, implying that the spatial pattern plays an important role. At large diffusion coefficients, a saturation of the spatial pattern variation is observed, which can be understood by a spatial scaling analysis of the system. The existence of multiple attractors can also indicate that spatial patterns play a more important role than temporal dynamics in dominating the population size.

15.
Two experiments investigated the impact of reverberation and masking on speech understanding using cochlear implant (CI) simulations. Experiment 1 tested sentence recognition in quiet. Stimuli were processed with reverberation simulation (T=0.425, 0.266, 0.152, and 0.0 s) and then either processed with vocoding (6, 12, or 24 channels) or were subjected to no further processing. Reverberation alone had only a small impact on perception when as few as 12 channels of information were available. However, when the processing was limited to 6 channels, perception was extremely vulnerable to the effects of reverberation. In experiment 2, subjects listened to reverberated sentences, through 6- and 12-channel processors, in the presence of either speech-spectrum noise (SSN) or two-talker babble (TTB) at various target-to-masker ratios. The combined impact of reverberation and masking was profound, although there was no interaction between the two effects. This differs from results obtained in subjects listening to unprocessed speech where interactions between reverberation and masking have been shown to exist. A speech transmission index (STI) analysis indicated a reasonably good prediction of speech recognition performance. Unlike previous investigations, the SSN and TTB maskers produced equivalent results, raising questions about the role of informational masking in CI processed speech.

16.
The contributions of auditory and cognitive factors to age-dependent differences in auditory spatial attention were investigated. In conditions of real spatial separation, the target sentence was presented from a central location and competing sentences were presented from left and right locations. In conditions of simulated spatial separation, different apparent spatial locations of the target and competitors were induced using the precedence effect. The identity of the target was cued by a callsign presented either prior to or following each target sentence, and the probability that the target would be presented at the three locations was specified at the beginning of each block. Younger and older adults with normal hearing sensitivity below 4 kHz completed all 16 conditions (2 spatial separation methods × 2 callsign conditions × 4 probability conditions). Overall, younger adults performed better than older adults. For both age groups, performance improved with target location certainty, with a priori target cueing, and when location differences were real rather than simulated. For both age groups, the contributions of natural spatial cues were most pronounced when the target occurred at "unlikely" spatial listening locations. This suggests that both age groups benefit similarly from richer acoustical cues and a priori information in difficult listening environments.
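
The 16 conditions in the abstract above come from fully crossing three factors (2 × 2 × 4). A minimal sketch of enumerating such a design is shown below. The factor level labels are hypothetical; in particular, the abstract does not list the four probability settings, so placeholders are used.

```python
from itertools import product

# Level labels are illustrative placeholders, not taken from the study.
separation_methods = ["real", "simulated"]
callsign_positions = ["before_target", "after_target"]
probability_settings = ["P1", "P2", "P3", "P4"]  # four unspecified settings

# Fully crossed design: every combination of one level per factor.
conditions = list(product(separation_methods, callsign_positions, probability_settings))

if __name__ == "__main__":
    for index, condition in enumerate(conditions, start=1):
        print(index, condition)
```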

17.
This study presents various acoustic measures used to examine the sequence /a # C/, where "#" represents different prosodic boundaries in French. The 6 consonants studied are /b d g f s S/ (3 stops and 3 fricatives). The prosodic units investigated are the utterance, the intonational phrase, the accentual phrase, and the word. It is found that vowel target values, formant transitions into the stop consonant, and the rate of change in spectral tilt into the fricative, are affected by the strength of the prosodic boundary. F1 becomes higher for /a/ the stronger the prosodic boundary, with the exception of one speaker's utterance data, which show the effects of articulatory declension at the utterance level. Various effects of the stop consonant context are observed, the most notable being a tendency for the vowel /a/ to be displaced in the direction of the F2 consonant "locus" for /d/ (the F2 consonant values for which remain relatively stable across prosodic boundaries) and for /g/ (the F2 consonant values for which are displaced in the direction of the velar locus in weaker prosodic boundaries, together with those of the vowel). Velocity of formant transition may be affected by prosodic boundary (with greater velocity at weaker boundaries), though results are not consistent across speakers. There is also a tendency for the rate of change in spectral tilt moving from the vowel to the fricative to be affected by the presence of a prosodic boundary, with a greater rate of change at the weaker prosodic boundaries. It is suggested that spectral cues, in addition to duration, amplitude, and F0 cues, may alert listeners to the presence of a prosodic boundary.

18.
This study presents EMA (electromagnetic articulography) data on articulation of the vowel /a/ at different prosodic boundaries in French. Three speakers of metropolitan French produced utterances containing the vowel /a/, preceded by /t/ and followed by one of six consonants /b d g f s S/ (three stops and three fricatives), with different prosodic boundaries intervening between the /a/ and the six different consonants. The prosodic boundaries investigated are the Utterance, the Intonational phrase, the Accentual phrase, and the Word. Data for the Tongue Tip, Tongue Body, and Jaw are presented. The articulatory data presented here were recorded at the same time as the acoustic data presented in Tabain [J. Acoust. Soc. Am. 113, 516-531 (2003)]. Analyses show that there is a strong effect on peak displacement of the vowel according to the prosodic hierarchy, with the stronger prosodic boundaries inducing a much lower Tongue Body and Jaw position than the weaker prosodic boundaries. Durations of both the opening movement into and the closing movement out of the vowel are also affected. Peak velocity of the articulatory movements is also examined, and, contrary to results for phrase-final lengthening, it is found that peak velocity of the opening movement into the vowel tends to increase with the higher prosodic boundaries, together with the increased magnitude of the movement between the consonant and the vowel. Results for the closing movement out of the vowel and into the consonant are not so clear. Since one speaker shows evidence of utterance-level articulatory declension, it is suggested that the competing constraints of articulatory declension and prosodic effects might explain some previous results on phrase-final lengthening.

19.
20.
This study investigated whether the production of prosodic focus and phrasing contrasts was modified when interlocutors could only hear each other [auditory only (AO)], compared to when they could hear and see each other [face to face (FTF)]. The prosodic characteristics of utterances produced by six talkers were examined using both acoustic and perceptual measures (ratings of the degree of focus or clarity of the statement-question contrast). The acoustic measures showed a range of differences between narrow focus and between phrasing contrasts, and some of these differences were greater in the AO setting than in the FTF one. Listeners' ratings of focus and phrasing showed a clear difference between the AO and FTF conditions, with perceptual attributes of both narrow focus and echoic question phrasing being rated as clearer in the AO condition. To explain these results, it is proposed that talkers compensate for the lack of visual prosodic cues in the AO condition by taking extra care (relative to FTF conditions) to ensure the effective transmission of prosodic cues.
