Similar literature
20 similar documents found (search time: 31 ms)
1.
This investigation was undertaken to enlarge current understanding of the acoustic properties that influence the perception of maleness and femaleness in the voices of prepubertal children. Perceptual judgments of sexual identity were obtained in response to tape recordings of whispered and normally phonated vowels, normally spoken sentences, and sentences spoken in a monotonous fashion. Seventy-three children provided recordings. The four utterance types were chosen to experimentally manipulate selected physical properties of speech thought to exert an influence on listener judgments of sexual identity. The results of this work suggest that differences in vocal tract dimensions and/or articulatory behaviors provided the primary cues about the sexual identity of these preadolescent children. Although laryngeal source cues could have provided relevant information about the sex of a few children, this variable was judged to play a relatively minor role in the sex recognition process. New information was uncovered about the role certain suprasegmental factors play in the identification of child sex.

2.
A listener who recognizes a talker notices characteristic attributes of the talker's speech despite the novelty of each utterance. Accounts of talker perception have often presumed that consistent aspects of an individual's speech, termed indexical properties, are ascribable to a talker's unique anatomy or consistent vocal posture, distinct from acoustic correlates of phonetic contrasts. Accordingly, the perception of a talker is acknowledged to occur independently of the perception of a linguistic message. Alternatively, some studies suggest that attention to attributes of a talker includes indexical linguistic attributes conveyed in the articulation of consonants and vowels. This investigation sought direct evidence of attention to phonetic attributes of speech in perceiving talkers. Natural samples and sine-wave replicas derived from them were used in three experiments assessing the perceptual properties of natural and sine-wave sentences; of temporally veridical and reversed natural and sine-wave sentences; and of an acoustic correlate of vocal tract scale to judgments of sine-wave talker similarity. The results revealed that the subjective similarity of individual talkers is preserved in the absence of natural vocal quality, and that local phonetic segmental attributes as well as global characteristics of speech can be exploited when listeners notice characteristics of talkers.

3.
Information about the acoustic properties of a talker's voice is available in optical displays of speech, and vice versa, as evidenced by perceivers' ability to match faces and voices based on vocal identity. The present investigation used point-light displays (PLDs) of visual speech and sine-wave replicas of auditory speech in a cross-modal matching task to assess perceivers' ability to match faces and voices under conditions when only isolated kinematic information about vocal tract articulation was available. These stimuli were also used in a word recognition experiment under auditory-alone and audiovisual conditions. The results showed that isolated kinematic displays provide enough information to match the source of an utterance across sensory modalities. Furthermore, isolated kinematic displays can be integrated to yield better word recognition performance under audiovisual conditions than under auditory-alone conditions. The results are discussed in terms of their implications for describing the nature of speech information and current theories of speech perception and spoken word recognition.

4.
Peta White. Journal of Voice, 1999, 13(4): 570-582
High-pitched productions present difficulties in formant frequency analysis due to wide harmonic spacing and poorly defined formants. As a consequence, there is little reliable data regarding children's spoken or sung vowel formants. Twenty-nine 11-year-old Swedish children were asked to produce four sustained spoken and sung vowels. In order to circumvent the problem of wide harmonic spacing, F1 and F2 measurements were taken from vowels produced with a sweeping F0. Experienced choir singers were selected as subjects in order to minimize the larynx height adjustments associated with pitch variation in less skilled subjects. Results showed significantly higher formant frequencies for speech than for singing. Formants were consistently higher in girls than in boys, suggesting longer vocal tracts in these preadolescent boys. Furthermore, formant scaling demonstrated vowel-dependent differences between boys and girls, suggesting non-uniform differences in male and female vocal tract dimensions. These vowel-dependent sex differences were not consistent with adult data.

5.
The primary goal of this study was to characterize a performer's singing and speaking voice. One woman was not admitted to a premier choral group, but her sister, who was comparable in physical characteristics and background, was admitted and provided a valuable control subject. The perceptual judgment of a vocal coach who conducted the group's auditions was decisive in discriminating these two singers. The singer not admitted to the group described a history of voice pathology, lacked a functional head register, and spoke with a voice characterized by hoarseness. Multiple listener judgments and acoustic and aerodynamic evaluations of both singers provided a more systematic basis for determining: 1) the phonatory basis for this judgment; 2) whether similar judgments would be made by groups of vocal coaches and speech-language pathologists; and 3) whether the type of tasks (e.g., sung vs. spoken) would influence these judgments. Statistically significant differences were observed between the ratings of vocal health provided by two different groups of listeners. Significant interactions were also observed as a function of the types of voice samples heard by these listeners. Instrumental analyses provided evidence that, in comparison to her sister, the rejected singer had a compromised vocal range, glottal insufficiencies as assessed aerodynamically and electroglottographically, and impaired acoustic quality, especially in her speaking voice.

6.
Although listeners routinely perceive both the sex and individual identity of talkers from their speech, explanations of these abilities are incomplete. Here, variation in vocal production-related anatomy was assumed to affect vowel acoustics thought to be critical for indexical cueing. Integrating this approach with source-filter theory, patterns of acoustic parameters that should represent sex and identity were identified. Due to sexual dimorphism, the combination of fundamental frequency (F0, reflecting larynx size) and vocal tract length cues (VTL, reflecting body size) was predicted to provide the strongest acoustic correlates of talker sex. Acoustic measures associated with presumed variations in supralaryngeal vocal tract-related anatomy occurring within sex were expected to be prominent in individual talker identity. These predictions were supported by results of analyses of 2500 tokens of the /ɛ/ phoneme, extracted from the naturally produced speech of 125 subjects. Classification by talker sex was virtually perfect when F0 and VTL were used together, whereas talker classification depended primarily on the various acoustic parameters associated with vocal-tract filtering.

7.
Previous studies have shown that trained listeners are highly reliable in making perceptual judgments of several parameters of normal and pathologic voices. This study investigated objective measures of the acoustic characteristics of high- and low-preference voices, as determined by a previous perceptual study. Four acoustic parameters were measured: harmonics-to-noise ratio, autocorrelation function, average jitter, and the standard deviation of the fundamental frequency. Useful correlations between perceptual and measured results were identified. Normal voices differ from pathologic voices in terms of the acoustic-perceptual relationships.

8.
Reiterant speech, or nonsense syllable mimicry, has been proposed as a way to study prosody, particularly syllable and word durations, unconfounded by segmental influences. Researchers have shown that segmental influences on durations can be neutralized in reiterant speech. If it is to be a useful tool in the study of prosody, it must also be shown that reiterant speech preserves the suprasegmental duration and intonation differences relevant to perception. In the present study, syllable durations for nonreiterant and reiterant ambiguous sentences were measured to seek evidence of the duration differences that can enable listeners to resolve surface structure ambiguities in nonreiterant speech. These duration patterns were found in both nonreiterant and reiterant speech. A perceptual study tested listeners' perception of these ambiguous sentences as spoken by four "good" speakers, that is, speakers who neutralized intrinsic duration differences and whose sentences were independently rated by skilled listeners as good imitations of normal speech. The listeners were able to choose the correct interpretation when the ambiguous sentences were in reiterant form as well as they did when the sentences were spoken normally. These results support the notion that reiterant speech is like nonreiterant speech in aspects that are important in the study of prosody.

9.
The purpose of this study was to examine the acoustic characteristics of children's speech and voices that account for listeners' ability to identify gender. In Experiment I, vocal recordings and gross physical measurements of 4-, 8-, 12-, and 16-year-olds were taken (10 girls and 10 boys per age group). The speech sample consisted of seven nondiphthongal vowels of American English (/æ/ "had," /ɛ/ "head," /i/ "heed," /ɪ/ "hid," /ɑ/ "hod," /ʌ/ "hud," and /u/ "who'd") produced in the carrier phrase, "Say /hVd/ again." Fundamental frequency (f0) and formant frequencies (F1, F2, F3) were measured from these syllables. In Experiment II, 20 adults rated the syllables produced by the children in Experiment I based on a six-point gender rating scale. The results from these experiments indicate (1) vowel formant frequencies differentiate gender for children as young as four years of age, while formant frequencies and f0 differentiate gender after 12 years of age, (2) the relationship between gross measures of physical size and vocal characteristics is apparent for at least 12- and 16-year-olds, and (3) listeners can identify gender from the speech and voice of children as young as four years of age, and with respect to young children, listeners appear to base their gender ratings on vowel formant frequencies. The findings are discussed in relation to the development of gender identity and its perceptual representation in speech and voice.

10.
11.
This study investigates the possible differences between actors' and nonactors' vocal projection strategies using acoustic and perceptual analyses. A total of 11 male actors and 10 male nonactors volunteered as subjects, reading an extended text sample at habitual, moderate, and loud levels. The samples were analyzed for sound pressure level (SPL), alpha ratio (difference between the average SPL of the 1-5 kHz region and the average SPL of the 50 Hz-1 kHz region), fundamental frequency (F0), and long-term average spectrum (LTAS). Through LTAS, the mean frequency of the first formant (F1) range, the mean frequency of the "actor's formant," the level differences between the F1 frequency region and the F0 region (L1-L0), and the level differences between the strongest peak at 0-1 kHz and that at 3-4 kHz were measured. Eight voice specialists perceptually evaluated the degree of projection, loudness, and tension in the samples. The actors had a greater alpha ratio, a stronger level of the "actor's formant" range, and a higher degree of perceived projection and loudness at all loudness levels. SPL, however, did not differ significantly between the actors and nonactors, and no differences were found in the mean formant frequency ranges. The alpha ratio and the relative level of the "actor's formant" range seemed to be related to the degree of perceived loudness. From the physiological point of view, a more favorable glottal setting, providing a higher glottal closing speed, may be characteristic of these actors' projected voices. Thus, the projected voices in this group of actors were more related to the glottal source than to the resonance of the vocal tract.
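The alpha ratio defined in this abstract can be illustrated with a minimal sketch. It assumes the LTAS is available as a list of (frequency in Hz, level in dB) bins and that "average SPL" means the arithmetic mean of the dB levels in each band; the abstract does not state the averaging method, and the function names and example values below are hypothetical.

```python
def band_mean_level(ltas, lo_hz, hi_hz):
    """Arithmetic mean of the dB levels of LTAS bins with lo_hz <= f < hi_hz.

    Assumption: averaging is done directly on dB values; the abstract
    does not say whether levels or powers were averaged.
    """
    levels = [level for freq, level in ltas if lo_hz <= freq < hi_hz]
    if not levels:
        raise ValueError("no LTAS bins fall inside the requested band")
    return sum(levels) / len(levels)


def alpha_ratio(ltas):
    """Mean level of the 1-5 kHz region minus mean level of the 50 Hz-1 kHz region."""
    return band_mean_level(ltas, 1000, 5000) - band_mean_level(ltas, 50, 1000)


# Hypothetical LTAS bins: (frequency in Hz, level in dB).
example_ltas = [(100, 60.0), (500, 58.0), (900, 56.0), (2000, 40.0), (4000, 36.0)]
print(alpha_ratio(example_ltas))  # -20.0
```

A less negative value means relatively more energy above 1 kHz, which is the direction the abstract reports for the actors.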

12.
This article reports on an experiment examining some perceptual consequences of correspondences between accent patterns, the distribution of plus and minus focus, and the distribution of new and given information in Dutch spoken sentences. "Accent patterns" refer here to the distribution of intonational accents over spoken sentences. Each accent marks a sentence constituent as plus focus, i.e., as highlighted by the speaker. Constituents not so marked are called minus focus. The main questions examined here are to what extent plus focus constituents are generally perceived as conveying new information, and minus focus constituents as conveying earlier introduced or given information. The linguistic material for the experiment was formed by brief radio news items, each two sentences long. Leading sentences determined the distribution of new and given information in target sentences. The accent patterns and, hence, the possible focus distributions in the target utterances were varied systematically by manipulating their synthetic pitch contours according to the rules for Dutch intonation. Subjects were asked to rate on a scale from 1 to 10 the acceptability of each possible combination of a leading with a target utterance. Results showed that the most preferred or acceptable distributions of new and given information closely match the distributions of plus and minus focus. It was also found that new information can hardly ever acceptably be associated with minus focus, but given information can rather often, although not always, acceptably be associated with plus focus. This appears to be limited to certain conditions, defined by a combination of syntactic and focus structure of the sentence. In these conditions, plus focus can be perceived not only as signaling new information but also as highlighting thematic relations with the context. These results are related to work on text-to-speech systems.

13.
An important outcome of education for speech-language pathology practice is the ability to analyze voices perceptually, a complex task that is often difficult for novices. This article describes an interactive multimedia package, “A Sound Judgement,” that is designed to help students develop skills in perceptual voice analysis and to link their perceptions to laryngeal physiology. The package presents a range of clients with vocal impairments at increasing levels of complexity. Each case has a videoed interview, endoscopic views and animations of the larynx, and case history information. Students make perceptual ratings of clients' voices on a format designed specifically for this package, and feedback is provided using ratings made by expert speech-language pathologists. High levels of consensus for the perceptual judgments were achieved among the expert raters. Preliminary evaluations by students have demonstrated that “A Sound Judgement” is likely to be a valuable educational tool.

14.
Traditional interval or ordinal rating scale protocols appear to be poorly suited to measuring vocal quality. To investigate why this might be so, listeners were asked to classify pathological voices as having or not having different voice qualities. It was reasoned that this simple task would allow listeners to focus on the kind of quality a voice had, rather than how much of a quality it possessed, and thus might provide evidence for the validity of traditional vocal qualities. In experiment 1, listeners judged whether natural pathological voice samples were or were not primarily breathy and rough. Listener agreement in both tasks was above chance, but listeners agreed poorly that individual voices belonged in particular perceptual classes. To determine whether these results reflect listeners' difficulty agreeing about single perceptual attributes of complex stimuli, listeners in experiment 2 classified natural pathological voices and synthetic stimuli (varying in f0 only) as low pitched or not low pitched. If disagreements derive from difficulties dividing an auditory continuum consistently, then patterns of agreement should be similar for both kinds of stimuli. In fact, listener agreement was significantly better for the synthetic stimuli than for the natural voices. Difficulty isolating single perceptual dimensions of complex stimuli thus appears to be one reason why traditional unidimensional rating protocols are unsuited to measuring pathologic voice quality. Listeners did agree that a few aphonic voices were breathy, and that a few voices with prominent vocal fry and/or interharmonics were rough. These few cases of agreement may have occurred because the acoustic characteristics of the voices in question corresponded to the limiting case of the quality being judged. Values of f0 that generated listener agreement in experiment 2 were more extreme for natural than for synthetic stimuli, consistent with this interpretation.

15.
The main purpose of the present study was to examine vocal quality and to investigate the effects of gender on vocal quality in 28 children with a unilateral or bilateral cleft palate. In this study, vocal quality was determined using videolaryngostroboscopic and perceptual evaluations and aerodynamic, voice range, acoustic, and dysphonia severity index (DSI) measurements. The DSI is based on the weighted combination of four voice measurements and ranges from +5 for normal voices to -5 for severely dysphonic voices. Additional objectives were to compare the vocal quality characteristics of children with cleft palate with the available normative data and to investigate the impact of the cleft type on vocal quality. Gender-related vocal quality differences were found. The male children with cleft palate showed an overall DSI value of +0.62, with a perceptually slight grade of hoarseness and roughness. The female children had a DSI value of +2.4, reflecting a perceptually normal voice. Irrespective of the type of cleft, all subjects demonstrated a significantly lower DSI value in comparison with the available normative data. The results of the present study have provided valuable insights into the vocal quality characteristics of children with cleft palate.

16.
Three experiments used the Coordinated Response Measure task to examine the roles that differences in F0 and differences in vocal-tract length play in the ability to attend to one of two simultaneous speech signals. The first experiment asked how increases in the natural F0 difference between two sentences (originally spoken by the same talker) affected listeners' ability to attend to one of the sentences. The second experiment used differences in vocal-tract length, and the third used both F0 and vocal-tract length differences. Differences in F0 greater than 2 semitones produced systematic improvements in performance. Differences in vocal-tract length produced systematic improvements in performance when the ratio of lengths was 1.08 or greater, particularly when the shorter vocal tract belonged to the target talker. Neither of these manipulations produced improvements in performance as great as those produced by a different-sex talker. Systematic changes in both F0 and vocal-tract length that simulated an incremental shift in gender produced substantially larger improvements in performance than did differences in F0 or vocal-tract length alone. In general, shifting one of two utterances spoken by a female voice towards a male voice produces a greater improvement in performance than shifting male towards female. The increase in performance varied with the intonation patterns of individual talkers, being smallest for those talkers who showed most variability in their intonation patterns between different utterances.
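The two thresholds reported in this abstract are expressed as a musical interval (semitones) and a length ratio. The sketch below shows how such quantities are conventionally computed, using the standard relation of 12 semitones per octave; the function names and the example F0 and vocal-tract length values are hypothetical and do not come from the study.

```python
import math

# Thresholds quoted in the abstract.
F0_THRESHOLD_SEMITONES = 2.0   # improvements reported for differences greater than this
VTL_RATIO_THRESHOLD = 1.08     # improvements reported for ratios at or above this


def semitone_difference(f0_a_hz, f0_b_hz):
    """Absolute interval between two F0 values in semitones (12 per octave)."""
    return abs(12.0 * math.log2(f0_b_hz / f0_a_hz))


def vtl_ratio(vtl_a_cm, vtl_b_cm):
    """Ratio of the longer vocal-tract length to the shorter one."""
    return max(vtl_a_cm, vtl_b_cm) / min(vtl_a_cm, vtl_b_cm)


# Hypothetical talker pairs, for illustration only.
print(semitone_difference(100.0, 200.0))                           # one octave: 12.0
print(semitone_difference(200.0, 220.0) > F0_THRESHOLD_SEMITONES)  # False (about 1.65 semitones)
print(vtl_ratio(17.0, 15.5) >= VTL_RATIO_THRESHOLD)                # True (ratio about 1.097)
```

Working in semitones rather than hertz makes the F0 threshold independent of the absolute pitch of the talkers.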

17.
This study concerns speaking voice quality in a group of male teachers (n = 35) and male actors (n = 36); the purpose was to investigate normal and supranormal voices. The goal was the development of a method of valid perceptual evaluation for normal to supranormal and resonant voices. The voices (text reading at two loudness levels) had been evaluated by 10 listeners for 15 vocal characteristics using VA scales. In this investigation, the results of an exploratory factor analysis of the vocal characteristics used in this method are presented, reflecting four dimensions of major importance for normal and supranormal voices. Special emphasis is placed on the effects on voice quality of a change in the loudness variable, as two loudness levels are studied. Furthermore, the vocal characteristics Sonority and Ringing voice quality receive special attention, as the essence of the term "resonant voice" was a basic issue throughout the doctoral dissertation in which this study was included.

18.
A growing body of contemporary research has investigated differences between trained and untrained singing voices. However, few studies have separated untrained singers into those who do and do not express abilities related to singing talent, including accurate pitch control and production of a pleasant timbre (voice quality). This investigation studied measures of the singing power ratio (SPR), which is a quantitative measure of the resonant quality of the singing voice. SPR reflects the amplification or suppression in the vocal tract of the harmonics produced by the sound source. This measure was acquired from the voices of untrained talented and nontalented singers as a means to objectively investigate voice quality differences. Measures of SPR were acquired from vocal samples with fast Fourier transform (FFT) power spectra to analyze the amplitude level of the partials in the acoustic spectrum. Long-term average spectra (LTAS) were also analyzed. Results indicated significant differences in SPR between groups, which suggests that vocal tract resonance, and its effect on perceived vocal timbre or quality, may be an important variable related to the perception of singing talent. LTAS confirmed group differences in the tuning of vocal tract harmonics.

19.
To reduce the degradation in speech recognition caused by the varied characteristics of different speakers, a method of perceptual frequency warping based on subglottal resonances is proposed for speaker normalization. The warping factor is extracted from the second subglottal resonance using the acoustic coupling between the subglottis and the vocal tract. The second subglottal resonance is independent of the speech content and reflects the speaker characteristics better than the third formant. The perceptual minimum variance distortionless response (PMVDR) coefficients, which are more robust and have better anti-noise capability than MFCC, are normalized. The normalized coefficients are used in the speech-mode training and speech recognition. Experiments show that the word error rate, compared with MFCC and with spectrum warping by the third formant, decreases by 4% and 3% respectively in clean speech recognition, and by 9% and 5% respectively in a noisy environment. The results indicate that the proposed method can improve the word recognition accuracy in a speaker-independent recognition system.

20.
Effects of Family Therapy on Children's Voices   Cited by: 1 (self-citations: 0; citations by others: 1)
The families of nine children with deviant voice qualities were selected for family treatment according to the SYGESTI model. Recordings of the children's speech were made before and after therapy. Perceptual evaluation of their voice quality showed significant improvement in various perceptual parameters after the therapy. Acoustical analysis confirmed changes of voice quality and mean fundamental frequency in speech. The therapy was also found to improve relations between family members, conflict management, and other aspects of communication. The results suggest that these children's deviant voices were related to family conditions.


Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.)  ICP registration: 京ICP备09084417号