Similar articles
 Found 20 similar articles; search took 31 ms
1.
Evidence seems to show that coding DNA is more random than noncoding DNA, but other conflicting evidence also exists. Based on the third-base degeneracy of codons, we regard the third position of codons as a 'noisy' position. By deleting one fixed position of non-overlapping triplets in a given sequence, three masked sequences may be deduced from the sequence. We have investigated the block-to-site mutual information functions of coding and noncoding sequences in yeast without and with the masking. Characteristics that distinguish coding from noncoding DNA have been found. It is observed that the strong correlations in the coding regions may be blocked by the third base of codons, and the proper masking can extract the correlations. Distribution of dimeric tandem repeats of unmasked sequences is also compared with that of masked sequences.
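The masking step described in this abstract can be sketched in a few lines of Python. This is only an illustration: the function name `masked_sequences` is hypothetical, and dropping trailing bases that do not fill a complete triplet is an assumption, since the abstract does not say how incomplete triplets are handled.

```python
def masked_sequences(seq: str) -> list[str]:
    """Return the three sequences obtained by deleting one fixed position
    (0, 1 or 2) from every non-overlapping triplet of `seq`."""
    # Assumption: bases left over after the last complete triplet are dropped.
    usable = len(seq) - len(seq) % 3
    triplets = [seq[i:i + 3] for i in range(0, usable, 3)]
    return [
        "".join(t[:drop] + t[drop + 1:] for t in triplets)
        for drop in range(3)
    ]


# Triplets ATG, GCC, TAA with position 1, 2 or 3 removed in turn.
masks = masked_sequences("ATGGCCTAA")
print(masks)
```

The third element of the result is the sequence with every third base removed, which corresponds to masking the 'noisy' codon position when the reading frame starts at the first base.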

2.
We use the wavelet transform to investigate the fractal scaling properties of coding and noncoding human DNA sequences. We find that the strength of the long-range correlations observed in the introns increases with the guanine-cytosine (GC) content, while coding sequences show no such correlations at any GC content. However, we demonstrate that long-range correlations can be detected when the coding sequences are undersampled by retaining the third base of each codon only. This strongly suggests that the observed correlations are not likely to be due to insertion-deletion mechanisms. We comment about the origin of these correlations in terms of putative dynamical processes that could produce the isochore structure of the human genome. Received: 18 August 1997 / Accepted: 29 October 1997

3.
We present a new computational approach to finding borders between coding and noncoding DNA. This approach has two features: (i) DNA sequences are described by a 12-letter alphabet that captures the differential base composition at each codon position, and (ii) the search for the borders is carried out by means of an entropic segmentation method which uses only the general statistical properties of coding DNA. We find that this method is highly accurate in finding borders between coding and noncoding regions and requires no "prior training" on known data sets. Our results appear to be more accurate than those obtained with moving windows in the discrimination of coding from noncoding DNA.
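A rough sketch of the two ingredients named in this abstract follows. The 12-letter alphabet is modelled as (base, position mod 3) pairs. The split criterion is the Jensen-Shannon divergence between the left and right segments, which is an assumption: the abstract only says "entropic segmentation" and does not name the measure. The function names `encode12` and `best_split` are hypothetical.

```python
from collections import Counter
from math import log2


def encode12(seq):
    """Map each base to one of 12 symbols: (base, position modulo 3)."""
    return [(base, i % 3) for i, base in enumerate(seq)]


def entropy(counts, total):
    """Shannon entropy in bits of a Counter holding `total` observations."""
    return -sum(c / total * log2(c / total) for c in counts.values())


def best_split(symbols):
    """Return (position, divergence) of the single cut that maximises the
    Jensen-Shannon divergence between the left and right segments."""
    n = len(symbols)
    whole = Counter(symbols)
    h_whole = entropy(whole, n)
    left = Counter()
    best_pos, best_jsd = None, -1.0
    for i in range(1, n):
        left[symbols[i - 1]] += 1
        right = whole - left  # Counter subtraction keeps positive counts only
        jsd = (h_whole
               - (i / n) * entropy(left, i)
               - ((n - i) / n) * entropy(right, n - i))
        if jsd > best_jsd:
            best_pos, best_jsd = i, jsd
    return best_pos, best_jsd


# Toy input: a periodic "coding-like" half followed by a uniform half.
position, divergence = best_split(encode12("ATG" * 10 + "CCC" * 10))
print(position, divergence)
```

A full segmentation would apply the cut recursively and stop when the divergence is no longer statistically significant; that stopping rule is left out here.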

4.
Summary We present evidence supporting the idea that the DNA sequence in genes containing noncoding regions is correlated, and that the correlation is remarkably long range; indeed, base pairs thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the gene; we utilize this fact to build a Coding Sequence Finder algorithm, which uses statistical ideas to locate the coding regions of an unknown DNA sequence. We resolve the problem of the "non-stationarity" feature of the sequence of base pairs (that the relative concentration of purines and pyrimidines changes in different regions of the mosaic-like chain) by describing a new algorithm called Detrended Fluctuation Analysis (DFA). We address the claim of Voss that there is no difference in the statistical properties of coding and noncoding regions of DNA by systematically applying the DFA algorithm, as well as standard FFT analysis, to every DNA sequence (33 301 coding and 29 453 non-coding) in the entire GenBank database. We describe a simple model to account for the presence of long-range power law correlations (and the systematic variation of the scaling exponent α with evolution) which is based upon a generalization of the classic Lévy walk. Finally, we describe briefly some recent work showing that the noncoding sequences have certain statistical features in common with natural languages. Specifically, we adapt to DNA the Zipf approach to analyzing linguistic texts, and the Shannon approach to quantifying the "redundancy" of a linguistic text in terms of a measurable entropy function. We suggest that noncoding regions in eukaryotes may display a smaller entropy and larger redundancy than coding regions for plants and invertebrates, further supporting the possibility that noncoding regions of DNA may carry biological information. Paper presented at the I International Conference on Scaling Concepts and Complex Fluids, Copanello, Italy, July 4–8, 1994.
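For readers unfamiliar with Detrended Fluctuation Analysis, a minimal pure-Python sketch is given below. It assumes the usual purine/pyrimidine walk (+1 for C or T, -1 for A or G), non-overlapping boxes, and a straight-line fit in each box. The function names, box sizes and random test sequence are illustrative choices, not taken from the paper.

```python
import random
from math import log

PURINES = {"A", "G"}


def dna_walk(seq):
    """Cumulative walk: step +1 for a pyrimidine, -1 for a purine."""
    profile, position = [], 0
    for base in seq:
        position += -1 if base in PURINES else 1
        profile.append(position)
    return profile


def dfa_fluctuation(profile, box):
    """RMS deviation of the profile from a least-squares line fitted
    separately in each non-overlapping box of length `box` (box >= 2)."""
    n_boxes = len(profile) // box
    x_mean = (box - 1) / 2
    sxx = sum((x - x_mean) ** 2 for x in range(box))
    squared = 0.0
    for b in range(n_boxes):
        segment = profile[b * box:(b + 1) * box]
        y_mean = sum(segment) / box
        slope = sum((x - x_mean) * (y - y_mean)
                    for x, y in enumerate(segment)) / sxx
        squared += sum((y - y_mean - slope * (x - x_mean)) ** 2
                       for x, y in enumerate(segment))
    return (squared / (n_boxes * box)) ** 0.5


def dfa_exponent(profile, boxes):
    """Slope of log F(n) against log n by ordinary least squares."""
    xs = [log(n) for n in boxes]
    ys = [log(dfa_fluctuation(profile, n)) for n in boxes]
    x_mean, y_mean = sum(xs) / len(xs), sum(ys) / len(ys)
    return (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
            / sum((x - x_mean) ** 2 for x in xs))


# An uncorrelated random sequence should give an exponent near 0.5.
random.seed(0)
sequence = "".join(random.choice("ACGT") for _ in range(4000))
alpha = dfa_exponent(dna_walk(sequence), [8, 16, 32, 64, 128])
print(round(alpha, 2))
```

An exponent clearly above 0.5 on a real sequence would be read as a sign of long-range correlation in the sense used by the abstract. A perfectly linear profile gives F(n) = 0, so `dfa_exponent` cannot take its logarithm.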

5.
Fractals in DNA sequence analysis (cited 2 times in total: 0 self-citations, 2 by others)
喻祖国, Vo Anh, 龚志民, 龙顺湖. Chinese Physics (《中国物理》), 2002, 11(12): 1313-1318
Fractal methods have been successfully used to study many problems in physics, mathematics, engineering, finance, and even biology. There has been an increasing interest in unravelling the mysteries of DNA; for example, how to distinguish coding from noncoding sequences, and the classification and evolutionary relationships of organisms, are key problems in bioinformatics. Although much research has been carried out by taking into consideration the long-range correlations in DNA sequences, and the global fractal dimension has been used in these works by other people, the models and methods are somewhat rough and the results are not satisfactory. In recent years, our group has introduced a time series model (statistical point of view) and a visual representation (geometrical point of view) to DNA sequence analysis. We have also used fractal dimension, correlation dimension, the Hurst exponent and the dimension spectrum (multifractal analysis) to discuss problems in this field. In this paper, we introduce these fractal models and methods and the results of DNA sequence analysis.

6.
Linguistic features of noncoding DNA sequences (cited 9 times in total: 0 self-citations, 9 by others)
We extend the Zipf approach to analyzing linguistic texts to the statistical study of DNA base pair sequences and find that the noncoding regions are more similar to natural languages than the coding regions. We also adapt the Shannon approach to quantifying the "redundancy" of a linguistic text in terms of a measurable entropy function, and demonstrate that noncoding regions in eukaryotes display a smaller entropy and larger redundancy than coding regions, supporting the possibility that noncoding regions of DNA may carry biological information.
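The two measurements mentioned in this abstract can be illustrated with a short Python sketch. It treats overlapping n-tuples as "words", lists their counts by rank (the input to a Zipf plot), and computes a block-entropy redundancy R(n) = 1 - H(n) / (2n), where 2 bits per base is the maximum for a four-letter alphabet. The exact definitions used in the paper are not given in the abstract, so these formulas and the function names are assumptions.

```python
from collections import Counter
from math import log2


def ngram_counts(seq, n):
    """Count overlapping n-tuples ("words") in the sequence."""
    return Counter(seq[i:i + n] for i in range(len(seq) - n + 1))


def zipf_ranking(seq, n):
    """Word counts sorted from most to least frequent (rank 1, 2, ...)."""
    return [count for _, count in ngram_counts(seq, n).most_common()]


def redundancy(seq, n):
    """1 - H(n) / (2n), with H(n) the entropy of n-tuples in bits."""
    counts = ngram_counts(seq, n)
    total = sum(counts.values())
    h = -sum(c / total * log2(c / total) for c in counts.values())
    return 1 - h / (2 * n)


print(zipf_ranking("AAACAAAG", 1))
print(redundancy("ACGT" * 5, 1), redundancy("AAAAAAAA", 1))
```

Plotting the ranking on log-log axes and fitting a line would give a Zipf exponent. For short sequences the entropy estimate is biased low, so redundancy values from small samples should be read with caution.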

7.
We review evidence supporting the idea that the DNA sequence in genes containing noncoding regions is correlated, and that the correlation is remarkably long range--indeed, base pairs thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the gene, and utilize this fact to build a Coding Sequence Finder Algorithm, which uses statistical ideas to locate the coding regions of an unknown DNA sequence. Finally, we describe briefly some recent work adapting to DNA the Zipf approach to analyzing linguistic texts, and the Shannon approach to quantifying the "redundancy" of a linguistic text in terms of a measurable entropy function, and reporting that noncoding regions in eukaryotes display a larger redundancy than coding regions. Specifically, we consider the possibility that this result is solely a consequence of nucleotide concentration differences as first noted by Bonhoeffer and his collaborators. We find that cytosine-guanine (CG) concentration does have a strong "background" effect on redundancy. However, we find that for the purine-pyrimidine binary mapping rule, which is not affected by the difference in CG concentration, the Shannon redundancy for the set of analyzed sequences is larger for noncoding regions compared to coding regions.

8.
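罗辽复. Progress in Physics (《物理学进展》), 2011, 17(3): 320-346
Exploring the regularities of statistical correlations among nucleotides is the foundation of research on the genetic language. This paper reviews work in this field, with emphasis on the predominantly short-range character of nucleotide correlations, the analysis of informational parameters of DNA sequences, and the relationship between correlations and evolution. It also stresses the biological significance of nucleotide correlations and the possible biological applications of this research, including: constructing evolutionary trees; predicting protein secondary structure; seeking regularities in the preferred modes of base correlation; deriving the genetic-language differences between coding and noncoding sequences; discovering reading frames through correlation spectra and preferred modes; studying the relationship between nucleotide correlations and gene expression (expression-enhancement networks); and studying long-period correlations and the low-frequency behavior of the power spectrum.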

9.
Clustering and long-range correlations in the nucleotide sequences of different categories of organisms are discussed. Clustering, mostly observed in higher eucaryotes, can be found at different length scales in DNA and Central Limit Theorems are used as links between these length scales. Several dynamical, statistical, mean-field models are proposed based on biologically motivated dynamical mechanisms and they successfully reproduce both the short range behavior observed in coding DNA and the long range, out-of-equilibrium features of non-coding DNA. Such dynamical mechanisms include aggregation of oligonucleotides, influx and DNA length reduction schemes, transpositions, and fusions of large DNA macromolecules. Fractality can be inferred from the short and long range correlations observed in the sequence structure of higher eucaryotes, where the non-coding part is relatively extended. In these organisms the DNA coding/non-coding alternation has the characteristics of finite size, fractal, random sets.

10.
The acoustic environment of the bottlenose dolphin often consists of noise where energy across frequency regions is coherently modulated in time (e.g., ambient noise from snapping shrimp). However, most masking studies with dolphins have employed random Gaussian noise for estimating patterns of masked thresholds. The current study demonstrates a pattern of masking where temporally fluctuating comodulated noise produces lower masked thresholds (up to a 17 dB difference) compared to Gaussian noise of the same spectral density level. Noise possessing wide bandwidths, low temporal modulation rates, and across-frequency temporal envelope coherency resulted in lower masked thresholds, a phenomenon known as comodulation masking release. The results are consistent with a model where dolphins compare temporal envelope information across auditory filters to aid in signal detection. Furthermore, results suggest conventional models of masking derived from experiments using random Gaussian noise may not generalize well to environmental noise that dolphins actually encounter.

11.
Scaling in nature: from DNA through heartbeats to weather. (cited 1 time in total: 0 self-citations, 1 by others)
The purpose of this report is to describe some recent progress in applying scaling concepts to various systems in nature. We review several systems characterized by scaling laws such as DNA sequences, heartbeat rates and weather variations. We discuss the finding that the exponent alpha quantifying the scaling in DNA is smaller for coding than for noncoding sequences. We also discuss the application of fractal scaling analysis to the dynamics of heartbeat regulation, and report the recent finding that the scaling exponent alpha is smaller during sleep periods compared to wake periods. We also discuss the recent findings that suggest a universal scaling exponent characterizing the weather fluctuations.

12.
We use a tight-binding formulation to investigate the transmissivity and the current-voltage (I-V) characteristics of sequences of double-strand DNA molecules. In order to reveal the relevance of the underlying correlations in the nucleotide distribution, we compare the results for the genomic DNA sequence with those of artificial sequences (the long-range correlated Fibonacci and Rudin-Shapiro ones) and a random sequence, which is a kind of prototype of a short-range correlated system. The random sequence is presented here with the same first-neighbor pair correlations as the human DNA sequence. We found that the long-range character of the correlations is important to the transmissivity spectra, although the I-V curves seem to be mostly influenced by the short-range correlations.

13.
This study demonstrates a new possibility of estimating intelligibility of speech in informational maskers. The temporal and spectral properties of sound maskers are investigated to achieve acoustic privacy in public spaces. Speech intelligibility (SI) tests were conducted using Japanese sentences in daily use for energy (white noise) or informational (reversed speech) maskers. We found that the masking effects including informational masking on SI might not be estimated by analyzing the narrow-band temporal envelopes, which is a common way of predicting SI under noisy conditions. The masking effects might instead be visualized by spectral auto-correlation analysis on a frame-by-frame basis, for the series of dominant-spectral peaks of the masked target in the frequency domain. Consequently, we found that dissimilarity in frame-based spectral-auto-correlation sequences between the original and masked targets was the key to evaluating maskers including informational masking effects on SI.

14.
Chaos game representation (CGR)-walk model for DNA sequences (cited 1 time in total: 0 self-citations, 1 by others)
高洁, 徐振源. Chinese Physics B (《中国物理 B》), 2009, 18(1): 370-376
Chaos game representation (CGR) is an iterative mapping technique that processes sequences of units, such as nucleotides in a DNA sequence or amino acids in a protein, in order to determine the coordinates of their positions in a continuous space. This distribution of positions has two features: it is unique, and the source sequence can be recovered from the coordinates, so that the distance between positions may serve as a measure of similarity between the corresponding sequences. A CGR-walk model is proposed based on CGR coordinates for the DNA sequences. The CGR coordinates are converted into a time series, and a long-memory ARFIMA (p, d, q) model, where ARFIMA stands for autoregressive fractionally integrated moving average, is introduced into the DNA sequence analysis. This model is applied to simulating real CGR-walk sequence data of ten genomic sequences. Remarkably long-range correlations are uncovered in the data, and the results from these models are reasonably fitted with those from the ARFIMA (p, d, q) model.
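The iterative mapping and the recoverability property described in this abstract can be sketched as follows. The corner assignment (A, C, G, T on the unit square) and the starting point at the centre follow a common convention and are assumptions here, as are the function names. The conversion of coordinates into a CGR-walk time series is specific to the paper and is not reproduced.

```python
# Assumed corner layout of the unit square, one corner per nucleotide.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}
BASE_AT = {corner: base for base, corner in CORNERS.items()}


def cgr_points(seq, start=(0.5, 0.5)):
    """Each new point lies halfway between the previous point and the
    corner assigned to the current base."""
    x, y = start
    points = []
    for base in seq:
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


def cgr_decode(point, length):
    """Recover the last `length` bases from a single CGR point by reading
    off the quadrant and undoing one halving step at a time."""
    x, y = point
    bases = []
    for _ in range(length):
        cx = 1.0 if x > 0.5 else 0.0
        cy = 1.0 if y > 0.5 else 0.0
        bases.append(BASE_AT[(cx, cy)])
        x, y = 2 * x - cx, 2 * y - cy
    return "".join(reversed(bases))


points = cgr_points("GATTACA")
print(points[-1], cgr_decode(points[-1], 7))
```

With double-precision floats the decoding is only reliable for roughly the last fifty bases, because each step consumes one bit of the coordinate.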

15.
Detection was measured for a 500 Hz tone masked by noise (an "energetic" masker) or sets of ten randomly drawn tones (an "informational" masker). Presenting the maskers diotically and the target tone with a variety of interaural differences (interaural amplitude ratios and/or interaural time delays) resulted in reduced detection thresholds relative to when the target was presented diotically ("binaural release from masking"). Thresholds observed when time and amplitude differences applied to the target were "reinforcing" (favored the same ear, resulting in a lateralized position for the target) were not significantly different from thresholds obtained when differences were "opposing" (favored opposite ears, resulting in a centered position for the target). This irrelevance of differences in the perceived location of the target is a classic result for energetic maskers but had not previously been shown for informational maskers. However, this parallelism between the patterns of binaural release for energetic and informational maskers was not accompanied by high correlations between the patterns for individual listeners, supporting the idea that the mechanisms for binaural release from energetic and informational masking are fundamentally different.

16.
17.
The purpose of this opening talk is to describe examples of recent progress in applying statistical mechanics to biological systems. We first briefly review several biological systems, and then focus on the fractal features characterized by the long-range correlations found recently in DNA sequences containing non-coding material. We discuss the evidence supporting the finding that for sequences containing only coding regions, there are no long-range correlations. We also discuss the recent finding that the exponent alpha characterizing the long-range correlations increases with evolution, and we discuss two related models, the insertion model and the insertion-deletion model, that may account for the presence of long-range correlations. Finally, we summarize the analysis of long-term data on human heartbeats (up to 10^4 heart beats) that supports the possibility that the successive increments in the cardiac beat-to-beat intervals of healthy subjects display scale-invariant, long-range "anti-correlations" (a tendency to beat faster is balanced by a tendency to beat slower later on). In contrast, for a group of subjects with severe heart disease, long-range correlations vanish. This finding suggests that the classical theory of homeostasis, according to which stable physiological processes seek to maintain "constancy," should be extended to account for this type of dynamical, far from equilibrium, behavior.

18.
Illusory continuity of tonal and infratonal periodic sounds (cited 2 times in total: 0 self-citations, 2 by others)
Temporal induction can restore masked or obliterated portions of signals so that tones may seem continuous when alternated with sounds having appropriate spectral composition and intensity. The upper intensity limits for the induction of tones (pulsation thresholds) are related to masking functions and have been used to define the characteristics of frequency domain (place) analysis of tones. The present study has found that induction also occurs for infratonal periodic sounds that require a time domain analysis for perception of acoustic repetition. Limits for temporal induction were determined for iterated frozen noise segments from 10-2000 Hz alternated with a louder on-line noise. Masked thresholds were also obtained for the pulsed signals presented along with continuous noise, and it was found that the relation between induction limits and masking changed with frequency. The results obtained for induction and masking are discussed in terms of general principles governing restoration of obliterated sounds.

19.
Clustering and long-range correlations in the nucleotide sequences of different categories of organisms are studied. As a result of clustering, the size distribution of coding and non-coding DNA regions is estimated analytically using the Generalised Central Limit Theorem. The alternation of coding regions (which follow a short range size distribution) with non-coding regions (which follow a long range size distribution in higher organisms) leads to DNA structures which have a striking resemblance to random Cantor Fractals. For lower organisms (such as viruses, procaryotes etc.) long-range correlations are sporadically observed and the DNA structures do not present fractality. Statistical models are proposed based on biologically motivated dynamical mechanisms (such as aggregation of oligonucleotides, influx and DNA length reduction), which can account for the above statistical features.

20.
By using Hurst's analysis and transfer approach, the rescaled range functions and Hurst exponents of human chromosome 22 and enterobacteria phage lambda DNA sequences are investigated and the transmission coefficients, Landauer resistances and Lyapunov coefficients of finite segments based on above genomic DNA sequences are calculated. In a comparison with quasiperiodic and random artificial DNA sequences, we find that λ-DNA exhibits anticorrelation behavior characterized by a Hurst exponent 0.5<H<1 while, as far as the segments selected in our Letter are concerned, Ch22 sequence displays a transition from correlation behavior to anticorrelation behavior. The resonant peaks of the transmission coefficient in genomic sequences can survive in longer sequence length than in random sequences but in shorter sequence length than in quasiperiodic sequences. It is shown that the genomic sequences have long-range correlation properties to some extent but the correlations are not strong enough to maintain the scale invariance properties.
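The rescaled range function mentioned in this abstract can be illustrated with a small pure-Python sketch. It computes R/S for one window, then estimates a Hurst exponent as the slope of log(R/S) against log(window size) over non-overlapping windows. The numeric mapping of bases, the window sizes and the function names are assumptions; the paper's own procedure may differ in detail.

```python
from math import log


def rescaled_range(window):
    """Range of the cumulative mean-adjusted sum divided by the standard
    deviation of the window (raises ZeroDivisionError if it is constant)."""
    n = len(window)
    mean = sum(window) / n
    cumulative, lowest, highest = 0.0, 0.0, 0.0
    for value in window:
        cumulative += value - mean
        lowest, highest = min(lowest, cumulative), max(highest, cumulative)
    std = (sum((v - mean) ** 2 for v in window) / n) ** 0.5
    return (highest - lowest) / std


def hurst_exponent(series, sizes):
    """Least-squares slope of log(mean R/S) against log(window size)."""
    xs, ys = [], []
    for size in sizes:
        windows = [series[i:i + size]
                   for i in range(0, len(series) - size + 1, size)]
        mean_rs = sum(rescaled_range(w) for w in windows) / len(windows)
        xs.append(log(size))
        ys.append(log(mean_rs))
    x_mean, y_mean = sum(xs) / len(xs), sum(ys) / len(ys)
    return (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
            / sum((x - x_mean) ** 2 for x in xs))


# Assumed mapping: pyrimidines (C, T) -> +1, purines (A, G) -> -1.
series = [1 if base in "CT" else -1 for base in "CA" * 64]
print(hurst_exponent(series, [4, 8, 16, 32]))
```

The strictly alternating toy series has R/S equal to 1 at every window size, so the fitted slope is zero. Real sequences need much longer windows, and small-window R/S estimates are known to be biased.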
