Similar literature
 20 similar articles found (search time: 390 ms)
1.
Linguistic features of noncoding DNA sequences   Total citations: 9, self-citations: 0, citations by others: 9
We extend the Zipf approach to analyzing linguistic texts to the statistical study of DNA base pair sequences and find that the noncoding regions are more similar to natural languages than the coding regions. We also adapt the Shannon approach to quantifying the "redundancy" of a linguistic text in terms of a measurable entropy function, and demonstrate that noncoding regions in eukaryotes display a smaller entropy and larger redundancy than coding regions, supporting the possibility that noncoding regions of DNA may carry biological information.
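The Zipf and Shannon measures mentioned in this abstract can be illustrated with a minimal sketch. It assumes overlapping n-tuples as the "words" and defines redundancy as 1 - H/H_max with H_max = 2n bits for a four-letter alphabet; the function names are hypothetical and the paper's exact definitions may differ.

```python
from collections import Counter
from math import log2


def ngram_counts(seq: str, n: int) -> Counter:
    """Count overlapping n-tuples ("words") in the sequence."""
    return Counter(seq[i:i + n] for i in range(len(seq) - n + 1))


def zipf_ranking(seq: str, n: int):
    """Words sorted by decreasing frequency, as used in a Zipf rank plot."""
    return ngram_counts(seq, n).most_common()


def ngram_entropy(seq: str, n: int) -> float:
    """Shannon entropy (in bits) of the n-tuple frequency distribution."""
    counts = ngram_counts(seq, n)
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())


def redundancy(seq: str, n: int) -> float:
    """Assumed definition: 1 - H / H_max, where H_max = 2n bits for 4 bases."""
    return 1.0 - ngram_entropy(seq, n) / (2 * n)
```

A sequence that uses all four bases equally often has maximal single-base entropy and zero redundancy under this definition, while a homopolymer has zero entropy and redundancy 1.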

2.
Summary: We present evidence supporting the idea that the DNA sequence in genes containing noncoding regions is correlated, and that the correlation is remarkably long range: indeed, base pairs thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the gene; we utilize this fact to build a Coding Sequence Finder algorithm, which uses statistical ideas to locate the coding regions of an unknown DNA sequence. We resolve the problem of the "non-stationarity" feature of the sequence of base pairs (that the relative concentration of purines and pyrimidines changes in different regions of the mosaic-like chain) by describing a new algorithm called Detrended Fluctuation Analysis (DFA). We address the claim of Voss that there is no difference in the statistical properties of coding and noncoding regions of DNA by systematically applying the DFA algorithm, as well as standard FFT analysis, to every DNA sequence (33 301 coding and 29 453 non-coding) in the entire GenBank database. We describe a simple model to account for the presence of long-range power law correlations (and the systematic variation of the scaling exponent α with evolution) which is based upon a generalization of the classic Lévy walk. Finally, we describe briefly some recent work showing that the noncoding sequences have certain statistical features in common with natural languages. Specifically, we adapt to DNA the Zipf approach to analyzing linguistic texts, and the Shannon approach to quantifying the "redundancy" of a linguistic text in terms of a measurable entropy function. We suggest that noncoding regions in eukaryotes may display a smaller entropy and larger redundancy than coding regions for plants and invertebrates, further supporting the possibility that noncoding regions of DNA may carry biological information. Paper presented at the I International Conference on Scaling Concepts and Complex Fluids, Copanello, Italy, July 4–8, 1994.
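The DFA procedure named in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: purines (A, G) step the walk by +1 and pyrimidines (C, T) by -1, boxes are non-overlapping, the local trend is a least-squares line, and leftover bases at the end are ignored. The function names are hypothetical.

```python
from math import sqrt

PURINES = {"A", "G"}


def dna_walk(seq: str):
    """Cumulative walk: +1 for a purine (A, G), -1 for a pyrimidine (C, T)."""
    walk, total = [], 0
    for base in seq:
        total += 1 if base in PURINES else -1
        walk.append(total)
    return walk


def dfa_fluctuation(seq: str, box: int) -> float:
    """RMS fluctuation of the walk around a linear trend fitted in each box."""
    y = dna_walk(seq)
    n_boxes = len(y) // box
    if box < 2 or n_boxes == 0:
        raise ValueError("box must be >= 2 and no longer than the sequence")
    xs = range(box)
    x_mean = (box - 1) / 2
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sq_sum = 0.0
    for b in range(n_boxes):
        seg = y[b * box:(b + 1) * box]
        y_mean = sum(seg) / box
        # Least-squares slope of the local trend inside this box.
        slope = sum((x - x_mean) * (v - y_mean) for x, v in zip(xs, seg)) / sxx
        # Accumulate squared residuals after removing the local trend.
        sq_sum += sum((v - (y_mean + slope * (x - x_mean))) ** 2
                      for x, v in zip(xs, seg))
    return sqrt(sq_sum / (n_boxes * box))
```

Evaluating `dfa_fluctuation` over a range of box sizes and fitting the slope of log F against log box size gives an estimate of the scaling exponent α.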

3.
Spectrum of the Micromaser with Kerr Medium   Total citations: 3, self-citations: 0, citations by others: 3
We have established the master equation for the field density operator of the micromaser with a Kerr medium, studied the spectrum of the micromaser with a Kerr medium, and analyzed the influence of the Kerr effect and the detuning on the spectrum. In the thermal-atom regime, we find that the Kerr effect broadens the linewidth D and increases the frequency shift S, and that the detuning Δ on the whole narrows the linewidth D and increases the frequency shift S. Moreover, the Kerr effect leads to more rapid oscillations in the resonance peaks, which means that it causes quantum noise. On the whole, as the cavity length L increases, the linewidth D and the frequency shift S gradually increase.

4.
Evidence seems to show that coding DNA is more random than noncoding DNA, but other conflicting evidence also exists. Based on the third-base degeneracy of codons, we regard the third position of codons as a "noisy" position. By deleting one fixed position of non-overlapping triplets in a given sequence, three masked sequences may be deduced from the sequence. We have investigated the block-to-site mutual information functions of coding and noncoding sequences in yeast without and with the masking. Characteristics that distinguish coding from noncoding DNA have been found. It is observed that the strong correlations in the coding regions may be blocked by the third base of codons, and the proper masking can extract the correlations. Distribution of dimeric tandem repeats of unmasked sequences is also compared with that of masked sequences.
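The masking step described here (deleting one fixed position of each non-overlapping triplet) can be sketched as below. The function names are hypothetical, positions are counted from 0, and dropping a trailing incomplete triplet is an assumption of this sketch rather than something the abstract states.

```python
def mask_codon_position(seq: str, position: int) -> str:
    """Delete the base at `position` (0, 1 or 2) of every non-overlapping triplet."""
    if position not in (0, 1, 2):
        raise ValueError("position must be 0, 1 or 2")
    usable = len(seq) - len(seq) % 3  # assumption: ignore an incomplete last triplet
    return "".join(base for i, base in enumerate(seq[:usable])
                   if i % 3 != position)


def masked_sequences(seq: str):
    """The three masked sequences obtained by deleting position 0, 1 or 2."""
    return [mask_codon_position(seq, p) for p in range(3)]
```

For a sequence read in frame, `mask_codon_position(seq, 2)` removes the third codon base, which the abstract treats as the "noisy" position.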

5.
We present a new computational approach to finding borders between coding and noncoding DNA. This approach has two features: (i) DNA sequences are described by a 12-letter alphabet that captures the differential base composition at each codon position, and (ii) the search for the borders is carried out by means of an entropic segmentation method which uses only the general statistical properties of coding DNA. We find that this method is highly accurate in finding borders between coding and noncoding regions and requires no "prior training" on known data sets. Our results appear to be more accurate than those obtained with moving windows in the discrimination of coding from noncoding DNA.
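The two features listed in this abstract can be illustrated with a rough sketch. It assumes that the 12-letter alphabet labels each base with its position modulo 3, and that the segmentation criterion is the Jensen-Shannon divergence between the two sides of a candidate cut. The abstract does not name the divergence, so that choice, the function names, and the single-cut search are assumptions; a full method would also recurse and test significance.

```python
from collections import Counter
from math import log2


def to_12_letter(seq: str, frame: int = 0):
    """Label each base with its position modulo 3, e.g. 'A0' (4 x 3 = 12 symbols)."""
    return [f"{base}{(i - frame) % 3}" for i, base in enumerate(seq)]


def entropy(symbols) -> float:
    """Shannon entropy (bits) of the symbol frequencies."""
    total = len(symbols)
    return -sum((c / total) * log2(c / total)
                for c in Counter(symbols).values())


def best_border(seq: str):
    """Return (cut, divergence) for the cut maximizing the Jensen-Shannon divergence."""
    symbols = to_12_letter(seq)
    n = len(symbols)
    h_all = entropy(symbols)
    best_cut, best_js = None, -1.0
    for cut in range(1, n):  # naive O(n^2) scan, fine for a short illustration
        left, right = symbols[:cut], symbols[cut:]
        js = (h_all - (cut / n) * entropy(left)
              - ((n - cut) / n) * entropy(right))
        if js > best_js:
            best_cut, best_js = cut, js
    return best_cut, best_js
```

On a toy sequence made of two compositionally distinct halves, the maximum falls at the junction between them.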

6.
We present a theoretical framework for the thermodynamic properties of supercoiling-induced denaturation bubbles in circular double-stranded DNA molecules. We explore how DNA supercoiling, ambient salt concentration, and sequence heterogeneity affect bubble occurrence. The probability distribution for finding multiple bubbles is derived analytically, and its relevance for supercoiled DNA is discussed. We show that in vivo sustained DNA bubbles are likely to occur due to partial twist release in regions rich in weaker AT base pairs. Single DNA plasmid imaging experiments clearly demonstrate the existence of bubbles in free solution.

7.
We use the wavelet transform to investigate the fractal scaling properties of coding and noncoding human DNA sequences. We find that the strength of the long-range correlations observed in the introns increases with the guanine-cytosine (GC) content, while coding sequences show no such correlations at any GC content. However, we demonstrate that long-range correlations can be detected when the coding sequences are undersampled by retaining the third base of each codon only. This strongly suggests that the observed correlations are not likely to be due to insertion-deletion mechanisms. We comment about the origin of these correlations in terms of putative dynamical processes that could produce the isochore structure of the human genome. Received: 18 August 1997 / Accepted: 29 October 1997

8.
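Luo Liaofu. Progress in Physics (物理学进展), 2011, 17(3): 320-346
Exploring the regularities of statistical correlations among nucleotides is the foundation of research on the genetic language. This paper reviews work in this field, focusing on the short-range dominance of nucleotide correlations, the information-parameter analysis of DNA sequences, and the dependence of the correlations on evolution. It also emphasizes the biological significance of nucleotide correlations and the possible biological applications of this research, including: constructing evolutionary trees, predicting protein secondary structure, finding regularities in the preferred modes of base correlation, deriving the genetic-language differences between coding and noncoding sequences, identifying reading frames from correlation spectra and preferred modes, studying the relationship between nucleotide correlations and gene expression (expression-enhancement networks), and studying long-period correlations and the low-frequency behavior of the power spectrum.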

9.
Recent experiments indicate that double-stranded DNA molecules of approximately 100 base pairs in length have a probability of cyclization which is up to 10^5 times larger than that expected based on the known bending modulus of the double helix. We argue that for short molecules, the formation of a few base pairs of single-stranded DNA can provide a "flexible hinge" that facilitates loop formation. A detailed calculation shows that this mechanism explains the experimental data.

10.
We obtain, using transfer-matrix methods, the distribution function P(R) of the end-to-end distance, the loop formation probability, and force-extension relations in a model for short double-stranded DNA molecules. Accounting for the appearance of "bubbles," localized regions of enhanced flexibility associated with the opening of a few base pairs of double-stranded DNA in thermal equilibrium, leads to dramatic changes in P(R) and unusual force-extension curves. An analytic formula for the loop formation probability in the presence of bubbles is proposed. For short heterogeneous chains, we demonstrate a strong dependence of loop formation probabilities on sequence.

11.
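Ma Songshan, Xu Hui, Liu Xiaoliang, Guo Aimin. Acta Physica Sinica (物理学报), 2006, 55(6): 3170-3174
Within the single-electron tight-binding approximation, a one-dimensional disordered binary model of the DNA molecular chain is established. The electronic density of states and the localization properties are calculated for a DNA chain of 2×10^4 base pairs, and the effects of the base-pair composition and of the degree of on-site energy disorder on the localized electronic states are examined. The results show that, because of the on-site energy disorder and the differing base-pair composition in the DNA chain, the electronic wave functions are localized, and that the localization length, as a measure of the degree of electron localization, depends on both the base-pair composition and the degree of on-site energy disorder. Keywords: DNA molecular chain; electronic structure; localized electronic states; localization length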

12.
We used Raman spectroscopy to study the conformational changes of DNA induced by Cd²⁺ ions in solutions of different Cd²⁺ concentrations. The experimental results show that when the Cd²⁺/PO₂⁻ ratio R increased from 0 to 3.0, the band at 835.0 cm⁻¹ shifted by about 8 cm⁻¹, and the overlapping bands at 1446.0 and 1461.0 cm⁻¹ separated and moved to 1441.0 and 1458.0 cm⁻¹, respectively. This indicates that the conformation of DNA has changed from a "normal" B-form to a "modified" B'-form. At the same time, changes in other bands demonstrate that part of the base stacking collapses and some hydrogen bonds between A and T are disrupted; AT base pairs are damaged more severely than GC base pairs.

13.
Physica A, 2006, 371(2): 157-170
We study the effects of an external periodic perturbation on a Poisson rate process, with special attention to the perturbation-induced sojourn-time patterns. We show that these patterns correspond to turning a memory-less sequence into a sequence with memory. The memory effects are stronger the slower the perturbation. The adoption of a de-trending technique, applied with no caution, might generate the impression that no fluctuation–periodicity correlation exists. We find that this is due to the fact that the perturbation-induced memory is a global property and that the result of a local in time analysis would not find any memory effect, insofar as the process under study is locally a Poisson process. We find that an efficient way to detect this memory effect is to analyze the moduli of the de-trended sequence. We turn the sequence to analyze into a diffusion process, and we evaluate the Shannon entropy of the resulting diffusion process. We find that both the original sequence and the suitably processed de-trended sequence yield the same dependence of entropy on time, namely, an initial scaling larger than ordinary scaling, and a sequel of weak oscillations, which are a clear signature of the external perturbation, in both cases. This is a clear indication of the fluctuation–periodicity correlation.

14.
The translocation of structured RNA or DNA molecules through narrow pores necessitates the opening of all base pairs. Here, we study the interplay between the dynamics of translocation and base pairing theoretically, using kinetic Monte Carlo simulations and analytical methods. We find that the transient formation of base pairs that do not occur in the ground state can significantly speed up translocation.

15.
Using a simple computational procedure, we examine DNA chains from different species in order to prove their nonlinear deterministic structures. This procedure applies a nonlinear modeling technique based upon quantitative comparison of the neighborhoods from similar DNA subsegments of size d. Our results reveal that noncoding regions exhibit a deterministic signature at sizes larger than a characteristic dimension d_c. Applications to evolutionary categories and recognition of different DNA regions are discussed.

16.
By performing a comprehensive study on 1832 segments of 1212 complete genomes of viruses, we show that in viral genomes the hairpin structures of thermodynamically predicted RNA secondary structures are more abundant than expected under a simple random null hypothesis. The detected hairpin structures are present both in coding and in noncoding regions for the four groups of viruses categorized as dsDNA, dsRNA, ssDNA and ssRNA. For all groups, hairpin structures are detected more frequently than expected under a random null hypothesis in noncoding rather than in coding regions. However, potential RNA secondary structures are also present in coding regions of the dsDNA group. In fact, we detect evolutionarily conserved RNA secondary structures in conserved coding and noncoding regions of a large set of complete genomes of dsDNA herpesviruses.

17.
One of the important steps in the annotation of genomes is the identification of regions in the genome which code for proteins. Most annotation approaches rely on signals extracted from genomic regions that can be used to identify whether a region is a protein coding region. Motivated by the fact that these regions are information-bearing structures, we propose signals based on measures motivated by the average mutual information for use in this task. We show that these signals can be used to identify coding and noncoding sequences with high accuracy. We also show that these signals are robust across species, phyla, and kingdoms and can, therefore, be used in species-agnostic genome annotation algorithms for identifying protein coding regions. These in turn could be used for gene identification.
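As background for the signals mentioned in this abstract, the sketch below computes the plain mutual information between bases separated by a lag k, estimated from pair frequencies. It is only an illustration: the abstract does not specify the signals themselves, and the function name and estimator are assumptions.

```python
from collections import Counter
from math import log2


def mutual_information(seq: str, k: int) -> float:
    """Mutual information (bits) between bases k positions apart."""
    pairs = [(seq[i], seq[i + k]) for i in range(len(seq) - k)]
    if not pairs:
        raise ValueError("sequence is too short for this lag")
    total = len(pairs)
    joint = Counter(pairs)
    first = Counter(x for x, _ in pairs)   # marginal of the leading base
    second = Counter(y for _, y in pairs)  # marginal of the trailing base
    # p(x,y) / (p(x) p(y)) simplifies to c * total / (first[x] * second[y]).
    return sum((c / total) * log2(c * total / (first[x] * second[y]))
               for (x, y), c in joint.items())
```

Coding regions are often reported to show a period-3 pattern in this quantity as k varies, which is one reason lag-based information measures are candidates for coding-region signals.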

18.
We report on a study of the interactions between holes and molecular vibrations on dry DNA using photoinduced infrared absorption spectroscopy. Laser photoexcited holes are found to have a room-temperature lifetime τ in excess of 1 ms, clearly indicating the presence of localization. However, from a quantitative model analysis of the frequency shifts of vibrational modes caused by the holes, we find the hole-vibrational coupling constant to be relatively small, λ ≈ 0.2. This interaction leads to a change in the conformational energy of ΔE₀ ≈ 0.015 eV, which is too small to cause self-trapping at room temperature. We conclude that, at least in the dry (A) form, DNA is best understood in terms of a double chain of coupled quantum dots arising from the pseudorandom chain sequence of base pairs, in which Anderson localization prevents the formation of a metallic state.

19.
Pairing of DNA fragments with homologous sequences occurs in gene shuffling, DNA repair, and other vital processes. While the chemical individuality of base pairs is hidden inside the double helix, X-ray and NMR studies have revealed sequence-dependent modulation of the structure of the DNA backbone. Here we show that the resulting modulation of the DNA surface charge pattern enables duplexes longer than approximately 50 base pairs to recognize sequence homology electrostatically at a distance of up to several water layers. This may explain the local recognition observed in pairing of homologous chromosomes and the observed length dependence of homologous recombination.

20.
Fractals in DNA sequence analysis   Total citations: 2, self-citations: 0, citations by others: 2
Yu Zuguo, Vo Anh, Gong Zhimin, Long Shunhu. Chinese Physics (中国物理), 2002, 11(12): 1313-1318
Fractal methods have been successfully used to study many problems in physics, mathematics, engineering, finance, and even biology. There has been increasing interest in unravelling the mysteries of DNA; for example, how to distinguish coding from noncoding sequences, and how to classify organisms and determine their evolutionary relationships, are key problems in bioinformatics. Although much research has taken the long-range correlations in DNA sequences into consideration, and the global fractal dimension has been used in these works by other authors, the models and methods are somewhat rough and the results are not satisfactory. In recent years, our group has introduced a time series model (a statistical point of view) and a visual representation (a geometrical point of view) to DNA sequence analysis. We have also used the fractal dimension, the correlation dimension, the Hurst exponent and the dimension spectrum (multifractal analysis) to discuss problems in this field. In this paper, we introduce these fractal models and methods and the results of DNA sequence analysis.
