Chaos game representation (CGR)-walk model for DNA sequences
Cite this article: Gao Jie, Xu Zhen-Yuan. Chaos game representation (CGR)-walk model for DNA sequences[J]. Chinese Physics B, 2009, 18(1): 370-376. DOI: 10.1088/1674-1056/18/1/060
Authors: Gao Jie and Xu Zhen-Yuan
Affiliation: (1) School of Science, Jiangnan University, Wuxi 214122, China; School of Information Technology, Jiangnan University, Wuxi 214122, China; (2) School of Science, Jiangnan University, Wuxi 214122, China
Funding: Project supported by the National Natural Science Foundation of China (Grant No 60575038) and the Natural Science Foundation of Jiangnan University, China (Grant No 20070365).
Abstract: Chaos game representation (CGR) is an iterative mapping technique that processes sequences of units, such as nucleotides in a DNA sequence or amino acids in a protein, in order to determine the coordinates of their positions in a continuous space. This distribution of positions has two features: it is unique, and the source sequence can be recovered from the coordinates, so that the distance between positions may serve as a measure of similarity between the corresponding sequences. A CGR-walk model is proposed based on the CGR coordinates of DNA sequences. The CGR coordinates are converted into a time series, and a long-memory ARFIMA(p, d, q) model, where ARFIMA stands for autoregressive fractionally integrated moving average, is introduced into the DNA sequence analysis. This model is applied to simulate real CGR-walk sequence data from ten genomic sequences. Remarkably long-range correlations are uncovered in the data, and the results are fitted reasonably well by the ARFIMA(p, d, q) model.
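
The iterative mapping and the recoverability property described in the abstract can be illustrated with a minimal sketch. It assumes the commonly used CGR layout (corners A = (0, 0), C = (0, 1), G = (1, 1), T = (1, 0), starting point (1/2, 1/2), each step moving halfway toward the corner of the current nucleotide); the abstract itself does not state which layout the authors use, and the function names `cgr_coordinates` and `recover_sequence` are hypothetical. The conversion of coordinates into the CGR-walk time series is not specified in the abstract and is not attempted here.

```python
from fractions import Fraction

# Assumed corner layout (a common convention; not stated in the abstract).
CORNERS = {
    "A": (Fraction(0), Fraction(0)),
    "C": (Fraction(0), Fraction(1)),
    "G": (Fraction(1), Fraction(1)),
    "T": (Fraction(1), Fraction(0)),
}
START = (Fraction(1, 2), Fraction(1, 2))  # assumed starting point


def cgr_coordinates(seq):
    """Return one CGR point per nucleotide, using exact rational arithmetic."""
    x, y = START
    points = []
    for base in seq.upper():
        cx, cy = CORNERS[base]
        # Move halfway from the current point toward the nucleotide's corner.
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


def recover_sequence(point, length):
    """Read a sequence of known length back from its final CGR point."""
    x, y = point
    half = Fraction(1, 2)
    bases = []
    for _ in range(length):
        # The quadrant containing the point identifies the last nucleotide.
        cx = Fraction(1) if x > half else Fraction(0)
        cy = Fraction(1) if y > half else Fraction(0)
        bases.append(next(b for b, c in CORNERS.items() if c == (cx, cy)))
        # Undo the halving step to obtain the previous point.
        x, y = 2 * x - cx, 2 * y - cy
    return "".join(reversed(bases))


if __name__ == "__main__":
    seq = "GATTACA"
    final_point = cgr_coordinates(seq)[-1]
    print(final_point, recover_sequence(final_point, len(seq)))
```

Exact fractions are used instead of floats so that the inversion stays lossless for long sequences; with floating point, the low-order bits that encode early nucleotides would eventually be rounded away.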

Keywords: CGR-walk model; DNA sequence; long-memory; ARFIMA(p, d, q) model
Received: 2008-04-24
