Similar literature
 20 similar documents found (search time: 31 ms)
1.
BACKGROUND: Given a large sequence fragment or a set of functionally related sequences, we consider two sequence-analysis problems. The first is to measure sequence complexity (repetitiveness, compactness) in order to estimate how informative the set as a whole is. Usually the obtained measure is compared with an appropriate random background calculated by permuting the given sequences. We propose a novel and effective approach to background information measurement that replaces the usual sequence reshuffling. The second problem is to detect a periodic bias and determine whether it is a feature of the set. Sequence periodicity, including hidden periodicity, is a very basic genomic property. A period of 3, which is considered to characterize coding sequences, and a period of 10-11, which may be due to the alternation of hydrophobic and hydrophilic amino acids, DNA curvature, and bendability, have been discovered and described. Searching for periodic biases has brought significant results in the study of sequence-dependent nucleosome positioning: nucleosomal sites carry a hidden period of about 10.4 bases. RESULTS: The calculated differences between genomic sequences and background showed the high biological relevance of the proposed method. Our algorithm was applied to several natural and artificial datasets. We constructed a simple "periodic" dataset by replacing every tenth dinucleotide in each sequence of a trial set with the same dinucleotide, "CC". We showed that the method reveals the introduced periodicity and that this periodic pattern carries more information than uninterrupted subsequences do. Applying the method to the nucleosomal dataset revealed a weak pseudo-periodicity of 10.4 nucleotides, confirming previous knowledge. Applying the method to Escherichia coli datasets revealed the well-known periodicity of 3 bp as a genic attribute, a secondary genic period slightly larger than 11 bp, and an intergenic period slightly smaller than 11 bp. CONCLUSIONS: We report a novel compositional complexity-based method for sequence analysis. We found that the difference between the sequence complexity of a natural sequence and that of the background is especially high for a set consisting exclusively of coding sequences. Hidden periodicities were found without any preliminary assumptions about the composition of the periodic elements. We illustrated the power of the method by studying sets with known weak periodic properties: a nucleosomal database and sets of different regions of E. coli. We showed that the method conveniently indicates all kinds of periodicity and related features in these sets of DNA sequences.
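
The artificial "periodic" dataset described in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' complexity-based method: the function names `plant_motif` and `pair_distance_counts` are hypothetical, the motif is assumed to be written at every tenth base position, and periodicity is checked with a plain count of distances between motif occurrences as a simple stand-in.

```python
import random
from collections import Counter


def plant_motif(seq, period=10, motif="CC"):
    """Overwrite the sequence with `motif` at positions 0, period, 2*period, ...

    Assumption: "every tenth dinucleotide" is read as one motif per `period` bases.
    """
    chars = list(seq)
    for start in range(0, len(chars) - len(motif) + 1, period):
        chars[start:start + len(motif)] = motif
    return "".join(chars)


def pair_distance_counts(seq, dinuc="CC", max_dist=30):
    """Count distances (up to max_dist) between all pairs of `dinuc` occurrences."""
    positions = [i for i in range(len(seq) - 1) if seq[i:i + 2] == dinuc]
    counts = Counter()
    for idx, a in enumerate(positions):
        for b in positions[idx + 1:]:
            if b - a > max_dist:
                break  # positions are sorted, so later ones are even farther away
            counts[b - a] += 1
    return counts


# Toy data: a random background sequence with the periodic motif planted in it.
rng = random.Random(0)
background = "".join(rng.choice("ACGT") for _ in range(2000))
periodic = plant_motif(background)
counts = pair_distance_counts(periodic)
```

In the planted sequence, distances that are multiples of 10 dominate `counts`, whereas the same count on `background` shows no preferred distance.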

2.
A method using capillary gel electrophoresis with laser-induced fluorescence detection is described that permits complete sequence determination of antisense DNA analogues of unknown sequence. The method, originally created as a tool to confirm the sequence of antisense oligonucleotides being developed as therapeutic drugs, uses data collected under a range of experimental conditions described by the Ogston model as applied to gel electrophoresis. A linear relationship, independent of experimental conditions, was observed between the relative electrophoretic migration time and the oligonucleotide base number. This relationship is shown to be consistent with a simplified version of the model and can be used to facilitate the sequence determination.

3.
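汪猷  孙小俭  钱诚  钱瑞卿  张伟君  顾天爵 《化学学报》 (Acta Chimica Sinica) 1988,46(11):1125-1133
Building on a preliminary computer-assisted carboxypeptidase determination of the C-terminal sequence of trichosanthin, two computer programs, the DPS program and the CPA program, were designed. Model experiments with small synthetic peptides and natural peptides showed that each of the two programs can satisfactorily extract important C-terminal sequence information from the kinetic curves of carboxypeptidase digestion. The C-terminal sequence of CB-3, an unknown peptide fragment of the trichosanthin molecule, was determined to be -SerAlaSerAlaLeuHserOH, and this sequence was later confirmed by other experimental results. The method not only extends C-terminal sequence determination to seven amino acids but also largely solves the problem of determining the C-terminal sequence of polypeptides or proteins that contain several kinds of repeatedly occurring amino acid residues.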

4.
Protein chains are generally long and consist of multiple domains. Domains are distinct structural units of a protein that can evolve and function independently. The accurate and reliable prediction of protein domain linkers and boundaries is often considered to be the initial step of protein tertiary structure and function predictions. In this paper, we introduce CISA as a method for predicting inter-domain linker regions solely from the amino acid sequence information. The method first computes the amino acid compositional index from the protein sequence dataset of domain-linker segments and the amino acid composition. A preference profile is then generated by calculating the average compositional index values along the amino acid sequence using a sliding window. Finally, the protein sequence is segmented into intervals and a simulated annealing algorithm is employed to enhance the prediction by finding the optimal threshold value for each segment that separates domains from inter-domain linkers. The method was tested on two standard protein datasets and showed considerable improvement over the state-of-the-art domain linker prediction methods.
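
The sliding-window step described in the abstract above can be sketched in Python under stated assumptions. The per-residue index values below are invented toy numbers rather than the published compositional index, low profile values are assumed to indicate linkers, and a single fixed threshold stands in for the per-segment thresholds that the abstract says are tuned by simulated annealing. The function names are hypothetical.

```python
def sliding_window_profile(seq, index, window=5):
    """Average a per-residue index over a centred window.

    Windows are truncated at the sequence ends; residues missing from
    `index` contribute 0.0.
    """
    half = window // 2
    values = [index.get(res, 0.0) for res in seq]
    profile = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        profile.append(sum(values[lo:hi]) / (hi - lo))
    return profile


def linker_segments(profile, threshold=0.0):
    """Return half-open (start, end) intervals where the profile is below threshold."""
    segments, start = [], None
    for i, value in enumerate(profile):
        if value < threshold and start is None:
            start = i
        elif value >= threshold and start is not None:
            segments.append((start, i))
            start = None
    if start is not None:
        segments.append((start, len(profile)))
    return segments


# Toy index (assumption): negative values mark linker-preferring residues.
toy_index = {"P": -1.0, "G": 1.0}
profile = sliding_window_profile("GGGPPPGGG", toy_index, window=3)
predicted = linker_segments(profile, threshold=0.0)
```

Here `predicted` is `[(3, 6)]`, the run of "P" residues in the toy sequence; a larger window smooths the profile at the cost of blurring the segment boundaries.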

5.
Zou B  Ma Y  Wu H  Zhou G 《The Analyst》2012,137(3):729-734
Detection of nucleic acids with signal amplification is preferable in clinical diagnosis. A novel approach was developed for signal amplification by coupling the invasive reaction with hyperbranched rolling circle amplification (HRCA). The invasive reaction relies not on specific recognition sequences in a target but on a specific structure formed when an upstream probe and a downstream probe bind specifically to a target DNA, and it can generate thousands of flaps from one target DNA. The flaps are then ligated with padlock probes to form circles, which serve as the templates for HRCA. Because the HRCA amplicon sequence is free of target DNA sequence, signal amplification is achieved. Because the flap sequence is the same for any target of interest, HRCA is universal, and the detection cost is hence greatly reduced. The detection limit of the proposed method is below 1 fM for artificial DNA targets, and its specificity is high enough to discriminate a one-base difference in the target sequence. The feasibility was verified by detecting real biological samples from HBV carriers, indicating that the method is highly sensitive and cost-effective and has a low risk of cross-contamination from amplicons. These properties should give the method great potential in clinical diagnosis.

6.
Poly(ether)urethane elastomers (PEUE) having different sequence distributions can be synthesized by the reaction of p-phenylene diisocyanate, poly(oxytetramethylene glycol), and hydrazine by four different routes. The degree of the sequence distribution of PEUE was determined by high-resolution NMR spectroscopy. The sequence distribution of PEUE synthesized by the prepolymer method in solvent (method 1) was found to coincide with the sequence distribution calculated from the reactivity ratio of two isocyanate groups in p-phenylene diisocyanate. On the other hand, the sequence distribution of PEUE obtained by the prepolymer method without solvent (method 2) was found to deviate from that expected from the reactivity ratio. The degree of the distribution of monomers in PEUE having the same composition ratio corresponded to the infrared absorbance ratio at 1720 and 1700 cm⁻¹.

7.
Sequence-controlled oligomers of cyclic imino ethers were synthesized by the one-pot multi-stage feeding method. Selective formation of the sequence was clearly demonstrated by equimolar reactions between the monomers and 1:1 adducts of an initiator with the monomers. Efficiency of the sequence control is determined by the difference of reactivities of the active ends. A new general prerequisite to synthesize a well-defined sequence by multi-stage oligomerization was proposed.

8.
A method is described for obtaining peptide fragments for sequence analysis from microquantities of proteins separated by 1- or 2-dimensional polyacrylamide gel electrophoresis. After separation by electrophoresis, the proteins were stained with Coomassie Blue and excised. Proteolytic digestion with trypsin was performed directly in the polyacrylamide matrix. The resulting peptide fragments were eluted, separated by reversed-phase HPLC, collected, and sequenced in a gas-phase sequencer. Excellent peptide recoveries allowed generation of extensive internal sequence information from picomole amounts of protein. The method thus overcomes the problem of obtaining amino acid sequence data from N-terminally blocked proteins and provides multiple, independent stretches of sequence that can be used to generate oligonucleotide probes for molecular cloning, to design synthetic peptides for inducing antibodies, and to search sequence databases for related proteins.

9.
A new method for determining the amino acid sequence of polypeptides begins with partial hydrolysis, which yields a complex mixture of oligopeptides. After derivatization to enhance its volatility, the mixture is analyzed by combined gas chromatography and mass spectrometry. The sequence of the polypeptide is then established by computer from the identified oligopeptides. So far, polypeptides of up to 40 amino acids have been analyzed by this method. The advantages and disadvantages of the new method compared with the stepwise Edman degradation are considered. Since the two methods are based on fundamentally different principles, they may prove to be complementary.
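
The abstract above does not say how the computer derives the sequence from the identified oligopeptides. One possible approach, shown only as an assumption, is greedy merging of fragments by their longest suffix/prefix overlap. The sketch uses one-letter residue codes and hypothetical function names; greedy merging is a heuristic and can be misled by repeated residues.

```python
def overlap(a, b):
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def assemble(fragments):
    """Greedily merge fragments by largest overlap; returns the remaining contigs."""
    unique = set(fragments)
    # Drop fragments fully contained in another fragment, then sort for determinism.
    frags = sorted(f for f in unique if not any(f != g and f in g for g in unique))
    while len(frags) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no overlaps left: the fragments cannot be joined further
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags) if idx not in (best_i, best_j)]
        frags.append(merged)
    return frags


# Toy oligopeptides (one-letter codes) cut from the made-up sequence "GIVEQCCTSI".
contigs = assemble(["GIVE", "VEQC", "QCCT", "CTSI"])
```

For these four toy fragments the result is the single contig `"GIVEQCCTSI"`; when several contigs remain, the fragment set does not contain enough overlaps to fix the order.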

10.
A method incorporating nested collision-induced dissociation/post-source decay (CID/PSD) combined with endopeptidase digestion is described as an approach to determine the sequence of N-terminally modified peptides. The information from immonium and related ions observed in the CID/PSD spectrum was used to select a suitable endopeptidase for the digestion of peptides. Rapid and reliable assignment of the peptide sequence was achieved by comparing the CID/PSD spectra of intact and endopeptidase-digested peptide fragments, since the observed fragment ions can thus be assigned unambiguously to either N- or C-terminal ions. This nested CID/PSD method was applied to the sequence determination of two peptides from the solitary wasps Anoplius samariensis and Batozonellus maculifrons (pompilid wasps), which could not be sequenced by the Edman method because of N-terminal modification.

11.
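A new method was established for analyzing DNA cleavage sites using fluorescently labeled primers and an automated DNA sequencer. The method is simple to carry out, highly sensitive, and reproducible; its data analysis is objective and its results are reliable. It is suitable for analyzing DNA cleavage sites caused by a variety of factors.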

12.
Conventionally, protein structure prediction via “threading” relies on some nonoptimal method to align a protein sequence to each member of a library of known structures. We show how a score function (force field) can be modified so as to allow the direct application of a dynamic programming algorithm to the problem. This involves an approximation whose damage can be minimized by an optimization process during score function parameter determination. The method is compared to sequence to structure alignments using a more conventional pair-wise score function and the frozen approximation. The new method produces results comparable to the frozen approximation, but is faster and has fewer adjustable parameters. It is also free of memory of the template's original amino acid sequence, and does not suffer from a problem of nonconvergence, which can be shown to occur with the frozen approximation. Alignments generated by the simplified score function can then be ranked using a second score function with the approximations removed. ©1999 John Wiley & Sons, Inc. J Comput Chem 20: 1455–1467, 1999
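
Dynamic programming applies directly once the score of placing a residue at a template site depends only on that residue and that site, not on which residues occupy neighbouring sites. The Python sketch below illustrates this with a standard Needleman-Wunsch recursion. The scoring rule, the gap penalty, and the "b"/"e" (buried/exposed) site labels are toy assumptions, not the score function of the paper, and the function names are hypothetical.

```python
def threading_score(seq, sites, site_score, gap=-1.0):
    """Best global alignment score of `seq` against template `sites`.

    `site_score(residue, site)` must depend only on its two arguments;
    this independence is what makes dynamic programming exact here.
    """
    prev = [j * gap for j in range(len(sites) + 1)]
    for i in range(1, len(seq) + 1):
        cur = [i * gap] + [0.0] * len(sites)
        for j in range(1, len(sites) + 1):
            cur[j] = max(
                prev[j - 1] + site_score(seq[i - 1], sites[j - 1]),  # align residue to site
                prev[j] + gap,      # residue left unaligned
                cur[j - 1] + gap,   # template site left empty
            )
        prev = cur
    return prev[len(sites)]


def toy_site_score(residue, site):
    """Toy rule (assumption): hydrophobic residues fit buried sites, others exposed."""
    hydrophobic = residue in "AVLIMFW"
    return 1.0 if hydrophobic == (site == "b") else -1.0


full = threading_score("LLKK", "bbee", toy_site_score)
short = threading_score("LKK", "bbee", toy_site_score)
```

With these toy values `full` is 4.0 (four matching placements) and `short` is 2.0 (three matches and one gap). A pairwise score function would couple the choices at different sites, which is the situation the frozen approximation mentioned above addresses.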

13.
《Analytical letters》2012,45(18):3309-3323
The relative-area difference sequence (ΔSr sequence) analytical method is proposed, with which the most similar herbal samples can be determined according to their contents. Together with common and variation peak ratio dual-index sequence analysis, one can identify the most similar sample group in the two-dimensional sequence accurately; therefore, the method can evaluate the quality of herbs.

14.
Inspired by biological polymers, sequence‐controlled synthetic polymers are highly promising materials that integrate the robustness of synthetic systems with the information‐derived activity of biological counterparts. Polymer–biopolymer conjugates are often targeted to achieve this union; however, their synthesis remains challenging. We report a stepwise solid‐phase approach for the generation of completely monodisperse and sequence‐defined DNA–polymer conjugates using readily available reagents. These polymeric modifications to DNA display self‐assembly and encapsulation behavior, as evidenced by HPLC, dynamic light scattering, and fluorescence studies, which is highly dependent on sequence order. The method is general and has the potential to make DNA–polymer conjugates and sequence‐defined polymers widely available.

15.
The stacking interaction energies between nucleic acid bases in A-DNA and B-DNA are calculated by means of the ab initio molecular orbital method. The calculated values agree well with the experimental values of stacking enthalpy changes. The stacking interaction energy is shown to be highly sequence dependent, particularly when the sequence includes guanine or cytosine. The results suggest that the conformation of a DNA double helix fragment may be determined by the constituent stacking interactions. Electrostatic energy is the cause of the sequence dependence of the stacking energy, while charge transfer and dispersion energies contribute to the overall stability.

16.
The 1983-base-pair nucleotide sequence of the EcoRI-HindIII fragment of the HindIII K clone of vaccinia virus Tiantan strain was determined by the dideoxy chain termination method. A search of the NBRF protein sequence database using FASTA and other microcomputer programs revealed that several proteins belonging to the serpin (serine protease inhibitor) superfamily have striking similarities to the protein encoded by the HindIII K1 ORF. On the basis of dot-matrix analysis and sequence alignment, the K1-encoded protein is shown to be a novel member of the serpin superfamily. The putative reactive site and switch sequence of this novel serpin are then compared with those of other serpins. The probable evolutionary and possible functional relationships are discussed.

17.
A new method has been developed for prediction of homology model quality directly from the sequence alignment, using multivariate regression. Hence, the expected quality of future homology models can be estimated using only information about the primary structure. This method has been applied to protein kinases and can easily be extended to other protein families. Homology model quality for a reference set of homology models was verified by comparison to experimental structures, by calculation of root-mean-square deviations (RMSDs) and comparison of interresidue contact areas. The homology model quality measures were then used as dependent variables in a Partial Least Squares (PLS) regression, using a matrix of alignment score profiles found from the Point Accepted Mutation (PAM) 250 similarity matrix as independent variables. This resulted in a regression model that can be used to predict the accuracy of future homology models from the sequence alignment. Using this method, one can identify the target-template combinations that are most likely to give homology models of sufficient quality. Hence, this method can be used to effectively choose the optimal templates to use for the homology modeling. The method's ability to guide the choice of homology modeling templates was verified by comparison of success rates to those obtained using BLAST scores and target-template sequence identities, respectively. The results indicate that the method presented here performs best in choosing the optimal homology modeling templates. Using this method, the optimal template was chosen in 86% of the cases, as compared to 62% using BLAST scores, and 57% using sequence identities. The method presented here can also be used to identify regions of the protein structure that are difficult to model, as well as alignment errors. Hence, this method is a useful tool for ensuring that the best possible homology model is generated.

18.
Metagenomic studies suggest that only a small fraction of the viruses that exist in nature have been identified and studied. Characterization of unknown viral genomes is hindered by the many genomes populating any virus sample. A new method is reported that integrates drop‐based microfluidics and computational analysis to enable the purification of any single viral species from a complex mixed virus sample and the retrieval of complete genome sequences. By using this platform, the genome sequence of a 5243 bp dsDNA virus that was spiked into wastewater was retrieved with greater than 96 % sequence coverage and more than 99.8 % sequence identity. This method holds great potential for virus discovery since it allows enrichment and sequencing of previously undescribed viruses as well as known viruses.

19.
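The skeletons of the pentadienyl anion (C5H7^-) and its methyl-substituted derivatives were optimized by ab initio calculations with the STO-3G basis set and the DFP gradient geometry optimization method. The stability order of the C5H7^- conformers was found to be W > S > U. For the methyl-substituted pentadienyl anions, the order depends on the position of the methyl substituent.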

20.
A method has been developed for the determination of the amino-acid sequence of a cyclic peptide containing cystine. It is based on the reduction of the peptide in a reductive matrix prior to ionization by fast-atom bombardment. The amino-acid sequence of the resulting linear peptide is then determined by tandem mass spectrometry from the spectrum produced by the collision-induced decomposition of the [M + H]+ ion of the peptide.
