Similar literature
 20 similar articles found (search time: 0 ms)
1.
2.
Protein methylation is involved in dozens of biological processes and plays an important role in adjusting protein physicochemical properties, conformation and function. However, with the rapid increase in protein sequences entering databanks, the gap between the number of known sequences and the number of known methylation annotations is widening rapidly. Therefore, it is important to develop a computational method for quick and accurate identification of methylation sites. In this study, a novel predictor (Methy_SVMIACO) based on a support vector machine (SVM) and an improved ant colony optimization algorithm (IACO) is developed to identify methylation sites. The IACO is used to find the optimal feature subset and the parameters of the SVM, while the SVM performs the identification of methylation sites. Comparison of the IACO with conventional ACO shows that the IACO converges quickly toward the global optimal solution and is a more useful tool for feature selection and SVM parameter optimization. In a 10-fold cross-validation test, Methy_SVMIACO achieves a sensitivity of 85.71%, a specificity of 86.67%, an accuracy of 86.19% and a Matthews correlation coefficient (MCC) of 0.7238 for lysine, and a sensitivity of 89.08%, a specificity of 94.07%, an accuracy of 91.56% and an MCC of 0.8323 for arginine. Analysis of the optimal feature subset shows that some upstream and downstream residues play an important role in the methylation of arginine and lysine. Compared with other existing methods, Methy_SVMIACO provides higher accuracy (Acc), sensitivity (Sen) and specificity (Spe), indicating that the current method may serve as a powerful complementary tool to other existing approaches in this area. Methy_SVMIACO can be obtained freely on request from the authors.

3.
Ant colony optimization (ACO) is a meta-heuristic algorithm derived from the observation of real ants. In this paper, an ACO algorithm is proposed for feature selection in quantitative structure-property relationship (QSPR) modeling and used to predict λmax of 1,4-naphthoquinone derivatives. Feature selection is the most important step in classification and regression systems. The performance of the proposed ACO algorithm is compared with that of stepwise regression, genetic algorithm and simulated annealing methods. The average absolute relative deviations in this QSPR study using ACO, stepwise regression, genetic algorithm and simulated annealing, with multiple linear regression, for the calibration and prediction sets were 5.0%, 3.4% and 6.8%, 6.1% and 5.1%, 8.6% and 6.0%, 5.7%, respectively. The results demonstrate that ACO is a useful tool for feature selection with good performance.

4.
Classical sequencing by hybridization takes into account only binary information about sequence composition: a given element from an oligonucleotide library either is or is not a part of the target sequence. However, DNA chip technology has developed and now makes it possible to obtain partial information about the multiplicity of each oligonucleotide that the analyzed sequence consists of. Currently, it is not possible to obtain exact data of this type, but even partial information should be very useful. Two realistic multiplicity information models are considered in this paper. The first one, called “one and many”, assumes that it is possible to determine whether a given oligonucleotide occurs in a reconstructed sequence once or more than once. According to the second model, called “one, two and many”, a biochemical experiment can reveal whether a given oligonucleotide is present in an analyzed sequence once, twice or at least three times. An ant colony optimization algorithm has been implemented to verify the above models and to compare with existing algorithms for sequencing by hybridization that use the additional information. The proposed algorithm solves the problem with any kind of hybridization errors. Computational experiment results confirm that using even partial information about multiplicity leads to increased quality of reconstructed sequences. Moreover, they also show that the more precise model makes it possible to obtain better solutions and that the ant colony optimization algorithm outperforms the existing ones. Test data sets and the proposed ant colony optimization algorithm are available at: http://bioserver.cs.put.poznan.pl/download/ACO4mSBH.zip.
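The two multiplicity models described in this abstract can be illustrated by labelling each oligonucleotide of an error-free spectrum. This is only a sketch of the information each model provides, not of the authors' ant colony algorithm; the function name `multiplicity_spectrum` and the example sequence are hypothetical.

```python
from collections import Counter


def multiplicity_spectrum(sequence, length, model="one and many"):
    """Label every oligonucleotide of the given length by its multiplicity.

    model="one and many":      labels are "one" or "many" (more than once)
    model="one, two and many": labels are "one", "two" or "many" (three or more)
    """
    counts = Counter(
        sequence[i:i + length] for i in range(len(sequence) - length + 1)
    )

    def label(count):
        if count == 1:
            return "one"
        if count == 2 and model == "one, two and many":
            return "two"
        return "many"

    return {oligo: label(c) for oligo, c in counts.items()}


# Hypothetical target sequence; ACG occurs twice, every other 3-mer once.
coarse = multiplicity_spectrum("ACGACGT", 3, "one and many")
fine = multiplicity_spectrum("ACGACGT", 3, "one, two and many")
```

Here `coarse` marks ACG as "many" while `fine` marks it as "two", which is the extra precision the second model offers to a reconstruction algorithm.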

5.
Sequence alignment is one of the most important operations in bioinformatics. In this article, we introduce a new method for pairwise alignment. We associate the alignment process with a plan built on modified dot plots. The next position is selected according to the amount of pheromone and the matching score of the candidates. The presented algorithm can be used to find the best alignment without calculating the scoring matrix. The superiority of the presented algorithm has been demonstrated in several experiments.

6.
A new variable selection algorithm is described, based on ant colony optimization (ACO). The aim of the algorithm is to choose, from a large number of available spectral wavelengths, those relevant to the estimation of analyte concentrations or sample properties when spectroscopic analysis is combined with multivariate calibration techniques such as partial least-squares (PLS) regression. The new algorithm employs the concept of cooperative pheromone accumulation, which is typical of ACO selection methods, and optimizes PLS models using a pre-defined number of variables, employing a Monte Carlo approach to discard irrelevant sensors. The performance has been tested on a simulated system, where it shows a significant superiority over other commonly employed selection methods, such as genetic algorithms. Several near-infrared spectroscopic experimental data sets have been subjected to the present ACO algorithm, with PLS leading to improved analytical figures of merit upon wavelength selection. The method could be helpful in other chemometric activities such as classification or quantitative structure-activity relationship (QSAR) problems.
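The idea of cooperative pheromone accumulation with a pre-defined number of variables can be sketched generically as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: the function `aco_select`, its parameters (`n_ants`, `n_iter`, `rho`) and the toy objective are all hypothetical, and a real application would replace the objective with the cross-validated quality of a PLS or other calibration model.

```python
import random


def aco_select(n_features, k, objective, n_ants=20, n_iter=30, rho=0.1, seed=0):
    """Pick k of n_features by a simple pheromone-guided random search.

    objective(subset) must return a non-negative score (higher is better).
    """
    rng = random.Random(seed)
    pheromone = [1.0] * n_features
    best_subset, best_score = None, float("-inf")
    for _ in range(n_iter):
        trials = []
        for _ in range(n_ants):
            # Each ant draws k distinct features, favouring high pheromone.
            candidates = list(range(n_features))
            subset = []
            for _ in range(k):
                weights = [pheromone[j] for j in candidates]
                pick = rng.choices(candidates, weights=weights, k=1)[0]
                candidates.remove(pick)
                subset.append(pick)
            score = objective(subset)
            trials.append((score, subset))
            if score > best_score:
                best_score, best_subset = score, sorted(subset)
        # Evaporation keeps old trails from dominating forever.
        pheromone = [(1.0 - rho) * p for p in pheromone]
        # The best ant of this iteration reinforces the features it used.
        top_score, top_subset = max(trials, key=lambda t: t[0])
        for j in top_subset:
            pheromone[j] += max(top_score, 0.0)
    return best_subset, best_score


# Toy objective (an assumption): count how many "informative" features are chosen.
informative = {1, 4, 7}
objective = lambda subset: len(informative & set(subset))
best, score = aco_select(n_features=10, k=3, objective=objective)
```

Fixing the subset size `k` in advance mirrors the "pre-defined number of variables" mentioned in the abstract; the evaporation rate `rho` trades off exploration against exploitation.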

7.
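A new chemometric method: the ant colony algorithm   Total citations: 3, self-citations: 0, citations by others: 3  
The ant colony algorithm is a novel bionic algorithm. Its advantages include intelligent search, global optimization, strong robustness, distributed computation, and ease of combination with other methods, which make it a powerful tool for solving complex combinatorial optimization problems. This paper introduces the basic principles, mathematical model, application areas, and progress of the ant colony algorithm.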

8.
Most sequence clustering methods require a full distance matrix to be computed between all pairs of sequences. This requires computer memory and time proportional to N² for N sequences. For small N, say up to about 10000, this can be accomplished in reasonable time for sequences of moderate length. For very large N, however, this becomes increasingly prohibitive. In this paper, we have tested variations on a class of published embedding methods that have been designed for clustering large numbers of complex objects where the individual distance calculations are expensive. These methods involve embedding the sequences in a space where the similarities within a set of sequences can be closely approximated without having to compute all pairwise distances. We show how this approach greatly reduces computation time and memory requirements for clustering large numbers of sequences and demonstrate the quality of the clusterings by benchmarking them as guide trees for multiple alignments. Source code is available on request from the authors.
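The saving described in this abstract can be made concrete with a small sketch. It assumes one common embedding scheme, in which every sequence is represented by its distances to a few reference ("seed") sequences; the abstract does not say which embedding variants were tested, so the seed-based scheme, the Hamming distance and all names below are illustrative assumptions.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def embed(sequences, seeds, dist=hamming):
    """Represent each sequence as its vector of distances to the seeds."""
    return [[dist(s, r) for r in seeds] for s in sequences]


def full_matrix_pairs(n):
    """Distance evaluations needed for a full pairwise matrix."""
    return n * (n - 1) // 2


def embedding_distance_calls(n, k):
    """Distance evaluations needed to embed n sequences against k seeds."""
    return n * k


# Hypothetical toy data: three sequences embedded against two seeds.
vectors = embed(["AAAA", "AAAT", "TTTT"], seeds=["AAAA", "TTTT"])
```

For N = 10000, a full matrix needs 49,995,000 distance evaluations, whereas embedding against 100 seeds needs 1,000,000; clustering can then run on the short vectors instead of on the sequences themselves.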

9.
In this article, we describe a representation for the process of multiple sequence alignment (MSA) and use it to solve the MSA problem. With this representation, we take every possible alignment into account by defining the representation of gap insertion, the value of the heuristic information on every optional path, and the scoring rule. On the basis of the proposed multidimensional graph, we use the ant colony algorithm to find a better path, which denotes a better alignment. We present instances of three‐dimensional and four‐dimensional graphs and propose a special ichnographic representation to analyze MSA. The software is still experimental, and we give an example of finding the best alignment with a three‐dimensional graph and the ant colony algorithm. Experimental results show that our method can improve the solution quality on MSA benchmarks. © 2009 Wiley Periodicals, Inc. J Comput Chem 2009

10.
Background: In psoriasis, psoriatic cells develop more rapidly than normal healthy cells. This rapid growth causes dead skin cells to accumulate on the skin's surface, resulting in thick patches of red, dry, and itchy skin. These patches, or psoriatic skin lesions, may exhibit characteristics similar to healthy skin, which makes lesion detection more challenging. However, lesion segmentation is of prime importance for accurate disease diagnosis and severity detection. In that context, our group had previously performed psoriasis lesion segmentation using a conventional clustering algorithm, which tends to fall into locally sub-optimal cluster centroids. Objective: The main objective of this paper is to implement an optimal lesion segmentation technique that aims at global convergence by reducing the probability of being trapped in local optima. This has been achieved by integrating swarm intelligence based algorithms with the conventional K-means and Fuzzy C-means (FCM) clustering algorithms. Methodology: A total of eight combinations of two conventional clustering algorithms (K-means and FCM) and four swarm intelligence (SI) techniques (seeker optimization (SO), artificial bee colony (ABC), ant colony optimization (ACO) and particle swarm optimization (PSO)) have been implemented in this study. The experiments are performed on a dataset of 780 psoriasis images from 74 patients collected at Psoriasis Clinic and Research Centre, Psoriatreat, Pune, Maharashtra, India. The swarm intelligence optimization techniques are combined with the conventional clustering algorithms to increase the probability of convergence to the globally optimal solution and hence to improve clustering and detection. Results: The performance has been quantified in terms of four indices, namely accuracy (A), sensitivity (SN), specificity (SP), and Jaccard index (JI). Among the eight combinations of clustering and optimization techniques considered in this study, FCM + SO performed best, with mean JI = 0.83, mean A = 90.89, mean SN = 92.84, and mean SP = 88.27. FCM + SO was found to be statistically significantly better than the other approaches, with a reliability index of 96.67 %. Conclusion: The results obtained reflect the superiority of the proposed techniques over conventional clustering techniques. Hence our research development will lead to an objective analysis for automatic, accurate, and quick diagnosis of psoriasis.
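Of the four indices used in this abstract, the Jaccard index is the one specific to segmentation: it is the ratio of the overlap to the union of the predicted and reference lesion regions. A minimal sketch on flattened binary masks follows; the function name and the convention of returning 1.0 for two empty masks are assumptions made here.

```python
def jaccard_index(predicted, reference):
    """Jaccard index |A ∩ B| / |A ∪ B| of two equal-length binary masks."""
    intersection = sum(1 for p, r in zip(predicted, reference) if p and r)
    union = sum(1 for p, r in zip(predicted, reference) if p or r)
    # Assumed convention: two empty masks agree perfectly.
    return intersection / union if union else 1.0


# Hypothetical 4-pixel masks: one shared lesion pixel, three pixels in the union.
ji = jaccard_index([1, 1, 0, 0], [1, 0, 1, 0])
```

A value of 1 means the segmented lesion coincides exactly with the reference, and 0 means the two regions do not overlap at all.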

11.
In this study, dispersive solid phase extraction combined with dispersive liquid–liquid extraction has been developed for the extraction of benzene, toluene, ethylbenzene, and xylene isomers (BTEX) from soil samples prior to gas chromatography–mass spectrometry. The BTEX were extracted from the soil sample into acetonitrile by the dispersive solid phase extraction method, and the extract was then used as the dispersive solvent in the dispersive liquid–liquid extraction procedure. An ant colony optimization–artificial neural network (ACO–ANN) approach has been employed to develop the model for simulation and optimization of this method. The volume of dispersive solvent, volume of extraction solvent, extraction time, and ultrasonic time were the input variables, while the multiple response function (Rm) of the analytes was the output. The optimum operating conditions were then determined by the ant colony optimization method. Under the optimum conditions, limits of detection of 0.12–0.75 ng g−1 were obtained for the BTEX. The developed procedure was then applied to the extraction and determination of BTEX in the soil samples and one certified soil. Copyright © 2015 John Wiley & Sons, Ltd.

12.

Derivation of quantitative structure-activity relationships (QSAR) usually involves computational models that relate a set of input variables describing the structural properties of the molecules for which the activity has been measured to the output variable representing activity. Many of the input variables may be correlated, and it is therefore often desirable to select an optimal subset of the input variables that results in the most predictive model. In this paper we describe an optimization technique for variable selection based on artificial ant colony systems. The algorithm is inspired by the behavior of real ants, which are able to find the shortest path between a food source and their nest using deposits of pheromone as a communication agent. The underlying basic self-organizing principle is exploited for the construction of parsimonious QSAR models based on neural networks for several classical QSAR data sets.

13.
Multispectral images such as multispectral chemical images or multispectral satellite images provide detailed data with information in both the spatial and spectral domains. Many segmentation methods for multispectral images are based on a per-pixel classification, which uses only spectral information and ignores spatial information. A clustering algorithm based on both spectral and spatial information would produce better results.

In this work, spatial refinement clustering (SpaRef), a new clustering algorithm for multispectral images, is presented. Spatial information is integrated with partitional and agglomerative clustering processes. The number of clusters is identified automatically. SpaRef is compared with a set of well-known clustering methods on compact airborne spectrographic imager (CASI) data over an area in the Klompenwaard, The Netherlands. The clusters obtained show improved results. Applying SpaRef to multispectral chemical images would be a straightforward step.


14.
PK-means: A new algorithm for gene clustering   Total citations: 3, self-citations: 0, citations by others: 3  
Microarray technology has been widely applied to measure the expression levels of thousands of genes simultaneously. Gene cluster analysis is useful for discovering the function of genes because co-expressed genes are likely to share the same biological function. K-means is one of the well-known clustering methods. However, it is sensitive to the selection of the initial clustering and easily becomes trapped in a local minimum. The particle-pair optimizer (PPO) is a variation on the traditional particle swarm optimization (PSO) algorithm; it is a stochastic particle-pair based optimization technique that can be applied to a wide range of problems. In this paper we bridge PPO and K-means within the PK-means algorithm for the first time. Our results indicate that PK-means clustering is generally more accurate than K-means and Fuzzy K-means (FKM). PK-means is also more robust because it is less sensitive to the initial randomly selected cluster centroids. Finally, our algorithm outperforms these methods with a fast convergence rate and a low computational load.

15.
RNA structure comparison is a fundamental problem in structural biology, structural chemistry, and bioinformatics. It can be used for analyzing RNA energy landscapes and conformational switches and for facilitating RNA structure prediction. The purpose of our integrated tool RNACluster is twofold: to provide a platform for computing and comparing different distances between RNA secondary structures, and to perform cluster identification to derive useful information about RNA structure ensembles, using a minimum spanning tree (MST) based clustering algorithm. RNACluster employs a cluster identification approach based on an MST representation of the RNA ensemble data and currently supports six distance measures between RNA secondary structures. RNACluster provides a user-friendly graphical interface that allows a user to compare different structural distances, analyze the structure ensembles, and visualize predicted structural clusters.
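The abstract does not describe how RNACluster turns the MST into clusters. One generic way to do MST-based clustering, shown here only as an assumption-laden sketch, is to build the tree with Kruskal's algorithm and stop once k components remain, which is equivalent to removing the k − 1 longest MST edges. The function `mst_clusters` and the toy distance matrix are hypothetical.

```python
def mst_clusters(dist, k):
    """Split items into k clusters by cutting the longest MST edges.

    dist is a symmetric matrix of pairwise distances (list of lists).
    Kruskal's algorithm is stopped as soon as k components remain.
    """
    n = len(dist)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted((dist[i][j], i, j) for i in range(n) for j in range(i + 1, n))
    components = n
    for _, i, j in edges:
        if components == k:
            break
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            components -= 1

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical 1-D "structures"; distance is the absolute difference.
points = [0, 1, 2, 10, 11, 20]
matrix = [[abs(a - b) for b in points] for a in points]
clusters = mst_clusters(matrix, k=3)
```

Any of the distance measures between secondary structures could supply `dist`; the clustering step itself only needs the matrix.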

16.
17.
This paper introduces the ant colony algorithm, a novel swarm intelligence based optimization method, to select appropriate wavelet coefficients from mass spectral data as a new feature selection method for ovarian cancer diagnostics. After determining the proper parameters for the ant colony algorithm (ACA) based search, we perform the feature search 100 times with the number of selected features fixed at 5. The results of this study show: (1) the classification accuracy based on the five selected wavelet coefficients can reach up to 100% for all the training, validating and independent testing sets; (2) the eight most frequently selected wavelet coefficients of the 100 runs can provide 100% accuracy for the training set, 100% accuracy for the validating set, and 98.8% accuracy for the independent testing set, which suggests the robustness and accuracy of the proposed feature selection method; and (3) the mass spectral data corresponding to the eight most frequently selected wavelet coefficients can be located by reverse wavelet transformation, and these located mass spectral data still maintain high classification accuracies (100% for the training set, 97.6% for the validating set, and 98.8% for the testing set) and also provide sufficient physical and medical meaning for future ovarian cancer mechanism studies. Furthermore, the corresponding mass spectral data (potential biomarkers) are in good agreement with other studies which have used the same sample set. Together these results suggest this feature extraction strategy will benefit the development of intelligent and real-time spectroscopy instrumentation based diagnosis and monitoring systems.

18.
Haplotype reconstruction, based on aligned single nucleotide polymorphism (SNP) fragments, aims to infer a pair of haplotypes from localized polymorphism data gathered through short genome fragment assembly. This paper first presents two distance functions, which are used to measure the degree of difference and the degree of similarity between SNP fragments. Based on the two distance functions, a clustering algorithm is proposed to solve the minimum error correction (MEC) model. The algorithm has two parts: the first determines the initial haplotype pair, and the second infers the true haplotype pair by re-clustering. The comparison results show that our algorithm using the two distance functions is effective and feasible.
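The objective of the MEC model can be stated compactly: each fragment is assigned to whichever haplotype it disagrees with least, and the score is the total number of allele flips needed to make every fragment consistent with its haplotype. The sketch below evaluates that score for a given haplotype pair. It uses a plain mismatch count that skips uncovered positions; this is not one of the two distance functions proposed in the paper, which the abstract does not define, and the names and toy data are hypothetical.

```python
def fragment_distance(fragment, haplotype, gap="-"):
    """Count covered positions where the fragment disagrees with the haplotype."""
    return sum(1 for f, h in zip(fragment, haplotype) if f != gap and f != h)


def mec_score(fragments, h1, h2):
    """Minimum error correction score of a haplotype pair for the fragments."""
    return sum(
        min(fragment_distance(f, h1), fragment_distance(f, h2))
        for f in fragments
    )


# Hypothetical SNP fragments over three sites; "-" marks an uncovered site.
fragments = ["01-", "10-", "-11"]
score = mec_score(fragments, h1="010", h2="101")
```

In this toy case the first two fragments match a haplotype exactly and the third needs one correction against either haplotype, so a clustering algorithm would try to find the pair that minimizes this total.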

19.
The adaptation of novel techniques developed in the field of computational chemistry to problems involving large and flexible molecules is taking center stage with regard to algorithmic efficiency, computational cost and accuracy. In this article, the gradient‐based gravitational search (GGS) algorithm, which uses analytical gradients for fast minimization to the next local minimum, is reported. Its efficiency as a metaheuristic approach has also been compared with Gradient Tabu Search and with other algorithms for global optimization, namely Gravitational Search, Cuckoo Search, and Back Tracking Search. Moreover, the GGS approach has been applied to computational chemistry problems to find the minimum potential energy of two‐dimensional and three‐dimensional off‐lattice protein models. The simulation results reveal the relative stability and physical accuracy of the protein models at an efficient computational cost. © 2015 Wiley Periodicals, Inc.

20.
Based on the concept of ant colony optimization and the idea of a population in genetic algorithms, a novel global optimization algorithm, called hybrid ant colony optimization (HACO), is proposed in this paper to tackle continuous-space optimization problems. It was compared with other well-known stochastic methods in the optimization of benchmark functions and was also used to solve the problem of efficiently selecting an appropriate dilation by optimizing the wavelet power spectrum of the hydrophobic sequence of a protein, which is the key step in using the continuous wavelet transform (CWT) to predict α-helices and connecting peptides.
