Similar literature
 Found 20 similar documents (search time: 15 ms)
1.
Accurate clustering of cells from single-cell RNA sequencing (scRNA-seq) data is an essential step in biological analyses such as putative cell type identification. However, scRNA-seq data are high-dimensional and highly sparse, which makes traditional clustering methods less effective at capturing the similarity between cells. Since the genetic network fundamentally defines the functions of a cell, and deep learning shows strong advantages in network representation learning, we propose a novel scRNA-seq clustering framework, ScGSLC, based on graph similarity learning. ScGSLC integrates scRNA-seq data and a protein-protein interaction network into a graph. A graph convolutional network is then used to embed the graphs, and the cells are clustered according to the calculated similarity between graphs. Unsupervised clustering results on nine public data sets show that ScGSLC outperforms state-of-the-art methods.

2.
Active particles convert external energy into motility, displaying a variety of dynamical features. Recent progress in the field has marked a shift in focus from understanding the origin and sources of active motion to controlling the dynamics and trajectory of individual microswimmers. This review explores these advances from a two-fold perspective: the role of particle design and that of external factors. Our main goal is to highlight the guiding principles that determine active particle trajectory. These include, on the one hand, the role of the morphology of active particles and their assemblies in driving translation, rotation, and the coupling between the two. On the other hand, they include the effect of environmental parameters, such as the presence of physicochemical heterogeneities including interfaces, suspended obstacles, and boundaries, on the modality and trajectory of active colloids. We discuss the potential of using active particles in biomedical and environmental applications through recent examples.

3.
DNA microarray data have been widely used in cancer research because they help to distinguish successfully between tumor classes. However, typical gene expression data are high-dimensional and imbalanced, which makes it very difficult for traditional machine learning methods to construct a robust classifier that performs well on both the minority and majority classes. As one of the most successful feature weighting techniques, Relief is considered particularly well suited to high-dimensional problems. Unfortunately, almost no Relief-based method takes the imbalanced class distribution into account. This study shows that existing Relief-based algorithms may underestimate features that discriminate the minority classes and ignore the distribution characteristics of minority class samples. As a result, an additional bias towards classification into the majority classes can be introduced. To address this, a new method, named imRelief, is proposed for efficiently handling high-dimensional imbalanced gene expression data. imRelief corrects the bias towards the majority classes and takes the scattered distribution of minority class samples into account when estimating feature weights. In this way, imRelief rewards features that perform well at separating the minority classes from the other classes. Experiments on four microarray gene expression data sets demonstrate the effectiveness of imRelief in both feature weighting and feature subset selection.
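To make the feature-weighting idea concrete, the following is a minimal sketch of the classic two-class Relief update, not of imRelief, whose bias correction the abstract does not spell out. The function name `relief_weights`, the Manhattan distance, the choice to visit every sample once, and the toy data are illustrative assumptions.

```python
def relief_weights(samples, labels):
    """Classic two-class Relief, visiting every sample once (an assumption;
    the original algorithm draws samples at random).

    For each sample, find its nearest neighbour of the same class (hit) and
    of the other class (miss). Features that differ on the miss are rewarded;
    features that differ on the hit are penalised.
    Assumes features are already scaled to [0, 1].
    """
    n_features = len(samples[0])
    weights = [0.0] * n_features

    def distance(a, b):
        # Manhattan distance between two feature vectors
        return sum(abs(u - v) for u, v in zip(a, b))

    for i, (x, y) in enumerate(zip(samples, labels)):
        hits = [s for j, (s, l) in enumerate(zip(samples, labels))
                if l == y and j != i]
        misses = [s for s, l in zip(samples, labels) if l != y]
        hit = min(hits, key=lambda s: distance(x, s))
        miss = min(misses, key=lambda s: distance(x, s))
        for f in range(n_features):
            weights[f] += (abs(x[f] - miss[f]) - abs(x[f] - hit[f])) / len(samples)
    return weights


# Toy data (hypothetical): feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 0.2], [0.1, 0.9], [0.9, 0.1], [1.0, 0.8]]
y = [0, 0, 1, 1]
w = relief_weights(X, y)  # feature 0 ends up with the larger weight
```

Because every sample contributes equally, a small minority class contributes few updates in this plain form, which is the kind of bias the abstract says imRelief is designed to correct.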

4.
PK-means: A new algorithm for gene clustering   Cited by: 3 (self-citations: 0, citations by others: 3)
Microarray technology has been widely applied to measure the expression levels of thousands of genes simultaneously. Gene cluster analysis is useful for discovering gene function because co-expressed genes are likely to share the same biological function. K-means is one of the best-known clustering methods. However, it is sensitive to the choice of the initial clustering and easily becomes trapped in a local minimum. The particle-pair optimizer (PPO) is a variation on the traditional particle swarm optimization (PSO) algorithm; it is a stochastic, particle-pair-based optimization technique that can be applied to a wide range of problems. In this paper we bridge PPO and K-means within the PK-means algorithm for the first time. Our results indicate that PK-means clustering is generally more accurate than K-means and Fuzzy K-means (FKM). PK-means is also more robust, because it is less sensitive to the randomly selected initial cluster centroids. Finally, our algorithm outperforms these methods, with a fast convergence rate and a low computational load.
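The sensitivity of K-means to its starting centroids can be seen in a small sketch. The code below is plain Lloyd's K-means, not PK-means (the abstract does not describe how PPO is coupled to it); the function name `kmeans` and the four-point data set are illustrative assumptions.

```python
def kmeans(points, centroids, max_iter=100):
    """Lloyd's K-means on tuples of numbers, starting from the given centroids.

    Returns the final centroids and the sum of squared errors (SSE).
    """
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            d2 = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[d2.index(min(d2))].append(p)
        # Update step: each centroid moves to the mean of its cluster
        # (an empty cluster keeps its old centroid).
        new_centroids = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else c
            for cl, c in zip(clusters, centroids)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    sse = sum(
        min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids)
        for p in points
    )
    return centroids, sse


# Four corners of a 4 x 1 rectangle (hypothetical data).
points = [(0, 0), (0, 1), (4, 0), (4, 1)]
_, good_sse = kmeans(points, [(0, 0.5), (4, 0.5)])  # left/right split
_, bad_sse = kmeans(points, [(2, 0), (2, 1)])       # bottom/top split
```

Both runs converge, but the second start is stuck in a local minimum with a much larger SSE, which is the weakness a global search such as PPO is meant to mitigate.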

5.
A major challenge in the post-genomic era is to obtain quantitative, dynamic information on the interactions between the important constituents of the underlying systems. The S-system is a dynamic and structurally rich model that determines the net strength of interactions between genes and/or proteins. Good generation characteristics without the need for prior information have made S-systems one of the most promising canonical models. Various evolutionary computation technologies have recently been developed for the identification of system parameters and skeletal-network structures. However, the gaps between the truncated and preserved terms remain too small. Additionally, current methods fail to identify the structures of high-dimensional systems (e.g., 30 genes with 1800 connections). Optimization technologies should converge fast and be able to adjust the search adaptively. In this study, we propose a seeding-inspired chemotaxis genetic algorithm (SCGA) that can force evolution to adjust the population movement to identify a favorable location. The seeding-inspired training strategy is a method to achieve optimal results with limited resources. SCGA introduces seeding-inspired genetic operations that give a population competitive power (exploitation and exploration), and a winner-chemotaxis-induced population migration that forces a population to repeatedly tumble away from one attractor and swim toward another. SCGA was tested on several canonical biological systems. SCGA not only learned the correct structure within one to three pruning steps but also ensured pruning safety. The values of the truncated terms were all smaller than 10⁻¹⁴, even for a thirty-gene system.
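For readers unfamiliar with the model being fitted, the sketch below evaluates the right-hand side of a standard S-system, dX_i/dt = α_i ∏_j X_j^(g_ij) − β_i ∏_j X_j^(h_ij). It illustrates only the model form, not the SCGA search; the function name `s_system_rhs` and the two-gene parameter values are hypothetical.

```python
import math


def s_system_rhs(x, alpha, beta, g, h):
    """Time derivatives of a standard S-system.

    x      : current concentrations X_j (all positive)
    alpha  : production rate constants, one per variable
    beta   : degradation rate constants, one per variable
    g, h   : n x n matrices of kinetic orders for production and degradation
    """
    rates = []
    for i in range(len(x)):
        production = alpha[i] * math.prod(xj ** g[i][j] for j, xj in enumerate(x))
        degradation = beta[i] * math.prod(xj ** h[i][j] for j, xj in enumerate(x))
        rates.append(production - degradation)
    return rates


# Hypothetical two-gene system evaluated at a steady state.
rates = s_system_rhs(
    x=[2.0, 4.0],
    alpha=[1.0, 2.0],
    beta=[1.0, 1.0],
    g=[[0.0, 0.5], [1.0, 0.0]],
    h=[[1.0, 0.0], [0.0, 1.0]],
)
```

With n variables there are 2n² kinetic orders in `g` and `h`, which is consistent with the 1800 connections quoted for 30 genes; structure identification amounts to deciding which of these entries can be pruned to zero.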

6.
7.
This method of analysis for kinetic data of solid state chemical reactions takes advantage of the binomial series expansion to normalize various mathematical expressions related to various solid state models into a power series. This facilitates analysis by computer methods, in that deviations of the various models will be perceptible at the outset.
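As a hedged illustration of the expansion step, the sketch below computes partial sums of the binomial series (1 − α)^n = Σ_k C(n, k)(−α)^k for a fractional exponent. The exponent 1/3 is chosen because terms of this kind appear in common solid-state models such as the contracting-sphere expression 1 − (1 − α)^(1/3); which models the authors actually treat is not stated in the abstract, and the function name is an assumption.

```python
def binomial_series(alpha, exponent, terms):
    """Partial sum of (1 - alpha)**exponent = sum_k C(exponent, k) * (-alpha)**k.

    Valid for |alpha| < 1; `exponent` may be fractional.
    """
    total = 0.0
    coeff = 1.0  # generalized binomial coefficient C(exponent, k), starting at k = 0
    for k in range(terms):
        total += coeff * (-alpha) ** k
        coeff *= (exponent - k) / (k + 1)  # C(n, k+1) = C(n, k) * (n - k) / (k + 1)
    return total


alpha = 0.3  # fractional conversion (hypothetical value)
series_value = binomial_series(alpha, 1.0 / 3.0, terms=30)
exact_value = (1.0 - alpha) ** (1.0 / 3.0)
```

Once every model is written as a power series in α, models can be compared coefficient by coefficient, which is presumably what makes their deviations visible early in the analysis.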

8.
This work demonstrates the use of a new additional constraint for the Multivariate Curve Resolution-Alternating Least Squares (MCR-ALS) algorithm, called the "area correlation constraint", introduced to build calibration models for Excitation Emission Matrix (EEM) data. We propose the application of area correlation constraint MCR-ALS to the quantification of cholesterol using a simulated data set and an experimental data system (cholesterol in a ternary mixture). The new constraint applies pseudo-univariate local regressions of the area of the resolved profiles against reference values during the alternating least squares optimization, so that accurate quantifications of a specific analyte are obtained directly in concentration units. In the two data sets investigated in this work, the new constraint correctly retrieved the analyte and interference spectral profiles and gave accurate estimates of cholesterol concentrations in test samples. This is the first study to apply the proposed area constraint to EEM measurements. The approach emerges as a new possibility to be tested in general cases of second-order multivariate calibration data in the presence of unknown interferents, or in more involved higher-order calibration cases.

9.
Rational drug design involves finding solutions to large combinatorial problems for which an exhaustive search is impractical. Genetic algorithms provide a novel tool for the investigation of such problems. These are a class of algorithms that mimic some of the major characteristics of Darwinian evolution. LEA has been designed in order to conceive novel small organic molecules which satisfy quantitative structure-activity relationship based rules (fitness). The fitness consists of a sum of constraints that are range properties. The algorithm takes an initial set of fragments and iteratively improves them by means of crossover and mutation operators that are related to those involved in Darwinian evolution. The basis of the algorithm, its implementation and parameterization, are described together with an application in de novo molecular design of new retinoids. The results may be promising for chemical synthesis and show that this tool may find extensive applications in de novo drug design projects.
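The ingredients named above (a fitness built from range constraints, plus crossover and mutation on fragments) can be sketched generically as follows. This is not LEA's implementation: the penalty form, the property names `logP` and `mol_weight`, the fragment labels, and all function names are hypothetical, and "lower is better" is an assumed convention.

```python
import random


def range_penalty(value, low, high):
    """Distance of `value` from the allowed range [low, high]; 0 inside it."""
    if value < low:
        return low - value
    if value > high:
        return value - high
    return 0.0


def fitness(properties, constraints):
    """Sum of range penalties; 0 means every constraint is satisfied."""
    return sum(range_penalty(properties[name], low, high)
               for name, (low, high) in constraints.items())


def crossover(parent_a, parent_b, rng):
    """One-point crossover on two fragment lists of equal length."""
    cut = rng.randrange(1, len(parent_a))
    return parent_a[:cut] + parent_b[cut:]


def mutate(individual, fragment_pool, rng):
    """Replace one randomly chosen fragment with one drawn from the pool."""
    child = list(individual)
    child[rng.randrange(len(child))] = rng.choice(fragment_pool)
    return child


# Hypothetical property ranges and a candidate that violates both of them.
constraints = {"logP": (1.0, 3.0), "mol_weight": (200.0, 400.0)}
score = fitness({"logP": 3.5, "mol_weight": 180.0}, constraints)

rng = random.Random(0)  # seeded so the example is repeatable
child = crossover(["a", "b", "c"], ["x", "y", "z"], rng)
mutant = mutate(child, ["p", "q"], rng)
```

In a real design run the properties would come from a structure-activity model evaluated on the assembled molecule rather than from a fixed dictionary.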

10.
A novel algorithm is proposed for the fixed-node quantum Monte Carlo (FNQMC) method. In contrast to previous procedures, its "guiding function" is not optimized prior to the diffusion quantum Monte Carlo (DMC) computation but simultaneously with the diffusion process. The new algorithm not only saves CPU time but also lets optimization and diffusion follow the same sampling scheme, so that each improves the other. The new optimization procedure converges super-linearly and can therefore accelerate particle diffusion. During the diffusion process, the nodes of the "guiding function" change continually, which helps to reduce the "fixed-node error". The new algorithm has been used to calculate the total energies of the X ³B₁ and a ¹A₁ states of CH2 as well as the X ²B₁ and A ²A₁ states of NH2. The singlet-triplet energy splitting (ΔE_ST) in CH2 and the splitting between the two NH2 states obtained with the present method are (45.542±1.840) and (141.644±1.589) kJ/mol, respectively. The calculated

11.
One of the most commonly used means of characterizing the potential energy surfaces of reactions and chemical systems is the Hessian calculation, whose analytic evaluation is demanding in both computation and memory. A new scalable, distributed-data analytic Hessian algorithm is presented. Features of the distributed-data parallel coupled perturbed Hartree-Fock (CPHF) step are: (a) columns of density-like and Fock-like matrices are distributed among processors, (b) an efficient static load balancing scheme achieves good work load distribution among the processors, (c) network communication time is minimized, and (d) numerous performance improvements are made in the analytic Hessian steps. As a result, the new code performs well, as demonstrated on large biological systems.

12.
Summary: A popular first step in the problem of structure-based, de novo molecule design is to identify regions where specific functional groups or chemical entities would be expected to interact strongly. When the three-dimensional structure of the receptor is not available, it may be possible to derive a pharmacophore giving the three-dimensional relationships between such chemical groups. The task then is to design synthetically feasible molecules which not only contain the required groups, but which can also position them in the desired relative orientation. One way to do this is to first link the groups using an acyclic chain. We have investigated the application of the tweak algorithm [Shenkin, P.S. et al., Biopolymers, 26 (1987) 2053] for generating families of acyclic linkers. These linking structures can subsequently be braced using a ring-joining algorithm [Leach, A.R. and Lewis, R.A., J. Comput. Chem., 15 (1994) 233], giving rise to an even wider variety of molecular skeletons for further studies.

13.
Thermochimica Acta, 1998, 316(1): 37-45
A new method, called non-parametric kinetics (NPK), has been developed for the treatment of non-isothermal thermoanalytical data. The most significant feature of this method is its ability to provide a kinetic model that fits the experimental data without any assumptions about how the reaction rate depends on either the degree of conversion or the temperature. The thermal decomposition of dibenzoyl peroxide has been studied in order to validate the NPK method, and the results are compared with those of traditional methods.

14.
A new effective algorithm for solving the complete vibronic problem by the variational method is proposed. The algorithm reduces the variational matrix by successively including the shift and entanglement of normal coordinates and using recurrent formulas for determining the eigenvectors of the matrix. No additional approximations are used. The method is faster than those used previously by two or more orders of magnitude. This work was supported by the Russian Fundamental Research Fund (93-02-3405). K. A. Timiryazev Moscow Agricultural Academy. Translated from Zhurnal Strukturnoi Khimii, Vol. 35, No. 6, pp. 23–30, November–December, 1994. Translated by O. Kharlamova

15.
This paper develops a multi-parturition genetic algorithm (MPGA) for geometrically bounding the overlapped clusters in a data set, for use in the classification of chemical data. Two new operators, namely multi-parturition and decimation, and orientated creation, are introduced to modify the conventional genetic algorithm, improving the linear classification results and reducing the computation time. To circumvent the difficulty commonly encountered with linearly inseparable chemical data sets, the optimized linear classifier is further modified to provide a complementary nonlinear classifier. To this end, the regions of space occupied by the overlapped clusters are bounded by erecting half-hyperellipsoids over the linearly misclassified patterns. The proposed MPGA was applied to classify a number of chemical and other data sets with dimensions from 4 to 14. Experimental results indicate that the proposed MPGA can classify seriously overlapped data sets with an acceptable error rate.

16.
Methods for the calculation of activation energies, pre-exponential factors and reaction orders from thermogravimetric data are briefly reviewed. A new integral method is proposed for the determination of these kinetic parameters, using data from pairs of TG curves produced at different heating rates. Employing accurate values of the temperature integral of the Arrhenius equation, tabulated over a range of E and T, and a simple graphical procedure, the method offers advantages of speed and accuracy over those previously reported. It is suggested that at least one of the kinetic parameters should be allowed to move freely in order to achieve the best possible fit between calculated and experimental traces.
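The tabulated quantity referred to above, the temperature integral of the Arrhenius equation, has no closed form, so its values have to be computed numerically. The sketch below is one way to generate such values with Simpson's rule and to cross-check them against the well-known asymptotic series; it is not the authors' tabulation procedure, and the function name, step count, and the E and T values are assumptions.

```python
import math

R = 8.314  # gas constant, J/(mol K)


def temperature_integral(E, T, steps=2000):
    """Simpson's-rule estimate of the integral of exp(-E/(R*t)) dt from 0 to T.

    E is the activation energy in J/mol, T the upper temperature in K.
    `steps` must be even.
    """
    def integrand(t):
        # The integrand tends to 0 as t -> 0, so define it as 0 there.
        return math.exp(-E / (R * t)) if t > 0 else 0.0

    h = T / steps
    total = integrand(0.0) + integrand(T)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3


# Hypothetical values: E = 100 kJ/mol, T = 500 K.
E, T = 100_000.0, 500.0
value = temperature_integral(E, T)

# Cross-check: first three terms of the asymptotic series in x = E/(R*T).
x = E / (R * T)
asymptotic = (R * T ** 2 / E) * math.exp(-x) * (1 - 2 / x + 6 / x ** 2)
```

Evaluating `temperature_integral` on a grid of E and T reproduces the kind of table the method relies on; the truncated asymptotic series agrees to within about 1% at this x and serves only as a sanity check.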

17.
There is currently substantial interest and activity in the development and application of a new technique, called "charge flipping" (CF), that has emerged in the past few years for carrying out structure solution from X-ray diffraction data. We report here a new variant of this technique, termed "residue-based charge flipping" (RBCF), in which the residues of calculated and experimental structure factor amplitudes, together with the corresponding electron density residues, are introduced within the CF algorithm. An important feature of this approach is that it does not require a positive threshold electron density value (delta) to be specified to control the charge-flipping step within the algorithm (in contrast, it is well established that the success of standard CF calculations can depend critically on choosing a suitable value of delta for a given structural problem). Methodological details of the RBCF algorithm are described, and the results of the application of this technique for structure solution of three test structures are reported. The RBCF technique is shown to lead to the correct structure solution in all cases, with success rates of at least 90% (for independent calculations from different sets of initial random phases). Significantly, the convergence behavior of RBCF calculations is found to contrast markedly with that generally observed for standard CF calculations. In particular, convergence (assessed from the evolution of R-factor versus iteration number) typically progresses rapidly and immediately from the earliest iterations of RBCF calculations, rather than displaying an extended plateau region. This feature, and the fact that the RBCF technique does not use the delta parameter that is required in standard CF calculations, suggest that the RBCF algorithm may be a promising approach in future applications.

18.
Summary: A numerical algorithm (HILDA) is developed and made available for the determination of the surface heterogeneity of a powder sample in terms of a patchwise adsorptive energy distribution function. Adsorption on unisorptic "patches" may be described by a choice of model isotherm functions. These are the Hill-de Boer, Fowler-Guggenheim and Langmuir equations and the two-dimensional virial equation. Parameters are presented to enable HILDA to be applied to a variety of adsorption systems. It is demonstrated how the monolayer capacity of a surface can be determined, and the results are compared with BET values. HILDA is also used to follow the changes in the adsorptive energy distribution that occur on annealing a sodium chloride sample of high specific surface area. It is concluded that the method has considerable potential for future applications.

With 6 figures and 4 tables
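The patchwise picture in the summary above can be sketched as a forward model: the overall coverage is a weighted sum of local isotherms, one per patch. The example uses the Langmuir equation, one of the local isotherms the summary lists; how HILDA inverts measured data to recover the distribution is not described there, so this shows only the forward direction, and the function names and patch values are hypothetical.

```python
def langmuir(p, K):
    """Langmuir coverage on one homogeneous patch at pressure p."""
    return K * p / (1.0 + K * p)


def patchwise_coverage(p, patches):
    """Overall coverage of a heterogeneous surface.

    patches : list of (fraction, K) pairs, where `fraction` is the share of
              the surface with adsorption constant K; fractions sum to 1.
    """
    return sum(fraction * langmuir(p, K) for fraction, K in patches)


# Hypothetical surface: half weakly adsorbing, half strongly adsorbing.
patches = [(0.5, 1.0), (0.5, 3.0)]
coverage = patchwise_coverage(1.0, patches)
```

Since K depends on the adsorption energy of a patch, the list of fractions is a discrete stand-in for the adsorptive energy distribution function that the algorithm determines.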

19.
20.