Similar Documents
20 similar documents found.
1.
The modern systems biology approach to molecular cell biology consists of developing computational tools that support the formulation of new hypotheses about the molecular mechanisms underlying observed cell behavior. Recent biotechnologies can provide precise measurements of gene expression time courses in response to a large variety of internal and environmental perturbations. In this paper, we propose a simple algorithm for selecting the “best” regulatory network motif among a number of alternatives, using the expression time courses of the genes that are the final targets of the activated signalling pathway. To this aim, we consider the Hill nonlinear ODE model to simulate the behavior of two ubiquitous motifs: the single-input motif and the multi-output feed-forward loop motif. Our algorithm has been tested on simulated noisy data assuming the presence of a step-wise regulatory signal. The results clearly show that our method is potentially able to robustly discriminate between alternative motifs, thus providing a useful in silico identification tool for the experimenter.
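As a rough illustration of the kind of simulation described above, the sketch below integrates a hypothetical single-target Hill-type ODE driven by a step-wise regulatory signal; the function names, parameter values, and single-gene setup are illustrative assumptions, not the paper's model or code.

    import numpy as np
    from scipy.integrate import odeint

    def step_signal(t, t_on=5.0, level=1.0):
        """Step-wise regulatory input: off before t_on, constant afterwards."""
        return level if t >= t_on else 0.0

    def target_ode(y, t, beta, K, n, gamma):
        """Hill-type activation of a target gene minus first-order degradation."""
        s = step_signal(t)
        return beta * s**n / (K**n + s**n) - gamma * y

    t = np.linspace(0.0, 40.0, 400)
    y = odeint(target_ode, 0.0, t, args=(1.0, 0.5, 2.0, 0.1))
    # y now holds a noiseless expression time course; Gaussian noise could be
    # added before feeding it to a motif-selection procedure.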

2.
This article studies M-type estimators for fitting robust generalized additive models in the presence of anomalous data. A new theoretical construct is developed to connect the costly M-type estimation with least-squares type calculations. Its asymptotic properties are studied and used to motivate a computational algorithm. The main idea is to decompose the overall M-type estimation problem into a sequence of well-studied conventional additive model fittings. The resulting algorithm is fast and stable, can be paired with different nonparametric smoothers, and can also be applied to cases with multiple covariates. As another contribution of this article, automatic methods for smoothing parameter selection are proposed. These methods are designed to be resistant to outliers. The empirical performance of the proposed methodology is illustrated via both simulation experiments and real data analysis. Supplementary materials are available online.
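The link between M-type estimation and least-squares calculations can be pictured with iteratively reweighted least squares for an ordinary linear predictor. This is only a sketch of the general idea under simplifying assumptions (a linear model rather than an additive one, a Huber psi with an illustrative tuning constant), not the article's algorithm.

    import numpy as np

    def huber_psi(r, c=1.345):
        """Huber psi function: identity for small residuals, clipped beyond c."""
        return np.clip(r, -c, c)

    def m_type_fit(X, y, c=1.345, n_iter=50, tol=1e-8):
        """M-type linear fit via iteratively reweighted least squares."""
        beta = np.linalg.lstsq(X, y, rcond=None)[0]          # least-squares start
        for _ in range(n_iter):
            r = y - X @ beta
            scale = np.median(np.abs(r)) / 0.6745 + 1e-12    # robust scale (MAD)
            u = r / scale
            with np.errstate(divide="ignore", invalid="ignore"):
                w = huber_psi(u, c) / u                      # psi(u)/u weights
            w = np.where(np.abs(u) < 1e-12, 1.0, w)
            root_w = np.sqrt(w)
            beta_new = np.linalg.lstsq(X * root_w[:, None], root_w * y, rcond=None)[0]
            if np.max(np.abs(beta_new - beta)) < tol:
                return beta_new
            beta = beta_new
        return beta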

3.
With advanced capability in data collection, applications of linear regression analysis now often involve a large number of predictors. Variable selection thus has become an increasingly important issue in building a linear regression model. For a given selection criterion, variable selection is essentially an optimization problem that seeks the optimal solution over 2^m possible linear regression models, where m is the total number of candidate predictors. When m is large, exhaustive search becomes practically impossible. Simple suboptimal procedures such as forward addition, backward elimination, and the backward-forward stepwise procedure are fast but can easily be trapped in a local solution. In this article we propose a relatively simple algorithm for selecting explanatory variables in a linear regression for a given variable selection criterion. Although the algorithm is still a suboptimal algorithm, it has been shown to perform well in an extensive empirical study. The main idea of the procedure is to partition the candidate predictors into a small number of groups. Working with various combinations of the groups and iterating the search through random regrouping, the search space is substantially reduced, hence increasing the probability of finding the global optimum. By identifying and collecting “important” variables throughout the iterations, the algorithm finds increasingly better models until convergence. The proposed algorithm performs well in simulation studies with 60 to 300 predictors. As a by-product of the proposed procedure, we are able to study the behavior of variable selection criteria when the number of predictors is large. Such a study has not been possible with traditional search algorithms.

This article has supplementary material online.
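A rough sketch of the grouping-and-regrouping idea, using BIC as the selection criterion: predictors are randomly partitioned into a few groups, all unions of groups are scored, and the variables of the best model so far are carried forward into the next round. The group count, scoring criterion, and carry-over rule here are simplifying assumptions, not the paper's exact procedure.

    import numpy as np
    from itertools import combinations

    def bic(X, y, cols):
        """BIC of an OLS fit on the selected columns."""
        n = len(y)
        if not cols:
            rss = np.sum((y - y.mean()) ** 2)
        else:
            Xs = X[:, list(cols)]
            beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
            rss = np.sum((y - Xs @ beta) ** 2)
        return n * np.log(rss / n) + len(cols) * np.log(n)

    def grouped_selection(X, y, n_groups=4, n_rounds=20, seed=0):
        rng = np.random.default_rng(seed)
        p = X.shape[1]
        important = set()
        best_cols, best_bic = (), bic(X, y, ())
        for _ in range(n_rounds):
            # randomly partition the remaining predictors into groups
            rest = rng.permutation([j for j in range(p) if j not in important])
            groups = np.array_split(rest, n_groups)
            # search over all unions of groups, always keeping the "important" set
            for r in range(n_groups + 1):
                for combo in combinations(range(n_groups), r):
                    cols = tuple(sorted(important | {j for g in combo for j in groups[g]}))
                    score = bic(X, y, cols)
                    if score < best_bic:
                        best_bic, best_cols = score, cols
            important = set(best_cols)   # carry the current best variables forward
        return best_cols, best_bic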

4.
Sequential selection has been solved in linear time by Blum et al. [M.B. Blum, R.W. Floyd, V.R. Pratt, R.L. Rivest, R.E. Tarjan, Time bounds for selection, J. Comput. System Sci. 7 (4) (1972) 448–461]. Running this algorithm on a problem of size N with N > M, where M is the size of the main memory, results in an algorithm that reads and writes O(N) elements, while the number of comparisons is also bounded by O(N). This is asymptotically optimal, but the constants are so large that in practice sorting is faster for most values of M and N. This paper provides the first detailed study of the external selection problem. A randomized algorithm of a conventional type is close to optimal in all respects. Our deterministic algorithm is more or less the same, but first the algorithm builds an index structure of all the elements. This effort is not wasted: the index structure allows the retrieval of elements so that we do not need a second scan through all the data. This index structure can also be used for repeated selections, and can be extended over time. For a problem of size N, the deterministic algorithm reads N+o(N) elements and writes only o(N) elements and is thereby optimal to within lower-order terms.
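For contrast, the "conventional" randomized selection that external algorithms build on is in the spirit of quickselect; a minimal in-memory sketch (an external-memory version would stream the partitions to and from disk):

    import random

    def quickselect(items, k):
        """Return the k-th smallest element (1-based) using random pivots."""
        assert 1 <= k <= len(items)
        current = list(items)
        while True:
            pivot = random.choice(current)
            less = [x for x in current if x < pivot]
            equal = [x for x in current if x == pivot]
            if k <= len(less):
                current = less
            elif k <= len(less) + len(equal):
                return pivot
            else:
                k -= len(less) + len(equal)
                current = [x for x in current if x > pivot]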

5.
In this paper, Tseng and Lee's parallel algorithm for solving the stable marriage problem is analyzed. It is shown that the average number of parallel proposals of the algorithm is of order n using n processors on a CREW PRAM, where each parallel proposal requires O(log log(n)) time on a CREW PRAM by applying the parallel selection algorithms of Valiant or Shiloach and Vishkin. Therefore, our parallel algorithm requires O(n log log(n)) time. The speed-up achieved is log(n)/log log(n), since the average number of proposals required by applying McVitie and Wilson's algorithm to solve the stable marriage problem is O(n log(n)).
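For reference, the sequential proposal scheme (Gale-Shapley style, of which McVitie and Wilson's algorithm is a variant) that the parallel algorithm builds on can be sketched as follows; this is not the CREW PRAM algorithm analyzed in the paper.

    from collections import deque

    def stable_marriage(men_prefs, women_prefs):
        """Men propose in preference order; women keep their best proposer so far."""
        n = len(men_prefs)
        # rank[w][m] = position of man m in woman w's list (lower is better)
        rank = [{m: i for i, m in enumerate(prefs)} for prefs in women_prefs]
        next_choice = [0] * n            # next woman each man will propose to
        partner_of_woman = [None] * n
        free_men = deque(range(n))
        while free_men:
            m = free_men.popleft()
            w = men_prefs[m][next_choice[m]]
            next_choice[m] += 1
            cur = partner_of_woman[w]
            if cur is None:
                partner_of_woman[w] = m
            elif rank[w][m] < rank[w][cur]:
                partner_of_woman[w] = m
                free_men.append(cur)     # the rejected man proposes again later
            else:
                free_men.append(m)
        return {m: w for w, m in enumerate(partner_of_woman)}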

6.
Abstract

Proposed by Tibshirani, the least absolute shrinkage and selection operator (LASSO) estimates a vector of regression coefficients by minimizing the residual sum of squares subject to a constraint on the ℓ1-norm of the coefficient vector. The LASSO estimator typically has one or more zero elements and thus shares characteristics of both shrinkage estimation and variable selection. In this article we treat the LASSO as a convex programming problem and derive its dual. Consideration of the primal and dual problems together leads to important new insights into the characteristics of the LASSO estimator and to an improved method for estimating its covariance matrix. Using these results we also develop an efficient algorithm for computing LASSO estimates which is usable even in cases where the number of regressors exceeds the number of observations. An S-Plus library based on this algorithm is available from StatLib.
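As a generic illustration of computing a LASSO estimate (not the authors' primal-dual algorithm, and not their S-Plus library), here is a plain cyclic coordinate-descent sketch with soft-thresholding updates:

    import numpy as np

    def soft_threshold(z, t):
        return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

    def lasso_cd(X, y, lam, n_iter=200, tol=1e-8):
        """Minimize 0.5*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent."""
        n, p = X.shape
        beta = np.zeros(p)
        col_sq = (X ** 2).sum(axis=0)
        r = y - X @ beta
        for _ in range(n_iter):
            max_change = 0.0
            for j in range(p):
                if col_sq[j] == 0:
                    continue
                rho = X[:, j] @ r + col_sq[j] * beta[j]   # partial-residual correlation
                new_bj = soft_threshold(rho, lam) / col_sq[j]
                change = new_bj - beta[j]
                if change != 0.0:
                    r -= X[:, j] * change
                    beta[j] = new_bj
                    max_change = max(max_change, abs(change))
            if max_change < tol:
                break
        return beta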

7.
In this paper, we investigate the problem of determining the number of clusters in the k-modes-based categorical data clustering process. We propose a new categorical data clustering algorithm with automatic selection of k. The new algorithm extends the k-modes clustering algorithm by introducing a penalty term to the objective function to make more clusters compete for objects. In the new objective function, we employ a regularization parameter to control the number of clusters in a clustering process. Instead of finding k directly, we choose a suitable value of the regularization parameter such that the corresponding clustering result is the most stable one among all the generated clustering results. Experimental results on synthetic and real data sets are used to demonstrate the effectiveness of the proposed algorithm.
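To make the setting concrete, here is a bare-bones k-modes iteration (simple matching dissimilarity, modes updated by per-attribute majority). The penalty term and the regularization-parameter search that the paper adds on top of this objective are omitted; all names and settings are illustrative.

    import numpy as np

    def kmodes(X, k, n_iter=20, seed=0):
        """Basic k-modes clustering for categorical data (X: 2-D array of labels)."""
        rng = np.random.default_rng(seed)
        n, p = X.shape
        modes = X[rng.choice(n, size=k, replace=False)].copy()
        for _ in range(n_iter):
            # assignment step: simple matching dissimilarity (number of mismatches)
            dist = np.array([[np.sum(x != m) for m in modes] for x in X])
            labels = dist.argmin(axis=1)
            # update step: each mode attribute becomes the most frequent category
            for c in range(k):
                members = X[labels == c]
                if len(members) == 0:
                    continue
                for j in range(p):
                    vals, counts = np.unique(members[:, j], return_counts=True)
                    modes[c, j] = vals[counts.argmax()]
        cost = np.sum(dist[np.arange(n), labels])
        return labels, modes, cost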

8.
This paper investigates the use of multi-objective methods to guide the search when solving single-objective optimisation problems with genetic algorithms. Using the job shop scheduling and travelling salesman problems as examples, experiments demonstrate that the use of helper-objectives (additional objectives guiding the search) significantly improves the average performance of a standard GA. The helper-objectives guide the search towards solutions containing good building blocks and help the algorithm escape local optima. The experiments reveal that the approach works if the number of simultaneously used helper-objectives is low. However, a high number of helper-objectives can be used in the same run by changing the helper-objectives dynamically. The experiments reveal that for the majority of problem instances studied, the proposed approach significantly outperforms a traditional GA. The experiments also demonstrate that controlling the proportion of non-dominated solutions in the population is very important when using helper-objectives, since the presence of too many non-dominated solutions removes the selection pressure in the algorithm.

9.
UOBYQA: unconstrained optimization by quadratic approximation
UOBYQA is a new algorithm for general unconstrained optimization calculations that takes account of the curvature of the objective function, F say, by forming quadratic models by interpolation. Because no first derivatives are required, each model is defined by (n+1)(n+2)/2 values of F, where n is the number of variables, and the interpolation points must have the property that no nonzero quadratic polynomial vanishes at all of them. A typical iteration of the algorithm generates a new vector of variables, t say, either by minimizing the quadratic model subject to a trust region bound, or by a procedure that should improve the accuracy of the model. Then usually F(t) is obtained, and one of the interpolation points is replaced by t. Therefore the paper addresses the initial positions of the interpolation points, the adjustment of trust region radii, the calculation of t in the two cases that have been mentioned, and the selection of the point to be replaced. Further, UOBYQA works with the Lagrange functions of the interpolation equations explicitly, so their coefficients are updated when an interpolation point is moved. The Lagrange functions assist the procedure that improves the model, and they also provide an estimate of the error of the quadratic approximation to F, which allows the algorithm to achieve a fast rate of convergence. These features are discussed and a summary of the algorithm is given. Finally, a Fortran implementation of UOBYQA is applied to several choices of F in order to investigate accuracy, robustness in the presence of rounding errors, the effects of first derivative discontinuities, and the amount of work. The numerical results are very promising for n ≤ 20, but larger values are problematical, because the routine work of an iteration is of fourth order in the number of variables. Received: December 7, 2000 / Accepted: August 31, 2001 / Published online: April 12, 2002

10.
Enumeration of spanning trees of an undirected graph is one of the graph problems that has received much attention in the literature. In this paper a new enumeration algorithm based on the idea of contractions of the graph is presented. The worst-case time complexity of the algorithm is O(n + m + nt), where n is the number of vertices, m the number of edges, and t the number of spanning trees in the graph. The worst-case space complexity of the algorithm is O(n^2). Computational analysis indicates that the algorithm requires less computation time than any other of the previously best-known algorithms.
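The contraction idea can be illustrated with the classical deletion-contraction recursion; the sketch below merely counts spanning trees rather than enumerating them and makes no attempt at the paper's O(n + m + nt) bound.

    def contract(edges, u, v):
        """Merge vertex v into u; drop self-loops, keep parallel edges."""
        out = []
        for a, b in edges:
            a = u if a == v else a
            b = u if b == v else b
            if a != b:
                out.append((a, b))
        return out

    def count_spanning_trees(edges, n_vertices):
        """Deletion-contraction recursion: t(G) = t(G - e) + t(G / e)."""
        if not edges:
            return 1 if n_vertices == 1 else 0
        u, v = edges[0]
        deleted = count_spanning_trees(edges[1:], n_vertices)            # trees avoiding e
        contracted = count_spanning_trees(contract(edges[1:], u, v),     # trees using e
                                          n_vertices - 1)
        return deleted + contracted

    # Example: the complete graph K4 has 4^(4-2) = 16 spanning trees (Cayley's formula).
    k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
    assert count_spanning_trees(k4, 4) == 16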

11.
Microarrays are part of a new class of biotechnologies which allow the monitoring of expression levels for thousands of genes simultaneously. Image analysis is an important aspect of microarray experiments, one that can have a potentially large impact on subsequent analyses such as clustering or the identification of differentially expressed genes. This article reviews a number of existing image analysis approaches for cDNA microarray experiments and proposes new addressing, segmentation, and background correction methods for extracting information from microarray scanned images. The segmentation component uses a seeded region growing algorithm which makes provision for spots of different shapes and sizes. The background estimation approach is based on an image analysis technique known as morphological opening. These new image analysis procedures are implemented in a software package named Spot, built on the R environment for statistical computing. The statistical properties of the different segmentation and background adjustment methods are examined using microarray data from a study of lipid metabolism in mice. It is shown that in some cases background adjustment can substantially reduce the precision—that is, increase the variability—of low-intensity spot values. In contrast, the choice of segmentation procedure has a smaller impact. The comparison further suggests that seeded region growing segmentation with morphological background correction provides precise and accurate estimates of foreground and background intensities.
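The background-estimation step described above, grayscale morphological opening, is available directly in scipy.ndimage; a minimal sketch in which the structuring-element size is an assumed tuning parameter:

    import numpy as np
    from scipy import ndimage

    def opening_background(image, size=25):
        """Estimate a smooth background by grayscale morphological opening
        (erosion followed by dilation with a flat structuring element)."""
        background = ndimage.grey_opening(image, size=(size, size))
        corrected = image.astype(float) - background   # background-adjusted intensities
        return background, corrected

    # Example on a synthetic "scanned image": a bright spot on a sloped background.
    img = np.add.outer(np.linspace(0, 50, 200), np.linspace(0, 50, 200))
    img[60:70, 60:70] += 500.0
    bg, corrected = opening_background(img)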

12.
Summary. A univariate compactly supported refinable function φ can always be written as the convolution product φ = B_k * f, with B_k the B-spline of order k, f a compactly supported distribution, and k the approximation order provided by the underlying shift-invariant space S(φ). Factorizations of univariate refinable vectors were also studied and utilized in the literature. One of the by-products of this article is a rigorous analysis of that factorization notion, including, possibly, the first precise definition of that process. The main goal of this article is the introduction of a special factorization algorithm for refinable vectors that generalizes the scalar case as closely (and unexpectedly) as possible: the original vector Φ is shown to be 'almost' in the form Φ = B_k * F, with F still compactly supported and refinable, and k the approximation order of S(Φ). The algorithm guarantees F to retain the possible favorable properties of Φ, such as the stability of the shifts of Φ and/or the polynomiality of the mask symbol. At the same time, the theory and the algorithm are derived under relatively mild conditions and, in particular, apply to Φ whose shifts are not stable, as well as to refinable vectors which are not compactly supported. The usefulness of this specific factorization for the study of the smoothness of FSI wavelets (known also as 'multiwavelets' and 'multiple wavelets') is explained. The analysis invokes in an essential way the theory of finitely generated shift-invariant (FSI) spaces and, in particular, the tool of superfunction theory. Received: June 10, 1998 / Revised version received: June 14, 1999 / Published online: August 2, 2000

13.
We present a novel optimization algorithm for computing the ranges of multivariate polynomials using the Bernstein polynomial approach. The proposed algorithm incorporates four accelerating devices, namely the cut-off test, the simplified vertex test, the monotonicity test, and the concavity test, and also possesses many new features, such as the generalized matrix method for Bernstein coefficient computation, a new subdivision direction selection rule, and a new subdivision point selection rule. The features and capabilities of the proposed algorithm are compared with those of other optimization techniques: interval global optimization, the filled function method, a global optimization method for imprecise problems, and a hybrid approach combining simulated annealing, tabu search, and a descent method. The superiority of the proposed method over the latter methods is illustrated by numerical experiments and qualitative comparisons.
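The enclosure property underlying the Bernstein approach is easy to state for a univariate polynomial on [0, 1]: its range lies between the smallest and largest Bernstein coefficients. A small sketch of that basic fact (the paper itself treats multivariate polynomials with subdivision and the accelerating tests listed above):

    from math import comb

    def bernstein_coefficients(a):
        """Bernstein coefficients on [0, 1] of p(x) = a[0] + a[1]*x + ... + a[n]*x^n."""
        n = len(a) - 1
        return [sum(comb(j, i) / comb(n, i) * a[i] for i in range(j + 1))
                for j in range(n + 1)]

    def range_enclosure(a):
        """Interval guaranteed to contain {p(x) : 0 <= x <= 1}."""
        b = bernstein_coefficients(a)
        return min(b), max(b)

    # p(x) = 1 - 3x + 2x^2 has range [-1/8, 1] on [0, 1]; the enclosure
    # printed here is (-0.5, 1.0), valid though not tight.
    print(range_enclosure([1.0, -3.0, 2.0]))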

14.
In this paper we consider a problem of distance selection in the arrangement of hyperplanes induced by n given points. Given a set of n points in d-dimensional space and a number k, determine the hyperplane that is spanned by d of the points and whose distance from the origin has rank k. For the planar case we present an O(n log^2 n) runtime algorithm using parametric search, partly different from the usual approach [N. Megiddo, J. ACM 30 (1983) 852]. We establish a connection between this problem in 3-d and the well-known 3SUM problem using an auxiliary problem of counting the number of vertices in the arrangement of n planes that lie between two sheets of a hyperboloid. We show that the 3-d problem is almost 3SUM-hard and solve it by an O(n^2 log^2 n) runtime algorithm. We generalize these results to the d-dimensional (d ≥ 4) space and also consider a problem of enumerating distances.

15.
A new algorithm for rearranging a heap is presented and analysed in the average case. The average case upper bound for deleting the maximum element of a random heap is improved, and is shown to be less than [log n] + 0.299 + M(n) comparisons, where M(n) is between 0 and 1. It is also shown that a heap can be constructed using 1.650n + O(log n) comparisons with this algorithm, the best result for any algorithm which does not use any extra space. The expected time to sort n elements is argued to be less than n log n + 0.670n + O(log n), while simulation results point at an average case of n log n + 0.4n, which would make it the fastest in-place sorting algorithm. The same technique is used to show that the average number of comparisons when deleting the maximum element of a heap using Williams' algorithm for rearrangement is 2([log n] - 1.299 + L(n)), where L(n) is also between 0 and 1, and the average cost of Floyd-Williams Heapsort is at least 2n log n - 3.27n, counting only comparisons. An analysis of the number of interchanges when deleting the maximum element of a random heap, which is the same for both algorithms, is also presented.
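The flavour of an improved rearrangement step can be sketched with the familiar "walk the hole down along the larger children, then bubble the displaced element back up" technique, which spends roughly one comparison per level on the way down. This is an illustration of the general idea only, not a reproduction of the paper's algorithm or its constants.

    def delete_max(heap):
        """Remove and return the maximum of a list-based binary max-heap."""
        top, last = heap[0], heap.pop()
        n = len(heap)
        if n == 0:
            return top
        # Phase 1: move the hole down along the path of larger children
        # (one comparison per level).
        i = 0
        while 2 * i + 1 < n:
            j = 2 * i + 1
            if j + 1 < n and heap[j + 1] > heap[j]:
                j += 1
            heap[i] = heap[j]
            i = j
        # Phase 2: bubble the former last element up from the leaf position
        # (few comparisons on average).
        while i > 0 and heap[(i - 1) // 2] < last:
            heap[i] = heap[(i - 1) // 2]
            i = (i - 1) // 2
        heap[i] = last
        return top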

16.
We present a new branch-and-cut algorithm for the capacitated vehicle routing problem (CVRP). The algorithm uses a variety of cutting planes, including capacity, framed capacity, generalized capacity, strengthened comb, multistar, partial multistar, extended hypotour inequalities, and classical Gomory mixed-integer cuts. For each of these classes of inequalities we describe our separation algorithms in detail. Also we describe the other important ingredients of our branch-and-cut algorithm, such as the branching rules, the node selection strategy, and the cut pool management. Computational results, for a large number of instances, show that the new algorithm is competitive. In particular, we solve three instances (B-n50-k8, B-n66-k9 and B-n78-k10) of Augerat to optimality for the first time.
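Of the cutting planes listed, the rounded capacity inequalities are the simplest to state: a customer set S with total demand d(S) and vehicle capacity Q must have at least 2*ceil(d(S)/Q) units of edge value crossing its boundary. A hedged sketch of checking one such inequality against a fractional LP solution (the paper's separation routines are far more elaborate); the data layout is an assumption:

    import math

    def capacity_cut_violation(S, demand, Q, x_edge):
        """Violation of the rounded capacity inequality for a customer set S.

        x_edge maps frozenset({i, j}) -> value of the LP edge variable; the
        depot must not belong to S. A positive return value means
        x(delta(S)) < 2*ceil(d(S)/Q), i.e. the inequality is violated."""
        S = set(S)
        required = 2 * math.ceil(sum(demand[i] for i in S) / Q)
        crossing = sum(val for e, val in x_edge.items() if len(e & S) == 1)
        return required - crossing

    # Hypothetical fractional solution on customers {1, 2, 3} with depot 0:
    x = {frozenset({0, 1}): 1.0, frozenset({1, 2}): 1.0,
         frozenset({2, 3}): 1.0, frozenset({0, 3}): 1.0}
    print(capacity_cut_violation({1, 2, 3}, demand={1: 4, 2: 4, 3: 4}, Q=10, x_edge=x))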

17.
As a part of a heuristic for the fast detection of new word combinations in text streams, we consider the NP-hard Partial Set Cover of Pairs problem. There we wish to cover a maximum number of pairs of elements by a prescribed number of sets from a given set family. While the approximation ratio of the greedy algorithm for the classic Partial Set Cover problem is completely understood, the same question for covering of pairs is intrinsically more complicated, since the pairs insert some graph-theoretic structure. The best approximation guarantee for the first greedy step can be rephrased as a problem in extremal combinatorics: Assume that we may place a fixed number of subsets of fixed and equal size in a set, how many different pairs of elements can we cover? In this paper we introduce a method to calculate optimal approximation guarantees, and we demonstrate its use on the smallest set families.
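A minimal sketch of the greedy step analyzed above: given a budget of sets, repeatedly pick the set covering the most not-yet-covered pairs. The input representation and the toy data are illustrative assumptions:

    from itertools import combinations

    def greedy_pair_cover(sets, budget):
        """Greedily choose `budget` sets to cover as many element pairs as possible.
        A pair {u, v} counts as covered once some chosen set contains both u and v."""
        covered = set()
        chosen = []
        for _ in range(budget):
            best_idx, best_gain = None, -1
            for idx, s in enumerate(sets):
                gain = sum(1 for p in combinations(sorted(s), 2)
                           if frozenset(p) not in covered)
                if gain > best_gain:
                    best_idx, best_gain = idx, gain
            chosen.append(best_idx)
            covered.update(frozenset(p) for p in combinations(sorted(sets[best_idx]), 2))
        return chosen, len(covered)

    # Toy word-combination example: each set is the vocabulary of one text snippet.
    snippets = [{"new", "word", "stream"}, {"word", "stream"}, {"new", "combination"}]
    print(greedy_pair_cover(snippets, budget=2))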

18.
Variable selection is an important aspect of high-dimensional statistical modeling, particularly in regression and classification. In the regularization framework, various penalty functions are used to perform variable selection by putting relatively large penalties on small coefficients. The L1 penalty is a popular choice because of its convexity, but it produces biased estimates for the large coefficients. The L0 penalty is attractive for variable selection because it directly penalizes the number of nonzero coefficients. However, the optimization involved is discontinuous and nonconvex, and therefore it is very challenging to implement. Moreover, its solution may not be stable. In this article, we propose a new penalty that combines the L0 and L1 penalties. We implement this new penalty by developing a global optimization algorithm using mixed integer programming (MIP). We compare this combined penalty with several other penalties via simulated examples as well as real applications. The results show that the new penalty outperforms both the L0 and L1 penalties in terms of variable selection while maintaining good prediction accuracy.
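To make the combined penalty concrete for a handful of predictors, the brute-force sketch below enumerates every support, fits ordinary least squares on it, and compares RSS + lambda0*(number of nonzero coefficients) + lambda1*(L1 norm of the fitted coefficients). This only illustrates the criterion; the article solves the problem at scale with mixed integer programming, and within-support shrinkage is ignored here. All names are illustrative.

    import numpy as np
    from itertools import combinations

    def combined_penalty_best_subset(X, y, lam0, lam1):
        """Exhaustively minimize RSS + lam0*||beta||_0 + lam1*||beta||_1 over supports
        (no intercept assumed; feasible only for a small number of predictors)."""
        n, p = X.shape
        best = (np.inf, (), None)
        for size in range(p + 1):
            for support in combinations(range(p), size):
                if support:
                    Xs = X[:, list(support)]
                    beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
                    rss = np.sum((y - Xs @ beta) ** 2)
                else:
                    beta = np.array([])
                    rss = np.sum(y ** 2)
                obj = rss + lam0 * size + lam1 * np.sum(np.abs(beta))
                if obj < best[0]:
                    best = (obj, support, beta)
        return best   # (objective value, selected columns, coefficients on the support)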

19.
Optimization, 2012, 61(6): 843-853
In this paper we consider different classes of nonconvex quadratic problems that can be solved in polynomial time. We present an algorithm for the problem of minimizing the product of two linear functions over a polyhedron P in R^n. The complexity of the algorithm depends on the number of vertices of the projection of P onto R^2. In the worst case this algorithm requires an exponential number of steps, but its expected computational time complexity is polynomial. In addition, we give a characterization of the number of isolated local minimum areas for problems of this form.

Furthermore, we consider indefinite quadratic problems with variables restricted to be nonnegative. These problems can be solved in polynomial time if the number of negative eigenvalues of the associated symmetric matrix is fixed.

20.
Summary. Regression and classification problems can be viewed as special cases of the problem of function estimation. It is rather well known that a two-layer perceptron with sigmoidal transformation functions can approximate any continuous function on compact subsets of R^p if there is a sufficient number of hidden nodes. In this paper, we present an algorithm for fitting perceptron models which is quite different from the usual backpropagation or Levenberg-Marquardt algorithm. This new algorithm, based on backfitting, ensures better convergence than backpropagation. We have also used resampling techniques to select an ideal number of hidden nodes automatically using the training data itself. This resampling technique helps to avoid the problem of overfitting that one faces with the usual perceptron learning algorithms without any model selection scheme. Case studies and simulation results are presented to illustrate the performance of the proposed algorithm.
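A rough sketch of the backfitting flavour for a one-hidden-layer sigmoidal network: each hidden node is refit in turn against the partial residual left by the other nodes. This is a simplified stand-in for the authors' procedure (their estimation details and the resampling-based choice of the number of hidden nodes are omitted), with illustrative learning-rate and iteration settings.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def backfit_mlp(X, y, n_hidden=4, sweeps=30, inner_steps=25, lr=0.1, seed=0):
        rng = np.random.default_rng(seed)
        n, p = X.shape
        W = rng.normal(scale=0.5, size=(n_hidden, p))   # input weights of each node
        c = np.zeros(n_hidden)                          # node biases
        beta = np.zeros(n_hidden)                       # output weights
        b0 = y.mean()                                   # output bias

        def h(k):
            return sigmoid(X @ W[k] + c[k])

        for _ in range(sweeps):
            for k in range(n_hidden):
                # partial residual: target minus the fit of all *other* nodes
                others = b0 + sum(beta[j] * h(j) for j in range(n_hidden) if j != k)
                r = y - others
                for _ in range(inner_steps):            # refit node k alone by gradient steps
                    hk = h(k)
                    err = beta[k] * hk - r
                    g_beta = np.mean(err * hk)
                    g_c = np.mean(err * beta[k] * hk * (1 - hk))
                    g_W = (err * beta[k] * hk * (1 - hk)) @ X / n
                    beta[k] -= lr * g_beta
                    c[k] -= lr * g_c
                    W[k] -= lr * g_W

        def predict(X_new):
            return b0 + sum(beta[k] * sigmoid(X_new @ W[k] + c[k]) for k in range(n_hidden))
        return predict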
