Similar Documents
20 similar documents found.
1.
The modern systems biology approach to molecular cell biology consists of developing computational tools that support the formulation of new hypotheses about the molecular mechanisms underlying observed cell behavior. Recent biotechnologies can provide precise measurements of gene expression time courses in response to a large variety of internal and environmental perturbations. In this paper, we propose a simple algorithm for selecting the “best” regulatory network motif among a number of alternatives, using the expression time courses of the genes that are the final targets of the activated signalling pathway. To this aim, we consider the Hill nonlinear ODE model to simulate the behavior of two ubiquitous motifs: the single-input motif and the multi-output feed-forward loop motif. Our algorithm has been tested on simulated noisy data assuming the presence of a step-wise regulatory signal. The results clearly show that our method is potentially able to robustly discriminate between alternative motifs, thus providing a useful in silico identification tool for the experimenter.
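As a rough illustration of the kind of simulation described above, the sketch below integrates a hypothetical single-target Hill-type ODE driven by a step-wise regulatory signal; the function names, parameter values, and single-gene setup are illustrative assumptions, not the paper's model or code.

    import numpy as np
    from scipy.integrate import odeint

    def step_signal(t, t_on=5.0, level=1.0):
        """Step-wise regulatory input: off before t_on, constant afterwards."""
        return level if t >= t_on else 0.0

    def target_ode(y, t, beta, K, n, gamma):
        """Hill-type activation of a target gene minus first-order degradation."""
        s = step_signal(t)
        return beta * s**n / (K**n + s**n) - gamma * y

    t = np.linspace(0.0, 40.0, 400)
    y = odeint(target_ode, 0.0, t, args=(1.0, 0.5, 2.0, 0.1))
    # y now holds a noiseless expression time course; Gaussian noise could be
    # added before feeding it to a motif-selection procedure.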

2.
This article studies M-type estimators for fitting robust generalized additive models in the presence of anomalous data. A new theoretical construct is developed to connect the costly M-type estimation with least-squares type calculations. Its asymptotic properties are studied and used to motivate a computational algorithm. The main idea is to decompose the overall M-type estimation problem into a sequence of well-studied conventional additive model fittings. The resulting algorithm is fast and stable, can be paired with different nonparametric smoothers, and can also be applied to cases with multiple covariates. As another contribution of this article, automatic methods for smoothing parameter selection are proposed. These methods are designed to be resistant to outliers. The empirical performance of the proposed methodology is illustrated via both simulation experiments and real data analysis. Supplementary materials are available online.
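The link between M-type estimation and least-squares calculations can be pictured with iteratively reweighted least squares for an ordinary linear predictor. This is only a sketch of the general idea under simplifying assumptions (a linear model rather than an additive one, a Huber psi with an illustrative tuning constant), not the article's algorithm.

    import numpy as np

    def huber_psi(r, c=1.345):
        """Huber psi function: identity for small residuals, clipped beyond c."""
        return np.clip(r, -c, c)

    def m_type_fit(X, y, c=1.345, n_iter=50, tol=1e-8):
        """M-type linear fit via iteratively reweighted least squares."""
        beta = np.linalg.lstsq(X, y, rcond=None)[0]          # least-squares start
        for _ in range(n_iter):
            r = y - X @ beta
            scale = np.median(np.abs(r)) / 0.6745 + 1e-12    # robust scale (MAD)
            u = r / scale
            with np.errstate(divide="ignore", invalid="ignore"):
                w = huber_psi(u, c) / u                      # psi(u)/u weights
            w = np.where(np.abs(u) < 1e-12, 1.0, w)
            root_w = np.sqrt(w)
            beta_new = np.linalg.lstsq(X * root_w[:, None], root_w * y, rcond=None)[0]
            if np.max(np.abs(beta_new - beta)) < tol:
                return beta_new
            beta = beta_new
        return beta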

3.
With advanced capability in data collection, applications of linear regression analysis now often involve a large number of predictors. Variable selection thus has become an increasingly important issue in building a linear regression model. For a given selection criterion, variable selection is essentially an optimization problem that seeks the optimal solution over 2^m possible linear regression models, where m is the total number of candidate predictors. When m is large, exhaustive search becomes practically impossible. Simple suboptimal procedures such as forward addition, backward elimination, and the backward-forward stepwise procedure are fast but can easily be trapped in a local solution. In this article we propose a relatively simple algorithm for selecting explanatory variables in a linear regression for a given variable selection criterion. Although the algorithm is still a suboptimal algorithm, it has been shown to perform well in an extensive empirical study. The main idea of the procedure is to partition the candidate predictors into a small number of groups. Working with various combinations of the groups and iterating the search through random regrouping, the search space is substantially reduced, hence increasing the probability of finding the global optimum. By identifying and collecting “important” variables throughout the iterations, the algorithm finds increasingly better models until convergence. The proposed algorithm performs well in simulation studies with 60 to 300 predictors. As a by-product of the proposed procedure, we are able to study the behavior of variable selection criteria when the number of predictors is large. Such a study has not been possible with traditional search algorithms.

This article has supplementary material online.
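A rough sketch of the grouping-and-regrouping idea, using BIC as the selection criterion: predictors are randomly partitioned into a few groups, all unions of groups are scored, and the variables of the best model so far are carried forward into the next round. The group count, scoring criterion, and carry-over rule here are simplifying assumptions, not the paper's exact procedure.

    import numpy as np
    from itertools import combinations

    def bic(X, y, cols):
        """BIC of an OLS fit on the selected columns."""
        n = len(y)
        if not cols:
            rss = np.sum((y - y.mean()) ** 2)
        else:
            Xs = X[:, list(cols)]
            beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
            rss = np.sum((y - Xs @ beta) ** 2)
        return n * np.log(rss / n) + len(cols) * np.log(n)

    def grouped_selection(X, y, n_groups=4, n_rounds=20, seed=0):
        rng = np.random.default_rng(seed)
        p = X.shape[1]
        important = set()
        best_cols, best_bic = (), bic(X, y, ())
        for _ in range(n_rounds):
            # randomly partition the remaining predictors into groups
            rest = rng.permutation([j for j in range(p) if j not in important])
            groups = np.array_split(rest, n_groups)
            # search over all unions of groups, always keeping the "important" set
            for r in range(n_groups + 1):
                for combo in combinations(range(n_groups), r):
                    cols = tuple(sorted(important | {j for g in combo for j in groups[g]}))
                    score = bic(X, y, cols)
                    if score < best_bic:
                        best_bic, best_cols = score, cols
            important = set(best_cols)   # carry the current best variables forward
        return best_cols, best_bic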

4.
Sequential selection has been solved in linear time by Blum et al. [M.B. Blum, R.W. Floyd, V.R. Pratt, R.L. Rivest, R.E. Tarjan, Time bounds for selection, J. Comput. System Sci. 7 (4) (1972) 448–461]. Running this algorithm on a problem of size N with N > M, where M is the size of the main memory, results in an algorithm that reads and writes O(N) elements, while the number of comparisons is also bounded by O(N). This is asymptotically optimal, but the constants are so large that in practice sorting is faster for most values of M and N. This paper provides the first detailed study of the external selection problem. A randomized algorithm of a conventional type is close to optimal in all respects. Our deterministic algorithm is more or less the same, but first the algorithm builds an index structure of all the elements. This effort is not wasted: the index structure allows the retrieval of elements so that we do not need a second scan through all the data. This index structure can also be used for repeated selections, and can be extended over time. For a problem of size N, the deterministic algorithm reads N+o(N) elements and writes only o(N) elements and is thereby optimal to within lower-order terms.
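For contrast, the "conventional" randomized selection that external algorithms build on is in the spirit of quickselect; a minimal in-memory sketch (an external-memory version would stream the partitions to and from disk):

    import random

    def quickselect(items, k):
        """Return the k-th smallest element (1-based) using random pivots."""
        assert 1 <= k <= len(items)
        current = list(items)
        while True:
            pivot = random.choice(current)
            less = [x for x in current if x < pivot]
            equal = [x for x in current if x == pivot]
            if k <= len(less):
                current = less
            elif k <= len(less) + len(equal):
                return pivot
            else:
                k -= len(less) + len(equal)
                current = [x for x in current if x > pivot]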

5.
In this paper, Tseng and Lee's parallel algorithm for solving the stable marriage problem is analyzed. It is shown that the average number of parallel proposals of the algorithm is of order n using n processors on a CREW PRAM, where each parallel proposal requires O(log log(n)) time on a CREW PRAM by applying the parallel selection algorithms of Valiant or Shiloach and Vishkin. Therefore, our parallel algorithm requires O(n log log(n)) time. The speed-up achieved is log(n)/log log(n), since the average number of proposals required by applying McVitie and Wilson's algorithm to solve the stable marriage problem is O(n log(n)).
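For reference, the sequential proposal scheme (Gale-Shapley style, of which McVitie and Wilson's algorithm is a variant) that the parallel algorithm builds on can be sketched as follows; this is not the CREW PRAM algorithm analyzed in the paper.

    from collections import deque

    def stable_marriage(men_prefs, women_prefs):
        """Men propose in preference order; women keep their best proposer so far."""
        n = len(men_prefs)
        # rank[w][m] = position of man m in woman w's list (lower is better)
        rank = [{m: i for i, m in enumerate(prefs)} for prefs in women_prefs]
        next_choice = [0] * n            # next woman each man will propose to
        partner_of_woman = [None] * n
        free_men = deque(range(n))
        while free_men:
            m = free_men.popleft()
            w = men_prefs[m][next_choice[m]]
            next_choice[m] += 1
            cur = partner_of_woman[w]
            if cur is None:
                partner_of_woman[w] = m
            elif rank[w][m] < rank[w][cur]:
                partner_of_woman[w] = m
                free_men.append(cur)     # the rejected man proposes again later
            else:
                free_men.append(m)
        return {m: w for w, m in enumerate(partner_of_woman)}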

6.
Abstract

Proposed by Tibshirani, the least absolute shrinkage and selection operator (LASSO) estimates a vector of regression coefficients by minimizing the residual sum of squares subject to a constraint on the ℓ1-norm of the coefficient vector. The LASSO estimator typically has one or more zero elements and thus shares characteristics of both shrinkage estimation and variable selection. In this article we treat the LASSO as a convex programming problem and derive its dual. Consideration of the primal and dual problems together leads to important new insights into the characteristics of the LASSO estimator and to an improved method for estimating its covariance matrix. Using these results we also develop an efficient algorithm for computing LASSO estimates which is usable even in cases where the number of regressors exceeds the number of observations. An S-Plus library based on this algorithm is available from StatLib.
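As a generic illustration of computing a LASSO estimate (not the authors' primal-dual algorithm, and not their S-Plus library), here is a plain cyclic coordinate-descent sketch with soft-thresholding updates:

    import numpy as np

    def soft_threshold(z, t):
        return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

    def lasso_cd(X, y, lam, n_iter=200, tol=1e-8):
        """Minimize 0.5*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent."""
        n, p = X.shape
        beta = np.zeros(p)
        col_sq = (X ** 2).sum(axis=0)
        r = y - X @ beta
        for _ in range(n_iter):
            max_change = 0.0
            for j in range(p):
                if col_sq[j] == 0:
                    continue
                rho = X[:, j] @ r + col_sq[j] * beta[j]   # partial-residual correlation
                new_bj = soft_threshold(rho, lam) / col_sq[j]
                change = new_bj - beta[j]
                if change != 0.0:
                    r -= X[:, j] * change
                    beta[j] = new_bj
                    max_change = max(max_change, abs(change))
            if max_change < tol:
                break
        return beta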

7.
In this paper, we investigate the problem of determining the number of clusters in the k-modes-based categorical data clustering process. We propose a new categorical data clustering algorithm with automatic selection of k. The new algorithm extends the k-modes clustering algorithm by introducing a penalty term to the objective function to make more clusters compete for objects. In the new objective function, we employ a regularization parameter to control the number of clusters in a clustering process. Instead of finding k directly, we choose a suitable value of the regularization parameter such that the corresponding clustering result is the most stable one among all the generated clustering results. Experimental results on synthetic and real data sets are used to demonstrate the effectiveness of the proposed algorithm.
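To make the setting concrete, here is a bare-bones k-modes iteration (simple matching dissimilarity, modes updated by per-attribute majority). The penalty term and the regularization-parameter search that the paper adds on top of this objective are omitted; all names and settings are illustrative.

    import numpy as np

    def kmodes(X, k, n_iter=20, seed=0):
        """Basic k-modes clustering for categorical data (X: 2-D array of labels)."""
        rng = np.random.default_rng(seed)
        n, p = X.shape
        modes = X[rng.choice(n, size=k, replace=False)].copy()
        for _ in range(n_iter):
            # assignment step: simple matching dissimilarity (number of mismatches)
            dist = np.array([[np.sum(x != m) for m in modes] for x in X])
            labels = dist.argmin(axis=1)
            # update step: each mode attribute becomes the most frequent category
            for c in range(k):
                members = X[labels == c]
                if len(members) == 0:
                    continue
                for j in range(p):
                    vals, counts = np.unique(members[:, j], return_counts=True)
                    modes[c, j] = vals[counts.argmax()]
        cost = np.sum(dist[np.arange(n), labels])
        return labels, modes, cost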

8.
This paper investigates the use of multi-objective methods to guide the search when solving single-objective optimisation problems with genetic algorithms. Using the job shop scheduling and travelling salesman problems as examples, experiments demonstrate that the use of helper-objectives (additional objectives guiding the search) significantly improves the average performance of a standard GA. The helper-objectives guide the search towards solutions containing good building blocks and help the algorithm escape local optima. The experiments reveal that the approach works if the number of simultaneously used helper-objectives is low. However, a high number of helper-objectives can be used in the same run by changing the helper-objectives dynamically. The experiments reveal that for the majority of problem instances studied, the proposed approach significantly outperforms a traditional GA. The experiments also demonstrate that controlling the proportion of non-dominated solutions in the population is very important when using helper-objectives, since the presence of too many non-dominated solutions removes the selection pressure in the algorithm.

9.
UOBYQA: unconstrained optimization by quadratic approximation
UOBYQA is a new algorithm for general unconstrained optimization calculations that takes account of the curvature of the objective function, F say, by forming quadratic models by interpolation. Because no first derivatives are required, each model is defined by (n+1)(n+2)/2 values of F, where n is the number of variables, and the interpolation points must have the property that no nonzero quadratic polynomial vanishes at all of them. A typical iteration of the algorithm generates a new vector of variables, t say, either by minimizing the quadratic model subject to a trust region bound, or by a procedure that should improve the accuracy of the model. Then usually F(t) is obtained, and one of the interpolation points is replaced by t. Therefore the paper addresses the initial positions of the interpolation points, the adjustment of trust region radii, the calculation of t in the two cases that have been mentioned, and the selection of the point to be replaced. Further, UOBYQA works with the Lagrange functions of the interpolation equations explicitly, so their coefficients are updated when an interpolation point is moved. The Lagrange functions assist the procedure that improves the model, and they also provide an estimate of the error of the quadratic approximation to F, which allows the algorithm to achieve a fast rate of convergence. These features are discussed and a summary of the algorithm is given. Finally, a Fortran implementation of UOBYQA is applied to several choices of F in order to investigate accuracy, robustness in the presence of rounding errors, the effects of first derivative discontinuities, and the amount of work. The numerical results are very promising for n ≤ 20, but larger values are problematical, because the routine work of an iteration is of fourth order in the number of variables. Received: December 7, 2000 / Accepted: August 31, 2001 / Published online: April 12, 2002

10.
Enumeration of spanning trees of an undirected graph is one of the graph problems that has received much attention in the literature. In this paper a new enumeration algorithm based on the idea of contractions of the graph is presented. The worst-case time complexity of the algorithm is O(n + m + nt), where n is the number of vertices, m the number of edges, and t the number of spanning trees in the graph. The worst-case space complexity of the algorithm is O(n^2). Computational analysis indicates that the algorithm requires less computation time than any other of the previously best-known algorithms.
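The contraction idea can be illustrated with the classical deletion-contraction recursion; the sketch below merely counts spanning trees rather than enumerating them and makes no attempt at the paper's O(n + m + nt) bound.

    def contract(edges, u, v):
        """Merge vertex v into u; drop self-loops, keep parallel edges."""
        out = []
        for a, b in edges:
            a = u if a == v else a
            b = u if b == v else b
            if a != b:
                out.append((a, b))
        return out

    def count_spanning_trees(edges, n_vertices):
        """Deletion-contraction recursion: t(G) = t(G - e) + t(G / e)."""
        if not edges:
            return 1 if n_vertices == 1 else 0
        u, v = edges[0]
        deleted = count_spanning_trees(edges[1:], n_vertices)            # trees avoiding e
        contracted = count_spanning_trees(contract(edges[1:], u, v),     # trees using e
                                          n_vertices - 1)
        return deleted + contracted

    # Example: the complete graph K4 has 4^(4-2) = 16 spanning trees (Cayley's formula).
    k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
    assert count_spanning_trees(k4, 4) == 16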

11.
Microarrays are part of a new class of biotechnologies which allow the monitoring of expression levels for thousands of genes simultaneously. Image analysis is an important aspect of microarray experiments, one that can have a potentially large impact on subsequent analyses such as clustering or the identification of differentially expressed genes. This article reviews a number of existing image analysis approaches for cDNA microarray experiments and proposes new addressing, segmentation, and background correction methods for extracting information from microarray scanned images. The segmentation component uses a seeded region growing algorithm which makes provision for spots of different shapes and sizes. The background estimation approach is based on an image analysis technique known as morphological opening. These new image analysis procedures are implemented in a software package named Spot, built on the R environment for statistical computing. The statistical properties of the different segmentation and background adjustment methods are examined using microarray data from a study of lipid metabolism in mice. It is shown that in some cases background adjustment can substantially reduce the precision—that is, increase the variability—of low-intensity spot values. In contrast, the choice of segmentation procedure has a smaller impact. The comparison further suggests that seeded region growing segmentation with morphological background correction provides precise and accurate estimates of foreground and background intensities.
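The background-estimation step described above, grayscale morphological opening, is available directly in scipy.ndimage; a minimal sketch in which the structuring-element size is an assumed tuning parameter:

    import numpy as np
    from scipy import ndimage

    def opening_background(image, size=25):
        """Estimate a smooth background by grayscale morphological opening
        (erosion followed by dilation with a flat structuring element)."""
        background = ndimage.grey_opening(image, size=(size, size))
        corrected = image.astype(float) - background   # background-adjusted intensities
        return background, corrected

    # Example on a synthetic "scanned image": a bright spot on a sloped background.
    img = np.add.outer(np.linspace(0, 50, 200), np.linspace(0, 50, 200))
    img[60:70, 60:70] += 500.0
    bg, corrected = opening_background(img)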

12.
Summary. A univariate compactly supported refinable function φ can always be written as the convolution product φ = B_k * f, with B_k the B-spline of order k, f a compactly supported distribution, and k the approximation order provided by the underlying shift-invariant space S(φ). Factorizations of univariate refinable vectors were also studied and utilized in the literature. One of the by-products of this article is a rigorous analysis of that factorization notion, including, possibly, the first precise definition of that process. The main goal of this article is the introduction of a special factorization algorithm for refinable vectors that generalizes the scalar case as closely (and unexpectedly) as possible: the original vector Φ is shown to be 'almost' in the form Φ = B_k * F, with F still compactly supported and refinable, and k the approximation order of S(Φ). The algorithm guarantees F to retain the possible favorable properties of Φ, such as the stability of the shifts of Φ and/or the polynomiality of the mask symbol. At the same time, the theory and the algorithm are derived under relatively mild conditions and, in particular, apply to Φ whose shifts are not stable, as well as to refinable vectors which are not compactly supported. The usefulness of this specific factorization for the study of the smoothness of FSI wavelets (known also as 'multiwavelets' and 'multiple wavelets') is explained. The analysis invokes in an essential way the theory of finitely generated shift-invariant (FSI) spaces and, in particular, the tool of superfunction theory. Received: June 10, 1998 / Revised version received: June 14, 1999 / Published online: August 2, 2000

13.
We present a novel optimization algorithm for computing the ranges of multivariate polynomials using the Bernstein polynomial approach. The proposed algorithm incorporates four accelerating devices, namely the cut-off test, the simplified vertex test, the monotonicity test, and the concavity test, and also possesses many new features, such as the generalized matrix method for Bernstein coefficient computation, a new subdivision direction selection rule, and a new subdivision point selection rule. The features and capabilities of the proposed algorithm are compared with those of other optimization techniques: interval global optimization, the filled function method, a global optimization method for imprecise problems, and a hybrid approach combining simulated annealing, tabu search, and a descent method. The superiority of the proposed method over the latter methods is illustrated by numerical experiments and qualitative comparisons.
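The enclosure property underlying the Bernstein approach is easy to state for a univariate polynomial on [0, 1]: its range lies between the smallest and largest Bernstein coefficients. A small sketch of that basic fact (the paper itself treats multivariate polynomials with subdivision and the accelerating tests listed above):

    from math import comb

    def bernstein_coefficients(a):
        """Bernstein coefficients on [0, 1] of p(x) = a[0] + a[1]*x + ... + a[n]*x^n."""
        n = len(a) - 1
        return [sum(comb(j, i) / comb(n, i) * a[i] for i in range(j + 1))
                for j in range(n + 1)]

    def range_enclosure(a):
        """Interval guaranteed to contain {p(x) : 0 <= x <= 1}."""
        b = bernstein_coefficients(a)
        return min(b), max(b)

    # p(x) = 1 - 3x + 2x^2 has range [-1/8, 1] on [0, 1]; the enclosure
    # printed here is (-0.5, 1.0), valid though not tight.
    print(range_enclosure([1.0, -3.0, 2.0]))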

14.
In this paper we consider a problem of distance selection in the arrangement of hyperplanes induced by n given points. Given a set of n points in d-dimensional space and a number k, determine the hyperplane that is spanned by d of the points and whose distance from the origin has rank k. For the planar case we present an O(n log^2 n) runtime algorithm using parametric search, partly different from the usual approach [N. Megiddo, J. ACM 30 (1983) 852]. We establish a connection between this problem in 3-d and the well-known 3SUM problem using an auxiliary problem of counting the number of vertices in the arrangement of n planes that lie between two sheets of a hyperboloid. We show that the 3-d problem is almost 3SUM-hard and solve it by an O(n^2 log^2 n) runtime algorithm. We generalize these results to the d-dimensional (d ≥ 4) space and also consider a problem of enumerating distances.

15.
A new algorithm for rearranging a heap is presented and analysed in the average case. The average case upper bound for deleting the maximum element of a random heap is improved, and is shown to be less than [log n] + 0.299 + M(n) comparisons, where M(n) is between 0 and 1. It is also shown that a heap can be constructed using 1.650n + O(log n) comparisons with this algorithm, the best result for any algorithm which does not use any extra space. The expected time to sort n elements is argued to be less than n log n + 0.670n + O(log n), while simulation results point at an average case of n log n + 0.4n, which would make it the fastest in-place sorting algorithm. The same technique is used to show that the average number of comparisons when deleting the maximum element of a heap using Williams' algorithm for rearrangement is 2([log n] - 1.299 + L(n)), where L(n) is also between 0 and 1, and the average cost of Floyd-Williams Heapsort is at least 2n log n - 3.27n, counting only comparisons. An analysis of the number of interchanges when deleting the maximum element of a random heap, which is the same for both algorithms, is also presented.
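The flavour of an improved rearrangement step can be sketched with the familiar "walk the hole down along the larger children, then bubble the displaced element back up" technique, which spends roughly one comparison per level on the way down. This is an illustration of the general idea only, not a reproduction of the paper's algorithm or its constants.

    def delete_max(heap):
        """Remove and return the maximum of a list-based binary max-heap."""
        top, last = heap[0], heap.pop()
        n = len(heap)
        if n == 0:
            return top
        # Phase 1: move the hole down along the path of larger children
        # (one comparison per level).
        i = 0
        while 2 * i + 1 < n:
            j = 2 * i + 1
            if j + 1 < n and heap[j + 1] > heap[j]:
                j += 1
            heap[i] = heap[j]
            i = j
        # Phase 2: bubble the former last element up from the leaf position
        # (few comparisons on average).
        while i > 0 and heap[(i - 1) // 2] < last:
            heap[i] = heap[(i - 1) // 2]
            i = (i - 1) // 2
        heap[i] = last
        return top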

16.
We present a new branch-and-cut algorithm for the capacitated vehicle routing problem (CVRP). The algorithm uses a variety of cutting planes, including capacity, framed capacity, generalized capacity, strengthened comb, multistar, partial multistar, extended hypotour inequalities, and classical Gomory mixed-integer cuts. For each of these classes of inequalities we describe our separation algorithms in detail. Also we describe the other important ingredients of our branch-and-cut algorithm, such as the branching rules, the node selection strategy, and the cut pool management. Computational results, for a large number of instances, show that the new algorithm is competitive. In particular, we solve three instances (B-n50-k8, B-n66-k9 and B-n78-k10) of Augerat to optimality for the first time.
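Of the cutting planes listed, the rounded capacity inequalities are the simplest to state: a customer set S with total demand d(S) and vehicle capacity Q must have at least 2*ceil(d(S)/Q) units of edge value crossing its boundary. A hedged sketch of checking one such inequality against a fractional LP solution (the paper's separation routines are far more elaborate); the data layout is an assumption:

    import math

    def capacity_cut_violation(S, demand, Q, x_edge):
        """Violation of the rounded capacity inequality for a customer set S.

        x_edge maps frozenset({i, j}) -> value of the LP edge variable; the
        depot must not belong to S. A positive return value means
        x(delta(S)) < 2*ceil(d(S)/Q), i.e. the inequality is violated."""
        S = set(S)
        required = 2 * math.ceil(sum(demand[i] for i in S) / Q)
        crossing = sum(val for e, val in x_edge.items() if len(e & S) == 1)
        return required - crossing

    # Hypothetical fractional solution on customers {1, 2, 3} with depot 0:
    x = {frozenset({0, 1}): 1.0, frozenset({1, 2}): 1.0,
         frozenset({2, 3}): 1.0, frozenset({0, 3}): 1.0}
    print(capacity_cut_violation({1, 2, 3}, demand={1: 4, 2: 4, 3: 4}, Q=10, x_edge=x))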

17.
As a part of a heuristic for the fast detection of new word combinations in text streams, we consider the NP-hard Partial Set Cover of Pairs problem. There we wish to cover a maximum number of pairs of elements by a prescribed number of sets from a given set family. While the approximation ratio of the greedy algorithm for the classic Partial Set Cover problem is completely understood, the same question for covering of pairs is intrinsically more complicated, since the pairs insert some graph-theoretic structure. The best approximation guarantee for the first greedy step can be rephrased as a problem in extremal combinatorics: Assume that we may place a fixed number of subsets of fixed and equal size in a set, how many different pairs of elements can we cover? In this paper we introduce a method to calculate optimal approximation guarantees, and we demonstrate its use on the smallest set families.
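A minimal sketch of the greedy step analyzed above: given a budget of sets, repeatedly pick the set covering the most not-yet-covered pairs. The input representation and the toy data are illustrative assumptions:

    from itertools import combinations

    def greedy_pair_cover(sets, budget):
        """Greedily choose `budget` sets to cover as many element pairs as possible.
        A pair {u, v} counts as covered once some chosen set contains both u and v."""
        covered = set()
        chosen = []
        for _ in range(budget):
            best_idx, best_gain = None, -1
            for idx, s in enumerate(sets):
                gain = sum(1 for p in combinations(sorted(s), 2)
                           if frozenset(p) not in covered)
                if gain > best_gain:
                    best_idx, best_gain = idx, gain
            chosen.append(best_idx)
            covered.update(frozenset(p) for p in combinations(sorted(sets[best_idx]), 2))
        return chosen, len(covered)

    # Toy word-combination example: each set is the vocabulary of one text snippet.
    snippets = [{"new", "word", "stream"}, {"word", "stream"}, {"new", "combination"}]
    print(greedy_pair_cover(snippets, budget=2))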

18.
Variable selection is an important aspect of high-dimensional statistical modeling, particularly in regression and classification. In the regularization framework, various penalty functions are used to perform variable selection by putting relatively large penalties on small coefficients. The L1 penalty is a popular choice because of its convexity, but it produces biased estimates for the large coefficients. The L0 penalty is attractive for variable selection because it directly penalizes the number of nonzero coefficients. However, the optimization involved is discontinuous and nonconvex, and therefore it is very challenging to implement. Moreover, its solution may not be stable. In this article, we propose a new penalty that combines the L0 and L1 penalties. We implement this new penalty by developing a global optimization algorithm using mixed integer programming (MIP). We compare this combined penalty with several other penalties via simulated examples as well as real applications. The results show that the new penalty outperforms both the L0 and L1 penalties in terms of variable selection while maintaining good prediction accuracy.
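To make the combined penalty concrete for a handful of predictors, the brute-force sketch below enumerates every support, fits ordinary least squares on it, and compares RSS + lambda0*(number of nonzero coefficients) + lambda1*(L1 norm of the fitted coefficients). This only illustrates the criterion; the article solves the problem at scale with mixed integer programming, and within-support shrinkage is ignored here. All names are illustrative.

    import numpy as np
    from itertools import combinations

    def combined_penalty_best_subset(X, y, lam0, lam1):
        """Exhaustively minimize RSS + lam0*||beta||_0 + lam1*||beta||_1 over supports
        (no intercept assumed; feasible only for a small number of predictors)."""
        n, p = X.shape
        best = (np.inf, (), None)
        for size in range(p + 1):
            for support in combinations(range(p), size):
                if support:
                    Xs = X[:, list(support)]
                    beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
                    rss = np.sum((y - Xs @ beta) ** 2)
                else:
                    beta = np.array([])
                    rss = np.sum(y ** 2)
                obj = rss + lam0 * size + lam1 * np.sum(np.abs(beta))
                if obj < best[0]:
                    best = (obj, support, beta)
        return best   # (objective value, selected columns, coefficients on the support)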

19.
Optimization, 2012, 61(6): 843-853
In this paper we consider different classes of nonconvex quadratic problems that can be solved in polynomial time. We present an algorithm for the problem of minimizing the product of two linear functions over a polyhedron P in R^n. The complexity of the algorithm depends on the number of vertices of the projection of P onto R^2. In the worst case this algorithm requires an exponential number of steps, but its expected computational time complexity is polynomial. In addition, we give a characterization of the number of isolated local minimum areas for problems of this form.

Furthermore, we consider indefinite quadratic problems with variables restricted to be nonnegative. These problems can be solved in polynomial time if the number of negative eigenvalues of the associated symmetric matrix is fixed.

20.
Summary. Regression and classification problems can be viewed as special cases of the problem of function estimation. It is rather well known that a two-layer perceptron with sigmoidal transformation functions can approximate any continuous function on compact subsets of R^p if there is a sufficient number of hidden nodes. In this paper, we present an algorithm for fitting perceptron models which is quite different from the usual backpropagation or Levenberg-Marquardt algorithm. This new algorithm, based on backfitting, ensures better convergence than backpropagation. We have also used resampling techniques to select an ideal number of hidden nodes automatically using the training data itself. This resampling technique helps to avoid the problem of overfitting that one faces with the usual perceptron learning algorithms without any model selection scheme. Case studies and simulation results are presented to illustrate the performance of the proposed algorithm.
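A rough sketch of the backfitting flavour for a one-hidden-layer sigmoidal network: each hidden node is refit in turn against the partial residual left by the other nodes. This is a simplified stand-in for the authors' procedure (their estimation details and the resampling-based choice of the number of hidden nodes are omitted), with illustrative learning-rate and iteration settings.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def backfit_mlp(X, y, n_hidden=4, sweeps=30, inner_steps=25, lr=0.1, seed=0):
        rng = np.random.default_rng(seed)
        n, p = X.shape
        W = rng.normal(scale=0.5, size=(n_hidden, p))   # input weights of each node
        c = np.zeros(n_hidden)                          # node biases
        beta = np.zeros(n_hidden)                       # output weights
        b0 = y.mean()                                   # output bias

        def h(k):
            return sigmoid(X @ W[k] + c[k])

        for _ in range(sweeps):
            for k in range(n_hidden):
                # partial residual: target minus the fit of all *other* nodes
                others = b0 + sum(beta[j] * h(j) for j in range(n_hidden) if j != k)
                r = y - others
                for _ in range(inner_steps):            # refit node k alone by gradient steps
                    hk = h(k)
                    err = beta[k] * hk - r
                    g_beta = np.mean(err * hk)
                    g_c = np.mean(err * beta[k] * hk * (1 - hk))
                    g_W = (err * beta[k] * hk * (1 - hk)) @ X / n
                    beta[k] -= lr * g_beta
                    c[k] -= lr * g_c
                    W[k] -= lr * g_W

        def predict(X_new):
            return b0 + sum(beta[k] * sigmoid(X_new @ W[k] + c[k]) for k in range(n_hidden))
        return predict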
