Similar literature
20 similar articles found.
1.
This article develops a generalization of the scatterplot matrix based on the recognition that most datasets include both categorical and quantitative information. Traditional grids of scatterplots often obscure important features of the data when one or more variables are categorical but coded as numerical. The generalized pairs plot offers a range of displays of paired combinations of categorical and quantitative variables. A mosaic plot, fluctuation diagram, or faceted bar chart may be used to display two categorical variables. A side-by-side boxplot, stripplot, faceted histogram, or density plot helps visualize a categorical and a quantitative variable. A traditional scatterplot is suitable for displaying a pair of numerical variables, but options also support density contours or annotating summary statistics such as the correlation and the number of missing values. By combining these, the generalized pairs plot may help to reveal structure in multivariate data that otherwise might go unnoticed in the process of exploratory data analysis. Two different R packages provide implementations of the generalized pairs plot, gpairs and GGally. Supplementary materials for this article are available online on the journal web site.

2.
Classification of samples into two or more classes is of interest to scientists in almost every field. Traditional statistical methodology for classification does not work well when there are more variables ($p$) than samples ($n$), and it is highly sensitive to outlying observations. In this study, a robust partial least squares based classification method is proposed to handle data containing outliers where $n \ll p$. The proposed method is applied to well-known benchmark datasets and its properties are explored by an extensive simulation study.

3.
This paper concerns accurate computation of the singular value decomposition (SVD) of an $m \times n$ matrix $A$. As is well known, cross-product matrix based SVD algorithms compute large singular values accurately but generally deliver poor small singular values. A novel cross-product matrix based SVD method is proposed: (a) use a backward stable algorithm to compute the eigenpairs of $A^T A$ and take the square roots of its large eigenvalues as the large singular values of $A$; (b) form the Rayleigh quotient of $A^T A$ with respect to the matrix consisting of the computed eigenvectors associated with the computed small eigenvalues of $A^T A$; (c) compute the eigenvalues of the Rayleigh quotient and take their square roots as the small singular values of $A$. A detailed quantitative error analysis is conducted on the method. It is proved that if the small singular values are well separated from the large ones, then the method can compute the small ones accurately up to the order of the unit roundoff. An algorithm is developed that is not only cheaper than the standard Golub–Reinsch and Chan SVD algorithms but can also update or downdate an SVD by adding or deleting a row and compute certain refined Ritz vectors for large matrix eigenproblems at very low cost. Several variants of the algorithm are proposed that compute some or all parts of the SVD. Typical numerical examples confirm the high accuracy of our algorithm. Supported in part by the National Science Foundation of China (No. 10471074).
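Steps (a) to (c) of the abstract above can be sketched in NumPy as follows. This is only an illustration under stated assumptions, not the authors' algorithm: the function name `cross_product_singular_values` and the relative threshold `split_tol` that separates "large" from "small" eigenvalues are hypothetical, and `numpy.linalg.eigh` stands in for the backward stable eigensolver.

```python
import numpy as np

def cross_product_singular_values(A, split_tol=1e-4):
    """Singular values of A from A^T A, refining the small ones (sketch only)."""
    # (a) Eigenpairs of the cross-product matrix; eigh returns ascending eigenvalues.
    evals, V = np.linalg.eigh(A.T @ A)
    evals = np.clip(evals, 0.0, None)          # guard against tiny negative values
    small = evals < split_tol * evals[-1]      # hypothetical large/small split
    sigma = [np.sqrt(evals[~small])]
    if small.any():
        # (b) Rayleigh quotient of A^T A with respect to the "small" eigenvectors,
        #     formed as (A V_s)^T (A V_s) rather than V_s^T (A^T A) V_s.
        W = A @ V[:, small]
        H = W.T @ W
        # (c) Square roots of its eigenvalues give the small singular values.
        sigma.append(np.sqrt(np.clip(np.linalg.eigvalsh(H), 0.0, None)))
    return np.sort(np.concatenate(sigma))[::-1]  # descending order
```

Forming the Rayleigh quotient from `A @ V[:, small]` keeps the small singular values from being swamped by rounding errors proportional to the largest eigenvalue, which is the weakness of taking square roots of the small eigenvalues of the cross-product matrix directly.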

4.
Solving Total Least Squares (TLS) problems $AX \approx B$ requires the computation of the noise subspace of the data matrix [A;B]. The widely used tool for doing this is the Singular Value Decomposition (SVD). However, the SVD has the drawback that it is computationally expensive. Therefore, we consider here a different, so-called rank-revealing two-sided orthogonal decomposition, which decomposes the matrix into a product of a unitary matrix, a triangular matrix and another unitary matrix in such a way that the effective rank of the matrix is obvious and at the same time the noise subspace is exhibited explicitly. We show how this decomposition leads to an efficient and reliable TLS algorithm that can be parallelized in an efficient way.
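For orientation, the classical SVD-based TLS solution that the abstract above treats as the expensive baseline can be sketched as below. It is not the rank-revealing two-sided orthogonal decomposition the paper proposes. The function name `tls_via_svd` is hypothetical, and the sketch assumes that `A` is m × n, `B` is m × d with m ≥ n + d, and that the lower-right block of the right singular vectors is nonsingular.

```python
import numpy as np

def tls_via_svd(A, B):
    """Classical TLS solution of A X ~ B from the noise subspace of [A B]."""
    n = A.shape[1]
    # Right singular vectors of the compound data matrix, columns of V.
    _, _, Vt = np.linalg.svd(np.hstack([A, B]), full_matrices=False)
    V = Vt.T
    # The last d columns of V span the noise subspace; partition them.
    V12 = V[:n, n:]
    V22 = V[n:, n:]
    # X = -V12 V22^{-1}, computed with a linear solve instead of an inverse.
    return -np.linalg.solve(V22.T, V12.T).T
```

When the data are exact, so that `B = A @ X0`, the noise subspace is the null space of the compound matrix and the function returns `X0` up to rounding.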

5.
An Evolution Program for Non-Linear Transportation Problems   Cited by: 1 (self-citations: 0; citations by others: 1)
In this paper we describe the main features of a Strongly Feasible Evolution Program (SFEP) designed to solve non-linear network flow problems. The program can handle non-linearities both in the constraints and in the objective function. The solution procedure is based on a recombination operator in which all parents in a small mating pool have an equal chance of contributing their genetic material to offspring. When an offspring is created with a better fitness value than that of the worst parent, the worst parent is discarded from the mating pool and the offspring is placed in it. The main contributions are the massively parallel initialization procedure, which creates only feasible solutions using simple heuristic rules that increase the chances of creating solutions with good fitness values for the initial mating pool, and the gene therapy procedure, which fixes defective genes, ensuring that the offspring resulting from recombination is always feasible. Both procedures utilize the properties of network flows. The algorithm is capable of handling mixed integer problems with non-linearities in both the constraints and the objective function. Tests were conducted on a number of previously published transportation problems with 49 and 100 decision variables, which constitute a subset of network flow problems. Convergence to equal or better solutions was achieved, often with less than one tenth of the previous computational effort.

6.
Interpolation by translates of a given radial basis function (RBF) has become a well-recognized means of fitting functions sampled at scattered sites in $\mathbb{R}^d$. A major drawback of these methods is their inability to interpolate very large data sets in a numerically stable way while maintaining a good fit. To circumvent this problem, a multilevel interpolation (ML) method for scattered data was presented by Floater and Iske. Their approach involves m levels of interpolation where, at the jth level, the residual of the previous level is interpolated. On each level, the RBF is scaled to match the data density. In this paper, we provide some theoretical underpinnings to the ML method by establishing rates of approximation for a technique that deviates somewhat from the Floater–Iske setting. The final goal of the ML method will be to provide a numerically stable method for interpolating several thousand points rapidly.
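The residual-correction idea described above (interpolate the residual of the previous level, with the RBF rescaled to the data density) can be illustrated with a small one-dimensional sketch. Everything specific here is an assumption made for illustration: the names `multilevel_fit` and `evaluate`, the use of nested uniform grids in place of scattered sites, the Wendland function $(1-r)_+^4(4r+1)$ as the RBF, and the choice of a support radius equal to four grid spacings. It is not the Floater–Iske implementation or the variant analysed in the paper.

```python
import numpy as np

def wendland(r):
    """Compactly supported Wendland function (1 - r)^4_+ (4 r + 1)."""
    return np.clip(1.0 - r, 0.0, None) ** 4 * (4.0 * r + 1.0)

def evaluate(levels, t):
    """Sum the interpolants of all levels at the points t."""
    t = np.asarray(t, dtype=float)
    total = np.zeros_like(t)
    for centers, delta, coeffs in levels:
        total += wendland(np.abs(t[:, None] - centers[None, :]) / delta) @ coeffs
    return total

def multilevel_fit(x, y, strides=(4, 2, 1), support_factor=4.0):
    """Interpolate on nested subsets x[::s], coarse to fine, fitting residuals."""
    levels = []
    residual = np.asarray(y, dtype=float).copy()
    for s in strides:
        centers = x[::s]
        # Scale the RBF to the data density of this level (uniform spacing assumed).
        delta = support_factor * (centers[1] - centers[0])
        K = wendland(np.abs(centers[:, None] - centers[None, :]) / delta)
        coeffs = np.linalg.solve(K, residual[::s])
        level = (centers, delta, coeffs)
        levels.append(level)
        # The next level sees only what this level failed to capture.
        residual = residual - evaluate([level], x)
    return levels
```

Because the last stride is 1, the finest level interpolates the remaining residual at every data point, so the sum over levels reproduces the data while each linear system stays sparse-structured and moderately conditioned.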

7.
This paper develops an identity for additive modifications of a singular value decomposition (SVD) to reflect updates, downdates, shifts, and edits of the data matrix. This sets the stage for fast and memory-efficient sequential algorithms for tracking singular values and subspaces. In conjunction with a fast solution for the pseudo-inverse of a submatrix of an orthogonal matrix, we develop a scheme for computing a thin SVD of streaming data in a single pass with linear time complexity: A rank-r thin SVD of a p × q matrix can be computed in O(pqr) time for .

8.
We present a new method for the construction of shape-preserving curves approximating a given set of 3D data, based on the space of quintic-like polynomial splines with variable degrees recently introduced in [7]. These splines, which are $C^3$ and therefore curvature and torsion continuous, possess a very simple geometric structure, which makes the shape constraints easy to handle.

9.
We present a new method for the construction of shape-preserving curves interpolating a given set of 3D data. The interpolating functions are obtained using quintic-like spaces of polynomial splines with variable degrees. These splines are of class $C^3$, and are therefore curvature and torsion continuous, and they possess a very simple geometric structure, which makes the shape constraints easy to handle.

10.
We analyze the convergence rate of a multigrid method for multilevel linear systems whose coefficient matrices are generated by a real and nonnegative multivariate polynomial $f$ and belong to multilevel matrix algebras such as circulant, tau, or Hartley, or are of Toeplitz type. In the case of matrix algebra linear systems, we prove that the convergence rate is independent of the system dimension even in the presence of asymptotic ill-conditioning (this happens iff $f$ takes the zero value). More precisely, if the $d$-level coefficient matrix has partial dimension $n_r$ at level $r$, with $r = 1, \ldots, d$, then the size of the system is $N(n) = \prod_{r=1}^{d} n_r$, $n = (n_1, \ldots, n_d)$, and $O(N(n))$ operations are required by the considered V-cycle multigrid in order to compute the solution within a fixed accuracy. Since the total arithmetic cost is asymptotically equivalent to that of a matrix-vector product, the proposed method is optimal. Some numerical experiments concerning linear systems arising in 2D and 3D applications are considered and discussed.

11.
Count data with excess zeros are often encountered in medical, biomedical and public health applications. In this paper, an extension of zero-inflated Poisson mixed regression models is presented for dealing with multilevel data sets, referred to as hierarchical mixture zero-inflated Poisson mixed regression models. A stochastic EM algorithm is developed for obtaining the ML estimates of the parameters of interest, and models with different numbers of latent classes are compared using the BIC. An application to the analysis of count data from a Shanghai Adolescence Fitness Survey and a simulation study illustrate the usefulness and effectiveness of our methodology.

12.
For $Au = f$ with an elliptic differential operator and stochastic data $f$, the $m$-point correlation function of the random solution $u$ satisfies a deterministic equation with the $m$-fold tensor product operator $A^{(m)}$ of $A$. Sparse tensor products of hierarchic FE-spaces in are known to allow for approximations to which converge at essentially the rate as in the case $m = 1$, i.e. for the deterministic problem. They can be realized by wavelet-type FE bases (von Petersdorff and Schwab in Appl Math 51(2):145–180, 2006; Schwab and Todor in Computing 71:43–63, 2003). If wavelet bases are not available, we show here how to achieve the fast computation of sparse approximations of for Galerkin discretizations of $A$ by multilevel frames such as BPX or other multilevel preconditioners of any standard FEM approximation for $A$. Numerical examples illustrate feasibility and scope of the method.

13.
The goal of this work is to derive and justify a multilevel preconditioner of optimal arithmetic complexity for symmetric interior penalty discontinuous Galerkin finite element approximations of second order elliptic problems. Our approach is based on the following simple idea given in [R.D. Lazarov, P.S. Vassilevski, L.T. Zikatanov, Multilevel preconditioning of second order elliptic discontinuous Galerkin problems, Preprint, 2005]. The finite element space of piecewise polynomials, discontinuous on the partition, is projected onto the space of piecewise constant functions on the same partition, which constitutes the largest space in the multilevel method. The discontinuous Galerkin finite element system on this space is associated with the so-called “graph-Laplacian”. In 2-D this is a sparse M-matrix with -1 as off-diagonal entries and nonnegative row sums. Under the assumption that the finest partition is the result of multilevel refinement of a given coarse mesh, we develop the concept of hierarchical splitting of the unknowns. Then, using local analysis, we derive estimates for the constants in the strengthened Cauchy–Bunyakowski–Schwarz (CBS) inequality, which are uniform with respect to the levels. This measure of the angle between the spaces of the splitting was used by Axelsson and Vassilevski in [Algebraic multilevel preconditioning methods II, SIAM J. Numer. Anal. 27 (1990) 1569–1590] to construct an algebraic multilevel iteration (AMLI) for finite element systems. The main contribution of this paper is the construction of a splitting that produces new estimates for the CBS constant for the graph-Laplacian. As a result we have a preconditioner of optimal arithmetic complexity for the system of the discontinuous Galerkin finite element method.

14.
Preconditioning techniques are widely used to speed up the convergence of iterative methods for solving large linear systems with sparse or dense coefficient matrices. For certain application problems, however, the standard block diagonal preconditioner makes the Krylov iterative methods converge more slowly or even diverge. To handle this problem, we apply diagonal shifting and a stabilized singular value decomposition (SVD) to each diagonal block, which is generated from the multilevel fast multipole algorithm (MLFMA), to improve the stability and efficiency of the block diagonal preconditioner. Our experimental results show that the improved block diagonal preconditioner maintains the computational complexity of MLFMA, converges faster and also reduces the CPU cost.

15.
Summary. A mixed finite element discretization is applied to the Richards equation, a nonlinear, possibly degenerate parabolic partial differential equation modeling water flow through porous media. The equation is considered in its pressure formulation and includes both variably and fully saturated flow regimes. Characteristic of such problems is the lack of regularity of the solution. To handle this we use a time-integrated scheme. We analyze the scheme and present error estimates showing its convergence. Mathematics Subject Classification (2000): 65M12, 65M60, 76S05, 35K65. Acknowledgments. We would like to thank Markus Bause for very useful discussions and suggestions.

16.
The method introduced by Ennio De Giorgi and Guido Stampacchia for the study of the regularity ($L^p$, Marcinkiewicz or $C^{0,\alpha}$) of the weak solutions of Dirichlet problems hinges on the handling of inequalities concerning the integral of on the subsets where $|u(x)|$ is greater than $k$. In this framework, we contribute a study of the Marcinkiewicz regularity of the gradient of infinite energy solutions of Dirichlet problems with nonregular data. Dedicated to Juan Luis Vazquez for his 60th birthday (“El verano del Patriarca”, see [19]).

17.
Estimating the Heavy Tail Index from Scaling Properties   Cited by: 4 (self-citations: 0; citations by others: 4)
This paper deals with the estimation of the tail index for empirical heavy-tailed distributions, such as have been encountered in telecommunication systems. We present a method (called the scaling estimator) based on the scaling properties of sums of heavy-tailed random variables. It has the advantages of being nonparametric, of being easy to apply, of yielding a single value, and of being relatively accurate on synthetic datasets. Since the method relies on the scaling of sums, it measures a property that is often one of the most important effects of heavy-tailed behavior. Most importantly, we present evidence that the scaling estimator appears to increase in accuracy as the size of the dataset grows. It is thus particularly suited for large datasets, as are increasingly encountered in measurements of telecommunications and computing systems.

18.
Singular Value Decomposition (SVD) is a powerful tool in linear algebra and has been extensively applied to Signal Processing, Statistical Analysis and Mathematical Modeling. We propose an extension of SVD for both the qualitative detection and quantitative determination of nonlinearity in a time series. The method is to augment the embedding matrix with additional nonlinear columns derived from the initial embedding vectors and extract the nonlinear relationship using SVD. The paper demonstrates an application of nonlinear SVD to identify parameters when the signal is generated by a nonlinear transformation. Examples of maps (Logistic map and Henon map) and flows (Van der Pol oscillator and Duffing oscillator) are used to illustrate the method of nonlinear SVD to identify parameters. The paper presents the recovery of parameters in the following scenarios: (i) data generated by maps and flows, (ii) comparison of the method for both noisy and noise-free data, (iii) surrogate data analysis for both the noisy and noise-free cases. The paper includes two applications of the method: (i) Mathematical Modeling and (ii) Chaotic Cryptanalysis.
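A minimal sketch of the augmentation idea, for noise-free data from the logistic map $x_{k+1} = r x_k (1 - x_k)$, is given below. The function names, the choice of a single extra column $x_k^2$, and the normalisation of the null vector are assumptions made for illustration and are not taken from the paper. Since $x_{k+1} - r x_k + r x_k^2 = 0$ holds exactly, the augmented matrix has a right singular vector proportional to $(1, -r, r)$ with singular value close to zero.

```python
import numpy as np

def logistic_series(r, x0, n):
    """Iterate the logistic map x_{k+1} = r x_k (1 - x_k) for n samples."""
    x = np.empty(n)
    x[0] = x0
    for k in range(n - 1):
        x[k + 1] = r * x[k] * (1.0 - x[k])
    return x

def recover_logistic_parameter(x):
    """Estimate r from the near-null right singular vector of [x_{k+1}, x_k, x_k^2]."""
    # Embedding columns (x_{k+1}, x_k) augmented with the nonlinear column x_k^2.
    M = np.column_stack([x[1:], x[:-1], x[:-1] ** 2])
    _, s, Vt = np.linalg.svd(M, full_matrices=False)
    v = Vt[-1] / Vt[-1][0]       # scale so the coefficient of x_{k+1} is 1
    # v is proportional to (1, -r, r): two estimates of r plus the singular values.
    return -v[1], v[2], s
```

For example, `recover_logistic_parameter(logistic_series(3.9, 0.3, 500))` returns two estimates of `r` that should agree closely for clean data; a smallest singular value that is not near zero would indicate noise or that the chosen nonlinear columns do not capture the relationship.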

19.
The structure preserving rank reduction problem arises in many important applications. The singular value decomposition (SVD), while giving the closest low rank approximation to a given matrix in the matrix $L_2$ norm and the Frobenius norm, may not be appropriate for these applications since it does not preserve the given structure. We present a new method for structure preserving low rank approximation of a matrix, which is based on Structured Total Least Norm (STLN). The STLN is an efficient method for obtaining an approximate solution to an overdetermined linear system $AX \approx B$, preserving the given linear structure in the perturbation $[E\ F]$ such that $(A + E)X = B + F$. The approximate solution can be obtained to minimize the perturbation $[E\ F]$ in the $L_p$ norm, where $p = 1$, $2$, or $\infty$. An algorithm is described for Hankel structure preserving low rank approximation using STLN with the $L_p$ norm. Computational results are presented, which show the performance of the STLN based method for the $L_1$ and $L_2$ norms for reduced rank approximation of Hankel matrices.

20.
In this paper we propose a new iterative method for solving a class of linear complementarity problems: $u \ge 0$, $Mu + q \ge 0$, $u^T(Mu + q) = 0$, where $M$ is a given $l \times l$ positive semidefinite matrix (not necessarily symmetric) and $q$ is a given $l$-vector. The method makes two matrix-vector multiplications and a trivial projection onto the nonnegative orthant at each iteration, and the Euclidean distance of the iterates to the solution set converges monotonically to zero. The main advantages of the method presented are its simplicity, robustness, and ability to handle large problems with any starting point. It is pointed out that the method may be used to solve general convex quadratic programming problems. Preliminary numerical experiments indicate that this method may be very efficient for large sparse problems. On leave from the Department of Mathematics, University of Nanjing, Nanjing, People's Republic of China.
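The abstract does not spell out the iteration, so the sketch below substitutes a different, well-known scheme with the same per-iteration cost profile: the extragradient method, which also uses two matrix-vector products and componentwise projections onto the nonnegative orthant. The function name `lcp_extragradient`, the step size `0.9 / ||M||_2`, and the stopping rule are assumptions chosen for illustration; this is not the method proposed in the paper.

```python
import numpy as np

def lcp_extragradient(M, q, max_iter=10000, tol=1e-12):
    """Extragradient iteration for u >= 0, Mu + q >= 0, u^T (Mu + q) = 0."""
    u = np.zeros(len(q))
    # Step size below 1 / L, where L = ||M||_2 is the Lipschitz constant of u -> Mu + q.
    step = 0.9 / np.linalg.norm(M, 2)
    for _ in range(max_iter):
        # Predictor: projected step using the residual at u (first mat-vec).
        u_half = np.maximum(u - step * (M @ u + q), 0.0)
        # Corrector: projected step from u using the residual at u_half (second mat-vec).
        u_new = np.maximum(u - step * (M @ u_half + q), 0.0)
        if np.linalg.norm(u_new - u) <= tol:
            return u_new
        u = u_new
    return u
```

For the nonsymmetric positive definite example `M = [[2, 1], [-1, 2]]`, `q = [-1, 1]`, the solution is `u = (0.5, 0)` with `Mu + q = (0, 0.5)`, which can be checked by hand against the three complementarity conditions.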
