Similar Literature
Found 20 similar documents.
1.
Advances in Data Analysis and Classification - Two new methods for sparse dimension reduction are introduced, based on martingale difference divergence and ball covariance, respectively. These...

2.
We construct non-random bounded discrete half-line Schrödinger operators which have purely singular continuous spectral measures with fractional Hausdorff dimension (in some interval of energies). To do this we use suitable sparse potentials. Our results also apply to whole line operators, as well as to certain random operators. In the latter case we prove and compute an exact dimension of the spectral measures.

3.
PLS and dimension reduction for classification
Barker and Rayens (J Chemometrics 17:166–173, 2003) offered convincing arguments that partial least squares (PLS) is to be preferred over principal components analysis (PCA) when discrimination is the goal and dimension reduction is required, since PLS, unlike PCA, directly incorporates information about group separation into the structure extraction. In this paper the basic results of Barker and Rayens (2003) are reviewed, and some of their ideas and comparisons are illustrated on a real data set, something Barker and Rayens did not do. More importantly, new results are introduced, including a formal proof of the superiority of PLS over PCA in the two-group case, as well as new connections between PLS for discrimination and an extended class of PLS-like techniques known as "oriented PLS" (OrPLS). In the latter case, a particularly simple subclass of OrPLS procedures, when used to achieve the dimension reduction, is shown to always produce a lower misclassification rate than "ordinary" PLS used for the same purpose.
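To make the comparison concrete, here is a minimal sketch (not Barker and Rayens' own code; the dataset, component counts and classifier are illustrative assumptions) contrasting PCA and PLS as the dimension reduction step before a linear discriminant classifier:

```python
# Minimal sketch: PCA vs. PLS as dimension reduction before LDA
# on a synthetic two-group problem.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.cross_decomposition import PLSRegression
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=300, n_features=50, n_informative=5,
                           random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

pca = PCA(n_components=3).fit(Xtr)                  # ignores the group labels
pls = PLSRegression(n_components=3).fit(Xtr, ytr)   # labels orient the scores

for name, tr, te in [("PCA", pca.transform(Xtr), pca.transform(Xte)),
                     ("PLS", pls.transform(Xtr), pls.transform(Xte))]:
    acc = LinearDiscriminantAnalysis().fit(tr, ytr).score(te, yte)
    print(f"{name} + LDA: test accuracy = {acc:.3f}")
```

Because PLS extracts components using the group labels while PCA does not, PLS scores typically separate the groups better when the discriminative directions carry little of the total variance.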

4.
In a regression context where a response variable \(Y\in\mathbb{R}\) is recorded with a covariate \(X\in\mathbb{R}^p\), two situations can occur simultaneously: (a) we are interested in the tail of the conditional distribution rather than in its central part, and (b) the number p of regressors is large. To our knowledge, these two situations have only been considered separately in the literature. The aim of this paper is to propose a new dimension reduction approach adapted to the tail of the distribution, in order to obtain an efficient conditional extreme quantile estimator when the dimension p is large. The results are illustrated on simulated data and on a real dataset.
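The following toy sketch illustrates the general idea of tail-oriented dimension reduction; the moment-based direction estimate and local quantile smoother below are crude stand-ins, not the estimator proposed in the paper:

```python
# Crude illustrative sketch, NOT the paper's estimator: a moment-based
# tail direction followed by a local empirical quantile along the index.
import numpy as np

rng = np.random.default_rng(0)
n, p = 5000, 30
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[0] = 1.0                                       # true direction (assumed)
Y = np.exp(X @ beta) * (1.0 + rng.pareto(3.0, n))   # heavy-tailed response

tail = Y > np.quantile(Y, 0.95)                     # keep tail observations only
b_hat = X[tail].mean(axis=0) - X.mean(axis=0)       # shift of X in the tail
b_hat /= np.linalg.norm(b_hat)

def cond_tail_quantile(x0, tau=0.99, h=0.3):
    """Empirical tau-quantile of Y among points whose index is near x0's."""
    idx = X @ b_hat
    near = np.abs(idx - x0 @ b_hat) < h
    return np.quantile(Y[near], tau)

print("|cos(angle to true direction)|:", abs(b_hat @ beta))
print("estimated q_0.99 at the origin:", cond_tail_quantile(np.zeros(p)))
```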

5.
6.
Existing groupwise dimension reduction requires the given group structure to be non-overlapping, which confines its scope of application. We aim at groupwise dimension reduction with an overlapping, or even unknown, group structure. To this end, the existing groupwise dimension reduction concept is extended to be compatible with overlapping group structures, and the envelope method is modified to handle overlapping groupwise dimension reduction. As an application, a Gaussian graphical model is employed to estimate the structure between predictors when the group structure is not given, and the modified envelope method is used for groupwise dimension reduction with this graph structure. Furthermore, the rationale of the proposed estimation procedure is explained at the population level and estimation consistency is proved at the sample level. Finally, the finite-sample performance of the proposed methods is examined via numerical simulations and a body fat data analysis.

7.
We show that the Hausdorff dimension of the spectral measure of a class of deterministic, i.e. nonrandom, block-Jacobi matrices may be determined with any degree of precision, improving a result of Zlatoš [Andrej Zlatoš, Sparse potentials with fractional Hausdorff dimension, J. Funct. Anal. 207 (2004) 216-252].

8.
Consider a random matrix \(H:\mathbb{R}^{n}\longrightarrow \mathbb{R}^{m}\). Let \(D\ge 2\) and let \(\{W_l\}_{l=1}^{p}\) be a set of \(k\)-dimensional affine subspaces of \(\mathbb{R}^{n}\). We ask what is the probability that for all \(1\le l\le p\) and \(x,y\in W_l\), $$\begin{aligned} \Vert x-y\Vert _2\le \Vert Hx-Hy\Vert _2\le D\Vert x-y\Vert _2. \end{aligned}$$ We show that for \(m=O\big(k+\frac{\ln p}{\ln D}\big)\) and a variety of different classes of random matrices \(H\), which include the class of Gaussian matrices, existence is assured and the probability is very high. The estimate on \(m\) is tight in terms of \(k\), \(p\), \(D\).
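The claim can be probed numerically. The sketch below (the constant 8 and the subspace sampling are ad hoc choices, not the paper's construction) draws a scaled Gaussian matrix with \(m\approx 8(k+\ln p/\ln D)\) rows and checks the displayed two-sided bound on random k-dimensional subspaces:

```python
# Numerical probe of the statement (not the paper's proof).
import numpy as np

rng = np.random.default_rng(1)
n, k, p, D = 200, 3, 50, 3.0
m = int(8 * (k + np.log(p) / np.log(D)))    # constant 8 chosen ad hoc

# Scale by sqrt(D)/sqrt(m) so norm ratios concentrate near sqrt(D),
# the geometric midpoint of the allowed band [1, D].
H = np.sqrt(D) * rng.normal(size=(m, n)) / np.sqrt(m)

ok = True
for _ in range(p):
    B = np.linalg.qr(rng.normal(size=(n, k)))[0]   # direction space of W_l
    V = rng.normal(size=(k, 1000))                 # differences x - y within W_l
    ratios = (np.linalg.norm(H @ (B @ V), axis=0)
              / np.linalg.norm(B @ V, axis=0))
    ok &= bool(ratios.min() >= 1.0 and ratios.max() <= D)
print(f"m = {m}, all {p} subspaces within distortion [1, {D}]: {ok}")
```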

9.
We discuss the theoretical structure and constructive methodology for large-scale graphical models, motivated by their potential in evaluating and aiding the exploration of patterns of association in gene expression data. The theoretical discussion covers basic ideas and connections between Gaussian graphical models, dependency networks and specific classes of directed acyclic graphs we refer to as compositional networks. We describe a constructive approach to generating interesting graphical models for very high-dimensional distributions that builds on the relationships between these various stylized graphical representations. Issues of consistency of models and priors across dimension are key. The resulting methods are of value in evaluating patterns of association in large-scale gene expression data with a view to generating biological insights about genes related to a known molecular pathway or set of specified genes. Some initial examples relate to the estrogen receptor pathway in breast cancer, and the Rb-E2F cell proliferation control pathway.

10.
11.
Dimension reduction techniques are at the core of the statistical analysis of high-dimensional and functional observations. Whether the data are vector- or function-valued, principal component techniques play a central role in this context. The success of principal components in the dimension reduction problem is explained by the fact that, for any \(K\le p\), the first K coefficients in the expansion of a p-dimensional random vector \(\mathbf{X}\) in terms of its principal components provide the best linear K-dimensional summary of \(\mathbf{X}\) in the mean square sense. The same property holds for a random function and its functional principal component expansion. This optimality feature, however, no longer holds in a time series context: when the observations are serially dependent, principal components and functional principal components lose their optimal dimension reduction property to the so-called dynamic principal components, introduced by Brillinger in 1981 in the vector case and extended to the functional case by Hörmann, Kidziński and Hallin in 2015.
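The i.i.d. optimality property stated above is easy to verify empirically; the sketch below (synthetic correlated data, random competitors) checks that the top-K principal components beat arbitrary rank-K projections in mean-squared reconstruction error:

```python
# Empirical check of PCA optimality in the i.i.d. case: the top-K
# principal components minimize MSE among rank-K linear summaries.
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(1000, 10)) @ rng.normal(size=(10, 10))  # correlated data
Xc = X - X.mean(axis=0)

K = 3
U = np.linalg.svd(Xc, full_matrices=False)[2][:K].T  # top-K PC directions
err_pca = ((Xc - Xc @ U @ U.T) ** 2).mean()

errs = []
for _ in range(100):                                 # random rank-K competitors
    Q = np.linalg.qr(rng.normal(size=(10, K)))[0]
    errs.append(((Xc - Xc @ Q @ Q.T) ** 2).mean())
print(f"PCA error {err_pca:.4f} <= best random rank-{K} error {min(errs):.4f}")
```

For serially dependent data, the abstract notes, this no longer holds: dynamic principal components, which filter across time rather than project a single observation, achieve a lower reconstruction error.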

12.
Dimension reduction is a well-known pre-processing step in text clustering, used to remove irrelevant, redundant and noisy features without sacrificing the performance of the underlying algorithm. Dimension reduction methods are primarily classified as feature selection (FS) methods and feature extraction (FE) methods. Though FS methods are robust against irrelevant features, they occasionally fail to retain important information present in the original feature space. On the other hand, though FE methods reduce dimensions in the feature space without losing much information, they are significantly affected by irrelevant features. The one-stage models (FS or FE alone) and the two-stage models (a combination of FS and FE) proposed in the literature are not sufficient to fulfil all of the above requirements of dimension reduction. We therefore propose three-stage dimension reduction models to remove irrelevant, redundant and noisy features from the original feature space without much loss of valuable information. These models incorporate the advantages of FS and FE methods to create a low-dimensional feature subspace. Experiments over three well-known benchmark text datasets of different characteristics show that the proposed three-stage models significantly improve the performance of the clustering algorithm, as measured by micro F-score, macro F-score, and total execution time.
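As an illustration of the three-stage idea (the paper's exact stage choices are not reproduced here; the filters, component count and dataset below are assumptions), one can chain two FS passes with an FE step before clustering:

```python
# Illustrative three-stage pipeline: FS by document frequency, FS by
# variance, then FE via latent semantic projection, before k-means.
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import VarianceThreshold
from sklearn.decomposition import TruncatedSVD
from sklearn.cluster import KMeans
from sklearn.pipeline import make_pipeline

docs = fetch_20newsgroups(
    subset="train",
    categories=["sci.space", "rec.autos", "talk.politics.misc"]).data

pipe = make_pipeline(
    TfidfVectorizer(stop_words="english", min_df=5, max_df=0.5),  # stage 1: FS
    VarianceThreshold(threshold=1e-4),   # stage 2: FS, drop near-constant terms
    TruncatedSVD(n_components=50, random_state=0),                # stage 3: FE
    KMeans(n_clusters=3, n_init=10, random_state=0),
)
labels = pipe.fit_predict(docs)
print(labels[:20])
```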

13.
We study the limit behaviour of a class of nonlinear monotone equations posed on a domain which is thin in some directions (e.g., a plate or a thin cylinder). After rescaling to a fixed domain, the equation is transformed into one involving suitably rescaled operators. Assuming that these operators and their inverses have particular forms and satisfy suitable compensated compactness assumptions, we prove a closure result, that is, we prove that the limit problem has the same form. This applies in particular to the limit behaviour of nonlinear monotone equations in laminated plates. Received: 16 October 2002; Accepted: 12 June 2003; Published online: 22 September 2003. Mathematics Subject Classification (2000): 35B27, 35B40, 74Q15

14.
Data reduction is an important issue in the field of data mining. The goal of data reduction techniques is to extract a subset of data from a massive dataset while maintaining the properties and characteristics of the original data in the reduced set. This allows an otherwise difficult or impossible data mining task to be carried out efficiently and effectively. This paper describes a new method for selecting a subset of data that closely represents the original data in terms of its joint and univariate distributions. A pair of distance criteria, motivated by the χ²-statistic, is used for measuring the goodness-of-fit between the distributions of the reduced and full datasets. Under these criteria, the data reduction problem can be formulated as a bi-objective quadratic program. A genetic algorithm is used in the search/optimization process. Experiments conducted on several real-world datasets demonstrate the effectiveness of the proposed method.
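A toy version of the selection criterion is sketched below; random search stands in for the paper's genetic algorithm, and the decile binning is an illustrative choice:

```python
# Toy subset selection: score candidate subsets by chi-square-style
# distances between their univariate histograms and the full data's.
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(2000, 4)) @ rng.normal(size=(4, 4))  # "massive" dataset
m = 100                                                   # reduced-set size

def chi2_distance(subset):
    d = 0.0
    for j in range(X.shape[1]):
        bins = np.quantile(X[:, j], np.linspace(0, 1, 11))  # full-data deciles
        full, _ = np.histogram(X[:, j], bins)
        sub, _ = np.histogram(subset[:, j], bins)
        e = full / len(X) * len(subset)                     # expected counts
        d += ((sub - e) ** 2 / np.maximum(e, 1e-9)).sum()
    return d

best, best_score = None, np.inf
for _ in range(500):       # random search; the paper evolves subsets with a GA
    idx = rng.choice(len(X), m, replace=False)
    s = chi2_distance(X[idx])
    if s < best_score:
        best, best_score = idx, s
print("best chi-square distance:", round(best_score, 2))
```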

15.
Ng, Michael K.; Zhu, Zhaochen. Numerical Algorithms 2019, 80(3): 687–707.
Numerical Algorithms - In this paper, we study the ensemble Kalman filter (EnKF) method for chemical species simulation in air quality forecast data assimilation. The main contribution of this...
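For orientation, the following is a generic stochastic EnKF analysis step (a textbook update on synthetic data, not the air-quality system or the method variants studied in the paper):

```python
# Generic stochastic EnKF analysis step on synthetic data.
import numpy as np

rng = np.random.default_rng(4)
n, N, m = 100, 40, 10              # state dim, ensemble size, observations
ens = rng.normal(size=(n, N))      # forecast ensemble, one member per column
H = np.zeros((m, n))
H[np.arange(m), np.arange(0, n, n // m)] = 1.0   # observe every 10th state
R = 0.1 * np.eye(m)                # observation error covariance
y = rng.normal(size=m)             # observations

A = ens - ens.mean(axis=1, keepdims=True)        # ensemble anomalies
HA = H @ A
P_HT = A @ HA.T / (N - 1)                        # sample cross-covariance P H^T
S = HA @ HA.T / (N - 1) + R                      # innovation covariance
K = P_HT @ np.linalg.inv(S)                      # Kalman gain
yp = y[:, None] + rng.multivariate_normal(np.zeros(m), R, size=N).T
analysis = ens + K @ (yp - H @ ens)              # perturbed-observation update
print("mean analysis spread:", analysis.std(axis=1).mean())
```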

16.
Alois Steindl. PAMM 2011, 11(1): 337–338.
The application of dimension reduction by approximate inertial manifolds to the dynamics of a fluid-conveying tube does not perform as expected, due to the presence of viscous damping terms, which lead to a cluster point of stable eigenvalues. We investigate methods to circumvent the problems introduced by this cluster point: for the spectral decomposition, parts of the invariant subspaces are computed using the Schur factorisation; a different choice of basis functions for the dominating modes also avoids the difficulties.
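The Schur-based extraction mentioned above can be sketched as follows (the toy operator and the threshold Re > -1 are assumptions); shifting the matrix lets the documented right-half-plane sorting rule select the weakly damped modes without diagonalizing through the eigenvalue cluster:

```python
# Sketch: extract the invariant subspace of the weakly damped modes
# via an ordered real Schur factorization.
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(5)
A = rng.normal(size=(50, 50)) / np.sqrt(50) - np.eye(50)  # damped toy operator

# Shifting by +I turns the criterion Re(lambda) > -1 into the built-in
# 'rhp' (right-half-plane) sorting rule; the invariant subspaces of
# A + I and A coincide.
T, Q, ndom = schur(A + np.eye(50), output="real", sort="rhp")
basis = Q[:, :ndom]        # orthonormal basis of the dominant subspace of A
print("weakly damped modes (Re > -1):", ndom)
```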

17.
The canonical correlation (CANCOR) method for dimension reduction in a regression setting is based on the classical estimates of the first and second moments of the data, and therefore sensitive to outliers. In this paper, we study a weighted canonical correlation (WCANCOR) method, which captures a subspace of the central dimension reduction subspace, as well as its asymptotic properties. In the proposed WCANCOR method, each observation is weighted based on its Mahalanobis distance to the location of the predictor distribution. Robust estimates of the location and scatter, such as the minimum covariance determinant (MCD) estimator of Rousseeuw [P.J. Rousseeuw, Multivariate estimation with high breakdown point, Mathematical Statistics and Applications B (1985) 283-297], can be used to compute the Mahalanobis distance. To determine the number of significant dimensions in WCANCOR, a weighted permutation test is considered. A comparison of SIR, CANCOR and WCANCOR is also made through simulation studies to show the robustness of WCANCOR to outlying observations. As an example, the Boston housing data is analyzed using the proposed WCANCOR method.
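The weighting step can be sketched as follows (the full WCANCOR fit is not reproduced; the weight rule below is one simple choice, not necessarily the paper's):

```python
# Sketch of the weighting step: robust Mahalanobis distances from the
# MCD estimator downweight outlying predictor values.
import numpy as np
from scipy.stats import chi2
from sklearn.covariance import MinCovDet

rng = np.random.default_rng(6)
X = rng.normal(size=(300, 5))
X[:15] += 8.0                           # plant a cluster of outliers

mcd = MinCovDet(random_state=0).fit(X)  # robust location/scatter (MCD)
d2 = mcd.mahalanobis(X)                 # squared robust Mahalanobis distances
w = np.minimum(1.0, chi2.ppf(0.975, df=5) / d2)   # downweight far points
print(f"mean weight: outliers {w[:15].mean():.3f}, clean {w[15:].mean():.3f}")
```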

18.
We propose to approximate the conditional density function of a random variable Y given a dependent random d-vector X by that of Y given \(\theta^{\top}X\), where the unit vector \(\theta\) is selected so that the average Kullback–Leibler discrepancy between the two conditional density functions is minimized. Our approach is nonparametric as far as the estimation of the conditional density functions is concerned. We show that this nonparametric estimator is asymptotically adaptive to the unknown index \(\theta\), in the sense that its first-order asymptotic mean squared error is the same as when \(\theta\) is known. The proposed method is illustrated using both simulated and real-data examples.
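A brute-force sketch of the index selection follows (random candidate directions and fixed bandwidths are crude stand-ins for the paper's estimator); minimizing the average Kullback–Leibler discrepancy over \(\theta\) amounts, up to a \(\theta\)-free term, to maximizing the average log conditional density of Y given \(\theta^{\top}X\):

```python
# Brute-force index selection: maximize the average leave-one-out
# log conditional density of Y given theta'X over candidate directions.
import numpy as np

rng = np.random.default_rng(7)
n, d = 500, 5
X = rng.normal(size=(n, d))
theta0 = np.array([3.0, 4.0, 0.0, 0.0, 0.0]) / 5.0
Y = np.sin(X @ theta0) + 0.3 * rng.normal(size=n)

def avg_log_cond_density(theta, hy=0.2, hx=0.3):
    u = X @ theta
    Kx = np.exp(-0.5 * ((u[:, None] - u[None, :]) / hx) ** 2)
    Ky = (np.exp(-0.5 * ((Y[:, None] - Y[None, :]) / hy) ** 2)
          / (hy * np.sqrt(2 * np.pi)))
    np.fill_diagonal(Kx, 0.0)                      # leave-one-out
    f = (Kx * Ky).sum(axis=1) / np.maximum(Kx.sum(axis=1), 1e-12)
    return np.log(np.maximum(f, 1e-12)).mean()

cands = rng.normal(size=(200, d))
cands /= np.linalg.norm(cands, axis=1, keepdims=True)  # unit vectors
best = max(cands, key=avg_log_cond_density)
print("|cos(angle to true theta)|:", abs(best @ theta0))
```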

19.
Sparse finite element methods for operator equations with stochastic data
Let \(A: V \to V'\) be a strongly elliptic operator on a d-dimensional manifold D (polyhedra or boundaries of polyhedra are also allowed). An operator equation Au = f with stochastic data f is considered. The goal of the computation is the mean field and higher moments of the solution. We discretize the mean field problem using a FEM with hierarchical basis and N degrees of freedom. We present a Monte-Carlo algorithm and a deterministic algorithm for the approximation of the k-th moment \(\mathcal{M}^k u\) for \(k\ge 1\). The key tool in both algorithms is a "sparse tensor product" space for the approximation of \(\mathcal{M}^k u\) with \(O(N(\log N)^{k-1})\) degrees of freedom, instead of the \(N^k\) degrees of freedom of the full tensor product FEM space. A sparse Monte-Carlo FEM with M samples (i.e., M runs of a deterministic solver) is proved to yield approximations to \(\mathcal{M}^k u\) with a work of \(O(M N(\log N)^{k-1})\) operations. The solutions are shown to converge at the optimal rates with respect to the number N of Finite Element degrees of freedom and the number M of samples. The deterministic FEM is based on deterministic equations for \(\mathcal{M}^k u\) in \(D^k \subset \mathbb{R}^{kd}\). Their Galerkin approximation, using sparse tensor products of the FE spaces in D, approximates \(\mathcal{M}^k u\) with \(O(N(\log N)^{k-1})\) degrees of freedom converging at an optimal rate (up to logs). For nonlocal operators, wavelet compression of the operators is used. The linear systems are solved iteratively with multilevel preconditioning, yielding an approximation of \(\mathcal{M}^k u\) with at most \(O(N(\log N)^{k+1})\) operations. This work was supported under IHP Network "Breaking Complexity" by the Swiss BBW under grant No. 02.0418
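The "sparse tensor product" effect can be seen in a toy degree-of-freedom count (the one-dimensional hierarchical increments of size \(2^l\) below are an illustrative assumption): keeping only level combinations with \(|l|\le L\) shrinks \(N^k\) unknowns to roughly \(N(\log N)^{k-1}\):

```python
# Toy degree-of-freedom count: full vs. sparse tensor product spaces
# built from hierarchical increments of dimension 2^l per level.
from itertools import product
import numpy as np

L, k = 10, 2                       # mesh levels and moment order
inc = lambda l: 2 ** l             # dofs added at level l in one dimension

levels = list(product(range(L + 1), repeat=k))
full = sum(np.prod([inc(l) for l in lv]) for lv in levels)
sparse = sum(np.prod([inc(l) for l in lv]) for lv in levels if sum(lv) <= L)
print(f"full tensor dofs: {full}, sparse tensor dofs: {sparse}")
```

With L = 10 and k = 2 the full space has about 4.2 million unknowns while the sparse space has about 20 thousand, matching the \(N^k\) versus \(N\log N\) scaling.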

20.
Linear dimension reduction plays an important role in classification problems, and a variety of techniques have been developed for linear dimension reduction prior to classification. However, no single method works best under all circumstances; rather, the best method depends on various characteristics of the data. We develop a two-step adaptive procedure in which the best dimension reduction method is first selected based on these data characteristics and then applied to the data at hand. Using both simulated and real-life data, we show that such a procedure can significantly reduce the misclassification rate.
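One way to realize the two-step idea is sketched below; plain cross-validation stands in for the paper's selection by data characteristics, and the candidate set is an illustrative assumption:

```python
# Two-step sketch: select the best-performing dimension reduction method
# by cross-validation, then apply it to the data at hand.
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

X, y = load_wine(return_X_y=True)
candidates = {
    "PCA": make_pipeline(StandardScaler(), PCA(n_components=2),
                         KNeighborsClassifier()),
    "LDA": make_pipeline(StandardScaler(),
                         LinearDiscriminantAnalysis(n_components=2),
                         KNeighborsClassifier()),
}
scores = {name: cross_val_score(p, X, y, cv=5).mean()
          for name, p in candidates.items()}
best = max(scores, key=scores.get)
print(scores, "-> selected:", best)
candidates[best].fit(X, y)    # step 2: apply the selected method
```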
