Similar Articles
20 similar articles found.
1.
Semiparametric partially linear varying coefficient models (SPLVCM) are frequently used in statistical modeling. With high-dimensional covariates in both the parametric and nonparametric parts of the SPLVCM, sparse modeling is often considered in practice. In this paper, we propose a new estimation and variable selection procedure based on modal regression, where the nonparametric functions are approximated by a $B$-spline basis. The outstanding merit of the proposed variable selection procedure is that it can achieve both robustness and efficiency by introducing an additional tuning parameter (i.e., the bandwidth $h$). Its oracle property is also established for both the parametric and nonparametric parts. Moreover, we give a data-driven bandwidth selection method and propose an EM-type algorithm for the proposed method. A Monte Carlo simulation study and a real data example are conducted to examine the finite sample performance of the proposed method. Both the simulation results and the real data analysis confirm that the newly proposed method works very well.
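
A minimal sketch of the EM-type idea on a plain linear model (my own illustration, not the authors' SPLVCM code; the Gaussian kernel, the variable names, and the toy data are assumptions): each iteration reweights observations by a kernel of the residuals with bandwidth $h$ and then solves a weighted least squares problem, which is what makes the fit robust while $h$ tunes the robustness/efficiency trade-off.

```python
import numpy as np

def modal_regression_em(X, y, h, n_iter=100, tol=1e-8):
    """EM-type algorithm for linear modal regression with a Gaussian kernel.

    E-step: weight each observation by a Gaussian kernel of its residual;
    M-step: weighted least squares.  `h` is the bandwidth controlling the
    robustness/efficiency trade-off mentioned in the abstract.
    """
    beta = np.linalg.lstsq(X, y, rcond=None)[0]            # OLS start
    for _ in range(n_iter):
        r = y - X @ beta
        w = np.exp(-0.5 * (r / h) ** 2)                    # kernel weights (E-step)
        w /= w.sum()
        W = np.diag(w)
        beta_new = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)  # weighted LS (M-step)
        if np.linalg.norm(beta_new - beta) < tol:
            return beta_new
        beta = beta_new
    return beta

# toy usage: linear data contaminated with a block of outliers
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(200), rng.normal(size=200)])
y = X @ np.array([1.0, 2.0]) + rng.normal(scale=0.3, size=200)
y[:20] += 8.0
print(modal_regression_em(X, y, h=0.5))
```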

2.
In model-based clustering and classification, the cluster-weighted model is a convenient approach when the random vector of interest is constituted by a response variable $Y$ and by a vector ${\varvec{X}}$ of $p$ covariates. However, its applicability may be limited when $p$ is high. To overcome this problem, this paper assumes a latent factor structure for ${\varvec{X}}$ in each mixture component, under Gaussian assumptions. This leads to the cluster-weighted factor analyzers (CWFA) model. By imposing constraints on the variance of $Y$ and the covariance matrix of ${\varvec{X}}$, a novel family of sixteen CWFA models is introduced for model-based clustering and classification. The alternating expectation-conditional maximization algorithm for maximum likelihood estimation of the parameters of all models in the family is described; to initialize the algorithm, a 5-step hierarchical procedure is proposed, which uses the nested structures of the models within the family and thus guarantees the natural ranking among the sixteen likelihoods. Artificial and real data show that these models have very good clustering and classification performance and that the algorithm is able to recover the parameters very well.

3.
We introduce a dimension reduction method for model-based clustering obtained from a finite mixture of $t$-distributions. This approach is based on existing work on reducing dimensionality in the case of finite Gaussian mixtures. The method relies on identifying a reduced subspace of the data by considering the extent to which group means and group covariances vary. This subspace contains linear combinations of the original data, which are ordered by importance via the associated eigenvalues. Observations can be projected onto the subspace, and the resulting set of variables captures most of the clustering structure available in the data. The approach is illustrated using simulated and real data, where it outperforms its Gaussian analogue.

4.
The accurate estimation of a precision matrix plays a crucial role in the current age of high-dimensional data explosion. To deal with this problem, one of the most prominent and commonly used techniques is \(\ell_1\)-norm (Lasso) penalization of a given loss function. This approach guarantees the sparsity of the precision matrix estimate for properly selected penalty parameters. However, \(\ell_1\)-norm penalization often fails to control the bias of the obtained estimator because of its overestimation behavior. In this paper, we introduce two adaptive extensions of the recently proposed \(\ell_1\)-norm penalized D-trace loss minimization method. They aim at reducing the bias produced in the estimator. Extensive numerical results, using both simulated and real datasets, show the advantage of our proposed estimators.
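
To make the adaptive idea concrete, the sketch below merely evaluates an adaptively weighted \(\ell_1\)-penalized D-trace objective; the specific D-trace form \(\tfrac{1}{2}\,\mathrm{tr}(\Theta\hat{\Sigma}\Theta)-\mathrm{tr}(\Theta)\), the weight rule \(1/|\tilde{\theta}_{ij}|^{\gamma}\), and all function names are assumptions made for illustration, not the authors' estimators.

```python
import numpy as np

def adaptive_dtrace_objective(Theta, Sigma_hat, Theta_pilot, lam, gamma=1.0):
    """Adaptively weighted l1-penalized D-trace objective (illustrative only).

    Loss: 0.5 * tr(Theta @ Sigma_hat @ Theta) - tr(Theta).
    Penalty: lam * sum_{i != j} w_ij * |Theta_ij| with adaptive weights
    w_ij = 1 / (|Theta_pilot_ij|**gamma + eps), so entries that a pilot
    estimate deems large are penalized less, reducing the Lasso bias.
    """
    eps = 1e-10
    loss = 0.5 * np.trace(Theta @ Sigma_hat @ Theta) - np.trace(Theta)
    W = 1.0 / (np.abs(Theta_pilot) ** gamma + eps)
    np.fill_diagonal(W, 0.0)              # leave the diagonal unpenalized
    penalty = lam * np.sum(W * np.abs(Theta))
    return loss + penalty

# tiny usage with a 3x3 example
Sigma_hat = np.eye(3)
Theta = np.eye(3)
print(adaptive_dtrace_objective(Theta, Sigma_hat, Theta_pilot=np.eye(3), lam=0.1))
```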

5.
In this paper, we study the estimation and variable selection of the sufficient dimension reduction space for survival data via a new combination of an $L_1$ penalty and the refined outer product of gradients method (rOPG; Xia et al. in J R Stat Soc Ser B 64:363–410, 2002), called SH-OPG hereafter. SH-OPG can exhaustively estimate the central subspace and select the informative covariates simultaneously; meanwhile, the estimated directions remain orthogonal automatically after dropping noninformative regressors. The efficiency of SH-OPG is verified through extensive simulation studies and real data analysis.

6.
Nearest neighbour classification requires a good distance metric. Previous approaches try to learn a quadratic distance metric so that observations of different classes are well separated. For high-dimensional problems, where many uninformative variables are present, it is attractive to select a sparse distance metric, both to increase predictive accuracy and to aid interpretation of the result. We investigate the \(\ell_1\)-regularized metric learning problem, making a connection with the Lasso algorithm in the linear least squares setting. We show that the fitted transformation matrix is close to the desired transformation matrix in \(\ell_1\)-norm by assuming a version of the compatibility condition.

7.
For appropriate matrix ensembles, greedy algorithms have proven to be an efficient means of solving the combinatorial optimization problem associated with compressed sensing. This paper describes an implementation for graphics processing units (GPU) of hard thresholding, iterative hard thresholding, normalized iterative hard thresholding, hard thresholding pursuit, and a two-stage thresholding algorithm based on compressive sampling matching pursuit and subspace pursuit. The GPU acceleration of the former bottleneck, namely the matrix–vector multiplications, transfers a significant portion of the computational burden to the identification of the support set. The software solves high-dimensional problems in fractions of a second, which permits large-scale testing at dimensions currently unavailable in the literature. The GPU implementations exhibit up to 70$\times$ acceleration over standard Matlab central processing unit implementations using automatic multi-threading.
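
As a point of reference for the algorithms listed above, here is a minimal CPU-side NumPy sketch of plain iterative hard thresholding (an illustration, not the paper's GPU code; the unit step size assumes the columns of $A$ are roughly normalized).

```python
import numpy as np

def hard_threshold(x, s):
    """Keep the s largest-magnitude entries of x and zero out the rest."""
    out = np.zeros_like(x)
    idx = np.argsort(np.abs(x))[-s:]
    out[idx] = x[idx]
    return out

def iht(A, b, s, n_iter=200):
    """Iterative hard thresholding: x <- H_s(x + A^T (b - A x))."""
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = hard_threshold(x + A.T @ (b - A @ x), s)
    return x

# toy compressed-sensing usage
rng = np.random.default_rng(1)
m, n, s = 80, 200, 5
A = rng.normal(size=(m, n)) / np.sqrt(m)      # roughly unit-norm columns
x_true = np.zeros(n)
x_true[rng.choice(n, s, replace=False)] = rng.normal(size=s)
b = A @ x_true
print(np.linalg.norm(iht(A, b, s) - x_true))
```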

8.
Many applications in data analysis rely on the decomposition of a data matrix into a low-rank and a sparse component. Existing methods that tackle this task use the nuclear norm and \(\ell_1\)-cost functions as convex relaxations of the rank constraint and the sparsity measure, respectively, or employ thresholding techniques. We propose a method that allows for reconstructing and tracking a subspace of upper-bounded dimension from incomplete and corrupted observations. It does not require any a priori information about the number of outliers. The core of our algorithm is an intrinsic Conjugate Gradient method on the set of orthogonal projection matrices, the so-called Grassmannian. Non-convex sparsity measures are used for outlier detection, which leads to improved performance in terms of robustly recovering and tracking the low-rank matrix. In particular, our approach can cope with more outliers and with an underlying matrix of higher rank than other state-of-the-art methods.

9.
We assume data sampled from a mixture of \(d\)-dimensional linear subspaces with spherically symmetric distributions within each subspace and an additional outlier component with a spherically symmetric distribution within the ambient space (for simplicity, we may assume that all distributions are uniform on their corresponding unit spheres). We also assume mixture weights for the different components. We say that one of the underlying subspaces of the model is most significant if its mixture weight is higher than the sum of the mixture weights of all other subspaces. We study the recovery of the most significant subspace by minimizing the \(l_p\)-averaged distances of data points from \(d\)-dimensional subspaces of \(\mathbb{R}^D\), where \(0 < p \in \mathbb{R}\). Unlike other \(l_p\) minimization problems, this minimization is nonconvex for all \(p>0\) and thus requires different methods for its analysis. We show that if \(0 < p \le 1\), then for any fraction of outliers, the most significant subspace can be recovered by \(l_p\) minimization with overwhelming probability (which depends on the generating distribution and its parameters). We show that when adding small noise around the underlying subspaces, the most significant subspace can be nearly recovered by \(l_p\) minimization for any \(0 < p \le 1\), with an error proportional to the noise level. On the other hand, if \(p>1\) and there is more than one underlying subspace, then with overwhelming probability the most significant subspace cannot be recovered or nearly recovered. This last result does not require spherically symmetric outliers.
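
The objective being minimized can be stated concretely; the sketch below computes the \(l_p\)-averaged distance of data points from a \(d\)-dimensional subspace, assuming the subspace is represented by an orthonormal basis (an illustration only; the function name and toy data are made up).

```python
import numpy as np

def lp_averaged_distance(X, U, p):
    """Average of ||x_i - U U^T x_i||^p over the rows x_i of X.

    X : (N, D) data matrix; U : (D, d) orthonormal basis of the candidate
    subspace; p > 0.  Minimizing this over d-dimensional subspaces is the
    nonconvex problem described in the abstract.
    """
    residuals = X - (X @ U) @ U.T          # projection residuals
    dists = np.linalg.norm(residuals, axis=1)
    return np.mean(dists ** p)

# toy usage: distance of noisy planar data to the true plane in R^3
rng = np.random.default_rng(2)
U = np.linalg.qr(rng.normal(size=(3, 2)))[0]
X = rng.normal(size=(500, 2)) @ U.T + 0.01 * rng.normal(size=(500, 3))
print(lp_averaged_distance(X, U, p=1.0))
```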

10.
Recently, the matrix norm $l_{2,1}$ has been widely applied to feature selection in many areas, such as computer vision, pattern recognition, and biological studies. As an extension of the $l_1$ norm, the $l_{2,1}$ matrix norm is often used to find jointly sparse solutions. Computational studies have shown that the solution of $l_p$-minimization ($0<p<1$) is sparser than that of $l_1$-minimization, so the generalized $l_{2,p}$-minimization ($p\in (0,1]$) is naturally expected to have better sparsity than $l_{2,1}$-minimization. This paper presents a class of models based on the $l_{2,p}\ (p\in (0, 1])$ matrix norm, which yields a non-convex and non-Lipschitz continuous optimization problem when $p$ is fractional ($0<p<1$). For all $p$ in $(0, 1]$, a unified algorithm is proposed to solve the $l_{2,p}$-minimization, and its convergence is demonstrated uniformly. In the practical implementation of the algorithm, a gradient projection technique is utilized to reduce the computational cost. As typical applications, different $l_{2,p}\ (p\in (0,1])$ norms are applied to select features in computational biology.
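
For concreteness, the penalty can be computed as below, assuming the usual row-wise definition $\Vert W\Vert_{2,p}^p=\sum_i \Vert w_i\Vert_2^p$ (a small illustration, not the paper's algorithm).

```python
import numpy as np

def l2p_norm_pth_power(W, p):
    """Return ||W||_{2,p}^p = sum_i ||w_i||_2^p over the rows w_i of W.

    p = 1 recovers the usual l_{2,1} joint-sparsity penalty; 0 < p < 1
    penalizes row norms more aggressively and encourages sparser rows.
    """
    row_norms = np.linalg.norm(W, axis=1)
    return np.sum(row_norms ** p)

W = np.array([[0.0, 0.0, 0.0],
              [1.0, -2.0, 2.0],
              [0.5, 0.0, 0.0]])
print(l2p_norm_pth_power(W, p=1.0))   # l_{2,1}: 0 + 3 + 0.5 = 3.5
print(l2p_norm_pth_power(W, p=0.5))
```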

11.
High-dimensional feature selection has become increasingly crucial for seeking parsimonious models in estimation. For selection consistency, we derive one necessary and sufficient condition formulated in terms of the notion of degree of separation. The minimal degree of separation is necessary for any method to be selection consistent. At a level slightly higher than the minimal degree of separation, selection consistency is achieved by a constrained $L_0$-method and its computational surrogate, the constrained truncated $L_1$-method. This permits up to exponentially many features relative to the sample size. In other words, these methods are optimal in feature selection against any selection method. In contrast, their regularization counterparts, the $L_0$-regularization and truncated $L_1$-regularization methods, do so under slightly stronger assumptions. More importantly, sharper parameter estimation/prediction is realized through such selection, leading to minimax parameter estimation. This is otherwise impossible in the absence of a good selection method in high-dimensional analysis.

12.
Alberto Seeger, TOP, 2014, 22(3): 1017–1027
Let \(\mathbb{M}_{m,n}\) be the linear space of real matrices of dimension \(m \times n\). A variational problem that arises quite often in applications is that of minimizing a real-valued function \(f\) on some feasible set \(\Omega \subseteq \mathbb{M}_{m,n}\). Matrix optimization problems of such a degree of generality are not always easy to deal with, especially if the decision variable is a high-dimensional rectangular matrix. Sometimes it is possible to reduce the size and complexity of the matrix optimization problem in the presence of symmetry assumptions (isotropy, orthogonal invariance, etc.). This work establishes a localization result for the solutions to a class of extremal problems involving isotropic sets and functions.

13.
Motivated by the method for solving the center-based least squares clustering problem (Kogan in Introduction to clustering large and high-dimensional data, Cambridge University Press, 2007; Teboulle in J Mach Learn Res 8:65–102, 2007), we construct a very efficient iterative process for solving the one-dimensional center-based $l_1$-clustering problem, on the basis of which it is possible to determine the optimal partition. We analyze the basic properties and convergence of our iterative process, which converges to a stationary point of the corresponding objective function for each choice of the initial approximation. A corresponding algorithm is also given, which in only a few steps yields a stationary point and the corresponding partition. The method is illustrated and visualized on an example of searching for an optimal partition with two clusters, where we check all stationary points of the corresponding minimizing functional. The method is also tested on large numbers of data points and clusters and compared with the method for solving the center-based least squares clustering problem described in Kogan (2007) and Teboulle (2007).
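
Because the one-dimensional $l_1$-optimal center of a cluster is a median, the generic alternating scheme below (a minimal sketch of that principle, not the authors' iterative process) already converges to a stationary point of the objective.

```python
import numpy as np

def l1_kmedians_1d(x, centers, n_iter=100):
    """Alternating scheme for one-dimensional center-based l1 clustering.

    Assignment step: send each point to its nearest center in |.|;
    update step: replace each center by the median of its cluster.
    The objective sum_i min_j |x_i - c_j| is nonincreasing along the way.
    """
    c = np.asarray(centers, dtype=float).copy()
    for _ in range(n_iter):
        labels = np.argmin(np.abs(x[:, None] - c[None, :]), axis=1)
        new_c = np.array([np.median(x[labels == j]) if np.any(labels == j) else c[j]
                          for j in range(len(c))])
        if np.allclose(new_c, c):
            break
        c = new_c
    return c, labels

# toy usage: two clusters plus a few outliers
rng = np.random.default_rng(3)
x = np.concatenate([rng.normal(0, 0.5, 100), rng.normal(5, 0.5, 100), [20.0, 25.0]])
print(l1_kmedians_1d(x, centers=[0.0, 1.0])[0])
```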

14.
We propose a variable selection procedure in model-based clustering using multilocus genotype data. Indeed, it may happen that some loci are not relevant for clustering into statistically different populations. Inferring the number K of clusters and the relevant clustering subset S of loci is seen as a model selection problem. The competing models are compared using penalized maximum likelihood criteria. Under weak assumptions on the penalty function, we prove the consistency of the resulting estimator ${(\widehat{K}_n, \widehat{S}_n)}$. An associated algorithm named Mixture Model for Genotype Data (MixMoGenD) has been implemented using the C++ programming language and is available at http://www.math.u-psud.fr/~toussile. To avoid an exhaustive search for the optimal model, we propose a modified backward-stepwise algorithm, which enables a better search of the optimal model among all possible cardinalities of S. We present numerical experiments on simulated and real datasets that highlight the interest of our loci selection procedure.

15.
An efficient algorithm is derived for solving the quantile regression problem combined with a group sparsity promoting penalty. The group sparsity of the regression parameters is achieved by using an \(\ell_{1,\infty}\)-norm penalty (or constraint) on the regression parameters. The algorithm is efficient in the sense that it obtains the regression parameters for a wide range of penalty parameters, thus enabling easy application of a model selection criterion afterwards. A Matlab implementation of the proposed algorithm is provided, and some applications of the method are studied.
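
A small sketch of the penalized objective may help; it evaluates the quantile (pinball) loss plus an \(\ell_{1,\infty}\) group penalty. This only illustrates the objective, not the paper's algorithm or its Matlab implementation, and the grouping passed in is an assumption.

```python
import numpy as np

def pinball_loss(residuals, tau):
    """Quantile-regression check loss at quantile level tau in (0, 1)."""
    return np.sum(np.maximum(tau * residuals, (tau - 1.0) * residuals))

def group_l1_inf_penalty(beta, groups):
    """l_{1,inf} group penalty: sum over groups of the max |coefficient|."""
    return sum(np.max(np.abs(beta[g])) for g in groups)

def objective(beta, X, y, tau, lam, groups):
    return pinball_loss(y - X @ beta, tau) + lam * group_l1_inf_penalty(beta, groups)

# toy usage: 6 coefficients split into two groups of three
rng = np.random.default_rng(4)
X = rng.normal(size=(50, 6))
y = X[:, 0] - 0.5 * X[:, 1] + rng.normal(size=50)
beta = np.array([1.0, -0.5, 0.0, 0.0, 0.0, 0.0])
groups = [np.arange(0, 3), np.arange(3, 6)]
print(objective(beta, X, y, tau=0.5, lam=1.0, groups=groups))
```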

16.
17.
Any abstract convex cone $S$ with a uniformity satisfying the law of cancellation can be embedded in a topological vector space $\widetilde{S}$ (Urbański, Bull Acad Pol Sci, Sér Sci Math Astron Phys 24:709–715, 1976). We introduce a notion of a cone symmetry and, in Theorem 2.12, decompose the quotient vector space $\widetilde{S}$ into a topological direct sum of its symmetric subspace $\widetilde{S}_s$ and asymmetric subspace $\widetilde{S}_a$. In Theorem 2.19 we prove a similar decomposition for a normed space $\widetilde{S}$. In Section 3 we apply the decomposition to the Minkowski–Rådström–Hörmander (MRH) space with the three best known norms and four symmetries. In Section 4 we obtain a continuous selection from the MRH space over $\mathbb{R}^2$ to the family of pairs of nonempty compact convex subsets of $\mathbb{R}^2$.

18.
We study a high-dimensional analog of the notion of an acyclic (aka transitive) tournament. We give upper and lower bounds on the number of $d$-dimensional $n$-vertex acyclic tournaments. In addition, we prove that every $n$-vertex $d$-dimensional tournament contains an acyclic subtournament of $\Omega (\log ^{1/d}n)$ vertices and that this bound is tight. The corresponding statement for tournaments (i.e., the case $d=1$) is a well-known fact. We indicate a connection between acyclic high-dimensional tournaments and Ramsey numbers of hypergraphs. We also investigate the inter-relations among various other notions of acyclicity in high-dimensional tournaments. These include combinatorial, geometric and topological concepts.

19.
We consider the unconstrained $L_q$-$L_p$ minimization: find a minimizer of $\Vert Ax-b\Vert ^q_q+\lambda \Vert x\Vert ^p_p$ for given $A \in R^{m\times n}$, $b\in R^m$ and parameters $\lambda >0$, $p\in [0, 1)$ and $q\ge 1$. This problem has been studied extensively in many areas. In particular, for the case when $q=2$, this problem is known as the $L_2$-$L_p$ minimization problem and has found applications in variable selection and sparse least squares fitting for high-dimensional data. Theoretical results show that the minimizers of the $L_q$-$L_p$ problem have various attractive features due to the concavity and non-Lipschitzian property of the regularization function $\Vert \cdot \Vert ^p_p$. In this paper, we show that the $L_q$-$L_p$ minimization problem is strongly NP-hard for any $p\in [0,1)$ and $q\ge 1$, including its smoothed version. On the other hand, we show that, by choosing the parameters $(p,\lambda )$ carefully, a minimizer, global or local, will have certain desired sparsity. We believe that these results provide new theoretical insights into the study and applications of concave regularized optimization problems.
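
For reference, the regularized objective can be evaluated directly as below (an illustration, not a solver; for $p=0$ the regularizer is taken as the number of nonzero entries, the usual convention).

```python
import numpy as np

def lq_lp_objective(x, A, b, lam, p, q):
    """Evaluate ||Ax - b||_q^q + lam * ||x||_p^p.

    For p = 0 the regularizer counts the nonzero entries of x, the usual
    convention for the L0 "norm".
    """
    data_term = np.sum(np.abs(A @ x - b) ** q)
    reg = np.count_nonzero(x) if p == 0 else np.sum(np.abs(x) ** p)
    return data_term + lam * reg

# toy usage with q = 2 (the L2-Lp case from the abstract) and p = 0.5
rng = np.random.default_rng(5)
A = rng.normal(size=(20, 50))
x = np.zeros(50)
x[[3, 17, 31]] = [1.0, -2.0, 0.5]
b = A @ x
print(lq_lp_objective(x, A, b, lam=0.1, p=0.5, q=2))
```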

20.
A reorthogonalized block classical Gram–Schmidt algorithm is proposed that factors a full column rank matrix $A$ into $A=QR$, where $Q$ is left orthogonal (has orthonormal columns) and $R$ is upper triangular and nonsingular. This block Gram–Schmidt algorithm can be implemented using matrix–matrix operations, making it more efficient on modern architectures than orthogonal factorization algorithms based upon matrix–vector operations and purely vector operations. Gram–Schmidt orthogonal factorizations are important in the stable implementation of Krylov space methods such as GMRES and in approaches to modifying orthogonal factorizations when columns and rows are added to or deleted from a matrix. With appropriate assumptions about the diagonal blocks of $R$, the algorithm, when implemented in floating point arithmetic with machine unit $\varepsilon _M$, produces $Q$ and $R$ such that $\Vert I - Q^T Q \Vert =O(\varepsilon _M)$ and $\Vert A-QR \Vert =O(\varepsilon _M\Vert A \Vert )$. The first of these bounds has not been shown for a block Gram–Schmidt procedure before. As a consequence of these results, we provide a different analysis, with a slightly different assumption, that re-establishes a bound of Giraud et al. (Num Math, 101(1):87–100, 2005) for the CGS2 algorithm.
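
For intuition, here is a compact NumPy sketch of a block classical Gram–Schmidt with reorthogonalization (each block is projected against the previously computed $Q$ twice before a local QR). This is one common reorthogonalization pattern offered as an illustration; it does not claim to reproduce the paper's exact algorithm, its assumptions on the diagonal blocks of $R$, or its rounding-error analysis.

```python
import numpy as np

def block_cgs_reorth(A, block_size):
    """Block classical Gram-Schmidt QR with one reorthogonalization pass.

    Each block column is orthogonalized against the already-computed Q
    twice ("twice is enough"), then factored by a local QR; the projection
    coefficients are accumulated into the corresponding block of R.
    """
    m, n = A.shape
    Q = np.zeros((m, 0))
    R = np.zeros((n, n))
    for j in range(0, n, block_size):
        cols = slice(j, min(j + block_size, n))
        V = A[:, cols].copy()
        Rj = np.zeros((j, V.shape[1]))
        for _ in range(2):                  # two CGS passes against earlier blocks
            S = Q.T @ V
            V = V - Q @ S
            Rj = Rj + S
        Qj, Rjj = np.linalg.qr(V)           # intra-block orthogonalization
        R[:j, cols] = Rj
        R[cols, cols] = Rjj
        Q = np.hstack([Q, Qj])
    return Q, R

# toy usage: check loss of orthogonality and the residual ||A - QR||
rng = np.random.default_rng(6)
A = rng.normal(size=(100, 12))
Q, R = block_cgs_reorth(A, block_size=4)
print(np.linalg.norm(np.eye(12) - Q.T @ Q), np.linalg.norm(A - Q @ R))
```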
