Similar literature
Found 20 similar articles; search time: 10 ms
1.
Using the so-called martingale difference correlation (MDC), we propose a novel censored-conditional-quantile screening approach for ultrahigh-dimensional survival data with heterogeneity (which is often present in such data). By incorporating a weighting scheme, this method is a natural extension of MDC-based conditional quantile screening, as considered in Shao and Zhang (2014), to handle ultrahigh-dimensional survival data. The proposed screening procedure has a sure-screening property under certain technical conditions and an excellent capability of detecting the nonlinear relationship between independent and censored dependent variables. Both simulation results and an analysis of real data demonstrate the effectiveness of the new censored conditional quantile-screening procedure.

2.
High-dimensional data have frequently been collected in many scientific areas, including genome-wide association studies, biomedical imaging, tomography, tumor classification, and finance. Analysis of high-dimensional data poses many challenges for statisticians. Feature selection and variable selection are fundamental for high-dimensional data analysis. The sparsity principle, which assumes that only a small number of predictors contribute to the response, is frequently adopted and deemed useful in the analysis of high-dimensional data. Following this general principle, a large number of variable selection approaches via penalized least squares or likelihood have been developed in the recent literature to estimate a sparse model and select significant variables simultaneously. While the penalized variable selection methods have been successfully applied in many high-dimensional analyses, modern applications in areas such as genomics and proteomics push the dimensionality of data to an even larger scale, where the dimension of data may grow exponentially with the sample size. This has been called ultrahigh-dimensional data in the literature. This work aims to present a selective overview of feature screening procedures for ultrahigh-dimensional data. We focus on insights into how to construct marginal utilities for feature screening on specific models and on the motivation for model-free feature screening procedures.
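
The idea of a marginal utility for feature screening can be illustrated with a minimal sketch. It assumes the simplest possible utility, the absolute Pearson correlation between each predictor and the response, and keeps the `d` top-ranked predictors. The function names `pearson` and `marginal_screen`, the simulated data, and the choice `d=20` are hypothetical and are not taken from the overview itself.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0  # a constant column carries no marginal signal
    return sxy / math.sqrt(sxx * syy)


def marginal_screen(columns, y, d):
    """Return the indices of the d predictors with the largest |correlation|.

    columns is a list of predictor columns, each of the same length as y.
    """
    utilities = [abs(pearson(col, y)) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda j: utilities[j], reverse=True)
    return sorted(ranked[:d])


# Simulated example (assumed setup): p = 500 predictors, n = 200 observations,
# and only the first two predictors enter the response.
rng = random.Random(0)
n, p = 200, 500
columns = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(p)]
y = [3.0 * columns[0][i] - 2.0 * columns[1][i] + rng.gauss(0.0, 1.0) for i in range(n)]
selected = marginal_screen(columns, y, d=20)
```

A correlation-based utility of this kind only detects linear marginal effects, which is one motivation for the model-free screening procedures mentioned in the abstract.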

3.
In this paper, the conditional distance correlation (CDC) is used as a measure of correlation to develop a conditional feature screening procedure, given some significant variables, for ultrahigh-dimensional data. The proposed procedure is model free and is called conditional distance correlation-sure independence screening (CDC-SIS for short). That is, we do not specify any model structure between the response and the predictors, which is appealing in some practical problems of ultrahigh-dimensional data analysis. The sure screening property of the CDC-SIS is proved, and a simulation study is conducted to evaluate its finite-sample performance. A real data analysis is used to illustrate the proposed method. The results indicate that CDC-SIS performs well.
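
As a rough illustration of model-free screening with a distance-based dependence measure, the sketch below ranks predictors by the ordinary (unconditional) sample distance correlation. This is a simplification and not the CDC-SIS procedure of the abstract, which additionally conditions on a set of known significant variables. All function names and the simulated data are hypothetical.

```python
import math
import random


def centered_distances(v):
    """Double-centered matrix of pairwise absolute differences of a 1-D sample."""
    n = len(v)
    d = [[abs(a - b) for b in v] for a in v]
    row_mean = [sum(row) / n for row in d]
    grand_mean = sum(row_mean) / n
    return [[d[i][j] - row_mean[i] - row_mean[j] + grand_mean for j in range(n)]
            for i in range(n)]


def mean_product(a, b):
    """Average of the elementwise product of two n-by-n matrices."""
    n = len(a)
    return sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / (n * n)


def dcor_centered(a, b):
    """Sample distance correlation from two double-centered distance matrices."""
    dcov2 = mean_product(a, b)
    dvar2 = mean_product(a, a) * mean_product(b, b)
    if dvar2 <= 0.0:
        return 0.0  # at least one sample is constant
    return math.sqrt(max(dcov2, 0.0) / math.sqrt(dvar2))


def distance_correlation(x, y):
    """Sample distance correlation of two univariate samples."""
    return dcor_centered(centered_distances(x), centered_distances(y))


def dc_screen(columns, y, d):
    """Keep the d predictors with the largest distance correlation with y."""
    b = centered_distances(y)  # computed once and reused for every predictor
    utilities = [dcor_centered(centered_distances(col), b) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda j: utilities[j], reverse=True)
    return sorted(ranked[:d])


# Simulated example (assumed setup): the response depends on the first
# predictor only through its square, so the linear correlation is near zero.
rng = random.Random(1)
n, p = 100, 30
columns = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(p)]
y = [columns[0][i] ** 2 + 0.1 * rng.gauss(0.0, 1.0) for i in range(n)]
kept = dc_screen(columns, y, d=5)
```

No regression model is specified anywhere in the sketch, which is the sense in which such a screening rule is model free.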

4.
Semiparametric linear transformation models have received much attention due to their high flexibility in modeling survival data. A useful estimating equation procedure was recently proposed by Chen et al. (2002) [21] for linear transformation models to jointly estimate parametric and nonparametric terms. They showed that this procedure can yield a consistent and robust estimator. However, the problem of variable selection for linear transformation models has been less studied, partly because a convenient loss function is not readily available in this context. In this paper, we propose a simple yet powerful approach to achieve both sparse and consistent estimation for linear transformation models. The main idea is to derive a profiled score from the estimating equation of Chen et al. [21], construct a loss function based on the profiled score and its variance, and then minimize the loss subject to some shrinkage penalty. Under regularity conditions, we show that the resulting estimator is consistent for both model estimation and variable selection. Furthermore, the estimated parametric terms are asymptotically normal and can achieve a higher efficiency than that obtained from the estimating equations. For computation, we suggest a one-step approximation algorithm which can take advantage of the LARS algorithm and build the entire solution path efficiently. The performance of the new procedure is illustrated through numerous simulations and real examples, including one microarray data set.

5.
An updated geometric build-up algorithm is developed for solving the molecular distance geometry problem with a sparse set of inter-atomic distances. Unlike the general geometric build-up algorithm, the updated algorithm re-computes the coordinates of the base atoms whenever necessary and possible. In this way, the errors introduced in solving the algebraic equations for the determination of the coordinates of the atoms are controlled in the intermediate computational steps. The method for re-computing the coordinates of the base atoms, based on an estimate of the root-mean-square deviation (RMSD), is described. The results of applying the updated algorithm to a set of protein structure problems are presented. In many cases, the updated algorithm solves the problems with high accuracy when the results of the general algorithm are inadequate.

6.
7.
8.
Dvořák [European J. Combin. 34 (2013), pp. 833–840] gave a bound on the minimum size of a distance -dominating set in terms of the maximum size of a distance -independent set and generalized coloring numbers, thus obtaining a constant-factor approximation algorithm for the parameters in any class of graphs with bounded expansion. We improve and clarify this dependence using a linear programming (LP)-based argument inspired by the work of Bansal and Umboh [Inform. Process. Lett. 122 (2017), pp. 21–24].

9.
In this paper, we consider the problem of selecting the variables of the fixed effects in linear mixed models where random effects are present and the observation vectors have been obtained from many clusters. As the variable selection procedure, we use the Akaike information criterion (AIC). In the context of linear mixed models, two kinds of AIC have been proposed: marginal AIC and conditional AIC. In this paper, we derive three versions of conditional AIC depending upon different estimators of the regression coefficients and the random effects. Simulation studies show that the proposed conditional AICs are superior to the marginal and conditional AICs proposed in the literature in the sense of selecting the true model. Finally, the results are extended to the case when the random effects in all the clusters are of the same dimension but have a common unknown covariance matrix.

10.
We consider normal ≡ Gaussian seemingly unrelated regressions (SUR) with incomplete data (ID). Imposing a natural minimal set of conditional independence constraints, we find a restricted SUR/ID model whose likelihood function and parameter space factor into the product of the likelihood functions and the parameter spaces of standard complete data multivariate analysis of variance models. Hence, the restricted model has a unimodal likelihood and permits explicit likelihood inference. In the development of our methodology, we review and extend existing results for complete data SUR models and the multivariate ID problem.

11.
Bounded additive models in data envelopment analysis (DEA) under the assumption of constant returns to scale (CRS) were recently introduced in the literature (Cooper et al. in J Product Anal 35(2):85–94, 2011; Pastor et al. in J Product Anal 40:285–292, 2013; Pastor et al. in Omega 56:16–24, 2015). In this paper, we propose to extend the knowledge generated so far about bounded additive models to the family of directional distance function (DDF) models in DEA, giving rise to a completely new subfamily of bounded or partially bounded CRS-DDF models. Finally, we test the new approach on a real agricultural panel data set, estimating efficiency and productivity change over time by means of the Luenberger indicator in a context where at least one variable is naturally bounded.

12.
In the problem of selecting the explanatory variables in the linear mixed model, we address the derivation of the (unconditional or marginal) Akaike information criterion (AIC) and the conditional AIC (cAIC). The covariance matrices of the random effects and the error terms include unknown parameters like variance components, and the selection procedures proposed in the literature are limited to the cases where the parameters are known or partly unknown. In this paper, AIC and cAIC are extended to the situation where the parameters are completely unknown and they are estimated by the general consistent estimators including the maximum likelihood (ML), the restricted maximum likelihood (REML) and other unbiased estimators. We derive, related to AIC and cAIC, the marginal and the conditional prediction error criteria which select superior models in light of minimizing the prediction errors relative to quadratic loss functions. Finally, numerical performances of the proposed selection procedures are investigated through simulation studies.

13.
14.
The main challenge in working with gene expression microarrays is that the sample size is small compared to the large number of variables (genes). In many studies, the main focus is on finding a small subset of the genes that are the most important for differentiating between different types of cancer, for simpler and cheaper diagnostic arrays. In this paper, a sparse Bayesian variable selection method in a probit model is proposed for gene selection and classification. We assign a sparse prior to the regression parameters and perform variable selection by indexing the covariates of the model with a binary vector. The correlation prior assigned to the binary vector in this paper is able to distinguish models of the same size. The performance of the proposed method is demonstrated with one simulated data set and two well-known real data sets, and the results show that our method is comparable with other existing methods in variable selection and classification.

15.
Conditional extreme value models have been introduced by Heffernan and Resnick (Ann. Appl. Probab., 17, 537–571, 2007) to describe the asymptotic behavior of a random vector as one specific component becomes extreme. Obviously, this class of models is related to classical multivariate extreme value theory which describes the behavior of a random vector as its norm (and therefore at least one of its components) becomes extreme. However, it turns out that this relationship is rather subtle and sometimes contrary to intuition. We clarify the differences between the two approaches with the help of several illuminative (counter)examples. Furthermore, we discuss marginal standardization, which is a useful tool in classical multivariate extreme value theory but, as we point out, much less straightforward and sometimes even obscuring in conditional extreme value models. Finally, we indicate how, in some situations, a more comprehensive characterization of the asymptotic behavior can be obtained if the conditions of conditional extreme value models are relaxed so that the limit is no longer unique.

16.
In this paper we study the asymptotic properties of the adaptive Lasso estimate in high-dimensional sparse linear regression models with heteroscedastic errors. It is demonstrated that model selection properties and asymptotic normality of the selected parameters remain valid but with a suboptimal asymptotic variance. A weighted adaptive Lasso estimate is introduced and investigated. In particular, it is shown that the new estimate performs consistent model selection and that linear combinations of the estimates corresponding to the non-vanishing components are asymptotically normally distributed with a smaller variance than those obtained by the “classical” adaptive Lasso. The results are illustrated in a data example and by means of a small simulation study.

17.
The present paper deals with statistical inference for the simultaneous switching autoregressive (SSAR) model. This model was introduced by Kunitomo and Sato (Jpn. Econ. Rev. 50 (2) (1996) 161) in order to take into account the asymmetry in financial and economic time series modelling. Under some conditions which ensure certain probabilistic properties of the model, and under other mild assumptions, we establish the asymptotic properties of the minimum Hellinger distance estimates of the parameters. An application to real data is also given.

18.
Many risk measures have recently been introduced which (for discrete random variables) result in linear programs (LP). While some LP computable risk measures may be viewed as approximations to the variance (e.g., the mean absolute deviation or Gini’s mean absolute difference), shortfall or quantile risk measures have recently been gaining popularity in various financial applications. In this paper we study LP solvable portfolio optimization models based on extensions of the Conditional Value at Risk (CVaR) measure. The models use multiple CVaR measures, thus allowing for more detailed risk aversion modeling. We study both the theoretical properties of the models and their performance on real-life data.
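
For discrete scenarios, CVaR can be computed from the standard minimization representation, CVaR at level beta being the minimum over eta of eta + E[(L - eta)+] / (1 - beta), which is the form that leads to a linear program. The sketch below evaluates that representation by enumeration instead of calling an LP solver. It assumes a loss orientation (larger values are worse), and the weighted combination in `weighted_cvar` is only one plausible reading of "multiple CVaR measures". Function names, levels, and weights are hypothetical and not taken from the paper.

```python
def cvar_discrete(losses, probs, beta):
    """CVaR at level beta for a discrete loss distribution.

    Uses min over eta of eta + E[max(L - eta, 0)] / (1 - beta). The objective
    is convex and piecewise linear in eta with kinks at the scenario losses,
    so it is enough to evaluate it at those points.
    """
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must lie in [0, 1)")
    if len(losses) != len(probs) or not losses:
        raise ValueError("losses and probs must be non-empty and equally long")

    def objective(eta):
        excess = sum(p * max(loss - eta, 0.0) for loss, p in zip(losses, probs))
        return eta + excess / (1.0 - beta)

    return min(objective(eta) for eta in losses)


def weighted_cvar(losses, probs, levels, weights):
    """Weighted sum of CVaR values at several levels (assumed combination rule)."""
    return sum(w * cvar_discrete(losses, probs, b) for b, w in zip(levels, weights))


# Ten equally likely loss scenarios (illustrative data).
losses = [float(k) for k in range(1, 11)]
probs = [0.1] * 10
cvar_80 = cvar_discrete(losses, probs, 0.8)
mixed = weighted_cvar(losses, probs, levels=[0.5, 0.8], weights=[0.5, 0.5])
```

In this example `cvar_80` should equal the mean of the two largest losses, 9.5, up to floating-point rounding. In a portfolio model the scenario losses would depend linearly on the portfolio weights, and eta together with the excess terms would become LP variables.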

19.
20.
In this paper, we propose a new correlation, called stable correlation, to measure the dependence between two random vectors. The new correlation is well defined without the moment condition and is zero if and only if the two random vectors are independent. We also study its other theoretical properties. Based on the new correlation, we further propose a robust model-free feature screening procedure for ultrahigh-dimensional data and establish its sure screening property and rank consistency property without imposing the subexponential or sub-Gaussian tail condition, which is commonly required in the literature on feature screening. We also examine the finite-sample performance of the proposed robust feature screening procedure via Monte Carlo simulation studies and illustrate the proposed procedure with a real data example.
