Found 20 similar documents; search took 15 ms
1.
Sparse grids allow one to employ grid-based discretization methods in data-driven problems. We present an extension of the classical sparse grid approach that allows us to tackle high-dimensional problems by spatially adaptive refinement, modified ansatz functions, and efficient regularization techniques. The competitiveness of this method is shown for typical benchmark problems with up to 166 dimensions for classification in data mining, pointing out properties of sparse grids in this context. To gain insight into the adaptive refinement and to examine the scope for further improvements, the approximation of non-smooth indicator functions with adaptive sparse grids has been studied as a model problem. As an example of an improved adaptive grid refinement, we present results for an edge-detection strategy.
2.
We consider (a discretization of) a functional of white noise over a finite time interval. We explore the possible benefit of representing the white noise in orthonormal bases of orthogonal polynomials or wavelets for the numerical evaluation of the expected value of this functional. Using the Wiener–Itô decomposition of the functional, we study the sparsity of its representation in these bases. An approximation scheme is proposed that uses existing low-dimensional quasi-Monte Carlo rules and takes advantage of the sparse structure of the quadratic part of the functional.
3.
In this article, we deal with sparse high-dimensional multivariate regression models. The models differ from ordinary multivariate regression models in two respects: (1) the dimension of the response vector and the number of covariates diverge to infinity; (2) the coefficient matrix and the precision matrix are sparse. We develop a two-stage sequential conditional selection (TSCS) approach to the identification and estimation of the nonzeros of the coefficient matrix and the precision matrix. It is established that the TSCS is selection consistent for the identification of the nonzeros of both the coefficient matrix and the precision matrix. Simulation studies comparing TSCS with existing state-of-the-art methods demonstrate that the TSCS approach outperforms them. As an illustration, the TSCS approach is also applied to a real dataset.
4.
This paper is concerned with error density estimation in a high-dimensional sparse linear model, where the number of variables may be larger than the sample size. An improved two-stage refitted cross-validation procedure based on a random splitting technique is used to obtain the residuals of the model, and then the traditional kernel density method is applied to estimate the error density. Under suitable sparsity conditions, the large-sample properties of the estimator, including consistency, asymptotic normality, and the law of the iterated logarithm, are obtained. In particular, we give the relationship between the sparsity and the convergence rate of the kernel density estimator. The simulation results show that our error density estimator performs well. A real data example is presented to illustrate our methods.
6.
The adaptive lasso is a model selection method shown to be both consistent in variable selection and asymptotically normal in coefficient estimation. The actual variable selection performance of the adaptive lasso depends on the weight used. It turns out that the weight assignment using the OLS estimate (OLS-adaptive lasso) can result in very poor performance when collinearity of the model matrix is a concern. To achieve better variable selection results, we take into account the standard errors of the OLS estimate for weight calculation, and propose two different versions of the adaptive lasso denoted by SEA-lasso and NSEA-lasso. We show through numerical studies that when the predictors are highly correlated, SEA-lasso and NSEA-lasso can outperform OLS-adaptive lasso under a variety of linear regression settings while maintaining the same theoretical properties of the adaptive lasso.
7.
Advances in Data Analysis and Classification - In real-world application scenarios, the identification of groups poses a significant challenge due to possibly occurring outliers and existing noise...
8.
In the mean regression context, this study considers several frequently encountered heteroscedastic error models where the regression mean and variance functions are specified up to certain parameters. An important point we note through a series of analyses is that different assumptions on standardized regression errors yield quite different efficiency bounds for the corresponding estimators. Consequently, all aspects of the assumptions need to be specifically taken into account in constructing their corresponding efficient estimators. This study clarifies the relation between the regression error assumptions and their respective efficiency bounds under the general regression framework with heteroscedastic errors. Our simulation results support our findings; we carry out a real data analysis using the proposed methods where the Cobb–Douglas cost model is the regression mean.
9.
In this paper we investigate penalized least squares methods in linear regression models with heteroscedastic error structure. It is demonstrated that the basic properties with respect to model selection and parameter estimation of bridge estimators, Lasso and adaptive Lasso do not change if the assumption of homoscedasticity is violated. However, these estimators do not have oracle properties in the sense of Fan and Li (2001) if the oracle is based on weighted least squares. In order to address this problem we introduce weighted penalized least squares methods and demonstrate their advantages by asymptotic theory and by means of a simulation study.
10.
Consistent procedures are constructed for testing independence between the regressor and the error in non-parametric regression models. The tests are based on the Fourier formulation of independence, and utilize the joint and the marginal empirical characteristic functions of the regressor and of estimated residuals. The asymptotic null distribution as well as the behavior of the test statistic under alternatives is investigated. A simulation study compares bootstrap versions of the proposed tests to corresponding procedures utilizing the empirical distribution function.
11.
In this paper, we propose a new criterion, named PICa, to simultaneously select explanatory variables in the mean model and variance model in heteroscedastic linear models based on the model structure. We show that the new criterion can select the true mean model and a correct variance model with probability tending to 1 under mild conditions. Simulation studies and a real example are presented to evaluate the new criterion, and it turns out that the proposed approach performs well.
12.
We make empirical-likelihood-based inference for the parameters in heteroscedastic partially linear models. Unlike the existing empirical likelihood procedures for heteroscedastic partially linear models, the proposed empirical likelihood is constructed using components of a semiparametric efficient score. We show that it retains the double robustness feature of the semiparametric efficient estimator for the parameters and shares the desirable properties of the empirical likelihood for linear models. Compared with the normal approximation method and the existing empirical likelihood methods, the empirical likelihood method based on the semiparametric efficient score is more attractive not only theoretically but also empirically. Simulation studies demonstrate that the proposed empirical likelihood provides smaller confidence regions than those based on semiparametric inefficient estimating equations subject to the same coverage probabilities. Hence, the proposed empirical likelihood is preferred to the normal approximation method as well as the empirical likelihood method based on semiparametric inefficient estimating equations, and it should be useful in practice.
13.
The paper concentrates on consistent estimation and testing in functional polynomial measurement errors models with known heterogeneous variances. We rest on the corrected score methodology which allows the derivation of consistent and asymptotically normal estimators for line parameters and also consistent estimators for the asymptotic covariance matrix. Hence, Wald and score type statistics can be proposed for testing the hypothesis of a reduced linear relationship, for example, with asymptotic chi-square distribution which guarantees correct asymptotic significance levels. Results of small scale simulation studies are reported to illustrate the agreement between theoretical and empirical distributions of the test statistics studied. An application to a real data set is also presented.
14.
A simple test is proposed for examining the correctness of a given completely specified response function against unspecified general alternatives in the context of univariate regression. The usual diagnostic tools based on residual plots are useful but heuristic. We introduce a formal statistical test supplementing the graphical analysis. Technically, the test statistic is the maximum length of the sequences of ordered (with respect to the covariate) observations that are consecutively overestimated or underestimated by the candidate regression function. Note that the testing procedure can cope with heteroscedastic errors and no replicates. Recursive formulae allowing one to calculate the exact distribution of the test statistic under the null hypothesis and under a class of alternative hypotheses are given.
16.
The authors study a heteroscedastic partially linear regression model and develop an inferential procedure for it. This includes a test of heteroscedasticity, a two-step estimator of the heteroscedastic variance function, semiparametric generalized least-squares estimators of the parametric and nonparametric components of the model, and a bootstrap goodness of fit test to see whether the nonparametric component can be parametrized.
17.
The main challenge in working with gene expression microarrays is that the sample size is small compared to the large number of variables (genes). In many studies, the main focus is on finding a small subset of the genes that are most important for differentiating between types of cancer, for simpler and cheaper diagnostic arrays. In this paper, a sparse Bayesian variable selection method in a probit model is proposed for gene selection and classification. We assign a sparse prior to the regression parameters and perform variable selection by indexing the covariates of the model with a binary vector. The correlation prior assigned to the binary vector in this paper is able to distinguish between models of the same size. The performance of the proposed method is demonstrated with one simulated data set and two well-known real data sets, and the results show that our method is comparable with other existing methods in variable selection and classification.
18.
We study the properties of the Lasso in the high-dimensional partially linear model where the number of variables in the linear part can be greater than the sample size. We use truncated series expansion based on polynomial splines to approximate the nonparametric component in this model. Under a sparsity assumption on the regression coefficients of the linear component and some regularity conditions, we derive the oracle inequalities for the prediction risk and the estimation error. We also provide sufficient conditions under which the Lasso estimator is selection consistent for the variables in the linear part of the model. In addition, we derive the rate of convergence of the estimator of the nonparametric function. We conduct simulation studies to evaluate the finite sample performance of variable selection and nonparametric function estimation.
19.
Advances in Data Analysis and Classification - The common issues of high-dimensional gene expression data are that many of the genes may not be relevant, and there exists a high correlation among...
20.
This paper addresses the problem of estimating the density of a future outcome from a multivariate normal model. We propose a class of empirical Bayes predictive densities and evaluate their performances under the Kullback–Leibler (KL) divergence. We show that these empirical Bayes predictive densities dominate the Bayesian predictive density under the uniform prior and thus are minimax under some general conditions. We also establish the asymptotic optimality of these empirical Bayes predictive densities in infinite-dimensional parameter spaces through an oracle inequality.