Similar literature
Found 20 similar articles (search took 46 ms)
1.
We propose and implement a density estimation procedure that begins by turning density estimation into a nonparametric regression problem. This regression problem is created by binning the original observations into many small bins and then applying a suitable form of root transformation to the binned counts. In principle, many common nonparametric regression estimators could then be applied to the transformed data; in this paper we propose a wavelet block thresholding estimator. Finally, the estimated regression function is un-rooted by squaring and normalizing. The density estimation procedure simultaneously achieves three objectives: computational efficiency, adaptivity, and spatial adaptivity. A numerical example and a practical data example illustrate and explain the use of the procedure. Theoretically, the estimator is shown to attain the optimal rate of convergence simultaneously over a wide range of Besov classes. The estimator also adapts automatically to the local smoothness of the underlying function and attains the local adaptive minimax rate for estimating functions at a point. The technical argument has three key steps: Poissonization, quantile coupling, and an oracle risk bound for block thresholding in the non-Gaussian setting. Some of the technical results may be of independent interest.
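The bin, root, smooth, square, normalize pipeline described in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `root_unroot_density`, the `1/4` offset inside the square root, and the moving-average smoother (swapped in for wavelet block thresholding to keep the sketch short) are all choices made here.

```python
import math
import random


def root_unroot_density(data, lo, hi, n_bins=64, half_window=3):
    """Sketch of a root-unroot density estimate on [lo, hi].

    Returns (bin_centers, density_values). The smoother is a plain moving
    average, a stand-in for the wavelet block thresholding step.
    """
    width = (hi - lo) / n_bins

    # Step 1: bin the observations into equal-width bins.
    counts = [0] * n_bins
    for x in data:
        if lo <= x <= hi:
            idx = min(int((x - lo) / width), n_bins - 1)
            counts[idx] += 1

    # Step 2: root transform (the 1/4 offset is an assumption of this sketch).
    roots = [math.sqrt(c + 0.25) for c in counts]

    # Step 3: smooth the transformed counts; windows are truncated at the edges.
    smoothed = []
    for i in range(n_bins):
        a = max(0, i - half_window)
        b = min(n_bins, i + half_window + 1)
        smoothed.append(sum(roots[a:b]) / (b - a))

    # Step 4: un-root by squaring, then normalize so the estimate integrates to one.
    squared = [s * s for s in smoothed]
    total = sum(squared) * width
    density = [s / total for s in squared]
    centers = [lo + (i + 0.5) * width for i in range(n_bins)]
    return centers, density


# Usage: 2000 draws from a standard normal, estimated on [-4, 4].
random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(2000)]
centers, density = root_unroot_density(sample, -4.0, 4.0)
```

Because the root transform roughly stabilizes the variance of the bin counts, any off-the-shelf regression smoother can be dropped into step 3; the choice of smoother is what determines the adaptivity properties claimed in the abstract.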

2.
The problem of nonparametric estimation of the joint probability density of a vector of continuous and ordinal/nominal categorical random variables with bounded support is considered. Numerous publications are devoted to the cases of either continuous or categorical variables, and the curse of dimensionality and strong regularity assumptions are the two familiar issues in that literature. Mixed variables occur in practically all applications of statistical science; nonetheless, there is almost no literature on joint density estimation for them. This paper develops a theory of density estimation for mixed variables that is on par with the results known for simpler settings. Specifically, a data-driven estimator is developed that adapts to the unknown anisotropic smoothness of the joint density and, whenever the density depends on a smaller number of variables, performs a dimension reduction that yields the corresponding optimal rate of convergence of the mean integrated squared error (MISE). The results hold without the minimal regularity assumptions that are traditional in the density estimation literature, such as differentiability or continuity of the density. The estimation procedure is based on mimicking an oracle estimator that knows the underlying density, and the main theoretical result is an oracle inequality relating the MISEs of the estimator and the oracle estimator. The proof is based on a new exponential inequality for Sobolev statistics, which is of interest in its own right.

3.
A modal regression method based on a nonparametric quantile estimator is proposed. Unlike traditional mean and median regression, modal regression uses the mode rather than the mean or median to represent the center of a conditional distribution, which makes the model more robust to outliers and to asymmetric or heavy-tailed distributions. Most solutions for modal regression are based on kernel density estimation. This paper studies a new solution based on a nonparametric quantile estimator. The method builds on the fact that the distribution function is the inverse of the quantile function, and it uses the flexibility of the nonparametric quantile estimator to improve the estimation of the modal function. Simulations and an application show that the new model outperforms the modal regression model based on linear quantile function estimation.
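The link between quantiles and the mode can be made concrete. Since the distribution function F is the inverse of the quantile function Q, the density satisfies f(Q(p)) = 1/Q'(p), so the mode is Q(p*) where the slope Q'(p) is smallest. The sketch below applies this identity to an unconditional sample with an empirical quantile function and a finite-difference slope; it illustrates the identity only and is not the estimator of the paper. The function names, the step `h`, and the grid size are assumptions.

```python
import random


def empirical_quantile(sorted_xs, p):
    """Linearly interpolated empirical quantile; p is clamped to [0, 1]."""
    p = min(max(p, 0.0), 1.0)
    pos = p * (len(sorted_xs) - 1)
    i = int(pos)
    j = min(i + 1, len(sorted_xs) - 1)
    frac = pos - i
    return sorted_xs[i] * (1.0 - frac) + sorted_xs[j] * frac


def mode_from_quantiles(xs, h=0.05, grid_size=181):
    """Estimate the mode as Q(p*) where the quantile slope Q'(p) is smallest.

    The slope is approximated by a central difference with step h, which
    averages 1/f over a probability window of width 2h.
    """
    sorted_xs = sorted(xs)
    best_p, best_slope = h, float("inf")
    for k in range(grid_size):
        p = h + (1.0 - 2.0 * h) * k / (grid_size - 1)
        upper = empirical_quantile(sorted_xs, p + h)
        lower = empirical_quantile(sorted_xs, p - h)
        slope = (upper - lower) / (2.0 * h)
        if slope < best_slope:
            best_p, best_slope = p, slope
    return empirical_quantile(sorted_xs, best_p)


# Usage: 5000 draws from a normal distribution whose true mode is 2.
random.seed(1)
sample = [random.gauss(2.0, 1.0) for _ in range(5000)]
mode_hat = mode_from_quantiles(sample)
```

A conditional version would replace the empirical quantile with an estimated conditional quantile function at each covariate value; that step is left out here.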

4.
As a useful tool in functional data analysis, the functional linear regression model has become increasingly common and has been studied extensively in recent years. In this paper, we consider a sparse functional linear regression model that is generated by a finite number of basis functions in an expansion of the coefficient function. We do not specify how many or which basis functions enter the model, so it differs from a typical parametric model in which the predictor variables are pre-specified. We study a general framework that yields various procedures for identifying the basis functions that enter the model and estimating the resulting regression coefficients in one step. We adopt the idea of variable selection in the linear regression setting, where a weighted L1 penalty is added to the traditional least squares criterion. We show that the procedures in our general framework are consistent in the sense of selecting the model correctly, and that they enjoy the oracle property, meaning that the resulting estimators of the coefficient function have asymptotically the same properties as the oracle estimator, which uses knowledge of the underlying model. We investigate and compare several methods within our general framework via a simulation study, and we apply the methods to the Canadian weather data.
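The weighted L1-penalized least squares criterion mentioned in this abstract can be written out as follows. The notation is assumed for illustration and does not come from the abstract: $y_i$ is the scalar response, $z_{ij}$ is the score of the $i$-th functional predictor on the $j$-th candidate basis function, $b_j$ is the corresponding coefficient, $w_j \ge 0$ are the penalty weights, $\lambda \ge 0$ is a tuning parameter, and $J$ is the number of candidate basis functions.

```latex
\hat{b} \;=\; \arg\min_{b \in \mathbb{R}^{J}}
  \sum_{i=1}^{n} \Bigl( y_i - \sum_{j=1}^{J} b_j z_{ij} \Bigr)^{2}
  \;+\; \lambda \sum_{j=1}^{J} w_j \lvert b_j \rvert
```

Coefficients that the penalty shrinks exactly to zero drop the corresponding basis functions from the model, which is how selection and estimation can happen in a single step.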

5.
Knowledge of the probability distribution of the error in a regression problem plays an important role in verifying an assumed regression model, making inferences about predictions, finding optimal regression estimates, and suggesting confidence bands and goodness-of-fit tests, as well as in many other issues of regression analysis. This article is devoted to optimal estimation of the error probability density in a general heteroscedastic regression model with possibly dependent predictors and regression errors. Neither the design density nor the regression function nor the scale function is assumed to be known, but they are supposed to be differentiable, and the estimated error density is supposed to have finite support and to be at least twice differentiable. Under these assumptions the article proves, for the first time in the literature, that it is possible to estimate the regression error density with the accuracy of an oracle that knows the “true” underlying regression errors. Real and simulated examples illustrate the importance of error density estimation as well as the suggested oracle methodology and method of estimation.

6.
We study the properties of the Lasso in the high-dimensional partially linear model where the number of variables in the linear part can be greater than the sample size. We use truncated series expansion based on polynomial splines to approximate the nonparametric component in this model. Under a sparsity assumption on the regression coefficients of the linear component and some regularity conditions, we derive the oracle inequalities for the prediction risk and the estimation error. We also provide sufficient conditions under which the Lasso estimator is selection consistent for the variables in the linear part of the model. In addition, we derive the rate of convergence of the estimator of the nonparametric function. We conduct simulation studies to evaluate the finite sample performance of variable selection and nonparametric function estimation.

7.
We consider nonparametric estimation of conditional medians for time series data. The time series data are generated from two mutually independent linear processes. The linear processes may show long-range dependence. The estimator of the conditional medians is based on minimizing the locally weighted sum of absolute deviations for local linear regression. We present the asymptotic distribution of the estimator. The rate of convergence is independent of regressors in our setting. The result of a simulation study is also given.

8.
This paper is concerned with the conditional bias and variance of local quadratic regression with multivariate predictor variables. Data sharpening methods for nonparametric regression were first proposed by Choi, Hall, and Rousson. Recently, a data sharpening estimator for local linear regression was discussed by Naito and Yoshizaki. In this paper, mainly to improve the fitting precision, we extend their results on the asymptotic bias and variance. Using the data sharpening estimator of multivariate local quadratic regression, we obtain higher fitting precision. In particular, our approach is simple to implement because it has an explicit form, and it is convenient for analyzing the asymptotic conditional bias and variance of the estimator at interior and boundary points of the support of the density function.

9.
This paper is a survey of recent results on adaptive robust nonparametric methods for the continuous-time regression model with semimartingale noise with jumps. The noise is modeled by Lévy processes, Ornstein–Uhlenbeck processes, and semi-Markov processes. We present the general model selection method and the sharp oracle inequality methods, which provide robust efficient estimation in the adaptive setting. Moreover, we present recent results on improved model selection methods for nonparametric estimation problems.

10.
We consider the problem of estimation in semiparametric varying coefficient models where the covariate modifying the varying coefficients is functional and is modeled nonparametrically. We develop a kernel-based estimator of the nonparametric component and a profiling estimator of the parametric component of the model and derive their asymptotic properties. Specifically, we show the consistency of the nonparametric functional estimates and derive the asymptotic expansion of the estimates of the parametric component. We illustrate the performance of our methodology using a simulation study and a real data application.

11.
This paper proposes a technique, termed censored average derivative estimation (CADE), for estimating the unknown regression function in nonparametric censored regression models with randomly censored samples. The CADE procedure involves three stages: first, transform the censored data into synthetic data or pseudo-responses using the inverse probability censoring weighted (IPCW) technique; second, estimate the average derivatives of the regression function; and finally, approximate the unknown regression function by an estimator of univariate regression using techniques for one-dimensional nonparametric censored regression. CADE provides an easily implemented methodology for modelling the association between the response and a set of predictor variables when data are randomly censored. It also provides a technique for “dimension reduction” in nonparametric censored regression models. The average derivative estimator is shown to be root-n consistent and asymptotically normal. The estimator of the unknown regression function is a local linear kernel regression estimator and is shown to converge at the optimal one-dimensional nonparametric rate. Monte Carlo experiments show that the proposed estimators work quite well.

12.

This paper considers estimation and inference in semiparametric quantile regression models when the response variable is subject to random censoring. The paper considers both the cases of independent and dependent censoring and proposes three iterative estimators based on inverse probability weighting, where the weights are estimated from the censoring distribution using the Kaplan–Meier, a fully parametric and the conditional Kaplan–Meier estimators. The paper proposes a computationally simple resampling technique that can be used to approximate the finite sample distribution of the parametric estimator. The paper also considers inference for both the parametric and nonparametric components of the quantile regression model. Monte Carlo simulations show that the proposed estimators and test statistics have good finite sample properties. Finally, the paper contains a real data application, which illustrates the usefulness of the proposed methods.

13.
In the context of the semi-functional partial linear regression model, we study the problem of error density estimation. The unknown error density is approximated by a mixture of Gaussian densities whose means are the individual residuals and whose variance is a constant parameter. This mixture error density has the form of a kernel density estimator of the residuals, where the regression function, consisting of parametric and nonparametric components, is estimated by the ordinary least squares and functional Nadaraya–Watson estimators. The estimation accuracy of the ordinary least squares and functional Nadaraya–Watson estimators jointly depends on the same bandwidth parameter. A Bayesian approach is proposed to simultaneously estimate the bandwidths in the kernel-form error density and in the regression function. Under the kernel-form error density, we derive a kernel likelihood and posterior for the bandwidth parameters. A series of simulation studies shows that, for estimating the regression function and error density, the Bayesian approach yields better accuracy than the benchmark functional cross validation. Using a spectroscopy data set as an illustration, we find that the Bayesian approach gives better point forecast accuracy of the regression function than functional cross validation, and that it is capable of producing prediction intervals nonparametrically.
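The kernel-form error density described in this abstract, an equal-weight mixture of Gaussians centred at the residuals with a common standard deviation, is simple to evaluate. The sketch below shows only that density; the Bayesian sampling of the bandwidths is not reproduced. The function name and the argument `bandwidth` (the common standard deviation) are assumptions made for illustration.

```python
import math


def kernel_form_error_density(e, residuals, bandwidth):
    """Evaluate an equal-weight Gaussian mixture centred at the residuals.

    Each component has mean r_i (a residual) and standard deviation
    `bandwidth`, so the result is a Gaussian kernel density estimate.
    """
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    norm = 1.0 / (bandwidth * math.sqrt(2.0 * math.pi))
    total = 0.0
    for r in residuals:
        z = (e - r) / bandwidth
        total += norm * math.exp(-0.5 * z * z)
    return total / len(residuals)


# Usage: density of the error at 0.1 given three hypothetical residuals.
value = kernel_form_error_density(0.1, [-1.0, 0.5, 2.0], 0.7)
```

Treating the bandwidth as a parameter of this density is what allows a likelihood, and hence a posterior, to be written down for it.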

14.
Many problems in genomics are related to variable selection where high-dimensional genomic data are treated as covariates. Such genomic covariates often have certain structures and can be represented as vertices of an undirected graph. Biological processes also vary as functions of some biological state, such as time. High-dimensional variable selection where the covariates are graph-structured and the underlying model is nonparametric presents an important but largely unaddressed statistical challenge. Motivated by the problem of regression-based motif discovery, we consider the problem of variable selection for high-dimensional nonparametric varying-coefficient models and introduce a sparse structured shrinkage (SSS) estimator based on basis function expansions and a novel smoothed penalty function. We present an efficient algorithm for computing the SSS estimator. Results on model selection consistency and estimation bounds are derived. Moreover, finite-sample performance is studied via simulations, and the effects of high dimensionality and of structural information in the covariates are especially highlighted. We apply our method to the motif-finding problem using a yeast cell-cycle gene expression dataset and word counts in genes' promoter sequences. Our results demonstrate that the proposed method can result in better variable selection and prediction for high-dimensional regression when the underlying model is nonparametric and the covariates are structured. Supplemental materials for the article are available online.

15.
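Testing for homoscedasticity in regression models   Cited by: 2 in total (self-citations: 0, citations by others: 2)
Using local empirical likelihood and the WNW method, this paper estimates the conditional distribution function and the conditional quantiles, and uses the conditional quantile approach to test the hypothesis of homoscedasticity of the error variance in a regression model. The asymptotic distribution of the test statistic under the null hypothesis is shown to be a χ² distribution. Simulations indicate that the conditional quantile test of homoscedasticity has good power.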

17.
18.
In this paper, we propose a combined regression estimator by using a parametric estimator and a nonparametric estimator of the regression function. The asymptotic distribution of this estimator is obtained for cases where the parametric regression model is correct, incorrect, and approximately correct. These distributional results imply that the combined estimator is superior to the kernel estimator in the sense that it can never do worse than the kernel estimator in terms of convergence rate and it has the same convergence rate as the parametric estimator in the case where the parametric model is correct. Unlike the parametric estimator, the combined estimator is robust to model misspecification. In addition, we also establish the asymptotic distribution of the estimator of the weight given to the parametric estimator in constructing the combined estimator. This can be used to construct consistent tests for the parametric regression model used to form the combined estimator.

19.
We apply nonparametric regression to current status data, which often arise in survival analysis and reliability analysis. Although no parametric assumption is imposed on the distributions, most authors have employed parametric models such as linear models to measure covariate effects on failure times in regression analysis with current status data. We construct a nonparametric estimator of the regression function by modifying the maximum rank correlation (MRC) estimator. Our estimator can handle cases in which other estimators do not work. We present the asymptotic bias and the asymptotic distribution of the estimator by adapting a result on equicontinuity of degenerate U-processes to the setup of this paper.

20.
Nonparametric regression estimators based on locally weighted least squares fitting have been studied by Fan and by Ruppert and Wand. The latter paper also studies, in the univariate case, nonparametric derivative estimators given by locally weighted polynomial fitting. Compared with traditional kernel estimators, these estimators are often of simpler form and possess some better properties. In this paper, we build on current work on locally weighted regression and generalize locally weighted polynomial fitting to the estimation of partial derivatives in a multivariate regression context. Specifically, for both the regression and partial derivative estimators we prove joint asymptotic normality and derive explicit asymptotic expansions for their conditional bias and conditional covariance matrix (given observations of the predictor variables) in each of the two important cases of local linear fit and local quadratic fit.
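A local linear fit by locally weighted least squares has a closed form in the univariate case, and its slope term doubles as a derivative estimate. The sketch below covers only that univariate case with a Gaussian kernel; the multivariate partial-derivative estimators of the abstract are not reproduced. The function name, the kernel choice, and the bandwidth argument `h` are assumptions made for illustration.

```python
import math


def local_linear_fit(x0, xs, ys, h):
    """Local linear fit at x0 by weighted least squares.

    Minimizes sum_i w_i * (y_i - a - b * (x_i - x0))**2 with Gaussian
    weights w_i = exp(-0.5 * ((x_i - x0) / h)**2).
    Returns (a, b): a estimates the regression function at x0 and
    b estimates its first derivative there.
    """
    s0 = s1 = s2 = t0 = t1 = 0.0
    for x, y in zip(xs, ys):
        d = x - x0
        w = math.exp(-0.5 * (d / h) ** 2)
        s0 += w
        s1 += w * d
        s2 += w * d * d
        t0 += w * y
        t1 += w * d * y
    det = s0 * s2 - s1 * s1
    if det == 0.0:
        raise ValueError("degenerate design: not enough distinct points near x0")
    a = (s2 * t0 - s1 * t1) / det
    b = (s0 * t1 - s1 * t0) / det
    return a, b


# Usage: noiseless data from the line y = 2x + 1, fitted at x0 = 0.3.
xs = [i / 20.0 for i in range(21)]
ys = [2.0 * x + 1.0 for x in xs]
value, slope = local_linear_fit(0.3, xs, ys, 0.2)
```

On exactly linear data the local linear fit recovers the line regardless of the bandwidth, which makes it a convenient sanity check before moving to noisy data or higher-order local fits.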
