Similar literature
 Found 20 similar articles (search time: 15 ms)
1.
Many problems in genomics are related to variable selection where high-dimensional genomic data are treated as covariates. Such genomic covariates often have certain structures and can be represented as vertices of an undirected graph. Biological processes also vary as functions depending upon some biological state, such as time. High-dimensional variable selection where covariates are graph-structured and the underlying model is nonparametric presents an important but largely unaddressed statistical challenge. Motivated by the problem of regression-based motif discovery, we consider the problem of variable selection for high-dimensional nonparametric varying-coefficient models and introduce a sparse structured shrinkage (SSS) estimator based on basis function expansions and a novel smoothed penalty function. We present an efficient algorithm for computing the SSS estimator. Results on model selection consistency and estimation bounds are derived. Moreover, finite-sample performances are studied via simulations, and the effects of high-dimensionality and structural information of the covariates are especially highlighted. We apply our method to the motif-finding problem using a yeast cell-cycle gene expression dataset and word counts in genes’ promoter sequences. Our results demonstrate that the proposed method can result in better variable selection and prediction for high-dimensional regression when the underlying model is nonparametric and covariates are structured. Supplemental materials for the article are available online.

2.
An updating algorithm for bivariate local linear regression is proposed. Thereby, we assume a rectangular design and a polynomial kernel constrained to rectangular support as weight function. Results of univariate regression estimators are extended to the bivariate setting. The updates are performed in a way that most of the well-known numerical instabilities of a naive update implementation can be avoided. Some simulation results illustrate the properties of several algorithms with respect to computing time and numerical stability.

3.
In this paper, we study the weighted composite quantile regression (WCQR) for the general linear model with missing covariates. We propose the WCQR estimation and bootstrap test procedures for unknown parameters. Simulation studies and a real data analysis are conducted to examine the finite-sample performance of our proposed methods.

4.
For multivariate regressors, integrating the Nadaraya–Watson regression smoother produces estimators of the lower-dimensional marginal components that are asymptotically normally distributed, at the optimal rate of convergence. Some heuristics, based on consistency of the pilot estimator, suggested that the estimator would not converge at the optimal rate of convergence in the presence of more than four covariates. This paper shows first that marginal integration with its internally normalized counterpart leads to rate-optimal estimators of the marginal components. We introduce the necessary modifications and give central limit theorems. Then, it is shown that the method also applies to more general models; in particular, we discuss feasible estimation of partial linear models. The proofs reveal that the pilot estimator should over-smooth the variables to be integrated, and that the resulting estimator is itself a lower-dimensional regression smoother. Hence, finite sample properties of the estimator are comparable to those of low-dimensional nonparametric regression. Further advantages when starting with the internally normalized pilot estimator are its computational attractiveness and better performance (compared to its classical counterpart) when the covariates are correlated and nonuniformly distributed. Simulation studies underline the excellent performance in comparison with previously known methods.
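As a rough illustration of the marginal integration idea in the abstract above, the following minimal sketch averages a Nadaraya–Watson pilot smoother over the observed values of the remaining covariates. It is written under stated assumptions and is not the authors' implementation: it uses the classical (not the internally normalized) pilot estimator, a product Gaussian kernel, and user-supplied bandwidths; the function names `gaussian_kernel`, `nadaraya_watson`, and `marginal_integration` are hypothetical.

```python
import math


def gaussian_kernel(u):
    """Standard Gaussian kernel (an assumed choice; the abstract names no kernel)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def nadaraya_watson(x, X, Y, h):
    """Classical Nadaraya-Watson estimate at the point x.

    x: sequence of d coordinates; X: list of d-dimensional observations;
    Y: list of responses; h: sequence of d bandwidths (product kernel).
    """
    num = 0.0
    den = 0.0
    for xi, yi in zip(X, Y):
        w = 1.0
        for xk, xik, hk in zip(x, xi, h):
            w *= gaussian_kernel((xk - xik) / hk)
        num += w * yi
        den += w
    return num / den if den > 0.0 else float("nan")


def marginal_integration(x1, X, Y, h):
    """Estimate the first marginal component (up to an additive constant).

    The pilot smoother is evaluated at (x1, X_i2, ..., X_id) for every
    observation i and the results are averaged, which integrates out the
    nuisance directions with respect to their empirical distribution.
    """
    total = 0.0
    for xi in X:
        point = (x1,) + tuple(xi[1:])
        total += nadaraya_watson(point, X, Y, h)
    return total / len(X)
```

For example, `marginal_integration(0.3, X, Y, (0.5, 0.5))` evaluates the first component at 0.3 for two-dimensional covariates. In line with the abstract's remark, the bandwidths for the integrated directions would typically be chosen larger than the one for the direction of interest.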

5.
Fixed Design Nonparametric Regression with Truncated and Censored Data   Total citations: 1, self-citations: 0, citations by others: 1
In this paper we consider a fixed design model in which the observations are subject to left truncation and right censoring. A generalized product-limit estimator for the conditional distribution at a given covariate value is proposed, and an almost sure asymptotic representation of this estimator is established. We also obtain the rate of uniform consistency, weak convergence and a modulus of continuity for this estimator. Applications include trimmed mean and quantile function estimators.

6.
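The regression error term is unobservable. Because the density function of the regression error has many practical applications, estimating it by nonparametric methods is a basic problem in regression analysis. For regression models with completely observed data, this problem has been studied by earlier authors. In practice, however, data are often censored; in that case, one can use the censored regression residuals together with kernel estimation to estimate the density function of the regression error. This paper studies the large-sample properties of this estimator and proves its uniform consistency.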

7.
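Weighted Kernel Estimation of the Nonparametric Regression Function with Censored Data   Total citations: 4, self-citations: 0, citations by others: 4
Yang Shanchao (杨善朝), Acta Mathematica Sinica (《数学学报》), 1999, 42(2): 255-262
We study the consistency of the weighted kernel estimator of the nonparametric regression function under censored data and give some weaker sufficient conditions for strong consistency. These conclusions substantially improve on existing results.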

8.
The solution of nonparametric regression problems is addressed via polynomial approximators and one-hidden-layer feedforward neural approximators. Such families of approximating functions are compared as to both complexity and experimental performances in finding a nonparametric mapping that interpolates a finite set of samples according to the empirical risk minimization approach. The theoretical background that is necessary to interpret the numerical results is presented. Two simulation case studies are analyzed to fully understand the practical issues that may arise in solving such problems. The issues depend on both the approximation capabilities of the approximating functions and the effectiveness of the methodologies that are available to select the tuning parameters, i.e., the coefficients of the polynomials and the weights of the neural networks. The simulation results show that the neural approximators perform better than the polynomial ones with the same number of parameters. However, this superiority can be jeopardized by the presence of local minima, which affect the neural networks but not the polynomial approach.

9.
Consider the polynomial regression model Y = β0 + β1 X + ... + βp X^p + σ(X)ε, where σ²(X) = Var(Y|X) is unknown, and ε is independent of X and has zero mean. Suppose that Y is subject to random right censoring. A new estimation procedure for the parameters β0,...,βp is proposed, which extends the classical least squares procedure to censored data. The proposed method is inspired by the method of Buckley and James (1979, Biometrika, 66, 429–436), but is, unlike the latter method, a noniterative procedure due to nonparametric preliminary estimation of the conditional regression function. The asymptotic normality of the estimators is established. Simulations are carried out for both methods and they show that the proposed estimators usually have smaller variance and smaller mean squared error than the Buckley–James estimators. The two estimation procedures are also applied to a medical and an astronomical data set.

10.
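We consider nonparametric estimation for multidimensional diffusion processes. Using properties of Itô diffusions, samples of the drift vector and the diffusion matrix are expressed as a regression model with measurement error. An L^r upper bound on the systematic error and the convergence rate of the random error term are discussed, and a general model for nonparametric estimation of the drift vector and diffusion matrix is established.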

11.
We introduce an algorithm which, in the context of nonlinear regression on vector-valued explanatory variables, aims to choose those combinations of vector components that provide best prediction. The algorithm is constructed specifically so that it devotes attention to components that might be of relatively little predictive value by themselves, and so might be ignored by more conventional methodology for model choice, but which, in combination with other difficult-to-find components, can be particularly beneficial for prediction. The design of the algorithm is also motivated by a desire to choose vector components that become redundant once appropriate combinations of other, more relevant components are selected. Our theoretical arguments show these goals are met in the sense that, with probability converging to 1 as sample size increases, the algorithm correctly determines a small, fixed number of variables on which the regression mean, g say, depends, even if dimension diverges to infinity much faster than n. Moreover, the estimated regression mean based on those variables approximates g with an error that, to first order, equals the error which would arise if we were told in advance the correct variables. In this sense, the estimator achieves oracle performance. Our numerical work indicates that the algorithm is suitable for very high dimensional problems, where it keeps computational labor in check by using a novel sequential argument, and also for more conventional prediction problems, where dimension is relatively low.

12.
We consider the problem of testing for a constant nonparametric effect in a general semiparametric regression model when there is a potential for interaction between the parametrically and nonparametrically modeled variables. The work was originally motivated by a unique testing problem in genetic epidemiology (Chatterjee et al., 2006) that involved a typical generalized linear model but with an additional term reminiscent of the Tukey 1-degree-of-freedom formulation, and their interest was in testing for main effects of the genetic variables, while gaining statistical power by allowing for a possible interaction between genes and the environment. Later work (Maity et al., 2009) involved the possibility of modeling the environmental variable nonparametrically, but they focused on whether there was a parametric main effect for the genetic variables. In this paper, we consider the complementary problem, where the interest is in testing for the main effect of the nonparametrically modeled environmental variable. We derive a generalized likelihood ratio test for this hypothesis, show how to implement it, and provide evidence that our method can improve statistical power when compared to standard partially linear models with main effects only. We use the method for the primary purpose of analyzing data from a case-control study of colorectal adenoma.

13.
In this paper, we define a new kernel estimator of the regression function under a left truncation model. We establish the pointwise and uniform strong consistency over a compact set and give a rate of convergence of the estimate. The pointwise asymptotic normality of the estimate is also given. Some simulations are given to show the asymptotic behavior of the estimate in different cases. The distribution function and the covariable’s density are also estimated.

14.
该文将Hrdle和Tsybakov的结果推广到数据来自α-混合的严平稳序列的情形,得到了估计的相合性和渐近正太性.在小样本的情形下给出了随机模拟结果,以检查所提出估计的表现.  相似文献   

15.
In this paper we consider the problem of estimating an unknown joint distribution which is defined over mixed discrete and continuous variables. A nonparametric kernel approach is proposed with smoothing parameters obtained from the cross-validated minimization of the estimator's integrated squared error. We derive the rate of convergence of the cross-validated smoothing parameters to their ‘benchmark’ optimal values, and we also establish the asymptotic normality of the resulting nonparametric kernel density estimator. Monte Carlo simulations illustrate that the proposed estimator performs substantially better than the conventional nonparametric frequency estimator in a range of settings. The simulations also demonstrate that the proposed approach does not suffer from known limitations of the likelihood cross-validation method which breaks down with commonly used kernels when the continuous variables are drawn from fat-tailed distributions. An empirical application demonstrates that the proposed method can yield superior predictions relative to commonly used parametric models.

16.
We consider a nonparametric estimation problem for the Lévy measure of a time-inhomogeneous process with independent increments. We derive the functional asymptotic normality and efficiency, in an -space, of generalized Nelson–Aalen estimators. We also propose some asymptotically distribution-free tests for time-homogeneity of the Lévy measure. Our results follow from empirical process theory and martingale theory.

17.
We consider a two-factor experiment in which the factors have the same levels with a natural ordering among the levels. Likelihood ratio tests for testing equality of the main effects with a one-sided alternative and for testing the one-sided hypothesis as a null hypothesis are studied. Closed form expressions for the maximum likelihood estimates under the various hypotheses are obtained. The null hypothesis distributions for these test statistics are derived. The efforts of the first author were supported by the NSERC of Canada. The efforts of the second author were supported by the Office of Naval Research under Contract ONR N00014-80-C-0321. The efforts of the third author were supported by the Office of Naval Research under Contract ONR N00014-80-C-0322.

18.
This paper considers the problem of testing a sub-hypothesis in homoscedastic linear regression models when the covariate and error processes form independent long memory moving averages. The asymptotic null distribution of the likelihood ratio type test based on Whittle quadratic forms is shown to be a chi-square distribution. Additionally, the estimators of the slope parameters obtained by minimizing the Whittle dispersion are seen to be n^{1/2}-consistent for all values of the long memory parameters of the design and error processes. Research of the first author was partly supported by the NSF DMS Grant 0701430. Research of the second author was partly supported by the bilateral France-Lithuania scientific project Gilibert and the Lithuanian State Science and Studies Foundation grant T-15/07.

19.
In this paper, the authors consider various procedures for testing for the independence of two multivariate regression equations with different design matrices. Asymptotic null distributions as well as nonnull distributions under local alternatives of the test statistics associated with the above procedures are also derived.

20.
We study Beran's extension of the Kaplan-Meier estimator for the situation of right censored observations at fixed covariate values. This estimator for the conditional distribution function at a given value of the covariate involves smoothing with Gasser-Müller weights. We establish an almost sure asymptotic representation which provides a key tool for obtaining central limit results. To avoid complicated estimation of asymptotic bias and variance parameters, we propose a resampling method which takes the covariate information into account. An asymptotic representation for the bootstrapped estimator is proved and the strong consistency of the bootstrap approximation to the conditional distribution function is obtained.
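To make the estimator in the abstract above more concrete, here is a minimal sketch of a Beran-type conditional product-limit estimator. It rests on several assumptions that are not taken from the paper: normalized Gaussian (Nadaraya–Watson-type) weights stand in for the Gasser-Müller weights, the observed times are assumed to have no ties, and the names `kernel_weights` and `beran_survival` are hypothetical. With equal weights the estimate reduces to the ordinary Kaplan-Meier estimate.

```python
import math


def kernel_weights(x, X, h):
    """Normalized Gaussian kernel weights at covariate value x.

    Assumption: Nadaraya-Watson-type weights are used here in place of
    the Gasser-Mueller weights mentioned in the abstract.
    """
    raw = [math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in X]
    total = sum(raw)
    return [r / total for r in raw]


def beran_survival(t, x, X, T, delta, h):
    """Beran-type estimate of S(t | x) = P(lifetime > t | covariate = x).

    X: covariate values; T: observed (possibly censored) times;
    delta: 1 if the time is an observed event, 0 if right censored;
    h: bandwidth. Assumes no ties among the observed times.
    """
    w = kernel_weights(x, X, h)
    n = len(T)
    surv = 1.0
    for i in range(n):
        if delta[i] == 1 and T[i] <= t:
            # Weighted size of the risk set at time T[i].
            at_risk = sum(w[j] for j in range(n) if T[j] >= T[i])
            if at_risk > 0.0:
                surv *= 1.0 - w[i] / at_risk
    return surv
```

The conditional distribution function discussed in the abstract is then estimated by `1.0 - beran_survival(t, x, X, T, delta, h)`.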
