1.

Variable selection for semiparametric varying coefficient partially linear errors-in-variables models

Peixin Zhao Liugen Xue 《Journal of multivariate analysis》2010,101(8):1872-1883

This paper focuses on variable selection for semiparametric varying coefficient partially linear models when the covariates in the parametric and nonparametric components are all measured with errors. A bias-corrected variable selection procedure is proposed by combining basis function approximations with shrinkage estimation. With appropriate selection of the tuning parameters, the consistency of the variable selection procedure and the oracle property of the regularized estimators are established. A simulation study and a real data application are undertaken to evaluate the finite sample performance of the proposed method.

2.

Combining conditional and unconditional moment restrictions with missing responses

Xiaohui Yuan Tianqing Liu Nan Lin Baoxue Zhang 《Journal of multivariate analysis》2010,101(10):2420-2433

Many statistical models, e.g. regression models, can be viewed as conditional moment restrictions when no distributional assumptions are made on the error term. For such models, several estimators that achieve the semiparametric efficiency bound have been proposed. However, in many studies, auxiliary information is available as unconditional moment restrictions. Meanwhile, we also consider the presence of missing responses. We propose the combined empirical likelihood (CEL) estimator to incorporate such auxiliary information to improve the estimation efficiency of the conditional moment restriction models. We show that, when responses are assumed to be strongly ignorable missing at random, the CEL estimator achieves better efficiency than the previous estimators due to utilization of the auxiliary information. Based on the asymptotic property of the CEL estimator, we also develop Wilks’ type tests and corresponding confidence regions for the model parameter and the mean response. Since kernel smoothing is used, the CEL method may have difficulty with problems involving high dimensional covariates. In such situations, we propose an instrumental variable-based empirical likelihood (IVEL) method to handle this problem. The merits of the CEL and IVEL methods are further illustrated through simulation studies.

3.

Cross-validated bagged learning

Maya L. Petersen Sandra E. Sinisi 《Journal of multivariate analysis》2007,98(9):1693-1704

Many applications aim to learn a high dimensional parameter of a data generating distribution based on a sample of independent and identically distributed observations. For example, the goal might be to estimate the conditional mean of an outcome given a list of input variables. In this prediction context, bootstrap aggregating (bagging) has been introduced as a method to reduce the variance of a given estimator at little cost to bias. Bagging involves applying an estimator to multiple bootstrap samples and averaging the result across bootstrap samples. In order to address the curse of dimensionality, a common practice has been to apply bagging to estimators which themselves use cross-validation, thereby using cross-validation within a bootstrap sample to select fine-tuning parameters trading off bias and variance of the bootstrap sample-specific candidate estimators. In this article we point out that in order to achieve the correct bias variance trade-off for the parameter of interest, one should apply the cross-validation selector externally to candidate bagged estimators indexed by these fine-tuning parameters. We use three simulations to compare the new cross-validated bagging method with bagging of cross-validated estimators and bagging of non-cross-validated estimators.
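The bagging step described in this abstract (apply an estimator to many bootstrap samples, then average) can be sketched in a few lines. This is only an illustration of plain bagging, not the cross-validated selector the article proposes; the function name `bagged_estimate`, its parameters, and the toy data are assumptions made for this sketch.

```python
import random
import statistics
from typing import Callable, Sequence


def bagged_estimate(
    data: Sequence[float],
    estimator: Callable[[Sequence[float]], float],
    n_bootstrap: int = 200,
    seed: int = 0,
) -> float:
    """Average an estimator over bootstrap resamples of `data` (plain bagging)."""
    if not data:
        raise ValueError("data must be non-empty")
    rng = random.Random(seed)  # local generator, so results are reproducible
    n = len(data)
    estimates = []
    for _ in range(n_bootstrap):
        # Bootstrap sample: n draws from the data, with replacement.
        sample = [data[rng.randrange(n)] for _ in range(n)]
        estimates.append(estimator(sample))
    # Bagged estimate: the average of the bootstrap-sample estimates.
    return statistics.fmean(estimates)


# Hypothetical usage: bag the sample median of a small toy data set.
toy_data = [1.2, 0.7, 3.4, 2.2, 1.9, 2.8, 0.4, 5.1]
bagged_median = bagged_estimate(toy_data, statistics.median)
```

Any estimator that maps a sample to a number can be passed in; a tuning-parameter-indexed family of estimators would simply be a family of such callables.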

4.

Parameter estimation under ambiguity and contamination with the spurious model

María Teresa Gallegos 《Journal of multivariate analysis》2006,97(5):1221-1250

Recently, we proposed variants as a statistical model for treating ambiguity. If data are extracted from an object with a machine then it might not be able to give a unique safe answer due to ambiguity about the correct interpretation of the object. On the other hand, the machine is often able to produce a finite number of alternative feature sets (of the same object) that contain the desired one. We call these feature sets variants of the object. Data sets that contain variants may be analyzed by means of statistical methods and all chapters of multivariate analysis can be seen in the light of variants. In this communication, we focus on point estimation in the presence of variants and outliers. Besides robust parameter estimation, this task requires also selecting the regular objects and their valid feature sets (regular variants). We determine the mixed MAP-ML estimator for a model with spurious variants and outliers as well as estimators based on the integrated likelihood. We also prove asymptotic results which show that the estimators are nearly consistent. The problem of variant selection turns out to be computationally hard; therefore, we also design algorithms for efficient approximation. We finally demonstrate their efficacy with a simulated data set and a real data set from genetics.

5.

Variable selection for varying-coefficient models with response missing at random

赵培信 薛留根 《数学研究及应用》2011,31(2):251-260

In this paper, we present a variable selection procedure by combining basis function approximations with penalized estimating equations for varying-coefficient models with missing response at random. With appropriate selection of the tuning parameters, we establish the consistency of the variable selection procedure and the optimal convergence rate of the regularized estimators. A simulation study is undertaken to assess the finite sample performance of the proposed variable selection procedure.

6.

Asymptotic expansions of the distributions of estimators in canonical correlation analysis under nonnormality

Haruhiko Ogasawara 《Journal of multivariate analysis》2007,98(9):1726-1750

Asymptotic expansions of the distributions of typical estimators in canonical correlation analysis under nonnormality are obtained. The expansions include the Edgeworth expansions up to order O(1/n) for the parameter estimators standardized by the population standard errors, and the corresponding expansion by Hall's method with variable transformation. The expansions for the Studentized estimators are also given using the Cornish-Fisher expansion and Hall's method. The parameter estimators are dealt with in the context of estimation for the covariance structure in canonical correlation analysis. The distributions of the associated statistics (the structure of the canonical variables, the scaled log likelihood ratio and Rozeboom's between-set correlation) are also expanded. The robustness of the normal-theory asymptotic variances of the sample canonical correlations and associated statistics is shown when a latent variable model holds. Simulations are performed to assess the accuracy of the asymptotic results in finite samples.

7.

A family of estimators for multivariate kurtosis in a nonnormal linear regression model

Hirokazu Yanagihara 《Journal of multivariate analysis》2007,98(1):1-29

In this paper, we propose a new estimator of kurtosis in a multivariate nonnormal linear regression model. Usually, an estimator is constructed from the arithmetic mean of the second power of the squared sample Mahalanobis distances between observations and their estimated values. This estimator underestimates the kurtosis and has a large bias, even if the sample size is not small. We replace this squared distance with a transformed squared norm of the Studentized residual using a monotonic increasing function. Our proposed estimator is defined by an arithmetic mean of the second power of these squared transformed squared norms with a correction term and a tuning parameter. The correction term adjusts our estimator to an unbiased estimator under normality, and the tuning parameter controls the sizes of the squared norms of the residuals. The family of our estimators includes estimators based on ordinary least squares and predicted residuals. We verify through numerical experiments that the bias of our new estimator is smaller than that of the usual estimator.

8.

Variable screening in predicting clinical outcome with high-dimensional microarrays

Jun Shao Shein-Chung Chow 《Journal of multivariate analysis》2007,98(8):1529-1538

Statistical modeling is an important area of biomarker research of important genes for new drug targets, drug candidate validation, disease diagnoses, personalized treatment, and prediction of clinical outcome of a treatment. A widely adopted technology is the use of microarray data that are typically very high dimensional. After screening chromosomes for relative genes using methods such as quantitative trait locus mapping, there may still be a few thousand genes related to the clinical outcome of interest. On the other hand, the sample size (the number of subjects) in a clinical study is typically much smaller. Under the assumption that only a few important genes are actually related to the clinical outcome, we propose a variable screening procedure to eliminate genes having negligible effects on the clinical outcome. Once the dimension of microarray data is reduced to a manageable number relative to the sample size, one can select a final set of genes via a well-known variable selection method such as cross-validation. We establish the asymptotic consistency of the proposed variable screening procedure. Some simulation results are also presented.

9.

Variable selection for linear errors-in-variables (EV) models with longitudinal data

田瑞琴 薛留根 《应用概率统计》2013,29(3):246-260

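This paper considers variable selection for linear EV models with longitudinal data. Based on the quadratic inference function method and the idea of shrinkage, a new bias-corrected variable selection method is proposed. With appropriate selection of the tuning parameters, we prove the consistency and asymptotic normality of the resulting estimators. Finally, simulation studies are used to examine the finite sample properties of the proposed variable selection method.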

10.

Variable selection for semiparametric varying-coefficient partially linear models with missing response at random

Pei Xin Zhao Liu Gen Xue 《数学学报(英文版)》2011,27(11):2205-2216

In this paper, we present a variable selection procedure by combining basis function approximations with penalized estimating equations for semiparametric varying-coefficient partially linear models with missing response at random. The proposed procedure simultaneously selects significant variables in parametric components and nonparametric components. With appropriate selection of the tuning parameters, we establish the consistency of the variable selection procedure and the convergence rate of the regularized estimators. A simulation study is undertaken to assess the finite sample performance of the proposed variable selection procedure.

11.

A Confidence Region Approach to Tuning for Variable Selection

Funda Gunes Howard D. Bondell 《Journal of computational and graphical statistics》2013,22(2):295-314

We develop an approach to tuning of penalized regression variable selection methods by calculating the sparsest estimator contained in a confidence region of a specified level. Because confidence intervals/regions are generally understood, tuning penalized regression methods in this way is intuitive and more easily understood by scientists and practitioners. More importantly, our work shows that tuning to a fixed confidence level often performs better than tuning via the common methods based on Akaike information criterion (AIC), Bayesian information criterion (BIC), or cross-validation (CV) over a wide range of sample sizes and levels of sparsity. Additionally, we prove that by tuning with a sequence of confidence levels converging to one, asymptotic selection consistency is obtained, and with a simple two-stage procedure, an oracle property is achieved. The confidence-region-based tuning parameter is easily calculated using output from existing penalized regression computer packages. Our work also shows how to map any penalty parameter to a corresponding confidence coefficient. This mapping facilitates comparisons of tuning parameter selection methods such as AIC, BIC, and CV, and reveals that the resulting tuning parameters correspond to confidence levels that are extremely low, and can vary greatly across datasets. Supplemental materials for the article are available online.

12.

Asymptotics of Bayesian median loss estimation

Chi Wai Yu Bertrand Clarke 《Journal of multivariate analysis》2010,101(9):1950-1958

We establish the consistency, asymptotic normality, and efficiency for estimators derived by minimizing the median of a loss function in a Bayesian context. We contrast this procedure with the behavior of two Frequentist procedures, the least median of squares (LMS) and the least trimmed squares (LTS) estimators, in regression problems. The LMS estimator is the Frequentist version of our estimator, and the LTS estimator approaches a median-based estimator as the trimming approaches 50% on each side. We argue that the Bayesian median-based method is a good tradeoff between the two Frequentist estimators.
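The two Frequentist criteria named in this abstract can be written down directly. The following is a minimal sketch for a straight-line fit, assuming the standard definitions (LMS: the median of the squared residuals; LTS: the sum of the h smallest squared residuals). It only evaluates the criteria at given coefficients and does not minimize them, and it does not implement the Bayesian median loss estimator of the paper. The function names and the toy data are hypothetical.

```python
import statistics
from typing import List, Sequence


def squared_residuals(
    xs: Sequence[float], ys: Sequence[float], intercept: float, slope: float
) -> List[float]:
    """Squared residuals of the line y = intercept + slope * x."""
    return [(y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)]


def lms_criterion(
    xs: Sequence[float], ys: Sequence[float], intercept: float, slope: float
) -> float:
    """Least median of squares objective: median of the squared residuals."""
    return statistics.median(squared_residuals(xs, ys, intercept, slope))


def lts_criterion(
    xs: Sequence[float], ys: Sequence[float], intercept: float, slope: float, h: int
) -> float:
    """Least trimmed squares objective: sum of the h smallest squared residuals."""
    r2 = sorted(squared_residuals(xs, ys, intercept, slope))
    if not 1 <= h <= len(r2):
        raise ValueError("h must be between 1 and the number of observations")
    return sum(r2[:h])


# Toy data (illustrative only): four points on y = 1 + 2x plus one gross outlier.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 100.0]
lms_at_true_line = lms_criterion(xs, ys, 1.0, 2.0)
lts_at_true_line = lts_criterion(xs, ys, 1.0, 2.0, h=4)
```

Both criteria ignore the single outlier at the true line, whereas the ordinary sum of squares (the LTS criterion with h equal to the sample size) is dominated by it.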

13.

Bridge estimation for generalized linear models with a diverging number of parameters

Mingqiu Wang Lixin Song Xiaoguang Wang 《Statistics & probability letters》2010,80(21-22):1584-1596

Variable selection is fundamental to high dimensional generalized linear models. A number of variable selection approaches have been proposed in the literature. This paper considers the problem of variable selection and estimation in generalized linear models via a bridge penalty in the situation where the number of parameters diverges with the sample size. Under reasonable conditions the consistency of the bridge estimator can be achieved. Furthermore, it can select the nonzero coefficients with a probability converging to 1 and the estimators of the nonzero coefficients are asymptotically normal, namely the oracle property. Our simulations indicate that the bridge penalty is an effective consistent model selection technique and is comparable to the smoothly clipped absolute deviation procedure. A real example analysis is presented.
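The abstract does not state the penalty itself. As an assumption, the sketch below uses the standard form of the bridge penalty from the literature, lam * sum(|beta_j| ** gamma) with gamma > 0, which reduces to the lasso penalty at gamma = 1 and to the ridge penalty at gamma = 2. The function name and arguments are hypothetical, and the exponent range used in the paper is not given in this listing.

```python
from typing import Sequence


def bridge_penalty(beta: Sequence[float], lam: float, gamma: float) -> float:
    """Bridge penalty lam * sum(|beta_j| ** gamma) for lam >= 0 and gamma > 0."""
    if lam < 0:
        raise ValueError("lam must be non-negative")
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    return lam * sum(abs(b) ** gamma for b in beta)


# Hypothetical coefficient vector with one exact zero.
beta = [1.0, -2.0, 0.0]
lasso_like = bridge_penalty(beta, lam=0.5, gamma=1.0)  # gamma = 1: lasso penalty
ridge_like = bridge_penalty(beta, lam=0.5, gamma=2.0)  # gamma = 2: ridge penalty
```

A penalized estimator would add this term to a negative log-likelihood (or other loss) and minimize over beta; that optimization step is left out here.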

14.

Covariate selection for semiparametric hazard function regression models

Florentina Bunea Ian W. McKeague 《Journal of multivariate analysis》2005,92(1):186-204

We study a flexible class of nonproportional hazard function regression models in which the influence of the covariates splits into the sum of a parametric part and a time-dependent nonparametric part. We develop a method of covariate selection for the parametric part by adjusting for the implicit fitting of the nonparametric part. Asymptotic consistency of the proposed covariate selection method is established, leading to asymptotically normal estimators of both parametric and nonparametric parts of the model in the presence of covariate selection. The approach is applied to a real data set and a simulation study is presented.

15.

Asymptotic results in segmented multiple regression (total citations: 1; self-citations: 0; citations by others: 1)

Jeankyung Kim 《Journal of multivariate analysis》2008,99(9):2016-2038

This paper studies the asymptotic behavior of the least squares estimators in segmented multiple regression. For a model with more than one partitioning variable, each of which has one or more change-points, we study the asymptotic properties of the estimated change-points and regression coefficients. Using techniques in empirical process theory, we prove the consistency of the least squares estimators and also establish the asymptotic normality of the estimated regression coefficients. For the estimated change-points, we obtain their consistency at the rates of 1/√n or 1/n, with or without continuity constraints, respectively. The change-points estimated under the continuity constraints are also shown to asymptotically have a multivariate normal distribution. For the case where the regression mean functions are not assumed to be continuous at the change-points, the asymptotic distribution of the estimated change-points involves a step function process, whose distribution does not follow a well-known distribution.

16.

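A modified BIC tuning parameter selector for SICA-penalized Cox regression models with diverging dimensionality (total citations: 2; self-citations: 0; citations by others: 2)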

石跃勇 焦雨领 严良 曹永秀 《数学杂志》2017,37(4):723-730

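This paper studies the tuning parameter selection problem for SICA-penalized Cox regression models with diverging dimensionality, and proposes a modified BIC tuning parameter selector. Under certain regularity conditions, the model selection consistency of the method is proved. Numerical results show that the proposed method outperforms the GCV criterion.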

17.

A class of proper priors for Bayesian simultaneous prediction of independent Poisson observables

Fumiyasu Komaki 《Journal of multivariate analysis》2006,97(8):1815-1828

Simultaneous prediction and parameter inference for the independent Poisson observables model are considered. A class of proper prior distributions for Poisson means is introduced. Bayesian predictive densities and estimators based on priors in the introduced class dominate the Bayesian predictive density and estimator based on the Jeffreys prior under Kullback-Leibler loss.

18.

Direct variable selection for discrimination among several groups

Guy Martial Nkiet 《Journal of multivariate analysis》2012,105(1):151-163

We propose a criterion for variable selection in discriminant analysis. This criterion allows the variables to be arranged in decreasing order of adequacy for discrimination, so that the variable selection problem reduces to that of estimating a suitable permutation and dimensionality. Then, estimators for these parameters are proposed and the resulting method for selecting variables is shown to be consistent. In a simulation study, we compute proportions of correct classification after variable selection in order to gain understanding of the performance of our proposal and to compare it to existing methods.

19.

Statistical inference for semiparametric varying coefficient partially linear models

赵培信 《中国科学:数学》2013,43(7):635-646

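This paper studies the theory and methods of statistical inference for a class of semiparametric varying coefficient partially linear models under several types of complex data. First, under complex data such as longitudinal data and measurement error data, we study empirical likelihood inference for semiparametric varying coefficient partially linear models, and propose a grouped and a bias-corrected empirical likelihood method, respectively. These methods effectively handle the difficulty that the within-subject correlation of longitudinal data poses for constructing the empirical likelihood ratio function. Second, under complex data such as measurement error data and missing data, we study variable selection for the model, and propose a "bias-corrected" variable selection method and an imputation-based variable selection method, respectively. These variable selection methods can simultaneously select the important variables in the parametric and nonparametric components, and variable selection is carried out together with the estimation of the regression coefficients. With an appropriate choice of the penalty parameters, we prove that the variable selection methods consistently identify the true model and that the resulting regularized estimators have the oracle property.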

20.

Detection of a change point based on local-likelihood

Jib Huh 《Journal of multivariate analysis》2010,101(7):1681-1700

In this paper, we consider the regression function or its νth derivative in generalized linear models which may have a change/discontinuity point at an unknown location. The location and its jump size are estimated with the local polynomial fits based on one-sided kernel weighted local-likelihood functions. Asymptotic distributions of the proposed estimators of location and jump size are established. The finite-sample performances of the proposed estimators with practical aspects are illustrated by simulated and beetle mortality examples.