1.

Prediction error criterion for selecting variables in a linear regression model

Yasunori Fujikoshi Tamio Kan Shin Takahashi Tetsuro Sakurai 《Annals of the Institute of Statistical Mathematics》2011,63(2):387-403

Several criteria, such as CV, C_p, AIC, CAIC, and MAIC, are used for selecting variables in linear regression models. It might be noted that C_p has been proposed as an estimator of the expected standardized prediction error, although the target risk function of CV might be regarded as the expected prediction error R_PE. On the other hand, the target risk function of AIC, CAIC, and MAIC is the expected log-predictive likelihood. In this paper, we propose a prediction error criterion, PE, which is an estimator of the expected prediction error R_PE. Consequently, it is also a competitor of CV. Results of this study show that PE is an unbiased estimator when the true model is contained in the full model. The property is shown without the assumption of normality. In fact, PE is demonstrated as more faithful for its risk function than CV. The prediction error criterion PE is extended to the multivariate case. Furthermore, using simulations, we examine some peculiarities of all these criteria.
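As a rough illustration of two of the criteria named in this abstract, the sketch below computes leave-one-out CV and Mallows' C_p for ordinary least squares. The function names are hypothetical, and the PE criterion itself is not reproduced because the abstract does not give its formula.

```python
import numpy as np

def _rss(X, y):
    """Residual sum of squares of the least squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def loocv_prediction_error(X, y):
    """Leave-one-out CV estimate of prediction error for OLS.

    Uses the identity e_(i) = e_i / (1 - h_ii), where h_ii is the i-th
    diagonal entry of the hat matrix, so no refitting is needed.
    Assumes X has full column rank.
    """
    Q, _ = np.linalg.qr(X)               # orthonormal basis of col(X)
    leverage = np.sum(Q ** 2, axis=1)    # diagonal of H = Q Q^T
    resid = y - Q @ (Q.T @ y)
    return float(np.mean((resid / (1.0 - leverage)) ** 2))

def mallows_cp(X_sub, X_full, y):
    """Mallows' C_p of a candidate submodel; the error variance is
    estimated from the full model."""
    n, k = X_sub.shape
    sigma2_full = _rss(X_full, y) / (n - X_full.shape[1])
    return _rss(X_sub, y) / sigma2_full - n + 2 * k
```

By construction, C_p evaluated on the full model equals the number of columns of the full design matrix, which gives a quick sanity check.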

2.

Corrected-loss estimation for Error-in-Variable partially linear model

JIN Jiao TONG XingWei 《Science China Mathematics》2015,(5):1101-1114

We consider an Error-in-Variable partially linear model where the covariates of the linear part are measured with error which follows a normal distribution with a known covariance matrix. We propose a corrected-loss estimation of the covariate effect. The proposed estimator is asymptotically normal. Simulation studies are presented to show that the proposed method performs well with finite samples, and the proposed method is applied to a real data set.

3.

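A personal credit evaluation model based on a support vector machine committee machine

姚尚锋 吕慧 刘道才 《数学的实践与认识》2010,40(9)

To make full use of the strengths of SVM in personal credit evaluation and to overcome its weaknesses, a personal credit evaluation model based on a support vector machine committee machine is proposed. The model is combined with a method that constructs new learning samples by estimating attribute utility functions in order to carry out personal credit evaluation. Empirical analysis and a comparison with the SVM method show that the model has better, faster, and more adaptable predictive classification ability.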

4.

Bayesian bandwidth estimation for a semi-functional partial linear regression model with unknown error density

Han Lin Shang 《Computational Statistics》2014,29(3-4):829-848

In the context of a semi-functional partial linear regression model, we study the problem of error density estimation. The unknown error density is approximated by a mixture of Gaussian densities with means being the individual residuals, and variance a constant parameter. This mixture error density has a form of a kernel density estimator of residuals, where the regression function, consisting of parametric and nonparametric components, is estimated by the ordinary least squares and functional Nadaraya–Watson estimators. The estimation accuracy of the ordinary least squares and functional Nadaraya–Watson estimators jointly depends on the same bandwidth parameter. A Bayesian approach is proposed to simultaneously estimate the bandwidths in the kernel-form error density and in the regression function. Under the kernel-form error density, we derive a kernel likelihood and posterior for the bandwidth parameters. For estimating the regression function and error density, a series of simulation studies show that the Bayesian approach yields better accuracy than the benchmark functional cross validation. Illustrated by a spectroscopy data set, we found that the Bayesian approach gives better point forecast accuracy of the regression function than the functional cross validation, and it is capable of producing prediction intervals nonparametrically.
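The kernel-form error density described in this abstract, an equal-weight mixture of Gaussians centred at the residuals with a common bandwidth, can be sketched as follows. The function name and the bandwidth symbol b are assumptions for illustration; the Bayesian sampling of the bandwidths is not shown.

```python
import numpy as np

def kernel_error_density(e, residuals, b):
    """Evaluate (1/n) * sum_i N(e; residual_i, b^2) at the points e.

    This is a Gaussian kernel density estimate of the residuals with
    bandwidth b (assumed positive); b is a hypothetical parameter name.
    """
    e = np.atleast_1d(np.asarray(e, dtype=float))
    r = np.asarray(residuals, dtype=float)
    z = (e[:, None] - r[None, :]) / b     # pairwise standardised gaps
    kernel_sum = np.exp(-0.5 * z ** 2).sum(axis=1)
    return kernel_sum / (r.size * b * np.sqrt(2.0 * np.pi))
```

With a single residual at zero, the function reduces to the N(0, b^2) density, which is a convenient check on the normalising constant.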

5.

The efficiency of the second-order nonlinear least squares estimator and its extension

Mijeong Kim Yanyuan Ma 《Annals of the Institute of Statistical Mathematics》2012,64(4):751-764

We revisit the second-order nonlinear least squares estimator proposed in Wang and Leblanc (Ann Inst Stat Math 60:883–900, 2008) and show that the estimator reaches the asymptotic optimality concerning the estimation variability. Using a fully semiparametric approach, we further modify and extend the method to the heteroscedastic error models and propose a semiparametric efficient estimator in this more general setting. Numerical results are provided to support the results and illustrate the finite sample performance of the proposed estimator.

6.

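Variable selection and estimation for high-dimensional partially linear models

杨宜平 薛留根 《应用概率统计》2011,27(2):172-182

We consider the high-dimensional partially linear model and propose a variable selection method that simultaneously selects variables and estimates the parameters of interest. The Dantzig selector is applied to the linear part and to the derivatives of each order of the nonparametric part, which yields estimators of both the parametric and the nonparametric parts; the estimator of the parametric part is sparse. Non-asymptotic theoretical bounds for the estimators are proved. Finally, finite-sample properties are studied by simulation.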

7.

Post selection shrinkage estimation for high‐dimensional data analysis

Xiaoli Gao S. E. Ahmed Yang Feng 《Applied Stochastic Models in Business and Industry》2017,33(2):97-120

In high‐dimensional data settings where p ≫ n, many penalized regularization approaches were studied for simultaneous variable selection and estimation. However, with the existence of covariates with weak effect, many existing variable selection methods, including Lasso and its generations, cannot distinguish covariates with weak and no contribution. Thus, prediction based on a subset model of selected covariates only can be inefficient. In this paper, we propose a post selection shrinkage estimation strategy to improve the prediction performance of a selected subset model. Such a post selection shrinkage estimator (PSE) is data adaptive and constructed by shrinking a post selection weighted ridge estimator in the direction of a selected candidate subset. Under an asymptotic distributional quadratic risk criterion, its prediction performance is explored analytically. We show that the proposed post selection PSE performs better than the post selection weighted ridge estimator. More importantly, it improves the prediction performance of any candidate subset model selected from most existing Lasso‐type variable selection methods significantly. The relative performance of the post selection PSE is demonstrated by both simulation studies and real‐data analysis. Copyright © 2016 John Wiley & Sons, Ltd.

8.

Semiparametric estimation of average treatment effect through a random coefficient dummy endogenous variable model

YaHong Zhou LiMing Wang XiaoDan He 《Science China Mathematics》2014,57(11):2415-2428

This paper provides an estimation procedure for average treatment effect through a random coefficient dummy endogenous variable model. A leading example of the model is estimating the effect of a training program on earnings. The model is composed of two equations: an outcome equation and a decision equation. Given the linear restriction in outcome and decision equations, Chen (1999) provided a distribution-free estimation procedure under conditional symmetric error distributions. In this paper we extend Chen’s estimator by relaxing the linear index into a nonparametric function, which greatly reduces the risk of model misspecification. A two-step approach is proposed: the first step uses a nonparametric regression estimator for the decision variable, and the second step uses an instrumental variables approach to estimate average treatment effect in the outcome equation. The proposed estimator is shown to be consistent and asymptotically normally distributed. Furthermore, we investigate the finite-sample performance of our estimator by a Monte Carlo study and also use our estimator to study the return of college education in different periods of China. The estimates seem more reasonable than those of other commonly used estimators.

9.

Asymptotic theory of simultaneous estimation of Poisson means

S. Ejaz Ahmed 《Linear algebra and its applications》2009,430(10):2734-2748

The Poisson distribution is often a good approximation to the underlying sampling distribution and is central to the study of categorical data. In this paper, we propose a new unified approach to an investigation of point properties of simultaneous estimations of Poisson population parameters with general quadratic loss functions. The main accent is made on the shrinkage estimation. We build a series of estimators that could be represented as a convex combination of linear statistics such as maximum likelihood estimator (benchmark estimator), restricted estimator, composite estimator, preliminary test estimator, shrinkage estimator, positive rule shrinkage estimator (James-Stein type estimator). All these estimators are represented in a general integrated estimation approach, which allows us to unify our investigation and order them with respect to the risk. A simulation study with numerical and graphical results is conducted to illustrate the properties of the investigated estimators.

10.

State estimation with guaranteed performance for switching-type fuzzy neural networks in presence of sensor nonlinearities

《Communications in Nonlinear Science & Numerical Simulation》2014,19(7):2160-2171

This paper investigates the state estimation with guaranteed performance for a class of switching fuzzy neural networks. A switching-type fuzzy neural networks (STFNNs) model is proposed which captures external disturbances, sensor nonlinearities, and mode switching phenomenon of the fuzzy neural networks without the Markovian process assumption. For such a model, a state estimation problem is formulated to achieve the guaranteed performance: the estimation error system is exponentially stable with certain decay rate and a prescribed H_∞ disturbance attenuation level. A novel sufficient condition for this problem is established using the Lyapunov functional method and the average dwell time approach, and the estimator parameters are explicitly given. A numerical example is presented to show the effectiveness of the developed results.

11.

Bayesian point estimation and prediction

Robert R. Britney Robert L. Winkler 《Annals of the Institute of Statistical Mathematics》1974,26(1):15-34

In the Bayesian viewpoint, point estimation and prediction are treated from a decision-making standpoint. If a loss function can be determined which associates a loss with every possible error of estimation or prediction, then the optimal estimator or predictor is that value which minimizes expected loss. In most applications, the loss function is assumed to be linear or quadratic in the error of estimation or prediction, although there are many practical situations in which these simple functions are quite inappropriate. In this paper, we investigate the properties of Bayesian point estimates under other loss functions; both the general case and two special cases (power and exponential loss functions) are considered. For the special cases, we also investigate the sensitivity of Bayesian point estimation and prediction to misspecification in the loss function and discuss the practical implications of the results.
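The decision rule stated in this abstract, choosing the value that minimizes expected loss, can be sketched numerically with posterior draws. The helper below and the three loss functions are textbook illustrations chosen here, not the power or exponential losses studied in the paper, whose exact forms the abstract does not give.

```python
import numpy as np

def bayes_point_estimate(samples, loss, grid):
    """Return the grid value with the smallest Monte Carlo expected loss.

    samples: draws from the posterior (or predictive) distribution
    loss:    function of the error (estimate minus true value)
    grid:    candidate point estimates to search over
    """
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    expected = np.array([np.mean(loss(a - samples)) for a in grid])
    return float(grid[np.argmin(expected)])

def quadratic_loss(err):
    return err ** 2

def absolute_loss(err):
    return np.abs(err)

def asymmetric_linear_loss(err, cost_over=1.0, cost_under=3.0):
    """Linear loss in which underestimation (err < 0) costs more."""
    return np.where(err > 0, cost_over * err, -cost_under * err)
```

Under quadratic loss the minimizer is the posterior mean and under absolute loss the median; with the asymmetric linear loss above it moves to the cost_under / (cost_over + cost_under) quantile, which shows how the choice of loss shifts the estimate.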

12.

An adjusted maximum likelihood method for solving small area estimation problems

Huilin Li P. Lahiri 《Journal of multivariate analysis》2010,101(4):882-892

For the well-known Fay-Herriot small area model, standard variance component estimation methods frequently produce zero estimates of the strictly positive model variance. As a consequence, an empirical best linear unbiased predictor of a small area mean, commonly used in small area estimation, could reduce to a simple regression estimator, which typically has an overshrinking problem. We propose an adjusted maximum likelihood estimator of the model variance that maximizes an adjusted likelihood defined as a product of the model variance and a standard likelihood (e.g., a profile or residual likelihood) function. The adjustment factor was suggested earlier by Carl Morris in the context of approximating a hierarchical Bayes solution where the hyperparameters, including the model variance, are assumed to follow a prior distribution. Interestingly, the proposed adjustment does not affect the mean squared error property of the model variance estimator or the corresponding empirical best linear unbiased predictors of the small area means in a higher order asymptotic sense. However, as demonstrated in our simulation study, the proposed adjustment has a considerable advantage in small sample inference, especially in estimating the shrinkage parameters and in constructing the parametric bootstrap prediction intervals of the small area means, which require the use of a strictly positive consistent model variance estimate.

13.

Weighted PLSDV estimation for fixed effects panel data partially linear models

《应用概率统计》2018,34(2):111-134

This paper is concerned with the estimation of a fixed effects panel data partially linear regression model with the idiosyncratic errors being an autoregressive process. For fixed effects short time series panel data, the commonly used autoregressive error structure fitting method will not result in a consistent estimator of the autoregressive coefficients. Here we propose an alternative estimation and show that the resulting estimator of the autoregressive coefficients is consistent and this method is workable for any order autoregressive error structure. Moreover, combining the B-spline approximation, the profile least squares dummy variable (PLSDV) technique, and the consistently estimated autoregressive error structure, we develop a weighted PLSDV estimator for the parametric component and a weighted B-spline series (BS) estimator for the nonparametric component. The weighted PLSDV estimator is shown to be asymptotically normal and more asymptotically efficient than the one which ignores the error autoregressive structure. In addition, this paper derives the asymptotic bias of the weighted BS estimator and establishes its asymptotic normality as well. Simulation studies and an example of application are conducted to illustrate the finite sample performance of the proposed procedures.

14.

Tree-based multivariate regression and density estimation with right-censored data

Annette M. Molinaro Sandrine Dudoit Mark J. van der Laan 《Journal of multivariate analysis》2004,90(1):154-177

We propose a unified strategy for estimator construction, selection, and performance assessment in the presence of censoring. This approach is entirely driven by the choice of a loss function for the full (uncensored) data structure and can be stated in terms of the following three main steps. (1) First, define the parameter of interest as the minimizer of the expected loss, or risk, for a full data loss function chosen to represent the desired measure of performance. Map the full data loss function into an observed (censored) data loss function having the same expected value and leading to an efficient estimator of this risk. (2) Next, construct candidate estimators based on the loss function for the observed data. (3) Then, apply cross-validation to estimate risk based on the observed data loss function and to select an optimal estimator among the candidates. A number of common estimation procedures follow this approach in the full data situation, but depart from it when faced with the obstacle of evaluating the loss function for censored observations. Here, we argue that one can, and should, also adhere to this estimation road map in censored data situations. Tree-based methods, where the candidate estimators in Step 2 are generated by recursive binary partitioning of a suitably defined covariate space, provide a striking example of the chasm between estimation procedures for full data and censored data (e.g., regression trees as in CART for uncensored data and adaptations to censored data). Common approaches for regression trees bypass the risk estimation problem for censored outcomes by altering the node splitting and tree pruning criteria in manners that are specific to right-censored data. This article describes an application of our unified methodology to tree-based estimation with censored data. The approach encompasses univariate outcome prediction, multivariate outcome prediction, and density estimation, simply by defining a suitable loss function for each of these problems. The proposed method for tree-based estimation with censoring is evaluated using a simulation study and the analysis of CGH copy number and survival data from breast cancer patients.

15.

Stein's idea and minimax admissible estimation of a multivariate normal mean

Yuzo Maruyama 《Journal of multivariate analysis》2004,88(2):320-334

We consider estimation of a multivariate normal mean vector under sum of squared error loss. We propose a new class of minimax admissible estimators which are generalized Bayes with respect to a prior distribution which is a mixture of a point prior at the origin and a continuous hierarchical type prior. We also study conditions under which these generalized Bayes minimax estimators improve on the James–Stein estimator and on the positive-part James–Stein estimator.
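For reference, the two benchmark estimators named at the end of this abstract can be sketched as below for a single observation x ~ N_p(theta, sigma2 * I) with known variance. This is the classical James–Stein form, not the generalized Bayes estimators proposed in the paper; the function name and arguments are hypothetical.

```python
import numpy as np

def james_stein(x, sigma2=1.0, positive_part=True):
    """Shrink x toward the origin by the factor 1 - (p - 2) * sigma2 / ||x||^2.

    With positive_part=True, a negative factor is replaced by zero,
    which gives the positive-part James-Stein estimator.
    """
    x = np.asarray(x, dtype=float)
    p = x.size
    if p < 3:
        raise ValueError("James-Stein shrinkage requires dimension p >= 3")
    norm2 = float(x @ x)
    if norm2 == 0.0:
        return x.copy()                   # nothing to shrink at the origin
    factor = 1.0 - (p - 2) * sigma2 / norm2
    if positive_part:
        factor = max(factor, 0.0)
    return factor * x
```

When the observation is close to the origin the plain factor becomes negative and flips the sign of every coordinate; truncating it at zero avoids that, which is why the positive-part version is the stronger benchmark.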

16.

Subagging for credit scoring models

Giuseppe Paleologo André Elisseeff Gianluca Antonini 《European Journal of Operational Research》2010

The logistic regression framework has been for a long time the most used statistical method when assessing customer credit risk. Recently, a more pragmatic approach has been adopted, where the first issue is credit risk prediction, instead of explanation. In this context, several classification techniques have been shown to perform well on credit scoring, such as support vector machines among others. While the investigation of better classifiers is an important research topic, the specific methodology chosen in real world applications has to deal with the challenges arising from the real world data collected in the industry. Such data are often highly unbalanced, part of the information can be missing and some common hypotheses, such as the i.i.d. one, can be violated. In this paper we present a case study based on a sample of IBM Italian customers, which presents all the challenges mentioned above. The main objective is to build and validate robust models, able to handle missing information, class unbalancedness and non-iid data points. We define a missing data imputation method and propose the use of an ensemble classification technique, subagging, particularly suitable for highly unbalanced data, such as credit scoring data. Both the imputation and subagging steps are embedded in a customized cross-validation loop, which handles dependencies between different credit requests. The methodology has been applied using several classifiers (kernel support vector machines, nearest neighbors, decision trees, Adaboost) and their subagged versions. The use of subagging improves the performance of the base classifier and we will show that subagging decision trees achieve better performance, still keeping the model simple and reasonably interpretable.

17.

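Local modal regression for spatio-temporal models

汪红霞 林金官 黄性芳 《中国科学:数学》2021,(4):615-630

Spatio-temporal data often contain outliers or come from heavy-tailed distributions, in which case estimation methods based on least squares perform poorly and more robust estimation methods are needed. This paper proposes a local linear estimation method based on the local mode (local modal, LM) for spatio-temporal models. Both theory and data analysis show that when the data contain outliers or come from a heavy-tailed distribution, the local-modal local linear method is more efficient than the least-squares local linear method; when the data contain no outliers and come from a normal distribution, the two methods are asymptotically equally efficient. The paper uses the modal expectation-maximization (MEM) algorithm and derives the asymptotic normality of the estimator under dependent data.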

18.

Weak signals in high‐dimensional regression: Detection, estimation and prediction

Yanming Li Hyokyoung G. Hong S. Ejaz Ahmed Yi Li 《Applied Stochastic Models in Business and Industry》2019,35(2):283-298

Regularization methods, including Lasso, group Lasso, and SCAD, typically focus on selecting variables with strong effects while ignoring weak signals. This may result in biased prediction, especially when weak signals outnumber strong signals. This paper aims to incorporate weak signals in variable selection, estimation, and prediction. We propose a two‐stage procedure, consisting of variable selection and postselection estimation. The variable selection stage involves a covariance‐insured screening for detecting weak signals, whereas the postselection estimation stage involves a shrinkage estimator for jointly estimating strong and weak signals selected from the first stage. We term the proposed method as the covariance‐insured screening‐based postselection shrinkage estimator. We establish asymptotic properties for the proposed method and show, via simulations, that incorporating weak signals can improve estimation and prediction performance. We apply the proposed method to predict the annual gross domestic product rates based on various socioeconomic indicators for 82 countries.

19.

Statistical inference for panel data semiparametric partially linear regression models with heteroscedastic errors

Jinhong You Yong Zhou 《Journal of multivariate analysis》2010,101(5):1079-1101

We consider a panel data semiparametric partially linear regression model with an unknown parameter vector for the linear parametric component, an unknown nonparametric function for the nonlinear component, and a one-way error component structure which allows unequal error variances (referred to as heteroscedasticity). We develop procedures to detect heteroscedasticity and one-way error component structure, and propose a weighted semiparametric least squares estimator (WSLSE) of the parametric component in the presence of heteroscedasticity and/or one-way error component structure. This WSLSE is asymptotically more efficient than the usual semiparametric least squares estimator considered in the literature. The asymptotic properties of the WSLSE are derived. The nonparametric component of the model is estimated by the local polynomial method. Some simulations are conducted to demonstrate the finite sample performances of the proposed testing and estimation procedures. An example of application on a set of panel data of medical expenditures in Australia is also illustrated.

20.

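Almost unbiased ridge estimation in linear models with measurement error data

周道清 邬吉波 《经济数学》2018,(1):102-104

This article discusses parameter estimation in linear models with measurement error. When multicollinearity is present in a linear model with measurement error, an almost unbiased ridge estimator is proposed based on the idea of almost unbiased estimation, and its properties are analyzed. The study finds that the almost unbiased ridge estimator not only overcomes multicollinearity but also has a relatively small mean squared error.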