20 similar documents found; search time: 0 ms
1.
2.
ZHANG ZhongZhan & WANG DaRong, College of Applied Sciences, Beijing University of Technology, Beijing, China; The Pilot College, Beijing. Science China Mathematics (English edition), 2011, (3): 515-530
In this paper, we propose a new criterion, named PICa, to simultaneously select explanatory variables in the mean model and variance model in heteroscedastic linear models based on the model structure. We show that the new criterion can select the true mean model and a correct variance model with probability tending to 1 under mild conditions. Simulation studies and a real example are presented to evaluate the new criterion, and it turns out that the proposed approach performs well.
3.
4.
Semiparametric variable selection for partially varying coefficient models with endogenous variables
By using instrumental variable techniques and the partial group smoothly clipped absolute deviation penalty method, we propose a variable selection procedure for a class of partially varying coefficient models with endogenous variables. The proposed variable selection method can eliminate the influence of the endogenous variables. With appropriate selection of the tuning parameters, we establish the oracle property of this variable selection procedure. A simulation study is undertaken to assess the finite sample performance of the proposed variable selection procedure.
5.
We focus on the problem of simultaneous variable selection and estimation for nonlinear models based on modal regression (MR), when the number of coefficients diverges with sample size. With appropriate selection of the tuning parameters, the resulting estimator is shown to be consistent and to enjoy the oracle properties.
6.
A threshold stochastic volatility (SV) model is used for capturing time-varying volatilities and nonlinearity. Two adaptive Markov chain Monte Carlo (MCMC) methods of model selection are designed for the selection of threshold variables for this family of SV models. The first method is direct estimation, which approximates the posterior probabilities of competing models. Parallel MCMC sampling is used to estimate these probabilities, and the threshold variable with the highest posterior model probability is selected. The second method uses the deviance information criterion (DIC) to compare the competing models and select the best one. Simulation results lead us to conclude that for large samples the posterior model probability approximation method can give an accurate approximation of the posterior probability in Bayesian model selection. The method delivers a powerful and sharp model selection tool. An empirical study of five Asian stock markets provides strong support for the threshold variable which is formulated as a weighted average of important variables.
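Given MCMC output, the DIC comparison in the second method reduces to a few lines. A minimal generic sketch (not the authors' threshold-SV code), assuming per-draw log-likelihood values and the log likelihood at the posterior mean are available:

```python
import numpy as np

def dic(loglik_draws, loglik_at_posterior_mean):
    """Deviance information criterion from MCMC output.

    loglik_draws: log p(y | theta_s) evaluated at each posterior draw.
    loglik_at_posterior_mean: log p(y | theta_bar) at the posterior mean.
    Returns (DIC, p_D); the model with the smallest DIC is preferred.
    """
    d_bar = -2.0 * np.mean(loglik_draws)     # posterior mean deviance
    d_hat = -2.0 * loglik_at_posterior_mean  # deviance at the posterior mean
    p_d = d_bar - d_hat                      # effective number of parameters
    return d_bar + p_d, p_d
```

Computing DIC for each candidate threshold variable and ranking the models by it is then a one-line loop over the competing fits.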
7.
Qingguo Tang & R. J. Karunamuni, Annals of the Institute of Statistical Mathematics, 2018, 70(3): 489-521
Finite mixture regression (FMR) models are frequently used in statistical modeling, often with many covariates of low significance. Variable selection techniques can be employed to identify the covariates with little influence on the response. The problem of variable selection in FMR models is studied here. Penalized likelihood-based approaches are sensitive to data contamination, and their efficiency may be significantly reduced when the model is slightly misspecified. We propose a new robust variable selection procedure for FMR models. The proposed method is based on minimum-distance techniques, which seem to have some automatic robustness to model misspecification. We show that the proposed estimator has the variable selection consistency and oracle property. The finite-sample breakdown point of the estimator is established to demonstrate its robustness. We examine small-sample and robustness properties of the estimator using a Monte Carlo study. We also analyze a real data set.
8.
9.
We propose a robust estimation procedure based on local Walsh-average regression (LWR) for single-index models. Our novel method provides a root-n consistent estimate of the single-index parameter under some mild regularity conditions; the estimate of the unknown link function converges at the usual rate for the nonparametric estimation of a univariate covariate. We theoretically demonstrate that the new estimators show significant efficiency gain across a wide spectrum of non-normal error distributions and have almost no loss of efficiency for the normal error. Even in the worst case, the asymptotic relative efficiency (ARE) has a lower bound compared with the least squares (LS) estimates; the lower bounds of the AREs are 0.864 and 0.8896 for the single-index parameter and nonparametric function, respectively. Moreover, the ARE of the proposed LWR-based approach versus the ARE of the LS-based method has an expression that is closely related to the ARE of the signed-rank Wilcoxon test as compared with the t-test. In addition, to obtain a sparse estimate of the single-index parameter, we develop a variable selection procedure by combining the estimation method with the smoothly clipped absolute deviation penalty; this procedure is shown to possess the oracle property. We also propose a Bayes information criterion (BIC)-type criterion for selecting the tuning parameter and further prove its ability to consistently identify the true model. We conduct some Monte Carlo simulations and a real data analysis to illustrate the finite sample performance of the proposed methods.
10.
In multinomial logit models, the identifiability of parameter estimates is typically obtained by side constraints that specify one of the response categories as reference category. When parameters are penalized, shrinkage of estimates should not depend on the reference category. In this paper we investigate ridge regression for the multinomial logit model with symmetric side constraints, which yields parameter estimates that are independent of the reference category. In simulation studies the results are compared with the usual maximum likelihood estimates and an application to real data is given.
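The symmetric side constraint replaces the usual reference-category constraint with a sum-to-zero constraint across categories for each covariate. A minimal gradient-descent sketch of this idea (not the authors' estimator): because softmax probabilities are invariant to adding a per-observation constant to all class scores, projecting the coefficients onto the sum-to-zero set after each step leaves the fit unchanged but pins down an identification that treats all categories symmetrically under the ridge penalty.

```python
import numpy as np

def fit_ridge_multinomial(X, y, n_classes, lam=1.0, lr=0.1, n_iter=500):
    """Ridge-penalized multinomial logit with symmetric (sum-to-zero)
    side constraints, so shrinkage does not depend on any reference
    category.  Plain gradient descent; an illustrative sketch only."""
    n, p = X.shape
    B = np.zeros((p, n_classes))
    Y = np.eye(n_classes)[y]                     # one-hot responses
    for _ in range(n_iter):
        Z = X @ B
        Z -= Z.max(axis=1, keepdims=True)        # numerical stability
        P = np.exp(Z)
        P /= P.sum(axis=1, keepdims=True)        # softmax probabilities
        grad = X.T @ (P - Y) / n + lam * B       # penalized score
        B -= lr * grad
        B -= B.mean(axis=1, keepdims=True)       # project: sum over classes = 0
    return B
```

After fitting, each covariate's coefficients sum to zero across the response categories, so no category plays a distinguished role in the shrinkage.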
11.
Linjun Tang, Zhangong Zhou & Changchun Wu, Journal of Applied Mathematics and Computing, 2012, 40(1-2): 399-413
In this paper, a self-weighted composite quantile regression estimation procedure is developed to estimate the unknown parameters in an infinite-variance autoregressive (IVAR) model. The proposed estimator is asymptotically normal and more efficient than a single quantile regression estimator. At the same time, the adaptive least absolute shrinkage and selection operator (LASSO) for variable selection is also suggested. We show that the adaptive LASSO based on the self-weighted composite quantile regression enjoys the oracle properties. Simulation studies and a real data example are conducted to examine the performance of the proposed approaches.
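The core of composite quantile regression is a single slope vector estimated jointly across several quantile levels, each with its own intercept. A generic sketch of that objective for a plain linear model (the paper's estimator additionally self-weights the observations for infinite-variance AR errors and adds the adaptive LASSO penalty, neither of which is shown here):

```python
import numpy as np
from scipy.optimize import minimize

def check_loss(u, tau):
    """Quantile check loss: rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (u < 0.0))

def cqr_fit(X, y, taus=(0.25, 0.5, 0.75)):
    """Composite quantile regression: one slope vector shared across all
    quantile levels, with one intercept per level.  Illustrative sketch
    using a derivative-free optimizer on the nonsmooth objective."""
    n, p = X.shape
    K = len(taus)

    def objective(theta):
        intercepts, beta = theta[:K], theta[K:]
        return sum(check_loss(y - b - X @ beta, tau).mean()
                   for b, tau in zip(intercepts, taus))

    res = minimize(objective, np.zeros(K + p), method="Nelder-Mead",
                   options={"maxiter": 5000, "xatol": 1e-6, "fatol": 1e-8})
    return res.x[K:], res.x[:K]      # (shared slopes, per-quantile intercepts)
```

Averaging the check loss over several quantile levels is what buys efficiency over a single quantile regression, particularly under heavy-tailed errors.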
12.
In this paper, we consider improved estimation strategies for the parameter vector in multiple regression models with first-order random coefficient autoregressive errors (RCAR(1)). We propose a shrinkage estimation strategy and implement variable selection methods such as the lasso and adaptive lasso strategies. The simulation results reveal that the shrinkage estimators outperform both the lasso and the adaptive lasso if and only if there are many nuisance variables in the model.
13.
A commonly used semiparametric model is considered. We adopt two difference-based estimators of the linear component of the model and propose corresponding thresholding estimators that can be used for variable selection. For each thresholding estimator, variable selection in the linear component is developed and consistency of the variable selection procedure is shown. We evaluate our method in a simulation study and implement it on a real data set.
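The difference-based idea can be illustrated on a partially linear model y = Xβ + f(t) + ε: sorting by t and taking first differences cancels the smooth nonparametric component, after which ordinary least squares estimates β and a hard threshold zeroes out negligible coefficients. A generic first-order sketch (the paper studies two specific difference-based estimators, which are not reproduced here):

```python
import numpy as np

def diff_threshold_select(X, y, t, lam):
    """Difference-based estimation of the linear part of a partially
    linear model y = X beta + f(t) + eps, followed by hard thresholding
    for variable selection.  Assumes f is smooth; illustrative only."""
    order = np.argsort(t)              # sort by the nonparametric covariate
    Xd = np.diff(X[order], axis=0)     # differencing cancels the smooth f(t)
    yd = np.diff(y[order])
    beta, *_ = np.linalg.lstsq(Xd, yd, rcond=None)
    return np.where(np.abs(beta) > lam, beta, 0.0)   # hard thresholding
```

The threshold λ plays the role of the tuning parameter: coefficients whose estimates fall below it are declared inactive.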
14.
Logit models have been widely used in marketing to predict brand choice and to make inference about the impact of marketing mix variables on these choices. Most researchers have followed the pioneering example of Guadagni and Little, building choice models and drawing inference conditional on the assumption that the logit model is the correct specification for household purchase behaviour. To the extent that logit models fail to adequately describe household purchase behaviour, statistical inferences from them may be flawed. More importantly, marketing decisions based on these models may be incorrect. This research applies White's robust inference method to logit brand choice models. The method does not impose the restrictive assumption that the assumed logit model specification be true. A sandwich estimator of the covariance 'corrected' for possible misspecification is the basis for inference about logit model parameters. An important feature of this method is that it yields correct standard errors for the marketing mix parameter estimates even if the assumed logit model specification is not correct. Empirical examples include using household panel data sets from three different product categories to estimate logit models of brand choice. The standard errors obtained using traditional methods are compared with those obtained by White's robust method. The findings illustrate that incorrectly assuming the logit model to be true typically yields standard errors which are biased downward by 10-40 per cent. Conditions under which the bias is particularly severe are explored. Under these conditions, the robust approach is recommended. Copyright © 2000 John Wiley & Sons, Ltd.
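The sandwich covariance described above has the familiar form V = A⁻¹BA⁻¹, where A is the information matrix (the "bread") and B the outer product of per-observation score vectors (the "meat"). A minimal sketch for a binary logit (a generic White-style estimator, not the paper's code; `beta` is assumed to be the ML estimate):

```python
import numpy as np

def logit_sandwich_se(X, y, beta):
    """Misspecification-robust (sandwich) standard errors for a fitted
    binary logit model: V = A^{-1} B A^{-1}, with A the observed
    information and B the outer product of per-observation scores."""
    p = 1.0 / (1.0 + np.exp(-(X @ beta)))         # fitted probabilities
    scores = X * (y - p)[:, None]                 # per-observation scores
    A = (X * (p * (1.0 - p))[:, None]).T @ X      # information (bread)
    B = scores.T @ scores                         # score outer product (meat)
    A_inv = np.linalg.inv(A)
    return np.sqrt(np.diag(A_inv @ B @ A_inv))
```

When the logit specification is correct, A and B estimate the same matrix and the sandwich collapses to the usual inverse-information standard errors; under misspecification the two diverge, which is exactly the downward bias the abstract quantifies.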
15.
Despite the large cost of bodily injury (BI) claims in motor insurance, relatively little research has been done in this area. Many companies estimate (and therefore reserve) bodily injury compensation directly from initial medical reports. This practice may underestimate the final cost, because the severity is often assessed during the recovery period. Since the evaluation of this severity is often only qualitative, in this paper we apply an ordered multiple choice model at different moments in the life of a claim reported to an insurance company. We assume that the information available to the insurer does not flow continuously, because it is obtained at different stages. Using a real data set, we show that the application of sequential ordered logit models leads to a significant improvement in the prediction of the BI severity level, compared to the subjective classification that is used in practice. We also show that these results could improve the insurer’s reserves notably.
16.
Annals of the Institute of Statistical Mathematics - In this paper, we propose improved statistical inference and variable selection methods for generalized linear models based on empirical...
17.
This paper is concerned with the approximate computation of choice probabilities in mixed logit models. The relevant approximations are based on the Taylor expansion of the classical logit function and on the high-order moments of the random coefficients. The approximate choice probabilities and their derivatives are used in conjunction with log likelihood maximization for parameter estimation. The resulting method avoids the assumption of an a priori distribution for the random tastes. Moreover, experiments with simulation data show that it compares well with the simulation-based methods in terms of computational cost.
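The moment-based idea can be seen in one dimension: for a single random taste coefficient β with mean μ and variance σ², a second-order Taylor expansion of the logit function L gives E[L(β)] ≈ L(μ) + ½ L''(μ) σ², which uses only the first two moments and no distributional assumption. A hypothetical scalar illustration (the paper works with the multivariate expansion and higher moments):

```python
import numpy as np

def logit_prob(beta, x):
    """Binary logit choice probability for taste coefficient beta."""
    return 1.0 / (1.0 + np.exp(-beta * x))

def mixed_logit_moment_approx(mu, sigma, x, h=1e-3):
    """Second-order Taylor approximation of the mixed logit probability
    E[L(beta)] for one random coefficient with mean mu and std sigma:
    E[L] ~= L(mu) + 0.5 * L''(mu) * sigma**2.  The second derivative is
    computed by a central finite difference; illustrative sketch only."""
    d2 = (logit_prob(mu + h, x) - 2.0 * logit_prob(mu, x)
          + logit_prob(mu - h, x)) / h**2       # numerical L''(mu)
    return logit_prob(mu, x) + 0.5 * d2 * sigma**2
```

For moderate σ the approximation tracks the simulated (Monte Carlo) mixed logit probability closely while costing only a handful of logit evaluations, which is the computational advantage the abstract refers to.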
18.
A Tabu search method is proposed and analysed for selecting variables that are subsequently used in Logistic Regression Models. The aim is to find from among a set of m variables a smaller subset which enables the efficient classification of cases. Reducing dimensionality has some very well-known advantages that are summarized in the literature. The specific problem consists in finding, for a small integer value of p, a subset of size p of the original set of variables that yields the greatest percentage of hits in Logistic Regression. The proposed Tabu search method performs a deep search in the solution space that alternates between a basic phase (that uses simple moves) and a diversification phase (to explore regions not previously visited). Testing shows that it obtains significantly better results than the Stepwise, Backward or Forward methods used by classic statistical packages. Some results of applying these methods are presented.
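The basic phase described above can be sketched generically: maintain a current size-p subset, evaluate all single-variable swap moves, forbid recently swapped variables for a fixed tenure, and allow a tabu move only if it beats the best solution found (aspiration). The sketch below takes a caller-supplied `score` function in place of the paper's logistic-regression hit percentage, and omits the diversification phase:

```python
import numpy as np

def tabu_subset_search(score, m, p, n_iter=50, tenure=5, seed=0):
    """Basic-phase tabu search for a size-p subset of m variables.
    `score(subset)` returns the quality of a candidate subset (higher is
    better).  Illustrative sketch, not the authors' algorithm."""
    rng = np.random.default_rng(seed)
    current = list(rng.choice(m, size=p, replace=False))
    best, best_val = list(current), score(current)
    tabu = {}                                    # variable -> expiry iteration
    for it in range(n_iter):
        candidates = []
        for out in current:                      # try every swap move
            for inn in range(m):
                if inn in current:
                    continue
                cand = [v for v in current if v != out] + [inn]
                val = score(cand)
                is_tabu = tabu.get(inn, -1) > it or tabu.get(out, -1) > it
                if is_tabu and val <= best_val:  # aspiration criterion
                    continue
                candidates.append((val, cand, out, inn))
        if not candidates:
            break
        val, cand, out, inn = max(candidates, key=lambda c: c[0])
        current = cand
        tabu[out] = it + tenure                  # forbid reversing the move
        tabu[inn] = it + tenure
        if val > best_val:
            best, best_val = list(cand), val
    return sorted(best), best_val
```

With `score` set to the cross-validated hit rate of a logistic regression fitted on the candidate subset, this reproduces the selection problem the abstract describes.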
19.
This paper proposes a composite minimizing average check loss estimation (CMACLE) method to implement composite quantile regression (CQR) for partial linear single-index models (PLSIM). First, a consistent estimator of the parametric part in the CQR sense is constructed based on a high-dimensional kernel function; building on this consistent estimator, estimators of the parametric and nonparametric components attaining the optimal convergence rates are then obtained by employing an index kernel function, their asymptotic normality is established, and the relative asymptotic efficiency of the CQR estimator of the PLSIM is compared with that of minimum average variance estimation (MAVE). Furthermore, a variable selection method for the PLSIM under the CQR framework is proposed, and its oracle property is proved. Simulation studies and a real data analysis verify the finite-sample performance of the proposed methods and confirm their merits.
20.
Parameter estimation and data regression represent special classes of optimization problems. Often, nonlinear programming methods can be tailored to take advantage of the least squares, or more generally, the maximum likelihood, objective function. In previous studies we have developed large-scale nonlinear programming methods that are based on tailored quasi-Newton updating strategies and matrix decomposition of the process model. The resulting algorithms converge in a reduced space of the parameters while simultaneously converging the process model. Moreover, tailoring the method to least squares functions leads to significant improvements in the performance of the algorithm. These approaches can be very efficient for both explicit and implicit models (i.e. problems with small and large degrees of freedom, respectively). In the latter case, degrees of freedom are proportional to a potentially large number of data sets. Applications of this case include errors-in-all-variables estimation, data reconciliation and identification of process parameters. To deal with this structure, we apply a decomposition approach that performs a quadratic programming factorization for each data set. Because these are small components of large problems, an efficient and reliable algorithm results. These methods have generally been implemented on standard von Neumann architectures and few studies exist that exploit the parallelism of nonlinear programming algorithms. It is therefore interesting to note that for implicit model parameter estimation and related process applications, this approach can be quite amenable to parallel computation, because the major cost occurs in matrix decompositions for each data set. Here we describe an implementation of this approach on the Alliant FX/8 parallel computer at the Advanced Computing Research Facility at Argonne National Laboratory. Special attention is paid to the architecture of this machine and its effect on the performance of the algorithm.
This approach is demonstrated on five small, undetermined regression problems as well as a larger process example for simultaneous data reconciliation and parameter estimation.