Similar Documents
20 similar documents found.
1.
The selection of a best-subset regression model from a candidate family is a common problem that arises in many analyses. The Akaike information criterion (AIC) and the corrected AIC (\(\text {AIC}_c\)) are frequently used for this purpose. AIC and \(\text {AIC}_c\) are designed to estimate the expected Kullback–Leibler discrepancy. For best-subset selection, both AIC and \(\text {AIC}_c\) are negatively biased, and the use of either criterion will lead to the selection of overfitted models. To correct for this bias, we introduce an "improved" AIC variant, \(\text {AIC}_i\), which has a penalty term evaluated using Monte Carlo simulation. A multistage model selection procedure, \(\text {AIC}_{\text {aps}}\), which utilizes \(\text {AIC}_i\), is proposed for best-subset selection. Simulation studies are conducted to compare the performance of the different model selection methods.
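For concreteness, in Gaussian linear regression the two standard criteria take the familiar closed forms \(\text{AIC} = n\log(\text{RSS}/n) + 2k\) and \(\text{AIC}_c = \text{AIC} + 2k(k+1)/(n-k-1)\). The following minimal Python sketch scores all candidate subsets with both criteria on toy data; the Monte Carlo penalty of \(\text{AIC}_i\) and the \(\text{AIC}_{\text{aps}}\) procedure are specific to the paper and are not reproduced here.

```python
import numpy as np
from itertools import combinations

def aic_aicc(y, X):
    """Gaussian-linear-regression AIC and AICc for a given design matrix.

    Uses the common forms AIC = n*log(RSS/n) + 2k and
    AICc = AIC + 2k(k+1)/(n-k-1), where k counts the regression
    coefficients plus the error variance.
    """
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ beta) ** 2)
    k = p + 1                                   # coefficients + error variance
    aic = n * np.log(rss / n) + 2 * k
    aicc = aic + 2 * k * (k + 1) / (n - k - 1)
    return aic, aicc

# Toy best-subset search over all subsets of 3 candidate predictors.
rng = np.random.default_rng(0)
n = 50
X_full = rng.normal(size=(n, 3))
y = 1.5 * X_full[:, 0] + rng.normal(size=n)     # only predictor 0 matters
for r in range(1, 4):
    for cols in combinations(range(3), r):
        aic, aicc = aic_aicc(y, X_full[:, cols])
        print(cols, round(aic, 2), round(aicc, 2))
```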

2.
A method for feature selection in linear regression based on an extension of Akaike's information criterion (AIC) is proposed. Using the classical AIC for feature selection requires an exhaustive search through all subsets of features, which is computationally prohibitive. A new information criterion is proposed that is a continuous extension of AIC; as a result, the feature selection problem is reduced to a smooth optimization problem, and an efficient procedure for solving it is derived. Experiments show that the proposed method selects features in linear regression efficiently. In the experiments, the proposed procedure is compared with the relevance vector machine, a Bayesian feature selection method, and both procedures are shown to yield similar results. The main distinction of the proposed method is that certain regularization coefficients are exactly zero, which makes it possible to avoid the underfitting effect characteristic of the relevance vector machine. A special case (the so-called nondiagonal regularization) is considered in which the two methods coincide.

3.
In this paper, we consider the problem of selecting the fixed-effects variables in linear mixed models where random effects are present and the observation vectors have been obtained from many clusters. As the variable selection procedure, we use the Akaike information criterion (AIC). In the context of linear mixed models, two kinds of AIC have been proposed: the marginal AIC and the conditional AIC. In this paper, we derive three versions of the conditional AIC, depending on different estimators of the regression coefficients and the random effects. Simulation studies show that the proposed conditional AICs are superior to the marginal and conditional AICs proposed in the literature in the sense of selecting the true model. Finally, the results are extended to the case where the random effects in all clusters are of the same dimension and have a common unknown covariance matrix.

4.
In the problem of selecting the explanatory variables in a linear mixed model, we address the derivation of the (unconditional or marginal) Akaike information criterion (AIC) and the conditional AIC (cAIC). The covariance matrices of the random effects and the error terms include unknown parameters such as variance components, and the selection procedures proposed in the literature are limited to cases where these parameters are known or partly unknown. In this paper, AIC and cAIC are extended to the situation where the parameters are completely unknown and are estimated by general consistent estimators, including maximum likelihood (ML), restricted maximum likelihood (REML), and other unbiased estimators. Related to AIC and cAIC, we derive marginal and conditional prediction error criteria that select superior models in the sense of minimizing prediction error under quadratic loss. Finally, the numerical performance of the proposed selection procedures is investigated through simulation studies.

5.
In this paper, we consider the model selection problem for discretely observed ergodic multi-dimensional diffusion processes. Akaike's information criterion (AIC) is a useful tool for evaluating statistical models. Since AIC is constructed from the maximum log-likelihood and the dimension of the parameter space, it may appear straightforward to compute AIC even for discretely observed diffusion processes. However, there is a serious obstacle: the transition density of a diffusion process generally has no explicit form. Instead of the exact log-likelihood, we use a contrast function based on a locally Gaussian approximation of the transition density, and we propose the resulting contrast-based information criterion.
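To illustrate the locally Gaussian idea, here is a hedged one-dimensional sketch (the paper treats multi-dimensional ergodic diffusions): for an assumed Ornstein–Uhlenbeck model, the Euler scheme yields Gaussian transition approximations, and the minimized contrast can be penalized AIC-style by the parameter dimension.

```python
import numpy as np
from scipy.optimize import minimize

def euler_contrast(params, x, dt):
    """Negative locally-Gaussian (Euler) log-likelihood for the OU model
    dX_t = -theta * X_t dt + sigma dW_t.  The Euler scheme approximates
    X_{i+1} | X_i ~ N(X_i - theta * X_i * dt, sigma^2 * dt)."""
    theta, log_sigma = params
    sigma2 = np.exp(2 * log_sigma)              # keeps sigma positive
    mean = x[:-1] - theta * x[:-1] * dt
    resid = x[1:] - mean
    return 0.5 * np.sum(np.log(2 * np.pi * sigma2 * dt)
                        + resid ** 2 / (sigma2 * dt))

# Simulate a path, fit by minimizing the contrast, then form an
# AIC-style criterion: 2 * contrast + 2 * (number of parameters).
rng = np.random.default_rng(1)
dt, n = 0.01, 5000
x = np.zeros(n)
for i in range(n - 1):
    x[i + 1] = x[i] - 1.0 * x[i] * dt + 0.5 * np.sqrt(dt) * rng.normal()
fit = minimize(euler_contrast, x0=[0.5, np.log(0.3)], args=(x, dt))
cic = 2 * fit.fun + 2 * len(fit.x)
print("estimates:", fit.x, "contrast-based IC:", cic)
```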

6.
We consider the use of B-spline nonparametric regression models estimated by the maximum penalized likelihood method for extracting information from data with complex nonlinear structure. Crucial points in B-spline smoothing are the choices of a smoothing parameter and the number of basis functions, for which several selectors have been proposed based on cross-validation and the Akaike information criterion (AIC). It should be noticed, however, that AIC is a criterion for evaluating models estimated by the maximum likelihood method, and it was derived under the assumption that the true distribution belongs to the specified parametric model. In this paper we derive information criteria for evaluating B-spline nonparametric regression models estimated by the maximum penalized likelihood method in the context of generalized linear models under model misspecification. We use Monte Carlo experiments and real data examples to examine the properties of our criteria in comparison with various previously proposed selectors.
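As a rough illustration of penalized-likelihood B-spline smoothing in the Gaussian case (a simplification; the paper's criteria address generalized linear models under misspecification), the sketch below fits a P-spline with a second-order difference penalty and scores it with an AIC-type criterion using the effective degrees of freedom trace(H). It assumes SciPy's `BSpline.design_matrix` is available (SciPy ≥ 1.8).

```python
import numpy as np
from scipy.interpolate import BSpline

def pspline_fit(x, y, n_basis=20, degree=3, lam=1.0):
    """Penalized B-spline (P-spline) fit with a second-order difference
    penalty; returns fitted values and an AIC-type score using the
    effective degrees of freedom trace(H)."""
    # Knot vector: equally spaced interior knots with repeated boundaries.
    inner = np.linspace(x.min(), x.max(), n_basis - degree + 1)
    t = np.r_[[x.min()] * degree, inner, [x.max()] * degree]
    B = BSpline.design_matrix(x, t, degree).toarray()
    D = np.diff(np.eye(B.shape[1]), n=2, axis=0)    # 2nd-order differences
    A = B.T @ B + lam * D.T @ D
    coef = np.linalg.solve(A, B.T @ y)
    yhat = B @ coef
    H = B @ np.linalg.solve(A, B.T)                 # smoother ("hat") matrix
    edf = np.trace(H)                               # effective dof
    n = len(y)
    sigma2 = np.sum((y - yhat) ** 2) / n
    aic = n * np.log(sigma2) + 2 * edf              # AIC-type criterion
    return yhat, aic

rng = np.random.default_rng(2)
x = np.sort(rng.uniform(0, 1, 200))
y = np.sin(2 * np.pi * x) + 0.3 * rng.normal(size=200)
for lam in [0.01, 1.0, 100.0]:
    _, aic = pspline_fit(x, y, lam=lam)
    print(f"lambda={lam:>6}: AIC-type score = {aic:.1f}")
```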

7.
Bootstrapping Log Likelihood and EIC, an Extension of AIC
Akaike (1973, 2nd International Symposium on Information Theory, 267–281, Akademiai Kiado, Budapest) proposed AIC as an estimate of the expected log-likelihood for evaluating the goodness of models fitted to a given set of data. The introduction of AIC has greatly widened the range of application of statistical methods. However, its limitation lies in the fact that it can be applied only to cases where parameter estimation is performed by the maximum likelihood method. The derivation of AIC is based on assessing the effect of data fluctuation through the asymptotic normality of the MLE. In this paper we propose a new information criterion, EIC, which is constructed by employing the bootstrap method to simulate the data fluctuation. The new criterion is regarded as an extension of AIC. The performance of EIC is demonstrated by some numerical examples.
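A common form of the bootstrap bias estimate behind EIC replaces AIC's fixed penalty \(2k\) with \(2\hat b\), where \(\hat b\) averages \(\ell(\hat\theta^*; y^*) - \ell(\hat\theta^*; y)\) over bootstrap samples \(y^*\). A minimal sketch for the i.i.d. Gaussian model (an assumption made for brevity):

```python
import numpy as np

def gauss_loglik(y, mu, sigma2):
    return -0.5 * np.sum(np.log(2 * np.pi * sigma2) + (y - mu) ** 2 / sigma2)

def eic(y, n_boot=500, rng=None):
    """EIC sketch for the i.i.d. Gaussian model: the penalty is a bootstrap
    estimate of the optimism of the maximized log-likelihood,
    b = mean over bootstrap samples of [ll(theta*; y*) - ll(theta*; y)]."""
    rng = rng or np.random.default_rng()
    mu, s2 = y.mean(), y.var()                  # MLEs on the original data
    ll = gauss_loglik(y, mu, s2)
    bias = 0.0
    for _ in range(n_boot):
        yb = rng.choice(y, size=len(y), replace=True)
        mub, s2b = yb.mean(), yb.var()          # MLEs on the bootstrap sample
        bias += gauss_loglik(yb, mub, s2b) - gauss_loglik(y, mub, s2b)
    bias /= n_boot
    return -2 * ll + 2 * bias   # compare: AIC would use the fixed penalty 2*k

y = np.random.default_rng(3).normal(1.0, 2.0, size=100)
print("EIC:", eic(y),
      " AIC:", -2 * gauss_loglik(y, y.mean(), y.var()) + 2 * 2)
```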

8.
Test-based variable selection algorithms in regression are often based on sequential comparison of test statistics to cutoff values. A predetermined α level typically is used to determine the cutoffs from an assumed probability distribution for the test statistic. For example, backward elimination or forward stepwise selection involves comparisons of test statistics to prespecified t or F cutoffs in Gaussian linear regression, while a likelihood ratio, Wald, or score statistic is typically used with standard normal or chi-squared cutoffs in nonlinear settings. Although such algorithms enjoy widespread use, their statistical properties are not well understood, either theoretically or empirically. Two inherent problems with these methods are that (1) as in classical hypothesis testing, the value of α is arbitrary, and (2) unlike hypothesis testing, there is no simple analog of the type I error rate corresponding to application of the entire algorithm to a data set. In this article we propose a new method, backward elimination via cross-validation (BECV), for test-based variable selection in regression. It is implemented by first finding the empirical p value α* that minimizes a cross-validation estimate of squared prediction error, and then selecting the model by running backward elimination on the entire data set using α* as the nominal p value for each test. We present results of an extensive computer simulation to evaluate BECV and compare its performance to standard backward elimination and forward stepwise selection.
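A hedged numpy/scipy sketch of the BECV recipe as described above: choose α* by cross-validated squared prediction error, then run backward elimination on the full data with α*. Details such as the fold construction and the t-test-based elimination rule are illustrative choices, not the authors' exact implementation.

```python
import numpy as np
from scipy import stats

def backward_eliminate(X, y, alpha):
    """Backward elimination on OLS t-tests: repeatedly drop the predictor
    whose p-value is largest, until all remaining p-values are <= alpha."""
    cols = list(range(X.shape[1]))
    while cols:
        Xc = X[:, cols]
        n, p = Xc.shape
        beta, *_ = np.linalg.lstsq(Xc, y, rcond=None)
        resid = y - Xc @ beta
        s2 = resid @ resid / (n - p)
        se = np.sqrt(s2 * np.diag(np.linalg.inv(Xc.T @ Xc)))
        pvals = 2 * stats.t.sf(np.abs(beta / se), df=n - p)
        worst = int(np.argmax(pvals))
        if pvals[worst] <= alpha:
            break
        cols.pop(worst)
    return cols

def becv(X, y, alphas, k=5, rng=None):
    """BECV sketch: pick alpha* minimizing K-fold CV squared prediction error
    of the backward-eliminated model, then rerun elimination on all data."""
    rng = rng or np.random.default_rng()
    folds = rng.permutation(len(y)) % k
    cv_err = []
    for a in alphas:
        err = 0.0
        for f in range(k):
            tr, te = folds != f, folds == f
            cols = backward_eliminate(X[tr], y[tr], a)
            if cols:
                b, *_ = np.linalg.lstsq(X[tr][:, cols], y[tr], rcond=None)
                pred = X[te][:, cols] @ b
            else:
                pred = np.zeros(int(te.sum()))   # empty model predicts 0
            err += np.sum((y[te] - pred) ** 2)
        cv_err.append(err)
    a_star = alphas[int(np.argmin(cv_err))]
    return a_star, backward_eliminate(X, y, a_star)

rng = np.random.default_rng(11)
X = rng.normal(size=(120, 6))
y = X[:, 0] - X[:, 1] + rng.normal(size=120)
print(becv(X, y, alphas=[0.01, 0.05, 0.1, 0.2], rng=rng))
```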

9.
Polynomial splines are often used in statistical regression models for smooth response functions. When the number and location of the knots are optimized, the approximating power of the spline is improved and the model is nonparametric, with locally determined smoothness. However, finding the optimal knot locations is a historically difficult problem. We present a new estimation approach that improves computational properties by penalizing coalescing knots. The resulting estimator is easier to compute than unpenalized estimates of knot positions, eliminates unnecessary "corners" in the fitted curve, and, in simulation studies, shows no increase in loss. A number of GCV- and AIC-type criteria for choosing the number of knots are evaluated via simulation.

10.
Testing for nonindependence among the residuals from a regression or time series model is a common approach to evaluating the adequacy of a fitted model. This idea underlies the familiar Durbin–Watson statistic, and previous works illustrate how the spatial autocorrelation among residuals can be used to test a candidate linear model. We propose here that a version of Moran's I statistic for spatial autocorrelation, applied to residuals from a fitted model, is a practical general tool for selecting model complexity under the assumption of iid additive errors. The "space" is defined by the independent variables, and the presence of significant spatial autocorrelation in residuals is evidence that a more complex model is needed to capture all of the structure in the data. An advantage of this approach is its generality, which results from the fact that no properties of the fitted model are used other than consistency. The problem of smoothing parameter selection in nonparametric regression is used to illustrate the performance of model selection based on residual spatial autocorrelation (RSA). In simulation trials comparing RSA with established selection criteria based on minimizing mean square prediction error, smooths selected by RSA exhibit fewer spurious features such as minima and maxima. In some cases, at higher noise levels, RSA smooths achieved a lower average mean square error than smooths selected by GCV. We also briefly describe a possible modification of the method for non-iid errors having short-range correlations, for example, time-series errors or spatial data. Some other potential applications are suggested, including variable selection in regression models.
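A minimal sketch of the RSA idea: Moran's I computed on residuals, with neighbourhoods defined by k-nearest neighbours in covariate space (one simple weight construction; the article does not mandate a particular one). An underfit model leaves structure in the residuals and produces a large positive I.

```python
import numpy as np

def morans_i(resid, W):
    """Moran's I for residuals r given a weight matrix W:
    I = (n / sum(W)) * (r' W r) / (r' r), with r centered."""
    r = resid - resid.mean()
    n = len(r)
    return (n / W.sum()) * (r @ W @ r) / (r @ r)

def knn_weights(X, k=5):
    """Binary k-nearest-neighbour weights in covariate space, the 'space'
    used by the RSA approach sketched above."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)                  # no self-neighbours
    W = np.zeros_like(d)
    idx = np.argsort(d, axis=1)[:, :k]
    for i, js in enumerate(idx):
        W[i, js] = 1.0
    return W

# Underfit model -> structured residuals -> large positive Moran's I.
rng = np.random.default_rng(4)
x = rng.uniform(-2, 2, size=(200, 1))
y = np.sin(3 * x[:, 0]) + 0.2 * rng.normal(size=200)
resid_linear = y - np.polyval(np.polyfit(x[:, 0], y, 1), x[:, 0])
print("Moran's I, linear fit:", morans_i(resid_linear, knn_weights(x)))
```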

11.
An improved AIC-based criterion is derived for model selection in general smoothing-based modeling, including semiparametric models and additive models. Examples are provided of applications to goodness-of-fit, smoothing parameter and variable selection in an additive model and semiparametric models, and variable selection in a model with a nonlinear function of linear terms.

12.
Stochastic Analysis and Applications, 2013, 31(6): 1487–1509
We apply Grenander's method of sieves to the problem of identification or estimation of the "drift" function for linear stochastic systems driven by a fractional Brownian motion (fBm). We use an increasing sequence of finite-dimensional subspaces of the parameter space as the natural sieves on which we maximise the likelihood function.

13.
This paper deals with bias correction of the cross-validation (CV) criterion for estimating the predictive Kullback–Leibler information. A bias-corrected CV criterion is proposed by replacing the ordinary maximum likelihood estimator with the maximizer of an adjusted log-likelihood function. The adjustment is slight and simple, but the improvement in bias is remarkable: the bias of the ordinary CV criterion is \(O(n^{-1})\), while that of the bias-corrected CV criterion is \(O(n^{-2})\). We verify by numerical experiments that our criterion has smaller bias than AIC, TIC, EIC, and the ordinary CV criterion.
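For orientation, the ordinary CV criterion being corrected is \(-2\sum_i \log f(y_i; \hat\theta_{(-i)})\), with \(\hat\theta_{(-i)}\) the leave-one-out MLE. A minimal sketch for the i.i.d. Gaussian model (the paper's adjusted log-likelihood is not reproduced here):

```python
import numpy as np

def cv_criterion(y):
    """Ordinary cross-validation criterion for the i.i.d. Gaussian model:
    -2 * sum_i log f(y_i; theta_hat_{-i}), where theta_hat_{-i} is the
    MLE computed with observation i left out."""
    n = len(y)
    crit = 0.0
    for i in range(n):
        yi = np.delete(y, i)
        mu, s2 = yi.mean(), yi.var()   # leave-one-out MLEs
        crit += np.log(2 * np.pi * s2) + (y[i] - mu) ** 2 / s2
    return crit

y = np.random.default_rng(5).normal(size=80)
print("CV criterion:", cv_criterion(y))
```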

14.

It is well known that variable selection in multiple regression can be unstable and that the model uncertainty can be considerable. This uncertainty can be quantified and explored by bootstrap resampling; see Sauerbrei et al. (Biom J 57:531–555, 2015). Here, approaches are introduced that use the results of bootstrap replications of the variable selection process to obtain more detailed information about the data. Analyses are based on dissimilarities between the results of the analyses of different bootstrap samples: dissimilarities are computed between the vectors of predictions and between the sets of selected variables. The dissimilarities are used to map the models by multidimensional scaling, to cluster them, and to construct heatplots. Clusters can point to different interpretations of the data arising from different selections of variables supported by different bootstrap samples. A new measure of variable selection instability is also defined. The methodology can be applied to various regression models, estimators, and variable selection methods. It is illustrated by three real data examples, using linear regression and a Cox proportional hazards model, and model selection by AIC and BIC.
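A toy sketch of the bootstrap-instability idea: repeat a variable selection over bootstrap samples and summarize the pairwise Jaccard dissimilarities between the selected sets. The AIC best-subset selector below is an illustrative stand-in for the stepwise AIC/BIC selection used in the paper.

```python
import numpy as np
from itertools import combinations

def jaccard_dist(a, b):
    """Jaccard dissimilarity between two selected-variable sets."""
    a, b = set(a), set(b)
    return 1 - len(a & b) / len(a | b) if (a | b) else 0.0

def select_by_aic(X, y):
    """Toy selector: best subset by Gaussian AIC."""
    n, p = X.shape
    best, best_aic = (), np.inf
    for r in range(p + 1):
        for cols in combinations(range(p), r):
            Xc = np.c_[np.ones(n), X[:, cols]]
            beta, *_ = np.linalg.lstsq(Xc, y, rcond=None)
            rss = np.sum((y - Xc @ beta) ** 2)
            aic = n * np.log(rss / n) + 2 * (len(cols) + 2)
            if aic < best_aic:
                best, best_aic = cols, aic
    return best

rng = np.random.default_rng(6)
n, p = 60, 5
X = rng.normal(size=(n, p))
y = X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n)
sels = []
for _ in range(30):                       # bootstrap replications
    idx = rng.integers(0, n, size=n)
    sels.append(select_by_aic(X[idx], y[idx]))
# Crude instability summary (diagonal zeros included in the mean).
D = [[jaccard_dist(a, b) for b in sels] for a in sels]
print("mean pairwise Jaccard dissimilarity:", np.mean(D))
```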


15.
A common challenge in regression is that for many problems, the degrees of freedom required for a high-quality solution also allow for overfitting. Regularization is a class of strategies that restrict the range of possible solutions so as to discourage overfitting while still enabling good solutions; different regularization strategies impose different types of restrictions. In this paper, we present a multilevel regularization strategy that constructs and trains a hierarchy of neural networks, each of which has layers that are wider versions of the previous network's layers. We draw intuition and techniques from the field of algebraic multigrid (AMG), traditionally used for solving linear and nonlinear systems of equations, and specifically adapt the Full Approximation Scheme (FAS) for nonlinear systems to the problem of deep learning. Training through V-cycles then encourages the neural networks to build a hierarchical understanding of the problem. We refer to this approach as multilevel-in-width, to distinguish it from prior multilevel work that hierarchically alters the depth of neural networks. The resulting approach is a highly flexible framework that can be applied to a variety of layer types, which we demonstrate with both fully connected and convolutional layers. We show experimentally, on PDE regression problems, that our multilevel training approach is an effective regularizer, improving the generalization performance of the neural networks studied.

16.
The generalized information criterion (GIC) proposed by Rao and Wu [A strongly consistent procedure for model selection in a regression problem, Biometrika 76 (1989) 369–374] is a generalization of Akaike's information criterion (AIC) and the Bayesian information criterion (BIC). In this paper, we extend the GIC to select linear mixed-effects models, which are widely applied in analyzing longitudinal data. A procedure for selecting fixed effects and random effects based on the extended GIC is provided, and the asymptotic behavior of the extended GIC method for selecting fixed effects is studied. We prove that, under mild conditions, the selection procedure is asymptotically loss efficient regardless of the existence of a true model, and consistent if a true model exists. A simulation study is carried out to evaluate the performance of the extended GIC procedure empirically. The results show that if the signal-to-noise ratio is moderate or high, the percentage of correct fixed-effect selections by the GIC procedure is close to one for finite samples, while the procedure performs relatively poorly when used to select random effects.
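In its simplest Gaussian-regression form, a GIC-style criterion penalizes \(n\log(\text{RSS}/n)\) by \(\lambda_n k\), recovering AIC at \(\lambda_n = 2\) and BIC at \(\lambda_n = \log n\). The sketch below illustrates this family; Rao and Wu's conditions on the penalty sequence and the paper's mixed-model extension are omitted.

```python
import numpy as np

def gic(y, X, lam):
    """GIC-style criterion for Gaussian linear regression:
    n*log(RSS/n) + lam * k.  lam = 2 gives AIC, lam = log(n) gives BIC."""
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ beta) ** 2)
    return n * np.log(rss / n) + lam * p

rng = np.random.default_rng(7)
n = 100
X = np.c_[np.ones(n), rng.normal(size=(n, 2))]
y = X @ np.array([1.0, 2.0, 0.0]) + rng.normal(size=n)   # last column is noise
for name, lam in [("AIC", 2.0), ("BIC", np.log(n))]:
    print(name, gic(y, X, lam), "vs reduced model:", gic(y, X[:, :2], lam))
```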

17.
We consider the linear model \(Y = X\beta + \varepsilon\) that is obtained by discretizing a system of first-kind integral equations describing a set of physical measurements. The n-vector β represents the desired quantities, the m × n matrix X represents the instrument response functions, and the m-vector Y contains the measurements actually obtained. These measurements are corrupted by random measuring errors ε drawn from a distribution with zero mean vector and known variance matrix. Solution of first-kind integral equations is an ill-posed problem, so the least squares solution for the above model is a highly unstable function of the measurements, and the classical confidence intervals for the solution are too wide to be useful. The solution can often be stabilized by imposing physically motivated nonnegativity constraints. In a previous article (O'Leary and Rust 1986) we developed a method for computing sets of nonnegatively constrained simultaneous confidence intervals. In this article we briefly review the simultaneous intervals and then show how to compute nonnegativity-constrained one-at-a-time confidence intervals. The technique gives valid confidence intervals even for problems with m < n. We demonstrate the methods using both an overdetermined and an underdetermined problem obtained by discretizing an equation of Phillips (1962).
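The stabilization point can be seen in a few lines: on a discretized first-kind kernel, (effectively unregularized) least squares amplifies the measurement noise wildly, while the nonnegativity constraint, here via `scipy.optimize.nnls`, restores a usable solution. The kernel and truth below are invented for illustration; the confidence-interval construction of the article is not reproduced.

```python
import numpy as np
from scipy.optimize import nnls

# Ill-posed first-kind problem: a smoothing-kernel design matrix with
# rapidly decaying singular values.
rng = np.random.default_rng(8)
s = np.linspace(0, 1, 40)
t = np.linspace(0, 1, 60)
X = np.exp(-((t[:, None] - s[None, :]) ** 2) / 0.005)   # smooth kernel
beta_true = np.maximum(0, np.sin(np.pi * s))             # nonnegative truth
y = X @ beta_true + 0.01 * rng.normal(size=len(t))

# Pseudo-inverse with a tiny cutoff mimics unregularized least squares.
beta_ls = np.linalg.pinv(X, rcond=1e-10) @ y
beta_nn, _ = nnls(X, y)                                  # nonnegative LS
print("max |LS estimate|  :", np.abs(beta_ls).max())     # huge oscillations
print("max |NNLS estimate|:", np.abs(beta_nn).max())     # near the truth
```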

18.
Unbiased Recursive Partitioning: A Conditional Inference Framework
Recursive binary partitioning is a popular tool for regression analysis. Two fundamental problems of the exhaustive search procedures usually applied to fit such models have been known for a long time: overfitting and a selection bias towards covariates with many possible splits or missing values. While pruning procedures are able to solve the overfitting problem, the variable selection bias still seriously affects the interpretability of tree-structured regression models. For some special cases unbiased procedures have been suggested, but they lack a common theoretical foundation. We propose a unified framework for recursive partitioning that embeds tree-structured regression models into a well-defined theory of conditional inference procedures. Stopping criteria based on multiple test procedures are implemented, and it is shown that the predictive performance of the resulting trees is as good as that of established exhaustive search procedures. It turns out that the partitions, and therefore the models, induced by the two approaches are structurally different, confirming the need for unbiased variable selection. Moreover, it is shown that the prediction accuracy of trees with early stopping is equivalent to the prediction accuracy of pruned trees with unbiased variable selection. The methodology presented here is applicable to all kinds of regression problems, including nominal, ordinal, numeric, censored, and multivariate response variables, and arbitrary measurement scales of the covariates. Data from studies on glaucoma classification, node-positive breast cancer survival, and mammography experience are re-analyzed.

19.
An asymptotically efficient selection of regression variables is considered for the situation where the statistician estimates regression parameters by the maximum likelihood method but fails to choose a likelihood function matching the true error distribution. The proposed procedure is useful when a robust regression technique is applied but the data in fact do not require that treatment. Examples and a Monte Carlo study are presented, and relationships to other selectors, such as Mallows' \(C_p\), are investigated.
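For reference, Mallows' \(C_p\) for a size-\(p\) submodel is \(C_p = \text{RSS}_p/\hat\sigma^2 - n + 2p\), with \(\hat\sigma^2\) estimated from the full model; a well-fitting submodel has \(C_p \approx p\). A minimal sketch:

```python
import numpy as np
from itertools import combinations

def mallows_cp(y, X_full, cols):
    """Mallows' C_p for the submodel using columns `cols`:
    C_p = RSS_p / sigma2_hat - n + 2p, sigma2_hat from the full model."""
    n = len(y)
    def rss(X):
        b, *_ = np.linalg.lstsq(X, y, rcond=None)
        return np.sum((y - X @ b) ** 2)
    sigma2 = rss(X_full) / (n - X_full.shape[1])
    return rss(X_full[:, cols]) / sigma2 - n + 2 * len(cols)

rng = np.random.default_rng(9)
n = 80
X = np.c_[np.ones(n), rng.normal(size=(n, 3))]
y = X @ np.array([1.0, 2.0, 0.0, 0.0]) + rng.normal(size=n)
for r in range(1, 5):
    for cols in combinations(range(4), r):
        if 0 in cols:   # always keep the intercept column
            print(cols, round(mallows_cp(y, X, cols), 2))
```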

20.
Multi-sample cluster analysis using Akaike's Information Criterion
Multi-sample cluster analysis, the problem of grouping samples, is studied from an information-theoretic viewpoint via Akaike's information criterion (AIC). This criterion combines the maximum value of the likelihood with the number of parameters used in achieving that value. The multi-sample cluster problem is defined, and AIC is developed for this problem. The form of AIC is derived both in the multivariate analysis of variance (MANOVA) model and in the multivariate model with varying mean vectors and variance-covariance matrices. Numerical examples are presented for AIC and another criterion called w-square. The results demonstrate the utility of AIC in identifying the best clustering alternatives.
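Under the MANOVA model (group-specific means, common covariance), the maximized log-likelihood, and hence AIC, has a closed form, so competing groupings of samples can be scored directly. The sketch below is a hedged reconstruction from those standard formulas, not the paper's procedure; the w-square criterion is not reproduced.

```python
import numpy as np

def manova_aic(samples, grouping):
    """AIC for a grouping of samples under the MANOVA model: group-specific
    mean vectors, common covariance matrix, both at their MLEs.
    `samples` is a list of (n_i x p) arrays; `grouping` is a tuple of
    tuples of sample indices forming each cluster."""
    p = samples[0].shape[1]
    n = sum(s.shape[0] for s in samples)
    S = np.zeros((p, p))
    for cluster in grouping:
        Xg = np.vstack([samples[i] for i in cluster])
        R = Xg - Xg.mean(axis=0)
        S += R.T @ R                              # within-cluster scatter
    Sigma = S / n                                 # pooled MLE covariance
    loglik = -0.5 * n * (p * np.log(2 * np.pi)
                         + np.log(np.linalg.det(Sigma)) + p)
    k = len(grouping) * p + p * (p + 1) / 2       # means + covariance params
    return -2 * loglik + 2 * k

rng = np.random.default_rng(10)
a = rng.normal(0, 1, size=(30, 2))
b = rng.normal(0, 1, size=(30, 2))       # same distribution as a
c = rng.normal(3, 1, size=(30, 2))       # shifted mean
samples = [a, b, c]
for grouping in [((0,), (1,), (2,)), ((0, 1), (2,)), ((0, 1, 2),)]:
    print(grouping, round(manova_aic(samples, grouping), 1))
```

Grouping a and b together while keeping c separate should attain the smallest AIC here, matching the intuition that AIC trades likelihood against the number of cluster-mean parameters.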
