Similar literature
1.
We consider the use of B-spline nonparametric regression models estimated by the maximum penalized likelihood method for extracting information from data with complex nonlinear structure. Crucial points in B-spline smoothing are the choices of a smoothing parameter and the number of basis functions, for which several selectors have been proposed based on cross-validation and the Akaike information criterion (AIC). It should be noted, however, that AIC is a criterion for evaluating models estimated by the maximum likelihood method, and that it was derived under the assumption that the true distribution belongs to the specified parametric model. In this paper we derive information criteria for evaluating B-spline nonparametric regression models estimated by the maximum penalized likelihood method in the context of generalized linear models under model misspecification. We use Monte Carlo experiments and real data examples to examine the properties of our criteria, including various selectors proposed previously.

2.
In this Note, we consider the problem of order selection of vector autoregressive moving-average (VARMA) models under the assumption that the errors are uncorrelated, but not necessarily independent. These models are called weak VARMA, in contrast to the standard VARMA models, also called strong VARMA models, in which the error terms are supposed to be iid. This selection is based on minimizing an information criterion, especially that introduced by Akaike. The theoretical foundations of the Akaike information criterion (AIC) are no longer established when the iid assumption on the noise is relaxed. We propose a modified AIC criterion, which may be very different from the standard AIC criterion.

3.
The selection of a best-subset regression model from a candidate family is a common problem that arises in many analyses. The Akaike information criterion (AIC) and the corrected AIC (\(\text {AIC}_c\)) are frequently used for this purpose. AIC and \(\text {AIC}_c\) are designed to estimate the expected Kullback–Leibler discrepancy. For best-subset selection, both AIC and \(\text {AIC}_c\) are negatively biased, and the use of either criterion will lead to the selection of overfitted models. To correct for this bias, we introduce an “improved” AIC variant, \(\text {AIC}_i\), which has a penalty term evaluated using Monte Carlo simulation. A multistage model selection procedure \(\text {AIC}_{\text {aps}}\), which utilizes \(\text {AIC}_i\), is proposed for best-subset selection. Simulation studies are compiled to compare the performances of the different model selection methods.
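
For orientation, the two baseline criteria named in this abstract can be sketched in a few lines. This is only a minimal illustration of the textbook forms, AIC = -2 log L + 2k and the small-sample correction 2k(k+1)/(n-k-1); it is not the \(\text {AIC}_i\) or \(\text {AIC}_{\text {aps}}\) procedure of the paper, and the function names `aic` and `aicc` are hypothetical.

```python
def aic(log_likelihood: float, k: int) -> float:
    """Textbook AIC: -2 * maximized log-likelihood + 2 * k.

    k is the number of estimated parameters (an assumption of this sketch:
    the caller counts them, including any variance parameter).
    """
    return -2.0 * log_likelihood + 2.0 * k


def aicc(log_likelihood: float, k: int, n: int) -> float:
    """AIC plus the usual small-sample correction 2k(k+1)/(n-k-1)."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    return aic(log_likelihood, k) + 2.0 * k * (k + 1) / (n - k - 1)


if __name__ == "__main__":
    # Maximized log-likelihood -10 with 3 parameters and 20 observations.
    print(aic(-10.0, 3))       # 26.0
    print(aicc(-10.0, 3, 20))  # 27.5
```

The correction term vanishes as n grows with k fixed, so the two criteria differ mainly for small samples.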

4.
It is widely acknowledged that understanding and prioritizing the voice of the customer is a critical step in new product development. In this work, we propose a novel approach to handle missing and incomplete data while combining information from different surveys for prioritizing customer voices. Our new approach comprises the following stages: estimating and representing missing and incomplete data; estimating intervals for the criteria used in analyzing data; mapping data on criteria to a common scale; modeling interval data using interval belief structure; and aggregating evidence and ranking customer voices using the interval evidential reasoning algorithm. We demonstrate our approach using a case study from the automotive domain with a given criteria hierarchy for analyzing data from three different surveys. We propose new optimization formulations for estimating intervals of the criteria used in our case study and logical yet pragmatic transformation functions for mapping criteria values to a common scale.

5.
Several criteria, such as CV, C_p, AIC, CAIC, and MAIC, are used for selecting variables in linear regression models. It might be noted that C_p has been proposed as an estimator of the expected standardized prediction error, although the target risk function of CV might be regarded as the expected prediction error R_PE. On the other hand, the target risk function of AIC, CAIC, and MAIC is the expected log-predictive likelihood. In this paper, we propose a prediction error criterion, PE, which is an estimator of the expected prediction error R_PE. Consequently, it is also a competitor of CV. Results of this study show that PE is an unbiased estimator when the true model is contained in the full model. The property is shown without the assumption of normality. In fact, PE is demonstrated as more faithful for its risk function than CV. The prediction error criterion PE is extended to the multivariate case. Furthermore, using simulations, we examine some peculiarities of all these criteria.

6.
The paper presents an efficient solution to decision problems where direct partial information on the distribution of the states of nature is available, either by observations of previous repetitions of the decision problem or by direct expert judgements. To process this information we use a recent generalization of Walley’s imprecise Dirichlet model, allowing us also to handle incomplete observations or imprecise judgements, including missing data. We derive efficient algorithms and discuss properties of the optimal solutions with respect to several criteria, including Gamma-maximinity and E-admissibility. In the case of precise data and pure actions the former surprisingly leads us to a frequency-based variant of the Hodges–Lehmann criterion, which was developed in classical decision theory as a compromise between Bayesian and minimax procedures.

7.
This paper is concerned with cross-validation (CV) criteria for choice of models, which can be regarded as approximately unbiased estimators for two types of risk functions. One is the AIC type of risk or, equivalently, the expected Kullback-Leibler distance between the distributions of observations under a candidate model and the true model. The other is based on the expected mean squared error of prediction. In this paper we study asymptotic properties of CV criteria for selecting multivariate regression models and growth curve models under the assumption that a candidate model includes the true model. Based on the results, we propose corrected versions which are more nearly unbiased for their risks. Through numerical experiments, some tendencies of the CV criteria are also pointed out.

8.
We consider the model selection problem for ergodic diffusion processes based on sampled data. The adaptive estimators for parameters of drift and diffusion coefficients are used in order to construct Akaike’s information criterion (AIC) type model selection statistics. Asymptotic properties of our proposed criteria are given for three kinds of the adaptive estimators.

9.
In the problem of selecting the explanatory variables in the linear mixed model, we address the derivation of the (unconditional or marginal) Akaike information criterion (AIC) and the conditional AIC (cAIC). The covariance matrices of the random effects and the error terms include unknown parameters like variance components, and the selection procedures proposed in the literature are limited to the cases where the parameters are known or partly unknown. In this paper, AIC and cAIC are extended to the situation where the parameters are completely unknown and they are estimated by the general consistent estimators including the maximum likelihood (ML), the restricted maximum likelihood (REML) and other unbiased estimators. We derive, related to AIC and cAIC, the marginal and the conditional prediction error criteria which select superior models in light of minimizing the prediction errors relative to quadratic loss functions. Finally, numerical performances of the proposed selection procedures are investigated through simulation studies.

10.
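Local influence analysis for multivariate $t$-distributed data   Total citations: 4, self-citations: 0, citations by others: 4
For multivariate $t$-distributed data, carrying out influence analysis directly from the probability density is difficult. By introducing weights that follow a Gamma distribution, this paper represents the distribution as a mixture of particular multivariate normal distributions. On this basis, the weights are treated as missing data and the EM algorithm is introduced, so that local influence analysis can be carried out using the conditional expectation of the complete-data likelihood function. The paper further makes a systematic study of local influence analysis under the weighted perturbation model and obtains the corresponding diagnostic statistics; two examples illustrate the effectiveness of the method.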
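
The Gamma-weight representation mentioned in this abstract can be illustrated with a short simulation. The sketch below assumes the standard scale-mixture form: a weight w drawn from a Gamma distribution with shape ν/2 and rate ν/2, and, given w, independent normal coordinates with variance 1/w. Sharing one weight across coordinates gives a multivariate $t$ vector with ν degrees of freedom, zero location, and identity scale matrix. The function name and the identity scale matrix are assumptions of this sketch, not details taken from the paper.

```python
import math
import random


def sample_t_via_gamma_mixture(nu: float, dim: int, rng: random.Random) -> list:
    """Draw one multivariate t vector (identity scale) as a normal scale mixture."""
    # random.gammavariate takes (shape, scale), so rate nu/2 becomes scale 2/nu.
    w = rng.gammavariate(nu / 2.0, 2.0 / nu)
    # Given the latent weight w, every coordinate is N(0, 1/w).
    sd = 1.0 / math.sqrt(w)
    return [rng.gauss(0.0, sd) for _ in range(dim)]


if __name__ == "__main__":
    rng = random.Random(0)
    draws = [sample_t_via_gamma_mixture(10.0, 2, rng) for _ in range(20000)]
    first = [d[0] for d in draws]
    mean = sum(first) / len(first)
    var = sum((x - mean) ** 2 for x in first) / len(first)
    # For nu = 10 the marginal variance should be near nu / (nu - 2) = 1.25.
    print(mean, var)
```

In an EM treatment of the kind the abstract describes, w plays the role of the missing data; the sketch only shows the forward sampling direction.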

11.
We consider the Cauchy problem for the spatially inhomogeneous Landau equation with soft potentials in the case of large (i.e. non-perturbative) initial data. We construct a solution for any bounded, measurable initial data with uniform polynomial decay in the velocity variable, and that satisfies a technical lower bound assumption (but can have vacuum regions). For uniqueness in this weak class, we have to make the additional assumption that the initial data is Hölder continuous. Our hypotheses are much weaker, in terms of regularity and decay, than previous large-data well-posedness results in the literature. We also derive a continuation criterion for our solutions that is, for the case of very soft potentials, an improvement over the previous state of the art.

12.
A modified version of the Akaike information criterion and two modified versions of the Bayesian information criterion are proposed to select the number of principal components and to choose the penalty parameters of penalized splines in a joint model of paired functional data. Numerical results show that, compared with an existing procedure using cross-validation, the procedure based on the information criteria is computationally much faster while giving similar performance.

13.
In longitudinal studies with small samples and incomplete data, multivariate normal-based models continue to be a powerful tool for analysis. These include a broad range of biomedical studies. Testing the assumption of multivariate normality (MVN) is critical. Although many methods are available for testing normality in complete data with large samples, few deal with testing in small samples. For example, Liang et al. (J. Statist. Planning and Inference 86 (2000) 129) propose a projection procedure for testing MVN for complete data with small samples where the sample sizes may be close to the dimension. To our knowledge, no statistical methods for testing MVN in incomplete data with small samples are yet available. This article develops a test procedure in such a setting using multiple imputations and the projection test. To utilize the incomplete data structure in multiple imputation, we adopt a noniterative inverse Bayes formulae (IBF) sampling procedure instead of the iterative Gibbs sampling to generate iid samples. Simulations are performed for both complete and incomplete data when the sample size is less than the dimension. The method is illustrated with a real study on an anticancer drug.

14.
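Feng Yu, 《应用概率统计》 (Chinese Journal of Applied Probability and Statistics), 2006, 22(4): 365-380
For exponential family nonlinear mixed-effects models, this paper uses the $Q$-function method (Zhu Hongtu, 2001) to derive several statistics that measure the influence of deleting data. The main idea is to treat the random effects as missing data and to use the EM algorithm to handle the conditional expectation of the complete-data log-likelihood function. A real example shows that the method is effective.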

15.
We develop an approach to tuning of penalized regression variable selection methods by calculating the sparsest estimator contained in a confidence region of a specified level. Because confidence intervals/regions are generally understood, tuning penalized regression methods in this way is intuitive and more easily understood by scientists and practitioners. More importantly, our work shows that tuning to a fixed confidence level often performs better than tuning via the common methods based on Akaike information criterion (AIC), Bayesian information criterion (BIC), or cross-validation (CV) over a wide range of sample sizes and levels of sparsity. Additionally, we prove that by tuning with a sequence of confidence levels converging to one, asymptotic selection consistency is obtained, and with a simple two-stage procedure, an oracle property is achieved. The confidence-region-based tuning parameter is easily calculated using output from existing penalized regression computer packages. Our work also shows how to map any penalty parameter to a corresponding confidence coefficient. This mapping facilitates comparisons of tuning parameter selection methods such as AIC, BIC, and CV, and reveals that the resulting tuning parameters correspond to confidence levels that are extremely low, and can vary greatly across datasets. Supplemental materials for the article are available online.

16.
We look at the problem of optimizing complex operations with incomplete information where the missing information is revealed indirectly and imperfectly through historical decisions. Incomplete information is characterized by missing data elements governing operational behavior and unknown cost parameters. We assume some of this information may be indirectly captured in historical databases through flows characterizing resource movements. We can use these flows or other quantities derived from these flows as “numerical patterns” in our optimization model to reflect some of the incomplete information. We develop our methodology for representing information in resource allocation models using the concept of pattern regression. We use a popular goodness-of-fit measure known as the Cramer–Von Mises metric as the foundation of our approach. We then use a hybrid approach of solving a cost model with a term known as the “pattern metric” that minimizes the deviations of model decisions from observed quantities in a historical database. We present a novel iterative method to solve this problem. Results with real-world data from a large freight railroad are presented.

17.
Exploring incomplete data using visualization techniques   Total citations: 1, self-citations: 0, citations by others: 1
Visualization of incomplete data allows one to explore the data and the structure of the missing values simultaneously. This is helpful for learning about the distribution of the incomplete information in the data, and for identifying possible structures of the missing values and their relation to the available information. The main goal of this contribution is to stress the importance of exploring missing values using visualization methods and to present a collection of such visualization techniques for incomplete data, all of which are implemented in the R package VIM. With such functionality provided for this widely used statistical environment, visualization of missing values, imputation, and data analysis can all be done from within R without the need for additional software.

18.
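In variable selection with penalty functions, the choice of the tuning parameter is a key issue. Unfortunately, in most of the literature the method for choosing the tuning parameter is rather vague, relies largely on experience, and lacks a systematic theoretical basis. Based on a panel data model with random effects, this paper proposes a penalized cross-validation (PCV) criterion for selecting the tuning parameter of the adaptive LASSO in quantile regression, and discusses and compares its performance with that of other tuning parameter selection criteria. Simulations at different quantiles show that when the residual E comes from a peaked or heavy-tailed distribution, the criterion estimates the model parameters better, especially at high and low quantiles. At other quantiles, PCV performs slightly worse than the Schwarz information criterion but clearly better than the Akaike information criterion and the cross-validation criterion. In terms of the accuracy of variable selection, the criterion is also more effective than the Schwarz information criterion, the Akaike information criterion, and others. Finally, panel data on several macroeconomic indicators for the regions of China are modeled and analyzed, demonstrating the performance of the penalized cross-validation criterion and yielding the regression relationships among the macroeconomic indicators at different quantiles.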

19.
Bootstrapping Log Likelihood and EIC, an Extension of AIC   Total citations: 1, self-citations: 0, citations by others: 1
Akaike (1973, 2nd International Symposium on Information Theory, 267-281, Akademiai Kiado, Budapest) proposed AIC as an estimate of the expected log likelihood to evaluate the goodness of models fitted to a given set of data. The introduction of AIC has greatly widened the range of application of statistical methods. However, its limitation is that it can be applied only to cases where the parameter estimation is performed by the maximum likelihood method. The derivation of AIC is based on the assessment of the effect of data fluctuation through the asymptotic normality of the MLE. In this paper we propose a new information criterion, EIC, which is constructed by employing the bootstrap method to simulate the data fluctuation. The new information criterion, EIC, is regarded as an extension of AIC. The performance of EIC is demonstrated by some numerical examples.
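
The idea of replacing the analytic bias term of AIC with a bootstrap estimate can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the form EIC = -2 log L(x; θ̂) + 2 b*, where b* averages log L(x*; θ̂*) - log L(x; θ̂*) over bootstrap resamples x* with refitted parameters θ̂*, and it uses a normal model fitted by maximum likelihood purely for convenience. All function names are hypothetical.

```python
import math
import random


def normal_loglik(data, mu, var):
    """Log-likelihood of iid N(mu, var) observations."""
    n = len(data)
    ss = sum((x - mu) ** 2 for x in data)
    return -0.5 * n * math.log(2.0 * math.pi * var) - ss / (2.0 * var)


def normal_mle(data):
    """Maximum likelihood estimates (mean, variance with divisor n)."""
    n = len(data)
    mu = sum(data) / n
    var = sum((x - mu) ** 2 for x in data) / n
    return mu, var


def eic_normal(data, n_boot=200, seed=0):
    """Return (EIC, bootstrap bias estimate) for the normal model."""
    rng = random.Random(seed)
    mu_hat, var_hat = normal_mle(data)
    ll = normal_loglik(data, mu_hat, var_hat)
    total = 0.0
    for _ in range(n_boot):
        # Resample with replacement and refit on the resample.
        boot = [rng.choice(data) for _ in data]
        mu_b, var_b = normal_mle(boot)
        # Optimism: fit evaluated on the resample minus on the original data.
        total += normal_loglik(boot, mu_b, var_b) - normal_loglik(data, mu_b, var_b)
    bias = total / n_boot
    return -2.0 * ll + 2.0 * bias, bias


if __name__ == "__main__":
    rng = random.Random(1)
    sample = [rng.gauss(0.0, 1.0) for _ in range(40)]
    eic, bias = eic_normal(sample)
    print(eic, bias)
```

For this two-parameter model the bootstrap bias estimate can be compared with the AIC penalty of 2, which is what makes the criterion readable as an extension of AIC. The sketch does not guard against a degenerate resample with zero variance, which is vanishingly unlikely for continuous data of this size.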

20.
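Xu Kai, He Daojiang, 《数学学报》 (Acta Mathematica Sinica, Chinese Series), 2016, 59(6): 783-794
Under the assumption that the missing-data mechanism is ignorable, the maximum likelihood estimators and unbiased estimators of the Cholesky decompositions of the covariance matrix and the precision matrix are derived for a conditionally independent normal model with monotone missing data. By introducing a special class of transformation groups and working under a more general loss, the best equivariant estimators are obtained. This shows that the maximum likelihood estimators and the unbiased estimators are inadmissible. Finally, numerical simulations verify the validity of the results.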
