Similar Documents
20 similar documents found
1.
This paper deals with bias correction of the cross-validation (CV) criterion used to estimate the predictive Kullback-Leibler information. A bias-corrected CV criterion is proposed by replacing the ordinary maximum likelihood estimator with the maximizer of an adjusted log-likelihood function. The adjustment is slight and simple, yet the improvement in bias is substantial: the bias of the ordinary CV criterion is O(n⁻¹), while that of the bias-corrected CV criterion is O(n⁻²). Numerical experiments verify that the proposed criterion has smaller bias than the AIC, TIC, EIC, and the ordinary CV criterion.
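A minimal sketch of how an ordinary CV criterion of this kind can be computed, assuming an i.i.d. normal model; the paper's correction would additionally replace each leave-one-out MLE with an adjusted-likelihood maximizer, which is not reproduced here. All names are illustrative.

```python
import numpy as np
from scipy.stats import norm

def loo_cv_criterion(x):
    """Ordinary leave-one-out CV criterion for an i.i.d. normal model:
    -2 * sum_i log f(x_i; theta_hat_{(-i)}), where theta_hat_{(-i)} is
    the MLE computed with the i-th observation deleted."""
    ll = 0.0
    for i in range(len(x)):
        xi = np.delete(x, i)
        mu, sigma = xi.mean(), xi.std()          # MLE on the reduced sample
        ll += norm.logpdf(x[i], loc=mu, scale=sigma)
    return -2.0 * ll

rng = np.random.default_rng(0)
x = rng.normal(size=50)
k = 2                                            # parameters: mu and sigma
mu, sigma = x.mean(), x.std()
aic = -2.0 * norm.logpdf(x, mu, sigma).sum() + 2 * k
print("LOO-CV criterion:", loo_cv_criterion(x), " AIC:", aic)
```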

2.
The generalized information criterion (GIC) proposed by Rao and Wu [A strongly consistent procedure for model selection in a regression problem, Biometrika 76 (1989) 369-374] generalizes both Akaike's information criterion (AIC) and the Bayesian information criterion (BIC). In this paper, we extend the GIC to select linear mixed-effects models, which are widely applied in the analysis of longitudinal data. A procedure for selecting fixed effects and random effects based on the extended GIC is provided, and the asymptotic behavior of the method for selecting fixed effects is studied. We prove that, under mild conditions, the selection procedure is asymptotically loss efficient regardless of whether a true model exists, and consistent if a true model exists. A simulation study empirically evaluates the performance of the extended GIC procedure: when the signal-to-noise ratio is moderate or high, the percentage of correctly chosen fixed effects is close to one in finite samples, whereas the procedure performs relatively poorly when used to select random effects.
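For intuition, a sketch of the GIC family's generic form, illustrated on ordinary least squares rather than the paper's mixed-effects setting; the `gic` function and the subset search below are hypothetical illustrations, not the authors' code.

```python
import numpy as np
from itertools import combinations

def gic(y, X, lam):
    """Generalized information criterion for a Gaussian linear model:
    n*log(RSS/n) + lam*k (up to constants).  lam = 2 recovers AIC and
    lam = log(n) recovers BIC; other growth rates give Rao-Wu criteria."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = ((y - X @ beta) ** 2).sum()
    return n * np.log(rss / n) + lam * k

rng = np.random.default_rng(1)
n = 200
X_full = rng.normal(size=(n, 5))
y = X_full[:, :2] @ np.array([1.5, -1.0]) + rng.normal(size=n)

# Score every non-empty subset of the 5 candidate columns.
subsets = (s for r in range(1, 6) for s in combinations(range(5), r))
best = min(subsets, key=lambda s: gic(y, X_full[:, list(s)], np.log(n)))
print("selected columns:", best)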

3.
In a structural measurement error model the structural quasi-score (SQS) estimator is based on the distribution of the latent regressor variable. If this distribution is misspecified, the SQS estimator is (asymptotically) biased. Two types of misspecification are considered, both assuming that the statistician erroneously adopts a normal distribution as the model for the regressor distribution. In the first type, the true model is a mixture of normal distributions clustering around a single normal distribution; in the second type, the true distribution is a normal distribution admixed with a second normal distribution of low weight. In both cases the bias, of course, tends to zero as the size of the misspecification tends to zero. However, in the first case the bias goes to zero in a flat way, so that small deviations from the true model lead to negligible bias, whereas in the second case the bias is noticeable even for small deviations from the true model.

4.
In this paper, an information-based criterion is proposed for carrying out change point analysis and variable selection simultaneously in linear models with a possible change point. Under weak conditions, this criterion is shown to be strongly consistent in the sense that, with probability one, it chooses the smallest true model for large n. Its byproducts include strongly consistent estimates of the regression coefficients regardless of whether a change point is present; when a change point exists, they also include a strongly consistent estimate of the change point parameter. In addition, an algorithm is given that significantly reduces the computation time needed by the proposed criterion at the same precision. Results from a simulation study are also presented.
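A toy version of a criterion of this kind, assuming a Gaussian linear model and a BIC-style penalty; the paper's criterion also performs variable selection jointly, which this sketch omits, and the penalty constant here is an assumption.

```python
import numpy as np

def criterion(y, X, tau, lam):
    """Information-type criterion for a linear model with a change point
    at tau: segmentwise RSS plus a penalty on the parameter count.
    tau = None encodes the no-change-point model."""
    n, k = X.shape
    if tau is None:
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        rss, n_par = ((y - X @ beta) ** 2).sum(), k
    else:
        rss, n_par = 0.0, 2 * k + 1
        for seg in (slice(0, tau), slice(tau, n)):
            b, *_ = np.linalg.lstsq(X[seg], y[seg], rcond=None)
            rss += ((y[seg] - X[seg] @ b) ** 2).sum()
    return n * np.log(rss / n) + lam * n_par

rng = np.random.default_rng(2)
n = 300
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([0.0, 1.0]) + rng.normal(size=n)
y[150:] += 2.0                                   # mean shift after t = 150

cands = [None] + list(range(20, n - 20))
best = min(cands, key=lambda t: criterion(y, X, t, np.log(n)))
print("estimated change point:", best)
```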

5.
We propose a parametric model for a bivariate stable Lévy process based on a Lévy copula as the dependence model, and estimate the parameters of the full bivariate model by maximum likelihood. As the observation scheme we assume that all jumps larger than some ε>0 are observed, and we base the statistical analysis on the resulting compound Poisson process. We derive the Fisher information matrix and prove asymptotic normality of all estimates as the truncation point ε→0. A simulation study investigates the loss of efficiency caused by the truncation.

6.
In this paper we address the problem of estimating θ1 when two observations Y1 and Y2, with means θ1 and θ2 respectively, are available and |θ1−θ2|≤c for a known constant c. Clearly Y2 contains information about θ1. We show how the so-called weighted likelihood function may be used to generate a class of estimators that exploit that information, and discuss how the weights in the weighted likelihood may be selected to trade bias for precision and thus use the information effectively. In particular, we consider adaptively weighted likelihood estimators, where the weights are selected using the data; one approach selects the weights in accord with Akaike's entropy maximization criterion. We describe several estimators obtained in this way. The maximum likelihood estimator is investigated as a competitor, along with a Bayes estimator, a class of robust Bayes estimators and, when c is sufficiently small, a minimax estimator, and we assess their properties both numerically and theoretically. Finally, we show how all of these estimators may be viewed as adaptively weighted likelihood estimators. Indeed, an overriding theme of the paper is that the adaptively weighted likelihood method provides a powerful extension of its classical counterpart.
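A minimal illustration, assuming Y1 and Y2 are independent unit-variance normal observations: the weighted likelihood estimator then reduces to a weighted mean, and the simulation shows the bias-for-precision trade as the weight on Y2 grows. Weight values are illustrative, not the paper's adaptive choice.

```python
import numpy as np

def wle(y1, y2, lam):
    """Weighted likelihood estimator of theta1 for two normal means:
    maximizing log f(y1; t) + lam * log f(y2; t) gives a weighted mean."""
    return (y1 + lam * y2) / (1.0 + lam)

rng = np.random.default_rng(3)
theta1, theta2, reps = 0.0, 0.3, 100_000         # |theta1 - theta2| <= c = 0.3
y1 = theta1 + rng.normal(size=reps)
y2 = theta2 + rng.normal(size=reps)

for lam in (0.0, 0.5, 1.0):                      # lam = 0 is the plain MLE y1
    est = wle(y1, y2, lam)
    print(f"lam={lam}: bias={est.mean() - theta1:+.3f}, "
          f"MSE={((est - theta1) ** 2).mean():.3f}")
```

With lam=0.5 the estimator accepts a small bias (at most c/3 here) in exchange for a variance reduction of almost one half, which is exactly the trade the weights are meant to negotiate.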

7.
This paper is concerned with testing a generalized multivariate linear hypothesis for the mean in the growth curve model (GMANOVA). Our interest is the case in which the number of observed points p is relatively large compared to the sample size N. Asymptotic expansions of the non-null distributions of the likelihood ratio criterion, Lawley-Hotelling's trace criterion and Bartlett-Nanda-Pillai's trace criterion are derived under the asymptotic framework in which N and p go to infinity together with p/N→c∈(0,1). We also confirm, both theoretically and numerically, that Rothenberg's condition on the magnitude of the asymptotic powers of the three tests remains valid when p is relatively large.
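For reference, the three classical criteria can be computed from the hypothesis and error sums-of-squares-and-products matrices; this generic MANOVA sketch ignores the GMANOVA-specific structure and the high-dimensional corrections studied in the paper.

```python
import numpy as np

def manova_criteria(Sh, Se):
    """Wilks' likelihood ratio, Lawley-Hotelling trace and
    Bartlett-Nanda-Pillai trace, from the hypothesis SSCP matrix Sh
    and the error SSCP matrix Se."""
    wilks = np.linalg.det(Se) / np.linalg.det(Se + Sh)
    lawley_hotelling = np.trace(Sh @ np.linalg.inv(Se))
    pillai = np.trace(Sh @ np.linalg.inv(Sh + Se))
    return wilks, lawley_hotelling, pillai

rng = np.random.default_rng(4)
p, q, m = 5, 3, 40                               # p variables; q, m deg. of freedom
B = rng.normal(size=(q, p)); Sh = B.T @ B        # rank-q hypothesis matrix
E = rng.normal(size=(m, p)); Se = E.T @ E        # full-rank error matrix
print(manova_criteria(Sh, Se))
```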

8.
This paper considers estimation of the mean vector θ of a p-variate normal distribution with unknown covariance matrix Σ when it is suspected that, for a p×r known matrix B, the hypothesis θ=Bη with η∈R^r may hold. We consider empirical Bayes estimators that include (i) the unrestricted unbiased estimator (UE), namely the sample mean vector, (ii) the restricted estimator (RE), obtained when the hypothesis θ=Bη holds, (iii) the preliminary test estimator (PTE), (iv) the James-Stein estimator (JSE), and (v) the positive-rule Stein estimator (PRSE). The biases and the risks under squared loss are evaluated for all five estimators and compared. The numerical computations show that the PRSE is the best of the five even when the hypothesis θ=Bη is true.
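A sketch of the RE, JSE and PRSE under the simplifying assumption of a known identity covariance (the paper treats unknown Σ); the shrinkage toward the subspace {θ=Bη} follows the standard James-Stein form.

```python
import numpy as np

def stein_estimators(y, B):
    """Shrinkage toward the subspace {theta = B @ eta}, identity covariance.
    Returns the restricted, James-Stein and positive-part Stein estimates."""
    p, r = B.shape
    P = B @ np.linalg.solve(B.T @ B, B.T)        # projection onto col(B)
    re = P @ y                                   # restricted estimator
    resid = y - re
    shrink = 1.0 - (p - r - 2) / (resid @ resid)
    js = re + shrink * resid                     # James-Stein
    prs = re + max(shrink, 0.0) * resid          # positive-rule Stein
    return re, js, prs

rng = np.random.default_rng(5)
p, r = 10, 2
B = rng.normal(size=(p, r))
theta = B @ np.array([1.0, -2.0])                # the hypothesis holds here
y = theta + rng.normal(size=p)
print(stein_estimators(y, B))
```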

9.
Model identification and discrimination are two major statistical challenges. In this paper we consider a set of models Mk for factorial experiments with parameters representing the general mean, the main effects, and only k of all the two-factor interactions. We consider the class D of all fractional factorial plans with the same number of runs that can identify all models in Mk, i.e., that have full estimation capacity. The fractional factorial plans in D with full estimation capacity for k≥2 are able to discriminate between models in Mu for u≤k*, where k*=k/2 when k is even and k*=(k-1)/2 when k is odd. We obtain fractional factorial plans in D satisfying the six optimality criterion functions AD, AT, AMCR, GD, GT, and GMCR for 2^m factorial experiments with m=4 and 5. Both single-stage and multi-stage (hierarchical) designs are given, and some results on the estimation capacity of a fractional factorial plan for identifying models in Mk are presented. Our designs D4.1 and D10 stand out relative to the designs given in Li and Nachtsheim [Model-robust factorial designs, Technometrics 42(4) (2000) 345-352] for m=4 and 5 with respect to the criterion functions AD, AT, AMCR, GD, GT, and GMCR. Our design D4.2 stands out relative to the Li-Nachtsheim design for m=4 with respect to the four criterion functions AT, AMCR, GT, and GMCR, while the Li-Nachtsheim design for m=4 stands out relative to D4.2 with respect to AD and GD. Our design D14 has full estimation capacity for k=5, whereas the twelve-run Li-Nachtsheim design does not.

10.
Recent advances in the transformation model have made it possible to use this model for analyzing a variety of censored survival data. For inference on the regression parameters, semiparametric procedures based on the normal approximation are available, but their accuracy can be quite low under heavy censoring. In this paper, we apply an empirical likelihood ratio method and derive its limiting distribution via U-statistics. We obtain confidence regions for the regression parameters and compare the proposed method with the normal approximation based method in terms of coverage probability. The simulation results demonstrate that the proposed empirical likelihood method substantially overcomes the under-coverage problem and outperforms the normal approximation based method. The proposed method is illustrated with a real data example. Finally, our method applies to general U-statistic type estimating equations.
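The empirical likelihood machinery, shown here in its simplest form (a confidence statement for a mean, after Owen) rather than for the paper's U-statistic estimating equations; function names are illustrative.

```python
import numpy as np
from scipy.optimize import brentq

def el_logratio(x, mu):
    """-2 log empirical likelihood ratio for the mean mu.  The Lagrange
    multiplier lam solves sum((x_i-mu)/(1+lam*(x_i-mu))) = 0; the implied
    weights are p_i = 1/(n*(1+lam*(x_i-mu)))."""
    z = x - mu                                   # mu must lie inside the data range
    lo = (-1.0 + 1e-10) / z.max()                # keep 1 + lam*z_i > 0 for all i
    hi = (-1.0 + 1e-10) / z.min()
    lam = brentq(lambda l: np.sum(z / (1.0 + l * z)), lo, hi)
    return 2.0 * np.sum(np.log(1.0 + lam * z))

rng = np.random.default_rng(6)
x = rng.exponential(size=80)
# A 95% EL confidence region is {mu : el_logratio(x, mu) <= chi2_{1, 0.95}}.
print("-2 log ELR at the true mean 1.0:", el_logratio(x, 1.0))
```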

11.
As a useful tool in functional data analysis, the functional linear regression model has become increasingly common and has been studied extensively in recent years. In this paper, we consider a sparse functional linear regression model in which the coefficient function is generated by a finite number of basis functions in an expansion. The model does not specify how many or which basis functions enter, so it is unlike a typical parametric model in which the predictor variables are pre-specified. We study a general framework yielding procedures that successfully identify the basis functions entering the model and estimate the resulting regression coefficients in one step. We adopt the idea of variable selection in linear regression, adding a weighted L1 penalty to the traditional least squares criterion. We show that the procedures in our general framework are consistent in the sense of selecting the model correctly, and that they enjoy the oracle property: the resulting estimators of the coefficient function asymptotically share the properties of the oracle estimator, which uses knowledge of the underlying model. We investigate and compare several methods within the general framework via a simulation study, and apply them to the Canadian weather data.
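A rough sketch of the one-step selection-and-estimation idea, assuming the functional predictor is reduced to basis scores and using a plain (rather than weighted) L1 penalty via scikit-learn; the basis choice, grid and tuning value are all illustrative.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(7)
n, m, K = 150, 100, 12                           # curves, grid points, basis size
t = np.linspace(0.0, 1.0, m)

# Sine basis; the true coefficient function uses only 2 of the K terms.
basis = np.array([np.sin((k + 1) * np.pi * t) for k in range(K)])   # K x m
beta_true = np.zeros(K); beta_true[[0, 3]] = [2.0, -1.0]

X_curves = rng.normal(size=(n, m))               # discretized functional predictors
# Z[i, k] approximates the integral of X_i(t) * basis_k(t) dt on the grid.
Z = X_curves @ basis.T / m
y = Z @ beta_true + 0.05 * rng.normal(size=n)

fit = Lasso(alpha=0.001).fit(Z, y)               # L1-penalized least squares
print("selected basis functions:", np.nonzero(fit.coef_)[0])
```

The paper's weighted penalty could be emulated by rescaling the columns of Z with data-driven weights before the fit.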

12.
In this paper, we propose a new methodology for PCA in high-dimension, low-sample-size (HDLSS) data situations. The idea is to estimate eigenvalues via the singular values of a cross data matrix. We establish consistency of the eigenvalue estimates, as well as their limiting distribution, when the dimension d and the sample size n both grow to infinity in such a way that n is much smaller than d. We apply the methodology to estimating PC directions and PC scores in HDLSS settings, and use the results in a mixture model to classify a dataset into two clusters. We demonstrate the performance of the new methodology using HDLSS data from a microarray study of prostate cancer.
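A sketch in the spirit of the cross-data-matrix idea: split the sample in two, form the cross data matrix, and read eigenvalue estimates off its singular values. The exact centering and scaling conventions below are assumptions.

```python
import numpy as np

rng = np.random.default_rng(8)
d, n = 2000, 40                                  # HDLSS: d >> n
spikes = np.array([25.0, 10.0])                  # two spiked eigenvalues
U = np.linalg.qr(rng.normal(size=(d, 2)))[0]     # orthonormal spike directions
X = U * np.sqrt(spikes) @ rng.normal(size=(2, n)) + rng.normal(size=(d, n))

# Split the sample and form the (centered) cross data matrix.
X1, X2 = X[:, : n // 2], X[:, n // 2 :]
X1 = X1 - X1.mean(axis=1, keepdims=True)
X2 = X2 - X2.mean(axis=1, keepdims=True)
n1, n2 = X1.shape[1], X2.shape[1]
Sd = X1.T @ X2 / np.sqrt((n1 - 1) * (n2 - 1))    # n1 x n2 cross data matrix

sv = np.linalg.svd(Sd, compute_uv=False)
print("leading singular values (spiked-eigenvalue estimates):", sv[:3])
```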

13.
We find the asymptotic distribution of the OLS estimator of the parameters β and ρ in the mixed spatial model with exogenous regressors Yn=Xnβ+ρWnYn+Vn. The exogenous regressors may be bounded or growing, like polynomial trends. The assumption on the spatial matrix Wn is appropriate when each economic agent is influenced by many others. The error term is a short-memory linear process. The key finding is that in general the asymptotic distribution contains both linear and quadratic forms in standard normal variables and is not normal.
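A small simulation of the model and the OLS estimator in question, assuming a dense row-normalized Wn (every agent influenced equally by all others) and a trend regressor; the parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(9)
n, beta, rho = 400, 1.0, 0.4

# Dense row-normalized spatial weights: each agent influenced by all others.
W = np.full((n, n), 1.0 / (n - 1))
np.fill_diagonal(W, 0.0)

X = np.column_stack([np.ones(n), np.arange(n) / n])   # growing (trend) regressor
V = rng.normal(size=n)                                # i.i.d. stands in for a
Y = np.linalg.solve(np.eye(n) - rho * W, X @ [0.5, beta] + V)  # linear process

# OLS of Y on (X, W @ Y): the estimator whose asymptotics the paper derives.
Z = np.column_stack([X, W @ Y])
coef, *_ = np.linalg.lstsq(Z, Y, rcond=None)
print("OLS estimates (intercept, beta, rho):", coef)
```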

14.
Robust Bayesian analysis is concerned with the problem of making decisions about some future observation or an unknown parameter when the prior distribution belongs to a class Γ instead of being specified exactly. In this paper, the problem of robust Bayesian prediction and estimation under a squared log error loss function is considered. We find the posterior regret Γ-minimax predictor and estimator in a general class of distributions. Furthermore, we construct the conditional Γ-minimax, most stable and least sensitive prediction and estimation in a gamma model. A prequential analysis is carried out using a simulation study to compare these predictors.
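Under squared log error loss L(θ,d)=(log d − log θ)², the Bayes estimate is exp(E[log θ | data]); a sketch for a gamma posterior, where this expectation is available through the digamma function. The prior and data below are illustrative.

```python
import numpy as np
from scipy.special import digamma

def bayes_sle_gamma(a, b):
    """Bayes estimate of theta under squared log error loss when the
    posterior is Gamma(shape=a, rate=b): the loss is minimized at
    d = exp(E[log theta | data]) = exp(psi(a)) / b."""
    return np.exp(digamma(a)) / b

# Exponential likelihood with a Gamma(a0, b0) prior: after observing
# x_1..x_n the posterior is Gamma(a0 + n, b0 + sum(x)).
a0, b0 = 2.0, 1.0
x = np.array([0.8, 1.3, 0.5, 2.1])
a_post, b_post = a0 + len(x), b0 + x.sum()
print("posterior mean:", a_post / b_post,
      " squared-log-error Bayes estimate:", bayes_sle_gamma(a_post, b_post))
```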

15.
Let F be a distribution function in the maximal domain of attraction of the Gumbel distribution such that −log(1−F(x))=x^{1/θ}L(x) for a positive real number θ, called the Weibull tail index, and a slowly varying function L. It is well known that estimators of θ have a very slow rate of convergence. We establish a sharp optimality result in the minimax sense, treating L as an infinite-dimensional nuisance parameter belonging to some functional class. We also establish the rate-optimal asymptotic property of a data-driven choice of the sample fraction used for estimation.
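One Hill-type estimator of the Weibull tail index, in the spirit of Girard (2004); the exact normalization below is quoted from memory and should be treated as an assumption, not as the paper's estimator.

```python
import numpy as np

def weibull_tail_index(x, k):
    """Hill-type estimate of the Weibull tail index from the k largest
    order statistics: mean log-spacings of the data divided by mean
    log-log spacings of the corresponding ranks (assumed form)."""
    xs = np.sort(x)
    n = len(xs)
    top = np.log(xs[n - k:]) - np.log(xs[n - k - 1])
    ranks = np.log(np.log(n / np.arange(1, k + 1))) - np.log(np.log(n / k))
    return top.sum() / ranks.sum()

rng = np.random.default_rng(10)
x = rng.exponential(size=2000)                   # exponential tail: theta = 1
print("theta-hat:", weibull_tail_index(x, k=100))
```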

16.
In this paper we aim to construct adaptive confidence regions for the direction of ξ in semiparametric models of the form Y=G(ξ^TX,ε), where G(⋅) is an unknown link function, ε is an independent error, and ξ is a p_n×1 vector. To recover the direction of ξ, we first propose an inverse regression approach that works regardless of the link function G(⋅); to construct a data-driven confidence region for the direction, we employ the empirical likelihood method. Unlike much of the existing literature, we need not estimate the link function G(⋅) or its derivative. When p_n remains fixed, the empirical likelihood ratio without bias correction is asymptotically standard chi-square. Moreover, the asymptotic normality of the empirical likelihood ratio holds even when the dimension p_n grows at the rate p_n=o(n^{1/4}), where n is the sample size. Simulation studies assess the performance of our proposal, and a real data set is analyzed for further illustration.
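A generic inverse-regression step of the kind the paper builds on, sketched here as sliced inverse regression (SIR); the paper's own proposal and its empirical likelihood calibration are not reproduced.

```python
import numpy as np

def sir_direction(X, y, n_slices=10):
    """Sliced inverse regression estimate of the direction of xi in
    Y = G(xi'X, eps): top eigenvector of the between-slice covariance
    of standardized X, mapped back to the original scale."""
    n, p = X.shape
    mu, S = X.mean(axis=0), np.cov(X, rowvar=False)
    R = np.linalg.cholesky(np.linalg.inv(S))     # R @ R.T = S^{-1}
    Z = (X - mu) @ R                             # standardized predictors
    M = np.zeros((p, p))
    for idx in np.array_split(np.argsort(y), n_slices):
        m = Z[idx].mean(axis=0)                  # slice mean of Z
        M += len(idx) / n * np.outer(m, m)
    w = np.linalg.eigh(M)[1][:, -1]              # leading eigenvector
    xi = R @ w
    return xi / np.linalg.norm(xi)               # direction only (sign-free)

rng = np.random.default_rng(11)
n, p = 500, 6
X = rng.normal(size=(n, p))
xi = np.zeros(p); xi[0] = 1.0
y = (X @ xi) ** 3 + 0.5 * rng.normal(size=n)     # unknown monotone link G
print("estimated direction:", np.round(sir_direction(X, y), 2))
```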

17.
Parallel to Cox's [JRSS B 34 (1972) 187-230] proportional hazards model, generalized logistic models have been discussed by Anderson [Bull. Int. Statist. Inst. 48 (1979) 35-53] and others. The essential assumption is that the ratio of the two densities has a known parametric form. A nice property of this model is that it relates naturally to the logistic regression model for categorical data. In astronomical, demographic, epidemiological, and other studies the variable of interest is often truncated by an associated variable. This paper studies generalized logistic models for the two-sample truncated data problem, where the ratio of the two lifetime densities is assumed to have the form exp{α+φ(x;β)}. Here φ is a known function of x and β, and the baseline density is unspecified. We develop a semiparametric maximum likelihood method for the case where the two samples have a common truncation distribution, and show that inferences for β do not depend on the nonparametric components. We also derive an iterative algorithm that maximizes the semiparametric likelihood in the general case where different truncation distributions are allowed, and discuss how to check the goodness of fit of the generalized logistic model. The methods are illustrated and evaluated using both simulated and real data.
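When φ(x;β)=βx and there is no truncation, the density ratio model can be fitted by ordinary logistic regression of the sample label on x: the slope estimates β while the intercept absorbs α and the sampling fractions. A minimal sketch of that simpler untruncated case, not the paper's semiparametric likelihood.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(12)
# Two samples whose density ratio is exp{alpha + beta*x}:
# N(1,1) over N(0,1) gives beta = 1 (and alpha = -1/2).
x0 = rng.normal(0.0, 1.0, size=1000)
x1 = rng.normal(1.0, 1.0, size=1000)

x = np.concatenate([x0, x1]).reshape(-1, 1)
label = np.concatenate([np.zeros(len(x0)), np.ones(len(x1))])

fit = LogisticRegression(C=1e6).fit(x, label)    # near-unpenalized fit
print("beta-hat:", fit.coef_[0][0])              # should be close to 1
```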

18.
Model selection by means of the predictive least squares (PLS) principle has been thoroughly studied in the context of regression model selection and autoregressive (AR) model order estimation. We introduce a new criterion based on sequentially minimized squared deviations, which are smaller than both the usual least squares residuals and the squared prediction errors used in PLS. We also prove that our criterion has a probabilistic interpretation as a model that is asymptotically optimal within the given class of distributions, attaining the lower bound on the logarithmic prediction errors given by the so-called stochastic complexity and approximated by BIC. This holds when the regressor (design) matrix is non-random or determined by the observed data, as in AR models. The advantages of the criterion include that it can be evaluated efficiently and exactly, without asymptotic approximations, and, importantly, that it has no adjustable hyper-parameters, which makes it applicable to both small and large amounts of data.
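A direct (non-recursive) sketch contrasting the PLS criterion with sequentially minimized squared deviations; the paper evaluates the latter efficiently, whereas this illustration simply refits by least squares at every step. Names are illustrative.

```python
import numpy as np

def pls_and_smsd(y, X, m0):
    """Predictive least squares (PLS) and sequentially minimized squared
    deviations (SMSD) for a linear model, accumulated from time m0 on.
    PLS uses the fit from data up to t-1; SMSD refits including time t,
    so each SMSD term is no larger than the corresponding PLS term."""
    pls = smsd = 0.0
    for t in range(m0, len(y)):
        b_prev, *_ = np.linalg.lstsq(X[:t], y[:t], rcond=None)
        pls += (y[t] - X[t] @ b_prev) ** 2        # honest one-step prediction
        b_curr, *_ = np.linalg.lstsq(X[: t + 1], y[: t + 1], rcond=None)
        smsd += (y[t] - X[t] @ b_curr) ** 2       # deviation after refitting
    return pls, smsd

rng = np.random.default_rng(13)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0]) + rng.normal(size=n)
print(pls_and_smsd(y, X, m0=10))                  # SMSD <= PLS, term by term
```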

19.
In this paper we develop an econometric method for consistent variable selection in the context of a linear factor model with observable factors for panels of large dimensions. The subset of factors that best fits the data is determined sequentially. First, a partial R² rule is used to show the existence of an optimal ordering of the candidate variables. Second, we show that for a given ordering of the regressors, the number of factors can be consistently estimated using the Bayes information criterion, whereas the Akaike information criterion asymptotically overfits the model. The theory is established under an approximate factor structure that allows limited cross-section and serial dependence in the idiosyncratic term. Simulations show that the proposed two-step selection technique has good finite sample properties; the likelihood of selecting the correct specification increases with the number of cross-sections both asymptotically and in small samples. Moreover, the proposed variable selection method is computationally attractive: for K potential candidate factors, the search requires only 2K regressions, compared to 2^K for an exhaustive search.
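A toy two-step selection, assuming a marginal R² ranking in place of the paper's partial R² rule; this keeps the total work at 2K regressions (K for the ordering, K along the BIC path). All function names are illustrative.

```python
import numpy as np

def order_by_r2(y, F):
    """Rank candidate factors by marginal R^2: one simple regression
    (equivalently, one squared correlation) per candidate."""
    scores = [np.corrcoef(F[:, j], y)[0, 1] ** 2 for j in range(F.shape[1])]
    return list(np.argsort(scores)[::-1])

def select_num_factors(y, F, order):
    """BIC along the nested path implied by the ordering; its argmin
    estimates how many of the ordered factors to keep."""
    n = len(y)
    bics = []
    for k in range(1, len(order) + 1):
        Z = np.column_stack([np.ones(n), F[:, order[:k]]])
        b, *_ = np.linalg.lstsq(Z, y, rcond=None)
        rss = ((y - Z @ b) ** 2).sum()
        bics.append(n * np.log(rss / n) + (k + 1) * np.log(n))
    return int(np.argmin(bics)) + 1

rng = np.random.default_rng(14)
n, K = 300, 6
F = rng.normal(size=(n, K))
y = F[:, 0] - 0.8 * F[:, 2] + rng.normal(size=n)  # two true factors

order = order_by_r2(y, F)                         # K regressions
print("ordering:", order, "| kept:", select_num_factors(y, F, order))
```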

20.
In some applications of kernel density estimation the data may have a highly non-uniform distribution and be confined to a compact region. Standard fixed-bandwidth density estimates can struggle to cope with the spatially variable smoothing requirements and suffer excessive bias at the boundary of the region. While adaptive kernel estimators can address the first of these issues, the study of boundary kernel methods has been restricted to the fixed-bandwidth context. We propose a new linear boundary kernel that reduces the asymptotic order of the bias of an adaptive density estimator at the boundary and is simple to implement even on an irregular boundary. The properties of this adaptive boundary kernel are examined theoretically. In particular, we demonstrate that the asymptotic performance of the density estimator is maintained when the adaptive bandwidth is defined in terms of a pilot estimate rather than the true underlying density. We examine the finite-sample performance numerically through analysis of simulated and real data sets.
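A sketch of the adaptive-bandwidth half of the construction, using Abramson-style bandwidths driven by a pilot estimate; the paper's linear boundary kernel, which corrects the remaining bias at the boundary, is not reproduced here.

```python
import numpy as np

def gauss_kde(x_eval, data, h):
    """Fixed-bandwidth Gaussian KDE (used here as the pilot estimate)."""
    u = (x_eval[:, None] - data[None, :]) / h
    return np.exp(-0.5 * u**2).sum(axis=1) / (len(data) * h * np.sqrt(2 * np.pi))

def adaptive_kde(x_eval, data, h0):
    """Abramson-type adaptive KDE: the bandwidth at data point i is
    h0 * (pilot(x_i)/g)^(-1/2), with g the geometric mean of the pilot,
    so sparse regions get wider kernels and dense regions narrower ones."""
    pilot = gauss_kde(data, data, h0)
    g = np.exp(np.log(pilot).mean())
    h_i = h0 * np.sqrt(g / pilot)                 # per-point bandwidths
    u = (x_eval[:, None] - data[None, :]) / h_i[None, :]
    return (np.exp(-0.5 * u**2) / h_i[None, :]).sum(axis=1) / (
        len(data) * np.sqrt(2 * np.pi))

rng = np.random.default_rng(15)
data = rng.exponential(size=500)                  # highly non-uniform, x >= 0
grid = np.linspace(0.01, 5.0, 200)
print(adaptive_kde(grid, data, h0=0.3)[:5])
```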
