Similar literature
 20 similar articles found (search time: 31 ms)
1.
In this paper we develop an econometric method for consistent variable selection in the context of a linear factor model with observable factors for panels of large dimensions. The subset of factors that best fits the data is determined sequentially. First, a partial $R^2$ rule is used to show the existence of an optimal ordering of the candidate variables. Second, we show that for a given order of the regressors, the number of factors can be consistently estimated using the Bayes information criterion, whereas the Akaike information criterion will asymptotically lead to overfitting of the model. The theory is established under an approximate factor structure which allows for limited cross-section and serial dependence in the idiosyncratic term. Simulations show that the proposed two-step selection technique has good finite sample properties. The likelihood of selecting the correct specification increases with the number of cross-sections both asymptotically and in small samples. Moreover, the proposed variable selection method is computationally attractive. For $K$ potential candidate factors, the search requires only $2K$ regressions compared to $2^K$ for an exhaustive search.
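The computational claim at the end of this abstract can be made concrete with a few lines of arithmetic. The sketch below only compares the two regression counts (the $2K$ figure quoted for the two-step search, and one regression per subset for an exhaustive search); the function name `regression_counts` is a hypothetical helper, not something taken from the paper.

```python
def regression_counts(k: int) -> tuple[int, int]:
    """Return (two-step count, exhaustive count) for k candidate factors.

    Assumption: the two-step figure is the 2K quoted in the abstract;
    the exhaustive figure counts every subset of the k candidates.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    two_step = 2 * k      # ordering step plus selection step
    exhaustive = 2 ** k   # one regression per subset of candidates
    return two_step, exhaustive


if __name__ == "__main__":
    for k in (5, 10, 20):
        two_step, exhaustive = regression_counts(k)
        print(f"K={k}: two-step={two_step}, exhaustive={exhaustive}")
```

Already at 20 candidate factors the exhaustive search needs over a million regressions, while the sequential search needs 40.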

2.
Recently, we proposed variants as a statistical model for treating ambiguity. If data are extracted from an object by a machine, then the machine might not be able to give a unique safe answer due to ambiguity about the correct interpretation of the object. On the other hand, the machine is often able to produce a finite number of alternative feature sets (of the same object) that contain the desired one. We call these feature sets variants of the object. Data sets that contain variants may be analyzed by means of statistical methods and all chapters of multivariate analysis can be seen in the light of variants. In this communication, we focus on point estimation in the presence of variants and outliers. Besides robust parameter estimation, this task also requires selecting the regular objects and their valid feature sets (regular variants). We determine the mixed MAP-ML estimator for a model with spurious variants and outliers as well as estimators based on the integrated likelihood. We also prove asymptotic results which show that the estimators are nearly consistent. The problem of variant selection turns out to be computationally hard; therefore, we also design algorithms for efficient approximation. We finally demonstrate their efficacy with a simulated data set and a real data set from genetics.

3.
Summary: The problem of selecting a subset of $k$ gamma populations which includes the “best” population, i.e. the one with the largest value of the scale parameter, is studied as a multiple decision problem. The shape parameters of the gamma distributions are assumed to be known and equal for all the $k$ populations. Based on a common number of observations from each population, a procedure $R$ is defined which selects a subset which is never empty, small in size and yet large enough to guarantee with preassigned probability that it includes the best population regardless of the true unknown values of the scale parameters $\theta_i$. Expressions for the probability of a correct selection using $R$ are derived and it is shown that for the case of a common number of observations the infimum of this probability is identical with the probability integral of the ratio of the maximum of $k-1$ independent gamma chance variables to another independent gamma chance variable, all with the same value of the other parameter. Formulas are obtained for the expected number of populations retained in the selected subset and it is shown that this function attains its maximum when the parameters $\theta_i$ are equal. Some other properties of the procedure are proved. Tables of constants $b$ which are necessary to carry out the procedure are appended. These constants are reciprocals of the upper percentage points of $F_{\max}$, the largest of several correlated F statistics. The distribution of this statistic is obtained. This work was supported in part by Office of Naval Research Contract Nonr-225 (53) at Stanford University. Reproduction in whole or in part is permitted for any purpose of the United States Government.
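The abstract does not spell out the form of procedure $R$, so the following is only a sketch under a stated assumption: that a population is retained whenever its sample mean is at least $b$ times the largest sample mean, with $0 < b \le 1$ taken from the appended tables. The function name `select_subset` and the numbers in the usage example are hypothetical.

```python
from typing import Sequence


def select_subset(sample_means: Sequence[float], b: float) -> list[int]:
    """Return indices of populations retained by a ratio-type subset rule.

    Assumed rule (not quoted from the paper): keep population i when
    mean_i >= b * max(mean). Because the population with the largest
    sample mean always satisfies this, the subset is never empty.
    """
    if not sample_means:
        raise ValueError("at least one population is required")
    if not 0.0 < b <= 1.0:
        raise ValueError("b must lie in (0, 1]")
    threshold = b * max(sample_means)
    return [i for i, mean in enumerate(sample_means) if mean >= threshold]


if __name__ == "__main__":
    # Hypothetical sample means from four gamma populations.
    print(select_subset([2.0, 3.5, 4.0, 1.0], b=0.8))
```

A smaller $b$ lowers the threshold, so the retained subset grows and the probability of including the best population rises; this is the trade-off that the tabulated constants control.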

4.
This paper considers the nonparametric M-estimator in a nonlinear cointegration type model. The local time density argument, which was developed by Phillips and Park (1998) [6] and Wang and Phillips (2009) [9], is applied to establish the asymptotic theory for the nonparametric M-estimator. The weak consistency and the asymptotic distribution of the proposed estimator are established under mild conditions. Meanwhile, the asymptotic distribution of the local least squares estimator and the local least absolute distance estimator can be obtained as applications of our main results. Furthermore, an iterated procedure for obtaining the nonparametric M-estimator and a cross-validation bandwidth selection method are discussed, and some numerical examples are provided to show that the proposed methods perform well in the finite sample case.

5.
For normally distributed data from the $k$ populations with $m \times m$ covariance matrices $\Sigma_1,\dots,\Sigma_k$, we test the hypothesis $H: \Sigma_1 = \cdots = \Sigma_k$ vs the alternative $A \neq H$ when the number of observations $N_i$, $i=1,\dots,k$, from each population is less than or equal to the dimension $m$, $N_i \le m$, $i=1,\dots,k$. Two tests are proposed and compared with two other tests proposed in the literature. These tests, however, do not require that $N_i \le m$, and thus can be used in all situations, including when the likelihood ratio test is available. The asymptotic distributions of the test statistics are given, and the power compared by simulations with other test statistics proposed in the literature. The proposed tests perform well and better in several cases than the other two tests available in the literature.

6.
A nonparametric test of the mutual independence between many numerical random vectors is proposed. This test is based on a characterization of mutual independence defined from probabilities of half-spaces in a combinatorial formula of Möbius. As such, it is a natural generalization of tests of independence between univariate random variables using the empirical distribution function. If the number of vectors is $p$ and there are $n$ observations, the test is defined from a collection of processes $R_{n,A}$, where $A$ is a subset of $\{1,\dots,p\}$ of cardinality $|A|>1$, which are asymptotically independent and Gaussian. Without the assumption that each vector is one-dimensional with a continuous cumulative distribution function, no test of independence can be distribution free. The critical values of the proposed test are thus computed with the bootstrap, which is shown to be consistent. Another similar test, with the same asymptotic properties, for the serial independence of a multivariate stationary sequence is also proposed. The proposed test works when some or all of the marginal distributions are singular with respect to Lebesgue measure. Moreover, in singular cases described in Section 4, the test inherits useful invariance properties from the general affine invariance property.

7.
For two multivariate normal populations with unequal covariance matrices, a procedure is developed for testing the equality of the mean vectors based on the concept of generalized p-values. The generalized p-values we have developed are functions of the sufficient statistics. The computation of the generalized p-values is discussed and illustrated with an example. Numerical results show that one of our generalized p-value tests has a type I error probability not exceeding the nominal level. A formula involving only a finite number of chi-square random variables is provided for computing this generalized p-value. The formula is useful in a Bayesian solution as well. The problem of constructing a confidence region for the difference between the mean vectors is also addressed using the concept of generalized confidence regions. Finally, using the generalized p-value approach, a solution is developed for the heteroscedastic MANOVA problem.

8.
Understanding and modeling dependence structures for multivariate extreme values are of interest in a number of application areas. One of the well-known approaches is to investigate the Pickands dependence function. In the bivariate setting, there exist several estimators of the Pickands dependence function which assume known marginal distributions [J. Pickands, Multivariate extreme value distributions, Bull. Internat. Statist. Inst., 49 (1981) 859-878; P. Deheuvels, On the limiting behavior of the Pickands estimator for bivariate extreme-value distributions, Statist. Probab. Lett. 12 (1991) 429-439; P. Hall, N. Tajvidi, Distribution and dependence-function estimation for bivariate extreme-value distributions, Bernoulli 6 (2000) 835-844; P. Capéraà, A.-L. Fougères, C. Genest, A nonparametric estimation procedure for bivariate extreme value copulas, Biometrika 84 (1997) 567-577]. In this paper, we generalize the bivariate results to $p$-variate multivariate extreme value distributions with $p \ge 2$. We demonstrate that the proposed estimators are consistent and asymptotically normal, and that they have excellent small sample behavior.

9.
This paper analyzes the problem of using the sample covariance matrix to detect the presence of clustering in p-variate data in the special case when the component covariance matrices are known up to a constant multiplier. For the case of testing one population against a mixture of two populations, tests are derived and shown to be optimal in a certain sense. Some of their distribution properties are derived exactly. Some remarks on the extensions of these tests to mixtures of kp populations are included. The paper is essentially a formal treatment (in a special case) of some well-known procedures. The methods used in deriving the distribution properties are applicable to a variety of other situations involving mixtures.

10.
We consider a difference based ridge regression estimator and a Liu type estimator of the regression parameters in the partial linear semiparametric regression model, $y = X\beta + f + \varepsilon$. Both estimators are analyzed and compared in the sense of mean-squared error. We consider the case of independent errors with equal variance and give conditions under which the proposed estimators are superior to the unbiased difference based estimation technique. We extend the results to account for heteroscedasticity and autocovariance in the error terms. Finally, we illustrate the performance of these estimators with an application to the determinants of electricity consumption in Germany.

11.
The main objective of this paper is the calculation and the comparative study of two general measures of multivariate kurtosis, namely Mardia's measure $\beta_{2,p}$ and Song's measure $S(f)$. In this context, general formulas for the said measures are derived for the broad family of the elliptically contoured symmetric distributions and also for specific members of this family, like the multivariate t-distribution, the multivariate Pearson type II, the multivariate Pearson type VII, the multivariate symmetric Kotz type distribution and the uniform distribution in the unit sphere. Analytic expressions for computing Shannon and Rényi entropies are obtained under the elliptic family. The behaviour of Mardia's and Song's measures, their similarities and differences, possible interpretations and uses in practice are investigated by comparing them in specific members of the elliptic family of multivariate distributions. An empirical estimator of Song's measure is moreover proposed and its asymptotic distribution is investigated under the elliptic family of multivariate distributions.
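For orientation, the standard definition of Mardia's measure is recalled below; it is not quoted from the paper. Here $\mu$ and $\Sigma$ denote the mean vector and covariance matrix of the $p$-dimensional random vector $X$. In the multivariate normal case the quadratic form follows a chi-square distribution with $p$ degrees of freedom, whose second moment is $2p + p^2$, which gives the familiar benchmark value.

```latex
\beta_{2,p} = \mathbb{E}\!\left[\left\{(X-\mu)^{\top}\Sigma^{-1}(X-\mu)\right\}^{2}\right],
\qquad
\beta_{2,p} = p(p+2) \quad \text{when } X \sim N_p(\mu,\Sigma).
```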

12.
An exhaustive search as required for traditional variable selection methods is impractical in high dimensional statistical modeling. Thus, to conduct variable selection, various forms of penalized estimators with good statistical and computational properties have been proposed during the past two decades. The attractive properties of these shrinkage and selection estimators, however, depend critically on the size of regularization which controls model complexity. In this paper, we consider the problem of consistent tuning parameter selection in high dimensional sparse linear regression where the dimension of the predictor vector is larger than the size of the sample. First, we propose a family of high dimensional Bayesian Information Criteria (HBIC), and then investigate the selection consistency, extending the results of the extended Bayesian Information Criterion (EBIC) in Chen and Chen (2008) to ultra-high dimensional situations. Second, we develop a two-step procedure, the SIS+AENET, to conduct variable selection in $p>n$ situations. The consistency of tuning parameter selection is established under fairly mild technical conditions. Simulation studies are presented to confirm theoretical findings, and an empirical example is given to illustrate its use on internet advertising data.

13.
The generalized information criterion (GIC) proposed by Rao and Wu [A strongly consistent procedure for model selection in a regression problem, Biometrika 76 (1989) 369-374] is a generalization of Akaike's information criterion (AIC) and the Bayesian information criterion (BIC). In this paper, we extend the GIC to select linear mixed-effects models that are widely applied in analyzing longitudinal data. The procedure for selecting fixed effects and random effects based on the extended GIC is provided. The asymptotic behavior of the extended GIC method for selecting fixed effects is studied. We prove that, under mild conditions, the selection procedure is asymptotically loss efficient regardless of the existence of a true model and consistent if a true model exists. A simulation study is carried out to empirically evaluate the performance of the extended GIC procedure. The results from the simulation show that if the signal-to-noise ratio is moderate or high, the percentages of choosing the correct fixed effects by the GIC procedure are close to one for finite samples, while the procedure performs relatively poorly when it is used to select random effects.
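How a single penalty parameter can cover both AIC and BIC is easy to see in code. The sketch below uses the common likelihood form, $-2\log L + \lambda \cdot (\text{number of parameters})$, which is an assumption made for illustration and not the exact criterion of Rao and Wu or of the extended GIC in this paper. The function names and the candidate models are hypothetical.

```python
import math
from typing import Sequence


def gic(log_lik: float, n_params: int, penalty: float) -> float:
    """Penalized criterion: -2 * log-likelihood + penalty * parameter count."""
    return -2.0 * log_lik + penalty * n_params


def select_model(candidates: Sequence[tuple[float, int]], penalty: float) -> int:
    """Return the index of the candidate (log_lik, n_params) with smallest GIC."""
    scores = [gic(log_lik, k, penalty) for log_lik, k in candidates]
    return scores.index(min(scores))


if __name__ == "__main__":
    # Hypothetical fits: (maximized log-likelihood, number of parameters).
    fits = [(-120.0, 2), (-116.5, 3), (-114.0, 5)]
    n_obs = 100
    print("AIC choice:", select_model(fits, penalty=2.0))              # penalty 2
    print("BIC choice:", select_model(fits, penalty=math.log(n_obs)))  # penalty log n
```

With these made-up numbers the AIC penalty picks the largest model while the heavier BIC penalty picks the middle one, which illustrates why the choice of penalty matters for consistency.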

14.
Let $X=\{X(s)\}_{s\in S}$ be an almost surely continuous stochastic process ($S$ a compact subset of $\mathbb{R}^d$) in the domain of attraction of some max-stable process, with index function constant over $S$. We study the tail distribution of $\int_S X(s)\,ds$, which turns out to be of Generalized Pareto type with an extra ‘spatial’ parameter (the areal coefficient from Coles and Tawn (1996) [3]). Moreover, we discuss how to estimate the tail probability $P(\int_S X(s)\,ds > x)$ for some high value $x$, based on independent and identically distributed copies of $X$. Along the way, we also give an estimator for the areal coefficient. We prove consistency of the proposed estimators. Our methods are applied to the total rainfall in the North Holland area; i.e. $X$ represents in this case the rainfall over the region for which we have observations, and its integral amounts to total rainfall. The paper has two main purposes: first, to formalize and justify the results of Coles and Tawn (1996) [3]; second, to treat the problem in a non-parametric way as opposed to their fully parametric methods.

15.
In this paper we are interested in studying multiple decision procedures for $k$ ($k \ge 2$) populations which are themselves unknown but which are assumed to belong to a restricted family. We propose to study a selection procedure for distributions associated with these populations which are convex-ordered with respect to a specified distribution $G$, assuming that there exists a best one. The procedure described here is based on a statistic which is a linear function of the first $r$ order statistics and which reduces to the total life statistic when $G$ is exponential. The infimum of the probability of a correct selection and an asymptotic expression for this probability are obtained using the subset selection approach. Some other properties of this procedure are discussed. Asymptotic relative efficiencies of this rule with respect to some selection procedures proposed by Barlow and Gupta [3] for the star-ordered distributions and by Gupta [8] for the gamma populations with known shape parameters are obtained. A selection procedure for selecting the best population using the indifference zone approach is also studied. This research was supported by the Office of Naval Research Contract N00014-75-C-0455 at Purdue University. Reproduction in whole or in part is permitted for any purpose of the United States Government. Ming-Wei Lu is now at the Department of Vital and Health Statistics, Michigan.

16.
Homogeneity tests based on several progressively Type-II censored samples   (Total citations: 2; self-citations: 0; citations by others: 2)
In this paper, we discuss the problem of testing the homogeneity of several populations when the available data are progressively Type-II censored. Defining for each sample a univariate counting process, we can modify all the methods that were developed during the last two decades (see e.g. [P.K. Andersen, Ø. Borgan, R. Gill, N. Keiding, Statistical Models Based on Counting Processes, Springer, New York, 1993]) for use in this problem. An important aspect of these tests is that they are based on either linear or non-linear functionals of a discrepancy process (DP) based on the comparison of the cumulative hazard rate (chr) estimated from each sample with the chr estimated from the whole sample (viz., the aggregation of all the samples), leading to either linear tests or non-linear tests. Both these kinds of tests suffer from some serious drawbacks. For example, it is difficult to extend non-linear tests to the $K$-sample situation when $K \ge 3$. For this reason, we propose here a new class of non-linear tests, based on a chi-square type functional of the DP, that can be applied to the $K$-sample problem for any $K \ge 2$.

17.
In this paper, an information-based criterion is proposed for carrying out change point analysis and variable selection simultaneously in linear models with a possible change point. Under some weak conditions, this criterion is shown to be strongly consistent in the sense that with probability one, it chooses the smallest true model for large $n$. Its byproducts include strongly consistent estimates of the regression coefficients regardless of whether there is a change point. If there is a change point, its byproducts also include a strongly consistent estimate of the change point parameter. In addition, an algorithm is given that significantly reduces the computation time needed by the proposed criterion for the same precision. Results from a simulation study are also presented.

18.
Recent advances in the transformation model have made it possible to use this model for analyzing a variety of censored survival data. For inference on the regression parameters, there are semiparametric procedures based on the normal approximation. However, the accuracy of such procedures can be quite low when censoring is heavy. In this paper, we apply an empirical likelihood ratio method and derive its limiting distribution via U-statistics. We obtain confidence regions for the regression parameters and compare the proposed method with the normal approximation based method in terms of coverage probability. The simulation results demonstrate that the proposed empirical likelihood method overcomes the under-coverage problem substantially and outperforms the normal approximation based method. The proposed method is illustrated with a real data example. Finally, our method can be applied to general U-statistic type estimating equations.

19.
We consider the problem of testing whether the common mean of a single $n$-vector of multivariate normal random variables with known variance and unknown common correlation $\rho$ is zero. We derive the standardized likelihood ratio test for known $\rho$ and explore different ways of proceeding with $\rho$ unknown. We evaluate the performance of the standardized statistic where $\rho$ is replaced with an estimate of $\rho$ and determine the critical value $c_n$ that controls the type I error rate for the least favorable $\rho$ in $[0,1]$. The constant $c_n$ increases with $n$ and this procedure has pathological behavior if $\rho$ depends on $n$ and $\rho_n$ converges to zero at a certain rate. As an alternate approach, we replace $\rho$ with the upper limit of a $(1-\beta_n)$ confidence interval chosen so that $c_n=c$ for all $n$. We determine $\beta_n$ so that the type I error rate is exactly controlled for all $\rho$ in $[0,1]$. We also investigate a simpler approach where we bound the type I error rate. The former method performs well for all $n$ while the less powerful bound method may be useful in some settings as a simple approach. The proposed tests can be used in different applications, including within-cluster resampling and combining exchangeable p-values.
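The role of $\rho$ in the known-correlation case can be sketched numerically. Under exchangeable normal observations with variance $\sigma^2$ and common correlation $\rho$, the sample mean has variance $\sigma^2\{1+(n-1)\rho\}/n$, so a natural standardized statistic divides the sample mean by the square root of that quantity. The abstract does not display its statistic, so treating this as the standardized test statistic is an assumption; the function name `standardized_mean` is hypothetical.

```python
import math
from typing import Sequence


def standardized_mean(xs: Sequence[float], rho: float, sigma: float = 1.0) -> float:
    """Sample mean divided by its standard deviation under equicorrelation.

    Assumes Var(x_i) = sigma**2 and Corr(x_i, x_j) = rho for i != j, with
    rho in [0, 1], so that Var(mean) = sigma**2 * (1 + (n - 1) * rho) / n.
    """
    n = len(xs)
    if n == 0:
        raise ValueError("xs must not be empty")
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")
    var_of_mean = sigma ** 2 * (1.0 + (n - 1) * rho) / n
    return (sum(xs) / n) / math.sqrt(var_of_mean)


if __name__ == "__main__":
    data = [1.0, 1.0, 1.0, 1.0]
    for rho in (0.0, 0.5, 1.0):
        print(rho, standardized_mean(data, rho))
```

As $\rho$ grows toward one, the observations carry less independent information and the same sample mean yields a smaller standardized value, which is why the least favorable $\rho$ drives the critical value.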

20.
In this paper we address the problem of estimating $\theta_1$ when $Y_1$ and $Y_2$ are observed and $|\theta_1 - \theta_2| \le c$ for a known constant $c$. Clearly $Y_2$ contains information about $\theta_1$. We show how the so-called weighted likelihood function may be used to generate a class of estimators that exploit that information. We discuss how the weights in the weighted likelihood may be selected to successfully trade bias for precision and thus use the information effectively. In particular, we consider adaptively weighted likelihood estimators where the weights are selected using the data. One approach selects such weights in accord with Akaike's entropy maximization criterion. We describe several estimators obtained in this way. However, the maximum likelihood estimator is investigated as a competitor to these estimators along with a Bayes estimator, a class of robust Bayes estimators and (when $c$ is sufficiently small) a minimax estimator. Moreover, we assess their properties both numerically and theoretically. Finally, we show how all of these estimators may be viewed as adaptively weighted likelihood estimators. In fact, an over-riding theme of the paper is that the adaptively weighted likelihood method provides a powerful extension of its classical counterpart.
