1.

Selecting mixed-effects models based on a generalized information criterion

Wenji Pu Xu-Feng Niu 《Journal of multivariate analysis》2006,97(3):733-758

The generalized information criterion (GIC) proposed by Rao and Wu [A strongly consistent procedure for model selection in a regression problem, Biometrika 76 (1989) 369-374] is a generalization of Akaike's information criterion (AIC) and the Bayesian information criterion (BIC). In this paper, we extend the GIC to select linear mixed-effects models that are widely applied in analyzing longitudinal data. The procedure for selecting fixed effects and random effects based on the extended GIC is provided. The asymptotic behavior of the extended GIC method for selecting fixed effects is studied. We prove that, under mild conditions, the selection procedure is asymptotically loss efficient regardless of the existence of a true model and consistent if a true model exists. A simulation study is carried out to empirically evaluate the performance of the extended GIC procedure. The results from the simulation show that if the signal-to-noise ratio is moderate or high, the percentages of choosing the correct fixed effects by the GIC procedure are close to one for finite samples, while the procedure performs relatively poorly when it is used to select random effects.
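
As a rough illustration of how one penalized criterion can cover both AIC and BIC, the sketch below scores Gaussian linear regression candidates by n·log(RSS/n) plus a per-parameter penalty. It is not the extended GIC for mixed-effects models from the paper: the function names `gic` and `select_model`, the penalty argument `lam`, and the RSS values are all hypothetical. Setting `lam = 2` gives AIC and `lam = log(n)` gives BIC, up to additive constants.

```python
import math


def gic(rss, n, k, lam):
    """Penalized fit: n*log(RSS/n) + lam*k (Gaussian linear model, constants dropped)."""
    return n * math.log(rss / n) + lam * k


def select_model(candidates, n, lam):
    """Return the number of parameters k of the candidate with the smallest score.

    `candidates` maps k (number of regression parameters) to the residual sum of squares.
    """
    return min(candidates, key=lambda k: gic(candidates[k], n, k, lam))


# Hypothetical residual sums of squares for three nested candidate models.
n = 100
candidates = {2: 120.0, 3: 100.0, 4: 97.0}

aic_choice = select_model(candidates, n, lam=2.0)          # AIC-type penalty
bic_choice = select_model(candidates, n, lam=math.log(n))  # BIC-type penalty
print(aic_choice, bic_choice)
```

With these made-up numbers the lighter AIC-type penalty keeps the fourth parameter while the heavier BIC-type penalty drops it, which is the usual qualitative difference between the two special cases.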

2.

A strongly consistent information criterion for linear model selection based on M-estimation

Y. Wu M. M. Zen 《Probability Theory and Related Fields》1999,113(4):599-625

In this paper, a linear model selection procedure based on M-estimation is proposed, which includes many classical model selection criteria as its special cases. It is shown that the proposed criterion is strongly consistent under certain mild conditions, for instance without assuming normality of the distribution of the random errors. The results from a simulation study are also presented. Received: 13 October 1997 / Revised version: 10 August 1998

3.

Consistent variable selection in large panels when factors are observable

Rachida Ouysse 《Journal of multivariate analysis》2006,97(4):946-984

In this paper we develop an econometric method for consistent variable selection in the context of a linear factor model with observable factors for panels of large dimensions. The subset of factors that best fits the data is sequentially determined. Firstly, a partial R² rule is used to show the existence of an optimal ordering of the candidate variables. Secondly, we show that for a given order of the regressors, the number of factors can be consistently estimated using the Bayes information criterion. The Akaike information criterion will asymptotically lead to overfitting of the model. The theory is established under approximate factor structure which allows for limited cross-section and serial dependence in the idiosyncratic term. Simulations show that the proposed two-step selection technique has good finite sample properties. The likelihood of selecting the correct specification increases with the number of cross-sections both asymptotically and in small samples. Moreover, the proposed variable selection method is computationally attractive. For K potential candidate factors, the search requires only 2K regressions compared to 2^K for an exhaustive search.
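
To give a feel for the computational claim at the end of this abstract, the short sketch below tabulates the two regression counts quoted there, 2K for the sequential search and 2^K for an exhaustive search. The helper name `regression_counts` is hypothetical, and the code only does the arithmetic; it does not implement the selection method itself.

```python
def regression_counts(K):
    """Return (sequential, exhaustive) regression counts for K candidate factors."""
    return 2 * K, 2 ** K


for K in (5, 10, 20):
    sequential, exhaustive = regression_counts(K)
    print(f"K={K}: sequential={sequential}, exhaustive={exhaustive}")
```

The gap grows quickly: at K = 20 the sequential search needs 40 regressions while the exhaustive search needs 1,048,576.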

4.

Parameter estimation under ambiguity and contamination with the spurious model

María Teresa Gallegos 《Journal of multivariate analysis》2006,97(5):1221-1250

Recently, we proposed variants as a statistical model for treating ambiguity. If data are extracted from an object with a machine then it might not be able to give a unique safe answer due to ambiguity about the correct interpretation of the object. On the other hand, the machine is often able to produce a finite number of alternative feature sets (of the same object) that contain the desired one. We call these feature sets variants of the object. Data sets that contain variants may be analyzed by means of statistical methods and all chapters of multivariate analysis can be seen in the light of variants. In this communication, we focus on point estimation in the presence of variants and outliers. Besides robust parameter estimation, this task also requires selecting the regular objects and their valid feature sets (regular variants). We determine the mixed MAP-ML estimator for a model with spurious variants and outliers as well as estimators based on the integrated likelihood. We also prove asymptotic results which show that the estimators are nearly consistent. The problem of variant selection turns out to be computationally hard; therefore, we also design algorithms for efficient approximation. We finally demonstrate their efficacy with a simulated data set and a real data set from genetics.

5.

Detection of a change point based on local-likelihood

Jib Huh 《Journal of multivariate analysis》2010,101(7):1681-1700

In this paper, we consider the regression function or its νth derivative in generalized linear models which may have a change/discontinuity point at an unknown location. The location and its jump size are estimated with the local polynomial fits based on one-sided kernel weighted local-likelihood functions. Asymptotic distributions of the proposed estimators of location and jump size are established. The finite-sample performances of the proposed estimators with practical aspects are illustrated by simulated and beetle mortality examples.

6.

Change-point estimation for censored regression model (cited by: 1; self-citations: 0; cited by others: 1)

Zhan-feng WANG Yao-hua WU & Lin-cheng ZHAO (Department of Statistics and Finance, University of Science and Technology of China, Hefei, China) 《Science in China, Series A (English edition)》2007,50(1):63-72

In this paper, we consider the change-point estimation in the censored regression model assuming that there exists one change point. A nonparametric estimate of the change-point is proposed and is shown to be strongly consistent. Furthermore, its convergence rate is also obtained.

7.

On kernel method for sliced average variance estimation

Li-Ping Zhu Li-Xing Zhu 《Journal of multivariate analysis》2007,98(5):970-991

In this paper, we use the kernel method to estimate sliced average variance estimation (SAVE) and prove that this estimator is both asymptotically normal and root-n consistent. We use this kernel estimator to provide more insight about the differences between slicing estimation and other sophisticated local smoothing methods. Finally, we suggest a Bayes information criterion (BIC) to estimate the dimensionality of SAVE. Examples and real data are presented for illustrating our method.

8.

Rates of convergence for partitioning and nearest neighbor regression estimates with unbounded data

Michael Kohler 《Journal of multivariate analysis》2006,97(2):311-323

Estimation of regression functions from independent and identically distributed data is considered. The L₂ error with integration with respect to the design measure is used as an error criterion. Usually, in the analysis of the rate of convergence of estimates, boundedness assumptions on X are made in addition to smoothness assumptions on the regression function and moment conditions on Y. In this article we consider partitioning and nearest neighbor estimates and show that by replacing the boundedness assumption on X by a proper moment condition the same rate of convergence can be shown as for bounded data.

9.

Semiparametric inference for transformation models via empirical likelihood

Yichuan Zhao 《Journal of multivariate analysis》2010,101(8):1846-1858

Recent advances in the transformation model have made it possible to use this model for analyzing a variety of censored survival data. For inference on the regression parameters, there are semiparametric procedures based on the normal approximation. However, the accuracy of such procedures can be quite low when the censoring rate is heavy. In this paper, we apply an empirical likelihood ratio method and derive its limiting distribution via U-statistics. We obtain confidence regions for the regression parameters and compare the proposed method with the normal approximation based method in terms of coverage probability. The simulation results demonstrate that the proposed empirical likelihood method overcomes the under-coverage problem substantially and outperforms the normal approximation based method. The proposed method is illustrated with a real data example. Finally, our method can be applied to general U-statistic type estimating equations.

10.

A robust estimator of multivariate location based on projection

Zhang Zhongzhan Li Guoying 《Applied Mathematics: A Journal of Chinese Universities (English edition)》1999,14(2):158-168

This paper presents an estimator of the location vector based on one-dimensional projections of high-dimensional data. The properties of the new estimator, including consistency, asymptotic normality and robustness, are discussed. It is proved that the estimator is not only strongly consistent and asymptotically normal but also has a breakdown point of 1/2 and a bounded influence function.

11.

Offline and online weighted least squares estimation of nonstationary power ARCH processes

Abdelhakim Aknouche Eid M. Al-Eid 《Statistics & probability letters》2011,81(10):1535-1540

This paper proposes two estimation methods based on a weighted least squares criterion for non-(strictly) stationary power ARCH models. The weights are the squared volatilities evaluated at a known value in the parameter space. The first method is adapted for fixed sample size data while the second one allows for online data available in real time. It will be shown that these methods provide consistent and asymptotically Gaussian estimates having asymptotic variance equal to that of the quasi-maximum likelihood estimate (QMLE) regardless of the value of the weighting parameter. Finite-sample performances of the proposed WLS estimates are shown via a simulation study for various sub-classes of power ARCH models.

12.

Bias correction of cross-validation criterion based on Kullback-Leibler information under a general condition

Hirokazu Yanagihara Tetsuji Tonda 《Journal of multivariate analysis》2006,97(9):1965-1975

This paper deals with the bias correction of the cross-validation (CV) criterion to estimate the predictive Kullback-Leibler information. A bias-corrected CV criterion is proposed by replacing the ordinary maximum likelihood estimator with the maximizer of the adjusted log-likelihood function. The adjustment is just slight and simple, but the improvement of the bias is remarkable. The bias of the ordinary CV criterion is O(n^-1), but that of the bias-corrected CV criterion is O(n^-2). We verify that our criterion has smaller bias than the AIC, TIC, EIC and the ordinary CV criterion by numerical experiments.

13.

Direct variable selection for discrimination among several groups

Guy Martial Nkiet 《Journal of multivariate analysis》2012,105(1):151-163

We propose a criterion for variable selection in discriminant analysis. This criterion makes it possible to arrange the variables in decreasing order of adequacy for discrimination, so that the variable selection problem reduces to that of the estimation of a suitable permutation and dimensionality. Then, estimators for these parameters are proposed and the resulting method for selecting variables is shown to be consistent. In a simulation study, we compute proportions of correct classification after variable selection in order to gain understanding of the performance of our proposal and to compare it to existing methods.

14.

Nonparametric tests of independence between random vectors

R. Beran P. Lafaye de Micheaux 《Journal of multivariate analysis》2007,98(9):1805-1824

A nonparametric test of the mutual independence between many numerical random vectors is proposed. This test is based on a characterization of mutual independence defined from probabilities of half-spaces in a combinatorial formula of Möbius. As such, it is a natural generalization of tests of independence between univariate random variables using the empirical distribution function. If the number of vectors is p and there are n observations, the test is defined from a collection of processes R_n,A, where A is a subset of {1,…,p} of cardinality |A|>1, which are asymptotically independent and Gaussian. Without the assumption that each vector is one-dimensional with a continuous cumulative distribution function, no test of independence can be distribution free. The critical values of the proposed test are thus computed with the bootstrap, which is shown to be consistent. Another similar test, with the same asymptotic properties, for the serial independence of a multivariate stationary sequence is also proposed. The proposed test works when some or all of the marginal distributions are singular with respect to Lebesgue measure. Moreover, in singular cases described in Section 4, the test inherits useful invariance properties from the general affine invariance property.

15.

A novel trace test for the mean parameters in a multivariate growth curve model

Jemila S. Hamid Joseph Beyene Dietrich von Rosen 《Journal of multivariate analysis》2011,102(2):238-251

A trace test for the mean parameters of the growth curve model is proposed. It is constructed using the restricted maximum likelihood followed by an estimated likelihood ratio approach. The statistic reduces to the Lawley-Hotelling trace test for the Multivariate Analysis of Variance (MANOVA) models. Our test statistic is, therefore, a natural extension of the classical trace test to GMANOVA models. We show that the distribution of the test under the null hypothesis does not depend on the unknown covariance matrix Σ. We also show that the distributions under the null and alternative hypotheses can be represented as sums of weighted central and non-central chi-square random variables, respectively. Under the null hypothesis, the Satterthwaite approximation is used to get an approximate critical point. A novel Satterthwaite type approximation is proposed to obtain an approximate power. A simulation study is performed to evaluate the performance of our proposed test and numerical examples are provided as illustrations.

16.

Sparse estimation in functional linear regression

Eun Ryung Lee Byeong U. Park 《Journal of multivariate analysis》2012,105(1):1-17

As a useful tool in functional data analysis, the functional linear regression model has become increasingly common and has been studied extensively in recent years. In this paper, we consider a sparse functional linear regression model which is generated by a finite number of basis functions in an expansion of the coefficient function. In this model, we do not specify how many and which basis functions enter the model, so it is unlike a typical parametric model where predictor variables are pre-specified. We study a general framework that gives various procedures which are successful in identifying the basis functions that enter the model, and also in estimating the resulting regression coefficients in one step. We adopt the idea of variable selection in the linear regression setting where one adds a weighted L₁ penalty to the traditional least squares criterion. We show that the procedures in our general framework are consistent in the sense of selecting the model correctly, and that they enjoy the oracle property, meaning that the resulting estimators of the coefficient function have asymptotically the same properties as the oracle estimator which uses knowledge of the underlying model. We investigate and compare several methods within our general framework via a simulation study. We also apply the methods to the Canadian weather data.
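
The weighted L₁ penalty mentioned in this abstract can be made concrete in the simplest possible setting. The sketch below assumes an orthonormal design, so that minimizing (1/2)·Σ(z_j − b_j)² + λ·Σ w_j·|b_j| separates by coordinate and each coefficient is obtained by soft-thresholding. The names `soft_threshold` and `weighted_l1_orthonormal`, the weights, and the input values are hypothetical, and this is not the functional regression procedure studied in the paper.

```python
def soft_threshold(z, t):
    """Minimizer of 0.5*(z - b)**2 + t*abs(b) over b, for a threshold t >= 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def weighted_l1_orthonormal(z, weights, lam):
    """Coordinate-wise solution of the weighted L1 problem under an orthonormal design.

    `z` holds the unpenalized least squares coefficients; coefficient j is
    thresholded at lam * weights[j], so larger weights shrink harder.
    """
    return [soft_threshold(zj, lam * wj) for zj, wj in zip(z, weights)]


# Hypothetical coefficients: the third one carries a larger weight and is set to zero.
estimate = weighted_l1_orthonormal([3.0, -0.5, 1.5], [1.0, 1.0, 2.0], lam=1.0)
print(estimate)
```

Coefficients that fall below their own threshold are set exactly to zero, which is how a penalty of this type performs selection and estimation at the same time.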

17.

Corrected version of AIC for selecting multivariate normal linear regression models in a general nonnormal case

Hirokazu Yanagihara 《Journal of multivariate analysis》2006,97(5):1070-1089

This paper deals with the bias reduction of Akaike information criterion (AIC) for selecting variables in multivariate normal linear regression models when the true distribution of observation is an unknown nonnormal distribution. We propose a corrected version of AIC which is partially constructed by the jackknife method and is adjusted to the exact unbiased estimator of the risk when the candidate model includes the true model. It is pointed out that the influence of nonnormality in the bias of our criterion is smaller than the ones in AIC and TIC. We verify that our criterion is better than the AIC, TIC and EIC by conducting numerical experiments.

18.

Estimation of partial linear error-in-response models with validation data

Qi-Hua Wang 《Annals of the Institute of Statistical Mathematics》2003,55(1):21-39

In this paper, an estimation theory for the partial linear model is developed when there is measurement error in the response and when validation data are available. A semiparametric method with the primary data is used to define two estimators for both the regression parameter and the nonparametric part using the least squares criterion with the help of validation data. The proposed estimators of the parameter are proved to be strongly consistent and asymptotically normal, and the estimators of the nonparametric part are also proved to be strongly consistent and weakly consistent with an optimal convergence rate. Then, the two estimators of the parameter are compared based on their empirical performances. Supported by NNSF of China (No. 10231030, No. 10241001) and a grant to the author for his excellent Ph.D. dissertation work in China.

19.

Conditional and unconditional methods for selecting variables in linear mixed models

Tatsuya Kubokawa 《Journal of multivariate analysis》2011,102(3):641-660

In the problem of selecting the explanatory variables in the linear mixed model, we address the derivation of the (unconditional or marginal) Akaike information criterion (AIC) and the conditional AIC (cAIC). The covariance matrices of the random effects and the error terms include unknown parameters like variance components, and the selection procedures proposed in the literature are limited to the cases where the parameters are known or partly unknown. In this paper, AIC and cAIC are extended to the situation where the parameters are completely unknown and they are estimated by the general consistent estimators including the maximum likelihood (ML), the restricted maximum likelihood (REML) and other unbiased estimators. We derive, related to AIC and cAIC, the marginal and the conditional prediction error criteria which select superior models in light of minimizing the prediction errors relative to quadratic loss functions. Finally, numerical performances of the proposed selection procedures are investigated through simulation studies.

20.

Clusters, outliers, and regression: fixed point clusters

Christian Hennig 《Journal of multivariate analysis》2003,86(1):183-212

Fixed point clustering is a new stochastic approach to cluster analysis. The definition of a single fixed point cluster (FPC) is based on a simple parametric model, but there is no parametric assumption for the whole dataset as opposed to mixture modeling and other approaches. An FPC is defined as a data subset that is exactly the set of non-outliers with respect to its own parameter estimators. This paper concentrates upon the theoretical foundation of FPC analysis as a method for clusterwise linear regression, i.e., the single clusters are modeled as linear regressions with normal errors. In this setup, fixed point clustering is based on an iteratively reweighted estimation with zero weight for all outliers. FPCs are non-hierarchical, but they may overlap and include each other. A specification of the number of clusters is not needed. Consistency results are given for certain mixture models of interest in cluster analysis. Convergence of a fixed point algorithm is shown. Application to a real dataset shows that fixed point clustering can highlight some other interesting features of datasets compared to maximum likelihood methods in the presence of deviations from the usual assumptions of model based cluster analysis.
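
The fixed point idea in this abstract (a subset that equals the set of non-outliers under its own estimates) can be sketched for a single regressor as follows. This is a simplified illustration under stated assumptions, not the algorithm from the paper: the outlier rule "squared residual greater than c times the estimated error variance", the tuning constant `c`, the starting subset, and the toy data are all hypothetical choices.

```python
def fit_line(xs, ys, subset):
    """Least squares fit of y = a + b*x on the indexed subset.

    Assumes at least three points with at least two distinct x values.
    Returns intercept, slope and the residual variance estimate RSS/(n - 2).
    """
    n = len(subset)
    mx = sum(xs[i] for i in subset) / n
    my = sum(ys[i] for i in subset) / n
    sxx = sum((xs[i] - mx) ** 2 for i in subset)
    sxy = sum((xs[i] - mx) * (ys[i] - my) for i in subset)
    b = sxy / sxx
    a = my - b * mx
    rss = sum((ys[i] - a - b * xs[i]) ** 2 for i in subset)
    return a, b, rss / (n - 2)


def fixed_point_cluster(xs, ys, start, c=6.635, max_iter=100):
    """Iterate 'fit on subset, keep the non-outliers' until the subset stops changing.

    Points outside the current subset get zero weight in the fit. Returns the
    fixed point as a set of indices, or None if none is reached.
    """
    subset = set(start)
    for _ in range(max_iter):
        if len(subset) < 3:
            return None  # too few points to estimate the error variance
        a, b, s2 = fit_line(xs, ys, sorted(subset))
        non_outliers = {i for i in range(len(xs))
                        if (ys[i] - a - b * xs[i]) ** 2 <= c * s2}
        if non_outliers == subset:
            return subset  # the subset reproduces itself: a fixed point
        subset = non_outliers
    return None


# Toy data: ten points near y = 2x + 1 (indices 0-9) plus two gross outliers (10, 11).
xs = [float(i) for i in range(10)] + [2.0, 7.0]
ys = [2.0 * i + 1.0 + (0.1 if i % 2 == 0 else -0.1) for i in range(10)] + [30.0, -20.0]

cluster = fixed_point_cluster(xs, ys, start={0, 1, 2})
print(sorted(cluster))
```

Starting from three points on the line, the iteration grows to the ten inliers and then stops, because refitting on those ten points flags exactly the two remaining points as outliers. Different starting subsets can lead to different fixed points, which is consistent with the abstract's remark that FPCs may overlap.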