Similar literature
 20 similar documents found; search time 31 ms
1.
In this paper, we consider semiparametric modeling with multi-indices when neither the response nor the predictors can be directly observed and there are distortions from some multiplicative factors. In contrast to the existing methods in which the response distortion deteriorates estimation efficacy even for a simple linear model, the dimension reduction technique presented in this paper interestingly does not have to account for distortion of the response variable. The observed response can be used directly whether distortion is present or not. The resulting dimension reduction estimators are shown to be consistent and asymptotically normal. The results can be employed to test whether the central dimension reduction subspace has been estimated appropriately and whether the components in the basis directions in the space are significant. Thus, the method provides an alternative for determining the structural dimension of the subspace and for variable selection. A simulation study is carried out to assess the performance of the proposed method. The analysis of a real dataset demonstrates the potential usefulness of distortion removal.

2.
We consider informative dimension reduction for regression problems with random predictors. Based on the conditional specification of the model, we develop a methodology for replacing the predictors with a smaller number of functions of the predictors. We apply the method to the case where the inverse conditional model is in the linear exponential family. For such an inverse model and the usual Normal forward regression model it is shown that, for any number of predictors, the sufficient summary has dimension two or less. In addition, we develop a test of dimensionality. The relationship of our method with the existing dimension reduction theory based on the marginal distribution of the predictors is discussed.

3.
Recent sufficient dimension reduction methodologies in multivariate regression do not have direct application to a categorical predictor. For this, we define the multivariate central partial mean subspace and propose two methodologies to estimate it. The first method uses ordinary least squares. Chi-squared distributed statistics for dimension tests are constructed, and an estimate of the target subspace is consistent and efficient. Moreover, the effects of continuous predictors can be tested without assuming any model. The second method extends Iterative Hessian Transformation to this context. For dimension estimation, permutation tests are used. Simulated and real data examples for illustrating various properties of the proposed methods are presented.

4.
This paper studies how to identify influential observations in the functional linear model in which the predictor is functional and the response is scalar. We measure the effect of a single observation on estimation and prediction when the model is estimated by the principal components method. To that end, three statistics are introduced for measuring the influence of each observation on estimation and prediction of the functional linear model with scalar response. They generalize the measures proposed for the standard regression model by [D.R. Cook, Detection of influential observations in linear regression, Technometrics 19 (1977) 15-18; D. Peña, A new statistic for influence in linear regression, Technometrics 47 (2005) 1-12] respectively. A smoothed bootstrap method is proposed to estimate the quantiles of the influence measures, which allows us to point out which observations have the larger influence on estimation and prediction. The behavior of the three statistics and of the bootstrap-based quantile estimation method is analyzed via a simulation study. Finally, the practical use of the proposed statistics is illustrated by the analysis of a real data example, which shows that the proposed measures are useful for detecting heterogeneity in the functional linear model with scalar response.
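
For orientation, the classical measure that the abstract above generalizes can be sketched for an ordinary simple linear regression. This is not the functional-data statistic of the paper; it is Cook's distance from the cited 1977 reference, computed in plain Python. The function name `cooks_distance` and the toy data are assumptions made for illustration only.

```python
def cooks_distance(x, y):
    """Cook's distance of each observation in the simple regression y = a + b*x."""
    n = len(x)
    p = 2  # number of fitted coefficients: intercept and slope
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(e ** 2 for e in residuals) / (n - p)  # residual variance estimate
    # Leverage (diagonal of the hat matrix) for simple linear regression.
    leverages = [1.0 / n + (xi - x_bar) ** 2 / sxx for xi in x]

    return [
        (e ** 2 / (p * s2)) * h / (1.0 - h) ** 2
        for e, h in zip(residuals, leverages)
    ]


# Illustrative toy data: the last point lies far from the trend of the others.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 10.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 2.0]
distances = cooks_distance(x, y)
most_influential = max(range(len(x)), key=lambda i: distances[i])
```

In this toy data set the last observation combines high leverage with a large residual, so it receives by far the largest distance.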

5.
Robust Bayesian analysis is concerned with the problem of making decisions about some future observation or an unknown parameter, when the prior distribution belongs to a class Γ instead of being specified exactly. In this paper, the problem of robust Bayesian prediction and estimation under a squared log error loss function is considered. We find the posterior regret Γ-minimax predictor and estimator in a general class of distributions. Furthermore, we construct the conditional Γ-minimax, most stable and least sensitive prediction and estimation in a gamma model. A prequential analysis is carried out by using a simulation study to compare these predictors.
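
As background for the loss function named above, the following derivation sketches the ordinary Bayes estimator under squared log error loss for a single fixed prior. It does not reproduce the paper's Γ-minimax results. The gamma posterior with shape α and rate β is an assumed illustration, and ψ denotes the digamma function.

```latex
% Squared log error loss and its posterior expected value
L(\theta, d) = (\ln d - \ln \theta)^2, \qquad
\rho(d) = \mathrm{E}\!\left[(\ln d - \ln \theta)^2 \mid x\right].

% \rho(d) is quadratic in \ln d, so it is minimized at the posterior mean of \ln\theta
\ln d^{*} = \mathrm{E}[\ln \theta \mid x]
\quad\Longrightarrow\quad
d^{*} = \exp\!\left(\mathrm{E}[\ln \theta \mid x]\right).

% Assumed example: \theta \mid x \sim \mathrm{Gamma}(\alpha, \text{rate } \beta)
\mathrm{E}[\ln \theta \mid x] = \psi(\alpha) - \ln \beta
\quad\Longrightarrow\quad
d^{*} = \frac{e^{\psi(\alpha)}}{\beta}.
```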

6.
Admissible prediction problems in finite populations with arbitrary rank under matrix loss function are investigated. For the general random effects linear model, we obtain the necessary and sufficient conditions for a linear predictor of the linearly predictable variable to be admissible in the class of homogeneous linear predictors, the class of all linear predictors, and the class that contains all predictors, respectively. Moreover, we prove that the best linear unbiased predictors (BLUPs) of the population total and the finite population regression coefficient are admissible under different assumptions of superpopulation models respectively.

7.
Minimum average variance estimation (MAVE, Xia et al. (2002) [29]) is an effective dimension reduction method. It requires no strong probabilistic assumptions on the predictors, and can consistently estimate the central mean subspace. It is applicable to a wide range of models, including time series. However, the least squares criterion used in MAVE loses efficiency when the error is not normally distributed. In this article, we propose an adaptive MAVE that adapts to different error distributions. We show that the proposed estimate has the same convergence rate as the original MAVE. An EM algorithm is proposed to implement the new adaptive MAVE. Using both simulation studies and a real data analysis, we demonstrate the superior finite sample performance of the proposed approach over the existing least squares based MAVE when the error distribution is non-normal and the comparable performance when the error is normal.

8.
In this paper, we carry out an in-depth theoretical investigation for inference with missing response and covariate data for general regression models. We assume that the missing data are missing at random (MAR) or missing completely at random (MCAR) throughout. Previous theoretical investigations in the literature have focused only on missing covariates or missing responses, but not both. Here, we consider theoretical properties of the estimates under three different estimation settings: complete case (CC) analysis, a complete response (CR) analysis that involves an analysis of those subjects with only completely observed responses, and the all case (AC) analysis, which is an analysis based on all of the cases. Under each scenario, we derive general expressions for the likelihood and devise estimation schemes based on the EM algorithm. We carry out a theoretical investigation of the three estimation methods in the normal linear model and analytically characterize the loss of information for each method, as well as derive and compare the asymptotic variances for each method assuming the missing data are MAR or MCAR. In addition, a theoretical investigation of bias for the CC method is also carried out. A simulation study and a real dataset are given to illustrate the methodology.

9.
A general methodology for selecting predictors for Gaussian generative classification models is presented. The problem is regarded as a model selection problem. Three different roles for each possible predictor are considered: a variable can be a relevant classification predictor or not, and the irrelevant classification variables can be linearly dependent on a part of the relevant predictors or independent variables. This variable selection model was inspired by a previous work on variable selection in model-based clustering. A BIC-like model selection criterion is proposed. It is optimized through two embedded forward stepwise variable selection algorithms for classification and linear regression. The model identifiability and the consistency of the variable selection criterion are proved. Numerical experiments on simulated and real data sets illustrate the interest of this variable selection methodology. In particular, it is shown that this well-grounded variable selection model can be of great interest to improve the classification performance of the quadratic discriminant analysis in a high-dimensional context.

10.
In this paper we aim to estimate the direction in general single-index models and to select important variables simultaneously when a diverging number of predictors is involved in regressions. Towards this end, we propose the nonconcave penalized inverse regression method. Specifically, the resulting estimation with the SCAD penalty enjoys an oracle property in semi-parametric models even when the dimension, $p_n$, of predictors goes to infinity. Under regularity conditions we also achieve the asymptotic normality when the dimension of the predictor vector goes to infinity at the rate of $p_n = o(n^{1/3})$, where $n$ is the sample size, which enables us to construct confidence intervals/regions for the estimated index. The asymptotic results are augmented by simulations, and illustrated by analysis of an air pollution dataset.
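
The SCAD penalty mentioned above has a simple closed form, sketched here as a plain Python function following the standard definition of Fan and Li (2001). The function name `scad_penalty` and the default `a=3.7` (the value commonly suggested in that literature) are illustrative choices, not details taken from the abstract, and the penalized inverse regression estimator itself is not reproduced.

```python
def scad_penalty(theta, lam, a=3.7):
    """SCAD penalty evaluated at a single coefficient value theta."""
    if lam <= 0 or a <= 2:
        raise ValueError("SCAD requires lam > 0 and a > 2")
    t = abs(theta)
    if t <= lam:
        # Near zero the penalty is linear, like the lasso.
        return lam * t
    if t <= a * lam:
        # Quadratic piece that joins the linear and constant parts smoothly.
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Beyond a*lam the penalty is flat, so large coefficients are not shrunk further.
    return lam * lam * (a + 1) / 2


# The three pieces meet continuously at |theta| = lam and |theta| = a * lam.
values = [scad_penalty(t, lam=1.0) for t in (0.5, 1.0, 3.7, 10.0)]
```

The flat tail is what distinguishes SCAD from the lasso: large coefficients incur a constant penalty, which is the property behind the oracle behavior the abstract refers to.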

11.
Reduced rank regression assumes that the coefficient matrix in a multivariate regression model is not of full rank. The unknown rank is traditionally estimated under the assumption of normal responses. We derive an asymptotic test for the rank that only requires the response vector to have finite second moments. The test is extended to the nonconstant covariance case. Linear combinations of the components of the predictor vector that are estimated to be significant for modelling the responses are obtained.

12.
The multivariate probit model is very useful for analyzing correlated multivariate dichotomous data. Recently, this model has been generalized with a confirmatory factor analysis structure for accommodating more general covariance structure, and it is called the MPCFA model. The main purpose of this paper is to consider local influence analysis, which is a well-recognized important step of data analysis beyond the maximum likelihood estimation, of the MPCFA model. As the observed-data likelihood associated with the MPCFA model is intractable, Cook's well-known approach cannot be applied to achieve local influence measures. Hence, the local influence measures are developed via Zhu and Lee's [Local influence for incomplete data model, J. Roy. Statist. Soc. Ser. B 63 (2001) 111-126.] approach that is closely related to the EM algorithm. The diagnostic measures are derived from the conformal normal curvature of an appropriate function. The building blocks are computed via a sufficiently large random sample of the latent response strengths and latent variables that are generated by the Gibbs sampler. Some useful perturbation schemes are discussed. Results that are obtained from analyses of an artificial example and a real example are presented to illustrate the newly developed methodology.

13.
This work aims to predict exponentials of mixed effects under a multivariate linear regression model with one random factor. Such quantities are of particular interest in prediction problems where the dependent variable is the logarithm of the variable that is the object of inference. Bias-corrected empirical predictors of the target quantities are defined. A second-order approximation for the mean crossed product error of two of these predictors is obtained, where the mean squared error is a particular case. An estimator of the mean crossed product error with second-order bias is proposed. Finally, results are illustrated through an application related to small area estimation.
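
The need for bias correction when exponentiating a prediction made on the log scale can be seen from the textbook lognormal identity below. This is a simplified univariate illustration with assumed symbols μ and σ²; the paper's bias-corrected empirical predictors for the mixed model are not reproduced here.

```latex
% If the log-scale quantity is normal, its exponential has a larger mean than exp(mu)
\ln Y \sim N(\mu, \sigma^2)
\quad\Longrightarrow\quad
\mathrm{E}[Y] = \exp\!\left(\mu + \tfrac{1}{2}\sigma^2\right) > \exp(\mu).

% Hence the naive back-transform exp(mu) underestimates E[Y] by the factor
\frac{\exp(\mu)}{\mathrm{E}[Y]} = \exp\!\left(-\tfrac{1}{2}\sigma^2\right).
```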

14.
Diagnostic checking for multivariate parametric models is investigated in this article. A nonparametric Monte Carlo Test (NMCT) procedure is proposed. This Monte Carlo approximation is easy to implement and can automatically make any test procedure scale-invariant even when the test statistic is not scale-invariant. With it we do not need plug-in estimation of the asymptotic covariance matrix that is used to normalize the test statistic, and the power performance can then be enhanced. The consistency of the NMCT approximation is proved. For comparison, we also extend the score type test to one-dimensional cases. NMCT can also be applied to diverse problems such as the classical problem of testing whether or not certain covariates in a linear model have a significant impact on the response. Although the Wilks lambda, a likelihood ratio test, is a proven powerful test, NMCT outperforms it, especially in non-normal cases. Simulations are carried out and an application to a real data set is illustrated.

15.
It is natural to assume that a missing-data mechanism depends on latent variables in the analysis of incomplete data in latent variate modeling because latent variables are error-free and represent key notions investigated by applied researchers. Unfortunately, the missing-data mechanism is then not missing at random (NMAR). In this article, a new estimation method is proposed, which leads to consistent and asymptotically normal estimators for all parameters in a linear latent variate model, where the missing mechanism depends on the latent variables and no concrete functional form for the missing-data mechanism is used in estimation. The proposed method is a type of multi-sample analysis with or without mean structures, and hence, it is easy to implement. Complete-case analysis is shown to produce consistent estimators for some important parameters in the model.

16.
Logic Regression is an adaptive regression methodology mainly developed to explore high-order interactions in genomic data. Logic Regression is intended for situations where most of the covariates in the data to be analyzed are binary. The goal of Logic Regression is to find predictors that are Boolean (logical) combinations of the original predictors. In this article, we give an overview of the methodology and discuss some applications. We also describe the software for Logic Regression, which is available as an R and S-Plus package.
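
To make "Boolean combinations of the original predictors" concrete, the sketch below evaluates one hand-written logic expression on binary covariates. The expression, the function name `logic_feature`, and the data are hypothetical; the search over expressions that Logic Regression performs, and the R/S-Plus package itself, are not shown.

```python
def logic_feature(row):
    """Hypothetical Boolean predictor: (x1 AND NOT x2) OR x3."""
    x1, x2, x3 = row
    return int((x1 and not x2) or x3)


# Each row holds three binary covariates (illustrative values only).
rows = [(1, 0, 0), (1, 1, 0), (0, 0, 1), (0, 1, 0)]
features = [logic_feature(row) for row in rows]
```

A derived 0/1 feature of this kind would then enter a regression model in place of, or alongside, the original binary covariates.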

17.
There is recent interest in developing new statistical methods to predict time series by taking into account a continuous set of past values as predictors. In this functional time series prediction approach, we propose a functional version of the partial linear model that allows us both to consider additional covariates and to use a continuous path in the past to predict future values of the process. The aim of this paper is to present this model, to construct some estimates and to look at their properties both from a theoretical point of view by means of asymptotic results and from a practical perspective by treating some real data sets. Although the literature on the use of parametric or nonparametric functional modeling is growing, as far as we know, this is the first paper on semiparametric functional modeling for the prediction of time series.

18.
In this paper we propose a dimension reduction method for estimating the directions in a multiple-index regression based on information extraction. This extends the recent work of Yin and Cook [X. Yin, R.D. Cook, Direction estimation in single-index regression, Biometrika 92 (2005) 371-384] who introduced the method and used it to estimate the direction in a single-index regression. While a formal extension seems conceptually straightforward, there is a fundamentally new aspect of our extension: We are able to show that, under the assumption of elliptical predictors, the estimation of multiple-index regressions can be decomposed into successive single-index estimation problems. This significantly reduces the computational complexity, because the nonparametric procedure involves only a one-dimensional search at each stage. In addition, we develop a permutation test to assist in estimating the dimension of a multiple-index regression.

19.
Fixed point clustering is a new stochastic approach to cluster analysis. The definition of a single fixed point cluster (FPC) is based on a simple parametric model, but there is no parametric assumption for the whole dataset as opposed to mixture modeling and other approaches. An FPC is defined as a data subset that is exactly the set of non-outliers with respect to its own parameter estimators. This paper concentrates upon the theoretical foundation of FPC analysis as a method for clusterwise linear regression, i.e., the single clusters are modeled as linear regressions with normal errors. In this setup, fixed point clustering is based on an iteratively reweighted estimation with zero weight for all outliers. FPCs are non-hierarchical, but they may overlap and include each other. A specification of the number of clusters is not needed. Consistency results are given for certain mixture models of interest in cluster analysis. Convergence of a fixed point algorithm is shown. Application to a real dataset shows that fixed point clustering can highlight some other interesting features of datasets compared to maximum likelihood methods in the presence of deviations from the usual assumptions of model based cluster analysis.
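
The fixed point idea described above (fit on a subset, keep exactly the non-outliers, repeat until the subset no longer changes) can be sketched for a simple linear regression as follows. This is a minimal illustration, not the paper's algorithm. The outlier rule (squared residual at most `c` times the residual variance), the constant `c=6.635` (the 0.99 quantile of a chi-squared distribution with one degree of freedom), and all names are assumptions.

```python
def fit_line(points):
    """Least-squares line and residual variance for a list of (x, y) pairs."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    sxx = sum((x - x_bar) ** 2 for x, _ in points)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((y - intercept - slope * x) ** 2 for x, y in points)
    return intercept, slope, sse / (n - 2)


def fixed_point_cluster(points, start, c=6.635, max_iter=100):
    """Iterate 'fit on the subset, keep the non-outliers' until the subset is stable."""
    current = frozenset(start)
    for _ in range(max_iter):
        # A line plus a variance needs at least 3 points and 2 distinct x values.
        if len(current) < 3 or len({points[i][0] for i in current}) < 2:
            return None
        intercept, slope, sigma2 = fit_line([points[i] for i in sorted(current)])
        # Assumed outlier rule: zero weight for points with large squared residuals.
        updated = frozenset(
            i for i, (x, y) in enumerate(points)
            if (y - intercept - slope * x) ** 2 <= c * sigma2
        )
        if updated == current:
            return current  # the subset equals its own set of non-outliers
        current = updated
    return None  # no fixed point reached within max_iter iterations


# Illustrative data: ten points near y = 1 + 2x, followed by three gross outliers.
noise = [0.1, -0.1] * 5
points = [(float(x), 1.0 + 2.0 * x + noise[x]) for x in range(10)]
points += [(2.0, 20.0), (5.0, -10.0), (8.0, 40.0)]
cluster = fixed_point_cluster(points, start=[0, 1, 2, 3])
```

Starting from four points on the line, the iteration settles on the ten points near the line and leaves the three outliers out. Different starting subsets can lead to different fixed points, which is consistent with the abstract's remark that FPCs may overlap.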

20.
Recent advances in the transformation model have made it possible to use this model for analyzing a variety of censored survival data. For inference on the regression parameters, there are semiparametric procedures based on the normal approximation. However, the accuracy of such procedures can be quite low when censoring is heavy. In this paper, we apply an empirical likelihood ratio method and derive its limiting distribution via U-statistics. We obtain confidence regions for the regression parameters and compare the proposed method with the normal approximation based method in terms of coverage probability. The simulation results demonstrate that the proposed empirical likelihood method overcomes the under-coverage problem substantially and outperforms the normal approximation based method. The proposed method is illustrated with a real data example. Finally, our method can be applied to general U-statistic type estimating equations.
