Similar articles
 20 similar articles found.
1.
In applied sciences, generalized linear mixed models have become one of the preferred tools to analyze a variety of longitudinal and clustered data. Due to software limitations, the analyses are often restricted to the setting in which the random effects terms follow a multivariate normal distribution. However, this assumption may be unrealistic, obscuring important features of among-unit variation. This work describes a widely applicable semiparametric Bayesian approach that relaxes the normality assumption by using a novel mixture of multivariate Polya trees prior to define a flexible nonparametric model for the random effects distribution. The nonparametric prior is centered on the commonly used parametric normal family. We allow this parametric family to hold only approximately, thereby providing a robust alternative for modeling. We discuss and implement practical procedures for addressing the computational challenges that arise under this approach. We illustrate the methodology by applying it to real-life examples.

Supplemental materials for this paper are available online.

2.
Fully nonparametric analysis of covariance with two and three covariates is considered. The approach is based on an extension of the model of Akritas et al. (Biometrika 87(3) (2000) 507). The model allows for a possibly nonlinear covariate effect, which can have a different shape in different factor level combinations. All types of ordinal data are included in the formulation. In particular, the response distributions are not restricted to comply with any parametric or semiparametric model. In this nonparametric model, hypotheses of no main effect, no interaction, and no simple effect, which adjust for the covariate values, are defined through a decomposition of the conditional distribution functions of the response given the factor level combination and covariate values. The test statistics are based on averages over the covariate values of certain Nadaraya–Watson regression quantities. Under their respective null hypotheses, such test statistics are shown to have a central \(\chi^2\) distribution. Small sample corrections are also provided. Simulation results and the analysis of two real datasets are also presented.
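
As a rough illustration of the Nadaraya–Watson quantities mentioned in this abstract, the sketch below estimates a conditional distribution function as a kernel-weighted average of indicators. It is not the test statistic from the paper: the Gaussian kernel, the function names, and the toy data are all assumptions chosen for the example.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as the smoothing kernel (an assumption)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def nadaraya_watson(x0, xs, ys, bandwidth):
    """Kernel-weighted average of ys at the covariate value x0."""
    weights = [gaussian_kernel((x0 - x) / bandwidth) for x in xs]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all kernel weights are zero; increase the bandwidth")
    return sum(w * y for w, y in zip(weights, ys)) / total


def conditional_cdf(y0, x0, xs, ys, bandwidth):
    """Estimate P(Y <= y0 | X = x0) by smoothing the indicators 1{Y_i <= y0}."""
    indicators = [1.0 if y <= y0 else 0.0 for y in ys]
    return nadaraya_watson(x0, xs, indicators, bandwidth)


# Hypothetical toy data: covariate values and ordinal responses on a 1-4 scale.
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [1, 2, 2, 3, 4]
cdf_mid = conditional_cdf(2, 1.0, xs, ys, bandwidth=0.4)
```

Because the estimate is a weighted average of indicators, it always lies between 0 and 1 and is nondecreasing in the response threshold, which is what makes it usable with ordinal responses.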

3.
We consider Bayesian nonparametric regression through random partition models. Our approach involves the construction of a covariate-dependent prior distribution on partitions of individuals. Our goal is to use covariate information to improve predictive inference. To do so, we propose a prior on partitions based on the Potts clustering model associated with the observed covariates. Covariate proximity thereby drives both the formation of clusters and the prior predictive distribution. The resulting prior model is flexible enough to support many different types of likelihood models. We focus the discussion on nonparametric regression. Implementation details are discussed for the specific case of multivariate multiple linear regression. The proposed model performs well in terms of model fitting and prediction when compared to alternative nonparametric regression approaches. We illustrate the methodology with an application to the health status of nations at the turn of the 21st century. Supplementary materials are available online.

4.
We study a flexible class of nonproportional hazard function regression models in which the influence of the covariates splits into the sum of a parametric part and a time-dependent nonparametric part. We develop a method of covariate selection for the parametric part by adjusting for the implicit fitting of the nonparametric part. Asymptotic consistency of the proposed covariate selection method is established, leading to asymptotically normal estimators of both parametric and nonparametric parts of the model in the presence of covariate selection. The approach is applied to a real data set and a simulation study is presented.

5.
We aim to model the survival time of intensive care patients suffering from severe sepsis. The nature of the problem requires a flexible model that extends the classical Cox model by including time-varying and nonparametric effects. These structured survival models are very flexible, but additional difficulties arise when model choice and variable selection are desired. In particular, it has to be decided which covariates should be assigned time-varying effects and whether linear modeling is sufficient for a given covariate. Component-wise boosting provides a means of likelihood-based model fitting that enables simultaneous variable selection and model choice. We introduce a component-wise, likelihood-based boosting algorithm for survival data that permits the inclusion of both parametric and nonparametric time-varying effects as well as nonparametric effects of continuous covariates, using penalized splines as the main modeling technique. An empirical evaluation of the methodology precedes the model building for the severe sepsis data. A software implementation is available to the interested reader.

6.
This article proposes a probability model for k-dimensional ordinal outcomes, that is, it considers inference for data recorded in k-dimensional contingency tables with ordinal factors. The proposed approach is based on full posterior inference, assuming a flexible underlying prior probability model for the contingency table cell probabilities. We use a variation of the traditional multivariate probit model, with latent scores that determine the observed data. In our model, a mixture of normals prior replaces the usual single multivariate normal model for the latent variables. By augmenting the prior model to a mixture of normals we generalize inference in two important ways. First, we allow for varying local dependence structure across the contingency table. Second, inference in ordinal multivariate probit models is plagued by problems related to the choice and resampling of cutoffs defined for these latent variables. We show how the proposed mixture model approach entirely removes these problems. We illustrate the methodology with two examples, one simulated dataset and one dataset of interrater agreement.

7.
Time series models have several functions: comprehending the functional dependence of the variable of interest on covariates, forecasting the dependent variable for future values of covariates, and estimating variance disintegration, co-integration and steady-state relations. Although the regression function in a time series model has been extensively modeled both parametrically and nonparametrically, modeling of the error autocorrelation is mainly restricted to the parametric setup. Proper modeling of autocorrelation not only helps to reduce the bias in the regression function estimate, but also enriches forecasting via a better forecast of the error term. In this article, we present a nonparametric modeling of the autocorrelation function under a Bayesian framework. Moving into the frequency domain from the time domain, we place a Gaussian process prior on the log of the spectral density, which is then updated by using a Whittle approximation for the likelihood function (Whittle likelihood). The posterior computation is simplified because the Whittle likelihood is approximated by the likelihood of a normal mixture distribution with the log-spectral density as a location shift parameter, where the mixture has only five components with known means, variances, and mixture probabilities. The problem then becomes conjugate conditional on the mixture components, and a Gibbs sampler is used to initiate the unknown mixture components as latent variables. We present a simulation study for performance comparison, and apply our method to two real data examples.
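
The Whittle approximation mentioned in this abstract can be sketched in a few lines. The code below is a generic illustration and not the authors' implementation: it computes the periodogram at the Fourier frequencies and evaluates the Whittle log-likelihood for a candidate spectral density. The normalization by 2πn is one common convention (others exist), and the function names and toy series are assumptions made for the example.

```python
import cmath
import math


def periodogram(x):
    """Periodogram I(w_j) at Fourier frequencies w_j = 2*pi*j/n, j = 1..(n-1)//2."""
    n = len(x)
    m = (n - 1) // 2
    freqs = [2.0 * math.pi * j / n for j in range(1, m + 1)]
    values = []
    for w in freqs:
        # Discrete Fourier transform of the series at frequency w.
        dft = sum(x_t * cmath.exp(-1j * w * t) for t, x_t in enumerate(x))
        values.append(abs(dft) ** 2 / (2.0 * math.pi * n))
    return freqs, values


def whittle_loglik(x, spectral_density):
    """Whittle log-likelihood: -sum_j [log f(w_j) + I(w_j) / f(w_j)]."""
    freqs, pgram = periodogram(x)
    return -sum(
        math.log(spectral_density(w)) + i_w / spectral_density(w)
        for w, i_w in zip(freqs, pgram)
    )


# Hypothetical toy series. For a flat (white-noise) spectral density, the Whittle
# log-likelihood is maximized when the level equals the average periodogram value.
series = [0.3, -1.2, 0.8, 0.1, -0.5, 1.4, -0.9, 0.2, 0.6, -0.7]
_, pgram = periodogram(series)
flat_level = sum(pgram) / len(pgram)
best = whittle_loglik(series, lambda w: flat_level)
```

In the abstract's setting, the spectral density itself is the unknown and receives a Gaussian process prior on its logarithm; the sketch only shows how a given candidate density would be scored.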

8.
Heteroscedasticity checks for regression models   Cited by: 1 (self-citations: 0; cited by others: 1)
For checking heteroscedasticity in regression models, a unified approach is proposed for constructing test statistics in parametric and nonparametric regression models. For nonparametric regression, the test is not sensitive to the choice of the smoothing parameters involved in estimating the nonparametric regression function. The limiting null distribution of the test statistic remains the same over a wide range of smoothing parameters. When the covariate is one-dimensional, the tests are, under some conditions, asymptotically distribution-free. In the high-dimensional cases, the validity of bootstrap approximations is investigated. It is shown that a variant of the wild bootstrap is consistent, while the classical bootstrap is not consistent in the general case but is applicable if an extra assumption on the conditional variance of the squared error is imposed. A simulation study is performed to provide evidence of how the tests work and how they compare with tests that have appeared in the literature. The approach may readily be extended to handle partially linear and linear autoregressive models.

9.
In this paper we consider the estimation of the error distribution in a heteroscedastic nonparametric regression model with multivariate covariates. As estimator we consider the empirical distribution function of residuals, which are obtained from multivariate local polynomial fits of the regression and variance functions, respectively. Weak convergence of the empirical residual process to a Gaussian process is proved. We also consider various applications for testing model assumptions in nonparametric multiple regression. The model tests obtained are able to detect local alternatives that converge to zero at an \(n^{-1/2}\) rate, independent of the covariate dimension. We consider in detail a test for additivity of the regression function.

10.
We apply nonparametric regression to current status data, which often arise in survival analysis and reliability analysis. While no parametric assumption on the distributions has been imposed, most authors have employed parametric models such as linear models to measure the covariate effects on failure times in regression analysis with current status data. We construct a nonparametric estimator of the regression function by modifying the maximum rank correlation (MRC) estimator. Our estimator can deal with cases where other estimators do not work. We present the asymptotic bias and the asymptotic distribution of the estimator by adapting a result on equicontinuity of degenerate U-processes to the setup of this paper.

11.
Various random effects models have been developed for clustered binary data; however, traditional approaches to these models generally rely heavily on the specification of a continuous random effect distribution such as a Gaussian or beta distribution. In this article, we introduce a new model that incorporates nonparametric unobserved random effects on the unit interval (0,1) into logistic regression multiplicatively with fixed effects. This new multiplicative model setup facilitates prediction of our nonparametric random effects and corresponding model interpretations. A distinctive feature of our approach is that a closed-form expression has been derived for the predictor of nonparametric random effects on the unit interval (0,1) in terms of known covariates and responses. A quasi-likelihood approach has been developed for the estimation of our model. Our results are robust against random effects distributions ranging from very discrete binary to continuous beta distributions. We illustrate our method by analyzing recent large stock crash data in China. The performance of our method is also evaluated through simulation studies.

12.
The core of nonparametric/semiparametric Bayesian analysis is to relax particular parametric assumptions by treating the distributions of interest as unknown and random and assigning them a prior. Selecting a suitable prior is therefore especially critical in nonparametric Bayesian fitting. As a distribution over distributions, the Dirichlet process (DP) is the most widely appreciated nonparametric prior because of its nice theoretical properties, modeling flexibility and computational feasibility. In this paper, we review and summarize some developments of the DP during the past decades. We focus mainly on its theoretical properties, various extensions, statistical modeling and applications to latent variable models.
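
To make the phrase "a distribution over distributions" concrete, the sketch below draws one truncated realization of a Dirichlet process using the standard stick-breaking construction. It is a generic illustration, not material from the reviewed paper: the truncation level, the standard normal base measure, and all names are assumptions made for the example.

```python
import random


def stick_breaking(alpha, base_sampler, num_atoms, rng):
    """Draw a truncated DP(alpha, G0) realization via stick-breaking.

    Returns the atom weights, the atom locations drawn from the base measure,
    and the leftover stick mass that the truncation discards.
    """
    weights, atoms = [], []
    remaining = 1.0
    for _ in range(num_atoms):
        v = rng.betavariate(1.0, alpha)   # fraction of the remaining stick to break off
        weights.append(remaining * v)
        atoms.append(base_sampler(rng))   # atom location drawn from the base measure
        remaining *= 1.0 - v
    return weights, atoms, remaining


# Hypothetical settings: concentration 2.0, standard normal base measure, 50 atoms.
rng = random.Random(0)
weights, atoms, leftover = stick_breaking(
    alpha=2.0,
    base_sampler=lambda r: r.gauss(0.0, 1.0),
    num_atoms=50,
    rng=rng,
)
```

Each call yields a different discrete random distribution; smaller values of the concentration parameter put most of the mass on the first few atoms, while larger values spread it over many.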

13.
The authors study a heteroscedastic partially linear regression model and develop an inferential procedure for it. This includes a test of heteroscedasticity, a two-step estimator of the heteroscedastic variance function, semiparametric generalized least-squares estimators of the parametric and nonparametric components of the model, and a bootstrap goodness of fit test to see whether the nonparametric component can be parametrized.

14.
This paper examines the analysis of an extended finite mixture of factor analyzers (MFA) where both the continuous latent variable (common factor) and the categorical latent variable (component label) are assumed to be influenced by the effects of fixed observed covariates. A polytomous logistic regression model is used to link the categorical latent variable to its corresponding covariate, while a traditional linear model with normal noise is used to model the effect of the covariate on the continuous latent variable. The proposed model turns out to be in various ways an extension of many existing related models, and as such offers the potential to address some of the issues not fully handled by those previous models. A detailed derivation of an EM algorithm is proposed for parameter estimation, and latent variable estimates are obtained as by-products of the overall estimation procedure.

15.
Two algorithms for establishing a connection between correlations before and after ordinalization under a wide spectrum of nonnormal underlying bivariate distributions are developed by extending the iteratively found normal-based results via the power polynomials. These algorithms are designed to compute the polychoric correlation when the ordinal correlation is specified, and vice versa, along with the distributional properties of latent, continuous variables that are subsequently ordinalized through thresholds dictated by the marginal proportions. The method has broad applicability in the simulation and random number generation world where modeling the relationships between these correlation types is of interest.

16.
A major problem in statistical quality control is to detect a change in the distribution of independent sequentially observed random vectors. The case of a Gaussian pre-change distribution has been extensively analyzed. Here we are concerned with the non-normal multivariate case. In this setup it is natural to use tolerance regions as detection tools. These regions are defined in terms of density level sets, which can be estimated in a plug-in fashion. Under a normal mixture model we compare, through a simulation study, the performance of such a detection scheme for two density estimators: a (parametric) normal mixture and a (nonparametric) kernel estimator. The problem of the bandwidth choice for the latter is addressed. We also obtain a result concerning the convergence rates of the error probabilities under a general parametric model. Finally, a real data example is discussed.

17.
For clustering objects, we often collect not only continuous variables, but binary attributes as well. This paper proposes a model-based clustering approach with mixed binary and continuous variables where each binary attribute is generated by a latent continuous variable that is dichotomized with a suitable threshold value, and where the scores of the latent variables are estimated from the binary data. In economics, such variables are called utility functions and the assumption is that the binary attributes (the presence or the absence of a public service or utility) are determined by low and high values of these functions. In genetics, the latent response is interpreted as the "liability" to develop a qualitative trait or phenotype. The estimated scores of the latent variables, together with the observed continuous ones, allow a multivariate Gaussian mixture model to be used for clustering, instead of a mixture of discrete and continuous distributions. After describing the method, this paper presents results on both simulated and real data and compares the performance of the multivariate Gaussian mixture model and of a mixture of joint multivariate and multinomial distributions. Results show that the former model outperforms the mixture model for variables with different scales, both in terms of classification error rate and reproduction of the cluster means.

18.
Current status data arise when the exact timing of an event cannot be observed, and the only available information is whether or not the event has occurred at a random censoring time point. We consider current status data with a cured subgroup, where subjects in this subgroup are not susceptible to the event of interest. We model the cure probability using a generalized linear model with a known link function. For subjects susceptible to the event, we model their survival hazard using a partly linear additive risk model. We show that the penalized maximum likelihood estimate of the parametric regression coefficient is \(\sqrt{n}\)-consistent, asymptotically normal and efficient. The nonparametric cumulative baseline function and nonparametric covariate effect can be estimated with the \(n^{1/3}\) convergence rate. We propose inference using the weighted bootstrap. A simulation study is employed to assess the finite sample performance of the proposed estimate. We analyze the Calcification study using the proposed approach.
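
The data structure described at the start of this abstract can be illustrated with a small simulation. The sketch below is not the authors' model: it assumes exponential event and inspection times and a constant cure probability (the paper uses a generalized linear model for cure and a partly linear additive risk model for the hazard), and all names and parameter values are hypothetical.

```python
import random


def simulate_current_status(n, event_rate, inspect_rate, cure_prob, rng):
    """Simulate current status data with a cured subgroup.

    Each record is (inspection_time, occurred): the event time itself is never
    recorded, only whether the event had occurred by the inspection time.
    """
    records = []
    for _ in range(n):
        inspect_time = rng.expovariate(inspect_rate)
        cured = rng.random() < cure_prob
        if cured:
            occurred = 0  # cured subjects never experience the event
        else:
            event_time = rng.expovariate(event_rate)  # latent, unobserved
            occurred = 1 if event_time <= inspect_time else 0
        records.append((inspect_time, occurred))
    return records


# Hypothetical parameter values chosen only for illustration.
rng = random.Random(1)
sample = simulate_current_status(200, event_rate=1.0, inspect_rate=0.5, cure_prob=0.3, rng=rng)
all_cured = simulate_current_status(50, event_rate=1.0, inspect_rate=0.5, cure_prob=1.0, rng=rng)
```

Note that a zero indicator is ambiguous in such data: the subject may be cured, or may simply not have experienced the event yet. This is why the cure probability has to be modeled rather than read off the observations.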

19.
We propose a multivariate statistical framework for regional development assessment based on structural equation modelling with latent variables and show how such methods can be combined with non-parametric classification methods such as cluster analysis to obtain a development grouping of territorial units. This approach has several advantages over the current approaches in the literature: it takes account of distributional issues such as departures from normality, in turn enabling the application of more powerful inferential techniques; it enables modelling of structural relationships among latent development dimensions and subsequently formal statistical testing of model specification and testing of various hypotheses on the estimated parameters; it allows for a complex structure of the factor loadings in the measurement models for the latent variables, which can also be formally tested in the confirmatory framework; and it enables computation of latent variable scores that take into account structural or causal relationships among latent variables and the complex structure of the factor loadings in the measurement models. We apply these methods to regional development classification of Slovenia and Croatia.

20.
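Based on multi-type recurrent event data, this paper discusses a new additive-multiplicative rates regression model. The model has two parts: the first is an additive Aalen model, in which the covariate effects are additive and time-dependent; the second is a Cox regression model, in which the covariates have multiplicative effects. Using the estimating equation approach, an estimation method is given for the unknown parameters and nonparametric functions in the model, and the consistency and asymptotic normality of the resulting estimators are proved using modern empirical process theory.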
