Similar literature
Found 20 similar articles; search took 31 ms.
1.
We present a unified semiparametric Bayesian approach based on Markov random field priors for analyzing the dependence of multicategorical response variables on time, space and further covariates. The general model extends dynamic, or state space, models for categorical time series and longitudinal data by including spatial effects as well as nonlinear effects of metrical covariates in flexible semiparametric form. Trend and seasonal components, different types of covariates and spatial effects are all treated within the same general framework by assigning appropriate priors with different forms and degrees of smoothness. Inference is fully Bayesian and uses MCMC techniques for posterior analysis. The approach in this paper is based on latent semiparametric utility models and is particularly useful for probit models. The methods are illustrated by applications to unemployment data and a forest damage survey.

2.
In order to apply nonparametric methods to reliability problems, it is desirable to have available priors over a broad class of survival distributions. In this paper, this is achieved by taking the failure rate function to be the sum of a nonnegative stochastic process with increasing sample paths and a process with decreasing sample paths. This approach produces a prior which chooses an absolutely continuous survival distribution that can have an IFR, DFR, or U-shaped failure rate. Posterior Laplace transforms of the failure rate are obtained based on survival data that allow censoring. Bayes estimates of the failure rate as well as the lifetime distribution are then calculated from these posterior Laplace transforms. This approach is also applied to a competing risks model and the proportional hazards model of Cox.

3.
We aim at modeling the survival time of intensive care patients suffering from severe sepsis. The nature of the problem requires a flexible model that extends the classical Cox model via the inclusion of time-varying and nonparametric effects. These structured survival models are very flexible but additional difficulties arise when model choice and variable selection are desired. In particular, it has to be decided which covariates should be assigned time-varying effects or whether linear modeling is sufficient for a given covariate. Component-wise boosting provides a means of likelihood-based model fitting that enables simultaneous variable selection and model choice. We introduce a component-wise, likelihood-based boosting algorithm for survival data that permits the inclusion of both parametric and nonparametric time-varying effects as well as nonparametric effects of continuous covariates utilizing penalized splines as the main modeling technique. An empirical evaluation of the methodology precedes the model building for the severe sepsis data. A software implementation is available to the interested reader.

4.
Fully nonparametric analysis of covariance with two and three covariates is considered. The approach is based on an extension of the model of Akritas et al. (Biometrika 87(3) (2000) 507). The model allows for possibly nonlinear covariate effects which can have different shapes in different factor level combinations. All types of ordinal data are included in the formulation. In particular, the response distributions are not restricted to comply with any parametric or semiparametric model. In this nonparametric model, hypotheses of no main effect, no interaction, and no simple effect, which adjust for the covariate values, are defined through a decomposition of the conditional distribution functions of the response given the factor level combination and covariate values. The test statistics are based on averages over the covariate values of certain Nadaraya–Watson regression quantities. Under their respective null hypotheses, such test statistics are shown to have a central χ² distribution. Small sample corrections are also provided. Simulation results and the analysis of two real datasets are also presented.
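
The Nadaraya–Watson quantities mentioned in this abstract are kernel-weighted averages. The following is a minimal sketch of that estimator only, not of the paper's test statistics; the Gaussian kernel, the bandwidth value, and the toy data are assumptions made for illustration.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as the smoothing kernel (an assumption)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def nadaraya_watson(x0, xs, ys, bandwidth):
    """Kernel-weighted average of ys, evaluated at the covariate value x0."""
    weights = [gaussian_kernel((x0 - x) / bandwidth) for x in xs]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all kernel weights underflowed; increase the bandwidth")
    return sum(w * y for w, y in zip(weights, ys)) / total


# Hypothetical toy data: indicator responses 1{Y <= y} observed at covariate values xs.
# Smoothing such indicators gives an estimate of a conditional distribution function.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.0, 0.0, 1.0, 1.0, 1.0]
estimate = nadaraya_watson(2.0, xs, ys, bandwidth=1.0)
```

Because the weights are normalized, the estimate always lies between the smallest and largest response, which is what makes it usable as an estimate of a conditional probability.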

5.
Many problems in genomics are related to variable selection where high-dimensional genomic data are treated as covariates. Such genomic covariates often have certain structures and can be represented as vertices of an undirected graph. Biological processes also vary as functions depending upon some biological state, such as time. High-dimensional variable selection where covariates are graph-structured and the underlying model is nonparametric presents an important but largely unaddressed statistical challenge. Motivated by the problem of regression-based motif discovery, we consider the problem of variable selection for high-dimensional nonparametric varying-coefficient models and introduce a sparse structured shrinkage (SSS) estimator based on basis function expansions and a novel smoothed penalty function. We present an efficient algorithm for computing the SSS estimator. Results on model selection consistency and estimation bounds are derived. Moreover, finite-sample performances are studied via simulations, and the effects of high-dimensionality and structural information of the covariates are especially highlighted. We apply our method to the motif-finding problem using a yeast cell-cycle gene expression dataset and word counts in genes' promoter sequences. Our results demonstrate that the proposed method can result in better variable selection and prediction for high-dimensional regression when the underlying model is nonparametric and covariates are structured. Supplemental materials for the article are available online.

7.
In the present work we are interested in providing a universal language for supporting formalisms to specify the approximation hierarchy system for an abstract NP‐hard optimization problem. This work grew from the idea of providing a categorical view of structural complexity to optimization problems. The direction is aimed towards actually exploring the connections among the structural complexity aspects and categorical concepts, which may be viewed at a high level, in a structuralistic sense. After introducing the optimization problems categories OPTS and OPT, as well as related questions, a formal system modelling the approximation hierarchy of a given optimization problem is provided, based on categorical shape theory. (© 2004 WILEY‐VCH Verlag GmbH & Co. KGaA, Weinheim)

8.
Widely used parametric generalized linear models are, unfortunately, a somewhat limited class of specifications. Nonparametric aspects are often introduced to enrich this class, resulting in semiparametric models. Focusing on single or k-sample problems, many classical nonparametric approaches are limited to hypothesis testing. Those that allow estimation are limited to certain functionals of the underlying distributions. Moreover, the associated inference often relies upon asymptotics when nonparametric specifications are often most appealing for smaller sample sizes. Bayesian nonparametric approaches avoid asymptotics but have, to date, been limited in the range of inference. Working with Dirichlet process priors, we overcome the limitations of existing simulation-based model fitting approaches which yield inference that is confined to posterior moments of linear functionals of the population distribution. This article provides a computational approach to obtain the entire posterior distribution for more general functionals. We illustrate with three applications: investigation of extreme value distributions associated with a single population, comparison of medians in a k-sample problem, and comparison of survival times from different populations under fairly heavy censoring.
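
To make the idea of a nonlinear functional of a Dirichlet process draw concrete, here is a minimal sketch that simulates one random distribution by truncated stick-breaking and reads off its median. It samples from the prior, not the posterior, and is not the article's algorithm; the truncation level, the standard normal base measure, and all function names are assumptions for illustration.

```python
import random


def stick_breaking_weights(alpha, n_atoms, rng):
    """Truncated stick-breaking weights for a Dirichlet process with concentration alpha."""
    weights = []
    remaining = 1.0
    for _ in range(n_atoms - 1):
        fraction = rng.betavariate(1.0, alpha)
        weights.append(remaining * fraction)
        remaining *= 1.0 - fraction
    weights.append(remaining)  # truncation: the last atom absorbs the leftover mass
    return weights


def weighted_median(atoms, weights):
    """Smallest atom whose cumulative weight reaches 0.5."""
    pairs = sorted(zip(atoms, weights))
    cumulative = 0.0
    for atom, weight in pairs:
        cumulative += weight
        if cumulative >= 0.5:
            return atom
    return pairs[-1][0]


rng = random.Random(0)                               # fixed seed for reproducibility
weights = stick_breaking_weights(2.0, 50, rng)
atoms = [rng.gauss(0.0, 1.0) for _ in weights]       # assumed N(0, 1) base measure
median_draw = weighted_median(atoms, weights)        # one draw of the median functional
```

Repeating the last three lines many times would give a Monte Carlo picture of the whole distribution of the median, rather than only a moment of a linear functional.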

9.
Relative-risk models are often used to characterize the relationship between survival time and time-dependent covariates. When the covariates are observed, the estimation and asymptotic theory for parameters of interest are available; challenges remain when missingness occurs. A popular approach at hand is to jointly model survival data and longitudinal data. This seems efficient, in making use of more information, but rigorous theoretical studies have long been neglected. For both additive risk models and relative-risk models, we treat the missing data as nonignorable. Under general regularity conditions, we prove asymptotic normality for the nonparametric maximum likelihood estimators.

10.
Accurate loss reserves are an important item in the financial statement of an insurance company and are mostly evaluated by macrolevel models with aggregate data in run‐off triangles. In recent years, a new set of literature has considered individual claims data and proposed parametric reserving models based on claim history profiles. In this paper, we present a nonparametric and flexible approach for estimating outstanding liabilities using all the covariates associated with the policy, its policyholder, and all the information received by the insurance company on the individual claims since its reporting date. We develop a machine learning–based method and explain how to build specific subsets of data for the machine learning algorithms to be trained and assessed on. The choice of a nonparametric model leads to new issues since the target variables (claim occurrence and claim severity) are right‐censored most of the time. The performance of our approach is evaluated by comparing the predictive values of the reserve estimates with their true values on simulated data. We compare our individual approach with the most used aggregate data method, namely, chain ladder, with respect to the bias and the variance of the estimates. We also provide a short real case study based on a Dutch loan insurance portfolio.
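
For readers unfamiliar with the chain ladder benchmark named in this abstract, the sketch below completes a cumulative run-off triangle with volume-weighted development factors. It illustrates the aggregate method only, not the paper's machine learning approach; the function name and the toy triangle are hypothetical.

```python
def chain_ladder(triangle):
    """Complete a cumulative run-off triangle using volume-weighted development factors.

    triangle[i][j] is the cumulative claim amount for origin period i at
    development period j; row i holds len(triangle) - i observed entries.
    """
    n = len(triangle)
    factors = []
    for j in range(n - 1):
        # Use only the rows in which both development periods j and j + 1 are observed.
        rows = [row for row in triangle if len(row) > j + 1]
        numerator = sum(row[j + 1] for row in rows)
        denominator = sum(row[j] for row in rows)
        factors.append(numerator / denominator)

    completed = []
    for row in triangle:
        full = list(row)
        for j in range(len(row) - 1, n - 1):
            full.append(full[j] * factors[j])  # project the next development period
        completed.append(full)

    # Reserve = projected ultimate amount minus the latest observed cumulative amount.
    reserves = [full[-1] - row[-1] for full, row in zip(completed, triangle)]
    return factors, completed, reserves


# Hypothetical three-period triangle of cumulative paid claims.
triangle = [[100.0, 150.0, 165.0], [110.0, 165.0], [120.0]]
factors, completed, reserves = chain_ladder(triangle)
```

The oldest origin period is fully developed, so its reserve is zero; all outstanding liability comes from the younger periods.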

11.
This article proposes a method for nonparametric estimation of hazard rates as a function of time and possibly multiple covariates. The method is based on dividing the time axis into intervals, and calculating the number of events and the follow-up time contributed by the different intervals. The event-count and follow-up-time data are then separately smoothed on time and the covariates, and the hazard rate estimators are obtained by taking the ratio. Pointwise consistency and asymptotic normality are shown for the hazard rate estimators for a certain class of smoothers, which includes some standard approaches to locally weighted regression and kernel regression. It is shown through simulation that a variance estimator based on this asymptotic distribution is reasonably reliable in practice. The problem of how to select the smoothing parameter is considered, but a satisfactory resolution to this problem has not been identified. The method is illustrated using data from several breast cancer clinical trials.
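
The interval-and-ratio construction described in this abstract can be sketched as follows. This is a simplified illustration with time as the only smoothing dimension and a plain moving average standing in for the paper's class of smoothers; the function names, the intervals, and the toy data are assumptions.

```python
def interval_counts(times, events, edges):
    """Events and follow-up time contributed to each interval (edges[k], edges[k + 1]]."""
    n_intervals = len(edges) - 1
    n_events = [0.0] * n_intervals
    follow_up = [0.0] * n_intervals
    for t, d in zip(times, events):
        for k in range(n_intervals):
            lo, hi = edges[k], edges[k + 1]
            if t <= lo:
                break                       # subject left the study before this interval
            follow_up[k] += min(t, hi) - lo
            if d and t <= hi:
                n_events[k] += 1.0          # the event falls inside this interval
    return n_events, follow_up


def moving_average(values, half_width=1):
    """Simple smoother (an assumption): mean over a window truncated at the ends."""
    out = []
    for k in range(len(values)):
        window = values[max(0, k - half_width): k + half_width + 1]
        out.append(sum(window) / len(window))
    return out


def smoothed_hazard(times, events, edges, half_width=1):
    """Ratio of smoothed event counts to smoothed follow-up time, per interval."""
    n_events, follow_up = interval_counts(times, events, edges)
    numerator = moving_average(n_events, half_width)
    denominator = moving_average(follow_up, half_width)
    return [a / b if b > 0 else float("nan") for a, b in zip(numerator, denominator)]


# Hypothetical data: observed times and event indicators (1 = event, 0 = censored).
times = [1.5, 2.5, 0.5, 3.0]
events = [1, 0, 1, 1]
edges = [0.0, 1.0, 2.0, 3.0]
hazard = smoothed_hazard(times, events, edges)
```

Setting `half_width=0` disables smoothing and recovers the raw occurrence/exposure rate in each interval, which is a convenient check on the bookkeeping.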

12.
This paper proposes a new concept: the usage of Multivariate Markov Chains (MMC) as covariates. Our approach is based on the observation that we can treat possible categorical (or discrete) regressors, whose values are unknown in the forecast period, as an MMC in order to improve the forecast error of a certain dependent variable. Hence, we take advantage of the information about the past state interactions between the MMC categories to forecast the categorical (or discrete) regressors and improve the forecast of the actual dependent variable.
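
The step of forecasting a categorical regressor from its past states can be sketched for the simplest case of a single first-order Markov chain. This is a deliberate simplification of the multivariate chain the paper uses, which also models interactions between several categorical series; the function names and the toy sequence are hypothetical.

```python
from collections import Counter, defaultdict


def fit_transition_probs(sequence):
    """Estimate first-order transition probabilities from observed state changes."""
    counts = defaultdict(Counter)
    for current, nxt in zip(sequence, sequence[1:]):
        counts[current][nxt] += 1
    probs = {}
    for state, row in counts.items():
        total = sum(row.values())
        probs[state] = {nxt: c / total for nxt, c in row.items()}
    return probs


def forecast_next(probs, state):
    """Most probable next state given the current one."""
    row = probs[state]
    return max(row, key=row.get)


# Hypothetical history of a categorical regressor.
history = ["A", "A", "B", "A", "A", "A", "B"]
probs = fit_transition_probs(history)
next_state = forecast_next(probs, history[-1])
```

The forecast state would then be plugged into the regression for the dependent variable in place of the unknown future value of the regressor.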

13.
This article discusses inference on the order of dependence in binary sequences. The proposed approach is based on the notion of partial exchangeability of order k. A partially exchangeable binary sequence of order k can be represented as a mixture of Markov chains. The mixture is with respect to the unknown transition probability matrix θ. We use this defining property to construct a semiparametric model for binary sequences by assuming a nonparametric prior on the transition matrix θ. This enables us to consider inference on the order of dependence without constraint to a particular parametric model. Implementing posterior simulation in the proposed model is complicated by the fact that the dimension of θ changes with the order of dependence k. We discuss appropriate posterior simulation schemes based on a pseudo prior approach. We extend the model to include covariates by considering an alternative parameterization as an autologistic regression which allows for a straightforward introduction of covariates. The regression on covariates raises the additional inference problem of variable selection. We discuss appropriate posterior simulation schemes, focusing on inference about the order of dependence. We discuss and develop the model with covariates only to the extent needed for such inference.

14.
In this paper, we analyze ovarian cancer cases from six hospitals in China, screen the prognostic factors and predict the survival rate. A feature of the data is that all the covariates are categorical. We use three methods to estimate the survival rate: the traditional Cox regression, the two-step Cox regression and a method based on conditional inference trees. A comparison shows that all three are effective and predict the survival curve reasonably well. The analysis results show that the survival rate is determined by a combination of risk factors, among which clinical stage is the most important prognostic factor.

15.
We develop a nonparametric test, based on kernel smoothers, in order to decide whether some covariates could be suppressed in a multidimensional nonparametric regression study. We give the asymptotic distribution of the statistic involved in our test, under a general dependence assumption on the sample that allows for application to time series prediction.

16.
This paper deals with the problem of detecting influential observations in deterministic nonparametric DEA models. The technique we present is intended to flag for further analysis those sample observations that considerably affect the measured efficiency of the remaining units. The analyst then has to check whether these observations are contaminated by data errors. This approach also allows one to determine when efficiency changes due to the presence of a given unit in the sample are statistically significant. Thus, ours is a statistical alternative for approaching the problem of detecting influential observations in deterministic nonparametric DEA models.

17.
The article is concerned with the use of Markov chain Monte Carlo methods for posterior sampling in Bayesian nonparametric mixture models. In particular, we consider the problem of slice sampling mixture models for a large class of mixing measures generalizing the celebrated Dirichlet process. Such a class of measures, known in the literature as σ-stable Poisson-Kingman models, includes as special cases most of the discrete priors currently known in Bayesian nonparametrics, for example, the two-parameter Poisson-Dirichlet process and the normalized generalized Gamma process. The proposed approach is illustrated on some simulated data examples. This article has online supplementary material.

18.
We describe a Bayesian model for simultaneous linear quantile regression at several specified quantile levels. More specifically, we propose to model the conditional distributions by using random probability measures, known as quantile pyramids, introduced by Hjort and Walker. Unlike many existing approaches, this framework allows us to specify meaningful priors on the conditional distributions, while retaining the flexibility afforded by the nonparametric error distribution formulation. Simulation studies demonstrate the flexibility of the proposed approach in estimating diverse scenarios, generally outperforming other competitive methods. We also provide conditions for posterior consistency. The method is particularly promising for modeling the extremal quantiles. Applications to extreme value analysis and in higher dimensions are also explored through data examples. Supplemental material for this article is available online.

19.
Parametric models for categorical ordinal response variables, like the proportional odds model or the continuation ratio model, assume that the predictor is given by a linear form of covariates. In this article the parametric models are extended to include smooth components in a semiparametric or partially parametric fashion. Parts of the covariates are thereby modeled linearly while other covariates are modeled as unspecified but smooth functions. Estimation is based on a combination of local likelihood and profile likelihood and asymptotic properties of the estimates are derived. In a simulation study it is demonstrated that the profile likelihood approach is to be preferred over a backfitting procedure. Two data examples demonstrate the applicability of the models.

20.
A common practice in customer satisfaction analysis is to administer surveys where subjects are asked to express opinions on a number of statements, or satisfaction scales, by use of ordered categorical responses. Motivated by this application, we propose a pseudo‐likelihood approach to estimate the dependence structure among multivariate categorical variables. As is common in this area, we assume that the responses are related to latent continuous variables that are truncated to induce categorical responses. A Gaussian likelihood is assumed for the latent variables, leading to the so‐called ordered probit model. Because the calculation of the exact likelihood is computationally demanding, we adopt an approximate solution based on pairwise likelihood. To assess the performance of the approach, simulation studies are conducted comparing the proposed method with standard likelihood methods. A parametric bootstrap approach to evaluate the variance of the maximum pairwise likelihood estimator is proposed and discussed. An application to a customer satisfaction survey is presented, showing the effectiveness of the approach in the presence of covariates and under other generalizations of the model. Copyright © 2015 John Wiley & Sons, Ltd.
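
The latent-variable construction behind the ordered probit model can be illustrated for a single response. The sketch below computes category probabilities from a linear predictor and a set of thresholds, assuming a unit-variance Gaussian latent variable; it does not implement the pairwise likelihood, and the function names and numeric values are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ordered_probit_probs(linear_predictor, thresholds):
    """P(Y = k) for k = 0..len(thresholds).

    The latent variable is Z = linear_predictor + N(0, 1) noise, and
    Y = k when Z falls between the k-th and (k + 1)-th cut point.
    """
    cuts = [-math.inf] + list(thresholds) + [math.inf]
    cdf = [normal_cdf(c - linear_predictor) for c in cuts]
    return [hi - lo for lo, hi in zip(cdf, cdf[1:])]


# Hypothetical four-category satisfaction scale with three increasing thresholds.
thresholds = [-1.0, 0.0, 1.0]
probs_low = ordered_probit_probs(-0.5, thresholds)
probs_high = ordered_probit_probs(1.5, thresholds)
```

A pairwise likelihood would replace these univariate probabilities with bivariate normal rectangle probabilities for each pair of responses, which is where the computational saving over the full multivariate likelihood comes from.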
