Similar Articles
1.
We develop a vector generalised linear model to describe the influence of the atmospheric circulation on extreme daily precipitation across the UK. The atmospheric circulation is represented by three covariates, namely synoptic scale airflow strength, direction and vorticity; the extremes are represented by the monthly maxima of daily precipitation, modelled by the generalised extreme value distribution (GEV). The model parameters for data from 689 rain gauges across the UK are estimated using a maximum likelihood estimator. Within the framework of vector generalised linear models, various plausible models exist to describe the influence of the individual covariates, possible nonlinearities in the covariates and seasonality. We selected the final model based on the Akaike information criterion (AIC), and evaluated the predictive power of individual covariates by means of quantile verification scores and leave-one-out cross validation. The final model conditions the location and scale parameter of the GEV on all three covariates; the shape parameter is modelled as a constant. The relationships between strength and vorticity on the one hand, and the GEV location and scale parameters on the other hand are modelled as natural cubic splines with two degrees of freedom. The influence of direction is parameterised as a sine with amplitude and phase. The final model has a common parameterisation for the whole year. Seasonality is partly captured by the covariates themselves, but mostly by an additional annual cycle that is parameterised as a phase-shifted sine and accounts for physical influences that we have not attempted to explicitly model, such as humidity.
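The return-level calculation underlying such a GEV model can be sketched as follows. This is a minimal illustration, not the paper's fitted model: the parameter values and the linear link from the vorticity covariate to the location parameter are made up for the example.

```python
import math

def gev_quantile(p, mu, sigma, xi):
    """Return the p-quantile of a GEV(mu, sigma, xi) distribution.

    For xi != 0: x_p = mu + (sigma / xi) * ((-log p)^(-xi) - 1).
    For xi == 0 (Gumbel limit): x_p = mu - sigma * log(-log p).
    """
    if abs(xi) < 1e-12:
        return mu - sigma * math.log(-math.log(p))
    return mu + (sigma / xi) * ((-math.log(p)) ** (-xi) - 1.0)

def monthly_max_quantile(p, vorticity, beta0=20.0, beta1=2.5,
                         sigma=8.0, xi=0.1):
    """Hypothetical covariate-dependent GEV: location depends linearly
    on a vorticity covariate; scale and shape are held constant."""
    mu = beta0 + beta1 * vorticity
    return gev_quantile(p, mu, sigma, xi)

# 20-year return level for monthly maxima (p = 1 - 1/240) at vorticity 1.0
print(round(monthly_max_quantile(1.0 - 1.0 / 240.0, 1.0), 2))
```

Conditioning only the location (and possibly scale) on covariates while keeping the shape constant, as in the abstract, keeps the tail behaviour stable while letting the circulation shift the whole distribution.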

2.
We propose a probability model for random partitions in the presence of covariates. In other words, we develop a model-based clustering algorithm that exploits available covariates. The motivating application is predicting time to progression for patients in a breast cancer trial. We proceed by reporting a weighted average of the responses of clusters of earlier patients. The weights should be determined by the similarity of the new patient’s covariate with the covariates of patients in each cluster. We achieve the desired inference by defining a random partition model that includes a regression on covariates. Patients with similar covariates are a priori more likely to be clustered together. Posterior predictive inference in this model formalizes the desired prediction.

We build on product partition models (PPM). We define an extension of the PPM to include a regression on covariates by including in the cohesion function a new factor that increases the probability of experimental units with similar covariates to be included in the same cluster. We discuss implementations suitable for any combination of continuous, categorical, count, and ordinal covariates.

An implementation of the proposed model as an R package is available for download.

3.
The dependent competing risks model of human mortality is considered, assuming that the dependence between lifetimes is modelled by a multivariate copula function. The effect on the overall survival of removing one or more causes of death is explored under two alternative definitions of removal, ignoring the causes and eliminating them. Under the two definitions of removal, expressions for the overall survival functions in terms of the specified copula (density) and the net (marginal) survival functions are given. The net survival functions are obtained as a solution to a system of non-linear differential equations, which relates them through the specified copula (derivatives) to the crude (sub-) survival functions, estimated from data. The overall survival functions in a model with four competing risks, cancer, cardiovascular diseases, respiratory diseases and all other causes grouped together, have been implemented and evaluated, based on cause-specific mortality data for England and Wales published by the Office for National Statistics, for the year 2007. We show that the two alternative definitions of removal of a cause of death have different effects on the overall survival and in particular on the life expectancy at birth and at age 65, when one, two or three of the competing causes are removed. An important conclusion is that the eliminating definition is better suited for practical use in competing-risks applications, since it is more intuitive, and it suffices to consider only positive dependence between the lifetimes, which is not the case under the alternative ignoring definition.
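A toy illustration of the copula construction, using the independence copula C(u1,...,uk) = u1 · ... · uk (the simplest special case of what the paper generalizes): overall survival is the copula evaluated at the net survival functions, and eliminating a cause amounts to dropping its factor. The exponential net survivals and rate values are hypothetical, not from the ONS data.

```python
import math

def net_survival(t, rate):
    """Exponential net (marginal) survival function S_j(t) = exp(-rate * t)."""
    return math.exp(-rate * t)

def overall_survival(t, rates):
    """Overall survival under the independence copula:
    S(t) = C(S_1(t), ..., S_k(t)) = prod_j S_j(t)."""
    s = 1.0
    for r in rates:
        s *= net_survival(t, r)
    return s

# Hypothetical cause-specific rates (per year): cancer, cardiovascular,
# respiratory, all other causes grouped together.
rates = [0.003, 0.004, 0.001, 0.002]
print(round(overall_survival(65.0, rates), 4))
# "Eliminating" the first cause drops its factor, raising overall survival:
print(round(overall_survival(65.0, rates[1:]), 4))
```

Under a dependent copula the elimination step is no longer a simple factor removal, which is exactly why the paper must solve a system of differential equations for the net survivals.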

4.
In this paper we present a discrete survival model with covariates and random effects, where the random effects may depend on the observed covariates. The dependence between the covariates and the random effects is modelled through correlation parameters, and these parameters can only be identified for time-varying covariates. For time-varying covariates, however, it is possible to separate regression effects and selection effects under a certain dependence structure in which the random effects and the time-varying covariates are assumed to be conditionally independent given the initial level of the covariate. The proposed model is equivalent to a model with independent random effects and the initial level of the covariates as further covariates. The model is applied to simulated data that illustrate some identifiability problems and further indicate how the proposed model may approximate retrospectively collected data with incorrectly specified waiting times. The model is fitted by maximum likelihood estimation, implemented as iteratively reweighted least squares. © 1998 John Wiley & Sons, Ltd.

5.
This paper addresses the problem of data fragmentation when incorporating imbalanced categorical covariates in nonparametric survival models. The problem arises in an application of demand forecasting where certain categorical covariates are important explanatory factors for the diversity of survival patterns but are severely imbalanced in the sense that a large percentage of data segments defined by these covariates have very small sample sizes. Two general approaches, called the class-based approach and the fusion-based approach, are proposed to handle the problem. Both rely on judicious use of a data segment hierarchy defined by the covariates. The class-based approach allows certain segments in the hierarchy to have their private survival functions and aggregates the others to share a common survival function. The fusion-based approach allows all survival functions to borrow and share information from all segments based on their positions in the hierarchy. A nonparametric Bayesian estimator with Dirichlet process priors provides the data-sharing mechanism in the fusion-based approach. The hyperparameters in the priors are treated as fixed quantities and learned from data by taking advantage of the data segment hierarchy. The proposed methods are motivated and validated by a case study with real-world data from a software development service operation.

6.
We aim at modeling the survival time of intensive care patients suffering from severe sepsis. The nature of the problem requires a flexible model that allows us to extend the classical Cox model via the inclusion of time-varying and nonparametric effects. These structured survival models are very flexible, but additional difficulties arise when model choice and variable selection are desired. In particular, it has to be decided which covariates should be assigned time-varying effects or whether linear modeling is sufficient for a given covariate. Component-wise boosting provides a means of likelihood-based model fitting that enables simultaneous variable selection and model choice. We introduce a component-wise, likelihood-based boosting algorithm for survival data that permits the inclusion of both parametric and nonparametric time-varying effects as well as nonparametric effects of continuous covariates, utilizing penalized splines as the main modeling technique. An empirical evaluation of the methodology precedes the model building for the severe sepsis data. A software implementation is available to the interested reader.

7.
Generalized linear mixed models (GLMM) are used in situations where a number of characteristics (covariates) affect a nonnormal response variable and the responses are correlated due to the existence of clusters or groups. For example, the responses in biological applications may be correlated due to common genetic factors or environmental factors. The clustering or grouping is addressed by introducing cluster effects to the model; the associated parameters are often treated as random effects parameters. In many applications, the magnitudes of the variance components corresponding to one or more of the sets of random effects parameters are of interest, especially the point null hypothesis that one or more of the variance components is zero. A Bayesian approach to test the hypothesis is to use Bayes factors comparing the models with and without the random effects in question; this work reviews a number of approaches for estimating the Bayes factor. We perform a comparative study of the different approaches to compute Bayes factors for GLMMs by applying them to two different datasets. The first example employs a probit regression model with a single variance component to data from a natural selection study on turtles. The second example uses a disease mapping model from epidemiology, a Poisson regression model with two variance components. Bridge sampling and a recent improvement known as warp bridge sampling, importance sampling, and Chib's marginal likelihood calculation are all found to be effective. The relative advantages of the different approaches are discussed.

8.
One attractive advantage of the presented single-index hazards regression is that it can accommodate possibly time-dependent covariates. In such a model formulation, the main theme of this research is to develop a theoretically valid and practically feasible estimation procedure for the index coefficients and the induced survival function. In particular, compared with existing pseudo-likelihood approaches, ours provides automatic bandwidth selection and suppresses the influence of outliers. By making effective use of the considered versatile survival process, we further reduce a substantial finite-sample bias in the Chambless-Diao type estimator of the most popular time-dependent accuracy summary. The asymptotic properties of estimators and data-driven bandwidths are also established under some suitable conditions. It is found in simulations that the proposed estimators and inference procedures exhibit quite satisfactory performance. Moreover, the general applicability of our methodology is illustrated by two empirical datasets.

9.
This work studies a proportional hazards model for survival data with "long-term survivors", in which covariates are subject to linear measurement error. It is well known that the naive estimators from both partial and full likelihood methods are inconsistent under this measurement error model. For measurement error models, methods of unbiased estimating function and corrected likelihood have been proposed in the literature. In this paper, we apply the corrected partial and full likelihood approaches to estimate the model and obtain statistical inference from survival data with long-term survivors. The asymptotic properties of the estimators are established. Simulation results illustrate that the proposed approaches provide useful tools for the models considered.

10.
We provide a new method for solving the Cox model: a pseudo response variable is constructed from the basic structure of the Cox model, and an iterative least-squares algorithm then yields estimates of the baseline survival function and the regression coefficients. The algorithm is further extended to the generalised Cox model, where local polynomial methods are used to fit the pseudo response variable, covering the case in which the covariate effects are nonparametric.

11.
Trace regression models are widely used in applications involving panel data, images, genomic microarrays, etc., where high-dimensional covariates are often involved. However, the existing research involving high-dimensional covariates focuses mainly on the conditional mean model. In this paper, we extend the trace regression model to the quantile trace regression model when the parameter is a matrix of simultaneously low rank and row (column) sparsity. The convergence rate of the penalized estimator is derived under mild conditions. Simulations, as well as a real data application, are also carried out for illustration.

12.
In a longitudinal study, individuals are observed over some period of time. The investigator wishes to model the responses over this time as a function of various covariates measured on these individuals. The times of measurement may be sparse and not coincident across individuals. When the covariate values are not extensively replicated, it is very difficult to propose a parametric model linking the response to the covariates because plots of the raw data are of little help. Although the response curve may only be observed at a few points, we consider the underlying curve y(t). We fit a regression model y(t) = x^T β(t) + ε(t) and use the coefficient functions β(t) to suggest a suitable parametric form. Estimates of y(t) are constructed by simple interpolation, and appropriate weighting is used in the regression. We demonstrate the method on simulated data to show its ability to recover the true structure and illustrate its application to some longitudinal data from the Panel Study of Income Dynamics.
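The pointwise fit y(t) = x^T β(t) + ε(t) can be sketched as follows: at each time on a common grid, run ordinary least squares of the responses on the covariate. This is a bare-bones illustration with synthetic noise-free data; the interpolation and weighting steps described in the abstract are omitted.

```python
def ols(x, y):
    """Least squares of y on a scalar x with intercept; returns (b0, b1)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return my - b1 * mx, b1

# Hypothetical longitudinal data: individual i has covariate x[i] and is
# observed on a common grid with y_i(t) = beta0(t) + beta1(t) * x[i],
# where beta0(t) = t and beta1(t) = 2 (no noise, so the fit is exact).
x = [0.0, 1.0, 2.0, 3.0, 4.0]
grid = [0.0, 0.5, 1.0, 1.5]
beta_hat = []
for t in grid:
    y_t = [t + 2.0 * xi for xi in x]   # responses at time t
    beta_hat.append(ols(x, y_t))       # estimate of (beta0(t), beta1(t))

print(beta_hat[2])                     # estimates at t = 1.0 -> (1.0, 2.0)
```

Plotting the recovered functions β0(t) and β1(t) against t is what suggests a parametric form, e.g. a linear trend in β0(t) here.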

13.
In this paper we consider the estimation of the error distribution in a heteroscedastic nonparametric regression model with multivariate covariates. As estimator we consider the empirical distribution function of residuals, which are obtained from multivariate local polynomial fits of the regression and variance functions, respectively. Weak convergence of the empirical residual process to a Gaussian process is proved. We also consider various applications for testing model assumptions in nonparametric multiple regression. The model tests obtained are able to detect local alternatives that converge to zero at an n^{-1/2} rate, independent of the covariate dimension. We consider in detail a test for additivity of the regression function.

14.
Dynamic or flexible regression models are used more and more often in carcinogenesis studies to relate lifetime distribution to time-dependent explanatory variables. In addition to the classical regression models, such as the Cox model, AFT model, linear transformation model, frailty model, etc., we present so-called flexible regression models, which are well adapted to study cross-effects of survival functions. Such effects are sometimes observed in clinical trials. Classical examples are well-known data concerning effects of chemotherapy (CH) and chemotherapy plus radiotherapy (CH + R) on survival times of gastric cancer patients. In this paper, we give examples which illustrate possible applications of the Hsieh model (2001) and the SCE model proposed by Bagdonavicius and Nikulin, adapted to treat survival data with one crossing point. We compare both models. Bibliography: 40 titles. __________ Translated from Zapiski Nauchnykh Seminarov POMI, Vol. 339, 2006, pp. 78–101.

15.
The seminal Cox proportional intensity model with multiplicative frailty is a popular approach to analyzing the frequently encountered recurrent event data in scientific studies. In the case of violating the proportional intensity assumption, the additive intensity model is a useful alternative. Both the additive and proportional intensity models provide two principal frameworks for studying the association between the risk factors and the disease recurrences. However, methodology development on the additive intensity model with frailty is lacking, although it would be valuable. In this paper, we propose an additive intensity model with additive frailty to formulate the effects of possibly time-dependent covariates on recurrent events as well as to evaluate the intra-class dependence within recurrent events, which is captured by the frailty variable. The asymptotic properties for both the regression parameters and the association parameters in the frailty distribution are established. Furthermore, we also investigate the large-sample properties of the estimator for the cumulative baseline intensity function.

16.

Many methods have been developed for analyzing survival data, which are commonly right-censored. These methods, however, are challenged by complex features pertinent to the data collection as well as the nature of the data themselves. Typically, biased samples caused by left-truncation (or length-biased sampling) and measurement error often accompany survival analysis. While such data frequently arise in practice, little work has been available to simultaneously address these features. In this paper, we explore valid inference methods for handling left-truncated and right-censored survival data with measurement error under the widely used Cox model. We first exploit a flexible estimator for the survival model parameters which does not require specification of the baseline hazard function. To improve the efficiency, we further develop an augmented nonparametric maximum likelihood estimator. We establish asymptotic results and examine the efficiency and robustness issues for the proposed estimators. The proposed methods enjoy the appealing feature that the distributions of the covariates and of the truncation times are left unspecified. Numerical studies are reported to assess the finite sample performance of the proposed methods.


17.
It is rather challenging for current variable selectors to handle situations where the number of covariates under consideration is ultra-high. Consider a motivating clinical trial of the drug bortezomib for the treatment of multiple myeloma, where overall survival and expression levels of 44760 probesets were measured for each of 80 patients with the goal of identifying genes that predict survival after treatment. This dataset defies analysis even with regularized regression. Some remedies have been proposed for the linear model and for generalized linear models, but there are few solutions in the survival setting and, to our knowledge, no theoretical support. Furthermore, existing strategies often involve tuning parameters that are difficult to interpret. In this paper we propose and theoretically justify a principled method for reducing dimensionality in the analysis of censored data by selecting only the important covariates. Our procedure involves a tuning parameter that has a simple interpretation as the desired false positive rate of this selection. We present simulation results and apply the proposed procedure to analyze the aforementioned myeloma study.
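The flavour of covariate screening described here can be illustrated with a simple marginal-correlation ranking (a sure-independence-screening-style sketch on uncensored synthetic data, not the paper's censored-data procedure or its false-positive-rate tuning):

```python
import random

def marginal_screen(X, y, keep):
    """Rank covariates by absolute marginal correlation with y and
    return the indices of the top `keep` covariates."""
    n = len(y)
    my = sum(y) / n
    sy = sum((v - my) ** 2 for v in y) ** 0.5
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        mx = sum(col) / n
        sx = sum((v - mx) ** 2 for v in col) ** 0.5
        sxy = sum((a - mx) * (b - my) for a, b in zip(col, y))
        scores.append(abs(sxy / (sx * sy)) if sx > 0 else 0.0)
    order = sorted(range(len(scores)), key=lambda j: -scores[j])
    return order[:keep]

random.seed(1)
n, p = 100, 200   # far fewer covariates than the 44760 probesets, for speed
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
# Synthetic response driven by covariate 7 only (plus noise).
y = [row[7] * 3.0 + random.gauss(0, 0.5) for row in X]
print(marginal_screen(X, y, keep=10))
```

The planted covariate 7 dominates the ranking; the paper's contribution is doing this kind of selection validly for censored survival responses, with the tuning parameter controlling the false positive rate of the selection.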

18.
A three-stage model for the induction of bone tumors in beagles from single injections of 239Pu is presented. The model involves branching with one or two initiation steps controlled by radiation. It was applied to six groups of animals with graded injection levels between 0.38 and 33.6 kBq/kg body mass, assuming an endosteal dose rate constant in time and proportional to the injected activity. The density of osteosarcoma survival times is Weibull with shape factor three and the scale factor being a function of the injected activity. The fraction of animals with bone tumors can be predicted using the survivor function of the controls, but some systematic deviations occur between calculations and observed fractions in an intermediate dosage range. It is shown that for low levels of 239Pu the fraction of animals with tumors is proportional to the injected amount of plutonium.

19.
Multivariate survival analysis comprises event times that are generally grouped together in clusters. Observations in each of these clusters relate to data belonging to the same individual or to individuals with a common factor. Frailty models can be used when there is unaccounted association between survival times of a cluster. The frailty variable describes the heterogeneity in the data caused by unknown covariates or randomness in the data. In this article, we use the generalized gamma distribution to describe the frailty variable and discuss the Bayesian method of estimation for the parameters of the model. The baseline hazard function is assumed to follow the two-parameter Weibull distribution. Data are simulated from the given model and the Metropolis–Hastings MCMC algorithm is used to obtain parameter estimates. It is shown that increasing the size of the dataset improves estimates. It is also shown that high heterogeneity within clusters does not affect the estimates of treatment effects significantly. The model is also applied to a real-life dataset.
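A minimal Metropolis–Hastings sketch, applied to a deliberately simpler conjugate model (exponential lifetimes with a gamma prior on the rate) so the sampler can be checked against the closed-form posterior; it is not the generalized gamma frailty model of the paper, and all values are illustrative.

```python
import math
import random

def log_post(lam, data, a0=2.0, b0=1.0):
    """Log posterior (up to a constant) for an exponential rate lam with
    Gamma(a0, b0) prior: the posterior is Gamma(a0 + n, b0 + sum(data))."""
    if lam <= 0:
        return -math.inf
    return (a0 + len(data) - 1) * math.log(lam) - (b0 + sum(data)) * lam

def metropolis_hastings(data, iters=20000, step=0.3, seed=0):
    """Random-walk Metropolis–Hastings; returns the post-burn-in chain."""
    random.seed(seed)
    lam, chain = 1.0, []
    for _ in range(iters):
        prop = lam + random.gauss(0.0, step)   # symmetric proposal
        if math.log(random.random()) < log_post(prop, data) - log_post(lam, data):
            lam = prop
        chain.append(lam)
    return chain[iters // 2:]                  # discard first half as burn-in

data = [0.5] * 20                              # toy survival times
chain = metropolis_hastings(data)
# Analytic posterior is Gamma(22, 11), so the posterior mean is 22/11 = 2.0.
print(round(sum(chain) / len(chain), 2))
```

The frailty-model version in the paper targets a much higher-dimensional posterior (regression, baseline Weibull, and frailty parameters jointly), but the accept/reject mechanics are the same.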

20.
The iterative convex minorant (ICM) algorithm proposed by Groeneboom and Wellner is fast in computing the NPMLE of the distribution function for interval-censored data without covariates. We reformulate the ICM as a generalized gradient projection method (GGP), which leads to a natural extension to the Cox model. It is also easily extended to support Tibshirani's lasso method. Some simulation results are also shown. For illustration, we reanalyze two real datasets.
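The convex-minorant computation at the heart of ICM is closely related to isotonic regression; a minimal pool-adjacent-violators (PAVA) sketch (the monotone-fit building block, not the full ICM/GGP algorithm for censored data) is:

```python
def pava(y, w=None):
    """Pool-adjacent-violators: weighted least-squares fit of a
    nondecreasing sequence to y; this monotone projection is the
    building block behind convex-minorant computations such as ICM."""
    if w is None:
        w = [1.0] * len(y)
    # Each block stores [weighted mean, total weight, number of points].
    blocks = []
    for yi, wi in zip(y, w):
        blocks.append([yi, wi, 1])
        # Merge adjacent blocks while monotonicity is violated.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            wt = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / wt, wt, n1 + n2])
    out = []
    for m, _, n in blocks:
        out.extend([m] * n)
    return out

print(pava([1.0, 3.0, 2.0, 4.0]))   # [1.0, 2.5, 2.5, 4.0]
```

ICM iterates a step of this kind on a working quadratic approximation of the (interval-censored) log-likelihood, which is what the GGP reformulation makes explicit.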
