Similar Literature
20 similar documents found.
1.
It is no longer uncommon these days to find the need in actuarial practice to model claim counts from multiple types of coverage, such as the ratemaking process for bundled insurance contracts. Since different types of claims are conceivably correlated with each other, multivariate count regression models that emphasize the dependency among claim types are more helpful for inference and prediction purposes. Motivated by the characteristics of an insurance dataset, we investigate alternative approaches to constructing multivariate count models based on the negative binomial distribution. A classical approach to inducing correlation is to employ common shock variables. However, this formulation relies on the NB-I distribution, which is restrictive for dispersion modeling. To address these issues, we consider two different methods of modeling multivariate claim counts using copulas. The first works with the discrete count data directly, using a mixture of max-id copulas that allows for flexible pairwise association as well as tail and global dependence. The second employs elliptical copulas to join continuitized data while preserving the dependence structure of the original counts. The empirical analysis examines a portfolio of auto insurance policies from a Singapore insurer where the claim frequencies of three types of claims (third party property damage, own damage, and third party bodily injury) are considered. The results demonstrate the superiority of the copula-based approaches over the common shock model. Finally, we implement the various models in loss prediction applications.
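As a rough, minimal sketch of the second (elliptical-copula) construction mentioned in this abstract, the snippet below joins two negative binomial margins with a Gaussian copula and checks the induced rank correlation. The copula correlation and the negative binomial parameters are arbitrary illustrative values, not estimates from the Singapore portfolio, and the snippet is a simulation of the dependence structure rather than the paper's estimation procedure.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n = 5000

# Latent Gaussian copula with an assumed correlation
rho = 0.6
cov = np.array([[1.0, rho], [rho, 1.0]])
z = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=n)
u = stats.norm.cdf(z)                              # dependent uniform marginals

# Negative binomial margins for two claim types (assumed parameters);
# scipy's nbinom uses (n, p) with mean n * (1 - p) / p
claims_pd = stats.nbinom.ppf(u[:, 0], n=2.0, p=0.5).astype(int)   # e.g. property damage
claims_od = stats.nbinom.ppf(u[:, 1], n=1.5, p=0.6).astype(int)   # e.g. own damage

print("empirical Spearman correlation:",
      round(stats.spearmanr(claims_pd, claims_od)[0], 3))
```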

2.
Factor models for multivariate count data
We develop a general class of factor-analytic models for the analysis of multivariate (truncated) count data. Dependencies in multivariate counts are of interest in many applications, but few approaches have been proposed for their analysis. Our model class allows for a variety of distributions of the factors in the exponential family. The proposed framework includes a large number of previously proposed factor and random effect models as special cases and leads to many new models that have not been considered so far. Whereas previously these models were proposed separately as different cases, our framework unifies them and enables one to study them simultaneously. We estimate the Poisson factor models with the method of simulated maximum likelihood. A Monte Carlo study investigates the performance of this approach in terms of estimation bias and precision. We illustrate the approach in an analysis of TV channel data.
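The following minimal sketch (not the paper's simulated maximum likelihood estimator) illustrates the basic mechanism of a one-factor Poisson model: a shared latent factor entering through a log link induces correlation among several count outcomes. The intercepts and loadings are arbitrary assumed values.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 1000, 4                                    # subjects, count outcomes

# Assumed intercepts and factor loadings for a one-factor Poisson model
intercepts = np.array([0.5, 1.0, 0.2, 0.8])
loadings = np.array([0.6, 0.3, 0.9, 0.4])

factor = rng.normal(size=(n, 1))                  # latent factor, N(0, 1)
rates = np.exp(intercepts + factor * loadings)    # log link: shared factor -> dependence
counts = rng.poisson(rates)                       # n x p matrix of correlated counts

print(np.corrcoef(counts, rowvar=False).round(2)) # all pairwise correlations positive
```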

3.
Quantile regression estimates the relationship between a quantile of the response distribution and the regression parameters, and has been developed for linear models with continuous responses. In this paper, we apply a Bayesian quantile regression model to Malaysian motor insurance claim count data to study the effects of changes in the estimates of the regression parameters (or rating factors) on the magnitude of the response variable (or claim count). We also compare the results of quantile regression models from the Bayesian and frequentist approaches, and the results of mean regression models from the Poisson and negative binomial. Comparison of the Poisson and Bayesian quantile regression models shows that the effect of vehicle year decreases as the quantile increases, suggesting that this rating factor carries lower risk for higher claim counts. On the other hand, the effect of vehicle type increases as the quantile increases, indicating that this rating factor carries higher risk for higher claim counts.
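For intuition, the sketch below fits a plain frequentist quantile regression (via statsmodels) to simulated claim counts at several quantile levels and prints how the coefficients change with the quantile, mimicking the kind of comparison described in the abstract. It is not the Bayesian estimator of the paper, and the rating factors (vehicle_year, vehicle_type) and their coefficients are made-up illustrative quantities, not the Malaysian data.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 2000

# Hypothetical rating factors and simulated claim counts (not the Malaysian data)
vehicle_year = rng.integers(0, 15, size=n)        # vehicle age in years
vehicle_type = rng.integers(0, 2, size=n)         # 0 = type A, 1 = type B
rate = np.exp(0.3 + 0.04 * vehicle_year + 0.5 * vehicle_type)
claims = rng.poisson(rate)

df = pd.DataFrame({"claims": claims,
                   "vehicle_year": vehicle_year,
                   "vehicle_type": vehicle_type})

# Frequentist quantile regression at several quantile levels; the point is to
# see how the estimated effects change across quantiles.
for q in (0.5, 0.75, 0.9):
    fit = smf.quantreg("claims ~ vehicle_year + vehicle_type", df).fit(q=q)
    print(q, fit.params.round(3).to_dict())
```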

4.
In this article, we propose and explore a multivariate logistic regression model for analyzing multiple binary outcomes with incomplete covariate data where auxiliary information is available. The auxiliary data are extraneous to the regression model of interest but predictive of the covariate with missing data. Horton and Laird [N.J. Horton, N.M. Laird, Maximum likelihood analysis of logistic regression models with incomplete covariate data and auxiliary information, Biometrics 57 (2001) 34–42] describe how the auxiliary information can be incorporated into a regression model for a single binary outcome with missing covariates, and hence how the efficiency of the regression estimators can be improved. We consider extending the method of [9] to the case of a multivariate logistic regression model for multiple correlated outcomes, with missing covariates and completely observed auxiliary information. We demonstrate that, in the case of moderate to strong associations among the multiple outcomes, one can achieve considerable gains in efficiency from estimators in a multivariate model compared to the marginal estimators of the same parameters.

5.
The general multivariate analysis of variance model has been extensively studied in the statistical literature and successfully applied in many different fields for analyzing longitudinal data. In this article, we consider the extension of this model having two sets of regressors constituting a growth curve portion and a multivariate analysis of variance portion, respectively. Nowadays, the data collected in empirical studies often have relatively complex structures while demanding parsimonious modeling. This can be achieved, for example, by imposing rank constraints on the regression coefficient matrices. The reduced rank regression structure also provides a theoretical interpretation in terms of latent variables. We derive likelihood-based estimators for the mean parameters and covariance matrix in models of this type. A numerical example is provided to illustrate the obtained results.

6.
To predict future claims, it is well known that the most recent claims are more predictive than older ones. However, classic panel data models for claim counts, such as the multivariate negative binomial distribution, do not put any time weight on past claims. More complex models can be used to account for this property, but they often require numerical procedures to estimate the parameters. When we want to add dependence between different claim count types, the task becomes even more difficult to handle. In this paper, we propose a bivariate dynamic model for claim counts, where the past claims experience of a given claim type is used to better predict the other type of claims. This new bivariate dynamic distribution for claim counts is based on random effects that come from the Sarmanov family of multivariate distributions. To obtain a proper dynamic distribution based on this kind of bivariate prior, an approximation of the posterior distribution of the random effects is proposed. The resulting model can be seen as an extension of the dynamic heterogeneity model described in Bolancé et al. (2007). We apply this model to two samples of data from a major Canadian insurance company, where we show that the proposed model is one of the best at fitting the data. We also show that the proposed model allows more flexibility in computing predictive premiums because closed-form expressions can easily be derived for the predictive distribution, the moments, and the predictive moments.

7.
This paper studies influence analysis for ridge estimation in the multivariate linear regression model. Using the least squares estimation method, we obtain several relationships between the ridge estimates of the parameter matrices of the multivariate covariance-perturbation model and of the original model, and derive a generalized Cook's distance based on the ridge estimator for measuring the magnitude of influence.
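Purely as an illustration of the idea of an influence measure based on ridge estimation, here is a hypothetical univariate-response sketch: it computes a ridge estimate, deletes each case in turn, and measures the shift in the ridge estimate with a quadratic form. The metric (X'X + kI), the scaling by the residual variance, and the ridge parameter k are assumed choices for illustration; the paper's generalized Cook's distance for the multivariate model may differ.

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 60, 4
X = rng.normal(size=(n, p))
beta_true = np.array([1.0, -0.5, 0.0, 2.0])
y = X @ beta_true + rng.normal(scale=0.5, size=n)

k = 1.0                                            # assumed ridge parameter
A = X.T @ X + k * np.eye(p)
beta_ridge = np.linalg.solve(A, X.T @ y)

# Illustrative generalized Cook's distance: quadratic form in the change of the
# ridge estimate under case deletion, with metric A and residual-variance scaling
# (both assumed choices, not necessarily the paper's definition).
resid = y - X @ beta_ridge
sigma2 = resid @ resid / (n - p)
cook = np.empty(n)
for i in range(n):
    keep = np.arange(n) != i
    Xi, yi = X[keep], y[keep]
    beta_i = np.linalg.solve(Xi.T @ Xi + k * np.eye(p), Xi.T @ yi)
    d = beta_i - beta_ridge
    cook[i] = d @ A @ d / (p * sigma2)

print("most influential cases:", np.argsort(cook)[-3:])
```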

8.
A multivariate normal statistical model defined by the Markov properties determined by an acyclic digraph admits a recursive factorization of its likelihood function (LF) into the product of conditional LFs, each factor having the form of a classical multivariate linear regression model (≡ MANOVA model). Here these models are extended in a natural way to normal linear regression models whose LFs continue to admit such recursive factorizations, from which maximum likelihood estimators and likelihood ratio (LR) test statistics can be derived by classical linear methods. The central distribution of the LR test statistic for testing one such multivariate normal linear regression model against another is derived, and the relation of these regression models to block-recursive normal linear systems is established. It is shown how a collection of nonnested dependent normal linear regression models (≡ seemingly unrelated regressions) can be combined into a single multivariate normal linear regression model by imposing a parsimonious set of graphical Markov (≡ conditional independence) restrictions.

9.
In count data regression there can be several problems that prevent the use of the standard Poisson log-linear model: overdispersion caused by unobserved heterogeneity or correlation, an excess of zeros, non-linear effects of continuous covariates or of time scales, and spatial effects. We develop Bayesian count data models that can deal with these issues simultaneously and within a unified inferential approach. Models for overdispersed or zero-inflated data are combined with semiparametrically structured additive predictors, resulting in a rich class of count data regression models. Inference is fully Bayesian and is carried out by computationally efficient MCMC techniques. Simulation studies investigate performance, in particular how well different model components can be identified. Applications to patent data and to data from a car insurer illustrate the potential and, to some extent, the limitations of our approach.
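As a small self-contained illustration of why excess zeros break the standard Poisson model, the simulation below generates zero-inflated Poisson data (with assumed mixing probability and rate) and compares the observed zero fraction and variance with what a Poisson model with the same mean would imply. It is only a data-generating sketch, not the Bayesian structured additive model of the paper.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 10000

# Zero-inflated Poisson: with probability pi an observation is a structural zero,
# otherwise it is Poisson(lam). pi and lam are arbitrary illustrative values.
pi, lam = 0.3, 2.0
structural_zero = rng.random(n) < pi
y = np.where(structural_zero, 0, rng.poisson(lam, size=n))

mean, var = y.mean(), y.var()
print(f"mean={mean:.2f}, variance={var:.2f} (overdispersed: var > mean)")
print(f"observed zero fraction: {np.mean(y == 0):.2f}")
print(f"zero fraction implied by Poisson({mean:.2f}): {np.exp(-mean):.2f}")
```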

10.
Real count data time series often show an excessive number of zeros, which can form quite different patterns. We develop four extensions of the binomial autoregressive model for autocorrelated counts with a bounded support, which can accommodate a broad variety of zero patterns. The stochastic properties of these models are derived, and ways of parameter estimation and model identification are discussed. The usefulness of the models is illustrated by several applications, including one to the monetary policy decisions of the National Bank of Poland.
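The sketch below simulates the standard binomial AR(1) model via binomial thinning, which is the base model that the paper's four extensions build on; the support bound, marginal probability, and autocorrelation are assumed illustrative values, and none of the zero-pattern extensions are implemented here.

```python
import numpy as np

rng = np.random.default_rng(11)

# Standard binomial AR(1) via binomial thinning (the base model only)
N = 10        # upper bound of the support
pi = 0.4      # stationary marginal success probability (assumed)
rho = 0.5     # autocorrelation parameter (assumed)
beta = pi * (1 - rho)
alpha = beta + rho

T = 500
x = np.empty(T, dtype=int)
x[0] = rng.binomial(N, pi)
for t in range(1, T):
    # survivors of last period's count plus newly "switched on" units
    x[t] = rng.binomial(x[t - 1], alpha) + rng.binomial(N - x[t - 1], beta)

print("sample mean:", x.mean(), "theoretical mean:", N * pi)
print("lag-1 autocorrelation:", np.corrcoef(x[:-1], x[1:])[0, 1].round(2))
```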

11.
An essential feature of longitudinal data is the existence of autocorrelation among the observations from the same unit or subject. Two-stage random-effects linear models are commonly used to analyze longitudinal data. These models are not flexible enough, however, for exploring the underlying data structures and, especially, for describing time trends. Semi-parametric models have been proposed recently to accommodate general time trends, but they do not provide a convenient way to explore interactions between time and other covariates, although such interactions exist in many applications. Moreover, semi-parametric models require specifying the design matrix of the covariates (time excluded). We propose nonparametric models to resolve these issues. To fit the nonparametric models, we use multivariate adaptive regression splines to estimate the mean curve and then apply an EM-like iterative procedure for covariance estimation. After giving a general model-building algorithm, we show how to design a fast algorithm. We use both simulated and published data to illustrate the use of the proposed method.

12.
Relational event data, which consist of events involving pairs of actors over time, are now commonly available at the finest of temporal resolutions. Existing continuous-time methods for modeling such data are based on point processes and directly model interaction "contagion," whereby one interaction increases the propensity of future interactions among actors, often as dictated by some latent variable structure. In this article, we present an alternative approach to using temporal-relational point process models for continuous-time event data. We characterize interactions between a pair of actors as either spurious or as resulting from an underlying, persistent connection in a latent social network. We argue that consistent deviations from expected behavior, rather than solely high frequency counts, are crucial for identifying well-established underlying social relationships. This study explores these latent network structures in two contexts: one involving college students and another involving barn swallows.

13.
Many survival studies record the times to two or more distinct failures on each subject. The failures may be events of different natures or may be repetitions of the same kind of event. In this article, we consider the regression analysis of such multivariate failure time data under the additive hazards model. Simple weighted estimating functions for the regression parameters are proposed, and the asymptotic distribution theory of the resulting estimators is derived. In addition, a class of generalized Wald and generalized score statistics for hypothesis testing and model selection are presented, and the asymptotic properties of these statistics are examined.

14.
The correlation matrix (denoted by R) plays an important role in many statistical models. Unfortunately, sampling the correlation matrix in Markov chain Monte Carlo (MCMC) algorithms can be problematic. In addition to the positive definite constraint of covariance matrices, correlation matrices have diagonal elements fixed at one. In this article, we propose an efficient two-stage parameter-expanded reparameterization and Metropolis-Hastings (PX-RPMH) algorithm for simulating R. Using this algorithm, we draw all elements of R simultaneously by first drawing a covariance matrix from an inverse Wishart distribution, then translating it back to a correlation matrix through a reduction function and accepting it based on a Metropolis-Hastings acceptance probability. The algorithm is illustrated using multivariate probit (MVP) models and multivariate regression (MVR) models with a common correlation matrix across groups. Via both a simulation study and a real data example, the performance of the PX-RPMH algorithm is compared with that of other common algorithms. The results show that the PX-RPMH algorithm is more efficient than other methods for sampling a correlation matrix.
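To make the reduction step concrete, the sketch below draws a covariance matrix from an inverse Wishart distribution and maps it to a correlation matrix with unit diagonal. The degrees of freedom and scale matrix are assumed values, and the model-specific Metropolis-Hastings acceptance step of the full PX-RPMH algorithm is deliberately omitted.

```python
import numpy as np
from scipy.stats import invwishart

d = 3

# Proposal/reduction step only: draw a covariance matrix from an inverse Wishart
# distribution and map it to a correlation matrix with unit diagonal. The
# model-specific Metropolis-Hastings acceptance of the full algorithm is omitted.
df = d + 2                                         # assumed degrees of freedom
scale = np.eye(d)                                  # assumed scale matrix
sigma = invwishart.rvs(df=df, scale=scale, random_state=42)

sd_inv = np.diag(1.0 / np.sqrt(np.diag(sigma)))
R_prop = sd_inv @ sigma @ sd_inv                   # reduction: Sigma -> R

print(np.round(R_prop, 3))                         # symmetric, unit diagonal
```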

15.
Structural test in regression on functional variables
Many papers deal with structural testing procedures in multivariate regression. More recently, various estimators have been proposed for regression models involving functional explanatory variables. Thanks to these new estimators, we propose a theoretical framework for structural testing procedures adapted to functional regression. The procedures introduced in this paper are innovative and make the link between former works on functional regression and others on structural testing procedures in multivariate regression. We prove asymptotic properties of the level and the power of our procedures under general assumptions that cover a large scope of possible applications: tests for no effect, linearity, dimension reduction, …

16.
With advances in computer storage capacity and online observation techniques, more and more data are collected as curves and images. The two most important features of curve and image data are high dimensionality and strong correlation between adjacent observations. Functional data analysis is better suited to such data, which cannot be handled by traditional multivariate statistical methods. Recently, a variety of functional data methods have been developed, including curve alignment, principal component analysis, regression, classification, and clustering. In this paper, we review the origins, development, and recent progress of functional data analysis. Specifically, we first introduce the notion of functional data. Second, we present functional principal component analysis. We then discuss estimation, variable selection, and hypothesis testing for functional regression models. Lastly, the paper concludes with a brief discussion of future directions.
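As a minimal illustration of functional principal component analysis, the sketch below generates synthetic curves observed on a common grid and extracts the leading modes of variation by eigendecomposing the sample covariance of the discretized, centered curves. The curve model and noise level are assumed for illustration; smoothing and irregular sampling, which practical FPCA often requires, are ignored.

```python
import numpy as np

rng = np.random.default_rng(19)
n_curves, n_grid = 200, 50
t = np.linspace(0, 1, n_grid)

# Synthetic functional data: smooth mean plus two random modes of variation plus noise
scores_true = rng.normal(size=(n_curves, 2)) * np.array([1.0, 0.4])
basis = np.vstack([np.sin(2 * np.pi * t), np.cos(2 * np.pi * t)])
X = np.sin(np.pi * t) + scores_true @ basis + rng.normal(scale=0.1, size=(n_curves, n_grid))

# FPCA on a common grid reduces to PCA of the discretized curves:
# eigendecompose the sample covariance of the centered curves.
Xc = X - X.mean(axis=0)
cov = Xc.T @ Xc / (n_curves - 1)
eigvals, eigvecs = np.linalg.eigh(cov)
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]   # sort descending

explained = eigvals / eigvals.sum()
fpc_scores = Xc @ eigvecs[:, :2]                     # scores on the first two FPCs
print("variance explained by first two components:", explained[:2].round(3))
```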

17.
Developing models to predict tree mortality using data from long-term repeated-measurement data sets can be difficult and challenging due to the nature of mortality as well as the effects of dependence among observations. Marginal (population-averaged) generalized estimating equations (GEE) and random-effects (subject-specific) models offer two possible ways to overcome these effects. For this study, standard logistic, marginal logistic based on the GEE approach, and random-effects logistic regression models were fitted and compared. In addition, four model evaluation statistics were calculated by means of K-fold cross-validation: the mean prediction error, the mean absolute prediction error, the variance of the prediction error, and the mean square error. Results from this study suggest that the random-effects model produced the smallest evaluation statistics among the three models. Although marginal logistic regression accounted for correlations between observations, it did not provide a noticeable improvement in model performance compared to the standard logistic regression model that assumed independence. This study indicates that the random-effects model was able to increase the overall accuracy of mortality modeling. Moreover, it was able to account for correlation arising from the hierarchical data structure as well as serial correlation generated through repeated measurements.
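For a concrete sense of the comparison described above, the sketch below simulates clustered binary mortality data with a plot-level random effect and fits both a standard logistic regression and a marginal GEE logistic regression with an exchangeable working correlation using statsmodels. The plot structure, covariate, and coefficients are invented for illustration and are not the study's forest data; the random-effects (mixed) model is omitted for brevity.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
n_plots, n_trees = 100, 20                 # hypothetical plots with repeated trees

# Cluster-level effect induces within-plot correlation in mortality
plot_effect = rng.normal(scale=1.0, size=n_plots)
dbh = rng.normal(25, 8, size=(n_plots, n_trees))          # tree diameter (cm), assumed covariate
logit = -2.0 + 0.05 * (30 - dbh) + plot_effect[:, None]
dead = rng.binomial(1, 1 / (1 + np.exp(-logit)))

y = dead.ravel()
X = sm.add_constant(dbh.ravel())
groups = np.repeat(np.arange(n_plots), n_trees)

# Standard logistic regression (ignores clustering) vs. marginal GEE logistic
# regression with an exchangeable working correlation
glm = sm.GLM(y, X, family=sm.families.Binomial()).fit()
gee = sm.GEE(y, X, groups=groups, family=sm.families.Binomial(),
             cov_struct=sm.cov_struct.Exchangeable()).fit()
print("GLM coef:", glm.params.round(3), "SE:", glm.bse.round(3))
print("GEE coef:", gee.params.round(3), "SE:", gee.bse.round(3))
```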

18.
Correspondence analysis, a data analytic technique used to study two‐way cross‐classifications, is applied to social relational data. Such data are frequently termed “sociometric” or “network” data. The method allows one to model forms of relational data and types of empirical relationships not easily analyzed using either standard social network methods or common scaling or clustering techniques. In particular, correspondence analysis allows one to model:

—two‐mode networks (rows and columns of a sociomatrix refer to different objects)

—valued relations (e.g. counts, ratings, or frequencies).

In general, the technique provides scale values for row and column units, visual presentation of relationships among rows and columns, and criteria for assessing “dimensionality” or graphical complexity of the data and goodness-of-fit to particular models. Correspondence analysis has recently been the subject of research by Goodman, Haberman, and Gilula, who have termed their approach to the problem “canonical analysis” to reflect its similarity to canonical correlation analysis of continuous multivariate data. This generalization links the technique to more standard categorical data analysis models and provides a much-needed statistical justification.

We review both correspondence and canonical analysis, and present these ideas by analyzing relational data on the 1980 monetary donations from corporations to nonprofit organizations in the Minneapolis-St. Paul metropolitan area. We also show how these techniques are related to dyadic independence models, first introduced by Holland, Leinhardt, Fienberg, and Wasserman in the early 1980s. The highlight of this paper is the relationship between correspondence and canonical analysis and these dyadic independence models, which are designed specifically for relational data. The paper concludes with a discussion of this relationship and some data analyses that illustrate the fact that correspondence analysis models can be used as approximate dyadic independence models.
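The following sketch carries out a basic correspondence analysis by singular value decomposition of the matrix of standardized residuals of a two-mode count table; the table is synthetic, not the 1980 donation data analyzed in the paper, and the snippet covers only the scaling step, not the canonical-analysis or dyadic independence models discussed above.

```python
import numpy as np

rng = np.random.default_rng(17)

# Synthetic two-mode table (e.g. corporations x nonprofits, cell = donation count)
N = rng.poisson(3.0, size=(6, 4)).astype(float)

P = N / N.sum()                        # correspondence matrix
r = P.sum(axis=1)                      # row masses
c = P.sum(axis=0)                      # column masses
S = np.diag(r ** -0.5) @ (P - np.outer(r, c)) @ np.diag(c ** -0.5)

U, sv, Vt = np.linalg.svd(S, full_matrices=False)
row_scores = np.diag(r ** -0.5) @ U * sv      # principal coordinates of rows
col_scores = np.diag(c ** -0.5) @ Vt.T * sv   # principal coordinates of columns

print("inertia explained by first two dimensions:",
      round((sv[:2] ** 2).sum() / (sv ** 2).sum(), 3))
```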

19.
We propose a flexible class of models based on scale mixture of uniform distributions to construct shrinkage priors for covariance matrix estimation. This new class of priors enjoys a number of advantages over the traditional scale mixture of normal priors, including its simplicity and flexibility in characterizing the prior density. We also exhibit a simple, easy to implement Gibbs sampler for posterior simulation, which leads to efficient estimation in high-dimensional problems. We first discuss the theory and computational details of this new approach and then extend the basic model to a new class of multivariate conditional autoregressive models for analyzing multivariate areal data. The proposed spatial model flexibly characterizes both the spatial and the outcome correlation structures at an appealing computational cost. Examples consisting of both synthetic and real-world data show the utility of this new framework in terms of robust estimation as well as improved predictive performance. Supplementary materials are available online.

20.
In this article, we consider the problem of testing a linear hypothesis in a multivariate linear regression model, which includes the case of testing the equality of the mean vectors of several multivariate normal populations with a common covariance matrix Σ, the so-called multivariate analysis of variance (MANOVA) problem. Here, however, we have fewer observations than the dimension of the random vectors. Two tests are proposed, and their asymptotic distributions under the hypothesis as well as under the alternatives are given under some mild conditions. A theoretical comparison of their powers is made.
