Similar literature
20 similar articles found.
1.
In this paper a comparative evaluation study of popular non-homogeneous Poisson models for count data is performed. For the study, the standard homogeneous Poisson model (HOM) and three non-homogeneous variants, namely a Poisson changepoint model (CPS), a Poisson free mixture model (MIX), and a Poisson hidden Markov model (HMM), are implemented in two conceptual frameworks: a frequentist and a Bayesian one. This yields eight models in total, and the goal of the study is to shed some light on their relative merits and shortcomings. The first major objective is to cross-compare the performance of the four models (HOM, CPS, MIX and HMM) independently within each modelling framework (Bayesian and frequentist). Subsequently, a pairwise comparison between the four Bayesian and the four frequentist models is performed to elucidate to what extent the results of the two paradigms (‘Bayesian vs. frequentist’) differ. The evaluation study is performed on various synthetic Poisson data sets as well as on real-world taxi pick-up counts extracted from the recently published New York City Taxi database.
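As one concrete point of reference, the following is a minimal sketch of the frequentist version of one of the four model classes compared above, a two-component Poisson free mixture (MIX) fitted by EM; the synthetic data, number of components, initial values and iteration count are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.stats import poisson

rng = np.random.default_rng(0)
# Synthetic counts from a two-component Poisson mixture (illustrative data).
y = np.concatenate([rng.poisson(3.0, 300), rng.poisson(9.0, 200)])

# EM for a K = 2 Poisson mixture: responsibilities in the E-step,
# weighted updates of the rates and mixing weights in the M-step.
lam = np.array([2.0, 10.0])   # initial component rates
pi = np.array([0.5, 0.5])     # initial mixing weights
for _ in range(200):
    # E-step: posterior probability of each component for each count.
    log_resp = np.log(pi) + poisson.logpmf(y[:, None], lam)
    log_resp -= log_resp.max(axis=1, keepdims=True)
    resp = np.exp(log_resp)
    resp /= resp.sum(axis=1, keepdims=True)
    # M-step: update mixing weights and Poisson rates.
    nk = resp.sum(axis=0)
    pi = nk / len(y)
    lam = (resp * y[:, None]).sum(axis=0) / nk

print("estimated rates:", lam, "mixing weights:", pi)
```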

2.
Factor models for multivariate count data
We develop a general class of factor-analytic models for the analysis of multivariate (truncated) count data. Dependencies in multivariate counts are of interest in many applications, but few approaches have been proposed for their analysis. Our model class allows for a variety of distributions of the factors in the exponential family. The proposed framework includes a large number of previously proposed factor and random effect models as special cases and leads to many new models that have not been considered so far. Whereas previously these models were proposed separately as distinct cases, our framework unifies them and enables one to study them simultaneously. We estimate the Poisson factor models with the method of simulated maximum likelihood. A Monte Carlo study investigates the performance of this approach in terms of estimation bias and precision. We illustrate the approach with an analysis of TV channel data.
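To make the model class concrete, here is a minimal sketch of a one-factor Poisson model of the kind the framework covers, where dependence between the counts enters only through a latent factor, together with a Monte Carlo evaluation of the marginal likelihood (the idea behind simulated maximum likelihood). The intercepts, loadings and number of draws are illustrative assumptions, not the paper's estimator.

```python
import numpy as np
from scipy.stats import poisson

rng = np.random.default_rng(1)
a = np.array([0.5, 1.0, 0.2])   # item intercepts (illustrative)
b = np.array([0.8, 0.3, 1.2])   # factor loadings (illustrative)

# Simulate multivariate counts: y_ij | f_i ~ Poisson(exp(a_j + b_j f_i)), f_i ~ N(0, 1).
n = 1000
f = rng.standard_normal(n)
mu = np.exp(a + np.outer(f, b))   # n x 3 matrix of Poisson means
Y = rng.poisson(mu)

# Simulated (Monte Carlo) marginal log-likelihood of one observation y:
# the latent factor is integrated out by averaging over random draws.
def sim_loglik(y, a, b, n_draws=5000, rng=rng):
    f_draws = rng.standard_normal(n_draws)
    mu_draws = np.exp(a + np.outer(f_draws, b))   # n_draws x 3
    return np.log(np.mean(np.prod(poisson.pmf(y, mu_draws), axis=1)))

print(sim_loglik(Y[0], a, b))
```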

3.
The paper considers an infinite discrete-time buffer system with a single output channel. Unlike most analyses of such buffer systems, the present study uses a ‘dynamic’ model for the arrival process of data units into the system. More specifically, the distribution of the number of arrivals per discrete time-unit is allowed to fluctuate in time in a periodic fashion, whereas in classical models this distribution remains the same as time goes by. The probability generating function of the number of data units in the buffer, at various time instants, is derived under such dynamic arrival conditions. An extended illustrative example, comparing ‘static’ and ‘dynamic’ arrival models, concludes the paper.

4.
Our paper presents an empirical analysis of the association between firm attributes in electronic retailing and the adoption of information initiatives in mobile retailing. In our attempt to analyze the collected data, we find that the count of information initiatives exhibits underdispersion. In addition, zero-truncation arises from our study design. To tackle these two issues, we test four zero-truncated (ZT) count data models: binomial, Poisson, Conway–Maxwell–Poisson, and Consul's generalized Poisson. We observe that the ZT Poisson model has a much inferior fit compared with the other three models. Interestingly, even though the ZT binomial distribution is the only model that explicitly takes into account the finite range of our count variable, it is still outperformed by the other two Poisson mixtures, which turn out to be good approximations. Further, despite the rising popularity of the Conway–Maxwell–Poisson distribution in the recent literature, the ZT Consul's generalized Poisson distribution shows the best fit among all candidate models and lends support to one hypothesis. Because underdispersion is rarely addressed in IT and electronic commerce research, our study aims to encourage empirical researchers to adopt a flexible regression model in order to make a robust assessment of the impact of explanatory variables. Copyright © 2014 John Wiley & Sons, Ltd.
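For reference, the simplest of the four candidates, the zero-truncated Poisson model, conditions the ordinary Poisson law on a positive outcome; this is the standard textbook form, not anything specific to the paper's data:

```latex
P(Y = y \mid Y > 0) \;=\; \frac{e^{-\lambda}\,\lambda^{y}}{y!\,\bigl(1 - e^{-\lambda}\bigr)}, \qquad y = 1, 2, \dots
```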

5.
We analyze the concept of credibility for claim frequency in two generalized count models, the Mittag-Leffler and Weibull count models, which can handle both underdispersion and overdispersion in count data and nest the commonly used Poisson model as a special case. We find evidence, using data from a Danish insurance company, that the simple Poisson model can set the credibility weight to one even when only three years of individual experience data are available, as a consequence of the large heterogeneity among policyholders, and in doing so it can break down the credibility model. The generalized count models, on the other hand, allow the weight to adjust according to the number of years of experience available. We propose parametric estimators for the structural parameters in the credibility formula, using the mean and variance of the assumed distributions together with maximum likelihood estimation over the collective data. As an example, we show that the proposed parameters from the Mittag-Leffler model provide weights that are consistent with the idea of credibility. A simulation study is carried out to investigate the stability of the maximum likelihood estimates from the Weibull count model. Finally, we extend the analysis to multidimensional lines and explain how our approach can be used to select profitable customers in cross-selling: customers can now be selected by estimating a function of their unknown risk profiles, namely the mean of the assumed distribution of their number of claims.
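As background for the credibility weights discussed above, the classical Bühlmann formula shrinks an individual's claim experience towards the collective mean; the structural quantities below are what the paper's parametric estimators supply from the assumed count distributions (the notation here is the textbook one, not the paper's):

```latex
\hat{\mu}_i \;=\; Z\,\bar{X}_i + (1 - Z)\,\mu, \qquad
Z \;=\; \frac{n}{n + k}, \qquad
k \;=\; \frac{\mathbb{E}\bigl[\operatorname{Var}(X \mid \Theta)\bigr]}{\operatorname{Var}\bigl(\mathbb{E}[X \mid \Theta]\bigr)},
```

where n is the number of years of individual experience; a heavily heterogeneous portfolio makes k small, pushing Z towards one even for short experience histories, which is the breakdown described above.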

6.
The features used may have an important effect on the performance of credit scoring models. The process of choosing the best set of features for credit scoring models is usually unsystematic and dominated by somewhat arbitrary trial and error. This paper presents an empirical study of four machine learning feature selection methods. These methods provide an automatic data mining technique for reducing the feature space. The study illustrates how the four feature selection methods, the ‘ReliefF’, ‘Correlation-based’, ‘Consistency-based’ and ‘Wrapper’ algorithms, help to improve three aspects of the performance of scoring models: model simplicity, model speed and model accuracy. The experiments are conducted on real data sets using four classification algorithms: ‘model tree (M5)’, ‘neural network (multi-layer perceptron with back-propagation)’, ‘logistic regression’, and ‘k-nearest-neighbours’.
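As an illustration of the ‘Wrapper’ idea only (the other three methods are filter-style algorithms not sketched here), the following minimal example runs forward sequential selection around a cross-validated logistic-regression scorer; the synthetic data set, classifier and parameter choices are assumptions, not the paper's experimental setup.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for a credit-scoring data set (illustrative only).
X, y = make_classification(n_samples=1000, n_features=20, n_informative=5,
                           random_state=0)

# Wrapper-style selection: greedily add the feature that most improves
# the cross-validated accuracy of the wrapped classifier.
selector = SequentialFeatureSelector(
    LogisticRegression(max_iter=1000),
    n_features_to_select=5, direction="forward", cv=5)
selector.fit(X, y)

print("selected feature indices:", np.flatnonzero(selector.get_support()))
```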

7.
In count data regression there are several problems that can prevent the use of the standard Poisson log-linear model: overdispersion caused by unobserved heterogeneity or correlation, an excess of zeros, non-linear effects of continuous covariates or of time scales, and spatial effects. We develop Bayesian count data models that can deal with these issues simultaneously and within a unified inferential approach. Models for overdispersed or zero-inflated data are combined with semiparametrically structured additive predictors, resulting in a rich class of count data regression models. Inference is fully Bayesian and is carried out by computationally efficient MCMC techniques. Simulation studies investigate performance, in particular how well the different model components can be identified. Applications to patent data and to car insurance data illustrate the potential and, to some extent, the limitations of our approach. Copyright © 2006 John Wiley & Sons, Ltd.
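The zero-inflated Poisson building block that is combined with the additive predictors above has the standard form below, where π is the probability of an extra zero and covariate effects enter through π and λ (generic notation, not the paper's):

```latex
P(Y = 0) \;=\; \pi + (1 - \pi)\,e^{-\lambda}, \qquad
P(Y = y) \;=\; (1 - \pi)\,\frac{e^{-\lambda}\,\lambda^{y}}{y!}, \quad y = 1, 2, \dots
```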

8.
《Applied Mathematical Modelling》2014,38(9-10):2422-2434
An exact, closed-form minimum variance filter is designed for a class of discrete-time uncertain systems which allows for both multiplicative and additive noise sources. The multiplicative noise model includes a popular class of models in econometrics (Cox-Ingersoll-Ross type models). The parameters of the system under consideration which describe the state transition are assumed to be subject to stochastic uncertainties. The problem addressed is the design of a filter that minimizes the trace of the estimation error variance. Sensitivity of the new filter to the size of the parameter uncertainty, in terms of the variance of the parameter perturbations, is also considered. We refer to the new filter as the ‘perturbed Kalman filter’ (PKF), since it reduces to the traditional (or unperturbed) Kalman filter as the size of the stochastic perturbation approaches zero. We also consider a related approximate filtering heuristic for univariate time series and refer to the filter based on this heuristic as the approximate perturbed Kalman filter (APKF). We test the performance of the new filters on three simulated numerical examples and compare the results with the unperturbed Kalman filter, which ignores the uncertainty in the transition equation. Through the numerical examples, the PKF and APKF are shown to outperform the traditional (or unperturbed) Kalman filter in terms of the size of the estimation error when stochastic uncertainties are present, even when the size of the stochastic uncertainty is inaccurately identified.
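For orientation, here is a minimal sketch of the traditional (unperturbed) Kalman filter that the PKF reduces to as the stochastic perturbation vanishes; the scalar state-space model and noise levels are illustrative assumptions, and the PKF/APKF corrections themselves are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Scalar state-space model x_t = a*x_{t-1} + w_t, y_t = x_t + v_t (illustrative).
a, q, r = 0.9, 0.5, 1.0
T = 200
x = np.zeros(T)
y = np.zeros(T)
for t in range(1, T):
    x[t] = a * x[t - 1] + rng.normal(scale=np.sqrt(q))
    y[t] = x[t] + rng.normal(scale=np.sqrt(r))

# Standard Kalman recursion: predict, then correct with the Kalman gain.
x_hat, p = 0.0, 1.0
estimates = []
for t in range(T):
    x_pred = a * x_hat          # predicted state
    p_pred = a * a * p + q      # predicted error variance
    k = p_pred / (p_pred + r)   # Kalman gain
    x_hat = x_pred + k * (y[t] - x_pred)
    p = (1.0 - k) * p_pred
    estimates.append(x_hat)

print("RMSE:", np.sqrt(np.mean((np.array(estimates) - x) ** 2)))
```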

9.
Poisson regression models are widely used in the analysis of count data. Dean & Lawless (1989) and Dean (1992) discussed tests for the presence of overdispersion in count data obtained from non-repeated measurements. This paper studies tests for overdispersion in count data obtained from repeated measurements, using a random-coefficient model and a log-nonlinear model respectively, and derives the corresponding score test statistics.
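For orientation, one widely quoted (unadjusted) form of the score statistic for overdispersion in a Poisson regression with fitted means μ̂_i is given below; it is asymptotically standard normal under the Poisson null. The repeated-measures statistics derived in the paper are analogues of this construction, not this exact formula:

```latex
T \;=\; \frac{\sum_{i=1}^{n} \bigl[(y_i - \hat{\mu}_i)^{2} - y_i\bigr]}{\sqrt{2 \sum_{i=1}^{n} \hat{\mu}_i^{2}}}
```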

10.
By means of an original approach, called the ‘method of the moving frame’, we establish existence, uniqueness and stability results for mild and weak solutions of stochastic partial differential equations (SPDEs) with path-dependent coefficients driven by an infinite-dimensional Wiener process and a compensated Poisson random measure. Our approach is based on a time-dependent coordinate transform, which reduces a wide class of SPDEs to a class of simpler SDE (stochastic differential equation) problems. We present the most general results obtainable in our setting within a self-contained framework, in order to demonstrate our approach in full detail. Several numerical approaches to SPDEs in the spirit of this setting are also presented.

11.
A quantile regression model estimates the relationship between a quantile of the response distribution and the regression parameters, and has been developed for linear models with continuous responses. In this paper, we apply a Bayesian quantile regression model to Malaysian motor insurance claim count data to study the effects of changes in the estimates of the regression parameters (or rating factors) on the magnitude of the response variable (or claim count). We also compare the results of quantile regression models from the Bayesian and frequentist approaches with the results of mean regression models based on the Poisson and negative binomial distributions. Comparison of the Poisson and Bayesian quantile regression models shows that the effect of vehicle year decreases as the quantile increases, suggesting that this rating factor carries lower risk for higher claim counts. On the other hand, the effect of vehicle type increases as the quantile increases, indicating that this rating factor carries higher risk for higher claim counts.
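For reference, the τ-th regression quantile minimizes the check loss below; the Bayesian counterpart referred to above is typically built on an asymmetric-Laplace working likelihood whose mode coincides with this minimizer (standard forms, not the paper's notation):

```latex
\hat{\beta}(\tau) \;=\; \arg\min_{\beta} \sum_{i=1}^{n} \rho_{\tau}\bigl(y_i - x_i^{\top}\beta\bigr),
\qquad \rho_{\tau}(u) \;=\; u\,\bigl(\tau - \mathbf{1}\{u < 0\}\bigr).
```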

12.
In applications involving count data, it is common to encounter an excess number of zeros. In the study of outpatient service utilization, for example, the number of utilization days will take on integer values, with many subjects having no utilization (zero values). Mixed-distribution models, such as the zero-inflated Poisson (ZIP) and zero-inflated negative binomial (ZINB), are often used to fit such data. A more general class of mixture models, called hurdle models, can be used to model zero-deflation as well as zero-inflation. Several authors have proposed frequentist approaches to fitting zero-inflated models for repeated measures. We describe a practical Bayesian approach which incorporates prior information, has optimal small-sample properties, and allows for tractable inference. The approach can be easily implemented using standard Bayesian software. A study of psychiatric outpatient service use illustrates the methods.
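The hurdle construction mentioned above separates the zero process from the positive counts, which is what lets it accommodate both zero deflation and zero inflation; in generic notation, with f any baseline count distribution such as the Poisson or negative binomial:

```latex
P(Y = 0) \;=\; \pi_0, \qquad
P(Y = y) \;=\; (1 - \pi_0)\,\frac{f(y)}{1 - f(0)}, \quad y = 1, 2, \dots
```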

13.
In positron emission tomography, image data correspond to measurements of photons emitted by a radioactive tracer in the subject. Such count data are typically modeled using a Poisson random variable, leading to the use of the negative-log Poisson likelihood fit-to-data function. Regularization is needed, however, in order to guarantee reconstructions with minimal artifacts. Given that tracer densities are primarily smoothly varying but also contain sharp jumps (or edges), total variation regularization is a natural choice. However, the resulting computational problem is quite challenging. In this paper, we present an efficient computational method for this problem. Convergence of the method had previously been shown for quadratic regularization functions, and here convergence is shown for total variation regularization. We also present three regularization parameter choice methods for use on total variation-regularized negative-log Poisson likelihood problems. We test the computational and regularization parameter selection methods on two synthetic data sets.
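Up to additive constants, and assuming a linear projection model in which the data b are Poisson counts with means Au, the variational problem described above has the generic form below, where α > 0 is the regularization parameter targeted by the three selection methods:

```latex
\min_{u \ge 0} \;\; \sum_{i} \Bigl[(Au)_i - b_i \log (Au)_i\Bigr] \;+\; \alpha \, \mathrm{TV}(u)
```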

14.
We provide a mathematical dynamic model of athletic performance, fitness and fatigue based on the two well-known principles ‘train to failure’ and ‘use it or lose it’. The anabolic and catabolic processes are modelled with differential equations. Fitness is defined as muscle fitness. We model the work power of any muscle or set of muscles, together with the muscle's maximum work power. Parameters are estimated, and we present analytical and numerical results. The relationships between performance, fitness and fatigue are demonstrated for various activity scenarios. For example, the model quantifies the exact manner in which the optimal rest period can be determined to maximize performance on a given day. The model provides realistic predictions and constitutes a powerful tool that describes the processes by which performance, fitness and fatigue can be regulated and controlled.

15.
We estimate a structural electricity (multi-commodity) model based on historical spot and futures data (fuel and power prices, respectively) and quantify the inherent parameter risk using an average value at risk approach (‘expected shortfall’). The mathematical proofs use the theory of asymptotic statistics to derive a parameter risk measure. We use far in-the-money options to derive a confidence level and use it as a prudent present-value adjustment when pricing a virtual power plant. Finally, we conduct a present-value benchmarking to compare the approach of temperature-driven demand (based on load data) with an ‘implied demand approach’ (demand implied from observable power futures prices). We observe that the implied demand approach can easily capture observed electricity price volatility, whereas estimation against observable load data leads to a gap because, among other things, the interplay of demand and supply is not captured in the data (i.e., unexpected mismatches).
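The risk measure referred to above as average value at risk (‘expected shortfall’) has the standard definition at level α, stated here generically; the paper applies it to the distribution of valuations induced by parameter risk:

```latex
\mathrm{ES}_{\alpha}(X) \;=\; \frac{1}{1 - \alpha} \int_{\alpha}^{1} \mathrm{VaR}_{u}(X) \, du
```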

16.
17.
Count data are often overdispersed, i.e., the variance exceeds the mean. Fitting such data with a traditional Poisson regression model tends to underestimate the standard errors of the parameters and overstate their significance levels. Negative binomial regression models and generalized Poisson regression models are commonly used to handle overdispersed data. Starting from the two generalized Poisson regression models GP-1 and GP-2, this paper extends them to a more general GP-P form, where P is a parameter; when P=1 or P=2, the GP-P model reduces to the GP-1 or GP-2 model, respectively. Finally, the extended GP-P model is applied to a set of health insurance data, and its fit is compared with those of the Poisson and negative binomial regression models. The results show that the extended GP-P model provides a better fit.
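A minimal sketch of this kind of model comparison, assuming statsmodels' discrete-model implementations (whose GeneralizedPoisson class exposes a comparable GP-P parameterization through its p argument) and synthetic overdispersed data; it is not the paper's data or code.

```python
import numpy as np
from statsmodels.discrete.discrete_model import (GeneralizedPoisson,
                                                 NegativeBinomial, Poisson)

rng = np.random.default_rng(0)

# Synthetic overdispersed counts (negative-binomial draws), illustrative only.
n = 2000
X = np.column_stack([np.ones(n), rng.standard_normal(n)])
mu = np.exp(X @ np.array([0.5, 0.3]))
y = rng.negative_binomial(n=2.0, p=2.0 / (2.0 + mu))

# Fit Poisson, negative binomial and generalized Poisson (GP-1, GP-2) models
# and compare them by AIC.
fits = {
    "Poisson": Poisson(y, X).fit(disp=False),
    "NegBin": NegativeBinomial(y, X).fit(disp=False),
    "GP-1": GeneralizedPoisson(y, X, p=1).fit(disp=False),
    "GP-2": GeneralizedPoisson(y, X, p=2).fit(disp=False),
}
for name, res in fits.items():
    print(f"{name:8s} AIC = {res.aic:.1f}")
```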

18.
In this paper, we propose a multivariate time series model for sales count data. Building on the fact that assigning an independent Poisson distribution to each brand's sales yields a Poisson distribution for their total, characterized as market sales, and that, conditional on market sales, the brand sales follow a multinomial distribution, we first extend this Poisson–multinomial modeling to a dynamic model in the form of a generalized linear model. We further extend the model to contain nested hierarchical structures in order to apply it to finding the market structure in the field of marketing. As an application using point-of-sale time series from a store, we compare several possible hypotheses on market structure and choose the most plausible one by using several model selection criteria, including in-sample fit, out-of-sample forecasting errors, and an information criterion.
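The Poisson–multinomial decomposition underlying the model reads, in generic notation with λ_j the Poisson rate of brand j:

```latex
N_j \sim \mathrm{Poisson}(\lambda_j) \ \text{independently}
\;\Longrightarrow\;
N = \sum_{j} N_j \sim \mathrm{Poisson}\Bigl(\sum_{j} \lambda_j\Bigr), \qquad
(N_1,\dots,N_J) \mid N \sim \mathrm{Multinomial}\Bigl(N,\ \tfrac{\lambda_j}{\sum_{k} \lambda_k}\Bigr).
```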

19.
Stochastic failure models for systems operating in a randomly varying environment (a dynamic environment) are often described using a hazard rate process. In this paper, we consider hazard rate processes induced by external shocks affecting a system, where the shocks follow a nonhomogeneous Poisson process. The sample paths of these processes increase monotonically. However, the failure rate of a system can have a completely different shape and follow, e.g., an upside-down bathtub pattern. We describe and study various ‘conditional properties’ of the models that help to analyze and interpret the shape of the failure rate and other relevant characteristics.
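For reference, the shock process assumed above, a nonhomogeneous Poisson process with intensity λ(t), has the standard count distribution:

```latex
P\bigl(N(t) = n\bigr) \;=\; \frac{\Lambda(t)^{n}\,e^{-\Lambda(t)}}{n!}, \qquad
\Lambda(t) \;=\; \int_{0}^{t} \lambda(s)\,ds .
```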

20.
Count data with excess zeros are often encountered in many medical, biomedical and public health applications. In this paper, an extension of zero-inflated Poisson mixed regression models is presented for dealing with multilevel data sets, referred to as hierarchical mixture zero-inflated Poisson mixed regression models. A stochastic EM algorithm is developed for obtaining the ML estimates of the parameters of interest, and models with different numbers of latent classes are compared through the BIC criterion. An application to the analysis of count data from a Shanghai Adolescence Fitness Survey and a simulation study illustrate the usefulness and effectiveness of our methodology.
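A generic two-level version of the model class described above, with notation introduced here only for illustration (u_j is a cluster-level random effect, and the BIC used for choosing the number of latent classes is the usual one):

```latex
P(Y_{ij} = 0) = \pi_{ij} + (1 - \pi_{ij})\,e^{-\lambda_{ij}}, \qquad
P(Y_{ij} = y) = (1 - \pi_{ij})\,\frac{e^{-\lambda_{ij}}\,\lambda_{ij}^{y}}{y!}, \quad y \ge 1,
```

```latex
\log \lambda_{ij} = x_{ij}^{\top}\beta + u_j, \qquad
\operatorname{logit}(\pi_{ij}) = z_{ij}^{\top}\gamma, \qquad
\mathrm{BIC} = -2\log \hat{L} + k \log n .
```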
