Abstract: | Predicting insurance losses is a long-standing focus of actuarial science in the insurance sector. Because loss data often exhibit complicated features such as skewness, heavy tails, and multi-modality, traditional parametric models are often inadequate to describe the loss distribution, which calls for a mature application of Bayesian methods. In this study, we explore a Gaussian mixture model based on Dirichlet process priors and apply it to three automobile insurance datasets. We employ the probit stick-breaking method to incorporate covariate effects into the weights of the mixture components, improve the model's hierarchical structure, and propose a Bayesian nonparametric model that can identify the distinct regression patterns of different samples. Moreover, an advanced slice-sampling update algorithm is integrated to improve the approximation of the infinite mixture model. We compare our framework with four common regression techniques: three generalized linear models and a dependent Dirichlet process ANOVA model. The empirical results show that the proposed framework flexibly characterizes the actual loss distribution in the insurance datasets and achieves superior accuracy in both data fitting and extrapolative prediction, thereby extending the application of Bayesian methods in the insurance sector. |
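To illustrate how a probit stick-breaking construction can let covariates drive the mixture weights, the following is a minimal sketch and not the implementation used in the study. It assumes a finite truncation with K components and a linear predictor for each stick. The function names, the coefficient layout `betas`, and the truncation rule (the last component takes the remaining stick) are hypothetical choices made for this example.

```python
import math


def std_normal_cdf(z):
    """Standard normal CDF (the probit link), computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probit_stick_breaking_weights(x, betas):
    """Return covariate-dependent mixture weights for one observation.

    x     : list of covariate values (hypothetical layout, intercept included if desired)
    betas : list of K-1 coefficient vectors, one per stick (assumed linear predictors)

    Each stick fraction is v_k(x) = Phi(x . beta_k); the k-th weight is
    v_k(x) times the stick length left over from components 1..k-1.
    The final (K-th) weight is the remaining stick, so the weights sum to 1.
    """
    weights = []
    remaining = 1.0
    for beta in betas:
        eta = sum(b * xi for b, xi in zip(beta, x))  # linear predictor
        v = std_normal_cdf(eta)                      # fraction of the stick broken off
        weights.append(remaining * v)
        remaining *= 1.0 - v
    weights.append(remaining)                        # truncation: last component takes the rest
    return weights


# Usage: two sticks with zero coefficients give fractions of 0.5 each.
example = probit_stick_breaking_weights([1.0, 0.5], [[0.0, 0.0], [0.0, 0.0]])
```

Because each stick fraction depends on the covariates, two observations with different covariate values receive different weight vectors over the same set of Gaussian components, which is what lets the mixture capture sample-specific regression patterns.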