Correcting statistical models via empirical distribution functions |
| |
Authors: | Alexander Munteanu, Max Wornowizki |
| |
Institution: | 1. Department of Computer Science, Technische Universität Dortmund, Dortmund, Germany; 2. Department of Statistics, Technische Universität Dortmund, Dortmund, Germany |
| |
Abstract: | We consider the two-sample homogeneity problem where the information contained in two samples is used to test the equality of the underlying distributions. In cases where one sample is simulated by a procedure modelling the data generating process of another observed sample, a mere rejection of the null hypothesis is unsatisfactory. Instead, the data analyst would like to know how the simulation can be improved. Based on the popular Kolmogorov–Smirnov test and a general mixture model, we propose an algorithm that determines an appropriate correction distribution function. Complementing the simulation sample by a given proportion of observations sampled from this distribution reduces the Kolmogorov–Smirnov distance between the modified and the observed sample. Therefore, the correction distribution indicates possible improvements to the current simulation process. We prove that our algorithm runs in linear time when applied to sorted samples. We further illustrate its intuitive results on simulated as well as on real data sets from astrophysics and bioinformatics. |
| |
Keywords: | |
This article is indexed in SpringerLink and other databases. |
|
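The quantity at the centre of the abstract, the Kolmogorov–Smirnov distance between two empirical distribution functions, can be illustrated with a short Python sketch. This is not the authors' algorithm: the sample values, the correction sample, and the mixing proportion `s` are all hypothetical and chosen by hand, and the mixture form `(1 - s) * F_sim + s * H` is an assumption about how "complementing the simulation sample by a given proportion" could be modelled. The sketch only shows that mixing in a suitable correction can shrink the distance to the observed sample.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return the empirical distribution function of `sample` as a callable."""
    xs = sorted(sample)
    n = len(xs)
    return lambda t: bisect_right(xs, t) / n


def mixture(f, h, s):
    """Mixture distribution function (1 - s) * f + s * h (assumed form)."""
    return lambda t: (1 - s) * f(t) + s * h(t)


def ks_distance(f, g, points):
    """Largest absolute gap between two right-continuous step functions.

    The supremum is attained at a jump point, so it suffices to
    evaluate both functions at every sample value.
    """
    return max(abs(f(t) - g(t)) for t in points)


# Hypothetical data: the simulation misses the upper part of the observed range.
observed = [0.1, 0.4, 0.5, 0.9, 1.3, 1.7, 2.2, 2.8]
simulated = [0.2, 0.3, 0.6, 0.8, 1.0, 1.1]
correction = [1.5, 2.0, 2.5]  # hand-picked, not computed by any algorithm
s = 0.25                      # assumed mixing proportion

points = observed + simulated + correction
f_obs, f_sim, h = ecdf(observed), ecdf(simulated), ecdf(correction)

before = ks_distance(f_obs, f_sim, points)
after = ks_distance(f_obs, mixture(f_sim, h, s), points)

print(f"KS distance without correction: {before:.3f}")
print(f"KS distance with correction:    {after:.3f}")
```

Here the correction sample places mass where the simulation produces none, which is the kind of hint about a deficient simulation that the abstract describes. Each distance evaluation above uses a binary search per point; the linear-time bound stated in the abstract applies to the authors' method on sorted samples, not to this sketch.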