Similar literature
Found 20 similar articles; search took 46 ms.
1.
In this paper, we propose ADTreesLogit, a model that integrates the advantages of the ADTrees model and the logistic regression model, to improve the predictive accuracy and interpretability of existing churn prediction models. We show that the overall predictive accuracy of the ADTreesLogit model compares favorably with that of TreeNet®, a model which won the Gold Prize in the 2003 mobile customer churn prediction modeling contest (The Duke/NCR Teradata Churn Modeling Tournament). In fact, ADTreesLogit has better predictive accuracy than TreeNet® on two important observation points.

2.
Companies' interest in customer relationship modelling and key issues such as customer lifetime value and churn has substantially increased over the years. However, the complexity of building, interpreting and applying these models creates obstacles for their implementation. The main contribution of this paper is to show how domain knowledge can be incorporated in the data mining process for churn prediction: first, through the evaluation of coefficient signs in a logistic regression model, and second, by analysing a decision table (DT) extracted from a decision tree or rule-based classifier. An algorithm to check DTs for violations of monotonicity constraints is presented, which involves the repeated application of condition reordering and table contraction to detect counter-intuitive patterns. Both approaches are applied to two telecom data sets to empirically demonstrate how domain knowledge can be used to ensure the interpretability of the resulting models.
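The coefficient-sign evaluation mentioned in this abstract can be illustrated with a minimal sketch. It assumes the fitted coefficients are already available as a plain dictionary and that domain experts have supplied an expected sign per variable; the function name `sign_violations` and the variable names are hypothetical and do not come from the paper, and the sketch does not cover the paper's decision-table algorithm.

```python
def sign_violations(coefficients, expected_signs):
    """Return the variables whose fitted coefficient sign contradicts
    the sign expected from domain knowledge.

    coefficients:   dict mapping variable name -> fitted coefficient
    expected_signs: dict mapping variable name -> +1 or -1 (assumed input)
    """
    violations = []
    for name, expected in expected_signs.items():
        coef = coefficients.get(name)
        # Skip variables that are absent from the model or have a zero coefficient.
        if coef is None or coef == 0:
            continue
        if (coef > 0) != (expected > 0):
            violations.append(name)
    return violations


# Hypothetical example: longer tenure is expected to lower churn risk,
# so a positive coefficient on "tenure" is flagged as counter-intuitive.
fitted = {"complaints": 0.8, "tenure": 0.2, "price": 1.1}
expected = {"complaints": +1, "tenure": -1}
flagged = sign_violations(fitted, expected)
```

Variables without a stated expectation (here `price`) are simply not checked, which keeps the domain knowledge explicit rather than implied.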

3.
The defection or churn of customers represents an important concern for any company and a central matter of interest in customer base analysis. An additional complication arises in non-contractual settings, where the characteristics that should be observed in order to say that a customer has totally or partially defected are not clearly defined. As a matter of fact, different definitions of the churn situation could be used in this context. Focusing on non-contractual settings, in this paper we propose a methodology for evaluating the short-term economic effects that using a certain definition of churn would have on a company. With this aim, we have defined two efficiency measures for the economic results of a marketing campaign implemented against churn, and these measures have been computed using a set of definitions of partial defection. Our methodology finds the definition that maximizes both efficiency measures and, moreover, the monetary amount that the company should invest per customer in the campaign to achieve the optimal solution. This has been modelled as a multiobjective optimization problem that we solved using compromise programming. Numerical results using real data from a Spanish retailing company are presented and discussed in order to show the performance and validity of our proposal.

4.
Market baskets arise from consumers’ shopping trips and include items from multiple categories that are frequently chosen interdependently. Explanatory models of multicategory choice behavior explicitly allow for such category purchase dependencies. They typically estimate own and across-category effects of marketing-mix variables on purchase incidences for a predefined set of product categories. Because of analytical restrictions, however, multicategory choice models can only handle a small number of categories. Hence, for large retail assortments, the issue emerges of how to determine the composition of shopping baskets with a meaningful selection of categories. Traditionally, this is resolved by managerial intuition. In this article, we combine multicategory choice models with a data-driven approach for basket selection. The proposed procedure also accounts for customer heterogeneity and thus can serve as a viable tool for designing target marketing programs. A data compression step first derives a set of basket prototypes which are representative of classes of market baskets with internally more distinctive (complementary) cross-category interdependencies and are responsible for the segmentation of households. In a second step, segment-specific cross-category effects are estimated for suitably selected categories using a multivariate logistic modeling framework. In an empirical illustration, significant differences in cross-effects and price elasticities can be shown both across segments and compared to the aggregate model.

5.
In this paper, we analyse the delay of a random customer in a two-class batch-service queueing model with variable server capacity, where all customers are accommodated in a common single-server first-come-first-served queue. The server can only process customers that belong to the same class, so that the size of a batch is determined by the length of a sequence of same-class customers. This type of batch server can be found in telecommunications systems and production environments. We first determine the steady-state partial probability generating function of the queue occupancy at customer arrival epochs. Using a spectral decomposition technique, we obtain the steady-state probability generating function of the delay of a random customer. We also show that the distribution of the delay of a random customer corresponds to a phase-type distribution. Finally, some numerical examples are given that provide further insight into the impact of asymmetry and variance in the arrival process on the number of customers in the system and the delay of a random customer.

6.
While statistical learning methods have proved powerful tools for predictive modeling, the black-box nature of the models they produce can severely limit their interpretability and the ability to conduct formal inference. However, the natural structure of ensemble learners like bagged trees and random forests has been shown to admit desirable asymptotic properties when base learners are built with proper subsamples. In this work, we demonstrate that by defining an appropriate grid structure on the covariate space, we may carry out formal hypothesis tests for both variable importance and underlying additive model structure. To our knowledge, these tests represent the first statistical tools for investigating the underlying regression structure in a context such as random forests. We develop notions of total and partial additivity and further demonstrate that testing can be carried out at no additional computational cost by estimating the variance within the process of constructing the ensemble. Furthermore, we propose a novel extension of these testing procedures using random projections to allow for computationally efficient testing procedures that retain high power even when the grid size is much larger than that of the training set.

7.
In the highly competitive business environment of today, the cost to attract new customers is much higher than the cost required to maintain the existing ones. To keep the balance between the acquisition rate and defection rate through executing offensive and defensive marketing policies, it is required to have real-time information using an efficient method to monitor customer loyalty. The relationship between customer loyalty and customer satisfaction should be kept in mind when one develops a method for loyalty monitoring. This paper presents several control charts classified in two groups based on the scale used to assess customer loyalty. In the first group of control charts, customer loyalty is considered as a binary random variable modeled by a Bernoulli distribution, whilst in the second group, an ordinal scale is considered to report loyalty level. Performance comparison of the proposed techniques using the ARL criterion indicates that chi-square and likelihood-ratio control charts, developed based on the Pearson chi-square statistic and the ordinal logistic regression model respectively, are able to rapidly detect significant changes in loyalty behavior. To show how to apply the procedures and how to interpret their results, two illustrative synthetic cases are also explained. Copyright © 2009 John Wiley & Sons, Ltd.

8.
The definition and modeling of customer loyalty have been central issues in customer relationship management for many years. Recent papers propose solutions to detect customers that are becoming less loyal, also called churners. The churner status is then defined as a function of the volume of commercial transactions. In the context of a Belgian retail financial service company, our first contribution is to redefine the notion of customer loyalty by considering it from a customer-centric viewpoint instead of a product-centric one. We hereby use the customer lifetime value (CLV) defined as the discounted value of future marginal earnings, based on the customer’s activity. Hence, a churner is defined as someone whose CLV, and thus the related marginal profit, is decreasing. As a second contribution, the loss incurred by the CLV decrease is used to appraise the cost of misclassifying a customer by introducing a new loss function. In the empirical study, we compare the accuracy of various classification techniques commonly used in the domain of churn prediction, including two cost-sensitive classifiers. Our final conclusion is that since profit is what really matters in a commercial environment, standard statistical accuracy measures for prediction need to be revised and a more profit-oriented focus may be desirable.
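The CLV-based churner definition in this abstract can be made concrete with a small sketch. It assumes a fixed per-period discount rate and a finite list of forecast marginal earnings, with the first period discounted once; these choices, and the names `clv` and `is_churner`, are illustrative assumptions rather than the paper's exact formulation.

```python
def clv(future_earnings, discount_rate):
    """Discounted value of a finite stream of future marginal earnings.

    future_earnings: earnings expected in periods 1, 2, ... (assumed forecasts)
    discount_rate:   per-period rate, e.g. 0.10 for 10% (assumed constant)
    """
    return sum(
        earning / (1.0 + discount_rate) ** period
        for period, earning in enumerate(future_earnings, start=1)
    )


def is_churner(previous_clv, current_clv):
    """A customer counts as a churner when the CLV is decreasing."""
    return current_clv < previous_clv


# Hypothetical customer whose forecast earnings shrink between two evaluations.
clv_before = clv([110.0, 121.0], 0.10)  # each term discounts to about 100
clv_after = clv([55.0, 60.5], 0.10)     # each term discounts to about 50
```

Under this reading, the drop from `clv_before` to `clv_after` is also the quantity that a profit-aware loss function could use as the cost of missing this customer.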

9.
Consider a sequence of i.i.d. positive random variables. A universal result in the almost sure limit theorem for products of sums of partial sums is established. We will show that the almost sure limit the...

10.
The paper presents the first empirical investigation of the relationship between the present value of net revenue from a revolving credit account and the times to default and to second purchase. The analysis is based on data for a store card which is used to buy ‘white’ durable goods in Germany. It is demonstrated that there exists a relationship between these measures. It appears that there is scope for improving profit if an application for a store card is assessed by using a model which estimates the revenue and includes the survival probability of default and the survival probability of second purchase (a survival combination model) rather than merely a static probability of default predicted by a logistic regression.

11.
Regression trees are a popular alternative to classical regression methods. A number of approaches exist for constructing regression trees. Most of these techniques, including CART, are sequential in nature and locally optimal at each node split, so the final tree solution found may not be the best tree overall. In addition, small changes in the training data often lead to large changes in the final result due to the relative instability of these greedy tree-growing algorithms. Ensemble techniques, such as random forests, attempt to take advantage of this instability by growing a forest of trees from the data and averaging their predictions. The predictive performance is improved, but the simplicity of a single-tree solution is lost.

In earlier work, we introduced the Tree Analysis with Randomly Generated and Evolved Trees (TARGET) method for constructing classification trees via genetic algorithms. In this article, we extend the TARGET approach to regression trees. Simulated data and real world data are used to illustrate the TARGET process and compare its performance to CART, Bayesian CART, and random forests. The empirical results indicate that TARGET regression trees have better predictive performance than recursive partitioning methods, such as CART, and single-tree stochastic search methods, such as Bayesian CART. The predictive performance of TARGET is slightly worse than that of ensemble methods, such as random forests, but the TARGET solutions are far more interpretable.

12.
In this paper, by applying the moment inequality for asymptotically almost negatively associated (AANA, in short) random sequences and the truncation method, equivalent conditions for the complete moment convergence of the maximum partial weighted sums of AANA random variables are obtained without assumptions of identical distribution, which generalize and improve the corresponding results of [15], [16] and [17], respectively.

13.
We consider the problem of finding the optimal routing of a single vehicle that starts its route from a depot and picks up from and delivers K different products to N customers that are served according to a predefined customer sequence. The vehicle is allowed during its route to return to the depot to unload returned products and restock with new products. The items of all products are of the same size. For each customer the demands for the products that are delivered by the vehicle and the quantity of the products that is returned to the vehicle are discrete random variables with known joint distribution. Under a suitable cost structure, it is shown that the optimal policy that serves all customers has a specific threshold-type structure. We also study a corresponding infinite-time horizon problem in which the service of the customers is not completed when the last customer has been serviced but it continues indefinitely with the same customer order. For each customer, the joint distribution of the quantities that are delivered and the quantity that is picked up is the same at each cycle. The discounted-cost optimal policy and the average-cost optimal policy have the same structure as the optimal policy in the finite-horizon problem. Numerical results are given that illustrate the structural results.

14.
Assume that we have m finished products in an inventory. Each finished product is characterized by two measurements P and Q. A customer specifies a purchase order by the requirements of characteristics P and Q. A product is qualified to satisfy a purchase order if and only if it possesses better measurements of both P and Q than the customer requires. For a given batch of n purchase orders, the inventory selection problem is to choose n finished products from the inventory to satisfy all purchase orders with a minimum cost. This problem can be formulated as a large-scale transportation problem. When the cost function of selecting a product to satisfy an order exhibits certain structure, we develop a fast sequential algorithm to solve this problem. Possible extensions and related problems are also discussed in this paper.

15.
The availability of abundant data poses a challenge to integrate static customer data and longitudinal behavioral data to improve performance in customer churn prediction. Usually, longitudinal behavioral data are transformed into static data before being included in a prediction model. In this study, a framework with ensemble techniques is presented for customer churn prediction directly using longitudinal behavioral data. A novel approach called the hierarchical multiple kernel support vector machine (H-MK-SVM) is formulated. A three-phase training algorithm for the H-MK-SVM is developed, implemented and tested. The H-MK-SVM constructs a classification function by estimating the coefficients of both static and longitudinal behavioral variables in the training process without transformation of the longitudinal behavioral data. The training process of the H-MK-SVM is also a feature selection and time subsequence selection process because the sparse non-zero coefficients correspond to the variables selected. Computational experiments using three real-world databases were conducted. Computational results using multiple criteria measuring performance show that the H-MK-SVM directly using longitudinal behavioral data performs better than currently available classifiers.

16.
Logistic chaotic maps for binary numbers generations   Total citations: 1, self-citations: 0, citations by others: 1
Two pseudorandom binary sequence generators, based on logistic chaotic maps and intended for stream cipher applications, are proposed. The first is based on a single one-dimensional logistic map which exhibits random, noise-like properties at certain parameter values, and the second is based on a combination of two logistic maps. The encryption step proposed in both algorithms consists of a simple bitwise XOR operation of the plaintext binary sequence with the keystream binary sequence to produce the ciphertext binary sequence. A threshold function is applied to convert the floating-point iterates into binary form. Experimental results show that the produced sequences possess high linear complexity and very good statistical properties. The systems are put forward for security evaluation by the cryptographic committees.
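The single-map generator described in this abstract (iterate the logistic map, threshold each iterate to a bit, XOR with the plaintext) can be sketched as follows. The parameter value r = 3.99, the threshold 0.5, the burn-in length and the function names are assumptions for illustration; the abstract does not state the values used in the paper. This is a teaching sketch, not a vetted cipher.

```python
def logistic_keystream(x0, n_bits, r=3.99, threshold=0.5, burn_in=100):
    """Generate n_bits keystream bits from the map x -> r * x * (1 - x).

    x0 is the secret initial condition in (0, 1); r, threshold and burn_in
    are illustrative assumptions, not values taken from the paper.
    """
    x = x0
    # Discard early iterates so the output depends less visibly on x0.
    for _ in range(burn_in):
        x = r * x * (1.0 - x)
    bits = []
    for _ in range(n_bits):
        x = r * x * (1.0 - x)
        # Threshold function: convert the floating-point iterate to one bit.
        bits.append(1 if x >= threshold else 0)
    return bits


def xor_bits(data_bits, key_bits):
    """Bitwise XOR of two equal-length bit sequences."""
    return [d ^ k for d, k in zip(data_bits, key_bits)]


plaintext = [1, 0, 1, 1, 0, 0, 1, 0]
keystream = logistic_keystream(0.3141, len(plaintext))
ciphertext = xor_bits(plaintext, keystream)
recovered = xor_bits(ciphertext, keystream)  # XOR with the same keystream decrypts
```

Because XOR is its own inverse, decryption only needs the same initial condition and parameter to regenerate the keystream. Floating-point iteration can differ across platforms, so a practical design would have to fix the arithmetic precisely.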

17.
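To address the difficulty that disruption events pose for the smooth delivery of perishable goods, this paper applies disruption management ideas and uses research methods on consumer behavior from behavioral science to classify customers. The disruption management problem for logistics distribution is divided into two stages: the first stage handles customers who receive priority service, and the second handles customers who receive ordinary service. A two-stage, multi-objective disruption management model is then constructed, and an improved ant colony algorithm is proposed to solve it. Experimental results show that, although the proposed method incurs higher distribution costs, it completes deliveries to the more important customers, which helps to substantially raise the firm's potential benefits and thereby confirms the method's effectiveness in handling disruptions in perishable goods distribution.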

18.
Let $\{X_n;\, n\ge 1\}$ be a sequence of strictly stationary $\rho$-mixing random variables with zero mean and finite variance. Using the weak convergence theorem and probability inequalities for $\rho$-mixing sequences, under suitable conditions we obtain general laws of precise asymptotics for partial sums of $\rho$-mixing sequences.

19.
Customer churn prediction models aim to indicate the customers with the highest propensity to attrite, making it possible to improve the efficiency of customer retention campaigns and to reduce the costs associated with churn. Although cost reduction is their prime objective, churn prediction models are typically evaluated using statistically based performance measures, resulting in suboptimal model selection. Therefore, in the first part of this paper, a novel, profit-centric performance measure is developed by calculating the maximum profit that can be generated by including the optimal fraction of customers with the highest predicted probabilities to attrite in a retention campaign. The novel measure selects the optimal model and fraction of customers to include, yielding a significant increase in profits compared to statistical measures. In the second part, an extensive benchmarking experiment is conducted, evaluating various classification techniques applied to eleven real-life data sets from telecom operators worldwide by using both the profit-centric and statistically based performance measures. The experimental results show that a small number of variables suffices to predict churn with high accuracy, and that oversampling generally does not improve the performance significantly. Finally, a large group of classifiers is found to yield comparable performance.
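The idea of a maximum-profit measure (rank customers by predicted churn probability, then pick the campaign fraction that maximizes profit) can be sketched with a deliberately simplified profit model. The sketch assumes a fixed benefit for every true churner contacted and a fixed cost for every contact; this profit function and the name `max_profit` are assumptions for illustration and are not the formula developed in the paper.

```python
def max_profit(scores, churned, benefit_per_churner, cost_per_contact):
    """Return (best_profit, best_fraction) over all top-k campaign sizes.

    scores:  predicted churn probabilities, one per customer
    churned: 1 if the customer actually churned, else 0
    The linear benefit/cost model is a simplifying assumption.
    """
    if not scores:
        return 0.0, 0.0
    # Rank customers from highest to lowest predicted churn probability.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    best_profit, best_k = 0.0, 0
    profit = 0.0
    for k, i in enumerate(order, start=1):
        # Contacting customer i costs a fixed amount and pays off only for churners.
        profit += benefit_per_churner * churned[i] - cost_per_contact
        if profit > best_profit:
            best_profit, best_k = profit, k
    return best_profit, best_k / len(scores)


# Hypothetical scores for four customers, two of whom actually churned.
result = max_profit([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0], 10.0, 2.0)
```

Comparing models by `best_profit` rather than by accuracy or AUC is the kind of profit-centric selection the abstract argues for; the optimal fraction comes out as a by-product.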

20.
This paper introduces a functional central limit theorem for empirical processes endowed with real values from a strictly stationary random field that satisfies an interlaced mixing condition. We proceed by using a common technique from Billingsley (Convergence of probability measures, Wiley, New York, 1999), by first obtaining the limit theorem for the case where the random variables of the strictly stationary ???-mixing random field are uniformly distributed on the interval [0, 1]. We then generalize the result to the case where the absolutely continuous marginal distribution function is no longer uniform. In this case we show that the empirical process endowed with values from the ???-mixing stationary random field, due to the strong mixing condition, does not converge in distribution to a Brownian bridge, but to a continuous Gaussian process with mean zero and the covariance given by the limit of the covariance of the empirical process. The argument for the general case holds similarly by the application of a standard variant of a result of Billingsley (1999) for the space D(???, ??).
