
1.

Estimation of failure probability using semi-definite logit model

Hiroshi Konno Naoya Kawadai Dai Wu 《Computational Management Science》2003,1(1):59-73

We will propose a new and practical method for estimating the failure probability of a large number of small to medium scale companies using their balance sheet data. We will use the maximum likelihood method to estimate the best parameters of the logit function, where the failure intensity function in its exponent is represented as a convex quadratic function instead of the commonly used linear function. The reasons for using this type of function are: (i) it can better represent the observed nonlinear dependence of failure probability on financial attributes, and (ii) the resulting likelihood function can be maximized using a cutting plane algorithm developed for nonlinear semi-definite programming problems. We will show that we can achieve better prediction performance than the standard logit model, using thousands of sample companies. Revised: December 2002
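To make the model form concrete, here is a minimal sketch of a logit probability whose exponent is a convex quadratic function of the attributes. The sign convention, the parameter names (`L`, `b`, `c`), and the factorization Q = L Lᵀ used to keep Q positive semi-definite are assumptions made for illustration only. The cutting plane algorithm mentioned in the abstract is not reproduced here.

```python
import math

def quadratic_form(L, x):
    """Return x^T (L L^T) x = ||L^T x||^2, which is never negative."""
    n = len(x)
    cols = len(L[0])
    # y = L^T x
    y = [sum(L[i][k] * x[i] for i in range(n)) for k in range(cols)]
    return sum(v * v for v in y)

def failure_probability(x, L, b, c):
    """Logit probability with exponent z = x^T Q x + b^T x + c, Q = L L^T (assumed form)."""
    z = quadratic_form(L, x) + sum(bi * xi for bi, xi in zip(b, x)) + c
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def log_likelihood(samples, L, b, c, eps=1e-12):
    """Bernoulli log-likelihood; samples is a list of (x, y) with y in {0, 1}."""
    total = 0.0
    for x, y in samples:
        p = failure_probability(x, L, b, c)
        p = min(max(p, eps), 1.0 - eps)  # guard against log(0)
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total

# Hypothetical parameters for two financial attributes.
L_demo = [[1.0, 0.0], [0.5, 1.0]]
b_demo = [0.0, 0.0]
c_demo = -2.0
p_center = failure_probability([0.0, 0.0], L_demo, b_demo, c_demo)
p_far = failure_probability([1.0, -2.0], L_demo, b_demo, c_demo)
```

With a zero linear term, the probability is lowest at the origin and rises in every direction away from it. A linear exponent cannot produce this kind of non-monotone dependence.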

2.

Testing problems in the GEE method when the PA condition does not hold

冀运齐 朱仲义 《应用概率统计》2008,24(5):541-553

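Marginal regression models and the associated generalized estimating equations (GEE) are widely used in longitudinal data analysis. In 1994, Pepe and Anderson pointed out that applications of marginal models and the GEE method must satisfy an important condition, the PA condition. If this assumption fails, consistent estimates may not be obtained, and the resulting statistical inference may be inefficient. Using a simple AR(1) model, this paper examines, both theoretically and through numerical simulation, how the PA condition affects GEE-based tests on regression coefficients. When the PA condition does not hold, the GLS estimator of the regression coefficients is no longer asymptotically unbiased, and the resulting Wald and score statistics no longer follow a central chi-square distribution, which seriously affects the efficiency of the tests.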

3.

One-Step Generalized Estimating Equations With Large Cluster Sizes

Stuart Lipsitz Garrett Fitzmaurice Debajyoti Sinha Nathanael Hevelone Jim Hu Louis L. Nguyen 《Journal of computational and graphical statistics》2017,26(3):734-737

Medical studies increasingly involve a large sample of independent clusters, where the cluster sizes are also large. Our motivating example from the 2010 Nationwide Inpatient Sample (NIS) has 8,001,068 patients and 1049 clusters, with average cluster size of 7627. Consistent parameter estimates can be obtained naively assuming independence, which are inefficient when the intra-cluster correlation (ICC) is high. Efficient generalized estimating equations (GEE) incorporate the ICC and sum all pairs of observations within a cluster when estimating the ICC. For the 2010 NIS, there are 92.6 billion pairs of observations, making summation of pairs computationally prohibitive. We propose a one-step GEE estimator that (1) matches the asymptotic efficiency of the fully iterated GEE; (2) uses a simpler formula to estimate the ICC that avoids summing over all pairs; and (3) completely avoids matrix multiplications and inversions. These three features make the proposed estimator much less computationally intensive, especially with large cluster sizes. A unique contribution of this article is that it expresses the GEE estimating equations incorporating the ICC as a simple sum of vectors and scalars.
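The scale of the pairwise summation can be checked with a short calculation. The sketch below counts within-cluster pairs and shows a standard algebraic identity that replaces a sum over all pairs with two running sums. The identity is a generic shortcut and is not necessarily the formula used in the article; the function names are hypothetical.

```python
from itertools import combinations

def within_cluster_pairs(cluster_sizes):
    """Number of unordered observation pairs, summed over clusters."""
    return sum(n * (n - 1) // 2 for n in cluster_sizes)

def pair_product_sum_naive(r):
    """Sum of r_j * r_k over all pairs j < k: O(n^2) work."""
    return sum(a * b for a, b in combinations(r, 2))

def pair_product_sum_fast(r):
    """Same quantity from two running sums: O(n) work.

    Uses (sum r)^2 = sum r^2 + 2 * sum_{j<k} r_j r_k.
    """
    s = sum(r)
    s2 = sum(v * v for v in r)
    return (s * s - s2) / 2

# Benchmark: 1049 clusters, all at the average size quoted in the abstract.
equal_size_pairs = within_cluster_pairs([7627] * 1049)

# Toy residuals (hypothetical) to compare the two summation routes.
residuals = [1.0, 2.0, 3.0, 4.0]
naive = pair_product_sum_naive(residuals)
fast = pair_product_sum_fast(residuals)
```

With equal cluster sizes the count is about 30.5 billion pairs. For a fixed total, equal sizes give the smallest possible count, so the reported 92.6 billion is consistent with cluster sizes that vary widely.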

4.

Weighted estimating equation: modified GEE in longitudinal data analysis

Tianqing LIU Zhidong BAI Baoxue ZHANG 《Frontiers of Mathematics in China》2014,9(2):329-353

The method of generalized estimating equations (GEE) introduced by K. Y. Liang and S. L. Zeger has been widely used to analyze longitudinal data. Recently, this method has been criticized for a failure to protect against misspecification of working correlation models, which in some cases leads to loss of efficiency or infeasibility of solutions. In this paper, we present a new method, called 'weighted estimating equations (WEE)', for estimating the correlation parameters. The new estimates of correlation parameters are obtained as the solutions of these weighted estimating equations. For some commonly assumed correlation structures, we show that there exists a unique feasible solution to these weighted estimating equations regardless of whether the correlation structure is correctly specified. The new feasible estimates of correlation parameters are consistent when the working correlation structure is correctly specified. Simulation results suggest that the new method works well in finite samples.

5.

Comparison of methods for ordinal lens opacity data from atomic-bomb survivors: univariate worse-eye method and bivariate GEE method using global odds ratio

Eiji Nakashima Kazuo Neriishi Atsushi Minamoto 《Annals of the Institute of Statistical Mathematics》2008,60(3):465-482

In analyses of bivariate ordered polytomous cataract data from atomic-bomb survivors, we compared two methods, the univariate worse-eye method, and the bivariate generalized estimating equations (GEE’s) method using global odds ratio by Williamson et al. (Journal of the American Statistical Association, 90, 1432–1437, 1995). When the association was large and only subject level covariates were used, model selection in the univariate and bivariate methods resulted in the same mean model and similar risk estimates. We showed that the mean parameter and the standard error (SE) in the univariate model are emphasized relative to those in the bivariate model, the biases of which are negligible when the association between both eyes is large. Large sample simulation studies indicated that the univariate Wald statistics are slightly conservative. The simulations also showed that, in bivariate cases, irrespective of the degree of association, the independence estimating equations method with robust SE, and the GEE method with model-based and robust SE are almost fully efficient in parameter estimation when only subject level covariates are included in the mean.

6.

Drawing inferences from logit models for panel data

Eric D. Nordmoe Dipak C. Jain 《Applied Stochastic Models in Business and Industry》2000,16(2):127-145

Logit models have been widely used in marketing to predict brand choice and to make inference about the impact of marketing mix variables on these choices. Most researchers have followed the pioneering example of Guadagni and Little, building choice models and drawing inference conditional on the assumption that the logit model is the correct specification for household purchase behaviour. To the extent that logit models fail to adequately describe household purchase behaviour, statistical inferences from them may be flawed. More importantly, marketing decisions based on these models may be incorrect. This research applies White's robust inference method to logit brand choice models. The method does not impose the restrictive assumption that the assumed logit model specification be true. A sandwich estimator of the covariance ‘corrected’ for possible mis‐specification is the basis for inference about logit model parameters. An important feature of this method is that it yields correct standard errors for the marketing mix parameter estimates even if the assumed logit model specification is not correct. Empirical examples include using household panel data sets from three different product categories to estimate logit models of brand choice. The standard errors obtained using traditional methods are compared with those obtained by White's robust method. The findings illustrate that incorrectly assuming the logit model to be true typically yields standard errors which are biased downward by 10–40 per cent. Conditions under which the bias is particularly severe are explored. Under these conditions, the robust approach is recommended. Copyright © 2000 John Wiley & Sons, Ltd.
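The contrast between traditional and sandwich standard errors can be sketched for the simplest possible case. The code below assumes a binary logit with one covariate and no intercept, fitted by Newton-Raphson on hypothetical toy data. It is not the brand choice model of the article, and all function names are illustrative. The model-based variance is 1/A and the sandwich variance is B/A², where A is the observed information and B is the sum of squared score contributions.

```python
import math

def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))

def score(xs, ys, beta):
    """Derivative of the log-likelihood with respect to beta."""
    return sum(x * (y - sigmoid(beta * x)) for x, y in zip(xs, ys))

def fit_logit_1d(xs, ys, iters=50, tol=1e-12):
    """Newton-Raphson MLE for P(y=1|x) = sigmoid(beta * x), no intercept."""
    beta = 0.0
    for _ in range(iters):
        info = sum(x * x * sigmoid(beta * x) * (1.0 - sigmoid(beta * x)) for x in xs)
        step = score(xs, ys, beta) / info
        beta += step
        if abs(step) < tol:
            break
    return beta

def standard_errors(xs, ys, beta):
    """Return (model-based SE, sandwich SE) for beta."""
    ps = [sigmoid(beta * x) for x in xs]
    a = sum(x * x * p * (1.0 - p) for x, p in zip(xs, ps))      # "bread": information
    b = sum((x * (y - p)) ** 2 for x, y, p in zip(xs, ys, ps))  # "meat": squared scores
    return math.sqrt(1.0 / a), math.sqrt(b) / a

# Hypothetical toy data, chosen so that the MLE exists (no perfect separation).
xs = [-2.0, -1.0, -1.0, 0.5, 1.0, 1.0, 2.0, 2.0]
ys = [0, 0, 1, 0, 1, 0, 1, 1]
beta_hat = fit_logit_1d(xs, ys)
se_model, se_robust = standard_errors(xs, ys, beta_hat)
```

When the logit specification is correct, A and B estimate the same quantity and the two standard errors agree in large samples. A gap between them is a sign of possible misspecification.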

7.

Robust estimation of variance components in partially linear mixed-effects models

秦国友 朱仲义 《应用概率统计》2007,23(2):207-214

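The variance components of partially linear mixed-effects models are parameters of interest, and many estimation methods have been proposed in the literature. Many of these methods, such as maximum likelihood estimation (MLE) and restricted maximum likelihood estimation (REMLE), can be cast as generalized estimating equation (GEE) methods, and GEE methods are sensitive to outliers. This paper proposes a set of robust estimating equations for the mean and variance components of the partially linear mixed-effects model (PLMM), which estimate the mean and the variance components robustly and simultaneously. Simulations are carried out to examine the effectiveness of the proposed robust estimators, and two real examples illustrate the feasibility of the proposed method.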

8.

Ridge estimation for multinomial logit models with symmetric side constraints

Faisal Maqbool Zahid Gerhard Tutz 《Computational Statistics》2013,28(3):1017-1034

In multinomial logit models, the identifiability of parameter estimates is typically obtained by side constraints that specify one of the response categories as reference category. When parameters are penalized, shrinkage of estimates should not depend on the reference category. In this paper we investigate ridge regression for the multinomial logit model with symmetric side constraints, which yields parameter estimates that are independent of the reference category. In simulation studies the results are compared with the usual maximum likelihood estimates and an application to real data is given.
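The dependence on the reference category can be illustrated numerically. The sketch below uses hypothetical coefficients and function names and is not the authors' estimation procedure. It writes the same three-category model under two different reference categories, shows that the fitted probabilities agree while the ridge penalties differ, and then re-centers both parameterizations to sum-to-zero (symmetric) constraints, where they coincide.

```python
import math

def softmax_probs(betas, x):
    """Category probabilities for a list of per-category coefficient vectors."""
    etas = [sum(b * v for b, v in zip(beta, x)) for beta in betas]
    m = max(etas)  # subtract the maximum for numerical stability
    exps = [math.exp(e - m) for e in etas]
    total = sum(exps)
    return [e / total for e in exps]

def to_symmetric(betas):
    """Re-center so that, for each covariate, coefficients sum to zero over categories."""
    k = len(betas)
    p = len(betas[0])
    means = [sum(beta[j] for beta in betas) / k for j in range(p)]
    return [[beta[j] - means[j] for j in range(p)] for beta in betas]

def ridge_penalty(betas):
    """Sum of squared coefficients over all categories and covariates."""
    return sum(b * b for beta in betas for b in beta)

# The same model, coded with category 3 and then category 1 as reference.
ref_last = [[1.0, -2.0], [0.5, 0.5], [0.0, 0.0]]
ref_first = [[0.0, 0.0], [-0.5, 2.5], [-1.0, 2.0]]

x = [0.3, -1.2]  # hypothetical covariate vector
probs_last = softmax_probs(ref_last, x)
probs_first = softmax_probs(ref_first, x)

sym_last = to_symmetric(ref_last)
sym_first = to_symmetric(ref_first)
```

Here the penalty is 5.5 under one reference coding and 11.5 under the other, although both describe identical probabilities. After re-centering, both give the same coefficients and a penalty of 4.0.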

9.

Robust estimation in joint mean–covariance regression model for longitudinal data

Xueying Zheng Wing Kam Fung Zhongyi Zhu 《Annals of the Institute of Statistical Mathematics》2013,65(4):617-638

In this paper, we develop robust estimation for the mean and covariance jointly for the regression model of longitudinal data within the framework of generalized estimating equations (GEE). The proposed approach integrates the robust method and joint mean–covariance regression modeling. Robust generalized estimating equations using bounded scores and leverage-based weights are employed for the mean and covariance to achieve robustness against outliers. The resulting estimators are shown to be consistent and asymptotically normally distributed. Simulation studies are conducted to investigate the effectiveness of the proposed method. As expected, the robust method outperforms its non-robust version under contamination. Finally, we illustrate the method by analyzing a hormone data set. By downweighting the potential outliers, the proposed method not only shifts the estimation in the mean model, but also shrinks the range of the innovation variance, leading to a more reliable estimation of the covariance matrix.

10.

Random Survival Forests Models for SME Credit Risk Measurement

Dean Fantazzini Silvia Figini 《Methodology and Computing in Applied Probability》2009,11(1):29-45

This paper extends the existing literature on empirical research in the field of credit risk default for Small and Medium Enterprises (SMEs). We propose a non-parametric approach based on Random Survival Forests (RSF) and we compare its performance with a standard logit model. To the authors’ knowledge, no studies in the area of credit risk default for SMEs have used a variety of statistical methodologies to test the reliability of their predictions and to compare their performance against one another. As for the in-sample results, we find that our non-parametric model performs much better than the classical logit model. As for the out-of-sample performance, the evidence is just the opposite, and the logit performs better than the RSF model. We explain this evidence by showing how error in the estimates of default probabilities can affect classification error when the estimates are used in a classification rule.

11.

Estimation of parameters in latent class models using fuzzy clustering algorithms

《European Journal of Operational Research》2005,160(2):515-531

A mixture approach to clustering is an important technique in cluster analysis. A mixture of multivariate multinomial distributions is usually used to analyze categorical data with a latent class model. Parameter estimation is an important step for a mixture distribution. Described here are four approaches to estimating the parameters of a mixture of multivariate multinomial distributions. The first approach is an extended maximum likelihood (ML) method. The second approach is based on the well-known expectation maximization (EM) algorithm. The third approach is the classification maximum likelihood (CML) algorithm. In this paper, we propose a new approach using the so-called fuzzy class model and then create the fuzzy classification maximum likelihood (FCML) approach for categorical data. The accuracy, robustness and effectiveness of these four types of algorithms for estimating the parameters of multivariate binomial mixtures are compared using real empirical data and samples drawn from the multivariate binomial mixtures of two classes. The results show that the proposed FCML algorithm presents better accuracy, robustness and effectiveness. Overall, the FCML algorithm is superior to the ML, EM and CML algorithms. Thus, we recommend FCML as another good tool for estimating the parameters of mixture multivariate multinomial models.

12.

Shrinkage estimation analysis of correlated binary data with a diverging number of parameters

XU PeiRong FU WenJiang ZHU LiXing 《Science China Mathematics》2013,56(2):359-377

To analyze correlated binary data with high-dimensional covariates, we propose a two-stage shrinkage approach. First, we construct a weighted least-squares (WLS) type function using a special weighting scheme on the non-conservative vector field of the generalized estimating equations (GEE) model. Second, we define a penalized WLS in the spirit of the adaptive LASSO for simultaneous variable selection and parameter estimation. The proposed procedure enjoys the oracle properties in a high-dimensional framework where the number of parameters grows to infinity with the number of clusters. Moreover, we prove the consistency of the sandwich formula of the covariance matrix even when the working correlation matrix is misspecified. For the selection of the tuning parameter, we develop a consistent penalized quadratic form (PQF) function criterion. The performance of the proposed method is assessed through a comparison with the existing methods and through an application to a crossover trial in a pain relief study.

13.

Multivariate probit models for conditional claim-types

Gary Young Emiliano A. Valdez 《Insurance: Mathematics and Economics》2009,44(2):214-228

This paper considers statistical modeling of the types of claim in a portfolio of insurance policies. For some classes of insurance contracts, in a particular period, it is possible to have a record of whether or not there is a claim on the policy, the types of claims made on the policy, and the amount of claims arising from each of the types. A typical example is automobile insurance where in the event of a claim, we are able to observe the amounts that arise from say injury to oneself, damage to one’s own property, damage to a third party’s property, and injury to a third party. Modeling the frequency and the severity components of the claims can be handled using traditional actuarial procedures. However, modeling the claim-type component is less known and in this paper, we recommend analyzing the distribution of these claim-types using multivariate probit models, which can be viewed as latent variable threshold models for the analysis of multivariate binary data. A recent article by Valdez and Frees [Valdez, E.A., Frees, E.W., Longitudinal modeling of Singapore motor insurance. University of New South Wales and the University of Wisconsin-Madison. Working Paper. Dated 28 December 2005, available from: http://wwwdocs.fce.unsw.edu.au/actuarial/research/papers/2006/Valdez-Frees-2005.pdf] considered this decomposition to extend the traditional model by including the conditional claim-type component, and proposed the multinomial logit model to empirically estimate this component. However, it is well known in the literature that this type of model assumes independence across the different outcomes. We investigate the appropriateness of fitting a multivariate probit model to the conditional claim-type component in which the outcomes may in fact be correlated, with possible inclusion of important covariates. 
Our estimation results show that when the outcomes are correlated, the multinomial logit model produces substantially different predictions relative to the true predictions; and second, through a simulation analysis, we find that even in ideal conditions under which the outcomes are independent, multinomial logit is still a poor approximation to the true underlying outcome probabilities relative to the multivariate probit model. The results of this paper serve to highlight the trade-off between tractability and flexibility when choosing the appropriate model.

14.

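Admissibility and general admissibility of linear estimators of regression coefficients in growth curve models with constraints (in English)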

胡雪梅 王志忠 高骥忠 《数学杂志》2010,30(1)

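This paper studies the admissibility and general admissibility of linear estimators of the multivariate regression coefficients in growth curve models with constrained parameters. Using eight optimality criteria for the class of linear estimators together with the 圣函数, necessary and sufficient conditions are obtained for linear estimators to be admissible in three equivalence classes and for linear estimators of the regression coefficients to be generally admissible. The results extend the work of 覃红 et al.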

15.

Penalized marginal likelihood estimation of finite mixtures of Archimedean copulas

Göran Kauermann Renate Meyer 《Computational Statistics》2014,29(1-2):283-306

This paper proposes finite mixtures of different Archimedean copula families as a flexible tool for modelling the dependence structure in multivariate data. A novel approach to estimating the parameters in this mixture model is presented by maximizing the penalized marginal likelihood via iterative quadratic programming. The motivation for the penalized marginal likelihood stems from an underlying Bayesian model that imposes a prior distribution on the parameter of each Archimedean copula family. An approximative marginal likelihood is obtained by a classical quadrature discretization of the integral w.r.t. each family-specific prior distribution, thus yielding a finite mixture model. Family-specific smoothness penalties are added and the penalized marginal likelihood is maximized using an iterative quadratic programming routine. For comparison purposes, we also present a fully Bayesian approach via simulation-based posterior computation. The performance of the novel estimation approach is evaluated by simulations and two examples involving the modelling of the interdependence of exchange rates and of wind speed measurements, respectively. For these examples, penalized marginal likelihood estimates are compared to the corresponding Bayesian estimates.

16.

Penalized multinomial mixture logit model

Shaheena Bashir Edward M. Carter 《Computational Statistics》2010,25(1):121-141

Normal distribution based discriminant methods have been used for the classification of new entities into different groups based on a discriminant rule constructed from the learning set. In practice, if the groups are not homogeneous, then mixture discriminant analysis of Hastie and Tibshirani (J R Stat Soc Ser B 58(1):155–176, 1996) is a useful approach, assuming that the distribution of the feature vectors is a mixture of multivariate normals. In this paper a new logistic regression model for heterogeneous group structure of the learning set is proposed based on penalized multinomial mixture logit models. This approach is shown through simulation studies to be more effective. The results were compared with the standard mixture discriminant analysis approach using the probability of misclassification criterion. This comparison showed a slight reduction in the average probability of misclassification using this penalized multinomial mixture logit model as compared to the classical discriminant rules. It also showed better results when applied to practical life data problems, producing smaller errors.

17.

Predicting financial distress of listed companies based on a panel logit model

卢永艳 王维国 《数学的实践与认识》2010,40(5)

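Most existing research on financial distress prediction is confined to static econometric and statistical models based on cross-sectional data, ignoring the fact that a company's financial condition changes over time. To capture how a company's financial condition evolves, a panel logit probability model is built using panel data. The results show that the panel logit model outperforms the ordinary logit model in prediction accuracy.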

18.

Large sample properties of estimates of a discrete grade of membership model

H. Dennis Tolley Kenneth G. Manton 《Annals of the Institute of Statistical Mathematics》1992,44(1):85-95

Increasingly, fuzzy partitions are being used in multivariate classification problems as an alternative to the crisp classification procedures commonly used. One such fuzzy partition, the grade of membership model, partitions individuals into fuzzy sets using multivariate categorical data. Although the statistical methods used to estimate fuzzy membership for this model are based on maximum likelihood methods, large sample properties of the estimation procedure are problematic for two reasons. First, the number of incidental parameters increases with the size of the sample. Second, estimated parameters fall on the boundary of the parameter space with non-zero probability. This paper examines the consistency of the likelihood approach when estimating the components of a particular probability model that gives rise to a fuzzy partition. The results of the consistency proof are used to determine the large sample distribution of the estimates. Common methods of classifying individuals based on multivariate observations attempt to place each individual into crisply defined sets. The fuzzy partition allows for individual to individual heterogeneity, beyond simply errors in measurement, by defining a set of pure type characteristics and determining each individual's distance from these pure types. Both the profiles of the pure types and the heterogeneity of the individuals must be estimated from data. These estimates empirically define the fuzzy partition. In the current paper, this data is assumed to be categorical data. Because of the large number of parameters to be estimated and the limitations of categorical data, one may be concerned about whether or not the fuzzy partition can be estimated consistently. This paper shows that if heterogeneity is measured with respect to a fixed number of moments of the grade of membership scores of each individual, the estimated fuzzy partition is consistent.

19.

A comparison of generalized multinomial logit and latent class approaches to studying consumer heterogeneity with some extensions of the generalized multinomial logit model

Joseph Pancras Dipak K. Dey 《Applied Stochastic Models in Business and Industry》2011,27(6):567-578

We calibrate and contrast the recent generalized multinomial logit model and the widely used latent class logit model approaches for studying heterogeneity in consumer purchases. We estimate the parameters of the models on panel data of household ketchup purchases, and find that the generalized multinomial logit model outperforms the best‐fitting latent class logit model in terms of the Bayesian information criterion. We compare the posterior estimates of coefficients for individual customers based on the two different models and discuss how the differences could affect marketing strategies (such as pricing), which could be affected by applying each of the models. We also describe extensions to the scale heterogeneity model that includes the effects of state dependence and purchase history. Copyright © 2011 John Wiley & Sons, Ltd.

20.

An integral transform method for estimating the central mean and central subspaces

Peng Zeng Yu Zhu 《Journal of multivariate analysis》2010,101(1):271-290

The central mean and central subspaces of generalized multiple index model are the main inference targets of sufficient dimension reduction in regression. In this article, we propose an integral transform (ITM) method for estimating these two subspaces. Applying the ITM method, estimates are derived, separately, for two scenarios: (i) No distributional assumptions are imposed on the predictors, and (ii) the predictors are assumed to follow an elliptically contoured distribution. These estimates are shown to be asymptotically normal with the usual root-n convergence rate. The ITM method is different from other existing methods in that it avoids estimation of the unknown link function between the response and the predictors and it does not rely on distributional assumptions of the predictors under scenario (i) mentioned above.