1.

Estimation of parameters in latent class models using fuzzy clustering algorithms

《European Journal of Operational Research》2005,160(2):515-531

A mixture approach to clustering is an important technique in cluster analysis. A mixture of multivariate multinomial distributions is usually used to analyze categorical data with a latent class model. Parameter estimation is an important step in fitting a mixture distribution. Described here are four approaches to estimating the parameters of a mixture of multivariate multinomial distributions. The first approach is an extended maximum likelihood (ML) method. The second approach is based on the well-known expectation maximization (EM) algorithm. The third approach is the classification maximum likelihood (CML) algorithm. In this paper, we propose a new approach using the so-called fuzzy class model and then create the fuzzy classification maximum likelihood (FCML) approach for categorical data. The accuracy, robustness and effectiveness of these four types of algorithms for estimating the parameters of multivariate binomial mixtures are compared using real empirical data and samples drawn from the multivariate binomial mixtures of two classes. The results show that the proposed FCML algorithm presents better accuracy, robustness and effectiveness. Overall, the FCML algorithm is superior to the ML, EM and CML algorithms. Thus, we recommend FCML as another good tool for estimating the parameters of mixture multivariate multinomial models.
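
As a rough illustration of the EM approach named in this abstract, the following is a minimal textbook sketch for a latent class model with binary items (a mixture of independent Bernoulli variables within each class). It is not the authors' implementation and does not cover the ML, CML or FCML variants; the function name `em_bernoulli_mixture`, its parameters, and the clipping constant are assumptions made for this example.

```python
import math
import random


def em_bernoulli_mixture(data, n_classes=2, n_iter=50, seed=0):
    """Textbook EM for a latent class model with binary (0/1) items.

    Hypothetical helper for illustration only. Returns the mixing
    weights, the per-class item probabilities, and the log-likelihood
    evaluated at the start of every iteration.
    """
    rng = random.Random(seed)
    n_items = len(data[0])
    weights = [1.0 / n_classes] * n_classes
    # Random start away from 0 and 1 so that the logs below stay finite.
    probs = [[rng.uniform(0.25, 0.75) for _ in range(n_items)]
             for _ in range(n_classes)]
    log_liks = []
    for _ in range(n_iter):
        # E-step: posterior class membership of every observation.
        resp = []
        total = 0.0
        for x in data:
            joint = []
            for k in range(n_classes):
                lp = math.log(weights[k])
                for j, xj in enumerate(x):
                    p = probs[k][j]
                    lp += math.log(p if xj else 1.0 - p)
                joint.append(math.exp(lp))
            norm = sum(joint)
            total += math.log(norm)
            resp.append([v / norm for v in joint])
        log_liks.append(total)
        # M-step: membership-weighted proportions, clipped away from 0 and 1.
        for k in range(n_classes):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / len(data)
            for j in range(n_items):
                pj = sum(r[k] * x[j] for r, x in zip(resp, data)) / nk
                probs[k][j] = min(max(pj, 1e-6), 1.0 - 1e-6)
    return weights, probs, log_liks
```

Calling `em_bernoulli_mixture(rows)` on a list of equal-length 0/1 lists returns the estimated class weights and item probabilities. The recorded log-likelihoods should not decrease from one iteration to the next, which is a convenient sanity check on any EM implementation.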

2.

Simultaneous Gaussian model-based clustering for samples of multiple origins

A. Lourme C. Biernacki 《Computational Statistics》2013,28(1):371-391

3.

Convex decompositions of fuzzy partitions

James C Bezdek J.Douglas Harris 《Journal of Mathematical Analysis and Applications》1979,67(2):490-512

In this paper we investigate some algebraic and geometric properties of fuzzy partition spaces (convex hulls of hard or conventional partition spaces). In particular, we obtain their dimensions, and describe a number of algorithms for effecting convex decompositions. Two of these are easily programmable, and each affords a different insight about data structures suggested by the fuzzy partition decomposed. We also show how the sequence of partitions in any convex decomposition leads to a matrix for which the norm of the corresponding coefficient vector equals a scalar measure of partition fuzziness used with certain fuzzy clustering algorithms.

4.

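Fuzzy reliability analysis of systems with fuzzy parameters

郭嗣琮 刘海涛 何波 《Fuzzy Systems and Mathematics》2007,21(6):76-83

Reliability theory for systems with fuzzy parameters has a broad range of practical applications, but the difficulty of expressing the membership functions that result from fuzzy-number arithmetic has hindered both theoretical and applied research on the fuzzy reliability of such systems. Using the structured element representation of fuzzy numbers, this paper gives two methods for determining the membership function of a fuzzy expression, and from them derives membership function expressions for the fuzzy reliability of non-repairable series and parallel systems with fuzzy parameters.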

5.

Pseudo-likelihood methodology for partitioned large and complex samples

Geert Molenberghs Geert Verbeke Samuel Iddi 《Statistics & probability letters》2011,81(7):892-901

Large data sets, either coming from a large number of independent replications, or because of hierarchies in the data with large numbers of within-unit replication, may pose challenges to the data analyst up to the point of making conventional inferential methods, such as maximum likelihood, prohibitive. Based on general pseudo-likelihood concepts, we propose a method to partition such a set of data, analyze each partition member, and properly combine the inferences into a single one. It is shown that the method is fully efficient for independent partitions, while with dependent sub-samples efficiency is sometimes but not always equal to one. It is argued that, for important realistic settings, efficiency is often very high. Illustrative examples enhance insight in the method’s operation, while real-data analysis underscores its power for practice.

6.

Measuring the fuzziness of human thoughts: An application of fuzzy sets to sociological research

Shaomin Li 《The Journal of mathematical sociology》2013,37(1):67-84

Conventionally, sociologists measure the membership of an individual to a group by a “0 or 1” characteristic function. But when the definition of that group is fuzzy and an individual is neither a full member nor a nonmember, this dichotomous characteristic function may distort the reality. Instead of the “0 or 1” characteristic function by classical set theory, fuzzy set theory introduces a membership function which is a gradation from 0 to 1 to measure the degree to which an object (an individual) belongs to a concept (a group). Based on the rationale of fuzzy set theory, we suggest some new methods of data collection and analysis. Among several noteworthy findings, two points are emphasized: 1) the fuzzy set is an appropriate way of measuring the fuzziness of human thought; and 2) it allows one to relax the conventional assumption that all individuals have identical distributions and deviations around their means.

7.

Adaptive Bayesian Nonstationary Modeling for Large Spatial Datasets Using Covariance Approximations

Bledar A. Konomi Huiyan Sang Bani K. Mallick 《Journal of computational and graphical statistics》2013,22(3):802-829

Gaussian process models have been widely used in spatial statistics but face tremendous modeling and computational challenges for very large nonstationary spatial datasets. To address these challenges, we develop a Bayesian modeling approach using a nonstationary covariance function constructed based on adaptively selected partitions. The partitioned nonstationary class allows one to knit together local covariance parameters into a valid global nonstationary covariance for prediction, where the local covariance parameters are allowed to be estimated within each partition to reduce computational cost. To further facilitate the computations in local covariance estimation and global prediction, we use the full-scale covariance approximation (FSA) approach for the Bayesian inference of our model. One of our contributions is to model the partitions stochastically by embedding a modified treed partitioning process into the hierarchical models that leads to automated partitioning and substantial computational benefits. We illustrate the utility of our method with simulation studies and the global Total Ozone Matrix Spectrometer (TOMS) data. Supplementary materials for this article are available online.

8.

Linear intensification of probabilistic fuzzy partitions

《Fuzzy Sets and Systems》2004,141(2):319-332

Fuzziness of fuzzy sets can be reduced by intensification (sharpening) of large membership grades closer to one and small membership grades closer to zero. The relationship between intensification and partial defuzzification of a special type of fuzzy sets, fuzzy clusters, is studied. Some guidelines for assessment of the degree of possible defuzzification of a probabilistic fuzzy partition are suggested. An operator of linear intensification of fuzzy clusters is proposed and illustrated with examples. It is shown how different goals of partial defuzzification can be achieved by modification of this operator.

9.

Semiparametric Bayesian Regression via Potts Model

Alejandro Murua Fernando A. Quintana 《Journal of computational and graphical statistics》2017,26(2):265-274

We consider Bayesian nonparametric regression through random partition models. Our approach involves the construction of a covariate-dependent prior distribution on partitions of individuals. Our goal is to use covariate information to improve predictive inference. To do so, we propose a prior on partitions based on the Potts clustering model associated with the observed covariates. This drives by covariate proximity both the formation of clusters, and the prior predictive distribution. The resulting prior model is flexible enough to support many different types of likelihood models. We focus the discussion on nonparametric regression. Implementation details are discussed for the specific case of multivariate multiple linear regression. The proposed model performs well in terms of model fitting and prediction when compared to other alternative nonparametric regression approaches. We illustrate the methodology with an application to the health status of nations at the turn of the 21st century. Supplementary materials are available online.

10.

Asymptotic Properties of a Class of Mixture Models for Failure Data: The Interior and Boundary Cases

H. T. V. Vu R. A. Maller X. Zhou 《Annals of the Institute of Statistical Mathematics》1998,50(4):627-653

We analyse an exponential family of distributions which generalises the exponential distribution for censored failure time data, analogous to the way in which the class of generalised linear models generalises the normal distribution. The parameter of the distribution depends on a linear combination of covariates via a possibly nonlinear link function, and we allow another level of heterogeneity: the data may contain "immune" individuals who are not subject to failure. Thus the data is modelled by a mixture of a distribution from the exponential family and a "mass at infinity" representing individuals who never fail. Our results include large sample distributions for parameter estimators and for hypothesis test statistics obtained by maximising the likelihood of a sample. The asymptotic distribution of the likelihood ratio test statistic for the hypothesis that there are no immunes present in the population is shown to be "non-standard"; it is a 50-50 mixture of a chi-squared distribution on 1 degree of freedom and a point mass at 0. Our analysis clearly shows how "negligibility" of individual covariate values and "sufficient followup" conditions are required for the asymptotic properties.
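
To make the "non-standard" limit concrete: if the likelihood ratio statistic is asymptotically a 50-50 mixture of a point mass at 0 and a chi-squared variable on 1 degree of freedom, then for an observed value t > 0 the p-value is half the usual chi-squared tail probability. The sketch below only illustrates that arithmetic, assuming the stated limit holds; the function name `boundary_lrt_pvalue` is hypothetical and nothing here is taken from the paper's own derivations.

```python
import math


def boundary_lrt_pvalue(lr_statistic):
    """p-value under a 50-50 mixture of a point mass at 0 and chi-squared(1).

    Hypothetical helper for illustration; assumes the asymptotic mixture
    limit quoted in the abstract applies to the statistic supplied.
    """
    if lr_statistic <= 0.0:
        # Every outcome is at least as extreme as a statistic of zero.
        return 1.0
    # For a chi-squared(1) variable: P(X >= t) = P(|Z| >= sqrt(t))
    #                                          = erfc(sqrt(t / 2)).
    chi2_tail = math.erfc(math.sqrt(lr_statistic / 2.0))
    return 0.5 * chi2_tail
```

One consequence is that the 5% critical value is the 90th percentile of the chi-squared(1) distribution (about 2.71) rather than the 95th (about 3.84), so treating the statistic as plain chi-squared(1) would make the test conservative.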

11.

Grouped Dirichlet distribution: A new tool for incomplete categorical data analysis

Kai Wang Ng Ming Tan 《Journal of multivariate analysis》2008,99(3):490-509

Motivated by the likelihood functions of several incomplete categorical data, this article introduces a new family of distributions, grouped Dirichlet distributions (GDD), which includes the classical Dirichlet distribution (DD) as a special case. First, we develop distribution theory for the GDD in its own right. Second, we use this expanded family as a new tool for statistical analysis of incomplete categorical data. Starting with a GDD with two partitions, we derive its stochastic representation that provides a simple procedure for simulation. Other properties such as mixed moments, mode, marginal and conditional distributions are also derived. The general GDD with more than two partitions is considered in a parallel manner. Three data sets from a case-control study, a leprosy survey, and a neurological study are used to illustrate how the GDD can be used as a new tool for analyzing incomplete categorical data. Our approach based on GDD has at least two advantages over the commonly used approach based on the DD in both frequentist and conjugate Bayesian inference: (a) in some cases, both the maximum likelihood and Bayes estimates have closed-form expressions in the new approach, but not so when they are based on the commonly-used approach; and (b) even if a closed-form solution is not available, the EM and data augmentation algorithms in the new approach converge much faster than in the commonly-used approach.

12.

Tuning membership functions of kernel fuzzy classifiers by maximizing margins

Kazuya Morikawa Seiichi Ozawa Shigeo Abe 《Memetic Computing》2009,1(3):221-228

We propose two methods for tuning membership functions of a kernel fuzzy classifier based on the idea of SVM (support vector machine) training. We assume that in a kernel fuzzy classifier a fuzzy rule is defined for each class in the feature space. In the first method, we tune the slopes of the membership functions at the same time so that the margin between classes is maximized under the constraints that the degree of membership to which a data sample belongs is the maximum among all the classes. This method is similar to a linear all-at-once SVM. We call this AAO tuning. In the second method, we tune the membership function of a class one at a time. Namely, for a class the slope of the associated membership function is tuned so that the margin between the class and the remaining classes is maximized under the constraints that the degrees of membership for the data belonging to the class are large and those for the remaining data are small. This method is similar to a linear one-against-all SVM. This is called OAA tuning. According to the computer experiment for fuzzy classifiers based on kernel discriminant analysis and those with ellipsoidal regions, usually both methods improve classification performance by tuning membership functions and classification performance by AAO tuning is slightly better than that by OAA tuning.

13.

Maximum likelihood estimation from fuzzy data using the EM algorithm

Thierry Denœux 《Fuzzy Sets and Systems》2011,183(1):72-91

A method is proposed for estimating the parameters in a parametric statistical model when the observations are fuzzy and are assumed to be related to underlying crisp realizations of a random sample. This method is based on maximizing the observed-data likelihood defined as the probability of the fuzzy data. It is shown that the EM algorithm may be used for that purpose, which makes it possible to solve a wide range of statistical problems involving fuzzy data. This approach, called the fuzzy EM (FEM) method, is illustrated using three classical problems: normal mean and variance estimation from a fuzzy sample, multiple linear regression with crisp inputs and fuzzy outputs, and univariate finite normal mixture estimation from fuzzy data.

14.

Evaluation of fuzzy linear regression models by comparing membership functions

Byungjoon Kim Ram R. Bishu 《Fuzzy Sets and Systems》1998,100(1-3):343-352

Fuzzy linear regression models can provide an estimated fuzzy number that has a fuzzy membership function. If a point that has the highest membership value from the estimated fuzzy number is not within the support of the observed fuzzy membership function, a decision-maker can have high risk from the estimate. In this study a modification of fuzzy linear regression analysis based on a criterion of minimizing the difference of the fuzzy membership values between the observed and estimated fuzzy numbers is proposed. Two numerical examples are used to evaluate the fuzzy regression models.

15.

A multivariate stochastic model with non-stationary trend component

Hiroko Kato Sadao Naniwa Makio Ishiguro 《Applied Stochastic Models in Business and Industry》1995,11(1):77-95

The purposes of this paper are to introduce a multivariate non-stationary stochastic time series model without individual detrending and to extract the multiple relationships between variables. To infer the statistical relation between variables, we attempt to estimate the co-movement of multivariate non-stationary time series components. The model is expressed in state-space form, and time series components are estimated by the maximum likelihood method using a numerical optimization algorithm. The Kalman filter algorithm is used to compute the likelihood of the model. The AIC procedure gives a criterion for selecting the best model fit for the data. The multiple relationships become clear by analysing estimated AR coefficients. Real economic data are used for a numerical example.
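
The AIC-based selection step mentioned in this abstract can be sketched generically: each candidate model is scored as twice its number of free parameters minus twice its maximised log-likelihood, and the smallest score wins. The candidate names and numbers below are invented purely for illustration and are not results from the paper; in the setting described, the log-likelihoods would come from the Kalman filter.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion; smaller values are preferred."""
    return 2 * n_params - 2 * log_likelihood


# Made-up (maximised log-likelihood, number of free parameters) pairs,
# used only to show the selection mechanics.
candidates = {
    "model_a": (-520.4, 6),
    "model_b": (-512.1, 9),
    "model_c": (-511.8, 14),
}

# Pick the candidate with the smallest AIC.
best = min(candidates, key=lambda name: aic(*candidates[name]))
```

With these invented inputs the middle model is selected: its gain in log-likelihood over the smallest model outweighs the penalty for three extra parameters, while the largest model's further gain does not justify five more.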

16.

Fuzzy finite element model updating of the DLR AIRMOD test structure

《Applied Mathematical Modelling》2017

This article presents the application of finite-element fuzzy model updating to the DLR AIRMOD structure. The proposed approach is initially demonstrated on a simulated mass-spring system with three degrees of freedom. Considering the effect of the assembly process on variability measurements, modal tests were carried out for the repeatedly disassembled and reassembled DLR AIRMOD structure. The histograms of the measured data attributed to the uncertainty of the structural components in terms of mass and stiffness are utilised to obtain the membership functions of the chosen fuzzy outputs and to determine the updated membership functions of the uncertain input parameters represented by fuzzy variables. In this regard, a fuzzy parameter is introduced to represent a set of interval parameters through the membership function, and a meta model (kriging, in this work) is used to speed up the updating. The use of non-probabilistic models, i.e. interval and fuzzy models, for updating models with uncertainties is often more practical when the large quantities of test data that are necessary for probabilistic model updating are unavailable.

17.

Amalgamation of partitions from multiple segmentation bases: A comparison of non-model-based and model-based methods

Rick L. Andrews Michael J. Brusco Imran S. Currim 《European Journal of Operational Research》2010

The segmentation of customers on multiple bases is a pervasive problem in marketing research. For example, segmentation service providers partition customers using a variety of demographic and psychographic characteristics, as well as an array of consumption attributes such as brand loyalty, switching behavior, and product/service satisfaction. Unfortunately, the partitions obtained from multiple bases are often not in good agreement with one another, making effective segmentation a difficult managerial task. Therefore, the construction of segments using multiple independent bases often results in a need to establish a partition that represents an amalgamation or consensus of the individual partitions. In this paper, we compare three methods for finding a consensus partition. The first two methods are deterministic, do not use a statistical model in the development of the consensus partition, and are representative of methods used in commercial settings, whereas the third method is based on finite mixture modeling. In a large-scale simulation experiment the finite mixture model yielded better average recovery of holdout (validation) partitions than its non-model-based competitors. This result calls for important changes in the current practice of segmentation service providers that group customers for a variety of managerial goals related to the design and marketing of products and services.

18.

Time-based detection of changes to multivariate patterns

Jing Hu George Runger 《Annals of Operations Research》2010,174(1):67-81

Detection of changes to multivariate patterns is an important topic in a number of different domains. Modern data sets often include categorical and numerical data and potentially complex in-control regions. Given a flexible, robust decision rule for this environment that signals based on an individual observation vector, an important issue is how to extend the rule to incorporate time-based information. A decision rule can be learned to detect shifts through artificial data that transforms the problem to one of supervised learning. Then class probability ratios are derived from a relationship to likelihood ratios to form the basis for time-weighted updates of the monitoring scheme.

19.

Comparing partitions of two sets of units based on the same variables

Genane Youness Gilbert Saporta 《Advances in Data Analysis and Classification》2010,4(1):53-64

We propose a procedure based on a latent variable model for the comparison of two partitions of different units described by the same set of variables. The null hypothesis here is that the two partitions come from the same underlying mixture model. We define a method of “projecting” partitions using a supervised classification method: once one partition is taken as a reference, the individuals of the second data set are allocated to the clusters of the reference partition. This gives two partitions of the same units of the second data set, the original and the projected one, and we evaluate their difference by the usual measures of association. The empirical distributions of the association measures are derived by simulation.

20.

Extracting compact fuzzy rules for nonlinear system modeling using subtractive clustering, GA and unscented filter

M. Eftekhari S.D. Katebi 《Applied Mathematical Modelling》2008

This paper presents a two-stage procedure for building an optimal fuzzy model from data for nonlinear dynamical systems. Both stages are embedded into a Genetic Algorithm (GA). In the first stage, emphasis is placed on structural optimization by assigning a suitable fitness to each individual member of the population in a canonical GA. These individuals represent coded information about the structure of the model (number of antecedents and rules). This information is consequently utilized by subtractive clustering to partition the input space and construct a compact fuzzy rule base. In the second stage, an Unscented Filter (UF) is employed for optimization of model parameters, that is, parameters of the input–output Membership Functions (MFs).