Similar literature
20 similar articles found
1.
Analysis of large-dimensional contingency tables is difficult. Fienberg and Kim (1999, Journal of the American Statistical Association, 94, 229–239) studied the problem of combining conditional (on a single variable) log-linear structures for graphical models to obtain partial information about the full graphical log-linear model. In this paper, we consider general log-linear models and obtain an explicit representation for the log-linear parameters of the full model based on those of the conditional structures. As a consequence, we give conditions under which a particular log-linear parameter is present or absent in the full model. Some of the main results of Fienberg and Kim follow from our results. The explicit relationships between the full model and the conditional structures are also presented, and the connections between conditional structures and layer structures are pointed out. We also investigate the hierarchical nature of the full model, based on conditional structures. Kim (2006, Computational Statistics and Data Analysis, 50, 2044–2064) analyzed graphical log-linear models based on conditional log-linear structures when a set of variables is conditioned on. For this case, we employ the Möbius inversion technique to obtain the interaction parameters of the full log-linear model and discuss their properties. The hierarchical nature of the full model is also studied based on conditional structures. This result could also be used effectively for model selection. As applications of our results, we discuss several typical examples, including a real-life example.
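As a rough illustration of the Möbius inversion mentioned in this abstract, the sketch below recovers corner-point (reference level 0) interaction terms from the log cell probabilities of a full binary table. It is the standard textbook inversion on the full table, not the paper's result for conditional structures; the function names and the 2x2 example table are assumptions made for illustration.

```python
from itertools import combinations, product
from math import log

def subsets(items):
    """Yield every subset of `items` as a tuple, smallest first."""
    for r in range(len(items) + 1):
        yield from combinations(items, r)

def interaction_terms(p, n_vars):
    """Corner-point log-linear terms of a binary table via Mobius inversion.

    `p` maps each cell (a tuple of 0/1 levels) to a positive probability.
    h(B) is the log probability of the cell with the variables in B at
    level 1 and all other variables at the reference level 0. The term for
    a variable set A is u(A) = sum over B subset of A of (-1)^(|A|-|B|) h(B).
    """
    def h(subset):
        cell = tuple(1 if v in subset else 0 for v in range(n_vars))
        return log(p[cell])

    return {
        a: sum((-1) ** (len(a) - len(b)) * h(b) for b in subsets(a))
        for a in subsets(tuple(range(n_vars)))
    }

# Hypothetical 2x2 table built from independent margins (illustration only).
row, col = (0.3, 0.7), (0.4, 0.6)
p_indep = {(i, j): row[i] * col[j] for i, j in product((0, 1), repeat=2)}
u = interaction_terms(p_indep, 2)
```

For this independent table the two-way term `u[(0, 1)]` is the log odds ratio and vanishes up to rounding, which matches the idea that an absent interaction parameter corresponds to a missing edge.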

2.
Applications of log-linear models in discrete discriminant analysis usually treat the grouping variable as a variable in the model. An alternative parameterization is introduced which models the association structure between variables for each population separately. The separate log-linear models may have differing complexity. It is shown that these approaches lead to different classes of models. Applications to the choice of car brand and credit scoring show the usefulness of separate modelling.

3.
The paper considers general multiplicative models for complete and incomplete contingency tables that generalize log-linear and several other models and are entirely coordinate free. Sufficient conditions for the existence of maximum likelihood estimates under these models are given, and it is shown that the usual equivalence between multinomial and Poisson likelihoods holds if and only if an overall effect is present in the model. If such an effect is not assumed, the model becomes a curved exponential family and a related mixed parameterization is given that relies on non-homogeneous odds ratios. Several examples are presented to illustrate the properties and use of such models.

4.
We consider the problem of learning the structure of a pairwise graphical model over continuous and discrete variables. We present a new pairwise model for graphical models with both continuous and discrete variables that is amenable to structure learning. In previous work, authors have considered structure learning of Gaussian graphical models and structure learning of discrete models. Our approach is a natural generalization of these two lines of work to the mixed case. The penalization scheme involves a novel symmetric use of the group-lasso norm and follows naturally from a particular parameterization of the model. Supplementary materials for this article are available online.

5.
Bayesian networks are one of the most widely used tools for modeling multivariate systems. It has been demonstrated that more expressive models, which can capture additional structure in each conditional probability table (CPT), may enjoy improved predictive performance over traditional Bayesian networks despite having fewer parameters. Here we investigate this phenomenon for models of varying degrees of expressiveness on both extensive synthetic and real data. To characterize the regularities within CPTs in terms of independence relations, we introduce the notion of partial conditional independence (PCI) as a generalization of the well-known concept of context-specific independence (CSI). To model the structure of the CPTs, we use different graph-based representations which are convenient from a learning perspective. In addition to the previously studied decision trees and graphs, we introduce the concept of PCI-trees as a natural extension of the CSI-based trees. To identify plausible models we use the Bayesian score in combination with a greedy search algorithm. A comparison against ordinary Bayesian networks shows that models with local structures in general enjoy parametric sparsity and improved out-of-sample predictive performance; however, it is often necessary to regulate the model fit with an appropriate model structure prior to avoid overfitting in the learning process. The tree structures, in particular, lead to high-quality models and suggest considerable potential for further exploration.

6.
Probabilistic Decision Graphs (PDGs) are probabilistic graphical models that represent a factorisation of a discrete joint probability distribution using a “decision graph”-like structure over local marginal parameters. The structure of a PDG enables the model to capture some context specific independence relations that are not representable in the structure of more commonly used graphical models such as Bayesian networks and Markov networks. This sometimes makes operations in PDGs more efficient than in alternative models. PDGs have previously been defined only in the discrete case, assuming a multinomial joint distribution over the variables in the model. We extend PDGs to incorporate continuous variables, by assuming a Conditional Gaussian (CG) joint distribution. We also show how inference can be carried out in an efficient way.

7.
This paper describes a set of relatively simple procedures that are useful for solving a number of nonlinear programming problems. These problems are characterized by objective and constraint functions that are what we call “derivative decomposable”. Starting with a relaxed problem, we show how derivative decomposability yields a simple solution procedure. Then we demonstrate how slightly modified procedures can solve a variety of more complex problems displaying derivative decomposability. The solution procedures are easily understood in terms of their graphical representations. Furthermore, their simplicity and flexibility promise significant computational advantages for a variety of applications.

8.
We consider a log-linear model for time series of counts. This type of model provides a framework where both negative and positive association can be taken into account. In addition, time-dependent covariates are accommodated in a straightforward way. We study its probabilistic properties and maximum likelihood estimation. It is shown that a perturbed version of the process is geometrically ergodic and that, under some conditions, it approaches the non-perturbed version. In addition, it is proved that the maximum likelihood estimator of the vector of unknown parameters is asymptotically normal with a covariance matrix that can be consistently estimated. The results are based on minimal assumptions and can be extended to the case of log-linear regression with continuous exogenous variables. The theory is applied to aggregated financial transaction time series. In particular, we discover positive association between the number of transactions and the volatility process of a certain stock.
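The abstract does not state the model equations, so the following is only a sketch under an assumed form: the log-intensity follows nu_t = d + a * nu_{t-1} + b * log(1 + Y_{t-1}), and Y_t given the past is Poisson with mean exp(nu_t). The parameter names `d`, `a`, `b`, the starting value `nu0`, and the example counts are all hypothetical. Because the recursion is on the log scale, `b` may be negative, which is one way such a model can express negative association.

```python
from math import exp, lgamma, log

def log_intensities(counts, d, a, b, nu0=0.0):
    """Log-intensities under the ASSUMED recursion
    nu_t = d + a * nu_{t-1} + b * log(1 + Y_{t-1}), started at nu0."""
    nus = [nu0]
    for y_prev in counts[:-1]:
        nus.append(d + a * nus[-1] + b * log(1 + y_prev))
    return nus

def poisson_loglik(counts, d, a, b, nu0=0.0):
    """Conditional Poisson log-likelihood of counts[1:] given counts[0]."""
    nus = log_intensities(counts, d, a, b, nu0)
    # Skip t = 0, whose intensity is only the arbitrary starting value.
    return sum(
        y * nu - exp(nu) - lgamma(y + 1)
        for y, nu in zip(counts[1:], nus[1:])
    )

# Hypothetical data and parameters, for illustration only.
counts = [2, 0, 3, 1]
nus = log_intensities(counts, d=0.5, a=0.2, b=-0.3)
ll = poisson_loglik(counts, d=0.5, a=0.2, b=-0.3)
```

Maximum likelihood estimation would then amount to maximizing `poisson_loglik` over `(d, a, b)` with a numerical optimizer.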

9.
10.
Łukasiewicz implication algebras are {→,1}-subreducts of Wajsberg algebras (MV-algebras). They are the algebraic counterpart of Super-Łukasiewicz Implicational logics investigated in Komori, Nagoya Math J 72:127–133, 1978. The aim of this paper is to study the direct decomposability of free Łukasiewicz implication algebras. We show that freely generated algebras are directly indecomposable. We also study the direct decomposability in free algebras of all its proper subvarieties and show that infinitely freely generated algebras are indecomposable, while finitely freely generated algebras can only be decomposed into a direct product of two factors, one of which is the two-element implication algebra.

11.
A simple matrix formula is given for the observed information matrix when the EM algorithm is applied to categorical data with missing values. The formula requires only the design matrices, a matrix linking the complete and incomplete data, and a few simple derivatives. It can be easily programmed using a computer language with operators for matrix multiplication, element-by-element multiplication and division, matrix concatenation, and creation of diagonal and block diagonal arrays. The formula is applicable whenever the incomplete data can be expressed as a linear function of the complete data, such as when the observed counts represent the sum of latent classes, a supplemental margin, or the number censored. In addition, the formula applies to a wide variety of models for categorical data, including those with linear, logistic, and log-linear components. Examples include a linear model for genetics, a log-linear model for two variables and nonignorable nonresponse, the product of a log-linear model for two variables and a logit model for nonignorable nonresponse, a latent class model for the results of two diagnostic tests, and a product of linear models under double sampling.

12.
In analyses of complex diversity, especially that arising in genetics, genomics, ecology and other high-dimensional (and sometimes low-sample-size) data models, typically subgroup decomposability (analogous to ANOVA decomposability) arises. For group divergence of diversity measures in a high-dimension low-sample-size scenario, it is shown that Hamming distance type statistics lead to a general class of quasi-U-statistics having, under the hypothesis of homogeneity, a martingale (array) property, providing a key to the study of general (nonstandard) asymptotics. Neither the stochastic independence nor homogeneity of the marginal probability laws plays a basic role. A genomic MANOVA model is presented as an illustration.

13.
Bayesian model determination in the complete class of graphical models is considered using a decision theoretic framework within the regular exponential family. The complete class contains both decomposable and non-decomposable graphical models. A utility measure based on a logarithmic score function is introduced under reference priors for the model parameters. The logarithmic utility of a model is decomposed into predictive performance and relative complexity. Axioms of decision theory lead to the judgement of the plausibility of a model in terms of the posterior expected utility. This quantity has an analytic expression for decomposable models when certain reference priors are used and the exponential family is closed under marginalization. For non-decomposable models, a simulation consistent estimate of the expectation can be obtained. Both real and simulated data sets are used to illustrate the introduced methodology.

14.
A new Lee–Carter model parameterization is introduced with two advantages. First, the Lee–Carter parameters are normalized such that they have a direct and intuitive interpretation, comparable across populations. Second, the model is stated in terms of the “needed-exposure” (NE). The NE is the number required in order to get one expected death and is closely related to the “needed-to-treat” measure used to communicate risks and benefits of medical treatments. In the new parameterization, time parameters are directly interpretable as an overall across-age NE. Age parameters are interpretable as age-specific elasticities: percentage changes in the NE at a particular age in response to a percent change in the overall NE. A similar approach can be used to confer interpretability on parameters of other mortality models.

15.
Binary data represent a very special condition where both measures of distance and co-occurrence can be adopted. Euclidean distance-based non-hierarchical methods, like the k-means algorithm or one of its versions, can be profitably used. When the number of available attributes increases, the global clustering performance usually worsens. In such cases, to enhance group separability it is necessary to remove the irrelevant and redundant noisy information from the data. The present approach belongs to the category of attribute transformation strategies, and combines clustering and factorial techniques to identify attribute associations that characterize one or more homogeneous groups of statistical units. Furthermore, it provides graphical representations that facilitate the interpretation of the results.

16.
A known characterization of the decomposability of polytopes is reformulated in a way which may be more computationally convenient, and a more transparent proof is given. New sufficient conditions for indecomposability are then deduced, and illustrated with some examples.

17.
In this paper we study the direct decomposability of free Tarski algebras. We show that infinite freely generated Tarski algebras are directly indecomposable, whereas finite freely generated Tarski algebras can only be decomposed into a direct product of two factors, one of which is the two-element Tarski algebra.

18.
Graphical models are widely used to describe conditional dependence relationships among interacting random variables. Among the statistical inference problems for a graphical model, one of particular interest is using its interaction structure to reduce model complexity. As an important approach to using structural information, decomposition allows a statistical inference problem to be divided into sub-problems of lower complexity. In this paper, to investigate decomposition of covariate-dependent graphical models, we propose some useful definitions of decomposition for covariate-dependent graphical models with categorical data in the form of contingency tables. Based on such a decomposition, a covariate-dependent graphical model can be split into sub-models, and the maximum likelihood estimation of this model can be factorized into the maximum likelihood estimations of the sub-models. Moreover, some necessary and sufficient conditions for the proposed definitions of decomposition are studied.

19.
While graphical models for continuous data (Gaussian graphical models) and discrete data (Ising models) have been extensively studied, there is little work on graphical models for datasets with both continuous and discrete variables (mixed data), which are common in many scientific applications. We propose a novel graphical model for mixed data, which is simple enough to be suitable for high-dimensional data, yet flexible enough to represent all possible graph structures. We develop a computationally efficient regression-based algorithm for fitting the model by focusing on the conditional log-likelihood of each variable given the rest. The parameters have a natural group structure, and sparsity in the fitted graph is attained by incorporating a group lasso penalty, approximated by a weighted lasso penalty for computational efficiency. We demonstrate the effectiveness of our method through an extensive simulation study and apply it to a music annotation dataset (CAL500), obtaining a sparse and interpretable graphical model relating the continuous features of the audio signal to binary variables such as genre, emotions, and usage associated with particular songs. While we focus on binary discrete variables for the main presentation, we also show that the proposed methodology can be easily extended to general discrete variables.
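To make the role of a group lasso penalty concrete, the sketch below shows the generic proximal step for a single parameter group (block soft-thresholding). It is not the fitting algorithm of this paper, which the abstract says uses a weighted lasso approximation; the function name and the example values are assumptions.

```python
from math import sqrt

def group_soft_threshold(group, threshold):
    """Proximal operator of threshold * ||group||_2.

    The whole group is set to zero when its Euclidean norm is at most
    `threshold`; otherwise every coordinate is shrunk by the same factor.
    """
    norm = sqrt(sum(x * x for x in group))
    if norm <= threshold:
        return [0.0] * len(group)
    scale = 1.0 - threshold / norm
    return [scale * x for x in group]

# Hypothetical edge-parameter groups, for illustration only.
small = group_soft_threshold([0.3, -0.4], 1.0)  # weak group, removed entirely
large = group_soft_threshold([3.0, 4.0], 1.0)   # strong group, kept but shrunk
```

Zeroing a group as a unit is what gives sparsity at the level of edges: all parameters tied to one pair of variables vanish together.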

20.
This paper presents the use of graphical models and copula functions in Estimation of Distribution Algorithms (EDAs) for solving multivariate optimization problems. It is shown how incorporating copula functions and graphical models to model the dependencies among variables provides some theoretical advantages over traditional EDAs. By means of copula functions and two well-known graphical models, this paper presents a novel approach for defining new EDAs. Each dependence is modeled by a copula function chosen from a predefined set of six functions that aim to cover a wide range of inter-relations. It is also shown how the use of mutual information in the learning of graphical models implies a natural way of employing copula entropies. The experimental results on separable and non-separable functions show that the two new EDAs, which adopt copula functions to model dependencies, perform better than their original versions with Gaussian variables.