Similar Documents (20 results found)
1.
The Chernoff information between two probability measures is a statistical divergence measuring their deviation, defined as their maximally skewed Bhattacharyya distance. Although the Chernoff information was originally introduced for bounding the Bayes error in statistical hypothesis testing, the divergence has since found many other applications, owing to an empirical robustness observed in settings ranging from information fusion to quantum information. From the viewpoint of information theory, the Chernoff information can also be interpreted as a minmax symmetrization of the Kullback–Leibler divergence. In this paper, we first revisit the Chernoff information between two densities of a measurable Lebesgue space by considering the exponential families induced by their geometric mixtures: the so-called likelihood ratio exponential families. Second, we show how to (i) solve exactly the Chernoff information between any two univariate Gaussian distributions or obtain a closed-form formula using symbolic computing, (ii) report a closed-form formula for the Chernoff information of centered Gaussians with scaled covariance matrices, and (iii) use a fast numerical scheme to approximate the Chernoff information between any two multivariate Gaussian distributions.
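As a concrete illustration of the definition above, here is a minimal numerical sketch (not the paper's exact or symbolic solutions) that estimates the Chernoff information between two univariate Gaussians by maximizing the skewed Bhattacharyya distance over the skew parameter; the example means and standard deviations are arbitrary.

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import minimize_scalar
from scipy.stats import norm

def skewed_bhattacharyya(p, q, alpha, lo=-50.0, hi=50.0):
    """B_alpha(p:q) = -log integral of p(x)^alpha * q(x)^(1-alpha)."""
    val, _ = quad(lambda x: p.pdf(x)**alpha * q.pdf(x)**(1.0 - alpha), lo, hi)
    return -np.log(val)

def chernoff_information(p, q):
    """Maximally skewed Bhattacharyya distance over alpha in (0, 1)."""
    res = minimize_scalar(lambda a: -skewed_bhattacharyya(p, q, a),
                          bounds=(1e-6, 1.0 - 1e-6), method="bounded")
    return -res.fun, res.x  # (Chernoff information, optimal skew alpha*)

p, q = norm(0.0, 1.0), norm(2.0, 1.5)  # arbitrary example Gaussians
C, a_star = chernoff_information(p, q)
print(f"Chernoff information ~ {C:.4f} at alpha* ~ {a_star:.4f}")
```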

2.
In this work, we first consider the discrete version of the Fisher information measure and then propose the Jensen–Fisher information, developing some associated results. Next, we consider Fisher information and Bayes–Fisher information measures for the mixing parameter vector of a finite mixture probability mass function and establish some results. We provide connections between these measures and some known informational measures, such as the chi-square divergence, Shannon entropy, and the Kullback–Leibler, Jeffreys and Jensen–Shannon divergences.
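For reference, the sketch below implements the discrete measures this abstract connects: a finite-difference analogue of the discrete Fisher information (an assumed convention; the paper's exact definition may differ), together with the chi-square, Kullback–Leibler, Jeffreys and Jensen–Shannon divergences for probability mass functions. All example pmfs are toys.

```python
import numpy as np

def discrete_fisher(p):
    """One finite-difference analogue: sum_k p(k) * (p(k+1)/p(k) - 1)^2.
    (An assumed convention; definitions vary in the literature.)"""
    p = np.asarray(p, float)
    return np.sum(p[:-1] * (p[1:] / p[:-1] - 1.0) ** 2)

def kl(p, q):
    p, q = np.asarray(p, float), np.asarray(q, float)
    return np.sum(np.where(p > 0, p * np.log(p / q), 0.0))

def chi_square(p, q):
    """Pearson chi-square divergence: sum (p - q)^2 / q."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    return np.sum((p - q) ** 2 / q)

def jeffreys(p, q):
    return kl(p, q) + kl(q, p)

def jensen_shannon(p, q):
    m = 0.5 * (np.asarray(p, float) + np.asarray(q, float))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

p, q = [0.2, 0.5, 0.3], [0.3, 0.4, 0.3]  # toy pmfs
print(discrete_fisher(p), chi_square(p, q), jeffreys(p, q), jensen_shannon(p, q))
```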

3.
The Jeffreys divergence is a renowned arithmetic symmetrization of the oriented Kullback–Leibler divergence, broadly used in the information sciences. Since the Jeffreys divergence between Gaussian mixture models is not available in closed form, various techniques, each with advantages and disadvantages, have been proposed in the literature to estimate, approximate, or lower and upper bound this divergence. In this paper, we propose a simple yet fast heuristic to approximate the Jeffreys divergence between two univariate Gaussian mixtures with an arbitrary number of components. Our heuristic relies on converting the mixtures into pairs of dually parameterized probability densities belonging to an exponential-polynomial family. To measure with a closed-form formula the goodness of fit between a Gaussian mixture and an exponential-polynomial density approximating it, we generalize the Hyvärinen divergence to α-Hyvärinen divergences. In particular, the 2-Hyvärinen divergence allows us to perform model selection by choosing the order of the exponential-polynomial densities used to approximate the mixtures. We experimentally demonstrate that our heuristic improves over the computational time of stochastic Monte Carlo estimations by several orders of magnitude while approximating the Jeffreys divergence reasonably well, especially when the mixtures have a very small number of modes.
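For context, the Monte Carlo baseline that the heuristic is compared against can be sketched as follows (all mixture parameters are illustrative); the heuristic itself, via exponential-polynomial densities and α-Hyvärinen divergences, is not reproduced here.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

def gmm_pdf(x, w, mu, sigma):
    return sum(wi * norm.pdf(x, m, s) for wi, m, s in zip(w, mu, sigma))

def gmm_sample(n, w, mu, sigma):
    idx = rng.choice(len(w), size=n, p=w)
    return rng.normal(np.asarray(mu)[idx], np.asarray(sigma)[idx])

def kl_mc(p_pdf, q_pdf, p_sampler, n=100_000):
    """Monte Carlo KL(p:q) = E_p[log p(X) - log q(X)]."""
    x = p_sampler(n)
    return float(np.mean(np.log(p_pdf(x)) - np.log(q_pdf(x))))

m1 = ([0.5, 0.5], [-1.0, 1.0], [0.5, 0.8])  # toy mixture 1
m2 = ([0.3, 0.7], [0.0, 2.0], [1.0, 0.6])   # toy mixture 2
p = lambda x: gmm_pdf(x, *m1)
q = lambda x: gmm_pdf(x, *m2)
J = (kl_mc(p, q, lambda n: gmm_sample(n, *m1))
     + kl_mc(q, p, lambda n: gmm_sample(n, *m2)))
print(f"Jeffreys divergence (Monte Carlo) ~ {J:.4f}")
```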

4.
In this paper, we recall, extend and compute some information measures for the concomitants of the generalized order statistics (GOS) from the Farlie–Gumbel–Morgenstern (FGM) family. We focus on two types of information measures: some related to Shannon entropy, and some related to Tsallis entropy. Among the information measures considered are the residual and past entropies, which are important in a reliability context.
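As background for the residual entropy and Tsallis entropy mentioned above, here is a minimal numerical sketch for a generic density; the exponential lifetime distribution is used purely as an example, and the paper's FGM-concomitant setting is not reproduced.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import expon

def residual_entropy(dist, t, hi=60.0):
    """Shannon entropy of the residual lifetime X - t given X > t."""
    s = dist.sf(t)  # survival probability at t
    g = lambda x: (dist.pdf(x) / s) * np.log(dist.pdf(x) / s)
    val, _ = quad(g, t, hi)
    return -val

def tsallis_entropy(dist, q_order, lo=0.0, hi=60.0):
    """S_q(f) = (1 - integral of f(x)^q) / (q - 1)."""
    val, _ = quad(lambda x: dist.pdf(x) ** q_order, lo, hi)
    return (1.0 - val) / (q_order - 1.0)

X = expon(scale=2.0)  # example lifetime distribution
print(residual_entropy(X, t=1.0), tsallis_entropy(X, q_order=2.0))
```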

5.
By calculating the Kullback–Leibler divergence between two probability measures belonging to different exponential families dominated by the same measure, we obtain a formula that generalizes the ordinary Fenchel–Young divergence. Inspired by this formula, we define the duo Fenchel–Young divergence and report a majorization condition on its pair of strictly convex generators, which guarantees that this divergence is always non-negative. The duo Fenchel–Young divergence is also equivalent to a duo Bregman divergence. We show how to use these duo divergences by calculating the Kullback–Leibler divergence between densities of truncated exponential families with nested supports, and report a formula for the Kullback–Leibler divergence between truncated normal distributions. Finally, we prove that the skewed Bhattacharyya distances between truncated exponential families amount to equivalent skewed duo Jensen divergences.
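The paper's closed-form formula is not reproduced here; instead, the following is a minimal numerical sketch of the quantity it evaluates, the Kullback–Leibler divergence between two truncated normals with nested supports (all parameters are arbitrary).

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import truncnorm

def trunc_normal(mu, sigma, lo, hi):
    a, b = (lo - mu) / sigma, (hi - mu) / sigma
    return truncnorm(a, b, loc=mu, scale=sigma)

def kl_numeric(p, q, lo, hi):
    """KL(p:q) = integral of p log(p/q); requires supp(p) inside supp(q)."""
    f = lambda x: p.pdf(x) * (np.log(p.pdf(x)) - np.log(q.pdf(x)))
    val, _ = quad(f, lo, hi)
    return val

p = trunc_normal(0.0, 1.0, -1.0, 1.0)  # support [-1, 1]
q = trunc_normal(0.5, 1.5, -2.0, 2.0)  # nested in the wider support [-2, 2]
print(f"KL(p:q) ~ {kl_numeric(p, q, -1.0, 1.0):.4f}")
```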

6.
In this paper, we introduce a novel objective prior distribution leveraging the connections between information, divergence and scoring rules. In particular, we do so starting from convex functions that represent the information in density functions. This provides a natural route to proper local scoring rules via the Bregman divergence. Specifically, we determine the prior for which the score function is constant. Although this in itself provides motivation for an objective prior, the prior also minimizes a corresponding information criterion.
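Since the construction routes through Bregman divergences, a generic definition may help. The sketch below is the textbook Bregman divergence B_F(x, y) = F(x) - F(y) - <grad F(y), x - y>, not the paper's prior construction; the two generators shown recover familiar special cases.

```python
import numpy as np

def bregman(F, grad_F, x, y):
    """B_F(x, y) = F(x) - F(y) - <grad F(y), x - y> for strictly convex F."""
    return F(x) - F(y) - grad_F(y) @ (x - y)

# Generator F = half the squared norm recovers half the squared distance:
F = lambda v: 0.5 * v @ v
gF = lambda v: v
x, y = np.array([1.0, 2.0]), np.array([0.0, 1.0])
print(bregman(F, gF, x, y))  # 0.5 * ||x - y||^2 = 1.0

# Generator = negative Shannon entropy recovers KL on the simplex:
G = lambda p: np.sum(p * np.log(p))
gG = lambda p: np.log(p) + 1.0
p, q = np.array([0.2, 0.8]), np.array([0.5, 0.5])
print(bregman(G, gG, p, q))  # equals KL(p:q) for probability vectors
```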

7.
In this paper, we focus on extended informational measures based on a convex function ϕ: entropies, extended Fisher information, and generalized moments. Both the generalization of the Fisher information and of the moments rely on the definition of an escort distribution linked to the (entropic) functional ϕ. We revisit the usual maximum entropy principle—more precisely its inverse problem, starting from the distribution and constraints—which leads to the introduction of state-dependent ϕ-entropies. Then, we examine interrelations between the extended informational measures and generalize relationships such as the Cramér–Rao inequality and the de Bruijn identity in this broader context. In this framework, the maximum entropy distributions play a central role. All the results derived in the paper include the usual ones as special cases.
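The escort construction central to this framework can be illustrated in its most common power-law instance, the Tsallis q-escort; the general ϕ-escort of the paper is broader than this sketch.

```python
import numpy as np

def q_escort(p, q_order):
    """Escort distribution E_q[p](i) = p_i^q / sum_j p_j^q (power-law instance)."""
    p = np.asarray(p, float)
    w = p ** q_order
    return w / w.sum()

p = np.array([0.7, 0.2, 0.1])
print(q_escort(p, 0.5))  # q < 1 flattens the distribution
print(q_escort(p, 2.0))  # q > 1 sharpens it toward the mode
```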

8.
In the present paper, we study the information generating (IG) function and relative information generating (RIG) function measures associated with maximum and minimum ranked set sampling (RSS) schemes with unequal sizes. We also examine the IG measures for simple random sampling (SRS) and provide some comparison results between SRS and RSS procedures in terms of dispersive stochastic ordering. Finally, we discuss the RIG divergence measure between SRS and RSS frameworks.
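For background, the classical information generating function of a pmf is G(s) = sum_i p_i^s, whose derivative at s = 1 recovers the negative Shannon entropy; the sketch below checks this numerically (the RSS-specific measures of the paper are not reproduced).

```python
import numpy as np

def info_generating(p, s):
    """G(s) = sum_i p_i^s; G(1) = 1 and G'(1) = -H(p), the negative entropy."""
    return np.sum(np.asarray(p, float) ** s)

p = [0.5, 0.3, 0.2]
h = -np.sum(np.asarray(p) * np.log(p))  # Shannon entropy
eps = 1e-6
g_prime = (info_generating(p, 1 + eps) - info_generating(p, 1 - eps)) / (2 * eps)
print(f"G'(1) ~ {g_prime:.6f}, -H(p) = {-h:.6f}")  # should agree
```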

9.
Importance sampling is a Monte Carlo method in which samples are obtained from an alternative proposal distribution. This can be used to focus the sampling process on the relevant parts of the space, thus reducing the variance. Selecting the proposal that leads to the minimum variance can be formulated as an optimization problem and solved, for instance, by a variational approach. Variational inference selects, from a given family, the distribution that minimizes the divergence to the distribution of interest. The Rényi projection of order 2 leads to the importance sampling estimator of minimum variance, but its computation is very costly. In this study, for discrete distributions that factorize over probabilistic graphical models, we propose and evaluate an approximate projection method onto fully factored distributions. Our evaluation makes it apparent that a proposal distribution mixing the information projection with the approximate Rényi projection of order 2 can be interesting from a practical perspective.
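The basic estimator under discussion can be sketched as follows: a self-normalized importance sampling estimate of an expectation under a target, using draws from a proposal. The toy target/proposal pair is illustrative; the paper's graphical-model projections are not reproduced.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)

def self_normalized_is(h, target_logpdf, proposal, n=100_000):
    """Estimate E_target[h(X)] with samples drawn from the proposal."""
    x = proposal.rvs(size=n, random_state=rng)
    logw = target_logpdf(x) - proposal.logpdf(x)
    w = np.exp(logw - logw.max())  # stabilize before normalizing
    w /= w.sum()
    return np.sum(w * h(x))

target = norm(3.0, 1.0)    # toy target distribution
proposal = norm(0.0, 3.0)  # wide proposal covering the target
est = self_normalized_is(np.square, target.logpdf, proposal)
print(f"E[X^2] ~ {est:.3f} (exact: {3.0**2 + 1.0**2})")
```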

10.
The spreading of the stationary states of multidimensional single-particle systems with a central potential is quantified by means of Heisenberg-like measures (radial and logarithmic expectation values) and entropy-like quantities (Fisher, Shannon, Rényi) of the position and momentum probability densities. Since the potential is assumed to be analytically unknown, these dispersion and information-theoretical measures are given by means of inequality-type relations, which are explicitly shown to depend on the dimensionality and the state's angular hyperquantum numbers. The spherical-symmetry and spin effects on these spreading properties are obtained by use of various integral inequalities (Daubechies–Thakkar, Lieb–Thirring, Redheffer–Weyl, ...) and a variational approach based on the extremization of entropy-like measures. Emphasis is placed on the uncertainty relations, upon which the probabilistic theory of quantum systems essentially relies.

11.
Two Rényi-type generalizations of the Shannon cross-entropy, the Rényi cross-entropy and the Natural Rényi cross-entropy, were recently used as loss functions for the improved design of deep learning generative adversarial networks. In this work, we derive the Rényi and Natural Rényi differential cross-entropy measures in closed form for a wide class of common continuous distributions belonging to the exponential family, and we tabulate the results for ease of reference. We also summarise the Rényi-type cross-entropy rates between stationary Gaussian processes and between finite-alphabet time-invariant Markov sources.
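Definitions of the Rényi cross-entropy vary across the literature; assuming the common form H_a(p; q) = log(integral of p * q^(a-1)) / (1 - a), the sketch below evaluates it numerically between two arbitrary Gaussians and checks that it approaches the Shannon cross-entropy as a tends to 1. The paper's closed-form tabulations are not reproduced.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

def renyi_cross_entropy(p, q, alpha, lo=-50.0, hi=50.0):
    """Assumed form: H_a(p;q) = log(integral p * q^(a-1)) / (1 - a), a != 1."""
    val, _ = quad(lambda x: p.pdf(x) * q.pdf(x) ** (alpha - 1.0), lo, hi)
    return np.log(val) / (1.0 - alpha)

def shannon_cross_entropy(p, q, lo=-50.0, hi=50.0):
    val, _ = quad(lambda x: p.pdf(x) * np.log(q.pdf(x)), lo, hi)
    return -val

p, q = norm(0.0, 1.0), norm(1.0, 2.0)  # arbitrary example Gaussians
print(renyi_cross_entropy(p, q, 0.999))  # ~ Shannon cross-entropy below
print(shannon_cross_entropy(p, q))
```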

12.
The present paper investigates the update of an empirical probability distribution with the results of a new set of observations. The update reproduces the new observations and interpolates using prior information. The optimal update is obtained by minimizing either the Hellinger distance or the quadratic Bregman divergence. The results obtained by the two methods differ. Updates with information about conditional probabilities are considered as well.
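For reference, the two criteria used for the update have simple discrete forms, sketched below; the update optimization itself is not reproduced.

```python
import numpy as np

def hellinger(p, q):
    """H(p, q) = (1/sqrt(2)) * || sqrt(p) - sqrt(q) ||_2 for pmfs."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    return np.linalg.norm(np.sqrt(p) - np.sqrt(q)) / np.sqrt(2.0)

def quadratic_bregman(p, q):
    """Bregman divergence with generator ||x||^2: squared Euclidean distance."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    return np.sum((p - q) ** 2)

p, q = [0.25, 0.5, 0.25], [0.3, 0.4, 0.3]  # toy pmfs
print(hellinger(p, q), quadratic_bregman(p, q))
```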

13.
When Dempster–Shafer evidence theory is applied to the field of information fusion, how to reasonably transform the basic probability assignment (BPA) into a probability so as to improve decision-making efficiency has been a key challenge. To address this challenge, this paper proposes an efficient probability transformation method based on a neural network, achieving the transformation from the BPA to a probabilistic decision. First, a neural network is constructed based on the BPA of the propositions in the mass function. Next, the average information content and the interval information content are used to quantify the information contained in each proposition subset, and are combined to construct a weighting function with parameter r. Then, the BPA of the input layer and the bias units are allocated to the proposition subsets in each hidden layer according to the weight factors, until the probability of each single-element proposition, as a function of r, is output. Finally, the parameter r and the optimal transformation results are obtained under the premise of maximizing the probabilistic information content. The proposed method satisfies the consistency of the upper and lower boundaries of each proposition. Extensive examples and a practical application show that, compared with other methods, the proposed method not only has higher applicability but also yields lower uncertainty in the transformation results.
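The paper's neural-network transformation is not reproduced here; for orientation, the sketch below implements the classical pignistic transformation, the standard baseline for converting a BPA into a probability distribution (the example mass function is arbitrary).

```python
def pignistic(bpa):
    """BetP(x) = sum over focal sets A containing x of m(A) / |A|."""
    betp = {}
    for A, m in bpa.items():
        for x in A:
            betp[x] = betp.get(x, 0.0) + m / len(A)
    return betp

bpa = {frozenset({"a"}): 0.5,
       frozenset({"b"}): 0.2,
       frozenset({"a", "b"}): 0.3}  # arbitrary example mass function
print(pignistic(bpa))  # {'a': 0.65, 'b': 0.35}
```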

14.
In the rate-distortion function and the Maximum Entropy (ME) method, Minimum Mutual Information (MMI) distributions and ME distributions are expressed by Bayes-like formulas that include Negative Exponential Functions (NEFs) and partition functions. Why do these non-probability functions appear in Bayes-like formulas? On the other hand, the rate-distortion function has three disadvantages: (1) the distortion function is subjectively defined; (2) the definition of the distortion function between instances and labels is often difficult; (3) it cannot be used for data compression according to the labels' semantic meanings. The author previously proposed the semantic information G measure, which involves both statistical probability and logical probability. We can now explain NEFs as truth functions, partition functions as logical probabilities, Bayes-like formulas as semantic Bayes' formulas, MMI as Semantic Mutual Information (SMI), and ME as extreme ME minus SMI. To overcome the above disadvantages, this paper sets up the relationship between truth functions and distortion functions, obtains truth functions from samples by machine learning, and constructs constraint conditions with truth functions to extend rate-distortion functions. Two examples are used to help readers understand the MMI iteration and to support the theoretical results. Using truth functions and the semantic information G measure, we can combine machine learning and data compression, including semantic compression. Further studies are needed to explore general data compression and recovery according to semantic meaning.
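The Bayes-like formulas with negative exponential functions and partition functions that the abstract refers to appear concretely in the classical Blahut–Arimoto update for the rate-distortion function, sketched below on a toy binary source with Hamming distortion; the semantic G-measure extension is not reproduced.

```python
import numpy as np

def blahut_arimoto(px, d, beta, iters=500):
    """Rate-distortion iteration: p(y|x) proportional to q(y) * exp(-beta * d(x,y)).
    The row sums are the partition functions of the Bayes-like formula."""
    qy = np.full(d.shape[1], 1.0 / d.shape[1])
    for _ in range(iters):
        w = qy * np.exp(-beta * d)              # NEF times output marginal
        pyx = w / w.sum(axis=1, keepdims=True)  # normalize by partition function
        qy = px @ pyx                           # re-estimate output marginal
    rate = np.sum(px[:, None] * pyx * np.log(pyx / qy))  # mutual information I(X;Y)
    distortion = np.sum(px[:, None] * pyx * d)
    return rate, distortion

px = np.array([0.5, 0.5])               # toy binary source
d = np.array([[0.0, 1.0], [1.0, 0.0]])  # Hamming distortion
print(blahut_arimoto(px, d, beta=2.0))  # one point on the R(D) curve
```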

15.
A methodology to analyze dynamical changes in complex networks based on Information Theory quantifiers is proposed. The square root of the Jensen-Shannon divergence, a measure of dissimilarity between two probability distributions, and the MPR Statistical Complexity are used to quantify states in the network evolution process. Three cases are analyzed: the Watts-Strogatz model, a gene network during the progression of Alzheimer's disease, and a climate network for the Tropical Pacific region used to study the El Niño/Southern Oscillation (ENSO) dynamics. We find that the proposed quantifiers are able not only to capture changes in the dynamics of the processes but also to quantify and compare states in their evolution.
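The first quantifier is easy to reproduce: SciPy's jensenshannon already returns the square root of the Jensen-Shannon divergence, which is a metric between distributions. The sketch below compares two toy network-state distributions; the MPR Statistical Complexity is not reproduced.

```python
import numpy as np
from scipy.spatial.distance import jensenshannon

# toy degree (or state) distributions of a network at two time steps
p_before = np.array([0.05, 0.20, 0.50, 0.20, 0.05])
p_after = np.array([0.10, 0.30, 0.30, 0.20, 0.10])

dist = jensenshannon(p_before, p_after)  # sqrt of the JS divergence (base e)
print(f"JS distance between network states ~ {dist:.4f}")
```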

16.
Many visual representations, such as volume-rendered images and metro maps, feature a noticeable amount of information loss due to a variety of many-to-one mappings. At a glance, there seem to be numerous opportunities for viewers to misinterpret the data being visualized, hence undermining the benefits of these visual representations. In practice, there is little doubt that these visual representations are useful. The recently proposed information-theoretic measure for analyzing the cost–benefit ratio of visualization processes can explain such usefulness experienced in practice, and postulates that viewers' knowledge can reduce the potential distortion (e.g., misinterpretation) due to information loss. This suggests that viewers' knowledge can be estimated by comparing the potential distortion without any knowledge and the actual distortion with some knowledge. However, the existing cost–benefit measure contains an unbounded divergence term, making the numerical measurements difficult to interpret. This is the second part of a two-part paper, which aims to improve the existing cost–benefit measure. Part I provided a theoretical discourse on the problem of unboundedness, reported a conceptual analysis of nine candidate divergence measures for resolving the problem, and eliminated three from further consideration. In this Part II, we describe two groups of case studies for evaluating the remaining six candidate measures empirically. In particular, we obtained instance data for (i) supporting the evaluation of the remaining candidate measures and (ii) demonstrating their applicability in practical scenarios for estimating the cost–benefit of visualization processes as well as the impact of human knowledge in the processes. The real-world data about visualization provides practical evidence for evaluating the usability and intuitiveness of the candidate measures. The combination of the conceptual analysis in Part I and the empirical evaluation in this part allows us to select the most appropriate bounded divergence measure for improving the existing cost–benefit measure.

17.
We introduce a quantum key distribution protocol using the mean multi-kings' problem. Using this protocol, a sender can share a bit sequence as a secret key with receivers. We consider a relation between the information gained by an eavesdropper and the disturbance contained in the legitimate users' information. In the BB84 protocol, such a relation is known as the information-disturbance theorem. We focus on a setting in which the sender and two receivers try to share bit sequences and the eavesdropper tries to extract information by interacting with the legitimate users' systems and an ancilla system. We derive trade-off inequalities between the distinguishability, for the eavesdropper, of the quantum states corresponding to the bit sequence and the error probability of the bit sequence shared by the legitimate users. Our inequalities show that an eavesdropper extracting information about the secret keys inevitably disturbs the states and increases the error probability.

18.
The distance comparing the difference between two probability distributions plays a fundamental role in statistics and machine learning. Optimal transport (OT) theory provides a theoretical framework to study such distances. Recent advances in OT theory include a generalization of classical OT with an extra entropic constraint or regularization, called entropic OT. Despite its computational convenience, entropic OT still lacks sufficient theoretical support. In this paper, we show that the quadratic cost in entropic OT can be upper-bounded using entropy power inequality (EPI)-type bounds. First, we prove an HWI-type inequality by making use of the infinitesimal displacement convexity of the OT map. Second, we derive two Talagrand-type inequalities using the saturation of the EPI, which corresponds to a numerical term in our expressions. These two new inequalities are shown to generalize two previous results obtained by Bolley et al. and Bai et al. Using the new Talagrand-type inequalities, we also show that the geometry induced by the Sinkhorn distance is smoothed in the sense of measure concentration. Finally, we corroborate our results with various simulation studies.
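For orientation, entropic OT and the Sinkhorn distance mentioned above can be computed with the standard Sinkhorn scaling iterations, sketched below on a toy discrete problem with quadratic cost; the inequalities of the paper are not reproduced.

```python
import numpy as np

def sinkhorn(a, b, C, eps, iters=1000):
    """Entropic OT: minimize <P, C> plus an eps-weighted entropic penalty
    over couplings P with marginals (a, b), via alternate matrix scaling."""
    K = np.exp(-C / eps)  # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(iters):
        v = b / (K.T @ u)
        u = a / (K @ v)
    P = u[:, None] * K * v[None, :]  # optimal entropic coupling
    return P, np.sum(P * C)          # plan and its transport cost

x = np.linspace(0.0, 1.0, 5)
C = (x[:, None] - x[None, :]) ** 2       # quadratic cost, as in the paper
a = np.array([0.2, 0.2, 0.2, 0.2, 0.2])  # toy marginals
b = np.array([0.1, 0.1, 0.2, 0.3, 0.3])
P, cost = sinkhorn(a, b, C, eps=0.05)
print(f"entropic transport cost ~ {cost:.4f}")
```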

19.
R.C. Venkatesan, A. Plastino, Physica A 390(15) (2011) 2749–2758
There exist two different versions of the Kullback-Leibler divergence (K-Ld) in Tsallis statistics, namely the usual generalized K-Ld and the generalized Bregman K-Ld. Problems have been encountered in trying to reconcile them. A condition for consistency between these two generalized K-Ld forms is derived by recourse to the additive duality of Tsallis statistics. It is also shown that the usual generalized K-Ld subjected to this additive duality, known as the dual generalized K-Ld, is a scaled Bregman divergence. This leads to an interesting conclusion: the dual generalized mutual information is a scaled Bregman information. The utility and implications of these results are discussed.
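The usual generalized K-Ld and its additive dual can be sketched for pmfs as follows, using the standard Tsallis relative entropy D_q(p||r) = (sum p^q r^(1-q) - 1)/(q - 1) and the duality q* = 2 - q; both reduce to the ordinary K-Ld as q tends to 1. The scaled-Bregman identification is not reproduced.

```python
import numpy as np

def tsallis_kl(p, r, q):
    """Usual generalized K-Ld: (sum p^q r^(1-q) - 1)/(q - 1); K-Ld as q -> 1."""
    p, r = np.asarray(p, float), np.asarray(r, float)
    return (np.sum(p**q * r**(1.0 - q)) - 1.0) / (q - 1.0)

def dual_tsallis_kl(p, r, q):
    """Dual generalized K-Ld via the additive duality q* = 2 - q."""
    return tsallis_kl(p, r, 2.0 - q)

p, r = [0.6, 0.3, 0.1], [0.3, 0.4, 0.3]  # toy pmfs
print(tsallis_kl(p, r, 0.7), dual_tsallis_kl(p, r, 0.7))
print(tsallis_kl(p, r, 1.0001))  # ~ ordinary Kullback-Leibler divergence
```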

20.
Dempster–Shafer evidence theory is widely used for modeling and reasoning about uncertain information in real applications. Recently, a new perspective of modeling uncertain information with the negation of evidence was proposed and has attracted a lot of attention. Both the basic probability assignment (BPA) and the negation of the BPA in the evidence theory framework can model and reason about uncertain information. However, how to address the uncertainty in the negation information modeled as the negation of a BPA is still an open issue. Inspired by the uncertainty measures in Dempster–Shafer evidence theory, a method of measuring the uncertainty in negation evidence is proposed. The belief entropy named Deng entropy, which has attracted a lot of attention among researchers, is adopted and improved for measuring the uncertainty of negation evidence. The proposed measure is defined based on the negation function of the BPA and can quantify the uncertainty of the negation evidence. In addition, an improved method of multi-source information fusion that accounts for uncertainty quantification in the negation evidence with the new measure is proposed. Experimental results on a numerical example and a fault diagnosis problem verify the rationality and effectiveness of the proposed method in measuring and fusing uncertain information.
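The uncertainty measure that the proposed method builds on is Deng entropy; a minimal sketch for an arbitrary example BPA is given below (the negation-specific improvement of the paper is not reproduced).

```python
import numpy as np

def deng_entropy(bpa):
    """Deng entropy: -sum over focal sets A of m(A) * log2( m(A) / (2^|A| - 1) )."""
    return -sum(m * np.log2(m / (2 ** len(A) - 1))
                for A, m in bpa.items() if m > 0)

bpa = {frozenset({"a"}): 0.6,
       frozenset({"b"}): 0.1,
       frozenset({"a", "b"}): 0.3}  # arbitrary example mass function
print(f"Deng entropy ~ {deng_entropy(bpa):.4f} bits")
```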
