Similar Documents
 20 similar documents found (search time: 171 ms)
1.
Word embeddings based on a conditional model are commonly used in Natural Language Processing (NLP) tasks to embed the words of a dictionary in a low-dimensional linear space. Their computation is based on maximizing the likelihood of a conditional probability distribution for each word of the dictionary. These distributions form a Riemannian statistical manifold, where word embeddings can be interpreted as vectors in the tangent space of a specific reference measure on the manifold. A novel family of word embeddings, called α-embeddings, has recently been introduced, derived from the geometrical deformation of the simplex of probabilities through a parameter α, using notions from Information Geometry. After introducing the α-embeddings, we show how the deformation of the simplex, controlled by α, provides an extra handle to increase the performance of several intrinsic and extrinsic tasks in NLP. We test the α-embeddings on different tasks with models of increasing complexity, showing that the advantages associated with the use of α-embeddings persist even for models with a large number of parameters. Finally, we show that tuning α yields higher performance than using larger models in which a transformation of the embeddings is additionally learned during training, as experimentally verified in attention models.

2.
When studying the behaviour of complex dynamical systems, a statistical formulation can provide useful insights. In particular, information geometry is a promising tool for this purpose. In this paper, we investigate the information length for n-dimensional linear autonomous stochastic processes, providing a basic theoretical framework that can be applied to a large set of problems in engineering and physics. A specific application is made to a harmonically bound particle system with natural oscillation frequency ω, subject to damping γ and Gaussian white noise. We explore how the information length depends on ω and γ, elucidating the role of critical damping γ = 2ω in information geometry. Furthermore, in the long-time limit, we show that the information length reflects the linear geometry associated with the Gaussian statistics in a linear stochastic process.

3.
In the nervous system, information is conveyed by sequences of action potentials, called spike trains. As MacKay and McCulloch suggested, spike trains can be represented as bit sequences coming from Information Sources (IS). Previously, we studied relations between spikes' Information Transmission Rates (ITR) and their correlations and frequencies. Here, I concentrate on the problem of how spike fluctuations affect the ITR. The IS are typically modeled as stationary stochastic processes, which I consider here as two-state Markov processes. As a spike-train fluctuation measure, I take the standard deviation σ, which measures the average fluctuation of spikes around the average spike frequency. I found that the character of the relation between ITR and signal fluctuations strongly depends on the parameter s, the sum of the transition probabilities between the no-spike and spike states. An estimate of the Information Transmission Rate was found in terms of the signal fluctuations and the parameter s. It turned out that for s < 1 the quotient ITR/σ has a maximum and can tend to zero depending on the transition probabilities, while for s > 1 the quotient ITR/σ is separated from 0. Additionally, it was shown that the quotient of ITR by the variance behaves in a completely different way. Similar behavior was observed when the classical Shannon entropy terms in the Markov entropy formula are replaced by their polynomial approximations. My results suggest that in a noisier environment (s > 1), to achieve appropriate reliability and efficiency of transmission, an IS with a higher tendency of transition from the no-spike to the spike state should be applied. Such selection of appropriate parameters plays an important role in designing learning mechanisms to obtain networks with higher performance.
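For a two-state (0 = no spike, 1 = spike) Markov chain with transition probabilities p01 and p10, the entropy rate (a standard proxy for ITR) and the stationary standard deviation σ have closed forms. A minimal sketch of the quotient ITR/σ studied in the abstract (generic textbook formulas, not the paper's specific estimator):

```python
import numpy as np

def binary_entropy(p):
    """Shannon entropy (bits) of a Bernoulli(p) source; h(0) = h(1) = 0."""
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

def markov_itr_and_sigma(p01, p10):
    """Entropy rate (bits/step) and stationary std of a two-state 0/1 Markov chain.

    p01 = P(spike | no spike), p10 = P(no spike | spike).
    """
    pi1 = p01 / (p01 + p10)          # stationary probability of the spike state
    pi0 = 1.0 - pi1
    itr = pi0 * binary_entropy(p01) + pi1 * binary_entropy(p10)  # entropy rate
    sigma = np.sqrt(pi0 * pi1)       # std of the stationary Bernoulli marginal
    return itr, sigma

itr, sigma = markov_itr_and_sigma(0.3, 0.4)   # here s = p01 + p10 = 0.7 < 1
ratio = itr / sigma
```

Sweeping p01 and p10 while holding s = p01 + p10 fixed reproduces the kind of ITR/σ curves the abstract discusses.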

4.
A solvable model of a periodically driven trapped mixture of Bose–Einstein condensates, consisting of N_1 interacting bosons of mass m_1 driven by a force of amplitude f_{L,1} and N_2 interacting bosons of mass m_2 driven by a force of amplitude f_{L,2}, is presented. The model generalizes the harmonic-interaction model for mixtures to the time-dependent domain. The resulting many-particle ground Floquet wavefunction and quasienergy, as well as the time-dependent densities and reduced density matrices, are prescribed explicitly and analyzed at the many-body and mean-field levels of theory for finite systems and at the limit of an infinite number of particles. We prove that the time-dependent densities per particle are given at the limit of an infinite number of particles by their respective mean-field quantities, and that the time-dependent reduced one-particle and two-particle density matrices per particle of the driven mixture are 100% condensed. Interestingly, the quasienergy per particle does not coincide with the mean-field value at this limit, unless the relative center-of-mass coordinate of the two Bose–Einstein condensates is not activated by the driving forces f_{L,1} and f_{L,2}. As an application, we investigate the imprinting of angular momentum and its fluctuations when steering a Bose–Einstein condensate by an interacting bosonic impurity, and the resulting modes of rotation. Whereas the expectation values per particle of the angular-momentum operator for the many-body and mean-field solutions coincide at the limit of an infinite number of particles, the respective fluctuations can differ substantially. The results are analyzed in terms of the transformation properties of the angular-momentum operator under translations and boosts, and as a function of the interactions between the particles. Implications are briefly discussed.

5.
Using finite-time thermodynamic theory, an irreversible steady-flow Lenoir cycle model is established, and expressions for the power output and thermal efficiency of the model are derived. Through numerical calculations, for different fixed total heat conductances (U_T) of the two heat exchangers, the maximum powers (P_max), the maximum thermal efficiencies (η_max), and the corresponding optimal heat-conductance distribution ratios (u_{LP(opt)} and u_{Lη(opt)}) are obtained. The effects of internal irreversibility are analyzed. The results show that, when the heat conductances of the hot- and cold-side heat exchangers are constant, the corresponding power output and thermal efficiency are constant values. When the heat-source temperature ratio (τ) and the effectivenesses of the heat exchangers increase, the corresponding power output and thermal efficiency increase. When the heat-conductance distributions take their optimal values, the characteristic relationships of P versus u_L and η versus u_L are parabolic-like. When U_T is given, with the increase in τ, P_max, η_max, u_{LP(opt)}, and u_{Lη(opt)} all increase. When τ is given, with the increase in U_T, P_max and η_max increase, while u_{LP(opt)} and u_{Lη(opt)} decrease.

6.
Over the last six decades, the representation of error-exponent functions for data transmission through noisy channels at rates below capacity has seen three distinct approaches: (1) through Gallager's E_0 functions (with and without cost constraints); (2) a large-deviations form, in terms of conditional relative entropy and mutual information; (3) through the α-mutual information and the Augustin–Csiszár mutual information of order α, derived from the Rényi divergence. While a fairly complete picture has emerged in the absence of cost constraints, gaps have remained in the interrelationships between the three approaches in the general case of cost-constrained encoding. Furthermore, no systematic approach has been proposed to solve the attendant optimization problems by exploiting the specific structure of the information functions. This paper closes those gaps and proposes a simple method to maximize the Augustin–Csiszár mutual information of order α under cost constraints by means of the maximization of the α-mutual information subject to an exponential average constraint.
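The α-mutual information in approach (3) has an explicit closed form for a finite input distribution P and channel matrix W (Sibson's formulation): I_α(P, W) = α/(α−1) · log Σ_y ( Σ_x P(x) W(y|x)^α )^{1/α}, which recovers Shannon's I(X;Y) as α → 1. A minimal sketch on a binary symmetric channel (the example channel is an assumption for illustration):

```python
import numpy as np

def sibson_mi(P, W, alpha):
    """Sibson's α-mutual information (nats) of input dist P and channel W[x, y] (rows: inputs)."""
    inner = (P[:, None] * W ** alpha).sum(axis=0) ** (1.0 / alpha)  # per-output term
    return float(alpha / (alpha - 1.0) * np.log(inner.sum()))

def shannon_mi(P, W):
    """Classical mutual information I(X;Y) in nats, for comparison."""
    PXY = P[:, None] * W
    PY = PXY.sum(axis=0)
    mask = PXY > 0
    return float((PXY[mask] * np.log(PXY[mask] / (P[:, None] * PY[None, :])[mask])).sum())

# Binary symmetric channel with crossover probability 0.1, uniform input.
P = np.array([0.5, 0.5])
W = np.array([[0.9, 0.1],
              [0.1, 0.9]])
```

Near α = 1 the two quantities agree, and for a useless (constant-output-distribution) channel I_α vanishes for every α.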

7.
We discuss a covariant relativistic Boltzmann equation which describes the evolution of a system of particles in spacetime evolving with a universal invariant parameter τ. The observed time t of Einstein and Maxwell, in the presence of interaction, is not necessarily a monotonic function of τ. If t(τ) increases with τ, the worldline may be associated with a normal particle, but if it decreases in τ, it is observed in the laboratory as an antiparticle. This paper discusses the implications for entropy evolution in this relativistic framework. It is shown that if an ensemble of particles and antiparticles converges in a region of pair annihilation, the entropy of the antiparticle beam may decrease in time.

8.
We consider whether the new horizon-first law works in higher-dimensional f(R) theory. We first obtain general formulas to calculate the entropy and the energy of a general spherically symmetric black hole in D-dimensional f(R) theory. As applications, we compute the entropies and energies of some black holes in several interesting higher-dimensional f(R) theories.

9.
10.
Aims: Bubble entropy (bEn) is an entropy metric with a limited dependence on parameters. bEn does not directly quantify the conditional entropy of the series; rather, it assesses the change in entropy of the ordering of portions of its samples of length m when an extra element is added. The analytical formulation of bEn for autoregressive (AR) processes shows that, for this class of processes, the relation between the first autocorrelation coefficient and bEn changes for odd and even values of m. While this is not an issue per se, it triggered ideas for further investigation. Methods: Using theoretical considerations on the expected values for AR processes, we examined a two-steps-ahead estimator of bEn, which considers the cost of ordering two additional samples. We first compared it with the original bEn estimator on a simulated series. Then, we tested it on real heart rate variability (HRV) data. Results: The experiments showed that both examined alternatives have comparable discriminating power. However, for values of 10 < m < 20, where the statistical significance of the method increased and improved as m increased, the two-steps-ahead estimator presented slightly higher statistical significance and more regular behavior, even if the dependence on the parameter m was still minimal. We also investigated a new normalization factor for bEn, which ensures that bEn = 1 when white Gaussian noise (WGN) is given as the input. Conclusions: The research improved our understanding of bubble entropy, in particular in the context of HRV analysis, and clarified interesting details regarding the definition of the estimator.
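One common formulation of bubble entropy counts the swaps bubble sort needs to order each m-length window, takes the Rényi entropy of order 2 of the swap-count distribution, and measures its growth from m to m+1. The sketch below follows that recipe; the exact normalization used by the authors (and their new WGN-based factor) may differ:

```python
import numpy as np

def bubble_swaps(v):
    """Number of swaps bubble sort needs to order v (its inversion count)."""
    v = list(v)
    swaps = 0
    for i in range(len(v) - 1):
        for j in range(len(v) - 1 - i):
            if v[j] > v[j + 1]:
                v[j], v[j + 1] = v[j + 1], v[j]
                swaps += 1
    return swaps

def renyi2_of_swaps(x, m):
    """Rényi entropy of order 2 of the swap-count distribution over m-length windows of x."""
    counts = np.bincount([bubble_swaps(x[i:i + m]) for i in range(len(x) - m + 1)])
    p = counts[counts > 0] / counts[counts > 0].sum()
    return -np.log((p ** 2).sum())

def bubble_entropy(x, m):
    """bEn: entropy increase when the window grows from m to m+1 (one common normalization)."""
    return (renyi2_of_swaps(x, m + 1) - renyi2_of_swaps(x, m)) / np.log((m + 1) / (m - 1))

rng = np.random.default_rng(0)
b = bubble_entropy(rng.standard_normal(2000), 10)
```

The two-steps-ahead variant discussed in the abstract would instead compare windows of length m and m+2, amortizing the cost of ordering two extra samples.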

11.
The decomposition effect of variational mode decomposition (VMD) mainly depends on the choice of the decomposition number K and the penalty factor α. For the selection of these two parameters, empirical methods or single-objective optimization are usually used, but such methods often have limitations and cannot achieve the optimal effect. Therefore, a multi-objective multi-island genetic algorithm (MIGA) is proposed to optimize the parameters of VMD and is applied to feature extraction for bearing faults. First, since the envelope entropy (Ee) can reflect the sparsity of the signal and the Rényi entropy (Re) can reflect the energy-aggregation degree of the signal's time–frequency distribution, Ee and Re are selected as fitness functions, and the optimal VMD parameters are obtained by the MIGA algorithm. Second, the improved VMD algorithm is used to decompose the bearing fault signal, and the two intrinsic mode functions (IMFs) carrying the most fault information are selected by the improved kurtosis and the Hölder coefficient for reconstruction. Finally, the envelope spectrum of the reconstructed signal is analyzed. Comparative experiments show that this feature-extraction method extracts bearing fault features more accurately, and the fault-diagnosis model based on it has higher accuracy.
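Envelope entropy is typically computed as the Shannon entropy of the normalized Hilbert envelope of a signal: impulsive (sparse) signals concentrate their envelope and score low, smooth signals score high. A minimal numpy-only sketch (the FFT-based analytic signal mirrors the standard Hilbert-transform construction; the test signals are illustrative assumptions):

```python
import numpy as np

def envelope_entropy(x):
    """Shannon entropy (nats) of the normalized Hilbert envelope of x."""
    n = len(x)
    X = np.fft.fft(x)
    h = np.zeros(n)
    h[0] = 1.0
    h[1:(n + 1) // 2] = 2.0          # keep positive frequencies, doubled
    if n % 2 == 0:
        h[n // 2] = 1.0
    analytic = np.fft.ifft(X * h)    # analytic signal via FFT
    env = np.abs(analytic)           # instantaneous envelope
    p = env / env.sum()              # normalize envelope to a distribution
    return float(-(p * np.log(p + 1e-12)).sum())

t = np.linspace(0.0, 1.0, 4096, endpoint=False)
impulsive = np.sin(2 * np.pi * 50 * t) * np.sin(2 * np.pi * 5 * t) ** 8  # bursty signal
smooth = np.sin(2 * np.pi * 50 * t)                                      # constant envelope
```

In the MIGA fitness function, lower Ee on a candidate IMF indicates a sparser, more fault-informative component.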

12.
Probability is an important question in the ontological interpretation of quantum mechanics. It has been discussed in some trajectory interpretations such as Bohmian mechanics and stochastic mechanics. New questions arise when the probability domain extends to the complex space, including the generation of complex trajectories, the definition of the complex probability, and the relation of the complex probability to the quantum probability. The complex treatment proposed in this article applies the optimal quantum guidance law to derive the stochastic differential equation governing a particle's random motion in the complex plane. The probability distribution ρ_c(t, x, y) of the particle's position over the complex plane z = x + iy is formed by an ensemble of complex quantum random trajectories, which are solved from the complex stochastic differential equation. Meanwhile, the probability distribution ρ_c(t, x, y) is verified by the solution of the complex Fokker–Planck equation. It is shown that the quantum probability |Ψ|² and the classical probability can be integrated under the framework of the complex probability ρ_c(t, x, y), such that both can be derived from ρ_c(t, x, y) by different statistical ways of collecting spatial points.

13.
Weak fault signals, highly coupled data, and unknown faults commonly exist in fault-diagnosis systems, causing low detection and identification performance for fault-diagnosis methods based on T² statistics or cross entropy. This paper proposes a new fault-diagnosis method based on optimal-bandwidth kernel density estimation (KDE) and the Jensen–Shannon (JS) divergence distribution for improved fault-detection performance. KDE addresses weak-signal and coupled-fault detection, and the JS divergence addresses unknown-fault detection. First, the formula and algorithm for the optimal bandwidth of multidimensional KDE are presented, and the convergence of the algorithm is proved. Second, the JS divergence between the data distributions is obtained based on the optimal KDE and used for fault detection. Finally, a fault-diagnosis experiment based on the bearing data from the Case Western Reserve University Bearing Data Center is conducted. The results show that, for known faults, the proposed method has a detection rate 10% higher than T² statistics and 2% higher than the cross-entropy method. For unknown faults, T² statistics cannot effectively detect faults, and the proposed method has a detection rate approximately 15% higher than the cross-entropy method. Thus, the proposed method can effectively improve the fault-detection rate.
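The core pipeline — estimate densities by KDE, then compare them with the bounded, symmetric JS divergence — can be sketched in one dimension. The sketch uses Silverman's rule-of-thumb bandwidth rather than the paper's optimal-bandwidth algorithm, and the "healthy"/"faulty" samples are synthetic stand-ins:

```python
import numpy as np

def trapint(y, x):
    """Trapezoidal integral of samples y over the grid x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def gaussian_kde(samples, grid):
    """1-D Gaussian KDE with Silverman's rule-of-thumb bandwidth, evaluated on grid."""
    n = len(samples)
    h = 1.06 * samples.std(ddof=1) * n ** (-1 / 5)      # Silverman bandwidth
    k = np.exp(-0.5 * ((grid[:, None] - samples[None, :]) / h) ** 2)
    dens = k.sum(axis=1) / (n * h * np.sqrt(2 * np.pi))
    return dens / trapint(dens, grid)                   # renormalize on the finite grid

def js_divergence(p, q, grid):
    """Jensen–Shannon divergence (nats, bounded by ln 2) between densities on a common grid."""
    m = 0.5 * (p + q)
    kl = lambda a, b: trapint(a * np.log((a + 1e-12) / (b + 1e-12)), grid)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

rng = np.random.default_rng(1)
grid = np.linspace(-6.0, 8.0, 1000)
normal = gaussian_kde(rng.normal(0.0, 1.0, 500), grid)   # "healthy" operating data
faulty = gaussian_kde(rng.normal(2.5, 1.0, 500), grid)   # shifted "fault" data
```

A fault alarm then amounts to thresholding the JS divergence between the current window's KDE and the healthy reference.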

14.
Measures of information transfer corresponding to non-additive entropies have been studied intensively in previous decades. The majority of the work concerns those belonging to the Sharma–Mittal entropy class, such as the Rényi, Tsallis, Landsberg–Vedral, and Gaussian entropies. All of the considerations follow the same approach, mimicking some of the various and mutually equivalent definitions of Shannon information measures: the information transfer is quantified by an appropriately defined measure of mutual information, while the maximal information transfer is considered as a generalized channel capacity. However, all of the previous approaches fail to satisfy at least one of the ineluctable properties which a measure of (maximal) information transfer should satisfy, leading to counterintuitive conclusions and predicting nonphysical behavior even in the case of very simple communication channels. This paper fills the gap by proposing two-parameter measures named the α-q-mutual information and the α-q-capacity. In addition to the standard Shannon approaches, special cases of these measures include the α-mutual information and the α-capacity, which are well established in the information-theory literature as measures of additive Rényi information transfer, while the cases of the Tsallis, Landsberg–Vedral, and Gaussian entropies can also be accessed by special choices of the parameters α and q. It is shown that, unlike the previous definitions, the α-q-mutual information and the α-q-capacity satisfy a set of properties, stated as axioms, by which they reduce to zero in the case of totally destructive channels and to the (maximal) input Sharma–Mittal entropy in the case of perfect transmission, which is consistent with the maximum-likelihood detection error. In addition, they are non-negative and bounded above by the input and output Sharma–Mittal entropies in general. Thus, unlike the previous approaches, the proposed (maximal) information-transfer measures do not exhibit nonphysical behaviors such as sub-capacitance or super-capacitance, which qualifies them as appropriate measures of Sharma–Mittal information transfer.

15.
Motivated by applications in unsourced random access, this paper develops a novel scheme for the problem of compressed sensing of binary signals. In this problem, the goal is to design a sensing matrix A and a recovery algorithm such that the sparse binary vector x can be recovered reliably from the measurements y = Ax + σz, where z is additive white Gaussian noise. We propose to design A as the parity-check matrix of a low-density parity-check (LDPC) code and to recover x from the measurements y using a Markov chain Monte Carlo algorithm, which runs relatively fast due to the sparse structure of A. The performance of our scheme is comparable to that of state-of-the-art schemes, which use dense sensing matrices, while enjoying the advantages of a sparse sensing matrix.
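The MCMC idea can be illustrated with a Gibbs sampler that resamples each bit of x from its exact conditional under a Gaussian likelihood and a Bernoulli sparsity prior. This is a generic sketch, not the paper's algorithm: a random sparse {0,1} matrix stands in for the LDPC parity-check matrix, and all dimensions and parameters below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)

# Sparse binary ground truth and a random sparse {0,1} sensing matrix
# (a stand-in for the LDPC parity-check matrix the paper actually uses).
n, k, sigma = 60, 40, 0.05
x_true = (rng.random(n) < 0.1).astype(float)
A = (rng.random((k, n)) < 0.1).astype(float)
y = A @ x_true + sigma * rng.standard_normal(k)

def gibbs_recover(A, y, sigma, prior=0.1, sweeps=200, rng=rng):
    """Gibbs sampler over binary x: resample each bit from its exact conditional."""
    n = A.shape[1]
    x = np.zeros(n)
    for _ in range(sweeps):
        for i in range(n):
            r = y - A @ x + A[:, i] * x[i]                 # residual with bit i removed
            # Log-likelihoods of bit i = 1 vs 0 under Gaussian noise + Bernoulli prior.
            ll1 = -np.sum((r - A[:, i]) ** 2) / (2 * sigma**2) + np.log(prior)
            ll0 = -np.sum(r ** 2) / (2 * sigma**2) + np.log(1 - prior)
            p1 = 1.0 / (1.0 + np.exp(np.clip(ll0 - ll1, -50.0, 50.0)))
            x[i] = float(rng.random() < p1)
    return x

x_hat = gibbs_recover(A, y, sigma)
```

Because each conditional touches only the nonzero entries of one column of A, a sparse matrix keeps every update cheap — the speed advantage the abstract refers to.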

16.
The modeling and prediction of chaotic time series require proper reconstruction of the state space from the available data in order to successfully estimate invariant properties of the embedded attractor. Thus, one must choose an appropriate time delay τ and embedding dimension p for phase-space reconstruction. The value of τ can be estimated from the Mutual Information, but this method is rather cumbersome computationally. Additionally, some researchers have recommended that τ should be chosen depending on the embedding dimension p by means of an appropriate value of the time-delay window τ_w = (p − 1)τ, which is the optimal time delay for independence of the time series. The C-C method, based on the Correlation Integral, is simpler than Mutual Information and has been proposed to select τ_w and τ optimally. In this paper, we suggest a simple method for estimating τ and τ_w based on symbolic analysis and symbolic entropy. As in the C-C method, τ is estimated as the first locally optimal time delay and τ_w as the time delay for independence of the time series. The method is applied to several chaotic time series that serve as a basis of comparison for several techniques. The numerical simulations for these systems verify that the proposed symbolic-based method is useful for practitioners and, for the studied models, performs better than the C-C method for the choice of the time delay and embedding dimension. In addition, the method is applied to EEG data in order to study and compare some dynamic characteristics of brain activity under epileptic episodes.
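A standard way to build a symbolic entropy is through ordinal patterns (permutation entropy): each delay-embedded window is mapped to the permutation that sorts it, and the Shannon entropy of the pattern distribution is normalized by its maximum. The sketch below uses this common symbolization; the paper's exact symbolic scheme may differ:

```python
import math
import numpy as np

def permutation_entropy(x, order, delay):
    """Normalized symbolic (permutation) entropy of x for a given order and time delay."""
    counts = {}
    span = (order - 1) * delay
    for i in range(len(x) - span):
        pattern = tuple(np.argsort(x[i:i + span + 1:delay]))  # ordinal symbol of the window
        counts[pattern] = counts.get(pattern, 0) + 1
    p = np.array(list(counts.values()), dtype=float)
    p /= p.sum()
    return float(-(p * np.log(p)).sum() / math.log(math.factorial(order)))

rng = np.random.default_rng(4)
noise_pe = permutation_entropy(rng.standard_normal(5000), 3, 1)  # ≈ 1 for white noise
trend_pe = permutation_entropy(np.arange(100.0), 3, 1)           # 0 for a monotone series
```

Scanning the delay and taking the first local optimum of this entropy gives a τ estimate in the spirit of the abstract's method; τ_w is then read off as the delay at which the series becomes effectively independent.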

17.
We study a two-state "jumping diffusivity" model for a Brownian process alternating between two different diffusion constants, D_+ > D_−, with random waiting times in both states whose distribution is rather general. In the limit of long measurement times, Gaussian behavior with an effective diffusion coefficient is recovered. We show that, for equilibrium initial conditions and in the limit D_− → 0, the short-time behavior leads to a cusp, namely a non-analytical behavior, in the distribution of the displacements P(x, t) at x → 0. Visually this cusp, or tent-like shape, resembles similar behavior found in many experiments on particles diffusing in disordered environments, such as glassy systems and intracellular media. This general result depends only on the existence of finite mean waiting times in the different states of the model. Gaussian statistics in the long-time limit is achieved due to ergodicity and the convergence of the distribution of the temporal occupation fraction in state D_+ to a δ-function. At short times the same quantity converges to a uniform distribution, which leads to the non-analyticity in P(x, t). We demonstrate how the super-statistical framework is a zeroth-order short-time expansion of P(x, t) in the number of transitions, which does not yield the cusp-like shape. The latter, considered the key feature of experiments in the field, is found with the first correction in perturbation theory.
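The crossover from non-Gaussian to Gaussian displacements is easy to see in simulation: at short times (few switches) the displacement distribution is a leptokurtic mixture, while at long times it Gaussianizes. A minimal sketch of a special case with exponential (Markovian) waiting times — an illustrative choice, narrower than the paper's general waiting-time distributions:

```python
import numpy as np

rng = np.random.default_rng(3)

def jumping_diffusivity_displacements(n_paths, t_total, dt, d_plus, d_minus, rate):
    """Displacements of Brownian paths whose diffusivity jumps between D+ and D-
    at exponential waiting times (a Markovian special case of the model)."""
    x = np.zeros(n_paths)
    state = rng.random(n_paths) < 0.5            # equilibrium-like 50/50 initial state
    for _ in range(int(t_total / dt)):
        D = np.where(state, d_plus, d_minus)
        x += np.sqrt(2.0 * D * dt) * rng.standard_normal(n_paths)
        state ^= rng.random(n_paths) < rate * dt  # Markovian switching
    return x

short = jumping_diffusivity_displacements(20000, 0.05, 0.005, 1.0, 0.0, 1.0)
long_ = jumping_diffusivity_displacements(10000, 50.0, 0.01, 1.0, 0.0, 1.0)
```

With D_− = 0, the short-time ensemble is roughly half frozen at x = 0 and half Gaussian, giving a kurtosis near 6 (the tent-like shape); at long times the occupation fraction self-averages and the kurtosis returns to the Gaussian value 3.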

18.
Transverse-momentum spectra of π⁺, p, Λ, Ξ⁻ or Ξ̄⁺, Ω⁻ or Ω̄⁺, and deuterons (d) in different centrality intervals in nucleus–nucleus collisions at center-of-mass energies are analyzed with the blast-wave model with Boltzmann–Gibbs statistics. We extracted the kinetic freeze-out temperature, transverse flow velocity, and kinetic freeze-out volume from the transverse-momentum spectra of the particles. It is observed that the non-strange and strange (multi-strange) particles freeze out separately due to different reaction cross-sections. The freeze-out volume and transverse flow velocity are mass dependent: they decrease with the rest mass of the particles. The present work reveals a scenario of double kinetic freeze-out in nucleus–nucleus collisions. Furthermore, the kinetic freeze-out temperature and freeze-out volume are larger in central collisions than in peripheral collisions, whereas the transverse flow velocity remains almost unchanged from central to peripheral collisions.

19.
We present computer-simulation and theoretical results for a system of N Quantum Hard Spheres (QHS) of diameter σ and mass m at temperature T, confined between parallel hard walls separated by a distance Hσ, within the range 1 ≤ H. Semiclassical Monte Carlo computer simulations were performed, adapted to a confined space, considering effects in terms of the density of particles ρ* = N/V, where V is the accessible volume, the inverse wall separation H⁻¹, and the de Broglie thermal wavelength λ_B = h/√(2πmkT), where k and h are the Boltzmann and Planck constants, respectively. For the cases of extreme and maximum confinement, 0.5 < H⁻¹ < 1 and H⁻¹ = 1, respectively, analytical results can be given, based on an extension to quantum systems of the Helmholtz free energies of the corresponding classical systems.

20.
This paper systematically presents the λ-deformation as the canonical framework of deformation to the dually flat (Hessian) geometry, which has been well established in information geometry. We show that, based on deforming the Legendre duality, all objects of the Hessian case have their correspondence in the λ-deformed case: λ-convexity, λ-conjugation, λ-biorthogonality, λ-logarithmic divergence, λ-exponential and λ-mixture families, etc. In particular, λ-deformation unifies the Tsallis and Rényi deformations by relating them to two manifestations of an identical λ-exponential family, under subtractive or divisive probability normalization, respectively. Unlike the exponential and mixture families, which carry different Hessian geometries, the λ-exponential family coincides with the λ-mixture family after a change of random variables. The resulting statistical manifolds, while still carrying a dualistic structure, replace the Hessian metric and the pair of dually flat conjugate affine connections with a conformal Hessian metric and a pair of projectively flat connections carrying constant (nonzero) curvature. Thus, λ-deformation is a canonical framework generalizing the well-known dually flat Hessian structure of information geometry.
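A concrete entry point to such deformations is the deformed logarithm/exponential pair. One common Tsallis-style convention (with λ playing the role of 1 − q; the paper's exact conventions may differ) is ln_λ(x) = (x^λ − 1)/λ and exp_λ(t) = (1 + λt)^{1/λ}, which recover ln and exp as λ → 0:

```python
import numpy as np

def lam_log(x, lam):
    """λ-deformed logarithm (x^λ − 1)/λ, recovering ln(x) as λ → 0."""
    if lam == 0.0:
        return np.log(x)
    return (np.power(x, lam) - 1.0) / lam

def lam_exp(t, lam):
    """λ-deformed exponential (1 + λt)^{1/λ}, the inverse of lam_log on its domain."""
    if lam == 0.0:
        return np.exp(t)
    return np.power(1.0 + lam * t, 1.0 / lam)

x = np.linspace(0.5, 3.0, 6)
```

A λ-exponential family is then built by replacing exp with exp_λ in the density's canonical form, with subtractive or divisive normalization giving the Tsallis- or Rényi-flavored manifestation, respectively.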


Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.)  京ICP备09084417号