Similar Articles
A total of 20 similar articles were found.
1.
Word embeddings based on a conditional model are commonly used in Natural Language Processing (NLP) tasks to embed the words of a dictionary in a low-dimensional linear space. Their computation is based on maximizing the likelihood of a conditional probability distribution for each word of the dictionary. These distributions form a Riemannian statistical manifold, where word embeddings can be interpreted as vectors in the tangent space of a specific reference measure on the manifold. A novel family of word embeddings, called α-embeddings, has recently been introduced; it derives from the geometrical deformation of the simplex of probabilities through a parameter α, using notions from Information Geometry. After introducing the α-embeddings, we show how the deformation of the simplex, controlled by α, provides an extra handle to improve performance on several intrinsic and extrinsic NLP tasks. We test the α-embeddings on different tasks with models of increasing complexity, showing that the advantages associated with α-embeddings persist for models with a large number of parameters. Finally, we show that tuning α yields higher performance than using larger models in which an additional transformation of the embeddings is learned during training, as experimentally verified in attention models.

2.
Using finite-time thermodynamic theory, an irreversible steady-flow Lenoir cycle model is established, and expressions for the power output and thermal efficiency of the model are derived. Through numerical calculations, for different fixed total heat conductances (U_T) of the two heat exchangers, the maximum powers (P_max), the maximum thermal efficiencies (η_max), and the corresponding optimal heat conductance distribution ratios (u_LP(opt)) and (u_Lη(opt)) are obtained. The effects of the internal irreversibility are analyzed. The results show that, when the heat conductances of the hot- and cold-side heat exchangers are constant, the corresponding power output and thermal efficiency are constant values. When the heat source temperature ratio (τ) and the effectivenesses of the heat exchangers increase, the corresponding power output and thermal efficiency increase. When the heat conductance distributions take their optimal values, the characteristic P–u_L and η–u_L relationships are parabolic-like. When U_T is given, P_max, η_max, u_LP(opt), and u_Lη(opt) increase with τ. When τ is given, P_max and η_max increase with U_T, while u_LP(opt) and u_Lη(opt) decrease.

3.
When studying the behaviour of complex dynamical systems, a statistical formulation can provide useful insights. In particular, information geometry is a promising tool for this purpose. In this paper, we investigate the information length for n-dimensional linear autonomous stochastic processes, providing a basic theoretical framework that can be applied to a large set of problems in engineering and physics. A specific application is made to a harmonically bound particle system with natural oscillation frequency ω, subject to damping γ and Gaussian white noise. We explore how the information length depends on ω and γ, elucidating the role of critical damping γ=2ω in information geometry. Furthermore, in the long-time limit, we show that the information length reflects the linear geometry associated with the Gaussian statistics in a linear stochastic process.
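As a rough numerical illustration of the information-length computation for a linear autonomous stochastic process, the Python sketch below evaluates L(t) for a one-dimensional Ornstein–Uhlenbeck process with a Gaussian initial condition; the damping gamma, noise strength D, and initial moments are illustrative assumptions, not values from the paper.

    import numpy as np

    # 1D Ornstein-Uhlenbeck process dx = -gamma*x dt + sqrt(2*D) dW with a
    # Gaussian initial condition N(mu0, var0).  The density stays Gaussian:
    #   mu(t)  = mu0 * exp(-gamma*t)
    #   var(t) = D/gamma + (var0 - D/gamma) * exp(-2*gamma*t)
    # For a Gaussian, E[(d/dt ln p)^2] = mu_dot^2/var + var_dot^2/(2*var^2),
    # and the information length is the time integral of its square root.
    gamma, D = 1.0, 0.5          # illustrative damping and noise strength
    mu0, var0 = 2.0, 0.01        # illustrative initial mean and variance

    t = np.linspace(0.0, 10.0, 20001)
    mu = mu0 * np.exp(-gamma * t)
    var = D / gamma + (var0 - D / gamma) * np.exp(-2.0 * gamma * t)
    mu_dot = -gamma * mu
    var_dot = -2.0 * gamma * (var0 - D / gamma) * np.exp(-2.0 * gamma * t)

    speed = np.sqrt(mu_dot**2 / var + var_dot**2 / (2.0 * var**2))
    info_length = float(np.sum(0.5 * (speed[1:] + speed[:-1]) * np.diff(t)))
    print("information length L(10) ~", info_length)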

4.
To sample from complex, high-dimensional distributions, one may choose algorithms based on the Hybrid Monte Carlo (HMC) method. HMC-based algorithms generate nonlocal moves, alleviating diffusive behavior. Here, I build on an already defined HMC framework, hybrid Monte Carlo on Hilbert spaces (Beskos et al., Stoch. Proc. Appl. 2011), which provides finite-dimensional approximations of measures π that have a density with respect to a Gaussian measure on an infinite-dimensional Hilbert (path) space. In all HMC algorithms, one has some freedom to choose the mass operator. The novel feature of the algorithm described in this article lies in the choice of this operator. This new choice defines a Markov Chain Monte Carlo (MCMC) method that is well defined on the Hilbert space itself. As before, the algorithm described herein uses an enlarged phase space Π having the target π as a marginal, together with a Hamiltonian flow that preserves Π. In the previous work, the authors explored a method where π was augmented with Brownian bridges. With this new choice, π is augmented by Ornstein–Uhlenbeck (OU) bridges. The covariance of Brownian bridges grows with their length, which has negative effects on the acceptance rate in the MCMC method. This contrasts with the covariance of OU bridges, which is independent of the path length. The ingredients of the new algorithm include the definition of the mass operator, the equations for the Hamiltonian flow, the (approximate) numerical integration of the evolution equations, and finally, the Metropolis–Hastings acceptance rule. Taken together, these constitute a robust method for sampling the target distribution in an almost dimension-free manner. The behavior of this novel algorithm is demonstrated by computer experiments for a particle moving in two dimensions, between two free-energy basins separated by an entropic barrier.
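The role played by the mass operator can be illustrated with an ordinary finite-dimensional HMC sampler. The Python sketch below is a toy example on a badly scaled Gaussian target, not the Hilbert-space algorithm of the paper; the target, the mass matrix M, the step size, and the trajectory length are all illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy target: zero-mean Gaussian with a badly scaled covariance.
    Sigma = np.diag([1.0, 100.0])
    Prec = np.linalg.inv(Sigma)
    U = lambda q: 0.5 * q @ Prec @ q          # potential = -log target (up to a constant)
    grad_U = lambda q: Prec @ q

    # "Mass operator": here simply a constant matrix M.  Choosing M close to the
    # target precision makes the Hamiltonian dynamics well conditioned.
    M = Prec.copy()
    M_inv = np.linalg.inv(M)
    L_chol = np.linalg.cholesky(M)

    def hmc_step(q, eps=0.2, n_leap=20):
        p = L_chol @ rng.standard_normal(q.size)          # momentum p ~ N(0, M)
        q_new = q.copy()
        p_new = p - 0.5 * eps * grad_U(q_new)             # leapfrog integration
        for _ in range(n_leap):
            q_new = q_new + eps * (M_inv @ p_new)
            p_new = p_new - eps * grad_U(q_new)
        p_new = p_new + 0.5 * eps * grad_U(q_new)         # turn the last full step into a half step
        H_old = U(q) + 0.5 * p @ M_inv @ p
        H_new = U(q_new) + 0.5 * p_new @ M_inv @ p_new
        return (q_new, 1) if rng.random() < np.exp(H_old - H_new) else (q, 0)

    q, acc, samples = np.zeros(2), 0, []
    for _ in range(5000):
        q, a = hmc_step(q)
        acc += a
        samples.append(q)
    print("acceptance rate:", acc / 5000)
    print("sample covariance:\n", np.cov(np.array(samples).T))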

5.
The asymmetric skew divergence smooths one of the distributions by mixing it, to a degree determined by the parameter λ, with the other distribution. This divergence approximates the KL divergence without requiring the target distribution to be absolutely continuous with respect to the source distribution. In this paper, an information-geometric generalization of the skew divergence, called the α-geodesical skew divergence, is proposed, and its properties are studied.
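A minimal Python sketch of the skewing idea on discrete distributions is given below. It uses one common parameterization, D_λ(p‖q) = KL(p ‖ (1−λ)p + λq), chosen here for illustration; it does not implement the α-geodesical generalization itself.

    import numpy as np

    def kl(p, q):
        """KL divergence for discrete distributions; terms with p_i = 0 contribute 0."""
        mask = p > 0
        return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

    def skew_divergence(p, q, lam):
        """Skew divergence: KL of p against the mixture (1 - lam)*p + lam*q."""
        return kl(p, (1.0 - lam) * p + lam * q)

    # q has a zero where p does not, so KL(p||q) is infinite, but the skew
    # divergence stays finite for lam < 1: the mixture keeps the support of p.
    p = np.array([0.5, 0.5, 0.0])
    q = np.array([0.5, 0.0, 0.5])
    for lam in (0.1, 0.5, 0.9, 0.99):
        print(f"lam = {lam}:  D_lam(p||q) = {skew_divergence(p, q, lam):.4f}")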

6.
The modeling and prediction of chaotic time series require proper reconstruction of the state space from the available data in order to successfully estimate invariant properties of the embedded attractor. Thus, one must choose appropriate values of the time delay τ and the embedding dimension p for phase-space reconstruction. The value of τ can be estimated from the mutual information, but this method is rather cumbersome computationally. Additionally, some researchers have recommended that τ should be chosen to depend on the embedding dimension p through an appropriate value of the time delay τ_w = (p − 1)τ, which is the optimal time delay for independence of the time series. The C-C method, based on the correlation integral, is simpler than the mutual-information approach and has been proposed to select τ_w and τ optimally. In this paper, we suggest a simple method for estimating τ and τ_w based on symbolic analysis and symbolic entropy. As in the C-C method, τ is estimated as the first local optimal time delay and τ_w as the time delay for independence of the time series. The method is applied to several chaotic time series that serve as a basis of comparison for several techniques. The numerical simulations for these systems verify that the proposed symbolic-based method is useful for practitioners and, for the studied models, performs better than the C-C method in the choice of the time delay and embedding dimension. In addition, the method is applied to EEG data in order to study and compare some dynamic characteristics of brain activity under epileptic episodes.
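A simplified Python sketch of the delay-selection problem is shown below: the series is symbolized by a quantile split and τ is taken at the first local minimum of the symbolic mutual information between x(t) and x(t+τ). The symbolization scheme, the first-minimum criterion, and the noisy sine test signal are illustrative choices, not the symbolic-entropy procedure of the paper.

    import numpy as np

    def symbolic_mutual_information(x, tau, n_sym=4):
        """Mutual information between s(t) and s(t+tau) after quantile symbolization."""
        edges = np.quantile(x, np.linspace(0, 1, n_sym + 1)[1:-1])
        s = np.digitize(x, edges)                      # symbols 0 .. n_sym-1
        a, b = s[:-tau], s[tau:]
        joint = np.zeros((n_sym, n_sym))
        np.add.at(joint, (a, b), 1.0)
        joint /= joint.sum()
        px, py = joint.sum(1, keepdims=True), joint.sum(0, keepdims=True)
        nz = joint > 0
        return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))

    # Illustrative test signal: a noisy sine sampled at dt = 0.01.
    t = np.arange(0.0, 200.0, 0.01)
    x = np.sin(t) + 0.05 * np.random.default_rng(1).standard_normal(t.size)

    mi = [symbolic_mutual_information(x, tau) for tau in range(1, 400)]
    tau_first_min = next(i + 1 for i in range(1, len(mi) - 1)
                         if mi[i] < mi[i - 1] and mi[i] <= mi[i + 1])
    print("estimated time delay (in samples):", tau_first_min)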

7.
A new type of quantum correction to the structure of classical black holes is investigated. This concerns the physics of event horizons induced by the occurrence of stochastic quantum gravitational fields. The theoretical framework is provided by the theory of manifestly covariant quantum gravity and the related prediction of an exclusively quantum-produced stochastic cosmological constant. The specific example of the Schwarzschild–de Sitter geometry is examined, analyzing the consequent stochastic modifications of the Einstein field equations. It is proved that, in such a setting, the black hole event horizon no longer identifies a classical (i.e., deterministic) two-dimensional surface. On the contrary, it acquires a quantum stochastic character, giving rise to a frame-dependent transition region of radial width δr between the internal and external subdomains. It is found that: (a) the radial size of the stochastic region depends parametrically on the central mass M of the black hole, scaling as δr ∼ M³; (b) for supermassive black holes, δr is typically orders of magnitude larger than the Planck length l_P, whereas for typical stellar-mass black holes, δr may drop well below l_P. The outcome provides new insight into the quantum properties of black holes, with implications for the physics of quantum tunneling phenomena expected to arise across stochastic event horizons.

8.
Importance sampling is used to approximate Bayes' rule in many computational approaches to Bayesian inverse problems, data assimilation, and machine learning. This paper reviews and further investigates the required sample size for importance sampling in terms of the χ²-divergence between target and proposal. We illustrate through examples the roles that dimension, noise level, and other model parameters play in approximating the Bayesian update with importance sampling. Our examples also facilitate a new direct comparison of standard and optimal proposals for particle filtering.
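A small self-contained Python sketch of the quantities discussed above: a self-normalized importance-sampling estimate, an effective-sample-size diagnostic, and a Monte Carlo estimate of E_q[(π/q)²] = 1 + χ²(π‖q). The Gaussian target/proposal pair and all parameter values are illustrative assumptions.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    target = stats.norm(loc=0.0, scale=1.0)      # pi
    proposal = stats.norm(loc=1.5, scale=2.0)    # q, a deliberately mismatched proposal

    N = 100_000
    x = proposal.rvs(size=N, random_state=rng)
    w = target.pdf(x) / proposal.pdf(x)          # importance weights pi/q

    w_norm = w / w.sum()
    mean_est = np.sum(w_norm * x)                # self-normalized estimate of E_pi[x]
    ess = 1.0 / np.sum(w_norm**2)                # effective sample size diagnostic
    rho = np.mean(w**2)                          # estimates E_q[(pi/q)^2] = 1 + chi2(pi||q)

    print(f"estimate of E_pi[x]: {mean_est:.4f} (exact value 0)")
    print(f"ESS: {ess:.0f} of {N},   1 + chi^2(pi||q) ~ {rho:.2f}")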

9.
10.
Simple Summary: In the early Universe, both the QCD and electroweak (EW) eras play an essential role in laying the seeds for nucleosynthesis and even in dictating the cosmological large-scale structure. Taking advantage of recent developments in ultrarelativistic nuclear experiments and in non-perturbative and perturbative lattice simulations, various thermodynamic quantities, including pressure, energy density, bulk viscosity, relaxation time, and temperature, have been calculated up to the TeV scale; the possible influence of finite bulk viscosity is characterized for the first time, and the analytical dependence of the Hubble parameter on the scale factor is also introduced. Abstract: Based on recent perturbative and non-perturbative lattice calculations with almost physical quark flavors, and on the thermal contributions from photons, neutrinos, leptons, electroweak particles, and scalar Higgs bosons, various thermodynamic quantities at vanishing net-baryon density, such as pressure, energy density, bulk viscosity, relaxation time, and temperature, have been calculated up to the TeV scale, i.e., covering the hadron, QGP, and electroweak (EW) phases in the early Universe. This remarkable progress motivated the present study to determine the possible influence of bulk viscosity in the early Universe and to understand how it varies from epoch to epoch. We have taken into consideration first-order (Eckart) and second-order (Israel–Stewart) theories for the relativistic cosmic fluid and integrated viscous equations of state into the Friedmann equations. Nonlinear, nonhomogeneous differential equations are obtained as analytical solutions. For Israel–Stewart, the differential equations are too involved to be solved here; they are outlined as road maps for future studies. For Eckart theory, the only possible solution is the functional dependence H(a(t)), where H(t) is the Hubble parameter and a(t) is the scale factor, but so far neither can be expressed directly in terms of the proper or cosmic time t. For an Eckart-type viscous background, especially at finite cosmological constant, non-singular H(t) and a(t) are obtained, whereas H(t) diverges for the QCD/EW and asymptotic equations of state (EoS). For a non-viscous background, the dependence of H(a(t)) is monotonic; the same conclusion can be drawn for an ideal EoS. We also conclude that the rate at which H(a(t)) decreases with increasing a(t) varies from epoch to epoch, at both vanishing and finite cosmological constant. These results help improve our understanding of nucleosynthesis and the cosmological large-scale structure.
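As a rough illustration of coupling an Eckart-type bulk viscosity to the Friedmann equations, the Python sketch below integrates a single-fluid model with a constant equation-of-state parameter w, a constant bulk viscosity ζ, and a cosmological constant Λ. The constant-ζ simplification and all parameter values are assumptions for illustration, not the lattice-based QCD/EW equation of state used in the paper.

    import numpy as np
    from scipy.integrate import solve_ivp

    # Units with 8*pi*G = c = 1.  Eckart viscous pressure Pi = -3*zeta*H, so the
    # effective pressure is p_eff = w*rho - 3*zeta*H, and the continuity equation
    # reads d(rho)/dt = -3*H*(rho*(1 + w) - 3*zeta*H), with H^2 = (rho + Lam)/3.
    w, zeta, Lam = 1.0 / 3.0, 0.05, 0.0      # illustrative radiation-like fluid

    def rhs(t, y):
        a, rho = y
        H = np.sqrt((rho + Lam) / 3.0)
        return [a * H, -3.0 * H * (rho * (1.0 + w) - 3.0 * zeta * H)]

    sol = solve_ivp(rhs, (0.0, 50.0), [1.0, 10.0], rtol=1e-8, max_step=0.1)
    a, rho = sol.y
    H = np.sqrt((rho + Lam) / 3.0)
    # In this run H(a) decreases monotonically toward a small constant set by the
    # viscosity; the viscous term slows the dilution of rho relative to zeta = 0.
    print("a(t_final) =", a[-1], "  H(t_final) =", H[-1])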

11.
Expected Shortfall (ES), the average loss above a high quantile, is the current financial regulatory market risk measure. Its estimation and optimization are highly unstable against sample fluctuations and become impossible above a critical ratio r = N/T, where N is the number of different assets in the portfolio and T is the length of the available time series. The critical ratio depends on the confidence level α, which means we have a line of critical points on the α–r plane. The large fluctuations in the estimation of ES can be attenuated by the application of regularizers. In this paper, we calculate ES analytically under an ℓ1 regularizer by the method of replicas borrowed from the statistical physics of random systems. The ban on short selling, i.e., a constraint rendering all the portfolio weights non-negative, is a special case of an asymmetric ℓ1 regularizer. Results are presented for the out-of-sample and the in-sample estimator of the regularized ES, the estimation error, the distribution of the optimal portfolio weights, and the density of the assets eliminated from the portfolio by the regularizer. It is shown that the no-short constraint acts as a high-volatility cutoff, in the sense that it sets the weights of the high-volatility elements to zero with higher probability than those of the low-volatility items. This cutoff renormalizes the aspect ratio r = N/T, thereby extending the range of feasibility of the optimization. We find that there is a nontrivial mapping between the regularized and unregularized problems, corresponding to a renormalization of the order parameters.
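A small Python sketch of the in-sample ES optimization under the no-short (non-negative weights) constraint, using the standard Rockafellar–Uryasev linear-programming reformulation on simulated returns. The i.i.d. return model, the confidence level, and the dimensions N and T are illustrative assumptions; the replica-method analysis of the paper is not reproduced here.

    import numpy as np
    from scipy.optimize import linprog

    rng = np.random.default_rng(0)
    N, T, alpha = 10, 250, 0.975                     # assets, observations, ES confidence level
    vols = rng.uniform(0.5, 2.0, size=N)
    returns = rng.standard_normal((T, N)) * vols     # illustrative i.i.d. return sample

    # Rockafellar-Uryasev LP:  minimize  u + 1/((1 - alpha)*T) * sum_t z_t  over
    # x = (w_1..w_N, u, z_1..z_T), subject to
    #   z_t >= -r_t.w - u,  z_t >= 0,  sum_i w_i = 1,  w_i >= 0 (no short selling).
    c = np.concatenate([np.zeros(N), [1.0], np.full(T, 1.0 / ((1.0 - alpha) * T))])
    A_ub = np.hstack([-returns, -np.ones((T, 1)), -np.eye(T)])
    b_ub = np.zeros(T)
    A_eq = np.concatenate([np.ones(N), np.zeros(T + 1)])[None, :]
    bounds = [(0, None)] * N + [(None, None)] + [(0, None)] * T

    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=bounds, method="highs")
    w = res.x[:N]
    print("in-sample ES at the optimum:", res.fun)
    print("weights set to zero by the no-short constraint:", int(np.sum(w < 1e-8)), "of", N)
    print("volatilities of the eliminated assets:", vols[w < 1e-8])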

12.
Transverse momentum spectra of π+, p, Λ, Ξ− or Ξ¯+, Ω− or Ω¯+, and deuterons (d) in different centrality intervals in nucleus–nucleus collisions at the center-of-mass energy are analyzed with the blast-wave model with Boltzmann–Gibbs statistics. We extracted the kinetic freezeout temperature, transverse flow velocity, and kinetic freezeout volume from the transverse momentum spectra of the particles. It is observed that the non-strange and strange (multi-strange) particles freeze out separately owing to their different reaction cross-sections. The freezeout volume and transverse flow velocity are mass dependent: they decrease with the rest mass of the particles. The present work reveals the scenario of a double kinetic freezeout in nucleus–nucleus collisions. Furthermore, the kinetic freezeout temperature and freezeout volume are larger in central collisions than in peripheral collisions, whereas the transverse flow velocity remains almost unchanged from central to peripheral collisions.

13.
In the nervous system, information is conveyed by sequences of action potentials, called spike trains. As MacKay and McCulloch suggested, spike trains can be represented as bit sequences coming from Information Sources (IS). Previously, we studied the relations between the Information Transmission Rate (ITR) of spike trains and their correlations and frequencies. Here, I concentrate on the problem of how spike fluctuations affect the ITR. The IS are typically modeled as stationary stochastic processes, which I consider here as two-state Markov processes. As the spike-train fluctuation measure, I take the standard deviation σ, which measures the average fluctuation of spikes around the average spike frequency. I found that the character of the relation between the ITR and the signal fluctuations strongly depends on the parameter s, a sum of transition probabilities from the no-spike state to the spike state. The Information Transmission Rate was estimated by expressions depending on the values of the signal fluctuations and the parameter s. It turned out that for s < 1 the quotient ITR/σ has a maximum and can tend to zero depending on the transition probabilities, while for s > 1 the quotient ITR/σ is bounded away from 0. Additionally, it was shown that the quotient of the ITR by the variance behaves in a completely different way. Similar behavior was observed when the classical Shannon entropy terms in the Markov entropy formula are replaced by their polynomial approximations. My results suggest that in a noisier environment (s > 1), to achieve appropriate reliability and efficiency of transmission, an IS with a higher tendency to transition from the no-spike to the spike state should be applied. Such a selection of appropriate parameters plays an important role in designing learning mechanisms to obtain networks with higher performance.
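The Python sketch below illustrates the quantities discussed above for a two-state (no-spike/spike) Markov source: the entropy rate used as a stand-in for the ITR, the stationary firing rate, and the per-bin fluctuation σ. Treating σ as the Bernoulli standard deviation sqrt(p(1−p)) and reading s as the sum of the transition probabilities into the spike state are simplifying assumptions for illustration.

    import numpy as np

    def markov_two_state(p01, p10):
        """Two-state Markov spike source: state 0 = no spike, state 1 = spike.
        p01 = P(0 -> 1), p10 = P(1 -> 0)."""
        P = np.array([[1.0 - p01, p01],
                      [p10, 1.0 - p10]])
        pi1 = p01 / (p01 + p10)                  # stationary probability of a spike
        pi = np.array([1.0 - pi1, pi1])
        logs = np.where(P > 0, np.log2(np.where(P > 0, P, 1.0)), 0.0)
        entropy_rate = float(-np.sum(pi[:, None] * P * logs))   # bits per time bin
        sigma = np.sqrt(pi1 * (1.0 - pi1))       # per-bin fluctuation around the mean rate
        s = p01 + (1.0 - p10)                    # sum of transitions into the spike state
        return entropy_rate, pi1, sigma, s

    # One "low s" and one "high s" regime for comparison.
    for p01, p10 in [(0.1, 0.8), (0.6, 0.2)]:
        itr, rate, sigma, s = markov_two_state(p01, p10)
        print(f"p01={p01}, p10={p10}:  s={s:.2f}  ITR={itr:.3f} bits/bin  "
              f"rate={rate:.3f}  ITR/sigma={itr / sigma:.3f}")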

14.
This paper investigates the achievable per-user degrees of freedom (DoF) in the uplink of multi-cloud-based sectored hexagonal cellular networks (M-CRAN). The network consists of N base stations (BS) and K ≤ N baseband unit pools (BBUP), which function as independent cloud centers. The communication between BSs and BBUPs occurs over finite-capacity fronthaul links of capacity C_F = μ_F · (1/2)log(1+P), with P denoting the transmit power. In the system model, the BBUPs have limited processing capacity C_BBU = μ_BBU · (1/2)log(1+P). We propose two different achievability schemes based on dividing the network into non-interfering parallelogram and hexagonal clusters, respectively. The minimum number of users in a cluster is determined by the ratio of BBUPs to BSs, r = K/N. Both the parallelogram and hexagonal schemes are based on practically implementable beamforming and adapt the formation of clusters to the sectorization of the cells. The proposed coding schemes improve the sum rate over naive approaches that ignore cell sectorization, both at finite signal-to-noise ratio (SNR) and in the high-SNR limit. We derive a lower bound on the per-user DoF that is a function of μ_BBU, μ_F, and r. We show that the cut-set bound is attained in several cases, that the achievability gap between the lower and cut-set bounds decreases with the inverse 1/r of the BBUP-to-BS ratio for μ_F ≤ 2M irrespective of μ_BBU, and that the per-user DoF achieved through hexagonal clustering cannot exceed the per-user DoF of parallelogram clustering for any value of μ_BBU and r as long as μ_F ≤ 2M. Since the achievability gap decreases with the inverse of the BBUP-to-BS ratio for small and moderate fronthaul capacities, the cut-set bound is almost achieved even for small cluster sizes in this range of fronthaul capacities. For higher fronthaul capacities, the achievability gap is not always tight but decreases with the processing capacity. However, the cut-set bound, e.g., at 5M/6, can be achieved with a moderate clustering size.

15.
16.
Measures of information transfer corresponding to non-additive entropies have been studied intensively in previous decades. The majority of this work concerns entropies belonging to the Sharma–Mittal class, such as the Rényi, Tsallis, Landsberg–Vedral, and Gaussian entropies. All of these considerations follow the same approach: mimicking one of the various, mutually equivalent definitions of the Shannon information measures, the information transfer is quantified by an appropriately defined measure of mutual information, while the maximal information transfer is considered as a generalized channel capacity. However, all of the previous approaches fail to satisfy at least one of the ineluctable properties that a measure of (maximal) information transfer should satisfy, leading to counterintuitive conclusions and predicting nonphysical behavior even in the case of very simple communication channels. This paper fills the gap by proposing two-parameter measures named the α-q-mutual information and the α-q-capacity. In addition to the standard Shannon approaches, special cases of these measures include the α-mutual information and the α-capacity, which are well established in the information-theory literature as measures of additive Rényi information transfer, while the cases of the Tsallis, Landsberg–Vedral, and Gaussian entropies can also be accessed by special choices of the parameters α and q. It is shown that, unlike the previous definitions, the α-q-mutual information and the α-q-capacity satisfy a set of properties, stated as axioms, by which they reduce to zero in the case of totally destructive channels and to the (maximal) input Sharma–Mittal entropy in the case of perfect transmission, consistently with the maximum-likelihood detection error. In addition, they are non-negative and, in general, less than or equal to the input and output Sharma–Mittal entropies. Thus, unlike the previous approaches, the proposed (maximal) information transfer measures do not manifest nonphysical behaviors such as sub-capacitance or super-capacitance, which qualifies them as appropriate measures of Sharma–Mittal information transfer.

17.
We discuss a covariant relativistic Boltzmann equation which describes the evolution of a system of particles in spacetime evolving with a universal invariant parameter τ. The observed time t of Einstein and Maxwell, in the presence of interaction, is not necessarily a monotonic function of τ. If t(τ) increases with τ, the worldline may be associated with a normal particle, but if it decreases in τ, it is observed in the laboratory as an antiparticle. This paper discusses the implications for entropy evolution in this relativistic framework. It is shown that if an ensemble of particles and antiparticles converges in a region of pair annihilation, the entropy of the antiparticle beam may decrease in time.

18.
Aims: Bubble entropy (bEn) is an entropy metric with a limited dependence on parameters. bEn does not directly quantify the conditional entropy of the series, but it assesses the change in the entropy of the ordering of portions of its samples of length m when an extra element is added. The analytical formulation of bEn for autoregressive (AR) processes shows that, for this class of processes, the relation between the first autocorrelation coefficient and bEn changes for odd and even values of m. While this is not an issue per se, it triggered ideas for further investigation. Methods: Using theoretical considerations on the expected values for AR processes, we examined a two-steps-ahead estimator of bEn, which considers the cost of ordering two additional samples. We first compared it with the original bEn estimator on a simulated series. Then, we tested it on real heart rate variability (HRV) data. Results: The experiments showed that both examined alternatives have comparable discriminating power. However, for values of 10 < m < 20, where the statistical significance of the method increased with m, the two-steps-ahead estimator presented slightly higher statistical significance and more regular behavior, even though the dependence on the parameter m remained minimal. We also investigated a new normalization factor for bEn, which ensures that bEn = 1 when white Gaussian noise (WGN) is given as the input. Conclusions: The research improved our understanding of bubble entropy, in particular in the context of HRV analysis, and we investigated interesting details regarding the definition of the estimator.
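A compact Python sketch of the swap-counting idea behind bubble entropy: embed the series in vectors of length m, count the bubble-sort swaps needed to order each vector, take the Rényi-2 entropy of the swap-count histogram, and difference the result across m and m+1. The normalization by log((m+1)/(m−1)) used below follows one published formulation and, like the choice m = 10, is an assumption for illustration; it should be checked against the reference definition before use.

    import numpy as np

    def swap_counts(x, m):
        """Bubble-sort swap count for every embedding vector of length m."""
        counts = []
        for i in range(len(x) - m + 1):
            v = list(x[i:i + m])
            n_swaps = 0
            for j in range(m - 1):                 # plain bubble sort, counting swaps
                for k in range(m - 1 - j):
                    if v[k] > v[k + 1]:
                        v[k], v[k + 1] = v[k + 1], v[k]
                        n_swaps += 1
            counts.append(n_swaps)
        return np.array(counts)

    def renyi2(counts, m):
        """Renyi entropy of order 2 of the swap-count distribution."""
        hist = np.bincount(counts, minlength=m * (m - 1) // 2 + 1) / len(counts)
        return -np.log(np.sum(hist ** 2))

    def bubble_entropy(x, m=10):
        h_m = renyi2(swap_counts(x, m), m)
        h_m1 = renyi2(swap_counts(x, m + 1), m + 1)
        return (h_m1 - h_m) / np.log((m + 1) / (m - 1))

    rng = np.random.default_rng(0)
    print("bEn, white Gaussian noise:", bubble_entropy(rng.standard_normal(3000)))
    print("bEn, pure sine           :", bubble_entropy(np.sin(0.05 * np.arange(3000))))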

19.
A solvable model of a periodically driven trapped mixture of Bose–Einstein condensates, consisting of N_1 interacting bosons of mass m_1 driven by a force of amplitude f_L,1 and N_2 interacting bosons of mass m_2 driven by a force of amplitude f_L,2, is presented. The model generalizes the harmonic-interaction model for mixtures to the time-dependent domain. The resulting many-particle ground Floquet wavefunction and quasienergy, as well as the time-dependent densities and reduced density matrices, are prescribed explicitly and analyzed at the many-body and mean-field levels of theory for finite systems and at the limit of an infinite number of particles. We prove that the time-dependent densities per particle are given at the limit of an infinite number of particles by their respective mean-field quantities, and that the time-dependent reduced one-particle and two-particle density matrices per particle of the driven mixture are 100% condensed. Interestingly, the quasienergy per particle does not coincide with the mean-field value at this limit, unless the relative center-of-mass coordinate of the two Bose–Einstein condensates is not activated by the driving forces f_L,1 and f_L,2. As an application, we investigate the imprinting of angular momentum and its fluctuations when steering a Bose–Einstein condensate by an interacting bosonic impurity and the resulting modes of rotation. Whereas the expectation values per particle of the angular-momentum operator for the many-body and mean-field solutions coincide at the limit of an infinite number of particles, the respective fluctuations can differ substantially. The results are analyzed in terms of the transformation properties of the angular-momentum operator under translations and boosts, and as a function of the interactions between the particles. Implications are briefly discussed.

20.
Recently, it has been shown that the information flow and causality between two time series can be inferred in a rigorous and quantitative sense and, moreover, that the resulting causality can be normalized. A corollary is that, in the linear limit, causation implies correlation, while correlation does not imply causation. Now suppose there is an event A taking a harmonic form (sine/cosine), and it generates through some process another event B so that B always lags A by a phase of π/2. Here the causality is evident, yet the computed correlation is zero. This apparent contradiction is rooted in the fact that a harmonic system always leaves a single point on the Poincaré section; it does not add information. That is to say, though the absolute information flow from A to B is zero, i.e., T_A→B = 0, the total information increase of B is also zero, so the normalized T_A→B, denoted τ_A→B, takes the indeterminate form 0/0. By slightly perturbing the system with some noise, solving a stochastic differential equation, and letting the perturbation go to zero, it can be shown that τ_A→B approaches 100%, just as one would expect.
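The π/2-lag example is easy to reproduce numerically. In the Python sketch below (illustrative sampling grid and noise level), the Pearson correlation between A and the quarter-period-lagged B is numerically zero over whole periods, even though B is completely determined by A.

    import numpy as np

    t = np.linspace(0.0, 20.0 * np.pi, 200001)[:-1]    # a whole number of periods
    A = np.sin(t)
    B = np.sin(t - np.pi / 2.0)                        # B lags A by a phase of pi/2

    print("corr(A, B):", np.corrcoef(A, B)[0, 1])      # ~ 0 despite B being a function of A

    # The correlation stays ~ 0 under a tiny noise perturbation as well; the
    # normalized information flow discussed in the abstract, by contrast,
    # tends to 100% in the limit of vanishing perturbation.
    rng = np.random.default_rng(0)
    An = A + 1e-3 * rng.standard_normal(A.size)
    Bn = B + 1e-3 * rng.standard_normal(B.size)
    print("corr with tiny noise:", np.corrcoef(An, Bn)[0, 1])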
