Similar literature
 20 similar documents found (search time: 15 ms)
1.
In this study, we focus on mixed data which are either observations of univariate random variables which can be quantitative or qualitative, or observations of multivariate random variables such that each variable can include both quantitative and qualitative components. We first propose a novel method, called CMIh, to estimate conditional mutual information, taking advantage of previously proposed approaches for qualitative and quantitative data. We then introduce a new local permutation test, called LocAT (local adaptive test), which is well adapted to mixed data. Our experiments illustrate the good behaviour of CMIh and LocAT, and show their respective abilities to accurately estimate conditional mutual information and to detect conditional (in)dependence for mixed data.

2.
In the area of brain-computer interfaces (BCI), the detection of P300 is a very important technique with many applications. Although this problem has been studied for decades, it remains difficult in electroencephalography (EEG) signal processing owing to high-dimensional features and a low signal-to-noise ratio (SNR). Recently, neural networks such as convolutional neural networks (CNNs) have shown excellent performance in many applications. However, standard convolutional neural networks suffer from performance degradation when dealing with noisy data or data with too much redundant information. In this paper, we propose a novel convolutional neural network with a variational information bottleneck for P300 detection. With the CNN architecture and the information bottleneck, the proposed network, termed P300-VIB-Net, can effectively remove redundant information from the data. The experimental results on BCI competition data sets show that P300-VIB-Net achieves cutting-edge character recognition performance. Furthermore, from the perspective of information theory, the proposed model is capable of adaptively restricting the flow of irrelevant information in the network. The experimental results show that P300-VIB-Net is a promising tool for P300 detection.

3.
Compared with mechanism-based modeling methods, data-driven modeling based on big data has become a popular research field in recent years because of its applicability. However, more data is not always better when building a forecasting model in practice. Because of the noise, conflict, redundancy, and inconsistency of big time-series data, forecasting accuracy may actually decrease. This paper proposes a deep network that selects and understands data to improve performance. First, a data self-screening layer (DSSL) with a maximal information distance coefficient (MIDC) is designed to filter input data with high correlation and low redundancy; then, a variational Bayesian gated recurrent unit (VBGRU) is used to improve the anti-noise ability and robustness of the model. Beijing’s air quality and meteorological data are used in a verification experiment of 24 h PM2.5 concentration forecasting, showing that the proposed model is superior to other models in accuracy.

4.
Deep learning has proven to be an important element of modern data processing technology and has found application in many areas, such as multimodal sensor data processing and understanding, data generation, and anomaly detection. While the use of deep learning is booming in many real-world tasks, the internal processes by which it produces results are still uncertain. Understanding the data processing pathways within a deep neural network is important for transparency and better resource utilisation. In this paper, a method utilising information theoretic measures is used to reveal the typical learning patterns of convolutional neural networks, which are commonly used for image processing tasks. For this purpose, training samples, true labels and estimated labels are considered to be random variables. The mutual information and conditional entropy between these variables are then studied using information theoretical measures. This paper shows that adding convolutional layers to the network improves its learning, but that unnecessarily large numbers of convolutional layers do not improve the learning any further. The number of convolutional layers that need to be added to a neural network to reach the desired learning level can be determined with the help of information theoretic quantities including entropy, inequality and mutual information among the inputs to the network. The kernel size of convolutional layers only affects the learning speed of the network. This study also shows that the position of the dropout layer has no significant effect on the learning of networks with a lower dropout rate, and that with higher dropout rates it is better placed immediately after the last convolutional layer.

5.
Without assuming any functional or distributional structure, we select collections of major factors embedded within response-versus-covariate (Re-Co) dynamics via selection criteria [C1: confirmable] and [C2: irreplaceable], which are based on information theoretic measurements. The two criteria are constructed based on the computing paradigm called Categorical Exploratory Data Analysis (CEDA) and linked to Wiener–Granger causality. All the information theoretical measurements, including conditional mutual information and entropy, are evaluated through the contingency table platform, which primarily rests on the categorical nature within all involved features of any data types: quantitative or qualitative. Our selection task identifies one chief collection, together with several secondary collections of major factors of various orders underlying the targeted Re-Co dynamics. Each selected collection is checked with algorithmically computed reliability against the finite sample phenomenon, and so is each member’s major factor individually. The developments of our selection protocol are illustrated in detail through two experimental examples: a simple one and a complex one. We then apply this protocol to two data sets pertaining to two somewhat related but distinct pitching dynamics of two pitch types: slider and fastball. In particular, we refer to a specific Major League Baseball (MLB) pitcher and we consider data from multiple seasons.

6.
Neuroscience extensively uses information theory to describe neural communication, among other things to calculate the amount of information transferred in neural communication and to attempt to crack its coding. There are fierce debates on how information is represented in the brain and during transmission inside the brain. Neural information theory attempts to use the assumptions of electronic communication, despite the experimental evidence that neural spikes carry information on non-discrete states, that they have a low communication speed, and that the spikes’ timing precision matters. Furthermore, in biology, the communication channel is active, which imposes an additional power bandwidth limitation on neural information transfer. The paper revises the notions needed to describe information transfer in technical and biological communication systems. It argues that biology uses Shannon’s idea outside of its range of validity and introduces an adequate interpretation of information. In addition, the presented time-aware approach to information theory reveals evidence for the role of processes (as opposed to states) in neural operations. The generalized information theory describes both kinds of communication, and the classic theory is a particular case of the generalized theory.

7.
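Xia Shulan (夏菽兰), Zhao Li (赵力). Applied Acoustics (《应用声学》), 2015, 23(5): 1823-1826
The back-propagation (BP) network is the most widely used type of artificial neural network. Applying BP neural networks to nonlinear compensation (for temperature and similar effects) in pressure measurement has significant practical value and markedly improves measurement accuracy. From the perspective of sensor information fusion, a neural network is itself a fusion system. After outlining the basic theory of neural networks, this paper combines the BP neural network principle with multi-sensor information fusion for the object under study, proposes a two-sensor information fusion model based on a BP neural network together with an improved algorithm, builds a standard sample library for training the BP network, and tests and simulates the main technical indicators of the network model. The test results show that the constructed model and its improved algorithm satisfy the specification requirements of a high-precision pressure measuring instrument.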

8.
Recent advances in statistical inference have significantly expanded the toolbox of probabilistic modeling. Historically, probabilistic modeling has been constrained to very restricted model classes, where exact or approximate probabilistic inference is feasible. However, developments in variational inference, a general form of approximate probabilistic inference that originated in statistical physics, have enabled probabilistic modeling to overcome these limitations: (i) Approximate probabilistic inference is now possible over a broad class of probabilistic models containing a large number of parameters, and (ii) scalable inference methods based on stochastic gradient descent and distributed computing engines allow probabilistic modeling to be applied to massive data sets. One important practical consequence of these advances is the possibility to include deep neural networks within probabilistic models, thereby capturing complex non-linear stochastic relationships between the random variables. These advances, in conjunction with the release of novel probabilistic modeling toolboxes, have greatly expanded the scope of applications of probabilistic models, and allowed the models to take advantage of the recent strides made by the deep learning community. In this paper, we provide an overview of the main concepts, methods, and tools needed to use deep neural networks within a probabilistic modeling framework.

9.
We present novel data-processing inequalities relating the mutual information and the directed information in systems with feedback. The internal deterministic blocks within such systems are restricted only to be causal mappings, but are allowed to be non-linear and time varying and, when randomized by their own external random input, can yield any stochastic mapping. These randomized blocks can, for example, represent source encoders, decoders, or even communication channels. Moreover, the involved signals can be arbitrarily distributed. Our first main result relates mutual and directed information and can be interpreted as a law of conservation of information flow. Our second main result is a pair of data-processing inequalities (one the conditional version of the other) between nested pairs of random sequences entirely within the closed loop. Our third main result introduces and characterizes the notion of in-the-loop (ITL) transmission rate for channel coding scenarios in which the messages are internal to the loop. Interestingly, in this case the conventional notions of transmission rate associated with the entropy of the messages and of channel capacity based on maximizing the mutual information between the messages and the output turn out to be inadequate. Instead, as we show, the ITL transmission rate is the unique notion of rate for which a channel code attains zero error probability if and only if such an ITL rate does not exceed the corresponding directed information rate from messages to decoded messages. We apply our data-processing inequalities to show that the supremum of achievable (in the usual channel coding sense) ITL transmission rates is upper bounded by the supremum of the directed information rate across the communication channel. Moreover, we present an example in which this upper bound is attained. Finally, we further illustrate the applicability of our results by discussing how they make possible the generalization of two fundamental inequalities known in networked control literature.

10.
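An information-theoretic method for determining the phase-space reconstruction parameters of chaotic time series   Total citations: 11 (self-citations: 0; citations by others: 11)
Based on the basic principles of information theory, the selection of the delay time and the embedding dimension for phase-space reconstruction of chaotic time series is studied. A symbolic-analysis method is proposed for computing the mutual information function and thereby determining the delay time. Building on this, an information-theoretic method for estimating the embedding dimension is proposed, in which the embedding dimension is determined from how the conditional entropy of the reconstructed vectors varies with the vector dimension. Numerical verification on several typical chaotic dynamical systems shows that the method can determine a suitable embedding dimension for phase-space reconstruction. Keywords: chaos; phase-space reconstruction; mutual information; conditional entropy; symbolic analysis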
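
As a rough illustration of the delay-time step described in this abstract, the sketch below converts a series to symbols, computes the mutual information between the series and a lagged copy of itself, and returns the first local minimum of that curve. The function names, the rank-based equal-probability binning, the number of symbols, and the first-minimum rule are assumptions chosen for illustration; the abstract does not specify the authors' symbolization or selection rule.

```python
import math
from collections import Counter

def symbolize(x, n_symbols=8):
    """Map samples to integer symbols 0..n_symbols-1 by rank (equal-probability bins)."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    symbols = [0] * len(x)
    for rank, i in enumerate(order):
        symbols[i] = rank * n_symbols // len(x)
    return symbols

def mutual_information(a, b):
    """Plug-in mutual information in bits between two equal-length symbol sequences."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    # p(u,v) * log2(p(u,v) / (p(u) p(v))), written with raw counts.
    return sum(c / n * math.log2(c * n / (pa[u] * pb[v]))
               for (u, v), c in pab.items())

def delay_by_first_minimum(x, max_lag=50, n_symbols=8):
    """Return (lag, mi_curve); lag is the first local minimum of I(x_t; x_{t+lag})."""
    s = symbolize(x, n_symbols)
    mi = [mutual_information(s[:-lag], s[lag:]) for lag in range(1, max_lag + 1)]
    for i in range(1, len(mi) - 1):
        if mi[i] < mi[i - 1] and mi[i] <= mi[i + 1]:
            return i + 1, mi
    return mi.index(min(mi)) + 1, mi  # no interior minimum: fall back to the global one
```

The plug-in estimate is biased upward for short series or many symbols, so in practice the symbol count would probably need to be tuned against the series length.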

11.
Biological neural networks for color vision (also known as color appearance models) consist of a cascade of linear + nonlinear layers that modify the linear measurements at the retinal photo-receptors leading to an internal (nonlinear) representation of color that correlates with psychophysical experience. The basic layers of these networks include: (1) chromatic adaptation (normalization of the mean and covariance of the color manifold); (2) change to opponent color channels (PCA-like rotation in the color space); and (3) saturating nonlinearities to obtain perceptually Euclidean color representations (similar to dimension-wise equalization). The Efficient Coding Hypothesis argues that these transforms should emerge from information-theoretic goals. If this hypothesis holds in color vision, the question is what coding gain is due to the different layers of the color appearance networks. In this work, a representative family of color appearance models is analyzed in terms of how the redundancy among the chromatic components is modified along the network and how much information is transferred from the input data to the noisy response. The proposed analysis is performed using data and methods that were not available before: (1) new colorimetrically calibrated scenes in different CIE illuminations for the proper evaluation of chromatic adaptation; and (2) new statistical tools to estimate (multivariate) information-theoretic quantities between multidimensional sets based on Gaussianization. The results confirm that the efficient coding hypothesis holds for current color vision models, and identify the psychophysical mechanisms critically responsible for gains in information transference: opponent channels and their nonlinear nature are more important than chromatic adaptation at the retina.

12.
The varied cognitive abilities and rich adaptive behaviors enabled by the animal nervous system are often described in terms of information processing. This framing raises the issue of how biological neural circuits actually process information, and some of the most fundamental outstanding questions in neuroscience center on understanding the mechanisms of neural information processing. Classical information theory has long been understood to be a natural framework within which information processing can be understood, and recent advances in the field of multivariate information theory offer new insights into the structure of computation in complex systems. In this review, we provide an introduction to the conceptual and practical issues associated with using multivariate information theory to analyze information processing in neural circuits, as well as discussing recent empirical work in this vein. Specifically, we provide an accessible introduction to the partial information decomposition (PID) framework. PID reveals redundant, unique, and synergistic modes by which neurons integrate information from multiple sources. We focus particularly on the synergistic mode, which quantifies the “higher-order” information carried in the patterns of multiple inputs and is not reducible to input from any single source. Recent work in a variety of model systems has revealed that synergistic dynamics are ubiquitous in neural circuitry and show reliable structure–function relationships, emerging disproportionately in neuronal rich clubs, downstream of recurrent connectivity, and in the convergence of correlated activity. We draw on the existing literature on higher-order information dynamics in neuronal networks to illustrate the insights that have been gained by taking an information decomposition perspective on neural activity. Finally, we briefly discuss future promising directions for information decomposition approaches to neuroscience, such as work on behaving animals, multi-target generalizations of PID, and time-resolved local analyses.

13.
With the advent of big data and the popularity of black-box deep learning methods, it is imperative to address the robustness of neural networks to noise and outliers. We propose the use of Winsorization to recover model performances when the data may have outliers and other aberrant observations. We provide a comparative analysis of several probabilistic artificial intelligence and machine learning techniques for supervised learning case studies. Broadly, Winsorization is a versatile technique for accounting for outliers in data. However, different probabilistic machine learning techniques have different levels of efficiency when used on outlier-prone data, with or without Winsorization. We notice that Gaussian processes are extremely vulnerable to outliers, while deep learning techniques in general are more robust.
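
For readers unfamiliar with the term, Winsorization replaces values beyond chosen quantiles with the quantile values themselves rather than discarding them. The following is a minimal sketch, assuming nearest-rank quantiles and hypothetical default cut-offs of 5% and 95%; the abstract does not state which limits or quantile rule the authors use.

```python
def winsorize(values, lower=0.05, upper=0.95):
    """Clip values to the empirical [lower, upper] quantiles (nearest-rank rule)."""
    if not values:
        return []
    ordered = sorted(values)
    n = len(ordered)
    # round() picks the nearest order statistic, so both cut-offs are actual sample values.
    lo = ordered[round(lower * (n - 1))]
    hi = ordered[round(upper * (n - 1))]
    return [min(max(v, lo), hi) for v in values]
```

For example, `winsorize(list(range(10)) + [1000], lower=0.1, upper=0.9)` pulls the outlier 1000 down to 9 and raises 0 to 1 while keeping all eleven observations. SciPy offers a comparable routine, `scipy.stats.mstats.winsorize`, which takes the fractions to clip at each end instead of quantile levels.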

14.
We develop Categorical Exploratory Data Analysis (CEDA) with mimicking to explore and exhibit the complexity of information content that is contained within any data matrix: categorical, discrete, or continuous. Such complexity is shown through visible and explainable serial multiscale structural dependency with heterogeneity. CEDA is developed upon all features’ categorical nature via histogram and it is guided by all features’ associative patterns (order-2 dependence) in a mutual conditional entropy matrix. Higher-order structural dependency of k (≥ 3) features is exhibited through block patterns within heatmaps that are constructed by permuting contingency-kD-lattices of counts. By growing k, the resultant heatmap series contains global and large scales of structural dependency that constitute the data matrix’s information content. When involving continuous features, the principal component analysis (PCA) extracts fine-scale information content from each block in the final heatmap. Our mimicking protocol coherently simulates this heatmap series by preserving global-to-fine scales structural dependency. At every step of the mimicking process, each accepted simulated heatmap is subject to constraints with respect to all of the reliable observed categorical patterns. For reliability and robustness in sciences, CEDA with mimicking enhances data visualization by revealing deterministic and stochastic structures within each scale-specific structural dependency. For inferences in Machine Learning (ML) and Statistics, it clarifies, upon which scales, which covariate feature-groups have major-vs.-minor predictive powers on response features. For the social justice of Artificial Intelligence (AI) products, it checks whether a data matrix incompletely prescribes the targeted system.

15.
In solving challenging pattern recognition problems, deep neural networks have shown excellent performance by forming powerful mappings between inputs and targets, learning representations (features) and making subsequent predictions. A recent tool to help understand how representations are formed is based on observing the dynamics of learning on an information plane using mutual information, linking the input to the representation (I(X;T)) and the representation to the target (I(T;Y)). In this paper, we use an information theoretical approach to understand how Cascade Learning (CL), a method to train deep neural networks layer-by-layer, learns representations, as CL has shown comparable results while saving computation and memory costs. We observe that performance is not linked to information–compression, which differs from observation on End-to-End (E2E) learning. Additionally, CL can inherit information about targets, and gradually specialise extracted features layer-by-layer. We evaluate this effect by proposing an information transition ratio, I(T;Y)/I(X;T), and show that it can serve as a useful heuristic in setting the depth of a neural network that achieves satisfactory accuracy of classification.

16.
Gene regulatory networks (GRNs) control biological processes like pluripotency, differentiation, and apoptosis. Omics methods can identify a large number of putative network components (on the order of hundreds or thousands) but it is possible that in many cases a small subset of genes control the state of GRNs. Here, we explore how the topology of the interactions between network components may indicate whether the effective state of a GRN can be represented by a small subset of genes. We use methods from information theory to model the regulatory interactions in GRNs as cascading and superposing information channels. We propose an information loss function that enables identification of the conditions by which a small set of genes can represent the state of all the other genes in the network. This information-theoretic analysis extends to a measure of free energy change due to communication within the network, which provides a new perspective on the reducibility of GRNs. Both the information loss and relative free energy depend on the density of interactions and edge communication error in a network. Therefore, this work indicates that a loss in mutual information between genes in a GRN is directly coupled to a thermodynamic cost, i.e., a reduction of relative free energy, of the system.

17.
We review the principal information theoretic tools and their use for feature selection, with the main emphasis on classification problems with discrete features. Since it is known that empirical versions of conditional mutual information perform poorly for high-dimensional problems, we focus on various ways of constructing its counterparts and the properties and limitations of such methods. We present a unified way of constructing such measures based on truncation, or truncation and weighting, for the Möbius expansion of conditional mutual information. We also discuss the main approaches to feature selection which apply the introduced measures of conditional dependence, together with the ways of assessing the quality of the obtained vector of predictors. This involves discussion of recent results on asymptotic distributions of empirical counterparts of criteria, as well as advances in resampling.
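
To make the "empirical version of conditional mutual information" concrete, the sketch below computes the plug-in estimate of I(X;Y|Z) for discrete sequences directly from counts. The function name and the choice of bits as the unit are assumptions for illustration; this is the textbook estimator, not one of the truncated or weighted counterparts the review constructs.

```python
import math
from collections import Counter

def conditional_mutual_information(x, y, z):
    """Plug-in estimate of I(X;Y|Z) in bits from three equal-length discrete sequences."""
    n = len(x)
    n_xyz = Counter(zip(x, y, z))
    n_xz = Counter(zip(x, z))
    n_yz = Counter(zip(y, z))
    n_z = Counter(z)
    # p(x,y,z) * log2( p(z) p(x,y,z) / (p(x,z) p(y,z)) ), written with raw counts.
    return sum(c / n * math.log2(c * n_z[w] / (n_xz[(u, w)] * n_yz[(v, w)]))
               for (u, v, w), c in n_xyz.items())
```

When Z is a tuple of many features, most (x, y, z) cells are observed only once or not at all, so the counts stop being reliable estimates of the probabilities. This sparsity is one plausible reading of why the abstract says the empirical estimator performs poorly in high dimensions.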

18.
This paper aims to empirically examine long memory and bi-directional information flow between estimated volatilities of highly volatile time series datasets of five cryptocurrencies. We propose the employment of Garman and Klass (GK), Parkinson’s, Rogers and Satchell (RS), Garman and Klass-Yang and Zhang (GK-YZ), and Open-High-Low-Close (OHLC) volatility estimators to estimate cryptocurrencies’ volatilities. The study applies methods such as mutual information, transfer entropy (TE), effective transfer entropy (ETE), and Rényi transfer entropy (RTE) to quantify the information flow between estimated volatilities. Additionally, Hurst exponent computations examine the existence of long memory in log returns and OHLC volatilities based on simple R/S, corrected R/S, empirical, corrected empirical, and theoretical methods. Our results confirm the long-run dependence and non-linear behavior of all cryptocurrencies’ log returns and volatilities. In our analysis, TE and ETE estimates are statistically significant for all OHLC estimates. We report the highest information flow from BTC to LTC volatility (RS). Similarly, BNB and XRP share the most prominent information flow between volatilities estimated by GK, Parkinson’s, and GK-YZ. The study presents the practicable addition of OHLC volatility estimators for quantifying the information flow and provides an additional choice to compare with other volatility estimators, such as stochastic volatility models.
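
Two of the range-based estimators named in this abstract have short closed forms, sketched below in their commonly cited textbook versions. The function names are hypothetical, and the formulas return an average per-bar variance without annualization; the abstract does not say which exact variants or scaling the authors apply.

```python
import math

def parkinson_variance(highs, lows):
    """Parkinson range-based variance per bar: mean(ln(H/L)^2) / (4 ln 2)."""
    terms = [math.log(h / l) ** 2 for h, l in zip(highs, lows)]
    return sum(terms) / (4 * math.log(2) * len(terms))

def garman_klass_variance(opens, highs, lows, closes):
    """Garman-Klass variance per bar:
    mean(0.5 * ln(H/L)^2 - (2 ln 2 - 1) * ln(C/O)^2)."""
    k = 2 * math.log(2) - 1
    terms = [0.5 * math.log(h / l) ** 2 - k * math.log(c / o) ** 2
             for o, h, l, c in zip(opens, highs, lows, closes)]
    return sum(terms) / len(terms)
```

Taking the square root of either result gives a volatility series that could then be passed to information-flow measures such as transfer entropy.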

19.
S. M. Lee, O. M. Kwon, Ju H. Park. Chinese Physics B (《中国物理 B》), 2010, 19(5): 050507
In this paper, new delay-dependent stability criteria for asymptotic stability of neural networks with time-varying delays are derived. The stability conditions are represented in terms of linear matrix inequalities (LMIs) by constructing a new Lyapunov-Krasovskii functional. The proposed functional has an augmented quadratic form with states as well as the nonlinear function to consider the sector and the slope constraints. The reduced conservativeness of the proposed stability criteria can be guaranteed by using convex properties of the nonlinear function which satisfies the sector and slope bound. Numerical examples are presented to show the effectiveness of the proposed method.

20.
Finite-time stability of a class of fractional-order neural networks is investigated in this paper. By Laplace transform, the generalized Gronwall inequality and estimates of Mittag-Leffler functions, sufficient conditions are presented to ensure the finite-time stability of such neural models with the Caputo fractional derivatives. Furthermore, results about asymptotical stability of fractional-order neural models are also obtained.
