
1.

Information Bottleneck Theory Based Exploration of Cascade Learning

Xin Du Katayoun Farrahi Mahesan Niranjan 《Entropy (Basel, Switzerland)》2021,23(10)

In solving challenging pattern recognition problems, deep neural networks have shown excellent performance by forming powerful mappings between inputs and targets, learning representations (features) and making subsequent predictions. A recent tool to help understand how representations are formed is based on observing the dynamics of learning on an information plane using mutual information, linking the input to the representation ($I(X; T)$) and the representation to the target ($I(T; Y)$). In this paper, we use an information theoretical approach to understand how Cascade Learning (CL), a method to train deep neural networks layer-by-layer, learns representations, as CL has shown comparable results while saving computation and memory costs. We observe that performance is not linked to information–compression, which differs from observation on End-to-End (E2E) learning. Additionally, CL can inherit information about targets, and gradually specialise extracted features layer-by-layer. We evaluate this effect by proposing an information transition ratio, $I(T; Y) / I(X; T)$, and show that it can serve as a useful heuristic in setting the depth of a neural network that achieves satisfactory accuracy of classification.
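As a rough illustration of the information transition ratio $I(T; Y) / I(X; T)$ named in the abstract above, the following sketch computes mutual information from small discrete joint probability tables. It is not the authors' estimator: the abstract does not say how mutual information is estimated, and the function names `mutual_information` and `transition_ratio` and the toy tables are assumptions made for this example only.

```python
import math


def mutual_information(joint):
    """Mutual information (in bits) of a discrete joint table joint[a][b].

    Assumption for this sketch: the table entries are probabilities that
    sum to 1. Real representations T would first need to be discretised.
    """
    row_marginal = [sum(row) for row in joint]
    col_marginal = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0:  # zero-probability cells contribute nothing
                mi += p * math.log2(p / (row_marginal[i] * col_marginal[j]))
    return mi


def transition_ratio(joint_xt, joint_ty):
    """Ratio I(T;Y) / I(X;T) for two joint tables (hypothetical helper)."""
    return mutual_information(joint_ty) / mutual_information(joint_xt)


# Toy tables: T copies X exactly, and T predicts Y with 80% accuracy.
joint_xt = [[0.5, 0.0], [0.0, 0.5]]
joint_ty = [[0.4, 0.1], [0.1, 0.4]]
ratio = transition_ratio(joint_xt, joint_ty)
```

In this toy case $I(X; T)$ is exactly one bit, so the ratio equals $I(T; Y)$; with real networks both quantities would have to be estimated from samples.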

2.

Analytic Function Approximation by Path-Norm-Regularized Deep Neural Networks

Aleksandr Beknazaryan 《Entropy (Basel, Switzerland)》2022,24(8)

We show that neural networks with an absolute value activation function and with network path norm, network sizes and network weights having logarithmic dependence on $1/\varepsilon$ can $\varepsilon$-approximate functions that are analytic on certain regions of $\mathbb{C}^{d}$.

3.

Learnability of the Boolean Innerproduct in Deep Neural Networks

Mehmet Erdal Friedhelm Schwenker 《Entropy (Basel, Switzerland)》2022,24(8)

In this paper, we study the learnability of the Boolean inner product by a systematic simulation study. The family of the Boolean inner product function is known to be representable by neural networks of threshold neurons of depth 3 with only $2n + 1$ units ($n$ being the input dimension), whereas an exact representation by a depth 2 network cannot possibly be of polynomial size. This result can be seen as a strong argument for deep neural network architectures. In our study, we found that this depth 3 architecture of the Boolean inner product is difficult to train, much harder than the depth 2 network, at least for the small input size scenarios $n \leq 16$. Nonetheless, the accuracy of the deep architecture increased with the dimension of the input space to 94% on average, which means that multiple restarts are needed to find the compact depth 3 architecture. Replacing the fully connected first layer by a partially connected layer (a kind of convolutional layer sparsely connected with weight sharing) can significantly improve the learning performance up to 99% accuracy in simulations. Another way to improve the learnability of the compact depth 3 representation of the inner product could be achieved by adding just a few additional units into the first hidden layer.

4.

Entanglement-Structured LSTM Boosts Chaotic Time Series Forecasting

Xiangyi Meng Tong Yang 《Entropy (Basel, Switzerland)》2021,23(11)

Traditional machine-learning methods are inefficient in capturing chaos in nonlinear dynamical systems, especially when the time difference $\Delta t$ between consecutive steps is so large that the extracted time series looks apparently random. Here, we introduce a new long-short-term-memory (LSTM)-based recurrent architecture by tensorizing the cell-state-to-state propagation therein, maintaining the long-term memory feature of LSTM, while simultaneously enhancing the learning of short-term nonlinear complexity. We stress that the global minima of training can be most efficiently reached by our tensor structure where all nonlinear terms, up to some polynomial order, are treated explicitly and weighted equally. The efficiency and generality of our architecture are systematically investigated and tested through theoretical analysis and experimental examinations. In our design, we have explicitly used two different many-body entanglement structures, matrix product states (MPS) and the multiscale entanglement renormalization ansatz (MERA), as physics-inspired tensor decomposition techniques, from which we find that MERA generally performs better than MPS, hence conjecturing that the learnability of chaos is determined not only by the number of free parameters but also the tensor complexity, recognized as how entanglement entropy scales with varying matricization of the tensor.

5.

Hidden Hypergraphs, Error-Correcting Codes, and Critical Learning in Hopfield Networks

Christopher Hillar Tenzin Chan Rachel Taubman David Rolnick 《Entropy (Basel, Switzerland)》2021,23(11)

In 1943, McCulloch and Pitts introduced a discrete recurrent neural network as a model for computation in brains. The work inspired breakthroughs such as the first computer design and the theory of finite automata. We focus on learning in Hopfield networks, a special case with symmetric weights and fixed-point attractor dynamics. Specifically, we explore minimum energy flow (MEF) as a scalable convex objective for determining network parameters. We catalog various properties of MEF, such as biological plausibility, and then compare to classical approaches in the theory of learning. Trained Hopfield networks can perform unsupervised clustering and define novel error-correcting coding schemes. They also efficiently find hidden structures (cliques) in graph theory. We extend this known connection from graphs to hypergraphs and discover $n$-node networks with robust storage of $2^{\Omega(n^{1 - \epsilon})}$ memories for any $\epsilon > 0$. In the case of graphs, we also determine a critical ratio of training samples at which networks generalize completely.

6.

Partially connected feedforward neural networks on Apollonian networks

W.K. Wong Z.X. Guo 《Physica A》2010,389(22):5298-5307

This paper presents a novel and data-independent method to construct a type of partially connected feedforward neural network (FNN). The proposed networks, called Apollonian network-based partially connected FNNs (APFNNs), are constructed in terms of the structures of two-dimensional deterministic Apollonian networks. The APFNNs are then applied in various experiments to solve function approximation, forecasting and classification problems. Their results are compared with those generated by partially connected FNNs with random connectivity (RPFNNs), different learning algorithm-based traditional FNNs and other benchmark methods. The results demonstrate that the proposed APFNNs have a good capacity to fit complicated input and output relations, and provide better generalization performance than traditional FNNs and RPFNNs. The APFNNs also demonstrate faster training speed in each epoch than traditional FNNs.

7.

Complex Networks and the b-Value Relationship Using the Degree Probability Distribution: The Case of Three Mega-Earthquakes in Chile in the Last Decade

Fernanda Andrea Martín Denisse Pastén 《Entropy (Basel, Switzerland)》2022,24(3)

Studies from complex networks have increased in recent years, and different applications have been utilized in geophysics. Seismicity represents a complex and dynamic system that has open questions related to earthquake occurrence. In this work, we carry out an analysis to understand the physical interpretation of two metrics of complex systems: the slope of the probability distribution of connectivity ($\gamma$) and the betweenness centrality (BC). To conduct this study, we use seismic datasets recorded from three large earthquakes that occurred in Chile: the $M_{w}$ 8.2 Iquique earthquake (2014), the $M_{w}$ 8.4 Illapel earthquake (2015) and the $M_{w}$ 8.8 Cauquenes earthquake (2010). We find a linear relationship between the $b$-value and the $\gamma$ value, with an interesting finding about the ratio between the $b$-value and $\gamma$ that gives a value of ∼0.4. We also explore a possible physical meaning of the BC. As a first result, we find that the behaviour of this metric is not the same for the three large earthquakes, and it seems that this metric is not related to the $b$-value and coupling of the zone. We present the first results about the physical meaning of metrics from complex networks in seismicity. These first results are promising, and we hope to be able to carry out further analyses to understand the physics that these complex network parameters represent in a seismic system.

8.

A Quantitative Comparison between Shannon and Tsallis–Havrda–Charvat Entropies Applied to Cancer Outcome Prediction

Thibaud Brochet Jérôme Lapuyade-Lahorgue Alexandre Huat Sébastien Thureau David Pasquier Isabelle Gardin Romain Modzelewski David Gibon Juliette Thariat Vincent Grégoire Pierre Vera Su Ruan 《Entropy (Basel, Switzerland)》2022,24(4)

In this paper, we propose to quantitatively compare loss functions based on parameterized Tsallis–Havrda–Charvat entropy and classical Shannon entropy for the training of a deep network in the case of small datasets which are usually encountered in medical applications. Shannon cross-entropy is widely used as a loss function for most neural networks applied to the segmentation, classification and detection of images. Shannon entropy is a particular case of Tsallis–Havrda–Charvat entropy. In this work, we compare these two entropies through a medical application for predicting recurrence in patients with head–neck and lung cancers after treatment. Based on both CT images and patient information, a multitask deep neural network is proposed to perform a recurrence prediction task using cross-entropy as a loss function and an image reconstruction task. Tsallis–Havrda–Charvat cross-entropy is a parameterized cross-entropy with the parameter $\alpha$. Shannon entropy is a particular case of Tsallis–Havrda–Charvat entropy for $\alpha = 1$. The influence of this parameter on the final prediction results is studied. In this paper, the experiments are conducted on two datasets including in total 580 patients, of whom 434 suffered from head–neck cancers and 146 from lung cancers. The results show that Tsallis–Havrda–Charvat entropy can achieve better performance in terms of prediction accuracy with some values of $\alpha$.
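The statement that Shannon entropy is the $\alpha = 1$ case of the parameterized entropy can be checked numerically. The sketch below assumes the common Tsallis normalisation $H_{\alpha}(p) = \bigl(1 - \sum_i p_i^{\alpha}\bigr) / (\alpha - 1)$; the abstract does not give the exact form or the cross-entropy loss used in the paper, so the function names and this normalisation are assumptions for illustration only.

```python
import math


def shannon_entropy(p):
    """Shannon entropy in nats of a discrete distribution p."""
    return -sum(x * math.log(x) for x in p if x > 0)


def thc_entropy(p, alpha):
    """Tsallis-style entropy (1 - sum p_i^alpha) / (alpha - 1).

    Assumed normalisation; at alpha == 1 the formula is undefined,
    so the Shannon limit is returned instead.
    """
    if abs(alpha - 1.0) < 1e-12:
        return shannon_entropy(p)
    return (1.0 - sum(x ** alpha for x in p if x > 0)) / (alpha - 1.0)


p = [0.5, 0.25, 0.25]
near_one = thc_entropy(p, 1.0001)   # should be close to the Shannon value
shannon = shannon_entropy(p)
quadratic = thc_entropy(p, 2.0)     # 1 - (0.25 + 0.0625 + 0.0625)
```

Values of $\alpha$ close to 1 reproduce the Shannon entropy to several decimal places, which is the sense in which the parameter interpolates away from the classical loss.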

9.

Discrete-Time Memristor Model for Enhancing Chaotic Complexity and Application in Secure Communication

Wenhao Yan Wenjie Dong Peng Wang Ya Wang Yanan Xing Qun Ding 《Entropy (Basel, Switzerland)》2022,24(7)

The physical implementation of the continuous-time memristor makes it widely used in chaotic circuits, whereas the discrete-time memristor has not received much attention. In this paper, the backward-Euler method is used to discretize the $\mathrm{TiO}_{2}$ memristor model, and the discretized model also meets the three fingerprints characteristics of the generalized memristor. The short period phenomenon and uneven output distribution of one-dimensional chaotic systems affect their applications in some fields, so it is necessary to improve the dynamic characteristics of one-dimensional chaotic systems. In this paper, a two-dimensional discrete-time memristor model is obtained by linear coupling of the proposed $\mathrm{TiO}_{2}$ memristor model and one-dimensional chaotic systems. Since the two-dimensional model has infinite fixed points, the stability of these fixed points depends on the coupling parameters and the initial state of the discrete $\mathrm{TiO}_{2}$ memristor model. Furthermore, the dynamic characteristics of one-dimensional chaotic systems can be enhanced by the proposed method. Finally, we apply the generated chaotic sequence to secure communication.

10.

Adaptive Hurst-Sensitive Active Queue Management

Dariusz Marek Jakub Szyguła Adam Domański Joanna Domańska Katarzyna Filus Marta Szczygieł 《Entropy (Basel, Switzerland)》2022,24(3)

An Active Queue Management (AQM) mechanism, recommended by the Internet Engineering Task Force (IETF), increases the efficiency of network transmission. An example of this type of algorithm can be the Random Early Detection (RED) algorithm. The behavior of the RED algorithm strictly depends on the correct selection of its parameters. This selection may be performed automatically depending on the network conditions. The mechanisms that adjust their parameters to the network conditions are called the adaptive ones. The example can be the Adaptive RED (ARED) mechanism, which adjusts its parameters taking into consideration the traffic intensity. In our paper, we propose to use an additional traffic parameter, the degree of self-similarity expressed using the Hurst parameter, to adjust the AQM parameters. In our study, we propose the modifications of the well-known AQM algorithms: ARED and fractional order $PI^{\alpha}D^{\beta}$, and the algorithms based on neural networks that are used to automatically adjust the AQM parameters using the traffic intensity and its degree of self-similarity. We use the Fluid Flow approximation and the discrete event simulation to evaluate the behavior of queues controlled by the proposed adaptive AQM mechanisms and compare the results with those obtained with their basic counterparts. In our experiments, we analyzed the average queue occupancies and packet delays in the communication node. The obtained results show that considering the degree of self-similarity of network traffic in the process of AQM parameters determination enabled us to decrease the average queue occupancy and the number of rejected packets, as well as to reduce the transmission latency.

11.

A general geometric growth model for pseudofractal scale-free web

Zhongzhi Zhang Lili Rong Shuigeng Zhou 《Physica A》2007


12.

Exploring self-similarity of complex cellular networks: The edge-covering method with simulated annealing and log-periodic sampling

Wei-Xing Zhou Zhi-Qiang Jiang Didier Sornette 《Physica A》2007

Song et al. [Self-similarity of complex networks, Nature 433 (2005) 392–395] have recently used a version of the box-counting method, called the node-covering method, to quantify the self-similar properties of 43 cellular networks: the minimal number $N_{V}$ of boxes of size $\ell$ needed to cover all the nodes of a cellular network was found to scale as the power-law $N_{V} \sim (\ell + 1)^{-D_{V}}$ with a fractal dimension $D_{V} = 3.53 \pm 0.26$. We implement an alternative box-counting method in terms of the minimum number $N_{E}$ of edge-covering boxes which is well-suited to cellular networks, where the search over different covering sets is performed with the simulated annealing algorithm. The method also takes into account a possible discrete scale symmetry to optimize the sampling rate and minimize possible biases in the estimation of the fractal dimension. With this methodology, we find that $N_{E}$ scales with respect to $\ell$ as a power-law $N_{E} \sim \ell^{-D_{E}}$ with $D_{E} = 2.67 \pm 0.15$ for the 43 cellular networks previously analyzed by Song et al. [Self-similarity of complex networks, Nature 433 (2005) 392–395]. Bootstrap tests suggest that the analyzed cellular networks may have a significant log-periodicity qualifying a discrete hierarchy with a scaling ratio close to 2.

13.

Scaling and correlations in three bus-transport networks of China

Xinping Xu Junhui Hu Feng Liu Lianshou Liu 《Physica A》2007

We report the statistical properties of three bus-transport networks (BTN) in three different cities of China. These networks are composed of a set of bus lines and stations serviced by these. Network properties, including the degree distribution, clustering and average path length are studied in different definitions of network topology. We explore scaling laws and correlations that may govern intrinsic features of such networks. Besides, we create a weighted network representation for BTN with lines mapped to nodes and number of common stations to weights between lines. In such a representation, the distributions of degree, strength and weight are investigated. A linear behavior between strength and degree, $s(k) \sim k$, is also observed.

14.

Social Influence Maximization in Hypergraphs

Alessia Antelmi Gennaro Cordasco Carmine Spagnuolo Przemysław Szufel 《Entropy (Basel, Switzerland)》2021,23(7)

This work deals with a generalization of the minimum Target Set Selection (TSS) problem, a key algorithmic question in information diffusion research due to its potential commercial value. First proposed by Kempe et al., the TSS problem is based on a linear threshold diffusion model defined on an input graph with node thresholds, quantifying the hardness to influence each node. The goal is to find the smallest set of items that can influence the whole network according to the diffusion model defined. This study generalizes the TSS problem on networks characterized by many-to-many relationships modeled via hypergraphs. Specifically, we introduce a linear threshold diffusion process on such structures, which evolves as follows. Let $H = (V, E)$ be a hypergraph. At the beginning of the process, the nodes in a given set $S \subseteq V$ are influenced. Then, at each iteration, (i) the influenced hyperedges set is augmented by all edges having a sufficiently large number of influenced nodes; (ii) consequently, the set of influenced nodes is enlarged by all the nodes having a sufficiently large number of already influenced hyperedges. The process ends when no new nodes can be influenced. Exploiting this diffusion model, we define the minimum Target Set Selection problem on hypergraphs (TSSH). Since the problem is NP-hard (as it generalizes the TSS problem), we introduce four heuristics and provide an extensive evaluation on real-world networks.
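The two-step diffusion process described in the abstract above can be sketched in a few lines. This is only an illustration of the stated rule, not the authors' implementation: the abstract says "a sufficiently large number" without giving the thresholds, so the per-edge and per-node integer thresholds, the dictionary-based hypergraph encoding and the function name `hypergraph_diffusion` are all assumptions.

```python
def hypergraph_diffusion(hyperedges, seeds, node_threshold, edge_threshold):
    """Run a linear threshold diffusion on a hypergraph until it stabilises.

    hyperedges:      dict edge_id -> set of member nodes
    seeds:           iterable of initially influenced nodes (the set S)
    node_threshold:  dict node -> number of influenced incident hyperedges
                     required to influence the node (assumed absolute count)
    edge_threshold:  dict edge_id -> number of influenced member nodes
                     required to influence the hyperedge (assumed absolute count)
    Returns the final sets of influenced nodes and influenced hyperedges.
    """
    influenced_nodes = set(seeds)
    influenced_edges = set()

    # Incidence map: node -> hyperedges containing it.
    incident = {}
    for edge, members in hyperedges.items():
        for node in members:
            incident.setdefault(node, set()).add(edge)

    while True:
        # Step (i): add hyperedges with enough influenced member nodes.
        for edge, members in hyperedges.items():
            if edge not in influenced_edges:
                if len(members & influenced_nodes) >= edge_threshold[edge]:
                    influenced_edges.add(edge)

        # Step (ii): add nodes with enough influenced incident hyperedges.
        new_nodes = {
            node
            for node, edges in incident.items()
            if node not in influenced_nodes
            and len(edges & influenced_edges) >= node_threshold[node]
        }
        if not new_nodes:  # no new node can be influenced: process ends
            return influenced_nodes, influenced_edges
        influenced_nodes |= new_nodes


# Toy hypergraph with three hyperedges over six nodes.
edges = {"e1": {1, 2, 3}, "e2": {3, 4}, "e3": {4, 5, 6}}
edge_thr = {"e1": 2, "e2": 1, "e3": 2}
node_thr = {v: 1 for v in range(1, 7)}
nodes_out, edges_out = hypergraph_diffusion(edges, {1, 2}, node_thr, edge_thr)
```

With the seed set {1, 2}, the cascade reaches nodes 3 and 4 through `e1` and `e2` but stalls at `e3`, which needs two influenced members. A target set in the sense of TSSH is a seed set for which the returned node set is the whole of $V$.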

15.

Modelling and Analysis of the Epidemic Model under Pulse Charging in Wireless Rechargeable Sensor Networks

Guiyun Liu Ziyi Huang Xilai Wu Zhongwei Liang Fenghuo Hong Xiaokai Su 《Entropy (Basel, Switzerland)》2021,23(8)

With the development of wireless sensor networks (WSNs), energy constraints and network security have become the main problems. This paper discusses the dynamic of the Susceptible, Infected, Low-energy, Susceptible model under pulse charging (SILS-P) in wireless rechargeable sensor networks. After the construction of the model, the local stability and global stability of the malware-free $T$-period solution of the model are analyzed, and the threshold $R_{0}$ is obtained. Then, using the comparison theorem and Floquet theorem, we obtain the relationship between $R_{0}$ and the stability. In order to make the conclusion more intuitive, we use simulation to reveal the impact of parameters on $R_{0}$. In addition, the paper discusses the continuous charging model, and reveals its dynamic by simulation. Finally, the paper compares three charging strategies: pulse charging, continuous charging and non-charging and obtains the relationship between their threshold values and system parameters.

16.

Forecasting COVID-19 Epidemic Trends by Combining a Neural Network with Rt Estimation

Pietro Cinaglia Mario Cannataro 《Entropy (Basel, Switzerland)》2022,24(7)

On 31 December 2019, a cluster of pneumonia cases of unknown etiology was reported in Wuhan (China). The cases were declared to be Coronavirus Disease 2019 (COVID-19) by the World Health Organization (WHO). COVID-19 has been defined as SARS Coronavirus 2 (SARS-CoV-2). Some countries, e.g., Italy, France, and the United Kingdom (UK), have been subjected to frequent restrictions for preventing the spread of infection, contrary to other ones, e.g., the United States of America (USA) and Sweden. The restrictions afflicted the evolution of trends with several perturbations that destabilized its normal evolution. Globally, $R_{t}$ has been used to estimate time-varying reproduction numbers during epidemics. Methods: This paper presents a solution based on Deep Learning (DL) for the analysis and forecasting of epidemic trends in new positive cases of SARS-CoV-2 (COVID-19). It combined a neural network (NN) and an $R_{t}$ estimation by adjusting the data produced by the output layer of the NN on the related $R_{t}$ estimation. Results: Tests were performed on datasets related to the following countries: Italy, the USA, France, the UK, and Sweden. Positive case registration was retrieved between 24 February 2020 and 11 January 2022. Tests performed on the Italian dataset showed that our solution reduced the Mean Absolute Percentage Error (MAPE) by 28.44%, 39.36%, 22.96%, 17.93%, 28.10%, and 24.50% compared to other ones with the same configuration but that were based on the LSTM, GRU, RNN, ARIMA (1,0,3), and ARIMA (7,2,4) models, or an NN without applying the $R_{t}$ as a corrective index. It also reduced MAPE by 17.93%, the Mean Absolute Error (MAE) by 34.37%, and the Root Mean Square Error (RMSE) by 43.76% compared to the same model without the adjustment performed by the $R_{t}$. Furthermore, it allowed an average MAPE reduction of 5.37%, 63.10%, 17.84%, and 14.91% on the datasets related to the USA, France, the UK, and Sweden, respectively.

17.

Complex scale-free networks with tunable power-law exponent and clustering

E.R. Colman G.J. Rodgers 《Physica A》2013

We introduce a network evolution process motivated by the network of citations in the scientific literature. In each iteration of the process a node is born and directed links are created from the new node to a set of target nodes already in the network. This set includes $m$ “ambassador” nodes and $l$ of each ambassador’s descendants where $m$ and $l$ are random variables selected from any choice of distributions $p_{l}$ and $q_{m}$. The process mimics the tendency of authors to cite varying numbers of papers included in the bibliographies of the other papers they cite. We show that the degree distributions of the networks generated after a large number of iterations are scale-free and derive an expression for the power-law exponent. In a particular case of the model where the number of ambassadors is always the constant $m$ and the number of selected descendants from each ambassador is the constant $l$, the power-law exponent is $(2l + 1)/l$. For this example we derive expressions for the degree distribution and clustering coefficient in terms of $l$ and $m$. We conclude that the proposed model can be tuned to have the same power law exponent and clustering coefficient of a broad range of the scale-free distributions that have been studied empirically.

18.

The Generalized Euler Characteristics of the Graphs Split at Vertices

Omer Farooq Michał Ławniczak Afshin Akhshani Szymon Bauch Leszek Sirko 《Entropy (Basel, Switzerland)》2022,24(3)


19.

Analysis of cascading failure in complex power networks under the load local preferential redistribution rule (cited by: 1; self-citations: 0; citations by others: 1)

Du Qu Wei Xiao Shu Luo Bo Zhang 《Physica A》2012

In recent years several global blackouts have drawn a lot of attention to security problems in electric power transmission systems. Here we analyze the cascading failure in complex power networks based on the local preferential redistribution rule of the broken node’s load, where the weight of a node is correlated with its link degree $k$ as $k^{\beta}$. It is found that there exists a threshold $\alpha^{*}$ such that cascading failure is induced and enhanced when the value of tolerance parameter is smaller than the threshold. It is also found that the larger $\beta$ is the more robust the power network is.

20.

On the emergence of scaling in weighted networks

W Jeżewski 《Physica A》2007
