Similar Documents
 20 similar documents retrieved (search time: 16 ms)
1.
2.
汪璐 《物理》2017,46(9):597-605
Deep learning refers to a class of machine learning algorithms that learn the intrinsic representations of complex data through multiple layers of information abstraction. In recent years, deep learning algorithms have made leaps of progress in artificial intelligence fields such as object recognition and localization and speech recognition. This article first introduces the basic principles of deep learning algorithms and the main motivations for applying them to high-energy physics computing. It then reviews, with concrete examples, applications of deep learning models such as convolutional neural networks, recurrent neural networks, and generative adversarial networks. Finally, it surveys the current status of, and open problems in, integrating deep learning with existing high-energy physics computing environments, along with some reflections.

3.
In many decision-making scenarios, ranging from recreational activities to healthcare and policing, the use of artificial intelligence coupled with the ability to learn from historical data is becoming ubiquitous. This widespread adoption of automated systems is accompanied by increasing concerns regarding their ethical implications. Fundamental rights, such as those requiring the preservation of privacy, prohibiting discrimination based on sensitive attributes (e.g., gender, ethnicity, political/sexual orientation), or entitling individuals to an explanation for a decision, are undermined daily by the use of increasingly complex, less understandable, yet more accurate learning algorithms. For this reason, in this work we move toward the development of systems able to ensure trustworthiness by delivering privacy, fairness, and explainability by design. In particular, we show that it is possible to learn from data while simultaneously preserving the privacy of individuals thanks to the use of Homomorphic Encryption, ensuring fairness by learning a fair representation from the data, and ensuring explainable decisions with local and global explanations, without compromising the accuracy of the final models. We test our approach on a widespread but still controversial application, namely face recognition, using the recent FairFace dataset to prove the validity of our approach.
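To make the privacy ingredient concrete, here is a toy, textbook Paillier scheme (tiny primes, not secure, and not the paper's implementation) illustrating the property Homomorphic Encryption provides: arithmetic carried out directly on ciphertexts, so a party can aggregate values it never sees in the clear.

```python
# Toy Paillier cryptosystem (textbook scheme, tiny primes, NOT secure) showing
# additively homomorphic encryption: the product of two ciphertexts decrypts to
# the sum of the plaintexts.
import math, random

p, q = 1009, 1013                      # toy primes; real keys use ~2048-bit moduli
n, n2 = p * q, (p * q) ** 2
g, lam = n + 1, math.lcm(p - 1, q - 1)
L = lambda u: (u - 1) // n
mu = pow(L(pow(g, lam, n2)), -1, n)    # modular inverse of L(g^lambda mod n^2)

def encrypt(m):
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:         # r must be coprime to n
        r = random.randrange(1, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    return (L(pow(c, lam, n2)) * mu) % n

c1, c2 = encrypt(20), encrypt(22)
print(decrypt((c1 * c2) % n2))         # 42: the sum was computed on encrypted values
```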

4.
The advancement of sensing technologies coupled with the rapid progress in big data analysis has ushered in a new era in intelligent transport and smart city applications. In this context, transportation mode detection (TMD) of mobile users is a field that has gained significant traction in recent years. In this paper, we present a deep learning approach for transportation mode detection using multimodal sensor data elicited from user smartphones. The approach is based on Long Short-Term Memory (LSTM) networks and Bayesian optimization of their parameters. We conducted an extensive experimental evaluation of the proposed approach against a multitude of machine learning approaches, including state-of-the-art methods, and it attains very high recognition rates. We also discuss issues regarding feature correlation and the impact of dimensionality reduction.
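As an illustration only (not the authors' network; the window length, channel count, and hidden size are made-up placeholders of the kind Bayesian optimization would tune), a minimal LSTM classifier over windows of multimodal smartphone sensor data might look like this:

```python
# Minimal sketch of an LSTM classifier for transportation-mode windows.
import torch
import torch.nn as nn

class TMDNet(nn.Module):
    def __init__(self, n_features=9, hidden=64, n_modes=5):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_modes)

    def forward(self, x):            # x: (batch, time_steps, n_features)
        _, (h, _) = self.lstm(x)     # h: (1, batch, hidden), last hidden state
        return self.head(h[-1])      # class logits per window

# Forward pass on a dummy batch of sensor windows.
model = TMDNet()
logits = model(torch.randn(8, 100, 9))   # 8 windows, 100 time steps, 9 channels
print(logits.shape)                      # torch.Size([8, 5])
```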

5.
Adversarial examples are one of the most intriguing topics in modern deep learning. Imperceptible perturbations to the input can fool robust models. In relation to this problem, attack and defense methods are being developed almost on a daily basis. In parallel, efforts are being made to simply point out when an input image is an adversarial example. This can help prevent potential issues, as the failure cases are easily recognizable by humans. The proposal in this work is to study how chaos theory methods can help distinguish adversarial examples from regular images. Our work is based on the assumption that deep networks behave as chaotic systems, and adversarial examples are the main manifestation of this (in the sense that a slight input variation produces a totally different output). In our experiments, we show that the Lyapunov exponents (an established measure of chaoticity), which have recently been proposed for the classification of adversarial examples, are not robust to image processing transformations that alter image entropy. Furthermore, we show that entropy can complement Lyapunov exponents in such a way that the discriminating power is significantly enhanced. The proposed method achieves 65% to 100% accuracy detecting adversarial examples under a wide range of attacks (for example: CW, PGD, Spatial, HopSkip) on the MNIST dataset, with similar results when entropy-changing image processing methods (such as Equalization, Speckle and Gaussian noise) are applied. This is also corroborated on two other datasets, Fashion-MNIST and CIFAR-10. These results indicate that classifiers can enhance their robustness against the adversarial phenomenon and can be applied in a wide variety of conditions that potentially match real-world cases as well as other threatening scenarios.
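A minimal sketch of the kind of entropy feature referred to above, assuming 8-bit grayscale images (the binning and the synthetic images are illustrative, not the paper's setup):

```python
# Shannon entropy of the pixel-intensity histogram, usable as a complementary
# feature alongside Lyapunov exponents.
import numpy as np

def image_entropy(img, bins=256):
    hist, _ = np.histogram(img.ravel(), bins=bins, range=(0, 255))
    p = hist / hist.sum()
    p = p[p > 0]                                  # drop empty bins to avoid log(0)
    return float(-(p * np.log2(p)).sum())         # entropy in bits

clean = np.full((28, 28), 128.0)                  # flat image: minimal entropy
noisy = np.clip(clean + np.random.normal(0, 25, clean.shape), 0, 255)
print(image_entropy(clean), image_entropy(noisy)) # added noise raises the entropy
```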

6.
This review looks at some of the central relationships between artificial intelligence, psychology, and economics through the lens of information theory, specifically focusing on formal models of decision theory. In doing so we look at a particular approach that each field has adopted and how information theory has informed the development of the ideas of each field. A key theme is expected utility theory, its connection to information theory, the Bayesian approach to decision-making, and forms of (bounded) rationality. What emerges from this review is a broadly unified formal perspective derived from three very different starting points that reflect the unique principles of each field. Each of the three approaches reviewed can, in principle at least, be implemented in a computational model in such a way that, with sufficient computational power, they could be compared with human abilities in complex tasks. However, a central critique that can be applied to all three approaches was first put forward by Savage in The Foundations of Statistics and recently brought to the fore by the economist Binmore: Bayesian approaches to decision-making work in what Savage called ‘small worlds’ but cannot work in ‘large worlds’. This point, in various different guises, is central to some of the current debates about the power of artificial intelligence and its relationship to human-like learning and decision-making. Recent work on artificial intelligence has gone some way to bridging this gap, but significant questions remain to be answered in all three fields in order to make progress in producing realistic models of human decision-making in the real world in which we live.
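For reference, the standard statement of the Bayesian expected-utility rule that serves as the review's common thread can be written as follows (generic notation, not taken from the review itself):

```latex
% Subjective expected utility with a Bayesian posterior over states of the world:
% choose the action a* that maximizes posterior-expected utility.
\[
  a^{*} = \arg\max_{a \in A} \sum_{s} p(s \mid d)\, U(a, s),
  \qquad
  p(s \mid d) = \frac{p(d \mid s)\, p(s)}{\sum_{s'} p(d \mid s')\, p(s')} .
\]
```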

7.
When computers started to become a dominant part of technology around the 1950s, fundamental questions about reliable designs and robustness were of great relevance. Their development gave rise to the exploration of new questions, such as what made brains reliable (since neurons can die) and how computers could draw inspiration from neural systems. In parallel, the first artificial neural networks came to life. Since then, the comparative view of brains and computers has developed in new, sometimes unexpected directions. With the rise of deep learning and the development of connectomics, an evolutionary look at how both hardware and neural complexity have evolved or been designed is required. In this paper, we argue that important similarities have resulted both from convergent evolution (the inevitable outcome of architectural constraints) and from inspiration of hardware and software principles guided by toy pictures of neurobiology. Moreover, dissimilarities and gaps originate from major innovations that paved the way to biological computing (including brains) but that are completely absent within the artificial domain. As occurs within synthetic biocomputation, we can also ask whether alternative minds can emerge from A.I. designs. Here, we take an evolutionary view of the problem and discuss the remarkable convergences between living and artificial designs and the pre-conditions needed to achieve artificial intelligence.

8.
Gabbard et al. have demonstrated that convolutional neural networks can achieve the sensitivity of matched filtering in the recognition of gravitational-wave signals with high efficiency [Phys. Rev. Lett. 120, 141103 (2018)]. In this work we show that their model can be optimized for better accuracy. Convolutional neural networks typically have alternating convolutional and max-pooling layers, followed by a small number of fully connected layers. We increase the stride in the max-pooling layer by 1 and follow it with a dropout layer to alleviate overfitting in the original model. We find that these optimizations effectively increase the area under the receiver operating characteristic curve for various tests on the same dataset.
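A minimal sketch of that kind of tweak (not the published network; the layer sizes and segment length below are illustrative): a 1D CNN on strain time series whose max-pooling stride is increased by one and which is followed by dropout.

```python
# Illustrative 1D CNN for time-series classification with an enlarged pooling
# stride and an added dropout layer.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv1d(1, 16, kernel_size=16), nn.ReLU(),
    nn.MaxPool1d(kernel_size=4, stride=5),   # stride raised by 1 over the default (kernel size)
    nn.Dropout(p=0.5),                       # added to reduce overfitting
    nn.Conv1d(16, 32, kernel_size=8), nn.ReLU(),
    nn.AdaptiveMaxPool1d(1), nn.Flatten(),
    nn.Linear(32, 2),                        # signal vs. noise logits
)

x = torch.randn(4, 1, 8192)   # 4 whitened strain segments of 8192 samples
print(model(x).shape)         # torch.Size([4, 2])
```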

9.
The spiking neural network (SNN) is regarded as a promising candidate to deal with the great challenges presented by current machine learning techniques, including the high energy consumption induced by deep neural networks. However, a great gap remains between the online meta-learning performance of SNNs and that of artificial neural networks. Importantly, existing spike-based online meta-learning models do not target robust learning grounded in spatio-temporal dynamics and advanced machine learning theory. In this invited article, we propose a novel spike-based framework with minimum error entropy, called MeMEE, which uses entropy theory to establish a gradient-based online meta-learning scheme in a recurrent SNN architecture. We examine the performance on various types of tasks, including autonomous navigation and a working memory test. The experimental results show that the proposed MeMEE model can effectively improve the accuracy and the robustness of spike-based meta-learning. More importantly, the proposed MeMEE model emphasizes the application of modern information theoretic learning approaches to state-of-the-art spike-based learning algorithms. We therefore provide new perspectives for further integration of advanced information theory into machine learning to improve the learning performance of SNNs, which could be of great merit for applied developments with spike-based neuromorphic systems.
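As background (a generic information-theoretic-learning construction, not the MeMEE model itself), a minimum-error-entropy style loss minimizes Rényi's quadratic entropy of the prediction errors by maximizing their Gaussian-kernel information potential; a small PyTorch sketch with an illustrative kernel width:

```python
# Minimum-error-entropy style loss: -log of the Gaussian-kernel information
# potential of the errors (smaller when errors cluster tightly).
import torch

def mee_loss(errors, sigma=1.0):
    diff = errors.unsqueeze(0) - errors.unsqueeze(1)        # pairwise error differences
    potential = torch.exp(-diff.pow(2) / (2 * sigma ** 2)).mean()
    return -torch.log(potential)

e = torch.randn(32, requires_grad=True)   # stand-in prediction errors
loss = mee_loss(e)
loss.backward()
print(float(loss), e.grad.shape)
```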

10.
The increase in the proportion of elderly people in Europe brings with it certain challenges that society needs to address, such as custodial care. We propose a scalable, easily modulated and live assistive technology system, based on comfortable smart footwear capable of detecting walking behaviour, in order to prevent possible health problems in the elderly, facilitating their urban life as independently and safely as possible. This brings with it the challenge of handling the large amounts of data generated, transmitting and pre-processing that information, and analysing it with the aim of obtaining useful information in real/near-real time. This is the basis of information theory. This work presents a complete system aimed at elderly people that can detect different user behaviours/events (sitting, standing without imbalance, standing with imbalance, walking, running, tripping) from 20 sensor measurements (16 piezoelectric pressure sensors, one accelerometer returning readings for the 3 axes, and one temperature sensor) and warn relatives about possible risks in near-real time. For the detection of these events, a hierarchical structure of cascading binary models is designed and applied using artificial neural network (ANN) algorithms and deep learning techniques. The best models are achieved with convolutional layered ANNs and multilayer perceptrons. The overall event detection performance achieves an average accuracy and area under the ROC curve of 0.84 and 0.96, respectively.
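A minimal sketch of the cascading-binary-model idea, using scikit-learn MLPs and made-up data (the feature count, event ordering, and hidden sizes are placeholders, not the paper's configuration): each stage answers "is this event k?", and a window falls through the cascade until some stage claims it.

```python
# Hierarchical cascade of binary classifiers over sensor-feature windows.
import numpy as np
from sklearn.neural_network import MLPClassifier

EVENTS = ["sitting", "standing", "imbalance", "walking", "running", "tripping"]
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 20))                 # 20 sensor features per window (dummy)
y = rng.integers(0, len(EVENTS), size=600)     # random labels, for illustration only

cascade = []
for k in range(len(EVENTS) - 1):               # one binary stage per event except the last
    mask = y >= k                              # samples not claimed by earlier stages
    clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=300)
    clf.fit(X[mask], (y[mask] == k).astype(int))
    cascade.append(clf)

def predict(x):
    for k, clf in enumerate(cascade):
        if clf.predict(x.reshape(1, -1))[0] == 1:
            return EVENTS[k]
    return EVENTS[-1]                          # fell through every stage

print(predict(X[0]))
```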

11.
The vulnerability of deep neural network (DNN)-based systems makes them susceptible to adversarial perturbations, which may cause classification tasks to fail. In this work, we propose an adversarial attack model using the Artificial Bee Colony (ABC) algorithm to generate adversarial samples without the need for further gradient evaluation or training of a substitute model, which can further improve the chance of task failure caused by adversarial perturbation. In untargeted attacks, the proposed method obtained 100%, 98.6%, and 90.00% success rates on the MNIST, CIFAR-10 and ImageNet datasets, respectively. The experimental results show that the proposed ABCAttack can not only obtain a high attack success rate with fewer queries in the black-box setting, but can also break some existing defenses to a large extent, and is not limited by model structure or size, which provides further research directions for deep learning evasion attacks and defenses.
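To make the black-box (query-only) setting concrete, here is a toy score-based search against a stand-in linear classifier; this is plain random search for illustration, not the Artificial Bee Colony procedure itself, and the model and query budget are made up.

```python
# Toy untargeted black-box attack: the attacker only queries output scores and
# keeps perturbations that shrink the true class's margin over the runner-up.
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(size=(10, 784))                 # stand-in classifier: logits = W @ x

def predict(x):
    return int(np.argmax(W @ x))

x = rng.random(784)
true_label = predict(x)
adv = x.copy()
for _ in range(2000):                          # query budget
    cand = np.clip(adv + rng.normal(0, 0.05, size=784), 0, 1)
    margin = lambda z: z[true_label] - np.max(np.delete(z, true_label))
    if margin(W @ cand) < margin(W @ adv):     # keep candidates that reduce the margin
        adv = cand
    if predict(adv) != true_label:             # untargeted success
        break

print(predict(x), predict(adv), np.abs(adv - x).max())
```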

12.

In recent years, adaptive optics (AO) systems have trended toward miniaturization and lower cost. Wavefront-sensorless adaptive optics (WFSless AO) systems, owing to their simple structure and wide range of applications, have become a research hotspot in related fields. Once the hardware environment is fixed, the control algorithm determines the correction performance and convergence speed of a WFSless AO system. Emerging deep learning and artificial neural network techniques have injected new vitality into WFSless AO control algorithms, further advancing the theoretical and applied development of WFSless AO systems. Building on a review of earlier WFSless AO control algorithms, this article comprehensively introduces recent applications of convolutional neural networks (CNN), long short-term memory networks (LSTM), and deep reinforcement learning to WFSless AO system control, and summarizes the characteristics of the various deep learning models used in WFSless AO systems. It also outlines applications of WFSless AO technology in astronomical observation, microscopy, fundus imaging, laser communication, and other fields.


13.
Based on the behavior of living beings, which react mostly to external stimuli, we introduce a neural-network model that uses external patterns as a fundamental tool for the process of recognition. In this proposal, external stimuli appear as an additional field, and basins of attraction, representing memories, arise in accordance with this new field. This is in contrast to the more common attractor neural networks, where memories are attractors inside well-defined basins of attraction. We show that this procedure considerably increases the storage capabilities of the neural network; this property is illustrated with the standard Hopfield model, which reveals that the recognition capacity of our model may be enlarged, typically, by a factor of 10². The primary challenge here consists in calibrating the influence of the external stimulus, in order to attenuate the noise generated by memories that are not correlated with the external pattern. The system is analyzed primarily through numerical simulations. However, since there is the possibility of performing analytical calculations for the Hopfield model, the agreement between these two approaches can be tested, and matching results are indicated in some cases. We also show that the present proposal exhibits a crucial attribute of living beings, which concerns their ability to react promptly to changes in the external environment. Additionally, we illustrate that this new approach may significantly enlarge the recognition capacity of neural networks in various situations: with correlated and non-correlated memories, as well as with diluted, symmetric, or asymmetric interactions (synapses). This demonstrates that it can be implemented easily on a wide diversity of models.
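A minimal numpy sketch of the basic mechanism, with the external stimulus entering as an extra field term in Hopfield-style retrieval dynamics (network size, memory load, and field strength are illustrative, not the paper's values):

```python
# Hopfield retrieval with an additional external-stimulus field h * stimulus.
import numpy as np

rng = np.random.default_rng(0)
N, P, h = 200, 10, 0.5
memories = rng.choice([-1, 1], size=(P, N))
J = (memories.T @ memories) / N                            # Hebbian couplings
np.fill_diagonal(J, 0)

target = memories[0]
stimulus = target.copy()                                   # external pattern correlated with memory 0
state = np.where(rng.random(N) < 0.3, -target, target)     # noisy initial state (~30% flips)

for _ in range(20):                                        # synchronous updates
    state = np.sign(J @ state + h * stimulus)

print("overlap with target memory:", float(state @ target) / N)   # close to 1.0 on success
```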

14.
Information field theory (IFT), the information theory for fields, is a mathematical framework for signal reconstruction and non-parametric inverse problems. Artificial intelligence (AI) and machine learning (ML) aim at generating intelligent systems, including systems for perception, cognition, and learning. This overlaps with IFT, which is designed to address perception, reasoning, and inference tasks. Here, the relation between concepts and tools in IFT and those in AI and ML research is discussed. In the context of IFT, fields denote physical quantities that change continuously as a function of space (and time), and information theory refers to Bayesian probabilistic logic equipped with the associated entropic information measures. Reconstructing a signal with IFT is a computational problem similar to training a generative neural network (GNN) in ML. In this paper, the process of inference in IFT is reformulated in terms of GNN training. In contrast to classical neural networks, IFT-based GNNs can operate without pre-training thanks to incorporating expert knowledge into their architecture. Furthermore, the cross-fertilization of variational inference methods used in IFT and ML is discussed. These discussions suggest that IFT is well suited to address many problems in AI and ML research and application.
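For context (a standard textbook result rather than anything specific to this paper), the simplest IFT reconstruction is the linear-Gaussian case, whose posterior mean is the generalized Wiener filter:

```latex
% Linear-Gaussian signal reconstruction: data d = R s + n, signal s ~ N(0, S),
% noise n ~ N(0, N). The field posterior is Gaussian with mean
\[
  m = \left( S^{-1} + R^{\dagger} N^{-1} R \right)^{-1} R^{\dagger} N^{-1}\, d ,
\]
% which plays the role of the output a trained generative network would produce.
```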

15.
Reconstructability Analysis (RA) and Bayesian Networks (BN) are both probabilistic graphical modeling methodologies used in machine learning and artificial intelligence. There are RA models that are statistically equivalent to BN models and there are also models unique to RA and models unique to BN. The primary goal of this paper is to unify these two methodologies via a lattice of structures that offers an expanded set of models to represent complex systems more accurately or more simply. The conceptualization of this lattice also offers a framework for additional innovations beyond what is presented here. Specifically, this paper integrates RA and BN by developing and visualizing: (1) a BN neutral system lattice of general and specific graphs, (2) a joint RA-BN neutral system lattice of general and specific graphs, (3) an augmented RA directed system lattice of prediction graphs, and (4) a BN directed system lattice of prediction graphs. Additionally, it (5) extends RA notation to encompass BN graphs and (6) offers an algorithm to search the joint RA-BN neutral system lattice to find the best representation of system structure from underlying system variables. All lattices shown in this paper are for four variables, but the theory and methodology presented in this paper are general and apply to any number of variables. These methodological innovations are contributions to machine learning and artificial intelligence and more generally to complex systems analysis. The paper also reviews some relevant prior work of others so that the innovations offered here can be understood in a self-contained way within the context of this paper.

16.
Deep learning has proven to be an important element of modern data processing technology, and has found application in many areas such as multimodal sensor data processing and understanding, data generation, and anomaly detection. While the use of deep learning is booming in many real-world tasks, the internal processes by which it arrives at its results remain opaque. Understanding the data processing pathways within a deep neural network is important for transparency and better resource utilisation. In this paper, a method utilising information theoretic measures is used to reveal the typical learning patterns of convolutional neural networks, which are commonly used for image processing tasks. For this purpose, training samples, true labels and estimated labels are considered to be random variables. The mutual information and conditional entropy between these variables are then studied using information theoretical measures. This paper shows that adding convolutional layers improves the network's learning only up to a point, beyond which additional layers do not improve learning any further. The number of convolutional layers that needs to be added to a neural network to reach the desired learning level can be determined with the help of information theoretic quantities, including entropy, inequality and mutual information among the inputs to the network. The kernel size of the convolutional layers only affects the learning speed of the network. This study also shows that where the dropout layer is placed has no significant effect on learning for networks with a lower dropout rate, while for higher dropout rates it is best placed immediately after the last convolutional layer.
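A minimal sketch of the discrete quantities tracked here, with made-up labels: the entropy of the true labels and the mutual information between true and estimated labels, computed from their joint histogram.

```python
# Entropy H(Y) and mutual information I(Y; Yhat) from a joint label histogram.
import numpy as np

def entropy(p):
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

rng = np.random.default_rng(0)
y_true = rng.integers(0, 10, size=5000)
y_pred = np.where(rng.random(5000) < 0.8, y_true, rng.integers(0, 10, size=5000))

joint, _, _ = np.histogram2d(y_true, y_pred, bins=(10, 10))
pxy = joint / joint.sum()
px, py = pxy.sum(axis=1), pxy.sum(axis=0)
H_true, H_pred, H_joint = entropy(px), entropy(py), entropy(pxy.ravel())
print("H(Y) =", H_true, " I(Y;Yhat) =", H_true + H_pred - H_joint)
```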

17.
Convolutional neural networks utilize a hierarchy of neural network layers. The statistical aspects of information concentration in successive layers can bring insight into the feature abstraction process. We analyze the saliency maps of these layers from the perspective of semiotics, the study of signs and sign-using behavior. In computational semiotics, this aggregation operation (known as superization) is accompanied by a decrease of spatial entropy: signs are aggregated into supersigns. Using spatial entropy, we compute the information content of the saliency maps and study the superization processes that take place between successive layers of the network. In our experiments, we visualize the superization process and show how the obtained knowledge can be used to explain the neural decision model. In addition, we attempt to optimize the architecture of the neural model employing a semiotic greedy technique. To the best of our knowledge, this is the first application of computational semiotics to the analysis and interpretation of deep neural networks.

18.
Pearl powder and nacre (pearl layer) powder are chemically similar, but the medicinal value of nacre powder is far lower than that of pearl powder. Because nacre powder is easy to prepare and low in cost, unscrupulous merchants often pass it off as pearl powder, or mix it into pearl powder that then reaches the market, in order to make a profit. Identifying adulteration and testing the purity of pearl powder is therefore of great importance. Laser Raman spectroscopy combined with deep learning was used to study rapid identification of adulteration and analysis of the purity of pearl powder. Pure pearl powder and nacre powder were mixed in set proportions to prepare samples whose pearl-powder mass percentages were respectively...

19.
Recent advances in statistical inference have significantly expanded the toolbox of probabilistic modeling. Historically, probabilistic modeling has been constrained to very restricted model classes, where exact or approximate probabilistic inference is feasible. However, developments in variational inference, a general form of approximate probabilistic inference that originated in statistical physics, have enabled probabilistic modeling to overcome these limitations: (i) Approximate probabilistic inference is now possible over a broad class of probabilistic models containing a large number of parameters, and (ii) scalable inference methods based on stochastic gradient descent and distributed computing engines allow probabilistic modeling to be applied to massive data sets. One important practical consequence of these advances is the possibility to include deep neural networks within probabilistic models, thereby capturing complex non-linear stochastic relationships between the random variables. These advances, in conjunction with the release of novel probabilistic modeling toolboxes, have greatly expanded the scope of applications of probabilistic models, and allowed the models to take advantage of the recent strides made by the deep learning community. In this paper, we provide an overview of the main concepts, methods, and tools needed to use deep neural networks within a probabilistic modeling framework.
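A minimal sketch of the core machinery described above (stochastic variational inference with the reparameterization trick), fitting a Gaussian approximation to the posterior over the mean of a toy model; the prior, likelihood, and hyperparameters are made up for illustration.

```python
# Stochastic variational inference: maximize a Monte-Carlo ELBO with SGD to fit
# q(mu) = N(m, s^2) to the posterior of an unknown mean given data.
import torch

data = torch.randn(100) + 3.0                       # observations around mu = 3
m = torch.zeros(1, requires_grad=True)              # variational mean
log_s = torch.zeros(1, requires_grad=True)          # variational log-std
opt = torch.optim.Adam([m, log_s], lr=0.05)

for step in range(500):
    opt.zero_grad()
    eps = torch.randn(32, 1)
    mu = m + log_s.exp() * eps                      # reparameterized samples of mu
    log_lik = torch.distributions.Normal(mu, 1.0).log_prob(data).sum(dim=1)
    log_prior = torch.distributions.Normal(0.0, 10.0).log_prob(mu).squeeze(1)
    log_q = torch.distributions.Normal(m, log_s.exp()).log_prob(mu).squeeze(1)
    elbo = (log_lik + log_prior - log_q).mean()
    (-elbo).backward()
    opt.step()

print(float(m), float(log_s.exp()))                 # should approach roughly 3.0 and 0.1
```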

20.
In the last few decades, text mining has been used to extract knowledge from free texts. Applying neural networks and deep learning to natural language processing (NLP) tasks has led to many accomplishments on real-world language problems over the years. The developments of the last five years have resulted in techniques that allow for the practical application of transfer learning in NLP. The advances in the field have been substantial, and the milestone of outperforming the human baseline on the General Language Understanding Evaluation (GLUE) benchmark has been achieved. This paper presents a targeted literature review to outline, describe, explain, and put into context the crucial techniques that helped achieve this milestone. The research presented here is a targeted review of neural language models that represent vital steps towards a general language representation model.
