Similar literature
Found 20 similar documents; search took 31 ms.
1.
As state-of-the-art deep neural networks are deployed at the core of an increasing number of AI-based products and services, the incentive for adversaries or commercial competitors to copy them (i.e., their intellectual property, manifested through the knowledge encapsulated in them) is expected to increase considerably over time. The most efficient way to extract or steal knowledge from such networks is to query them with a large dataset of random samples, record their outputs, and then train a student network to mimic these outputs, without making any assumption about the original networks. The most effective way to protect against such a mimicking attack is to answer queries with the classification result only, omitting the confidence values associated with the softmax layer. In this paper, we present a novel method for generating composite images for attacking a mentor neural network using a student model. Our method assumes no information regarding the mentor’s training dataset, architecture, or weights. Furthermore, assuming no information regarding the mentor’s softmax output values, our method successfully mimics the given neural network and is capable of stealing large portions (and sometimes all) of its encapsulated knowledge. Our student model achieved 99% relative accuracy to the protected mentor model on the CIFAR-10 test set. In addition, we demonstrate that our student network (which copies the mentor) is impervious to watermarking protection methods and would thus evade detection as a stolen model by existing dedicated techniques. Our results imply that all current neural networks are vulnerable to mimicking attacks, even if they divulge nothing but the most basic required output, and that the student model that mimics them cannot be easily detected using currently available techniques.
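
The hard-label setting described in this abstract (the mentor answers with a class index only) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the mentor is a hypothetical fixed linear classifier on 2-D points, the student is plain softmax regression, and the queries are uniform random points rather than the composite images the paper generates. It is not the authors' method.

```python
import math
import random

# Hypothetical mentor: a fixed linear classifier whose weights stay hidden.
_MENTOR_W = [(1.0, 0.2), (-0.6, 0.9), (-0.3, -1.0)]

def mentor_label(x):
    """Answer a query with the class index only (no confidence values)."""
    scores = [w[0] * x[0] + w[1] * x[1] for w in _MENTOR_W]
    return scores.index(max(scores))

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def train_student(queries, labels, n_classes=3, steps=200, lr=1.0):
    """Fit softmax regression to the recorded (query, hard label) pairs."""
    w = [[0.0, 0.0] for _ in range(n_classes)]
    n = len(queries)
    for _ in range(steps):
        grad = [[0.0, 0.0] for _ in range(n_classes)]
        for x, y in zip(queries, labels):
            p = softmax([wk[0] * x[0] + wk[1] * x[1] for wk in w])
            for k in range(n_classes):
                err = p[k] - (1.0 if k == y else 0.0)
                grad[k][0] += err * x[0]
                grad[k][1] += err * x[1]
        for k in range(n_classes):
            w[k][0] -= lr * grad[k][0] / n
            w[k][1] -= lr * grad[k][1] / n
    return w

def student_label(w, x):
    scores = [wk[0] * x[0] + wk[1] * x[1] for wk in w]
    return scores.index(max(scores))

def agreement(w, points):
    """Fraction of points on which the student reproduces the mentor's label."""
    hits = sum(student_label(w, x) == mentor_label(x) for x in points)
    return hits / len(points)

rng = random.Random(0)
queries = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(400)]
labels = [mentor_label(x) for x in queries]  # the only information recorded
student = train_student(queries, labels)
test_points = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(400)]
print(f"student/mentor agreement: {agreement(student, test_points):.2f}")
```

The student never sees the mentor's weights or softmax values; it only sees which class the mentor picked for each query, which is the protection scenario the abstract considers.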

2.
3.
In this paper, we study the learnability of the Boolean inner product by a systematic simulation study. The family of Boolean inner product functions is known to be representable by neural networks of threshold neurons of depth 3 with only 2n+1 units (n being the input dimension), whereas an exact representation by a depth 2 network cannot possibly be of polynomial size. This result can be seen as a strong argument for deep neural network architectures. In our study, we found that this depth 3 architecture of the Boolean inner product is difficult to train, much harder than the depth 2 network, at least for the small input size scenarios n ≤ 16. Nonetheless, the accuracy of the deep architecture increased with the dimension of the input space to 94% on average, which means that multiple restarts are needed to find the compact depth 3 architecture. Replacing the fully connected first layer by a partially connected layer (a kind of convolutional layer, sparsely connected with weight sharing) can significantly improve the learning performance, up to 99% accuracy in simulations. Another way to improve the learnability of the compact depth 3 representation of the inner product could be to add just a few additional units to the first hidden layer.
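
To make the 2n+1 unit count concrete, here is a minimal sketch of one standard way to build such a depth 3 threshold network by hand. It assumes n is the length of each of the two input vectors (the abstract's notation is not fully specified), and the construction is a textbook one, not necessarily the exact one used in the paper: n AND units, n counting units, and one output unit with alternating weights.

```python
from itertools import product

def step(value, threshold):
    """Threshold neuron: fires (1) when the weighted input reaches the threshold."""
    return 1 if value >= threshold else 0

def inner_product_mod2(x, y):
    """Reference definition: sum of x_i AND y_i, taken modulo 2."""
    return sum(a & b for a, b in zip(x, y)) % 2

def threshold_network(x, y):
    n = len(x)
    # Layer 1: n AND units, each fires when x_i + y_i >= 2.
    ands = [step(a + b, 2) for a, b in zip(x, y)]
    # Layer 2: n counting units, unit k fires when at least k AND units fired.
    s = sum(ands)
    counts = [step(s, k) for k in range(1, n + 1)]
    # Layer 3: one output unit with alternating weights +1, -1, +1, ...
    # The alternating sum over the first s firing units equals s mod 2.
    total = sum(c if k % 2 == 0 else -c for k, c in enumerate(counts))
    return step(total, 1)

n = 4
mismatches = sum(
    threshold_network(x, y) != inner_product_mod2(x, y)
    for x in product((0, 1), repeat=n)
    for y in product((0, 1), repeat=n)
)
print("mismatches over all inputs:", mismatches)
```

The network is exact by construction; the abstract's point is that gradient-based training struggles to find weights like these from random initialization.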

4.
In this paper, a grid-free deep learning method based on a physics-informed neural network is proposed for solving coupled Stokes–Darcy equations with Beavers–Joseph–Saffman interface conditions. This method avoids grid generation and can greatly reduce the amount of computation when solving complex problems. Although the original physics-informed neural network algorithms have been used to solve many differential equations, we find that applying them directly to coupled Stokes–Darcy equations does not provide accurate solutions in some cases, such as problems with stiff terms due to small parameters and problems with interface discontinuities. To improve the approximation ability of a physics-informed neural network, we propose a loss-function weighting strategy, a parallel network structure strategy, and a local adaptive activation function strategy. In addition, the physics-informed neural network with these added strategies provides inspiration for solving other, more complicated multi-physics coupling problems. Finally, the effectiveness of the proposed strategies is verified by numerical experiments.

5.
Deep Neural Networks (DNNs) usually work in an end-to-end manner. This makes the trained DNNs easy to use, but their decision process remains opaque for every test case. However, the interpretability of decisions is crucial in some scenarios, such as medical or financial data mining and decision-making. In this paper, we propose a Tree-Network-Tree (TNT) learning framework for explainable decision-making, where knowledge is alternately transferred between the tree model and DNNs. Specifically, the proposed TNT learning framework exploits the advantages of different models at different stages: (1) a novel James–Stein Decision Tree (JSDT) is proposed to generate better knowledge representations for DNNs, especially when the input data are low-frequency or low-quality; (2) the DNNs output high-performing prediction results from the knowledge embedding inputs and act as a teacher model for the following tree model; and (3) a novel distillable Gradient Boosted Decision Tree (dGBDT) is proposed to learn interpretable trees from the soft labels and make predictions comparable to those of the DNNs. Extensive experiments on various machine learning tasks demonstrated the effectiveness of the proposed method.

6.
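Chen Diyi, Liu Ye, Ma Xiaoyi. Acta Physica Sinica, 2012, 61(10): 100501-100501
Given the excellent performance of radial basis function (RBF) neural network models in nonlinear prediction, a method is proposed that uses this prediction model to jointly estimate the two key parameters of phase space reconstruction for chaotic time series, the delay time and the embedding dimension, and that gives their optimal estimates on the basis of objective evaluation criteria. Taking the Lorenz system as an example for numerical analysis, the best estimates of the embedding dimension and delay time are obtained for the RBF single-step and multi-step prediction models, and the estimates are verified in the original model. The results show that the method can effectively estimate the embedding dimension and delay time, thereby significantly improving prediction accuracy.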
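
The two parameters estimated in this abstract, the delay time and the embedding dimension, define how a scalar series is turned into phase-space vectors. A minimal sketch of that delay embedding follows; the function name `delay_embed` and its argument names are hypothetical, and the RBF-based joint estimation itself is not reproduced here.

```python
def delay_embed(series, dim, tau):
    """Build delay vectors [x(i), x(i + tau), ..., x(i + (dim - 1) * tau)]."""
    n_vectors = len(series) - (dim - 1) * tau
    if n_vectors <= 0:
        raise ValueError("series too short for this dim and tau")
    return [
        [series[i + j * tau] for j in range(dim)]
        for i in range(n_vectors)
    ]

vectors = delay_embed(list(range(10)), dim=3, tau=2)
print(len(vectors), vectors[0], vectors[-1])
```

A predictor such as an RBF network would take each delay vector as input and a later sample of the series as target, so a poor choice of `dim` or `tau` shows up directly as prediction error.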

7.
One of the most effective image processing techniques is the convolutional neural network, which is built from convolutional layers. In each such layer, the value of the layer’s output signal at each point is a combination of the layer’s input signals at several neighboring points. To improve accuracy, researchers have developed a version of this technique in which only data from some of the neighboring points is processed. It turns out that the most efficient case, called dilated convolution, is when we select the neighboring points whose differences in both coordinates are divisible by some constant. In this paper, we explain this empirical efficiency by proving that for all reasonable optimality criteria, dilated convolution is indeed better than possible alternatives.
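
The neighbor selection described here can be sketched in one dimension (the abstract discusses images, so both coordinates; a 1-D version is used below only to keep the example short). The function name and the "valid" boundary handling are assumptions for illustration.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid-mode 1-D convolution that samples inputs `dilation` steps apart."""
    span = (len(kernel) - 1) * dilation  # distance between first and last tap
    return [
        sum(w * signal[i + j * dilation] for j, w in enumerate(kernel))
        for i in range(len(signal) - span)
    ]

signal = [1, 2, 3, 4, 5, 6]
print(dilated_conv1d(signal, [1, 1], dilation=1))  # adjacent neighbors
print(dilated_conv1d(signal, [1, 1], dilation=2))  # every second neighbor
```

With `dilation=2` the same two-tap kernel covers a wider stretch of the input without adding weights, which is the practical appeal of the technique.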

8.
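A method is proposed that applies artificial neural networks to predict crosstalk between interconnect wires. Parameters that affect the crosstalk response of interconnect wires are selected as input predictors, and a BP network based on error backpropagation is used to construct the mapping between the input predictors and the crosstalk response output. The constructed BP network is trained on training sample sets computed with the MTL and FDTD methods, establishing a BP-network-based prediction model for wire crosstalk. Finally, the BP predictions of crosstalk are compared with the test samples, showing that the method is effective.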

9.
The article concerns the problem of classification based on independent data sets (local decision tables). The aim of the paper is to propose a classification model for dispersed data using a modified k-nearest neighbors algorithm and a neural network. A neural network, more specifically a multilayer perceptron, is used to combine the prediction results obtained from the local tables. Prediction results are stored at the measurement level and generated using a modified k-nearest neighbors algorithm. The task of the neural network is to combine these results and provide a common prediction. In the article, various neural network structures (different numbers of neurons in the hidden layer) are studied, and the results are compared with those generated by other fusion methods, such as majority voting, the Borda count method, the sum rule, the method based on decision templates, and the method based on the theory of evidence. Based on the obtained results, it was found that the neural network always generates unambiguous decisions, which is a great advantage, as most of the other fusion methods generate ties. Moreover, if only unambiguous results are considered, the neural network gives much better results than the other fusion methods. If ambiguity is allowed, some fusion methods are slightly better, but only because they can then return several decisions for a test object.
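
Three of the baseline fusion methods named in this abstract, and the ties they can produce, can be sketched as follows. The input format (one vector of per-class scores from each local table) and the function names are assumptions for illustration; the paper's modified k-nearest neighbors step and the neural network combiner are not reproduced.

```python
def winners(totals):
    """Return every class index that attains the best total (ties possible)."""
    best = max(totals)
    return [c for c, t in enumerate(totals) if t == best]

def majority_vote(score_vectors):
    """Each local model votes for its top-scoring class."""
    votes = [0] * len(score_vectors[0])
    for scores in score_vectors:
        votes[scores.index(max(scores))] += 1
    return winners(votes)

def borda_count(score_vectors):
    """Each local model ranks the classes; the worst gets 0 points, the next 1, ..."""
    n = len(score_vectors[0])
    points = [0] * n
    for scores in score_vectors:
        order = sorted(range(n), key=lambda c: scores[c])  # worst class first
        for rank, c in enumerate(order):
            points[c] += rank
    return winners(points)

def sum_rule(score_vectors):
    """Add the per-class scores of all local models."""
    totals = [sum(column) for column in zip(*score_vectors)]
    return winners(totals)

# Two local tables, three decision classes (hypothetical scores).
scores = [[0.6, 0.3, 0.1],
          [0.2, 0.7, 0.1]]
print(majority_vote(scores), borda_count(scores), sum_rule(scores))
```

In this example the two rank-based methods cannot separate classes 0 and 1, while the sum rule can, which is the kind of ambiguity the abstract contrasts with the always unambiguous neural network output.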

10.
In this paper, we propose a new model of weighted small-world biological neural networks based on biophysical Hodgkin-Huxley neurons with a side-restrain mechanism. We then study the excitement properties of the model under alternating current (AC) stimulation. The study shows that the excitement properties of the networks are in good agreement with the behavior of a brain nervous system under different AC stimuli, such as the refractory period and the neural excitement response induced by different intensities of noise and coupling. The results provide a useful reference for brain electrophysiology and epistemological science.

11.
Automatic recognition of visual objects using a deep learning approach has been successfully applied to multiple areas. However, deep learning techniques require a large amount of labeled data, which is usually expensive to obtain. An alternative is to use semi-supervised models, such as co-training, where multiple complementary views are combined using a small amount of labeled data. A simple way to associate views with visual objects is to apply a degree of rotation or a type of filter. In this work, we propose a co-training model for visual object recognition using deep neural networks by adding layers of self-supervised neural networks as intermediate inputs to the views, where the views are diversified through the cross-entropy regularization of their outputs. Since the model merges the concepts of co-training and self-supervised learning by considering the differentiation of outputs, we call it Differential Self-Supervised Co-Training (DSSCo-Training). This paper presents experiments applying the DSSCo-Training model to well-known image datasets such as MNIST, CIFAR-100, and SVHN. The results indicate that the proposed model is competitive with state-of-the-art models and shows an average relative improvement of 5% in accuracy on several datasets, despite being simpler than more recent approaches.

12.
The biomedical field is characterized by an ever-increasing production of sequential data, which often come in the form of biosignals capturing the time-evolution of physiological processes, such as blood pressure and brain activity. This has motivated a large body of research on machine learning techniques for the predictive analysis of such biosignals. Unfortunately, in high-stakes decision making, such as clinical diagnosis, the opacity of machine learning models becomes a crucial aspect to be addressed in order to increase the trust in and adoption of AI technology. In this paper, we propose a model-agnostic explanation method, based on occlusion, that enables learning the input’s influence on the model predictions. We specifically target problems involving the predictive analysis of time-series data and the models typically used for data of this nature, i.e., recurrent neural networks. Our approach is able to provide two different kinds of explanations: one suitable for technical experts, who need to verify the quality and correctness of machine learning models, and one suited to physicians, who need to understand the rationale underlying the prediction in order to make informed decisions. Extensive experiments on different physiological data demonstrate the effectiveness of our approach in both classification and regression tasks.
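
The general idea of occlusion-based explanation mentioned in this abstract can be sketched as follows: mask part of the input, re-run the model, and record how much the prediction moves. This is a generic sketch under stated assumptions (a scalar-output model passed in as a plain callable, a constant baseline value for the masked samples, hypothetical names), not the specific method the paper proposes.

```python
def occlusion_importance(model, series, window=1, baseline=0.0):
    """Score each window by how much masking it changes the model output."""
    reference = model(series)
    scores = []
    for start in range(len(series) - window + 1):
        occluded = list(series)
        occluded[start:start + window] = [baseline] * window
        scores.append(abs(reference - model(occluded)))
    return scores

# Hypothetical stand-in for a trained recurrent model: it mostly looks at step 3.
def toy_model(xs):
    return 0.1 * xs[0] + 2.0 * xs[3]

scores = occlusion_importance(toy_model, [1.0, 1.0, 1.0, 1.0, 1.0])
print(scores)
```

Because the procedure only calls the model as a black box, it applies equally to classifiers (using a class probability as the output) and regressors.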

13.
Text classification is a fundamental research direction that aims to assign tags to text units. Recently, graph neural networks (GNN) have exhibited some excellent properties in textual information processing. Furthermore, pre-trained language models have also achieved promising results in many tasks. However, many text processing methods either cannot model the structure of a single text unit or ignore its semantic features. To solve these problems and make comprehensive use of the text’s structural and semantic information, we propose a Bert-Enhanced text Graph Neural Network model (BEGNN). For each text, we construct a separate text graph according to the co-occurrence relationships of words and use a GNN to extract text features. Moreover, we employ Bert to extract semantic features. The former part takes the structural information into account, and the latter focuses on modeling the semantic information. Finally, we let these two features of different granularity interact and aggregate them to obtain a more effective representation. Experiments on standard datasets demonstrate the effectiveness of BEGNN.
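
The per-text graph construction mentioned in this abstract can be sketched with a sliding window over the tokens. The window size, the undirected weighted edges, and the whitespace tokenization below are assumptions for illustration; the abstract does not state how BEGNN defines co-occurrence.

```python
def cooccurrence_graph(tokens, window=3):
    """Count how often two distinct words fall inside the same sliding window."""
    edges = {}
    for i, word in enumerate(tokens):
        for other in tokens[i + 1:i + window]:
            if other != word:
                key = tuple(sorted((word, other)))  # undirected edge
                edges[key] = edges.get(key, 0) + 1
    return edges

tokens = "graph neural networks model text graph".split()
narrow = cooccurrence_graph(tokens, window=2)  # adjacent words only
wide = cooccurrence_graph(tokens, window=3)    # words up to two positions apart
print(len(narrow), len(wide))
```

The resulting edge weights would serve as the adjacency structure that a GNN propagates over, with one node per distinct word in the text.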

14.
Machine learning methods such as Long Short-Term Memory (LSTM) neural networks can predict real-life time series data. Here, we present a new approach to predicting time series data that combines interpolation techniques, randomly parameterized LSTM neural networks, and measures of signal complexity, which we refer to as complexity measures throughout this research. First, we interpolate the time series data under study. Next, we predict the time series data using an ensemble of randomly parameterized LSTM neural networks. Finally, we filter the ensemble prediction based on the complexity of the original data to improve predictability, i.e., we keep only predictions with a complexity close to that of the training data. We test the proposed approach on five different univariate time series datasets. We use linear and fractal interpolation to increase the amount of data. We tested five different complexity measures for the ensemble filters, i.e., the Hurst exponent, Shannon’s entropy, Fisher’s information, SVD entropy, and the spectrum of Lyapunov exponents. Our results show that the interpolated predictions consistently outperformed the non-interpolated ones. The best ensemble predictions always beat a baseline prediction based on a neural network with only a single hidden LSTM, gated recurrent unit (GRU), or simple recurrent neural network (RNN) layer. The complexity filters can reduce the error of a random ensemble prediction by a factor of 10. Further, because we use randomly parameterized neural networks, no hyperparameter tuning is required. We show this method to be useful for real-time time series prediction, because the optimization of hyperparameters, which is usually very costly and time-intensive, can be circumvented with the presented approach.
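
The filtering step (keep only predictions whose complexity is close to that of the training data) can be sketched with one of the listed measures. The histogram-based Shannon entropy, the bin count, and the `keep` parameter below are assumptions chosen for brevity; the paper's exact estimators and thresholds are not given in the abstract.

```python
import math

def shannon_entropy(series, bins=8):
    """Shannon entropy (in nats) of the histogram of the series values."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return 0.0  # a constant series occupies a single bin
    counts = [0] * bins
    for value in series:
        idx = min(int((value - lo) / (hi - lo) * bins), bins - 1)
        counts[idx] += 1
    n = len(series)
    return -sum(c / n * math.log(c / n) for c in counts if c)

def filter_by_complexity(train, predictions, keep=2, measure=shannon_entropy):
    """Keep the `keep` predictions whose complexity is closest to the training data's."""
    target = measure(train)
    ranked = sorted(predictions, key=lambda p: abs(measure(p) - target))
    return ranked[:keep]

train = [0, 1, 2, 3, 4, 5, 6, 7]
flat = [3, 3, 3, 3, 3, 3, 3, 3]
ramp = [1, 2, 3, 4, 5, 6, 7, 8]
jump = [0, 0, 0, 0, 7, 7, 7, 7]
kept = filter_by_complexity(train, [flat, ramp, jump], keep=1)
print(kept)
```

In the setting the abstract describes, the candidates would be the forecasts of the randomly parameterized LSTM ensemble rather than the hand-written lists used here.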

15.
In this paper, we propose to quantitatively compare loss functions based on the parameterized Tsallis–Havrda–Charvat entropy and the classical Shannon entropy for training a deep network on small datasets, which are commonly encountered in medical applications. Shannon cross-entropy is widely used as a loss function for most neural networks applied to the segmentation, classification and detection of images. In this work, we compare these two entropies through a medical application: predicting recurrence in patients with head–neck and lung cancers after treatment. Based on both CT images and patient information, a multitask deep neural network is proposed to perform a recurrence prediction task, using cross-entropy as a loss function, and an image reconstruction task. Tsallis–Havrda–Charvat cross-entropy is a parameterized cross-entropy with parameter α; Shannon entropy is the particular case of Tsallis–Havrda–Charvat entropy for α = 1. The influence of this parameter on the final prediction results is studied. The experiments are conducted on two datasets comprising 580 patients in total, of whom 434 suffered from head–neck cancers and 146 from lung cancers. The results show that Tsallis–Havrda–Charvat entropy can achieve better prediction accuracy for some values of α.
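
The statement that Shannon entropy is the α = 1 case can be checked numerically. The sketch below assumes the common Tsallis normalization, H_α(p) = (1 - Σ p_i^α) / (α - 1); the paper's cross-entropy loss and its exact normalization are not given in the abstract, so this illustrates the entropy only.

```python
import math

def shannon_entropy(p):
    """Shannon entropy in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def tsallis_entropy(p, alpha):
    """Tsallis-form entropy; falls back to Shannon at alpha = 1 (the limit)."""
    if abs(alpha - 1.0) < 1e-12:
        return shannon_entropy(p)
    return (1.0 - sum(pi ** alpha for pi in p if pi > 0)) / (alpha - 1.0)

p = [0.5, 0.25, 0.25]
for alpha in (0.5, 1.0001, 2.0):
    print(alpha, round(tsallis_entropy(p, alpha), 4))
print("shannon", round(shannon_entropy(p), 4))
```

Values of α away from 1 reweight rare and frequent outcomes differently, which is the degree of freedom the abstract tunes on the two cancer datasets.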

16.
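Yu Shujuan, Huan Rusong, Zhang Yun, Feng Di. Acta Physica Sinica, 2014, 63(6): 60701-060701
To address the multiple-starting-point problem of Hopfield neural networks, a new blind signal detection algorithm based on chaotic neural networks is proposed, achieving blind detection of binary phase-shift keying signals. On this basis, a double-sigmoid chaotic neural network model is further proposed: a new energy function is constructed, the stability of the model is proved, and the network parameters are configured. Simulation experiments show that the chaotic neural network can avoid local minima and has strong noise immunity. The double-sigmoid chaotic neural network inherits all of these advantages and converges faster, requiring only shorter received data to reach the true global equilibrium point, which reduces the computational complexity of the algorithm and shortens its running time.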

17.
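Zheng Hongyu, Luo Xiaoshu, Wu Lei. Acta Physica Sinica, 2008, 57(6): 3380-3384
Because real biological neural networks have small-world connectivity and connection strengths between neurons that vary with time, a small-world biological neural network model with dynamically varying weights is first constructed, using the Hodgkin-Huxley equations as the node dynamics. The excitation properties of the neurons in the model, the characteristics of the weight changes, and the influence of different learning coefficients on the statistical excitation properties of the neurons are then studied. The most significant result is that, for the same network structure, network parameters and external stimulus signal, the learning coefficient b has an optimal value b* at which the excitation degree of the biological neural network reaches its maximum. Keywords: biological neural network with dynamically varying weights; small-world network; Hodgkin-Huxley equations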

18.
In this paper, the performance of artificial neural networks in option pricing was analyzed and compared with the results obtained from the Black–Scholes–Merton model based on historical volatility. The results were compared using various error metrics calculated separately for three moneyness ratios. A market data-driven approach was taken to train and test the neural network on real-world options data from 2009 to 2019, quoted on the Warsaw Stock Exchange. The artificial neural network did not provide more accurate option prices, even though its hyperparameters were properly tuned. The Black–Scholes–Merton model turned out to be more precise and more robust to various market conditions. In addition, the bias of the forecasts obtained from the neural network differed significantly between moneyness states. This study provides an initial insight into the application of deep learning methods to pricing options in emerging markets with low liquidity and high volatility.
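
The benchmark used in this abstract, the Black–Scholes–Merton price computed from historical volatility, can be sketched with the standard closed-form formula for European options. The absence of dividends, the 252 trading days per year, and all function and parameter names are assumptions for illustration; the paper's data handling is not reproduced.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def historical_volatility(prices, periods_per_year=252):
    """Annualized sample standard deviation of log returns (needs >= 3 prices)."""
    returns = [math.log(b / a) for a, b in zip(prices, prices[1:])]
    mean = sum(returns) / len(returns)
    var = sum((r - mean) ** 2 for r in returns) / (len(returns) - 1)
    return math.sqrt(var * periods_per_year)

def _d1_d2(spot, strike, rate, sigma, maturity):
    d1 = (math.log(spot / strike) + (rate + 0.5 * sigma ** 2) * maturity) / (
        sigma * math.sqrt(maturity))
    return d1, d1 - sigma * math.sqrt(maturity)

def bsm_call(spot, strike, rate, sigma, maturity):
    """European call price, no dividends."""
    d1, d2 = _d1_d2(spot, strike, rate, sigma, maturity)
    return spot * norm_cdf(d1) - strike * math.exp(-rate * maturity) * norm_cdf(d2)

def bsm_put(spot, strike, rate, sigma, maturity):
    """European put price, no dividends."""
    d1, d2 = _d1_d2(spot, strike, rate, sigma, maturity)
    return strike * math.exp(-rate * maturity) * norm_cdf(-d2) - spot * norm_cdf(-d1)

call = bsm_call(100.0, 100.0, 0.05, 0.2, 1.0)
put = bsm_put(100.0, 100.0, 0.05, 0.2, 1.0)
print(round(call, 4), round(put, 4))
```

In the comparison the abstract describes, `sigma` would come from `historical_volatility` applied to past prices of the underlying, and moneyness would be derived from the ratio of `spot` to `strike`.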

19.
Session-based recommendation aims to predict a user’s next click based on the user’s current and historical sessions, and can be applied to shopping websites and apps. Existing session-based recommendation methods cannot accurately capture the complex transitions between items. In addition, some approaches compress sessions into a fixed representation vector without taking into account the user’s interest preferences at the current moment, thus limiting the accuracy of recommendations. Considering the diversity of items and users’ interests, a personalized interest attention graph neural network (PIA-GNN) is proposed for session-based recommendation. This approach utilizes personalized graph convolutional networks (PGNN) to capture complex transitions between items, invoking an interest-aware mechanism to adaptively activate users’ interest in different items. In addition, a self-attention layer is used to capture long-term dependencies between items when modeling users’ long-term preferences. The cross-entropy loss is used as the objective function to train our model. We conduct extensive experiments on two real datasets, and the results show that PIA-GNN outperforms existing personalized session-aware recommendation methods.

20.
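Prediction model for soil near-infrared spectral analysis based on deep sparse learning   Cited by: 2 (self-citations: 1; citations by others: 1)
A prediction model for soil near-infrared spectral analysis based on deep sparse learning is proposed. First, a sparse feature learning method is used to reduce the soil near-infrared spectral data, yielding a sparse representation of the spectral content. A radial basis function neural network, taking the sparse-representation feature coefficients as input and the measured soil constituents as output, is then used to build separate nonlinear prediction models for soil organic matter, available phosphorus, and available potassium. The results show that the model is feasible for predicting soil organic matter content, but that further optimization of the model is needed for predicting available phosphorus and available potassium content.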
