Similar Documents

1.
Despite the importance of few-shot learning, the lack of labeled training data in the real world makes it extremely challenging for existing machine learning methods, because such a limited dataset does not represent the data variance well. In this research, we suggest employing a generative approach using variational autoencoders (VAEs), which can be used specifically to optimize few-shot learning tasks by generating new samples with more intra-class variation on the Labeled Faces in the Wild (LFW) dataset. The purpose of our research is to increase the size of the training dataset using various methods to improve the accuracy and robustness of few-shot face recognition. Specifically, we employ the VAE generator to enlarge the training dataset, including both the base and the novel sets, while utilizing transfer learning as the backend. Based on extensive experiments, we analyze various data augmentation methods to observe how each affects the accuracy of face recognition. The face generation method based on VAEs with perceptual loss improves the recognition accuracy to 96.47% using both the base and the novel sets.
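
A minimal sketch of the augmentation idea described above, assuming a PyTorch setup; the network sizes, the perturbation scale, and the augment helper are illustrative assumptions rather than the authors' implementation (whose training additionally uses a perceptual loss):

```python
import torch
import torch.nn as nn

class VAE(nn.Module):
    def __init__(self, d_in=64 * 64, d_lat=128):  # flattened grayscale crops assumed
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(d_in, 512), nn.ReLU())
        self.mu, self.logvar = nn.Linear(512, d_lat), nn.Linear(512, d_lat)
        self.dec = nn.Sequential(nn.Linear(d_lat, 512), nn.ReLU(),
                                 nn.Linear(512, d_in), nn.Sigmoid())

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization
        return self.dec(z), mu, logvar

def augment(vae, x, n=10, scale=0.5):
    """Generate n intra-class variants per face by perturbing the latent mean."""
    with torch.no_grad():
        mu = vae.mu(vae.enc(x))
        z = mu.repeat_interleave(n, dim=0) + scale * torch.randn(len(x) * n, mu.size(1))
        return vae.dec(z)
```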

2.
Autoencoders are self-supervised learning systems in which, during training, the output is an approximation of the input. Typically, an autoencoder has three parts: the encoder (which produces a compressed latent-space representation of the input data), the latent space (which retains the knowledge in the input data at reduced dimensionality while preserving maximum information), and the decoder (which reconstructs the input data from the compressed latent space). Autoencoders have found wide application in dimensionality reduction, object detection, image classification, and image denoising. Variational autoencoders (VAEs) can be regarded as enhanced autoencoders in which a Bayesian approach is used to learn the probability distribution of the input data. VAEs have found wide application in generating data for speech, images, and text. In this paper, we present a comprehensive overview of variational autoencoders. We discuss problems with VAEs and present several variants that attempt to solve them. We present applications of variational autoencoders in finance (a new and emerging field of application), speech/audio source separation, and biosignal processing. Experimental results are presented for an example of speech source separation to illustrate the application of three variants: the VAE, the β-VAE, and the ITL-AE. We conclude with a summary and identify possible research directions for improving the performance of VAEs in particular and of deep generative models in general, of which VAEs and generative adversarial networks (GANs) are examples.
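
As a concrete anchor for the variants discussed above, the β-VAE differs from the plain VAE only in how the KL term of the ELBO is weighted; a minimal sketch assuming a diagonal Gaussian posterior and a Bernoulli likelihood:

```python
import torch
import torch.nn.functional as F

def beta_vae_loss(x, x_hat, mu, logvar, beta=4.0):
    # Reconstruction term: negative Bernoulli log-likelihood.
    rec = F.binary_cross_entropy(x_hat, x, reduction="sum")
    # KL(q(z|x) || N(0, I)), closed form for a diagonal Gaussian posterior.
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + beta * kl  # beta = 1 recovers the plain VAE objective
```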

3.
Neural networks play a growing role in many scientific disciplines, including physics. Variational autoencoders (VAEs) are neural networks able to represent the essential information of a high-dimensional data set in a low-dimensional latent space that has a probabilistic interpretation. In particular, the so-called encoder network, the first part of the VAE, maps its input onto a position in latent space and additionally provides uncertainty information as a variance around this position. In this work, an extension to the autoencoder architecture is introduced: the FisherNet. In this architecture, the latent-space uncertainty is not generated by an additional information channel in the encoder but is derived from the decoder by means of the Fisher information metric. This has theoretical advantages: the uncertainty quantification is derived directly from the model, and uncertainty cross-correlations are accounted for. We show experimentally that the FisherNet produces more accurate data reconstructions than a comparable VAE and that its learning performance scales better with the number of latent-space dimensions.
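
For intuition on deriving latent uncertainty from the decoder, consider the textbook case of a Gaussian decoder; the Fisher information metric pulled back to latent space then takes the closed form below (a standard identity, not necessarily the exact FisherNet construction):

```latex
% Gaussian decoder p(x \mid z) = \mathcal{N}\big(f(z), \sigma^2 I\big):
G(z) = \mathbb{E}_{p(x \mid z)}\!\left[ \nabla_z \log p(x \mid z)\, \nabla_z \log p(x \mid z)^{\top} \right]
     = \frac{1}{\sigma^2}\, J_f(z)^{\top} J_f(z),
\qquad J_f(z) = \frac{\partial f}{\partial z}.
```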

4.
Variational inference is an optimization-based method for approximating the posterior distribution of the parameters in Bayesian probabilistic models. A key challenge of variational inference is to approximate the posterior with a distribution that is computationally tractable yet sufficiently expressive. We propose a novel method for generating samples from a highly flexible variational approximation. The method starts with a coarse initial approximation and generates samples by refining it in selected, local regions. This allows the samples to capture dependencies and multi-modality in the posterior, even when these are absent from the initial approximation. We demonstrate theoretically that our method always improves the quality of the approximation (as measured by the evidence lower bound). In experiments, our method consistently outperforms recent variational inference methods in terms of log-likelihood and ELBO across three example tasks: the Eight-Schools example (an inference task in a hierarchical model), training a ResNet-20 (Bayesian inference in a large neural network), and the Mushroom task (posterior sampling in a contextual bandit problem).
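
For reference, the quality measure cited above is the standard evidence lower bound (ELBO) for an approximation $q$ to the posterior $p(\theta \mid x)$:

```latex
\mathrm{ELBO}(q) = \mathbb{E}_{q(\theta)}\big[\log p(x, \theta) - \log q(\theta)\big]
                 = \log p(x) - \mathrm{KL}\big(q(\theta)\,\|\,p(\theta \mid x)\big)
                 \le \log p(x).
```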

5.
Autoencoders are commonly used in representation learning. They consist of an encoder and a decoder, which provide a straightforward method to map n-dimensional data in input space to a lower m-dimensional representation space and back. The decoder itself defines an m-dimensional manifold in input space. Inspired by manifold learning, we show that the decoder can be trained on its own by learning the representations of the training samples along with the decoder weights using gradient descent. A sum-of-squares loss then corresponds to optimizing the manifold to have the smallest Euclidean distance to the training samples, and similarly for other loss functions. We derive expressions for the number of samples needed to specify the encoder and decoder and show that the decoder generally requires far fewer training samples to be well specified than the encoder. We discuss the training of autoencoders from this perspective and relate it to previous work in the field that uses noisy training examples and other types of regularization. On the natural image data sets MNIST and CIFAR10, we demonstrate that the decoder is much better suited to learning a low-dimensional representation, especially when trained on small data sets. Using simulated gene regulatory data, we further show that the decoder alone leads to better generalization and meaningful representations. Our approach of training the decoder alone facilitates representation learning even on small data sets and can lead to improved training of autoencoders. We hope that the simple analyses presented will also contribute to an improved conceptual understanding of representation learning.
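
A minimal sketch of decoder-only training as described above, assuming a sum-of-squares loss: one learnable latent code is stored per training sample and optimized jointly with the decoder weights (sizes, optimizer settings, and the placeholder data are assumptions):

```python
import torch
import torch.nn as nn

n_samples, d_lat, d_in = 1000, 16, 784
decoder = nn.Sequential(nn.Linear(d_lat, 256), nn.ReLU(), nn.Linear(256, d_in))
codes = nn.Embedding(n_samples, d_lat)  # one learnable z per training sample

opt = torch.optim.Adam(list(decoder.parameters()) + list(codes.parameters()), lr=1e-3)
X = torch.rand(n_samples, d_in)  # placeholder training data

for step in range(1000):
    idx = torch.randint(0, n_samples, (64,))
    x_hat = decoder(codes(idx))                       # decode the current codes
    loss = ((x_hat - X[idx]) ** 2).sum(dim=1).mean()  # sum-of-squares loss
    opt.zero_grad(); loss.backward(); opt.step()
```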

6.
周勤  王远军 《波谱学杂志》2022,39(3):291-302
To address the low accuracy of deep-learning-based pairwise registration methods and the long runtime of traditional registration algorithms, this paper proposes an unsupervised, end-to-end groupwise registration framework based on variational inference, with a loss built on local normalized cross-correlation (NCC) and a prior. The framework registers multiple images into a common space, effectively regularizes the deformation fields, and requires neither ground-truth deformation fields nor a reference image. The estimated deformation field is modeled as a probabilistic generative model and solved via variational inference; a spatial transformer network and the loss function then enable training in an unsupervised manner. On the 3D brain magnetic resonance registration task of the public LPBA40 dataset, test results show that the proposed method achieves better Dice scores, shorter runtime, and better diffeomorphic deformation fields than the baseline methods, while remaining robust to noise.  相似文献
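
For illustration, a normalized cross-correlation similarity in PyTorch is sketched below; note that the paper uses a local (windowed) NCC over 3D volumes, so this simplified global form only conveys the idea:

```python
import torch

def ncc(a, b, eps=1e-8):
    """Global normalized cross-correlation between two image batches."""
    a, b = a.flatten(1), b.flatten(1)
    a = a - a.mean(dim=1, keepdim=True)
    b = b - b.mean(dim=1, keepdim=True)
    return (a * b).sum(dim=1) / (a.norm(dim=1) * b.norm(dim=1) + eps)

# Usage as a loss: values lie in [-1, 1], so minimize 1 - ncc(warped, target).
```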

7.
姚雯  赵桂萍  王双虎 《计算物理》2007,24(5):512-518
A variational method is introduced on the basis of the unified coordinate system. By considering the orthogonality, smoothness, and density distribution of the adaptive grid, an elliptic equation for the velocity coefficient h is obtained, so that the value of h can be controlled freely on the boundary to meet the needs of different physical problems. Numerical examples show that the application of the variational method within the unified coordinate system is feasible and that the physical requirements on the boundary can be satisfied.  相似文献

8.
Distributed training across several quantum computers could significantly reduce training time, and sharing the learned model rather than the data could improve data privacy, since training happens where the data are located. One potential scheme for achieving this is federated learning (FL), in which several clients or local nodes learn on their own data and a central node aggregates the models collected from those local nodes. However, to the best of our knowledge, no work has yet been done on quantum machine learning (QML) in a federated setting. In this work, we present federated training of hybrid quantum-classical machine learning models, although our framework can be generalized to purely quantum models. Specifically, we consider a quantum neural network (QNN) coupled with a classical pre-trained convolutional model. Our distributed federated learning scheme achieves nearly the same trained model accuracy while offering significantly faster distributed training. This points to a promising research direction for both scaling and privacy.
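
The aggregation step at the central node is typically federated averaging (FedAvg); a minimal sketch, assuming floating-point parameters and weighting each client by its local dataset size:

```python
import copy

def fed_avg(client_states, weights):
    """Weighted average of client model state_dicts (FedAvg aggregation).

    client_states: list of state_dicts from the local nodes.
    weights: e.g. local dataset sizes.
    """
    total = sum(weights)
    avg = copy.deepcopy(client_states[0])
    for key in avg:
        avg[key] = sum(w * s[key] for s, w in zip(client_states, weights)) / total
    return avg  # load into the global model with model.load_state_dict(avg)
```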

9.
The initial field has a crucial influence on numerical weather prediction (NWP). Data assimilation (DA) is a reliable method for obtaining the initial field of the forecast model. At the same time, data are carriers of information, and observational data are a concrete representation of information: DA can also be viewed as a process of organizing observational data during which entropy gradually decreases. Four-dimensional variational assimilation (4D-Var) is the most popular approach. However, due to the complexity of the physical model, the tangent linear and adjoint models, and other processes, implementing a 4D-Var system is complicated and computationally expensive. Machine learning (ML) obtains simulation results by training on large amounts of data. It has achieved remarkable success in various applications, and operational NWP and DA are no exception. In this work, we synthesize insights and techniques from previous studies to design a purely data-driven 4D-Var implementation framework named ML-4DVAR, based on a bilinear neural network (BNN). The framework replaces the traditional physical model with the BNN model for prediction. Moreover, it directly uses the ML model obtained from simulation data to implement the primary processes of 4D-Var, including the short-term forecast process and the tangent linear and adjoint models. We test a strong-constraint 4D-Var system with the Lorenz-96 model and compare the traditional 4D-Var system with ML-4DVAR. The experimental results demonstrate that the ML-4DVAR framework achieves better assimilation results and significantly improves computational efficiency.
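
The test system referenced above is the Lorenz-96 model; a standard fourth-order Runge-Kutta step is sketched below (F = 8 is the usual chaotic setting, and the step size is an assumption, not the paper's configuration):

```python
import numpy as np

def lorenz96_step(x, F=8.0, dt=0.05):
    """One RK4 step of Lorenz-96: dx_i/dt = (x_{i+1} - x_{i-2}) x_{i-1} - x_i + F."""
    def f(x):
        return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + F
    k1 = f(x)
    k2 = f(x + 0.5 * dt * k1)
    k3 = f(x + 0.5 * dt * k2)
    k4 = f(x + dt * k3)
    return x + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
```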

10.
Xue-Yi Guo 《中国物理 B》2023,32(1):10307-010307
Quantum computers promise to solve the finite-temperature properties of quantum many-body systems, which is generally challenging for classical computers due to the high computational complexity. Here, we report experimental preparations of Gibbs states and excited states of Heisenberg $XX$ and $XXZ$ models using a 5-qubit programmable superconducting processor. In the experiments, we apply a hybrid quantum-classical algorithm to generate finite-temperature states with classical probability models and variational quantum circuits. We show that the Hamiltonians can be fully diagonalized with optimized quantum circuits, which enables us to prepare excited states at arbitrary energy density. We demonstrate that the approach has a self-verifying feature and can estimate fundamental thermal observables with small statistical error. Based on numerical results, we further show that the time complexity of our approach scales polynomially in the number of qubits, revealing its potential for solving large-scale problems.
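
The usual objective for variationally preparing a Gibbs state at temperature $T$ is the free energy of a parameterized state $\rho_\theta$ (a standard formulation, not necessarily the exact cost function used in the paper); the Gibbs state $\rho \propto e^{-H/T}$ is its unique minimizer:

```latex
F(\theta) = \operatorname{Tr}\big[\rho_\theta H\big] - T\, S(\rho_\theta),
\qquad S(\rho) = -\operatorname{Tr}\big[\rho \ln \rho\big].
```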

11.
Recent advances in statistical inference have significantly expanded the toolbox of probabilistic modeling. Historically, probabilistic modeling has been constrained to very restricted model classes, where exact or approximate probabilistic inference is feasible. However, developments in variational inference, a general form of approximate probabilistic inference that originated in statistical physics, have enabled probabilistic modeling to overcome these limitations: (i) Approximate probabilistic inference is now possible over a broad class of probabilistic models containing a large number of parameters, and (ii) scalable inference methods based on stochastic gradient descent and distributed computing engines allow probabilistic modeling to be applied to massive data sets. One important practical consequence of these advances is the possibility of including deep neural networks within probabilistic models, thereby capturing complex non-linear stochastic relationships between the random variables. These advances, in conjunction with the release of novel probabilistic modeling toolboxes, have greatly expanded the scope of applications of probabilistic models and allowed the models to take advantage of the recent strides made by the deep learning community. In this paper, we provide an overview of the main concepts, methods, and tools needed to use deep neural networks within a probabilistic modeling framework.
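
The device that lets stochastic gradient methods flow through such models is the reparameterization trick; a self-contained toy sketch that fits a diagonal Gaussian to a Gaussian target by stochastic gradient ascent on the ELBO (target, learning rate, and step count are arbitrary choices):

```python
import math
import torch

mu = torch.zeros(2, requires_grad=True)
logvar = torch.zeros(2, requires_grad=True)

def log_p(z):  # unnormalized log-density of the target, here N(1, 0.5^2) per dim
    return -((z - 1.0) ** 2 / 0.5).sum()

for _ in range(500):
    eps = torch.randn_like(mu)
    z = mu + eps * (0.5 * logvar).exp()        # reparameterization trick
    entropy = 0.5 * (logvar + 1.0 + math.log(2 * math.pi)).sum()
    elbo = log_p(z) + entropy                  # single-sample ELBO estimate
    elbo.backward()
    with torch.no_grad():
        mu += 0.05 * mu.grad; logvar += 0.05 * logvar.grad
        mu.grad.zero_(); logvar.grad.zero_()
```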

12.
In the last few decades, text mining has been used to extract knowledge from free text. Applying neural networks and deep learning to natural language processing (NLP) tasks has led to many accomplishments on real-world language problems over the years. The developments of the last five years have produced techniques that allow the practical application of transfer learning in NLP. The advances in the field have been substantial, and the milestone of outperforming the human baseline on the General Language Understanding Evaluation (GLUE) benchmark has been achieved. This paper presents a targeted literature review to outline, describe, explain, and put into context the crucial techniques that helped achieve this milestone. The research presented here is a targeted review of neural language models that represent vital steps towards a general language representation model.

13.
In recent years, deep learning has been applied to intelligent fault diagnosis with great success. However, deep-learning-based fault diagnosis methods assume that the training dataset and the test dataset are obtained under the same operating conditions, a condition that can hardly be met in real application scenarios. Additionally, signal preprocessing has an important influence on intelligent fault diagnosis, and how to effectively couple signal preprocessing to a transfer diagnostic model remains a challenge. To solve these problems, we propose a novel deep transfer learning method for intelligent fault diagnosis based on Variational Mode Decomposition (VMD) and Efficient Channel Attention (ECA). In the proposed method, VMD adaptively matches the optimal center frequency and finite bandwidth of each mode to achieve effective separation of signals. To fuse the mode features more effectively after VMD decomposition, ECA is used to learn channel attention. The experimental results show that the proposed signal preprocessing and feature fusion module increases the accuracy and generality of the transfer diagnostic model. Moreover, we comprehensively analyze and compare our method with state-of-the-art methods at different noise levels, and the results show that our method has better robustness and generalization performance.
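
A minimal sketch of an ECA block over 1-D signal features, as one plausible way to fuse the VMD mode channels; the kernel size and tensor layout are assumptions, not the paper's exact configuration:

```python
import torch
import torch.nn as nn

class ECA(nn.Module):
    """Efficient Channel Attention: per-channel weights from a cheap 1-D conv."""
    def __init__(self, k=3):
        super().__init__()
        self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)

    def forward(self, x):                          # x: (N, C, L), C = mode channels
        y = x.mean(dim=-1)                         # squeeze: global average pooling
        y = self.conv(y.unsqueeze(1)).squeeze(1)   # local cross-channel interaction
        return x * torch.sigmoid(y).unsqueeze(-1)  # excite: rescale each channel
```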

14.
Information plane analysis, which tracks the mutual information between the input and a hidden layer and between a hidden layer and the target over time, has recently been proposed as a way to analyze the training of neural networks. Since the activations of a hidden layer are typically continuous-valued, this mutual information cannot be computed analytically and must be estimated, resulting in apparently inconsistent or even contradictory results in the literature. The goal of this paper is to demonstrate how information plane analysis can still be a valuable tool for analyzing neural network training. To this end, we complement the prevailing binning estimator of mutual information with a geometric interpretation. With this geometric interpretation in mind, we evaluate the impact of regularization and interpret phenomena such as underfitting and overfitting. In addition, we investigate neural network learning in the presence of noisy data and noisy labels.
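
A minimal sketch of the binning estimator discussed above, for two scalar variables; real information plane analyses bin entire activation vectors, and the bin count is an assumption:

```python
import numpy as np

def mi_binned(x, y, bins=30):
    """Estimate I(X;Y) in nats from a 2-D histogram (binning estimator)."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy /= pxy.sum()                              # joint distribution over bins
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)     # marginals
    nz = pxy > 0                                  # avoid log(0) on empty bins
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px[:, None] * py[None, :])[nz])))
```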

15.
Image steganography is a scheme that hides secret information in a cover image without being perceived. Most existing steganography methods focus on the visual similarity between the stego image and the cover image and ignore the recovery accuracy of the secret information. In this paper, a steganography method based on invertible neural networks is proposed that can generate stego images with high invisibility and security and achieve lossless recovery of the secret information. In addition, the paper introduces a mapping module that compresses the information actually embedded, improving the quality of the stego image and its anti-detection ability. To recover the message without loss, the secret information is converted into a binary sequence and embedded in the cover image through the forward operation of the invertible neural network; it is then recovered from the stego image through the network's inverse operation. Experimental results show that the proposed method achieves competitive results in the visual quality and security of stego images and 100% accuracy in information extraction.
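
The building block that makes the forward and inverse passes exactly consistent is typically an affine coupling layer; a minimal sketch on flat vectors (real INN-based steganography stacks many such blocks over image tensors, and the sizes are assumptions):

```python
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    """Invertible block: transform one half of the input conditioned on the other."""
    def __init__(self, d):                          # d must be even
        super().__init__()
        self.net = nn.Sequential(nn.Linear(d // 2, 64), nn.ReLU(),
                                 nn.Linear(64, d))  # outputs (log-scale, shift)

    def forward(self, x):
        x1, x2 = x.chunk(2, dim=1)
        s, t = self.net(x1).chunk(2, dim=1)
        return torch.cat([x1, x2 * s.exp() + t], dim=1)

    def inverse(self, y):                           # exact inverse, no information loss
        y1, y2 = y.chunk(2, dim=1)
        s, t = self.net(y1).chunk(2, dim=1)
        return torch.cat([y1, (y2 - t) * (-s).exp()], dim=1)
```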

16.
To overcome overfitting in machine-learning-based single-well production prediction and to improve prediction accuracy in oilfield development, a single-well reservoir production prediction model based on a conditional generative adversarial network (CGAN) is proposed. The model builds its generator and discriminator from basic neural networks such as long short-term memory (LSTM) and fully connected layers. Conditioned on production-influencing factors, the generator produces predicted production data, and a log-loss function evaluates the deviation between the predicted and real data. Through the adversarial training of the CGAN, combined with a Bayesian hyperparameter optimization algorithm, the model structure is optimized and the model's generalization ability is improved. A single-well production database covering different geological and production conditions under the same well pattern was built with the Eclipse numerical simulation software, and single-well production was predicted with geological and production conditions and other production-influencing factors as the model's conditional inputs. The results show that, compared with the fully connected neural network (FCNN), random forest (RF), and LSTM models, the CGAN model improves the mean absolute percentage error on the test set by 2.59%, 0.81%, and 1.72%, respectively, and has the smallest overfitting ratio (1.027). This shows that the CGAN reduces the overfitting of machine learning production prediction models, improves generalization and prediction accuracy, and verifies the superiority of the proposed algorithm, which is significant for guiding efficient oilfield development and safeguarding national energy security.  相似文献
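
A minimal sketch of one CGAN training step as described above, assuming generator G and discriminator D are modules over flat vectors and D ends in a sigmoid; the names and latent size are illustrative, not the paper's implementation:

```python
import torch
import torch.nn as nn

bce = nn.BCELoss()  # the log-loss scoring real vs. generated production data

def cgan_step(G, D, cond, real, opt_g, opt_d, d_lat=32):
    """One adversarial update; G maps (noise, condition) -> production curve."""
    n = real.size(0)
    fake = G(torch.cat([torch.randn(n, d_lat), cond], dim=1))

    # Discriminator: push real (with its conditions) toward 1, fake toward 0.
    d_loss = bce(D(torch.cat([real, cond], dim=1)), torch.ones(n, 1)) + \
             bce(D(torch.cat([fake.detach(), cond], dim=1)), torch.zeros(n, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator: try to fool the discriminator.
    g_loss = bce(D(torch.cat([fake, cond], dim=1)), torch.ones(n, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```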

17.
The advancement of sensing technologies, coupled with rapid progress in big data analysis, has ushered in a new era of intelligent transport and smart city applications. In this context, transportation mode detection (TMD) of mobile users is a field that has gained significant traction in recent years. In this paper, we present a deep learning approach for transportation mode detection using multimodal sensor data elicited from user smartphones. The approach is based on long short-term memory (LSTM) networks and Bayesian optimization of their parameters. We conducted an extensive experimental evaluation of the proposed approach, which attains very high recognition rates compared with a multitude of machine learning approaches, including state-of-the-art methods. We also discuss issues regarding feature correlation and the impact of dimensionality reduction.
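
A minimal sketch of an LSTM classifier over windows of smartphone sensor data; the channel count, hidden size, and number of transport modes are assumptions (exactly the kind of hyperparameters the Bayesian optimization step would tune):

```python
import torch
import torch.nn as nn

class TMDNet(nn.Module):
    """LSTM classifier over windows of multimodal sensor channels."""
    def __init__(self, n_channels=9, hidden=64, n_modes=5):
        super().__init__()
        self.lstm = nn.LSTM(n_channels, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_modes)

    def forward(self, x):            # x: (batch, time, channels)
        _, (h, _) = self.lstm(x)     # final hidden state summarizes the window
        return self.head(h[-1])      # logits over transport modes
```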

18.
张弦  王宏力 《物理学报》2011,60(11):110201-110201
To address the network structure design of the regularized extreme learning machine (RELM) for chaotic time series prediction, an incremental RELM training algorithm based on Cholesky factorization is proposed. The algorithm automatically determines the optimal RELM network structure by adding hidden-layer neurons one at a time and computes the output weights via Cholesky factorization, effectively reducing the computational cost of the incremental process. Chaotic time series prediction examples show that the algorithm automatically finds the optimal RELM network structure with high computational efficiency, and the RELM prediction model trained with it achieves high prediction accuracy, making it well suited to chaotic time series prediction. Keywords: neural networks; extreme learning machine; chaotic time series; time series prediction  相似文献
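
A minimal sketch of the Cholesky-based RELM output-weight solve for a given hidden-layer output matrix H and target matrix T; the paper's incremental per-neuron update of the factorization is not reproduced here:

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def relm_output_weights(H, T, lam=1e-3):
    """beta = (H^T H + lam*I)^{-1} H^T T via Cholesky factorization.

    H^T H + lam*I is symmetric positive definite, so the Cholesky solve is
    cheaper and more stable than forming a general inverse.
    """
    A = H.T @ H + lam * np.eye(H.shape[1])
    c, low = cho_factor(A)
    return cho_solve((c, low), H.T @ T)
```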

19.
Jerne's model of the immune system, recently formulated as a neural network by Weisbuch and Atlan, is generalized to interactions with continuous coupling coefficients. It is shown that even the extended model can be solved analytically, without the aid of computer simulations, and exhibits one additional attractor, which corresponds to a configuration with high concentrations of active killer cells that eventually causes the death of the organism.

20.
Deep Neural Networks (DNNs) usually work in an end-to-end manner. This makes trained DNNs easy to use, but their decision process remains opaque for every test case. Unfortunately, the interpretability of decisions is crucial in some scenarios, such as medical or financial data mining and decision-making. In this paper, we propose a Tree-Network-Tree (TNT) learning framework for explainable decision-making, in which knowledge is alternately transferred between the tree model and DNNs. Specifically, the proposed TNT framework exploits the advantages of different models at different stages: (1) a novel James–Stein Decision Tree (JSDT) is proposed to generate better knowledge representations for the DNNs, especially when the input data are low-frequency or low-quality; (2) the DNNs output high-performing predictions from the knowledge-embedding inputs and behave as a teacher model for the following tree model; and (3) a novel distillable Gradient Boosted Decision Tree (dGBDT) is proposed to learn interpretable trees from the soft labels and make predictions comparable to those of the DNNs. Extensive experiments on various machine learning tasks demonstrate the effectiveness of the proposed method.
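
The tree-learning step (3) rests on soft-label distillation; a minimal sketch of the standard soft-label objective (temperature and mixing weight are assumptions, and in the paper the student is a boosted tree rather than a network):

```python
import torch.nn.functional as F

def distill_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """Match the teacher's softened distribution plus hard-label cross-entropy."""
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction="batchmean") * (T * T)  # rescale gradient magnitude
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```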
