Similar Articles
1.
Distributed training across several quantum computers could significantly reduce training time, and sharing the learned model rather than the data could also improve data privacy, since training would happen where the data are located. One potential scheme to achieve this is federated learning (FL), in which several clients or local nodes learn on their own data and a central node aggregates the models collected from those local nodes. However, to the best of our knowledge, no work has yet been done on quantum machine learning (QML) in a federated setting. In this work, we present federated training of hybrid quantum-classical machine learning models, although our framework could be generalized to purely quantum models. Specifically, we consider a quantum neural network (QNN) coupled with a classical pre-trained convolutional model. Our distributed federated learning scheme reaches almost the same trained-model accuracy while training significantly faster, pointing to a promising research direction for scaling and privacy.
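As a rough illustration of the federated scheme described above, here is a minimal classical federated-averaging sketch (not the authors' hybrid quantum-classical implementation): each client trains locally on its own data and the central node averages the returned weights. The client data and model below are placeholders.

```python
# Minimal federated-averaging (FedAvg) sketch; a classical stand-in for the hybrid QNN setup.
import numpy as np

def local_train(weights, X, y, lr=0.1, epochs=5):
    """One client's local training: plain logistic-regression gradient steps."""
    w = weights.copy()
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-X @ w))        # predictions on the client's private data
        w -= lr * X.T @ (p - y) / len(y)        # gradient step
    return w

def federated_round(global_w, client_data):
    """Each client trains locally; the central node averages the returned weights."""
    local_ws = [local_train(global_w, X, y) for X, y in client_data]
    return np.mean(local_ws, axis=0)            # model aggregation, not data aggregation

rng = np.random.default_rng(0)
clients = []
for _ in range(3):                              # three local nodes with private data
    X = rng.normal(size=(100, 4))
    y = (X[:, 0] + X[:, 1] > 0).astype(float)
    clients.append((X, y))

w = np.zeros(4)
for _ in range(10):                             # communication rounds
    w = federated_round(w, clients)
print("aggregated weights:", w)
```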

2.
Access to healthcare data such as electronic health records (EHR) is often restricted by laws established to protect patient privacy. These restrictions hinder the reproducibility of existing results based on private healthcare data and also limit new research. Synthetically generated healthcare data solve this problem by preserving privacy and enabling researchers and policymakers to drive decisions and methods based on realistic data. Healthcare data can include information about multiple in- and out-patient visits, making them time-series data that are often influenced by protected attributes such as age, gender, and race. The COVID-19 pandemic has exacerbated health inequities, with certain subgroups experiencing poorer outcomes and less access to healthcare. To combat these inequities, synthetic data must "fairly" represent diverse minority subgroups so that conclusions drawn on synthetic data are correct and the results generalize to real data. In this article, we develop two fairness metrics for synthetic data and apply them to all subgroups defined by protected attributes to analyze the bias in three published synthetic research datasets. These covariate-level disparity metrics revealed that synthetic data may not be representative at the univariate and multivariate subgroup levels; thus, fairness should be addressed when developing data generation methods. We discuss the need for measuring fairness in synthetic healthcare data to enable the development of robust machine learning models and more equitable synthetic healthcare datasets.
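A minimal sketch of one way such a covariate-level representation check could be computed; the exact metrics from the article are not reproduced, and the function below (which simply compares subgroup frequencies between real and synthetic data) is an illustrative assumption.

```python
# Hypothetical subgroup-representation check: compare how often each protected
# subgroup appears in the real vs. the synthetic dataset.
import pandas as pd

def representation_disparity(real: pd.DataFrame, synthetic: pd.DataFrame, attrs):
    """Absolute difference in subgroup frequency, per combination of protected attributes."""
    real_p = real.groupby(attrs).size() / len(real)
    synth_p = synthetic.groupby(attrs).size() / len(synthetic)
    return real_p.subtract(synth_p, fill_value=0.0).abs().sort_values(ascending=False)

real = pd.DataFrame({"gender": ["F", "F", "M", "M", "F"],
                     "race":   ["A", "B", "A", "B", "B"]})
synthetic = pd.DataFrame({"gender": ["F", "M", "M", "M", "M"],
                          "race":   ["A", "A", "A", "B", "B"]})
print(representation_disparity(real, synthetic, ["gender", "race"]))
```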

3.
Science based on unified concepts of matter at the nanoscale provides a new foundation for knowledge creation, innovation, and technology integration. Convergent new technologies refer to the synergistic combination of nanotechnology, biotechnology, information technology, and cognitive sciences (NBIC), each of which is currently progressing rapidly, experiencing qualitative advancements, and interacting with more established fields such as mathematics and environmental technologies (Roco & Bainbridge, 2002). Converging technologies are expected to bring tremendous improvements in transforming tools and in new products and services, to enhance personal abilities and social achievements, and to reshape societal relationships. After a brief overview of the general implications of converging new technologies, this paper focuses on their effects on R&D policies and business models as part of changing societal relationships. These R&D policies will have implications for investments in research and industry, with the main goal of taking advantage of the transformative development of NBIC. Converging technologies must be introduced with attention to immediate concerns (privacy, toxicity of new materials, etc.) and to longer-term concerns including human integrity, dignity, and welfare. The efficient introduction and development of converging new technologies will require new organizations and business models, as well as ways of preparing the economy, such as multifunctional research facilities, integrative technology platforms, and global risk governance.

4.
Science based on unified concepts of matter at the nanoscale provides a new foundation for knowledge creation, innovation, and technology integration. Convergent new technologies refer to the synergistic combination of nanotechnology, biotechnology, information technology, and cognitive sciences (NBIC), each of which is currently progressing rapidly, experiencing qualitative advancements, and interacting with more established fields such as mathematics and environmental technologies (Roco & Bainbridge, 2002). Converging technologies are expected to bring tremendous improvements in transforming tools and in new products and services, to enhance personal abilities and social achievements, and to reshape societal relationships. After a brief overview of the general implications of converging new technologies, this paper focuses on their effects on R&D policies and business models as part of changing societal relationships. These R&D policies will have implications for investments in research and industry, with the main goal of taking advantage of the transformative development of NBIC. Converging technologies must be introduced with attention to immediate concerns (privacy, toxicity of new materials, etc.) and to longer-term concerns including human integrity, dignity, and welfare. The efficient introduction and development of converging new technologies will require new organizations and business models, as well as ways of preparing the economy, such as multifunctional research facilities, integrative technology platforms, and global risk governance. (*) This is an extension of the presentation made at the Converging Technologies Conference, February 26, 2004, New York. This revised version was published online in August 2005 with a corrected issue number.

5.
In many decision-making scenarios, ranging from recreational activities to healthcare and policing, the use of artificial intelligence coupled with the ability to learn from historical data is becoming ubiquitous. This widespread adoption of automated systems is accompanied by increasing concerns regarding their ethical implications. Fundamental rights, such as the rights to the preservation of privacy, to non-discrimination based on sensitive attributes (e.g., gender, ethnicity, political/sexual orientation), and to an explanation for a decision, are undermined daily by the use of increasingly complex and less understandable, yet more accurate, learning algorithms. In this work, we therefore work toward the development of systems able to ensure trustworthiness by delivering privacy, fairness, and explainability by design. In particular, we show that it is possible to learn from data while simultaneously preserving the privacy of individuals thanks to Homomorphic Encryption, ensuring fairness by learning a fair representation from the data, and ensuring explainable decisions with local and global explanations, without compromising the accuracy of the final models. We test our approach on a widespread but still controversial application, namely face recognition, using the recent FairFace dataset to prove the validity of our approach.

6.
Recent advances in artificial intelligence (AI) have led to its widespread industrial adoption, with machine learning systems demonstrating superhuman performance in a significant number of tasks. However, this surge in performance has often been achieved through increased model complexity, turning such systems into "black box" approaches and causing uncertainty regarding the way they operate and, ultimately, the way they come to decisions. This ambiguity has made it problematic for machine learning systems to be adopted in sensitive yet critical domains where their value could be immense, such as healthcare. As a result, scientific interest in Explainable Artificial Intelligence (XAI), a field concerned with the development of new methods that explain and interpret machine learning models, has been tremendously reignited in recent years. This study focuses on machine learning interpretability methods; more specifically, it presents a literature review and taxonomy of these methods, as well as links to their programming implementations, in the hope that this survey will serve as a reference point for both theorists and practitioners.

7.
The adoption of deep learning models within safety-critical systems cannot rely on good prediction performance alone; it also requires interpretable and robust explanations for their decisions. When modeling complex sequences, attention mechanisms are regarded as the established approach to giving deep neural networks intrinsic interpretability. This paper focuses on the emerging trend of specifically designing diagnostic datasets for understanding the inner workings of attention-based deep learning models for multivariate forecasting tasks. We design a novel benchmark of synthetic datasets with a transparent underlying generating process of multiple interacting time series of increasing complexity. The benchmark enables empirical evaluation of attention-based deep neural networks in three aspects: (i) prediction performance, (ii) interpretability correctness, and (iii) sensitivity analysis. Our analysis shows that although most models have satisfactory and stable prediction performance, they often fail to give correct interpretability. The only model with both a satisfactory performance score and correct interpretability is IMV-LSTM, which captures both autocorrelations and cross-correlations between multiple time series. Interestingly, when evaluating IMV-LSTM on simulated data from statistical and mechanistic models, the correctness of its interpretability increases with more complex datasets.
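A toy sketch of the kind of transparent generating process such a diagnostic benchmark can use: one target series depends on specific lags of the other series, so the "correct" attention pattern is known by construction. The datasets of the paper are not reproduced; the lags and coefficients below are illustrative.

```python
# Synthetic multivariate series with a known generating process:
# y[t] depends on x1[t-1] and x2[t-3], so a faithful interpretability method
# should attribute importance to exactly those lags.
import numpy as np

rng = np.random.default_rng(42)
T = 500
x1 = rng.normal(size=T)
x2 = rng.normal(size=T)
y = np.zeros(T)
for t in range(3, T):
    y[t] = 0.8 * x1[t - 1] - 0.5 * x2[t - 3] + 0.1 * rng.normal()

data = np.stack([x1, x2, y], axis=1)   # shape (T, 3): two drivers plus the target
print(data.shape, "ground-truth drivers: x1 at lag 1, x2 at lag 3")
```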

8.
Deep learning is currently the best pattern-recognition tool and is expected to help nuclear physicists find the features most relevant to a given piece of physics in large volumes of complex data. This paper reviews the categories of deep learning techniques, the optimal neural network architectures for different data structures, the interpretability of black-box models, and the uncertainty of their predictions. It surveys applications of deep learning to the nuclear equation of state, nuclear structure, nuclear masses, decay, and fission, and shows how to train a neural network to predict nuclear masses. We find that a neural network model trained on experimental data has good predictive power for experimental data not included in the training. When extrapolating beyond the existing experimental data, the network's predictions for the masses of neutron-rich light nuclei deviate considerably from the macroscopic-microscopic liquid-drop model. New physics not contained in the macroscopic-microscopic liquid-drop model may exist in this region, and further experimental data are needed for verification.
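To make the "train a network to predict nuclear masses" step concrete, here is a minimal sketch in which a small multilayer perceptron maps proton and neutron numbers (Z, N) to a mass-related target. The data below are random placeholders, not real mass-table values.

```python
# Minimal sketch: a small MLP mapping (Z, N) to a mass-related quantity.
# The training data are random placeholders, not the experimental mass table.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
Z = rng.integers(2, 100, size=500)
N = rng.integers(2, 150, size=500)
target = 0.5 * Z + 0.6 * N + rng.normal(scale=2.0, size=500)  # stand-in for mass/binding energy

X = np.column_stack([Z, N]).astype(float)
model = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000, random_state=0)
model.fit(X[:400], target[:400])                              # train on "known" nuclei
print("held-out R^2:", model.score(X[400:], target[400:]))    # check on nuclei not seen in training
```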

9.
Nonalcoholic fatty liver disease (NAFLD) is the hepatic manifestation of metabolic syndrome and the most common cause of chronic liver disease in developed countries. Certain conditions, including mild inflammation biomarkers, dyslipidemia, and insulin resistance, can trigger progression to nonalcoholic steatohepatitis (NASH), a condition characterized by inflammation and liver cell damage. We demonstrate the usefulness of machine learning with a case study analyzing the most important features in random forest (RF) models for predicting patients at risk of developing NASH. We collected data from patients who attended the Cardiovascular Risk Unit of Mostoles University Hospital (Madrid, Spain) from 2005 to 2021. We reviewed electronic health records to assess the presence of NASH, which was used as the outcome. We chose RF as the algorithm, developed six models using different pre-processing strategies, and evaluated performance metrics to choose an optimized model. Finally, several interpretability techniques, such as feature importance, the contribution of each feature to predictions, and partial dependence plots, were used to understand and explain the model and obtain a better understanding of machine learning-based predictions. In total, 1525 patients met the inclusion criteria. The mean age was 57.3 years, and 507 patients had NASH (prevalence of 33.2%). Filter methods (the chi-square and Mann–Whitney–Wilcoxon tests) did not produce additional insight in terms of interactions, contributions, or relationships among variables and their outcomes. The random forest models correctly classified patients with NASH with an accuracy of 0.87 for the best model and 0.79 for the worst one. Four features were the most relevant: insulin resistance, ferritin, serum insulin levels, and triglycerides. The contribution of each feature was assessed via partial dependence plots. Random forest-based modeling demonstrated that machine learning can improve interpretability, provide an understanding of the modeled behavior, and show how much individual features contribute to predictions.
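A compact sketch of the kind of workflow described above: fit a random forest, inspect feature importances, and compute a partial dependence for one feature. Synthetic stand-in data are used; the clinical variables from the study are not reproduced.

```python
# Sketch: random forest + feature importance + partial dependence on synthetic data.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import partial_dependence

rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 4))                     # stand-ins for clinical features
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=1000) > 0).astype(int)

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
print("feature importances:", rf.feature_importances_)

pd_result = partial_dependence(rf, X, features=[0])  # partial dependence of the first feature
print("partial-dependence grid shape:", pd_result["average"].shape)
```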

10.
Deep Neural Networks (DNNs) usually work in an end-to-end manner. This makes trained DNNs easy to use, but their decision process remains opaque for every test case. Unfortunately, interpretability of decisions is crucial in some scenarios, such as medical or financial data mining and decision-making. In this paper, we propose a Tree-Network-Tree (TNT) learning framework for explainable decision-making, where knowledge is alternately transferred between a tree model and DNNs. Specifically, the proposed TNT framework exploits the advantages of different models at different stages: (1) a novel James–Stein Decision Tree (JSDT) is proposed to generate better knowledge representations for DNNs, especially when the input data are low-frequency or low-quality; (2) the DNNs output high-performing predictions from the knowledge-embedding inputs and act as a teacher model for the following tree model; and (3) a novel distillable Gradient Boosted Decision Tree (dGBDT) is proposed to learn interpretable trees from the soft labels and make predictions comparable to those of the DNNs. Extensive experiments on various machine learning tasks demonstrate the effectiveness of the proposed method.
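The network-to-tree step can be illustrated with a plain distillation sketch: a neural network teacher produces soft labels (predicted probabilities), and a gradient-boosted tree student is fit to them. This is a generic sketch, not the JSDT/dGBDT components proposed in the paper.

```python
# Generic distillation sketch: a DNN teacher's soft labels train a tree-based student.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 10))
y = (X[:, 0] * X[:, 1] > 0).astype(int)

teacher = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=500, random_state=0).fit(X, y)
soft_labels = teacher.predict_proba(X)[:, 1]          # teacher's soft targets

student = GradientBoostingRegressor(n_estimators=200, max_depth=3).fit(X, soft_labels)
hard_preds = (student.predict(X) > 0.5).astype(int)   # interpretable student's decisions
print("student/teacher agreement:", np.mean(hard_preds == teacher.predict(X)))
```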

11.
Background: credit scoring models are an effective tool for banks and other financial institutions to identify potential default borrowers. Credit scoring models based on machine learning methods such as deep learning perform well in terms of the accuracy of default discrimination, but they also have shortcomings, such as many hyperparameters and a heavy dependence on big data, and there is still much room to improve their interpretability and robustness. Methods: the deep forest, or multi-Grained Cascade Forest (gcForest), is a deep tree-based model built on the random forest algorithm. Using multi-grained scanning and cascade processing, gcForest can effectively identify and process high-dimensional feature information; at the same time, it has few hyperparameters and strong robustness. This paper therefore constructs a two-stage hybrid default discrimination model based on multiple feature selection methods and the gcForest algorithm, and optimizes parameters with the lowest type II error as the first criterion and the highest AUC and accuracy as the second and third criteria. GcForest not only reflects the advantages of traditional statistical models in interpretability and robustness but also retains the accuracy advantages of deep learning models. Results: the validity of the hybrid default discrimination model is verified on three real, open credit datasets (Australian, Japanese, and German) from the UCI repository. Conclusions: gcForest outperforms currently popular single classifiers such as ANNs and common ensemble classifiers such as LightGBM and CNNs in type II error, AUC, and accuracy. Comparison with other published results further verifies the robustness and effectiveness of the model.
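To make the "lowest type II error first" selection rule concrete, here is a small sketch of how candidate models could be ranked by type II error, then AUC, then accuracy. Generic scikit-learn classifiers stand in for gcForest, and the data are illustrative.

```python
# Sketch of the selection rule: rank candidates by type II error, then AUC, then accuracy.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.metrics import roc_auc_score, accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, weights=[0.7, 0.3], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

def type2_error(y_true, y_pred):
    """Fraction of actual defaulters (class 1) predicted as non-defaulters."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    return fn / (fn + tp)

candidates = {"rf": RandomForestClassifier(random_state=0),
              "gbdt": GradientBoostingClassifier(random_state=0)}
scores = []
for name, model in candidates.items():
    model.fit(X_tr, y_tr)
    pred = model.predict(X_te)
    proba = model.predict_proba(X_te)[:, 1]
    # negate AUC/accuracy so that sorting ascending applies the three criteria in order
    scores.append((type2_error(y_te, pred), -roc_auc_score(y_te, proba),
                   -accuracy_score(y_te, pred), name))
print("best by (type II error, AUC, accuracy):", sorted(scores)[0][-1])
```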

12.
Private distributed learning studies how multiple distributed entities can collaboratively train a shared deep network without revealing their private data. With the security provided by blind quantum computation protocols, the cooperation between quantum physics and machine learning may open unparalleled prospects for solving private distributed learning tasks. In this paper, we introduce a quantum protocol for distributed learning that is able to utilize the computational power of remote quantum servers while keeping the private data safe. For concreteness, we first introduce a protocol for private single-party delegated training of variational quantum classifiers based on blind quantum computing, and then extend this protocol to multiparty private distributed learning incorporating differential privacy. We carry out extensive numerical simulations with different real-life datasets and encoding strategies to benchmark the effectiveness of our protocol. We find that our protocol is robust to experimental imperfections and, after the incorporation of differential privacy, is secure under the gradient attack. Our results show the potential for handling computationally expensive distributed learning tasks with privacy guarantees, thus providing a valuable guide for exploring quantum advantages from the security perspective in machine learning with real-life applications.
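A minimal classical sketch of the differential-privacy ingredient mentioned above: per-example gradients are clipped and Gaussian noise is added before they leave a party. The blind-quantum-computation part of the protocol is not represented here, and the clipping norm and noise scale are illustrative choices, not the paper's settings.

```python
# Sketch of the differential-privacy step: clip per-example gradients and add
# Gaussian noise before sharing them with other parties.
import numpy as np

def dp_average_gradient(per_example_grads, clip_norm=1.0, noise_sigma=0.5, rng=None):
    """Clip each example's gradient to clip_norm, average, then add Gaussian noise."""
    rng = rng or np.random.default_rng()
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        clipped.append(g * min(1.0, clip_norm / (norm + 1e-12)))
    avg = np.mean(clipped, axis=0)
    return avg + rng.normal(scale=noise_sigma * clip_norm / len(clipped), size=avg.shape)

rng = np.random.default_rng(0)
grads = [rng.normal(size=8) for _ in range(32)]   # one gradient per training example
print(dp_average_gradient(grads, rng=rng))
```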

13.
Data-mining techniques using machine learning are powerful and efficient for materials design, with great potential for discovering new materials with good characteristics. Here, this technique has been applied to composition design for La(Fe,Si/Al)13-based materials, which are regarded as some of the most promising magnetic refrigerants in practice. Three prediction models are built using a machine learning algorithm called gradient boosting regression tree (GBRT) to find the correlation between the Curie temperature (T_C), the maximum magnetic entropy change ((ΔS_M)_max), and chemical composition, all of which yield high accuracy in predicting T_C and (ΔS_M)_max. The coefficients of determination (R²) for the three models are 0.96, 0.87, and 0.91. These results suggest that all of the models are well-developed predictive models with good generalization to untrained data, which can not only provide suggestions for real experiments but also help us gain physical insight to find proper compositions for further magnetic-refrigeration applications.
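A brief sketch of the modeling step described above: a gradient boosting regression tree fit to composition-like features and scored with the coefficient of determination R². Random stand-in data are used; the real composition-T_C dataset is not reproduced.

```python
# Sketch: GBRT regression on composition-like features, scored by R^2.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.uniform(size=(300, 5))                      # stand-ins for fractional compositions
T_C = 200 + 80 * X[:, 0] - 40 * X[:, 1] + rng.normal(scale=5, size=300)  # toy Curie temperature

X_tr, X_te, y_tr, y_te = train_test_split(X, T_C, test_size=0.2, random_state=0)
gbrt = GradientBoostingRegressor(n_estimators=300, learning_rate=0.05, max_depth=3)
gbrt.fit(X_tr, y_tr)
print("test R^2:", gbrt.score(X_te, y_te))          # coefficient of determination
```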

14.
Gang Huang 《中国物理 B》2021,30(8):88802-088802
The digitization, informatization, and intelligentization of physical systems require strong support from big data analysis. However, restrictions on data security and privacy, together with concerns about the cost of big data collection, transmission, and storage, make it difficult to aggregate data in real-world power systems, which directly hinders the effective implementation of smart grid analytics. Federated learning, an advanced distributed learning method proposed by Google, seems a promising solution to these issues. Nevertheless, it relies on a server node to complete model aggregation, and the framework is limited to scenarios where data are independent and identically distributed. We therefore propose a serverless distributed learning platform based on blockchain to solve these two issues. In the proposed platform, the machine learning task is performed according to smart contracts, and encrypted models are aggregated via a knowledge-distillation mechanism. With this method, a server node is no longer required, and the learning ability is no longer limited to independent and identically distributed scenarios. Experiments on a public electrical grid dataset verify the effectiveness of the proposed approach.

15.
This review deals with the restricted Boltzmann machine (RBM) in the light of statistical physics. The RBM is a classical family of machine learning (ML) models which played a central role in the development of deep learning. Viewing it as a spin glass model and exhibiting its various links with other models of statistical physics, we gather recent results dealing with mean-field theory in this context. First, the functioning of the RBM can be analyzed via the phase diagrams obtained for various statistical ensembles of RBMs, leading in particular to the identification of a compositional phase in which a small number of features or modes are combined to form complex patterns. We then discuss recent works that either devise mean-field-based learning algorithms, or reproduce generic aspects of the learning process from ensemble dynamics equations and/or from linear stability arguments.
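For reference, the RBM analyzed in the review is defined, in standard notation not spelled out in the abstract itself, by a joint energy over visible units v and hidden units h:

```latex
E(\mathbf{v},\mathbf{h}) = -\sum_i a_i v_i - \sum_j b_j h_j - \sum_{i,j} v_i W_{ij} h_j,
\qquad
p(\mathbf{v},\mathbf{h}) = \frac{e^{-E(\mathbf{v},\mathbf{h})}}{Z},
\qquad
Z = \sum_{\mathbf{v},\mathbf{h}} e^{-E(\mathbf{v},\mathbf{h})}.
```

In the spin-glass reading discussed above, the couplings W_{ij} are treated as (learned) disorder, and the phase diagram is studied over statistical ensembles of such couplings.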

16.
Reconstructability Analysis (RA) and Bayesian Networks (BN) are both probabilistic graphical modeling methodologies used in machine learning and artificial intelligence. There are RA models that are statistically equivalent to BN models and there are also models unique to RA and models unique to BN. The primary goal of this paper is to unify these two methodologies via a lattice of structures that offers an expanded set of models to represent complex systems more accurately or more simply. The conceptualization of this lattice also offers a framework for additional innovations beyond what is presented here. Specifically, this paper integrates RA and BN by developing and visualizing: (1) a BN neutral system lattice of general and specific graphs, (2) a joint RA-BN neutral system lattice of general and specific graphs, (3) an augmented RA directed system lattice of prediction graphs, and (4) a BN directed system lattice of prediction graphs. Additionally, it (5) extends RA notation to encompass BN graphs and (6) offers an algorithm to search the joint RA-BN neutral system lattice to find the best representation of system structure from underlying system variables. All lattices shown in this paper are for four variables, but the theory and methodology presented in this paper are general and apply to any number of variables. These methodological innovations are contributions to machine learning and artificial intelligence and more generally to complex systems analysis. The paper also reviews some relevant prior work of others so that the innovations offered here can be understood in a self-contained way within the context of this paper.

17.
Information bottleneck (IB) and privacy funnel (PF) are two closely related optimization problems that have found applications in machine learning, the design of privacy algorithms, capacity problems (e.g., Mrs. Gerber's Lemma), and strong data processing inequalities, among others. In this work, we first investigate the functional properties of IB and PF through a unified theoretical framework. We then connect them to three information-theoretic coding problems, namely hypothesis testing against independence, noisy source coding, and dependence dilution. Leveraging these connections, we prove a new cardinality bound on the auxiliary variable in IB, making its computation more tractable for discrete random variables. In the second part, we introduce a general family of optimization problems, termed "bottleneck problems", by replacing mutual information in IB and PF with other notions of mutual information, namely f-information and Arimoto's mutual information. We then argue that, unlike IB and PF, these problems lead to easily interpretable guarantees in a variety of inference tasks with statistical constraints on accuracy and privacy. While the underlying optimization problems are non-convex, we develop a technique to evaluate bottleneck problems in closed form by equivalently expressing them in terms of the lower convex or upper concave envelopes of certain functions. Applying this technique to the binary case, we derive closed-form expressions for several bottleneck problems.
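For orientation, the two problems discussed above can be written in their standard forms (these are the usual textbook formulations, not quoted verbatim from the paper): for an observed variable X, a latent variable Y, and a representation T obeying the Markov chain Y - X - T,

```latex
\text{IB:}\quad \max_{p(t\mid x)} \; I(Y;T) \quad \text{s.t.}\quad I(X;T) \le R,
\qquad\qquad
\text{PF:}\quad \min_{p(t\mid x)} \; I(Y;T) \quad \text{s.t.}\quad I(X;T) \ge R.
```

In the usual reading, R constrains how much of X the representation T retains: IB keeps information relevant to Y while compressing X, whereas PF releases useful information about X while limiting leakage about the private Y.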

18.
We present a quantum private comparison (QPC) protocol that enables two players to compare the equality of their information without revealing anything about their respective private inputs, using four-particle cluster states as the information carriers. The protocol ensures correctness, privacy, and fairness with the assistance of a semi-trusted third party (TP). The participants, including the TP, are only required to perform single-particle measurements, which makes the protocol more feasible in practice. Furthermore, the photon transmission is a one-way distribution, so Trojan horse attacks are automatically avoided. The security of the protocol is also analyzed.

19.
周瑞瑞  杨理 《中国物理 B》2012,21(8):80301-080301
An unconditionally secure, authority-certified anonymous quantum key distribution scheme using conjugate coding is presented, based on which we construct a quantum election scheme without the help of entangled states. We show that this election scheme ensures the completeness, soundness, privacy, eligibility, unreusability, fairness, and verifiability of a large-scale election in which the administrator and counter are semi-honest. The election scheme works even if there are losses and errors in the quantum channels. In addition, any irregularity in the scheme is detectable.

20.
Using spectral information to measure the chlorophyll content of rice leaves accurately and efficiently is of great practical significance for diagnosing and optimizing rice leaf nitrogen nutrition, developing and optimizing nitrogen topdressing systems for paddy fields, and monitoring and evaluating rice pests and diseases. To address the poor accuracy and stability of models that retrieve rice leaf chlorophyll content with machine learning alone, this study took the japonica rice cultivar Jijing 88 as the research object and, through grid experiments, obtained leaf hyperspectral data and relative chlorophyll content at key growth stages such as tillering. The kernel extreme learning machine (KELM) was selected as the base model, and a new approach was proposed to improve prediction accuracy: first select spectral pre-processing methods according to the performance of the base KELM model, and then use a bio-inspired optimization algorithm to optimize the training of the KELM models corresponding to the selected pre-processing combinations. First, the pre-processing of the spectral data was studied: a full combination of four classes of pre-processing methods yielded 72 pre-processing combinations, and the successive projections algorithm (SPA) was used to select characteristic bands as KELM inputs in order to screen the better combinations. Based on the modeling results, the combinations CWT+MMS, CWT+MSC+SG+SS, and CWT+SS gave the highest test-set coefficients of determination (R²p) of 0.850, 0.835, and 0.828, respectively. Second, to make the KELM models perform optimally while preserving stability and generalization, the Harris hawks optimization (HHO) algorithm was introduced; by simulating the cooperative behavior and chasing strategies of hawk flocks during predation, it automatically tunes the parameters of the three KELM models, raising the R²p of the HHO-KELM models to 0.957, 0.867, and 0.858, an improvement in model accuracy of up to 10.7%. This study demonstrates the feasibility of using the HHO algorithm to optimize machine learning models for retrieving rice leaf chlorophyll content and provides a useful reference for measuring and assessing the chlorophyll content of northeast China japonica rice.
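A compact numpy sketch of the kernel extreme learning machine (KELM) used as the base model above, in its common closed-form ridge-style formulation beta = (I/C + K)^(-1) y. The hyperspectral pre-processing, SPA band selection, and HHO tuning are not reproduced, and the data, kernel width, and regularization constant are placeholders.

```python
# Minimal kernel extreme learning machine (KELM) sketch with an RBF kernel:
# beta = (I/C + K)^{-1} y, prediction f(x) = k(x, X_train) @ beta.
import numpy as np

def rbf_kernel(A, B, gamma=0.1):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def kelm_fit(X, y, C=10.0, gamma=0.1):
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(np.eye(len(X)) / C + K, y)   # closed-form output weights

def kelm_predict(X_train, beta, X_new, gamma=0.1):
    return rbf_kernel(X_new, X_train, gamma) @ beta

rng = np.random.default_rng(0)
X = rng.uniform(size=(80, 6))                 # stand-ins for selected spectral bands
y = X @ np.array([1.0, -0.5, 0.3, 0.0, 0.2, 0.1]) + rng.normal(scale=0.05, size=80)

beta = kelm_fit(X[:60], y[:60])
pred = kelm_predict(X[:60], beta, X[60:])
ss_res = ((y[60:] - pred) ** 2).sum()
ss_tot = ((y[60:] - y[60:].mean()) ** 2).sum()
print("test R^2:", 1 - ss_res / ss_tot)
```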
