Similar literature
 Found 20 similar documents (search time: 15 ms)
1.
Payment data is one of the most valuable assets that retail banks can leverage as a major competitive advantage over new entrants such as Fintech companies or giant internet companies. In marketing, the value behind data relates to the power of encoding customer preferences: the better you know your customer, the better your marketing strategy. In this paper, we present a B2B2C lead generation application based on payment transaction data within the online banking system. In this approach, the bank is an intermediary between its private customers and merchants. The bank uses its competence in Machine Learning-driven marketing to build a lead generation application that helps merchants run data-driven campaigns through the banking channels to reach retail customers. The bank’s retail customers trade the utility hidden in their payment transaction data for special offers and discounts offered by merchants. During the entire process, the bank protects the privacy of the retail customer.

2.
The accurate prediction of gross box-office markets is of great benefit for investment and management in the movie industry. In this work, we propose a machine learning-based method for predicting the movie box-office revenue of a country based on empirical comparisons of eight methods with diverse combinations of economic factors. Specifically, we achieved a relative root mean squared error of 0.056 in the US and 0.183 in China for the two case studies of movie markets in time-series forecasting experiments from 2013 to 2016. We concluded that the support-vector-machine-based method using gross domestic product reached the best prediction performance while relying only on readily available economic information. The computational experiments and comparison studies provided evidence for the effectiveness and advantages of our proposed prediction strategy. In the validation process of the predicted total box-office markets in 2017, the error rates were 0.044 in the US and 0.066 in China. In the consecutive predictions of nationwide box-office markets in 2018 and 2019, the mean relative absolute percentage errors achieved were 0.041 and 0.035 in the US and China, respectively. The precise predictions, both in the training and validation data, demonstrate the efficiency and versatility of our proposed method.
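The error measures quoted in this abstract can be illustrated with a short sketch. The abstract does not give the exact formulas, so the definitions below (RMSE divided by the mean of the observed values, and mean absolute percentage error expressed as a fraction) are assumptions, and the sample figures are invented for illustration only.

```python
import math


def relative_rmse(actual, predicted):
    """RMSE divided by the mean of the actual values (assumed definition)."""
    n = len(actual)
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n
    return math.sqrt(mse) / (sum(actual) / n)


def mape(actual, predicted):
    """Mean absolute percentage error, returned as a fraction (0.1 means 10%)."""
    return sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical yearly totals in arbitrary units, not data from the paper.
actual = [100.0, 110.0, 120.0]
predicted = [98.0, 113.0, 118.0]
print(relative_rmse(actual, predicted), mape(actual, predicted))
```

Both measures are scale-free, which is what allows a single figure to be compared across markets of very different size.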

3.
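Thermoelectric materials are indispensable for a sustainable future, with broad application prospects in all-solid-state power generation and refrigeration. Over the past few decades, researchers have worked to find new high-performance thermoelectric materials. However, the traditional experimental trial-and-error approach is inefficient and limits the pace of research on new materials. Machine learning, a method with powerful data-analysis capabilities, has been increasingly applied to thermoelectric materials research in recent years. This review summarizes the machine learning methods commonly used in thermoelectric materials research, systematically introduces application cases and research progress relating to material structure and to electronic and thermoelectric transport properties, and gives an outlook on the future development of the field.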

4.
Increasing demand in backbone Dense Wavelength Division Multiplexing (DWDM) network traffic prompts the introduction of new solutions that increase the transmission speed without a significant increase in service cost. To achieve this objective, simpler and faster DWDM network reconfiguration procedures are needed. A key problem that is intrinsically related to network reconfiguration is the assessment of the quality of transmission. Thus, in this contribution a Machine Learning (ML) based method for assessing the quality of transmission is proposed. The proposed ML methods use a database that was created only on the basis of information available to a DWDM network operator via the DWDM network control plane. Several types of ML classifiers are proposed, and their performance is tested and compared for two real DWDM network topologies. The results obtained are promising and motivate further research.

5.
Accurate clustering is a challenging task with unlabeled data. Ensemble clustering aims to combine sets of base clusterings to obtain a better and more stable clustering and has shown its ability to improve clustering accuracy. Dense representation ensemble clustering (DREC) and entropy-based locally weighted ensemble clustering (ELWEC) are two typical methods for ensemble clustering. However, DREC treats each microcluster equally and hence, ignores the differences between each microcluster, while ELWEC conducts clustering on clusters rather than microclusters and ignores the sample–cluster relationship. To address these issues, a divergence-based locally weighted ensemble clustering with dictionary learning (DLWECDL) is proposed in this paper. Specifically, the DLWECDL consists of four phases. First, the clusters from the base clustering are used to generate microclusters. Second, a Kullback–Leibler divergence-based ensemble-driven cluster index is used to measure the weight of each microcluster. With these weights, an ensemble clustering algorithm with dictionary learning and the L2,1-norm is employed in the third phase. Meanwhile, the objective function is resolved by optimizing four subproblems and a similarity matrix is learned. Finally, a normalized cut (Ncut) is used to partition the similarity matrix and the ensemble clustering results are obtained. In this study, the proposed DLWECDL was validated on 20 widely used datasets and compared to some other state-of-the-art ensemble clustering methods. The experimental results demonstrated that the proposed DLWECDL is a very promising method for ensemble clustering.

6.
The main goal of this work is to adapt a Physics problem to the Machine Learning (ML) domain and to compare several techniques for solving it. The problem is how to count muons from the signal registered by particle detectors, which record a mix of electromagnetic and muonic signals. A good solution could be a building block for future experiments. After proposing an approach to the problem, the experiments compare the performance of some popular ML models using two different hadronic models for the test data. The results show that the problem is well suited to ML, and that the feature selection stage is critical for precision and model complexity.

7.
The prediction of chaotic time series systems has remained a challenging problem in recent decades. A hybrid method using Hankel Alternative View Of Koopman (HAVOK) analysis and machine learning (HAVOK-ML) is developed to predict chaotic time series. HAVOK-ML simulates the time series by reconstructing a closed linear model so as to achieve the purpose of prediction. It decomposes chaotic dynamics into intermittently forced linear systems by HAVOK analysis and estimates the external intermittently forcing term using machine learning. The prediction performance evaluations confirm that the proposed method has superior forecasting skills compared with existing prediction methods.
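HAVOK analysis starts from a Hankel matrix whose rows are time-delayed copies of a single measured series; a singular value decomposition of that matrix then supplies the delay coordinates used in the linear model. The sketch below covers only the first step, building the Hankel matrix. The function name and the choice of three rows are illustrative assumptions, not details taken from the paper.

```python
def hankel_matrix(series, rows):
    """Stack time-delayed copies of a scalar series as the rows of a Hankel matrix."""
    cols = len(series) - rows + 1
    if rows < 1 or cols < 1:
        raise ValueError("rows must be between 1 and len(series)")
    # Row i is the series shifted forward by i samples.
    return [list(series[i:i + cols]) for i in range(rows)]


# Toy series; a real application would use a long chaotic measurement record.
for row in hankel_matrix([1, 2, 3, 4, 5], rows=3):
    print(row)
```

Each anti-diagonal of the result holds a single value of the series, which is the defining property of a Hankel matrix.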

8.
The most common machine-learning methods solve supervised and unsupervised problems based on datasets where the problem’s features belong to a numerical space. However, many problems include data where numerical and categorical variables coexist, which makes them challenging to handle. To transform categorical data into a numeric form, preprocessing tasks are compulsory. Methods such as one-hot and feature-hashing have been the most widely used encoding approaches, at the expense of a significant increase in the dimensionality of the dataset. This effect introduces unexpected challenges in dealing with the overabundance of variables and/or noisy data. In this paper, we propose a novel encoding approach that maps mixed-type data into an information space using Shannon’s Theory to model the amount of information contained in the original data. We evaluated our proposal with ten mixed-type datasets from the UCI repository and two datasets representing real-world problems, obtaining promising results. To demonstrate its performance, we applied our proposal to prepare these datasets for classification, regression, and clustering tasks. We demonstrate that our encoding proposal is remarkably superior to one-hot and feature-hashing encoding in terms of memory efficiency. Our proposal can preserve the information conveyed by the original data.
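The dimensionality cost of one-hot encoding can be seen in a few lines. The sketch below contrasts it with a single numeric column derived from Shannon self-information, -log2 of each category's relative frequency. This second mapping is only an illustrative assumption about what an information-based encoding might look like; the abstract does not specify the authors' actual encoding.

```python
import math
from collections import Counter


def one_hot(values):
    """Return the sorted categories and one 0/1 row per value (one column per category)."""
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    rows = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1
        rows.append(row)
    return categories, rows


def self_information(values):
    """Map each category to -log2 of its relative frequency (a single numeric column)."""
    counts = Counter(values)
    n = len(values)
    return {c: -math.log2(k / n) for c, k in counts.items()}


colors = ["red", "green", "red", "blue"]  # hypothetical categorical column
categories, rows = one_hot(colors)
print(len(categories), "one-hot columns versus 1 information column")
print([self_information(colors)[c] for c in colors])
```

With one-hot encoding the number of columns grows with the number of distinct categories, while the information-based column stays at width one regardless of cardinality.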

9.
Electric power forecasting plays a substantial role in the administration and balance of current power systems. For this reason, accurate predictions of service demands are needed to develop better programming for the generation and distribution of power and to reduce the risk of vulnerabilities in the integration of an electric power system. For the purposes of the current study, a systematic literature review was applied to identify the type of model that has the highest propensity to show precision in the context of electric power forecasting. The state-of-the-art model in accurate electric power forecasting was determined from the results reported in 257 accuracy tests from five geographic regions. Two classes of forecasting models were compared: classical statistical or mathematical (MSC) and machine learning (ML) models. Furthermore, hybrid models that have made significant contributions to electric power forecasting are identified, and a case study is applied to demonstrate their good performance when compared with traditional models. Among our main findings, we conclude that forecasting errors are minimized by reducing the time horizon, that ML models that consider various sources of exogenous variability tend to have better forecast accuracy, and finally, that the accuracy of the forecasting models has significantly increased over the last five years.

10.
Spectrum sensing is an important function in radio frequency spectrum management and cognitive radio networks. Spectrum sensing is used by one wireless system (e.g., a secondary user) to detect the presence of a wireless service with higher priority (e.g., a primary user) with which it has to coexist in the radio frequency spectrum. If the wireless signal is detected, the second user system releases the given frequency to maintain the principle of not interfering. This paper proposes a machine learning implementation of spectrum sensing using the entropy measure as a feature vector. In the training phase, the information about the activity of the wireless service with higher priority is gathered, and the model is formed. In the classification phase, the wireless system compares the current sensing report to the created model to calculate the posterior probability and classify the sensing report into either the presence or absence of wireless service with higher priority. This paper proposes the novel application of the Fluctuation Dispersion Entropy (FDE) measure recently introduced in the research community as a feature vector to build the model and implement the classification. An improved implementation of the FDE (IFDE) is used to enhance the robustness to noise. IFDE is further enhanced with an adaptive method (AIFDE) to automatically select the hyper-parameter introduced in IFDE. Then, this paper combines the machine learning approach with the entropy measure approach, which are both recent developments in spectrum sensing research. The approach is compared to similar approaches in literature and the classical energy detection method using a generated radar signal data set with different conditions of SNR(dB) and fading conditions. The results show that the proposed approach is able to outperform the approaches from literature based on other entropy measures or the Energy Detector (ED) in a consistent way across different levels of SNR and fading conditions.  

11.
Predicting stock trends is a major challenge. Accidental factors often lead to short-term sharp fluctuations in stock markets, deviating from the original normal trend. The short-term fluctuation of stock prices is highly noisy, which is not conducive to the prediction of stock trends. Therefore, we used discrete wavelet transform (DWT)-based denoising to denoise stock data. Denoising the stock data helped us eliminate the influence of short-term random events on the continuous trend of the stock. The denoised data showed more stable trend characteristics and smoothness. Extreme learning machine (ELM) is one of the effective training algorithms for fully connected single-hidden-layer feedforward neural networks (SLFNs); it offers fast convergence and unique results, and it does not converge to a local minimum. Therefore, this paper proposes a combination of ELM and DWT-based denoising to predict the trend of stocks. The proposed method was used to predict the trend of 400 stocks in China. The prediction results demonstrate the efficacy of DWT-based denoising for stock trends, and the method showed excellent performance compared to 12 machine learning algorithms (e.g., recurrent neural network (RNN) and long short-term memory (LSTM)).
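DWT-based denoising can be sketched with the simplest wavelet. The example below performs one level of the Haar transform, soft-thresholds the detail coefficients, and reconstructs the signal. The Haar wavelet, the single decomposition level, and the threshold value are all assumptions made for illustration; the abstract does not say which wavelet or thresholding rule the authors used.

```python
import math


def haar_denoise(signal, threshold):
    """One-level Haar DWT, soft-threshold the detail coefficients, then invert."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    # Forward transform: pairwise scaled sums (approximation) and differences (detail).
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    # Soft thresholding shrinks small, noise-like details toward zero.
    shrunk = [math.copysign(max(abs(d) - threshold, 0.0), d) for d in detail]
    # Inverse transform from the approximation and the shrunk details.
    out = []
    for a, d in zip(approx, shrunk):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out


# Hypothetical closing prices with small jitter, not data from the paper.
prices = [10.0, 10.2, 10.1, 9.9, 12.0, 12.1, 11.9, 12.0]
print(haar_denoise(prices, threshold=0.5))
```

Small differences within each pair are removed while the larger level shift between the first and second half of the series survives, which is the behaviour the abstract attributes to denoising.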

12.
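Identification of Chinese cabbage seed varieties by combining hyperspectral imaging with machine learning   Total citations: 1, self-citations: 0, citations by others: 1
A method for classifying and identifying Chinese cabbage seed varieties based on hyperspectral information is proposed. A near-infrared hyperspectral image acquisition system was used to collect 239 Chinese cabbage seed samples from eight varieties; the mean spectral reflectance of a 15 pixel × 15 pixel region of interest was extracted as the sample information; multiplicative scatter correction was used as preprocessing to denoise the spectra; and the classification performance of four algorithms was verified: AdaBoost, extreme learning machine (ELM), random forest (RF), and support vector machine (SVM). To simplify the input variables, 10 characteristic wavelengths for variety classification were selected through loading-coefficient analysis. The experimental results show that, using the full spectrum, all four algorithms correctly classified more than 90% of the 81 prediction samples; the best models were ELM and RF, with 100% recognition accuracy. With the 10 characteristic wavelengths, the classification accuracy decreased slightly, but the number of input variables was greatly reduced, which improved information-processing efficiency; the best model in this case was the EW-ELM model, with 100% accuracy, so the characteristic wavelengths selected by loading coefficients are effective. Rapid, non-destructive classification of Chinese cabbage seed varieties using hyperspectral imaging combined with machine learning is feasible and provides a new method for batch online inspection of Chinese cabbage seeds.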

13.
With the goal of understanding if the information contained in node metadata can help in the task of link weight prediction, we investigate herein whether incorporating it as a similarity feature (referred to as metadata similarity) between end nodes of a link improves the prediction accuracy of common supervised machine learning methods. In contrast with previous works, instead of normalizing the link weights, we treat them as count variables representing the number of interactions between end nodes, as this is a natural representation for many datasets in the literature. In this preliminary study, we find no significant evidence that metadata similarity improved the prediction accuracy of the four empirical datasets studied. To further explore the role of node metadata in weight prediction, we synthesized weights to analyze the extreme case where the weights depend solely on the metadata of the end nodes, while encoding different relationships between them using logical operators in the generation process. Under these conditions, the random forest method performed significantly better than other methods in 99.07% of cases, though the prediction accuracy was significantly degraded for the methods analyzed in comparison to the experiments with the original weights.

14.
Quantum Machine Learning (QML) has not yet clearly and extensively demonstrated its advantages over the classical machine learning approach. So far, there are only specific cases where some quantum-inspired techniques have achieved small incremental advantages, and a few experimental cases in hybrid quantum computing are promising, considering a mid-term future (not taking into account the achievements purely associated with optimization using quantum-classical algorithms). The current quantum computers are noisy and have few qubits to test, making it difficult to demonstrate the current and potential quantum advantage of QML methods. This study shows that we can achieve better classical encoding and performance of quantum classifiers by using Linear Discriminant Analysis (LDA) during the data preprocessing step. As a result, the Variational Quantum Algorithm (VQA) shows a gain of performance in balanced accuracy with the LDA technique and outperforms baseline classical classifiers.

15.
Insecure applications (apps) are increasingly used to steal users’ location information for illegal purposes, which has aroused great concern in recent years. Existing methods, i.e., static and dynamic taint analysis, mainly rely on statically analyzing source code or dynamically monitoring the location data flow. Although they have shown great merit for identifying such apps, identification accuracy is still under research, since the analysis results contain a certain false positive or false negative rate. In order to improve the accuracy and reduce the misjudging rate in the process of vetting suspicious apps, this paper proposes SAMLDroid, a combined method of static code analysis and machine learning for identifying Android apps with location privacy leakage, which can effectively improve the identification rate compared with existing methods. SAMLDroid first uses static analysis to scrutinize source code to investigate apps with location acquiring intentions. Then it exploits a well-trained classifier and integrates an app’s multiple features to dynamically analyze the pattern and deliver the final verdict about the app’s property. Finally, experiments show that the accuracy rate of SAMLDroid is up to 98.4%, which is nearly 20% higher than Apparecium.

16.
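Li Jun (李军), Hou Xinyan (后新燕). Acta Physica Sinica (《物理学报》), 2019, 68(10): 100503-100503
Using an identification algorithm based on the exponential weighted online sequential extreme learning machine with kernel (EW-KOSELM), this study investigates the dynamic reconstruction of chaotic dynamical systems. The EW-KOSELM algorithm extends the kernel recursive least squares (KRLS) algorithm directly into the online extreme learning machine (ELM) framework, weakens the influence of old data by introducing a forgetting factor, and uses a fixed-budget (FB) memory technique to cope with the ever-growing computational burden inherent in online kernel learning algorithms. The proposed identification algorithm is applied to a numerical simulation of the chaotic Duffing-Ueda oscillator, and the dynamic performance of the FB-EW-KOSELM identification model is checked qualitatively and quantitatively against the original system. The qualitative criteria compare the attractors (trajectory embeddings), Poincaré maps, bifurcation diagrams, and limit cycles of the identified model and the original system; the quantitative criteria compare their Lyapunov exponents and correlation dimensions. The algorithm is further applied to measured data from a Chua's circuit producing double-scroll and spiral attractors, and to time series generated by a real chaotic circuit; measured voltage or current data with a low signal-to-noise ratio were additionally preprocessed by wavelet denoising. Analysis of the attractors reconstructed by the identified model shows that the FB-EW-KOSELM algorithm has good dynamic reconstruction performance: it accurately reproduces nonlinear process models that exhibit chaotic dynamic behavior, with dynamic invariants very close to those of the original chaotic system.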

17.
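Identification of maize haploids by near-infrared spectroscopy based on machine learning   Total citations: 1, self-citations: 0, citations by others: 1
In maize haploid technology, haploid identification is a crucial step. This study analyzed the near-infrared transmission spectra of a large number of maize haploids and heterozygous diploids in order to establish a haploid identification model that is practical for production use. Kernel spectra of haploids and heterozygous diploids were collected from three groups of maize with different genetic backgrounds and used to compare machine learning algorithms, to compare the modeling performance of spectral preprocessing methods, and to analyze the effect of dataset size on model construction. Comparing the mean spectra of all haploids and heterozygous diploids showed that the absorption peaks were at essentially the same positions, but the absorbance of haploids was slightly higher than that of heterozygous diploids, with larger differences in the 940 to 1120 nm and 1180 to 1316 nm regions. Among the models built, those using partial least squares and neural network algorithms had the highest haploid identification accuracy, 93.26% and 95.42% respectively. Test-set validation results were consistent with the model accuracy, indicating that both algorithms are suitable for large-scale haploid screening. When the partial least squares model was used to compare spectral preprocessing methods, modeling accuracy was highest with raw spectra preprocessed by moving-window smoothing only. A comparison across dataset sizes showed that enlarging the dataset within a certain range helps improve model accuracy. Moreover, when the proportion of haploids in the data was high, the haploid prediction recall reached 100%. In addition, haploids and heterozygous diploids that are hard to distinguish by kernel color markers were selected, and the machine learning model built with partial least squares reached a prediction accuracy of 93.39% on them. This shows the advantage of near-infrared haploid identification: accurate identification may be possible without relying on kernel color. The machine-learning-based near-infrared haploid identification method has high accuracy and can be continuously optimized as more data are added; theoretical research on it is expected to create the conditions for automated, intelligent haploid identification.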

18.
In distributed machine learning (DML), though clients’ data are not directly transmitted to the server for model training, attackers can obtain the sensitive information of clients by analyzing the local gradient parameters uploaded by clients. For this case, we use the differential privacy (DP) mechanism to protect the clients’ local parameters. In this paper, from an information-theoretic point of view, we study the utility–privacy trade-off in DML with the help of the DP mechanism. Specifically, three cases including independent clients’ local parameters with independent DP noise, dependent clients’ local parameters with independent/dependent DP noise are considered. Mutual information and conditional mutual information are used to characterize utility and privacy, respectively. First, we show the relationship between utility and privacy for the three cases. Then, we show the optimal noise variance that achieves the maximal utility under a certain level of privacy. Finally, the results of this paper are further illustrated by numerical results.
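How noise variance governs the information that a perturbed parameter reveals can be shown with a toy calculation. The sketch below assumes a single scalar parameter X that is zero-mean Gaussian and independent zero-mean Gaussian noise N, for which I(X; X+N) = 0.5 log2(1 + var(X)/var(N)) bits. This scalar Gaussian setting and the function names are assumptions for illustration; they are not the paper's model, which treats multiple clients and uses conditional mutual information for privacy.

```python
import math


def gaussian_mutual_information(signal_var, noise_var):
    """I(X; X+N) in bits for independent zero-mean Gaussian X and N."""
    if signal_var < 0 or noise_var <= 0:
        raise ValueError("signal_var must be >= 0 and noise_var must be > 0")
    return 0.5 * math.log2(1.0 + signal_var / noise_var)


def noise_var_for_leakage(signal_var, max_bits):
    """Smallest noise variance that keeps I(X; X+N) at or below max_bits."""
    if max_bits <= 0:
        raise ValueError("max_bits must be positive")
    return signal_var / (2.0 ** (2.0 * max_bits) - 1.0)


# Hypothetical parameter variance and leakage budget.
needed = noise_var_for_leakage(signal_var=3.0, max_bits=1.0)
print(needed, gaussian_mutual_information(3.0, needed))
```

Because the mutual information decreases monotonically in the noise variance, the smallest variance that meets the leakage budget is also the one that distorts the parameter least, which mirrors the trade-off the abstract describes.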

19.
Network anomaly detection systems (NADSs) play a significant role in every network defense system as they detect and prevent malicious activities. Therefore, this paper offers an exhaustive overview of different aspects of anomaly-based network intrusion detection systems (NIDSs). Additionally, contemporary malicious activities in network systems and the important properties of intrusion detection systems are discussed as well. The present survey explains important phases of NADSs, such as pre-processing, feature extraction and malicious behavior detection and recognition. In addition, with regard to the detection and recognition phase, recent machine learning approaches including supervised, unsupervised, new deep and ensemble learning techniques have been comprehensively discussed; moreover, some details about currently available benchmark datasets for training and evaluating machine learning techniques are provided by the researchers. In the end, potential challenges together with some future directions for machine learning-based NADSs are specified.

20.
The initial field has a crucial influence on numerical weather prediction (NWP). Data assimilation (DA) is a reliable method to obtain the initial field of the forecast model. At the same time, data are the carriers of information. Observational data are a concrete representation of information. DA is also the process of sorting observation data, during which entropy gradually decreases. Four-dimensional variational assimilation (4D-Var) is the most popular approach. However, due to the complexity of the physical model, the tangent linear and adjoint models, and other processes, the realization of a 4D-Var system is complicated, and the computational cost is high. Machine learning (ML) is a method of obtaining simulation results by training on a large amount of data. It has achieved remarkable success in various applications, and operational NWP and DA are no exception. In this work, we synthesize insights and techniques from previous studies to design a purely data-driven 4D-Var implementation framework named ML-4DVAR based on the bilinear neural network (BNN). The framework replaces the traditional physical model with the BNN model for prediction. Moreover, it directly makes use of the ML model obtained from the simulation data to implement the primary process of 4D-Var, including the realization of the short-term forecast process and the tangent linear and adjoint models. We test a strong-constraint 4D-Var system with the Lorenz-96 model, and we compare the traditional 4D-Var system with ML-4DVAR. The experimental results demonstrate that the ML-4DVAR framework can achieve better assimilation results and significantly improve computational efficiency.
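The Lorenz-96 test model named in this abstract is small enough to write out. In its standard form, each of n cyclic variables evolves as dx_i/dt = (x_{i+1} - x_{i-2}) x_{i-1} - x_i + F. The sketch below implements that tendency and one fourth-order Runge-Kutta step. The forcing F = 8, the 40 variables, the time step, and the integrator are common choices assumed here for illustration; the abstract does not state the settings the authors used.

```python
def lorenz96_tendency(x, forcing=8.0):
    """Time derivative of the Lorenz-96 state with cyclic boundary conditions."""
    n = len(x)
    # Negative list indices wrap around, which gives the cyclic x[i-1] and x[i-2].
    return [(x[(i + 1) % n] - x[i - 2]) * x[i - 1] - x[i] + forcing for i in range(n)]


def rk4_step(x, dt, forcing=8.0):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    k1 = lorenz96_tendency(x, forcing)
    k2 = lorenz96_tendency([xi + 0.5 * dt * k for xi, k in zip(x, k1)], forcing)
    k3 = lorenz96_tendency([xi + 0.5 * dt * k for xi, k in zip(x, k2)], forcing)
    k4 = lorenz96_tendency([xi + dt * k for xi, k in zip(x, k3)], forcing)
    return [
        xi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for xi, a, b, c, d in zip(x, k1, k2, k3, k4)
    ]


# Start from the unstable equilibrium x_i = F and perturb one variable.
state = [8.0] * 40
state[0] += 0.01
for _ in range(100):
    state = rk4_step(state, dt=0.01)
print(state[:5])
```

A forward model of this kind is what a 4D-Var system integrates repeatedly, and it is the component that the framework in the abstract replaces with a learned surrogate.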
