Similar Literature
20 similar documents found.
1.
Stochastic separation theorems play important roles in high-dimensional data analysis and machine learning. It turns out that in high-dimensional space, any point of a random set of points can be separated from the other points by a hyperplane with high probability, even if the number of points is exponential in the dimension. This and similar facts can be used for constructing correctors for artificial intelligence systems, for determining the intrinsic dimensionality of data, and for explaining various natural intelligence phenomena. In this paper, we refine the estimates of the number of points and of the probability in stochastic separation theorems, thereby strengthening some results obtained earlier. We derive bounds for linear and Fisher separability when the points are drawn randomly, independently, and uniformly from a d-dimensional spherical layer or from a cube. These results allow us to better outline the applicability limits of stochastic separation theorems in applications.
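As a hedged illustration of the separability phenomenon, the sketch below (not from the paper; the dimension, sample size, layer radii, and threshold alpha are all assumed values) samples points uniformly from a d-dimensional spherical layer and empirically checks the standard Fisher-separability criterion ⟨x, y⟩ < α⟨x, x⟩:

```python
import numpy as np

# Toy check of Fisher separability in a spherical layer (illustrative only;
# d, n, r_inner and alpha are assumed values, not those of the paper).
rng = np.random.default_rng(0)
d, n = 60, 2000            # dimension, number of random points
r_inner, alpha = 0.9, 0.8  # layer radii [r_inner, 1]; Fisher threshold

# Uniform sampling in the layer: uniform directions, radii with density ~ r^(d-1).
u = rng.standard_normal((n, d))
u /= np.linalg.norm(u, axis=1, keepdims=True)
r = rng.uniform(r_inner**d, 1.0, size=n) ** (1.0 / d)
x = u * r[:, None]

# x_i is Fisher-separable from x_j when <x_i, x_j> < alpha * <x_i, x_i>.
gram = x @ x.T
sep = gram < alpha * np.diag(gram)[:, None]
np.fill_diagonal(sep, True)                  # ignore self-comparisons
frac = sep.all(axis=1).mean()
print(f"fraction of points Fisher-separable from all others: {frac:.4f}")
```

Even at these modest settings the separable fraction is typically close to 1, which is the qualitative content of the theorems above.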

2.
3.
In professional soccer, the choices made in forming a team lineup are crucial for achieving good results. Players are characterized by different skills, and their relevance depends on the position they occupy on the pitch. Experts can recognize similarities between players and their styles, but the procedures adopted are often subjective and prone to misclassification. The automatic recognition of players' styles based on their diversity of skills can help coaches and technical directors to prepare a team for a competition, to substitute injured players during a season, or to hire players to fill gaps left by departing teammates. The paper adopts dimensionality reduction, clustering, and computer visualization tools to compare soccer players based on a set of attributes. The players are characterized by numerical vectors embedding their particular skills, and these objects are then compared by means of suitable distances. The intermediate data are processed to generate meaningful representations of the original dataset according to the (dis)similarities between the objects. The results show that the adoption of dimensionality reduction, clustering, and visualization tools for processing complex datasets is a key modeling option with current computational resources.
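A minimal sketch of this kind of pipeline (synthetic skill vectors and an assumed number of style clusters; the paper's actual attributes and distances are not reproduced), using standardization, PCA for 2-D visualization coordinates, and k-means for style groups:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a player-attribute matrix: rows = players,
# columns = skills (e.g., passing, tackling, finishing, ...). The data
# and the number of style clusters are assumptions for illustration.
rng = np.random.default_rng(1)
n_players, n_skills, n_styles = 300, 20, 4
centers = rng.normal(0, 2, size=(n_styles, n_skills))
labels_true = rng.integers(0, n_styles, n_players)
skills = centers[labels_true] + rng.normal(0, 1, size=(n_players, n_skills))

# Standardize, reduce to 2-D for visualization, then cluster by style.
z = StandardScaler().fit_transform(skills)
xy = PCA(n_components=2).fit_transform(z)
styles = KMeans(n_clusters=n_styles, n_init=10, random_state=0).fit_predict(z)

for k in range(n_styles):
    print(f"style {k}: {np.sum(styles == k)} players, "
          f"2-D centroid {xy[styles == k].mean(axis=0).round(2)}")
```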

4.
We develop Categorical Exploratory Data Analysis (CEDA) with mimicking to explore and exhibit the complexity of the information content contained within any data matrix: categorical, discrete, or continuous. Such complexity is shown through visible and explainable serial multiscale structural dependency with heterogeneity. CEDA is developed upon all features' categorical nature via histograms and is guided by all features' associative patterns (order-2 dependence) in a mutual conditional entropy matrix. Higher-order structural dependency of k (≥ 3) features is exhibited through block patterns within heatmaps constructed by permuting contingency-kD-lattices of counts. As k grows, the resulting heatmap series captures the global and large-scale structural dependency that constitutes the data matrix's information content. When continuous features are involved, principal component analysis (PCA) extracts fine-scale information content from each block in the final heatmap. Our mimicking protocol coherently simulates this heatmap series by preserving global-to-fine-scale structural dependency. At every step of the mimicking process, each accepted simulated heatmap is subject to constraints with respect to all reliable observed categorical patterns. For reliability and robustness in the sciences, CEDA with mimicking enhances data visualization by revealing deterministic and stochastic structures within each scale-specific structural dependency. For inference in Machine Learning (ML) and Statistics, it clarifies at which scales which covariate feature-groups have major-vs.-minor predictive power on response features. For the social justice of Artificial Intelligence (AI) products, it checks whether a data matrix incompletely prescribes the targeted system.
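CEDA's order-2 guidance comes from a mutual conditional entropy matrix over categorical features. A minimal sketch of such a matrix built from contingency tables follows (toy data; the paper's exact normalization may differ):

```python
import numpy as np

def cond_entropy(x, y):
    """H(X|Y) = H(X,Y) - H(Y), estimated from a contingency table."""
    joint = np.zeros((x.max() + 1, y.max() + 1))
    np.add.at(joint, (x, y), 1)
    p = joint / joint.sum()
    py = p.sum(axis=0)
    h_xy = -np.sum(p[p > 0] * np.log(p[p > 0]))
    h_y = -np.sum(py[py > 0] * np.log(py[py > 0]))
    return h_xy - h_y

# Toy categorical data matrix: columns are features coded as small integers.
rng = np.random.default_rng(2)
a = rng.integers(0, 3, 1000)
b = (a + rng.integers(0, 2, 1000)) % 3   # depends on a
c = rng.integers(0, 4, 1000)             # independent of a and b
X = np.stack([a, b, c], axis=1)

m = X.shape[1]
mce = np.zeros((m, m))
for i in range(m):
    for j in range(m):
        if i != j:
            mce[i, j] = cond_entropy(X[:, i], X[:, j])
print(np.round(mce, 3))  # low H(X_i | X_j) signals strong dependence on X_j
```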

5.
Software products in the market change due to changes in business processes, technology, or new customer requirements. Maintaining legacy systems has always been a challenging task for software companies. In order to determine whether the software should be maintained by a reverse engineering or a forward engineering approach, a system assessment was done from diverse perspectives: quality, business value, type of errors, etc. In this research, the changes required in the existing software components of a legacy system were identified using a supervised learning approach. New interfaces for the software components were redesigned according to the new requirements and/or type of errors. Software maintainability was measured by applying a machine learning technique, namely the Naïve Bayes classifier. The dataset was designed based on observations such as the component state, the success or error type of the component, the line of code where an error exists in the component, the component's business value, and whether changes are required for the component. The results generated by the Waikato Environment for Knowledge Analysis (WEKA) software confirm the effectiveness of the introduced methodology, with an accuracy of 97.18%.
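A hedged sketch of the classification step, using a Naïve Bayes classifier on a hypothetical encoded dataset that mirrors the fields listed above (scikit-learn stands in for WEKA; all values and the labeling rule are made up):

```python
import numpy as np
from sklearn.naive_bayes import CategoricalNB
from sklearn.model_selection import train_test_split

# Hypothetical encoded dataset mirroring the described fields:
# [component state, error type, LOC bucket of the error, business value];
# target = 1 if the component requires changes. Values are made up.
rng = np.random.default_rng(3)
n = 400
state     = rng.integers(0, 3, n)   # 0=stable, 1=degraded, 2=failing
err_type  = rng.integers(0, 4, n)   # 0=none, 1=logic, 2=interface, 3=data
loc_bin   = rng.integers(0, 5, n)   # binned line-of-code location of error
biz_value = rng.integers(0, 3, n)   # low / medium / high
X = np.stack([state, err_type, loc_bin, biz_value], axis=1)
y = ((state > 0) & (err_type > 0)).astype(int)  # toy labeling rule

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = CategoricalNB(min_categories=5).fit(X_tr, y_tr)
print(f"accuracy: {clf.score(X_te, y_te):.4f}")
```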

6.
Machine learning provides a way to use only some of the variables of a spatiotemporal system to predict its subsequent evolution, and consequently avoids the curse of dimensionality. The learning machines employed for this purpose are, in essence, time-delayed recurrent neural networks with multiple input neurons and multiple output neurons. We show in this paper that such learning machines have a poor ability to generalize to variables on which they have not been trained. We then present a one-dimensional time-delayed recurrent neural network for the same aim of model-free prediction. It can be trained on different spatial variables in the training stage but initiated by the time series of only one spatial variable, and consequently possesses an excellent ability to generalize to new variables on which it has not been trained. This network presents a new methodology for achieving fine-grained predictions from a learning machine trained on coarse-grained data, and thus provides a new strategy for applications such as weather forecasting. Numerical verifications are performed on the Kuramoto coupled oscillators and the Barrio–Varea–Aragon–Maini model.
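The underlying idea of one-variable, time-delayed prediction can be sketched on the Kuramoto oscillators as follows (a ridge regressor stands in for the paper's recurrent network; all parameters are assumptions):

```python
import numpy as np
from sklearn.linear_model import Ridge

# Simulate Kuramoto coupled oscillators (Euler scheme; parameters assumed).
rng = np.random.default_rng(4)
N, K, dt, steps = 10, 1.5, 0.01, 20000
omega = rng.normal(0, 1, N)
theta = rng.uniform(0, 2 * np.pi, N)
series = np.empty(steps)
for t in range(steps):
    coupling = (K / N) * np.sin(theta[None, :] - theta[:, None]).sum(axis=1)
    theta += dt * (omega + coupling)
    series[t] = np.sin(theta[0])        # observe a single spatial variable

# One-dimensional time-delay embedding: predict x(t+1) from the last m lags.
m = 20
Xlags = np.stack([series[i:len(series) - m + i] for i in range(m)], axis=1)
ytgt = series[m:]
split = int(0.8 * len(ytgt))
model = Ridge(alpha=1e-6).fit(Xlags[:split], ytgt[:split])
pred = model.predict(Xlags[split:])
print(f"test RMSE: {np.sqrt(np.mean((pred - ytgt[split:])**2)):.4e}")
```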

7.
Many complex fluids can be described by continuum hydrodynamic field equations, to which noise must be added in order to capture thermal fluctuations. In almost all cases, the resulting coarse-grained stochastic partial differential equations carry a short-scale cutoff, which is also reflected in numerical discretisation schemes. We draw together our recent findings concerning the construction of such schemes and the interpretation of their continuum limits, focusing, for simplicity, on models with a purely diffusive scalar field, such as 'Model B', which describes phase separation in binary fluid mixtures. We address the requirement that the steady-state entropy production rate (EPR) must vanish for any stochastic hydrodynamic model in thermal equilibrium. Only if this is achieved can the given discretisation scheme be relied upon to correctly calculate the nonvanishing EPR for 'active field theories', in which new terms are deliberately added to the fluctuating hydrodynamic equations to break detailed balance. To compute the correct probabilities of forward and time-reversed paths (whose ratio determines the EPR), we must treat carefully the so-called 'spurious drift' and other closely related terms that depend on the discretisation scheme. We show that such subtleties can arise not only in the temporal discretisation (as is well documented for stochastic ODEs with multiplicative noise) but also from spatial discretisation, even when the noise is additive, as most active field theories assume. We then review how such noise can become multiplicative via off-diagonal couplings to additional fields that thermodynamically encode the underlying chemical processes responsible for activity. In this case, the spurious drift terms need careful accounting, not just to evaluate the EPR correctly but also to numerically implement the Langevin dynamics itself.
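For concreteness, here is a naive 1-D Euler–Maruyama discretisation of fluctuating Model B with conserved additive noise (illustrative only: this bare scheme makes none of the careful spurious-drift corrections the paper is about, and all parameter and amplitude conventions are assumed):

```python
import numpy as np

# Naive 1-D discretisation of fluctuating Model B:
#   dphi/dt = Lap(mu) + div(sqrt(2D) * Lambda),
#   mu = -a*phi + b*phi**3 - kappa*Lap(phi),
# with unit-variance white noise Lambda placed on lattice bonds.
rng = np.random.default_rng(5)
L, dx, dt, D = 128, 1.0, 0.01, 0.1
a, b, kappa = 1.0, 1.0, 1.0
phi = 0.01 * rng.standard_normal(L)

def lap(f):  # periodic second difference
    return (np.roll(f, -1) - 2 * f + np.roll(f, 1)) / dx**2

for _ in range(50_000):
    mu = -a * phi + b * phi**3 - kappa * lap(phi)
    eta = rng.standard_normal(L)                       # one unit normal per bond
    noise = np.sqrt(2 * D * dt / dx) * (eta - np.roll(eta, 1)) / dx
    phi += dt * lap(mu) + noise                        # both terms conserve sum(phi)

print(f"mean: {phi.mean():+.3e} (conserved), var: {phi.var():.3f}")
```

Taking the noise as a discrete divergence of bond variables is what keeps the field exactly conserved step by step, mirroring the conserved dynamics of Model B.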

8.
The extreme learning machine (ELM) has been widely applied in pattern recognition for its speed, efficiency, and good generalization ability. However, existing ELM algorithms and their variants do not adequately account for the influence of data dimensionality on ELM classification performance and generalization ability: when the dimensionality is too high, redundant attributes and noisy points inevitably degrade the generalization ability of ELM. To address this problem, this paper proposes a manifold-learning-based extreme learning machine, which combines dimensionality reduction techniques to effectively eliminate the impact of redundant attributes and noise on ELM classification performance. To verify the effectiveness of the proposed method, experiments were conducted on widely used image datasets; the results show that the proposed algorithm significantly improves the generalization performance of ELM.
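A hedged sketch of the general approach, pairing a manifold-learning reduction (locally linear embedding) with a basic ELM on an image dataset (a generic pipeline, not the paper's specific algorithm; hyperparameters are assumptions):

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.manifold import LocallyLinearEmbedding
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Manifold dimensionality reduction before the classifier.
lle = LocallyLinearEmbedding(n_components=20, n_neighbors=12).fit(X_tr)
Z_tr, Z_te = lle.transform(X_tr), lle.transform(X_te)

# ELM: random hidden layer, output weights by least squares.
rng = np.random.default_rng(6)
n_hidden = 500
W = rng.normal(0, 1, (Z_tr.shape[1], n_hidden))
bias = rng.normal(0, 1, n_hidden)
H_tr = np.tanh(Z_tr @ W + bias)
T = np.eye(10)[y_tr]                       # one-hot targets
beta = np.linalg.pinv(H_tr) @ T            # Moore-Penrose solution
pred = np.argmax(np.tanh(Z_te @ W + bias) @ beta, axis=1)
print(f"test accuracy: {np.mean(pred == y_te):.4f}")
```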

9.
In statistical inference, information-theoretic performance limits can often be expressed in terms of a statistical divergence between the underlying statistical models (e.g., in binary hypothesis testing, the error probability is related to the total variation distance between the statistical models). As the data dimension grows, computing the statistics involved in decision-making and the attendant performance limits (divergence measures) faces complexity and stability challenges. Dimensionality reduction addresses these challenges at the expense of compromising performance (the divergence is reduced, by the data-processing inequality). This paper considers linear dimensionality reduction such that the divergence between the models is maximally preserved. Specifically, it focuses on Gaussian models, for which we investigate discriminant analysis under five f-divergence measures (Kullback–Leibler, symmetrized Kullback–Leibler, Hellinger, total variation, and χ²). We characterize the optimal design of the linear transformation of the data onto a lower-dimensional subspace for zero-mean Gaussian models and employ numerical algorithms to find the design for general Gaussian models with non-zero means. There are two key observations for zero-mean Gaussian models. First, projections are not necessarily along the largest modes of the covariance matrix of the data; in some situations, they can even be along the smallest modes. Second, under specific regimes, the optimal design of the subspace projection is identical under all the f-divergence measures considered, lending a degree of universality to the design, independent of the inference problem of interest.
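As a minimal illustration for zero-mean Gaussian models, the sketch below evaluates the KL divergence before and after a linear projection built from generalized eigenvectors of the two covariances (a common heuristic consistent with the observation that optimal directions need not be the largest covariance modes; this is not the paper's full design procedure, and the covariances are arbitrary assumptions):

```python
import numpy as np
from scipy.linalg import eigh

def kl_zero_mean(S0, S1):
    """KL( N(0,S0) || N(0,S1) ) for positive-definite covariances."""
    k = S0.shape[0]
    _, ld1 = np.linalg.slogdet(S1)
    _, ld0 = np.linalg.slogdet(S0)
    return 0.5 * (np.trace(np.linalg.solve(S1, S0)) - k + ld1 - ld0)

rng = np.random.default_rng(7)
d, k = 8, 2
A0 = rng.normal(size=(d, d)); S0 = A0 @ A0.T + d * np.eye(d)
A1 = rng.normal(size=(d, d)); S1 = A1 @ A1.T + np.eye(d)

# Generalized eigenvectors of (S0, S1) give candidate projection directions;
# in the basis where V.T @ S1 @ V = I, the projected KL contribution of a
# direction with eigenvalue w is (w - 1 - log w)/2, so rank by that score.
w, V = eigh(S0, S1)                          # solves S0 v = w S1 v
score = w - 1 - np.log(w)
P = V[:, np.argsort(score)[::-1][:k]].T      # k x d projection

kl_full = kl_zero_mean(S0, S1)
kl_proj = kl_zero_mean(P @ S0 @ P.T, P @ S1 @ P.T)
print(f"KL full: {kl_full:.3f}, KL after {k}-D projection: {kl_proj:.3f}")
```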

10.
Multi-label learning is dedicated to learning functions such that each sample is associated with its true label set. As data knowledge grows, feature dimensionality increases. However, high-dimensional data may contain noise, making multi-label learning difficult. Feature selection is a technique that can effectively reduce the data dimension. In feature selection research, multi-objective optimization algorithms have shown excellent global optimization performance, and the Pareto relationship can handle the contradictory objectives in a multi-objective problem well. Therefore, a Shapley-value-fused feature selection algorithm for multi-label learning (SHAPFS-ML) is proposed. The method takes multi-label criteria as the optimization objectives, and the proposed crossover and mutation operators based on Shapley values help identify relevant, redundant, and irrelevant features. Experimental comparisons on real-world datasets reveal that SHAPFS-ML is an effective feature selection method for multi-label classification, reducing the classification algorithm's computational complexity and improving classification accuracy.
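A naive permutation-based estimate of Shapley values for feature relevance, with test accuracy as the coalition value (a generic illustration of the Shapley idea on a single-label toy task; the SHAPFS-ML operators themselves are not reproduced):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=6, n_informative=3,
                           n_redundant=1, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

def value(subset):
    """Coalition value: test accuracy of a model trained on the subset."""
    if not subset:
        return max(np.mean(y_te), 1 - np.mean(y_te))   # majority baseline
    cols = list(subset)
    clf = LogisticRegression(max_iter=1000).fit(X_tr[:, cols], y_tr)
    return clf.score(X_te[:, cols], y_te)

# Monte Carlo Shapley: average marginal contributions over random orderings.
rng = np.random.default_rng(8)
n_feat, n_perm = X.shape[1], 30
shap = np.zeros(n_feat)
for _ in range(n_perm):
    order = rng.permutation(n_feat)
    coalition, v_prev = set(), value(set())
    for f in order:
        coalition.add(f)
        v_new = value(coalition)
        shap[f] += v_new - v_prev
        v_prev = v_new
shap /= n_perm
print("estimated Shapley values:", np.round(shap, 4))
```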

11.
Intelligence is a central feature of human beings' primary and interpersonal experience. Understanding how intelligence originated and scaled during evolution is a key challenge for modern biology. Some of the most important approaches to understanding intelligence are the ongoing efforts to build new intelligences in computer science (AI) and bioengineering. However, progress has been stymied by a lack of multidisciplinary consensus on what is central about intelligence regardless of the details of its material composition or origin (evolved vs. engineered). We show that Buddhist concepts offer a unique perspective and facilitate a consilience of biology, cognitive science, and computer science toward understanding intelligence in truly diverse embodiments. In coming decades, chimeric and bioengineering technologies will produce a wide variety of novel beings that look nothing like familiar natural life forms; how shall we gauge their moral responsibility and our own moral obligations toward them, without the familiar touchstones of standard evolved forms as comparison? Such decisions cannot be based on what the agent is made of or how much design vs. natural evolution was involved in their origin. We propose that the scope of our potential relationship with, and so also our moral duty toward, any being can be considered in the light of Care—a robust, practical, and dynamic lynchpin that formalizes the concepts of goal-directedness, stress, and the scaling of intelligence; it provides a rubric that, unlike other current concepts, is likely to not only survive but thrive in the coming advances of AI and bioengineering. We review relevant concepts in basal cognition and Buddhist thought, focusing on the size of an agent's goal space (its cognitive light cone) as an invariant that tightly links intelligence and compassion. Implications range across interpersonal psychology, regenerative medicine, and machine learning. The Bodhisattva's vow ("for the sake of all sentient life, I shall achieve awakening") is a practical design principle for advancing intelligence in our novel creations and in ourselves.

12.
Near-infrared (NIR) spectroscopy is one of the most popular food-inspection methods. Analyzing such high-dimensional spectral data usually requires dimensionality-reduction algorithms to extract features, yet most algorithms can only analyze a single dataset. Contrastive principal component analysis, based on contrastive learning, has been successfully applied to NIR detection of pesticide residues on the surfaces of different fruits, but it can only combine the original features linearly, which limits its feature-extraction power, and it requires tuning a contrast parameter to control the influence of the background set, at a considerable cost in time. The contrastive variational autoencoder (cVAE) is an improved algorithm based on contrastive learning and the variational autoencoder; it has been used in image denoising and RNA sequence analysis. It retains the ability to analyze multiple datasets and, because it incorporates a neural-network probabilistic generative model, it can extract nonlinear latent features. We applied the cVAE algorithm to NIR spectral analysis and built an accurate dimensionality-reduction model for NIR spectral data. In practical validation, cVAE was used to detect melamine adulteration in pure milk of different brands and batches. The results show that the plain VAE could only distinguish the different brands and batches of milk, while the crucial information of whether melamine was present could not be revealed; with cVAE, because the added background dataset separated out the irrelevant variation, samples with and without melamine adulteration could be clearly classified. This shows that cV...
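For reference, a minimal sketch of the contrastive PCA baseline discussed above, which finds directions maximizing target-set variance while suppressing background-set variance via the eigenvectors of C_target − α·C_background (synthetic stand-in "spectra"; α and all shapes are assumptions):

```python
import numpy as np

# Contrastive PCA on synthetic "spectra": the background set carries only
# nuisance variation; the target set additionally carries a hidden class
# signal (e.g., adulterated vs. clean). Alpha is the contrast parameter.
rng = np.random.default_rng(9)
n, p, alpha = 300, 120, 2.0
wl = np.linspace(0, 1, p)                         # pseudo-wavelength axis
nuisance = lambda m: rng.normal(0, 1, (m, 1)) * np.cos(4 * np.pi * wl)
adulterated = rng.integers(0, 2, (n, 1))          # hidden class of interest
background = nuisance(n) + 0.05 * rng.standard_normal((n, p))
target = (nuisance(n) + adulterated * 0.3 * np.sin(2 * np.pi * wl)
          + 0.05 * rng.standard_normal((n, p)))

def cov(Z):
    Zc = Z - Z.mean(axis=0)
    return Zc.T @ Zc / (len(Z) - 1)

w, V = np.linalg.eigh(cov(target) - alpha * cov(background))
top = V[:, np.argsort(w)[::-1][:2]]               # top contrastive directions
scores = (target - target.mean(axis=0)) @ top
# The first contrastive score should now separate the two hidden classes.
print(np.round([scores[adulterated[:, 0] == c, 0].mean() for c in (0, 1)], 3))
```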

13.
With the aim of improving the reconstruction of stochastic evolution equations from empirical time-series data, we derive a full representation of the generator of the Kramers–Moyal operator via a power-series expansion of the exponential operator. This expansion is necessary for deriving the different terms in a stochastic differential equation. With the full representation of this operator, we are able to separate finite-time corrections of the power-series expansion of arbitrary order into terms with and without derivatives of the Kramers–Moyal coefficients. We arrive at a closed-form solution expressed through conditional moments, which can be extracted directly from time-series data with a finite sampling interval. We provide all finite-time correction terms for parametric and non-parametric estimation of the Kramers–Moyal coefficients for discontinuous processes, which can be easily implemented (employing Bell polynomials) in time-series analyses of stochastic processes. With exemplary cases of insufficiently sampled diffusion and jump-diffusion processes, we demonstrate the advantages of our arbitrary-order finite-time corrections and their impact in distinguishing diffusion and jump-diffusion processes strictly from time-series data.
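The lowest-order conditional-moment estimator that these corrections refine can be sketched for an Ornstein–Uhlenbeck process as follows (no finite-time corrections applied; all parameters are assumed values):

```python
import numpy as np

# Estimate the first two Kramers-Moyal coefficients of an Ornstein-Uhlenbeck
# process from conditional moments of the increments (lowest order only).
rng = np.random.default_rng(10)
theta, sigma, dt, n = 1.0, 0.5, 0.01, 200_000
x = np.empty(n); x[0] = 0.0
for t in range(n - 1):                      # Euler-Maruyama simulation
    x[t + 1] = x[t] - theta * x[t] * dt + sigma * np.sqrt(dt) * rng.standard_normal()

bins = np.linspace(-1, 1, 21)
idx = np.digitize(x[:-1], bins)
dx = np.diff(x)
for b in range(5, 16):                      # central bins with enough samples
    sel = idx == b
    if sel.sum() < 100:
        continue
    xc = 0.5 * (bins[b - 1] + bins[b])
    d1 = dx[sel].mean() / dt                # drift ~ -theta * x
    d2 = (dx[sel] ** 2).mean() / (2 * dt)   # diffusion ~ sigma^2 / 2
    print(f"x={xc:+.2f}  D1={d1:+.3f} (exact {-theta * xc:+.3f})  "
          f"D2={d2:.4f} (exact {sigma**2 / 2:.4f})")
```

At small dt this naive estimator is already close; the paper's contribution is precisely the correction terms needed when the sampling interval is not small.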

14.
Principal component analysis (PCA) is a popular technique in remote sensing for dimensionality reduction. While PCA is suitable for data compression, it is not necessarily an optimal technique for feature extraction, particularly when the features are exploited in supervised learning applications (Cheriyadat and Bruce, 2003) [1]. Preserving features belonging to the target is crucial to the performance of target detection/recognition techniques. A Fukunaga–Koontz transform (FKT) based supervised band-reduction technique can be used to meet this requirement. FKT achieves feature selection by transforming the data into a new space in which the feature classes have complementary eigenvectors. Analysis of these eigenvectors under two classes, target and background clutter, can be utilized for target-oriented band reduction, since each basis function best represents the target class while carrying the least information about the background class. By selecting the few eigenvectors most relevant to the target class, the dimension of hyperspectral data can be reduced, which presents significant advantages for near-real-time target detection applications. The nonlinear properties of the data can be extracted by a kernel approach, which provides better target features. Thus, we propose constructing a kernel FKT (KFKT) for target-oriented band reduction. The performance of the proposed KFKT-based target-oriented dimensionality reduction algorithm has been tested on two real-world hyperspectral datasets, and the results are reported accordingly.
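A sketch of the linear FKT step on synthetic two-class data (the kernel variant applies a kernel map first; all data and sizes are assumptions): whiten the summed covariance, then eigendecompose the whitened target covariance, whose eigenvalues pair with the background eigenvalues so that they sum to one:

```python
import numpy as np

# Basic (linear) Fukunaga-Koontz transform on synthetic target/background data.
rng = np.random.default_rng(11)
d, n = 30, 500                              # "bands", samples per class
target = rng.normal(0, 1, (n, d)) @ rng.normal(0, 1, (d, d)) * 0.1
target[:, :5] += rng.normal(0, 3, (n, 5))   # target energy in a few bands
backgr = rng.normal(0, 1, (n, d))

St = target.T @ target / n
Sb = backgr.T @ backgr / n

# Whiten the summed covariance P = St + Sb, so that W @ P @ W.T = I.
w, U = np.linalg.eigh(St + Sb)
W = U @ np.diag(w ** -0.5) @ U.T

# In the whitened space St' and Sb' share eigenvectors, with eigenvalues
# summing to 1; eigenvectors with St'-eigenvalue near 1 are target-dominant.
lam, V = np.linalg.eigh(W @ St @ W.T)
order = np.argsort(lam)[::-1]
k = 5
bands = W.T @ V[:, order[:k]]               # reduced, target-oriented basis
print("top target-share eigenvalues:", np.round(lam[order[:k]], 3))
```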

15.
Deep learning, in general, is built on input data transformation and presentation, model training with parameter tuning, and recognition of new observations using the trained model. However, this comes at a high computational cost, due to the extensive input databases and the time required for training. Although the model learns its parameters from the transformed input data, no direct research has investigated the mathematical relationship between the transformed information (i.e., features, excitations) and the model's learnt parameters (i.e., weights). This research explores the mathematical relationship between the input excitations and the weights of a trained convolutional neural network. The objective is to investigate three aspects of this assumed feature-weight relationship: (1) the relationship between the training images' features and the model's learnt parameters, (2) the relationship between the features of a separate test dataset and a trained model's learnt parameters, and (3) the relationship between the difference between training and testing images' features and the model's learnt parameters. The paper empirically demonstrates the existence of this mathematical relationship between the test image features and the model's learnt weights through ANOVA analysis.
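For readers unfamiliar with the test itself, a minimal one-way ANOVA call on synthetic groups (purely to illustrate the statistical tool; this is not the paper's feature/weight data or grouping):

```python
import numpy as np
from scipy.stats import f_oneway

# One-way ANOVA tests whether group means differ; the groups below are
# synthetic stand-ins (e.g., feature responses under different filter groups).
rng = np.random.default_rng(12)
group_a = rng.normal(0.00, 1.0, 200)
group_b = rng.normal(0.05, 1.0, 200)
group_c = rng.normal(0.60, 1.0, 200)   # a group with a genuinely shifted mean

stat, p = f_oneway(group_a, group_b, group_c)
print(f"F = {stat:.2f}, p = {p:.3g}")  # small p: at least one mean differs
```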

16.
A two-party private set intersection (PSI) protocol allows two parties, the client and the server, to compute the intersection of their private sets without revealing any information beyond the intersecting elements. We present a novel private set intersection protocol based on Shuhong Gao's fully homomorphic encryption scheme and prove the security of the protocol in the semi-honest model. We also present a variant of the protocol, a completely novel construction for computing the intersection based on a Bloom filter and fully homomorphic encryption, whose complexity is independent of the set size of the client. The security of the protocols relies on the learning with errors and ring learning with errors problems. Furthermore, in the cloud with malicious adversaries, the computation of the private set intersection can be outsourced to the cloud service provider without revealing any private information.
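A plaintext sketch of the Bloom-filter membership logic underlying the variant (the cryptographic layer is deliberately omitted: in the actual protocol the filter queries would be evaluated under fully homomorphic encryption, whereas this plain version reveals everything and is for illustration only):

```python
import hashlib

# Plain (non-private) Bloom filter: the combinatorial building block only.
class BloomFilter:
    def __init__(self, m_bits=4096, k_hashes=4):
        self.m, self.k = m_bits, k_hashes
        self.bits = bytearray(m_bits // 8)

    def _positions(self, item: str):
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, item: str):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def contains(self, item: str) -> bool:
        return all(self.bits[p // 8] >> (p % 8) & 1 for p in self._positions(item))

server_set = {"alice@example.com", "bob@example.com", "carol@example.com"}
client_set = {"bob@example.com", "dave@example.com"}

bf = BloomFilter()
for s in server_set:
    bf.add(s)
# False positives are possible; true intersection elements always match.
print([c for c in client_set if bf.contains(c)])
```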

17.
The problem of calculating the equilibrium properties of a v-dimensional fluid mixture of hard v-spheres is studied. A high-temperature expansion for the density-independent radial distribution function is derived for a hard v-sphere mixture. The 'excess' quantum corrections to the second virial coefficient and the excess free energy are also studied. A significant feature is the large increase of the 'excess' quantum correction with increasing dimensionality.

18.
The biomedical field is characterized by an ever-increasing production of sequential data, which often come in the form of biosignals capturing the time evolution of physiological processes, such as blood pressure and brain activity. This has motivated a large body of research on the development of machine learning techniques for the predictive analysis of such biosignals. Unfortunately, in high-stakes decision making, such as clinical diagnosis, the opacity of machine learning models becomes a crucial aspect to be addressed in order to increase the trust in and adoption of AI technology. In this paper, we propose a model-agnostic explanation method, based on occlusion, that enables learning the input's influence on the model predictions. We specifically target problems involving the predictive analysis of time-series data and the models typically used to deal with data of such nature, i.e., recurrent neural networks. Our approach is able to provide two different kinds of explanations: one suitable for technical experts, who need to verify the quality and correctness of machine learning models, and one suited to physicians, who need to understand the rationale underlying the prediction in order to make informed decisions. Extensive experimentation on different physiological datasets demonstrates the effectiveness of our approach in both classification and regression tasks.
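The occlusion idea can be sketched in a few lines: slide a mask over the series, replace each window with a baseline value, and attribute importance by the change in the model output (a stand-in function plays the role of the trained recurrent network; window size, stride, and baseline are assumed choices):

```python
import numpy as np

# Occlusion-based attribution for a time-series model (generic illustration).
def model(x):
    # Toy predictor that mostly reacts to the segment around t = 60..80.
    return np.tanh(x[..., 60:80].mean(axis=-1) * 3.0)

rng = np.random.default_rng(13)
x = rng.standard_normal(200)                 # one biosignal-like series
baseline = x.mean()                          # occlusion replacement value
window, stride = 10, 5

importance = np.zeros(len(x))
counts = np.zeros(len(x))
y0 = model(x)
for start in range(0, len(x) - window + 1, stride):
    x_occ = x.copy()
    x_occ[start:start + window] = baseline   # occlude one window
    delta = abs(y0 - model(x_occ))           # prediction change = influence
    importance[start:start + window] += delta
    counts[start:start + window] += 1
importance /= np.maximum(counts, 1)
print("most influential time steps:", np.argsort(importance)[::-1][:10])
```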

19.
We consider a recently introduced generalization of the Ising model in which individual spin strengths can vary. The model is intended for the analysis of ordering in systems comprising agents which, although matching in their binarity (i.e., maintaining the iconic Ising features of '+' or '−', 'up' or 'down', 'yes' or 'no'), differ in their strength. To investigate the interplay between the variable properties of the nodes and the interactions between them, we study the model on a complex network where both the spin strength and degree distributions are governed by power laws. We show that in the annealed network approximation the thermodynamic functions of the model are self-averaging, and we obtain an exact solution for the partition function. This allows us to derive the leading temperature and field dependencies of the thermodynamic functions, their critical behavior, and logarithmic corrections at the interfaces of the different phases. We find that the delicate interplay of the two power laws leads to new universality classes.
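A toy Metropolis simulation of Ising spins with power-law-distributed strengths on a fully connected graph (a mean-field caricature; the paper's annealed power-law network and exact couplings are not reproduced, and all parameters are assumptions):

```python
import numpy as np

# Ising spins sigma_i = +/-1 with strengths s_i ~ power law (exponent mu),
# energy E = -(J/2N) * sum_{i != j} s_i sigma_i s_j sigma_j.
rng = np.random.default_rng(14)
N, J, mu = 2000, 1.0, 3.5
s = (1 - rng.random(N)) ** (-1 / (mu - 1))   # strengths with pdf ~ s^(-mu), s >= 1
sigma = rng.choice([-1, 1], N)
M = np.sum(s * sigma)                        # strength-weighted magnetization

def sweep(T):
    global M
    for i in rng.integers(0, N, N):
        # Energy change of flipping spin i, using the running total M.
        dE = (2 * J / N) * (s[i] * sigma[i] * M - s[i] ** 2)
        if dE <= 0 or rng.random() < np.exp(-dE / T):
            M -= 2 * s[i] * sigma[i]
            sigma[i] = -sigma[i]

for T in (0.5, 1.0, 2.0):
    sigma[:] = 1
    M = np.sum(s * sigma)
    for _ in range(200):
        sweep(T)
    print(f"T={T}: |M|/sum(s) = {abs(M) / s.sum():.3f}")
```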

20.
In this paper, a new parametric compound G family of continuous probability distributions, called the Poisson generalized exponential G (PGEG) family, is derived and studied. Relevant mathematical properties are derived. Some new bivariate G families are presented using the theorems of the "Farlie-Gumbel-Morgenstern copula", "the modified Farlie-Gumbel-Morgenstern copula", "the Clayton copula", and "Rényi's entropy copula". Many special members are derived, and special attention is devoted to the exponential and the one-parameter Pareto type II models. The maximum likelihood method is used to estimate the model parameters. A graphical simulation is performed to assess the finite-sample behavior of the maximum likelihood estimators. Two real-life data applications illustrate the importance of the new family.
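The PGEG density is not reproduced in this abstract, so as a minimal illustration of the maximum-likelihood step, the sketch below fits the classical two-parameter generalized exponential distribution, f(x) = αλ e^{−λx}(1 − e^{−λx})^{α−1}, by numerical MLE (all parameter values are assumptions):

```python
import numpy as np
from scipy.optimize import minimize

# Simulate from the generalized exponential via the inverse CDF,
# F(x) = (1 - exp(-l*x))**a  =>  x = -log(1 - u**(1/a)) / l.
rng = np.random.default_rng(15)
a_true, l_true, n = 2.5, 1.2, 2000
u = rng.random(n)
x = -np.log(1 - u ** (1 / a_true)) / l_true

def neg_loglik(params):
    a, l = np.exp(params)                   # optimize on log scale so a, l > 0
    e = np.exp(-l * x)
    return -np.sum(np.log(a) + np.log(l) - l * x + (a - 1) * np.log1p(-e))

res = minimize(neg_loglik, x0=np.log([1.0, 1.0]), method="Nelder-Mead")
a_hat, l_hat = np.exp(res.x)
print(f"alpha: true {a_true}, MLE {a_hat:.3f}; lambda: true {l_true}, MLE {l_hat:.3f}")
```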
