1.
Interval-valued symbolic data are an important type of symbolic data. Existing work typically assumes that the points within each interval are uniformly distributed, which limits its applicability. Under a general distributional assumption, this paper derives an extended Hausdorff distance for interval-valued symbolic data with general distributions and, based on it, proposes a SOM clustering algorithm for such data. Simulation experiments show that SOM clustering based on the proposed extended Hausdorff distance outperforms SOM clustering based on the traditional Hausdorff distance and on the μσ distance. Finally, the method is applied to the cluster analysis of meteorological data, illustrating its workflow and practicality and further assessing its effectiveness on real problems.
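For reference, the classical Hausdorff distance between two intervals, which the paper generalizes to non-uniform within-interval distributions, is simple to compute. A minimal sketch (function names are mine; the extended, distribution-weighted version is not reproduced here) that also builds the pairwise distance matrix a SOM or medoid-based clusterer would consume:

```python
def interval_hausdorff(p, q):
    """Classical Hausdorff distance between closed intervals p = (a1, b1)
    and q = (a2, b2): max(|a1 - a2|, |b1 - b2|)."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

def distance_matrix(intervals):
    """Pairwise interval distances, e.g. as input to a SOM or k-medoids run."""
    return [[interval_hausdorff(p, q) for q in intervals] for p in intervals]
```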
2.
《数学建模及其应用》2019,(4)
Statistical hypothesis testing is applied to feature selection and redundancy removal for gene expression data, and a corresponding model and algorithm are proposed. Compared with models and algorithms in the existing literature, the proposed approach is intuitive, easy to understand, simple to implement, and efficient to run. Numerical experiments on three binary-class gene expression datasets show that the method performs well at both feature selection and redundancy removal. Building on this, a class-centroid distance method is used to classify samples on the selected marker genes; the results further show that the proposed method attains high classification accuracy on binary-class gene expression data.
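The general recipe can be sketched as follows, assuming a Welch t statistic for the per-gene test and a squared-distance class-centroid rule; the paper's exact test statistic may differ, and all names here are illustrative:

```python
from math import sqrt
from statistics import mean, variance

def welch_t(xs, ys):
    """Welch two-sample t statistic for one feature across the two classes."""
    return (mean(xs) - mean(ys)) / sqrt(variance(xs) / len(xs) + variance(ys) / len(ys))

def select_features(class_a, class_b, k):
    """Rank features by |t| and keep the k most discriminative indices."""
    dim = len(class_a[0])
    scores = [abs(welch_t([s[i] for s in class_a], [s[i] for s in class_b]))
              for i in range(dim)]
    return sorted(range(dim), key=lambda i: -scores[i])[:k]

def centroid_classify(sample, class_a, class_b, feats):
    """Class-centroid distance rule on the selected features."""
    def cdist(cls):
        cen = [mean(s[i] for s in cls) for i in feats]
        return sum((sample[i] - c) ** 2 for i, c in zip(feats, cen))
    return 'A' if cdist(class_a) <= cdist(class_b) else 'B'
```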
3.
《数学的实践与认识》2013,(20)
Since traditional spectral clustering handles multi-scale data poorly, we introduce a new similarity measure, a density-sensitive similarity, which stretches the distances between points in different high-density regions and shrinks the distances between points within the same high-density region, thereby capturing the actual cluster structure of the data. We also introduce the notion of the eigengap and use it to determine the number of clusters automatically. Numerical experiments confirm the feasibility and effectiveness of the proposed algorithm.
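A minimal sketch of the two ingredients: a density-sensitive edge length ρ^d − 1 (one common choice in the density-sensitive spectral clustering literature; the paper's exact form may differ), which makes one long jump costlier than many short hops through a dense region, and eigengap-based selection of the cluster number:

```python
def edge_length(d, rho=2.0):
    """Density-sensitive line length rho**d - 1: superadditive in d, so a long
    jump between two dense regions costs more than many short hops inside one."""
    return rho ** d - 1.0

def choose_k_by_eigengap(eigvals):
    """Pick the cluster number at the largest gap of the affinity-matrix
    eigenvalues, given sorted in decreasing order."""
    gaps = [eigvals[i] - eigvals[i + 1] for i in range(len(eigvals) - 1)]
    return gaps.index(max(gaps)) + 1
```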
4.
In a given metric space, the unit clustering problem asks for the minimum number of unit balls needed to cover a given set of points. It is a well-known combinatorial optimization problem. In its online version, n points arrive one by one at arbitrary positions; each point must be assigned to a unit cluster on arrival, with no information about future points, and the goal is to minimize the total number of unit clusters used. This paper considers a class of one-dimensional online unit clustering problems under the following assumption: in an optimal solution of the corresponding offline problem, the distance between any two adjacent clusters exceeds 0.5. We first present two online algorithms and several lemmas, then combine them into a randomized algorithm that runs each of the two with probability 0.5, and finally prove that the expected competitive ratio of this combined randomized algorithm is at most 1.5.
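For intuition, a naive first-fit online baseline (not one of the paper's two algorithms, nor their randomized combination) can be written in a few lines: each arriving point joins the first cluster whose span would stay within unit length, otherwise it opens a new cluster.

```python
def online_unit_clusters(points):
    """First-fit online unit clustering on the line.
    Each cluster is stored as [lo, hi] with hi - lo <= 1."""
    clusters, labels = [], []
    for p in points:
        for i, (lo, hi) in enumerate(clusters):
            if max(hi, p) - min(lo, p) <= 1.0:  # p fits without stretching past 1
                clusters[i] = [min(lo, p), max(hi, p)]
                labels.append(i)
                break
        else:  # no existing cluster can absorb p
            clusters.append([p, p])
            labels.append(len(clusters) - 1)
    return clusters, labels
```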
5.
6.
7.
8.
《数理统计与管理》2015,(5):809-820
Imbalanced data refers to classification problems in which one class of the target variable has far more observations than the others. To address shortcomings of the SMOTE algorithm and its derivatives, this paper proposes a new oversampling algorithm, SMUP (Synthetic Minority Using Proximity of Random Forests), which replaces the distance measure in SMOTE with a random-forest-based sample proximity, improving classification accuracy. Experiments show that single classifiers built on SMUP effectively raise the classification accuracy of the minority class while resolving SMOTE's poor distance handling of categorical features; ensemble classifiers built on SMUP also clearly outperform the SMOTE derivatives. Most importantly, SMUP unifies the distance measures for continuous, mixed, and categorical features in a single framework, which is convenient in practice.
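The interpolation step that SMUP inherits from SMOTE can be sketched as below; plain Euclidean nearest neighbours stand in for SMUP's random-forest proximity, and all names are illustrative:

```python
import random

def smote_like(minority, k=2, n_new=4, seed=0):
    """Generate synthetic minority samples by interpolating each sample toward
    one of its k nearest minority neighbours (the plain SMOTE idea; SMUP
    replaces this Euclidean neighbour search with random-forest proximity)."""
    rng = random.Random(seed)
    d = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        neigh = sorted((p for p in minority if p is not x),
                       key=lambda p: d(x, p))[:k]
        z = rng.choice(neigh)
        t = rng.random()  # random point on the segment between x and z
        synthetic.append(tuple(xi + t * (zi - xi) for xi, zi in zip(x, z)))
    return synthetic
```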
9.
We consider a class of semi-supervised distance metric learning problems. As sample sets (databases) grow rapidly in size and complexity, the learned distance metric matrix should be sparse, so we add a sparsity constraint on the learned matrix to an existing metric learning model. To ease the solution of the model, the sparsity constraint is imposed via a Frobenius-norm constraint, which is then moved into the objective by a penalty method, turning the constrained model into an unconstrained optimization problem. To solve it, we propose an accelerated projected gradient algorithm on the set of positive definite matrices, overcoming the difficulty that linear combinations cannot be taken directly within this set, and analyze its convergence. Finally, numerical experiments on classification problems from the UCI repository demonstrate the sparsity of the learned matrix and the effectiveness of the accelerated projected gradient algorithm.
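A much-simplified sketch of the projected-gradient idea, restricted to a diagonal metric so that projection onto the positive semidefinite cone reduces to clipping negative weights (the paper works with full positive definite matrices and an accelerated scheme; the objective below is an illustrative stand-in):

```python
def project_nonneg(w):
    # for a diagonal metric, PSD-cone projection reduces to clipping at zero
    return [max(wi, 0.0) for wi in w]

def learn_diag_metric(similar, dissimilar, dim, lam=0.1, lr=0.05, iters=200):
    """Minimise  sum_S d_w(x,y) - sum_D d_w(x,y) + lam * ||w||^2
    with d_w(x,y) = sum_i w_i (x_i - y_i)^2, subject to w >= 0,
    by plain projected gradient descent."""
    w = [1.0] * dim
    for _ in range(iters):
        grad = [2 * lam * wi for wi in w]          # Frobenius-type penalty term
        for (x, y) in similar:                     # pull similar pairs together
            for i in range(dim):
                grad[i] += (x[i] - y[i]) ** 2
        for (x, y) in dissimilar:                  # push dissimilar pairs apart
            for i in range(dim):
                grad[i] -= (x[i] - y[i]) ** 2
        w = project_nonneg([wi - lr * gi for wi, gi in zip(w, grad)])
    return w
```

On a toy problem where feature 0 only varies inside similar pairs and feature 1 only across dissimilar pairs, the learned weight for feature 0 is driven to zero, i.e. the metric comes out sparse.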
10.
Soft margin classification in metric spaces
We study the soft margin classification problem in metric spaces. Using properties of the metric d, we obtain a nonlinear map that isometrically embeds the metric space into a Banach space, and we construct a soft margin classification algorithm.
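The embedding idea can be illustrated with the finite Kuratowski-style map x ↦ (d(x, a_1), …, d(x, a_n)) into R^n with the sup norm, which is an exact isometry on the anchor set itself; the paper's Banach-space construction is along these lines but not reproduced here:

```python
def embed(points, dist):
    """Kuratowski-style embedding: x -> (d(x, a_1), ..., d(x, a_n)).
    On the anchor set the sup norm of differences recovers dist exactly."""
    return {p: tuple(dist(p, a) for a in points) for p in points}

def sup_dist(u, v):
    """Sup-norm distance in R^n."""
    return max(abs(a - b) for a, b in zip(u, v))
```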
11.
Takeshi Asada, Yeboon Yun, Hirotaka Nakayama & Tetsuzo Tanino 《Computational Management Science》2004,1(3-4):211-230
Support Vector Machines (SVMs) are now very popular as a powerful method for pattern classification problems. One of the main features of SVMs is to produce a separating hyperplane that maximizes the margin in a feature space induced by nonlinear mapping with a kernel function. As a result, SVMs can handle not only linear separation but also nonlinear separation. While the soft margin method of SVMs considers only the distance between the separating hyperplane and misclassified data, we propose in this paper a multi-objective programming formulation that takes surplus variables into account. A similar formulation was extensively researched in linear discriminant analysis, mostly in the 1980s, using Goal Programming (GP). This paper compares these conventional methods, such as SVMs and GP, with our proposed formulation through several examples. Received: September 2003; Revised: December 2003
12.
Existing one-class classification algorithms usually describe similarity between samples with the classical Euclidean metric, which fails to reflect the intrinsic distribution structure of some datasets and thus limits the descriptive power of these methods. We propose a distance metric learning algorithm for one-class data in high-dimensional spaces that improves the descriptive performance of one-class classifiers. Unlike existing metric learning algorithms, it requires only target-class data: by introducing a regularization term based on the prior distribution of the samples and an L1-norm sparsity constraint on the metric, it effectively solves one-class metric learning in the high-dimensional, small-sample setting, and the resulting optimization problem is solved efficiently with a block coordinate descent algorithm. The learned metric can easily be embedded into one-class classifiers. Simulation results show that it effectively improves their descriptive performance, in particular the descriptive power of SVDD, giving one-class classifiers stronger generalization ability.
13.
Support vector machines (SVMs) have attracted much attention in theoretical and in applied statistics. The main topics of recent interest are consistency, learning rates and robustness. We address the open problem whether SVMs are qualitatively robust. Our results show that SVMs are qualitatively robust for any fixed regularization parameter λ. However, under extremely mild conditions on the SVM, it turns out that SVMs are no longer qualitatively robust for any null sequence λn; these are the classical sequences needed to obtain universal consistency. This lack of qualitative robustness is of a rather theoretical nature because we show that, in any case, SVMs fulfill a finite sample qualitative robustness property. For a fixed regularization parameter, SVMs can be represented by a functional on the set of all probability measures. Qualitative robustness is proven by showing that this functional is continuous with respect to the topology generated by weak convergence of probability measures. Combined with the existence and uniqueness of SVMs, our results show that SVMs are the solutions of a well-posed mathematical problem in Hadamard's sense.
14.
Julio López, Sebastián Maldonado & Ricardo Montoya 《The Journal of the Operational Research Society》2017,68(11):1323-1334
Support vector machines (SVMs) have been successfully used to identify individuals' preferences in conjoint analysis. One of the challenges of using SVMs in this context is to properly control for preference heterogeneity among individuals to construct robust partworths. In this work, we present a new technique that obtains all individual utility functions simultaneously in a single optimization problem based on three objectives: complexity reduction, model fit, and heterogeneity control. While complexity reduction and model fit are dealt with using SVMs, heterogeneity is controlled by shrinking the individual-level partworths toward a population mean. The proposed approach is further extended to kernel-based machines, conferring flexibility on the model by allowing nonlinear utility functions. Experiments on simulated and real-world datasets show that the proposed approach in its linear form outperforms existing methods for choice-based conjoint analysis.
15.
An asymptotic formula is obtained for the number of imaginary quadratic number fields with 2-class number equal to 2, from which one can then obtain a type of density result for the 2-class number. The solution of this problem leads to an interesting question about a character sum over primes.
16.
A semigroup is regular if it contains at least one idempotent in each ℒ-class and in each ℛ-class. A regular semigroup is inverse if it satisfies either of the following equivalent conditions: (i) there is a unique idempotent in each ℒ-class and in each ℛ-class, or (ii) the idempotents commute. Analogously, a semigroup is abundant if it contains at least one idempotent in each ℒ*-class and in each ℛ*-class. An abundant semigroup is adequate if its idempotents commute. In adequate semigroups, there is a unique idempotent in each ℒ*-class and in each ℛ*-class. M. Kambites raised the question of the converse: in a finite abundant semigroup such that there is a unique idempotent in each ℒ*-class and in each ℛ*-class, must the idempotents commute? In this note, we provide a negative answer to this question.
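The definitions are easy to test mechanically on small examples. A sketch (helper names are mine) that finds the idempotents of a finite semigroup given by its Cayley table and checks whether they commute; the two-element left-zero semigroup (xy = x) is regular, but its idempotents do not commute, so it is not inverse:

```python
def idempotents(table):
    """Indices e with e*e = e in a finite semigroup given by its Cayley table
    (table[x][y] is the product x*y)."""
    return [e for e in range(len(table)) if table[e][e] == e]

def idempotents_commute(table):
    """Check e*f == f*e for every pair of idempotents."""
    es = idempotents(table)
    return all(table[e][f] == table[f][e] for e in es for f in es)
```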
17.
ZHANG ChunHua, TIAN YingJie & DENG NaiYang (School of Information, Renmin University of China, Beijing; Research Center on Fictitious Economy & Data Science, Chinese Academy of Sciences, Beijing; College of Science, China Agricultural University, Beijing) 《中国科学 数学(英文版)》2010,(1)
This paper is concerned with the theoretical foundation of support vector machines (SVMs). The purpose is to develop further an exact relationship between SVMs and statistical learning theory (SLT). As a representative, the standard C-support vector classification (C-SVC) is considered here. More precisely, we show that the decision function obtained by C-SVC is just one of the decision functions obtained by solving the optimization problem derived directly from the structural risk minimization principle…
18.
Multiclass classification and probability estimation have important applications in data analytics. Support vector machines (SVMs) have shown great success in various real-world problems due to their high classification accuracy. However, one main limitation of standard SVMs is that they do not provide class probability estimates, and thus fail to offer an uncertainty measure about class prediction. In this article, we propose a simple yet effective framework to endow kernel SVMs with the feature of multiclass probability estimation. The new probability estimator does not rely on any parametric assumption on the data distribution; it is therefore flexible and robust. Theoretically, we show that the proposed estimator is asymptotically consistent. Computationally, the new procedure can be conveniently implemented using standard SVM software. Our extensive numerical studies demonstrate competitive performance of the new estimator when compared with existing methods such as multiple logistic regression, linear discriminant analysis, tree-based methods, and random forest, under various classification settings. Supplementary materials for this article are available online.
19.
Method: In this paper, we introduce a bi-level optimization formulation for the model and feature selection problems of support vector machines (SVMs). A bi-level optimization model is proposed to select the best model, in which the standard convex quadratic optimization problem of SVM training is cast as a subproblem.
Feasibility: The optimal objective value of the SVM quadratic problem is minimized over a feasible range of the kernel parameters at the master level of the bi-level model. Since the optimal objective value of the subproblem is a continuous function of the kernel parameters, though implicitly defined over a certain region, a solution of this bi-level problem always exists. The problem of feature selection can be handled in a similar manner.
Experiments and results: Two approaches for solving the bi-level problem of model and feature selection are considered as well. Experimental results show that the bi-level formulation provides a plausible tool for model selection.