Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
2.
3.
We consider the implications of streaming data for data analysis and data mining. Streaming data are becoming widely available from a variety of sources; here we consider the implications arising from Internet traffic data. By implication, streaming data are unlikely to be time homogeneous, so standard statistical and data mining procedures do not necessarily apply. Because it is essentially impossible to store streaming data, we consider recursive algorithms, adaptive algorithms that discount the past, and algorithms that create finite pseudo-samples. We also suggest some evolutionary graphics procedures that are suitable for streaming data. We begin with a discussion of Internet traffic to give the reader some sense of the time and data scale and the visual resolution needed for such problems.
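
The idea of a recursive algorithm that discounts the past can be illustrated with an exponentially weighted running mean and variance. This is only a minimal sketch of one common scheme, not the procedure from the article; the class name `DiscountedStats` and the forgetting factor `alpha` are assumptions introduced here for illustration.

```python
class DiscountedStats:
    """Running mean and variance with exponential forgetting.

    Hypothetical helper: each new observation gets weight `alpha`,
    and all earlier observations are down-weighted by (1 - alpha).
    Nothing from the stream is stored except the two summaries.
    """

    def __init__(self, alpha):
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must lie in (0, 1]")
        self.alpha = alpha
        self.mean = None  # undefined until the first observation
        self.var = 0.0

    def update(self, x):
        """Fold one observation into the summaries in O(1) time and memory."""
        if self.mean is None:
            self.mean = float(x)
            return
        delta = x - self.mean
        self.mean += self.alpha * delta
        # Standard exponentially weighted variance recursion.
        self.var = (1.0 - self.alpha) * (self.var + self.alpha * delta * delta)


stats = DiscountedStats(alpha=0.5)
for value in [0.0, 10.0]:
    stats.update(value)
```

A larger `alpha` forgets the past faster, which suits a stream that is not time homogeneous; a smaller `alpha` gives smoother but slower-reacting estimates.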

4.
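The capture-removal model originated in surveys of biological populations and in the study of special social networks. It is a complex sampling method generally used to estimate the total size and the variance of a population of uncertain size. This paper applies an improved capture-removal model to the sampling-based estimation of propagation scale in social networks, gives a preliminary estimate of the reach of online information and of the population affected, and presents an empirical analysis using the free spread of the "smog" events that recently occurred frequently in Beijing. The study shows that the capture-removal model can effectively estimate the propagation scale of hot events in social networks and the probability of re-propagation, which indicates that social networks have gradually become an important path and mode of dissemination in the public discourse space. In view of the rapid spread of social networks, the paper also offers a preliminary exploration of sampling estimation methods in a big-data environment.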
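
The abstract does not spell out the improved model, so the sketch below falls back on the classical two-pass removal estimator as a stand-in. It assumes a closed population and a constant capture probability on both passes; the function name `removal_estimate` and the counts in the usage line are hypothetical.

```python
def removal_estimate(c1, c2):
    """Classical two-pass removal estimate (an assumed stand-in model).

    c1: individuals captured and removed on the first pass
    c2: individuals captured on the second pass
    With constant capture probability p and population size N,
    E[c1] = N*p and E[c2] = N*(1-p)*p, which gives the estimates below.
    """
    if c1 <= c2:
        raise ValueError("the estimator needs c1 > c2 (a declining catch)")
    p_hat = (c1 - c2) / c1          # estimated capture probability
    n_hat = c1 * c1 / (c1 - c2)     # estimated population size
    return n_hat, p_hat


# Hypothetical counts: 60 accounts seen on pass one, 40 new ones on pass two.
n_hat, p_hat = removal_estimate(60, 40)
```

In a social-network reading, a "pass" could be one round of sampling accounts that reposted a topic, with already-seen accounts removed before the next round; that mapping is an interpretation, not a detail given in the abstract.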

5.
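In the big-data era, more and more fields place high demands on big-data computing. In particular, much of the big data generated across industries takes the form of dynamic streaming data, so real-time, fast, and efficient computation and analysis of big-data streams is increasingly urgent. Online machine learning algorithms are an effective solution for real-time analysis of big-data streams. In machine learning, kernel learning can produce an effective kernel function, and the chosen kernel function strongly affects the performance of the kernel learner. Combining online machine learning with kernel functions, this paper studies a multi-task online learning algorithm suited to big-data stream environments, discusses the perturbation terms that may arise in the course of the algorithm, and applies a data-dependent kernel construction to broaden the algorithm's applicability. The algorithm does not need to store or rescan the historical data stream; it only needs to select one sample of the data set. When new streaming data are analyzed, it can update the current kernel function directly to the most suitable one within acceptable time, which makes it well suited to kernel learning problems in streaming big-data environments.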

6.
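Cloud computing and big data have become research hotspots in IT, and applying the data storage and data processing strengths of cloud computing to big data is of considerable practical value. The open-source cloud platform OpenStack makes it convenient to build a private cloud from the hardware-management side, and its storage module Swift can support PB-scale big-data storage. The open-source cloud platform Hadoop is strong in data processing but falls short in supporting very large data storage. Through a comparative analysis of Swift, the storage module of OpenStack, and HDFS, the file-handling module of Hadoop, this paper proposes combining Swift with Hadoop's MapReduce technology to build a private cloud computing system for enterprise big-data processing. The analysis shows that the scheme is feasible: such a heterogeneous private cloud system can integrate the respective strengths of different cloud platforms to process big data efficiently.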

7.
Recently, big data, cloud computing and the Internet of Things have provided new information technologies for the organization and management of complex systems, and they have caused multifaceted changes in the organizational framework and operations mechanism of enterprises. Based on this, we first construct a new stochastic model for a big data driven large-scale bike-sharing system, which expresses the important role played by big data and describes the operations mechanism of the large-scale bike-sharing system, specifically the rebalancing of bikes among stations by trucks. Then we present a mean-field limit theory, which is applied to analyzing the big data driven large-scale bike-sharing system. This includes establishing a time-inhomogeneous queueing system by means of the mean-field theory and setting up the mean-field equations through the time-inhomogeneous queueing system; providing an empirical measure process by means of a nonlinear birth-death process, giving algorithms for computing the fixed point in terms of a segmented structural birth-death process, and computing the average number of bikes in each station; and providing numerical examples to analyze how the steady-state average number of bikes in each station depends on some key parameters of the bike-sharing system. Using these results, this paper analyzes the physical effect of big data on the performance of the large-scale bike-sharing system. This paper therefore gives a promising research direction for stochastic models in the study of large-scale bike-sharing systems.

8.
9.
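Big data has changed modes of thinking and driven technical innovation; it has had a major impact on taxation and brings opportunities for the field's development. From the perspective of big-data thinking and technology, this paper makes a preliminary exploration of how to apply big data to identifying tax evasion. Combining big-data thinking with big-data technology, it builds a tax big-data warehouse and, based on the correlations among tax big data, applies association-rule data mining to establish a big-data path for identifying tax evasion, thereby identifying evasion. The aim is to offer some new ideas and a reference basis for big-data applications in taxation.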
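
Association-rule mining rests on two quantities, support and confidence. The sketch below computes both for a rule of the form "antecedent implies consequent". The record fields (`high_cash_ratio`, `low_reported_profit`, `audit_flag`) are invented purely for illustration and do not come from the paper.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    items = set(itemset)
    hits = sum(1 for t in transactions if items <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) over the transactions."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0  # the rule never fires, so report zero confidence
    joint = support(transactions, set(antecedent) | set(consequent))
    return joint / base


# Hypothetical taxpayer records, each reduced to a set of binary features.
records = [
    {"high_cash_ratio", "low_reported_profit", "audit_flag"},
    {"high_cash_ratio", "low_reported_profit"},
    {"high_cash_ratio", "audit_flag"},
    {"low_reported_profit"},
]
rule_conf = confidence(records, {"high_cash_ratio"}, {"audit_flag"})
```

A rule is usually kept only if both its support and its confidence exceed chosen thresholds; the thresholds themselves are a modelling decision that the abstract does not specify.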

10.
The Shapley value of certain non-superadditive games appears in the Talmudic literature, and in the Talmud itself, in several different contexts.
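
For readers unfamiliar with the Shapley value, a brute-force computation averages each player's marginal contribution over all orderings of the players. The two-player game below is a toy example chosen only because it is not superadditive (the grand coalition is worth less than the sum of the singletons); it is not one of the Talmudic games the article refers to.

```python
from fractions import Fraction
from itertools import permutations


def shapley_values(players, v):
    """Exact Shapley values by enumerating every ordering of the players.

    `v` maps a frozenset of players to the worth of that coalition.
    The cost grows factorially, so this is only for very small games.
    """
    players = list(players)
    totals = {p: Fraction(0) for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = frozenset()
        for p in order:
            enlarged = coalition | {p}
            # Marginal contribution of p when joining the current coalition.
            totals[p] += Fraction(v(enlarged)) - Fraction(v(coalition))
            coalition = enlarged
    return {p: total / len(orders) for p, total in totals.items()}


# Toy non-superadditive game (an assumption for illustration): 4 + 4 > 6.
worth = {
    frozenset(): 0,
    frozenset({"A"}): 4,
    frozenset({"B"}): 4,
    frozenset({"A", "B"}): 6,
}
values = shapley_values(["A", "B"], lambda s: worth[s])
```

Using `Fraction` keeps the averages exact, which is convenient when comparing against hand-computed divisions.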

11.
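Several theoretical problems in variational assimilation are studied; specifically, the variational assimilation problem for a class of simple models is discussed under global and local observational data. For the problem with global observational data, the variational assimilation method is used to correct the initial values, parameters, and model of the forecast model; error estimates and convergence-accuracy estimates for the method are derived theoretically, proving its effectiveness. For the problem with local observational data, the solution obtained is often ill-posed, so the usual variational assimilation method fails. To overcome the difficulty caused by this ill-posedness, variational assimilation is combined with a regularization method to correct the initial values, parameters, and model of the forecast model. Error and convergence-accuracy estimates are again derived, proving that combining variational assimilation with regularization is both necessary and effective, and a theoretical criterion for choosing the regularization parameter is provided. Finally, an example illustrates the effectiveness of the proposed method.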

12.
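Similarity of high-dimensional big data based on extenics theory   Total citations: 1, self-citations: 0, citations by others: 1
Similarity computation for high-dimensional big data is a research focus in data mining. By analyzing the difficulties of this computation, the paper proposes using extenics methods to resolve the contradictory problems involved. With high-dimensional big data represented by basic-elements, it computes similarity between data by means of data transformation, data screening, weight determination, and data preprocessing, and validates the algorithm on routine water-pollution analysis data. By using extension ideas to study big-data similarity, the paper not only gives some theoretical impetus to data-mining research but also opens a new application area for extenics.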

13.
A new class of involutive divisions induced by certain orderings of monomials is considered. It is proved that these divisions are Noetherian and constructive. Therefore, each of them allows one to compute an involutive Gröbner basis of a polynomial ideal by sequentially examining multiplicative reductions of nonmultiplicative prolongations. The dependence of involutive algorithms on the completion ordering is studied. Based on the properties of particular involutive divisions, two computational optimizations are suggested. One of them consists of a special choice of the completion ordering. The other optimization is related to recomputing multiplicative and nonmultiplicative variables in the course of the algorithm. Bibliography: 17 titles.

14.
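When processing large-scale data sets, sampling is a popular and effective approach. Volume sampling is a joint sampling method that samples in proportion to the determinant of the squared matrix. In the linear regression setting it yields unbiased estimates of the parameters. It is, however, also susceptible to outliers, and this paper is concerned with the extent to which volume sampling is affected by them. Statistics for outlier diagnosis are constructed from the case-deletion model and the mean-shift model, and the results show that volume sampling is in some cases extremely sensitive to outliers. For a given loss, however, it requires a smaller subsample than independent and identically distributed sampling; on this basis an adaptive method for choosing the sample size is proposed. As an extension of volume sampling, leveraged volume sampling likewise yields unbiased estimates of the parameters of the ordinary least squares linear model. An interesting finding is that, under leveraged volume sampling, the equally weighted least squares estimate performs better than the unequally weighted one.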
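
Volume sampling draws a row subset S with probability proportional to det(X_S^T X_S). The sketch below enumerates those probabilities in the simplest case, which is a restriction made here for illustration: a design matrix with two columns and subsets of exactly two rows, where det(X_S^T X_S) equals det(X_S) squared. The function names are hypothetical.

```python
from itertools import combinations


def det2(row_a, row_b):
    """Determinant of the 2x2 matrix whose rows are row_a and row_b."""
    return row_a[0] * row_b[1] - row_a[1] * row_b[0]


def volume_sampling_probs(rows):
    """Probability of each size-2 row subset under volume sampling.

    Assumes every row has exactly two entries. For a square submatrix
    X_S, det(X_S^T X_S) = det(X_S)**2, so that is the unnormalised weight.
    """
    weights = {
        (i, j): det2(rows[i], rows[j]) ** 2
        for i, j in combinations(range(len(rows)), 2)
    }
    total = sum(weights.values())
    if total == 0:
        raise ValueError("all row pairs are collinear; no subset has volume")
    return {pair: w / total for pair, w in weights.items()}


# Rows 0 and 1 are collinear, so the pair (0, 1) can never be drawn.
probs = volume_sampling_probs([(1, 0), (2, 0), (0, 1)])
```

The example also hints at the outlier sensitivity discussed in the abstract: the row with the larger norm, `(2, 0)`, pulls most of the probability mass toward the subsets that contain it.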

15.
This article surveys the usual techniques of nonlinear optimal control, such as the Pontryagin Maximum Principle and the conjugate point theory, and how they can be implemented numerically, with a special focus on applications to aerospace problems. In practice the knowledge resulting from the maximum principle is often insufficient for solving the problem, in particular because of the well-known problem of adequately initializing the shooting method. This survey explains how the usual tools of optimal control can be combined with other mathematical techniques to significantly improve their performance and widen their domain of application. The focus is put on three important issues. The first is geometric optimal control, a theory that emerged in the 1980s and combines optimal control with various concepts of differential geometry, the ultimate objective being to derive optimal synthesis results for general classes of control systems. Its applicability and relevance are demonstrated on the problem of atmospheric reentry of a space shuttle. The second is the powerful continuation or homotopy method, which consists of continuously deforming a problem toward a simpler one and then solving a series of parameterized problems to end up with the solution of the initial problem. After its mathematical foundations are recalled, it is shown how to combine this method successfully with the shooting method on several aerospace problems such as the orbit transfer problem. The third consists of concepts of dynamical system theory, providing evidence of nice properties of the celestial dynamics that are of great interest for future mission design such as low-cost interplanetary space missions. The article ends with open problems and perspectives.

16.
Several general classes of generating functions are established for a certain sequence of functions defined by Equation (1) below. By suitably specializing the various parameters involved, each of these main results can be applied to yield known as well as new generating functions for such familiar orthogonal polynomials as Jacobi, Laguerre, Hermite, and Bessel polynomials, and also for numerous interesting generalizations of these polynomials studied in the literature.

17.
We prove that every hesitant fuzzy set on a set E can be considered either a soft set over the universe [0,1] or a soft set over the universe E. Concerning converse relationships, for denumerable universes we prove that any soft set can be considered even a fuzzy set. Relatedly, we demonstrate that every hesitant fuzzy soft set can be identified with a soft set, thus a formal coincidence of both notions is brought to light. Coupled with known relationships, our results prove that interval type-2 fuzzy sets and interval-valued fuzzy sets can be considered as soft sets over the universe [0,1]. Altogether we contribute to a more complete understanding of the relationships among various theories that capture vagueness and imprecision.
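
One way to read the first claim is through the usual definitions: a hesitant fuzzy set assigns to each element of E a set of membership degrees in [0,1], and a soft set over a universe U assigns to each parameter a subset of U. The sketch below encodes both views for a finite example. The specific construction of the second view (indexing by membership degree) is an assumption about what the authors intend, and both function names are hypothetical.

```python
def as_soft_set_over_unit_interval(h):
    """View h: E -> subsets of [0,1] as a soft set over the universe [0,1].

    The parameters are the elements of E, and each parameter is mapped
    to its set of membership degrees, so the mapping is h itself.
    """
    return {e: frozenset(degrees) for e, degrees in h.items()}


def as_soft_set_over_E(h):
    """View h as a soft set over the universe E (assumed construction).

    The parameters are membership degrees t, and each t is mapped to the
    elements of E that admit t. Degrees no element admits are omitted,
    which stands for the empty set.
    """
    by_degree = {}
    for e, degrees in h.items():
        for t in degrees:
            by_degree.setdefault(t, set()).add(e)
    return {t: frozenset(elems) for t, elems in by_degree.items()}


# Hypothetical hesitant fuzzy set on E = {"x", "y"}.
h = {"x": {0.2, 0.5}, "y": {0.5}}
over_unit = as_soft_set_over_unit_interval(h)
over_E = as_soft_set_over_E(h)
```

Both representations carry the same information for this finite example, since each can be rebuilt from the other by swapping the roles of elements and degrees.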

18.
We introduce a class of convolution inequalities and study the implications of these inequalities for certain problems in harmonic analysis.

19.
Siberian Mathematical Journal

20.
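To improve the efficiency and effectiveness of supervisory spot checks of in-service elevators, a management system is built on a statistical analysis of a large sample of elevator safety supervision spot-check data from city G, with indicators covering elevator usage, basic elevator parameters, and manufacturing and maintenance information. Given the nature of the spot-check data, variables are first screened and risk grades are then constructed. Finally, earlier methods are improved to form a risk matrix method, and a quantitative model of whole-elevator risk based on the Logistic method is proposed, yielding a risk assessment system for the whole elevator. From a theoretical standpoint, comparing the two risk-value models by the LIFT statistic and the K-S statistic shows that the Logistic method stratifies risk more accurately. Practical engineering application shows that the screening proportions for elevator safety supervision sampling obtained by combining Logistic regression with the proportional weighting method based on average risk values are more reasonable and accurate than those of existing methods.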
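
The K-S statistic used to compare risk models measures how well a score separates two groups: it is the largest gap between the empirical cumulative distributions of the scores in each group. The sketch below is a generic implementation under the assumption that labels are coded 1 (failed inspection) and 0 (passed); the function name and the sample scores are hypothetical.

```python
from bisect import bisect_right


def ks_statistic(scores, labels):
    """Two-sample K-S statistic between the scores of label 1 and label 0.

    Returns a value in [0, 1]; a larger value means the score separates
    the two groups better.
    """
    pos = sorted(s for s, y in zip(scores, labels) if y == 1)
    neg = sorted(s for s, y in zip(scores, labels) if y == 0)
    if not pos or not neg:
        raise ValueError("both label groups must be non-empty")
    largest_gap = 0.0
    for threshold in sorted(set(scores)):
        # Empirical CDF of each group evaluated at the threshold.
        cdf_pos = bisect_right(pos, threshold) / len(pos)
        cdf_neg = bisect_right(neg, threshold) / len(neg)
        largest_gap = max(largest_gap, abs(cdf_pos - cdf_neg))
    return largest_gap


# Hypothetical risk scores from some fitted model, with inspection outcomes.
ks = ks_statistic([0.1, 0.2, 0.8, 0.9], [0, 0, 1, 1])
```

Two competing risk models can be compared by computing this statistic on the same held-out inspections and preferring the model with the larger value, which matches the comparison described in the abstract only in spirit.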
