首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
Conventional decision trees use queries each of which is based on one attribute. In this study, we also examine decision trees that handle additional queries based on hypotheses. This kind of query is similar to the equivalence queries considered in exact learning. Earlier, we designed dynamic programming algorithms for the computation of the minimum depth and the minimum number of internal nodes in decision trees that have hypotheses. Modification of these algorithms considered in the present paper permits us to build decision trees with hypotheses that are optimal relative to the depth or relative to the number of the internal nodes. We compare the length and coverage of decision rules extracted from optimal decision trees with hypotheses and decision rules extracted from optimal conventional decision trees to choose the ones that are preferable as a tool for the representation of information. To this end, we conduct computer experiments on various decision tables from the UCI Machine Learning Repository. In addition, we also consider decision tables for randomly generated Boolean functions. The collected results show that the decision rules derived from decision trees with hypotheses in many cases are better than the rules extracted from conventional decision trees.  相似文献   

2.
魏恒东  李立萍  郭建秀 《中国物理 B》2010,19(5):50505-050505
It is an important problem in chaos theory whether an observed irregular signal is deterministic chaotic or stochastic. We propose an efficient method for distinguishing deterministic chaotic from stochastic time series for short scalar time series. We first investigate, with the increase of the embedding dimension, the changing trend of the distance between two points which stay close in phase space. And then, we obtain the differences between Gaussian white noise and deterministic chaotic time series underlying this method. Finally, numerical experiments are presented to testify the validity and robustness of the method. Simulation results indicate that our method can distinguish deterministic chaotic from stochastic time series effectively even when the data are short and contaminated.  相似文献   

3.
We propose a model for the control of fixational eye movements using time-delayed random walks. Fixational eye movements produce random displacements of the retinal image to prevent perceptual fading. First, we demonstrate that a transition from persistent to antipersistent correlations occurs in data recorded from a visual fixation task. Second, we propose and investigate a delayed random-walk model and get, by comparison of the transition points, an estimate of the neurophysiological delay. Differences between horizontal and vertical components of eye movements are found which can be explained neurophysiologically. Finally, we compare our numerical results with analytic approximations.  相似文献   

4.
The goal of the Label Ranking (LR) problem is to learn preference models that predict the preferred ranking of class labels for a given unlabeled instance. Different well-known machine learning algorithms have been adapted to deal with the LR problem. In particular, fine-tuned instance-based algorithms (e.g., k-nearest neighbors) and model-based algorithms (e.g., decision trees) have performed remarkably well in tackling the LR problem. Probabilistic Graphical Models (PGMs, e.g., Bayesian networks) have not been considered to deal with this problem because of the difficulty of modeling permutations in that framework. In this paper, we propose a Hidden Naive Bayes classifier (HNB) to cope with the LR problem. By introducing a hidden variable, we can design a hybrid Bayesian network in which several types of distributions can be combined: multinomial for discrete variables, Gaussian for numerical variables, and Mallows for permutations. We consider two kinds of probabilistic models: one based on a Naive Bayes graphical structure (where only univariate probability distributions are estimated for each state of the hidden variable) and another where we allow interactions among the predictive attributes (using a multivariate Gaussian distribution for the parameter estimation). The experimental evaluation shows that our proposals are competitive with the start-of-the-art algorithms in both accuracy and in CPU time requirements.  相似文献   

5.
This paper deals with a prediction problem of a new targeting variable corresponding to a new explanatory variable given a training dataset. To predict the targeting variable, we consider a model tree, which is used to represent a conditional probabilistic structure of a targeting variable given an explanatory variable, and discuss statistical optimality for prediction based on the Bayes decision theory. The optimal prediction based on the Bayes decision theory is given by weighting all the model trees in the model tree candidate set, where the model tree candidate set is a set of model trees in which the true model tree is assumed to be included. Because the number of all the model trees in the model tree candidate set increases exponentially according to the maximum depth of model trees, the computational complexity of weighting them increases exponentially according to the maximum depth of model trees. To solve this issue, we introduce a notion of meta-tree and propose an algorithm called MTRF (Meta-Tree Random Forest) by using multiple meta-trees. Theoretical and experimental analyses of the MTRF show the superiority of the MTRF to previous decision tree-based algorithms.  相似文献   

6.
Quantum walk, the quantum counterpart of random walk, is an important model and widely studied to develop new quantum algorithms. This paper studies the relationship between the continuous-time quantum walk and the symmetry of a graph, especially that of a tree. Firstly, we prove in mathematics that the symmetry of a graph is highly related to quantum walk. Secondly, we propose an algorithm based on the continuous-time quantum walk to compute the symmetry of a tree. Our algorithm has better time complexity O(N3) than the current best algorithm. Finally, through testing three types of 10024 trees, we find that the symmetry of a tree can be found with an extremely high efficiency with the help of the continuous-time quantum walk.  相似文献   

7.
Recently a great deal of effort has been made to explicitly determine the mean first-passage time (MFPT) between two nodes averaged over all pairs of nodes on a fractal network. In this paper, we first propose a family of generalized delayed recursive trees characterized by two parameters, where the existing nodes have a time delay to produce new nodes. We then study the MFPT of random walks on this kind of recursive tree and investigate the effect of the time delay on the MFPT. By relating random walks to electrical networks, we obtain an exact formula for the MFPT and verify it by numerical calculations. Based on the obtained results, we further show that the MFPT of delayed recursive trees is much shorter, implying that the efficiency of random walks is much higher compared with the non-delayed counterpart. Our study provides a deeper understanding of random walks on delayed fractal networks.  相似文献   

8.
Multivariable time series forecasting is an important topic of machine learning, and it frequently involves a complex mix of inputs, including static covariates and exogenous time series input. A targeted investigation of this input data is critical for improving prediction performance. In this paper, we propose the fusion transformer (FusFormer), a transformer-based model for forecasting time series data, whose framework fuses various computation modules for time series input and static covariates. To be more precise, the model calculation consists of two parallel stages. First, it employs a temporal encoder–decoder framework for extracting dynamic temporal features from time series data input, which analyzes and integrates the relative position information of sequence elements into the attention mechanism. Simultaneously, the static covariates are fed to the static enrichment module, which is inspired by gated linear units, to suppress irrelevant information and control the extent of nonlinear processing. Finally, the prediction results are calculated by fusing the outputs of the above two stages. Using Mooney viscosity forecasting as a case study, we demonstrate considerable forecasting performance improvements over existing methodologies and verify the effectiveness of each component of FusFormer via ablation analysis, and an interpretability use case is conducted to visualize temporal patterns of time series. The experimental results prove that FusFormer can achieve accurate Mooney viscosity prediction and improve the efficiency of the tire production process.  相似文献   

9.
离散时间序列的网络模体分析   总被引:1,自引:0,他引:1       下载免费PDF全文
董昭  李翔 《物理学报》2010,59(3):1600-1607
时间序列可以被转换成网络的形式,复杂网络理论也因此可以用于刻画时间序列的时域和相空间特性.本文针对可视图算法和相空间重构算法这两种时间序列的转换算法,研究了它们的伴生网络在倍周期分岔和混沌等各种类型时间序列的模体分布特征,分析了这两种算法各自的优点.  相似文献   

10.
In many industrial domains, there is a significant interest in obtaining temporal relationships among multiple variables in time-series data, given that such relationships play an auxiliary role in decision making. However, when transactions occur frequently only for a period of time, it is difficult for a traditional time-series association rules mining algorithm (TSARM) to identify this kind of relationship. In this paper, we propose a new TSARM framework and a novel algorithm named TSARM-UDP. A TSARM mining framework is used to mine time-series association rules (TSARs) and an up-to-date pattern (UDP) is applied to discover rare patterns that only appear in a period of time. Based on the up-to-date pattern mining, the proposed TSAR-UDP method could extract temporal relationship rules with better generality. The rules can be widely used in the process industry, the stock market, etc. Experiments are then performed on the public stock data and real blast furnace data to verify the effectiveness of the proposed algorithm. We compare our algorithm with three state-of-the-art algorithms, and the experimental results show that our algorithm can provide greater efficiency and interpretability in TSARs and that it has good prospects.  相似文献   

11.
The recursive and hierarchical structure of full rooted trees is applicable to statistical models in various fields, such as data compression, image processing, and machine learning. In most of these cases, the full rooted tree is not a random variable; as such, model selection to avoid overfitting is problematic. One method to solve this problem is to assume a prior distribution on the full rooted trees. This enables the optimal model selection based on Bayes decision theory. For example, by assigning a low prior probability to a complex model, the maximum a posteriori estimator prevents the selection of the complex one. Furthermore, we can average all the models weighted by their posteriors. In this paper, we propose a probability distribution on a set of full rooted trees. Its parametric representation is suitable for calculating the properties of our distribution using recursive functions, such as the mode, expectation, and posterior distribution. Although such distributions have been proposed in previous studies, they are only applicable to specific applications. Therefore, we extract their mathematically essential components and derive new generalized methods to calculate the expectation, posterior distribution, etc.  相似文献   

12.
Gradient Boosting Machines (GBM) are among the go-to algorithms on tabular data, which produce state-of-the-art results in many prediction tasks. Despite its popularity, the GBM framework suffers from a fundamental flaw in its base learners. Specifically, most implementations utilize decision trees that are typically biased towards categorical variables with large cardinalities. The effect of this bias was extensively studied over the years, mostly in terms of predictive performance. In this work, we extend the scope and study the effect of biased base learners on GBM feature importance (FI) measures. We demonstrate that although these implementation demonstrate highly competitive predictive performance, they still, surprisingly, suffer from bias in FI. By utilizing cross-validated (CV) unbiased base learners, we fix this flaw at a relatively low computational cost. We demonstrate the suggested framework in a variety of synthetic and real-world setups, showing a significant improvement in all GBM FI measures while maintaining relatively the same level of prediction accuracy.  相似文献   

13.
We investigate functions that are exact solutions to chaotic dynamical systems. A generalization of these functions can produce truly random numbers. For the first time, we present solutions to random maps. This allows us to check, analytically, some recent results about the complexity of random dynamical systems. We confirm the result that a negative Lyapunov exponent does not imply predictability in random systems. We test the effectiveness of forecasting methods in distinguishing between chaotic and random time series. Using the explicit random functions, we can give explicit analytical formulas for the output signal in some systems with stochastic resonance. We study the influence of chaos on the stochastic resonance. We show, theoretically, the existence of a new type of solitonic stochastic resonance, where the shape of the kink is crucial. Using our models we can predict specific patterns in the output signal of stochastic resonance systems. (c) 2001 American Institute of Physics.  相似文献   

14.
Though a significant amount of work has been done on detecting obstacles, not much attention has been given to the detection of drop offs, e.g., sidewalk curbs, downward stairs, and other hazards. In this paper, we propose algorithms for detecting negative obstacles in an urban setting using stereo vision and two-stage dynamic programming (TSDP) technique. We are developing computer vision algorithms for sensing important terrain features as an aid to blind navigation, which interpret visual information obtained from images collected by cameras mounted on camera legs nearly as high as young person. This paper focuses specifically on a novel computer vision algorithm for detecting negative obstacles (i.e. anything below the level of the ground, such as holes and drop-offs), which are important and ubiquitous features on and near sidewalks and other walkways. The proposed algorithm is compared to other algorithms such as belief propagation and random growing correspondence seeds (GCS). According to the results, the proposed method achieves higher speed, more accurate disparity map and lower RMS errors. The speed of the proposed algorithm is about 28% higher than the random GCS algorithm. We demonstrate experimental results on typical sidewalk scenes to show the effectiveness of the proposed method.  相似文献   

15.
Nowadays, more and more multimedia services are supported by Mobile Edge Computing (MEC). However, the instability of the wireless environment brings a lot of uncertainty to the computational offloading. Additionally, intelligent reflecting surface (IRS) is considered as a potential technology to enhance Quality of Service (QoS). Therefore, in this paper, we establish a framework for IRS-assisted MEC computational offloading to solve this problem and take fairness optimization as a key point involving communication and computing resources. Minimize user consumption by optimizing bandwidth allocation, task offloading ratio, edge computing resources, transmission power and IRS phase shifts. Firstly, we decompose the problem into three aspects, such as bandwidth allocation, computing resource allocation, transmission power and IRS phase shifts. Then, an alternative optimization algorithm is proposed to find the optimum solution and its convergence is proved. Secondly, since the optimization problem on transmission power and IRS phase shifts is non-convex, we propose Riemann gradient descent (R-SGD) algorithm to solve it. Finally, numerical results show that our proposed algorithm performs better than other algorithms and achieves a superiority in the framework.  相似文献   

16.
Self-similar topology, which can be characterized as power law size distribution, has been found in diverse tree networks ranging from river networks to taxonomic trees. In this study, we find that the statistical self-similar topology is an inevitable consequence of any full binary tree organization. We show this by coding a binary tree as a unique bifurcation string. This coding scheme allows us to investigate trees over the realm from deterministic to entirely random trees. To obtain partial random trees, partial random perturbation is added to the deterministic trees by an operator similar to that used in genetic algorithms. Our analysis shows that the hierarchical density of binary trees is more diverse than has been described in earlier studies. We find that the connectivity structure of river networks is far from strict self-similar trees. On the other hand, organization of some social networks is close to deterministic supercritical trees.  相似文献   

17.
Geometric method-based procedures, which we will call GM algorithms hereafter, were introduced in M.A. Sánchez-Granero, J.E. Trinidad Segovia, J. García Pérez, Some comments on Hurst exponent and the long memory processes on capital markets, Phys. A 387 (2008) 5543-5551, to calculate the Hurst exponent of a time series. The authors proved that GM algorithms, based on a geometrical approach, are more accurate than classical algorithms, especially with short length time series. The main contribution of this paper is to provide a mathematical background for the validity of these two algorithms to calculate the Hurst exponent H of random processes with stationary and self-affine increments. In particular, we show that these procedures are valid not only for exploring long memory in classical processes such as (fractional) Brownian motions, but also for estimating the Hurst exponent of (fractional) Lévy stable motions.  相似文献   

18.
Instance matching is a key task in knowledge graph fusion, and it is critical to improving the efficiency of instance matching, given the increasing scale of knowledge graphs. Blocking algorithms selecting candidate instance pairs for comparison is one of the effective methods to achieve the goal. In this paper, we propose a novel blocking algorithm named MultiObJ, which constructs indexes for instances based on the Ordered Joint of Multiple Objects’ features to limit the number of candidate instance pairs. Based on MultiObJ, we further propose a distributed framework named Follow-the-Regular-Leader Instance Matching (FTRLIM), which matches instances between large-scale knowledge graphs with approximately linear time complexity. FTRLIM has participated in OAEI 2019 and achieved the best matching quality with significantly efficiency. In this research, we construct three data collections based on a real-world large-scale knowledge graph. Experiment results on the constructed data collections and two real-world datasets indicate that MultiObJ and FTRLIM outperform other state-of-the-art methods.  相似文献   

19.
The heterogeneous graphical Granger model (HGGM) for causal inference among processes with distributions from an exponential family is efficient in scenarios when the number of time observations is much greater than the number of time series, normally by several orders of magnitude. However, in the case of “short” time series, the inference in HGGM often suffers from overestimation. To remedy this, we use the minimum message length principle (MML) to determinate the causal connections in the HGGM. The minimum message length as a Bayesian information-theoretic method for statistical model selection applies Occam’s razor in the following way: even when models are equal in their measure of fit-accuracy to the observed data, the one generating the most concise explanation of data is more likely to be correct. Based on the dispersion coefficient of the target time series and on the initial maximum likelihood estimates of the regression coefficients, we propose a minimum message length criterion to select the subset of causally connected time series with each target time series and derive its form for various exponential distributions. We propose two algorithms—the genetic-type algorithm (HMMLGA) and exHMML to find the subset. We demonstrated the superiority of both algorithms in synthetic experiments with respect to the comparison methods Lingam, HGGM and statistical framework Granger causality (SFGC). In the real data experiments, we used the methods to discriminate between pregnancy and labor phase using electrohysterogram data of Islandic mothers from Physionet databasis. We further analysed the Austrian climatological time measurements and their temporal interactions in rain and sunny days scenarios. In both experiments, the results of HMMLGA had the most realistic interpretation with respect to the comparison methods. We provide our code in Matlab. To our best knowledge, this is the first work using the MML principle for causal inference in HGGM.  相似文献   

20.
In this article, we propose new Monte Carlo techniques for moving a diffusive particle in a discontinuous media. In this framework, we characterize the stochastic process that governs the positions of the particle. The key tool is the reduction of the process to a Skew Brownian motion (SBM). In a zone where the coefficients are locally constant on each side of the discontinuity, the new position of the particle after a constant time step is sampled from the exact distribution of the SBM process at the considered time. To do so, we propose two different but equivalent algorithms: a two-steps simulation with a stop at the discontinuity and a one-step direct simulation of the SBM dynamic. Some benchmark tests illustrate their effectiveness.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号