Similar literature
 Found 20 similar articles; search took 31 ms
1.
Changepoint models are widely used to model the heterogeneity of sequential data. We present a novel sequential Monte Carlo (SMC) online expectation–maximization (EM) algorithm for estimating the static parameters of such models. The SMC online EM algorithm has a cost per time step that is linear in the number of particles, which could be particularly important when the data are representable as a long sequence of observations, since it drastically reduces the computational requirements for implementation. We present an asymptotic analysis of the stability of the SMC estimates used in the online EM algorithm and demonstrate the performance of this scheme by using both simulated and real data originating from DNA analysis. The supplementary materials for the article are available online.

2.
We provide a new approach to approximate emulation of large computer experiments. By focusing expressly on desirable properties of the predictive equations, we derive a family of local sequential design schemes that dynamically define the support of a Gaussian process predictor based on a local subset of the data. We further derive expressions for fast sequential updating of all needed quantities as the local designs are built up iteratively. Then we show how independent application of our local design strategy across the elements of a vast predictive grid facilitates a trivially parallel implementation. The end result is a global predictor able to take advantage of modern multicore architectures, providing a nonstationary modeling feature as a bonus. We demonstrate our method on two examples using designs with thousands of data points, and compare to the method of compactly supported covariances. Supplementary materials for this article are available online.

3.
In this era of big data, more and more models need to be trained to mine useful knowledge from large-scale data. Training multiple models accurately and efficiently, so as to make full use of limited computing resources, has become a challenging problem. As one of the ELM variants, the online sequential extreme learning machine (OS-ELM) provides a method to learn from incremental data. MapReduce, which provides a simple, scalable and fault-tolerant framework, can be utilized for large-scale learning. In this paper, we propose an efficient parallel method for batched online sequential extreme learning machine (BPOS-ELM) training using MapReduce. Map execution time is estimated from historical statistics, using a regression method and an inverse-distance-weighted interpolation method. Reduce execution time is estimated from complexity analysis and a regression method. Based on these estimates, BPOS-ELM generates a Map execution plan and a Reduce execution plan. Finally, BPOS-ELM launches one MapReduce job to train multiple OS-ELM models according to the generated execution plan, and collects execution information to further improve estimation accuracy. Our proposal is evaluated with real and synthetic data. The experimental results show that the accuracy of BPOS-ELM is at the same level as that of OS-ELM and parallel OS-ELM (POS-ELM), with higher training efficiency.

4.
This article presents a likelihood-based boosting approach for fitting binary and ordinal mixed models. In contrast to common procedures, this approach can be used in high-dimensional settings where a large number of potentially influential explanatory variables are available. Constructed as a componentwise boosting method, it is able to perform variable selection with the complexity of the resulting estimator being determined by information criteria. The method is investigated in simulation studies both for cumulative and sequential models and is illustrated by using real datasets. The supplementary materials for the article are available online.

5.
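The double parallel feedforward neural network model is a hybrid structure combining a single-layer perceptron and a single-hidden-layer feedforward neural network. This paper constructs a double parallel fast learning machine algorithm. Compared with other similar algorithms, the proposed algorithm achieves comparable learning performance with fewer hidden units and fewer undetermined parameters. Numerical experiments show that, for many practical classification problems, the proposed algorithm has better generalization ability, and it can therefore serve as a useful complement to fast learning machine algorithms.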

6.
We explore a Bayesian framework for constructing combinations of classifier outputs, as a means to improving overall classification results. We propose a sequential Bayesian framework to estimate the posterior probability of being in a certain class given multiple classifiers. This framework, which employs meta-Gaussian modelling but makes no assumptions about the distribution of classifier outputs, allows us to capture nonlinear dependencies between the combined classifier and the individual classifiers. An important property of our method is that it produces a combined classifier that dominates the individual classifiers upon which it is based in terms of Bayes risk, error rate, and receiver operating characteristic (ROC) curve. To illustrate the method, we show empirical results from the combination of credit scores generated from four different scoring models.

7.
We propose sequential Monte Carlo-based algorithms for maximum likelihood estimation of the static parameters in hidden Markov models with an intractable likelihood using ideas from approximate Bayesian computation. The static parameter estimation algorithms are gradient-based and cover both offline and online estimation. We demonstrate their performance by estimating the parameters of three intractable models, namely the α-stable distribution, g-and-k distribution, and the stochastic volatility model with α-stable returns, using both real and synthetic data.

8.
We propose new sequential importance sampling methods for sampling contingency tables with given margins. The proposal for each method is based on asymptotic approximations to the number of tables with fixed margins. These methods generate tables that are very close to the uniform distribution. The tables, along with their importance weights, can be used to approximate the null distribution of test statistics and calculate the total number of tables. We apply the methods to a number of examples and demonstrate an improvement over other methods in a variety of real problems. Supplementary materials are available online.

9.
Complex functions, such as the output of computer simulators, can be difficult to optimize. The task becomes even more difficult when only some of the function evaluations return real numbers and others simply fail to return a value. We combine statistical emulation, classification, sequential design, and optimization with an asymmetric entropy measure to solve the thorny problem of finding an optimum along a constraint boundary. This approach is demonstrated on simulated examples and a real problem in groundwater remediation.

10.
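The two-dimensional sequential uniform design method is extended to three-dimensional space. The resulting three-dimensional sequential uniform design can be applied to optimization problems in three-factor experimental design, thereby both reducing the number of experiments and improving experimental precision.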

11.
We describe a method for generating independent samples from univariate density functions using adaptive rejection sampling without the log-concavity requirement. The method makes use of the fact that many functions can be expressed as a sum of concave and convex functions. Using a concave-convex decomposition, we bound the log-density by separately bounding the concave and convex parts using piecewise linear functions. The upper bound can then be used as the proposal distribution in rejection sampling. We demonstrate the applicability of the concave-convex approach on a number of standard distributions and describe an application to the efficient construction of sequential Monte Carlo proposal distributions for inference over genealogical trees. Computer code for the proposed algorithms is available online.

12.
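For the problem of classifying continuous data streams, an online logistic regression algorithm is proposed based on online learning theory. Online logistic regression with a regularization term is studied, an online logistic-l2 regression model is proposed, and a theoretical bound estimate is given. The final experimental results show that, as the number of online iterations increases, the proposed model and algorithm can reach the classification results of offline prediction. This work provides a new and effective method for classifying massive streaming data.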
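As a rough illustration of the kind of model item 12 describes, the following is a minimal sketch of online logistic regression with an l2 penalty, updated one observation at a time by stochastic gradient descent. It is not the algorithm from the abstract: the class name `OnlineLogisticL2`, the step-size schedule `lr / sqrt(t)`, the penalty weight `lam`, and the synthetic data stream in `demo` are all assumptions made for this example.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


class OnlineLogisticL2:
    """Hypothetical online logistic regression with an l2 penalty on the weights."""

    def __init__(self, dim, lr=0.1, lam=1e-3):
        self.w = [0.0] * dim   # weight vector
        self.b = 0.0           # intercept (not penalized)
        self.lr = lr           # base learning rate (assumed value)
        self.lam = lam         # l2 regularization strength (assumed value)
        self.t = 0             # number of updates seen so far

    def predict_proba(self, x):
        z = sum(wi * xi for wi, xi in zip(self.w, x)) + self.b
        return sigmoid(z)

    def update(self, x, y):
        # One gradient step on log-loss + (lam / 2) * ||w||^2, with y in {0, 1}.
        self.t += 1
        step = self.lr / math.sqrt(self.t)   # decaying step size
        err = self.predict_proba(x) - y      # derivative of log-loss w.r.t. z
        self.w = [wi - step * (err * xi + self.lam * wi)
                  for wi, xi in zip(self.w, x)]
        self.b -= step * err


def demo(n=2000, seed=0):
    # Synthetic stream: label is 1 when x0 + x1 > 0, else 0.
    rng = random.Random(seed)
    model = OnlineLogisticL2(dim=2)
    for _ in range(n):
        x = [rng.gauss(0, 1), rng.gauss(0, 1)]
        y = 1 if x[0] + x[1] > 0 else 0
        model.update(x, y)
    return model
```

Calling `demo()` streams 2000 synthetic points through the model; points far on either side of the boundary, such as `[2.0, 2.0]` and `[-2.0, -2.0]`, should then receive probabilities close to 1 and 0 respectively. The model never stores past observations, which is the property that makes this style of update attractive for streaming data.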

13.
It is known that certain combinations of one‐sided sequential probability ratio tests are asymptotically optimal (relative to the expected sample size) for problems involving a finite number of possible distributions when probabilities of errors tend to zero and observations are independent and identically distributed according to one of the underlying distributions. The objective of this paper is to show that two specific constructions of sequential tests asymptotically minimize not only the expected time of observation but also any positive moment of the stopping time distribution under fairly general conditions for a finite number of simple hypotheses. This result appears to be true for general statistical models which include correlated and non‐homogeneous processes observed either in discrete or continuous time. For statistical problems with nuisance parameters, we consider invariant sequential tests and show that the same result is valid for this case. Finally, we apply general results to the solution of several particular problems such as a multi‐sample slippage problem for correlated Gaussian processes and for statistical models with nuisance parameters. This revised version was published online in June 2006 with corrections to the Cover Date.
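To make the idea of combining one-sided sequential probability ratio tests concrete, here is a minimal sketch of one common construction: stop and accept hypothesis i as soon as its log-likelihood leads every rival by a fixed threshold. This may or may not match either of the two constructions analyzed in item 13. The function names, the i.i.d. unit-variance Gaussian model, and the single shared threshold are assumptions for illustration, and at least two hypotheses are required.

```python
import math


def gaussian_loglik(x, mean, sd=1.0):
    # Log-density of N(mean, sd^2) evaluated at x.
    return -0.5 * ((x - mean) / sd) ** 2 - math.log(sd * math.sqrt(2 * math.pi))


def multi_sprt(stream, means, threshold, max_n=100000):
    """Hypothetical multi-hypothesis sequential test for i.i.d. Gaussian data.

    Hypothesis i states that the observations have mean means[i].
    Returns (accepted_index, samples_used); accepted_index is None when
    the stream ends or max_n is reached before any hypothesis wins.
    """
    ll = [0.0] * len(means)   # running log-likelihood of each hypothesis
    n = 0
    for x in stream:
        n += 1
        for i, m in enumerate(means):
            ll[i] += gaussian_loglik(x, m)
        best = max(range(len(means)), key=lambda i: ll[i])
        runner_up = max(ll[j] for j in range(len(means)) if j != best)
        # The leader must beat every rival by `threshold`; each pairwise
        # comparison is a one-sided SPRT of the leader against that rival.
        if ll[best] - runner_up >= threshold:
            return best, n
        if n >= max_n:
            break
    return None, n
```

For example, with candidate means `[-1.0, 0.0, 1.0]` and a stream of observations all equal to `1.0`, the third hypothesis gains 0.5 log-likelihood units per observation over its nearest rival, so a threshold of 1.9 is crossed at the fourth observation. Raising the threshold lowers the error probabilities at the cost of a longer expected stopping time, which is the trade-off the asymptotic results concern.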

14.
We study the best linear combination of markers in terms of the area under the receiver operating characteristic curve, since no single marker is perfect for classification purposes. The sequential fixed-width confidence interval estimate method is applied. We show that the proposed procedure is efficient in terms of the total sample size, with an optimal ratio of cases to controls, and is asymptotically consistent. The performance of our method is illustrated by synthesized data and a real example.

15.
Screening experiments are performed to eliminate unimportant factors efficiently so that the remaining important factors can be studied more thoroughly in later experiments. This paper proposes controlled sequential factorial design (CSFD) for discrete-event simulation experiments. It combines a sequential hypothesis testing procedure with a traditional (fractional) factorial design to control the Type I error and power for each factor under heterogeneous variance conditions. We compare CSFD with other sequential screening methods with similar error control properties. CSFD requires few assumptions and demonstrates robust performance with different system conditions. The method is appropriate for systems with a moderate number of factors and large variances.

16.
We consider Markov mixture models for multiple longitudinal binary sequences. Prior uncertainty in the mixing distribution is characterized by a Dirichlet process centered on a matrix beta measure. We use this setting to evaluate and compare the performance of three competing algorithms that arise more generally in Dirichlet process mixture calculations: sequential imputations, Gibbs sampling, and a predictive recursion, for which an extension of the sequential calculations is introduced. This facilitates the estimation of quantities related to clustering structure which is not available in the original formulation. A numerical comparison is carried out in three examples. Our findings suggest that the sequential imputations method is most useful for relatively small problems, and that the predictive recursion can be an efficient preliminary tool for more reliable, but computationally intensive, Gibbs sampling implementations.

17.
This paper proposes a novel method to select an experimental design for interpolation in random simulation, especially discrete event simulation. (Though the paper focuses on Kriging, this design approach may also apply to other types of metamodels such as non-linear regression models and splines.) Assuming that simulation requires much computer time, it is important to select a design with a small number of observations (or simulation runs). The proposed method is therefore sequential. Its novelty is that it accounts for the specific input/output behavior (or response function) of the particular simulation at hand; i.e., the method is customized or application-driven. A tool for this customization is bootstrapping, which enables the estimation of the variances of predictions for inputs not yet simulated. The method is tested through two classic simulation models, namely the expected steady-state waiting time of the M/M/1 queuing model, and the mean costs of a terminating (s, S) inventory simulation. For these two simulation models the novel design indeed gives better results than a popular alternative design, namely Latin Hypercube Sampling (LHS) with a prefixed sample.
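For readers unfamiliar with the baseline named in item 17, the following is a minimal sketch of Latin Hypercube Sampling on the unit hypercube with a sample size fixed in advance. It illustrates only the comparison design, not the sequential bootstrap-based method the abstract proposes; the function name `latin_hypercube` and its parameters are assumptions for this example.

```python
import random


def latin_hypercube(n_points, n_dims, rng=None):
    """Hypothetical LHS sketch: n_points design points in [0, 1)^n_dims.

    Each axis is cut into n_points equal-width strata, and every stratum
    of every axis contains exactly one design point.
    """
    rng = rng or random.Random()
    columns = []
    for _ in range(n_dims):
        # One coordinate per stratum, placed at a random offset inside it.
        strata = [(k + rng.random()) / n_points for k in range(n_points)]
        # Shuffle so that strata are paired at random across dimensions.
        rng.shuffle(strata)
        columns.append(strata)
    # Assemble the i-th point from the i-th entry of every column.
    return [tuple(col[i] for col in columns) for i in range(n_points)]
```

Because the number of points is chosen before any simulation output is seen, a design like this cannot concentrate runs where the response is hardest to predict, which is the limitation a sequential, application-driven design is meant to address.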

18.
The optimisation of a printed circuit board assembly line is mainly influenced by the constraints of the surface mount device (SMD) placement machine and the characteristics of the production environment. This paper surveys the characteristics of the various machine technologies and classifies them into five categories (dual-delivery, multi-station, turret-type, multi-head and sequential pick-and-place), based on their specifications and operational methods. Using this classification, we associate the machine technologies with heuristic methods and discuss the scheduling issues of each category of machine. We see the main contribution of this work as providing a classification for SMD placement machines and surveying the heuristics that have been used on different machines. We hope that this will guide other researchers so that they can subsequently use the classification or heuristics, or even design new heuristics that are more appropriate to the machine under consideration.

19.
We improve the twin support vector machine (TWSVM) to obtain a novel nonparallel hyperplanes classifier, termed ITSVM (improved twin support vector machine), for binary classification. By introducing different Lagrangian functions for the primal problems in the TWSVM, we get an improved dual formulation of TWSVM; the resulting ITSVM algorithm overcomes the common drawbacks of TWSVMs and inherits the essence of standard SVMs. Firstly, ITSVM does not need to compute large inverse matrices before training, which is inevitable for TWSVMs. Secondly, unlike with TWSVMs, the kernel trick can be applied directly to ITSVM for the nonlinear case; therefore nonlinear ITSVM is theoretically superior to nonlinear TWSVM. Thirdly, ITSVM can be solved efficiently by the successive overrelaxation (SOR) technique or the sequential minimal optimization (SMO) method, which makes it more suitable for large-scale problems. We also prove that the standard SVM is a special case of ITSVM. Experimental results show the efficiency of our method in both computation time and classification accuracy.

20.
We propose a sequential importance sampling strategy to estimate subgraph frequencies and detect network motifs. The method is developed by sampling subgraphs sequentially node by node using a carefully chosen proposal distribution. Viewing the subgraphs as rooted trees, we propose a recursive formula that approximates the number of subgraphs containing a particular node or set of nodes. The proposal used to sample nodes is proportional to this estimated number of subgraphs. The method generates subgraphs from a distribution close to uniform, and performs better than competing methods. We apply the method to four real-world networks and demonstrate outstanding performance in practical examples. Supplemental materials for the article are available online.

