1.
The accurate estimation of rare event probabilities is a crucial problem in engineering to characterize the reliability of complex systems. Several methods such as Importance Sampling or Importance Splitting have been proposed to perform the estimation of such events more accurately (i.e., with a lower variance) than the crude Monte Carlo method. However, these methods assume that the probability distributions of the input variables are exactly defined (e.g., mean and covariance matrix perfectly known if the input variables are defined through Gaussian laws) and are not able to determine the impact of a change in the input distribution parameters on the probability of interest. The problem considered in this paper is the propagation of input distribution parameter uncertainty, defined by intervals, to the rare event probability. This problem induces intricate optimization and numerous probability estimations in order to determine the upper and lower bounds of the probability estimate. The calculation of these bounds is often numerically intractable for rare event probabilities (say 10⁻⁵), due to the high computational cost required. A new methodology is proposed to solve this problem with a reduced simulation budget, using adaptive Importance Sampling. To this end, a method for estimating the Importance Sampling optimal auxiliary distribution is proposed, based on preceding Importance Sampling estimations. Furthermore, a Kriging-based adaptive Importance Sampling is used in order to minimize the number of evaluations of the computationally expensive simulation code. To determine the bounds of the probability estimate, an evolutionary algorithm is employed. This algorithm has been selected to deal with noisy problems since the Importance Sampling probability estimate is a random variable. The efficiency of the proposed approach, in terms of accuracy of the found results and computational cost, is assessed on academic and engineering test cases.
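As a toy illustration of why Importance Sampling beats crude Monte Carlo for rare events, the sketch below estimates P(X > 4) for a standard normal by sampling from an auxiliary normal centred on the failure region. This is generic Importance Sampling, not the paper's Kriging-based adaptive scheme; the threshold and sample sizes are arbitrary choices for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
t = 4.0          # threshold; the true P(X > 4) is about 3.17e-5 for X ~ N(0, 1)
n = 20_000

# Crude Monte Carlo: almost no samples fall in the rare region.
x = rng.standard_normal(n)
p_mc = np.mean(x > t)

# Importance Sampling: draw from an auxiliary N(t, 1) centred on the failure
# region, then reweight each sample by the likelihood ratio f(y)/g(y).
y = rng.normal(loc=t, scale=1.0, size=n)
w = np.exp(-0.5 * y**2) / np.exp(-0.5 * (y - t)**2)  # N(0,1)/N(t,1) density ratio
p_is = np.mean((y > t) * w)
```

With the auxiliary distribution centred on the threshold, roughly half the samples land in the rare region, so the variance of the estimate drops by several orders of magnitude compared with the crude estimator.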

2.
To ensure the secure and stable operation of the power grid, accurately constructing a probability distribution model of wind power fluctuation characteristics is of great significance for the operation and control of large-scale grid-connected wind power. Based on a data-driven approach, a weighted Gaussian mixture probability distribution model is adopted to fit the fluctuation characteristics of a large-scale wind power base. The model parameters can be obtained with a maximum likelihood estimation algorithm based on Expectation Maximization (EM), and a fitting evaluation…
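A minimal numpy sketch of the EM fit described above, run on synthetic data since the wind power measurements are not available here; the component count, means, and seed are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic two-regime sample standing in for real fluctuation data.
data = np.concatenate([rng.normal(-2.0, 0.7, 700), rng.normal(3.0, 1.0, 300)])

def em_gmm_1d(x, k=2, iters=200):
    """Maximum-likelihood fit of a k-component 1-D Gaussian mixture via EM."""
    w = np.full(k, 1.0 / k)
    mu = np.quantile(x, np.linspace(0.25, 0.75, k))
    var = np.full(k, x.var())
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each point.
        dens = w * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        r = dens / dens.sum(axis=1, keepdims=True)
        # M-step: weight, mean, and variance updates from the responsibilities.
        nk = r.sum(axis=0)
        w = nk / len(x)
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk
    return w, mu, var

w, mu, var = em_gmm_1d(data)
```

The E-step/M-step alternation never decreases the likelihood, which is why EM is the standard fitting device for mixture models like the one in the abstract.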

3.
Reference growth curves estimate the distribution of a measurement as it changes according to some covariate, often age. We present a new methodology to estimate growth curves based on mixture models and splines. We model the distribution of the measurement with a mixture of normal distributions with an unknown number of components, and model dependence on the covariate through the weights, using smooth functions based on B-splines. In this way the growth curves respect the continuity of the covariate and there is no need for arbitrary grouping of the observations. The method is illustrated with data on triceps skinfold in Gambian girls and women.

4.
Common approaches to monotonic regression focus on the case of a unidimensional covariate and continuous response variable. Here a general approach is proposed that allows for additive structures where one or more variables have monotone influence on the response variable. In addition, the approach allows for response variables from an exponential family, including binary and Poisson distributed response variables. Flexibility of the smooth estimate is gained by expanding the unknown function in monotonic basis functions. For the estimation of coefficients and the selection of basis functions a likelihood-based boosting algorithm is proposed which is simple to implement. Stopping criteria and inference are based on AIC-type measures. The method is applied to several datasets.
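The monotone-basis boosting algorithm itself is not reproduced here, but the underlying idea of a monotonicity-constrained fit can be illustrated with the classical pool-adjacent-violators algorithm, a simpler one-dimensional relative of the approach in the abstract.

```python
import numpy as np

def pava(y):
    """Pool-adjacent-violators: the isotonic (monotone non-decreasing)
    least-squares fit to a sequence of observations."""
    merged = []                      # list of [block mean, block size]
    for v in np.asarray(y, float):
        merged.append([v, 1])
        # Merge adjacent blocks while the monotonicity constraint is violated.
        while len(merged) > 1 and merged[-2][0] > merged[-1][0]:
            m2, s2 = merged.pop()
            m1, s1 = merged.pop()
            merged.append([(m1 * s1 + m2 * s2) / (s1 + s2), s1 + s2])
    return np.concatenate([np.full(s, m) for m, s in merged])

fit = pava([1.0, 3.0, 2.0, 4.0, 3.5, 5.0])
```

Each violating pair is pooled into a block whose fitted value is the block mean, so the output is the closest non-decreasing sequence in the least-squares sense.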

5.
Robust S-estimation is proposed for multivariate Gaussian mixture models, generalizing the work of Hastie and Tibshirani (J. Roy. Statist. Soc. Ser. B 58 (1996) 155). In the case of Gaussian mixture models, the unknown location and scale parameters are estimated by the EM algorithm. In the presence of outliers, the maximum likelihood estimators of the unknown parameters are affected, resulting in the misclassification of the observations. The robust S-estimators of the unknown parameters replace the non-robust estimators in the M-step of the EM algorithm. The results were compared with the standard mixture discriminant analysis approach using the probability of misclassification criterion. This comparison showed a slight reduction in the average probability of misclassification using robust S-estimators as compared to the standard maximum likelihood estimators.

6.
A model based clustering procedure for data of mixed type, clustMD, is developed using a latent variable model. It is proposed that a latent variable, following a mixture of Gaussian distributions, generates the observed data of mixed type. The observed data may be any combination of continuous, binary, ordinal or nominal variables. clustMD employs a parsimonious covariance structure for the latent variables, leading to a suite of six clustering models that vary in complexity and provide an elegant and unified approach to clustering mixed data. An expectation maximisation (EM) algorithm is used to estimate clustMD; in the presence of nominal data a Monte Carlo EM algorithm is required. The clustMD model is illustrated by clustering simulated mixed type data and prostate cancer patients, on whom mixed data have been recorded.

7.
A general methodology to optimize the weight of power transmission structures is presented in this article. This methodology is based on the simulated annealing algorithm defined by Kirkpatrick in the early '80s. This algorithm is a stochastic approach that allows solutions that do not improve the objective function to be explored and analyzed, in order to search the design region more thoroughly and to reach the global optimum. The proposed algorithm accounts for the discrete behavior of the sectional variables of each element and the continuous behavior of the general geometry variables. Thus, an optimization methodology that can deal with a mixed optimization problem, including both continuous and discrete design variables, is developed. In addition, it does not require studying all the possible design combinations defined by the discrete design variables. The proposed algorithm usually requires a large number of simulations (structural analyses in this case) in practical applications. Thus, the authors have developed first-order Taylor expansions and the associated first-order sensitivity analysis in order to reduce the CPU time required. Exterior penalty functions have also been included to deal with the design constraints. Thus, the general methodology proposed allows real power transmission structures to be optimized in acceptable CPU time.
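A bare-bones continuous-variable sketch of the simulated annealing acceptance rule described above. The paper additionally handles discrete sectional variables, Taylor-expansion approximations, and penalty functions, none of which appear here; the test function, cooling schedule, and step size are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(2)

def anneal(f, x0, steps=5000, t0=1.0, cooling=0.999, scale=0.5):
    """Minimise f by simulated annealing: worse moves are accepted with
    probability exp(-delta/T), letting the search escape local minima."""
    x = np.asarray(x0, float)
    fx = f(x)
    best, fbest = x.copy(), fx
    T = t0
    for _ in range(steps):
        cand = x + rng.normal(scale=scale, size=x.shape)
        fc = f(cand)
        if fc < fx or rng.random() < np.exp(-(fc - fx) / T):
            x, fx = cand, fc
            if fx < fbest:
                best, fbest = x.copy(), fx
        T *= cooling                 # geometric cooling schedule
    return best, fbest

# Multimodal test function standing in for the structural weight objective,
# which would require a full structural analysis code to evaluate.
f = lambda v: (v[0]**2 + v[1] - 11)**2 + (v[0] + v[1]**2 - 7)**2
x_opt, f_opt = anneal(f, [0.0, 0.0])
```

Tracking the best-so-far point separately from the current point is a common practical refinement: the chain is free to wander at high temperature without losing the best design found.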

8.
This article presents a practical refinement of generalized polynomial chaos expansion for uncertainty quantification under dependent input random variables. Unlike the Rodrigues-type formula, which exists only for select probability measures, a three-step computational algorithm is put forward to generate a sequence of approximate measure-consistent multivariate orthonormal polynomials for an arbitrary probability measure. For uncertainty quantification analysis under dependent random variables, two regression methods, comprising existing standard least-squares and newly developed partitioned diffeomorphic modulation under observable response preserving homotopy (D-MORPH), are proposed to estimate the coefficients of generalized polynomial chaos expansion for the very first time. In contrast to the existing regression methods, devoted so far to the classical polynomial chaos expansion, no tensor-product structure is required or enforced. The partitioned D-MORPH regression is applicable to either an underdetermined or overdetermined system, thus substantially enhancing the ability of the original D-MORPH regression. Numerical results obtained for Gaussian and non-Gaussian probability measures with rectangular or non-rectangular domains point to highly accurate orthonormal polynomials produced by the three-step algorithm. More significantly, the generalized polynomial chaos approximations of mathematical functions and stochastic responses from solid-mechanics problems, in tandem with the partitioned D-MORPH regression, provide excellent estimates of the second-moment properties and reliability from only hundreds of function evaluations or finite element analyses.
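The paper's three-step algorithm is not reproduced here, but the core idea of measure-consistent orthonormal polynomials can be sketched with sample-based Gram-Schmidt in one dimension, using an exponential measure; the sample size and degree are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(3)
# Samples from the input measure; an exponential distribution stands in for an
# arbitrary (possibly non-Gaussian) measure.
x = rng.exponential(scale=1.0, size=50_000)

def orthonormal_polys(x, degree):
    """Gram-Schmidt orthonormalisation of 1, x, x^2, ... under the empirical
    inner product <f, g> = mean(f(x) g(x)), i.e. polynomials orthonormal
    with respect to the sampled measure."""
    ortho = []
    for d in range(degree + 1):
        v = x.astype(float) ** d
        for q in ortho:
            v = v - np.mean(v * q) * q       # project out earlier polynomials
        ortho.append(v / np.sqrt(np.mean(v * v)))
    return ortho

polys = orthonormal_polys(x, 3)
# Empirical Gram matrix: should be the identity by construction.
gram = np.array([[np.mean(p * q) for q in polys] for p in polys])
```

For the exponential measure these polynomials approximate the Laguerre family, which is the known orthonormal basis for that weight; for a general dependent multivariate measure the same inner-product construction applies, which is the point of the abstract's three-step algorithm.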

9.
This paper studies estimation in partial functional linear quantile regression, in which the dependent variable is related to both a vector of finite length and a function-valued random variable as predictor variables. The slope function is estimated using the functional principal component basis. The asymptotic distribution of the estimator of the vector of slope parameters is derived, and the global convergence rate of the quantile estimator of the unknown slope function is established under a suitable norm. It is shown that this rate is optimal in a minimax sense under some smoothness assumptions on the covariance kernel of the covariate and the slope function. The convergence rate of the mean squared prediction error for the proposed estimators is also established. Finite-sample properties of our procedures are studied through Monte Carlo simulations. A real-data example based on the Berkeley growth data is used to illustrate the proposed methodology.

10.
1 Introduction. Recently, there has been great interest in signal processing methods that represent signals over an overcomplete set of basis functions, because a larger class of basis functions can represent a larger class of signals; wavelet bases and Gabor bases are commonly used. The purpose of this paper is to construct an algorithm that searches for an optimized (or optimal) basis, whose starting point is to select compactly supported basis vectors from the overcomplete set of basis functions.
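A minimal greedy matching-pursuit sketch of the idea of selecting atoms one at a time from an overcomplete dictionary. A random dictionary stands in for the wavelet/Gabor dictionaries mentioned above, and this is not the paper's specific basis-search algorithm.

```python
import numpy as np

rng = np.random.default_rng(4)
n, m = 64, 256
# Overcomplete dictionary (4x overcomplete) with unit-norm columns.
D = rng.standard_normal((n, m))
D /= np.linalg.norm(D, axis=0)

# A signal that is exactly 3-sparse in the dictionary.
support = rng.choice(m, size=3, replace=False)
signal = D[:, support] @ np.array([2.0, -1.5, 1.0])

def matching_pursuit(y, D, k):
    """Greedy matching pursuit: repeatedly pick the dictionary atom most
    correlated with the residual and subtract its projection."""
    residual = y.copy()
    coef = np.zeros(D.shape[1])
    for _ in range(k):
        corr = D.T @ residual
        j = np.argmax(np.abs(corr))
        coef[j] += corr[j]
        residual -= corr[j] * D[:, j]
    return coef, residual

coef, residual = matching_pursuit(signal, D, k=20)
```

Because the dictionary is overcomplete, the representation is not unique; the greedy rule picks one sparse representation, and the residual norm shrinks geometrically with each selected atom.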

11.
Cluster analysis is an unsupervised learning technique for partitioning objects into several clusters. Assuming that noisy objects are included, we propose a soft clustering method which assigns objects that are significantly different from noise into one of the specified number of clusters by controlling decision errors through multiple testing. The parameters of the Gaussian mixture model are estimated with the EM algorithm. Using the estimated probability density function, we formulate a multiple hypothesis testing procedure for the clustering problem, and the positive false discovery rate (pFDR) is calculated as the decision error. The proposed procedure classifies objects into significant data or noise simultaneously according to the specified target pFDR level. When applied to real and artificial data sets, it was able to control the target pFDR reasonably well, offering satisfactory clustering performance.

12.
This article proposes a probability model for k-dimensional ordinal outcomes, that is, it considers inference for data recorded in k-dimensional contingency tables with ordinal factors. The proposed approach is based on full posterior inference, assuming a flexible underlying prior probability model for the contingency table cell probabilities. We use a variation of the traditional multivariate probit model, with latent scores that determine the observed data. In our model, a mixture of normals prior replaces the usual single multivariate normal model for the latent variables. By augmenting the prior model to a mixture of normals we generalize inference in two important ways. First, we allow for varying local dependence structure across the contingency table. Second, inference in ordinal multivariate probit models is plagued by problems related to the choice and resampling of cutoffs defined for these latent variables. We show how the proposed mixture model approach entirely removes these problems. We illustrate the methodology with two examples, one simulated dataset and one dataset of interrater agreement.

13.
This article considers a graphical model for ordinal variables, where it is assumed that the data are generated by discretizing the marginal distributions of a latent multivariate Gaussian distribution. The relationships between these ordinal variables are then described by the underlying Gaussian graphical model and can be inferred by estimating the corresponding concentration matrix. Direct estimation of the model is computationally expensive, but an approximate EM-like algorithm is developed to provide an accurate estimate of the parameters at a fraction of the computational cost. Numerical evidence based on simulation studies shows the strong performance of the algorithm, which is also illustrated on datasets on movie ratings and an educational survey.

14.
Recurrent event data frequently occur in longitudinal studies, and it is often of interest to estimate the effects of covariates on the recurrent event rate. This paper considers a class of semiparametric transformation rate models for recurrent event data, which uses an additive Aalen model as its covariate dependent baseline. The new models are flexible in that they allow for both additive and multiplicative covariate effects, and some covariate effects are allowed to be nonparametric and time-varying. An estimating procedure is proposed for parameter estimation, and the resulting estimators are shown to be consistent and asymptotically normal. Simulation studies and a real data analysis demonstrate that the proposed method performs well and is appropriate for practical use.

15.
We study the relevant factors for the survival of Russian commercial banks during the transition period. The accelerated development of the Russian commercial banking industry after the banking reform caused high rates of entry followed by a period of high rates of exit. As a consequence, many banks had to exit the market without refunding their deposits. Therefore, both for the banks and for the banks' depositors, it is of interest to identify the relevant factors that motivate the exit or the closing of a bank. We propose a different methodology based on penalized weighted least squares that represents a very general, flexible and innovative approach for this type of analysis. That is, the proposed methodology does not require the assumption of a probability distribution for the variable under study or of any parametric functional form for the covariate whose effect on the survival variable we wish to address. Copyright © 2010 John Wiley & Sons, Ltd.

16.
A general methodology for selecting predictors for Gaussian generative classification models is presented. The problem is regarded as a model selection problem. Three different roles for each possible predictor are considered: a variable can be a relevant classification predictor or not, and the irrelevant classification variables can be linearly dependent on a part of the relevant predictors or independent variables. This variable selection model was inspired by a previous work on variable selection in model-based clustering. A BIC-like model selection criterion is proposed. It is optimized through two embedded forward stepwise variable selection algorithms for classification and linear regression. The model identifiability and the consistency of the variable selection criterion are proved. Numerical experiments on simulated and real data sets illustrate the interest of this variable selection methodology. In particular, it is shown that this well-grounded variable selection model can be of great interest to improve the classification performance of quadratic discriminant analysis in a high-dimensional context.
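The paper's criterion also models roles for irrelevant variables via linear regression; the simplified sketch below shows only the forward-stepwise/BIC mechanic on a synthetic linear model (the dimensions, coefficients, and intercept-free BIC are illustrative assumptions, not the paper's exact criterion).

```python
import numpy as np

rng = np.random.default_rng(5)
n, p = 300, 8
X = rng.standard_normal((n, p))
# Only the first two predictors are truly relevant.
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=1.0, size=n)

def bic_ols(X, y):
    """BIC of an OLS fit with Gaussian errors (no intercept, for brevity)."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ beta) ** 2)
    return n * np.log(rss / n) + k * np.log(n)

def forward_stepwise(X, y):
    """Greedy forward selection: at each step add the predictor that most
    improves BIC, and stop when no addition improves it."""
    selected, remaining = [], list(range(X.shape[1]))
    best_bic = np.inf
    while remaining:
        bic, j = min((bic_ols(X[:, selected + [j]], y), j) for j in remaining)
        if bic >= best_bic:
            break
        best_bic = bic
        selected.append(j)
        remaining.remove(j)
    return selected

sel = forward_stepwise(X, y)
```

The log(n) penalty in BIC is what makes the greedy path stop: an extra predictor is kept only when its deviance reduction exceeds the penalty, which spurious predictors rarely achieve.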

17.
We establish computationally flexible tools for the analysis of multivariate skew normal mixtures when missing values occur in data. To facilitate the computation and simplify the theoretical derivation, two auxiliary permutation matrices are incorporated into the model for the determination of observed and missing components of each observation and are manifestly effective in reducing the computational complexity. We present an analytically feasible EM algorithm for the supervised learning of parameters as well as missing observations. The proposed mixture analyzer, including the most commonly used Gaussian mixtures as a special case, allows practitioners to handle incomplete multivariate data sets in a wide range of considerations. The methodology is illustrated through a real data set with varying proportions of synthetic missing values generated by MCAR and MAR mechanisms and shown to perform well on classification tasks.

18.
The moment-independent sensitivity index has attracted wide attention because it reflects the influence of model input uncertainty on the entire distribution of the model output rather than on a specific moment. In this paper, a novel analytical expression for estimating the Borgonovo moment-independent sensitivity index is derived using the Gaussian radial basis function and the Edgeworth expansion. First, analytical expressions of the unconditional and conditional first four moments are established from the training points and the widths of the Gaussian radial basis function. Second, the Edgeworth expansion is used to express the unconditional and conditional probability density functions of the model output in terms of these first four moments. Finally, the index can be readily computed by measuring the shift between the obtained unconditional and conditional probability density functions of the model output, and this step does not require any extra calls of the model. The computational cost of the proposed method is independent of the dimensionality of the model inputs; it depends only on the training points and the widths involved in the Gaussian radial basis function meta-model. Results of several case studies demonstrate the effectiveness of the proposed method.
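For orientation, here is a brute-force histogram estimate of Borgonovo's delta on a toy two-input model: half the expected L1 distance between the conditional and unconditional output densities. This is the plain definition, not the paper's RBF/Edgeworth shortcut; the model, bin grid, and sample sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(6)
model = lambda x1, x2: x1 + 0.1 * x2   # toy model: input 1 dominates the output

N, M = 200, 4000
bins = np.linspace(-5, 5, 60)
width = bins[1] - bins[0]

# Unconditional output density from a large sample.
y_all = model(rng.standard_normal(100_000), rng.standard_normal(100_000))
f_y, _ = np.histogram(y_all, bins=bins, density=True)

def delta(index):
    """Histogram estimate of Borgonovo's delta for input `index`:
    0.5 * E over the fixed input of the L1 distance between the
    conditional and unconditional output densities."""
    shifts = []
    for _ in range(N):
        fixed = rng.standard_normal()
        z = rng.standard_normal(M)
        y_c = model(fixed, z) if index == 0 else model(z, fixed)
        f_c, _ = np.histogram(y_c, bins=bins, density=True)
        shifts.append(0.5 * np.sum(np.abs(f_c - f_y)) * width)
    return float(np.mean(shifts))

d1, d2 = delta(0), delta(1)
```

The double loop makes the cost clear: each input requires N conditional density estimates of M model runs each, which is exactly the expense the paper's analytical expression is designed to avoid.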

19.
In the study of complex organisms, clarifying the association between the evolution of coding genes and the measures of functional variables is of fundamental importance. However, traditional analysis of the evolutionary rate is either built on the assumption of independence between responses or fails to handle a mixture distribution problem. In this paper, we utilize the concept of generalized estimating equations to propose an estimating equation that accommodates continuous and binary probability distributions. The proposed estimator is shown to be consistent and asymptotically normal. Simulations and a data analysis are also presented to illustrate the proposed method.

20.
A finite mixture model using the multivariate t distribution has been well recognized as a robust extension of Gaussian mixtures. This paper presents an efficient PX-EM algorithm for supervised learning of multivariate t mixture models in the presence of missing values. To simplify the development of new theoretic results and facilitate the implementation of the PX-EM algorithm, two auxiliary indicator matrices are incorporated into the model and shown to be effective. The proposed methodology is a flexible mixture analyzer that allows practitioners to handle real-world multivariate data sets with complex missing patterns in a more efficient manner. The performance of computational aspects is investigated through a simulation study and the procedure is also applied to the analysis of real data with varying proportions of synthetic missing values.
