1.

Unsupervised Learning With Random Forest Predictors

《Journal of computational and graphical statistics》2013,22(1):118-138

A random forest (RF) predictor is an ensemble of individual tree predictors. As part of their construction, RF predictors naturally lead to a dissimilarity measure between the observations. One can also define an RF dissimilarity measure between unlabeled data: the idea is to construct an RF predictor that distinguishes the “observed” data from suitably generated synthetic data. The observed data are the original unlabeled data and the synthetic data are drawn from a reference distribution. Here we describe the properties of the RF dissimilarity and make recommendations on how to use it in practice.

An RF dissimilarity can be attractive because it handles mixed variable types well, is invariant to monotonic transformations of the input variables, and is robust to outlying observations. The RF dissimilarity easily deals with a large number of variables due to its intrinsic variable selection; for example, the Addcl 1 RF dissimilarity weighs the contribution of each variable according to how dependent it is on other variables.

We find that the RF dissimilarity is useful for detecting tumor sample clusters on the basis of tumor marker expressions. In this application, biologically meaningful clusters can often be described with simple thresholding rules.
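
The synthetic-data step in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it takes the Addcl1 scheme to mean sampling each variable independently from its empirical marginal, and it uses the square root of one minus the RF proximity as the dissimilarity. The function names (`make_addcl1_synthetic`, `build_labeled_set`, `rf_dissimilarity`) are hypothetical, and fitting the forest itself is left out.

```python
import random

def make_addcl1_synthetic(observed, rng):
    """Sample each column independently from its empirical marginal.

    This keeps every variable's marginal distribution but destroys the
    dependence between variables (assumed Addcl1-style reference data).
    """
    n_rows = len(observed)
    columns = list(zip(*observed))  # column-wise view of the data
    synthetic_columns = [
        [rng.choice(col) for _ in range(n_rows)] for col in columns
    ]
    return [list(row) for row in zip(*synthetic_columns)]

def build_labeled_set(observed, rng):
    """Label observed rows 1 and synthetic rows 0 for a two-class RF."""
    synthetic = make_addcl1_synthetic(observed, rng)
    rows = [list(r) for r in observed] + synthetic
    labels = [1] * len(observed) + [0] * len(synthetic)
    return rows, labels

def rf_dissimilarity(proximity):
    """Turn an RF proximity in [0, 1] into a dissimilarity (assumed form)."""
    return (1.0 - proximity) ** 0.5
```

A forest trained on `rows` and `labels` would then supply the proximities between the observed rows, which `rf_dissimilarity` converts into a matrix suitable for clustering.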

2.

Case-Specific Random Forests

Ruo Xu Dan Nettleton Daniel J. Nordman 《Journal of computational and graphical statistics》2016,25(1):49-65

Random forest (RF) methodology is a nonparametric methodology for prediction problems. A standard way to use RFs includes generating a global RF to predict all test cases of interest. In this article, we propose growing different RFs specific to different test cases, namely case-specific random forests (CSRFs). In contrast to the bagging procedure in the building of standard RFs, the CSRF algorithm takes weighted bootstrap resamples to create individual trees, where we assign large weights to the training cases in close proximity to the test case of interest a priori. Tuning methods are discussed to avoid overfitting issues. Both simulation and real data examples show that the weighted bootstrap resampling used in CSRF construction can improve predictions for specific cases. We also propose a new case-specific variable importance (CSVI) measure as a way to compare the relative predictor variable importance for predicting a particular case. It is possible that the idea of building a predictor case-specifically can be generalized in other areas.
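
The weighted bootstrap idea above can be illustrated with a short Python sketch. The weighting rule here is an assumption made for illustration only (a Gaussian kernel of Euclidean distance with a hypothetical `bandwidth` parameter); the abstract does not say how the authors measure proximity or set the weights.

```python
import math
import random

def proximity_weights(train_x, test_x, bandwidth=1.0):
    """Hypothetical weighting: Gaussian kernel of Euclidean distance."""
    weights = []
    for x in train_x:
        dist_sq = sum((a - b) ** 2 for a, b in zip(x, test_x))
        weights.append(math.exp(-dist_sq / (2.0 * bandwidth ** 2)))
    return weights

def weighted_bootstrap_indices(weights, n_draws, rng):
    """Draw training-case indices with replacement, proportional to weight."""
    return rng.choices(range(len(weights)), weights=weights, k=n_draws)

# Usage: cases near the test point dominate each resample.
train_x = [[0.0], [0.1], [5.0]]
w = proximity_weights(train_x, [0.0])
idx = weighted_bootstrap_indices(w, 1000, random.Random(1))
```

Each tree of a case-specific forest would be grown on one such resample, so a standard bagged forest corresponds to the special case of equal weights.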

3.

Bayesian tail‐risk forecasting using realized GARCH

Christian Contino Richard H. Gerlach 《商业与工业应用随机模型》2017,33(2):213-236

A realized generalized autoregressive conditional heteroskedastic (GARCH) model is developed within a Bayesian framework for the purpose of forecasting value at risk and conditional value at risk. Student‐t and skewed‐t return distributions are combined with Gaussian and Student‐t distributions in the measurement equation to forecast tail risk in eight international equity index markets over a 4‐year period. Three realized measures are considered within this framework. A Bayesian estimator is developed that compares favourably, in simulations, with maximum likelihood, both in estimation and forecasting. The realized GARCH models show a marked improvement compared with ordinary GARCH for both value‐at‐risk and conditional value‐at‐risk forecasting. This improvement is consistent across a variety of data and choice of distributions. Realized GARCH models incorporating a skewed Student‐t distribution for returns are favoured overall, with the choice of measurement equation error distribution and realized measure being of lesser importance. Copyright © 2017 John Wiley & Sons, Ltd.
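
For readers unfamiliar with the two tail-risk measures named above, the following Python sketch computes their empirical versions from a sample of losses. It is not the realized GARCH forecasting procedure of the paper: it assumes losses are recorded as positive numbers and uses a simple order-statistic convention for the quantile, and the function name `empirical_var_cvar` is hypothetical.

```python
import math

def empirical_var_cvar(losses, alpha=0.99):
    """Empirical VaR and CVaR of a loss sample at confidence level alpha."""
    ordered = sorted(losses)
    # Index of the smallest loss whose empirical CDF reaches alpha.
    k = max(0, math.ceil(alpha * len(ordered)) - 1)
    var = ordered[k]
    tail = ordered[k:]  # losses at or beyond VaR
    cvar = sum(tail) / len(tail)
    return var, cvar
```

CVaR averages the losses in the tail beyond VaR, so it is never smaller than VaR at the same level.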

4.

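Parameter estimation for mixed exponential distributions with grouped and right-censored data

田玉柱 田茂再 陈平 《数理统计与管理》2012,(6):981-989

Mixture models are an important class of models in reliability engineering, finance and insurance, econometrics, and related fields. This paper uses the EM algorithm to address parameter estimation for the mixed exponential distribution with grouped and right-censored data, and gives the corresponding estimation formulas. A final numerical simulation shows that the EM algorithm is effective for the model.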
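
A minimal Python sketch of EM for a two-component exponential mixture with right censoring is given here to make the approach concrete. It is a simplification under stated assumptions: it ignores the grouped-data part of the problem, treats the component labels as the only latent variables, and uses crude starting values. The function name `em_exponential_mixture` and the update formulas are this sketch's own, not the formulas derived in the paper.

```python
import math
import random

def em_exponential_mixture(times, observed, n_iter=200):
    """EM for a two-component exponential mixture with right censoring.

    times[i] is the failure time if observed[i] is True, otherwise the
    censoring time. Returns (weights, rates, log_likelihood_trace).
    """
    mean_t = sum(times) / len(times)
    weights = [0.5, 0.5]
    rates = [0.5 / mean_t, 2.0 / mean_t]  # crude, hypothetical starting values
    trace = []
    for _ in range(n_iter):
        # E-step: posterior component probabilities for every case.
        resp = []
        loglik = 0.0
        for t, seen in zip(times, observed):
            # Density for an observed failure, survival function if censored.
            dens = [
                w * (lam if seen else 1.0) * math.exp(-lam * t)
                for w, lam in zip(weights, rates)
            ]
            total = sum(dens)
            loglik += math.log(total)
            resp.append([d / total for d in dens])
        trace.append(loglik)
        # M-step: closed-form updates of mixing weights and rates.
        for k in range(2):
            r_sum = sum(r[k] for r in resp)
            events = sum(r[k] for r, seen in zip(resp, observed) if seen)
            exposure = sum(r[k] * t for r, t in zip(resp, times))
            weights[k] = r_sum / len(times)
            rates[k] = events / exposure
    return weights, rates, trace
```

Each rate update is a weighted "events divided by exposure" ratio, which is the usual censored-exponential estimate applied within a component.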

5.

Asymmetry and multiple endemic equilibria in a model for HIV transmission in a heterosexual population

《Mathematical and Computer Modelling》1999,29(3):43-61

The difference in transmissibility of HIV between heterosexual males and females in specific social contexts is known to play an important role in determining the form of HIV/AIDS epidemics across the globe. A fundamental constraint here is the conservation of the number of new partnerships formed between the sexes. We examine the impact of general asymmetry in sexual behaviour between the sexes, subject to this group contact constraint, on the transient and long term behaviour of an HIV epidemic. A homogeneously mixing heterosexual population is modelled in which males and females differ only in their infectivity rates (average sexual risk per infected partner) and sexual activity rates (the mean number of sexual partners per unit time for a typical individual). A dominance form of sexual activity rates yields conditions for the existence of multiple endemic equilibria for R₀, the reproductive number, just less than unity. We interpret this as a resilience of the disease persistence for R₀ > 1, which requires significant differences between the sexes' transmissibility. Model simulations in this region of the parameter space show that the time scale and shape of an epidemic curve can be considerably altered. Sexual activity rates modelling the proportions of sexually active groups are also used to address the role of asymmetry. We discuss the consequences of our results for management of the disease.

6.

On a proper way to select population failure distribution and a stochastic optimization method in parameter estimation

Wan-Kai Pang Shui-Hung Hou Wing-Tong Yu 《European Journal of Operational Research》2007

It is widely accepted that the Weibull distribution plays an important role in reliability applications. The reliability of a product or a system is the probability that the product or the system will still function for a specified time period when operating under some confined conditions. Parameter estimation for the three parameter Weibull distribution has been studied by many researchers in the past. Maximum likelihood has traditionally been the main method of estimation for Weibull parameters along with other recently proposed hybrids of optimization methods. In this paper, we use a stochastic optimization method called the Markov Chain Monte Carlo (MCMC) to carry out the estimation. The method is extremely flexible and inference for any quantity of interest is easily obtained.
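
To show what MCMC estimation of Weibull parameters can look like, here is a small random-walk Metropolis sketch in Python. It is a simplified stand-in, not the authors' sampler: it handles the two-parameter Weibull (shape and scale, no location parameter), assumes a flat prior on the log-parameters, and uses a hypothetical fixed proposal step.

```python
import math
import random

def weibull_loglik(data, shape, scale):
    """Log-likelihood of a two-parameter Weibull sample."""
    return sum(
        math.log(shape) - shape * math.log(scale)
        + (shape - 1.0) * math.log(x) - (x / scale) ** shape
        for x in data
    )

def metropolis_weibull(data, n_steps, rng, step=0.1):
    """Random-walk Metropolis on (log shape, log scale).

    Assumes a flat prior on the log-parameters, so the acceptance
    ratio reduces to a likelihood ratio.
    """
    log_shape, log_scale = 0.0, math.log(sum(data) / len(data))
    current = weibull_loglik(data, math.exp(log_shape), math.exp(log_scale))
    samples = []
    for _ in range(n_steps):
        prop_shape = log_shape + rng.gauss(0.0, step)
        prop_scale = log_scale + rng.gauss(0.0, step)
        proposed = weibull_loglik(data, math.exp(prop_shape), math.exp(prop_scale))
        # Accept with probability min(1, L(proposed) / L(current)).
        if math.log(1.0 - rng.random()) < proposed - current:
            log_shape, log_scale, current = prop_shape, prop_scale, proposed
        samples.append((math.exp(log_shape), math.exp(log_scale)))
    return samples
```

After discarding an initial burn-in, any quantity of interest (for example the reliability at a given time) can be summarised by evaluating it on each retained draw, which is the flexibility the abstract refers to.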

7.

Preserving the Rothschild–Stiglitz type of increasing risk with background risk

《Insurance: Mathematics and Economics》2016

Background risk refers to a risk that is exogenous and is not subject to transformations by a decision-maker. In this paper, we extend the definition of the Rothschild–Stiglitz type of increasing risk to a background risk framework. We theoretically investigate a more general definition of increase in risk in the presence of background risk. The results suggest that an extended concept of expectation dependence plays a vital role.

8.

Artificial intelligence in healthcare operations to enhance treatment outcomes: a framework to predict lung cancer prognosis

Johnson Marina Albizri Abdullah Simsek Serhat 《Annals of Operations Research》2022,308(1-2):275-305

Artificial Intelligence (AI) is critical for data-driven decision making to increase resource utilization, operational performance, and service quality in various industry domains, particularly in healthcare. Using AI in healthcare operations can significantly improve treatment outcomes and enhance patient satisfaction while reducing costs. In this paper, we propose a multi-stage framework to build an AI-based decision support tool that can predict the 5-year survivability of lung cancer patients. We evaluate the proposed framework using the Surveillance, Epidemiology, and End Results dataset pertaining to the 1973–2015 period obtained from the National Institutes of Health. The first stage entails data preprocessing and target creation. The second stage applies six AI algorithms with feature selection through Particle Swarm Optimization and hyperparameter tuning with Cross-Validation. These algorithms include Logistic Regression, Decision Trees, Random Forests (RF), Adaptive Boosting (AdaBoost), Artificial Neural Network, and Naïve Bayes. The results show that RF and AdaBoost models yield an AUC rate of 0.94 and outperform the other models. Stage 3 utilizes permutation importance to interpret the RF and AdaBoost models and applies Tree-based Augmented Naïve Bayes to gain insights regarding the interrelations among important features. The results of Stage 3 delineate that the number of lymph nodes containing metastases, the number of tumors that patients have had in their lifetime, the patient’s age, and the microscopic composition of cells rank among the topmost important features and can significantly impact patient survivability. We think this study has significant practical implications in helping physicians predict prognosis and develop treatment plans for lung cancer patients.

9.

Portfolio Selection Problem with Minimax Type Risk Function

K.L. Teo X.Q. Yang 《Annals of Operations Research》2001,101(1-4):333-349

The investor's preference in risk estimation of portfolio selection problems is important as it influences investment strategies. In this paper a minimax risk criterion is considered. Specifically, the investor aims to restrict the standard deviation for each of the available stocks. The corresponding portfolio optimization problem is formulated as a linear program. Hence it can be implemented easily. A capital asset pricing model between the market portfolio and each individual return for this model is established using nonsmooth optimization methods. Some numerical examples are given to illustrate our approach for the risk estimation.

10.

Bayesian parameter inference for models of the Black and Scholes type

Henryk Gzyl Enrique ter Horst Samuel W. Malone 《商业与工业应用随机模型》2008,24(6):507-524

In this paper, we describe a general method for constructing the posterior distribution of the mean and volatility of the return of an asset satisfying dS=SdX for some simple models of X. Our framework takes as inputs the prior distributions of the parameters of the stochastic process followed by the underlying, as well as the likelihood function implied by the observed price history for the underlying. As an application of our framework, we compute the value at risk (VaR) and conditional VaR (CVaR) measures for the changes in the price of an option implied by the posterior distribution of the volatility of the underlying. The implied VaR and CVaR are more conservative than their classical counterparts, since they take into account the estimation risk that arises due to parameter uncertainty. Copyright © 2008 John Wiley & Sons, Ltd.

11.

Dynamic Bivariate Mortality Modelling

Jiao Ying Salhi Yahia Wang Shihua 《Methodology and Computing in Applied Probability》2022,24(2):917-938

The dependence structure of the life statuses plays an important role in the valuation of life insurance products involving multiple lives. Although the mortality of individuals is well studied in the literature, their dependence remains a challenging field. In this paper, the main objective is to introduce a new approach for analyzing the mortality dependence between two individuals in a couple. It is intended to describe in a dynamic framework the joint mortality of married couples in terms of marginal mortality rates. The proposed framework is general and aims to capture, by adjusting some parametric form, the desired effect such as the “broken-heart syndrome”. To this end, we use a well-suited multiplicative decomposition, which will serve as a building block for the framework to relate the dependence structure and the marginals, and we make the link with existing practice of affine mortality models. Finally, given that the framework is general, we propose some illustrative examples and show how the underlying model captures the main stylized facts of bivariate mortality dynamics.

12.

Bayesian Copulae Distributions, with Application to Operational Risk Management

Luciana Dalla Valle 《Methodology and Computing in Applied Probability》2009,11(1):95-115

The aim of this paper is to introduce a new methodology for operational risk management, based on Bayesian copulae. One of the main problems related to operational risk management is understanding the complex dependence structure of the associated variables. In order to model this structure in a flexible way, we construct a method based on copulae. This allows us to split the joint multivariate probability distribution of a random vector of losses into individual components characterized by univariate marginals. Thus, copula functions embody all the information about the correlation between variables and provide a useful technique for modelling the dependency of a high number of marginals. Another important problem in operational risk modelling is the lack of loss data. This suggests the use of Bayesian models, computed via simulation methods and, in particular, Markov chain Monte Carlo. We propose a new methodology for modelling operational risk and for estimating the required capital. This methodology combines the use of copulae and Bayesian models.
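
The separation of dependence from marginals described above can be illustrated with a short Python sketch that simulates two dependent loss variables. The choices here are assumptions for illustration only: a Gaussian copula with a single correlation parameter `rho` and exponential marginals. The abstract does not say which copula family or marginals the paper uses, and the Bayesian estimation step is not shown.

```python
import math
import random
from statistics import NormalDist

def sample_gaussian_copula_losses(rho, rate_a, rate_b, n, rng):
    """Draw n pairs of exponential losses linked by a Gaussian copula."""
    std_normal = NormalDist()
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        # Map correlated normals to uniforms: this pair is the copula sample.
        # The clamp guards against a CDF value that rounds to exactly 1.0.
        u1 = min(std_normal.cdf(z1), 1.0 - 1e-12)
        u2 = min(std_normal.cdf(z2), 1.0 - 1e-12)
        # Apply each marginal's inverse CDF (exponential, for illustration).
        loss_a = -math.log1p(-u1) / rate_a
        loss_b = -math.log1p(-u2) / rate_b
        pairs.append((loss_a, loss_b))
    return pairs
```

Swapping the two inverse-CDF lines for other marginals changes the individual loss distributions without touching the dependence structure, which is the practical appeal of the copula construction.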

13.

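A multi-period portfolio model with stable distributions and its application

玄海燕 包海明 石新勇 杨娜娜 《数学杂志》2015,35(3):735-742

This paper studies the multi-period portfolio model. Using non-normal stable distributions and parameter estimation methods, it builds a multi-period portfolio model for a market containing one risk-free security and several risky securities. The model helps describe the non-normal features of risky securities, namely skewness and excess kurtosis, and supports applications to the stock market.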

14.

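Bayes estimation of causal effects in a counterfactual model

许静 郑忠国 《数学研究及应用》2004,24(3):381-387

Bayes and empirical Bayes estimates of causal effects in a counterfactual model are given, together with methods for choosing the prior distribution under three replaceability assumptions. Experiments show that, when it is unknown which replaceability assumption holds, the empirical Bayes estimate outperforms the other estimates.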

15.

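Parameter estimation for mixtures of two-parameter generalized Pareto distributions

刘媚 汤银才 《数学的实践与认识》2009,39(20)

Because of its heavy tail, the Pareto family is a very important statistical model in financial analysis and lifetime analysis. For the mixed two-parameter generalized Pareto distribution, however, the traditional method-of-moments and maximum likelihood estimators, though feasible in theory, are difficult to apply in practice. This paper applies the ECM algorithm, a variant of the EM algorithm, to parameter estimation for the mixed generalized Pareto distribution with complete data. Simulations show that the EM algorithm is an easily implemented and highly effective way to estimate the mixed generalized Pareto distribution.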

16.

Asymptotic behavior of Mean-CVaR portfolio selection model under nonparametric framework

Jun Zhao Yi Zhang 《高校应用数学学报(英文版)》2017,32(1):79-92

Portfolio selection is an important issue in finance and it involves the balance between risk and return. This paper investigates portfolio selection under the Mean-CVaR model in a nonparametric framework with α-mixing data, since financial data tend to be dependent. Many works have examined the performance of portfolio selection through data and simulation, whereas this paper concentrates on the asymptotic behavior of the optimal solutions and of the risk estimates in theory.

17.

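A review of statistical analysis of masked system lifetime data (in English)

Xu Ancha Tang Yincai 《应用概率统计》2012,28(4):380-388

Competing-failure data for systems are widespread in engineering applications. Masked data, a special form of such data, play an important role in engineering. This paper first introduces the form of masked data and how they differ from conventional competing-failure data. It then describes two methods for analyzing masked data from series or parallel systems (maximum likelihood and the Bayesian approach), and finally applies both methods to a real example.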

18.

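Forecasting realized volatility of high-frequency financial data based on factor analysis

刘丽萍 《数学的实践与认识》2017,(14):244-252

The realized volatility (RV) of high-frequency financial data plays a very important role in risk management, and a large literature studies how to forecast an asset's realized volatility. This paper uses factor analysis to forecast RV, examines the role that the unobservable common factors of financial series play in forecasting realized volatility, accounts for the effect of jumps in asset prices, and builds a factor-analysis-based volatility forecasting model (F-RV-J). The F-RV-J model is compared with other commonly used forecasting models in three respects: loss functions, the MCS test, and the ability to forecast value at risk (VaR). The F-RV-J model is found to clearly outperform the other volatility forecasting models.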

19.

On a resolvent estimate of the Stokes system in a half space arising from a free boundary problem for the Navier–Stokes equations

Y. Shibata S. Shimizu 《Mathematische Nachrichten》2009,282(3):482-499

In this paper we consider a resolvent problem of the Stokes operator with some boundary condition in the half space, which is obtained as a model problem arising in evolution free boundary problems for viscous, incompressible fluid flow. We show standard resolvent estimates in the L_q framework (1 < q < ∞), applying some kernel estimates to concrete solution formulas. The Volevich trick in [21] plays a fundamental role in estimating solutions. (© 2009 WILEY‐VCH Verlag GmbH & Co. KGaA, Weinheim)

20.

Risk concentration and diversification: Second-order properties

Matthias Degen Dominik D. Lambrigger 《Insurance: Mathematics and Economics》2010,46(3):541-546

The quantification of diversification benefits due to risk aggregation plays a prominent role in the (regulatory) capital management of large firms within the financial industry. However, the complexity of today’s risk landscape makes a quantifiable reduction of risk concentration a challenging task. In the present paper we discuss some of the issues that may arise. The theory of second-order regular variation and second-order subexponentiality provides the ideal methodological framework to derive second-order approximations for the risk concentration and the diversification benefit.