Similar Documents

20 similar documents found.
1.
Estimation of the ratio of probability densities has attracted a great deal of attention since it can be used to address various statistical problems. A naive approach to density-ratio approximation is to first estimate the numerator and denominator densities separately and then take their ratio. However, this two-step approach does not perform well in practice, and methods for directly estimating density ratios without density estimation have been explored. In this paper, we first give a comprehensive review of existing density-ratio estimation methods and discuss their pros and cons. We then propose a new framework of density-ratio estimation in which a density-ratio model is fitted to the true density ratio under the Bregman divergence. Our new framework includes existing approaches as special cases and is substantially more general. Finally, we develop a robust density-ratio estimation method under the power divergence, a novel instance within our framework.
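For concreteness, here is a minimal sketch of direct density-ratio estimation in the spirit of this framework. It implements uLSIF (unconstrained least-squares importance fitting), a well-known squared-loss instance of Bregman-divergence matching, not the paper's robust power-divergence estimator; the kernel width `sigma` and regularizer `lam` are fixed for illustration, whereas in practice they would be chosen by cross-validation.

```python
import numpy as np

def ulsif(x_nu, x_de, sigma=1.0, lam=0.1):
    """Estimate r(x) = p_nu(x) / p_de(x) directly from two 1-D samples,
    without estimating either density (uLSIF, squared-loss Bregman case)."""
    centers = x_nu                            # Gaussian kernels centred on numerator samples
    def phi(x):
        return np.exp(-(x[:, None] - centers[None, :]) ** 2 / (2 * sigma ** 2))
    H = phi(x_de).T @ phi(x_de) / len(x_de)   # second moment under the denominator
    h = phi(x_nu).mean(axis=0)                # first moment under the numerator
    alpha = np.linalg.solve(H + lam * np.eye(len(centers)), h)
    return lambda x: np.maximum(phi(x) @ alpha, 0.0)  # clip negative ratio values

rng = np.random.default_rng(0)
r_hat = ulsif(rng.normal(0.0, 1.0, 200), rng.normal(0.5, 1.2, 200))
print(r_hat(np.array([0.0, 1.0])))            # ratio estimates at two test points
```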

2.
The paper presents smooth estimation of densities utilizing penalized splines. The idea is to represent the unknown density as a convex mixture of basis densities, where the weights are estimated in penalized form. The proposed method extends the work of Komárek and Lesaffre (Comput Stat Data Anal 52(7):3441–3458, 2008) and allows for general density estimation. Simulations show convincing performance in comparison to existing density estimation routines. The idea is extended to allow the density to depend on a (factorial) covariate. Assuming a binary group indicator, for instance, we can test for equality of the densities in the two groups. This provides a smooth alternative to the classical Kolmogorov-Smirnov test or an analysis of variance, and it shows stable and powerful behaviour.
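A plausible rendering of the setup (the paper's exact parameterization may differ): with fixed basis densities $\varphi_1,\dots,\varphi_K$, the density is modelled as a convex mixture whose weights are estimated by penalizing the roughness of the underlying coefficients,

$$\hat f(x) = \sum_{k=1}^{K} c_k\,\varphi_k(x), \qquad c_k = \frac{\exp(\beta_k)}{\sum_{j=1}^{K}\exp(\beta_j)},$$

$$\hat\beta = \arg\max_{\beta}\; \sum_{i=1}^{n} \log \hat f(x_i) \;-\; \frac{\lambda}{2}\,\beta^{\top} D_m^{\top} D_m\, \beta,$$

where $D_m$ is an $m$-th order difference matrix and $\lambda$ controls the amount of smoothing.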

3.
We develop a test for log-concavity of multivariate densities. The method uses kernel density estimation, where the test statistic is the smallest bandwidth for which the estimate is log-concave. We examine the properties of this technique through numerical studies.
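A minimal one-dimensional sketch of the test statistic (the paper treats multivariate densities; this sketch also assumes, as in Silverman-style critical-bandwidth tests, that log-concavity of the Gaussian KDE is monotone in the bandwidth):

```python
import numpy as np
from scipy.stats import gaussian_kde

def is_log_concave(sample, bw, grid_size=400):
    """True if the Gaussian KDE with bandwidth factor `bw` has a concave
    log-density on a grid spanning the sample (second differences <= 0)."""
    kde = gaussian_kde(sample, bw_method=bw)
    xs = np.linspace(sample.min(), sample.max(), grid_size)
    return np.all(np.diff(np.log(kde(xs)), 2) <= 1e-8)

def critical_bandwidth(sample, lo=0.01, hi=5.0, tol=1e-3):
    """Smallest bandwidth at which the KDE becomes log-concave (bisection)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (lo, mid) if is_log_concave(sample, mid) else (mid, hi)
    return hi

rng = np.random.default_rng(1)
print(critical_bandwidth(rng.normal(size=300)))        # small for a Gaussian sample
mix = np.concatenate([rng.normal(-3, 1, 150), rng.normal(3, 1, 150)])
print(critical_bandwidth(mix))                         # large for a bimodal sample
```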

4.
We study a parametric estimation problem whose aim is to estimate, or identify, the conditional probability called the system. We suppose that we can select appropriate inputs to the system when gathering the training data. This kind of estimation is called active learning in the context of artificial neural networks. In this paper we suggest new active learning algorithms and evaluate their risk using statistical asymptotic theory. The algorithms can be regarded as a version of experimental design with two-stage sampling. We verify the efficiency of active learning through simple computer simulations.
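The two-stage sampling is in the spirit of classical optimal experimental design; the hypothetical sketch below shows the flavor with a greedy D-optimal input selection for a linear-in-features model (the paper's algorithms target conditional-probability systems and differ in detail).

```python
import numpy as np

def d_optimal_inputs(candidates, n_select, features):
    """Greedy D-optimal selection for a linear model y = features(x)^T w + noise:
    repeatedly pick the input that most increases log det of the information
    matrix X^T X (a classical batch experimental-design heuristic)."""
    X = np.array([features(c) for c in candidates])
    M = 1e-6 * np.eye(X.shape[1])            # small ridge keeps M invertible
    chosen = []
    for _ in range(n_select):
        gains = [np.linalg.slogdet(M + np.outer(x, x))[1] for x in X]
        i = int(np.argmax(gains))
        chosen.append(candidates[i])
        M += np.outer(X[i], X[i])
    return chosen

# Quadratic-feature example on [-1, 1]: the design concentrates on -1, 0, 1.
grid = np.linspace(-1, 1, 201)
print(d_optimal_inputs(grid, 6, lambda x: np.array([1.0, x, x ** 2])))
```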

5.
A bivariate survival function can be expressed as the composition of the marginal survival functions and a bivariate copula; consequently, one may estimate bivariate hazard functions via marginal hazard estimation and copula density estimation. Leveraging earlier developments on penalized likelihood density and hazard estimation, this article explores a nonparametric approach to bivariate hazard estimation. The new ingredient here is the nonparametric estimation of the copula density, a subject of interest in itself, and to accommodate survival data one needs to allow for censoring and truncation in the setting. A simple copularization process is implemented to convert density estimates into copula densities, and a cross-validation scheme is devised for density estimation under censoring and truncation. Empirical performance of the techniques is investigated through simulation studies, and potential applications are illustrated using real-data examples and open-source software.
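The decomposition underlying this program is Sklar's theorem applied to survival functions: writing $S_1, S_2$ for the marginal survival functions and $C$ for the (survival) copula,

$$S(t_1, t_2) = C\big(S_1(t_1),\, S_2(t_2)\big), \qquad f(t_1, t_2) = c\big(S_1(t_1),\, S_2(t_2)\big)\, f_1(t_1)\, f_2(t_2),$$

where $c = \partial^2 C / \partial u\, \partial v$ is the copula density, so marginal hazard estimates and a copula-density estimate together determine the bivariate hazard.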

6.
Learning strategies under covariate shift have recently been widely discussed. Under covariate shift, the density of the learning inputs differs from that of the test inputs, so learning machines in such environments need special learning strategies to generalize well. Incremental learning methods are also used in non-stationary learning environments, which represent a kind of covariate shift; however, the relation between covariate-shift environments and incremental-learning environments has not been adequately discussed. This paper focuses on covariate shift in incremental-learning environments and on our reconstruction of a suitable incremental-learning method. A model-selection criterion is then derived, which serves as an essential objective function for memetic algorithms solving these kinds of learning problems.
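The standard correction that the covariate-shift literature builds on (not the paper's incremental method) reweights each training loss by the test-to-training density ratio; a minimal sketch for least squares, with hypothetical, externally supplied weights:

```python
import numpy as np

def iw_least_squares(X, y, w):
    """Importance-weighted least squares: each residual is weighted by
    w_i ~ p_test(x_i) / p_train(x_i) to correct for covariate shift."""
    Xw = X * w[:, None]
    return np.linalg.solve(Xw.T @ X, Xw.T @ y)   # solves X^T W X theta = X^T W y

rng = np.random.default_rng(2)
X = np.column_stack([np.ones(100), rng.uniform(0, 2, 100)])
y = 1.0 + 2.0 * X[:, 1] + 0.1 * rng.normal(size=100)
w = np.exp(X[:, 1] - 1.0)          # stand-in for the density ratio p_test / p_train
print(iw_least_squares(X, y, w))   # approximately [1.0, 2.0]
```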

7.
Non-parametric density estimation is an important technique in probabilistic modeling and reasoning with uncertainty. We present a method for learning mixture of polynomials (MoP) approximations of one-dimensional and multidimensional probability densities from data. The method is based on basis spline (B-spline) interpolation, where a density is approximated as a linear combination of basis splines. We compute maximum likelihood estimators of the mixing coefficients of the linear combination. The Bayesian information criterion is used as the score function to select the order of the polynomials and the number of pieces of the MoP. The method is evaluated in two ways. First, we test the approximation fitting: we sample artificial datasets from known one-dimensional and multidimensional densities and learn MoP approximations from the datasets. The quality of the approximations is analyzed according to different criteria, and the new proposal is compared with MoPs learned with Lagrange interpolation and with mixtures of truncated basis functions. Second, the proposed method is used as a non-parametric density estimation technique in Bayesian classifiers. Two of the most widely studied Bayesian classifiers, the naive Bayes and tree-augmented naive Bayes classifiers, are implemented and compared. Results on real datasets show that the non-parametric Bayesian classifiers using MoPs are comparable to kernel density-based Bayesian classifiers. We provide a free R package implementing the proposed methods.
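Under the simplifying assumption that the model is a convex mixture of fixed, normalized B-spline basis functions (a special case of the MoP setting, not the paper's full method), the maximum-likelihood mixing coefficients can be computed with a plain EM update in which the components stay fixed and only the weights move:

```python
import numpy as np
from scipy.interpolate import BSpline
from scipy.integrate import quad

def fit_bspline_density(sample, knots, degree=3, n_iter=200):
    """ML mixing weights for f(x) = sum_k w_k * B_k(x) / ||B_k||_1,
    a convex mixture of normalized B-spline basis functions."""
    t = np.r_[[knots[0]] * degree, knots, [knots[-1]] * degree]  # clamped knots
    n_basis = len(t) - degree - 1
    def basis(k, x):
        coef = np.zeros(n_basis); coef[k] = 1.0
        return BSpline(t, coef, degree, extrapolate=False)(x)
    norms = [quad(lambda x: basis(k, x), knots[0], knots[-1])[0]
             for k in range(n_basis)]
    phi = np.array([[basis(k, xi) / norms[k] for k in range(n_basis)]
                    for xi in sample])                  # n x K component densities
    w = np.full(n_basis, 1.0 / n_basis)
    for _ in range(n_iter):                             # EM for the mixture weights
        resp = phi * w
        resp /= resp.sum(axis=1, keepdims=True)
        w = resp.mean(axis=0)
    return w

rng = np.random.default_rng(3)
w_hat = fit_bspline_density(rng.beta(2, 5, 500), np.linspace(0, 1, 8))
print(w_hat.round(3))
```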

8.
This paper proposes a new methodology for modelling uncertainties associated with functional random variables. The methodology deals simultaneously with several dependent functional variables and addresses the specific case where these variables are linked to a vectorial variable, called the covariate. In this case, the uncertainty modelling has two objectives: to retain both the most important features of the functional variables and the features most correlated with the covariate. The methodology is composed of two steps. First, the functional variables are decomposed on a functional basis; to deal simultaneously with several dependent functional variables, a Simultaneous Partial Least Squares algorithm is proposed to estimate this basis. Second, the joint probability density function of the coefficients selected in the decomposition is modelled by a Gaussian mixture model, and a new sparse method based on a Lasso penalization algorithm is proposed to estimate the Gaussian mixture model parameters and reduce their number. Several criteria are introduced to assess the performance of the methodology: its ability to approximate the probability distribution of the functional variables, their dependence structure, and the features which explain the covariate. Finally, the whole methodology is applied to a simulated example and to a nuclear reliability test case.
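A hypothetical, simplified version of the two steps, using PCA in place of the paper's Simultaneous PLS basis (so the covariate is ignored) and omitting the Lasso-penalized sparsity on the mixture parameters:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture

# Toy functional sample: each row is one curve observed on a common grid.
rng = np.random.default_rng(2)
grid = np.linspace(0.0, 1.0, 100)
curves = (np.sin(2 * np.pi * grid) * rng.normal(1.0, 0.2, (200, 1))
          + rng.normal(0.0, 0.05, (200, 100)))

# Step 1: decompose the curves on a data-driven functional basis.
coeffs = PCA(n_components=3).fit_transform(curves)

# Step 2: model the joint density of the retained coefficients with a GMM.
gmm = GaussianMixture(n_components=2, random_state=0).fit(coeffs)
new_coeffs, _ = gmm.sample(5)   # simulate new coefficient vectors
print(new_coeffs.shape)
```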

9.
A General Tractable Density Concept for Graphs
In many applications it is an important algorithmic task to find a densest subgraph in an input graph. The complexity of this task depends on how density is defined. If density means the ratio of the number of edges to the number of vertices in the subgraph, the problem has long been known to be efficiently solvable. On the other hand, the task becomes NP-hard with closely related but somewhat modified concepts of density. To capture many tractable density concepts of interest in a common model, we define and analyze a general concept of density, called F-density. Here F is a family of graphs, and we look for a subgraph of the input graph that is densest in terms of containing the highest number of graphs from F relative to the size of the subgraph. We show that for any fixed finite family F, a subgraph of maximum F-density can be found in polynomial time. As our main tool we develop an algorithm, which may be of independent interest, that finds an independent set of maximum independence ratio in a certain class of weighted graphs; the independence ratio is the weight of the independent set divided by the weight of its neighborhood. This work was supported in part by NSF grants ANI-0220001 and CCF-0634848.
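One natural formalization consistent with this description (the paper's precise normalization may differ): for a fixed finite family $\mathcal F$, the $\mathcal F$-density of a subgraph $H$ is

$$\rho_{\mathcal F}(H) \;=\; \frac{\#\{\text{subgraphs of } H \text{ isomorphic to a member of } \mathcal F\}}{|V(H)|},$$

and taking $\mathcal F = \{K_2\}$ recovers the classical edge-to-vertex density, whose maximization is the polynomially solvable densest-subgraph problem.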

10.
In this paper a new method for real-time estimation of vehicular flows and densities on motorways is proposed. The method is based on fusing traffic counts with mobile phone counts. The procedure for estimating traffic flow parameters rests on the hypothesis that “instrumented” vehicles can be counted on specific motorway sections and that traffic flow can be measured on entrance and exit ramps. The motorway is subdivided into cells, assuming that mobile phones entering and exiting every cell can be counted during the observation period. An estimate of the “instrumented” vehicle concentration is obtained and propagated over the network in time and space. This allows one to estimate traffic flow parameters by sampling “instrumented” traffic flow parameters using a propagation mechanism for the “concentration”, i.e., the ratio of the density of instrumented vehicles to the density of overall traffic.
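In symbols (a hedged reading of the abstract, not the paper's exact notation): with $k_{\mathrm{instr}}$ and $k$ the instrumented and overall traffic densities, the concentration is $c = k_{\mathrm{instr}}/k$, and propagating $c$ in time and space lets overall flow be recovered from instrumented flow via

$$\hat q(x,t) \;=\; \frac{q_{\mathrm{instr}}(x,t)}{c(x,t)}.$$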

11.
12.
We present an importance sampling method for deciding, based on an observed random field, whether a scan statistic provides significant evidence of increased activity in some localized region of time or space. Our method allows consideration of scan statistics based simultaneously on multiple scan geometries. Our approach yields an unbiased p value estimate whose variance is typically smaller than that of the naive hit-or-miss Monte Carlo technique when the p value is small. Furthermore, our p value estimate is often accurate for critical values that are not far enough in the tails of the null distribution to allow for accurate approximations via extreme value theory. The importance sampling approach unifies the analysis of various random field models, from (spatial) point processes to Gaussian random fields. For a scan statistic M, the method produces a p value of the form P[M ≥ τ] = Bρ, where B is the Bonferroni upper bound and the correction factor ρ measures the conservativeness of this upper bound. We present applications of our importance sampling estimator to multinomial sequences (molecular genetics), spatial point processes (digital mammography), and Gaussian random fields (PET brain imagery).
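The variance-reduction principle behind such estimators shows up already in a toy Gaussian-tail example (exponential tilting for a single normal variate, not the paper's random-field estimator):

```python
import numpy as np

rng = np.random.default_rng(3)
tau, n_sim = 3.5, 10_000          # estimate P[Z >= tau] for Z ~ N(0, 1)

# Naive hit-or-miss Monte Carlo: almost every draw misses the rare event.
z = rng.normal(size=n_sim)
p_naive = np.mean(z >= tau)

# Importance sampling: draw from N(tau, 1), reweight by the likelihood ratio
# phi(z) / phi(z - tau) = exp(-tau*z + tau^2/2).
z_tilt = rng.normal(loc=tau, size=n_sim)
weights = np.exp(-tau * z_tilt + tau ** 2 / 2)
p_is = np.mean((z_tilt >= tau) * weights)

print(p_naive, p_is)   # true value is about 2.3e-4; the IS estimate is far more stable
```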

13.
This paper considers parameter estimation problems for state space systems with time-delay. By means of the property of the shift operator, the state space systems are transformed into input–output representations, and an auxiliary model identification method is presented to estimate the system parameters. Finally, an example is provided to test the effectiveness of the proposed algorithm.
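When all signals in the input–output form are measurable, the estimation reduces to ordinary least squares on the regression form; a toy sketch with a known delay d (the paper's auxiliary model additionally handles unmeasurable inner variables):

```python
import numpy as np

# Toy system in input-output form: y(k) = a*y(k-1) + b*u(k-d) + noise.
rng = np.random.default_rng(4)
a, b, d, N = 0.7, 1.5, 3, 500
u = rng.normal(size=N)
y = np.zeros(N)
for k in range(d, N):
    y[k] = a * y[k - 1] + b * u[k - d] + 0.05 * rng.normal()

# Stack the regression y(k) = [y(k-1), u(k-d)] theta and solve by least squares.
Phi = np.column_stack([y[d - 1:N - 1], u[0:N - d]])
theta, *_ = np.linalg.lstsq(Phi, y[d:N], rcond=None)
print(theta)   # approximately [0.7, 1.5]
```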

14.
This paper proposes a nonparametric method for producing smooth and positive estimates of the Lévy density of a Lévy process, a class of models widely used in mathematical finance. We use log-wavelet density estimation to estimate the Lévy density from discretely sampled observations. Since Lévy densities are not necessarily probability densities, we introduce a divergence measure similar to Kullback–Leibler information to measure the difference between two Lévy densities. Rates of convergence are established over Besov spaces.
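A divergence of this flavor for nonnegative densities that need not integrate to one is the generalized Kullback–Leibler divergence

$$D(f \,\|\, g) \;=\; \int \Big( f(x)\,\log\frac{f(x)}{g(x)} \;-\; f(x) \;+\; g(x) \Big)\, dx,$$

which is nonnegative, vanishes only when $f = g$, and reduces to the usual KL information when both $f$ and $g$ are probability densities (the paper's measure may differ in detail).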

15.
This paper deals with the k-sample problem for functional data when the observations are density functions. We introduce test procedures based on distances between pairs of density functions (the L1 distance and the Hellinger distance, among others). A simulation study is carried out to compare the practical behaviour of the proposed tests. Theoretical derivations allow for weighted samples in the test procedures. The paper ends with a real-data example: for a collection of European regions we estimate the regional relative income densities and then test the significance of the country effect.
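A minimal sketch of the two distances on a common evaluation grid (in the paper's setting the closed-form densities below would be replaced by estimated ones, and a permutation test on the resulting statistic would supply p-values):

```python
import numpy as np

def l1_distance(f, g, xs):
    """L1 distance between two densities evaluated on a common grid."""
    return np.sum(np.abs(f - g)) * (xs[1] - xs[0])

def hellinger_distance(f, g, xs):
    """Hellinger distance: sqrt( (1/2) * integral (sqrt f - sqrt g)^2 )."""
    return np.sqrt(0.5 * np.sum((np.sqrt(f) - np.sqrt(g)) ** 2) * (xs[1] - xs[0]))

xs = np.linspace(-6.0, 6.0, 2001)
f = np.exp(-xs ** 2 / 2) / np.sqrt(2 * np.pi)         # N(0, 1)
g = np.exp(-(xs - 1) ** 2 / 2) / np.sqrt(2 * np.pi)   # N(1, 1)
print(l1_distance(f, g, xs), hellinger_distance(f, g, xs))
```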

16.
Spatial scan density (SSD) estimation via mixture models is an important problem in spatial statistical analysis with wide applications in image analysis. The “borrowed strength” density estimation (BSDE) method via mixture models enables one to estimate the local probability density function in a random field wherein potential similarities between the density functions for the subregions are exploited. This article proposes an efficient method for SSD estimation by integrating the borrowed-strength technique into the alternative EM framework, which combines the statistical basis of the BSDE approach with the stability and improved convergence rate of the alternative EM methods. In addition, we propose adaptive SSD estimation methods that extend this approach by eliminating the need to recompute the posterior probability of membership of the component densities afresh in each subregion. Simulation results and an application to the detection and identification of man-made regions of interest in unmanned aerial vehicle imagery show that the adaptive methods significantly outperform the BSDE method. Other applications include automatic target recognition, mammographic image analysis, and minefield detection.
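For reference, the generic EM iteration that these mixture-based methods build on, for a J-component mixture $f(x) = \sum_j \pi_j\,\varphi(x;\theta_j)$, alternates responsibilities and weighted updates (the alternative EM and borrowed-strength variants modify these steps rather than replace them):

$$\tau_{ij} \;=\; \frac{\pi_j\,\varphi(x_i;\theta_j)}{\sum_{l} \pi_l\,\varphi(x_i;\theta_l)}, \qquad \pi_j^{\text{new}} \;=\; \frac{1}{n}\sum_{i=1}^{n} \tau_{ij}.$$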

17.
The extensive use of maximum likelihood estimates underscores the importance of statistically estimating their errors. These estimates are of utmost importance in cases where the family of normal distributions, and families related to the normal distributions, are considered [1, 2, 4]. The mean square errors of the maximum likelihood estimates of the normal density were investigated in the author's paper [3], and the mean square errors of statistical estimates of some families of densities related to the normal distributions were considered in [4–6]. In the present paper, we obtain an asymptotic expansion of the mean square error of the maximum likelihood estimates of the densities of the joint distribution of sufficient statistics of the family of multivariate normal distributions. The results allow us to construct the mean square errors of the maximum likelihood estimates for the chi-square density and the Wishart density. Translated from Statisticheskie Metody Otsenivaniya i Proverki Gipotez, pp. 4–11, Perm, 1990.

18.
Regenerative simulation has become a familiar and established tool for simulation-based estimation. However, many applications (e.g., traffic in high-speed communications networks) call for autocorrelated stochastic models to which traditional regenerative theory is not directly applicable. Consequently, extensions of regenerative simulation to dependent time series are increasingly gaining theoretical and practical interest, with Markov chains constituting an important case. Fortunately, a regenerative structure can be identified in Harris-recurrent Markov chains with minor modification, and this structure can be exploited for standard regenerative estimation. In this paper we focus on a versatile class of Harris-recurrent Markov chains, called TES (Transform-Expand-Sample). TES processes can generate a variety of sample paths with arbitrary marginal distributions and autocorrelation functions with a variety of functional forms (monotone, oscillating, and alternating). A practical advantage of TES processes is that they can simultaneously capture the first- and second-order statistics of empirical sample paths (raw field measurements): the TES modeling methodology can match the empirical marginal distribution (histogram) while approximating the empirical autocorrelation function. We explicitly identify regenerative structures in TES processes and proceed to address efficiency and accuracy issues of prospective simulations. To show the efficacy of our approach, we report on a TES/M/1 case study. In this study, we used the likelihood ratio method to calculate the mean waiting time as a function of the regenerative structure and of the intrinsic TES parameter controlling burstiness (degree of autocorrelation) in the arrival process. The score function method was used to estimate the corresponding sensitivity (gradient) with respect to the service rate. Finally, we demonstrate the importance of the particular regenerative structure selected for the estimation efficiency and accuracy induced by the regeneration cycle length.

19.
The recent accelerated growth in computing power has popularized experimentation with dynamic computer models in various physical and engineering applications. Despite extensive statistical research in computer experiments, most of the focus has been on theoretical and algorithmic innovations for the design and analysis of computer models with scalar responses. In this article, we propose a computationally efficient statistical emulator for a large-scale dynamic computer simulator (i.e., a simulator which gives time series outputs). The main idea is to first find a good local neighborhood for every input location, and then emulate the simulator output via a singular value decomposition (SVD) based Gaussian process (GP) model. We develop a new design criterion for sequentially finding this local neighborhood set of training points. Several test functions and a real-life application are used to demonstrate the performance of the proposed approach over a naive method that chooses the local neighborhood set using the Euclidean distance among design points. The supplementary material, which contains proofs of the theoretical results, detailed algorithms, additional simulation results, and R codes, is available online.
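A compact sketch of the SVD-plus-GP idea using a global GP fit (the paper's contribution is the sequential choice of a local neighborhood of training points, which is omitted here):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def simulator(x, t=np.linspace(0.0, 1.0, 50)):
    """Toy dynamic simulator: scalar input, time-series output."""
    return np.sin(2 * np.pi * (t + x)) * (1.0 + x)

X = np.linspace(0.0, 1.0, 30)[:, None]
Y = np.array([simulator(x[0]) for x in X])      # 30 runs x 50 time points

# SVD of the output matrix: Y ~ (U_k s_k) V_k^T; fit one GP per SVD score.
U, s, Vt = np.linalg.svd(Y, full_matrices=False)
k = 3
scores = U[:, :k] * s[:k]
gps = [GaussianProcessRegressor(kernel=RBF(0.2)).fit(X, scores[:, j])
       for j in range(k)]

def emulate(x_new):
    z = np.array([gp.predict(np.array([[x_new]]))[0] for gp in gps])
    return z @ Vt[:k]                           # reconstruct the time series

print(np.abs(emulate(0.37) - simulator(0.37)).max())   # small emulation error
```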

20.
Yang Jing, Lu Fang, Yang Hu. Science China Mathematics, 2019, 62(10): 1977–1996
We propose a robust estimation procedure based on local Walsh-average regression (LWR) for single-index models. Our novel method provides a root-n consistent estimate of the single-index parameter under some mild regularity conditions; the estimate of the unknown link function converges at the usual rate for nonparametric estimation of a univariate covariate. We theoretically demonstrate that the new estimators show significant efficiency gains across a wide spectrum of non-normal error distributions and have almost no loss of efficiency for the normal error. Even in the worst case, the asymptotic relative efficiency (ARE) has a lower bound compared with the least squares (LS) estimates; the lower bounds of the AREs are 0.864 and 0.8896 for the single-index parameter and the nonparametric function, respectively. Moreover, the ARE of the proposed LWR-based approach versus the LS-based method has an expression closely related to the ARE of the signed-rank Wilcoxon test as compared with the t-test. In addition, to obtain a sparse estimate of the single-index parameter, we develop a variable selection procedure by combining the estimation method with the smoothly clipped absolute deviation (SCAD) penalty; this procedure is shown to possess the oracle property. We also propose a Bayes information criterion (BIC)-type criterion for selecting the tuning parameter and prove its ability to consistently identify the true model. We conduct Monte Carlo simulations and a real data analysis to illustrate the finite-sample performance of the proposed methods.
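For context, the 0.864 bound is the classical Hodges–Lehmann lower bound for Wilcoxon-type procedures relative to least squares (a standard fact, consistent with the abstract's figure): with error density $f$ having variance $\sigma_f^2$,

$$\mathrm{ARE}(f) \;=\; 12\,\sigma_f^2 \Big(\int f^2(x)\,dx\Big)^{2}, \qquad \inf_f \mathrm{ARE}(f) \;=\; 0.864,$$

and at the normal, $\mathrm{ARE} = 3/\pi \approx 0.955$.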
