Similar literature
Found 20 similar documents (search time: 31 ms)
1.
2.
Summary  Many data sets can be represented as interval curves, where the intervals reflect within-class variability. In particular, this representation is well suited to load profiles, which depict the electricity consumption of a class of customers. Electricity load profiling consists of assigning a daily load curve to a customer based on characteristics such as energy requirement. Within the load profiling scope, this paper investigates the extension of multivariate regression trees to the case of interval dependent (or response) variables. The tree method aims to set up load profiles and their assignment rules simultaneously, based on independent variables. The extension of multivariate regression trees to interval responses is detailed and a global approach is defined. Its first stage is a dimension reduction of the interval response variables. The extended tree method is then applied to the first principal interval components. The outputs are classes of interval curves, where each class is characterized both by an interval load profile (e.g. the class prototype) and by an assignment rule based on the independent variables.

3.
This article introduces graphical tools for visualizing multivariate functions, specializing to the case of visualizing multivariate density estimates. We visualize a density estimate by visualizing a series of its level sets. From each connected part of a level set a shape tree is formed. A shape tree is a tree whose nodes are associated with regions of the level set. With the help of a shape tree we define a transformation of a multivariate set to a univariate function. The shape trees are visualized with the shape plots and the location plot. By studying these plots one may identify the regions of the Euclidean space where the probability mass is concentrated. An application of shape trees to visualize the distribution of stock index returns is presented.

4.
Abstract

We investigate a new method for regression trees that obtains estimates and predictions subject to constraints on the coefficients representing the effects of splits in the tree. The procedure leads to both shrinking of the node estimates and pruning of branches in the tree, and for some problems it gives better predictions than the cost-complexity pruning used in the classification and regression tree (CART) algorithm. The new method is based on the least absolute shrinkage and selection operator (LASSO) method developed by Tibshirani.

5.
This article presents a method for visualization of multivariate functions. The method is based on a tree structure, called the level set tree, built from separated parts of level sets of a function. The method is applied for visualization of estimates of multivariate density functions. With different graphical representations of level set trees we may visualize the number and location of modes, excess masses associated with the modes, and certain shape characteristics of the estimate. Simulation examples are presented where projecting data to two dimensions does not help to reveal the modes of the density, but with the help of level set trees one may detect the modes. I argue that level set trees provide a useful method for exploratory data analysis.

6.
The paper presents a generalized regression technique centered on a superquantile (also called conditional value-at-risk) that is consistent with that coherent measure of risk and yields more conservatively fitted curves than classical least-squares and quantile regression. In contrast to other generalized regression techniques that approximate conditional superquantiles by various combinations of conditional quantiles, we directly and in perfect analogy to classical regression obtain superquantile regression functions as optimal solutions of certain error minimization problems. We show the existence and possible uniqueness of regression functions, discuss the stability of regression functions under perturbations and approximation of the underlying data, and propose an extension of the coefficient of determination R-squared for assessing the goodness of fit. The paper presents two numerical methods for solving the error minimization problems and illustrates the methodology in several numerical examples in the areas of uncertainty quantification, reliability engineering, and financial risk management.
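
As a small illustration of the quantity this abstract centers on (not of the paper's regression method), the superquantile of an empirical sample can be obtained from the well-known Rockafellar-Uryasev formula, which minimizes c + E[max(X - c, 0)] / (1 - alpha) over c. The function name `superquantile` and the sample data are hypothetical; the sketch assumes equally weighted observations and 0 <= alpha < 1.

```python
def superquantile(samples, alpha):
    """Empirical superquantile (CVaR) at level alpha, assuming equal weights.

    Uses the Rockafellar-Uryasev minimization formula. The objective is
    convex and piecewise linear in c, so for an empirical sample its
    minimum is attained at one of the sample points; we simply try them all.
    """
    xs = sorted(samples)
    n = len(xs)
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")

    def objective(c):
        excess = sum(max(x - c, 0.0) for x in xs)
        return c + excess / ((1.0 - alpha) * n)

    return min(objective(c) for c in xs)


if __name__ == "__main__":
    # With ten equally weighted points, the 0.8-superquantile is the
    # average of the two largest observations.
    print(superquantile(range(1, 11), 0.8))
```

At alpha = 0 the same function reduces to the sample mean, which is a quick sanity check on the formula.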

7.
The analysis of compositions of Runge-Kutta methods involves manipulations of functions defined on rooted trees. Existing formulations due to Butcher [1972], Hairer and Wanner [1974], and Murua and Sanz-Serna [1999], while equivalent, differ in details. The subject of the present paper is a new recursive formulation of the composition rules. This both simplifies and extends the existing approaches. Instead of using the order conditions based on trees, we propose the construction of the order conditions using a suitably chosen basis on the tree space. In particular, the linear structure of the tree space gives a representation of the C and D simplifying assumptions on trees which is not restricted to Runge-Kutta methods. A proof of the group structure of the set of elementary weight functions satisfying the D simplifying assumptions is also given in this paper.

8.
This paper presents an extension of the standard regression tree method to clustered data. Previous works extending tree methods to accommodate correlated data are mainly based on the multivariate repeated-measures approach. We propose a “mixed effects regression tree” method where the correlated observations are viewed as nested within clusters rather than as vectors of multivariate repeated responses. The proposed method can handle unbalanced clusters, allows observations within clusters to be split, and can incorporate random effects and observation-level covariates. We implemented the proposed method using a standard tree algorithm within the framework of the expectation-maximization (EM) algorithm. The simulation results show that the proposed regression tree method provides substantial improvements over standard trees when the random effects are non-negligible. A real data example is used to illustrate the method.

9.
A tree is scattered if it does not contain a subdivision of the complete binary tree as a subtree. We show that every scattered tree T contains a vertex, an edge, or a set of at most two ends preserved by every embedding of T. This extends results of Halin, Polat and Sabidussi. Calling two trees equimorphic if each embeds in the other, we then prove that either every tree that is equimorphic to a scattered tree T is isomorphic to T, or there are infinitely many pairwise non-isomorphic trees which are equimorphic to T. This proves the tree alternative conjecture of Bonato and Tardif for scattered trees, and a conjecture of Tyomkyn for locally finite scattered trees.

10.
Abstract

The mode tree of Minnotte and Scott provides a valuable method of investigating features such as modes and bumps in an unknown density. By examining kernel density estimates for a range of bandwidths, we can learn a lot about the structure of a data set. Unfortunately, the basic mode tree can be strongly affected by small changes in the data, and gives no way to differentiate between important modes and those caused, for example, by outliers. The mode forest overcomes these difficulties by looking simultaneously at a large collection of mode trees, all based on some variation of the original data, by means such as resampling or jittering. The resulting graphic tool is both visually appealing and informative.
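
The idea of examining kernel density estimates over a range of bandwidths can be sketched as follows. This is a minimal illustration, assuming a Gaussian kernel and a simple grid search for local maxima; it is not the construction used by Minnotte and Scott, and the names `kde`, `mode_locations` and `mode_tree` as well as the toy data are hypothetical.

```python
import math


def kde(x, data, h):
    """Gaussian kernel density estimate at point x with bandwidth h."""
    norm = len(data) * h * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - d) / h) ** 2) for d in data) / norm


def mode_locations(data, h, grid_size=401):
    """Approximate mode locations: local maxima of the estimate on a grid."""
    lo, hi = min(data) - 3.0 * h, max(data) + 3.0 * h
    step = (hi - lo) / (grid_size - 1)
    xs = [lo + i * step for i in range(grid_size)]
    ys = [kde(x, data, h) for x in xs]
    # "> left and >= right" counts a two-point plateau at a peak only once.
    return [xs[i] for i in range(1, grid_size - 1)
            if ys[i] > ys[i - 1] and ys[i] >= ys[i + 1]]


def mode_tree(data, bandwidths):
    """Map each bandwidth to the modes found at that bandwidth."""
    return {h: mode_locations(data, h) for h in bandwidths}


if __name__ == "__main__":
    sample = [-2.1, -2.0, -1.9, 1.9, 2.0, 2.1]  # two well-separated clusters
    for h, modes in mode_tree(sample, [0.5, 1.0, 3.0]).items():
        print(h, [round(m, 2) for m in modes])
```

Plotting mode locations against bandwidth gives the tree-like picture; repeating the computation on resampled or jittered copies of the data is the step that the mode forest adds.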

11.
Abstract

An essential feature of longitudinal data is the existence of autocorrelation among the observations from the same unit or subject. Two-stage random-effects linear models are commonly used to analyze longitudinal data. These models are not flexible enough, however, for exploring the underlying data structures and, especially, for describing time trends. Semi-parametric models have been proposed recently to accommodate general time trends. But these semi-parametric models do not provide a convenient way to explore interactions among time and other covariates, although such interactions exist in many applications. Moreover, semi-parametric models require specifying the design matrix of the covariates (time excluded). We propose nonparametric models to resolve these issues. To fit nonparametric models, we use the multivariate adaptive regression splines technique to estimate the mean curve and then apply an EM-like iterative procedure for covariance estimation. After giving a general algorithm for model building, we show how to design a fast algorithm. We use both simulated and published data to illustrate the use of our proposed method.

12.
Abstract

Asymptotic corrections are used to compute the means and the variance-covariance matrix of multivariate posterior distributions that are formed from a normal prior distribution and a likelihood function that factors into separate functions for each variable in the posterior distribution. The approximations are illustrated using data from the National Assessment of Educational Progress (NAEP). These corrections produce much more accurate approximations than those produced by two different normal approximations. In a second potential application, the computational methods are applied to logistic regression models for severity adjustment of hospital-specific mortality rates.

13.
We propose a new approach for nonlinear regression modeling by employing basis expansion for the case where the underlying regression function has inhomogeneous smoothness. In this case, conventional nonlinear regression models tend to overfit or underfit where the function is more or less smooth, respectively. First, the underlying regression function is roughly approximated with a locally linear function using an \(\ell_1\) penalized method, where this procedure is executed by extending an algorithm for the fused lasso signal approximator. We then extend the fused lasso signal approximator and develop an algorithm. Next, the residuals between the locally linear function and the data are used to adaptively prepare the basis functions. Finally, we construct a nonlinear regression model with these basis functions along with the technique of a regularization method. To select the optimal values of the tuning parameters for the regularization method, we provide an explicit form of the generalized information criterion. The validity of our proposed method is then demonstrated through several numerical examples.

14.
Abstract

In a longitudinal study, individuals are observed over some period of time. The investigator wishes to model the responses over this time as a function of various covariates measured on these individuals. The times of measurement may be sparse and not coincident across individuals. When the covariate values are not extensively replicated, it is very difficult to propose a parametric model linking the response to the covariates because plots of the raw data are of little help. Although the response curve may only be observed at a few points, we consider the underlying curve y(t). We fit a regression model y(t) = x^T β(t) + ε(t) and use the coefficient functions β(t) to suggest a suitable parametric form. Estimates of y(t) are constructed by simple interpolation, and appropriate weighting is used in the regression. We demonstrate the method on simulated data to show its ability to recover the true structure and illustrate its application to some longitudinal data from the Panel Study of Income Dynamics.

15.
Abstract

A regression analysis usually consists of several stages, such as variable selection, transformation and residual diagnosis. Inference is often made from the selected model without regard to the model selection methods that preceded it. This can result in overoptimistic and biased inferences. We first characterize data-analytic actions as functions acting on regression models. We investigate the extent of the problem and test bootstrap, jackknife, and sample-splitting methods for ameliorating it. We also demonstrate an interactive LISP-STAT system for assessing the cost of the data analysis while it is taking place.

16.
Based on uniform recursive trees, we introduce random trees with the factor of time, which are named Yule recursive trees. The structure and the distance between the vertices in Yule recursive trees are investigated in this paper. For arbitrary time t > 0, we first give the probability that a Yule recursive tree Y_t is isomorphic to a given rooted tree γ, and then prove that the asymptotic distribution of ζ_{t,m}, the number of the branches of size m, is the Poisson distribution with parameter λ = 1/m. Finally, two types of distance between vertices in Yule recursive trees are studied, and some limit theorems for them are established. © 2007 Wiley Periodicals, Inc. Random Struct. Alg., 2007

17.
The two functions in question are mappings [n] → [n], with [n] = {1, 2, ..., n}. The acyclic functions may be represented by forests of labeled rooted trees, or by free trees with n + 1 points; the parking functions are associated with the simplest ballot problem. The total number of each is (n + 1)^(n-1). The first of two mappings given is based on a simple mapping, due to H. O. Pollak, of parking functions on tree codes. In the second, each kind of function is mapped on permutations, arising naturally from characterizations of the functions. Several enumerations are given to indicate uses of the mappings.
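
The count (n + 1)^(n-1) for parking functions can be checked by brute force for small n. The sketch below assumes the standard characterization (a sequence in [n]^n is a parking function exactly when its sorted values b_1 ≤ ... ≤ b_n satisfy b_i ≤ i); the function names are hypothetical, and the enumeration is only practical for small n.

```python
from itertools import product


def is_parking_function(seq):
    """True if the i-th smallest entry (1-indexed) is at most i for every i."""
    return all(value <= i for i, value in enumerate(sorted(seq), start=1))


def count_parking_functions(n):
    """Count parking functions of length n by enumerating all of [n]^n."""
    return sum(
        1
        for seq in product(range(1, n + 1), repeat=n)
        if is_parking_function(seq)
    )


if __name__ == "__main__":
    for n in range(1, 6):
        # Brute-force count next to the closed form (n + 1)^(n - 1).
        print(n, count_parking_functions(n), (n + 1) ** (n - 1))
```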

18.
A Brelot space is a connected, locally compact, noncompact Hausdorff space together with the choice of a sheaf of functions on this space which are called harmonic. We prove that by considering functions on a tree to be functions on the edges as well as on the vertices (instead of just on the vertices), a tree becomes a Brelot space. This leads to many results on the potential theory of trees. By restricting the functions just to the vertices, we obtain several new results on the potential theory of trees considered in the usual sense. We study trees whose nearest-neighbor transition probabilities are defined by both transient and recurrent random walks. Besides the usual case of harmonic functions on trees (the kernel of the Laplace operator), we also consider as “harmonic” the eigenfunctions of the Laplacian relative to a positive eigenvalue showing that these also yield a Brelot structure and creating new classes of functions for the study of potential theory on trees.

19.
In this paper, we develop a fast algorithm for a smoothing spline estimator in multivariate regression. To accomplish this, we employ general concepts associated with roughness penalty methods in conjunction with the theory of radial basis functions and reproducing kernel Hilbert spaces. It is shown that through the use of compactly supported radial basis functions it becomes possible to recover the band structured matrix feature of univariate spline smoothing and thereby obtain a fast computational algorithm. Given n data points in R^2, the new algorithm has complexity O(n^2) compared to O(n^3), the order for the thin plate multivariate smoothing splines.

20.
Most everyday reasoning and decision making is based on uncertain premises. The premises or attributes that we must take into consideration are random variables, so we often have to deal with a high-dimensional multivariate random vector. A multivariate random vector can be represented graphically as a Markov network. Usually the structure of the Markov network is unknown. In this paper we construct a special type of junction tree in order to obtain good approximations of the real probability distribution. These junction trees are capable of revealing some of the conditional independences of the network. We have already introduced the concept of the t-cherry junction tree (E. Kovács and T. Szántai in Proceedings of the IFIP/IIASA//GAMM Workshop on Coping with Uncertainty, 2010), based on the t-cherry tree graph structure. This approximation uses only two- and three-dimensional marginal probability distributions. Now we use k-th order t-cherry trees, also called simplex multitrees, to introduce the concept of the k-th order t-cherry junction tree. We prove that the k-th order t-cherry junction tree gives the best approximation among the family of k-width junction trees. Then we give a method which, starting from a k-th order t-cherry junction tree, constructs a (k+1)-th order t-cherry junction tree that gives an approximation at least as good. In the last part we present some numerical results and some possible applications.
