Similar Articles
20 similar articles found.
1.
Principal component analysis on interval data
Real-world data analysis is often affected by different types of errors, such as measurement errors, computation errors, and imprecision related to the method adopted for estimating the data. The uncertainty in the data, which is strictly connected to these errors, may be treated by considering, rather than a single value for each datum, the interval of values in which it may fall: interval data. Statistical units described by interval data can be regarded as a special case of Symbolic Objects (SOs). In Symbolic Data Analysis (SDA), such data are represented as boxes. Accordingly, the purpose of the present work is to extend Principal Component Analysis (PCA) to obtain a visualisation of such boxes on a lower-dimensional space, pointing out the relationships among the variables, among the units, and between the two. The aim is to use, where possible, the instruments of interval algebra to adapt the mathematical models of classical PCA to the case in which an interval data matrix is given. The proposed method has been tested on a real data set, and the numerical results, which are in agreement with the theory, are reported.
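The abstract does not spell out which interval-algebra variant is adopted, so the following is only a minimal sketch of one common way to extend PCA to interval data: fit PCA on the interval midpoints (the "centers" approach) and obtain each unit's box on the principal plane by projecting the vertices of its hyper-rectangle. The names and choices below are illustrative assumptions, not the authors' exact method.

```python
import numpy as np
from itertools import product

def interval_pca(lower, upper, n_components=2):
    """Centers-style PCA for an (n x p) interval data matrix.

    Assumption: PCA is fitted on the interval midpoints; each unit's
    box on the principal plane is obtained by projecting all 2**p
    vertices of its hyper-rectangle and taking min/max per component.
    """
    mid = (lower + upper) / 2.0
    center = mid.mean(axis=0)
    _, _, vt = np.linalg.svd(mid - center, full_matrices=False)
    axes = vt[:n_components].T                     # (p, k) principal axes

    n, p = lower.shape
    boxes = np.empty((n, n_components, 2))
    for i in range(n):
        # All 2**p vertices of unit i's hyper-rectangle (fine for small p).
        verts = np.array(list(product(*zip(lower[i], upper[i]))))
        scores = (verts - center) @ axes
        boxes[i, :, 0] = scores.min(axis=0)        # lower bound per axis
        boxes[i, :, 1] = scores.max(axis=0)        # upper bound per axis
    return boxes

# Toy example: 3 units described by 2 interval variables.
lo = np.array([[1.0, 2.0], [2.0, 1.0], [4.0, 3.0]])
hi = np.array([[2.0, 3.0], [3.0, 2.5], [5.0, 4.0]])
print(interval_pca(lo, hi))
```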

2.
This paper introduces a principal component methodology for analysing histogram-valued data under the symbolic data domain. Currently, no comparable method exists for this type of data. The proposed method uses a symbolic covariance matrix to determine the principal component space. The resulting observations on principal component space are presented as polytopes for visualization. Numerical representation of the resulting polytopes via histogram-valued output is also presented. The necessary algorithms are included. The technique is illustrated on a weather data set.
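The abstract does not reproduce the symbolic covariance formula it uses. As a hedged illustration, one simple definition found in the SDA literature builds moments from bin midpoints weighted by relative frequencies; the sketch below uses only the per-histogram means and therefore ignores the within-histogram variability that fuller symbolic covariance definitions include.

```python
import numpy as np

def hist_mean(edges, freqs):
    """Mean of one histogram: bin-midpoint average weighted by frequency."""
    mids = (edges[:-1] + edges[1:]) / 2.0
    return float(np.sum(mids * freqs))

def symbolic_cov(h1, h2):
    """Simplified symbolic covariance between two histogram-valued
    variables, each given as a list of (bin_edges, relative_freqs),
    one histogram per statistical unit."""
    m1 = np.array([hist_mean(e, f) for e, f in h1])
    m2 = np.array([hist_mean(e, f) for e, f in h2])
    return float(np.mean((m1 - m1.mean()) * (m2 - m2.mean())))

# Two units, each described by one histogram per variable.
var1 = [(np.array([0.0, 1.0, 2.0]), np.array([0.3, 0.7])),
        (np.array([1.0, 2.0, 3.0]), np.array([0.5, 0.5]))]
var2 = [(np.array([10.0, 20.0, 30.0]), np.array([0.6, 0.4])),
        (np.array([20.0, 30.0, 40.0]), np.array([0.2, 0.8]))]
print(symbolic_cov(var1, var2))
```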

3.
4.
The aim of this paper is to investigate the economic specialization of the Italian local labor systems (sets of contiguous municipalities with a high degree of self-containment of daily commuter travel) by using the Symbolic Data approach, on the basis of data derived from the Census of Industrial and Service Activities. Specifically, the economic structure of a local labor system (LLS) is described by an interval-type variable, a special symbolic data type that accounts for the fact that the municipalities within the same LLS do not all have the same economic structure.

5.
We propose an extension of the notion of the histogram used for variables to describe a knowledge base where the knowledge is represented by a special kind of symbolic objects: Boolean assertion objects.

6.
The problem of characterizing a k-dimensional statistic contained in the past of a discrete-time stochastic process y, which allows the best linear least-squares prediction of the future of y, is considered. The solution is provided in terms of the Schmidt pairs and singular values of an infinite matrix, and of the linear innovations of y. In the stationary case, the spectral characteristic of the optimal statistic, and of the corresponding prediction estimate, is obtained. In the case of a rational spectrum, the results are shown to assume a form particularly attractive from the algorithmic point of view. The results admit a straightforward extension to multivariate stochastic processes.
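As a loose illustration of the SVD machinery the abstract refers to (not the paper's construction), a rank-k statistic of the past can be extracted from the truncated SVD of an empirical future/past cross-covariance matrix; the top k right singular vectors define k linear functionals of the past to be used for prediction.

```python
import numpy as np

rng = np.random.default_rng(0)
T, p, f, k = 5000, 10, 5, 2

# Simulate an AR(1) process as a stand-in for the process y.
y = np.zeros(T)
for t in range(1, T):
    y[t] = 0.8 * y[t - 1] + rng.standard_normal()

# Stack past windows (most recent value first) and future windows.
past = np.array([y[t - p:t][::-1] for t in range(p, T - f)])
futr = np.array([y[t:t + f] for t in range(p, T - f)])

# Empirical future/past cross-covariance (a Hankel-type matrix).
H = futr.T @ past / past.shape[0]
U, s, Vt = np.linalg.svd(H)

# k-dimensional statistic of the past with the largest singular values.
z = past @ Vt[:k].T
print("leading singular values:", np.round(s[:k], 3))
```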

7.
Compositional data are considered as data where relative contributions of parts to a whole, conveyed by (log-)ratios between them, are essential for the analysis. In Symbolic Data Analysis (SDA), we are in the framework of interval data when elements are characterized by variables whose values are intervals on \(\mathbb {R}\) representing inherent variability. In this paper, we address the special problem of the analysis of interval compositions, i.e., when the interval data are obtained by the aggregation of compositions. It is assumed that the interval information is represented by the respective midpoints and ranges, and both sources of information are considered as compositions. In this context, we introduce the representation of interval data as three-way data. In the framework of the log-ratio approach from compositional data analysis, it is outlined how interval compositions can be treated in an exploratory context. The goal of the analysis is to represent the compositions by coordinates which are interpretable in terms of the original compositional parts. This is achieved by summarizing all relative information (log-ratios) about each part into one coordinate from the coordinate system. Based on an example from the European Union Statistics on Income and Living Conditions (EU-SILC), several possibilities for an exploratory data analysis approach for interval compositions are outlined and investigated.
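A hedged sketch of the log-ratio treatment described above: represent each interval composition by its midpoint and range compositions, and map each through the centred log-ratio (clr) transform, so that each coordinate summarises the relative information of one part against the geometric mean of all parts. Pairing midpoints/ranges with clr (rather than the paper's specific coordinate system) is an assumption made for illustration.

```python
import numpy as np

def clr(x):
    """Centred log-ratio transform of a (strictly positive) composition."""
    logx = np.log(x)
    return logx - logx.mean(axis=-1, keepdims=True)

# One unit: lower/upper bounds of a 3-part interval composition.
lower = np.array([0.2, 0.3, 0.1])
upper = np.array([0.4, 0.5, 0.3])

mids = (lower + upper) / 2.0     # midpoint composition
rngs = upper - lower             # range composition (must stay positive)

print("clr(midpoints):", clr(mids).round(3))
print("clr(ranges):   ", clr(rngs).round(3))
```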

8.
Advances in Data Analysis and Classification - A Correction to this paper has been published: 10.1007/s11634-022-00503-9

9.
We introduce auto-associative composite models, which have shown good behavior on real data sets and share important theoretical approximation properties. Their basic principle is to approximate the data iteratively by manifolds of increasing dimension. We exhibit a special class of such models, auto-associative additive models, whose use is widespread in projection pursuit regression. First, we show that principal component analysis is a linear auto-associative additive model. Then, we show that principal component analysis is the only auto-associative composite model which is additive.
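The auto-associative reading of PCA can be made concrete: principal directions are added one at a time, and after d steps the reconstruction is the additive model mu + sum_j (x·v_j) v_j, a projection onto a manifold (here, a linear subspace) of increasing dimension. A minimal sketch of this special case, not of the general composite models:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 5))
mu = X.mean(axis=0)
Xc = X - mu

# Principal directions from the SVD of the centred data.
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)

recon = np.tile(mu, (X.shape[0], 1))
for d in range(3):                      # manifolds of increasing dimension
    v = Vt[d]
    recon += np.outer(Xc @ v, v)        # add the d-th 1-D auto-association
    err = np.linalg.norm(X - recon) / np.linalg.norm(X)
    print(f"dimension {d + 1}: relative reconstruction error {err:.3f}")
```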

10.
In this paper, a revisited interval approach to linear regression is proposed. In this context, according to the Midpoint-Radius (MR) representation, the uncertainty attached to the set-valued model can be decoupled from its trend. The estimated interval model is built from interval input-output data with the objective of covering all available data. The constrained optimization problem is addressed using a linear programming approach in which a new criterion is proposed for representing the global uncertainty of the interval model. The potential of the proposed method is illustrated by simulation examples.
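A minimal sketch of the kind of linear program described above, under simplifying assumptions: crisp inputs, interval outputs, a midpoint model c·x, and a radius model r·|x| (decoupled, as in the MR representation). Coverage of every observed interval is enforced by linear constraints, and the summed predicted radii stand in for the paper's global-uncertainty criterion, whose exact form may differ.

```python
import numpy as np
from scipy.optimize import linprog

def interval_regression(X, y_lo, y_hi):
    """Fit midpoint coefficients c (free) and radius coefficients r >= 0
    so that [c.x - r.|x|, c.x + r.|x|] covers every observed output
    interval, minimising the total predicted radius."""
    n, p = X.shape
    A = np.abs(X)
    # Decision vector: [c (p, free), r (p, nonnegative)].
    cost = np.concatenate([np.zeros(p), A.sum(axis=0)])
    # Coverage: c.x_i - r.|x_i| <= y_lo_i  and  c.x_i + r.|x_i| >= y_hi_i.
    A_ub = np.vstack([np.hstack([X, -A]), np.hstack([-X, -A])])
    b_ub = np.concatenate([y_lo, -y_hi])
    bounds = [(None, None)] * p + [(0, None)] * p
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
    return res.x[:p], res.x[p:]

X = np.column_stack([np.ones(5), np.arange(5.0)])   # intercept + one input
y_lo = np.array([0.5, 1.4, 2.2, 3.1, 4.0])
y_hi = np.array([1.5, 2.6, 3.8, 4.9, 6.0])
c, r = interval_regression(X, y_lo, y_hi)
print("midpoint coefs:", c.round(3), "radius coefs:", r.round(3))
```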

11.
12.
In this paper, we develop an approach to solving integer programming problems with interval data based on varying the relaxation set of the problem. This is illustrated by means of an L-class enumeration algorithm for solving a discrete production planning problem. We describe the algorithm and a number of its modifications and present results of a computational experiment for families of problems from the OR Library and with randomly generated initial data. This approach is also applied to obtain approximate solutions of the mentioned problem in its conventional setting.

13.
This paper is an adaptation of symbolic interval Principal Component Analysis (PCA) to histogram data. We propose two methodologies. The first one involves three steps: the coding of the histogram bins, the ordinary PCA of the means of the variables, and the representation of the dispersion of the symbolic observations, which we call concepts. For the representation of the dispersion of these concepts, we propose the transformation of histograms into intervals. Then, we suggest projecting the hypercubes or the interval lengths associated with each concept onto the principal axes of the ordinary PCA of means. In the second methodology, we propose the use of the three previous steps with the angular transformation.
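The angular transformation mentioned in the second methodology is presumably the usual variance-stabilising map p -> arcsin(sqrt(p)) applied to the bin relative frequencies; the exact pipeline is an assumption here. A rough sketch of that coding step followed by an ordinary PCA:

```python
import numpy as np

# Rows: concepts; columns: relative frequencies of the histogram bins.
P = np.array([[0.2, 0.5, 0.3],
              [0.1, 0.6, 0.3],
              [0.4, 0.4, 0.2],
              [0.3, 0.2, 0.5]])

Y = np.arcsin(np.sqrt(P))            # angular (arcsine-square-root) coding
Yc = Y - Y.mean(axis=0)
_, s, Vt = np.linalg.svd(Yc, full_matrices=False)
scores = Yc @ Vt.T                   # ordinary PCA on the coded bins
print("explained variance ratios:", (s**2 / np.sum(s**2)).round(3))
print(scores[:, :2].round(3))
```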

14.
This paper considers the problem of interval scale data in the most widely used models of data envelopment analysis (DEA), the CCR and BCC models. Radial models require inputs and outputs measured on a ratio scale. Our focus is on how to deal with interval scale variables, especially when the interval scale variable is a difference of two ratio scale variables, like profit or the decrease/increase in bank accounts. We suggest the use of these ratio scale variables in a radial DEA model.

15.
In this paper, we investigate DEA with interval input-output data. First, we show various extensions of efficiency and that 25 of them are essential. Second, we formulate the efficiency test problems as mixed integer programming problems. We prove that 14 of the 25 problems can be reduced to linear programming problems and that the other 11 efficiencies can be tested by solving a finite sequence of linear programming problems. Third, in order to obtain efficiency scores, we extend the SBM model to interval input-output data. Fourth, to moderate possible positive overassessment by DEA, we introduce the inverted DEA model with interval input-output data. Using efficiency and inefficiency scores, we propose a classification of DMUs. Finally, we apply the proposed approach to Japanese bank data and demonstrate its advantages.
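For intuition, one standard way (in the style of Despotis and Smirlis, not necessarily this paper's formulation) to bound a CCR-type ratio efficiency when inputs and outputs are only known to lie in intervals [x^L, x^U] and [y^L, y^U] is to evaluate each DMU under its most and least favourable realisations:

```latex
% Optimistic bound: DMU o at its best data, all other DMUs at their worst.
\theta_o^{U} = \max_{u,\,v \ge 0} \frac{u^{\top} y_o^{U}}{v^{\top} x_o^{L}}
\quad \text{s.t.}\quad
\frac{u^{\top} y_o^{U}}{v^{\top} x_o^{L}} \le 1, \qquad
\frac{u^{\top} y_j^{L}}{v^{\top} x_j^{U}} \le 1 \;\; (j \ne o).

% Pessimistic bound: the roles of the interval endpoints are swapped.
\theta_o^{L} = \max_{u,\,v \ge 0} \frac{u^{\top} y_o^{L}}{v^{\top} x_o^{U}}
\quad \text{s.t.}\quad
\frac{u^{\top} y_o^{L}}{v^{\top} x_o^{U}} \le 1, \qquad
\frac{u^{\top} y_j^{U}}{v^{\top} x_j^{L}} \le 1 \;\; (j \ne o).
```

Each DMU then receives an efficiency interval [θ_o^L, θ_o^U] rather than a single score, which is the kind of information a classification of DMUs can build on.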

16.
17.
This paper proposes fuzzy symbolic modeling as a framework for intelligent data analysis and model interpretation in classification and regression problems. The fuzzy symbolic modeling approach is based on the eigenstructure analysis of the data similarity matrix to define the number of fuzzy rules in the model. Each fuzzy rule is associated with a symbol and is defined by a Gaussian membership function. The prototypes for the rules are computed by a clustering algorithm, and the model output parameters are computed as the solutions of a bounded quadratic optimization problem. In classification problems, the rules' parameters are interpreted as the rules' confidences. In regression problems, the rules' parameters are used to derive rules' confidences for classes that represent ranges of output variable values. The resulting model is evaluated on a set of benchmark datasets for classification and regression problems. Nonparametric statistical tests performed on the benchmark results show that the proposed approach produces compact fuzzy models with accuracy comparable to models produced by standard modeling approaches. The resulting model is also exploited from the interpretability point of view, showing how the rule weights provide additional information to help in data and model understanding, such that it can be used as a decision support tool for the prediction of new data.
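A stripped-down sketch of the rule-evaluation side of such a model: Gaussian membership functions around clustered prototypes, with normalised memberships weighting per-rule output parameters. The prototype clustering and the bounded quadratic program for the output parameters are replaced by fixed placeholder values, so this illustrates the model form only.

```python
import numpy as np

def fuzzy_predict(x, prototypes, sigma, rule_outputs):
    """Evaluate a Gaussian-rule fuzzy model at point x.

    Each rule fires with a Gaussian membership of x around its
    prototype; the prediction is the membership-weighted average of
    the rule outputs (the weights act as rule confidences)."""
    d2 = np.sum((prototypes - x) ** 2, axis=1)
    mu = np.exp(-d2 / (2.0 * sigma ** 2))   # rule memberships
    w = mu / mu.sum()                       # normalised confidences
    return float(w @ rule_outputs)

prototypes = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.5]])
rule_outputs = np.array([0.0, 1.0, 0.5])    # placeholder output parameters
print(fuzzy_predict(np.array([0.8, 0.9]), prototypes, 0.5, rule_outputs))
```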

18.
Data envelopment analysis (DEA) is a method to estimate the relative efficiency of decision-making units (DMUs) performing similar tasks in a production system that consumes multiple inputs to produce multiple outputs. So far, a number of DEA models with interval data have been developed. The CCR model with interval data, the BCC model with interval data and the FDH model with interval data are well known as basic DEA models with interval data. In this study, we suggest a model with interval data called the interval generalized DEA (IGDEA) model, which can treat the basic DEA models with interval data stated above in a unified way. In addition, by establishing the theoretical properties of the relationships between the IGDEA model and those DEA models with interval data, we prove that the IGDEA model makes it possible to calculate the efficiency of DMUs incorporating various preference structures of decision makers.

19.
Ratio analysis is a commonly used analytical tool for verifying the performance of a firm. While ratios are easy to compute, which in part explains their wide appeal, their interpretation is problematic, especially when two or more ratios provide conflicting signals. Indeed, ratio analysis is often criticized on the grounds of subjectivity, that is, the analyst must pick and choose ratios in order to assess the overall performance of a firm.

In this paper we demonstrate that Data Envelopment Analysis (DEA) can augment traditional ratio analysis. DEA can provide a consistent and reliable measure of the managerial or operational efficiency of a firm. We test the null hypothesis that there is no relationship between DEA and traditional accounting ratios as measures of the performance of a firm. Our results reject the null hypothesis, indicating that DEA can provide information to analysts beyond that provided by traditional ratio analysis. We also apply DEA to the oil and gas industry to demonstrate how financial analysts can employ DEA as a complement to ratio analysis.
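To make the DEA side concrete, here is a minimal input-oriented CCR efficiency computation via linear programming, a textbook envelopment formulation included for illustration rather than as this paper's exact model:

```python
import numpy as np
from scipy.optimize import linprog

def ccr_efficiency(X, Y, o):
    """Input-oriented CCR score of DMU o.

    X: (n, m) inputs, Y: (n, s) outputs.  Solves
      min theta  s.t.  X^T lam <= theta * x_o,  Y^T lam >= y_o,  lam >= 0.
    """
    n, m = X.shape
    s = Y.shape[1]
    # Decision vector: [theta, lam_1, ..., lam_n].
    cost = np.concatenate([[1.0], np.zeros(n)])
    A_ub = np.vstack([
        np.hstack([-X[o][:, None], X.T]),     # X^T lam - theta * x_o <= 0
        np.hstack([np.zeros((s, 1)), -Y.T]),  # -Y^T lam <= -y_o
    ])
    b_ub = np.concatenate([np.zeros(m), -Y[o]])
    bounds = [(None, None)] + [(0, None)] * n
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
    return res.fun

X = np.array([[2.0, 3.0], [4.0, 2.0], [3.0, 5.0]])  # two inputs per DMU
Y = np.array([[1.0], [2.0], [1.5]])                 # one output per DMU
print([round(ccr_efficiency(X, Y, o), 3) for o in range(3)])
```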

20.
We explore the use of principal differential analysis as a tool for performing dimensional reduction of functional data sets. In particular, we compare the results provided by principal differential analysis and by functional principal component analysis in the dimensional reduction of three synthetic data sets, and of a real data set concerning 65 three-dimensional cerebral geometries, the AneuRisk65 data set. The analyses show that principal differential analysis can provide an alternative and effective representation of functional data, easily interpretable in terms of exponential, sinusoidal, or damped-sinusoidal functions, and providing a different insight into the functional data set under investigation. Moreover, in the analysis of the AneuRisk65 data set, principal differential analysis is able to detect interesting features of the data, such as the rippling effect of the vessel surface, that functional principal component analysis is not able to detect.
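As a rough illustration of principal differential analysis in the spirit of Ramsay's formulation (not the exact pipeline used on the AneuRisk65 data), one can estimate the coefficients of a low-order linear differential operator Lx = x'' + beta1 x' + beta0 x that the observed curves approximately annihilate, by least squares on finite-difference derivatives; sinusoidal curves should recover beta0 close to 1 and beta1 close to 0, matching the sinusoidal interpretability mentioned above.

```python
import numpy as np

rng = np.random.default_rng(2)
t = np.linspace(0, 4 * np.pi, 400)
dt = t[1] - t[0]

# Curves roughly satisfying x'' + x = 0 (sinusoids), plus mild noise.
curves = [np.sin(t + ph) + 0.01 * rng.standard_normal(t.size)
          for ph in rng.uniform(0, 2 * np.pi, 20)]

# Stack x, x' and x'' (finite differences) over all curves, then solve
# x'' = -(beta0 * x + beta1 * x') in the least-squares sense.
X0 = np.concatenate([c[1:-1] for c in curves])
X1 = np.concatenate([np.gradient(c, dt)[1:-1] for c in curves])
X2 = np.concatenate([np.diff(c, 2) / dt**2 for c in curves])
beta, *_ = np.linalg.lstsq(np.column_stack([X0, X1]), -X2, rcond=None)
print("estimated (beta0, beta1):", beta.round(3))   # ~ (1, 0)
```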
