Similar Literature
Found 20 similar documents; search took 15 ms
1.
Interactive graphics are an important tool that facilitates exploratory data and model analysis, a crucial step in real-world applied statistics. Only a very limited set of software provides truly interactive graphics for data analysis, partly because such graphics are not easy to implement. Very often specialized software is created to offer graphics for a particular problem, but many fundamental plots are omitted because they are not considered new research. In this paper we discuss a general framework for creating interactive graphics software on a sound foundation, one that offers a consistent user interface, fast prototyping of new plots, and extensibility to support interactive models. We also discuss one implementation of the general framework: iPlots eXtreme, next-generation interactive graphics for analysis of large data in R. It provides most fundamental plot types and allows new interactive plots to be created. The implementation raises interactive graphics performance to an entirely new level. We briefly discuss several methods that allowed us to achieve this goal and illustrate the use of advanced programmability features in conjunction with R.

2.
Abstract

We present a method for graphically displaying regression data with Bernoulli responses. The method, which is based on the use of grayscale graphics to visualize contributions to a likelihood function, provides an analog of a scatterplot for logistic regression, as well as probit analysis. Furthermore, the method may be used in place of a traditional scatterplot in situations where such plots are often used.

3.
Abstract

This article introduces a new form of empirical distribution function (EDF), called the flipped empirical distribution function (FEDF), to represent univariate data graphically. Because the plot shows the location of individual points, it may be useful when we need to manipulate specific data points, as with dynamic graphics. The article introduces several methods for exploring multidimensional data using the FEDF: a parallel FEDF, an FEDF scatterplot matrix, and an FEDF starplot. The usefulness of these plots in exploring multidimensional data becomes more prominent when they are implemented with the methods of dynamic graphics, such as selecting, deleting, linking, locating, and identifying a group of data points.

4.
Abstract

Graphical selection of data views is a fundamental task in interactive statistical graphics. Linked plots provide a form of indirect selection, where direct manipulation of objects displayed in one plot indirectly selects objects in other plots. A pointing device or brush is typically used for direct manipulation, and so this indirect selection is commonly known as linked brushing. Most commonly, linked brushing is applied to two or more scatterplots showing various pairs of variables from a multivariable dataset. This article describes a generalization of linked brushing for the setting where plots display different, though related, datasets. With this form of linking, we can graphically explore relationships between datasets. Our linking system is extensible and handles any kind of display of any kind of dataset, as well as arbitrary relationships between those datasets.
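The idea of linking plots of different but related datasets can be sketched in a few lines. The following is only an illustration under stated assumptions, not the system described in the abstract: the `Link` class, the `brush` helper, and the county/state data are all hypothetical names introduced here. A rectangular brush selects rows in one dataset, and a stored relationship propagates that selection to rows of another dataset.

```python
from collections import defaultdict


def brush(points, xmin, xmax, ymin, ymax):
    """Return the ids of points that fall inside a rectangular brush.

    `points` maps a row id to an (x, y) pair (hypothetical layout).
    """
    return {
        rid
        for rid, (x, y) in points.items()
        if xmin <= x <= xmax and ymin <= y <= ymax
    }


class Link:
    """A relationship between two datasets, stored as pairs of row ids."""

    def __init__(self, pairs):
        self.forward = defaultdict(set)   # row in dataset A -> rows in dataset B
        self.backward = defaultdict(set)  # row in dataset B -> rows in dataset A
        for a, b in pairs:
            self.forward[a].add(b)
            self.backward[b].add(a)

    def propagate(self, selected, reverse=False):
        """Map a selection in one dataset to the related rows in the other."""
        table = self.backward if reverse else self.forward
        linked = set()
        for row in selected:
            linked |= table.get(row, set())
        return linked


if __name__ == "__main__":
    # Hypothetical data: counties plotted as points, each belonging to a state.
    counties = {"c1": (0.2, 0.3), "c2": (0.8, 0.9), "c3": (0.4, 0.1)}
    county_to_state = Link([("c1", "s1"), ("c2", "s1"), ("c3", "s2")])

    selected_counties = brush(counties, 0.0, 0.5, 0.0, 0.5)
    selected_states = county_to_state.propagate(selected_counties)
    print(sorted(selected_counties), sorted(selected_states))
```

Storing the relationship as explicit id pairs keeps the sketch agnostic about what each plot displays; a many-to-many relationship needs no special handling because each side maps to a set.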

5.
Abstract

This article proposes some probability plots for testing spherical and elliptical symmetry in terms of statistics that are invariant under orthogonal transformations. Some correlation coefficients are recommended as numerical measures for detecting deviation from spherical or elliptical symmetry, and the empirical percentiles of these correlation coefficients are calculated by simulation. The simulation results for data sets from 12 different populations show that the new plots are useful for testing spherical and elliptical symmetry. Some discussion is also given.

6.
Abstract

We describe a system, called the Graphics Production Library (GPL), that implements a language for quantitative graphics. The structure of this system differs from existing statistical graphics, visualization, and mapping systems. Instead of treating a graphics display as a viewer for underlying data, GPL treats data as an accessory to viewing a graph. GPL is based on the mathematical definition of the graph of a function and uses that definition to organize data linked to the graph.

7.
Temporal data are information measured in the context of time. This contextual structure provides components that need to be explored to understand the data and that can form the basis of interactions applied to the plots. In multivariate time series, we expect to see temporal dependence, long-term and seasonal trends, and cross-correlations. In longitudinal data, we also expect within- and between-subject dependence. Time series and longitudinal data, although analyzed differently, are often plotted using similar displays. We provide a taxonomy of interactions on plots that enable exploring the temporal components of these data types, and describe how to build these interactions using data transformations. Because temporal data are often accompanied by other types of data, we also describe how to link the temporal plots with other displays of data. The ideas are conceptualized into a data pipeline for temporal data and implemented in the R package cranvas. This package provides many different types of interactive graphics that can be used together to explore data or diagnose a model fit.

8.
Abstract

A simple method for providing mathematical annotation of plots produced with the R environment is described. Although the implementation is specific to R, a similar method could be used in any environment which uses an expression-based command interface and provides a basic quoting mechanism.

9.
On the category Q-Mod   Cited: 1 in total, 0 self-citations, 1 by others
In this paper we consider the category Q-Mod of modules over a given quantale Q. The paper is motivated by constructions and results from the category of modules over a ring. We show that the category Q-Mod is monadic, consider its relation to the category Q-Top of Q-topological spaces, and generalize a method of completion of partially ordered sets. Received December 20, 2005; accepted in final form December 4, 2006.

10.
Abstract

Recently, Huber offered a taxonomy of data set sizes ranging from tiny (10^2 bytes) to huge (10^10 bytes). This taxonomy is particularly appealing because it quantifies the meaning of tiny, small, medium, large, and huge. Indeed, some investigators consider 300 small and 10,000 large, while others consider 10,000 small. In Huber's taxonomy, most statistical and visualization techniques are computationally feasible with tiny data sets. With larger data sets, however, computers run out of computational horsepower and graphics displays run out of resolution fairly quickly. In this article, I discuss aspects of data set size and computational feasibility for general classes of algorithms in the context of CPU performance, memory size, hard disk capacity, screen resolution, and massively parallel architectures. I discuss some strategies, such as recursive formulations, that mitigate the impact of size. I also discuss the potential for scalable parallelization that will mitigate the effects of computational complexity.
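As a rough illustration of how such a taxonomy quantifies size labels, the sketch below assigns a byte count to the nearest class on a log10 scale. The abstract gives only the endpoints (tiny and huge) and the five class names; the intermediate sizes of 10^4, 10^6, and 10^8 bytes are an assumption made here from the even spacing of the endpoints, and the function name `size_class` is hypothetical.

```python
import math

# Class names come from the abstract. Only the tiny and huge sizes are stated
# there; the three intermediate sizes are ASSUMED to be evenly spaced in log10.
SIZE_CLASSES = [
    ("tiny", 10**2),
    ("small", 10**4),
    ("medium", 10**6),
    ("large", 10**8),
    ("huge", 10**10),
]


def size_class(n_bytes):
    """Return the class whose nominal size is nearest to n_bytes on a log10 scale."""
    if n_bytes <= 0:
        raise ValueError("n_bytes must be positive")
    log_size = math.log10(n_bytes)
    name, _ = min(SIZE_CLASSES, key=lambda c: abs(math.log10(c[1]) - log_size))
    return name


if __name__ == "__main__":
    for n in (100, 5_000_000, 10**10):
        print(n, size_class(n))
```

Working on a log scale matches the way the classes are spaced by factors of 100 rather than by fixed increments.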

11.
Abstract

We propose a rudimentary taxonomy of interactive data visualization based on a triad of data analytic tasks: finding Gestalt, posing queries, and making comparisons. These tasks are supported by three classes of interactive view manipulations: focusing, linking, and arranging views. This discussion extends earlier work on the principles of focusing and linking and sets them on a firmer base. Next, we give a high-level introduction to a particular system for multivariate data visualization: XGobi. This introduction is not comprehensive but emphasizes XGobi tools that are examples of focusing, linking, and arranging views; namely, high-dimensional projections, linked scatterplot brushing, and matrices of conditional plots. Finally, in a series of case studies in data visualization, we show the powers and limitations of particular focusing, linking, and arranging tools. The discussion is dominated by high-dimensional projections, which form an extremely well-developed part of XGobi. Of particular interest are the illustration of asymptotic normality of high-dimensional projections (a theorem of Diaconis and Freedman), the use of high-dimensional cubes for visualizing factorial experiments, and a method for interactively generating matrices of conditional plots with high-dimensional projections. Although there is a unifying theme to this article, each section (in particular the case studies) can be read separately.

12.
Open-source machine learning: R meets Weka   Cited: 1 in total, 0 self-citations, 1 by others
Two of the prime open-source environments available for machine/statistical learning in data mining and knowledge discovery are the software packages Weka and R, which have emerged from the machine learning and statistics communities, respectively. To make the different sets of tools from both environments available in a single unified system, an R package RWeka is suggested which interfaces Weka’s functionality to R. With only a thin layer of (mostly R) code, a set of general interface generators is provided which can set up interface functions with the usual “R look and feel”, re-using Weka’s standardized interface of learner classes (including classifiers, clusterers, associators, filters, loaders, savers, and stemmers) with associated methods.

13.
Abstract

There are many examples of text databases, including literary corpora and computer source code, in which statistics are associated with each line. A visualization technique for this class of data represents the text lines as thin colored rows within columns. The position, length, and indentation of each row correspond to those of the text. The color of each row is determined by a statistic associated with each line. The display looks like a miniature picture of the text, with the color showing the spatial distribution of the statistic within the text. Using this technique, SeeSoft™, a dynamic graphics software tool, can easily display 50,000 lines of text simultaneously on a high-resolution monitor.

14.
Communications in Algebra, 2013, 41(2): 897-906
ABSTRACT

We consider which sets of graded Betti numbers actually occur among all resolutions of modules with a given Hilbert function.

15.
Abstract

We present dynamic and static graphs for exploratory analysis of survival data. These graphs are based on a smooth semiparametric estimate of the survival probability as a function of time and a covariate. We overlay a contour plot of the conditional survival distribution on a scatterplot of time and covariate. This is augmented by plots of the estimated survival function at particular covariate values and the receiver operating characteristic curve at particular time points. In our XLisp-Stat implementation, these plots are linked, and the time and covariate values for the augmenting plots can be varied dynamically. The methods are illustrated on data from a clinical study of liver disease.

16.
Abstract

The concept of statistical strategy is introduced and used to develop a structured graphical user interface for guiding data analysis. The interface visually represents statistical strategies that are designed by expert data analysts to guide novices. The representation is an abstraction of the expert's concepts of the essence of a data analysis. We argue that an environment that visually guides and structures data analysis will improve data analysis productivity, accuracy, accessibility, and satisfaction in comparison to an environment without such aids, especially for novice data analysts. Our concepts are based on notions from cognitive science, and can be empirically evaluated. The interface consists of two interacting windows: the guidemap and the workmap. Each window contains a graph that has nodes and edges. The guidemap graph represents the statistical strategy for a specific statistical task (such as describing data). Nodes represent potential data analysis actions that can be taken by the system. Edges represent potential actions that can be taken by the analyst. The guidemap graph exists prior to the data analysis session, having been created by an expert. The workmap graph represents the complete history of all steps taken by the data analyst. It is constructed during the data analysis session as a result of the analyst's actions. Workmap nodes represent data sets, data models, or data analysis procedures that have been created or used by the analyst. Workmap edges represent the chronological sequence of the analyst's actions. One workmap node is highlighted to indicate which statistical object is the focus of the strategy. We illustrate our concepts with ViSta, the Visual Statistics system that we have developed.

17.
A spreadplot is a visualization that simultaneously shows several different views of a dataset or model. The individual views can be dynamic, can support high-interaction direct manipulation, and can be algebraically linked with each other, possibly via an underlying statistical model. Thus, when a data analyst changes the information shown in one view of a statistical model, the changes can be processed by the model and instantly represented in the other views. Spreadplots simplify the analyst's task when many different plots are relevant to the analysis at hand, as is the case in regression analysis, where there are many plots that can be used for model building and diagnosis. On the other hand, the development of a visualization involving many dynamic, highly interactive, directly manipulable graphics is not a trivial task. This article discusses a software architecture that simplifies the spreadplot developer's task. The architecture addresses the two main problems in constructing a spreadplot: simplifying the layout of the plots and structuring the communication between them.

18.
Many observations about coalgebras were inspired by comparable situations for algebras. Despite the prominent role of prime algebras, the theory of a corresponding notion for coalgebras has not been well understood so far. Coalgebras C over fields may be called coprime provided the dual algebra C* is prime. This definition, however, is not intrinsic: it strongly depends on the base ring being a field. The purpose of the article is to provide a better understanding of related notions for coalgebras over commutative rings by employing traditional methods from (co)module theory, in particular (pre)torsion theory.

By dualizing the classical primeness condition, coprimeness can be defined for modules and algebras. These notions are developed for modules and then applied to comodules. We consider prime and coprime, fully prime and fully coprime, strongly prime and strongly coprime modules and comodules. In particular, we obtain various characterisations of prime and coprime coalgebras over rings and fields.

19.
Boxplots are useful displays that convey rough information about the distribution of a variable. Boxplots were designed to be drawn by hand and work best for small datasets, where detailed estimates of tail behavior beyond the quartiles may not be trustworthy. Larger datasets afford more precise estimates of tail behavior, but boxplots do not take advantage of this precision, instead presenting large numbers of extreme, though not unexpected, observations. Letter-value plots address this problem by including more detailed information about the tails using “letter values,” an order statistic defined by Tukey. Boxplots display the first two letter values (the median and quartiles); letter-value plots display further letter values so far as they are reliable estimates of their corresponding quantiles. We illustrate letter-value plots with real data that demonstrate their usefulness for large datasets. All graphics are created using the R package lvplot, and code and data are available in the supplementary materials.
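To make "letter values" concrete, here is a minimal sketch of Tukey's depth recursion: the median sits at depth (n + 1) / 2, and each further letter value sits at depth (floor(previous depth) + 1) / 2, counted in from both ends of the sorted data. This is not the lvplot implementation. The function name `letter_values` is hypothetical, and the stopping rule used here (continue until depth 1) is a simplification; the abstract says the plots stop where the estimates cease to be reliable, and that rule is not reproduced.

```python
import math

# Tukey's conventional labels: M = median, F = fourths, E = eighths, and so on.
LABELS = "MFEDCBAZYX"


def letter_values(data):
    """Return a list of (label, depth, lower_value, upper_value) tuples."""
    xs = sorted(data)
    n = len(xs)
    if n == 0:
        raise ValueError("data must be non-empty")

    def pair_at(depth):
        # Depths are 1-based; a half-integer depth averages the two neighbours.
        lo, hi = math.floor(depth), math.ceil(depth)
        lower = (xs[lo - 1] + xs[hi - 1]) / 2
        upper = (xs[n - lo] + xs[n - hi]) / 2
        return lower, upper

    out = []
    depth = (n + 1) / 2
    for label in LABELS:
        lower, upper = pair_at(depth)
        out.append((label, depth, lower, upper))
        if depth <= 1:
            break  # simplified stopping rule: the extremes have been reached
        depth = (math.floor(depth) + 1) / 2
    return out


if __name__ == "__main__":
    for row in letter_values(range(1, 10)):
        print(row)
```

For the nine values 1 through 9, the first two rows are the median (5) and the fourths (3 and 7), which are the quantities a boxplot shows; the remaining rows are the extra tail detail a letter-value plot adds.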

20.
Abstract

Visualization is a critical technology for understanding complex, data-rich systems. Effective visualizations make important features of the data immediately recognizable and enable the user to discover interesting and useful results by highlighting patterns. A key element of such systems is the ability to interact with displays of data by selecting a subset for further investigation. This operation is needed for use in linked views systems and in drill-down analysis. It is a common manipulation in many other systems and is as ubiquitous as selecting icons in a desktop graphical user interface (GUI). It is therefore surprising to note that little research has been done on how selection can be implemented. This article addresses this omission, presenting a taxonomy for selection mechanisms and discussing the interactions between branches of the taxonomy.
