Similar literature
1.
More than 50 years ago, John Tukey called for a reformation of academic statistics. In “The Future of Data Analysis,” he pointed to the existence of an as-yet unrecognized science, whose subject of interest was learning from data, or “data analysis.” Ten to 20 years ago, John Chambers, Jeff Wu, Bill Cleveland, and Leo Breiman independently once again urged academic statistics to expand its boundaries beyond the classical domain of theoretical statistics; Chambers called for more emphasis on data preparation and presentation rather than statistical modeling; and Breiman called for emphasis on prediction rather than inference. Cleveland and Wu even suggested the catchy name “data science” for this envisioned field. A recent and growing phenomenon has been the emergence of “data science” programs at major universities, including UC Berkeley, NYU, MIT, and most prominently, the University of Michigan, which in September 2015 announced a $100M “Data Science Initiative” that aims to hire 35 new faculty. Teaching in these new programs has significant overlap in curricular subject matter with traditional statistics courses; yet many academic statisticians perceive the new programs as “cultural appropriation.” This article reviews some ingredients of the current “data science moment,” including recent commentary about data science in the popular media, and about how/whether data science is really different from statistics. The now-contemplated field of data science amounts to a superset of the fields of statistics and machine learning, which adds some technology for “scaling up” to “big data.” This chosen superset is motivated by commercial rather than intellectual developments. Choosing in this way is likely to miss out on the really important intellectual event of the next 50 years. 
Because all of science itself will soon become data that can be mined, the imminent revolution in data science is not about mere “scaling up,” but instead the emergence of scientific studies of data analysis science-wide. In the future, we will be able to predict how a proposal to change data analysis workflows would impact the validity of data analysis across all of science, even predicting the impacts field-by-field. Drawing on work by Tukey, Cleveland, Chambers, and Breiman, I present a vision of data science based on the activities of people who are “learning from data,” and I describe an academic field dedicated to improving that activity in an evidence-based manner. This new field is a better academic enlargement of statistics and machine learning than today’s data science initiatives, while being able to accommodate the same short-term goals. Based on a presentation at the Tukey Centennial Workshop, Princeton, NJ, September 18, 2015.

2.
An enduring concern among science education researchers is the “swing away from science” (Osborne, 2003). One of their central dilemmas is to identify, or construct, a valid outcome measure that could assess curricular effectiveness, and predict students' choices of science courses, university majors, or careers in science. Many instruments have been created and variably evaluated. The primary purpose of this paper was to re‐evaluate the psychometric properties of the Image of Science and Scientists Scale (ISSS) (Krajkovich 1978). In the current study, confirmatory factor analysis (CFA) was used to examine the dimensionality of the 29‐item ISSS, which was administered to 531 middle school students in three San Antonio, Texas school districts at the beginning of the 2004–2005 school year. The results failed to confirm the presumed 1‐factor structure of the ISSS, but instead showed a 3‐factor structure with only marginal fit with the data, even after removal of 12 inadequate items. The three dimensions were “Positive Images of Scientists” (5 items), “Negative Images of Scientists” (9 items), and “Science Avocation” (3 items). The results do not support use of the original form of the ISSS for measuring “attitudes toward science,” “images of scientists,” or “scientific attitudes.” Shortening the scale from 29 to 17 items makes it more feasible to use in a classroom setting. Determining whether the three dimensions identified in our analysis, “Positive Images of Scientists,” “Negative Images of Scientists,” and “Science Avocation,” contain useful assessments of middle school student impressions and attitudes will require independent investigation in other samples.

3.
Statistical software provides essential support for statisticians and others who are analyzing data or doing research on new statistical techniques. Those supported typically regard themselves as “users” of the software, but as soon as they need to express their own ideas computationally, they in fact become “programmers.” Nothing is more important for the success of statistical software than enabling this transition from user to programmer, and on to gradually more ambitious software design. What does the user need? How can the design of statistical software help? This article presents a number of suggestions based on past experience and current research. The evolution of the S system reflects some of these opinions. Work on the Omegahat software provides a promising direction for future systems that reflect similar motivations.

4.
This article presents an example of how computer software can be used to facilitate collaborative learning and the integration of mathematics and science. “Snap shots” from a pilot project, with thirty high school juniors who were involved in a university summer program, reveal how student-centered learning is facilitated by technology. This exploratory trial provides a glimpse of what the “classroom after next” might look like utilizing groupware in instructional settings.

5.
Networks are ubiquitous in science. They have also become a focal point for discussion in everyday life. Formal statistical models for the analysis of network data have emerged as a major topic of interest in diverse areas of study, and most of these involve a form of graphical representation. Probability models on graphs date back to 1959. Along with empirical studies in social psychology and sociology from the 1960s, these early works generated an active “social science network community” and a substantial literature in the 1970s. This effort moved into the statistical literature in the late 1970s and 1980s, and the past decade has seen a burgeoning network literature coming out of statistical physics and computer science. In particular, the growth of the World Wide Web and the emergence of online “networking communities” such as Facebook, Google+, MySpace, LinkedIn, and Twitter, and a host of more specialized professional network communities have intensified interest in the study of networks and network data. This article reviews some of these developments, introduces some relevant statistical models for static network settings, and briefly points to open challenges.

6.
The study reported in this paper investigated perceptions concerning connections between mathematics and science held by university/college instructors who participated in the Maryland Collaborative for Teacher Preparation (MCTP), an NSF-funded program aimed at developing special middle-level mathematics and science teachers. Specifically, we asked (a) “What are the perceptions of MCTP instructors about the ‘other’ discipline?” (b) “What are the perceptions of MCTP instructors about the connections between mathematics and science?” and (c) “What are some barriers perceived by MCTP instructors in implementing mathematics and science courses that emphasize connections?” The findings suggest that the benefits of emphasizing mathematics and science connections perceived by MCTP instructors were similar to the benefits reported by school teachers. The barriers reported were also similar. The participation in the project appeared to have encouraged MCTP instructors to grapple with some fundamental questions, like “What should be the nature of mathematics and science connections?” and “What is the nature of mathematics/science in relationship to the other discipline?”

7.
Evaluating the attitudes of science students is important for teachers, curriculum developers, and those working with preservice teachers. Although in the United States a great deal of attitudinal research has been conducted with regard to science education, in the People's Republic of China very little work concerning science attitudes has been completed. This study will report on an evaluation of Chinese boys' and girls' attitudes toward selected science topics. Students attended a middle school in the city of Shanghai. Analysis indicated that when the male and female Chinese students differ in their response patterns, females select more intense responses (“strongly agree” as opposed to “agree,” “strongly disagree” as opposed to “disagree”). Furthermore, the surveyed females often selected responses suggesting that they were more interested in the science topics and issues presented in the survey.

8.
In this article, we have considered the role of the chair in leading the learning necessary for a department to become effective in the teaching and learning of science from a reformed perspective. We conceptualize the phrase “leading learning” to mean the chair's constitution of influence, power, and authority to intentionally impact the conceptual, pedagogical, cultural, and political aspects of teachers’ work. The data for this article are based on our ongoing work with one science department, over the past nine years, and have been woven into a longitudinal narrative study of a chair who has led the learning of an effective department since 2000. In considering the data, we can reach two major conclusions. First, for a chair to lead learning is to build a professional commitment to a vision of science education, not a particular program. Second, in leading learning, chairs afford opportunities for teacher empowerment. This affordance, however, is only half the issue. It is commitment to a vision that drives a desire to take advantage of opportunities as they arise. In leading learning that reflects changes in the broader science education community, learning opportunities are opened beyond the department.

9.
For semiparametric survival models with interval-censored data and a cure fraction, it is often difficult to derive nonparametric maximum likelihood estimation due to the challenge in maximizing the complex likelihood function. In this article, we propose a computationally efficient EM algorithm, facilitated by a gamma-Poisson data augmentation, for maximum likelihood estimation in a class of generalized odds rate mixture cure (GORMC) models with interval-censored data. The gamma-Poisson data augmentation greatly simplifies the EM estimation and enhances the convergence speed of the EM algorithm. The empirical properties of the proposed method are examined through extensive simulation studies and compared with numerical maximum likelihood estimates. An R package “GORCure” is developed to implement the proposed method and its use is illustrated by an application to the Aerobic Center Longitudinal Study dataset. Supplementary material for this article is available online.

10.
Recently a new statistical methodology, developed over the last three decades, has become available to practitioners. This methodology is called “ranking and selection” theory. In this article we review procedures for completely ranking a set of populations (from “best”, “second best”, etc., down to “worst”); we also give new tables needed to implement these procedures, and we consider several practical examples using real data.

11.
Martingales in the limit (mils) were introduced about two decades ago as nontrivial extensions of martingales. It was proved in 1976 that they have good convergence properties (at least) for real-valued stochastic processes. So far, however, no “real-life” applications of mils have been found. In this article, we apply the full generality of mils to a problem in information science. There we study the evolution in time of source journals as, e.g., defined by the Institute for Scientific Information (ISI), which selects, on a yearly basis, the most “visible” journals in the world. In this connection one also encounters quasi-martingales.

12.
Methods for analyzing or learning from “fuzzy data” have attracted increasing attention in recent years. In many cases, however, existing methods (for precise, non-fuzzy data) are extended to the fuzzy case in an ad-hoc manner, and without carefully considering the interpretation of a fuzzy set when being used for modeling data. Distinguishing between an ontic and an epistemic interpretation of fuzzy set-valued data, and focusing on the latter, we argue that a “fuzzification” of learning algorithms based on an application of the generic extension principle is not appropriate. In fact, the extension principle fails to properly exploit the inductive bias underlying statistical and machine learning methods, although this bias, at least in principle, offers a means for “disambiguating” the fuzzy data. Alternatively, we therefore propose a method which is based on the generalization of loss functions in empirical risk minimization, and which performs model identification and data disambiguation simultaneously. Elaborating on the fuzzification of specific types of losses, we establish connections to well-known loss functions in regression and classification. We compare our approach with related methods and illustrate its use in logistic regression for binary classification.

13.
14.
The main purpose of this study was to investigate the effectiveness of a primary teacher education program in improving science teaching efficacy beliefs (personal science teaching efficacy beliefs and outcome expectancy beliefs) of preservice primary school teachers. The study also investigated whether the program has an effect on student teachers' attitudes toward science. Data were collected by administering the “Science Teaching Efficacy Beliefs Instrument” and “Attitudes toward Science Scale” to 282 preservice primary teachers (147 freshmen, 135 seniors). Statistical techniques such as means and t‐tests were used to analyze the data. Results of the study showed that the primary teacher education program has a medium positive effect on science teaching efficacy beliefs of the primary preservice teachers (t = 4.791, p = .000) and that there were no gender differences in terms of efficacy beliefs. Results also indicated that preservice primary teachers' attitudes toward science were moderately positive and differed by class level. Fourth‐year preservice teachers' attitudes toward science were found to be significantly more positive than those of first‐year preservice teachers (t = 5.494, p = .000). There were no gender differences in attitudes toward science.

15.
To handle the ubiquitous problem of “dependence learning,” copulas are quickly becoming a pervasive tool across a wide range of data‐driven disciplines encompassing neuroscience, finance, econometrics, genomics, social science, machine learning, healthcare, and many more. At the same time, despite their practical value, the empirical methods of “learning copula from data” have been unsystematic and full of case‐specific recipes. Taking inspiration from modern LP‐nonparametrics, this paper presents a modest contribution to the need for a more unified and structured approach to copula modeling that is simultaneously valid for arbitrary combinations of continuous and discrete variables.

16.
The goal of this article is to inform professional understanding regarding preservice science teachers’ knowledge of engineering and the engineering design process. Originating as a conceptual study of the appropriateness of “knowledge as design” as a framework for conducting science teacher education to support learning related to engineering design, the findings are informed by an ongoing research project. Perkins’s theory encapsulates knowledge as design within four complementary components of the nature of design. When using the structure of Perkins’s theory as a framework for analysis of data gathered from preservice teachers conducting engineering activities within an instructional methods course for secondary science, a concurrence between teacher knowledge development and the theory emerged. Initially, the individuals who participated in the research were unfamiliar with engineering as a component of science teaching and expressed a lack of knowledge of engineering. Connections between Perkins’s theory of knowledge as design and knowledge development for teaching emerged when examining preservice teachers’ development of creative and systematic thinking skills within the context of engineering design activities, as well as their knowledge of the application of science to problem‐solving situations.

17.
In this article, we investigate the lower bound of life-span of classical solutions of the hyperbolic geometry flow equations in several space dimensions with “small” initial data. We first present some estimates on solutions of linear wave equations in several space variables. Then, we derive a lower bound of the life-span of the classical solutions to the equations with “small” initial data.

18.
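Starting from the current state of statistical information work in China, this article puts forward the notions of statistics of the first kind and statistics of the second kind. It also argues that, for statistical information work to develop and flourish rapidly, the two kinds of statistics must first interpenetrate, and second, they must be closely combined with computer technology, so as to build statistical information highways at every level and of every type and enable statistical information to serve the overall development of China's science and technology, economy, and society more effectively.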

19.
20.
In many domains, data now arrive faster than we are able to mine them. To avoid wasting these data, we must switch from the traditional “one-shot” data mining approach to systems that are able to mine continuous, high-volume, open-ended data streams as they arrive. In this article we identify some desiderata for such systems, and outline our framework for realizing them. A key property of our approach is that it minimizes the time required to build a model on a stream while guaranteeing (as long as the data are iid) that the model learned is effectively indistinguishable from the one that would be obtained using infinite data. Using this framework, we have successfully adapted several learning algorithms to massive data streams, including decision tree induction, Bayesian network learning, k-means clustering, and the EM algorithm for mixtures of Gaussians. These algorithms are able to process on the order of billions of examples per day using off-the-shelf hardware. Building on this, we are currently developing software primitives for scaling arbitrary learning algorithms to massive data streams with minimal effort.
