Similar Literature
20 similar documents found.
1.
One problem common to many fields is knowledge discovery in heterogeneous, high-dimensional data. In text mining, for example, an analyst often wishes to identify meaningful, implicit, and previously unknown information in an unstructured corpus. The lack of metadata and the complexity of document space make this task difficult. We describe Iterative Denoising, a methodology for knowledge discovery in large heterogeneous datasets that allows a user to visualize and discover potentially meaningful relationships and structures. In addition, we demonstrate the features of this methodology in the analysis of a heterogeneous Science News corpus.

2.
The insurance industry is concerned with many problems of interest to the operational research community. This paper presents a case study involving two such problems and solves them using a variety of techniques within a data mining methodology. The first problem is understanding customer retention patterns by classifying policy holders as likely to renew or terminate their policies. The second is better understanding claim patterns and identifying the types of policy holders who are more at risk. Each of these problems affects decisions about premium pricing, which directly affects profitability. The data mining methodology views the knowledge discovery process within a holistic framework, utilising hypothesis testing, statistics, clustering, decision trees, and neural networks at various stages. The impacts of the case study on the insurance company are discussed.

3.
Pathology ordering by general practitioners (GPs) is a significant contributor to rising health care costs, both in Australia and worldwide. A thorough understanding of the nature and patterns of pathology utilization is an essential requirement for effective decision support for pathology ordering. In this paper a novel methodology integrating data mining and case-based reasoning for pathology-ordering decision support is proposed. It is demonstrated how this methodology can facilitate intelligent decision support that is both patient-oriented and deeply rooted in practical peer-group evidence. Comprehensive data collected by professional pathology companies provide a system-wide profile of patient-specific pathology requests by various GPs, as opposed to profiles limited to an individual GP practice. Using real data provided by XYZ Pathology Company in Australia, containing more than 1.5 million records of pathology requests by GPs, we illustrate how knowledge extracted from these data through data mining with Kohonen's self-organizing maps constitutes a base that, with the further assistance of modern data visualization tools and on-line processing interfaces, can provide "peer-group consensus" evidence for solving new cases of the pathology test ordering problem. The conclusion is that a formal methodology integrating case-based reasoning principles, which are inherently close to GPs' daily practice, with data-driven, computationally intensive knowledge discovery mechanisms, which can be applied to the massive amounts of pathology request data routinely available at professional pathology companies, can facilitate more informed, evidential decision making by doctors in the area of pathology ordering.
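The clustering step described above uses Kohonen's self-organizing map. As a hedged illustration only (not the authors' system: the grid size, learning schedule, and toy data below are invented for the sketch), a minimal SOM in plain Python might look like:

```python
import math
import random

def train_som(data, rows=4, cols=4, epochs=60, lr0=0.5, sigma0=1.5, seed=0):
    """Train a minimal Kohonen self-organizing map on `data` (list of vectors)."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
               for _ in range(rows)]
    for epoch in range(epochs):
        lr = lr0 * (1 - epoch / epochs)               # decaying learning rate
        sigma = sigma0 * (1 - epoch / epochs) + 1e-3  # shrinking neighbourhood
        for x in data:
            bi, bj = bmu_of(weights, x)               # best-matching unit
            for i in range(rows):
                for j in range(cols):
                    # Neighbourhood function: nodes near the BMU move more.
                    gd2 = (i - bi) ** 2 + (j - bj) ** 2
                    h = lr * math.exp(-gd2 / (2 * sigma * sigma))
                    w = weights[i][j]
                    for k in range(dim):
                        w[k] += h * (x[k] - w[k])
    return weights

def bmu_of(weights, x):
    """Return grid coordinates of the node whose weight vector is closest to x."""
    best, best_d = (0, 0), float("inf")
    for i, row in enumerate(weights):
        for j, w in enumerate(row):
            d = sum((wk - xk) ** 2 for wk, xk in zip(w, x))
            if d < best_d:
                best, best_d = (i, j), d
    return best
```

After training on two well-separated groups of request profiles, records from different groups map to different grid nodes, which is what makes the map usable for "peer-group" visualization.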

4.
To address the data redundancy and inefficiency in data sharing between microscopic traffic simulation and GIS, a data-sharing framework based on ontology theory is proposed, together with a layered design of the framework. The characteristics of microscopic traffic simulation data are summarized and abstracted, and a microscopic traffic simulation ontology is studied and designed. An iterative construction method for the microscopic traffic simulation ontology (MTSON) is proposed, with emphasis on the requirements acquisition, analysis, design, and implementation of the ontology. Finally, the resulting ontology is expressed in OWL as the formal description language, laying a foundation for the smooth implementation of data sharing between microscopic traffic simulation and GIS.

5.
To achieve high productivity in a flexible manufacturing system (FMS), an efficient layout arrangement and material flow path design are important because of the large percentage of product cost related to material handling. The layout design problem addressed in this paper has departments with fixed shapes and pick-up/drop-off points. It is an open-field type layout with a single-loop directed flow path. A two-step heuristic is proposed to solve the problem. It first solves a traditional block layout with a directed-loop flow path to minimize material handling costs, using a combined space-filling curve and simulated annealing algorithm. The second step uses the resulting flow sequence and relative positioning information from the first step as input to solve the detailed FMS layout, which includes the spatial coordinates and orientation of each FMS cell. This detailed FMS layout problem is formulated and solved as a mixed integer program. Empirical illustrations show promising results for the proposed methodology on real-world type problems.
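The first step above relies on simulated annealing over department orderings. The following is a minimal, illustrative sketch of annealing a single-loop layout (the cost model, cooling schedule, and toy flow data are assumptions for the sketch, not the paper's formulation):

```python
import math
import random

def loop_cost(order, flow):
    """Material-handling cost on a single directed loop: moving from the
    position of a to the position of b costs the forward distance (mod n)."""
    n = len(order)
    pos = {d: i for i, d in enumerate(order)}
    return sum(f * ((pos[b] - pos[a]) % n) for (a, b), f in flow.items())

def anneal_layout(depts, flow, t0=10.0, cooling=0.995, steps=20000, seed=0):
    """Simulated annealing over loop orderings using pairwise swaps."""
    rng = random.Random(seed)
    order = list(depts)
    cur = best = loop_cost(order, flow)
    best_order = order[:]
    t = t0
    for _ in range(steps):
        i, j = rng.sample(range(len(order)), 2)
        order[i], order[j] = order[j], order[i]      # propose a swap
        new = loop_cost(order, flow)
        # Accept improvements always; accept worse moves with Boltzmann probability.
        if new <= cur or rng.random() < math.exp((cur - new) / t):
            cur = new
            if cur < best:
                best, best_order = cur, order[:]
        else:
            order[i], order[j] = order[j], order[i]  # reject: undo the swap
        t *= cooling
    return best_order, best
```

On a cyclic flow A→B→C→D→A of 10 units each, any ordering that preserves the cycle direction attains the minimum cost of 40.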

6.
A New Method for Discretizing Continuous Attributes in Data Mining
In the fields of knowledge discovery and machine learning, many data mining methods, such as rough-set-based tools, require discrete attribute values, yet most observed data are continuous, which complicates research on many new data mining tools. To address this problem, and building on a comprehensive analysis of existing discretization methods, this paper proposes a new method for discretizing continuous attributes based on the distribution characteristics of the data. The algorithm is validated on classic examples, and the experimental results show that the method is reasonable and feasible.
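The abstract above does not specify its distribution-based algorithm in detail; as a hedged stand-in, a simple equal-frequency discretization (which also adapts cut points to the data's distribution) can be sketched as:

```python
def equal_frequency_bins(values, k):
    """Cut a continuous attribute into k intervals containing roughly equal
    numbers of observations (a simple distribution-aware discretization).
    Returns the k-1 interior cut points."""
    s = sorted(values)
    n = len(s)
    return [s[(i * n) // k] for i in range(1, k)]

def discretize(value, cuts):
    """Map a continuous value to the index of its interval (0..len(cuts))."""
    label = 0
    for c in cuts:
        if value >= c:
            label += 1
    return label
```

For 100 uniformly spread values and k = 4, the cut points fall at the empirical quartiles, so each resulting interval holds about a quarter of the data.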

7.
Product family design takes advantage of modularity to enable product variety while maintaining mass production efficiency. Focusing on a set of similar product variants, product family modularity (PFM) is achieved by reusing common components and minimizing fulfillment costs throughout the product realization process. Traditional modular design, on the other hand, emphasizes technical system modularity (TSM), which focuses on a single product and is geared towards product decomposition in light of technical feasibility. While it is appealing to incorporate product family considerations into prevailing modularization theories and methods, the key challenge is that TSM and PFM are associated with different goals and decision criteria. This leads to a dilemma in which TSM and PFM compete in decision making when modules are identified by grouping similar components. Recognizing the importance of game-theoretic decision making underlying product-family-driven modular design, this paper proposes to leverage TSM and PFM within a coherent framework of joint optimization. A hierarchical game joint optimization model is developed in line with bilevel programming. A two-dimensional taxonomy of evaluation criteria is presented for measuring TSM and PFM. A bilevel nested genetic algorithm is put forward for efficient solution of the non-linear hierarchical joint optimization model. A case study of robotic vacuum cleaner modular design is reported to gain insight into the joint optimization of TSM and PFM. Results and analyses demonstrate that the proposed hierarchical joint optimization model is robust and can empower modular design in cohesion with product family concerns.

8.
A general methodology to optimize the weight of power transmission structures is presented in this article. The methodology is based on the simulated annealing algorithm defined by Kirkpatrick in the early '80s. This algorithm is a stochastic approach that explores and analyzes solutions that do not improve the objective function, in order to search the design region more thoroughly and to reach the global optimum. The proposed algorithm handles both the discrete behavior of the sectional variables of each element and the continuous behavior of the general geometry variables. Thus, an optimization methodology is developed that can deal with a mixed optimization problem including both continuous and discrete design variables. In addition, it does not require studying all possible design combinations defined by the discrete design variables. In practical applications the proposed algorithm usually requires a large number of simulations (structural analyses in this case). The authors have therefore developed first-order Taylor expansions and the associated first-order sensitivity analysis in order to reduce the CPU time required. Exterior penalty functions have also been included to deal with the design constraints. The resulting methodology allows real power transmission structures to be optimized in acceptable CPU time.

9.
10.
Data mining refers to the process of effectively discovering knowledge from the massive information stored in large databases, and its efficiency depends mainly on the algorithm underlying the search mechanism. In view of this, an efficient global optimization search algorithm combining individual immunity and population evolution mechanisms is proposed: a generalized rule induction algorithm based on immune programming. Unlike existing algorithms, the generalized rule induction algorithm does not merely aim at discovering classification information; instead, it uses background theory and prior knowledge to balance knowledge representation against run-time efficiency, focusing on the discovery of new knowledge and the prediction of high-level rules. Theoretical analysis and simulation experiments show that the algorithm favors the relative stability of the evolving population, improves overall performance, and maintains high accuracy during rule extraction.

11.
Nowadays, with the volume of data growing at an unprecedented rate, large-scale data mining and knowledge discovery have become a new challenge. Rough set theory has been successfully applied to knowledge acquisition in data mining. The recently introduced MapReduce technique has received much attention from both the scientific community and industry for its applicability to big data analysis. To mine knowledge from big data, we present parallel large-scale rough-set-based methods for knowledge acquisition using MapReduce. We implemented them on several representative MapReduce runtime systems: Hadoop, Phoenix and Twister, and report performance comparisons on these systems. The experimental results show that (1) the computational time is usually lowest on Twister for the same number of cores; (2) Hadoop has the best speedup for larger data sets; and (3) Phoenix has the best speedup for smaller data sets. The excellent speedups also demonstrate that the proposed parallel methods can effectively process very large data sets on different runtime systems. Pitfalls and advantages of these runtime systems are also illustrated through our experiments, which can help users decide which runtime system to use in their applications.
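The core rough-set computation that such methods parallelize is forming equivalence classes of the indiscernibility relation, which fits the map/reduce pattern naturally. A hedged, single-machine simulation of the two phases (the record layout and attribute names are invented for the sketch; a real deployment would run on Hadoop, Phoenix or Twister):

```python
from collections import defaultdict

def map_phase(records, cond_attrs):
    """Map: emit (condition-attribute signature, record id) pairs."""
    for rid, rec in records:
        key = tuple(rec[a] for a in cond_attrs)
        yield key, rid

def reduce_phase(pairs):
    """Reduce: group ids by signature, giving the equivalence classes
    of the indiscernibility relation over the condition attributes."""
    groups = defaultdict(set)
    for key, rid in pairs:
        groups[key].add(rid)
    return dict(groups)

def lower_approximation(records, cond_attrs, dec_attr, dec_value):
    """Union of the equivalence classes wholly contained in the decision class."""
    classes = reduce_phase(map_phase(records, cond_attrs))
    target = {rid for rid, rec in records if rec[dec_attr] == dec_value}
    return set().union(*[c for c in classes.values() if c <= target])
```

Because each mapper only touches its own split and each reducer only its own key group, the same logic shards cleanly across workers, which is what yields the speedups reported above.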

12.
This paper presents an application of knowledge discovery via rough sets to a real-life case study of global investing risk in 52 countries using 27 indicator variables. The aim is to explain the classification of the countries according to financial risks assessed by Wall Street Journal international experts, and to discover knowledge from data via decision rule mining, rather than prediction; i.e. to capture the explicit or implicit knowledge or policy of international financial experts rather than to predict the actual classifications. Suggestions are made about the most significant attributes for each risk class and country, as well as the minimal set of decision rules needed. Our results compared favorably with those from discriminant analysis and several variations of preference disaggregation MCDA procedures. The same approach could be adapted to other problems with missing data in data mining, knowledge extraction, and different multi-criteria decision problems, such as sorting, choice and ranking.

13.
Because supplier selection directly affects a firm's final profit, it has long been an important decision problem for enterprises. In previous studies, supplier selection was considered only from the perspective of individual product components rather than the product as a whole. Moreover, traditional supplier selection takes place in the production stage, after product design. Considering supplier selection in the early product design stage, however, can effectively avoid shortages of suitable suppliers. This paper proposes a product-platform-based multi-objective supplier preselection method and, in the early design stage, builds an optimization model from the perspective of the whole product that minimizes product-family outsourcing cost, product-family production risk, and supplier delivery time, helping decision makers improve the overall product design early in development. Because a product platform involves component sharing, the model also accounts for the effect of component sharing on the preselection results. The non-dominated sorting genetic algorithm (NSGA-II) is used to solve the model, and a real case illustrates the soundness and effectiveness of the proposed method and algorithm.
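The solver named above, NSGA-II, is built around non-dominated sorting of candidate solutions by their objective vectors. A minimal illustrative sketch of that first phase (minimization; the objective values below are toy data, not the paper's model):

```python
def dominates(u, v):
    """u dominates v if u is no worse in every objective and strictly
    better in at least one (all objectives minimized)."""
    return all(a <= b for a, b in zip(u, v)) and any(a < b for a, b in zip(u, v))

def non_dominated_sort(points):
    """Split objective vectors into successive Pareto fronts: front 1 holds
    the solutions dominated by nobody, front 2 those dominated only by
    front 1, and so on. Returns lists of indices into `points`."""
    remaining = list(range(len(points)))
    fronts = []
    while remaining:
        front = [i for i in remaining
                 if not any(dominates(points[j], points[i])
                            for j in remaining if j != i)]
        fronts.append(front)
        remaining = [i for i in remaining if i not in front]
    return fronts
```

In a full NSGA-II, the front index (plus a crowding-distance tie-breaker) drives selection, so the population converges toward the Pareto frontier of cost, risk, and delivery time simultaneously.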

14.
Deep neural networks (DNNs) have emerged as a state-of-the-art tool in very different research fields, owing to their ability to adapt to the decision space without presupposing any linear relationship in the data. Among the main disadvantages of these models are that the choice of the underlying network architecture profoundly influences performance and that architecture design requires prior knowledge of the field of study. The use of questionnaires is widespread in the social and behavioral sciences. The main contribution of this work is to automate the design of a DNN architecture by using an agglomerative hierarchical algorithm that mimics the conceptual structure of such surveys. Although the network is trained for regression, it is easily adapted to classification tasks. The proposed methodology is tested on a database containing socio-demographic data and responses to five psychometric Likert scales related to the prediction of happiness. These scales have previously been used to design a DNN architecture based on the subdimensions of the scales. We show that our new network configurations outperform the previously existing DNN architectures.
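The agglomerative hierarchical step mentioned above can be illustrated with a minimal single-linkage clustering (a hedged sketch: the items, distance function, and stopping criterion are assumptions, not the paper's exact algorithm, which groups questionnaire items to shape network layers):

```python
def single_linkage(items, dist, k):
    """Agglomerative clustering: start with singleton clusters, repeatedly
    merge the two clusters whose closest members are nearest (single
    linkage), and stop when k clusters remain. Returns lists of indices."""
    clusters = [[i] for i in range(len(items))]
    while len(clusters) > k:
        best = None  # (distance, cluster index a, cluster index b)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(dist(items[i], items[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] += clusters[b]   # merge b into a
        del clusters[b]
    return clusters
```

Applied to correlated questionnaire items, each resulting cluster would suggest one group of inputs feeding a shared hidden unit block, mimicking the survey's subdimension structure.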

15.
Worst-Case Tolerance Design and Quality Assurance via Genetic Algorithms
In many engineering designs, several components are placed together in a mechanical assembly. Due to manufacturing variations, there is a tolerance associated with the nominal dimension of each component in the assembly. The goal of worst-case tolerance analysis is to determine the effect of the smallest and largest assembly dimensions on product performance. Furthermore, to achieve product quality and robustness, designers must ensure that product performance variation is minimal. Recently, genetic algorithms (GAs) have gained a great deal of attention in the field of tolerance design. The main strength of GAs lies in their ability to perform a directed random search effectively in a large space of design solutions and produce optimal results. However, the simultaneous treatment of tolerance analysis and robust design for quality assurance via genetic algorithms has been marginal. In this paper, we introduce a new GA-based method that addresses both worst-case tolerance analysis of mechanical assemblies and robust design. A novel formulation based on manufacturing capability indices allows the GA to rank candidate designs by varying the tolerances around the nominal design parameter values. Standard genetic operators are then applied to ensure that the product performance measure exhibits minimal variation from the desired target value. Computational results for the design of a clutch assembly highlight the advantages of the proposed methodology.
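The evaluation inside such a GA is the worst-case stack-up of each candidate design. A hedged one-dimensional sketch (the sign convention and example dimensions are assumptions for illustration, not the clutch assembly from the paper):

```python
def worst_case_stackup(nominals, tolerances, signs=None):
    """Worst-case tolerance analysis of a 1-D dimension chain: each component
    contributes its nominal dimension +/- its tolerance, with `signs`
    giving the direction (+1 adds material, -1 removes it, e.g. a gap).
    Returns the (smallest, largest) possible assembly dimension."""
    if signs is None:
        signs = [1] * len(nominals)
    nominal = sum(s * n for s, n in zip(signs, nominals))
    # In the worst case every tolerance conspires in the same direction.
    spread = sum(abs(s) * t for s, t in zip(signs, tolerances))
    return nominal - spread, nominal + spread
```

A GA wrapped around this function would vary the tolerance vector, score each candidate by a capability index computed from the resulting spread, and select for designs whose performance stays near the target.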

16.
In this paper, a new methodology is investigated to support the prioritization of the voices of customers gathered through various customer satisfaction surveys. The methodology consists of two key components: an innovative evidence-driven decision modelling framework for representing and transforming large data sets, and a generic reasoning-based decision support process for aggregating evidence to prioritize the voices of customers on the basis of the Evidential Reasoning (ER) approach. Methods and frameworks for data collection and representation via multiple customer satisfaction surveys are examined first, and the distinctive features of quantitative and qualitative survey data are analysed. Several novel yet natural and pragmatic rule-based functions are then proposed to transform survey data systematically and consistently from different measurement scales to a common scale, with the original features and profiles of the data preserved in the transformation process. These transformation functions are designed to mimic expert judgement processes and to be sufficiently flexible and rigorous that expert judgements and domain-specific knowledge can be taken into account naturally, systematically and consistently. The ER approach is used to synthesize quantitative and qualitative data under the uncertainty that can arise from missing data and ambiguous survey questions. A new generic method is also proposed for ranking the voices of customers based on qualitative measurement scales without having to quantify assessment grades to fixed numerical values. A case study using an Intelligent Decision System (IDS) illustrates the application of the decision modelling framework and decision support process to prioritizing the voices of customers for a world-leading car manufacturer.

17.
In this paper, a new fuzzy linear programming (FLP)-based methodology using a specific membership function, named the modified logistic membership function, is proposed. The modified logistic membership function is first formulated, and its flexibility is established by an analytical approach. The membership function is then tested for its performance through an illustrative example employing FLP. The developed FLP methodology provides confidence in applying it to a real-life industrial production planning problem. This approach to solving industrial production planning problems can provide feedback to the decision maker, implementer and analyst; in such cases, it can be called interactive FLP. There is also the possibility of designing a self-organizing fuzzy system for the product mix selection problem in order to find a satisfactory solution. The decision maker, analyst and implementer can incorporate their knowledge and experience to obtain the best outcome.
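The abstract above does not reproduce the exact form of the modified logistic membership function, so the following is only a generic logistic-shaped membership over a tolerance interval [a, b], with an assumed steepness parameter gamma, to illustrate the kind of function an FLP model would use:

```python
import math

def logistic_membership(x, a, b, gamma=13.8):
    """Generic logistic-shaped membership, decreasing from ~1 at x = a to ~0
    at x = b (the paper's 'modified' form is not reproduced here; gamma is
    an assumed steepness parameter, not a value from the paper)."""
    if b == a:
        raise ValueError("interval [a, b] must be non-degenerate")
    t = (x - a) / (b - a)                       # normalize x into [0, 1]
    return 1.0 / (1.0 + math.exp(gamma * (t - 0.5)))
```

In an FLP constraint, such a function would grade how acceptably a resource usage x falls between its fully satisfactory level a and its hard limit b, with full membership near a, half membership at the midpoint, and near-zero membership at b.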

18.
Granular Computing is an emerging conceptual and computing paradigm of information processing. A central notion is an information-processing pyramid with different levels of clarification. Each level is usually represented by 'chunks' of data, or granules, also known as information granules. Rough Set Theory is one of the most widely used methodologies for handling or defining granules. Ontologies are used to represent the knowledge of a domain for specific applications; a challenge is to define semantic knowledge at different levels of human-dependent detail. In this paper we propose four operations that provide several granular perspectives on a specific ontological commitment. These operations are then used to obtain various views of an ontology built with a rough-set approach. In particular, a rough methodology is introduced to construct a specific granular view of an ontology.

19.
20.
New product development involves several critical decisions. A key decision-making area in new product development is the evaluation of the viability and market potential of a new product. In the absence of any relevant historical data, companies ask the potential buyers of their products about their intentions to buy those products when assessing viability. Despite the popularity of using behavioral intentions to predict the market acceptance of new product ideas, both survey and empirical studies suggest that the accuracy of such predictions is usually very low. Although earlier case-based studies suggest that a number of factors can affect the quality of new product decisions, it remains empirically unclear how product knowledge and the type of new product might affect the predictive accuracy of intentions-based new product forecasting. This study utilized a longitudinal research design and empirically tested the hypotheses across two new products. The study first collected purchase intention data about the new products and then collected subsequent actual purchase data. The results of a series of hierarchical regression analyses comparing initial purchase intentions and subsequent actual behavior showed that while product knowledge is positively related to the predictive accuracy and consistency of intentions-based new product forecasting, product type is negatively related to them.
