Similar Documents
A total of 20 similar documents were found (search time: 15 ms).
1.
In the relational model of data, Rissanen's Theorem provides the basis for the usual normalization process based on decomposition of relations. However, many difficulties occur if information is incomplete in databases and nulls are required to represent missing or unknown data. We concentrate here on the notion of outer join and find some reasonable conditions to guarantee that outer join will also preserve the lossless join property for two relations. Next we provide a generalization of this result to several relations.
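As a minimal sketch of the two notions involved (the relation and the functional dependency A -> B are invented for illustration and are not the paper's examples), the pandas snippet below decomposes a relation, verifies the lossless-join property by recomposing it, and shows how an outer join pads dangling tuples with nulls.

```python
# Illustrative sketch, not the paper's construction: a relation r satisfying
# the FD A -> B is decomposed onto (A, B) and (A, C); the natural join of the
# projections reconstructs r exactly (lossless join).
import pandas as pd

r = pd.DataFrame({"A": [1, 1, 2], "B": ["x", "x", "y"], "C": [10, 20, 30]})

r1 = r[["A", "B"]].drop_duplicates()
r2 = r[["A", "C"]].drop_duplicates()

rejoined = r1.merge(r2, on="A")  # natural (inner) join on the shared attribute
lhs = rejoined.sort_values(["A", "C"]).reset_index(drop=True)
rhs = r.sort_values(["A", "C"]).reset_index(drop=True)
print(lhs.equals(rhs))  # True: no spurious and no lost tuples

# With a dangling tuple, an outer join preserves it by padding with nulls:
print(r1.merge(r2[r2["A"] != 2], on="A", how="outer"))
```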

2.
3.
In a variety of applications ranging from environmental and health sciences to bioinformatics, it is essential to recognize that data collected in large databases are generated stochastically. This poses qualitatively new problems both for statistics and for computer science: instead of deterministic (usually worst-case) analysis, average-case analysis is needed for many standard database problems. Because both stochastic and deterministic methods and notation are involved, investigating such problems and presenting the results entails additional difficulties. We consider a general class of probabilistic models for databases and study a few problems in a probabilistic framework. To demonstrate the general approach, problems for systems of database constraints (keys, functional dependencies, and related notions) are investigated in more detail. Our approach rests on the consistent use of Rényi entropy as the main characteristic of the uncertainty of a distribution, and on Poisson approximation (the Stein–Chen technique) of the corresponding probabilities.
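The central quantity can be made concrete. Below is a small sketch (function name and test distribution are illustrative, not the paper's code) of the Rényi entropy H_a(p) = log(sum_i p_i^a) / (1 - a), which reduces to Shannon entropy in the limit a -> 1.

```python
# Rényi entropy of a discrete distribution; the a -> 1 case falls back to
# Shannon entropy, matching the limit of the general formula.
import numpy as np

def renyi_entropy(p, alpha):
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                      # ignore zero-probability cells
    if np.isclose(alpha, 1.0):        # limit a -> 1 is Shannon entropy
        return -np.sum(p * np.log(p))
    return np.log(np.sum(p ** alpha)) / (1.0 - alpha)

p = [0.5, 0.25, 0.25]
print(renyi_entropy(p, 0.5), renyi_entropy(p, 1.0), renyi_entropy(p, 2.0))
```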

4.
Rough set theory gives a formal definition of knowledge and provides a rigorous set of data-analysis tools for handling uncertain and incomplete massive data. However, the algebraic formulation of rough set concepts and operations is often difficult to grasp. Addressing this, this paper introduces the information entropy of knowledge in a knowledge base, proves that certain information-theoretic representations of knowledge are equivalent to their algebraic counterparts, and finally discusses some properties of rough dynamical systems over knowledge bases.
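As a minimal sketch of the entropy notion involved (the table, attributes, and definitions below are illustrative assumptions, not taken from the paper): the knowledge induced by an attribute set is the partition of the universe into indiscernibility classes, and its information entropy is the entropy of the relative block sizes.

```python
# Partition a universe U by indiscernibility on chosen attributes, then
# compute the information entropy of the resulting knowledge (partition).
import math
from collections import defaultdict

def partition(U, table, attrs):
    blocks = defaultdict(list)
    for x in U:
        blocks[tuple(table[x][a] for a in attrs)].append(x)
    return list(blocks.values())

def knowledge_entropy(U, blocks):
    return -sum(len(b) / len(U) * math.log2(len(b) / len(U)) for b in blocks)

table = {1: {"color": "red", "size": "S"}, 2: {"color": "red", "size": "L"},
         3: {"color": "blue", "size": "S"}, 4: {"color": "red", "size": "S"}}
U = list(table)
blocks = partition(U, table, ["color", "size"])
print(blocks, knowledge_entropy(U, blocks))   # finer knowledge, higher entropy
```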

5.
Practical structures often operate with some degree of uncertainty, and the uncertainties are typically modelled as random parameters or interval parameters. For realistic predictions of a structure's behaviour and performance, structural models should account for these uncertainties. This paper deals with the time responses of engineering structures in the presence of random and/or interval uncertainties. Three uncertain structure models are introduced. The first is the random uncertain structure model, containing only random variables; the generalized polynomial chaos (PC) theory is applied to solve it. The second is the interval uncertain structure model, containing only interval variables; the Legendre metamodel (LM) method, based on Legendre polynomial expansion, is presented to solve it. The third is the hybrid uncertain structure model, with both random and interval variables; the polynomial-chaos-Legendre-metamodel (PCLM) method, a combination of PC and LM, is presented to solve it. Three engineering examples demonstrate the effectiveness of the proposed methods, covering uncertainties resulting from geometrical size, material properties, and external loads.
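A rough sketch of the interval-variable (LM) idea follows; the response function, interval bounds, and expansion order are illustrative assumptions, not the paper's examples.

```python
# Legendre-metamodel sketch: an interval variable b in [b_lo, b_hi] is mapped
# to xi in [-1, 1]; the response is approximated by a low-order Legendre
# expansion fitted by least squares, and response bounds over the interval
# are then read off the cheap surrogate.
import numpy as np
from numpy.polynomial import legendre

def response(b):                      # stand-in for an expensive structural solver
    return np.sin(b) / b

b_lo, b_hi = 1.0, 3.0
xi = np.linspace(-1.0, 1.0, 50)                       # training points
b = 0.5 * (b_hi + b_lo) + 0.5 * (b_hi - b_lo) * xi    # map to the interval
coeffs = legendre.legfit(xi, response(b), deg=4)      # 4th-order metamodel

xi_dense = np.linspace(-1.0, 1.0, 2001)
y = legendre.legval(xi_dense, coeffs)
print(y.min(), y.max())               # estimated interval bounds of the response
```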

6.
This paper describes the allocation of a wastewater treatment fund within a region based on a dynamic input-output model. Given the complexity of the input-output process, many indeterminate factors must be included in the model. For example, as machines age, unexpected losses are caused by the retention of raw materials during operation; since a sufficiently large amount of historical data exists, this can realistically be treated as a random variable. By contrast, factors such as temporary staff transfers or inexperienced operators can only be regarded as uncertain variables, because of a lack of historical data. First, the pollution control model is formulated in an uncertain environment that includes both human uncertainty and objective randomness. Second, an optimal control model subject to an uncertain random singular system is established; this model can be transformed into an equivalent optimization problem. To solve it, recurrence equations are presented based on Bellman's principle and are successfully applied to the optimal control problem in two special cases. Moreover, two algorithms are formulated for solving the pollution control problem. Finally, the optimal distribution strategies of the pollution control fund for controlling emissions of COD and NH3-N, two wastewater indicators in China, are obtained through the proposed algorithms.
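As a toy illustration of the Bellman recurrence mentioned above (the cost function, horizon, and fund size are invented for the sketch, and the uncertain random machinery is omitted), a fixed fund is split across periods by backward induction.

```python
# Backward-induction sketch: J_t(s) is the best achievable cost from period t
# with remaining fund s; g(u) is an assumed per-period emission cost for
# spending u. Illustrative only, not the paper's singular-system model.
def solve(T, fund, step=1):
    levels = range(0, fund + 1, step)
    g = lambda u: 100.0 / (1.0 + u)          # assumed diminishing-returns cost
    J = {s: 0.0 for s in levels}             # terminal condition J_T = 0
    policy = []
    for t in reversed(range(T)):
        Jt, pt = {}, {}
        for s in levels:
            best = min((g(u) + J[s - u], u) for u in levels if u <= s)
            Jt[s], pt[s] = best
        J = Jt
        policy.append(pt)
    return J[fund], list(reversed(policy))

cost, policy = solve(T=3, fund=10)
print(cost, [p[10] for p in policy[:1]])     # optimal cost and first-period spend
```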

7.
Numerical databases arise in many scientific applications to keep track of large dense and sparse matrices, which reside on secondary storage devices in matrix compact data representations. This paper describes a language-driven generalized data translator for translating any numerical database from one matrix compact data representation to another. Our approach is to describe any matrix compact data representation by a physical schema, and any numerical database and its mapping to storage by data-language facilities. The languages are processed by a Generalized Syntax-Directed Translation Scheme (GSDTS) to automatically generate FORTRAN conversion programs, which become the major modules of the translator.
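The translator itself emits FORTRAN conversion programs; as a minimal modern analogue of what such a conversion does (purely illustrative, using SciPy rather than anything from the paper), the sketch below translates a matrix from one compact representation, coordinate triples, to another, compressed sparse row.

```python
# Convert a matrix from COO (row, col, value triples) to CSR and inspect the
# three arrays that make up the target compact representation.
import numpy as np
from scipy.sparse import coo_matrix

rows = np.array([0, 0, 2])
cols = np.array([1, 2, 0])
vals = np.array([5.0, 8.0, 3.0])

A_coo = coo_matrix((vals, (rows, cols)), shape=(3, 3))
A_csr = A_coo.tocsr()                 # the target compact representation
print(A_csr.indptr, A_csr.indices, A_csr.data)
```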

8.
Uncertain random variables are used to describe the simultaneous appearance of uncertainty and randomness in a complex system. To model multi-objective decision-making problems with uncertain random parameters, this paper proposes a class of uncertain random optimization for decision systems, called uncertain random multi-objective programming. For solving such programs, notions of Pareto solutions and compromise solutions, as well as two compromise models, are defined. Some properties of these models are then investigated, and two equivalent deterministic mathematical programming models are presented under particular conditions. Numerical examples are given for illustration.
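A sketch of the compromise-model idea under strong simplifications: the two objectives below are fixed deterministic stand-ins for the expected objectives, and the weight vector is arbitrary; none of this is the paper's formulation.

```python
# Weighted-sum scalarization: minimizing w1*f1 + w2*f2 yields one
# Pareto-optimal compromise solution per weight vector.
import numpy as np
from scipy.optimize import minimize

def f1(x): return (x[0] - 1) ** 2 + x[1] ** 2
def f2(x): return x[0] ** 2 + (x[1] - 2) ** 2

w = np.array([0.6, 0.4])
res = minimize(lambda x: w[0] * f1(x) + w[1] * f2(x), x0=np.zeros(2))
print(res.x)        # a compromise between the two conflicting objectives
```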

9.
The problem of describing minimal-response-time execution strategies for evaluating the join of several fragmented database relations is considered. The resulting optimization problem assumes the convenient form of a min-max integer program. With further attention, various generalizations are realized that also incorporate the performance objective of total execution cost. Tables of data logically conforming to the relational model of information are, at the physical level, frequently divided into numerous pieces; these fragments are disseminated among the various sites of a distributed database system, with each one possibly replicated at any number of separate facilities. A submission demanding the amalgamation of many such relations is resolved by joining together their sets of component fragments in an appropriate manner, as defined by complicated patterns of overlapping attribute values, and the final result is realized by concatenating the products of these computations. This process is to be performed under the supervision of the database management system so as to minimize the time taken, as perceived by the user who issued the request. These developments build upon earlier investigations [1-5] that consider only the alternative optimization goal of minimal execution cost. With this in mind, several different approaches may be taken to realize distinct hybrid models that give due regard to both measures of join-query performance.
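A brute-force toy (invented data, ignoring replication and communication costs) conveys the min-max flavour of the optimization: assign fragment joins to sites so that the busiest site finishes as early as possible.

```python
# Enumerate all site assignments of five fragment joins across two sites and
# pick the one minimizing the maximum site load (the makespan).
from itertools import product

times = [4, 3, 7, 2, 5]        # assumed processing time of each fragment join
n_sites = 2

best = min(
    (max(sum(t for t, s in zip(times, assign) if s == site)
         for site in range(n_sites)), assign)
    for assign in product(range(n_sites), repeat=len(times))
)
print(best)    # (minimal makespan, the assignment achieving it)
```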

10.
A structure for representing inexact information in the form of a relational database is presented. The structure differs from ordinary relational databases in two important respects: Components of tuples need not be single values and a similarity relation is required for each domain set of the database. Two critical properties possessed by ordinary relational databases are proven to exist in the fuzzy relational structure. These properties are (1) no two tuples have identical interpretations, and (2) each relational operation has a unique result.

11.
12.
Inferring null join dependencies in relational databases
The inference problem for data dependencies in relational databases is the problem of deciding whether a set of data dependencies logically implies another data dependency. For join dependencies (JDs), the inference problem has been extensively studied by utilising the well-known chase procedure. We generalise JDs to null join dependencies (NJDs), which hold in relations that may contain null values. In our model for incomplete information we allow only a single unmarked null value, denoted by null. This allows us to solve the inference problem for NJDs by extending the chase procedure to the or-chase procedure. In order to define the or-chase procedure we generalise relations with nulls to or-relations, which contain a limited form of disjunctive information. The main result of the paper shows that the inference problem for NJDs, including embedded NJDs (which are a special case of NJDs), is decidable; this is realised via the or-chase procedure.

13.
The fuzzy relational model of Buckles and Petry is a rigorous scheme for incorporating non-ideal or fuzzy information in a relational database. In addition to providing a consistent scheme for representing fuzzy information in the relational structure, the model possesses two critical properties that hold for classical relational databases: no two tuples have identical interpretations, and each relational operation has a unique result. The fuzzy relational model relies on similarity relations for each scalar domain in the fuzzy database. These relations are reflexive, symmetric, and max-min transitive. In addition to introducing fuzziness into the relational model, each similarity relation induces equivalence classes in its domain, and it is the existence of these equivalence classes that provides the model with the important properties possessed by classical relational databases. In this paper, we extend the fuzzy relational database model of Buckles and Petry to deal with proximity relations for scalar domains. Since reflexivity and symmetry are the only constraints placed on proximity relations, they generalize the notion of similarity relations. We show that it is possible to induce equivalence classes from proximity relations; thus, the 'nice' properties of the fuzzy relational model of Buckles and Petry are preserved. Furthermore, removing the max-min transitivity restriction gives database users more freedom to express their value structures.
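A sketch of how a partition can be induced from a proximity relation that is only reflexive and symmetric: take an alpha-cut of the relation and close it transitively with union-find. The proximity matrix and alpha below are illustrative, not the paper's construction.

```python
# Equivalence classes from a proximity relation: pairs with proximity >= alpha
# are merged; union-find supplies the transitive closure of the alpha-cut.
def classes(n, prox, alpha):
    parent = list(range(n))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path compression
            i = parent[i]
        return i
    for i in range(n):
        for j in range(i + 1, n):
            if prox[i][j] >= alpha:
                parent[find(i)] = find(j)
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())

prox = [[1.0, 0.9, 0.2],
        [0.9, 1.0, 0.3],
        [0.2, 0.3, 1.0]]
print(classes(3, prox, alpha=0.8))   # [[0, 1], [2]]
```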

14.
Irregularities are widespread in large databases and often lead to erroneous conclusions in data mining and statistical analysis. For example, considerable bias often results from parameter estimation procedures that do not properly handle significant irregularities. Most data cleaning tools assume a single known type of irregularity. This paper proposes a generic Irregularity Enlightenment (IE) framework for the situation in which multiple irregularities are hidden in large volumes of data in general, and in cross-sectional time series in particular. It develops an automatic data mining platform to capture key irregularities and classify them by their importance in a database. By decomposing time series data into basic components, we propose optimizing a penalized least-squares loss function to aid the selection of key irregularities in consecutive steps, and we cluster time series into groups until an acceptable level of variation reduction is achieved. Finally, visualization tools are developed to help analysts interpret and understand the nature of the data better and faster before further modeling and analysis.
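A compact sketch of the penalized-least-squares idea under simplifying assumptions (a moving-average decomposition, an L1 penalty, and planted spikes, none taken from the paper): minimizing ||y - trend - e||^2 + lam * ||e||_1 over the irregularity vector e amounts to soft-thresholding the detrended residuals.

```python
# Detect sparse irregularities in a trended series by soft-thresholding the
# residuals left after removing a moving-average trend.
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(200)
y = 0.05 * t + rng.normal(0, 0.3, 200)
y[[50, 120]] += 5.0                      # planted irregularities

trend = np.convolve(y, np.ones(11) / 11, mode="same")
resid = y - trend
lam = 1.0
e = np.sign(resid) * np.maximum(np.abs(resid) - lam, 0.0)  # soft threshold
print(np.nonzero(e)[0])                  # flagged irregular time points
```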

15.
Expressions are developed for the mean square deviation between a discounted sequence of uncertain payments and the estimated present value. These formulas incorporate random variation from four sources: (1) from the fact that expected values of individual payments are estimated, (2) from the fact that future discount factors are estimated, (3) random variation in the events that give rise to future payments, and (4) random variation in future discount rates. The results extend and encompass existing expressions based on the assumed constancy of one or more of the random vectors.
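The paper derives closed-form expressions; as a numerical counterpart, this Monte Carlo sketch (all distributions assumed for illustration) estimates the mean square deviation between the realized discounted value and a present value computed at expected payments and rates.

```python
# Monte Carlo estimate of the mean square deviation between realized present
# values (random payments, random discount rates) and a point estimate.
import numpy as np

rng = np.random.default_rng(1)
n_sims, T = 100_000, 10
payments = rng.normal(100.0, 15.0, (n_sims, T))       # uncertain payments
rates = rng.normal(0.05, 0.01, (n_sims, T))           # uncertain discount rates
discount = np.cumprod(1.0 / (1.0 + rates), axis=1)

pv = (payments * discount).sum(axis=1)                # realized present values
pv_hat = (100.0 / 1.05 ** np.arange(1, T + 1)).sum()  # estimate at expectations
print(np.mean((pv - pv_hat) ** 2))                    # mean square deviation
```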

16.
Uncertain programming is a theoretical tool for handling optimization problems in uncertain environments. The research reported so far is mainly concerned with probability, possibility, or credibility measure spaces; uncertain programming realized in the Sugeno measure space has not yet been investigated. The first type of uncertain programming considered in this study, referred to as an expected value model, optimizes a given expected objective function subject to some expected constraints. We start with the concept of the Sugeno measure space, revisit some main properties of the Sugeno measure, and elaborate on the gλ random variable and its characterization. Furthermore, the laws of large numbers are discussed on this space. In the sequel we introduce a Sugeno expected value model (SEVM). In order to construct an approximate solution to the complex SEVM, the ideas of Sugeno random number generation and Sugeno simulation are presented along with a hybrid approach.
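A sketch of the gλ (Sugeno) measure underlying this space: given singleton densities g_i, λ is the nonzero root of prod_i (1 + λ g_i) = 1 + λ, and the measure of any subset follows the same product rule. The densities below are illustrative assumptions.

```python
# Solve for the Sugeno lambda from assumed singleton densities, then evaluate
# the g_lambda measure of subsets via the product rule.
import numpy as np
from scipy.optimize import brentq

g = np.array([0.2, 0.3, 0.4])   # densities; sum < 1, so the root is positive

f = lambda lam: np.prod(1.0 + lam * g) - (1.0 + lam)
lam = brentq(f, 1e-6, 100.0)    # bracket excludes the trivial root lam = 0

def g_lambda(subset):
    return (np.prod(1.0 + lam * g[subset]) - 1.0) / lam

print(lam, g_lambda([0, 1]), g_lambda([0, 1, 2]))  # whole space measures ~1
```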

17.
Reliability analysis requires modeling the joint probability distribution of uncertain parameters, which can be a challenge since the random variables representing the parameter uncertainties may be correlated. For convenience, Gaussian dependence is commonly assumed for correlated random variables. This paper first investigates the effect of multidimensional non-Gaussian dependences underlying the multivariate probability distribution on reliability results. Using different bivariate copulas in a vine structure, various dependence structures can be modeled; the associated copula parameters are identified from available statistical information by moment-matching techniques. After developing the vine copula model for representing the multivariate probability distribution, the reliability involving correlated random variables is evaluated with the Rosenblatt transformation. The impact of dependence is significant: a large deviation in failure probability is observed, which emphasizes the need for accurate dependence characterization. A practical method for dependence modeling based on limited data is therefore provided. The results demonstrate that non-Gaussian dependences can be real in practice, and that reliability estimates can be biased if Gaussian dependence is assumed inappropriately. Moreover, the effect of conditioning order on reliability should not be overlooked unless the vine structure contains only one type of copula.
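A minimal sketch of dependence-aware reliability analysis (not the paper's vine model): two inputs are coupled through a single Clayton copula sampled by conditional inversion, with assumed normal marginals and an assumed limit-state function.

```python
# Sample a bivariate Clayton copula, map to physical marginals, and estimate
# the failure probability P(g < 0) by Monte Carlo.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
theta = 2.0                                  # Clayton dependence parameter
n = 200_000

u1 = rng.uniform(size=n)
w = rng.uniform(size=n)
u2 = ((w ** (-theta / (1 + theta)) - 1) * u1 ** (-theta) + 1) ** (-1 / theta)

x1 = norm.ppf(u1, loc=10.0, scale=1.0)       # assumed marginals
x2 = norm.ppf(u2, loc=10.0, scale=1.0)
g = x1 + x2 - 17.0                           # assumed limit state: fail if g < 0
print(np.mean(g < 0))                        # failure probability estimate
```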

18.
Every researcher in OR modelling knows that data in this field of human science are deterministic, random, or uncertain. Of course, if measurements are available, the scientist must use such strong data; but in many cases much of the data is weaker and subjectivity is unavoidable. The best we can do is to combine these sources as well as the present level of knowledge allows. Fuzzy sets, and especially fuzzy numbers, are a good tool for the OR analyst facing partial uncertainty and subjectivity: several hybrid operators make it possible to combine probabilistic and uncertain data. Several processes and examples are explained. The goal is to build a model that is as faithful as possible and intelligible to the decision maker.
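As a tiny sketch of the advocated tool, the snippet below implements triangular fuzzy numbers (l, m, u) with two basic operations on subjective estimates, addition and scaling; the numbers are illustrative, not from the paper.

```python
# Triangular fuzzy numbers as (low, most plausible, high) triples; addition
# and scalar multiplication act componentwise under standard fuzzy arithmetic.
def tfn_add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def tfn_scale(k, a):
    return tuple(k * x for x in a)

cost = tfn_add((8, 10, 13), (4, 5, 7))     # "about 10" plus "about 5"
print(cost, tfn_scale(2, cost))            # (12, 15, 20) and its double
```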

19.
Functionally graded materials (FGMs) have the potential to replace ordinary materials in engineering practice owing to their superior thermal and dynamical characteristics. This paper presents an effective approach for uncertain natural frequency analysis of composite beams with axially varying material properties. Rather than simply assuming a deterministic material model, we extend the FGM property to a random field, which accounts for the spatial variability seen in laboratory observations and in-field data. Owing to the axially varying input uncertainty, the natural frequencies of the stochastically FGM (S-FGM) beam become random variables. To this end, the Karhunen–Loève expansion is first introduced to represent the composite material random field as a summation of a finite number of random variables. Then, a generalized eigenvalue problem is derived for stochastic natural frequency analysis of the composite beam. Once the mechanistic model is available, brute-force Monte Carlo simulation (MCS), akin to a design of experiments, can be used to estimate statistical characteristics of the uncertain natural frequency response. To alleviate the computational cost of MCS, a generalized polynomial chaos expansion model built from a rather small number of training samples is used to mimic the true natural frequency function. Case studies demonstrate the effectiveness of the proposed approach for beams with axially varying stochastic properties.
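A discretized Karhunen–Loève sketch under assumptions the abstract does not fix (exponential covariance, five retained modes, an illustrative mean stiffness): the axial random field is reduced to a handful of standard normal variables.

```python
# Discretized KL expansion: eigendecompose the covariance matrix of the field
# on a grid, keep the m dominant modes, and draw one field realization.
import numpy as np

n, m, corr_len = 200, 5, 0.3
x = np.linspace(0.0, 1.0, n)
C = np.exp(-np.abs(x[:, None] - x[None, :]) / corr_len)   # covariance matrix

vals, vecs = np.linalg.eigh(C)
vals, vecs = vals[::-1][:m], vecs[:, ::-1][:, :m]         # m largest modes

rng = np.random.default_rng(3)
xi = rng.standard_normal(m)                               # KL random variables
E = 210e9 * (1.0 + 0.05 * (vecs * np.sqrt(vals)) @ xi)    # one realization E(x)
print(E[:5])
```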

20.
Mining association rules is a popular and well-researched method for discovering interesting relations between variables in large databases. A practical problem is that, at medium to low support values, a large number of frequent itemsets and an even larger number of association rules are often found in a database. A widely used remedy is to gradually increase minimum support and minimum confidence, or to filter the found rules with increasingly strict constraints on additional interestingness measures, until the set of rules is reduced to a manageable size. In this paper we describe a different approach, based on the idea of first defining a set of "interesting" itemsets (e.g., by a mixture of mining and expert knowledge) and then, in a second step, selectively generating rules for only these itemsets. The main advantage of this approach over raising thresholds or filtering rules is that the number of rules found is significantly reduced, while it remains unnecessary to increase the support and confidence thresholds, which might cause important information in the database to be missed.
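A pure-Python sketch of the second step described above: rules are generated only for a user-supplied set of interesting itemsets and filtered by confidence. The transactions, itemsets, and threshold are toy values, not the paper's data.

```python
# Generate association rules only for pre-selected itemsets; confidence is
# support(itemset) / support(antecedent), computed over the transactions.
from itertools import combinations

transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"},
                {"a", "b", "c"}]
interesting = [frozenset("ab"), frozenset("abc")]
min_conf = 0.6

def support(items):
    return sum(items <= t for t in transactions) / len(transactions)

for itemset in interesting:
    for k in range(1, len(itemset)):
        for lhs in map(frozenset, combinations(itemset, k)):
            conf = support(itemset) / support(lhs)
            if conf >= min_conf:
                print(set(lhs), "=>", set(itemset - lhs), round(conf, 2))
```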
