Related Articles
2.
This paper is concerned with the allocation of multi-attribute records on several disks so as to achieve a high degree of concurrency of disk access when responding to partial match queries. An algorithm to distribute a set of multi-attribute records onto different disks is presented. Since our allocation method uses principal component analysis, this concept is introduced first. We then use it to generate a set of real numbers, namely the projections onto the first principal component direction, which can be viewed as hashing addresses. Based upon these hashing addresses, we propose an algorithm to allocate multi-attribute records onto different disks. Experimental results show that our method can indeed be used to solve the multi-disk data allocation problem for concurrent access.
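To make the idea concrete, the following is a minimal numpy sketch of using first-principal-component projections as hashing addresses; assigning sorted projections to disks round-robin is an illustrative assumption here, not necessarily the paper's exact allocation rule.

```python
import numpy as np

def pca_disk_allocation(records: np.ndarray, num_disks: int) -> np.ndarray:
    """Assign multi-attribute records to disks using the projection onto
    the first principal component as a scalar hashing address."""
    centered = records - records.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    addresses = centered @ vt[0]          # projections on the first PC
    # Spread records that are adjacent along the principal direction over
    # different disks, so a partial match query can fan out across disks.
    order = np.argsort(addresses)
    allocation = np.empty(len(records), dtype=int)
    allocation[order] = np.arange(len(records)) % num_disks
    return allocation

rng = np.random.default_rng(0)
print(pca_disk_allocation(rng.normal(size=(8, 3)), num_disks=4))
```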

3.
We study the computational problem “find the value of the quantified formula obtained by quantifying the variables in a sum of terms.” The “sum” can be based on any commutative monoid, the “quantifiers” need only satisfy two simple conditions, and the variables can have any finite domain. This problem is a generalization of the problem “given a sum-of-products of terms, find the value of the sum” studied in [R.E. Stearns and H.B. Hunt III, SIAM J. Comput. 25 (1996) 448–476]. A data structure called a “structure tree” is defined which displays information about “subproblems” that can be solved independently during the process of evaluating the formula. Some formulas have “good” structure trees which enable certain generic algorithms to evaluate the formulas in significantly less time than by brute force evaluation. By “generic algorithm,” we mean an algorithm constructed from uninterpreted function symbols, quantifier symbols, and monoid operations. The algebraic nature of the model facilitates a formal treatment of “local reductions” based on the “local replacement” of terms. Such local reductions “preserve formula structure” in the sense that structure trees with nice properties transform into structure trees with similar properties. These local reductions can also be used to transform hierarchically specified problems with useful structure into hierarchically specified problems having similar structure.
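A toy Python illustration of the kind of saving a good structure tree is meant to expose, assuming the ordinary (+, ×) sum-of-products setting with two variables; the paper's generic algorithms cover arbitrary commutative monoids and quantifiers.

```python
from itertools import product

D = range(3)                      # finite domain shared by both variables
f = lambda x: x + 1               # term mentioning only x
g = lambda x, y: x * y + 1        # term mentioning x and y

# Brute force evaluation touches all |D|^2 assignments of (x, y).
brute = sum(f(x) * g(x, y) for x, y in product(D, D))

# Since f does not mention y, the inner sum over y is an independent
# subproblem for each x -- exactly what a good structure tree reveals.
structured = sum(f(x) * sum(g(x, y) for y in D) for x in D)

assert brute == structured
print(structured)
```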

4.
This paper proposes an information retrieval (IR) model based on possibilistic directed networks. The relevance of a document with respect to a query is interpreted by two degrees: the necessity and the possibility. The necessity degree evaluates the extent to which a given document is relevant to a query, whereas the possibility degree serves to eliminate irrelevant documents. This new interpretation of relevance led us to revisit the term weighting scheme by explicitly distinguishing between informative and non-informative terms in a document. Experiments carried out on three standard TREC collections show the effectiveness of the model.

5.
The inversive congruential method is an attractive alternative to the classical linear congruential method for pseudorandom number generation. The authors have recently introduced a new method for obtaining nontrivial upper bounds on the multidimensional discrepancy of inversive congruential pseudorandom numbers in parts of the period. This method has also been used to study the multidimensional distribution of several other similar families of pseudorandom numbers. Here we apply this method to show that, “on average” over all initial values, much stronger results than those known for “individual” sequences can be obtained.
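For reference, a small Python sketch of the inversive congruential generator itself, x_{k+1} = a·x_k⁻¹ + b (mod p) with the usual convention 0⁻¹ = 0; the parameters below are merely illustrative, not ones analyzed in the paper.

```python
def inversive_congruential(a: int, b: int, seed: int, p: int, n: int):
    """Yield n inversive congruential pseudorandom numbers modulo a prime p."""
    x = seed % p
    for _ in range(n):
        yield x
        inv = pow(x, p - 2, p) if x else 0   # modular inverse via Fermat
        x = (a * inv + b) % p

# Toy parameters: p must be prime for the Fermat inverse to be valid.
print(list(inversive_congruential(a=5, b=3, seed=1, p=1009, n=8)))
```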

6.
Cuckoo hashing     
We present a simple dictionary with worst case constant lookup time, equaling the theoretical performance of the classic dynamic perfect hashing scheme of Dietzfelbinger et al. [SIAM J. Comput. 23 (4) (1994) 738–761]. The space usage is similar to that of binary search trees. Besides being conceptually much simpler than previous dynamic dictionaries with worst case constant lookup time, our data structure is interesting in that it does not use perfect hashing, but rather a variant of open addressing where keys can be moved back in their probe sequences. An implementation inspired by our algorithm, but using weaker hash functions, is found to be quite practical. It is competitive with the best known dictionaries having an average case (but no nontrivial worst case) guarantee on lookup time.
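A compact Python sketch of the core mechanism, assuming two tables, two hash functions, and insertion by repeatedly kicking occupants to their alternate slot; the tuple-seeded hash functions and the fixed eviction bound are simplifications rather than the paper's construction.

```python
import random

class CuckooHash:
    """Toy cuckoo hash table: lookup probes at most two slots."""

    def __init__(self, capacity=16):
        self.cap = capacity
        self.tables = [[None] * capacity, [None] * capacity]
        self.seeds = [random.random(), random.random()]

    def _h(self, i, key):
        return hash((self.seeds[i], key)) % self.cap

    def lookup(self, key):
        return any(self.tables[i][self._h(i, key)] == key for i in (0, 1))

    def insert(self, key):
        if self.lookup(key):
            return
        i = 0
        for _ in range(2 * self.cap):             # bounded eviction walk
            slot = self._h(i, key)
            if self.tables[i][slot] is None:
                self.tables[i][slot] = key
                return
            # Kick the occupant out; it must move to its alternate table.
            key, self.tables[i][slot] = self.tables[i][slot], key
            i = 1 - i
        self._rehash(key)                         # suspected cycle: rebuild

    def _rehash(self, pending):
        keys = [k for t in self.tables for k in t if k is not None]
        self.__init__(self.cap * 2)               # new size and hash seeds
        for k in keys + [pending]:
            self.insert(k)

t = CuckooHash()
for k in range(20):
    t.insert(k)
assert all(t.lookup(k) for k in range(20)) and not t.lookup(99)
```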

7.
We prove a criterion for an element of a commutative ring to be contained in an archimedean subsemiring. It can be used to investigate the question whether nonnegativity of a polynomial on a compact semialgebraic set can be certified in a certain way. In case of (strict) positivity instead of nonnegativity, our criterion simplifies to classical results of Stone, Kadison, Krivine, Handelman, Schmüdgen et al. As an application of our result, we give a new proof of the following result of Handelman: If an odd power of a real polynomial in several variables has only nonnegative coefficients, then so do all sufficiently high powers. Partially supported by the DFG project 214371 “Darstellung positiver Polynome”.

8.
The paper advocates the use of a new fuzzy-based clustering algorithm for document categorization. Each document/datum is represented as a fuzzy set. In this respect, the fuzzy clustering algorithm is additionally constrained in order to cluster fuzzy sets. One then needs a metric to detect the overlapping between documents and the cluster prototype (category). For this purpose, we use one of the interclass probabilistic separability measures known as the Bhattacharyya distance, which is incorporated into the general scheme of the fuzzy c-means algorithm for measuring the overlapping between fuzzy sets. This introduces fuzziness into the document clustering in the sense that it allows a single document to belong to more than one category. This is in line with the multiple semantic interpretations conveyed by single words, which support multiple membership in several classes. The performance of the algorithm is illustrated using a case study from the construction sector.
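The overlap measure at the heart of the scheme can be sketched as follows; this is a minimal version for normalized discrete term-weight vectors, whereas the paper embeds it in a constrained fuzzy c-means loop.

```python
import numpy as np

def bhattacharyya_distance(p: np.ndarray, q: np.ndarray) -> float:
    """Bhattacharyya distance between two discrete distributions,
    e.g. a document and a cluster prototype viewed as fuzzy sets."""
    p = p / p.sum()
    q = q / q.sum()
    bc = np.sum(np.sqrt(p * q))   # Bhattacharyya coefficient: the overlap
    return -np.log(bc)

doc = np.array([3.0, 1.0, 0.0, 2.0])         # toy term weights of a document
prototype = np.array([2.0, 2.0, 1.0, 1.0])   # toy cluster prototype
print(bhattacharyya_distance(doc, prototype))
```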

9.
In a non-expansive hashing scheme, similar inputs are stored in memory locations which are close. We develop a non-expansive hashing scheme in which any set drawn from a large universe may be stored in a memory of bounded size and retrieved with a bounded number of operations. We explain how to use non-expansive hashing schemes for efficient storage and retrieval of noisy data. A dynamic version of this hashing scheme is presented as well.

10.
Conditionally specified statistical models are frequently constructed from one-parameter exponential family conditional distributions. One way to formulate such a model is to specify the dependence structure among random variables through the use of a Markov random field (MRF). A common assumption on the Gibbsian form of the MRF model is that dependence is expressed only through pairs of random variables, which we refer to as the “pairwise-only dependence” assumption. Based on this assumption, J. Besag (1974, J. Roy. Statist. Soc. Ser. B 36, 192–225) formulated exponential family “auto-models” and showed the form that one-parameter exponential family conditional densities must take in such models. We extend these results by relaxing the pairwise-only dependence assumption, and we give a necessary form that one-parameter exponential family conditional densities must take under more general conditions of multiway dependence. Data on the spatial distribution of the European corn borer larvae are fitted using a model with Bernoulli conditional distributions and several dependence structures, including pairwise-only, three-way, and four-way dependencies.

11.
For a noncommutative space X, we study Inj(X), the set of isomorphism classes of indecomposable injective X-modules. In particular, we look at how this set, suitably topologized, can be viewed as an underlying “spectrum” for X. As applications we discuss noncommutative notions of irreducibility and integrality, and a way of associating an integral subspace of X to each element of Inj(X) which behaves like a “weak point.”

12.
Documents today have an increasing variety of uses and, because of the involvement of computers in both production and analysis, may be encountered in many forms. To meet these application and accessibility demands, we propose a representation formalism based on a five-dimensional space: physical, logical, functional, and topical organization, as well as document class. The suggested implementation is a semantic net with inheritance, in which the region is the primitive element, with relations for specialization, aggregation, continuity, and topicality.

13.
In the indexing problem, a text is preprocessed and subsequent queries of the form “Find all occurrences of pattern P in the text” are answered in time proportional to the length of the query and the number of occurrences. In the dictionary matching problem, a set of patterns is preprocessed and subsequent queries of the form “Find all occurrences of dictionary patterns in text T” are answered in time proportional to the length of the text and the number of occurrences. There exist efficient worst-case solutions for the indexing problem and the dictionary matching problem, but none that find approximate occurrences of the patterns, i.e., where the pattern is within a bounded edit (or Hamming) distance from the appropriate text location. In this paper we present a uniform deterministic solution to both the indexing and the general dictionary matching problem with one error. We preprocess the data in time O(n log² n), where n is the text size in the indexing problem and the dictionary size in the dictionary matching problem. Our query time for the indexing problem is O(m log n log log n + tocc), where m is the query string size and tocc is the number of occurrences. Our query time for the dictionary matching problem is O(n log³ d log log d + tocc), where n is the text size and d the dictionary size. The time bounds above apply to both bounded and unbounded alphabets.

15.
LSI: a latent semantic model for information retrieval
This paper introduces a vector-space approach to information retrieval: the relationship between index terms and documents is represented as a matrix, a query is represented as a vector of term weights, and the relevant documents in the database are identified by computing the cosine of the angle between the query vector and the document vectors. The QR decomposition and the singular value decomposition (SVD) of the matrix are used to cope with the uncertainty inherent in the database itself. The aim of this paper is to show that basic concepts of linear algebra can be applied effectively to information retrieval (IR) problems.
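A minimal numpy sketch of the pipeline described above: a rank-k SVD of the term-document matrix, folding the query into the latent space, and ranking documents by cosine similarity. The tiny matrix and k = 2 are illustrative only.

```python
import numpy as np

# Toy term-document matrix A: rows are index terms, columns are documents.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])

k = 2                                   # number of latent dimensions kept
U, s, Vt = np.linalg.svd(A, full_matrices=False)
Uk, sk, Vtk = U[:, :k], s[:k], Vt[:k, :]

q = np.array([1.0, 0.0, 1.0, 0.0])      # query as a term-weight vector
q_hat = (q @ Uk) / sk                   # fold the query into latent space
docs_hat = Vtk.T * sk                   # latent coordinates of documents

# Cosine of the angle between the query and each document vector.
cos = docs_hat @ q_hat / (np.linalg.norm(docs_hat, axis=1) * np.linalg.norm(q_hat))
print(np.argsort(-cos))                 # document indices, best match first
```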

16.
This paper presents the use of graphical models and copula functions in Estimation of Distribution Algorithms (EDAs) for solving multivariate optimization problems. It is shown in this work how the incorporation of copula functions and graphical models for modeling the dependencies among variables provides some theoretical advantages over traditional EDAs. By means of copula functions and two well-known graphical models, this paper presents a novel approach for defining new EDAs. Each dependency is modeled by a copula function chosen from a predefined set of six functions that aim to cover a wide range of interrelations. It is also shown how the use of mutual information in the learning of graphical models implies a natural way of employing copula entropies. The experimental results on separable and non-separable functions show that the two new EDAs, which adopt copula functions to model dependencies, perform better than their original versions based on Gaussian variables.
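To illustrate the copula-based sampling step, here is a sketch using a Gaussian copula (one common choice; the paper selects among six copula families) with empirical marginals; the function and its parameters are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np
from scipy import stats

def gaussian_copula_sample(selected: np.ndarray, n_new: int,
                           rng=np.random.default_rng(0)) -> np.ndarray:
    """One EDA step: fit a Gaussian copula to the selected population
    and sample new candidate solutions with the same dependence."""
    n, d = selected.shape
    # 1. Empirical marginals: map each variable to (0, 1) via its ranks.
    u = stats.rankdata(selected, axis=0) / (n + 1)
    # 2. Copula fit: correlation matrix of the normal scores.
    corr = np.corrcoef(stats.norm.ppf(u), rowvar=False)
    # 3. Sample dependent normals and push them back to uniforms.
    u_new = stats.norm.cdf(rng.multivariate_normal(np.zeros(d), corr, n_new))
    # 4. Invert the empirical marginals (sample quantiles).
    return np.column_stack([np.quantile(selected[:, j], u_new[:, j])
                            for j in range(d)])

pop = np.random.default_rng(1).normal(size=(50, 3))
pop[:, 1] = pop[:, 0] + 0.1 * pop[:, 1]       # induce dependence
print(gaussian_copula_sample(pop, n_new=5))
```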

17.
This paper proposes a method for estimation of a class of partially linear single-index models with randomly censored samples. The method provides a flexible way of modelling the association between a response and a set of predictor variables when the response variable is randomly censored. It presents a technique for “dimension reduction” in semiparametric censored regression models and generalizes the existing accelerated failure-time models for survival analysis. The estimation procedure involves three stages: first, transform the censored data into synthetic data or pseudo-responses unbiasedly; second, obtain quasi-likelihood estimates of the regression coefficients in both the linear and single-index components by an iterative algorithm; finally, estimate the unknown nonparametric regression function using techniques for univariate censored nonparametric regression. The estimators of the regression coefficients are shown to be jointly root-n consistent and asymptotically normal. In addition, the estimator of the unknown regression function is a local linear kernel regression estimator and achieves the same efficiency as if all the parameters were known. Monte Carlo simulations are conducted to illustrate the proposed methodology.

18.
In this paper we solve the following Ulam problem: “Give conditions in order for a linear mapping near an approximately linear mapping to exist” and establish results involving a product of powers of norms [5]. There has been much activity on a similar “ε-isometry” problem of Ulam [1, 2, 4]. This work represents an improvement and generalization of the work of [3].
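In LaTeX, the product-of-powers control has the following standard shape (a sketch of the Rassias-type condition under the assumption p + q ≠ 1, with C_{p,q} an unspecified constant; not quoted from the paper):

```latex
% Cauchy difference bounded by a product of powers of norms:
\|f(x+y) - f(x) - f(y)\| \;\le\; \theta\,\|x\|^{p}\,\|y\|^{q},
\qquad p + q \neq 1,
% from which one obtains a unique additive map T approximating f:
\|f(x) - T(x)\| \;\le\; C_{p,q}\,\theta\,\|x\|^{p+q}.
```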

19.
After an introduction which addresses some basic questions, this article is organized around three points: (1) the theoretical framework of the so-called “instrumental approach”, which has been a theme in the last two CAME symposia; (2) a consideration of two processes (instrumentalization and instrumentation) which interact in the instrumental genesis; and (3) the introduction of the idea of instrumental orchestration as a way of allowing the teacher to assist the student’s instrumental genesis. This article originates from a lecture given at the Third CAME (Computer Algebra in Mathematics Education) Symposium (Reims, June 2003).

20.
A new interpolation-based order-preserving hashing algorithm suitable for on-line maintenance of large dynamic external files under sequences of four kinds of operations (insertion, update, deletion, and orthogonal range query) is proposed. The scheme, an adaptation of linear hashing, requires no index or address directory structure and utilizes O(n) space for files containing n records; all of the benefits of linear hashing are inherited by this new scheme. File implementations yielding average successful search lengths much less than 2 and average unsuccessful search lengths much less than 4 for individual records are obtainable; the actual storage required is controllable by the implementor.
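The core address computation can be sketched in a few lines of Python; the full scheme additionally splits buckets dynamically as in linear hashing, which this sketch omits.

```python
def interpolation_address(key: float, lo: float, hi: float, m: int) -> int:
    """Order-preserving hash: interpolate the key's position in [lo, hi)
    onto m buckets, so key order equals bucket order and an orthogonal
    range query scans a contiguous run of buckets."""
    return min(m - 1, int(m * (key - lo) / (hi - lo)))

keys = [3.2, 17.0, 42.5, 42.6, 99.9]
print([interpolation_address(k, 0.0, 100.0, 8) for k in keys])  # [0, 1, 3, 3, 7]
```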
