Similar documents
20 similar documents found.
1.
Semantic hashing
We show how to learn a deep graphical model of the word-count vectors obtained from a large set of documents. The values of the latent variables in the deepest layer are easy to infer and give a much better representation of each document than Latent Semantic Analysis. When the deepest layer is forced to use a small number of binary variables (e.g. 32), the graphical model performs “semantic hashing”: Documents are mapped to memory addresses in such a way that semantically similar documents are located at nearby addresses. Documents similar to a query document can then be found by simply accessing all the addresses that differ by only a few bits from the address of the query document. This way of extending the efficiency of hash-coding to approximate matching is much faster than locality sensitive hashing, which is the fastest current method. By using semantic hashing to filter the documents given to TF-IDF, we achieve higher accuracy than applying TF-IDF to the entire document set.
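A minimal sketch of the address-lookup idea described above, assuming documents have already been mapped to short binary codes by some trained model (the codes below are hand-picked placeholders, not produced by the deep graphical model from the paper):

import itertools
from collections import defaultdict

def hamming_ball(code, bits=32, radius=2):
    """Yield every code within the given Hamming radius of `code`."""
    yield code
    for r in range(1, radius + 1):
        for positions in itertools.combinations(range(bits), r):
            flipped = code
            for p in positions:
                flipped ^= 1 << p
            yield flipped

def build_index(doc_codes):
    """doc_codes: dict mapping doc_id -> integer binary code (the memory address)."""
    index = defaultdict(list)
    for doc_id, code in doc_codes.items():
        index[code].append(doc_id)
    return index

def query(index, query_code, radius=2):
    """Return doc ids whose codes differ from query_code by at most `radius` bits."""
    hits = []
    for code in hamming_ball(query_code, radius=radius):
        hits.extend(index.get(code, []))
    return hits

# Toy example with 4-bit placeholder codes.
index = build_index({"doc_a": 0b1010, "doc_b": 0b1011, "doc_c": 0b0101})
print(query(index, 0b1010))   # ['doc_a', 'doc_b']: distance 0 and distance 1 matches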

2.
This paper introduces a novel vertical handoff decision scheme. The objective is to provide users with enhanced quality of service (QoS) and maximize the network revenue. This scheme balances both-side interests via a suitably defined network merit function and a user–operator negotiation model. The merit function evaluates network performance based on user preferences and decides the most appropriate network for users. The negotiation model is defined as a semi-Markov decision process (SMDP). An optimal policy that maximizes the network revenue without violating QoS constraints is found by resolving the SMDP problem using Q-learning. Furthermore, a time-adaptive QoS monitoring mechanism is combined with the merit function in order to decrease the power consumption on terminal interface activation. The simulation results demonstrate that the proposed vertical handoff decision scheme enhances the performance in terms of power consumption, handoff call-dropping probability (HCDP) and network revenue.
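As a rough illustration of the Q-learning step used to search for the revenue-maximizing policy, here is a generic tabular update; the state labels, the action set (candidate networks) and the reward signal are placeholders, not the paper's exact SMDP formulation:

import random
from collections import defaultdict

ACTIONS = ["wlan", "cellular"]          # candidate networks (illustrative)
q_table = defaultdict(float)            # (state, action) -> estimated long-run value
alpha, gamma, epsilon = 0.1, 0.9, 0.1   # learning rate, discount factor, exploration rate

def choose_action(state):
    """Epsilon-greedy selection over the candidate networks."""
    if random.random() < epsilon:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q_table[(state, a)])

def q_update(state, action, reward, next_state):
    """Standard Q-learning update toward reward + discounted best next value."""
    best_next = max(q_table[(next_state, a)] for a in ACTIONS)
    q_table[(state, action)] += alpha * (reward + gamma * best_next - q_table[(state, action)])

# One illustrative experience tuple: choosing "cellular" in a hypothetical state earned reward 1.0.
q_update("low_snr", "cellular", 1.0, "low_snr")
print(choose_action("low_snr"))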

3.
Possibility theory provides a good framework for dealing with merging problems when information is pervaded with uncertainty and inconsistency. Many merging operators in possibility theory have been proposed. This paper develops a new approach to merging uncertain information modeled by possibilistic networks. In this approach we focus on how a “triangular norm” (t-norm) establishes a lower bound on the degree to which an assessment is true when it is obtained from a set of initial hypotheses represented by a joint possibility distribution. This operator is characterized by its strong reinforcement effect. A strongly conjunctive operator is suitable for merging networks that are not in conflict, especially those supported by both sources. In this paper, the Lukasiewicz t-norm is first applied to a set of possibility measures to combine networks having the same and different graphical structures. We then present a method for merging possibilistic networks that contain cycles.
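A small sketch of the conjunctive combination step, assuming two possibility distributions over the same set of interpretations are given as dictionaries; the Lukasiewicz t-norm max(0, a + b - 1) is applied pointwise, and no renormalization is applied here (that policy is an assumption, not taken from the paper):

def lukasiewicz(a, b):
    """Lukasiewicz t-norm: a strongly reinforcing conjunctive combination."""
    return max(0.0, a + b - 1.0)

def merge_distributions(pi1, pi2):
    """Pointwise Lukasiewicz combination of two possibility distributions."""
    return {x: lukasiewicz(pi1.get(x, 0.0), pi2.get(x, 0.0))
            for x in sorted(set(pi1) | set(pi2))}

pi1 = {"w1": 1.0, "w2": 0.8, "w3": 0.3}
pi2 = {"w1": 0.9, "w2": 0.7, "w3": 0.2}
print(merge_distributions(pi1, pi2))   # roughly {'w1': 0.9, 'w2': 0.5, 'w3': 0.0}

The example shows the reinforcement behaviour mentioned above: values that are jointly well supported stay high, while weakly supported interpretations are driven to zero.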

4.
The two-dimensional representation of documents, which maps each document to a point in a Cartesian plane, has proved to be a valid visualization tool for Automated Text Categorization (ATC): it helps in understanding the relationships between categories of textual documents and lets users visually audit the classifier and identify suspicious training data. This paper analyzes a specific use of this visualization approach for the Naive Bayes (NB) model for text classification and the Binary Independence Model (BIM) for text retrieval. For text categorization, the classification decision equation has to be reformulated so that each coordinate of a document is the sum of two addends: a variable component P(d|ci) and a constant component P(ci), the prior of the category. When documents are plotted in the Cartesian plane according to this formulation, one can see that they are shifted by a constant amount along the x-axis and the y-axis. This shifting effect is more or less evident depending on which NB model, Bernoulli or multinomial, is chosen. For text retrieval, the same reformulation can be applied to the BIM model. The visualization helps to understand the decisions taken to rank the documents, in particular in the case of relevance feedback.
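A toy sketch of how such two-dimensional coordinates can be computed for a two-category multinomial NB classifier; working in log-space, the smoothing floor for unseen words, and the toy probabilities are all assumptions made here for illustration, not taken from the paper:

import math

def nb_coordinate(doc_tokens, word_logprob, class_logprior):
    """One coordinate: variable part log P(d|c) plus constant part log P(c)."""
    return sum(word_logprob.get(w, math.log(1e-6)) for w in doc_tokens) + class_logprior

# Hypothetical per-class word log-probabilities and equal log-priors.
logp_c1 = {"ball": math.log(0.6), "team": math.log(0.3)}
logp_c2 = {"market": math.log(0.5), "stock": math.log(0.4)}
doc = ["ball", "team", "ball"]
x = nb_coordinate(doc, logp_c1, math.log(0.5))   # coordinate w.r.t. category c1
y = nb_coordinate(doc, logp_c2, math.log(0.5))   # coordinate w.r.t. category c2
print(x, y)   # the classifier assigns c1 when x > y, i.e. the point lies below the line y = x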

5.
We study the local decodability and (tolerant) local testability of low-degree n-variate polynomials over arbitrary fields, evaluated over the domain {0,1}^n. We show that for every field there is a tolerant local test whose query complexity depends only on the degree. In contrast, we show that decodability is possible over fields of positive characteristic, but not over the reals.

6.
The LSI Latent Semantic Information Retrieval Model
This paper introduces a vector-space approach to information retrieval. The relationship between index terms and documents is represented as a matrix, a query is represented as a vector of term weights, and the relevant documents in the database are identified by computing the cosine of the angle between the query and document vectors. The QR decomposition and the singular value decomposition (SVD) of the matrix are used to handle the uncertainty inherent in the database. The aim of the paper is to show that basic concepts of linear algebra can solve information retrieval (IR) problems effectively.
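A compact sketch of the pipeline described above using NumPy's SVD: a toy term-document matrix, a rank-k truncation, and ranking by the cosine of the angle between the folded-in query and each document in the latent space. The matrix values and the choice k = 2 are illustrative assumptions:

import numpy as np

# Toy term-document matrix: rows = index terms, columns = documents.
A = np.array([[2.0, 0.0, 1.0],
              [1.0, 1.0, 0.0],
              [0.0, 2.0, 1.0]])

U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2                                    # keep the k largest singular values
Uk, sk, Vtk = U[:, :k], s[:k], Vt[:k, :]

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

q = np.array([1.0, 0.0, 1.0])            # query as a vector of term weights
q_hat = np.diag(1.0 / sk) @ Uk.T @ q     # fold the query into the k-dimensional latent space
doc_hat = Vtk.T                          # row j = latent-space coordinates of document j
scores = [cosine(q_hat, d) for d in doc_hat]
print(scores)                            # rank documents by cosine similarity to the query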

7.
Within the context of intermodal logistics, the design of transportation networks becomes more complex than it is for single-mode logistics. In an intermodal network, the respective modes are characterized by the transportation cost structure, modal connectivity, availability of transfer points and service time performance. These characteristics suggest the level of complexity involved in designing intermodal logistics networks. This research develops a mathematical model using the multiple-allocation p-hub median approach. The model encompasses the dynamics of individual modes of transportation through transportation costs, modal connectivity costs, and fixed location costs under service time requirements. A tabu search meta-heuristic is used to solve large (100-node) problems. The solutions obtained using this meta-heuristic are compared with tight lower bounds developed using a Lagrangian relaxation approach. An experimental study evaluates the performance of the intermodal logistics networks and explores the effects and interactions of several factors on the design of intermodal hub networks subject to service time requirements.
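To make the cost structure concrete, the sketch below evaluates a hub assignment under the classical hub-and-spoke cost decomposition (collection, discounted inter-hub transfer, distribution). The data, the discount factor and the single-allocation simplification are assumptions for illustration; the paper itself uses a multiple-allocation model with modal connectivity and fixed costs under service-time constraints:

def route_cost(i, j, hub_of, c, alpha=0.75):
    """Cost of sending one unit from i to j via the hubs assigned to i and j."""
    hi, hj = hub_of[i], hub_of[j]
    return c[i][hi] + alpha * c[hi][hj] + c[hj][j]

def total_cost(flow, hub_of, c, alpha=0.75):
    """Sum of flow-weighted routing costs over all origin-destination pairs."""
    return sum(w * route_cost(i, j, hub_of, c, alpha) for (i, j), w in flow.items())

# Toy instance: 3 nodes, node 0 is the only hub, symmetric unit transport costs.
c = [[0, 4, 7], [4, 0, 5], [7, 5, 0]]
flow = {(1, 2): 10.0, (2, 1): 6.0}             # origin-destination demands
print(total_cost(flow, hub_of={0: 0, 1: 0, 2: 0}, c=c))

A tabu search over such assignments would repeatedly perturb hub_of, evaluate total_cost, and forbid recently visited moves.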

8.
The task of computing a function F with the help of an oracle X can be viewed as a search problem where the cost measure is the number of queries to X. We ask for the minimal number of queries that can be achieved by a suitable choice of X and call this quantity the query complexity of F. This concept is suggested by earlier work of Beigel, Gasarch, Gill, and Owings on “Bounded query classes”. We introduce a fault-tolerant version and relate it to Ulam's game. For many natural classes of functions F we obtain tight upper and lower bounds on the query complexity of F. Previous results like the Nonspeedup Theorem and the Cardinality Theorem appear in a wider perspective. Mathematics Subject Classification: 03D20, 68Q15, 68R05.

9.
An inventory model for a deteriorating item with stock-dependent demand is developed under two storage facilities over a random planning horizon, which is assumed to follow an exponential distribution with a known parameter. For a crisp deterioration rate, the expected profit is derived and maximized via a genetic algorithm (GA). On the other hand, when the deterioration rate is imprecise, the optimistic/pessimistic equivalent of the fuzzy objective function is obtained using the possibility/necessity measure of fuzzy events. A fuzzy simulation process is proposed to maximize the optimistic/pessimistic return, and finally a fuzzy simulation-based GA is developed to solve the model. The models are illustrated with some numerical data. Sensitivity analyses of the expected profit function with respect to the distribution parameter λ and the confidence levels α1 and α2 are also presented.
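A bare-bones genetic algorithm of the kind used to maximize the expected profit; the two decision variables, their bounds and the placeholder profit surface below are illustrative assumptions, not the model's actual profit expression:

import random

def ga_maximize(profit, bounds, pop_size=30, generations=100, seed=0):
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=profit, reverse=True)
        parents = pop[: pop_size // 2]                      # truncation selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            child = [(x + y) / 2.0 for x, y in zip(a, b)]   # arithmetic crossover
            i = rng.randrange(dim)                          # mutate one coordinate
            lo, hi = bounds[i]
            child[i] = min(hi, max(lo, child[i] + rng.gauss(0, 0.1 * (hi - lo))))
            children.append(child)
        pop = parents + children
    return max(pop, key=profit)

# Placeholder expected-profit surface in (order quantity, reorder level).
profit = lambda x: -(x[0] - 40) ** 2 - 0.5 * (x[1] - 15) ** 2 + 1000
print(ga_maximize(profit, bounds=[(0, 100), (0, 50)]))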

10.
We consider a model for complex networks that was introduced by Krioukov et al. (Phys Rev E 82 (2010) 036106). In this model, N points are chosen randomly inside a disk on the hyperbolic plane according to a distorted version of the uniform distribution, and any two of them are joined by an edge if they are within a certain hyperbolic distance. This model exhibits a power-law degree sequence, small distances and high clustering. The model is controlled by two parameters α and ν where, roughly speaking, α controls the exponent of the power law and ν controls the average degree. In this paper we focus on the probability that the graph is connected. We show the following results. For α > 1/2 and ν arbitrary, the graph is disconnected with high probability. For α < 1/2 and ν arbitrary, the graph is connected with high probability. When α = 1/2 and ν is fixed, the probability of being connected tends to a constant that depends only on ν, in a continuous manner. Curiously, this constant equals 1 for ν ≥ π, while it is strictly increasing, and in particular bounded away from zero and one, for ν < π. © 2016 Wiley Periodicals, Inc. Random Struct. Alg., 49, 65–94, 2016
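A short simulation sketch of sampling from this model under its usual parametrisation (assumed here): N points in the hyperbolic disk of radius R = 2 log(N/ν), radial coordinates drawn with density proportional to sinh(α r), and an edge whenever the hyperbolic distance is at most R. The parameter values are illustrative:

import math, random

def sample_hyperbolic_graph(N=200, alpha=0.75, nu=1.0, seed=0):
    rng = random.Random(seed)
    R = 2.0 * math.log(N / nu)
    # Radial CDF F(r) = (cosh(alpha*r) - 1) / (cosh(alpha*R) - 1); sample by inverting it.
    pts = []
    for _ in range(N):
        u = rng.random()
        r = math.acosh(1.0 + u * (math.cosh(alpha * R) - 1.0)) / alpha
        theta = rng.uniform(0.0, 2.0 * math.pi)
        pts.append((r, theta))
    def hyp_dist(p, q):
        (r1, t1), (r2, t2) = p, q
        dt = math.pi - abs(math.pi - abs(t1 - t2))          # angular difference in [0, pi]
        arg = math.cosh(r1) * math.cosh(r2) - math.sinh(r1) * math.sinh(r2) * math.cos(dt)
        return math.acosh(max(1.0, arg))
    edges = [(i, j) for i in range(N) for j in range(i + 1, N)
             if hyp_dist(pts[i], pts[j]) <= R]
    return pts, edges

pts, edges = sample_hyperbolic_graph()
print(len(edges), "edges among", len(pts), "points")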

11.
In this paper, multi-item economic production quantity (EPQ) models with selling-price-dependent demand, infinite production rate, and stock-dependent unit production and holding costs are considered. Flexibility and reliability considerations are introduced into the production process. The models are developed under two fuzzy environments: one with a fuzzy goal and fuzzy restrictions on the storage area, and the other with fuzzy unit cost and possibility/necessity restrictions on the storage space. The objective goal and constraint goal are defined by membership functions, and the presence of fuzzy parameters in the objective function is dealt with using fuzzy possibility/necessity measures. The models are formed as maximization problems. The first model, the fuzzy goal programming problem, is solved using Fuzzy Additive Goal Programming (FAGP) and Modified Geometric Programming (MGP) methods. The second model, with fuzzy possibility/necessity measures, is solved by the Geometric Programming (GP) method. The models are illustrated through numerical examples. Sensitivity analyses of the profit function under different measures of possibility and necessity are performed and presented graphically.

12.
13.
Graphs are important structures for modeling complex relationships such as chemical compounds, proteins, geometric or hierarchical parts, and XML documents. Given a query graph, indexing has become a necessity to retrieve similar graphs quickly from large databases. We propose a novel technique for indexing databases whose entries can be represented as graph structures. Our method starts by representing the topological structure of a graph, as well as that of its subgraphs, as vectors whose components correspond to the sorted Laplacian eigenvalues of the graph or subgraphs. By doing a nearest-neighbor search around the query spectra, similar but not necessarily isomorphic graphs are retrieved.
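A minimal sketch of the spectral signature step: build the graph Laplacian, take its sorted eigenvalues as a fixed-length vector, and compare signatures with a nearest-neighbour search. The padding convention and the plain linear scan (instead of a spatial index) are simplifications assumed here:

import numpy as np

def laplacian_spectrum(adj, dim):
    """Sorted Laplacian eigenvalues of a graph, padded or truncated to length `dim`."""
    A = np.asarray(adj, dtype=float)
    L = np.diag(A.sum(axis=1)) - A
    eig = np.sort(np.linalg.eigvalsh(L))[::-1]    # largest eigenvalue first
    sig = np.zeros(dim)
    sig[:min(dim, eig.size)] = eig[:dim]
    return sig

def nearest(query_sig, database):
    """Return the id of the stored graph whose signature is closest in Euclidean distance."""
    return min(database, key=lambda gid: np.linalg.norm(database[gid] - query_sig))

# Toy database: a triangle and a path on three vertices.
db = {
    "triangle": laplacian_spectrum([[0, 1, 1], [1, 0, 1], [1, 1, 0]], dim=4),
    "path":     laplacian_spectrum([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dim=4),
}
query = laplacian_spectrum([[0, 1, 1], [1, 0, 1], [1, 1, 0]], dim=4)
print(nearest(query, db))    # "triangle"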

14.
In this paper, we formulate two classes of problems, the colored range query problems and the colored point enclosure query problems, to model multi-dimensional range and point enclosure queries in the presence of categorical information. Many of these problems are difficult to solve using traditional data structural techniques. Based on a new framework combining sketching techniques and traditional data structures, we obtain two sets of results for solving the problems approximately and efficiently. In addition, the framework can be employed to attack other related problems by finding the appropriate summary structures.
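To make the query type concrete, the sketch below answers one-dimensional colored range counting exactly by keeping a sorted coordinate list per color. This is only an illustration of the problem statement; the paper's contribution is an approximate, sketch-based framework for the multi-dimensional case:

import bisect
from collections import defaultdict

class ColoredRangeCounter:
    """Exact per-color counts of points falling in a 1-D query range [lo, hi]."""
    def __init__(self, points):
        # points: iterable of (coordinate, color) pairs
        self.by_color = defaultdict(list)
        for x, color in points:
            self.by_color[color].append(x)
        for xs in self.by_color.values():
            xs.sort()

    def count(self, lo, hi):
        return {color: bisect.bisect_right(xs, hi) - bisect.bisect_left(xs, lo)
                for color, xs in self.by_color.items()}

crc = ColoredRangeCounter([(1, "red"), (3, "red"), (4, "blue"), (9, "blue")])
print(crc.count(2, 8))   # {'red': 1, 'blue': 1}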

15.
Building an effective index for XML data is a key factor determining XML data-processing performance. Most existing indexing and storage strategies sacrifice structural information, which is unfavorable for structural queries and updates. XML Schema, as the standard for describing the structural information of XML documents, provides a guarantee for validating documents and query paths. Based on this, an XML document indexing technique called SBXI, built on XML Schema constraints, is proposed to guide the storage and querying of document data. It improves storage and query efficiency, achieves high space utilization and low index-maintenance cost, and supports complex queries containing multiple predicates.

16.
A competing network model in which the attractiveness depends on time is proposed. Using a Poisson process, an analytical expression for the steady-state average degree distribution of the model is obtained. Theoretical analysis shows that the power-law exponent of such networks depends on the asymptotic attractiveness coefficient and on the number m of edges added by each new node, and lies in the interval (1 + 1/m, m + 1). As an application of the competing network model, an estimate of the degree distribution of the fitness model is obtained. The results show that the fitness model is a special case of the competing network model, but not conversely.
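A rough simulation sketch of a growing network in which each new node attaches m edges with probability proportional to degree plus a time-dependent attractiveness; the specific attractiveness schedule a(t) used below is a placeholder, not the one analysed in the paper:

import random

def grow_network(T=1000, m=2, seed=0):
    rng = random.Random(seed)
    degree = {0: m, 1: m}
    edges = [(0, 1)] * m                     # small seed graph with m parallel edges
    for t in range(2, T):
        a_t = 1.0 + 1.0 / (1.0 + t)          # placeholder time-dependent attractiveness
        nodes = list(degree)
        weights = [degree[v] + a_t for v in nodes]
        targets = set()
        while len(targets) < min(m, len(nodes)):
            targets.add(rng.choices(nodes, weights=weights, k=1)[0])
        degree[t] = m
        for v in targets:
            degree[v] += 1
            edges.append((t, v))
    return degree, edges

degree, edges = grow_network()
print(max(degree.values()), "is the largest degree after growth")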

17.
Global optima results for the Kauffman NK model
The Kauffman NK model has been used in theoretical biology, physics and business organizations to model complex systems with interacting components. Recent NK model results have focused on local optima. This paper analyzes global optima of the NK model. The resulting global optimization problem is transformed into a stochastic network model that is closely related to two well-studied problems in operations research. This leads to applicable strategies for explicit computation of bounds on the global optima, particularly when K is either small or close to N. A general lower bound, which is sharp for K = 0, is obtained for the expected value of the global optimum of the NK model. A detailed analysis is provided for the expectation and variance of the global optimum when K = N−1. The lower and upper bounds on the expectation obtained for this case show that there is a wide gap between the values of the local and the global optima. They also indicate that the complexity catastrophe that occurs with the local optima does not arise for the global optima.
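A brute-force sketch of an NK landscape and its global optimum for small N, assuming the standard formulation in which each locus i has a random fitness contribution depending on its own state and K other loci, and overall fitness is the average contribution. It enumerates all 2^N genotypes, so it is only meant to illustrate the object being bounded, not to scale:

import itertools, random

def make_nk(N, K, seed=0):
    rng = random.Random(seed)
    neighbours = [rng.sample([j for j in range(N) if j != i], K) for i in range(N)]
    tables = [{} for _ in range(N)]            # lazily filled fitness contribution tables

    def fitness(genotype):
        total = 0.0
        for i in range(N):
            key = (genotype[i],) + tuple(genotype[j] for j in neighbours[i])
            if key not in tables[i]:
                tables[i][key] = rng.random()  # contribution drawn uniformly at random
            total += tables[i][key]
        return total / N

    return fitness

N = 10
fitness = make_nk(N, K=2)
best = max(itertools.product((0, 1), repeat=N), key=fitness)
print(best, fitness(best))                     # global optimum by exhaustive enumeration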

18.
The range-searching problems that allow efficient partition trees are characterized as those defined by range spaces of finite Vapnik-Chervonenkis dimension. More generally, these problems are shown to be the only ones that admit linear-size solutions with sublinear query time in the arithmetic model. The proof rests on a characterization of spanning trees with a low stabbing number. We use probabilistic arguments to treat the general case, but we are able to use geometric techniques to handle the most common range-searching problems, such as simplex and spherical range search. We prove that any set of n points in E^d admits a spanning tree which cannot be cut by any hyperplane (or hypersphere) through more than roughly n^(1-1/d) edges. This result yields quasi-optimal solutions to simplex range searching in the arithmetic model of computation. We also look at polygon, disk, and tetrahedron range searching on a random access machine. Given n points in E^2, we derive a data structure of size O(n log n) for counting how many points fall inside a query convex k-gon (for arbitrary values of k). The query time is O(k√n log n). If k is fixed once and for all (as in triangular range searching), then the storage requirement drops to O(n). We also describe an O(n log n)-size data structure for counting how many points fall inside a query circle in O(√n log^2 n) query time. Finally, we present an O(n log n)-size data structure for counting how many points fall inside a query tetrahedron in 3-space in O(n^(2/3) log^2 n) query time. All the algorithms are optimal within polylogarithmic factors. In all cases, the preprocessing can be done in polynomial time. Furthermore, the algorithms can also handle reporting within the same complexity (adding the size of the output as a linear term to the query time). Portions of this work have appeared in preliminary form in “Partition trees for triangle counting and other range searching problems” (E. Welzl), Proc. 4th Ann. ACM Symp. Comput. Geom. (1988), 23–33, and “Tight Bounds on the Stabbing Number of Spanning Trees in Euclidean Space” (B. Chazelle), Comput. Sci. Techn. Rep. No. CS-TR-155-88, Princeton University, 1988. Bernard Chazelle acknowledges the National Science Foundation for supporting this research in part under Grant CCR-8700917. Emo Welzl acknowledges the Deutsche Forschungsgemeinschaft for supporting this research in part under Grant We 1265/1-1.

19.
Let S be a set of noncrossing triangular obstacles in R^3 with convex hull H. A triangulation T of H is compatible with S if every triangle of S is the union of a subset of the faces of T. The weight of T is the sum of the areas of the triangles of T. We give a polynomial-time algorithm that computes a triangulation compatible with S whose weight is at most a constant times the weight of any compatible triangulation. One motivation for studying minimum-weight triangulations is a connection with ray shooting. A particularly simple way to answer a ray-shooting query (“Report the first obstacle hit by a query ray”) is to walk through a triangulation along the ray, stopping at the first obstacle. Under a reasonably natural distribution of query rays, the average cost of a ray-shooting query is proportional to triangulation weight. A similar connection exists for line-stabbing queries (“Report all obstacles hit by a query line”).

20.
In three-dimensional space an embedded network is called gradient-constrained if the absolute gradient of any differentiable point on the edges in the network is no more than a given value m. A gradient-constrained minimum Steiner tree T is a minimum gradient-constrained network interconnecting a given set of points. In this paper we investigate some of the fundamental properties of these minimum networks. We first introduce a new metric, the gradient metric, which incorporates a new definition of distance for edges with gradient greater than m. We then discuss the variational argument in the gradient metric, and use it to prove that the degree of Steiner points in T is either three or four. If the edges in T are labelled to indicate whether the gradients between their endpoints are greater than, less than, or equal to m, then we show that, up to symmetry, there are only five possible labellings for degree 3 Steiner points in T. Moreover, we prove that all four edges incident with a degree 4 Steiner point in T must have gradient m if m is less than 0.38. Finally, we use the variational argument to locate the Steiner points in T in terms of the positions of the neighbouring vertices.
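A small helper illustrating the gradient metric mentioned above, under the common convention (assumed here, not quoted from the paper) that a pair of points whose straight edge would exceed gradient m is instead joined by a zigzag of gradient exactly m, so its length becomes the vertical separation times sqrt(1 + 1/m^2):

import math

def gradient_metric_distance(p, q, m):
    """Distance between 3-D points p and q in the gradient metric with gradient bound m."""
    dx, dy, dz = (q[0] - p[0]), (q[1] - p[1]), (q[2] - p[2])
    h = math.hypot(dx, dy)                      # horizontal separation
    v = abs(dz)                                 # vertical separation
    if v <= m * h:                              # feasible straight edge: Euclidean length
        return math.sqrt(h * h + v * v)
    return v * math.sqrt(1.0 + 1.0 / (m * m))   # steep pair: length of a gradient-m zigzag

print(gradient_metric_distance((0, 0, 0), (1, 0, 0.2), m=0.5))   # flat edge, Euclidean length
print(gradient_metric_distance((0, 0, 0), (1, 0, 2.0), m=0.5))   # steep pair, zigzag length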
