Similar Documents (20 results)
1.
Abdou Youssef. PAMM, 2007, 7(1): 1010501–1010502.
As digital libraries of mathematical and scientific content become available, it is essential to have math-aware search systems. Such systems must understand mathematical symbols and structures, and allow users to enter queries that involve not only text keywords but also mathematical expressions and fragments of expressions. In addition, the search results must be presented in a way that enables the user to find the desired information rapidly. This short paper gives a quick overview of the state of the art in math search, and focuses on the math search system that the author has developed for the Digital Library of Mathematical Functions (DLMF) project of the National Institute of Standards and Technology. (© 2008 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim)

2.
A path decomposition at the infimum for positive self-similar Markov processes (pssMp) is obtained. Next, several aspects of the conditioning to hit 0 of a pssMp are studied. Associated with a given pssMp X that never hits 0, we construct a pssMp that hits 0 in finite time. The latter can be viewed as X conditioned to hit 0 in finite time, and we prove that this conditioning is determined by the pre-minimum part of X. Finally, we provide a method for conditioning a pssMp that hits 0 by a jump to do so continuously.
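For reference, the defining (Lamperti) scaling property of a pssMp of index α > 0; this is the standard definition, not quoted from the abstract:

```latex
% Scaling property of a positive self-similar Markov process (pssMp)
% X of index \alpha > 0: for every c > 0 and starting point x > 0,
\[
  \Bigl( \bigl( c\, X_{c^{-\alpha} t} \bigr)_{t \ge 0},\ \mathbb{P}_x \Bigr)
  \;\stackrel{d}{=}\;
  \Bigl( \bigl( X_t \bigr)_{t \ge 0},\ \mathbb{P}_{c x} \Bigr).
\]
```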

3.
Math search is a new area of research with many enabling technologies but also many challenges. The enabling technologies include XML, XPath, XQuery, and MathML. Among the challenges is enabling search systems to recognize mathematical symbols and structures. Several math search projects have made considerable progress in meeting those challenges. One of the remaining challenges is the creation and implementation of a math query language that enables general users to express their information needs intuitively yet precisely. This paper presents such a language and details its features. The new math query language offers an alternative way to describe mathematical expressions that is more consistent and less ambiguous than conventional mathematical notation. In addition, the language goes beyond the Boolean and proximity query syntax found in standard text search systems. It defines a powerful set of wildcards that are deemed important for math search. These wildcards provide for more precise structural search and multiple levels of abstraction. Three new sets of wildcards and their implementation details are also discussed.

4.
Solutions produced by the first generation of heuristics for the vehicle routeing problem are often far from optimal. Recent adaptations of local search improvement heuristics, such as tabu search, produce much better solutions but require increased computing time. However, there are situations where good solutions must be obtained quickly. The algorithm proposed in this paper yields solutions almost as good as those produced by tabu search adaptations, but at only a small fraction of their computing time. This heuristic can be seen as an improved version of the original petal heuristic. On 14 benchmark test problems, the proposed heuristic yields solutions whose values lie on average within 2.38% of the best known solution values.
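A minimal sketch of the petal idea (enumerate capacity-feasible "petal" routes via a polar sweep, then choose a covering subset); data structures and the greedy cover step are illustrative placeholders, not the paper's improved algorithm, which solves the selection step more carefully:

```python
import math

def petal_heuristic(depot, customers, demand, capacity, cost):
    """Toy petal heuristic.  Customers are (x, y) tuples; `demand` maps a
    customer to its load; `cost(route)` returns a route's length.  Assumes
    every single customer fits within vehicle capacity."""
    order = sorted(customers,
                   key=lambda c: math.atan2(c[1] - depot[1], c[0] - depot[0]))
    petals, n = [], len(order)
    for i in range(n):                      # every contiguous arc of the sweep
        load, j = 0, i
        while j < i + n:
            load += demand[order[j % n]]
            if load > capacity:
                break
            petals.append(tuple(order[k % n] for k in range(i, j + 1)))
            j += 1
    uncovered, routes = set(customers), []
    while uncovered:                        # greedy set cover over the petals
        best = min((p for p in petals if uncovered & set(p)),
                   key=lambda p: cost(p) / len(uncovered & set(p)))
        routes.append(best)
        uncovered -= set(best)
    return routes
```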

5.
Some properties of repeated hits and repeated explosions after the first explosion are investigated for a birth and death process with explosion. The properties of repeated hits after the first explosion can be expressed in terms of the properties of the first hit after the first explosion. Project supported by the National Natural Science Foundation of China (Grant No. 19761028) and the Centre of Researching Mathematics and Fostering Higher Talent, the Ministry of Education of China.
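For context, a classical explosion criterion from standard continuous-time Markov chain theory (not taken from this abstract): a birth and death process with birth rates λ_n > 0 and death rates μ_n explodes with positive probability if and only if

```latex
\[
  \sum_{n=1}^{\infty} \left( \frac{1}{\lambda_n}
    + \frac{\mu_n}{\lambda_n \lambda_{n-1}}
    + \cdots
    + \frac{\mu_n \mu_{n-1} \cdots \mu_1}{\lambda_n \lambda_{n-1} \cdots \lambda_0}
  \right) \;<\; \infty .
\]
```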

6.
Online test paper generation (Online-TPG) automatically generates a test paper online according to user specifications based on multiple assessment criteria; the generated test paper can then be attempted online by the user. Online-TPG is challenging because it is an NP-hard multi-objective optimization problem that must also satisfy the online-generation (runtime) requirement. In this paper, we propose an efficient multi-objective optimization approach based on a divide-and-conquer memetic algorithm (DAC-MA) for Online-TPG. Instead of solving the multi-objective constraints simultaneously, the set of constraints is divided into two subsets of relevant constraints, which can then be solved separately and effectively by the evolutionary computation and local search components of DAC-MA. Empirical results show that the proposed approach outperforms other TPG techniques in terms of runtime efficiency and paper quality.
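A minimal sketch of the divide-and-conquer structure described above, assuming hypothetical constraint and objective callables (the actual DAC-MA operators and constraint split are specified in the paper):

```python
import random

def dac_ma(questions, content_constraints, quality_objectives,
           paper_len=20, pop_size=30, generations=50):
    """Toy divide-and-conquer memetic search: one constraint subset is
    enforced as feasibility (divide), the other is optimized by evolution
    plus a local replacement move (conquer).  All parameters are placeholders."""
    def feasible(paper):
        return all(c(paper) for c in content_constraints)

    def score(paper):
        return sum(obj(paper) for obj in quality_objectives)

    # Phase 1: sample an initial population satisfying the first subset.
    pop, attempts = [], 0
    while len(pop) < pop_size and attempts < 10_000:
        attempts += 1
        paper = random.sample(questions, k=min(paper_len, len(questions)))
        if feasible(paper):
            pop.append(paper)

    # Phase 2: evolve on the remaining objectives with a local-search move.
    for _ in range(generations):
        child = max(pop, key=score)[:]
        child[random.randrange(len(child))] = random.choice(questions)
        worst = min(pop, key=score)
        if feasible(child) and score(child) > score(worst):
            pop[pop.index(worst)] = child
    return max(pop, key=score)
```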

7.
8.
Multi-dimensional classification aims at finding a function that assigns a vector of class values to a given vector of features. In this paper, this problem is tackled by a general family of models called multi-dimensional Bayesian network classifiers (MBCs). This probabilistic graphical model organizes class and feature variables into three subgraphs: a class subgraph, a feature subgraph, and a bridge (from classes to features) subgraph. Under the standard 0-1 loss function, the most probable explanation (MPE) must be computed; we provide theoretical results both for general MBCs and for MBCs decomposable into maximal connected components. Moreover, when computing the MPE, the vector of class values is traversed in a special ordering (a Gray code). For other loss functions defined in accordance with a decomposable structure, we derive theoretical results on how to minimize the expected loss. Besides these inference issues, the paper presents flexible algorithms for learning MBC structures from data based on filter, wrapper, and hybrid approaches. The cardinality of the search space is also given. New performance evaluation metrics adapted from the single-class setting are introduced. Experimental results on three benchmark data sets are encouraging, and the proposed methods outperform state-of-the-art algorithms for multi-label classification.
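For the binary-class case, a minimal sketch of Gray-code enumeration of class vectors, so that consecutive candidates differ in exactly one class value; the MBC-specific incremental MPE scoring from the paper is not reproduced here:

```python
def gray_code_vectors(n_classes):
    """Yield all binary class vectors of length n_classes in reflected
    Gray-code order: consecutive vectors differ in exactly one bit, so
    an MPE scorer can update its probability incrementally."""
    for i in range(1 << n_classes):
        g = i ^ (i >> 1)                       # reflected Gray code
        yield tuple((g >> b) & 1 for b in range(n_classes))

# Example: for 3 binary classes, successive vectors differ in one position.
for v in gray_code_vectors(3):
    print(v)
```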

9.
In this paper, we propose to explain Discounted Cumulative Gain (DCG) as the expectation of the total utility collected by a user, given a generative probabilistic model of how users browse the ranked result page of a search engine. We contrast this with a generalization of Average Precision, pAP, defined in Dupret and Piwowarski (2010) [13]. In both cases, user decision models coupled with Web search logs make it possible to estimate parameters that are usually left to the designer of the metric. We compare the user models for DCG and pAP at both the interpretation and the experimental level.

DCG and AP are metrics computed before a ranking function is exposed to users; as such, their role is to predict the function's performance. In contrast to such a prognostic metric, a diagnostic metric is computed after observing the user interactions with the result list; a commonly used diagnostic metric is, for example, the clickthrough rate at position 1. In this work we show that the same user model developed for DCG can be used to derive a diagnostic version of the metric. The same holds for pAP and for any metric with a proper user model.

We show that not only does this diagnostic view provide new information, it also allows us to define a new criterion for assessing a metric. In previous work based on user decision modeling, the performance of different metrics was compared indirectly, in terms of the ability of the associated user model to predict future user actions. Here we propose a new and more direct criterion based on the ability of the prognostic version of the metric to predict the diagnostic performance.
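For reference, one standard form of the DCG definition that the expectation-based reading reinterprets (rel_i is the graded relevance at rank i; this common formula is not quoted from the paper itself):

```latex
\[
  \mathrm{DCG}@k \;=\; \sum_{i=1}^{k} \frac{2^{rel_i} - 1}{\log_2(i+1)}.
\]
```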

10.
This article presents a survey of techniques for ranking results in search engines, with emphasis on link-based ranking methods and the PageRank algorithm. The problem of selecting, in relation to a user search query, the most relevant documents from an unstructured source such as the WWW is discussed in detail. The need for extending classical information retrieval techniques such as Boolean searching and vector space models with link-based ranking methods is demonstrated. The PageRank algorithm is introduced, and its numerical and spectral properties are discussed. The article concludes with an alternative means of computing PageRank, along with some example applications of this new method.
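A minimal power-iteration sketch of PageRank in its standard formulation with damping factor d (the article's alternative computation method is not shown here):

```python
def pagerank(links, d=0.85, tol=1e-10):
    """Power iteration for PageRank.  `links[u]` is the list of pages
    that page u links to; dangling pages spread their mass uniformly."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    while True:
        new = {p: (1.0 - d) / n for p in pages}
        for u in pages:
            out = links[u]
            if out:
                share = d * rank[u] / len(out)
                for v in out:
                    new[v] += share
            else:                       # dangling node
                for v in pages:
                    new[v] += d * rank[u] / n
        if sum(abs(new[p] - rank[p]) for p in pages) < tol:
            return new
        rank = new

# Tiny example: 'a' links to 'b', 'b' links to 'a' and 'c', 'c' to 'a'.
print(pagerank({'a': ['b'], 'b': ['a', 'c'], 'c': ['a']}))
```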

11.
We propose a variable selection procedure for model-based clustering using multilocus genotype data. Indeed, it may happen that some loci are not relevant for clustering into statistically different populations. Inferring the number K of clusters and the relevant clustering subset S of loci is cast as a model selection problem. The competing models are compared using penalized maximum likelihood criteria. Under weak assumptions on the penalty function, we prove the consistency of the resulting estimator $(\widehat{K}_n, \widehat{S}_n)$. An associated algorithm named Mixture Model for Genotype Data (MixMoGenD) has been implemented in C++ and is available at http://www.math.u-psud.fr/~toussile. To avoid an exhaustive search for the optimum model, we propose a modified backward-stepwise algorithm, which enables a better search of the optimum model among all possible cardinalities of S. We present numerical experiments on simulated and real datasets that highlight the value of our loci selection procedure.
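Schematically, model selection of this penalized-likelihood type takes the following generic form (the paper's precise assumptions on the penalty are not reproduced here):

```latex
\[
  (\widehat{K}_n, \widehat{S}_n)
  \;=\; \operatorname*{arg\,max}_{(K, S)}
  \Bigl\{ \log \widehat{L}_n(K, S) \;-\; \mathrm{pen}_n(K, S) \Bigr\}.
\]
```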

12.
This paper proposes an information retrieval (IR) model based on possibilistic directed networks. The relevance of a document with respect to a query is interpreted through two degrees: necessity and possibility. The necessity degree evaluates the extent to which a given document is relevant to a query, whereas the possibility degree serves to eliminate irrelevant documents. This new interpretation of relevance led us to revisit the term-weighting scheme by explicitly distinguishing between informative and non-informative terms in a document. Experiments carried out on three standard TREC collections show the effectiveness of the model.
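For background, the standard duality of possibility theory that underlies the two degrees, in standard notation (not taken from the paper):

```latex
% With a possibility distribution \pi over outcomes \omega:
\[
  \Pi(A) \;=\; \sup_{\omega \in A} \pi(\omega), \qquad
  N(A) \;=\; 1 - \Pi(\bar{A}),
\]
% so an event is necessary to the degree that its complement is impossible.
```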

13.
This paper presents a model to assist in setting the daily production rates of an offshore oilfield so as to achieve a quarterly production target. The produced crude oil is frequently accompanied by gas, which has much lower value. Since there are environmental limits on the amount of gas that can be flared to waste, problems in gas processing can very quickly limit oil output. With the onset of natural decline in many North Sea oilfields, crude output lost as a result of gas constraints cannot be made up within the same planning period. This makes the ultimate output, and hence the reward, more sensitive to gas-processing problems. The model uses a stochastic dynamic programming algorithm to maximize the financial reward. Gas-related limiting factors and oil and gas plant downtime are taken into account. The user is allowed to interrogate the model and interact with it when major unforeseen restrictions are imposed on gas capacities. The output includes the optimal production rate, together with the probability and cost of not achieving the target.
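Schematically, such a stochastic dynamic program maximizes expected reward via a Bellman recursion of the following generic form (generic notation; in the paper the state and action spaces encode production rates, gas capacities, and downtime):

```latex
\[
  V_t(s) \;=\; \max_{a \in A_t(s)} \;
  \mathbb{E}\bigl[\, r_t(s, a) + V_{t+1}(S_{t+1}) \;\bigm|\; S_t = s,\ a \,\bigr],
  \qquad V_{T+1} \equiv 0 .
\]
```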

14.
This paper deals with the two-noisy-versus-one-silent duel, which is still open, as pointed out by Styszyński (Ref. 1). Player I has a noisy gun with two bullets, and player II has a silent gun with one bullet. Each player fires his bullets at his opponent at any time in [0, 1]. The accuracy function (the probability that one player hits his opponent if he fires at time $t$) is $p(t) = t$ for each player. If player I hits player II without being hit himself before, the payoff of the duel is +1; if player I is hit by player II without hitting player II before, the payoff is taken to be $-1$. In this paper, we determine the optimal strategies and the value of the game. The optimal strategy for player II depends explicitly on the firing moment of player I's first shot.

15.
Most search service providers, such as Lycos and Google, either produce irrelevant search results or present unstructured company listings to consumers. To overcome these two shortcomings, search service providers such as GoTo.com have developed mechanisms for firms to advertise their services and for consumers to search for the right services. To provide relevant search results, each firm that wishes to advertise on the GoTo site must specify a set of keywords. To develop structured company listings, each firm bids for priority listing in the search results that appear on the GoTo site. Since the search results appear in descending order of bid price, each firm has some control over the position at which it appears in the list resulting from a search. In this paper, we present a one-stage game for two firms that captures the advertising mechanism of such a search service provider. This model enables us to examine a firm's optimal bidding strategy and to evaluate the impact of various parameters on that strategy. Moreover, we analyze the conditions under which all firms would increase their bids in equilibrium. These conditions could help the service provider develop mechanisms that entice firms to submit higher bids.
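A toy sketch of a bid-ranked listing, assuming a hypothetical first-price, pay-per-bid payment rule (the abstract does not state the exact payment rule, and the values and click volumes below are invented for illustration):

```python
def position_payoffs(bids, value_per_click, clicks_by_position):
    """Firms are sorted by descending bid; each pays its own bid per click.
    Returns each firm's payoff = (value - bid) * clicks at its position."""
    ranking = sorted(bids, key=bids.get, reverse=True)
    return {firm: (value_per_click[firm] - bids[firm]) * clicks_by_position[pos]
            for pos, firm in enumerate(ranking)}

# Two firms competing for two slots with different click volumes.
print(position_payoffs({'A': 0.30, 'B': 0.25},
                       {'A': 0.50, 'B': 0.60},
                       {0: 100, 1: 60}))
```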

16.
Markov Chain Monte Carlo (MCMC) methods may be employed to search for a probability distribution over a bounded space of function arguments, in order to estimate which argument(s) optimize(s) an objective function. This search-based optimization requires sampling the suitability, or fitness, of arguments in the search space. When the objective function or the fitness of arguments varies with time, significant exploration of the search space is required. Search efficiency then becomes a more relevant measure of the usefulness of an MCMC method than traditional measures such as speed of convergence to the stationary distribution and the asymptotic variance of stationary distribution estimates. Search efficiency refers to how quickly prior information about the search space is traded off for savings in search effort; it is optimal when the entropy of the probability distribution over the space during search is maximized. Whereas the Metropolis case of the Hastings MCMC algorithm with fixed candidate generation is optimal with respect to the asymptotic variance of stationary distribution estimates, this paper proves that Barker's case is optimal with respect to search efficiency if the fitness of the arguments in the search space is characterized by an exponential function. The latter instance of optimality is beneficial for time-varying optimization that is also model-independent.
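For reference, the two acceptance rules being compared, in a minimal sketch for a symmetric proposal (the acceptance formulas are standard; β and g below are hypothetical placeholders for the exponential-fitness setting the abstract mentions):

```python
import math, random

def metropolis_accept(f_current, f_candidate):
    """Metropolis rule: accept with probability min(1, f(y)/f(x))."""
    return random.random() < min(1.0, f_candidate / f_current)

def barker_accept(f_current, f_candidate):
    """Barker rule: accept with probability f(y) / (f(x) + f(y))."""
    return random.random() < f_candidate / (f_current + f_candidate)

# Exponential fitness f(x) = exp(beta * g(x)); symmetric Gaussian candidate.
beta, g = 2.0, lambda x: -abs(x)
x = 0.5
y = x + random.gauss(0.0, 0.3)
fx, fy = math.exp(beta * g(x)), math.exp(beta * g(y))
print(metropolis_accept(fx, fy), barker_accept(fx, fy))
```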

17.
Topic analysis of search engine user queries is an important task, since successful exploitation of query topics can inform the design of new information retrieval algorithms for more efficient search engines. Identifying topic changes within a user search session is a key issue in this analysis. This study presents an application of Markov chains to search engine research: automatically identifying topic changes in a user session by using statistical characteristics of queries, such as time intervals, query reformulation patterns, and the continuation/shift status of the previous query. The findings show that Markov chains provide fairly successful results for automatic new-topic identification, with high estimation accuracy for both topic continuations and topic shifts. Copyright © 2009 John Wiley & Sons, Ltd.
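A minimal sketch of the underlying idea: estimate a two-state (continuation/shift) Markov chain from labeled query transitions. The feature set here is reduced to the previous state only; the study also conditions on time intervals and reformulation patterns:

```python
from collections import Counter

def fit_markov(states):
    """Estimate first-order transition probabilities from a sequence of
    'C' (topic continuation) / 'S' (topic shift) labels."""
    pairs = Counter(zip(states, states[1:]))
    totals = Counter(states[:-1])
    return {(a, b): pairs[(a, b)] / totals[a] for (a, b) in pairs}

session = list("CCCSCCSSCCC")       # toy labeled session
P = fit_markov(session)
print(P)   # e.g. P[('C', 'S')] = probability of a shift after a continuation
```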

18.
Adopting a proper cache document replacement policy is critical to the performance of a caching system. Among existing cache document replacement policies, no single policy surpasses all the others in every case. Moreover, the most suitable policy for a caching system is usually chosen from among the existing policies, which cannot guarantee that the chosen policy is optimal. These observations motivate us to construct a cache document replacement policy whose content can be tailored to the specific requirements of a caching system. In this study, an optimal linear combination (OLC) cache document replacement policy tailored to the requirements of the caching system is derived. To evaluate the effectiveness of the proposed methodology, an experimental e-commerce (EC) website was constructed, and the log file of the web server was used as the data source to evaluate the performance of various cache document replacement policies under different cache sizes. In our simulation experiments, the OLC policies outperformed the traditional policies, increasing the hit rate and the byte hit rate by up to 7% and 11%, respectively.
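A minimal sketch of the linear-combination idea: score each cached document by a weighted sum of signals echoing classical policies. The signals and weights below are hypothetical; the paper's contribution is finding the optimal weights for the target system:

```python
def olc_evict(cache, weights):
    """Evict the document with the lowest combined score.  `cache` maps
    doc ids to feature dicts; signals echo classic policies: recency
    (LRU), frequency (LFU), and size (smaller docs score higher)."""
    def score(doc):
        f = cache[doc]
        return (weights['recency'] * f['last_access']
                + weights['frequency'] * f['hits']
                + weights['size'] * (1.0 / f['bytes']))
    return min(cache, key=score)

cache = {'a.html': {'last_access': 100, 'hits': 9, 'bytes': 2048},
         'b.css':  {'last_access': 180, 'hits': 2, 'bytes': 512}}
print(olc_evict(cache, {'recency': 0.5, 'frequency': 0.3, 'size': 0.2}))
```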

19.
We describe our design and implementation of a dual-objective course-timetabling system for the Science Division at Rollins College, and we compare the results of our system with the actual timetable that was manually constructed for the Fall 2009 school term. The course timetables at Rollins, as at most colleges in the U.S., must be created before students enroll in classes, and our “wish list” of pairs of classes that we would like to offer in non-overlapping timeslots is considerably larger than if we considered only those that absolutely must not overlap. This necessitates assigning different levels of conflict severity to the class pairs and setting our first objective to minimize total conflict severity. Our second objective is to create timetables that result in relatively compact schedules for instructors and students. In addition to automatic construction, a second, equally important component of our system is a graphical user interface (GUI) that enables the user to participate in the input, construction, and modification of a timetable. In the input phase, course incompatibility, instructor and student preferences, and the desire for compact schedules all require subjective judgments; the GUI allows the user to quantify this information and convert it to the weighted-graph model. In the construction and modification phase, the GUI enables the user to directly assign or reassign courses to timeslots while guided by heuristics.
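A minimal sketch of the first objective: sum the severity of every wish-list pair assigned to the same timeslot. The severity values and the equality-based overlap test are simplified placeholders for the paper's weighted-graph model:

```python
def total_conflict_severity(assignment, wish_list):
    """`assignment` maps course -> timeslot; `wish_list` maps a frozenset
    of two courses -> conflict severity (higher = worse to overlap)."""
    return sum(severity
               for pair, severity in wish_list.items()
               for a, b in [tuple(pair)]
               if assignment[a] == assignment[b])   # same timeslot = conflict

assignment = {'Calc1': 'MWF9', 'Phys1': 'MWF9', 'Chem1': 'TR10'}
wish_list = {frozenset({'Calc1', 'Phys1'}): 3.0,    # strongly keep apart
             frozenset({'Calc1', 'Chem1'}): 1.0}
print(total_conflict_severity(assignment, wish_list))  # -> 3.0
```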

20.
Adaptive computation using adaptive meshes is now recognized as essential for solving complex PDE problems. This computation requires, at each step, the definition of a continuous metric field to govern the generation of the adapted meshes. In practice, via an appropriate a posteriori error estimation, metrics are calculated at the vertices of the computational domain mesh. In order to obtain a continuous metric field, the discrete field is interpolated over the whole domain mesh. In this Note, a new method for interpolating discrete metric fields, based on a so-called “natural decomposition” of metrics, is introduced. The proposed method builds on known matrix decompositions and is computationally robust and efficient. Qualitative comparisons with classical methods are made to show the relevance of this methodology.
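One common decomposition-based approach, shown as a hedged illustration (not necessarily the Note's “natural decomposition”): interpolate symmetric positive-definite metric tensors through their matrix logarithms, which keeps the interpolant positive definite:

```python
import numpy as np
from scipy.linalg import expm, logm

def interpolate_metrics(metrics, weights):
    """Log-Euclidean interpolation of SPD metric tensors: a weighted mean
    in log space, mapped back with the matrix exponential.  `weights`
    are barycentric coordinates of the query point (summing to 1)."""
    log_mean = sum(w * logm(M) for w, M in zip(weights, metrics))
    return expm(log_mean)

M0 = np.diag([1.0, 100.0])            # anisotropic metric at vertex 0
M1 = np.diag([100.0, 1.0])            # anisotropic metric at vertex 1
print(interpolate_metrics([M0, M1], [0.5, 0.5]))
```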

