Similar Documents
20 similar documents found (search time: 765 ms)
1.
In this paper we present a simple dynamization method that preserves the query and storage costs of a static data structure and ensures reasonable update costs. In this method, the majority of data elements are maintained in a single data structure, and the updates are handled using smaller auxiliary data structures. We analyze the query, storage, and amortized update costs for the dynamic version of a static data structure in terms of a function f, such that f(n) < n, that bounds the sizes of the auxiliary data structures (where n is the number of elements in the data structure). The conditions on f for minimal (with respect to asymptotic upper bounds) amortized update costs are then obtained. The proposed method is shown to be particularly suited for the cases where the merging of two data structures is more efficient than building the resultant data structure from scratch. Its effectiveness is illustrated by applying it to a class of data structures that have linear merging cost; this class consists of data structures such as Voronoi diagrams, K-d trees, quadtrees, multiple attribute trees, etc.
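The scheme can be sketched in a few lines. This is a minimal toy version, not the paper's construction: a sorted list stands in for the static structure (e.g. a K-d tree), √n is an illustrative choice of f(n), and all names are made up for illustration.

```python
import bisect
import heapq

class DynamizedSet:
    """Toy sketch of the dynamization idea: keep most elements in one
    'static' structure and buffer updates in a small auxiliary structure,
    merging once the buffer exceeds f(n)."""

    def __init__(self, f=lambda n: max(1, int(n ** 0.5))):
        self.f = f          # bound on the auxiliary structure's size
        self.main = []      # the large static structure (kept sorted)
        self.aux = []       # small auxiliary buffer for recent inserts

    def insert(self, x):
        bisect.insort(self.aux, x)
        # Rebuild when the auxiliary structure outgrows f(n); heapq.merge
        # gives the linear merge that the linear-merging-cost class exploits.
        if len(self.aux) > self.f(len(self.main)):
            self.main = list(heapq.merge(self.main, self.aux))
            self.aux = []

    def contains(self, x):
        # Query cost = static-structure query plus a search in the small buffer.
        for s in (self.main, self.aux):
            i = bisect.bisect_left(s, x)
            if i < len(s) and s[i] == x:
                return True
        return False
```

The point of the design is that queries pay only a small overhead (the auxiliary structure is bounded by f(n)), while the expensive rebuild is amortized over the insertions that filled the buffer.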

2.
The numerical differentiation of data divides naturally into two distinct problems:
  1. the differentiation of exact data, and
  2. the differentiation of non-exact (experimental) data.
In this paper, we examine the latter. Because methods developed for exact data are based on abstract formalisms which are independent of the structure within the data, they prove, except for the regularization procedure of Cullum, to be unsatisfactory for non-exact data. We therefore adopt the point of view that satisfactory methods for non-exact data must take the structure within the data into account in some natural way, and use the concepts of regression and spectrum analysis as a basis for the development of such methods. The regression procedure is used when either the structure within the non-exact data is known on independent grounds, or the assumptions which underlie the spectrum analysis procedure [viz., stationarity of the (detrended) data] do not apply. In this latter case, the data could be modelled using splines. The spectrum analysis procedure is used when the structure within the non-exact data (or a suitable transformation of it, where the transformation can be differentiated exactly) behaves as if it were generated by a stationary stochastic process. By proving that the regularization procedure of Cullum is equivalent to a certain spectrum analysis procedure, we derive a fast Fourier transform implementation for regularization in which an acceptable value of the regularization parameter is estimated directly from a time series formulation based on this equivalence. Compared with the regularization procedure, which involves O(n^3) operations (where n is the number of data points), the fast Fourier transform implementation involves only O(n log n).
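The FFT idea can be illustrated with a crude sketch: differentiate periodic samples in the Fourier domain and discard high frequencies, the cut-off playing the role of the regularization parameter. This is only illustrative of the mechanism, not the paper's actual parameter-selection procedure.

```python
import numpy as np

def fft_derivative(y, dx, keep=10):
    """Spectral derivative with a hard low-pass cut-off (a crude stand-in
    for choosing the regularization parameter). Assumes periodic data."""
    n = len(y)
    k = np.fft.fftfreq(n, d=dx) * 2 * np.pi   # angular wavenumbers
    Y = np.fft.fft(y)
    Y[np.abs(k) > keep] = 0.0                 # drop noisy high frequencies
    return np.fft.ifft(1j * k * Y).real       # multiply by ik, invert

# On a smooth periodic signal the result matches the exact derivative.
x = np.linspace(0, 2 * np.pi, 128, endpoint=False)
dydx = fft_derivative(np.sin(x), x[1] - x[0])
```

Each forward/inverse transform costs O(n log n), which is where the speed-up over the O(n^3) direct regularization comes from.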

3.
1. Introduction. In this paper we consider the Cauchy problem for a class of nonhomogeneous Navier-Stokes equations in the infinite cylinder, with … Given … satisfying div … in the distribution sense, we seek a solution vector … and a pressure function P(t, x) such that …, where … is a nonlinear vector-valued fun…

4.
This paper proposes a novel ant colony optimisation (ACO) algorithm tailored for the hierarchical multi-label classification problem of protein function prediction. This problem is a very active research field, given the large increase in the number of uncharacterised proteins available for analysis and the importance of determining their functions in order to improve the current biological knowledge. Since it is known that a protein can perform more than one function and many protein functional-definition schemes are organised in a hierarchical structure, the classification problem in this case is an instance of a hierarchical multi-label problem. In this type of problem, each example may belong to multiple class labels and class labels are organised in a hierarchical structure—either a tree or a directed acyclic graph structure. It presents a more complex problem than conventional flat classification, given that the classification algorithm has to take into account hierarchical relationships between class labels and be able to predict multiple class labels for the same example. The proposed ACO algorithm discovers an ordered list of hierarchical multi-label classification rules. It is evaluated on sixteen challenging bioinformatics data sets involving hundreds or thousands of class labels to be predicted and compared against state-of-the-art decision tree induction algorithms for hierarchical multi-label classification.

5.
Nonlinear dimensionality reduction (NLDR) algorithms such as Isomap, LLE, and Laplacian Eigenmaps address the problem of representing high-dimensional nonlinear data in terms of low-dimensional coordinates which represent the intrinsic structure of the data. This paradigm incorporates the assumption that real-valued coordinates provide a rich enough class of functions to represent the data faithfully and efficiently. On the other hand, there are simple structures which challenge this assumption: the circle, for example, is one-dimensional, but its faithful representation requires two real coordinates. In this work, we present a strategy for constructing circle-valued functions on a statistical data set. We develop a machinery of persistent cohomology to identify candidates for significant circle-structures in the data, and we use harmonic smoothing and integration to obtain the circle-valued coordinate functions themselves. We suggest that this enriched class of coordinate functions permits a precise NLDR analysis of a broader range of realistic data sets.

6.
Multivariate data modelling problems consist of a number of nodes with associated function (class) values. The main purpose of these problems is to construct an analytical model to represent the characteristics of the problem under consideration. Because the devices, tools, and/or algorithms used to collect the data may be imperfect or of limited capability, the data set is likely to contain unavoidable errors. That is, each component of the data is reliable only within an interval which contains the data value. To this end, when an analytical structure is needed for the given data, a band structure should be determined instead of a unique structure. As the multivariance of the given data set increases, divide-and-conquer methods become important in multivariate modelling problems. HDMR-based methods allow us to partition the given multivariate data into less variate data sets to reduce the complexity of the given problem. This paper focuses on the Interval Factorized HDMR method, developed to determine an approximate band structure for a given multivariate data modelling problem having uncertainties in its nodes and function values.

7.
Exploring incomplete data using visualization techniques
Visualization of incomplete data makes it possible to explore the data and the structure of the missing values simultaneously. This is helpful for learning about the distribution of the incomplete information in the data and for identifying possible structures of the missing values and their relation to the available information. The main goal of this contribution is to stress the importance of exploring missing values using visualization methods and to present a collection of such visualization techniques for incomplete data, all of which are implemented in the R package VIM. By providing such functionality for this widely used statistical environment, visualization of missing values, imputation and data analysis can all be done from within R without the need for additional software.

8.
We consider a class of symmetric tridiagonal matrices which may be viewed as perturbations of Toeplitz matrices. The Toeplitz structure is destroyed since two elements on each off-diagonal are perturbed. Based on a careful analysis, we derive sharp bounds for the extremal eigenvalues of this class of matrices in terms of the original data of the given matrix. In this way, we also obtain a lower bound for the smallest singular value of certain matrices. Some numerical results indicate that our bounds are extremely good.
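As a concrete baseline for this setting: the unperturbed symmetric tridiagonal Toeplitz matrix with diagonal a and off-diagonals b has the well-known closed-form eigenvalues a + 2b cos(kπ/(n+1)), k = 1..n, which is what the paper's bounds perturb. The numbers below are illustrative.

```python
import numpy as np

# Symmetric tridiagonal Toeplitz matrix: diagonal a, off-diagonals b.
n, a, b = 8, 2.0, -1.0
T = (np.diag(np.full(n, a))
     + np.diag(np.full(n - 1, b), 1)
     + np.diag(np.full(n - 1, b), -1))

# Closed-form spectrum: a + 2b*cos(k*pi/(n+1)), k = 1..n.
computed = np.sort(np.linalg.eigvalsh(T))
analytic = np.sort(a + 2 * b * np.cos(np.arange(1, n + 1) * np.pi / (n + 1)))
```

Perturbing two entries on each off-diagonal destroys this formula; the paper's contribution is sharp bounds on how far the extremal eigenvalues can then move.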

9.
In practical data mining tasks, high-dimensional data has to be analyzed. In most cases it is very informative to map and visualize the hidden structure of a complex data set in a low-dimensional space. In this paper a new class of mapping algorithms is defined. These algorithms combine topology representing networks and different nonlinear mapping algorithms. While the former methods aim to quantize the data and disclose the real structure of the objects, the nonlinear mapping algorithms are able to visualize the quantized data in the low-dimensional vector space. In this paper, techniques based on these methods are gathered, and the results of a detailed analysis performed on them are shown. The primary aim of this analysis is to examine the preservation of distances and neighborhood relations of the objects. Preservation of neighborhood relations was analyzed in both local and global environments. To evaluate the main properties of the examined methods, we show the outcome of the analysis based on both synthetic and real benchmark examples.

10.
This paper establishes a general ABC inventory classification system as the foundation for a normative model of the maintenance cost structure and stock turnover characteristics of a large, multi-item inventory system with constant demand. For any specified number of inventory classes, the model allows expression of the overall system combined ordering and holding cost in terms of (i) the re-ordering frequencies for the items in each inventory class and (ii) the inventory class structure, that is, the proportion of the total system's items that are in each inventory class. The model yields a minimum total maintenance cost function, which reflects the effect of class structure on inventory maintenance costs and turnover. If the Pareto curve (a.k.a. Distribution-by-value function) for the inventory system can be expressed (or approximated) analytically, the model can also be used to determine an optimal class structure, as well as an appropriate number of inventory classes. A special case of the model produces a simply structured, class-based ordering policy for minimizing total inventory maintenance costs. Using real data, the cost characteristics of this policy are compared to those of a heuristic, commonly used by managers of multi-item inventory systems. This cost comparison, expressed graphically, underscores the need for normative modelling approaches to the problem of inventory cost management in large, multi-item systems.
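The classical ABC scheme the paper generalises can be sketched as a cut of the Pareto (distribution-by-value) curve. This is a hedged illustration: the 80% / 95% cut-offs are the conventional textbook choice, not thresholds taken from the paper, whose point is precisely to optimise the class structure rather than fix it.

```python
def abc_classify(values, a_cut=80.0, b_cut=95.0):
    """Assign 'A'/'B'/'C' classes by cumulative share of annual usage value.
    `values` are per-item annual usage values; cut-offs are in percent."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    total = sum(values)
    classes, cum = [None] * len(values), 0.0
    for i in order:  # walk down the Pareto curve, item by item
        cum += 100.0 * values[i] / total
        classes[i] = 'A' if cum <= a_cut else ('B' if cum <= b_cut else 'C')
    return classes
```

In the paper's model, both the number of classes and the proportions of items in each class become decision variables tied to the re-ordering frequencies, instead of fixed percentages like these.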

11.
Multi-dimensional classification aims at finding a function that assigns a vector of class values to a given vector of features. In this paper, this problem is tackled by a general family of models, called multi-dimensional Bayesian network classifiers (MBCs). This probabilistic graphical model organizes class and feature variables as three different subgraphs: class subgraph, feature subgraph, and bridge (from class to features) subgraph. Under the standard 0-1 loss function, the most probable explanation (MPE) must be computed, for which we provide theoretical results in both general MBCs and in MBCs decomposable into maximal connected components. Moreover, when computing the MPE, the vector of class values is covered by following a special ordering (gray code). Under other loss functions defined in accordance with a decomposable structure, we derive theoretical results on how to minimize the expected loss. Besides these inference issues, the paper presents flexible algorithms for learning MBC structures from data based on filter, wrapper and hybrid approaches. The cardinality of the search space is also given. New performance evaluation metrics adapted from the single-class setting are introduced. Experimental results with three benchmark data sets are encouraging, and they outperform state-of-the-art algorithms for multi-label classification.

12.
Ariyawansa and Zhu have proposed a new class of optimization problems termed stochastic semidefinite programs to handle data uncertainty in applications leading to (deterministic) semidefinite programs. For stochastic semidefinite programs with finite event space, they have also derived a class of volumetric barrier decomposition algorithms, and proved polynomial complexity of certain members of the class. In this paper, we consider homogeneous self-dual algorithms for stochastic semidefinite programs with finite event space. We show how the structure in such problems may be exploited so that the algorithms developed in this paper have complexity similar to those of the decomposition algorithms mentioned above.

13.
In this article, we study analytic continuation and global structure of singularities of the solution of the Cauchy problem with entire functions data or singular data for certain differential operators with principal part of polynomial coefficients in the complex domain. This study gives a complete proof of the results of a preceding article [J. Fixed Point Theory Appl. 1 (2007), 209–220].

14.
Bayesian networks (BNs) are widely used graphical models for drawing statistical inference about directed acyclic graphs. We present here Graph_sampler, a fast, free, C-language software package for structural inference on BNs. Graph_sampler uses a fully Bayesian approach in which the marginal likelihood of the data and prior information about the network structure are considered. The software can handle both continuous and discrete data, and two different models are formulated depending on the data type. It also provides a wide variety of structure priors, which can describe either global or local properties of the graph structure. Depending on the type of structure prior selected, a wide range of prior values can be used, making the prior either informative or uninformative. We propose a new and much faster jumping-kernel strategy for the Metropolis-Hastings algorithm. The distributed C source code is very compact and fast, and uses little memory and disk storage. We performed several analyses on different simulated data sets and on synthetic as well as real networks to assess the performance of Graph_sampler.

15.
A sub-class of phase-type distributions is defined in terms of a Markov process with sequential transitions between transient states and transitions from these states to absorption. Such distributions form a very rich class; they can be fitted to data, and any structure revealed by the parameter estimates can be used to develop more parsimonious re-parametrizations. Several example data sets are used as illustrations.
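The sequential structure can be made concrete with a small numerical example. This is an illustrative sketch of the general idea, not the paper's parametrization: transient states visited in order, exponential sojourns, and absorption at the end; with all mass starting in state 1 and no early absorption this reduces to an Erlang distribution. The rates are made up.

```python
import numpy as np

# Sub-generator T of a 3-phase sequential process: sojourn rate 2 in each
# phase, movement only to the next phase, absorption after the last one.
rates = np.array([2.0, 2.0, 2.0])
T = np.diag(-rates) + np.diag(rates[:-1], 1)

# Phase-type mean time to absorption: alpha @ inv(-T) @ 1,
# where alpha is the initial distribution over transient states.
alpha = np.array([1.0, 0.0, 0.0])
mean_absorption = alpha @ np.linalg.inv(-T) @ np.ones(3)
```

Here the analytic mean is 3 × (1/2) = 1.5, the Erlang(3, 2) mean, which the matrix formula reproduces.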

16.
Nonparametric kernel regression methods have recently been applied to the analysis of longitudinal data (Lin and Carroll, 2000). A controversial question is whether the correlation among longitudinal observations needs to be taken into account in nonparametric kernel regression. Lin and Carroll (2000) showed that the kernel estimator based on working independence (i.e., ignoring the correlation) is (asymptotically) the most efficient within a class of kernel GEE estimators. Based on a mixed-effects model approach, the authors propose a different class of kernel estimators that naturally and effectively incorporates the correlation structure of longitudinal data. The proposed estimators achieve the same asymptotic efficiency as Lin and Carroll's estimator and perform better in finite samples. The method readily yields nonparametric curve estimates for both the population and individual subjects. The proposed estimators have good statistical properties and are easy to implement, making them attractive to practitioners.
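For reference, the working-independence baseline that the abstract contrasts against is essentially a plain Nadaraya-Watson kernel smoother, which weights observations by a kernel and ignores any within-subject correlation. This is a generic sketch with illustrative kernel and bandwidth choices, not the paper's mixed-effects estimator.

```python
import numpy as np

def nw_estimate(x0, x, y, h=0.3):
    """Nadaraya-Watson estimate of E[y | x = x0] with a Gaussian kernel.
    Treats all observations as independent (working independence)."""
    w = np.exp(-0.5 * ((x - x0) / h) ** 2)   # kernel weights around x0
    return np.sum(w * y) / np.sum(w)
```

Because the weights are normalized, a constant response is reproduced exactly at any query point, which is a handy sanity check for any kernel smoother.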

17.
Albanese et al. (2003) and Avellaneda and Zhu (2001) developed the framework of the credit barrier model. They provide special solutions to the model in the case of a simple stochastic structure. The technical aspect of the model remains wide open for general stochastic structures, which are crucial when the model must be calibrated against large amounts of empirical data. This paper provides a technical solution to this problem using radial basis functions (RBFs). It discusses the numerical implementation of the credit barrier model using the RBF method, and demonstrates that the RBF method is numerically tractable for this problem and allows the model a richer stochastic structure capable of capturing relevant market information.
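The core RBF ingredient can be sketched independently of the credit model: expand the unknown function in Gaussian radial basis functions centred at the data sites and solve the resulting linear system. This is a minimal 1-D interpolation sketch; the shape parameter and data are illustrative, and the paper applies the same machinery to the barrier-model PDE rather than to plain interpolation.

```python
import numpy as np

def rbf_fit(x, y, eps=5.0):
    """Solve the Gaussian-RBF collocation system A @ w = y,
    with A_ij = exp(-(eps*(x_i - x_j))^2)."""
    A = np.exp(-(eps * (x[:, None] - x[None, :])) ** 2)
    return np.linalg.solve(A, y)

def rbf_eval(xq, x, w, eps=5.0):
    """Evaluate the fitted RBF expansion at query points xq."""
    return np.exp(-(eps * (xq[:, None] - x[None, :])) ** 2) @ w

x = np.linspace(0.0, 1.0, 9)
y = np.sin(2 * np.pi * x)
w = rbf_fit(x, y)
```

The attraction of RBF methods in this setting is that they are meshfree: the centres can be placed wherever market data lives, at the cost of solving a dense linear system.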

18.
Many existing statistical and machine learning tools for social network analysis focus on a single level of analysis. Methods designed for clustering optimize a global partition of the graph, whereas projection-based approaches (e.g., the latent space model in the statistics literature) represent in rich detail the roles of individuals. Many pertinent questions in sociology and economics, however, span multiple scales of analysis. Further, many questions involve comparisons across disconnected graphs that will inevitably be of different sizes, either due to missing data or to the inherent heterogeneity of real-world networks. We propose a class of network models that represent network structure on multiple scales and facilitate comparison across graphs with different numbers of individuals. These models differentially invest modeling effort within subgraphs of high density, often termed communities, while maintaining a parsimonious structure between said subgraphs. We show that our model class is projective, highlighting an ongoing discussion in the social network modeling literature on the dependence of inference paradigms on the size of the observed graph. We illustrate the utility of our method using data on household relations from Karnataka, India. Supplementary material for this article is available online.

19.
Similarity search in texts, notably in biological sequences, has received substantial attention in the last few years. Numerous filtration and indexing techniques have been created in order to speed up the solution of the problem. However, previous filters were designed for speeding up pattern matching, or for finding repetitions between two strings or repetitions occurring twice in the same string. In this paper, we present an algorithm called Nimbus for filtering strings prior to finding repetitions occurring twice or more in a string, or in two or more strings. Nimbus uses gapped seeds that are indexed with a new data structure, called a bi-factor array, which is also presented in this paper. Experimental results show that the filter can be very efficient: by preprocessing with Nimbus a data set in which one wants to find functional elements using a multiple local alignment tool such as Glam, the overall execution time can be reduced from 7.5 hours to 2 minutes.
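The gapped-seed filtering idea can be sketched in miniature: extract seeds through a 0/1 mask ('1' = keep this position, '0' = gap), index them, and keep only seeds occurring at least twice as candidate repetition regions. This toy uses a plain dictionary; the real tool indexes seeds with its bi-factor array, and the mask and data here are purely illustrative.

```python
from collections import defaultdict

def gapped_seed_index(s, mask):
    """Map each gapped seed of `s` (read through `mask`) to the list of
    start positions where it occurs, keeping only repeated seeds."""
    keep = [i for i, c in enumerate(mask) if c == '1']
    index = defaultdict(list)
    for p in range(len(s) - len(mask) + 1):
        seed = ''.join(s[p + i] for i in keep)
        index[seed].append(p)
    # Seeds seen at least twice flag candidate repetitions for the
    # (expensive) alignment stage downstream.
    return {seed: pos for seed, pos in index.items() if len(pos) >= 2}
```

The filtering pay-off is that the alignment tool only needs to examine the regions flagged here, instead of the whole input.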

20.
The importance of unsupervised clustering and topic modeling is well recognized with ever-increasing volumes of text data available from numerous sources. Nonnegative matrix factorization (NMF) has proven to be a successful method for cluster and topic discovery in unlabeled data sets. In this paper, we propose a fast algorithm for computing NMF using a divide-and-conquer strategy, called DC-NMF. Given an input matrix where the columns represent data items, we build a binary tree structure of the data items using a recently-proposed efficient algorithm for computing rank-2 NMF, and then gather information from the tree to initialize the rank-k NMF, which needs only a few iterations to reach a desired solution. We also investigate various criteria for selecting the node to split when growing the tree. We demonstrate the scalability of our algorithm for computing general rank-k NMF as well as its effectiveness in clustering and topic modeling for large-scale text data sets, by comparing it to other frequently utilized state-of-the-art algorithms. The value of the proposed approach lies in the highly efficient and accurate method for initializing rank-k NMF and the scalability achieved from the divide-and-conquer approach of the algorithm and properties of rank-2 NMF. In summary, we present efficient tools for analyzing large-scale data sets, and techniques that can be generalized to many other data analytics problem domains along with an open-source software library called SmallK.
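For orientation, the generic NMF building block can be sketched with the classic multiplicative-update iteration (Lee-Seung style). This is a hedged illustration of plain NMF only: DC-NMF's rank-2 solver, tree construction, and initialization scheme are more specialised, and the data and dimensions here are made up.

```python
import numpy as np

def nmf(V, k, iters=200, seed=0):
    """Factor a nonnegative matrix V (m x n) as W @ H with W (m x k),
    H (k x n) nonnegative, via multiplicative updates."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], k)) + 0.1
    H = rng.random((k, V.shape[1])) + 0.1
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + 1e-9)   # update H with W fixed
        W *= (V @ H.T) / (W @ H @ H.T + 1e-9)   # update W with H fixed
    return W, H

V = np.abs(np.random.default_rng(1).random((20, 12)))
W, H = nmf(V, 2)
```

The updates preserve nonnegativity and monotonically decrease the Frobenius reconstruction error; DC-NMF's contribution is to make the rank-k problem converge in few iterations by seeding it from a tree of cheap rank-2 factorizations.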


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号