1.

Random walk distances in data clustering and applications

Sijia Liu Anastasios Matzavinos Sunder Sethuraman 《Advances in Data Analysis and Classification》2013,7(1):83-108

In this paper, we develop a family of data clustering algorithms that combine the strengths of existing spectral approaches to clustering with various desirable properties of fuzzy methods. In particular, we show that the developed method, “Fuzzy-RW,” outperforms other frequently used algorithms in data sets with different geometries. As applications, we discuss data clustering of biological and face recognition benchmarks such as the IRIS and YALE face data sets.

2.

Model-based two-way clustering of second-level units in ordinal multilevel latent Markov models

Montanari Giorgio Eduardo Doretti Marco Marino Maria Francesca 《Advances in Data Analysis and Classification》2022,16(2):457-485

In this paper, an ordinal multilevel latent Markov model based on separate random effects is proposed. In detail, two distinct second-level discrete effects are considered in the model, one affecting the initial probability vector and the other affecting the transition probability matrix of the first-level ordinal latent Markov process. To model these separate effects, we consider a bi-dimensional mixture specification that avoids unverifiable assumptions on the random effect distribution and yields a two-way clustering of second-level units. Starting from a general model where the two random effects are dependent, we also obtain the independence model as a special case. The proposal is applied to data on the physical health status of a sample of elderly residents grouped into nursing homes. A simulation study assessing the performance of the proposal is also included.

3.

Random effects model: Nonparametric case

Z. Govindarajulu Jayant V. Deshpandé 《Annals of the Institute of Statistical Mathematics》1972,24(1):165-170

4.

Dynamic programming in cricket: choosing a night watchman

S R Clarke J M Norman 《The Journal of the Operational Research Society》2003,54(8):838-845

In cricket, when a batsman is dismissed towards the end of a day's play, he is often replaced by a lower-order batsman (a ‘night watchman’), in the hope that the remaining recognised batsmen can start their innings on the following day. A dynamic programming analysis suggests that the common practice of using a lower-order batsman is often sub-optimal. Towards the end of a day's play, when the conventional wisdom seems to be to use a night watchman, it may be best to send in the next recognised batsman in the batting order. Sending in a night watchman may be good judgement when there are several recognised batsmen and several lower-order batsmen still to play (say four of each). However, with smaller numbers (two of each, for example) and very few overs left to play, it may be better to send in a recognised batsman.

5.

Random state transitions of knots: a first step towards modeling unknotting by type II topoisomerases

Xia Hua 《Topology and its Applications》2007,154(7):1381-1397

Type II topoisomerases are enzymes that change the topology of DNA by performing strand-passage. In particular, they unknot knotted DNA very efficiently. Motivated by this experimental observation, we investigate transition probabilities between knots. We use the BFACF algorithm to generate ensembles of polygons in Z³ of fixed knot type. We introduce a novel strand-passage algorithm which generates a Markov chain in knot space. The entries of the corresponding transition probability matrix determine state-transitions in knot space and can track the evolution of different knots after repeated strand-passage events. We outline future applications of this work to DNA unknotting.
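
The role of the transition probability matrix described above can be illustrated with a minimal sketch. It only shows how a distribution over knot types evolves under repeated application of a row-stochastic matrix. The three states and every probability in `P` are made-up placeholders, not values from the paper, and the function names `step` and `evolve` are hypothetical.

```python
def step(dist, P):
    """Apply one strand-passage event: new[j] = sum_i dist[i] * P[i][j]."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


def evolve(dist, P, k):
    """Distribution over knot types after k strand-passage events."""
    for _ in range(k):
        dist = step(dist, P)
    return dist


# Hypothetical state space; the probabilities below are illustrative only.
KNOTS = ["unknot", "trefoil", "figure-eight"]
P = [
    [0.90, 0.08, 0.02],  # transitions out of the unknot
    [0.60, 0.35, 0.05],  # transitions out of the trefoil
    [0.50, 0.10, 0.40],  # transitions out of the figure-eight knot
]

start = [0.0, 1.0, 0.0]  # begin with a trefoil with certainty
after_one = evolve(start, P, 1)
after_three = evolve(start, P, 3)
```

Each row of `P` sums to 1, so every evolved distribution remains a probability vector; `after_one` is simply the trefoil row of `P`.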

6.

Functional data clustering: a survey (total citations: 1; self-citations: 0; citations by others: 1)

Julien Jacques Cristian Preda 《Advances in Data Analysis and Classification》2014,8(3):231-255

Clustering techniques for functional data are reviewed. Four groups of clustering algorithms for functional data are proposed. The first group consists of methods working directly on the evaluation points of the curves. The second group is defined by filtering methods, which first approximate the curves in a finite basis of functions and then perform clustering using the basis expansion coefficients. The third group is composed of methods which simultaneously perform dimensionality reduction of the curves and clustering, leading to a functional representation of the data that depends on the clusters. The last group consists of distance-based methods using clustering algorithms based on specific distances for functional data. A software review as well as an illustration of the application of these algorithms on real data are presented.

7.

DC-NMF: nonnegative matrix factorization based on divide-and-conquer for fast clustering and topic modeling

Rundong Du Da Kuang Barry Drake Haesun Park 《Journal of Global Optimization》2017,68(4):777-798

The importance of unsupervised clustering and topic modeling is well recognized with ever-increasing volumes of text data available from numerous sources. Nonnegative matrix factorization (NMF) has proven to be a successful method for cluster and topic discovery in unlabeled data sets. In this paper, we propose a fast algorithm for computing NMF using a divide-and-conquer strategy, called DC-NMF. Given an input matrix where the columns represent data items, we build a binary tree structure of the data items using a recently-proposed efficient algorithm for computing rank-2 NMF, and then gather information from the tree to initialize the rank-k NMF, which needs only a few iterations to reach a desired solution. We also investigate various criteria for selecting the node to split when growing the tree. We demonstrate the scalability of our algorithm for computing general rank-k NMF as well as its effectiveness in clustering and topic modeling for large-scale text data sets, by comparing it to other frequently utilized state-of-the-art algorithms. The value of the proposed approach lies in the highly efficient and accurate method for initializing rank-k NMF and the scalability achieved from the divide-and-conquer approach of the algorithm and properties of rank-2 NMF. In summary, we present efficient tools for analyzing large-scale data sets, and techniques that can be generalized to many other data analytics problem domains along with an open-source software library called SmallK.

8.

Factor probabilistic distance clustering (FPDC): a new clustering method

Cristina Tortora Mireille Gettler Summa Marina Marino Francesco Palumbo 《Advances in Data Analysis and Classification》2016,10(4):441-464

Factor clustering methods have been developed in recent years thanks to improvements in computational power. These methods perform a linear transformation of data and a clustering of the transformed data, optimizing a common criterion. Probabilistic distance (PD)-clustering is an iterative, distribution-free, probabilistic clustering method. Factor PD-clustering (FPDC) is based on PD-clustering and involves a linear transformation of the original variables into a reduced number of orthogonal ones using a common criterion with PD-clustering. This paper demonstrates that Tucker3 decomposition can be used to accomplish this transformation. Factor PD-clustering alternately applies Tucker3 decomposition and PD-clustering to the transformed data until convergence is achieved. This method can significantly improve the PD-clustering algorithm performance; large data sets can thus be partitioned into clusters with increasing stability and robustness of the results. Real and simulated data sets are used to compare FPDC with its main competitors, where it performs equally well when clusters are elliptically shaped but outperforms its competitors with non-Gaussian shaped clusters or noisy data.

9.

Location-area partition in a cellular radio network

D-W Tcha T-J Choi Y-S Myung 《The Journal of the Operational Research Society》1997,48(11):1076-1081

With an increasing population of mobile subscribers, the signalling traffic to control the subscriber mobility expands rapidly. Subscriber mobility is controlled through location registration based on the so-called location area, the basic area unit for paging which consists of a number of cells. There is a tradeoff between the two kinds of signalling traffic: paging and location updating. As location areas include a larger number of cells, the traffic volume for paging increases while that for location updating decreases. Given not only the pattern of call arrivals but also that for subscriber mobility, our problem is to minimise the total signalling traffic by optimally partitioning the whole area into location areas. We show that this problem can be transformed to the so-called clique partitioning problem (CPP). Also we demonstrate the process of implementing the algorithm for solving the CPP for real-world problems defined on the cellular network in Seoul.
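
The paging versus location-updating tradeoff described above can be made concrete with a small sketch. The cost model here is an assumption for illustration, not the paper's formulation: every call arriving at a cell pages all cells of its location area, and every subscriber movement between cells in different location areas triggers one update. The function name `signalling_cost`, the four cells, and all rates are hypothetical.

```python
def signalling_cost(partition, call_rate, flow, page_cost=1.0, update_cost=1.0):
    """Total signalling traffic of a partition of cells into location areas.

    partition: list of sets of cell names (the location areas)
    call_rate: dict mapping cell -> call arrival rate (assumed given)
    flow:      dict mapping (cell_a, cell_b) -> subscriber movement rate
    """
    # Map each cell to the index of the location area that contains it.
    area_of = {cell: idx for idx, area in enumerate(partition) for cell in area}
    # Assumed model: a call to a cell pages every cell in its location area.
    paging = sum(
        page_cost * call_rate[cell] * len(partition[area_of[cell]])
        for cell in area_of
    )
    # Assumed model: only movements that cross an area boundary cost an update.
    updating = sum(
        update_cost * rate
        for (a, b), rate in flow.items()
        if area_of[a] != area_of[b]
    )
    return paging + updating


# Hypothetical four-cell network arranged in a line: A - B - C - D.
call_rate = {"A": 1, "B": 1, "C": 1, "D": 1}
flow = {("A", "B"): 5, ("B", "C"): 1, ("C", "D"): 5}

one_area = signalling_cost([{"A", "B", "C", "D"}], call_rate, flow)
two_areas = signalling_cost([{"A", "B"}, {"C", "D"}], call_rate, flow)
four_areas = signalling_cost([{"A"}, {"B"}, {"C"}, {"D"}], call_rate, flow)
```

Under these made-up numbers the single large area pays only for paging, the four singleton areas pay mostly for updating, and the two-area partition that cuts the low-mobility boundary between B and C is cheapest.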

10.

Random Walks in Random Media on a Cayley Tree

Rozikov U. A. 《Ukrainian Mathematical Journal》2001,53(10):1688-1702

We present sufficient conditions for the transience of random walks with bounded jumps in random media on a Cayley tree.

11.

Finite-element modeling of the particle clustering effect in a powder-metallurgy-processed ceramic-particle-reinforced metal matrix composite on its mechanical properties

W. J. Lee Y. J. Kim N. H. Kang I. M. Park Y. H. Park 《Mechanics of Composite Materials》2011,46(6):639-648

A new numerical method is proposed to predict the effect of particle clustering on grain boundaries in a ceramic-particle-reinforced metal matrix composite on its mechanical properties, and micromechanical finite-element simulations of stress–strain responses in composites with random and clustered arrangements of ceramic particles are carried out. The particular material modeled and analyzed is a TiC-particle-reinforced Al matrix composite processed by powder metallurgy. A representative volume element of a composite microstructure with 5 vol.% TiC is reconstructed based on the tetrakaidecahedral grain boundary structure by using a modified random sequential adsorption. The model proposed in this study accurately represents the stress concentrations and particle-particle interactions during deformation of the powder-metallurgy-processed composite. A comparison with the random-arrangement model shows that the present numerical approach is more accurate in simulating the behavior of the composite material.

12.

Proper versus improper mixtures: Toward a quaternionic quantum mechanics

F. Masillo G. Scolarici S. Sozzo 《Theoretical and Mathematical Physics》2009,160(1):1006-1013

13.

Using inexact gradients in a multilevel optimization algorithm

Robert Michael Lewis Stephen G. Nash 《Computational Optimization and Applications》2013,56(1):39-61

Many optimization algorithms require gradients of the model functions, but computing accurate gradients can be computationally expensive. We study the implications of using inexact gradients in the context of the multilevel optimization algorithm MG/Opt. MG/Opt recursively uses (typically cheaper) coarse models to obtain search directions for finer-level models. However, MG/Opt requires the gradient on the fine level to define the recursion. Our primary focus here is the impact of the gradient errors on the multilevel recursion. We analyze, partly through model problems, how MG/Opt is affected under various assumptions about the source of the error in the gradients, and demonstrate that in many cases the effect of the errors is benign. Computational experiments are included.

14.

Self-learning K-means clustering: a global optimization approach

Z. Volkovich D. Toledano-Kitai G.-W. Weber 《Journal of Global Optimization》2013,56(2):219-232

An appropriate distance is an essential ingredient in various real-world learning tasks. Distance metric learning proposes to study a metric that reflects the data configuration much better than the commonly used methods. We offer an algorithm that simultaneously learns a Mahalanobis-like distance and a K-means clustering, aiming to combine data rescaling and clustering so that the data separability grows iteratively in the rescaled space with its sequential clustering. At each step of the algorithm, a global optimization problem is solved in order to minimize the cluster distortions based on the current cluster configuration. The obtained weight matrix can also be used as a cluster validation characteristic. Namely, closeness of such matrices learned during a sampling process can indicate the clusters' readiness, i.e., it estimates the true number of clusters. Numerical experiments performed on synthetic and real datasets verify the high reliability of the proposed method.
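
As a rough illustration of how a learned weight matrix changes a K-means assignment, the sketch below computes a Mahalanobis-like squared distance d(x, c) = (x − c)ᵀ W (x − c) and assigns each point to its nearest centroid. It covers only the assignment step with a fixed W; the global optimization that learns W in the paper is not reproduced here. The function names, points, centroids, and weight matrices are all hypothetical.

```python
def mahalanobis_like_sq(x, c, W):
    """Squared distance (x - c)^T W (x - c) for a weight matrix W."""
    d = [xi - ci for xi, ci in zip(x, c)]
    n = len(d)
    return sum(d[i] * W[i][j] * d[j] for i in range(n) for j in range(n))


def assign(points, centroids, W):
    """K-means assignment step: index of the nearest centroid under W."""
    return [
        min(range(len(centroids)),
            key=lambda k: mahalanobis_like_sq(p, centroids[k], W))
        for p in points
    ]


# Hypothetical data: one point and two centroids in the plane.
points = [(3.0, 0.0)]
centroids = [(0.0, 0.0), (3.0, 4.0)]

identity = [[1.0, 0.0], [0.0, 1.0]]   # plain squared Euclidean distance
rescaled = [[4.0, 0.0], [0.0, 0.25]]  # stretch axis 1, shrink axis 2

labels_euclidean = assign(points, centroids, identity)
labels_rescaled = assign(points, centroids, rescaled)
```

With the identity matrix the point is closer to the first centroid (9 versus 16), while under the rescaled matrix it is closer to the second (36 versus 4), which is the kind of change in separability that data rescaling is meant to produce.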

15.

MrDIRECT: a multilevel robust DIRECT algorithm for global optimization problems

Qunfeng Liu Jinping Zeng Gang Yang 《Journal of Global Optimization》2015,62(2):205-227

16.

Complementation in the face lattice of a proper cone

《Linear algebra and its applications》1986

17.

Non-local modeling of size effects in amorphous metals

Benjamin Klusemann Tao Xiao Swantje Bargmann 《PAMM》2014,14(1):529-530

18.

Random A-permutations: Convergence to a Poisson process

A. L. Yakymiv 《Mathematical Notes》2007,81(5-6):840-846

Suppose that $S_n$ is the permutation group of degree $n$, $A$ is a subset of the set of natural numbers $\mathbb{N}$, and $T_n(A)$ is the set of all permutations from $S_n$ whose cycle lengths belong to the set $A$. Permutations from $T_n$ are usually called $A$-permutations. We consider a wide class of sets $A$ of positive asymptotic density. Suppose that $\zeta_{mn}$ is the number of cycles of length $m$ of a random permutation uniformly distributed on $T_n$. It is shown in this paper that the finite-dimensional distributions of the random process $\{\zeta_{mn},\ m \in A\}$ weakly converge as $n \to \infty$ to the finite-dimensional distributions of a Poisson process on $A$.

19.

Dislocation-based modeling of size effects in microscale plasticity

Cornelia Schwarz Radan Sedláček Christian Krempaszky Ewald Werner 《PAMM》2007,7(1):4080003-4080004

A continuum dislocation-based model for the size-dependent plastic deformation at the microscale is presented. An outlook to its application to the single-slip bending of a thin strip is given. (© 2008 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim)

20.

Some recent developments in modeling quantile treatment effects

《高校应用数学学报(英文版)》2020,(2)

This paper provides a selective review of recent developments in econometric/statistical modeling of quantile treatment effects under both selection on observables and selection on unobservables. First, we discuss identification, estimation and inference of quantile treatment effects under the framework of selection on observables. Then, we consider the case where the treatment variable is endogenous or self-selected, for which an instrumental variable method provides a powerful tool. Finally, some extensions to data-rich environments and to the regression discontinuity design, as well as some other approaches to identifying quantile treatment effects, are discussed. In particular, some directions for future research in this area are addressed.