
1.

Scalability and Communication in Parallel Low-Complexity Lossless Compression

Luigi Cinque Sergio De Agostino Luca Lombardi 《Mathematics in Computer Science》2010,3(4):391-406

Approximation schemes for optimal compression with static and sliding dictionaries which can run on a simple array of processors with distributed memory and no interconnections are presented. These approximation algorithms can be implemented on both small and large scale parallel systems. The sliding dictionary method requires large size files on large scale systems. As far as lossless image compression is concerned, arithmetic encoders enable the best lossless compressors but they are often ruled out because they are too complex. Storer extended dictionary text compression to bi-level images to avoid arithmetic encoders (BLOCK MATCHING). We were able to partition an image into up to a hundred areas and to apply the BLOCK MATCHING heuristic independently to each area with no loss of compression effectiveness. Therefore, the approach is suitable for a small scale parallel system at no communication cost. On the other hand, bi-level image compression seems to require communication on large scale systems. With regard to grey scale and color images, parallelizable lossless image compression (PALIC) is a highly parallelizable and scalable lossless compressor since it is applied independently to blocks of 8 × 8 pixels. We experimented with the BLOCK MATCHING and PALIC heuristics on up to 32 processors of a machine with 256 Intel Xeon 3.06 GHz processors, using a test set of large topographic bi-level images and color images in RGB format. We obtained the expected speed-up of the compression and decompression times, achieving parallel running times about 25 times faster than the sequential ones. Finally, scalable algorithms computing static and sliding dictionary optimal text compression on an exclusive read, exclusive write shared memory parallel machine are presented. On the same model, compression by block matching of bi-level images is shown which can be implemented on a full binary tree architecture under some realistic assumptions with no scalability issues.
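The idea of compressing independent areas with no communication between workers can be illustrated with a minimal sketch. It is not the BLOCK MATCHING or PALIC heuristic from the paper: `zlib` (DEFLATE, a sliding-dictionary method with Huffman coding) is swapped in as a stand-in compressor, and the function names `compress_blocks` and `decompress_blocks` are hypothetical.

```python
import zlib


def compress_blocks(data: bytes, num_blocks: int) -> list:
    """Split data into roughly equal blocks and compress each one on its own.

    Assumption: zlib stands in for the paper's dictionary compressors.
    No block needs information from any other block, so each call to
    zlib.compress could run on a separate processor.
    """
    size = max(1, -(-len(data) // num_blocks))  # ceiling division, at least 1
    blocks = [data[i:i + size] for i in range(0, len(data), size)]
    return [zlib.compress(block) for block in blocks]


def decompress_blocks(chunks: list) -> bytes:
    """Decompress each block independently and concatenate the results."""
    return b"".join(zlib.decompress(chunk) for chunk in chunks)


data = b"abracadabra" * 100
chunks = compress_blocks(data, 4)
restored = decompress_blocks(chunks)
```

Because the list comprehension is a plain map over blocks, it could be handed to a process pool such as `concurrent.futures.ProcessPoolExecutor.map` without changing the logic. The price of independence is that a dictionary built in one block cannot be reused in another, which is the trade-off the abstract measures.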

2.

A parallel balance scheme for banded linear systems

Gene H. Golub Ahmed H. Sameh Vivek Sarin 《Numerical Linear Algebra with Applications》2001,8(5):297-316

A parallel algorithm is proposed for the solution of narrow banded non‐symmetric linear systems. The linear system is partitioned into blocks of rows with a small number of unknowns common to multiple blocks. Our technique yields a reduced system defined only on these common unknowns which can then be solved by a direct or iterative method. A projection based extension to this approach is also proposed for computing the reduced system implicitly, which gives rise to an inner–outer iteration method. In addition, the product of a vector with the reduced system matrix can be computed efficiently on a multiprocessor by concurrent projections onto subspaces of block rows. Scalable implementations of the algorithm can be devised for hierarchical parallel architectures by exploiting the two‐level parallelism inherent in the method. Our experiments indicate that the proposed algorithm is a robust and competitive alternative to existing methods, particularly for difficult problems with strong indefinite symmetric part. Copyright © 2001 John Wiley & Sons, Ltd.

3.

Parallel iteration across the steps of high-order Runge-Kutta methods for nonstiff initial value problems

P. J. van der Houwen B. P. Sommeijer W. A. van der Veen 《Journal of Computational and Applied Mathematics》1995,60(3):309-329

For the parallel integration of nonstiff initial value problems (IVPs), three main approaches can be distinguished: approaches based on “parallelism across the problem”, on “parallelism across the method” and on “parallelism across the steps”. The first type of parallelism does not require special integration methods and can be exploited within any available IVP solver. The method-parallelism approach received much attention, particularly within the class of explicit Runge-Kutta methods originating from fixed point iteration of implicit Runge-Kutta methods of Gaussian type. The construction and implementation on a parallel machine of such methods is extremely simple. Since the computational work per processor is modest with respect to the number of data to be exchanged between the various processors, this type of parallelism is most suitable for shared memory systems. The required number of processors is roughly half the order of the generating Runge-Kutta method and the speed-up with respect to a good sequential IVP solver is about a factor 2. The third type of parallelism (step-parallelism) can be achieved in any IVP solver based on predictor-corrector iteration and requires the processors to communicate after each full iteration. If the iterations have sufficient computational volume, then the step-parallel approach may be suitable for implementation on distributed memory systems. Most step-parallel methods proposed so far employ a large number of processors, but lack the property of robustness, due to a poor convergence behaviour in the iteration process. Hence, the effective speed-up is rather poor. The dynamic step-parallel iteration process proposed in the present paper is less massively parallel, but turns out to be sufficiently robust to achieve speed-up factors up to 15.
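The phrase "fixed point iteration of implicit Runge-Kutta methods of Gaussian type" can be made concrete with a small sketch. It uses the simplest Gauss method, the one-stage implicit midpoint rule, and is only an illustration: the function name `gauss1_step` and the fixed iteration count are assumptions, not the scheme from the paper.

```python
def gauss1_step(f, t, y, h, iterations=20):
    """One step of the one-stage Gauss (implicit midpoint) method.

    The implicit stage equation k = f(t + h/2, y + (h/2) * k) is solved by
    plain fixed-point iteration, which converges when h is small relative
    to the Lipschitz constant of f (the nonstiff case).
    """
    k = f(t, y)  # predictor: explicit slope at the start of the step
    for _ in range(iterations):
        k = f(t + 0.5 * h, y + 0.5 * h * k)  # one corrector sweep
    return y + h * k


# Test problem y' = -y, y(0) = 1, step size h = 0.1.
y1 = gauss1_step(lambda t, y: -y, 0.0, 1.0, 0.1)
```

With s stages instead of one, each sweep consists of s right-hand-side evaluations that do not depend on each other, so they can be assigned to s processors. That is the source of parallelism across the method; parallelism across the steps additionally overlaps the sweeps of consecutive time steps.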

4.

Observer-based synchronization of uncertain chaotic system and its application to secure communications

Fanglai Zhu 《Chaos, solitons, and fractals》2009,40(5):2384-2391

Within the drive-response configuration, this paper considers the synchronization of uncertain chaotic systems based on observers and chaos-based secure communication. Even if there are unknown disturbances and parameters in the drive system, a robust adaptive observer can be used as the response system to realize chaotic synchronization. The proposed method is then applied to secure communication. The transmitter is constructed by injecting the information into the drive system in a proper manner, and one of the transmitted signals is the sum of one of the outputs and the information signal. The Lur’e chaotic system is considered as an illustrative example to demonstrate the effectiveness of the proposed approaches.

5.

Parallelism across the steps in iterated Runge-Kutta methods for stiff initial value problems

P. J. van der Houwen B. P. Sommeijer W. A. van der Veen 《Numerical Algorithms》1994,8(2):293-312

For the parallel integration of stiff initial value problems (IVPs), three main approaches can be distinguished: approaches based on parallelism across the problem, on parallelism across the method and on parallelism across the steps. The first type of parallelism does not require special integration methods and can be exploited within any available IVP solver. The method-parallel approach received some attention in the case of Runge-Kutta based methods. For these methods, the required number of processors is roughly half the order of the generating Runge-Kutta method and the speed-up with respect to a good sequential IVP solver is about a factor 2. The third type of parallelism (step-parallelism) can be achieved in any IVP solver based on predictor-corrector iteration. Most step-parallel methods proposed so far employ a large number of processors, but lack the property of robustness, due to a poor convergence behaviour in the iteration process. Hence, the effective speed-up is rather poor. The step-parallel iteration process proposed in the present paper is less massively parallel, but turns out to be sufficiently robust to solve the four-stage Radau IIA corrector used in our experiments within a few effective iterations per step and to achieve speed-up factors up to 10 with respect to the best sequential codes. The research reported in this paper was partly supported by the Technology Foundation (STW) in the Netherlands.

6.

A novel digital image covert communication scheme based on generalized FCM in DCT domain

Li-yun Su Feng-lan Li Jiao-jun Li Bo Chen 《佛山科学技术学院》2011,3(2):127-136

A novel covert communication method for digital images is presented, based on generalized fuzzy c-means clustering (GFCM), the human visual system (HVS) and the discrete cosine transform (DCT). The original image blocks are classified into two classes according to specified characteristic parameters: one class is suited for embedding security information and the other is not. The appropriate blocks can then be selected in an image, and the security information is embedded by selectively modifying the middle-frequency part of the original image in conjunction with HVS and DCT. Furthermore, the maximal information strength is fixed based on frequency masking. To improve the performance of the proposed algorithm, the security information is also modulated into a chaotic modulation array. The simulation results show that the hidden security information can be extracted effectively, that good robustness is achieved under common signal distortion or geometric distortion, and that the quality of the embedded image is preserved.

7.

A parallel combined method for a class of large stiff systems Total citations: 3, self-citations: 0, citations by others: 3

Chen Lirong Liu Degui 《Acta Mathematicae Applicatae Sinica》2000,23(1):130-140

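This paper proposes a parallel combined method (PCM) for a class of partitioned large stiff systems. The method combines parallelization by system partitioning with parallelization of the numerical methods: a parallel explicit Runge-Kutta (RK) method is used to solve the nonstiff subsystem and a parallel Rosenbrock method is used to solve the stiff subsystem. The order of consistency of the method is discussed and its convergence is analyzed. Numerical results show that the method is practical and effective for solving partitioned large stiff systems.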

8.

Combined RK-Rosenbrock methods and their stability analysis Total citations: 6, self-citations: 0, citations by others: 6

Chen Lirong Liu Degui 《Mathematica Numerica Sinica》2000,22(3):319-332

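1. Introduction. In the study and design of aerospace vehicles, one often encounters large stiff systems with a special structure: some components of the solution vary very rapidly while others vary very slowly. Such a system can be decomposed into two coupled subsystems, where (1) is the stiff subsystem and (2) is the nonstiff subsystem. Because subsystem (1) is stiff, the whole system is stiff as well, so an implicit or semi-implicit method suited to stiff equations is needed to solve it. In many cases, however, the stiff system (1) makes up only a small part of the whole system and its right-hand side function is quite simple, so the cost of evaluating the right-hand side is concentrated mainly in the nonstiff system (2). On the other hand, this approach of treating the whole system with one and the same numerical integration method…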

9.

Sine transform based preconditioners for generalized block Toeplitz eigenvalue problems

Wang Yuanyuan Lu Linzhang 《Journal of Mathematical Study》2008,41(3):240-250

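In the Lanczos process for computing the eigenvalues of a block Toeplitz matrix pencil (Amn, Bmn), a sine transform based block preconditioner is applied to the shifted block Toeplitz matrix Amn - ρBmn. This improves the spectral distribution of the shifted block Toeplitz matrix and accelerates the convergence of the Lanczos process. The block preconditioner can be applied efficiently by means of fast algorithms. The paper proves that the preconditioned Lanczos process converges rapidly, and experiments show that the algorithm is particularly effective for large-scale matrix problems.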

10.

On some new inclusion theorems for the eigenvalues of partitioned matrices

D. Meyer K. Veselić 《Numerische Mathematik》1980,34(4):431-437

Summary Some new results of Gershgorin type for partitioned matrices have been obtained using the so-called departure from normality of the diagonal blocks. This has been shown to improve the existing results at least in the case where diagonal blocks are simultaneously nearly defective and nearly normal. Also a set of Gershgorin-like circles is found such that each of them contains at least one eigenvalue (even if no separation takes place). As a corollary it is shown that every classical Gershgorin circle of a normal matrix contains at least one eigenvalue.
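For readers unfamiliar with the classical Gershgorin circles mentioned in the corollary, here is a minimal sketch of how they are computed for an ordinary (unpartitioned) matrix. It does not implement the partitioned-matrix results of the paper, and the function name `gershgorin_discs` is hypothetical.

```python
def gershgorin_discs(matrix):
    """Return the classical Gershgorin discs as (center, radius) pairs.

    For row i the center is the diagonal entry a_ii and the radius is the
    sum of the absolute values of the off-diagonal entries in that row.
    Every eigenvalue of the matrix lies in the union of these discs.
    """
    discs = []
    for i, row in enumerate(matrix):
        radius = sum(abs(value) for j, value in enumerate(row) if j != i)
        discs.append((row[i], radius))
    return discs


# A real symmetric (hence normal) tridiagonal example.
example = [
    [4, 1, 0],
    [1, 3, 1],
    [0, 1, 2],
]
discs = gershgorin_discs(example)
```

The classical theorem only guarantees that the union of the discs contains all eigenvalues; an individual disc may contain none. The corollary quoted above says that for a normal matrix, such as this symmetric example, every single disc contains at least one eigenvalue.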

11.

Uniform two‐class regular partial Steiner triple systems

Melissa S. Keranen Donald L. Kreher Sibel Özkan 《Journal of Combinatorial Designs》2012,20(3):161-178

A 2‐class regular partial Steiner triple system is a partial Steiner triple system whose points can be partitioned into two classes such that no triple is contained in either class and any two points belonging to the same class are contained in the same number of triples. It is uniform if the two classes have the same size. We provide necessary and sufficient conditions for the existence of uniform 2‐class regular partial Steiner triple systems.

12.

An improved GPBi-CG algorithm suitable for distributed parallel computing Total citations: 1, self-citations: 0, citations by others: 1

Xian-yu Zuo Ze-yao Mo 《Applied mathematics and computation》2010,215(12):4101-4109

An improved generalized product-type bi-conjugate gradient (GPBi-CG) method (IGPBi-CG method, in brief) for solving large sparse linear systems with unsymmetrical coefficient matrices is proposed for distributed parallel environments. The method reduces three global synchronization points to two by reconstructing the GPBi-CG method, and the communication time required for the inner product can be efficiently overlapped with useful computation. The cost is only slightly increased computation time, which can be ignored compared with the reduction of communication time. Performance and isoefficiency analysis show that the IGPBi-CG method has better parallelism and scalability than the GPBi-CG method. Numerical experiments show that the scalability can be improved by a factor greater than 1.5 and the improvement in parallel communication performance approaches 33.3%.

13.

Even circuits in planar graphs

P. D. Seymour 《Journal of Combinatorial Theory, Series B》1981,31(3):327-338

We prove that a planar graph can be partitioned into edge-disjoint circuits of even length, if and only if every vertex has even valency and every block has an even number of edges.

14.

An Evaluation of HPF and MPI Approaches and Performance in Unstructured Finite Element Simulations

Dale Shires Ram Mohan 《Journal of Mathematical Modelling and Algorithms》2002,1(3):153-167

The High Performance Fortran (HPF) language and the Message Passing Interface (MPI) are two widely used methods to achieve parallelism on today's clusters and multiprocessor supercomputers. HPF is a distinct language providing extensions to Fortran 90/95 to express parallel execution paths and regions. MPI is a library of communication calls that can be inserted into modern high-level languages (C and Fortran). This paper discusses the use of the two approaches in a parallel finite element application for liquid composite manufacturing process modeling. The unstructured nature of the code provides an excellent opportunity to test both the computation and communication effectiveness of the two approaches. We discuss performance results based on implementations conducted on a modern massively parallel computing platform with a highly tuned processor interconnection network.

15.

RHPMDs and FHPMDs with k = 4 and Type 4^u

Xuebin Zhang 《Graphs and Combinatorics》2005,21(4):541-552

A (v, k, 1)-HPMD is called a frame (briefly, k-FHPMD), if the blocks of the HPMD can be partitioned into v partial parallel classes such that the complement of each partial parallel class is a group of the HPMD. A (v, k, 1)-HPMD is called resolvable (briefly, k-RHPMD), if the blocks of the HPMD can be partitioned into parallel classes. In this article, (i) we shall construct 3-FHPMDs of type 3⁶ and 21⁶ to completely settle the existence of 3-FHPMD of type h^u; (ii) we shall show that the necessary conditions for the existence of 4-FHPMD of type h^u are sufficient for the case h = 4; (iii) we shall show that the necessary conditions for the existence of 4-RHPMD of type h^u are sufficient for the case h = 4.

16.

Algorithms for finding the minimal polynomials and inverses of resultant matrices

Shu-Ping Gao San-Yang Liu 《Journal of Applied Mathematics and Computing》2004,16(1-2):251-263

In this paper, algorithms for computing the minimal polynomial and the common minimal polynomial of resultant matrices over any field are presented by means of the approach for the Gröbner basis of the ideal in the polynomial ring, respectively, and two algorithms for finding the inverses of such matrices are also presented. Finally, an algorithm for the inverse of a partitioned matrix with resultant blocks over any field is given, which can be realized by CoCoA 4.0, an algebraic system over the field of rational numbers or the field of residue classes modulo a prime number. We give examples showing the effectiveness of the algorithms.

17.

Poly-symmetry in processor-sharing systems

Thomas Bonald Céline Comte Virag Shah Gustavo de Veciana 《Queueing Systems》2017,86(3-4):327-359

We consider a system of processor-sharing queues with state-dependent service rates. These are allocated according to balanced fairness within a polymatroid capacity set. Balanced fairness is known to be both insensitive and Pareto-efficient in such systems, which ensures that the performance metrics, when computable, will provide robust insights into the real performance of the system considered. We first show that these performance metrics can be evaluated with a complexity that is polynomial in the system size if the system is partitioned into a finite number of parts, so that queues are exchangeable within each part and asymmetric across different parts. This in turn allows us to derive stochastic bounds for a larger class of systems which satisfy less restrictive symmetry assumptions. These results are applied to practical examples of tree data networks, such as backhaul networks of Internet service providers, and computer clusters.

18.

Optimal equi-partition of rectangular domains for parallel computation

Ioannis T. Christou Robert R. Meyer 《Journal of Global Optimization》1996,8(1):15-34

We present an efficient method for the partitioning of rectangular domains into equi-area sub-domains of minimum total perimeter. For a variety of applications in parallel computation, this corresponds to a load-balanced distribution of tasks that minimizes interprocessor communication. Our method is based on utilizing, to the maximum extent possible, a set of optimal shapes for sub-domains. We prove that for a large class of these problems, we can construct solutions whose relative distance from a computable lower bound converges to zero as the problem size tends to infinity. PERIX-GA, a genetic algorithm employing this approach, has successfully solved to optimality million-variable instances of the perimeter-minimization problem and for a one-billion-variable problem has generated a solution within 0.32% of the lower bound. We report on the results of an implementation on a CM-5 supercomputer and make comparisons with other existing codes. This research was partially funded by Air Force Office of Scientific Research grant F496-20-94-1-0036 and National Science Foundation grants CDA-9024618 and CCR-9306807.

19.

Fast incremental algorithm for speeding up the computation of binarization

Kuo-Liang Chung Chia-Lun Tsai 《Applied mathematics and computation》2009,212(2):396-408

Binarization is an important basic operation in the image processing community. Based on a threshold value, a gray image can be segmented into a binary image, usually consisting of background and foreground. Given the histogram of the input gray image, the Otsu method obtains a satisfactory binary image by minimizing the within-class variance (or, equivalently, maximizing the between-class variance). In this paper, we first transform the within-class variance criterion into a new mathematical formulation, which is well suited to a fast incremental implementation and leads to the same threshold value. Following our proposed incremental computation scheme, an efficient heap- and quantization-based (HQ-based) data structure is presented to realize its implementation. On eight real gray images, experimental results show that our proposed HQ-based incremental algorithm for binarization achieves a 36% execution-time improvement ratio on average when compared to the Otsu method. Besides this significant speedup, our proposed HQ-based incremental algorithm can also be applied to speed up the Kittler and Illingworth method for binarization.
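As background for the abstract above, the following is a minimal sketch of the baseline Otsu method that the paper sets out to accelerate: a scan over all thresholds that maximizes the between-class variance using running sums of the histogram. It is not the HQ-based incremental algorithm, and the function name `otsu_threshold` is hypothetical.

```python
def otsu_threshold(histogram):
    """Return the gray level t that maximizes the between-class variance.

    Pixels with level <= t form one class and the rest form the other.
    Maximizing w0 * w1 * (mu0 - mu1)**2 is equivalent to minimizing the
    within-class variance. Ties are resolved in favor of the smallest t.
    """
    total = sum(histogram)
    sum_all = sum(level * count for level, count in enumerate(histogram))
    best_t, best_score = 0, -1.0
    w0 = 0    # number of pixels with level <= t
    sum0 = 0  # sum of gray levels of those pixels
    for t, count in enumerate(histogram):
        w0 += count
        sum0 += t * count
        w1 = total - w0
        if w0 == 0:
            continue  # first class still empty
        if w1 == 0:
            break     # second class empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        score = w0 * w1 * (mu0 - mu1) ** 2
        if score > best_score:
            best_t, best_score = t, score
    return best_t


# Bimodal toy histogram over gray levels 0..6.
threshold = otsu_threshold([5, 5, 0, 0, 0, 5, 5])
```

Each step of the loop updates `w0` and `sum0` from the previous threshold instead of recomputing them, so the whole scan is linear in the number of gray levels.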

20.

On resolvable designs

Haim Hanani 《Discrete Mathematics》1972,3(4):343-357
