Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
In the last decade many models for parallel computation have been proposed and many parallel algorithms have been developed. However, few of these models have been realized and most of these algorithms are supposed to run on idealized, unrealistic parallel machines. The parallel machines constructed so far all use a simple model of parallel computation. Therefore, not every existing parallel machine is equally well suited for each type of algorithm. The adaptation of a certain algorithm to a specific parallel architecture may severely increase the complexity of the algorithm or severely obscure its essence. Little is known about the performance of some standard combinatorial algorithms on existing parallel machines. In this paper we present computational results concerning the solution of knapsack, shortest paths and change-making problems by branch and bound, dynamic programming, and divide and conquer algorithms on the ICL-DAP (an SIMD computer), the Manchester dataflow machine and the CDC-CYBER-205 (a pipeline computer).

2.
It has been known for many years that an optimal discrete nonlinear filter may be synthesized for systems whose plant dynamics, sensor characteristics and signal statistics are known by applying Bayes' Rule to sequentially update the conditional probability density function from the latest data. However, it was not until 1969 that a digital computer algorithm implementing the theory for a one-state variable one-step predictor appeared in the literature. This delay and the continuing scarcity of multidimensional nonlinear filters result from the overwhelming computational task which leads to unrealistic data processing times. For many nonlinear filtering problems analog and digital computers (a hybrid computation) combine to yield a higher data rate than can be obtained by conventional digital methods. This paper describes an implementation of the theory by means of a hybrid computer algorithm for the optimal nonlinear one-step predictor.

The hybrid computer algorithm presented reduces the overall solution time per prediction because:

1) Many large computations of identical form are executed on the analog computer in parallel.

2) The discrete running variable in the digital algorithm may be replaced with a continuous analog computer variable in one or more dimensions leading to increased computational speed and finer resolution of the exponential transformation.

3) The modern analog computer is well suited to generate functions such as the exponential at high speed with modest equipment.

4) The arithmetic, storage, and control functions performed rapidly by the digital computer are utilized without introducing extensive auxiliary calculations.

To illustrate pertinent aspects of the algorithm developed, the scalar cubed sensor problem previously described by Bucy is treated extensively. The hybrid algorithm is described. Problems associated with partitioning of equations between analog and digital computers, machine representations of variables, setting of initial conditions and floating of grid base are discussed. The effects of analog component bandwidths, digital-to-analog and analog-to-digital conversion times, analog computer mode switching times and digital computer I/O data rates on overall processing time are examined. The effect of limited analog computer dynamic range on accuracy is discussed. Results from a simulation of this optimal predictor using MOBSSL, a continuous system simulation language, are given. Timing estimates are presented and compared against similar estimates for the all digital algorithm.

For example, given a four-state variable optimal 1-step predictor utilizing 7 discrete points in each dimension, the hybrid algorithm can be used to generate predictions accurate to 2 decimal places once every 10 seconds. An analog computer complement of 250 integrators and multipliers and a high-speed 3rd generation digital computer such as the CDC 6600 or IBM 360/85 are required. This compares with a lower bound of about 3 seconds per all digital prediction which would require 49 CDC 6600s operating in parallel. Analytical and simulation work quantifying errors in one state variable filters is presented. Finally, the use of an interactive graphic system for real time display and for filter evaluation is described.
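
The Bayes' Rule recursion behind such a one-step predictor can be sketched on a fixed one-dimensional grid. This is an all-digital illustration only, not the hybrid algorithm of the abstract. The function names, the Gaussian noise models, and the specific dynamics and noise levels in the usage note are assumptions made for the example; the cubic sensor is taken to read the cube of the state plus noise.

```python
import math


def gaussian(x, mean, std):
    """Density of a normal distribution N(mean, std^2) evaluated at x."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))


def one_step_predict(grid, prior, z, sensor, meas_std, dynamics, proc_std):
    """One cycle of a grid-based one-step predictor (hypothetical sketch).

    grid  : list of state values where the density is tabulated
    prior : predicted probabilities on the grid before measurement z arrives
    Returns the predicted probabilities for the next time step.
    """
    # Measurement update (Bayes' Rule): posterior is likelihood times prior.
    post = [p * gaussian(z, sensor(x), meas_std) for x, p in zip(grid, prior)]
    total = sum(post)
    post = [p / total for p in post]

    # Time update: push the posterior through the dynamics plus process noise.
    # Each grid point's sum has the same form, which is the part the abstract
    # says can be executed in parallel.
    pred = [
        sum(gaussian(x_next, dynamics(x), proc_std) * p for x, p in zip(grid, post))
        for x_next in grid
    ]
    total = sum(pred)
    return [p / total for p in pred]
```

For instance, with `grid = [-2 + 0.1 * i for i in range(41)]`, a uniform prior, a measurement `z = 1.0`, `sensor = lambda x: x ** 3`, and identity dynamics, the predicted density concentrates near a state of 1. The cost grows with the square of the number of grid points, and the grid itself grows exponentially with the state dimension (7 points in each of 4 dimensions already gives 7^4 = 2401 points), which is the computational burden the abstract refers to.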

3.
In the present work, a multibody model of a three-axle truck coupled with submodels of the suspension bodies is considered. Coupling the main model with the submodels makes it possible to simulate the multibody dynamics simultaneously with the non-stationary heat conduction and stress distribution processes in the bodies. High efficiency of parallel simulation on a computer cluster is achieved using software developed on the basis of the MPI library.

4.
In this paper we present new results on the approximate parallel construction of Huffman codes. Our algorithm achieves linear work and logarithmic time, provided that the initial set of elements is sorted. This is the first parallel algorithm for that problem with the optimal time and work. Combining our approach with the best known parallel sorting algorithms we can construct an almost optimal Huffman tree with optimal time and work. This also leads to the first parallel algorithm that constructs exact Huffman codes with maximum codeword length H in time O(H) with n/log n processors, if the elements are sorted.
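
Why sortedness matters can already be seen sequentially: with sorted weights, an exact Huffman tree can be built in linear time using two queues instead of a heap. The sketch below is that classic sequential two-queue construction, not the parallel algorithm of the abstract; the function name `huffman_code_lengths` is hypothetical.

```python
from collections import deque


def huffman_code_lengths(sorted_weights):
    """Codeword lengths of a Huffman code for weights in non-decreasing order."""
    n = len(sorted_weights)
    if n == 0:
        return []
    if n == 1:
        return [1]

    # Nodes are (weight, id). Leaves use ids 0..n-1, internal nodes n, n+1, ...
    leaves = deque((w, i) for i, w in enumerate(sorted_weights))
    internal = deque()  # merged nodes appear in non-decreasing weight order
    parent = {}
    next_id = n

    def pop_min():
        # The lightest remaining node is at the front of one of the two queues.
        if not internal or (leaves and leaves[0][0] <= internal[0][0]):
            return leaves.popleft()
        return internal.popleft()

    while len(leaves) + len(internal) > 1:
        a = pop_min()
        b = pop_min()
        parent[a[1]] = next_id
        parent[b[1]] = next_id
        internal.append((a[0] + b[0], next_id))
        next_id += 1

    # A parent always has a larger id than its children, so one downward
    # sweep over the ids assigns every node its depth.
    depth = {next_id - 1: 0}  # the root is the last node created
    for node in range(next_id - 2, -1, -1):
        depth[node] = depth[parent[node]] + 1
    return [depth[i] for i in range(n)]
```

Each node enters and leaves a queue once, so the work is linear in the number of elements once they are sorted. The largest returned length corresponds to the maximum codeword length H mentioned in the abstract.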

5.
Cyclic codes and their various generalizations, such as quasi-twisted (QT) codes, have a special place in algebraic coding theory. Among other things, many of the best-known or optimal codes have been obtained from these classes. In this work we introduce a new generalization of QT codes that we call multi-twisted (MT) codes and study some of their basic properties. Presenting several methods of constructing codes in this class and obtaining bounds on the minimum distances, we show that there exist codes with good parameters in this class that cannot be obtained as QT or constacyclic codes. This suggests that considering this larger class in computer searches is promising for constructing codes with better parameters than currently best-known linear codes. Working with this new class of codes motivated us to consider a problem about binomials over finite fields and to discover a result that is interesting in its own right.

6.
7.
Traditional debuggers are of limited value for modern scientific codes that manipulate large complex data structures. Current parallel machines make this even more complicated, because the data structure may be distributed across processors, making it difficult to view, interpret, and validate its contents. Therefore, many application developers resort to placing validation code directly in the source program. This paper discusses a novel debug-time assertion, called a “Statistical Assertion”, that allows using extracted statistics instead of raw data to reason about large data structures, thereby helping to locate coding defects. In this paper, we present the design and implementation of an ‘extendable’ statistical framework which executes the assertion in parallel by exploiting the underlying parallel system. We illustrate the debugging technique with a molecular dynamics simulation. The performance is evaluated on a 20,000 processor Cray XE6 to show that it is useful for real-time debugging.

8.
In networked systems research, game theory is increasingly used to model a number of scenarios where distributed decision making takes place in a competitive environment. These scenarios include peer-to-peer network formation and routing, computer security level allocation, and TCP congestion control. It has been shown, however, that such modeling has met with limited success in capturing the real-world behavior of computing systems. One of the main reasons for this drawback is that, whereas classical game theory assumes perfect rationality of players, real-world entities in such settings have limited information and cognitive ability, which hinders their decision making. Meanwhile, new bounded rationality models have been proposed in networked game theory which take into account the topology of the network. In this article, we demonstrate that game-theoretic modeling of computing systems would be much more accurate if a topologically distributed bounded rationality model is used. In particular, we consider (a) link formation on peer-to-peer overlay networks, (b) assigning security levels to computers in computer networks, and (c) routing in peer-to-peer overlay networks, and show that in each of these scenarios, the accuracy of the modeling improves very significantly when topological models of bounded rationality are applied in the modeling process. Our results indicate that it is possible to use game theory to model competitive scenarios in networked systems in a way that closely reflects real-world behavior, topology, and dynamics of such systems. © 2016 Wiley Periodicals, Inc. Complexity 21: 123–137, 2016

9.
In this paper we present an algorithm for approximating the range of the real eigenvalues of interval matrices. Such matrices could be used to model real-life problems, where data sets suffer from bounded variations such as uncertainties (e.g. tolerances on parameters, measurement errors), or to study problems for given states. The algorithm that we propose is a subdivision algorithm that exploits sophisticated techniques from interval analysis. The quality of the computed approximation and the running time of the algorithm depend on a given input accuracy. We also present an efficient C++ implementation and illustrate its efficiency on various data sets. In most of the cases we manage to compute efficiently the exact boundary points (limited by floating point representation).

10.
Parallel computation offers a challenging opportunity to speed up the time-consuming enumerative procedures that are necessary to solve hard combinatorial problems. Theoretical analysis of such a parallel branch and bound algorithm is very hard and empirical analysis is not straightforward because the performance of a parallel algorithm cannot be evaluated simply by executing the algorithm on a few parallel systems. Among the difficulties encountered are the noise produced by other users on the system, the limited variation in parallelism (the number of processors in the system is strictly bounded) and the waste of resources involved: most of the time, the outcomes of all computations are already known and the only issue of interest is when these outcomes are produced. We will describe a way to simulate the execution of parallel branch and bound algorithms on arbitrary parallel systems in such a way that the memory and CPU requirements are very reasonable. The use of simulation has only minor consequences for the formulation of the algorithm.

11.
The simulation of large particle systems with the Discrete Element Method can be very time consuming. This is due to the necessity for collision detection between the disordered particles. Various methods, originating from different areas such as computer science, are well established and have been used in various applications. For parallel computations the simulation domain needs to be divided into subdomains to be distributed among the different nodes or machines within a supercomputer or a computer cluster. The strategy for this domain decomposition has a significant influence on the performance of the calculation. In this paper we discuss some aspects of the development of a hierarchical domain decomposition algorithm that provides flexible adaptation of the decomposition pattern to the changing structure of the particle system during the simulation. Thus an even load distribution among the different machines can be maintained. Moreover, the same method is also used to deal with the computational bottleneck caused by the presence of unstructured data. (© 2005 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim)

12.
Instabilities occurring during implicit time integration still hamper the time-efficient solution of large FEM systems of equations. In particular, the simulation of flexible rotating structures is barely mastered by implicit FEM codes. The Newmark algorithm and related algorithms have been used for this purpose for many years. Here, we derive the reasons for these inevitable numerical issues and present concepts that lead to an efficient and stable solution.

13.
Scientific computing poses many challenges to formal verification, including the facts that typical programs: (1) are numerically-intensive, (2) are highly-optimized (often by hand), and (3) often employ parallelism in complex ways. Another challenge is specifying correctness. One approach is to provide a very simple, sequential version of an algorithm together with the optimized (possibly parallel) version. The goal is to show the two versions are functionally equivalent, or provide useful feedback when they are not. We present a new verification suite consisting of pairs of programs of this form. The suite can be used to evaluate and compare tools that verify functional equivalence. The programs are all in C and the parallel versions use the Message Passing Interface. They are simpler than codes used in practice, but are representative of real coding patterns (e.g., manager-worker parallelism, loop tiling) and present realistic challenges to current verification tools. The suite includes solvers for the 1-d and 2-d diffusion equations, Jacobi iteration schemes, Gaussian elimination, and N-body simulation.

14.
Within the framework of parallel numerical treatment, there are many more results on stationary (or time-frozen) PDEs than on their evolution (or time-dependent) counterparts. This is justified as computer simulation of time flow – an intrinsically sequential phenomenon – leads naturally to a sequential algorithm, thus inhibiting any search for concurrent ones. (© 2008 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim)

15.
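A parallel finite element algorithm based on the parametric variational principle for elastic contact problems*   Total citations: 1, self-citations: 0, citations by others: 1
Based on the finite element solution of the parametric variational principle for elastic contact problems, this paper establishes a corresponding parallel algorithm that exploits the characteristics and parallel processing structure of parallel computers. The algorithm is parallelized in several respects, including the generation and assembly of the stiffness matrix, the static condensation process, and the stress computation. The algorithm was implemented on the ELXSI-6400 parallel computer at Xi'an Jiaotong University. The computational results show that it effectively saves computing time and is an efficient parallel algorithm for analyzing contact problems.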

16.
Although various hash functions based on chaos or chaotic neural networks have been proposed, most of them cannot work efficiently in a parallel computing environment. Recently, an algorithm for parallel keyed hash function construction based on a chaotic neural network was proposed [13]. However, this scheme has a strict limitation: its secret keys must be nonce numbers. In other words, if a key is used more than once in this scheme, potential security flaws arise. In this paper, we analyze the cause of the vulnerability of the original scheme in detail, and then propose corresponding enhancement measures, which remove the limitation on the secret keys. Theoretical analysis and computer simulation indicate that the modified hash function is more secure and practical than the original one. At the same time, it keeps the parallel merit and satisfies the other performance requirements of a hash function, such as good statistical properties, high message and key sensitivity, and strong collision resistance.

17.
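The Parareal algorithm is a very effective real-time parallel computing method. Compared with traditional parallel computing methods, its distinguishing feature is its parallelism in time: the whole computation time interval is first divided into several subintervals, and the computation is then carried out on all subintervals simultaneously. The Parareal algorithm converges quickly, has high parallel efficiency, and is easy to program. Since it was first proposed in 2001 by Lions, Maday, Turinici, and others, in just a short...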
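
The time-parallel structure described above can be sketched for a scalar ordinary differential equation. This is a minimal illustration of the standard Parareal correction, U[i+1] = G(new U[i]) + F(old U[i]) - G(old U[i]), where G is a cheap coarse propagator and F an accurate fine one. The choice of explicit Euler for both propagators, the function names, and the parameters are assumptions made for the example, and the fine solves are run in an ordinary loop where a real implementation would distribute them across processors.

```python
def euler(f, y, t0, t1, steps):
    """Integrate y' = f(t, y) from t0 to t1 with explicit Euler steps."""
    h = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        y = y + h * f(t, y)
        t += h
    return y


def parareal(f, y0, t0, t1, n_slices, n_iters, fine_steps=100):
    """Return the slice boundaries and the Parareal approximation at each one."""
    ts = [t0 + (t1 - t0) * i / n_slices for i in range(n_slices + 1)]

    def coarse(y, a, b):  # G: one Euler step per subinterval
        return euler(f, y, a, b, 1)

    def fine(y, a, b):  # F: many Euler steps per subinterval
        return euler(f, y, a, b, fine_steps)

    # Initial guess: one sequential sweep with the coarse propagator.
    u = [y0]
    for i in range(n_slices):
        u.append(coarse(u[i], ts[i], ts[i + 1]))

    for _ in range(n_iters):
        # These solves depend only on the previous iterate, so every
        # subinterval could be handled by a different processor.
        f_old = [fine(u[i], ts[i], ts[i + 1]) for i in range(n_slices)]
        g_old = [coarse(u[i], ts[i], ts[i + 1]) for i in range(n_slices)]

        # Cheap sequential correction sweep.
        new = [y0]
        for i in range(n_slices):
            new.append(coarse(new[i], ts[i], ts[i + 1]) + f_old[i] - g_old[i])
        u = new
    return ts, u
```

With `n_iters` equal to `n_slices` the result reproduces the sequential fine solution up to rounding, so the method pays off only when far fewer iterations are enough for the required accuracy.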

18.
We present an algorithm for calculating temperature fields in orthotropic bodies of complex shape, which is based on the method of integral equations. To develop the algorithm, the heat conduction boundary-value problem for orthotropic bodies is preliminarily reduced to the corresponding heat conduction problems for isotropic bodies with modified boundary conditions and heat sources. An investigation of the influence of anisotropy on temperature fields in a bounded and an infinite body with a cavity that are heated by heat sources and flows is performed.

19.
Linear codes with a few weights can be applied to communication, consumer electronics and data storage systems. In addition, the weight hierarchy of a linear code has many applications, such as the type II wire-tap channel, t-resilient functions, and the trellis or branch complexity of linear codes. In this paper, we present a formula for computing the weight hierarchies of linear codes constructed by the generalized method of defining sets. Then, we construct two classes of binary linear codes with a few weights and determine their weight distributions and weight hierarchies completely. Some of these codes can be used in secret sharing schemes.

20.
Recent developments in high performance computer architecture have a significant effect on all fields of scientific computing. Linear algebra and especially the solution of linear systems of equations lie at the heart of many applications in scientific computing. This paper describes and analyzes three parallel versions of the dense direct methods such as the Gaussian elimination method and the LU form of Gaussian elimination that are used in linear system solving on a multicore using an OpenMP interface. More specifically, we present two naive parallel algorithms based on row block and row cyclic data distribution and we put special emphasis on presenting a third parallel algorithm based on the pipeline technique. Further, we propose an implementation of the pipelining technique in OpenMP. Experimental results on a multicore CPU show that the proposed OpenMP pipeline implementation achieves good overall performance compared to the other two naive parallel methods. Finally, in this work we propose a simple, fast and reasonably analytical model to predict the performance of the direct methods with the pipelining technique.
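
The difference between the row block and row cyclic distributions can be illustrated with a small index-mapping sketch. It is written in Python for brevity and does not reproduce the paper's OpenMP code; the function names are hypothetical. It counts how many rows each worker still has to update at elimination step k, when only rows k+1 and below remain active.

```python
def block_owner(row, n_rows, n_workers):
    """Row block distribution: contiguous chunks of rows per worker."""
    block = -(-n_rows // n_workers)  # ceiling division
    return row // block


def cyclic_owner(row, n_workers):
    """Row cyclic distribution: rows are dealt out round-robin."""
    return row % n_workers


def active_rows_per_worker(owner, n_rows, n_workers, k):
    """Count the rows below pivot row k that each worker must update."""
    counts = [0] * n_workers
    for row in range(k + 1, n_rows):
        counts[owner(row)] += 1
    return counts


n_rows, n_workers, k = 8, 4, 3
by_block = active_rows_per_worker(
    lambda r: block_owner(r, n_rows, n_workers), n_rows, n_workers, k)
by_cyclic = active_rows_per_worker(
    lambda r: cyclic_owner(r, n_workers), n_rows, n_workers, k)
print(by_block)   # [0, 0, 2, 2]
print(by_cyclic)  # [1, 1, 1, 1]
```

In this small case the block distribution leaves the first two workers idle halfway through the elimination, while the cyclic distribution keeps the remaining rows evenly spread.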
