期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Paralleling of computational processes with loops and branchings

V. S. Sokhranskaya T. A. Tushkina 《Journal of Mathematical Sciences》1982,20(2):2055-2067

A model of a computational process with loops and branchings, viz., a bilogic graph, is studied. An algorithm for partitioning a bilogic graph into a sequence of linear segments is described. A class of graphs is distinguished, for which the set of all staticodynamic schedules coincides with the set of schedules for the sequence of linear segments.Translated from Zapiski Nauchnykh Seminarov Leningradskogo Otdelenlya Matematicheskogo Instituta im. V. A. Steklova AN SSSR, Vol. 90, pp. 210–226, 1979 相似文献

2.

Torus data flow for parallel computation of missized matrix problems

《Linear algebra and its applications》1986

A data-flow approach is used to solve dense symmetric systems of equations on a torus-connected 2-D mesh of processors. A torus mapping of the matrix onto this processor array allows the Cholesky decomposition to be completed in 3n − 2 time steps using only n²/4 processors (less than half the number needed in previously reported results). New definitions for missized problems and parallel algorithm performance are given along with various time-step, efficiency, and processor utilization plots. 相似文献

3.

Recursive mesh refinement on hypercubes

W. D. Gropp I. C. F. Ipsen 《BIT Numerical Mathematics》1989,29(2):186-211

Adaptive methods for PDEs can be viewed as a graph problem. An efficient parallel implementation of adaptive PDE methods then requires distributing the nodes of the associated graph uniformly across the processors so that the resulting cost of communication between processors is low.We solve this problem in two phases: labeling of graph nodes and subsequent mapping of these labels onto processors. We describe a new form of Gray-code which we call aninterleaved Gray-code that allows easy labeling of graph nodes when the maximal level of refinement is unknown, allows easy determination of nearby nodes in the graph, is completely deterministic, and often (in a well-defined sense) distributes the graph uniformly across a hypercube. The theoretical results are supported by computational experiments on the Connection Machine.The work presented in this paper was supported by the Office of Naval Research under contract N00014-85-K-0461, by the Army Research Office under contract DAAL03-86-K-0158, by the Office of Naval Research under contract number N00014-86-K-0310 and the National Science Foundation under contract number DCR 8521451. Part of this work was performed while the second author was in residence at the Computer Science and Systems Division of Harwell Laboratory, United Kingdom. 相似文献

4.

Scheduling precedence graphs of bounded height

Danny Dolev Manfred K Warmuth 《Journal of Algorithms in Cognition, Informatics and Logic》1984,5(1):48-59

The existence of a schedule for a partially ordered set of unit length tasks on m identical processors is known to be NP-complete (J. D. Ullman, NP-complete scheduling problems, J. Comput. System Sci., 10 (1975), 384–393). The problem remains NP-complete even if we restrict the precedence graph to be of height bounded by a constant. (J. K. Lenkstra and A. H. G. Rinnooy Kan, Complexity of scheduling under precedence constraints, Operations Res., 26 (1978), 22–35; D. Dolev and M. K. Warmuth, “Scheduling Flat Graphs,” IBM Research Report RJ 3398, 1982). In these NP-completeness proofs the upper bound on the number of available processors varies with the problem instance. We present a polynomial algorithm for the case where the upper bound on the number of available processors and the height of the precedence graph are both constants. 相似文献

5.

Graph models for scheduling systems with machine saturation property

Paolo Dell’Olmo Monica Gentili 《Mathematical Methods of Operations Research》2006,63(2):329-340

Let be a set of n independent tasks and a set of m processors. During each time instant, each processor can be used by a single task at most. A schedule is for each task an allocation of one or more time intervals to one or more processors. A schedule is said to be optimal if it minimizes the maximum completion time. We say a schedule S has the machine saturation property (MS property) if, at any time instant of task execution, all the machines are simultaneously busy. In this paper, we analyze the conditions under which a parallel scheduling system allows a schedule with the MS property. While for some simple models the analytical conditions can be easily stated, a graph model approach is required when conflicts of processor usage are present. For this reason, we define the class of saturated graphs that correspond to scheduling systems with the MS property. We present efficient graph recognition algorithms to verify the MS property directly on some classes of saturated graphs 相似文献

6.

Upper bound on the number of processors for scheduling with interprocessor communication delays

Aziz Moukrim 《Mathematical Methods of Operations Research》2000,52(1):99-113

相似文献

7.

A random 1-011-011-01algorithm for depth first search

A. Aggarwal R. J. Anderson 《Combinatorica》1988,8(1):1-12

In this paper we present a fast parallel algorithm for constructing a depth first search tree for an undirected graph. The algorithm is anRNC algorithm, meaning that it is a probabilistic algorithm that runs in polylog time using a polynomial number of processors on aP-RAM. The run time of the algorithm isO(T _MM(n) log³ n), and the number of processors used isP _MM (n) whereT _MM(n) andP _MM(n) are the time and number of processors needed to find a minimum weight perfect matching on ann vertex graph with maximum edge weightn.This research was done while the first author was visiting the Mathematical Research Institute in Berkeley. Research supported in part by NSF grant 8120790.Supported by Air Force Grant AFOSR-85-0203A. 相似文献

8.

Iterative Dynamic Load Balancing in Multicomputers

Cheng-Zhong Xu Francis C. M. Lau 《The Journal of the Operational Research Society》1994,45(7):786-796

Dynamic load balancing in multicomputers can improve the utilization of processors and the efficiency of parallel computations through migrating the workload across processors at runtime. We present a survey and critique of dynamic load balancing strategies that are iterative: that is, workload migration is carried out through transferring processes across nearest neighbour processors. Iterative strategies have become prominent in recent years because of the increaasing popularity of point-to-point interconnection networks for multicomputers. 相似文献

9.

Scheduling groups of tasks with precedence constraints on three dedicated processors

《Discrete Applied Mathematics》2004,134(1-3):141-168

We study the problem of scheduling groups of tasks with precedence constraints on three dedicated processors. Each task requires a specified set of processors. Up to three precedence constraints are considered among groups of tasks requiring the same set of processors. The objective of the problem is to find a nonpreemptive schedule which minimizes the maximum completion time (makespan). This scheduling problem is equivalent to the problem of finding an extension of the constraint graph (i.e. the graph which represents the conflicts between tasks and the precedence constraints) to a comparability graph with minimum (over all the extensions) maximum clique weight. The problem is NP-hard in the strong sense. A normal schedule is such that all the tasks requiring the same set of processors are scheduled consecutively. With a normal schedule the problem reduces to the quotient graph of the constraint graph. In this paper we obtain tight approximation results for the minimum makespan of a normal schedule through tight results on the minimum increase of the maximum clique weight when the (partially oriented) quotient graph is extended to a comparability graph. 相似文献

10.

Performance bounds on multiprocessor scheduling strategies for statically allocated programs

Lars Lundberg 《BIT Numerical Mathematics》1993,33(2):190-213

In multiprocessors with static allocation of processes to processors, scheduling can be done locally for each processor. The scheduling strategy may have dramatic effect on the execution time of a parallel program. It is an NP-hard problem to find an optimal schedule, and very little is known of how close the heuristic solutions get.The major result here is a theorem stating that if certain program parameters, which can be obtained from an execution of the program on a single-processor, are known, the execution time of the optimal schedule can be calculated within a factor equal to the largest number of border processes on one processor. Border processes are processes which communicate with other processors. The program parameters are obtained using a previously developed tool.Due to the generality of this theorem, the proof is rather complex because it has to cover a large range of situations. The theorem itself, however, is easy to apply, making it possible to compare the performance of different scheduling strategies with the optimal case. The proof also gives important hints on how to design efficient scheduling algorithms for statically allocated programs. 相似文献

11.

Optimum Allocation of Processes in a Distributed Environment: A Process-to-Process Approach

Stella Sofianopoulou 《The Journal of the Operational Research Society》1990,41(4):329-337

The process allocation problem (PAP) concerned with the assignment of a number of communicating processes to a certain number (not known a priori) of identical processors in a telecommunications environment is examined. The objective is to minimize the total message-passing between processes residing on different processors subject to constraints on the processing power and storage capacity (code-, data-storage and occupancy) of the processors. Constraints imposed on the co-location of certain processes on the same processor are also included. The problem is formulated as a 0-1 linear maximization problem, taking into account only the number of processes involved, while the number of processors required is produced automatically with the optimum solution. An implicit enumeration algorithm is developed which produces an optimum message-passing allocation. Computational results of a set of random problems which have similar characteristics to a real-world application in telecommunications are also presented. 相似文献

12.

Scheduling with uncertainties on new computing platforms

Amine Mahjoub Johnatan E. Pecero Sánchez Denis Trystram 《Computational Optimization and Applications》2011,48(2):369-398

New distributed computing platforms (grids) are based on interconnections of a large number of processing elements. A most important issue for their effective utilization is the optimal use of resources through proper task scheduling. It consists of allocating the tasks of a parallel program to processors on the platform and to determine at what time the tasks will start their execution. As data may be subject to uncertainties or disturbances, it is practically impossible to precisely predict the input parameters of the task scheduling problem. 相似文献

13.

The Process Allocation Problem: a Survey of the Application of Graph-Theoretic and Integer Programming Approaches

S. Sofianopoulou 《The Journal of the Operational Research Society》1992,43(5):407-413

The Process Allocation Problem, which consists of allocating a number of processes to a network of processors with the objective of minimizing the sum of communication (between processes residing on different processors) and execution costs, is presented and the relevant literature reviewed. Solution procedures employing graph-theoretic and mathematical programming methods are critically examined. Some heuristic algorithms are also presented. 相似文献

14.

A Randomized Parallel Algorithm for Planar Graph Isomorphism

Hillel Gazit John H Reif 《Journal of Algorithms in Cognition, Informatics and Logic》1998,28(2):290-314

We present a parallel randomized algorithm running on aCRCW PRAM, to determine whether two planar graphs are isomorphic, and if so to find the isomorphism. We assume that we have a tree of separators for each planar graph (which can be computed by known algorithms inO(log² n) time withn^{1 + ε}processors, for any ε > 0). Ifnis the number of vertices, our algorithm takesO(log(n)) time with processors and with a probability of failure of 1/nat most. The algorithm needs 2 · log(m) − log(n) + O(log(n)) random bits. The number of random bits can be decreased toO(log(n)) by increasing the number of processors ton^{3/2 + ε}, for any ε > 0. Our parallel algorithm has significantly improved processor efficiency, compared to the previous logarithmic time parallel algorithm of Miller and Reif (Siam J. Comput.20(1991), 1128–1147), which requiresn⁴randomized processors orn⁵deterministic processors. 相似文献

15.

Approximation for the mean value performance of locking algorithms for distributed database systems: a partitioned database

A. Raghuram Thomas W. Morgan B. Rajaraman Yzhak Ronen 《Annals of Operations Research》1992,36(1):299-345

Concurrent access to databases must be synchronized for correct execution of transactions and preservation of data consistency. This is usually achieved through use of concurrency control algorithms, amongst which locking algorithms are the most popular both in the literature and in practice. Several analytic methods have been developed for predicting the performance of centralized database systems employing locking algorithms for concurrency control, but very few exist for distributed database systems.This paper proposes a method to approximate the mean value of various performance parameters in distributed database systems using locking for concurrency control. The main contribution of this approach is its ability to model the interaction between resource and data contention and the resulting effect on system performance. System performance is evaluated at a point where the interaction between these two factors is in equilibrium (stable state) and both the data and resource contention equations are simultaneously satisfied.The model involves the solution of a set of simultaneous polynomial equations whose order is dependent on several problem parameters such as the number of nodes and number of locks requested per transaction. These equations are solved by an iterative procedure to evaluate approximate values of relative throughput, utilization of servers and transaction response time. The small computational requirements of the analytical model permit sensitivity analysis on network parameters, and can thus be effectively used by system designers to evaluate choices of communication line speeds, processor capacity, database sizes, etc.The analytic approximations have been extensively verified against simulations for networks with up to 20 nodes. The input traffic was varied from light loads (about 5% utilization of the channels and processors) to heavy loads (about 65% utilization of the processors and channels). The discrepancies between the analytic approximation and the simulation were quite small (2–8%).This work was done while the authors were at Drexel University, Philadelphia. 相似文献

16.

A dispatching rule-based approach to production scheduling in a printed circuit board manufacturing system

G-C Lee Y-D Kim J-G Kim S-H Choi 《The Journal of the Operational Research Society》2003,54(10):1038-1049

This paper focuses on a production-scheduling problem in a printed circuit board (PCB) manufacturing system that produces multiple product types with different due dates and different manufacturing processes. In the PCB manufacturing system, there is a number of serial workstations, and there are multiple parallel machines at each workstation. Also, setup operations are required at certain workstations or machines, and some product types have re-entrant flows. We develop new dispatching rules for scheduling at each workstation, considering the special features of PCB manufacturing. With the dispatching rules, we determine not only the start time of each lot at a machine but also the batch size of each product at each machine. Simulation experiments are carried out to test the performance of the production-scheduling method and dispatching rules devised in this study. Results show that the production-scheduling method suggested in this study performs better than methods with well-known dispatching rules and heuristic algorithms for lot sizing in terms of the total tardiness of orders. 相似文献

17.

A study of average-case speedup and scalability of parallel computations on static networks

《Mathematical and Computer Modelling》1999,29(9):83-94

A parallel system consists of a parallel algorithm and a parallel machine that supports the implementation of the algorithm. The scalability of a parallel system is a measure of its capability to increase speedup in proportion to the number of processors, or its capability to keep a constant efficiency as the number of processors increases. The present paper is devoted to the investigation of the average-case scalability of parallel algorithms executing on multicomputers with symmetric static networks, including the completely connected network, ring, hypercube, and torus. In particular, we characterize the communication overhead such that the expected efficiency can be kept at certain constant level, and that the number of tasks grows at the rate Θ(P log P). 相似文献

18.

General scheduling non-approximability results in presence of hierarchical communications

R. Giroudeau J.C. König 《European Journal of Operational Research》2008

We investigate on the issue of minimizing the makespan (resp. the sum of the completion times) for the multiprocessor scheduling problem in presence of hierarchical communications. We consider a model with two levels of communication: interprocessor and intercluster. The processors are grouped in fully connected clusters. We propose general non-approximability results in the case where all the tasks of the precedence graph have unit execution times, and where the multiprocessor is composed of an unrestricted number of machines with l ? 4 identical processors each. 相似文献

19.

Parallel iterative methods using factorized preconditioning matrices for solving elliptic equations on triangular grids

O. Yu. Milyukova 《Computational Mathematics and Mathematical Physics》2006,46(6):1044-1060

Parallel analogs of the variants of the incomplete Cholesky-conjugate gradient method and the modified incomplete Cholesky-conjugate gradient method for solving elliptic equations on uniform triangular and unstructured triangular grids on parallel computer systems with the MIMD architecture are considered. The construction of parallel methods is based on the use of various variants of ordering the grid points depending on the decomposition of the computation domain. Results of the theoretic and experimental studies of the convergence rate of these methods are presented. The solution of model problems on a moderate number processors is used to examine the efficiency of the proposed parallel methods. 相似文献

20.

A trust region method for solving the decentralized static output feedback design problem

El-Sayed M. E. Mostafa 《Journal of Applied Mathematics and Computing》2005,17(1-2):1-23

The shortest-paths problem is a fundamental problem in graph theory and finds diverse applications in various fields. This is why shortest path algorithms have been designed more thoroughly than any other algorithm in graph theory. A large number of optimization problems are mathematically equivalent to the problem of finding shortest paths in a graph. The shortest-path between a pair of vertices is defined as the path with shortest length between the pair of vertices. The shortest path from one vertex to another often gives the best way to route a message between the vertices. This paper presents anO(n ²) time sequential algorithm and anO(n ²/p+logn) time parallel algorithm on EREW PRAM model for solving all pairs shortest paths problem on circular-arc graphs, wherep andn represent respectively the number of processors and the number of vertices of the circular-arc graph. 相似文献