首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
The problem of optimal scheduling n tasks in a parallel processor system is studied. The tasks are malleable, i.e., a task may be executed by several processors simultaneously and the processing speed of a task is a nonlinear function of the number of processors allocated to it. The total number of processors is m and it is an upper bound on the number of processors that can be used by all the tasks simultaneously. It is assumed that the number of processors is sufficient to process all the tasks simultaneously, i.e. nm. The objective is to find a task schedule and a processor allocation such that the overall task completion time, i.e. the makespan, is minimized. The problem is motivated by real-life applications of parallel computer systems in scientific computing of highly parallelizable tasks. An O(n) algorithm is presented to solve this problem when all the processing speed functions are convex. If these functions are all concave and the number of tasks is a constant, the problem can be solved in polynomial time. A relaxed problem, in which the number of processors allocated to each task is not required to be integer, can be solved in O(nmax {m,nlog 2 m}) time. It is proved that the minimum makespan values for the original and relaxed problems coincide. For n=2 or n=3, an optimal solution for the relaxed problem can be converted into an optimal solution for the original problem in a constant time.  相似文献   

2.
This paper considers a communication system which consists of many processors and studies the problem for improving its reliability by adopting the recovery techniques of checkpoint and rollback. When either processor failure or communication error has occurred, the rollback recovery for processors associated with such an event is executed to the most recent checkpoint, and so, a consistent state in the whole system is maintained. The stochastic model with the above recovery techniques is formulated, using the theory of Markov renewal processes. The mean time to take checkpoint and the expected numbers of rollback recovery caused by processor failures and communication errors are derived. Further, an optimal checkpointing interval which minimizes the expected cost is analytically discussed.  相似文献   

3.
This paper presents an optimal scheduling algorithm for minimizing set-up costs in the parallel processing shop while meeting workload balancing restrictions.There are M independent batch type jobs which have sequence dependent set-up costs and N parallel processing machines. Each of the M jobs must be processed on exactly one of the N available machines. It is desirable to minimize total changeover costs with the restriction that each machine workload assignment T n be within P units of the average machine assignment. The paper describes a static problem in which all jobs are available at time zero. The sequence dependent change over costs are identical for each machine. An extension of the algorithm handles nonidentical processor problems.A combinatorial programming approach to the problem is used. For the special case of identical processors, the problem can be treated as a multi-salesman travelling salesman problem. A general branch and bound algorithm and numerical results are given.  相似文献   

4.
5.
The problem of scheduling directed acyclic task graphs on an unbounded number of processors is considered. We present a single algorithm which is applicable to several special cases, thus effecting a unified approach to task scheduling independent of the task graph. We start by considering multi-stage dags and present an algorithm that computes a schedule in O(Nq log q) time, where N is the number of stages, and q is the maximum number of edges between any two stages of the graph. We show that the schedule produced by the algorithm is optimal when: (i) all communication delays are zero or, (ii) the precedence graph is an in-tree or an out-tree and communication times are small or, (iii) the task graph is densely connected and communication costs and processing costs are unity. For multi-stage dags with small communication times we show that the makespan of the schedule generated by our algorithm is less than twice that of the optimal. We also bound the makespan for the case when communication times are arbitrary. We then show how the algorithm may be applied to schedule arbitrary dags and derive the performance bounds for this case. Finally, we present the results of tests we carried out with randomly generated task graphs. These seem to indicate that, on the average, the algorithm performs substantially better than theoretical worst case predictions.  相似文献   

6.
Approximation schemes for optimal compression with static and sliding dictionaries which can run on a simple array of processors with distributed memory and no interconnections are presented. These approximation algorithms can be implemented on both small and large scale parallel systems. The sliding dictionary method requires large size files on large scale systems. As far as lossless image compression is concerned, arithmetic encoders enable the best lossless compressors but they are often ruled out because they are too complex. Storer extended dictionary text compression to bi-level images to avoid arithmetic encoders (BLOCK MATCHING). We were able to partition an image into up to a hundred areas and to apply the BLOCK MATCHING heuristic independently to each area with no loss of compression effectiveness. Therefore, the approach is suitable for a small scale parallel system at no communication cost. On the other hand, bi-level image compression seems to require communication on large scale systems. With regard to grey scale and color images, parallelizable lossless image compression (PALIC) is a highly parallelizable and scalable lossless compressor since it is applied independently to blocks of 8 × 8 pixels. We experimented the BLOCK MATCHING and PALIC heuristics with up to 32 processors of a 256 Intel Xeon 3.06 GHz processors machine () on a test set of large topographic bi-level images and color images in RGB format. We obtained the expected speed-up of the compression and decompression times, achieving parallel running times about 25 times faster than the sequential ones. Finally, scalable algorithms computing static and sliding dictionary optimal text compression on an exclusive read, exclusive write shared memory parallel machine are presented. On the same model, compression by block matching of bi-level images is shown which can be implemented on a full binary tree architecture under some realistic assumptions with no scalability issues.  相似文献   

7.
Feinberg  Eugene A.  Kella  Offer 《Queueing Systems》2002,42(4):355-376
We consider an M/G/1 queue with a removable server. When a customer arrives, the workload becomes known. The cost structure consists of switching costs, running costs, and holding costs per unit time which is a nonnegative nondecreasing right-continuous function of a current workload in the system. We prove an old conjecture that D-policies are optimal for the average cost per unit time criterion. It means that for this criterion there is an optimal policy that either runs the server all the time or switches the server off when the system becomes empty and switches it on when the workload reaches or exceeds some threshold D.  相似文献   

8.
A discrete–continuous problem of non-preemptive task scheduling on identical parallel processors is considered. Tasks are described by means of a dynamic model, in which the speed of the task performance depends on the amount of a single continuously divisible renewable resource allotted to this task over time. An upper bound on the completion time of all the tasks is given. The criterion is to minimize the maximum resource consumption at each time instant, i.e., the resource level. This problem has been observed in many industrial applications, where a continuously divisible resource such as gas, fuel, electric, hydraulic or pneumatic power, etc., has to be distributed among the processing units over time, and it affects their productivity. The problem consists of two interrelated subproblems: task sequencing on processors (discrete subproblem) and resource allocation among the tasks (continuous subproblem). An optimal resource allocation algorithm for a given sequence of tasks is presented and computationally tested. Furthermore, approximation algorithms are proposed, and their theoretical and experimental worst-case performances are analyzed. Computer experiments confirmed the efficiency of all the algorithms.  相似文献   

9.
The start-up process existed in every solvent extraction process. An optimal start-up strategy in the solvent extraction process can shorten the time to get to steady state greatly. In order to offer some useful methods for the study on the start-up in the extraction process, an ideal start-up model was developed for a modified Scheibel extraction column. The numerical solution for the multi-stage extraction system was obtained from solving the mathematical equations. From the calculated values, the relationships between variables and time were established. According to the relationships obtained above, suitable start-up strategy could be taken to drive the extraction system to the desired steady state.  相似文献   

10.
1.IntroductionTheobjectiveofthisworkistostudystochasticapproximationinrea1time.Apipelineapproachissuggested.Asymptoticpropertiesoftheprocedurearedeve1oped,andcomparisonsofrateofconvergencewiththeclassicalalgorithmsaremade.LetxeE',andf(.):EL-FL-Thetraditio…  相似文献   

11.
Scheduling divisible application on a set of heterogeneous processors with limited memory sizes is analyzed in this paper. Divisible loads are computations which allow for dividing computations into several parts of arbitrary sizes, and the parts can be processed independently in parallel. Though the model originated in the parallel computing context, it has strong links with other applications of operations research. A star communication network is assumed. Each processing element of the network is characterized by the processing speed, memory size, speed and startup time of its communication link. The goal is to find a distribution of the load whose schedule length is minimum. The problem is established to be computationally hard. Therefore, two types of algorithms are proposed, and evaluated: an exact algorithm whose execution time may be exponential, and polynomial-time heuristics.  相似文献   

12.
New distributed computing platforms (grids) are based on interconnections of a large number of processing elements. A most important issue for their effective utilization is the optimal use of resources through proper task scheduling. It consists of allocating the tasks of a parallel program to processors on the platform and to determine at what time the tasks will start their execution. As data may be subject to uncertainties or disturbances, it is practically impossible to precisely predict the input parameters of the task scheduling problem.  相似文献   

13.
In order to maintain load balancing in a distributed system, we should obtain workload information from all the nodes on network. This processing requiresO(v 2) communication overhead, wherev is the number of nodes. In this paper, we present a new synchronous dynamic distributed load balancing algorithm on a (v, k+1, 1)-configured network applying a symmetric balanced incomplete block design, wherev=k 2+k+1. Our algorithm needs only $O(v\sqrt v )$ communication overhead and each node receives workload information from all the nodes without redundancy. Therefore, load balancing is maintained since every link has the same amount of traffic for transferring workload information.  相似文献   

14.
This paper presents two integer linear programming (ILP) models for cyclic scheduling of tasks with unit/general processing time. Our work is motivated by digital signal processing (DSP) applications on FPGAs (Field-Programmable Gate Arrays)—hardware architectures hosting several sets of identical arithmetic units. These hardware units can be formalized as dedicated sets of parallel identical processors. We propose a method to find an optimal periodic schedule of DSP algorithms on such architectures where the number of available arithmetic units must be determined by the scheduling algorithm with respect to the capacity of the FPGA circuit. The emphasis is put on the efficiency of the ILP models. We show the advantages of our models in comparison with common ILP models on benchmarks and randomly generated instances.  相似文献   

15.
We investigate the efficiency of a parallel direct simulation Monte Carlo (PDSMC) algorithm in solving the rarefied subsonic flow through a nanochannel. We use MPI library to transfer data between the processors. It is observed that PDSMC solver shows ideal speed up if sufficient workload is provided for each of processors. Additionally, this study shows that the computational time and speed up of the extended PDSMC solver do not depend (or slightly depend) on the number of cells. In contrary, increasing the total number of particles would result in a better efficiency of the PDSMC.  相似文献   

16.
This paper presents a brief evaluation of a start-up strategy for multi-species anaerobic digestion systems modelled as two-step reaction systems, where acidogenesis is described by Monod kinetics while the methanogenesis is described by Haldane kinetics. The start-up policy has been developed originally for single species systems with the aim of maximizing the biogas outflow rate. It consists of switching the dilution rate from minimum to maximum and then to the optimal value (bang-bang control) in order to bring the system from an arbitrary initial condition to the optimal set-point. This start-up strategy is applied to the multi-species system using an averaged model, which is usually the only model that can be identified for a multi-species system, as measuring individual biomasses is almost impossible in practice. Even the development of an accurate averaged model, fully characterizing the system dynamics based on the variation of the species proportions is difficult. The averaged models used in this study are built based on a more or less accurate knowledge of the species proportions and their kinetics at the start-up instant and used as such in the application of the start-up policy. It is shown that the start-up policy leads to an efficient ecosystem, characterized by high outflow rate of biogas, which is very close to the maximum even in the case of an inaccurate averaged model. The influence of the model accuracy on the system stability and its productivity is discussed. This study can also be viewed as a robustness evaluation with respect to model inaccuracy of the single species start-up strategy, as the process changes from the averaged kinetics to the kinetics of the winning species during species selection.  相似文献   

17.
Dynamic load balancing in multicomputers can improve the utilization of processors and the efficiency of parallel computations through migrating the workload across processors at runtime. We present a survey and critique of dynamic load balancing strategies that are iterative: that is, workload migration is carried out through transferring processes across nearest neighbour processors. Iterative strategies have become prominent in recent years because of the increaasing popularity of point-to-point interconnection networks for multicomputers.  相似文献   

18.
《Applied Mathematical Modelling》2014,38(15-16):3975-3986
This paper addresses a certain type of scheduling problem that arises when a parallel computation is to be executed on a set of identical parallel processors. It is assumed that if two precedence-related tasks are processed on two different processors, due to the information transferring, there will be a task-dependent communication delay between them. For each task, a processing time, a due date and a weight is given while the goal is to minimize the total weighted late work. An integer linear mathematical programming model and a branch-and-bound algorithm have been developed for the proposed problem. Comparing the results obtained by the proposed branch-and-bound algorithm with those obtained by CPLEX, indicates the effectiveness of the method.  相似文献   

19.
Consider a manufacturing process in which a group of machines (or people) perform a single operation on a number of different parts. The processing time depends on both the part and the machine. In addition, each machine requires significant setup time between processing different part types. The problem consists of obtaining a feasible allocation of parts to machines such that the makespan (i.e. greatest machine workload) is minimized. We present two equivalent 0–1 models. The first model arises by considering the assignment of individual parts to machines. It is amenable to Lagrangian decomposition techniques. The second model is more hierarchical in nature; it considers the two options of assigning an entire part type to a single machine, or of splitting the type across machines. The second model is more amenable than the first to branch-and-bound techniques. We report about our computational experience for finding lower bounds of the optimal solution by appending violated cuts and, ultimately, obtaining the optimal solution of real-life problems.  相似文献   

20.
we evaluate several variants of a standard election algorithm on a ring of processors. The performance measures of interest are the number of messages exchanged (communication complexity) and the execution time (time complexity). Classical models use a synchronism assumption according to which all processors start at the same time and message delays are constant.We attempt to capture the essential asynchronism of this class of algorithms by using probabilistic models. Two such models are discussed, one in discrete and one in continuous time. In each case, both the uni- and the bidirectional cases are studied and compared. We obtain expressions for the distributions of the number of exchanged messages and derive their asymptotic behavior whenn, the number of processors in the ring, grows large. The results show how the communication complexity actually depends on the speed of communcations on the ring, and what is the interest of having bidirectional communications.We also address in part the evaluation of the completion time of the algorithm. This time decomposes into astartup time and anexploration time. We show that the average of the startup time is of the order of logn.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号