POWER OF PREEMPTION FOR MINIMIZING TOTAL COMPLETION TIME ON UNIFORM PARALLEL MACHINES

For scheduling problems on parallel machines, the power of preemption is defined as the supremum ratio of the cost of an optimal nonpreemptive schedule over the cost of an optimal preemptive schedule (for the same input), where the cost is defined by a fixed common cost function. We present a tight analysis of the power of preemption for the problem of minimizing the total completion time on m ≥ 2 uniformly related machines, showing that its value for m = 2 is equal to 1.2, and its overall value is approximately 1.39795.


Introduction.
In parallel machine scheduling, we are given the jobs of a set N = {J 1 , J 2 , . . ., J n }, all available at time zero (where J j is also called job j), and m ≥ 2 parallel machines M 1 , M 2 , . . ., M m (where M i is also called machine i). If a job J j ∈ N is processed on machine M i completely, then its processing time is defined to be p ij . There are three main types of scheduling systems with parallel machines: (i) identical parallel machines, for which the processing times are machine-independent, i.e., p ij = p j ; (ii) uniformly related (or uniform) parallel machines, which have different speeds, so that p ij = p j /s i , where s i denotes the speed of machine M i ; and (iii) unrelated parallel machines, for which the processing time of a job depends on the machine assignment. For the models on identical and uniform machines, the value p j is referred to as the size of job J j ∈ N .
In a nonpreemptive schedule, each job is processed on the machine it is assigned to without interruption. In a preemptive schedule, the processing of a job on a machine can be interrupted at any time and then resumed either on that or on any other machine, provided that the job is not processed on two or more machines at the same time.
Given a scheduling problem, let S be a feasible schedule and let C j (S) denote the completion time of job J j in schedule S. Among the most popular objective functions studied in the scheduling literature are the total completion time ∑ C j (S), i.e., the sum of completion times, and the makespan C max (S) = max {C j (S) | J j ∈ N }, i.e., the maximum completion time.
Given a schedule S, denote by Φ(S) the value of the objective function Φ ∈ {∑ C j , C max } computed for S and refer to it as the cost of S. For a scheduling problem to minimize an objective Φ on m parallel machines (identical, uniformly related, or unrelated), let S * np and S * p denote, respectively, an optimal nonpreemptive and an optimal preemptive schedule with respect to function Φ.
Given a problem instance of minimizing an objective Φ, denote R = Φ(S * np )/Φ(S * p ) and call this value the cost ratio (for that instance). Define the power of preemption as the supremum of the ratio R = Φ(S * np )/Φ(S * p ) across all instances of the problem at hand. We denote the power of preemption by ρ. If the number of machines is constrained to be at most m, then we denote the power of preemption restricted to such a family of instances by ρ m , and thus ρ = sup m≥2 ρ m . Since by definition ρ m+1 ≥ ρ m holds for any m ≥ 2, it follows that ρ = lim m→∞ ρ m .
The main interest in studying the power of preemption is that this value determines what can be gained if preemption is allowed. Most preemptive models assume that preemption is free, but in real life every preemption generally slows the system and increases the processing time of preempted jobs. Thus, when the power of preemption is small, it is beneficial to use nonpreemptive schedules instead of preemptive ones. From a purely theoretical point of view, this is a clean, basic, and natural combinatorial optimization problem; it does not depend on models of complexity, and it provides an interesting comparison between well-known related scheduling problems.
The purpose of this paper is to establish the value of the power of preemption for the scheduling problem of minimizing total completion time on m uniformly related machines. Specifically, we show that for this problem ρ ≈ 1.39795 and ρ 2 = 1.2.
The remainder of this paper is organized as follows. Section 2 presents an overview of known results on the power of preemption for various models on parallel machines. For the model of minimizing total completion time on uniformly related machines, the well-known algorithms for finding optimal nonpreemptive and preemptive schedules are described in section 3. In section 4, we explain how to transform a given instance of the problem under consideration into an instance that possesses required properties, without decreasing the cost ratio. The value ρ of the power of preemption for the model with uniformly related machines is derived in section 5, and a sequence of instances is exhibited for which the cost ratio tends to the established value of ρ. The case of two uniform machines is addressed in section 6, where it is shown that the power of preemption is exactly 1.2.

Power of preemption: A review.
In order to determine the exact value of ρ for a particular problem of minimizing a given objective, the following should be done: (i) demonstrate that the inequality
$$\frac{\Phi(S^*_{np})}{\Phi(S^*_p)} \le \rho \tag{1}$$
holds for all instances of the problem; (ii) exhibit instances of the problem for which (1) holds as equality (possibly in the supremum), i.e., show that the value of ρ is tight. Most of the known results on the power of preemption have been established for the problem of minimizing the makespan.
If preemption is not allowed, the problem of minimizing the makespan is NP-hard, even on two identical parallel machines. By contrast, finding a preemptive schedule that minimizes the makespan can be done in polynomial time, even in the most general settings with unrelated machines. See a focused survey [5] on parallel machine scheduling with the makespan objective for details and references. Thus, in order to give the concept of the power of preemption practical meaning, for the problem of minimizing the objective Φ = C max , the studies on the power of preemption are normally accompanied by an additional point: (iii) develop a polynomial-time algorithm that finds a nonpreemptive schedule S np such that the inequalities
$$\frac{\Phi(S^*_{np})}{\Phi(S^*_p)} \le \frac{\Phi(S_{np})}{\Phi(S^*_p)} \le \rho \tag{2}$$
hold for all instances, i.e., the ratio between the cost of a heuristic schedule S np and the cost of an optimal preemptive schedule does not exceed the upper bound ρ that is claimed for the cost ratio.
Without loss of generality, throughout this paper it is assumed that for the models on identical and uniform machines, the jobs are numbered in LPT (Longest Processing Time) order, i.e., in nonincreasing order of their sizes:
$$p_1 \ge p_2 \ge \cdots \ge p_n. \tag{3}$$
In the case of identical machines, for an optimal preemptive schedule S * p the equality
$$C_{\max}(S^*_p) = \max\{p_1, T\} \tag{4}$$
holds, where T = p (N ) /m is the average machine load. An optimal schedule S * p can be found in O (n) time by a so-called wrap-around algorithm developed in [17].
As proved in [1], in the case of m identical machines the power of preemption for Φ = C max is given by ρ m = 2 − 2/(m + 1). Moreover, it is demonstrated in [1] that a nonpreemptive schedule S np , for which (2) holds (that is, the second inequality of (2) holds), can be found in O (m + n log n) time by applying the famous LPT List Scheduling algorithm, which scans the jobs in the order of their LPT numbering and assigns the next job to the first available machine (see also [15]). For the class of instances with C max (S * p ) = p 1 , the value of the power of preemption is strictly smaller than the global bound 2 − 2/(m + 1), as established in [19].
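As a small numerical illustration of the bound ρ m = 2 − 2/(m + 1), the following Python sketch compares LPT list scheduling with the preemptive optimum max{p 1 , T}, the standard value attained by the wrap-around rule. The function names are hypothetical, and the instance with m + 1 unit jobs is chosen here only as an example on which the bound is attained; it is not taken from [1].

```python
import heapq
from fractions import Fraction

def preemptive_makespan(sizes, m):
    """Optimal preemptive makespan on m identical machines: max(largest job, average load)."""
    return max(Fraction(max(sizes)), Fraction(sum(sizes), m))

def lpt_makespan(sizes, m):
    """Makespan of LPT list scheduling: longest job first, each to the least-loaded machine."""
    loads = [0] * m
    heapq.heapify(loads)
    for p in sorted(sizes, reverse=True):
        # The first machine to become available is the one with the smallest load.
        heapq.heappush(loads, heapq.heappop(loads) + p)
    return max(loads)

def ratio_on_unit_jobs(m):
    """Cost ratio of the LPT schedule on m + 1 unit jobs (illustrative instance)."""
    sizes = [1] * (m + 1)
    return Fraction(lpt_makespan(sizes, m)) / preemptive_makespan(sizes, m)

for m in (2, 3, 5):
    # On this instance the ratio coincides with 2 - 2/(m + 1).
    assert ratio_on_unit_jobs(m) == 2 - Fraction(2, m + 1)
```

Exact rational arithmetic is used so that the comparison with 2 − 2/(m + 1) is not affected by rounding.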
Unless stated otherwise, for the model with m uniform machines, throughout this paper assume that the machines are numbered in nonincreasing order of their speeds, i.e.,
$$s_1 \ge s_2 \ge \cdots \ge s_m. \tag{5}$$
Define $S_u = \sum_{i=1}^{u} s_i$, the total speed of the u fastest machines, 1 ≤ u ≤ m, and introduce
$$T_u = \frac{\sum_{j=1}^{u} p_j}{S_u}, \quad 1 \le u \le m-1; \qquad T_m = \frac{p(N)}{S_m}.$$
According to [9], for an optimal preemptive schedule S * p the equality
$$C_{\max}(S^*_p) = \max\{T_u \mid 1 \le u \le m\}$$
holds, and an optimal schedule S * p can be found in O (n + m log m) time. It is shown in [25] that ρ m = 2 − 1/m, and a nonpreemptive schedule S np for which (2) holds can be found in O (m + n log n) time by a version of the LPT List Scheduling algorithm. As clarified in [24] For m = 2, a parametric analysis of the power of preemption with respect to the speed of the faster machine is independently performed in [12] and [23]. For m = 3, a similar analysis is contained in [23], provided that the machine speeds take at most two values, 1 and s ≥ 1.
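The closed-form value of the optimal preemptive makespan on uniform machines can be evaluated directly. The sketch below assumes the standard form of that value, namely the largest of the ratios (total size of the u largest jobs)/(total speed of the u fastest machines) for u < m, together with (total size)/(total speed). The function name and the integer inputs are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import accumulate

def preemptive_makespan_uniform(sizes, speeds):
    """Largest of the load ratios T_u; sizes and speeds are assumed to be positive integers."""
    sizes = sorted(sizes, reverse=True)      # LPT numbering of the jobs
    speeds = sorted(speeds, reverse=True)    # fastest machine first
    m = len(speeds)
    job_prefix = list(accumulate(sizes))     # total size of the u largest jobs
    speed_prefix = list(accumulate(speeds))  # S_u, total speed of the u fastest machines
    # Ratios for u = 1, ..., min(m - 1, n): the u largest jobs on the u fastest machines.
    bounds = [Fraction(job_prefix[u], speed_prefix[u])
              for u in range(min(m - 1, len(sizes)))]
    # Ratio for u = m: all jobs on all machines.
    bounds.append(Fraction(job_prefix[-1], speed_prefix[-1]))
    return max(bounds)
```

With equal speeds the function reduces to max{p 1 , T}, which gives a quick consistency check against the identical-machine case.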
To complete the discussion of the power of preemption for the problems of minimizing the makespan, consider the model with m unrelated machines. An optimal preemptive schedule S * p can be found in polynomial time by solving a linear programming problem that determines the values x ij , equal to the total length of the time intervals during which job J j is processed on machine M i ; see, for example, [14]. The values x ij /p ij can be rounded to produce a nonpreemptive schedule S np . In particular, a rounding procedure that is attributed to Shmoys and Tardos and reproduced in [16] and [7] finds a nonpreemptive schedule S np such that (2) holds for ρ = 4. This bound is tight, as proved in [7].
We now turn to the power of preemption for the objective function Φ = ∑ C j and its weighted counterpart Φ = ∑ w j C j . In the latter case, job J j additionally has a positive weight w j associated with it, which reflects its relative importance.
For identical machines, it is proved in [18] that allowing preemption does not reduce the optimal value of Φ = ∑ w j C j , i.e., for that model ρ = 1. Notice that the problem of minimizing Φ = ∑ w j C j is NP-hard, even for two identical machines [3]. The problem of minimizing ∑ C j on unrelated parallel machines belongs to a group of rare scheduling problems for which solving the preemptive version is harder than its nonpreemptive counterpart: finding schedule S * np reduces to a rectangular assignment problem and takes strongly polynomial time [3,10], while the problem of finding schedule S * p is NP-hard, as proved in [21]. For the problem of minimizing Φ = ∑ w j C j on unrelated parallel machines, approximability results established by Sitters in [22] can be interpreted in terms of the power of preemption: they imply that for that very general model ρ ≤ 1.81, which improves a previously known bound of 2 [20].
This paper focuses on the problem of minimizing the unweighted function Φ = ∑ C j on uniformly related machines. Unlike the problems of minimizing the makespan discussed above, here both versions of the problem, nonpreemptive and preemptive, are polynomially solvable. Indeed, an optimal nonpreemptive schedule S * np can be found in O (m + n log n) time [6], while finding an optimal preemptive schedule S * p takes O (n log n + nm) time; see [4,8,13]. Detailed descriptions of the corresponding algorithms are given in section 3.

Algorithms on uniform machines.
An instance I of the problem with n jobs and m parallel uniformly related machines is defined by the list L n = (p 1 , p 2 , . . ., p n ) of the sizes of the jobs, and the list M m = (s 1 , s 2 , . . ., s m ) of the machine speeds. The objective is the total completion time Φ = ∑ C j . As assumed in section 2, the jobs are numbered in the LPT order (3), while the machines are numbered in accordance with (5). Without loss of generality, we may assume that n ≥ m; otherwise, the m − n slowest machines can be removed since they will not be assigned any jobs in any optimal schedule, preemptive or nonpreemptive. This holds since moving the jobs or parts of jobs of one machine to another empty machine that is not slower does not harm the schedule. The running time for this redistribution step is O(m + n), using methods of median selection and partitioning. Sometimes, to stress for which instance a schedule S is created, we will write S (I), explicitly referring to instance I.
The algorithm for finding an optimal nonpreemptive schedule S * np scans the jobs in the order of their numbering and forms the processing sequence on each machine in a backwards manner, starting from the rear end, so that the next job is assigned to the machine on which it makes the smallest contribution to the objective function. Formally, the algorithm can be stated similarly to [2] (and this is a special case of the algorithm for unrelated machines [10,3]). In the description of the algorithm, Π i denotes the sequence of jobs assigned to machine M i , and • is the operation of concatenation, i.e., J j • Π denotes the sequence obtained by adding job J j at the beginning of the current sequence Π.

Algorithm QSumNP
1. If necessary, renumber the jobs in accordance with (3) and the machines in accordance with (5).
2. For each machine M i , i = 1, 2, . . ., m, define Π i := ∅ and ω i := 1/s i .
3. For each j from 1 to n do
(a) Find the smallest index v with ω v = min {ω i | 1 ≤ i ≤ m}.
(b) Assign job J j to machine M v , i.e., define Π v := J j • Π v , and update ω v := ω v + 1/s v .
As shown in [11], for n ≥ m Algorithm QSumNP can be implemented in O(n log n) time (the values ω i are stored in a priority queue, where even a simple binary heap allows one to implement all steps except for the initial sorting in time O(n log m)). Notice that in schedule S * np , on each machine the jobs are processed in the order opposite to their numbering, i.e., in nondecreasing order of their sizes, which is known as the Shortest Processing Time (SPT) rule. Each machine M i has a multiplier associated with it, denoted by ω i , which is updated during the run. For each j, 1 ≤ j ≤ n, the algorithm matches job J j to the smallest available multiplier ω v , possible ties being broken in favor of the machine with the smallest index; i.e., the job is assigned to the fastest machine associated with the current smallest multiplier. The contribution of job J j to the total cost Φ(S * np ) = ∑ C j (S * np ) is defined as ω v p j (where ω v is the value calculated in step 3(a) for j). Notice also that although the value Φ(S * np ) is the sum of all contributions of the jobs, the contribution of job J j may be different from its completion time.
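The multiplier-matching rule of Algorithm QSumNP can be sketched with a binary heap, as suggested by the remark on priority queues. This is a minimal Python illustration under the assumption that the multiplier of machine M i grows by 1/s i each time a job is placed on it; the function name `qsum_np` and the returned data layout are hypothetical.

```python
import heapq
from fractions import Fraction

def qsum_np(sizes, speeds):
    """Return (total completion time, job sequences per machine) for the matching rule."""
    sizes = sorted(sizes, reverse=True)      # LPT numbering: largest job first
    speeds = sorted(speeds, reverse=True)    # fastest machine first
    # Heap of (current multiplier, machine index); ties resolve to the smaller index.
    heap = [(Fraction(1) / s, i) for i, s in enumerate(speeds)]
    heapq.heapify(heap)
    sequences = [[] for _ in speeds]
    cost = Fraction(0)
    for p in sizes:
        omega, i = heapq.heappop(heap)       # smallest available multiplier
        sequences[i].insert(0, p)            # the job goes in front of the current sequence
        cost += omega * p                    # contribution of the job to the objective
        heapq.heappush(heap, (omega + Fraction(1) / speeds[i], i))
    return cost, sequences

cost, sequences = qsum_np([1] * 8, [3, 2, 1])
# Eight unit jobs on speeds 3, 2, 1 are split 4 / 3 / 1 among the machines.
assert [len(seq) for seq in sequences] == [4, 3, 1]
```

Because each sequence is built from the rear, every machine ends up processing its jobs in SPT order, and the accumulated cost equals the sum of the actual completion times.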
The algorithm for finding an optimal preemptive schedule S * p scans the jobs in the SPT order, i.e., in nondecreasing order of the sizes, opposite to their numbering. Each job is assigned preemptively in such a way that its completion time is minimized. For example, job J n is assigned to the fastest machine M 1 , so that it completes at time Δt = p n /s 1 . During the time interval [0, Δt], machine M i , 2 ≤ i ≤ m, processes part of job J n−i+1 (assuming m ≤ n). Then the remaining part of job J n−1 is assigned to machine M 1 , where it will be processed for (p n−1 − Δt • s 2 )/s 1 time units, while during that time interval each of the other machines will process part of a job (if such a job exists), etc. A formal description of the algorithm given below follows [2].
Algorithm QSumP
In what follows, when we discuss optimal schedules, we will assume that when multiple optimal schedules exist, the optimal schedule we consider is the one created by the corresponding algorithm given here.
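The informal description of the preemptive algorithm can be simulated iteration by iteration. The sketch below is an assumption-laden reading of that description, not the formal statement of Algorithm QSumP: in each iteration the shortest unfinished job runs on M 1 until it completes, while the v-th shortest unfinished job runs on machine M v . The function name `qsum_p` is hypothetical.

```python
from fractions import Fraction

def qsum_p(sizes, speeds):
    """Total completion time of the preemptive schedule built one completion at a time."""
    remaining = sorted(Fraction(p) for p in sizes)                 # SPT order
    speeds = sorted((Fraction(s) for s in speeds), reverse=True)   # fastest machine first
    now = Fraction(0)
    total = Fraction(0)
    while remaining:
        dt = remaining[0] / speeds[0]         # time until the shortest job finishes on M_1
        k = min(len(remaining), len(speeds))
        for v in range(k):                    # v-th shortest unfinished job runs on machine v
            remaining[v] -= dt * speeds[v]
        now += dt
        total += now                          # the shortest job completes at the current time
        remaining.pop(0)                      # its remaining size is now zero
    return total

# Two unit jobs on speeds 2 and 1: completion times 1/2 and 3/4.
assert qsum_p([1, 1], [2, 1]) == Fraction(5, 4)
```

For the same two-job instance the nonpreemptive optimum is 1/2 + 1 = 3/2, so the cost ratio is (3/2)/(5/4) = 1.2, which matches the value of ρ 2 stated in the abstract.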

Tight instances.
For the problem of scheduling n jobs on m uniformly related machines to minimize total completion time $\Phi(S) = \sum_{j=1}^{n} C_j(S)$, let ρ be the power of preemption, i.e., (1) holds for any instance of the problem. An instance I * is called tight if for that instance the following equality holds:
$$\frac{\Phi(S^*_{np}(I^*))}{\Phi(S^*_p(I^*))} = \rho.$$
Observe that ρ is defined as a supremum of an infinite set of values, and thus, in general, there need not be a tight instance. In such cases, there is a sequence of instances whose sequence of cost ratios approaches ρ. Such a sequence is called a tight sequence.
In this section, we establish the existence of a tight sequence that possesses specific properties.
Lemma 1. There exists a tight sequence in which for every element of the sequence all jobs are identical, i.e., p j , j = 1, . . ., n, are the same.
Proof. It suffices to show that given a set of m machines with speeds s 1 , s 2 , . . ., s m and an upper bound n on the number of jobs, there exists an instance maximizing the power of preemption that is restricted to instances with this set of machines and at most n jobs such that the vector of sizes is binary. This claim implies that there is a tight sequence of instances with binary job sizes. Since zero-sized jobs can be seen to complete at time zero, given a tight sequence with binary job sizes, we can delete the zero-sized jobs from each instance of the sequence, and this does not affect the cost for both the optimal nonpreemptive schedule and the optimal preemptive schedule.
Clearly, we can disregard the instances in which all n jobs have zero size. Therefore, without loss of generality, we can normalize the job sizes so that the largest size is equal to 1. Thus, an instance is associated with a vector of n variables p 1 , p 2 , . . ., p n such that
$$1 = p_1 \ge p_2 \ge \cdots \ge p_n \ge 0. \tag{7}$$
For schedule S * np , let the contribution of job J j ∈ N to Φ(S * np ) be a j p j , where a j depends on n and m, as well as on the index of the job in the sorted list and the machine speeds. Indeed, the objective function value Φ(S * np ) is a linear function of the variables p 1 , . . ., p n (given the ordering) since, as follows from Algorithm QSumNP, the contribution ω v p j of job J j to the objective function is the product of its multiplier and its size, and the objective function value is the sum of contributions of the jobs.
Similarly, Φ(S * p ) is a linear function which can be written as $\sum_{j=1}^{n} b_j p_j$. We prove this by induction on the iterations of Algorithm QSumP. Specifically, we show that in each iteration the values of a and of Δt, as well as the updated values of p j , J j ∈ N , are all linear functions of the original sizes p 1 , . . ., p n . Recall that Algorithm QSumP considers the jobs in the order opposite their numbering given by (7), and that in iteration φ, where the value of the index i is n − φ + 1, the completion time of job J i is defined. First, since the values of p 1 , p 2 , . . ., p n are updated in each iteration of the algorithm, we show by induction that (7) holds after every iteration, and moreover, after iteration φ, p n−φ+1 = p n−φ+2 = • • • = p n = 0. Due to the sorting, these properties hold before any iterations are performed.
Assume that the properties hold after iteration φ − 1. In iteration φ, step 3(a), a time interval of length Δt is determined such that Δt = p n−φ+1 /s 1 , and for k = min{n − φ + 1, m}, job n − v − φ + 2 for 1 ≤ v ≤ k is assigned to machine M v during this time interval. Since the amount of processing that can be done on machine M v during this interval is Δt • s v , the value of p n−v−φ+2 decreases by Δt • s v ; it remains nonnegative because s v ≤ s 1 and p n−v−φ+2 ≥ p n−φ+1 . A larger job is assigned to a machine that is not faster, so it decreases by no more than a smaller job does. The values p n−φ+2 , . . ., p n are already equal to zero, and p n−k−φ+1 , . . ., p 1 are unchanged, so the sorted order is kept. Moreover, for v = 1, the new value of p n−v−φ+2 is zero by the choice of Δt. Finally, we now demonstrate that Φ(S * p ) is a linear function of the original sizes p j . To do this we show that after each iteration of the algorithm the current values of p 1 , . . ., p n depend linearly on the original sizes of the jobs. Consider iteration φ; then for every job J j the value of p j either remains the same or decreases by Δt • s v (for some value of v), where Δt depends linearly on the current values of p 1 , p 2 , . . ., p n , which are linear in the original sizes by the induction hypothesis. For each job, the completion time of the job is the sum of all Δt values in a prefix of the list of iterations of the algorithm (all iterations up to and including the iteration in which its completion time is defined), and those are also linear in the original sizes of the jobs.
Thus, we have that the power of preemption of this subclass of instances is the optimal value of the following mathematical program:
$$\max \ \frac{\sum_{j=1}^{n} a_j p_j}{\sum_{j=1}^{n} b_j p_j} \quad \text{subject to (7)}.$$
For this problem, the matrix of the constraints defines a nonempty bounded polytope, and for any feasible solution, the denominator of the objective function is positive. This implies that the optimal value of the objective function, which we denote by λ * , is finite. Solving the above mathematical program is equivalent to finding a maximizer of the objective function $\sum_{j=1}^{n} a_j p_j - \lambda^* \cdot \left( \sum_{j=1}^{n} b_j p_j \right)$ subject to the same set of constraints. The resulting problem is a linear program over a nonempty polytope. Thus, we know that there exists an optimal solution for this linear program that is an extreme point of the polytope. Observe that the structure of the constraint matrix that defines this polytope is such that in each row all entries are zero, except at most one 1 and at most one −1. Such constraint matrices are known to be totally unimodular, and since the right-hand side of the system of constraints is an integer vector, we deduce that all extreme points of the polytope are integral. Since all feasible integral solutions are in fact binary due to (7), Lemma 1 is proved.
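The step from the fractional program to the linear one can be spelled out as follows. This is a short derivation sketch; the symbol P for the feasible polytope (the set of size vectors with 1 = p 1 ≥ p 2 ≥ ... ≥ p n ≥ 0) is introduced here only for illustration.

```latex
% Since \sum_j b_j p_j > 0 on P, the bound on the ratio can be cleared of fractions:
\frac{\sum_{j=1}^{n} a_j p_j}{\sum_{j=1}^{n} b_j p_j} \le \lambda^*
\quad\Longleftrightarrow\quad
\sum_{j=1}^{n} a_j p_j - \lambda^* \sum_{j=1}^{n} b_j p_j \le 0
\qquad \text{for every } p \in P.
% Hence the linear objective is at most 0 on P, and it equals 0
% exactly at those p whose ratio attains \lambda^*:
\max_{p \in P} \Bigl( \sum_{j=1}^{n} a_j p_j - \lambda^* \sum_{j=1}^{n} b_j p_j \Bigr) = 0 .
```

In particular, any extreme point of P that maximizes the linear objective also maximizes the ratio, which is what the integrality argument uses.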
In our search for a tight sequence, Lemma 1 allows us to focus on instances in which all processing times are unit, i.e., p j = 1, J j ∈ N . For the analysis of ρ, we will consider large inputs, with numbers of jobs growing to infinity, as justified by the following lemma.
Lemma 2. There is a tight sequence such that the numbers of unit-sized jobs of the instances along the sequence form a monotonically increasing sequence of integers, growing to infinity.
Proof. Let I be an instance with n unit-sized jobs J 1 , . . ., J n . We create an instance I′ with 2n unit-sized jobs J 1 , . . ., J 2n by duplicating each machine; in other words, instead of using m machines M 1 , M 2 , . . ., M m with the speed vector (s 1 , s 2 , . . ., s m ) we will use 2m machines M 1 , M 2 , . . ., M 2m−1 , M 2m with the speed vector (s 1 , s 1 , s 2 , s 2 , . . ., s m , s m ). Observe that a nonpreemptive optimal schedule for instance I′ can be formed by running on each of the machines M 2i−1 and M 2i the same number of jobs that in schedule S * np (I) are executed on machine M i , 1 ≤ i ≤ m, and thus the cost of S * np (I′) is exactly twice the cost of S * np (I). The last claim can be proved by examining the action of Algorithm QSumNP. If the selected machine for job J j is M v for input I, then for instance I′ jobs J 2j−1 and J 2j will be assigned, respectively, to machines M 2v−1 and M 2v . On the other hand, for instance I′, we can reproduce the assignment in schedule S * p (I) twice: on the machines M 1 , M 3 , . . ., M 2m−1 with the odd indices, and on the machines M 2 , M 4 , . . ., M 2m with the even indices. This results in a feasible preemptive schedule for instance I′ whose cost is twice that of S * p (I). Hence, the cost of S * p (I′) is at most twice that of S * p (I). Thus, for any instance I with n unit-sized jobs and m machines, we can form an instance I′ with 2n unit-sized jobs and 2m machines such that
$$\frac{\Phi(S^*_{np}(I))}{\Phi(S^*_p(I))} \le \frac{\Phi(S^*_{np}(I'))}{\Phi(S^*_p(I'))},$$
i.e., the cost ratio for instance I is no more than that for instance I′. Therefore, given a tight sequence, we conclude that there is a tight sequence that is an infinite sequence of instances whereby the sequence of numbers of unit-sized jobs is (strictly) monotone increasing, and the lemma holds.
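The doubling argument can be checked numerically on a small instance. The sketch below is illustrative only: `np_cost_unit` sums the n smallest multipliers k/s i , and `p_cost_unit` simulates the preemptive rule described in section 3 (shortest unfinished job on the fastest machine); both names are hypothetical, and the instance with two unit jobs and speeds (2, 1) is an arbitrary choice.

```python
from fractions import Fraction

def np_cost_unit(n, speeds):
    """Nonpreemptive optimum for n unit jobs: sum of the n smallest multipliers k / s_i."""
    multipliers = sorted(Fraction(k) / s for s in speeds for k in range(1, n + 1))
    return sum(multipliers[:n])

def p_cost_unit(n, speeds):
    """Preemptive cost for n unit jobs, simulated one completion at a time."""
    speeds = sorted((Fraction(s) for s in speeds), reverse=True)
    remaining = [Fraction(1)] * n
    now = total = Fraction(0)
    while remaining:
        dt = remaining[0] / speeds[0]
        for v in range(min(len(remaining), len(speeds))):
            remaining[v] -= dt * speeds[v]
        now += dt
        total += now
        remaining.pop(0)
    return total

def doubled(speeds):
    """Speed vector of the doubled instance: every machine appears twice."""
    return [s for s in speeds for _ in range(2)]

n, speeds = 2, [2, 1]
ratio = np_cost_unit(n, speeds) / p_cost_unit(n, speeds)
ratio_doubled = np_cost_unit(2 * n, doubled(speeds)) / p_cost_unit(2 * n, doubled(speeds))
assert ratio_doubled >= ratio   # doubling does not decrease the cost ratio
```

On this instance the nonpreemptive cost doubles exactly and the preemptive cost also doubles, so both ratios coincide; in general the lemma only guarantees the inequality.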
Lemma 2 demonstrates the existence of a tight sequence rather than of a finite number of tight instances. It plays an important role in the proof of Theorem 2.
Given an instance of the problem, consider the run of Algorithm QSumNP and the run of Algorithm QSumP. Recall that only instances with m ≤ n need be considered.
Since we deal with instances that contain only unit-sized jobs, the cost of an optimal nonpreemptive schedule depends on the list of the multipliers generated during the run of Algorithm QSumNP. For an instance I, Algorithm QSumNP generates a list of n used multipliers, so that for each job the product of its size and the matched multiplier defines its contribution to the total cost. Additionally, upon the completion of the algorithm, each machine M i supplies a multiplier ω i , 1 ≤ i ≤ m, which we call a ready multiplier. Should the instance under consideration contain another job J n+1 , that job would be matched by Algorithm QSumNP to the smallest ready multiplier. If additional jobs arrive, the list of m ready multipliers will be modified, so that if k/s i is the smallest ready multiplier (where in the case of ties i is the minimum index of a machine whose current multiplier is the smallest one), then (k + 1)/s i is inserted into the list as the multiplier of machine M i , while k/s i is removed from the list. Another representation would be to create n multipliers of the form k/s i for k = 1, 2, . . ., n for each machine M i , 1 ≤ i ≤ m, and sort this list in nondecreasing order, provided that the elements of equal value are additionally sorted by increasing order of the machine indices. Let Ω (I) be this sorted list of length n • m, where each multiplier appears in the list as many times as it occurs. In this list, the first n elements are used multipliers, and the (n + 1)th element is the first ready multiplier. Notice that Algorithm QSumNP actually assigns job J j according to the jth element in the list so that if the jth element in the list is k/s i , then job J j is assigned to be processed on machine M i in the kth position from the rear.
To illustrate the introduced notions, consider an instance of eight unit-sized jobs J j , 1 ≤ j ≤ 8, to be processed on machines M 1 , M 2 , and M 3 with the speeds s 1 = 3, s 2 = 2, and s 3 = 1. Running Algorithm QSumNP and scanning the jobs in the order of their numbering, these jobs are allocated to the machines and associated with the used multipliers as shown in Table 1.
Table 1. Running Algorithm QSumNP for eight unit-sized jobs.
The following scaling procedure is crucial for further analysis. If necessary, we scale all speeds of the machines in such a way that the smallest ready multiplier generated upon the completion of Algorithm QSumNP is equal to 1 (that is, we multiply all speeds by the smallest ready multiplier). Notice that such a scaling does not affect the cost ratio. For example, to scale the eight-job instance above, we need to multiply all the speeds of the machines by 5/3.
Due to the performed scaling, the smallest ready multiplier generated upon the completion of Algorithm QSumNP is equal to 1. For an instance with n unit-sized jobs in schedule S * np , no machine is assigned more jobs than its speed, and in particular, a machine whose speed is strictly below 1 has no jobs assigned to it. A machine of speed s will be assigned at least s − 1 jobs (as otherwise, a ready multiplier on that machine is less than 1). It is assigned at most s jobs, since a larger number of jobs means that a multiplier strictly above 1 has been used by the algorithm, which means that all multipliers of value 1 have been used and there cannot be a ready multiplier with that value. In particular, Algorithm QSumNP outputs a schedule in which any machine of speed 1 has at most one job assigned to it.
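The sorted multiplier list, the scaling step, and the resulting bounds on the number of jobs per machine can be reproduced for the eight-job example. The following Python sketch uses exact fractions; the helper name `multiplier_list` and the tuple layout are assumptions made for illustration.

```python
from fractions import Fraction

def multiplier_list(n, speeds):
    """Sorted (multiplier, machine index, position from the rear) triples.

    Equal multipliers are ordered by increasing machine index.
    """
    return sorted((Fraction(k) / s, i, k)
                  for i, s in enumerate(speeds, start=1)
                  for k in range(1, n + 1))

n, speeds = 8, [3, 2, 1]
omega = multiplier_list(n, speeds)
used = omega[:n]                        # multipliers matched to J_1, ..., J_8
ready = omega[n][0]                     # smallest ready multiplier
scaled = [s * ready for s in speeds]    # scaled speeds

# Number of jobs that each machine receives.
jobs_on = {i: sum(1 for _, machine, _ in used if machine == i)
           for i in range(1, len(speeds) + 1)}

# After scaling, a machine of speed s holds between s - 1 and s jobs.
for i, s in enumerate(scaled, start=1):
    assert s - 1 <= jobs_on[i] <= s
```

Here `ready` evaluates to 5/3, in agreement with the scaling factor quoted for this instance, and the scaled speeds are 5, 10/3, and 5/3.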
Note that as a result of scaling, the minimum machine speed may become less than 1; however, this does not happen in the instances we are interested in, as we show now. This will allow us in further analysis to assume that no machine speed is below 1.
Lemma 3. Given an instance I of n unit-sized jobs and m machines such that the minimum machine speed is smaller than 1 (i.e., min i s i < 1), there is an instance I′ of n unit-sized jobs and m machines such that the minimum machine speed in I′ equals 1, and the cost ratio for I′ is no less than that for I.
Proof. Take an instance I for which min i s i < 1 and transform it into instance I′ as described below. For every machine M whose speed is below 1, change its speed to 1 (without changing the numbering of the machines). Since instance I is scaled, it follows from the structure of Algorithm QSumNP that in schedule S * np (I) the machines with speeds below 1 receive no jobs. Then, in the resulting optimal schedule S * np (I′) for instance I′, the assignment of jobs to machines remains as in S * np (I), since the machines are numbered in accordance with (5) and the ties are broken in favor of the machines with smaller indices, and since the smallest ready multiplier is equal to 1. The cost of an optimal nonpreemptive schedule does not change, i.e., Φ(S * np (I′)) = Φ(S * np (I)). The described transformation may only decrease the cost of an optimal preemptive schedule, since when speeds increase, it is still possible to use any previous optimal schedule as a (not necessarily optimal) schedule for I′, i.e., Φ(S * p (I′)) ≤ Φ(S * p (I)). Thus, the cost ratio for instance I is no larger than that for instance I′. This proves the claim.
Now, we show a stronger property for the speeds. The property is that among instances with n unit-sized jobs and m ≤ n machines, it suffices to consider instances with at most one machine of speed s > 1 and all other m − 1 machines of speed 1 (by Lemma 3, no machine has a speed below 1).
To show this, we do the following. Consider an appropriately scaled instance I that, in particular, contains a machine M (x) of speed s (x) = x + α > 1, and a machine M (y) of speed s (y) = y + β > 1, where x, y ≥ 1 are integers and α and β are nonnegative numbers such that α, β ≤ 1. If one of the speeds s (x) and s (y) is an integer (or if both are), the values α and β are selected (out of 0 and 1) in such a way that Algorithm QSumNP assigns x jobs to machine M (x) and y jobs to machine M (y) .
We examine the effect of changing the speeds of machines M (x) and M (y) to s (x) + s (y) − 1 and 1, respectively, keeping the other machines untouched. We show in Lemma 4 that this modification cannot increase the value Φ(S * p ) of the optimal preemptive schedule, and in Lemma 5 we prove that it cannot decrease the value Φ(S * np ) of the optimal nonpreemptive schedule. Applying this modification as long as there are at least two machines with speeds strictly larger than 1 results in a new instance I′ whose cost ratio is no less than the cost ratio of I, and it has an additional property that there is at most one machine of speed larger than 1. Notice that we may assume that there is exactly one machine of speed larger than 1 in instance I′; otherwise, if no machine has a speed larger than 1, then the machines are identical and the cost ratio for that instance is 1.
Lemma 4. Changing the speeds of the two machines M (x) and M (y) from s (x)  and s (y) to s (x) + s (y) − 1 and 1, respectively, does not increase the cost of an optimal preemptive schedule.
Proof. Consider an optimal preemptive schedule S * p for the given initial instance I. We will emulate the allocation of jobs to M (x) and M (y) of speeds s (x) and s (y) , respectively, using the two machines of speeds s (x) + s (y) − 1 and 1. To do this, consider a (maximal) time interval T of length t such that in schedule S * p a part of one job (say, J′) runs on machine M (x) of speed s (x) , and a part of another job (say, J″) runs on machine M (y) of speed s (y) ; see Figure 1(a). The proof below is presented for the case when both machines M (x) and M (y) are busy in interval T (if this is not the case, no job will be run instead of running the missing job).
The processing amounts of job J′ and of job J″ in interval T (the sizes of parts of these jobs that are processed) are equal to ts (x) and ts (y) , respectively.
We show that the same processing amount of each of these jobs in interval T can be achieved by the following modification. Assign speed s (x) + s (y) − 1 to machine M (x) , and speed 1 to machine M (y) . Define
$$t_0 = \frac{t\,(s^{(x)} - 1)}{s^{(x)} + s^{(y)} - 2}.$$
Process job J′ during the first t 0 time units of interval T on machine M (x) , and in the remaining part of interval T on machine M (y) . Similarly, run job J″ during the first t 0 time units of interval T on machine M (y) of speed 1, and in the remaining part of interval T on machine M (x) ; see Figure 1(b). This transformation does not affect other jobs and machines, and neither J′ nor J″ is processed on two machines simultaneously.
The total processing amount of job J′ during T in the modified schedule is
$$t_0\,(s^{(x)} + s^{(y)} - 1) + (t - t_0) = t_0\,(s^{(x)} + s^{(y)} - 2) + t = t\,s^{(x)},$$
i.e., is equal to the processing amount of J′ during interval T before the modification. Similarly, the total processing amount of job J″ during interval T is again the same as the total processing amount of this job during T before the modification. Since Algorithm QSumP creates a finite number of intervals like interval T in the above transformation, it follows that this process of dealing with these intervals one by one will terminate. The cost of the constructed preemptive schedule for the modified instance is the same as the cost for the initial instance I.
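The emulation in the proof of Lemma 4 can be verified numerically. The sketch below assumes the split point t 0 = t (s (x) − 1)/(s (x) + s (y) − 2), which is the unique value for which the job formerly on M (x) keeps its processing amount t s (x) ; the function name `emulate` and the sample speeds 5 and 10/3 are illustrative choices.

```python
from fractions import Fraction

def emulate(sx, sy, t):
    """Return (t0, amount of the first job, amount of the second job) on the modified machines.

    The first job runs on the fast machine for t0 time units and then on the
    unit-speed machine; the second job does the opposite.
    """
    fast = sx + sy - 1                      # new speed of M(x); M(y) gets speed 1
    t0 = t * (sx - 1) / (sx + sy - 2)       # assumed split point of the interval
    amount_first = t0 * fast + (t - t0) * 1
    amount_second = t0 * 1 + (t - t0) * fast
    return t0, amount_first, amount_second

sx, sy, t = Fraction(5), Fraction(10, 3), Fraction(1)
t0, first, second = emulate(sx, sy, t)
assert 0 < t0 < t            # both speeds exceed 1, so the split lies inside the interval
assert first == t * sx       # same amount as on the original machine M(x)
assert second == t * sy      # same amount as on the original machine M(y)
```

Since the two jobs swap machines at time t 0 , neither of them is ever processed on both machines at once.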
Lemma 5. Changing the speeds of the two machines $M^{(x)}$ and $M^{(y)}$ from $s^{(x)} = x+\alpha$ and $s^{(y)} = y+\beta$ to $s^{(x)}+s^{(y)}-1 = x+y+\alpha+\beta-1$ and 1, respectively, does not decrease the cost of an optimal nonpreemptive schedule.
Proof. Consider optimal nonpreemptive schedules $S^*_{np}(I)$ for $I$, and $S^*_{np}(I')$ for $I'$, where $I'$ is the instance with the modified speeds (recall that we assume that the schedules are created by Algorithm QSumNP). According to our notation, for $I$, Algorithm QSumNP assigns $x$ and $y$ jobs to machines $M^{(x)}$ and $M^{(y)}$, respectively.
For instance $I$, let $\Omega(I)$ be the sorted list of $n \cdot m$ multipliers that consists of $n$ multipliers for each machine. Recall that in this list, the first $n$ elements are used multipliers, and the $(n+1)$th element is the first ready multiplier, which is equal to 1. For instance $I'$, list $\Omega(I')$ is defined similarly, though the value of the $(n+1)$th multiplier is yet to be evaluated. Since the jobs are unit-sized, the cost of a schedule (and in particular, the costs of $S^*_{np}(I)$ and of $S^*_{np}(I')$) is equal to the sum of all used multipliers in list $\Omega(I)$ (or in list $\Omega(I')$, respectively). For instance $I$, all fractional multipliers, i.e., those strictly less than 1, appear among the first $n$ elements of $\Omega(I)$ and thus they are used multipliers.
Moving from instance $I$ to instance $I'$, we transform list $\Omega(I)$ into list $\Omega(I')$ by removing all multipliers related to machines $M^{(x)}$ and $M^{(y)}$, and then inserting multipliers of the form $\frac{k}{s^{(x)}+s^{(y)}-1}$ and $k$, where $1 \le k \le n$ is an integer. The latter multipliers are, respectively, supplied by machines $M^{(x)}$ and $M^{(y)}$ in instance $I'$.
By assumption, in list $\Omega(I)$ the number of fractional multipliers is no larger than $n$, and the number of multipliers no larger than 1 is at least $n+1$. Before we proceed with the analysis for instance $I'$, we show that the value of the $(n+1)$th element of $\Omega(I')$ is 1, since we are currently examining only instances for which the first ready multiplier is equal to 1. For this purpose, we show that the number of fractional multipliers in $\Omega(I')$ is no larger than that of $\Omega(I)$, and that the number of multipliers no larger than 1 in $\Omega(I')$ is no smaller than that of $\Omega(I)$.
For any instance, a machine $M$ of speed $s$ supplies $\lceil s \rceil - 1$ fractional multipliers. The only difference between $\Omega(I)$ and $\Omega(I')$ in terms of fractional multipliers can be the difference between the numbers of such multipliers supplied by $M^{(x)}$ and $M^{(y)}$. Thus, the change in the number of fractional multipliers is $-\left((\lceil x+\alpha \rceil - 1) + (\lceil y+\beta \rceil - 1)\right) + \left(\lceil x+\alpha+y+\beta-1 \rceil - 1\right)$ (note that $M^{(y)}$ no longer supplies a fractional multiplier, as its modified speed is 1). Due to a simple property $\lceil \lambda_1 \rceil + \lceil \lambda_2 \rceil \ge \lceil \lambda_1 + \lambda_2 \rceil$, which holds for any real $\lambda_1$ and $\lambda_2$, we have that the change is nonpositive.
For any instance, a machine $M$ of speed $s$ supplies $\lfloor s \rfloor$ multipliers no larger than 1. The only difference between $\Omega(I)$ and $\Omega(I')$ in terms of such multipliers can be the difference between the numbers of such multipliers supplied by $M^{(x)}$ and $M^{(y)}$. Thus, the change in the number of multipliers no larger than 1 is $-\left(\lfloor x+\alpha \rfloor + \lfloor y+\beta \rfloor\right) + \left(\lfloor x+\alpha+y+\beta-1 \rfloor + 1\right)$ (in this case, $M^{(y)}$ now supplies one multiplier of value 1). Due to a simple property $\lfloor \lambda_1 \rfloor + \lfloor \lambda_2 \rfloor \le \lfloor \lambda_1 + \lambda_2 \rfloor$, which holds for any real $\lambda_1$ and $\lambda_2$, we have that the change is nonnegative.
Thus, $I'$ is a valid input. Compare the prefixes of length $n$ in the lists $\Omega(I)$ and $\Omega(I')$, which are essentially the lists of used multipliers. It follows that the prefix of length $n$ of $\Omega(I')$ contains all fractional multipliers of all machines, and for each machine other than $M^{(x)}$ and $M^{(y)}$, these multipliers are the same as in $\Omega(I)$. The remaining multipliers in the prefixes under consideration are the fractional multipliers supplied by machines $M^{(x)}$ and $M^{(y)}$, and possibly some multipliers of value 1 (ensuring that the total number of multipliers in each prefix is $n$).
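The comparison of the two prefixes can be illustrated on a small instance. The sketch below is an illustration only: the helper names `cost` and `merge` and the speeds 5/2, 3/2, 1 with $n = 3$ are hypothetical choices made here, and the cost of a nonpreemptive schedule of unit-sized jobs is taken, as described above, to be the sum of the $n$ smallest multipliers $k/s_i$.

```python
from fractions import Fraction

def cost(speeds, n):
    """Return (sum of the n smallest multipliers k/s, first ready multiplier)."""
    mults = sorted(Fraction(k) / s for s in speeds for k in range(1, n + 1))
    return sum(mults[:n]), mults[n]

def merge(speeds, i, j):
    """Replace speeds s_i and s_j by s_i + s_j - 1 and 1, as in Lemma 5."""
    out = list(speeds)
    out[i], out[j] = speeds[i] + speeds[j] - 1, Fraction(1)
    return out

# Illustrative instance: x + alpha = 5/2, y + beta = 3/2, one unit-speed machine.
speeds_I = [Fraction(5, 2), Fraction(3, 2), Fraction(1)]
n = 3
cost_I, ready_I = cost(speeds_I, n)
cost_I2, ready_I2 = cost(merge(speeds_I, 0, 1), n)
print(cost_I, ready_I, cost_I2, ready_I2)
```

In this example the first ready multiplier equals 1 both before and after the change of speeds, and the cost does not decrease, in line with the statement of Lemma 5.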
Let $\Gamma$ denote the sum of fractional multipliers for all machines excluding machines $M^{(x)}$ and $M^{(y)}$, and let $g$ be the number of such multipliers; both these values are the same for lists $\Omega(I)$ and $\Omega(I')$. The value $\frac{x(x+1)}{2(x+\alpha)} + \frac{y(y+1)}{2(y+\beta)}$ represents the sum of the $x+y$ smallest multipliers associated with machines $M^{(x)}$ and $M^{(y)}$ in list $\Omega(I)$. Since in instance $I$ machine $M^{(x)}$ receives $x$ jobs, and machine $M^{(y)}$ receives $y$ jobs, it follows that the used multipliers supplied by machines $M^{(x)}$ and $M^{(y)}$ are of the form $\frac{k}{x+\alpha}$ for $k = 1, 2, \ldots, x$, and of the form $\frac{k}{y+\beta}$ for $k = 1, 2, \ldots, y$, respectively. All remaining used multipliers in $\Omega(I)$ are equal to 1, and there are $n - g - (x+y)$ such multipliers. Thus, the cost for $I$ can be computed as

$\Gamma + (n - g - (x+y)) + \frac{x(x+1)}{2(x+\alpha)} + \frac{y(y+1)}{2(y+\beta)}.$

The proof is split into two cases, depending on the value of $\alpha+\beta$. Notice that the equality

(8) $\frac{z(z+1)}{2(z+\gamma)} = [\ldots]$

holds for any $z > 0$ and $\gamma \ge 0$. We will apply it several times, always for a positive integer $z$ and $0 \le \gamma \le 1$. In particular, for $z = x$ and $\gamma = \alpha$ we have

(9) $\frac{x(x+1)}{2(x+\alpha)} = [\ldots]$

and for $z = y$ and $\gamma = \beta$ we have

(10) $\frac{y(y+1)}{2(y+\beta)} = [\ldots]$

Case 1. Assume that $\alpha+\beta \ge 1$. [...] In order to show that the cost for $I'$ is no smaller than that of $I$, we show that the following holds:

(11) $\frac{(x+y)(x+y+1)}{2(x+y+\alpha+\beta-1)} \ge \frac{x(x+1)}{2(x+\alpha)} + \frac{y(y+1)}{2(y+\beta)}.$

Applying (8) with $z = x+y$ and $\gamma = \alpha+\beta-1$, we deduce $\frac{(x+y)(x+y+1)}{2(x+y+\alpha+\beta-1)} = [\ldots]$ We first prove that

(12) $[\ldots]$

which is equivalent to showing that the expression $G = [\ldots]$ is nonpositive. Using simple algebra we get that $G = [\ldots]$ Since $0 \le \alpha \le 1$ and $0 \le \beta \le 1$, we obtain that $G \le 0$, i.e., (12) holds. Using (12) we obtain $[\ldots]$ Applying (9) and (10), we finally deduce that $[\ldots]$, i.e., (11) holds.

Case 2. Assume that $\alpha+\beta < 1$. If $\alpha+\beta > 0$, then machine $M^{(x)}$ supplies $x+y-1$ fractional multipliers, and the cost of the corresponding schedule is

$\Gamma + (n - g - (x+y-1)) + \frac{(x+y-1)(x+y)}{2(x+y+\alpha+\beta-1)}.$

Comparing the cost of the corresponding schedule to $\Gamma + (n - g - (x+y)) + \frac{x(x+1)}{2(x+\alpha)} + \frac{y(y+1)}{2(y+\beta)}$, in order to show that this cost for $I'$ is no smaller than that of $I$, we show that the following holds:

(13) $1 + \frac{(x+y-1)(x+y)}{2(x+y-1+\alpha+\beta)} \ge \frac{x(x+1)}{2(x+\alpha)} + \frac{y(y+1)}{2(y+\beta)}.$

Applying (8) with $z = x+y-1$ and $\gamma = \alpha+\beta$, we deduce $\frac{(x+y-1)(x+y)}{2(x+y-1+\alpha+\beta)} = [\ldots]$ Next, we prove that

(14) $[\ldots]$

which is equivalent to showing that the expression $H = [\ldots]$ is nonpositive. Since $x, y \ge 1$ and $1 \ge \alpha, \beta \ge 0$, we have $[\ldots]$ and $[\ldots]$ Thus, $H \le 0$ and (14) holds. Using (14), we obtain $[\ldots]$ Applying (9) and (10), we finally deduce that (13) holds.

The remaining lemmas of this section will use all properties proved above, and will impose a final restriction on the types of instances that should be analyzed for computing $\rho_m$ or $\rho$.

Definition 1. An instance $I$ with $n$ unit-sized jobs and $m$ machines is called a good input if it satisfies the following conditions: (a) $n \ge m$; (b) machine $M_1$ has speed $s$, $1 < s \le n$, while the speed of each of the remaining machines $M_2, \ldots, M_m$ is 1; (c) in schedule $S^*_{np}(I)$ found by Algorithm QSumNP, at least one of the unit-speed machines is not assigned a job, so that the smallest ready multiplier is equal to 1.

Lemma 6. Any instance $I$ can be converted into a good input without decreasing the cost ratio.
Proof. As proved earlier in this section, it suffices to consider an initial instance $I$ of $n$ unit-sized jobs, and $m$ machines of speeds no smaller than 1, where $n \ge m$. Moreover, in $I$ the speeds are scaled so that the smallest ready multiplier is equal to 1. Further, by Lemmas 4 and 5, at least $m-1$ of these machines are of speed 1. We may assume that instance $I$ contains a faster machine $M_1$ of speed $s > 1$; otherwise, all machines are identical.
Assume that in schedule $S^*_{np}(I)$ each slower machine of speed 1 is assigned a job. Then the only machine that can supply the smallest ready multiplier equal to 1 is the faster machine $M_1$. This is, however, impossible since Algorithm QSumNP breaks ties for the smallest current multiplier in favor of a machine with a smaller index.
To prove the lemma, we need only show that the speed s of machine M 1 is at most n.
Assume that $s > n$. Let $s' = n$, and let $I'$ be a good input obtained from $I$ by changing the speed of machine $M_1$ to $s'$. We claim that in both schedules $S^*_{np}(I)$ and $S^*_{np}(I')$ all jobs are assigned to the first machine. This last claim holds since the first $n$ multipliers generated on the first machine are no larger than 1, while all the multipliers related to the other machines are no smaller than 1, and Algorithm QSumNP breaks ties in favor of a machine of a smaller index. Considering the jobs in the order of their completion, we get $C_k(S^*_{np}(I)) = \frac{k}{s}$ and $C_k(S^*_{np}(I')) = \frac{k}{s'}$, $1 \le k \le n$. Thus, the ratio $\Phi(S^*_{np}(I))/\Phi(S^*_{np}(I'))$ between the costs of optimal nonpreemptive schedules for instances $I$ and $I'$ is $\frac{s'}{s}$.

We show now that the ratio $\Phi(S^*_p(I))/\Phi(S^*_p(I'))$ between the costs of optimal preemptive schedules for $I$ and $I'$ is at least $\frac{s'}{s}$. To show this, consider an optimal preemptive schedule $S^*_p(I)$. Multiply the speed of each machine by a factor of $\frac{s'}{s}$. In the resulting schedule $S_p$, all completion times increase by a factor of $\frac{s}{s'}$, compared to the completion times in schedule $S^*_p(I)$. Increase the speed of each machine, except machine $M_1$, which now has a speed of $s' = n$, back to its original speed of 1, as in $I$. We obtain instance $I'$. Consider schedule $S_p(I')$, in which the starting times of parts of jobs on all machines are kept as in schedule $S_p$. Such a schedule is feasible since in instance $I'$ the speeds of machines $M_2, \ldots, M_m$ have been increased from $\frac{s'}{s}$ to 1. Thus, the cost of $S_p(I')$ is at most $\frac{s}{s'}$ times the cost of $S^*_p(I)$, i.e., $\Phi(S^*_p(I')) \le \Phi(S_p(I')) \le \frac{s}{s'}\,\Phi(S^*_p(I))$, as required, so that the cost ratio for instance $I$ is no more than that for a good input $I'$. This concludes the proof.
In fact, for analyzing $\rho$, we may further tighten the conditions of a good input, as shown below.

Lemma 7. Let $I$ be a good input with $n$ jobs and $m \le n$ machines, and a fast machine of speed $s \in (1, n]$. Then, there exists a good input $I'$ with $n$ jobs and $n$ machines, with a fast machine of the same speed $s$, such that the cost ratio for $I'$ is no less than that for $I$.

Proof. If $m = n$ in instance $I$, we are done. Thus, assume $m < n$. Add to $I$ $n-m$ machines of speed 1 to obtain $I'$ (with exactly $n-1$ machines of speed 1). These extra machines will not affect the cost of an optimal nonpreemptive schedule, since Algorithm QSumNP assigns no job to any of them. This implies that $\Phi(S^*_{np}(I')) = \Phi(S^*_{np}(I))$. On the other hand, it follows that $\Phi(S^*_p(I)) \ge \Phi(S^*_p(I'))$, since it is possible not to use the added machines in a preemptive schedule. Thus, $I'$ is a good input with $m = n$, and the cost ratio for $I'$ is no less than that for the original instance $I$.
To summarize the findings of this section, we present the following statement.
Theorem 1.For analyzing ρ, there is a tight sequence such that each instance in the sequence is a good input with equal numbers of unit-sized jobs and machines, and furthermore, the sequence of numbers of unit-sized jobs along the tight sequence is monotonically increasing and tends to infinity.
Each instance I that is described in Theorem 1 can be represented by a pair (n, s), where n is the number of unit-sized jobs and the number of processing machines, while s, 1 ≤ s ≤ n, is the speed of the faster machine M 1 , with the other machines being of speed 1.

The value of the power of preemption.
Let I be an instance of a tight sequence that satisfies the conditions of Theorem 1.For the optimal preemptive schedule, the value of the objective function can be found as stated below.
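Lemma 8. Let $S^*_p$ be an optimal preemptive schedule for $I$. Then

$\Phi(S^*_p) = n + 1 - s + s\left(\frac{s-1}{s}\right)^{n+1}.$

Proof. Applying Algorithm QSumP, we find an optimal schedule $S^*_p$ in which the jobs are completed on the fast machine $M_1$ in accordance with the increasing sequence $C_1(S^*_p), C_2(S^*_p), \ldots, C_n(S^*_p)$. It is clear that $C_1(S^*_p) = 1/s$, and for $k$, $1 \le k \le n-1$, job $J_{k+1}$ is processed in the time interval $[0, C_k(S^*_p)]$ on slow machines (since there are sufficiently many machines to start it at time zero), starts on machine $M_1$ at time $C_k(S^*_p)$, and is processed there during $\left(1 - C_k(S^*_p)\right)/s$ time units, so that

(16) $C_{k+1}(S^*_p) = C_k(S^*_p) + \frac{1 - C_k(S^*_p)}{s}, \quad 1 \le k \le n-1.$

We use the recursive relation (16) to prove by induction that

(17) $C_k(S^*_p) = 1 - \left(\frac{s-1}{s}\right)^k, \quad 1 \le k \le n.$

It is clear that (17) holds for $C_1(S^*_p) = 1/s$. Assume that it holds for all values of $k$, $1 \le k \le q < n$, and prove that it also holds for $k = q+1$. Using (16) and (17) for $k = q$, we obtain

$C_{q+1}(S^*_p) = 1 - \left(\frac{s-1}{s}\right)^q + \frac{1}{s}\left(\frac{s-1}{s}\right)^q = 1 - \left(\frac{s-1}{s}\right)^{q+1},$

as required.

Then we derive

$\Phi(S^*_p) = \sum_{k=1}^{n} C_k(S^*_p) = n - \sum_{k=1}^{n}\left(\frac{s-1}{s}\right)^k = n - (s-1)\left(1 - \left(\frac{s-1}{s}\right)^n\right) = n + 1 - s + s\left(\frac{s-1}{s}\right)^{n+1},$

which proves the lemma.

Now, we consider the value of the objective function of an optimal nonpreemptive schedule for instance $I$.

Lemma 9. Let $S^*_{np}$ be an optimal nonpreemptive schedule for the instance $I$. Then

$\Phi(S^*_{np}) = n - \lceil s \rceil + 1 + \frac{(\lceil s \rceil - 1)\lceil s \rceil}{2s}.$

Proof. First, assume that the speed $s$ of the fast machine is an integer. If Algorithm QSumNP assigns exactly $s-1$ jobs to machine $M_1$, then there are no more jobs left, since for the next job, if it existed, the multiplier on each machine would be 1, and that job would be assigned to $M_1$ due to the tie-breaking rule. However, the condition $n = s-1$ contradicts the assumption $s \le n$. Thus, in schedule $S^*_{np}$, Algorithm QSumNP assigns exactly $s$ jobs to the fast machine $M_1$, while each of the remaining jobs is processed alone on a slow machine, which is always available.

The completion times of the jobs in schedule $S^*_{np}$ are defined by $C_k(S^*_{np}) = k/s$ for $1 \le k \le s$ and $C_k(S^*_{np}) = 1$ for $s < k \le n$, so that $\Phi(S^*_{np}) = \frac{s+1}{2} + n - s$, i.e., for an integer $s$ the claim holds. If $s$ is not integral, then Algorithm QSumNP assigns $\lceil s \rceil - 1$ jobs to the fast machine $M_1$, while each of the remaining jobs is processed alone on a slow machine. Denote $s' = \lceil s \rceil - 1$. The completion times of jobs are defined by $C_k(S^*_{np}) = k/s$ for $1 \le k \le s'$ and $C_k(S^*_{np}) = 1$ for $s' < k \le n$.

For a positive integer $n$ and a real $s$ such that $1 \le s \le n$, we let

$\psi(n, s) = \frac{\Phi(S^*_{np})}{n + 1 - s + s\left(\frac{s-1}{s}\right)^{n+1}},$

where $\Phi(S^*_{np})$ is the value given in Lemma 9, and $\psi_n = \sup_{1 \le s \le n} \psi(n, s)$. Our next goal is to show that there is a constant upper bound on the sequence $\{\psi_n\}_n$. Here we show a bound of 4.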
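The two cost values and the ratio $\psi(n, s)$ can be evaluated directly for small instances. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the preemptive cost is computed from the recursion $C_{k+1} = C_k + (1 - C_k)/s$ with $C_1 = 1/s$ (the behavior described in the proof of Lemma 8), and the nonpreemptive cost is taken as the sum of the $n$ smallest multipliers for one machine of speed $s$ and $n-1$ machines of speed 1.

```python
from fractions import Fraction

def preemptive_cost(n, s):
    """Total completion time from C_{k+1} = C_k + (1 - C_k)/s, C_1 = 1/s."""
    c, total = Fraction(0), Fraction(0)
    for _ in range(n):
        c = c + (1 - c) / s   # next completion time on the fast machine
        total += c
    return total

def nonpreemptive_cost(n, s):
    """Sum of the n smallest multipliers: k/s on the fast machine,
    and a single multiplier 1 on each of the n - 1 slow machines."""
    mults = sorted([Fraction(k) / s for k in range(1, n + 1)]
                   + [Fraction(1)] * (n - 1))
    return sum(mults[:n])

def psi(n, s):
    """Cost ratio for the instance represented by the pair (n, s)."""
    return nonpreemptive_cost(n, s) / preemptive_cost(n, s)

# Largest ratio over a small illustrative grid of pairs (n, s), s in steps of 1/2.
worst = max(psi(n, Fraction(j, 2))
            for n in range(1, 13) for j in range(2, 2 * n + 1))
print(psi(2, 2), float(worst))
```

Using exact fractions makes it possible to compare the recursion with a closed-form expression such as $n + 1 - s + s\left(\frac{s-1}{s}\right)^{n+1}$ without rounding error.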
Lemma 10.For every positive integer n and real s such that 1 ≤ s ≤ n, we have ψ(n, s) ≤ 4.
Proof. Since $s \le n$, we have $n+1-s > 0$, and as $s\left(\frac{s-1}{s}\right)^{n+1} \ge 0$, the denominator in the definition of $\psi(n, s)$ is positive.
We are left with the case where $s > 7$ and $n < 7s/6$. We prove Lemma 10 by providing a lower bound on $\left(\frac{s-1}{s}\right)^{n+1}$. Since $\frac{s-1}{s} < 1$, we have that $\left(\frac{s-1}{s}\right)^{n+1} > \left(\frac{s-1}{s}\right)^{7s/6+1}$. The function $\left(\frac{s-1}{s}\right)^{s-1}$ is monotonically nonincreasing, and $\lim_{s \to \infty}\left(\frac{s-1}{s}\right)^{s-1} = \frac{1}{e} \approx 0.3678$. Thus $\left(\frac{s-1}{s}\right)^{s-1} \ge 0.367$. Thus $\left(\frac{s-1}{s}\right)^{7(s-1)/6} \ge 0.31$, and using $\left(\frac{s-1}{s}\right)^{13/6} \ge 0.7$, which holds by $s \ge 7$, we have $\left(\frac{s-1}{s}\right)^{7s/6+1} = \left(\frac{s-1}{s}\right)^{7(s-1)/6+13/6} \ge 0.21 > 1/5$. It is sufficient to show $\frac{n - s/2 + 1/2}{s/5 + n + 1 - s} \le 4$, or alternatively, $27s \le 30n + 35$, which holds as $s \le n$.

Now we are able to establish a tight bound on the power of preemption.

Theorem 2. [...] holds, i.e., $R(\mu_0)$ is an upper bound on the power of preemption, and moreover, this bound is tight.

Proof. For an integer $n$, let $c_n = \sup_{I \in \mathcal{I}_n} \frac{\Phi(S^*_{np}(I))}{\Phi(S^*_p(I))}$, where $\mathcal{I}_n$ is the set of inputs consisting of $n$ jobs. We have $\rho = \sup_{n \ge 1} c_n$. Using Lemma 2, we also have $\rho = \limsup_{n \to \infty} c_n$. We have shown in a sequence of lemmas that it is sufficient to consider inputs with unit-sized jobs, and moreover, for every $n$, it is sufficient to consider good inputs with $m$ machines (and $n$ unit-sized jobs). That is, by Lemma 2, the supremum for $\rho$ is achieved by a tight sequence of good inputs, with equal numbers of jobs and machines per input, where the number of unit-sized jobs, $n$, grows to infinity.
Since the worst case for $\rho$ is achieved by a tight sequence where the number of unit-sized jobs, $n$, grows to infinity, consider such a sequence of instances. In a tight sequence, an instance, which is a good input satisfying the properties in Theorem 1, is characterized by a pair $(n, s)$, where $n$ is the number of jobs and machines, while $s \le n$ is the speed of the faster machine. Consider all pairs $(n, s)$ along a tight sequence. Since $1 \le s \le n$, it follows that the sequence of values $\frac{s}{n}$ is bounded. By Lemma 10, the sequence of values $\psi_n$ is also bounded. Thus, there is a subsequence of indices (of $n$) along which both $\frac{s}{n}$ and $\psi_n$ converge; consider the sequence of instances corresponding to this subsequence of values of $n$. Observe that the instances related to this subsequence form a tight sequence as well, and the number of jobs in these instances grows to infinity. Let $\mu$ be the limit of $\frac{s}{n}$ when $n$ grows to infinity (along this subsequence).
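Thus, we deduce

$\lim_{n \to \infty} \psi(n, s) = \frac{1 - \mu/2}{1 - \mu + \mu e^{-1/\mu}} = R(\mu).$

The derivative of function $R(\mu)$ is $R'(\mu) = \frac{\mu - (2+\mu)e^{-1/\mu}}{2\mu\left(1 - \mu + \mu e^{-1/\mu}\right)^2}$, so that $R(\mu)$ reaches its maximum at a stationary point, which is the solution of the equation $2e^{-1/\mu} - \mu + \mu e^{-1/\mu} = 0$. Numerically, such a solution $\mu_0$ is approximately equal to $0.7959\ldots$, which gives an upper bound $R(\mu_0)$ on the power of preemption approximately equal to $1.39795\ldots$.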
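The stationary point can be located numerically. The sketch below is illustrative only: the bracketing interval $[0.5, 1]$, the plain bisection routine, and the function names are choices made here, and the form of $R(\mu)$ is an assumption, namely the limit of the ratio $\left(n - s/2\right)/\left(n + 1 - s + s\left(\frac{s-1}{s}\right)^{n+1}\right)$ when $s/n \to \mu$.

```python
import math

def f(mu):
    """Left-hand side of the equation 2 e^{-1/mu} - mu + mu e^{-1/mu} = 0."""
    e = math.exp(-1.0 / mu)
    return 2 * e - mu + mu * e

def R(mu):
    """Assumed limiting cost ratio (1 - mu/2) / (1 - mu + mu e^{-1/mu})."""
    return (1 - mu / 2) / (1 - mu + mu * math.exp(-1.0 / mu))

def bisect(g, lo, hi, iters=100):
    """Plain bisection; assumes g changes sign on [lo, hi]."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

mu0 = bisect(f, 0.5, 1.0)   # f(0.5) < 0 < f(1.0)
print(round(mu0, 4), round(R(mu0), 5))
```

Evaluating `R` at nearby values of `mu` is a quick way to confirm that the stationary point is a maximum rather than a minimum.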
To see that $R(\mu_0)$ is also a lower bound on the power of preemption, we can exhibit a tight sequence such that instance $I_\ell$ is associated with a pair of integers $(n_\ell, s_\ell)$. Instance $I_\ell$ is a good input that contains $n_\ell$ unit-sized jobs and $n_\ell$ machines, such that the speed of the fast machine is $s_\ell$, $1 < s_\ell \le n_\ell$, while the speed of each remaining machine is equal to 1. Moreover, $n_\ell \to \infty$ and $s_\ell / n_\ell \to \mu_0$ as $\ell \to \infty$. For the corresponding sequence of instances the sequence of cost ratios converges to $R(\mu_0)$.

The power of preemption for two machines.
The upper bound on the power of preemption established in Theorem 2 is a global bound that holds for all instances of the problem of minimizing the total completion time on uniformly related machines. This power of preemption is achieved as a limit for instances with very large numbers of jobs and machines. However, for a fixed number of machines a smaller bound can be derived, as shown below for the case of $m = 2$.

Theorem 3. In the case of two machines, $\rho_2 = \frac{6}{5}$, and this bound is tight.

Proof. If for some instance $I$, $\frac{\Phi(S^*_{np}(I))}{\Phi(S^*_p(I))} = \rho_2$, then, as established in section 4, we may assume that $I$ is a good input. Consider such an input that consists of $n$ unit-sized jobs. Assume that machine $M_1$ has speed $s$, $1 < s \le n$, while the speed of machine $M_2$ is 1. For a schedule $S$, let $C_k(S)$, $1 \le k \le n$, be a nondecreasing sequence of the completion times in $S$.
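Recall that in an optimal nonpreemptive schedule, for a good input at least one slow machine is not assigned any jobs. Thus, in our case, in schedule $S^*_{np}(I)$ all jobs are processed on the fast machine $M_1$, so that

(20) $\Phi(S^*_{np}(I)) = \sum_{k=1}^{n} \frac{k}{s} = \frac{n(n+1)}{2s}.$

Notice that since the jobs are unit-sized, Algorithm QSumNP associates the multiplier $\frac{k}{s}$ with the job to be scheduled in the $k$th position on machine $M_1$. In particular, in order to assign the last job to machine $M_1$, the current multiplier $\frac{n}{s}$ on machine $M_1$ cannot be larger than 1, so we have $n \le s$. This, together with the condition $s \le n$, implies that $s = n$ in instance $I$. Therefore, since $s = n$, (20) becomes

(21) $\Phi(S^*_{np}(I)) = \frac{n+1}{2}.$

Recall that in any preemptive schedule, the makespan, i.e., the completion time of the last job, cannot be smaller than the ratio of all processing times to the sum of speeds, which holds by simple averaging; see (6). This implies (when applied for a prefix of $k$ jobs of the job sequence) that $C_k(S^*_p) \ge \frac{k}{s+1}$, $1 \le k \le n$.

First, assume that $n \ge 5$. Then due to $s = n$, the inequality $C_k(S^*_p) \ge \frac{k}{n+1}$ holds for each $k$, $1 \le k \le n$, so that $\Phi(S^*_p) \ge \frac{n}{2}$, and the cost ratio is therefore no larger than $\frac{n+1}{n} \le \frac{6}{5}$ for $n \ge 5$.

We now prove the upper bound for small values of $n$, $1 \le n \le 4$. If $n = 1$, then obviously $\Phi(S^*_p) = C_1(S^*_p) = \Phi(S^*_{np}) = C_1(S^*_{np}) = 1/s$. For each value of $n \in \{2, 3, 4\}$, we compute and analyze the ratios [...] Thus, $C_1(S^*_p) + C_2(S^*_p) = \frac{3}{s} - \frac{1}{s^2}$, $C_1(S^*_p) + C_2(S^*_p) + C_3(S^*_p) = \frac{6}{s} - \frac{3}{s^2} + \frac{1}{s^3}$, and $C_1(S^*_p) + C_2(S^*_p) + C_3(S^*_p) + C_4(S^*_p) = [\ldots]$ Applying (21) and the above expressions with $s = n$, we obtain [...]

To complete the proof, we show that $\frac{6}{5}$ is a tight bound on the power of preemption for $m = 2$. To see this, consider an instance with two unit-sized jobs and $s = 2$. We have that $C_1(S^*_{np}) = C_1(S^*_p) = 1/2$, $C_2(S^*_{np}) = 1$, and $C_2(S^*_p) = 3/4$, so that $\Phi(S^*_{np}) = 3/2$ and $\Phi(S^*_p) = 5/4$, leading to the cost ratio of $\Phi(S^*_{np})/\Phi(S^*_p) = \frac{6}{5}$.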
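The small cases $n \in \{2, 3, 4\}$ can be explored with exact arithmetic. The sketch below rests on an assumption that is not spelled out in the text: with one fast machine and one unit-speed machine, each job is taken to run on the slow machine while its predecessor occupies the fast machine and then to finish on the fast machine, which gives $C_{k+1} = C_k + \left(1 - (C_k - C_{k-1})\right)/s$ with $C_0 = 0$ and $C_1 = 1/s$. This recursion reproduces the partial sums $\frac{3}{s} - \frac{1}{s^2}$ and $\frac{6}{s} - \frac{3}{s^2} + \frac{1}{s^3}$ quoted in the proof; the function names are hypothetical.

```python
from fractions import Fraction

def two_machine_preemptive(n, s):
    """Completion times under the assumed recursion for m = 2."""
    prev, cur, times = Fraction(0), Fraction(0), []
    for _ in range(n):
        # The next job uses the slow machine during (prev, cur],
        # then completes its remaining work on the fast machine.
        prev, cur = cur, cur + (1 - (cur - prev)) / s
        times.append(cur)
    return times

def ratio(n):
    """Cost ratio of a good input with two machines, where s = n."""
    s = Fraction(n)
    nonpreemptive = sum(Fraction(k) / s for k in range(1, n + 1))
    return nonpreemptive / sum(two_machine_preemptive(n, s))

for n in range(1, 5):
    print(n, ratio(n))
```

Under this assumption the largest ratio among these small cases occurs at $n = 2$, which matches the instance used in the tightness argument.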

Fig. 1. Interval $T$ (a) in schedule $S^*_p$; (b) in the modified schedule.

... from $\frac{s'}{s}$ to 1.
Thus, the cost of $S_p(I')$ is at most $\frac{s}{s'}$ times the cost of $S^*_p(I)$, i.e., $\Phi(S^*_p(I')) \le \Phi(S_p(I')) \le \frac{s}{s'}\,\Phi(S^*_p(I))$, as required.

© 2017 SIAM. Published by SIAM under the terms of the Creative Commons 4.0 license.

Theorem 3. In the case of two machines, $\rho_2 = \frac{6}{5}$, and this bound is tight. Proof. If for some instance $I$, $\frac{\Phi(S^*_{np}(I))}{\Phi(S^*_p(I))} = \rho_2$, ... proof, we show that $\frac{6}{5}$ is a tight bound on the power of preemption for $m = 2$. To see this, consider an instance with two unit-sized jobs and $s = 2$. We have that $C_1(S^*_{np}) = C_1(S^*_p) = 1/2$, $C_2(S^*_{np}) = 1$, and $C_2(S^*_p) = 3/4$, so that $\Phi(S^*_{np}) = 3/2$ and $\Phi(S^*_p) = 5/4$, leading to the cost ratio of $\Phi(S^*_{np})/\Phi(S^*_p) = \frac{6}{5}$.
..., this bound is tight for the class of instances where $C_{\max}(S^*_p) = T_m$ holds; however, for the class of instances for which $C_{\max}(S^*_p)$ ...

Table 1
$C_1(S^*_p), C_2(S^*_p), \ldots, C_n(S^*_p)$. It is clear that $C_1(S^*_p) = 1/s$, and for $k$, $1 \le k \le n-1$, job $J_{k+1}$ is processed in the time interval $[0, C_k(S^*_p)]$ on slow machines (since there are sufficiently many machines to start it at time zero), starts on machine $M_1$ at time $C_k(S^*_p)$, and is processed there during $1 - C_k(S^*_p)$ ...

... holds, i.e., $R(\mu_0)$ is an upper bound on the power of preemption, and moreover, this bound is tight. Proof. For an integer $n$, let $c_n = \sup_{I \in \mathcal{I}_n}$ ...

... holds for each $k$, $1 \le k \le n$, and the cost ratio is therefore no larger than $\frac{6}{5}$ for $n \ge 5$. We now prove the upper bound for small values of $n$, $1 \le n \le 4$. If $n = 1$, then obviously $\Phi(S^*_p) = C_1(S^*_p) = \Phi(S^*_{np}) = C_1(S^*_{np}) = 1/s$. For each value of $n \in \{2, 3, 4\}$, we compute and analyze the ratios ... Thus, $C_1(S^*_p) + C_2(S^*_p) = \frac{3}{s} - \frac{1}{s^2}$, $C_1(S^*_p) + C_2(S^*_p) + C_3(S^*_p) = \frac{6}{s} - \frac{3}{s^2} + \frac{1}{s^3}$, and $C_1(S^*_p) + C_2(S^*_p) + C_3(S^*_p) + C_4(S^*_p)$ ...