Main Questions:
- Is my runtime analysis correct?
- Is my approach optimal or close to it?
- Is there a better way and/or improvements?
Objective:
- Write an algorithm to assign
n
jobs tok
entities (servers, cpus, whatever) in a round robin fashion. The job that takes longest will be assign first, the shortest last. In the case of a tie, the job with lowest id will be assigned first. - return the job ids assigned to each entity, where an entity can be represented as a list
sample input:
jobs= [40, 80,40, 4, 2]
- entities = 2
- where
jobs[i]
is the length of time the task, andi
is its identifier.
sample output:
- [[1, 2, 4], [0, 3]]
I came up with two algorithms and wanted to see if my runtime analysis is correct, and if there is a better way to approach this
Implementation 1: Linear Search
Pseudocode:
- linear search for the longest job, from left to right, saving its job id (index)
- only update the index if we find a larger job in list
- once we find it assign to another list, which will contain the jobs in order
- iterate over the ordered list, assigning to each server in round robin fashion
Hypothesis for runtime:
- time proportional to
n ^ 2
, wheren
are the number of jobs
Code
def findMax(jobs):
ind_of_max = 0
### iterate over jobs
for k in range(len(jobs)):
# if current job we are processing is greater than the previous lomgest job
if jobs[k] > jobs[ind_of_max]:
ind_of_max = k
# sinonimouns to removing the job from the PQ
jobs[ind_of_max] = -1
return ind_of_max
def determineJobOrder(jobs):
# list of jobIds
order_of_jobs = []
num_of_jobs = len(jobs)
#### TOTAL RUNTIME ANALYSIS:
# - runs in time proportional to N ^ 2
####
# runs in time proportional to N
while num_of_jobs > 0:
# removing jobs from list (aka performing dequeue operations until we run out of jobs)
# runs in time proportional to N
ind_of_max = findMax(jobs)
order_of_jobs.append(ind_of_max)
num_of_jobs -= 1
return order_of_jobs
def assignJobsToServers(servers, jobs):
jobs = determineJobOrder(jobs)
# create a list of lists, where the number of inner lists = number of servers
assignment = [[] for i in range(0, servers)]
num_jobs = len(jobs)
job_ind = 0
server_ind = 0
###
### RUNTIME ANALYSIS:
# - runs in time proportional to n
#while there are jobs left to assign
while num_jobs > 0:
assignment[server_ind % servers].append(jobs[job_ind])
# update server index, job index, and number of jobs left to assign
num_jobs -= 1
job_ind += 1
server_ind += 1
return assignment
Implementation 2: Using Heap
Pseudocode:
- Iterate over all jobs, creating a Job object, Job(job_id, time)
- Since python's
heapq
class is a min heap, I useheapq.nlargest
, passing a key for comparison, the list of Jobs
Hypothesis of runtime:
- time proportional to
n log n
, where n are the number of jobs
Code:
## To enncapsulate the jobs and be able to compare by multiple attributes
class Job:
def __init__(self, job_id, time):
self.job_id = job_id
self.time = time
def __repr__(self):
return 'Job(id: {}, time: {})'.format(self.job_id, self.time)
import heapq
def assignJobsToServers(servers, jobs):
for k, exec_time in enumerate(jobs):
jobs[k] = Job(k, exec_time)
# resource used to figure out how to sort by two attributes, one in natural order one in reverse:
# https://www.techiedelight.com/sort-list-of-objects-by-multiple-attributes-python/
#### Runtime Analysis:
# - I am assuming that what this will do is build the max heap, and then remove the n items to get them in max order.
# - Inserts into heap take log (n).
# - removals take log(n)
# - if we are enqueueing n items, this will take n log n
# - if we are removing n items, this will take n log n for removals
# - hence: runtime should be proportional n log n, ignoring constants
####
# key defines that that the larger the time the larger the, and on a tie, the smallest the job id the largest the item
jobs_ordered = heapq.nlargest(len(jobs), jobs, key = lambda job: (job.time, -job.job_id))
# create a list of lists, where the number of inner lists = number of servers
assignments = [[] for i in range(0, servers)]
num_jobs = len(jobs)
server_ind = 0
job_ind = 0
###### RUNTIME ANALYSIS:
# - at this point we already did all the difficult computation. We now just iterate over all jobs which will be in order and assign
# - this takes time proportional to n, the number of jobs
while num_jobs > 0:
assignments[server_ind % servers].append(jobs_ordered[job_ind].job_id)
# update server index, and number of jobs left to assign
num_jobs -= 1
server_ind += 1
job_ind += 1
return assignments