Title: Matt Mutka and Miron Livny 1990
1 Scheduling Remote Processing Capacity in a
Workstation-Processor Bank Network (Up-Down
Algorithm in Condor)
- By
- Matt Mutka and Miron Livny (1990)
- The 7th International Conference on Distributed Computing Systems
- Reviewed by
- Paskorn Champrasert
2 Problems
- Workstations are powerful machines capable of executing millions of instructions each second.
- The processing demands of the owner are often much smaller than the capacity of the workstation.
- Other users face the opposite problem: the capacity of their workstations is much too small to meet their processing demands.
- Can we provide a high quality of service in a highly utilized network of workstations?
3 Condor Up-Down Algorithm
- The Condor system schedules long-running background jobs at idle workstations.
- Background jobs are long-running processes that do not require interaction with the user.
- The paper explores algorithms for the management of idle workstation capacity.
- Jobs from users who request large amounts of capacity should be granted as much as possible, without inhibiting access to capacity for other users who want smaller amounts.
- The Up-Down algorithm is designed to allow fair access to remote capacity (fair between jobs of light users and jobs of heavy users).
4 System Design
- Background jobs require several hours of CPU time and little interaction with their users.
- A system has been designed and implemented to execute background jobs remotely at idle workstations.
5 Scheduling Structure
- A centralized coordinator assigns background jobs to execute at available remote workstations.
- The coordinator gathers system information in order to implement the meta-scheduling policy:
- the list of running jobs and waiting jobs
- the location of idle stations.
7 Scheduling Structure
- Each workstation contains a local scheduler.
- Each workstation makes its own decision about which job in its process queue should be executed next.
- One workstation works as the coordinator and runs the central coordinator process.
- Every 2 minutes the central coordinator gets information from the workstations to see
- which workstations are available
- which workstations have background jobs waiting.
- If a background job (remote process) is running on a workstation:
- The local scheduler in the workstation checks every ½ minute to see if the background job should be preempted because the local user has resumed using the station.
- If so, the local scheduler immediately preempts the background job and checkpointing is invoked.
- Checkpointing of a program is the saving of an intermediate state of the program so that its execution can be restarted from this intermediate state.
- The coordinator submits the preempted background job to another idle workstation.
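The preempt-and-reassign cycle above can be sketched as follows; the function and field names are illustrative assumptions, not the actual Condor interfaces.

```python
# Sketch of the half-minute local-scheduler check and the coordinator's
# reassignment of a preempted job. Names are illustrative assumptions.

def local_scheduler_step(owner_active, background_job):
    """Run once per half-minute check interval on one workstation.
    Returns a checkpoint of the preempted job (handed back to the
    coordinator), or None if the job may keep running."""
    if background_job is not None and owner_active:
        # The local user has resumed: preempt immediately and save an
        # intermediate state so execution can later be restarted.
        return {"job": background_job, "state": "intermediate-state"}
    return None

def reassign(checkpoint, idle_workstations):
    """Coordinator side: place a preempted job on another idle
    workstation, or let it wait if none is idle."""
    if not idle_workstations:
        return None
    return (idle_workstations[0], checkpoint)
```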
8 Fair Access to Remote Cycles
- The authors observed that the users can be divided into 2 groups:
- 1. Heavy users - try to consume all available capacity for long periods.
- 2. Light users - consume remote cycles occasionally.
- All users should be served fairly.
- Heavy users should not inhibit light users from accessing remote cycles.
9 Up-Down Algorithm
- The Up-Down algorithm enables heavy users to maintain steady access to remote cycles while providing fair access to cycles for light users.
- The Up-Down algorithm protects the rights of light users when a few heavy users try to monopolize all free resources.
10 Up-Down Algorithm
- Each workstation maintains a Schedule Index (SI).
- The Schedule Index table (the SI of all workstations) is maintained at the Condor coordinator.
- The value of SI is used to decide which workstation is next to be allocated a remote process.
- Workstations with smaller SI entries are given priority over workstations with larger SI (lower SI, higher priority).
- Initially, SI is set to zero.
11 Up-Down Algorithm
- Each workstation maintains a scheduling index (SI) as its priority for preempting remote cycles.
- At the beginning, SI is 0.
- In each scheduling interval, each workstation's SI increases or decreases:
- a workstation with a light user has a very low SI (high priority)
- a workstation with a heavy user has a very high SI (low priority).
- Periodically, the coordinator checks if any workstations have new background jobs to execute.
- If a workstation with high priority has a job to execute and there is no idle workstation, the coordinator preempts a remotely executing background job of a low-priority workstation:
- the preempted job is checkpointed (its intermediate state is saved)
- the coordinator scheduler will place the preempted job on a lower-priority workstation.
- The Up-Down algorithm dynamically updates SI:
- Light users that increase their load (number of background jobs) will have their SI increase (lower priority), so they will be considered heavy users.
- Heavy users that decrease their load will have their SI decrease (higher priority), so they will be considered light users.
12 (Figure: SI update rules for a node and available remote processing cycles.)
- f, g, h, l are the SI-updating functions:
- f increases SI when a light user increases its load
- g decreases SI when a process has to wait for remote cycles; SI is decreased if a station wants a remote processor but was denied
- h and l stabilize the SI when workstations do not want remote cycles.
13 Example of SI updates over time
- SI is initialized to 0.
- When a job arrives and no allocation is given -> SI decreases by g(SI).
- After an allocation is made -> SI increases by f(SI).
- If two allocations are given -> SI increases twice as fast (2 f(SI)).
- After one job completes -> SI still increases by f(SI).
- After the second job completes -> SI decreases by h(SI).
- Once a station's SI reaches zero, it stays there until the station asks for a node.
- Scheduling interval: 10 minutes.
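A minimal sketch of these SI dynamics, assuming simple linear forms for f, g, and h (the slides do not give their exact definitions):

```python
# Sketch of the Up-Down scheduling-index (SI) update described above.
# The rates F, G, H are illustrative stand-ins for the slide's f, g, h.

def update_si(si, allocations, waiting, interval=1.0):
    """Return the new SI of one workstation after a scheduling interval.

    si          -- current scheduling index (lower SI = higher priority)
    allocations -- number of remote allocations the station currently holds
    waiting     -- True if the station wants a remote node but was denied
    """
    F = 1.0  # assumed per-allocation increase rate (the slide's f)
    G = 1.0  # assumed decrease rate while waiting (the slide's g)
    H = 0.5  # assumed decay rate toward zero (the slide's h)

    if allocations > 0:
        # SI rises in proportion to the number of allocations held:
        # two allocations increase SI twice as fast (2*f).
        si += allocations * F * interval
    elif waiting:
        # Denied requests lower SI, raising the station's priority.
        si -= G * interval
    elif si > 0:
        # With no demand, SI decays toward zero and stays there.
        si = max(0.0, si - H * interval)
    return si
```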
14 Algorithms Used for Comparisons
- For comparison with the Up-Down algorithm, the Random algorithm and the Round-Robin algorithm are used.
- Random and Round-Robin are non-preemptive algorithms: after a process is allocated to a remote resource, it runs until it terminates.
- The Random algorithm:
- All its decisions are made without reference to past decisions.
- When a workstation wants to place a process on a remote workstation, the Random algorithm randomly picks one of the available remote workstations.
- The Round-Robin algorithm:
- Each workstation is given a chance, in a particular order, to receive remote cycles.
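The three placement policies can be sketched side by side; the station names and the SI table below are illustrative assumptions, not data from the paper.

```python
import random
from itertools import cycle

# Sketches of the three placement policies compared in the paper.

def pick_random(idle):
    """Random: pick an available remote workstation with no memory of
    past decisions."""
    return random.choice(idle)

def make_round_robin(stations):
    """Round-Robin: offer remote cycles to workstations in a fixed,
    repeating order."""
    order = cycle(stations)
    def pick(idle):
        while True:
            s = next(order)
            if s in idle:
                return s
    return pick

def pick_up_down(requesters, si_table):
    """Up-Down: among stations requesting cycles, serve the one with
    the smallest scheduling index (lowest SI = highest priority)."""
    return min(requesters, key=lambda w: si_table[w])
```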
15 Simulation Configuration
- Number of workstations: 13
- 11 workstations have a light user (number of background jobs: 1)
- 1 workstation has a heavy user (number of background jobs varied from 2 to 13)
- 1 workstation has a medium user (number of background jobs: 2)
- The background jobs have a mean service demand of 2 hours for all workstations.
- Scheduling interval: 10 minutes
- Job transfer cost (time to transfer a job): 1 minute
- The simulation was run for 2 years of simulated time.
16 Simulation Configuration
- (Figure annotation:) SI will be very high; the authors want to reduce SI to zero faster.
17 Simulation Results
- The Remote Cycle Wait Ratio is calculated as the remote execution time a workstation received divided by its wait time (higher is better).
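The definition above amounts to a simple ratio; the numbers in the example are illustrative, not taken from the paper.

```python
# The Remote Cycle Wait Ratio defined above.

def remote_cycle_wait_ratio(remote_exec_time, wait_time):
    """Remote execution time received divided by wait time; higher is
    better (more remote work obtained per unit of waiting)."""
    return remote_exec_time / wait_time
```

For example, 10 hours of remote execution obtained after 2 hours of waiting gives a ratio of 5.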
18 Extra Slides
19 Performance
- 23 workstations were observed for one month.
- The workstations operate under BSD 4.3 Unix.
- One workstation works as the coordinator.
20 (Chart: average job demand per user. User A is the heavy user; users B-E are light users.)
21 Total Queue Length
- The heavy user kept more than 30 jobs in the queue.
- Wait ratio = the amount of time that a job waits / its service time.
- Light users did not wait: their wait ratio is very small. The Up-Down algorithm allocated remote capacity to the light users and preempted the heavy user.
22 Capacity Utilization
- 12,438 machine hours were available for remote execution.
- 4,771 machine hours were utilized by Condor.
- The average local utilization is only 25%.
- Almost 200 machine days of capacity that otherwise would have been lost were consumed by Condor.
- Utilization of the system over one working week (Mon-Fri): about 20% in the evening and night, with a short 50% peak period in the afternoon.
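The "almost 200 machine days" figure follows directly from the hours quoted above:

```python
# Arithmetic check of the capacity figures above: 4,771 machine hours
# utilized by Condor is indeed almost 200 machine days.
machine_days = 4771 / 24  # ~198.8 machine days, i.e. "almost 200"
```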
23 Queue length
24 Impact on Local Workstations
- Some local capacity is consumed by
- placement and checkpointing of remote jobs
- the local scheduler.
- The coordinator also consumes some resources.
- Results:
- The local scheduler consumes less than 1% of a station's capacity.
- The coordinator consumes less than 1% of a station's capacity.
- Checkpointing cost depends on the size of the job:
- the size of a checkpoint is approximately 0.5 megabyte
- writing takes approximately 5 seconds per megabyte of checkpoint file
- about 2.5 seconds for one checkpoint.
25 Leverage is the ratio of the capacity consumed by a job remotely to the capacity consumed on the home station to support remote execution. Large -> the job consumes mostly remote capacity; small -> the job consumes mostly local capacity.
The average leverage is 1300: for 1 minute of local capacity, about 22 hours of remote capacity are received.
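The "22 hours" figure follows from the average leverage:

```python
# Check of the leverage figure quoted above: an average leverage of 1300
# means one minute of local capacity buys about 1300 minutes of remote
# capacity, i.e. roughly 22 hours.
leverage = 1300
remote_hours_per_local_minute = leverage / 60  # ~21.7 hours
```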
26 Conclusions
- Almost 200 machine days of capacity that otherwise would have been lost were consumed by Condor.
- Users dedicate a small amount of local resources to access a huge amount of remote resources.
27 The Remote Unix (RU) Facility
- The workstations operate under BSD 4.3 Unix.
- Remote Unix turns idle workstations into cycle servers.
- A shadow process runs locally as the substitute for the process running on the remote machine.
- Any Unix system call (e.g., reading/writing files) made by the program on the remote machine communicates with the shadow process.
- A message indicating the type of system call is sent to the shadow process on the local machine; this can be viewed as a remote procedure call.
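The shadow-process idea can be sketched as a tiny remote-procedure-call loop; the message format and handler names below are illustrative assumptions, not the actual RU protocol.

```python
import json

# Sketch of shipping a system call from the remote job to its shadow.

def encode_syscall(name, *args):
    """Remote side: package a system call as a message for the shadow."""
    return json.dumps({"syscall": name, "args": list(args)})

def shadow_handle(message, handlers):
    """Home side: the shadow decodes the message and performs the call
    locally, like servicing a remote procedure call."""
    req = json.loads(message)
    return handlers[req["syscall"]](*req["args"])
```

A stand-in handler table shows the round trip: the remote program encodes a call, the shadow executes it on the home machine and returns the result.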
28 Checkpointing
- When a job is removed from a remote location, checkpointing is invoked.
- Checkpointing of a program is the saving of the state of the program so that its execution can be restarted.
- The state of a process contains:
- the text (executable code)
- the data (variables of the program)
- the stack segments of the program
- the values of the registers
- the status of the open files
- messages sent from the program to its shadow for which no reply has been received.
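The state components listed above can be collected into a single record; the field types and contents are illustrative, not the actual Condor checkpoint format.

```python
from dataclasses import dataclass
from typing import Dict, List

# Sketch of the process state a checkpoint must capture, following the
# list above.

@dataclass
class Checkpoint:
    text: bytes                      # executable code
    data: bytes                      # variables of the program
    stack: bytes                     # stack segments
    registers: Dict[str, int]        # values of the registers
    open_files: List[dict]           # status (path, mode, offset) of open files
    unreplied_messages: List[bytes]  # messages to the shadow with no reply yet
```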