Effect of Context Aware Scheduler on TLB

Provided by: aleCsceK
Transcript and Presenter's Notes

1
Effect of Context Aware Scheduler on TLB
  • Satoshi Yamada
  • PhD Candidate
  • Kusakabe Laboratory

2
Contents
  • Introduction
  • Overhead of Context Switch
  • Context Aware Scheduler
  • Benchmark Applications and Measurement
    Environment
  • Result
  • Related Works
  • Conclusion

3
Widespread Multithreading
  • Multithreading hides the latency of disk I/O and
    network access
  • Threads in many languages, such as Java, Perl, and
    Python, correspond to OS threads

More context switches happen today, so the process
scheduler in the OS is more responsible for
the system performance
4
Context Switch and Caches
  • The overhead of a context switch:
  • includes that of loading a new working set for
    the next process
  • is deeply related to the utilization of caches
  • Agarwal, et al., Cache performance of operating
    system and multiprogramming workloads (1988)
  • Mogul, et al., The effect of context switches
    on cache performance (1991)

[Figure: the working sets of Process A and Process B overflow the cache]
5
Advantage of Sibling Thread
[Figure: a parent creating a PROCESS with fork() versus creating a THREAD. Labels: task_struct (fields mm, signal, file, ...), mm_struct, signal_struct]
The OS does not have to switch memory address spaces
when switching between sibling threads, so
we can expect a reduction in the overhead of a
context switch
6
Contents
  • Introduction
  • Overhead of Context Switch and TLB
  • Context Aware Scheduler
  • Benchmark Applications and Measurement
    Environment
  • Result
  • Related Works
  • Conclusion

7
TLB flush in Context Switch
  • The TLB is a cache that stores translations from
    virtual addresses to physical addresses
  • TLB translation latency: 1 ns
  • TLB miss overhead: several accesses to memory
  • On x86 processors, most TLB entries are
    invalidated (flushed) on every context switch
    that changes the memory address space

A TLB flush does not happen in a context switch
between sibling threads
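
The effect described on this slide can be illustrated with a toy model. The sketch below is not the presenter's code: `ToyTLB`, `run`, and the address-space labels `"P1"` and `"P2"` are hypothetical names, and the model flushes every entry on an address-space change, whereas the slide says only "most" entries are invalidated.

```python
class ToyTLB:
    """Toy TLB: caches virtual-page translations for one address space."""

    def __init__(self):
        self.entries = set()     # virtual pages with a cached translation
        self.current_mm = None   # address space currently loaded
        self.flushes = 0
        self.misses = 0

    def switch_to(self, mm):
        """Context switch to a thread whose address space is `mm`."""
        if mm != self.current_mm:
            # Different address space: cached translations become invalid.
            # Assumption: flush everything (the slide says "most" entries).
            if self.current_mm is not None:
                self.flushes += 1
            self.entries.clear()
            self.current_mm = mm
        # Sibling threads share `mm`, so nothing is flushed for them.

    def access(self, vpage):
        """Translate a virtual page; count a miss if it is not cached."""
        if vpage not in self.entries:
            self.misses += 1
            self.entries.add(vpage)


def run(schedule, pages=(0, 1, 2, 3)):
    """Run one thread per entry of `schedule` (a list of address-space ids).

    Every thread touches the same small set of pages. Returns
    (flushes, misses) for the whole schedule.
    """
    tlb = ToyTLB()
    for mm in schedule:
        tlb.switch_to(mm)
        for page in pages:
            tlb.access(page)
    return tlb.flushes, tlb.misses


interleaved = run(["P1", "P2", "P1", "P2"])  # threads of two processes alternate
grouped = run(["P1", "P1", "P2", "P2"])      # sibling threads run back to back
```

In this model the grouped schedule flushes once instead of three times and misses half as often, which is the behavior the slide attributes to switching between sibling threads.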
8
Overhead due to a context switch, measured by lat_ctx in
LMbench
9
Contents
  • Introduction
  • Overhead of Context Switch and TLB
  • Context Aware Scheduler
  • Benchmark Applications and Measurement
    Environment
  • Result
  • Related Works
  • Conclusion

10
O(1) Scheduler in Linux
  • The O(1) scheduler runqueue has:
  • an active queue and an expired queue
  • a priority bitmap and an array of linked lists of
    threads
  • The O(1) scheduler:
  • searches the priority bitmap
  • chooses a thread with the highest priority

Scheduling overhead is independent of the number
of threads
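
A minimal sketch of the bitmap-plus-array structure described above, written in Python for illustration rather than taken from the kernel. `PrioArray`, `enqueue`, and `pick_next` are hypothetical names; the 140 priority levels, with a lower index meaning a higher priority, follow the Linux O(1) scheduler. The lookup cost depends on the fixed number of priority levels, not on how many threads are queued.

```python
from collections import deque

NUM_PRIOS = 140  # priority levels; a lower index means a higher priority


class PrioArray:
    """One queue of the runqueue: a priority bitmap plus per-priority lists."""

    def __init__(self):
        self.bitmap = 0  # bit p is set when queues[p] is non-empty
        self.queues = [deque() for _ in range(NUM_PRIOS)]

    def enqueue(self, thread, prio):
        self.queues[prio].append(thread)
        self.bitmap |= 1 << prio

    def pick_next(self):
        """Return (thread, prio) for the highest-priority thread, or None."""
        if self.bitmap == 0:
            return None
        # Isolate the lowest set bit, i.e. the highest priority in use.
        prio = (self.bitmap & -self.bitmap).bit_length() - 1
        thread = self.queues[prio].popleft()
        if not self.queues[prio]:
            self.bitmap &= ~(1 << prio)  # list drained: clear its bit
        return thread, prio


demo = PrioArray()
demo.enqueue("t1", 120)
demo.enqueue("t2", 100)
demo.enqueue("t3", 120)
picked = [demo.pick_next() for _ in range(4)]
```

Here `t2` is picked first because priority 100 outranks 120, and the two threads at priority 120 follow in the order they were queued.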
11
Context Aware (CA) Scheduler
  • The CA scheduler creates auxiliary runqueues per
    group of threads
  • The CA scheduler compares Preg and Paux
  • Preg: the highest priority in the regular O(1)
    scheduler runqueue
  • Paux: the highest priority in the auxiliary
    runqueue
  • If Preg - Paux < threshold, then we choose Paux
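
The comparison above can be sketched as a small function. This is an illustration under stated assumptions, not the presenter's implementation: `choose_runqueue` is a hypothetical name, priorities are treated as plain numbers where a larger value means a higher priority, and an empty auxiliary runqueue is represented by `None`.

```python
def choose_runqueue(p_reg, p_aux, threshold):
    """Decide which runqueue supplies the next thread.

    p_reg: highest priority in the regular O(1) runqueue
    p_aux: highest priority in the auxiliary runqueue, or None if it is empty
    Assumption: larger numbers mean higher priority.
    """
    if p_aux is None:
        return "regular"  # no candidate in the auxiliary runqueue
    if p_reg - p_aux < threshold:
        return "aux"      # priorities are close enough: prefer the auxiliary runqueue
    return "regular"


close_gap = choose_runqueue(p_reg=10, p_aux=8, threshold=10)
strict_gap = choose_runqueue(p_reg=10, p_aux=8, threshold=1)
```

With a gap of 2 between the priorities, a threshold of 10 selects the auxiliary runqueue while a threshold of 1 falls back to the regular one, so a larger threshold lets the scheduler pick from the auxiliary runqueue more often.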

12
Context Aware Scheduler
[Figure: threads A, C, D, B, E under the Linux O(1) scheduler (context switches between processes: 3 times) and under the CA scheduler (context switches between processes: 1 time)]
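
The counts on this slide can be reproduced with a short sketch. The transcript does not say which threads belong to which process, so the grouping in `process_of` and the aggregated order `ca_order` below are assumptions chosen only so that the numbers match the slide; `process_switches` is a hypothetical helper.

```python
def process_switches(order, process_of):
    """Count adjacent pairs in `order` that belong to different processes."""
    return sum(
        1
        for prev, nxt in zip(order, order[1:])
        if process_of[prev] != process_of[nxt]
    )


# Assumed grouping (not given in the transcript): A and B are siblings,
# and C, D, E are siblings in a second process.
process_of = {"A": "P1", "B": "P1", "C": "P2", "D": "P2", "E": "P2"}

o1_order = ["A", "C", "D", "B", "E"]  # order shown for the O(1) scheduler
ca_order = ["A", "B", "C", "D", "E"]  # assumed order once siblings are aggregated

o1_count = process_switches(o1_order, process_of)
ca_count = process_switches(ca_order, process_of)
```

Under this grouping the O(1) order crosses a process boundary three times and the aggregated order crosses it once.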
13
Fairness
  • The O(1) scheduler maintains fairness through
    epochs: cycles of the active queue and the
    expired queue
  • The CA scheduler also follows epochs, guaranteeing
    the same level of fairness as the O(1)
    scheduler
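
A simplified sketch of the epoch mechanism, assuming every thread gets exactly one time slice per epoch (real time slices depend on priority). `run_epochs` and its `pick` parameter are hypothetical names; `pick` stands in for whatever policy chooses the next thread from the active queue.

```python
def run_epochs(threads, epochs, pick=None):
    """Return the run order of each epoch as a list of lists.

    A thread that has used its time slice moves from the active queue to
    the expired queue. When the active queue is empty, the epoch ends and
    the two queues swap roles.
    """
    active, expired = list(threads), []
    trace = []
    for _ in range(epochs):
        epoch = []
        while active:
            t = pick(active) if pick else active[0]
            active.remove(t)
            epoch.append(t)     # t runs for its time slice
            expired.append(t)   # then waits for the next epoch
        trace.append(epoch)
        active, expired = expired, active  # epoch boundary: swap the queues
    return trace


fifo_trace = run_epochs(["A", "C", "B"], epochs=2)
reordered_trace = run_epochs(["A", "C", "B"], epochs=2, pick=min)
```

Changing `pick` changes the order inside an epoch, but every thread still runs once per epoch, which is the sense in which a scheduler that follows epochs keeps the same level of fairness.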

14
Contents
  • Introduction
  • Overhead of Context Switch
  • Context Aware Scheduler
  • Benchmarks and Measurement Environment
  • Result
  • Related Works
  • Conclusion

15
Benchmarks
  • Java:
  • Volano Benchmark (Volano)
  • lusearch program in the DaCapo benchmark suite
    (DaCapo)
  • C:
  • Chat benchmark (Chat)
  • memory program in the SysBench benchmark suite
    (SysBench)

Information on Each Benchmark Application
16
Measurement Environment
  • Hardware
  • Sun's J2SE 5.0
  • Threshold of the context aware scheduler: 1 and 10
  • Perfctr to count the TLB misses
  • GNU's time command to measure the total system
    performance

17
Contents
  • Introduction
  • Overhead of Context Switch
  • Context Aware Scheduler
  • Benchmarks and Measurement Environment
  • Result
  • Related Works
  • Conclusion

18
Effect on TLB
Results of TLB misses (million times)
  • The CA scheduler significantly reduces TLB misses
  • A bigger threshold is more effective
  • Priorities change frequently under dynamic
    priority, especially in DaCapo and Volano

19
Effect on System Performance
Results of the Counters in Each
Application (seconds)
Results by time command (seconds)
  • The CA scheduler:
  • enhances the throughput of every application
  • reduces the total elapsed time by 43

20
Contents
  • Introduction
  • Overhead of Context Switch
  • Context Aware Scheduler
  • Benchmarks and Measurement Environment
  • Result
  • Related Works
  • Conclusion

21
H. L. Sujay Parekh, et al., Thread Sensitive
Scheduling for SMT Processors (2000)
  • Parekh's scheduler:
  • tries executing groups of threads in parallel
    and samples information such as IPC, TLB misses,
    and L2 cache misses
  • schedules based on the sampled information
[Figure: alternating Sampling Phase and Scheduling Phase]
22
Pranay Koka, et al., Opportunities for Cache
Friendly Process (2005)
  • Koka's scheduler:
  • traces the execution of each thread
  • focuses on the memory space shared between
    threads
  • schedules based on the information above

[Figure: alternating Tracing Phase and Scheduling Phase]
23
Conclusion
  • Conclusion:
  • The CA scheduler is effective in reducing TLB misses
  • The CA scheduler enhances the throughput of every
    application
  • Future Work:
  • Evaluation on other platforms
  • Investigation of fairness within an epoch
  • Comparison with the Completely Fair Scheduler (Linux
    2.6.23)

24
Widespread Multithreading
  • Multithreading hides the latency of disk I/O and
    network access
  • Threads in many languages, such as Java, Perl, and
    Python, correspond to OS threads

[Figure: ThreadA and ThreadB; ThreadB waits for the disk]
More context switches happen today, so the process
scheduler in the OS is more responsible for
the system performance
25
Context Aware (CA) scheduler
Our CA scheduler aggregates sibling threads
[Figure: threads A, C, D, B, E under the Linux O(1) scheduler (context switches between processes: 3 times) and under the CA scheduler (context switches between processes: 1 time)]
26
Results of Context Switch
(microseconds)
[Figure: working sets of Process A, Process B, and Process C relative to the cache; L2 cache size: 2MB; size marks at 1MB and 2MB]