Architectural Features of Transactional Memory Designs for an Operating System

Chris Rossbach, Hany Ramadan, Don Porter. Advanced Computer Architecture. Fall 2006, Prof. Burger ... Current Transactional Memory proposals make architectural ...

1
Architectural Features of Transactional Memory
Designs for an Operating System
  • Chris Rossbach, Hany Ramadan, Don Porter
  • Advanced Computer Architecture
  • Fall 2006, Prof. Burger

2
Motivation
  • What would a realistic HTM system actually
    support? (primitives/design choices)
  • Current Transactional Memory proposals make
    architectural design choices with inadequate
    information
      • shared counter, linked list benchmarks
      • focus on user mode avoids OS issues

3
HTM in an OS: are you nuts?
  • Large concurrent program with complex data access
    patterns
  • Complex code: a simpler programming model helps
  • Many apps spend a lot of time in the kernel
  • Diverse synchronization primitives
      • spinlocks, semaphores, per-CPU variables, RCU,
        seqlocks, completions, mutexes

4
Our HTM System
  • Basic primitives
      • xbegin, xend
  • OS-specific primitives
      • xpush, xpop
      • stack management: interrupts on x86 re-use the stack
  • Configurable hardware parameters
      • Conflict detection granularity
      • Commit and abort penalties
      • Overflow costs
  • Configurable contention management
      • Conflict resolution policies: which tx restarts?
      • Backoff policies: how long to wait before restart?
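
The backoff question on this slide (how long a restarted transaction waits) can be sketched in a few lines of Python. This is only an illustration: the function names, the 100-cycle base delay, and the cap are assumptions, since the slides do not give the formulas the simulator used.

```python
import random

def linear_backoff(restarts, base_cycles=100):
    """Wait grows in proportion to the number of restarts so far."""
    if restarts <= 0:
        return 0
    return base_cycles * restarts

def exponential_backoff(restarts, base_cycles=100, cap_cycles=100_000):
    """Wait doubles with each restart, up to an (assumed) cap."""
    if restarts <= 0:
        return 0
    return min(base_cycles * 2 ** (restarts - 1), cap_cycles)

def randomized(delay_cycles, rng=random):
    """Pick a uniform delay in [0, delay_cycles] so contenders desynchronize."""
    return rng.randint(0, delay_cycles)

if __name__ == "__main__":
    # Compare the two policies over the first few restarts.
    for n in range(1, 6):
        print(n, linear_backoff(n), exponential_backoff(n))
```

With these assumed parameters, the third restart waits 300 cycles under linear backoff and 400 under exponential backoff, and the gap widens quickly after that.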

5
An Issue Unique to an OS: Using Transactions in Interrupt Handlers
  • Example code
      • system_call(): XBEGIN, modify 0x10, XEND
      • intr_handler(): XPUSH, XBEGIN, modify 0x30, XEND, XPOP
  • Design options when an interrupt arrives while TX 1 (holding 0x10) is active
      • No tx in interrupts
      • Interrupts abort active tx
      • Nest the transactions (TX 1 covers 0x10, 0x30)
      • Multiple active transactions (TX 1 covers 0x10, TX 2 covers 0x30)
  • [Diagram: memory locations 0x10-0x40, with an interrupt arriving during TX 1]
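
A toy Python model of the "multiple active transactions" option may help make the XPUSH/XPOP sequence concrete. It assumes that xpush suspends the interrupted transaction and xpop resumes it, and that transactional writes are buffered until xend. The class and method names are hypothetical, and conflict detection and aborts are left out.

```python
class ToyCpu:
    """Hypothetical single-CPU model of xbegin/xend/xpush/xpop."""

    def __init__(self):
        self.memory = {}     # committed state: address -> value
        self.current = None  # write buffer of the running tx, or None
        self.saved = []      # transactions suspended by xpush

    def xbegin(self):
        assert self.current is None, "toy model: no nesting"
        self.current = {}

    def write(self, addr, value):
        if self.current is None:
            self.memory[addr] = value   # non-transactional store
        else:
            self.current[addr] = value  # buffered until xend

    def xend(self):
        assert self.current is not None, "xend without xbegin"
        self.memory.update(self.current)  # commit buffered writes
        self.current = None

    def xpush(self):
        self.saved.append(self.current)   # suspend interrupted tx (may be None)
        self.current = None

    def xpop(self):
        self.current = self.saved.pop()   # resume the suspended tx


def intr_handler(cpu):
    cpu.xpush()
    cpu.xbegin()
    cpu.write(0x30, "intr")
    cpu.xend()
    cpu.xpop()


def system_call(cpu, interrupt=False):
    cpu.xbegin()
    cpu.write(0x10, "syscall")
    if interrupt:
        intr_handler(cpu)  # interrupt arrives mid-transaction
    cpu.xend()
```

In this sketch the handler's write to 0x30 commits on its own while the system call's write to 0x10 stays buffered, so the two transactions remain independent.
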
6
Converting Linux to TxLinux
  • TxLinux is based on kernel 2.6.16.1
  • Converted core primitives to use transactions
      • spin-locks, RCU primitives, r/w locks
      • critical sections become transactions
  • Converted high-traffic subsystems
      • memory allocators, FS directory cache,
        address-to-page mapping data structures,
        memory mapping of files into address spaces,
        IP routing, and socket locking
  • Modified interrupt-handling code to use
    primitives in our HTM model (xpush, xpop)

7
HTM Implementation
  • Implemented HTM model as x86 extensions
  • Simulation environment
      • Simics 3.0.17 machine simulator
      • transactional L1 cache (variable, 4KB-32KB)
      • 4MB L2, 1GB RAM
      • 1 cycle/instruction, 16 cycles/L1 miss, 200
        cycles/L2 miss
      • 4 and 8 processors
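
The timing parameters above amount to a simple additive cost model, sketched here in Python. The function name and the way misses are counted are assumptions: the sketch charges 16 cycles for an L1 miss served by L2 and 200 cycles for an access that misses L2, and the slides do not say whether the real simulator stacks the two penalties.

```python
CYCLES_PER_INSTRUCTION = 1
L1_MISS_CYCLES = 16    # L1 miss that hits in L2
L2_MISS_CYCLES = 200   # access that misses L2 and goes to RAM

def estimated_cycles(instructions, l1_misses, l2_misses):
    """Estimate run time in cycles from event counts (hypothetical helper).

    l1_misses counts L1 misses served by L2; l2_misses counts accesses
    that miss L2. The two counts are assumed to be disjoint.
    """
    return (instructions * CYCLES_PER_INSTRUCTION
            + l1_misses * L1_MISS_CYCLES
            + l2_misses * L2_MISS_CYCLES)

if __name__ == "__main__":
    # 1000 instructions, 10 L1 misses, 2 L2 misses
    print(estimated_cycles(1000, 10, 2))
```

For example, 1000 instructions with 10 L1 misses and 2 L2 misses come to 1000 + 160 + 400 = 1560 cycles under these assumptions.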

8
Experimental Setup
  • Benchmarks
      • micro: kernalloc, Counter, directory cache
        punisher
      • macro: pmake, netcat, MAB, configure, find
  • Measurements
      • Execution time
      • Transaction statistics: created/restarted/overflowed,
        working sets, footprint
      • Cache statistics (e.g. miss rate)
  • Variables
      • Contention management (conflict/backoff policies)
      • Transactional cache size
      • Commit, abort, overflow penalties
      • Conflict granularity (byte vs. word vs. cache
        line)
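
The conflict granularity variable can be illustrated with a small Python sketch: two transactions conflict when any of their accesses fall in the same tracked block, so coarser blocks report conflicts that finer ones would not. The 4-byte word and 64-byte cache line sizes, and all names here, are assumptions made for illustration only.

```python
# Assumed block sizes in bytes; the slides do not state word or line size.
GRANULARITY_BYTES = {"byte": 1, "word": 4, "cache_line": 64}

def block_of(addr, granularity):
    """Index of the tracked block that contains addr."""
    return addr // GRANULARITY_BYTES[granularity]

def conflicts(writes_a, accesses_b, granularity):
    """True if tx B touches any block that tx A wrote."""
    blocks_a = {block_of(a, granularity) for a in writes_a}
    return any(block_of(b, granularity) in blocks_a for b in accesses_b)

if __name__ == "__main__":
    # Tx A writes 0x1000, tx B writes 0x1008: distinct data, same cache line.
    for g in GRANULARITY_BYTES:
        print(g, conflicts({0x1000}, {0x1008}, g))
```

Here the two writes conflict only at cache-line granularity, which is a false conflict in the sense that the transactions never touch the same data.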

9
TxLinux Results (4 processors)
Transactions created: 105,972; 425,888; 475,860; 1,810,602; 1,408,610; 243,934
  • Performance change minimal, lots of transactions
  • Unique transaction restarts were < 0.07
  • Data cache miss rates do not change appreciably

10
Contention Management Matters!
[Chart: linear backoff policy, 4 processors]

11
Conclusions
  • TxLinux is cooler than, and has comparable
    performance to Linux
  • Cache line granularity is good enough
  • 16KB Transactional cache covers the vast majority
    of transactions
  • Best contention management policy is
    workload-dependent
  • Exponential backoff is too conservative

12
Backup Slides

13
Contention Management: Restart Rates

14
Conflict Granularity and Backoff Policy

15
Stack Management Issue
  • Treating the Stack as a shared resource
  • Checkpoint
  • Partition

16
Txl Memory Allocator Investigation
  • Examine Tx complexity/performance trade-off
  • The slab is the default kernel memory allocator
      • Highly tuned for performance
      • Avoids contention/locks, uses per-CPU structures
      • About 3,880 lines of code
  • The slob is a drop-in replacement
      • Designed for minimal bookkeeping memory overhead
      • Uses two coarse-grained locks (386 lines)
  • The slob-opt is slob with modifications
      • Removed obvious transaction bottlenecks
      • Only a couple of dozen lines of code changed

17
Txl Memory Allocator Results (4 proc)
Allocator       Metric               Kernalloc  Pmake  MAB    configure  Find
slab            Execution time (s)   1.4        13.9   8.0    14.1       1.8
                Unique restarts      0          0.04   0.07   0.04       0
slob            Execution time (s)   -          14.3   21.3   16.3       1.8
                Unique restarts      -          1.78   19.72  5.93       0.71
slob-optimized  Execution time (s)   16.7       14.1   12.7   14.9       1.8
                Unique restarts      18.17      0.45   8.48   1.42       0.12

18
Transactional Memory Issues
  • Hardware vs. Software
      • Different interfaces
      • strong (HW) vs. weak (SW) atomicity
  • Will transactions make programming easier?
  • Transactions for blocking primitives?
  • Using transactions for security?