Data layouts for object-oriented programs

1
Data layouts for object-oriented programs
  • Martin Hirzel, IBM Research
  • SIGMETRICS 6/16/2007

2
Problem
  • Object-oriented programs put data in objects.
  • Caches and TLBs put data in blocks.
  • Scattering objects over blocks causes cache/TLB
    misses.
  • Misses cost time.

[Figure: objects o1 to o11 scattered over cache lines and TLB pages]
3
Solution
  • Most object-oriented languages use garbage
    collection.
  • Garbage collection can move objects.
  • To avoid misses, move objects to the right
    cache/TLB blocks.
  • Simple, right?

[Figure: objects o1 to o11 scattered over blocks (before) and regrouped into cache/TLB blocks by the garbage collector (after)]
4
Cheney Copying GC
[Figure: from-space holding objects o1 to o10; to-space divided into "copied, scanned" and "copied, not yet scanned" regions]
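
The two to-space regions on this slide can be illustrated with a minimal Python sketch of Cheney-style copying. The heap model (a dict from object name to the names it references) and the function name `cheney_copy` are assumptions made for illustration; a real collector works on raw memory and pointers.

```python
def cheney_copy(heap, roots):
    """Copy everything reachable from roots into a new to-space.

    heap maps an object name to the list of names it references
    (a hypothetical toy model, not a real memory layout).
    Returns (to_space, forwarding).
    """
    to_space = []    # objects in their new order
    forwarding = {}  # old name -> index in to_space

    def forward(name):
        # Copy the object on first visit; later visits reuse the forwarding entry.
        if name not in forwarding:
            forwarding[name] = len(to_space)
            to_space.append(name)
        return forwarding[name]

    for root in roots:
        forward(root)

    # to_space[:scan] is "copied, scanned"; to_space[scan:] is "copied, not yet scanned".
    scan = 0
    while scan < len(to_space):
        for child in heap[to_space[scan]]:
            forward(child)
        scan += 1
    return to_space, forwarding


heap = {
    "o1": ["o2", "o3"],
    "o2": ["o4"],
    "o3": ["o4", "o5"],
    "o4": [],
    "o5": [],
    "o6": ["o1"],  # unreachable from the root, so it is never copied
}
to_space, forwarding = cheney_copy(heap, ["o1"])
print(to_space)
```

The to-space list doubles as the work queue, which is why the resulting order is breadth-first.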
5
BF: Breadth-first layout
[Figure: 15 objects numbered 1 to 15]
Why? Siblings
How? Queue-based traversal
Who? Cheney 1970
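
A minimal sketch of the queue-based traversal, assuming the 15 objects form a complete binary tree in which node n references 2n and 2n+1 (the shape is an assumption; the slide text does not state it). The function name `breadth_first_layout` is hypothetical.

```python
from collections import deque


def breadth_first_layout(children, root):
    """Return objects in the order a queue-based traversal reaches them."""
    order = []
    seen = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        order.append(node)
        for child in children.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return order


# Assumed example: complete binary tree with 15 nodes.
tree = {n: [2 * n, 2 * n + 1] for n in range(1, 8)}
bf_order = breadth_first_layout(tree, 1)
print(bf_order)
```

In this order, siblings such as 4 and 5 land next to each other, while a parent and its children drift apart as the tree gets deeper.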
6
DF: Depth-first layout
[Figure: 15 objects numbered 1 to 15]
Why? Child-parent
How? Stack-based traversal
Who? Fenichel/Yochelson 1969
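
A minimal sketch of the stack-based traversal, again assuming a complete binary tree of 15 nodes in which node n references 2n and 2n+1 (an assumed shape). The function name `depth_first_layout` is hypothetical, and the explicit stack stands in for whatever stack a real collector would use.

```python
def depth_first_layout(children, root):
    """Return objects in the order a stack-based traversal reaches them."""
    order = []
    seen = {root}
    stack = [root]
    while stack:
        node = stack.pop()
        order.append(node)
        # Push children in reverse so the first child is popped first.
        for child in reversed(children.get(node, [])):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return order


# Assumed example: complete binary tree with 15 nodes.
tree = {n: [2 * n, 2 * n + 1] for n in range(1, 8)}
df_order = depth_first_layout(tree, 1)
print(df_order)
```

Here each parent is immediately followed by its first child, which is the child-parent locality the slide refers to.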
7
HI: Hierarchical layout
[Figure: 15 objects numbered 1 to 15]
Why? Both siblings and child-parent
How? Block-bounded breadth first
Who? Moon 1984; Wilson et al. 1991
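
One way to read "block-bounded breadth first" is sketched below: run a breadth-first traversal until a block is full, then start a new block from each object that did not fit. This is an illustrative assumption, not the exact algorithm of Moon or Wilson et al.; the function name, the unit-size objects, and the `block_size` parameter are all hypothetical.

```python
from collections import deque


def hierarchical_layout(children, root, block_size):
    """Group objects into blocks, each filled by a bounded breadth-first walk."""
    blocks = []
    seen = {root}
    pending = deque([root])  # objects that will start a later block
    while pending:
        block = []
        queue = deque([pending.popleft()])
        while queue:
            node = queue.popleft()
            block.append(node)
            for child in children.get(node, []):
                if child in seen:
                    continue
                seen.add(child)
                if len(block) + len(queue) < block_size:
                    queue.append(child)    # still fits in the current block
                else:
                    pending.append(child)  # becomes the root of a later block
        blocks.append(block)
    return blocks


# Assumed example: complete binary tree with 15 nodes, 3 objects per block.
tree = {n: [2 * n, 2 * n + 1] for n in range(1, 8)}
hi_blocks = hierarchical_layout(tree, 1, block_size=3)
print(hi_blocks)
```

With three objects per block, every block holds a parent together with both of its children, so both sibling and child-parent locality are preserved within a block.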
8
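AO: Allocation order layout
[Figure: objects 1 to 9 in allocation order, with 2, 4, 5, and 8 marked "d"; after compaction, the copies 1c, 3c, 6c, 7c, 9c appear in the same order]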
Why? Creation order matches usage
How? Sliding compaction
Who? Lisp 2 collector 1967
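
A minimal sketch of sliding compaction, assuming the objects marked "d" in the figure are dead and that every object occupies one slot. The function name `sliding_compact` is hypothetical, and the pointer-update pass of a real Lisp 2 style collector is omitted because this toy heap stores no references.

```python
def sliding_compact(heap, live):
    """Slide live objects toward the start of the heap, preserving their order.

    heap is a list of object ids in address (allocation) order;
    live is the set of ids that survive collection.
    """
    # Pass 1: compute the new address of every live object.
    new_address = {}
    next_free = 0
    for obj in heap:
        if obj in live:
            new_address[obj] = next_free
            next_free += 1

    # Pass 2: move each live object to its new address.
    compacted = [None] * next_free
    for obj in heap:
        if obj in live:
            compacted[new_address[obj]] = obj
    return compacted, new_address


heap = [1, 2, 3, 4, 5, 6, 7, 8, 9]
live = {1, 3, 6, 7, 9}  # assumed: 2, 4, 5, and 8 are dead
compacted, new_address = sliding_compact(heap, live)
print(compacted)
```

Because objects only slide, never reorder, the surviving objects stay in creation order.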
9
SZ: Size segregation layout
[Figure: objects 1 to 9, with 3, 5, and 8 marked "d"; the copies appear in the order 1c, 6c, 2c, 7c, 9c, 4c]
Why? Efficiently finding allocation holes
How? Segregated free lists
Who? Comfort 1964
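
A toy allocator with one free list per size class may help show why this scheme finds holes quickly and why same-sized objects end up adjacent. The class name `SegregatedAllocator`, the size classes, and the address arithmetic are all assumptions for illustration.

```python
class SegregatedAllocator:
    """Toy allocator with one free list per size class (hypothetical design)."""

    def __init__(self, size_classes, cells_per_class):
        self.free = {}        # size class -> list of free cell addresses
        self.cell_class = {}  # cell address -> size class
        address = 0
        for cls in sorted(size_classes):
            cells = [address + i * cls for i in range(cells_per_class)]
            self.free[cls] = cells
            for cell in cells:
                self.cell_class[cell] = cls
            address += cls * cells_per_class

    def allocate(self, size):
        # Take the first free cell from the smallest class that fits.
        for cls in sorted(self.free):
            if size <= cls and self.free[cls]:
                return self.free[cls].pop(0)
        raise MemoryError("no free cell large enough")

    def release(self, address):
        # Put the cell at the front of its class's list so it is reused first.
        self.free[self.cell_class[address]].insert(0, address)


allocator = SegregatedAllocator(size_classes=[16, 32], cells_per_class=4)
small_a = allocator.allocate(10)
large_a = allocator.allocate(20)
small_b = allocator.allocate(16)
allocator.release(small_a)
small_c = allocator.allocate(8)  # reuses the hole left by small_a
print(small_a, large_a, small_b, small_c)
```

Finding a hole is a single list lookup per size class, at the price of placing objects by size rather than by how they are used together.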
10
TH: Thread local layout
[Figure: objects 1 to 12 grouped under the three-bit labels 001, 010, 100, 101, 110, and 111]
Why? Disjoint working sets
How? Reachability from call stacks
Who? Steensgaard 2000
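
One possible reading of "reachability from call stacks" is sketched below: give each object a bit mask of the threads whose stacks can reach it, then group objects by mask. The heap model, the function names, and the choice to sort by mask are assumptions for illustration and are not taken from Steensgaard's work.

```python
def thread_masks(heap, stack_roots):
    """Map each object to a bit mask of the threads that can reach it.

    heap maps an object name to the names it references;
    stack_roots[i] lists the objects referenced from thread i's call stack.
    """
    masks = {obj: 0 for obj in heap}
    for thread, roots in enumerate(stack_roots):
        bit = 1 << thread
        work = list(roots)
        while work:
            obj = work.pop()
            if masks[obj] & bit:
                continue  # already marked for this thread
            masks[obj] |= bit
            work.extend(heap[obj])
    return masks


def thread_local_layout(heap, stack_roots):
    """Order reachable objects so that equal masks are adjacent."""
    masks = thread_masks(heap, stack_roots)
    reachable = [obj for obj in heap if masks[obj]]
    return sorted(reachable, key=lambda obj: (masks[obj], obj))


heap = {"a": ["c"], "b": ["c"], "c": [], "d": [], "e": []}
stack_roots = [["a"], ["b", "d"]]  # thread 0 and thread 1
masks = thread_masks(heap, stack_roots)
layout = thread_local_layout(heap, stack_roots)
print(masks, layout)
```

Objects reachable from a single thread (masks with one bit set) form that thread's private working set; shared objects such as "c" get a multi-bit mask and are placed separately.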
11
Problem
AO: Allocation order
AS: Allocation site
BF: Breadth-first
DF: Depth-first
HI: Hierarchical
PO: Popularity
RA: Random
SZ: Size
TH: Thread
TY: Type
  • Which layout is best, and which is worst?
  • How much does it matter in practice?
  • How similar are the layouts?
  • How much does it matter in the limit?

12
Solutions?
  • Appeal to intuition: they can't all be right!
  • Formal: Petrank/Rawitz showed hardness
  • Simulation: who would believe those numbers?
  • Brute-force: do you have a few person-years to spare?

13
Avoiding Heisenberg Effects
[Figure: timeline alternating between garbage collector and application phases]
  • Exclude garbage collector performance
  • Measure real effect of layout on application
  • Implement the layouts with simple algorithms

14
Object sorting garbage collection
populate → sort → copy → fixup
Sort keys: AO, AS, PO, RA, SZ, TH, TY
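
The four phases can be sketched in a few lines of Python. The heap model (a dict of records with `size` and `refs` fields), the function name `sorting_gc`, and the example SZ key are assumptions for illustration; the slide names only the phases and the sort keys.

```python
def sorting_gc(heap, roots, sort_key):
    """Lay out live objects in the order given by sort_key.

    heap maps a name to {"size": int, "refs": [names]} (a hypothetical model).
    Returns the new to-space as a list of records whose refs are new indices.
    """
    # populate: gather every object reachable from the roots
    live = []
    seen = set(roots)
    work = list(roots)
    while work:
        name = work.pop()
        live.append(name)
        for ref in heap[name]["refs"]:
            if ref not in seen:
                seen.add(ref)
                work.append(ref)

    # sort: order the live objects by the chosen layout key
    live.sort(key=lambda name: sort_key(name, heap[name]))

    # copy: place objects in sorted order and remember where each one went
    new_index = {}
    to_space = []
    for name in live:
        new_index[name] = len(to_space)
        record = heap[name]
        to_space.append({"name": name, "size": record["size"],
                         "refs": list(record["refs"])})

    # fixup: rewrite every reference to point at the new location
    for record in to_space:
        record["refs"] = [new_index[ref] for ref in record["refs"]]
    return to_space


heap = {
    "a": {"size": 24, "refs": ["b", "c"]},
    "b": {"size": 8, "refs": []},
    "c": {"size": 16, "refs": ["b"]},
    "d": {"size": 8, "refs": ["a"]},  # unreachable, so not copied
}
# Example key for the SZ layout: sort by size, break ties by name.
to_space = sorting_gc(heap, ["a"], lambda name, record: (record["size"], name))
print([record["name"] for record in to_space])
```

Swapping the key function is all it takes to switch layouts, which keeps the collector simple and makes the layouts directly comparable.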
15
32 Benchmarks
16
Mutator time overhead
  • All layouts sometimes best, sometimes worst
  • RA is worst, as expected
  • Low averages, but beware of worst cases!
  • AO has best average (but not by much)
  • DF has most best cases
  • TH has most benign worst case
  • Performance impact increases with SMP
  • Conclusions for AO, DF, TH, RA still hold

17
Mutator miss rate increases
  • Layouts have large impact on miss rates
  • Miss rates confirm overhead conclusions
  • Miss rates cannot replace time measurements

18
Layout similarities and differences
  • AO, PO, and TH are quite similar
  • As expected, RA is far out

19
Estimated limit mutator time
  • Benchmarks: largest avg. overhead
  • Baseline: best observed
  • Linear regression

Total time (as measured) = limit time (no misses) + cache misses × cache latency + TLB misses × TLB latency
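
The decomposition on this slide (measured time as a limit time plus miss counts weighted by latencies) can be fitted by ordinary least squares. The sketch below is an assumption about how such a regression might be set up, not the presenter's procedure; the function name `fit_limit_time` and the sample numbers are hypothetical and use arbitrary units.

```python
def fit_limit_time(samples):
    """Fit total = limit + cache_misses * cache_latency + tlb_misses * tlb_latency.

    samples is a list of (cache_misses, tlb_misses, total_time) tuples.
    Returns (limit_time, cache_latency, tlb_latency).
    """
    rows = [(1.0, cache, tlb) for cache, tlb, _ in samples]
    totals = [total for _, _, total in samples]

    # Normal equations: (X^T X) x = X^T y
    a = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    b = [sum(r[i] * y for r, y in zip(rows, totals)) for i in range(3)]

    # Gaussian elimination with partial pivoting
    for col in range(3):
        pivot = max(range(col, 3), key=lambda row: abs(a[row][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, 3):
            factor = a[row][col] / a[col][col]
            for k in range(col, 3):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]

    # Back substitution
    x = [0.0, 0.0, 0.0]
    for row in (2, 1, 0):
        rest = sum(a[row][k] * x[k] for k in range(row + 1, 3))
        x[row] = (b[row] - rest) / a[row][row]
    return tuple(x)


# Hypothetical measurements, one per layout: (cache misses, TLB misses, total time)
samples = [(4, 1, 14.0), (10, 2, 19.0), (6, 5, 23.0), (12, 3, 22.0)]
limit_time, cache_latency, tlb_latency = fit_limit_time(samples)
print(limit_time, cache_latency, tlb_latency)
```

The fitted intercept is the estimated time with no misses at all, which bounds how much any layout could help.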
20
Related work
21
Conclusions
  • Layouts matter little on average, but beware of the worst cases!
  • Layout importance increases with SMP
  • All layouts are sometimes best, sometimes worst
  • AO has best average
  • DF has most best-cases
  • TH has best worst-cases