Title: A Few Subtle Insights About UCP
1 A Few Subtle Insights About UCP
Moinuddin K. Qureshi
Work on UCP done while at
2 First Things First
- I thank Xing and Rajeev for
  - Validating that UCP (based on misses) works
  - Re-validating that UCP (based on IPC) is slightly better than one based on misses [1-3]
- As mentioned, this is not the 1st (or 2nd, or 3rd, or 4th) paper to provide this insight
3 Critique 1: UCP(MPKI) vs. UCP
Consider two apps, A and B, with identical miss rate curves.
UCP(MPKI) gives 2 ways to both A and B.
- A and B both access the cache once per 100 instructions; cache hit = 1 cycle, memory = 100 cycles
- A has 99 integer ops (1 cycle each): CPI_A = (99 + 1 × MissRatePerc) / 100
- B has 99 FP ops (10 cycles each): CPI_B = (990 + 1 × MissRatePerc) / 100
UCP(MPKC) gives 4 ways to A (IPC_best, WS_best).
UCP(MICRO06) optimizes performance more than UCP(MPKI).
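
The example can be checked numerically. The sketch below plugs a hypothetical miss rate curve into the two CPI formulas from this slide and scores every way split. The `MISS_RATE_PERC` values are invented for illustration (the slide gives no curve), and the code assumes a 4-way cache and that WS means weighted speedup; neither is stated explicitly on the slide.

```python
# Hypothetical miss-rate curve: percent of cache accesses that miss when an
# application owns 0..4 ways. Both apps share it (identical miss curves).
# These numbers are an assumption for illustration, not data from the talk.
MISS_RATE_PERC = [100, 70, 50, 40, 35]
TOTAL_WAYS = 4  # assumed cache associativity


def cpi_a(ways):
    # 99 integer ops at 1 cycle each, plus one memory access per 100 instructions.
    return (99 + 1 * MISS_RATE_PERC[ways]) / 100


def cpi_b(ways):
    # 99 FP ops at 10 cycles each, plus one memory access per 100 instructions.
    return (990 + 1 * MISS_RATE_PERC[ways]) / 100


def evaluate(ways_a):
    """Score one static split: ways_a ways to A, the rest to B."""
    ways_b = TOTAL_WAYS - ways_a
    ipc_a = 1 / cpi_a(ways_a)
    ipc_b = 1 / cpi_b(ways_b)
    return {
        "ways": (ways_a, ways_b),
        # Both apps access the cache equally often per instruction, so the
        # sum of miss rates is proportional to the total MPKI.
        "total_miss_perc": MISS_RATE_PERC[ways_a] + MISS_RATE_PERC[ways_b],
        "throughput": ipc_a + ipc_b,
        # Weighted speedup: each IPC relative to running alone with all ways.
        "weighted_speedup": ipc_a * cpi_a(TOTAL_WAYS) + ipc_b * cpi_b(TOTAL_WAYS),
    }


results = [evaluate(a) for a in range(TOTAL_WAYS + 1)]
best_by_misses = min(results, key=lambda r: r["total_miss_perc"])["ways"]
best_by_throughput = max(results, key=lambda r: r["throughput"])["ways"]
best_by_ws = max(results, key=lambda r: r["weighted_speedup"])["ways"]

print("fewest misses per instruction:", best_by_misses)
print("best throughput:", best_by_throughput)
print("best weighted speedup:", best_by_ws)
```

With this particular curve, minimizing misses per instruction picks the (2, 2) split, while giving all 4 ways to A maximizes both throughput and weighted speedup, because B's CPI is dominated by its FP ops and barely reacts to cache misses.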
4 Critique 2: Dynamic Can Beat Static Optima
5 Critique 3: Not All Misses Are Created Equal
Problem with Linear CPI Model of Xing
[Figure: plot with axes labeled CPI and MPKI]
6 UCP: The Last 4.5 Years
Things I would have liked to see in the literature:
1. Non-Integer Way Partition
2. Utility-Based Cache Insertion
3. Prefetch-Aware Cache Partition
7 Extension 1: Probabilistic Way Partition
Common criticism of way partitioning: we can only allocate an integer number of ways.
A simple way to avoid this is Probabilistic Way Partition.
- Say you want to allocate 3.5 ways to application A
- Then on a cache miss, consult a random number generator
- If RandVal > 50% of RandMax, then A gets 4 ways, else 3 ways
- On average, A will end up getting 3.5 ways in the cache
Can go finer: say we want to allocate 4.125 ways to B.
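
A minimal software sketch of the idea, assuming the fractional target is split into its integer part and a probability of rounding up. The function name `ways_for_this_miss` and the use of Python's `random.Random` in place of a hardware random number generator are illustrative assumptions, not part of the original proposal.

```python
import math
import random


def ways_for_this_miss(target_ways, rng):
    """Pick the integer way quota to enforce on this cache miss.

    With probability equal to the fractional part of target_ways the quota is
    rounded up, otherwise it is rounded down, so the long-run average quota
    equals target_ways. (Hypothetical helper, not from the slides.)
    """
    base = math.floor(target_ways)
    fraction = target_ways - base
    # rng.random() is uniform in [0, 1), standing in for RandVal / RandMax.
    return base + 1 if rng.random() < fraction else base


def average_quota(target_ways, misses=100_000, seed=0):
    """Average quota observed over many simulated misses."""
    rng = random.Random(seed)
    return sum(ways_for_this_miss(target_ways, rng) for _ in range(misses)) / misses


print(average_quota(3.5))    # close to 3.5: quota is 4 half of the time, else 3
print(average_quota(4.125))  # close to 4.125: quota is 5 one time in eight, else 4
```

The same rule covers the finer-grained case: a target of 4.125 ways means rounding up to 5 ways with probability 1/8 and using 4 ways otherwise.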
8 Extension 2: Utility-Based Cache Insertion
One can achieve the effect of partitioning by intelligent insertion.
- In a 16-way cache, a given application A can insert at 16 locations
- If N applications share the cache, the decision space is 16^N
An efficient hardware scheme that obtains the best decision in this decision space will outperform both UCP and TADIP.
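
To get a feel for how quickly this decision space grows, the short sketch below counts the joint insertion choices when each application independently picks one of the 16 insertion positions. The function name `decision_space` is a hypothetical helper used only for this illustration.

```python
WAYS = 16  # insertion positions available to each application in a 16-way cache


def decision_space(num_apps, ways=WAYS):
    """Number of joint insertion choices when each app picks one position."""
    return ways ** num_apps


for n in (1, 2, 4, 8):
    print(n, "apps:", decision_space(n), "combinations")
```

Already at 4 applications there are 65,536 combinations, which is why an exhaustive search is unattractive and an efficient hardware scheme is needed.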
9 Extension 3: Prefetch-Aware Partitioning
How does one do partitioning under prefetching?
- For applications whose dataset is prefetchable, we may not want to give cache space (even if they have high utility)
- In fact, sometimes it's a win-win to give more cache to irregular apps, as it leaves more bandwidth available for prefetching
What is the right way to extend UCP to prefetches?
10 Summary
- UCP: partitioning based on misses works (simple)
- Several works have shown that UCP based on IPC works slightly better
- There are several extensions of UCP still unexplored; let me know if you are interested in exploring
Questions/comments: moinqureshi_at_gmail.com