CS 7810 Lecture 5

Provided by: RajeevBala4

Learn more at: https://users.cs.utah.edu

Transcript and Presenter's Notes

1
CS 7810 Lecture 5
The Optimal Logic Depth Per Pipeline Stage is 6 to 8 FO4 Inverter Delays
M.S. Hrishikesh, N.P. Jouppi, K.I. Farkas, D. Burger, S.W. Keckler, P. Shivakumar
UT-Austin and Compaq, ISCA '02
2
Improvements in Clock Speed
[Figure: clock speed versus process technology, with labeled points at 33MHz, 66MHz, 100MHz, 200MHz, 450MHz, 1GHz, and 2GHz, spanning 1000nm to 130nm]
3
Definitions

Clock period: φ = φ_logic + φ_latch + φ_skew + φ_jitter
φ_logic: the actual work being done in one stage
φ_latch: data has to be saved in latch registers at the end of each pipeline stage (1 FO4 = 36ps at 100nm)
φ_skew: two parts of the circuit may receive their clocks through different paths, resulting in a slight phase difference (0.3 FO4)
φ_jitter: unpredictable variations (0.5 FO4)
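
As a rough illustration of the definitions above, the sketch below adds up a clock period in FO4 and converts it to a frequency. It assumes the overhead values quoted on this slide (1 FO4 latch, 0.3 FO4 skew, 0.5 FO4 jitter) and 36ps per FO4 at 100nm. The function and constant names are invented for this example and do not come from the paper.

```python
FO4_PS_AT_100NM = 36.0  # one FO4 inverter delay in picoseconds (value from the slide)


def clock_period_fo4(logic_fo4, latch_fo4=1.0, skew_fo4=0.3, jitter_fo4=0.5):
    """Clock period in FO4: logic plus latch, skew, and jitter overheads."""
    return logic_fo4 + latch_fo4 + skew_fo4 + jitter_fo4


def clock_frequency_ghz(period_fo4, fo4_ps=FO4_PS_AT_100NM):
    """Convert a clock period in FO4 to a frequency in GHz."""
    period_ps = period_fo4 * fo4_ps
    return 1000.0 / period_ps  # 1/ps is THz, so 1000/ps is GHz


if __name__ == "__main__":
    for logic in (16, 8, 6, 4):
        period = clock_period_fo4(logic)
        print(f"{logic:>2} FO4 logic: period {period:.1f} FO4, "
              f"{clock_frequency_ghz(period):.2f} GHz")
```

Passing a total period directly, for example `clock_frequency_ghz(18)`, gives the frequency for a clock period that already includes overhead.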

4
Processor Model

An Alpha-like processor with latencies updated
for 100nm
Simplification: the study is insensitive to the technology generation
Note that all structures are perfectly pipelined: this is a Limit of Pipelining study

5
Effect of Deep Pipelining
Add = 16 FO4, Mpred = 128 FO4, Load from mem = 400 FO4, Mult = 160 FO4, Overhead = 2 FO4

Clock period   add            mpred          load   mult
18 FO4         16+2 = 18      8x18 = 144     400    180
10 FO4         8+2+8+2 = 20   16x10 = 160    400    200

Clock period   FO4s                     Cycles   Clock speed
18 FO4         18+144+400+180 = 742     42       1.54GHz
10 FO4         20+160+400+200 = 780     78       2.78GHz
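
The cycle counts on this slide can be reproduced with a small sketch. It assumes that the add, mispredict, and multiply latencies are split into stages of fixed logic depth, and that the 400 FO4 memory load is a fixed absolute delay rounded up to whole cycles. That second assumption is inferred from the totals of 42 and 78 cycles rather than stated on the slide. All names are hypothetical.

```python
import math

OVERHEAD_FO4 = 2  # per-stage overhead used on this slide

# Logic latency of each pipelined operation in FO4 (values from the slide).
PIPELINED_OPS = {"add": 16, "mpred": 128, "mult": 160}
MEMORY_LOAD_FO4 = 400  # assumed fixed in absolute time, not re-pipelined


def cycles_for(logic_per_stage_fo4):
    """Return (clock period in FO4, cycles per operation) for a given logic depth."""
    period = logic_per_stage_fo4 + OVERHEAD_FO4
    cycles = {op: math.ceil(latency / logic_per_stage_fo4)
              for op, latency in PIPELINED_OPS.items()}
    # The load takes 400 FO4 regardless of the clock, rounded up to whole cycles.
    cycles["load"] = math.ceil(MEMORY_LOAD_FO4 / period)
    return period, cycles


if __name__ == "__main__":
    for logic in (16, 8):
        period, cycles = cycles_for(logic)
        print(f"{period} FO4 clock: {cycles}, total {sum(cycles.values())} cycles")
```

Halving the logic depth nearly doubles the cycle count of the sequence, while its total delay in FO4 grows only slightly.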
6
Yet, Performance Increases

Deepening a car assembly line → more cars being made at the same time → a new car rolls out at twice the freq
Independent instrs benefit from deep pipelining
Dependent instrs are slowed down
The latter dominates when pipelining overhead is
a large fraction of clock period
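
A toy model can make the contrast between independent and dependent instructions concrete. This is a minimal sketch, assuming a single perfectly pipelined functional unit, a 16 FO4 operation, and 2 FO4 of overhead per stage. The model and its function names are illustrative and are not the simulator used in the paper.

```python
import math


def stage_count(op_logic_fo4, logic_per_stage_fo4):
    """Number of pipeline stages needed for one operation."""
    return math.ceil(op_logic_fo4 / logic_per_stage_fo4)


def independent_time_fo4(n, op_logic_fo4, logic_per_stage_fo4, overhead_fo4):
    """n independent ops: fill the pipeline once, then finish one op per cycle."""
    period = logic_per_stage_fo4 + overhead_fo4
    stages = stage_count(op_logic_fo4, logic_per_stage_fo4)
    return (stages + n - 1) * period


def dependent_time_fo4(n, op_logic_fo4, logic_per_stage_fo4, overhead_fo4):
    """n ops in a dependence chain: each waits for the previous result."""
    period = logic_per_stage_fo4 + overhead_fo4
    stages = stage_count(op_logic_fo4, logic_per_stage_fo4)
    return n * stages * period


if __name__ == "__main__":
    for logic in (16, 8):
        print(logic,
              independent_time_fo4(100, 16, logic, 2),
              dependent_time_fo4(100, 16, logic, 2))
```

With these assumed numbers, going from 16 to 8 FO4 of logic per stage cuts the time for 100 independent operations from 1800 to 1010 FO4, while a chain of 100 dependent operations slows from 1800 to 2000 FO4.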

7
Example Latencies
Logic delay   L1D   IssueQ   Int-Add   (latencies in cycles)
2 FO4         16    9        9
4 FO4         9     5        5
8 FO4         5     3        3
16 FO4        3     2        2
8
In-Order Processors

With no overhead, when the logic delay per stage (φ_logic) reduces from 8 FO4 to 4 FO4, performance could go up by 100% (like in the car assembly line), but it only goes up by 18%
With overhead, max performance is seen at 6 FO4 for all three benchmark classes
For the Cray, optimal pipeline depth was 10.9 FO4 (Int) and 5.4 FO4 (vector)
Degree of parallelism: vector > int-programs-today > int-programs-before (no caches!)

9
Out-of-Order Processors

Optimal logic delay for integer is 6FO4, for FP
non-vector is 5FO4, for FP vector is 4FO4
These results are insensitive to overhead costs and microarchitecture optimizations
P.S. The effect of o-o-o execution on performance:
Non-vector FP: 0.5 → 1.0
Integer: 0.8 → 1.8
Vector FP: 0.9 → 3.5

10
Out-of-Order Processors
11
Increased Pipeline Depth

Reasons for IPC decrease:
Longer ALU latencies (not quantified)
Longer load latency (25% for 6-cyc increase)
Longer branch mpred cost (10%)
Longer wakeup+select (55%)

12
Pipelining Wakeup

It takes a long time to broadcast tags across the entire issueq
Hence, wake the first eight instructions in the first cycle, wake the next eight in the second, and so on
This works well if most ready instructions are in the first stage: a 10-stage pipeline worsens performance by only 11%. Will this change the optimal logic depth?
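
The segmented wakeup described above can be sketched as follows. This is a simplified model, assuming the tag reaches eight more queue entries each cycle, starting from the head. The data layout (a list of source-tag sets) and the function name are hypothetical.

```python
def wakeup_cycles(queue_sources, broadcast_tag, segment_size=8):
    """Map each entry waiting on the tag to the cycle in which it is woken.

    queue_sources is indexed by issue-queue position (0 is the head); each
    item is the set of source tags that entry is still waiting on.
    """
    woken = {}
    for position, sources in enumerate(queue_sources):
        if broadcast_tag in sources:
            # Entries 0-7 see the tag in cycle 1, entries 8-15 in cycle 2, and so on.
            woken[position] = position // segment_size + 1
    return woken


if __name__ == "__main__":
    queue = [set() for _ in range(20)]
    for position in (2, 9, 17):
        queue[position].add("p7")
    print(wakeup_cycles(queue, "p7"))
```

In this model an instruction near the head of the queue wakes with no extra delay, which is why the scheme relies on most ready instructions sitting in the first segment.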

13
Instruction Select

Stage-1 only goes through one arbiter
Stages 2-4 have a pre-select and go thru 2
arbiters
Does well if most ready instrs are in stage-1 (4% loss)

[Figure: pipelined select logic, with issue queue stages 1 to 4 feeding 16-input arbiters; one connection is marked with a width of 8]
14
IssueQ Compaction

Both techniques work well only if instructions
move up to occupy empty slots
Wastes energy, increases complexity
Correctness problems: what if you miss the tag while in transit?

15
Conclusions

Logic per stage will only shrink by a factor of two, which limits clock speed improvements in the future
Pipelining wakeup+select has the biggest impact on IPC

16
Related Work

Hartstein and Puzak (IBM): Most programs have optimal pipeline depth between 13-30, corresponding to FO4 delays of 4-8
Sprangle and Carmean (Intel): Optimum pipeline depth is 50-60, corresponding to FO4 delays of 4-5
