Title: Validating The Intel Pentium 4 Microprocessor
1 Validating The Intel Pentium 4 Microprocessor
Bob Bentley, Intel Corporation, bob.bentley_at_intel.com
DSLab ???
2 Abstract
- The microarchitecture of the Pentium 4 processor is significantly more complex than any previous Intel Architecture microprocessor.
- This paper describes how we went about the task of validating the Pentium 4 processor.
- We hope that other microprocessor designers and validators will be able to benefit from our experience and insights.
3 Introduction
- Validation case studies are relatively rare in the literature of computer architecture and design.
- Case studies of commercial microprocessors are even rarer.
- Cost of an undetected bug
  - In a monetary sense
  - To a society that is increasingly dependent on computers
4 Introduction
- The Pentium 4 processor is significantly more complex than any previous IA-32 microprocessor.
- The challenge of validating the logical correctness of the design in a timely fashion was indeed a daunting one.
- We applied a number of innovative tools and methodologies.
5 The Pentium 4 processor
- 400MHz system bus
- hyper pipelined technology
- advanced dynamic execution
- rapid execution engine
- advanced transfer cache
- execution trace cache
- Streaming SIMD Extensions 2 (SSE2).
SIMD: Single Instruction, Multiple Data
6 A brief timeline
- 1996 Structural RTL (SRTL) work began
- 1997 spring First full-chip SRTL integration
- 1998 Q2 SRTL was largely completed
- 1999 Dec. A-step tapeout occurred
- 2000 Jan. First packaged parts arrived
- 2000 Q1 Initial samples to customers
- 2000 Oct. Production ship qualification granted
- 2000 Nov. Pentium 4 launched at 1.4-1.5 GHz
7 Validation Overview
- Applied the same or similar tools and methodologies that were used on the Pentium Pro processor to validate the Pentium 4 processor.
- Developed new methodologies and tools
  - in response to lessons learnt from previous projects
  - to address new challenges raised by the Pentium 4 processor
8 Validation Overview
- Methodologies and tools that were either new or greatly extended from the form used on previous projects:
  - Formal Verification
  - Cluster Test Environments
  - Focused Power Reduction Validation
9 Challenge
- The first challenge was to build a pre-silicon validation team.
  - Started with a nucleus of people who had worked on the Pentium Pro processor
  - 10 people were nowhere near enough for a 42 million-transistor design.
  - Mounted an extensive recruitment campaign focused mostly on new college graduates
10 Pre-Silicon Validation Environment
- Pre-silicon logic validation using
  - Cluster-level model
  - Full-chip SRTL model
  - running in the CSIM simulation environment
- Run these simulation models on
  - interactive workstations
  - compute servers
CSIM: developed by Intel Design Technology
11 Pre-Silicon Validation Environment
- IBM RS/6000s running AIX -> Pentium III based systems running Linux
- The full-chip model speed range
  - IBM RS/6000s: 0.5-0.6 Hz
  - Pentium III: 3-5 Hz
  - P4: around 15 Hz
- The speed of the cluster models varied, but all of them were faster than full-chip.
12 Pre-Silicon Validation Environment
- Used an internal tool called Netbatch to submit large numbers of batch simulations
- Averaging 5-6 billion cycles per week and had accumulated over 200 billion SRTL simulation cycles of all types
- Roughly equivalent to 2 minutes on a 1 GHz CPU
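The comparison between simulated cycles and real silicon time can be checked with a short calculation. The sketch below uses the slide's "over 200 billion" cycle count and a 1 GHz clock. The 5 Hz full-chip simulation speed is an assumed, illustrative value, and the single-machine framing is a simplification: the real workload was spread over many machines and included faster cluster-level models.

```python
def wall_clock_seconds(cycles: float, hz: float) -> float:
    """Seconds needed to execute `cycles` clock cycles at `hz` cycles per second."""
    return cycles / hz

TOTAL_CYCLES = 200e9     # "over 200 billion" accumulated SRTL cycles (slide figure)
SILICON_HZ = 1e9         # a 1 GHz CPU, as on the slide
FULL_CHIP_SIM_HZ = 5.0   # ASSUMPTION: illustrative single-digit-Hz simulation speed

SECONDS_PER_YEAR = 365 * 24 * 3600

# Time real silicon needs to execute the same number of cycles.
silicon_seconds = wall_clock_seconds(TOTAL_CYCLES, SILICON_HZ)

# Time ONE simulator instance would need at the assumed speed.
sim_years = wall_clock_seconds(TOTAL_CYCLES, FULL_CHIP_SIM_HZ) / SECONDS_PER_YEAR

print(f"silicon: {silicon_seconds:.0f} s (~{silicon_seconds / 60:.1f} min)")
print(f"single simulator at {FULL_CHIP_SIM_HZ} Hz: ~{sim_years:.0f} years")
```

With these inputs the silicon time comes out at a few minutes, the same order of magnitude as the slide's estimate, which is the point of the comparison: the entire pre-silicon simulation effort covers only minutes of real execution.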
13 Formal Verification
- The Pentium 4 processor was the first project at Intel to apply FV on a large scale.
- Couldn't formally verify the entire design; that was (and still is) way beyond the state of the art for today's tools
- Focused on the floating-point execution units and the instruction decode logic
14 Formal Verification
- Developed the tools and methodology needed to handle a large number of proofs in a highly dynamic environment
- Used the Prover tool to compare SRTL against separate specifications written in Formal Specification Language (FSL)
- Found over 100 logic bugs
  - not a large number in the overall scheme of things
  - 20 of them were "high-quality" bugs
Prover: from Intel's Design Technology group
15 Formal Verification
- Two classic floating-point data space problems
  - The FADD instruction
    - for a specific combination of source operands, the carryout bit was being set to 1 when there was no actual carryout
  - The FMUL instruction
    - the sticky bit was not set correctly for certain combinations of source operand mantissa values
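For readers unfamiliar with the sticky bit: when an exact product is rounded back to the working precision, the sticky bit records whether any of the discarded bits below the round bit were nonzero. The sketch below is a generic textbook-style illustration on small integer mantissas, not the Pentium 4 hardware and not a reconstruction of the actual bug. The function names and the 4-bit precision are assumptions chosen so that the whole operand space can be checked exhaustively.

```python
def split_product(mant_a: int, mant_b: int, precision: int):
    """Multiply two mantissas and split the exact product into
    (kept, round_bit, sticky) for rounding back to `precision` bits."""
    product = mant_a * mant_b                      # exact integer product
    discard = max(product.bit_length() - precision, 0)  # low bits to drop
    kept = product >> discard
    if discard == 0:
        return kept, 0, 0
    round_bit = (product >> (discard - 1)) & 1     # first discarded bit
    below_round = product & ((1 << (discard - 1)) - 1)
    sticky = 1 if below_round != 0 else 0          # OR of all remaining bits
    return kept, round_bit, sticky

def sticky_reference(product: int, discard: int) -> int:
    """Independent bit-by-bit reference: OR of every bit below the round bit."""
    return int(any((product >> i) & 1 for i in range(discard - 1)))

PRECISION = 4  # ASSUMPTION: tiny width so every operand pair can be enumerated
mismatches = []
for a in range(1 << (PRECISION - 1), 1 << PRECISION):      # normalized mantissas
    for b in range(1 << (PRECISION - 1), 1 << PRECISION):
        product = a * b
        discard = max(product.bit_length() - PRECISION, 0)
        _, _, sticky = split_product(a, b, PRECISION)
        if sticky != sticky_reference(product, discard):
            mismatches.append((a, b))

example_inexact = split_product(0b1011, 0b1101, PRECISION)  # 11 * 13 = 143
example_exact = split_product(0b1100, 0b1100, PRECISION)    # 12 * 12 = 144
```

Exhaustive enumeration works here only because the mantissas are 4 bits wide. At real floating-point widths the operand space is far too large to simulate, which is why data space problems of this kind are a natural target for formal proof.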
16 Cluster-Level Testing
- One of the fundamental decisions in the Pentium 4 processor development program was to develop Cluster Test Environments (CTEs).
- Unlike the Pentium Pro processor and some other new microarchitecture developments, the Pentium 4 processor never needed an SRTL "get-well plan" at the full-chip level where new development is halted.
17 Cluster-Level Testing
- CTEs provided a number of key advantages
  - provided controllability that was otherwise lacking at the full-chip level
  - made significant strides in early validation of the Pentium 4 processor SRTL even before a full-chip model was available
  - caught almost 60% of the bugs found by dynamic testing at the SRTL level
18 Power Reduction Validation
- Two main mechanisms for active power reduction in the design
  - clock gating
  - thermal management
- Each presented validation challenges, in particular clock gating.
- Clock gating as a concept is not new.
- What was different about the Pentium 4 processor design was the extent to which clock gating was used.
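The validation concern with clock gating is that switching off a block's clock must never change what the design computes. A minimal sketch of that idea follows, using a toy accumulator whose register is either clocked every cycle or clocked only when its input is valid. The accumulator, its signal names, and the stimulus are all hypothetical. This is an illustration of the equivalence being checked, not Intel's methodology.

```python
def run_accumulator(inputs, valids, gate_clock):
    """Toy accumulator register.

    gate_clock=False: the register is clocked every cycle and holds its
                      value through a feedback path when the input is invalid.
    gate_clock=True:  the register's clock is suppressed on invalid cycles.
    Returns the per-cycle register values and the number of clocked cycles.
    """
    total = 0
    clocked_cycles = 0
    trace = []
    for value, valid in zip(inputs, valids):
        if gate_clock:
            if valid:                      # clock reaches the register
                total += value
                clocked_cycles += 1
        else:
            total = total + value if valid else total  # recirculate old value
            clocked_cycles += 1
        trace.append(total)
    return trace, clocked_cycles

# Hypothetical stimulus: half of the cycles carry no valid data.
inputs = [3, 7, 1, 4, 9, 2]
valids = [True, False, False, True, False, True]

ungated_trace, ungated_clocks = run_accumulator(inputs, valids, gate_clock=False)
gated_trace, gated_clocks = run_accumulator(inputs, valids, gate_clock=True)

# Gating must be functionally invisible while reducing clock activity.
functionally_equivalent = gated_trace == ungated_trace
```

In this toy case the gated version clocks the register on three of six cycles and produces an identical trace. A gating bug would show up as a divergence between the two traces, for example if the enable condition missed a cycle on which the data was actually valid.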
19 Power Reduction Validation
- Every unit on the chip had a power reduction plan.
- Almost every Functional Unit Block (FUB) contained clock gating logic.
- The results exceeded our fondest expectations
  - clock gating fully functional on A-0 silicon
  - approximately 20 W of power saving in a system running typical workloads
20 Full-chip Integration and Testing
- With a design as complex as the Pentium 4 processor, integrating the pieces of SRTL code together to get a functioning full-chip model is not a trivial task.
- The Architecture Validation (AV) team took the lead in developing tests that would exercise the new features as they became available in each phase, but did not depend upon any as-yet unimplemented IA-32 features.
21 Full-chip Integration and Testing
- Developed a methodology called feature pioneering
  - applied when a new feature was released to full-chip for the first time
  - a validator ran his or her feature exercise tests
  - debugged the failures
  - worked with designers to rapidly drive fixes into experimental models
- Greatly speeded up the integration process and also had a side effect
  - it helped the AV team develop their full-chip debugging skills much more rapidly
22 Coverage-Based Validation
- The primary coverage tool was Proto, used to create coverage monitors and measure coverage for a large number of microarchitecture conditions.
- By tapeout we were tracking almost 2.5 million unit-level conditions and more than 250,000 inter-unit conditions
  - hitting almost 90% of the former
  - 75% of the latter
Proto: from Intel Design Technology
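A coverage monitor in this sense is a named condition that is evaluated against the simulated design state, with a record of whether it was ever hit. The sketch below illustrates the general idea only. Proto's actual interface is not described in these slides, so the `CoverageMonitor` class, the condition names, and the queue example are all hypothetical.

```python
from collections import Counter

class CoverageMonitor:
    """Tracks how often each named condition holds across sampled cycles."""

    def __init__(self, conditions):
        # Mapping: condition name -> predicate over a per-cycle state dict.
        self.conditions = dict(conditions)
        self.hits = Counter()

    def sample(self, state):
        """Evaluate every condition against one cycle of simulated state."""
        for name, predicate in self.conditions.items():
            if predicate(state):
                self.hits[name] += 1

    def coverage(self):
        """Fraction of conditions hit at least once."""
        hit = sum(1 for name in self.conditions if self.hits[name] > 0)
        return hit / len(self.conditions)

    def holes(self):
        """Conditions never hit: candidates for new directed tests."""
        return sorted(n for n in self.conditions if self.hits[n] == 0)

# Hypothetical conditions for a 4-entry queue.
monitor = CoverageMonitor({
    "queue_empty": lambda s: s["queue_depth"] == 0,
    "queue_full": lambda s: s["queue_depth"] == 4,
    "stall_while_full": lambda s: s["queue_depth"] == 4 and s["stall"],
})

# Hypothetical per-cycle states from one simulation run.
for cycle_state in [
    {"queue_depth": 0, "stall": False},
    {"queue_depth": 2, "stall": False},
    {"queue_depth": 4, "stall": False},
]:
    monitor.sample(cycle_state)

print(f"coverage: {monitor.coverage():.0%}, holes: {monitor.holes()}")
```

The list of holes is the useful output: it tells a validator which conditions the existing tests never reached, rather than leaving that to assumption.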
23 Coverage-Based Validation
- Used the Pathfinder tool to measure how well we were exercising all the possible microcode paths in the machine.
- Much to our surprise, running all of the AV test suite yielded coverage of less than 10%.
- It did reinforce the value of collecting coverage feedback and not just assuming that our tests were hitting specified conditions.
Pathfinder: from Intel's Central Validation Capabilities group
24 Bug Discussion
- Comparing the development of the Pentium 4 with the Pentium Pro
  - 350% increase in the number of bugs filed against SRTL
- The breakdown of bugs by cluster (microcode)
  - Pentium Pro: over 30% of the total
  - Pentium 4: less than 14%
- Memory Cluster was the largest source of hardware bugs, accounting for around 25% of the total in both designs.
25 Bug Discussion
- RTL Coding (18.1%)
  - things like typos, cut and paste errors
  - incorrect assertions (instrumentation) in the SRTL code
  - the designer misunderstood what he/she was supposed to implement
26 Bug Discussion
- Microarchitecture (25.1%)
  - problems in the microarchitecture definition
  - architects not communicating their expectations clearly to designers
  - incorrect documentation of algorithms, protocols, etc.
27 Bug Discussion
- Logic / Microcode Changes (18.4%)
  - the design was changed
    - to fix bugs or timing problems
  - state was not properly cleared or initialized at reset
  - clock gating
28 Bug Discussion
- Architecture (2.8%)
  - Certain features were not defined until late in the project.
  - This led to shoehorning them into working functionality.
29 Conclusion
- The Pentium 4 processor was highly functional on A-0 silicon and received production qualification in only ten months from tapeout.
- The validation work described here is why we could maintain such a tight schedule
- It enabled Intel to realize early revenue from the Pentium 4 processor in today's highly competitive marketplace
30 References
1. Clark, D., "Bugs Are Good: A Problem-Oriented Approach To The Management Of Design Engineering," Research-Technology Management, 33(3), May 1990, pp. 23-27.
2. Bentley, R., and Milburn, B., "Analysis of Pentium Pro Processor Bugs," Intel Design and Test Technology Conference, June 1996. Intel Internal Document.
3. Zucker, R., "Bug Root Cause Analysis for Willamette," Intel Design and Test Technology Conference, August 2000. Intel Internal Document.
4. www.intel.com, Intel Technology Journal