Title: Instruction Set Architecture Overview - Target ISA: Intel® Itanium® IA-64 (Itanium 2)

1
Instruction Set Architecture Overview
Target ISA: Intel® Itanium® IA-64 (Itanium 2)
CECS 440, Spring 2003
  • Team
  • James Callahan
  • Charles Pickman

Date: May 5, 2003  Class: MW 7:00-7:50 PM, Professor G. C. Hill
2
Contents
  • Section (Page)
  • Introduction - Overcoming CPU Bottlenecks (3)
  • Introduction - Itanium® Chronology (4)
  • Introduction - Technology Roadmap (5)
  • Introduction - Photos (6-7)
  • Introduction - Exploded Packaging and Concept (8)
  • Introduction - Articles (9-11)
  • Introduction - Overview, EPIC (12)
  • Introduction - Implemented in Lite and Not (13)
  • Introduction - Hardware Architecture (14)
  • Itanium - Branches and Predication (15)
  • Itanium - General Instruction Format (16-17)
  • Itanium - Using Predication to Eliminate Branches (18-19)
  • Itanium - Memory Hierarchy (20)
  • Itanium - Speculation (21-27)
  • ISA Classification (28)
  • Register Set - Integer (29)
  • Data Types (30)
  • Addressing Modes (31)

3
Overcoming CPU bottlenecks
  • Why 64 bits?
  • VLSI technology is increasing the number of
    transistors available on a single die.
  • Compiler technology is now quite advanced, but
    it still has limitations.
  • Multithreading is becoming more pervasive.
  • "Media-rich" means parallelism. 
  • Modularity and scalability will become
    increasingly important. 
  • Goals for Intel's next-generation CPUs
  • Simplicity
  • Extensibility
  • Parallelism
  • Compiler-oriented
  • 64-bit computing
  • Extremely large file support
  • Extremely large physical memory support
  • A huge virtual address space for applications
  • 64-bit computation

4
Introduction - Itanium® Chronology
  • 1994 - Intel and Hewlett Packard work together on
    Itanium (codename Merced)
  • 1999 - Prototypes were promised for release
    mid-year 1999.
  • 2000 - Demonstrated a 4-CPU Itanium at Linux
    World; rollout delayed until 2001.
  • 2000 - Units shipped for demonstration and 500
    units sold.
  • 2001 - Itanium 2 (codename McKinley) is due to
    arrive in late 2001, eclipsing the first Itanium
    rollout.
  • 2002 - Reported cost of Itanium development is
    over $1 billion.
  • Federal patent suits find Intel guilty of using
    Intergraph technology on Itanium.
  • 2003 Supercomputing applications finally
    kick-in and show what this bold new Intel
    architecture can do!

5
Introduction - Intel® Itanium® Processor Family
Roadmap
6
Introduction - Photos
Itanium-1 (L3 Cache External to Die)
Itanium-2
7
Itanium® CPU Layout
8
Itanium® Exploded Packaging and Concept
  • Designed to move complexity away from the
    processor and into the programmer, compiler, and
    assembler.
  • 3 x 5 cartridge
  • CPU + L3 cache
  • 130 W power
  • 420 mm²
  • Transistors:
  • CPU: 25 million
  • L3 Cache: 300 million

9
One of Many Supercomputer Itanium Articles
Intel Itanium Architecture to be Foundation for
One of World's Most Powerful Scientific Computing
Systems (August 9, 2001): 3,300 Intel Processors to
be Linked in a System Capable of Calculating More
Than 13.6 Trillion Operations Per Second. Intel
today announced that its Itanium family of
processors will be used to build a distributed
scientific computing system expected to be the
largest of its kind in the world. The computing
system, dubbed the "TeraGrid," is part of a $53
million award by the National Science Foundation
(NSF) to four facilities to address complex
scientific research by creating a Distributed
Terascale Facility (DTF). The TeraGrid will link
computers powered by more than 3,300 Intel
Itanium family processors. It will be capable of
more than 13.6 trillion calculations per second
(13.6 teraflops) and have the ability to store,
access and share more than 450 trillion bytes of
information. The TeraGrid will be accessible to
researchers across the United States so that they
can more quickly analyze, simulate and help solve
some of the most complex scientific problems.
Examples of research areas include molecular
modeling for disease detection, cures and drug
discovery, automobile crash simulations, research
on alternative energy sources and climate and
atmospheric simulations for more accurate weather
predictions. "The Itanium processor family is
bringing a new level of performance, scalability
and lower costs to high-performance computing,"
said Abhi Talwalkar, Intel vice president and
assistant general manager, Enterprise Platforms
Group. "Today's NSF award is a major show of
support for Itanium technology. All of us at
Intel are proud of the role our products play in
helping to advance the progress of scientific
discovery." The system announced today has been
dubbed "TeraGrid" due to its speed, distributed
design and deployment across multiple networked
geographic sites. It will achieve "tera"
performance with its ability to calculate
trillions of floating point operations per second
(teraflops) and store trillions of bytes
(terabytes) of data. The grid is a resource for
researchers to mutually access the system and
collaborate using shared computing hardware,
software and information. Expected to be
available in 2002, the TeraGrid is planned to be
the most comprehensive distributed scientific
computing infrastructure of its kind. It will
build upon an existing one-teraflops solution
with more than 300 Itanium processors now being
deployed at the National Center for
Supercomputing Applications (NCSA). The TeraGrid
will be based on both Intel's Itanium and
"McKinley" processors. McKinley is the code name
for the second product in Intel's Itanium
processor family, due in 2002. The largest
portion of the DTF computing power will be at the
NCSA at the University of Illinois in
Urbana-Champaign. NCSA has three DTF partners
which will also deploy Itanium systems: the San
Diego Supercomputer Center (SDSC) at the
University of California, San Diego; Argonne
National Laboratory in suburban Chicago; and the
California Institute of Technology in
Pasadena. The system will consist of clustered
IBM servers running the Linux operating system,
and will be connected by a Qwest high-speed
optical network. In addition to providing the
processors powering the IBM systems, Intel will
supply the TeraGrid with key compilers, software,
tools and engineering design, and tuning support
services. The Itanium architecture design
enables breakthrough capabilities in processing
terabytes of data at high speeds and processing
complex computations. Itanium-based solutions are
providing the highest levels of floating-point
performance for complex, numerically intensive
applications, surpassing many of the best
RISC-based results and benchmarks to date. The
Itanium processor's floating-point engine enables
up to 6.4 billion operations per second and
includes increased system memory bandwidth.
Intel, the world's largest chip maker, is also a
leading manufacturer of computer, networking and
communications products. Additional information
about Intel is available at
http://www.intel.com/pressroom/. Intel is a
registered trademark and
Itanium is a trademark of Intel Corporation.
Third party marks and brands are property of
their respective holders.
http://www.teragrid.org/news/080901_intel.html
10
Recent Itanium® Articles
April 10, 2003
http://www.businessweek.com/technology/cnet/stories/996357.htm
Itanium gets supercomputing software Researchers
build full Itanium support into software that can
be used to assemble supercomputers out of
clusters of Linux computers. Researchers at
the National Partnership for Advanced
Computational Infrastructure have built full
Itanium support into software that can be used to
assemble supercomputers out of clusters of Linux
computers. Version 2.3.2 of the NPACI Rocks
software, code-named Annapurna, is the first
version to support Itanium, Intel's high-end
processor, NPACI said in a statement Thursday.
The software makes it easier to install the Linux
operating system on numerous computers despite
differences between each machine. There already
was an Itanium version of the Rocks software, but
it didn't include all the software components of
the version for computers using Intel's Pentium
and Xeon or Advanced Micro Devices' Athlon chips.
The move will make it easier for Rocks users to
add Itanium systems into clusters that use the
other chips, according to Philip Papadopoulos,
program director for the San Diego Supercomputing
Center's (SDSC) grid and cluster computing group.
Because Itanium understands a completely
different set of instructions from lower-end
Intel processors, software must be completely
rebuilt for the newer chips. That barrier has
hindered adoption of Itanium in broad business
markets, but it's been less of a problem in the
supercomputing niche, where customers often
control their own software instead of relying on
products such as Oracle's database or Computer
Associates' management software. Indeed,
Gartner analyst John Enck said in a March 26
report that Itanium systems are fine for
supercomputing clusters and will expand this year
to some mainstream markets. "Gartner believes
(the Itanium processor family) is safe for
high-performance computer clusters immediately
and will be ready for mainstream database use on
all operating systems by year-end 2003," Enck
said. "Other application usage models will
quickly follow." The NPACI Rocks software is
being used at a host of academic and government
sites, including Northwestern University, Pacific
Northwest National Laboratory, the Scripps
Institution of Oceanography, Stanford University
and the University of Macedonia. Rocks is an
open-source program that's developed by the NPACI
at the SDSC by the University of California at
Berkeley, Singapore Computer Systems and
individual programmers. It's based on Red Hat
Linux version 7.3. The program includes cluster
software for tasks such as sending messages from
one computer to another, monitoring each system's
performance and scheduling jobs across the
cluster. By Stephen Shankland, Staff Writer,
CNET News.com
11
Recent Itanium® Articles
Thursday, Apr 24 at 16:31 PDT - Singapore - The
Linux Competency Centre at Singapore Computer
Systems (SCS-LCC) has commissioned a new
60-CPU Intel Itanium 2-based cluster
for the Singapore-MIT Alliance (SMA) at the
National University of Singapore. The SMA
cluster, named HydraIII, is the first large-scale
Intel Itanium 2-based Beowulf cluster to be
deployed into production using the open-source
Rocks cluster toolkit, whose development is led
by the San Diego Supercomputer Center. The
cluster was installed with Rocks and had
applications running in less than a day. "The
rapid deployment by SCS of the HP system
demonstrates that 64-bit high performance
clusters are now as easy to build as 32-bit x86
processor systems," said Leslie Ong, Director, HP
Business Critical Systems, South East Asia. "Such
efficiency in rollout underscores the growing
momentum to move to open standards from
proprietary systems in the scientific community,"
he added. "The increasing demand for
high-performance computing power will be a major
driver of computing innovation throughout the
next decade. We expect clusters and grids using
the open standard Intel Itanium processor family
to deliver the performance and affordability
required by the industry," said William Wu,
Itanium processor family marketing manager, Asia
Pacific. HydraIII cluster supports about 50 SMA
researchers and post-graduate students involved
in various projects, ranging from computational
fluid dynamics to bio-engineering. The cluster
consists of fifteen HP rx5670 nodes, each with
four Itanium 2 processors, and is interconnected
with a high-performance, high-bandwidth,
low-latency switching system from Myrinet. The
cluster's operating system software is Red Hat
Linux, managed by the tools of NPACI Rocks
version 2.3.2. Current Linpack performance
achieves around 70% of theoretical peak
processing power (240 GFLOPS), at 167 GFLOPS over
the Myrinet interconnect. "We are very pleased
with the performance and ease of management of
the Rocks-based Itanium 2 cluster," said Prof.
Khoo Boo Cheong, Program Co-Chair of High
Performance Computation for Engineered Systems at
SMA. "We intend to encourage more researchers to
migrate to HydraIII over the next few months. The
technical expertise and assistance that the
SCS-LCC team has provided to us made a huge
difference to our transition to 64-bit Linux
parallel computing." "The team took less than a
day to install the cluster with Rocks and get
the cluster operational. This is a testimony to
the amount of work that has gone into making
Rocks one of the best and easiest to use cluster
toolkits in the world," said Laurence Liew,
manager of the SCS Linux Competency Centre.
"SCS Linux Competency Centre collaborates
closely with the San Diego Supercomputer Center
on NPACI Rocks and provides critical support in
the areas of file systems and queuing systems,"
said Dr Philip Papadopoulos, program director for
SDSC's Grid and Cluster Computing group. "The
Rocks user community benefits greatly from SCS'
expertise and their significant contributions to
this community toolkit."
http://www.supercomputingonline.com/article.php?sid=1392
12
Overview - EPIC (Explicitly Parallel Instruction
Computing)
  • Designed to move complexity away from the
    processor hardware and into the programmer,
    compiler, and assembler.
  • Much of the parallelism is handled by the
    compiler with hardware support
  • The compiler can spend days with many resources
    optimizing (parallelizing) the code at the vendor
  • All the runtime user applications benefit from
    optimal parallel code, so IA-64 does not need to
    optimize at runtime
  • Many hardware- and compiler-driven methods are
    used to speed up operation
  • A large (10-stage) pipeline increases speed, but
    requires accurate branch prediction; this is an
    important reason why predication is provided
    (explained later)
  • Branch misses are very difficult to repair
    because of the large pipeline
  • Predication simply uses a 1-bit predicate
    register to allow either branch of an if
    statement to take effect; both branches of all
    predicated if statements are run concurrently.
  • Predication allows both branch streams to be
    merged into a single stream, eliminating branches
    and misses which need to be corrected
  • Many functional hardware units are available for
    performing operations in parallel.
  • The instructions are bundled into groups of 3
    instructions with an added 5-bit template for a
    complete 128-bit instruction bundle.
  • The 3 instructions in the bundle are determined
    to be non-interfering by the compiler
  • Speculative loads allow operands to be fetched
    in advance, hiding memory access latency
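The bundle arithmetic above (three 41-bit instruction slots plus one 5-bit template = 128 bits) can be sketched in a few lines. The packing order below, with the template in the low bits, follows the usual description of the IA-64 bundle format, but treat the exact bit layout as illustrative rather than definitive.

```python
# Sketch of a 128-bit IA-64 instruction bundle: three 41-bit
# instruction slots plus a 5-bit template. The template is placed
# in the low 5 bits here; the layout is illustrative.

SLOT_BITS = 41
TEMPLATE_BITS = 5

def pack_bundle(template, slots):
    """Pack a 5-bit template and three 41-bit slots into a 128-bit int."""
    assert 0 <= template < (1 << TEMPLATE_BITS)
    assert len(slots) == 3 and all(0 <= s < (1 << SLOT_BITS) for s in slots)
    word = template
    shift = TEMPLATE_BITS
    for s in slots:
        word |= s << shift
        shift += SLOT_BITS
    return word

# The bundle width works out to exactly 128 bits.
assert TEMPLATE_BITS + 3 * SLOT_BITS == 128

bundle = pack_bundle(0b00010, [0x1FFFFFFFFFF, 0, 1])
assert bundle.bit_length() <= 128
assert bundle & 0b11111 == 0b00010                       # template field
assert (bundle >> 5) & ((1 << 41) - 1) == 0x1FFFFFFFFFF  # slot 0
```

The compiler's job, per the slide, is to choose template values and slot contents so that the three instructions in a bundle do not interfere with one another.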

13
Overview - Itanium Lite: Implemented / Not Implemented
  • Product Features Implemented
  • IA-64 ISA
  • RISC Instruction Set
  • Predication (note: all instructions take
    1 clock cycle to execute)
  • Control Speculation
  • Branch adder
  • Physical Register Subset, 32 registers,
    each 64 bits
  • Split L1 Cache for instructions and
    data; each has independent non-blocking main
    memory access
  • Instructions are 41-bit fixed-format
  • Delayed Branch for branches with NOP
    insertion, in anticipation of being pipelined in
    the future (ref. 6, p. 558)
  • A single NOP insertion is adequate as a
    placeholder after all conditional branches to
    avoid performing unintended instructions
  • When the pipeline is eventually implemented, the
    placeholder NOPs can be replaced with a
    sufficient number of NOP insertions
  • Features Not Presently Supported
  • IA-32 ISA
  • Pipeline
  • Floating point
  • Data Speculation
  • Multiple execution units

14
Introduction - Hardware Architecture
15
Branches and Predication
  • Traditional Architectures
  • Intel estimates that 20% to 30% of processor
    performance is eaten up by branch
    mispredictions.
  • Branches limit your freedom to schedule the code
    for optimum performance.
  • If-Then-Else Conditional Statement.
  • Could evaluate the If, then depending on the
    outcome process the Then or the Else paths.
  • Alternative is to use Branch Prediction. While
    waiting for the If, just guess which branch and
    execute it.
  • If you get it right, you haven't wasted any time;
    if you get it wrong, that's where the 20-30%
    performance hit comes into effect. But even
    assuming you get it right, you might still have a
    number of execution slots going to waste.

Predication: EPIC deals with the problems which
branching introduces by simply getting rid of
branches whenever it can. When IA-64 comes upon a
conditional branch, instead of trying to predict
which branch the program will take, it just takes
them both.  To understand how this process works,
it's best to look at an example.
16
Intel Itanium Instruction Format
  • A typical Itanium instruction is a three-operand
    instruction with the following syntax:
  • (qp) mnemonic.comp1.comp2 dests = srcs
  • Some examples of different Itanium instructions:
  • Simple instruction: add r1 = r2, r3
  • Predicated instruction: (p4) add r1 = r2, r3
  • Instruction with immediate: add r1 = r2, r3, 1
  • Instruction with completer: cmp.eq p3 = r2, r4

17
Intel Itanium Instruction Format
  • (qp) - A qualifying predicate is a predicate
    register indicating whether or not the
    instruction is executed. When the value of the
    register is true (1), the instruction is
    executed. When the value of the register is false
    (0), the instruction is executed as a NOP.
  • Instructions that are not explicitly preceded by
    a predicate, assume the first predicate register,
    p0, which is always true. Some instructions
    cannot be predicated.
  • mnemonic - A unique name identifying the
    instruction.
  • comp1, comp2 - Some instructions may include one
    or more completers. Completers indicate optional
    variations on the basic mnemonic.
  • dests, srcs - Most Itanium instructions have at
    least two source operands and a destination
    operand. Source operands are used as input.
    Typically, the source operands are registers, or
    immediates. The destination operand(s) is
    typically a register to which the result is
    written.

18
Using Predication to Eliminate Branches
  • Predication is the conditional execution of
    instructions based on a qualifying predicate.
  • When the predicate is true (1), the instruction
    is executed.
  • When it is false (0), the instruction is treated
    as a NOP.
  • Predicates are set by various instructions,
    including the compare instructions.
  • Predication enables you to convert a control
    dependency to a data dependency, thus eliminating
    branches in the code.

These code examples show the control flow of code
with and without predication. In the predicated
code example below, a data dependency exists
between the cmp and the two predicated
instructions, which execute in parallel.

Predicated Code:
  movl r1 = type
  ld4 r2 = [r1]
  cmp.eq p1, p2 = 'a', r2
  cmp.eq p3, p4 = 'b', r2
  (p1) add r2 = 10, r2
  (p3) add r2 = 20, r2
  st4 [r1] = r2

C Code Example:
  switch (type) {
  case 'a': type = type + 10; break;
  case 'b': type = type + 20; break;
  default: break;
  }
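The predicated sequence above can be modeled in a short Python sketch: both compares set predicates, both adds are issued, and an add whose qualifying predicate is false behaves as a NOP. The function name is illustrative, not from the slides.

```python
# Model of the predicated switch above. Both cmp.eq instructions set
# predicate registers; both adds are "issued", but one whose predicate
# is false has no effect (a NOP). No branch is ever taken.

def predicated_switch(type_val):
    r2 = type_val                 # ld4 r2 = [r1]
    p1 = (r2 == ord('a'))         # cmp.eq p1, p2 = 'a', r2
    p3 = (r2 == ord('b'))         # cmp.eq p3, p4 = 'b', r2
    if p1: r2 = r2 + 10           # (p1) add r2 = 10, r2  (NOP if !p1)
    if p3: r2 = r2 + 20           # (p3) add r2 = 20, r2  (NOP if !p3)
    return r2                     # st4 [r1] = r2

assert predicated_switch(ord('a')) == ord('a') + 10
assert predicated_switch(ord('b')) == ord('b') + 20
assert predicated_switch(ord('c')) == ord('c')   # default case: unchanged
```

The `if p1:` guards here only model nullification; on the hardware, both adds occupy issue slots regardless of the predicate values, which is exactly the trade that removes the branch.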
19
Predication Summary
  • All conditional instructions are predicated
  • Avoids short branches that inject bubbles into
    the pipeline
  • Executes both branch paths simultaneously
  • Discards irrelevant path as predicate is
    evaluated
  • Delays the final result's effect, allowing time
    to resolve qualifying predicates
  • Example 1
  • Original code:          r1 = r2 + r3
    Predicated pseudo-code: if (p5) r1 = r2 + r3
    Predicated code:        (p5) add r1 = r2, r3
  • Example 2
  • Original code:          if (a > b) c = c + 1;
                            else d = (d << e) + f
    Predicated pseudo-code: pT, pF = compare(a > b);
                            if (pT) c = c + 1;
                            if (pF) d = (d << e) + f
    Predicated code:        cmp.gt pT, pF = ra, rb
                            (pT) add c = 1, c
                            (pF) shladd d = d, e, f

20
Memory Hierarchy
  • A solution to obtaining quick memory access
    relies on locality of reference
  • most programs do not access all code or data
    uniformly
  • Generally smaller hardware is faster than larger
    hardware
  • Faster hardware is expensive
  • Any Instruction Load or Data Load can take a
    large number of CPU clocks (large amount of time
    or latency)
  • Speculation (pre-fetching) reduces effective
    access time of instructions and data

21
Speculation
  • Fast processor speeds are of limited value if
    computational registers sit idle while the
    processor retrieves required data from memory
  • Speculation allows the compiler to identify
    future data needs, so essential data can be
    pre-loaded into the processor
  • This technique can significantly reduce or
    eliminate processor wait times
  • There is no 100% guarantee that any speculative
    attempt to perform either an instruction
    (control) or data fetch ahead of time will be
    successful
  • Many hardware / ISA mechanisms attempt to reduce
    the negative impacts of bad speculations

22
Control Speculation
  • Load transfers data stored in memory to a general
    register and can take a long time
  • The data transferred can either be software
    instructions from a program or purely data
  • To reduce effective access time special
    mechanisms are provided to allow for
    compiler-directed speculation
  • Control speculation is a compiler optimization
  • An instruction or sequence of instructions is
    executed before it is known (exactly) that the
    dynamic control flow of the program will actually
    reach the point in the program where the sequence
    of instructions is needed.
  • Starting execution early allows the compiler to
    overlap the execution with other work, increasing
    parallelism and decreasing overall execution
    time.
  • This optimization is performed when it is
    determined that the calculation will be required
  • In cases where control flow does not need the
    calculation, the results are discarded or not
    used
  • Since the speculative instruction sequence may
    not be required after all, any exceptions
    should be delayed until the actual sequence is
    known to be required
  • A mechanism is provided for these exceptions to
    be recorded and deferred, to be signalled later
  • A special token is written into the target
    register's extra bit, NaT (Not a Thing).

23
Control Speculation
  • Instructions are either speculative or
    non-speculative
  • Non-speculative instructions will raise
    exceptions immediately and are unsafe to be
    scheduled before they are known to be executed
  • Speculative instructions defer exceptions, so can
    be scheduled before they are needed
  • At the point in the program where it is known
    that the speculative calculation result is
    necessary, a speculation check (chk.s)
    instruction is used
  • The check is made for the deferred exception
    token in NaT.
  • If no deferred exceptions are found, then the
    speculative calculation was successful and
    execution continues normally
  • If a deferred exception token is found, then the
    speculative calculation was unsuccessful and must
    be re-done, this time by branching to a new
    address
  • A branch is taken to a new address with a
    non-speculative version of the same code
  • On this second try to run the code the exceptions
    are handled normally (non-speculative)
  • Original code:
      if (a > b) load(ld_addr1, target1)
      else       load(ld_addr2, target2)
  • Speculated code:
      sload(ld_addr1, target1)
      sload(ld_addr2, target2)
      /* other operations, including uses of
         target1 and target2 */
      if (a > b) scheck(target1, recovery_addr1)
      else       scheck(target2, recovery_addr2)
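The sload/scheck pattern above can be modeled with a minimal Python sketch. The NaT token, the dict-based memory, and the function names are illustrative stand-ins for the ld.s/chk.s mechanism the slides describe, not real Itanium APIs.

```python
# Sketch of control speculation with deferred exceptions. A speculative
# load that would fault writes a NaT token instead of raising; the
# later check tests for the token and branches to recovery code, which
# redoes the load non-speculatively.

NAT = object()           # models the "Not a Thing" deferral token

memory = {0x100: 42}     # mapped address -> value

def sload(addr):
    """Speculative load (ld.s-like): defer the fault by returning NaT."""
    return memory.get(addr, NAT)

def load(addr):
    """Non-speculative load: faults (raises) immediately."""
    return memory[addr]

def scheck(value, recovery):
    """chk.s-like check: if a deferral token is present, run recovery."""
    return recovery() if value is NAT else value

# Success: the value was fetched early and the check falls through.
target1 = sload(0x100)
assert scheck(target1, lambda: load(0x100)) == 42

# Failure: the early load would have faulted, so NaT propagates; the
# check runs recovery, which redoes the load once the address is valid.
target2 = sload(0x300)
assert target2 is NAT
memory[0x300] = 7
assert scheck(target2, lambda: load(0x300)) == 7
```

The key property the sketch shows is the one the slides emphasize: the fault is recorded, not raised, so hoisting the load above the branch is always safe, and the cost of a wrong guess is paid only at the check.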

24
Control Speculation
  • Computational instructions do not generally cause
    exceptions
  • The only instructions which generate deferred
    exception tokens are speculative loads
  • Other speculative instructions propagate deferred
    exception tokens, but do not generate them
  • Compare instructions (cmp and tbit) read general
    registers and write one or two predicate
    registers
  • If any source contains a deferred exception
    token, all predicate targets are either cleared
    or left unchanged.
  • Software uses this method to ensure any dependent
    conditional branches are not taken and any
    dependent predicated instructions are nullified
  • Deferred exception tokens can also be tested
    using test NaT (tnat)
  • Tnat tests the NaT bit corresponding to the
    specified general register and writes two
    predicated results
  • A non-speculative instruction that reads a
    register containing a deferred exception token
    will raise a Register NaT Consumption fault.
  • Such instructions are thought of as performing a
    non-recoverable speculation check operation
  • The operating system also has control over
    exception deferral
  • The O/S has the option to select which exceptions
    are deferred automatically in hardware
  • Other exceptions may be handled (and possibly
    deferred) by software
  • Special register Spill and Fill instructions
    store and load a register to/from memory while
    preserving any deferred exception token.

25
Data Speculation
  • Similar to control speculation; allows the
    compiler to schedule instructions across some
    types of ambiguous data dependencies.
  • An ambiguous data or memory dependency exists
    between a store, which updates the memory state,
    and a load from memory to registers when it
    cannot be determined whether the load and store
    might access overlapping regions of memory.
  • A store that cannot be disambiguated relative to
    a particular load is said to be ambiguous
    relative to that load.
  • In such cases, the compiler cannot change the
    order in which the load and store instructions
    were originally specified in the program.
  • To overcome this scheduling limitation a special
    kind of load instruction called an advanced load
    can be scheduled to execute earlier than the one
    or more stores that are ambiguous relative to
    that load.

26
Data Speculation
  • The compiler can also speculate operations that
    are dependent upon the advanced load and later
    insert a check instruction to determine if the
    speculation was successful or not
  • For data speculation, the check can be placed
    anywhere the original non-speculative data load
    would have been scheduled.
  • A data speculative sequence of instructions
    consists of an advanced load, zero or more
    instructions dependent on the value of that load,
    and a check instruction.
  • Original code:
      store(st_addr, data)
      load(ld_addr, target)
      use(target)
  • Speculated code:
      aload(ld_addr, target)
      /* other operations, including uses of target */
      store(st_addr, data)
      acheck(target, recovery_addr)
      use(target)

27
Data Speculation
  • Data Speculation and Instructions
  • Advanced loads are available in many forms
    (integer, floating-point, floating-point pair)
  • When an advanced load is executed, it allocates
    an entry in a structure called the Advanced Load
    Address Table (ALAT). Later, when a
    corresponding check instruction (e.g. chk.a) is
    executed, the presence of an entry indicates that
    the data speculation succeeded
  • The advanced load check (chk.a) is used when an
    advanced load and several instructions that
    depend on the loaded data value are scheduled
    before a store that is ambiguous relative to that
    advanced load.
  • The chk.a works like the chk.s: if the
    speculation was successful, execution
    continues inline and no recovery is necessary
  • If the speculation was unsuccessful the chk.a
    branches to compiler-generated recovery code.
  • The recovery code contains instructions that will
    re-execute all the work that was dependent on the
    failed data speculative load up to the point of
    the check instruction.
  • The ALAT is searched for a matching entry to
    determine success or failure
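The ALAT mechanism described above can be modeled with a small Python sketch: an advanced load records its address in a table, a conflicting store invalidates the entry, and the check either accepts the speculated value or runs recovery. The table-as-a-set, the dict memory, and the function names are illustrative stand-ins for ld.a/chk.a, not real APIs.

```python
# Sketch of data speculation with an ALAT-like table. aload records
# its address; any later store that hits that address removes the
# entry; acheck passes if the entry survived, else runs recovery.

memory = {0x10: 5}
alat = set()              # addresses with live advanced-load entries

def aload(addr):
    """Advanced load (ld.a-like): load early and record the address."""
    alat.add(addr)
    return memory[addr]

def store(addr, value):
    """Store: also invalidates any overlapping ALAT entry."""
    memory[addr] = value
    alat.discard(addr)

def acheck(addr, value, recovery):
    """chk.a-like check: speculation succeeded iff the entry survived."""
    return value if addr in alat else recovery()

# Case 1: the intervening store does not conflict; the early value stands.
t = aload(0x10)
store(0x20, 9)
assert acheck(0x10, t, lambda: memory[0x10]) == 5

# Case 2: the ambiguous store hits the same address; recovery reloads.
t = aload(0x10)
store(0x10, 9)
assert acheck(0x10, t, lambda: memory[0x10]) == 9
```

This captures the slides' point that the load/store order can be reversed safely: correctness is preserved by the check, and the reordering only pays off when the store and load do not actually overlap.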

28
ISA Classification
  • ISA classification is based on the operand
    addressing of data manipulation operations (i.e.
    ADD, SUB, MUL)
  • Two parameters of interest (M, N): N is the
    maximum number of operands that can be explicitly
    addressed; M is the maximum number of operands
    that can be explicitly addressed in memory.
  • The Itanium is classified as a (0,3)
  • Three address operands for each data manipulation
    instruction
  • Zero memory-direct operands
  • Generally this is known as a RISC ISA
    classification
  • Note the bundling of 3 instructions to make a
    128-bit word is generally considered
    very-long-instruction-word (VLIW), so Itanium
    combines features from both CISC and RISC
    processors

29
Register Set Integer
  • 32 x 64-bit general-purpose registers
  • Register 0 always returns the value zero
  • 32 x 1-bit Not-A-Thing (NaT) registers, which
    correspond to the general-purpose registers
  • NaT 0 always returns the value zero
  • 64 x 1-bit predicate registers
  • Predicate 0 always returns the value one (true)

30
Data Types
  • Integer only, no floating point.
  • 64-bit Integer
  • Byte Ordering
  • Big Endian

31
Addressing Modes
  • The Itanium has only one simple addressing mode,
    register indirect.
  • This reduces the amount of overhead per clock
    cycle, since it does not have to deal with the
    address-generation units required for multiple
    addressing modes.
  • Example 1:  ld8 r1 = [r3]
    Loads 8 bytes from the address indicated by the
    value in r3 into register r1.
  • Example 2:  st8 [r3] = r2
    Stores 8 bytes from register r2 to the address
    indicated by the value in r3.
  • PC-relative addressing is also used to perform
    branches
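The single register-indirect mode above can be made concrete with a short Python model: every load and store takes its effective address from a register, with no base+offset or scaled-index arithmetic in the instruction itself. The dict-based registers and memory are illustrative.

```python
# Model of Itanium's single register-indirect addressing mode:
# the effective address of ld8/st8 always comes from a register.

regs = {"r1": 0, "r2": 0xDEADBEEF, "r3": 0x1000}
mem = {0x1000: 123}

def ld8(dest, addr_reg):          # ld8 r1 = [r3]
    regs[dest] = mem[regs[addr_reg]]

def st8(addr_reg, src):           # st8 [r3] = r2
    mem[regs[addr_reg]] = regs[src]

ld8("r1", "r3")
assert regs["r1"] == 123          # value loaded from address in r3

st8("r3", "r2")
assert mem[0x1000] == 0xDEADBEEF  # value stored to address in r3
```

Any richer addressing (base+displacement, indexing) must be synthesized with explicit adds into a register first, which is exactly the simplification the slide credits for removing address-generation overhead.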

32
Instruction Set Format
33
Instruction Set - Itanium Lite
34
Instruction Set - Itanium Lite
35
Lite Instruction Formats not Covered