1
Embedded System-on-Chip Design and Validation
  • Rajesh Gupta, UC Irvine
  • Sujit Dey, UC San Diego
  • Peter Marwedel, U. of Dortmund

2
Outline
  • Validation Challenges and Issues for
    System-on-Chip
  • Validation Methodologies
  • Simulation
  • Prototype Validation
  • System Verification Environments
  • Improving Simulation Performance Using Models
  • Hardware-Software Co-Simulation
  • Analysis/Estimation
  • Performance
  • Power

3
System-on-chip Using IP Cores
4
Challenges for System-on-Chip Industry
  • ... the industry is just beginning to fathom the scope of the challenges confronting those who integrate blocks of reusable IP on large chips. Most of the participants summed up the toughest challenge in one word: verification.
  • Source: EE Times (Jan. 20, 1997)
  • Report on Design Reuse and IP Core Workshop
  • Organized by DARPA, EDA Industry Council, NIST

5
System-on-Chip Verification Challenges
  • Verification goals: functionality, timing, performance, power, physical
  • Design complexity: MPUs, MCUs, DSPs, interface, telecom, multimedia
  • Diversity of blocks (IPs/cores):
  • different vendors
  • soft, firm, hard
  • digital, analog, synchronous, asynchronous
  • different modeling and description languages: C, Verilog, VHDL
  • software, firmware, hardware
  • Different phases in system design flow: specification validation, algorithmic, architectural, hw/sw, full timing, prototype

6
Verification Gap
[Figure: Test Complexity and Simulator Performance plotted against Design Complexity (FFs); the difference between them is labeled Verification Gap]
Source: Cadence
7
Increasing Simulation Loads
Source: Synopsys
8
System-on-Chip Design and Validation Flow
9
Embedded Software Implementation and Validation
[Diagram blocks: Software Tasks; Estimators (Performance, Power); Instruction Set Simulator; Mapping tasks to CPUs; Compiler, Assembler, Linker; Multitask Scheduling (Priority selection); Co-Simulator; H/W; Multiprocessor Integration (Protocols, Shared Memory); Debugger, Emulator; RTOS; Software Implementation]
10
Verification of Cores in High Level Design Flow
[Diagram blocks: user constraints (resource, performance, etc.); Functional RTL; Structural RTL; Behavioral; Hardware Sharing (Delay / Power / Testability); RT-Level; VHDL Specs.; CFG, DFG; Scheduler (cycle-by-cycle behavior); Compiler; Scheduled VHDL (contr, DP); RT-Level Optimization; Test Bench Generation; Estimators; Mapping, Physical Synthesis; Verification (using Test Bench, Formal)]
11
Integration of Cores: Verification of Interfaces
[Diagram: AMBA-based system. Labels: IP blocks; CPU; DMA; ROM, RAM; External Bus Interface; Ext Access (Test); ASB (High Speed); Bridge; APB (Low power); Peripherals]
Source: ARM
AMBA: Advanced Microcontroller Bus Architecture
12
Outline
  • Validation Challenges for System-on-Chips
  • Validation Methodologies
  • Simulation
  • Prototype Validation
  • System Verification Environments
  • Improving Simulation Performance Using Models
  • Hardware-Software Co-Simulation
  • Analysis/Estimation
  • Performance
  • Power

13
Hardware Simulation
  • Event-driven
  • compiled code
  • native compiled code (directly producing optimized object code)
  • - very slow
  • handles asynchronous circuits, timing verification, initialization to a known state
  • Cycle-based
  • faster (3-10x faster than NCC)
  • - synchronous designs only, no timing verification, cannot handle X and Z states
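
The contrast between the two simulation styles can be illustrated with a minimal Python sketch. It is not taken from the slides: the single AND gate, the fixed gate delay, and the function names `event_driven` and `cycle_based` are assumptions chosen for illustration.

```python
import heapq

def event_driven(stimulus, gate_delay=2):
    """Event-driven simulation of y = a AND b with a fixed gate delay.

    stimulus is a list of (time, signal_name, value) input events.
    Returns the list of value changes in time order.
    """
    values = {"a": 0, "b": 0, "y": 0}
    queue = list(stimulus)
    heapq.heapify(queue)
    trace = []
    while queue:
        time, name, value = heapq.heappop(queue)
        if values[name] == value:
            continue  # no change, so nothing to propagate
        values[name] = value
        trace.append((time, name, value))
        if name in ("a", "b"):
            # Re-evaluate the gate only because one of its inputs changed.
            new_y = values["a"] & values["b"]
            heapq.heappush(queue, (time + gate_delay, "y", new_y))
    return trace

def cycle_based(inputs_per_cycle):
    """Cycle-based simulation: one evaluation per clock cycle, no delays."""
    return [a & b for a, b in inputs_per_cycle]

trace = event_driven([(0, "a", 1), (5, "b", 1), (9, "a", 0)])
per_cycle = cycle_based([(1, 0), (1, 1), (0, 1)])
```

The event-driven version keeps a time-ordered queue and models the gate delay, which is why it can check timing but does more work per change. The cycle-based version evaluates the logic once per clock and carries no timing information at all.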

14
Validating System-on-Chip by Simulation
  • Need for both cycle-based and event-driven
  • asynchronous interfaces
  • verification of initialization
  • verification of buses, timing
  • Need for mixed VHDL/Verilog simulators
  • IP from various vendors
  • models in different languages
  • SOC verification not possible by current
    simulation tools
  • Growing gap between amount of verification
    desired and amount that can be done
  • 1 million times more simulation load than chip in
    1990 (Synopsys)

15
Prototype Validation
16
Emulation
  • Emulation: imitation of all or parts of the target system by another system, with the target system's performance achieved primarily by hardware implementation
  • In-Circuit Emulator (ICE): a box of hardware that can emulate the processor in the target system. The ICE can execute code in the target system's memory or code that is downloaded to the emulator.
  • - An ICE can also be fabricated as silicon within the processor core; it provides an interface between a source-level debugger and a processor embedded within an ASIC.
  • - Provides real-time emulation.
  • - Supports functions such as breakpoint setting, single-step execution, trace display, and performance analysis.
  • - Provides a C source debugger.
  • Examples: EmbeddedICE macrocell in the ARM7TDMI, NEC 850 family of processors, LSI Logic

17
EmbeddedICE Macrocell
[Diagram: ARM7TDMI containing the ARM core and the EmbeddedICE macrocell (Control, Data, Addr), with scan chains 0, 1, and 2 (labels: traditional boundary scan, data bus scan chain, EmbeddedICE macrocell), a TAP, and a 5-pin JTAG interface]
Source: ARM
18
EmbeddedICE in ARM7TDMI Core
[Diagram: Debug Host, EmbeddedICE Interface, and an ASIC containing the ARM core and the EmbeddedICE macrocell]
Source: ARM
19
Debugging environment for CPU core
Source: NECEL
20
Problems of Prototype Validation
  • Simulation is too slow (10-100 cycles/sec)
  • Emulation is fast (1M cycles/sec) but ...
  • too expensive (Intel: 5M - 10M per processor)
  • errors found are expensive to diagnose and fix
  • loss of main focus: time to market

Source: Embedded Systems Programming, Jan 1996
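
To put the quoted speeds in perspective, the short Python sketch below estimates how long each approach needs to cover the same workload. The workload (one second of real time for a 33 MHz design) and the helper name `wall_clock_seconds` are assumptions for illustration, not figures from the slides.

```python
def wall_clock_seconds(target_cycles, cycles_per_sec):
    """Host time needed to run target_cycles at a given validation speed."""
    return target_cycles / cycles_per_sec

# Assumed workload: one second of real time for a 33 MHz design.
cycles = 33_000_000
slow_sim = wall_clock_seconds(cycles, 100)          # simulation at 100 cycles/sec
emulation = wall_clock_seconds(cycles, 1_000_000)   # emulation at 1M cycles/sec
print(f"simulation: {slow_sim / 86400:.1f} days, emulation: {emulation:.0f} s")
```

Under these assumptions the simulation run takes days while emulation takes about half a minute, which is the gap that motivates the faster models discussed in the following slides.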
21
Need to move Validation Early in the Design Cycle
22
System Verification Environment
  • provides the system environment in which a core should function
  • useful to verify whether a core, or a core-based system, works in the test environment
  • Compliance Test Environment
  • specially suited for cores that comply with industry standards, e.g., PCI, USB, MPEG, Ethernet, MAC, ...
  • compliance tests, as well as application-specific tests
  • Examples: PCI Test from Virtual Chips, MPEG Test from CompCore
  • System Verification Environment
  • provides a generic model of the system that is commonly built using the specific core
  • Examples: SVE from LSI Logic, NEC Simulation Environment, MicroPack from ARM

23
An Example: PCI Bus
  • 3 Main Buses
  • Processor Local Bus
  • PCI Bus Hierarchy
  • Standard I/O Bus
  • PCI Bridge Functions
  • Host-to-PCI
  • PCI-to-PCI
  • PCI-to-Standard
  • PCI-to-I/O Controller

Source: Virtual Chips
24
PCI Bus Features
  • Designed for high throughput (132 Mbytes/sec)
  • Widely adopted standard: designed for PCs, now in workstations, minicomputers, etc.
  • Optimized for bursting: transferring multiple data words after one address/arbitration phase
  • Rich set of command types to maximize performance in different system applications. Examples:
  • Read commands: I/O Read, Mem Read
  • Write commands: I/O Write, Mem Write
  • Special commands: Interrupt Acknowledge
  • 32 synthesizable soft cores supporting options:
  • VHDL/Verilog
  • 32-bit (33 MHz) or 64-bit (66 MHz)
  • PCI - Host, PCI - Satellite
  • FIFO or register data storage
  • Synchronous or Asynchronous

Source: Virtual Chips
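
The 132 Mbytes/sec figure is consistent with a 32-bit bus clocked at 33 MHz. The Python sketch below works through that arithmetic, assuming one data word is transferred on every clock during a burst (an idealization that ignores address and arbitration phases); the function name `peak_mbytes_per_sec` is hypothetical.

```python
def peak_mbytes_per_sec(bus_width_bits, clock_mhz):
    """Peak burst throughput in Mbytes/sec, assuming one word per clock."""
    return (bus_width_bits // 8) * clock_mhz

narrow = peak_mbytes_per_sec(32, 33)  # 32-bit bus at 33 MHz
wide = peak_mbytes_per_sec(64, 66)    # 64-bit bus at 66 MHz
print(narrow, wide)
```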
25
PCI Core In A System Environment
[Diagram: the Customer's Chip contains the Application Interface and the Virtual Chips PCI Core, which connects through I/O Pads to the PCI Bus]
Source: Virtual Chips
26
PCI Core Architecture Example: Host Bridge w/FIFOs
[Diagram: blocks between the Application Interface and the I/O Pads. Labels: Target address; Master address; Master Write Data; Master Write FIFO; Master Read Data; Master Read FIFO; Target Write Data; Target Write FIFO; Target Read Data; Target Read FIFO; Almost Full and Empty flags; Target State Machine; Master State Machine; Config Regs (Config write data, ADDR, Control)]
Source: Virtual Chips
27
Case 1: PCI Compliance Validation
  • Need protocol validation capability
  • Check for wrong responses, timeouts, retry conditions, etc.
  • Need timing validation capability
  • Clock to output, input setup/hold, tri-state enable/disable
  • Need checks that the PCI design is a good agent on the bus
  • Uses the bus efficiently
  • Uses the right command for transfers
  • Need ability to force exceptions
  • Need ability to run random and collision tests
  • Need to go beyond the standard compliance suite defined by the PCI SIG, which is only a starting point

Source: Virtual Chips
28
Types of Tests Used
  • Rigorous test methodology includes:
  • Compliance Tests
  • Master
  • Target
  • Enhanced feature tests (not in compliance suite)
  • Ex.: FIFO flow control
  • Ex.: Large burst lengths
  • Random Tests
  • To simulate real-world traffic
  • Essential for catching many types of bugs
  • Collision Tests

Source: Virtual Chips
29
PCI Validation Environment
[Diagram: the Device Under Test sits on the PCI bus together with a Master Model and a Target Model (each with an FSM and pool memory), a Clk Gen Model, a Protocol Monitor, an Arbiter Model, and a Timing Checker; labels also include the Procedural Interface (PI), the File Reader Interface, and an Optional Input File]
Source: Virtual Chips
30
PCI Validation Environment Features
  • Includes models for Master and Target devices
  • Includes arbitration model
  • Any PCI core can be instantiated as the Device Under Test
  • Includes logical protocol checker
  • Includes timing checker for full-timing simulation
  • Procedural interface (PI) provides tasks to initiate PCI transactions and check results
  • Includes script files with compliance and other tests written using the PI tasks

Source: Virtual Chips
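
The idea of a procedural interface that initiates transactions and checks results, combined with random tests, can be sketched in Python as follows. This is only an illustration under stated assumptions: `TargetModel`, `ProceduralInterface`, `mem_write`, `mem_read_verify`, and `random_test` are hypothetical names, the target is a plain memory with no PCI protocol or timing, and nothing here reflects the actual Virtual Chips environment.

```python
import random

class TargetModel:
    """Hypothetical target device backed by a sparse pool memory."""
    def __init__(self):
        self.mem = {}

    def write(self, addr, data):
        self.mem[addr] = data & 0xFFFFFFFF

    def read(self, addr):
        return self.mem.get(addr, 0)

class ProceduralInterface:
    """Hypothetical tasks that start a transaction and check the result."""
    def __init__(self, target):
        self.target = target
        self.errors = []

    def mem_write(self, addr, data):
        self.target.write(addr, data)

    def mem_read_verify(self, addr, expected, mask=0xFFFFFFFF):
        got = self.target.read(addr)
        ok = (got & mask) == (expected & mask)
        if not ok:
            self.errors.append((addr, expected, got))
        return ok

def random_test(pi, count, seed=0):
    """Write random data to random aligned addresses, then read it all back."""
    rng = random.Random(seed)
    reference = {}
    for _ in range(count):
        addr = rng.randrange(0, 0x1000, 4)
        data = rng.getrandbits(32)
        pi.mem_write(addr, data)
        reference[addr] = data  # reference model keeps the last value written
    results = [pi.mem_read_verify(a, d) for a, d in reference.items()]
    return all(results)

pi = ProceduralInterface(TargetModel())
pi.mem_write(0x200000, 0x11111111)            # directed test: write ...
directed_ok = pi.mem_read_verify(0x200000, 0x11111111)  # ... then verify
random_ok = random_test(pi, 100)
```

A real environment would replace the direct memory calls with bus cycles driven onto the device under test, with the protocol monitor and timing checker observing every transaction.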
31
Test Examples (Same Compliance Test)
Input File Test Script Example
    -- Test Scenario 2.9. TARGET RECEIVES MEMORY CYCLES
    -- CASE 1: Linear Incrementing AD(1:0) = '00' (Single Transfer cycles). -->
    -- Memory Write cycle by the Primary Master to the Slave under test.
    MEWE 0x200000 0x11111111 0x0000 0
    -- Read back the written location using a Memory Read PCI cycle.
    MEVF 0x200000 0x11111111 0x0000 0xffffffff 0
    -- Read back the written location using a Memory Read Line PCI cycle.
    MERDL 0x200000 0x0000 0
    -- Read back the written location using a Memory Read Multiple PCI cycle.
    MERDM 0x200000 0x0000 0
Programming Interface Example
    --
    -- Test Scenario 2.9. TARGET RECEIVES MEMORY CYCLES
    --
    write(l, string'("TEST SCENARIO 2.9 --> "));
    write(l, now);
    writeline(OUTPUT, l);
    write(l, string'("CASE 1: Linear Incrementing AD[1:0] = '00' (Single Transfer cycles).--> "));
    write(l, now); writeline(OUTPUT, l);
    addr := CONV_STD_LOGIC_VECTOR(16#200000#, 32);
    data := CONV_STD_LOGIC_VECTOR(16#11111111#, 32);
    -- Memory Write cycle by the Primary Master to the Slave under test.
    cpPciCycle(cpsig1, PCI_MW, addr, data);
    wait4cpdone1;
    if (cpsig1.error = '1') then
      ASSERT false REPORT " Memory Write failed. Remaining TS-2.9 tests invalid." SEVERITY ERROR;
      checkAnyError(cpsig1);
    end if;
    -- Read back the written location using a Memory Read PCI cycle.
    cpSetGlobal(cpsig1, SET_READ_CMP, C_COMPARE_ONCE);
    cpPciCycle(cpsig1, PCI_MRV, addr, data);
    wait4cpdone1;
    write(l, string'(" Memory Read cycle "));
    if (cpsig1.error = '1') then
      write(l, string'(" > FAILED."));
      writeline(OUTPUT, l);
      checkAnyError(cpsig1);
    else

Source: Virtual Chips
32
System Verification Environment (LSI Logic)
33
System Simulation Environment (NEC)
Source: NECEL
34
MicroPack - An Example AMBA system in HDL (ARM)
Source: ARM
35
Enhancing Simulation Speed Using Simulation Models
  • Hardware model
  • Behavioral (C) model
  • Bus-functional model
  • Instruction-Set simulation (ISS) model
  • instruction accurate
  • cycle accurate
  • Full-timing gate-level model
  • encrypted to protect IP

36
Hardware Model
  • Use the actual physical device to model its own behavior during simulation
  • Advantages: accuracy, full device functionality, including any undocumented behavior
  • Disadvantages: delivers only 1 to 10 instructions/sec; cost
  • Example: Logic Modeling (Synopsys) Hardware Models

37
Behavioral Model
  • Behavior of the core modeled in C
  • Example: memory models from Denali
  • 30-70% of system chip area is memory, which affects the power, latency, and area of the chip
  • In a typical simulation, conventional models consume as much as 90% of workstation memory
  • C models of DRAM, SRAM, Flash, PROM, SDRAM, EEPROM, FIFO, RAMBUS, Configurable Cache
  • parameterizable models, common interface to all simulators
  • allows adaptive dynamic allocation, memory-specific debugging

38
Bus Functional Model
  • The idea is to remove the application code and the target processor from the hardware simulation environment
  • Performance gains come from using the host processor's capabilities rather than simulating the same operation happening on the target processor
  • Varying degrees of use of the host processor lead to different models
  • Bus functional model
  • models only the interface circuitry (bus), no internal functionality
  • usually driven by commands such as read, write, interrupt, ...
  • bus-transaction commands are converted into a timed sequence of signal transitions fed as events to a traditional hardware simulator
  • The bus functional model emulates
  • Read/Write cycles (single/burst transfers)
  • Interrupts
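
The conversion of bus-transaction commands into timed signal transitions can be sketched in Python. The bus here is a made-up generic one: the signal names (`addr`, `data`, `write_enable`, `read_enable`), the two-clock transaction length, and the function names are all assumptions for illustration and do not correspond to any real bus protocol.

```python
def bfm_write(addr, data, start_time=0, clock_period=10):
    """Expand one write command into (time, signal, value) events."""
    t = start_time
    return [
        (t, "addr", addr),
        (t, "data", data),
        (t, "write_enable", 1),
        (t + clock_period, "write_enable", 0),
    ]

def bfm_read(addr, start_time=0, clock_period=10):
    """Expand one read command into (time, signal, value) events."""
    t = start_time
    return [
        (t, "addr", addr),
        (t, "read_enable", 1),
        (t + clock_period, "read_enable", 0),
    ]

def run_commands(commands, clock_period=10):
    """Turn a command list into one event stream for a hardware simulator."""
    events = []
    t = 0
    for cmd in commands:
        if cmd[0] == "write":
            events += bfm_write(cmd[1], cmd[2], t, clock_period)
        elif cmd[0] == "read":
            events += bfm_read(cmd[1], t, clock_period)
        else:
            raise ValueError(f"unknown command: {cmd[0]}")
        t += 2 * clock_period  # assumed: each transaction occupies two clocks
    return events

events = run_commands([("write", 0x10, 0xAB), ("read", 0x10)])
```

Only the pins are modeled: nothing inside the processor is simulated, which is where the speedup over a full processor model comes from.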

39
Compiled Code Simulation
  • Host code is not equal to target code
  • Low-level debugging not possible
  • e.g., observing processor internal registers
  • Measurements may be inaccurate
  • e.g., cycle counts

40
Instruction Set Simulation
  • Full functional accuracy of the processor as viewed from the pins
  • Operations of the CPU modeled at the register/instruction level
  • registers as program variables
  • instructions as program functions that operate on register values
  • Instructions define relationships between registers, internal memory, and external memory
  • The data path that connects the registers is abstracted out
  • Allows both high-level and assembly code to be debugged
  • Instruction accurate
  • accurate at instruction boundaries only
  • correct bus operations and total number of cycles, but no guarantee of the state of the CPU at each clock cycle; inaccuracy due to bus contention
  • Cycle accurate
  • guarantees the state of the CPU at every clock cycle
  • guarantees exact bus behavior
  • slower than instruction-accurate, but faster than a full behavioral model

Source: LSI Logic, Mentor Graphics
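
A minimal Python sketch of the "registers as program variables, instructions as program functions" idea follows. The three-instruction machine (`li`, `add`, `store`), the class name `TinyISS`, and the tuple encoding of instructions are invented for illustration; the model is instruction accurate only, with no pipeline or bus timing.

```python
class TinyISS:
    """Toy instruction-set simulator for a made-up three-instruction CPU."""
    def __init__(self, num_regs=4):
        self.regs = [0] * num_regs  # registers are plain program variables
        self.mem = {}               # external memory as a sparse dictionary
        self.pc = 0
        self.executed = 0

    # Each instruction is a function operating on register values.
    def op_li(self, rd, imm):
        self.regs[rd] = imm & 0xFFFFFFFF

    def op_add(self, rd, ra, rb):
        self.regs[rd] = (self.regs[ra] + self.regs[rb]) & 0xFFFFFFFF

    def op_store(self, ra, addr):
        self.mem[addr] = self.regs[ra]

    def run(self, program):
        handlers = {"li": self.op_li, "add": self.op_add, "store": self.op_store}
        while self.pc < len(program):
            name, *args = program[self.pc]
            handlers[name](*args)
            self.pc += 1
            self.executed += 1  # state is only meaningful at instruction boundaries

iss = TinyISS()
iss.run([("li", 0, 5), ("li", 1, 7), ("add", 2, 0, 1), ("store", 2, 0x100)])
```

Because the data path is abstracted away, the registers and memory can be inspected after any instruction, which is what makes source-level and assembly-level debugging straightforward on such a model.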
41
Instruction-Set Simulation Example
  • Example system: Microtec XRAY Sim
  • Fast: 100,000 instructions/sec
  • Software debug: source code debugging, register and memory views

42
Example of Simulation Models Used
Example: NEC provides the following simulation models:
  • 1 - C model (behavioral)
  • 2 - RTL model w/ timing wrapper
  • 3 - Verilog gate-level model
In the early stage of the ASIC design and software development, the customer uses the C model because it is the fastest model. The RTL model with timing wrapper is used for accurate timing and function verification. The gate-level model is used for the final design verification (very slow execution time).
43
Hardware-Software Co-Simulation
  • Most of the bus cycles are Instruction or Data
    fetches
  • High Activity
  • 700-1000 instructions for each I/O bus cycle
  • Low Activity
  • Only during processor I/O cycles

44
Hardware-Software Co-Simulation Implementation
45
Seamless CVE: Comprehensive System-Wide Analysis and Debug
Source: Mentor Graphics
46
Optimization Foundation: Memory Access Time
Source: Mentor Graphics
47
Seamless Optimization Example
Source: Mentor Graphics
48
Non-Optimized Logic Simulation
Source: Mentor Graphics
49
Instruction Fetch Optimization By Masking
Source: Mentor Graphics
50
Fetch Suppressed Logic Simulation
Source: Mentor Graphics
51
Memory Access Suppressed
Source: Mentor Graphics
52
Logic Simulate Active Cycles Only
Source: Mentor Graphics
53
Performance Optimization Results
Source: Mentor Graphics
54
Comparison of Validation Methods
55
Interface Based System Verification
  • Verification of cores (IP provider)
  • making sure the core works in all intended environments
  • verification models (functional, interface)
  • testbench
  • Verification of system-on-chip
  • pre-verified cores
  • validation of interfaces, integration, buses, protocols, etc.
  • Modeling the core interface
  • Interface standards to facilitate integration and validation of cores (from multiple sources) on the same chip
  • Open Modeling Interface from Open Modeling Forum (Cadence, VSIA)

56
Virtual Socket Interface Alliance
Alliance of semiconductor vendors, systems companies, independent core providers, and EDA vendors.
Aim: develop open core design interface and productization standards; define a set of interfaces for the creation and integration of cores that enable the efficient and accurate integration, verification, and testing of multiple cores (possibly from multiple sources) on a single piece of silicon.
57
VSIA Verification and Interface Standards
58
Outline
  • Validation Challenges for System-on-Chips
  • Validation Methodologies
  • Simulation
  • Prototype Validation
  • System Verification Environments
  • Improving Simulation Performance Using Models
  • Hardware-Software Co-Simulation
  • Analysis/Estimation
  • Performance
  • Power

59
Software Performance Analysis
  • Goal: determine a tight upper bound on a program's worst-case execution time (estimated WCET).
  • Instruction cache memory analysis
  • Applications
  • HW-SW partitioning
  • Real-time systems
  • Program Path Analysis
  • Determine the worst-case execution paths.
  • Avoid exhaustive search of program paths.
  • Make use of path information provided by the user.
  • Microarchitecture Modeling
  • Model hardware and determine the execution time of a known sequence of instructions.
  • Caches, CPU pipelines, etc. complicate analysis.

for (i = 0; i < 100; i++) if (rand() > 0.5) j++; else k++;
2^100 possible worst-case paths!
60
Program Path Analysis
  • Integer Linear Programming formulation
  • Assume constant instruction execution time.
  • Basic idea
  • Maximize $\sum_i c_i x_i$
  • subject to a set of linear constraints
  • structural constraints derived from program structure
  • functionality constraints provided by user
  • $x_i$: exec. count of $B_i$ (variable)
  • $c_i$: single exec. time of basic block $B_i$ (constant)
  • No explicit path enumeration
  • Li, Malik, Wolfe, ICCAD 95

61
Structural and Functionality Constraints
Structural constraints: at each node, exec. count of $B_i$ = $\sum$ inputs = $\sum$ outputs
Example: while loop
Source Code
    /* p >= 0 */
    q = p;
    while (q < 10)
        q++;
    r = q;
Control Flow Graph
    $x_1 = d_1 = d_2$
    $x_2 = d_2 + d_4 = d_3 + d_5$
    $x_3 = d_3 = d_4$
    $x_4 = d_5 = d_6$
Functionality constraints provide loop bounds and other path information
    $0 x_1 \le x_3 \le 10 x_1$
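
The while-loop example can be worked through numerically. The Python sketch below is not an ILP solver: because the constraints leave only one free variable (the loop body count), a brute-force search over it stands in for the solver. The edge numbering (d1 enters B1, d2 goes B1 to B2, d3 goes B2 to B3, d4 goes B3 back to B2, d5 goes B2 to B4), the assumption that the program is entered once (d1 = 1), and the block costs are all assumptions made for illustration.

```python
def wcet_while_loop(costs, max_iterations=10):
    """Maximize sum(c_i * x_i) for the four-block while-loop example.

    costs = (c1, c2, c3, c4): assumed execution times of basic blocks
    B1 (q = p), B2 (loop test), B3 (loop body), B4 (r = q).
    """
    c1, c2, c3, c4 = costs
    best = None
    # Functionality constraint: 0 * x1 <= x3 <= 10 * x1, with x1 = 1.
    for x3 in range(0, max_iterations + 1):
        d1 = 1            # assumed: the program is entered exactly once
        x1 = d1
        d2 = x1           # structural: x1 = d1 = d2
        d3 = d4 = x3      # structural: x3 = d3 = d4
        x2 = d2 + d4      # structural: x2 = d2 + d4 = d3 + d5
        d5 = x2 - d3
        x4 = d5           # structural: x4 = d5 (exit edge)
        total = c1 * x1 + c2 * x2 + c3 * x3 + c4 * x4
        if best is None or total > best[0]:
            best = (total, (x1, x2, x3, x4))
    return best

# Hypothetical block costs in cycles.
bound, counts = wcet_while_loop((2, 3, 4, 1))
```

With positive costs the maximum is reached at the loop bound, so the worst case executes the loop test 11 times and the body 10 times, without ever enumerating individual paths.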
62
Software Performance Analysis: Experimental Results
63
Software Power Analysis
  • Instruction-level energy analysis
  • Assign energy cost to instructions and inter-instruction effects
  • Base energy costs of instructions
  • Energy costs of inter-instruction effects (e.g., circuit state overhead, cache misses, pipeline stalls)
  • Applied to Intel 486DX2, Fujitsu SPARClite, Fujitsu DSP
  • Tiwari, Malik, Wolfe, TVLSI, Dec. 94
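
The instruction-level model sums a base cost per executed instruction plus an overhead for each pair of adjacent instructions, plus any other effects such as stalls and cache misses. A minimal Python sketch follows; the function name `program_energy`, the two-instruction trace, and every cost value are hypothetical and not measured data.

```python
def program_energy(trace, base_cost, pair_overhead, other_effects=0):
    """Estimate program energy from an executed-instruction trace.

    base_cost maps an opcode to its base energy cost.
    pair_overhead maps (previous_opcode, opcode) to the extra
    circuit-state cost of executing them back to back.
    other_effects is a lump sum for stalls, cache misses, and similar.
    """
    energy = sum(base_cost[op] for op in trace)
    for prev, cur in zip(trace, trace[1:]):
        energy += pair_overhead.get((prev, cur), 0)
    return energy + other_effects

# Hypothetical costs in arbitrary energy units.
base = {"add": 3, "load": 5}
overhead = {("add", "load"): 1}
total = program_energy(["add", "load", "add"], base, overhead, other_effects=2)
```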

64
Hardware Power Estimation
  • Activity-sensitive power models for macro blocks
  • Word-level models for arithmetic components
  • Bit-level models
  • RTL simulation does not reveal glitching activity
  • Glitching can account for as much as 50% of power
  • Solution: glitching models for macro blocks; activity estimation techniques for control logic using functional and partial delay info.
  • Landman, Rabaey, TVLSI, June 1995
  • Raghunathan, Dey, Jha, ICCAD 1996
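
Activity-sensitive models take the switching activity of a block's signals as input. The Python sketch below shows one simple way such an activity figure could be extracted from a zero-delay, word-level simulation trace; the function name `switching_activity` and the sample trace are assumptions for illustration. Because the trace holds only the settled value per cycle, glitches are invisible to it, which is the limitation of RTL simulation noted above.

```python
def switching_activity(trace, width):
    """Average fraction of bits that toggle between consecutive words."""
    if len(trace) < 2:
        return 0.0
    mask = (1 << width) - 1
    toggles = sum(
        bin((a ^ b) & mask).count("1")  # Hamming distance between words
        for a, b in zip(trace, trace[1:])
    )
    return toggles / (width * (len(trace) - 1))

# Hypothetical 4-bit bus values sampled once per clock cycle.
activity = switching_activity([0b0000, 0b1111, 0b1010], width=4)
```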

65
Sources
  • NEC Electronics, NEC CCRL
  • LSI Logic
  • Advanced RISC Machines
  • Virtual Chips (Phoenix Technologies)
  • Mentor Graphics
  • Alta Group of Cadence Design Systems
  • Synopsys
  • Eagle Design Automation
  • Princeton University

66
Acknowledgements
  • C. Smith, M. El-Khatib - NEC Electronics
  • D. McKenney, S. Hussain - LSI Logic
  • A. Greenhill - Advanced RISC Machines
  • T. Anderson, C. Snyder - Virtual Chips
  • S. A. Leef - Mentor Graphics
  • R. Grossman, B. Williams - Eagle Design
    Automation
  • S. Malik, Princeton University
  • A. Raghunathan, NEC