Build GCC Cross Compiler for a Specific CPU

1
Build GCC Cross Compiler for a Specific CPU
  • Chia-Tsun Wu
  • D92943007
  • tommy_at_access.ee.ntu.edu.tw

2
Outline
  • Introduction to SoC
  • Motivation and project goal
  • Design a CPU
  • Tools used to design the CPU hardware
  • CPU Specification
  • CPU Design flow
  • Simulation and Results

3
Outline
  • Build a GCC Cross Compiler
  • GCC structure
  • Knowledge to port GCC
  • Build Flow
  • Build a GCC Cross Assembler and Cross Linker
  • Build a GCC Cross Compiler
  • A simple test program
  • Summary

4
Introduction to SoC
  • SoC: System on a Chip.
  • Highly integrated; includes:
  • CPU
  • System Bus
  • Peripherals
  • Co-processor
  • Low cost, low area, high performance.

5
What is SOC?
6
SOC Design Flow
System Specs.
HW/SW Partitioning
Hardware Description
Software Description
HW Synthesis and Configuration
Software Generation and Parameterization
Interface Synthesis
Configuration Modules
Hardware Components
HW/SW Interfaces
Software Modules
HW/SW Integration and Cosimulation
Integrated System
System Evaluation
Design Coverification
System Validation
7
Motivation and project goal
  • Motivation
  • SoC has been the major trend in recent years
  • The CPU is one of the key cores of an SoC design
  • The development environment is the most important
    thing for a CPU
  • Goal
  • Design a simple 32-bit RISC CPU
  • Build a cross assembler and cross linker for a
    specific CPU
  • Build a cross compiler for a specific CPU

8
Design a CPU
  • Specification
  • 32-bit RISC based CPU
  • General-purpose register architecture
  • 32-bit (4 Gbyte) addressing
  • 32-bit fixed instruction length (excluding
    immediate data)
  • MSB first
  • Reset address 0x000ffffc
  • No pipeline; one instruction cycle takes four clock
    cycles:
  • Instruction fetch
  • Instruction decode and Data fetch
  • Execution
  • Write back
  • No interrupt
  • No timer

9
Registers
  • General-purpose registers R0-R15
  • R13: accumulator
  • R14: memory data pointer
  • R15: stack pointer
  • Program counter (PC) (0x000ffffc after reset)
  • Program status (PS) (Sign flag, Zero flag,
    oVerflow flag, Carry flag)

10
Instruction formats
  • General: OP Rn1, Rn2
  • OP: 8 bits
  • n: register number, 0000 = R0, 1111 = R15
  • Immediate: OP data, Rn2
  • OP: 8 bits
  • n: register number, 0000 = R0, 1111 = R15
  • data: 32-bit data
  • Branch: OP Addr
  • OP: 16 bits (low byte = 0x00)
  • Addr: 32-bit branch address
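
As a rough illustration of the general format, the sketch below packs an opcode and two register numbers into one 32-bit word. The bit positions are an assumption inferred from the test vectors later in the deck (opcode in bits 15..8, Rn1 in bits 7..4, Rn2 in bits 3..0, upper half zero); the slide itself only gives the field widths. The names `encode_general` and `enum opcode` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Opcode values taken from the instruction-set slides. */
enum opcode {
    OP_ADD  = 0x00,
    OP_ADDC = 0x01,
    OP_SUB  = 0x02,
    OP_SUBC = 0x03,
    OP_MOV  = 0x05
};

/* Pack "OP Rn1, Rn2" into a 32-bit instruction word.
 * Assumed layout (inferred, not stated on the slide):
 *   bits 15..8 = OP, bits 7..4 = Rn1, bits 3..0 = Rn2, bits 31..16 = 0. */
static uint32_t encode_general(enum opcode op, unsigned rn1, unsigned rn2)
{
    assert(rn1 < 16 && rn2 < 16);   /* register numbers are 4 bits wide */
    return ((uint32_t)op << 8) | ((uint32_t)rn1 << 4) | (uint32_t)rn2;
}
```

Under this assumed layout, `encode_general(OP_ADDC, 2, 3)` yields 0x00000123, which matches the bit pattern listed for `ADDC R2,R3` in the test vectors.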

11
Instruction sets
  • ADD Rn1,Rn2   Machine code: 00000000 Rn1 Rn2
  • Rn2 ← Rn1 + Rn2
  • Flags: S, Z, V, C
  • ADDC Rn1,Rn2  Machine code: 00000001 Rn1 Rn2
  • Rn2 ← Rn1 + Rn2 + C
  • Flags: S, Z, V, C
  • SUB Rn1,Rn2   Machine code: 00000010 Rn1 Rn2
  • Rn2 ← Rn2 - Rn1
  • Flags: S, Z, V, C
  • SUBC Rn1,Rn2  Machine code: 00000011 Rn1 Rn2
  • Rn2 ← Rn2 - Rn1 - C
  • Flags: S, Z, V, C

12
Instruction sets
  • LDI data,Rn2  Machine code: 00001000000 Rn2 Data
  • Rn2 ← data
  • Flags: none affected
  • MOV Rn1,Rn2   Machine code: 00000101 Rn1 Rn2
  • Rn2 ← Rn1
  • Flags: none affected
  • RET           Machine code: 00000110 00000000
  • PC ← SP--
  • Flags: none affected
  • JMP Addr      Machine code: 00000111 00000000 Addr
  • PC ← Addr
  • Flags: none affected
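
To make the arithmetic instructions concrete, here is a minimal C model of ADD and ADDC together with the four status flags. The slides only name the flags (S, Z, V, C); the definitions used here are the conventional two's-complement ones and are an assumption, as are the names `struct flags` and `add_with_flags`.

```c
#include <assert.h>
#include <stdint.h>

/* Program status flags named on the register slide. */
struct flags {
    unsigned s;   /* sign: bit 31 of the result */
    unsigned z;   /* zero: result is 0 */
    unsigned v;   /* overflow: signed result does not fit in 32 bits */
    unsigned c;   /* carry: unsigned carry out of bit 31 */
};

/* Model of "Rn2 <- Rn1 + Rn2 (+ carry_in)". Pass carry_in = 0 for ADD
 * and the current C flag for ADDC. Flag semantics are assumed. */
static uint32_t add_with_flags(uint32_t rn1, uint32_t rn2,
                               unsigned carry_in, struct flags *f)
{
    assert(carry_in <= 1);
    uint64_t wide = (uint64_t)rn1 + rn2 + carry_in;
    uint32_t result = (uint32_t)wide;

    f->s = result >> 31;
    f->z = (result == 0);
    f->c = (unsigned)(wide >> 32);
    /* Signed overflow: both operands share a sign that the result lacks. */
    f->v = ((~(rn1 ^ rn2) & (rn1 ^ result)) >> 31) & 1u;
    return result;
}
```

For example, adding 1 to 0x7fffffff gives 0x80000000 with S and V set, while adding 1 to 0xffffffff gives 0 with Z and C set.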

13
Tools used
  • Synopsys Design Compiler
  • Mentor Graphics ModelSim
  • Synopsys Apollo
  • TSMC 0.25um standard cell libraries

14
Design Flow
CPU Specifications
RTL Coding
Function simulation
Test bench
Design Compiler
Constraints
Gate level simulation
Test bench
Apollo
Constraints
Test bench
Post layout simulation
Tape out
15
Test vectors
  • LDI 0x0,R0 00000000000000000000010000000000
    00000000000000000000000000000000
  • LDI 0x1,R1 00000000000000000000010000000001
    00000000000000000000000000000001
  • LDI 0x2,R2 00000000000000000000010000000010
    00000000000000000000000000000010
  • LDI 0x3,R3 00000000000000000000010000000011
    00000000000000000000000000000011
  • LDI 0x4,R4 00000000000000000000010000000100
    00000000000000000000000000000100
  • LDI 0x5,R5 00000000000000000000010000000101
    00000000000000000000000000000101
  • LDI 0x6,R6 00000000000000000000010000000110
    00000000000000000000000000000110
  • LDI 0x7,R7 00000000000000000000010000000111
    00000000000000000000000000000111
  • LDI 0x8,R8 00000000000000000000010000001000
    00000000000000000000000000001000
  • LDI 0x9,R9 00000000000000000000010000001001
    00000000000000000000000000001001
  • LDI 0xa,R10 00000000000000000000010000001010
    00000000000000000000000000001010
  • LDI 0xb,R11 00000000000000000000010000001011
    00000000000000000000000000001011
  • LDI 0xc,R12 00000000000000000000010000001100
    00000000000000000000000000001100
  • LDI 0xd,R13 00000000000000000000010000001101
    00000000000000000000000000001101
  • LDI 0xe,R14 00000000000000000000010000001110
    00000000000000000000000000001110
  • LDI 0xf,R15 00000000000000000000010000001111
    00000000000000000000000000001111
  • ADD R0,R1 00000000000000000000000000000001
  • ADDC R2,R3 00000000000000000000000100100011
  • SUB R4,R5 00000000000000000000001001000101
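
The LDI vectors above follow a regular pattern: one instruction word followed by one data word. The sketch below regenerates such a pair and renders each word as a 32-character bit string, MSB first. The opcode value 0x04 and the field positions are read off the vectors themselves and should be treated as assumptions; `encode_ldi` and `to_binary` are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Render `word` as 32 characters of '0'/'1', MSB first, plus a NUL. */
static void to_binary(uint32_t word, char out[33])
{
    memset(out, '0', 32);
    for (int i = 0; i < 32; i++)
        if ((word >> (31 - i)) & 1u)
            out[i] = '1';
    out[32] = '\0';
}

/* Encode "LDI data, Rn2" as an instruction word plus a data word.
 * Assumption: opcode 0x04 in bits 15..8 and Rn2 in bits 3..0,
 * as the test vectors suggest. */
static void encode_ldi(uint32_t data, unsigned rn2, uint32_t words[2])
{
    assert(rn2 < 16);
    words[0] = (0x04u << 8) | rn2;
    words[1] = data;
}
```

Calling `encode_ldi(0xa, 10, w)` and passing `w[0]` to `to_binary` gives `00000000000000000000010000001010`, the first word listed for `LDI 0xa,R10`.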

16
Simulation result
17
Synthesis results
  • TSMC 0.25um
  • Area: 0.35 mm²
  • Clock: 400 MHz
  • Power: 1.73 mW
  • UMC 0.18um
  • Area: 0.19 mm²
  • Clock: 600 MHz
  • Power: 1 mW

18
Build a GCC Cross Compiler
  • GCC structure
  • Knowledge to port GCC
  • Build Flow
  • Build a GCC Cross Assembler and Cross Linker
  • Build a GCC Cross Compiler
  • A simple test program
  • Summary

19
GCC Execution
20
The Structure of Compiler
21
The Structure of GCC
22
GCC Code Generation
  • The backend matches machine description patterns
    against the intermediate format (RTL).
  • A machine description works like a template.
  • A machine description includes:
  • type bit widths, memory alignment
  • instruction patterns, register classes
  • peephole optimization rules

23
GCC Code Generation (contd)
24
Example of RTL
  • (plus:SI (reg:SI 8) (const_int 123))
  • Adds two 4-byte integer (SImode) operands.
  • The first operand is a register.
  • The register is also a 4-byte integer.
  • The register number is 8.
  • The second operand is a constant integer.
  • Its value is 123.
  • Its mode is VOIDmode (not given).
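
The expression described on this slide can be modelled as a small tree. The C sketch below is a toy data structure, not GCC's real `rtx` type or API; the names `struct rtx`, `rtx_format`, `MODE_SI` and `MODE_VOID` are hypothetical stand-ins. It builds the addition of register 8 and the constant 123 and prints it in RTL's parenthesised notation, omitting the mode suffix for VOIDmode operands.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum rtx_code { RTX_REG, RTX_CONST_INT, RTX_PLUS };
enum rtx_mode { MODE_VOID, MODE_SI };   /* stand-ins for VOIDmode, SImode */

struct rtx {
    enum rtx_code code;
    enum rtx_mode mode;
    long value;                   /* register number or constant value */
    const struct rtx *op0, *op1;  /* operands, used only by RTX_PLUS */
};

/* Print `e` into `buf` in parenthesised RTL notation; returns the length.
 * Sub-expressions longer than 63 characters are truncated in this sketch. */
static size_t rtx_format(const struct rtx *e, char *buf, size_t size)
{
    const char *suffix = (e->mode == MODE_SI) ? ":SI" : "";
    char a[64], b[64];

    assert(e != NULL && buf != NULL && size > 0);
    switch (e->code) {
    case RTX_REG:
        snprintf(buf, size, "(reg%s %ld)", suffix, e->value);
        break;
    case RTX_CONST_INT:
        snprintf(buf, size, "(const_int %ld)", e->value);
        break;
    case RTX_PLUS:
        rtx_format(e->op0, a, sizeof a);
        rtx_format(e->op1, b, sizeof b);
        snprintf(buf, size, "(plus%s %s %s)", suffix, a, b);
        break;
    }
    return strlen(buf);
}
```

Formatting a `RTX_PLUS` node in `MODE_SI` whose operands are register 8 (`MODE_SI`) and the constant 123 (`MODE_VOID`) produces the string `(plus:SI (reg:SI 8) (const_int 123))`.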

25
Templates
  • Used for three purposes
  • Generating RTL from parse tree.
  • Generating machine insns from RTL.
  • Specifying parameters about instructions.
  • Sample Template for RISC machine

26
GCC Porting and Retargeting
  • Porting to new machines/processors
  • Described in the "Using and Porting the GNU Compiler
    Collection (GCC)" manual, which is self-contained.
  • Done by describing the machine, not by describing
    how to compile for it.
  • Using GCC as a backend for other languages
  • Poorly documented.
  • Few examples.
  • See the GNAT, GNU Cobol, and Fortran ports.
  • In both cases, copy from similar ports.

27
How to port GCC
  • In the directory gcc-xxx/gcc/config/machine/:
  • machine.h
  • Contains C macros that define general attributes
    of the machine.
  • machine.md
  • Contains RTL expressions that define the
    instruction set.
  • Input to programs that produce .h and .c files.
  • machine.c
  • Machine-dependent functions, normally things too
    large to put cleanly into the two files above.

28
How to port GCC (contd)
29
gcc/config--Architecture characteristic key
  • H: A hardware implementation does not exist.
  • M: A hardware implementation is not currently
    being manufactured.
  • S: A free simulator does not exist.
  • L: Integer registers are narrower than 32 bits.
  • Q: Integer registers are at least 64 bits wide.
  • N: Memory is not byte addressable, and/or bytes
    are not eight bits.
  • F: Floating point arithmetic is not included in
    the instruction set.
  • I: Architecture does not use IEEE format floating
    point numbers.
  • C: Architecture does not have a single condition
    code register.
  • B: Architecture has delay slots.
  • D: Architecture has a stack that grows upward.
  • l: Port cannot use ILP32 mode integer arithmetic.

30
gcc/config--Architecture characteristic key
  • q: Port can use LP64 mode integer arithmetic.
  • r: Port can switch between ILP32 and LP64 at
    runtime. (Not necessarily supported by all
    subtargets.)
  • c: Port uses cc0.
  • p: Port does not use define_peephole.
  • f: Port does not define prologue and/or epilogue
    RTL expanders.
  • g: Port does not define
    TARGET_ASM_FUNCTION_(PRO|EPI)LOGUE.
  • m: Port does not use define_constants.
  • b: Port does not use '"* ..."' notation for output
    template code.
  • d: Port uses DFA scheduler descriptions.
  • h: Port contains old scheduler descriptions.
  • a: Port generates multiple inheritance thunks
    using TARGET_ASM_OUTPUT_MI(_VCALL)_THUNK.
  • t: All insns either produce exactly one assembly
    instruction, or trigger a define_split.
  • e: <arch>-elf is not a supported target.
  • s: <arch>-elf is the correct target to use with
    the simulator in /cvs/src.

31
gcc/config--Architecture characteristic key
  • Gcc-config.txt

32
define_peephole
  • In addition to instruction patterns, the 'md' file
    may contain definitions of machine-specific
    peephole optimizations.
  • The combiner does not notice certain peephole
    optimizations when the data flow in the program
    does not suggest that it should try them.
  • For example, sometimes two consecutive insns
    related in purpose can be combined even though
    the second one does not appear to use a register
    computed in the first one. A machine-specific
    peephole optimizer can detect such opportunities.

33
define_split
  • Often you can rewrite the single insn as a list
    of individual insns, each corresponding to one
    machine instruction.
  • The compiler splits the insn if there is a reason
    to believe that it might improve instruction or
    delay slot scheduling.
  • Splits are evaluated after the combiner pass and
    before the scheduling passes
  • Splits can optimize for speed and instruction
    length, so they are the perfect place to put this
    intelligence.
  • Example: if we are loading a small negative constant,
    we can save space and time by loading the
    positive value and then sign extending it.
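
The constant-loading example can be checked numerically. The sketch below is one reading of it, assuming the target has a short instruction that loads an 8-bit zero-extended immediate and a separate byte sign-extension instruction; both assumptions and the helper names `load_imm8` and `sign_extend_byte` are illustrative, not taken from the slides.

```c
#include <assert.h>
#include <stdint.h>

/* Step 1: a short "load immediate" that keeps only the low 8 bits of
 * the constant, zero-extended (so the register holds a positive value). */
static uint32_t load_imm8(int32_t value)
{
    return (uint32_t)value & 0xffu;
}

/* Step 2: sign-extend bit 7 of the register into bits 31..8. */
static int32_t sign_extend_byte(uint32_t reg)
{
    assert(reg <= 0xffu);               /* only a byte is expected here */
    return (int32_t)(reg ^ 0x80u) - 0x80;
}
```

For any constant from -128 to -1, applying the two steps in order reproduces the original value. For instance, -5 is loaded as 0xfb and sign-extended back to -5, so the pair can replace one longer 32-bit immediate load.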

34
define_expand
  • On some target machines, some standard pattern
    names for RTL generation cannot be handled with
    a single insn, but a sequence of RTL insns can
    represent them.
  • For these target machines, you can write a
    'define_expand' to specify how to generate the
    sequence of RTL.
  • A 'define_expand' is an RTL expression that looks
    almost like a 'define_insn' but, unlike the
    latter, a 'define_expand' is used only for RTL
    generation and it can produce more than one RTL
    insn.
  • The combiner pass only cares about reducing the
    number of instructions; it does not care about
    instruction lengths or speeds.

35
define_insn
  • Push and pop: movsi_push, movsi_pop
  • Move: movqi_unsigned_register_load,
    movqi_signed_register_load, movqi_internal, movhi,
    movhi_unsigned_register_load,
    movhi_signed_register_load, movhi_internal, movsi,
    movsi_internal, movdi, movdi_insn, movsf,
    movsf_internal, movsf_constant_store
  • Signed conversions from a smaller integer to a larger
    integer: extendqisi2, extendhisi2
  • Addition: add_to_stack, addsi3, addsi_regs,
    addsi_small_int, addsi_big_int, addsi_for_reload
  • Subtraction: subsi3
  • Multiplication: mulsidi3, umulsidi3, mulhisi3,
    umulhisi3, mulsi3
  • Negation: negsi2
  • Shifts: ashlsi3
36
define_insn
  • Logical operations: andsi3, iorsi3, xorsi3,
    one_cmplsi2
  • Comparisons: cmpsi, cmpsi_internal
  • Branches: beq, bne, blt, ble, bgt, bge, bltu, bleu,
    bgtu, bgeu
  • Calls and jumps: call, call_value, jump,
    indirect_jump, tablejump
  • Function prologues and epilogues: prologue, epilogue,
    return_from_func, leave_func, enter_func
  • Miscellaneous: nop, blockage

37
define_insn addsi_regs
  • (define_insn "addsi_regs"
  •   [(set (match_operand:SI 0 "register_operand" "=r")
  •         (plus:SI (match_operand:SI 1 "register_operand" "0")
  •                  (match_operand:SI 2 "register_operand" "r")))]
  •   ""
  •   "add %2, %0"
  • )
  • (set value x): chapter 9.15, p. 110
  • value ← x
  • (plus:m x y)
  • the sum x + y, carried out (computed) in mode m

38
define_insn addsi_regs (contd)
  • (match_operand:m n predicate constraint)
    chapter 10.4, p. 131
  • If the condition (predicate) is true, the
    expression matches as operand n.
  • n counts from 0.
  • For each number n, there is only one match_operand
    expression.
  • predicate is the name of a C function; it
    returns 0 on failure.
  • general_operand: checks that the operand is
    either a constant, a register, or a memory
    reference
  • register_operand: checks whether the operand
    is a register
  • immediate_operand: checks whether the operand
    is immediate data
  • constraint describes one kind of operand
    that is permitted:
  • r: register
  • m: any kind of memory operand
  • o: only offsettable memory operands
  • V: only non-offsettable memory operands
  • <: memory operand with autodecrement
    addressing
  • >: memory operand with autoincrement
    addressing
  • i: immediate integer operand
  • 0-9: an operand that matches the
    specified operand number is allowed.
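
The idea of a predicate as "a C function that returns 0 on failure" can be sketched with a toy operand model. This is not GCC's implementation: the real predicates operate on RTL expressions and perform further checks. The types `struct operand` and `predicate_fn` and the helper `operands_match` are hypothetical; only the three predicate names come from the slide.

```c
#include <assert.h>
#include <stddef.h>

/* Toy operand classification standing in for real RTL expressions. */
enum operand_kind {
    OPERAND_REGISTER, OPERAND_CONSTANT, OPERAND_MEMORY, OPERAND_LABEL
};
struct operand { enum operand_kind kind; };

typedef int (*predicate_fn)(const struct operand *);

/* Each predicate returns nonzero on a match and 0 on failure. */
static int register_operand(const struct operand *op)
{
    return op->kind == OPERAND_REGISTER;
}

static int immediate_operand(const struct operand *op)
{
    return op->kind == OPERAND_CONSTANT;
}

static int general_operand(const struct operand *op)
{
    return op->kind == OPERAND_REGISTER
        || op->kind == OPERAND_CONSTANT
        || op->kind == OPERAND_MEMORY;
}

/* A pattern matches when operand i satisfies predicate i for every i. */
static int operands_match(const predicate_fn preds[],
                          const struct operand ops[], size_t n)
{
    assert(preds != NULL && ops != NULL);
    for (size_t i = 0; i < n; i++)
        if (!preds[i](&ops[i]))
            return 0;
    return 1;
}
```

With three `register_operand` predicates, as in the addsi_regs pattern, a register-register-register operand list matches, while a list whose last operand is a constant does not and would have to be handled by another pattern.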

39
Build a GCC Cross Compiler
  • Binutils: Configure Binutils, Make, Make install
  • GCC: Machine Description, Configure GCC, Make,
    Make install
  • Result: GCC compiler
40
Build a GCC Cross Assembler and Cross Linker
  • Binutils version 2.14
  • configure --target=fr30-elf --prefix=dir
  • make
  • make install

41
Build a GCC Cross Compiler
  • GCC version 3.3.1
  • ../configure --target=fr30-elf --prefix=dir
    --enable-languages=c
  • make
  • make install

42
A simple c to test cross compiler
  • int test(int i, int j, int k)
  • {
  •     int a;
  •     int b;
  •     a = 49999999;
  •     b = 39999999;
  •     a += k;
  •     b += j;
  •     a++;
  •     b--;
  •     i += a + b;
  •     return i;
  • }
  • fr30-elf-gcc -S -O2 t.c

43
A simple c to test cross compiler (contd)
  • .file "t.c"
  • .text
  • .p2align 2
  • .globl test
  • .type test, @function
  • test:
  • mov r4, r2
    00000000000000000000010101000010
  • ldi32 50000000, r4
    00000000000000000000010000000100
    10111110101111000010000000
  • ldi32 39999998, r1
    00000000000000000000010000000001
    10011000100101100111111110
  • add r6, r4
    00000000000000000000000001100100
  • add r5, r1
    00000000000000000000000001010001
  • add r1, r4
    00000000000000000000000000010100
  • add r2, r4
    00000000000000000000000000100100
  • ret
  • .size test, .-test
  • .ident "GCC (GNU) 3.3.1 (cygming special)"
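
The constants 50000000 and 39999998 in the listing above suggest that the compiler folded the increment and decrement into the initial values at -O2. The sketch below places the test function next to a hand-folded version that mirrors the instruction sequence; `test_original`, `test_folded` and `check_same` are hypothetical names, and the folded form is an interpretation of the listing rather than compiler output.

```c
#include <assert.h>

/* The test function, written out step by step. */
static int test_original(int i, int j, int k)
{
    int a = 49999999;
    int b = 39999999;
    a += k;
    b += j;
    a++;
    b--;
    i += a + b;
    return i;
}

/* Hand-folded form mirroring the listing: two constant loads,
 * then four additions. */
static int test_folded(int i, int j, int k)
{
    int a = 50000000 + k;   /* 49999999 + 1, plus k */
    int b = 39999998 + j;   /* 39999999 - 1, plus j */
    return i + a + b;
}

/* Abort if the two forms ever disagree on the given input. */
static void check_same(int i, int j, int k)
{
    assert(test_original(i, j, k) == test_folded(i, j, k));
}
```

Both forms return 90000004 for the arguments (1, 2, 3), which is consistent with the four `add` instructions that follow the two `ldi32` loads.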

44
A simple c to test cross compiler (contd)
45
Summary
  • Studying RTL is more important than studying the MD.
  • Build the cross assembler and cross linker before
    building the cross compiler.
  • There is little documentation on porting GCC as a
    cross compiler.
  • Modifying an existing MD is easier than creating a
    new one.
  • "The main goal of GCC was to make a good, fast
    compiler for machines in the class that the GNU
    system aims to run on: 32-bit machines that
    address 8-bit bytes and have several general
    registers." -- Richard Stallman.
  • It seems that, for a GIEE student, designing a new
    CPU is easier than building a cross compiler.
  • http://gcc.gnu.org