Prelude to Multiprocessing

Provided by: usfi
Learn more at: https://www.cs.usfca.edu
1
Prelude to Multiprocessing
  • Detecting cpu and system-board capabilities with
    CPUID and the MP Configuration Table

2
CPUID
  • Recent Intel processors provide a cpuid
    instruction (opcode 0x0F, 0xA2) to assist
    software in detecting a CPU's capabilities
  • If it's implemented, this instruction can be
    executed in any of the processor modes, and at
    any privilege level
  • But it may not be implemented (e.g., 8086, 80286,
    80386)

3
Pentium EFLAGS register
  Bits 31-16:
    31-22  21  20   19   18  17  16
      0    ID  VIP  VIF  AC  VM  RF
  Bits 15-0:
    15  14  13-12  11  10  9   8   7   6   5  4   3  2   1  0
    0   NT  IOPL   OF  DF  IF  TF  SF  ZF  0  AF  0  PF  1  CF
Software can toggle the ID-bit (bit 21) in the
32-bit EFLAGS register if the processor is
capable of executing the cpuid instruction
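
To make the bit layout concrete, here is a minimal Python sketch that names the flags set in a 32-bit EFLAGS image. The bit positions follow the Pentium layout on this slide; the function name `decode_eflags` and the sample value are illustrative assumptions, and the code only decodes a number, it does not read a real register.

```python
# Bit positions taken from the Pentium EFLAGS layout on the slide.
EFLAGS_BITS = {
    "CF": 0, "PF": 2, "AF": 4, "ZF": 6, "SF": 7, "TF": 8, "IF": 9,
    "DF": 10, "OF": 11, "NT": 14, "RF": 16, "VM": 17, "AC": 18,
    "VIF": 19, "VIP": 20, "ID": 21,
}

def decode_eflags(value):
    """Return (names of the flags that are set, IOPL) for an EFLAGS image."""
    flags = [name for name, bit in EFLAGS_BITS.items() if (value >> bit) & 1]
    iopl = (value >> 12) & 0b11   # IOPL is the two-bit field at bits 13-12
    return flags, iopl

# Sample image chosen for illustration: ID, IF, ZF and PF set.
sample = 0x00200246
flags, iopl = decode_eflags(sample)
print(flags, iopl)

# Toggling the ID-bit is a single XOR with bit 21.
toggled = sample ^ (1 << 21)
```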
4
But what if there's no EFLAGS?
  • The early Intel processors (8086, 80286) did not
    implement 32-bit registers
  • The FLAGS register was only 16 bits wide
  • So there was no ID-bit that software could try to
    toggle
  • How can software be sure that the 32-bit EFLAGS
    register exists within the CPU?

5
Detecting 32-bit processors
  • There's a subtle difference in the way the
    logical shift/rotate instructions work when
    register CL contains the shift-factor
  • On the 32-bit processors (e.g., 80386) the value
    in CL is truncated to 5 bits, but not so on the
    original 16-bit 8086 (Intel's manuals date the
    truncation to the 80286, so the test cannot by
    itself rule out an 80286)
  • Software can exploit this distinction in order
    to tell if EFLAGS is implemented

6
Detecting EFLAGS
  • Here's a test for the presence of EFLAGS:
        mov   $-1, %ax      # a nonzero value
        mov   $32, %cl      # shift-factor of 32
        shl   %cl, %ax      # do logical shift
        or    %ax, %ax      # test result in AX
        jnz   is32bit       # EFLAGS present
        jmp   is16bit       # EFLAGS absent
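
The logic of this test can be modelled in a few lines of Python. This is only a sketch of the arithmetic, not a probe of real hardware: `masks_count` is a hypothetical parameter that stands for "this CPU keeps only the low 5 bits of CL", and which processors actually do so is an input to the model rather than something the code decides.

```python
def shl16(value, count, masks_count):
    """Model a 16-bit SHL by CL; masks_count=True keeps only 5 bits of the count."""
    if masks_count:
        count &= 0x1F                 # 32 becomes 0, so nothing is shifted
    return (value << count) & 0xFFFF  # keep the result within 16-bit AX

def shift_test_says_32bit(masks_count):
    ax = 0xFFFF                       # mov $-1, %ax
    cl = 32                           # mov $32, %cl
    ax = shl16(ax, cl, masks_count)   # shl %cl, %ax
    return ax != 0                    # jnz is32bit / jmp is16bit

print(shift_test_says_32bit(True), shift_test_says_32bit(False))
```

With the count truncated, AX keeps its nonzero value and the test branches to is32bit; with the full count of 32, every bit is shifted out and the test falls through to is16bit.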

7
Testing for ID-bit toggle
  • Here's a test for the presence of the CPUID
    instruction:
        pushfl              # copy EFLAGS contents
        pop   %eax          # to accumulator register
        mov   %eax, %edx    # save a duplicate image
        btc   $21, %eax     # toggle the ID-bit (bit 21)
        push  %eax          # copy revised contents
        popfl               # back into EFLAGS
        pushfl              # copy EFLAGS contents
        pop   %eax          # back into accumulator
        xor   %edx, %eax    # do XOR with prior value
        bt    $21, %eax     # did ID-bit get toggled?
        jc    y_cpuid       # yes, can execute cpuid
        jmp   n_cpuid       # else cpuid unimplemented
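
The same sequence can be traced in Python against a stand-in processor. `FakeCpu` and its `writable_mask` are hypothetical names invented for this sketch: the mask simply marks which EFLAGS bits keep a value written by popfl, which is the one property the test relies on.

```python
ID_BIT = 1 << 21

class FakeCpu:
    """Hypothetical stand-in: only bits in writable_mask can be changed by popfl."""
    def __init__(self, eflags, writable_mask):
        self.eflags = eflags
        self.writable_mask = writable_mask

    def pushfl(self):
        return self.eflags

    def popfl(self, value):
        keep = self.eflags & ~self.writable_mask
        self.eflags = keep | (value & self.writable_mask)

def has_cpuid(cpu):
    eax = cpu.pushfl()         # pushfl / pop %eax
    edx = eax                  # mov %eax, %edx
    eax ^= ID_BIT              # btc $21, %eax
    cpu.popfl(eax)             # push %eax / popfl
    eax = cpu.pushfl()         # pushfl / pop %eax
    eax ^= edx                 # xor %edx, %eax
    return bool(eax & ID_BIT)  # bt $21, %eax / jc y_cpuid

with_id = FakeCpu(eflags=0x2, writable_mask=ID_BIT | 0xFFF)
without_id = FakeCpu(eflags=0x2, writable_mask=0xFFF)
print(has_cpuid(with_id), has_cpuid(without_id))
```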

8
How does CPUID work?
  • Step 1: load value 0 into register EAX
  • Step 2: execute cpuid instruction
  • Step 3: verify the GenuineIntel character-string
    in registers (EBX, EDX, ECX)
  • Step 4: find the maximum CPUID input-value in the
    EAX register
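
Step 3 is easier to see with the bytes written out. The sketch below assumes the three register values have already been obtained by some other means (Python cannot execute cpuid directly); it packs them in the order EBX, EDX, ECX as little-endian 32-bit words and reads the result as ASCII. The function name `vendor_string` is an illustrative assumption.

```python
import struct

def vendor_string(ebx, edx, ecx):
    """Join the three registers, in the order EBX, EDX, ECX, into 12 ASCII bytes."""
    return struct.pack("<III", ebx, edx, ecx).decode("ascii")

# Register images that spell out the Intel vendor string.
EBX = 0x756E6547   # "Genu"
EDX = 0x49656E69   # "ineI"
ECX = 0x6C65746E   # "ntel"

vendor = vendor_string(EBX, EDX, ECX)
print(vendor)
```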

9
Version and Features
  • load 1 into EAX and execute CPUID
  • Processor model and stepping information is
    returned in register EAX
  • Layout of the value returned in EAX:
      Bits 27-20  Extended Family ID
      Bits 19-16  Extended Model ID
      Bits 13-12  Type
      Bits 11-8   Family ID
      Bits 7-4    Model
      Bits 3-0    Stepping ID
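
As a rough illustration of how software pulls these fields apart, the following Python sketch shifts and masks an EAX image according to the layout above. The function name `decode_version` and the sample value are assumptions made for the example; only the raw fields are extracted, with no further interpretation.

```python
def decode_version(eax):
    """Split the EAX value returned by cpuid (input value 1) into its raw fields."""
    return {
        "stepping": eax & 0xF,             # bits 3-0
        "model": (eax >> 4) & 0xF,         # bits 7-4
        "family": (eax >> 8) & 0xF,        # bits 11-8
        "type": (eax >> 12) & 0x3,         # bits 13-12
        "ext_model": (eax >> 16) & 0xF,    # bits 19-16
        "ext_family": (eax >> 20) & 0xFF,  # bits 27-20
    }

info = decode_version(0x000106A5)   # sample value chosen for illustration
print(info)
```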
10
Some Feature Flags in EDX
  Bit 28  HTT   HyperThreading Technology (1 = yes, 0 = no)
  Bit 9   APIC  Advanced Programmable Interrupt Controller on-chip (1 = yes, 0 = no)
  Bit 3   PSE   Page-Size Extensions (1 = yes, 0 = no)
  Bit 2   DE    Debugging Extensions (1 = yes, 0 = no)
  Bit 1   VME   Virtual-8086 Mode Enhancements (1 = yes, 0 = no)
  Bit 0   FPU   Floating-Point Unit on-chip (1 = yes, 0 = no)
11
Some Feature Flags in ECX
  Bit 5   VMX   Virtual Machine Extensions (1 = yes, 0 = no)
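
Testing a feature flag is a one-bit check. The sketch below covers only the EDX and ECX flags named on these two slides; the function name `feature_names` and the sample register values are made up for illustration.

```python
# Only the flags named on the slides; real registers carry many more.
EDX_FLAGS = {"FPU": 0, "VME": 1, "DE": 2, "PSE": 3, "APIC": 9, "HTT": 28}
ECX_FLAGS = {"VMX": 5}

def feature_names(reg_value, table):
    """Return the names in table whose bit is set in reg_value."""
    return [name for name, bit in table.items() if (reg_value >> bit) & 1]

# Hypothetical register images: FPU, APIC and HTT in EDX, VMX in ECX.
edx = (1 << 28) | (1 << 9) | (1 << 0)
ecx = 1 << 5

print(feature_names(edx, EDX_FLAGS), feature_names(ecx, ECX_FLAGS))
```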
12
Multiprocessor Specification
  • It's an industry standard, allowing OS software
    to use multiple processors in a uniform way
  • Software searches in three regions of the
    physical address-space below 1 megabyte for a
    paragraph-aligned data-structure of length
    16 bytes called the MP Floating Pointer
    Structure:
  • Search in lowest KB of the Extended BIOS Data Area
  • Search in topmost KB of conventional 640K RAM
  • Search in the 64KB ROM-BIOS (0xF0000-0xFFFFF)
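
The search can be sketched in Python over a byte image of the first megabyte. Two details here come from the MultiProcessor Specification rather than from these slides: the structure begins with the ASCII signature `_MP_`, and its 16 bytes sum to zero modulo 256. The function names are illustrative, `ebda_base` is assumed to be supplied by the caller, and the demonstration runs on a synthetic image, not on real memory.

```python
import struct

def checksum_ok(block):
    """All bytes of the structure must add up to zero modulo 256."""
    return sum(block) & 0xFF == 0

def find_mp_floating_pointer(memory, start, end):
    """Scan memory[start:end] on 16-byte (paragraph) boundaries."""
    for addr in range(start, end, 16):
        block = memory[addr:addr + 16]
        if len(block) == 16 and block[:4] == b"_MP_" and checksum_ok(block):
            return addr
    return None

def search_regions(memory, ebda_base):
    regions = [
        (ebda_base, ebda_base + 0x400),  # lowest KB of the EBDA
        (0x9FC00, 0xA0000),              # topmost KB of 640K conventional RAM
        (0xF0000, 0x100000),             # 64KB ROM-BIOS
    ]
    for start, end in regions:
        addr = find_mp_floating_pointer(memory, start, end)
        if addr is not None:
            return addr
    return None

# Demonstration on a synthetic 1MB image (not real hardware memory).
image = bytearray(0x100000)
block = bytearray(b"_MP_" + struct.pack("<I", 0x000F5000)
                  + bytes([1, 4, 0, 0, 0, 0, 0, 0]))
block[10] = (-sum(block)) & 0xFF         # fix up the checksum byte
image[0xF1230:0xF1240] = block
found = search_regions(bytes(image), ebda_base=0x9F800)
print(hex(found))
```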

13
MP Floating Pointer Structure
  • This structure may contain an ID-number for one of
    a small number of standard SMP system
    architectures, or may contain the memory address
    for a more extensive MP Configuration Table whose
    entries specify a more customized system
    architecture
  • Our classroom machines employ the latter of these
    two options
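
A short Python sketch of how software might tell the two options apart once the 16-byte structure has been located. The field offsets are taken from the MultiProcessor Specification, not from these slides: a 4-byte signature, a 4-byte physical address, then one byte each for length, specification revision, checksum, and the first feature byte, where a nonzero feature byte selects a default configuration and zero means a configuration table is present. The function name and the sample blocks are hypothetical.

```python
import struct

def parse_mp_floating_pointer(block):
    """Decide between a default configuration and an MP Configuration Table."""
    signature, table_addr, length, spec_rev, checksum, feature1 = \
        struct.unpack_from("<4sIBBBB", block)
    if signature != b"_MP_":
        raise ValueError("not an MP Floating Pointer Structure")
    if feature1 != 0:
        # Nonzero: ID-number of one of the standard SMP architectures.
        return {"default_config": feature1, "table_address": None}
    # Zero: the address field points at the MP Configuration Table.
    return {"default_config": None, "table_address": table_addr}

# Two made-up 16-byte examples (checksum bytes left at zero for brevity).
with_table = b"_MP_" + struct.pack("<I", 0x000F5000) + bytes([1, 4, 0, 0, 0, 0, 0, 0])
with_default = b"_MP_" + struct.pack("<I", 0) + bytes([1, 4, 0, 5, 0, 0, 0, 0])

print(parse_mp_floating_pointer(with_table))
print(parse_mp_floating_pointer(with_default))
```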

14
The processor's Local-APIC
  • The purpose of each processor's APIC is to allow
    CPUs in a multiprocessor system to transmit
    messages among one another and to manage the
    delivery of interrupts from the various
    peripheral devices to one or more CPUs in a
    dynamically determined way
  • The Local-APIC has a variety of registers which
    are memory mapped to paragraph-aligned
    addresses in the 4KB page at 0xFEE00000

15
Local-APIC's register-space
  (Diagram: within the 4GB physical address-space, RAM
  starts at 0x00000000 and the APIC registers sit at
  0xFEE00000)
16
Each CPU has its own timer!
  • Four of the Local-APIC registers are used to
    implement a programmable timer
  • It can privately deliver a periodic interrupt
    just to its own CPU
  • 0xFEE00320 Timer Vector register
  • 0xFEE00380 Initial Count register
  • 0xFEE00390 Current Count register
  • 0xFEE003E0 Divider Configuration register

17
Timer's Local Vector Table
0xFEE00320
  Bit 17    MODE  0 = one-shot, 1 = periodic
  Bit 16    MASK  0 = unmasked, 1 = masked
  Bit 12    BUSY  0 = not busy, 1 = busy
  Bits 7-0  Interrupt ID-number
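
Composing a value for this register is a matter of placing each field at its bit position. The Python sketch below only builds and inspects the number; actually writing it to 0xFEE00320 requires kernel-level code that is outside the sketch. The function names and the vector 0x20 are assumptions chosen for the example.

```python
TIMER_LVT_ADDR = 0xFEE00320   # register address from the slide

def timer_lvt_value(vector, periodic, masked):
    """Build a Timer LVT value from the interrupt ID-number, MODE and MASK."""
    if not 0 <= vector <= 0xFF:
        raise ValueError("interrupt ID-number must fit in bits 7-0")
    return vector | (int(masked) << 16) | (int(periodic) << 17)

def is_busy(lvt_value):
    """BUSY is reported by the hardware in bit 12."""
    return bool((lvt_value >> 12) & 1)

periodic_unmasked = timer_lvt_value(0x20, periodic=True, masked=False)
oneshot_masked = timer_lvt_value(0x20, periodic=False, masked=True)
print(hex(periodic_unmasked), hex(oneshot_masked))
```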
18
In-class exercise
  • Run the cpuid.cpp Linux application (on our
    course website) to see if the CPUs in our
    classroom implement HyperThreading (i.e.,
    multiple logical processors within one CPU)
  • Then run the smpinfo.cpp application, to see if
    the MP Base Configuration Table has entries for
    more than one processor
  • If both results hold true, then we can write our
    own multiprocessing software in here!

19
In-class exercise 2
  • Run the apictick.s demo (on our website) to
    observe the APIC's periodic interrupt drawing
    bytes onto the screen
  • It executes for ten milliseconds (the 8254 is
    used to create this timed delay)
  • Try reprogramming the APIC's Divider
    Configuration register, to cut the interrupt
    frequency in half (or to double it)
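
As a starting point for that last experiment, here is a small Python sketch of the divider encoding. The table of codes is taken from Intel's Local-APIC documentation, not from these slides: the divisor is encoded in bits 0, 1 and 3 of the register, with bit 2 left at zero. The helper `tick_rate`, the 100 MHz clock and the initial count of 1000 are illustrative assumptions, and the rate it returns is only an approximation of the interrupt frequency in periodic mode.

```python
# Divisor -> register value (bits 3, 1, 0 carry the code; bit 2 stays 0).
DIVIDE_CODES = {
    2: 0b0000, 4: 0b0001, 8: 0b0010, 16: 0b0011,
    32: 0b1000, 64: 0b1001, 128: 0b1010, 1: 0b1011,
}

def divide_config_value(divisor):
    """Return the value to store in the register at 0xFEE003E0."""
    if divisor not in DIVIDE_CODES:
        raise ValueError("divisor must be 1, 2, 4, 8, 16, 32, 64 or 128")
    return DIVIDE_CODES[divisor]

def tick_rate(clock_hz, divisor, initial_count):
    """Approximate interrupts per second for a periodic timer."""
    return clock_hz / divisor / initial_count

# Doubling the divisor halves the interrupt frequency.
before = tick_rate(100e6, 16, 1000)
after = tick_rate(100e6, 32, 1000)
print(divide_config_value(16), divide_config_value(32), before, after)
```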