Prelude to Multiprocessing presentation

1
Prelude to Multiprocessing

Detecting cpu and system-board capabilities with
CPUID and the MP Configuration Table

2
CPUID

Recent Intel processors provide a cpuid
instruction (opcode 0x0F, 0xA2) to assist
software in detecting a CPU's capabilities
If it's implemented, this instruction can be
executed in any of the processor modes, and at
any of its four privilege levels
But this cpuid instruction might not be
implemented (e.g., 8086, 80286, 80386)

3
Intel x86 EFLAGS register
bit:    31-22  21  20   19   18  17  16
field:  0      ID  VIP  VIF  AC  VM  RF

bit:    15  14  13-12  11  10  9   8   7   6   5  4   3  2   1  0
field:  0   NT  IOPL   OF  DF  IF  TF  SF  ZF  0  AF  0  PF  1  CF
Software can toggle the ID-bit (bit 21) in the
32-bit EFLAGS register if the processor is
capable of executing the cpuid instruction
4
But what if there's no EFLAGS?

The early Intel processors (8086, 80286) did not
implement any 32-bit registers
The FLAGS register was only 16 bits wide
So there was no ID-bit that software could try to
toggle (to see if cpuid existed)
How can software be sure that the 32-bit EFLAGS
register exists within the CPU?

5
Detecting 32-bit processors

There's a subtle difference in the way the
logical shift/rotate instructions work when
register CL contains the shift-factor
On the 32-bit processors (e.g., 80386) the value
in CL is truncated to 5 bits, but not so on the
original 16-bit 8086
Software can exploit this distinction, in order
to tell if EFLAGS is implemented
Caution: Intel's manuals state that the 80286 also
truncates the count to 5 bits, so this shift test
by itself separates only the 8086 from later CPUs

6
Detecting EFLAGS

Here's a test for the presence of EFLAGS
    mov   $-1, %ax       # a nonzero value
    mov   $32, %cl       # shift-factor of 32
    shl   %cl, %ax       # do logical shift
    or    %ax, %ax       # test result in AX
    jnz   is32bit        # EFLAGS present
    jmp   is16bit        # EFLAGS absent
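The shift-count behavior that this test relies on can be modeled in a few lines of Python. This is only a simulation sketch of the shift step, not code that probes real hardware; the function name shl16 and its mask_count parameter are invented here for illustration.

```python
def shl16(value, count, mask_count):
    """Model a 16-bit logical left shift of AX by the count held in CL.

    mask_count=True mimics a processor that truncates CL to 5 bits;
    mask_count=False mimics a processor that uses the full 8-bit count.
    """
    count &= 0xFF                      # CL is an 8-bit register
    if mask_count:
        count &= 0x1F                  # keep the low 5 bits, so 32 becomes 0
    return (value << count) & 0xFFFF   # AX holds only 16 bits


ax_masked = shl16(0xFFFF, 32, mask_count=True)      # count treated as 0
ax_unmasked = shl16(0xFFFF, 32, mask_count=False)   # every bit shifted out
print(hex(ax_masked), hex(ax_unmasked))
```

With the count truncated, AX keeps its nonzero value and the jnz branch is taken; without truncation AX becomes zero and execution falls through to the jmp.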

7
Testing for ID-bit toggle

Here's a test for the presence of the CPUID
instruction
    pushfl               # copy EFLAGS contents
    pop   %eax           # to accumulator register
    mov   %eax, %edx     # save a duplicate image
    btc   $21, %eax      # toggle the ID-bit (bit 21)
    push  %eax           # copy revised contents
    popfl                # back into EFLAGS
    pushfl               # copy EFLAGS contents
    pop   %eax           # back into accumulator
    xor   %edx, %eax     # do XOR with prior value
    bt    $21, %eax      # did ID-bit get toggled?
    jc    y_cpuid        # yes, can execute cpuid
    jmp   n_cpuid        # else cpuid unimplemented
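The logic of the toggle test may be easier to follow in a small Python model. This is a sketch under stated assumptions: FlagsModel is a made-up stand-in for the EFLAGS register in which only the bits named in writable_mask can be changed, and has_cpuid mirrors the assembly steps one for one. It does not touch the real CPU.

```python
ID_BIT = 1 << 21   # position of the ID-bit in EFLAGS


class FlagsModel:
    """Toy EFLAGS register: only bits set in writable_mask can change."""

    def __init__(self, value, writable_mask):
        self.value = value
        self.writable_mask = writable_mask

    def read(self):                    # stands in for pushfl / pop
        return self.value

    def write(self, new_value):        # stands in for push / popfl
        kept = self.value & ~self.writable_mask
        self.value = kept | (new_value & self.writable_mask)


def has_cpuid(flags):
    """Return True if the ID-bit could be toggled in the modeled register."""
    old = flags.read()                 # save a duplicate image
    flags.write(old ^ ID_BIT)          # btc $21, then write back
    changed = flags.read() ^ old       # XOR with prior value
    return bool(changed & ID_BIT)      # bt $21 / jc


print(has_cpuid(FlagsModel(0x2, writable_mask=ID_BIT)))   # ID-bit writable
print(has_cpuid(FlagsModel(0x2, writable_mask=0)))        # ID-bit stuck
```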

8
How does CPUID work?

Step 1: load value 0 into register EAX
Step 2: execute cpuid instruction
Step 3: verify the "GenuineIntel" character-string
in registers (EBX, EDX, ECX)
Step 4: find the maximum CPUID input-value in the
EAX register
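Step 3 can be illustrated by showing how twelve ASCII characters are packed into the three registers. The sketch below assumes the register values have already been obtained somehow; the three constants are simply the little-endian encoding of "GenuineIntel" written out by hand, and vendor_string is a name chosen for this example.

```python
import struct


def vendor_string(ebx, edx, ecx):
    """Join the three 32-bit register values in EBX, EDX, ECX order."""
    return struct.pack("<III", ebx, edx, ecx).decode("ascii")


# "Genu" in EBX, "ineI" in EDX, "ntel" in ECX (low byte holds first char)
vendor = vendor_string(0x756E6547, 0x49656E69, 0x6C65746E)
print(vendor)
```

Note the register order: EBX first, then EDX, then ECX, which is not alphabetical.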

9
Version and Features

load 1 into EAX and execute CPUID
Processor model and stepping information is
returned in register EAX

bits 27-20  Extended Family ID
bits 19-16  Extended Model ID
bits 13-12  Type
bits 11-8   Family ID
bits 7-4    Model
bits 3-0    Stepping ID
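The fields of this EAX value can be pulled apart with shifts and masks. The following Python sketch assumes the value has already been read; decode_signature is a name invented here, the sample value is only an example, and the rule for combining the extended fields (which the slide does not state) follows Intel's CPUID documentation.

```python
def decode_signature(eax):
    """Split the CPUID(1) EAX value into its version fields."""
    stepping = eax & 0xF
    model = (eax >> 4) & 0xF
    family = (eax >> 8) & 0xF
    cpu_type = (eax >> 12) & 0x3
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF
    # Extended fields are folded in only for certain Family ID values
    display_family = family + ext_family if family == 0xF else family
    display_model = (ext_model << 4) | model if family in (0x6, 0xF) else model
    return {
        "stepping": stepping,
        "model": display_model,
        "family": display_family,
        "type": cpu_type,
    }


info = decode_signature(0x000306A9)   # sample value, not read from hardware
print(info)
```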
10
Some Feature Flags in EDX
bit 28  HTT   HyperThreading Technology (1 = yes, 0 = no)
bit 13  PGE   Page Global Entries (1 = yes, 0 = no)
bit 9   APIC  Advanced Programmable Interrupt Controller on-chip (1 = yes, 0 = no)
bit 3   PSE   Page-Size Extensions (1 = yes, 0 = no)
bit 2   DE    Debugging Extensions (1 = yes, 0 = no)
bit 1   VME   Virtual-8086 Mode Enhancements (1 = yes, 0 = no)
bit 0   FPU   Floating-Point Unit on-chip (1 = yes, 0 = no)
11
Some Feature Flags in ECX
bit 5   VMX   Virtual Machine Extensions (1 = yes, 0 = no)
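Testing these feature flags comes down to checking individual bits. A minimal Python sketch, covering only the bits named on these two slides; the tables, the function name feature_names, and the sample register values are all made up for illustration.

```python
# Bit positions taken from the two feature-flag slides
EDX_FLAGS = {0: "FPU", 1: "VME", 2: "DE", 3: "PSE", 9: "APIC", 13: "PGE", 28: "HTT"}
ECX_FLAGS = {5: "VMX"}


def feature_names(value, table):
    """List the names of the flags in `table` whose bit is set in `value`."""
    return [name for bit, name in sorted(table.items()) if value & (1 << bit)]


sample_edx = (1 << 0) | (1 << 9) | (1 << 28)   # hypothetical EDX contents
sample_ecx = 1 << 5                            # hypothetical ECX contents
print(feature_names(sample_edx, EDX_FLAGS))
print(feature_names(sample_ecx, ECX_FLAGS))
```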
12
Multiprocessor Specification

It's an industry standard, allowing OS software
to use multiple processors in a uniform way
OS software searches in three regions of the
physical address-space below 1-megabyte for a
paragraph-aligned data-structure of length
16-bytes called the MP Floating Pointer
Structure
Search in lowest KB of Extended BIOS Data Area
Search in topmost KB of conventional 640K RAM
Search in the 128KB ROM-BIOS (0xE0000-0xFFFFF)
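The search itself might look like the Python sketch below, which scans a copy of one memory region at 16-byte (paragraph) steps. Several details are assumptions not stated on the slide: the "_MP_" signature and the rule that all 16 bytes sum to zero modulo 256 come from the MP Specification 1.4, and the function names and the fabricated test image are purely illustrative.

```python
import struct


def find_mp_floating_pointer(memory, start, length):
    """Return the offset of a valid structure in `memory`, or None.

    `start` is assumed to be paragraph-aligned (a multiple of 16).
    """
    end = min(start + length, len(memory))
    for offset in range(start, end - 15, 16):
        block = memory[offset:offset + 16]
        if block[:4] == b"_MP_" and sum(block) % 256 == 0:
            return offset
    return None


def make_fake_structure(table_address):
    """Build a 16-byte structure with a correct checksum, for testing only."""
    body = bytearray(b"_MP_" + struct.pack("<IBB", table_address, 1, 4) + bytes(6))
    body[10] = (-sum(body)) % 256      # checksum byte makes the sum zero
    return bytes(body)


image = bytearray(1024)                # stand-in for a 1 KB memory region
image[0x40:0x50] = make_fake_structure(0x000F1400)
print(find_mp_floating_pointer(bytes(image), 0, 1024))
```

A real utility would run the same scan over each of the three regions in turn, stopping at the first match.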

13
MP Floating Pointer Structure

This structure may contain an ID-number for one of
a small number of standard SMP system
architectures, or may contain the memory address
for a more extensive MP Configuration Table
having entries that specify a customized system
architecture
The machines in our classroom employ the latter
of these two options
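Choosing between the two options can be sketched in Python as follows. The field layout used here (signature, 32-bit table address, length, revision, checksum, then a feature byte that is nonzero for a standard architecture) is taken from the MP Specification 1.4 rather than from the slide, the function name is invented, and the sample bytes are fabricated. The checksum is not verified in this sketch.

```python
import struct


def parse_floating_pointer(block):
    """Report which of the two options a 16-byte structure selects."""
    (signature, table_address, length, spec_rev,
     checksum, feature1) = struct.unpack_from("<4sIBBBB", block)
    if signature != b"_MP_":
        raise ValueError("not an MP Floating Pointer Structure")
    if feature1 != 0:
        return ("default configuration", feature1)
    return ("configuration table", table_address)


# Fabricated example: feature byte is 0, so a table address is supplied
sample = b"_MP_" + struct.pack("<IBBBB", 0x000F1400, 1, 4, 0, 0) + bytes(4)
print(parse_floating_pointer(sample))
```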

14
An example record

The MP Configuration Table will contain a record
for each logical processor

bytes 12-19  reserved (0)
bytes 8-11   Feature Flags
bytes 4-7    CPU signature (stepping, model, family)
byte 3       CPU Flags: BP (bit 1), EN (bit 0)
byte 2       Local-APIC version
byte 1       Local-APIC ID
byte 0       Entry Type (0)
BP = Bootstrap Processor (1 = yes, 0 = no), EN = Enabled (1 = yes, 0 = no)
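A processor record can be decoded with a single unpack call. This Python sketch assumes the 20-byte layout from the MP Specification 1.4 (entry type, Local-APIC ID, Local-APIC version, CPU flags, signature, feature flags, then 8 reserved bytes); parse_processor_entry is a hypothetical helper and the sample record is invented.

```python
import struct


def parse_processor_entry(entry):
    """Decode one 20-byte processor entry from the MP Configuration Table."""
    (entry_type, apic_id, apic_version, cpu_flags,
     signature, features) = struct.unpack_from("<BBBBII", entry)
    if entry_type != 0:
        raise ValueError("not a processor entry")
    return {
        "apic_id": apic_id,
        "apic_version": apic_version,
        "enabled": bool(cpu_flags & 0x1),      # EN is bit 0
        "bootstrap": bool(cpu_flags & 0x2),    # BP is bit 1
        "signature": signature,
        "features": features,
    }


# Invented record: APIC ID 0, version 0x14, enabled bootstrap processor
sample = struct.pack("<BBBBII8x", 0, 0, 0x14, 0x3, 0x00000F29, 0xBFEBFBFF)
info = parse_processor_entry(sample)
print(info)
```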
15
Our mpinfo.cpp utility

We created a Linux utility that will display the
system-information contained in the MP
Configuration Table (in hex format)
You can refer to the MP Specification 1.4
document (online) to interpret this display
This utility needs a device-driver dram.c to be
pre-installed (in order that it be able to
directly access the system's memory)

16
A processor's Local-APIC

The purpose of each processor's APIC is to allow
the CPUs in a multiprocessor system to send
messages to one another and to manage the
delivery of the interrupt-requests from the
various peripheral devices to one (or more) of
the CPUs in a dynamically programmable way
Each processor's Local-APIC has a variety of
registers, all memory mapped to
paragraph-aligned addresses within the 4KB page
at physical-address 0xFEE00000

17
Local-APIC's register-space
[Diagram: the 4GB physical address-space, with RAM starting at 0x00000000 and the Local-APIC register page at 0xFEE00000]
18
Analogies with the PIC

Among the registers in a Local-APIC are these
(which had analogues in the older 8259 PIC's
design):
IRR Interrupt Request Register (256-bits)
ISR In-Service Register (256-bits)
TMR Trigger-Mode Register (256-bits)
For each of these, its 256-bits are divided among
eight 32-bit register addresses
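To see how 256 bits map onto eight register addresses, the Python sketch below computes where the bit for a given interrupt number lives. The starting offsets (ISR at 0x100, TMR at 0x180, IRR at 0x200) do not appear on the slide; they are taken from Intel's Local-APIC register map. The function name locate_vector_bit is made up.

```python
APIC_BASE = 0xFEE00000
# Offset of the first 32-bit register in each 256-bit group
BANK_OFFSET = {"ISR": 0x100, "TMR": 0x180, "IRR": 0x200}


def locate_vector_bit(bank, vector):
    """Return (register address, bit number) for one interrupt vector."""
    if not 0 <= vector <= 255:
        raise ValueError("vector must be in the range 0..255")
    index, bit = divmod(vector, 32)        # which register, which bit
    # Registers are paragraph-aligned, so consecutive ones are 16 bytes apart
    return APIC_BASE + BANK_OFFSET[bank] + 16 * index, bit


address, bit = locate_vector_bit("ISR", 0x20)
print(hex(address), bit)
```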

19
New way to do EOI

Instead of using a special End-Of-Interrupt
command-byte, the Local-APIC contains a dedicated
write-only register (named the EOI Register)
which an Interrupt Handler writes to when it is
ready to signal an EOI

    # issuing EOI to the Local-APIC
    mov   $0xFEE00000, %ebx      # address of the cpu's Local-APIC
    movl  $0, %fs:0xB0(%ebx)     # write any value into EOI register
Here we assume segment-register FS holds the
selector for a segment-descriptor for a writable
4GB-size expand-up data-segment whose
base-address equals 0
20
Each CPU has its own timer!

Four of the Local-APIC registers are used to
implement a programmable timer
It can privately deliver a periodic interrupt (or
one-shot interrupt) just to its own CPU
0xFEE00320 Timer Vector register
0xFEE00380 Initial Count register
0xFEE00390 Current Count register
0xFEE003E0 Divider Configuration register

21
Timer's Local Vector Table
0xFEE00320
bits 7-0  Interrupt ID-number
bit 12    BUSY (0 = not busy, 1 = busy)
bit 16    MASK (0 = unmasked, 1 = masked)
bit 17    MODE (0 = one-shot, 1 = periodic)
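A value for this register can be assembled from the fields just listed. The Python sketch below only computes the 32-bit number; it does not write to the APIC. The helper name lvt_timer_value and the choice of interrupt number 0x20 are assumptions made for the example.

```python
def lvt_timer_value(vector, periodic, masked):
    """Compose the timer's LVT register value from its three fields."""
    if not 0 <= vector <= 0xFF:
        raise ValueError("interrupt ID-number must fit in bits 7-0")
    value = vector                     # bits 7-0: interrupt ID-number
    value |= int(masked) << 16         # bit 16: MASK
    value |= int(periodic) << 17       # bit 17: MODE
    return value                       # BUSY (bit 12) is left as 0


print(hex(lvt_timer_value(0x20, periodic=True, masked=False)))
```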
22
Timer's Divide-Configuration
0xFEE003E0
bits 31-4 and bit 2: reserved (0)
Divider-Value field (bits 3, 1, and 0)
000  divide by 2
001  divide by 4
010  divide by 8
011  divide by 16
100  divide by 32
101  divide by 64
110  divide by 128
111  divide by 1
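Because the three-bit code is split around bit 2, turning a divisor into a register value takes a little bit-shuffling. A minimal Python sketch, assuming the encoding listed above; DIVIDER_CODE and divide_config_value are names chosen for illustration.

```python
# Three-bit codes from the slide, keyed by the divisor they select
DIVIDER_CODE = {2: 0b000, 4: 0b001, 8: 0b010, 16: 0b011,
                32: 0b100, 64: 0b101, 128: 0b110, 1: 0b111}


def divide_config_value(divisor):
    """Place the 3-bit code into register bits 3, 1, and 0 (bit 2 stays 0)."""
    code = DIVIDER_CODE[divisor]
    high = (code & 0b100) << 1         # top code bit moves up to bit 3
    low = code & 0b011                 # lower two code bits stay in bits 1-0
    return high | low


for divisor in (1, 2, 16, 32):
    print(divisor, hex(divide_config_value(divisor)))
```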
23
Initial and Current Counts
0xFEE00380  Initial Count Register (read/write)
0xFEE00390  Current Count Register (read-only)
When the timer is programmed for periodic mode,
the Current Count is automatically reloaded from
the Initial Count register, then counts down
with each CPU bus-cycle, generating an interrupt
when it reaches zero
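A rough estimate of the resulting interrupt rate follows from the countdown just described. This Python sketch assumes the count decrements once for every `divisor` bus cycles; the 100 MHz bus frequency is a hypothetical figure, not a measurement of any particular machine, and timer_interrupt_hz is an invented name.

```python
def timer_interrupt_hz(bus_hz, divisor, initial_count):
    """Approximate periodic-mode interrupt frequency in interrupts/second."""
    return bus_hz / (divisor * initial_count)


BUS_HZ = 100_000_000                   # hypothetical bus clock
base_rate = timer_interrupt_hz(BUS_HZ, 16, 62_500)
halved_rate = timer_interrupt_hz(BUS_HZ, 32, 62_500)
print(base_rate, halved_rate)
```

Doubling the divisor while keeping the Initial Count fixed cuts the interrupt frequency in half; halving the divisor doubles it.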
24
Using the timer's interrupts

Set up your desired Initial Count value
Select your desired Divide Configuration
Set up the APIC-timer's LVT register with your
desired interrupt-ID number and counting mode
(periodic or one-shot), and clear the LVT
register's Mask bit to initiate the automatic
countdown operation

25
In-class exercise 1

Run the cpuid.cpp Linux application (on our
course website) to see if the CPUs in our
classroom implement HyperThreading (i.e.,
multiple logical processors in a cpu)
Then run the mpinfo.cpp application, to see if
the MP Base Configuration Table has entries for
more than one processor
If both results hold true, then we can write our
own multiprocessing software in H235!

26
In-class exercise 2

Run the apictick.s demo (on our CS 630 website)
to observe the APIC's periodic
interrupt-handler drawing 'T's onscreen
It executes for ten milliseconds (the 8254 is
used here to create that timed delay)
Try reprogramming the APIC's Divider
Configuration register, to cut the interrupt
frequency in half (or perhaps to double it)
