Title: Prelude to Multiprocessing
1Prelude to Multiprocessing
- Detecting CPU and system-board capabilities with
CPUID and the MP Configuration Table
2CPUID
- Recent Intel processors provide a cpuid instruction (opcode 0x0F, 0xA2) to assist software in detecting a CPU's capabilities
- If it's implemented, this instruction can be executed in any of the processor modes, and at any privilege level
- But it may not be implemented (e.g., 8086, 80286, 80386)
3Pentium EFLAGS register
Bit:    31..22  21  20   19   18  17  16
Field:  0       ID  VIP  VIF  AC  VM  RF

Bit:    15  14  13..12  11  10  9   8   7   6   5  4   3  2   1  0
Field:  0   NT  IOPL    OF  DF  IF  TF  SF  ZF  0  AF  0  PF  1  CF
Software can toggle the ID-bit (bit 21) in the
32-bit EFLAGS register if the processor is
capable of executing the cpuid instruction
4But what if there's no EFLAGS?
- The early Intel processors (8086, 80286) did not implement 32-bit registers
- The FLAGS register was only 16 bits wide
- So there was no ID-bit that software could try to toggle
- How can software be sure that the 32-bit EFLAGS register exists within the CPU?
5Detecting 32-bit processors
- There's a subtle difference in the way the logical shift/rotate instructions work when register CL contains the shift-factor
- On the 32-bit processors (e.g., 80386) the value in CL is truncated to 5 bits, but not so on the original 16-bit 8086 (Intel's manuals state that the 80286 also truncates the count, so this check cannot separate an 80286 from an 80386)
- Software can exploit this distinction as a first check on whether EFLAGS is implemented
6Detecting EFLAGS
- Here's a test for the presence of EFLAGS
    mov  $-1, %ax       # a nonzero value
    mov  $32, %cl       # shift-factor of 32
    shl  %cl, %ax       # do logical shift
    or   %ax, %ax       # test result in AX
    jnz  is32bit        # EFLAGS present
    jmp  is16bit        # EFLAGS absent
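The shift test can also be modeled in portable C, which may help in seeing why a shift-factor of 32 is the interesting case. This is only a sketch of the arithmetic, not a probe of real hardware: the names shl16_model and count_is_masked are hypothetical, and the flag masks_count stands in for "this CPU truncates CL to 5 bits".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Model of "shl %cl, %ax" acting on a 16-bit register.
 * masks_count = true  : the CPU keeps only the low 5 bits of CL
 * masks_count = false : the CPU uses the full 8-bit count        */
uint16_t shl16_model(uint16_t value, unsigned count, bool masks_count)
{
    assert(count <= 0xFFu);            /* CL is an 8-bit register */
    unsigned n = masks_count ? (count & 0x1Fu) : count;
    if (n >= 16)
        return 0;                      /* every bit is shifted out */
    return (uint16_t)((unsigned)value << n);
}

/* Mirrors the slide's test: shift 0xFFFF left by 32 and check
 * whether any bit survives (the "jnz" case).                     */
bool count_is_masked(bool cpu_masks_count)
{
    return shl16_model(0xFFFFu, 32, cpu_masks_count) != 0;
}
```

With truncation, a count of 32 becomes 0 and AX is left unchanged; without truncation, all sixteen bits are shifted out and AX becomes zero.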
7Testing for ID-bit toggle
- Here's a test for the presence of the CPUID instruction
    pushfl              # copy EFLAGS contents
    pop   %eax          # to accumulator register
    mov   %eax, %edx    # save a duplicate image
    btc   $21, %eax     # toggle the ID-bit (bit 21)
    push  %eax          # copy revised contents
    popfl               # back into EFLAGS
    pushfl              # copy EFLAGS contents
    pop   %eax          # back into accumulator
    xor   %edx, %eax    # do XOR with prior value
    bt    $21, %eax     # did ID-bit get toggled?
    jc    y_cpuid       # yes, can execute cpuid
    jmp   n_cpuid       # else cpuid unimplemented
8How does CPUID work?
- Step 1: load value 0 into register EAX
- Step 2: execute cpuid instruction
- Step 3: Verify the 'GenuineIntel' character-string in registers (EBX,EDX,ECX)
- Step 4: Find maximum CPUID input-value in the EAX register
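Step 3 can be illustrated without executing cpuid at all, by assembling the string from three register values. The sketch below assumes the caller has already obtained EBX, EDX and ECX (for example from inline assembly or a compiler intrinsic, neither of which is shown); the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Builds the 12-character vendor string from the register values
 * returned by cpuid with EAX=0.  The bytes are taken in the order
 * EBX, EDX, ECX, least-significant byte of each register first.  */
void cpuid_vendor_string(uint32_t ebx, uint32_t edx, uint32_t ecx,
                         char out[13])
{
    assert(out != NULL);
    const uint32_t regs[3] = { ebx, edx, ecx };
    for (int r = 0; r < 3; r++)
        for (int b = 0; b < 4; b++)
            out[4 * r + b] = (char)((regs[r] >> (8 * b)) & 0xFFu);
    out[12] = '\0';
}

/* Step 3 of the slide: compare against "GenuineIntel". */
bool is_genuine_intel(uint32_t ebx, uint32_t edx, uint32_t ecx)
{
    char vendor[13];
    cpuid_vendor_string(ebx, edx, ecx, vendor);
    return strcmp(vendor, "GenuineIntel") == 0;
}
```

For an Intel processor the three registers hold 0x756E6547 ("Genu"), 0x49656E69 ("ineI") and 0x6C65746E ("ntel").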
9Version and Features
- load 1 into EAX and execute CPUID
- Processor model and stepping information is
returned in register EAX
    bits 27..20   Extended Family ID
    bits 19..16   Extended Model ID
    bits 13..12   Type
    bits 11..8    Family ID
    bits  7..4    Model
    bits  3..0    Stepping ID
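The EAX layout above amounts to a handful of shifts and masks. The following sketch decodes the raw fields from a value the caller supplies; the struct and function names are hypothetical, and the extended-family field is taken as bits 27..20, following Intel's documentation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Raw fields of the version information returned in EAX
 * by cpuid with EAX=1.                                   */
struct cpu_version {
    unsigned stepping;     /* bits  3..0  */
    unsigned model;        /* bits  7..4  */
    unsigned family;       /* bits 11..8  */
    unsigned type;         /* bits 13..12 */
    unsigned ext_model;    /* bits 19..16 */
    unsigned ext_family;   /* bits 27..20 */
};

void decode_cpuid_version(uint32_t eax, struct cpu_version *out)
{
    assert(out != NULL);
    out->stepping   =  eax        & 0xFu;
    out->model      = (eax >> 4)  & 0xFu;
    out->family     = (eax >> 8)  & 0xFu;
    out->type       = (eax >> 12) & 0x3u;
    out->ext_model  = (eax >> 16) & 0xFu;
    out->ext_family = (eax >> 20) & 0xFFu;
}
```

As a sample, the value 0x000306A9 decodes to stepping 9, model 0xA, family 6, type 0, extended model 3 and extended family 0.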
10Some Feature Flags in EDX
    bit 28   HTT    HyperThreading Technology (1 = yes, 0 = no)
    bit  9   APIC   Advanced Programmable Interrupt Controller on-chip (1 = yes, 0 = no)
    bit  3   PSE    Page-Size Extensions (1 = yes, 0 = no)
    bit  2   DE     Debugging Extensions (1 = yes, 0 = no)
    bit  1   VME    Virtual-8086 Mode Enhancements (1 = yes, 0 = no)
    bit  0   FPU    Floating-Point Unit on-chip (1 = yes, 0 = no)
11Some Feature Flags in ECX
    bit  5   VMX    Virtual Machine Extensions (1 = yes, 0 = no)
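Testing one of the EDX or ECX feature flags listed above is a single bit test. A minimal sketch follows, assuming the register values were already obtained from cpuid with EAX=1; the enum constants and the function feature_present are names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions of some feature flags in EDX. */
enum {
    EDX_FPU  = 0,    /* Floating-Point Unit on-chip        */
    EDX_VME  = 1,    /* Virtual-8086 Mode Enhancements     */
    EDX_DE   = 2,    /* Debugging Extensions               */
    EDX_PSE  = 3,    /* Page-Size Extensions               */
    EDX_APIC = 9,    /* on-chip APIC                       */
    EDX_HTT  = 28    /* HyperThreading Technology          */
};

/* Bit position of a feature flag in ECX. */
enum {
    ECX_VMX  = 5     /* Virtual Machine Extensions         */
};

/* Returns true if the given bit of a feature register is 1. */
bool feature_present(uint32_t reg, unsigned bit)
{
    assert(bit < 32);
    return ((reg >> bit) & 1u) != 0;
}
```

A caller would write, for instance, feature_present(edx, EDX_HTT) or feature_present(ecx, ECX_VMX).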
12Multiprocessor Specification
- It's an industry standard, allowing OS software to use multiple processors in a uniform way
- Software searches in three regions of the physical address-space below 1-megabyte for a paragraph-aligned data-structure of length 16 bytes called the MP Floating Pointer Structure
  - Search in lowest KB of Extended BIOS Data Area
  - Search in topmost KB of conventional 640K RAM
  - Search in the 64KB ROM-BIOS (0xF0000-0xFFFFF)
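The search over one of these regions might be sketched as a scan on 16-byte boundaries. Two details below come from the MultiProcessor Specification rather than from these slides: the structure begins with the signature "_MP_", and its 16 bytes sum to zero modulo 256. The function works on a byte buffer supplied by the caller (assumed to start at a paragraph-aligned address), so it says nothing about how the physical memory gets mapped; the name find_mp_floating_pointer is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MPFP_SIZE 16u   /* length of the MP Floating Pointer Structure */

/* Scans 'region' at paragraph (16-byte) intervals.  Returns the
 * byte offset of the first candidate that has the "_MP_" signature
 * and a valid checksum, or -1 if none is found.                   */
long find_mp_floating_pointer(const uint8_t *region, size_t len)
{
    assert(region != NULL);
    for (size_t off = 0; off + MPFP_SIZE <= len; off += 16) {
        const uint8_t *p = region + off;
        if (memcmp(p, "_MP_", 4) != 0)
            continue;                      /* wrong signature */
        uint8_t sum = 0;
        for (size_t i = 0; i < MPFP_SIZE; i++)
            sum = (uint8_t)(sum + p[i]);
        if (sum == 0)
            return (long)off;              /* checksum is valid */
    }
    return -1;
}
```

The same routine would be applied to each of the three regions in turn, stopping at the first hit.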
13MP Floating Pointer Structure
- This structure may contain an ID-number for one of a small number of standard SMP system architectures, or may contain the memory address for a more extensive MP Configuration Table whose entries specify a more customized system architecture
- Our classroom machines employ the latter of these two options
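Telling the two options apart could look like the sketch below. The field offsets are assumptions taken from the MultiProcessor Specification, not from these slides: bytes 4..7 hold the physical address of the MP Configuration Table (little-endian), and byte 11 holds the default-configuration ID-number, where zero means a configuration table is present. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct mpfp_summary {
    uint32_t config_table_addr;  /* physical address, 0 if no table  */
    unsigned default_config;     /* nonzero: a standard architecture */
    bool     uses_config_table;  /* true: customized architecture    */
};

/* Summarizes a 16-byte MP Floating Pointer Structure that has
 * already been located and checksum-verified by the caller.     */
void mpfp_summarize(const uint8_t fp[16], struct mpfp_summary *out)
{
    assert(fp != NULL && out != NULL);
    out->config_table_addr = (uint32_t)fp[4]
                           | ((uint32_t)fp[5] << 8)
                           | ((uint32_t)fp[6] << 16)
                           | ((uint32_t)fp[7] << 24);
    out->default_config    = fp[11];
    out->uses_config_table = (fp[11] == 0) &&
                             (out->config_table_addr != 0);
}
```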
14The processor's Local-APIC
- The purpose of each processor's APIC is to allow CPUs in a multiprocessor system to transmit messages among one another and to manage the delivery of interrupts from the various peripheral devices to one or more CPUs in a dynamically determined way
- The Local-APIC has a variety of registers which are memory mapped to paragraph-aligned addresses in the 4KB page at 0xFEE00000
15Local-APIC's register-space
[Diagram: the 4GB physical address-space, with RAM beginning at 0x00000000 and the Local-APIC's registers at 0xFEE00000]
16Each CPU has its own timer!
- Four of the Local-APIC registers are used to implement a programmable timer
- It can privately deliver a periodic interrupt just to its own CPU
- 0xFEE00320 Timer Vector register
- 0xFEE00380 Initial Count register
- 0xFEE00390 Current Count register
- 0xFEE003E0 Divider Configuration register
17Timer's Local Vector Table
0xFEE00320
    bits 7..0   Interrupt ID-number
    bit 12      BUSY   (0 = not busy, 1 = busy)
    bit 16      MASK   (0 = unmasked, 1 = masked)
    bit 17      MODE   (0 = one-shot, 1 = periodic)
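Composing a value for the Timer Vector register from the fields above can be sketched as follows. The code only builds and inspects 32-bit values; it performs no memory-mapped access, since that requires kernel-level code. The macro and function names are hypothetical, while the addresses and bit positions are the ones given on the slides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Timer-related Local-APIC register addresses. */
#define LAPIC_TIMER_VECTOR    0xFEE00320u
#define LAPIC_INITIAL_COUNT   0xFEE00380u
#define LAPIC_CURRENT_COUNT   0xFEE00390u
#define LAPIC_DIVIDER_CONFIG  0xFEE003E0u

/* Fields of the Timer Vector register. */
#define LVT_BUSY      (1u << 12)   /* 1 = busy            */
#define LVT_MASK      (1u << 16)   /* 1 = masked          */
#define LVT_PERIODIC  (1u << 17)   /* 1 = periodic mode   */

/* Builds the value software would store at LAPIC_TIMER_VECTOR. */
uint32_t lvt_timer_value(unsigned vector, bool periodic, bool masked)
{
    assert(vector <= 0xFFu);       /* the ID-number field is 8 bits */
    uint32_t value = vector;
    if (masked)
        value |= LVT_MASK;
    if (periodic)
        value |= LVT_PERIODIC;
    return value;
}

/* Reports the BUSY bit of a value read from the register. */
bool lvt_timer_busy(uint32_t lvt)
{
    return (lvt & LVT_BUSY) != 0;
}
```

For example, an unmasked periodic timer using interrupt ID-number 0x20 corresponds to the value 0x00020020.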
18In-class exercise
- Run the cpuid.cpp Linux application (on our course website) to see if the CPUs in our classroom implement HyperThreading (i.e., multiple logical processors within one CPU)
- Then run the smpinfo.cpp application, to see if the MP Base Configuration Table has entries for more than one processor
- If both results hold true, then we can write our own multiprocessing software in here!
19In-class exercise 2
- Run the apictick.s demo (on our website) to observe the APIC's periodic interrupt drawing bytes onto the screen
- It executes for ten milliseconds (the 8254 is used to create this timed delay)
- Try reprogramming the APIC's Divider Configuration register, to cut the interrupt frequency in half (or to double it)
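As a starting point for this exercise, the sketch below computes the bit pattern for a given divisor. The encoding is an assumption taken from Intel's documentation of the Divide Configuration register (bits 0, 1 and 3 select a divisor of 1, 2, 4, ..., 128), not from these slides or from apictick.s, and the function name is hypothetical. For a fixed Initial Count, doubling the divisor halves the interrupt frequency.

```c
#include <assert.h>
#include <stdint.h>

/* Returns the Divider Configuration value for a divisor that is
 * a power of two between 1 and 128.
 * Encoding (bit 3, bit 1, bit 0):
 *   /2 = 000, /4 = 001, /8 = 010, /16 = 011,
 *   /32 = 100, /64 = 101, /128 = 110, /1 = 111                  */
uint32_t apic_divide_config(unsigned divisor)
{
    assert(divisor >= 1 && divisor <= 128 &&
           (divisor & (divisor - 1)) == 0);

    unsigned shift = 0;                /* divisor == 1 << shift */
    while ((1u << shift) < divisor)
        shift++;

    /* Three-bit code: /2 -> 0, /4 -> 1, ..., /128 -> 6, /1 -> 7 */
    unsigned code = (shift + 7u) & 7u;

    /* The low two code bits go to register bits 1..0, and the
     * high code bit goes to register bit 3 (bit 2 stays 0).     */
    return (code & 3u) | ((code & 4u) << 1);
}
```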