IA32 Paging Scheme - PowerPoint PPT Presentation

Provided by: professora5
Learn more at: https://www.cs.usfca.edu
1
IA32 Paging Scheme
  • Introduction to the Intel x86's support for
    virtual memory

2
What is paging?
  • It's a scheme for dynamically remapping addresses
    for fixed-size memory-blocks

[Figure: blocks of the virtual address-space remapped onto blocks of the physical address-space]
3
What's paging good for?
  • For efficient time-sharing among multiple
    tasks, an operating system needs to have several
    programs residing in main memory at the same time
  • To accomplish this using actual physical
    memory-addressing would require doing
    address-relocation calculations each time a
    program was loaded (to avoid conflicting with any
    addresses already being used)

4
Why use Paging?
  • Use of paging allows relocations to be done
    just once (by the linker), and every program can
    reuse the same addresses

[Figure: Task 1, Task 2, and Task 3 each mapped into physical memory]
5
How to enable paging
Control Register CR0
  bit 31: PG    bit 30: CD    bit 29: NW    bit 18: AM    bit 16: WP
  bit 5: NE     bit 4: ET     bit 3: TS     bit 2: EM     bit 1: MP     bit 0: PE
Protected-Mode must be enabled (PE=1)
Then Paging can be enabled (set PG=1)
Here is how you can enable paging (if the CPU is in protected-mode):
    mov  %cr0, %eax     # get current machine status
    bts  $31, %eax      # turn on the PG-bit's image
    mov  %eax, %cr0     # put modified status in CR0
    jmp  .+2            # now flush the prefetch queue
But you had better prepare the mapping beforehand!
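
The bit manipulation in that sequence can be tried out on an ordinary machine by working on a copy of the register value. The following is a minimal C sketch, not part of the course demo: the names `CR0_PE`, `CR0_PG`, and `cr0_enable_paging` are hypothetical, and the function only computes the new CR0 image without touching the real register.

```c
#include <assert.h>
#include <stdint.h>

#define CR0_PE (1u << 0)    /* Protection Enable: bit 0 of CR0 */
#define CR0_PG (1u << 31)   /* Paging: bit 31 of CR0 */

/* Return the CR0 image with the PG-bit turned on.
   Paging requires protected-mode, so PE must already be set. */
static inline uint32_t cr0_enable_paging(uint32_t cr0)
{
    assert(cr0 & CR0_PE);   /* precondition: protected-mode is enabled */
    return cr0 | CR0_PG;    /* same effect as: bts $31, %eax */
}
```

For example, an image of 0x00000011 (PE and ET set) becomes 0x80000011 once PG is added.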
6
Several paging schemes
  • Intel's design for paging has continued to
    evolve since its introduction in the 80386 CPU
  • Our Core-2 Quad CPUs support the initial paging
    design (plus several extensions)
  • Here we shall describe the initial design (it's
    simplest and it remains the default)
  • It is based on subdividing the entire 4GB
    physical address-space into 4-KB blocks

7
Terminology
  • The 4KB memory-blocks are called page frames --
    and they are non-overlapping
  • Therefore each page-frame begins at a
    memory-address which is a multiple of 4K
  • Remember: 4K = 4 x 1024 = 4096 = 2^12
  • So the address of any page-frame will have its
    lowest 12 bits equal to zero
  • Example: page six begins at 0x00006000
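
The page-frame arithmetic on this slide can be sketched in C. The macro and function names below (`PAGE_SHIFT`, `frame_address`, `is_page_aligned`) are hypothetical choices for illustration only.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12u                    /* 4K = 2^12 */
#define PAGE_SIZE  (1u << PAGE_SHIFT)     /* 4096 bytes */
#define PAGE_MASK  (PAGE_SIZE - 1u)       /* the lowest 12 bits: 0xFFF */

/* Physical address at which page-frame number n begins. */
static inline uint32_t frame_address(uint32_t n)
{
    assert(n < (1u << 20));               /* 4GB / 4KB = 2^20 page-frames */
    return n << PAGE_SHIFT;
}

/* Nonzero when addr could be the start of a page-frame. */
static inline int is_page_aligned(uint32_t addr)
{
    return (addr & PAGE_MASK) == 0;
}
```

With these definitions, `frame_address(6)` gives 0x00006000, matching the example for page six.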

8
Control Register CR3
  • Register CR3 is used by the CPU to find the
    paging-tables in memory which define its
    virtual-to-physical address-translation
  • Specifically, CR3 points to a page-frame, called
    the Page Directory, which contains addresses of
    frames called Page Tables
  • An address in CR3 must be page aligned

[Figure: the 32-bit register CR3 (bits 31..0) holds the Physical Address of the Page-Directory]
9
Two-Level Translation Scheme
[Figure: CR3 points to the PAGE DIRECTORY, whose entries point to PAGE TABLES, whose entries point to PAGE FRAMES]
10
Page-Directory
  • The Page-Directory occupies one frame, so it has
    room for 1024 4-byte entries
  • Each page-directory entry can contain a pointer
    to a further data-structure, called a Page-Table
    (also page-aligned and 4KB in size)
  • Each Page-Table occupies one frame and has enough
    room for 1024 4-byte entries
  • Page-Table entries can contain pointers to
    Page-Frames

11
Address-translation
  • The CPU examines any virtual address it
    encounters, subdividing it into three fields

  bits 31..22 (10-bits): index into page-directory
      This field selects one of the 1024
      array-entries in the Page-Directory
  bits 21..12 (10-bits): index into page-table
      This field selects one of the 1024
      array-entries in that Page-Table
  bits 11..0 (12-bits): offset into page-frame
      This field provides the offset to one of
      the 4096 bytes in that Page-Frame
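
The three-field subdivision of a virtual address can be expressed with shifts and masks. This is a small C sketch for illustration; the structure `va_fields` and the function `split_virtual_address` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

struct va_fields {
    uint32_t dir_index;     /* bits 31..22: index into page-directory */
    uint32_t table_index;   /* bits 21..12: index into page-table */
    uint32_t offset;        /* bits 11..0 : offset into page-frame */
};

static inline struct va_fields split_virtual_address(uint32_t va)
{
    struct va_fields f;
    f.dir_index   = va >> 22;
    f.table_index = (va >> 12) & 0x3FFu;
    f.offset      = va & 0xFFFu;
    /* each index selects one of 1024 entries, the offset one of 4096 bytes */
    assert(f.dir_index < 1024 && f.table_index < 1024 && f.offset < 4096);
    return f;
}
```

For the virtual address 0x00010000 this yields directory index 0, table index 0x010, and offset 0.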
12
Identity-mapping
  • When the CPU first turns on the paging
    capability, it must be executing code from an
    identity-mapped page (or it crashes!)

[Figure: identity-mapping, in which the page holding the code has the same address in virtual memory and in physical memory]
13
Additional mappings
  • Besides having at least one page that is
    identity-mapped (for turning paging on),
    there can be multiple other mappings

[Figure: virtual memory and physical memory, showing the identity-mapping of the code page together with additional mappings of data pages]
14
Demo program
  • We wrote a very simple demo-program showing how
    to create a Page-Directory and a Page-Table for
    an identity-mapping of the page-frame that
    contains program-code, plus a non-identity
    mapping for the initial page of the video display
    memory
  • This demo is named vrampage.s (you can find it
    on our CS 630 course website)

15
Demo's page-mapping
[Figure: CR3 points to the page-directory, which points to one page-table, which maps the program arena and video memory; all other entries are unused]
Our vrampage.s demo-program uses only four
page-frames of physical memory (16K):
  1) the program's arena (at 0x00010000)
  2) the page-directory (at 0x00011000)
  3) only one page-table (at 0x00012000)
  4) one page of vram (at 0x000B8000)
16
Virtual-to-Physical
[Figure: in the physical address-space, code and data are at 0x00010000, followed by the page-directory and page-table, and video memory is at 0x000B8000; in the virtual address-space, code and data are at 0x00010000 and video memory is at 0x00000000]
17
The demo's table-entries
  • Our page-directory uses only one entry
  • And our page-table uses only two entries

  pgdir[0x000] = 0x00011003
  pgtbl[0x000] = 0x000B8003
  pgtbl[0x010] = 0x00010003   (identity-mapping)
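
To see what these three entries accomplish, the two-level lookup can be imitated in software. The sketch below is an illustration in C, not the vrampage.s source: the arrays `pgdir` and `pgtbl` and the functions `setup_demo_tables` and `translate` are hypothetical, and because the demo has only one page-table, any present directory entry is simply taken to refer to `pgtbl`.

```c
#include <assert.h>
#include <stdint.h>

#define ENTRY_P 0x001u                    /* Present bit of an entry */

static uint32_t pgdir[1024];              /* stand-in for the Page-Directory */
static uint32_t pgtbl[1024];              /* stand-in for the only Page-Table */

/* Fill in the three entries listed above; all others stay 0 (not present). */
static inline void setup_demo_tables(void)
{
    pgdir[0x000] = 0x00011003u;           /* present, writable page-table */
    pgtbl[0x000] = 0x000B8003u;           /* virtual page 0x00 -> video memory */
    pgtbl[0x010] = 0x00010003u;           /* virtual page 0x10 -> frame 0x10 */
}

/* Software imitation of the lookup; returns -1 where the CPU would fault. */
static inline int64_t translate(uint32_t va)
{
    uint32_t pde = pgdir[va >> 22];
    if (!(pde & ENTRY_P))
        return -1;
    uint32_t pte = pgtbl[(va >> 12) & 0x3FFu];
    if (!(pte & ENTRY_P))
        return -1;
    assert((va & 0xFFFu) < 4096);         /* offset stays inside the frame */
    return (int64_t)((pte & 0xFFFFF000u) | (va & 0xFFFu));
}
```

After `setup_demo_tables()`, virtual address 0x00000000 translates to 0x000B8000 (the video memory), while 0x00010000 translates to itself, which is the identity-mapping.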
18
The segment descriptors
  • Our demo's GDT uses three descriptors

executable segment at virtual-address 0x00010000
    0x0000009A010000FFFF
writable segment at virtual-address 0x00010000
    0x00000092010000FFFF
writable segment at virtual-address 0x00000000
    0x00000092000000FFFF
19
Page-Level protection
  • Each entry in a Page-Table can assign a
    collection of attributes to the Page-Frame it
    points to, for example:
  • The P-bit (page is present) can be used by an
    operating system to implement demand paging and
    memory-mapping of disk-files
  • The W/R-bit can be used to mark a page as either
    Writable (1) or as Read-Only (0)
  • The U/S-bit can be used to mark a page as
    User-accessible or as Supervisor-only

20
Format of a Page-Table entry
  bits 31..12   PAGE-FRAME BASE ADDRESS
  bits 11..9    AVAIL
  bit 8         0
  bit 7         0
  bit 6         D
  bit 5         A
  bit 4         PCD
  bit 3         PWT
  bit 2         U
  bit 1         W
  bit 0         P
LEGEND
  P = Present (1 = yes, 0 = no)
  W = Writable (1 = yes, 0 = no)
  U = User (1 = yes, 0 = no)
  A = Accessed (1 = yes, 0 = no)
  D = Dirty (1 = yes, 0 = no)
  PWT = Page Write-Through (1 = yes, 0 = no)
  PCD = Page Cache-Disable (1 = yes, 0 = no)
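
The entry format translates directly into bit-masks. Here is a hedged C sketch; the `PTE_*` names and the helpers `make_pte` and `pte_frame` are hypothetical and only illustrate how a frame address and its attribute bits share one 32-bit word.

```c
#include <assert.h>
#include <stdint.h>

#define PTE_P   (1u << 0)   /* Present */
#define PTE_W   (1u << 1)   /* Writable */
#define PTE_U   (1u << 2)   /* User */
#define PTE_PWT (1u << 3)   /* Page Write-Through */
#define PTE_PCD (1u << 4)   /* Page Cache-Disable */
#define PTE_A   (1u << 5)   /* Accessed */
#define PTE_D   (1u << 6)   /* Dirty */

/* Combine a page-aligned frame address with attribute bits 0..6. */
static inline uint32_t make_pte(uint32_t frame_base, uint32_t flags)
{
    assert((frame_base & 0xFFFu) == 0);   /* frame must be page-aligned */
    assert((flags & ~0x07Fu) == 0);       /* only the named attribute bits */
    return frame_base | flags;
}

/* Recover the PAGE-FRAME BASE ADDRESS (bits 31..12) from an entry. */
static inline uint32_t pte_frame(uint32_t pte)
{
    return pte & 0xFFFFF000u;
}
```

For instance, `make_pte(0x000B8000, PTE_P | PTE_W)` produces 0x000B8003, a present and writable entry for the frame at 0x000B8000.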
21
Format of a Page-Directory entry
  bits 31..12   PAGE-TABLE BASE ADDRESS
  bits 11..9    AVAIL
  bit 8         0
  bit 7         PS
  bit 6         0
  bit 5         A
  bit 4         PCD
  bit 3         PWT
  bit 2         U
  bit 1         W
  bit 0         P
LEGEND
  P = Present (1 = yes, 0 = no)
  W = Writable (1 = yes, 0 = no)
  U = User (1 = yes, 0 = no)
  A = Accessed (1 = yes, 0 = no)
  PS = Page-Size (0 = 4KB, 1 = 4MB)
  PWT = Page Write-Through (1 = yes, 0 = no)
  PCD = Page Cache-Disable (1 = yes, 0 = no)
NOTE: The PS-bit is only meaningful when the
PSE-bit in register CR4 is set
22
Violations
  • When a task violates the page-attributes of any
    Page-Frame, the CPU will generate a Page-Fault
    Exception (interrupt 0x0E)
  • Then the operating system's page-fault
    exception-handler gets control and can take
    whatever action it deems suitable
  • The CPU will provide help to the OS in
    determining why a Page-Fault occurred

23
The Error-Code format
  • The CPU will push an Error-Code onto the
    operating system's stack

  bits 31..3   reserved (0)
  bit 2        U/S
  bit 1        W/R
  bit 0        P
Legend
  P (Present): 0 = attempted access was to a not-present page
  W/R (Write/Read): 1 = attempted to write to a read-only page
  U/S (User/Supervisor): 1 = user attempted access to a supervisor page
NOTE: User means that CPL = 3;
Supervisor means that CPL = 0, 1, or 2
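
A page-fault handler would typically begin by picking these three bits apart. The following C sketch shows one way to do that under the legend above; the names `PF_P`, `PF_WR`, `PF_US`, `pf_info`, and `decode_pf_error` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PF_P  (1u << 0)     /* 0 = page was not present */
#define PF_WR (1u << 1)     /* 1 = the access was a write */
#define PF_US (1u << 2)     /* 1 = the access came from user code (CPL = 3) */

static_assert(PF_US == 4u, "U/S is bit 2 of the error-code");

struct pf_info {
    int not_present;        /* fault was caused by a not-present page */
    int was_write;          /* faulting access was a write */
    int from_user;          /* faulting access was made while CPL = 3 */
};

/* Decode the low three bits of the error-code pushed by the CPU. */
static inline struct pf_info decode_pf_error(uint32_t code)
{
    struct pf_info info;
    info.not_present = (code & PF_P) == 0;
    info.was_write   = (code & PF_WR) != 0;
    info.from_user   = (code & PF_US) != 0;
    return info;
}
```

An error-code of 0x7, for example, describes a user-mode write to a page that is present but protected.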
24
Control Register CR2
  • Whenever a Page-Fault is encountered, the CPU
    will save the virtual-address that caused that
    fault into the CR2 register
  • If the CPU was trying to modify the value of an
    operand in a read-only page, then that
    operand's virtual address is written into CR2
  • If the CPU was trying to read the value of an
    operand in a supervisor-only page (or was trying
    to fetch-and-execute an instruction) while CPL=3,
    the relevant virtual address will be written into
    CR2

25
CR3 and Task-Switching
[Figure: the 32-bit Task-State Segment, 26 longwords of 32-bits each at byte-offsets 0 through 100, holding in order: link, esp0, ss0, esp1, ss1, esp2, ss2, PTDB (Page-Table Directory Base), EIP, EFLAGS, EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI, ES, CS, SS, DS, FS, GS, LDTR, and TRAP with IOMAP (which locates the I/O permission bitmap); the legend marks each field as static, volatile, or reserved]
The PTDB value will get loaded into register CR3
as part of the context-switching mechanism
when paging has been enabled (PG=1). So the
incoming task will automatically have its
own individual mapping of its virtual
address-space to page-frames in the CPU's
physical address-space
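
The Task-State Segment layout can be written down as a C structure, which makes the position of the CR3 image easy to check. This is an illustrative sketch only; the structure name `tss32` and its field names are hypothetical, and each 16-bit selector is padded out to a full longword as in the 26-longword layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct tss32 {
    uint32_t link;                      /* back-link to the previous task */
    uint32_t esp0, ss0;                 /* ring-0 stack */
    uint32_t esp1, ss1;                 /* ring-1 stack */
    uint32_t esp2, ss2;                 /* ring-2 stack */
    uint32_t ptdb;                      /* loaded into CR3 on a task-switch */
    uint32_t eip, eflags;
    uint32_t eax, ecx, edx, ebx, esp, ebp, esi, edi;
    uint32_t es, cs, ss, ds, fs, gs;
    uint32_t ldtr;
    uint32_t trap_iomap;                /* TRAP flag and I/O map base */
};

static_assert(sizeof(struct tss32) == 26 * 4, "TSS is 26 longwords");
static_assert(offsetof(struct tss32, ptdb) == 28, "PTDB is at byte-offset 28");
```

An operating system that gives each task its own page-directory would store that directory's physical address in the `ptdb` field of the task's TSS.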
26
Extensions to paging scheme
  • The Core-2 Quad CPU provides several enhancements
    to the original 386 paging
  • These enhancements are optional and must be
    selectively enabled by software
  • Control Register CR4 implements bits to turn on
    the desired paging-extension and some other
    enhancements that are unrelated to the paging
    architectures

27
Control Register CR4
  bit 13: VMXE   bit 8: PCE   bit 7: PGE   bit 6: MCE   bit 5: PAE
  bit 4: PSE     bit 3: DE    bit 2: TSD   bit 1: PVI   bit 0: VME
Legend (for paging-related extensions)
  PSE = Page-Size Extension is enabled (1 = yes, 0 = no)
  PAE = Physical-Address Extension is enabled (1 = yes, 0 = no)
  PGE = Page-Global Extension is enabled (1 = yes, 0 = no)
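
The interaction between CR4 and the PS-bit of a page-directory entry can be sketched as a simple test. This C fragment is an illustration under the assumption that PAE is turned off; the names `CR4_PSE`, `CR4_PAE`, `CR4_PGE`, `PDE_P`, `PDE_PS`, and `pde_maps_4mb_page` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define CR4_PSE (1u << 4)   /* Page-Size Extension */
#define CR4_PAE (1u << 5)   /* Physical-Address Extension */
#define CR4_PGE (1u << 7)   /* Page-Global Extension */

#define PDE_P   (1u << 0)   /* Present bit of a page-directory entry */
#define PDE_PS  (1u << 7)   /* Page-Size bit of a page-directory entry */

static_assert(CR4_PSE == 0x10u, "PSE is bit 4 of CR4");

/* Nonzero when a present directory entry describes a 4MB page:
   the PS-bit only counts if software has set PSE in CR4.
   Assumes PAE is off, as in the initial paging design. */
static inline int pde_maps_4mb_page(uint32_t cr4, uint32_t pde)
{
    return (cr4 & CR4_PSE) != 0
        && (pde & PDE_P) != 0
        && (pde & PDE_PS) != 0;
}
```

With PSE clear, the same directory entry is treated as an ordinary pointer to a Page-Table.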
28
What about efficiency?
  • When paging is enabled, every reference to memory
    requires the CPU to translate the
    virtual-address into a physical-address
  • That translation is based on table-lookups
  • These lookups must be done sequentially
  • So address-translation could be costly in terms
    of CPU speed, because a high percentage of
    instructions typically refer to memory

29
The TLB solution
  • When the CPU has performed the table lookups that
    map a virtual-address to a physical-address, it
    remembers that relationship by saving the pair
    of page-addresses (virtual-page → physical-page)
    in a special CPU cache known as the TLB
    (Translation Look-aside Buffer)
  • Subsequent references to this same page can be
    resolved quickly -- via that cache!

30
4-way set-associative
  • The TLB is implemented as a 4-way
    set-associative cache -- it's like a
    parallelized version of a Hash Table (with
    evictions)
  • Due to the locality of reference principle, the
    TLB concept generally works well in most common
    programming contexts as an efficient speedup of
    the page-address table-lookup
    translation-mechanism
  • Modifying CR3 invalidates the TLB cache
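
The idea of a 4-way set-associative cache can be modelled in a few lines of C. This is only a software sketch of the concept, not a description of the real hardware: the number of sets (16), the round-robin eviction policy, and all names (`tlb`, `tlb_lookup`, `tlb_insert`, `tlb_flush`) are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TLB_SETS 16u        /* hypothetical number of sets */
#define TLB_WAYS 4u         /* 4-way: four candidate slots per set */

static_assert((TLB_SETS & (TLB_SETS - 1u)) == 0, "set count is a power of 2");

struct tlb_entry { int valid; uint32_t vpn; uint32_t pfn; };

struct tlb {
    struct tlb_entry way[TLB_SETS][TLB_WAYS];
    unsigned next_victim[TLB_SETS];     /* round-robin eviction pointer */
};

/* Discard every cached translation, as a write to CR3 would. */
static inline void tlb_flush(struct tlb *t)
{
    memset(t, 0, sizeof *t);
}

/* Look for virtual-page vpn; on a hit store the physical-page in *pfn. */
static inline int tlb_lookup(const struct tlb *t, uint32_t vpn, uint32_t *pfn)
{
    const struct tlb_entry *set = t->way[vpn % TLB_SETS];
    for (unsigned w = 0; w < TLB_WAYS; w++) {   /* hardware checks all 4 at once */
        if (set[w].valid && set[w].vpn == vpn) {
            *pfn = set[w].pfn;
            return 1;
        }
    }
    return 0;
}

/* Remember a translation after a miss, evicting an older one if needed. */
static inline void tlb_insert(struct tlb *t, uint32_t vpn, uint32_t pfn)
{
    unsigned s = vpn % TLB_SETS;
    unsigned w = t->next_victim[s];
    t->way[s][w] = (struct tlb_entry){ 1, vpn, pfn };
    t->next_victim[s] = (w + 1u) % TLB_WAYS;
}
```

The virtual-page number picks a set (like a hash bucket), and only the four entries of that set need to be compared, which is why a fifth page that falls into the same set forces an eviction.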