CS 3210 Fall 2003 - PowerPoint PPT Presentation

Provided by: Phillip4

1
CS 3210 Fall 2003
  • Memory Management

2
Dynamic Memory
  • kernel loaded starting at 1M
  • zero page reserved
  • i/o aperture (640K-1M)
  • dynamic memory: the rest!
  • managed by kernel memory allocators
  • the allocators:
  • page allocator: buddy system
  • slab (cache) allocator: object (struct) specific
    allocators
  • kmalloc: small requests
  • vmalloc: virtually contiguous

3
How Does malloc() Work?
  • library routine that asks kernel for memory
  • sbrk(): add pages to per-process heap (user
    space)
  • malloc() manages heap using freelist
  • freelist: linked list of free memory chunks
    (nodes)
  • links embedded in nodes themselves!
  • malloc(size)
  • look through list for available chunk
  • first fit, best fit, worst fit algorithms (see
    Knuth)
  • split chunk if necessary, update linkage, return
    to user
  • details:
  • chunks must be aligned, rounded up to min size
  • extra space (just before the returned address!)
    for size, etc.
  • freeing twice often creates cycles in the
    freelist, which breaks malloc()
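
The freelist scheme on this slide can be sketched as a toy first-fit allocator. This is a minimal illustration, not any real libc's implementation: a static array stands in for memory obtained through sbrk(), freed chunks are not coalesced, and every name (toy_malloc, toy_free, chunk_t, HEAP_SIZE) is hypothetical.

```c
#include <stddef.h>

#define HEAP_SIZE 4096                       /* toy heap standing in for sbrk() memory */
#define ALIGNMENT (_Alignof(max_align_t))    /* chunk sizes are rounded up to this */

/* Header stored just before the address handed to the caller. */
typedef struct chunk {
    _Alignas(max_align_t) size_t size;  /* usable bytes following the header */
    struct chunk *next;                 /* freelist link, meaningful only while free */
} chunk_t;

static _Alignas(max_align_t) unsigned char heap[HEAP_SIZE];
static chunk_t *free_list;
static int heap_ready;

static size_t round_up(size_t n)
{
    return (n + ALIGNMENT - 1) & ~(ALIGNMENT - 1);
}

void *toy_malloc(size_t size)
{
    if (!heap_ready) {                  /* start with one big free chunk */
        free_list = (chunk_t *)heap;
        free_list->size = HEAP_SIZE - sizeof(chunk_t);
        free_list->next = NULL;
        heap_ready = 1;
    }
    if (size == 0 || size > HEAP_SIZE)
        return NULL;
    size = round_up(size);

    /* First fit: take the first chunk on the list that is large enough. */
    for (chunk_t **link = &free_list; *link != NULL; link = &(*link)->next) {
        chunk_t *c = *link;
        if (c->size < size)
            continue;
        if (c->size >= size + sizeof(chunk_t) + ALIGNMENT) {
            /* Split: the tail becomes a new free chunk in c's list position. */
            chunk_t *rest = (chunk_t *)((unsigned char *)(c + 1) + size);
            rest->size = c->size - size - sizeof(chunk_t);
            rest->next = c->next;
            c->size = size;
            *link = rest;
        } else {
            *link = c->next;            /* too small to split: unlink whole chunk */
        }
        return c + 1;                   /* user memory starts after the header */
    }
    return NULL;
}

void toy_free(void *p)
{
    if (p == NULL)
        return;
    chunk_t *c = (chunk_t *)p - 1;      /* step back to the header */
    c->next = free_list;                /* push on the list head (no coalescing) */
    free_list = c;
}
```

Calling toy_free() twice on the same pointer makes the chunk's next link point at itself, which is the cycle the last bullet warns about.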

4
User vs. Kernel Allocators
  • kernel:
  • small kernel stack, so lots of mallocs
  • must be fast! reliable! often non-blocking!
  • optimized for concurrent requests
  • various flavors of memory: DMA, high, etc.
  • user:
  • ok to block
  • can be relatively slow
  • more concern about memory utilization

5
Page Frame Management
  • array of page descriptors to track frame state
  • struct page mem_map[# of frames]
  • 64-byte descriptors: about 16KB per MB
  • page descriptor fields:
  • count: ref count (atomic_t)
  • flags: PG_locked, _uptodate, _dirty, _slab,
    _highmem
  • wait queues, links, index on disk, lru info,
  • zone
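
The "about 16KB per MB" figure follows from the frame size: with 4KB frames there are 256 frames per MB, and 256 descriptors of 64 bytes come to 16KB. A small sketch of that arithmetic, assuming 4KB pages; the function names are hypothetical.

```c
#include <stddef.h>

#define PAGE_SIZE       4096u   /* assumption: 4KB frames, as on 32-bit x86 */
#define DESCRIPTOR_SIZE 64u     /* descriptor size quoted on the slide */

/* Index of the frame (and of its descriptor in the array) for a physical address. */
size_t frame_number(size_t phys_addr)
{
    return phys_addr / PAGE_SIZE;
}

/* Bytes of descriptor array needed to track ram_bytes of physical memory. */
size_t mem_map_bytes(size_t ram_bytes)
{
    return (ram_bytes / PAGE_SIZE) * DESCRIPTOR_SIZE;
}
```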

6
Memory Zones
  • architectural restrictions
  • ZONE_DMA (ISA): 0..16MB
  • ZONE_NORMAL: 16MB..896MB
  • ZONE_HIGHMEM: 896MB..
  • zone_struct:
  • name, #pages, #free, spinlock, start address
  • zone balancing thresholds
  • zone_mem_map, ...
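
The zone boundaries above amount to a simple classification of physical addresses. The sketch below only illustrates the ranges from the slide; zone_of() and the enum are hypothetical and are not the kernel's own definitions.

```c
#include <stdint.h>

#define MB (1024ull * 1024ull)

/* Hypothetical enum mirroring the three zones named on the slide. */
enum toy_zone { TOY_ZONE_DMA, TOY_ZONE_NORMAL, TOY_ZONE_HIGHMEM };

/* Classify a physical address using the boundaries 16MB and 896MB. */
enum toy_zone zone_of(uint64_t phys_addr)
{
    if (phys_addr < 16 * MB)
        return TOY_ZONE_DMA;
    if (phys_addr < 896 * MB)
        return TOY_ZONE_NORMAL;
    return TOY_ZONE_HIGHMEM;
}
```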

7
NUMA
  • Non-Uniform Memory Access
  • per-node memory
  • memory access costs (non-uniform):
  • same node: fast
  • different node: slower
  • Kernel optimizes data structure placement
  • Node descriptors:
  • id, zonelist, etc.
  • UMA: single node (contig_page_data)

8
Memory Structures Init
  • paging_init()
  • kmap_init(): mapping windows for highmem
  • free_area_init(): most init done here
  • mem_init()
  • num_physpages
  • calls free_page() on each free page

9
Requesting/Releasing Frames
  • request parameters
  • order: size as a power of 2 (2^order pages)
  • gfp_mask: how to look for page
  • gfp: "get free page"
  • priority, blocking?, zone, etc.
  • examples: GFP_ATOMIC, GFP_NOIO, GFP_USER
  • alloc functions (return page descriptor):
  • alloc_page(gfp_mask)
  • alloc_pages(gfp_mask, order)
  • get functions (return address of first frame):
  • __get_free_page(gfp_mask)
  • __get_free_pages(gfp_mask, order)
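
Because requests are expressed as an order rather than a byte count, a caller has to round a size up to the next power-of-two number of pages. The helper below is a user-space sketch of that conversion, assuming 4KB pages; order_for() is a hypothetical name, not a kernel function.

```c
#include <stddef.h>

#define PAGE_SIZE 4096u   /* assumption: 4KB pages */

/* Smallest order such that 2^order pages hold at least `bytes` bytes. */
unsigned int order_for(size_t bytes)
{
    size_t pages = (bytes + PAGE_SIZE - 1) / PAGE_SIZE;  /* round up to whole pages */
    unsigned int order = 0;

    while (((size_t)1 << order) < pages)
        order++;
    return order;
}
```

A three-page request, for example, is served by an order-2 (four-page) chunk, so one page of the chunk goes unused.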

10
Mapping High Memory
  • High 128 MB (of 4GB) for mapping highmem
  • alloc, then map
  • mapping windows a limited resource
  • map requests may block
  • Mapping types:
  • permanent: blocking
  • can't be used by interrupts, softirqs
  • uses a separate small page table (up to 4MB)
  • temporary: non-blocking (swaps)
  • used by interrupts only 5 per cpu
  • non-contiguous

11
Buddy System
  • Variable-sized allocator
  • contiguous pages in powers of 2 (1..512)
  • reduces external fragmentation
  • (lots of little pieces of free memory)
  • Simple idea:
  • keep 10 linked lists of chunks: 1, 2, 4, 8, .. 512
  • assume initially just 512-page chunks
  • Request for 128 page chunk?
  • split 512 into 256, 128, 128
  • Buddies: contiguous, equal-sized chunks (pairs)
  • Free a 128 page chunk?
  • merge if buddy is free, move to 256 page list
  • cascades are possible
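
The split and merge steps above can be sketched in a few lines. To stay short, this toy version swaps the ten linked lists for a table of per-order free flags over a single 512-page region and does no argument checking on free; all names are hypothetical and the code is not the kernel's.

```c
#include <stdbool.h>
#include <string.h>

#define MAX_ORDER   10      /* orders 0..9: chunks of 1..512 pages */
#define TOTAL_PAGES 512u

/* is_free[o][i]: the chunk of 2^o pages starting at page (i << o) is free. */
static bool is_free[MAX_ORDER][TOTAL_PAGES];

void buddy_init(void)
{
    memset(is_free, 0, sizeof is_free);
    is_free[MAX_ORDER - 1][0] = true;       /* initially one 512-page chunk */
}

/* Returns the first page number of the allocated chunk, or -1 on failure. */
int buddy_alloc(unsigned order)
{
    if (order >= MAX_ORDER)
        return -1;
    for (unsigned o = order; o < MAX_ORDER; o++) {
        for (unsigned i = 0; i < (TOTAL_PAGES >> o); i++) {
            if (!is_free[o][i])
                continue;
            is_free[o][i] = false;
            /* Split down to the requested order: keep the lower half,
             * mark the upper half (the buddy) free at each step. */
            while (o > order) {
                o--;
                i <<= 1;
                is_free[o][i + 1] = true;
            }
            return (int)(i << order);
        }
    }
    return -1;
}

/* Caller must pass a page/order pair previously returned by buddy_alloc(). */
void buddy_free(unsigned page, unsigned order)
{
    unsigned i = page >> order;

    /* The buddy's index differs only in the lowest bit. Merge while it is free. */
    while (order < MAX_ORDER - 1 && is_free[order][i ^ 1]) {
        is_free[order][i ^ 1] = false;
        i >>= 1;
        order++;                            /* cascade to the next larger list */
    }
    is_free[order][i] = true;
}
```

After buddy_init(), a request of order 7 (128 pages) splits the 512-page chunk into a free 256, a free 128, and the returned 128, matching the example on the slide.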

12
Buddy System Data Structures
  • Per-zone buddy system
  • free_area_t free_area[10] (zone descriptor field)
  • free_area_t:
  • circular list of chunks (threaded through
    mem_map)
  • bitmap: 1 bit for every pair
  • 1 means one in use, one free; 0 means both in use
    or both free
  • see figure 7.2 page 236
  • alloc_pages():
  • try to free pages if below threshold (blocking)
  • zone->free_area[order]
  • split as necessary
  • Gruesome details in the book
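
The one-bit-per-pair encoding works because every allocation or release of one buddy simply toggles the pair's bit. When a block is being freed and the toggled bit comes out 0, the other buddy must be free as well, so the two can merge. A minimal sketch for a single order, with hypothetical names:

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_PAIRS 256u                      /* 512 blocks form 256 buddy pairs */

static uint8_t pair_bits[NUM_PAIRS / 8];    /* all 0 at start: both buddies free */

/* Toggle the bit of the pair containing block `index`; return its new value. */
bool toggle_pair_bit(unsigned index)
{
    unsigned pair = index >> 1;             /* blocks 2k and 2k+1 share a bit */

    pair_bits[pair / 8] ^= (uint8_t)(1u << (pair % 8));
    return (pair_bits[pair / 8] >> (pair % 8)) & 1u;
}

/* Call when block `index` is released: true means its buddy is free too. */
bool can_merge_after_free(unsigned index)
{
    return !toggle_pair_bit(index);
}
```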

13
Slab Cache Allocator
  • memory area: variable-sized region
  • internal fragmentation: small allocation, big
    block
  • Slab Cache (Sun 1994)
  • generator for per-descriptor allocators
  • includes memory cache
  • slabs: big memory chunks from page allocator
  • benefits / characteristics:
  • object-oriented: ctor/dtor (not used by Linux)
  • good cache utilization (coloring)
  • small cache footprint: just a few instructions
    to allocate
  • /proc/slabinfo

14
Caches, Slabs, Objects
  • caches: kmem_cache_t
  • name, object size, num per slab, spinlock
  • slab tracking (full, partial, free)
  • cache_cache (!): cache of cache descriptors
  • slab: slab_t
  • descriptors stored either internally or
    externally
  • internally: front of slab
  • externally: in general purpose cache
  • num inuse, first, first free, linkage
  • objects: actual allocated memory

15
Cache Operations
  • kmem_cache_create()
  • mostly used by modules
  • kmem_cache_destroy()
  • first frees, releases slabs
  • kmem_cache_grow()
  • get free slabs from page allocator
  • kmem_cache_shrink()
  • return free slabs to page allocator

16
Slab Management Diagram
[Diagram: cache_cache, three cache descriptors, three slab descriptors; labels: first, free]

17
Slab Allocator and Buddy System
  • caches request slabs from page allocator
  • kmem_getpages()
  • caches return slabs to page allocator
  • kmem_freepages()
  • slabs are returned under memory pressure
  • one kernel strategy to free memory
  • others: flush memory caches, swap, kill processes

18
Objects
  • object descriptors: kmem_bufctl_t
  • array of descriptors after slab descriptor
  • stored internally or externally
  • freelist linked through free descriptors
  • descriptors point to actual memory
  • operations:
  • kmem_cache_alloc()
  • kmem_cache_free()
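
The freelist threaded through the object descriptors can be illustrated with a single fixed-size slab. This is a simplified user-space sketch with hypothetical names (toy_slab, slab_alloc, slab_free); it keeps the descriptor array inside the slab and leaves out caches, constructors, locking, and coloring.

```c
#include <stddef.h>

#define OBJS_PER_SLAB 8u
#define OBJ_SIZE      32u
#define FREELIST_END  ((unsigned)-1)    /* marks the end of the free chain */

typedef struct {
    unsigned inuse;                     /* objects currently allocated */
    unsigned first_free;                /* index of first free object */
    unsigned next_free[OBJS_PER_SLAB];  /* per-object descriptor: next free index */
    _Alignas(max_align_t) unsigned char mem[OBJS_PER_SLAB * OBJ_SIZE];
} toy_slab;

void slab_init(toy_slab *s)
{
    for (unsigned i = 0; i < OBJS_PER_SLAB; i++)
        s->next_free[i] = (i + 1 < OBJS_PER_SLAB) ? i + 1 : FREELIST_END;
    s->first_free = 0;
    s->inuse = 0;
}

/* Pop the first free object; NULL when the slab is full. */
void *slab_alloc(toy_slab *s)
{
    unsigned i = s->first_free;

    if (i == FREELIST_END)
        return NULL;
    s->first_free = s->next_free[i];
    s->inuse++;
    return s->mem + (size_t)i * OBJ_SIZE;
}

/* Push the object back on the freelist; obj must come from slab_alloc(s). */
void slab_free(toy_slab *s, void *obj)
{
    unsigned i = (unsigned)(((unsigned char *)obj - s->mem) / OBJ_SIZE);

    s->next_free[i] = s->first_free;
    s->first_free = i;
    s->inuse--;
}
```

Both operations are a handful of loads and stores, which is the "just a few instructions to allocate" property claimed for the slab allocator.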

19
Alignment
  • memory objects must be aligned
  • slow or impossible to perform unaligned accesses
  • natural alignment
  • n-byte item aligned on a multiple of n
  • example: int is 4-byte aligned; double is 8-byte aligned
  • alignment based on first field in object
  • alignment specified at cache creation
  • specific size
  • page alignment (4KB on Intel)
  • hardware cache line (32 bytes on Intel)
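
Rounding a size or address up to an alignment boundary is usually done with a mask when the alignment is a power of two. A minimal sketch, assuming power-of-two alignments only; the helper names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Round value up to the next multiple of align (align must be a power of 2). */
size_t align_up(size_t value, size_t align)
{
    return (value + align - 1) & ~(align - 1);
}

/* Nonzero when addr is a multiple of align (align must be a power of 2). */
int is_aligned(uintptr_t addr, size_t align)
{
    return (addr & (align - 1)) == 0;
}
```

For instance, a 33-byte object aligned to a 32-byte cache line occupies 64 bytes, which is where alignment and internal fragmentation interact.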

20
Slab Coloring
  • problem for aligned objects
  • tend to map to same cache lines!
  • (hw caches hash modulo cache size)
  • n-way cache allows n collisions (expensive)
  • clever trick
  • add a little extra space at beginning of each
    slab
  • objects within a slab may contend; objects across
    slabs do not
  • color calculation (how much extra space?)
  • a little complicated
  • must consider slab size, object size, alignment
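
One way to picture the color calculation: whatever space is left over in a slab after packing the objects is handed out, one alignment step at a time, as a starting offset for successive slabs. The sketch below is a simplified model under that assumption. It ignores the slab descriptor, the names are hypothetical, and it is not the kernel's exact computation.

```c
#include <stddef.h>

typedef struct {
    size_t align;        /* offset step, e.g. the hardware cache line size */
    size_t num_colors;   /* number of distinct offsets that still fit */
    size_t next_color;   /* color to give to the next slab */
} color_state;

/* align must be nonzero. */
void color_init(color_state *c, size_t slab_size, size_t obj_size,
                size_t num_objs, size_t align)
{
    size_t used = obj_size * num_objs;
    size_t left_over = (used <= slab_size) ? slab_size - used : 0;

    c->align = align;
    c->num_colors = left_over / align + 1;   /* offset 0 is always available */
    c->next_color = 0;
}

/* Byte offset of the first object in the next slab; cycles through the colors. */
size_t next_color_offset(color_state *c)
{
    size_t offset = c->next_color * c->align;

    c->next_color = (c->next_color + 1) % c->num_colors;
    return offset;
}
```

With a 4096-byte slab holding twenty 200-byte objects and 32-byte alignment, 96 bytes are left over, so successive slabs start their objects at offsets 0, 32, 64, 96 and then wrap around.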

21
Allocate/Free Objects
  • code description in book
  • special cases for SMP
  • per-CPU freelist (no locking required!)

22
kmalloc()
  • just an array of general purpose slab caches!
  • cache_sizes
  • powers of two: 32 .. 131,072
  • kmalloc(size, flags)
  • kfree(p)
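
Choosing a general purpose cache for a request amounts to rounding the size up to the next power of two in the range given above. A sketch of that lookup, with a hypothetical function name and a loop in place of the kernel's cache_sizes table walk:

```c
#include <stddef.h>

#define MIN_CLASS 32u        /* smallest general purpose cache on the slide */
#define MAX_CLASS 131072u    /* largest general purpose cache on the slide */

/* Object size of the cache that would serve `size` bytes, or 0 if too large. */
size_t size_class(size_t size)
{
    size_t cls = MIN_CLASS;

    if (size > MAX_CLASS)
        return 0;
    while (cls < size)
        cls <<= 1;
    return cls;
}
```

A 100-byte request lands in the 128-byte cache, so 28 bytes are lost to internal fragmentation. Requests above 131,072 bytes need a different allocator.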

23
vmalloc()
  • noncontiguous, page-based allocator
  • used for largish allocations
  • not suitable for all memory requests
  • enough pages but not contiguous?
  • modifies kernel page table
  • make pages look virtually contiguous
  • allocated in top GB after physical mem map
  • on a system with 1GB: after 896MB
  • places 4KB red zones around allocations
  • vmlist: linked list of vm_struct descriptors
  • descriptors allocated using kmalloc()
  • see code in book: includes page table
    manipulations

24
vmalloc() Diagram
[Diagram: address space from 0 to 4GB. Labels: PAGE_OFFSET, up to 896MB (based on physical memory), high_mem, vmalloc() area, high memory mappings. vmlist links three vmalloc area descriptors (from kmalloc()).]