Storage - PowerPoint PPT Presentation

Provided by: davidlm3
1
Storage
2
Parts of a computer
  • For purposes of this talk, we will assume three
    main parts to a computer
  • Main memory, once upon a time called core, but
    these days called RAM (Random Access Memory)
  • RAM consists of a very long sequence of bits,
    organized into bytes (8-bit units) or words
    (longer units)
  • Peripheral memory, these days called disks (even
    when they aren't) or drives
  • Peripheral memory consists of a very, very long
    sequence of bits, organized into pages of words
    or bytes
  • Peripheral memory is thousands of times slower
    than RAM
  • The CPU (Central Processing Unit), which
    manipulates these bits, and moves them back and
    forth between main memory and peripheral memory

3
It's all bits
  • Everything in a computer is represented by a
    sequence of bits: integers, floating point
    numbers, characters, and, most importantly,
    instructions
  • Bits are the ultimate flexible representation, at
    least until we have working quantum computers,
    which use qubits (quantum bits)
  • Modern languages use strong typing to prevent you
    from accidentally treating a floating point
    number as a boolean, or a string as an integer
  • A weakly typed language provides some protection,
    but there are ways around it
  • But it wasn't always this way...
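
To illustrate the point that the same bits can be read as different kinds of value, here is a minimal Java sketch. The class name AllBits is made up for this example; Float.intBitsToFloat is a standard library method.

```java
// Hypothetical demo class: one bit pattern, several interpretations.
class AllBits {
    public static void main(String[] args) {
        int bits = 0x3F800000;                           // one 32-bit pattern
        System.out.println(bits);                        // read as an integer: 1065353216
        System.out.println(Float.intBitsToFloat(bits));  // read as a float: 1.0
        System.out.println(0x41);                        // the bits 0100 0001 as an integer: 65
        System.out.println((char) 0x41);                 // the same bits as a character: A
    }
}
```

Java only allows these reinterpretations through explicit calls or casts, which is the protection that strong typing provides.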

4
Storage is storage
  • At one time, words representing machine
    instructions and words representing data could be
    intermixed
  • Strong typing was a thing of the future
  • It was the programmer's responsibility to avoid
    executing data, or doing arithmetic on
    instructions
  • Both of these things could be done, either
    accidentally or deliberately
  • Machine instructions are just a sequence of bits
  • They can be manipulated like any other sequence
    of bits
  • Hence, programmers could change any instruction
    into any other instruction (of the same size), or
    rewrite whole blocks of instructions
  • A self-modifying program is one that changes its
    own instructions

5
Self-modifying programs
  • Once upon a time, self-modifying programs were
    thought of as a good thing
  • Just think of how flexible your programs could
    be!
  • ...yes, and smoking was once considered good for
    your health
  • The usual way to step through an array was by
    adding one to the address part of a load or store
    instruction
  • You could write some really clever self-modifying
    programs
  • But, as the poet Ogden Nash says:
  • Here's a good rule of thumb: Too clever is dumb.

6
Preparation for next example
  • In the next example, we will talk about how a
    higher-level language might be translated into
    assembly language
  • Here are some of the assembly instructions we
    will use
  • The load instruction copies a value from a memory
    location into a special register called the
    accumulator
  • Example: load 53 gets whatever is in location 53
    and puts it into the accumulator
  • The enter instruction puts a given value into the
    accumulator
  • Example: enter 53 puts 53 itself into the
    accumulator
  • All arithmetic is done in the accumulator
  • Example: add 53 adds the contents of location 53
    to the accumulator
  • The store instruction copies a value from the
    accumulator into memory
  • Example: store 53 puts whatever is in the
    accumulator into location 53
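
The four instructions above can be modeled in a few lines of Java. This is only a sketch: the class name AccumulatorMachine, the 100-word memory, and the method names are assumptions made for illustration, not part of any real machine.

```java
// Hypothetical model of the accumulator machine described above.
class AccumulatorMachine {
    final int[] memory = new int[100];   // assumed memory size
    int acc;                             // the accumulator

    void load(int addr)   { acc = memory[addr]; }   // load 53: acc gets the contents of location 53
    void enter(int value) { acc = value; }          // enter 53: acc gets 53 itself
    void add(int addr)    { acc += memory[addr]; }  // add 53: acc gets acc + contents of location 53
    void store(int addr)  { memory[addr] = acc; }   // store 53: location 53 gets acc

    public static void main(String[] args) {
        AccumulatorMachine m = new AccumulatorMachine();
        m.enter(7);  m.store(53);   // location 53 now holds 7
        m.enter(53);                // acc = 53 (the number, not the contents)
        m.add(53);                  // acc = 53 + 7 = 60
        m.store(54);                // location 54 now holds 60
        m.load(53);                 // acc = 7
        System.out.println(m.memory[54] + " " + m.acc);  // prints "60 7"
    }
}
```

The difference between enter 53 and load 53 is the one to watch: the first treats 53 as a value, the second as an address.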

7
Procedure calls
  • Consider the following:
        a = add(b, c)
        ...
        function add(x, y)
            return x + y
  • Here's how it might have been translated to
    assembly language in the old days (on the slide, values that
    are filled in as the program runs are shown in red):
        42  0                         // a
        43  10                        // b
        44  15                        // c

        20  load from 43              // addr of b
        21  store in 71
        22  load from 44              // addr of c
        23  store in 72
        24  enter 27                  // the return addr
        25  store in addr part of 70
        26  jump to 73
        27  store in 42               // addr of a

        70  jump to 27                // gets return addr
        71  10                        // will receive b
        72  15                        // will receive c
        73  load value at addr 71
        74  add value at addr 72
        75  jump to 70
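
The calling sequence above can be run in a small simulator. The sketch below rests on several assumptions: the word encoding (opcode * 1000 + address), the opcode numbers, and a HALT instruction at location 28 are all invented here so the simulation has something to execute and somewhere to stop. The memory layout and instruction order follow the listing.

```java
// Hypothetical simulator for the self-modifying call shown above.
class SelfModifyingCall {
    static final int HALT = 0, LOAD = 1, STORE = 2, ENTER = 3, ADD = 4, JUMP = 5, STADDR = 6;

    // Assumed encoding: one word holds opcode * 1000 + address part.
    static int ins(int op, int addr) { return op * 1000 + addr; }

    static int[] run() {
        int[] m = new int[100];
        m[42] = 0;  m[43] = 10;  m[44] = 15;             // a, b, c
        m[20] = ins(LOAD, 43);   m[21] = ins(STORE, 71);
        m[22] = ins(LOAD, 44);   m[23] = ins(STORE, 72);
        m[24] = ins(ENTER, 27);  m[25] = ins(STADDR, 70);
        m[26] = ins(JUMP, 73);   m[27] = ins(STORE, 42);
        m[28] = ins(HALT, 0);                            // added so the simulation stops
        m[70] = ins(JUMP, 0);                            // return jump; address is patched at run time
        m[73] = ins(LOAD, 71);   m[74] = ins(ADD, 72);
        m[75] = ins(JUMP, 70);

        int acc = 0, pc = 20;
        while (true) {
            int op = m[pc] / 1000, addr = m[pc] % 1000;
            pc++;
            switch (op) {
                case LOAD:   acc = m[addr]; break;
                case STORE:  m[addr] = acc; break;
                case ENTER:  acc = addr; break;
                case ADD:    acc += m[addr]; break;
                case JUMP:   pc = addr; break;
                case STADDR: m[addr] = m[addr] / 1000 * 1000 + acc; break;  // rewrites an instruction
                default:     return m;                   // HALT
            }
        }
    }

    public static void main(String[] args) {
        int[] m = run();
        System.out.println(m[42]);   // prints 25, the value stored in a
        System.out.println(m[70]);   // prints 5027: "jump to 27" after patching
    }
}
```

The STADDR case is where the program modifies itself: the word at location 70 is an instruction, and the caller overwrites its address part before jumping into the function.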

8
Problems with the previous code
  • In this example, storage was static: you always
    knew where everything was (and it didn't move
    around)
  • If you called a function, you told it where to
    return to, by storing the return address in the
    function itself
  • Hence, you could call the function from (almost)
    anywhere, and it would find its way back
  • You stored the parameter values in the function
    itself
  • This worked fine until recursion was invented
  • Recursion requires:
  • Multiple return addresses
  • Multiple copies of parameters and local variables
  • In other words, recursion requires dynamic storage
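
A short Java sketch can show why one fixed slot per parameter breaks under recursion. Everything here is illustrative: the class StaticVsDynamic and the static fields param and result stand in for fixed memory locations. Java still keeps the return addresses on its own stack, so only the parameter problem is modeled.

```java
// Hypothetical comparison of static parameter storage with per-call storage.
class StaticVsDynamic {
    static int param;    // the single fixed slot for the parameter
    static int result;   // the single fixed slot for the result

    // Intended to compute result = param!, with the parameter in static storage.
    static void factStatic() {
        if (param <= 1) { result = 1; return; }
        param = param - 1;          // the argument for the inner call goes into the same slot
        factStatic();
        result = param * result;    // wrong: param was overwritten by the inner calls
    }

    // Each call gets its own copy of n, allocated dynamically on the stack.
    static int factDynamic(int n) {
        return n <= 1 ? 1 : n * factDynamic(n - 1);
    }

    public static void main(String[] args) {
        param = 4;
        factStatic();
        System.out.println(result);           // prints 1, not 24
        System.out.println(factDynamic(4));   // prints 24
    }
}
```

By the time any call to factStatic resumes, the slot holds the innermost call's value, so the outer values of the parameter are gone.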

9
The end of an era
  • What really killed off self-modifying programs
    was the advent of timesharing computers
  • Multiple users, or at least multiple programs,
    could share the computer, taking turns
  • But there isn't always enough main memory to
    satisfy everybody
  • When one program is running, another program (or
    parts of it) may need to be copied to disk, and
    brought back in again later
  • This is really, really slow
  • If only the data changed, not the program, we
    wouldn't have to save the program (which is often
    the largest part) over and over and over...
  • Besides, with the new emphasis on understandable
    programs, self-modifying programs were turning
    out to be a Really Bad Idea
  • Besides, think about what a security nightmare
    self-modifying programs could be!

10
An aside: compilers and loaders
  • Although self-modifying code is a bad idea, it is
    still necessary for computers to be able to
    create and modify machine instructions
  • This is what a compiler does: it creates machine
    instructions
  • A loader takes a compiled program and puts it
    somewhere in the computer's memory
  • It can't always put it in the same place, so it
    has to be able to modify the addresses in the
    instructions
  • Still, compilers and loaders don't modify
    themselves

11
Static and dynamic storage
  • In the beginning, storage was static: you declared
    your variables at the beginning of the program,
    and that was all you got
  • A procedure or function with, say, three
    parameters, got three words in which to store
    them
  • The parameters went in a fixed, known location in
    memory, assigned to them by the compiler
  • Recursion had not yet been invented
  • The programming language Algol 60 introduced
    recursive functions and procedures
  • Parameters went onto a stack
  • Hence, parameters were dynamically assigned to
    memory locations, not by the compiler, but by the
    running program itself
  • Storage was dynamically allocated and deallocated
    as needed

12
Stacks
  • Stacks obey a simple regimen: last in, first out
    (LIFO)
  • When you enter a function or procedure or method,
    storage is allocated for you on the stack
  • When you leave, the storage is released
  • In Java, this is even more fine-grained: storage
    is allocated and deallocated for individual
    blocks, and even for "for" statements
  • Since this is so well-defined, your compiler
    writes the code to do it for you
  • But it's still dynamic: done by your running
    program
  • Since virtually every language supports recursion
    these days (and all the popular languages do),
    computers typically provide machine-language
    instructions to simplify stack operations
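
The last-in, first-out discipline can be made visible by managing the stack by hand. The Java sketch below is a simplification: each "frame" is just one saved parameter, and the class and method names are made up for this example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical demo: factorial with an explicit stack instead of recursion.
class ExplicitStack {
    static long factorial(int n) {
        Deque<Integer> stack = new ArrayDeque<>();
        while (n > 1) {               // "entering" a call: allocate a frame
            stack.push(n);
            n--;
        }
        long result = 1;
        while (!stack.isEmpty()) {    // "leaving" a call: release the most recent frame
            result *= stack.pop();    // frames come off in reverse order: 2, 3, ..., n
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(factorial(5));   // prints 120
    }
}
```

A compiler generates the equivalent pushes and pops for every call, which is why the programmer normally never sees them.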

13
Heaps
  • Stacks are great, but they have their limitations
  • Suppose you want to write a method to read in an
    array
  • You enter the method, and declare the array, thus
    dynamically allocating space for it
  • You read values into the array
  • You return from the method and POOF! your array
    is gone
  • You need something more flexible: something where
    you have control over allocation and deallocation
  • The invention that allows this (which came
    somewhat later than the stack, I'm not sure when)
    is the heap
  • You explicitly get storage via malloc (C) or new
    (Java)
  • The storage remains until you are done with it

14
Stacks vs. heaps
  • Stack allocation and deallocation is very regular
  • Heap allocation and deallocation is unpredictable
  • Stack allocation and deallocation is handled by
    the compiler
  • Heap allocation is at the whim of the programmer
  • Heap deallocation may also be up to the
    programmer (C, C++) or handled by the programming
    language system (Java)
  • Values on stacks are typically small and uniform
    in size
  • In Java, arrays and objects don't go in the
    stack; references to them do
  • Values on the heap can be any size
  • Stacks are tightly packed, with no wasted space
  • Deallocation can leave gaps in the heap

15
Implementing a heap
  • A heap is a single large area of storage
  • When the program requests a block of storage, it
    is given a pointer (reference) to some part of
    this storage that is not already in use
  • The task of the heap routines is to keep track of
    which parts of the heap are available and which
    are in use
  • To do this, the heap routines create a linked
    list of blocks of varying sizes
  • Every block, whether available or in use,
    contains header information about the block
  • We will describe a simple implementation in which
    each block header contains two items of
    information
  • A pointer to the next block, and
  • The size of this block

16
Anatomy of a block
  • Here is our simple block
  • Java Objects hold more information than this (for
    example, the class of the object)
  • Notice that our implementation will return a
    pointer to the first word available to the user
  • Data with negative offsets are header data
  • ptr-1 contains the size of this block, including
    header information
  • ptr-2 will be used to construct a free space list
    of available blocks

17
The heap, I
  • Initially, the user has no blocks, and the free
    space list consists of a single block
  • In our implementation, we will allocate space
    from the end of the block
  • To begin, let's assume that the user asks for a
    block of two words

18
The heap, II
  • The user has asked for a block of size 2
  • The free block is reduced in size from 20 to 16
    (two words asked for by the user, plus two for a
    new header)
  • The new block has size 4 and the next field is
    not used
  • Next, assume the user asks for a block of three
    words

19
The heap, III
  • The user has asked for a block of size 3
  • The free block is reduced in size from 16 to 11
    (three words asked for by the user, plus two for
    a new header)
  • The new block has size 5 and the next field is
    not used
  • Next, assume the user asks for a block of just
    one word

20
The heap, IV
  • The user has asked for a block of size 1
  • The free block is reduced in size from 11 to 8
    (one word for the user, plus two for a new
    header)
  • The new block has size 3 and the next field is
    not used
  • Next, the user releases the second block (at 13)

21
The heap, V
  • The user has released the block of size 5
  • The freed block is added to the front of the free
    space list
  • Its next field is set to the old value of free
  • free is set to point to this block
  • Next, the user requests a block of size 4
  • The first block on the free list isn't large
    enough, so we have to go to the next free block

22
The heap, VI
  • The user requests a block of size 3
  • The size of the first free block is now 3, and
    its next field does not change
  • The user gets a pointer to the new block
  • Now the user releases the smallest block (at 10)
  • Again, this will be added to the beginning of the
    free space list

23
The heap, VII
  • The user releases the smallest block (at 10)
  • The freed block is added to the front of the free
    space list
  • Its next field is set to the old value of free
  • free is set to point to this block
  • Now the user requests a block of size 4
  • Currently, we cannot satisfy this request
  • We have enough space, but no single block is
    large enough
  • However, free blocks 10 and 13 are adjacent to
    each other
  • We can coalesce blocks 10 and 13

24
The heap, VIII
  • Blocks at 10 and 13 have now been coalesced
  • The size of the new block is the sum of the sizes
    of the old blocks
  • We had to adjust the links
  • Now we can give the user a block of size 4

25
Pointers
  • Allocating storage from the heap is easy
  • Person p = new Person();
  • In Java, you request storage from the heap with
    new; there is no other way to get storage on the
    heap
  • All Objects are on the heap
  • In C and C++ you get a pointer to the new
    storage; in Java you get a reference
  • The implementation is identical; the difference
    is that there are more operations on pointers
    than on references
  • C and C++ provide operations on pointers
  • C and C++ let you do arithmetic on pointers, for
    example, p++
  • Pointers are pervasive in C and C++; you can't
    avoid them

26
Advantages/disadvantages
  • Pointers give you
  • Greater flexibility and (maybe) convenience
  • A much more complicated syntax
  • More ways to create hard-to-find errors
  • Serious security holes
  • References give you
  • Less flexibility (no pointer arithmetic)
  • Simpler syntax, more like that of other variables
  • Much safer programs with fewer mysterious bugs
  • Pointer arithmetic is inherently unsafe
  • You can accidentally point to the wrong thing
  • You cannot be sure of the type of the thing you
    are pointing to

27
Deallocation
  • There are two potential errors when de-allocating
    (freeing) storage yourself
  • De-allocating too soon, so that you have dangling
    references (pointers to storage that has been
    freed and possibly reused)
  • A dangling reference is not a null link; it points
    to something (you just don't know what)
  • Forgetting to de-allocate, so that unused storage
    accumulates and you have a memory leak
  • If you have to de-allocate storage yourself, a
    good strategy is to keep track of which function
    or method owns the storage
  • The function that owns the storage is responsible
    for de-allocating it
  • Ownership can be transferred to another function
    or method
  • You just need a clearly defined policy for
    determining ownership
  • In practice, this is easier said than done

28
Discipline
  • Most C/C++ advocates say:
  • It's just a matter of being disciplined
  • I'm disciplined, even if other people aren't
  • Besides, there are good tools for finding memory
    problems
  • However:
  • Virtually all large C/C++ programs have memory
    problems

29
Garbage collection
  • Garbage is storage that has been allocated but is
    no longer available to the program
  • It's easy to create garbage
  • Allocate some storage and save the pointer to it
    in a variable
  • Assign a different value to that variable
  • A garbage collector automatically finds and
    de-allocates garbage
  • This is far safer (and more convenient) than
    having the programmer do it
  • Dangling references cannot happen
  • Memory leaks, while not impossible, are pretty
    unlikely
  • Practically every modern language, not including
    C, uses a garbage collector

30
Garbage collection algorithms
  • There are two well-known algorithms (and several
    not so well known ones) for doing garbage
    collection
  • Reference counting
  • Mark and sweep

31
Reference counting
  • When a block of storage is allocated, it includes
    header data that contains an integer reference
    count
  • The reference count keeps track of how many
    references the program has to that block
  • Any assignment to a reference variable modifies
    reference counts
  • If the variable previously referenced an object
    (was not null), the reference count of that
    object is decremented
  • If the new value is an object (not null), the
    reference count for the new object is incremented
  • When a reference count reaches zero, the storage
    can immediately be garbage collected
  • For this to work, the reference count has to be
    at a known displacement from the reference
    (pointer)
  • If arbitrary pointer arithmetic is allowed, this
    condition cannot be guaranteed
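
The assignment rule above can be modeled directly. In this Java sketch the count lives in an ordinary field rather than in a block header, and the names RefCounted, RefCountDemo, and assign are made up for illustration. The new value is incremented before the old one is decremented so that assigning a variable to itself does not free the object by mistake; that ordering is a choice made here, not something the slides specify.

```java
// Hypothetical model of a reference-counted block.
class RefCounted {
    int refCount;      // stands in for the count kept in the block header
    boolean freed;     // set when the count reaches zero
}

class RefCountDemo {
    // Models "variable = newValue" where the variable currently holds oldValue.
    // Returns newValue so callers can write v = assign(v, x).
    static RefCounted assign(RefCounted oldValue, RefCounted newValue) {
        if (newValue != null) newValue.refCount++;
        if (oldValue != null && --oldValue.refCount == 0) {
            oldValue.freed = true;               // collected immediately
        }
        return newValue;
    }

    public static void main(String[] args) {
        RefCounted a = new RefCounted(), b = new RefCounted();
        RefCounted p = assign(null, a);          // a has 1 reference
        RefCounted q = assign(null, a);          // a has 2 references
        p = assign(p, b);                        // a has 1, b has 1
        q = assign(q, null);                     // a has 0 and is freed
        System.out.println(a.freed + " " + b.refCount);   // prints "true 1"
    }
}
```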

32
Problems with reference counting
  • If object A points to object B, and object B
    points to object A, then each is referenced, even
    if nothing else in the program references either
    one
  • This fools the garbage collector, which doesn't
    collect either object A or object B
  • Thus, reference counting is imperfect and
    unreliable; memory leaks still happen
  • However, reference counting is a simple technique
    and is occasionally used
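
The cycle problem is easy to reproduce with hand-maintained counts. In this Java sketch, CountedNode, setOther, and buildAndDropCycle are hypothetical names, and the program's own variables are modeled by adjusting the counts directly.

```java
// Hypothetical demo: two objects that reference each other never reach a count of zero.
class CountedNode {
    int refCount;
    CountedNode other;     // a reference stored inside the object

    // Models "holder.other = target" with the count updates reference counting requires.
    static void setOther(CountedNode holder, CountedNode target) {
        if (target != null) target.refCount++;
        if (holder.other != null) holder.other.refCount--;
        holder.other = target;
    }

    static int[] buildAndDropCycle() {
        CountedNode a = new CountedNode(), b = new CountedNode();
        a.refCount = 1;  b.refCount = 1;   // one program variable refers to each object
        setOther(a, b);                    // A points to B
        setOther(b, a);                    // B points to A
        a.refCount--;  b.refCount--;       // the program gives up both of its variables
        return new int[] { a.refCount, b.refCount };
    }

    public static void main(String[] args) {
        int[] counts = buildAndDropCycle();
        System.out.println(counts[0] + " " + counts[1]);   // prints "1 1": neither is collected
    }
}
```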

33
Mark and sweep
  • When memory runs low, languages that use
    mark-and-sweep temporarily pause the program and
    run the garbage collector
  • The collector marks every block
  • It then does an exhaustive search, starting from
    every reference variable in the program, and
    unmarks all the storage it can reach
  • When done, every block that is still marked must
    not be accessible from the program; it is garbage
    that can be freed
  • In order for this technique to work:
  • It must be possible to find every block (so they
    are in a linked list)
  • It must be possible to find and follow every
    reference
  • The mark has to be at a known displacement from
    the reference
  • Again, this is not compatible with arbitrary
    pointer arithmetic
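
The steps above can be sketched in Java. The sketch keeps the convention used here (mark everything, then unmark what is reachable), and it models blocks as objects with an explicit list of outgoing references. GcBlock, MarkSweep, and collect are hypothetical names, and a real collector works on raw memory rather than on Java lists.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a heap block with a mark bit and outgoing references.
class GcBlock {
    final String name;
    boolean mark;
    final List<GcBlock> refs = new ArrayList<>();
    GcBlock(String name) { this.name = name; }
}

class MarkSweep {
    // allBlocks: every allocated block; roots: the program's reference variables.
    // Returns the garbage blocks and removes them from allBlocks.
    static List<GcBlock> collect(List<GcBlock> allBlocks, List<GcBlock> roots) {
        for (GcBlock b : allBlocks) b.mark = true;      // 1. mark every block
        for (GcBlock r : roots) unmark(r);              // 2. unmark everything reachable
        List<GcBlock> garbage = new ArrayList<>();
        for (GcBlock b : allBlocks) {                   // 3. whatever is still marked is garbage
            if (b.mark) garbage.add(b);
        }
        allBlocks.removeAll(garbage);
        return garbage;
    }

    static void unmark(GcBlock b) {
        if (b == null || !b.mark) return;   // already visited, so cycles terminate
        b.mark = false;
        for (GcBlock r : b.refs) unmark(r);
    }
}
```

Given blocks a, b, c, d where a refers to b, and c and d refer only to each other, calling collect with a as the sole root returns c and d. The unreachable cycle that defeats reference counting is collected here.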

34
Problems with mark and sweep
  • Mark-and-sweep is a complex algorithm that takes
    substantial time
  • Unlike reference counting, it must be done all at
    once; nothing else can be going on
  • The program stops responding during garbage
    collection
  • This is unsuitable for many real-time applications

35
Garbage collection in Java
  • Java uses mark-and-sweep
  • Mark-and-sweep is highly reliable, but may cause
    unexpected slowdowns
  • You can ask Java to do garbage collection at a
    time you feel is more appropriate
  • The call is System.gc()
  • But not all implementations respect your request
  • This problem is known and is being worked on
  • There is also a Real-time Specification for Java
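
As a small illustration of requesting a collection, the Java sketch below creates some garbage and then calls System.gc(). The class name GcRequest and the 10 MB array size are arbitrary choices. No expected output is given, because whether and when the JVM honors the request, and what the memory figures are, depends on the implementation.

```java
// Hypothetical demo: asking the JVM to collect garbage at a chosen moment.
class GcRequest {
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        byte[] big = new byte[10_000_000];
        big[0] = 1;                   // touch the array so it is really used
        big = null;                   // the array is now garbage
        long before = usedBytes();
        System.gc();                  // a request only; the JVM may ignore it
        long after = usedBytes();
        System.out.println("used before: " + before + ", after: " + after);
    }
}
```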

36
No garbage collection in C or C++
  • C and C++ do not have garbage collection; it is up
    to the programmer to explicitly free storage when
    it is no longer needed by the program
  • C and C++ have pointer arithmetic, which means
    that pointers might point anywhere
  • There is no way to do reference counting if the
    programming language does not have strict control
    over pointers
  • There is no way to do mark-and-sweep if the
    programming language does not have strict control
    over pointers
  • Pointer arithmetic and garbage collection are
    incompatible; it is essentially impossible to
    have both

37
The End