Title: Arrays and Pointers
1Arrays and Pointers
Programming Language Principles Lecture 23
- Prepared by
- Manuel E. Bermúdez, Ph.D.
- Associate Professor
- University of Florida
2Arrays
- Most common composite data type.
- Semantically, viewed as a mapping from the index type to the element type.
- Some languages permit only integers as the index type; others allow any scalar.
3Array Declaration Syntax
- C
- char upper[26]; /* array of 26 chars, 0..25 */
- Fortran
- character(26) upper
- Pascal
- var upper : array ['a'..'z'] of char;
- Ada
- upper : array (character range 'a'..'z') of character;
4Arrays and functions in Ada
- In either case (upper declared as an array or as a function), upper('a') returns 'A'.
5Multi-Dimension Arrays
- Ada
- matrix : array (1..10, 1..10) of real;
- Modula-3
- VAR matrix : ARRAY [1..10], [1..10] OF REAL;
- (same as)
- VAR matrix : ARRAY [1..10] OF ARRAY [1..10] OF REAL;
- and matrix[3,4] is the same as matrix[3][4].
6Multi-Dimension Arrays (contd)
- In Ada,
- matrix : array (1..10, 1..10) of real;
- is NOT the same as
- matrix : array (1..10) of array (1..10) of real;
- matrix(3)(4) is not legal in the first form.
- matrix(3,4) is not legal in the second form.
7Multi-Dimension Arrays (contd)
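- With an array of arrays, a single row can be referenced on its own, as a slice.
- In C: double matrix[10][10];
- However, C integrates arrays and pointers, so matrix[3] does not behave as a first-class array of 10 doubles.
- It is (depending on context) either
- a pointer to row 3 of matrix (the fourth row, since C indexes from 0), or
- when dereferenced, the value of matrix[3][0].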
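The pointer behavior of a row can be checked directly. The sketch below is only illustrative: the helper names row_start and first_in_row are made up for this example, and the 10x10 matrix simply mirrors the declaration on the slide.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 10x10 matrix, mirroring the declaration on the slide. */
static double matrix[10][10];

/* Returns a pointer to the first element of row r.
   In the return expression, matrix[r] (type double[10]) is converted
   to a double* that points at matrix[r][0]. */
double *row_start(size_t r) {
    assert(r < 10);
    return matrix[r];
}

/* Reads the first element of row r through the converted pointer. */
double first_in_row(size_t r) {
    return *row_start(r);   /* same value as matrix[r][0] */
}
```

One exception worth knowing: as the operand of sizeof, matrix[3] is not converted, so sizeof matrix[3] is the size of ten doubles.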
8Slices in Fortran
9Array Dimensions, Bounds and Allocation
- Five cases:
- Global lifetime, static shape: static allocation (easy).
- Local lifetime, static shape: space allocated on a stack frame.
- Local lifetime, shape bound at elaboration: stack frame needs a fixed-size part and a variable-size part.
- Arbitrary lifetime, shape bound at elaboration: (Java) programmer allocates space.
- Arbitrary lifetime, dynamic shape: array lives on the heap.
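As a rough illustration, most of these cases have a counterpart in C. The sketch assumes a compiler that supports C99 variable-length arrays (optional since C11); all function and variable names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Global lifetime, static shape: allocated statically. */
static int global_table[8];

/* Local lifetime, static shape: fixed-size array in the stack frame. */
int sum_fixed(void) {
    int local[4] = {1, 2, 3, 4};
    int s = 0;
    for (int i = 0; i < 4; i++) s += local[i];
    return s;
}

/* Local lifetime, shape bound at elaboration: a variable-length array
   whose size is fixed when its declaration is reached. */
int sum_first_n(int n) {
    assert(n > 0);
    int vla[n];
    int s = 0;
    for (int i = 0; i < n; i++) { vla[i] = i + 1; s += vla[i]; }
    return s;
}

/* Arbitrary lifetime, shape chosen at allocation: heap array that
   outlives the call. The caller must free it. */
int *make_heap_array(size_t n) {
    return calloc(n, sizeof(int));
}

/* Arbitrary lifetime, dynamic shape: the heap array can be resized.
   Returns NULL on failure, in which case p is still valid. */
int *grow(int *p, size_t new_n) {
    return realloc(p, new_n * sizeof(int));
}
```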
10Array allocation in Ada (shape bound at elaboration time)
11Conformant Arrays in Pascal
- Array shape determined at time of call.
- Pascal doesn't allow local dynamic-shaped arrays.
- Ada DOES allow local dynamic-shaped arrays (see textbook).
12Other Forms of Dynamic Arrays
- C: arrays are passed by reference, so bounds are irrelevant! (the programmer's problem)
- Java strings:
- String s = "short";
- s = s + " and sweet"; // strings are immutable; s now refers to a new String
13Resizing Arrays in Java
- Create a new array of proper length and data type:
- Integer[] a = new Integer[10];
- Object[] newArray = new Object[newLength];
- Copy all elements from old array into new one:
- System.arraycopy(a, 0, newArray, 0, a.length);
- Rename array:
- element = newArray;
- // old space reclaimed by garbage collector.
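The same create, copy, rename sequence can be sketched in C, where the old space must be released by hand because there is no garbage collector. The function name resize_array is an assumption made for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a new array of new_len ints holding the first
   min(old_len, new_len) elements of old, and frees old.
   On allocation failure returns NULL and leaves old untouched. */
int *resize_array(int *old, size_t old_len, size_t new_len) {
    assert(new_len > 0);
    int *fresh = malloc(new_len * sizeof *fresh);      /* create */
    if (fresh == NULL) return NULL;
    size_t n = old_len < new_len ? old_len : new_len;
    if (n > 0) memcpy(fresh, old, n * sizeof *fresh);  /* copy */
    free(old);                                         /* explicit reclamation */
    return fresh;                                      /* caller "renames" */
}
```

A caller would write a = resize_array(a, 3, 6) only after checking the result, since overwriting a with NULL on failure would lose the old array.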
14Dynamic Arrays in Fortran 90
- Arrays are sized at run time, but the size can't be changed once set.
15Classic Array Memory Layouts
16Memory Layout in C
17Address calculation (static array bounds)
18Virtual Location of Array
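- With static array bounds, we've in effect moved the array in all three dimensions: the constant part of the address calculation is folded into a single virtual base address at compile time.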
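A minimal sketch of the address calculation for a three-dimensional, row-major array, assuming inclusive bounds per dimension. It works with byte offsets rather than real addresses, and the types Bounds and Strides and all function names are hypothetical.

```c
#include <assert.h>

typedef struct { long lo, hi; } Bounds;       /* inclusive bounds of one dimension */
typedef struct { long s1, s2, s3; } Strides;  /* bytes per step in each dimension */

/* Row-major strides: the last dimension varies fastest. */
Strides strides_3d(Bounds d2, Bounds d3, long elem_size) {
    Strides s;
    assert(elem_size > 0);
    s.s3 = elem_size;
    s.s2 = (d3.hi - d3.lo + 1) * s.s3;
    s.s1 = (d2.hi - d2.lo + 1) * s.s2;
    return s;
}

/* Direct form: subtract each lower bound at every access. */
long offset_direct(Bounds d1, Bounds d2, Bounds d3, Strides s,
                   long i, long j, long k) {
    return (i - d1.lo) * s.s1 + (j - d2.lo) * s.s2 + (k - d3.lo) * s.s3;
}

/* With static bounds this constant can be computed once, at compile time.
   Added to the real base address it gives the "virtual location". */
long virtual_origin(Bounds d1, Bounds d2, Bounds d3, Strides s) {
    return -(d1.lo * s.s1 + d2.lo * s.s2 + d3.lo * s.s3);
}

/* Virtual-origin form: no per-access subtraction of lower bounds. */
long offset_virtual(long origin, Strides s, long i, long j, long k) {
    return origin + i * s.s1 + j * s.s2 + k * s.s3;
}
```

Both forms give the same offset for every valid index; the second one just moves the bound arithmetic out of the access path.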
19Dope Vectors
- A run-time descriptor for the array.
- Contains, for each dimension (except last one, always statically known):
- Lower bound
- Size
- Upper bound (if dynamic checks are required)
- Size of dope vector depends on the number of dimensions (i.e. static).
- Typically placed next to the array pointer, in the fixed-size portion of the stack frame.
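A dope vector might look roughly like the following in C. This is a sketch under stated assumptions: it handles two dimensions, stores information for every dimension for simplicity (including the last), and measures "size" as a stride in elements. The names DimInfo, DopeVector, make_dope and dope_at are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

#define NDIMS 2   /* the number of dimensions is static, so the descriptor size is too */

typedef struct {
    long lower;   /* lower bound */
    long upper;   /* upper bound, kept for dynamic checks */
    long stride;  /* size: elements between consecutive indices of this dimension */
} DimInfo;

typedef struct {
    double *data;        /* the array pointer */
    DimInfo dim[NDIMS];  /* one entry per dimension */
} DopeVector;

/* Builds a descriptor for a row-major array with bounds lo1..hi1, lo2..hi2. */
DopeVector make_dope(double *storage, long lo1, long hi1, long lo2, long hi2) {
    DopeVector dv;
    assert(lo1 <= hi1 && lo2 <= hi2);
    dv.data = storage;
    dv.dim[1].lower = lo2; dv.dim[1].upper = hi2; dv.dim[1].stride = 1;
    dv.dim[0].lower = lo1; dv.dim[0].upper = hi1; dv.dim[0].stride = hi2 - lo2 + 1;
    return dv;
}

/* Returns a pointer to element (i, j), or NULL if an index is out of bounds. */
double *dope_at(const DopeVector *dv, long i, long j) {
    long idx[NDIMS] = { i, j };
    long offset = 0;
    for (int d = 0; d < NDIMS; d++) {
        if (idx[d] < dv->dim[d].lower || idx[d] > dv->dim[d].upper) return NULL;
        offset += (idx[d] - dv->dim[d].lower) * dv->dim[d].stride;
    }
    return dv->data + offset;
}
```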
20Strings
- Usually an array of characters.
- Many languages allow more flexibility with strings than with other types of arrays.
- Single-character string vs. single character:
- Pascal: no distinction.
- C: very different.
- String constants: 'abc', "abc".
- Rules for embedding special characters:
- Pascal: double the character: 'ab''cd'
- C: escape sequence: "ab\"cd".
21Strings
- C, Pascal, Ada: string length bound no later than elaboration time (allocate in stack frame).
- Lisp, Icon, ML, Java: allow dynamically-bound strings, stored in the heap.
- Pascal supports lexicographically-ordered comparison of strings ('abc' < 'abd'). Ada supports it on all 1D discrete-valued arrays.
- C: no string assignment; elements are copied individually (library functions).
22Strings in C
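Because C has no string assignment or comparison operators, both jobs go through library functions. A small sketch using strlen, memcpy and strcmp from the standard library; the wrapper names copy_string and string_less are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Copies src into dst, which has room for cap chars including the
   terminating '\0'. Returns 0 on success, -1 if src does not fit. */
int copy_string(char *dst, size_t cap, const char *src) {
    assert(dst != NULL && src != NULL);
    size_t len = strlen(src);
    if (len + 1 > cap) return -1;
    memcpy(dst, src, len + 1);   /* element-by-element copy, terminator included */
    return 0;
}

/* Lexicographic "less than", in the spirit of Pascal's 'abc' < 'abd'. */
int string_less(const char *s, const char *t) {
    return strcmp(s, t) < 0;
}
```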
23Sets
- Pascal supports sets of any discrete type:
- var a, b, c : set of char;
- d, e : set of weekday;
- ...
- a := b + c; (* union *)
- a := b * c; (* intersection *)
- a := b - c; (* difference *)
24Set implementations
- Arrays, hash tables, trees.
- Bit-vectors: each entry is true (element in the set) or false (element not in the set).
- Efficient operations:
- Union is inclusive bit-wise OR.
- Intersection is bit-wise AND.
- Difference is NOT, followed by AND.
- Won't work for large base types:
- A set of 32-bit integers: 2^32 bits, about 500 MB.
- A set of 64-bit integers: 2^64 bits, or 2^41 MB.
- Set size is usually limited to 128 or 512 elements.
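The bit-vector operations map directly onto C's bitwise operators. A minimal sketch, assuming a base type of 0..63 so that one 64-bit word holds the whole set; the type and function names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* A set over the base type 0..63: one bit per potential element. */
typedef uint64_t BitSet;

/* Returns s with element e added. */
BitSet set_add(BitSet s, unsigned e) {
    assert(e < 64);
    return s | (UINT64_C(1) << e);
}

/* Returns 1 if e is in s, 0 otherwise. */
int set_has(BitSet s, unsigned e) {
    assert(e < 64);
    return (int)((s >> e) & 1u);
}

BitSet set_union(BitSet a, BitSet b)        { return a | b; }   /* inclusive OR */
BitSet set_intersection(BitSet a, BitSet b) { return a & b; }   /* AND */
BitSet set_difference(BitSet a, BitSet b)   { return a & ~b; }  /* NOT, then AND */
```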
25Pointers and Recursive Types
- Most recursive types are records.
- Reference model languages (Lisp, ML, Clu, Java): every field is a reference.
- A record of type f contains a reference to another record of type f.
- Value model languages (C, Pascal, Ada): need a pointer (a variable whose value is a reference).
- Don't confuse pointer with address: an address may be segmented.
26Storage Reclamation
- Explicit (C, C++, Pascal, Modula-2): programmer must reclaim unused heap space.
- Can be done efficiently.
- Easy to get wrong; if so, can lead to memory leaks.
- Implicit (Lisp, ML, Modula-3, Ada, Java): heap space reclaimed automatically.
- Not so efficient (but getting better).
- Simplifies the programmer's task a LOT.
27Reference Model (ML)
- node (#"R", node (#"X", empty, empty), node (#"Y", node (#"Z", empty, empty), node (#"W", empty, empty)))
28Reference Model (Lisp)
- '(#\R (#\X () ()) (#\Y (#\Z () ()) (#\W () ())))
29Value Model
- Pascal:
- type chr_tree_ptr = ^chr_tree;
- chr_tree = record
- left, right : chr_tree_ptr;
- val : char
- end;
- C:
- struct chr_tree {
- struct chr_tree *left, *right;
- char val;
- };
- In C, struct names are not quite type names. Shorthand:
- typedef struct chr_tree chr_tree_type;
30Memory Allocation
- Pascal: new(my_ptr);
- Ada: my_ptr := new chr_tree;
- C: my_ptr = (struct chr_tree *) malloc(sizeof (struct chr_tree));
- C++, Java: my_ptr = new chr_tree(args);
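Putting the C declaration and malloc together, the five-node tree from the reference-model slides (R with children X and Y, Y with children Z and W) could be built as sketched below. The helpers new_node, free_tree and count_nodes are hypothetical names, not part of the lecture.

```c
#include <assert.h>
#include <stdlib.h>

struct chr_tree {
    struct chr_tree *left, *right;
    char val;
};

/* Allocates one node on the heap; returns NULL if malloc fails. */
struct chr_tree *new_node(char val, struct chr_tree *left, struct chr_tree *right) {
    struct chr_tree *p = malloc(sizeof *p);
    if (p != NULL) {
        p->val = val;
        p->left = left;
        p->right = right;
    }
    return p;
}

/* Explicit reclamation: free both subtrees, then the node itself. */
void free_tree(struct chr_tree *t) {
    if (t == NULL) return;
    free_tree(t->left);
    free_tree(t->right);
    free(t);
}

/* Counts the nodes reachable from t. */
int count_nodes(const struct chr_tree *t) {
    return t == NULL ? 0 : 1 + count_nodes(t->left) + count_nodes(t->right);
}
```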
31Pointer References
- Pascal:
- my_ptr^.val := 'X';
- C:
- (*my_ptr).val = 'X';
- my_ptr->val = 'X';
- Ada:
- T : chr_tree;
- P : chr_tree_ptr;
- T.val := 'X';
- P.val := 'X'; -- good for record or pointer to one.
- T := P.all; -- if need to reference the record.
32Pointers and Arrays in C
- int n;
- int *a;
- int b[10];
- All are valid:
- a = b;
- n = a[3];
- n = *(a+3);
- n = b[3];
- n = *(b+3);
33Pointers and Arrays in C (contd)
- Interoperable, but not the same:
- int *a[n] allocates n pointers.
- int a[n][m] allocates a full 2D array.
- In fact, assuming int a[n]:
- *(a+i)
- *(i+a)
- a[i]
- i[a]
- are all equivalent!
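Both points can be checked mechanically. In this sketch the function names and the 4x3 shape are arbitrary choices for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if a[i], i[a], *(a+i) and *(i+a) all name the same object. */
int all_forms_agree(int *a, size_t i) {
    return &a[i] == a + i
        && &i[a] == i + a
        && &*(a + i) == &a[i];
}

/* An array of 4 pointers: storage for the pointers only, no rows. */
size_t bytes_for_pointer_rows(void) {
    int *rows[4];
    return sizeof rows;
}

/* A true 4x3 array: storage for all 12 ints, no pointers. */
size_t bytes_for_full_matrix(void) {
    int m[4][3];
    return sizeof m;
}
```

The equivalence holds because the subscript operator is defined in terms of pointer addition, which is commutative.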
34Pointers and Arrays in C (contd)
- In C, arrays are passed by reference: the array name is converted to a pointer to its first element.
- It's customary to pass the array name and its dimensions:
- double det (double *M, int rows, int cols) {
- int i, j; ...
- val = *(M + i*cols + j); /* M[i][j] */
- }
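The body of det is elided on the slide, so the sketch below does not attempt it. It only shows the same flat-pointer indexing idiom in a complete function; elem and trace are hypothetical names, and the matrix is assumed to be stored row-major in one contiguous block.

```c
#include <assert.h>

/* Element (i, j) of a matrix with cols columns, stored row-major
   in a flat block starting at M. */
double elem(const double *M, int cols, int i, int j) {
    assert(i >= 0 && j >= 0 && j < cols);
    return *(M + i * cols + j);      /* M[i][j] */
}

/* Sum of the main diagonal of a rows x cols matrix. */
double trace(const double *M, int rows, int cols) {
    int n = rows < cols ? rows : cols;
    double sum = 0.0;
    for (int i = 0; i < n; i++) sum += elem(M, cols, i, i);
    return sum;
}
```

The callee cannot check rows and cols against the real allocation, which is why passing the right dimensions remains the programmer's problem.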
35Tombstones
- Technique for catching dangling references
36Tombstones
- Advantages
- Catch dangling references.
- Prevent memory leaks.
- Helpful in heap compaction.
- Disadvantages
- Cheap on the heap, expensive on the stack (procedure entry/return).
- Tombstones themselves can dangle.
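The idea can be sketched as one extra level of indirection: the program holds a pointer to a tombstone, and only the tombstone points at the object. This is a simplified illustration for heap objects only, with invented names (Tombstone, Handle, ts_alloc, ts_deref, ts_free); it never reclaims the tombstones themselves.

```c
#include <assert.h>
#include <stdlib.h>

/* The tombstone holds the only direct pointer to the object.
   object == NULL means the object has been reclaimed. */
typedef struct { void *object; } Tombstone;

/* What the program stores and copies: a pointer to the tombstone. */
typedef Tombstone *Handle;

/* Allocates an object of the given size plus its tombstone. */
Handle ts_alloc(size_t size) {
    Tombstone *t = malloc(sizeof *t);
    if (t == NULL) return NULL;
    t->object = malloc(size);
    if (t->object == NULL) { free(t); return NULL; }
    return t;
}

/* Every access goes through the tombstone; NULL signals a dangling reference. */
void *ts_deref(Handle h) {
    assert(h != NULL);
    return h->object;
}

/* Reclaims the object but keeps the tombstone, so every copy of the
   handle can still see that the object is gone. */
void ts_free(Handle h) {
    assert(h != NULL);
    free(h->object);
    h->object = NULL;
}
```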
37Locks and Keys
38Locks and Keys
- Advantage
- No need to keep tombstones around.
- Disadvantages
- Only work for heap objects.
- Significant overhead.
- Increase the cost of copying a pointer.
- Increase the cost of every access.
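A rough sketch of locks and keys: each pointer carries a key, each heap object carries a lock, and an access is valid only when the two match. To keep the example well defined, the "heap" is a small static pool and a lock of 0 marks a free slot; these choices and all names are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 4

typedef struct { unsigned long lock; int payload; } Object;   /* lock 0 = free slot */
typedef struct { Object *addr; unsigned long key; } KeyedPtr; /* address plus key */

static Object pool[POOL_SIZE];
static unsigned long next_key = 1;   /* never hands out 0 */

/* Allocates the first free slot and gives it a fresh lock value.
   Returns a pointer with addr == NULL if the pool is full. */
KeyedPtr lk_new(int value) {
    KeyedPtr p = { NULL, 0 };
    for (int i = 0; i < POOL_SIZE; i++) {
        if (pool[i].lock == 0) {
            pool[i].lock = next_key++;
            pool[i].payload = value;
            p.addr = &pool[i];
            p.key = pool[i].lock;
            break;
        }
    }
    return p;
}

/* A pointer is valid only while its key matches the object's lock. */
int lk_valid(KeyedPtr p) {
    return p.addr != NULL && p.addr->lock == p.key;
}

/* Checked access: the extra comparison is paid on every dereference. */
int *lk_deref(KeyedPtr p) {
    return lk_valid(p) ? &p.addr->payload : NULL;
}

/* Freeing resets the lock, which invalidates every copy of the pointer. */
void lk_free(KeyedPtr p) {
    if (lk_valid(p)) p.addr->lock = 0;
}
```

Copying a KeyedPtr copies two words instead of one, which is the increased copy cost listed above.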
39Reference Counts
- Set count to 1 upon object creation
- Upon assignment:
- Decrement count of object on left.
- Increment count of object on right.
- Upon subroutine entry, increment counts for local pointers.
- Upon subroutine return, decrement counts for local pointers.
- Need type descriptors for this: objects can be deeply structured.
- WILL FAIL ON CIRCULAR STRUCTURES!
40Reference Counts Fail on Circular Structures
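The failure is easy to reproduce with a hand-rolled reference count. In this sketch each node has a single pointer field, and live_nodes is a hypothetical counter added only so the leak can be observed; none of the names come from the lecture.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Node { int refs; struct Node *next; } Node;

static int live_nodes = 0;   /* nodes allocated and not yet freed */

/* Count starts at 1: the creator holds the first reference. */
Node *rc_new(void) {
    Node *n = malloc(sizeof *n);
    if (n != NULL) { n->refs = 1; n->next = NULL; live_nodes++; }
    return n;
}

void rc_retain(Node *n) { if (n != NULL) n->refs++; }

/* Drops one reference; at zero, releases what the node points to and frees it. */
void rc_release(Node *n) {
    if (n == NULL) return;
    if (--n->refs == 0) {
        rc_release(n->next);
        free(n);
        live_nodes--;
    }
}

/* Assignment dst->next = src: increment the right side, decrement the old left side. */
void rc_set_next(Node *dst, Node *src) {
    assert(dst != NULL);
    rc_retain(src);
    rc_release(dst->next);
    dst->next = src;
}
```

For a chain a -> b, releasing the program's references frees both nodes. For a cycle c -> d -> c, each node keeps the other's count at 1 after the program lets go, so neither is ever freed.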
41Garbage Collection
- System determines which memory is not in use and returns that memory to the pool of free storage.
- Done in two or three steps:
- Mark nodes that are in use.
- Compact free space (optional).
- Move free nodes to storage pool.
42Marking
(Figure: nodes a, b, c, d, e, with program variable firstNode.)
- Unmark all nodes (set all mark bits to false).
- Start at each program variable that contains a
reference, follow all pointers, mark nodes that
are reached.
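The two marking steps can be sketched as follows, assuming each node has a mark bit and at most two outgoing pointers, and that the whole heap is visible as one array. The names GcNode, unmark_all, mark_from and count_unmarked are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct GcNode {
    bool marked;
    struct GcNode *left, *right;
} GcNode;

/* Step 1: set all mark bits to false. */
void unmark_all(GcNode heap[], size_t n) {
    assert(heap != NULL || n == 0);
    for (size_t i = 0; i < n; i++) heap[i].marked = false;
}

/* Step 2: follow all pointers from a root, marking every node reached.
   The marked check stops the recursion on cycles and shared nodes. */
void mark_from(GcNode *p) {
    if (p == NULL || p->marked) return;
    p->marked = true;
    mark_from(p->left);
    mark_from(p->right);
}

/* Nodes still unmarked after marking from every root are garbage. */
size_t count_unmarked(const GcNode heap[], size_t n) {
    size_t count = 0;
    for (size_t i = 0; i < n; i++) if (!heap[i].marked) count++;
    return count;
}
```

A production collector would usually avoid deep recursion here, since the recursion stack itself needs memory at the moment memory is scarce.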
43Compaction
(Figure: layout of nodes a, b, c, d, e in memory relative to the Free Memory region; firstNode is the program variable.)
- Move all marked nodes (i.e., nodes in use) to one end of memory, updating all pointers as necessary.
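A simplified sliding compaction can be sketched over an array-based heap in which links are array indices rather than raw pointers. It assumes marking has already been done and that marked cells only link to marked cells; Cell, NIL, HEAP_SIZE and compact are names invented for this example.

```c
#include <assert.h>
#include <stdbool.h>

#define HEAP_SIZE 6
#define NIL (-1)

typedef struct { bool marked; char name; int next; } Cell;   /* next is an index or NIL */

/* Slides all marked cells to the low end of heap[0..n-1], rewriting
   every link and *root to the new positions. Returns the number of
   live cells; slots from that index upward are free memory. */
int compact(Cell heap[], int n, int *root) {
    int forward[HEAP_SIZE];   /* forward[i]: new index of cell i, or NIL */
    int live = 0;
    assert(n <= HEAP_SIZE);

    /* Pass 1: compute each marked cell's new address. */
    for (int i = 0; i < n; i++) forward[i] = heap[i].marked ? live++ : NIL;

    /* Pass 2: update links and move cells. A cell never moves upward,
       so its destination slot has already been vacated or is itself. */
    for (int i = 0; i < n; i++) {
        if (!heap[i].marked) continue;
        Cell c = heap[i];
        if (c.next != NIL) c.next = forward[c.next];
        heap[forward[i]] = c;
    }

    if (*root != NIL) *root = forward[*root];
    return live;
}
```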
44Lists in Lisp and ML
45Equality Testing and Assignment
- Equality comparison is easy for scalars
- For complex or abstract data types, say, strings s and t, s = t could mean:
- s and t are aliases
- s and t occupy the same storage
- s and t contain the same sequence of characters
- s and t print the same
46Deep and Shallow Comparisons
- Shallow Comparison
- Both expressions refer to the same object.
- Deep Comparison
- Expressions refer to objects that are equal in content somehow.
- Most PLs use shallow comparisons, and shallow assignments.
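C strings make the distinction concrete: == on two char pointers is a shallow comparison, while strcmp compares content. The wrapper names below are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Shallow: do both expressions refer to the same object? */
int shallow_equal(const char *s, const char *t) {
    return s == t;
}

/* Deep: do the two strings contain the same sequence of characters? */
int deep_equal(const char *s, const char *t) {
    assert(s != NULL && t != NULL);
    return strcmp(s, t) == 0;
}
```

Two separate arrays that both hold "abc" are deep-equal but not shallow-equal; a second pointer to the same array is both.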