Title: Data Structure
1Data Structure Abstract Data Type
- C and Data Structures
- Baojian Hua
- bjhua_at_ustc.edu.cn
2Data Types
- A data type consists of
- A collection of data elements (a type)
- A set of operations on these data elements
- Data types in languages
- predefined
- any language defines a group of predefined data types
- (In C) int, char, float, double, ...
- user-defined
- allow programmers to define their own (new) data types
- (In C) structure, union, ...
3Data Type Examples
- Predefined
- type int
- elements: ..., -2, -1, 0, 1, 2, ...
- operations: +, -, *, /, ...
- User-defined
- type complex
- elements: 1+3i, -5+8i, ...
- operations: newComplex, add, sub, distance, ...
4Concrete Data Types (CDT)
- A concrete data type
- both the data type declaration and its concrete representation are available
- Almost all C predefined types are CDTs
- For instance, int is typically a 32-bit double word, with operations +, -, ...
5Abstract Data Types (ADT)
- An abstract data type
- separates data type declaration from representation
- separates function declaration (prototypes) from implementation
- Examples of abstract data types in languages
- interfaces in Java
- signatures in ML
- (roughly) header files and typedef in C
6Data Structures
- The study of data structures concerns the organization of data in computers, consisting of
- the (abstract) data types (definition and representation)
- relationships between elements of this type
- operations on data types
- Algorithms
- operations on data structures
- tradeoffs: efficiency vs. simplicity, etc.
- subtle interplay with data structure design
- Slogan: program = data structures + algorithms
7What will this part cover?
- Linear structures
- Linked list, stack, queue, extensible array, descriptor-based string
- Tree and forest
- binary tree, binary search tree
- Graph
- Hash
- Searching
8More on Modules, CDT and ADT
- Suppose we need a data type to represent a complex number c
- a data type complex
- elements: 3+4i, -5-8i, ...
- operations
- newComplex, add, sub, distance, ...
- How to represent this data type in C (CDT, ADT or ...)?
9Complex Number
    // Recall the definition of a complex number c:
    //   c = x + yi, where x, y \in R, and i = sqrt(-1)
    // Some typical operations
    complex newComplex (double x, double y);
    complex complexAdd (complex c1, complex c2);
    complex complexSub (complex c1, complex c2);
    complex complexMult (complex c1, complex c2);
    double complexDistance (complex c1, complex c2);
    double complexModus (complex c);
    complex complexDivide (complex c1, complex c2);
    // Next, we discuss several variants of representations:
    // CDT, ADT.
10CDT of ComplexInterfaceTypes
    // In a file "complex.h"
    #ifndef COMPLEX_H
    #define COMPLEX_H

    struct complexStruct
    {
      double x;
      double y;
    };

    typedef struct complexStruct complex;

    complex newComplex (double x, double y);
    // other function prototypes are similar
    // ...

    #endif
11Client Code
    // With this interface, we can write client code
    // that manipulates complex numbers. File "main.c":
    #include "complex.h"

    int main ()
    {
      complex c1, c2, c3;

      c1 = newComplex (3.0, 4.0);
      c2 = newComplex (7.0, 6.0);
      c3 = complexAdd (c1, c2);
      complexOutput (c3);
      return 0;
    }
Do we know the concrete representation of c1, c2, c3? How?
12CDT Complex Implementation
    // In a file "complex.c"
    #include "complex.h"

    complex newComplex (double x, double y)
    {
      complex c;

      c.x = x;
      c.y = y;
      return c;
    }
    // other functions are similar. See Lab2
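The slide leaves the remaining CDT operations to Lab2. As a minimal sketch, assuming the two-double representation from complex.h, the arithmetic operations might look like the following. The bodies are illustrative and not taken from the lab; the nonzero check in complexDivide is an added assumption.

```c
#include <assert.h>

/* CDT: the representation is visible to every client. */
struct complexStruct
{
  double x; /* real part */
  double y; /* imaginary part */
};
typedef struct complexStruct complex;

complex newComplex (double x, double y)
{
  complex c;
  c.x = x;
  c.y = y;
  return c;
}

/* (a + bi) + (c + di) = (a + c) + (b + d)i */
complex complexAdd (complex c1, complex c2)
{
  return newComplex (c1.x + c2.x, c1.y + c2.y);
}

/* (a + bi) - (c + di) = (a - c) + (b - d)i */
complex complexSub (complex c1, complex c2)
{
  return newComplex (c1.x - c2.x, c1.y - c2.y);
}

/* (a + bi)(c + di) = (ac - bd) + (ad + bc)i */
complex complexMult (complex c1, complex c2)
{
  return newComplex (c1.x * c2.x - c1.y * c2.y,
                     c1.x * c2.y + c1.y * c2.x);
}

/* (a + bi) / (c + di) = ((ac + bd) + (bc - ad)i) / (c^2 + d^2) */
complex complexDivide (complex c1, complex c2)
{
  double d = c2.x * c2.x + c2.y * c2.y;
  assert (d != 0.0); /* assumption: dividing by zero is a caller error */
  return newComplex ((c1.x * c2.x + c1.y * c2.y) / d,
                     (c1.y * c2.x - c1.x * c2.y) / d);
}
```

Because complex is a plain struct here, values are copied in and out of every call and nothing needs to be freed.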
13Problem 1
    int main ()
    {
      complex c;

      c = newComplex (3.0, 4.0);

      // Want to do this: c = c + (5 + 6i)
      // Oops, this is legal:
      c.x += 5;
      c.y += 6;

      return 0;
    }
14Problem 2
    #ifndef COMPLEX_H
    #define COMPLEX_H

    struct complexStruct
    {
      // change to a fancier representation? This breaks main.c
      double a[2];
    };

    typedef struct complexStruct complex;

    complex newComplex (double x, double y);
    // other function prototypes are similar
    // ...

    #endif
15Problems with CDT?
- Operations are transparent.
- user code has no idea of the algorithm
- Good!
- Data representation dependence
- Problem 1: user code can access data directly
- kicks away the interface
- safe?
- Problem 2: makes code rigid
- easy to change or evolve?
16ADT of ComplexInterfaceTypes
    // In file "complex.h"
    #ifndef COMPLEX_H
    #define COMPLEX_H

    // note that struct complexStruct is not defined here
    typedef struct complexStruct *complex;

    complex newComplex (double x, double y);
    // other function prototypes are similar
    // ...

    #endif
17Client Code
    // With this interface, we can write client code
    // that manipulates complex numbers. File "main.c":
    #include "complex.h"

    int main ()
    {
      complex c1, c2, c3;

      c1 = newComplex (3.0, 4.0);
      c2 = newComplex (7.0, 6.0);
      c3 = complexAdd (c1, c2);
      complexOutput (c3);
      return 0;
    }
Can we still know the concrete representation of c1, c2, c3? Why?
18ADT Complex Implementation1Types
    // In a file "complex.c"
    #include "complex.h"

    // We may choose to define the complex type as:
    struct complexStruct
    {
      double x;
      double y;
    };
    // which is hidden in the implementation.
19ADT Complex Implementation Continued
    // In a file "complex.c"
    #include <stdlib.h>
    #include "complex.h"

    complex newComplex (double x, double y)
    {
      complex c;

      c = (complex)malloc (sizeof (*c));
      c->x = x;
      c->y = y;
      return c;
    }
    // other functions are similar. See Lab2
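A sketch of how further ADT operations could look once the struct is visible only inside complex.c. The accessors complexReal and complexImag are hypothetical names, not part of the lecture's interface; they are added here because clients can no longer read c->x directly. Both the interface part and the implementation part are shown in one file for brevity.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- what complex.h would expose: an opaque pointer ---- */
typedef struct complexStruct *complex;

complex newComplex (double x, double y);
complex complexAdd (complex c1, complex c2);
double complexReal (complex c); /* hypothetical accessor */
double complexImag (complex c); /* hypothetical accessor */

/* ---- what complex.c would hide ---- */
struct complexStruct
{
  double x;
  double y;
};

complex newComplex (double x, double y)
{
  /* sizeof (*c) is the size of the struct, not of the pointer */
  complex c = malloc (sizeof (*c));
  assert (c != NULL);
  c->x = x;
  c->y = y;
  return c;
}

/* Returns a freshly allocated result; the arguments are untouched. */
complex complexAdd (complex c1, complex c2)
{
  return newComplex (c1->x + c2->x, c1->y + c2->y);
}

double complexReal (complex c)
{
  return c->x;
}

double complexImag (complex c)
{
  return c->y;
}
```

If the representation later changes (say, to an array of two doubles), only the part below the second banner needs to be edited; client code that goes through the prototypes keeps compiling unchanged.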
20ADT Summary
- Yes, that's an ADT!
- Algorithm is hidden
- Data representation is hidden
- user code may never access it
- thus, client code is independent of the implementation
- See Lab2 for another data type, "nat"
- CDT or ADT?
21Polymorphism
- To explain polymorphism, we start with a new data type: tuple
- A tuple is of the form (x, y)
- x \in A, y \in B (aka A × B)
- A, B unknown in advance and may be different
- Example
- A = int, B = int
- (2, 3), (4, 6), (9, 7), ...
- A = char *, B = double
- ("Bob", 145.8), ("Alice", 90.5), ...
22Polymorphism
- From the data type point of view
- two types: A, B
- operations
- newTuple (x, y) // create a new tuple with x and y
- equals (t1, t2) // equality testing
- first (t) // get the first element of t
- second (t) // get the second element of t
- ...
- How to represent this type in computers (using C)?
23Monomorphic Version
- Next, we first consider a monomorphic tuple type called intTuple
- both the first and second components are of int type
- (2, 3), (8, 9), ...
- The intTuple ADT
- type intTuple
- elements: (2, 3), (8, 9), ...
- Operations
- intTuple newIntTuple (int x, int y)
- int first (intTuple t)
- int second (intTuple t)
- int tupleEquals (intTuple t1, intTuple t2)
24intTuple CDT
    // in a file "intTuple.h"
    #ifndef INT_TUPLE_H
    #define INT_TUPLE_H

    struct intTupleStruct
    {
      int x;
      int y;
    };

    typedef struct intTupleStruct intTuple;

    intTuple newIntTuple (int n1, int n2);
    int first (intTuple t);
    // ...

    #endif
25Or the intTuple ADT
    // in a file "intTuple.h"
    #ifndef INT_TUPLE_H
    #define INT_TUPLE_H

    typedef struct intTupleStruct *intTuple;

    intTuple newIntTuple (int n1, int n2);
    int first (intTuple t);
    int tupleEquals (intTuple t1, intTuple t2);
    // ...

    #endif
    // We only discuss tupleEquals (). All other
    // functions are left to you.
26tupleEquals()
    // in a file "intTuple.c"
    int tupleEquals (intTuple t1, intTuple t2)
    {
      return ((t1->x == t2->x) && (t1->y == t2->y));
    }
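For reference, one possible shape of the whole intTuple ADT, with the functions the slides leave as an exercise filled in. This is a sketch under the assumption that intTuple is a pointer to a hidden struct, as in the ADT header; the bodies of newIntTuple, first and second are illustrative, not the lecture's solution.

```c
#include <assert.h>
#include <stdlib.h>

/* Interface part: clients see only an opaque pointer. */
typedef struct intTupleStruct *intTuple;

/* Implementation part: hidden in intTuple.c. */
struct intTupleStruct
{
  int x;
  int y;
};

intTuple newIntTuple (int n1, int n2)
{
  intTuple t = malloc (sizeof (*t));
  assert (t != NULL);
  t->x = n1;
  t->y = n2;
  return t;
}

int first (intTuple t)
{
  return t->x;
}

int second (intTuple t)
{
  return t->y;
}

/* Component-wise comparison of the stored ints. */
int tupleEquals (intTuple t1, intTuple t2)
{
  return (t1->x == t2->x) && (t1->y == t2->y);
}
```

Comparing t1 == t2 directly would only test whether both names refer to the same heap cell, which is why tupleEquals looks at the components.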
27Polymorphism
- Now, we consider a polymorphic tuple type called tuple
- "poly": may take various forms
- Every element of a tuple may be of a different type
- (2, 3.14), (8, 'a'), ('\0', 99), ...
- The tuple ADT
- type tuple
- elements: (2, 3.14), (8, 'a'), ('\0', 99), ...
28The Tuple ADT
- What about operations?
- tuple newTuple (??? x, ??? y)
- ??? first (tuple t)
- ??? second (tuple t)
- int equals (tuple t1, tuple t2)
29Polymorphic Type
- To cure this, C offers a polymorphic type void *
- void * is a pointer type that can point to any concrete type (i.e., it is compatible with any object pointer type), very poly
- think of a box or a mask
- cannot be used directly; use an ugly cast
- similar to constructs in other languages, such as Object in Java
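To make the "box" picture concrete, here is a small sketch of storing values behind a void * and getting them back with a cast. The helper names boxInt, unboxInt, boxDouble and unboxDouble are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdlib.h>

typedef void *poly;

/* Put an int into a heap-allocated box and forget its type. */
poly boxInt (int n)
{
  int *p = malloc (sizeof (*p));
  assert (p != NULL);
  *p = n;
  return p; /* int * converts to void * implicitly */
}

/* The caller must know the box really holds an int:
   neither the compiler nor the runtime checks this cast. */
int unboxInt (poly p)
{
  return *(int *) p;
}

poly boxDouble (double d)
{
  double *p = malloc (sizeof (*p));
  assert (p != NULL);
  *p = d;
  return p;
}

double unboxDouble (poly p)
{
  return *(double *) p;
}
```

Calling unboxInt on a box made by boxDouble compiles without complaint, which is the "no static or runtime checking" cost of this style.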
30The Tuple ADT
- What about operations?
- tuple newTuple (void *x, void *y)
- void *first (tuple t)
- void *second (tuple t)
- int equals (tuple t1, tuple t2)
31tuple Interface
    // in a file "tuple.h"
    #ifndef TUPLE_H
    #define TUPLE_H

    typedef void *poly;
    typedef struct tupleStruct *tuple;

    tuple newTuple (poly x, poly y);
    poly first (tuple t);
    poly second (tuple t);
    int equals (tuple t1, tuple t2);

    #endif /* TUPLE_H */
32Client Code
    // in a file "main.c"
    #include <stdlib.h>
    #include "complex.h" // need the ADT version
    #include "tuple.h"

    int main ()
    {
      complex c1 = newComplex (1.0, 2.0);
      int *ip = (int *)malloc (sizeof (*ip));
      tuple t1 = newTuple (c1, ip);
      // ...
      return 0;
    }
33tuple ADT Implementation
    // in a file "tuple.c"
    #include <stdlib.h>
    #include "tuple.h"

    struct tupleStruct
    {
      poly x;
      poly y;
    };

    tuple newTuple (poly x, poly y)
    {
      tuple t = (tuple)malloc (sizeof (*t));
      t->x = x;
      t->y = y;
      return t;
    }
34tuple ADT Implementation
    // in a file "tuple.c"
    #include <stdlib.h>
    #include "tuple.h"

    struct tupleStruct
    {
      poly x;
      poly y;
    };

    poly first (tuple t)
    {
      return t->x;
    }
35Client Code
    #include <stdlib.h>
    #include "complex.h" // ADT version
    #include "tuple.h"

    int main ()
    {
      complex c1 = newComplex (1.0, 2.0);
      int *ip = (int *)malloc (sizeof (*ip));
      tuple t1 = newTuple (c1, ip);
      complex c2 = (complex)first (t1); // type cast
      // ...
      return 0;
    }
36equals?
    struct tupleStruct
    {
      poly x;
      poly y;
    };

    // The 1st try
    int equals (tuple t1, tuple t2)
    {
      return ((t1->x == t2->x)
              && (t1->y == t2->y));
      // Wrong!! This compares the pointers, not the data they point to.
    }
37equals?
    struct tupleStruct
    {
      poly x;
      poly y;
    };

    // The 2nd try
    int equals (tuple t1, tuple t2)
    {
      return (*(t1->x) == *(t2->x)
              && *(t1->y) == *(t2->y));
      // Problem? A void * cannot be dereferenced.
    }
38equals?
    struct tupleStruct
    {
      poly x;
      poly y;
    };

    // The 3rd try
    int equals (tuple t1, tuple t2)
    {
      return (equalsXXX (t1->x, t2->x)
              && equalsYYY (t1->y, t2->y));
      // but what are equalsXXX and equalsYYY?
    }
39Function as Arguments
    // So in the body of the equals function, instead
    // of guessing the types of t->x and t->y, we
    // require the callers of equals to supply the
    // necessary equality testing functions.
    // The 4th try
    typedef int (*tf)(poly, poly);

    int equals (tuple t1, tuple t2, tf eqx, tf eqy)
    {
      return (eqx (t1->x, t2->x)
              && eqy (t1->y, t2->y));
    }
40Change to tuple Interface
    // in file "tuple.h"
    #ifndef TUPLE_H
    #define TUPLE_H

    typedef void *poly;
    typedef int (*tf)(poly, poly);
    typedef struct tupleStruct *tuple;

    tuple newTuple (poly x, poly y);
    poly first (tuple t);
    poly second (tuple t);
    int equals (tuple t1, tuple t2, tf eqx, tf eqy);

    #endif /* TUPLE_H */
41Client Code
    // in file "main.c"
    #include <stdlib.h>
    #include "complex.h"
    #include "tuple.h"

    int main ()
    {
      complex c = newComplex (1.0, 2.0);
      int *ip = (int *)malloc (sizeof (int));
      tuple t1 = ..., t2 = ...;

      equals (t1, t2, complexEquals, intEquals);
      return 0;
    }
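The client above passes complexEquals and intEquals without showing them. As a self-contained sketch of the fourth try, the block below pairs the tuple ADT with two possible comparison callbacks. intEquals follows the name used in the client code; doubleEquals is a hypothetical stand-in chosen so the example does not depend on the complex module.

```c
#include <assert.h>
#include <stdlib.h>

typedef void *poly;
typedef int (*tf) (poly, poly);
typedef struct tupleStruct *tuple;

struct tupleStruct
{
  poly x;
  poly y;
};

tuple newTuple (poly x, poly y)
{
  tuple t = malloc (sizeof (*t));
  assert (t != NULL);
  t->x = x;
  t->y = y;
  return t;
}

/* The tuple does not know what x and y point to, so the caller
   supplies one comparison function per component. */
int equals (tuple t1, tuple t2, tf eqx, tf eqy)
{
  return eqx (t1->x, t2->x) && eqy (t1->y, t2->y);
}

/* Callback for components that point to int. */
int intEquals (poly a, poly b)
{
  return *(int *) a == *(int *) b;
}

/* Hypothetical callback for components that point to double. */
int doubleEquals (poly a, poly b)
{
  return *(double *) a == *(double *) b;
}
```

Two tuples built from distinct variables holding the same values compare equal under these callbacks, even though their stored pointers differ. Passing the callbacks in the wrong order is not detected by the compiler, since every callback has the same type tf.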
42Moral
- void * serves as the polymorphic type in C
- masks all pointer types (think of the Object type in Java)
- Pros
- code reuse: write once, use in arbitrary contexts
- we'll see more examples later in this course
- Cons
- Polymorphism doesn't come for free
- boxed data: data are heap-allocated (to cope with void *)
- no static or runtime checking (at least in C)
- clumsy code
- extra function pointer arguments
43Data Carrying Functions
- Why can we NOT make use of data, such as data passed as function arguments, when it is of type void *?
- Better idea
- Make data carry the functions themselves, instead of making external function calls
- such data are called objects
44Function Pointer in Data
    int equals (tuple t1, tuple t2)
    {
      // note that if t1->x and t1->y carried their own
      // equality testing functions, then the code
      // could just be written:
      return (t1->x->equals (t1->x, t2->x)
              && t1->y->equals (t1->y, t2->y));
    }
[Figure: tuple t1 with fields x and y; the data each field points to carries its own equality function (equals_x, equals_y).]
45Function Pointer in Data
    // To cope with this, we should modify other
    // modules. For instance, the complex ADT:
    struct complexStruct
    {
      int (*equals) (poly, poly);
      double a[2];
    };

    complex newComplex (double x, double y)
    {
      complex c = (complex)malloc (sizeof (*c));
      c->equals = complexEquals;
      // ...
      return c;
    }
46The Call
    int equals (tuple t1, tuple t2)
    {
      return (t1->x->equals (t1->x, t2->x)
              && t1->y->equals (t1->y, t2->y));
    }
[Figure: tuples t1 and t2 with fields x and y; the objects they point to hold an equals function pointer and data a0, a1.]
47Client Code
    // in file "main.c"
    #include "complex.h"
    #include "tuple.h"

    int main ()
    {
      complex c1 = newComplex (1.0, 2.0);
      complex c2 = newComplex (1.0, 2.0);
      tuple t1 = newTuple (c1, c2);
      tuple t2 = newTuple (c1, c2);

      equals (t1, t2); // dirt simple! :-P
      return 0;
    }
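Since t1->x has type void *, the expression t1->x->equals on the slides does not compile as written; some convention is needed to find the function pointer inside the object. One way to do it, sketched here under the assumption that every object keeps its equals pointer as the first field of its struct, is shown below. The helper equalsOf is a hypothetical name, and the complex type uses fields x and y rather than the array from the slide.

```c
#include <assert.h>
#include <stdlib.h>

typedef void *poly;
typedef int (*tf) (poly, poly);

/* Assumed convention: every object stores its equality
   function as the FIRST field of its struct. */
struct complexStruct
{
  tf equals;
  double x;
  double y;
};
typedef struct complexStruct *complex;

static int complexEquals (poly a, poly b)
{
  complex c1 = a;
  complex c2 = b;
  return c1->x == c2->x && c1->y == c2->y;
}

complex newComplex (double x, double y)
{
  complex c = malloc (sizeof (*c));
  assert (c != NULL);
  c->equals = complexEquals; /* the data now carries its function */
  c->x = x;
  c->y = y;
  return c;
}

struct tupleStruct
{
  poly x;
  poly y;
};
typedef struct tupleStruct *tuple;

tuple newTuple (poly x, poly y)
{
  tuple t = malloc (sizeof (*t));
  assert (t != NULL);
  t->x = x;
  t->y = y;
  return t;
}

/* Hypothetical helper: a pointer to a struct, suitably converted,
   points to its first member, so the equals field can be read
   without knowing the concrete type of the object. */
static tf equalsOf (poly obj)
{
  return *(tf *) obj;
}

/* No callback arguments: each component supplies its own test. */
int equals (tuple t1, tuple t2)
{
  return equalsOf (t1->x) (t1->x, t2->x)
         && equalsOf (t1->y) (t1->y, t2->y);
}
```

The price is that only objects following the first-field convention may be stored in a tuple; a bare int * placed in a tuple would make equalsOf read garbage.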
48Object
- Data elements with function pointers are the simplest form of objects
- object = virtual functions + private data
- With such facilities, we can in principle model object-oriented programming
- In fact, early C++ compilers compiled to C
- That's partly why I don't love object-oriented languages
- See Lab 2 for a more production-quality implementation of objects
49Summary
- Abstract data types enable modular programming
- clear separation between interface and implementation
- interface and implementation should be designed and evolve together
- Polymorphism enables code reuse
- Object = data + function pointers