Title: CS 3015 Lecture 11
2. Symbolic Data
- We now extend our knowledge of Scheme to allow arbitrary symbols in our data structures
- Quotation
- Like quotation in English
- The value of 'a is a
- The value of '(a b c d) is (a b c d).
- Now we can create all kinds of structures
- (* (+ 23 45) (+ x 9))
- (define (fact n)
    (if (= n 1) 1 (* n (fact (- n 1)))))
3. Examples
- (define a 1)
- (define b 2)
- (list a b) => (1 2)
- (list 'a 'b) => (a b)
- (list 'a b) => (a 2)
- '(a b) => (a b)
- (car '(a b c)) => a
- (cdr '(a b c)) => (b c)
- '() => ()
4. The eq? primitive and memq
- eq? takes two symbols and returns true if they are the same
- (define (memq item x)
    (cond ((null? x) false)
          ((eq? item (car x)) x)
          (else (memq item (cdr x)))))
- (memq 'apple '(pear banana prune))
- (memq 'apple '(x (apple sauce) y apple pear))
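The two calls above can be checked with a minimal sketch. The procedure is renamed `my-memq` (a hypothetical name, chosen so the built-in `memq` is not shadowed), and `#f` is used where the slides write `false`.

```scheme
;; Sketch of memq from the slide, renamed my-memq (hypothetical name)
;; so that the built-in memq is left untouched.
(define (my-memq item x)
  (cond ((null? x) #f)                     ; ran off the end: not found
        ((eq? item (car x)) x)             ; found: return the sublist starting at item
        (else (my-memq item (cdr x)))))    ; keep scanning the rest

(my-memq 'apple '(pear banana prune))            ; => #f
(my-memq 'apple '(x (apple sauce) y apple pear)) ; => (apple pear)
```

In the second call the element (apple sauce) is a list, not the symbol apple, so it is skipped; the search succeeds only at the later top-level apple.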
5. Example: Symbolic Differentiation
- We want a procedure that takes an algebraic expression and a variable and returns the derivative of that expression with respect to the variable
- deriv(ax^2 + bx + c, x) = 2ax + b
- Use our principle of data abstraction
- Define the differentiation algorithm for variables, sums, products, etc., without worrying about how these will be represented
6. Rules of Differentiation
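For reference, the four rules that the deriv procedure in this lecture applies (numbers, variables, sums, and products) can be written as follows. Here c stands for a constant or a variable other than x, and u and v are arbitrary expressions; the notation is an assumption, chosen to match the cases in the code.

```latex
\begin{align*}
\frac{dc}{dx} &= 0 \\
\frac{dx}{dx} &= 1 \\
\frac{d(u+v)}{dx} &= \frac{du}{dx} + \frac{dv}{dx} \\
\frac{d(uv)}{dx} &= u\,\frac{dv}{dx} + v\,\frac{du}{dx}
\end{align*}
```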
7. Design by Wishful Thinking
- (variable? e) Is e a variable?
- (same-variable? v1 v2) Are v1 and v2 the same variable?
- (sum? e) Is e a sum?
- (addend e) Addend of the sum e.
- (augend e) Augend of the sum e.
- (make-sum a1 a2) Construct the sum of a1 and a2.
- (product? e) Is e a product?
8. Constructors and Selectors (cont'd)
- (multiplier e) Multiplier of the product e.
- (multiplicand e) Multiplicand of the product e.
- (make-product m1 m2) Construct the product of m1 and m2.
9. The Code
- (define (deriv exp var)
    (cond ((number? exp) 0)
          ((variable? exp)
           (if (same-variable? exp var) 1 0))
          ((sum? exp)
           (make-sum (deriv (addend exp) var)
                     (deriv (augend exp) var)))
          ((product? exp)
           (make-sum
            (make-product (multiplier exp)
                          (deriv (multiplicand exp) var))
            (make-product (deriv (multiplier exp) var)
                          (multiplicand exp))))
          (else
           (error "unknown expression type -- DERIV" exp))))
10. Representing Algebraic Expressions
- There are many ways we might choose to represent algebraic expressions
- We choose parenthesized prefix (i.e., Lisp) notation
- The variables are symbols. They are identified by the primitive predicate symbol?
- (define (variable? x) (symbol? x))
- Two variables are the same if the symbols representing them are eq?
- (define (same-variable? v1 v2)
    (and (variable? v1) (variable? v2) (eq? v1 v2)))
11. Representation
- Sums and products are constructed as lists
- (define (make-sum a1 a2) (list '+ a1 a2))
- (define (make-product m1 m2) (list '* m1 m2))
- A sum is a list whose first element is the symbol +
- (define (sum? x)
    (and (pair? x) (eq? (car x) '+)))
- The addend is the second item of the sum list
- (define (addend s) (cadr s))
- The augend is the third item of the sum list
- (define (augend s) (caddr s))
12. Representation
- A product is a list whose first element is the symbol *
- (define (product? x)
    (and (pair? x) (eq? (car x) '*)))
- The multiplier is the second item of the product list
- (define (multiplier p) (cadr p))
- The multiplicand is the third item of the product list
- (define (multiplicand p) (caddr p))
13. Using Deriv
- (deriv '(+ x 3) 'x) => (+ 1 0)
- (deriv '(* x y) 'x) => (+ (* x 0) (* 1 y))
- (deriv '(* (* x y) (+ x 3)) 'x) => (+ (* (* x y) (+ 1 0)) (* (+ (* x 0) (* 1 y)) (+ x 3)))
- These answers are correct, but not simplified
14. Reducing to Simplest Form
- Much like our rational number implementation, the answer should be reduced to simplest form.
- In doing this, we don't change deriv at all
- Instead, we change the constructors for sums and products
15. Sums
- (define (make-sum a1 a2)
    (cond ((=number? a1 0) a2)
          ((=number? a2 0) a1)
          ((and (number? a1) (number? a2)) (+ a1 a2))
          (else (list '+ a1 a2))))
- (define (=number? exp num)
    (and (number? exp) (= exp num)))
16. Products
- (define (make-product m1 m2)
    (cond ((or (=number? m1 0) (=number? m2 0)) 0)
          ((=number? m1 1) m2)
          ((=number? m2 1) m1)
          ((and (number? m1) (number? m2)) (* m1 m2))
          (else (list '* m1 m2))))
17. Using the New deriv
- (deriv '(+ x 3) 'x) => 1
- (deriv '(* x y) 'x) => y
- (deriv '(* (* x y) (+ x 3)) 'x) => (+ (* x y) (* y (+ x 3)))
- This is better, but more can be done.
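To make "more can be done" concrete: read as ordinary algebra, the third result is xy + y(x + 3). A short worked derivation, offered only as an illustration of the simplification that remains:

```latex
\begin{align*}
\frac{d}{dx}\bigl[(xy)(x+3)\bigr]
  &= (xy)\,\frac{d(x+3)}{dx} + \frac{d(xy)}{dx}\,(x+3) \\
  &= xy \cdot 1 + y\,(x+3) \\
  &= 2xy + 3y
\end{align*}
```

The constructors in this lecture only drop operands equal to 0 or 1 and fold pairs of numbers. They do not expand products or collect like terms, so the output stops at the second line.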
18. Example: Representing Sets
- The choice of representation for sets is not as obvious as in the other examples we have seen
- Informally, a set is a collection of distinct objects
- Employ data abstraction to give a more precise definition
- Define sets by specifying the operations to be used on sets
19. Operations on Sets
- union, intersection, element-of?, adjoin
- element-of? is a predicate that determines whether a given element is a member of a set
- adjoin takes an element and a set and returns a set containing the original elements plus the new one
- intersection of two sets returns the set containing only the elements that appear in both
- union of two sets returns the set containing the elements that appear in either set
- Now we are free to implement these any way we choose
20. Sets as Unordered Lists
- Lists of elements in which no element appears more than once
- (define (element-of? x set)
    (cond ((null? set) false)
          ((equal? x (car set)) true)
          (else (element-of? x (cdr set)))))
21. Adjoin
- (define (adjoin x set)
    (if (element-of? x set)
        set
        (cons x set)))
22. Intersection
- (define (intersection set1 set2)
    (cond ((or (null? set1) (null? set2)) '())
          ((element-of? (car set1) set2)
           (cons (car set1)
                 (intersection (cdr set1) set2)))
          (else (intersection (cdr set1) set2))))
23. Efficiency
- How many steps do our set operations require?
- All operations use element-of?, so speeding up this operation has a major impact
- To determine whether an element is present, we may have to scan the whole set: Θ(n)
- How about intersection and union?
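For the unordered-list representation, intersection calls element-of? once per element of set1, and each call can scan all of set2, so it takes Θ(n²) steps for two sets of size n. The slides do not give union; the following is a minimal sketch of one way to write it in the same style, with the same Θ(n²) cost. It uses `#t`/`#f` in place of `true`/`false`.

```scheme
;; Membership test for sets as unordered lists, as on the slides.
(define (element-of? x set)
  (cond ((null? set) #f)
        ((equal? x (car set)) #t)
        (else (element-of? x (cdr set)))))

;; Sketch of union (not given on the slides): walk set1 and keep only
;; the elements that are not already in set2, then append set2 itself.
(define (union set1 set2)
  (cond ((null? set1) set2)
        ((element-of? (car set1) set2)
         (union (cdr set1) set2))
        (else (cons (car set1)
                    (union (cdr set1) set2)))))

(union '(a b c) '(c d a))   ; => (b c d a)
```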
24. Sets as Ordered Lists
- Speed up set operations by requiring set elements to appear in increasing order
- Need a way to compare elements
- E.g., lexicographical for symbols
- Let's consider sets of numbers
- Now we can rewrite element-of?
25. Reimplementation of Element-of?
- (define (element-of? x set)
    (cond ((null? set) false)
          ((= x (car set)) true)
          ((< x (car set)) false)
          (else (element-of? x (cdr set)))))
26. How Many Steps Does This Save?
- In the worst case, the element we are looking for is the last element (or larger than every element), so it is still Θ(n).
- In the average case, we examine about n/2 elements, which is still Θ(n).
27. Speedup for intersection is better
- Because the elements are ordered, we can do a merge, which is Θ(n).
- (define (intersection set1 set2)
    (if (or (null? set1) (null? set2))
        '()
        (let ((x1 (car set1)) (x2 (car set2)))
          (cond ((= x1 x2)
                 (cons x1
                       (intersection (cdr set1) (cdr set2))))
                ((< x1 x2)
                 (intersection (cdr set1) set2))
                ((< x2 x1)
                 (intersection set1 (cdr set2)))))))
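The same merge idea gives a Θ(n) union for ordered lists of numbers. The slides do not show it; the sketch below is one possible version, written to mirror the intersection procedure above.

```scheme
;; Sketch of union for sets stored as increasing lists of numbers
;; (not given on the slides). Each step consumes at least one element
;; from set1 or set2, so the work is linear in the total size.
(define (union set1 set2)
  (cond ((null? set1) set2)
        ((null? set2) set1)
        (else
         (let ((x1 (car set1)) (x2 (car set2)))
           (cond ((= x1 x2)                       ; shared element: keep one copy
                  (cons x1 (union (cdr set1) (cdr set2))))
                 ((< x1 x2)                       ; x1 is the smallest remaining element
                  (cons x1 (union (cdr set1) set2)))
                 (else                            ; x2 is the smallest remaining element
                  (cons x2 (union set1 (cdr set2)))))))))

(union '(1 3 5) '(2 3 6))   ; => (1 2 3 5 6)
```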
28. Sets as Binary Trees
- Represent a set as a tree where each node of the tree holds one element and a link to each of two other nodes
- The left link points to a tree of smaller elements
- The right link points to a tree of larger elements
29. Example Trees Representing Sets
31. Advantage of Tree Representation
- To check whether x is contained in a set, compare x with the entry at the top node.
- Based on this, we know which subtree to look in next
- If the tree is balanced, we cut the size of the search problem in half at each step.
- How many steps? Θ(log n)
- This is a big speedup!
32. Representing Binary Trees as Lists
- (define (entry tree) (car tree))
- (define (left-branch tree) (cadr tree))
- (define (right-branch tree) (caddr tree))
- (define (make-tree entry left right)
    (list entry left right))
33. Element-of?
- (define (element-of? x set)
    (cond ((null? set) false)
          ((= x (entry set)) true)
          ((< x (entry set))
           (element-of? x (left-branch set)))
          ((> x (entry set))
           (element-of? x (right-branch set)))))
34. Adjoin
- (define (adjoin x set)
    (cond ((null? set) (make-tree x '() '()))
          ((= x (entry set)) set)
          ((< x (entry set))
           (make-tree (entry set)
                      (adjoin x (left-branch set))
                      (right-branch set)))
          ((> x (entry set))
           (make-tree (entry set)
                      (left-branch set)
                      (adjoin x (right-branch set))))))
35. Caution
- This implementation of adjoin will not preserve the balanced property!
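A small experiment shows the problem. The sketch below repeats the tree procedures from the preceding slides and adds two hypothetical helpers, `depth` and `adjoin-all`, which are not part of the lecture. Adjoining elements in increasing order produces a chain of right branches, so searching degrades to Θ(n).

```scheme
;; Tree representation and adjoin, as on the preceding slides.
(define (entry tree) (car tree))
(define (left-branch tree) (cadr tree))
(define (right-branch tree) (caddr tree))
(define (make-tree entry left right) (list entry left right))

(define (adjoin x set)
  (cond ((null? set) (make-tree x '() '()))
        ((= x (entry set)) set)
        ((< x (entry set))
         (make-tree (entry set)
                    (adjoin x (left-branch set))
                    (right-branch set)))
        (else
         (make-tree (entry set)
                    (left-branch set)
                    (adjoin x (right-branch set))))))

;; Hypothetical helper: number of nodes on the longest root-to-leaf path.
(define (depth tree)
  (if (null? tree)
      0
      (+ 1 (max (depth (left-branch tree))
                (depth (right-branch tree))))))

;; Hypothetical helper: adjoin each item of a list, left to right.
(define (adjoin-all items set)
  (if (null? items)
      set
      (adjoin-all (cdr items) (adjoin (car items) set))))

(depth (adjoin-all '(1 2 3 4 5 6 7) '()))   ; => 7, a chain of right branches
(depth (adjoin-all '(4 2 6 1 3 5 7) '()))   ; => 3, a balanced tree
```

Both trees hold the same seven elements; only the insertion order differs.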
36. Sets and Information Retrieval
- Sets are important because the techniques used appear repeatedly in information retrieval applications
- A database of records is really just a set of those records, each identified by a key
- Given a key, we want to be able to retrieve its record efficiently
37. Looking up Records in a Database
- lookup is almost the same as element-of?
- (define (lookup given-key records)
    (cond ((null? records) false)
          ((equal? given-key (key (car records)))
           (car records))
          (else (lookup given-key (cdr records)))))
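The slide leaves the selector key unspecified. As a minimal sketch, assume each record is a list whose first element is its key; that choice, the sample data, and the name `db` are all assumptions made for illustration.

```scheme
;; Assumption: a record is a list whose first element is its key.
(define (key record) (car record))

;; lookup as on the slide, with #f in place of false.
(define (lookup given-key records)
  (cond ((null? records) #f)
        ((equal? given-key (key (car records))) (car records))
        (else (lookup given-key (cdr records)))))

;; Hypothetical sample database, stored as an unordered list.
(define db '((101 alice) (205 bob) (317 carol)))

(lookup 205 db)   ; => (205 bob)
(lookup 999 db)   ; => #f
```

Because the callers only use lookup and key, the list could later be replaced by an ordered list or a tree without changing them.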
38. Data Abstraction and High Performance
- Data abstraction is an enormous help in high-performance applications
- Can experiment with different implementations of data structures without changing program logic
39. Example: Huffman Encoding Trees
- How are characters represented?
- Fixed-length code
- If we want to represent n characters, we need log2 n bits per character (rounded up)
- Example with 3-bit codes A = 000, B = 001, C = 010:
- BAC
- 001000010
40. Compression
- How can we compress character strings (files)?
- Use a variable-length code
- Arrange for the codes for the most frequently used characters to be the shortest
- E.g., Morse code
41. How to Separate the Characters
- When character codes are variable length, how are characters in a sequence separated?
- Ensure that no character's code is a prefix of another character's code
42. Variable Length Codes and Compression
- Significant space savings can be realized if characters that occur more frequently have shorter codes
- One scheme for doing this is called the Huffman encoding method
- Construct a binary tree whose leaves are the characters to encode
- Leaves for characters of higher frequency are weighted more heavily
43. Huffman Encoding
- Each non-leaf node holds the set of characters found at the leaves below it and a weight equal to the sum of those leaves' weights
44. Example Huffman Tree
45. How to Find a Character's Encoding
- Start at the root of the tree
- Follow the branches down to the character's leaf
- Each left branch is a zero
- Each right branch is a one
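The walk described above can be sketched in code using the list representation of Huffman trees defined later in this lecture. The procedure `encode-symbol` and the tree `sample-tree` are hypothetical additions for illustration, not part of the slides. At each node the procedure looks at the symbol lists of the two branches to decide which way to go.

```scheme
;; Tree representation, as defined later in the lecture.
(define (make-leaf symbol weight) (list 'leaf symbol weight))
(define (leaf? object) (eq? (car object) 'leaf))
(define (symbol-leaf x) (cadr x))
(define (weight-leaf x) (caddr x))
(define (left-branch tree) (car tree))
(define (right-branch tree) (cadr tree))
(define (symbols tree)
  (if (leaf? tree) (list (symbol-leaf tree)) (caddr tree)))
(define (weight tree)
  (if (leaf? tree) (weight-leaf tree) (cadddr tree)))
(define (make-code-tree left right)
  (list left right
        (append (symbols left) (symbols right))
        (+ (weight left) (weight right))))

;; Hypothetical procedure: the list of bits for one symbol.
(define (encode-symbol sym tree)
  (cond ((leaf? tree) '())                          ; reached the character
        ((memq sym (symbols (left-branch tree)))    ; left branch: emit 0
         (cons 0 (encode-symbol sym (left-branch tree))))
        ((memq sym (symbols (right-branch tree)))   ; right branch: emit 1
         (cons 1 (encode-symbol sym (right-branch tree))))
        (else (error "symbol not in tree -- ENCODE-SYMBOL" sym))))

;; Hypothetical sample tree: A on the left; B, D and C on the right.
(define sample-tree
  (make-code-tree (make-leaf 'A 4)
                  (make-code-tree (make-leaf 'B 2)
                                  (make-code-tree (make-leaf 'D 1)
                                                  (make-leaf 'C 1)))))

(encode-symbol 'A sample-tree)   ; => (0)
(encode-symbol 'B sample-tree)   ; => (1 0)
(encode-symbol 'C sample-tree)   ; => (1 1 1)
```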
46. How to Build a Huffman Tree
- Maintain a set of weighted subtrees
- Initially this set contains just the characters with their weights
- It's just the set of leaves of the tree
- Iterate doing the following until the set contains a single tree
- Select and merge the two subtrees of lowest weight
47. Example
- Initial (A 8) (B 3) (C 1) (D 1) (E 1) (F 1) (G 1) (H 1)
- Merge (A 8) (B 3) (C D 2) (E 1) (F 1) (G 1) (H 1)
- Merge (A 8) (B 3) (C D 2) (E F 2) (G 1) (H 1)
- Merge (A 8) (B 3) (C D 2) (E F 2) (G H 2)
- Merge (A 8) (B 3) (C D 2) (E F G H 4)
- Merge (A 8) (B C D 5) (E F G H 4)
- Merge (A 8) (B C D E F G H 9)
- Final merge (A B C D E F G H 17)
48. Representing Huffman Trees
- (define (make-leaf symbol weight)
    (list 'leaf symbol weight))
- (define (leaf? object)
    (eq? (car object) 'leaf))
- (define (symbol-leaf x) (cadr x))
- (define (weight-leaf x) (caddr x))
49. Huffman Trees (cont'd)
- (define (make-code-tree left right)
    (list left
          right
          (append (symbols left) (symbols right))
          (+ (weight left) (weight right))))
50. Huffman Trees (cont'd)
- (define (left-branch tree) (car tree))
- (define (right-branch tree) (cadr tree))
- (define (symbols tree)
    (if (leaf? tree)
        (list (symbol-leaf tree))
        (caddr tree)))
- (define (weight tree)
    (if (leaf? tree)
        (weight-leaf tree)
        (cadddr tree)))
51. Huffman Decoding Procedure
- Takes a list of bits and a Huffman tree and returns the list of characters encoded by those bits
- (define (decode bits tree)
    (define (decode-1 bits current-branch)
      (if (null? bits)
          '()
          (let ((next-branch
                 (choose-branch (car bits) current-branch)))
            (if (leaf? next-branch)
                (cons (symbol-leaf next-branch)
                      (decode-1 (cdr bits) tree))
                (decode-1 (cdr bits) next-branch)))))
    (decode-1 bits tree))
52. Choose-branch
- (define (choose-branch bit branch)
    (cond ((= bit 0) (left-branch branch))
          ((= bit 1) (right-branch branch))
          (else (error "bad bit -- CHOOSE-BRANCH" bit))))
53. Sets of Leaves and Trees
- This algorithm requires that we work with sets containing leaves and subtrees
- If the sets are kept ordered by increasing weight, the two lightest items are always at the front, so the code is more efficient
- (define (adjoin x set)
    (cond ((null? set) (list x))
          ((< (weight x) (weight (car set)))
           (cons x set))
          (else (cons (car set)
                      (adjoin x (cdr set))))))
54. Initialization Procedure
- (define (make-leaf-set pairs)
    (if (null? pairs)
        '()
        (let ((pair (car pairs)))
          (adjoin (make-leaf (car pair)     ; symbol
                             (cadr pair))   ; frequency
                  (make-leaf-set (cdr pairs))))))
55. Huffman Code
- (define (huffman pairs)
    (define (huff-internal tree-set)
      (if (= (length tree-set) 1)
          (car tree-set)
          (huff-internal
           (adjoin
            (make-code-tree (car tree-set)
                            (cadr tree-set))
            (cddr tree-set)))))
    (huff-internal (make-leaf-set pairs)))
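Putting the pieces together, here is a sketch that builds a tree from the frequencies in the merge example and decodes a short bit list. The procedures are collected from the slides (with the set insertion procedure called adjoin throughout); the name `example-tree` and the sample bits are assumptions. Ties between equal weights are broken by insertion order, so the tree's shape may differ from the hand-built example even though the total weight is the same.

```scheme
;; Leaves and trees, as on the slides.
(define (make-leaf symbol weight) (list 'leaf symbol weight))
(define (leaf? object) (eq? (car object) 'leaf))
(define (symbol-leaf x) (cadr x))
(define (weight-leaf x) (caddr x))
(define (left-branch tree) (car tree))
(define (right-branch tree) (cadr tree))
(define (symbols tree)
  (if (leaf? tree) (list (symbol-leaf tree)) (caddr tree)))
(define (weight tree)
  (if (leaf? tree) (weight-leaf tree) (cadddr tree)))
(define (make-code-tree left right)
  (list left right
        (append (symbols left) (symbols right))
        (+ (weight left) (weight right))))

;; Weight-ordered set of leaves and subtrees.
(define (adjoin x set)
  (cond ((null? set) (list x))
        ((< (weight x) (weight (car set))) (cons x set))
        (else (cons (car set) (adjoin x (cdr set))))))

(define (make-leaf-set pairs)
  (if (null? pairs)
      '()
      (let ((pair (car pairs)))
        (adjoin (make-leaf (car pair) (cadr pair))
                (make-leaf-set (cdr pairs))))))

;; Repeatedly merge the two lightest items until one tree remains.
(define (huffman pairs)
  (define (huff-internal tree-set)
    (if (= (length tree-set) 1)
        (car tree-set)
        (huff-internal
         (adjoin (make-code-tree (car tree-set) (cadr tree-set))
                 (cddr tree-set)))))
  (huff-internal (make-leaf-set pairs)))

(define (choose-branch bit branch)
  (cond ((= bit 0) (left-branch branch))
        ((= bit 1) (right-branch branch))
        (else (error "bad bit -- CHOOSE-BRANCH" bit))))

(define (decode bits tree)
  (define (decode-1 bits current-branch)
    (if (null? bits)
        '()
        (let ((next-branch
               (choose-branch (car bits) current-branch)))
          (if (leaf? next-branch)
              (cons (symbol-leaf next-branch)
                    (decode-1 (cdr bits) tree))
              (decode-1 (cdr bits) next-branch)))))
  (decode-1 bits tree))

;; Frequencies from the merge example (example-tree is a hypothetical name).
(define example-tree
  (huffman '((A 8) (B 3) (C 1) (D 1) (E 1) (F 1) (G 1) (H 1))))

(weight example-tree)                      ; => 17
(decode '(1 1 1 0 1 1 0 1) example-tree)   ; => (B A C)
```

In the tree this sketch builds, A gets a 1-bit code, B a 3-bit code, and each remaining character a 4-bit code. A message with these frequencies therefore costs 8·1 + 3·3 + 6·4 = 41 bits, compared with 17·3 = 51 bits for a 3-bit fixed-length code.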