Title: Data Structures: Trees and Grammars
1 Data Structures: Trees and Grammars
- CS201, Spring 2008
- Readings: Sections 6.1, 7.1-7.4 (more from Ch. 6 later)
2 Goals for this Unit
- Continue focus on data structures and algorithms
- Understand concepts of reference-based data structures (e.g. linked lists, binary trees)
- Some implementation for binary trees
- Understand the usefulness of trees and hierarchies as data models
- Recursion used to define data organization
3 Taxonomy of Data Structures
- From the text:
- Data type: a collection of values and operations
- Compare to Abstract Data Type!
- Simple data types vs. composite data types
- Book: data structures are composite data types
- Definition: a collection of elements that are some combination of primitive and other composite data types
4 Book's Classification of Data Structures
- Four groupings
- Linear Data Structures
- Hierarchical
- Graph
- Sets and Tables
- When defining these, note an element has
- one or more information fields
- relationships with other elements
5 Note on Our Book and Our Course
- Our book's strategy
- In Ch. 6, discuss principles of Lists
- Give an interface, then implement from scratch
- In Ch. 7, discuss principles of Trees
- Later, in Ch. 9, see what Java gives us
- Our course's strategy
- We did Ch. 9 first. Saw List interfaces and operations
- Then, Ch. 8 on maps and sets
- Now, trees with some implementation too
6 Trees Represent
- The concept of a tree is very common and important.
- Tree terminology
- Each node has one parent (except the root)
- A node's children
- Leaf nodes: no children
- Root node: top or start; no parent
- Data structures that store trees
- Execution or processing that can be expressed as a tree
- E.g. method calls as a program runs
- Searching a maze or puzzle
7 Trees are Important
- Trees are important for cognition and computation
- computer science
- language processing (human or computer)
- parse trees
- knowledge representation (or modeling of the real world)
- E.g. family trees, the Linnaean taxonomy (kingdom, phylum, ..., species), etc.
8 Another Tree Example: File System
- What about file links (Unix) or shortcuts (Windows)?
9 Another Tree Example: XML and HTML documents
[Example HTML page; tags not recoverable. Text content: My Page / Blah / blah blah / End]
- How is this a tree? What are the leaves?
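One way to explore the question is to let the JDK's DOM parser build the tree and then walk it. The sketch below is only an illustration: the page in `PAGE` is a hypothetical, well-formed stand-in (the slide's own markup is not available), and `DomTreeDemo`, `parseRoot`, `countLeaves`, and `print` are names made up here.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class DomTreeDemo {
    // Hypothetical, well-formed page used only for illustration.
    static final String PAGE =
        "<html><head><title>My Page</title></head>"
        + "<body><h1>Blah</h1><p>blah blah</p><p>End</p></body></html>";

    // Parse the text and return the root element of the document tree.
    static Node parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("not well-formed: " + e.getMessage(), e);
        }
    }

    // A leaf is a node with no children; in this page those are the text nodes.
    static int countLeaves(Node node) {
        NodeList kids = node.getChildNodes();
        if (kids.getLength() == 0) return 1;
        int total = 0;
        for (int i = 0; i < kids.getLength(); i++) total += countLeaves(kids.item(i));
        return total;
    }

    // Print one node per line, indented by its depth in the tree.
    static void print(Node node, int depth) {
        String label = node.getNodeType() == Node.TEXT_NODE
            ? "\"" + node.getNodeValue() + "\"" : node.getNodeName();
        System.out.println("  ".repeat(depth) + label);
        NodeList kids = node.getChildNodes();
        for (int i = 0; i < kids.getLength(); i++) print(kids.item(i), depth + 1);
    }

    public static void main(String[] args) {
        Node root = parseRoot(PAGE);
        print(root, 0);
        System.out.println("leaves: " + countLeaves(root));
    }
}
```

In this stand-in page the elements are the internal nodes and the pieces of text are the leaves.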
10 Tree Data Structures
- Why this now?
- Very useful in coding
- TreeMap in Java Collections Framework
- Example of recursive data structures
- Methods are recursive algorithms
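As a small illustration of why a tree-based collection is handy, the sketch below uses `java.util.TreeMap`, which keeps its keys in sorted order. The keys, values, and the `TreeMapDemo` class name are made up for this example.

```java
import java.util.Map;
import java.util.TreeMap;

class TreeMapDemo {
    // Build a small map; the names and ages are invented sample data.
    static TreeMap<String, Integer> sample() {
        TreeMap<String, Integer> ages = new TreeMap<>();
        ages.put("mia", 31);
        ages.put("bob", 25);
        ages.put("zed", 40);
        ages.put("ann", 19);
        return ages;
    }

    public static void main(String[] args) {
        TreeMap<String, Integer> ages = sample();
        // Iteration visits keys in sorted order, whatever the insertion order was.
        for (Map.Entry<String, Integer> e : ages.entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
        System.out.println(ages.firstKey());       // smallest key
        System.out.println(ages.ceilingKey("c"));  // smallest key >= "c"
    }
}
```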
11 Tree Definitions and Terms
- First, general trees vs. binary trees
- Each node in a binary tree has at most two children
- General tree definition:
- Set of nodes T (possibly empty?) with a distinguished node, the root
- All other nodes form a set of disjoint subtrees T_i
- each a tree in its own right
- each connected to the root with an edge
- Note the recursive definition
- Each node is the root of a subtree
12 Picture of Tree Definition
- And all subtrees are recursively defined as
- a node with
- subtrees attached to it
13 Tree Terminology
- A node's parent
- A node's children
- Binary tree: left child and right child
- Sibling nodes
- Descendants, ancestors
- A node's degree (how many children)
- Leaf nodes or terminal nodes
- Internal or non-terminal nodes
14 Recursive Data Structure
- Recursive data structure: a data structure that contains a pointer or reference to an instance of itself
    public class ListNode {
        Object nodeItem;
        ListNode next, previous;
        ...
    }
- Recursion is a natural way to express many algorithms.
- For recursive data structures, recursive algorithms are a natural choice
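To make the last point concrete, here is a minimal sketch of recursive methods on a list node. It is a simplified, singly linked variant of the `ListNode` above (no `previous` field); the constructor and the `length` and `contains` methods are assumptions added for illustration.

```java
class ListNode {
    Object nodeItem;
    ListNode next;  // reference to another ListNode: the recursive part

    ListNode(Object nodeItem, ListNode next) {
        this.nodeItem = nodeItem;
        this.next = next;
    }

    // Length of the list starting here: 1 for this node,
    // plus the length of the (smaller) list that starts at next.
    int length() {
        return (next == null) ? 1 : 1 + next.length();
    }

    // Is target in this node, or somewhere in the rest of the list?
    boolean contains(Object target) {
        if (nodeItem.equals(target)) return true;
        return next != null && next.contains(target);
    }
}
```

Each method handles one node and hands the rest of the work to the node that `next` refers to, mirroring the shape of the data.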
15 General Trees
- Representing general trees is a bit harder
- Each node has a list of child nodes
- Turns out that:
- Binary trees are simpler and still quite useful
- From now on, let's focus on binary trees only
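Before setting general trees aside, here is a rough sketch of the "list of child nodes" representation. The class name `GeneralTreeNode` and its methods are hypothetical, chosen for this example rather than taken from the textbook.

```java
import java.util.ArrayList;
import java.util.List;

class GeneralTreeNode {
    final String data;
    final List<GeneralTreeNode> children = new ArrayList<>();

    GeneralTreeNode(String data) { this.data = data; }

    // Attach a new child and return it so that callers can keep building below it.
    GeneralTreeNode addChild(String childData) {
        GeneralTreeNode child = new GeneralTreeNode(childData);
        children.add(child);
        return child;
    }

    // Degree: how many children this node has.
    int degree() { return children.size(); }

    boolean isLeaf() { return children.isEmpty(); }

    // Number of nodes in the subtree rooted here.
    int size() {
        int total = 1;
        for (GeneralTreeNode c : children) total += c.size();
        return total;
    }

    // Height of the subtree rooted here; a leaf has height 0.
    int height() {
        int tallest = -1;
        for (GeneralTreeNode c : children) tallest = Math.max(tallest, c.height());
        return tallest + 1;
    }
}
```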
16 ADT Tree
- Remember the definition of an ADT?
- Model of information: we just covered that
- Operations? See page 366 in textbook
- Many are similar to ADT List or any data structure
- The CRUD operations: create, read, update, delete
- Important about this list of operations:
- some are in terms of one specified node, e.g. hasParent()
- others are tree-wide, e.g. size(), traversal
17 Classes for Binary Trees
- class BinaryTree
- reference to root BinaryTreeNode
- methods: tree-level operations
- class BinaryTreeNode
- data: an object (of some type)
- left: references root of left subtree (or null)
- right: references root of right subtree (or null)
- parent: references this node's parent node
- Could this be null? When should it be?
- methods: node-level operations
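The outline above might translate into Java roughly as follows. This is a sketch under assumptions, not the textbook's code: the `setLeft`/`setRight` helpers and the choice of operations are illustrative, and it takes the view that `parent` is null exactly when the node is the root.

```java
class BinaryTreeNode {
    Object data;
    BinaryTreeNode left, right;  // roots of the subtrees, or null
    BinaryTreeNode parent;       // assumed null only for the root

    BinaryTreeNode(Object data) { this.data = data; }

    // Link a child and keep its parent reference consistent.
    void setLeft(BinaryTreeNode child) {
        left = child;
        if (child != null) child.parent = this;
    }

    void setRight(BinaryTreeNode child) {
        right = child;
        if (child != null) child.parent = this;
    }

    // Node-level operations
    boolean hasParent() { return parent != null; }
    boolean isLeaf() { return left == null && right == null; }

    int size() {
        return 1 + (left == null ? 0 : left.size())
                 + (right == null ? 0 : right.size());
    }
}

class BinaryTree {
    BinaryTreeNode root;  // null when the tree is empty

    // Tree-level operations
    boolean isEmpty() { return root == null; }
    int size() { return root == null ? 0 : root.size(); }
}
```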
18 Two-class Strategy for Recursive Data Structures
- Common design: use two classes for a Tree or List
- Top class
- has reference to first node
- other things that apply to the whole data-structure object (e.g. the tree object)
- both methods and fields
- Node class
- Recursive definitions are here as references to other node objects
- Also data (of course)
- Methods defined in this class are recursive
19 Binary Tree and Node Class
- BinaryTree class has:
- reference to root node
- reference to a current node, a cursor
- non-recursive methods like boolean find(tgt) // see if tgt is in the whole tree
- Node class has:
- data, references to left and right subtrees
- recursive versions of methods like find: boolean find(tgt) // is tgt here or in my subtrees?
- Note: BinaryTree.find() just calls Node.find() on the root node!
- Other methods work this way too
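The delegation described here can be sketched in a few lines. The code assumes an ordinary (unordered) binary tree, so `Node.find` has to look in both subtrees; the constructors are added for illustration and the cursor is left out to keep the example short.

```java
class Node {
    Object data;
    Node left, right;

    Node(Object data, Node left, Node right) {
        this.data = data;
        this.left = left;
        this.right = right;
    }

    // Recursive: is tgt here or in one of my subtrees?
    boolean find(Object tgt) {
        if (data.equals(tgt)) return true;
        if (left != null && left.find(tgt)) return true;
        return right != null && right.find(tgt);
    }
}

class BinaryTree {
    Node root;  // null when the tree is empty

    BinaryTree(Node root) { this.root = root; }

    // Non-recursive: handle the empty tree, then delegate to the root node.
    boolean find(Object tgt) {
        return root != null && root.find(tgt);
    }
}
```

Callers only see `BinaryTree.find`; the recursion stays hidden inside `Node`.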
20 Why Does This Matter Now?
- This illustrates (again) important design ideas
- The tree itself is what we're interested in
- There are tree-level operations on it (ADT-level operations)
- The implementation is a recursive data structure
- There are recursive methods inside the lower-level classes that are closely related (same name!) to the ADT-level operation
- Principles? Abstraction (hiding details), delegation (helper classes, methods)
21 ADT Tree Operations: Navigation
- Positioning
- toRoot(), toParent(), toLeftChild(), toRightChild(), find(Object o)
- Checking
- hasParent(), hasLeftChild(), etc.
- equals(Object tree2)
- Book calls this a deep compare
- Do two distinct objects have the same structure and contents?
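A deep compare can be written as a short recursion over both trees at once. The sketch below is one possible reading of "same structure and contents"; the helper names `sameTree` and `hash` are hypothetical, and `hashCode` is overridden only because Java expects it to agree with `equals`.

```java
import java.util.Objects;

class Node {
    Object data;
    Node left, right;

    Node(Object data, Node left, Node right) {
        this.data = data;
        this.left = left;
        this.right = right;
    }

    // Same shape and equal data at every position.
    static boolean sameTree(Node a, Node b) {
        if (a == null || b == null) return a == b;  // both empty, or only one is
        return Objects.equals(a.data, b.data)
            && sameTree(a.left, b.left)
            && sameTree(a.right, b.right);
    }

    // Hash that depends on both structure and contents.
    static int hash(Node n) {
        if (n == null) return 0;
        return 31 * (31 * Objects.hashCode(n.data) + hash(n.left)) + hash(n.right);
    }
}

class BinaryTree {
    Node root;

    BinaryTree(Node root) { this.root = root; }

    @Override
    public boolean equals(Object tree2) {
        if (this == tree2) return true;
        if (!(tree2 instanceof BinaryTree)) return false;
        return Node.sameTree(root, ((BinaryTree) tree2).root);
    }

    @Override
    public int hashCode() { return Node.hash(root); }
}
```

Two separately built trees compare equal when every position matches; moving a value from a left child to a right child makes them unequal even though the contents are the same.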
22 ADT Tree Operations: Mutators
- Mutators
- insertRight(Object o), insertLeft(Object o)
- create a new node containing new data
- make this new node be the child of the current node
- Important: we use these to build trees!
- prune()
- delete the subtree rooted at the current node
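The sketch below shows how these mutators could work with a cursor. Several details are assumptions made for this example, not the textbook's specification: inserting into an empty tree creates the root, inserting where a child already exists is an error, and `prune()` moves the cursor to the parent.

```java
class Node {
    Object data;
    Node left, right, parent;

    Node(Object data, Node parent) {
        this.data = data;
        this.parent = parent;
    }

    int size() {
        return 1 + (left == null ? 0 : left.size())
                 + (right == null ? 0 : right.size());
    }
}

class BinaryTree {
    Node root;     // null when the tree is empty
    Node current;  // the cursor

    // Assumption: on an empty tree, either insert creates the root.
    void insertLeft(Object o) {
        if (root == null) { root = current = new Node(o, null); return; }
        if (current.left != null) throw new IllegalStateException("left child exists");
        current.left = new Node(o, current);
    }

    void insertRight(Object o) {
        if (root == null) { root = current = new Node(o, null); return; }
        if (current.right != null) throw new IllegalStateException("right child exists");
        current.right = new Node(o, current);
    }

    void toRoot() { current = root; }

    boolean toLeftChild() {
        if (current == null || current.left == null) return false;
        current = current.left;
        return true;
    }

    boolean toRightChild() {
        if (current == null || current.right == null) return false;
        current = current.right;
        return true;
    }

    // Delete the subtree rooted at the current node; the cursor moves to the parent.
    void prune() {
        if (current == null) return;
        Node p = current.parent;
        if (p == null) root = null;
        else if (p.left == current) p.left = null;
        else p.right = null;
        current = p;
    }

    int size() { return root == null ? 0 : root.size(); }
}
```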
23 Next: Implementation
- Next (in the book)
- How to implement Java classes for binary trees
- Class for node, another class for BinTree
- Interface for both, then two implementations (array and reference)
- But for us
- We'll skip this
- We'll only look at the reference-based implementation
- Next: concept of a binary search tree
24 Binary Search Trees
- We often need collections that store items
- Maybe a long series of inserts or deletions
- We want fast lookup, and often we want to access in sorted order
- Lists: O(n) lookup
- Could sort them for O(lg n) lookup (binary search)
- Cost to sort is O(n lg n), and we might need to re-sort often as we insert, remove items
- Solution: search tree
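The trade-off can be seen with the standard library alone. This sketch contrasts a linear scan with `Collections.binarySearch`, which is only valid on a sorted list; the class name `LookupDemo` and the sample numbers are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class LookupDemo {
    // Linear scan: O(n) comparisons in the worst case, works on any list.
    static boolean linearFind(List<Integer> items, int target) {
        for (int item : items) {
            if (item == target) return true;
        }
        return false;
    }

    // Binary search: O(lg n) comparisons, but the list must already be sorted.
    static boolean binaryFind(List<Integer> sortedItems, int target) {
        return Collections.binarySearch(sortedItems, target) >= 0;
    }

    public static void main(String[] args) {
        List<Integer> items = new ArrayList<>(Arrays.asList(7, 3, 9, 1));
        System.out.println(linearFind(items, 9));

        Collections.sort(items);            // O(n lg n), paid before the first fast lookup
        System.out.println(binaryFind(items, 9));

        items.add(4);                       // appending breaks the sorted order...
        Collections.sort(items);            // ...so we pay for another sort
        System.out.println(binaryFind(items, 4));
    }
}
```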
25 Binary Search Trees
- Associated with each node is a key value that can be compared.
- Binary search tree property (must hold at every node, taking that node as the root of its own subtree):
- every node in the left subtree has a key whose value is less than the value of the root's key, and
- every node in the right subtree has a key whose value is greater than the value of the root's key.
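One way to test the property in code is to pass down the range of keys each subtree is allowed to contain. This is a sketch with hypothetical names (`BstNode`, `isBst`) and integer keys assumed for simplicity.

```java
class BstNode {
    int key;
    BstNode left, right;

    BstNode(int key, BstNode left, BstNode right) {
        this.key = key;
        this.left = left;
        this.right = right;
    }

    // True if every key in the subtree lies strictly between lo and hi
    // (null means "no bound") and the property holds at every node.
    static boolean isBst(BstNode n, Integer lo, Integer hi) {
        if (n == null) return true;
        if (lo != null && n.key <= lo) return false;
        if (hi != null && n.key >= hi) return false;
        return isBst(n.left, lo, n.key) && isBst(n.right, n.key, hi);
    }

    static boolean isBst(BstNode root) {
        return isBst(root, null, null);
    }
}
```

Comparing each node only with its own children is not enough: a 6 placed as the right child of 3 looks fine locally, but it violates the property if that 3 sits in the left subtree of a root whose key is 5.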
26 Example
[Figure: binary tree with keys 5, 8, 4, 11, 7, 1, 3]
BINARY SEARCH TREE
27 Counterexample
[Figure: binary tree with keys 8, 11, 5, 18, 10, 6, 2, 7, 4, 20, 15, 21]
NOT A BINARY SEARCH TREE
28 Find and Insert in BST
- Find: look for where it should be
- If not there, that's where you insert
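The idea that insert is "find, then attach where the search fell off the tree" can be sketched as follows. The class `Bst`, its integer keys, and the choice to ignore duplicates are assumptions for this example.

```java
import java.util.ArrayList;
import java.util.List;

class Bst {
    private static class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    private Node root;

    // Walk down from the root, going left or right by comparing keys.
    boolean find(int key) {
        Node cur = root;
        while (cur != null) {
            if (key == cur.key) return true;
            cur = (key < cur.key) ? cur.left : cur.right;
        }
        return false;
    }

    // Same walk as find; the empty spot where the search ends is where the key goes.
    // Returns false if the key is already present.
    boolean insert(int key) {
        if (root == null) { root = new Node(key); return true; }
        Node cur = root;
        while (true) {
            if (key == cur.key) return false;
            if (key < cur.key) {
                if (cur.left == null) { cur.left = new Node(key); return true; }
                cur = cur.left;
            } else {
                if (cur.right == null) { cur.right = new Node(key); return true; }
                cur = cur.right;
            }
        }
    }

    // In-order traversal: left subtree, node, right subtree.
    List<Integer> inOrder() {
        List<Integer> out = new ArrayList<>();
        collect(root, out);
        return out;
    }

    private static void collect(Node n, List<Integer> out) {
        if (n == null) return;
        collect(n.left, out);
        out.add(n.key);
        collect(n.right, out);
    }
}
```

An in-order traversal of a binary search tree visits the keys in sorted order, which is how the tree provides sorted access without a separate sort step.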
29 Recursion and Tree Operations
- Recursive code for tree operations is simple, natural, elegant
- Example: pseudo-code for Node.find()
    boolean find(Comparable tgt) {
        Node next;
        if (this.data matches tgt)
            return true;
        else if (tgt's data < this.data)
            next = this.leftChild;
        else // current data < tgt's data
            next = this.rightChild;
        // next points to left or right subtree
        if (next == null)
            return false; // no subtree
        else
            return next.find(tgt); // search on
    }
30 Backus-Naur Form
- http://en.wikipedia.org/wiki/Backus-Naur_form
- BNF is a widely used notation for describing the grammar or formal syntax of programming languages or data
- BNF specifies a grammar as a set of derivation rules of this form: <symbol> ::= <expression with symbols>
- Look at website and example there (also on next slide)
- How are trees involved here? Is it recursive?
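31 BNF for Postal Address
- <postal-address> ::= <name-part> <street-address> <zip-part>
- <personal-part> ::= <first-name> | <initial> "."
- <name-part> ::= <personal-part> <last-name> <EOL> | <personal-part> <name-part>
- <street-address> ::= <house-num> <street-name> <EOL>
- <zip-part> ::= <town-name> "," <state-code> <ZIP-code> <EOL>
- Example: Ann Marie G. Jones
- 123 Main St.
- Hooville, VA 22901
- Where's the recursion?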
32 Grammars in Language
- Rule-based grammars describe
- how legal statements can be produced
- how to tell if a statement is legal
- Study textbook, pp. 389-391, to see rule-based grammar for simple Java-like arithmetic expressions
- four rules for expressions, terms, factors, and letters
- Study how a (possibly) legal statement is parsed to generate a parse tree
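To see how rules turn into a parser, here is a minimal recursive-descent sketch with one method per rule. The grammar it implements is an assumption and may differ from the textbook's four rules: expression ::= term { (+ | -) term }, term ::= factor { (* | /) factor }, factor ::= letter | ( expression ), letter ::= a to z. The class `ExprParser` is hypothetical, and it returns a fully parenthesized string that shows the shape of the parse tree rather than building node objects.

```java
class ExprParser {
    private final String src;
    private int pos = 0;

    private ExprParser(String text) { this.src = text.replace(" ", ""); }

    // Returns a parenthesized form of the parse tree,
    // or throws IllegalArgumentException if the text is not a legal expression.
    static String parse(String text) {
        ExprParser p = new ExprParser(text);
        String tree = p.expression();
        if (p.pos != p.src.length()) throw p.error("unexpected input");
        return tree;
    }

    static boolean isLegal(String text) {
        try {
            parse(text);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    // expression ::= term { (+ | -) term }
    private String expression() {
        String left = term();
        while (peek() == '+' || peek() == '-') {
            char op = next();
            left = "(" + left + " " + op + " " + term() + ")";
        }
        return left;
    }

    // term ::= factor { (* | /) factor }
    private String term() {
        String left = factor();
        while (peek() == '*' || peek() == '/') {
            char op = next();
            left = "(" + left + " " + op + " " + factor() + ")";
        }
        return left;
    }

    // factor ::= letter | ( expression )
    private String factor() {
        char c = peek();
        if (c >= 'a' && c <= 'z') return String.valueOf(next());
        if (c == '(') {
            next();
            String inner = expression();
            if (peek() != ')') throw error("expected ')'");
            next();
            return inner;
        }
        throw error("expected a letter or '('");
    }

    private char peek() { return pos < src.length() ? src.charAt(pos) : '\0'; }
    private char next() { return src.charAt(pos++); }

    private IllegalArgumentException error(String msg) {
        return new IllegalArgumentException(msg + " at position " + pos);
    }
}
```

For example, `ExprParser.parse("a+b*c")` returns `(a + (b * c))`: the multiplication ends up deeper in the tree because `term` is reached through `expression`. The recursion in the grammar (a factor may contain a whole expression) shows up as `factor()` calling `expression()`.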
33 Computing Parse-Tree Example
34 Grammar Terms and Concepts
- First, this is what's called a context-free grammar
- For CS201, let's not worry about what this means!
- A CFG has
- a set of variables (AKA non-terminals)
- a set of terminal symbols
- a set of productions
- a starting symbol
35 Previous Parse Tree
- Terminal symbols
- could be
- could be a b c
- Production ?
36 Natural Language Parse Tree
- Statement: The man bit the dog
37 How Can We Use Grammars?
- Parsing
- Is a given statement a valid statement in the language? (Is the statement recognized by the grammar?)
- Note: this is what the Java compiler does as a first step toward creating an executable form of your program. (Find errors, or build executable.)
- Production
- Generate a legal statement for this grammar
- Demo: generate random statements!
- See link on website next to slides
38 Demo's Poem-grammar data file
[Grammar data file; rule names and non-terminal markers not recoverable. Remaining text:]
- The [...] tonight
- waves
- big yellow flowers
- slugs
- sigh, portend like, die
- warily, grumpily
- Note: no recursive productions in this example!
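Generating a random statement amounts to expanding the start symbol, picking one production at random for each non-terminal. The sketch below is hypothetical throughout: the non-terminal names (`<start>`, `<noun>`, `<verb>`, `<adverb>`) and the way the phrases are grouped into rules are assumptions for illustration, not the demo's actual data file or format.

```java
import java.util.List;
import java.util.Map;
import java.util.Random;

class PoemGenerator {
    // Hypothetical grammar: each non-terminal maps to its alternative productions.
    static final Map<String, List<String>> RULES = Map.of(
        "<start>", List.of("The <noun> <verb> tonight"),
        "<noun>", List.of("waves", "big yellow flowers", "slugs"),
        "<verb>", List.of("sigh <adverb>", "portend like <noun>", "die <adverb>"),
        "<adverb>", List.of("warily", "grumpily"));

    // A terminal is returned unchanged; a non-terminal is replaced by one of
    // its productions, chosen at random, with each word expanded recursively.
    static String expand(String symbol, Random rng) {
        List<String> options = RULES.get(symbol);
        if (options == null) return symbol;  // terminal symbol
        String production = options.get(rng.nextInt(options.size()));
        StringBuilder out = new StringBuilder();
        for (String part : production.split(" ")) {
            if (out.length() > 0) out.append(' ');
            out.append(expand(part, rng));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Random rng = new Random();
        for (int i = 0; i < 3; i++) {
            System.out.println(expand("<start>", rng));
        }
    }
}
```

Because no production in this grammar refers back to its own non-terminal, every expansion is guaranteed to finish; a recursive production would need some care to keep the output from growing without bound.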