Huffman Codes and A Lot More Java - PowerPoint PPT Presentation

Provided by: briana78

1
Huffman Codes and (A Lot More) Java
  • CS 2: Introduction to Programming Methods
  • 30 January 2003

2
Trees!
3
Tree terms
  • Node: the basic unit in constructing a tree
  • Children: the nodes below and connected to a
    given node
  • Parent: the node above and connected to a given
    node
  • Root: the only node with no parent; it is the
    top of the tree
  • Leaf: a childless node, i.e., one of the
    bottom-most nodes

4
Tree Examples
  • Note how only the leaves in this example store
    data. Depending on the application, this may be
    appropriate.

5
Tree Examples
  • In this tree, Nodes have more than two children
    and all the Nodes have data (color in this
    example).

6
Traversing a Tree
  • Traversing a tree is fairly simple using
    recursion.
  • traverse(TreeNode t, Operation o) {
  •   o.operate(t.getRoot()); // do something to self
  •   if (t instanceof TreeLeaf) {
  •     // No children to bother with
  •   } else {
  •     // Get the children and do something to them
  •     Iterator kids = t.children();
  •     while (kids.hasNext()) traverse((TreeNode) kids.next(), o);
  •   }
  • }
7
Tree Traversal
  • The nodes in this tree are numbered in the order
    of operation for the preceding example of
    Traversal

8
The instanceof keyword
9
instanceof usage
  • So I used instanceof in the traversal example.
    But what is it?
  • instanceof is used to determine if a given Object
    is an instance of a particular class. For
    instance, if t is a TreeLeaf, then (t instanceof
    TreeLeaf) evaluates to true, otherwise it
    evaluates to false.
  • Moreover, if t is a TreeLeaf, which is a subclass
    of TreeNode, then (t instanceof TreeNode) also
    will evaluate to true.
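
A small sketch of both cases described above. The empty `TreeNode` and `TreeLeaf` classes are placeholders that only mirror the subclass relationship; they are not the course's actual classes.

```java
// Placeholder classes that mirror only the TreeLeaf-extends-TreeNode relationship.
class InstanceofSketch {
    static class TreeNode { }
    static class TreeLeaf extends TreeNode { }

    static String describe(Object o) {
        if (o instanceof TreeLeaf) {
            return "leaf";   // a TreeLeaf is also a TreeNode, so test the subclass first
        } else if (o instanceof TreeNode) {
            return "node";
        }
        return "other";
    }

    public static void main(String[] args) {
        TreeNode t = new TreeLeaf();
        System.out.println(t instanceof TreeLeaf);              // true
        System.out.println(t instanceof TreeNode);              // true
        System.out.println(new TreeNode() instanceof TreeLeaf); // false
        System.out.println(describe(null));                     // other (null is never an instance)
    }
}
```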

10
Exceptions, quick and dirty
11
Exceptions
  • In Java, when something goes wrong, an exception
    is thrown.
  • So far, if you saw an exception, it meant you did
    something wrong. For instance, popping an object
    off an empty stack.
  • There are many times when something will go wrong
    through no fault of your own. When trying to do
    file input and output (I/O), bad things can
    happen and you need a graceful way to deal with
    them. Exceptions are the answer.

12
Try-catch blocks
  • The construct for dealing with Exceptions is the
    try-catch block. Basically, it says: try to do
    something, and if something goes wrong, use the
    code in the catch block to try to deal with the
    problem.
  • Example
  • try {
  •   doACM95set(1); // total time to spend in hours
  • } catch (WishfulThinkingException e) {
  •   System.err.println("Not gonna happen!");
  • }
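
The example above can be made runnable along the following lines. `WishfulThinkingException` is defined here purely for illustration, and `doProblemSet` with its ten-hour threshold is an invented stand-in for the slide's `doACM95set`.

```java
// WishfulThinkingException and doProblemSet are made-up names for illustration.
class TryCatchSketch {
    static class WishfulThinkingException extends Exception {
        WishfulThinkingException(String message) {
            super(message);
        }
    }

    // Throws if the time budget is unrealistically small (threshold is arbitrary).
    static void doProblemSet(int hours) throws WishfulThinkingException {
        if (hours < 10) {
            throw new WishfulThinkingException(hours + " hour(s) is not enough");
        }
    }

    // Returns a status string so the outcome of the try-catch is visible.
    static String attempt(int hours) {
        try {
            doProblemSet(hours);
            return "done";
        } catch (WishfulThinkingException e) {
            return "Not gonna happen: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(attempt(1));
        System.out.println(attempt(20));
    }
}
```

When `doProblemSet` throws, the rest of the try block is skipped and control jumps straight to the catch block.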

13
And now: the quick and dirty way out
  • If you don't want to deal with Exceptions right
    now, all you need to do is add throws Exception
    to all method declarations, like so:
  • public static void main(String[] args) throws Exception {
  •   // ha ha, I don't have to handle any Exceptions
  •   // I am invincible
  • }

14
Worst coding habits ever
  • You can do this for lab 4 to ease the learning
    curve. There are a lot of new programming
    concepts you need to understand for the lab, and
    you don't need to understand exceptions yet.
  • However, this is a terrible coding habit. I
    cannot emphasize enough how bad it is to do this.
    Once you learn the right way to handle
    exceptions, never do this again. Ever.
  • Ever!

15
File I/O
16
File I/O intro
  • Files are stored on the hard disk as a series of
    bytes. Bytes are numbers from 0 to 255 (2^8 - 1)
    that represent data. Plain text is usually
    stored in ASCII, which maps the numbers 0-127 to
    characters; 8-bit extensions of ASCII use the
    full 0-255 range.
  • Java normally uses streams to do I/O. Basically
    this means that you cannot easily go backwards
    while doing I/O. If you want to go back and
    change something, you have to start from the
    beginning. There are ways around this, but you
    won't need them for this lab.
  • Almost all I/O operations can throw IOExceptions.
  • To use I/O in Java you need to import java.io.*

17
Input
  • In Java, FileInputStream is a class that allows
    you to read in a file. It has a lot of methods,
    but the useful one is read(), which returns the
    next byte in the file, or -1 if there are no bytes
    left to read.
  • Usage example
  • try {
  •   FileInputStream fis = new FileInputStream(filename);
  •   int c = fis.read();
  •   while (c != -1) {
  •     // do something with c
  •     c = fis.read();
  •   }
  •   fis.close();
  • } catch (IOException e) {
  •   // deal with the exception
  •   System.err.println("IO Error! " + e.getMessage());
  • }
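
A self-contained sketch of the read loop above, assuming nothing beyond java.io. The `countBytes` helper and the temporary file are illustrative choices, not part of the lab code.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

class ReadBytesSketch {
    // Reads the file one byte at a time and returns how many bytes it held.
    static int countBytes(String filename) throws IOException {
        FileInputStream fis = new FileInputStream(filename);
        try {
            int count = 0;
            int c = fis.read();       // 0..255, or -1 at end of file
            while (c != -1) {
                count++;
                c = fis.read();
            }
            return count;
        } finally {
            fis.close();              // runs even if read() throws
        }
    }

    public static void main(String[] args) {
        try {
            // Create a three-byte scratch file so there is something to read.
            File tmp = File.createTempFile("read-sketch", ".bin");
            tmp.deleteOnExit();
            FileOutputStream fos = new FileOutputStream(tmp);
            fos.write(new byte[] {72, 105, 33});
            fos.close();
            System.out.println(countBytes(tmp.getPath())); // prints 3
        } catch (IOException e) {
            System.err.println("IO Error! " + e.getMessage());
        }
    }
}
```

Note that read() returns an int rather than a byte so that -1 can signal end of file without colliding with a real byte value.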

18
Output
  • Output is done by FileOutputStream and its method
    write(data). Look in the documentation to see
    all the possible arguments for write, since there
    are several.
  • write(int b) writes the byte in b to the next
    spot in the file.
  • Usage Example
  • try {
  •   FileOutputStream fos = new FileOutputStream(filename);
  •   for (int i = 0; i < 256; i++) fos.write(i);
  •   fos.close();
  • } catch (IOException e) {
  •   System.err.println("IO Error! Oh no!");
  • }
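
As a rough illustration of the write loop above, the sketch below writes the 256 byte values to a temporary file and reads them back to check the result. The helper names `writeAllByteValues` and `verify` are invented for this example.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

class WriteBytesSketch {
    // Writes the byte values 0..255 in order, as in the slide's loop.
    static void writeAllByteValues(File file) throws IOException {
        FileOutputStream fos = new FileOutputStream(file);
        try {
            for (int i = 0; i < 256; i++) {
                fos.write(i);         // write(int) stores only the low 8 bits of i
            }
        } finally {
            fos.close();
        }
    }

    // Reads the file back and checks that byte k has value k.
    static boolean verify(File file) throws IOException {
        FileInputStream fis = new FileInputStream(file);
        try {
            for (int expected = 0; expected < 256; expected++) {
                if (fis.read() != expected) {
                    return false;
                }
            }
            return fis.read() == -1;  // nothing should follow the 256th byte
        } finally {
            fis.close();
        }
    }

    public static void main(String[] args) {
        try {
            File tmp = File.createTempFile("write-sketch", ".bin");
            tmp.deleteOnExit();
            writeAllByteValues(tmp);
            System.out.println(verify(tmp) + ", length " + tmp.length());
        } catch (IOException e) {
            System.err.println("IO Error! Oh no! " + e.getMessage());
        }
    }
}
```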

19
Huffman Tree: How do these structures relate to trees?
20
HuffmanTree
  • HuffmanLeaf
  • Acts like a leaf in a tree
  • May or may not have parent
  • Has no children
  • HuffmanNode
  • Acts like a non-leaf node in a tree
  • May or may not have parent
  • Can have 0, 1 or 2 children

21
HuffmanTree
  • Some Examples of Valid Huffman Trees

[Figure: example trees; the legend distinguishes Leaf and Node.]
22
HuffmanTree: Methods of HuffmanLeaf
  • HuffmanLeaf(int value)
  • HuffmanLeaf(int value, int frequency)
  • value: the value the leaf is representing
  • frequency: how often the value occurs (more on
    that later)
  • getValue()
  • setValue()
  • toString()
  • Returns a string of the form value, frequency

23
HuffmanTree: Methods of HuffmanNode
  • HuffmanNode()
  • HuffmanNode(int frequency)
  • HuffmanNode(int frequency, HuffmanTree left,
    HuffmanTree right)
  • left and right are the children of the
    HuffmanNode

[Figure: HuffmanNodes with children labeled Left and Right.]
24
HuffmanTree: Methods of HuffmanNode
  • HuffmanNode()
  • HuffmanNode(int frequency)
  • HuffmanNode(int frequency, HuffmanTree left,
    HuffmanTree right)
  • left and right are the children of the
    HuffmanNode
  • Nodes can also be children

[Figure: HuffmanNodes with children labeled Left and Right.]
25
HuffmanTree: Methods of HuffmanNode
  • getLeft(), getRight()
  • Return the node's left and right child, respectively
  • setLeft(), setRight()
  • Set the node's left and right child, respectively
  • toString()
  • Returns a string representation of the node

26
Compression: One reason we care about trees
27
Compression
  • Used EVERYWHERE
  • Examples
  • Music: MP3s
  • Images: JPG, GIF
  • Other: ZIP, TAR
  • Idea: compression takes a file and reduces the
    number of bits it takes to express that
    information.
  • One simple but effective compression algorithm is
    Huffman Encoding
  • For more info visit http://www.data-compression.com/

28
Huffman Encoding: A Greedy Algorithm
  • Huffman encoding is an example of a greedy
    algorithm
  • A greedy algorithm is an algorithm that always
    makes the choice that looks best at the moment
  • For problems such as this one, the locally
    optimal choice leads to a globally optimal
    solution; that is not true of every problem.
  • There are many other greedy algorithms (CS 38)

29
Huffman Encoding: Resources
  • Huffman encoding is a common algorithm and there
    are many resources available online explaining
    it:
  • http://www.cs.duke.edu/csed/poop/huff/info/
  • If you have any questions about this or any other
    part of your homework, come to any of our office
    hours or e-mail the TAs

30
Huffman Encoding: Overview
  • Compression
  • Reads a text file and creates a tree representing
    a best possible encoding
  • Uses the tree to convert the original file into a
    compressed file.
  • Tree info is saved (we do that)
  • As you can see, compression requires reading over
    the original file twice (we handle most of the
    file reading)

31
Huffman Encoding: Creating the Tree
  • Start with an array of 256 HuffmanLeafs
  • there are 256 possible byte values (extended
    ASCII), and converting them to ints is simple
  • HuffmanLeaf[] Foo = new HuffmanLeaf[256];
  • Foo[i].getValue() == i, i.e. the ith char in ASCII
  • All frequencies initially zero
  • More on arrays in a bit
  • Read in the file a character at a time
  • Whenever you encounter character i, increment
    Foo[i]'s frequency
  • Foo[i].setFrequency(Foo[i].getFrequency() + 1);
  • End up with an array of the frequencies of each
    of the 256 chars
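
The counting step can be sketched as follows. To stay self-contained, this version uses a plain `int[256]` instead of the lab's HuffmanLeaf array, and takes its input from a byte array rather than a file; both are simplifications made for this example.

```java
import java.nio.charset.StandardCharsets;

class FrequencySketch {
    // Returns counts[b] = number of times byte value b (0..255) occurs in data.
    static int[] countFrequencies(byte[] data) {
        int[] counts = new int[256];     // Java initialises every slot to 0
        for (byte b : data) {
            counts[b & 0xFF]++;          // mask: Java bytes are signed (-128..127)
        }
        return counts;
    }

    public static void main(String[] args) {
        byte[] text = "ABRACADABRA".getBytes(StandardCharsets.US_ASCII);
        int[] counts = countFrequencies(text);
        System.out.println("A: " + counts['A']); // A: 5
        System.out.println("B: " + counts['B']); // B: 2
    }
}
```

Values returned by FileInputStream.read() are already in the range 0..255, so the mask is only needed when starting from a `byte`.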

32
Huffman Encoding: Creating the Tree
  • Building the Tree
  • Select the two Trees (initially leafs) with the
    lowest frequencies
  • We will call them L and R
  • Create a new Node with L as its left child, R as
    its right child and a frequency of L.frequency +
    R.frequency
  • Repeat until all the nodes are merged into one
    tree
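
One possible way to code the merging loop above is with a priority queue ordered by frequency. The `Tree` class below is a hypothetical stand-in for the lab's HuffmanLeaf and HuffmanNode classes, and using java.util.PriorityQueue is a choice made for this sketch rather than something the lab prescribes. The frequencies are the ones from the worked example in this deck (C 5, E 8, H 2, I 7, A 3).

```java
import java.util.PriorityQueue;

// Tree is a hypothetical stand-in for the course's HuffmanLeaf/HuffmanNode classes.
class BuildTreeSketch {
    static class Tree implements Comparable<Tree> {
        final int frequency;
        final char value;        // meaningful only for leaves
        final Tree left, right;  // both null for leaves

        Tree(char value, int frequency) {
            this.value = value;
            this.frequency = frequency;
            this.left = null;
            this.right = null;
        }

        Tree(Tree left, Tree right) {
            this.value = '\0';
            this.frequency = left.frequency + right.frequency;
            this.left = left;
            this.right = right;
        }

        boolean isLeaf() {
            return left == null && right == null;
        }

        public int compareTo(Tree other) {
            return Integer.compare(frequency, other.frequency);
        }
    }

    // Repeatedly merges the two lowest-frequency trees until one remains.
    static Tree build(char[] values, int[] frequencies) {
        PriorityQueue<Tree> queue = new PriorityQueue<Tree>();
        for (int i = 0; i < values.length; i++) {
            queue.add(new Tree(values[i], frequencies[i]));
        }
        while (queue.size() > 1) {
            Tree l = queue.poll();
            Tree r = queue.poll();
            queue.add(new Tree(l, r));
        }
        return queue.poll();
    }

    // Depth of the leaf holding value, or -1 if absent; equals its codeword length.
    static int depthOf(Tree t, char value, int depth) {
        if (t == null) return -1;
        if (t.isLeaf()) return t.value == value ? depth : -1;
        int d = depthOf(t.left, value, depth + 1);
        return d != -1 ? d : depthOf(t.right, value, depth + 1);
    }

    public static void main(String[] args) {
        Tree root = build(new char[] {'C', 'E', 'H', 'I', 'A'}, new int[] {5, 8, 2, 7, 3});
        System.out.println(root.frequency);        // 25
        System.out.println(depthOf(root, 'E', 0)); // 2
        System.out.println(depthOf(root, 'H', 0)); // 3
    }
}
```

Which subtree ends up on the left or right depends on how ties are broken, so the exact bit patterns may differ from a hand-drawn tree even though the codeword lengths agree.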

33
Huffman Encoding: Creating the Tree
[Figure: five separate leaves with frequencies C 5, E 8, H 2, I 7, A 3.]
34
Huffman Encoding: Creating the Tree
[Figure: A (3) and H (2) merged under a new node of frequency 5; C 5, E 8, and I 7 are still separate.]
35
Huffman Encoding: Creating the Tree
[Figure: C (5) and the A/H node (5) merged under a new node of frequency 10; E 8 and I 7 are still separate.]
36
Huffman Encoding: Creating the Tree
[Figure: E (8) and I (7) merged under a new node of frequency 15, next to the node of frequency 10.]
37
Huffman Encoding: Creating the Tree
[Figure: the nodes of frequency 15 and 10 merged under a root of frequency 25, completing the tree.]
38
Huffman Encoding: Creating the Tree
  • Use the tree built using the Huffman algorithm to
    get compression
  • Most frequently used characters have the shortest
    codeword lengths
  • Less common characters have longer codeword
    lengths
  • Get codewords by walking from the root of the
    tree to the leaves representing the characters
  • No codeword is a prefix of any other codeword in
    this algorithm

39
Huffman Encoding: Creating the Tree
[Figure: the finished tree (root 25, internal nodes 15, 10, and 5) with each edge labeled 0 or 1.]
Codewords: E 01, I 00, C 10, A 111, H 110
The length of a codeword depends on the frequency of the character.
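
The codewords listed for this tree can be reproduced by a short recursive walk that records one bit per edge. The `Node` class and the hand-built `slideTree()` below are assumptions made for the sketch; the lab's HuffmanNode API may look different.

```java
import java.util.Map;
import java.util.TreeMap;

// Node is a hypothetical minimal tree class; the course's HuffmanNode API may differ.
class CodewordSketch {
    static class Node {
        final char value;        // meaningful only for leaves
        final Node zero, one;    // children reached by bit 0 and bit 1

        Node(char value) {
            this.value = value;
            this.zero = null;
            this.one = null;
        }

        Node(Node zero, Node one) {
            this.value = '\0';
            this.zero = zero;
            this.one = one;
        }

        boolean isLeaf() {
            return zero == null && one == null;
        }
    }

    // The example tree, built by hand: I=00, E=01, C=10, H=110, A=111.
    static Node slideTree() {
        Node ie = new Node(new Node('I'), new Node('E'));
        Node ha = new Node(new Node('H'), new Node('A'));
        Node cha = new Node(new Node('C'), ha);
        return new Node(ie, cha);
    }

    // Appends a bit per edge on the way down; a leaf's path is its codeword.
    static void collect(Node n, String path, Map<Character, String> out) {
        if (n.isLeaf()) {
            out.put(n.value, path);
            return;
        }
        collect(n.zero, path + "0", out);
        collect(n.one, path + "1", out);
    }

    static Map<Character, String> codewords(Node root) {
        Map<Character, String> out = new TreeMap<Character, String>();
        collect(root, "", out);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(codewords(slideTree()));
        // prints {A=111, C=10, E=01, H=110, I=00}
    }
}
```

Because characters sit only at leaves, no path to one leaf can pass through another, which is why no codeword is a prefix of another.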
40
Huffman Encoding: Overview
  • Decompression
  • Takes in information to rebuild tree used in
    creating compressed file (we handle)
  • Converts the compressed file into original file
    by walking down tree from root to the leaves
    representing characters.
  • Outputs results

41
Huffman Encoding: Decoding from Tree
[Figure: the finished tree with 0/1 edge labels. Bit string to decode: 1111001. Nothing decoded yet.]
42
Huffman Encoding: Decoding from Tree
[Figure: the same tree, next step of the walk. Bit string: 1111001. Nothing decoded yet.]
43
Huffman Encoding: Decoding from Tree
[Figure: the same tree, next step of the walk. Bit string: 1111001. Nothing decoded yet.]
44
Huffman Encoding: Decoding from Tree
[Figure: the same tree. Bit string: 1111001. Decoded so far: A]
45
Huffman Encoding: Decoding from Tree
[Figure: the same tree, next step of the walk. Bit string: 1111001. Decoded so far: A]
46
Huffman Encoding: Decoding from Tree
[Figure: the same tree. Bit string: 1111001. Decoded so far: AC]
47
Huffman Encoding: Decoding from Tree
[Figure: the same tree, next step of the walk. Bit string: 1111001. Decoded so far: AC]
48
Huffman Encoding: Decoding from Tree
[Figure: the same tree. Bit string: 1111001. Decoded so far: ACE]
49
Huffman Encoding: Decoding from Tree
[Figure: the same tree. Bit string: 1111001. Decoding complete: ACE]
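
The decoding walk shown in the preceding slides can be sketched as below. As a simplification, the bits are passed as a String of '0' and '1' characters rather than unpacked from the bytes of a compressed file, and the `Node` class and hand-built tree are assumptions made for this example.

```java
// Node is a hypothetical minimal tree class; the course's HuffmanNode API may differ.
class DecodeSketch {
    static class Node {
        final char value;        // meaningful only for leaves
        final Node zero, one;    // children reached by bit 0 and bit 1

        Node(char value) {
            this.value = value;
            this.zero = null;
            this.one = null;
        }

        Node(Node zero, Node one) {
            this.value = '\0';
            this.zero = zero;
            this.one = one;
        }

        boolean isLeaf() {
            return zero == null && one == null;
        }
    }

    // The example tree, built by hand: I=00, E=01, C=10, H=110, A=111.
    static Node slideTree() {
        Node ie = new Node(new Node('I'), new Node('E'));
        Node ha = new Node(new Node('H'), new Node('A'));
        Node cha = new Node(new Node('C'), ha);
        return new Node(ie, cha);
    }

    // Follows one edge per bit; at a leaf, emits its character and restarts at the root.
    static String decode(Node root, String bits) {
        StringBuilder out = new StringBuilder();
        Node current = root;
        for (int i = 0; i < bits.length(); i++) {
            current = bits.charAt(i) == '0' ? current.zero : current.one;
            if (current.isLeaf()) {
                out.append(current.value);
                current = root;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(decode(slideTree(), "1111001")); // prints ACE
    }
}
```

No separators are needed between codewords: reaching a leaf is what marks the end of one character and the start of the next.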
50
Huffman Encoding: Resources
  • Once more, this is linked to homework 2
  • http://www.cs.duke.edu/csed/poop/huff/info/
  • If you have any questions about this or any other
    part of your homework, come to any of our office
    hours or e-mail the TAs

51
Arrays
52
Arrays
  • A way of organizing multiple objects of the same
    type that are logically connected
  • int[] charFreq = new int[size]; // size is an int
  • Can access an element of an array quickly and
    easily
  • charFreq[200] refers to the element at index 200
    of the array
  • Arrays run from 0 to size - 1, so you can access
    charFreq[i] for 0 <= i < size
  • Say size = 256; trying to access charFreq[256] is
    VERY BAD
  • You will get lots of problems in your code from
    this if you are not careful
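
A short sketch of what going out of bounds looks like in practice. The `inBounds` helper is an invented convenience for this example; Java itself performs the check on every access and throws ArrayIndexOutOfBoundsException when it fails.

```java
class ArrayBoundsSketch {
    // Returns true if index is a legal position in the array.
    static boolean inBounds(int[] array, int index) {
        return index >= 0 && index < array.length;
    }

    public static void main(String[] args) {
        int size = 256;
        int[] charFreq = new int[size];  // valid indices are 0..255
        charFreq[200]++;
        System.out.println(charFreq[200]);            // 1
        System.out.println(inBounds(charFreq, 255));  // true
        System.out.println(inBounds(charFreq, 256));  // false
        try {
            charFreq[size] = 1;          // one past the end
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("out of bounds");
        }
    }
}
```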

53
Arrays
  • If you have an array of objects you can access
    them by selecting an element of the array and
    then calling the method you want
  • Ex. Want the frequency of element 127 in an array
    of TreeLeafs called Leaves
  • Leaves[127].getFrequency() // returns the value
    you are looking for
  • Once again, be very careful about going out of
    bounds when looking at array elements

54
Arguments
55
So that's what it does
  • Ever wonder why you always have to write
    main(String[] args) instead of just main()?
    It's because args is an array containing all the
    arguments passed to your program from the
    command line. If no arguments are passed, args
    just has length 0.

56
Making good use of your new knowledge
  • public class MidgetSearch {
  •   public static void main(String[] args) {
  •     for (int i = 0; i < args.length; i++)
  •       System.out.println(findMidget(args[i]));
  •   }
  • }
  • > java MidgetSearch Bridget Larry Dan
  • Bridget the midget is putting on a show in
    Ricketts
  • Larry the midget is trapped in a well in Rhode
    Island
  • Dan isn't really a midget and therefore cannot be
    located