Evolutionary Computing - PowerPoint PPT Presentation

Title: Evolutionary Computing Author: Marius Schamschula Last modified by: localAdmin Created Date: 7/27/1999 6:26:18 PM Document presentation format – PowerPoint PPT presentation

Slides: 38
Provided by: MariusSch7

Transcript and Presenter's Notes

1
CS4413 Matching Algorithms (These materials
are used in the classroom only)
2
Two important concepts
  • Finite automata.
  • Character strings.

3
String Matching with Finite Automata
  • Finite Automata
  • M = (Q, q0, A, Σ, δ)
  • Q: a finite set of states
  • q0 ∈ Q: the initial state
  • A ⊆ Q: the set of accepting states
  • Σ: the input alphabet
  • δ: the transition function from Q × Σ to Q
  • M accepts (or rejects) an input string
  • M induces a final-state function φ from Σ* to Q
  • M scans the string w and ends in state φ(w);
    it accepts w if and only if φ(w) ∈ A

4
Simple Automata
A simple two-state finite automaton with state
set Q = {0, 1}, start state q0 = 0, and input
alphabet Σ = {a, b}.
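
A minimal sketch of how such an automaton scans a string and computes its final state. The state set, start state and alphabet follow the slide; the individual transitions and the accepting set {1} are assumptions chosen for illustration, since the transcript does not list them.

```python
# Two-state automaton over the alphabet {a, b}.
# ASSUMPTION: the transitions and the accepting set below are illustrative;
# only Q = {0, 1}, q0 = 0 and the alphabet come from the slide.
DELTA = {
    (0, "a"): 1, (0, "b"): 0,
    (1, "a"): 0, (1, "b"): 0,
}
START = 0
ACCEPTING = {1}


def final_state(w, delta=DELTA, start=START):
    """Return phi(w): the state reached after scanning every character of w."""
    state = start
    for ch in w:
        state = delta[(state, ch)]
    return state


def accepts(w):
    """M accepts w exactly when phi(w) is an accepting state."""
    return final_state(w) in ACCEPTING


if __name__ == "__main__":
    for w in ["abaaa", "abbaa", ""]:
        print(repr(w), "accepted" if accepts(w) else "rejected")
```

With these assumed transitions the automaton accepts exactly the strings that end in an odd number of a's.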
5
Automata Example
6
String Matching
  • PROBLEM: find an occurrence of a given
    substring, called the pattern, in another string,
    called the text.
  • Applications
  • Text processing of character strings.
  • Matching a string of bytes containing
    graphical data or machine code.
  • Scanning files for computer virus signatures.
  • Searching for particular patterns in DNA sequences.

7
Notation
  • T: the text in which we search for a pattern.
  • n: the length of the text T.
  • P: the pattern being searched for.
  • m: the length of the pattern P.
  • P[i], T[i]: the i-th characters of P and T,
    respectively.

8
String Matching
  • We formalize the string-matching problem as
    follows.
  • We assume that the text is an array T[1..n] of
    length n and that the pattern is an array
    P[1..m] of length m.
  • We further assume that the elements of P and T
    are characters drawn from a finite alphabet Σ.
    For example, we may have Σ = {0, 1} or
    Σ = {a, b, ..., z}. The character arrays P and T
    are often called strings of characters.

9
String Matching
  • We say that pattern P occurs with shift s in text
    T (or, equivalently, that pattern P occurs
    beginning at position s + 1 in text T) if
    0 ≤ s ≤ n - m and T[s+1..s+m] = P[1..m] (that is,
    if T[s+j] = P[j] for 1 ≤ j ≤ m).
  • If P occurs with shift s in T, then we call s a
    valid shift; otherwise, we call s an invalid shift.
    The string-matching problem is the problem of
    finding all valid shifts with which a given
    pattern P occurs in a given text T.

10
String Matching
  • Finite Automata
  • A finite automaton M is a 5-tuple (Q, q0, A, Σ,
    δ), where
  • Q is a finite set of states
  • q0 ∈ Q is the start state
  • A ⊆ Q is a distinguished set of accepting states
  • Σ is a finite input alphabet
  • δ is a function from Q × Σ into Q, called the
    transition function of M.

11
Simple String Matching
  • INPUT: P of length m and T of length n.
  • PRECONDITION: P is nonempty.
  • OUTPUT: The index in T where a copy of P begins,
    or -1 if no match for P is found.

12
Naïve String-Matching Algorithm
  • The naïve algorithm finds all valid shifts using
    a loop that checks the condition
    P[1..m] = T[s+1..s+m] for each of the n - m + 1
    possible values of s.

13
Naïve String-Matching Algorithm
  • Naïve-String-Matcher(T, P)
  • n ← length[T]
  • m ← length[P]
  • for j ← 0 to n - m
  •     compare T[j+1], T[j+2], ..., T[j+m] to
  •     P[1], P[2], ..., P[m]
  •     if all m characters match
  •         return j   // or print "pattern occurs
    with shift j"
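
The pseudocode above can be sketched in Python as follows. This is an illustrative version that uses Python's 0-based indexing (so shift s compares `text[s:s+m]` with the pattern) and returns the first valid shift, or -1 if there is none; the function name is hypothetical.

```python
def naive_match(text, pattern):
    """Return the first valid shift of pattern in text, or -1 if none exists."""
    n, m = len(text), len(pattern)
    for s in range(n - m + 1):              # candidate shifts 0 .. n-m
        j = 0
        # Compare left to right until a mismatch or the pattern is exhausted.
        while j < m and text[s + j] == pattern[j]:
            j += 1
        if j == m:                          # all m characters matched
            return s
    return -1


if __name__ == "__main__":
    print(naive_match("abcabaabcabac", "abaa"))
    print(naive_match("aaaa", "b"))
```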

14
Examples
  • Example: How many comparisons (both successful
    and unsuccessful) will be made by the brute-force
    string-matching algorithm in searching for each
    of the following patterns in a binary text of
    1000 zeros?
  • 00001
  • 10000
  • 01010
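
One way to check such counts is to instrument the brute-force matcher. The sketch below is illustrative (the helper name is hypothetical); it counts every character comparison and stops at the first full match, if any.

```python
def count_comparisons(text, pattern):
    """Count character comparisons made by brute-force matching.

    Stops counting at the first full match; if there is no match,
    every shift 0 .. n-m is tried.
    """
    n, m = len(text), len(pattern)
    comparisons = 0
    for s in range(n - m + 1):
        for j in range(m):
            comparisons += 1
            if text[s + j] != pattern[j]:
                break                       # mismatch: move to the next shift
        else:
            return comparisons              # inner loop finished: match found
    return comparisons


if __name__ == "__main__":
    text = "0" * 1000
    for pattern in ["00001", "10000", "01010"]:
        print(pattern, count_comparisons(text, pattern))
```

None of the three patterns occurs in the text, so all 996 shifts are tried: 5 comparisons per shift for 00001 (4980 in total), 1 per shift for 10000 (996), and 2 per shift for 01010 (1992).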

15
Worst Case
  • The worst case happens when, at every shift, the
    first m-1 characters match and the last one does
    not.
  • P: a a a b
  • T: a a a a a a a a a a a a a a a a a a a a a a b
  • Θ((n - m + 1) m) comparisons in the worst case.

16
Analysis
  • The worst case is not one that occurs often in
    natural language text.
  • Empirical studies show that the algorithm made
    only about 1.1 comparisons for each character in T
    (up to the point where a match was found).

17
Analysis
  • The naïve string-matcher is inefficient because
    information gained about the text for one value
    of s is totally ignored in considering other
    values of s.
  • Such information can be very valuable, however.
  • For example, if P = aaab and we find that s = 0
    is valid, then none of the shifts 1, 2, or 3 are
    valid, since T[4] = b.

18
Input Enhancement in String Matching
  • The Knuth-Morris-Pratt algorithm compares the
    pattern with the text left to right.
  • The Boyer-Moore algorithm compares right to
    left; a simplified version of it is Horspool's
    algorithm.

19
Horspool's Algorithm
  • Example (c is the text character aligned with the
    last character of the pattern)
    s0 ...           c ... sn-1
           B A R B E R
  • Case 1: if there are no c's in the pattern (e.g.,
    c is the letter S in our example), we can shift
    the pattern by its entire length.
    s0 ...           S ... sn-1
           B A R B E R
                       B A R B E R

20
Horspool's Algorithm (contd.)
  • Case 2: if there are occurrences of character c
    in the pattern but it is not the last one there
    (e.g., c is the letter B in our example), the
    shift should align the rightmost occurrence of c
    in the pattern with the c in the text.
    s0 ...           B ... sn-1
           B A R B E R
               B A R B E R

21
Horspool's Algorithm (contd.)
  • Case 3: if c happens to be the last character in
    the pattern but there are no c's among its other
    m-1 characters, the shift should be the entire
    pattern length m.
    s0 ...       M E R ... sn-1
           L E A D E R
                       L E A D E R

22
Horspool's Algorithm (contd.)
  • Case 4: finally, if c happens to be the last
    character in the pattern and there are other c's
    among its first m-1 characters, the shift should
    be such that the rightmost occurrence of c among
    the first m-1 characters is aligned with the
    text's c.
    s0 ...           O R ... sn-1
           R E O R D E R
                 R E O R D E R

23
Horspool's Algorithm (contd.)
  • Compute the shift values as follows:
  • t(c) = the pattern's length m, if c is not among
    the first m-1 characters of the pattern
  • t(c) = the distance from the rightmost c among
    the first m-1 characters of the pattern to
    its last character, otherwise
  • ALGORITHM ShiftTable(P[0..m-1])
  • // Fills the shift table used by Horspool's
    and Boyer-Moore algorithms
  • // Input: Pattern P[0..m-1] and an alphabet
    of possible characters
  • // Output: Table[0..size-1] indexed by the
    alphabet's characters and
  • //         filled with shift sizes
    computed by formula (7.1)
  • initialize all the elements of Table with m
  • for j ← 0 to m-2 do Table[P[j]] ← m-1-j
  • return Table
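
A minimal Python sketch of ShiftTable. As an assumption made for brevity, it stores the table in a dictionary that holds only the characters among the first m-1 pattern positions; every other character implicitly gets the default shift m through the hypothetical `shift_for` helper, rather than through an array indexed by the whole alphabet.

```python
def shift_table(pattern):
    """Map each character among the first m-1 pattern positions to its shift."""
    m = len(pattern)
    table = {}
    for j in range(m - 1):                  # j = 0 .. m-2
        # Later (more rightward) occurrences overwrite earlier ones,
        # so the stored value is the distance from the rightmost occurrence.
        table[pattern[j]] = m - 1 - j
    return table


def shift_for(table, pattern, c):
    """Shift for text character c: the table entry, or m if c is absent."""
    return table.get(c, len(pattern))


if __name__ == "__main__":
    table = shift_table("BARBER")
    print(table)
    print(shift_for(table, "BARBER", "S"))
```

For the pattern BARBER this gives shifts of 4 for A, 2 for B, 1 for E, 3 for R, and 6 for every other character.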

24
Horspool's Algorithm (contd.)
  • Horspool's algorithm
  • Step 1: For a given pattern of length m and the
    alphabet used in both the pattern and text,
    construct the shift table as described above.
  • Step 2: Align the pattern against the beginning
    of the text.
  • Step 3: Repeat the following until either a
    matching substring is found or the pattern
    reaches beyond the last character of the text.
    Starting from the last character in the pattern,
    compare the corresponding characters in the
    pattern and text until either all m characters
    are matched or a mismatching pair is encountered.
    On a mismatch, look up the table entry t(c) for
    the text character c aligned with the pattern's
    last character and shift the pattern t(c)
    characters to the right.

25
Horspool's Algorithm (contd.)
  • ALGORITHM HorspoolMatching(P[0..m-1], T[0..n-1])
  • // Implements Horspool's algorithm for string
    matching
  • // Input: Pattern P[0..m-1] and text
    T[0..n-1]
  • // Output: The index of the left end of the
    first matching
  • //         substring or -1 if there are
    no matches
  • ShiftTable(P[0..m-1])   // generate Table of
    shifts
  • i ← m-1                 // position of
    the pattern's right end
  • while i ≤ n-1 do
  •     k ← 0               // number of
    matched characters
  •     while k ≤ m-1 and P[m-1-k] = T[i-k] do
  •         k ← k+1
  •     if k = m
  •         return i-m+1
  •     else i ← i + Table[T[i]]
  • return -1
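
The pseudocode translates almost line for line into Python. The sketch below is illustrative: it builds the shift table inline as a dictionary (an assumption, with absent characters defaulting to a shift of m), and the sample text in the usage lines is made up for the demonstration.

```python
def horspool_matching(pattern, text):
    """Return the index of the left end of the first match, or -1 if none."""
    m, n = len(pattern), len(text)
    # Shift table over the first m-1 characters; other characters shift by m.
    table = {pattern[j]: m - 1 - j for j in range(m - 1)}
    i = m - 1                               # text index under the pattern's right end
    while i <= n - 1:
        k = 0                               # number of matched characters
        while k <= m - 1 and pattern[m - 1 - k] == text[i - k]:
            k += 1
        if k == m:
            return i - m + 1
        i += table.get(text[i], m)          # shift by the entry for text[i]
    return -1


if __name__ == "__main__":
    print(horspool_matching("BARBER", "JIM_SAW_ME_IN_A_BARBERSHOP"))
    print(horspool_matching("BARBER", "BARBED"))
```

Note that the shift always uses the text character under the pattern's last position, regardless of where the mismatch occurred.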

26
Horspool's Algorithm
  • Exercise
  • Apply Horspool's algorithm to search for the
    pattern BAOBAB in the text
  • BESS_KNEW_ABOUT_BAOBABS

27
Horspool's Algorithm
  • Exercise: Consider the problem of searching for
    genes in DNA sequences using Horspool's
    algorithm. A DNA sequence is represented by a
    text on the alphabet {A, C, G, T}, and the gene
    or gene segment is the pattern.
  • (a) Construct the shift table for the following
    gene segment of your chromosome 10: TCCTATTCTT
  • (b) Apply Horspool's algorithm to locate the
    pattern in the following DNA sequence:
  • TTATAGATCTCGTATTCTTTTATAGATCTCCTATTCTT

28
Horspool's Algorithm
  • Exercise
  • How many character comparisons will be made
    by Horspool's algorithm in searching for the
    following patterns in a binary text of 1000
    zeros?
  • 00001
  • 10000
  • 01010

29
Prestructuring
  • Hashing and B-trees are examples of
    prestructuring.
  • In general, a hash function needs to satisfy two
    somewhat conflicting requirements:
  • 1) A hash function needs to distribute keys among
    the cells of the hash table as evenly as
    possible.
  • 2) A hash function has to be easy to compute.

30
Hashing
  • Hashing
  • Hash Table
  • Hash Function
  • Hash Address
  • Collisions
  • Open Hashing (Separate Chaining)
  • Closed Hashing (Open Addressing): all keys are
    stored in the hash table itself, which implies
    that the table size m must be at least as large
    as the number of keys n.
  • Example: linear probing checks the cell
    following the one where the collision occurs.

31
Hashing
Letter codes: A = 1, B = 2, C = 3, D = 4, ..., Z = 26
Hash function: (sum of the key's letter codes) mod 13

keys:           A  FOOL  AND  HIS  MONEY  ARE  SOON  PARTED
hash addresses: 1  9     6    10   7      11   11    12

Table cells:
0       1  2  3  4  5  6    7      8  9     10   11    12
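
The hash addresses listed above can be reproduced with a few lines of Python. This is a sketch that assumes keys consist of uppercase letters A to Z only; the function name `letter_hash` is hypothetical.

```python
def letter_hash(key, table_size=13):
    """Sum the letter codes (A = 1, ..., Z = 26) of key, modulo table_size."""
    return sum(ord(ch) - ord("A") + 1 for ch in key) % table_size


if __name__ == "__main__":
    keys = "A FOOL AND HIS MONEY ARE SOON PARTED".split()
    for key in keys:
        print(key, letter_hash(key))
```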

32
Open Hashing

keys:           A  FOOL  AND  HIS  MONEY  ARE  SOON  PARTED
hash addresses: 1  9     6    10   7      11   11    12

0       1  2  3  4  5  6    7      8  9     10   11    12
        A              AND  MONEY     FOOL  HIS  ARE   PARTED
                                                 SOON
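
A minimal sketch of open hashing (separate chaining) for the same keys, with each cell holding a Python list as its chain. The function names are hypothetical, and the hash function is the letter-code sum mod 13 used on the slide.

```python
def letter_hash(key, table_size=13):
    """Sum the letter codes (A = 1, ..., Z = 26) of key, modulo table_size."""
    return sum(ord(ch) - ord("A") + 1 for ch in key) % table_size


def build_open_table(keys, table_size=13):
    """Insert keys into a table of chains; colliding keys share a cell."""
    table = [[] for _ in range(table_size)]
    for key in keys:
        table[letter_hash(key, table_size)].append(key)
    return table


def search_open(table, key):
    """Return the number of key comparisons made while searching for key."""
    chain = table[letter_hash(key, len(table))]
    for comparisons, stored in enumerate(chain, start=1):
        if stored == key:
            return comparisons
    return len(chain)                       # unsuccessful: whole chain scanned


if __name__ == "__main__":
    table = build_open_table("A FOOL AND HIS MONEY ARE SOON PARTED".split())
    for cell, chain in enumerate(table):
        print(cell, chain)
    print(search_open(table, "SOON"))
```

ARE and SOON both hash to cell 11, so finding SOON takes two comparisons while every other key takes one.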
33
Closed Hashing
keys:           A  FOOL  AND  HIS  MONEY  ARE  SOON  PARTED
hash addresses: 1  9     6    10   7      11   11    12

0       1  2  3  4  5  6    7      8  9     10   11    12
        A
        A                             FOOL
        A              AND            FOOL
        A              AND            FOOL  HIS
        A              AND  MONEY     FOOL  HIS
        A              AND  MONEY     FOOL  HIS  ARE
        A              AND  MONEY     FOOL  HIS  ARE   SOON
PARTED  A              AND  MONEY     FOOL  HIS  ARE   SOON
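
The insertions traced above can be sketched with linear probing as follows. The function names are hypothetical, empty cells are represented by `None`, and the hash function is the letter-code sum mod 13 used on the slide.

```python
def letter_hash(key, table_size=13):
    """Sum the letter codes (A = 1, ..., Z = 26) of key, modulo table_size."""
    return sum(ord(ch) - ord("A") + 1 for ch in key) % table_size


def build_closed_table(keys, table_size=13):
    """Insert keys with linear probing; None marks an empty cell."""
    table = [None] * table_size
    for key in keys:
        if None not in table:
            raise OverflowError("table is full: m must be at least n")
        i = letter_hash(key, table_size)
        while table[i] is not None:         # collision: try the next cell
            i = (i + 1) % table_size        # wrap around past the last cell
        table[i] = key
    return table


if __name__ == "__main__":
    table = build_closed_table("A FOOL AND HIS MONEY ARE SOON PARTED".split())
    for cell, key in enumerate(table):
        print(cell, key)
```

SOON collides with ARE in cell 11 and lands in cell 12; PARTED then finds cell 12 occupied and wraps around to cell 0.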

34
Hashing
  • If the hash function distributes n keys among the
    m cells of the hash table evenly, each list will
    be about n/m keys long.
  • Load factor: α = n/m
  • Efficiency of hashing (Open Hashing)
  • Efficiency of hashing (Closed Hashing)
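
For reference, the standard textbook estimates of the average number of probes in successful (S) and unsuccessful (U) searches, as functions of the load factor α, are sketched below. They assume a hash function that spreads keys uniformly, and the closed-hashing figures are the usual approximations for linear probing.

```latex
% Open hashing (separate chaining)
S \approx 1 + \frac{\alpha}{2}, \qquad U = \alpha

% Closed hashing (linear probing)
S \approx \frac{1}{2}\left(1 + \frac{1}{1-\alpha}\right), \qquad
U \approx \frac{1}{2}\left(1 + \frac{1}{(1-\alpha)^{2}}\right)
```

Both closed-hashing estimates grow without bound as α approaches 1, which is why the table is usually kept well below full.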

35
Hashing
  • Exercise: For the input 30, 20, 56, 75, 31, 19
    and hash function h(K) = K mod 11
  • (a) Construct the open hash table.
  • (b) Find the largest number of key comparisons in
    a successful search in this table.
  • (c) Find the average number of key comparisons in
    a successful search in this table.

36
Hashing
  • Exercise: For the input 30, 20, 56, 75, 31, 19
    and hash function h(K) = K mod 11
  • (a) Construct the closed hash table.
  • (b) Find the largest number of key comparisons in
    a successful search in this table.
  • (c) Find the average number of key comparisons in
    a successful search in this table.

37
END
  • End of Chapter 5.