Introduction to Perl Part II

Learn more at: https://www.d.umn.edu

1
Introduction to Perl Part II
  • By Bridget Thomson McInnes
  • 22 January 2004

2
File Handles
  • Very simple compared to C/C++!
  • Are not prefixed with a symbol ($, @, %, etc.)
  • Opening a File
  • open (SRC, "my_file.txt");
  • Reading from a File
  • $line = <SRC>;   # reads up to a newline character
  • Closing a File
  • close (SRC);

3
File Handles cont...
  • Opening a file for output
  • open (DST, ">my_file.txt");
  • Opening a file for appending
  • open (DST, ">>my_file.txt");
  • Writing to a file
  • print DST "Printing my first line.\n";
  • Safeguarding against opening a nonexistent file
  • open (SRC, "file.txt") || die "Could not open file.\n";
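
The open modes above can be tried end to end with a short sketch. It is an illustration only: the scratch file name is hypothetical, and it uses the three-argument form of open with lexical file handles rather than the bareword handles on the slides.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical scratch file name, used only for this illustration.
my $file = "sketch_$$.txt";

# '>' opens for output, creating or truncating the file.
open(my $dst, '>', $file) or die "Could not open $file: $!\n";
print $dst "Printing my first line.\n";
close($dst);

# '>>' opens for appending, so the first line is kept.
open($dst, '>>', $file) or die "Could not open $file: $!\n";
print $dst "Printing my second line.\n";
close($dst);

# '<' opens for reading; in list context <$src> returns every line.
open(my $src, '<', $file) or die "Could not open $file: $!\n";
my @lines = <$src>;
close($src);
unlink $file;    # remove the scratch file again

print scalar(@lines), " lines read\n";
```

Using "or die" with $! reports why the open failed, which the plain message on the slide does not.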

4
File Test Operators
  • Check to see if a file exists
  • if ( -e "file.txt" ) {
  •     # The file exists!
  • }
  • Other file test operators
  • -r readable
  • -x executable
  • -d is a directory
  • -T is a text file
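
As a rough sketch, the file test operators can be combined into one helper. The sub name describe_path and the labels it returns are made up for this example.

```perl
use strict;
use warnings;

# Describe a path using the file test operators from the slide.
# describe_path is a hypothetical helper, not a built-in.
sub describe_path {
    my ($path) = @_;
    return "missing"   unless -e $path;    # does it exist?
    return "directory" if -d $path;        # is it a directory?
    my @traits;
    push @traits, "readable"   if -r $path;
    push @traits, "executable" if -x $path;
    push @traits, "text"       if -T $path;
    return join(", ", @traits) || "file";
}

print describe_path("."), "\n";
print describe_path("no_such_file_$$.txt"), "\n";
```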

5
Quick Program with File Handles
  • Program to copy a file to a destination file
  • #!/usr/local/bin/perl -w
  • open(SRC, "file.txt") || die "Could not open source file.\n";
  • open(DST, ">newfile.txt");
  • while ( $line = <SRC> ) {
  •     print DST $line;
  • }
  • close SRC;
  • close DST;

6
Some Default File Handles
  • STDIN: Standard Input
  • $line = <STDIN>;   # takes input from stdin
  • STDOUT: Standard Output
  • print STDOUT "File handling in Perl is sweet!\n";
  • STDERR: Standard Error
  • print STDERR "Error!!\n";

7
The <> File Handle
  • The empty file handle takes the command line
    file(s) or STDIN
  • $line = <>;
  • If the program is run as ./prog.pl file.txt, this will
    automatically open file.txt and read the first
    line.
  • If the program is run as ./prog.pl file1.txt file2.txt,
    this will first read in file1.txt and then
    file2.txt ... you will not know when one ends and
    the other begins.

8
The <> File Handle cont...
  • If the program is run as ./prog.pl, the program will
    wait for you to enter text at the prompt, and
    will continue until you enter the EOF character
  • CTRL-D in UNIX
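
A minimal sketch of how the empty file handle walks through several files. It assumes two hypothetical scratch files in place of file1.txt and file2.txt, and it sets @ARGV by hand to imitate command-line arguments.

```perl
use strict;
use warnings;

# Count the lines delivered by the empty file handle <>.
# <> reads each file named in @ARGV in turn (or STDIN if @ARGV is empty).
sub count_lines {
    local @ARGV = @_;    # pretend these were the command-line arguments
    my $count = 0;
    while (my $line = <>) {
        $count++;        # lines from all files arrive as one stream
    }
    return $count;
}

# Two hypothetical scratch files stand in for file1.txt and file2.txt.
my @files = ("sketch_a_$$.txt", "sketch_b_$$.txt");
for my $f (@files) {
    open(my $out, '>', $f) or die "Could not open $f: $!\n";
    print $out "line one\nline two\n";
    close($out);
}

my $total = count_lines(@files);
unlink @files;
print "$total lines\n";
```

The loop body never learns which file a line came from, which is the behaviour the slide warns about.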

9
Example Program with STDIN
  • Suppose you want to determine if you are one of
    the three stooges
  • #!/usr/local/bin/perl
  • %stooges = (larry => 1, moe => 1, curly => 1);
  • print "Enter your name? ";
  • $name = <STDIN>; chomp $name;
  • if ($stooges{lc($name)}) {
  •     print "You are one of the Three Stooges!!\n";
  • } else {
  •     print "Sorry, you are not a Stooge!!\n";
  • }

10
Chomp and Chop
  • Chomp: function that deletes a trailing newline
    from the end of a string.
  • $line = "this is the first line of text\n";
  • chomp $line;   # removes the newline character
  • print $line;   # prints "this is the first line of text" with no trailing newline
  • Chop: function that chops off the last character
    of a string.
  • $line = "this is the first line of text";
  • chop $line;
  • print $line;   # prints "this is the first line of tex"
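
The difference between the two functions is easiest to see side by side. This is a small sketch; the variable names are illustrative.

```perl
use strict;
use warnings;

my $with_newline    = "this is the first line of text\n";
my $without_newline = "this is the first line of text";

# chomp removes a trailing newline only if one is present.
chomp(my $chomped   = $with_newline);
chomp(my $untouched = $without_newline);

# chop always removes the last character, whatever it is.
my $chopped = $without_newline;
chop $chopped;

print "[$chomped]\n";      # [this is the first line of text]
print "[$untouched]\n";    # [this is the first line of text]
print "[$chopped]\n";      # [this is the first line of tex]
```

Because chomp leaves a string without a newline alone, it is the safer choice for cleaning up input lines.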

11
Regular Expressions
  • What are regular expressions? A few
    definitions.
  • A regular expression specifies a class of
    strings: the strings that belong to the formal
    (regular) language it defines
  • In other words, a formula for matching strings
    that follow a specified pattern.
  • Some things you can do with regular expressions
  • Parse the text
  • Add and/or replace subsections of text
  • Remove pieces of the text

12
Regular Expressions cont..
  • A regular expression characterizes a regular
    language
  • Examples in UNIX
  • ls *.c
  • Lists all the files in the current directory
    whose names end in '.c'
  • ls *.txt
  • Lists all the files in the current directory
    whose names end in '.txt'

13
Simple Example for Clarity
  • In the simplest form, a regular expression is a
    string of characters that you are looking for
  • We want to find all the words that contain the
    string 'ing' in our text.
  • The regular expression we would use
  • /ing/

14
Simple Example cont...
  • What would our program then look like?
  • #!/usr/local/bin/perl
  • while (<>) {
  •     chomp;
  •     @words = split / /;
  •     foreach $word (@words) {
  •         if ($word =~ m/ing/) { print "$word\n"; }
  •     }
  • }

15
Regular Expressions Types
  • Regular expressions are composed of two types of
    characters
  • Literals
  • Normal text characters
  • Like what we saw in the previous program (
    /ing/ )
  • Metacharacters
  • special characters
  • Add a great deal of flexibility to your search

16
Metacharacters
  • Match more than just characters
  • Match line position
  • ^ start of a line (caret)
  • $ end of a line (dollar sign)
  • Match any characters in a list: [ ... ]
  • Example
  • /[Bb]ridget/ matches Bridget or bridget
  • /Mc[Ii]nnes/ matches McInnes or Mcinnes
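
A short sketch combining the line anchors with the character lists above. The sub names and sample lines are invented for illustration.

```perl
use strict;
use warnings;

# ^ anchors the pattern to the start of the string, $ to the end.
sub starts_with_bridget { return $_[0] =~ /^[Bb]ridget/ ? 1 : 0 }
sub ends_with_mcinnes   { return $_[0] =~ /Mc[Ii]nnes$/ ? 1 : 0 }

for my $line ("Bridget McInnes", "bridget mcinnes", "Hello Bridget") {
    printf "%-16s start: %d  end: %d\n",
        $line, starts_with_bridget($line), ends_with_mcinnes($line);
}
```

Note that [Ii] only relaxes the case of one letter: "mcinnes" with a lower-case m is still rejected by /Mc[Ii]nnes$/.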

17
Our Simple Example Revisited
  • Now suppose we only want to match words that end
    in 'ing' rather than just contain 'ing'.
  • How would we change our regular expression to
    accomplish this?
  • Previous Regular Expression
  • $word =~ m/ing/
  • New Regular Expression
  • $word =~ m/ing$/


18
Ranges of Regular Expressions
  • Ranges can be specified in Regular Expressions
  • Valid Ranges
  • [A-Z] Upper Case Roman Alphabet
  • [a-z] Lower Case Roman Alphabet
  • [A-Za-z] Upper or Lower Case Roman Alphabet
  • [A-F] Upper Case A through F Roman
    Characters
  • [A-z] Valid, but be careful: in ASCII it also
    matches the characters between Z and a ( [ \ ] ^ _ ` )
  • Invalid Ranges
  • [a-Z] Not Valid
  • [F-A] Not Valid

19
Ranges cont ...
  • Ranges of Digits can also be specified
  • [0-9] Valid
  • [9-0] Invalid
  • Negating Ranges
  • /[^0-9]/
  • Match anything except a digit
  • /[^a]/
  • Match anything except an a
  • /^[^A-Z]/
  • Match anything that starts with something
    other than a single upper case
    letter
  • First ^: start of line
  • Second ^: negation
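
The range and negation rules can be checked with a few one-line helpers. This is a sketch with made-up sub names; the [A-z] check assumes ASCII ordering, where the underscore falls between Z and a.

```perl
use strict;
use warnings;

# True if the string contains at least one character that is not a digit.
sub has_non_digit    { return $_[0] =~ /[^0-9]/ ? 1 : 0 }

# True if the first character is something other than an upper-case letter.
# The first ^ anchors to the start; the ^ inside [] negates the range.
sub starts_non_upper { return $_[0] =~ /^[^A-Z]/ ? 1 : 0 }

# True if the string is a single character in the risky range A-z.
sub in_A_to_z        { return $_[0] =~ /^[A-z]$/ ? 1 : 0 }

print has_non_digit("2004"), has_non_digit("22 January"), "\n";
print starts_non_upper("perl"), starts_non_upper("Perl"), "\n";
print in_A_to_z("_"), in_A_to_z("5"), "\n";
```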

20
Our Simple Example Again
  • Now suppose we want to create a list of all the
    words in our text that do not end in 'ing'
  • How would we change our regular expression to
    accomplish this?
  • Previous Regular Expression
  • $word =~ m/ing$/
  • New Regular Expression
  • $word !~ m/ing$/

21
Literal Metacharacters
  • Suppose that you actually want to look for all
    strings that equal '^' in your text
  • Use the \ symbol
  • /\^/ Regular expression to search for ^
  • What does the following Regular Expression
    match?
  • /[A-Z^]\^/
  • Matches any line that contains (A-Z or ^)
    followed by ^

22
Patterns provided in Perl
  • Some Patterns
  • \d [0-9]
  • \w [a-zA-Z0-9_]
  • \s [ \r\t\n\f] (white space pattern)
  • \D [^0-9]
  • \W [^a-zA-Z0-9_]
  • \S [^ \r\t\n\f]
  • Example: /19\d\d/
  • Looks for any year in the 1900s
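
A brief sketch of the predefined patterns at work on one sample sentence. The sentence is invented for this example, and the \b anchors around the year pattern are an addition to the slide's /19\d\d/ so that longer digit runs are not matched.

```perl
use strict;
use warnings;

# Sample text with a tab in it, invented for illustration.
my $text = "Presented 22 January 2004,\tupdating notes from 1999.";

# \d: every four-digit year in the 1900s.
my @years = $text =~ /\b(19\d\d)\b/g;

# \s+: split on any run of white space (spaces, tabs, newlines).
my @words = split /\s+/, $text;

# \w+: runs of word characters, which drops the punctuation.
my @tokens = $text =~ /(\w+)/g;

print "years:  @years\n";
print "words:  ", scalar(@words), "\n";
print "tokens: @tokens\n";
```

Splitting on \s+ keeps punctuation attached ("2004,"), while matching \w+ strips it ("2004").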

23
Using Patterns in our Example
  • Commonly words are not separated by just a single
    space but by tabs, returns, etc.
  • Let's modify our split function to incorporate
    multiple white space
  • #!/usr/local/bin/perl
  • while (<>) {
  •     chomp;
  •     @words = split /\s+/, $_;
  •     foreach $word (@words) {
  •         if ($word =~ m/ing/) { print "$word\n"; }
  •     }
  • }

24
Word Boundary Metacharacter
  • Regular Expression to match the start or the end
    of a 'word': \b
  • Examples
  • /Jeff\b/ Match Jeff but not Jefferson
  • /Carol\b/ Match Carol but not Caroline
  • /Rollin\b/ Match Rollin but not Rolling
  • /\bform/ Match form or formation but not
    Information
  • /\bform\b/ Match form but neither information
    nor formation
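
The boundary examples can be verified with grep over a small word list. This is a sketch; the word list is just the examples from the slide.

```perl
use strict;
use warnings;

my @words = qw(form formation information Jeff Jefferson);

# \b before "form": the word must start with form.
my @starts_form = grep { /\bform/ } @words;

# \b on both sides: the word must be exactly form.
my @exact_form = grep { /\bform\b/ } @words;

# \b after "Jeff": nothing word-like may follow.
my @jeff = grep { /Jeff\b/ } @words;

print "starts with form: @starts_form\n";    # form formation
print "exactly form:     @exact_form\n";     # form
print "Jeff only:        @jeff\n";           # Jeff
```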

25
DOT Metacharacter
  • The DOT Metacharacter, '.', symbolizes any
    character except a new line
  • /b.bble/
  • Would possibly return bobble, babble, bubble
  • /.oat/
  • Would possibly return boat, coat, goat
  • Note: remember '.*' usually means a bunch of
    anything; this can be handy but can also have
    hidden ramifications.

26
PIPE Metacharacter
  • The PIPE Metacharacter is used for alternation
  • /Bridget (Thomson|McInnes)/
  • Match Bridget Thomson or Bridget McInnes but
    NOT Bridget Thomson McInnes as a whole (only
    its first two names are matched)
  • /B|bridget/
  • Match B or bridget
  • /^(B|b)ridget/
  • Match Bridget or bridget at the beginning of a
    line
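
To see exactly what an alternation matches, it helps to print the matched text itself. A minimal sketch, with matched_text as a hypothetical helper built on the $& variable:

```perl
use strict;
use warnings;

# Return the text matched by a pattern, or undef if there is no match.
# matched_text is a hypothetical helper for this illustration.
sub matched_text {
    my ($re, $s) = @_;
    return $s =~ $re ? $& : undef;
}

my $name    = qr/Bridget (Thomson|McInnes)/;
my $loose   = qr/B|bridget/;        # the | splits the whole pattern
my $grouped = qr/^(B|b)ridget/;     # parentheses limit what | applies to

print matched_text($name, "Bridget Thomson McInnes"), "\n";    # Bridget Thomson
print matched_text($loose, "Bob"), "\n";                       # B
print defined matched_text($grouped, "Bob") ? "match" : "no match", "\n";
```

Without the parentheses, /B|bridget/ accepts any string containing a capital B, which is rarely what is intended.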

27
Our Simple Example
  • Now with our example, suppose that we want to not
    only get all words that end in 'ing' but also
    'ed'.
  • How would we change our regular expression to
    accomplish this?
  • Previous Regular Expression
  • $word =~ m/ing$/
  • New Regular Expression
  • $word =~ m/(ing|ed)$/

28
The ? Metacharacter
  • The metacharacter, ?, indicates that the
    character immediately preceding it occurs zero or
    one time
  • Examples
  • /worl?ds/
  • Match either 'worlds' or 'words'
  • /m?ethane/
  • Match either 'methane' or 'ethane'

29
The * Metacharacter
  • The metacharacter, *, indicates that the
    character immediately preceding it occurs zero
    or more times
  • Example
  • /ab*c/ Match 'ac', 'abc', 'abbc', 'abbbc',
    etc.
  • Matches any string that starts with an a, is
    possibly followed by a sequence of b's, and ends
    with a c.
  • Sometimes called Kleene's star
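
A quick sketch comparing the ? and * quantifiers on a list of candidate strings. The candidates are invented, and the patterns are anchored with ^ and $ so that the whole string must fit.

```perl
use strict;
use warnings;

my @candidates = qw(ac abc abbbc adc words worlds worllds);

# b* allows zero or more b's between a and c.
my @star = grep { /^ab*c$/ } @candidates;

# l? allows zero or one l, but not two.
my @optional = grep { /^worl?ds$/ } @candidates;

print "ab*c:    @star\n";        # ac abc abbbc
print "worl?ds: @optional\n";    # words worlds
```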

30
Our Simple Example again
  • Now suppose we want to create a list of all the
    words in our text that end in 'ing' or 'ings'
  • How would we change our regular expression to
    accomplish this?
  • Previous Regular Expression
  • $word =~ m/ing$/
  • New Regular Expression
  • $word =~ m/ings?$/

31
Modifying Text
  • Match
  • Up to this point, we have only attempted to match a
    given regular expression
  • Example: $variable =~ m/regex/
  • Substitution
  • Takes match one step further: if there is a
    match, then replace it with the given string
  • Example: $variable =~ s/regex/replacement/
  • $var =~ s/Thomson/McInnes/;
  • $var =~ s/Bridgette/Bridget/;
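
A small sketch of substitution and its return value. The starting string is an assumption made up for the example.

```perl
use strict;
use warnings;

# Hypothetical starting value for the illustration.
my $var = "Bridgette Thomson";

# s/// returns the number of substitutions made (1 here).
my $changed = ($var =~ s/Bridgette/Bridget/);
$var =~ s/Thomson/McInnes/;

# On a second attempt nothing matches, so s/// returns a false value
# and $var is left unchanged.
my $again = ($var =~ s/Bridgette/Bridget/);

print "$var\n";    # Bridget McInnes
print "first attempt changed something\n" if $changed;
print "second attempt changed nothing\n"  unless $again;
```

Because the return value is true only when a replacement happened, s/// can be used directly as the condition of an if statement.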

32
Substitution Example
  • Suppose when we find all our words that end in
    'ing' we want to replace the 'ing' with 'ed'.
  • #!/usr/local/bin/perl -w
  • while (<>) {
  •     chomp $_;
  •     @words = split /\s+/, $_;
  •     foreach $word (@words) {
  •         if ($word =~ s/ing$/ed/) { print "$word\n"; }
  •     }
  • }

33
Special Variable Modified by a Match
  • $& Copy of text matched by the regex
  • $` A copy of the target text in front of the match
  • $' A copy of the target text after the match
  • $1, $2, $3, etc.
  • The text matched by the 1st, 2nd, etc., set of
    parentheses. Note $0 is not included here
  • $+ A copy of the highest numbered $1, $2, $3, etc.
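
A minimal sketch that captures each of these variables after one match. The sample string and pattern are chosen only for illustration.

```perl
use strict;
use warnings;

my $text = "File handling in Perl is sweet!";
my ($before, $match, $after, $first, $last);

if ($text =~ /(Perl) (is)/) {
    # Copy the special variables right away; the next match overwrites them.
    ($before, $match, $after) = ($`, $&, $');
    ($first, $last)           = ($1, $+);
}

print "before: [$before]\n";    # [File handling in ]
print "match:  [$match]\n";     # [Perl is]
print "after:  [$after]\n";     # [ sweet!]
print "first:  [$first]\n";     # [Perl]
print "last:   [$last]\n";      # [is]
```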

34
Our Simple Example once again
  • Now let's revise our program to find all the words
    that end in 'ing' without splitting our line of
    text into an array of words
  • #!/usr/local/bin/perl -w
  • while (<>) {
  •     chomp $_;
  •     if ($_ =~ /([A-Za-z]*ing\b)/) { print "$1\n"; }
  • }

35
Example
  • #!/usr/local/bin/perl
  • $exp = <STDIN>; chomp $exp;
  • if ($exp =~ /([A-Za-z\s]*)\bcrave\b([\sA-Za-z]*)/) {
  •     print "$1\n";
  •     print "$2\n";
  • }
  • Run Program with string: I crave to rule the
    world!
  • Results
  • I
  • to rule the world (the '!' is not in the
    character class, so it is not captured)

36
Example
  • #!/usr/local/bin/perl
  • $exp = <STDIN>; chomp $exp;
  • if ($exp =~ /\bcrave\b/) {
  •     print "$`\n"; print "$&\n"; print "$'\n";
  • }
  • Run Program with string: I crave to rule the
    world!
  • Results
  • I
  • crave
  • to rule the world!

37
Thank you