Perl Training - PowerPoint PPT Presentation

Slides: 29
Provided by: perlw

Transcript and Presenter's Notes

1
Perl Training
  • Pattern Matching and Regular Expressions

http://perlwizard.org/perl/week8
2
Agenda
  • Homework Review
  • Regular Expressions
  • Simple True/False Searches
  • Getting Shorter
  • Those Pesky Backslashes
  • Inexact matches
  • Characters with Class
  • Range Shortcuts
  • Special Locations
  • Multiplicity
  • Match Return values
  • Greedy Matching
  • Scalar and List Context of m//
  • Modifiers
  • Global searches
  • Grep
  • Simple Substitutions
  • Using Match Results/Expressions in Second
    Argument
  • tr///

3
Homework Problem 1
  • Create a program that will open the /www/
    directory on your VServer. Walk through the
    directory structure, making sure that all the
    directories are 755 and files are 644.

View Deven's solution at http://www.netexplorer.org/dirwalker.txt
4
Homework Problem 2
  • Create a program that will run a Unix command and
    do something with the output (use Pipes)

#!/usr/local/bin/perl
open(WHO, "who |");
while(<WHO>) { print; }
5
Homework Problem 3
  • Read a list of files from a text file, and print
    the permissions, file size, whether they are
    readable/writable and other useful file
    information for each file

sub parse_date {
    my @time_array = ();
    my ($time) = @_;
    @time_array = localtime($time);
    # Increment Month so 1=Jan, 2=Feb
    $time_array[4]++;
    # Increment Year, so it's 2000 instead of 100
    $time_array[5] += 1900;
    my $date_string = "$time_array[4]/";
    $date_string .= "$time_array[3]/";
    $date_string .= "$time_array[5] ";
    $date_string .= sprintf("%02d:", $time_array[2]);
    $date_string .= sprintf("%02d:", $time_array[1]);
    $date_string .= sprintf("%02d", $time_array[0]);
    return $date_string;
}

#!/usr/local/bin/perl
open(FILE, "/tmp/files.txt") || die "Cannot open /tmp/files.txt: $!";
print "Filename\t\tPermissions\tSize\tLast_Mod\t\tLast_Acc\n";
while(<FILE>) {
    my $filename = $_;
    chomp($filename);
    # Mask off the file-type bits and show the mode in octal (e.g. 0644)
    my $perms = sprintf("%04o", (stat($filename))[2] & 07777);
    my $size = (stat($filename))[7];
    my $last_mod = (stat($filename))[9];
    $last_mod = parse_date($last_mod);
    my $last_acc = (stat($filename))[8];
    $last_acc = parse_date($last_acc);
    print "$filename\t$perms\t\t$size\t$last_mod\t$last_acc\n";
}
6
Regular Expressions
  • @fields = split(/\t/, $TheRecord);
  • This is the simplest form of regular expression:
    just the exact characters you want to match,
    such as a single tab in the above example.
  • @fields = split(/:/, $TheRecord);   # Split fields on a colon: abc:cde:fff
  • @fields = split(/999/, $TheRecord);   # Split fields on 999: abc999cde999fff
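
A minimal sketch of literal-separator splitting, runnable as-is. The sample records are invented for illustration, and the colon-separated one is an assumption about what the second example looked like.

```perl
use strict;
use warnings;

# Sample records invented for illustration.
my @by_tab   = split(/\t/,  "abc\tcde\tfff");
my @by_colon = split(/:/,   "abc:cde:fff");
my @by_999   = split(/999/, "abc999cde999fff");

# Each call yields the same three fields.
print join(",", @by_tab),   "\n";   # abc,cde,fff
print join(",", @by_colon), "\n";   # abc,cde,fff
print join(",", @by_999),   "\n";   # abc,cde,fff
```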

7
Simple True-False Searches
  • Format
  • m// returns true or false, depending on whether it
    finds text matching its RegEx
  • if($STRING =~ m/REGEX/)
  • Example
  • if("abcdef" =~ m/REGEX/) { print "True"; } else { print "False"; }
  • $Test = "abcdef";
  • if($Test =~ m/REGEX/) { print "True"; } else { print "False"; }
  • if($Test !~ m/REGEX/)   # If $Test does not contain a match
  • $find = "REGEX";
  • if($Test =~ m/$find/)   # It's OK to use variables
  • Make SURE that $find is not the null string
  • If it is, Perl reuses the last pattern that matched
    successfully instead of an empty one, which is BAAAAD.
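
A small sketch of the true/false tests, with patterns (cd, xy, def) chosen purely for illustration. The length check is one possible way to guard against an empty pattern variable, not the only one.

```perl
use strict;
use warnings;

my $Test = "abcdef";

# =~ is true when the pattern is found, !~ is true when it is not.
my $has_cd = ($Test =~ m/cd/) ? "True" : "False";
my $no_xy  = ($Test !~ m/xy/) ? "True" : "False";

# A pattern held in a variable; check that it is not empty before using it.
my $find  = "def";
my $found = (length($find) && $Test =~ m/$find/) ? "True" : "False";

print "$has_cd $no_xy $found\n";   # True True True
```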

8
Getting Shorter
  • You can use other delimiters
  • if($Test =~ m|REGEX|)
  • if($Test =~ m!REGEX!)
  • I'll use m// because most people do.
  • If you use the m// form, you don't have to
    include the m.
  • if($Test =~ /REGEX/)
  • This is considered the short form and many
    programmers use it.
  • To get shorter still, if you are searching in
    Perl's special variable $_, you don't need the
    =~ at all
  • while(<INFILE>)
  • if(/From/)   # The line has From in it
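
A runnable sketch of the shorter forms. The path and the mail-style lines are made-up sample data, and a foreach loop over an array stands in for reading INFILE so the example needs no input file.

```perl
use strict;
use warnings;

my $path = "/usr/local/bin/perl";

# With ! as the delimiter, slashes in the pattern need no backslashes.
my $alt   = ($path =~ m!/local/!) ? 1 : 0;
# With the default / delimiter, the leading m is optional.
my $short = ($path =~ /local/) ? 1 : 0;

# foreach aliases each element to $_, so a bare /From/ tests it.
my @lines = ("From: deven", "To: class", "Reply From home");
my @hits;
foreach (@lines) {
    push @hits, $_ if /From/;
}
print scalar(@hits), " lines contain From\n";   # 2 lines contain From
```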

9
Those pesky backslashes
How do you search for a / in a string?
    if($line =~ /\//)
Search for a vertical bar?
    if($line =~ /\|/)
Use the quotemeta function (requires Perl 5):
    print quotemeta("This string? has ( ) meta chars in it");
    prints This\ string\?\ has\ \(\ \)\ meta\ chars\ in\ it
Must be escaped: ^ $ * + ? . ( ) [ { | \
(/ is also there if you use it as your delimiter)
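
As a sketch of why escaping matters, the following compares an escaped pattern built with quotemeta against an unescaped one. The sample strings are hypothetical.

```perl
use strict;
use warnings;

my $line = "price: 3.50 (each) a/b";

# Escape the delimiter to search for a literal slash.
my $has_slash = ($line =~ /\//) ? 1 : 0;

# quotemeta backslashes every character that is not a letter, digit or underscore.
my $literal = quotemeta("3.50 (each)");
print "$literal\n";    # 3\.50\ \(each\)

# The escaped text can then be interpolated as a literal pattern.
my $exact = ($line =~ /$literal/) ? 1 : 0;

# Unescaped, the dot matches any character, such as the x in 3x50.
my $loose = ("3x50" =~ /3.50/) ? 1 : 0;
```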
10
Inexact Matches in RegEx
  • Use a | for or
  • if($Test =~ /(x|X)/)   # If $Test has an x or an X in it
  • if($Test =~ /can(dle|dy|cer)/)   # Matches on candle, candy or cancer
  • Matching any character
  • if($Test =~ /N.T/)   # Matches the letter N<any char>T
  • Matches NET, NeT, N#T; not NesT (2 chars)
  • Does not work across \n, so it would fail if
  • $Test was "N\nT"
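
A quick sketch that runs the alternation and any-character patterns over a few candidate strings. The word lists are illustrative additions.

```perl
use strict;
use warnings;

# Alternation: only the three listed endings match.
my @words = ("candle", "candy", "cancer", "canoe");
my @alt   = grep { /can(dle|dy|cer)/ } @words;

# The dot needs exactly one character, and that character cannot be \n.
my @dot = grep { /N.T/ } ("NET", "NeT", "NesT", "NT", "N\nT");

print "@alt\n";   # candle candy cancer
print "@dot\n";   # NET NeT
```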

11
Characters with Class
  • if($Test =~ /(x|X)/) is the same as if($Test =~ /[xX]/)
  • Using [] will match any one of the characters within the
    brackets
  • if($Test =~ /[ABCDE]/)
  • Using ^ as the first character inside the brackets negates the
    class, so [^ABCDE] would match any character not in that range
  • Which is better?
  • if($Test =~ /(0|1|2|3|4|5|6|7|8|9)/)
  • if($Test =~ /[0123456789]/)
  • if($Test =~ /[0-9]/)
  • Test for any upper or lowercase letter
  • if($Test =~ /[A-Za-z]/)
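
A short sketch of character classes, using invented sample strings. It also shows that the long alternation and the range form select the same items.

```perl
use strict;
use warnings;

my @samples = ("x-ray", "Xerox", "abc", "room 42");

my @has_x     = grep { /[xX]/ } @samples;                     # x or X anywhere
my @has_digit = grep { /[0-9]/ } @samples;                    # any digit
my @long_form = grep { /(0|1|2|3|4|5|6|7|8|9)/ } @samples;    # same test, longer
my @non_alpha = grep { /[^A-Za-z]/ } @samples;                # any non-letter

print "@has_x\n";                   # x-ray Xerox
print "@has_digit\n";               # room 42
print join(", ", @non_alpha), "\n"; # x-ray, room 42
```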

12
Range Shortcuts
  • Code   Replaces          Description
  • \d     [0-9]             Any digit
  • \w     [a-zA-Z_0-9]      Any alphanumeric (or underscore)
  • \s     [ \t\n\r\f]       Any whitespace character
  • \D     [^0-9]            Any non-digit
  • \W     [^a-zA-Z_0-9]     Any non-alphanumeric
  • \S     [^ \t\n\r\f]      Any non-whitespace
  • if($Test =~ /\d/)   # Search for a digit
  • if($Test =~ /[A-Z]\d\d/)   # Search for a capital letter followed by two digits
  • if($Test =~ /\s\w\w\w\s/)   # Searches for whitespace, followed by 3
    alphanumeric characters, followed by whitespace (a 3-letter word)
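
The sketch below applies the shortcuts to one made-up string. Capturing parentheses are used only to show which text each pattern picked; note that \w also accepts digits, so the "3-letter word" pattern finds B42.

```perl
use strict;
use warnings;

my $Test = "see Room B42 now";

my $has_digit = ($Test =~ /\d/) ? 1 : 0;

# Capital letter followed by two digits.
my ($code) = $Test =~ /([A-Z]\d\d)/;

# Three word characters with whitespace on both sides.
my ($three) = $Test =~ /\s(\w\w\w)\s/;

print "$has_digit $code $three\n";   # 1 B42 B42
```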

13
Special Locations
  • ^ indicates the beginning of the string,
    $ indicates the end of the string
  • if($Test =~ /^Beg/)   # If the string starts with Beg
  • $Test = "Begin here";   # True
  • $Test = "I Beg your Pardon";   # False
  • if($Test =~ /don$/)   # If the string ends in don
  • if($Test =~ /\bY/)   # If a word starts with a Y
  • \b is for word boundary
  • if($Test =~ /J\b/)   # If a word ends with J
  • \B is the opposite: it matches anywhere that is not a
    word boundary, e.g. the middle of a word
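
A sketch of the anchors in action. The two test strings come from the slide; the Y examples are added for illustration.

```perl
use strict;
use warnings;

my @tests  = ("Begin here", "I Beg your Pardon");
my @starts = grep { /^Beg/ } @tests;   # anchored at the start of the string
my @ends   = grep { /don$/ } @tests;   # anchored at the end of the string

my @y_words = ("Yes sir", "New York", "eYe");
my @word_y  = grep { /\bY/ } @y_words;     # Y at the start of a word
my @mid_y   = grep { /\BY\B/ } @y_words;   # Y strictly inside a word

print "@starts\n";                 # Begin here
print "@ends\n";                   # I Beg your Pardon
print join(", ", @word_y), "\n";   # Yes sir, New York
print "@mid_y\n";                  # eYe
```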

14
Multiplicity
  • How do you find one or more hash characters?
  • if($Temp =~ /#|##|###|####/)   # Not general enough
  • Symbol   Meaning
  • +        Match 1 or more times
  • *        Match 0 or more times
  • ?        Match 0 or 1 time
  • {n}      Match exactly n times
  • {n,}     Match at least n times
  • {n,m}    Match at least n but not more than m times
  • (n and m values must be less than 65,536!!)

15
Multiplicity Examples
  • if($Test =~ /#+/)   # Matches one or more #
  • if($Test =~ /\d{3}/)   # Match exactly three digits
  • if($Test =~ /\w+\d/)   # Search for one or more alphanumeric
    characters followed by one digit
  • if($Test =~ /N.?T/)   # NT, NeT
  • if($Test =~ /N.+T/)   # NeT, NesT, NeesT
  • if($Test =~ /N.*T/)   # NT, NeT, NesT, NeesT
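
A sketch comparing the quantifiers side by side. The patterns are anchored with ^ and $ here (an addition for this example) so that each word must match in full.

```perl
use strict;
use warnings;

my @words = ("NT", "NeT", "NesT", "NeesT");

my @opt  = grep { /^N.?T$/ } @words;   # zero or one character between N and T
my @plus = grep { /^N.+T$/ } @words;   # one or more
my @star = grep { /^N.*T$/ } @words;   # zero or more

# Exactly three digits, and nothing else, in the string.
my @three = grep { /^\d{3}$/ } ("12", "123", "1234");

# One or more hash characters.
my $hashes = ("## heading" =~ /#+/) ? 1 : 0;

print "@opt\n";    # NT NeT
print "@plus\n";   # NeT NesT NeesT
print "@star\n";   # NT NeT NesT NeesT
print "@three\n";  # 123
```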

16
What Perl returns from a match
  • $Test = "NESTING";
  • if($Test =~ /(N.*T)/) { print "True"; }
  • print "$1\n";   # Prints NEST
  • $1 holds the text matched by the first set of
    parentheses
  • $2 holds the text matched by the second set of
    parentheses
  • if($Test =~ /(N.)S(.I)/)   # $1 is NE, $2 is TI
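
A minimal sketch of reading the capture variables. Copying $1 and $2 into named variables inside the if block is a defensive habit assumed here, since they are only meaningful after a successful match.

```perl
use strict;
use warnings;

my $Test = "NESTING";
my ($first, $second) = ("", "");

if ($Test =~ /(N.)S(.I)/) {
    # Only read $1 and $2 once the match has succeeded.
    ($first, $second) = ($1, $2);
    print "$first $second\n";   # NE TI
}
```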

17
Greedy Matching
  • $Test = "NITWITS";
  • if($Test =~ /(N.*T)/)   # $1 is NITWIT
  • Perl uses greedy matching by default, i.e. each quantifier
    matches as much as possible; you can
    suppress it with a ? after the quantifier
  • if($Test =~ /(N.*?T)/)   # $1 is NIT
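
The difference is easy to see by running both forms on the same string. This sketch assumes the quantifier in question is *.

```perl
use strict;
use warnings;

my $Test = "NITWITS";

# Greedy: .* runs to the last T it can reach.
my ($greedy) = $Test =~ /(N.*T)/;

# Non-greedy: .*? stops at the first T.
my ($lazy) = $Test =~ /(N.*?T)/;

print "$greedy $lazy\n";   # NITWIT NIT
```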

18
Scalar and List context of m//
  • In scalar context, the m// operator returns true
    or false. In list context, it returns the list
    of captured items, such as ($1, $2, $3...)
  • $Test = "NESTING";
  • @Birds = ($Test =~ /(N.)S(.I)/);   # $Birds[0] is NE,
    $Birds[1] is TI
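
A sketch contrasting the two contexts, including what a failed match returns in list context. The string ROBIN is an added example.

```perl
use strict;
use warnings;

my $Test = "NESTING";

# List context: the captured pieces come back as a list.
my @Birds = ($Test =~ /(N.)S(.I)/);

# Scalar context: just a true or false value.
my $matched = ($Test =~ /(N.)S(.I)/) ? "yes" : "no";

# A failed match in list context yields an empty list.
my @none = ("ROBIN" =~ /(N.)S(.I)/);

print "$Birds[0] $Birds[1] $matched ", scalar(@none), "\n";   # NE TI yes 0
```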

19
Modifiers
  • Modifier Description
  • g Returns each occurrence (global search)
  • i Ignores case
  • m Treats the string as multiple lines (^ and $ match at each line)
  • o Compiles pattern once (rarely used)
  • s Treats as single line (. also matches \n)
  • x Allows whitespace for comments
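
The m and s modifiers are the easiest to mix up, so here is a small sketch of both on a made-up two-line string.

```perl
use strict;
use warnings;

my $text = "first line\nsecond line";

# Without m, ^ only matches at the very start of the string.
my $m_off = ($text =~ /^second/)  ? 1 : 0;
my $m_on  = ($text =~ /^second/m) ? 1 : 0;

# Without s, the dot refuses to cross the newline.
my $s_off = ($text =~ /first.*second/)  ? 1 : 0;
my $s_on  = ($text =~ /first.*second/s) ? 1 : 0;

print "$m_off $m_on $s_off $s_on\n";   # 0 1 0 1
```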

20
Modifier Examples
  • if($Test =~ /(N.*T)/i)   # Matches NesT, Nest,
    nest, net, neT
  • if($Test =~ /
  • (N.) (?# Start with N and any letter)
  • S (?# ... followed by the letter S)
  • (.I) (?# ... followed by another letter, then I)
  • /x)   # Same as if($Test =~ /(N.)S(.I)/)
  • It's usually more readable to use a regular Perl
    comment on the line before your if statement.
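
A sketch of the i and x modifiers. Under x, ordinary # comments are also allowed inside the pattern, which is the form used below in place of the (?# ) style; the word list is invented.

```perl
use strict;
use warnings;

# i ignores case, so lowercase n and t are accepted too.
my @hits = grep { /N.*T/i } ("NesT", "nest", "net", "Nope");

my $Test  = "NESTING";
my @parts = $Test =~ /
    (N.)    # Start with N and any character
    S       # ... followed by the letter S
    (.I)    # ... followed by another character, then I
/x;

print "@hits\n";    # NesT nest net
print "@parts\n";   # NE TI
```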

21
Global Searches
  • $Test = "NESTING";
  • @Birds = ($Test =~ /N./g);
  • $Birds[0] is NE
  • $Birds[1] is NG
  • Handy way to find all the matching strings, if
    you don't know how many there will be.
  • $i = 0;
  • In scalar context, //g remembers where in the string
    it found the last match and starts from there for
    the next search
  • while($Test =~ /N./g) { $i++; }
  • print "There were $i matches\n";
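
Both uses of the g modifier, sketched together: list context collects every match at once, while the scalar-context loop steps through them one at a time.

```perl
use strict;
use warnings;

my $Test = "NESTING";

# List context: every match is returned.
my @Birds = ($Test =~ /N./g);

# Scalar context: each pass of the loop resumes after the previous match.
my $i = 0;
while ($Test =~ /N./g) {
    $i++;
}

print "@Birds\n";                  # NE NG
print "There were $i matches\n";   # There were 2 matches
```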

22
Grep (A Unix Geek Favorite)
  • grep(RegEx, list)
  • In list context, returns the items for which the RegEx is
    true
  • In scalar context, returns the number of elements
    for which it is true
  • @JustSNames = grep(/^S/, @LastNames);   # Get all
    names that start with S
  • $TotSNames = grep(/^S/, @LastNames);   # Get the
    number of S names
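
A sketch of grep in both contexts with a hypothetical name list. It also shows why the ^ anchor matters: without it, any name containing a capital S anywhere would be selected.

```perl
use strict;
use warnings;

my @LastNames = ("Smith", "Jones", "Sanders", "McSweeney");

# List context: the matching names themselves.
my @JustSNames = grep(/^S/, @LastNames);

# Scalar context: how many names matched.
my $TotSNames = grep(/^S/, @LastNames);

# Unanchored, McSweeney is picked up as well.
my @AnyS = grep(/S/, @LastNames);

print "@JustSNames\n";   # Smith Sanders
print "$TotSNames\n";    # 2
print "@AnyS\n";         # Smith Sanders McSweeney
```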

23
Simple Substitutions
  • s/// function format
  • s/String to search for/String to replace with/
  • Takes the same modifiers as m//
  • $Count = ($Test =~ s/e/E/g);   # All e's in $Test
    are now E
  • $Count is the number of e's changed
  • $count = ($Test =~ s/[aeiouAEIOU]/V/g);
  • Change all vowels to V
  • If you don't use the =~, it assumes you want $_
  • while(<INFILE>)
  • s/\t/|/g;   # Change the tabs to vertical bars
  • Safer to assign $_ to a variable and work
    with the variable
  • You can use variables in either part of the s///
    operator
  • $Test =~ s/$tmp/$tmp2/g;
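
A runnable sketch of the substitutions. All sample strings are invented, and a for loop over an array stands in for reading INFILE.

```perl
use strict;
use warnings;

# s///g returns the number of replacements made.
my $Test  = "eleven trees";
my $Count = ($Test =~ s/e/E/g);

my $vowels = "Perl Training";
my $n      = ($vowels =~ s/[aeiouAEIOU]/V/g);

# With no =~, s/// works on $_ (here, each array element in turn).
my @rows = ("a\tb\tc");
for (@rows) {
    s/\t/|/g;
}

# Variables can be used on either side.
my ($tmp, $tmp2) = ("cat", "dog");
my $pets = "cat and cat";
$pets =~ s/$tmp/$tmp2/g;

print "$Test ($Count)\n";     # ElEvEn trEEs (5)
print "$vowels ($n)\n";       # PVrl TrVVnVng (4)
print "$rows[0] / $pets\n";   # a|b|c / dog and dog
```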

24
Using Match Results/Expressions in Second
Argument
  • $Test =~ s/^(\d+)/Line $1/;   # Change lines
    that start with a number so they start with
    Line and that number
  • $Test =~ s/([^,]+), (.+)/$2 $1/;   # Change Noble,
    Jason to Jason Noble
  • s/// also has the e modifier, which means
    evaluate the second part as an expression rather
    than a string
  • $Test =~ s/(\d+)/$1 + 1/ge;   # Turns 23 into 24,
    for all numbers (g applies it to every number, not just the first)
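
A sketch of captured text reused in the replacement, plus the e modifier. The sample strings other than "Noble, Jason" are made up.

```perl
use strict;
use warnings;

# Reuse the captured number in the replacement.
my $line = "42 is the answer";
$line =~ s/^(\d+)/Line $1/;

# Swap the two captured pieces around the comma.
my $name = "Noble, Jason";
$name =~ s/([^,]+), (.+)/$2 $1/;

# e evaluates the replacement as Perl code; g repeats it for every number.
my $nums = "23 and 99";
$nums =~ s/(\d+)/$1 + 1/ge;

print "$line\n";   # Line 42 is the answer
print "$name\n";   # Jason Noble
print "$nums\n";   # 24 and 100
```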

25
Another look at splitting
  • The first argument is a regular expression, not
    just a string.
  • Example (from DNS files)
  • @fields = split(/\t\t\t/, $TheRecord);   # What if
    it has 2 or 4 tabs?
  • @fields = split(/\t{2,4}/, $TheRecord);   # Split
    on 2-4 tabs
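
To see the difference, this sketch splits one hypothetical DNS-style record (two tabs, then four) both ways.

```perl
use strict;
use warnings;

# Hypothetical record: two tabs after the host, four after the class.
my $TheRecord = "host\t\tIN\t\t\t\tA";

# Exactly three tabs: only part of the four-tab run matches.
my @fixed = split(/\t\t\t/, $TheRecord);

# Two to four tabs: both runs are treated as separators.
my @ranged = split(/\t{2,4}/, $TheRecord);

print scalar(@fixed), " fields with the fixed pattern\n";   # 2 fields with the fixed pattern
print "@ranged\n";                                          # host IN A
```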

26
Letter for Letter Translations
  • tr/// function translates lots of letters at once
  • $Test =~ tr/A-Z/a-z/;   # Translate all uppercase
    to lowercase
  • tr/// returns the number of characters it replaced or deleted.
  • Modifiers
  • c Complement the first argument (same as ^ inside [] in
    m// and s///)
  • d Deletes matching characters that are not
    replaced
  • s Squeezes runs of the same replacement character into one
  • $Test =~ tr/0-9/\n/c;   # Change all non-digits
    into line feeds
  • $Test =~ tr/0-9//cd;   # Change non-digits into
    nothing
  • $Test =~ tr/,/,/s;   # Change two or more commas
    into one comma
  • y/// is equivalent to tr///
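
A sketch of tr/// and its modifiers on invented sample strings. Each result goes into a copy so the originals stay intact; with an empty replacement list and no d modifier, tr/// only counts.

```perl
use strict;
use warnings;

my $Test = "Perl TRAINING";

# Translate a copy to lowercase.
(my $lower = $Test) =~ tr/A-Z/a-z/;

# Count the uppercase letters without changing the string.
my $upper_count = ($Test =~ tr/A-Z//);

# c complements the list and d deletes: everything but digits is removed.
(my $digits = "(555) 123-4567") =~ tr/0-9//cd;

# s squeezes each run of commas into one.
(my $squeezed = "a,,,b,,c") =~ tr/,/,/s;

print "$lower $upper_count\n";   # perl training 9
print "$digits $squeezed\n";     # 5551234567 a,b,c
```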

27
Homework
  • I'm going to take next week off (from teaching
    this class). Instead, each of you needs to
    research a topic in Perl that we either have not
    covered in class, or that you feel needs more
    coverage.
  • Perl University class next Monday/Tuesday
  • www.cpan.org
  • www.perl.com
  • www.tpj.com
  • Please send me a PowerPoint slide presentation
    so that I can put it up on the Perl Website

28
Questions?