Short Perl tutorial

1
  • Short Perl tutorial
  • Instructors: Rada Mihalcea, András Csomai
  • Note: some of the material in this slide set was
    adapted from a Perl course taught at the University
    of Antwerp

2
About Perl
  • 1987: Larry Wall develops PERL
  • 1989: October 18, Perl 3.0 is released under the
    GNU General Public License
  • 1991: March 21, Perl 4.0 is released under the GPL
    and the new Perl Artistic License
  • Now: Perl 6

PERL is not officially a programming language per
se. Wall's original intent was to develop a
scripting language more powerful than Unix shell
scripting, but not as tedious as C. PERL is an
interpreted language: there is no explicitly
separate compilation step. Rather, the interpreter
reads the whole file, converts it to an internal
form and executes it immediately.
P.E.R.L.: Practical Extraction and Report Language

3
Variables
  • A variable is a name of a place where some
    information is stored. For example:
  • $yearOfBirth = 1976;
  • $currentYear = 2000;
  • $age = $currentYear-$yearOfBirth;
  • print $age;
  • The variables in the example program can be
    identified as such because their names start with
    a dollar sign ($). Perl uses different prefix
    characters for structure names in programs. Here
    is an overview:
  • $ variable containing scalar values such as a
    number or a string
  • @ variable containing a list with numeric keys
  • % variable containing a list with strings as
    keys
  • & subroutine
  • * matches all structures with the associated
    name

4
Operations on numbers
  • Perl contains the following arithmetic operators:
  • + sum
  • - subtraction
  • * product
  • / division
  • % modulo division
  • ** exponent
  • Apart from these operators, Perl contains some
    built-in arithmetic functions. Some of these are
    mentioned in the following list:
  • abs($x) absolute value
  • int($x) integer part
  • rand() random number between 0 and 1
  • sqrt($x) square root
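
A minimal sketch of the operators and functions listed above; the variable names and sample values are chosen here for illustration only.

```perl
use strict;
use warnings;

my $x = -7.5;

my $sum       = 7 + 2;      # 9
my $remainder = 7 % 2;      # 1
my $power     = 7 ** 2;     # 49
my $quotient  = 7 / 2;      # 3.5 (division is not truncated)

my $absolute  = abs($x);    # 7.5
my $integer   = int($x);    # -7 (int truncates toward zero)
my $root      = sqrt(16);   # 4
my $random    = rand();     # some value with 0 <= $random < 1

print "$sum $remainder $power $quotient $absolute $integer $root\n";
```

Note that int() does not round: it simply drops the fractional part.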

5
Input and output
  • # age calculator
  • print "Please enter your birth year ";
  • $yearOfBirth = <>;
  • chomp($yearOfBirth);
  • print "Your age is ",2003-$yearOfBirth,".\n";
  • # count the number of lines in a file
  • open INPUTFILE, "<$myfile";
  • (-r INPUTFILE) || die "Could not open the file
    $myfile\n";
  • $count = 0;
  • while($line = <INPUTFILE>) {
  •   $count++;
  • }
  • print "$count lines in file $myfile\n";
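
The line counter above can also be written with a lexical file handle and the three-argument form of open, checking the result of open directly. This is a sketch, not the slide's own code: the helper name count_lines and the temporary sample file are assumptions made so that the example runs on its own.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small throwaway file so the example is self-contained.
my ($tmp, $myfile) = tempfile(UNLINK => 1);
print $tmp "first line\nsecond line\nthird line\n";
close($tmp);

# Count the lines in a file and return the count (hypothetical helper).
sub count_lines {
    my ($filename) = @_;
    open(my $in, '<', $filename)
        or die "Could not open the file $filename: $!\n";
    my $count = 0;
    while (my $line = <$in>) {
        $count++;
    }
    close($in);
    return $count;
}

my $count = count_lines($myfile);
print "$count lines in file $myfile\n";
```

Testing the return value of open catches a missing or unreadable file at the point where the failure happens, and $! reports the reason.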

6
Conditional structures
  • # determine whether number is odd or even
  • print "Enter number: ";
  • $number = <>;
  • chomp($number);
  • if ($number-2*int($number/2) == 0) {
  •   print "$number is even\n";
  • } elsif (abs($number-2*int($number/2)) == 1) {
  •   print "$number is odd\n";
  • } else {
  •   print "Something strange has happened!\n";
  • }

7
Numeric test operators
  • An overview of the numeric test operators:
  • == equal
  • != not equal
  • < less than
  • <= less than or equal to
  • > greater than
  • >= greater than or equal to
  • All these operators can be used for comparing two
    numeric values in an if condition.
  • Truth expressions
  • three logical operators:
  • and: and (alternative: &&)
  • or: or (alternative: ||)
  • not: not (alternative: !)

8
Iterative structures
  • # print numbers 1-10 in three different ways
  • $i = 1;
  • while ($i<=10) {
  •   print "$i\n";
  •   $i++;
  • }
  • for ($i=1;$i<=10;$i++) {
  •   print "$i\n";
  • }
  • foreach $i (1,2,3,4,5,6,7,8,9,10) {
  •   print "$i\n";
  • }
  • Stop a loop, or force continuation:
  • last (C: break)
  • next (C: continue)

9
A parenthesis: PERL philosophy(ies)
  • There is more than one way to do it
  • If you want to shoot yourself in the foot,
    who am I to stop you?
  • And a comment: DO write comments in your Perl
    programs!

10
Basic string operations
  • strings are stored in the same type of
    variables we used for storing numbers
  • string values can be specified between double or
    single quotes
  • Note: in double-quoted strings variables are
    interpolated; in single-quoted strings they are not.
  • Comparison operators for strings:
  • eq equal
  • ne not equal
  • lt less than
  • le less than or equal to
  • gt greater than
  • ge greater than or equal to
  • Examples:
  • if ($a eq $b) { ... }
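
A short sketch of the two points above: quoting style decides whether variables are interpolated, and string operators compare differently from numeric ones. The sample values are illustrative assumptions.

```perl
use strict;
use warnings;

my $name   = "Perl";
my $double = "Hello, $name!";   # variable is interpolated
my $single = 'Hello, $name!';   # taken literally

print "$double\n";              # Hello, Perl!
print "$single\n";              # Hello, $name!

# String comparison uses eq/ne/lt/..., numeric comparison uses ==/!=/</...
my $string_equal  = ("10" eq "10.0")      ? "yes" : "no";   # no: different strings
my $numeric_equal = ("10" == "10.0")      ? "yes" : "no";   # yes: same number
my $alphabetical  = ("apple" lt "banana") ? "yes" : "no";   # yes

print "$string_equal $numeric_equal $alphabetical\n";
```

Choosing the wrong operator family is a common source of bugs, since Perl converts between strings and numbers silently.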

11
String substitution and string matching
  • The power of Perl!
  • The s/// operator modifies sequences of
    characters.
  • The tr/// operator changes individual characters.
  • The m// operator checks for matching (or //
    for short).
  • For s/// and tr///, the first part, between the
    first two slashes, contains a search pattern
  • the second part, between the final two slashes,
    contains the replacement.
  • behind the final slash we can put characters to
    modify the behavior of the commands.
  • By default s/// only replaces the first
    occurrence of the search pattern
  • append a g to the operator to replace every
    occurrence.
  • append an i to the operator to make the search
    case insensitive
  • The tr/// operator accepts the modifier
    characters:
  • c (replace the complement of the search class)
  • d (delete characters of the search class that are
    not replaced)
  • s (squeeze sequences of identical replaced
    characters to one character)

12
Examples
  • # replace first occurrence of "bug"
  • $text =~ s/bug/feature/;
  • # replace all occurrences of "bug"
  • $text =~ s/bug/feature/g;
  • # convert to lower case
  • $text =~ tr/A-Z/a-z/;
  • # delete vowels
  • $text =~ tr/AEIOUaeiou//d;
  • # replace nonnumber sequences with x
  • $text =~ tr/0-9/x/cs;
  • # replace all capital characters by CAPS
  • $text =~ s/[A-Z]/CAPS/g;
  • Simple example
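
The following sketch applies the operators above to sample strings, so that the effect of each modifier can be compared side by side. The sample text is an assumption; each result is stored in a copy, using the (my $copy = $text) =~ ... idiom, so that $text itself stays unchanged.

```perl
use strict;
use warnings;

my $text = "A bug is a Bug is a bug";

(my $first = $text) =~ s/bug/feature/;        # first occurrence only
(my $all   = $text) =~ s/bug/feature/g;       # every lowercase "bug"
(my $any   = $text) =~ s/bug/feature/gi;      # case-insensitive as well
(my $lower = $text) =~ tr/A-Z/a-z/;           # convert to lower case
(my $novow = $text) =~ tr/AEIOUaeiou//d;      # delete vowels

my $mixed = "abc123def45";
(my $masked = $mixed) =~ tr/0-9/x/cs;         # each non-digit run becomes one x

print "$first\n$all\n$any\n$lower\n$novow\n$masked\n";
```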

13
Regular expressions
Examples: 1. Clean an HTML formatted text; 2.
Grab URLs from a Web page; 3. Transform all
lines from a file into lower case
  • \b word boundaries
  • \d digits
  • \n newline
  • \r carriage return
  • \s white space characters
  • \t tab
  • \w alphanumeric characters
  • ^ beginning of string
  • $ end of string
  • . any character
  • [bdkp] characters b, d, k and p
  • [a-f] characters a to f
  • [^a-f] all characters except a to f
  • abc|def string abc or string def
  • * zero or more times
  • + one or more times
  • ? zero or one time
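
One possible take on the three example tasks, using only the constructs from the list above. The HTML fragment and its URLs are made-up sample input, and the patterns are deliberately simple: they are a sketch and would not cope with every real Web page.

```perl
use strict;
use warnings;

my $html = '<p>See <a href="http://www.perl.org/">Perl</a> and '
         . '<a href="http://www.cpan.org/">CPAN</a>.</p>';

# 2. Grab URLs: capture whatever sits between href=" and the next quote.
my @urls = $html =~ /href="([^"]*)"/g;

# 1. Clean the HTML: delete everything from < up to the next >.
(my $plain = $html) =~ s/<[^>]*>//g;

# 3. Lower-case the result.
(my $lower = $plain) =~ tr/A-Z/a-z/;

print join("\n", @urls), "\n";
print "$plain\n";
print "$lower\n";
```

The negated classes [^"] and [^>] keep each match from running past the closing delimiter.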

14
Lists and arrays
  • @a = (); # empty list
  • @b = (1,2,3); # three numbers
  • @c = ("Jan","Piet","Marie"); # three strings
  • @d = ("Dirk",1.92,46,"20-03-1977"); # a mixed list
  • Variables and sublists are interpolated in a list
  • @b = ($a,$a+1,$a+2); # variable interpolation
  • @c = ("Jan",("Piet","Marie")); # list interpolation
  • @d = ("Dirk",1.92,46,(),"20-03-1977"); # empty
    list interpolation
  • @e = ( @b, @c ); # same as
    (1,2,3,"Jan","Piet","Marie")
  • Practical construction operators
  • (x..y)
  • @x = (1..6) # same as (1, 2, 3, 4, 5, 6)
  • @y = (1.2..4.2) # same as (1, 2, 3, 4): the range
    operator truncates its endpoints to integers
  • @z = (2..5,8,11..13) # same as
    (2,3,4,5,8,11,12,13)
  • qw() ("quote word") function
  • qw(Jan Piet Marie) is a shorter notation for
    ("Jan","Piet","Marie").
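
A small sketch that checks the list constructions above, including the flattening of nested lists and the behaviour of the range operator on non-integer endpoints.

```perl
use strict;
use warnings;

my @b = (1, 2, 3);
my @c = ("Jan", ("Piet", "Marie"));      # nested list is flattened
my @e = (@b, @c);                        # 6 elements, no nesting

my @x = (1 .. 6);
my @y = (1.2 .. 4.2);                    # endpoints truncated: 1, 2, 3, 4
my @z = (2 .. 5, 8, 11 .. 13);
my @names = qw(Jan Piet Marie);

print "@e\n";   # 1 2 3 Jan Piet Marie
print "@y\n";   # 1 2 3 4
print "@z\n";   # 2 3 4 5 8 11 12 13
```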

15
Split function
  • $string = "Jan Piet\nMarie \tDirk";
  • @list = split /\s+/, $string; # yields
    ("Jan","Piet","Marie","Dirk")
  • $string = " Jan Piet\nMarie \tDirk\n"; # watch
    out, empty string at the beginning!!!
  • @list = split /\s+/, $string; # yields
    ("","Jan","Piet","Marie","Dirk"); split keeps the
    leading empty string but drops trailing ones
  • $string = "Jan:Piet;Marie---Dirk"; # use any
    regular expression...
  • @list = split /[:;]|---/, $string; # yields
    ("Jan","Piet","Marie","Dirk")
  • $string = "Jan Piet"; # use an empty regular
    expression to split on letters
  • @letters = split //, $string; # yields
    ("J","a","n"," ","P","i","e","t")
  • Example
  • 1. Tokenize a text: separate simple punctuation
    (, . ! ? ( ) )
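
A minimal sketch of the tokenizing exercise: put spaces around the listed punctuation marks, then split on white space. The subroutine name tokenize and the sample sentence are assumptions, and the approach is naive (it would also split a number such as 3.14).

```perl
use strict;
use warnings;

# Put spaces around simple punctuation, then split on white space.
sub tokenize {
    my ($text) = @_;
    $text =~ s/([,.!?()])/ $1 /g;
    $text =~ s/^\s+//;              # avoid a leading empty field
    return split /\s+/, $text;
}

my @tokens = tokenize("Hello, world! (This is Perl.)");
print join("|", @tokens), "\n";     # Hello|,|world|!|(|This|is|Perl|.|)

# split keeps a leading empty field but drops trailing ones.
my @fields = split /\s+/, " Jan Piet\n";
print scalar(@fields), "\n";        # 3: "", "Jan", "Piet"
```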

16
More about arrays
  • @array = ("an","bert","cindy","dirk");
  • $length = @array; # $length now has the value 4
  • @array = ("an","bert","cindy","dirk");
  • $length = @array;
  • print $length; # prints 4
  • print $#array; # prints 3
  • print $array[$#array]; # prints "dirk"
  • print scalar(@array); # prints 4
  • ($a, $b) = ("one","two");
  • ($onething, @manythings) = (1,2,3,4,5,6); # now
    $onething equals 1 and @manythings = (2,3,4,5,6)
  • ($array[0],$array[1]) = ($array[1],$array[0]); #
    swap the first two
  • Pay attention to the fact that assignment to a
    variable first evaluates the right-hand side of
    the expression, and then makes a copy of the
    result

17
Manipulating lists and their elements
  • push ARRAY LIST
  • appends the list to the end of the array.
  • if the second argument is a scalar rather
    than a list, it appends it as the last item of
    the array.
  • @array = ("an","bert","cindy","dirk");
  • @brray = ("evelien","frank");
  • push @array, @brray; # @array is
    ("an","bert","cindy","dirk","evelien","frank")
  • push @brray, "gerben"; # @brray is
    ("evelien","frank","gerben")
  • pop ARRAY does the opposite of push: it removes
    the last item of its argument list and returns
    it. If the list is empty it returns undef.
  • @array = ("an","bert","cindy","dirk");
  • $item = pop @array; # $item is "dirk" and
    @array is ("an","bert","cindy")
  • shift ARRAY works on the left end of the list,
    but is otherwise the same as pop.
  • unshift ARRAY LIST puts stuff on the left side of
    the list, just as push does for the right side.
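
The four functions can be seen together in a short sketch; the array name @queue is chosen here for illustration.

```perl
use strict;
use warnings;

my @queue = ("bert", "cindy");

push    @queue, "dirk";            # ("bert","cindy","dirk")
unshift @queue, "an";              # ("an","bert","cindy","dirk")
my $last  = pop   @queue;          # "dirk"; @queue is ("an","bert","cindy")
my $first = shift @queue;          # "an";   @queue is ("bert","cindy")

my @empty   = ();
my $nothing = pop @empty;          # undef, because the array is empty

print "$first $last @queue\n";     # an dirk bert cindy
print defined($nothing) ? "defined\n" : "undef\n";   # undef
```

Used in pairs, push/pop give a stack and push/shift give a queue.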

18
Working with lists
  • Convert lists to strings
  • @array = ("an","bert","cindy","dirk");
  • print "The array contains $array[0] $array[1]
    $array[2] $array[3]";
  • # interpolate
  • print "The array contains @array";
  • function join STRING LIST.
  • $string = join ":", @array; # $string now has the
    value "an:bert:cindy:dirk"
  • $string = join "+", "", @array; # $string now has
    the value "+an+bert+cindy+dirk"
  • Iteration over lists
  • for( $i=0 ; $i<=$#array; $i++) {
  •   $item = $array[$i];
  •   $item =~ tr/a-z/A-Z/;
  •   print "$item ";
  • }
  • foreach $item (@array)

19
Grep and map
  • grep CONDITION LIST
  • returns a list of all items from list that
    satisfy some condition.
  • For example:
  • @large = grep $_ > 10, (1,2,4,8,16,25); #
    returns (16,25)
  • @i_names = grep /i/, @array; # returns
    ("cindy","dirk")
  • map OPERATION LIST
  • is an extension of grep, and performs an
    arbitrary operation on each element of a list.
  • For example:
  • @more = map $_ + 3, (1,2,4,8,16,25); #
    returns (4,5,7,11,19,28)
  • @initials = map substr($_,0,1), @array; #
    returns ("a","b","c","d")
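
The same examples in the block form of grep and map, which many programmers find easier to read, plus one assumed extra line showing how the two functions chain.

```perl
use strict;
use warnings;

my @array = ("an", "bert", "cindy", "dirk");

my @large    = grep { $_ > 10 } (1, 2, 4, 8, 16, 25);   # (16, 25)
my @i_names  = grep { /i/ } @array;                     # ("cindy", "dirk")
my @more     = map  { $_ + 3 } (1, 2, 4, 8, 16, 25);    # (4, 5, 7, 11, 19, 28)
my @initials = map  { substr($_, 0, 1) } @array;        # ("a", "b", "c", "d")

# The two combine naturally: upper-case only the names containing "i".
my @shouted = map { uc($_) } grep { /i/ } @array;       # ("CINDY", "DIRK")

print "@large | @i_names | @more | @initials | @shouted\n";
```

Inside both blocks, $_ stands for the element currently being processed.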

20
Hashes (Associative Arrays)
  • associate keys with values
  • allows for almost instantaneous lookup of a value
    that is associated with some particular key
  • Existing, defined and true:
  • If the value for a key does not exist in the
    hash, the access to it returns the undef value.
  • special test function exists($hash{$key}), which
    returns true if the hash key exists in the hash
  • if($hash{$key})..., or if(defined($hash{$key}))
    ... are false if the key $key has no
    associated value; the first is also false when
    the value is 0 or the empty string
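
The difference between existing, defined and true can be made concrete with a small sketch; the hash contents are illustrative assumptions.

```perl
use strict;
use warnings;

my %hash = (
    zero    => 0,        # exists, defined, but false
    nothing => undef,    # exists, but not defined
    one     => 1,        # exists, defined and true
);

my %status;
for my $key (qw(zero nothing one missing)) {
    my $exists  = exists($hash{$key})  ? 1 : 0;
    my $defined = defined($hash{$key}) ? 1 : 0;
    my $true    = $hash{$key}          ? 1 : 0;
    $status{$key} = "$exists$defined$true";
    print "$key: exists=$exists defined=$defined true=$true\n";
}
```

Only exists distinguishes a key that was never stored from one stored with the value undef.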

21
Hashes (contd)
  • Examples
  • $wordfrequency{"the"} = 12731; # creates key
    "the", value 12731
  • $phonenumber{"An De Wilde"} = "31-20-6777871";
  • $index{$word} = $nwords;
  • $occurrences{$a}++; # if this is the first
    reference, the value associated with $a will
    be increased from 0 to 1
  • %birthdays = ("An","25-02-1975","Bert","12-10-1953",
    "Cindy","23-05-1969","Dirk","01-04-1961"); #
    fill the hash
  • %birthdays = (An => "25-02-1975", Bert =>
    "12-10-1953", Cindy => "23-05-1969", Dirk =>
    "01-04-1961" ); # fill the hash; the same as
    above, but more explicit
  • @list = %birthdays; # make a list of the
    key/value pairs
  • %copy_of_bdays = %birthdays; # copy a hash

22
Operations on Hashes
  • keys HASH returns a list with only the keys in
    the hash. As with any list, using it in a scalar
    context returns the number of keys in that list.
  • values HASH returns a list with only the values
    in the hash, in the same order as the keys
    returned by keys.
  • foreach $key (sort keys %hash ) {
  •   push @sortedlist, ($key , $hash{$key} );
  •   print "Key $key has value $hash{$key}\n";
  • }

23
Operations on Hashes
  • reverse the direction of the mapping, i.e.
    construct a hash with keys and values swapped:
  • %backwards = reverse %forward;
  • (if %forward has two identical values associated
    with different keys, those will end up as only a
    single element in %backwards)
  • hash slice
  • @birthdays{"An","Bert","Cindy","Dirk"} =
    ("25-02-1975","12-10-1953","23-05-1969","01-04-1961");
  • each( HASH ): traverse a hash
  • while (($name,$date) = each(%birthdays)) {
  •   print "${name}'s birthday is $date\n"; # the
      braces keep 's from being read as part of the
      variable name
  • }
  • alternative: foreach $key (keys %birthdays)
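
A compact sketch of the hash operations from the last few slides: keys in scalar context, reverse, a hash slice, and counting with ++. The word list is an assumed sample; the keys are sorted before printing because a hash has no predictable order.

```perl
use strict;
use warnings;

my %birthdays = (
    An    => "25-02-1975",
    Bert  => "12-10-1953",
    Cindy => "23-05-1969",
);

# Hash order is not predictable, so sort the keys for stable output.
for my $name (sort keys %birthdays) {
    print "${name}'s birthday is $birthdays{$name}\n";
}

my $count     = keys %birthdays;                   # scalar context: 3
my %backwards = reverse %birthdays;                # date => name
my @two       = @birthdays{"An", "Cindy"};         # hash slice, in that order

# Counting with a hash: a missing entry starts out undefined, ++ makes it 1.
my %occurrences;
$occurrences{$_}++ for qw(the man saw the dog);

print "$count $backwards{'12-10-1953'} @two $occurrences{the}\n";
```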

24
Multidimensional data structures
  • Perl does not really have multi-dimensional
    data structures, but a nice way of emulating
    them, using references
  • $matrix[$i][$j] = $x;
  • $lexicon1{"word"}[1] = $partofspeech;
  • $lexicon2{"word"}{"noun"} = $frequency;
  • Array of arrays
  • @matrix = ( # an array of references to
    anonymous arrays
  •   [1, 2, 3], [4, 5, 6], [7, 8, 9]
  • );

25
Multidimensional structures
  • Hash of arrays
  • %lexicon1 = ( # a hash
    from strings to anonymous arrays
  •   the => [ "Det", 12731 ],
  •   man => [ "Noun", 658 ],
  •   with => [ "Prep", 3482 ]
  • );
  • Hash of hashes
  • %lexicon2 = ( # a hash from strings to
    anonymous hashes of strings to numbers
  •   the => { Det => 12731 },
  •   man => { Noun => 658 , Verb => 12 },
  •   with => { Prep => 3482 }
  • );
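
A sketch of how such nested structures are read, extended and traversed. The entry for "dog" and its count are made-up values added only to show that intermediate levels are created on demand.

```perl
use strict;
use warnings;

my @matrix = ([1, 2, 3], [4, 5, 6], [7, 8, 9]);

my %lexicon2 = (
    the  => { Det  => 12731 },
    man  => { Noun => 658, Verb => 12 },
    with => { Prep => 3482 },
);

my $center = $matrix[1][1];                 # row 1, column 1: 5
my $verb   = $lexicon2{man}{Verb};          # 12

# Intermediate levels are created on demand (autovivification).
$lexicon2{dog}{Noun} = 40;                  # hypothetical entry

# Walk the nested hash; %{...} and @{...} dereference the inner structures.
for my $word (sort keys %lexicon2) {
    for my $pos (sort keys %{ $lexicon2{$word} }) {
        print "$word/$pos: $lexicon2{$word}{$pos}\n";
    }
}
my $row_length = scalar @{ $matrix[0] };    # 3
```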

26
Programming Example
  • A program that reads lines of text, gives a
    unique index number to each word and counts the
    word frequencies
  • #!/usr/local/bin/perl
  • # read all lines in the input
  • $nwords = 0;
  • while(defined($line = <>)) {
  •   # cut off leading and trailing whitespace
  •   $line =~ s/^\s*//;
  •   $line =~ s/\s*$//;
  •   # and put the words in an array
  •   @words = split /\s+/, $line;
  •   if(!@words) {
  •     # there are no words?
  •     next;
  •   }
  •   # process each word...
  •   while($word = pop @words) {
  •     # if it's unknown assign a new index
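
Since the listing above is cut off, here is one way the whole program could look. It is a reconstruction under assumptions, not the original slide code: the hash names %index and %frequency, the output format, and the sample @lines (used in place of reading from <>) are all chosen here. It iterates with for rather than while($word = pop @words), because the latter stops early if a word is "0".

```perl
use strict;
use warnings;

# Sample input (an assumption for illustration); the slide reads from <>.
my @lines = ("the man saw the dog\n", "\n", "  the dog barked  \n");

my $nwords = 0;     # number of distinct words seen so far
my %index;          # word => unique index number
my %frequency;      # word => number of occurrences

for my $line (@lines) {
    # cut off leading and trailing whitespace
    (my $trimmed = $line) =~ s/^\s+|\s+$//g;
    my @words = split /\s+/, $trimmed;
    next if !@words;                       # there are no words

    for my $word (@words) {
        # if it's unknown assign a new index
        if (!exists $index{$word}) {
            $index{$word} = $nwords++;
        }
        $frequency{$word}++;
    }
}

for my $word (sort keys %index) {
    print "$word\t$index{$word}\t$frequency{$word}\n";
}
```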

27
A note on sorting
  • If we would like to have the words sorted by
    their frequency instead of alphabetically, we need
    a construct that imposes a different sort order.
  • the sort function can use any sort order that is
    provided as an expression.
  • # the usual alphabetical sort order
  • sort { $a cmp $b } @list
  • !! $a and $b are placeholders for the two items
    from the list that are to be compared. Do not
    attempt to replace them with other variable
    names. Using $x and $y instead will not provide
    the same effect
  • # a numerical sort order
  • sort { $a <=> $b } @list
  • # for a reverse sort, change the order of the
    arguments
  • sort { $b <=> $a } @list
  • # sort the keys of a hash by their value instead of
    by their own identity: substitute the values for
    the arguments of sort
  • sort { $hash{$b} <=> $hash{$a} } ( keys
    %hash )
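
A sketch comparing the sort orders described above. The frequency table is sample data (the entry for "a" is made up to produce a tie), and the alphabetical tie-breaker in the last sort is an addition that makes the output deterministic.

```perl
use strict;
use warnings;

my %frequency = (the => 12731, man => 658, with => 3482, a => 658);

my @alphabetical = sort { $a cmp $b } keys %frequency;   # a man the with
my @numbers      = sort { $a <=> $b } (10, 9, 100, 1);   # 1 9 10 100
my @as_strings   = sort (10, 9, 100, 1);                 # 1 10 100 9

# Most frequent first; ties are broken alphabetically.
my @by_frequency = sort {
    $frequency{$b} <=> $frequency{$a} or $a cmp $b
} keys %frequency;                                       # the with a man

print "@alphabetical\n@numbers\n@as_strings\n@by_frequency\n";
```

Without a comparison block, sort compares items as strings, which is why 10 and 100 come before 9 in the third list.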

28
Basics about Subroutines
  • Calls to subroutines can be recognized because
    subroutine names often start with the special
    character &.
  • sub askForInput {
  •   print "Please enter something ";
  • }
  • # function call
  • &askForInput();
  • Tip: put related subroutines in a file (usually
    with the extension .pm, for perl module) and
    include the file with the command require
  • # files with subroutines are stored here
  • use lib "C:\\PERL\\MYLIBS";
  • # we will use this file
  • require "nlp";

29
Variables Scope
  • A variable $a is used both in the subroutine and
    in the main part of the program.
  • $a = 0;
  • print "$a\n";
  • sub changeA {
  •   $a = 1;
  • }
  • print "$a\n";
  • &changeA();
  • print "$a\n";
  • The value of $a is printed three times. Can you
    guess what values are printed?
  • $a is a global variable.

30
Variables Scope
  • Hide variables from the rest of the program using
    my.
  • my $a = 0;
  • print "$a\n";
  • sub changeA {
  •   my $a = 1;
  • }
  • print "$a\n";
  • &changeA();
  • print "$a\n";
  • What values are printed now?
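
The difference between the two scoping slides can be condensed into one sketch. The names $global, $lexical and change_both are assumptions; the sketch avoids $a because that name has a special role in sort.

```perl
use strict;
use warnings;

our $global  = 0;         # package (global) variable
my  $lexical = 0;         # lexical variable of the enclosing file scope

sub change_both {
    $global = 1;          # changes the one global variable
    my $lexical = 1;      # a new, separate variable that hides the outer one
    return $lexical;
}

my $inner = change_both();
print "global=$global lexical=$lexical inner=$inner\n";
# prints: global=1 lexical=0 inner=1
```

The assignment inside the subroutine only affects the outer program when the variable is not redeclared with my.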

31
Communication between subroutines and programs
  • Provide the arguments of the subroutine call:
  • &doSomething(2,"a",$abc).
  • Perl converts all arguments to a flat list.
    This means that &doSomething((2,"a"),$abc) will
    result in the same list of arguments as the
    earlier example.
  • Access the argument values inside the procedure
    with the special list @_.
  • E.g. my($number, $letter, $string) = @_; # reads
    the parameters from @_
  • A tricky problem is passing two or more lists as
    arguments of a subroutine. &sub(@a,@b): the
    subroutine receives the two lists as one big one
    and it will be unable to determine where the
    first ends and where the second starts.
  • pass the lists as reference arguments:
  • &sub(\@a,\@b).

32
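  • Subroutines also use a list as output.
  • the return statement from a subroutine:
  • return(1,2) or simply (1,2)
  • read the return values from the subroutine:
  • ($a,$b) = &subr().
  • Read the main program arguments from @ARGV
    (similar to argv in C, except that @ARGV does not
    contain the program name, which is in $0). There
    is no ARGC variable; the number of arguments is
    scalar(@ARGV).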
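
The argument-passing and return-value rules can be illustrated as follows. All subroutine names here (count_args, longer_list, min_max) are hypothetical helpers written for this sketch.

```perl
use strict;
use warnings;

# Flattened call: both lists arrive as one list in @_.
sub count_args {
    return scalar @_;
}

# Reference call: each array arrives as a single scalar reference.
sub longer_list {
    my ($first, $second) = @_;
    return @$first >= @$second ? "first" : "second";
}

# Subroutines return lists too.
sub min_max {
    my @sorted = sort { $a <=> $b } @_;
    return ($sorted[0], $sorted[-1]);
}

my @a = (1, 2, 3);
my @b = ("x", "y");

my $flat   = count_args(@a, @b);        # 5: the boundary between @a and @b is lost
my $byref  = count_args(\@a, \@b);      # 2: two references
my $longer = longer_list(\@a, \@b);     # "first"
my ($min, $max) = min_max(4, 9, 2, 7);  # (2, 9)

print "$flat $byref $longer $min $max\n";   # 5 2 first 2 9
```

Inside longer_list, @$first dereferences the reference back into the original array.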

33
More about file management
  • open(INFILE,"myfile") # reading
  • open(OUTFILE,">myfile") # writing
  • open(OUTFILE,">>myfile") # appending
  • open(INFILE,"someprogram |") # reading from
    program
  • open(OUTFILE,"| someprogram") # writing to program
  • opendir(DIR,"mydirectory") # open directory
  • Operations on an open file handle
  • $a = <INFILE> # read a line from INFILE into $a
  • @a = <INFILE> # read all lines from INFILE into @a
  • $a = readdir(DIR) # read a filename from DIR into
    $a
  • @a = readdir(DIR) # read all filenames from DIR
    into @a
  • read(INFILE,$a,$length) # read $length characters
    from INFILE into $a
  • print OUTFILE "text" # write some text in OUTFILE
  • Close files / directories
  • close(FILE) # close a file
  • closedir(DIR) # close a directory
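
A sketch of writing, appending, reading and listing, using lexical handles and the three-argument open instead of the bareword handles shown above. The scratch directory from File::Temp is an assumption that keeps the example from touching real files.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);      # scratch directory, removed at exit
my $file = "$dir/myfile";

open(my $out, '>', $file) or die "cannot open $file for writing: $!";
print $out "line one\n";
close($out);

open($out, '>>', $file) or die "cannot open $file for appending: $!";
print $out "line two\n";
close($out);

open(my $in, '<', $file) or die "cannot open $file for reading: $!";
my @lines = <$in>;                     # list context: all lines at once
close($in);

opendir(my $dh, $dir) or die "cannot open directory $dir: $!";
my @names = grep { $_ ne '.' && $_ ne '..' } readdir($dh);
closedir($dh);

print scalar(@lines), " lines, files: @names\n";   # 2 lines, files: myfile
```

Keeping the mode in a separate argument means a file name that happens to start with > or | cannot change how the file is opened.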

34
Other file management commands
  • binmode(HANDLE) change file mode from text to
    binary
  • unlink("myfile") delete file myfile
  • rename("file1","file2") change name of file
    file1 to file2
  • mkdir("mydir") create directory mydir
  • rmdir("mydir") delete directory mydir
  • chdir("mydir") change the current directory to
    mydir
  • system("command") execute command command
  • die("message") exit program with message message
  • warn("message") warn user about problem message
  • Example
  • open(INFILE,"myfile") or die("cannot open
    myfile!")
  • Other
  • About $_
  • Holds the current item by default, e.g. the
    current element in foreach, grep and map, or the
    current line in while (<>)
  • Examples