Title: Perl Tutorial-Basics
1Perl Tutorial-Basics
Venkat Thirumala
2Why Perl?
- Perl is built around regular expressions
- REs are good for string processing
- Therefore Perl is a good scripting language
- Perl is especially popular for CGI scripts
- Perl makes full use of the power of UNIX
- Short Perl programs can be very short
- Perl is designed to make the easy jobs easy,
without making the difficult jobs impossible. --
Larry Wall, Programming Perl
3Why not Perl?
- Perl is very UNIX-oriented
- Perl is available on other platforms...
- ...but isn't always fully implemented there
- However, Perl is often the best way to get some UNIX capabilities on less capable platforms
- Perl does not scale well to large programs
- Weak subroutines, heavy use of global variables
- Perl's syntax is not particularly appealing
4What is a scripting language?
- Operating systems can do many things
- copy, move, create, delete, compare files
- execute programs, including compilers
- schedule activities, monitor processes, etc.
- A command-line interface gives you access to these functions, but only one at a time
- A scripting language is a wrapper language that integrates OS functions
5Major scripting languages
- UNIX has sh, Perl
- Macintosh has AppleScript, Frontier
- Windows has no major scripting languages
- probably due to the weaknesses of DOS
- Generic scripting languages include
- Perl (most popular)
- Tcl (easiest for beginners)
- Python (new, Java-like, best for large programs)
6Perl Example 1
#!/usr/local/bin/perl
# Program to do the obvious
print 'Hello world.';    # Print a message
7Comments on Hello, World
- Comments are # to end of line
- But the first line, #!/usr/local/bin/perl, tells where to find the Perl compiler on your system
- Perl statements end with semicolons
- Perl is case-sensitive
- Perl is compiled and run in a single operation
8Perl Example 2
#!/ex2/usr/bin/perl
# Remove blank lines from a file
# Usage: singlespace < oldfile > newfile
while ($line = <STDIN>) {
    if ($line eq "\n") { next; }
    print "$line";
}
9More Perl notes
- On the UNIX command line
- < filename means to get input from this file
- > filename means to send output to this file
- In Perl, <STDIN> reads a line from the input file; STDOUT is the filehandle for the output file
- Scalar variables start with $
- Scalar variables hold strings or numbers, and they are interchangeable
- Examples
- $priority = 9;
- $priority = '9';
- Array variables start with @
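The interchangeability of strings and numbers in scalars can be sketched as follows. This is a minimal illustration assuming Perl 5 (for `my` and `use strict`); `add_one` is a hypothetical helper that does not appear in the slides.

```perl
use strict;
use warnings;

# Hypothetical helper: adds one to whatever scalar it is given.
sub add_one {
    my ($value) = @_;
    return $value + 1;    # a numeric string is converted to a number here
}

my $priority = '9';                       # stored as a string...
print add_one($priority), "\n";           # ...used as a number: prints 10
print "Priority " . add_one(9) . "\n";    # a number used as a string: prints "Priority 10"
```

The conversion happens at the operator: `+` asks for numbers and `.` asks for strings, whatever the scalar currently holds.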
10Perl Example 3
#!/usr/local/bin/perl
# Usage: fixm <filenames>
# Replace \r with \n -- replaces input files
foreach $file (@ARGV) {
    print "Processing $file\n";
    if (-e "fixm_temp") { die "File fixm_temp already exists!\n"; }
    if (! -e $file) { die "No such file: $file!\n"; }
    open DOIT, "| tr \'\\015' \'\\012' < $file > fixm_temp"
        or die "Can't: tr '\\015' '\\012' < $file > fixm_temp\n";
    close DOIT;
    open DOIT, "| mv -f fixm_temp $file"
        or die "Can't: mv -f fixm_temp $file\n";
    close DOIT;
}
11Comments on example 3
- In "Usage: fixm <filenames>", the angle brackets just mean to supply a list of file names here
- In UNIX text editors, the \r (carriage return) character usually shows up as ^M (hence the name fixm_temp)
- The UNIX command tr '\015' '\012' replaces all \015 characters (\r) with \012 (\n) characters
- The format of the open and close commands is
- open fileHandle, fileName
- close fileHandle
- "| tr \'\\015' \'\\012' < $file > fixm_temp" says: start the tr command, have it take its input from $file, and put the output on fixm_temp
12Arithmetic in Perl
$a = 1 + 2;     # Add 1 and 2 and store in $a
$a = 3 - 4;     # Subtract 4 from 3 and store in $a
$a = 5 * 6;     # Multiply 5 and 6
$a = 7 / 8;     # Divide 7 by 8 to give 0.875
$a = 9 ** 10;   # Nine to the power of 10, that is, 9^10
$a = 5 % 2;     # Remainder of 5 divided by 2
++$a;           # Increment $a and then return it
$a++;           # Return $a and then increment it
--$a;           # Decrement $a and then return it
$a--;           # Return $a and then decrement it
13String and assignment operators
$a = $b . $c;   # Concatenate $b and $c
$a = $b x $c;   # $b repeated $c times
$a = $b;        # Assign $b to $a
$a += $b;       # Add $b to $a
$a -= $b;       # Subtract $b from $a
$a .= $b;       # Append $b onto $a
14Single and double quotes
- $a = 'apples';
- $b = 'bananas';
- print $a . ' and ' . $b;
- prints: apples and bananas
- print '$a and $b';
- prints: $a and $b
- print "$a and $b";
- prints: apples and bananas
15Arrays
- @food = ("apples", "bananas", "cherries");
- But...
- print $food[1];
- prints "bananas"
- @morefood = ("meat", @food);
- @morefood == ("meat", "apples", "bananas", "cherries");
- ($a, $b, $c) = (5, 10, 20);
16push and pop
- push adds one or more things to the end of a list
- push (@food, "eggs", "bread");
- push returns the new length of the list
- pop removes and returns the last element
- $sandwich = pop(@food);
- $len = @food;    # $len gets length of @food
- $#food           # returns index of last element
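The list operations above can be tried together in one short script. This is a sketch assuming Perl 5; the variable names `$new_length` and `$last_index` are illustrative and not from the slides.

```perl
use strict;
use warnings;

my @food = ("apples", "bananas", "cherries");

my $new_length = push(@food, "eggs", "bread");   # push returns the new length
my $sandwich   = pop(@food);                     # removes and returns "bread"
my $len        = @food;                          # array in scalar context: its length
my $last_index = $#food;                         # index of the last element

print "$new_length $sandwich $len $last_index\n";   # prints "5 bread 4 3"
```

Since indexing starts at zero, `$#food` is always one less than the length.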
17foreach
# Visit each item in turn and call it $morsel
foreach $morsel (@food)
{
    print "$morsel\n";
    print "Yum yum\n";
}
18Tests
- Zero is false. This includes: 0, '0', "0", '', ""
- Anything not false is true
- Use == and != for numbers, eq and ne for strings
- &&, ||, and ! are and, or, and not, respectively.
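A few edge cases of these rules are easy to get wrong, so here is a small sketch that probes them. It assumes Perl 5; `truth` is a hypothetical helper written only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: reports how Perl treats a value in a test.
sub truth {
    my ($value) = @_;
    return $value ? "true" : "false";
}

print truth(0), "\n";       # false
print truth('0'), "\n";     # false
print truth(''), "\n";      # false
print truth('0.0'), "\n";   # true: as a string it is neither '' nor '0'
print truth('00'), "\n";    # true, for the same reason
print truth(' '), "\n";     # true: a space is not the empty string

# == compares as numbers, eq and ne compare as strings
print "numerically equal\n" if '1.0' == 1;
print "different strings\n" if '1.0' ne '1';
```

The last two lines show why the choice between == and eq matters: the same pair of values is equal under one and different under the other.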
19for loops
- for loops are just as in C or Java
- for ($i = 0; $i < 10; $i++) { print "$i\n"; }
20while loops
#!/usr/local/bin/perl
print "Password? ";
$a = <STDIN>;
chop $a;                # Remove the newline at end
while ($a ne "fred")
{
    print "sorry. Again? ";
    $a = <STDIN>;
    chop $a;
}
21do..while and do..until loops
#!/usr/local/bin/perl
do
{
    print "Password? ";
    $a = <STDIN>;
    chop $a;
}
while ($a ne "fred");
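The slide title also names do..until, which is the same loop with the test inverted. The sketch below assumes Perl 5 and, so that it runs without a keyboard, takes its guesses from a list rather than from STDIN; `attempts_until_fred` is a hypothetical helper.

```perl
use strict;
use warnings;

# Hypothetical helper: consumes guesses from a list instead of STDIN and
# returns how many were used before "fred" appeared (or the list ran out).
sub attempts_until_fred {
    my @guesses = @_;
    my $count = 0;
    my $guess;
    do {
        $guess = shift @guesses;    # the body always runs at least once
        $count++;
    } until (!defined($guess) || $guess eq "fred");
    return $count;
}

print attempts_until_fred("barney", "wilma", "fred"), "\n";   # prints 3
```

`do { ... } until (COND)` behaves like `do { ... } while (!COND)`; either way the test is made after the body, not before it.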
22if statements
if ($a)
{
    print "The string is not empty\n";
}
else
{
    print "The string is empty\n";
}
23if - elsif statements
if (!$a)
    { print "The string is empty\n"; }
elsif (length($a) == 1)
    { print "The string has one character\n"; }
elsif (length($a) == 2)
    { print "The string has two characters\n"; }
else
    { print "The string has many characters\n"; }
24Why Perl?
- Two factors make Perl important
- Pattern matching/string manipulation
- Based on regular expressions (REs)
- REs are similar in power to those in Formal Languages
- but have many convenience features
- Ability to execute UNIX commands
- Less useful outside a UNIX environment
25Basic pattern matching
- $sentence =~ /the/
- True if $sentence contains "the"
- $sentence = "The dog bites."; if ($sentence =~ /the/) is false
- because Perl is case-sensitive
- !~ is "does not contain"
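The matching operators can be checked with a few one-line tests. This is a sketch assuming Perl 5; the /i modifier used in the last test makes a match ignore case.

```perl
use strict;
use warnings;

my $sentence = "The dog bites.";

print "contains 'the'\n" if $sentence =~ /the/;    # not printed: case differs
print "contains 'The'\n" if $sentence =~ /The/;    # printed
print "has no 'cat'\n"   if $sentence !~ /cat/;    # printed
print "contains 'the', ignoring case\n" if $sentence =~ /the/i;   # printed
```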
26RE special characters
.    # Any single character except a newline
^    # The beginning of the line or string
$    # The end of the line or string
*    # Zero or more of the last character
+    # One or more of the last character
?    # Zero or one of the last character
27RE examples
^.*$       # matches the entire string
hi.*bye    # matches from "hi" to "bye" inclusive
x +y       # matches x, one or more blanks, and y
^Dear      # matches "Dear" only at beginning
bags?      # matches "bag" or "bags"
hiss+      # matches "hiss", "hisss", "hissss", etc.
28Square brackets
[qjk]       # Either q or j or k
[^qjk]      # Neither q nor j nor k
[a-z]       # Anything from a to z inclusive
[^a-z]      # No lower case letters
[a-zA-Z]    # Any letter
[a-z]+      # Any non-zero sequence of lower case letters
29More examples
[aeiou]+         # matches one or more vowels
[^aeiou]+        # matches one or more nonvowels
[0-9]+           # matches an unsigned integer
[0-9A-F]         # matches a single hex digit
[a-zA-Z]         # matches any letter
[a-zA-Z0-9_]+    # matches identifiers
30More special characters
\n    # A newline
\t    # A tab
\w    # Any alphanumeric; same as [a-zA-Z0-9_]
\W    # Any non-word char; same as [^a-zA-Z0-9_]
\d    # Any digit. The same as [0-9]
\D    # Any non-digit. The same as [^0-9]
\s    # Any whitespace character
\S    # Any non-whitespace character
\b    # A word boundary, outside [] only
\B    # No word boundary
31Quoting special characters
\|    # Vertical bar
\[    # An open square bracket
\)    # A closing parenthesis
\*    # An asterisk
\^    # A caret symbol
\/    # A slash
\\    # A backslash
32Alternatives and parentheses
jelly|cream    # Either jelly or cream
(eg|le)gs      # Either eggs or legs
(da)+          # Either da or dada or dadada or...
33The $_ variable
- Often we want to process one string repeatedly
- The $_ variable holds the current string
- If a subject is omitted, $_ is assumed
- Hence, the following are equivalent
- if ($sentence =~ /under/) ...
- $_ = $sentence; if (/under/) ...
34Case-insensitive substitutions
- s/london/London/i
- case-insensitive substitution; will replace london, LONDON, London, LoNDoN, etc.
- You can combine global substitution with case-insensitive substitution
- s/london/London/gi
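The difference the g modifier makes can be seen by running both forms on the same text. This is a sketch assuming Perl 5; the sample string is invented for the illustration.

```perl
use strict;
use warnings;

my $text = "london calling; LONDON answering";

# Copy the text first, then substitute in the copy, so $text stays unchanged.
(my $first_only = $text) =~ s/london/London/i;     # only the first match
(my $all        = $text) =~ s/london/London/gi;    # every match

print "$first_only\n";    # London calling; LONDON answering
print "$all\n";           # London calling; London answering
```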
35Remembering patterns
- Any part of the pattern enclosed in parentheses is assigned to the special variables $1, $2, $3, ..., $9
- Numbers are assigned according to the left (opening) parentheses
- "The moon is high" =~ /The (.*) is (.*)/
- Afterwards, $1 = "moon" and $2 = "high"
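Captured pieces are only meaningful after a successful match, so it is worth testing the match before using them. The sketch below assumes Perl 5; `subject_and_state` is a hypothetical helper built around the pattern from this slide.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the two captured pieces, or an empty list
# when the pattern does not match.
sub subject_and_state {
    my ($sentence) = @_;
    if ($sentence =~ /The (.*) is (.*)/) {
        return ($1, $2);    # set by the match that just succeeded
    }
    return ();
}

my ($what, $how) = subject_and_state("The moon is high");
print "$what / $how\n";     # prints "moon / high"
```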
36Dynamic matching
- During the match, an early part of the match that is tentatively assigned to $1, $2, etc. can be referred to by \1, \2, etc.
- Example
- \b.+\b matches one or more characters between two word boundaries, such as a single word
- /(\b.+\b) \1/ matches repeated words
- "Now is the the time" =~ /(\b.+\b) \1/
- Afterwards, $1 = "the"
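A slightly tighter variant of the repeated-word pattern uses \w+ so that the captured group cannot spill across several words. This is a sketch assuming Perl 5, not the pattern from the slide; `repeated_word` is a hypothetical helper.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the first immediately repeated word, or undef.
sub repeated_word {
    my ($text) = @_;
    # \1 must match exactly the text that group 1 captured;
    # the final \b stops "the then" from counting as a repeat.
    return $text =~ /\b(\w+) \1\b/ ? $1 : undef;
}

print repeated_word("Now is the the time"), "\n";    # prints "the"
```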
37tr
- tr does character-by-character translation
- tr returns the number of substitutions made
- $sentence =~ tr/abc/edf/;
- replaces a with e, b with d, c with f
- $count = ($sentence =~ tr/*/*/);
- counts asterisks
- tr/a-z/A-Z/;
- converts to all uppercase
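Both uses of tr, counting and translating, can be wrapped in small subroutines. This is a sketch assuming Perl 5; `count_stars` and `shout` are hypothetical names chosen for the illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: with an empty replacement list, tr changes nothing
# and simply returns how many characters matched.
sub count_stars {
    my ($text) = @_;
    return ($text =~ tr/*//);
}

# Hypothetical helper: translate a copy, so the caller's string is untouched.
sub shout {
    my ($text) = @_;
    (my $copy = $text) =~ tr/a-z/A-Z/;
    return $copy;
}

print count_stars("a*b*c"), "\n";    # prints 2
print shout("a*b*c"), "\n";          # prints A*B*C
```

Unlike a regular expression, the lists in tr are plain characters and ranges, so the asterisk needs no backslash there.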
38split
- split breaks a string into parts
- $info = "Caine:Michael:Actor:14, Leafy Drive"; @personal = split(/:/, $info);
- @personal == ("Caine", "Michael", "Actor", "14, Leafy Drive");
39Associative arrays
- Associative arrays allow lookup by name rather than by index
- Associative array names begin with %
- Example
- %fruit = ("apples", "red", "bananas", "yellow", "cherries", "red");
- Now, $fruit{"bananas"} returns "yellow"
- Note: braces, not parentheses
40Associative Arrays II
- Can be converted to normal arrays: @food = %fruit;
- You cannot index an associative array, but you can use the keys and values functions
- foreach $f (keys %fruit) { print ("The color of $f is " . $fruit{$f} . "\n"); }
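Because keys come back in no particular order, a loop over an associative array usually sorts them first. The sketch below assumes Perl 5 and reuses the fruit table; the "grapes" lookup is an invented example of a missing key.

```perl
use strict;
use warnings;

my %fruit = ("apples", "red", "bananas", "yellow", "cherries", "red");

# Sort the keys so the output is the same on every run.
foreach my $f (sort keys %fruit) {
    print "The color of $f is $fruit{$f}\n";
}

# exists tests for a key without creating an entry for it.
print "no entry for grapes\n" unless exists $fruit{"grapes"};
```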
41Calling subroutines
- Assume you have a subroutine printargs that just prints out its arguments
- Subroutine calls
- printargs("perly", "king")
- Prints "perly king"
- printargs("frog", "and", "toad")
- Prints "frog and toad"
42Defining subroutines
- Here's the definition of printargs
- sub printargs { print "@_\n"; }
- Where are the parameters?
- Parameters are put in the array @_ which has nothing to do with $_
43Returning a result
- The value of a subroutine is the value of the
last expression that was evaluated
sub maximum {
    if ($_[0] > $_[1]) { $_[0]; }
    else { $_[1]; }
}
$biggest = maximum(37, 24);
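The same subroutine can be written with an explicit return and named variables, which tends to be easier to read once there are more than two arguments. This is a sketch assuming Perl 5 (it relies on `my`), not the version from the slide.

```perl
use strict;
use warnings;

# Variant of maximum that accepts any number of arguments.
sub maximum {
    my @numbers = @_;             # copy the arguments out of @_
    my $best = shift @numbers;    # start with the first one
    foreach my $n (@numbers) {
        $best = $n if $n > $best;
    }
    return $best;                 # explicit return instead of "last expression"
}

print maximum(37, 24), "\n";       # prints 37
print maximum(3, 99, 12), "\n";    # prints 99
```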
44Local variables
- @_ is local to the subroutine, and...
- ...so are $_[0], $_[1], $_[2], ...
- local creates local variables
45Example subroutine
sub inside {
    local($a, $b);                  # Make local variables
    ($a, $b) = ($_[0], $_[1]);      # Assign values
    $a =~ s/ //g;                   # Strip spaces from
    $b =~ s/ //g;                   #   local variables
    ($a =~ /$b/ || $b =~ /$a/);     # Is $b inside $a
                                    #   or $a inside $b?
}
inside("lemon", "dole money");      # true
46Perl V
- There are only a few differences between Perl 4 and Perl 5
- Perl 5 has modules
- Perl 5 modules can be treated as classes
- Perl 5 has auto variables
47Process Management
- Use the system function to launch a new process from within a Perl program
- The function hands a single string to the shell to execute
- You can direct the output of the command to either the display screen or a file
- Multiple commands can be specified, separated by semicolon or newline
- Zero is returned if the command executed without any problems
48Process Management
- system("date");              # Output to STDOUT
- system("date > cmd_out");    # Output to file cmd_out
- system("date; w");
- $i = 0; $file = "cmd_out." . ++$i; system("date; w > $file");
- $i = 0; $file = "cmd_out." . ++$i; system("(date; w) > $file");
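The value that system returns deserves a closer look, because it is the raw wait status rather than the exit code itself. The sketch below assumes Perl 5 on a UNIX-like system where the standard `true` and `false` commands are available; `run` is a hypothetical helper.

```perl
use strict;
use warnings;

# Hypothetical helper: runs a command and returns its exit code.
sub run {
    my ($command) = @_;
    my $status = system($command);
    return -1 if $status == -1;    # the command could not be started at all
    return $status >> 8;           # the exit code lives in the high byte
}

print "true exited with ", run("true"), "\n";      # 0 means success
print "false exited with ", run("false"), "\n";    # non-zero means failure
```

Note the inversion relative to Perl's own tests: zero, which Perl treats as false, is what a successful command returns, so a check usually reads `system(...) == 0 or die ...`.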
49Process Management
- Backquotes
- Alternate way to launch process
- Put the command name within backquotes
- $time = "Time is: " . `date`;
- Time is: Thu Mar 3 01:00:57 EST 2005
50Process Management
- Processes as Filehandles
- Can create processes with the open command used for filehandles
- We can read the output of the command or give input to it
- open(filehandle, "cmd |");    # open cmd for reading
- open(filehandle, "| cmd");    # open cmd for writing
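Both directions of a process filehandle can be sketched in one script. This assumes Perl 5 on a UNIX-like system with the standard `echo` and `sort` commands; the handle names IN and OUT and the sample data are invented for the illustration.

```perl
use strict;
use warnings;

$| = 1;    # flush our own output promptly so it is not reordered around sort's

# Reading: a trailing | runs the command and lets us read what it prints.
open(IN, "echo one; echo two |") or die "Can't start echo: $!\n";
my @lines = <IN>;
close(IN);
print "read ", scalar(@lines), " lines\n";    # prints "read 2 lines"

# Writing: a leading | runs the command and feeds it whatever we print.
open(OUT, "| sort") or die "Can't start sort: $!\n";
print OUT "pear\n", "apple\n";
close(OUT);    # close waits for sort to finish; it prints apple, then pear
```

Closing the handle matters more here than with ordinary files, since that is the point at which Perl waits for the child process to finish.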
51Process Management
- open(LF, "ls -lt |");
- @files = <LF>;
open(LPR, "| lpr -Pprinter1");
open(RPT, "report");
print LPR <RPT>;
close(LPR);
close(RPT);
52Process Management
open(LF, "ls -lt |");
open(LPR, "| lpr -Pprinter1");
while (<LF>) {
    unless (/passwd/) { print LPR $_; }
}
close(LPR);
close(LF);
53Process Management
- exec
- Similar to system command
- Replaces the current Perl process with the command (started through the shell when needed)
- Hence, after a successful exec, the Perl program is gone
- exec("date");
54Recommended Books
Randal L. Schwartz: Learning Perl (4th ed.) - O'Reilly
Tom Christiansen: Perl Cookbook (2nd ed.) - O'Reilly
Larry Wall: Programming Perl (3rd ed.) - O'Reilly