1
Intermediate Perl
  • by
  • Benjamin J. Lynch
  • blynch@msi.umn.edu

2
Introduction
  • Perl is a powerful interpreted language that
    takes very little knowledge to get started. It
    can be used to automate many research tasks with
    little effort.
  • Perl's greatest strength, and its greatest
    weakness, is that the same task can be
    accomplished with very different pieces of code.

3
Outline
  • Review of Perl
  • Variable types
  • Context
  • Operators
  • Control structure
  • Pattern Matching
  • Subroutines
  • Context
  • References
  • grep
  • map
  • modules

4
When should I use Perl?
  • Perl stands for Practical Extraction and Report
    Language
  • Perl is most useful for
  • parsing files to extract desired data
  • Doing almost anything you can do in a shell
    script
  • CGI scripts to generate HTML for web pages
  • updating or retrieving information from databases
  • acting as an interface between programs

5
Programming Style
  • Questions you should ask
  • Who else might look at the code?
  • Co-workers?
  • Complete strangers?
  • How often will the code be modified?
  • Remember your target audience
  • There is no substitute for comments

6
An Interpreted Language
  • Perl programs are also called Perl scripts
    because Perl is an interpreted language.
  • When you execute a Perl script, the script is
    compiled into a set of instructions for the Perl
    interpreter
  • This set of instructions (or parse tree) is sent
    to the Perl interpreter
  • The Perl interpreter shares many similarities with
    the virtual machine in Java
  • There is no need to compile a Perl script as a
    separate preliminary step, making Perl scripts
    similar to shell scripts (at least on the
    exterior).

7
A Simple Perl Script
  • #!/usr/bin/perl
  • print "Hello world!\n";
  • blynch@msi$ chmod +x hello.pl
  • blynch@msi$ ./hello.pl
  • Hello world!
  • blynch@msi$
  • \n is a newline.
  • The routine print will print the item or list of
    items that follows.

8
Variable Types
  • Scalar
  • Reference (scalar pointer to another variable)
  • List (array)
  • Hash (associative array)

9
Scalars
  • Examples
  • $var = 3;
  • $name = "Larry";
  • $float = 1.1235813;
  • $sum = $a + 1.2;

10
  • A scalar is a single value.
  • $number = 1;
  • $text = "Hello world!";
  • $a = 1.2;
  • $b = 1.3;
  • $sum = $a + $b;
  • print "$sum\n";
  • 2.5

A scalar can be an integer, a 64-bit floating-point
number, a string, or a reference.
The way that the data is stored (integer,
floating point, ...) does not need to be specified.
The Perl interpreter will determine it
automatically.
11
Lists
  • A list (or array) of values can be specified
    like
  • @number_list = (1,1,2,3,5,8,13,21);
  • @grocery_list = ("apples","chicken","canned
    soup");
  • The name of an array always starts with a @

12
Lists (arrays)
  • @mylist = (1,2,2,3,4,4,4);
  • @names = ("Larry", "Moe");
  • push(@names, "Curly");
  • print "@names";
  • Larry Moe Curly

push adds an item to the end of a list
13
Lists
  • A list (or array) of values
  • @grocery_list = ("apples","chicken","canned
    soup");
  • print $grocery_list[2];
  • canned soup

Note the numbering of elements: the first element has index 0.
A $ is used in the print statement because of
the context. We only want print to handle a
single value from the array and so we use $ to
denote the scalar context.
14
Hashes (associative arrays)
  • A Hash is an associative array
  • Instead of using an integer index, a hash uses a
    key to access elements of the hash
  • %lunch = (monday => "pizza",
  •           tuesday => "burritos",
  •           wednesday => "sandwich",
  •           thursday => "fish");
  • print "on Tuesday I'll eat $lunch{tuesday}";
  • on Tuesday I'll eat burritos

15
Hashes (associative arrays)
  • A Hash can be created with a list of key/value
    pairs.
  • Each key has one value associated with it.
  • %hash = (Larry => 1, Moe => 2, Curly => 3);
  • %hash = ("Larry", 1, "Moe", 2, "Curly", 3);

Either of these works to specify a hash
16
Variable Context
  • @number_list = (1,1,2,3,5,8,13,21);
  • @grocery_list = ("apples","chicken","canned
    soup");
  • print @grocery_list;
  • appleschickencanned soup

If we use the array (or list) context, the print
command will print out all elements from the
array.
17
Variable Context
  • @number_list = (1,1,2,3,5,8,13,21);
  • @grocery_list = ("apples","chicken","canned
    soup");
  • print $grocery_list[1];
  • chicken

If we use the scalar context, we must specify the
element we want to print from the list.
18
Variable Context
  • @grocery_list = ("apples","chicken","canned
    soup");
  • $var = @grocery_list;
  • print $var;
  • 3

If we request a scalar from a list, the list
will return its length.
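
The three context slides above can be condensed into one short script. This is only a sketch: the array comes from the slides, while `$second` and `$count` are names introduced here for illustration.

```perl
use strict;
use warnings;

my @grocery_list = ("apples", "chicken", "canned soup");

# List context: print receives every element of the array.
print @grocery_list, "\n";      # appleschickencanned soup

# Element access: the $ sigil asks for a single scalar from the array.
my $second = $grocery_list[1];
print "$second\n";              # chicken

# Scalar context: assigning an array to a scalar yields its length.
my $count = @grocery_list;
print "$count\n";               # 3
```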
19
Perl Operators
  • $mass*$height Multiplication
  • $a + $b Addition
  • $a - $b Subtraction
  • $a / $b Division
  • $str1.$str2 Concatenate
  • $count++ Increment $count by 1
  • $missing-- Decrease $missing by 1
  • $total += $subtotal Increase $total by
    $subtotal
  • $interest *= $factor Set $interest to
    $interest*$factor
  • $string .= " more" Append " more" to the end of
    $string

20
rand Perl
  • rand($num)
  • returns a random, double-precision floating-point
    number greater than or equal to 0 and less than
    $num.
  • $var = rand(4);

21
Control structure
  • #!/usr/bin/perl
  • @my_grocery_list = ("apples","chicken","canned
    soup");
  • foreach $item (@my_grocery_list) {
  •   purchase($item);
  • }
  • while ( some condition is true ) {
  •   do_this;
  • }

22
Control structure
  • Two ways to if/then
  • if ($condition) { print "It is true\n"; }
  • print "It is true\n" if $condition;

23
Retrieving a random element from a list
  • @greeting = ("Hello","Greetings","Hola","Howdy");
  • print $greeting[rand @greeting];
  • print $greeting[rand 4];
  • print $greeting[2.59196266661263168];
  • print $greeting[2];
  • print "Hola";
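
The chain of substitutions on this slide can be run as a small script. In this sketch `$index` is a name introduced here, and the `int` call only makes explicit the truncation that Perl applies to a fractional array index.

```perl
use strict;
use warnings;

my @greeting = ("Hello", "Greetings", "Hola", "Howdy");

# "rand @greeting" evaluates the array in scalar context (its length, 4),
# so rand returns a float that is >= 0 and < 4.
my $index = int(rand @greeting);

# The index is now one of 0, 1, 2, 3, so this is always a valid element.
print "$greeting[$index]\n";
```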

24
Subroutines
  • Defined like this
  • sub my_sub_name {
  •   do something
  • }
  • Used like this
  • my_sub_name(variables passed);

25
Subroutines
  • Variables passed to a subroutine enter the
    routine as a single list
  • @list1 = ("a", "b", "d");
  • $scalar = 42;
  • mysub(@list1, $scalar);
  • sub mysub {
  •   print "@_";
  • }

a b d 42
26
Returning values from subroutines
  • Subroutines return whatever is returned by the
    return statement or else the last item evaluated
    in the subroutine.
  • @list1 = (2,3);
  • print mymult(@list1);
  • sub mymult {
  •   $product = $_[0] * $_[1];
  •   return $product;
  • }

6
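
The two ways of returning a value can be compared side by side. This is a sketch: `mymult` follows the slide, while `mymult_implicit`, `$x`, and `$y` are names made up for the example.

```perl
use strict;
use warnings;

# Explicit return statement.
sub mymult {
    my ($x, $y) = @_;    # copy the arguments out of @_
    return $x * $y;
}

# No return statement: the last expression evaluated is returned.
sub mymult_implicit {
    my ($x, $y) = @_;
    $x * $y;
}

my @list1 = (2, 3);
print mymult(@list1), "\n";            # 6
print mymult_implicit(@list1), "\n";   # 6
```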
27
Pattern Matching
  • Perl uses a very robust pattern matching syntax
  • The most basic pattern match looks like
  • $string =~ /some pattern/
  • $string = "1 2 three";
  • if ($string =~ /2/) {
  •   print "the number 2 is in the string\n";
  • }

In Perl, everything except undef, the empty string "",
and 0 (including the string "0") is considered TRUE
28
Pattern Matching
  • $, = "\n";
  • $string = "1 2 hello 2 5";
  • $matching = ($string =~ /\d/);
  • print $matching;
  • 1
  • @matches = ($string =~ /\d/g);
  • print @matches;
  • 1
  • 2
  • 2
  • 5

g is for global. This will allow the pattern to
be matched multiple times.
\d will match any single digit
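
The difference between the two matches on this slide is the context in which the match is evaluated. A minimal sketch using the slide's own string:

```perl
use strict;
use warnings;

my $string = "1 2 hello 2 5";

# Scalar context: the match returns true (1) if any digit is found.
my $matching = ($string =~ /\d/);

# List context with /g: the match returns every digit it finds.
my @matches = ($string =~ /\d/g);

print "$matching\n";               # 1
print join(" ", @matches), "\n";   # 1 2 2 5
```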
29
Pattern Matching
  • $, = "\n";
  • $string = "1.45 1.482 1.938 other text
    10.2849";
  • print ($string =~ /\d\.\d+/g);
  • 1.45
  • 1.482
  • 1.938
  • 0.2849

30
Pattern matching
  • /pattern/
  • /(sub-expression1)(sub-expression2)/
  • \d digit
  • \s whitespace
  • \S non-whitespace
  • pattern{2} will match pattern exactly twice
  • [character list] defined character class
  • [abcDEF]
  • [^b] NOT b
  • | OR statement - it will match the pattern on
    either side

31
/(bb|[^b]{2})/
This is written on a T-Shirt I own
32
/(bb|[^b]{2})/
bb
{2} We want 2 of them
[^b] New character class NOT b
| OR statement
33
Pattern Matching
  • ------------------------------------------------
  • Charge Models 2 and 3 (CM2 and CM3) and
  • Solvation Model SM5.42 GAMESSPLUS version 4.3
  • ------------------------------------------------
  • Gas-phase
  • ------------------------------------------------
  • Center Atomic CM3 RLPA Lowdin
  • Number Number Charge Charge Charge
  • ------------------------------------------------
  • 1 3 .218 -1.090 -.938
  • Gas-phase dipole moment (Debye)
  • ------------------------------------------------
  • X Y Z Total
  • CM3 -.718 -.592 -1.748 1.980
  • RLPA -.327 1.122 -.840 1.440
  • Lowdin -.116 1.662 -.761 1.832
  • ------------------------------------------------

34
Pattern Matching
  • if (/ CM3\s+(-?\d*\.\d+)\s+(-?\d*\.\d+)\s+(-?\d*\.\d+)\s+(-?\d*\.\d+)\s*/)
  •   { $amsol[9] = $4; }
  • if (/ CM3\s+(-?\d*\.\d+\s+){3}(-?\d*\.\d+)\s*/)
  •   { $amsol[9] = $2; }
  • if (/ CM3(\s+\S+){3}\s+(\S+)/)
  •   { $amsol[9] = $2; }
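
To see one of these patterns at work, the sketch below applies the repeated-group form to two lines modelled on the dipole-moment table from the previous slide. The exact column spacing of the real output is an assumption, and `@lines` and `$total` are names introduced here.

```perl
use strict;
use warnings;

# Two lines modelled on the dipole-moment table (spacing is assumed).
my @lines = (
    " CM3      -.718     -.592    -1.748     1.980\n",
    " RLPA     -.327     1.122     -.840     1.440\n",
);

my $total;
foreach (@lines) {
    # Three numeric fields (X, Y, Z), then the Total captured in $2.
    # \d* allows numbers such as -.718 that have no leading digit.
    if (/ CM3\s+(-?\d*\.\d+\s+){3}(-?\d*\.\d+)/) {
        $total = $2;
    }
}

print "CM3 total dipole: $total\n";   # CM3 total dipole: 1.980
```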

36
Substitutions
  • s/search pattern/replace/
  • $string = "words9words383words";
  • $string =~ s/\d+/, /g;
  • print $string;

words, words, words
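
A runnable version of this substitution is sketched below. It also shows that `s///` returns the number of replacements it made; `$count` is a name introduced for the example.

```perl
use strict;
use warnings;

my $string = "words9words383words";

# Replace each run of digits with ", ". The /g flag repeats the
# substitution for every match; the return value is the number made.
my $count = ($string =~ s/\d+/, /g);

print "$string\n";   # words, words, words
print "$count\n";    # 2
```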
37
Special Variables
  • $1, $2, $3, ...
  • Hold the contents of the most recent
    sub-patterns matched
  • if ($string =~ /(Larry) (Moe) Curly/) {
  •   print $2;
  • }

Moe
38
Special Variables
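  • $[
  • Determines which index in a list is the first;
    the default is 0. (Deprecated: recent versions of
    Perl no longer allow it to be changed.)
  • my @mylist = ("Larry", "Moe", "Curly");
  • print $mylist[1];    # prints Moe
  • $[ = 1;
  • print $mylist[1];    # prints Larry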

39
Special Variables
  • $&
  • Entire string matched by the most recent pattern
    match

40
Special Variables
  • $/
  • Input record separator, default is \n
  • undef $/;
  • open(FILE, "<input.txt");
  • $buffer = <FILE>;
  • $buffer contains the entire file
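
The same idea can be tried without a file on disk. In this sketch an in-memory filehandle stands in for input.txt, and `local` limits the change to `$/` to a single block; `$contents`, `$fh`, and `$buffer` are names chosen for the example.

```perl
use strict;
use warnings;

# An in-memory "file" stands in for input.txt (an assumption made so
# that the sketch is self-contained).
my $contents = "line one\nline two\nline three\n";
open(my $fh, '<', \$contents) or die "cannot open: $!";

my $buffer;
{
    local $/;           # $/ is undef inside this block only
    $buffer = <$fh>;    # a single read now returns the whole file
}
close($fh);

print length($buffer), "\n";   # 29
```

Using `local` rather than a bare `undef $/` means that line-by-line reads elsewhere in the script keep working as usual.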

41
Special Variables
  • $.
  • Current line number of the last filehandle read

42
Special Variables
  • $,
  • Output field separator used when a list is
    printed, default is "" (nothing)
  • $, = " " will add a space between each item if you
    print out a list.
  • $\
  • Output record separator, default is "" (nothing)
  • $\ = "\n" will add a newline after each print
    statement.

43
Special Variables
  • $^T time the perl program was executed
  • $| autoflush
  • $$ process ID number for Perl
  • $0 name of perl script executed
  • %ENV hash containing environment variables.

44
Special Variables
  • @ARGV is a list that holds all the arguments
    passed to the Perl script.
  • @_ is a list of all the variables passed to the
    current subroutine

46
Special Variables
  • $_ is a variable that holds the current topic.
  • e.g.
  • while (<FILE1>) {
  •   print "line $. $_";
  • }

$. is the current line number
$_ is the current line being processed in FILE1
47
References
  • A reference is a scalar
  • Instead of number or string, a reference holds a
    memory location for another variable or
    subroutine.
  • $myref = \$variable;
  • $subref = \&subroutine;

48
Dereferencing the Reference
  • To retrieve the value stored in a reference, you
    must dereference it.
  • $name = "Larry";
  • $ref_name = \$name;
  • print $ref_name, "\n";
  • print $$ref_name, "\n";
  • SCALAR(0x60000000000218a0)
  • Larry

49
Dereferencing the Reference
  • Modifying a dereferenced reference to a variable
    is the same as modifying the variable.
  • $name = "Larry";
  • $ref_name = \$name;
  • $$ref_name .= ", Moe, and Curly";
  • print $$ref_name, "\n";
  • print $name, "\n";
  • Larry, Moe, and Curly
  • Larry, Moe, and Curly

50
Where do we want to use a reference?
  • References are very useful when passing lists to
    a subroutine.
  • @mylist = ("Larry", "Moe", "Curly");
  • $list_ref = \@mylist;
  • mysub($list_ref);
  • sub mysub {
  •   my $ref = $_[0];
  •   my @list = @$ref;
  •   print $list[2], "\n";
  • }

52
Where do we want to use a reference?
  • References are very useful when passing lists,
    hashes, or subroutines to a subroutine.
  • %myhash = (1 => "Larry", 2 => "Moe", 3 => "Curly");
  • $hash_ref = \%myhash;
  • mysub($hash_ref);
  • sub mysub {
  •   my $ref_inside = $_[0];
  •   print $$ref_inside{2}, "\n";
  •   print ${$_[0]}{2}, "\n";
  • }

Both print the same thing
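
A self-contained sketch of passing a hash by reference follows. The arrow notation `$ref->{2}` does not appear on the slides; it is shown here as an equivalent, commonly used spelling. The sub name `lookup` and the extra key 4 are made up for illustration.

```perl
use strict;
use warnings;

my %myhash = (1 => "Larry", 2 => "Moe", 3 => "Curly");

sub lookup {
    my $ref = $_[0];               # the hash reference passed in
    my $slide_style = $$ref{2};    # dereference as on the slide
    my $arrow_style = $ref->{2};   # same element, arrow notation
    return ($slide_style, $arrow_style);
}

my ($first, $second) = lookup(\%myhash);
print "$first $second\n";          # Moe Moe

# The reference points at the original hash, so a change made through
# the reference is visible in %myhash itself.
my $hash_ref = \%myhash;
$hash_ref->{4} = "Shemp";
print scalar(keys %myhash), "\n";  # 4
```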
53
We can even pass subroutines
  • $sub_ref = \&my_subroutine;
  • run_this($sub_ref);
  • sub run_this {
  •   my $ref = $_[0];
  •   &$ref;
  • }
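
A runnable sketch of the same idea, extended so that arguments are forwarded to the referenced subroutine. The sub names follow the slide; the forwarding of `@args` and the returned string are assumptions added for the example.

```perl
use strict;
use warnings;

sub my_subroutine {
    return "called with: @_";
}

sub run_this {
    my ($ref, @args) = @_;   # the first argument is the code reference
    return $ref->(@args);    # call it; &$ref(@args) is equivalent
}

print run_this(\&my_subroutine, 1, 2), "\n";   # called with: 1 2
```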

54
GREP
  • @matching_lines = grep(/expression/, @input_lines);
  • @matching_lines = grep {/expression/} @input_lines;
  • @no_comments = grep {!/^#/} @lines_of_code;
  • open(FILE1, "<mycode.pl");
  • @no_comments = grep {!/^#/} <FILE1>;
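
The comment filter can be tried on a small in-line list instead of a file. This is a sketch: the sample lines are invented, and the pattern assumes a comment is a line that starts with `#`.

```perl
use strict;
use warnings;

my @lines_of_code = (
    "# set up\n",
    "my \$x = 1;\n",
    "# report\n",
    "print \$x;\n",
);

# Keep only the lines for which the block is true, i.e. lines that do
# not start with '#'.
my @no_comments = grep { !/^#/ } @lines_of_code;
print @no_comments;

# In scalar context grep returns the number of matching elements.
my $kept = grep { !/^#/ } @lines_of_code;
print "$kept\n";   # 2
```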

55
MAP
  • map BLOCK @array
  • Returns the list generated by executing BLOCK for
    each value of @array
  • foreach $number (@mylist) {
  •   print $number + 1;
  • }
  • print map {$_ + 1} @mylist;

56
MAP
  • The block can have any amount of code or
    subroutines
  • map { mysub($_) } @array;
  • map {
  •   sub1($_);
  •   sub2($_);
  •   $a = 1;
  • } @array;

This map would simply return a list of 1s.
The last value evaluated is returned
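
Both behaviours of map described above can be checked with a few lines. In this sketch `@plus_one`, `@ones`, and `$square` are names made up for the example.

```perl
use strict;
use warnings;

my @mylist = (1, 2, 3);

# The block runs once per element; $_ holds the current element.
my @plus_one = map { $_ + 1 } @mylist;
print "@plus_one\n";   # 2 3 4

# Only the last value evaluated in the block is returned, so this map
# yields a list of 1s no matter what the earlier statement computes.
my @ones = map { my $square = $_ * $_; 1 } @mylist;
print "@ones\n";       # 1 1 1
```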
57
keys
  • keys %hash
  • Will return a list of the keys in the hash
  • %myhash = (1, "Larry", 2, "Moe", 3, "Curly");
  • @keylist = keys %myhash;
  • print @keylist;
  • 123 (the order of the keys is not guaranteed)

58
Modules
  • Modules are reusable packages defined in a
    library file
  • They offer simple access to routines such as
  • Database access
  • Matrix manipulations
  • Communication libraries
  • Editing standard binary formats (.doc, .xls, ...)
  • Graphics libraries (OpenGL, Tk, ...)

59
Using a Module
  • use Mail::Sendmail;
  • ...
  • foreach $user (@email_list) {
  •   %mail_mess = ( To => "$user",
      From => 'me@uofu.edu',
  •     Subject => "$subject",
  •     Message => "$message"
  •   );
  •   sendmail(%mail_mess);
  • }

60
New Objects in Modules
  • Perl is an object-oriented language
  • You can define new object types
  • A scalar can hold other object types, such as a
    matrix, a tensor, a window, a database, ...
  • The behavior of new objects is defined in the
    corresponding modules.
  • See www.cpan.org for a few thousand useful Perl
    modules.

61
Overloading Operators
  • The standard operators in Perl can be defined for
    additional operations when placed between objects
    that are not fundamental types.
  • $a = $b * 4;
  • use Math::MatrixSparse;
  • $matrix_product = $matrix1 * $matrix2;

62
global, local, my
  • Global variables are the default in Perl.
  • Globals can be seen in any subroutines.
  • $var = 1;
  • printme();
  • sub printme {
  •   print $var;
  • }

We didn't pass $var, but this works because $var
is global.
63
global, local, my
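  • Variables declared with local are still global
  • local gives the variable a temporary value; when
    the enclosing block exits, the previous value
    (undef if it was never set) is restored.
  • {
  •   local $name;
  •   mysub();
  • }

$name has its previous value again here (undef if it was never set)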
64
global, local, my
  • my variables are preferred for most Perl code.
  • my $scalar;
  • sub1(\%myhash, @array);

$scalar will not be available in sub1 because
it is not explicitly passed.
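
The difference between local and my shows up when another subroutine reads the variable. The sketch below uses made-up sub names (`show`, `with_local`, `with_my`) and declares the global with `our` so that it runs under `use strict`.

```perl
use strict;
use warnings;

our $name = "Larry";          # a global (package) variable

sub show { return $name; }    # always reads the global $name

sub with_local {
    local $name = "Moe";      # temporary value for the global itself
    return show();            # show() sees "Moe"
}

sub with_my {
    my $name = "Curly";       # a new, private variable
    return show();            # show() still sees the global
}

print with_local(), "\n";     # Moe
print with_my(), "\n";        # Larry
print show(), "\n";           # Larry (the local value was restored)
```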
65
my Code
use strict will not allow global variables to be
created on the fly. Global variables that
appear partway through the code often make your
script unreadable.
  • use strict;
  • my @array;
  • my %hash;

66
How much have you learned?
  • How do we write a Perl subroutine that takes a
    list as its argument, and returns the unique
    values from that list?
  • @array = (1, 1, 2, 3, 4, 4, 34, 20, 20);
  • $, = ", ";
  • print unique(@array);
  • 1, 2, 3, 4, 34, 20

67
  • sub unique {
  •   $max_el = @_ - 1;
  •   my @u_list;
  •   $u_list[0] = $_[0];
  •   for $i (1 .. $max_el) {
  •     $original = 1;
  •     foreach $item (@u_list) {
  •       if ($item == $_[$i]) {
  •         $original = 0;
  •       }
  •     }
  •     if ($original) {
  •       push(@u_list, $_[$i]);
  •     }
  •   }
  •   return @u_list;
  • }

This is how a FORTRAN77 Expert might solve the
problem. Simple, straightforward, lots of code.
68
A smaller script
  • sub unique {
  •   my @u_list = keys %{{ map { $_ => 1 } @_ }};
  •   return @u_list;
  • }

How does that work?
69
A smaller script
  • sub unique {
  •   my @u_list = keys %{{ map { $_ => 1 } @_ }};
  •   return @u_list;
  • }

map { $_ => 1 } @_
This will return a list of key/value pairs. The
keys will be each value ($_) in @_ (the array
passed to the subroutine).
The values will all be 1
70
A smaller script
  • sub unique {
  •   my @u_list = keys %{ { key/value pairs } };
  •   return @u_list;
  • }

{ key/value pairs }
This creates an anonymous hash and returns a
reference to it.
71
A smaller script
  • sub unique {
  •   my @u_list = keys %{ reference_to_a_hash };
  •   return @u_list;
  • }

%{ reference_to_a_hash }
We dereference the anonymous hash
72
A smaller script
  • sub unique {
  •   my @u_list = keys %hash;
  •   return @u_list;
  • }

This returns the keys in the hash
Where did we wipe out the duplicates?
73
A smaller script
  • sub unique {
  •   my @u_list = keys %{ { key/value pairs } };
  •   return @u_list;
  • }

{ key/value pairs }
When we create our anonymous hash, we assign the
value 1 for each key. When a key is repeated, it
simply reassigns that key to the value specified
(always 1 in this case).
74
Our Anonymous Hash
  • unique("a","b","c","c","d","d")
  • pair added    hash contents
  • a => 1        a=>1
  • b => 1        a=>1, b=>1
  • c => 1        a=>1, b=>1, c=>1
  • c => 1        a=>1, b=>1, c=>1
  • d => 1        a=>1, b=>1, c=>1, d=>1
  • d => 1        a=>1, b=>1, c=>1, d=>1

75
A smaller script
  • sub unique {
  •   my @u_list = keys %{{ map { $_ => 1 } @_ }};
  •   return @u_list;
  • }

We don't need to explicitly create @u_list
A subroutine will return the object most recently
returned by an operator in the subroutine,
unless another object is returned explicitly with
a return statement
76
A smaller script
  • sub unique {
  •   keys %{{ map { $_ => 1 } @_ }}
  • }

Our compact and slightly cryptic routine to
return unique items
77
Why some people dislike Perl
  • sub unique {
  •   keys %{{ map { $_ => 1 } @_ }}
  • }
  • sub unique {
  •   @l{@_} = ();
  •   keys %l
  • }
  • sub unique {
  •   grep { !$l{$_}++ } @_
  • }
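
The three compact versions can be compared in one script. This is a sketch with a few liberties: each sub has its own name (`unique_hash`, `unique_slice`, `unique_grep`, all invented here), the helper hashes are declared with `my` so the code runs under `use strict`, and a unary `+` marks the inner braces as a hash constructor. The hash-based versions return keys in no guaranteed order, so their output is sorted before printing, while the grep version keeps the order of first appearance.

```perl
use strict;
use warnings;

# Version 1: build an anonymous hash, then take its keys.
sub unique_hash {
    return keys %{ +{ map { $_ => 1 } @_ } };
}

# Version 2: a hash slice creates one key per argument.
sub unique_slice {
    my %l;
    @l{@_} = ();
    return keys %l;
}

# Version 3: keep an element only the first time it is seen.
sub unique_grep {
    my %seen;
    return grep { !$seen{$_}++ } @_;
}

my @array = (1, 1, 2, 3, 4, 4, 34, 20, 20);

print join(", ", unique_grep(@array)), "\n";                       # 1, 2, 3, 4, 34, 20
print join(", ", sort { $a <=> $b } unique_hash(@array)), "\n";    # 1, 2, 3, 4, 20, 34
print join(", ", sort { $a <=> $b } unique_slice(@array)), "\n";   # 1, 2, 3, 4, 20, 34
```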

78
The End
Questions?
  • blynch@msi.umn.edu
  • help@msi.umn.edu
  • 612-626-0802 (MSI helpline)