Title: CGI PROGRAMMING
1. CGI PROGRAMMING PERL
- What is CGI and how does it work?
- The WWW as an application platform
- Environment variables
- Processing form data
- Generate dynamic web pages
- Common Gotchas
2. URL
- Web pages are identified by URLs, e.g. http://www.perl.com/CPAN/
- URLs are of the form
- scheme://hostname/path?query#fragment
- scheme could be http, ftp, file, etc.
- hostname is the machine running the web server.
- path is the location of the resource being requested.
- query passes additional info to CGI scripts.
- fragment refers to a section in the resource.
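The parts listed above can be pulled out of a URL with one regular expression. This is a minimal sketch using only core Perl; the sub name split_url and the "#top" fragment in the sample URL are illustrative assumptions, and real code would more likely use a URI-handling module.

```perl
use strict;
use warnings;

# Split a URL into scheme, hostname, path, query and fragment.
# Parts that are absent from the URL come back as undef.
sub split_url {
    my ($url) = @_;
    my ($scheme, $hostname, $path, $query, $fragment) =
        $url =~ m{^(?:([^:/?\#]+):)?(?://([^/?\#]*))?([^?\#]*)(?:\?([^\#]*))?(?:\#(.*))?\z}s;
    return (
        scheme   => $scheme,
        hostname => $hostname,
        path     => $path,
        query    => $query,
        fragment => $fragment,
    );
}

# Sample URL; the "#top" fragment is made up for the demonstration.
my %part = split_url('http://www.foobar.com/nifty.cgi?name=abe#top');
for my $name (qw(scheme hostname path query fragment)) {
    printf "%-8s %s\n", $name, defined $part{$name} ? $part{$name} : '(none)';
}
```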
3. So what is CGI?
- The Web is a client/server system.
- The client browser (Mozilla) requests documents (identified by URL)
  from the web server (Apache).
- This browser-to-server dialogue is governed by HTTP, the Hypertext
  Transfer Protocol.
- CGI is a lightweight interface on top of HTTP.
4. What is CGI? Part 2
- Many times, when a client requests a document, the web server simply
  sends back the contents of a file specified by the URL path.
- For example, http://www.largest.org/test.html tells the web server to
  grab the file called test.html (in the webroot directory) and hand it
  back to the client (your browser). Your browser then parses the HTML
  and displays the page.
- This is static content.
5. What is CGI? Part 3
- BUT, web servers aren't very intelligent and, on their own, can only
  hand back static pages.
- Many times you want dynamic content. Then the web server runs another
  program, which generates the document to send back to the client.
- This server-to-program dialogue is governed by CGI, the Common Gateway
  Interface.
- This program is therefore called a CGI script.
6. More on CGI
- Web server passes lots of info to CGI script
- what page was requested,
- what values were passed from a form,
- where the request came from,
- lots more.
7. Stateless
- Important to note: the connection between client and server is not
  persistent.
- Every call to the server is a brand new one.
9. Why Perl for CGI?
- CGI scripts can be written in almost any language that is supported by
  the web server, including shell, C/C++, Perl, Visual Basic, Python, Tcl.
- The WWW is driven by plain text (HTTP is a text protocol, HTML is a
  text markup language).
- Perl is a favorite among CGI developers because of its superior text
  processing power.
10. WWW As an Application Platform
- Advantages
- Machine-independent. Your software will run identically on PCs,
  Macintoshes, and workstations.
- Timely. Data and software updates are immediately available.
- Easy to deploy. There is no software to distribute, since your
  applications run on a server and the client software is standard.
- Geographically dispersed. Physical proximity is irrelevant since the
  Web is indeed World Wide.
- Easy for users to learn. Browsers are intuitive and already familiar
  to many.
11. WWW As an Application Platform, part 2
- Disadvantages
- Limited bandwidth
- Have to do tricks to get persistent data
12. HTTP Headers
- The server must tell the browser what kind of data it's delivering.
- Examples include plain text, HTML, audio files, binary graphics.
- Browsers expect the kind of data to be named as a MIME type.
- Examples of MIME types are text/plain, text/html, image/gif,
  audio/x-wav.
- This information (and more) is delivered in the HTTP header.
- The user does not see this info!
- This is different from the HTML header (<head></head>).
13. HTTP Headers, 2
- An HTTP header looks like this
- Content-type: text/html\n\n
- header name (Content-type)
- MIME type (text/html)
- 2 newline characters (\n) to separate HTTP header from body. CRUCIAL!
  Must have 2 newlines.
- Long list of all header options:
  http://www.w3.org/Protocols/HTTP/Object_Headers.html
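To make the header format concrete, here is a small sketch that builds the Content-type header for a few kinds of data. The lookup table, the file extensions used as keys, and the sub name content_type_header are assumptions made for illustration; only the MIME types themselves come from the slides.

```perl
use strict;
use warnings;

# Hypothetical lookup table: file extension => MIME type.
my %mime_type_for = (
    txt  => 'text/plain',
    html => 'text/html',
    gif  => 'image/gif',
    wav  => 'audio/x-wav',
);

# Build the header for a given extension, falling back to text/plain.
# The two trailing newlines separate the header from the body.
sub content_type_header {
    my ($extension) = @_;
    my $type = $mime_type_for{ lc $extension } || 'text/plain';
    return "Content-type: $type\n\n";
}

print content_type_header('html');
print "<HTML><BODY>Hello</BODY></HTML>\n";
```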
14. An actual CGI Script!
#!/usr/local/bin/perl
use strict;
# MUST print out header with 2 \n
print "Content-type: text/html\n\n";
print "<HTML>\n";
print "<HEAD>\n";
print "<TITLE>Hello World</TITLE>\n";
print "</HEAD>\n";
print "<BODY>\n";
print "Hello World!\n";
print "</BODY></HTML>\n";
15. Here documents in CGI scripts
- It's often convenient to use here documents for blocks of HTML. They
  let you line up the markup and avoid escaping quotes with a \.
- CGI Perl can get very ugly without here docs
use strict;
print "Content-type: text/html\n\n";
print "<HTML><TITLE>Test</TITLE>\n";
print "<BODY BGCOLOR=\"white\"><TABLE BORDER=\"0\"><TR><TD>";
print "<H2><A HREF=\"foo.html\">This</A> is a test ";
print "of the emergency broadcast system</H2><P>\n";
print "<IMG SRC=\"foo.gif\" WIDTH=\"200\" HEIGHT=\"100\" ALT=\"foobar!\">";
print "</TD></TR></TABLE></BODY></HTML>\n";
16. Here documents in CGI Scripts, part 2
#!/usr/local/bin/perl
# looks much better with a here document
use strict;
print <<End_of_HTML;
Content-type: text/html

<HTML>
<TITLE>Test</TITLE>
<BODY BGCOLOR="white">
<H2><A HREF="foo.html">This</A> is a test of the emergency broadcast system</H2>
<P><B>This is only a test</B>
</BODY>
</HTML>
End_of_HTML
17. More Complex CGI
- The CGI script can do anything behind the scenes, as long as it
  finally outputs HTML.
- Look up info in a database
- grok text files, etc.
18. Employee phone generator
#!/usr/local/bin/perl
use strict;
# hypothetical path; the file holds one "name phone" pair per line
my $file = 'employees.txt';
print "Content-type: text/html\n\n";
print "<HTML><TITLE>Employees</TITLE><BODY>\n";
print "<B>Winefred Employees</B><BR /><BR />\n";
print "<UL>";
if ( open (IN, $file) ) {
    while (my $line = <IN>) {
        my ($employee, $phone) = split /\s+/, $line;
        print "<LI>$employee $phone</LI>\n";
    }
    close IN;
}
print "</UL></BODY></HTML>\n";
19. Counter CGI Script
#!/usr/local/bin/perl
use strict;
print "Content-type: text/html\n\n";
print "<HTML><TITLE>Count Example</TITLE><BODY>\n";
my $count = 0;    # stays 0 if the counter file does not exist yet
my $counter_file = 'count.dat';
if ( open (COUNT, $counter_file) ) {
    chomp ($count = <COUNT>);
    close COUNT;
}
print "You're number $count</BODY></HTML>\n";
$count++;
open (COUNT, ">$counter_file");
print COUNT $count, "\n";
close COUNT;
20. Environment Variables
- CGI applications and the web server communicate using environment
  variables.
- When a web server invokes a CGI program, it sets variables in the CGI
  program's environment.
- These are stored in the %ENV hash in Perl.
21. Seeing your environment variables
#!/usr/local/bin/perl
use strict;
print "Content-type: text/html\n\n";
foreach my $key (keys %ENV) {
    print "$key => $ENV{$key}<BR>\n";
}
22. Stupid environment tricks
#!/usr/local/bin/perl
use strict;
print <<End_of_HTML;
Content-type: text/html

Your IP address is $ENV{REMOTE_ADDR}
<p>
This server is running $ENV{SERVER_SOFTWARE}
<p>
Your browser is $ENV{HTTP_USER_AGENT}
End_of_HTML
23. Forms and WWW Interaction
- Create an HTML form with
<FORM action="your_cgi_script" method="METHOD">
<!-- various form elements -->
</FORM>
24. Forms
- Your CGI script can call itself, or another CGI script. Or a static
  HTML page can call a CGI script.
- There are 2 different METHODs of passing HTML form data to CGI
  scripts: GET and POST.
25. Parsing Forms, GET
- GET passes the form data in the URL and creates URLs that look
  something like this
- http://www.foobar.com/nifty.cgi?name=abe&owner=jim+smith&service=378
- This is called URL Encoding.
- key=value pairs
- ampersands delimit pairs
- plus signs represent blanks
- non-alphanumerics are %xx, where xx is the hexadecimal ASCII code
  (%20 is a space, for example)
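The encoding rules above can be sketched in the other direction, building a query string from key/value pairs. The sub names url_encode and build_query_string are made up for this example, and it assumes plain ASCII input; the "3/8" value is invented to show a %xx escape.

```perl
use strict;
use warnings;

# Encode one key or value: hex-escape everything except letters, digits
# and a few safe characters, then turn the remaining spaces into plus signs.
sub url_encode {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9_.~ -])/sprintf('%%%02X', ord $1)/ge;
    $text =~ tr/ /+/;
    return $text;
}

# Join key=value pairs with ampersands, in the order given.
sub build_query_string {
    my @pairs = @_;    # flat list: key, value, key, value, ...
    my @encoded;
    while (my ($key, $value) = splice(@pairs, 0, 2)) {
        push @encoded, url_encode($key) . '=' . url_encode($value);
    }
    return join '&', @encoded;
}

print build_query_string(name => 'abe', owner => 'jim smith', service => '3/8'), "\n";
```

A literal plus sign in the data is escaped as %2B, so it cannot be confused with an encoded blank.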
26. Parsing Forms, GET
- The server stores the string after the question mark in the
  QUERY_STRING environment variable.
- So from the last example
- http://www.foobar.com/nifty.cgi?name=abe&owner=jim+smith&service=378
- $ENV{QUERY_STRING} is
- name=abe&owner=jim+smith&service=378
27. More QUERY_STRING parsing
- Here's a way to parse the QUERY_STRING in a CGI script
my %form_data =
    split (/[&=]/, $ENV{'QUERY_STRING'});
foreach my $key (keys %form_data) {
    # convert + to space (do this first, so an encoded %2B stays a plus)
    $form_data{$key} =~ s/\+/ /g;
    # convert hex chars
    $form_data{$key} =~
        s/%([\dA-Fa-f]{2})/chr(hex $1)/ge;
}
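The snippet above assumes every field arrives as a complete key=value pair and decodes only the values. A slightly more defensive sketch, assuming the same encoding rules, is shown here; the sub names url_decode and parse_query_string are illustrative, and repeated field names simply overwrite each other.

```perl
use strict;
use warnings;

# Undo URL encoding: plus signs first, then %xx escapes.
sub url_decode {
    my ($text) = @_;
    $text =~ tr/+/ /;
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex $1)/ge;
    return $text;
}

# Turn "key=value&key=value" into a hash of decoded keys and values.
sub parse_query_string {
    my ($query) = @_;
    my %form_data;
    foreach my $pair (split /&/, $query) {
        # Limit of 2 keeps any "=" inside the value intact.
        my ($key, $value) = split /=/, $pair, 2;
        next unless defined $key && length $key;
        $value = '' unless defined $value;    # "flag" with no "=value"
        $form_data{ url_decode($key) } = url_decode($value);
    }
    return %form_data;
}

my %form_data = parse_query_string('name=abe&owner=jim+smith&service=378');
print "$_ => $form_data{$_}\n" for sort keys %form_data;
```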
28. Parsing Forms, POST
- Problem with GET: the form data cannot exceed the maximum size of an
  environment variable (1024 characters on many systems).
- So anytime you have a large chunk of data you want to pass to a script
  (such as from a <TEXTAREA> box), you're out of luck.
- What to do? Enter the POST method of passing form data.
29. POST
- Form data is delivered as a stream to the CGI script's STDIN, not as
  an environment variable.
- No limit on size of the data.
- So in Perl, to parse POST, just read from STDIN.
- POST requests also set the CONTENT_LENGTH environment variable, which
  is the length in bytes of the input stream containing the form data.
  Read exactly that many bytes rather than waiting for EOF, which the
  server is not required to send.
30. Parsing Forms, POST
# parse POST
# reads CONTENT_LENGTH bytes from STDIN
read (STDIN, my $raw_data, $ENV{'CONTENT_LENGTH'});
# now do the same thing as with GET
my %form_data = split (/[&=]/, $raw_data);
foreach my $key (keys %form_data) {
    $form_data{$key} =~ s/\+/ /g;
    $form_data{$key} =~
        s/%([\dA-Fa-f]{2})/chr(hex $1)/ge;
}
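Since GET data arrives in QUERY_STRING and POST data on STDIN, a script that accepts both has to pick the source first. The following is a minimal sketch, assuming the server sets the standard REQUEST_METHOD variable. The sub name read_raw_form_data and its two parameters (an environment hash reference and an input filehandle, so it can be tried outside a web server) are illustrative, not from the slides.

```perl
use strict;
use warnings;

# Return the raw, still URL-encoded form data for a GET or POST request.
sub read_raw_form_data {
    my ($env, $input) = @_;
    my $method = defined $env->{REQUEST_METHOD} ? uc $env->{REQUEST_METHOD} : 'GET';

    if ($method eq 'POST') {
        my $length   = $env->{CONTENT_LENGTH} || 0;
        my $raw_data = '';
        # Read exactly CONTENT_LENGTH bytes; do not wait for EOF.
        read($input, $raw_data, $length) if $length > 0;
        return $raw_data;
    }

    # GET: everything after the question mark is already in QUERY_STRING.
    return defined $env->{QUERY_STRING} ? $env->{QUERY_STRING} : '';
}

# In a real CGI script the arguments would be the live environment and STDIN.
my $raw_data = read_raw_form_data(\%ENV, \*STDIN);
```

The returned string can then be split and decoded exactly as in the GET case.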
31. Stop the Madness! CGI.pm saves the day
- In practice, it is surprisingly tricky to accurately parse CGI script
  input.
- And it's hard to maintain.
- And a headache to understand.
- The solution: do not try to parse it yourself! Let CGI.pm take care of
  the dirty work!
32. Using CGI.pm
- CGI.pm has two interfaces, a procedural one and an object-oriented one.
- Object-oriented is the default (and most of the examples in perldoc
  CGI use the object-oriented interface), but either method is fine for
  most tasks.
33. Procedural vs OO
# use the OO interface
use CGI;
# then,
my $query = CGI->new;
# $query is a CGI object.
34. CGI.pm
- Many, many, many features in CGI.pm.
- perldoc CGI is your best friend.
- For example, just doing
use CGI;
my $query = CGI->new();
- automatically detects the request method (POST or GET) and
  automatically parses QUERY_STRING or the data stream (STDIN)!!
35. Very simple CGI.pm usage
#!/opt/third-party/bin/perl
use strict;
use CGI;
my $query = CGI->new;
print $query->header;      # print the HTTP header
# print the HTML header
print $query->start_html(-title   => "Hello world!",
                         -bgcolor => "green");
print "<b>Hello world! Perl r0ckz!</b>";
print $query->end_html;    # print the HTML footer
36. Simple HTML form example
<HTML>
<HEAD>
<TITLE>Form example</TITLE>
</HEAD>
<BODY>
<FORM ACTION="test.cgi" METHOD="POST">
Your name?
<INPUT TYPE="text" NAME="name">
<BR /><INPUT TYPE="SUBMIT">
</FORM>
</BODY>
</HTML>
37. CGI script to parse form from previous slide
#!/opt/third-party/bin/perl
use CGI;
use strict;
my $q = CGI->new;
# grab value of 'name' variable
my $name = $q->param('name');
print $q->header;      # print the HTTP header
# now print the HTML header
print $q->start_html (-title   => "Test CGI",
                      -BGCOLOR => "White");
# escape user input before echoing it back into the page
print "<B>Your name is <blink>", $q->escapeHTML($name), "</blink>!</B>";
print $q->end_html;    # end the HTML
38. More CGI.pm examples
- Calling param() with no arguments yields a list of the form field
  names (like first_name, last_name, city, etc.)
my @field_names = $q->param;
- Now process all the fields with a loop
foreach my $field_name (@field_names) {
    print "$field_name is ",
        $q->param($field_name), "<BR />";
}
39. More CGI.pm examples
- perldoc CGI
- Read it, know it, live it.
- There is lots of great info in there.
40. Debugging in CGI Scripts
- Tracking down errors in CGI scripts can be very annoying (more so than
  in regular Perl).
- Error messages in Perl go to STDERR, which is usually the programmer's
  terminal window.
- In the CGI environment, however, STDERR gets intercepted by the web
  server, which appends error messages to the error log and sends the
  browser an uninformative 500 Server Error message.
- You may not have access to the error log. Even if you do, it can be
  hard to read through and decide which script caused what error at what
  time. Every web server error message goes to the same log file!
41. die and warn in CGI
- Since die and warn print their messages to STDERR, doing something like
open (IN, $file) || die "Can't open $file!";
- probably doesn't do what you expect.
- Nothing gets printed to the screen (error msg goes to the error log)
42. Further complications with die
- This script produces a 500 server error. Why?
#!/usr/local/bin/perl
use strict;
use CGI;
my $q = CGI->new;
open (IN, 'file') || die "Can't open file $!";
print $q->header, $q->start_html(-title => 'Test');
print $q->h2('Here is the file');
while (my $line = <IN>) {
    print "$line<br />";
}
print $q->end_html;
- Answer: if the open fails, die exits before any HTTP header has been
  printed, so the server has no valid response to pass on.
43. Write your own error handling subroutine
- Instead of relying on die and warn, sometimes you can write your own
  error handling subroutines. For example
# instead of
open (IN, $file) || die $!;
# use
unless ( open (IN, $file) ) {
    print_error_and_die ($file, $!);
}
sub print_error_and_die {
    my ($file, $error) = @_;
    # assumes the HTTP header has already been printed
    print "<b>Error!</b> Problem with $file: $error";
    exit;
}
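An error handler like the one above only helps if the browser also receives an HTTP header. The sketch below builds the whole error response as a string and adds the header when it has not been sent yet. The sub names error_page and escape_html, the third "header already printed" argument, and the file name are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Escape the characters that are special in HTML, so file names and
# system error messages cannot break the page.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Build the complete error response. Pass a true third argument when the
# HTTP header has already been printed.
sub error_page {
    my ($file, $error, $header_printed) = @_;
    my $page = $header_printed ? '' : "Content-type: text/html\n\n";
    $page .= '<b>Error!</b> Problem with ' . escape_html($file)
           . ': ' . escape_html($error) . "\n";
    return $page;
}

my $file = 'no_such_file.txt';    # hypothetical file name
unless ( open(my $in, '<', $file) ) {
    print error_page($file, $!, 0);
    # a real CGI script would exit here
}
```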
44. Or, use what's already been done for you
- Use the CGI::Carp module (distributed with CGI.pm). It causes all
  messages going to STDERR to be prefixed with the name of the
  application and the current date. It can also send warnings and errors
  to a file or to the browser.
# make die and warn more verbose, but they
# still go to the error log
use CGI::Carp;
open (IN, $file) || die $!;
print while (<IN>);
close (IN) || warn "Couldn't close IN: $!";
45. More CGI::Carp
# return fatal errors to the browser!
use CGI::Carp 'fatalsToBrowser';
# this msg now gets printed to the browser
open (IN, $file)
    || die "Can't open $file: $!";
46. Fixing the dreaded 500 Server Error
- 1. Check ownership and permissions on the script. The script needs to
  be readable and executable by whoever the server runs scripts as.
  (usually you want to chmod 755)
- 2. Check the extension of the script: make sure the script can be
  identified as a CGI script by the web server. This means .cgi at many
  places.
- 3. Make sure the script has permission to do what it's trying to do.
  (if it's writing to a file, it has to have permissions)
- 4. Make sure the script is valid Perl!
  (use "perl -cw script.cgi" on the command line)
- 5. See the next slide
- 6. Check the return value from every system call.
- 7. See Perl Cookbook, chapter 19
47. More 500 Server Error Fixes
- CGI.pm lets you run and debug scripts from the command line and pass
  mock form input. Very cool!
- ^D is whatever you type to get End of File (ctrl-D in *nix)
./script.cgi
(offline mode: enter name=value pairs on standard input)
animal=giraffe
animal_name=goofy
^D
- the script then runs with your variables and prints to the terminal