Mod_perl: Performance CGI in Apache

Slides: 23
Provided by: MK48

1
3.1.1.1.1 Mod_perl: Performance CGI in Apache
"mod_perl is more than CGI scripting on steroids.
It is a whole new way to create dynamic content
by utilizing the full power of the Apache web
server to create stateful sessions, customized
user authentication systems, smart proxies and
much more. Yet, magically, your old CGI scripts
will continue to work and work very fast indeed.
With mod_perl you give up nothing and gain so
much!" -- Lincoln Stein
2
Classical Perl-CGI
  • Conventional Perl CGI scripts are compiled,
    interpreted, and executed like any other Perl
    script.
  • Every time a Perl script is run, it is translated
    (compiled) into opcodes, then executed.
  • The translation step takes time.
  • Any database connections, filehandles, or similar
    resources exist only for the life of the
    particular instance of the script being run.

3
Classical Perl-CGI
  • In CGI, a Perl script's STDOUT is sent to the
    user's browser via the web server, instead of
    to the screen.

> top i
  • Each time a script is run, it becomes its own
    process.
  • Each process requires its own compilation and
    memory space, even if the script is the same.

4
Classical Perl-CGI: Performance Scenario
  • Consider the following scenario:
      • A website served from a single server.
      • 20 Perl CGI scripts, each serving 5 clients per
        second.
      • Each script loads 5 modules.
      • Each script creates a database connection.
      • Each script accesses files on the filesystem.
      • The time that it takes to compile/interpret/run
        each script is 2 seconds.
      • Each instance of the script requires 10 MB of
        memory.
  • The effect on the server would be:
      • 100 Perl processes compiling, interpreting, and
        running (5 processes per script per second).
      • 200 seconds of CPU time consumed.
      • 1 GB of memory used.
      • 100 database connections created and destroyed.
  • Solution:
      • Compile the scripts once.
      • Serve clients from a single cached version of
        the scripts.
      • Share the database connections.
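
The figures in this scenario can be re-derived with a few lines of Perl. This is only a sketch of the slide's arithmetic: the variable names are illustrative, and 1 GB is treated as 1000 MB to match the slide's round number.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Inputs taken from the scenario above.
my $scripts            = 20;  # distinct CGI scripts
my $clients_per_second = 5;   # requests per script per second
my $seconds_per_run    = 2;   # compile/interpret/run time per request
my $mb_per_process     = 10;  # memory per script instance

# Every request starts its own process and its own database connection.
my $processes   = $scripts * $clients_per_second;
my $cpu_seconds = $processes * $seconds_per_run;
my $memory_mb   = $processes * $mb_per_process;
my $connections = $processes;

print "Perl processes started per second: $processes\n";
print "CPU seconds consumed:              $cpu_seconds\n";
print "Memory used (MB):                  $memory_mb\n";
print "Database connections:              $connections\n";
```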

5
Why mod_perl?
  • Problems with conventional Perl CGI:
      • Compilation of the script for each request (slow).
      • New process created for each request
        (resource-intensive).
      • No easy way to share commonly used resources such
        as modules, data memory, and database connections.
      • Limited integration with the Apache server; limited
        control of Apache modules, services, and
        functions.
  • mod_perl's solutions and features:
      • Speed and efficiency
          • The standard Apache::Registry module can provide
            100x speedups for existing CGI scripts and reduce
            the load on the server at the same time.
          • Scripts are wrapped as subroutines within a
            handler in the server module, where they execute
            faster.
      • Shared resources
          • Share database connections.
          • Share memory.
      • Server control and customization
          • Apache can be controlled using existing modules.
          • Custom modules and handlers can be easily written
            to extend server functionality.
      • Control over request stages
          • Rewrite URLs in Perl based on the content of
            directories, settings stored in a database, or
            anything else conceivable.
          • Maintenance of state within the server memory.

6
Approaches to Perl Coding
  • One-off scripting and the "one-pot" approach
      • [Diagram: Input -> Main (variables, functions,
        subroutines) -> Output]
      • Sufficient for a non-persistent script.
      • One set of output based on one set of inputs.
      • Subroutines can access and modify the globally
        available data.
  • Programming by "passing the buck"
      • [Diagram: Input -> Subroutines -> Output]
      • Better for a persistent program.
      • Input/output is dynamic, based on parameters.
      • Subroutines should only be able to access global
        data under certain conditions.

7
Nested Subroutines in Perl
  • nested.pl

    #!/usr/bin/perl -w
    use diagnostics;
    use strict;

    sub print_power_of_2 {
        my $x = shift;

        sub power_of_2 {
            return $x ** 2;
        }

        my $result = power_of_2();
        print "$x^2 = $result\n";
    }

    print_power_of_2(5);
    print_power_of_2(6);

  • The script should print the square of the numbers
    passed to it, but the second result is wrong:

    ./nested.pl
    5^2 = 25
    6^2 = 25

  • If we use the warnings (-w) switch we get the
    warning:

    Variable "$x" will not stay shared at ./nested.pl
    line 9.

  • If we use diagnostics.pm we get:

    (W) An inner (nested) named subroutine is
    referencing a lexical variable defined in an outer
    subroutine. When the inner subroutine is called, it
    will probably see the value of the outer
    subroutine's variable as it was before and during
    the first call to the outer subroutine; in this
    case, after the first ...
8
How mod_perl Works
  • mod_perl is a binary module extension which
    provides Apache with a built-in Perl
    interpreter.
  • Requests which map to directories assigned to
    mod_perl are serviced by Perl packages called
    handlers.
  • The handler is compiled by the built-in
    interpreter and cached in memory.
  • The most important mod_perl handler is
    Apache::Registry.
  • The Apache server loads a parent server
    process (httpd), and this process forks a
    specified number of children.
  • Each child process contains the mod_perl module
    and can serve requests.
  • The children can share memory from the parent.
9
Content Handlers
  • All content handlers in mod_perl must have a
    handler subroutine.
  • To add the handler to the server configuration,
    the httpd.conf file must be modified and the
    server restarted:
      • /usr/local/apache/conf/httpd.conf
  • In Red Hat 9, httpd.conf has moved, and the mod_perl
    configuration is in a separate file:
      • /etc/httpd/conf/httpd.conf
      • /etc/httpd/conf.d/perl.conf
  • The following configuration snippet is added to
    httpd.conf or perl.conf:

    PerlModule ModPerl::Rules1
    <Location /mod_perl_rules1>
        SetHandler perl-script
        PerlHandler ModPerl::Rules1
        PerlSendHeader On
    </Location>

  • ModPerl/Rules1.pm

    package ModPerl::Rules1;
    use Apache::Constants qw(:common);

    sub handler {
        print "Content-type: text/plain\n\n";
        print "mod_perl rules!\n";
        return OK;  # We must return a status to mod_perl
    }
    1;  # This is a Perl module so we must return true to Perl

  • ModPerl/Rules2.pm

    package ModPerl::Rules2;
    use Apache::Constants qw(:common);

    sub handler {
        my $r = shift;
        ...
10
Apache::Registry / ModPerl::Registry
  • counter.pl

    #!/usr/bin/perl -w
    use CGI qw(:all);
    use strict;

    print header;

    my $counter = 0;  # redundant
    for (1..5) {
        increment_counter();
    }

    sub increment_counter {
        $counter++;
        print "Counter is equal to $counter !", br;
    }

  • To use this script in mod_perl's
    Apache::Registry, we must save the file in the
    appropriate directory specified in the directive
    in httpd.conf / perl.conf.
  • Standard Apache installation:

    <Location /perl>
        SetHandler perl-script
        PerlHandler Apache::Registry
        Options ExecCGI
        PerlSendHeader On
    </Location>

  • Red Hat 9 (Apache 2.0):

    <Directory /var/www/perl>
        SetHandler perl-script
        PerlModule ModPerl::Registry
        PerlHandler ModPerl::Registry::handler
        Options ExecCGI
    </Directory>

11
Apache::Registry / ModPerl::Registry Continued

    package Apache::ROOT::perl::counter_2epl;
    use Apache qw(exit);

    sub handler {
        use strict;

        print header;

        my $counter = 0;  # redundant
        for (1..5) {
            increment_counter();
        }

        sub increment_counter {
            $counter++;
            print "Counter is equal to $counter !\r\n";
        }
    }

  • The script counter.pl is compiled into the
    package Apache::ROOT::perl::counter_2epl and is
    wrapped into this package's handler subroutine.
  • We would expect to see the output:

    Counter is equal to 1 !
    Counter is equal to 2 !
    Counter is equal to 3 !
    Counter is equal to 4 !
    Counter is equal to 5 !

  • After some reloading, we start to get strange
    results, with the counter starting at higher
    numbers like 6, 11, 16 and so on:

    Counter is equal to 6 !
    Counter is equal to 7 !
    Counter is equal to 8 !
    Counter is equal to 9 !
    Counter is equal to 10 !

  • The major cause of this bug is the nested
    subroutine. The non-linearity of the buggy output
    is caused by the requests being served by
    different children, each of which keeps its own
    copy of the counter.
12
Solving the Nested Subroutine Problem: Anonymous
subs, Scoping
  • anonymous.pl

    #!/usr/bin/perl
    use strict;

    sub print_power_of_2 {
        my $x = shift;

        my $func_ref = sub {
            return $x ** 2;
        };

        my $result = $func_ref->();
        print "$x^2 = $result\n";
    }

    print_power_of_2(5);
    print_power_of_2(6);

  • Change the named inner subroutine to an
    anonymous subroutine.
  • The anonymous subroutine sees the variables in
    the same lexical context at any moment that it
    is called.
  • The $x variable is in the same lexical scope as
    the anonymous subroutine, so the subroutine sees
    the variable and its current value whenever it
    is called.
  • It acts as a closure.

    ./anonymous.pl
    5^2 = 25
    6^2 = 36
13
Solving the Nested Subroutine Problem: Package
Scoped Variables
  • multirun.pl

    #!/usr/bin/perl
    use strict;
    use warnings;

    for (1..2) {
        print "run time $_\n";
        run();
    }

    sub run {
        my $counter = 0;            # variant 1: lexical
        # our $counter = 0;         # variant 2: package-scoped
        # local our $counter = 0;   # variant 3: package-scoped, localized
        increment_counter();
        ...

  • When the script is run using the lexically scoped
    $counter variable we get:

    Variable "$counter" will not stay shared at
    ./multirun.pl line 18.
    run time 1
    Counter is equal to 1 !
    Counter is equal to 2 !
    run time 2
    Counter is equal to 3 !
    Counter is equal to 4 !

  • The $counter variable in the named subroutine
    remains bound to the variable from the first call
    (named subs are compiled once).
  • If we use our to scope $counter to the package,
    it works:

    run time 1
    Counter is equal to 1 !
    Counter is equal to 2 !
    run time 2
    Counter is equal to 1 !
14
Solving the Nested Subroutine Problem: Parameter
Passing, References
  • multirun3.pl

    #!/usr/bin/perl
    use strict;
    use warnings;

    for (1..3) {
        print "run time $_\n";
        run();
    }

    sub run {
        my $counter = 0;
        $counter = increment_counter($counter);
        $counter = increment_counter($counter);
        ...

  • multirun4.pl

    #!/usr/bin/perl
    use strict;
    use warnings;

    for (1..3) {
        print "run time $_\n";
        run();
    }

    sub run {
        my $counter = 0;
        increment_counter(\$counter);
        increment_counter(\$counter);
        ...
15
Porting example
  • param_printer.pl (original)

    #!/usr/bin/perl -w
    use strict;
    use CGI qw(:standard);

    front_page() if !param();

    my $opt_p = param('p') || 20;  # primer size
    my $opt_a = param('a') || 2;   # primer size range
    my $opt_t = param('t') || 60;  # opt. tm
    my $opt_b = param('b') || 5;   # tm range
    my $opt_y = param('y') || 5;   # primer sets per exon

    print header;
    print_options();
    print end_html;

  • When this script is run in mod_perl, it is
    wrapped in the handler subroutine of the package,
    so it hits the inner subroutine problem: we get
    the same initial parameters repeatedly.
  • Since the variables follow a distinct pattern, we
    can use command-line Perl and a regex to convert
    them to a hash:

    perl -i.bak -pe 's/\$opt_(\w)/\$opt{$1}/g' param_printer.pl

  • The my scoping must be removed from the hash
    assignments.
  • We declare the hash %opt and then pass the
    options into the subroutine.
  • param_printer.pl (converted)

    #!/usr/bin/perl -w
    use strict;
    use CGI qw(:standard);

    front_page() if !param();

    my %opt;
    $opt{p} = param('p') || 20;  # primer size
    $opt{a} = param('a') || 2;   # primer size range
    $opt{t} = param('t') || 60;  # opt. tm
    $opt{b} = param('b') || 5;   # tm range
    $opt{y} = param('y') || 5;   # primer sets per exon

    print header;
    print_options(%opt);

16
Porting example continued

    #!/usr/bin/perl -w
    use strict;
    use CGI qw(:standard);

    front_page() if !param();

    my %opt;
    $opt{p} = param('p') || 20;  # primer size
    $opt{a} = param('a') || 2;   # primer size range
    $opt{t} = param('t') || 60;  # opt. tm
    $opt{b} = param('b') || 5;   # tm range
    $opt{y} = param('y') || 5;   # primer sets per exon

    my $text  = "These are the parameters";
    my @array = split(/,/, param('a'));  # separator assumed; none is shown on the slide
    print header;

  • If we want to pass more than one variable of
    different types (arrays, scalars, and hashes)
    into the subroutine, we can use references.
  • The references will cause mod_perl to hold on to
    the data that they reference.
  • We should use local our to clean up those
    references after they are used.

    local our %opt;
    ...
    local our $text  = "These are the parameters";
    local our @array = split(/,/, param('a'));  # separator assumed
    print header;
    print_options(\%opt, \$text, \@array);
    print end_html;

    sub print_options {
        print $text, br, "$opt{p}, $opt{a}, $opt{t}, $opt{b}, $opt{y}", br;
        print join(", ", @array);  # separator assumed
    }

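
The print_options() shown on the slide reads the package variables directly and does not use the references passed to it. As a hedged illustration of the other half of the technique, the sketch below dereferences its arguments instead. The sub name format_options and the sample data are hypothetical stand-ins for the CGI parameters, chosen so the code runs without a web server.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-ins for the values taken from param() on the slide.
my %opt   = (p => 20, a => 2, t => 60, b => 5, y => 5);
my $text  = "These are the parameters";
my @array = (1, 2, 3);

# Receives a hash ref, a scalar ref, and an array ref, and builds the output.
sub format_options {
    my ($opt_ref, $text_ref, $array_ref) = @_;
    my $options = join ", ", map { $opt_ref->{$_} } qw(p a t b y);
    return "$$text_ref\n$options\n" . join(" ", @$array_ref) . "\n";
}

my $output = format_options(\%opt, \$text, \@array);
print $output;
```

Since the sub only works on what it is handed, it does not depend on any variable captured from an enclosing scope.
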
17
Database Connections: Apache::DBI
  • In regular CGI, a script which connects to the
    database creates its own connection every time
    it is run.
  • If 20 scripts are each accessed 10 times, that's
    200 database connections which are created and
    destroyed.
  • Database connections are expensive.
  • To mitigate this shortcoming, use Apache::DBI,
    which allows persistent database connections to
    be created in mod_perl.
  • The DBI module checks the $ENV{MOD_PERL}
    environment variable. If Apache::DBI has been
    loaded, it forwards connect() requests to it.
  • The disconnect() method is overridden to do
    nothing.
  • Apache::DBI should be loaded in
    httpd.conf / perl.conf:

    PerlModule Apache::DBI

  • After that, you program DBI just as you would
    with a plain use DBI.
  • The use DBI statement can remain in your
    scripts.

    use DBI;
    $dbh = DBI->connect($data_source, $username, $auth, \%attr);
    $sth = $dbh->prepare($statement);
    $rv  = $sth->execute;
    @row_ary = $dbh->selectrow_array($statement);
18
Sharing Memory: Aliasing
  • A package is created with a hash that contains
    configuration parameters for some scripts.

    package My::Config;
    use strict;
    use vars qw(%c);
    %c = (
        dir => {
            cgi  => "/home/httpd/perl",
            docs => "/home/httpd/docs",
            img  => "/home/httpd/docs/images",
        },
        url => {
            cgi  => "/perl",
            docs => "/",
            img  => "/images",
        },
        color => {
            hint   => "#777777",
            warn   => "#990066",
            normal => "#000000",
        },
    );
    1;

  • We want to be able to use this hash in other
    scripts:

    use strict;
    use My::Config ();
    use vars qw(%c);
    *c = \%My::Config::c;
    print "Content-type: text/plain\r\n\r\n";
    print "My url docs root: $c{url}{docs}\n";

  • The *c glob has been aliased with
    \%My::Config::c, a reference to a hash. From now
    on, %My::Config::c and %c are the same hash and
    you can read from or modify either of them.
  • Any script that you use can share this variable.
  • You can also use the @EXPORT and @EXPORT_OK
    arrays in your package to export the variables
    that you want to share.
19
Server Configuration: httpd.conf / perl.conf
  • Server-pool size regulation (MPM specific)
  • prefork MPM
      • StartServers: number of server processes to
        start
      • MinSpareServers: minimum number of server
        processes which are kept spare
      • MaxSpareServers: maximum number of server
        processes which are kept spare
      • MaxClients: maximum number of server processes
        allowed to start
      • MaxRequestsPerChild: maximum number of requests
        a server process serves

    <IfModule prefork.c>
        StartServers 1
        MinSpareServers 1
        MaxSpareServers 1
        MaxClients 1
        MaxRequestsPerChild 1000
    </IfModule>

  • worker MPM
  • perl.conf

    # Mod_perl incorporates a Perl interpreter into
    # the Apache web server, so that the Apache web
    # server can directly execute Perl code.
    # Mod_perl links the Perl runtime library into
    # the Apache web server and provides an
    # object-oriented Perl interface for Apache's C
    # language API. The end result is a quicker CGI
    # script turnaround process, since no external
    # Perl interpreter has to be started.
    LoadModule perl_module modules/mod_perl.so
    PerlRequire /etc/httpd/conf/start-up.pl

    # This will allow execution of mod_perl to
    # compile your scripts to subroutines which it
    # will execute directly, avoiding the costly
    # compile process for most requests.
    Alias /perl /var/www/perl
    <Directory /var/www/perl>
        SetHandler perl-script
        PerlHandler ModPerl::Registry::handler
        ...
20
Performance Tuning: Startup.pl

    use lib("/var/www/perl");
    use Multisage::Config ();
    use DBI ();
    use CGI ();
    CGI->compile(':all');

21
mod_perl API / Packages
  • Apache::Session - Maintain session state across
    HTTP requests
  • Apache::DBI - Initiate a persistent database
    connection
  • Apache::Watchdog::RunAway - Hanging Processes
    Monitor and Terminator
  • Apache::VMonitor - Visual System and Apache
    Server Monitor
  • Apache::GTopLimit - Limit Apache httpd processes
  • Apache::Request (libapreq) - Generic Apache
    Request Library
  • Apache::RequestNotes - Allow Easy, Consistent
    Access to Cookie and Form Data Across Each
    Request Phase
  • Apache::PerlRun - Run unaltered CGI scripts under
    mod_perl
  • Apache::RegistryNG - Apache::Registry New
    Generation
  • Apache::RegistryBB - Apache::Registry Bare Bones
  • Apache::OutputChain - Chain Stacked Perl
    Handlers
  • Apache::Filter - Alter the output of previous
    handlers
  • Apache::GzipChain - Compress HTML (or anything)
    in the OutputChain
  • Apache::Gzip - Auto-compress web files with Gzip
  • Apache::PerlVINC - Allows Module Versioning in
    Location blocks and Virtual Hosts
  • Apache::LogSTDERR
  • Apache::RedirectLogFix
  • Apache::SubProcess
  • Module::Use - Log and Load used Perl modules
22
Conclusions
  • mod_perl has to be done right.
  • Take care of nested subroutines.
  • Go to perl.apache.org