1
Web servers
  • Miroslav Milinovic
  • Croatian Academic and Research Network - CARNet
  • Zagreb, Croatia
  • <miro@srce.hr>

6th CEENet Workshop on Network Technology,
Budapest, Hungary, August 2000.
2
Content
  • Web server
  • Apache
  • directory structure, configuration files,
    directives, running
  • access control, authentication
  • Common Gateway Interface (CGI), passing data
  • Server side includes (SSI)
  • API, modules, handlers
  • virtual servers
  • Measuring the Web
  • Summary

3
How does the Web work?
[Diagram labels: authors write HTML, HTML files, WWW servers, Internet (WWW), users browse]
4
Web server
  • general purpose data delivery vehicle
  • a program (daemon, httpd)
  • responds to an incoming TCP connection and
    provides a service to the client
  • runs independently
  • Web servers:
  • do NOT validate HTML code (they do not parse
    documents)
  • do NOT check links
  • follow MIME rules (without checking file content)
  • Web site = host + Web server + information (file
    system)

5
Web server software
  • traditionally freely available
  • for most of the platforms
  • UNIX, MS Windows, Macintosh, VMS, VM, ...
  • list of available server software:
  • http://www.yahoo.com/Computers_and_Internet/Software/Internet/World_Wide_Web/Servers/
  • Web Server Survey:
  • http://www.netcraft.com/Survey/
  • popular server programs
  • CERN, NCSA (first ones)
  • Apache, MS IIS, Netscape servers, ...

6
Apache
  • "A PAtCHy server" is a plug-in replacement for
    NCSA httpd
  • under constant development
  • freely available
  • in source code
  • binaries for many platforms (v. 1.3.x also
    includes Windows NT)
  • supports HTTP/1.1 since version 1.2
  • useful addresses:
  • Apache home: http://www.apache.org/
  • http://www.apacheweek.com/
  • support via Usenet group(s)

7
Where to put the server?
  • server should run where the information is being
    created
  • choose the host carefully
  • give a DNS alias name to the selected host
    (www.mydomain.mycountry)
  • ServerRoot, DocumentRoot and Log files
    directories should be chosen carefully according
    to rules for all daemons and disk space
    requirements
  • User Home Pages?
  • CGI rules!

8
Apache directory structure
  • can be designed (changed) during installation
    (compilation) process
  • some important directories
  • cgi-bin/ - CGI scripts directory (examples
    present)
  • conf/ - configuration files for httpd server
  • htdocs/ - main directory for documents
  • logs/ - directory with log files (currently
    empty)
  • other stuff (bin/, man/)

9
Apache configuration files
  • look in conf/ directory
  • access.conf - access configuration
  • httpd.conf - server configuration
  • mime.types - MIME type to extensions definition
  • srm.conf - resource configuration
  • *.*-dist - distribution templates
  • since v. 1.3.6 it is recommended to use only the
    main configuration file, httpd.conf

10
Apache configuration directives
  • general rules
  • case insensitive (not true for file/directory
    names)
  • comment lines begin with #
  • one directive per line
  • each line of these files consists of:
  • directive data [data2 ... datan]
  • extra whitespace is ignored
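
The general rules above can be sketched as a small parser. This is only an illustration in Python, not how Apache itself reads its files: the function name `parse_directives` and the sample text are assumptions, and section containers and quoted arguments are deliberately ignored.

```python
def parse_directives(text):
    """Return (directive, [data, ...]) pairs from Apache-style configuration text."""
    directives = []
    for line in text.splitlines():
        stripped = line.strip()
        # Skip blank lines and comment lines (those beginning with '#').
        if not stripped or stripped.startswith("#"):
            continue
        # split() without arguments collapses extra whitespace.
        name, *data = stripped.split()
        # Directive names are case insensitive; file names in the data are not,
        # so only the name is lowercased.
        directives.append((name.lower(), data))
    return directives


# Hypothetical sample input using directives from this presentation.
sample = """
# main server settings
Port   80
ServerRoot /home/httpd/
USER nobody
"""
parsed = parse_directives(sample)
print(parsed)
```

Lowercasing only the directive name keeps paths such as /home/httpd/ untouched, which matches the rule that case matters for file and directory names.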

11
httpd.conf
  • ServerType standalone
  • Port 80
  • User nobody
  • Group nogroup
  • ServerAdmin your_e-mail_address
  • ServerRoot /home/httpd/
  • ErrorLog /home/httpd/logs/error_log
  • TransferLog /home/httpd/logs/access_log
  • PidFile /home/httpd/logs/httpd.pid
  • more directives
  • Keep Alive, Spare Servers, Proxy, Cache, Virtual
    Servers, ...

12
httpd.conf (srm.conf)
  • DocumentRoot /home/httpd/htdocs/
  • UserDir public_html
  • DirectoryIndex index.html
  • AccessFileName .htaccess
  • DefaultType text/plain
  • ScriptAlias /cgi-bin/ /home/httpd/cgi-bin/
  • more directives
  • Icons, Language, Handlers, ...

13
httpd.conf (access.conf)
  • defines:
  • which types of services are allowed
  • in what circumstances
  • <Directory dir_name> directives </Directory>
  • <DirectoryMatch regex> directives </DirectoryMatch>
  • <Files file1 file2> directives </Files>
  • <FilesMatch regex> directives </FilesMatch>
  • be very careful due to possible problems
  • operational
  • security

14
mime.types
  • list of MIME types known to your server
  • format: type/subtype file_extension(s)
  • example: text/html html htm
  • image/gif gif
  • files with other extensions will be sent with
    the DefaultType
  • add an entry according to your needs
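
A minimal Python sketch of the lookup described above, assuming the two example entries from this slide and text/plain as the DefaultType (as set earlier in the presentation). The helper names `load_mime_types` and `content_type` are hypothetical.

```python
import os


def load_mime_types(text):
    """Map each file extension to its MIME type ("type/subtype ext1 ext2 ...")."""
    mapping = {}
    for line in text.splitlines():
        # Drop comments, then split into the type and its extensions.
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2:
            continue  # blank line, or a type with no extensions listed
        mime_type, *extensions = fields
        for ext in extensions:
            mapping[ext.lower()] = mime_type
    return mapping


def content_type(filename, mapping, default_type="text/plain"):
    """Pick the type by extension only; the file content is never inspected."""
    ext = os.path.splitext(filename)[1].lstrip(".").lower()
    return mapping.get(ext, default_type)


MIME_TYPES = """
text/html   html htm
image/gif   gif
"""
types = load_mime_types(MIME_TYPES)
print(content_type("index.htm", types))   # text/html
print(content_type("notes.xyz", types))   # text/plain (falls back to the default)
```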

15
Starting and stopping Apache
  • if you selected standalone server type
  • simply execute the program (apachectl start)
  • setup automated startup (during boot)
  • apachectl options: start, stop, configtest
  • Apache dynamically adapts to the workload
  • to stop (restart) the server use:
  • the kill command (UNIX) (the pid is in the
    httpd.pid file)
  • apachectl stop

16
Access control
  • two levels:
  • per-server (Global Access Configuration file) -
    using directives in httpd.conf (access.conf)
  • per-directory (Per-directory Access Configuration
    file) - using .htaccess files (you can change
    this file name using the AccessFileName directive
    in httpd.conf (srm.conf))
  • two ways:
  • by user/password
  • by host/domain

17
httpd.conf (access.conf) DocumentRoot settings
  • <Directory /home/httpd/htdocs>
  • instead of Directory it is possible to use
    Location (controls URLs) or Files (controls
    files)
  • it is possible to use wildcards here
  • Options Indexes FollowSymLinks
  • Options can be: FollowSymLinks,
    SymLinksIfOwnerMatch, ExecCGI, Includes, Indexes,
    IncludesNoExec, All, None
  • AllowOverride All
  • specifies which options can be overridden by
    per-directory access files
  • </Directory>

18
httpd.conf (access.conf) Scripts directory
  • <Directory /home/httpd/cgi-bin>
  • Options FollowSymLinks
  • AllowOverride None
  • </Directory>
  • later directives (in the order they appear in
    the configuration files) take precedence, as
    they are the more specific ones
  • where overrides are permitted, the settings in
    .htaccess are the most specific

19
User/password authentication
  • create a file called .htaccess in the required
    directory (of course you can also do this at the
    server level)
  • AuthUserFile /home/httpd/admin/.htpasswd
  • AuthGroupFile /dev/null
  • AuthName ByPassword
  • AuthType Basic
  • <Limit GET>
  • require user username
  • </Limit>

20
User/password authentication
  • create the password file using the htpasswd
    command:
  • htpasswd -c /home/httpd/admin/.htpasswd username
  • enter a password of your choice (later you can
    check the content of the .htpasswd file)
  • multiple users (of course you have to create
    their entries in the .htpasswd file):
  • add the users to the require directive in
    .htaccess
  • OR
  • create a group file (.htgroup) and use the
    AuthGroupFile and require group directives in the
    .htaccess file
  • OR
  • use the require valid-user directive (all users
    from .htpasswd have access)

21
It works, but ...
  • server asks the browser for a user name and
    password before allowing access
  • the password is sent over the network not
    encrypted, only encoded (Base64, a
    uuencode-style encoding)
  • the password is not visible in the clear, but
    it can easily be decoded by anyone who happens to
    catch the right network packet (sniffers in
    action)
  • this method of authentication is only as safe as
    telnet-style username and password security
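
To make the point concrete, here is a short Python sketch of how little work a captured Basic credential takes to decode. Basic authentication Base64-encodes "user:password" in the Authorization header. The function names and the sample credentials are made up for illustration.

```python
import base64


def basic_auth_header(username, password):
    """Build the Authorization header value a browser sends for Basic auth."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def decode_basic_auth(header_value):
    """Recover (username, password) from a captured header value; no key is needed."""
    scheme, token = header_value.split(" ", 1)
    if scheme.lower() != "basic":
        raise ValueError("not a Basic Authorization header")
    username, password = base64.b64decode(token).decode("utf-8").split(":", 1)
    return username, password


header = basic_auth_header("username", "secret")   # hypothetical credentials
print(header)
print(decode_basic_auth(header))
```

Decoding needs nothing but the header itself, which is why the encoding offers no protection against a sniffer.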

22
Host/domain authentication
  • a protective .htaccess file looks like:
  • <Limit GET>
  • order deny,allow
  • deny from all
  • allow from hostname/domain
  • </Limit>

23
Host/domain authentication
  • an open .htaccess file looks like:
  • <Limit GET>
  • order allow,deny
  • allow from all
  • deny from hostname/domain
  • </Limit>
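
The difference between the two orderings can be sketched in Python. This is a simplified model that matches host and domain names only (no IP addresses or networks); the functions `host_matches` and `is_allowed` and the host names are assumptions for illustration.

```python
def host_matches(pattern, host):
    """'all' matches every host; otherwise match the host itself or its domain."""
    if pattern == "all":
        return True
    pattern = pattern.lower().lstrip(".")
    host = host.lower()
    return host == pattern or host.endswith("." + pattern)


def is_allowed(order, allow, deny, host):
    """Evaluate 'allow from' / 'deny from' lists under the given order."""
    allowed = any(host_matches(p, host) for p in allow)
    denied = any(host_matches(p, host) for p in deny)
    if order == "deny,allow":
        # Deny is evaluated first and allow may override it; default is allow.
        return allowed or not denied
    if order == "allow,deny":
        # Allow is evaluated first and deny may override it; default is deny.
        return allowed and not denied
    raise ValueError(f"unsupported order: {order}")


# Protective file: deny everyone except one domain (hypothetical names).
print(is_allowed("deny,allow", ["carnet.hr"], ["all"], "www.carnet.hr"))       # True
print(is_allowed("deny,allow", ["carnet.hr"], ["all"], "host.example.com"))    # False
# Open file: allow everyone except one domain.
print(is_allowed("allow,deny", ["all"], ["example.com"], "www.carnet.hr"))     # True
print(is_allowed("allow,deny", ["all"], ["example.com"], "host.example.com"))  # False
```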

24
Access control
  • it is possible to use authentication by
    host/domain and by user/password together
  • for better security, compile Apache with SSL
    (Secure Sockets Layer) support
  • the server and client then exchange keys at the
    beginning of the session and all transactions
    are encrypted

25
Common Gateway Interface (CGI)
  • a WWW server is able to communicate with other
    programs (CGI scripts)
  • CGI scripts can be written in any programming
    language (shell script, Perl, C, ...)
  • CGI scripts can use CGI environment variables
  • CGI is used for:
  • getting input from the user, forms processing,
    returning any kind of dynamic information,
    gateways to other services, ...
  • the workload is on the server side (be careful)

26
CGI
  • the server needs to be configured for CGI
    operation to enforce security procedures
  • ScriptAlias /cgi-bin/ /home/httpd/cgi-bin/
  • all files in /cgi-bin/ are considered to be
    executable scripts (regardless of the file name)
  • security measures (with CGI scripts):
  • parse and check user input
  • programs should have only the power they require
  • dynamically generated programs are not permitted
  • carefully examine all CGI scripts (do not allow
    users to execute their own programs)

27
Passing data (GET method)
  • data is simply attached to the end of the URL
  • ? is used to separate the data from the URL
    (http://url?data)
  • CGI programs are executed via their URL address
  • http://www_server/cgi-bin/program_name?data
  • simple example: the <ISINDEX> tag
  • the browser asks for input from the user and
    attaches it to the URL
  • the input is rewritten by the browser (spaces
    become "+", \n becomes "%0A", ...)
  • the server puts the part of the URL after "?"
    into the environment variable QUERY_STRING
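
A minimal Python sketch of both sides of a GET request, using the standard urllib.parse module: the browser-side rewriting of the input and the part of the URL a server would place in QUERY_STRING. The helper names and the input text are hypothetical; the URL shape is the one from this slide.

```python
from urllib.parse import quote_plus, unquote_plus, urlsplit


def build_get_url(base_url, data):
    """Attach URL-encoded data to the end of the URL, separated by '?'."""
    return f"{base_url}?{quote_plus(data)}"


def query_string(url):
    """Return the part after '?', i.e. what the server puts in QUERY_STRING."""
    return urlsplit(url).query


url = build_get_url("http://www_server/cgi-bin/program_name", "web servers\nApache")
print(url)                                # spaces become "+", "\n" becomes "%0A"
print(query_string(url))                  # the raw, still-encoded QUERY_STRING
print(unquote_plus(query_string(url)))    # what the CGI program decodes it to
```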

28
Passing data (POST method)
  • recommended method for processing FORMS
  • on the HTML page with the form you declare which
    script will be called to process the data from
    the form
  • <FORM METHOD="POST" ACTION="/cgi-bin/script_name">
  • ...
  • </FORM>
  • when the user hits the submit button, the client
    contacts the server and passes the request (POST
    /cgi-bin/script_name) with the data from the form
    (the data follows the URL as a document)
  • to pass the data to the CGI program, the server
    uses environment variables and stdin

29
Passing data (POST method)
  • the server executes the CGI script and provides
    it with:
  • a list of environment variables
  • an input stream of FORM contents in name=value
    pairs (name1=value1&name2=value2&name3=value3)
  • the script knows how long this input stream is
    from the environment variable CONTENT_LENGTH
  • CGI script general procedure, in order:
  • read input from stdin
  • split the name=value pairs and do value
    conversion (spaces, ...)
  • do something and print out the results in HTML
    form to stdout

30
Passing data (POST method)
  • CGI scripts are responsible for formatting their
    output on "stdout" back to the server (finally
    the server will pass this information to the
    client)
  • a CGI script is responsible for generating
    content-specific headers and sending them as the
    first lines of output to the server
  • for example:
  • Content-type: text/plain
  • FOLLOWED BY (at least) ONE BLANK LINE!
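
The POST procedure described on the last three slides (read CONTENT_LENGTH bytes from stdin, split the name=value pairs, write a header, a blank line, then the body) can be sketched in Python as follows. The function `handle_post` and the form fields are hypothetical, and the streams are passed in as parameters so the sketch can be run without a Web server.

```python
import io
from urllib.parse import parse_qs


def handle_post(environ, stdin, stdout):
    """Process one POST request the way a CGI script would."""
    # The server tells the script how much input to expect.
    length = int(environ.get("CONTENT_LENGTH") or 0)
    body = stdin.read(length)
    # Split name=value pairs and undo the URL encoding ("+" back to space, ...).
    fields = parse_qs(body)
    # Headers come first, followed by one blank line, then the document.
    stdout.write("Content-type: text/plain\n")
    stdout.write("\n")
    for name, values in fields.items():
        stdout.write(f"{name} = {', '.join(values)}\n")


# Simulated request with made-up form data.
body = "name1=value1&name2=two+words"
out = io.StringIO()
handle_post({"CONTENT_LENGTH": str(len(body))}, io.StringIO(body), out)
print(out.getvalue())
```

Run as an actual CGI script, the same function would presumably be called with os.environ, sys.stdin and sys.stdout; that wiring is left out here.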

31
Server side includes (SSI)
  • the server can be configured to scan documents
    with the .shtml extension for occurrences of
    constructions like
  • <!--#command tag1="value1" tag2="value2" -->
  • and replace them with the result of the command
  • this concept is used to add:
  • the current date or any other CGI environment
    variable value
  • a document's (or other file's) last modification
    date and size
  • the contents of another document, inlined into
    the current document
  • the output of any other program run on the Web
    server side
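
As a rough Python illustration of the scan-and-replace idea, the sketch below handles only one command, echo with a var tag, and substitutes values from a dictionary. The function name, the placeholder used for unset variables and the sample date are assumptions; a real server supports more commands than this.

```python
import re

# Matches constructions such as <!--#echo var="DATE_LOCAL" -->
SSI_ECHO = re.compile(r'<!--#echo\s+var="([^"]+)"\s*-->')


def expand_echo(document, variables):
    """Replace each echo construction with the value of the named variable."""
    def replace(match):
        # "(none)" is an illustrative placeholder for unset variables.
        return variables.get(match.group(1), "(none)")
    return SSI_ECHO.sub(replace, document)


page = '<p>Today is <!--#echo var="DATE_LOCAL" -->.</p>'
expanded = expand_echo(page, {"DATE_LOCAL": "14-Aug-2000"})   # sample value
print(expanded)
```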

32
API, modules, handlers
  • Apache breaks down client request handling into
    a series of steps:
  • URL -> Filename translation
  • Auth ID checking
  • Auth access checking
  • Access checking other than the above Auth
  • Determining the MIME type of the requested object
  • 'Fixups' - if needed
  • Actually sending a response back to the client
  • Logging the request

33
API, modules, handlers
  • at any of those steps you may tie in a handler
    (a procedure)
  • a set of handlers may make up a module, e.g. the
    CGI module, log module, server side includes
    module, access module, ...
  • a consistent specification of the steps lets you
    connect your own modules to Apache, replacing
    existing ones or adding new capabilities
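
The idea of tying handlers to fixed request-processing steps can be modelled in a few lines of Python. This is a conceptual sketch only: the phase names are illustrative labels for the steps listed earlier, not identifiers from Apache's C module API, and the `Pipeline` class and sample handlers are hypothetical.

```python
# Illustrative labels for the request-handling steps, in processing order.
PHASES = [
    "translate",     # URL -> filename translation
    "auth_id",       # Auth ID checking
    "auth_access",   # Auth access checking
    "access",        # access checking other than Auth
    "mime_type",     # determining the MIME type
    "fixups",        # fixups, if needed
    "response",      # sending the response
    "logging",       # logging the request
]


class Pipeline:
    def __init__(self):
        self.handlers = {phase: [] for phase in PHASES}

    def register(self, phase, handler):
        """Tie a handler (a procedure) to one of the steps."""
        if phase not in self.handlers:
            raise ValueError(f"unknown phase: {phase}")
        self.handlers[phase].append(handler)

    def handle(self, request):
        # Steps always run in the fixed order, whatever the registration order.
        for phase in PHASES:
            for handler in self.handlers[phase]:
                handler(request)
        return request


def translate(request):
    request["filename"] = "/home/httpd/htdocs" + request["url"]


def set_type(request):
    is_html = request["filename"].endswith((".html", ".htm"))
    request["content_type"] = "text/html" if is_html else "text/plain"


log = []


def log_request(request):
    log.append(f'{request["url"]} -> {request["filename"]}')


pipeline = Pipeline()
pipeline.register("logging", log_request)   # registered first, still runs last
pipeline.register("mime_type", set_type)
pipeline.register("translate", translate)
result = pipeline.handle({"url": "/index.html"})
print(result)
print(log)
```

Because the steps are fixed, a new set of handlers can be added or swapped in without touching the others, which is the property the slide attributes to modules.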

34
Virtual servers
  • one server may listen on many host names -
    virtual servers (same port, different hostnames)
  • part of the basic server configuration
    (httpd.conf)
  • <VirtualHost hostname> ... </VirtualHost>
  • each virtual server may have totally different
    content, configuration, separate log and error
    files, ...
  • an alternative is to run another server on a
    different port

35
Virtual servers - example
  • Port 80
  • ServerName test.ceenet.ceu.hu
  • NameVirtualHost 193.225.201.125
  • <VirtualHost 193.225.201.125>
  • ServerName test.ceenet.ceu.hu
  • DocumentRoot /home/httpd/test
  • </VirtualHost>
  • <VirtualHost 193.225.201.125>
  • ServerName virtualtest.ceenet.ceu.hu
  • DocumentRoot /home/httpd/virtualtest
  • ...
  • </VirtualHost>
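
With several names on one address, the server has to pick a virtual server from the host name the client asks for. A minimal Python sketch of that selection, using the two names and document roots from the example above; the function `document_root` is hypothetical, and falling back to the first listed virtual host is an assumption of this sketch.

```python
# Host name -> DocumentRoot, taken from the example configuration.
VIRTUAL_HOSTS = {
    "test.ceenet.ceu.hu": "/home/httpd/test",
    "virtualtest.ceenet.ceu.hu": "/home/httpd/virtualtest",
}
# Assumption: unknown names are served by the first listed virtual host.
DEFAULT_HOST = "test.ceenet.ceu.hu"


def document_root(host_header):
    """Choose the document root from the request's Host header value."""
    # Drop an optional ":port" suffix; host names are case insensitive.
    hostname = (host_header or "").split(":", 1)[0].strip().lower()
    return VIRTUAL_HOSTS.get(hostname, VIRTUAL_HOSTS[DEFAULT_HOST])


print(document_root("virtualtest.ceenet.ceu.hu"))
print(document_root("TEST.ceenet.ceu.hu:80"))
print(document_root("unknown.example.org"))
```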

36
Web measurement? Why?
  • for content providers
  • usability testing
  • detection of performance problems
  • tuning the server software / benchmarking servers
  • for network providers
  • evaluating proxies
  • detection of performance problems
  • for protocol developers
  • measuring protocol (DNS, TCP, HTTP) performance
  • evaluating changes and new mechanisms

37
Web measurement techniques
  • server logs
  • proxy (cache) logs
  • browser (client) logs
  • packet traces

38
Server logs
  • the server logs access information to a file:
  • client host,
  • date,
  • client request,
  • status,
  • count of the bytes sent by server
  • ...
  • standard log files and formats
  • it is possible (and easy) to produce many kinds
    of activity reports from that data
  • plenty of log analyzers (wwwstat, analog, ...)
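
As an illustration of how easily a report falls out of such a log, here is a Python sketch that parses lines in the Common Log Format and counts hits per resource and total bytes sent. The sample log lines are invented for the example, and the function names are hypothetical.

```python
import re
from collections import Counter

# host ident user [date] "request" status bytes
CLF = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<date>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\d+|-)'
)


def parse_line(line):
    """Return a dict of log fields, or None if the line does not match."""
    match = CLF.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    # "-" means no bytes were sent (for example on a 304 response).
    entry["bytes"] = 0 if entry["bytes"] == "-" else int(entry["bytes"])
    return entry


def summarize(lines):
    """Count hits per requested path and sum the bytes sent."""
    hits = Counter()
    total_bytes = 0
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue
        parts = entry["request"].split()
        path = parts[1] if len(parts) >= 2 else entry["request"]
        hits[path] += 1
        total_bytes += entry["bytes"]
    return hits, total_bytes


# Invented sample data in Common Log Format.
SAMPLE_LOG = [
    'host1.example.org - - [14/Aug/2000:10:15:02 +0200] "GET /index.html HTTP/1.0" 200 8192',
    'host2.example.org - - [14/Aug/2000:10:15:07 +0200] "GET /logo.gif HTTP/1.0" 200 2048',
    'host1.example.org - - [14/Aug/2000:10:16:40 +0200] "GET /index.html HTTP/1.0" 304 -',
]
hits, total_bytes = summarize(SAMPLE_LOG)
print(hits.most_common())
print(total_bytes)
```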

39
Some results
  • derived from various studies
  • average resource is small (e.g. 8-10 KB)
  • 10% of resources attract 90% of traffic
  • average of 3-5 resources per Web document
    (including the HTML file)
  • number of clicks per session (?):
  • 4 - server studies
  • 10 - client studies

40
Summary
  • WWW server
  • Apache
  • directory structure, configuration files,
    directives, running
  • access control, authentication
  • Common Gateway Interface (CGI), passing data
  • Server side includes (SSI)
  • API, modules, handlers
  • virtual servers
  • Measuring the Web