Web pages - PowerPoint PPT Presentation

About This Presentation
Title:

Web pages

Description:

Web pages Programming Language Design and Implementation (4th Edition) by T. Pratt and M. Zelkowitz Prentice Hall, 2001 Section 12.2.1 ARPANET Initial idea was by ... – PowerPoint PPT presentation

Transcript and Presenter's Notes

Title: Web pages


1
Web pages
  • Programming Language Design and Implementation
    (4th Edition)
  • by T. Pratt and M. Zelkowitz
  • Prentice Hall, 2001
  • Section 12.2.1

2
ARPANET
  • The initial idea came from a Defense Advanced
    Research Projects Agency (DARPA) project in the
    late 1960s for a national defense network.
  • DARPA began a project to see whether several
    computers, widely separated geographically, could
    be linked together to enable users at a terminal
    on one system to access the resources on another
    computer.
  • Initial military concept: provide access to
    computers even if some communications lines are
    destroyed, by building a network where data
    communications traffic could dynamically adapt to
    changing conditions.
  • Data communications (sending messages reliably
    from one computer to another) was the major
    obstacle.
  • The initial ARPANET began in 1970 as a three-node
    network linking BBN in Cambridge, Massachusetts,
    with UCLA and SRI in California using 56 kilobit
    lines.
  • Sites were added until there were several hundred
    by the mid-1970s.

3
ARPANET communications
  • Communication between two computers was handled
    via messages. A message was broken down into
    fixed-length strings called packets, and the
    packets were sent from computer to computer until
    the original message was reassembled at the
    receiving node.
  • To ensure that messages destined for another
    computer arrived reliably, a formal communication
    model, called a protocol, was developed. For the
    ARPANET, this developed into the Transmission
    Control Protocol/Internet Protocol (TCP/IP).
  • TCP/IP was a low-level communication mechanism
    that simply determined that a sequence of bytes
    destined for a specific computer arrived there
    uncorrupted. It was generally too complex for
    users to use directly for accessing a computer.
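
The packet idea on this slide can be sketched in a few lines of Python. This is only an illustration, not the real ARPANET packet format: the 8-byte packet size, the sequence numbers, and the zero padding of the last packet are all assumptions made for the example.

```python
PACKET_SIZE = 8  # bytes per packet; an arbitrary value chosen for illustration


def to_packets(message: bytes, size: int = PACKET_SIZE) -> list[tuple[int, bytes]]:
    """Split a message into numbered, fixed-length packets (last one zero-padded)."""
    packets = []
    for seq, start in enumerate(range(0, len(message), size)):
        chunk = message[start:start + size]
        packets.append((seq, chunk.ljust(size, b"\x00")))
    return packets


def reassemble(packets: list[tuple[int, bytes]], length: int) -> bytes:
    """Rebuild the message from packets that may arrive in any order."""
    ordered = sorted(packets, key=lambda packet: packet[0])
    # Drop the padding by trimming to the original message length.
    return b"".join(chunk for _, chunk in ordered)[:length]


message = b"LOGIN request from UCLA to SRI"
packets = to_packets(message)
shuffled = list(reversed(packets))          # simulate out-of-order arrival
restored = reassemble(shuffled, len(message))
```

Carrying a sequence number in each packet is what lets the receiving node restore the original order, whatever route each packet took.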

4
User protocols
  • Telnet is a protocol that makes the sending
    computer (the computer the user is actually
    working on) behave like a terminal connected to
    the distant computer. The user is connected to a
    client computer, which acts like a terminal, and
    the terminal program communicates using the
    Telnet protocol with a distant host computer,
    which provides the server program.
  • SMTP is the Simple Mail Transfer Protocol. It
    provides the basic e-mail (electronic mail) that
    has become so ubiquitous today.
  • FTP is the File Transfer Protocol. One would invoke
    the FTP client on a local machine, log onto the
    distant server machine using the FTP protocol,
    and then retrieve the desired documents from the
    distant machine or send documents from the user's
    machine to the distant machine.

5
Weaknesses in FTP
  • One had to know explicitly which machine to
    access to retrieve the desired data.
  • One also had to have access to the files of that
    machine to retrieve the information. The
    anonymous login partially solved that.
  • One had to know exactly where on the file system
    the desired information was.
  • Despite these weaknesses, FTP was the standard
    file transmission mechanism for many years until
    the web changed all that.
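
A minimal sketch of an anonymous FTP retrieval, using Python's standard ftplib module, makes these weaknesses concrete. The function name fetch_by_ftp and its parameters are hypothetical; the point is that the caller must supply the machine, the directory, and the file name up front.

```python
from ftplib import FTP


def fetch_by_ftp(host: str, directory: str, filename: str) -> bytes:
    """Retrieve one file by anonymous FTP.

    The caller has to know all three pieces of information in advance:
    which machine, which directory on it, and which file.
    """
    chunks: list[bytes] = []
    with FTP(host) as ftp:
        ftp.login()                      # no arguments means anonymous login
        ftp.cwd(directory)               # exact location on the file system
        ftp.retrbinary(f"RETR {filename}", chunks.append)
    return b"".join(chunks)
```

Nothing in this exchange helps the user discover where a document lives, which is the gap the web later filled.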

6
Birth of the Internet
  • In the mid-1980s, ARPA decided to stop supporting
    the ARPANET. As a research activity, the concept
    had been proved, and ARPA was not in the business
    of providing what was becoming a commercial
    service.
  • The U.S. National Science Foundation (NSF) took
    over the backbone network in the United States
    (the set of high-speed telephone lines that
    carried the basic TCP/IP communications traffic
    between host computers) as a way to link
    universities together.
  • The name of the network gradually evolved into
    the Internet, and NSF support eventually stopped.
  • Attached to this backbone, local networks (a
    state, a university, a large company) were added
    until the Internet became this amorphous
    collection of computers all continually
    chattering to one another.
  • Commercial providers, now called Internet Service
    Providers (ISPs), established links to the Internet
    so that individuals on their home computers could
    use a modem to dial into their local ISP to be on
    the Internet.

7
The World Wide Web
  • By the late 1980s, there was widespread interest
    in transferring files easily. FTP was a
    cumbersome process, and systems like Gopher,
    Archie, and Veronica were developed.
  • Physicists (principally Tim Berners-Lee at CERN
    in Geneva) desired a mechanism to access and
    transfer documents by computer that was simpler
    than the standard FTP server.
  • They developed the concept of a semantic
    description language. One server program would
    display a document, and a client program, called
    a browser, would read and understand the
    displayed document. The power of their system was
    that the displayed document contained pointers to
    other documents, a mechanism called hypertext.
  • An earlier version of hypertext was Apple
    Computer's HyperCard product for the Macintosh,
    but the real power of the CERN development was to
    allow hypertext links to documents that existed
    on other computers connected to the Internet.

8
HTTP
  • The protocol developed was the Hypertext
    Transfer Protocol (HTTP). HTTP is an addition to
    the Telnet, FTP, and SMTP protocols discussed
    earlier.
  • The release of the first Mosaic browser in 1993
    led to rapid growth of the web.
  • Each pointer became known as a Uniform Resource
    Locator (URL). Document location was reduced to:
  • - invoking a Web browser on your local machine,
  • - typing in a URL for the document you wanted to
    access,
  • - connecting to a Web server on the distant
    machine that contained the location named in the
    typed-in URL,
  • - displaying the document retrieved using the
    HTTP protocol.
  • The HTML language is based upon SGML (to be
    explained later).
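
As a rough illustration of how a URL turns into an HTTP request, the sketch below splits a URL with Python's urllib.parse and builds the text of a simple GET request. The function name build_get_request is hypothetical, the request uses the HTTP/1.0 form with a Host header as an assumed convention, and nothing is sent over the network.

```python
from urllib.parse import urlsplit


def build_get_request(url: str) -> tuple[str, int, bytes]:
    """Return (host, port, request bytes) for a plain HTTP GET of the URL."""
    parts = urlsplit(url)
    host = parts.hostname
    port = parts.port or 80          # 80 is the default port for HTTP
    path = parts.path or "/"         # an empty path means the server root
    request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"
    return host, port, request.encode("ascii")


host, port, request = build_get_request("http://www.cs.umd.edu/users/mvz/pzbook")
```

A browser would open a TCP connection to the host and port, send these bytes, and read the document back from the same connection.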

9
Web navigation

10
Prentice Hall example of navigation
  • 1. The user types in the URL for the home page.
    This URL consists of a domain name
    (www.cs.umd.edu) and a file on that machine
    (users/mvz/pzbook).
  • 2. The Web browser sends the domain name to one
    of several special Internet machines called
    Domain Name Servers (DNS). The DNS returns the
    Internet Protocol (IP) address of the machine
    hosting the desired web page.
  • 3. The Web browser sends the file name to the Web
    server at IP address 128.8.128.80. An HTTP daemon
    (HTTPD) program on this machine is the main
    interface between the Web server and the
    Internet.
  • 4. The Web server appends the name index.html
    because the given name was a directory and not a
    file.
  • 5. The contents of the file are sent back to the
    Web browser and displayed to the user.
  • 6. If the user now clicks on the URL for
    Prentice-Hall that appears on the Web page
    (www.prenticehall.com), the process is repeated
    and the Prentice-Hall server at IP address
    63.69.110.94 is accessed and the appropriate Web
    page is displayed.
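
The six steps can be traced with a small simulation. Everything here is a stand-in: DNS_TABLE and SERVER_FILES are hypothetical in-memory tables (filled with the addresses quoted on this slide), the page contents are invented placeholders, and no real DNS lookup or network connection takes place.

```python
from urllib.parse import urlsplit

# Hypothetical stand-in for the Domain Name Servers.
DNS_TABLE = {
    "www.cs.umd.edu": "128.8.128.80",
    "www.prenticehall.com": "63.69.110.94",
}

# Hypothetical stand-in for the files each Web server holds.
SERVER_FILES = {
    "128.8.128.80": {
        "directories": {"/users/mvz/pzbook"},
        "files": {"/users/mvz/pzbook/index.html": "book home page"},
    },
    "63.69.110.94": {
        "directories": {"/"},
        "files": {"/index.html": "publisher home page"},
    },
}


def fetch(url: str) -> tuple[str, str, str]:
    """Follow steps 1-5: return (IP address, resolved file, contents)."""
    parts = urlsplit(url)                        # step 1: domain name + file
    ip_address = DNS_TABLE[parts.hostname]       # step 2: DNS lookup
    server = SERVER_FILES[ip_address]            # step 3: contact the server
    path = parts.path.rstrip("/") or "/"
    if path in server["directories"]:            # step 4: directory, not a file
        path = path.rstrip("/") + "/index.html"
    return ip_address, path, server["files"][path]   # step 5: send contents


home = fetch("http://www.cs.umd.edu/users/mvz/pzbook")
publisher = fetch("http://www.prenticehall.com")     # step 6: repeat for a link
```

Step 6 is simply the same function applied to the URL found in the first page.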

11
Portals
  • To make navigation easier, certain Web sites are
    now known as portals: entrance sites to the WWW.
  • These sites have programs known as search
    engines. A search engine is a query processor in
    which you enter a question. The result of that
    query is a list of WWW locations that answer the
    question.
  • Search engines often operate as Web crawlers.
    Beginning at one location, the Web crawler
    follows all links on that Web page to find other
    Web pages.
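
The crawling idea can be sketched as a breadth-first traversal. To keep the example self-contained it runs over PAGES, a hypothetical in-memory "web" of four invented pages, rather than fetching anything from the network; link extraction uses Python's standard html.parser module.

```python
from collections import deque
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical pages standing in for the real Web.
PAGES = {
    "http://a.example/": '<a href="http://b.example/">B</a> <a href="http://c.example/">C</a>',
    "http://b.example/": '<a href="http://c.example/">C</a> <a href="http://a.example/">A</a>',
    "http://c.example/": "no links here",
    "http://d.example/": '<a href="http://a.example/">A</a>',
}


def crawl(start: str) -> list[str]:
    """Visit every page reachable from start, each page exactly once."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        url = queue.popleft()
        order.append(url)
        parser = LinkExtractor()
        parser.feed(PAGES.get(url, ""))
        for link in parser.links:
            if link not in seen:         # avoid revisiting pages and looping
                seen.add(link)
                queue.append(link)
    return order


visited = crawl("http://a.example/")
```

Note that the page at d.example is never visited: a crawler only finds pages that some already-known page links to.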

12
SGML
  • Standard Generalized Markup Language (SGML) is
    the basis of HTML.
  • An SGML document is an unstructured sequence of
    characters.
  • Within the text can be SGML elements. The
    semantics of elements are unspecified, but their
    syntax is given.
  • Elements are bracketed by a start-tag and an
    end-tag notation:
  • <zork> I am a zork </zork>
  • identifies "I am a zork" as the contents of the
    zork element.
  • A report in SGML:
  • <report>
  • <title> text </title>
  • <author> text </author>
  • <abstract> text </abstract>
  • <body> text </body>
  • </report>
  • SGML handles semantic content, not presentation.
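
A toy parser shows how start-tags and end-tags delimit element contents. This is a simplified sketch, not a real SGML parser: it assumes tags without attributes, and it does not handle an element nested inside another element of the same name.

```python
import re

# A start-tag, the shortest possible contents, and the matching end-tag.
ELEMENT = re.compile(r"<(\w+)>(.*?)</\1>", re.DOTALL)


def parse_elements(text: str) -> list:
    """Return (name, contents) pairs; contents is text or a list of children."""
    result = []
    for match in ELEMENT.finditer(text):
        name, inner = match.group(1), match.group(2)
        children = parse_elements(inner)
        result.append((name, children if children else inner.strip()))
    return result


zork = parse_elements("<zork> I am a zork </zork>")
report = parse_elements(
    "<report><title> My title </title><author> A. Author </author></report>"
)
```

The parser recovers structure only. It attaches no meaning to zork or report, which matches the point that SGML fixes the syntax of elements but not their semantics.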

13
HTML
  • An instance of SGML with a defined syntax for Web
    pages
  • <html>
  • <title> title of document </title>
  • <body>
  • text of document
  • </body>
  • </html>
  • Problem: SGML is semantic content, not layout
    (presentation).
  • How to handle things like
  • <h1>Major heading</h1>
  • - What font and font size to use?
  • - Where on page to place heading?
  • Elements like <font size...> move away from pure

14
Links in HTML
  • HTML contains:
  • - Embedded text
  • - URLs: links to other web pages <http://web
    address>
  • - Images <IMG SRC...>
  • - MAILTO protocol (send email)
  • - Executable pages (CGI scripts, to be discussed
    soon)