Title: WMB-520: Web Technology Web Server Setup
1. WMB-520 Web Technology: Web Server Setup
Meeting 1: Introduction to the Internet, the World
Wide Web, and Web Servers
- Rutgers University Center for Applied Computing
Technology
- Instructor: Christopher Uriarte
2. Course Overview and Goals
- This course will teach you how to install,
configure, and administer a Web server that runs
on a Unix system and can be used to deliver
dynamic content.
- The course objectives will be achieved through a
combination of lectures, demonstrations, and
hands-on exercises.
3. About Your Instructor
- Chris Uriarte (chrisjur_at_cju.com). Feel free to
contact me anytime via email.
- Homepage for this class is linked from:
  - http://www.cju.com/classes
  - Contains all slides, notes, and miscellaneous links
    and resources.
4. What This Course Is and Is Not
- The purpose of the course is to teach you the
concepts behind serving content on the web and
how to run and administer web server software on
a UNIX system. This means you will be learning
how to use tools to deliver content for the World
Wide Web, not to create content.
- Rutgers offers other courses designed to teach
you how to create content for the World Wide Web
(HTML design, Perl, etc.).
5. Prerequisites
- Familiarity with a Web browser such as Netscape
or Internet Explorer.
- You MUST have user-level experience with UNIX and
must be familiar with the use of a UNIX text
editor such as vi, emacs, or pico.
6. Class Schedule
- Week 1: Introduction to the Internet, the World
Wide Web, and Web Servers
- Week 2: Installing and Configuring the Apache Web
Server
- Week 3: Extending Apache: Advanced Topics
- Week 4: More Advanced Configuration and Web
Security
- Note: classes 2-4 are hands-on classes.
7. Course Resources
- Textbook: Apache: The Definitive Guide by
Laurie, Laurie and Denn (O'Reilly Press, 1999).
- User account on UNIX server blender.rutgers.edu
(remote access available via SSH).
- Slides and resources on the class homepage, linked
from http://www.cju.com/classes/
8. How Does the World Wide Web Work?
- Works on a client/server model: the Web server
software is the server component, and the Web browser
is the client component. The purpose of the Web
server is to provide documents to clients.
- Web servers, Web browsers, and the information
that is shared between them through the Hypertext
Transfer Protocol (HTTP) make up the
World Wide Web.
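The client/server exchange described on this slide can be sketched with Python's standard library. This is only an illustration, not the software used in this course: the handler class `HelloHandler`, the helper `fetch_once`, and the document body are hypothetical names chosen for the example, and the server binds to a free local port rather than port 80.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Server side: answer every GET request with a small HTML document."""

    def do_GET(self):
        body = b"<html><body>Hello from the server</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # suppress per-request logging to keep the demo quiet


def fetch_once():
    """Start a local server, act as the client for one request, then stop."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: pick any free port
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # Client side: open a connection and request a document over HTTP.
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", "/index.html")
        response = conn.getresponse()
        document = response.read().decode()
        conn.close()
        return response.status, document
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    status, document = fetch_once()
    print(status, document)
```

The server's only job here is to hand a document to whichever client asks for it, which is the role a production Web server such as Apache plays at much larger scale.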
9. History of the World Wide Web
- Grew out of the Internet, a network of networks
that began in the early 1970s and was used to
support a variety of services (including telnet,
ftp, Usenet, email, and gopher). Most of these
services communicated via TCP/IP
(Transmission Control Protocol/Internet
Protocol).
- In 1989, Tim Berners-Lee at CERN developed a new
system to simplify document distribution and to
allow documents to be linked together. It was called
the World Wide Web.
10. Web History, cont.
- In 1993, the National Center for Supercomputing
Applications (NCSA) released to the public the
NCSA Web Server software and a GUI Web browser
called Mosaic. It quickly became popular.
- Mosaic became Netscape, the first major web browser
distribution.
11. The Web Today
- The Web has evolved greatly since it was first
implemented, but its overall architecture has
remained generally the same.
- Still fueled by three major components: a
network (typically the Internet, but can also
include local networks or intranets), a client
component (a web browser), and a server software
component (what we're going to learn about in
this class).
12. Webmasters, Sys-Admins and Developers, Oh My!
- There are a number of key roles that are
necessary to support a web infrastructure.
Providing end-to-end web services requires
knowledge about systems, networks, software,
graphical design, programming, and much more.
- There are many different
13. Roles in the Web World
- Web Designers: Create graphical elements and
determine the layout of the Website.
- Content Providers: Create and edit HTML
documents.
- Web Developers: Write web applications using
programming languages such as Java, JavaScript,
ASP, PHP, Perl, and others used to deliver dynamic
content.
14. Roles in the Web World (cont.)
- System Administrators: Responsible for
maintaining the Web server software and often the
operating system and hardware where the Web
server is installed.
- Network Administrators: Responsible for the
design and maintenance of network components used
to deliver web content.
- For most organizations, these responsibilities
tend to be split over multiple job positions,
except for very small and simple Web sites.
15. Hosting a Website: The Planning Phase
- There are a number of key questions an
organization or individual must ask when planning
to deploy a website:
  - How and where will you host it?
  - What kind of hardware will you use?
  - What kind of operating system will the hardware
    run?
  - What Web server software will you use?
  - What domain name will your site use?
- Answers to the above questions are usually determined
by the budget, staffing, and existing infrastructure of
your organization.
16. Hosting Your Website: Options
- Use a Free Page Site: For personal use. Limited
space and tools; typically adds advertisements
(examples: Yahoo, Tripod, Xoom, etc.).
Limits the amount of traffic your site is
allowed. Generally not an option for business
use.
- Personal Page Site: For personal use. Usually
included with an ISP account (about $20 per month).
Includes a small amount of disk space, with no or
limited access to server-side technologies for
delivering dynamic content, and generally uses your
ISP's Internet domain name. Limits the amount
of traffic your site is allowed. (The Website URL
usually looks something like
http://www.yourisp.com/yourusername.) Might be an
option for very small businesses.
- Under both models, the ISP owns and manages the
server, network, and web server software. You
simply provide the content.
17. Hosting Your Website, cont.
- Virtual Hosting: The most popular web hosting option
today. Suitable for business or personal use.
Using this model, an ISP uses one machine to host
many different websites (sometimes called shared
hosting). You can use your own Internet domain
name (http://www.yourdomain.com). These services
typically provide a wide range of tools for
building more complex Websites.
- The ISP owns and manages the server, network, and web
server software. You simply provide the content.
- Overall cost is based on disk usage and website
traffic, and ranges from $20 to several thousand
dollars a month. Now generally available through
all ISPs and specialized hosting-only providers
such as Highway Technologies (http://www.hway.net)
and YourDomainHost (http://www.yourdomainhost.com).
18. Hosting Your Website, cont.
- Dedicated Server Services: For business use. The ISP
owns the hardware and network. Your
organization typically has the option of managing
and configuring the server. You provide the
content. Your organization has exclusive access
to the server.
  - Price is based on the type of server you require
    and the amount of traffic your site uses.
- Co-Location Services (co-lo): For business use.
  - Your organization owns and manages the
    hardware, software, and content. The ISP provides
    you with space to place the server and the network
    connectivity.
  - Price is based on the amount of space your servers
    require, the power requirements of the servers,
    and the amount of traffic.
19. Hosting Your Website, cont.
- Managing Your Own Web Server Network (in-house
web hosting): You provision, configure, and
manage the network connection, hardware, and software,
and provide the content.
- Most flexible option: you have complete control.
- Cost can be very high or very low depending on
the business need.
20. Hosting Your Web Server: Do-It-Yourself
Networking Options
- For an Intranet server: You need a LAN (local area
network). Does not require an Internet
connection.
- For an Internet server: You need a dedicated
Internet connection. Internet connectivity
options:
  - POTS (up to 56 Kbps): not practical for web use
    anymore.
  - ISDN (128 Kbps): not practical for web use either
    (costly, slow).
  - Cable (512 Kbps to 10 Mbps)
  - DSL (128 Kbps to 1.54 Mbps)
  - T-1 (up to 1.54 Mbps): full, fractional, or
    burstable
  - T-3 (up to 45 Mbps): full, fractional, or
    burstable
21. Finding an ISP
- Some ISPs specialize in web hosting and provide
all the services described earlier (shared
hosting, dedicated server, co-location, etc.).
- Other ISPs specialize only in commercial Internet
access (AOL, Earthlink, etc.). They may provide
free personal website space.
- Check The List (http://thelist.com) for a
comprehensive list of ISPs and their services.
22. Hosting Your Server: Hardware Options
- There are a number of things to consider when
choosing the hardware platform for your website:
  - Machine architecture (e.g., Intel-compatible PC,
    Sun, Macintosh). Typically dictated by your
    operating system of choice.
  - Processor speed and number of processors.
  - RAM and disk space.
  - Network interface card (NIC).
- Price can range from several hundred dollars to
thousands of dollars.
23. Web Server Hardware: Myths vs. Realities
[Table: Myth | Reality]
24. Important Notes about Web Server Hardware
- Web servers need fast disk access and a lot of
RAM to handle high volumes of traffic.
- It is not unusual to see web servers with 1 GB of RAM
and 10,000 RPM hard drives.
- Processor speed and performance become very
important when delivering dynamic content via
custom web applications.
- High-end PCs can typically handle 100K website
visits per day.
25. Hosting Your Server: Operating System Options
- Commercial versions of Unix (e.g., Solaris, HP-UX,
AIX, MacOS X, IRIX).
- Free versions of Unix (e.g., Linux, FreeBSD).
- Microsoft Windows (9x, NT, XP, Windows 2000).
- Windows vs. Unix raises issues of ease of use,
stability, scalability, open source, and pricing.
UNIX platforms are generally considered more
reliable, scalable, and cost-effective.
26. Hosting Your Server: Web Server Software Options
- The primary focus of this class will be
installing and configuring the web server
software: the software that turns an ordinary
computer into a computer that can host and serve
content on the World Wide Web.
- Web server software is often referred to as:
  - The Web server
  - The web daemon
  - The httpd
27. Hosting Your Server: Web Server Software Options
- According to the Oct. 2002 Netcraft Web Server
Survey (http://www.netcraft.com), four Web server
software distributions support over 90% of all
Websites on the Internet:
  - Apache: 65%
  - Microsoft Internet Information Server: 25%
  - Zeus Web Server: 1.4%
  - iPlanet: 1.3%
28. Web Server Software Options: Apache
- The standard for UNIX web servers.
- The most popular web server. Considered to be
the most secure, stable, and robust server
platform.
- Originally based on NCSA httpd code.
- Can be installed under most Unix variants and
Windows. Binary versions are available for many
operating systems.
- Uses file-based configuration, although some GUI
tools are also available.
29. Introduction to Apache, cont.
- Unix versions are very stable. The Windows version is
less mature, but becoming more stable. Apache 2.0 has
been released.
- Very fast and uses resources efficiently.
- Freely distributed source code. Can be modified
for commercial or non-commercial use.
- Price: Free. Developed by the Apache Software
Foundation.
- See http://www.apache.org for more information.
30. Web Server Software Options: SunOne/iPlanet/Netscape
Server
- Now officially called the SunOne Web Server.
- Originally developed as Netscape Server, then
distributed by a partnership between Sun and Netscape,
now owned and supported by Sun.
- Server packages: iPlanet/Netscape Enterprise
Server, Netscape Fast-Track Server.
- Runs under Windows NT, Solaris, HP-UX, Digital
Unix, AIX, Linux.
31. Netscape/SunOne Server, cont.
- Uses Web-based administration.
- Can be resource intensive.
- Lost a major portion of the market over the last 5 years.
- Price: $1,495 per CPU.
- See http://wwws.sun.com/software/products/web_srvr/home_web_srvr.html
for more information.
32. Web Server Software Options: Microsoft Internet
Information Server (IIS)
- Most popular for Windows NT and 2000-based web
servers.
- Version 4 runs on Windows NT Server. IIS version
5 runs on Windows 2000 Server (and XP, but used
for development purposes only).
- GUI-based administration. Web-based
administration is available as well.
- May not scale well.
33. Microsoft IIS, cont.
- Increasing concerns over its security.
- Source code not available. Extensible through
Microsoft's Internet Server API (ISAPI).
- Price: Free with NT Server 4.0 and Windows 2000
Server.
- See http://www.microsoft.com/windows2000/technologies/web/default.asp
for more information.
34. How the Internet Works: Networking Basics
- For a Web server to be useful, it needs to be
attached to a network.
- Minimum requirements for a computer network: at
least two computers that have a medium and a
method of communicating.
- All Internet applications use TCP/IP
(Transmission Control Protocol/Internet Protocol)
for low-level communications.
35. Networking Basics: TCP/IP
- TCP/IP is actually a combination of 2 protocols:
  - A transport layer protocol called the
    Transmission Control Protocol (TCP)
  - A network protocol called the Internet Protocol
    (IP)
36. Networking Basics: IP Addresses
- TCP/IP uses IP addresses to identify different
devices. Every computer on the Internet must have
at least one unique IP address.
- IPv4 IP addresses are four 8-bit numbers separated
by dots: 165.230.30.68
- Usually divided into three parts:
  - 165.230 is one of Rutgers' networks, i.e., no one
    else has addresses starting with 165.230
  - 30 is the subnet portion of the address
  - 68 is the particular node, or host portion of the
    address
- The division is not necessarily on an octet boundary.
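The three-part breakdown of 165.230.30.68 can be reproduced with Python's `ipaddress` module. This is a sketch under stated assumptions: the helper name `split_address` is made up for illustration, and the 16-bit network and 24-bit subnet prefixes are assumed only because they match the octet-aligned example on the slide. As the slide notes, real divisions need not fall on octet boundaries.

```python
import ipaddress


def split_address(address, network_prefix=16, subnet_prefix=24):
    """Split a dotted IPv4 address into network, subnet, and host parts.

    The prefix lengths are assumptions that match the slide's example;
    pass different values for networks divided elsewhere.
    """
    ip = ipaddress.ip_address(address)
    # Each dotted field is one 8-bit number (0-255).
    octets = [int(part) for part in address.split(".")]
    # strict=False lets us derive the enclosing network from a host address.
    network = ipaddress.ip_network(f"{address}/{network_prefix}", strict=False)
    subnet = ipaddress.ip_network(f"{address}/{subnet_prefix}", strict=False)
    # The host portion is whatever bits remain after the subnet prefix.
    host_bits = 32 - subnet_prefix
    host = int(ip) & ((1 << host_bits) - 1)
    return {
        "octets": octets,
        "network": str(network),
        "subnet": str(subnet),
        "host": host,
    }


if __name__ == "__main__":
    print(split_address("165.230.30.68"))
```

Changing `subnet_prefix` to a value such as 26 shows how the host portion shrinks when the boundary falls inside an octet.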
37. TCP/IP: Two Friends, Working Together
- IP: An IP address represents a machine's
identity on the Internet and tells other machines
how to get to it, similar to your street address
(e.g. 123 Main Street, Anytown, USA).
- TCP is a mechanism used to ensure that anything
sent to a specific IP address makes it there in
one piece, similar to the Post Office.
- Together, TCP/IP assures that anything sent to a
server on the Internet is delivered to the right
place in one complete piece.
38. Networking Basics: IP Addresses
- IP addresses are no longer distributed by
class. Blocks are distributed to ISPs on an
as-needed basis and must be justified.
- IP addresses are hard to come by. How do you get
them?
  - Your ISP receives address space from
    ARIN (http://www.arin.org).
  - You receive IP addresses from your ISP.
39. Networking Basics: Tools
- Network interfaces need to be assigned IP
addresses.
- Interfaces can be configured using the ifconfig
command on UNIX machines.
- Type ifconfig -a to view current configuration
settings.
- Additional tools for network monitoring: ping,
traceroute, tcpdump, netstat, arp, snoop.
40. Networking Basics: DNS
- IP addresses are usually paired with more
human-friendly names. The system that contains the
IP address-to-hostname pairing is called the
Domain Name System (DNS).
- Example: internet.rutgers.edu
  - internet: hostname
  - rutgers: organization
  - edu: top-level domain
- Other top-level domains include .com, .gov, .org,
etc. There are also country-specific domains
like .uk, .ca, .jp, etc.
41. Networking Basics: DNS, cont.
- Domain name information is maintained through a
distributed database of host name/IP address
pairings.
- The Network Information Center (NIC) manages the
top-level domains and maintains a database of
registered name servers for all domain names.
- Host name assignments are maintained through zone
files on primary and secondary DNS servers
controlled by the organization that owns the
domain (or their ISP).
42. Networking Basics: DNS, cont.
- Network Solutions (previously the InterNIC)
registers domain names. See
http://www.networksolutions.com. Other registrars
include Register.com.
- Costs range from $20 to $50 per year.
- ISPs often offer domain name registration as
part of other packages (such as web hosting
packages).
- You need to register primary and secondary domain
name servers for your domain and arrange to have
zone files created on those DNS servers. Your ISP
will typically configure this for you.
43. DNS Overview: If Computers on the Internet Could
Talk
44. Networking Basics: DNS Tools
- There are several tools for monitoring DNS
information:
  - whois tells you the owner and primary DNS
    servers associated with a domain (e.g. whois
    yahoo.com). Also available via web browser at
    www.networksolutions.com.
  - nslookup and host tell you IP address
    information for a particular hostname on the
    Internet (e.g. nslookup www.yahoo.com or host
    www.rutgers.edu).
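What nslookup and host do for a single hostname can be approximated from a program with the standard `socket.getaddrinfo` call, which asks the system resolver for a name's addresses. This is a rough sketch, not a replacement for those tools: the helper name `lookup` is hypothetical, and the demo resolves `localhost` only so that it runs without an Internet connection.

```python
import socket


def lookup(hostname):
    """Return the sorted, de-duplicated IP addresses the resolver reports.

    Raises socket.gaierror if the name cannot be resolved.
    """
    infos = socket.getaddrinfo(hostname, None)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address as a string for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    # "localhost" keeps the demo offline; on a connected machine you could
    # pass a public name such as www.rutgers.edu instead.
    print(lookup("localhost"))
```

Unlike whois, this reports only addresses; it says nothing about who owns a domain or which name servers are registered for it.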
45. DNS Exercise
- What are the IP addresses of the DNS servers that
contain information about rutgers.edu?
- What are the IP addresses of:
  - www.retaildecisions.com
  - abusaday.admin.cju.com
  - www.linux.org
46. Networking Basics: Ports
- Servers tend to run a number of services. A
single NIC can be used to provide multiple
services through ports.
- Network server software listens on specific
ports. Clients contact a server by specifying an IP
address and a connection port. The port is the
identifier that tells a server what application
a piece of network traffic is destined for.
- Common services and port numbers:
  - smtp: 25, ftp: 21, telnet: 23, http/web: 80,
    https/ssl: 443
- A list of services and ports is contained in the
/etc/services file on UNIX systems.
- Ports below 1024 are reserved for special system
services and can only be used by programs started
by root (the system administrator on a UNIX
system).
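The service-to-port mapping and the below-1024 rule can be sketched in a few lines of Python. The helper names `parse_services` and `is_privileged` and the `SAMPLE` text are made up for illustration; the sample imitates the usual `name port/protocol aliases` layout of /etc/services, and a real file on your system will contain many more entries.

```python
SAMPLE = """\
# service-name  port/protocol  [aliases]
ftp             21/tcp
telnet          23/tcp
smtp            25/tcp         mail
http            80/tcp         www
https           443/tcp
"""


def parse_services(text):
    """Map TCP service names to port numbers from /etc/services-style text."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2 or "/" not in fields[1]:
            continue  # skip anything that is not "name port/protocol"
        port, protocol = fields[1].split("/", 1)
        if protocol == "tcp":
            table.setdefault(fields[0], int(port))
    return table


def is_privileged(port):
    """True for ports below 1024, which only root-started programs may bind."""
    return 0 < port < 1024


if __name__ == "__main__":
    for name, port in sorted(parse_services(SAMPLE).items(), key=lambda kv: kv[1]):
        print(f"{name:8} {port:5} privileged={is_privileged(port)}")
```

To try it against a real system, read the text with `open("/etc/services").read()` and pass it to `parse_services`.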
47. Uniform Resource Locator (URL)
- URL: a fancy way of saying web site address.
- Anatomy of a URL:
http://internet.rutgers.edu:80/ITI520/index.html
  - Protocol: http
  - Hostname: internet.rutgers.edu
  - Port number: 80
  - Path to file: /ITI520/index.html
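The same four parts can be pulled out of the slide's example URL with `urllib.parse.urlsplit` from Python's standard library. The function name `anatomy` and its dictionary keys are chosen here only to mirror the slide's labels.

```python
from urllib.parse import urlsplit


def anatomy(url):
    """Break a URL into the four parts named on the slide."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "hostname": parts.hostname,
        "port": parts.port,  # None when the URL does not state a port
        "path": parts.path,
    }


if __name__ == "__main__":
    print(anatomy("http://internet.rutgers.edu:80/ITI520/index.html"))
```

When a URL omits the port, `port` comes back as `None` and the client falls back to the protocol's default (80 for http).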
48. Unix Tools and Commands
- File editors: vi, emacs, pico
- File system navigation: cd
- File management: mv, rm, mkdir, rmdir, ls, chmod,
ln
- Archiving and compression: tar, gzip
- Process management: ps, kill
- Man pages are available for all these commands, e.g.
man rmdir
49. UNIX Process Management
- UNIX processes are managed using the ps and kill
commands:
  - ps is used to list processes running on the
    system
  - kill is used to kill and restart processes
    running on the system
- Every time you start a new program (pico, vi,
bash, etc.) a process is created and you are the
owner of that process.
- Each process is assigned a unique Process ID
(PID) on the system.
50. Process Management Exercises
- You can type ps aux to see all the processes
running on a system. This will list the process
owner, process ID (PID), and the command being
run.
- You can kill any PID, as long as you are the
owner of the process.
- ps aux | grep username will show all the
processes you are currently running.
51. Process Management Exercises, cont.
- Open up a new UNIX terminal window and type vi
foo.txt. This will create a new process on the
system that you own.
- Open a second terminal window on the same UNIX
system. Locate the process ID for your vi
session and kill it. What happens?
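The start-a-process, find-its-PID, kill-it cycle from this exercise can also be driven from a script. The sketch below is Unix-only and makes two substitutions that are assumptions, not part of the exercise: a sleeping child Python process stands in for the vi session, and `os.kill` with `SIGTERM` stands in for typing kill with the PID. The function name `start_and_kill` is hypothetical.

```python
import os
import signal
import subprocess
import sys


def start_and_kill():
    """Start a child process we own, then terminate it by PID."""
    # Stand-in for "vi foo.txt": a child that would otherwise sleep for a minute.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    pid = child.pid  # the unique process ID assigned by the system

    # Equivalent of running "kill <pid>": send SIGTERM to that PID.
    os.kill(pid, signal.SIGTERM)

    # Wait for the child to exit; on Unix a negative return code
    # is the number of the signal that ended the process.
    returncode = child.wait(timeout=10)
    return pid, returncode


if __name__ == "__main__":
    pid, returncode = start_and_kill()
    print(f"process {pid} exited with return code {returncode}")
```

Because the script started the child, it owns that process and is allowed to kill it, which is the same ownership rule the exercise asks you to observe.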
52. Reading for Next Week