Title: The Application Layer
1. The Application Layer
2. The Application Layer
- The layers below the application layer are there to provide reliable end-to-end communication.
- The application layer contains all the higher-level protocols.
- Supporting protocols
- DNS - Domain Name System
- Real Applications
- Email.
- World Wide Web.
- Multimedia.
3. DNS: The Domain Name System
- The DNS Name Space
- Resource Records
- Name Servers
4. The DNS Name Space
- A portion of the Internet domain name space.
5. Resource Records
- The principal DNS resource record types.
6. Resource Records (2)
- A portion of a possible DNS database for cs.vu.nl.
7. Name Servers
- Part of the DNS name space showing the division into zones.
8. Name Servers (2)
- How a resolver looks up a remote name in eight steps.
9. Electronic Mail
- Architecture and Services
- The User Agent
- Message Formats
- Message Transfer
- Final Delivery
10. Electronic Mail (2)
- Some smileys. They will not be on the final exam :-).
11. Architecture and Services
- Basic functions
- Composition
- Transfer
- Reporting
- Displaying
- Disposition
12. The User Agent
- Envelopes and messages. (a) Paper mail. (b) Electronic mail.
13. Reading E-mail
- An example display of the contents of a mailbox.
14. Message Formats: RFC 822
- RFC 822 header fields related to message transport.
15. Message Formats: RFC 822 (2)
- Some fields used in the RFC 822 message header.
16. MIME: Multipurpose Internet Mail Extensions
- Problems with international languages
- Languages with accents (French, German).
- Languages in non-Latin alphabets (Hebrew, Russian).
- Languages without alphabets (Chinese, Japanese).
- Messages not containing text at all (audio or images).
17. MIME (2)
- RFC 822 headers added by MIME.
18. MIME (3)
- The MIME types and subtypes defined in RFC 2045.
19. MIME (4)
- A multipart message containing enriched and audio alternatives.
20. Message Transfer
- Transferring a message from elinore@abc.com to carolyn@xyz.com.
21. Final Delivery
- (a) Sending and reading mail when the receiver has a permanent Internet connection and the user agent runs on the same machine as the message transfer agent. (b) Reading e-mail when the receiver has a dial-up connection to an ISP.
22. POP3
- Using POP3 to fetch three messages (a sketch follows).
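The figure referenced above is not reproduced in this transcript. As a stand-in, here is a minimal sketch using Python's standard poplib module; the host name and credentials are placeholders, not values from the slides:

```python
import poplib

# Placeholder host and credentials - substitute a real POP3 account.
mailbox = poplib.POP3("pop3.example.com")      # connect on port 110
mailbox.user("carolyn")
mailbox.pass_("secret")

count, _size = mailbox.stat()                  # how many messages are waiting
for i in range(1, count + 1):                  # e.g., three messages
    _resp, lines, _octets = mailbox.retr(i)    # fetch message i
    print(b"\r\n".join(lines).decode(errors="replace"))
    mailbox.dele(i)                            # mark it for deletion

mailbox.quit()                                 # commit deletions and hang up
```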
23. IMAP
- A comparison of POP3 and IMAP.
24. The World Wide Web
- Architectural Overview
- Static Web Documents
- Dynamic Web Documents
- HTTP: The HyperText Transfer Protocol
- Performance Enhancements
- The Wireless Web
25. Architectural Overview
- (a) A Web page. (b) The page reached by clicking on Department of Animal Psychology.
26. Architectural Overview (2)
- The parts of the Web model.
27. The Client Side
- (a) A browser plug-in. (b) A helper application.
28. The Server Side
- A multithreaded Web server with a front end and processing modules.
29. The Server Side (2)
30. The Server Side (3)
- (a) Normal request-reply message sequence.
- (b) Sequence when TCP handoff is used.
31. URLs: Uniform Resource Locators
32. Statelessness and Cookies
- Some examples of cookies.
33. HTML: HyperText Markup Language
- (a) The HTML for a sample Web page. (b) The formatted page.
34. HTML (2)
- A selection of common HTML tags. Some can have additional parameters.
35. Forms
- (a) An HTML table.
- (b) A possible rendition of this table.
36. Forms (2)
- (a) The HTML for an order form.
- (b) The formatted page.
37. Forms (3)
- A possible response from the browser to the server with information filled in by the user.
38. XML and XSL
- A simple Web page in XML.
39. XML and XSL (2)
40. Dynamic Web Documents
- Steps in processing the information from an HTML form.
41. Dynamic Web Documents (2)
- A sample HTML page with embedded PHP.
42. Dynamic Web Documents (3)
- (a) A Web page containing a form. (b) A PHP script for handling the output of the form. (c) Output from the PHP script when the inputs are "Barbara" and 24, respectively.
43. Client-Side Dynamic Web Page Generation
- Use of JavaScript for processing a form.
44. Client-Side Dynamic Web Page Generation (2)
- (a) Server-side scripting with PHP.
- (b) Client-side scripting with JavaScript.
45. Client-Side Dynamic Web Page Generation (3)
- A JavaScript program for computing and printing factorials.
46. Client-Side Dynamic Web Page Generation (4)
- An interactive Web page that responds to mouse movement.
47. Client-Side Dynamic Web Page Generation (5)
- The various ways to generate and display content.
48. HTTP Methods
- The built-in HTTP request methods.
49. HTTP Methods (2)
- The status code response groups.
50. HTTP Message Headers
- Some HTTP message headers.
51. Example HTTP Usage
- The start of the output of www.ietf.org/rfc.html.
52. Caching
- Hierarchical caching with three proxies.
53. Content Delivery Networks
- (a) Original Web page. (b) Same page after transformation.
54. Content Delivery Networks (2)
- Steps in looking up a URL when a CDN is used.
55. WAP: The Wireless Application Protocol
56. WAP (2)
57. I-Mode
- Structure of the i-mode data network showing the transport protocols.
58. I-Mode (2)
- Structure of the i-mode software.
59. I-Mode (3)
- Lewis Carroll meets a 16 x 16 screen.
60. I-Mode (4)
- An example of a cHTML file.
61. Second-Generation Wireless Web
- A comparison of first-generation WAP and i-mode.
62. Second-Generation Wireless Web (2)
- New features of WAP 2.0:
- Push model as well as pull model.
- Support for integrating telephony into apps.
- Multimedia messaging.
- Inclusion of 264 pictograms.
- Interface to a storage device.
- Support for plug-ins in the browser.
63. Second-Generation Wireless Web (3)
- WAP 2.0 supports two protocol stacks.
64. Second-Generation Wireless Web (4)
- The XHTML Basic modules and tags.
65. Multimedia
- Introduction to Audio
- Audio Compression
- Streaming Audio
- Internet Radio
- Voice over IP
- Introduction to Video
- Video Compression
- Video on Demand
- The MBone: The Multicast Backbone
66. Multimedia
- For many people, multimedia is the holy grail of networking. It poses immense technical challenges in providing (interactive) video on demand to every home, and promises equally immense profits.
- Literally, multimedia is just two or more media. Generally, the term multimedia means the combination of two or more continuous media. In practice, the two media are normally audio and video.
67. Introduction to Audio
- The representation, processing, storage, and transmission of audio signals are a major part of the study of multimedia systems.
- The frequency range of the human ear runs from 20 Hz to 20 kHz. The ear is very sensitive to sound variations lasting only a few milliseconds. The eye, in contrast, does not notice changes in light level lasting only a few milliseconds.
- So, jitter of only a few milliseconds during a multimedia transmission affects the perceived sound quality more than it affects the perceived image quality.
- Audio waves can be converted to digital form by an ADC (Analog-to-Digital Converter), as shown in the following figure.
68. Digital audio
- (a) A sine wave. (b) Sampling the sine wave. (c) Quantizing the samples to 4 bits.
- If the highest frequency in a sound wave is f, the Nyquist theorem states that it is sufficient to sample at a frequency of 2f. Digital samples are never exact; the error introduced by the finite number of bits per sample is called quantization noise.
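In symbols (the signal-to-quantization-noise formula is a standard result for a full-scale sine wave, not stated on the slide):

```latex
f_s \;\ge\; 2f \quad \text{(Nyquist: sampling at twice the highest frequency loses nothing)}

\mathrm{SQNR} \;\approx\; 6.02\,n + 1.76\ \text{dB} \quad \text{(quantization noise with } n \text{ bits per sample)}
```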
69. Examples of sampled sound
- Two well-known examples of sampled sound:
- The telephone: pulse code modulation uses 7-bit (US and Japan) or 8-bit (Europe) samples 8000 times per second, giving a data rate of 56 kbps or 64 kbps.
- Audio CDs are digital, with a sampling rate of 44,100 samples/sec, enough to capture frequencies up to 22,050 Hz. The samples are 16 bits each, which allows only 65,536 distinct values (but the dynamic range of the ear is about 1 million when measured in steps of the smallest audible sound). Thus, using 16 bits per sample introduces some quantization noise.
- With 44,100 samples/sec of 16 bits each, an audio CD needs a bandwidth of 705.6 kbps for monaural and 1.411 Mbps for stereo, so transmitting uncompressed CD-quality stereo sound requires almost a full T1 channel (1.544 Mbps).
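The CD figures follow directly from the sampling parameters:

```latex
44{,}100\ \tfrac{\text{samples}}{\text{sec}} \times 16\ \tfrac{\text{bits}}{\text{sample}} = 705{,}600\ \text{bps} = 705.6\ \text{kbps (mono)}

2 \times 705.6\ \text{kbps} = 1.4112\ \text{Mbps (stereo)} \;<\; 1.544\ \text{Mbps (T1)}
```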
70. MIDI (Musical Instrument Digital Interface)
- Dozens of programs exist for personal computers to allow users to record, display, edit, mix, and store sound waves from multiple sources.
- A standard, MIDI (Musical Instrument Digital Interface), has been adopted by virtually the entire music industry. It specifies the connector, the cable, and the message format.
- Each MIDI message consists of a status byte followed by zero or more data bytes.
- A MIDI message conveys one musically significant event, such as a key being pressed, a slider being moved, or a foot pedal being released. The status byte indicates the event, and the data bytes give parameters (e.g., which key was depressed). A minimal sketch of such a message follows this list.
- The heart of every MIDI system is a synthesizer (often a computer) that accepts messages and generates music from them.
- The advantage of transmitting music using MIDI compared to sending a digitized waveform is the enormous reduction in bandwidth, often by a factor of 1000.
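As an illustration of the status-byte/data-byte format (grounded in the public MIDI 1.0 specification rather than these slides; the helper name is ours):

```python
def midi_note_on(channel: int, key: int, velocity: int) -> bytes:
    """Build a 3-byte MIDI Note On message.

    Status byte: 0x90 | channel (Note On, channels 0-15).
    Data bytes:  key number and velocity, each 0-127 (high bit clear).
    """
    assert 0 <= channel <= 15 and 0 <= key <= 127 and 0 <= velocity <= 127
    return bytes([0x90 | channel, key, velocity])

# Middle C (key 60) struck at moderate velocity on channel 0:
print(midi_note_on(0, 60, 64).hex())  # "903c40" - just 3 bytes, versus
# thousands of bytes for the same note as a sampled waveform.
```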
71. Audio Compression
- The most popular audio compression algorithm is MPEG audio, which has three layers (variants), of which MP3 (MPEG audio layer 3) is the most powerful and best known.
- Audio compression can be done in one of two ways:
- Waveform coding: the signal is transformed mathematically by a Fourier transform into its frequency components. The amplitude of each component is then encoded in a minimal way. The goal is to reproduce the waveform accurately at the other end in as few bits as possible.
- Perceptual coding: exploits certain flaws in the human auditory system to encode a signal in such a way that it sounds the same to a human listener, even if it looks quite different on an oscilloscope. Perceptual coding is based on the science of psychoacoustics: how people perceive sound. MP3 is based on perceptual coding.
72. Audio Compression (2)
- The key property of perceptual coding is that some sounds can mask other sounds.
- Frequency masking: the ability of a loud sound in one frequency band to hide a softer sound in another frequency band that would have been audible in the absence of the loud sound.
- Temporal masking: when a loud sound starts, the ear turns down its gain and takes a finite time to turn it up again.
- (a) The threshold of audibility as a function of frequency. (b) The masking effect.
73. MP3 audio compression
- The essence of MP3 is to Fourier-transform the sound to get the power at each frequency and then transmit only the unmasked frequencies, encoding these in as few bits as possible.
- MP3 audio compression:
- It samples the waveform at 32 kHz, 44.1 kHz, or 48 kHz.
- The sampled audio signal is transformed from the time domain to the frequency domain by a fast Fourier transform.
- The resulting spectrum is then divided up into 32 frequency bands, each of which is processed separately.
- The MP3 audio stream rate is adjustable from 32 kbps to 448 kbps. MP3 can compress a rock 'n' roll CD down to 96 kbps with no perceptible loss in quality, even for rock 'n' roll fans. For a piano concerto, at least 128 kbps are needed.
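A toy sketch of the "transmit only the unmasked frequencies" idea. This is not MP3's actual psychoacoustic model; the flat 40 dB masking threshold is an invented illustration:

```python
import numpy as np

def toy_perceptual_code(samples: np.ndarray, mask_db: float = 40.0) -> np.ndarray:
    """Keep only spectral components within mask_db of the loudest one.

    A stand-in for a real psychoacoustic model: components far below the
    strongest tone are treated as masked and simply not transmitted.
    """
    spectrum = np.fft.rfft(samples)
    magnitude = np.abs(spectrum)
    threshold = magnitude.max() * 10 ** (-mask_db / 20)  # mask_db below peak
    spectrum[magnitude < threshold] = 0                   # "masked" -> dropped
    return spectrum  # only the nonzero entries would be encoded and sent

# A loud 440 Hz tone plus a very soft 5 kHz tone, one second at 32 kHz:
t = np.arange(32000) / 32000
signal = np.sin(2 * np.pi * 440 * t) + 1e-4 * np.sin(2 * np.pi * 5000 * t)
kept = np.count_nonzero(toy_perceptual_code(signal))
print(f"components kept: {kept} of {len(signal) // 2 + 1}")  # 1 of 16001
```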
74. Streaming Audio
- A straightforward way to implement clickable music on a Web page.
75. Streaming Audio (2)
- When packets carry alternate samples, the loss of a packet reduces the temporal resolution rather than creating a gap in time.
76. Streaming Audio (3)
- The media player buffers input from the media server and plays from the buffer rather than directly from the network.
77. Streaming Audio (4)
- RTSP commands from the player to the server.
78. Internet Radio
79. Voice over IP
- The H.323 architectural model for Internet telephony.
80. Voice over IP (2)
81. Voice over IP (3)
- Logical channels between the caller and callee during a call.
82. SIP: The Session Initiation Protocol
- The SIP methods defined in the core specification.
83. SIP (2)
- Use of proxy and redirection servers with SIP.
84. Comparison of H.323 and SIP
85. Video
- The human eye has the property that when an image is flashed on the retina, it is retained for a few milliseconds before decaying.
- If a sequence of images is flashed at 50 or more images/sec, the eye does not notice that it is looking at discrete images. All TV systems exploit this property to produce moving pictures.
- To represent the two-dimensional image as a one-dimensional voltage as a function of time, the camera and the receiver both use the scanning pattern shown in the following figure.
86. Video: Analog Systems
- The exact scanning parameters vary from country to country:
- US and Japan: 525 scan lines (only 483 lines displayed), a horizontal-to-vertical aspect ratio of 4:3, and 30 frames/sec.
- Europe: 625 scan lines (only 576 lines displayed), the same 4:3 aspect ratio, and 25 frames/sec.
87. Interlacing and progressive techniques
- The interlacing technique: instead of displaying the scan lines in order, first all the odd scan lines are displayed, then the even ones. Each of these half frames is called a field. Experiments have shown that although people notice flicker at 25 frames/sec, they do not notice it at 50 fields/sec.
- Non-interlaced TV or video is said to be progressive.
88. Color video
- Color video uses three beams moving in unison, one beam for each of the three additive primary colors: red, green, and blue (RGB). For transmission on a single channel, the three color signals must be combined into a single composite signal.
- To ensure that programs transmitted in color would be receivable on existing black-and-white TV sets, the simplest scheme, just encoding the RGB signals separately, was not acceptable. This led to incompatible color TV systems in different countries:
- NTSC, the standard of the US National Television Standards Committee,
- SECAM (Sequentiel Couleur Avec Memoire), used in France and Eastern Europe, and
- PAL (Phase Alternating Line), used in the rest of Europe.
- All three systems linearly combine the RGB signals into a luminance (brightness) signal and two chrominance (color) signals.
- Since the eye is much more sensitive to the luminance signal than to the chrominance signals, the latter need not be transmitted as accurately, and they can be broadcast in narrow bands at higher frequencies.
- HDTV (High Definition TeleVision) produces sharper images by roughly doubling the number of scan lines, and has an aspect ratio of 16:9 instead of 4:3 to better match the format used for movies (shot on 35 mm film).
89. Digital video systems
- The representation of digital video consists of:
- A sequence of frames.
- Each frame consists of a rectangular grid of picture elements, or pixels.
- Each pixel can be represented by a single bit (b/w), 8 bits (for high-quality b/w video), or 24 bits (8 bits for each of the RGB colors).
- Smoothness of motion is determined by the number of different images per second, whereas flicker is determined by the number of times the screen is painted per second.
- To produce smooth motion, digital video, like analog video, must display at least 25 frames/sec.
- Since good-quality computer monitors often rescan the screen from images stored in memory at 75 times/sec or more, interlacing is not needed. Just repainting the same frame three times in a row is enough to eliminate flicker.
90. Digital systems
- Current computer monitors all use the 4:3 aspect ratio so they can use inexpensive, mass-produced picture tubes built for the TV market. Common configurations are:
- 640 × 480 (VGA)
- 800 × 600 (SVGA)
- 1024 × 768 (XGA)
- An XGA display with 24 bits per pixel and 25 frames/sec needs to be fed at 472 Mbps (see the check below), which exceeds the bandwidth of an OC-9 SONET carrier.
- Digital video transmits 25 frames/sec but has the computer store each frame and paint it twice to eliminate flicker.
- Analog TV broadcast transmits 50 fields (25 frames)/sec but uses interlacing to eliminate flicker, because TV sets do not have memory and cannot convert analog frames into digital form.
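The XGA feed rate checks out:

```latex
1024 \times 768\ \text{pixels} \times 24\ \tfrac{\text{bits}}{\text{pixel}} \times 25\ \tfrac{\text{frames}}{\text{sec}} = 471{,}859{,}200\ \text{bps} \approx 472\ \text{Mbps}
```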
91. Data compression
- Transmitting multimedia material in uncompressed form is completely out of the question.
- Fortunately, a large body of research over the past few decades has led to many compression techniques and algorithms that make multimedia transmission feasible.
- All compression systems require two algorithms:
- one for compressing the data at the source (encoding), and
- another for decompressing it at the destination (decoding).
92. Asymmetries of data compression algorithms
- These algorithms have certain asymmetries:
- For many applications, e.g., a multimedia document, a movie will only be encoded once (when it is stored on the multimedia server) but will be decoded thousands of times (when it is viewed by customers). Consequently, the decoding algorithm should be very fast and simple, even at the price of making encoding slow and complicated. On the other hand, for real-time multimedia, such as video conferencing, slow encoding is unacceptable, so different algorithms or parameters are used.
- The encode/decode process need not be invertible for multimedia documents, unlike compressed file transfer, which must ensure that the receiver gets the original file back. When the decoded output is not exactly equal to the original input, the system is said to be lossy; otherwise it is lossless. Lossy systems are important because accepting a small amount of information loss can give a huge payoff in the achievable compression ratio.
93. Entropy encoding
- Compression schemes can be divided into two general categories: entropy encoding and source encoding.
- Entropy encoding just manipulates bit streams without regard to what the bits mean. It is a general, lossless, fully reversible technique, applicable to all data.
- Run-length encoding:
- In many kinds of data, strings of repeated symbols (bits, numbers, etc.) are common. These can be replaced by:
- a special marker not otherwise allowed in the data, followed by
- the symbol comprising the run, followed by
- how many times it occurred.
- If the special marker itself occurs in the data, it is duplicated (as in character stuffing).
94. An example of entropy encoding
- A sequence of digits:
- 3150000000000008458711111111111116354674000000000000000000000065
- If A is used as the marker and two-digit numbers are used for the repetition count, the above digit string can be encoded as:
- 315A01284587A1136354674A02265
- A saving of about 50%.
- In audio, silence is often represented by runs of zeros. In video, runs of the same color occur in shots of the sky, walls, and many flat surfaces. All of these runs can be greatly compressed.
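A minimal sketch of this scheme, using "A" as the marker and a two-digit repetition count. The rule that only runs of four or more symbols are worth encoding is an assumption for illustration:

```python
def rle_encode(data: str, marker: str = "A") -> str:
    """Run-length encode: marker + symbol + two-digit count for each long run."""
    out, i = [], 0
    while i < len(data):
        ch = data[i]
        run = 1
        while i + run < len(data) and data[i + run] == ch and run < 99:
            run += 1
        if ch == marker:                 # marker appears in the data: duplicate it
            out.append(marker * 2 * run)
        elif run >= 4:                   # long run: worth encoding
            out.append(f"{marker}{ch}{run:02d}")
        else:                            # short run: copy literally
            out.append(ch * run)
        i += run
    return "".join(out)

digits = "315" + "0" * 12 + "84587" + "1" * 13 + "6354674" + "0" * 22 + "65"
encoded = rle_encode(digits)
print(encoded)                           # 315A01284587A1136354674A02265
print(len(digits), "->", len(encoded))   # 64 -> 29: roughly a 50% saving
```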
95. Statistical and CLUT encoding
- Statistical encoding:
- Basic idea: use a short code to represent common symbols and long ones to represent infrequent ones.
- Huffman coding and the Ziv-Lempel algorithm (used by the UNIX compress program) use statistical encoding.
- CLUT (Color Look Up Table) encoding:
- Suppose that only 256 different color values are actually used in an image (e.g., a cartoon or computer-generated drawing). A factor of almost three in compression can be achieved by:
- building a 768-byte table listing the three RGB values of the 256 colors actually used, and then
- representing each pixel by the index of its RGB value in the table.
- This is a clear example where encoding (searching the whole table) is slower than decoding (a single indexing operation).
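A minimal sketch of CLUT encoding under the stated assumption of at most 256 distinct colors; pixels are (R, G, B) triples:

```python
def clut_encode(pixels: list[tuple[int, int, int]]):
    """Replace each 3-byte RGB pixel by a 1-byte index into a color table."""
    table = sorted(set(pixels))          # the distinct colors actually used
    assert len(table) <= 256, "CLUT encoding needs <= 256 distinct colors"
    index = {color: i for i, color in enumerate(table)}  # encode: table lookup
    return table, bytes(index[p] for p in pixels)

def clut_decode(table, indices: bytes):
    return [table[i] for i in indices]   # decode: a single indexing per pixel

image = [(255, 0, 0), (0, 0, 255), (255, 0, 0), (255, 255, 255)] * 100
table, packed = clut_encode(image)
assert clut_decode(table, packed) == image
# 3 bytes/pixel -> 1 byte/pixel plus at most a 768-byte table: ~3x for big images
print(f"{3 * len(image)} bytes -> {len(packed)} + {3 * len(table)} (table) bytes")
```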
96. Source encoding
- Source encoding takes advantage of properties of the data to produce more (usually lossy) compression.
- Example 1: differential encoding
- A sequence of values (e.g., audio samples) is replaced by representing each one as the difference from the previous value.
- Example 2: transformations
- By transforming signals from one domain to another, compression may become much easier, e.g., the Fourier transform.
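A minimal sketch of Example 1 for integer samples; sending the first sample verbatim is an assumption this sketch makes explicit:

```python
def diff_encode(samples: list[int]) -> list[int]:
    """Send the first sample, then only the difference from the previous one."""
    return [samples[0]] + [b - a for a, b in zip(samples, samples[1:])]

def diff_decode(deltas: list[int]) -> list[int]:
    out = [deltas[0]]
    for d in deltas[1:]:
        out.append(out[-1] + d)          # undo the differencing
    return out

audio = [1000, 1002, 1003, 1001, 998, 998, 1005]
print(diff_encode(audio))   # [1000, 2, 1, -2, -3, 0, 7] - small values, cheap to code
assert diff_decode(diff_encode(audio)) == audio
```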
97. The JPEG Standard
- JPEG (International Standard 10918) was developed by photographic experts for compressing continuous-tone still pictures (e.g., photographs).
- The operation of JPEG in lossy sequential mode.
98. Step 1: Block preparation
- For the RGB input arrays shown in (a), separate matrices are constructed for the luminance, Y, and the two chrominance components, I and Q (for NTSC), according to the following formulas:
  Y = 0.30R + 0.59G + 0.11B
  I = 0.60R - 0.28G - 0.32B
  Q = 0.21R - 0.52G + 0.31B
- Square blocks of four pixels are averaged in the I and Q matrices to reduce them to 320 × 240.
- 128 is subtracted from each element of all three matrices.
- Each matrix is divided up into 8 × 8 blocks. The Y matrix has 4800 blocks; the other two have 1200 blocks each, as shown in (b).
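A sketch of Step 1, assuming a 640 × 480 RGB input frame (the size implied by the 4800/1200 block counts); NumPy is used for brevity:

```python
import numpy as np

def jpeg_block_prep(rgb: np.ndarray):
    """Step 1: RGB -> (Y, I, Q), average I and Q 2x2, subtract 128, cut 8x8 blocks."""
    R, G, B = (rgb[..., k].astype(float) for k in range(3))
    Y = 0.30 * R + 0.59 * G + 0.11 * B
    I = 0.60 * R - 0.28 * G - 0.32 * B
    Q = 0.21 * R - 0.52 * G + 0.31 * B
    # Average each square of four pixels in I and Q: 640x480 -> 320x240.
    avg = lambda m: (m[0::2, 0::2] + m[0::2, 1::2] + m[1::2, 0::2] + m[1::2, 1::2]) / 4
    I, Q = avg(I), avg(Q)
    blocks = lambda m: [m[r:r + 8, c:c + 8]          # divide into 8x8 blocks
                        for r in range(0, m.shape[0], 8)
                        for c in range(0, m.shape[1], 8)]
    center = lambda m: m - 128                       # subtract 128 everywhere
    return blocks(center(Y)), blocks(center(I)), blocks(center(Q))

yb, ib, qb = jpeg_block_prep(np.zeros((480, 640, 3), dtype=np.uint8))
print(len(yb), len(ib), len(qb))   # 4800 1200 1200, matching the slide
```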
99. Step 2: DCT (Discrete Cosine Transform)
- Apply the DCT to each of the 7200 blocks separately. The output of each DCT is an 8 × 8 matrix of DCT coefficients.
- DCT element (0, 0) is the average of the block. The other elements tell how much spectral power is present at each spatial frequency. These elements decay rapidly with distance from the origin, (0, 0), as suggested in the figure.
- (a) One block of the Y matrix. (b) The DCT coefficients.
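A sketch of Step 2 on a single block, using SciPy's type-II DCT. Under the orthonormal normalization chosen here, element (0, 0) equals 8 times the block average; the slide's "average of the block" holds up to that scale factor:

```python
import numpy as np
from scipy.fft import dctn

def block_dct(block: np.ndarray) -> np.ndarray:
    """2-D type-II DCT of one 8x8 block (applied to all 7200 blocks in JPEG)."""
    return dctn(block, norm="ortho")

block = np.arange(64, dtype=float).reshape(8, 8) - 128   # a toy centered block
coeffs = block_dct(block)
# (0, 0) carries the block average (times 8 under this normalization); power
# falls off quickly away from the origin for smooth image data.
print(round(coeffs[0, 0], 1), "vs 8 * mean =", round(8 * block.mean(), 1))
```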
100. Steps 3-4: Quantization and differential quantization
- Quantization wipes out the less important DCT coefficients by dividing each of the elements in the matrix by a weight taken from a table, as illustrated in the figure below.
- Differential quantization reduces the (0, 0) value of each block by replacing it with the amount by which it differs from the corresponding element in the previous block. Since these elements are the averages of their blocks, they should change slowly, so taking the differential values should reduce most of them to small values. No differentials are computed for the other elements.
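A sketch of Steps 3-4. The weight table below is a made-up example that simply grows with distance from (0, 0); the real table is the one in the figure:

```python
import numpy as np

# Hypothetical weight table: coarser quantization away from (0, 0).
WEIGHTS = 1 + np.add.outer(np.arange(8), np.arange(8)).astype(float)

def quantize(coeffs: np.ndarray) -> np.ndarray:
    """Step 3: divide each DCT coefficient by its weight and round."""
    return np.round(coeffs / WEIGHTS)

def dc_differences(quantized_blocks):
    """Step 4: replace each block's (0, 0) value by its change from the
    previous block; all other elements are left untouched."""
    prev_dc = 0.0
    for q in quantized_blocks:
        q = q.copy()
        q[0, 0], prev_dc = q[0, 0] - prev_dc, q[0, 0]
        yield q

blocks = [np.full((8, 8), 100.0), np.full((8, 8), 104.0)]   # slowly changing DC
for q in dc_differences(quantize(b) for b in blocks):
    print(q[0, 0])   # 100.0 then 4.0: the second DC shrinks to a small delta
```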
101. Steps 5-6: Run-length encoding and Huffman encoding
- The output matrix of the differential quantization is scanned in a zigzag pattern, and run-length encoding is used to reduce the resulting string of numbers.
- Huffman encoding is then used to encode the numbers for storage or transmission.
- Decoding a JPEG image requires running the algorithm backward.
- Since JPEG often produces a 20:1 compression ratio or better, it is widely used.
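A sketch of the zigzag scan in Step 5: entries are read anti-diagonal by anti-diagonal, alternating direction, which gathers the many zero high-frequency coefficients into one long run for the run-length encoder:

```python
import numpy as np

def zigzag(block: np.ndarray) -> list:
    """Read an 8x8 block in zigzag order: anti-diagonals, alternating direction."""
    order = sorted(
        ((r, c) for r in range(8) for c in range(8)),
        key=lambda rc: (rc[0] + rc[1],                   # which anti-diagonal
                        -rc[1] if (rc[0] + rc[1]) % 2 else rc[1]),
    )
    return [block[r, c] for r, c in order]

# Only a few low-frequency coefficients survive quantization:
q = np.zeros((8, 8), dtype=int)
q[0, 0], q[0, 1], q[1, 0] = 40, 3, -2
scan = zigzag(q)
print(scan[:3], "... then", scan[3:].count(0), "zeros in a row")
# [40, 3, -2] ... then 61 zeros in a row
```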
102. The MPEG (Motion Picture Experts Group) Standard
- MPEG is the main algorithm used to compress videos and has been an international standard since 1993. The following discussion focuses on MPEG video compression.
- MPEG-1 (International Standard 11172) has the goal of producing video-recorder-quality output (352 × 240 for NTSC) at a bit rate of 1.2 Mbps. MPEG-1 can be transmitted over twisted pairs for modest distances (100 meters).
- MPEG-2 (International Standard 13818) was originally designed for compressing broadcast-quality video into 4 to 6 Mbps, so it could fit in an NTSC or PAL broadcast channel. Later, it was extended to support HDTV. It forms the basis for DVD and digital satellite TV.
- MPEG-4 is for medium-resolution video conferencing with low frame rates (10 frames/sec) and low bandwidths (64 kbps).
103. The MPEG-1 system
- MPEG-1 has three parts: audio, video, and system, which integrates the other two, as shown below.
- The system clock runs at 90 kHz and feeds timestamps to both encoders. The timestamps are represented in 33 bits, to allow a film to run for 24 hours without wrapping around. These timestamps are included in the encoded output and propagated to the receiver.
104. MPEG-1 video compression
- It exploits two kinds of redundancy in movies: spatial and temporal.
- Spatial redundancy can be utilized by simply coding each frame separately with JPEG. In this mode, a compressed bandwidth in the 8- to 10-Mbps range is achievable.
- Additional compression can be achieved by taking advantage of the fact that consecutive frames are often almost identical.
- MPEG-1 output consists of four kinds of frames:
- 1. I (Intracoded) frames: self-contained JPEG-encoded still pictures.
- I-frames are needed for three reasons: to enable viewing to start in the middle of the stream, to enable decoding in the face of errors in previous frames, and to enable fast forward and rewind.
105. The P (Predictive) frames
- 2. P (Predictive) frames: block-by-block differences with the last frame. They are based on the idea of macroblocks, which cover 16 × 16 pixels in luminance space and 8 × 8 pixels in chrominance space. An example of where P-frames would be useful is given in the following figure.
- A macroblock is encoded by searching the previous frame for it, or for something only slightly different from it (see the sketch below).
- If a macroblock is found, it is encoded by taking the difference from its value in the previous frame and then applying JPEG to the difference.
- If a macroblock is not found, it is encoded directly with JPEG, just as in an I-frame.
- The value for the macroblock in the output stream is then:
- the motion vector (how far the macroblock moved from its previous position in each direction), followed by
- the JPEG encoding.
106. B (Bidirectional) and D (DC-coded) frames
- 3. B (Bidirectional) frames: differences with the last and next frames. Similar to P-frames, except that they allow the reference macroblock to be in either a previous frame or a succeeding frame. B-frames give the best compression, but not all implementations support them.
- 4. D (DC-coded) frames: block averages used for fast forward. Each D-frame entry is just the average value of one block, with no further encoding, making it easy to display in real time. D-frames are only used to make it possible to display a low-resolution image when doing a rewind or fast forward.
107. MPEG-2
- MPEG-2 differs from MPEG-1 in the following respects:
- It supports I-frames, P-frames, and B-frames, but not D-frames.
- The discrete cosine transform uses a 10 × 10 block instead of an 8 × 8 block, to give 50 percent more coefficients, hence better quality.
- MPEG-2 is targeted at broadcast TV as well as DVD applications, so it supports both progressive and interlaced images, whereas MPEG-1 supports only progressive images.
- MPEG-2 supports four resolution levels:
- low: 352 × 240, for VCRs and backward compatible with MPEG-1.
- main: 720 × 480, for NTSC broadcasting.
- high-1440: 1440 × 1152, for HDTV.
- high: 1920 × 1080, for HDTV.
- For high-quality output, MPEG-2 usually runs at 4-8 Mbps.
108. Video on Demand
- Watching video movies on demand is but one of a vast array of potential new services possible once wideband networking is available.
- Two different models of video on demand:
- Video rental store model: users are allowed to start, stop, and rewind videos of their choice at will. In this model, the video provider has to transmit a separate copy to each user.
- Multi-channel (e.g., 5000-channel) cable TV model: users are not allowed to pause or resume a video, but they can simply switch to another channel showing the same video, say, 10 minutes behind. In this model, the video provider can start each popular video every 10 minutes and run these copies nonstop. This model is called near video on demand.
- The general system structure of video on demand is illustrated in the following figure.
109. Overview of a video-on-demand system
- How these pieces will fit together and who will own what are matters of vigorous debate within the industry. Below we will examine the design of the main pieces: the video servers and the distribution network.
110. Video Servers
- Storage capacity requirement: the total number of movies ever made is estimated at 65,000. When compressed in MPEG-2, a normal movie occupies roughly 4 GB of storage, so 65,000 of them would require about 260 terabytes (without counting all the old TV programs ever made, sports films, etc.).
- Equipment and price: a DAT magnetic tape can store 8 GB (two movies) at a cost of about 5 dollars/gigabyte (the cheapest option). Large mechanical tape servers that hold thousands of tapes and have a robot arm for fetching any tape and inserting it into a tape drive are commercially available now. The problem with these systems is the access time, the transfer rate, and the limited number of tape drives.
111. Video server storage hierarchy
- Fortunately, experience with video rental stores, public libraries, and other such organizations shows that not all items are equally popular.
- Zipf's law: the most popular movie is n times as popular as the movie in position n of the popularity ranking.
- The fact that some movies are much more popular than others suggests a possible solution in the form of a storage hierarchy, as shown below.
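In symbols, with P(n) the popularity of the movie ranked n:

```latex
P(n) \;\propto\; \frac{1}{n} \qquad\Longrightarrow\qquad \frac{P(1)}{P(n)} = n
```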
112. DVD, Disk, and RAM
- An alternative to tape is optical storage. Current DVDs hold only 4.7 GB, good for one movie, but the next generation will hold two MPEG-2 movies. DVD seek times (50 msec) are slow compared to magnetic disks (5 msec), but their low cost and high reliability make them suitable for holding the more heavily used movies.
- Disks have short access times (5 msec), high transfer rates (320 MB/sec for SCSI 320), and substantial capacities (> 100 GB), which makes them well suited to holding movies that are actually being transmitted. Their main drawback is the high cost of storing movies that are rarely accessed.
- RAM is the fastest storage medium but the most expensive. It is best suited to movies for which different parts are being sent to different destinations at the same time (e.g., true video on demand with 100 users who all started at different times). When RAM prices drop to 50 dollars/GB, a 4-GB movie occupies 200 dollars' worth of RAM, so holding 100 movies in RAM will cost 20,000 dollars for the 400 GB of memory. Keeping all heavily watched movies in RAM then begins to look not only feasible but like a good design.
113. The hardware architecture of a typical video server
114. Video server software
- The CPUs are used for accepting user requests, locating movies, moving data between devices, customer billing, etc. Many of these tasks are time-critical, so a real-time operating system is needed.
- A real-time system normally breaks work up into small tasks, each with a known deadline. The scheduler can run an algorithm such as nearest deadline next.
- The interface between the video server and its clients (i.e., spooling servers and set-top boxes) has two popular designs:
- The traditional file system (e.g., UNIX) model: clients can open, read, write, and close files.
- The video recorder model: clients can open, play, pause, fast forward, and rewind files.
- The disk management software takes charge of:
- placing movies on the magnetic disks when they have to be pulled up from optical or tape storage, and
- handling disk requests for the many output streams.
- Two possible ways of organizing disk storage:
- Disk farm: each drive holds a few entire movies. For performance and reliability reasons, each movie may be present on more than one drive.
- Disk array or RAID (Redundant Array of Inexpensive Disks): each movie is spread out over multiple drives, block 0 on drive 0, ..., block n - 1 on drive n - 1, then block n on drive 0 again, and so forth. This organization is called striping (see the sketch below).
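A minimal sketch of the striping layout just described, mapping a movie's logical block number to a (drive, offset) pair across n drives:

```python
def stripe_location(block: int, n_drives: int) -> tuple[int, int]:
    """RAID-style striping: block b lives on drive b mod n, at stripe b div n."""
    return block % n_drives, block // n_drives

# With 4 drives, blocks 0..7 land on drives 0,1,2,3,0,1,2,3 - so sequential
# reads of one movie keep all spindles busy in parallel:
print([stripe_location(b, 4) for b in range(8)])
# [(0, 0), (1, 0), (2, 0), (3, 0), (0, 1), (1, 1), (2, 1), (3, 1)]
```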
115. The distribution network
- The distribution network is the set of switches and lines between the source and destination. The main requirement imposed on the backbone is high bandwidth. Low jitter used to be a requirement as well, but with even the smallest PC now able to buffer 10 sec of high-quality MPEG-2 video, low jitter is not a requirement anymore.
- Local distribution is highly chaotic, with different companies (telephone, cable TV, etc.) trying out different networks in different regions.
- ADSL (Asymmetric Digital Subscriber Line) was the telephone industry's first entrant in the local distribution sweepstakes. It makes use of existing copper twisted pairs, as discussed in Chap. 2, but it is not fast enough (4-8 Mbps) except for very short local loops.
- Another design is to run fiber into everyone's house, called FTTH (Fiber To The Home), which is very expensive and will not happen for years. When it really happens, it would be possible for every family member to run his or her own personal TV station!
116. FTTC (Fiber To The Curb)
- With FTTC, the fiber terminates at the curb, where about 16 copper local loops can terminate.
- FTTC is able to support MPEG-1 and MPEG-2 movies.
- Video conferencing for home workers and small businesses is possible because FTTC is symmetric.
117. HFC (Hybrid Fiber Coax)
- Instead of using point-to-point local distribution networks, a completely different approach is HFC (Hybrid Fiber Coax), the preferred solution currently being installed by cable TV providers, as illustrated below.
118. HFC (Hybrid Fiber Coax) (2)
- The current 300- to 450-MHz coax cables will be replaced by 750-MHz coax cables, upgrading the capacity from 50 to 75 6-MHz channels to 125 6-MHz channels.
- 75 of the 125 channels will be used for transmitting analog TV. The 50 new channels will each be modulated using QAM-256, which provides about 40 Mbps per channel, giving a total of 2 Gbps of new bandwidth.
- Each cable runs past about 500 houses, and each house can be allocated a dedicated 4-Mbps channel, which can be used for some combination of MPEG-1 programs, MPEG-2 programs, upstream data, analog and digital telephony, etc.
- HFC uses a shared medium without switching and routing. As a result, HFC providers want video servers to send out encrypted streams, so customers who have not paid for a movie cannot see it.
- On the other hand, FTTC is fully switched, so FTTC providers do not need encryption: it adds complexity, lowers performance, and provides no additional security in their system.
- For all these local distribution networks, it is likely that each neighborhood will be outfitted with one or more spooling servers, which may be preloaded with movies either dynamically or by reservation.
119. The MBone: The Multicast Backbone
- MBone can be thought of as Internet radio and TV. Its emphasis is on broadcasting live audio and video in digital form all over the world via the Internet.
- MBone has been operational since early 1992 and has been used for broadcasting many scientific conferences, such as IETF meetings, as well as newsworthy scientific events, such as space shuttle launches.
- Most of the research concerning MBone has been about how to do multicasting efficiently over the (datagram-oriented) Internet. Little has been done on audio or video encoding.
- Technically, MBone is a virtual overlay network on top of the Internet, as shown below.
120. Major MBone components
- Mrouters (mostly just UNIX stations running special user-level software) are logically connected by tunnels (defined just by tables in the mrouters).
- MBone packets are encapsulated within IP packets and sent as regular unicast packets to the destination mrouter's IP address.
- Tunnels are configured manually. For a new island to join MBone:
- the administrator sends a message announcing its existence to the MBone mailing list, and
- the administrators of nearby sites then contact him to arrange to set up tunnels.
- Multicast addressing:
- To multicast an audio or video program, a source must first acquire a class D multicast address (from a distributed database), which acts like a station frequency or channel number.
- Multicast group management:
- Periodically, each mrouter sends out a broadcast packet, limited to its island, asking who is interested in which channel.
- Hosts wishing to receive one or more channels send another packet back in response.
- Each mrouter keeps a table of which channels it must put out onto its LAN.
121. Major MBone components (2)
- Multicast routing:
- When an audio or video source generates a new packet, it multicasts it to its local island using the hardware multicast facility.
- This packet is picked up by the local mrouter, which copies it into all the tunnels to which it is connected.
- Each mrouter getting such a packet checks its routing table (based on the distance vector routing algorithm) and uses the reverse path forwarding algorithm to decide whether to drop or forward the packet.
- Moreover, the IP Time-to-live field is also used to limit the scope of multicasting. Each tunnel is assigned a weight; a packet is only passed through a tunnel if it has enough weight.
- Much research has been devoted to improving the above MBone routing algorithm. Read the text and references for more details.
- All in all, multimedia is an exciting and rapidly moving field. New technologies and applications are announced daily, but the area as a whole is likely to remain important for decades to come.