Introduction to IT 1, 3 Lecture 3: Data Representation - PowerPoint PPT Presentation


1
Introduction to IT (1), (3) Lecture 3: Data
Representation
Dr. Haipeng Guo United International College
Fall, 2006
2
Outline

Distinguish between analog and digital
information
Explain data compression and calculate
compression ratios
Explain the binary formats for negative values
Describe the characteristics of the ASCII and
Unicode character sets
Explain the nature of sound and its
representation.
Explain how RGB values define a color.
Explain how to represent images and graphics.
Explain how to represent video.

3
Data Representation

Data comes in many forms
Numbers: 235, 11.01, -24
Text: hello, world!
Audio: .mp3
Images and graphics: .bmp, .gif, .jpeg
Video: .avi
All of the data is stored in computers as binary
digits
Data must be represented in a way that
Captures the essence of the information
And in a form that is convenient for computer
processing

4
Data Compression

Data compression
Reduction in the amount of space needed to store
a piece of data.
Compression ratio
The size of the compressed data divided by the
size of the original data.
A data compression technique can be
lossless, which means the data can be retrieved
without any loss of the original information,
lossy, which means some information may be lost
in the process of compaction.
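
The compression ratio defined above can be computed directly. The following is a minimal sketch, not part of the original slides: the function name `compression_ratio` and the sample data are illustrative assumptions, and Python's standard `zlib` module stands in for a generic lossless compressor.

```python
import zlib


def compression_ratio(original_size: int, compressed_size: int) -> float:
    """Return compressed size divided by original size (smaller is better)."""
    if original_size <= 0:
        raise ValueError("original_size must be positive")
    return compressed_size / original_size


# Hypothetical sample data: highly repetitive text compresses well.
data = b"data representation " * 500
packed = zlib.compress(data)

# Lossless: decompressing gives back exactly the original bytes.
restored = zlib.decompress(packed)

print("original bytes:  ", len(data))
print("compressed bytes:", len(packed))
print("ratio:           ", compression_ratio(len(data), len(packed)))
print("lossless:        ", restored == data)
```

A ratio close to 0 means strong compression; a ratio of 1 means no space was saved.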

5
WinRAR

Currently the best archiver
WinRAR Tutorial
http://users.pandora.be/soulmaniacs/winrar.html

6
Analog and Digital Information

Computers are finite. Computer memory and other
hardware devices have only so much room to store
and manipulate data.
The goal is to represent enough of the world to
satisfy our computational needs and our senses of
sight and sound.

7
Analog and Digital Information

Information can be represented in one of two
ways: analog or digital.
Analog data: A continuous representation,
analogous to the actual information it
represents.
Digital data: A discrete representation,
breaking the information up into separate
elements.
A mercury thermometer is an analog device. The
mercury rises in a continuous flow in the tube in
direct proportion to the temperature.

8
Analog Data


9
Analog and Digital Information

Computers cannot work well with analog
information. So we digitize information by
breaking it into pieces and representing those
pieces separately.
Why do we use binary?
Modern computers are designed to use and manage
binary values because the devices that store and
manage the data are far less expensive and far
more reliable if they only have to represent one
of two possible values.

10
Electronic Signals (Cont'd)

An analog signal continually fluctuates in
voltage up and down. But a digital signal has
only a high or low state, corresponding to the
two binary digits.
All electronic signals (both analog and digital)
degrade as they move down a line. That is, the
voltage of the signal fluctuates due to
environmental effects.

11
Analog and Digital Information

Periodically, a digital signal is reclocked to
regain its original shape.

[Figure: An analog and a digital signal]
[Figure: Degradation of analog and digital signals]
12
Binary Representation

One bit can be either 0 or 1. Therefore, one bit
can represent only two things.
To represent more than two things, we need
multiple bits. Two bits can represent four things
because there are four combinations of 0 and 1
that can be made from two bits: 00, 01, 10, 11.

13
Binary Representation
14
Binary Representation

In general, n bits can represent 2^n things
because there are 2^n combinations of 0 and 1 that
can be made from n bits. Note that every time we
increase the number of bits by 1, we double the
number of things we can represent.
Questions
How many bits are needed to represent 128 things?
How many bits are needed to represent 67 things?
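
One way to check answers to questions like these is to compute the smallest n for which 2^n is at least the number of things. This is a small sketch, not from the original slides; the function name `bits_needed` is an illustrative assumption.

```python
from itertools import product


def bits_needed(things: int) -> int:
    """Smallest n such that 2**n >= things."""
    if things < 1:
        raise ValueError("things must be at least 1")
    # (things - 1).bit_length() is the number of bits needed to write
    # the largest index, things - 1, in binary.
    return (things - 1).bit_length()


# All combinations of 0 and 1 that can be made from two bits.
two_bit_patterns = ["".join(bits) for bits in product("01", repeat=2)]
print(two_bit_patterns)

for count in (2, 4, 67, 128, 129):
    print(count, "things need", bits_needed(count), "bits")
```

Note that 67 things need as many bits as 128 things, because bit counts only come in whole numbers and 2^6 = 64 is not enough.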

15
Representing Negative Values

You have used the signed-magnitude representation
of numbers since grade school.
The sign represents the ordering,
and the digits represent the magnitude of the
number.

16
Representing Negative Values

There is a problem with the signed-magnitude
representation.
There are two representations of zero. There is
plus zero and minus zero. Two representations of
zero within a computer can cause unnecessary
complexity.
If we allow only a fixed number of values, we can
represent numbers as just integer values, where
half of them represent negative numbers.

17
Representing Negative Values

For example, if the maximum number of decimal
digits we can represent is two, we can let 1
through 49 be the positive numbers 1 through 49
and let 50 through 99 represent the negative
numbers -50 through -1.
This representation of negative numbers is called
the ten's complement.

18
Advantages of Using 10's Complement

To perform addition within this scheme, you just
add the numbers together and discard any carry.
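
The two-digit scheme can be sketched in a few lines. This is an illustration under stated assumptions rather than anything from the original slides: the names `encode`, `decode`, and `add` are hypothetical, and the range is fixed at -50 through 49 as in the two-digit example.

```python
MODULUS = 100  # two decimal digits give 100 possible codes (00..99)


def encode(value: int) -> int:
    """Map -50..49 onto the two-digit codes 0..99."""
    if not -50 <= value <= 49:
        raise ValueError("value does not fit in two digits")
    return value % MODULUS


def decode(code: int) -> int:
    """Codes 0..49 are themselves; codes 50..99 stand for -50..-1."""
    return code if code < 50 else code - MODULUS


def add(code_a: int, code_b: int) -> int:
    """Add two codes and discard any carry out of the two digits."""
    return (code_a + code_b) % MODULUS


print(encode(-3))                            # code for -3
print(decode(add(encode(5), encode(-3))))    # 5 + (-3)
print(decode(add(encode(-5), encode(3))))    # (-5) + 3
```

Discarding the carry is the same as taking the sum modulo 100, which is why no special case is needed for negative operands.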

19
Advantages of Using 10's Complement

A - B = A + (-B). We can subtract one number from
another by adding the negative of the second to
the first.
Addition and subtraction become the same operation.
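
As a rough sketch of subtraction by adding the negative, again assuming two decimal digits with codes 50 through 99 standing for -50 through -1 (the helper names `negate`, `subtract`, and `decode` are hypothetical):

```python
MODULUS = 100  # two decimal digits


def negate(code: int) -> int:
    """Ten's complement of a two-digit code: the code for -value."""
    return (MODULUS - code) % MODULUS


def subtract(code_a: int, code_b: int) -> int:
    """A - B computed as A + (-B), discarding any carry."""
    return (code_a + negate(code_b)) % MODULUS


def decode(code: int) -> int:
    """Codes 0..49 are themselves; codes 50..99 stand for -50..-1."""
    return code if code < 50 else code - MODULUS


print(negate(8))                 # code that stands for -8
print(decode(subtract(5, 8)))    # 5 - 8
print(decode(subtract(8, 5)))    # 8 - 5
```

The only arithmetic used by `subtract` is addition, so the same adding circuitry can serve both operations.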

20
2's Complement

8 bits

3 bits
000 0
001 1
010 2
011 3
100 -4
101 -3
110 -2
111 -1
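
The 3-bit table can be reproduced programmatically. This is a minimal sketch with hypothetical function names (`from_twos_complement`, `to_twos_complement`); it is not taken from the slides.

```python
def from_twos_complement(bits: str) -> int:
    """Interpret a bit string as a signed 2's complement number."""
    value = int(bits, 2)
    if bits[0] == "1":              # leading 1 means the value is negative
        value -= 1 << len(bits)     # subtract 2**width
    return value


def to_twos_complement(value: int, width: int) -> str:
    """Write a signed integer as a 2's complement bit string."""
    low, high = -(1 << (width - 1)), (1 << (width - 1)) - 1
    if not low <= value <= high:
        raise ValueError("value does not fit in the given width")
    return format(value & ((1 << width) - 1), f"0{width}b")


# Rebuild the 3-bit table: 000..111.
for code in range(8):
    bits = format(code, "03b")
    print(bits, from_twos_complement(bits))
```

The same functions work for 8 bits, where the representable range is -128 through 127.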

21
Overflow

Overflow occurs when the value that we compute
cannot fit into the number of bits we have
allocated for the result.
For example, if each value is stored using eight
bits, adding 127 to 3 overflows.
Overflow is a classic example of the type of
problems we encounter by mapping an infinite
world onto a finite machine.

22
Overflow
  01111111   (127)
+ 00000011   (3)
= 10000010   (130 does not fit in eight signed bits;
             read as 2's complement, the result is -126)
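
The eight-bit overflow can be simulated as follows. This is an illustrative sketch assuming signed 8-bit 2's complement storage; the function name `add_signed_8bit` is hypothetical.

```python
def add_signed_8bit(a: int, b: int) -> tuple[int, bool]:
    """Add two signed 8-bit values; return (stored result, overflowed?)."""
    raw = (a + b) & 0xFF                        # keep only the low eight bits
    stored = raw - 256 if raw >= 128 else raw   # reinterpret as signed
    overflowed = not -128 <= a + b <= 127       # true sum does not fit
    return stored, overflowed


print(add_signed_8bit(127, 3))    # the true sum, 130, does not fit
print(add_signed_8bit(100, 27))   # the true sum, 127, fits
```

When the flag is set, the stored result no longer equals the true sum.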
23
Representing Text

A text document can be decomposed into chapters,
paragraphs, sentences, words, and ultimately
individual characters.
To represent a text document in digital form, we
simply need to be able to represent every
character that may appear.
In English: a, b, ..., z, A, B, ..., Z
The general approach for representing characters
is to list them all and assign each a binary
string.
a → (01100001)₂ → (97)₁₀ → 61h
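
The mapping from a character to its binary, decimal, and hexadecimal code can be checked with Python's built-in `ord`. This is a small sketch; the helper name `describe` is a hypothetical choice.

```python
def describe(ch: str) -> tuple[str, int, str]:
    """Return (8-bit binary string, decimal code, hex code) for one character."""
    code = ord(ch)
    return format(code, "08b"), code, format(code, "x") + "h"


for ch in "aAzZ":
    print(ch, describe(ch))
```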

24
Character Set

A character set is a list of characters and the
codes used to represent each one.
By agreeing to use a particular character set,
computer manufacturers have made the processing
of text data easier.
ASCII, Unicode, etc.

25
ASCII

ASCII stands for American Standard Code for
Information Interchange.
The ASCII character set originally used seven
bits to represent each character, allowing for
128 unique characters.
Later, extended versions of ASCII (such as
Latin-1) used all eight bits, which allows for
256 characters.

26
ASCII
27
ASCII

Note that the first 32 characters in the ASCII
character chart are control characters. They do
not have a simple character representation that
you could print to the screen.

28
The Unicode Character Set

The extended version of the ASCII character set
is not enough for international use.
The Unicode character set originally used 16 bits
per character. With 16 bits, it can represent
2^16, or over 65 thousand, characters. (Unicode
has since been extended beyond 16 bits.)
Unicode was designed to be a superset of ASCII.
That is, the first 256 characters in the Unicode
character set correspond exactly to the extended
ASCII character set.
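
The superset relationship can be checked in Python, whose strings are sequences of Unicode code points. This is a sketch under one assumption that the slides do not state: "extended ASCII" is taken here to mean the Latin-1 (ISO 8859-1) encoding. The helper name `code_points` is hypothetical.

```python
def code_points(text: str) -> list[int]:
    """Return the Unicode code point of every character in text."""
    return [ord(ch) for ch in text]


# Every 7-bit ASCII byte value decodes to the same Unicode code point.
ascii_text = bytes(range(128)).decode("ascii")
ascii_matches = code_points(ascii_text) == list(range(128))

# Assumption: "extended ASCII" means Latin-1 (ISO 8859-1).
latin1_text = bytes(range(256)).decode("latin-1")
latin1_matches = code_points(latin1_text) == list(range(256))

print(ascii_matches, latin1_matches)
print(ord("中"))   # a character well outside the 8-bit range
```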

29
Unicode
31
Representing Audio Information

We perceive sound when a series of air
compressions vibrate a membrane in our ear, which
sends signals to our brain.
A stereo sends an electrical signal to a speaker
to produce sound. This signal is an analog
representation of the sound wave. The voltage in
the signal varies in direct proportion to the
sound wave.

32
Representing Audio Information

To digitize the signal we periodically measure
the voltage of the signal and record the
appropriate numeric value. The process is called
sampling.
In general, a sampling rate of around 40,000
times per second is enough to create a reasonable
sound reproduction.
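
Sampling can be illustrated by measuring a pure tone at regular intervals. This is a minimal sketch, not from the original slides: it assumes a sine wave as the "voltage", 16-bit signed sample values, and the hypothetical function name `sample_sine`.

```python
import math


def sample_sine(frequency_hz: float, duration_s: float,
                sample_rate: int = 40_000, amplitude: int = 32767) -> list[int]:
    """Measure a sine wave sample_rate times per second.

    Each measurement is rounded to an integer, which is the
    'appropriate numeric value' recorded for that instant.
    """
    count = round(duration_s * sample_rate)
    return [
        round(amplitude * math.sin(2 * math.pi * frequency_hz * n / sample_rate))
        for n in range(count)
    ]


samples = sample_sine(frequency_hz=1000, duration_s=0.01)
print(len(samples), "samples for 0.01 seconds of sound")
print(samples[:5])
```

A higher sampling rate records more measurements per second and therefore follows the original wave more closely, at the cost of more storage.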

33
Representing Audio Information
34
Representing Audio Information

A compact disc (CD) stores
audio information digitally.
On the surface of the CD are
microscopic pits that represent
binary digits.
A low-intensity laser is pointed
at the disc.
The laser light reflects strongly
if the surface is smooth and
reflects poorly if the surface is pitted.

35
Representing Audio Information

Audio Formats
WAV, AU, AIFF, VQF, and MP3.
MP3 is dominant
MP3 is short for MPEG-1 (or MPEG-2) Audio Layer
III.
MP3 employs both lossy and lossless compression.
First it analyzes the frequency spread and
compares it to mathematical models of human
psychoacoustics (the study of the interrelation
between the ear and the brain), then it discards
information that can't be heard by humans.
Finally, the remaining bit stream is compressed
losslessly to achieve additional compression.

36
Representing Color

Color is our perception of the various
frequencies of light that reach the retinas of
our eyes.
Our retinas have three types of color
photoreceptor cone cells that respond to
different sets of frequencies.
These photoreceptor categories correspond to the
colors of red, green, and blue.

37
Representing Color

Color is often expressed in a computer as an RGB
(red-green-blue) value, which is actually three
numbers that indicate the relative contribution
of each of these three primary colors.
For example, an RGB value of (255, 255, 0)
maximizes the contribution of red and green, and
minimizes the contribution of blue, which results
in a bright yellow.
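
An RGB value can be handled as a simple triple of numbers. The sketch below assumes each component ranges from 0 to 255, as in the example, and converts the triple to the hexadecimal notation commonly used on web pages; that notation and the function name `rgb_to_hex` are assumptions, not something the slides cover.

```python
def rgb_to_hex(red: int, green: int, blue: int) -> str:
    """Format an RGB value (each component 0..255) as #RRGGBB."""
    for component in (red, green, blue):
        if not 0 <= component <= 255:
            raise ValueError("each component must be in 0..255")
    return f"#{red:02X}{green:02X}{blue:02X}"


print(rgb_to_hex(255, 255, 0))    # bright yellow: full red and green, no blue
print(rgb_to_hex(0, 0, 0))        # black: no contribution from any color
print(rgb_to_hex(255, 255, 255))  # white: full contribution from all three
```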

38
Three-Dimensional Color Space
[Figure: RGB color cube running from (0,0,0) to (1,1,1)]
39
Representing Images and Graphics

The amount of data that is used to represent a
color is called the color depth.
HiColor is a term that indicates a 16-bit color
depth. Five bits are used for each number in an
RGB value and the extra bit is sometimes used to
represent transparency.
TrueColor indicates a 24-bit color depth.
Therefore, each number in an RGB value gets eight
bits.
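
The two color depths can be compared by packing an RGB value into a single integer. This is a sketch under an assumed bit layout (extra bit first, then five bits each of red, green, and blue for the 16-bit form; eight bits each for the 24-bit form); real file formats may order the fields differently, and the function names are hypothetical.

```python
def pack_truecolor(red: int, green: int, blue: int) -> int:
    """24-bit color: eight bits for each of red, green, and blue."""
    return (red << 16) | (green << 8) | blue


def pack_hicolor(red: int, green: int, blue: int, extra: int = 0) -> int:
    """16-bit color: one extra bit plus five bits per component.

    Inputs are 0..255; the three low-order bits of each are dropped,
    so nearby shades collapse into the same 5-bit value.
    """
    red5, green5, blue5 = red >> 3, green >> 3, blue >> 3
    return (extra << 15) | (red5 << 10) | (green5 << 5) | blue5


yellow = (255, 255, 0)
print(format(pack_truecolor(*yellow), "024b"))
print(format(pack_hicolor(*yellow), "016b"))
```

Dropping from eight to five bits per component reduces the number of distinct shades per component from 256 to 32.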

40
Indexed Color

A particular application such as a browser
may support only a certain number of
specific colors, creating a palette from
which to choose.
For example

41
Digitized Images and Graphics

Digitizing a picture is the act of representing
it as a collection of individual dots called
pixels.
The number of pixels used to represent a picture
is called the resolution.
The storage of image information on a
pixel-by-pixel basis is called a raster-graphics
format.
There are several popular raster file formats,
including bitmap (BMP), GIF, and JPEG.
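
Resolution and color depth together determine how much raw pixel data an uncompressed raster image holds. The following sketch ignores file headers and compression; the function name `raw_image_bytes` and the 640 x 480 example size are illustrative assumptions.

```python
def raw_image_bytes(width: int, height: int, color_depth_bits: int) -> int:
    """Bytes of pixel data for an uncompressed raster image."""
    total_bits = width * height * color_depth_bits
    return (total_bits + 7) // 8   # round up to whole bytes


# Hypothetical 640 x 480 picture at two color depths.
print(raw_image_bytes(640, 480, 24))   # TrueColor
print(raw_image_bytes(640, 480, 16))   # HiColor
```

Halving the width and height of a picture cuts the pixel data to a quarter, which is why low-resolution images are so much smaller.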

42
BMP
43
Digitized Images and Graphics
High Resolution
44
Digitized Images and Graphics
Low Resolution
45
Representing Video

A video codec (COmpressor/DECompressor) refers to
the methods used to shrink the size of a movie to
allow it to be played on a computer or over a
network.
Almost all video codecs use lossy compression to
minimize the huge amounts of data associated with
video.
The goal is not to lose information that affects
the viewer's senses.

46
Video Players