Provided by: scottma9
Title: Data Formats


1
Data Formats
Textbook Chapter 3
2
Introduction
  • Examples

Input device
3
Format must be Appropriate
  • The internal representation must be appropriate
    for the type of processing to take place (e.g.,
    text, images, sound)
  • Problem: Since computers store everything in
    binary code, how does the computer know what a
    particular stored item is?
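
A minimal Python sketch of the problem (the two byte values are chosen here for illustration and do not come from the slides): the same stored bits give different results depending on how the program is told to interpret them.

```python
# The same two stored bytes, interpreted three different ways.
# The byte values are an illustrative assumption, not from the slides.
stored = bytes([0b01001000, 0b01101001])

as_text = stored.decode("ascii")             # read as two ASCII characters
as_number = int.from_bytes(stored, "big")    # read as one 16-bit unsigned integer
as_bits = " ".join(f"{b:08b}" for b in stored)

print(as_text)    # Hi
print(as_number)  # 18537
print(as_bits)    # 01001000 01101001
```

The bits themselves carry no label; the format agreed on by the software decides what they mean.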

4
Rules/Conventions
  • Proprietary formats
      • Unique to a product or company
      • E.g., Microsoft Word, Corel WordPerfect, IBM
        Lotus Notes
  • Standards
      • Evolve two ways
          • Proprietary formats become de facto
            standards (e.g., Adobe PostScript, Apple
            QuickTime)
          • A committee is struck to solve a problem
            (e.g., the Moving Picture Experts Group,
            MPEG)

Text pg 63-64
5
Standards Organizations
  • ISO: International Organization for Standardization
  • CSA: Canadian Standards Association
  • ANSI: American National Standards Institute
  • IEEE: Institute of Electrical and
    Electronics Engineers
  • Etc.: Et Cetera (Everybody Else)

6
Examples of Standards
Type of Data             Standards
Alphanumeric             ASCII, EBCDIC, Unicode
Image                    JPEG, GIF, PCX, TIFF
Motion picture           MPEG-2, QuickTime
Sound                    Sound Blaster, WAV, AU
Outline graphics/fonts   PostScript, TrueType, PDF

Hint: Learn which kind is which!
7
Why Standards?
  • Standards are arbitrary
  • They exist because they are
      • Convenient
      • Efficient
      • Flexible
      • Appropriate
  • Plus, they provide some consistency and
    predictability for applications.
8
Alphanumeric Data
  • Problem: Distinguishing between the number 123
    (one hundred twenty-three) and the characters
    "123" (one, two, three)
  • In software, data is given a type
  • Four standards for representing letters (alpha)
    and numbers
      • BCD: Binary-Coded Decimal
      • ASCII: American Standard Code for Information
        Interchange
      • EBCDIC: Extended Binary-Coded Decimal
        Interchange Code
      • Unicode
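
The distinction between the number 123 and the characters 123 can be sketched in Python, where the type attached to each value decides both how it is stored and what operations mean. The variable names are illustrative only.

```python
number = 123   # an integer: the single value one hundred twenty-three
text = "123"   # a string: the three characters '1', '2', '3'

# One binary value versus three separate ASCII codes.
number_bits = f"{number:08b}"
text_bits = " ".join(f"{b:08b}" for b in text.encode("ascii"))

print(number_bits)  # 01111011
print(text_bits)    # 00110001 00110010 00110011

# The type also decides what "+" does.
print(number + 1)   # 124   (arithmetic)
print(text + "1")   # 1231  (concatenation)
```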

pp. Old 63-69 Rev 65-72
9
Standard Alphanumeric Formats
Next 2 slides
  • BCD
  • ASCII
  • EBCDIC
  • Unicode

10
Binary-Coded Decimal (BCD)
  • Four bits per digit

Digit Bit pattern
0 0000
1 0001
2 0010
3 0011
4 0100
5 0101
6 0110
7 0111
8 1000
9 1001
Note: the following bit patterns are not used:
1010, 1011, 1100, 1101, 1110, 1111
11
Example
  • 7093 (base 10) = ? (in BCD)

7    0    9    3
0111 0000 1001 0011
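
The digit-by-digit conversion in this example can be sketched in Python. The function names `to_bcd` and `from_bcd` are hypothetical helpers written for this illustration, not part of any library.

```python
def to_bcd(number: int) -> str:
    """Return the BCD pattern of a non-negative integer, four bits per decimal digit."""
    if number < 0:
        raise ValueError("this sketch handles non-negative integers only")
    return " ".join(f"{int(digit):04b}" for digit in str(number))


def from_bcd(pattern: str) -> int:
    """Decode space-separated 4-bit groups, rejecting the six unused patterns."""
    digits = []
    for group in pattern.split():
        value = int(group, 2)
        if value > 9:
            raise ValueError(f"{group} is not a valid BCD digit")
        digits.append(str(value))
    return int("".join(digits))


print(to_bcd(7093))                     # 0111 0000 1001 0011
print(from_bcd("0111 0000 1001 0011"))  # 7093
```

Each decimal digit is converted on its own, which is why the patterns 1010 through 1111 never appear.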
12
Standard Alphanumeric Formats
  • BCD
  • ASCII
  • EBCDIC
  • Unicode

Next 22 slides
13
The Problem
  • Representing text strings, such as
    "Hello, world", in a computer

After all, computers store binary digits, not
letters!
14
Codes and Characters
  • Each character is coded as a byte
  • Most common coding system is ASCII (pronounced
    ass-key)
  • ASCII: American Standard Code for
    Information Interchange
  • Defined in ANSI document X3.4-1977

15
ASCII Features
  • 7-bit code
      • 8th bit is unused (or used for a parity bit
        or to indicate an extended character set)
  • 2^7 = 128 codes
  • Two general types of codes
      • 95 are graphic codes (displayable on a console)
      • 33 are control codes (control features of the
        console or communications channel)
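
The counts on this slide can be checked with a short Python sketch. It also shows one possible use of the 8th bit as an even-parity bit; the helper name `with_even_parity` and the choice of even (rather than odd) parity are assumptions made for illustration.

```python
# Codes 32..126 are graphic (printable); 0..31 and 127 (DEL) are control codes.
graphic_codes = [code for code in range(2**7) if 32 <= code <= 126]
control_codes = [code for code in range(2**7) if code < 32 or code == 127]

print(2**7, len(graphic_codes), len(control_codes))  # 128 95 33


def with_even_parity(code: int) -> int:
    """Set bit 7 so that the total number of 1 bits in the byte is even."""
    ones = bin(code).count("1")
    return code | ((ones % 2) << 7)


print(f"{with_even_parity(ord('A')):08b}")  # 01000001 (two 1 bits, parity bit stays 0)
print(f"{with_even_parity(ord('C')):08b}")  # 11000011 (three 1 bits, parity bit set)
```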

16
Hint: Memorize the codes for blank space,
period, digit zero (0), capital A, small a,
and carriage return (CR)
17
ASCII Chart
Book - page 67, Figure 3.3 (In decimal)
19
Most significant bit
Least significant bit
20
e.g., a = 1100001
21
95 Graphic codes
22
33 Control codes
See text page 69 / 71 for details
23
Alphabetic codes
24
Numeric codes
25
Punctuation, etc.
26
"Hello, world" Example
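
Since the slide's figure did not survive in this transcript, here is a small Python sketch of what such an example might show: each character of the string replaced by its ASCII code, printed in hexadecimal.

```python
message = "Hello, world"

# Encode each character as one byte holding its ASCII code.
codes = message.encode("ascii")
hex_codes = " ".join(f"{byte:02X}" for byte in codes)

print(hex_codes)  # 48 65 6C 6C 6F 2C 20 77 6F 72 6C 64

# Decoding the same bytes gives the original text back.
print(codes.decode("ascii"))  # Hello, world
```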
27
Common Control Codes
Code   Hexadecimal code   Meaning
CR     0D                 carriage return
LF     0A                 line feed
HT     09                 horizontal tab
DEL    7F                 delete
NUL    00                 null
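
As a quick check of these values, the following Python sketch maps each control code to its string escape and prints the code in hexadecimal. The sample line of text is made up for illustration, and the null code is written NUL, its usual ASCII name.

```python
# Python string escapes for the control codes listed on this slide.
control_codes = {
    "CR": "\r",
    "LF": "\n",
    "HT": "\t",
    "DEL": "\x7f",
    "NUL": "\x00",
}

for name, char in control_codes.items():
    print(f"{name:3}  {ord(char):02X}")

# Control codes are stored in text just like any other character.
line = "name\tvalue\r\n"
line_hex = " ".join(f"{ord(ch):02X}" for ch in line)
print(line_hex)  # 6E 61 6D 65 09 76 61 6C 75 65 0D 0A
```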
29
Terminology
  • Learn the names of the special symbols
      • [ ] brackets
      • { } braces
      • ( ) parentheses
      • @ commercial at sign
      • & ampersand
      • ~ tilde

31
Escape Sequences
  • Extend the capability of the ASCII code set
  • For controlling terminals and formatting output
  • Defined by ANSI in documents X3.41-1974 and
    X3.64-1977
  • The escape code is ESC (1B hex)
  • An escape sequence begins with two codes:
    ESC (1B hex) followed by [ (5B hex)
32
Examples
  • Erase display: ESC [ 2 J
  • Erase line: ESC [ K
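
A sketch of how these two sequences are built from individual ASCII codes, assuming the standard two-code introducer ESC [ (1B hex, 5B hex). The constant name `CSI` is borrowed from the usual term "control sequence introducer", and `as_hex` is a hypothetical helper. The code only prints the byte values; sending the sequences to an ANSI-compatible terminal would actually erase its display or line.

```python
ESC = "\x1b"      # the escape code, 1B hex
CSI = ESC + "["   # the two codes that begin a sequence: 1B hex, 5B hex

ERASE_DISPLAY = CSI + "2J"   # ESC [ 2 J
ERASE_LINE = CSI + "K"       # ESC [ K


def as_hex(sequence: str) -> str:
    """Show each character of a sequence as a two-digit hexadecimal code."""
    return " ".join(f"{ord(ch):02X}" for ch in sequence)


print(as_hex(ERASE_DISPLAY))  # 1B 5B 32 4A
print(as_hex(ERASE_LINE))     # 1B 5B 4B
```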

33
Standard Alphanumeric Formats
  • BCD
  • ASCII
  • EBCDIC
  • Unicode

Next slide
34
EBCDIC
  • Extended Binary-Coded Decimal Interchange Code
    (pronounced ebb-se-dick)
  • 8-bit code
  • Developed by IBM
  • Rarely used today
      • IBM mainframes only
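
To see how EBCDIC differs from ASCII, this Python sketch encodes the same short string both ways. It assumes code page 037 (Python's built-in `cp037` codec), which is only one of several EBCDIC variants; the sample string is arbitrary.

```python
text = "A a 0"

ascii_bytes = text.encode("ascii")
ebcdic_bytes = text.encode("cp037")   # EBCDIC code page 037 (an assumed variant)

print(" ".join(f"{b:02X}" for b in ascii_bytes))   # 41 20 61 20 30
print(" ".join(f"{b:02X}" for b in ebcdic_bytes))  # C1 40 81 40 F0

# The same bytes decode correctly only with the matching code.
print(ebcdic_bytes.decode("cp037"))  # A a 0
```

Every EBCDIC value here uses all 8 bits differently from ASCII, so data exchanged between the two systems has to be translated.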

35
Standard Alphanumeric Formats
  • BCD
  • ASCII
  • EBCDIC
  • Unicode

Next 2 slides
36
Unicode
  • 16-bit standard
  • Developed by a consortium
  • Intended to supersede older 7- and 8-bit codes

37
Unicode Version 2.1
  • 1998
  • Improves on version 2.0
  • Includes the Euro sign (20AC hex)
  • From the standard:

"contains 38,887 distinct coded characters
derived from the supported scripts. These
characters cover the principal written languages
of the Americas, Europe, the Middle East, Africa,
India, Asia, and Pacifica."
http://www.unicode.org
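
The Euro sign's code can be checked in Python, whose strings are sequences of Unicode code points. The 16-bit form is shown with the big-endian UTF-16 encoding; that choice of encoding is an assumption made here to match the slide's description of a 16-bit code.

```python
import unicodedata

euro = "\u20ac"

print(f"{ord(euro):04X}")        # 20AC
print(unicodedata.name(euro))    # EURO SIGN

# As a 16-bit value (UTF-16, big-endian) the Euro sign occupies two bytes.
euro_16 = euro.encode("utf-16-be")
print(euro_16.hex().upper())     # 20AC

# ASCII characters keep their codes, padded to 16 bits.
a_16 = "A".encode("utf-16-be")
print(a_16.hex().upper())        # 0041
```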
38
Keyboard Input
  • Key (scan) codes are converted to ASCII
  • ASCII code sent to host computer
  • Received by the host as a stream of data
  • Stored in buffer
  • Processed
  • Etc.

pp. Old 69 Rev 72
39
Shift Key
  • Inhibits bit 5 in the ASCII code

Key(s)      ASCII code (bits 6 5 4 3 2 1 0)   Character
a           1 1 0 0 0 0 1                     a
Shift + a   1 0 0 0 0 0 1                     A
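
The effect of the Shift key on a letter can be sketched as a bit mask in Python. The helper `with_shift` is hypothetical and models only lowercase letters; real keyboards handle digits and punctuation with separate key mappings.

```python
BIT_5 = 0b0100000   # bit 5 has the value 32


def with_shift(char: str) -> str:
    """Clear (inhibit) bit 5 of a lowercase letter's ASCII code."""
    return chr(ord(char) & ~BIT_5)


print(f"{ord('a'):07b} -> {ord(with_shift('a')):07b}")  # 1100001 -> 1000001
print(with_shift("a"))  # A
```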
40
Control Key
  • Inhibits bits 5 and 6 in the ASCII code

Key(s)      ASCII code (bits 6 5 4 3 2 1 0)   Character
c           1 1 0 0 0 1 1                     c
Ctrl + c    0 0 0 0 0 1 1                     ETX (control code)
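
The Control key can be sketched the same way, clearing bits 5 and 6 so that a letter maps onto one of the control codes. The helper name `with_ctrl` is an assumption for this illustration.

```python
BITS_5_AND_6 = 0b1100000


def with_ctrl(char: str) -> int:
    """Clear (inhibit) bits 5 and 6 of a letter's ASCII code."""
    return ord(char) & ~BITS_5_AND_6


print(f"{ord('c'):07b} -> {with_ctrl('c'):07b}")  # 1100011 -> 0000011 (ETX)
print(f"{with_ctrl('m'):02X}")  # 0D (CR, carriage return)
print(f"{with_ctrl('i'):02X}")  # 09 (HT, horizontal tab)
```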
41
Other Input
  • OCR optical character recognition
  • Bar code readers
  • Voice/audio input
  • Punched cards
  • Images / objects
  • Pointing devices

pp. Old 69-86 Rev 72-89
42
OCR
  • Page of text ("Hello, world") -> Optical scan ->
    Computer file (10110110)
44
Bar Codes
  • An automatic identification (Auto ID) technology
    that streamlines identification and data
    collection
  • See http://www.digital.net/barcoder/barcode.html

46
Voice/audio Input
  • Input device: microphone
  • Audio input is digitized and stored
  • Processed in two ways
      • As is (no recognition)
      • Recognized and converted to alphanumeric data
        (ASCII)

Digitize -> 10110010
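
A rough Python sketch of what "digitized" means: a signal is measured at regular intervals and each measurement is stored as a binary number. The sine-wave input, the 8000 samples-per-second rate, the 1000 Hz tone, and the 8-bit sample size are all assumptions chosen for illustration; the slides do not specify any of them.

```python
import math

SAMPLE_RATE = 8000   # samples per second (assumed)
FREQUENCY = 1000     # tone frequency in Hz (assumed)


def digitize(num_samples: int) -> list[int]:
    """Sample a sine wave and quantize each sample to an 8-bit level (0..255)."""
    levels = []
    for n in range(num_samples):
        amplitude = math.sin(2 * math.pi * FREQUENCY * n / SAMPLE_RATE)  # -1.0 .. 1.0
        levels.append(round((amplitude + 1.0) / 2.0 * 255))
    return levels


samples = digitize(8)
for level in samples:
    print(f"{level:08b}")   # one 8-bit pattern per sample
```

The stored file is simply this sequence of bit patterns; recognition software, if used, works from these numbers.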
48
Punched Cards
  • Invented by Herman Hollerith (founder of the
    Tabulating Machine Company, a forerunner of IBM)
  • Each card holds 80 characters

50
Images
  • Typically images are pictures that are optically
    scanned and saved as a bit map or in some other
    format
  • Many formats
  • E.g., GIF, JPEG

Note: animated GIFs are often used on the World Wide Web.
51
Typical Save As Dialog
52
Objects
  • Images made of geometrically definable shapes
    (example: MS Paint software)
  • Offer efficiency, flexibility, small size, etc.

54
Pointing Devices
  • Originally used for specifying coordinates (x, y)
    for graphical input
  • Today used as general purpose device for
    graphical user interfaces (GUIs)

55
Thank you