Data Representation - PowerPoint PPT Presentation

Slides: 50
Provided by: GaryB264
Learn more at: https://media.lanecc.edu

1
Data Representation
2
Topics
  • Bit patterns
  • Binary numbers
  • Data type formats
  • Character representation
  • Integer representation
  • Floating point number representation

3
Data Representation
  • Data representation refers to the manner in which
    data is stored in the computer
  • There are several different formats for data
    storage
  • It is important for computer problem solvers to
    understand the basic formats

4
Why is it Important?
  • As an example
  • We will learn that since we have finite storage,
    it is possible to overflow a storage location by
    trying to store too large a number
  • Most programming languages provide multiple data
    types each providing different length storage for
    variables
  • It is up to the programmer to choose a data
    type with a length that won't overflow
  • Knowing how numbers are represented in storage
    helps one to understand this

5
Bit Pattern
  • As you recall from an earlier presentation, data
    may take various forms: characters, numbers,
    graphics, etc.
  • All data is stored in the computer as a sequence
    of bits, that is, binary digits
  • This is a universal storage format for all data
    types, and it is called a bit pattern

6
Bits and Bytes
  • A bit is the smallest unit of data stored in a
    computer and it has a value of 0 or 1
  • It's like a switch: on (1) or off (0)
  • In computers, bits are stored electronically in
    RAM and auxiliary storage devices by two-state
    digital circuits
  • The storage device itself doesn't know what the
    bit pattern represents, but software (application
    software, operating system, and I/O device
    firmware) stores and interprets the pattern
  • That is, data is coded and then stored, and when
    it is retrieved it is decoded
  • A byte is a string of 8 bits and is called a
    character when the data is text
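
As a small illustration of coding and decoding, the Python sketch below reads the same 8-bit pattern two ways. The variable names are illustrative and are not part of the original slides.

```python
# One byte holding the bit pattern 01000001
pattern = bytes([0b01000001])

# Decoded as an unsigned integer
as_number = pattern[0]

# Decoded as ASCII text
as_text = pattern.decode("ascii")

print(as_number)  # 65
print(as_text)    # A
```

The stored bits never change; only the interpretation chosen by the software does.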

7
Binary Numbers
  • Each bit pattern is a binary number, that is, a
    number represented by 0s and 1s rather than 0,
    1, 2, ..., 9 as decimal numbers are
  • For example, bit patterns like 1010 and 101001
    are also binary numbers
  • Binary numbers are based on powers of 2 rather
    than powers of 10 as decimal numbers are
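
A minimal Python sketch of the two example patterns read as base-2 numbers. The variable names are assumptions made for illustration.

```python
# int(text, 2) reads a string of 0s and 1s as a base-2 number
ten = int("1010", 2)
forty_one = int("101001", 2)

# The same value for 1010 spelled out with powers of 2
spelled_out = 1 * 2**3 + 0 * 2**2 + 1 * 2**1 + 0 * 2**0

print(ten, forty_one, spelled_out)  # 10 41 10
```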

8
Data Type Formats
  • As we have learned, fundamentally all data is
    stored as a bit pattern
  • But the different data types have different bit
    pattern formats
  • We want to learn the formats for:
  • Characters (for example, left, Lane, a, ?, \)
  • Integers (for example, 1, 453, -10, 0)
  • Floating point numbers (for example, 3.14159,
    1.2, -567.235, 0.009)

9
Character Representation
  • The American Standard Code for Information
    Interchange (ASCII) is the scheme used to assign
    a bit pattern to each of the characters
  • ASCII charts come in different flavors
  • Some have 7 bit strings, some 8 or more
  • Some show the binary code for the various
    characters as well as the code represented in
    other number systems, e.g., decimal, hex, octal
  • For example, the letter A has the ASCII code:
  • 1000001 in binary for a 7-bit chart
  • 65 in decimal
  • 41 in hex
  • 101 in octal

Note: All four of these numbers represent the
same value but use different number systems
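
The four codes for the letter A can be checked with a short Python sketch. It uses only the built-in `ord` and `format` functions; the variable names are illustrative.

```python
code = ord("A")                    # decimal code of the character

binary_7bit = format(code, "07b")  # 7-bit binary string
hexadecimal = format(code, "x")    # hexadecimal string
octal = format(code, "o")          # octal string

print(code, binary_7bit, hexadecimal, octal)  # 65 1000001 41 101
```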
10
Subset of ASCII Chart
11
ASCII Chart
  • Uppercase characters have different ASCII codes
    than lowercase
  • Uppercase characters come before lowercase
  • Numbers come before letters
  • The special characters are spread around
  • The numbers and upper and lowercase characters
    are in adjacent groupings, so that their codes
    increment by one

12
ASCII Chart
  • The only difference between the binary codes for
    the upper and lowercase characters is the sixth
    bit from the right (worth 32), that is, the
    decimal code for a lowercase character is 32
    greater than the uppercase character's decimal
    code
  • ASCII codes before decimal 32 are control
    characters (nonprintable) like bell, backspace,
    and carriage return
  • The final ASCII code in the 7-bit chart is the
    control character DEL with decimal code 127
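
The ordering and the 32-apart relationship can be sketched in Python as follows. The variable names are assumptions; the bit mask `0b0100000` is the sixth bit from the right.

```python
upper, lower = ord("A"), ord("a")

difference = lower - upper          # lowercase code minus uppercase code
upper_bits = format(upper, "07b")   # 1000001
lower_bits = format(lower, "07b")   # 1100001

# Flipping the bit worth 32 switches the case of a letter
flipped = chr(ord("g") ^ 0b0100000)

# Digits come before uppercase letters, which come before lowercase letters
ordered = ord("0") < ord("A") < ord("a")

print(difference, upper_bits, lower_bits, flipped, ordered)
# 32 1000001 1100001 G True
```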

13
Extended ASCII and Unicode
  • The eight-bit ASCII chart is sometimes called
    Extended ASCII
  • The seven-bit ASCII codes are the same in the
    eight-bit chart except that they have a zero at
    the left
  • Some manufacturers use the extra bit to create
    additional special characters; these, however, are
    nonstandard, e.g., using decimal 171 for ½, or
    246 for ÷
  • Unicode is another scheme, developed so that the
    many symbols in international languages may be
    represented. It also uses bit patterns. UTF-32
    uses 32 bits per character.
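
A short Python sketch of a character outside 7-bit ASCII stored in UTF-32. The choice of the character é and the big-endian byte order are assumptions made for illustration.

```python
text = "é"

code_point = ord(text)             # Unicode code point as a decimal number
utf32 = text.encode("utf-32-be")   # big-endian UTF-32, no byte-order mark

# Show the stored bytes as one 32-bit pattern
bits = "".join(format(byte, "08b") for byte in utf32)

print(code_point)   # 233
print(len(utf32))   # 4
print(bits)         # 00000000000000000000000011101001
```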

14
Numeric Representation
  • ASCII codes are an inefficient method for
    representing numbers
  • For example, the number 1,024 using 8-bit ASCII
    would require four bytes or 32 bits of storage
  • Arithmetic operations on numbers represented in
    ASCII are very complicated
  • Representing the precision of a number, that is,
    the number of digits stored, may require large
    amounts of space when stored in ASCII
  • There are more efficient schemes for numbers
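
The storage difference for 1,024 can be sketched in Python. The 2-byte binary width is an assumption chosen to match the 16-bit examples used later in the deck.

```python
number = 1024

# Stored as text: one ASCII byte per digit
as_ascii = str(number).encode("ascii")

# Stored as a binary integer in a 2-byte location
as_binary = number.to_bytes(2, "big")

print(len(as_ascii))   # 4
print(len(as_binary))  # 2
```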

15
Integer Representation
  • An integer is a whole number, that is, a number
    without a decimal portion
  • Integers may be positive, negative, or zero
  • A plus-sign or minus-sign in front of the number
    is used to represent positive and negative
    numbers
  • The plus-sign is not required for positive
    numbers and zero
  • There are two categories of integer
    representation: unsigned and signed

16
Unsigned Integer
  • An unsigned integer is an integer without a sign,
    that is, a non-negative integer
  • They range from zero to infinity, but no computer
    can store all the integers in that range
  • So, a maximum unsigned integer is defined
  • This maximum is based on the number of bits used
    to store an integer
  • Let's use 8- and 16-bit (1- and 2-byte) storage
    locations in our examples
  • The length of storage is set by the data type the
    programmer specifies for a variable

17
Unsigned Integer
  • An unsigned integer is stored as its value when
    represented as a binary number
  • Leading zeros are added to fill out the storage
    location
  • For example, the decimal number 9 is represented
    as 00001001 when stored in 1-byte because
    00001001₂ = 9₁₀
  • When stored in a 2-byte location, 9 would be
    represented as 0000000000001001
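
A minimal sketch of padding with leading zeros, assuming a hypothetical helper named `store_unsigned` that is not part of the slides.

```python
def store_unsigned(number: int, width_bits: int) -> str:
    """Return the bit pattern for a non-negative integer, padded with leading zeros."""
    if number < 0 or number >= 2 ** width_bits:
        raise ValueError("number does not fit in the storage location")
    return format(number, "0{}b".format(width_bits))

print(store_unsigned(9, 8))   # 00001001
print(store_unsigned(9, 16))  # 0000000000001001
```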

18
Unsigned Integer
  • One may use the following table to work with
    binary numbers
  • For example, given 00001001, what decimal number
    does it represent?

      Power of 2:  128  64  32  16   8   4   2   1
      Bit:           0   0   0   0   1   0   0   1

  • Add the powers of two whose bit is 1, that is,
    8 + 1 = 9
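
The table lookup can be sketched in Python for a 1-byte pattern. `PLACE_VALUES` and `unsigned_value` are illustrative names, not taken from the slides.

```python
# Powers of two for the eight bit positions, left to right
PLACE_VALUES = [128, 64, 32, 16, 8, 4, 2, 1]

def unsigned_value(bits: str) -> int:
    """Add the place values whose bit is 1."""
    if len(bits) != 8 or set(bits) - {"0", "1"}:
        raise ValueError("expected a string of exactly eight 0s and 1s")
    return sum(value for value, bit in zip(PLACE_VALUES, bits) if bit == "1")

print(unsigned_value("00001001"))  # 9
```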

19
Unsigned Integer
  • One may use the same table to go the other way,
    that is, given the decimal number 13, what is its
    binary representation?
  • Find the largest power of 2 that doesn't exceed
    the number and place a 1 in that cell
  • Subtract that power of 2 from the number and use
    this as the new number: 13 - 8 = 5

20
Unsigned Integer
  • Then continue in this way until the sum of the
    powers of two equals the number
  • Now, 5 - 4 = 1, and so finally 1 - 1 = 0
  • Note that 8 + 4 + 1 = 13

21
Unsigned Integer
  • Then fill in the remaining cells with zeros
  • So, the unsigned integer representation of
    decimal 13 is 00001101 when stored in 1-byte
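
The subtract-the-largest-power procedure from the last three slides can be sketched in Python. The function name `to_unsigned_byte` is an assumption made for illustration.

```python
def to_unsigned_byte(number: int) -> str:
    """Build the 8-bit pattern by subtracting powers of two, largest first."""
    if not 0 <= number <= 255:
        raise ValueError("number does not fit in 1 byte")
    bits = ""
    for power in [128, 64, 32, 16, 8, 4, 2, 1]:
        if power <= number:
            bits += "1"        # this power of two is part of the sum
            number -= power    # continue with what is left over
        else:
            bits += "0"        # fill unused cells with zeros
    return bits

print(to_unsigned_byte(13))  # 00001101
```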

22
Unsigned Integer
  • If one tries to store a number in a memory
    location that is not large enough we have what is
    called overflow
  • In this case, depending on the system, one may or
    may not receive an error message
  • So, one must not store a number that is larger
    than the maximum for a given length of storage
  • The maximum number storable in 1-byte is 255

23
Unsigned Integer
  • For example, if one tries to store 256 in 1-byte
    there is overflow because the largest value
    storable in 8 bits is 255 as one can see from the
    following table
  • Note that 128 + 64 + 32 + 16 + 8 + 4 + 2 + 1 =
    255
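
A hedged Python sketch of the two outcomes mentioned above: an error message, or silent loss of the extra bit. Which one occurs depends on the system; the wraparound shown here is only one possible behavior.

```python
MAX_UNSIGNED_BYTE = 2 ** 8 - 1   # 255

# Python's int.to_bytes refuses to store 256 in one byte
try:
    (256).to_bytes(1, "big")
    overflowed = False
except OverflowError:
    overflowed = True

# Keeping only the low 8 bits, as some systems do, silently gives 0
wrapped = 256 & 0xFF

print(MAX_UNSIGNED_BYTE, overflowed, wrapped)  # 255 True 0
```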

24
Signed Integer
  • A sign-and-magnitude format is used to allow for
    positive and negative numbers (and zero)
  • The leading bit is designated as the sign bit: 0
    for positive or zero, 1 for negative
  • The remaining bits represent the value
  • So, in 1-byte of storage the maximum number
    storable is not 255 as it was for the unsigned
    integer representation, but 127
  • Note that 64 + 32 + 16 + 8 + 4 + 2 + 1 = 127

25
Signed Positive Integer
  • To determine what the sign-and-magnitude
    representation of a positive decimal number is
  • Convert the decimal number to binary
  • If needed add leading zeros to fill the storage
    location
  • For example, decimal 12 is represented in 1-byte
    as 00001100 because 8 + 4 = 12

26
Signed Positive Integer
  • Going the other way, given a sign-and-magnitude
    representation for a positive number, one can
    interpret it as follows
  • Leftmost bit will be 0 indicating positive
  • Convert the remaining bits to a decimal number
  • For example, 00010001 is decimal 17
  • Because 16 + 1 = 17

27
Signed Negative Integer
  • For negative numbers, two's complement format is
    used
  • Two's complement still uses the leading bit as
    the sign bit, although strictly speaking it is a
    different scheme from pure sign-and-magnitude
  • In two's complement, some of the remaining bits
    are flipped from 0 to 1 or 1 to 0

28
Signed Negative Integer
  • To determine what the two's complement
    representation of a negative decimal number is:
  • Ignore the sign and convert the decimal number to
    binary
  • If needed add leading zeros to fill the storage
    location
  • Leave all the rightmost 0s and first 1
    unchanged, but flip the remaining bits
  • Make the sign bit 1
  • For example, decimal -14 is represented in 1-byte
    as 11110010 because (see next slide)

29
Signed Negative Integer
  • Convert 14 to binary (8 + 4 + 2 = 14) and make
    leading bits zero, giving 00001110
  • Leave the rightmost 0s and first 1 as is, but
    flip the remaining bits, giving 11110010
  • Make sign bit 1 (it already is), giving 11110010
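
The steps above can be sketched in Python for a 1-byte location. The function name `twos_complement_byte` and the range check are assumptions added for illustration.

```python
def twos_complement_byte(negative_number: int) -> str:
    """Keep the rightmost 1 and the 0s after it; flip every bit to its left."""
    if not -128 <= negative_number <= -1:
        raise ValueError("expected a negative number that fits in 1 byte")
    bits = format(abs(negative_number), "08b")   # ignore the sign, pad with zeros
    last_one = bits.rindex("1")                  # position of the rightmost 1
    head, tail = bits[:last_one], bits[last_one:]
    flipped_head = "".join("1" if b == "0" else "0" for b in head)
    return flipped_head + tail

print(twos_complement_byte(-14))  # 11110010
```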

30
Signed Negative Integer
  • Going the other way, given a two's complement
    representation for a negative number, one can
    interpret it as follows:
  • Leave the rightmost bits up to and including the
    first 1 unchanged, but flip the remaining bits
  • Convert the binary number to decimal
  • Put a minus-sign in front
  • For example, 11101010 is decimal -22 because (see
    next slide)

31
Signed Negative Integer
  • Flip all but the rightmost 1 and any following
    0s, so 11101010 becomes 00010110
  • Convert the binary number to decimal
  • We get 22 because 16 + 4 + 2 = 22
  • Put a minus-sign in front, yielding -22
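
The decoding direction can be sketched the same way. `decode_negative_byte` is a hypothetical helper name, and it only accepts patterns whose sign bit is 1.

```python
def decode_negative_byte(bits: str) -> int:
    """Interpret an 8-bit two's complement pattern whose sign bit is 1."""
    if len(bits) != 8 or bits[0] != "1":
        raise ValueError("expected 8 bits with a leading 1")
    last_one = bits.rindex("1")                  # rightmost 1 stays unchanged
    head, tail = bits[:last_one], bits[last_one:]
    flipped_head = "".join("1" if b == "0" else "0" for b in head)
    magnitude = int(flipped_head + tail, 2)      # convert binary to decimal
    return -magnitude                            # put a minus sign in front

print(decode_negative_byte("11101010"))  # -22
```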

32
Signed Negative Integer
  • Two's complement is the standard representation
    for negative integers in modern computers
  • This is because arithmetic operations are simple
    to implement when integers are stored this way
    (but this concept is beyond the scope of the
    course)
  • Although on the surface it seems complicated, at
    a deeper level it allows for simplicity of
    operations

33
Signed Negative Integer
  • An alternative but equivalent method for
    converting a negative number to its two's
    complement representation is:
  • Ignore the sign and convert the decimal number to
    binary
  • If needed add leading zeros to fill the storage
    location
  • Flip all the bits
  • Add 1 to the result of the last step
  • Make the sign bit 1
  • Some people find this easier

34
Signed Negative Integer
  • For example, -14. First, convert 14 to binary (8
    + 4 + 2 = 14) and make leading bits zero, giving
    00001110
  • Flip all the bits, giving 11110001
  • Add 1, giving 11110010
  • Make the sign bit 1 (it already is), giving
    11110010
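
A sketch of the flip-and-add-1 method in Python, assuming a 1-byte location. The function name `flip_and_add_one` is illustrative.

```python
def flip_and_add_one(negative_number: int) -> str:
    """Two's complement of a negative number by flipping all bits and adding 1."""
    if not -128 <= negative_number <= -1:
        raise ValueError("expected a negative number that fits in 1 byte")
    bits = format(abs(negative_number), "08b")                 # 14 -> 00001110
    flipped = "".join("1" if b == "0" else "0" for b in bits)  # -> 11110001
    result = (int(flipped, 2) + 1) & 0xFF                      # add 1, keep 8 bits
    return format(result, "08b")

print(flip_and_add_one(-14))  # 11110010
```

Both methods give the same pattern for every negative value that fits in the byte.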

35
Floating Point Number Representation
  • Floating point numbers are those that have a
    decimal portion (mathematicians call these real
    numbers)
  • Numbers like 3.14159, 50000.3, and 0.000005
  • The method that is used allows for very large or
    very small numbers to be stored using the same
    format

36
Floating Point
  • The main idea in this format is that the decimal
    point is allowed to float
  • That is, there is an actual decimal location in
    the original number, and there is a stored
    decimal location that is usually different
  • The original number is normalized by moving the
    decimal place so there is only one digit to the
    left

37
Floating Point
  • The basic idea can be seen from an example
    although this description glosses over many
    details
  • The number 102.39 is normalized by moving its
    decimal point two places to the left to become
    1.0239, and this number is stored and is called
    the mantissa
  • Also, the fact that the decimal point was moved
    left by 2 is stored so that the original number
    may be reconstructed and this is called the
    exponent
  • The sign of the number is also stored (0 for
    positive or zero, 1 for negative)
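
The decimal version of this idea can be sketched with Python's `decimal` module. This is only an illustration of sign, mantissa, and exponent in base 10; the helper name `normalize` is an assumption, and real hardware stores these fields in binary.

```python
from decimal import Decimal

def normalize(text: str):
    """Split a decimal number into sign, mantissa, and exponent (base 10)."""
    number = Decimal(text)
    sign = 1 if number < 0 else 0          # 0 for positive or zero, 1 for negative
    digits = abs(number)
    exponent = digits.adjusted()           # places the decimal point must move left
    mantissa = digits.scaleb(-exponent)    # one digit left of the decimal point
    return sign, mantissa, exponent

print(normalize("102.39"))  # (0, Decimal('1.0239'), 2)
```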

38
Floating Point
  • However, it is actually more complicated than
    that
  • The exponent and mantissa are actually stored in
    binary
  • And the value stored as the mantissa is only the
    fractional part of the binary number once the
    decimal point has been moved so that there is a
    binary 1 at the left, that is, 1.101001 is stored
    as 101001 and the leading 1 is assumed

39
Floating Point
  • The representation of numbers in floating point
    involves a couple of procedures that are
    complicated and beyond the scope of the course
  • These are repetitive multiplication of a decimal
    fraction by 2, and the excess system for
    storing positive and negative exponents
  • So, we won't be converting the numbers manually
    ourselves

40
Floating Point
  • However, the procedure used to store a number in
    floating point representation is
  • Store a 0 (positive) or 1 (negative) in the sign
    field
  • Convert the integer part to binary
  • Convert the decimal part (fraction) to binary by
    using repetitive multiplication by 2
  • Combine the two binary numbers with a decimal
    point between
  • Move the decimal point so that there is a 1 bit
    at the left and store the remaining bits in the
    mantissa field
  • Store the number of places moved using the
    excess system in the exponent field
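
The result of this procedure can be inspected in Python with the standard `struct` module. The sketch assumes the IEEE single-precision layout of 1 sign bit, 8 exponent bits stored in excess-127, and 23 mantissa bits; the slides do not give these field widths. The helper name `float32_fields` is illustrative.

```python
import struct

def float32_fields(x: float):
    """Return the sign, exponent, and mantissa bit strings of a 4-byte float."""
    (raw,) = struct.unpack(">I", struct.pack(">f", x))  # reinterpret 4 bytes as an integer
    bits = format(raw, "032b")
    return bits[0], bits[1:9], bits[9:]

# 6.5 is 110.1 in binary, which normalizes to 1.101 with the point moved 2 places
sign, exponent, mantissa = float32_fields(6.5)
print(sign)      # 0
print(exponent)  # 10000001  (2 + 127 = 129)
print(mantissa)  # 10100000000000000000000  (leading 1 is assumed, not stored)
```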

41
Floating Point
  • Computers store data in binary and in finite
    space, i.e., they are discrete, finite systems
  • However real numbers form a continuous, infinite
    system
  • Hence, computers can only approximate real
    numbers
  • The precision of a floating point number is how
    close the stored number is to the original number

42
Floating Point
  • Small Basic Example
  • Mathematically c should be 0 but what does the
    program display for c?

a = 2 / 3
b = 2 * (1 / 3)
c = a - b
TextWindow.WriteLine(c)
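
A comparable effect can be seen in Python, whose floats are stored in binary double precision. This sketch deliberately uses a different expression, 0.1 + 0.2 - 0.3, because the result of any particular expression depends on the number format a language uses.

```python
a = 0.1 + 0.2
b = 0.3
c = a - b          # mathematically this should be 0

print(a)           # 0.30000000000000004
print(c == 0)      # False
```

The tiny leftover value comes from 0.1, 0.2, and 0.3 each being stored as the nearest representable binary fraction.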
43
Floating Point
  • The more bits available for the mantissa field
    the more digits of the original number may be
    stored
  • Programming languages normally allow the
    programmer to define the precision by the data
    type chosen

44
Floating Point
  • Institute of Electrical and Electronics Engineers
    (IEEE) standards
  • Single-Precision (4 bytes)
  • Double-Precision (8 bytes)

45
Floating Point
  • Trade off
  • Double precision numbers require more space and
    therefore programs using them may run slower
  • But operations using double precision numbers
    will be more precise
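
The size and precision trade-off can be sketched with the standard `struct` module, which can pack a value as a 4-byte or an 8-byte IEEE float. The choice of 0.1 as the test value is an assumption made for illustration.

```python
import struct

value = 0.1

single = struct.pack(">f", value)   # single precision, 4 bytes
double = struct.pack(">d", value)   # double precision, 8 bytes

from_single = struct.unpack(">f", single)[0]
from_double = struct.unpack(">d", double)[0]

# How far each stored value is from Python's own double-precision 0.1
single_error = abs(from_single - value)
double_error = abs(from_double - value)

print(len(single), len(double))     # 4 8
print(single_error > double_error)  # True
```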

46
Summary
  • Data are stored as bit patterns
  • A bit pattern is a binary number
  • There are various data type formats
  • Characters are represented in ASCII

47
Summary
  • Integers are represented as either:
  • Unsigned: stored as the binary number equivalent
    to the original
  • Signed Positive: stored using the
    sign-and-magnitude format where the magnitude is
    the binary equivalent
  • Signed Negative: stored using the
    sign-and-magnitude format where the magnitude is
    in the two's complement format
  • Floating point numbers are represented using
    sign, exponent, and mantissa

48
Terminology
  • Data representation
  • Bit
  • Byte
  • Bit pattern
  • Binary number
  • Character
  • Integer
  • Floating point number
  • ASCII
  • Control characters
  • Extended ASCII
  • Unicode
  • Unsigned integer representation
  • Overflow
  • Sign-and-magnitude representation
  • Sign
  • Two's complement representation
  • Floating point representation
  • Normalize
  • Exponent
  • Mantissa
  • Precision
  • Single-precision
  • Double-precision

49
End