Title: Programming Language, Ch. III: Data Types
1 Programming Language, Ch. III: Data Types
- Internet Management Technology Lab.
2 Contents
- Introduction
- Basic Concepts for Data
- Elementary Data Type
- Structured Data Type
3 Introduction
- Three elementary components in a programming language
- Data: the basic information unit that holds the values determining the state of a system
- Operations: actions that manipulate data or sequencing
- Control: a mechanism provided to control the sequence of instructions
4 Basic Concepts for Data
- Data storage of an actual computer
- Memory, registers, or external media
- Usually has a simple structure: a sequence of bits grouped into bytes or words
- Data storage of a virtual computer
- Arrays, stacks, numbers, character strings
- Usually has a more complex organization
5 Basic Concepts for Data
- Data object
- A run-time grouping of one or more pieces of data in a virtual computer
- Some are defined by the programmer (variables, constants, arrays); others are defined by the system for housekeeping (activation records)
- A container for a data value (a single number, a character, or a memory location)
- Classified as an elementary object (manipulated as a single unit) or a structured object (an aggregate)
- Each data object has a lifetime (extent)
6 Basic Concepts for Data
- Attributes associated with a data object
- Name: identifies the object and is used to refer to it during execution
- Type: specifies the data values that the object may contain
- Location: the address of the storage where the object is located
- Value: the actual value that the object contains
- Component: the binding of a data object to one or more other data objects
- Operations: the mechanisms for manipulating the object
7 Basic Concepts for Data
- Variable
- A data object that the programmer explicitly defines and names in a program
- The content (value) of a variable can change during its lifetime
- Simple variable: an elementary data object with a name
- Complex variable: a group of variables
- Constant
- A data object with a name that is bound permanently to a value during its lifetime
- Types of constants
- Literal: its name is simply the written representation of its value
- Programmer-defined constant: its name is chosen by the programmer
8 Basic Concepts for Data
- Data type
- A class of data objects together with a set of operations for creating and manipulating them
- Examples of types: array, integer, file, etc.
- Classes of data type
- Primitive data type: built into the language
- User-defined data type: defined with the facilities that the language provides
- Self-modifying type: the content of a data object is modified
- The basic elements of a data type are its specification and its implementation
- Specification: describes the data type as an object with attributes, values, and operations
- Implementation: simulation of that part of the virtual computer using the basic elements
9 Basic Concepts for Data
- Data type (cont'd)
- Specification
- Attributes that distinguish data objects of that type
- e.g. number of dimensions, subscript range, data type of components
- Values that data objects of that type may have
- e.g. a set of numbers
- Operations that define the possible manipulations of data objects of that type
- e.g. for an array data type: subscripting to select a component, creating an array, changing its shape, accessing attributes, and performing arithmetic on a pair of arrays
- Implementation
- Storage representation used to represent data objects of that type
- Manner in which the operations are defined to manipulate the data objects (algorithms or procedures)
10 Elementary Data Type
- What is an elementary data type?
- It contains a single data value, together with various operations
- Integer, Real, Character, Boolean, Enumeration, Pointer
- Specification of elementary data types
- Attributes
- Basic attributes (name, data type) are invariant during the object's lifetime
- Attribute information may be stored in a dope vector (descriptor) or used only to determine storage size
- The value of an attribute of a data object is not the same as the value that the data object contains
- Values
- The type of an object determines its set of possible data values
- Usually closely related to the values that the underlying hardware provides
- The set of values for a data type is usually an ordered set
11 Elementary Data Type
- Specification of elementary data types (cont'd)
- Operations
- Determine how data objects of that type may be manipulated
- Primitive operation: part of the language definition
- Programmer-defined operation: takes the form of a subprogram or a method declaration
- Elements of an operation
- Domain: the set of possible input values
- Range: the set of possible result values
- Action: determines the result for any given set of input values
- Signature (called a prototype in C)
- Specifies the elements of an operation
- opname: arg_type x arg_type x ... x arg_type -> result_type
- Ex) +: integer x integer -> integer (binary operation)
- SQRT: real -> real (unary operation)
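The signature notation carries over to C prototypes fairly directly. The following is a minimal sketch rather than a definitive mapping; the names `int_add` and `real_abs` are hypothetical, chosen only for this illustration.

```c
#include <assert.h>
#include <limits.h>

/* int_add: integer x integer -> integer (binary operation) */
int int_add(int a, int b);

/* real_abs: real -> real (unary operation) */
double real_abs(double x);

int int_add(int a, int b) {
    /* The operation is undefined when the sum would overflow,
       so the domain is narrower than the prototype suggests. */
    assert((b >= 0) ? (a <= INT_MAX - b) : (a >= INT_MIN - b));
    return a + b;
}

double real_abs(double x) {
    return x < 0.0 ? -x : x;
}
```

The prototype states only the argument and result types; the overflow check shows that part of the domain still has to be stated separately.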
12 Elementary Data Type
- Specification of elementary data types (cont'd)
- Four main factors that combine to obscure the definitions of operations
- Operations undefined for certain inputs
- Implicit arguments: use of global variables
- Side effects (implicit results): changes to global variables
- Self-modification: history-sensitive actions such as a counter or a random number generator; LISP allows self-modification through code change
- Subtype
- A data type that is part of a larger class (superclass)
- Ex) type smallInt = 1..20
13 Elementary Data Type
- Implementation of elementary data types
- Storage representation for data objects and values of that type
- Strongly influenced by the underlying hardware (efficient)
- Sometimes simulated in software (inefficient)
- In many languages, data attributes are determined by the compiler
- Sometimes data attributes are stored in a descriptor vector (e.g. LISP)
- Set of algorithms or procedures that define the operations of the type
- Directly implemented by hardware, or
- As a procedure or function subprogram, or
- As an in-line code sequence
- e.g. abs(x) = if x < 0 then -x else x
14 Elementary Data Type
- Declarations
- Definition
- Statements that specify the name, the type, and the lifetime of each data object
- Explicit declaration vs. implicit (default) declaration
- Operations can be declared by giving the signature of each operation
- Explicit declaration of the argument and result types of programmer-defined operations
- Purpose of declaration
- Choice of storage representation by the translator
- Storage management, by specifying the lifetime of variables
- Polymorphic operations
- Type checking
15 Elementary Data Type
- Type checking
- Checking that each operation receives the proper number of arguments of the proper data types
- Dynamic type checking
- Difficult to debug
- Needs storage to keep type information
- Slower execution because of run-time type checks
- Limits compiler optimization because of unknown factors
- Advantage: flexibility
- Static type checking
- Operation: checks the number and types of arguments and results
- Variable: checks the type of the object associated with the name
- Constant: the type follows from the syntactic form of the literal
- Strong typing
- Detects all type errors statically
16 Elementary Data Type
- Type-safe system
- A function cannot generate a result with a type outside of its signature
- Type inference
- Types can be resolved from the program by how they are used
- Ex) in ML: fun area (length: int, width: int): int = length * width
- The only illegal form: fun area (length, width) = length * width (the argument types are ambiguous)
- Type conversion
- Takes a data object of one type and produces the corresponding data object of another type
- Explicit type conversion: uses a set of built-in functions
- Implicit type conversion (type coercion): applied as specified in the language
- The basic principle is not to lose information; such a conversion is called widening, or promotion
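A short C sketch of the two kinds of conversion follows. It assumes that `double` can represent every `int` exactly (true for a 32-bit `int` and an IEEE 754 `double`); the function names `widen` and `narrow` are hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Implicit conversion (coercion): int -> double is a widening, so no
   information is lost under the stated assumption. */
double widen(int i) {
    double d = i;          /* the compiler inserts the conversion */
    return d;
}

/* Explicit conversion: double -> int is a narrowing. The cast discards
   the fractional part, and the value must fit in an int. */
int narrow(double d) {
    assert(d >= (double)INT_MIN && d <= (double)INT_MAX);
    return (int)d;         /* truncates toward zero */
}
```

Converting an `int` to `double` and back returns the original value, whereas `narrow(3.9)` yields 3: the narrowing direction is the one that loses information.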
17 Elementary Data Type
- Polymorphic operations
- Polymorphic function: a subprogram or function that can assume more than one type
- Ex) f(x) = x in PASCAL
- function f(x: int): int
- function f(x: boolean): boolean
- function f(x: real): real
- Ad hoc polymorphism
- Different code for different manifestations of the operators
- Overloading
- Implicit coercion
- Universal polymorphism
- A function name selects from a variety of implementations depending on the types of its arguments
18 Elementary Data Type
- Assignment
- Basic operation for changing the binding of a value to a data object
- Forms of assignment statements
- Schizophrenic representation
- lvalue (location attribute) and rvalue (value attribute)
- Ex) X := X + 1
- := : integer x integer -> void (PASCAL)
- = : integer x integer -> integer (C)
19 Elementary Data Type
- Assignment
- Variations on assignment
- The updating assignment
- Multiple target assignment
- Multiple assignment statement
- L1, L2, L3, ..., Ln := E (ALGOL 60, PL/1)
20 Elementary Data Type
- Storage models
- Value semantics: the (immutable) value is assigned
- Pointer semantics: the location is assigned
- Ex) A := 2; B := 3; A := B; B := 5; print A
- Storage insecurities
- Use before initialization
- Dangling reference
- Ex) var p, q: ^T; begin new(p); q := p; dispose(p) end
- Ex) var p: ^integer; procedure q; var i: integer; begin p := ADDR(i) end; ... access using p
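The A/B example can be traced in C under both storage models. This is a sketch under one stated assumption: pointer semantics is read here as both names sharing a single mutable storage cell. The function names are hypothetical.

```c
#include <assert.h>

/* Value semantics: A := B copies the value, so a later change to B
   does not affect A. Returns the value that "print A" would show. */
int value_semantics(void) {
    int a = 2, b = 3;
    a = b;      /* copy the value 3 into a */
    b = 5;      /* a is unaffected */
    return a;
}

/* Pointer semantics (shared-cell reading): A := B makes A refer to
   the same location as B, so the update through B is seen through A. */
int pointer_semantics(void) {
    int cell_a = 2, cell_b = 3;
    int *a = &cell_a, *b = &cell_b;
    a = b;              /* a now names the same storage as b */
    assert(a == b);     /* both names share one location */
    *b = 5;
    return *a;
}
```

Under these assumptions `value_semantics()` returns 3 and `pointer_semantics()` returns 5, which is why the storage model has to be part of the language definition.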
21 Elementary Data Type
- Initialization
- Must be explicit
- Must be done at object creation
- Initialized by default
- Equality and Equivalence
22 Elementary Data Type
- Integers
- Specification
- Attribute: the type attribute (integer) only
- Value: an ordered, finite subset of the integers
- Operations
- Arithmetic operations: binary (+, -, x, /, mod), unary (-, +, abs)
- Relational operations (equal, not equal, less than, ...)
- Assignment operations (:=, =)
- Bit operations (&, |, <<, >>)
- Implementation
- Uses a complete memory word
- Three possible storage representations
- No run-time descriptor
- Descriptor stored in a separate word (LISP)
- Descriptor stored in the same word
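The third representation, a descriptor stored in the same word as the value, can be sketched as a tagged integer. The layout below (2 tag bits, 30 value bits, tag value 1 for integers) is a hypothetical choice for illustration, not the layout of any particular system.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: the low 2 bits of a 32-bit word hold the type
   tag (the descriptor); the remaining 30 bits hold the integer. */
enum { TAG_BITS = 2, TAG_MASK = 3, TAG_INT = 1 };

uint32_t box_int(int32_t v) {
    assert(v >= -(1 << 29) && v < (1 << 29));   /* must fit in 30 bits */
    return ((uint32_t)v << TAG_BITS) | TAG_INT;
}

int is_int(uint32_t word) {
    return (word & TAG_MASK) == TAG_INT;
}

int32_t unbox_int(uint32_t word) {
    assert(is_int(word));
    /* Clear the tag, reinterpret as signed (assumes two's complement),
       then divide to undo the shift while keeping the sign. */
    return (int32_t)(word & ~(uint32_t)TAG_MASK) / (1 << TAG_BITS);
}
```

The cost of this representation is visible in the range check: two bits of every word are spent on the descriptor, so fewer integer values fit.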
23 Elementary Data Type
- Subranges
- Specification
- Attribute: subtype of a type
- Value: a sequence of integer values within some restricted range
- Operations: the same operations as for integers
- Implementation
- Same as the base type, but with a few effects
- Smaller storage requirements
- The range 1..10 requires only 4 bits
- Better type checking
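Both effects of a subrange, smaller storage and better checking, can be sketched in C with a bit-field and an explicit range check. The type `Sub1to10` and the functions below are hypothetical; C has no built-in subrange types, so the check is written by hand.

```c
#include <assert.h>

/* Hypothetical subrange type 1..10 stored in a 4-bit field. */
typedef struct {
    unsigned value : 4;   /* 4 bits hold 0..15, enough for 1..10 */
} Sub1to10;

enum { SUB_LO = 1, SUB_HI = 10 };

/* Returns 1 and stores v if it lies in the subrange; otherwise returns
   0 and leaves *s unchanged (the run-time range check). */
int sub_assign(Sub1to10 *s, int v) {
    assert(s != 0);
    if (v < SUB_LO || v > SUB_HI) {
        return 0;
    }
    s->value = (unsigned)v;
    return 1;
}

/* Smallest number of bits able to represent n distinct values. */
int bits_needed(unsigned n) {
    int bits = 0;
    assert(n <= (1u << 31));
    while ((1u << bits) < n) {
        bits++;
    }
    return bits;
}
```

For the ten values of the range 1..10, `bits_needed(10)` gives 4, matching the figure on the slide.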
24 Elementary Data Type
- Floating-point real numbers
- Specification
- Attribute: the type attribute (real)
- Value: hardware-determined numbers from a minimum to a maximum, not evenly distributed
- Operations: the same operations as for integers, except that Boolean operations have some restrictions (e.g. equality tests)
- Built-in functions
- sin: real -> real
- max: real x real -> real
- Implementation
- Split the storage location into a mantissa and an exponent
- Figure: << Excess-127 notation >>
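The mantissa/exponent split can be inspected directly in C. The sketch assumes that `float` is a 32-bit IEEE 754 single-precision value (1 sign bit, 8 exponent bits in excess-127 notation, 23 fraction bits) and handles only normalized numbers; `FloatParts` and `split_float` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    unsigned sign;       /* 0 = positive, 1 = negative */
    int exponent;        /* true exponent, with the bias of 127 removed */
    uint32_t mantissa;   /* 23 fraction bits; the leading 1 is implicit */
} FloatParts;

FloatParts split_float(float f) {
    uint32_t bits;
    FloatParts p;
    assert(sizeof(float) == sizeof(uint32_t));
    memcpy(&bits, &f, sizeof bits);     /* reinterpret the storage */
    p.sign = bits >> 31;
    p.exponent = (int)((bits >> 23) & 0xFF) - 127;
    p.mantissa = bits & 0x7FFFFF;
    return p;
}
```

For example, -6.5 is -1.625 x 2^2, so the stored exponent field is 2 + 127 = 129 and the fraction bits encode 0.625.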
25 Elementary Data Type
- Fixed-point real numbers
- Specification
- Digit sequence of fixed length
- Avoids round-off errors (e.g. dollars and cents)
- Ex) in COBOL: X PICTURE 999V99.
- Implementation
- Either supported by hardware or simulated by software
- Calculation is done after converting the numbers to the same format (scale factor, SF)
- Ex) Y(SF=2) := X(SF=3) + Y(SF=2) is computed as ((X(SF=3) + (10 x Y)(SF=3)) / 10)(SF=2)
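Scale-factor arithmetic can be simulated in software with scaled integers. In this sketch a number is stored as an integer digit string plus a scale factor (the count of digits after the decimal point); the names `Fixed`, `rescale`, and `fixed_add` are hypothetical, and reducing the scale factor truncates rather than rounds.

```c
#include <assert.h>

/* value = digits / 10^sf, e.g. 1.234 is {1234, 3}. */
typedef struct {
    long digits;
    int sf;
} Fixed;

static long ten_to(int n) {
    long p = 1;
    while (n-- > 0) p *= 10;
    return p;
}

/* Convert to another scale factor; lowering sf truncates digits. */
Fixed rescale(Fixed a, int sf) {
    assert(a.sf >= 0 && sf >= 0);
    if (sf >= a.sf) a.digits *= ten_to(sf - a.sf);
    else            a.digits /= ten_to(a.sf - sf);
    a.sf = sf;
    return a;
}

/* Add after converting both operands to the larger scale factor, then
   convert the sum to the scale factor of the result. */
Fixed fixed_add(Fixed x, Fixed y, int result_sf) {
    int sf = x.sf > y.sf ? x.sf : y.sf;
    Fixed sum;
    sum.digits = rescale(x, sf).digits + rescale(y, sf).digits;
    sum.sf = sf;
    return rescale(sum, result_sf);
}
```

With X = 1.234 (SF 3) and Y = 5.67 (SF 2), Y is first scaled to 5670, the sum 6904 is formed at SF 3, and dividing by 10 gives 690, that is 6.90 at SF 2.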
26 Elementary Data Type
- Enumeration
- Specification
- Allows a programmer to define and manipulate subrange variables more directly
- Ex)
- Implementation
- Allocated within the minimum number of bits needed
- Each entry is numbered 1, 2, 3, ..., and simple operations on the index numbers are used
- Uses the primitive operations of the superset
- Booleans
- Specification
- Attribute type Boolean false, true in
PASCAL, ADA - And, or, and not operations
- Implementation
- Single bit of storage
- Could be word or byte due to the addressing
problem - A designated bit is used and the rest are ignored
- Entire word (or byte) has 0 for false, else for
true
28 Elementary Data Type
- Characters
- Sequences of characters are often processed as a unit
- Specification
- Value: the set of possible character values, a language-defined enumeration supported by the underlying hardware
- Operations: assignment, relational operations, tests for special classes of characters (letter, digit, special character)
- Implementation
- Directly supported by the underlying hardware
- Converted by the input-output system to the character-set representation
29 Structured data types
- Structured data objects
- Aggregate data object
- Component
- Elementary object
- Another data structure
- Issues about data structures
- Maintaining the specification of the structural information
- Indicating the component data objects
- Managing the storage
30 Structured data types
- Specification of structured data types
- Major attributes
- Number of components
- Fixed size: array, record
- Variable size: lists, stacks, files, sets
- Type of each component
- Homogeneous: array
- Heterogeneous: record
- Names to be used for selecting components
- Homogeneous types may use an integer subscript or a sequence of subscripts
- Heterogeneous types use programmer-defined identifiers
- Maximum number of components
- Only for variable-size data structures
- Organization of the components
- A simple linear sequence is the most common organization
- Multi-dimensional forms are used to extend linear sequences
31 Structured data types
- Specification of structured data types (continued)
- New classes of operations on data structures
- Component selection operations
- Random selection
- Sequential selection
- Whole-data-structure operations
- Addition of two arrays
- Assignment of one record to another
- Union operation on sets
- Insertion/deletion
- Raises related storage representation and management problems
- Creation/destruction of data structures
- Raises related storage management problems
32 Structured data types
- Implementation of data structure types
- Storage representation
- Storage for the data plus an optional descriptor
- Usually simulated in software
- Basic representations
- Sequential representation
- Linked representation
- Implementation of operations on data structures
- Sequential representation
- Uses an accessing formula to locate the component
- Location = base address + offset
- e.g. lvalue(A[10]) = lvalue(A) + 10 x sizeof(A[0]) (homogeneous components)
- Linked representation
- Follows a chain of pointers
33 Structured data types
- Implementation of data structure types
- Type checking issues
- Existence of a selected component
- Type of a selected component
- Name equivalence vs. structural equivalence
- Name equivalence
- Same name
- Structural equivalence
- Same structure
- e.g. in Ada:
  type BLACK is INTEGER;
  type WHITE is INTEGER;
  B: BLACK; W: WHITE; I: INTEGER;
  begin
    W := 5; B := W; I := B;
  end
- Under name equivalence, B := W and I := B are type errors; under structural equivalence both are legal
34 Structured data types
- Vectors
- Linear arrays
- Composed of a finite number of homogeneous components
- Each component is selected by its subscript
- Attributes of a vector
- Number of components
- Data type
- Subscripts to be used to select each component
- Could be an integer value or an enumerated value
- Ex) float A[10]; /* in C */
- Ex) V: array [1..10] of real; /* in PASCAL */
- Ex) type class = (Freshman, Sophomore, Junior, Senior); var ClassAverage: array [class] of real; /* in PASCAL */
35 Structured data types
- Vectors (continued)
- Operations on vectors
- Subscripting
- Creation
- Destruction
- Assigning a value to a component
- Arithmetic operations on two vectors
- Implementation
- For efficiency
- Homogeneity
- Fixed size
- Insertion and deletion are not available
36 Structured data types
- Vectors (continued)
- Implementation (continued)
37 Structured data types
- Vectors (continued)
- Implementation (continued)
- Accessing formula
- lvalue(A[I]) = a + (I - LB) x E = (a - LB x E) + (I x E)
- where a is the base location, LB the lower bound of the subscript, and E the size of one component
- If the subscript does not start from 0, compute the virtual origin (a location that may not exist): VO = a - LB x E
- Packed storage representation
- Creates issues like boundary crossing
- Ex) struct a { char str[3]; int on_boundary; };
- Whole-vector operations
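The accessing formula for a vector can be written out in C as follows. This is a sketch, not a description of any particular compiler: the descriptor type `VecDesc` and both functions are hypothetical, and the virtual origin is kept as an offset from the base so that no out-of-range pointer is ever formed.

```c
#include <assert.h>
#include <stddef.h>

/* Descriptor (dope vector) for a vector with subscripts LB..UB. */
typedef struct {
    char *base;     /* a: address of the first component */
    long lb, ub;    /* subscript bounds */
    size_t esize;   /* E: size of one component in bytes */
} VecDesc;

/* lvalue(A[I]) = a + (I - LB) x E, with a subscript range check. */
void *vec_lvalue(const VecDesc *d, long i) {
    assert(i >= d->lb && i <= d->ub);
    return d->base + (size_t)(i - d->lb) * d->esize;
}

/* Virtual-origin form: VO = a - LB x E, lvalue(A[I]) = VO + I x E.
   VO - a is precomputed; only the final sum is turned into an address. */
void *vec_lvalue_vo(const VecDesc *d, long i) {
    long vo_offset = -d->lb * (long)d->esize;
    assert(i >= d->lb && i <= d->ub);
    return d->base + (vo_offset + i * (long)d->esize);
}
```

Both functions return the same address for every valid subscript; the second form saves the subtraction of LB on each access because `vo_offset` could be stored in the descriptor once.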
38 Structured data types
- Multi-dimensional arrays
- Specification
- Extension of the vector: an array with multiple subscripts
- Implementation
- A vector of vectors
- Storage representation
39 Structured data types
- Multi-dimensional arrays
- Implementation (continued)
- Row-major order
- lvalue(A[I,J]) = a + (I - LB1) x S + (J - LB2) x E
- where a = base address, E = size of one component, and S = length of a row = (UB2 - LB2 + 1) x E
- VO = a - LB1 x S - LB2 x E
- lvalue(A[I,J]) = VO + I x S + J x E
- Column-major order
- lvalue(A[I,J]) = VO' + I x E + J x S'
- where S' = length of a column = (UB1 - LB1 + 1) x E and VO' = a - LB1 x E - LB2 x S'
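The two storage orders differ only in which subscript is multiplied by the longer stride. The sketch below computes byte offsets relative to the base address for an array A[LB1..UB1, LB2..UB2]; the type `Dims` and the function names are hypothetical.

```c
#include <assert.h>

/* Bounds of a two-dimensional array and the component size E. */
typedef struct {
    long lb1, ub1, lb2, ub2;
    long e;
} Dims;

/* Row-major: offset = (I - LB1) x S + (J - LB2) x E,
   where S = (UB2 - LB2 + 1) x E is the length of a row. */
long row_major_offset(const Dims *d, long i, long j) {
    long s = (d->ub2 - d->lb2 + 1) * d->e;
    assert(i >= d->lb1 && i <= d->ub1 && j >= d->lb2 && j <= d->ub2);
    return (i - d->lb1) * s + (j - d->lb2) * d->e;
}

/* Column-major: offset = (I - LB1) x E + (J - LB2) x S',
   where S' = (UB1 - LB1 + 1) x E is the length of a column. */
long col_major_offset(const Dims *d, long i, long j) {
    long s = (d->ub1 - d->lb1 + 1) * d->e;
    assert(i >= d->lb1 && i <= d->ub1 && j >= d->lb2 && j <= d->ub2);
    return (i - d->lb1) * d->e + (j - d->lb2) * s;
}
```

C itself stores two-dimensional arrays in row-major order, so `row_major_offset` with zero lower bounds reproduces the offsets of a declaration such as `int m[3][4]`, while FORTRAN uses the column-major layout.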
40 Structured data types
- Slice
- Specification
- A substructure of an array that is itself an array
- Implementation
- Ex) a 3 by 4 array
- e.g. in FORTRAN 77, pass A(1,3) to B(1): A(1,3) corresponds to B(1), A(2,3) to B(2), A(3,3) to B(3), and so on
- lvalue(A[I,J]) = a + I x 3 + J x 1
41 Structured data types
- Records
- A data structure with a fixed number of components of different types
- Specification
- The attributes of components
- The number of components
- The data type of each component
- A name, written literally, to select each component
- Operations on the entire structure are not common
- Ex) /* records in C */ struct EmployeeType { int ID; int Age; float Salary; char Dept; };
- Array is a subtype of Record
- A record may have heterogeneous types of components
- Components may be distinguished by a name (identifier) rather than by subscripts
42 Structured data types
- Records (continued)
- Implementation
- Each component name is mapped into an offset
- The offset is the distance from the base of the record
- Padding occurs for data alignment
- Ex) struct Class { int ClassName; char Class; int classID; };
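Offsets and padding can be observed with the standard `offsetof` macro. The sketch below uses the record from the slide; the helper functions are hypothetical, and the exact amount of padding depends on the compiler and target, so no specific number is claimed.

```c
#include <assert.h>
#include <stddef.h>

/* The record from the slide, with the field types given there. */
struct Class {
    int  ClassName;
    char Class;
    int  classID;
};

/* Number of padding bytes placed between Class and classID so that
   classID starts on a suitably aligned address. */
size_t padding_before_classID(void) {
    size_t end_of_class = offsetof(struct Class, Class) + sizeof(char);
    size_t start_of_id  = offsetof(struct Class, classID);
    assert(start_of_id >= end_of_class);
    return start_of_id - end_of_class;
}

/* Selecting a component = base address + offset of that component. */
int *select_classID(struct Class *r) {
    assert(r != 0);
    return (int *)((char *)r + offsetof(struct Class, classID));
}
```

On a typical machine with 4-byte, 4-aligned `int`, three padding bytes follow the `char`, but the code only relies on what `offsetof` reports.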
43 Structured data types
- Variant records
- Union type in C
- Implementation
- Allocate the largest storage needed for any variant of the record
- Dynamic checking is provided for range errors
- Ex) in PASCAL:
  type PayClass = (Salaried, Hourly);
  var HourlyEmployee: record
    ID: integer;
    Dept: array [1..3] of char;
    Age: integer;
    case PayType: PayClass of
      Salaried: (MonthlyRate: real; StartDate: integer);
      Hourly: (HourlyRate: real; Reg: integer; Overtime: integer)
  end
- Ex) /* records in C */ union EmployeeTemp { int ID; int Age; float Salary; char Dept; };
- PayType is called the tag in PASCAL, or the discriminant in Ada
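C unions carry no tag of their own, so the Pascal variant record is usually approximated with a struct that holds an explicit tag next to a union. The following is a sketch of that idiom; the field names follow the Pascal example, while the type `Employee` and the accessor functions are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* The tag (discriminant) records which variant is currently stored. */
typedef enum { Salaried, Hourly } PayClass;

typedef struct {
    int ID;
    char Dept[3];
    int Age;
    PayClass PayType;                 /* tag field */
    union {                           /* variants share one storage area */
        struct { double MonthlyRate; int StartDate; } salaried;
        struct { double HourlyRate; int Reg; int Overtime; } hourly;
    } variant;
} Employee;

/* Dynamic check: the tag must match the variant being selected. */
double monthly_rate(const Employee *e) {
    assert(e->PayType == Salaried);
    return e->variant.salaried.MonthlyRate;
}

double hourly_rate(const Employee *e) {
    assert(e->PayType == Hourly);
    return e->variant.hourly.HourlyRate;
}

/* Storage reserved for the variant part: the size of the largest variant. */
size_t variant_storage(void) {
    return sizeof(((Employee *)0)->variant);
}
```

The assertions play the role of the dynamic check mentioned on the slide: selecting `MonthlyRate` while the tag says `Hourly` is caught at run time instead of silently reading the wrong variant.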
44 Structured data types
- Lists
- Similar to vectors except
- Variable length
- Heterogeneous components
- Usually implicit declaration is used
- Operations
- Selector, insert, append, delete, etc.
- Variations of lists
- Stack, queue, tree, etc.
- e.g. in LISP: (cons '(a b c) '(d e f)) => ((a b c) d e f)
- Each element is called an atom
45 Structured data types
- Character strings
- Specification
- Operations
- Concatenation
- Relational operations on strings
- Substring selection
- I/O formatting
- Implementation
- Simulated in software for the parts that the hardware does not support
- Implementation manner
- Fixed declared length
- Variable length up to a declared bound
- Unbounded length
- C uses a null terminator to indicate the end of a string
46 Structured data types
- Pointers
- An object that contains the location of another data object
- Its rvalue is the lvalue of another object
- Specification
- Pointers may reference data objects of a single type, or
- Pointers may reference data objects of any type
- Operations
- Creation
- Dereferencing
- Implementation
- Storage representation
- Absolute address
- Relative address
- Issues
- Garbage
- dangling reference
- optimization
47 Structured data types
- Sets
- An unordered collection of distinct values
- Operations
- Membership
- Insertion/deletion
- Union
- Intersection
- And/or
- Implementation
- Bit string representation
- Hash-coded representation
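The bit-string representation can be sketched in C for a small universe. The example assumes the elements are the integers 0..31, so that one 32-bit word holds the whole set; `BitSet` and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Bit i of the word is 1 exactly when element i is a member. */
typedef uint32_t BitSet;

BitSet set_insert(BitSet s, unsigned elem) {
    assert(elem < 32);
    return s | ((uint32_t)1 << elem);
}

BitSet set_delete(BitSet s, unsigned elem) {
    assert(elem < 32);
    return s & ~((uint32_t)1 << elem);
}

int set_member(BitSet s, unsigned elem) {
    assert(elem < 32);
    return (int)((s >> elem) & 1u);
}

/* Union and intersection are single bitwise OR and AND operations. */
BitSet set_union(BitSet a, BitSet b)        { return a | b; }
BitSet set_intersection(BitSet a, BitSet b) { return a & b; }
```

This representation makes membership, union, and intersection constant-time, at the price of a universe that must be small and known in advance; larger or open-ended universes are where the hash-coded representation applies.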