Title: The presentation will address the following questions:
1. Introduction
- The presentation will address the following questions:
- What is systems modeling, and what is the difference between logical and physical system models?
- What is data modeling, and what are its benefits?
- Can you recognize and understand the basic concepts and constructs of a data model?
- Can you read and interpret an entity relationship data model?
- When in a project are data models constructed, and where are they stored?
- Can you discover entities and relationships?
- Can you construct an entity-relationship context diagram?
2. Introduction
- The presentation will address the following questions (continued):
- Can you discover or invent keys for entities?
- Can you construct a fully attributed entity relationship diagram and describe all data structures and attributes to the repository or encyclopedia?
3. An Introduction to Systems Modeling
- Systems Modeling
- One way to structure unstructured problems is to draw models.
- A model is a representation of reality. Just as a picture is worth a thousand words, most system models are pictorial representations of reality.
- Models can be built for existing systems as a way to better understand those systems, or for proposed systems as a way to document business requirements or technical designs.
- What are Logical Models?
- Logical models show what a system is or does. They are implementation-independent; that is, they depict the system independent of any technical implementation. As such, logical models illustrate the essence of the system.
4. An Introduction to Systems Modeling
- Systems Modeling
- What are Physical Models?
- Physical models show not only what a system is or does, but also how the system is physically and technically implemented. They are implementation-dependent because they reflect technology choices and the limitations of those technology choices.
- Systems analysts use logical system models to depict business requirements, and physical system models to depict technical designs.
5. An Introduction to Systems Modeling
- Systems Modeling
- Systems analysis activities tend to focus on the logical system models for the following reasons:
- Logical models remove biases that are the result of the way the current system is implemented or the way that any one person thinks the system might be implemented.
- Logical models reduce the risk of missing business requirements because we are too preoccupied with technical details.
- Logical models allow us to communicate with end-users in non-technical or less technical language.
6. An Introduction to Systems Modeling
- Systems Modeling
- Data modeling is a technique for defining business requirements for a database.
- Data modeling is a technique for organizing and documenting a system's DATA. Data modeling is sometimes called database modeling because a data model is usually implemented as a database. It is sometimes called information modeling.
- Many experts consider data modeling to be the most important of the modeling techniques.
- Why is data modeling considered crucial?
- Data is viewed as a resource to be shared by as many processes as possible. As a result, data must be organized in a way that is flexible and adaptable to unanticipated business requirements, and that is the purpose of data modeling.
7. An Introduction to Systems Modeling
- Systems Modeling
- Why is data modeling considered crucial? (continued)
- Data structures and properties are reasonably permanent, certainly a great deal more stable than the processes that use the data. Often the data model of a current system is nearly identical to that of the desired system.
- Data models are much smaller than process and object models and can be constructed more rapidly.
- The process of constructing data models helps analysts and users quickly reach consensus on business terminology and rules.
9. System Concepts for Data Modeling
- System Concepts
- Most systems analysis techniques are strongly rooted in systems thinking.
- Systems thinking is the application of formal systems theory and concepts to systems problem solving.
- There are several notations for data modeling, but the actual model is frequently called an entity relationship diagram (ERD).
- An ERD depicts data in terms of the entities and relationships described by the data.
10. System Concepts for Data Modeling
- Entities
- All systems contain data.
- Data describes things.
- A concept to abstractly represent all instances of a group of similar things is called an entity.
- An entity is something about which we want to store data. Synonyms include entity type and entity class.
- An entity is a class of persons, places, objects, events, or concepts about which we need to capture and store data.
- An entity instance is a single occurrence of an entity.
11. System Concepts for Data Modeling
- Attributes
- The pieces of data that we want to store about each instance of a given entity are called attributes.
- An attribute is a descriptive property or characteristic of an entity. Synonyms include element, property, and field.
- Some attributes can be logically grouped into super-attributes called compound attributes.
- A compound attribute is one that actually consists of more primitive attributes. Synonyms in different data modeling languages are numerous: concatenated attribute, composite attribute, and data structure.
12. System Concepts for Data Modeling
- Attributes
- Domains
- The values for each attribute are defined in terms of three properties: data type, domain, and default.
- The data type for an attribute defines what class of data can be stored in that attribute.
- For purposes of systems analysis and business requirements definition, it is useful to declare logical (non-technical) data types for our business attributes.
- An attribute's data type determines its domain.
- The domain of an attribute defines what values an attribute can legitimately take on.
- Every attribute should have a logical default value.
- The default value for an attribute is the value that will be recorded if none is specified by the user.
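The three properties above (data type, domain, default) can be sketched in code. This is only an illustration: the `AttributeSpec` class, its `resolve` method, and the CREDIT HOURS attribute with its 1 to 6 range are assumptions made for the example, not part of the presentation.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class AttributeSpec:
    """Logical description of one attribute (hypothetical helper)."""
    name: str
    data_type: type                    # class of data that can be stored
    in_domain: Callable[[Any], bool]   # True for values the attribute may take
    default: Any                       # recorded when the user gives no value

    def resolve(self, value: Any = None) -> Any:
        """Return the value to record: apply the default, then check type and domain."""
        if value is None:
            value = self.default
        if not isinstance(value, self.data_type):
            raise TypeError(f"{self.name}: expected {self.data_type.__name__}")
        if not self.in_domain(value):
            raise ValueError(f"{self.name}: {value!r} is outside the domain")
        return value


# Assumed example: CREDIT HOURS is a whole number from 1 to 6, default 3.
credit_hours = AttributeSpec("CREDIT HOURS", int, lambda v: 1 <= v <= 6, 3)
```

Calling `credit_hours.resolve()` with no argument records the default, while a value such as 9 is rejected because it falls outside the assumed domain.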
16. System Concepts for Data Modeling
- Attributes
- Identification
- An entity typically has many instances, perhaps thousands or millions, and there exists a need to uniquely identify each instance based on the data value of one or more attributes.
- Every entity must have an identifier or key.
- A key is an attribute, or a group of attributes, which assumes a unique value for each entity instance. It is sometimes called an identifier.
- Sometimes more than one attribute is required to uniquely identify an instance of an entity.
- A group of attributes that uniquely identifies an instance of an entity is called a concatenated key. Synonyms include composite key and compound key.
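Whether an attribute, or a group of attributes, behaves as a key can be checked against sample instances. The sketch below is illustrative only: the `is_key` function and the enrollment data are assumptions for the example, and passing the check on sample data does not prove that the business rule holds in general.

```python
from typing import Any, Iterable, Mapping, Sequence


def is_key(instances: Iterable[Mapping[str, Any]], attributes: Sequence[str]) -> bool:
    """Return True if the attribute values are unique across all instances."""
    seen = set()
    for instance in instances:
        value = tuple(instance[name] for name in attributes)
        if value in seen:
            return False  # two instances share the same value: not a key
        seen.add(value)
    return True


# Hypothetical instances of an ENROLLMENT entity.
enrollments = [
    {"student_id": 1, "course_id": "MATH101", "grade": "A"},
    {"student_id": 1, "course_id": "HIST200", "grade": "B"},
    {"student_id": 2, "course_id": "MATH101", "grade": "A"},
]

single_attribute_ok = is_key(enrollments, ["student_id"])
concatenated_ok = is_key(enrollments, ["student_id", "course_id"])
```

Here no single attribute identifies an enrollment, but the concatenated key of student and course does.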
17. System Concepts for Data Modeling
- Attributes
- Identification
- Frequently, an entity may have more than one key.
- Each of these keys is called a candidate key.
- A candidate key is a candidate to become the primary identifier of instances of an entity. It is sometimes called a candidate identifier. (Note: A candidate key may be a single attribute or a concatenated key.)
- A primary key is that candidate key which will most commonly be used to uniquely identify a single entity instance.
- Any candidate key that is not selected to become the primary key is called an alternate key.
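In a relational implementation, the primary key and an alternate key are commonly declared as PRIMARY KEY and UNIQUE constraints. The following sketch uses Python's standard sqlite3 module; the `student` table, its columns, and the `try_insert` helper are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two candidate keys: student_number (chosen as primary key)
# and email (left as an alternate key, enforced with UNIQUE).
conn.execute(
    """
    CREATE TABLE student (
        student_number INTEGER PRIMARY KEY,
        email          TEXT NOT NULL UNIQUE,
        name           TEXT NOT NULL
    )
    """
)
conn.execute("INSERT INTO student VALUES (1, 'ann@example.edu', 'Ann')")


def try_insert(row: tuple) -> bool:
    """Insert a row; return False if a key constraint rejects it."""
    try:
        conn.execute("INSERT INTO student VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False
```

A row that repeats either the student number or the email address is rejected, because both are keys; only the choice of which one is primary differs.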
18. System Concepts for Data Modeling
- Attributes
- Identification
- Sometimes, it is also necessary to identify a subset of entity instances as opposed to a single instance.
- For example, we may require a simple way to identify all male students and all female students.
- A subsetting criterion is an attribute (or concatenated attribute) whose finite values divide all entity instances into useful subsets. Some methods call this an inversion entry.
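The effect of a subsetting criterion can be shown by grouping instances on that attribute's values. This is a minimal sketch; the `subset_by` function and the student data are assumptions for the example.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List, Mapping


def subset_by(instances: Iterable[Mapping[str, Any]], attribute: str) -> Dict[Any, List[Mapping[str, Any]]]:
    """Divide instances into subsets, one per value of the subsetting attribute."""
    subsets: Dict[Any, List[Mapping[str, Any]]] = defaultdict(list)
    for instance in instances:
        subsets[instance[attribute]].append(instance)
    return dict(subsets)


# Hypothetical STUDENT instances; GENDER is the subsetting criterion.
students = [
    {"student_number": 1, "gender": "F"},
    {"student_number": 2, "gender": "M"},
    {"student_number": 3, "gender": "F"},
]

by_gender = subset_by(students, "gender")
```

Unlike a key, the attribute is not unique per instance; its small, finite set of values is what makes the subsets useful.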
19. System Concepts for Data Modeling
- Relationships
- Conceptually, entities and attributes do not exist in isolation.
- Entities interact with and impact one another via relationships to support the business mission.
- A relationship is a natural business association that exists between one or more entities. The relationship may represent an event that links the entities, or merely a logical affinity that exists between the entities.
- A connecting line between two entities on an ERD represents a relationship.
- A verb phrase describes the relationship.
- All relationships are implicitly bidirectional, meaning that they can be interpreted in both directions.
21. System Concepts for Data Modeling
- Relationships
- Cardinality
- Each relationship on an ERD also depicts the complexity or degree of that relationship; this is called cardinality.
- Cardinality defines the minimum and maximum number of occurrences of one entity for a single occurrence of the related entity. Because all relationships are bidirectional, cardinality must be defined in both directions for every relationship.
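Minimum and maximum cardinality can be illustrated by counting related occurrences in sample data. The sketch below is an assumption-laden example: the `observed_cardinality` function and the customer/order data are hypothetical, and counts observed in data only hint at the cardinality, which is ultimately set by business rules.

```python
from collections import Counter
from typing import Hashable, Sequence, Tuple


def observed_cardinality(parent_keys: Sequence[Hashable],
                         child_foreign_keys: Sequence[Hashable]) -> Tuple[int, int]:
    """Return (minimum, maximum) number of child occurrences per parent occurrence.

    parent_keys must not be empty.
    """
    counts = Counter(child_foreign_keys)
    per_parent = [counts.get(key, 0) for key in parent_keys]
    return min(per_parent), max(per_parent)


# Hypothetical data: three customers; each order records its customer's key.
customer_numbers = [1, 2, 3]
order_customer_numbers = [1, 1, 3]

orders_per_customer = observed_cardinality(customer_numbers, order_customer_numbers)
```

In this sample a customer places between zero and two orders. The opposite direction (customers per order) must be stated separately, as the slide notes.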
23. System Concepts for Data Modeling
- Relationships
- Degree
- The degree of a relationship is the number of entities that participate in the relationship.
- A binary relationship has a degree of 2, because two different entities participate in the relationship.
- Relationships may also exist between different instances of the same entity.
- This is called a recursive relationship (sometimes called a unary relationship, degree 1).
25. System Concepts for Data Modeling
- Relationships
- Degree (continued)
- Relationships can also exist between more than two different entities.
- These are sometimes called N-ary relationships.
- A relationship existing among three entities is called a 3-ary or ternary relationship.
- An N-ary relationship may be associated with an associative entity.
- An associative entity is an entity that inherits its primary key from more than one other entity (parents). Each part of that concatenated key points to one and only one instance of each of the connecting entities.
27. System Concepts for Data Modeling
- Relationships
- Foreign Keys
- A relationship implies that instances of one entity are related to instances of another entity.
- To be able to identify those instances for any given entity, the primary key of one entity must be migrated into the other entity as a foreign key.
- A foreign key is a primary key of one entity that is contributed to (duplicated in) another entity for the purpose of identifying instances of a relationship. A foreign key (always in a child entity) always matches the primary key (in a parent entity).
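The migration of a parent's primary key into a child entity can be sketched with Python's standard sqlite3 module. The `customer` and `customer_order` tables are assumptions for illustration. Note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON` has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default

# Parent entity: its primary key is customer_number.
conn.execute(
    "CREATE TABLE customer (customer_number INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
# Child entity: customer_number is duplicated here as a foreign key.
conn.execute(
    """
    CREATE TABLE customer_order (
        order_number    INTEGER PRIMARY KEY,
        customer_number INTEGER NOT NULL REFERENCES customer (customer_number)
    )
    """
)

conn.execute("INSERT INTO customer VALUES (100, 'Acme')")
conn.execute("INSERT INTO customer_order VALUES (1, 100)")  # matches a parent key

try:
    conn.execute("INSERT INTO customer_order VALUES (2, 999)")  # no such parent
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

The second order is rejected because its foreign key value does not match any primary key in the parent entity.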
29. System Concepts for Data Modeling
- Relationships
- Foreign Keys (continued)
- When you cannot differentiate between parent and child in a relationship, it is called a non-specific relationship.
- A non-specific relationship (or many-to-many relationship) is one in which many instances of one entity are associated with many instances of another entity. Such relationships are suitable only for preliminary data models, and should be resolved as quickly as possible.
- All non-specific relationships can be resolved into a pair of one-to-many relationships by inserting an associative entity between the two original entities.
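Resolving a many-to-many relationship with an associative entity can be sketched as follows, again with the standard sqlite3 module. The STUDENT, COURSE, and ENROLLMENT entities are assumptions chosen for the example; ENROLLMENT takes its concatenated primary key from the two parents.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript(
    """
    CREATE TABLE student (student_number INTEGER PRIMARY KEY);
    CREATE TABLE course  (course_code TEXT PRIMARY KEY);

    -- Associative entity: each part of its key is a foreign key to one parent.
    CREATE TABLE enrollment (
        student_number INTEGER NOT NULL REFERENCES student (student_number),
        course_code    TEXT    NOT NULL REFERENCES course (course_code),
        PRIMARY KEY (student_number, course_code)
    );

    INSERT INTO student VALUES (1), (2);
    INSERT INTO course  VALUES ('MATH101'), ('HIST200');
    INSERT INTO enrollment VALUES (1, 'MATH101'), (1, 'HIST200'), (2, 'MATH101');
    """
)

# One student, many enrollments; one course, many enrollments.
courses_for_student_1 = [
    row[0] for row in conn.execute(
        "SELECT course_code FROM enrollment WHERE student_number = 1 ORDER BY course_code"
    )
]
students_in_math101 = [
    row[0] for row in conn.execute(
        "SELECT student_number FROM enrollment WHERE course_code = 'MATH101' ORDER BY student_number"
    )
]
```

Each of the two original entities now has a one-to-many relationship to ENROLLMENT, and the many-to-many association is recovered by reading through it in either direction.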
31. System Concepts for Data Modeling
- Relationships
- Generalization
- Generalization is an approach that seeks to discover and exploit the commonalities between entities.
- Generalization is a technique wherein the attributes that are common to several types of an entity are grouped into their own entity, called a supertype.
- An entity supertype is an entity whose instances store attributes that are common to one or more entity subtypes.
- The entity supertype will have one or more one-to-one relationships to entity subtypes. These relationships are sometimes called IS A relationships (or WAS A, or COULD BE A) because each instance of the supertype is also an instance of one or more subtypes.
32. System Concepts for Data Modeling
- Relationships
- Generalization (continued)
- An entity subtype is an entity whose instances inherit some common attributes from an entity supertype, and then add other attributes that are unique to instances of the subtype.
- An entity can be both a supertype and a subtype.
- Through inheritance, the concept of generalization in data models permits the reduction of the number of attributes through the careful sharing of common attributes.
- The subtypes not only inherit the attributes, but also the data types, domains, and defaults of those attributes.
- In addition to inheriting attributes, subtypes also inherit relationships to other entities.
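Class inheritance offers a loose analogy for supertypes and subtypes. In the sketch below, the PERSON, STUDENT, and EMPLOYEE entities and their attributes are assumptions for illustration; a data model expresses the same idea with entities rather than classes.

```python
from dataclasses import dataclass, fields


@dataclass
class Person:
    """Supertype: holds the attributes common to all subtypes."""
    person_id: int
    name: str


@dataclass
class Student(Person):
    """Subtype: inherits person_id and name, then adds its own attribute."""
    major: str


@dataclass
class Employee(Person):
    """Subtype: inherits person_id and name, then adds its own attribute."""
    job_title: str


student_attributes = [f.name for f in fields(Student)]
ann = Student(person_id=1, name="Ann", major="History")
```

The common attributes are declared once, in the supertype, and every `Student` instance is also a `Person` instance, which mirrors the IS A relationship.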
34. The Process of Logical Data Modeling
- Strategic Data Modeling
- Many organizations select application development projects based on strategic information system plans.
- Strategic planning is a separate project.
- This project produces an information systems strategy plan that defines an overall vision and architecture for information systems.
- Almost always, the architecture includes an enterprise data model.
35. The Process of Logical Data Modeling
- Strategic Data Modeling
- An enterprise data model typically identifies only the most fundamental of entities.
- The entities are typically defined (as in a dictionary) but they are not described in terms of keys or attributes.
- The enterprise data model may or may not include relationships (depending on the planning methodology's standards and the level of detail desired by executive management).
- If relationships are included, many of them will be non-specific.
- The enterprise data model is usually stored in a corporate repository.
36. The Process of Logical Data Modeling
- Data Modeling During Systems Analysis
- The data model for a single system or application is usually called an application data model.
- Logical data models have a DATA focus and a SYSTEM USER perspective.
- Logical data models are typically constructed as deliverables of the study and definition phases of a project.
- Although logical data models are not concerned with implementation details or technology, they may be constructed (through reverse engineering) from existing databases.
- Data models are rarely constructed during the survey phase of systems analysis.
38. The Process of Logical Data Modeling
- Data Modeling During Systems Analysis
- Data modeling is rarely associated with the study phase of systems analysis. Most analysts prefer to draw process models to document the current system.
- Many analysts report that data models are far superior for the following reasons:
- Data models help analysts to quickly identify business vocabulary more completely than process models.
- Data models are almost always built more quickly than process models.
- A complete data model can fit on a single sheet of paper. Process models often require dozens of sheets of paper.
- Process modelers too easily get hung up on unnecessary detail.
39. The Process of Logical Data Modeling
- Data Modeling During Systems Analysis
- Many analysts report that data models are far superior for the following reasons (continued):
- Data models for existing and proposed systems are far more similar than process models for existing and proposed systems. Consequently, there is less work to throw away as you move into later phases.
- A study phase model should include only entities and relationships, but no attributes: a context data model.
- The intent is to refine the understanding of scope, not to get into details about the entities and business rules.
40. The Process of Logical Data Modeling
- Data Modeling During Systems Analysis
- The definition phase data model will be constructed in at least two stages:
- A key-based data model will be drawn.
- This model will eliminate non-specific relationships, add associative entities, and include primary, alternate, and foreign keys, plus precise cardinalities and any generalization hierarchies.
- A fully attributed data model will be constructed.
- The fully attributed model includes all remaining descriptive attributes and subsetting criteria.
- Each attribute is defined in the repository with data types, domains, and defaults.
- The completed data model represents all of the business requirements for a system's database.
41. The Process of Logical Data Modeling
- Looking Ahead to Systems Configuration and Design
- The logical data model from systems analysis describes business data requirements, not technical solutions.
- The purpose of the configuration phase is to determine the best way to implement those requirements with database technology.
- During system design, the logical data model will be transformed into a physical data model (called a database schema) for the chosen database management system.
- This model will reflect the technical capabilities and limitations of that database technology, as well as the performance tuning requirements suggested by the database administrator.
- The physical data model will also be analyzed for adaptability and flexibility through a process called normalization.
42. The Process of Logical Data Modeling
- Fact-Finding and Information Gathering for Data Modeling
- Data models cannot be constructed without appropriate facts and information as supplied by the user community.
- These facts can be collected by a number of techniques, such as sampling of existing forms and files; research of similar systems; surveys of users and management; and interviews of users and management.
- The fastest method of collecting facts and information, and simultaneously constructing and verifying the data models, is Joint Application Development (JAD).
44. The Process of Logical Data Modeling
- Computer-Aided Systems Engineering (CASE) for Data Modeling
- Data models are stored in the repository.
- In a sense, the data model is metadata; that is, data about the business data.
- Computer-aided systems engineering (CASE) technology provides the repository for storing the data model and its detailed descriptions.
45. The Process of Logical Data Modeling
- Computer-Aided Systems Engineering (CASE) for Data Modeling
- Using a CASE product, you can easily create professional, readable data models without the use of paper, pencil, erasers, and templates.
- The models can be easily modified to reflect corrections and changes suggested by end-users.
- Most CASE products provide powerful analytical tools that can check your models for mechanical errors, completeness, and consistency.
46. The Process of Logical Data Modeling
- Computer-Aided Systems Engineering (CASE) for Data Modeling
- Not all data model conventions are supported by all CASE products.
- It is very likely that any given CASE product will force the company to adapt its methodology's data modeling symbols or approach so that it is workable within the limitations of its CASE tool.
47. How to Construct Data Models
- 1st Step - Entity Discovery
- The first task in data modeling is to discover those fundamental entities in the system that are or might be described by data.
- There are several techniques that may be used to identify entities.
- During interviews or JAD sessions with system owners and users, pay attention to key words in their discussion.
- During interviews or JAD sessions, specifically ask the system owners and users to identify things about which they would like to capture, store, and produce information.
- Study existing forms and files.
- Some CASE tools can reverse engineer existing files and databases into physical data models.
48. How to Construct Data Models
- 1st Step - Entity Discovery
- A true entity has multiple instances: dozens, hundreds, thousands, or more!
- Entities should be named with nouns that describe the person, event, place, or tangible thing about which we want to store data.
- Try not to abbreviate or use acronyms.
- Names should be singular so as to distinguish the logical concept of the entity from the actual instances of the entity.
- Define each entity in business terms.
- Don't define the entity in technical terms, and don't define it as "data about ...".
- Your entity names and definitions should establish an initial glossary of business terminology that will serve both you and future analysts and users for years to come.
50. How to Construct Data Models
- 2nd Step - The Context Data Model
- The second task in data modeling is to construct the context data model.
- The context data model includes the fundamental or independent entities that were previously discovered.
- An independent entity is one which exists regardless of the existence of any other entity. Its primary key contains no attributes that would make it dependent on the existence of another entity.
- Independent entities are almost always the first entities discovered in your conversations with the users.
- Relationships should be named with verb phrases that, when combined with the entity names, form simple business sentences or assertions.
- Always name the relationship from parent to child.
52. How to Construct Data Models
- 3rd Step - The Key-Based Data Model
- The third task is to identify the keys of each entity.
- The following guidelines are suggested for keys:
- The value of a key should not change over the lifetime of each entity instance.
- The value of a key cannot be null.
- Controls must be installed to ensure that the value of a key is valid.
53. How to Construct Data Models
- 3rd Step - The Key-Based Data Model
- The following guidelines are suggested for keys (continued):
- Some experts suggest that you avoid intelligent keys because the key may change over the lifetime of the entity instance.
- An intelligent key is a business code whose structure communicates data about an entity instance (such as its classification, size, or other properties).
- A code is a group of characters and/or digits that identifies and describes something in the business system.
- Other experts suggest that you not avoid intelligent keys, because business codes can return value to the organization: they can be quickly processed by humans without the assistance of a computer.
54. How to Construct Data Models
- 3rd Step - The Key-Based Data Model
- The following guidelines are suggested for keys (continued):
- Consider inventing a surrogate key to substitute for large concatenated keys of independent entities.
- This suggestion is not practical for associative entities because each part of the concatenated key is a foreign key that must precisely match its parent entity's primary key.
- If you cannot define keys for an entity, it may be that the entity doesn't really exist; that is, multiple occurrences of the so-called entity do not exist.
55. How to Construct Data Models
- 3rd Step - The Key-Based Data Model
- Business Codes
- There are several types of codes, and they can be combined to form effective means for entity instance identification.
- Serial codes assign sequentially generated numbers to entity instances.
- Many database management systems can generate and constrain serial codes to a business's requirements.
- Block codes are similar to serial codes except that serial numbers are divided into groups that have some business meaning.
- Alphabetic codes use finite combinations of letters (and possibly numbers) to describe entity instances.
- Alphabetic codes must usually be combined with serial or block codes in order to uniquely identify instances of most entities.
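Serial and block codes can be sketched in a few lines. Everything named here is an assumption for illustration: the `serial_codes` generator, the `BLOCKS` table with its BASIC and PREMIUM ranges, and the `next_block_code` function.

```python
from itertools import count
from typing import Iterator, Set


def serial_codes(start: int = 1) -> Iterator[int]:
    """Serial code: sequentially generated numbers, one per new instance."""
    return count(start)


# Block code: the serial range is divided into groups with business meaning.
# These two blocks are hypothetical.
BLOCKS = {
    "BASIC": range(100, 200),
    "PREMIUM": range(200, 300),
}


def next_block_code(block: str, used: Set[int]) -> int:
    """Return the lowest unused number inside the named block."""
    for code in BLOCKS[block]:
        if code not in used:
            return code
    raise ValueError(f"block {block} is exhausted")


codes = serial_codes()
first, second = next(codes), next(codes)
premium_code = next_block_code("PREMIUM", used={200, 201})
```

The fixed size of each block also shows why the later guideline about expandable codes matters: a block that fills up leaves no room for growth.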
56. How to Construct Data Models
- 3rd Step - The Key-Based Data Model
- Business Codes
- There are several types of codes, and they can be combined to form effective means for entity instance identification. (continued)
- In significant position codes, each digit or group of digits describes a measurable or identifiable characteristic of the entity instance.
- Significant digit codes are frequently used to code inventory items.
- Hierarchical codes provide a top-down interpretation for an entity instance.
- Every item coded is factored into groups, subgroups, and so forth.
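A significant position code can be decoded by position alone. The layout below is purely an assumption for illustration (two characters for category, two digits for size, one character for color); real coding schemes are defined by the business.

```python
from typing import Dict, Union


def decode_item_code(code: str) -> Dict[str, Union[str, int]]:
    """Split a hypothetical 5-character inventory code into its meaningful positions."""
    if len(code) != 5 or not code[2:4].isdigit():
        raise ValueError(f"malformed item code: {code!r}")
    return {
        "category": code[0:2],   # positions 1-2: product category
        "size": int(code[2:4]),  # positions 3-4: size
        "color": code[4],        # position 5: color
    }


decoded = decode_item_code("TS42B")
```

Because each position carries meaning, a person can read such a code without a computer, which is the benefit the guidelines attribute to business codes.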
57. How to Construct Data Models
- 3rd Step - The Key-Based Data Model
- Business Codes
- The following guidelines are suggested when creating a business coding scheme:
- Codes should be expandable to accommodate growth.
- The full code must result in a unique value for each entity instance.
- Codes should be large enough to describe the distinguishing characteristics, but small enough to be interpreted by people without a computer.
- Codes should be convenient. A new instance should be easy to create.
59. How to Construct Data Models
- 4th Step - Generalized Hierarchies
- At this time, it would be useful to identify any generalization hierarchies in a business problem.
61. How to Construct Data Models
- 5th Step - The Fully Attributed Data Model
- The fifth task is to identify the remaining data attributes.
- The following guidelines are offered for attribution:
- Many organizations have naming standards and approved abbreviations.
- The data or repository administrator usually maintains such standards.
- Many attributes share common base names such as NAME, ADDRESS, and DATE.
- Unless the attributes can be generalized into a supertype, it is best to give each variation a unique name, such as CUSTOMER NAME vs. SUPPLIER NAME.
- Names must be distinguishable across projects.
- Logical attribute names should not be abbreviated.
62. How to Construct Data Models
- 5th Step - The Fully Attributed Data Model
- The following guidelines are offered for attribution (continued):
- For attributes that have only YES or NO values, name them as questions.
- For example, CANDIDATE FOR A DEGREE?
- Each attribute should be mapped to only one entity.
- Foreign keys are the exception; they identify associated instances of related entities.
- An attribute's domain should not be based on logic.
64. How to Construct Data Models
- 6th Step - The Fully Described Model
- The last task is to fully describe the data model.
- This task is the most time-consuming.
- This task can be started in parallel with the key-based model or fully attributed model, but it is usually the last data modeling task completed.
- At this time the descriptions for the attributes are still incomplete; they require domains.
- Most CASE tools provide extensive facilities for describing the data types, domains, and defaults for all attributes to the repository.
65. How to Construct Data Models
- 6th Step - The Fully Described Model
- Additional descriptive properties may be recorded for attributes, such as:
- Who should be able to create, delete, update, and access each attribute?
- How long should each attribute (or entity) be kept before the data is deleted or archived?
66. The Next Generation
- Data modeling should remain a value-added skill for many years.
- The demand for data modeling as a skill is dependent on two factors:
- (1) the need for databases, and
- (2) the use of relational database management system technology to implement those databases.
- There is some belief that relational database technology will eventually be replaced by object technology.
- If that were to happen, data modeling would be replaced by object modeling techniques.
- Even as object database technology becomes available, we expect the relational database industry to add object features and technologies to their product lines.
67. The Next Generation
- CASE technology will continue to improve.
- Today's better CASE tools provide a two-way synchronization between the logical data models and their database designs.
- This synchronization will likely extend as CASE vendors enable their tools to directly communicate and interoperate with database management systems and working databases.
68. Summary
- Introduction
- An Introduction to Systems Modeling
- System Concepts for Data Modeling
- The Process of Logical Data Modeling
- How to Construct Data Models
- The Next Generation