Title: Chapter%207:%20Analyzing%20Systems%20Using%20Entity%20Relationship%20Diagrams
1Chapter 7 Analyzing Systems Using Entity
Relationship Diagrams
2Analyzing Systems Using Entity Relationship
Diagrams
- Data Model What, Why, When?
- Pre-modeling Activities
- Conceptual Data Modeling What, Why, When, Who?
- Data Modeling Approach
- Logical Data Modeling What, Why, When, Who?
- Data Warehousing Data Modeling
3Whats a Data Model?
- A data model is a collection of constructs,
business rules - and sample data which together supports a dynamic
- representation of real world objects and events.
- Constructs entity-relationship diagrams
(conceptual logical or functional) and tables
(tables) - Business rules constraints such as referential
integrity rules and operators (add, update,
delete) - Sample Data for verification and prototyping
Verifying the accuracy of the model.
4Objectives
- Testing the real system before building it.
- Assisting in understanding data requirements.
- Facilitating physical data base design.
5Types of Models
- Conceptual Data Model
- Logical Data Model
- Physical Data Model
6Pre-modeling Activities
- Modeling format (JAD session or something else)
- Roles and responsibilities
- Tools and repository
- Naming standards
- Modeling convention (What methods to use)
- Data protection/Backup and recovery procedures
7Conceptual Data Modeling What, Why, When, Who?
- What is it?
- It is a conceptual representation of data
without concern for its logical (functional) or
physical aspects. - It is a set of high-level business data models
which provides a framework for the data modeling
activities at the next level.
8Conceptual Data Modeling What, Why, When, Who?
- When should it be done?
- In support of the data requirements of a process
model under development at the corresponding
level. - or
- 2. Outside the system application lifecycle on
a department, division, or company wide basis.
9Conceptual Data Modeling What, Why, When, Who?
- Who should do it?
- The group responsible for assuring that data
structure reflects business policies and rules. -
- 2. It should be a joint effort between the
owners and custodians of the data, the users of
the data, and the analysts.
10Conceptual Data Modeling Why?
- Documents the type of data (information) which
must be represented in a system independent of
specific application, organizations, or
technology. - Maximizes data sharing minimizing redundancy.
- Provides foundations for physical database
design. - Describes the unique business enterprise
specifically
11Conceptual Data Modeling Why?
- Outside of application life cycle on a
company-wide basis. - Data modeling expresses inherent associations
which are the most part, independent of anyone
one application. - Data entities change very little even through the
way they are used can change for each
application. - A complete maintained conceptual data model
should shorten the requirements definition phases
of system development life cycle.
12Data Modeling Approach
- Data partitioning
- Use a top-down approach to define the data
requirements of a system. The purpose is to
divide and conquer (from subject to entity), and
to evolve from the conceptual level to logical
level until physical database is derived. - Standard deliverables
- For each of the levels, there is a set of
standard deliverables that must be produced. The
documentation items must be well defined so that
the data at each level is well understood.
13Data Partitioning
(How)
(What, Why, Who, Where)
Subjects
Conceptual Level
Technical considerations
Entities
(Data access/performance Data integrity/recovery)
Relationships
Logical Level
Data Elements
Frequencies
Data Definition Language -DDL(create, Alter, drop
tables)
Data Manipulation Language (select, insert,
delete, update)
14Data Partitioning
- 1. Define the system boundary Data Context
Diagram. - 2. Partition a subject into entities.
- 3. Discover entities -super-type and subtype
part and whole. - 4. Define associations between entities.
- 5. Define major attributes.
- 6. Define unique identifier for the entity.
- 7. Assign cardinalities and set up
relationships between - the entities.
- 8. Define integrity rules and apply
normalization rules. - 9. Validate the model.
- 10. Complete and standardize the data
elements. -
15Understand terms and Terminology
- Independent entity (fundamental entity)
- Dependent entity (Attribute entity)
- Associative Entity
- Identifying relationship
- Non-identifying relationship
16Understand terms and Terminology
- Identifier
- An attribute distinctly identifies each
occurrence of an entity. - For ex., bank account Id. , and student Id.
- Association (relationships)
- An association is a relationship between two
or more entities. - Employee works for company
- Part has item
17Understand terms and Terminology
- Cardinality(the form of relationship)
- Associations occur in there forms
- one-to-one one-to-many many-to-many
- Tables
- A table is a two-dimensional representation
of data consisting of columns and rows. - Primary Key
- Used to identify entities.
- Unique identification for a row in a table.
- Allow no nulls and no duplicates.
- May be system assigned.
18Understand terms and Terminology
- Foreign key
- A foreign key is one or more data elements
whose value is based on the primary identifier of
another entity, thus allowing the system to
join and get related information from other
entities. The joining of different entities in
this manner eliminates the need of data
repetition and redundancy. - Normalization
- A technique to make sure that the data in a
logical model is defined once and only once.
Normalization helps minimum data redundancy, and
minimize update abnormalities.
191. Define the System Boundary by Subject Context
Diagram
- A subject (a group of entities with strong
affinity) is a - class of data objects representing the mission
and - resources of the organization.
- Subjects provides mechanisms for controlling
how - much of a model a reader (user, analyst, manager)
is able - to consider and comprehend at a time. For a small
system, - go directly to define entities.
- For ex Library (subject) decomposed into book,
member, - and account. (entities).
20ATM ERD Subject Context Level
Customer
ATM
uses
Has
Owns
Bank Consortium
Account
Affiliated Bank
Holds
Consists of
21Data (Subject) Context Diagram
- A Data context diagram is a special case of ERD
in which a single diagram represents the problem
domain in terms of data requirements. - For example The ATM diagram illustrates and
highlights - Several important characteristics of the system
- The people and organization with which the
- system communicates.
- It documents the significant connections
(relationships) between the data objects within
as well as outside the problem domain.
222. Partition a Subject into Entities
- Criteria for partitioning a subject into entities
are - The entities included in a subject all tend to
describe the subject and have a strong affinity
with the subject. - The entities of a subject should be of equal
importance, as measured by the range,
complexities or importance of their data. - Each entity should belong to only one subject.
233. Discover Entities
- To find potential entities (entities are nouns),
look for - Objects can be generalized or specialized
- The entities of a subject should be of equal
importance, as measured by the range,
complexities or importance of their data. - Each entity should belong to only one subject
24Subtype and Super-type
Faculty
Full Time Faculty
Part Time Faculty
Part (day) Time Faculty
Part (Night)Time Faculty
25Generalization Specialization
Aircraft
Specialization
Generalization
Commercial
Military
B52
B-1B
747
777
264. Define Associations Between Entities
- Use a verb to describe associations between two
or more - entities that the user wants to keep track of.
Each - relationship must be specific as possible so that
its - meaning is clear.
- An individual owns a building.
- Owns possession, rental or management?
275. Define Major Attributes
- All the attributes of an entity must have meaning
for each - and every one of the occurrence of the entity.
Only - elementary data are included in the model.
Attributes - resulting from process algorithms should not be
included - in the model.
- For ex. Entity individual has attributes such
as name, - address sex (male or female) no. of dependents,
etc. But - Derived attributes (such as percentage) are not
- attributes.
286. Define Unique Identifier for the Entity
- This is an attribute that unambiguously
identifies each - occurrence of each entity. Some entities do no
have their - own identifier. These entities are qualified as
dependent - or week entities. The identifier of the entity to
which the - dependent entities are associated with must be
used to - uniquely identify their occurrences.
- For ex., a child entity must have his or her
parent - identifier to be uniquely identified.
297. Assigning Cardinalities Setting Up
Relationships Between the Entities
- Cardinality is the minimum and maximum number of
- times an occurrence of an entity occurs in
relationship to - another entity.
- The minimum number 0 or 1
- The maximum number 1 or M
- There are three types of associations
- One-to-one one-to-many many to many.
30Continued
- The relationships could be either as binary,
recursive, or - ternary.
Doctor
Patient
Binary
Part
Order
Ternary
Part/Order
Recursive
Organization
31Recursive Association
A recursive association is one in which there is
a relationship between An entity and itself.
Information Dev
Accounting
Engineering
Product
Payable
Receivable
Facility
Nuclear
Coal
32Continued
- A many-to-many relationships will result in the
creation - of a new entity.
Order Order
Part Part
1M
1M
Part/Order Part /Order
338. Define Integrity Rules and Apply Normalization
Rules
- Integrity rules for entities indicate the context
in which an - entity occurrence may be created, modified, or
deleted. - They also ensure that the entity is consistent
with other - entities.
- For example, a Client (entity) holds an
Account (entity). A client cannot be deleted if
at least one of his accounts has a balance
greater than 0.
34Defining Integrity Rules
- Integrity rules for relationships indicate in
which context - the relationship is meaningful. They also ensure
that it is - consistent with other relationships.
- This is accomplished by placing referential
attribute in - appropriate entity on the model.
35Formalizing a One-to-one Relationship with
Referential Attribute
Husband
Wife
Husband name Other attributes
Wife name Other attributes Husband name
Married to
Referential Attribute
36Formalizing a One-to-Many Relationship with
Referential Attribute
Dog
Dog Owner
(1M)
(11)
Dog Id Other attributes Dog Owner Id
Dog Owner Dog Owner Id Other attributes
Referential Attribute
37Formalizing a Many-to-Many Relationship with
Referential Attribute
Part
Order
Part Id Other attributes
Order No Other attributes
An associative entity may Participate in
relationship With other entity.
Order/Part Order No Part Id Other attributes
Referential Attributes
38Referential Integrity
- Three options
- Restrict A primary key can not be deleted if
there are any dependent foreign key rows. - Cascade Deleting a primary key row causes the
deletion of all dependent foreign key rows. - Set Null Deleting a primary key row causes all
dependent foreign keys values to be set null.
39Apply Normalization Rules
- A technique to make sure the data in a logical
data models - is defined once and only once. Normalization
helps - minimum data redundancy, and minimize update
- abnormalities. Three forms
- First Normal Form
- Second Normal Form
- Third Normal Form
40Normalization
- First Normal Form Relationships between primary
key and each attribute must be one-to-one ie.,
remove - repeating group.
- Second Normal Form All non-key elements are
dependent upon the entire primary key rather than
any part thereof. - Third Normal Form Elimination of the dependence
of non-key field upon any other field excepts the
primary keys.
41First Normal Form
Item
Qty-Store-3
Qty-Store-2
Qty-Store-1
Item No
PK
3000
4000
5000
101
The above is an violation of first normal form
because there exists a repeated group.
42Rule Number 1
- For each occurrence of an entity, there is only
one and only one value for each its attributes.
Attributes with repeating values form at least
one new entity. - N other words, relationship between primary key
and each attribute must be one-to-one.
43Possible Solution
Store
Store/Item
Store ID
Store ID
Item- No
Qty Sold
PK
PK
FK
FK
S1
S1
3000
101
S2
S2
102
4000
44Second Normal Form
Student/Course
Course Name
Course No
Student No
Teacher code
Grade
Student Name
PK
FK
FK
FK
3.0
Math
ST01
100
T2
Lee
ST02
4.0
200
T1
CS
Doe
Both course name and student name should be
removed because They are not related to the
entire student/course primary key.
45Possible Solution
Course
Student No
Course Name
Student Name
Student
Course No
Student/Course
46Rule Number 2
- Each attribute must be related to the entire
primary key.
47Second Normal Process
Order
Part
Part Name
Order No
Pt-price
PartNo
Order-Dt
PK
PK
1/2/01
Nut
1
1
1.5
1/3/01
5
Bolts
2.0
3
Order/Part
Partno
Order No
QTY
How about Putting PartName In Order/part Table?
PK
1
123
1
1
5
3
123
48Third Normal Form
COURSE
Course Id
Teacher Code
Course Name
Dept Name
Teacher Name
Dept -Id
PK
T1
DOE
MH400
Math
Math
A1
CS
DB
CS401
T2
Lee
CS
The relationship between any two non-primary key
components must not be one-to-one. Whats wrong
with the above?
49Rule Number 3
- The relationship between any two non-primary key
components must not be one-t-one ie., remove
tables within tables.
50The Normal Process
Order
Customer
Cust-Name
Order ID
Order DT
Cust-Id
Cust-Id
PK
PK
FK
1
Lee
1
1/2/ 01
1
3
Sato
1/5/21
3
5
It would be a violation of third normal form to
place cust-name in the order table.
51Why
- Reasons
- One-to-one relationship between two non-reprimary
key columns (Cus-Id and Cust-name). - Redundancy
- An update anomaly (when a customer name was
changed) - Worse yet when a new name was added (the name
could not be stored until the customer placed at
least one order)
529. Validate the Model With These Check Points
- Attribute allocation
- Rules followed
- Cardinality necessity
- Relationship necessity
- Achievement of organization goals
- Contributions to expected benefits
5310. Complete and Standardize the Data Elements
- As a result of the modeling process via the
preceding - procedure, an information model will emerge.
- The information model can be used as input (for
ex., via - Erwin Tool) to generate a data definition
language (DDL) - which in turn is used as input for physical data
model. - The information model (also called logical data
model) - fulfills the data requirements of the system.
54Logical Data Modeling What, Why, When, Who?
- What is it?
- It is a representation of data required to
support the complete business needs for a
particular business area, system or project. - It is a set of data models that provides a
framework for the physical database construction
activities. - It is a graphical representation of data objects
that shows the relationship between the tables,
views and functional core services used by
modules in the application system.
55Logical Data Modeling What, Why, When, Who?
- Why do it?
- Document the type of data which must be
represented in a system with regard to specific
system applications, organizations, or
technologies. - Assist in the orderly creation of a physical
database design. - Specifically describe the unique business
enterprise. - Accelerate and clarify communications between the
functional analysis and DBAs.
56Logical Data Modeling What, Why, When, Who?
- When should it be done?
- Part of the system application lifecycle.
- In parallel with process modeling activities.
- Upon the completion of the conceptual data model
to produce a first-cut database design that
includes definitions of tables, columns, and
constraints.
57Logical Data Modeling What, Why, When, Who?
- Who should do it?
- The group responsible for ensuring that data
structure reflects business data requirements. - It should be a joint effort between the
functional analysts and data administrators.
58Logical Data Modeling What, Why, When, Who?
- Benefits
- Provide a definition of the data architecture of
how the target system will be implemented. - Model parts of the database schema that show how
data structures are related to the processes. - Provide program designers with the detail for the
part of the database design that their modules
use.
59Logical Data Modeling Activity Description
- Define Data Architectural Standards related to
data security, logging, archiving, backup and
update collision, etc. - 2. Position the conceptual data modeling and
revise the definitions of the entities. By taking
architectural standards into consideration, the
complete CDM is reexamined. As a result, new
entities and relationships my be discovered. - 3. Partition data into entities at the table
level.
60Logical Data Modeling Activity Description
- 4. Define integrity rules for entities and
relationships. Integrity rules for entities
indicate the context in which an entity
occurrence may be created, modified, or deleted.
This is accomplished by placing referential
attributes in each appropriate entity on the
model. - 5. Apply normalization rules to each entity.
61Logical Data Modeling Activity Description
- Complete and standardize the data elements.
- Package the model for physical data modeling and
system construction. - Evaluate quality of data for conversion.
- If the data modeling is part of
re-engineering efforts, we must also document - Condition of the data of the existing system
- Impacts on the new and enhanced system.
- Conversion rules
62Logical Data Modeling Validation and Verification
- Validation (dynamic0
- Prototyping is used to validate and refine
the model. - Verification (static)
- Inspection or walk-through.
63PK Primary KeyFK Foreign KeyNN No NullND
No duplicate
Order
Part
Relationship
Order/Part
64Define Data Architectural Standards
- Data Access/retrieval guidelines
- Naming convention SQL coding standards
- Data integrity (package triggercommit
rollback) - Error handling record locking rule update
collision. - Data Security
- Data access rule data separation rule
- Data recovery/backup
- Data base refresh/performance tuning
65Data Warehousing Data Modeling
- Logical Data Modeling vs. Data Warehousing Data
Modeling - Some Definitions
66Logical Data Modeling vs. Data Warehousing Data
Modeling
Logical DM Data warehouse DW
Focus Business operations (e.g., what orders from the two months backlog are scheduled to be shipped today?). Business intelligence (e.g., how many orders, by product, were shipped more than two months late over the past year?).
content Transaction driven allow updates to reflect current business transactions dynamic Analysis driven query primarily historical summarized derived
access Structured queries Ad hoc queries. Used to drill-down into data for analysis
Performance Performance sensitive Functionality sensitive
67SOME DEFINITIONS
- FACT TABLE A Fact Table is a table in a
relational - database with a multi-part key. Each element
of the key is itself a foreign key to a single
dimension tale. - Dimension Tables
- They are the constraints used in forming the
fact table.
68SOME DEFINITIONS
- Star Schema (or Star Joint Schema)
- A specific organization of a database in
which a fact - table with a composite key is joined to a
number of single-level dimension tables, each
with a single, primary key - -- Kimball Ralph, Data Warehouse Toolkit ---
69A STAR JOIN SCHEMA
Fact Table
Times
Food Item
Sales
Food Item Key Food Item Desc Qty
time key day of week quarter year
Food Item Key Profile Key Time Key YTD_Sales_dolla
rs YTD_Sales_qty
YTD_
Member Profile
Profile key Profile desc Territory
Demographics
Demographic Key
Age category
Cluster 1 Population
Income category
Cluster 2 Population
70Final Words
- Transform data into information by understanding
the process - Transform information into decisions with
knowledge - Transform decisions into results with actions