Chapter%207:%20Analyzing%20Systems%20Using%20Entity%20Relationship%20Diagrams - PowerPoint PPT Presentation

About This Presentation
Title:

Chapter%207:%20Analyzing%20Systems%20Using%20Entity%20Relationship%20Diagrams

Description:

... a single diagram represents the problem domain in terms of data requirements. ... Entity individual' has attributes such as name, ... – PowerPoint PPT presentation

Number of Views:160
Avg rating:3.0/5.0
Slides: 71
Provided by: paul53
Category:

less

Transcript and Presenter's Notes

Title: Chapter%207:%20Analyzing%20Systems%20Using%20Entity%20Relationship%20Diagrams


1
Chapter 7 Analyzing Systems Using Entity
Relationship Diagrams
  • Instructor Paul K Chen

2
Analyzing Systems Using Entity Relationship
Diagrams
  • Data Model What, Why, When?
  • Pre-modeling Activities
  • Conceptual Data Modeling What, Why, When, Who?
  • Data Modeling Approach
  • Logical Data Modeling What, Why, When, Who?
  • Data Warehousing Data Modeling

3
Whats a Data Model?
  • A data model is a collection of constructs,
    business rules
  • and sample data which together supports a dynamic
  • representation of real world objects and events.
  • Constructs entity-relationship diagrams
    (conceptual logical or functional) and tables
    (tables)
  • Business rules constraints such as referential
    integrity rules and operators (add, update,
    delete)
  • Sample Data for verification and prototyping
    Verifying the accuracy of the model.

4
Objectives
  • Testing the real system before building it.
  • Assisting in understanding data requirements.
  • Facilitating physical data base design.

5
Types of Models
  • Conceptual Data Model
  • Logical Data Model
  • Physical Data Model

6
Pre-modeling Activities
  • Modeling format (JAD session or something else)
  • Roles and responsibilities
  • Tools and repository
  • Naming standards
  • Modeling convention (What methods to use)
  • Data protection/Backup and recovery procedures

7
Conceptual Data Modeling What, Why, When, Who?
  • What is it?
  •  It is a conceptual representation of data
    without concern for its logical (functional) or
    physical aspects.
  • It is a set of high-level business data models
    which provides a framework for the data modeling
    activities at the next level.

8
Conceptual Data Modeling What, Why, When, Who?
  • When should it be done?
  • In support of the data requirements of a process
    model under development at the corresponding
    level.
  • or
  • 2. Outside the system application lifecycle on
    a department, division, or company wide basis.

9
Conceptual Data Modeling What, Why, When, Who?
  • Who should do it?
  • The group responsible for assuring that data
    structure reflects business policies and rules.
  • 2. It should be a joint effort between the
    owners and custodians of the data, the users of
    the data, and the analysts.

10
Conceptual Data Modeling Why?
  • Documents the type of data (information) which
    must be represented in a system independent of
    specific application, organizations, or
    technology.
  • Maximizes data sharing minimizing redundancy.
  • Provides foundations for physical database
    design.
  • Describes the unique business enterprise
    specifically

11
Conceptual Data Modeling Why?
  • Outside of application life cycle on a
    company-wide basis.
  • Data modeling expresses inherent associations
    which are the most part, independent of anyone
    one application.
  • Data entities change very little even through the
    way they are used can change for each
    application.
  • A complete maintained conceptual data model
    should shorten the requirements definition phases
    of system development life cycle.

12
Data Modeling Approach
  • Data partitioning
  • Use a top-down approach to define the data
    requirements of a system. The purpose is to
    divide and conquer (from subject to entity), and
    to evolve from the conceptual level to logical
    level until physical database is derived.
  • Standard deliverables
  • For each of the levels, there is a set of
    standard deliverables that must be produced. The
    documentation items must be well defined so that
    the data at each level is well understood.

13
Data Partitioning
(How)
(What, Why, Who, Where)
Subjects
Conceptual Level
Technical considerations
Entities
(Data access/performance Data integrity/recovery)
Relationships
Logical Level
Data Elements
Frequencies
Data Definition Language -DDL(create, Alter, drop
tables)
Data Manipulation Language (select, insert,
delete, update)
14
Data Partitioning
  • 1. Define the system boundary Data Context
    Diagram.
  • 2. Partition a subject into entities.
  • 3. Discover entities -super-type and subtype
    part and whole.
  • 4. Define associations between entities.
  • 5. Define major attributes.
  • 6. Define unique identifier for the entity.
  • 7. Assign cardinalities and set up
    relationships between
  • the entities.
  • 8. Define integrity rules and apply
    normalization rules.
  • 9. Validate the model.
  • 10. Complete and standardize the data
    elements.

15
Understand terms and Terminology
  • Independent entity (fundamental entity)
  • Dependent entity (Attribute entity)
  • Associative Entity
  • Identifying relationship
  • Non-identifying relationship

16
Understand terms and Terminology
  • Identifier
  • An attribute distinctly identifies each
    occurrence of an entity.
  • For ex., bank account Id. , and student Id.
  • Association (relationships)
  • An association is a relationship between two
    or more entities.
  • Employee works for company
  • Part has item

17
Understand terms and Terminology
  • Cardinality(the form of relationship)
  • Associations occur in there forms
  • one-to-one one-to-many many-to-many
  • Tables
  • A table is a two-dimensional representation
    of data consisting of columns and rows.
  • Primary Key
  • Used to identify entities.
  • Unique identification for a row in a table.
  • Allow no nulls and no duplicates.
  • May be system assigned.

18
Understand terms and Terminology
  • Foreign key
  • A foreign key is one or more data elements
    whose value is based on the primary identifier of
    another entity, thus allowing the system to
    join and get related information from other
    entities. The joining of different entities in
    this manner eliminates the need of data
    repetition and redundancy.
  • Normalization
  • A technique to make sure that the data in a
    logical model is defined once and only once.
    Normalization helps minimum data redundancy, and
    minimize update abnormalities.

19
1. Define the System Boundary by Subject Context
Diagram
  • A subject (a group of entities with strong
    affinity) is a
  • class of data objects representing the mission
    and
  • resources of the organization.
  • Subjects provides mechanisms for controlling
    how
  • much of a model a reader (user, analyst, manager)
    is able
  • to consider and comprehend at a time. For a small
    system,
  • go directly to define entities.
  • For ex Library (subject) decomposed into book,
    member,
  • and account. (entities).

20
ATM ERD Subject Context Level
Customer
ATM
uses
Has
Owns
Bank Consortium
Account
Affiliated Bank
Holds
Consists of
21
Data (Subject) Context Diagram
  • A Data context diagram is a special case of ERD
    in which a single diagram represents the problem
    domain in terms of data requirements.
  • For example The ATM diagram illustrates and
    highlights
  • Several important characteristics of the system
  • The people and organization with which the
  • system communicates.
  • It documents the significant connections
    (relationships) between the data objects within
    as well as outside the problem domain.

22
2. Partition a Subject into Entities
  • Criteria for partitioning a subject into entities
    are
  • The entities included in a subject all tend to
    describe the subject and have a strong affinity
    with the subject.
  • The entities of a subject should be of equal
    importance, as measured by the range,
    complexities or importance of their data.
  • Each entity should belong to only one subject.

23
3. Discover Entities
  • To find potential entities (entities are nouns),
    look for
  • Objects can be generalized or specialized
  • The entities of a subject should be of equal
    importance, as measured by the range,
    complexities or importance of their data.
  • Each entity should belong to only one subject

24
Subtype and Super-type
Faculty
Full Time Faculty
Part Time Faculty
Part (day) Time Faculty
Part (Night)Time Faculty
25
Generalization Specialization
Aircraft
Specialization
Generalization
Commercial
Military
B52
B-1B
747
777
26
4. Define Associations Between Entities
  • Use a verb to describe associations between two
    or more
  • entities that the user wants to keep track of.
    Each
  • relationship must be specific as possible so that
    its
  • meaning is clear.
  • An individual owns a building.
  • Owns possession, rental or management?

27
5. Define Major Attributes
  • All the attributes of an entity must have meaning
    for each
  • and every one of the occurrence of the entity.
    Only
  • elementary data are included in the model.
    Attributes
  • resulting from process algorithms should not be
    included
  • in the model.
  • For ex. Entity individual has attributes such
    as name,
  • address sex (male or female) no. of dependents,
    etc. But
  • Derived attributes (such as percentage) are not
  • attributes.

28
6. Define Unique Identifier for the Entity
  • This is an attribute that unambiguously
    identifies each
  • occurrence of each entity. Some entities do no
    have their
  • own identifier. These entities are qualified as
    dependent
  • or week entities. The identifier of the entity to
    which the
  • dependent entities are associated with must be
    used to
  • uniquely identify their occurrences.
  • For ex., a child entity must have his or her
    parent
  • identifier to be uniquely identified.

29
7. Assigning Cardinalities Setting Up
Relationships Between the Entities
  • Cardinality is the minimum and maximum number of
  • times an occurrence of an entity occurs in
    relationship to
  • another entity.
  • The minimum number 0 or 1
  • The maximum number 1 or M
  • There are three types of associations
  • One-to-one one-to-many many to many.

30
Continued
  • The relationships could be either as binary,
    recursive, or
  • ternary.

Doctor
Patient
Binary
Part
Order
Ternary
Part/Order
Recursive
Organization
31
Recursive Association
A recursive association is one in which there is
a relationship between An entity and itself.
Information Dev
Accounting
Engineering
Product
Payable
Receivable
Facility
Nuclear
Coal
32
Continued
  • A many-to-many relationships will result in the
    creation
  • of a new entity.

Order Order
Part Part
1M
1M
Part/Order Part /Order
33
8. Define Integrity Rules and Apply Normalization
Rules
  • Integrity rules for entities indicate the context
    in which an
  • entity occurrence may be created, modified, or
    deleted.
  • They also ensure that the entity is consistent
    with other
  • entities.
  • For example, a Client (entity) holds an
    Account (entity). A client cannot be deleted if
    at least one of his accounts has a balance
    greater than 0.

34
Defining Integrity Rules
  • Integrity rules for relationships indicate in
    which context
  • the relationship is meaningful. They also ensure
    that it is
  • consistent with other relationships.
  • This is accomplished by placing referential
    attribute in
  • appropriate entity on the model.

35
Formalizing a One-to-one Relationship with
Referential Attribute
Husband
Wife
Husband name Other attributes
Wife name Other attributes Husband name
Married to
Referential Attribute
36
Formalizing a One-to-Many Relationship with
Referential Attribute
Dog
Dog Owner
(1M)
(11)
Dog Id Other attributes Dog Owner Id
Dog Owner Dog Owner Id Other attributes
Referential Attribute
37
Formalizing a Many-to-Many Relationship with
Referential Attribute
Part
Order
Part Id Other attributes
Order No Other attributes
An associative entity may Participate in
relationship With other entity.
Order/Part Order No Part Id Other attributes
Referential Attributes
38
Referential Integrity
  • Three options
  • Restrict A primary key can not be deleted if
    there are any dependent foreign key rows.
  • Cascade Deleting a primary key row causes the
    deletion of all dependent foreign key rows.
  • Set Null Deleting a primary key row causes all
    dependent foreign keys values to be set null.

39
Apply Normalization Rules
  • A technique to make sure the data in a logical
    data models
  • is defined once and only once. Normalization
    helps
  • minimum data redundancy, and minimize update
  • abnormalities. Three forms
  • First Normal Form
  • Second Normal Form
  • Third Normal Form

40
Normalization
  • First Normal Form Relationships between primary
    key and each attribute must be one-to-one ie.,
    remove
  • repeating group.
  • Second Normal Form All non-key elements are
    dependent upon the entire primary key rather than
    any part thereof.
  • Third Normal Form Elimination of the dependence
    of non-key field upon any other field excepts the
    primary keys.

41
First Normal Form
Item

Qty-Store-3
Qty-Store-2
Qty-Store-1
Item No
PK
3000
4000
5000
101
The above is an violation of first normal form
because there exists a repeated group.
42
Rule Number 1
  • For each occurrence of an entity, there is only
    one and only one value for each its attributes.
    Attributes with repeating values form at least
    one new entity.
  • N other words, relationship between primary key
    and each attribute must be one-to-one.

43
Possible Solution
Store
Store/Item
Store ID
Store ID
Item- No
Qty Sold

PK
PK
FK
FK
S1
S1
3000
101
S2
S2
102
4000
44
Second Normal Form
Student/Course
Course Name
Course No
Student No
Teacher code
Grade
Student Name
PK

FK
FK
FK
3.0
Math
ST01
100
T2
Lee
ST02
4.0
200
T1
CS
Doe
Both course name and student name should be
removed because They are not related to the
entire student/course primary key.
45
Possible Solution
Course
Student No
Course Name
Student Name
Student
Course No
Student/Course
46
Rule Number 2
  • Each attribute must be related to the entire
    primary key.

47
Second Normal Process
Order
Part
Part Name
Order No
Pt-price
PartNo
Order-Dt
PK
PK
1/2/01
Nut
1
1
1.5
1/3/01
5
Bolts
2.0
3
Order/Part
Partno
Order No
QTY
How about Putting PartName In Order/part Table?
PK

1
123
1
1
5
3
123
48
Third Normal Form
COURSE
Course Id
Teacher Code
Course Name
Dept Name
Teacher Name
Dept -Id
PK
T1
DOE
MH400
Math
Math
A1
CS
DB
CS401
T2
Lee
CS
The relationship between any two non-primary key
components must not be one-to-one. Whats wrong
with the above?
49
Rule Number 3
  • The relationship between any two non-primary key
    components must not be one-t-one ie., remove
    tables within tables.

50
The Normal Process
Order
Customer
Cust-Name
Order ID
Order DT
Cust-Id
Cust-Id
PK
PK
FK
1
Lee
1
1/2/ 01
1
3
Sato
1/5/21
3
5
It would be a violation of third normal form to
place cust-name in the order table.
51
Why
  • Reasons
  • One-to-one relationship between two non-reprimary
    key columns (Cus-Id and Cust-name).
  • Redundancy
  • An update anomaly (when a customer name was
    changed)
  • Worse yet when a new name was added (the name
    could not be stored until the customer placed at
    least one order)

52
9. Validate the Model With These Check Points
  • Attribute allocation
  • Rules followed
  • Cardinality necessity
  • Relationship necessity
  • Achievement of organization goals
  • Contributions to expected benefits

53
10. Complete and Standardize the Data Elements
  • As a result of the modeling process via the
    preceding
  • procedure, an information model will emerge.
  • The information model can be used as input (for
    ex., via
  • Erwin Tool) to generate a data definition
    language (DDL)
  • which in turn is used as input for physical data
    model.
  • The information model (also called logical data
    model)
  • fulfills the data requirements of the system.

54
Logical Data Modeling What, Why, When, Who?
  • What is it?
  • It is a representation of data required to
    support the complete business needs for a
    particular business area, system or project.
  • It is a set of data models that provides a
    framework for the physical database construction
    activities.
  • It is a graphical representation of data objects
    that shows the relationship between the tables,
    views and functional core services used by
    modules in the application system.

55
Logical Data Modeling What, Why, When, Who?
  • Why do it?
  • Document the type of data which must be
    represented in a system with regard to specific
    system applications, organizations, or
    technologies.
  • Assist in the orderly creation of a physical
    database design.
  • Specifically describe the unique business
    enterprise.
  • Accelerate and clarify communications between the
    functional analysis and DBAs.

56
Logical Data Modeling What, Why, When, Who?
  • When should it be done?
  • Part of the system application lifecycle.
  • In parallel with process modeling activities.
  • Upon the completion of the conceptual data model
    to produce a first-cut database design that
    includes definitions of tables, columns, and
    constraints.

57
Logical Data Modeling What, Why, When, Who?
  • Who should do it?
  • The group responsible for ensuring that data
    structure reflects business data requirements.
  • It should be a joint effort between the
    functional analysts and data administrators.

58
Logical Data Modeling What, Why, When, Who?
  • Benefits
  • Provide a definition of the data architecture of
    how the target system will be implemented.
  • Model parts of the database schema that show how
    data structures are related to the processes.
  • Provide program designers with the detail for the
    part of the database design that their modules
    use.

59
Logical Data Modeling Activity Description
  • Define Data Architectural Standards related to
    data security, logging, archiving, backup and
    update collision, etc.
  • 2. Position the conceptual data modeling and
    revise the definitions of the entities. By taking
    architectural standards into consideration, the
    complete CDM is reexamined. As a result, new
    entities and relationships my be discovered.
  • 3. Partition data into entities at the table
    level.

60
Logical Data Modeling Activity Description
  • 4. Define integrity rules for entities and
    relationships. Integrity rules for entities
    indicate the context in which an entity
    occurrence may be created, modified, or deleted.
    This is accomplished by placing referential
    attributes in each appropriate entity on the
    model.
  • 5. Apply normalization rules to each entity.

61
Logical Data Modeling Activity Description
  • Complete and standardize the data elements.
  • Package the model for physical data modeling and
    system construction.
  • Evaluate quality of data for conversion.
  • If the data modeling is part of
    re-engineering efforts, we must also document
  • Condition of the data of the existing system
  • Impacts on the new and enhanced system.
  • Conversion rules

62
Logical Data Modeling Validation and Verification
  • Validation (dynamic0
  • Prototyping is used to validate and refine
    the model.
  • Verification (static)
  • Inspection or walk-through.

63
PK Primary KeyFK Foreign KeyNN No NullND
No duplicate
Order
Part
Relationship
Order/Part
64
Define Data Architectural Standards
  • Data Access/retrieval guidelines
  • Naming convention SQL coding standards
  • Data integrity (package triggercommit
    rollback)
  • Error handling record locking rule update
    collision.
  • Data Security
  • Data access rule data separation rule
  • Data recovery/backup
  • Data base refresh/performance tuning

65
Data Warehousing Data Modeling
  • Logical Data Modeling vs. Data Warehousing Data
    Modeling
  • Some Definitions

66
Logical Data Modeling vs. Data Warehousing Data
Modeling
Logical DM Data warehouse DW
Focus Business operations (e.g., what orders from the two months backlog are scheduled to be shipped today?). Business intelligence (e.g., how many orders, by product, were shipped more than two months late over the past year?).
content Transaction driven allow updates to reflect current business transactions dynamic Analysis driven query primarily historical summarized derived
access Structured queries Ad hoc queries. Used to drill-down into data for analysis
Performance Performance sensitive Functionality sensitive
67
SOME DEFINITIONS
  • FACT TABLE A Fact Table is a table in a
    relational
  • database with a multi-part key. Each element
    of the key is itself a foreign key to a single
    dimension tale.
  • Dimension Tables
  • They are the constraints used in forming the
    fact table.

68
SOME DEFINITIONS
  • Star Schema (or Star Joint Schema)
  • A specific organization of a database in
    which a fact
  • table with a composite key is joined to a
    number of single-level dimension tables, each
    with a single, primary key
  • -- Kimball Ralph, Data Warehouse Toolkit ---

69
A STAR JOIN SCHEMA
Fact Table
Times
Food Item
Sales
Food Item Key Food Item Desc Qty
time key day of week quarter year
Food Item Key Profile Key Time Key YTD_Sales_dolla
rs YTD_Sales_qty
YTD_
Member Profile
Profile key Profile desc Territory
Demographics
Demographic Key
Age category
Cluster 1 Population
Income category
Cluster 2 Population
70
Final Words
  • Transform data into information by understanding
    the process
  • Transform information into decisions with
    knowledge
  • Transform decisions into results with actions
Write a Comment
User Comments (0)
About PowerShow.com