Introduction to Databases CS420 Software Engineering in Practice - PowerPoint PPT Presentation

1 / 35
About This Presentation
Title:

Introduction to Databases CS420 Software Engineering in Practice

Description:

7/12/09. Introduction to Databases. CS420 Software ... (Lindsay, Park, Pittsfield) } is a relation over. customer_name x customer_street x customer_city ... – PowerPoint PPT presentation

Number of Views:44
Avg rating:3.0/5.0
Slides: 36
Provided by: avis4
Category:

less

Transcript and Presenter's Notes

Title: Introduction to Databases CS420 Software Engineering in Practice


1
Introduction to DatabasesCS420 Software
Engineering in Practice
2
Basic Structure
  • Formally, given sets D1, D2, . Dn a relation r
    is a subset of D1 x D2 x x DnThus,
    a relation is a set of n-tuples (a1, a2, , an)
    where each ai ? Di
  • Example If customer_name Jones, Smith,
    Curry, Lindsay customer_street Main, North,
    Park customer_city Harrison, Rye,
    PittsfieldThen r (Jones, Main, Harrison),
    (Smith, North, Rye),
    (Curry, North, Rye),
    (Lindsay, Park, Pittsfield) is a relation over
  • customer_name x customer_street x
    customer_city

3
Example of a Relation
4
Attribute Types
  • Each attribute of a relation has a name
  • The set of allowed values for each attribute is
    called the domain of the attribute
  • Attribute values are (normally) required to be
    atomic that is, indivisible
  • Note composite attribute values are not atomic
  • The special value null is a member of every
    domain
  • The null value causes complications in the
    definition of many operations

5
Relation Schema
  • A1, A2, , An are attributes
  • R (A1, A2, , An ) is a relation schema
  • Example
  • Customer_schema (customer_name,
    customer_street, customer_city)

6
Relation Instance
  • The current values (relation instance) of a
    relation are specified by a table
  • An element t of r is a tuple, represented by a
    row in a table

attributes (or columns)
customer_name
customer_street
customer_city
Jones Smith Curry Lindsay
Main North North Park
Harrison Rye Rye Pittsfield
tuples (or rows)
customer
7
Relations are Unordered
  • Order of tuples is irrelevant (tuples may be
    stored in an arbitrary order)
  • Example account relation with unordered tuples

8
Database
  • A database consists of multiple relations
  • Information about an enterprise is broken up into
    parts, with each relation storing one part of
    the information
  • account stores information about accounts
    depositor stores information about which
    customer owns which
    account customer stores information
    about customers
  • Storing all information as a single relation such
    as bank(account_number, balance,
    customer_name, ..)results in
  • repetition of information (e.g., two customers
    own an account)
  • the need for null values (e.g., represent a
    customer without an account)
  • Normalization theory deals with how to design
    relational schemas

9
The depositor Relation
10
Keys
  • Let K ? R
  • K is a superkey of R if values for K are
    sufficient to identify a unique tuple of each
    possible relation r(R)
  • by possible r we mean a relation r that could
    exist in the enterprise we are modeling.
  • Example customer_name, customer_street and
    customer_name are both superkeys
    of Customer, if no two customers can possibly
    have the same name.
  • K is a candidate key if K is minimalExample
    customer_name is a candidate key for Customer,
    since it is a superkey (assuming no two customers
    can possibly have the same name), and no subset
    of it is a superkey.

11
Query Languages
  • Language in which user requests information from
    the database.
  • Categories of languages
  • Procedural
  • Non-procedural, or declarative
  • Pure languages
  • Relational algebra
  • Tuple relational calculus
  • Domain relational calculus
  • Pure languages form underlying basis of query
    languages that people use.

12
SQL
  • Most popular computer language used to create,
    modify and retrieve data from relational database
    management systems.
  • SQL is not a procedural language but a
    declarative language. You write a single SQL
    declaration and hand it to the DBMS. The DBMS
    then executes internal code, which is hidden from
    us.
  • In a declarative language, we carefully phrase
    what we want and then let the DBMS get it for us.
  • Procedural languages result in many lines of
    code. Declarative languages result in one
    statement of the desired result.

13
SQL - Select
  • SELECT is used to retrieve zero or more rows from
    one or more tables in a database. In specifying a
    SELECT query, the user specifies a description of
    the desired result set, but they do not specify
    what physical operations must be executed to
    produce that result set. Translating the query
    into an efficient query plan is left to the
    database system, more specifically to the query
    optimizer.
  • Commonly available keywords related to SELECT
    include
  • FROM is used to indicate from which tables the
    data is to be taken, as well as how the tables
    join to each other.
  • WHERE is used to identify which rows to be
    retrieved, or applied to GROUP BY.
  • GROUP BY is used to combine rows with related
    values into elements of a smaller set of rows.
  • HAVING is used to identify which of the "combined
    rows" (combined rows are produced when the query
    has a GROUP BY keyword or when the SELECT part
    contains aggregates), are to be retrieved.
  • ORDER BY is used to identify which columns are
    used to sort the resulting data

14
SQL Example
  • Example SELECT FROM my_table WHERE id gt 10

15
Data manipulation
  • First there are the standard Data Manipulation
    Language (DML) elements. DML is the subset of the
    language used to add, update and delete data.
  • INSERT is used to add zero or more rows (formally
    tuples) to an existing table.
  • UPDATE is used to modify the values of a set of
    existing table rows.
  • MERGE is used to combine the data of multiple
    tables. It is something of a combination of the
    INSERT and UPDATE elements. It is defined in the
    SQL2003 standard prior to that, some databases
    provided similar functionality via different
    syntax, sometimes called an "upsert".
  • DELETE deletes all data from a table
    (non-standard, but common SQL command).
  • TRUNCATE removes zero or more existing rows from
    a table.

16
Entity-Relationship Model
  • Design Process
  • Modeling
  • Constraints
  • E-R Diagram
  • Design Issues
  • Weak Entity Sets
  • Extended E-R Features
  • Design of the Bank Database
  • Reduction to Relation Schemas
  • Database Design
  • UML

17
Modeling
  • A database can be modeled as
  • a collection of entities,
  • relationship among entities.
  • An entity is an object that exists and is
    distinguishable from other objects.
  • Example specific person, company, event, plant
  • Entities have attributes
  • Example people have names and addresses
  • An entity set is a set of entities of the same
    type that share the same properties.
  • Example set of all persons, companies, trees,
    holidays

18
Entity Sets customer and loan
customer_id customer_ customer_ customer_
loan_ amount
name street city
number
19
Relationship Sets
  • A relationship is an association among several
    entities
  • Example Hayes depositor A-102 customer
    entity relationship set account entity
  • A relationship set is a mathematical relation
    among n ? 2 entities, each taken from entity sets
  • (e1, e2, en) e1 ? E1, e2 ? E2, , en ?
    Enwhere (e1, e2, , en) is a relationship
  • Example
  • (Hayes, A-102) ? depositor

20
Relationship Set borrower
21
Relationship Sets (Cont.)
  • An attribute can also be property of a
    relationship set.
  • For instance, the depositor relationship set
    between entity sets customer and account may have
    the attribute access-date

22
Degree of a Relationship Set
  • Refers to number of entity sets that participate
    in a relationship set.
  • Relationship sets that involve two entity sets
    are binary (or degree two). Generally, most
    relationship sets in a database system are
    binary.
  • Relationship sets may involve more than two
    entity sets.
  • Relationships between more than two entity sets
    are rare. Most relationships are binary. (More
    on this later.)
  • Example Suppose employees of a bank may have
    jobs (responsibilities) at multiple branches,
    with different jobs at different branches. Then
    there is a ternary relationship set between
    entity sets employee, job, and branch

23
Attributes
  • An entity is represented by a set of attributes,
    that is descriptive properties possessed by all
    members of an entity set.
  • Domain the set of permitted values for each
    attribute
  • Attribute types
  • Simple and composite attributes.
  • Single-valued and multi-valued attributes
  • Example multivalued attribute phone_numbers
  • Derived attributes
  • Can be computed from other attributes
  • Example age, given date_of_birth

Example customer (customer_id,
customer_name, customer_street,
customer_city ) loan (loan_number, amount )
24
Composite Attributes
25
Mapping Cardinality Constraints
  • Express the number of entities to which another
    entity can be associated via a relationship set.
  • Most useful in describing binary relationship
    sets.
  • For a binary relationship set the mapping
    cardinality must be one of the following types
  • One to one
  • One to many
  • Many to one
  • Many to many

26
Mapping Cardinalities
One to one
One to many
Note Some elements in A and B may not be mapped
to any elements in the other set
27
Mapping Cardinalities
Many to one
Many to many
Note Some elements in A and B may not be mapped
to any elements in the other set
28
Keys
  • A super key of an entity set is a set of one or
    more attributes whose values uniquely determine
    each entity.
  • A candidate key of an entity set is a minimal
    super key
  • Customer_id is candidate key of customer
  • account_number is candidate key of account
  • Although several candidate keys may exist, one of
    the candidate keys is selected to be the primary
    key.

29
Keys for Relationship Sets
  • The combination of primary keys of the
    participating entity sets forms a super key of a
    relationship set.
  • (customer_id, account_number) is the super key of
    depositor
  • NOTE this means a pair of entity sets can have
    at most one relationship in a particular
    relationship set.
  • Example if we wish to track all access_dates to
    each account by each customer, we cannot assume a
    relationship for each access. We can use a
    multivalued attribute though
  • Must consider the mapping cardinality of the
    relationship set when deciding the what are the
    candidate keys
  • Need to consider semantics of relationship set in
    selecting the primary key in case of more than
    one candidate key

30
E-R Diagrams
  • Rectangles represent entity sets.
  • Diamonds represent relationship sets.
  • Lines link attributes to entity sets and entity
    sets to relationship sets.
  • Ellipses represent attributes
  • Double ellipses represent multivalued attributes.
  • Dashed ellipses denote derived attributes.
  • Underline indicates primary key attributes (will
    study later)

31
E-R Diagram With Composite, Multivalued, and
Derived Attributes
32
Relationship Sets with Attributes
33
Roles
  • Entity sets of a relationship need not be
    distinct
  • The labels manager and worker are called
    roles they specify how employee entities
    interact via the works_for relationship set.
  • Roles are indicated in E-R diagrams by labeling
    the lines that connect diamonds to rectangles.
  • Role labels are optional, and are used to clarify
    semantics of the relationship

34
Cardinality Constraints
  • We express cardinality constraints by drawing
    either a directed line (?), signifying one, or
    an undirected line (), signifying many,
    between the relationship set and the entity set.
  • One-to-one relationship
  • A customer is associated with at most one loan
    via the relationship borrower
  • A loan is associated with at most one customer
    via borrower

35
One-To-Many Relationship
  • In the one-to-many relationship a loan is
    associated with at most one customer via
    borrower, a customer is associated with several
    (including 0) loans via borrower
Write a Comment
User Comments (0)
About PowerShow.com