Functional Dependency Graphs and SQL - PowerPoint PPT Presentation

1 / 69
About This Presentation
Title:

Functional Dependency Graphs and SQL

Description:

... once (with a different B), or A must map to more than one B in a given ... values ( Luke', Duke', 45, 2130 Boars Nest', Hazard Co.', Georgia'); 11/13/09. 35 ... – PowerPoint PPT presentation

Number of Views:155
Avg rating:3.0/5.0
Slides: 70
Provided by: kevi59
Category:

less

Transcript and Presenter's Notes

Title: Functional Dependency Graphs and SQL


1
Functional Dependency Graphs and SQL
CS157A
Lecture 12
  • Prof. Sin-Min Lee
  • Department of Computer Science
  • San Jose State University

2
Data Normalization
  • Primarily a tool to validate and improve a
    logical design so that it satisfies certain
    constraints that avoid unnecessary duplication of
    data.
  • The process of decomposing relations with
    anomalies to produce smaller, well-structured
    relations.
  • Primary Objective Reduce Redundancy,Reduce
    nulls,
  • Improve modify activities
  • insert,
  • update,
  • delete,
  • but not read
  • Price degraded query, display, reporting

3
Functional Dependency and Keys
  • Functional Dependency The value of one attribute
    (the determinant) determines the value of another
    attribute.
  • Candidate Key Each non-key field is functionally
    dependent on every candidate key.

4
Functional dependency
  • a constraint between two attributes (columns) or
    two sets of columns
  • A ? B if for every valid instance of A, that
    value of A uniquely determines the value of B
  • or A ?B if there exists at most one value of B
    for every value of A

5
(No Transcript)
6
Functional Dependencies
R
X Y Z
  • FDs defined over two sets of attributes X, Y
    Ì R
  • Notation X à Y reads as X determines Y
  • If X à Y, then all tuples that agree on X must
    also agree on Y

1 2 3 2 4 5 1 2 4 1 2 7 2 4 8 3 7 9
7
Functional Dependencies Graph(example)
X Y Z
X Y Z
1 2 3 2 4 5 1 2 4 1 2 7 2 4 8 3 7 9
8
(No Transcript)
9
functional dependency
  • some examples
  • SSN ? Name, Address, Birthdate
  • VIN ? Make, Model, Color
  • note the LHS is the determinant
  • so functional dependency is the technical term
    for determines

10
Candidate Keys
  • an attribute (or set of attributes) that uniquely
    identifies a row
  • primary key is a special candidate key
  • values cannot be null
  • e.g.
  • ENROLL (Student_ID, Name, Address, )
  • PK Student_ID
  • candidate key Name, Address

11
candidate key
  • a candidate key must satisfy
  • unique identification.
  • implies that each nonkey attribute is
    functionally dependent on the key (for not(A ? B)
    to be true, A must occur more than once (with a
    different B), or A must map to more than one B in
    a given row)
  • nonredundancy
  • no attribute in the key can be deleted and still
    be unique
  • minimal set of columns (Simsion)

12
keys and dependencies
EMPLOYEE1 (Emp_ID, Name, Dept_Name, Salary)
determinant
Emp_ID Name Dept_Name Salary
functional dependency
13
EMPLOYEE2 (Emp_ID, Course_Title, Name,
Dept_Name, Salary, Date_Completed)
Emp_ID Course_ Title Name Dept_ Name Salary Date_Comp.
not fully functionally dependant on the primary
key
14
determinants candidate keys
  • candidate key is always a determinant (one way to
    find a determinant)
  • determinant may or may not be a candidate key
  • ? candidate key is a determinant that uniquely
    identifies the remaining (nonkey) attributes
  • determinant may be
  • a candidate key
  • part of a composite candidate key
  • nonkey attribute

15
Introduction
  • Data integrity maintained by various constraints
    on data
  • Functional dependencies are application
    constraints that help DB model real-world entity
  • Join dependencies are a further constraint that
    help resolve some FD constraint limitations

16
(No Transcript)
17
(No Transcript)
18
(No Transcript)
19
(No Transcript)
20
(No Transcript)
21
What is SQL?
  • It is a language used to communicate with a
    database.
  • SQL statements are used to perform tasks such as
    update or retrieve data from a database.
  • Standard SQL commands Select, Create,
    Insert, Update, Delete, and Drop can be
    used to accomplish almost everything that a
    database needs to do.

22
History
  • Originally developed by IBM at San Jose Research
    Laboratory (a.k.a. Almaden Research Center)
  • Oracle, Sybase, Microsoft SQL server are some
    common relational database management systems
    that use SQL

23
Table Basics
  • Data or information for the database are stored
    in tables.
  • Tables are uniquely identified by their names and
    are comprised of columns and rows.
  • Rows contain data for the columns.

City State High Low
Phoenix Arizona 105 90
Tucson Arizona 101 92
Flagstaff Arizona 88 69
San Diego California 77 60
San Jose California 80 65
24
(No Transcript)
25
Basic Structure
  • The basic structure of an SQL expression consists
    of three clauses select, from, and where.
  • The select clause corresponds to the projection
    operation of the relational algebra. It is used
    to list attributes desired in the result.
  • The from clause corresponds to the
    Cartesian-product operation of the relational
    algebra. It lists the relations to be scanned in
    the evaluation of the expression.
  • The where clause corresponds to the selection
    predicate of the relational algebra. It consists
    of a predicate involving attributes of the
    relations that appear in the from clause.

26
Selecting Data
  • The Select statement is used to query the
    database and retrieve selected data that match
    the criteria that you specify
  • Example
  • select column1
  • ,column2, etc
  • from tablename
  • where condition
  • optional continued

27
Selecting Data cont
  • The column names that follow the select keyword
    determine which columns will be returned in the
    results.
  • The table name that follows the keyword from
    specifies the table that will be queried to
    retrieve the desired results.
  • cont
  • Example
  • select column1
  • ,column2, etc
  • from tablename
  • where condition
  • optional

28
Selecting Data cont
  • The where clause is optional and specifies which
    data values or rows will be returned or
    displayed, based on the criteria described on the
    condition.
  • Conditional selections used in the where clause
    , gt, lt, gt, lt, and ltgt (not equal to).
  • Example
  • select column1
  • ,column2, etc
  • from tablename
  • where condition
  • optional

29
The where Clause
  • Problem Find all loan numbers for loans made at
    the Perryridge branch with loan amounts greater
    than 1200.
  • Answer
  • select loan-number
  • from loan
  • where branch-name Perryridge and amount gt
    1200
  • SQL uses the logical connectives and, or, and
    not rather than the mathematical symbols in the
    where clause.

30
The from Clause
  • The from clause, by itself, defines a Cartesian
    product of the relations in the clause.
  • Problem For all customers who have a loan from
    the bank, find their names, loan numbers, and
    loan amount.
  • Answer
  • select customer-name, borrower.loan- number,
    amount
  • from borrower, loan
  • where borrower.loan- number loan.loan- number

31
Creating Tables
  • Example of simple create table statement
  • create table tablename
  • (column1 data type,
  • column2 data type,
  • column3 data type)
  • The create table statement is used to create a
    new table.
  • The data types specify what the type of data can
    be for that particular column.

32
Most Common Data Types
  • char (size) Fixed length character string. Size
    is specified in parenthesis. Max 255 bytes.
  • varchar (size) Variable-length character
    string. Max size is specified in parenthesis.
  • number (size) Number value with a max number of
    column digits specified in parenthesis.
  • date Date value
  • number (size, d) Number value with a max number
    of digits of size total, with a max number of
    d digits to the right of the decimal.

33
Creating Tables cont
  • Example
  • Problem Create a table for a new company that
    contains the following information about your new
    employees first name, last name, title, age, and
    salary.
  • Answer
  • Create table myemployees
  • (firstname varchar(30),
  • lastname varchar(30),
  • title varchar (30),
  • age number(2),
  • salary number(8, 2))

34
Inserting into a Table
  • Example 1
  • Insert into tablename
  • (first_column, last_column)
  • values (first_value,last_value)
  • Example 2
  • Insert into employee
  • (first, last, age, address, city, state)
  • values (Luke, Duke, 45, 2130 Boars Nest,
    Hazard Co., Georgia)
  • The insert statement is used to insert or add a
    row of data into the table.
  • Strings should be enclosed in single quotes, and
    numbers should not.

35
Updating Records
  • The update statement is used to update or change
    records that match a specified criteria.
  • This is accomplished by carefully constructing a
    where clause.
  • Example 1
  • update phone_book
  • set area_code 623
  • where prefix 979
  • Example 2
  • update tablename
  • set columnname newvalue
  • ,nextcolumn newvalue2
  • where columnname
  • OPERATOR value
  • and or column
  • OPERATOR value
  • optional

36
Deleting Records
  • The delete statement is used to delete records or
    rows from the table.
  • To delete an entire row/record, enter delete
    from followed by the table name, the where
    clause.
  • If you leave off the where clause, it will
    delete all records.
  • Example
  • delete from tablename
  • where columnname
  • OPERATOR value
  • and or column
  • OPERATOR value
  • optional

37
Drop a Table
  • The drop table command is used to delete a table
    and all rows in the table.
  • drop table is different from deleting all of the
    records in the table. Dropping the table removes
    the table definition as well as all of its rows.
  • Example
  • Drop table tablename

38
Ordering the Display of Tuples
  • Example To list in alphabetic order all
    customers who have a loan at the Perryridge
    branch
  • select distinct customer-name
  • from borrower, loan
  • where borrower.loan-number loan.loan-number
    and branch-name Perryridge
  • order by customer-name
  • The order by clause causes the tuples in the
    result of a query to appear in sorted order.
  • By default, the order by clause lists items in
    ascending order.

39
Aggregates
  • Functions that operate on sets
  • COUNT, SUM, AVG, MAX, MIN
  • Produce numbers (not tables)
  • Not part of relational algebra

SELECT COUNT() FROM Professor P
SELECT MAX (Salary) FROM Employee E
40
Aggregates
Count the number of courses taught in S2000
SELECT COUNT (T.CrsCode) FROM Teaching T WHERE
T.Semester S2000
But if multiple sections of same course are
taught, use
SELECT COUNT (DISTINCT T.CrsCode) FROM Teaching
T WHERE T.Semester S2000
41
Aggregates Proper and Improper Usage
SELECT COUNT (T.CrsCode), T. ProfId
makes no sense (in the absence of
GROUP BY clause)
SELECT COUNT (), AVG (T.Grade)
but this is OK
WHERE T.Grade gt COUNT (SELECT .)
aggregate cannot be applied to result
of SELECT statement
42
Grouping
  • But how do we compute the number of courses
    taught in S2000 per professor?
  • Strategy 1 Fire off a separate query for each
    professor
  • SELECT COUNT(T.CrsCode)
  • FROM Teaching T
  • WHERE T.Semester S2000 AND T.ProfId
    123456789
  • Cumbersome
  • What if the number of professors changes? Add
    another query?
  • Strategy 2 define a special grouping operator
  • SELECT T.ProfId, COUNT(T.CrsCode)
  • FROM Teaching T
  • WHERE T.Semester S2000
  • GROUP BY T.ProfId

43
GROUP BY
44
GROUP BY - Example
Transcript
Attributes -students Id -avg grade
-number of courses
1234 1234 1234 1234
1234 3.3 4
SELECT T.StudId, AVG(T.Grade), COUNT () FROM
Transcript T GROUP BY T.StudId
45
HAVING Clause
  • Eliminates unwanted groups (analogous to WHERE
    clause)
  • HAVING condition constructed from attributes of
    GROUP BY list and aggregates of attributes not in
    list

SELECT T.StudId, AVG(T.Grade) AS CumGpa,
COUNT () AS NumCrs FROM
Transcript T WHERE T.CrsCode LIKE CS GROUP
BY T.StudId HAVING AVG (T.Grade) gt 3.5
46
Evaluation of GroupBy with Having
47
Example
  • Output the name and address of all seniors on the
    Deans List

SELECT S.Id, S.Name FROM Student S,
Transcript T WHERE S.Id T.StudId AND
S.Status senior GROUP BY HAVING AVG
(T.Grade) gt 3.5 AND SUM (T.Credit) gt 90
S.Id -- wrong S.Id, S.Name --
right
Every attribute that occurs in SELECT clause must
also occur in GROUP BY or it must be an
aggregate. S.Name does not.
48
ORDER BY Clause
  • Causes rows to be output in a specified order

SELECT T.StudId, COUNT () AS NumCrs,
AVG(T.Grade) AS CumGpa FROM Transcript
T WHERE T.CrsCode LIKE CS GROUP BY
T.StudId HAVING AVG (T.Grade) gt 3.5 ORDER BY
DESC CumGpa, ASC StudId
49
Query Evaluation Strategy
  • Evaluate FROM produces Cartesian product, A, of
    tables in FROM list
  • Evaluate WHERE produces table, B, consisting of
    rows of A that satisfy WHERE condition
  • Evaluate GROUP BY partitions B into groups that
    agree on attribute values in GROUP BY list
  • Evaluate HAVING eliminates groups in B that do
    not satisfy HAVING condition
  • Evaluate SELECT produces table C containing a
    row for each group. Attributes in SELECT list
    limited to those in GROUP BY list and aggregates
    over group
  • Evaluate ORDER BY orders rows of C

50
Nested Queries
List all courses that were not taught in S2000
SELECT C.CrsName FROM Course C WHERE C.CrsCode
NOT IN (SELECT T.CrsCode --subquery
FROM Teaching T WHERE T.Sem
S2000)
Evaluation strategy subquery evaluated once
to produces set of courses taught in S2000.
Each row (as C) tested against this set.
51
Correlated Nested Queries
Output a row ltprof, deptgt if prof has taught a
course in dept.
SELECT P.Name, D.Name --outer
query FROM Professor P, Department D WHERE
P.Id IN (set of Ids of all profs who
have taught a course in
D.DeptId)
SELECT T.ProfId
--subquery FROM Teaching T, Course C WHERE
T.CrsCodeC.CrsCode AND
C.DeptIdD.DeptId --correlation
52
Correlated Nested Queries (cont)
  • Tuple variables T and C are local to subquery
  • Tuple variables P and D are global to subquery
  • Correlation subquery uses a global variable, D
  • The value of D.DeptId parameterizes an
    evaluation of the subquery
  • Subquery must (at least) be re-evaluated for
    each distinct value of D.DeptId
  • Correlated queries can be expensive to evaluate

53
Division SQL Solution
SELECT P.Id FROM Professor P WHERE NOT EXISTS
(SELECT D.DeptId -- set B of all dept
Ids FROM Department D EXCEPT
SELECT C.DeptId -- set A of dept
Ids of depts in
-- which P has taught a course
FROM Teaching T, Course C WHERE
T.ProfIdP.Id -- global variable
AND T.CrsCodeC.CrsCode)
54
Division
  • Query type Find the subset of items in one set
    that are related to all items in another set
  • Example Find professors who have taught courses
    in all departments
  • Why does this involve division?

ProfId DeptId
DeptId
Contains row ltp,dgt if professor p has taught
a course in department d
All department Ids
55
Division
  • Strategy for implementing division in SQL
  • Find set of all departments in which a particular
    professor, p, has taught a course - A
  • Find set of all departments - B
  • Output p if A ? B, or equivalently if B-A is
    empty

56
Division SQL Solution
SELECT P.Id FROM Professor P WHERE NOT EXISTS
(SELECT D.DeptId -- B set of all dept
Ids FROM Department D EXCEPT
SELECT C.DeptId -- A set of dept
Ids of depts in
-- which P has taught a course
FROM Teaching T, Course C WHERE
T.ProfIdP.Id --global variable
AND T.CrsCodeC.CrsCode)
57
GROUP BY
Table output by WHERE clause - Divide
rows into groups based on subset of
attributes - All members of a group
agree on those attributes
a a b b c c c c c d d d
group
Each group can be described by a single row in a
table with attributes limited to -Attributes
all group members share (listed in GROUP
BY clause) -Aggregates over group
GROUP BY attributes
58
GROUP BY - Example
Transcript
Attributes -students Id -avg grade
-number of courses
1234 1234 1234 1234
1234 3.3 4
SELECT T.StudId, AVG(T.Grade), COUNT () FROM
Transcript T GROUP BY T.StudId
59
HAVING Clause
  • Eliminates unwanted groups (analogous to WHERE
    clause)
  • HAVING condition constructed from attributes of
    GROUP BY list and aggregates of attributes not in
    list

SELECT T.StudId, AVG(T.Grade) AS CumGpa,
COUNT () AS NumCrs FROM Transcript
T WHERE T.CrsCode LIKE CS GROUP BY
T.StudId HAVING AVG (T.Grade) gt 3.5
60
Example
  • Output the name and address of all seniors on the
    Deans List

SELECT S.Name, S.Address FROM Student S,
Transcript T WHERE S.StudId T.StudId AND
S.Status senior GROUP BY HAVING AVG
(T.Grade) gt 3.5 AND SUM (T.Credit) gt 90
S.StudId -- wrong S.Name,
S.Address -- right
61
SQL and Relational Algebra
  • RELATIONAL ALGEBRA SQL 
  • PSEUDO-CODE
  •  
  • 1) R customers where city 'Dallas' SELECT
    from customers where city 'Dallas'
  •  
  • For c1 first row of customers until c1 last
    row of customers
  • If c1.city Dallas
  • Display c1.
  • End-If
  • End-For
  •  
  •  
  • 2) R customers cid, cname SELECT
    cid, cname from customers
  •  
  • For c1 first row of customers until c1 last
    row of customers
  • Display c1.cid, c1.cname
  • End-For

62
Constructing SQL
63
(No Transcript)
64
(No Transcript)
65
Aggregate Functions
  • Aggregate Functions are functions that take a
    collection (a set or multiset) of values as input
    and return a single value.
  • SQL offers five built-in aggregate functions
  • Average avg
  • Minimum min
  • Maximum max
  • Total sum
  • Count count

66
Embedded SQL
  • The SQL standard defines embeddings of SQL in a
    variety of programming languages such as C,
    Cobol, Java, Pascal, etc.
  • A language in which SQL queries are embedded is
    referred to as a host language, and the SQL
    structures permitted in the host language
    constitute embedded SQL.

67
Embedded SQL cont
  • Two important reasons why a programmer must have
    access to a database from a general purpose
    programming language
  • 1. Not all queries can be expressed in SQL,
    since SQL does not provide the full expressive
    power of a general-purpose language.
  • 2. Nondeclarative actions such as printing a
    report, interacting with a user, or sending the
    results of a query to a graphical user interface
    cannot be done from within SQL.

68
Dynamic SQL
  • The dynamic SQL component of SQL allows programs
    to construct and submit SQL queries at run-time.
  • Using dynamic SQL, programs can create SQL
    queries as strings at run time (perhaps based on
    input from the user) and can either have them
    executed immediately or have them prepared for
    subsequent use.

69
Relational Algebra and SQL
Read the following
http//www.cs.ru.nl/gerp/IS0/sheets/IS0_Relatione
le_Algebra_SQL2.pdf
Write a Comment
User Comments (0)
About PowerShow.com