Title: Database Systems CS 311
1. Database Systems, CS 311
- Lecture 1
- (with some slides integrated from those of Jiawei Han, Kevin Chang, Alon Halevy, and Dan Suciu.)
2. Self-Introduction
- AnHai Doan
- database and information system group (DAIS)
- Research interests
- databases, data mining, web mining, artificial intelligence
- Hobbies
- mountain climbing, downhill skiing, sailing
- Education history
- Vietnam -> Hungary -> Wisconsin -> Seattle -> UIUC
3. Course administrivia ...
4. Course Goals & Content
- First course on database systems and data management at UIUC
- cover mostly relational databases
- how to design and create such databases
- how to use them (via SQL query language)
- how to implement them (only briefly)
- will touch on some advanced issues
- XML data models, semi-structured data
- data integration
- you may also try a simple research component
- more on this later
5. Prerequisites
- Must have data structures and algorithms background
- CS 225 or 300 equivalent
- Good at C or Java
- project will require a lot of programming
- need C or Java to do a good job at talking with databases
- you or your project group picks the language
- Knowing only C will require more work
- more difficult to talk in C to databases
6. Textbook
- Required: "Database Systems: The Complete Book", by Garcia-Molina, Ullman and Widom, 2002
- Comments on the textbook.
- Do you have problems getting your textbook?
- Books on reserve here at the Grainger Library
- "Database Management Systems" by Ramakrishnan and Gehrke
- "Database System Concepts" by Silberschatz, Korth, and Sudarsan
7. Course Format
- For all students
- two 75-min lectures / week
- 4-6 homeworks
- project
- a midterm and a final exam
- Graduate students do an extra project
- survey papers on a research topic, write a 10-15 page report
- I will talk with you in detail later in the course
8. Lectures
- Lecture slides in ppt format will be posted shortly before or after the lecture
- slides are meant to complement the lectures
- Many issues discussed in the lectures will be covered in the exams and homeworks
- hence try to attend lectures regularly
9. Homeworks
- Some paper-based, some may involve light programming
- Will be collected at the beginning of class on the due date, or be collected at my secretary's place
- to be decided later
- No late homework will be accepted
10. Project
- Select an application that needs a database
- Build a database application from start to finish
- Significant amount of programming
- Will be done in stages
- you will submit some work at the end of each stage
- Will show a demo at semester end
11. Project Groups
- Project will be done in groups of 3-4 students
- a lot of work, difficult to design so that one person can do all
- learn how to work in a group: valuable skills
- groups are like broccoli, they are good for you
- Try to form groups as soon as possible
- can start by posting requests on the class newsgroup
- There will be a deadline later for forming groups
- If you have not formed groups by then
- we will help assign you to groups
12. More on Grouping
- All group members receive the same grade
- If someone drops out, the rest pick up the work
13. Exams
- Midterm & final
- will be announced shortly
- check final date and make sure no conflict!
- There will be some brief review before each exam
- If you have conflicts
- do let us know in advance, see course homepage
for more information
14. Tentative Grading Breakdown
- Homework: 25%
- Project: 30%
- Midterm: 20%
- Final: 25%
- Will attempt to grade on an absolute scale as much as possible
- not on a curve
15. Contacting the staff ...
16. Staff Office Hours
- Instructor: AnHai Doan
- Room 2118 Siebel, anhai_at_cs.uiuc.edu
- Office hours: Tue/Thu 10:45-11:45 (after lecture)
- TAs
- Michael Makstman, 1271 DCL, cs311ta1_at_cs.uiuc.edu, 217-244-8522, office hours TBD
- Rishi Sinha, 1271 DCL, cs311ta2_at_cs.uiuc.edu, 217-244-8522, office hours TBD
- They are not here yet
17. Communications
- www-courses.cs.uiuc.edu/cs311
- newsgroup class.cs311
- vitally important!
- make sure to check it daily for new announcements
- If you have a question/problem
- talk to people in your group first
- post your question on newsgroup
- email TA
- go to office hours to talk to TA or instructor
- Office hours are held on ALL WEEKDAYS
- so don't be shy
18. Newsgroup
- class.cs311
- designed for you and your peers
- to communicate and help one another
- please do not post solutions to the newsgroup
- TAs will monitor and try their best to help with your questions
- There can be many questions
- it is usually difficult to answer all of them or answer in a timely manner
- hence you should come to office hours or email a TA
19. Now onto database studies ...
20. A Motivating Example
- Suppose we are building a system to store the information about
- students
- courses
- professors
- who takes what, who teaches what
21. Application Requirements
- store the data for a long period of time
- large amounts (100s of GB)
- protect against crashes
- protect against unauthorized use
- allow users to query/update
- who teaches CS 173?
- enroll Mary in CS 311
22. Application Requirements (continued)
- allow several (100s, 1000s) users to access the data simultaneously
- allow administrators to change the schema
- add information about TAs
23. Trying Without a DBMS
- Why Direct Implementation Won't Work
- Storing data: file system is limited
- size less than 4GB (on 32-bit machines)
- when system crashes we may lose data
- password-based authorization insufficient
- Query/update
- need to write a new C/Java program for every new query
- need to worry about performance
24. Trying Without a DBMS (continued)
- Concurrency: limited protection
- need to worry about interfering with other users
- need to offer different views to different users (e.g. registrar, students, professors)
- Schema change
- entails changing file formats
- need to rewrite virtually all applications
- Better let a database system handle it
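To make the schema-change point concrete, here is a minimal sketch using Python's built-in sqlite3 module (the slides do not name a particular DBMS, so SQLite is only a stand-in). The Courses table, the ta column, and the sample row are made up for illustration. It adds TA information to an existing table and checks that a query written before the change still returns the same rows.

```python
import sqlite3


def add_ta_column():
    """Add a column to an existing table and re-run an old query unchanged."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    # Hypothetical table and sample data, not taken from the slides.
    conn.execute("CREATE TABLE Courses (cid TEXT PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO Courses VALUES ('CS311', 'Database Systems')")

    old_query = "SELECT cid, name FROM Courses"  # written before the change
    before = conn.execute(old_query).fetchall()

    # Schema change: the DBMS rewrites its own storage as needed.
    conn.execute("ALTER TABLE Courses ADD COLUMN ta TEXT")
    conn.execute("UPDATE Courses SET ta = 'Alice' WHERE cid = 'CS311'")

    after = conn.execute(old_query).fetchall()
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
    columns = [row[1] for row in conn.execute("PRAGMA table_info(Courses)")]
    conn.close()
    return before, after, columns


if __name__ == "__main__":
    print(add_ta_column())
```

With plain files, the same change would mean a new file format and edits to every program that parses it.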
25. What Can a DBMS Do for Us?
- Data Definition Language - DDL
- Data Manipulation Language - DML
- query language
- Storage management
- Transaction Management
- concurrency control
- recovery
- Think buying a plane ticket! Can you do it
without a DBMS?
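The services listed above can be sketched in a few lines, assuming Python's built-in sqlite3 module as a stand-in DBMS. The table, column names, and sample rows are invented for illustration. The sketch issues one DDL statement, a few DML statements, and one transaction that fails halfway and is rolled back as a unit.

```python
import sqlite3


def demo_dbms_services():
    """Show DDL, DML, and an all-or-nothing transaction on a toy table."""
    conn = sqlite3.connect(":memory:")

    # DDL: define the structure of the data.
    conn.execute("CREATE TABLE Students (ssn TEXT PRIMARY KEY, name TEXT NOT NULL)")

    # DML: insert a row and make it permanent.
    conn.execute("INSERT INTO Students VALUES ('123', 'Mary')")
    conn.commit()

    # Transaction: "with conn" commits on success and rolls back on an exception.
    try:
        with conn:
            conn.execute("INSERT INTO Students VALUES ('456', 'John')")
            # Duplicate primary key: this statement fails ...
            conn.execute("INSERT INTO Students VALUES ('123', 'Duplicate')")
    except sqlite3.IntegrityError:
        pass  # ... and the insert of John is undone with it.

    # DML: query what survived.
    names = [row[0] for row in conn.execute("SELECT name FROM Students ORDER BY name")]
    conn.close()
    return names


if __name__ == "__main__":
    print(demo_dbms_services())
```

The plane-ticket case relies on the same all-or-nothing behavior: reserving the seat and charging the card should either both happen or both be undone.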
26. Building an Application with a DBMS
- Requirements modeling (conceptual, pictures)
- Decide what entities should be part of the application and how they should be linked.
- Schema design and implementation
- Decide on a set of tables, attributes.
- Define the tables in the database system.
- Populate database (insert tuples).
- Write application programs using the DBMS
- way easier now that the data management is taken
care of.
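27. Conceptual Modeling
[Figure: entity-relationship diagram. Entities: Student, Course, Professor. Relationships: Takes, Advises, Teaches. Attribute labels appearing in the diagram: ssn, name, category, cid, name, quarter, name, field, address.]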
28. Schema Design and Implementation
- Tables
- Separates the logical view from the physical view of the data.
[Figure: example tables Students, Takes, Courses]
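As a rough sketch of what these three tables might look like, the code below again uses Python's sqlite3 module. Only the table names and the columns ssn, name, and cid appear in the slides; the column types, key constraints, index name, and sample rows are assumptions. It also illustrates the logical/physical separation: adding an index changes how data is stored and accessed, but not what a query returns.

```python
import sqlite3


def build_database():
    """Create the three tables with a tiny, made-up data set."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Students (ssn TEXT PRIMARY KEY, name TEXT);
        CREATE TABLE Courses  (cid TEXT PRIMARY KEY, name TEXT);
        CREATE TABLE Takes (
            ssn TEXT REFERENCES Students(ssn),
            cid TEXT REFERENCES Courses(cid),
            PRIMARY KEY (ssn, cid)
        );
        INSERT INTO Students VALUES ('111', 'Mary');
        INSERT INTO Courses  VALUES ('CS311', 'Database Systems');
        INSERT INTO Takes    VALUES ('111', 'CS311');
    """)
    return conn


def logical_view_is_stable():
    """Run the same query before and after a purely physical change."""
    conn = build_database()
    query = "SELECT ssn, cid FROM Takes ORDER BY ssn, cid"
    before = conn.execute(query).fetchall()
    # Physical change only: a new access path, same logical table.
    conn.execute("CREATE INDEX takes_by_cid ON Takes (cid)")
    after = conn.execute(query).fetchall()
    conn.close()
    return before, after


if __name__ == "__main__":
    print(logical_view_is_stable())
```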
29. Querying a Database
- Find all courses that Mary takes
- S(tructured) Q(uery) L(anguage)
- Query processor figures out how to answer the
query efficiently.
select C.name
from Students S, Takes T, Courses C
where S.name = 'Mary' and S.ssn = T.ssn and T.cid = C.cid
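The query above can be tried end to end with a small sketch, assuming Python's sqlite3 module and a few made-up rows (the student numbers and course names are invented). The student name is passed as a parameter rather than pasted into the SQL text.

```python
import sqlite3


def courses_taken_by(student_name):
    """Return the names of all courses taken by the given student."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical sample data; only the table and column names follow the slides.
    conn.executescript("""
        CREATE TABLE Students (ssn TEXT PRIMARY KEY, name TEXT);
        CREATE TABLE Courses  (cid TEXT PRIMARY KEY, name TEXT);
        CREATE TABLE Takes    (ssn TEXT, cid TEXT);
        INSERT INTO Students VALUES ('111', 'Mary'), ('222', 'John');
        INSERT INTO Courses  VALUES ('CS311', 'Databases'), ('CS173', 'Discrete Math');
        INSERT INTO Takes    VALUES ('111', 'CS311'), ('222', 'CS173');
    """)
    rows = conn.execute(
        "SELECT C.name FROM Students S, Takes T, Courses C "
        "WHERE S.name = ? AND S.ssn = T.ssn AND T.cid = C.cid "
        "ORDER BY C.name",
        (student_name,),
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


if __name__ == "__main__":
    print(courses_taken_by("Mary"))
```

The query says only which rows are wanted; it does not say which table to read first or how to match rows across tables.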
30. Query Optimization
Goal: Declarative SQL query -> Imperative query execution plan
select C.name
from Students S, Takes T, Courses C
where S.name = 'Mary' and S.ssn = T.ssn and T.cid = C.cid
Plan: tree of Relational Algebra operators, choice of algorithms at each operator
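One way to peek at the plan a real system picks is SQLite's EXPLAIN QUERY PLAN statement, used here through Python's sqlite3 module as a stand-in for whatever DBMS the course uses. The table definitions are assumptions. The plan text differs between SQLite versions, so the sketch only collects the steps and does not assume what they say.

```python
import sqlite3

# The declarative query: it states what is wanted, not how to compute it.
QUERY = (
    "SELECT C.name FROM Students S, Takes T, Courses C "
    "WHERE S.name = 'Mary' AND S.ssn = T.ssn AND T.cid = C.cid"
)


def query_plan():
    """Return the optimizer's chosen steps as human-readable strings."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema; only table and column names follow the slides.
    conn.executescript("""
        CREATE TABLE Students (ssn TEXT PRIMARY KEY, name TEXT);
        CREATE TABLE Courses  (cid TEXT PRIMARY KEY, name TEXT);
        CREATE TABLE Takes    (ssn TEXT, cid TEXT);
    """)
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY).fetchall()
    conn.close()
    # The last column of each row holds the description of one plan step.
    return [row[-1] for row in rows]


if __name__ == "__main__":
    for step in query_plan():
        print(step)
```

Each printed step names a table and how it is accessed (for example a full scan or an index lookup), which is the "choice of algorithms at each operator" in miniature.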
31. Traditional and Novel Data Management
- Traditional Data Management
- relational data for enterprise applications
- storage
- query processing/optimization
- transaction processing
- Novel Data Management
- Integration of data from multiple databases, warehousing.
- Data management for decision support, data mining.
- Exchange of data on the web: XML.
32. Database Industry
- Relational databases are a great success of theoretical ideas.
- Big DBMS companies are among the largest software companies in the world.
- Oracle
- IBM (with DB2)
- Microsoft (SQL Server, Microsoft Access)
- Others
- $20B industry.
33. The Study of DBMS
- Several aspects
- Modeling and design of databases
- Database programming: querying and update operations
- Database implementation
- DBMS study cuts across many fields of Computer Science: OS, languages, AI, Logic, multimedia, theory...
34. For the next lecture
- read some parts of the textbook
- the reading requirements will be posted under lectures/schedule tomorrow