Advanced Databases Introduction

1
Advanced Databases: Introduction
  • dr. Toon Calders
  • prof. dr. Jan Paredaens

2
Outline
  • Motivation for the course
  • Other DH courses
  • Practical organization
  • Course topics
  • Project
  • Overview of changes

3
Motivation for the Course
  • Database: a piece of software to handle data
  • store,
  • maintain, and
  • query
  • The ideal system is situation-dependent
  • data type: simple / semi-structured / complex / ...
  • types of queries: simple lookup / analytical / ...
  • type of usage: multi-user / single-user /
    distributed / ...

4
Motivation for the Course
  • Relational databases are tuned towards
  • simple data
  • simple, ad-hoc queries
  • multiple users
  • Other models are more suitable for other types of
    data
  • Object-Oriented,
  • Deductive,
  • Semi-Structured Databases,
  • Data warehouses

5
Motivation for the Course
  • Study different data models
  • Advantages, disadvantages
  • Conceptual level
  • what are the important notions?
  • What's underneath?
  • In a scientific way
  • exact, not just claims

6
Motivation for the Course
  • Student knows
  • different database models
  • Understands
  • why they are introduced
  • conceptual notions
  • Is able to
  • quickly master vendor-specific products

7
Outline
  • Motivation for the course
  • Other DH courses
  • Practical organization
  • Course topics
  • Project
  • Overview of changes

8
Other DH Courses
  • Relational database systems
  • (2ID05) Databases and Data Modelling
  • (2ID35) Database Technology
  • transactions, indexing, query optimization,
    distributed DB
  • Other database models
  • (2ID45) Advanced Databases
  • (2II15) Data Mining
  • (2ID25) Information Retrieval
  • (2ID99) Capita Selecta DH

9
Outline
  • Motivation for the course
  • Other DH courses
  • Practical organization
  • Course topics
  • Project
  • Overview of changes

10
Practical Organization
  • In principle
  • Wed 8:45 - 10:30: Practical session, M 1.46
  • no new material
  • opportunity to practice, ask questions
  • together solve exercises
  • Fri 10:45 - 12:30: Lectures, HG 6.09
  • XML: Paredaens (6 lectures)
  • other parts: Calders

11
Practical Organization
  • Important information
  • http://wwwis.win.tue.nl/tcalders/teaching/advancedDB/
  • Subscribe to 2ID45 on studyweb !
  • messages to the whole class group
  • lecture postponed, room changes,
  • t.calders_at_tue.nl

12
Practical Organization
  • Course material
  • Book
  • Silberschatz, Korth, Sudarshan. Database System
    Concepts, 5th edition. McGraw-Hill International
  • Lots of additional material on course webpage
  • papers
  • slides
  • solutions to exercises

13
Practical Organization
  • Grades
  • 70% written exam
  • 30% group project
  • No project: no grade
  • Grade for the project can be transferred to
    August, similar for grade for the exam
  • Grades expire in August

14
Outline
  • Motivation for the course
  • Other DH courses
  • Practical organization
  • Course Topics
  • Project
  • Overview of changes

15
Course Topics
  • Limitations of the relational model
  • Deductive databases
  • Object-Oriented Databases
  • Data Warehousing and OLAP
  • Semi-Structured data

16
Limitations of the relational model
  • Not every query can be expressed
  • Transitive closure cannot be expressed in
    Relational Algebra
  • Give all cities reachable from Antwerp by plane
  • Give all smallest components of a part
  • Give all descendants of person X
  • Not even if you're very smart
  • proof
  • Extension to other relational query languages
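
A minimal sketch of the intuition behind this limitation (an illustration, not the proof): each join of the flight relation with itself adds one more hop, so an expression with a fixed number of joins only finds connections up to a bounded length, while reachability needs iteration until nothing new is found. The flight data and all function names below are made up for illustration.

```python
def compose(r, s):
    """Join then project: pairs (x, z) with (x, y) in r and (y, z) in s."""
    return {(x, z) for (x, y1) in r for (y2, z) in s if y1 == y2}


def within_hops(flights, k):
    """Pairs connected by at most k flights, using a fixed number (k - 1) of joins."""
    result = set(flights)
    paths = set(flights)
    for _ in range(k - 1):
        paths = compose(paths, flights)
        result |= paths
    return result


def transitive_closure(flights):
    """Iterate to a fixpoint; the number of rounds depends on the data."""
    closure = set(flights)
    while True:
        new_pairs = compose(closure, flights) - closure
        if not new_pairs:
            return closure
        closure |= new_pairs


# Hypothetical flight relation (illustrative data only).
flights = {
    ("Antwerp", "London"),
    ("London", "New York"),
    ("New York", "Chicago"),
    ("Chicago", "Denver"),
}

reachable = {dst for (src, dst) in transitive_closure(flights) if src == "Antwerp"}
two_hops = {dst for (src, dst) in within_hops(flights, 2) if src == "Antwerp"}
```

Here `two_hops` misses Chicago and Denver, whereas `reachable` contains all four destinations; a longer chain of flights would defeat any other fixed choice of `k` in the same way.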

17
Deductive Databases
  • Motivation is two-fold
  • add deductive capabilities to databases; the
    database contains
  • facts (extensional relations)
  • rules to generate derived facts (intensional
    relations)
  • Database is knowledge base
  • Extend the querying
  • Datalog allows for recursion

18
Deductive Databases
  • Datalog as engine of deductive databases
  • similarities with Prolog
  • has facts and rules
  • rules define -possibly recursive- views
  • Semantics not always clear
  • safety
  • negation
  • recursion

19
Deductive Databases
  • g(a,b). g(b,c). g(a,d).
  • reach(X,X) :- g(X,Y).
  • reach(X,Y) :- g(X,Y).
  • reach(X,Z) :- reach(X,Y), reach(Y,Z).
  • node(X) :- g(X,Y).
  • node(Y) :- g(X,Y).
  • unreach(X,Y) :- node(X), node(Y), not
    reach(X,Y).
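
The program above can be evaluated by hand-translating its rules into set operations. The sketch below is one possible reading, assuming a stratified evaluation order: `reach` and `node` are computed to a fixpoint first, and only then is the negated rule for `unreach` applied. The function name `evaluate` is hypothetical.

```python
def evaluate(g):
    """Evaluate the reach / node / unreach program over the edge facts g."""
    # Stratum 1: rules without negation.
    # reach(X,X) :- g(X,Y).   reach(X,Y) :- g(X,Y).
    reach = {(x, x) for (x, _) in g} | set(g)
    while True:
        # reach(X,Z) :- reach(X,Y), reach(Y,Z).
        derived = {(x, z) for (x, y1) in reach for (y2, z) in reach if y1 == y2}
        if derived <= reach:
            break  # fixpoint: no new facts
        reach |= derived

    # node(X) :- g(X,Y).   node(Y) :- g(X,Y).
    node = {x for (x, _) in g} | {y for (_, y) in g}

    # Stratum 2: the negated literal is only safe to test once reach is complete.
    # unreach(X,Y) :- node(X), node(Y), not reach(X,Y).
    unreach = {(x, y) for x in node for y in node if (x, y) not in reach}
    return reach, node, unreach


g = {("a", "b"), ("b", "c"), ("a", "d")}
reach, node, unreach = evaluate(g)
```

With these rules `reach(c,c)` and `reach(d,d)` are not derived, because `c` and `d` have no outgoing edge, so both pairs end up in `unreach`.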

20
Deductive Databases
  • In this topic we study
  • How to handle negation and recursion in the same
    program
  • How to efficiently evaluate Datalog queries
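
One well-known technique for efficient evaluation is semi-naive evaluation: each round joins only the facts that were new in the previous round, instead of re-deriving everything. The sketch below applies it to the linear rule `reach(X,Z) :- reach(X,Y), g(Y,Z)`, which is a simplification of the non-linear rule in the example program; whether the course treats exactly this technique is an assumption, and the function name is hypothetical.

```python
from collections import defaultdict


def semi_naive_reach(g):
    """Transitive closure of g, joining only newly derived facts in each round."""
    successors = defaultdict(set)  # index on the first column of g
    for x, y in g:
        successors[x].add(y)

    reach = set(g)   # reach(X,Y) :- g(X,Y).
    delta = set(g)   # facts that were new in the previous round
    rounds = 0
    while delta:
        rounds += 1
        # reach(X,Z) :- delta_reach(X,Y), g(Y,Z).
        candidates = {(x, z) for (x, y) in delta for z in successors[y]}
        delta = candidates - reach
        reach |= delta
    return reach, rounds


g = {("a", "b"), ("b", "c"), ("a", "d")}
closure, rounds = semi_naive_reach(g)
```

A naive evaluation would join the full `reach` relation with `g` in every round; here the second round only looks at the single new fact `reach(a,c)`.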

21
OO Databases
  • Many applications require the storage and
    manipulation of complex data
  • design databases
  • geometric databases
  • Object-Oriented programming languages manipulate
    complex objects
  • classes, methods, inheritance, polymorphism

22
OO Databases
  • Very simple example
  • Class book
  • set of authors
  • title
  • set of keywords
  • Extremely simple to model in OO language
  • Hard in relational database!

23
OO Databases
  • In many applications persistency of the data is
    nevertheless required
  • protection against system failure
  • consistency of the data
  • Mapping objects in an OO language to tuples of atomic
    values in a relational database is often problematic

24
OO Databases
  • Either we ignore the multivalued dependencies
  • This table is in 3NF, BCNF

25
OO Databases
  • Or we go to 4NF
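
The two options can be sketched for the book class from the earlier slide. This is an illustration under assumptions: the title is taken to identify a book, the keywords are made up, and the function names are hypothetical. The single table repeats every author with every keyword; the 4NF design splits the book into two tables that must be joined to rebuild the object.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Book:
    """The OO view: one object with two set-valued attributes."""
    title: str
    authors: frozenset
    keywords: frozenset


def to_single_table(book):
    """Option 1: one (title, author, keyword) table, ignoring the multivalued dependencies."""
    return {(book.title, a, k) for a, k in product(book.authors, book.keywords)}


def to_4nf(book):
    """Option 2: decompose into (title, author) and (title, keyword)."""
    book_author = {(book.title, a) for a in book.authors}
    book_keyword = {(book.title, k) for k in book.keywords}
    return book_author, book_keyword


def join_back(book_author, book_keyword):
    """Natural join on title, needed to reassemble the flat view."""
    return {(t1, a, k) for (t1, a) in book_author for (t2, k) in book_keyword if t1 == t2}


book = Book(
    title="Database System Concepts",
    authors=frozenset({"Silberschatz", "Korth", "Sudarshan"}),
    keywords=frozenset({"databases", "textbook"}),  # illustrative keywords
)

single = to_single_table(book)           # 3 authors x 2 keywords = 6 rows
book_author, book_keyword = to_4nf(book)  # 3 rows + 2 rows
```

Neither option matches the single object of the OO model: the first is redundant, the second scatters one book over several tables.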

26
OO Databases
  • Basically: OODB = persistent OO programming
    language
  • Very important concept
  • rather uninteresting scientifically
  • This topic will mainly be self-study
  • Reading book chapter + Q&A session

27
Data Warehousing and OLAP

28
Data Warehousing and OLAP
  • Transaction processing
  • Operational setting
  • Up-to-date critical
  • Simple data
  • Simple queries: only "touch" a small part of
    the database
  • Flight reservations
  • ticket sales
  • do not sell a seat twice
  • reservation, date, name
  • Give flight details of X
  • List flights to Y

29
Data Warehousing and OLAP
  • Decision support
  • Off-line setting
  • "Historical" data
  • Summarized data
  • Integrate different databases
  • Statistical queries
  • Flight company
  • Evaluate ROI flights
  • Flights of last year
  • passengers per carrier for destination X
  • Passengers, fuel costs, maintenance info
  • Average of seats sold/month/destination

30
Data Warehousing and OLAP
  • In this topic we will study
  • Conceptual models for decision support
  • Database explosion problem
  • Efficient implementation strategies
  • indexing, view materialization
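
A small sketch of one common reading of the database explosion problem: if every possible group-by over d dimensions is materialized as a view, there are 2^d aggregate tables, and together they can hold far more cells than the base data has rows. The fact table, carrier names, and function name below are made up for illustration.

```python
from collections import defaultdict
from itertools import combinations


def data_cube(rows, dimensions, measure):
    """Materialize the sum of `measure` for every subset of `dimensions`."""
    cube = {}
    for size in range(len(dimensions) + 1):
        for group_by in combinations(dimensions, size):
            totals = defaultdict(int)
            for row in rows:
                key = tuple(row[d] for d in group_by)
                totals[key] += row[measure]
            cube[group_by] = dict(totals)
    return cube


# Hypothetical fact table: seats sold per month, destination and carrier.
sales = [
    {"month": "Jan", "destination": "London", "carrier": "X-Air", "seats": 120},
    {"month": "Jan", "destination": "Paris", "carrier": "X-Air", "seats": 80},
    {"month": "Feb", "destination": "London", "carrier": "Y-Jet", "seats": 100},
    {"month": "Feb", "destination": "Paris", "carrier": "X-Air", "seats": 90},
]

cube = data_cube(sales, ("month", "destination", "carrier"), "seats")
group_bys = len(cube)                                   # 2 ** 3 views
total_cells = sum(len(view) for view in cube.values())  # cells over all views
```

Four base rows already produce 8 views with 21 cells in total, which is why choosing which views to materialize matters.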

31
XML
  • Why is XML important?
  • simple, open, non-proprietary, widely accepted data
    exchange format
  • XML is like HTML, but
  • no fixed set of tags
  • X = extensible
  • no fixed semantics (or representation) of tags
  • representation determined by separate
    stylesheet
  • semantics determined by application
  • no fixed structure
  • user-defined schemas

32
XML
<PersonList Type="Student" Date="2004-12-12">
  <Title Value="Student List"/>
  <Contents>
    <Person>
      <Name>Jan Vijs</Name>
      <Id>11</Id>
      <Address>
        <Number>123</Number>
        <Street>Turnstreet</Street>
      </Address>
    </Person>
    <Person>
      <Id>66</Id>
      <Address>
        <Street>Hole Rd</Street>
      </Address>
    </Person>
  </Contents>
</PersonList>

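
A short sketch of what "no fixed structure" means for an application reading this document, using Python's standard `xml.etree.ElementTree` module. The second person has no name and no house number, so the reader has to cope with missing elements; the dictionary keys are an arbitrary choice for this illustration.

```python
import xml.etree.ElementTree as ET

DOCUMENT = """
<PersonList Type="Student" Date="2004-12-12">
  <Title Value="Student List"/>
  <Contents>
    <Person>
      <Name>Jan Vijs</Name>
      <Id>11</Id>
      <Address>
        <Number>123</Number>
        <Street>Turnstreet</Street>
      </Address>
    </Person>
    <Person>
      <Id>66</Id>
      <Address>
        <Street>Hole Rd</Street>
      </Address>
    </Person>
  </Contents>
</PersonList>
"""

root = ET.fromstring(DOCUMENT.strip())
list_type = root.get("Type")  # attribute on the root element

people = []
for person in root.iter("Person"):
    people.append({
        "id": person.findtext("Id"),
        "name": person.findtext("Name"),              # None if the element is absent
        "street": person.findtext("Address/Street"),
        "number": person.findtext("Address/Number"),  # None if the element is absent
    })
```

In a relational table the missing values would be NULLs in fixed columns; here the elements are simply not present.
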
33
XML
  • In this topic
  • XML
  • XQuery, XSLT
  • LiXQuery
  • Taught by prof. Paredaens

34
Outline
  • Motivation for the course
  • Other DH courses
  • Practical organization
  • Course Topics
  • Project
  • Overview of changes

35
Project
  • Pick one of the 4 topics
  • deductive databases / rule-based systems
  • object-oriented databases
  • data warehouses
  • semi-structured databases
  • Formulate your own project
  • illustrating the different course concepts
  • showing you mastered the technology

36
Project
  • Make a project proposal (WEEK 10)
  • examples of last year will be given
  • fulfilling certain constraints
  • listing technologies to be used
  • Status report (WEEK 15)
  • Final report (WEEK 20)
  • Project presentations (WEEKS 21 and 22)

37
Outline
  • Motivation for the course
  • Other DH courses
  • Practical organization
  • Course Topics
  • Project
  • Overview of changes

38
Overview of Changes
  • First some facts and figures regarding Spring
    2008
  • Heterogeneous group
  • Outside NL, HBO, BSc TU/e
  • CSE
  • BIS

39
Overview of Changes
  • Some suggestions I decided to act upon
  • 1. Start with the difficult material
  • expressiveness of RA
  • Gaifman locality
  • 2. Too much time is being spent on XML
  • (55) -> (63); a topic (XSLT) has been added
  • 3. Disproportional weight given to XML in exam
  • project no longer exclusively XML

40
Overview of Changes
  • Some suggestions I decided to act upon
  • 4. Some materials and instruction just too hard
  • extra exercises will be added; more modular
  • 5. The course was split up in lots of individual
    subjects, with no apparent relation to one
    another
  • tried to handle that in the course motivation

41
Overview of Changes
  • Some suggestions that were ignored
  • A google for 'advanced databases' returns quite
    some courses from other universities that look
    interesting to me. Perhaps the lecturers could
    take a look at those.
  • When (re-)constructing the course last year, other
    universities' ADB courses were surveyed. Many of
    the interesting topics are already handled in
    other courses (Data Mining, Information
    Retrieval, Database Technology)

42
Overview of Changes
  • Some suggestions that were ignored
  • Don't discuss prerequisite knowledge too much, it
    is prerequisite.
  • -> Heterogeneous group.
  • Balance the course subjects more, TC was
    discussed very specific while the other 3
    subjects were treated in global.
  • -> Time spent on TC is justified by its difficulty
    and its importance for database theory; it
    motivates OODB and Deductive DB

43
Overview of Changes
  • Take-away message
  • (some?) lecturers do act on questionnaires
  • filling out the questionnaires is useful

45
Summary
  • Relational model has limitations
  • simple queries
  • simple data
  • OODBs allow complex data types
  • Deductive databases, Datalog: complex queries
  • Somewhere in-between: data warehouses and OLAP
  • special requirements, special data structures
  • Semi-structured data can be stored in XML
  • Project complements theoretical lectures
  • Instructions for clarification

46
!! See you on Friday !!