Provided by: csU62

Title: Program 3


1
Program 3
  • BCNF vs 3NF

2
BCNF
  • Written recursively
  • Pseudo code

doBCNF(schema r, fds)
  Let fd = BCNF-violating dependency in fds
  if fd is null then
    return
  else
    split r into two tables:
      table1 = fd.domain U fd.range
      table2 = r - fd.range
    doBCNF(table2, fds)
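
The pseudocode above can be sketched in Python as follows. This is a direct transcription, not the presenter's program: the names closure, find_violation and do_bcnf are made up for illustration, and a violation is assumed to be any listed FD whose left side lies in r but is not a superkey of r. Like the pseudocode, it recurses only on table2 and inspects only the listed FDs; a textbook version would also re-check table1.

```python
def closure(attrs, fds):
    """Return the attribute closure of attrs under fds (pairs of sets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def find_violation(r, fds):
    """Return the first listed FD that violates BCNF on schema r, or None."""
    for lhs, rhs in fds:
        rhs_in_r = (rhs & r) - lhs
        # Violation: the FD applies inside r and lhs is not a superkey of r.
        if lhs <= r and rhs_in_r and not r <= closure(lhs, fds):
            return lhs, rhs_in_r
    return None


def do_bcnf(r, fds):
    """Split off table1 = lhs U rhs, then recurse on table2 = r - rhs."""
    fd = find_violation(r, fds)
    if fd is None:
        return [r]
    lhs, rhs = fd
    return [lhs | rhs] + do_bcnf(r - rhs, fds)


# Class example: R = {C,T,H,R,S,G}, F = {CS->G, C->T, HR->C, HS->R, HT->R}
fds = [(set("CS"), set("G")), (set("C"), set("T")), (set("HR"), set("C")),
       (set("HS"), set("R")), (set("HT"), set("R"))]
tables = do_bcnf(set("CTHRSG"), fds)
print(["".join(sorted(t)) for t in tables])  # ['CGS', 'CT', 'CHR', 'HRS']
```

On this input the sketch yields the same four tables as the second class example in the BCNF sample slides.
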
3
BCNF Results
  • Average time taken to do decomposition
    • 36 ms
  • Pros
    • Easy to code
    • Fast
  • Cons
    • FDs lost (see sample inputs)

4
BCNF Sample Inputs
  • Example From Class
  • Input
    • R = {A, B, C, D, E, G}
    • F = {AB → C, C → A, BC → D, ACD → B, D → EG,
      BE → C, CG → BD, CE → AG}
  • Output
    • {C, A}, {D, E, G}, {B, C, D}
  • Lost FDs
    • ACD → B, AB → C, BE → C, CG → BD, CE → AG

5
Another BCNF Example
  • From Class Also
  • Input
    • R = {C, T, H, R, S, G}
    • F = {CS → G, C → T, HR → C, HS → R, HT → R}
  • Output
    • {C, S, G}, {C, T}, {H, R, C}, {H, R, S}
  • Lost FDs
    • HT → R
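
One way to confirm which FDs a decomposition loses is the standard dependency-preservation test: grow the left side using only what each table can derive locally, then check whether the right side was reached. The sketch below is illustrative; is_preserved is a made-up name and nothing here is taken from the presenter's code.

```python
def closure(attrs, fds):
    """Return the attribute closure of attrs under fds (pairs of sets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_preserved(fd, schemas, fds):
    """True if fd can be enforced using only FDs local to single schemas."""
    lhs, rhs = fd
    reached = set(lhs)
    changed = True
    while changed:
        changed = False
        for s in schemas:
            # Attributes of s derivable from the part of `reached` inside s.
            gained = (closure(reached & s, fds) & s) - reached
            if gained:
                reached |= gained
                changed = True
    return rhs <= reached


fds = [(set("CS"), set("G")), (set("C"), set("T")), (set("HR"), set("C")),
       (set("HS"), set("R")), (set("HT"), set("R"))]
schemas = [set("CSG"), set("CT"), set("HRC"), set("HRS")]
lost = [fd for fd in fds if not is_preserved(fd, schemas, fds)]
print([("".join(sorted(l)), "".join(sorted(r))) for l, r in lost])
# [('HT', 'R')]
```

For the output on this slide the check reports HT → R as the only lost dependency, which agrees with the list above.
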

6
Beers Example (from book)
  • Input
    • R = {beer, manf, bar, bar_addr, license, person, phone,
      person_addr, price}
    • F = {beer → manf, bar → {bar_addr, license},
      person → {person_addr, phone}, {bar, beer} → price}
  • Output
    • {beer, manf}, {bar, bar_addr, license},
      {person, phone, person_addr}, {beer, bar, price}
  • Lost FDs
    • None

7
Movies Example
  • Input
    • R = {Movie, Year, Director, Studio, LeadActor, LeadActress,
      Duration, BoxOfficeRevenue, Awards, Rating}
    • F = {{Movie, Year} → {Director, Studio, Duration,
      BoxOfficeRevenue, Awards, Rating},
      {Director, Studio, Year} → Movie,
      {LeadActor, Year} → Movie,
      {LeadActress, Year} → Movie}
  • Output
    • {Movie, Year, Director, Studio, Duration, BoxOfficeRevenue,
      Awards, Rating}, {Year, LeadActor, Movie},
      {Year, LeadActor, LeadActress}
  • Lost FDs
    • {LeadActress, Year} → Movie

8
3NF Synthesis
  • Implementation
    • Broken into modules, each performing a separate step in the
      algorithm
    • MinimalCover, MergeLHS, FormSubSchema, MergeSubSchema, AddMissing,
      AddKey
  • Average time taken: 200 ms
  • Pros
    • Lossless
  • Cons
    • Slow and complex
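
The first module named above, MinimalCover, is not shown in the slides. A minimal sketch of the usual three-step computation (split right sides, trim left sides, drop implied FDs) might look like the following; minimal_cover and closure are illustrative names, and the small input is invented for the demonstration.

```python
def closure(attrs, fds):
    """Return the attribute closure of attrs under fds (pairs of sets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def minimal_cover(fds):
    # Step 1: one attribute per right-hand side.
    cover = [(frozenset(lhs), frozenset([a]))
             for lhs, rhs in fds for a in sorted(rhs)]
    # Step 2: drop left-hand attributes that the other FDs already imply.
    for i, (lhs, rhs) in enumerate(cover):
        for b in sorted(lhs):
            smaller = lhs - {b}
            if smaller and rhs <= closure(smaller, cover):
                lhs = smaller
        cover[i] = (lhs, rhs)
    # Step 3: remove duplicates, then FDs implied by the remaining ones.
    cover = list(dict.fromkeys(cover))
    for fd in list(cover):
        rest = [g for g in cover if g != fd]
        if fd[1] <= closure(fd[0], rest):
            cover = rest
    return cover


# Invented input: A -> BC, B -> C, AB -> C
example = [(set("A"), set("BC")), (set("B"), set("C")), (set("AB"), set("C"))]
for lhs, rhs in minimal_cover(example):
    print("".join(sorted(lhs)), "->", "".join(sorted(rhs)))
# A -> B
# B -> C
```

A minimal cover is generally not unique, so a different attribute order can give a different but equivalent result on other inputs.
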

9
3NF Example 1
  • Example From Class
  • Input
    • R = {A, B, C, D, E, G}
    • F = {AB → C, C → A, BC → D, ACD → B, D → EG,
      BE → C, CG → BD, CE → AG}
  • Output
    • {A, B, C}, {B, C, D}, {D, E, G}, {B, E, C}, {C, G, B},
      {C, E, G}

10
3NF Example 2
  • From Class Also
  • Input
    • R = {C, T, H, R, S, G}
    • F = {CS → G, C → T, HR → C, HS → R, HT → R}
  • Output
    • {S, C, G}, {C, T}, {R, H, C}, {S, H, R}, {T, H, R}
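
The steps after the minimal cover can be sketched on this input, whose FD set is already minimal. The function below merges FDs with equal left sides, forms one sub-schema per group, drops sub-schemas contained in another, and adds a key only if no sub-schema is a superkey. The names synthesize_3nf and candidate_key are illustrative, and the mapping onto the presenter's module names is a guess.

```python
def closure(attrs, fds):
    """Return the attribute closure of attrs under fds (pairs of sets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_key(attrs, fds):
    """Shrink the full attribute set to one minimal key."""
    key = set(attrs)
    for a in sorted(attrs):
        if closure(key - {a}, fds) >= set(attrs):
            key.discard(a)
    return key


def synthesize_3nf(attrs, cover):
    # Merge FDs sharing a left side, then form one sub-schema per group.
    merged = {}
    for lhs, rhs in cover:
        merged.setdefault(frozenset(lhs), set()).update(rhs)
    schemas = [set(lhs) | rhs for lhs, rhs in merged.items()]
    # Drop duplicates and sub-schemas strictly contained in another one.
    kept = []
    for s in schemas:
        if s not in kept and not any(s < t for t in schemas):
            kept.append(s)
    # Add a key when no sub-schema is already a superkey of R.
    if not any(closure(s, cover) >= set(attrs) for s in kept):
        kept.append(candidate_key(attrs, cover))
    return kept


cover = [(set("CS"), set("G")), (set("C"), set("T")), (set("HR"), set("C")),
         (set("HS"), set("R")), (set("HT"), set("R"))]
result = synthesize_3nf(set("CTHRSG"), cover)
print(["".join(sorted(s)) for s in result])
# ['CGS', 'CT', 'CHR', 'HRS', 'HRT']
```

No key table is added here because {H, R, S} already contains the key HS; the five sub-schemas match the output listed above.
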

11
3NF Example 3
  • Input
    • R = {beer, manf, bar, bar_addr, license, person, phone,
      person_addr, price}
    • F = {beer → manf, bar → {bar_addr, license},
      person → {person_addr, phone}, {bar, beer} → price}
  • Output
    • {beer, manf}, {bar, bar_addr, license},
      {person, phone, person_addr}, {beer, bar, price}
  • Same as BCNF

12
3NF Example 4
  • Input
    • R = {Movie, Year, Director, Studio, LeadActor, LeadActress,
      Duration, BoxOfficeRevenue, Awards, Rating}
    • F = {{Movie, Year} → {Director, Studio, Duration,
      BoxOfficeRevenue, Awards, Rating},
      {Director, Studio, Year} → Movie,
      {LeadActor, Year} → Movie,
      {LeadActress, Year} → Movie}
  • Output
    • {Movie, Year, Director, Studio, Duration, BoxOfficeRevenue,
      Awards, Rating}, {Year, LeadActor, Movie},
      {Year, LeadActor, LeadActress}, {Year, LeadActress, Movie}

13
Hybrid Approach
  • First use 3NF synthesis to generate temporary schemas
  • Next, run BCNF decomposition on each temporary schema to decompose it
    further
  • Eliminate possible redundancies
  • Average time: 227 ms (to be expected, since both algorithms run)
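
The three steps above can be sketched as follows, taking the 3NF sub-schemas as given. This is an illustration under assumptions, not the presenter's code: violations are found by brute-force closure of every attribute subset (fine for small schemas only), both halves of a split are re-checked, and "redundancies" is read as tables strictly contained in another table.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return the attribute closure of attrs under fds (pairs of sets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def projected_violation(r, fds):
    """Return (lhs, rhs) for an FD implied on r whose lhs is no superkey of r."""
    for size in range(1, len(r)):
        for lhs in map(set, combinations(sorted(r), size)):
            implied = closure(lhs, fds) & r
            if lhs < implied < r:  # non-trivial, and lhs does not determine r
                return lhs, implied - lhs
    return None


def bcnf(r, fds):
    """Standard BCNF split, re-checking both resulting tables."""
    v = projected_violation(r, fds)
    if v is None:
        return [r]
    lhs, rhs = v
    return bcnf(lhs | rhs, fds) + bcnf(r - rhs, fds)


def hybrid(schemas_3nf, fds):
    pieces = [t for s in schemas_3nf for t in bcnf(set(s), fds)]
    kept = []
    for p in pieces:  # drop duplicates and tables contained in another table
        if p not in kept and not any(p < q for q in pieces):
            kept.append(p)
    return kept


fds = [(set("AB"), set("C")), (set("C"), set("A")), (set("BC"), set("D")),
       (set("ACD"), set("B")), (set("D"), set("EG")), (set("BE"), set("C")),
       (set("CG"), set("BD")), (set("CE"), set("AG"))]
three_nf = ["ABC", "BCD", "DEG", "BEC", "CGB", "CEG"]
result = hybrid(three_nf, fds)
print(["".join(sorted(t)) for t in result])
# ['AC', 'BCD', 'DEG', 'BCE', 'BCG', 'CEG']
```

Fed the sub-schemas from 3NF Example 1, only {A, B, C} is split (on C → A), and the leftover {B, C} is dropped because {B, C, D} contains it.
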

14
Hybrid Example 1
  • Example From Class
  • Input
    • R = {A, B, C, D, E, G}
    • F = {AB → C, C → A, BC → D, ACD → B, D → EG,
      BE → C, CG → BD, CE → AG}
  • Output
    • {C, A}, {B, C, D}, {D, E, G}, {E, B, C}, {G, C, B},
      {E, C, G}
  • Lost FDs
    • ?

15
Hybrid Movies
  • Input
    • R = {Movie, Year, Director, Studio, LeadActor, LeadActress,
      Duration, BoxOfficeRevenue, Awards, Rating}
    • F = {{Movie, Year} → {Director, Studio, Duration,
      BoxOfficeRevenue, Awards, Rating},
      {Director, Studio, Year} → Movie,
      {LeadActor, Year} → Movie,
      {LeadActress, Year} → Movie}
  • Output
    • {Movie, Year, Director, Studio, Duration, BoxOfficeRevenue,
      Awards, Rating}, {Year, LeadActor, Movie},
      {Year, LeadActor, LeadActress}, {Year, LeadActress, Movie}
  • Same as 3NF

16
Program 4
  • Data Mining

17
Data Mining
  • Artificial Neural Network
    • Weka implementation
    • ANN with error backpropagation
  • Oracle-XE Database
    • Java connectivity
    • Oracle Thin Driver
  • Retrieve the database data with Java, then use Weka to build the ANN
    and classify

18
Data Set
  • Census Data
    • Predict income class of US citizens given census data
  • Income class: above or below 50,000
    • For simplicity (difficult to predict exact income)
  • 30,000 tuples
  • CSV
    • http://cs.uga.edu/mcknight/nlp/data.csv
  • Database (login required)
    • http://128.192.101.749090/apex

19
Results
  • ANN with 27 internal nodes, 55 input nodes
  • Training Data
    • 66% of original data used for training
    • the remaining 33% used as test set for classification
  • Training time: 2050.91 seconds
  • Error

    Correctly classified           84.0665%
    Mean absolute error            0.1698
    Root mean squared error        0.3558
    Relative absolute error        46.6323%
    Root relative squared error    83.9026%
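
For readers unfamiliar with the four error figures, the sketch below computes them from their general definitions on a toy set of numeric predictions. It is a simplification: the relative errors here use the mean of the actual values as the baseline predictor, whereas Weka derives its baseline from the training data and works on class probability estimates, so the numbers above would not be reproduced exactly this way. The function name and the toy data are invented.

```python
import math


def error_report(actual, predicted):
    """Compute MAE, RMSE and the two baseline-relative errors (in percent)."""
    n = len(actual)
    mean_actual = sum(actual) / n
    abs_err = sum(abs(p - a) for a, p in zip(actual, predicted))
    sq_err = sum((p - a) ** 2 for a, p in zip(actual, predicted))
    # Errors of the baseline that always predicts the mean of the actual values.
    base_abs = sum(abs(mean_actual - a) for a in actual)
    base_sq = sum((mean_actual - a) ** 2 for a in actual)
    return {
        "mean_absolute_error": abs_err / n,
        "root_mean_squared_error": math.sqrt(sq_err / n),
        "relative_absolute_error_pct": 100 * abs_err / base_abs,
        "root_relative_squared_error_pct": 100 * math.sqrt(sq_err / base_sq),
    }


# Toy data: 1 = high income class, 0 = low; predictions are class scores.
report = error_report([1, 0, 0, 1], [0.8, 0.2, 0.4, 0.6])
for name, value in report.items():
    print(f"{name}: {value:.4f}")
```

A relative error below 100% means the model beats the baseline, which is how the 46.6323% and 83.9026% figures are usually read.
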