CS 347: Distributed Databases and Transaction Processing, Notes 03: Query Processing

1
CS 347: Distributed Databases and Transaction Processing
Notes 03: Query Processing
  • Hector Garcia-Molina

2
Query Processing
  • Decomposition
  • Localization
  • Optimization

3
Decomposition
  • Same as in centralized system
  • Normalization
  • Eliminating redundancy
  • Algebraic rewriting

4
Normalization
  • Convert from general language to a standard
    form (e.g., Relational Algebra)

5
Example
  • Select A,C
  • From R,S
  • Where (R.B1 and S.D2) or (R.Cgt3 and S.D.2)
  • ? (R.B1 v R.Cgt3) ? S.D.2
  • R S

?
A, C
Conjunctive normal form
6
Also detect invalid expressions
  • E.g., SELECT ... FROM R WHERE R.A = 3 is invalid if R does
    not have attribute A

7
Eliminate redundancy
  • E.g., in conditions:
  • (S.A=1) ∧ (S.A>5) ⇒ False
  • (S.A<10) ∧ (S.A<5) ⇒ S.A<5

8
E.g., common sub-expressions
  • (σ_{cond} R ⋈ S) ∪ (σ_{cond} R ⋈ T) ⇒ σ_{cond} R ⋈ (S ∪ T)
  • (figure: the shared σ_{cond} R subtree is computed once)

9
Algebraic rewriting
  • E.g., push conditions down:
  • σ_{cond} (R ⋈ S) ⇒ σ_{cond3} (σ_{cond1} R ⋈ σ_{cond2} S)
    (cond1 references only R, cond2 only S, cond3 the rest)

10
  • After decomposition:
    one or more algebraic query trees on relations
  • Localization:
    replace relations by their corresponding fragments

11
Localization steps
  • (1) Start with query
  • (2) Replace relations by fragments
  • (3) Push ∪ up, push σ, π down (use CS 245 rules)
  • (4) Simplify: eliminate unnecessary operations

12
Notation for fragments
  • R [cond]
  • a fragment R, where cond gives the conditions its tuples
    satisfy

13
Example A
  • (1) σ_{E=3} R

14
  • (2) σ_{E=3} ( R1 [E<10] ∪ R2 [E≥10] )

15
  • (3) σ_{E=3} R1 [E<10] ∪ σ_{E=3} R2 [E≥10]

17
  • (4) σ_{E=3} R1 [E<10]

18
Rule 1
  • (A) σ_{C1} R [C2] ⇒ σ_{C1} R [C1 ∧ C2]
  • (B) R [False] ⇒ Ø
19
In example A
  • σ_{E=3} R2 [E≥10] ⇒ σ_{E=3} R2 [E=3 ∧ E≥10]
  • ⇒ σ_{E=3} R2 [False]
  • ⇒ Ø
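
Rule 1's pruning step can be sketched in code. A minimal sketch (all names hypothetical), modeling each condition as a key range so that an unsatisfiable combined condition shows up as two ranges that do not overlap:

```python
# Hypothetical sketch of localization Rule 1: the selection condition is
# ANDed with each fragment's condition, and fragments whose combined
# condition is unsatisfiable are dropped. Conditions are modeled here as
# half-open key ranges [lo, hi).

def overlaps(lo1, hi1, lo2, hi2):
    """True iff ranges [lo1, hi1) and [lo2, hi2) share at least one value."""
    return max(lo1, lo2) < min(hi1, hi2)

def prune_fragments(pred_lo, pred_hi, fragments):
    """Keep only fragments whose range can satisfy the predicate range."""
    return [name for name, lo, hi in fragments
            if overlaps(pred_lo, pred_hi, lo, hi)]

INF = float("inf")
# Example A: sigma_{E=3} over R1 [E < 10] and R2 [E >= 10]
frags = [("R1", -INF, 10), ("R2", 10, INF)]
print(prune_fragments(3, 4, frags))   # E=3 is the range [3, 4): only R1 survives
```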

20
Example B
  • (1) R ⋈ S
  • (A is the common attribute)
21
  • (2) ( R1 [A<5] ∪ R2 [5≤A≤10] ∪ R3 [A>10] )
    ⋈ ( S1 [A<5] ∪ S2 [A≥5] )
22
  • (3) R1 [A<5] ⋈ S1 [A<5] ∪ R1 [A<5] ⋈ S2 [A≥5]
    ∪ R2 [5≤A≤10] ⋈ S1 [A<5] ∪ R2 [5≤A≤10] ⋈ S2 [A≥5]
    ∪ R3 [A>10] ⋈ S1 [A<5] ∪ R3 [A>10] ⋈ S2 [A≥5]

24
  • (4) R1 [A<5] ⋈ S1 [A<5] ∪ R2 [5≤A≤10] ⋈ S2 [A≥5]
    ∪ R3 [A>10] ⋈ S2 [A≥5]
25
Rule 2
  • R [C1] ⋈ S [C2] ⇒
  • ( R ⋈ S ) [C1 ∧ C2 ∧ R.A=S.A]

26
⇒ In step 4 of Example B
  • R1 [A<5] ⋈ S2 [A≥5]
  • ⇒ ( R1 ⋈ S2 ) [R1.A<5 ∧ S2.A≥5 ∧ R1.A=S2.A]
  • ⇒ ( R1 ⋈ S2 ) [False] ⇒ Ø
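
The pruning in steps 3 and 4 can be sketched the same way. A hypothetical sketch that expands the join of unions into all fragment pairs and keeps only pairs whose ranges can share a join value (integer keys assumed, so 5 ≤ A ≤ 10 is modeled as the half-open range [5, 11)):

```python
# Hypothetical sketch of localization Rule 2 for Example B: the join of two
# unions expands into all fragment pairs, and pairs whose combined condition
# (Ri's range AND Sj's range AND Ri.A = Sj.A) is unsatisfiable are dropped.

INF = float("inf")

def compatible(r_range, s_range):
    """A shared join value can exist only if the two ranges intersect."""
    (rlo, rhi), (slo, shi) = r_range, s_range
    return max(rlo, slo) < min(rhi, shi)

R_frags = {"R1": (-INF, 5), "R2": (5, 11), "R3": (11, INF)}  # A<5, 5<=A<=10, A>10
S_frags = {"S1": (-INF, 5), "S2": (5, INF)}                  # A<5, A>=5

pairs = [(r, s) for r in R_frags for s in S_frags
         if compatible(R_frags[r], S_frags[s])]
print(pairs)  # only (R1,S1), (R2,S2), (R3,S2) survive, as on the slide
```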

27
Localization with derived fragmentation
  • Example C
  • (2) ( R1 [A<10] ∪ R2 [A≥10] )
    ⋈ ( S1 [K=R.K ∧ R.A<10] ∪ S2 [K=R.K ∧ R.A≥10] )

28
  • (3) R1 ⋈ S1 ∪ R1 ⋈ S2 ∪ R2 ⋈ S1 ∪ R2 ⋈ S2

29
  • (4) R1 [A<10] ⋈ S1 [K=R.K ∧ R.A<10]
    ∪ R2 [A≥10] ⋈ S2 [K=R.K ∧ R.A≥10]

30
⇒ In step 4 of Example C
  • R1 [A<10] ⋈ S2 [K=R.K ∧ R.A≥10]
  • ⇒ ( R1 ⋈ S2 ) [R1.A<10 ∧ S2.K=R.K ∧ R.A≥10 ∧ R1.K=S2.K]
  • ⇒ ( R1 ⋈ S2 ) [False]   (K is key of R and of R1)
  • ⇒ Ø

31
  • (4) R1 [A<10] ⋈ S1 [K=R.K ∧ R.A<10]
    ∪ R2 [A≥10] ⋈ S2 [K=R.K ∧ R.A≥10]
  • (4) simplified more:
    ( R1 ⋈ S1 ) ∪ ( R2 ⋈ S2 )
32
Localization with vertical fragmentation
  • Example D
  • (1) π_A R, where R is vertically fragmented as
    R1(K, A, B) and R2(K, C, D)

33
  • (2) π_A ( R1(K, A, B) ⋈ R2(K, C, D) )

34
  • (3) π_A ( π_{K,A} R1 ⋈ π_{K,A} R2 )
  • (the join with π_{K,A} R2 is not really needed)
35
  • (4) π_A R1(K, A, B)

36
Rule 3
  • Given a vertical fragmentation of R:
    Ri = π_{Ai}(R), with ∪ Ai = A
  • Then for any B ⊆ A:
    π_B (R) = π_B ( ⋈ { Ri | B ∩ Ai ≠ Ø } )
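
Rule 3 amounts to picking only the fragments whose attribute sets overlap the projection list. A small sketch (names hypothetical):

```python
# Hypothetical sketch of Rule 3: to compute pi_B(R) over vertical fragments
# Ri = pi_Ai(R), only fragments whose attribute set intersects B are needed
# (each fragment also carries the key K, used to join them back together).

def needed_fragments(B, fragments):
    """Return names of fragments contributing at least one attribute of B."""
    return [name for name, attrs in fragments.items() if attrs & set(B)]

frags = {"R1": {"A", "B"}, "R2": {"C", "D"}}   # key K stored in both
print(needed_fragments({"A"}, frags))          # only R1: the join is dropped
print(needed_fragments({"A", "C"}, frags))     # both fragments needed
```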

37
Localization with hybrid fragmentation
  • Example E
  • R1 = σ_{k<5} π_{k,A} R
  • R2 = σ_{k≥5} π_{k,A} R
  • R3 = π_{k,B} R

38
  • Query: π_A σ_{k=3} R

39
  • Reduced query: π_A σ_{k=3} R1

40
Summary - Query Processing
  • Decomposition ✓
  • Localization ✓
  • Optimization
  • Overview
  • Tricks for joins and other operations
  • Strategies for optimization

41
Optimization Process
  • Generate query plans P1, P2, P3, ..., Pn
  • Estimate size of intermediate results
  • Estimate cost Ci of each plan ($, time, ...)
  • Pick the plan with minimum cost
42
Differences with centralized optimization
  • New strategies for some operations (semi-join,
    range-partitioning, sort, ...)
  • Many ways to assign and schedule processors

43
Parallel/distributed sort
  • Input: (a) relation R on a single site/disk
  • (b) R fragmented/partitioned by sort attribute
  • (c) R fragmented/partitioned by some other attribute

44
  • Output: (a) sorted R on a single site/disk
  • (b) fragments/partitions sorted: F1, F2, F3
  • (figure: each fragment internally sorted, with all keys in
    F1 below those in F2, and those below F3's)

45
Basic sort
  • R(K, ...), sort on K
  • Fragmented on K with vector k0, k1, ..., kn
  • (figure: keys below k0 fall in the first fragment, keys
    between k0 and k1 in the next, and so on)

46
  • Algorithm: each fragment is sorted independently
  • If necessary, ship results

47
⇒ Same idea on different architectures
  • Shared nothing: P1 sorts F1, P2 sorts F2, connected by a
    network
  • Shared memory: processors P1, P2 sort F1, F2 through shared
    memory M

48
Range partitioning sort
  • R(K, ...), sort on K
  • R located at one or more sites/disks, not fragmented on K

49
  • Algorithm:
  • (a) Range partition on K
  • (b) Basic sort
  • (figure: Ra, Rb are range-partitioned by vector [k0, k1]
    into R1, R2, R3; each Ri is sorted locally; the
    concatenation of R1, R2, R3 is the result)

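
The two steps above can be sketched in a single process, with lists standing in for sites. A minimal sketch, not a distributed implementation:

```python
# Single-process sketch of range-partition sort: tuples are routed to
# "sites" by a partition vector on the sort key, each site sorts locally,
# and concatenating the sites' outputs in vector order yields sorted R.
import bisect

def range_partition_sort(tuples, vector, key=lambda t: t):
    # bisect picks the partition: keys < vector[0] go to site 0, etc.
    sites = [[] for _ in range(len(vector) + 1)]
    for t in tuples:
        sites[bisect.bisect_right(vector, key(t))].append(t)
    for site in sites:            # "local sort" at each site, independently
        site.sort(key=key)
    return [t for site in sites for t in site]

R = [20, 3, 22, 17, 14, 7, 27, 11, 10]
print(range_partition_sort(R, vector=[10, 20]))
# [3, 7, 10, 11, 14, 17, 20, 22, 27]
```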
50
Selecting a good partition vector
  • Sites Ra, Rb, Rc each hold an unsorted piece of R
  • (figure: sample key values at each site; a bad vector sends
    most tuples to one site)

51
Example
  • Each site sends to coordinator:
  • Min sort key
  • Max sort key
  • Number of tuples
  • Coordinator computes vector and distributes it to sites
  • (also decides number of sites for local sorts)

52
Sample scenario
  • Coordinator receives:
  • SA: Min=5, Max=10, 10 tuples
  • SB: Min=7, Max=17, 10 tuples

55
  • Expected tuples with key < k0 = (total tuples) / 2
  • 2(k0 - 5) + (k0 - 7) = 10
  • 3·k0 = 10 + 10 + 7 = 27
  • k0 = 9
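
The coordinator's computation above can be sketched as follows, assuming keys are uniformly distributed within each site's [min, max] range and that the split point falls inside every site's range (as it does here):

```python
# Sketch of the coordinator's split-point computation: pick k0 so that the
# expected number of tuples with key < k0 equals half the total. Each site's
# tuple density is n / (max - min); valid only while k0 lies inside every
# site's [min, max] range, which holds for the slide's numbers.

def split_point(stats):
    """stats: list of (min_key, max_key, n_tuples) per site."""
    total = sum(n for _, _, n in stats)
    dens = [(lo, n / (hi - lo)) for lo, hi, n in stats]
    # solve sum_i d_i * (k0 - lo_i) = total / 2, linear in k0
    d_sum = sum(d for _, d in dens)
    return (total / 2 + sum(d * lo for lo, d in dens)) / d_sum

print(split_point([(5, 10, 10), (7, 17, 10)]))  # 9.0, matching the slide
```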

56
Variations
  • Send more info to coordinator:
  • Partition vector for the local site
    e.g., SA: 3 + 3 + 3 tuples, local vector 5 | 6 | 8 | 10
  • Histogram of key values (e.g., buckets at 5, 6, 7, 8, 9, 10)

57
⇒ More than one round
  • E.g.: (1) Sites send range and number of tuples
  • (2) Coordinator returns preliminary vector V0
  • (3) Sites tell coordinator how many tuples fall in each V0
    range
  • (4) Coordinator computes final vector Vf

58
⇒ Can you come up with a distributed algorithm?
  • (no coordinator)

59
Parallel external sort-merge
  • Same as range-partition sort, except sort first
  • (figure: Ra, Rb are each sorted locally first, then
    range-partitioned by [k0, k1] into R1, R2, R3; each Ri
    merges its sorted input streams, so R1, R2, R3 come out in
    order)

60
Parallel external sort-merge (cont.)
  • Note: can use a merging network if available (e.g.,
    Teradata)

61
Parallel/distributed Join
  • Input: relations R, S, which may or may not be partitioned
  • Output: R ⋈ S
  • Result at one or more sites

62
Partitioned Join (Equi-join)
  • (figure: Ra, Rb partitioned by the same function f(A) on the
    join attribute; matching partitions are joined locally; the
    union of the local joins is the result)

63
Notes
  • The same partition function f is used for both R and S
    (applied to the join attribute)
  • f can be range or hash partitioning
  • The local join can be of any type (use any CS 245
    optimization)
  • Various scheduling options, e.g.: (a) partition R; partition
    S; join. (b) partition R; build local hash tables for R;
    then partition S and join
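
Scheduling aside, the core of the partitioned equi-join can be sketched in one process, with a dict of partitions standing in for sites (all names hypothetical):

```python
# Single-process sketch of the partitioned equi-join: both relations are
# partitioned by the same hash function f on the join attribute, so matching
# tuples always land in the same partition and every join stays local.
from collections import defaultdict

def partition(rel, key_pos, n):
    parts = defaultdict(list)
    for t in rel:
        parts[hash(t[key_pos]) % n].append(t)   # same f for R and S
    return parts

def partitioned_join(R, S, n=3):
    Rp, Sp = partition(R, 0, n), partition(S, 0, n)
    out = []
    for i in range(n):                           # each "site" joins locally
        build = defaultdict(list)
        for r in Rp.get(i, []):
            build[r[0]].append(r)                # build hash table on R
        for s in Sp.get(i, []):                  # probe with S tuples
            for r in build.get(s[0], []):
                out.append(r + s[1:])
    return out

R = [(10, "b"), (25, "c"), (2, "a")]
S = [(10, "y"), (25, "w"), (3, "x")]
print(sorted(partitioned_join(R, S)))  # [(10, 'b', 'y'), (25, 'c', 'w')]
```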

64
More notes
  • We already know why partition-join works:
  • (R1 ∪ R2 ∪ R3) ⋈ (S1 ∪ S2 ∪ S3) =
    (R1 ⋈ S1) ∪ (R2 ⋈ S2) ∪ (R3 ⋈ S3)
  • Useful to give this type of join a name, because we may want
    to partition data to make partition-join possible
    (especially in a parallel DB system)
65
Even more notes
  • Selecting a good partition function f is very important:
  • Number of fragments
  • Hash function
  • Partition vector

66
  • Good partition vector
  • Goal: make each |Ri ⋈ Si| the same
  • Can use a coordinator to select it

67
Asymmetric fragment-and-replicate join
  • (figure: R is partitioned across sites by f; S is replicated
    in full to every site (Sa, Sb); each site joins its R
    fragment with all of S; the union of the local joins is the
    result)

68
Notes
  • Can use any partition function f for R (even round robin)
  • Can do any join, not just equi-join, e.g.:
    R ⋈_{R.A < S.B} S

69
General fragment-and-replicate join
  • Partition R with f1 into 3 fragments; make n copies of each
    fragment
  • (figure: Ra, Rb are partitioned into R1, R2, R3, and each Ri
    is replicated)

70
⇒ S is partitioned in a similar fashion
  • Result: all n×m pairings of R, S fragments are joined
    (R1 ⋈ S1, R1 ⋈ S2, R2 ⋈ S1, ..., R3 ⋈ S2)

71
Notes
  • Asymmetric FR join is a special case of general FR
  • Asymmetric FR may be good if S is small
  • Works for non-equi-joins

72
Semi-join
  • Goal: reduce communication traffic
  • R ⋈ S = (R ⋉ S) ⋈ S, or
  • = R ⋈ (S ⋉ R), or
  • = (R ⋉ S) ⋈ (S ⋉ R)
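
The identity above can be checked with a small sketch, using the first attribute of each tuple as the join attribute A:

```python
# Sketch of the semi-join identity: R semijoin S keeps only the R-tuples
# whose join key appears in S, so R join S == (R semijoin S) join S.

def semijoin(R, S):
    s_keys = {s[0] for s in S}                  # pi_A(S), shipped to R's site
    return [r for r in R if r[0] in s_keys]     # R semijoin S

def join(R, S):
    s_by_key = {}
    for s in S:
        s_by_key.setdefault(s[0], []).append(s)
    return [r + s[1:] for r in R for s in s_by_key.get(r[0], [])]

R = [(2, "a"), (10, "b"), (25, "c"), (30, "d")]
S = [(3, "x"), (10, "y"), (15, "z"), (25, "w"), (32, "x")]
assert join(R, S) == join(semijoin(R, S), S)    # the identity holds
print(semijoin(R, S))                           # [(10, 'b'), (25, 'c')]
```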

73
Example: R ⋉ S
  • R(A, B): (2, a), (10, b), (25, c), (30, d)
  • S(A, C): (3, x), (10, y), (15, z), (25, w), (32, x)
  • Matching A values are 10 and 25, so R ⋉ S = (10, b), (25, c)

76
  • R(A, B) and S(A, C) as above; R ⋉ S has 2 tuples
  • Computing transmitted data in the example:
  • with semi-join R ⋈ (S ⋉ R):
    T = 4·size(A) + 2·size(A,C) + (result)
  • with join R ⋈ S:
    T = 4·size(A,B) + (result)

78
⇒ In general
  • Say R is the smaller relation
  • (R ⋉ S) ⋈ S is better than R ⋈ S if
    size(π_A S) + size(R ⋉ S) < size(R)

79
  • Similar comparisons hold for the other semi-join forms
  • Remember: we are only taking transmission cost into account

80
  • Trick:
  • Encode π_A S (or π_A R) as a bit vector,
    one bit per possible key:
  • 0 0 1 1 0 1 0 0 0 0 1 0 1 0 0
  • (bit i is set iff key i appears in S)
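
A minimal sketch of the bit-vector trick (vector size and hash choice are arbitrary here); like any such filter it can admit false positives, but it never drops a matching tuple:

```python
# Sketch of the bit-vector trick: instead of shipping pi_A(S) as values,
# hash each key into a fixed-size bit vector. Two keys may share a bit, so
# the reduced R can keep a few non-matching tuples (false positives), but
# every matching tuple survives.

def make_bit_vector(keys, nbits):
    bits = 0
    for k in keys:
        bits |= 1 << (hash(k) % nbits)          # set one bit per key
    return bits

def reduce_with_bits(R, bits, nbits):
    return [r for r in R if bits >> (hash(r[0]) % nbits) & 1]

S_keys = [3, 10, 15, 25, 32]
bv = make_bit_vector(S_keys, nbits=64)          # 64 bits instead of 5 values
R = [(2, "a"), (10, "b"), (25, "c"), (30, "d")]
print(reduce_with_bits(R, bv, 64))              # keeps (10, b) and (25, c)
```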
81
⇒ Three-way joins with semi-joins
  • Goal: R ⋈ S ⋈ T
84
  • ⇒ Many options!
  • ⇒ The number of semi-join options is exponential in the
    number of relations in the join

85
Privacy Preserving Join
  • Site 1 has R(A, B)
  • Site 2 has S(A, C)
  • Want to compute R ⋈ S
  • Site 1 should NOT discover any S info not in the join
  • Site 2 should NOT discover any R info not in the join
86
Semi-Join Does Not Work
  • If Site 1 sends π_A R to Site 2, Site 2 learns all keys
    of R!
  • π_A R = (a1, a2, a3, a4)
87
Fix Send hashed keys
  • Site 1 hashes each value of A before sending
  • Site 2 hashes (with the same function) its own A values to
    see which tuples match
  • Site 1 sends: h(a1), h(a2), h(a3), h(a4)
  • Site 2 sees it has h(a1), h(a3), i.e., tuples (a1, c1),
    (a3, c3)
89
What is the problem?
  • Site 1 sent h(a1), h(a2), h(a3), h(a4); Site 2 sees it has
    h(a1), h(a3), i.e., (a1, c1), (a3, c3)
  • Dictionary attack! Site 2 takes all possible keys a1, a2,
    a3, ... and checks whether h(a1), h(a2), h(a3), ... match
    what Site 1 sent...

90
Adversary Model
  • Honest but Curious
  • A dictionary attack is possible (the cheating is internal
    and can't be caught)
  • Sending incorrect keys is not possible (the cheater could be
    caught)

91
One Solution (Agrawal et al.)
  • Use a commutative encryption function
  • Ei(x) = x encrypted using site i's private key
  • E1( E2(x) ) = E2( E1(x) )
  • Shorthand for the example: E1(x) is x′, E2(x) is x″,
    E1(E2(x)) is x‴

92
Solution
  • Site 1 sends E1(a1), E1(a2), E1(a3), E1(a4)
  • Site 2 returns E2(E1(a1)), ..., E2(E1(a4)), plus its own
    keys encrypted: E2(a1), E2(a3), E2(a5), E2(a7)
  • Site 1 computes E1(E2(a1)), E1(E2(a3)), E1(E2(a5)),
    E1(E2(a7)) and intersects with the doubly encrypted R keys
  • Matches identify the joining tuples: (a1, b1), (a3, b3)
93
Why does this solution work?
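
One way to see why: commutative encryption can be instantiated (as a toy here, with small assumed parameters, not production crypto) by modular exponentiation, Ei(x) = x^ei mod p, since exponents commute:

```python
# Toy sketch of a commutative encryption scheme via modular exponentiation
# (Pohlig-Hellman style; parameters assumed for illustration only). Since
# (x^e1)^e2 = (x^e2)^e1 mod p, both sites can doubly encrypt, and equal keys
# give equal double encryptions without either site seeing the other's
# plaintext keys.

p = 2**61 - 1                 # a Mersenne prime; toy parameters
e1, e2 = 65537, 257           # each site's private exponent

E1 = lambda x: pow(x, e1, p)
E2 = lambda x: pow(x, e2, p)

r_keys = [11, 22, 33, 44]     # Site 1's join keys
s_keys = [11, 33, 55, 77]     # Site 2's join keys

assert all(E1(E2(k)) == E2(E1(k)) for k in r_keys + s_keys)  # commutes

double_r = {E2(E1(k)): k for k in r_keys}   # Site 1 gets these back
double_s = {E1(E2(k)) for k in s_keys}      # and computes these itself
print(sorted(k for c, k in double_r.items() if c in double_s))  # [11, 33]
```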
94
Other Privacy Preserving Operations?
  • Inequality join: R ⋈ S with a condition such as R.A < S.A
  • Similarity join: R ⋈ S matching approximately equal values

95
Other parallel operations
  • Duplicate elimination
  • Sort first (in parallel), then eliminate duplicates in the
    result
  • Or: partition tuples (by range or hash) and eliminate
    duplicates locally
  • Aggregates
  • Partition by grouping attributes; compute the aggregate
    locally
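
The aggregate strategy above, sketched in a single process with each fragment's partial sums combined at the end:

```python
# Single-process sketch of the parallel aggregate: each "site" computes
# partial sums per dept over its own fragment, then the partials are
# combined per group; only one (dept, sum) pair per group crosses the
# network per site.
from collections import Counter

def local_sums(fragment):
    sums = Counter()
    for dept, sal in fragment:
        sums[dept] += sal
    return sums

Ra = [("toy", 10), ("sales", 20), ("toy", 5)]
Rb = [("sales", 30), ("hr", 7)]

combined = local_sums(Ra) + local_sums(Rb)   # Counter '+' adds per key
print(dict(combined))  # {'toy': 15, 'sales': 50, 'hr': 7}
```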

96
Example
  • sum(sal) group by dept
  • (figure: Ra and Rb are partitioned by dept; each site
    computes its local sums; aggregating before shipping means
    less data transmitted!)

102
Preview: MapReduce
  • (figure: data items A1, B1, C1, A2, B2, C2, A3 scattered
    across inputs are regrouped by key)
103
⇒ Enhancements for aggregates
  • Perform the aggregate during partitioning to reduce the data
    transmitted
  • Does not work for all aggregate functions
  • Which ones?
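
A quick way to see which aggregates tolerate pre-aggregation: those whose partial results combine associatively (sum, count, min, max) do, while holistic ones such as median do not, as this sketch shows on a split of the same data:

```python
# Which aggregates survive pre-aggregation during partitioning? Ones whose
# partial results combine correctly (sum, count, min, max) do; holistic
# aggregates such as median do not.
import statistics

part1, part2 = [1, 2, 3], [4, 100]
whole = part1 + part2

# sum combines correctly from partial results:
assert sum([sum(part1), sum(part2)]) == sum(whole)

# but the median of medians is NOT the median of the whole:
m = statistics.median([statistics.median(part1), statistics.median(part2)])
print(m, statistics.median(whole))  # 27.0 vs 3: pre-aggregation breaks it
```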

104
⇒ Selection
  • Range or hash partition
  • Straightforward
  • ⇒ But what about indexes?

105
⇒ Indexing
  • Can think of the partition vector as the root of a
    distributed index
  • (figure: vector [k0, k1] routes lookups to local indexes at
    Sites 1, 2, 3)

106
Index on a non-partition attribute
  • (figure: the partition vector [k0, k1] spans the index
    sites, whose entries point to tuples held at separate tuple
    sites)

107
Notes
  • If the index is not too big, it may be better to keep it
    whole and make copies...
  • If updates are frequent, we can partition the update
    work... (Question: how do we handle splits of B-tree
    pages?)

108
  • Extensible or linear hashing
  • (figure: hash function f maps keys to buckets R1, R2, R3;
    bucket R4 is added as the table grows)

109
  • How do we adapt these schemes?
  • Where do we store the directory, the set of
    participants...?
  • Which one is better for a distributed environment?
  • Can we design a hashing scheme with no global knowledge
    (P2P)?

110
Summary: Query processing
  • Decomposition and Localization ✓
  • Optimization
  • Overview ✓
  • Tricks for joins, sort, ... ✓
  • Tricks for inter-operation parallelism
  • Strategies for optimization

111
Inter-operation parallelism
  • Pipelined
  • Independent

112
Pipelined parallelism
  • (figure: one site computes σ_c on S and streams the
    matching tuples into a join probe against R at the other
    site; the stages overlap as a pipeline, so the result is
    produced without materializing σ_c(S))

113
Independent parallelism
  • R ⋈ S ⋈ T ⋈ V
  • (1) temp1 ← R ⋈ S at Site 1; temp2 ← T ⋈ V at Site 2
  • (2) result ← temp1 ⋈ temp2

114
  • Pipelining cannot be used in all cases
  • E.g., hash join: the build phase must consume the entire
    stream of R tuples before the stream of S tuples can probe
115
Summary
  • As we consider query plans for optimization, we must
    consider various tricks:
  • for individual operations
  • for scheduling operations