Probabilistic Ranking of Database Query Results

Provided by: wha93

Learn more at: https://crystal.uta.edu

Transcript and Presenter's Notes

1
Probabilistic Ranking of Database Query Results

Presented by Z.M. Joseph, Spring 2006, CSE, UT Arlington
2
Introduction

3
Challenge

4
Approach

5
Recall from PIR

6
Structured Data

Simplifies to
This automatically increases probability for
unspecified attributes that occur more in the
ideal tuple set R
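
The simplification can be sketched as follows. This is a reconstruction under assumptions, not the slide's own formula: it assumes the usual PIR odds-ratio score, with D standing for the whole database, X for the attribute values the query specifies, and Y for the values of the unspecified attributes of a tuple t.

```latex
\begin{align*}
\mathrm{Score}(t) &\propto \frac{P(t \mid R)}{P(t \mid D)}
  = \frac{P(X, Y \mid R)}{P(X, Y \mid D)}
  = \frac{P(X \mid R)}{P(X \mid D)} \cdot \frac{P(Y \mid X, R)}{P(Y \mid X, D)} \\
 &\propto \frac{P(Y \mid R)}{P(Y \mid X, D)}
\end{align*}
```

Under these assumptions the first factor is the same for every tuple that satisfies the query, so it drops out of the ranking, and since every tuple in R satisfies X, P(Y | X, R) reduces to P(Y | R). A tuple's score therefore rises when its unspecified values are common in R.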

7
Limited Independence Assumptions

8
Workload-Based R Estimation

In order to use these techniques, the ideal result set R must be known.
Estimate it from statistics gathered from the workload W.
View the workload as a set of tuples, one per query, each holding the attributes that query specified.
Thus P(y | R) can be replaced with P(y | X, W).
Properties of R can be obtained by examining the workload for queries that retrieved X in the past.
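
A minimal Python sketch of this estimate, reading the slide literally: P(y | X, W) is taken as the fraction of past queries specifying all of X that also specified y. The function name, the dict-per-query workload layout, and the sample data are illustrative assumptions, not part of the presentation.

```python
from collections import Counter


def estimate_from_workload(workload, specified):
    """Estimate P(y | X, W) for every attribute value y seen alongside X.

    workload:  list of dicts, one per past query (attribute -> value).
    specified: dict holding the attribute values X of the current query.
    Returns {(attribute, value): probability}; empty if no past query used X.
    """
    x = set(specified.items())
    # Keep only the past queries that specified every value in X.
    matching = [q for q in workload if x <= set(q.items())]
    counts = Counter()
    for q in matching:
        for item in q.items():
            if item not in x:
                counts[item] += 1
    n = len(matching)
    return {item: c / n for item, c in counts.items()} if n else {}


# Hypothetical workload for a real-estate search form.
workload = [
    {"city": "Seattle", "view": "waterfront"},
    {"city": "Seattle", "view": "waterfront", "type": "condo"},
    {"city": "Seattle", "type": "house"},
    {"city": "Dallas", "view": "greenbelt"},
]
probs = estimate_from_workload(workload, {"city": "Seattle"})
```

Counting only exact matches on X becomes unreliable when few past queries specified X, which is one reason to break the estimate into per-attribute quantities under independence assumptions.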

9
Workload-Based R Estimation

10
Implementation

Atomic Probabilities Module: stores atomic quantities in the intermediate knowledge representation layer
Index Module: uses inputs and association rules to create global and conditional scores
Scan Algorithm: selects tuples that satisfy the condition and then finds the ranking based on the scores
List Merge Algorithm: alternative to scanning
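
The scan step might look like the Python sketch below. It assumes the global and conditional scores are already available as lookup tables and that a tuple's score is the product of those scores over its unspecified attribute values. The table layouts, the neutral default of 1.0 for missing entries, and all names are hypothetical.

```python
def scan_rank(table, query, global_score, cond_score, k=10):
    """Return the top-k (score, row) pairs among rows that satisfy the query.

    table:        list of dicts (attribute -> value), one per tuple.
    query:        dict of specified attribute values X.
    global_score: {(attr, val): score} for unspecified values y.
    cond_score:   {((x_attr, x_val), (y_attr, y_val)): score}.
    """
    results = []
    for row in table:
        # Selection: skip rows that violate any specified condition.
        if any(row.get(a) != v for a, v in query.items()):
            continue
        score = 1.0
        for attr, val in row.items():
            if attr in query:
                continue  # only unspecified attributes affect the ranking
            y = (attr, val)
            score *= global_score.get(y, 1.0)
            for x in query.items():
                score *= cond_score.get((x, y), 1.0)
        results.append((score, row))
    results.sort(key=lambda pair: pair[0], reverse=True)
    return results[:k]


# Hypothetical data: two Seattle listings compete for the top rank.
table = [
    {"city": "Seattle", "view": "street", "type": "condo"},
    {"city": "Seattle", "view": "waterfront", "type": "condo"},
    {"city": "Dallas", "view": "waterfront", "type": "house"},
]
global_score = {("view", "waterfront"): 2.0, ("view", "street"): 0.5}
cond_score = {(("city", "Seattle"), ("view", "waterfront")): 1.5}
ranked = scan_rank(table, {"city": "Seattle"}, global_score, cond_score)
```

A full scan touches every tuple that satisfies the query, so its cost grows with the size of the answer set.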

11
Conclusion

Gives a ranking for the Many-Answer problem by factoring in unspecified attributes
Automated
Makes use of workload statistics and correlations
Can still be adjusted by users and/or domain experts
Can use user feedback as well
