Sampling and Soundness: Can We Have Both?
1
Sampling and Soundness: Can We Have Both?
  • Carla Gomes, Bart Selman, Ashish Sabharwal
  • Cornell University
  • Jörg Hoffmann
  • DERI Innsbruck
  • Presented by Frank van Harmelen

2
Talk Roadmap
  • A Sampling Method with a Correctness Guarantee
  • Can we apply this to the Semantic Web?
  • Discussion

3
How Might One Count?
How many people are present in the hall?
  • Problem characteristics
  • Space naturally divided into rows, columns,
    sections, …
  • Many seats empty
  • Uneven distribution of people (e.g. more near
    door, aisles, front, etc.)

4
1 Brute-Force Counting
  • Idea
  • Go through every seat
  • If occupied, increment counter
  • Advantage
  • Simplicity, accuracy
  • Drawback
  • Scalability

5
2 Branch-and-Bound (DPLL-style)
  • Idea
  • Split space into sections, e.g. front/back,
    left/right/center, …
  • Use smart detection of full/empty sections
  • Add up all partial counts
  • Advantage
  • Relatively faster, exact
  • Drawback
  • Still accounts for every single person present;
    needs extremely fine granularity
  • Scalability

Framework used in DPLL-based systematic exact
counters, e.g. Relsat [Bayardo et al. '00] and Cachet
[Sang et al. '04]
6
3 Naïve Sampling Estimate
  • Idea
  • Randomly select a region
  • Count within this region
  • Scale up appropriately
  • Advantage
  • Quite fast
  • Drawback
  • Robustness can easily under- or over-estimate
  • Scalability in sparse spaces: e.g. 10^60 solutions
    out of 10^300 means the region must be much larger
    than 10^240 to hit any solutions

7
Sampling with a Guarantee
  • Idea
  • Identify a balanced row split or column split
    (roughly equal number of people on each side)
  • Use local search for estimate
  • Pick one side at random
  • Count on that side recursively
  • Multiply result by 2
  • This provably yields the true count on average!
  • Even when an unbalanced row/column is accidentally
    picked for the split, e.g. even when samples are
    biased or insufficiently many
  • Surprisingly good in practice, using a local
    search as the sampler

8
Algorithm SampleCount
[Gomes, Hoffmann, Sabharwal, Selman, IJCAI '07]
  • Input: Boolean formula F
  • Set numFixed := 0, slack := some constant (e.g. 2,
    4, 7, …)
  • Repeat until F becomes feasible for exact
    counting:
  • Obtain s solution samples for F
  • Identify the most balanced variable and
    variable-pair: x is balanced iff ≈ s/2
    samples have x = 0 and ≈ s/2 have x = 1; (x, y) is
    balanced iff ≈ s/2 samples have x = y and ≈ s/2 have
    x ≠ y
  • If x is more balanced than (x, y), randomly set x
    to 0 or 1; else randomly replace x with y or ¬y;
    simplify F
  • Increment numFixed
  • Output: model count ≥ 2^(numFixed - slack) ·
    exactCount(simplified F), with confidence
    (1 - 2^(-slack))

Note: showing one trial.
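As a sanity check, one trial of the loop above can be sketched in Python. Everything here is a toy stand-in: brute-force enumeration replaces the exact counter (Cachet/Relsat), rejection sampling replaces the biased local-search sampler, and the variable-pair step is omitted; all function names are illustrative, not the authors' code.

```python
import itertools
import random

def satisfies(assignment, clauses):
    # assignment: {var: bool}; clauses: CNF as lists of signed ints
    return all(any(assignment[abs(l)] == (l > 0) for l in c) for c in clauses)

def exact_count(clauses, variables):
    # brute-force model count, standing in for an exact counter
    return sum(satisfies(dict(zip(variables, bits)), clauses)
               for bits in itertools.product([False, True], repeat=len(variables)))

def sample_solutions(clauses, variables, s):
    # toy rejection sampler, standing in for local-search sampling
    samples = []
    while len(samples) < s:
        a = {v: random.getrandbits(1) == 1 for v in variables}
        if satisfies(a, clauses):
            samples.append(a)
    return samples

def simplify(clauses, var, value):
    # fix var := value; drop satisfied clauses, remove falsified literals
    keep = var if value else -var
    return [[l for l in c if abs(l) != var] for c in clauses if keep not in c]

def sample_count(clauses, variables, s=20, slack=2, cutoff=4):
    # one trial of SampleCount (single-variable version only)
    num_fixed = 0
    while len(variables) > cutoff:
        samples = sample_solutions(clauses, variables, s)
        # most balanced variable: #True among samples closest to s/2
        var = min(variables, key=lambda v: abs(sum(a[v] for a in samples) - s / 2))
        clauses = simplify(clauses, var, random.getrandbits(1) == 1)
        variables = [v for v in variables if v != var]
        num_fixed += 1
    # lower bound, correct with probability >= 1 - 2^(-slack)
    return 2 ** (num_fixed - slack) * exact_count(clauses, variables)
```

On the toy formula (x1 ∨ x2) ∧ (x3 ∨ x4) ∧ (x5 ∨ x6), which has 27 models, a trial fixes two variables and returns 2^(2 - slack) times an exact count of the residual formula: a lower bound that holds with probability at least 1 - 2^(-slack).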
9
Correctness Guarantee
Theorem SampleCount with t trials gives a
correct lower bound with
probability (1 2 slack ? t )
e.g. slack 2, t 4 ? 99 correctness
confidence
  • Key properties
  • Holds irrespective of the quality of the local
    search estimates
  • No free lunch! Bad estimates ? high variance of
    trial outcome ? min(trials) is high-confidence
    but not tight
  • Confidence grows exponentially with slack and t
  • Ideas used in the proof
  • Expected model count true count (for each
    trial)
  • Use Markovs inequality PrXgtkEX lt 1/k to
    bound error probability (X is outcome of one
    trial)
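The Markov step can be written out in two lines. Per trial, let X = 2^numFixed · exactCount(simplified F); as stated above, E[X] equals the true count M, and the algorithm reports X / 2^slack:

```latex
\Pr\left[\frac{X}{2^{\mathrm{slack}}} > M\right]
  = \Pr\left[X > 2^{\mathrm{slack}}\,\mathbb{E}[X]\right]
  < 2^{-\mathrm{slack}},
\qquad
\Pr\left[\min_{1 \le i \le t} \frac{X_i}{2^{\mathrm{slack}}} > M\right]
  < \left(2^{-\mathrm{slack}}\right)^{t} = 2^{-\mathrm{slack}\cdot t}.
```

With slack = 2 and t = 4 the reported minimum exceeds the true count with probability below 2^(-8), i.e. confidence about 99.6%, matching the 99% figure on the slide.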

10
Circuit Synthesis, Random CNFs
(per-instance results table omitted: Cachet (exact),
Relsat (exact), SampleCount (99% conf.))
11
Talk Roadmap
  • A Sampling Method with a Correctness Guarantee
  • Can we apply this to the Semantic Web?
  • Discussion

12
Talk Roadmap
  • A Sampling Method with a Correctness Guarantee
  • Can we apply this to the Semantic Web?
  • Highly speculative
  • Discussion

13
Counting in the Semantic Web
  • should certainly be possible with this method
  • Example: given RDF database D, count how many
    triples comply with query q
  • Throw a constraint cutting the set of all triples
    in half
  • If feasible, count the n remaining triples exactly;
    return n · 2^(numConstraints - slack)
  • Else, iterate
  • "Merely" technical challenges
  • What are constraints cutting the set of all
    triples in half?
  • How to throw a constraint?
  • When to stop throwing constraints?
  • How to efficiently count the remaining triples?
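A minimal sketch of the iteration above, under heavy assumptions: hashable Python objects stand in for triples, and a "constraint" is a random parity (XOR) predicate over hash bits, the standard streamlining trick from propositional model counting (e.g. MBound); the function name and parameters are hypothetical.

```python
import random

def count_lower_bound(matches, slack=2, cutoff=8):
    # matches: the set of items (e.g. triples complying with query q);
    # each random parity constraint cuts the set roughly in half
    num_constraints = 0
    while len(matches) > cutoff:
        mask = random.getrandbits(64)
        parity = random.getrandbits(1)
        # keep an item iff the parity of its masked hash bits matches
        matches = {t for t in matches
                   if bin(hash(t) & mask).count("1") % 2 == parity}
        num_constraints += 1
    # count the survivors exactly, scale back up, discount by the slack
    return 2 ** (num_constraints - slack) * len(matches)
```

The open challenges on the slide are exactly where this sketch cheats: a real RDF store would need query-level constraints that provably halve the match set, and an efficient exact count of the residue.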

14
What about Deduction?
  • Does ψ follow from φ?
  • Exploit the connection implication ≡ UNSAT? Upper
    bounds?
  • A similar theorem does NOT hold for upper bounds
  • In a nutshell: Markov's inequality
    Pr[X > k · E[X]] < 1/k does not have a symmetric
    Pr[X < E[X]/k] counterpart
  • An adaptation is possible but has many problems ⇒
    does not look too promising
  • Heuristic alternative
  • Add constraints into φ to obtain φ′; check
    whether φ′ implies ψ
  • If no, stop; if yes, go to the next trial
  • After t successful trials, output "it's enough, I
    believe it"
  • No provable confidence, but may work well in
    practice
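The heuristic loop above, sketched with toy stand-ins (brute-force SAT in place of a real solver; ψ restricted to a single clause; all names are illustrative). Note the asymmetry the slide relies on: since the models of a strengthened φ′ are a subset of the models of φ, a "No" answer (φ′ ⊭ ψ) soundly refutes φ ⊨ ψ, while t "Yes" answers give only belief.

```python
import itertools
import random

def is_sat(clauses, n):
    # brute-force SAT check over variables 1..n (placeholder for a solver)
    for bits in itertools.product([False, True], repeat=n):
        a = dict(zip(range(1, n + 1), bits))
        if all(any(a[abs(l)] == (l > 0) for l in c) for c in clauses):
            return True
    return False

def entails(phi, psi_clause, n):
    # phi |= psi  iff  phi AND NOT(psi) is UNSAT; psi is a single clause,
    # so NOT(psi) is a conjunction of negated unit clauses
    return not is_sat(phi + [[-l] for l in psi_clause], n)

def heuristic_entailment(phi, psi_clause, n, t=4):
    # strengthen phi with a random unit constraint per trial
    for _ in range(t):
        unit = random.randint(1, n)
        phi_prime = phi + [[unit if random.getrandbits(1) else -unit]]
        if not entails(phi_prime, psi_clause, n):
            return False   # sound "No": phi does not entail psi
    return True            # "it's enough, I believe it" (no guarantee)
```

In the toy setting the brute-force check makes the strengthening pointless; the hoped-for gain is that checking the strengthened φ′ is cheaper than checking φ itself.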

15
What about Deduction?
  • Does ψ follow from φ?
  • Much more distant adaptation
  • Constraint = something that removes half of φ!!
  • Throw some and check whether φ′ ⊨ ψ
  • Confidence problematic
  • Can we draw any conclusions if φ′ ⊭ ψ?
  • May be that φ1, φ2 in φ with φ1 ∧ φ2 ⊨ ψ, but a
    constraint separated φ1 from φ2
  • May be that all relevant parts of φ are thrown out
  • Are there interesting cases where we can bound
    the probability of these events??

16
Talk Roadmap
  • A Sampling Method with a Correctness Guarantee
  • Can we apply this to the Semantic Web?
  • Highly speculative
  • Discussion

17
Discussion
  • In propositional CNF, one can efficiently obtain
    high-confidence lower bounds on the number of
    models by sampling
  • Application to the Semantic Web
  • Adaptation to counting tasks should be possible
  • Adaptation for φ ⊨ ψ, via upper bounds, is
    problematic
  • Promising heuristic method sacrificing the
    confidence guarantee
  • Alternative adaptation weakens φ instead of
    strengthening it
  • Sampling the knowledge base
  • Confidence guarantees??
  • Your feedback and thoughts are highly
    appreciated!!

18
What about Deduction?
  • Does ψ follow from φ?
  • Straightforward adaptation
  • There is a variant of this algorithm that
    computes high-confidence upper bounds instead
  • Throw large constraints, check if the constrained
    φ ∧ ¬ψ is SAT
  • If SAT, no implication; if UNSAT in each of t
    iterations, confidence in an upper bound on the
    number of models
  • Many problems
  • Is the constrained φ ∧ ¬ψ actually easier to
    check??
  • Large constraints are tough even in the
    propositional CNF context!
  • ("Large" involves half of the propositional
    variables; needed for confidence)
  • An upper bound on the number of models is not
    confidence in UNSAT!