SAT and CSP competitions


1
SAT and CSP competitions and benchmark
libraries: some lessons learnt?
  • Toby Walsh
  • NICTA and UNSW
  • Sydney, Australia

2
What's the best way to benchmark systems?
3
Outline
  • Benchmark libraries
  • Founding CSPLib.org
  • Competitions
  • SAT competition judge
  • TPTP competition judge

4
Why?
  • Why did I set up CSPLib.org?
  • I needed problems against which to benchmark my
    latest inference techniques
  • Zebra and random problems don't cut it!
  • I thought it would help unify and advance the CP
    community

5
Random problems
  • +ve
  • Easy to generate
  • Hard (if chosen from phase transition)
  • Impossible to cheat
  • If you can solve 1000-variable random 3-SAT problems
    at l/n = 4.2, I'll be impressed
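
A minimal sketch of how such instances are usually generated, assuming the standard fixed-clause-length model: each clause picks 3 distinct variables uniformly at random and negates each with probability 1/2. The function names `random_3sat` and `to_dimacs` are illustrative, not taken from the talk.

```python
import random


def random_3sat(n_vars, ratio=4.2, seed=None):
    """Return round(ratio * n_vars) random 3-literal clauses.

    Literals use the DIMACS convention: variable v is v, its negation is -v.
    """
    rng = random.Random(seed)
    n_clauses = round(ratio * n_vars)
    clauses = []
    for _ in range(n_clauses):
        # Three distinct variables per clause, each negated with probability 1/2.
        variables = rng.sample(range(1, n_vars + 1), 3)
        clauses.append([v if rng.random() < 0.5 else -v for v in variables])
    return clauses


def to_dimacs(n_vars, clauses):
    """Serialize clauses as DIMACS CNF text."""
    lines = [f"p cnf {n_vars} {len(clauses)}"]
    lines += [" ".join(map(str, clause)) + " 0" for clause in clauses]
    return "\n".join(lines)


# 1000 variables at l/n = 4.2 gives 4200 clauses.
instance = random_3sat(1000, seed=0)
dimacs_text = to_dimacs(1000, instance)
```

Fixing the seed makes a generated benchmark reproducible, which matters if the instances are to be shared.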

6
Random problems
  • -ve
  • Lack structures found in real world
  • Unrepresentative
  • E.g. random 3-SAT instances have either many solutions or
    none
  • Different methods work well on them
  • Random SAT: forward-looking algorithms
  • Industrial SAT: backward-looking algorithms

7
Why?
  • Thesis: every mature field has a benchmark
    library
  • Deduction started in 1960s
  • TPTP set up in 1993
  • SAT started in 1960s
  • SAT DIMACS challenge in 1992
  • SATLib set up in 1999
  • CP started in 1970s
  • CSPLib set up in 1998

8
Why?
  • Thesis: every mature field has a benchmark
    library
  • Spatial and temporal reasoning started in early
    80s (or before?)
  • It's been approximately 30 years, so it's about
    time you guys set one up!

9
Benchmark libraries
  • CSPLib.org
  • Over 35k unique visitors
  • Still not everything I'd want it to be
  • But state of the art for experimentation is now
    much better than it was
  • I haven't seen a zebra for a very long time

10
An ideal library
  • Desiderata taken from
  • "CSPLib: a benchmark library for constraints",
    Proc. CP-99

11
An ideal library
  • Location
  • On the web and easy to find
  • TPTP.org
  • CSPLib.org
  • SATLib.org
  • QBFLib.org
  • http://elib.zib.de/pub/mp-testdata/tsp/tsplib/tsplib.html
  • http://mat.gsia.cmu.edu/COLOR/instances.html

12
An ideal library
  • Easy to use
  • Tools to make benchmarking as painless as
    possible
  • tptp2X,
  • Diverse
  • To help prevent over-fitting

13
An ideal library
  • Large
  • Growing continuously
  • Again helps to prevent over-fitting
  • Extensible
  • To new problems or domains

14
An ideal library
  • Complete
  • One stop for your problems
  • Topical
  • For instance, it should report current best
    solutions found

15
An ideal library
  • Independent
  • Not tied to a particular solver or proprietary
    input language
  • Mix of difficulties
  • Hard and easy problems
  • Solved and open problems
  • With perhaps even a difficulty index?

16
An ideal library
  • Accurate
  • It should be trusted
  • Used
  • A valued resource for the community

17
Problem format
  • Lo-tech or hi-tech?

18
Lo-tech formats
  • DIMACS format used in SATLib
      c a simple example
      p cnf 3 2
      1 -1 0
      1 2 3 0
  • This represents (x or -x) and (x or y or z)
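
To illustrate how little machinery the format needs, here is a minimal reader for the example above. It is a sketch, not a validating parser: it assumes a well-formed file with one `p cnf` header, and `parse_dimacs` is an illustrative name.

```python
def parse_dimacs(text):
    """Parse DIMACS CNF text into (number of variables, list of clauses)."""
    n_vars = None
    clauses, current = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("c"):
            continue  # blank line or comment
        if line.startswith("p"):
            _, fmt, nv, _nc = line.split()
            if fmt != "cnf":
                raise ValueError(f"unsupported format: {fmt}")
            n_vars = int(nv)
            continue
        for token in line.split():
            literal = int(token)
            if literal == 0:
                # A 0 terminates the current clause.
                clauses.append(current)
                current = []
            else:
                current.append(literal)
    return n_vars, clauses


EXAMPLE = """c a simple example
p cnf 3 2
1 -1 0
1 2 3 0
"""

n_vars, clauses = parse_dimacs(EXAMPLE)
```

Here `clauses` holds one list of integers per clause, with a negative integer standing for a negated variable.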

19
Lo-tech formats
  • DIMACS format used in SATLib
  • +ve
  • All programming languages can read integers!
  • Small amount of extensibility built in (e.g. QBF)
  • -ve
  • Larger extensions are problematic (e.g. beyond
    CNF to arbitrary Boolean circuits)

20
Hi-tech formats
  • CP competition
      <instance>
      <presentation
        name="4-queens"
        description="This problem involves placing 4 queens on a chessboard"
        nbSolutions="at least 1"
        format="XCSP1.1 (XML CSP Representation 1.1)"
      />
      <domains nbDomains="1">
      <domain name="dom0" nbValues="4" values="1..4" />
      </domains>
      <variables nbVariables="4">
      <variable name="X0" domain="dom0"/>
      </variables>
      <relations nbRelations="3">
      <relation
        name="rel0" domain="dom0 dom0" nbConflicts="10"
        conflicts="(1,1)(1,2)(2,1)(2,2)(2,3)(3,2)(3,3)(3,4)(4,3)(4,4)" />
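
As a small illustration of what an extensional relation like `rel0` encodes, the sketch below parses the `conflicts` string from the instance above into a set of forbidden value pairs. The observation that these are exactly the pairs with |a - b| <= 1 (what one would expect for queens in adjacent rows) is an inference from the data, not something stated on the slide; `parse_tuples` and `allowed` are illustrative names.

```python
import re

# The conflicts attribute of rel0, copied from the instance above.
CONFLICTS = "(1,1)(1,2)(2,1)(2,2)(2,3)(3,2)(3,3)(3,4)(4,3)(4,4)"


def parse_tuples(text):
    """Turn '(a,b)(c,d)...' into a set of integer pairs."""
    return {(int(a), int(b)) for a, b in re.findall(r"\((\d+),(\d+)\)", text)}


conflicts = parse_tuples(CONFLICTS)


def allowed(a, b):
    """A pair of values satisfies the relation if it is not listed as a conflict."""
    return (a, b) not in conflicts


# Pairs over 1..4 whose values are equal or differ by one.
adjacent_or_equal = {
    (a, b) for a in range(1, 5) for b in range(1, 5) if abs(a - b) <= 1
}
```

The set `conflicts` has 10 elements, matching `nbConflicts`, and equals `adjacent_or_equal`.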

21
Hi-tech formats
  • XML
  • +ve
  • Easy to extend
  • Parsing tools can be provided
  • -ve
  • Complex and verbose
  • Computers can parse terse structures easily

22
No-tech formats
  • CSPLib
  • Problems are specified in natural language
  • No agreement at that time for an input language
  • One focus was on how you model a problem
  • Today there is more consensus on modelling
    languages like Zinc

23
No-tech formats
  • CSPLib
  • Problems are specified in natural language
  • But you can still provide in one place
  • Input data
  • Results
  • Code
  • Parsers

24
Getting problems
  • Submit them yourself
  • Initially, you must do this so library has some
    critical mass first time people look at it
  • But it becomes tiresome and unrepresentative to
    do so continually
  • Ask at every talk
  • Tried for several years but it (almost) never
    worked

25
Getting problems
  • Need some incentive
  • Offer money?
  • Price of entry for the competition?
  • If you have a competition, users will submit
    problems that their solver is good at?

26
Competitions
27
Libraries and competitions
  • You can have a library without a competition
  • But you can't have a competition without a library

28
Libraries and competitions
  • Libraries then competition
  • TPTP then CASC
  • Easy and safe!
  • Libraries and competition
  • Planning
  • RoboCup

29
Increasing complexity
  • Constraints
  • 1st year, binary extensional
  • 2nd year, limited number of globals
  • 3rd year, unlimited
  • Planning
  • Increasing complexity
  • Time, metrics, uncertainty,

30
Benefits
  • Gets ideas implemented
  • Rewards engineering
  • Progress needs both science and engineering!
  • Puts it all together

31
Benefits
  • Gives greater importance to important low-level
    issues
  • In SAT
  • Watched literals
  • VSIDS

32
Benefits
  • Witness the progress in SAT
  • 1985, 10s vars
  • 1995, 100s vars
  • 2005, 1000s vars
  • Not just Moore's law at play!

33
Pitfalls
  • Competitions require lots of work
  • Organizers get limited (academic) reward
  • One solution is to also organize competition
    special issues

34
Pitfalls
  • Competitions encourage incremental improvements
  • Don't have them too often!
  • You may discover a local minimum
  • E.g. MDPs for speech recognition
  • Give out best new solver prize?

35
The Chaff story
  • Industrial problems, SAT + UNSAT instances
  • 2008, 1st MiniSAT (son of zChaff)
  • 2007, 1st RSAT (son of MiniSAT)
  • 2006, 1st MiniSAT
  • 2005, 1st SatELiteGTI (MiniSAT + preprocessor)
  • 2004, 1st zChaff (Forklift from 2003 was better)
  • 2003, 1st Forklift
  • 2002, 1st zChaff

36
Other issues
  • Man-power
  • Organizers
  • One is not enough?
  • Judges
  • All rules need interpretation
  • Compute-power
  • Find a friendly cluster

37
Other issues
  • Multiple tracks
  • SAT/UNSAT
  • Random/industrial/crafted
  • Certified/Uncertified

38
Other issues
  • Holding problems back if possible
  • Release some problems so competitors can ensure
    solver compliance
  • But hold most back so competition is blind!

39
Other issues
  • Multiple phases
  • Too many solvers for all to compete with long
    timeouts
  • First phase to test correctness
  • Second phase to throw out the slow solvers (who
    cost you many timeouts)
  • Third phase to differentiate between better
    solvers
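
The three phases can be sketched as successive filters. This is a simplification under stated assumptions: a real competition would use different problem sets and timeouts per phase, whereas here one hypothetical set of runs is reused, and the 50% timeout threshold is invented for illustration. Each run is a pair (answer, known status), with `None` meaning a timeout.

```python
def phase1_correct(runs):
    """Phase 1: drop any solver that gave an answer contradicting the known status."""
    return {
        solver: rs
        for solver, rs in runs.items()
        if all(answer is None or answer == truth for answer, truth in rs)
    }


def phase2_fast_enough(runs, max_timeout_fraction=0.5):
    """Phase 2: drop solvers that time out too often (threshold is an assumption)."""
    return {
        solver: rs
        for solver, rs in runs.items()
        if sum(answer is None for answer, _ in rs) / len(rs) <= max_timeout_fraction
    }


def phase3_rank(runs):
    """Phase 3: rank the survivors by number of problems solved."""
    return sorted(
        runs,
        key=lambda solver: sum(answer is not None for answer, _ in runs[solver]),
        reverse=True,
    )


# Hypothetical data: four solvers on four problems.
runs = {
    "A": [("SAT", "SAT"), ("UNSAT", "UNSAT"), (None, "SAT"), ("SAT", "SAT")],
    "B": [("SAT", "SAT"), ("SAT", "UNSAT"), ("SAT", "SAT"), ("SAT", "SAT")],
    "C": [(None, "SAT"), (None, "UNSAT"), (None, "SAT"), ("SAT", "SAT")],
    "D": [("SAT", "SAT"), ("UNSAT", "UNSAT"), ("SAT", "SAT"), ("SAT", "SAT")],
}

ranking = phase3_rank(phase2_fast_enough(phase1_correct(runs)))
```

In this example B is removed for a wrong answer, C for too many timeouts, and D ranks ahead of A.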

40
Other issues
  • Reward function
  • <completed, average time, ...>
  • Solution purse + speed purse
  • Points for each problem divided between those
    solvers that solve it
  • Getting buy-in from competitors
  • It will (and should) evolve over time!
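
A minimal sketch of the "points divided between solvers" idea: each problem carries a fixed purse that is split equally among the solvers that solve it, so a problem solved by few solvers is worth more to each of them. The equal split, the purse size of 1.0 and the name `purse_scores` are assumptions for illustration; the talk does not give the exact formula.

```python
from collections import defaultdict


def purse_scores(solved_by, purse=1.0):
    """Split each problem's purse equally among the solvers that solved it."""
    scores = defaultdict(float)
    for solvers in solved_by.values():
        if not solvers:
            continue  # nobody solved it, so the purse goes unclaimed
        share = purse / len(solvers)
        for solver in solvers:
            scores[solver] += share
    return dict(scores)


# Hypothetical results: problem -> set of solvers that solved it.
results = {
    "p1": {"A", "B"},
    "p2": {"A"},
    "p3": set(),
    "p4": {"A", "B", "C", "D"},
}

scores = purse_scores(results)
```

With these results A scores 1.75, B scores 0.75, and C and D score 0.25 each: being the only solver of p2 is worth as much as four shared solves of p4.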

41
Other issues
  • Prizes
  • Give out many!
  • Good for people's CVs
  • Good motivator for future years

42
Other issues
  • Open or closed source?
  • Open to share progress
  • Closed to get the best
  • Last year's winner
  • Condition of entry
  • To see progress is being made!

43
Other issues
  • Smallest unsolved problem
  • Give a prize!
  • Timing
  • Run during the conference
  • Creates a buzz so people enter next year
  • Get a slot in program to discuss results
  • Get a slot in banquet to give out prizes

44
Conclusions
  • Benchmark libraries
  • When an area is several decades old, why wouldn't
    you have one?
  • Competitions
  • Designed well, held not too frequently, with
    buy-in from the community, why wouldn't you?

45
Questions
  • Disagreements
  • Other opinions
  • Different experiences