Organizing Open Online Computational Problem Solving Competitions

Provided by: KarlL150
Learn more at: http://www.ccs.neu.edu

1
Organizing Open Online Computational Problem
Solving Competitions
  • By Ahmed Abdelmeged

2
  • In 2011, researchers from the Harvard Catalyst
    Project were investigating the potential of
    crowdsourcing genome-sequencing algorithms.

3
  • So, they collected a few million sequencing
    problems and developed an electronic judge that
    evaluates sequencing algorithms by how well they
    solve these problems.

4
  • And, they set up a two-week open online
    competition on TopCoder with a total prize pool
    of $6,000.

5
  • The results were astounding!

6
-- Nature Biotechnology, 31(2), pp. 108-111, 2013.
  • ... A two-week online contest ... produced over
    600 submissions ... . Thirty submissions exceeded
    the benchmark performance of the US National
    Institutes of Health's MegaBLAST. The best
    achieved both greater accuracy and speed (1,000
    times greater).

7
  • We want to lower the barrier to entry for
    establishing such competitions by having
    meaningful competitions where participants
    assist the admin in evaluating their peers.

8
Thesis Statement
  • Semantic games of interpreted logic sentences
    provide a useful foundation to organize
    computational problem solving communities.

14
Open online competitions have been quite
successful in organizing computational problem
solving communities.
15
... A two-week online contest ... produced over
600 submissions ... . Thirty submissions exceeded
the benchmark performance of the US National
Institutes of Health's MegaBLAST. The best
achieved both greater accuracy and speed (1,000
times greater).
-- Nature Biotechnology, 31(2), pp. 108-111, 2013.
16
Let's take a closer look at
  • state-of-the-art approaches to organize an open
    online competition for solving computational
    problems.
  • MAX-SAT as a sample problem.

17
MAXimum SATisfiability (MAX-SAT) problem
  • Input: a Boolean formula in Conjunctive Normal
    Form (CNF).
  • Output: an assignment satisfying the maximum
    number of clauses.
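The quantity that later slides call fsat (the fraction of clauses an assignment satisfies) can be sketched as follows. This is a minimal illustration, not code from the presentation: the class name MaxSatSketch and the DIMACS-style clause encoding (literal +i for variable i, -i for its negation) are assumptions made for the example.

```java
import java.util.List;

// Hypothetical helper, not from the slides. Clauses use DIMACS-style literals:
// +i stands for variable i, -i for its negation (variables are 1-based).
final class MaxSatSketch {
    // Fraction of the clauses of the CNF formula that the assignment satisfies.
    static double fsat(boolean[] assignment, List<int[]> cnf) {
        if (cnf.isEmpty()) return 1.0; // assumed convention: nothing to violate
        int satisfied = 0;
        for (int[] clause : cnf) {
            for (int literal : clause) {
                boolean value = assignment[Math.abs(literal) - 1];
                if ((literal > 0) == value) { // this literal is true
                    satisfied++;
                    break; // one true literal satisfies the clause
                }
            }
        }
        return (double) satisfied / cnf.size();
    }

    public static void main(String[] args) {
        // (x1 or x2) and (not x1) and (not x2): no assignment satisfies all three.
        List<int[]> cnf = List.of(new int[]{1, 2}, new int[]{-1}, new int[]{-2});
        System.out.println(fsat(new boolean[]{true, false}, cnf));
    }
}
```

A MAX-SAT solver then searches for the assignment that maximizes this value.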

18
The Omniscient Admin Approach
  • A trusted admin prepares a thorough benchmark
    of MAX-SAT problem instances together with their
    correct solutions.
  • This benchmark is used to evaluate individual
    MAX-SAT algorithms submitted by participants.

19
The Teaching Admin Approach
  • Admin prepares a thorough benchmark of MAX-SAT
    problems and their model solutions.
  • Benchmark used to evaluate individual MAX-SAT
    algorithms submitted by participants.

20
Cons
  • Overhead to collect and solve problems.
  • What if the admin solves some problems incorrectly?

21
The Open Benchmark Approach
  • Admin maintains an open benchmark of problems and
    their solutions.
  • Participants may object to any of the solutions
    before the competition starts.

22
Cons
  • Over-fitting: participants may tailor their
    algorithms to the benchmark.

23
The Learning Admin Approach
  • An admin prepares a set of MAX-SAT problems and
    keeps track of the best solution produced by one
    of the algorithms submitted by participants.
  • Pioneered by the FoldIt team.

24
Cons
  • Works for optimization problems, but it is not
    clear how to apply it to other computational
    problems, TQBF for example.

25
Wouldn't it be great if
  • we had sports-like open online competitions
    (OOCs) where the admin referees the competition
    with minimal overhead?

26
However,
27
Research Question
  • How to organize a meaningful open online
    computational problem solving competition where
    participants assist in the evaluation of their
    opponents?

28
Research Question
  • How to organize a meaningful, sports-like, open
    online computational problem solving competition
    where the admin only referees the competition
    with minimal overhead?

29
Simpler Version
  • Meaningful two-party competitions.
  • Admin provides neither benchmark problems nor
    their solutions.

30
Attempt I
  • Each participant prepares a benchmark of problems
    and solves their opponent's benchmark problems.
  • Admin checks solutions.
  • Checking the correctness of a MAX-SAT problem
    solution can be an overhead to the admin.

31
Attempt II
  • Each participant prepares a benchmark of problems
    and their solutions.
  • Each participant solves their opponent's
    problems.
  • Admin compares both solutions for each problem
    to determine the winner.
  • Admin has to correctly compare solutions.
  • Admin cannot assume any of the solutions to be
    correct.

32
Attempt II
  • Each participant prepares a benchmark of problems
    and their model solutions.
  • Each participant solves their opponent's
    problems.
  • Admin only compares solutions to model
    solutions.

33
But,
  • Participants are incentivized to provide the
    wrong model solution.
  • Admin should compare solutions without trusting
    any of them.

34
Thesis
  • Semantic games of interpreted logic sentences
    provide a useful foundation to organize
    computational problem solving communities.

35
Semantic Games
  • A Semantic Game (SG) is a constructive debate of
    the correctness of an interpreted logic sentence
    (a.k.a. claim) between two distinguished parties:
    the verifier, which asserts that the claim holds,
    and the falsifier, which asserts that the claim
    does not hold.

36
A Two-Party, SG-Based MAX-SAT Competition (I)
∀φ ∈ CNFs ∃v ∈ assignments(φ) ∀f ∈ assignments(φ).
fsat(f,φ) ≤ fsat(v,φ)
  • Participants develop functions to
  • Provide side preference.
  • Provide values for quantified variables based on
    values of variables in scope.

37
A Two-Party, SG-Based MAX-SAT Competition (II)
∀φ ∈ CNFs ∃v ∈ assignments(φ) ∀f ∈ assignments(φ).
fsat(f,φ) ≤ fsat(v,φ)
  • Admin chooses sides for players based on their
    side preference.
  • Let Pv be the verifier and Pf be the falsifier.

38
A Two-Party, SG-Based MAX-SAT Competition (III)
∀φ ∈ CNFs ∃v ∈ assignments(φ) ∀f ∈ assignments(φ).
fsat(f,φ) ≤ fsat(v,φ)
  • Admin gets value provided by Pf for φ.
  • Admin checks φ ∈ CNFs. If false, Pf loses.
  • Admin gets value provided by Pv for v.
  • Admin checks v ∈ assignments(φ). If false, Pv
    loses.

39
A Two-Party, SG-Based MAX-SAT Competition (IV)
∀φ ∈ CNFs ∃v ∈ assignments(φ) ∀f ∈ assignments(φ).
fsat(f,φ) ≤ fsat(v,φ)
  • Admin gets value provided by Pf for f.
  • Admin checks f ∈ assignments(φ). If false, Pf
    loses.
  • Admin evaluates fsat(f,φ) ≤ fsat(v,φ). If true, Pv
    wins; otherwise Pf wins.

40
Rationale (I)
∀φ ∈ CNFs ∃v ∈ assignments(φ) ∀f ∈ assignments(φ).
fsat(f,φ) ≤ fsat(v,φ)
  • Controllable admin overhead.

∀φ ∈ CNFs ∃v ∈ assignments(φ). satisfies-max(v,φ)
41
Rationale (II)
  • Correct: there is a winning strategy for
    verifiers of true claims and falsifiers of false
    claims, regardless of the opponent's actions.

42
Rationale (III)
  • Objective.
  • Systematic.
  • Learning chances.

43
SG-Based Two-Party Competitions
  • We let participants debate the correctness of an
    interpreted predicate logic sentence specifying
    the computational problem of interest, assuming
    that participants choose to take opposite sides.

44
Out-of-The-Box, SG-Based, Two-Party MAX-SAT
Competition
  • ∀φ ∈ CNFs ∃v ∈ assignments(φ) ∀f ∈ assignments(φ).
    fsat(f,φ) ≤ fsat(v,φ)
  • 1. Falsifier provides a CNF formula φ.
  • 2. Verifier provides an assignment v.
  • 3. Falsifier provides an assignment f.
  • 4. Admin evaluates fsat(f,φ) ≤ fsat(v,φ). If
    true, verifier wins. Otherwise, falsifier wins.
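The four steps above can be sketched as a small referee routine. This is an illustration under stated assumptions, not the presenter's implementation: the class MaxSatGameSketch, the functional-interface shapes for the two strategies, the fixed variable count numVars, and the DIMACS-style literal encoding are all hypothetical.

```java
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical referee for one two-party semantic game on the MAX-SAT claim.
final class MaxSatGameSketch {
    enum Winner { VERIFIER, FALSIFIER }

    // Fraction of clauses satisfied; literals are +i / -i for 1-based variable i.
    static double fsat(boolean[] assignment, List<int[]> cnf) {
        if (cnf.isEmpty()) return 1.0;
        int satisfied = 0;
        for (int[] clause : cnf) {
            for (int lit : clause) {
                if ((lit > 0) == assignment[Math.abs(lit) - 1]) { satisfied++; break; }
            }
        }
        return (double) satisfied / cnf.size();
    }

    // Admin-side membership check for "phi in CNFs" over numVars variables.
    static boolean isCnf(List<int[]> cnf, int numVars) {
        if (cnf == null) return false;
        for (int[] clause : cnf) {
            if (clause == null) return false;
            for (int lit : clause) {
                if (lit == 0 || Math.abs(lit) > numVars) return false;
            }
        }
        return true;
    }

    static Winner play(int numVars,
                       Supplier<List<int[]>> falsifierFormula,
                       Function<List<int[]>, boolean[]> verifierAssignment,
                       BiFunction<List<int[]>, boolean[], boolean[]> falsifierAssignment) {
        List<int[]> phi = falsifierFormula.get();                        // step 1
        if (!isCnf(phi, numVars)) return Winner.VERIFIER;                // illegal move by falsifier
        boolean[] v = verifierAssignment.apply(phi);                     // step 2
        if (v == null || v.length != numVars) return Winner.FALSIFIER;   // illegal move by verifier
        boolean[] f = falsifierAssignment.apply(phi, v);                 // step 3
        if (f == null || f.length != numVars) return Winner.VERIFIER;    // illegal move by falsifier
        // step 4: the verifier wins iff the falsifier's assignment is no better
        return fsat(f, phi) <= fsat(v, phi) ? Winner.VERIFIER : Winner.FALSIFIER;
    }
}
```

Note that the admin never solves a MAX-SAT instance here; it only checks membership and evaluates fsat on the values the players supply.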

45
  • Pros and cons of out-of-the-box, SG-based,
    two-party competitions as a solution to our
    simpler version:
  • Meaningful two-party competitions.
  • Admin provides neither benchmark problems nor
    their solutions.

46
Pro (I) Systematic
  • The rules of an SG are systematically derived
    from the syntax of its underlying claim.
  • SGs are also defined for other logics.

47
Rules of SG(⟨φ, A⟩, v, f)
φ        | Move                              | Next Game
∀x ψ(x)  | f provides x0                     | SG(⟨ψ[x0/x], A⟩, v, f)
ψ ∧ χ    | f chooses θ ∈ {ψ, χ}              | SG(⟨θ, A⟩, v, f)
∃x ψ(x)  | v provides x0                     | SG(⟨ψ[x0/x], A⟩, v, f)
ψ ∨ χ    | v chooses θ ∈ {ψ, χ}              | SG(⟨θ, A⟩, v, f)
¬ψ       | N/A                               | SG(⟨ψ, A⟩, f, v)
P(t0)    | v wins if p(t0) holds, o/w f wins | N/A
The Game of Language: Studies in
Game-Theoretical Semantics and Its Applications
-- Kulas and Hintikka, 1983
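The rule table can be read as a loop that peels one connective off the claim per move. The sketch below is only an illustration of that reading: the class SemanticGameSketch, the Player interface, and the simplification that quantifiers range over ints with atoms already interpreted as booleans (standing in for the structure A) are assumptions, not part of the presentation.

```java
import java.util.function.IntFunction;

// Hypothetical encoding of the SG rules; quantifiers range over ints and the
// interpreting structure A is folded into the boolean stored in each Atom.
final class SemanticGameSketch {
    interface Formula {}
    static final class Atom implements Formula {
        final boolean holds;
        Atom(boolean holds) { this.holds = holds; }
    }
    static final class Not implements Formula {
        final Formula body;
        Not(Formula body) { this.body = body; }
    }
    static final class And implements Formula {
        final Formula left, right;
        And(Formula left, Formula right) { this.left = left; this.right = right; }
    }
    static final class Or implements Formula {
        final Formula left, right;
        Or(Formula left, Formula right) { this.left = left; this.right = right; }
    }
    static final class ForAll implements Formula {
        final IntFunction<Formula> body; // body.apply(x0) plays the role of psi[x0/x]
        ForAll(IntFunction<Formula> body) { this.body = body; }
    }
    static final class Exists implements Formula {
        final IntFunction<Formula> body;
        Exists(IntFunction<Formula> body) { this.body = body; }
    }

    interface Player {
        int provide(Formula quantified);        // value x0 for the bound variable
        boolean chooseLeft(Formula connective); // true picks the left subformula
    }

    // Trivial player used for demonstrations: always the same value and choice.
    static Player constantPlayer(final int value, final boolean left) {
        return new Player() {
            public int provide(Formula quantified) { return value; }
            public boolean chooseLeft(Formula connective) { return left; }
        };
    }

    // Plays SG(<phi, A>, verifier, falsifier) and returns the winning player.
    static Player play(Formula phi, Player verifier, Player falsifier) {
        while (true) {
            if (phi instanceof Atom) {
                return ((Atom) phi).holds ? verifier : falsifier;
            } else if (phi instanceof Not) {
                phi = ((Not) phi).body;          // negation: players swap roles
                Player tmp = verifier; verifier = falsifier; falsifier = tmp;
            } else if (phi instanceof ForAll) {
                phi = ((ForAll) phi).body.apply(falsifier.provide(phi));
            } else if (phi instanceof Exists) {
                phi = ((Exists) phi).body.apply(verifier.provide(phi));
            } else if (phi instanceof And) {
                And a = (And) phi;
                phi = falsifier.chooseLeft(phi) ? a.left : a.right;
            } else if (phi instanceof Or) {
                Or o = (Or) phi;
                phi = verifier.chooseLeft(phi) ? o.left : o.right;
            } else {
                throw new IllegalArgumentException("unknown formula kind");
            }
        }
    }
}
```

Because each move strictly shrinks the formula, the loop terminates at an atom, where the outcome is decided by evaluation alone.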
48
Pro (II) Objective
  • Competition result is based on skills that are
    precisely defined in the competition definition.

49
Pro (III) Correct
  • Competition result is based on demonstrated
    possession (or lack) of skill.
  • Problems incorrectly solved by the admin or an
    opponent cannot worsen a participant's rank.
  • There is a winning strategy for verifiers of true
    claims and falsifiers of false claims, regardless
    of the opponent's actions.

50
Pro (III) Correct
  • There is a winning strategy for verifiers of true
    claims and falsifiers of false claims, regardless
    of the opponent's actions.

51
Pro (IV) Controllable Admin Overhead
  • Admin overhead is to implement the structure
    interpreting the logic statement specifying a
    computational problem.
  • It is always possible to strip functionality out
    of the interpreting structure at the cost of
    adding complexity to the logic statement.

52
Pro (V) Learning
  • Losers can learn from SG traces.

53
Pro (VI) Automatable
  • Participants can codify their strategies for
    playing SGs.
  • Efficient and thorough evaluation.
  • Codified strategies are useful by-products.
  • Controlled information flow.

54
Challenges (I)
  • Participants must take opposing sides!
  • Neutrality is lost with forcing.

55
Con (II) Not Thorough
  • Unlike sports games, a single game is not
    thorough enough.

56
Con (III) Issues Scaling to N-Party Competitions
  • In sports, tournaments are used to scale
    two-party games to n-party competitions.

57
Challenges (II)
  • Scale to N-party competitions using a
    tournament, yet:
  • Avoid collusion potential, especially in the
    context of open online competitions where Sybil
    identities are common and games are too fast to
    spectate!
  • Ensure that participants get the same chance.

58
Challenges (II)
  • Scaling to N-Party Competition using a
    tournament, yet

59
Issue (II) Neutrality
  • Do participants get the same chance?
  • We have to force sides on participants.
  • We may have vastly different numbers of verifiers
    and falsifiers.

60
Issue (II) Correctness and Neutrality
  • We have to force sides on participants.
  • Yet, we cannot penalize forced losers for
    competition correctness.
  • We have to ensure that all participants get the
    same chance even though we may have vastly
    different numbers of verifiers and falsifiers.

61
Contributions
  1. Computational Problem Solving Labs (CPSLs).
  2. Simplified Semantic Games (SSGs).
  3. Provably Collusion-Resistant SSG-Tournament
    Design.

62
Computational Problem Solving Labs (CPSLs)
63
CPSLs
  • A structured interaction space centered around a
    claim.
  • Community members contribute by submitting their
    strategies for playing an SSG of the lab's claim.
  • Submitted strategies compete in a provably
    collusion-resistant tournament of simplified
    semantic games.

64
  • Control, Thorough and Efficient Evaluation.

65
Codified Strategies
  • Efficient and thorough evaluation.
  • Useful by-products.
  • Controlled information flow.

66
CPSLs (II)
  • A structured interaction space centered around a
    claim.
  • Community members contribute by submitting their
    strategies for playing an SSG of the lab's claim.
  • Once a new strategy is submitted in a CPSL, it
    competes against the strategies submitted by
    other members in a provably collusion-resistant
    tournament of simplified semantic games.

67
Highest Safe Rung Problem
  • The Highest Safe Rung (HSR) problem is to find
    the largest number (n) of stress levels that a
    stress testing plan can examine using (q) tests
    and (k) copies of the product under test.
  • k = 1: n = q, linear search
  • k ≥ q: n = 2^q, binary search
  • 1 < k < q: n = ?, ?

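[Figure: k copies of the product (1, 2, ..., k) next to a ladder of stress levels n, ..., 2, 1, where level 1 is marked safe.]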
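For the open case 1 < k < q, one standard way to reason is by recurrence: the first test either breaks a copy (leaving q - 1 tests and k - 1 copies) or does not (leaving q - 1 tests and k copies), so the number of distinguishable outcomes satisfies M(q, k) = M(q-1, k-1) + M(q-1, k) with M(0, k) = M(q, 0) = 1. The sketch below computes this count; it is an illustration rather than the presenter's solution, and the class name HsrBoundSketch is hypothetical. It counts outcomes (leaves of a search plan), so for k = 1 it returns q + 1, which agrees with the linear-search case above up to whether the already-safe level is counted.

```java
import java.math.BigInteger;
import java.util.Arrays;

// Hypothetical helper: counts the outcomes a plan can distinguish with at most
// q tests and at most k broken copies, via M(q,k) = M(q-1,k-1) + M(q-1,k).
final class HsrBoundSketch {
    static BigInteger maxOutcomes(int q, int k) {
        if (q < 0 || k < 0) throw new IllegalArgumentException("q and k must be non-negative");
        k = Math.min(k, q); // copies beyond the number of tests can never be used
        BigInteger[] row = new BigInteger[k + 1];
        Arrays.fill(row, BigInteger.ONE); // M(0, j) = 1: no tests, a single outcome
        for (int tests = 1; tests <= q; tests++) {
            // Update right to left so row[j - 1] still holds the previous row's value.
            for (int j = k; j >= 1; j--) {
                row[j] = row[j].add(row[j - 1]);
            }
            // row[0] stays 1: with no copies left, no further test can be run.
        }
        return row[k];
    }

    public static void main(String[] args) {
        System.out.println(maxOutcomes(10, 2));
    }
}
```

The same numbers are the partial binomial sums C(q,0) + ... + C(q,k), which collapse to 2^q once k reaches q.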
68
Computational Problem Solving Lab - Highest Safe
Rung
[Screenshot: Highest Safe Rung admin page, with "Welcome ....", "Log out", and "Save" controls.]
  1. Description: The Highest Safe Rung (HSR) problem is to
    find the largest number of stress levels that a stress
    testing plan can examine using (q) tests and (k)
    copies of the product under test.
  2. Claim: ... "HSR() := forall Integer q : forall Integer k :
    ...
  3. Game Traces (Publish / Hide)
  4. Standings
69
Computational Problem Solving Lab - Highest Safe
Rung
[Screenshot: Highest Safe Rung member page, with "Welcome Sc1" and "Log out" controls.]
  1. Description: The Highest Safe Rung (HSR) problem is to
    find the largest number of stress levels that a stress
    testing plan can examine using (q) tests and (k)
    copies of the product under test.
  2. Download claim specification
  3. Download strategy skeleton
  4. Download traces of past games
  5. Upload new Strategy
  6. Standings (See all)
Rank  Member  Latest contribution  # of faults  Chosen side
1 ... ... verifier
2 ... ... verifier
3 ... ... ...
20 ... ... ...
22 ... ... ...
21 ... ... ...
22 Sc1 1/1/2014 ...
23 ... ... ...
24 ... ... ...
25 ... ... falsifier
70
Claim Specification
71
Simplified Semantic Games
72
SG Rules
73
SSGs
  • Simpler: use auxiliary games to replace moves
    for conjunctions and disjunctions.
  • Thoroughness potential: participants can provide
    several values for quantified variables.

74
SSG Rules
75
HSR Claim Specification
class HSRClaim {
  public static final String[] FORMULAS = new String[] {
    "HSR() := forall Integer q : forall Integer k : exists Integer n : "
      + "HSRnqk(n, q, k) and ! exists Integer m : greater(m, n) and HSRnqk(m, q, k)",
    "HSRnqk(Integer n, Integer q, Integer k) := exists SearchPlan sp : correct(sp, n, q, k)"
  };

  public static boolean greater(Integer n, Integer m) { return n > m; }

  public static interface SearchPlan { }

  public static class ConclusionNode implements SearchPlan {
    Integer hsr;
  }

  public static class TestNode implements SearchPlan {
    Integer testRung;
    SearchPlan yes; // What to do when the jar breaks.
    SearchPlan no;  // What to do when the jar does not break.
  }

  public static boolean correct(SearchPlan sp, Integer n, Integer q, Integer k) {
    // sp satisfies the binary search tree property, has n leaves,
    // is of depth at most q, and all root-to-leaf paths have at most
    // k yes branches.
    ...
  }
}
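The body of correct(sp, n, q, k) is elided in the slide. One way such a check could look is sketched below; this is a guess at the intended semantics, not the presenter's code. It assumes the n possible answers are the values 0 .. n-1, that a break at rung r means the answer is below r, and it uses int fields and its own class name (HsrPlanCheckSketch) to stay self-contained.

```java
// Hypothetical checker for the properties listed in the comment of correct():
// binary search tree property, n leaves, depth <= q, at most k "yes" branches.
final class HsrPlanCheckSketch {
    interface SearchPlan {}

    static final class ConclusionNode implements SearchPlan {
        final int hsr;
        ConclusionNode(int hsr) { this.hsr = hsr; }
    }

    static final class TestNode implements SearchPlan {
        final int testRung;
        final SearchPlan yes; // plan to follow when the jar breaks
        final SearchPlan no;  // plan to follow when the jar does not break
        TestNode(int testRung, SearchPlan yes, SearchPlan no) {
            this.testRung = testRung; this.yes = yes; this.no = no;
        }
    }

    static boolean correct(SearchPlan sp, int n, int q, int k) {
        return n >= 1 && covers(sp, 0, n, q, k);
    }

    // True if sp decides every candidate answer in [lo, hi) using at most
    // `tests` further tests and at most `breaks` further broken jars.
    private static boolean covers(SearchPlan sp, int lo, int hi, int tests, int breaks) {
        if (sp instanceof ConclusionNode) {
            // A leaf must settle exactly one remaining candidate.
            return hi - lo == 1 && ((ConclusionNode) sp).hsr == lo;
        }
        if (!(sp instanceof TestNode) || tests == 0 || breaks == 0) {
            return false; // no budget left for another test that might break a jar
        }
        TestNode t = (TestNode) sp;
        if (t.testRung <= lo || t.testRung >= hi) {
            return false; // the test would not split the remaining candidates
        }
        return covers(t.yes, lo, t.testRung, tests - 1, breaks - 1)
            && covers(t.no, t.testRung, hi, tests - 1, breaks);
    }
}
```

Tracking the candidate interval enforces the search-tree ordering and the leaf count at once: every leaf must account for exactly one of the n answers.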
76
Strategy Specification
  • One function per quantified variable.

77
HSR Strategy Skeleton
class HSRStrategy {
  public static Iterable<Integer> HSR_q() { ... }
  public static Iterable<Integer> HSR_k(Integer q) { ... }
  public static Iterable<Integer> HSR_n(Integer q, Integer k) { ... }
  public static Iterable<Integer> HSR_m(Integer q, Integer k, Integer n) { ... }
  public static Iterable<SearchPlan> HSRnqk_sp(Integer n, Integer q, Integer k) { ... }
}
79
Semantic Game Tournaments
80
Tournament Design
  • Scheduler
  • Neutral.
  • Ranking function
  • Correct and anonymous.
  • Can mask scheduler deficiencies.

81
Ranking Functions
  • Input: a beating function representing the
    outcome of several games.
  • Output: a total preorder of participants.

82
Beating Functions (of SG Tournaments)
  • bP(pw, pl, swc, slc, sw): sum of all gains of pw
    against pl while pw chose side swc, pl chose
    side slc, and pw took side sw.
  • More complex.

83
Ranking Functions (Correctness)
  • Non-Negative Regard for Wins.
  • Non-Positive Regard for Losses.

84
Non-Negative Regard For Wins (NNRW)
Additional wins cannot worsen Px's rank w.r.t.
other participants.
[Diagram: wins and faults of Px]
85
Non-Positive Regard For Losses (NPRL)
Additional faults cannot improve Px's rank
w.r.t. other participants.
[Diagram: wins and faults of Px]
86
Ranking Functions (Anonymity)
  • Output ranking is independent of participant
    identities.
  • Ranking function ignores participants'
    identities.
  • Participants also ignore their opponents'
    identities.

87
Limited Collusion Effect
  • Slightly weaker notion than anonymity.
  • What you want in practice.
  • A participant Py can choose to lose on purpose
    against another participant Px, but that won't
    make Px get ahead of any other participant Pz.

88
Limited Collusion Effect (LCE)
Games outside Px's control cannot worsen Px's
rank w.r.t. other participants.
[Diagram: wins and faults of Px]
89
Discovery
  • A useful design principle for ranking functions.
  • Under NNRW and NPRL, LCE ⟺ LFB.
  • LFB is quite unusual to have.
  • LFB lends itself to implementation.

90
Locally Fault Based (LFB)
Relative rank of Px and Py depends only on
faults made by either Px or Py.
[Diagram: wins and faults of Px and Py]
91
Locally Fault Based (LFB)
Relative rank of Px and Py can depend only on
faults made by either Px or Py.
[Diagram: wins and faults of Px and Py]
92
Locally Fault Based (LFB)
93
Collusion Resistant Ranking Functions
94
Beating Functions
  • Represent outcome of a set of SSGs
  • bP(pw, pl, swc, slc, sw): sum of all gains of pw
    against pl while pw chose side swc, pl chose
    side slc, and pw took side sw.

95
Beating Functions (Operations)
  • bP^w_px: games px wins.
  • bP^l_px: games px loses.
  • bP^fl_px: games px loses while not forced.
  • bP^c_px = bP^w_px + bP^fl_px: games px controls.
  • Can add them; bP^0 is the identity element.

96
Ranking Functions
  • Map a beating function to a ranking.
  • A ranking is a total preorder.

97
Limited Collusion Effect
  • There is no way py's rank can be improved w.r.t.
    px's rank behind px's back.

98
Non-Negative Regard for Wins
  • An extra win cannot worsen px's rank.

99
Non-Positive Regard for Losses
  • An extra loss cannot improve px's rank.

100
Local Fault Based
  • Relative rank of px w.r.t. py only depends on
    faults made by either px or py.

101
Main Result
102
Visual Proof
103
Fault Counting Ranking Function
  • Players are ranked according to the number of
    faults they make. The fewer the faults, the
    higher the rank.
  • Satisfies the NNRW, NPRL, LFB and LCE properties.
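A fault-counting ranking can be sketched in a few lines. This is an illustration, not the presenter's implementation: the class FaultCountRankingSketch and the use of player names as map keys are assumptions, and a fault is taken to mean a game lost while not forced, as in the beating-function slides.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical fault-counting ranking: a player's rank is one plus the number
// of players with strictly fewer faults, so tied players share a rank and the
// result is a total preorder.
final class FaultCountRankingSketch {
    static Map<String, Integer> rank(Map<String, Integer> faults) {
        Map<String, Integer> ranks = new TreeMap<>();
        for (Map.Entry<String, Integer> entry : faults.entrySet()) {
            int strictlyBetter = 0;
            for (int otherFaults : faults.values()) {
                if (otherFaults < entry.getValue()) strictlyBetter++;
            }
            ranks.put(entry.getKey(), strictlyBetter + 1);
        }
        return ranks;
    }

    public static void main(String[] args) {
        System.out.println(rank(Map.of("p1", 0, "p2", 2, "p3", 2, "p4", 5)));
    }
}
```

Only each player's own faults enter the computation, so wins handed out by colluding opponents cannot move anyone ahead of a third party.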

104
Semantic Game Tournament Design
  • For every pair of players
  • If choosing different sides, play a single SG.
  • If choosing same sides, play two SGs where they
    switch sides.

105
Tournament Properties
  • Our tournament is neutral.

106
Neutrality
  • Each player plays nv + nf - 1 SGs on their chosen
    side, where nv and nf are the numbers of players
    choosing the verifier and falsifier sides; those
    are the only games in which they can make faults.
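The pairing rule and the neutrality count can be checked with a small scheduler. The sketch below is illustrative only: the class SgTournamentSketch, the Game record-like class, and the use of list indices as player identities are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical scheduler for the pairing rule: one game when two players chose
// different sides, two games with switched sides when they chose the same side.
final class SgTournamentSketch {
    enum Side { VERIFIER, FALSIFIER }

    static final class Game {
        final int verifier, falsifier;                 // indices into the chosen-sides list
        final boolean verifierForced, falsifierForced; // true if not on the chosen side
        Game(int verifier, int falsifier, boolean verifierForced, boolean falsifierForced) {
            this.verifier = verifier; this.falsifier = falsifier;
            this.verifierForced = verifierForced; this.falsifierForced = falsifierForced;
        }
    }

    static List<Game> schedule(List<Side> chosen) {
        List<Game> games = new ArrayList<>();
        for (int i = 0; i < chosen.size(); i++) {
            for (int j = i + 1; j < chosen.size(); j++) {
                if (chosen.get(i) != chosen.get(j)) {
                    // Opposite preferences: a single game, nobody is forced.
                    int v = chosen.get(i) == Side.VERIFIER ? i : j;
                    int f = (v == i) ? j : i;
                    games.add(new Game(v, f, false, false));
                } else {
                    // Same preference: two games with switched sides; in each
                    // game exactly one player is forced off the chosen side.
                    boolean bothVerifiers = chosen.get(i) == Side.VERIFIER;
                    games.add(new Game(i, j, !bothVerifiers, bothVerifiers));
                    games.add(new Game(j, i, !bothVerifiers, bothVerifiers));
                }
            }
        }
        return games;
    }

    // Number of games the player plays on the side they chose.
    static int unforcedGames(List<Game> games, int player) {
        int count = 0;
        for (Game g : games) {
            if (g.verifier == player && !g.verifierForced) count++;
            if (g.falsifier == player && !g.falsifierForced) count++;
        }
        return count;
    }
}
```

Every player meets each of the other nv + nf - 1 players exactly once on their chosen side, regardless of how lopsided the side preferences are.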

107
Related Work
  • Rating and Ranking Functions
  • Tournament Scheduling
  • Match-Level Neutrality

108
Rating and Ranking Functions (I)
  • Dominated by heuristic approaches
  • Elo ratings.
  • Who's #1?
  • There are axiomatizations of rating functions in
    the field of Paired Comparison Analysis.
  • LCE is not on the radar.
  • Independence of Irrelevant Matches (IIM) is
    frowned upon.

109
Rating and Ranking Functions (II)
  • Rubinstein [1980]:
  • The points system (winner gets a point) is
    characterized by:
  • Anonymity: ranks are independent of the names of
    participants.
  • Positive responsiveness to the winning relation,
    which means that changing the result of a
    participant p from a loss to a win guarantees
    that p's rank would improve.
  • IIM: the relative ranking of two participants is
    independent of matches in which neither is
    involved.
  • Beating functions are restricted to complete,
    asymmetric relations.

110
Tournament Scheduling
  • Neutrality is off radar.
  • Maximizing winning chances for certain players.
  • Delayed confrontation.

111
Match-Level Neutrality
  • Dominated by heuristic approaches
  • Compensation points.
  • Pie rule.

112
Conclusion
  • Semantic games of interpreted logic sentences
    provide a useful foundation to organize
    computational problem solving communities.

114
Future Work
  • Problem decomposition labs.
  • Social Computing.
  • Evaluating Thoroughness.

115
Questions?
116
Thank You!
118
N-Party SG-Based Competitions
  • A tournament of two-party SG-based competitions

119
N-Party SG-Based Competitions Challenges (I)
  • Collusion potential especially in the context of
    open online competitions.

120
N-Party SG-Based Competitions Challenges (II)
  • Neutrality.
  • Two-party SG-Based competitions are not neutral
    when one party is forced.

122
Rationale (4) Anonymous
123
Rationale (Objective)
  • While constructively debating the correctness of
    an interpreted predicate logic sentence
    specifying a computational problem, participants
    provide and solve instances of that computational
    problem.

124
  • ∀φ ∈ CNFs ∃v ∈ assignments(φ) ∀f ∈ assignments(φ).
    fsat(f,φ) ≤ fsat(v,φ)

125
  • ∀φ ∈ CNFs ∃v ∈ assignments(φ) ∀f ∈ assignments(φ).
    fsat(f,φ) ≤ fsat(v,φ)

126
Semantic Games
127
A meaningful competition is
  • Correct
  • Anonymous
  • Neutral
  • Objective
  • Thorough

128
Correctness
  • Rank is based on demonstrated possession (or lack)
    of skill.
  • Suppose that we let participants create
    benchmarks of MAX-SAT problems and their
    solutions to evaluate their opponents.
  • Participants would be incentivised to provide the
    wrong solutions.

129
Anonymous
  • Rank is independent of identities.
  • There is a potential for collusion among
    participants. This potential arises from the
    direct communication between participants and is
    aggravated by the open online nature of
    competitions.

130
Neutral
  • The competition does not give an advantage to any
    of the participants.
  • For example, a seeded tournament where the seed
    (or the initial ranking) can affect the final
    ranking is not considered neutral.

131
Objective
  • Ranks are exclusively based on skills that are
    precisely defined in the competition definition,
    such as solving MAX-SAT problems.

132
Thorough
  • Ranks are based on solving several MAX-SAT
    problems.

133
Thesis
  • Semantic games of interpreted logic sentences
    provide a useful foundation to organize
    computational problem solving communities.

134
Semantic Games
  • Thoroughness means that the competition result is
    based on a wide enough range of skills that
    participants demonstrate during the competition.
