1
Learning Shape in Computer Go
  • David Silver

2
A brief introduction to Go
  • Black and White take turns placing stones on the board
  • Once played, a stone cannot move
  • The aim is to surround the most territory
  • Usually played on a 19x19 board

3
Capturing
  • The lines radiating from a stone are called
    liberties
  • If a connected group of stones has all of its
    liberties removed then it is captured
  • Captured stones are removed from the board
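  A minimal sketch of the capture rule, assuming a board stored as a dict mapping (row, col) to 'B' or 'W' with empty points absent; the function names and representation are illustrative, not from the slides:

    def group_and_liberties(board, start, size=19):
        """Flood-fill the connected group containing `start` and collect its liberties."""
        colour = board[start]
        group, liberties, frontier = {start}, set(), [start]
        while frontier:
            r, c = frontier.pop()
            for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if not (0 <= nr < size and 0 <= nc < size):
                    continue                      # off the board
                if (nr, nc) not in board:
                    liberties.add((nr, nc))       # empty neighbour = liberty
                elif board[(nr, nc)] == colour and (nr, nc) not in group:
                    group.add((nr, nc))
                    frontier.append((nr, nc))
        return group, liberties

    def remove_if_captured(board, point):
        """A group with no liberties left has been captured and is removed."""
        group, libs = group_and_liberties(board, point)
        if not libs:
            for p in group:
                del board[p]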

5
Atari Go (Capture Go)
  • Atari Go is a simplified version of Go
  • The winner is the first player to make a capture
  • Often used to teach Go to beginners
  • Circumvents several tricky issues:
    • The game ending only by agreement
    • Ko (local repetitions of position)
    • Seki (local stalemates)

6
Computer Go
  • Computer Go programs are very weak
  • Search space is too large for brute force
    techniques
  • No good evaluation functions
  • Human intuition (shape knowledge) has proven
    difficult to capture.
  • Why not learn shape knowledge?
  • And use it to learn an evaluation function?

7
Local shape
  • Local shape describes a pattern of stones
  • It is used extensively by current Computer Go
    programs (pattern databases)
  • Inputting local shape by hand takes many years of
    hard labour
  • We would like to:
    • Learn local shapes by trial and error
    • Assign a value for the goodness of a shape: just how good is a particular shape?

8
Enumerating local shapes
  • In these experiments all possible local shapes are used as features
    • Up to a small maximum size (e.g. 2x2)
  • A local shape is defined to be:
    • A particular configuration of stones
    • At a canonical position on the board
  • Local shapes are used as binary features by the learning algorithm (see the sketch below)
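  One way this could be realised (a sketch under assumptions, not necessarily the representation used in the experiments): every assignment of empty/black/white to the cells of a fixed window is one shape, and a binary feature is a (shape, board position) pair that is active when that configuration appears there.

    from itertools import product

    EMPTY, BLACK, WHITE = 0, 1, 2

    def enumerate_shapes(height=2, width=2):
        """Index every possible stone configuration of a height x width window."""
        shapes = product((EMPTY, BLACK, WHITE), repeat=height * width)
        return {shape: i for i, shape in enumerate(shapes)}

    def active_features(board, shape_index, height=2, width=2):
        """Binary features: which (shape, board position) pairs occur in this position."""
        size = len(board)                       # board as a list of lists of 0/1/2
        active = set()
        for r in range(size - height + 1):
            for c in range(size - width + 1):
                window = tuple(board[r + dr][c + dc]
                               for dr in range(height) for dc in range(width))
                active.add((shape_index[window], r, c))
        return active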

9
Invariances
  • Each canonical local shape can be:
    • Rotated
    • Reflected
    • Inverted (black and white swapped)
  • So each position may cause updates to multiple instances of each feature (one way to fold the symmetries together is sketched below)
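  A sketch of one way to exploit these invariances, assuming a square window stored as a tuple of tuples of 0/1/2: generate all rotations, reflections and colour inversions and use the lexicographically smallest as the canonical representative, so that symmetric shapes share a feature.

    def transforms(shape):
        """All rotations, reflections and colour inversions of a square shape."""
        def rotate(s):                          # 90-degree rotation
            return tuple(zip(*s[::-1]))
        def reflect(s):                         # horizontal mirror
            return tuple(row[::-1] for row in s)
        def invert(s):                          # swap black and white, keep empty
            swap = (0, 2, 1)
            return tuple(tuple(swap[x] for x in row) for row in s)
        variants, current = [], shape
        for _ in range(4):
            current = rotate(current)
            for v in (current, reflect(current)):
                variants.append(v)
                variants.append(invert(v))
        return variants

    def canonical(shape):
        """Fixed representative shared by all symmetric variants of a shape."""
        return min(transforms(shape))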

10
Algorithm
  • A value function is learnt for afterstates
  • Move selection is done by 1-ply greedy search (ε = 0) over the value function
  • To evaluate a position:
    • Active local shapes are identified
    • A linear combination is taken
    • A sigmoid squashing function is applied
  • Backups are performed using TD(0) (see the sketch below)
  • Reward of 1 for winning, 0 for losing
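  A sketch of the update described above. The names, the step size, and the use of a plain dict for the weights are assumptions; `active` is the set of local-shape feature indices present in an afterstate.

    import math

    def value(weights, active):
        """Sigmoid of the linear combination of the active binary features."""
        return 1.0 / (1.0 + math.exp(-sum(weights.get(i, 0.0) for i in active)))

    def td0_update(weights, active, next_active, reward, terminal, alpha=0.1):
        """TD(0) backup towards the final reward (1 win / 0 loss) or the next afterstate's value."""
        v = value(weights, active)
        target = reward if terminal else value(weights, next_active)
        delta = target - v
        grad = v * (1.0 - v)                    # derivative of the sigmoid
        for i in active:
            weights[i] = weights.get(i, 0.0) + alpha * delta * grad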

11
Value function approximation
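  The bullets on the previous slide imply an approximator of roughly this form (a reconstruction from those bullets, not a formula copied from the slide):

    V(s) = \sigma\Big(\sum_i w_i \, \phi_i(s)\Big), \qquad \sigma(x) = \frac{1}{1 + e^{-x}}

  where \phi_i(s) \in \{0, 1\} indicates whether local shape i is present in afterstate s, and w_i is its learned weight.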
12
Training procedure
  • The challenge: learn to beat the average liberty player
  • So the learning algorithm was trained specifically against the average liberty player
  • The problem: learning is very slow, since the agent almost never wins any games by chance
  • The solution: mix in a proportion of random moves until the agent wins 50% of all games (a sketch follows below)
  • Reduce the proportion of randomness as the agent learns to win more games
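  A sketch of the randomness-mixing idea; which player's moves are randomised, the step size, and the annealing schedule are assumptions rather than details given on the slide.

    import random

    def mix_random(base_move, legal_moves, p_random):
        """With probability p_random play a uniformly random legal move, else the base move."""
        if random.random() < p_random:
            return random.choice(legal_moves)
        return base_move

    def anneal(p_random, recent_win_rate, target=0.5, step=0.05):
        """Shrink the random fraction once the learner wins at least `target` of its games."""
        return max(0.0, p_random - step) if recent_win_rate >= target else p_random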

14
Results for different shape sizes
15
Results for different board sizes
16
Shapes learned (1x1)
17
Shapes learned (2x2)
18
Shapes learned (3x3)
19
Conclusions
  • Local shape information is sufficient to beat a
    naïve rule-based player
  • Significant shapes can be learned
  • The goodness of shapes can be learned
  • A linear threshold unit can provide a reasonable
    evaluation function
  • Enumerating all local shapes reaches a natural
    limit at 3x3
  • Training methodology is crucial

20
Future work
  • Learn shapes selectively rather than enumerating
    all possible shapes
  • Learn shapes to answer specific questions:
    • Can black B4 be captured?
    • Can white connect A2 to D5?
  • Learn non-local shape:
    • Use connectivity relationships
    • Build hierarchies of shapes