1
Applying Machine Learning to Circuit Design
  • David Hettlinger
  • Amy Kerr
  • Todd Neller

2
Channel Routing Problems (CRPs)
  • Chip Design
  • Each silicon wafer contains hundreds of chips.
  • Chip components, specifically transistors, are
    etched into the silicon.
  • Often groups of components need to be wired
    together.

3
A Simple CRP Instance
(Figure: Silicon Channel)
4
One Possible Solution
Goal: Decrease the number of horizontal tracks.
5
CRP Constraints
  • Horizontal: Horizontal segments cannot overlap.
  • Vertical: If subnet x has a pin at the top of
    column a, and subnet y has a pin at the bottom of
    column a, then subnet x must be placed above
    subnet y. (Both checks are sketched below.)
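A minimal sketch of how these two constraints might be
checked in code. The representation here (each subnet
knows its pin columns and an assigned track, with
smaller track numbers higher in the channel) is an
assumption for illustration:

    # Assumed representation: each subnet is a dict holding its
    # leftmost/rightmost pin columns, its top/bottom pin columns,
    # and an assigned horizontal track (0 = topmost track).

    def horizontal_ok(a, b):
        """Two subnets on the same track must not overlap."""
        if a["track"] != b["track"]:
            return True
        return a["right"] < b["left"] or b["right"] < a["left"]

    def vertical_ok(x, y):
        """If x has a top pin and y a bottom pin in the same
        column, x's track must lie above y's."""
        shared = set(x["top_pins"]) & set(y["bottom_pins"])
        return not shared or x["track"] < y["track"]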

6
Simulated Annealing (SA)
  • Background
  • The term "annealing" comes from a process
    blacksmiths use.
  • Metal is heated then cooled slowly to make it as
    malleable as possible.
  • Statistical physicists developed SA.

7
SA Problem Components
  • Definable states: The current configuration
    (state) of a problem instance must be
    describable.
  • New state generation: A way to generate new
    states.
  • Energy function: A formula for the relative
    desirability of a given state.
  • Temperature and annealing schedule: A
    temperature value and a function for changing it
    over time. (A generic loop built from these
    pieces is sketched below.)
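These four components plug into one generic search
loop. A minimal sketch in Python, assuming the
component functions are supplied by the problem at
hand:

    import math
    import random

    def simulated_annealing(state, energy, neighbor, temperature, steps):
        """Generic SA loop built from the four components above."""
        current, best = state, state
        for k in range(steps):
            t = temperature(k)             # annealing schedule
            candidate = neighbor(current)  # new state generation
            delta = energy(candidate) - energy(current)
            # Metropolis rule: always accept improvements; accept
            # uphill moves with probability exp(-delta / t).
            if delta <= 0 or (t > 0 and random.random() < math.exp(-delta / t)):
                current = candidate
            if energy(current) < energy(best):
                best = current
        return best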

8
A Visual Example
Local Minimum
Global Minimum
9
Applying Simulated Annealing to CRPs
  • Definable states: The partitioning of subnets
    into groups.
  • New state generation: Change the grouping of
    the subnets.
  • Energy function: The number of horizontal tracks
    needed to implement a given partition.
  • Temperature and annealing schedule: Start the
    temperature just high enough to accept any new
    configuration. As for the annealing schedule,
    reinforcement learning can help find one. (The
    CRP-specific pieces are sketched below.)
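A sketch of what the CRP-specific pieces could look
like, with a state represented as a list of groups
(tracks) of subnets. This representation is an
assumption, and constraint checking is omitted for
brevity:

    import random

    def crp_energy(partition):
        """Energy: number of horizontal tracks used."""
        return len(partition)

    def crp_neighbor(partition):
        """New state: move one random subnet to another
        (possibly new) group."""
        groups = [list(g) for g in partition]
        src = random.choice([i for i, g in enumerate(groups) if g])
        subnet = groups[src].pop(random.randrange(len(groups[src])))
        dst = random.randrange(len(groups) + 1)  # may open a new track
        if dst == len(groups):
            groups.append([subnet])
        else:
            groups[dst].append(subnet)
        return [g for g in groups if g]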

10
A Simple CRP Example
Start State of a CRP Instance
Partition Graph of this State
11
A Simple CRP Example
States 1 and 2
Partition Graphs of these States
12
A Simple CRP Example
States 1, 2 and 3
Partition Graphs of these States
13
A Simple CRP Example
Starting through Ending States
Partition Graphs of these States
14
A Generated CRP Instance
Start State (15 horizontal tracks)
A Solution (12 horizontal tracks)
15
SA Problem Components
  • Definable states: The current configuration
    (state) of a problem instance must be
    describable.
  • New state generation: A way to generate new
    states.
  • Energy function: A formula for the relative
    desirability of a given state.
  • Temperature and annealing schedule: A
    temperature value and a function for changing it
    over time.

16
The Drunken Topographer
  • Imagine an extremely hilly landscape with many
    hills and valleys, high and low
  • Goal: find the lowest spot
  • Means: airlift in a drunk!
  • He starts at a random spot
  • Staggers randomly
  • More tired → rejects more uphill steps

17
Super-Drunks, Dead-Drunks, and Those In-Between
  • The Super-Drunk never tires
  • Never rejects uphill steps
  • How well will the Super-Drunk search?
  • The Dead-Drunk is absolutely tired
  • Always rejects uphill steps
  • How well will the Dead-Drunk search?
  • Now imagine a drunk that starts in fine condition
    and very gradually tires.

18
Traveling Salesman Problem
  • Have to travel a circuit around n cities
    (n = 400)
  • Different costs to travel between different
    cities (assume cost = distance)
  • State: ordering of cities (> 8×10^865 orderings
    for 400 cities)
  • Energy: cost of all travel
  • Step: select a portion of the circuit and
    reverse the ordering (sketched below)
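A sketch of these components for the TSP, with cities
as 2-D points and Euclidean distance as the cost (per
the assumption above):

    import math
    import random

    def tour_length(cities, order):
        """Energy: total cost of the circuit (cost = distance)."""
        return sum(math.dist(cities[order[i]],
                             cities[order[(i + 1) % len(order)]])
                   for i in range(len(order)))

    def reverse_step(order):
        """Step: reverse a randomly chosen portion of the circuit."""
        i, j = sorted(random.sample(range(len(order)), 2))
        return order[:i] + order[i:j + 1][::-1] + order[j + 1:]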

19
Determining the Annealing Schedule
  • The cooling schedule is critical
  • Determining this schedule by hand takes days
  • A computer can learn one in mere hours! (A
    typical hand-tuned baseline is sketched below.)
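The hand-tuned baseline is often a simple geometric
schedule; a sketch (the starting temperature t0 and
decay rate alpha are exactly the kind of parameters
that take days to tune by hand):

    def geometric_schedule(t0=100.0, alpha=0.999):
        """Return T(k) = t0 * alpha**k as a schedule function."""
        return lambda k: t0 * alpha ** k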

20
Reinforcement Learning Example
  • Goal: Ace a class
  • Need to learn how long we need to study to get
    an A
  • Trial & error: study for various amounts of time
  • Short-term rewards: exam grades, amount of free
    time
  • Long-term rewards:
  • Grade affects future opportunities
  • e.g. whether we can slack off later
  • Our semester grade (the goal is to maximize
    this!)

21
Reinforcement Learning (RL)
  • Learns completely by trial & error
  • No preprogrammed knowledge
  • No human supervision/mentorship
  • Receives rewards for each action

1. Immediate reward (numerical)
2. Delayed reward: actions affect future
situations & opportunities for future rewards
  • Goal: maximize long-term numerical reward

22
RL The Details
(Diagram: the Agent sends an Action to the
Environment; the Environment returns a Sensation and
a Reward.)
  • Agent: the learner (e.g. the student)
  • Environment: everything the agent cannot
    completely control
  • Includes reward functions (e.g. the grade scale)
  • And a description of the current state (e.g. the
    current average)
  • We call this description a sensation

23
RL Value Functions
  • Use immediate & delayed rewards to evaluate the
    desirability of actions / to learn the task
  • Value function of a state-action pair: Q(s,a)
  • The expected reward for taking action a from
    state s
  • Includes immediate & delayed reward
  • Strategy: Most of the time, choose the action
    that corresponds to the maximal Q(s,a) value for
    the state (sketched below)
  • Remember, we must explore sometimes!
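This "mostly greedy, sometimes explore" strategy is
commonly implemented as epsilon-greedy selection; a
minimal sketch, with Q stored as a dictionary keyed
by (state, action):

    import random

    def choose_action(Q, state, actions, epsilon=0.1):
        """Explore with probability epsilon; otherwise pick the
        action with the maximal Q(s,a) for this state."""
        if random.random() < epsilon:
            return random.choice(actions)
        return max(actions, key=lambda a: Q[(state, a)])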

24
RL Q-Functions
  • We must learn Q(s,a)
  • To start, set Q(s,a) = 0 for all s, a.
  • The agent tries various actions.
  • Each time it experiences action a from state s,
    it updates its estimate of Q(s,a) towards the
    actual reward experienced (update rule sketched
    below)
  • If we usually pick the action a that has the
    maximal Q-value for that state

→ max. total reward → optimal performance
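One standard way to update the estimate towards the
reward experienced is the Q-learning rule; a sketch
(the learning rate alpha and discount factor gamma
are illustrative values):

    def q_update(Q, s, a, reward, s_next, actions, alpha=0.1, gamma=0.9):
        """Move Q(s,a) a fraction alpha towards the immediate
        reward plus the discounted value of the best next action."""
        target = reward + gamma * max(Q[(s_next, a2)] for a2 in actions)
        Q[(s, a)] += alpha * (target - Q[(s, a)])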
25
Example Grid World
  • Can always move up, down, right, left
  • Board wraps around

(Figure: a grid with a Start cell and a Goal cell.)
  • Get to the goal in as few steps as possible.
  • Reward?
  • Meaning of Q?
  • What are the optimal paths?
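A sketch of the grid-world dynamics, with the
wrap-around handled by modular arithmetic. The grid
size, goal position, and a reward of -1 per step are
assumptions:

    MOVES = {"up": (-1, 0), "down": (1, 0),
             "left": (0, -1), "right": (0, 1)}

    def step(state, action, rows=5, cols=5, goal=(0, 4)):
        """Move one cell; every edge wraps to the opposite side.
        Assumed reward: -1 per step, 0 on reaching the goal."""
        dr, dc = MOVES[action]
        nxt = ((state[0] + dr) % rows, (state[1] + dc) % cols)
        return nxt, (0 if nxt == goal else -1)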

26
Applying RL to CRPs
  • Can learn an approximately optimal annealing
    schedule using RL
  • Reward function (sketched below):
  • Rewarded for reaching a better configuration
  • Penalized for the amount of time used to find
    this better configuration

→ The computer learns to find an approximately
optimal annealing schedule in a time-efficient
manner.
  • The program self-terminates!
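A sketch of the reward signal described above; the
time-penalty weight is an assumption:

    def annealing_reward(old_energy, new_energy, elapsed, weight=0.01):
        """Reward improvement in configuration quality,
        penalized by the time spent finding it."""
        return (old_energy - new_energy) - weight * elapsed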

27
Conclusions
  • The CRP is an interesting but complex problem
  • Simulated Annealing helps us solve CRPs
  • Simulated Annealing requires an annealing
    schedule (how to change the temperature over
    time)
  • Reinforcement Learning, which is just learning
    through trial & error, lets a computer learn an
    annealing schedule in hours instead of days.

28
Any Questions?