Parallel Pi Implementation

Provided by: kened

1
Parallel Pi Implementation
  • Ken Edwards

2
Overview
  • Algorithms chosen
  • Implementation
  • Performance results

3
Monte Carlo
  • Randomly select values for x and y
  • $0 \le x, y \le 1$
  • $\pi/4 \approx$ (points inside the quarter circle) / (total points)
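
As a rough illustration of the sampling idea on this slide, here is a minimal serial sketch in C. The function name `estimate_pi_serial` and its parameters are assumptions made for this example; the slides do not show the author's code.

```c
#include <assert.h>
#include <stdlib.h>

/* Estimate pi by sampling n points uniformly in the unit square and
 * counting how many fall inside the quarter circle x*x + y*y <= 1.
 * Name and signature are illustrative, not taken from the slides. */
double estimate_pi_serial(long n, unsigned int seed)
{
    assert(n > 0);
    srand(seed);

    long inside = 0;
    for (long i = 0; i < n; i++) {
        double x = (double)rand() / RAND_MAX;  /* 0 <= x <= 1 */
        double y = (double)rand() / RAND_MAX;  /* 0 <= y <= 1 */
        if (x * x + y * y <= 1.0)
            inside++;
    }
    /* inside / n approximates pi/4, so scale by 4. */
    return 4.0 * (double)inside / (double)n;
}
```

With a million points, a call such as `estimate_pi_serial(1000000, 42u)` typically lands within a few thousandths of pi.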

Source: http://www.modula2.org/projects/pi_by_montecarlo/est_pi.gif

4
Monte Carlo Continued
  • Pros
      • Straightforward implementation
      • Monte Carlo algorithms parallelize well
  • Cons
      • Very slow convergence
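
To put a number on the slow convergence: each sampled point is a Bernoulli trial that succeeds with probability $p = \pi/4$, so the standard error of the estimate after $N$ points follows from the binomial variance. This is a standard result, not something stated on the slides.

```latex
\sigma_{\hat{\pi}} = 4\sqrt{\frac{p(1-p)}{N}} \approx \frac{1.64}{\sqrt{N}},
\qquad p = \frac{\pi}{4}
```

Since the error shrinks only as $1/\sqrt{N}$, each additional correct decimal digit costs roughly 100 times more points.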

5
Machin's Formula
  • $\pi/4 = 4\arctan(1/5) - \arctan(1/239)$
  • $\arctan(x) = x - x^3/3 + x^5/5 - x^7/7 + x^9/9 - \cdots$
  • Infinite number of similar formulas
      • $\pi = 4\arctan(1)$
      • $\pi = 4\arctan(1/2) + 4\arctan(1/3)$
      • $\pi = 4\arctan(1/2) + 4\arctan(1/5) + 4\arctan(1/8)$
      • $\pi = 16\arctan(1/5) - 4\arctan(1/70) + 4\arctan(1/99)$
      • $\pi = 24\arctan(1/8) + 8\arctan(1/57) + 4\arctan(1/239)$
      • $\pi = 48\arctan(1/18) + 32\arctan(1/57) - 20\arctan(1/239)$
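
The arctan series and Machin's formula on this slide can be sketched in a few lines of C. This is a double-precision illustration only; the slides do not say what precision or number representation the author used, and the names `arctan_series` and `machin_pi` are made up for the example.

```c
#include <assert.h>

/* Sum the alternating series arctan(x) = x - x^3/3 + x^5/5 - ...
 * and stop once a term's magnitude drops below tol.
 * Intended for 0 < x < 1. Illustrative helper, not from the slides. */
double arctan_series(double x, double tol)
{
    assert(x > 0.0 && x < 1.0 && tol > 0.0);
    double x2 = x * x;
    double power = x;   /* signed odd power of x: x, -x^3, x^5, ... */
    double sum = 0.0;
    for (long k = 0; ; k++) {
        double term = power / (double)(2 * k + 1);
        sum += term;
        if ((term < 0.0 ? -term : term) < tol)
            break;
        power = -power * x2;
    }
    return sum;
}

/* Machin's formula: pi/4 = 4 arctan(1/5) - arctan(1/239). */
double machin_pi(double tol)
{
    return 4.0 * (4.0 * arctan_series(1.0 / 5.0, tol)
                      - arctan_series(1.0 / 239.0, tol));
}
```

Calling `machin_pi(1e-15)` reaches the limit of double precision after only about a dozen series terms in total, which is the "much faster convergence" contrast with Monte Carlo.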

6
Machin's Formula
  • Pros
      • Much faster convergence
      • Variety of ways to parallelize
  • Cons
      • More complex than Monte Carlo
      • Assigning equal loads across processors is challenging
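
One way to see why equal loads are hard to assign: the number of series terms an arctan needs depends strongly on its argument. The sketch below counts the terms of arctan(1/q) that are still at least `tol` in size. The helper `terms_needed` is hypothetical and uses double precision purely for illustration.

```c
#include <assert.h>

/* Count how many terms of the series for arctan(1/q) are still
 * >= tol in magnitude. Hypothetical helper for illustration only. */
long terms_needed(long q, double tol)
{
    assert(q > 1 && tol > 0.0);
    double x = 1.0 / (double)q;
    double x2 = x * x;
    double power = x;   /* x^(2k+1), sign ignored */
    long k = 0;
    while (power / (double)(2 * k + 1) >= tol) {
        power *= x2;
        k++;
    }
    return k;
}
```

At a tolerance of 1e-15, arctan(1/5) needs 10 terms while arctan(1/239) needs only 3, so a thread assigned the 1/239 term in Machin's formula finishes well before the one assigned 1/5.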

7
What I Did
  • Monte Carlo
      • Serial version
      • Pthread version
      • OpenMP version
  • Machin's formula
      • Serial version
      • Several Pthread versions
      • OpenMP version

8
Machin's Formula - Pthreads
  • Each arctan calculation is its own thread
  • Two threads
      • $\pi/4 = 4\arctan(1/5) - \arctan(1/239)$
  • Three threads
      • $\pi = 4\arctan(1/2) + 4\arctan(1/5) + 4\arctan(1/8)$
      • $\pi = 48\arctan(1/18) + 32\arctan(1/57) - 20\arctan(1/239)$
  • Four threads
      • $\pi/4 = 7\arctan(1/13) + 8\arctan(1/32) - 2\arctan(1/132) + 5\arctan(1/378)$
      • $\pi = 14\arctan(1/4) - 10\arctan(1/32) + 6\arctan(1/132) - 8\arctan(1/378)$
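
The "one thread per arctan" layout might look like the following for the two-thread case. This is a sketch under assumptions: the struct, the function names, and the use of double precision are invented for the example, and Machin's formula is multiplied through by 4 so that each thread returns its share of pi directly.

```c
#include <assert.h>
#include <stddef.h>
#include <pthread.h>

/* One unit of work: coeff * arctan(1/q). Hypothetical layout. */
struct arctan_task {
    double coeff;   /* signed multiplier in the formula */
    long q;         /* the thread evaluates arctan(1/q) */
    double tol;     /* stop once a series term is below this */
    double result;  /* written by the worker, read after join */
};

static void *arctan_worker(void *arg)
{
    struct arctan_task *t = arg;
    double x = 1.0 / (double)t->q;
    double x2 = x * x;
    double power = x;
    double sum = 0.0;
    for (long k = 0; power / (double)(2 * k + 1) >= t->tol; k++) {
        double term = power / (double)(2 * k + 1);
        sum += (k % 2 == 0) ? term : -term;  /* alternating signs */
        power *= x2;
    }
    t->result = t->coeff * sum;
    return NULL;
}

/* pi = 16 arctan(1/5) - 4 arctan(1/239), one thread per arctan. */
double machin_pi_pthreads(double tol)
{
    assert(tol > 0.0);
    struct arctan_task tasks[2] = {
        { 16.0,   5, tol, 0.0 },
        { -4.0, 239, tol, 0.0 },
    };
    pthread_t threads[2];
    for (int i = 0; i < 2; i++) {
        int rc = pthread_create(&threads[i], NULL, arctan_worker, &tasks[i]);
        assert(rc == 0);
        (void)rc;
    }
    double pi = 0.0;
    for (int i = 0; i < 2; i++) {
        pthread_join(threads[i], NULL);
        pi += tasks[i].result;
    }
    return pi;
}
```

Each task owns its own `result` field, so no locking is needed; the main thread only reads a result after joining the thread that wrote it. Extending the `tasks` array to three or four entries gives the other layouts listed on this slide.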

9
Initial Pthread Performance
Tests run on a 4-way machine with 2.20 GHz Intel Xeons

10
Why Such Variable Results?
Source: http://turner.faculty.swau.edu/pi/piforms.html

11
Monte Carlo Performance
  • Initially, the concurrent version performed worse!
  • rand() is not thread-safe
  • Used rand_r() instead
  • Lesson learned: make sure the functions you call are
    thread-safe before using them!
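
A minimal sketch of the rand_r() fix, assuming a POSIX system. OpenMP is used here only because it keeps the example short; the slides do not say which concurrent version showed the problem. The function name and the per-thread seeding scheme are assumptions, and the seeding is deliberately simple rather than statistically rigorous.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes rand_r() in <stdlib.h> */
#include <assert.h>
#include <stdlib.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Monte Carlo estimate of pi where every thread keeps its own
 * generator state, so rand_r() never touches shared data.
 * Compiles and runs serially if OpenMP is not enabled. */
double estimate_pi_parallel(long n, unsigned int base_seed)
{
    assert(n > 0);
    long inside = 0;

    #pragma omp parallel reduction(+:inside)
    {
        unsigned int seed = base_seed;  /* private to this thread */
#ifdef _OPENMP
        /* Offset the seed so threads do not replay the same stream. */
        seed += 7919u * (unsigned int)omp_get_thread_num();
#endif
        #pragma omp for
        for (long i = 0; i < n; i++) {
            double x = (double)rand_r(&seed) / RAND_MAX;
            double y = (double)rand_r(&seed) / RAND_MAX;
            if (x * x + y * y <= 1.0)
                inside++;
        }
    }
    return 4.0 * (double)inside / (double)n;
}
```

The key difference from a rand()-based loop is that `seed` lives on each thread's stack, so threads neither race on nor serialize around a hidden global generator state.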

Tests run on a 4-way machine with 2.20 GHz Intel Xeons

12
What's Left
  • Try more formulas to get faster-converging 3- and
    4-threaded versions
  • Profile the code to find additional performance
    bottlenecks

13
Thank You