CSE5304 - PowerPoint PPT Presentation

1
CSE5304 Project Proposal: Parallel Matrix Multiplication
  • Tian Mi

2
A naive version with MPI
[Figure: diagram with processors P1, P2, ..., Pi, ..., PN and the Result matrix]
3
A naive version with MPI
[Figure: diagram for a single processor Pi]
4
A naive version with MPI
  • Processor 0 reads the input file
  • Processor 0 distributes (scatters) one matrix
  • Processor 0 broadcasts the other matrix
  • All processors, in parallel, multiply their piece of the data
  • Processor 0 gathers the result
  • Processor 0 writes the result to the output file
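
The steps above can be sketched as a serial simulation that needs no MPI installation. This is only an illustration under stated assumptions, not the presenter's code: it assumes the first matrix is split into equal row blocks (the role MPI_Scatter plays), the second matrix is handed whole to every processor (MPI_Bcast), and the partial products are concatenated in rank order (MPI_Gather). The names `matmul`, `scatter_rows`, and `naive_parallel_matmul` are hypothetical.

```python
def matmul(X, Y):
    """Plain triple-loop product of two list-of-lists matrices."""
    inner, cols = len(Y), len(Y[0])
    return [[sum(row[k] * Y[k][j] for k in range(inner)) for j in range(cols)]
            for row in X]


def scatter_rows(A, nprocs):
    """Split the rows of A into nprocs equal contiguous blocks.

    Assumption: the row count divides evenly, as an equal-count
    scatter would require.
    """
    if len(A) % nprocs != 0:
        raise ValueError("row count must be divisible by nprocs")
    chunk = len(A) // nprocs
    return [A[p * chunk:(p + 1) * chunk] for p in range(nprocs)]


def naive_parallel_matmul(A, B, nprocs):
    """Simulate the scatter / broadcast / multiply / gather pipeline."""
    pieces = scatter_rows(A, nprocs)            # root distributes one matrix
    b_copies = [B for _ in range(nprocs)]       # root broadcasts the other
    partial = [matmul(pieces[p], b_copies[p])   # each rank multiplies its piece
               for p in range(nprocs)]
    # The root gathers the partial results in rank order.
    return [row for part in partial for row in part]


A = [[1, 2], [3, 4], [5, 6], [7, 8]]
B = [[1, 2, 3], [4, 5, 6]]
C = naive_parallel_matmul(A, B, nprocs=2)
```

In a real MPI program each list comprehension over `p` would instead run once per rank, and the three helper steps would be collective calls made by every rank.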

5
MPI_Scatter
6
MPI_Scatter
7
MPI_Bcast
8
MPI_Bcast
9
MPI_Gather
10
MPI_Gather
11
Data generation
  • Data generated in R with the igraph package
  • Integers in the range -1000 to 1000
  • Matrix sizes:

Matrix size   512 x 512   1024 x 1024   2048 x 2048   4096 x 4096
File size     2.69 MB     10.7 MB       43.1 MB       172 MB
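
The slides generate the data in R with igraph; the following is a swapped-in Python sketch of the same idea, not the presenter's script. It assumes a whitespace-separated text layout with one matrix row per line (the actual file format is not stated in the slides) and treats both range endpoints as inclusive. The function names are hypothetical.

```python
import random


def generate_matrix(n, low=-1000, high=1000, seed=None):
    """Return an n x n list-of-lists of random integers in [low, high]."""
    rng = random.Random(seed)
    return [[rng.randint(low, high) for _ in range(n)] for _ in range(n)]


def format_matrix(matrix):
    """Serialize a matrix as text: one row per line, values space-separated."""
    return "\n".join(" ".join(str(v) for v in row) for row in matrix) + "\n"


def parse_matrix(text):
    """Inverse of format_matrix: read the text layout back into a matrix."""
    return [[int(tok) for tok in line.split()]
            for line in text.splitlines() if line.strip()]


sample = generate_matrix(8, seed=1)
text = format_matrix(sample)
```
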
12
Result
  • Data size: 1024 x 1024

Processors   Run 1 (s)   Run 2 (s)   Run 3 (s)   Run 4 (s)   Run 5 (s)   Average (s)   Speedup
1            44          41          45          37          42          41.8          1
2            23          20          21          19          22          21            1.99
4            11          10          19          18          16          14.8          2.82
8            10          9           8           9           10          9.2           4.54
16           9           9           11          9           6           8.8           4.75
32           8           10          8           7           7           8             5.23
64           8           8           8           8           8           8             5.23
128          10          9           6           8           9           8.4           4.98
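
The Average and Speedup columns follow from the five run times: speedup on p processors is the one-processor average divided by the p-processor average. The sketch below recomputes them from the table; the efficiency value (speedup divided by processor count) is an extra column added here for illustration and does not appear in the slides.

```python
# Run times in seconds for the 1024 x 1024 case, copied from the table.
RUNS_1024 = {
    1: [44, 41, 45, 37, 42],
    2: [23, 20, 21, 19, 22],
    4: [11, 10, 19, 18, 16],
    8: [10, 9, 8, 9, 10],
    16: [9, 9, 11, 9, 6],
    32: [8, 10, 8, 7, 7],
    64: [8, 8, 8, 8, 8],
    128: [10, 9, 6, 8, 9],
}


def average(times):
    return sum(times) / len(times)


def speedup_table(runs):
    """Map processor count -> (average time, speedup, efficiency)."""
    baseline = average(runs[1])
    table = {}
    for procs, times in runs.items():
        avg = average(times)
        speedup = baseline / avg
        table[procs] = (avg, speedup, speedup / procs)
    return table


table = speedup_table(RUNS_1024)
for procs, (avg, speedup, efficiency) in table.items():
    print(f"{procs:>4} {avg:6.1f} {speedup:6.2f} {efficiency:6.2f}")
```
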
13
Result
  • Data size: 1024 x 1024

14
Result
  • Data size: 1024 x 1024

15
Result
  • Data size: 2048 x 2048

Processors   Time (s)   Speedup
1            751        1
2            498        1.508032
4            258        2.910853
8            127        5.913386
16           84         8.940476
32           51         14.72549
64           55         13.65455
128          48         15.64583
16
Result
  • Data size: 2048 x 2048

17
Result
  • Data size: 2048 x 2048

18
Result
  • Data size: 4096 x 4096

Processors   Time (s)   Speedup
1            5920       1
2            3630       1.630854
4            2813       2.104515
8            925        6.4
16           745        7.946309
32           576        10.27778
64           n/a        n/a
128          n/a        n/a
19
Analysis
  • To see superlinear speedup:
  • increase the computation, which is not yet dominant enough
  • use a larger matrix and larger integers
  • However, a larger matrix or longer integers will also
    increase the communication time (broadcast,
    scatter, gather)
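
The trade-off in this analysis can be illustrated with a toy cost model. Everything in it is an assumption made for illustration: computation is taken as proportional to n^3 / p, communication as proportional to 3 n^2 (one matrix scattered, one broadcast, one gathered) and independent of p, and the per-operation costs `t_flop` and `t_word` are made-up constants, not measurements. The model shows only that a larger matrix raises the compute-to-communication ratio; it cannot produce superlinear speedup, which depends on effects outside this model.

```python
def modeled_time(n, p, t_flop=1e-8, t_word=1e-7):
    """Toy run-time estimate in seconds for an n x n product on p processors.

    t_flop and t_word are hypothetical per-operation and per-word costs.
    """
    compute = t_flop * n ** 3 / p
    # With one processor there is nothing to scatter, broadcast, or gather.
    communicate = 0.0 if p == 1 else t_word * 3 * n ** 2
    return compute + communicate


def modeled_speedup(n, p, **costs):
    return modeled_time(n, 1, **costs) / modeled_time(n, p, **costs)


for n in (1024, 2048, 4096):
    print(n, [round(modeled_speedup(n, p), 1) for p in (2, 8, 32, 128)])
```

Under these assumptions the modeled speedup at a fixed processor count grows with n, because computation grows as n^3 while communication grows only as n^2.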

20
Cannon's algorithm: Example
  • http://www.vampire.vanderbilt.edu/education-outreach/me343_fall2008/notes/parallelMM_10_09.pdf

21
Cannon's algorithm
  • Still implementing and debugging
  • No results to share at present
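
Since the slides show no implementation, the following is a generic serial simulation of the standard Cannon's algorithm, not the presenter's code. It assumes square matrices whose size is divisible by the grid width `q`, and it models the q x q processor grid as nested lists of blocks; the function names are hypothetical. Each step multiplies the locally held blocks, then shifts the A blocks one position left and the B blocks one position up, with wraparound.

```python
def matmul(X, Y):
    """Plain triple-loop product of two list-of-lists matrices."""
    inner, cols = len(Y), len(Y[0])
    return [[sum(row[k] * Y[k][j] for k in range(inner)) for j in range(cols)]
            for row in X]


def cannon(A, B, q):
    """Multiply n x n matrices A and B on a simulated q x q processor grid."""
    n = len(A)
    if n % q != 0:
        raise ValueError("matrix size must be divisible by q")
    b = n // q  # block size held by each simulated processor

    def block(M, i, j):
        return [row[j * b:(j + 1) * b] for row in M[i * b:(i + 1) * b]]

    # Initial alignment: shift block row i of A left by i,
    # and block column j of B up by j.
    a_blk = [[block(A, i, (j + i) % q) for j in range(q)] for i in range(q)]
    b_blk = [[block(B, (i + j) % q, j) for j in range(q)] for i in range(q)]
    c_blk = [[[[0] * b for _ in range(b)] for _ in range(q)] for _ in range(q)]

    for _ in range(q):
        # Every processor multiplies its current blocks and accumulates.
        for i in range(q):
            for j in range(q):
                prod = matmul(a_blk[i][j], b_blk[i][j])
                for r in range(b):
                    for s in range(b):
                        c_blk[i][j][r][s] += prod[r][s]
        # Shift A blocks left by one and B blocks up by one, with wraparound.
        a_blk = [[a_blk[i][(j + 1) % q] for j in range(q)] for i in range(q)]
        b_blk = [[b_blk[(i + 1) % q][j] for j in range(q)] for i in range(q)]

    # Reassemble the full result matrix from the per-processor blocks.
    C = []
    for i in range(q):
        for r in range(b):
            C.append([v for j in range(q) for v in c_blk[i][j][r]])
    return C


A = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
B = [[2, 0, 1, 3], [1, 1, 0, 2], [0, 4, 2, 1], [3, 2, 1, 0]]
C = cannon(A, B, q=2)
```

Unlike the naive version, no processor ever holds a whole matrix: each one stores a single block of A, B, and the result at a time.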

22
Thank you
  • Questions? Comments?