High Performance Fortran (HPF) - PowerPoint PPT Presentation

About This Presentation

Title:

High Performance Fortran (HPF)

Description:

Restrict kind of parallelism used. Use semi-automatic approach ... More restrictive than general message passing model (only data parallelism) ...

Slides: 21

Provided by: csVu

Transcript and Presenter's Notes

1
High Performance Fortran (HPF)

Source
Chapter 7 of "Designing and Building Parallel Programs" (Ian Foster, 1995)

2
Question

Can't we just have a clever compiler generate a
parallel program from a sequential program?
Fine-grained parallelism
x := a*b + c*d
Trivial parallelism
for i := 1 to 100 do
   for j := 1 to 100 do
      C[i, j] := dotproduct(A[i, *], B[*, j])
   od
od

3
Automatic parallelism

Automatic parallelization of any program is
extremely hard
Solutions
Make restrictions on source program
Restrict kind of parallelism used
Use semi-automatic approach
Use application-domain oriented languages

4
High Performance Fortran (HPF)

Designed by a forum from industry, government,
universities
Extends Fortran 90
To be used for computationally expensive
numerical applications
Portable to SIMD machines, vector processors,
shared-memory MIMD and distributed-memory MIMD

5
Fortran 90 - Base language of HPF

Extends Fortran 77 with 'modern' features
abstract data types, modules
recursion
pointers, dynamic storage
Array operators
A = B*C
A = A + 1.0
A(1:7) = B(1:7) + B(2:8)
WHERE (X /= 0) X = 1.0/X

6
Data parallelism

Data parallelism: the same operation applied to different data elements in parallel
Data parallel program: a sequence of data parallel operations
Overall approach
Programmer does domain decomposition
Compiler partitions operations automatically
Data may be regular (array) or irregular (tree,
sparse matrix)
Most data parallel languages only deal with arrays

7
Data parallelism - Concurrency

Explicit parallel operations
A = B*C                        ! A, B, and C are arrays
Implicit parallelism
do i = 1,m
   do j = 1,n
      A(i,j) = B(i,j)*C(i,j)
   enddo
enddo

8
Compiling data parallel programs

Programs are translated automatically into
parallel SPMD (Single Program Multiple Data)
programs
Each processor executes same program on a subset
of the data
Owner computes rule
- Each processor owns subset of the data
structures
- Operations required for an element are executed
by the owner
- Each processor may read (but not modify) other
elements
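The owner computes rule can be illustrated with a small emulation. This is a minimal sketch, not how any particular HPF compiler works: the helper names `owned_indices` and `spmd_scale` are hypothetical, a block distribution is assumed, and the "processors" are simply visited one after another instead of running concurrently.

```python
def owned_indices(rank, nprocs, n):
    """1-based indices of an n-element array owned by `rank` (block distribution assumed)."""
    block = -(-n // nprocs)              # ceiling division: elements per processor
    lo = rank * block + 1
    hi = min(n, (rank + 1) * block)
    return range(lo, hi + 1)


def spmd_scale(x, factor, nprocs):
    """Emulate X = X * factor: every 'processor' runs the same loop on its own elements."""
    result = list(x)
    work = []
    for rank in range(nprocs):           # on a real machine these iterations run in parallel
        mine = owned_indices(rank, nprocs, len(x))
        for i in mine:
            result[i - 1] = x[i - 1] * factor   # the owner of X(i) computes X(i)
        work.append(len(mine))
    return result, work


x = [float(i) for i in range(1, 101)]
scaled, work = spmd_scale(x, 3.0, 4)
print(work)   # number of elements each processor updated
```

Every processor executes the same code; only the index range differs, which is what makes the generated program SPMD.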

9
Example

real s, X(100), Y(100)            ! s is scalar, X and Y are arrays
X = X * 3.0                       ! Multiply each X(i) by 3.0
do i = 2,99
   Y(i) = (X(i-1) + X(i+1))/2     ! Communication required
enddo
s = SUM(X)                        ! Communication required
Arrays X and Y are distributed (partitioned)
Scalar s is replicated on each machine
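To see where the communication in this example comes from, the following sketch lists the stencil reads that cross a processor boundary and computes the reduction from per-processor partial sums. It assumes 4 processors and a block distribution of X and Y (the slide does not fix either), and the names `owner`, `remote` and `partial` are illustrative only.

```python
N, P = 100, 4
BLOCK = -(-N // P)                      # 25 elements per processor


def owner(i):
    """0-based processor that owns element i (1-based) of X and Y."""
    return (i - 1) // BLOCK


# Y(i) = (X(i-1) + X(i+1))/2 for i = 2..99: record reads of elements owned elsewhere.
remote = []
for i in range(2, 100):
    for k in (i - 1, i + 1):
        if owner(k) != owner(i):
            remote.append((i, k))

# s = SUM(X): each processor sums its own block, then the partial sums are combined.
x = list(range(1, N + 1))
partial = [sum(x[p * BLOCK:(p + 1) * BLOCK]) for p in range(P)]
s = sum(partial)                        # the combining step is where messages are needed

print(remote)
print(partial, s)
```

Only the elements next to a block boundary are read remotely, so the stencil needs a few neighbour exchanges, while the sum needs one combining step across all processors.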

10
HPF primitives for data distribution

Directives
PROCESSORS: shape and size of abstract processors
ALIGN: align elements of different arrays
DISTRIBUTE: distribute (partition) an array
Directives affect performance of the program, not
its result

11
Processors directive

!HPF$ PROCESSORS P(32)
!HPF$ PROCESSORS Q(4,8)
Mapping of abstract to physical processors is not specified in HPF (implementation-dependent)

12
Alignment directive

Aligns an array with another array
Specifies that specific elements should be mapped to the same processor
real A(50), B(50), C(50,50)
!HPF$ ALIGN A(I) WITH B(I)
!HPF$ ALIGN A(I) WITH B(I+2)
!HPF$ ALIGN A(:) WITH C(:,*)
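The effect of an offset alignment can be sketched as follows. The sizes are hypothetical (A has 10 elements and B has 12, so that index I+2 stays inside B), B is assumed to be block-distributed over 4 processors, and the function names are made up for illustration. The sketch counts how many elements of the statement A(i) = B(i+2) would need a remote read under each alignment.

```python
NPROCS = 4
A_SIZE, B_SIZE = 10, 12                  # hypothetical sizes, chosen so that i + 2 <= B_SIZE
BLOCK = -(-B_SIZE // NPROCS)             # B is assumed to be block-distributed


def owner_of_b(j):
    """0-based processor owning B(j)."""
    return (j - 1) // BLOCK


def owner_of_a(i, offset):
    """A(i) is aligned with B(i + offset), so it lives wherever that element lives."""
    return owner_of_b(i + offset)


def remote_reads(offset):
    """Elements of A(i) = B(i+2) whose right-hand side is owned by another processor."""
    return sum(1 for i in range(1, A_SIZE + 1)
               if owner_of_a(i, offset) != owner_of_b(i + 2))


print(remote_reads(0))   # identity alignment, as in ALIGN A(I) WITH B(I)
print(remote_reads(2))   # offset alignment, as in ALIGN A(I) WITH B(I+2)
```

The directive does not change what the statement computes; it only decides whether the operands already sit on the same processor.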

13
Figure 7.6 from Foster's book
14
Distribution directive

Specifies how elements should be partitioned among the local memories
Each dimension can be distributed as follows
*           no distribution
BLOCK(n)    block distribution
CYCLIC(n)   cyclic distribution
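The two distribution formats can be written as owner functions. This is a sketch under the usual reading of the directives: BLOCK without an argument uses a block size of ceil(N/P), and CYCLIC without an argument uses a block size of 1. The function names and the 12-element, 4-processor example are assumptions made for illustration.

```python
def block_owner(i, n, p, size=None):
    """Owner (0-based) of element i (1-based) under BLOCK or BLOCK(size)."""
    if size is None:
        size = -(-n // p)                # default block size: ceil(n / p)
    return (i - 1) // size


def cyclic_owner(i, p, size=1):
    """Owner (0-based) of element i (1-based) under CYCLIC or CYCLIC(size)."""
    return ((i - 1) // size) % p         # blocks of `size` are dealt out round-robin


n, p = 12, 4
block = [block_owner(i, n, p) for i in range(1, n + 1)]
cyclic = [cyclic_owner(i, p) for i in range(1, n + 1)]
cyclic2 = [cyclic_owner(i, p, 2) for i in range(1, n + 1)]

print(block)
print(cyclic)
print(cyclic2)
```

BLOCK keeps neighbouring elements together, which suits stencil-like access; CYCLIC spreads elements evenly, which helps load balance when the work per element varies.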

15
Figure 7.7 from Foster's book
16
Example program

See Ian Foster, Program 7.2
program hpf_finite_difference
!HPF$ PROCESSORS pr(4)                     ! use 4 CPUs
real X(100, 100), New(100, 100)            ! data arrays
!HPF$ ALIGN New(:,:) WITH X(:,:)
!HPF$ DISTRIBUTE X(BLOCK,*) ONTO pr        ! row-wise
New(2:99, 2:99) = (X(1:98, 2:99) + X(3:100, 2:99) + &
                   X(2:99, 1:98) + X(2:99, 3:100))/4
diffmax = MAXVAL (ABS (New-X))
end

17
Example program (2)

Use block distribution instead of row distribution
program hpf_finite_difference
!HPF$ PROCESSORS pr(2,2)                   ! use 2x2 grid
real X(100, 100), New(100, 100)            ! data arrays
!HPF$ ALIGN New(:,:) WITH X(:,:)
!HPF$ DISTRIBUTE X(BLOCK, BLOCK) ONTO pr   ! block-wise
New(2:99, 2:99) = (X(1:98, 2:99) + X(3:100, 2:99) + &
                   X(2:99, 1:98) + X(2:99, 3:100))/4
diffmax = MAXVAL (ABS (New-X))
end
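The two versions of the program compute the same result but move different amounts of data. The sketch below counts, for the four-point stencil on the 100 x 100 array, how many reads touch an element owned by another processor under each distribution. It is a simple counting model with hypothetical helper names, and it ignores message startup costs and any optimisation a real compiler might apply.

```python
def block_owner(index, extent, nprocs):
    """0-based owner of a 1-based index under BLOCK distribution of one dimension."""
    block = -(-extent // nprocs)         # ceiling division
    return (index - 1) // block


def remote_reads(n, grid):
    """Count stencil reads of elements owned by another processor.

    grid = (rows, cols) of the processor arrangement; cols == 1 models X(BLOCK,*).
    """
    prow, pcol = grid

    def owner(i, j):
        return (block_owner(i, n, prow), block_owner(j, n, pcol))

    count = 0
    for i in range(2, n):                # interior points 2..n-1, as in New(2:99, 2:99)
        for j in range(2, n):
            me = owner(i, j)
            for ni, nj in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
                if owner(ni, nj) != me:
                    count += 1
    return count


row_wise = remote_reads(100, (4, 1))     # PROCESSORS pr(4),   DISTRIBUTE X(BLOCK,*)
block_wise = remote_reads(100, (2, 2))   # PROCESSORS pr(2,2), DISTRIBUTE X(BLOCK,BLOCK)
print(row_wise, block_wise)
```

In this model the 2x2 arrangement reads fewer remote elements, because each processor's block has a shorter total boundary than a 25-row strip.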

18
Performance

Distribution affects
Load balance
Amount of communication
Example (communication costs)
!HPF$ PROCESSORS pr(3)
integer A(8), B(8), C(8)
!HPF$ ALIGN B(:) WITH A(:)
!HPF$ DISTRIBUTE A(BLOCK) ONTO pr
!HPF$ DISTRIBUTE C(CYCLIC) ONTO pr
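For these directives the placement of every element can be worked out by hand, or with a short sketch like the one below. The two assignments it examines, A = B and A = C, are hypothetical statements chosen to illustrate the cost difference and are not taken from the slides; the helper names are likewise illustrative.

```python
NPROCS, N = 3, 8


def block_owner(i):
    """Owner of element i under A(BLOCK): block size is ceil(8/3) = 3."""
    size = -(-N // NPROCS)
    return (i - 1) // size


def cyclic_owner(i):
    """Owner of element i under C(CYCLIC): elements are dealt out round-robin."""
    return (i - 1) % NPROCS


owner_a = [block_owner(i) for i in range(1, N + 1)]
owner_b = list(owner_a)                  # ALIGN B(:) WITH A(:) gives B the same mapping
owner_c = [cyclic_owner(i) for i in range(1, N + 1)]


def remote_elements(lhs, rhs):
    """Elements of an elementwise assignment whose operands live on different processors."""
    return sum(1 for left, right in zip(lhs, rhs) if left != right)


print(owner_a)
print(owner_c)
print(remote_elements(owner_a, owner_b), remote_elements(owner_a, owner_c))
```

The block distribution is also slightly unbalanced here (3, 3 and 2 elements), which illustrates the load-balance side of the same trade-off.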

19
Figure 7.9 from Foster's book
20
Conclusions

High-level model
User specifies data distribution
Compiler generates parallel program and communication
More restrictive than general message passing
model (only data parallelism)
Restricted to array-based data structures
HPF programs will be easy to modify, which
enhances portability
Changing data distribution only requires changing
directives
