Implementing Tomorrow's Programming Languages
Transcript and Presenter's Notes
1
Implementing Tomorrow's Programming Languages
  • Rudi Eigenmann
  • Purdue University
  • School of ECE
  • Computing Research Institute
  • Indiana, USA

2
How to find Purdue University
3
Computing Research Institute (CRI)
Other Discovery Park centers: Bioscience, Nanotechnology, E-Enterprise, Entrepreneurship, Learning, Advanced Manufacturing, Environment, Oncology.
CRI is the high-performance computing branch of Discovery Park.
4
Compilers are the Center of the Universe
  • The compiler translates the programmer's view into the machine's view.

5
Why is Writing Compilers Hard? A high-level view
  • Translation passes are complex algorithms
  • Not enough information at compile time
  • Input data not available
  • Insufficient knowledge of architecture
  • Not all source code available
  • Even with sufficient information, modeling performance is difficult
  • Architectures are moving targets

6
Why is Writing Compilers Hard? From an implementation angle
  • Interprocedural analysis
  • Alias/dependence analysis
  • Pointer analysis
  • Information gathering and propagation
  • Link-time, load-time, run-time optimization
  • Dynamic compilation/optimization
  • Just-in-time compilation
  • Autotuning
  • Parallel/distributed code generation

7
It's Even Harder Tomorrow
  • Because we want:
  • All our programs to work on multicore processors
  • Very high-level languages: "do weather forecast"
  • Composition: combine the weather forecast with an energy-reservation and cooling manager
  • Reuse: warn me if I'm writing a module that already exists out there

8
How Do We Get There? Paths towards tomorrow's programming languages
  • Addressing the (new) multicore challenge
  • Automatic Parallelization
  • Speculative Parallel Architectures
  • SMP languages for distributed systems
  • Addressing the (old) general software engineering
    challenge
  • High-level languages
  • Composition
  • Symbolic analysis
  • Autotuning

9
The Multicore Challenge
  • We have finally reached the long-expected 'speed wall' for the processor clock.
  • (This should not be news to you!)
  • It is one of the biggest disruptions in the evolution of information technology.
  • Software engineers who do not know parallel
    programming will be obsolete in no time.

10
Automatic Parallelization: Can we implement standard languages on multicore?
Polaris: A Parallelizing Compiler
  • More specifically, a source-to-source restructuring compiler
  • Research issues in such a compiler
  • Detecting parallelism
  • Mapping parallelism onto the machine
  • Performing compiler techniques at runtime
  • Compiler infrastructure

Pipeline: Standard Fortran -> Polaris -> Fortran with OpenMP directives -> OpenMP backend compiler
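To make this concrete, here is a hedged sketch, in C rather than Polaris's Fortran and with hypothetical function names, of the kind of transformation such a source-to-source restructurer performs:

/* Input: a loop whose iterations a parallelizer can prove independent
 * ('restrict' lets alias analysis show that a and b do not overlap). */
void scale(double *restrict a, const double *restrict b, double s, int n) {
    for (int i = 0; i < n; i++)
        a[i] = s * b[i];
}

/* Source-to-source output: the same loop, now annotated with an OpenMP
 * directive for the backend compiler to map onto threads. */
void scale_parallel(double *restrict a, const double *restrict b, double s, int n) {
    #pragma omp parallel for
    for (int i = 0; i < n; i++)
        a[i] = s * b[i];
}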
11
State of the Art in Automatic Parallelization
  • Advanced optimizing compilers perform well in 50% of all science/engineering applications.
  • Caveats: this is true
  • in research compilers
  • for regular applications, written in Fortran or C without pointers
  • Wanted: heroic, black-belt programmers who know the 'assembly language' of HPC

12
Can Speculative Parallel Architectures Help?
  • Basic idea (a toy software sketch follows the list)
  • The compiler splits the program into sections, without considering data dependences
  • The sections are executed in parallel
  • The architecture tracks data-dependence violations and takes corrective action.
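A toy software rendering of the idea in C, with the two sections run one after another and only write-write conflicts checked; real speculative hardware runs the sections in parallel and also tracks reads:

#include <stdio.h>
#include <string.h>

#define N 8

/* Loop body with an indirect write: the compiler cannot prove at
 * compile time whether iterations are independent. */
static void body(int i, int a[N], const int idx[N]) {
    a[idx[i]] += i;
}

int main(void) {
    int idx[N] = {0, 1, 2, 3, 4, 5, 6, 7};  /* permute to create a conflict */
    int a[N] = {0}, sec1[N], sec2[N];
    unsigned char w1[N] = {0}, w2[N] = {0};  /* per-section write sets */

    memcpy(sec1, a, sizeof a);
    memcpy(sec2, a, sizeof a);

    /* Two sections execute speculatively on private copies of the data. */
    for (int i = 0; i < N / 2; i++) { body(i, sec1, idx); w1[idx[i]] = 1; }
    for (int i = N / 2; i < N; i++) { body(i, sec2, idx); w2[idx[i]] = 1; }

    /* Violation check: write-write overlap only, for brevity. */
    int conflict = 0;
    for (int k = 0; k < N; k++) conflict |= w1[k] & w2[k];

    if (conflict) {
        /* Squash the speculative state and re-execute in order. */
        for (int i = 0; i < N; i++) body(i, a, idx);
        puts("conflict detected: re-executed sequentially");
    } else {
        /* Commit both sections' writes to the real array. */
        for (int k = 0; k < N; k++) {
            if (w1[k]) a[k] = sec1[k];
            if (w2[k]) a[k] = sec2[k];
        }
        puts("speculation succeeded");
    }
    return 0;
}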

13
Performance of Speculative Multithreading
  • SPEC CPU2000 FP programs executed on a 4-core
    speculative architecture.

14
We May Need Explicit Parallel Programming
  • Shared-memory architectures
  • OpenMP: a proven model for science/engineering programs
  • Suitability for non-numerical programs?
  • Distributed computers
  • MPI: the 'assembly language' of parallel/distributed systems. Can we do better? (A side-by-side sketch of the two models follows.)
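As a hedged illustration of the gap between the two models (a sum reduction, not an example from the talk), note how much of the work MPI shifts onto the programmer:

#include <mpi.h>

/* Shared memory, OpenMP: one directive expresses the whole parallel
 * reduction; the compiler and runtime handle threads and the merge. */
double sum_openmp(const double *x, int n) {
    double s = 0.0;
    #pragma omp parallel for reduction(+ : s)
    for (int i = 0; i < n; i++)
        s += x[i];
    return s;
}

/* Distributed memory, MPI: the programmer partitions the iteration
 * space and inserts communication by hand, which is the 'assembly
 * language' character the slide refers to. Assumes x is replicated
 * on all ranks and MPI_Init has been called. */
double sum_mpi(const double *x, int n) {
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    int lo = (int)((long long)n * rank / size);
    int hi = (int)((long long)n * (rank + 1) / size);
    double local = 0.0, global = 0.0;
    for (int i = lo; i < hi; i++)
        local += x[i];
    MPI_Allreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
    return global;
}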

15
Beyond Science/Engineering Applications
  • The '7 Dwarfs' (the last four items are proposed additions)
  • Structured Grids (including locally structured
    grids, e.g. Adaptive Mesh Refinement)
  • Unstructured Grids
  • Fast Fourier Transform
  • Dense Linear Algebra
  • Sparse Linear Algebra
  • Particles
  • Monte Carlo
  • Search/Sort
  • Filter
  • Combinational logic
  • Finite State Machine

16
Shared-Memory Programming for Distributed
Applications?
  • Idea 1
  • Use an underlying software distributed-shared-memory system (e.g., TreadMarks).
  • Idea 2
  • Direct translation into message-passing code

17
OpenMP for Software DSM
Challenges
  • S-DSM maintains coherence at page granularity
  • Optimizations that reduce false sharing and increase page affinity are very important (a sketch follows the diagram below)
  • In S-DSMs such as TreadMarks, the stacks are not in the shared address space
  • The compiler must identify shared stack variables -> interprocedural analysis

[Diagram: two processors with private stacks and distributed memories implement a shared address space. At a barrier, P1 tells P2 "I have written page x"; P2 later requests the page diff from P1 (e.g., for the write A=50).]
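Why page affinity matters, as a hedged C/OpenMP sketch: the same loop under two schedules, where only the blocked one keeps each page with a single writer (assumes a page-based S-DSM and, ideally, a page-aligned array, e.g., via posix_memalign with 4096-byte alignment):

/* Worst case: chunks of one iteration, round-robin across threads,
 * so every thread writes into every page of the array and the S-DSM
 * sees constant false sharing at page granularity. */
void update_interleaved(double *a, int n) {
    #pragma omp parallel for schedule(static, 1)
    for (int i = 0; i < n; i++)
        a[i] += 1.0;
}

/* Better: one contiguous block per thread, so almost every page is
 * written by exactly one processor (page affinity). */
void update_blocked(double *a, int n) {
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < n; i++)
        a[i] += 1.0;
}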
18
Optimized Performance of SPEC OMPM2001 Benchmarks
on a TreadMarks S-DSM System
19
Direct Translation of OpenMP into Message Passing
  • A question often asked: how is this different from HPF?
  • HPF: the emphasis is on data distribution
  • OpenMP: the starting point is explicit parallel regions
  • HPF implementations apply strict data-distribution and owner-computes schemes
  • Our approach: partial replication of shared data (a sketch of the translated code follows the list)
  • Partial replication leads to:
  • Synchronization-free serial code
  • Communication-free data reads
  • Communication for data writes amenable to collective message passing
  • Irregular accesses (in our benchmarks) amenable to compile-time analysis
  • Note: partial replication is not necessarily data scalable
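A hedged sketch of what such a translation might produce for a simple parallel loop; the function name and the 2.0*a[i] workload are illustrative, not the compiler's actual output:

#include <mpi.h>

/* OpenMP-to-MPI with partial replication: the shared array 'a' is
 * replicated on every rank, each rank executes its block of the
 * original "omp parallel for", and the written blocks are merged
 * with one collective so later reads need no communication. */
void translated_parallel_for(double *a, int n) {
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int counts[size], displs[size];        /* per-rank block layout (C99 VLAs) */
    for (int r = 0; r < size; r++) {
        int lo = (int)((long long)n * r / size);
        int hi = (int)((long long)n * (r + 1) / size);
        displs[r] = lo;
        counts[r] = hi - lo;
    }

    /* Each rank computes only its own block of iterations... */
    for (int i = displs[rank]; i < displs[rank] + counts[rank]; i++)
        a[i] = 2.0 * a[i];

    /* ...then the written blocks are merged into every replica. */
    MPI_Allgatherv(MPI_IN_PLACE, 0, MPI_DATATYPE_NULL,
                   a, counts, displs, MPI_DOUBLE, MPI_COMM_WORLD);
}

Because the array is replicated, all reads stay local; only the writes travel, and they travel in a single collective message.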

20
Performance of OpenMP-to-MPI Translation
[Chart: speedups of the OpenMP-to-MPI translated versions (new) and hand-coded MPI versions (existing); higher is better.]
Performance comparison of our OpenMP-to-MPI translated versions versus (hand-coded) MPI versions of the same programs. Hand-coded MPI represents a practical upper bound; speedup is relative to the serial version.
21
How does the performance compare to the same
programs optimized for Software DSM?
[Chart: speedups of OpenMP for S-DSM (existing, Project 2) and OpenMP-to-MPI (new); higher is better.]
22
How Do We Get There? Paths towards tomorrow's programming languages
  • The (new) multicore challenge
  • Automatic Parallelization
  • Speculative Parallel Architectures
  • SMP languages for distributed systems
  • The (old) general software engineering challenge
  • High-level languages
  • Composition
  • Symbolic analysis
  • Autotuning

23
(Very) High-Level Languages
  • Observation: the number of programming errors is roughly proportional to the number of program lines
[Chart: language levels rising from Assembly to Fortran to object-oriented languages to scripting/Matlab; a '?' marks what comes next.]
24
Composition: Can we compose software from existing modules?
  • Idea
  • Add an abstract algorithm (AA) construct to the programming language
  • The programmer defines the AA's goal
  • An AA is called like a procedure
  • The compiler replaces each AA call with a sequence of library calls
  • How does the compiler do this?
  • It uses a domain-independent planner that accepts procedure specifications as operators

25
Motivation: Programmers Often Write Sequences of Library Calls
  • Example, a common BioPerl call sequence: query a remote database and save the result to local storage

Query q = bio_db_query_genbank_new("nucleotide",
    "Arabidopsis[ORGN] AND topoisomerase[TITL] AND 0:3000[SLEN]");
DB db = bio_db_genbank_new();
Stream stream = get_stream_by_query(db, q);
SeqIO seqio = bio_seqio_new(">sequence.fasta", "fasta");
Seq seq = next_seq(stream);
write_seq(seqio, seq);

Example adapted from http://www.bioperl.org/wiki/HOWTO:Beginners
26
Defining and Calling an AA
  • An AA (goal) is defined using the glossary...

algorithm save_query_result_locally(db_name, query_string, filename, format) ->
    query_result(result, db_name, query_string),
    contains(filename, result),
    in_format(filename, format)
1 data type, 1 AA call
27
Ontological Engineering
  • The library author provides a domain glossary:
  • query_result(result, db, query): result is the outcome of sending query to the database db
  • contains(filename, data): the file named filename contains data
  • in_format(filename, format): the file named filename is in the format format

28
Implementing the Composition Idea
[Diagram: a domain-independent planner inside the compiler. Library specifications supply the operators, the call context supplies the initial state, and the AA definition supplies the goal state; the planner produces a plan, a sequence of library-call actions, that is inserted into the executable.]
  • Borrowing AI technology: planners
  • -> For details, see PLDI 2006

29
Symbolic Program Analysis
  • Today, many compiler techniques assume numerical constants
  • Needed: techniques that can reason about the program in symbolic terms, for example (a C sketch follows the list):
  • Differentiate: y = a*x^2 -> y' = 2*a*x
  • Analyze ranges: y = exp; if (c) y += 5 -> y = [exp : exp+5]
  • Recognize algorithms:
    c = 0
    DO j = 1, n
      if (t(j) < v) c = c + 1
    ENDDO
    -> c = COUNT(t(1:n) < v)
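The counting example, rendered as a hedged C sketch (function names hypothetical); the parallel reduction is one possible target once the idiom is recognized:

/* The counting idiom from the slide, as a compiler would see it. */
int count_loop(const double *t, int n, double v) {
    int c = 0;
    for (int j = 0; j < n; j++)
        if (t[j] < v) c++;
    return c;
}

/* Once the idiom is recognized as COUNT(t(1:n) < v), the compiler can
 * substitute a tuned implementation, e.g., a parallel reduction. */
int count_recognized(const double *t, int n, double v) {
    int c = 0;
    #pragma omp parallel for reduction(+ : c)
    for (int j = 0; j < n; j++)
        c += (t[j] < v);
    return c;
}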

30
Autotuning (dynamic compilation/adaptation)
  • Moving compile-time decisions to runtime
  • A key observation:
  • Compiler writers resolve difficult decisions by creating a command-line option
  • -> Finding the best combination of options means making the difficult compiler decisions (a driver sketch follows)
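A minimal driver sketch of this observation, assuming POSIX timing and a hypothetical workload in kernel.c whose main() runs the kernel once; each option set stands in for one hard compiler decision:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Candidate option sets to search over. */
static const char *opts[] = {
    "-O2", "-O3", "-O3 -funroll-loops", "-O3 -ffast-math",
};

int main(void) {
    int best = -1;
    double best_time = 1e30;

    for (int i = 0; i < 4; i++) {
        char cmd[256];
        snprintf(cmd, sizeof cmd, "cc %s -o kernel kernel.c", opts[i]);
        if (system(cmd) != 0)
            continue;                      /* skip variants that fail to build */

        struct timespec t0, t1;            /* wall-clock timing (POSIX) */
        clock_gettime(CLOCK_MONOTONIC, &t0);
        if (system("./kernel") != 0)
            continue;
        clock_gettime(CLOCK_MONOTONIC, &t1);
        double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;

        printf("%-22s %.3f s\n", opts[i], secs);
        if (secs < best_time) { best_time = secs; best = i; }
    }
    if (best >= 0)
        printf("best option set: %s\n", opts[best]);
    return 0;
}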

31
Tuning Time
PEAK tuning is 20 times as fast as whole-program tuning: on average, PEAK reduces tuning time from 2.19 hours to 5.85 minutes.
32
Program Performance
The performance of the tuned programs is the same as with whole-program tuning.
33
Conclusions
  • Advanced compiler capabilities are crucial for implementing tomorrow's programming languages
  • The multicore challenge -> parallel programs
  • Automatic parallelization
  • Support for speculative multithreading
  • Shared-memory programming support
  • High-level constructs
  • Composition pursues this goal
  • Techniques to reason about programs in symbolic terms
  • Dynamic tuning