Title: Context-Sensitive Domain-Independent Algorithm Composition and Selection
1 Context-Sensitive Domain-Independent Algorithm Composition and Selection
- Troy A. Johnson and Rudi Eigenmann
- Purdue University
2 Motivation
- Increasing programmer productivity
- Typical language approach: increase abstraction
  - do more with less code
  - reduce development and maintenance costs
- Domain-specific languages / libraries (DSLs) provide a high level of abstraction
  - e.g., a domain is biology, chemistry, physics
- But, typically, a sequence of library calls is needed
3 A Library Designer's Problem
- Consider two useful library procedures, A and B
  - reluctant to include a third procedure C that simply calls A and B (i.e., has composite behavior)
  - though convenient, C is redundant
  - including only C prevents reuse of A and B
  - including all three, for all such procedures, greatly increases library size and complexity
- Library user is expected to compose sequences of fundamental calls
4 A Library User's Problem
- Novice users don't know these call sequences
  - procedures are documented independently
  - tutorials provide a few example call sequences
  - not an exhaustive list
  - may need to be adjusted for each calling context
- User knows what they want to do, but not how to do it.
- Can the compiler let them specify a goal, and insert an appropriate call sequence? Yes!
5 Our Solution
- Add an abstract algorithm (AA) construct to the programming language
  - named and defined (once) by the programmer
  - definition is the programmer's goal
  - called like a procedure (any number of times)
  - compiler replaces each AA call with a library call sequence
- How does the compiler do this?
  - short answer: it uses a domain-independent planner that accepts procedure specifications as operators
6 Our Major Contributions
- Explain how composition of DSL procedures can be implemented as a language feature
- First compiler to use a planner to insert a sequence of library calls, while considering the calling context
- A novel application of planning that motivates research in planning theory
- Support incomplete, abstract procedure specifications that can be written for many domains
  - use programmer-compiler interaction to clarify ambiguity
7 Outline
- Example: The BioPerl DSL for Bioinformatics
- Brief Introduction to Planning
- Mapping Composition onto Planning
- Related Work (very abbreviated; see paper)
- Conclusions & Future Work
8 A Common BioPerl Call Sequence
- Query a remote database and save the result to local storage

    Query q = bio_db_query_genbank_new("nucleotide", "Arabidopsis[ORGN] AND topoisomerase[TITL] AND 0:3000[SLEN]")
    DB db = bio_db_genbank_new( )
    Stream stream = get_stream_by_query(db, q)
    SeqIO seqio = bio_seqio_new(">sequence.fasta", "fasta")
    Seq seq = next_seq(stream)
    write_seq(seqio, seq)

5 data types, 6 procedure calls
Example adapted from http://www.bioperl.org/wiki/HOWTO:Beginners
9 Describing the Library User's Goal
- Library author provides a domain glossary
  - query_result(result, db, query): result is the outcome of sending query to the database db
  - contains(filename, data): file named filename contains data
  - in_format(filename, format): file named filename is in format format
- Glossary terms are properties (facts), whereas procedure calls are actions
- Library user lists properties of their goal
10 Defining and Calling an AA
- AA (goal) defined using the glossary...

    algorithm save_query_result_locally(db_name, query_string, filename, format) =>
        query_result(result, db_name, query_string),
        contains(filename, result),
        in_format(filename, format)

  Order does not matter. These are not procedure calls.
- ...and called like a procedure

    Seq seq = save_query_result_locally("nucleotide", "Arabidopsis[ORGN] AND topoisomerase[TITL] AND 0:3000[SLEN]", ">sequence.fasta", "fasta")

1 data type, 1 AA call
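One way to picture an AA definition is as an unordered set of properties that must all hold after the call. The following is a minimal Python sketch under that assumption; the tuple encoding and the names `make_goal` and `satisfied` are hypothetical illustrations, not the DIPACS syntax or implementation.

```python
# Hypothetical model: a property is a tuple (glossary_term, arg1, arg2, ...).
def make_goal(db_name, query_string, filename, fmt):
    """Instantiate the goal of save_query_result_locally for one call site."""
    return frozenset({
        ("query_result", "result", db_name, query_string),
        ("contains", filename, "result"),
        ("in_format", filename, fmt),
    })


def satisfied(goal, facts):
    """A goal holds when every one of its properties is among the known facts."""
    return goal <= facts


goal = make_goal("nucleotide", "Arabidopsis[ORGN]", "sequence.fasta", "fasta")
facts = {("in_format", "sequence.fasta", "fasta")}

print(satisfied(goal, facts))         # only one of the three properties holds
print(satisfied(goal, facts | goal))  # all three properties hold
```

Because the goal is a set, listing the properties in a different order yields the same goal, which mirrors the remark that order does not matter.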
11 Call-Sequence Selection
- Compiler's planner may find multiple sequences
  - programmer can select a sequence
  - or the compiler can select one heuristically
    - by using library annotations as a guide (may require knowing typical program values to compare sequences)
    - by selecting the sequence with the fewest calls
- Incomplete specifications may cause undesirable sequences to be suggested
  - incompleteness will occur in practice
  - cannot eliminate the option for programmer review
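The selection policy above can be sketched in a few lines of Python. This is an illustration only: the function `select_plan` and the `choose` callback (standing in for programmer review) are assumed names, and the second candidate sequence is a made-up alternative rather than output from a real planner.

```python
def select_plan(candidates, choose=None):
    """Return one call sequence from the planner's candidates.

    candidates: list of call sequences, each a list of procedure names.
    choose: optional callback standing in for programmer review.
    """
    if not candidates:
        raise ValueError("planner found no call sequence")
    if choose is not None:
        return choose(candidates)
    # Heuristic fallback: fewest calls; min() keeps the first candidate on ties.
    return min(candidates, key=len)


candidates = [
    ["bio_db_query_genbank_new", "bio_db_genbank_new", "get_stream_by_query",
     "bio_seqio_new", "next_seq", "write_seq"],
    ["bio_db_query_genbank_new", "get_stream_by_query",
     "bio_seqio_new", "next_seq", "write_seq"],
]

print(select_plan(candidates))                           # heuristic choice
print(select_plan(candidates, choose=lambda c: c[0]))    # reviewed choice
```

Keeping the review callback optional reflects the point that heuristics help but the option for programmer review cannot be removed.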
12 Making Interaction Unobtrusive
- Compilation is normally non-interactive
  - what programmers expect
  - interaction should be minimized
- Cache the programmer's responses
  - most code does not change between compiles
  - avoids repeatedly selecting the same sequence
  - compiler flag to clear or ignore the cache
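A response cache of this kind might look like the Python sketch below. Everything here is an assumption made for illustration: the key scheme (a hash of the call site and the candidate sequences), the in-memory dictionary, and the `ignore_cache` parameter standing in for the compiler flag. The talk does not specify how the cache is keyed or stored.

```python
import hashlib
import json


def cache_key(call_site, candidates):
    """Key depends on the call site and on the exact candidate sequences."""
    payload = json.dumps([call_site, candidates], sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def select_with_cache(cache, call_site, candidates, ask, ignore_cache=False):
    """Return the chosen sequence, asking the programmer only on a cache miss."""
    key = cache_key(call_site, candidates)
    if not ignore_cache and key in cache:
        return candidates[cache[key]]
    index = ask(candidates)  # interactive step, e.g. a prompt
    cache[key] = index
    return candidates[index]


# Usage: the second compile of unchanged code asks no question.
cache = {}
questions = []


def ask(options):
    questions.append(len(options))
    return 1


plans = [["a", "b", "c"], ["a", "c"]]
first = select_with_cache(cache, "main.c:42", plans, ask)
second = select_with_cache(cache, "main.c:42", plans, ask)
print(first == second, len(questions))
```

Hashing the candidates along with the call site means a changed library specification or calling context produces a new key, so a stale answer is not reused.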
13 Advantages of Our AA Approach
- The following can remain unknown
  - library procedure names
  - order of calls
  - intermediate variables
- Can teach library users call sequences
- AA calls adapt under different calling contexts
14 Why Calling Context Matters
- save_query_result_locally is replaced with library calls
- In general, would use the 6-call sequence
  - creates 5 data objects
- What if some objects already exist?
  - suppose there is a live DB object
  - then DB db = bio_db_genbank_new( ) is unnecessary
- What if the goal is already satisfied?
  - then the AA call does not generate any code
15 Composing a Call Sequence
- A planner discovers a sequence of instantiated operators (actions, i.e., calls), known as a plan
- Given
  - initial state: the calling context, from the compiler
  - goal state: the AA definition, from the programmer
  - operator set: the library procedure specifications, from the librarian
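The three planner inputs can be made concrete with a toy forward-search planner in Python. This is a deliberately simplified sketch, not the DIPACS planner: facts are plain strings with hypothetical names such as `have_db`, operators are already instantiated, effects only add facts, and breadth-first search stands in for whatever search strategy a real planner uses.

```python
from collections import deque

# Hypothetical operator set: name -> (preconditions, added facts).
OPERATORS = {
    "bio_db_query_genbank_new": (frozenset(), frozenset({"have_query"})),
    "bio_db_genbank_new": (frozenset(), frozenset({"have_db"})),
    "get_stream_by_query": (frozenset({"have_db", "have_query"}),
                            frozenset({"have_stream"})),
    "bio_seqio_new": (frozenset(), frozenset({"have_seqio"})),
    "next_seq": (frozenset({"have_stream"}), frozenset({"have_seq"})),
    "write_seq": (frozenset({"have_seqio", "have_seq"}), frozenset({"saved"})),
}


def plan(initial, goal, operators=OPERATORS):
    """Breadth-first search from the initial state to any state containing goal."""
    start = frozenset(initial)
    goal = frozenset(goal)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, calls = queue.popleft()
        if goal <= state:
            return calls  # a shortest call sequence
        for name, (pre, add) in operators.items():
            if pre <= state:  # operator is applicable in this state
                nxt = state | add
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, calls + [name]))
    return None  # goal unreachable with these operators


print(plan(set(), {"saved"}))        # empty calling context
print(plan({"have_db"}, {"saved"}))  # a live DB object already exists
print(plan({"saved"}, {"saved"}))    # goal already satisfied
```

The same goal yields different plans depending on the initial state: with a live DB object the constructor call is omitted, and with the goal already satisfied the plan is empty, so no code would be generated.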
16 Greatly-Simplified View of Planning
[Figure: the Planner (Compiler) takes Operators (Library Specs.), an Initial State (Call Context), and a Goal State (AA Definition), and produces a Plan; the Plan User (Executable) carries out the Plan's Actions on the World. Diagram labels: "A Domain-INDEPENDENT Planner", "A Domain-Dependent Planner".]
- World is composed of objects
- Actions modify objects' properties and relationships
- Planner deals with a symbolic model
17 Why Planning Is Difficult
- Operators define a state-transition system
  - precondition: when an operator can be used
  - effects: what an operator does
- Planner finds a path through the system from the initial state to the goal state
- What's difficult?
  - typically too many states to enumerate
  - search intelligently using reasonable time and space
  - danger that planner may not terminate
18 Is planning necessary for composition?
- Many possible actions
  - libraries contain 10s to 100s of procedures (operators)
  - each procedure has several parameters
  - 10s to 100s of live variables (objects) at a call site
  - many ways to bind variables to parameters
- Ex.: 128 procs, 2 params each, 8 objs, 4 calls
  - assume all objects and params have the same type
  - $(128 \times 8^2)^4 = (2^7 \times 2^6)^4 = 2^{13 \times 4} = 2^{52}$ potential plans
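The count in the example can be checked with a few lines of Python. The variable names are illustrative; the numbers are the ones given on the slide.

```python
procedures = 128
params_per_procedure = 2
objects = 8
calls = 4

# Each parameter can be bound to any of the objects (all share one type).
bindings_per_procedure = objects ** params_per_procedure
# Every (procedure, binding) pair is one possible action at each step.
actions_per_step = procedures * bindings_per_procedure
# A plan of fixed length picks one action per call.
potential_plans = actions_per_step ** calls

print(actions_per_step == 2 ** 13)
print(potential_plans == 2 ** 52)
print(potential_plans)
```

Even this small example is far too large to enumerate naively, which is the argument for using a planner that searches intelligently.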
19 Overall System Challenges
[Figure: system diagram. A Domain-Specific Library provides Procedure Specifications (Operators) to the DIPACS Planner inside the DIPACS Compiler; Application Code provides the Initial State and Goal State; the Plan is merged into C Code, which gcc compiles into a Binary (Actions) that operates on the Run-Time Program State (World). Challenges 1 to 4, listed below, are marked on the diagram.]
1. Ontological Engineering: choosing a glossary for the domain
2. Determine Initial and Goal States: requires flow analysis and translation to a planning language
3. Object Creation: most planners assume a fixed set of objects
4. Merge the Plan into the Program: destructive vs. non-destructive plans
DIPACS = Domain-Independent Planned Algorithm Composition and Selection
20 Related Work
- Languages and Compilers
  - Jungloids
    - David Mandelin et al. Jungloid Mining: Helping to Navigate the API Jungle. PLDI, June 2005.
  - Broadway
    - Samuel Z. Guyer and Calvin Lin. Broadway: A Compiler for Exploiting the Domain-Specific Semantics of Software Libraries. Proceedings of the IEEE, 93(2):342-357, February 2005.
  - Speckle
    - Mark T. Vandevoorde. Exploiting Specifications to Improve Program Performance. PhD thesis, Massachusetts Institute of Technology, 1994.
21 Related Work (continued)
- Automatic Programming
  - Robert Balzer. A 15-year Perspective on Automatic Programming. IEEE Transactions on Software Engineering, 11(11):1257-1268, November 1985.
  - David R. Barstow. Domain-Specific Automatic Programming. IEEE Transactions on Software Engineering, 11(11):1321-1336, November 1985.
  - Charles Rich and Richard C. Waters. Automatic Programming: Myths and Prospects. IEEE Computer, 21(8):40-51, August 1988.
- Automated (AI) Planning
  - Keith Golden. A Domain Description Language for Data Processing. Proc. of the International Conf. on Automated Planning and Scheduling, 2003.
  - M. Stickel et al. Deductive Composition of Astronomical Software from Subroutine Libraries. Proc. of the International Conf. on Automated Deduction, 1994.
  - Other work at NASA Ames
22 Conclusions & Future Work
- A DSL compiler can use a planner to implement a useful language feature
- We provide an example using a real DSL
- Identified implementation challenges in this talk
  - for detailed solutions, see paper
- Future work
  - call sequences that include branches and loops
  - develop examples using additional domains