Title: Four-A Component Adaptation and Assurance
1 Four-A Component Adaptation and Assurance
- William L. Scherlis
- Institute for Software Research
- CMU School of Computer Science
- scherlis_at_cs.cmu.edu, 412-268-8741
- With John Tang Boyland (U Wisc), Aaron Greenhouse, Edwin Chan
PI Meeting, Honolulu, July 2000
2 Technical Objectives
- 1. Improve source-level software assurance
- Systematically improve code safety, tolerance, etc., using source-level analysis, annotation, and transformation.
- Improve the extent of formal assurance using analyses, annotation, and transformation.
- Provide composable approaches for a variety of code-safety properties, based on annotations.
- 2. Provide ongoing assurance through evolution
- Avoid re-verification of code safety, tolerance, and other properties as software components and systems evolve.
- Support the programmer through adaptation by formally analyzing and carrying out changes, preserving and enhancing assurance where possible.
5 Four-A and Information Assurance
[Diagram labels: Code Safety, Four-A, Robustness, Security, Encapsulation]
- Code safety
- Arrays: avoid bounds and type exceptions
- Types: avoid cast exceptions
- References: avoid null-pointer exceptions
- Concurrency: avoid races, deadlocks, stolen locks, etc.
- Exceptions: avoid non-declared RuntimeException instances (excluding VM)
- Data representations: avoid violations of data integrity
- Robustness
- Binary compatibility: Java's own promises, compiler vs. load
- Redundancy, integrity: insert redundant integrity checks across components
- Encapsulation
- References and access: protect referenced objects
- Unique: protect referenced objects
[Diagram labels: Prevention, Detection, Tolerance; Java]
6 Threats, Assumptions, Policy
- Threats/attacks addressed
- Exploitations of code safety errors via client interfaces
- E.g., APIs, subclassing
- Historically a major threat domain (CERT data)
- "Misuse" of system calls
- E.g., C buffer overflow
- Examples
- Induced exceptions
- Data integrity violations
- Concurrency failures
- Liberal policy vs. race condition
- Deadlock inducement, lock stealing, etc.
- Testing difficulties
- Assumptions
- Code-level focus
- Type safety
- Limited/absent functional specs
- Component orientation with diverse access requirements
- Policies enforced
- Data integrity and confidentiality policies
- Concurrency policies
- Consistency with mechanical specifications
- Enforcement approach
- Code manipulation, analysis, annotation/specification
- Management of assurance through code change
7 Footnote: CERT vulnerability taxonomy
- Assumptions wrong or changed
- Design errors
- Implementation errors
- Basic programming practices
- Improper use of a well understood algorithm
- Timing windows
- Privileged programs
- Trusts something not designed to support trust
- Trusts untrustworthy information
- Errors in requirements specifications
- Other problems
- User interface
8 Outline
- Technical objectives
- Code-level assurance
- In development: improve
- In evolution: sustain
- Link with design
- Threats, assumptions, policy
- Research hypotheses
- Project approach
- Scope
- Engineering practice
- Baseline
- Adaptation: JDK evolution
- Security: CERT vulnerability data
- Four-A case studies and scenarios
- Technical approach
- Semantics-based manipulation
- Program analysis
- Annotation and scalability
- Tool-based studies
- Accomplishments and plans
- Schedule
- Expected accomplishments
- Transition management
- Techniques, technologies, tools
- Approach
9 Complexity and Encapsulation in Practice
- "Combining helpers into a host class makes the host class more complex but also potentially more efficient, due to short-circuited method calls and the like."
- "Performing such simplifications along the way, we can define a more concise, slightly more efficient, and surely more frightening version of BoundedBuffer."
- Lea, Concurrent Programming in Java, 2nd ed.
- A recurring textbook remark.
- "When you are handed an interface, the first thing you'll see is a set of operations that specify a service of a class or a component. Look a little deeper and you'll see the full significance of these operations, along with any of their special properties, such as visibility, scope, and concurrency semantics."
- "These properties are important, but for complex interfaces they aren't enough to help you understand the semantics of the service they represent, much less know how to use these operations properly."
- "In the absence of any other information, you'd have to dive into some abstraction that realizes the interface to figure out what each operation does and how these operations are meant to work together."
- "However, that defeats the purpose of an interface, which is to provide a clear separation of concerns in a system."
- Booch, Rumbaugh, Jacobson, The Unified Modeling Language User Guide.
- See example in Sun's production library code.
10 Four-A Hypotheses
- The four A's. Manipulations, analyses, annotations, and detailed design-record management can be used to improve safety, tolerance, and robustness.
- Add safety checks, redundancies, audit, graceful degradation
- Scalability. Many code safety properties can be made composable using annotations.
- Examples: exceptions, arrays, concurrency management, types
- Evolution. Re-verification of many critical safety and dependability properties can be avoided.
- Systematization. Safety risks of restructuring can be reduced using systematic techniques.
- Administrative structural changes
- Performance improvements
- Robustness improvements
11 Four-A Project Approach
- Assurance at code-level
- Code safety
- Design consistency
- Robustness improvement
- Techniques
- Semantics-based program manipulation
- E.g., performance optimization, relocate abstraction boundaries, refactor, stage, increase concurrency, conform with changing APIs
- Advanced program analysis
- Program annotations and specifications
- Effects, unique refs, uses, mutability, etc.
- Fine-grained audit trail, design record, and linking
- Ultra-fine versioning, design/code linking, code diff, etc.
- Tools
- Language-independent core
- IR, CFG and analysis framework, Usability.
- Java-specific capability (99 pure)
- Data
- Understand change logs and code-level
vulnerabilities from practice
12 Four-A Project Approach, continued
- Work from code level through design toward spec
- Why: Code as ground truth. Snapshot problem.
- Why: Legacy code. Exploit and improve partial specs.
- Why: Manage detailed design.
- Use partial information about components in a system
- Why: Trade secret (COTS). Security. Distributed development.
- Cf. whole-program analysis
- Rely on encapsulation, type safety, composable props
- Java, (modified) beans, etc.
- Encapsulation benefits both programmers and intruders.
- Why: Scalability. Partial information. Manipulation soundness.
- Focus on administrative change in routine SWE
- Why: Appropriate roles for programmers and tools. Adoptability.
- Why: Tune for performance, security, robustness
13 Scenarios and Baseline
- Commercial baseline
- CERT vulnerability database
- Many vulnerabilities/exploitations enabled by failures in code-level implementation: code safety, concurrency, etc.
- Four-A's JDK census
- Preliminary results
- Most changes explicitly require client-side semantic analysis
- Estimate 34 to be potentially tool-feasible
- Only about 20 are potentially transparent to client
- Many changes affect code safety
- Very limited client-side API evolution support
- E.g., Sun's sed script
- Four-A scenarios
- Busy-guy / Bad-guy
- Based in production code
- Examples
- Denial of service: lock stealing
- Information leakage: unique
- Unlocked door: method extract
14 A Simple Concurrency Example
This version of the class is safe for single threading, but does not support concurrent access.
    public class Point {
      private int x;
      private int y;
      public Point( final int x, final int y ) {
        this.x = x;
        this.y = y;
      }
      public int getX() { return x; }
      public int getY() { return y; }
      public void set( int x, int y ) {
        this.x = x;
        this.y = y;
      }
      public String toString() {
        return "(" + x + ", " + y + ")";
      }
    }
This version of the class is safe for concurrency, but (in this rendering) has lost extensibility.
    public class Point {
      // A private lock object protects against lock stealing.
      private final Object mutex = new Object();
      private int x;
      private int y;
      public Point( final int x, final int y ) {
        this.x = x;
        this.y = y;
      }
      // Synchronizing every access to x and y avoids races.
      public int getX() { synchronized( mutex ) { return x; } }
      public int getY() { synchronized( mutex ) { return y; } }
      public void set( int x, int y ) {
        synchronized( mutex ) {
          this.x = x;
          this.y = y;
        }
      }
    }
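The lock-stealing threat that the private mutex guards against can be sketched as follows. This is an illustration only, not code from the project: ExposedPoint, GuardedPoint, LockStealingDemo, and the 500 ms timeout are all hypothetical. A "bad guy" thread holds the intrinsic lock of an object it was handed, and a "busy guy" thread then tries to call getX().

```java
import java.util.concurrent.CountDownLatch;

/** Hypothetical class that locks on `this`: any client can take the same lock. */
class ExposedPoint {
    private int x;
    ExposedPoint(int x) { this.x = x; }
    synchronized int getX() { return x; }
}

/** Hypothetical class that locks on a private object clients cannot reach. */
class GuardedPoint {
    private final Object mutex = new Object();
    private int x;
    GuardedPoint(int x) { this.x = x; }
    int getX() { synchronized (mutex) { return x; } }
}

class LockStealingDemo {
    /**
     * Returns true if `call` finishes within 500 ms while another thread
     * holds the intrinsic lock of `target`.
     */
    static boolean completesWhileLockHeld(Object target, Runnable call) {
        CountDownLatch lockTaken = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Thread badGuy = new Thread(() -> {
            synchronized (target) {              // "steal" the object's lock
                lockTaken.countDown();
                try { release.await(); } catch (InterruptedException ignored) { }
            }
        });
        Thread busyGuy = new Thread(call);
        badGuy.setDaemon(true);
        busyGuy.setDaemon(true);
        try {
            badGuy.start();
            lockTaken.await();                   // the lock is now held
            busyGuy.start();
            busyGuy.join(500);                   // give the caller time to finish
            boolean finished = !busyGuy.isAlive();
            release.countDown();                 // let the lock holder exit
            badGuy.join();
            busyGuy.join();
            return finished;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ExposedPoint exposed = new ExposedPoint(1);
        GuardedPoint guarded = new GuardedPoint(1);
        System.out.println(completesWhileLockHeld(exposed, exposed::getX));
        System.out.println(completesWhileLockHeld(guarded, guarded::getX));
    }
}
```

With the lock on `this`, the caller stays blocked for as long as the client holds the lock; with the private mutex, the client's lock is unrelated and the call goes through.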
15 Four-A Technologies (Adaptation, Analysis, Annotation, Accounting)
- Semantics-based program manipulation
- Source-code and design level
- Structural manipulations
- Run-time manipulations
- Meta-manipulations
- Analysis and models
- OO effects, mutability, uniqueness, aliasing, uses, . . .
- Annotation and specification
- Mechanical properties
- Tools for assured adaptation of Java components
- Information loss and chain of evidence
- Use of audit data
16 1. Systematic Software Adaptation
- Routine software structural evolution
- Examples
- API change
- Data representation change
- Class hierarchy restructuring
- Signature change
- Introduce self-adaptation
- Mobility
- Encapsulation
- Split into phases / stages
- Cloning to produce specialized variants
- Merging of related functions
- Replication for robustness
- Threading changes
- Provide tool support for these operations
- With predictable impact on functional and
mechanical program properties
17 Assured Software Change
- Structural change in practice
- Costly
- Changes can be distributed throughout a system.
- Complex analysis (program understanding) is required.
- Risky
- Invariants and specifications are not present.
- Avoided
- Why do we tolerate brittleness?
- Code rot: persistence of abstractions beyond their time.
- Why do commercial APIs accrete?
- Necessary
- Structural change enables functional change
- Localize/encapsulate related software elements
- Structural change enables code management
- Cf. Aspect-Oriented Prog., Subject-Oriented Prog., N-Dim.
- Navigate structural trade-offs during design/evolution
- Support iterative software processes
18 Simple Example: Move Field
- Checks
- C is a descendant of A.
- If A is an interface, f must be public static final.
- Shadowing: A and B have no use of ancestral f.
- Unshadowing: No f field in B (capture C's f uses).
- Move field f from class C to class A. Programmers can do this using drag-and-drop.
[Diagram: classes A, B, and C, with labels "f" and "foo bar f"]
19 Simple Example: Move Field
- Checks
- C is a descendant of A.
- If A is an interface, f must be public static final.
- Shadowing: A and B have no use of ancestral f.
- Unshadowing: No f field in B (capture C's f uses).
- D (and other sibs) have no uses of f.
- Initializer code can be reordered, by field type.
- Reordering is acceptable for interleaved constructor and field code.
- Actions
- Adjust access tags
- Handle special cases
- Caveats
- Visibility in D and other sibs
- Visibility in C's subs
- Promises introduced
- Changes in binary compatibility
- Move field f from class C to class A. Programmers can do this using drag-and-drop.
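A few of the structural checks listed above can be approximated with plain Java reflection. This is a minimal sketch, assuming a hypothetical helper class MoveFieldChecks and a toy hierarchy; it is not the Four-A tool's analysis. Reflection sees only declarations, so the checks that depend on uses of f (in A, B, D, and other siblings) and on initializer ordering are out of its reach and are not attempted here.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

class MoveFieldChecks {
    // Toy hierarchy, named after the slide: C extends B extends A.
    static class A { }
    static class B extends A { }
    static class C extends B { int f; }
    // A variant in which the intermediate class declares its own f.
    static class ShadowingB extends A { int f; }
    static class ShadowedC extends ShadowingB { int f; }

    /** Returns the field named `name` declared directly in `cls`, or null. */
    static Field declared(Class<?> cls, String name) {
        for (Field f : cls.getDeclaredFields()) {
            if (f.getName().equals(name)) return f;
        }
        return null;
    }

    /** Declaration-level preconditions for moving `name` from `from` up to `to`. */
    static boolean canMoveUp(Class<?> from, Class<?> to, String name) {
        Field field = declared(from, name);
        if (field == null) return false;
        // C must be a proper descendant of A.
        if (from == to || !to.isAssignableFrom(from)) return false;
        if (to.isInterface()) {
            // Interface fields are public static final.
            int m = field.getModifiers();
            return Modifier.isPublic(m) && Modifier.isStatic(m) && Modifier.isFinal(m);
        }
        // No proper ancestor of C (B, A, or anything above A) may declare f.
        for (Class<?> k = from.getSuperclass(); k != null; k = k.getSuperclass()) {
            if (declared(k, name) != null) return false;
        }
        return true;
    }
}
```

For example, `MoveFieldChecks.canMoveUp(MoveFieldChecks.C.class, MoveFieldChecks.A.class, "f")` passes these checks, while the ShadowedC variant is rejected because ShadowingB declares its own f.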
20 Structural Manipulation Techniques
- Boundary movement (ISAW98)
- Frequency change
- Data representation change (ESOP98)
- Hierarchy restructuring
- Staging, specialization, splitting
- Thread management
- Self-adaptation
- Integrity
21 Evolving Multithreaded Code
- Why
- Improve code safety and robustness
- Improve performance and flexibility
- How
- Annotations
- Locks associated with regions (generalization of fields)
- Assignment of locks to (final) fields or instance variables
- Lock ordering
- Manipulations
- Shrink lock
- Split lock, merge locks
- Move boundaries
- Reorder locks
- Etc.
- Analyses
- Tool support
- The Generative Approach (new)
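As an illustration of the "split lock" manipulation, the sketch below shows a before and after pair. The class names (CoarseStats, SplitStats, SplitLockDemo) are hypothetical, and the split is only sound under the stated assumption that no invariant relates the two counters, that is, that they form two independent regions.

```java
/** Before: one lock guards two unrelated counters. */
class CoarseStats {
    private final Object lock = new Object();
    private long hits;
    private long misses;

    void hit()    { synchronized (lock) { hits++; } }
    void miss()   { synchronized (lock) { misses++; } }
    long hits()   { synchronized (lock) { return hits; } }
    long misses() { synchronized (lock) { return misses; } }
}

/** After the split: each counter is its own region with its own lock. */
class SplitStats {
    private final Object hitLock = new Object();
    private final Object missLock = new Object();
    private long hits;      // guarded by hitLock
    private long misses;    // guarded by missLock

    void hit()    { synchronized (hitLock) { hits++; } }
    void miss()   { synchronized (missLock) { misses++; } }
    long hits()   { synchronized (hitLock) { return hits; } }
    long misses() { synchronized (missLock) { return misses; } }
}

class SplitLockDemo {
    /** Runs n hits and n misses on two threads and returns {hits, misses}. */
    static long[] run(int n) {
        SplitStats stats = new SplitStats();
        Thread a = new Thread(() -> { for (int i = 0; i < n; i++) stats.hit(); });
        Thread b = new Thread(() -> { for (int i = 0; i < n; i++) stats.miss(); });
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new long[] { stats.hits(), stats.misses() };
    }
}
```

After the split, threads that only record hits no longer contend with threads that only record misses. A method that needed both counters at once would have to take both locks, which is where a declared lock ordering becomes relevant.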
22 Four Concurrency Policies for EventQueue
Four versions of the class, labeled A, B, C, and D on the slide:
    // maximizing getSize()
    public class EventQueue {
      // . . .
      public int getSize() {
        synchronized( gsLock ) {
          synchronized( dQLock ) {
            int s1, s2;
            synchronized( high ) { s1 = numHigh; }
            synchronized( normal ) { s2 = numNormal; }
            return s1 + s2;
          }
        }
      }
      public void enqueuePriority( Object e ) {
        synchronized( gsLock ) {
          synchronized( dQLock ) {
            synchronized( high ) {
              high.add( e );
              numHigh += 1;
            }
          }
        }
      }
      private Object dequeuePriority() {
        Object e = null;
        synchronized( high ) {
          if( numHigh != 0 ) {
            e = high.remove( 0 );
            numHigh -= 1;
          }
        }
        return e;
      }
      public void dispatchEvent() {
        Object e = null;
        synchronized( fifoLock ) {
          synchronized( dQLock ) { e = dequeue(); }
          if( e != null ) fireEQEvent( e );
        }
      }
    }
    // minimizing getSize()
    public class EventQueue {
      // . . .
      public int getSize() {
        synchronized( gsLock ) {
          synchronized( dQLock ) {
            int s1, s2;
            synchronized( high ) { s1 = numHigh; }
            synchronized( normal ) { s2 = numNormal; }
            return s1 + s2;
          }
        }
      }
      public void enqueuePriority( Object e ) {
        synchronized( dQLock ) {
          synchronized( high ) {
            high.add( e );
            numHigh += 1;
          }
        }
      }
      private EQEvent dequeuePriority() {
        Object e = null;
        synchronized( gsLock ) {
          synchronized( high ) {
            if( numHigh != 0 ) {
              e = high.remove( 0 );
              numHigh -= 1;
            }
          }
        }
        return e;
      }
      public void dispatchEvent() {
        Object e = null;
        synchronized( fifoLock ) {
          synchronized( dQLock ) { e = dequeue(); }
          if( e != null ) fireEQEvent( e );
        }
      }
    }
    // exact getSize()
    public class EventQueue {
      // . . .
      public int getSize() {
        synchronized( gsLock ) {
          synchronized( dQLock ) {
            int s1, s2;
            synchronized( high ) { s1 = numHigh; }
            synchronized( normal ) { s2 = numNormal; }
            return s1 + s2;
          }
        }
      }
      public void enqueuePriority( Object e ) {
        synchronized( gsLock ) {
          synchronized( dQLock ) {
            synchronized( high ) {
              high.add( e );
              numHigh += 1;
            }
          }
        }
      }
      private EQEvent dequeuePriority() {
        Object e = null;
        synchronized( gsLock ) {
          synchronized( high ) {
            if( numHigh != 0 ) {
              e = high.remove( 0 );
              numHigh -= 1;
            }
          }
        }
        return e;
      }
      public void dispatchEvent() {
        Object e = null;
        synchronized( fifoLock ) {
          synchronized( dQLock ) { e = dequeue(); }
          if( e != null ) fireEQEvent( e );
        }
      }
    }
    // getSize() w/no guarantees
    public class EventQueue {
      // . . .
      public int getSize() {
        synchronized( dQLock ) {
          int s1, s2;
          synchronized( high ) { s1 = numHigh; }
          synchronized( normal ) { s2 = numNormal; }
          return s1 + s2;
        }
      }
      public void enqueuePriority( Object e ) {
        synchronized( dQLock ) {
          synchronized( high ) {
            high.add( e );
            numHigh += 1;
          }
        }
      }
      private EQEvent dequeuePriority() {
        Object e = null;
        synchronized( high ) {
          if( numHigh != 0 ) {
            e = high.remove( 0 );
            numHigh -= 1;
          }
        }
        return e;
      }
      public void dispatchEvent() {
        Object e = null;
        synchronized( fifoLock ) {
          synchronized( dQLock ) { e = dequeue(); }
          if( e != null ) fireEQEvent( e );
        }
      }
    }
23 Four Concurrency Policies for EventQueue
    // maximizing getSize()
    public class EventQueue {
      // . . .
      public int getSize() {
        int s1, s2;
        s1 = numHigh;
        s2 = numNormal;
        return s1 + s2;
      }
      public void enqueuePriority( Object e ) {
        high.add( e );
        numHigh += 1;
      }
      private Object dequeuePriority() {
        Object e = null;
        if( numHigh != 0 ) {
          e = high.remove( 0 );
          numHigh -= 1;
        }
        return e;
      }
      public void dispatchEvent() {
        Object e = null;
        e = dequeue();
        if( e != null ) fireEQEvent( e );
      }
    }
[Diagram: arrows link each version to a policy]
Version: A, B, C, D
Policy:
- getSize is exact
- getSize gives upper bound
- getSize gives lower bound
- No guarantees about getSize
Exercise for the reader: Which arrows above are correct?
24 2. Analyses and 3. Annotations
- Analyses
- Typing
- Binding
- Effects
- Unique
- Reaching defs
- Predicate
- Capability
- Concurrency
- Annotations
- Effects (read/write, regions)
- Exceptions
- Conditions (pre/post, exception)
- Locks (ordering, regions, used, context)
- Uses
- Method attributes (idem, pure)
- Pointer capability (anonymous, clonable, castable
(down), mutable, borrowed, unique/excluded)
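To make the idea of annotation-based promises concrete, here is a minimal sketch using Java annotation types, a language feature that postdates this talk. The annotation names (@Pure, @Writes, @Unique) and the classes Cells and Promises are hypothetical illustrations, not the project's notation; nothing here checks that the promises are actually kept.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

/** Hypothetical method attribute: the method has no write effects. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Pure { }

/** Hypothetical effects annotation: names of the regions a method may write. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Writes { String[] value(); }

/** Hypothetical pointer capability: the reference is the only one to its object. */
@Retention(RetentionPolicy.RUNTIME)
@Target({ ElementType.FIELD, ElementType.PARAMETER })
@interface Unique { }

class Cells {
    @Unique private final int[] cells = new int[4];

    @Pure int get(int i) { return cells[i]; }

    @Writes("cells") void bump(int i) { cells[i]++; }
}

/** Reads the promises back, as a tool consuming the annotations might. */
class Promises {
    static Method find(Class<?> c, String name) {
        for (Method m : c.getDeclaredMethods()) {
            if (m.getName().equals(name)) return m;
        }
        throw new IllegalArgumentException("no method " + name);
    }

    static boolean isPure(Class<?> c, String name) {
        return find(c, name).isAnnotationPresent(Pure.class);
    }

    static String[] writtenRegions(Class<?> c, String name) {
        Writes w = find(c, name).getAnnotation(Writes.class);
        return w == null ? new String[0] : w.value();
    }
}
```

A client-side analysis can then consult the promise on a method instead of inspecting its body, which is what lets the properties compose across component boundaries.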
25 Specifications for mechanical properties
- Manipulations require analyses
- Example
- Manipulation: Reorder code
- Analyses: Effects, aliasing (may-equal and uniqueness), uses.
- Cf. compiler analyses, software engineering analyses.
- Analyses are goal-directed
- Avoid whole-program analysis
- Limited access. Scale.
- At scale
- Development
- Distributed/collaborative.
- Functional specifications lacking.
- Programs
- OO, dynamically linked.
- Potentially adaptive, mobile
- Analyses require mechanical specifications
- Promises about components and their elements
(ICSE98)
26 Key Ideas: OO Effects (ECOOP99)
- Source-level analysis of partial programs
- Do not want, and may not have, the whole program
- Use annotations on methods as surrogates for components
- Use of regions and aliases to analyze OO programs
- Encapsulate state of objects in regions to protect programmer abstractions
- Use aliasing information to improve results
- Identify potential aliases: may-alias information
- Control creation of aliases: unique information.
- Programmer-guided source-level manipulation
- Goal-directed analysis (vs. compile-time opportunistic analysis)
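The way declared effects can justify a reordering is sketched below. This is a deliberately simplified model, not the ECOOP99 effects system: regions are flat names with no hierarchy, and two different names are assumed never to alias. The classes Effect and ReorderCheck are hypothetical.

```java
import java.util.List;

/** A read or write effect on a named region (flat names, no aliasing). */
final class Effect {
    enum Kind { READ, WRITE }

    final Kind kind;
    final String region;

    private Effect(Kind kind, String region) {
        this.kind = kind;
        this.region = region;
    }

    static Effect reads(String region)  { return new Effect(Kind.READ, region); }
    static Effect writes(String region) { return new Effect(Kind.WRITE, region); }

    /** Two effects interfere if they touch the same region and one is a write. */
    boolean interferesWith(Effect other) {
        return region.equals(other.region)
            && (kind == Kind.WRITE || other.kind == Kind.WRITE);
    }
}

class ReorderCheck {
    /** Two statements may be swapped if no pair of their effects interferes. */
    static boolean mayReorder(List<Effect> first, List<Effect> second) {
        for (Effect a : first) {
            for (Effect b : second) {
                if (a.interferesWith(b)) return false;
            }
        }
        return true;
    }
}
```

In this model two reads of the same region commute, while a write and a read of the same region do not. Without uniqueness or may-alias information, an analysis would have to assume that two differently named regions might overlap, which is why the aliasing results matter for the quality of the answer.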
27 Code safety and Unique Variables
- The value of uniqueness
- Sole access to an object entails certain privileges
- Mutations can be performed without regard to rest of program (no other read access)
- Invariants can be maintained without regard to rest of program (no other write access)
- Program invariants are ideally
- Explicit (code readability)
- Checked (code maintainability)
- Uniqueness examples
- Input stream of a lexer
- If unique
- Lexer can buffer input without interference.
- String buffer's character array
- If unique
- Can be coerced to immutable when final string is desired.
- Vector's array
- If unique
- Mutations of separate vectors can be reordered.
- Hashtable's array
- If unique
- One can enforce hashing invariants,
- And can rehash without interference.
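The string-buffer case above can be sketched as a hand-off of a unique array. TextBuilder and FrozenText are hypothetical classes written for this illustration; uniqueness is maintained here only by convention (the array is never returned or stored elsewhere), whereas the point of a uniqueness annotation is to have that convention checked.

```java
import java.util.Arrays;

/** Immutable view over a character array that nothing else can reach. */
final class FrozenText {
    private final char[] chars;
    private final int length;

    FrozenText(char[] chars, int length) {
        this.chars = chars;
        this.length = length;
    }

    char charAt(int i) {
        if (i < 0 || i >= length) throw new IndexOutOfBoundsException("index " + i);
        return chars[i];
    }

    @Override public String toString() { return new String(chars, 0, length); }
}

/** Mutable builder whose array is unique: never returned, never aliased. */
final class TextBuilder {
    private char[] chars = new char[8];
    private int length;

    TextBuilder append(char c) {
        if (chars == null) throw new IllegalStateException("already frozen");
        if (length == chars.length) chars = Arrays.copyOf(chars, length * 2);
        chars[length++] = c;
        return this;
    }

    /** Hands the array to an immutable wrapper without copying it. */
    FrozenText freeze() {
        if (chars == null) throw new IllegalStateException("already frozen");
        FrozenText frozen = new FrozenText(chars, length);
        chars = null;   // give up the only other reference
        return frozen;
    }
}
```

Because the builder held the only reference, the array can change roles from mutable to immutable with no copy and no risk of later interference. If the array had ever been leaked, the frozen text could still be mutated behind its back.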
28 4. Four-A Tool Infrastructure
- The Fluid IR
- Ternary structure
- Versioning
- Simultaneity
- Immutability
- Typing
- Sequences
- Encapsulated composites
- Persistence
- Sharing windows
- Coordination
- Team and concurrency policy
- Notification
- Dependency management
- Truth maintenance
- Language support
- Parse
- Unparse and format
- Compositional CFG
- 50 lines to define def/use
- Pattern matching
- Manipulation support
- UI framework
- Stateful views
- Templates
- Indexed attributes
- Java support
- Analyses
- Annotations
- Manipulations
29 Four-A Schedule
- Year 1
- Tool infrastructure (99 Java, A, A, A, A)
- Analysis algorithms (uniqueness, effects, mayEqual, etc.)
- Demonstrate preservation of assurance properties through change
- Manipulations for threading
- Case studies for thread safety and pattern
- Year 2
- Class-level structural manipulations
- Management of uses information
- Exploitation of aliasing annotations to assure code safety props
- Threading annotations and analyses
- Design record to support assurance information
- Year 3
- Manipulation library for improvement of code safety
- Prevent -> Detect -> Tolerate
- Manipulation through analytic views
- Tool-based case study based on intrusion scenarios
30 Accomplishments (recent)
- Four-A tool prototype
- Supports non-local manipulations
- Model-view infrastructure
- Annotations and analyses
- Unique. MayEqual.
- Threading
- Manipulations (preliminary form)
- Annotations
- Policy model and generative approach
- Software engineering baseline
- Evolution census: JDK changes (source code, logs)
31 Transition
- Build on mainstream commercial technologies
- Java, beans, etc.
- Build on existing infrastructure
- Tool (developed by our team) for Java analysis, manipulation, engineering process, design information management.
- Platform (UI, IM, VM, syntax) is also usable for other languages.
- Usability/adoptability a priority from the outset
- Enable experimentation/studies without high adoption cost
- E.g., gesture-based interface where possible
- Conduct engineering baseline analyses
- What are the code-level vulnerabilities being exploited?
- What kinds of changes are routinely made in commercial APIs?
- What is the impact of those changes on code safety?