Bristlecone: A Language for Robust Software Systems

Brian Demsky and Alokika Dash, University of California, Irvine

Slides: 37
Learn more at: http://plrg.eecs.uci.edu
Transcript and Presenter's Notes


1
Bristlecone: A Language for Robust Software Systems
  • Brian Demsky
  • Alokika Dash
  • University of California, Irvine

2
Current Software is All or Nothing
  • Most current software either executes perfectly
    or fails completely
  • Small errors cause catastrophic failures
  • Violate fundamental developer assumptions
  • Violated assumptions prevent continued execution
  • No clean way to recover from errors
  • Unclear what parts of the program are affected
  • Failure may leave key data structures partially
    updated

3
Degraded Service Can Be Desirable
  • Consider a bug that affects an embedded
    application in a single web page
  • Current browsers often close all browser windows
    and exit
  • Users find this behavior frustrating
  • Better option is to isolate the failure and halt
    only the embedded application component

4
Motivating Example: Web Server
  • Request r = receiverequest()
  • logRequest(r)
  • processRequest(r)

5
Motivating Example: Web Server
  • Failure in log operation
  • Prevents serving this request
  • If logging failure is independent of request,
    potentially causes system to fail to serve any
    requests
  • Request r = receiverequest()
  • logRequest(r)
  • processRequest(r)

CRASH
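
The coupling on this slide can be sketched in Python. This is a minimal illustration, not the authors' code: `log_request`, `process_request`, and both `serve_*` functions are hypothetical stand-ins, and the logging failure is simulated by passing `None` as the log.

```python
# Hypothetical stand-ins for the slide's logRequest and processRequest.
def log_request(request, log):
    if log is None:                      # simulated logging failure
        raise IOError("log unavailable")
    log.append(request)

def process_request(request):
    return f"200 OK: {request}"

def serve_sequential(request, log):
    """Mirrors the slide: logging runs first, so its failure blocks the response."""
    log_request(request, log)
    return process_request(request)

def serve_decoupled(request, log):
    """Treats the two operations as independent: a logging failure is recorded,
    but the request is still processed."""
    failures = []
    try:
        log_request(request, log)
    except IOError as exc:
        failures.append(str(exc))
    return process_request(request), failures
```

The sequential version fails the whole request even though `process_request` never reads the log; the only dependence between the two calls is their order in the source.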
6
Real World Example
  • First Flight of the Ariane 5 Rocket
  • Uncaught integer overflow in the computation of
    horizontal bias
  • Overflow shut down the inertial reference system
  • Inertial reference sent debug information to the
    guidance system
  • Guidance system used these invalid values to set
    incorrect nozzle deflections
  • $120 million rocket crashed
  • Horizontal bias value is not even used!
  • Lesson: Critical system operations were coupled to
    non-critical operations

7
Observations about Recovery
  • Challenging to recover from failure with
    traditional program structures
  • Unclear what code was doing
  • Are data structures consistent?
  • What depends on the failed code? What is still
    safe to do?
  • Code structure introduces artificial dependences
  • In the absence of precise dependence information,
    we must assume the worst case
  • Failures can propagate through artificial
    dependences, so small errors can cause catastrophic
    failures

8
Where do we lose information?
  • Specifications describe functionality
    requirements
  • Architecture/implementation phases map
    requirements into sequence of operations
  • Mapping process loses information
  • Boundaries of operations (What is A and what is
    B?)
  • Temporal dependences (Does B require A?)
  • Data dependences (Does B use data produced by A?)
  • Lost information introduces artificial dependences

9
Designing for Robustness
  • Underlying assumption: All code contains bugs
  • Goal: Mitigate the consequences
  • Approach:
  • Decompose application into many small tasks
  • Specify dependences between these tasks
  • Data dependences
  • Control dependences
  • Use transactions to prevent failures from
    exposing partially updated data structures
  • Use dependence information to continue past
    failures

10
Bristlecone Language
  • Program is specified as a set of tasks
  • Task specifications describe task dependences
  • Tasks have transactional semantics
  • Runtime system reasons about dependences to
    execute past failures (automated recovery)

11
Web Server Example
  • Decoupled operations
  • Log Request and Send Page tasks are independent
  • Failure of one does not affect the other

Accept Connection
Read Request
Log Request
Send Page
12
Specifying Object States
  • Different object states support different
    functionality (Type State)
  • Use flag construct to label conceptual object
    states
  • Use these flags to determine when to perform
    operations
  • Can differentiate between operations that have
    true data dependences and operations that just
    operate on same objects
  • class WebRequest
  • Flag initialized
  • Flag send_page
  • Flag write_log
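
A rough Python analogue of the flag construct, under stated assumptions: the `WebRequest` class below is a hypothetical stand-in that stores the slide's three flags in a dictionary, and `ready_for` is an invented helper, not part of Bristlecone.

```python
class WebRequest:
    """Hypothetical Python stand-in for the slide's WebRequest class."""
    FLAGS = ("initialized", "send_page", "write_log")

    def __init__(self):
        # Each flag labels a conceptual object state; all start cleared.
        self.flags = {name: False for name in self.FLAGS}

    def set_flags(self, **updates):
        """Set or clear flags by name, rejecting names the class did not declare."""
        for name, value in updates.items():
            if name not in self.flags:
                raise KeyError(f"unknown flag: {name}")
            self.flags[name] = value

def ready_for(obj, flag):
    """An operation is enabled only while the object carries the given flag."""
    return obj.flags[flag]
```

Because `send_page` and `write_log` are separate flags, the two operations they enable touch the same object without either one depending on the other.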

13
Tagging Objects
  • Motivation: Consider the web server example
  • Each connection has
  • A Socket object that provides communication
  • A WebRequest object that stores application
    specific state
  • Need to pair the correct Socket and WebRequest
    objects together
  • Solution: Tag the group of objects

Connection Tag
Socket Object
WebRequest Object
14
Tagging Objects
  • Tags group object instances
  • Tags provide mechanism
  • Tags have types
  • Can create many instances of a tag type
  • Each instance defines a group
  • Can bind tag instances to objects
  • Tags can specify that task parameters must be in
    the same group
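
The grouping behaviour described above might look like this in Python. It is a sketch only: `Tag`, `bind`, and `same_group` are hypothetical names, and objects are plain dictionaries for brevity.

```python
import itertools

class Tag:
    """One tag instance defines one group; tag_type plays the role of the tag's type."""
    _ids = itertools.count()

    def __init__(self, tag_type):
        self.tag_type = tag_type
        self.id = next(Tag._ids)
        self.members = []

def bind(tag, obj):
    """Bind a tag instance to an object (recorded on both sides)."""
    tag.members.append(obj)
    obj.setdefault("tags", []).append(tag)

def same_group(a, b, tag_type):
    """True if a and b share at least one tag instance of the given type."""
    ids_a = {t.id for t in a.get("tags", []) if t.tag_type == tag_type}
    return any(t.id in ids_a
               for t in b.get("tags", []) if t.tag_type == tag_type)
```

Two `connection` tag instances keep two clients apart: a Socket bound to the first instance never pairs with a WebRequest bound to the second, even though both tags have the same type.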

15
Task Specifications
  • Describe data dependences of tasks
  • Describe effect of tasks on objects
  • /* This task reads a request from a client. */
  • task readRequest(WebRequest w in initialized with
    connection t, Socket s in IO_Pending with
    connection t)
  • ...
  • taskexit(w: initialized:=false,
    send_page:=true, write_log:=true)
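
To make the guard and exit clauses concrete, here is a Python sketch of how such a specification could be evaluated. Everything in it is an assumption for illustration: `Obj`, `Guard`, `satisfies`, `can_invoke`, and this `taskexit` helper are invented names, and each object is limited to one tag instance per tag type.

```python
from dataclasses import dataclass, field

@dataclass
class Obj:
    type_name: str
    flags: set = field(default_factory=set)    # names of flags currently true
    tags: dict = field(default_factory=dict)   # tag type -> tag instance id

@dataclass(frozen=True)
class Guard:
    type_name: str   # e.g. "WebRequest"
    flag: str        # flag that must be set, e.g. "initialized"
    tag_type: str    # tag type the object must be bound to, e.g. "connection"

def satisfies(obj, guard):
    return (obj.type_name == guard.type_name
            and guard.flag in obj.flags
            and guard.tag_type in obj.tags)

def can_invoke(w, s, w_guard, s_guard):
    """Both guards hold and both parameters are bound to the same tag instance."""
    return (satisfies(w, w_guard) and satisfies(s, s_guard)
            and w.tags[w_guard.tag_type] == s.tags[s_guard.tag_type])

def taskexit(obj, **changes):
    """Apply the flag changes listed at task exit."""
    for flag, value in changes.items():
        if value:
            obj.flags.add(flag)
        else:
            obj.flags.discard(flag)
```

After `taskexit` clears `initialized`, the same WebRequest no longer satisfies the `readRequest` guard, so the task cannot fire on it twice.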

16
Bristlecone Task Semantics
  • Runtime invokes tasks
  • Tasks can be invoked when objects are available
    in the heap that satisfy the task's parameter
    guards
  • Tasks have transactional memory semantics
  • Either all operations are executed or none are
  • Task execution appears to occur in a single
    instant
  • Failures cause transactions to abort and restore
    consistency
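
The all-or-nothing behaviour can be imitated in Python by running a task on a private copy of the state. This is a simplification for illustration, not how the Bristlecone runtime implements transactions; `run_transactionally` is a hypothetical helper.

```python
import copy

def run_transactionally(task, state):
    """Run task on a private copy of state; commit only if it finishes without raising.

    Returns (state_after, error). On failure the original state is returned
    unchanged, so no partially applied update is ever visible.
    """
    working = copy.deepcopy(state)
    try:
        task(working)
    except Exception as exc:
        return state, exc        # abort: discard the working copy
    return working, None         # commit: all updates become visible together
```

A task that increments a counter, appends to a list, and then crashes leaves both the counter and the list exactly as they were before it started.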

17
Failure-free Execution
Accept Connection
Read Request
Log Request
Send Page
21
Error Detection
  • Catching operating system signals
  • Arithmetic exceptions
  • Null pointer exceptions
  • Library signals
  • Socket errors
  • Runtime language checks
  • Array out of bounds exceptions
  • Assertions
  • Imperative consistency checks
  • Declarative data structure specifications

22
Failure Recovery
  • Transactions restore data structures to previous
    consistent state
  • Problem: Re-executing the same task will likely
    result in the same failure
  • Solution: Use task specifications to determine
    what other tasks can be safely executed
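
A toy scheduler shows the idea of continuing past a failed task. It is a sketch under simplifying assumptions: one object, one required flag per task, and a failed task is simply never retried. `run_tasks` and the task table format are hypothetical.

```python
def run_tasks(flags, tasks):
    """flags: set of flags currently true on one object.
    tasks: name -> (required_flag, fn), where fn returns (flags_to_clear, flags_to_set).
    A task that raises is not retried; tasks whose guards still hold keep running.
    """
    failed, completed = set(), []
    progress = True
    while progress:
        progress = False
        for name, (required, fn) in tasks.items():
            if name in failed or required not in flags:
                continue
            try:
                clear, add = fn()
            except Exception:
                failed.add(name)       # abort: flags are left unchanged
                continue
            flags = (flags - clear) | add
            completed.append(name)
            progress = True
    return completed, failed, flags
```

With a `logRequest` task that always raises, the web server example still runs `readRequest` and `sendPage`; only the `write_log` flag is left set at the end, which records that the logging work was never completed.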

23
Automatic Recovery
Accept Connection
Read Request
CRASH
logRequest
Send Page
24
Automatic Recovery
Accept Connection
Read Request
Log Request
Send Page
25
Automatic Recovery Summary
Accept Connection
Read Request
Log Request
Send Page
26
Language Benefits
  • Use specifications to understand failure in a
    meaningful way
  • Use task specifications to reason how to recover
    from failures
  • Task specifications eliminate artificial
    dependences

27
Task Dispatch
  • Goal: Determine which parameter objects satisfy
    task guards
  • Problem: Brute-force search can be expensive
  • Our approach maintains:
  • Parameter sets: the objects that satisfy an
    individual parameter's guard
  • Active task queue: sets of parameter objects
    that collectively satisfy all of a task's guards

28
Task Dispatch
  • Precisely maintain parameter sets
  • If an object is in a parameter set:
  • It satisfies the flag component of the guard
  • It is bound to the correct types of tags
  • All objects that satisfy a parameter's guard are
    in its parameter set
  • Active task queue is conservative
  • If a set of objects could potentially satisfy all
    of a task's guards, it is in the task queue
  • Must check that the set of objects in a task queue
    invocation satisfies the guards before invoking
    the task
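
The split between precise parameter sets and a conservative queue can be sketched for a two-parameter task. This is an illustrative Python model, not the runtime's data structures: `Obj`, `Dispatcher`, and its methods are hypothetical, guards are plain predicates, and the two guards are assumed never to accept the same object.

```python
from collections import deque

class Obj:
    def __init__(self, kind, flag):
        self.kind, self.flag = kind, flag

class Dispatcher:
    """Sketch for a two-parameter task; each guard is a predicate on one object."""

    def __init__(self, guard_a, guard_b):
        self.guards = (guard_a, guard_b)
        self.param_sets = ([], [])   # kept precise via add() and update()
        self.queue = deque()         # conservative: may hold stale candidates

    def add(self, obj):
        for i, guard in enumerate(self.guards):
            if guard(obj):
                self.param_sets[i].append(obj)
                # Pair the new object with everything in the other parameter set.
                for other in self.param_sets[1 - i]:
                    self.queue.append((obj, other) if i == 0 else (other, obj))

    def update(self, obj):
        """Call after obj's flags change; drop it from sets whose guard no longer holds."""
        for i, guard in enumerate(self.guards):
            if obj in self.param_sets[i] and not guard(obj):
                self.param_sets[i].remove(obj)

    def next_invocation(self):
        while self.queue:
            a, b = self.queue.popleft()
            if self.guards[0](a) and self.guards[1](b):   # re-check before invoking
                return a, b
        return None
```

Stale queue entries are never hunted down when an object changes state; they are filtered out by the re-check in `next_invocation`, which is what makes a conservative queue safe.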

29
Task Dispatch
  • When a new object is added to a parameter set,
    create the corresponding task queue invocations
  • Search for objects that satisfy tag guards
  • Idea: Use tags to prune the search
  • When we add an object with a tag guard to the
    set, use tags to prune the search of other parameter
    objects that must be bound to the same tag
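
The effect of pruning by tag can be seen by counting how many objects each search strategy examines. The sketch below is illustrative only: `Tag`, `Obj`, `search_all`, and `search_by_tag` are hypothetical, and each object is bound to exactly one tag.

```python
class Tag:
    def __init__(self):
        self.objects = []            # objects bound to this tag instance

class Obj:
    def __init__(self, name, tag):
        self.name, self.tag = name, tag
        tag.objects.append(self)

def search_all(new_obj, other_set):
    """Baseline: examine every object in the other parameter set."""
    matches, examined = [], 0
    for o in other_set:
        examined += 1
        if o.tag is new_obj.tag:
            matches.append(o)
    return matches, examined

def search_by_tag(new_obj, other_set):
    """Pruned: examine only the objects bound to new_obj's tag."""
    matches, examined = [], 0
    for o in new_obj.tag.objects:
        examined += 1
        if o is not new_obj and o in other_set:
            matches.append(o)
    return matches, examined
```

With 51 sockets in the parameter set and only one bound to the new request's tag, the baseline examines all 51 while the tag-directed search examines just the two objects in the group.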

30
Task Binding Iteration
  • Structure computation as a list of iterators over
    tags and objects
  • Multiple types of iterators
  • Over tags bound to object
  • Over objects bound to tag
  • Over objects in parameter set
  • Want to prune the search early, so ordering is
    important
  • Statically generate iterator orderings for each
    parameter set of each task

31
Initial Experiences
  • Implemented Bristlecone compiler and runtime
  • Have evaluated system on several benchmarks
    including
  • Web Server
  • Web Spider
  • Chat Server
  • Developed Bristlecone and Java versions of each
  • Java versions were designed to use threads to
    provide resilience to failures
  • Randomly injected failures into executions

32
Web Spider
  • Workload is a set of 100 web pages
  • Java version implemented using a thread pool
    architecture
  • 100 trials on each version
  • Randomly injected 3 halting failures into each
    execution
  • With injected failures
  • Java version fetched average of 6 pages
  • Bristlecone version fetched average of 91 pages

33
Web Server
  • Web Server with support for e-commerce
    transactions
  • Java version spawns a thread for each connection
  • 200 trials on each version
  • Randomly injected 50 halting failures into each
    execution
  • With injected failures
  • Java failed to serve inventory requests in 4.5%
    of trials, Bristlecone failed in 1.5%
  • Java had correct inventory responses in 68.6%,
    Bristlecone in 100%

34
Chat Server
  • Chat server allows multiple users to chat
  • Java version spawns a thread for each connection
  • 100 trials on each version
  • Workload sent 800 messages
  • Randomly injected 10 failures into each execution
  • With injected failures
  • Java version failed to serve 39.9% of messages
  • Bristlecone version failed to serve 19.3% of
    messages

35
Related Work
  • Traditional fault tolerance
  • N-version programming
  • Recovery blocks
  • Exception handlers
  • Languages
  • Linda / Tuple spaces
  • Orc
  • Actors
  • Argus
  • Oz
  • Erlang
  • Software and Hardware Transactional Memory

36
Conclusions
  • Bristlecone is an exciting approach to improve
    application reliability
  • Initial experiences are promising