Title: Towards a Bell-Curve Calculus and its Application to e-Science
1 Towards a Bell-Curve Calculus and its Application to e-Science
- Lin Yang
- Supervised by Alan Bundy, Dave Berry, Sophie Huczynska and Conrad Hughes
2 Content
- Background
- Workflow
- QoS properties
- Interval arithmetic
- Experimental environment
- Bell-Curve calculus
- Importance
- Definition
- Methodology
- Discussion
3 Background (1) -- workflow
- What is workflow?
- Web services
- The orchestration of web services
- An automation of a web process
- Pass documents, information or data from one web service to another for action
- Grid service: a web service implementing Grid functionality
4 Background (2) -- workflow
- Ticket booking system
- Four services (generally sequential, partially parallel)
[Figure: ticket booking workflow. Service labels: Query, Check_available1, Check_available2, Deal_made. Data labels: Query information, Ticket information, Booking information1, Booking information2, Deal information.]
5 Background (3) -- quality of service properties
- Why QoS properties?
- Describe/evaluate the quality of a Grid/web service
- Which QoS properties?
- Run time, reliability and accuracy
6 Background (4) -- interval arithmetic
- Error bound: an interval that represents the possible values of the result
- e.g. 42 with error bound [41, 43]
- Propagation: an extension of numerical analysis
- e.g. for a unary, monotonically increasing function f
- f([x, y]) = [f(x), f(y)]
- A worst-case analysis: the biggest accumulated error
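The propagation rule above can be sketched in a few lines of Python. This is only an illustration: the `Interval` class and the `apply_increasing` helper are hypothetical names introduced here, not part of Agrajag or of the original slides.

```python
from dataclasses import dataclass
import math


@dataclass(frozen=True)
class Interval:
    """Hypothetical error-bound type: all possible values lie in [lo, hi]."""
    lo: float
    hi: float

    def __add__(self, other):
        # Worst case: lower bounds add up and upper bounds add up.
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def width(self):
        return self.hi - self.lo


def apply_increasing(f, iv):
    """Propagate an interval through a unary, monotonically increasing f."""
    return Interval(f(iv.lo), f(iv.hi))


x = Interval(41, 43)                  # the value 42 with error bound [41, 43]
y = apply_increasing(math.sqrt, x)    # [sqrt(41), sqrt(43)]
total = x + x + x                     # the error accumulates with every step
print(y, total, total.width())
```

The width of `total` is three times the width of `x`, which is the worst-case accumulation that motivates an average-case alternative.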
7 Background (5) -- experimental environment
- Agrajag
- Developed by Conrad Hughes for the Dependability Infrastructure for Grid Services (DIGS) project
- Defines classic distribution functions, operations and numeric approximation of function combinations
- http://sourceforge.net/projects/digs
8 Bell-Curve calculus (1) -- importance
- Why Bell-Curve?
- An average-case analysis: likely or unlikely
- Bell-Curve = Normal Distribution
- Easy to store and propagate
- To deal with complex workflows efficiently
- Commonly occurs in the real world
9 Bell-Curve calculus (2) -- importance
- Evidence
- Experimental evidence from DIGS
- A possible approximation to the probabilistic behaviour of run time, accuracy and reliability (mean time to failure)
- Central Limit Theorem
- The distribution of an average tends to be Normal, even when the distribution from which the average is computed is decidedly non-Normal.
- May extend the calculus to more complicated curves in due course
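The Central Limit Theorem claim can be checked with a small simulation. The sketch below is an assumption-laden illustration, not the DIGS experiment: it takes an exponential distribution as the "decidedly non-Normal" source, averages 30 draws at a time, and uses sample skewness (0 for a Normal curve) as a rough symmetry measure.

```python
import random
import statistics


def skewness(xs):
    """Sample skewness: about 0 for a symmetric, bell-shaped sample."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return sum(((x - m) / s) ** 3 for x in xs) / len(xs)


rng = random.Random(0)  # fixed seed so the run is repeatable

# Raw draws from a strongly skewed (exponential) distribution.
raw = [rng.expovariate(1.0) for _ in range(20000)]

# Averages of 30 draws each, from the same distribution.
means = [
    statistics.fmean([rng.expovariate(1.0) for _ in range(30)])
    for _ in range(5000)
]

print("skewness of raw draws:", skewness(raw))
print("skewness of averages: ", skewness(means))
```

The averages come out much closer to symmetric than the raw draws, which is the behaviour the theorem predicts.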
10 Bell-Curve calculus (3) -- definition
- Normal Distribution (Bell-curve)
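For reference, the standard density of the Normal distribution is given below. The symbols μ (mean) and σ (standard deviation) are the usual textbook notation and are assumed here; the slide's own notation did not survive.

```latex
f(x;\mu,\sigma) = \frac{1}{\sigma\sqrt{2\pi}}
  \exp\!\left(-\frac{(x-\mu)^{2}}{2\sigma^{2}}\right)
```

A bell curve is therefore fully described by the two numbers μ and σ, which is what makes it easy to store and propagate.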
11 Bell-Curve calculus (4) -- definition
- Three QoS properties
- Run time, accuracy and reliability
- Four ways of combining Grid services
- Sequential
- Parallel_All
- Parallel_First
- Conditional
- So 3 x 4 = 12 fundamental combinations
12 Bell-Curve calculus (5) -- combination methods
- Sequential
- Parallel_All
- Parallel_First
- Conditional
13 Bell-Curve calculus (6) -- basic combination functions
- 12 bell-curve simple situations
              Seq    Para_All   Para_Fir   Cond
run time      sum    max        min        cond1
accuracy      mult   combine1   varies?    cond2
reliability   mult   combine2   varies?    cond3
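As a rough illustration of how the table could be read, the sketch below encodes only the cells that name a concrete operation (sum, max, min, mult) and applies them to plain numbers rather than to bell curves. The `COMBINE` dictionary and the `combine` helper are hypothetical names; the cells combine1, combine2, cond1 to cond3 and "varies?" are deliberately left out because the table does not define them.

```python
import math

# Point-value reading of the table: each entry maps
# (QoS property, combination method) to an operation on a list of values.
# Cells whose operation is not defined in the table are omitted.
COMBINE = {
    ("run time", "Seq"): sum,          # services run one after another
    ("run time", "Para_All"): max,     # wait for every parallel branch
    ("run time", "Para_Fir"): min,     # take the first branch to finish
    ("accuracy", "Seq"): math.prod,
    ("reliability", "Seq"): math.prod,
}


def combine(prop, how, values):
    """Combine the QoS values of several services (hypothetical helper)."""
    return COMBINE[(prop, how)](values)


print(combine("run time", "Seq", [2.0, 3.0]))
print(combine("run time", "Para_All", [2.0, 3.0]))
print(combine("reliability", "Seq", [0.9, 0.5]))
```

The calculus itself has to lift each of these operations from single numbers to bell curves, which is where the mean and standard deviation of the result come in.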
14 Bell-Curve calculus (7) -- proposed work
- Our proposed work
- For each of the 12 functions, find functions for μ and σ in terms of μ1, σ1, μ2 and σ2
- Induce the 24 functions
- By experiment using Agrajag
- Find other suitable calculi to describe the combination functions
15 Bell-Curve calculus (8) -- sum
16 Bell-Curve calculus (9) -- max
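Unlike the sum, the maximum of two Normal run times is not itself Normal, so any bell curve for it is an approximation. One known way to pick that bell curve, shown here purely as an illustration and not as the method used in this work, is moment matching with the classical formulas for the maximum of two independent Normals (usually attributed to Clark). The function name `max_bell_curve` and the example parameters are assumptions.

```python
import math
import random
from statistics import NormalDist, fmean, pstdev


def max_bell_curve(mu1, sigma1, mu2, sigma2):
    """Mean and std. dev. of max(X1, X2) for independent Normal X1, X2."""
    std = NormalDist()                      # standard Normal: pdf and cdf
    theta = math.hypot(sigma1, sigma2)      # sqrt(sigma1^2 + sigma2^2)
    alpha = (mu1 - mu2) / theta
    p, q, d = std.cdf(alpha), std.cdf(-alpha), std.pdf(alpha)
    first = mu1 * p + mu2 * q + theta * d   # E[max]
    second = ((mu1 ** 2 + sigma1 ** 2) * p  # E[max^2]
              + (mu2 ** 2 + sigma2 ** 2) * q
              + (mu1 + mu2) * theta * d)
    return first, math.sqrt(second - first ** 2)


# Compare against a direct simulation of Parallel_All run time.
rng = random.Random(1)
samples = [max(rng.gauss(10, 2), rng.gauss(11, 3)) for _ in range(50000)]
mu, sigma = max_bell_curve(10, 2, 11, 3)
print("formula:   ", mu, sigma)
print("simulation:", fmean(samples), pstdev(samples))
```

The formula reproduces the first two moments exactly; how well the resulting bell curve matches the shape of the true curve is a separate question that has to be measured.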
17 Bell-Curve calculus (10) -- methodology
- N(μ, σ) is the bell-curve approximation of the combination curve
- experimental tasks
- find functions to calculate μ and σ
- e.g. for sequential/run time (independent run times): μ = μ1 + μ2, σ = √(σ1² + σ2²)
- experiment with functions for μ and σ
- determine ranges of acceptable error
- plot 3D graphs of the error against the input parameters
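One way the experimental loop might look is sketched below in Python. It does not use Agrajag. It assumes independent Normal run times, uses the standard result that means and variances add for a sum, and picks the largest gap between the simulated and the bell-curve cumulative distributions as the error measure. That error measure, along with the names `sequential_run_time` and `cdf_error`, is an assumption made for illustration only.

```python
import math
import random
from statistics import NormalDist


def sequential_run_time(mu1, sigma1, mu2, sigma2):
    """Candidate functions for sequential run time: means and variances add."""
    return mu1 + mu2, math.hypot(sigma1, sigma2)


def cdf_error(samples, mu, sigma):
    """Largest gap between the empirical CDF and the bell-curve CDF."""
    bell = NormalDist(mu, sigma)
    xs = sorted(samples)
    n = len(xs)
    worst = 0.0
    for i, x in enumerate(xs):
        c = bell.cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        worst = max(worst, abs((i + 1) / n - c), abs(i / n - c))
    return worst


rng = random.Random(2)
mu, sigma = sequential_run_time(10.0, 2.0, 11.0, 3.0)

# The "combination curve", obtained here by simulating the two services.
samples = [rng.gauss(10.0, 2.0) + rng.gauss(11.0, 3.0) for _ in range(20000)]

err = cdf_error(samples, mu, sigma)
print(mu, sigma, err)
```

Repeating the last step over a grid of input parameters would give the data for an error surface, and the same `cdf_error` check could be applied to candidate functions for the other combinations.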
18 Discussion (1)
- A better representation of the probabilistic behaviour of QoS properties?
- e.g. log-normal calculus
- More QoS properties?
- e.g. failure detection time
[Figure: failure detection timeline. Labels: run, down, service, failure detection system, suspect, confirm, time, failure detection time.]
19 Discussion (2)
- f.d.t. (failure detection time): an instantiation of run time
              Seq    Para_All   Para_Fir   Cond
f.d.t.        sum    max        min        cond4
- More combination situations?
- e.g. voting
[Figure label: voting service]
20 The end