Title: Workload Generation for Pub/Sub System
Agenda
- Introduction
- Related Works
- Approach
- Experiences
- Conclusion and Future Work
Motivation
- Publish-Subscribe Infrastructure
A Benchmark Suite
- A benchmark suite composed of three sections
  - Interface suitability
  - Applications
  - Synthetic scenarios
- Role of a synthetic workload generator
  - Generate inputs to the system/simulator
Two Approaches
- Trace-based approach
  - Starts with an empirical trace
  - Subsamples or permutes the ordering of the requests to generate a new workload that differs in some respect from the original
- Analytic approach
  - Uses mathematical models for the workload characteristics of interest
  - Uses random number generation to produce workloads that statistically conform to these models
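The contrast between the two approaches can be sketched in a few lines of Python. This is only an illustration: the function names, the exponential inter-arrival model and the uniform object popularity are assumptions made for the example, not models taken from the systems surveyed here.

```python
import random

def trace_based(trace, keep_fraction, rng):
    """Subsample an empirical trace, then permute the order of the kept requests."""
    kept = [req for req in trace if rng.random() < keep_fraction]
    rng.shuffle(kept)
    return kept

def analytic(n_requests, mean_gap, n_objects, rng):
    """Draw request times and object ids from simple mathematical models."""
    t = 0.0
    workload = []
    for _ in range(n_requests):
        t += rng.expovariate(1.0 / mean_gap)  # assumed model: exponential gaps
        obj = rng.randrange(n_objects)        # assumed model: uniform popularity
        workload.append((round(t, 3), obj))
    return workload

if __name__ == "__main__":
    rng = random.Random(42)
    empirical = ["GET /a", "GET /b", "GET /a", "GET /c", "GET /b"]
    print(trace_based(empirical, 0.6, rng))
    print(analytic(3, 2.0, 10, rng))
```

The trace-based function can only reorder or drop requests that were actually observed, while the analytic function can produce any number of requests once the model parameters are chosen.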
Analytic approach
- Two types
  - Resource-oriented modeling approach
    - Captures the characteristics of the workload itself
  - User behavior modeling approach
    - Captures the characteristics of the user behavior
    - Generally hierarchical: a sequence of user interactions at a higher level results in a stream of requests at a lower level
Resource-oriented modeling approach
- ProWGen (Proxy Workload Generator)
  - Analytic approach
  - Parameters provide control over five key workload characteristics
  - The resource-oriented modeling approach does not model individual client behaviors; rather, it models the aggregate workload as generated by many clients
User behavior modeling approach
- SURGE (Scalable URL Reference Generator)
  - UE (User Equivalent): a single ON/OFF process
  - Probability distributions are used for each UE
- BISANTE (Broadband Integrated Satellite Network Traffic Evaluations)
  - User profiles: a hierarchy of independent stochastic processes modeled by FSMs to represent user behavior
- S-client
  - Uses a single process and the select system call to manage a large number of concurrent active connections to the server
- Waspclient
  - Adapted from SURGE, excluding the user think time
User behavior modeling approach (Cont.)
- SynRGen (Synthetic file Reference Generator)
  - User behavior modeling approach
  - Volume: a subtree of files and directories exhibiting a unique combination of physical characteristics
    - Preprocessed into a C data structure accessed by users
  - User classes: a stochastic finite state machine
    - Configuration files describing user behavior
    - Preprocessed into a C program representing a synthetic user
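A user class modeled as a stochastic finite state machine can be sketched as a transition table plus a random walk over it. The states and probabilities below are hypothetical and chosen only for illustration; they are not taken from SynRGen.

```python
import random

# Hypothetical user class: state -> list of (next_state, probability).
USER_CLASS = {
    "idle":  [("open", 0.7), ("idle", 0.3)],
    "open":  [("read", 0.6), ("write", 0.3), ("close", 0.1)],
    "read":  [("read", 0.5), ("close", 0.5)],
    "write": [("close", 1.0)],
    "close": [("idle", 1.0)],
}

def run_user(fsm, start, steps, rng):
    """Walk the state machine for a fixed number of steps and return the visited states."""
    state = start
    visited = []
    for _ in range(steps):
        visited.append(state)
        next_states, weights = zip(*fsm[state])
        state = rng.choices(next_states, weights=weights, k=1)[0]
    return visited

if __name__ == "__main__":
    print(run_user(USER_CLASS, "idle", 10, random.Random(3)))
```

Each visited state would correspond to one lower-level request issued by the synthetic user.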
Our Approach
- Two parts
  - Topology
    - Internet Topology Generator from GIT (includes mapping from servers to sites)
    - Map clients to sites
  - Application behavior
    - A process for each client (user behavior modeling approach)
    - Creating workloads by simulating clients using a generic discrete-event sequential simulator (a client can be understood as a program)
      - Program complex behaviors
      - Program inter-related behaviors among clients
    - Parameters defined in a configuration file to describe their behavior
Topology
[Figure: topology diagram; legend: Site, Client, Server, Client/Server]
Topology File
- C0@s77 -> S@s79
- C0@s77 -> S@s82
- C1@s29 -> S@s0
- C1@s29 -> S@s30
- C2@s47 -> S@s43
- C2@s47 -> S@s45
- C3@s15 -> S@s11
- C3@s15 -> S@s12
- C4@s80 -> S@s2
- C4@s80 -> S@s84
- C4@s80 -> S@s85
- C5@s42 -> S@s43
- C5@s42 -> S@s45
- C6@s4 -> S@s0
- C6@s4 -> S@s7
- C7@s59 -> S@s60
- C7@s59 -> S@s63
- S@s0
- S@s2 -> S@s0
- S@s3 -> S@s0
- S@s4 -> S@s0
- S@s12 -> S@s0
- S@s17 -> S@s0
- S@s29 -> S@s0
- S@s69 -> S@s2
- S@s80 -> S@s2
- S@s88 -> S@s2
- S@s1 -> S@s3
- S@s95 -> S@s3
- S@s5 -> S@s4
- S@s7 -> S@s4
- S@s9 -> S@s12
- S@s10 -> S@s12
- S@s11 -> S@s12
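A topology file of this shape can be read into a small adjacency map. The sketch below is a guess at the format based only on the listing: it assumes each line is `Name@site`, optionally followed by `->` and a second node, and it normalises the `_at_` and `-gt` spellings to `@` and `->` before parsing. The function names are hypothetical.

```python
import re

# A node looks like "C0@s77" or "S@s0": a name, then "@s", then a site number.
NODE = re.compile(r"^(?P<name>[A-Za-z]+\d*)@s(?P<site>\d+)$")

def parse_node(text):
    m = NODE.match(text.strip())
    if m is None:
        raise ValueError(f"bad node: {text!r}")
    return m.group("name"), int(m.group("site"))

def parse_topology(lines):
    """Return {node: [nodes it connects to]}, with node = (name, site)."""
    links = {}
    for raw in lines:
        line = raw.replace("_at_", "@").replace("-gt", "->")
        line = line.strip().lstrip("-").strip()   # drop the bullet marker
        if not line:
            continue
        parts = [p.strip() for p in line.split("->")]
        source = parse_node(parts[0])
        targets = links.setdefault(source, [])
        if len(parts) == 2:
            targets.append(parse_node(parts[1]))
    return links

if __name__ == "__main__":
    sample = ["- C0@s77 -> S@s79", "- C0@s77 -> S@s82", "- S@s0"]
    for node, targets in parse_topology(sample).items():
        print(node, targets)
```

A line with no arrow, such as the entry for the server at site 0, is kept as a node with no outgoing link.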
Simulator
- Class Sim: a generic discrete-event sequential simulator that maintains a time-ordered schedule of discrete events
  - create_process(Process, char mode 0)
  - run_simulation(), stop_simulation()
  - set_timeout(Time)
  - signal_event(Event, ProcessId, Time)
- Virtual class Process: represents processes running within the simulator
  - process_event(const Event msg)
  - process_timeout()
  - stop()
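The idea of a time-ordered event schedule can be sketched in Python with a heap. The method names are borrowed from the slide, but the signatures, the use of `None` to mark a timeout, and the `Leader`/`Follower` example processes are assumptions for illustration; this is not the simulator's actual interface.

```python
import heapq
import itertools

class Process:
    """Base class for simulated processes; subclasses override the handlers."""
    def process_event(self, sim, pid, msg):
        pass

    def process_timeout(self, sim, pid):
        pass

class Sim:
    def __init__(self):
        self.now = 0.0
        self._queue = []               # heap of (time, seq, pid, msg)
        self._seq = itertools.count()  # tie-breaker: FIFO order at equal times
        self._procs = []
        self._running = False

    def create_process(self, proc):
        self._procs.append(proc)
        return len(self._procs) - 1    # process id

    def _push(self, delay, pid, msg):
        heapq.heappush(self._queue, (self.now + delay, next(self._seq), pid, msg))

    def set_timeout(self, pid, delay):
        self._push(delay, pid, None)   # None marks a timeout in this sketch

    def signal_event(self, msg, pid, delay=0.0):
        self._push(delay, pid, msg)

    def stop_simulation(self):
        self._running = False

    def run_simulation(self):
        self._running = True
        while self._running and self._queue:
            self.now, _, pid, msg = heapq.heappop(self._queue)
            proc = self._procs[pid]
            if msg is None:
                proc.process_timeout(self, pid)
            else:
                proc.process_event(self, pid, msg)

class Leader(Process):
    """Subscribes on its timeout, then triggers a friend 2 time units later."""
    def __init__(self, friend, log):
        self.friend, self.log = friend, log

    def process_timeout(self, sim, pid):
        self.log.append((sim.now, pid, "subscribe"))
        sim.signal_event("subscribe", self.friend, 2.0)

class Follower(Process):
    """Subscribes when triggered by another client, then ends the run."""
    def __init__(self, log):
        self.log = log

    def process_event(self, sim, pid, msg):
        self.log.append((sim.now, pid, msg))
        sim.stop_simulation()

def demo():
    log = []
    sim = Sim()
    follower = sim.create_process(Follower(log))
    leader = sim.create_process(Leader(follower, log))
    sim.set_timeout(leader, 1.0)
    sim.run_simulation()
    return log

if __name__ == "__main__":
    print(demo())
```

Modeling each client as a process lets inter-related behavior (one client's subscription triggering another's) be written as ordinary event handlers.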
Simulation
create_process()
create_process()
create_process()
run_simulation()
set_timeout(T)
process_timeout() subscribe signal_event() set_timeout(T)
process_event() subscribe signal_event()
process_event() subscribe
process_timeout() publish stop()
stop_simulation()
Contents of Sub/Pub
- Sub pattern: Type Attr_Name Operator Value
- Pub pattern: Type Attr_Name Value
- Every part comes from weighted dictionaries (distributions)
  - Type from types.dist
  - Attr_Name from attr_names.dist
  - Operator from string_operators.dist for string, from operators.dist for other types
  - Value from int_values.dist for int, from str_values.dist for string
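Drawing each part of a subscription or publication from a weighted dictionary can be sketched as follows. The dictionary contents and weights are hypothetical stand-ins for the `.dist` files, whose on-disk format is not shown here, and the function names are assumptions.

```python
import random

# Hypothetical contents of the .dist files: value -> weight.
TYPES = {"int": 3, "string": 2}
ATTR_NAMES = {"price": 5, "title": 3, "year": 2}
OPERATORS = {"=": 4, "<": 3, ">": 3}
STRING_OPERATORS = {"=": 5, "SF": 1}
INT_VALUES = {10: 1, 20: 2, 50: 1}
STR_VALUES = {"film": 2, "news": 2, "shopping": 1}

def draw(dist, rng):
    """Pick one key of a weighted dictionary, proportionally to its weight."""
    values = list(dist)
    return rng.choices(values, weights=[dist[v] for v in values], k=1)[0]

def make_constraint(rng):
    """Sub pattern: Type Attr_Name Operator Value."""
    typ = draw(TYPES, rng)
    name = draw(ATTR_NAMES, rng)
    if typ == "string":
        return (typ, name, draw(STRING_OPERATORS, rng), draw(STR_VALUES, rng))
    return (typ, name, draw(OPERATORS, rng), draw(INT_VALUES, rng))

def make_attribute(rng):
    """Pub pattern: Type Attr_Name Value."""
    typ = draw(TYPES, rng)
    name = draw(ATTR_NAMES, rng)
    value = draw(STR_VALUES if typ == "string" else INT_VALUES, rng)
    return (typ, name, value)

if __name__ == "__main__":
    rng = random.Random(1)
    print(make_constraint(rng))
    print(make_attribute(rng))
```

Changing the weights changes how often subscriptions and publications overlap, which is the main lever for shaping the generated content.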
Parameters
- Includes dictionary files as parameters
- Different parameters for different kinds of activities, e.g.:
  - For non-interactive activities
    - Type: Random, Constant or Poisson
    - For Random: Min, Max
    - For Constant: Value
    - For Poisson: Mid
  - For triggered activities: modeled through the simulator
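The three timing types for non-interactive activities could be turned into delays roughly as below. The interpretation is an assumption: Random is read as uniform between Min and Max, and Mid is read as the mean of a Poisson distribution (sampled here with Knuth's multiplication method, since the Python standard library has no Poisson sampler). The dictionary layout and function names are hypothetical.

```python
import math
import random

def poisson(mean, rng):
    """Knuth's method: count uniform draws until their product falls below e^-mean."""
    limit = math.exp(-mean)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def next_delay(params, rng):
    """Return the delay before a client's next non-interactive activity."""
    kind = params["Type"]
    if kind == "Random":
        return rng.uniform(params["Min"], params["Max"])   # assumed: uniform
    if kind == "Constant":
        return float(params["Value"])
    if kind == "Poisson":
        return float(poisson(params["Mid"], rng))          # assumed: Mid is the mean
    raise ValueError(f"unknown Type: {kind!r}")

if __name__ == "__main__":
    rng = random.Random(0)
    print(next_delay({"Type": "Random", "Min": 1, "Max": 5}, rng))
    print(next_delay({"Type": "Constant", "Value": 3}, rng))
    print(next_delay({"Type": "Poisson", "Mid": 4}, rng))
```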
Parameters (Content)
Parameters (Behavior)
Workload Generated
- T3, C0, S0
  - s optics SF belch
  - i relocated 29
  - s panelist < dismissing
  - i Massachusetts 85
  - i lilacs 80
- T4, C2, S0
  - s absolution adiabatic
  - i signify > 21
  - i hourly 89
- T5, C0, U0
- T5, C1, S0
  - s opposites > epistemological
- T5, C2, U0
- T6, C7, S0
  - s pooling SF lender
- T6, C0, N0
  - i hourly 22
  - i permanently 56
  - i Bontempo 7
- T6, C6, N0
  - s Muslims smuggle
  - i influenced 55
- T7, C0, N1
  - i roping 6
  - i syllogisms 47
- T7, C4, N0
  - s archdiocese belch
  - b greengrocer 0
  - s wheels sates
- T9, C7, U0
- T10, C1, U0
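The record headers in this listing (`T3, C0, S0` and similar) can be picked apart with a small parser. The field meanings are assumptions inferred from the listing: `T` is read as a timestamp, `C` as a client number, and the letters `S`, `U` and `N` as subscribe, unsubscribe and notification; the trailing number is kept as an opaque index because its meaning is not stated.

```python
import re
from typing import NamedTuple, Optional

class Header(NamedTuple):
    time: int
    client: int
    action: str
    index: int

# Assumed meanings of the action letters.
ACTIONS = {"S": "subscribe", "U": "unsubscribe", "N": "notify"}
HEADER = re.compile(r"^T(\d+),\s*C(\d+),\s*([SUN])(\d+)$")

def parse_header(line: str) -> Optional[Header]:
    """Return a Header for a record header line, or None for any other line."""
    text = line.strip().lstrip("-").strip()   # drop the bullet marker
    m = HEADER.match(text)
    if m is None:
        return None
    time, client, letter, index = m.groups()
    return Header(int(time), int(client), ACTIONS[letter], int(index))

if __name__ == "__main__":
    for line in ["- T3, C0, S0", "- s optics SF belch", "- T9, C7, U0"]:
        print(line, "=>", parse_header(line))
```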
Formulate and implement scenarios (Wave Scenario)
[Figure: wave scenario graph; numbered nodes labeled with subject sets: Film; Film, News; Film, News, shopping; Film, shopping; News, shopping; shopping]
Configuration File (1)
- SUBJECT "News"
- CONSTR_MIN 1
- CONSTR_MAX 3
- ATTR_MIN 2
- ATTR_MAX 6
- ATTR_DICT_F "attr_names.dict"
- TYPES_DIST_F "types.dist"
- OP_DIST_F "operators.dist"
- OP_STRING_DIST_F "string_operators.dist"
- STR_DIST_F "str_values.dist"
- INT_DIST_F "int_values.dist"
- BOOL_DIST_F "bool_values.dist"
Configuration File (2)
- CLIENT1
- SUB_FRIENDS 20@2"Film""News", 30@5"News""Shopping"
- SUBJECT_DICT_F "subject_names.dict"
- SUBSCRIPTION 7
- NOTIFICATION 3
- FIRST 0
- SUB_SERIAL_NUM 2
- NOTIFY_SERIAL_NUM 1
- SUB_NOTIFY_TIME 1
- SUB_TIME 0
- NOTIFY_TIME 0
- UNSUB_TIME 2
- 20@2"Film""News" means client 2 will be triggered to subscribe to the subjects Film and News after 2 seconds.
Simulation
[Figure: timeline of Clients 1 to 6 over Time (s); legend: subscribe, publish, triggered to sub/pub, signal_event]
Application to Siena
[Figure: diagram with components Generate topology, Topology file, Workload file, Script Generator (scripts for each client), Script Dispatcher, Scripts 1 to 3, Clients 1 to 3, Servers 1 and 2]
Conclusion
- User behavior modeling approach
- Two parts
  - Topology: generated by the Internet topology generator
  - Application behavior
    - Every client is a program
    - Regular, non-interrelated activity through the analytic approach
    - Complex and interrelated activity through simulation
    - Defined by a configuration file
Future Work
- More reasonable workload content
- Experiment on remote servers
- Improvement of configuration file
- Application to other systems
- Validation with data from real life
Questions?