Title: Self-* Systems CSE 598B
1Self-* Systems CSE 598B
- Instructor: Bhuvan Urgaonkar
- Fall 2005
2Introduction
- Bhuvan Urgaonkar
- Assistant Professor, CSE
- Ph.D. Univ. of Mass., Amherst
- Research Interests
- Distributed systems, operating systems, computer networking, modeling of systems
- Office: 338D, Email: bhuvan_at_cse.psu.edu
- Office hours and class timings
- Undecided as of now; we will figure this out at the end of the class
- If in doubt, just walk in anytime!
- Students' turn to introduce themselves
3Self-* systems
- Self-*: a regular expression
- But not quite
- No self-destroying systems ?
- Three themes
- Self-tuning systems
- Self-healing systems
- Self-stabilizing systems
- Course Web page
- http://www.cse.psu.edu/bhuvan/teaching/fall05/self-star.html
- To do: Set up a course mailing list
4Self-tuning systems
- Systems that can adapt their behavior to dynamically changing external influences on their own
5Internet applications
- Proliferation of Internet applications
- Examples: auction site, online game, online retail store
- Growing significance in personal, business affairs
- Focus: Internet server applications
6Hosting platforms
- Data Centers
- Clusters of servers
- Storage devices
- High-speed interconnect
- Hosting platforms
- Rent resources to third-party applications
- Performance guarantees in return for revenue
- Benefits
- Applications don't need to maintain their own infrastructure
- Rent server resources, possibly on demand
- Platform provider generates revenue by renting resources
7Goals of a hosting platform
- Meet service-level agreements
- Satisfy application performance guarantees
- E.g., average response time, throughput
- Maximize revenue
- E.g., maximize the number of hosted applications
- Question: How should a hosting platform manage its resources to meet these goals?
8Challenge: dynamic workloads
- Multi-time-scale variations
- Time-of-day, hour-of-day
- Overloads
- E.g., Flash crowds
- User threshold for response time: 8-10 s
- Key issue: How to provide good response time under varying workloads?
[Figure: workload traces plotted against Time (days) and Time (hours); y-axis: Arrivals per min]
9Self-tuning systems
- A self-tuning hosting platform
10Dynamic provisioning
[Figure: Monitor workload, Compute current/future demand, Adjust allocation]
- Key idea: increase or decrease allocated servers to handle workload fluctuations
- Monitor incoming workload
- Compute current or future demand
- Match number of allocated servers to demand
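The monitor, compute, adjust steps above can be sketched as a small loop. This is only an illustrative sketch, not a method from the course material: the function names, the fixed per-server capacity, and the 20% headroom are all assumptions made for the example.

```python
import math

def servers_needed(arrival_rate, per_server_capacity, headroom=0.2):
    """Servers required to serve arrival_rate with some spare headroom.

    per_server_capacity and headroom are hypothetical parameters: a real
    platform would derive them from an application model or measurements.
    """
    demand = arrival_rate * (1 + headroom)
    return max(1, math.ceil(demand / per_server_capacity))

def provision(observed_rates, per_server_capacity):
    """Run the loop over a trace of observed arrival rates."""
    allocations = []
    for rate in observed_rates:                              # monitor workload
        target = servers_needed(rate, per_server_capacity)   # compute demand
        allocations.append(target)                           # adjust allocation
    return allocations

if __name__ == "__main__":
    # Arrival rates (requests/s) at four successive measurement points.
    print(provision([40, 100, 400, 90], per_server_capacity=50))
```

The sketch matches allocation to the current demand only; estimating future demand would replace `rate` with a forecast.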
11Dynamic provisioning at multiple time-scales
- Predictive provisioning
- Certain Internet workload patterns can be predicted
- E.g., time-of-day effects, increased workload during Thanksgiving
- Design a good application model
- Provision using model at time-scale of hours or days
- Reactive provisioning
- Applications may see unpredictable fluctuations
- E.g., increased workload to news sites after an earthquake
- Detect such anomalies and react fast (minutes)
- Question: How to put these together?
- When to invoke the predictor and the reactor?
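One possible way to combine the two time-scales, offered as a sketch under stated assumptions rather than as the answer to the question above: the predictor sets the allocation once per hour from a forecast, and the reactor overrides it within the hour whenever a measured rate exceeds what the current allocation can serve. The names `servers_for` and `provision_two_scales`, the hourly period, and the simple capacity model are all hypothetical.

```python
import math

def servers_for(rate, capacity):
    """Smallest server count whose combined capacity covers `rate`."""
    return max(1, math.ceil(rate / capacity))

def provision_two_scales(hourly_forecast, measured, capacity):
    """Combine predictive (hours) and reactive (minutes) provisioning.

    hourly_forecast[h] : predicted peak arrival rate for hour h
    measured[h]        : rates observed at short intervals within hour h
    Returns a list of (hour, servers, who) allocation decisions.
    """
    decisions = []
    for hour, forecast in enumerate(hourly_forecast):
        servers = servers_for(forecast, capacity)   # predictor: once per hour
        decisions.append((hour, servers, "predictor"))
        for rate in measured[hour]:                 # reactor: every few minutes
            if rate > servers * capacity:           # forecast was too low
                servers = servers_for(rate, capacity)
                decisions.append((hour, servers, "reactor"))
    return decisions

if __name__ == "__main__":
    for decision in provision_two_scales([100, 200], [[90, 110, 260], [150, 180]], 50):
        print(decision)
```

In this sketch the reactor only adds servers; scaling back down is left to the next predictor invocation, which is one design choice among several.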
12Self-healing systems
- Systems that continue to operate on their own despite faults or failures
- Distinction between faults and failures
- Fault: A sysadmin sets a small concurrency limit for a Web server
- Failure: debris from an external fuel tank is thought to have struck Columbia's left wing in 2003.
- Failure/fault handling capability built into the system
- Graceful degradation
- We will study classic literature in fault tolerance, papers that apply these principles to modern distributed systems
13Self-stabilizing systems
- Guaranteed to converge to a desired behavior from any initial state if left alone
- Why should one be interested in self-stabilizing algorithms?
- Their applicability to distributed systems
- Recovering from faults in a space shuttle: faults may cause a malfunction for a while. Using a self-stabilizing algorithm for its control leads to automatic recovery and enables the shuttle to continue its task
14What is a self-stabilizing algorithm?
- This question will be answered using the Stabilizing Orchestra example
- The Problem
- The conductor is unable to participate; harmony is achieved by players listening to their neighbor players
- Windy evening: the wind can turn some pages in the score, and the players may not notice the change
15The Stabilizing Orchestra Example
- Our Goal
- To guarantee that harmony is achieved at some point following the last undesired page turn
- Imagine that the drummer notices that the violin player next to him is on a different page (solutions and their problems)
- The drummer turns to his neighbor's page: what if the violin player noticed the difference as well?
- Both the drummer and the violin player start from the beginning: what if the player next to the violin player notices the change only after the other two have synchronized?
16The Self-Stabilizing Solution
- Every player will join the neighboring player who is playing the earliest page (including himself)
- Note that the score has a bounded length. What happens if a player goes to the first page of the score before harmony is achieved?
- In every long enough period in which the wind does not turn a page, the orchestra resumes playing in synchrony
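The join-the-earliest-page rule can be tried out in a small simulation. This is a simplified sketch with several assumptions that are not in the slides: players sit in a line and hear only their immediate neighbors, everyone acts in synchronous rounds, and the score is long enough that nobody reaches its end (so the bounded-length question raised above is deliberately ignored). The function names are hypothetical.

```python
def step(pages):
    """One synchronous round: each player joins the earliest page among
    itself and its immediate neighbors, then plays on to the next page."""
    n = len(pages)
    joined = [min(pages[max(i - 1, 0):i + 2]) for i in range(n)]
    return [page + 1 for page in joined]

def rounds_until_harmony(pages, max_rounds=1000):
    """Count the rounds needed until all players are on the same page."""
    for rounds in range(max_rounds + 1):
        if len(set(pages)) == 1:
            return rounds, pages
        pages = step(pages)
    raise RuntimeError("no harmony within max_rounds")

if __name__ == "__main__":
    # The wind has left five players on arbitrary pages.
    print(rounds_until_harmony([7, 3, 9, 9, 4]))
```

Under these assumptions the earliest page spreads one neighbor per round, so a line of n players reaches harmony within n - 1 rounds from any starting configuration.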
17Discussion: Overlaps and distinctions
- Self-tuning vs self-healing vs self-stabilizing systems
- Proactive vs reactive
18Crosscutting goals and challenges
- Removing costly and error-prone humans from administering complex systems
- Learning from the past
- Modeling systems to render them amenable to analysis
- Understanding how robust a system is
- Robust: predictable behavior, graceful degradation
- Equivalent: Figuring out how to make a system robust
19Introspection!
- Everyone gives an example of a self-* aspect from their research/experience
- Arjun: e-commerce applications
- Amitayu: dynamic allocation of servers in a farm
- Ross: Ross's sensor n/w
- Huajing: information retrieval/feedback
- Young: fault handling by duplication
- Krishna: activity migration in a multiprocessor
20Goals of the course
- Understand classic literature
- Identify theory and systems issues/tools common across these diverse domains
- Statistical learning, control theory, measurement techniques, data analysis, fault tolerance, modeling
- I will try to have some guest lectures
- Learn to appreciate how theory translates into and compares with practice
- Critically evaluate papers and present them, use these in research
21Some administrative details
22Grading policy
- Paper presentations: 30%
- Class participation and discussion: 15%
- Let's have lots of heated discussions
- Don't be shy!
- Paper evaluations due before class: 15%
- A conference-style evaluation form
- Semester-long project: 30%
- May be replaced by a term paper
- Apply ideas to your research, master's thesis
- Final exam: 10%
- Take-home exam
23Expected course-load
- No intention of stressing you out!
- Round-robin presentation policy
- Number of presentations will depend on how many students enroll
- Red teams: To make sure you come prepared
- We DON'T want bad presentations!
- Mid-term and final presentations for students doing projects
- End-of-semester take-home exam
- Goal: Find out what we learnt in the course
24Presentations
- Prepare a talk about 45 min long
- Rest of the class for discussions
- We will accept or reject papers at the end of each class?
- Red team
- Each presenter will practice his/her talk with the assigned red team before the class
- You are welcome to talk to me, discuss slides, ask for help understanding the paper before presenting it
- Use the PowerPoint template on the course page
- We will try to become good speakers and reviewers!
25Paper evaluations
- Due at midnight before the class
- I will put up an evaluation format that you will adhere to
- No long essays needed
- Be critical, read the papers carefully
- I will anonymize evaluations and put them up after the class so all can read them
- Acceptable formats: txt, pdf
26Course project
- Not compulsory
- You may work in groups of up to 2 students
- You may replace it with a term paper
- Survey of additional reading material
- Project may be
- A theoretical exercise
- Implementation-based
- A thought experiment
- Reports and term papers due at the end of the semester
27Final exam
- Day-long take-home exam
- For students doing projects, I will design questions related to their project
- For students doing a survey, I will design questions based on their survey report
28Miscellaneous
- Please register soon so the course can be offered
- At least 5 students need to take the course
- Let's figure out course timings suitable to all
- Random thoughts
- Would you like to solve puzzles?
- Would you like to have discussions on systems research in general, hot areas, top conferences?
- Would you like to take turns as scribes?
- Hope: We will learn a lot and have lots of fun in this course
29Questions or comments?