eGovernance - PowerPoint PPT Presentation

2
Topics
  • ACID vs BASE
  • Starfish Availability
  • TACC Model
  • TranSend Measurements
  • SNS Architecture

3
Extensible Cluster Based Network Services
Armando Fox, Steven Gribble, Yatin Chawathe, Eric
Brewer, Paul Gauthier
University of California, Berkeley
Inktomi Corporation
Presenter: Ashish Gupta, Advanced Operating Systems
4
Motivation
  • Proliferation of network-based services
  • Two critical issues must be addressed by Internet
    services
  • System scalability: incremental and linear
  • Availability and fault tolerance: 24x7 operation

Clusters of workstations meet these requirements
6
Contribution of this work
Isolate common requirements of cluster-based
Internet apps into a reusable substrate:
the Scalable Network Services (SNS) framework
  • Goal: complete separation of "-ility" concerns
    (scalability, availability) from application logic
  • Legacy code encapsulation
  • Insulate programmers from nasty engineering

7
Contribution of this work
  • Architecture for SNS, exploiting the strength of
    cluster computing
  • Separation of content of network services from
    implementation
  • Encapsulation of low level functions in a lower
    layer
  • Example of a new service
  • A Programming Model to go with the architecture

8
The SNS architecture
  • Workers and front ends
  • All control decisions for satisfying user
    requests are localized in the front ends
  • Front ends decide which servers to invoke, access the
    profile database, notify the end user, etc.
  • Workers are simple and stateless
  • Behaviour of the service is defined entirely at the
    front end
  • Analogy: processes in a Unix pipeline, e.g.
    ls -l | grep .pl | wc
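
The Unix pipeline analogy can be illustrated with a small sketch in which each worker is a stateless function and the front end alone decides how they are chained. The worker names and the `pipeline` helper are hypothetical and are not part of SNS.

```python
from functools import reduce


def only_perl(names):
    """Stateless worker: keep names ending in .pl (like `grep .pl`)."""
    return [name for name in names if name.endswith(".pl")]


def count(names):
    """Stateless worker: count the items it receives (like `wc`)."""
    return len(names)


def pipeline(*workers):
    """Front-end logic: feed the output of each worker into the next one."""
    return lambda data: reduce(lambda acc, worker: worker(acc), workers, data)


# The front end defines the service; the workers know nothing about each other.
count_perl_files = pipeline(only_perl, count)
```

Because the workers hold no state, any of them can be restarted or replicated without changing the composition.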

(Figure: SNS architecture, showing the user profile database, caches (C), front ends (FE), workers, the manager (LB/FT: load balancing and fault tolerance), a GUI, and the administration interface)
9
Separating the content from implementation
Layered software model
Previous components
SNS provides: scalability, load balancing, fault
tolerance, high availability
10
The SNS Layer
  • Scalability
  • Replicate well-encapsulated components
  • Prolonged bursts: notion of an overflow pool
  • Load balancing
  • Centralized: simple to implement and predictable

11
The SNS Layer
  • Soft state for fault tolerance and availability
  • Process peers watch each other
  • Because there is no hard state, recovery is simply restart
  • Load balancing, hot updates, migration are easy
  • Shoot down a worker, and it will recover
  • Upgrade: install new software, shoot down old
  • Mostly graceful degradation

12
Starfish Availability: LB Death
  • FE detects via broken pipe/timeout, restarts LB

(Figure: cache (C), three front ends (FE), and the LB/FT manager)
13
Starfish Availability: LB Death
  • FE detects via broken pipe/timeout, restarts LB
  • New LB announces itself (multicast), is contacted by
    workers, and gradually rebuilds its load tables
  • If a partition heals, extra LBs commit suicide
  • FEs operate using cached LB info during failure

(Figure: cache (C), three front ends (FE), and the LB/FT manager)
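
A minimal sketch of why a restarted load balancer can recover without durable storage: its load table is soft state, rebuilt from worker announcements and expired when a worker goes silent. The class name, the `ttl_seconds` value, and the explicit `now` parameter (used here to keep the example deterministic) are assumptions, not details from the slides.

```python
import time
from typing import Optional


class SoftStateLoadTable:
    """Load table held only in memory and rebuilt from worker announcements."""

    def __init__(self, ttl_seconds: float = 5.0):
        self.ttl = ttl_seconds  # assumed expiry interval
        self.entries = {}       # worker name -> (load, time last heard from)

    def announce(self, worker: str, load: int, now: Optional[float] = None) -> None:
        """Record a worker's report; a newer report overwrites the older one."""
        stamp = time.monotonic() if now is None else now
        self.entries[worker] = (load, stamp)

    def live_workers(self, now: Optional[float] = None) -> dict:
        """Drop workers not heard from within the TTL and return the rest."""
        current = time.monotonic() if now is None else now
        self.entries = {
            worker: (load, stamp)
            for worker, (load, stamp) in self.entries.items()
            if current - stamp <= self.ttl
        }
        return {worker: load for worker, (load, _) in self.entries.items()}
```

A freshly started table is empty and fills in as workers report, which mirrors the "gradually rebuilds load tables" step.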
15
Question: How do we build the services in the
higher layers?
16
The TACC Model
A model for structuring services (transformation,
aggregation, caching, customization)
  • Programming model based on composable building
    blocks
  • Many existing services fit well within
    the TACC model
17
A Meta-Search Engine In TACC
  • Uses existing services to create a new service
  • 2.5 hours to write using the TACC framework

(Figure: metasearch web UI and the Internet)
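
As a rough illustration of aggregation, the sketch below queries several search back ends and merges their results by rank, dropping duplicates. The two engines are stubs returning made-up URLs, and the merge policy is an assumption. The slides do not describe how the actual metasearch service collates results.

```python
from itertools import zip_longest


def engine_a(query):
    """Stub back end (hypothetical); a real one would issue an HTTP request."""
    return [f"a.example/{query}/1", f"shared.example/{query}"]


def engine_b(query):
    """Second stub back end (hypothetical)."""
    return [f"shared.example/{query}", f"b.example/{query}/1"]


def metasearch(query, engines):
    """Query every engine, interleave results by rank, and drop duplicates."""
    result_lists = [engine(query) for engine in engines]
    merged, seen = [], set()
    for rank_group in zip_longest(*result_lists):
        for url in rank_group:
            if url is not None and url not in seen:
                seen.add(url)
                merged.append(url)
    return merged
```

The aggregator reuses existing services unchanged, which is the point the slide makes about building a new service from existing ones.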
18
An Example Service: TranSend
19
Datatype-Specific Distillation
  • Lossy compression that preserves semantic content
  • Tailor content for each client
  • Reduce end-to-end latency when link is slow
  • Meaningful presentation for range of clients
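
A toy sketch of datatype-specific distillation: the content type selects a distiller, and a per-client parameter controls how aggressively content is reduced. The registry, the `image/x-toy` type, and both distillers are illustrative assumptions. TranSend's real distillers are not shown in the slides.

```python
def distill_image(pixels, factor):
    """Crude lossy downsampling: keep every `factor`-th pixel in each dimension."""
    return [row[::factor] for row in pixels[::factor]]


def distill_text(text, max_chars):
    """Crude text reduction: truncate to `max_chars` and mark the cut."""
    if len(text) <= max_chars:
        return text
    return text[:max_chars].rstrip() + "..."


# Hypothetical registry mapping a content type to its distiller.
DISTILLERS = {
    "image/x-toy": distill_image,
    "text/plain": distill_text,
}


def distill(mime_type, content, client_param):
    """Apply the distiller for this type; pass unknown types through unchanged."""
    distiller = DISTILLERS.get(mime_type)
    if distiller is None:
        return content
    return distiller(content, client_param)
```

Passing unknown types through unchanged keeps the proxy usable even when no distiller exists for a given datatype.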

(Figure: distillation examples, labeled 6.8x and 65x)

20
TranSend SNS Components
  • Workers: distillers here
  • Simple restart mechanism for fault tolerance
  • Each distiller took 5-6 hours to write
  • SNS fault tolerance removes worries about
    occasional bugs/crashes

21
Measurements
  • Request generation: high-performance HTTP request
    playback engine
  • Burstiness: handled by the overflow pool

22
Load Balancing
Metric: queue length at distillers
When load reaches a threshold, the manager spawns a new
distiller
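
The policy above can be sketched as a small centralized manager that routes to the shortest queue and spawns a distiller when load crosses a threshold. The threshold value, the use of the average queue length as the trigger, and all names here are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Manager:
    """Toy manager: tracks distiller queue lengths and spawns on overload."""

    spawn_threshold: int = 10                   # assumed; the slides give no number
    queues: dict = field(default_factory=dict)  # distiller name -> queue length
    next_id: int = 0

    def spawn(self) -> str:
        """Start a new distiller with an empty queue and return its name."""
        name = f"distiller-{self.next_id}"
        self.next_id += 1
        self.queues[name] = 0
        return name

    def report(self, name: str, queue_length: int) -> None:
        """Record a load report from a distiller, overwriting the previous one."""
        self.queues[name] = queue_length

    def pick(self) -> str:
        """Choose the least loaded distiller, spawning one if load is too high."""
        if not self.queues:
            return self.spawn()
        average = sum(self.queues.values()) / len(self.queues)
        if average >= self.spawn_threshold:
            return self.spawn()
        return min(self.queues, key=self.queues.get)
```

Keeping the decision in one place is what makes the centralized scheme simple and predictable, at the cost of the manager being a shared component.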
23
Scalability
Strategy: begin with a minimal instance, increase
offered load until saturation, then add more resources
to eliminate saturation
Observations: nearly perfect linear growth; one
distiller handles 23 requests/sec, one front end 70
requests/sec
Ultimate bottleneck: shared components of the
system (manager and the SAN). The SAN could be a
bottleneck for communication-intensive workloads
(example: 10 Mb/s Ethernet). Topic for future research
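
Given the per-component rates reported above, a back-of-the-envelope capacity estimate follows directly. This sketch assumes scaling stays linear, that every request needs distillation (no cache hits), and that the shared manager and SAN are not yet the bottleneck. The function name is hypothetical.

```python
import math

DISTILLER_RPS = 23  # requests/sec per distiller, from the measurements
FRONT_END_RPS = 70  # requests/sec per front end, from the measurements


def instances_needed(offered_rps: float) -> dict:
    """Estimate component counts for a given offered load, assuming linear scaling."""
    return {
        "distillers": max(1, math.ceil(offered_rps / DISTILLER_RPS)),
        "front_ends": max(1, math.ceil(offered_rps / FRONT_END_RPS)),
    }
```

For example, an offered load of 100 requests/sec works out to 5 distillers and 2 front ends under these assumptions.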
24
Conclusion
  • A layered architecture for cluster-based scalable
    network services
  • Authors shielded from software complexity of
    automatic scaling, high availability, and failure
    management
  • New services as composition of stateless workers
  • A useful paradigm for deploying new Internet
    services

25
ACID vs BASE semantics
An approximate answer delivered quickly is more
useful than the exact answer slowly
26
ACID vs BASE semantics
An approximate answer delivered quickly is more
useful than the exact answer slowly
  • Search Engine as a database
  • 1 Big table
  • Unknown but large growth
  • Must be truly highly available

27
  • A DBMS would be too slow
  • Choose availability over consistency
  • Graceful degradation: OK to temporarily lose
    small random subsets of data due to faults

Database research is about ACID: Atomicity,
Consistency, Isolation, Durability
Replace with: availability, graceful
degradation, performance
BASE: Basically Available, Soft state, Eventual
consistency
28
Why BASE ?
  • Idea: focus on looser semantics rather than ACID
    semantics
  • ACID => data unavailable rather than available
    but inconsistent
  • BASE => data available, but could be stale,
    inconsistent or approximate
  • Real systems use BOTH semantics
  • Claim: BASE can lead to simpler systems and
    better performance
  • Performance: caching and avoidance of
    communication and some locks (e.g. ACID requires
    strict locking and communication with replicas
    for every write and any reads without locks)
  • Simpler: soft state leads to easy recovery and
    interchangeable components
  • BASE fits clusters well due to partial failure
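
A minimal sketch of the BASE trade-off for reads: when the authoritative source is unreachable, return a possibly stale cached value instead of failing. The `fetch` callable stands in for a hypothetical backend, and the class is an illustration rather than anything from the SNS framework.

```python
class StaleOkCache:
    """Serve fresh data when possible and stale data when the backend is down."""

    def __init__(self, fetch):
        self.fetch = fetch  # hypothetical backend call; may raise on failure
        self.cache = {}     # soft state: safe to lose, rebuilt on later reads

    def get(self, key):
        """Return (value, is_fresh); fall back to the cached copy on failure."""
        try:
            value = self.fetch(key)
        except Exception:
            if key in self.cache:
                return self.cache[key], False  # available but possibly stale
            raise                              # nothing cached: cannot answer
        self.cache[key] = value
        return value, True
```

An ACID-style reader would refuse to answer while the backend is down. Here the caller gets an answer plus a flag saying it may be out of date.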

29
More BASE
  • Reduces complexity of service implementation,
    trading consistency for simplicity
  • Fault tolerance
  • Availability
  • Opportunities for better performance
    optimizations in the SNS framework
  • ACID: durable and consistent state across
    partial failures
  • This is relaxed in the BASE model
  • Example: HotBot

30
Thank You
31
Backup Slides
32
Question
  • Why are cluster-based network services well
    suited to Internet services?

33
Answer
  • The requirements are highly parallel (many
    independent simultaneous users)
  • The grain size typically corresponds to at most a
    few CPU seconds on a commodity PC

34
Question 2
  • Why do cluster-based network services use
    BASE semantics?

35
Answer
  • BASE semantics allow us to handle partial failure
    in clusters with less complexity and cost.

36
Question 3
  • When the overflow machines are being recruited
    unusually often, what should be done at this time?

37
Answer
  • It is time to add new machines.

38
Question 4
  • Does a front-end crash lose any
    information? If so, what kind of information is
    lost?

39
Answer
  • In-flight user requests are lost; the user needs to
    handle the timeout and resend the request.
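
The client-side behaviour described in this answer can be sketched as a simple retry loop. The `send` callable is a hypothetical stand-in for whatever issues the request, and the attempt limit is an assumed value.

```python
def send_with_retry(send, request, max_attempts=3):
    """Resend a request after a timeout, giving up after `max_attempts` tries."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error = None
    for _ in range(max_attempts):
        try:
            return send(request)
        except TimeoutError as exc:
            last_error = exc  # remember the failure and try again
    raise last_error
```

Because the request is resent from scratch, this approach only suits requests that are safe to repeat, such as fetching a page.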

40
(No Transcript)
41
Clustering and Internet Workloads
  • Internet vs. traditional workloads
  • e.g. Database workloads (TPC benchmarks)
  • e.g. traditional scientific codes (matrix
    multiply, simulated annealing and related
    simulations, etc.)
  • Some characteristic differences
  • Read mostly
  • Quality of service (best-effort vs. guarantees)
  • Task granularity
  • Embarrassingly parallel: why?
  • HTTP is stateless with short-lived requests
  • The Web's architecture has already forced app
    designers to work around this! (not obvious in
    1990)

42
Meeting the Cluster Challenges
  • Software programming models
  • Partial failure and application semantics
  • System administration
  • Two case studies to contrast programming models
  • GLUnix goal: support all traditional Unix apps,
    providing a single system image
  • SNS/TACC goal: simple programming model for
    Internet services (caching, transformation,
    etc.), with good robustness and easy
    administration

43
AltaVista hardware
  • An AltaVista system consists of six computers
  • AltaVista (external traffic, HTTP server): 250 MB
    main memory, 6 GB disk
  • Indexer (indexes HTML documents): 10
    processors, 6 GB main memory, 210 GB disk
  • Scooter (robot): 1.5 GB main memory, 30 GB disk
  • Vista (processes Scooter output): 2 processors, 2
    GB main memory, 180 GB disk
  • News indexer: 896 MB main memory, 13 GB disk
  • News server: 896 MB main memory, 24 GB disk

44
GOOGLE
  • Google's hardware is a massive "farm" of more
    than 10,000 servers, capable of not only indexing
    more than 3 billion web documents but handling
    thousands of queries per second with sub-second
    response times. It's an awesome engineering feat
    in its own right.

45
GOOGLE LB and FT
  • Google's application makes expensive proprietary
    hardware unsuitable, says Reese. "We are not like
    a transaction-based e-Commerce site, where it
    makes sense to spend a whole lot of money on some
    really big server iron and storage area network.
    We architected our solution to be scalable by
    using smaller servers that are multiply redundant
    and very fast through load balancing. Also it
    makes us very fault tolerant; we can lose a whole
    cluster or clusters, and we'll still be fine."