Title: Future Internet-Scale Systems
1. Future Internet-Scale Systems
- Information Devices (10 Billion)
- Connected Stationary Computers (100 Million)
- Scalable Servers (Million)
2. Imagine
3. Internet-Scale Systems
- Extremely large, complex, distributed, heterogeneous, with continuous, rapid introduction of new technologies
- Change in philosophy
  - Dynamically deployed agents where they are needed
  - Big infrastructure, small clients
  - Incremental processing/communications growth
  - Careful violation of traditional layering
- Implementation approach based on incremental prototyping, deployment, evaluation, experimentation
  - infrastructure support to simplify service construction
  - ingrained rapid deployment, evolution, and customizability
4. Ninja Architecture Overview
[Figure: Ninja architecture overview. Surviving labels: Internet; PDAs (e.g. IBM Workpad); cellphones, pagers, etc.]
5. Bases
- A physical, administrative, and logical boundary
  - a collection of machines geographically co-located
  - administrative guarantees: no network partitions (!), constant power supply, trust within the Base
- Base platform simplifies authoring of services
  - cluster primitives
  - task execution, naming, and monitoring
  - load balancing, failure detection, and restart
  - persistent data primitives and guarantees
  - distributed, available data structures
- Hides service implementation from rest of world
  - granularity of services is at cluster level, not node level
6. Services: Programmatic Access
- Service
  - Highly available program (or cooperating programs)
  - fixed interface at a fixed location (in a Base)
  - guarantees about performance, availability, consistency
- Strongly typed interface
  - Multiple services of a given type compete
  - Compete on location, price, robustness, quality, brand name
- Finding services: Service Discovery Service (SDS)
  - Find best service of given type
  - best according to multiple criteria (cost, geographic and administrative location, speed, reputation, etc.)
- Path construction tied to service discovery
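A minimal sketch of "find best service of given type", assuming a simple weighted score over a few of the criteria listed above. The `Offer` fields, the weights, and the sample offers are all hypothetical; the slides do not say how the SDS actually ranks services.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    """One advertised service instance (hypothetical fields)."""
    name: str
    service_type: str
    cost: float        # lower is better
    latency_ms: float  # lower is better
    reputation: float  # 0..1, higher is better


def score(offer, w_cost=1.0, w_latency=0.01, w_reputation=5.0):
    # Assumed weighting: reward reputation, penalize cost and latency.
    return (w_reputation * offer.reputation
            - w_cost * offer.cost
            - w_latency * offer.latency_ms)


def find_best(offers, service_type):
    """Return the highest-scoring offer of the requested type, or None."""
    candidates = [o for o in offers if o.service_type == service_type]
    return max(candidates, key=score) if candidates else None


OFFERS = [
    Offer("fax-a", "fax", cost=2.0, latency_ms=100, reputation=0.9),
    Offer("fax-b", "fax", cost=1.0, latency_ms=300, reputation=0.6),
    Offer("pager-a", "pager", cost=0.5, latency_ms=50, reputation=0.8),
]

best = find_best(OFFERS, "fax")  # fax-a under the assumed weights
```

Only offers of the requested type compete, which mirrors the "strongly typed interface" point: ranking happens within a type, never across types.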
7. Wide-Area Paths
- Path to service is a first-class entity
  - Explicit or automatic creation
  - Can change dynamically (path or computation)
  - Insert operators into the path (FEC, compression)
  - Change parameters (e.g. to optimize for wireless/satellite)
- The path is the unit of
  - authentication: delegate along the path
  - resource allocation and scheduling
- Automatic path creation as result of SDS query
  - has flavor of query optimization in DB terminology
  - Find logical path of operators; path must type check
  - Place operators on nodes (some operators have affinity)
  - Add connectors as needed, create any authentication keys
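The "find logical path of operators, path must type check" step can be sketched as a breadth-first search over data types. This is an illustration only: the operator names and MIME-style type labels below are hypothetical, and placement, connectors, and key creation are left out.

```python
from collections import deque

# Hypothetical operator catalog: name -> (input type, output type).
OPERATORS = {
    "mp3_decode": ("audio/mp3", "audio/pcm"),
    "speech_to_text": ("audio/pcm", "text/plain"),
    "text_to_html": ("text/plain", "text/html"),
    "compress": ("text/html", "application/gzip"),
}


def find_logical_path(source_type, target_type, operators=OPERATORS):
    """Breadth-first search over types; returns a shortest operator list or None."""
    pending = deque([(source_type, [])])
    seen = {source_type}
    while pending:
        current, path = pending.popleft()
        if current == target_type:
            return path
        for name, (in_type, out_type) in operators.items():
            if in_type == current and out_type not in seen:
                seen.add(out_type)
                pending.append((out_type, path + [name]))
    return None


def type_checks(path, source_type, target_type, operators=OPERATORS):
    """Verify that each operator's input matches the previous operator's output."""
    current = source_type
    for name in path:
        in_type, out_type = operators[name]
        if in_type != current:
            return False
        current = out_type
    return current == target_type


path = find_logical_path("audio/mp3", "text/html")
# path == ["mp3_decode", "speech_to_text", "text_to_html"]
```

A real planner would also weigh cost and node affinity when choosing among several type-correct paths, which is where the query-optimization flavor comes in.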
8. Base Implementation
[Figure: Multispace cluster. Surviving labels: iSpace (x4); SAN]
- iSpace: the building block of a Base
  - receptive execution environment
  - intra-Base primitives (stub generation, consistent persistent data repository access, etc.)
- Multispace: cluster-wide naming and resource mgmt
9. iSpace Execution Environment
[Figure: iSpace execution environment. Surviving labels: Untrusted Services; Loader; Trusted Services; Security Mgr; Ninja RMI; JVM persistent store APIs; iSpace]
10. Multispace
[Figure: Multispace node. Surviving labels: Multispace services; Multispace Loader; iSpace]
11. Multispace
- Multicast soft-state beacons to distribute Multispace state
  - beacons contain list of service instances on each node, and an RMI stub for each service instance
[Figure: Multispace node. Surviving labels: Multispace services; Multispace Loader; iSpace]
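A minimal sketch of the soft-state idea behind the beacons: each beacon refreshes a node's entry, and entries that are not refreshed within a time-to-live simply expire. The class name, the TTL value, and the explicit `now` argument (used instead of a real clock so the example is deterministic) are assumptions, and actual multicast and RMI stubs are not modeled.

```python
class SoftStateTable:
    """Cluster view built from periodic beacons; stale entries expire."""

    def __init__(self, ttl_seconds=3.0):
        self.ttl = ttl_seconds
        self._entries = {}  # node -> (service names, time last heard)

    def on_beacon(self, node, services, now):
        # A beacon replaces whatever was known about the node.
        self._entries[node] = (tuple(services), now)

    def expire(self, now):
        """Drop nodes not heard from within the TTL; return their names."""
        stale = [node for node, (_, heard) in self._entries.items()
                 if now - heard > self.ttl]
        for node in stale:
            del self._entries[node]
        return stale

    def instances_of(self, service, now):
        """Nodes currently believed to run the given service."""
        self.expire(now)
        return sorted(node for node, (services, _) in self._entries.items()
                      if service in services)


table = SoftStateTable(ttl_seconds=3.0)
table.on_beacon("node1", ["fax", "pager"], now=0.0)
table.on_beacon("node2", ["fax"], now=1.0)
```

No explicit "node left" message is needed: a crashed node stops sending beacons and drops out of the table once its entry ages past the TTL.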
- RMI Redirector Stubs
  - assembled at run time
  - compiled RMI superstub contains all of a service's instances' stubs
  - stub selection policy: fail-over, broadcast, multicast, fork, etc.
  - currently, idempotency and atomicity required of service instances
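The redirector "superstub" can be sketched as an object that holds one stub per service instance and applies a selection policy on each call. This is a plain-Python illustration, not Ninja RMI: `Superstub`, `InstanceDown`, and `EchoInstance` are hypothetical names, and only the fail-over and broadcast policies are shown.

```python
class InstanceDown(Exception):
    """Raised by a stub whose service instance is unreachable."""


class Superstub:
    """Holds the stubs of all instances of one service."""

    def __init__(self, stubs):
        self.stubs = list(stubs)

    def call_failover(self, method, *args):
        # Try instances in order; retrying is only safe because the
        # operations are assumed idempotent, as the slide requires.
        last_error = None
        for stub in self.stubs:
            try:
                return getattr(stub, method)(*args)
            except InstanceDown as err:
                last_error = err
        raise InstanceDown("all instances failed") from last_error

    def call_broadcast(self, method, *args):
        # Invoke every reachable instance and collect the results.
        results = []
        for stub in self.stubs:
            try:
                results.append(getattr(stub, method)(*args))
            except InstanceDown:
                continue
        return results


class EchoInstance:
    """Stand-in service instance used only for this example."""

    def __init__(self, name, up=True):
        self.name = name
        self.up = up

    def echo(self, message):
        if not self.up:
            raise InstanceDown(self.name)
        return f"{self.name}:{message}"


stub = Superstub([EchoInstance("a", up=False), EchoInstance("b")])
reply = stub.call_failover("echo", "hi")  # "b:hi", after instance a fails
```

The caller sees a single stub; which instance answered is hidden, which matches the point that service granularity is the cluster rather than the node.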
12. SDS implementation
- Finished beta, using Java XML parser
- Query Model
  - Queries are well-formed XML documents
  - Known values are encoded as tags within the correct context
  - All other tags are assumed to be wildcards
- Features
  - Supports searches on tag values and attribute values
  - Supports range queries on String, Integer, and Float tags
  - XML documents inserted with timestamp for easy updates
  - Optional cleaning mechanism can refresh database asynchronously
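The query model above (known values as tags in context, everything else a wildcard) can be approximated with a recursive subset match. The sketch below uses Python's standard `xml.etree.ElementTree` rather than the Java parser the slide mentions, and the `<service>` schema is made up for illustration; range queries and timestamps are not modeled.

```python
import xml.etree.ElementTree as ET


def matches(query, doc):
    """True if every tag, attribute, and text value in query appears in doc
    in the same context; anything the query omits acts as a wildcard."""
    if query.tag != doc.tag:
        return False
    for key, value in query.attrib.items():
        if doc.attrib.get(key) != value:
            return False
    wanted_text = (query.text or "").strip()
    if wanted_text and wanted_text != (doc.text or "").strip():
        return False
    # Each child of the query must match some child of the document.
    return all(any(matches(q_child, d_child) for d_child in doc)
               for q_child in query)


def search(query_xml, documents):
    query = ET.fromstring(query_xml)
    return [d for d in documents if matches(query, ET.fromstring(d))]


# Hypothetical service descriptions.
DOCS = [
    "<service type='printer'><location>Soda Hall</location><color>yes</color></service>",
    "<service type='printer'><location>Cory Hall</location><color>no</color></service>",
    "<service type='fax'><location>Soda Hall</location></service>",
]

color_printers = search("<service type='printer'><color>yes</color></service>", DOCS)
```

The query names only `type` and `<color>`, so `<location>` is left unconstrained and any location matches.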
13. Four key task handler design patterns
- Wrap
- Pipeline
- Replicate
- Combine
14. Wrap
- Take arbitrary piece of code
  - place queue in front
  - encapsulate with bounded thread pool (T < T')
- => get robust service with non-blocking interface
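A minimal sketch of the Wrap pattern using Python's standard `queue` and `threading` modules: a bounded queue sits in front of an arbitrary function, a fixed number of worker threads drains it, and callers submit work without blocking. The class name, the callback style, and the default sizes are assumptions made for illustration.

```python
import queue
import threading


class WrappedHandler:
    """Arbitrary function behind a bounded queue and a bounded thread pool."""

    def __init__(self, func, num_threads=4, queue_size=64):
        self._func = func
        self._tasks = queue.Queue(maxsize=queue_size)
        for _ in range(num_threads):
            threading.Thread(target=self._worker, daemon=True).start()

    def submit(self, arg, on_done):
        """Non-blocking: returns False at once if the queue is full."""
        try:
            self._tasks.put_nowait((arg, on_done))
            return True
        except queue.Full:
            return False

    def _worker(self):
        while True:
            arg, on_done = self._tasks.get()
            try:
                try:
                    result = self._func(arg)
                except Exception as err:  # report failures via the callback
                    result = err
                on_done(result)
            finally:
                self._tasks.task_done()

    def drain(self):
        """Block until every accepted task has completed."""
        self._tasks.join()


results = []
squarer = WrappedHandler(lambda x: x * x, num_threads=2, queue_size=8)
for i in range(5):
    squarer.submit(i, results.append)
squarer.drain()
```

Rejecting work when the queue is full, rather than spawning more threads, is what keeps the thread count bounded under overload.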
15. Wrap (thread-per-task server)
- Get robust hybrid task handler with T/L tolerance
- Preserve conventional task sequencing
- Building block for composed services
16. Pipeline
- Decouple stages within task handler across multiple task handlers
- Wrapped blocking call is natural boundary
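The Pipeline pattern can be sketched as stages that each own a thread pool and are connected by queues, so a slow or blocking stage does not tie up the threads of the others. The stage functions `parse` and `fake_write` and the thread counts are hypothetical stand-ins; `fake_write` represents the wrapped blocking call.

```python
import queue
import threading

STOP = object()  # sentinel that shuts a stage down


def start_stage(func, inbox, outbox, num_threads):
    """Start num_threads workers that apply func to items from inbox."""
    def worker():
        while True:
            item = inbox.get()
            if item is STOP:
                inbox.put(STOP)  # let sibling workers see the sentinel too
                break
            outbox.put(func(item))

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    return threads


def parse(text):
    return int(text)


def fake_write(number):
    return number * 2  # stands in for a blocking call such as a disk write


def run_pipeline(items):
    q1, q2, out = queue.Queue(), queue.Queue(), queue.Queue()
    parsers = start_stage(parse, q1, q2, num_threads=2)
    writers = start_stage(fake_write, q2, out, num_threads=1)
    for item in items:
        q1.put(item)
    q1.put(STOP)
    for thread in parsers:
        thread.join()
    q2.put(STOP)  # only after every parser has finished
    for thread in writers:
        thread.join()
    results = []
    while not out.empty():
        results.append(out.get())
    return results
```

Each stage gets its own thread count, so the low-concurrency stage can be capped independently of the rest. Output order is not guaranteed when a stage has more than one thread.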
17. Why Pipeline?
- Functional parallelism across stages
  - when thread blocks in one...
- Functional parallelism across processors
- Functional parallelism across nodes
- Increase locality (cache, VM, TLB, ...) within node
  - tend to perform operation (stage) on convoy of tasks
- Limit number of threads devoted to low concurrency operation
  - ex: file system can only handle 40-50 concurrent write requests, so this limits useful T
  - additional threads can be applied to remainder of stage
18. Replicate
- Scale throughput across nodes
- Provide fault isolation boundary
- Mediate thread-pool bottleneck within node
19. Combine
- Two task handlers share pool and queue
- Common use is before/after wrapped call
- Avoid wasting threads
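A minimal sketch of Combine, assuming the two handlers are the steps before and after a wrapped call: both enqueue onto one shared queue served by one thread pool, instead of each keeping idle threads of its own. `SharedPool` and the `prepare`/`finish` handlers are hypothetical, and error handling is omitted for brevity.

```python
import queue
import threading


class SharedPool:
    """One queue and one set of threads shared by several task handlers."""

    def __init__(self, num_threads=2):
        self._tasks = queue.Queue()
        for _ in range(num_threads):
            threading.Thread(target=self._worker, daemon=True).start()

    def handler(self, func):
        """Return a submit function for func that uses the shared queue."""
        def submit(arg, on_done):
            self._tasks.put((func, arg, on_done))
        return submit

    def _worker(self):
        while True:
            func, arg, on_done = self._tasks.get()
            try:
                on_done(func(arg))
            finally:
                self._tasks.task_done()

    def drain(self):
        self._tasks.join()


pool = SharedPool(num_threads=2)
prepare = pool.handler(lambda s: s.strip())  # step before the wrapped call
finish = pool.handler(lambda s: s.upper())   # step after the wrapped call

results = []
# The completion callback of the first handler feeds the second one.
prepare("  hello ", lambda cleaned: finish(cleaned, results.append))
pool.drain()
# results == ["HELLO"]
```

Because the follow-up task is enqueued before the first task is marked done, `drain()` waits for both steps.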
20. Existing Applications
- Ninja "NOW Jukebox"
  - Harnesses CD-ROM drives on Berkeley Network of Workstations
  - Plays real-time MPEG-3 audio served from 40 CDs' worth of music
- Voice-enabled room control (ICEBERG)
  - Speech-to-text operators control room services (camera, lights, microphone)
  - Eventual integration with GSM cell phones and PDA-based UI
- Parallelisms service
  - Inversion of Yahoo! directory provides related links support
  - Uses distributed hash table; service code only 100 lines' worth
- NinjaFAX
  - Programmable remotely-accessed FAX machine service
  - Send/receive FAXes; authentication used for access control
- Keiretsu: The Ninja Pager Service
  - Provides instant messaging service via Web, one- and two-way pagers, PDAs, etc.
- Automatic Java Interface to HTML forms generation
- Computational economy support (Millennium project)