Scaleable Servers - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: Scaleable Servers


1
Scaleable Servers
  • Jim Gray
  • Microsoft
  • Gray@Microsoft.com
  • http://www.research.Microsoft.com/Gray

2
Thesis: Scaleable Servers
  • Scaleable Servers
  • Commodity hardware allows new applications
  • New applications need huge servers
  • Clients and servers are built of the same stuff
  • Commodity software and
  • Commodity hardware
  • Servers should be able to
  • Scale up (grow node by adding CPUs, disks,
    networks)
  • Scale out (grow by adding nodes)
  • Scale down (can start small)
  • Key software technologies
  • Objects, Transactions, Clusters, Parallelism

3
1987: 256 tps Benchmark
  • 14 M$ computer (Tandem)
  • A dozen people
  • False floor, 2 rooms of machines

[Figure: machine room with a 32-node processor array, a 40 GB disk array (80 drives), and a simulation of 25,600 clients. Staff: manager, auditor, admin expert, hardware experts, network expert, performance expert, OS expert, DB expert.]
4
1988: DB2 + CICS Mainframe, 65 tps
  • IBM 4391
  • Simulated network of 800 clients
  • 2 M$ computer
  • Staff of 6 to do benchmark

[Figure: 2 x 3725 network controllers, refrigerator-sized CPU, 16 GB disk farm (4 x 8 x .5 GB).]
5
1997: 10 years later, 1 Person and 1 box = 1250 tps
  • 1 Breadbox ~ 5x 1987 machine room
  • 23 GB is hand-held
  • One person does all the work
  • Cost/tps is 1,000x less: 25 micro dollars per
    transaction

[Figure: one box with 4 x 200 MHz CPUs, 1/2 GB DRAM, 12 x 4 GB disks, and 3 x 7 x 4 GB disk arrays. One person is the hardware, OS, net, DB, and app expert.]
6
What Happened?
  • Moore's law: Things get 4x better every 3
    years (applies to computers, storage, and
    networks)
  • New Economics: Commodity
      class           price ($/mips)   software (k$/year)
      mainframe       10,000           100
      minicomputer    100              10
      microcomputer   10               1
  • GUI: Human-computer tradeoff, optimize for
    people, not computers

7
Billions Of Clients Need Millions Of Servers
  • All clients networked to servers
  • May be nomadic or on-demand
  • Fast clients want faster servers
  • Servers provide
  • Shared Data
  • Control
  • Coordination
  • Communication

[Figure: mobile clients and fixed clients connect to servers (server, super server).]
8
Thesis: Many little beat few big
[Figure: price tiers ($1 million, $100 K, $10 K) for mainframe, mini, micro, nano, and pico processors, against memory sizes (1 MB, 100 MB, 10 GB, 1 TB, 100 TB) and disk form factors (14", 9", 5.25", 3.5", 2.5", 1.8"). Annotations: 10 pico-second RAM; 1 M SPECmarks, 1 TFLOP; 10^6 clocks to bulk RAM; event horizon on chip; VM reincarnated; multi-program cache, on-chip SMP.]
  • Smoking, hairy golf ball
  • How to connect the many little parts?
  • How to program the many little parts?
  • Fault tolerance?

9
Future Super Server: 4T Machine
  • Array of 1,000 4B machines
  • 1 bps processors
  • 1 BB DRAM
  • 10 BB disks
  • 1 Bbps comm lines
  • 1 TB tape robot
  • A few megabucks
  • Challenge
  • Manageability
  • Programmability
  • Security
  • Availability
  • Scaleability
  • Affordability
  • As easy as a single system

Cyber Brick: a 4B machine
Future servers are CLUSTERS of processors and
discs. Distributed database techniques make
clusters work.
10
The Hardware Is In Place... And then a miracle
occurs?
  • SNAP: scaleable network and platforms
  • Commodity distributed OS built on
  • Commodity platforms
  • Commodity network interconnect
  • Enables parallel applications

11
Thesis: Scaleable Servers
  • Scaleable Servers
  • Commodity hardware allows new applications
  • New applications need huge servers
  • Clients and servers are built of the same stuff
  • Commodity software and
  • Commodity hardware
  • Servers should be able to
  • Scale up (grow node by adding CPUs, disks,
    networks)
  • Scale out (grow by adding nodes)
  • Scale down (can start small)
  • Key software technologies
  • Objects, Transactions, Clusters, Parallelism

12
Scaleable Servers: BOTH SMP And Cluster
Grow up with SMP: 4xP6 is now standard. Grow out
with cluster: cluster has inexpensive parts.
[Figure: SMP super server, departmental server, personal system; cluster of PCs.]
13
SMPs Have Advantages
  • Single system image: easier to manage, easier to
    program; threads in shared memory, disk, Net
  • 4x SMP is commodity
  • Software capable of 16x
  • Problems
  • >4 not commodity
  • Scale-down problem (starter systems expensive)
  • There is a BIGGEST one

[Figure: SMP super server, departmental server, personal system.]
14
TPC-C Web-Based Benchmarks
  • Client is a Web browser (9,200 of them!)
  • Submits
  • Order
  • Invoice
  • Query to server via Web page interface
  • Web server translates to DB
  • SQL does DB work
  • Net
  • easy to implement
  • performance is GREAT!

[Figure: Web browsers reach the IIS Web server over HTTP; the Web server reaches SQL over ODBC.]
15
TPC-C Shows How Far SMPs have come
  • Performance is amazing
  • 2,000 users is the min!
  • 30,000 users on a 4x12 Alpha cluster (Oracle)
  • Peak Performance: 30,390 tpmC @ $305/tpmC
    (Oracle/DEC)
  • Best Price/Perf: 7,693 tpmC @ $43/tpmC (MS
    SQL/Dell)
  • Graphs show UNIX high price and diseconomy of
    scaleup

16
TPC C SMP Performance
  • SMPs do offer speedup
  • but 4x P6 is better than some 18x MIPSco

17
The TPC-C Revolution Shows How Far NT and SQL
Server have Come
  • Economy of scale on Windows NT
  • Recent Microsoft SQL Server benchmarks are
    Web-based

[Figure: "tpmC and $/tpmC". Price ($/tpmC, 0 to 250, lower is better) against performance (tpmC, 0 to 8,000) for DB2, Informix, Microsoft, Oracle, and Sybase. Annotation: MS SQL Server, economy of scale and low price.]
18
What Happens To Prices?
  • No expensive UNIX front end ($20/tpmC)
  • No expensive TP monitor software ($10/tpmC)
  • => $65/tpmC

19
Building the Largest NT Node
  • Build a 1 TB SQL Server database
  • Show off NT and SQL Server Scaleability
  • Stress test the product
  • Demo it on the Internet
  • WWW accessible by anyone
  • So data must be
  • 1 TB
  • Unencumbered
  • Interesting to everyone everywhere
  • AND not offensive to anyone anywhere

20
What's a TeraByte?
  • 1 Terabyte
  • 1,000,000,000 business letters = 150 miles
    of book shelf
  • 100,000,000 book pages = 15 miles of
    book shelf
  • 50,000,000 FAX images = 7 miles of
    book shelf
  • 10,000,000 TV pictures (mpeg) =
    10 days of video
  • 4,000 LandSat images = 16
    earth images (100m)
  • 100,000,000 web pages = 10 copies of
    the web HTML
  • Library of Congress (in ASCII) is 25 TB
  • 1980: $200 million of disc =
    10,000 discs
  • $5 million of tape silo = 10,000 tapes
  • 1997: $200 k of magnetic disc =
    48 discs
  • $30 k nearline tape =
    20 tapes
  • Terror Byte!

21
The Plan
  • DEC Alpha
  • 324 StorageWorks Drives (1.4 TB)
  • 30K BTU, 8 KW, 1.5 metric tons.
  • SQL 7.0
  • USGS data (1 meter)
  • Russian Space data (2 meter)

[Figure: DEC 4100 with 4 x 400 MHz Alpha processors and 4 GB DRAM.]
22
Image Data Sources
23
DOQ coverage of the US
  • 1 Meter images of many places
  • Problems
  • most of data not yet published
  • interesting places missing (LA, Portland, SD,
    Anchorage, ...)
  • Loaded the published 130 GB.
  • CRDA for the unpublished 3 TB

24
SPIN-2 Coverage
  • The rest of the world
  • The US Government can't help, but....
  • The Russian Space Agency is eager to cooperate.
  • 2 Meter Geo Rectified imagery of anywhere
  • More data coming, Earth has 500 Tera-square-meters
  • => 30 TeraBytes of Land at 2x2 Meter
  • => we need 3% of the land (Urban World: the red
    stuff)
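
The sizing on this slide can be checked with a little arithmetic. The following is a minimal sketch, not the project's own calculation: it assumes that about 30% of the Earth's surface is land and that one 2 m x 2 m pixel takes one uncompressed byte. Both values are assumptions introduced here for illustration.

```python
# Rough sizing of 2-meter imagery, following the slide's figures.
EARTH_SURFACE_M2 = 500e12   # the slide's "500 Tera" square meters
LAND_FRACTION = 0.30        # assumption: about 30% of the surface is land
PIXEL_AREA_M2 = 2 * 2       # one pixel covers 2 m x 2 m
BYTES_PER_PIXEL = 1         # assumption: 8-bit grayscale, uncompressed
URBAN_FRACTION = 0.03       # the slide's 3% of the land


def imagery_bytes(area_m2: float) -> float:
    """Bytes needed to cover area_m2 at the assumed resolution."""
    return area_m2 / PIXEL_AREA_M2 * BYTES_PER_PIXEL


land_bytes = imagery_bytes(EARTH_SURFACE_M2 * LAND_FRACTION)
urban_bytes = land_bytes * URBAN_FRACTION

print(f"all land: {land_bytes / 1e12:.1f} TB")
print(f"urban 3%: {urban_bytes / 1e12:.2f} TB")
```

Under these assumptions all land comes to a few tens of terabytes, the same order as the slide's 30 TB, and the urban 3% is close to the 1 TB target of the database.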

25
Demo Interface
26
Grow UP and OUT
  • Cluster
  • a collection of nodes
  • as easy to program and manage as a single node

[Figure: grow up to a 1 Terabyte DB and out to 1 billion transactions per day.]
27
Clusters Have Advantages
  • Clients and servers made from the same stuff
  • Inexpensive
  • Built with commodity components
  • Fault tolerance
  • Spare modules mask failures
  • Modular growth
  • Grow by adding small modules
  • Unlimited growth no biggest one

28
Billion Transactions per Day Project
  • Built a 45-node Windows NT Cluster (with help
    from Intel & Compaq), > 900 disks
  • All off-the-shelf parts
  • Using SQL Server & DTC distributed transactions
  • DebitCredit Transaction
  • Each node has 1/20th of the DB
  • Each node does 1/20th of the work
  • 15% of the transactions are distributed
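
The partitioning described on this slide can be sketched as follows. This is a hypothetical illustration, not the project's code: the modulo placement rule, the names `home_node` and `is_distributed`, and the branch numbering are all assumptions. It shows how a transaction becomes distributed when the account lives on a different node than the teller's branch.

```python
import random

NODES = 20  # the slide: each node holds 1/20th of the database


def home_node(branch_id: int) -> int:
    """Map a branch (and its accounts) to the node that stores it."""
    return branch_id % NODES


def is_distributed(teller_branch: int, account_branch: int) -> bool:
    """A transaction spans two nodes when the account lives elsewhere."""
    return home_node(teller_branch) != home_node(account_branch)


def simulate(n: int, remote_fraction: float = 0.15, seed: int = 0) -> float:
    """Return the observed fraction of distributed transactions."""
    rng = random.Random(seed)
    distributed = 0
    for _ in range(n):
        teller_branch = rng.randrange(1000)
        if rng.random() < remote_fraction:
            # An offset of 1..NODES-1 always lands on a different node.
            account_branch = teller_branch + rng.randrange(1, NODES)
        else:
            account_branch = teller_branch
        distributed += is_distributed(teller_branch, account_branch)
    return distributed / n
```

Calling `simulate(20000)` gives a fraction near 0.15, matching the slide's 15% of transactions that need a distributed commit.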

29
How Much Is 1 Billion Transactions Per Day?
  • 1 Btpd = 11,574 tps (transactions per second)
    ~ 700,000 tpm (transactions/minute)
  • AT&T
  • 185 million calls (peak day worldwide)
  • Visa ~20 M tpd
  • 400 M customers
  • 250,000 ATMs worldwide
  • 7 billion transactions / year (card + cheque) in
    1994
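
The conversion from a billion transactions per day to per-second and per-minute rates is plain arithmetic. A short sketch (the function name `per_second` is chosen here for illustration):

```python
SECONDS_PER_DAY = 24 * 60 * 60


def per_second(per_day: float) -> float:
    """Convert a daily transaction count to an average rate per second."""
    return per_day / SECONDS_PER_DAY


tps = per_second(1_000_000_000)
tpm = tps * 60
print(f"{tps:,.0f} tps, {tpm:,.0f} tpm")
```

The result is about 11,574 tps and about 694,000 tpm, which the slide rounds to 700,000 tpm.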

[Figure: bar chart of millions of transactions per day (Mtpd), log scale from 0.1 to 1,000, for AT&T, Visa, BofA, NYSE, and 1 Btpd.]
30
Billion Transactions Per Day Hardware
  • 45 nodes (Compaq Proliant)
  • Clustered with 100 Mbps Switched Ethernet
  • 140 cpu, 13 GB, 3 TB.

31
1.2 B tpd
  • 1 B tpd ran for 24 hrs.
  • Sized for 30 days
  • Linear growth
  • 5 micro-dollars per transaction
  • Out-of-the-box software
  • Off-the-shelf hardware
  • AMAZING!

32
Other Stunts
  • 100 M Web Hits/day on one server
  • (1,300 hits/sec, Web Mark HTML server)
  • Email server (exchange)
  • 50 GB database (up from 16GB, limit now 16TB)
  • 50 k POP3 users (1.5 M msg/day)
  • 64-bit addressing SQL Server
  • SAP Failover
  • Theme
  • conventional stuff is easy

33
Thesis Scaleable Servers
  • Scaleable Servers
  • Commodity hardware allows new applications
  • New applications need huge servers
  • Clients and servers are built of the same stuff
  • Commodity software and
  • Commodity hardware
  • Servers should be able to
  • Scale up (grow node by adding CPUs, disks,
    networks)
  • Scale out (grow by adding nodes)
  • Scale down (can start small)
  • Key software technologies
  • Objects, Transactions, Clusters, Parallelism

34
Parallelism: The OTHER aspect of clusters
  • Clusters of machines allow two kinds of
    parallelism
  • Many little jobs: online transaction processing
  • TPC-A, B, C
  • A few big jobs: data search and analysis
  • TPC-D, DSS, OLAP
  • Both give automatic parallelism

35
Kinds of Parallel Execution
  • Pipeline: any sequential program feeds its
    output to any other sequential program
  • Partition: outputs split N ways, inputs merge
    M ways, with any sequential program running
    on each partition
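
The two kinds of parallel execution, pipeline and partition, can be sketched with ordinary sequential functions. This is an illustrative sketch only: the function names and the modulo split are assumptions, and Python generators interleave the stages rather than running them on separate processors, which a real system would do.

```python
from typing import Iterable, Iterator, List


def scan(rows: Iterable[int]) -> Iterator[int]:
    """First sequential program: produce rows one at a time."""
    for row in rows:
        yield row


def square(rows: Iterable[int]) -> Iterator[int]:
    """Second sequential program: consume rows as they arrive."""
    for row in rows:
        yield row * row


def pipeline(rows: Iterable[int]) -> List[int]:
    """Pipeline: the output of one program feeds the next."""
    return list(square(scan(rows)))


def split(rows: Iterable[int], n: int) -> List[List[int]]:
    """Partition: split one stream N ways by key."""
    parts: List[List[int]] = [[] for _ in range(n)]
    for row in rows:
        parts[row % n].append(row)
    return parts


def partitioned(rows: Iterable[int], n: int) -> List[int]:
    """Run the same sequential program on each partition, then merge."""
    results = [pipeline(part) for part in split(rows, n)]
    return sorted(x for part in results for x in part)
```

Neither `scan` nor `square` knows whether it runs in a pipeline or on one partition of many; that is the sense in which any sequential program can be used.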
36
Data Rivers: Split + Merge Streams
  • N producers, M consumers, N x M data streams
  • Producers add records to the river, consumers
    consume records from the river
  • Purely sequential programming
  • River does flow control and buffering
  • River does partition and merge of data records
  • River = Split/Merge in Gamma = Exchange
    operator in Volcano
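
A river can be approximated with bounded queues and threads. The sketch below is an assumption-laden illustration, not the Gamma or Volcano implementation: the `River` class, the `DONE` sentinel, and the modulo routing are hypothetical names and choices made here. Bounded queues provide the flow control, and routing by key does the split and merge.

```python
import queue
import threading
from typing import List

DONE = object()  # sentinel marking the end of a stream


class River:
    """N producers put records in; M consumers each read one partition."""

    def __init__(self, consumers: int, capacity: int = 16) -> None:
        # Bounded queues give flow control: a fast producer blocks on put().
        self.partitions = [queue.Queue(maxsize=capacity) for _ in range(consumers)]

    def put(self, record: int) -> None:
        # Partition step: route each record by key to one consumer.
        self.partitions[record % len(self.partitions)].put(record)

    def close(self) -> None:
        for q in self.partitions:
            q.put(DONE)


def run(n_producers: int, m_consumers: int, per_producer: int) -> List[List[int]]:
    river = River(m_consumers)
    outputs: List[List[int]] = [[] for _ in range(m_consumers)]

    def producer(pid: int) -> None:
        # Purely sequential code: just add records to the river.
        for i in range(per_producer):
            river.put(pid * per_producer + i)

    def consumer(cid: int) -> None:
        # Purely sequential code: take records until the stream ends.
        while True:
            record = river.partitions[cid].get()
            if record is DONE:
                break
            outputs[cid].append(record)

    consumers = [threading.Thread(target=consumer, args=(c,)) for c in range(m_consumers)]
    producers = [threading.Thread(target=producer, args=(p,)) for p in range(n_producers)]
    for t in consumers + producers:
        t.start()
    for t in producers:
        t.join()
    river.close()  # all producers finished, so end every stream
    for t in consumers:
        t.join()
    return outputs
```

For example, `run(3, 2, 10)` sends 30 records from three producers to two consumers, each of which sees only the records for its own partition.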
37
Partitioned Execution
Spreads computation and IO among processors

Partitioned data gives
NATURAL parallelism
38
N x M way Parallelism
N inputs, M outputs, no bottlenecks. Partitioned
data. Partitioned and pipelined data flows.
39
Clusters (Plumbing)
  • Single system image
  • naming
  • protection/security
  • management/load balance
  • Fault Tolerance
  • Wolfpack
  • Hot Pluggable hardware & Software

40
Windows NT clusters
  • Key goals
  • Easy to install, manage, program
  • Reliable: better than a single node
  • Scaleable: added parts add power
  • Microsoft & 60 vendors defining NT clusters
  • Almost all big hardware and software vendors
    involved
  • No special hardware needed - but it may help
  • Enables
  • Commodity fault-tolerance
  • Commodity parallelism (data mining, virtual
    reality)
  • Also great for workgroups!
  • Initial: two-node failover
  • Beta testing since December '96
  • SAP, Microsoft, Oracle giving demos.
  • File, print, Internet, mail, DB, other services
  • Easy to manage
  • Each node can be 4x (or more) SMP
  • Next (NT5): Wolfpack is modest size cluster
  • About 16 nodes (so 64 to 128 CPUs)
  • No hard limit, algorithms designed to go further

41
SQL Failover Using NT Clusters
  • Each server owns half the database
  • When one fails
  • The other server takes over the shared disks
  • Recovers the database and serves it

42
So, What's New?
  • When slices cost $50k, you buy 10 or 20.
  • When slices cost $5k, you buy 100 or 200.
  • Manageability, programmability, usability become
    key issues (total cost of ownership).
  • PCs are MUCH easier to use and program

[Figure: two cycles. MPP vicious cycle: no customers! CP/commodity virtuous cycle linking standard platform, apps, and customers: standards allow progress and investment protection.]
43
Thesis: Scaleable Servers
  • Scaleable Servers
  • Commodity hardware allows new applications
  • New applications need huge servers
  • Clients and servers are built of the same stuff
  • Commodity software and
  • Commodity hardware
  • Servers should be able to
  • Scale up (grow node by adding CPUs, disks,
    networks)
  • Scale out (grow by adding nodes)
  • Scale down (can start small)
  • Key software technologies
  • Objects, Transactions, Clusters, Parallelism

44
The BIG Picture: Components and transactions
  • Software modules are objects
  • Object Request Broker (a.k.a. Transaction
    Processing Monitor) connects objects (clients to
    servers)
  • Standard interfaces allow software plug-ins
  • Transaction ties execution of a job into an
    atomic unit: all-or-nothing, durable, isolated

[Figure: Object Request Broker]
45
ActiveX and COM
  • COM is the Microsoft model, the engine inside OLE. ALL
    Microsoft software is based on COM (ActiveX)
  • CORBA + OpenDoc is equivalent
  • Heated debate over which is best
  • Both share same key goals
  • Encapsulation: hide implementation
  • Polymorphism: generic operations, key to GUI and
    reuse
  • Versioning: allow upgrades
  • Transparency: local/remote
  • Security: invocation can be remote
  • Shrink-wrap: minimal inheritance
  • Automation: easy
  • COM now managed by the Open Group

46
Linking And Embedding: Objects are data
modules, transactions are execution modules
  • Link: pointer to object somewhere else
  • Think URL in Internet
  • Embed: bytes are here
  • Objects may be active: can call back to subscribers

47
Commodity Software Components: Inexpensive OS,
DBMS, and plug-ins
  • Recent TPC-C prices
  • Oracle on DEC UNIX: 30.4 ktpmC @ $305/tpmC
  • Informix on DEC UNIX: 13.6 ktpmC @ $277/tpmC
  • DB2 on Solaris: 6.4 ktpmC @ $200/tpmC
  • SQL Server on Compaq, Windows NT: 7.3 ktpmC @
    $65/tpmC (using Web, no TP monitor!)
  • Oracle on Windows NT: 3.1 ktpmC @ $198/tpmC
  • Net: Open solutions can do even biggest jobs,
    thousands of online users per node of cluster
  • ActiveX, VBX, and Java plug-ins
  • Spreadsheets, GeoQuery, FAX, voice, image
    libraries, commodity component market

48
Objects Meet Databases: The basis for universal
data servers, access, integration
  • Object-oriented (COM oriented) programming
    interface to data
  • Breaks DBMS into components
  • Anything can be a data source
  • Optimization/navigation on top of other data
    sources
  • A way to componentize a DBMS
  • Makes an RDBMS an O-RDBMS (assumes optimizer
    understands objects)

[Figure: DBMS engine]
49
The Pattern: Three Tier Computing
  • Clients do presentation, gather input
  • Clients do some workflow (Xscript)
  • Clients send high-level requests to ORB (Object
    Request Broker)
  • ORB dispatches workflows and business objects:
    proxies for client, orchestrate flows and queues
  • Server-side workflow scripts call on distributed
    business objects to execute task

[Figure: tiers from top to bottom: Presentation, workflow, Business Objects, Database.]
50
The Three Tiers
Object Data server.
51
Why Did Everyone Go To Three-Tier?
  • Manageability
  • Business rules must be with data
  • Middleware operations tools
  • Performance (scaleability)
  • Server resources are precious
  • ORB dispatches requests to server pools
  • Technology & Physics
  • Put UI processing near user
  • Put shared data processing near shared data

[Figure: tiers from top to bottom: Presentation, workflow, Business Objects, Database.]
52
Why Put Business Objects at Server?
53
What Middleware Does: ORB, TP Monitor, Workflow
Mgr, Web Server
  • Registers transaction programs: workflow and
    business objects (DLLs)
  • Pre-allocates server pools
  • Provides server execution environment
  • Dynamically checks authority (request-level
    security)
  • Does parameter binding
  • Dispatches requests to servers
  • parameter binding
  • load balancing
  • Provides Queues
  • Operator interface

54
Server Side Objects: Easy Server-Side Execution
  • Give simple execution environment
  • Object gets
  • start
  • invoke
  • shutdown
  • Everything else is automatic
  • Drag & Drop Business Objects

[Figure: a server. The environment supplies the network receiver, queue management, connections, context, security, configuration, thread pool, synchronization, and shared data around the service logic.]
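
The start/invoke/shutdown contract on this slide can be sketched as a small class plus a host. Everything here is hypothetical: `BusinessObject`, `Host`, and the `deposit` method are names invented for illustration, and the host stands in for the middleware that would also handle threads, security, queues, and connections.

```python
from typing import Any, Dict


class BusinessObject:
    """What the object author writes: three entry points and nothing else."""

    def start(self) -> None:
        self.balance = 0

    def invoke(self, method: str, *args: Any) -> Any:
        if method == "deposit":
            self.balance += args[0]
            return self.balance
        raise ValueError(f"unknown method: {method}")

    def shutdown(self) -> None:
        self.balance = None


class Host:
    """Stand-in for the server environment that does everything else."""

    def __init__(self) -> None:
        self.objects: Dict[str, BusinessObject] = {}

    def register(self, name: str, obj: BusinessObject) -> None:
        obj.start()
        self.objects[name] = obj

    def dispatch(self, name: str, method: str, *args: Any) -> Any:
        return self.objects[name].invoke(method, *args)

    def stop(self) -> None:
        for obj in self.objects.values():
            obj.shutdown()
        self.objects.clear()
```

The object contains only service logic; registering it with a host is the code-level analogue of dropping a business object onto a server.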
55
A new programming paradigm
  • Develop object on the desktop
  • Better yet download them from the Net
  • Script work flows as method invocations
  • All on desktop
  • Then, move work flows and objects to server(s)
  • Gives
  • desktop development
  • three-tier deployment
  • Software Cyberbricks

56
Transactions Coordinate Components (ACID)
  • Transaction properties
  • Atomic: all or nothing
  • Consistent: old and new values
  • Isolated: automatic locking or versioning
  • Durable: once committed, effects survive
  • Transactions are built into modern OSs
  • MVS/TM, Tandem TMF, VMS DEC-DTM, NT-DTC

57
Transactions & Objects
  • Application requests transaction identifier (XID)
  • XID flows with method invocations
  • Object Managers join (enlist) in transaction
  • Distributed Transaction Manager coordinates
    commit/abort
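
The flow of a transaction identifier through enlistment and commit can be sketched as below. This is a simplified illustration that assumes a two-phase commit protocol, which the slide does not name; the classes `ResourceManager` and `TransactionManager` and their methods are hypothetical and do not model the DTC API.

```python
import itertools
from typing import Dict, List


class ResourceManager:
    """An object manager that can vote on and apply a transaction outcome."""

    def __init__(self, name: str, healthy: bool = True) -> None:
        self.name = name
        self.healthy = healthy
        self.state: Dict[int, str] = {}

    def prepare(self, xid: int) -> bool:
        self.state[xid] = "prepared" if self.healthy else "aborted"
        return self.healthy

    def commit(self, xid: int) -> None:
        self.state[xid] = "committed"

    def abort(self, xid: int) -> None:
        self.state[xid] = "aborted"


class TransactionManager:
    """Hands out XIDs, tracks enlisted managers, coordinates the outcome."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self.enlisted: Dict[int, List[ResourceManager]] = {}

    def begin(self) -> int:
        xid = next(self._ids)
        self.enlisted[xid] = []
        return xid

    def enlist(self, xid: int, rm: ResourceManager) -> None:
        if rm not in self.enlisted[xid]:
            self.enlisted[xid].append(rm)

    def commit(self, xid: int) -> bool:
        rms = self.enlisted.pop(xid)
        ok = all([rm.prepare(xid) for rm in rms])  # phase 1: everyone votes
        for rm in rms:                             # phase 2: apply the decision
            (rm.commit if ok else rm.abort)(xid)
        return ok
```

One unhealthy manager is enough to abort the transaction for every enlisted manager, which gives the all-or-nothing outcome across objects.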

58
Transactions Coordinate Components (ACID)
  • Programmer's view: bracket a collection of
    actions
  • A simple failure model
  • Only two outcomes

    Begin() action action action action Commit()    -> Success!
    Begin() action action action Rollback()         -> Failure!
    Begin() action action action Fail! Rollback()   -> Failure!
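
The bracket-and-two-outcomes model can be tried with Python's built-in `sqlite3` module, used here only as a convenient stand-in for the transaction systems named in the talk. The table layout and the `transfer` function are assumptions made for this sketch. Using the connection as a context manager commits if the block finishes and rolls back if an exception escapes.

```python
import sqlite3


def transfer(conn: sqlite3.Connection, src: int, dst: int, amount: int) -> bool:
    """Move money between accounts as one atomic unit."""
    try:
        with conn:  # commit on success, rollback if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src))
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst))
            (balance,) = conn.execute(
                "SELECT balance FROM account WHERE id = ?", (src,)).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 0)])
    conn.commit()
    ok = transfer(conn, 1, 2, 60)       # succeeds and commits
    failed = transfer(conn, 1, 2, 60)   # would overdraw, so it rolls back
    balances = [row[0] for row in conn.execute(
        "SELECT balance FROM account ORDER BY id")]
    return ok, failed, balances
```

The second transfer performs both updates before it fails, yet neither survives: the balances stay at 40 and 60, which is the "only two outcomes" property.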
59
Distributed Transactions Enable Huge Throughput
  • Each node capable of 7 KtpmC (7,000 active
    users!)
  • Can add nodes to cluster (to support 100,000
    users)
  • Transactions coordinate nodes
  • ORB / TP monitor spreads work among nodes

60
Distributed Transactions Enable Huge DBs
  • Distributed database technology spreads data
    among nodes
  • Transaction processing technology manages nodes

61
Thesis: Scaleable Servers
  • Scaleable Servers Built from Cyberbricks
  • Allow new applications
  • Servers should be able to
  • Scale up, out, down
  • Key software technologies
  • Clusters (ties the hardware together)
  • Parallelism (uses the independent cpus, stores,
    wires)
  • Objects (software CyberBricks)
  • Transactions (mask errors)

62
Computer Industry Laws (Rules of thumb)
  • Metcalfe's law
  • Moore's first law
  • Bell's computer classes (7 price tiers)
  • Bell's platform evolution
  • Bell's platform economics
  • Bill's law
  • Software economics
  • Grove's law
  • Moore's second law
  • Is info-demand infinite?
  • The death of Grosch's law

63
Metcalfe's Law: Network Utility = Users^2
  • How many connections can it make?
  • 1 user: no utility
  • 100,000 users: a few contacts
  • 1 million users: many on Net
  • 1 billion users: everyone on Net
  • That is why the Internet is so hot
  • Exponential benefit
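
The "how many connections" question has a simple closed form: n users can form n(n-1)/2 distinct pairs, which grows roughly as the square of n. A short sketch (the function name is chosen here for illustration):

```python
def possible_connections(users: int) -> int:
    """Distinct user pairs: n * (n - 1) / 2, which grows like n squared."""
    return users * (users - 1) // 2


for n in (1, 100_000, 1_000_000):
    print(f"{n:>9,} users -> {possible_connections(n):,} possible connections")
```

One user gives zero connections, matching the slide's "no utility"; ten times the users gives about a hundred times the connections.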

64
Moore's First Law
  • XXX doubles every 18 months: 60% increase per
    year
  • Micro processor speeds
  • Chip density
  • Magnetic disk density
  • Communications bandwidth: WAN bandwidth
    approaching LANs
  • Exponential growth
  • The past does not matter
  • 10x here, 10x there, soon you're talking REAL
    change
  • PC costs decline faster than any other platform
  • Volume and learning curves
  • PCs will be the building bricks of all future
    systems
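
Doubling every 18 months and growing about 60% per year are the same statement, and both imply roughly 4x every three years and 100x per decade. A small sketch of the arithmetic (function names are illustrative):

```python
def annual_growth(doubling_months: float) -> float:
    """Yearly growth rate implied by a doubling period in months."""
    return 2 ** (12 / doubling_months) - 1


def growth_factor(doubling_months: float, years: float) -> float:
    """Total growth over a number of years for a given doubling period."""
    return 2 ** (12 * years / doubling_months)


print(f"per year:    {annual_growth(18):.0%}")
print(f"per 3 years: {growth_factor(18, 3):.1f}x")
print(f"per decade:  {growth_factor(18, 10):.0f}x")
```

The yearly figure comes out just under 60%, and the decade figure just over 100x.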

65
Bumps In The Moore's Law Road
  • DRAM
  • 1988: United States anti-dumping
    rules
  • 1993-1995: ? price flat
  • Magnetic disk
  • 1965-1989: 10x/decade
  • 1989-1996: 4x/3 years! 100x/decade

66
Gordon Bell's 1975 VAX Planning Model... He
Didn't Believe It!
System Price = 5 x 3 x .04 x memory size / 1.26^(t-1972) K$
  • 5x: Memory is 20% of cost
  • 3x: DEC markup
  • .04x: $ per byte
  • He didn't believe the projection: $500 machine
  • He couldn't comprehend the implications
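
The planning model multiplies memory cost by a system factor and a markup, then discounts it by 26% per year from 1972. The sketch below is one reading of the slide's formula, not a confirmed reconstruction: it assumes memory size is given in kilobytes so that the result is in thousands of dollars, and the function name is hypothetical.

```python
def vax_price_kusd(memory_kb: float, year: int) -> float:
    """System price in K$ under the 1975 planning model (assumed units)."""
    system_factor = 5             # memory is 20% of cost, so system = 5 x memory
    markup = 3                    # DEC markup
    dollars_per_byte_1972 = 0.04  # memory price in the base year
    decline = 1.26 ** (year - 1972)
    return system_factor * markup * dollars_per_byte_1972 * memory_kb / decline


for year in (1975, 1985, 1997):
    print(year, f"{vax_price_kusd(64, year):.3f} K$")
```

Holding the memory size fixed, the modelled price falls by about half every three years, which is how the projection arrives at machines costing hundreds of dollars.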

67
Gordon Bell's Processing, Memories, And Comm: 100
Years
[Figure: growth over 100 years of processing, primary memory, secondary memory, POTS (bps), and backbone.]
68
Gordon Bell's Seven Price Tiers
  • $10: wrist watch computers
  • $100: pocket/palm computers
  • $1,000: portable computers
  • $10,000: personal computers (desktop)
  • $100,000: departmental computers
    (closet)
  • $1,000,000: site computers (glass house)
  • $10,000,000: regional computers (glass
    castle)

Super server costs more than $100,000. Mainframe
costs more than $1 million. Must be an array
of processors, disks, tapes, comm ports.
69
Bell's Evolution Of Computer Classes
Technology enables two evolutionary paths:
  1. constant performance, decreasing cost
  2. constant price, increasing performance
Annual rates:
  1.26/yr = 2x/3 yrs = 10x/decade; 1/1.26 = .8
  1.6/yr = 4x/3 yrs = 100x/decade; 1/1.6 = .62
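
The two annual rates on this slide, 1.26 and 1.6, can be compounded to check the three-year and decade figures, and inverted to get the yearly price decline at constant performance. A brief sketch (function names are illustrative):

```python
def compound(annual_factor: float, years: int) -> float:
    """Growth after a number of years at a fixed annual factor."""
    return annual_factor ** years


def yearly_price_ratio(annual_factor: float) -> float:
    """Next year's price as a fraction of this year's, at constant performance."""
    return 1 / annual_factor


for rate in (1.26, 1.6):
    print(rate,
          f"3 yrs: {compound(rate, 3):.2f}x",
          f"decade: {compound(rate, 10):.0f}x",
          f"price ratio: {yearly_price_ratio(rate):.2f}")
```

The slide's 2x, 4x, 10x, and 100x are rounded values; the exact decade figure for 1.6 per year is closer to 110x.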
70
Gordon Bell's Platform Economics
  • Traditional computers: custom or semi-custom,
    high-tech and high-touch
  • New computers: high-tech and no-touch

[Figure: price (K$), volume (K), and application price on a log scale from 0.01 to 100,000, by computer type: mainframe, WS, browser.]
71
Software Economics
  • An engineer costs about $150,000/year
  • R&D gets 5% to 15% of budget
  • Need $3 million to $1 million revenue per
    engineer

[Figure: pie charts of revenue split into profit, R&D, SG&A, tax, and product and service (P&S) for Microsoft ($9 billion), Intel ($16 billion), IBM ($72 billion), and Oracle ($3 billion). Microsoft: profit 24%, R&D 16%, SG&A 34%, tax 13%, product and service 13%.]
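
The revenue needed per engineer follows directly from the engineer's cost and the share of budget that goes to R&D. A short sketch of that division (the function name is illustrative):

```python
def revenue_per_engineer(engineer_cost: float, rd_share: float) -> float:
    """Revenue needed so that rd_share of it pays for one engineer."""
    return engineer_cost / rd_share


ENGINEER_COST = 150_000  # dollars per year, from the slide

for share in (0.05, 0.15):
    needed = revenue_per_engineer(ENGINEER_COST, share)
    print(f"R&D at {share:.0%} of budget: ${needed:,.0f} revenue per engineer")
```

At 5% the requirement is $3 million per engineer and at 15% it is $1 million, the two ends of the range on the slide.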
72
Software Economics: Bill's Law
Price = Fixed_Cost / Units + Marginal_Cost
  • Bill Joy's law (Sun): don't write software for
    less than 100,000 platforms @ $10 million
    engineering expense, $1,000 price
  • Bill Gates' law: don't write software for less
    than 1,000,000 platforms @ $10M engineering
    expense, $100 price
  • Examples
  • UNIX versus Windows NT: $3,500 versus $500
  • Oracle versus SQL-Server: $100,000 versus $6,000
  • No spreadsheet or presentation pack on
    UNIX/VMS/...
  • Commoditization of base software and hardware
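
The pricing rule behind Bill's law spreads a fixed engineering cost over the number of units and adds the marginal cost of each copy. The sketch below assumes a $10 million engineering expense in both cases and a marginal cost of zero; the function name is chosen for illustration.

```python
def unit_cost(fixed_cost: float, units: int, marginal_cost: float = 0.0) -> float:
    """Cost per unit: fixed cost spread over all units, plus marginal cost."""
    return fixed_cost / units + marginal_cost


ENGINEERING = 10_000_000  # assumed engineering expense in dollars

joy = unit_cost(ENGINEERING, 100_000)      # Bill Joy's 100,000 platforms
gates = unit_cost(ENGINEERING, 1_000_000)  # Bill Gates' 1,000,000 platforms
print(f"Joy:   ${joy:,.0f} engineering cost per unit")
print(f"Gates: ${gates:,.0f} engineering cost per unit")
```

Ten times the volume cuts the engineering cost per unit from $100 to $10, which is what lets the price drop from $1,000 to $100.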

73
Grove's Law: The New Computer Industry
  • Horizontal integration is new structure
  • Each layer picks best from lower layer
  • Desktop (C/S) market
  • 1991: 50%
  • 1995: 75%

    Function          Example
    Operation         AT&T
    Integration       EDS
    Applications      SAP
    Middleware        Oracle
    Baseware          Microsoft
    Systems           Compaq
    Silicon & Oxide   Intel & Seagate
74
Moore's Second Law
  • The cost of fab lines doubles every generation
    (three years)
  • Money limit: hard to imagine
  • $10-billion line
  • $20-billion line
  • $40-billion line
  • Physical limit
  • Quantum effects at 0.25 micron now; 0.05 micron
    seems hard: 12 years, three generations
  • Lithography: need X-ray below 0.13 micron

75
Constant Dollars Versus Constant Work
  • Constant work
  • One SuperServer can do all the world's
    computations
  • Constant dollars
  • The world spends 10% on information processing
  • Computers are moving from 5% penetration to 50%
  • $300 billion to $3 trillion
  • We have the patent on the byte and algorithm

76
Crossing The Chasm
                    Old technology               New technology
    New market      Hard: product finds          Very hard: no product,
                    customers                    no customers
    Old market      Boring: competitive,         Hard: customers find
                    slow growth                  product