Title: The Porcupine Scalable Mail Server
1. The Porcupine Scalable Mail Server
- Yasushi Saito
- Eric Hoffman
- Brian Bershad
- Hank Levy
- David Becker
- Bertil Folliot
http://porcupine.cs.washington.edu/
University of Washington, Department of Computer Science and Engineering
September 7, 1998
2. Why Is Mail an Interesting Problem?
- Cluster research has focused on web services
- Mail is an example of a write-intensive application
  - disk-bound workload
  - reliability requirements
  - failure recovery
- Mail servers have relied on a brute-force approach to scaling
  - Big-iron file server, RDBMS
3. Goals
- Use networked PCs to build a fast, scalable, and easy-to-manage mail server
  - 1 billion messages/day (100x existing systems)
  - 100 million users (10x existing systems)
  - 1,000 nodes (50x existing systems)
4. Conventional Mail Servers
- SMTP/POP front-end hosts
- Distributed file system for message store
- Dedicated user DB server
[Diagram: SMTP/POP front-end hosts facing the Internet, backed by a dedicated user DB server and NFS servers holding the message store.]
5. Problems of Conventional Architecture
- Hardware expense
  - Dedicated file servers, DBMS
- Management expense
  - Limited fault tolerance
  - Static configuration
- Performance
  - Synchronization based on file-system mechanisms
  - Slow legacy software
6. Porcupine Mail Server
- Symmetric function distribution
  - Distribute the user database and user mailboxes
- Lazy data management
- Self-management
  - Automatic load balancing, membership management (see the sketch below)
- Graceful degradation
  - Cluster remains functional despite any number of failures
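To make the "automatic load balancing" bullet concrete, here is a minimal Python sketch of one plausible placement policy: store an incoming message on the least-loaded node among those already holding a fragment of the user's mailbox. The function name, inputs, and policy are illustrative assumptions, not the exact algorithm described in the talk.

```python
# Illustrative sketch only: a plausible load-balancing choice for where to
# store an incoming message, assuming per-node load estimates are available.
from typing import Dict, List

def choose_storage_node(fragment_nodes: List[str],
                        node_load: Dict[str, float]) -> str:
    """Pick the least-loaded node among those already holding a fragment of
    the user's mailbox; fall back to the least-loaded node overall."""
    candidates = fragment_nodes if fragment_nodes else list(node_load)
    return min(candidates, key=lambda n: node_load[n])

# jim's INBOX has fragments on nodes A and D (as in the user-management
# example later in the talk); D is currently less loaded, so it gets the mail.
print(choose_storage_node(["A", "D"], {"A": 0.9, "B": 0.2, "C": 0.4, "D": 0.5}))
```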
7. Porcupine Architecture
[Diagram: nodes A through Z connected by a local area network. Every node runs the same software stack: SMTP, POP, and IMAP servers, an RPC selector, a cluster membership manager, a hash map (buckets labeled A, B, C, D pointing at managing nodes), a user DB cache, a mailbox spool, and a user DB.]
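To make the symmetric, per-node design easier to picture, here is a minimal Python sketch of the state each node holds, using the labels from the diagram. The dataclass and field names are illustrative, not Porcupine's actual code.

```python
# Minimal sketch of per-node state, mirroring the diagram's labels.
# Names are illustrative only.
from dataclasses import dataclass, field
from typing import Dict, List, Set

@dataclass
class PorcupineNode:
    name: str                                                      # e.g. "A"
    membership: Set[str] = field(default_factory=set)              # maintained by the cluster membership manager
    hash_map: Dict[int, str] = field(default_factory=dict)         # bucket -> node managing those users
    user_db: Dict[str, dict] = field(default_factory=dict)         # user profiles stored on this node
    user_db_cache: Dict[str, dict] = field(default_factory=dict)   # cached entries from other nodes
    mailbox_spool: Dict[str, List[bytes]] = field(default_factory=dict)  # mailbox fragments, keyed by user

# Every node is symmetric: each runs SMTP/POP/IMAP front ends and an RPC
# selector on top of this state, so any node can accept any request.
node_a = PorcupineNode(name="A", membership={"A", "B", "C", "D"})
```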
8. User Management
[Diagram: a "USER jim" request arrives at the SMTP/POP server of one node (Nodes A and C are shown). The receiving node computes hash(jim) = 3 and uses the membership-maintained hash map to locate the node managing jim's user DB entry over the LAN. The user DB records each user's mailbox-fragment locations, e.g. jim: INBOX -> {A, D}, oldmail -> {B}; john: INBOX -> {A, C}; ...; other nodes keep these entries in a user DB cache.]
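A rough Python sketch of the routing step implied by this slide: the receiving node hashes the user name into the shared hash map to find the managing node, then asks that node for the user's mailbox-fragment list. The bucket count, map contents, and MD5-based hash are assumptions for illustration, not Porcupine's actual values.

```python
# Illustrative routing sketch; bucket count, map contents, and the hash
# function are assumptions.
import hashlib

NUM_BUCKETS = 8
# Shared hash map: bucket index -> node managing the users in that bucket.
hash_map = {0: "A", 1: "B", 2: "C", 3: "C", 4: "A", 5: "B", 6: "D", 7: "D"}

def bucket_for(user: str) -> int:
    """Hash a user name to a bucket; every node computes the same value."""
    digest = hashlib.md5(user.encode()).digest()
    return int.from_bytes(digest[:4], "big") % NUM_BUCKETS

def manager_for(user: str) -> str:
    """Node whose user DB holds this user's profile and fragment locations."""
    return hash_map[bucket_for(user)]

# A front end handling "USER jim" would RPC to manager_for("jim") to learn
# that jim's INBOX fragments live on nodes A and D.
print("jim -> bucket", bucket_for("jim"), "-> node", manager_for("jim"))
```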
9. Node Failure and Recovery
- Membership protocol runs
- Hash map reconfigured (see the sketch below)
  - dead node removed, recovered node added
  - load balanced by assigning approximately equal numbers of entries to each node
- Recovered node scans its spool and notifies user DB caches about the spool content
- User DB reconciliation runs optimistically
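The hash-map reconfiguration can be sketched in Python as follows: once the membership protocol settles on the new set of live nodes, every node deterministically recomputes the map so that each node owns roughly the same number of buckets. The round-robin assignment over a sorted membership list is an assumption used for illustration, not the protocol's exact rule.

```python
# Illustrative sketch: rebuild the hash map after a membership change so
# each live node owns roughly the same number of buckets.
from typing import Dict, List

def rebuild_hash_map(live_nodes: List[str], num_buckets: int) -> Dict[int, str]:
    """Assign buckets round-robin over the sorted membership list so every
    node independently computes the same map."""
    nodes = sorted(live_nodes)
    return {b: nodes[b % len(nodes)] for b in range(num_buckets)}

# Before failure: four nodes share eight buckets (two each).
print(rebuild_hash_map(["A", "B", "C", "D"], 8))
# Node C fails: the membership protocol removes it, survivors recompute
# the map, and each now owns ~3 buckets.
print(rebuild_hash_map(["A", "B", "D"], 8))
# C recovers: it rejoins, scans its spool, and the map is rebalanced again.
print(rebuild_hash_map(["A", "B", "C", "D"], 8))
```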
10. Implementation Status
- Linux, pthreads, Flick
- Basic functions implemented (SMTP, POP)
- Lacks frills (mail address rewriting, filtering)
- Robust recovery
- Cluster of up to 15 PCs connected by 100 Mb Ethernet
- Porcupine simulator for larger clusters
11. Status Monitor Screen
[Screenshot: the status monitor shows SMTP, POP, and RPC throughput, the cluster members with their CPU utilization, and a status report.]
12. Performance
- Questions
- How does the system scale?
- How costly is the failure recovery procedure?
- Two scenarios tested
- Steady state
- Node failure
- Platform
- 300 MHz Pentium II, 128 MB memory, 100 Mb Ethernet
- Linux 2.0.32, glibc 2.0.7
13. Scalability
[Graph: SMTP sessions/sec vs. number of nodes.]
14. Failure Recovery
- 8-node cluster: two nodes fail, then later recover
[Graph: SMTP sessions/sec over time (seconds), annotated where the first node fails, the second node fails, and both nodes recover.]
15. Summary and Future Work
- Porcupine is designed to be cheap, fast, scalable, and easy to manage
- System throughput scales
- Large-scale membership, RPC
- Geographically distributed clustering
- Porcupine as a distributed-service workbench
16. Discussion Points
- Distributed membership protocol
  - Cristian's 3/5-phase algorithm vs. two-tiered broadcast algorithm vs. ...
- Hashing algorithm (see the consistent-hashing sketch after this list)
  - Centralized hash computation vs. consistent hashing
- Is a 1000-node cluster really realizable?
- Load balancing
- File system design for a high-throughput mail server
- Other services using the Porcupine infrastructure
- Management interface
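For the centralized-hash-map vs. consistent-hashing point, here is a small Python sketch of the consistent-hashing alternative: nodes and users hash onto the same ring, so a membership change only moves the users that mapped to the affected node rather than forcing the whole map to be recomputed. The class, virtual-node count, and hash choice are purely illustrative; this is not what Porcupine implements.

```python
# Illustrative consistent-hashing sketch for the discussion point above.
import bisect
import hashlib

def _h(key: str) -> int:
    """64-bit hash used to place both nodes and users on the ring."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")

class ConsistentHashRing:
    def __init__(self, nodes, replicas=64):
        # Each node contributes several "virtual" points to smooth the load.
        self._points = sorted((_h(f"{n}#{i}"), n)
                              for n in nodes for i in range(replicas))

    def node_for(self, user: str) -> str:
        """First node clockwise from the user's position on the ring."""
        keys = [h for h, _ in self._points]
        i = bisect.bisect(keys, _h(user)) % len(self._points)
        return self._points[i][1]

ring = ConsistentHashRing(["A", "B", "C", "D"])
print("jim is managed by node", ring.node_for("jim"))
```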