Title: A language-based technique to combine events and threads
1. A language-based technique to combine events and threads
- Peng Li, Steve Zdancewic
- University of Pennsylvania
2. Outline
- Motivation and challenge
- Implementation
- Programming interfaces
- Discussion
3. The C10K problem (http://www.kegel.com/c10k.html)
- The problem: one server must serve 10,000 clients
- Network servers, peer-to-peer systems
- Where is the bottleneck?
- Hardware is powerful enough
- Steady network transfer: 1 Gbps / 10K clients = 100 Kbps per client
- The bottleneck is in the software
- Programming with 10K clients is not a trivial
task!
4. Programming models for C10K: threads vs. events
- The multithreaded approach
- One server thread per client
- Blocking I/O (usually)
- OS and runtime library take care of scheduling
- The event-driven approach
- One server thread serves many clients
- Non-blocking/asynchronous I/O
- The programmer manually interleaves computation for multiple clients
5. The debate over threads vs. events
- "Why threads are a bad idea" (USENIX ATC 1999)
- "Why events are a bad idea" (HotOS 2003)
- Ease of programming: threads win
- Threads provide a good abstraction
- Event-driven programming is difficult
- Programs must be written in continuation-passing style (CPS)
- Performance: events win
- Minimal overhead, asynchronous I/O
- (Cooperative) threads can also be very lightweight
- But that requires substantial engineering under the hood (OS / compiler support)
- Flexibility: events win
- Event-driven systems give programmers more control
- Scheduling and I/O functionality can be tailored to the application's needs
- Thread schedulers are difficult to customize
- They are implemented in separate libraries / kernel address spaces
- Customizing them requires unsafe, low-level hacks
6. The hybrid concurrency model: concepts
- A hybrid approach
- At the high level: threads
- Cheap, cooperative threads
- One cheap thread per client
- Blocking I/O
- Internally
- Cheap threads are represented in CPS
- Thread continuations can be used as event handlers
- The I/O schedulers: events
- Event-driven system
- Non-blocking/asynchronous I/O
7. The hybrid concurrency model: design goals
[Diagram: the user application on top of the operating system]
- A uniform programming environment
- Same language
- Same address space
- Shared data structures
- Shared libraries
- Compiled and linked together
8. Challenge: providing abstractions inside the application
- The hybrid concurrency model
- Abstractions are inside the user application
- Abstraction mechanism
- Programming-language-level abstractions
- Lightweight, transparent to the programmer
- In most multithreading systems
- The thread abstraction is outside the user application
- Abstraction mechanisms
- OS, VM, compiler transformations, etc.
- Heavyweight, opaque to the programmer
[Diagram: the application with threads sits on a thread abstraction provided by the runtime libraries and the OS]
9. Outline
- Motivation and challenge
- Implementation
- Programming interfaces
- Discussion
10. Implementing the unified concurrency model
- "A poor man's concurrency monad", Koen Claessen, JFP 1999
- Monads: provide an embedded language with thread primitives
- Higher-order functions: represent computation internally in CPS
- Lazy data structures: implement inversion of control
11. Basic concept: system calls
- Primitives for the cheap cooperative threads
- Thread control primitives: fork, yield, stop
- I/O primitives: read, write, readiness notification
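A rough sketch of what this system-call interface might look like is below. The names and types are illustrative only; they assume the thread monad M that later slides build in CPS, and an EpollEvent type for readiness events.

```haskell
-- Illustrative interface only; the actual definitions come later in the talk.
sys_fork       :: M () -> M ()             -- create a new cheap thread
sys_yield      :: M ()                     -- voluntarily reschedule
sys_ret        :: M ()                     -- terminate the current thread ("stop")
sys_nbio       :: IO a -> M a              -- perform one non-blocking I/O action (read, write, ...)
sys_epoll_wait :: Fd -> EpollEvent -> M () -- readiness notification: block until the fd is ready
```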
12. Basic concept: trace
- Trace: a tree of system calls
- Generated at run time, can have infinite size
- A system call creates a trace node
- sys_fork creates a branching node SYS_FORK
13. Representing traces in Haskell
- A lazy tree structure
- Potentially infinite size
- Nodes are computed only when needed
- Provides the event abstraction
- Lazy computation provides control inversion
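A minimal sketch of such a trace type is shown below. The constructor names follow the system calls mentioned in the talk; the Fd and EpollEvent details are assumptions made only to keep the example self-contained.

```haskell
import System.Posix.Types (Fd)

-- Assumed helper type for readiness notification (not from the talk).
data EpollEvent = EpollRead | EpollWrite

-- The lazy trace tree: each node is one system call made by a thread.
data Trace
  = SYS_NBIO (IO Trace)                  -- run a non-blocking I/O action, then continue
  | SYS_FORK Trace Trace                 -- branching node: child thread and parent continuation
  | SYS_YIELD Trace                      -- reschedule, then continue with the child node
  | SYS_EPOLL_WAIT Fd EpollEvent Trace   -- wait for fd readiness, then continue
  | SYS_RET                              -- leaf: the thread has stopped
```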
14. How stuff works (conceptually)
- Threads create the trace
- Scheduler (event loops) consumes the trace
- Example trace node from the diagram: SYS_NBIO (write_nb)
15. Summary so far
- Monads: provide an embedded language with thread primitives
- Higher-order functions: represent computation internally in CPS
- Lazy data structures: traces (implement inversion of control)
16. Next
- Monads: provide an embedded language with thread primitives
- Higher-order functions: represent computation internally in CPS
- Lazy data structures: traces (implement inversion of control)
17. What the cheap threads look like
- Nested function calls
- Exception handling
- System calls
- Conditional branches
- Function calls to the I/O library
- Recursion
18. Designing the thread language
- An embedded language in Haskell
- Syntax
- Standard control-flow primitives
- Sequential composition, branches, loops, nested function calls, exceptions
- System calls
- fork, yield, I/O, …
- Semantics
- Thread programs produce traces!
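As an illustration of the kind of thread program this language admits, here is a sketch only: it assumes the syscall interface sketched earlier plus a hypothetical non-blocking read wrapper read_nb :: Fd -> IO Int.

```haskell
-- Ordinary Haskell control flow (conditionals, recursion) mixed with
-- system calls; running this program unfolds into a trace.
echo_loop :: Fd -> M ()
echo_loop fd = do
  n <- sys_nbio (read_nb fd)   -- non-blocking read via the I/O library
  if n == 0
    then sys_ret               -- peer closed the connection: stop this thread
    else do sys_yield          -- cooperatively give other threads a turn
            echo_loop fd       -- recurse to serve the next request
```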
19. Problem: composition of traces
- Naive ideas
- Each system call generates a trace node
- A threaded computation generates a trace
- Technical challenge: how to sequentially combine two traces?
- Example program: do { read_request; write_response }
- It doesn't work out easily!
20. The composable design: Continuation-Passing Style (CPS)
- Continuation-passing style
- A threaded computation of type a is represented by a function of type (a -> Trace) -> Trace
- It takes a function that generates a Trace, and generates a Trace
- The thread language is implemented as a CPS monad
- Each primitive operation (system call) is represented as a CPS computation that generates a trace node
- The bind operation (written >>=) sequentially combines CPS computations
21. The thread abstraction: the CPS monad (hidden from the user)
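A minimal sketch of this hidden machinery, in the spirit of Claessen's concurrency monad and reusing the Trace type sketched earlier; details are illustrative rather than the library's exact code.

```haskell
-- A computation of type a is a function from its continuation to a trace.
newtype M a = M ((a -> Trace) -> Trace)

instance Functor M where
  fmap f (M g) = M (\k -> g (k . f))

instance Applicative M where
  pure x      = M (\k -> k x)
  M f <*> M g = M (\k -> f (\h -> g (k . h)))

instance Monad M where
  -- bind composes two CPS computations sequentially
  M g >>= f   = M (\k -> g (\a -> let M h = f a in h k))

-- Each system call wraps its continuation k in one trace node.
sys_nbio :: IO a -> M a
sys_nbio io = M (\k -> SYS_NBIO (fmap k io))

sys_fork :: M () -> M ()
sys_fork (M g) = M (\k -> SYS_FORK (g (\_ -> SYS_RET)) (k ()))

sys_yield :: M ()
sys_yield = M (\k -> SYS_YIELD (k ()))

sys_ret :: M ()
sys_ret = M (\_ -> SYS_RET)

sys_epoll_wait :: Fd -> EpollEvent -> M ()
sys_epoll_wait fd ev = M (\k -> SYS_EPOLL_WAIT fd ev (k ()))

-- Running a thread means unfolding it into its (lazy) trace.
build_trace :: M a -> Trace
build_trace (M g) = g (\_ -> SYS_RET)
```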
22. Haskell's do-syntax for monads
- A special syntax to simplify programming with monads
- Automatic overloading of operations such as >>
- A monad is an embedded language!
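For example, the two definitions below are equivalent: the compiler rewrites the first into the second, where (>>) is the overloaded sequencing operator of the thread monad sketched above.

```haskell
-- Written with the do-syntax:
fork_two :: M ()
fork_two = do
  sys_fork sys_yield   -- spawn a thread that yields once and finishes
  sys_fork sys_yield   -- spawn another one
  sys_ret              -- and stop

-- What it desugars to: plain sequencing in the CPS monad.
fork_two' :: M ()
fork_two' = sys_fork sys_yield >> (sys_fork sys_yield >> sys_ret)
```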
23. The big picture
- The monad interface: an embedded language with the do-syntax and system calls
- The monad implementation: continuation-passing-style computation
- The side effect of the monad: lazy traces
24. Outline
- Motivation and challenge
- Implementation
- Programming interfaces
- Discussion
25. Now we only care about the interfaces
- Embedded language with the do-syntax and system calls
- Lazy traces for control inversion
26. Programming with threads
- The same programming style as in C/Java
- Wrap low-level I/O system calls in higher-level library interfaces
- Code example: writing a blocking sock_accept call using non-blocking accept calls (sketched below)
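A sketch of what that code example might look like, assuming the syscall sketches above plus a hypothetical non-blocking wrapper accept_nb :: Fd -> IO Fd that returns a negative fd when no connection is pending.

```haskell
-- A blocking-style accept, built from non-blocking primitives.
sock_accept :: Fd -> M Fd
sock_accept server_fd = do
  fd <- sys_nbio (accept_nb server_fd)         -- try a non-blocking accept
  if fd >= 0
    then return fd                             -- got a connection: the "blocking" call returns
    else do sys_epoll_wait server_fd EpollRead -- otherwise wait until the socket is readable
            sock_accept server_fd              -- and retry
```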
27. Programming with events
- The scheduler (main event loop) has an abstract programming interface: traces
- The scheduler's job: traverse the trace tree
- Reading a trace node = running an event handler
- Tree traversal strategy = thread scheduling algorithm
- The scheduler plays the active role of control
- Control inversion is provided by laziness
28. A simple round-robin scheduler
- Fetch a thread
- Run the thread until it makes the next syscall
- Take the child node
- Interpret the syscall
- Throw it back to the ready queue
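A sketch of such a loop, using a Chan of traces as the ready queue and the Trace type sketched earlier; the SYS_EPOLL_WAIT case is handed off to a separate epoll worker and elided here.

```haskell
import Control.Concurrent.Chan (Chan, readChan, writeChan)

round_robin :: Chan Trace -> IO ()
round_robin ready = do
  t <- readChan ready          -- fetch a thread; pattern matching on the lazy node
  case t of                    -- forces it to run until its next syscall
    SYS_NBIO io    -> io >>= writeChan ready            -- run the I/O, requeue the child node
    SYS_YIELD t'   -> writeChan ready t'                -- just requeue the continuation
    SYS_FORK t1 t2 -> writeChan ready t1 >> writeChan ready t2
    SYS_RET        -> return ()                         -- thread finished: drop it
    _              -> return ()                         -- SYS_EPOLL_WAIT etc.: handled elsewhere
  round_robin ready
```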
29. A real event-driven system
- Each event loop (a worker) runs in an OS thread
- Event loops synchronize using queues
- Example configuration
- Some worker threads for CPU-intensive computations and non-blocking I/O (like the round-robin scheduler)
- Some worker threads for executing blocking I/O calls (such as fopen)
- Dedicated worker threads for monitoring epoll/AIO events
30. Event loops for epoll/AIO
- epoll: high-performance select() in Linux
- AIO: high-performance asynchronous file I/O in Linux
- The event loops are really simple
- Wrap the C function epoll_wait() using the Haskell Foreign Function Interface (FFI)
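A heavily simplified sketch of such a dedicated worker; it assumes a hypothetical FFI wrapper wait_ready :: IO [Fd] around epoll_wait(), and a shared table mapping each fd to the trace of the thread blocked on it.

```haskell
import Control.Concurrent.MVar (MVar, modifyMVar)
import Control.Concurrent.Chan (Chan, writeChan)
import qualified Data.Map as Map
import System.Posix.Types (Fd)

type Waiting = MVar (Map.Map Fd Trace)

epoll_worker :: Waiting -> Chan Trace -> IO ()
epoll_worker waiting ready = do
  fds <- wait_ready                 -- block inside epoll_wait() via the FFI (assumed wrapper)
  mapM_ wake fds                    -- wake every thread whose fd became ready
  epoll_worker waiting ready
  where
    wake fd = do
      -- remove the blocked thread's continuation from the table and requeue it
      mt <- modifyMVar waiting (\m -> return (Map.delete fd m, Map.lookup fd m))
      maybe (return ()) (writeChan ready) mt
```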
31. Adding a user-level TCP stack
- Define/interpret the TCP syscalls (22 lines)
- Event loop for incoming packets (7 lines)
- Event loop for timers (9 lines)
32. Outline
- Motivation and challenge
- Implementation
- Programming interfaces
- Discussion
33. How about performance?
- Haskell is a pure, lazy, functional programming language
- It runs slightly slower than C
- It needs garbage collection
- However:
- CPS threads are really cheap and lightweight
- We can take advantage of high-performance, event-driven I/O interfaces (epoll/AIO)
- Mastering continuation-passing in C is not easier than learning Haskell!
34. How lightweight?
- A minimal CPS thread uses only 48 bytes at run time
- No thread-local stack; everything is heap-allocated
- Actual memory usage depends on thread-local state
- 1 GB of RAM can hold about 22 million minimal CPS threads (1 GB / 48 B ≈ 22M)
35. How scalable?
- We compared Haskell CPS threads with C threads using Linux NPTL
- NPTL: Native POSIX Thread Library
- I/O scalability tests
- FIFO pipes
- Disk head scheduling
- CPS threads scale better than NPTL
- They perform like an ideal event-driven system!
36. Disk head scheduling performance
37. FIFO performance with idle threads
38. How can this help system design?
- Use programming-language techniques to provide all the necessary abstractions
[Diagram: layered system design]
- User application: thread scheduler, blocking I/O, TCP (flexible and type-safe; mostly event-driven)
- OS kernel: keep it thin and simple
- Hardware: purely event-driven
39. Conclusion
- With better programming languages, we can get the best of both worlds
- The expressiveness and simplicity of threads
- The scalability and flexibility of event-driven systems
- There are other benefits, too
- Type safety is a big one
- Multiprocessor support is good (see our paper)
- It is easy to implement event-driven OS components directly in user applications