Title: Programming Memory-Constrained Networked Embedded Systems
1 Programming Memory-Constrained Networked Embedded Systems
- Adam Dunkels
- PhD thesis defense
- February 15, 2007
2 Embedded systems
- Things with computers that are not computers themselves
- Refrigerators, toys, industrial robots, ...
- 98% of all microprocessors go into embedded systems
- Embedded systems are everywhere!
- 50% are much smaller than PC microprocessors
- 8-bit microprocessors
- 1024 bytes vs 1073741824 (1 billion) bytes
3 Tiny microprocessors are huge
4 Networked, programming
- What if we could make them talk to each other?
- A wide range of fascinating new applications
- Memory constraints make programming small embedded systems a challenge
- Typical example: 60k ROM, 2k RAM
5 Programming: programming in the small
6 What I've done
- TCP/IP networking for memory-constrained networked embedded systems
- Developed two embedded TCP/IP stacks: lwIP, uIP
- Simplifying event-driven programming for memory-constrained systems
- Protothreads, a novel programming mechanism
- Per-process multi-threading for event-driven systems
- Loadable modules for embedded operating systems
- Developed an embedded operating system with loadable module support: Contiki
7 Results of this thesis
- TCP/IP for embedded systems
- Now possible to use in systems an order of magnitude smaller
- Trade-off: memory for performance
- Protothreads: a novel programming abstraction
- Decrease program complexity
- Very small memory and performance overhead
- Dynamically loadable modules in the Contiki operating system
- First system in the community to have this
- Energy overhead of dynamic linking is low
- Significant impact
- Software used by 100 companies world-wide, in research projects, university courses; the papers are published at high-caliber conferences, ...
8 The details...
9 Networked embedded systems
- Some embedded systems already talk to each other
- Wireless car keys, the TV remote, mobile phones, ...
- The vision: wireless sensor networks
- Sensing, processing, radio on a single device
- Enable new applications
10 Wireless sensor networks: Applications
- Environmental monitoring
- Follow contamination flows
- Habitat observation
- Oceanography
11 Wireless sensor networks: Applications
- Health monitoring of buildings
- Cracks in bridges
- Mix sensors right into the concrete
- etc.
12 Wireless sensor networks may be just a vision ...
... but networked embedded systems are a reality!
13 The networked refrigerator
Dave Hudson, principal software engineer for Ubicom Ltd, 26 September 2001:
"Actually, refrigerators are probably one of the most network-connected appliances I know. Not domestic refrigerators, but the commercial type that supermarkets use. We've supplied tens of thousands of RS485-connected control and monitoring systems for such refrigerators. This market is now headed towards Ethernet and TCP/IP connectivity because it has tremendous benefits in terms of manageability and interoperability between different suppliers' equipment."
14 TCP/IP for memory-constrained networked embedded systems
1
15 Traditional TCP/IP stacks are large
- Linux TCP/IP stack
- 100k code, 400k RAM
- µCLinux kernel: 400k code, 1 megabyte RAM
60k ROM, 2k RAM...
16 µIP: Bottom-up approach
- Unconventional design
- Bottom-up design
- Single packet buffer
- Event-driven application interface
17 µIP results
- 5k code, 100 bytes to 2k RAM
- An order of magnitude smaller than existing work
- RFC-compliant TCP, UDP, IP
- Possible, contrary to conventional wisdom
- Single-segment design of µIP: unfortunate interaction with TCP's delayed ACK mechanism
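A little arithmetic shows why the delayed ACK interaction hurts. With only one segment in flight, the sender cannot transmit the next segment until the previous one is acknowledged, and a receiver using delayed ACKs may hold that acknowledgement back until its timer fires. The sketch below gives a rough upper bound, not a measurement. The function name is hypothetical, and the 200 ms delay in the usage note is an assumed, commonly seen timer value (RFC 1122 only requires the delay to be under 500 ms).

```c
/* Upper bound on throughput, in bits per second, when only one TCP
 * segment may be in flight: every segment costs one round trip plus
 * whatever time the receiver waits before sending its ACK. */
static double single_segment_bps(unsigned segment_bytes,
                                 double rtt_seconds,
                                 double ack_delay_seconds)
{
    return (segment_bytes * 8.0) / (rtt_seconds + ack_delay_seconds);
}
```

For example, a 100-byte segment with a negligible round-trip time and an assumed 200 ms ACK delay gives at most about 4000 bits per second (800 bits every 0.2 s), regardless of how fast the link is.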
19 But the ability to communicate is more important than throughput
- µIP trades throughput for memory
- Low memory usage, low throughput
- Small systems: not that much data
- Example: CubeSat
- µIP with 100 bytes buffer
- 9600 bps RF link
20 Event-driven
- "In TinyOS, we have chosen an event model so that high levels of concurrency can be handled in a very small amount of space. A stack-based threaded approach would require that stack space be reserved for each execution context."
- J. Hill, R. Szewczyk, A. Woo, S. Hollar, D. Culler, and K. Pister. System architecture directions for networked sensors. ASPLOS 2000
21 Problems with the event-driven model?
- "This approach is natural for reactive processing and for interfacing with hardware, but complicates sequencing high-level operations, as a logically blocking sequence must be written in a state-machine style."
- P. Levis, S. Madden, D. Gay, J. Polastre, R. Szewczyk, A. Woo, E. Brewer, and D. Culler. The Emergence of Networking Abstractions and Techniques in TinyOS. NSDI 2004
22 Simplifying event-driven programming of memory-constrained systems
2
23 Threads vs events
Threads: sequential code flow
Events: unstructured code flow
Very much like programming with GOTOs
24 Explicit state machines for flow control
- The problem: using explicit state machines for flow control
- Created ad hoc by the programmer
- No formal specification
- Must be inferred from reading code
- Very much like using GOTOs
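To make the problem concrete, here is a minimal sketch of an event handler written as an explicit state machine. It waits for one condition and then, optionally, for a second one. All names (`struct machine`, `handle_event`, the state constants) are hypothetical and chosen only for illustration; they are not taken from Contiki or TinyOS.

```c
/* States invented ad hoc by the programmer; nothing documents them
 * except the code itself. */
enum state { WAIT_FIRST, WAIT_SECOND, DONE };

struct machine {
    enum state state;
};

/* Called once per event. Returns 1 when the sequence has finished,
 * 0 while it is still waiting for a condition. */
static int handle_event(struct machine *m, int cond1, int cond2,
                        int need_second)
{
    switch (m->state) {
    case WAIT_FIRST:
        if (!cond1) {
            return 0;                       /* keep waiting */
        }
        m->state = need_second ? WAIT_SECOND : DONE;
        return m->state == DONE;
    case WAIT_SECOND:
        if (!cond2) {
            return 0;                       /* keep waiting */
        }
        m->state = DONE;
        return 1;
    case DONE:
    default:
        return 1;
    }
}
```

The logical flow ("wait, then maybe wait again") is nowhere written down as a sequence. A reader has to reconstruct it by tracing how `state` changes from one call to the next.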
25 Contiki: Combining event-driven and threads
- Event-based kernel
- Low memory usage
- Single stack
- Multi-threading is a library
- For those applications that need it
- One thread, one extra stack
- The first system in the sensor network community to do this
26 However...
- Threads still require stack memory
- Unused stack space wastes memory
- 200 bytes out of 2048 bytes is a lot!
- A multi-threading library is very difficult to port
- Requires use of assembly language
- Hardware specific
- Platform specific
- Compiler specific
27 Protothreads: A new programming abstraction
- A design point between events and threads
- Programming primitive: conditional blocking wait
- PT_WAIT_UNTIL(condition)
- Single stack
- Low memory usage, just like events
- Sequential flow of control
- No explicit state machine, just like threads
- Programming language helps us: if and while
28 An example protothread
int a_protothread(struct pt *pt)
{
  PT_BEGIN(pt);

  PT_WAIT_UNTIL(pt, condition1);

  if(something) {
    PT_WAIT_UNTIL(pt, condition2);
  }

  PT_END(pt);
}
29 Proof-of-concept implementation of protothreads in ANSI C
- Implementation: pure ANSI C
- Uses the C preprocessor
- No need for a special preprocessor
- No assembly language
- Very portable
- Nothing is changed between platforms, C compilers
- However, two deviations from the mechanism
- Automatic variables not stored across blocking waits
- Limitations on the use of switch statements
30 How well do protothreads work?
31 Reduction of complexity
Explicit flow-control state machines could be almost completely removed
Found state machine-related bugs in two of the programs when rewriting with protothreads
32 Execution time overhead is a few cycles
Contiki TR1001 radio driver average execution time, MSP430 CPU cycles
- Overhead: 3 to 6 CPU cycles
- Protothreads useful even in time-critical code
33 Now we can program...
- But how do we get the programs onto the devices?
34 Loadable modules in the Contiki operating system
3
35 Traditional reprogramming
- Physically attach to the device
- Provide a special voltage to the chip
- Rewrite the memory of the chip
- Do this for all your devices out there
- What if we have 100 devices in 100 buildings?
- 10000 devices...
36 Transmitting programs over the network
Load the software
37 Traditional systems: entire system a monolithic binary
- Most systems statically linked at compile-time
- Entire system is a monolithic binary
- Makes code smaller
- But hard to change
- Must re-upload entire system
38 Contiki: run-time loadable program modules
- Core resident in memory
- Programs know the core
- The core does not know the programs
- Individual programs can be loaded/unloaded
- The first system in the sensor network community to do this
Core
39 Can we use a standard mechanism for the dynamic loading?
- Can we do dynamic loading the Linux way in Contiki?
- Despite the resource constraints
- Run-time linking of ELF files
- Availability of tools, knowledge
- If we could, what would the overhead be?
- Compared to a tailored loading mechanism
- Compared to virtual machines
40 In comparison: two virtual machines
- CVM: Contiki VM
- A stack-based, typical virtual machine
- A compiler for a subset of Java
- The leJOS Java VM
- Adapted to run in ROM
- Executes Java byte code
- Bundled .class files
41 Memory footprint is small
- ROM size of dynamic linker
- 2k code
- 4k symbol table
- Full Contiki system, automatically generated
- ELF loading feasible for memory-constrained systems
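The symbol table is what lets a loaded program find the core even though the core knows nothing about the program: at load time, each name the module refers to is looked up in a table of core symbols and replaced by an address. The following is a minimal sketch of such a lookup. The table contents, the `core_fn` type and the function names are all hypothetical and are not Contiki's actual API.

```c
#include <stddef.h>
#include <string.h>

typedef void (*core_fn)(void);

/* One entry of the core's symbol table: a name and its address. */
struct symbol {
    const char *name;
    core_fn addr;
};

/* Two stand-in core functions (hypothetical). */
static int led_on;
static void leds_on(void)  { led_on = 1; }
static void leds_off(void) { led_on = 0; }

/* In a real system a table like this would be generated
 * automatically from the core image. */
static const struct symbol core_symbols[] = {
    { "leds_off", leds_off },
    { "leds_on",  leds_on  },
};

/* Resolve a name referenced by a loaded module.
 * Returns NULL if the core does not export it. */
static core_fn symbol_lookup(const char *name)
{
    size_t i;
    for (i = 0; i < sizeof core_symbols / sizeof core_symbols[0]; i++) {
        if (strcmp(core_symbols[i].name, name) == 0) {
            return core_symbols[i].addr;
        }
    }
    return NULL;
}
```

A linker built on this would call `symbol_lookup` for every undefined symbol in the module and refuse to load the module if any lookup returns NULL. A linear search keeps the code small; a sorted table with binary search would trade a little code for faster lookups.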
42 Quantifying the energy consumption
- Measure the energy consumption
- Radio reception, measured on CC2420, TR1001
- Better estimate based on average Deluge overhead
- Storing data to EEPROM
- Linking, relocating object code
- Loading code into flash ROM
- Executing the code
- Two platforms: ESB, Telos Sky (both MSP430)
43 Energy consumption of the dynamic linker
44 Loading, linking native code vs virtual machine code
Energy consumption in mJ for loading an object tracking application
45 Execution time overhead
- Computationally heavy code
- 8x8 vector convolution
- Code that uses a native code library
- Object tracking application
- Most of the time is spent running native code
46 Break-even points, vector convolution
ELF16
47 Break-even points, object tracking
ELF16
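One plausible way to read a break-even point: if native code costs more energy to load and link but less to execute, then after some number of executions the native version becomes cheaper in total. The sketch below computes that number under a simple linear model. The model, the function name and the choice of unit are assumptions for illustration, and the numbers in the usage note are made up; they are not the measured values from the talk.

```c
/* Smallest number of executions n for which
 *     load_native + n * run_native <= load_vm + n * run_vm
 * with all energies given in the same unit (for example microjoules).
 * Returns 0 if native code is already no more expensive to load,
 * and -1 if it never catches up. */
static long break_even_runs(unsigned long load_native,
                            unsigned long run_native,
                            unsigned long load_vm,
                            unsigned long run_vm)
{
    unsigned long extra_load, saved_per_run;

    if (load_native <= load_vm) {
        return 0;
    }
    if (run_vm <= run_native) {
        return -1;      /* the VM is never more expensive per run */
    }
    extra_load = load_native - load_vm;
    saved_per_run = run_vm - run_native;
    /* Ceiling division: round up to a whole number of runs. */
    return (long)((extra_load + saved_per_run - 1) / saved_per_run);
}
```

With made-up costs of 1000 units to load and 1 unit per run for native code, against 200 units to load and 5 units per run for VM code, `break_even_runs(1000, 1, 200, 5)` returns 200: each run saves 4 units, so 200 runs pay back the 800 extra units spent on loading.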
48 Wrapping up
49 Future work
- Investigating the memory requirements/performance trade-off
- More memory, better performance?
- Single-buffer approach for other communication mechanisms
- Bottom-up approach to build other programming abstractions
- High-level sensor network programming
50 Conclusions
- Results
- TCP/IP for memory-constrained systems
- Protothreads simplify event-driven programming
- Dynamic loading/linking of code modules
- Low-complexity mechanisms for low-complexity systems
- Simple in hindsight!
- But it takes a lot of hard work to get there
- Some interesting future work ahead of us
51 The end of my part
52 Background: the TCP/IP stack
- UDP: best-effort datagrams
- TCP: connection-oriented, reliable byte-stream, full-duplex
- Flow control, congestion control, etc.
- IP: best-effort packet delivery
- Forwarding, fragmentation
- The hard parts are IP and TCP
53 The secrets of µIP
- Shared packet buffer
- Lower throughput
- Event-driven application programming interface
54 The secrets of µIP, part I: A shared packet buffer
- All packets, both outbound and inbound, use the same buffer
- Size of buffer determines throughput
Outbound packet
Incoming packet
Packet buffer
55 The secrets of µIP, part I: A shared packet buffer II
- Implicit locking: single-threaded access
- Grab packet from network, put into buffer
- Process packet
- Put reply packet in the same buffer
- Send reply packet into network
Packet buffer
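The four steps above can be sketched with a single statically allocated buffer. This is an illustration of the idea only, not µIP's code: the buffer size, the function names and the "protocol" (which simply upper-cases the payload as a stand-in for real processing) are all assumptions.

```c
#include <stddef.h>
#include <string.h>

#define BUF_SIZE 128    /* assumed size, chosen for illustration */

static unsigned char packet_buf[BUF_SIZE];  /* the one shared buffer */
static size_t packet_len;                   /* valid bytes in it */

/* Step 1: copy an incoming frame into the shared buffer.
 * Returns 0 (frame dropped) if it does not fit. */
static int net_receive(const unsigned char *frame, size_t len)
{
    if (len > BUF_SIZE) {
        return 0;
    }
    memcpy(packet_buf, frame, len);
    packet_len = len;
    return 1;
}

/* Steps 2 and 3: process the packet and build the reply in place,
 * overwriting the request. Upper-casing stands in for a protocol. */
static void process_packet(void)
{
    size_t i;
    for (i = 0; i < packet_len; i++) {
        if (packet_buf[i] >= 'a' && packet_buf[i] <= 'z') {
            packet_buf[i] = (unsigned char)(packet_buf[i] - 'a' + 'A');
        }
    }
}

/* Step 4: hand the reply to the network (here: copy it out) and
 * mark the buffer as free for the next packet. */
static size_t net_send(unsigned char *out, size_t out_size)
{
    size_t n = packet_len < out_size ? packet_len : out_size;
    memcpy(out, packet_buf, n);
    packet_len = 0;
    return n;
}
```

Because the three functions are only ever called one after another from a single loop, no lock is needed around the buffer. The price is that a packet larger than the buffer cannot be handled at all, which is one reason the buffer size bounds throughput.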
56 The secrets of µIP, part II: Throughput
- µIP trades throughput for RAM
- Low RAM usage, low throughput
- Small systems: not that much data!
- Ability to communicate more important than throughput!
57 The smallest µIP configuration (that I know of)
- CubeSat kit by Pumpkin Inc
- Pico satellite construction kit
- 128 bytes of RAM for µIP
58 The secrets of µIP, part III: Application Programming Interface I
- µIP does not have BSD sockets
- BSD sockets are built on threads
- Threads induce overhead (RAM)
- Instead: event-driven API
- Execution is always initiated by µIP
- Applications are called by µIP, call must return
- Protosockets: BSD socket-like API based on protothreads
59 The secrets of µIP, part III: Application Programming Interface II
void example2_app(void) {
  struct example2_state *s;
  s = (struct example2_state *)uip_conn->appstate;

  if(uip_connected()) {
    s->state = WELCOME_SENT;
    uip_send("Welcome!\n", 9);
    return;
  }
  if(uip_acked() && s->state == WELCOME_SENT) {
    s->state = WELCOME_ACKED;
  }
  if(uip_newdata()) {
    uip_send("ok\n", 3);
  }
  if(uip_rexmit()) {
    switch(s->state) {
    case WELCOME_SENT:
      uip_send("Welcome!\n", 9);
      break;
    case WELCOME_ACKED:
      uip_send("ok\n", 3);
      break;
    }
  }
}
60 The secrets of µIP, part III: Application Programming Interface III
- Event-driven API is sometimes problematic
- Not all programs are well-suited to it
- Programs are explicit state machines
- Protosockets: sockets-like API using protothreads
- Extremely lightweight stackless threads
- 2 bytes per-thread state, no stack
- Protothreads allow blocking functions, even when called from µIP
61 The secrets of µIP, part III: Application Programming Interface IV
PT_THREAD(smtp_protothread(void))
{
  PSOCK_BEGIN(s);
  PSOCK_READTO(s, '\n');
  if(strncmp(inputbuffer, "220", 3) != 0) {
    PSOCK_CLOSE(s);
    PSOCK_EXIT(s);
  }
  PSOCK_SEND(s, "HELO ", 5);
  PSOCK_SEND(s, hostname, strlen(hostname));
  PSOCK_SEND(s, "\r\n", 2);
  PSOCK_READTO(s, '\n');
  if(inputbuffer[0] != '2') {
    PSOCK_CLOSE(s);
    PSOCK_EXIT(s);
  }
  PSOCK_END(s);
}
62 The secrets of µIP, part III: Application Programming Interface V
- API built from the bottom (network) and up
- Protothreads and the protosocket API provide sequential programming
- Less overhead than real threads and the real socket API
63 Threads require per-thread stack memory
- Four threads, each with its own stack
[Figure: Threads 1-4, each with its own stack]
64 Events require one stack
- Four event handlers, one stack
- Stack is reused for every event handler
[Figure: Eventhandlers 1-4 on one stack, next to Threads 1-4 with per-thread stacks]
65 Protothreads require one stack
- Four protothreads, one stack
- Just like events
[Figure: Protothreads 1-4 on one stack, next to Threads 1-4 with per-thread stacks]
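The difference is easy to quantify. The sketch below uses figures quoted elsewhere in the talk (200 bytes of stack, 2 bytes of state per protothread, 2048 bytes of RAM). Treating 200 bytes as the size of every thread's stack is an assumption, and the function names are illustrative only.

```c
/* RAM consumed by n threads that each reserve their own stack. */
static unsigned long threads_ram(unsigned long n,
                                 unsigned long stack_bytes)
{
    return n * stack_bytes;
}

/* RAM consumed by n protothreads: only the per-protothread state,
 * because all of them run on the same stack. */
static unsigned long protothreads_ram(unsigned long n,
                                      unsigned long state_bytes)
{
    return n * state_bytes;
}

/* Share of total RAM in whole percent, rounded down. */
static unsigned long percent_of(unsigned long bytes,
                                unsigned long total)
{
    return bytes * 100ul / total;
}
```

Under these assumptions, four threads need `threads_ram(4, 200)` = 800 bytes, which is 39% of a 2048-byte RAM, while four protothreads need `protothreads_ram(4, 2)` = 8 bytes, well under 1%.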
66 Six-line implementation
Protothreads implemented using the C switch statement
struct pt { unsigned short lc; };
#define PT_INIT(pt)          pt->lc = 0
#define PT_BEGIN(pt)         switch(pt->lc) { case 0:
#define PT_EXIT(pt)          pt->lc = 0; return 2
#define PT_WAIT_UNTIL(pt, c) pt->lc = __LINE__; case __LINE__: \
                             if(!(c)) return 0
#define PT_END(pt)           } pt->lc = 0; return 1
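To see the switch-based mechanism run, the following sketch restates the macros (with extra parentheses around `pt` for safety) and drives one protothread from a simple loop. The names `ticks`, `stage`, `waiter` and `run_waiter` are hypothetical, and the loop stands in for whatever event source would call the protothread in a real system.

```c
struct pt { unsigned short lc; };

#define PT_INIT(pt)          ((pt)->lc = 0)
#define PT_BEGIN(pt)         switch ((pt)->lc) { case 0:
#define PT_WAIT_UNTIL(pt, c) (pt)->lc = __LINE__; case __LINE__: \
                             if (!(c)) return 0
#define PT_END(pt)           } (pt)->lc = 0; return 1

static int ticks;   /* stand-in for an external event counter */
static int stage;   /* global, so its value survives blocking waits */

/* Returns 0 while blocked and 1 once it has run to completion. */
static int waiter(struct pt *pt)
{
    PT_BEGIN(pt);
    PT_WAIT_UNTIL(pt, ticks >= 2);   /* first blocking wait */
    stage = 1;
    PT_WAIT_UNTIL(pt, ticks >= 5);   /* second blocking wait */
    stage = 2;
    PT_END(pt);
}

/* Calls the protothread repeatedly, advancing ticks after each
 * blocked call, and returns how many calls it took to finish. */
static int run_waiter(void)
{
    struct pt pt;
    int calls = 0;

    PT_INIT(&pt);
    ticks = 0;
    stage = 0;
    for (;;) {
        calls++;
        if (waiter(&pt)) {
            break;
        }
        ticks++;
    }
    return calls;
}
```

Each blocked call returns to the caller, and the next call jumps straight back to the saved `case __LINE__` label, so the whole protothread state is the two-byte `lc` field. This is also why automatic variables do not keep their values across a wait: the function really returns each time. Here `stage` is a global for that reason.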
67 Code footprint
- Average increase: 200 bytes
- Inconclusive
68 What's wrong with using state machines?
- There is nothing wrong with state machines!
- State machines are a powerful tool
- Amenable to formal analysis, proofs
- But state machines are typically used to control the logical program flow in many event-driven programs
- Like using gotos instead of structured programming
- The state machines are not formally specified
- Must be inferred from reading the code
- These state machines typically look like flow charts anyway
- We're not the first to see this
- Protothreads use language constructs for flow control
69 Why not just use multithreading?
- Multithreading is the basis of (almost) all embedded OS/RTOSes!
- WSN community: Mantis, BTNut (based on multithreading); Contiki (multithreading on a per-application basis)
- Nothing wrong with multithreading
- Multiple stacks require more memory
- Networked: more concurrency than traditional embedded
- Can lead to more expensive hardware
- Preemption
- Threads: explicit locking; Protothreads: implicit locking
- Protothreads are a new point in the design space
- Between event-driven and multithreaded
70 Implementing protothreads
- Modify the compiler?
- There are many compilers to modify (IAR, Keil, ICC, Microchip, GCC, ...)
- Special preprocessor?
- Requires us to maintain the preprocessor software on all development platforms
- Within the C language?
- The best solution, if the language is expressive enough
- Possible?
71 Are protothreads useful in practice?
- We know that at least thirteen different embedded developers have adopted them
- AVR, PIC, MSP430, ARM, x86
- Portable: no changes when crossing platforms, compilers
- MPEG decoding equipment, real-time systems
- Others have ported protothreads to C++, Objective C
- Probably many more
- From mailing lists, forums, email questions
- Protothreads recommended twice in embedded guru Jack Ganssle's Embedded Muse newsletter