1. Smart-NICs: Power Proxying for Reduced Power Consumption in Network Edge Devices
Karthikeyan Sabhanatarajan, Ann Gordon-Ross, Mark Oden, Mukund Navada, Alan D. George
High-Performance Computing and Simulation Research Laboratory
Department of Electrical and Computer Engineering
University of Florida, Gainesville
Also affiliated with the NSF Center for High-Performance Reconfigurable Computing
This work was supported by the U.S. National Science Foundation
2. Introduction
[Figure: network diagram; label: Internet]
3. Introduction
- Connected edge devices account for 2% of the total power consumed in the US [EPA-06]
  - 130 TWh/year
  - This is $13 billion at $0.10 per kWh
  - One single-unit nuclear power plant outputs 8 TWh/year
  - Translates to 16 single-unit nuclear power plants!
- Why so much power?
  - PCs can consume up to 200 W
  - 1 billion PCs worldwide by 2010 [Kanellos-04]
- What can we do?
  - PCs are idle 75% of the time [Purushothaman-06]
  - But only 10% of PCs are allowed to sleep during that time [EPA-06]
  - Sleeping reduces power consumption by 80% or more
  - If PCs were allowed to sleep, only 3 single-unit nuclear power plants would be required
Question: Why aren't these PCs asleep?
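The energy figures on this slide can be cross-checked with a few lines of arithmetic. The sketch below uses only the inputs quoted above (130 TWh/year, $0.10 per kWh, 8 TWh/year per plant); the variable names are illustrative.

```python
# Back-of-the-envelope check of the energy figures quoted on this slide.
EDGE_ENERGY_TWH_PER_YEAR = 130      # edge-device consumption [EPA-06]
PRICE_USD_PER_KWH = 0.10            # electricity price used on the slide
PLANT_OUTPUT_TWH_PER_YEAR = 8       # one single-unit nuclear power plant

KWH_PER_TWH = 1_000_000_000         # 1 TWh = 10^9 kWh

annual_cost_usd = EDGE_ENERGY_TWH_PER_YEAR * KWH_PER_TWH * PRICE_USD_PER_KWH
plants_needed = EDGE_ENERGY_TWH_PER_YEAR / PLANT_OUTPUT_TWH_PER_YEAR

print(f"Annual cost: ${annual_cost_usd / 1e9:.1f} billion")
print(f"Equivalent plants: {plants_needed:.2f}")
```

With these inputs the cost works out to roughly $13 billion per year and 16.25 plant-equivalents, which rounds to the 16 plants quoted.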
4. Maintaining Network Connectivity
[Figure: Gnutella file-sharing application. Labels: Alice, Internet, Bob, idle, file query packet, file response packet.]
Problem: PC must be awake to maintain network connectivity
Alice checks to see if Bob has a file needed for P2P file sharing
5. A Solution: Power Proxying
- Primary challenge is to maintain network connectivity while the PC is powered down to standby mode (sleeping)
- Some packets do not require a complex response
  - Automated responses are sufficient
- Network Interface Card (NIC) can act as a proxy for the PC
  - Allow the PC to sleep while the NIC services packets with automated responses
  - A technique known as power proxying
- We call such a NIC a Smart-NIC (SNIC)
6. Power Proxying
[Figure: Gnutella file-sharing application. Labels: Alice, Internet, Bob, idle, file query packet, file response packet.]
PC delegates power to the SNIC to handle network traffic
7. Power Proxying
[Figure: packet handling at the SNIC. Labels: Internet, Bob, SNIC, proxiable packet, chatter packet, non-proxiable/wake-up packet, response.]
8. What to Proxy? - Proxiable Protocols
- Proxiable protocols
  - Network protocols amenable to proxying
  - Responses may be automated
  - Keep-alive packets, IP conflict avoidance, etc.
Four Categories of Proxiable Packets
[Figure: example packets and their automated responses: ARP query -> ARP response; ping -> ping response; P2P file query -> P2P response; mail notification.]
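As one illustration of an automated response, an ARP query can be answered from fixed state (the sleeping PC's MAC and IP address) with no host involvement. The sketch below is a minimal software model, not the SNIC implementation: `proxy_arp_reply` and its parameters are hypothetical names, and it handles only the 28-byte ARP body for IPv4 over Ethernet, not the enclosing Ethernet header.

```python
import struct

# ARP body for IPv4 over Ethernet: htype, ptype, hlen, plen, oper,
# sender MAC, sender IP, target MAC, target IP (28 bytes in total).
ARP_FORMAT = "!HHBBH6s4s6s4s"
ARP_LEN = struct.calcsize(ARP_FORMAT)
ARP_REQUEST, ARP_REPLY = 1, 2


def proxy_arp_reply(request, my_mac, my_ip):
    """Return an ARP reply body for `request`, or None if it is not for us."""
    if len(request) < ARP_LEN:
        return None
    (htype, ptype, hlen, plen, oper,
     sender_mac, sender_ip, _target_mac, target_ip) = struct.unpack(
        ARP_FORMAT, request[:ARP_LEN])
    if oper != ARP_REQUEST or target_ip != my_ip:
        return None
    # Swap roles: we become the sender, the requester becomes the target.
    return struct.pack(ARP_FORMAT, htype, ptype, hlen, plen, ARP_REPLY,
                       my_mac, my_ip, sender_mac, sender_ip)
```

Everything needed for the reply is either copied from the query or known before the PC sleeps, which is what makes this class of packet amenable to proxying.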
9. Power Proxying Operation
1. PC decides to sleep
2. PC offloads power proxy rules to the SNIC
3. PC sleeps and SNIC proxy is activated
5. Header inspection
6. Rule checking (match?)
8. Determine response
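The numbered steps on this slide can be sketched in software as follows. This is a minimal model under stated assumptions, not the authors' implementation: the `Rule` fields and the `handle_packet` name are hypothetical, and waking the PC when no rule matches is an assumption suggested by the non-proxiable/wake-up packet label on slide 7.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Rule:
    """Hypothetical proxy rule offloaded by the PC before it sleeps."""
    protocol: str
    dst_port: Optional[int]               # None = field not used by this rule
    respond: Callable[[dict], bytes]      # builds the automated response


def handle_packet(header: dict, rules: list) -> tuple:
    """Check an inspected header against the rules, then respond or wake.

    Returns ("respond", payload) on a match, else ("wake", None).
    """
    for rule in rules:                                    # rule checking
        if rule.protocol != header["protocol"]:
            continue
        if rule.dst_port is not None and rule.dst_port != header.get("dst_port"):
            continue
        return ("respond", rule.respond(header))          # determine response
    return ("wake", None)              # assumption: no match wakes the PC


# Rules the PC might offload before sleeping (illustrative values only).
rules = [
    Rule("icmp", None, lambda h: b"echo-reply"),
    Rule("tcp", 40000, lambda h: b"p2p-response"),
]

print(handle_packet({"protocol": "icmp"}, rules))
print(handle_packet({"protocol": "tcp", "dst_port": 22}, rules))
```

The `header` dictionary stands in for the output of the header inspection step; in the SNIC that step would extract these fields from the raw frame.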
10. Packet Classifier Requirements
PC-BASED CLASSIFIER
1) Must sustain link rates of 10/100/1000/10000 Mbps
2) No packet loss allowed
3) Operates only during system inactivity
4) Processes only packets addressed to a particular destination, plus broadcast/multicast packets
5) Limited processing resources - processors clocked in the MHz range
6) Limited number of rules, which depends directly on the number of proxiable applications running
7) Packets match only one rule - rules are disjoint
ROUTER-BASED CLASSIFIER
1) Must sustain link rates of 10/100/1000/10000 Mbps
2) No packet loss allowed
3) Continual operation
4) Processes packets to any destination
5) Processors clocked in the GHz range
6) Larger number of rules with a wide complexity range
7) Packets can match multiple rules
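Requirement 7 for the PC-based classifier (disjoint rules) can be checked offline before rules are offloaded. The sketch below assumes a hypothetical rule representation, a tuple of exact-match field values in which `None` acts as a wildcard; the slides do not specify the rule format.

```python
from itertools import combinations


def rules_overlap(a, b):
    """Two rules overlap if some packet could match both.

    Each rule is a tuple of field values; None means "any value".
    """
    return all(x is None or y is None or x == y for x, y in zip(a, b))


def overlapping_pairs(rules):
    """Return index pairs of rules that are not disjoint."""
    return [(i, j)
            for (i, a), (j, b) in combinations(enumerate(rules), 2)
            if rules_overlap(a, b)]


# Fields are (protocol, destination port); values are illustrative only.
example_rules = [("icmp", None), ("tcp", 40000), ("tcp", None)]
print(overlapping_pairs(example_rules))  # prints [(1, 2)]
```

An empty result means every packet can match at most one rule, so the classifier can return on the first hit without any priority resolution.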
11. Packet Classifier - SW vs. HW
Software Classifier
1) Limited operating frequency, between 66 MHz and 400 MHz
2) Cannot meet the network throughput demands, even with the fastest packet classification algorithms
3) High power even during idle periods
Hardware Classifier
1) Custom hardware can be designed for the required frequency
2) Can easily meet the network throughput demands
3) Comparatively lower power
12. Custom HW Packet Classification
[Figure: block diagram of the hardware classifier. Labels: incoming packet (from MAC core), header processor, source port CAM, dest port CAM, source IP address CAM, match address, invokes application handler.]
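The block diagram can be mimicked in software to see how CAM lookups might yield a match address. In the sketch below each CAM is modeled as a dictionary from a stored field value to the set of rule addresses holding that value, and intersecting the per-field results stands in for combining the CAM match lines. The class name, the wildcard handling, and the way results are combined are assumptions, since the slide does not give those details.

```python
class FieldCam:
    """Software stand-in for one CAM: value -> set of matching rule addresses."""

    def __init__(self):
        self._entries = {}
        self._wildcards = set()     # rules that do not constrain this field

    def store(self, address, value):
        if value is None:
            self._wildcards.add(address)
        else:
            self._entries.setdefault(value, set()).add(address)

    def lookup(self, value):
        return self._entries.get(value, set()) | self._wildcards


FIELDS = ("src_port", "dst_port", "src_ip")


def build_cams(rules):
    """Load one CAM per header field from a list of rule dictionaries."""
    cams = {name: FieldCam() for name in FIELDS}
    for address, rule in enumerate(rules):
        for name in FIELDS:
            cams[name].store(address, rule.get(name))
    return cams


def match_address(cams, header):
    """Return the matching rule address, or None if no rule matches."""
    hits = None
    for name in FIELDS:
        found = cams[name].lookup(header.get(name))
        hits = found if hits is None else hits & found
    # With disjoint rules there is at most one hit; min() acts as a tie-break.
    return min(hits) if hits else None


# Illustrative rule set: list position is the rule's address.
cam_rules = [
    {"dst_port": 40000},
    {"src_ip": "10.0.0.5", "dst_port": 25},
]
cams = build_cams(cam_rules)
print(match_address(cams, {"src_port": 5000, "dst_port": 40000, "src_ip": "10.0.0.9"}))
```

In hardware the three lookups would proceed in parallel rather than in a loop; the returned address plays the role of the match address that selects the application handler.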
13. Packet Classifier Placement
14. Experimental Setup
- Software packet classifier
  - Implemented on the RiceNIC platform using a PowerPC 405
  - RiceNIC is a programmable NIC
  - PowerPC clocked at 300 MHz and 100 MHz
- Hardware packet classifier
  - Xilinx IP cores to generate CAMs as block memory
  - Prototyped in Verilog HDL
  - System implemented and simulated using Xilinx ISE 9.1 and ModelSim
  - Clocked at 1.25 MHz, 12.5 MHz, and 125 MHz, corresponding to 10 Mbps, 100 Mbps, and 1000 Mbps
  - Power calculated using Xilinx XPower
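Each of the three hardware clock rates is one eighth of the corresponding link rate, which is what a byte-wide datapath (one byte per clock cycle, as in GMII at 125 MHz for Gigabit Ethernet) would require. The slide does not state the datapath width, so the 8-bit width below is an inference and the function name is illustrative.

```python
DATAPATH_BITS = 8  # assumed: one byte processed per clock cycle


def required_clock_mhz(link_rate_mbps, datapath_bits=DATAPATH_BITS):
    """Clock needed to consume the link's bits at datapath_bits per cycle."""
    return link_rate_mbps / datapath_bits


for rate in (10, 100, 1000):
    print(f"{rate} Mbps -> {required_clock_mhz(rate)} MHz")
```

The loop reproduces the 1.25 MHz, 12.5 MHz, and 125 MHz values listed for the hardware classifier.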
15. Results: Packet Classification Time
- Hardware classification outperforms software classification at 300 MHz and 100 MHz
[Figure: Worst-case packet classification time for each protocol class with 100 rules]
16. Results: Classification Time vs. Rules
- As more applications are identified as proxiable, rule set sizes will increase
  - Thus scalability is important
17. Results: Packet Throughput
- Throughput is measured in millions of packets per second (MPPS)
Hardware exceeds Gbps throughput
Software cannot meet requirements!
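To put MPPS figures in context, the worst-case packet rate a classifier must sustain can be estimated from the link rate. The sketch assumes standard Ethernet framing with minimum-size 64-byte frames plus 8 bytes of preamble and a 12-byte inter-frame gap (84 bytes on the wire per packet); the slides do not state which packet sizes were used in the measurements, and the function names are illustrative.

```python
MIN_FRAME_BYTES = 64          # minimum Ethernet frame, including FCS
PREAMBLE_BYTES = 8            # preamble plus start-of-frame delimiter
INTERFRAME_GAP_BYTES = 12

WIRE_BITS_PER_PACKET = 8 * (MIN_FRAME_BYTES + PREAMBLE_BYTES + INTERFRAME_GAP_BYTES)


def worst_case_mpps(link_rate_mbps):
    """Millions of packets per second at line rate with minimum-size frames."""
    return link_rate_mbps / WIRE_BITS_PER_PACKET


def time_budget_ns(link_rate_mbps):
    """Time available to classify one minimum-size packet, in nanoseconds."""
    return WIRE_BITS_PER_PACKET / link_rate_mbps * 1000


for rate in (10, 100, 1000):
    print(f"{rate} Mbps: {worst_case_mpps(rate):.4f} MPPS, "
          f"{time_budget_ns(rate):.0f} ns per packet")
```

Under these assumptions a 1000 Mbps link carries at most about 1.49 MPPS, leaving roughly 672 ns to classify each packet.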
18. Results: HW Speedup vs. SW
19. Results: Power Consumption
- SW classifier consumes 2.4x more power than HW
  - SW: 259.5 mW and 441 mW at 100 MHz and 300 MHz, respectively
  - HW: 180 mW for 100 rules
- Link rate scalability
  - For SW to meet 1 Gbps throughput
    - Clocked at 500 MHz
    - Requires an additional 294 mW of power
    - Resulting in 4x more power than HW
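The ratios on this slide follow from the reported power figures, as the short check below shows. It uses only numbers quoted above and assumes that the 500 MHz software figure is the 300 MHz value plus the stated additional 294 mW; the variable names are illustrative.

```python
HW_MW = 180.0                    # hardware classifier, 100 rules
SW_300MHZ_MW = 441.0             # software classifier at 300 MHz
SW_EXTRA_FOR_1GBPS_MW = 294.0    # additional power to reach 1 Gbps

# Assumption: the extra 294 mW is added to the 300 MHz figure.
sw_500mhz_mw = SW_300MHZ_MW + SW_EXTRA_FOR_1GBPS_MW

ratio_300 = SW_300MHZ_MW / HW_MW          # SW at 300 MHz relative to HW
ratio_500 = sw_500mhz_mw / HW_MW          # SW at 500 MHz relative to HW
hw_saving = 1 - HW_MW / sw_500mhz_mw      # fraction of power HW saves

print(f"SW at 300 MHz vs HW: {ratio_300:.2f}x")
print(f"SW at 500 MHz vs HW: {ratio_500:.2f}x")
print(f"HW saving vs SW at 500 MHz: {hw_saving:.1%}")
```

The results are about 2.45x, 4.08x, and 75.5%, in line with the 2.4x and 4x figures on this slide and the 75% saving quoted in the conclusions.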
20. Conclusions
- PCs consume a lot of power
  - Left powered on to maintain network connectivity
- Introduced power proxying
  - SNIC maintains network connectivity so the PC can sleep
  - Can increase sleep time by 85% [Purushothaman-06]
- Low-power hardware-based packet classifier to enable power proxying
  - Exceeds Gigabit Ethernet throughput requirement
  - Up to 9x speedup in packet classification time over a software packet classifier
  - 75% less power than a software packet classifier
  - Better scalability with respect to future rule set size and link rates than a software packet classifier