Title: Control Update 2, Phase 0
1. Control Update 2, Phase 0
2. Immediate Tasks/Milestones
- Cross-development environment for the xscale
- Started with the evaluation version of the MontaVista software.
- To build missing libraries and software I installed our own cross-compiler and built the missing system and application libraries.
- MontaVista: because of our dependence on Radisys software, and their dependence on MontaVista, we (I) need to explore what it will take to obtain either their set of system headers or an official release.
- Install a local version of the PlanetLab software
- PlanetLab Central (PLC) maintains a centralized database of all participants, physical nodes and slices.
- The PlanetLab node software interfaces with the PLC to obtain slice definitions, the software repository, authentication and logging.
- Status: installed myPLC and created 2 local nodes.
- myPLC continues to evolve, though I have not kept up with the latest release.
- myPLC: early on the package was not stable, so once I got everything to build I stopped tracking changes. At some point we need to invest the time and energy to update our distribution.
3. Immediate Tasks/Milestones
- Build and test the drivers and applications provided by Intel, Radisys and IDT
- Currently all build, load and run.
- The IDT code is a bit fragile and jdd continues to struggle with it. I have rewritten their system call wrappers to check for error returns from system calls and to verify parameters (so we don't crash kernels).
- John has found a problem with how they allocate memory in the kernel. I have not yet incorporated the IDT (TCAM) into my configuration utility.
- Configure the control processors (foghorn and coffee) for the Radisys boards.
- Set up data hosts: two hosts are currently configured to act as clients (along with coffee and foghorn).
- Install and configure the two Intel general-purpose processing boards as PlanetLab hosts: todo.
4. Immediate Tasks/Milestones
- xscale utility library for managing NP state
- Can read/write any physical address in the system
- Will verify addresses are valid
- In some cases (not all) will also verify other attributes such as alignment and operation size (width of the read/write operation); see the sketch below
- Need to add support for the TCAM (not needed for November, as jdd has written utilities for configuring TCAMs)
- Network processor configuration (from the xscale)
- Command interpreter (cmd): interprets system configuration scripts similar to those used in the simulation environment
- Debug utility: interactive or scripted read/write to physical memory
- xscale daemon (remote control)
- Simple daemon implemented for reading/writing blocks of memory
- Leverages core components from the command interpreter
- Uses XML-RPC for communication
- Debug utility, demo tool and test environment for interfaces
- The guts of the daemon are used to implement the RLI interface
- RLI interface daemon (initial version appears to work)
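The read/write path above boils down to mapping a physical address into user space after some sanity checks. A minimal sketch of the idea in C; the function name, address range and checks are illustrative assumptions, and /dev/mem stands in for the wudev device file that the real ixpLib actually opens:

    #include <fcntl.h>
    #include <stdint.h>
    #include <sys/mman.h>
    #include <sys/types.h>
    #include <unistd.h>

    /* Hypothetical bounds of the region we allow access to. */
    #define PHYS_BASE 0x90000000UL
    #define PHYS_SIZE (8 * 1024 * 1024)

    /* Read one 4-byte word at physical address 'pa', with the kinds of
     * checks the utility library performs: range and alignment. */
    static int read_dw4(uint32_t pa, uint32_t *out)
    {
        long pagesz = sysconf(_SC_PAGESIZE);

        if (pa < PHYS_BASE || pa + 4 > PHYS_BASE + PHYS_SIZE)
            return -1;                       /* address out of range */
        if (pa & 0x3)
            return -1;                       /* not 4-byte aligned */

        int fd = open("/dev/mem", O_RDONLY);
        if (fd < 0)
            return -1;

        /* mmap needs a page-aligned offset; map the page containing pa. */
        off_t page = pa & ~(uint32_t)(pagesz - 1);
        void *map = mmap(NULL, pagesz, PROT_READ, MAP_SHARED, fd, page);
        close(fd);
        if (map == MAP_FAILED)
            return -1;

        *out = *(volatile uint32_t *)((char *)map + (pa - page));
        munmap(map, pagesz);
        return 0;
    }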
5. For the November Demo
- Command interpreter (utility): done
- Runs module initialization scripts
- Debugging: read/write physical memory
- XML-RPC server for reading/writing xscale physical memory
- Implementation and initial testing complete.
- Can run either on the hardware (reading/writing physical memory) or in a simulated environment on any host. For the latter case the library allocates an 8MB chunk of virtual memory to simulate the xscale environment (see the sketch below).
- RLI interface daemon
- Uses XML-RPC to communicate with the xscale server and NCCP on the RLI side.
- Jyoti provided the template code ... she made it very easy for me!
- Initial test worked with one caveat
- Done.
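One way to picture the simulated mode mentioned above: off the hardware, the library can back the "physical" address range with an ordinary heap allocation and translate addresses into offsets. A rough sketch under assumed names (not the actual library code):

    #include <stdint.h>
    #include <stdlib.h>

    #define SIM_BASE 0x90000000UL        /* assumed base of the emulated region */
    #define SIM_SIZE (8 * 1024 * 1024)   /* the 8MB block mentioned above */

    static uint8_t *sim_mem;

    /* Map an "xscale physical address" onto the malloc'ed block. */
    static void *sim_translate(uint32_t pa)
    {
        if (!sim_mem)
            sim_mem = calloc(1, SIM_SIZE);
        if (!sim_mem || pa < SIM_BASE || pa >= SIM_BASE + SIM_SIZE)
            return NULL;                 /* same range check as on hardware */
        return sim_mem + (pa - SIM_BASE);
    }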
6. For Demo in November
- Command line client for the xscale server
- Debugging and simple (dynamic) configuration from GPE or CP hosts.
- Packet generation
- sp
- No change since last presented to the group
- Todo (long term, not for November): sp
- Verify packet format
- Specify packet headers on the command line or in a file
- Todo (other)
- Tunnel creation and maintenance for end systems so we can use generic applications
- At a higher level, we need a strategy for how we will connect to a meta-net and generate traffic for testing/demos
7. Command Interpreter
- Simple command interpreter; basic syntax:
- <expression> <cmd_expr> (<EOL>)
- <cmd_expr> <cmd> <cmd_arg>
- <cmd_arg> <term_arg> <infix_op> <term_arg>
- <term_arg> <prefix_op> <term_expr> <postfix_op>
- <term_expr> <terminal> <sub_expr>
- <sub_expr> '(' <cmd_expr> ')'
- Commands are either arithmetic expressions or some system-defined operation (mem, vmem, set, etc.): expr 3 5, expr r t / 5
- You need not explicitly include the expr command; if the command name is missing then expr is assumed: 3 5, r t / 5
- Objects
- Arguments are objects and represent values in memory (physical memory)
- A variable is a name that refers to an object
- An object is bound to a type and value when it is first assigned to.
- Once bound to a type, the type may not change (but the value may).
- The standard type conversions are supported.
- Every operation returns an object; an object may or may not have a value
- An unassigned variable/object has no value (in a conditional it evaluates to false).
8. Supported Types
- Arguments
- Variables (names), immediate values and objects returned by a command.
- An object is a scalar or vector of primitive types.
- A variable is defined when it first appears in a set statement
9. Example Command Script
- #!/root/bin/cmd -f
- Comments: a single character marks the start of a comment, which extends to end-of-line.
- Global scope, 6B Eth addr and 4B IP addr:
- set ethAddr 554433221100
- set ipAddr 192.168.0.4
- print "ethAddr " ethAddr " , ipAddr " ipAddr "\n"
- Block scope; alternative syntax for an array of bytes:
- set ethAddr (dw1 1 2 3 4 5 6)
- print "XX Ethernet addr " ethAddr "\n"
- Function definitions, run-time binding of arguments, returning values
- Loops:
    while ( indx < 4 )
      set result 0                  block scope
      if ((indx / 2) 0)
        ipAddr3 indx
        result (doWrite addr ipAddr)
        addr addr 4
      else
        ethAddr5 indx
        result (doWrite addr ethAddr)
        addr addr 6
      if ( result 0) print "Write failed\n"
      indx indx 1
- These would be errors: ethAddr ipAddr, print Result result
- Output:
    ./try.cmd
    ethAddr 55 44 33 22 11 00 , ipAddr c0 a8 00 04
    XX Ethernet addr 01 02 03 04 05 06
    mem write 0x90000000 c0 a8 00 00
    mem write 0x90000004 c0 a8 00 01
    mem write 0x90000008 55 44 33 22 11 02
    mem write 0x9000000e 55 44 33 22 11 03
10. Remote Command Utility
- client --help|-h --lvl|-l lvl --serv|-s url --cmd|-c cmd --kpa pa --type vt --cnt n --period n --inc v (an example invocation follows this list)
- --serv url : default url is http://localhost:8080/RPC2
- --cmd cmd : valid commands
- get_version : get the version of the currently running server
- read_mem : read kernel physical address space on the server
- write_mem : write to kernel physical address space on the server
- --kpa pa : kernel physical address to read/write
- --type vt : valType to read/write. Valid types:
- str : character string
- char : single character, element of a string
- dbl : double-precision float, native format
- int : signed integer, natural int for the platform
- dw8 : unsigned 8-byte integer, same as uint64_t
- dw4 : unsigned 4-byte integer, same as uint32_t
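For example, reading four 4-byte words from the server might look like the line below; the host name is made up for illustration, and I am assuming --cnt gives the element count:

    client --serv http://foghorn:8080/RPC2 --cmd read_mem --kpa 0x90000000 --type dw4 --cnt 4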
11. Software Location
- The cvs wu_arl repository (WUARL)
- WUARL/wusrc/
- wulib/ : general-purpose C library (logging, high-res timers, comm, etc). There is a user-space version (libwu.a) and a kernel-space version (libkwu.a).
- wuPP/ : general-purpose C++ library (messaging, exceptions, buffers, etc)
- Libraries: wulib
- cmd/ : core command interpreter library (scanning, parsing, evaluation)
- Libraries: libwu.a, libwupp.a
- WUARL/IXP_Projects/xscale/
- mem/ : configuration tool, adds memory operations to the command interpreter.
- Libraries: cmd, wuPP, wulib
- ixpLib/ : platform (7010) specific code for reading/writing memory. Opens a dev file to communicate with the lkm wudev or, in simulation mode, allocates a memory block to mimic the platform's environment (wudev interface, kpa validation).
- Libraries: libwu.a
- wudev/ : kernel loadable module; accepts commands from user space to manipulate kernel physical memory. Validates addresses and does limited operation filtering.
- Libraries: libkwu.a
12. Software Location
- Continued from the previous page: cvs wu_arl repository (WUARL)
- WUARL/IXP_Projects/xscale/
- xrpcLib/ : common processing of XML-RPC messages for the xscale control daemon. Useful for clients and servers alike. Defines wrappers around the xmlrpc-c library. The remote client code is also located in this directory (see the sketch after this list).
- Libraries: libwu.a, libwupp.a, libxmlrpc, ...
- ixpServer/ : the xscale control daemon. Uses the xmlrpc-c library for communication and ixpLib for platform-dependent operations. The xmlrpc-c library has a minimal http server that I use to support the web-based interface. Contains sample client code to test/verify the server.
- Libraries: libwu.a, libwupp.a, ixpLib, libxmlrpc, ...
- In a separate repository I use for courses and experimentation:
- MYSRC/src/
- misc/ : test code for the planetlab node environment. A simple client and server for allocating UDP ports and passing open sockets between vservers.
- Libraries: libwu.a
- sp/ : this is the sp code. I still consider it experimental so I have kept it in this directory. It will soon move into the WUARL/IXP_Projects/xscale/ directory.
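As a rough illustration of what a client built directly on xmlrpc-c looks like (not the xrpcLib wrappers themselves), the sketch below calls the get_version method listed on the Remote Command Utility slide using the library's plain C synchronous client. The URL matches the default above; that get_version takes no arguments and returns a string is an assumption for illustration:

    #include <stdio.h>
    #include <stdlib.h>
    #include <xmlrpc-c/base.h>
    #include <xmlrpc-c/client.h>

    int main(void)
    {
        xmlrpc_env env;
        xmlrpc_env_init(&env);

        /* xmlrpc-c's simple synchronous (global) client. */
        xmlrpc_client_init2(&env, XMLRPC_CLIENT_NO_FLAGS,
                            "xscale-demo-client", "0.1", NULL, 0);

        /* Server URL and no-argument method modeled on the slides. */
        xmlrpc_value *resultP = xmlrpc_client_call(
            &env, "http://localhost:8080/RPC2", "get_version", "()");

        if (!env.fault_occurred) {
            const char *version = NULL;
            xmlrpc_read_string(&env, resultP, &version);
            if (!env.fault_occurred) {
                printf("server version: %s\n", version);
                free((void *)version);
            }
            xmlrpc_DECREF(resultP);
        }

        xmlrpc_client_cleanup();
        xmlrpc_env_clean(&env);
        return 0;
    }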
13. Software Location
- xscale build environment
- For two reasons I have located the cross-development environment on my desktop computer: 1) I need root access to the filesystem, and 2) arlfss NFS mounts routinely time out, causing some compiles to take an entire day (the corresponding local compile takes a few hours).
- Root directory /opt/crosstool. The filesystem is backed up by cts.
- gcc version 3.3.X
- All open source code compiled for the xscale (and not modified by me) is located in /opt/crosstool/src.
- If modified, then I place it in cvs under the xscale directory.
- Reference xscale root filesystems: /opt/crosstool/rootfs.
- xscale control processors (foghorn and coffee) update their copies using rsync (an example is sketched below).
- On the CPs (foghorn and coffee) the xscale FS has a symbolic link at /xscale. Place files in /xscale/xxx
- On the xscales, the executables we currently use are kept in /root/bin
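A minimal example of the kind of rsync invocation used to refresh a control processor's copy of the reference root filesystem; "mydesktop" is a placeholder for the desktop holding /opt/crosstool, and the exact options actually used are not recorded here:

    rsync -a --delete mydesktop:/opt/crosstool/rootfs/ /xscale/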
14. Next Set of Tasks/Milestones
- Dynamic node configuration
- Instantiate new meta-routers
- Modify lookup tables (substrate and MR owned)
- Modify data-plane tables
- Integrate PlanetLab control and the PLC database
- Slice definitions and user authentication
- Slice interface to the control infrastructure: monitoring, table updates, communication between data-plane and control-plane.
- Exception/local delivery for the IPv4 router
- Implement on the GPE
- Control daemons for routing over the meta-net
- Signaling and accounting
- ARP support
- Phase 1: in the LC for tunnels
- Phase 2: generic ARP engine for MR use
- Node resource management
- Creating/initializing a new substrate node
- Allocating node resources for a new meta-router (there may be several meta-routers per meta-net on a node)
- Initializing and configuring allocated resources and enabling user access.
- Provide per meta-net/router accounting services (similar in spirit to PlanetLab's accounting)
15. Basic Slice Creation (no changes)
- Slice information is entered into the PLC database.
- Current: the Node manager polls the PLC for slice data.
- Planned: the PLC contacts the Node manager proactively.
- Node manager (pl_conf) periodically retrieves the slice table.
- Updates slice information
- Creates/deletes slices
- Node manager (nm) instantiates a new virtual machine (vserver) for the slice.
- User logs into the vserver using ssh
- Uses the existing PlanetLab mechanism on the GPE.
[Figure: GPE and NPE block diagram. The GPE root context holds the node manager (NM), resource manager (RM) and preallocated UDP ports (sys-sw vnet), plus per-slice contexts (slice X, new slice Y). An Ethernet switch (Eth1-Eth3) connects the GPE to the line card (NPE), whose TCAM lookup table maps filters to results (TUNX to Eth2/VLANX, default to Eth1/VLAN0). The default configuration forwards traffic to the (single) GPE, in this case the user's ssh login session.]
16. Requesting an NP
- User requests a shared NP
- Specify the code option
- Request a UDP port number for the overlay tunnel
- Request a local UDP port for exception traffic
- Substrate Resource Manager
- Configure SW: assign a local VLAN to the new meta-router. Enable the VLAN on the switch ports.
- Configure NPE: allocate an NP with the requested code option (the decision considers both current load and available options)
- Configure LC(s): allocate an externally visible UDP port number (from the preallocated pool of UDP ports for the external IP address). Add filter(s):
- Ingress: packet destination port maps to the local (chassis) VLAN and MAC destination address
- Egress: IP destination address (??) maps to the MAC destination address and RTM physical output port
- Configure GPE: open a local UDP port for exception and local delivery traffic from the NPE. Transfer the local port (socket) and results to the client slice
[Figure: GPE and NPE block diagram as before, now with slices X and Y. Exception and local delivery traffic only needs a filter installed in the TCAM (VLANY to the GPE). Meta-network traffic uses UDP tunnels and likewise only needs a TCAM filter on the line card (TUNX to Eth2/VLANX, TUNY to Eth2/VLANY, default to Eth1/VLAN0).]
17. Configure Ethernet Switch (Step 1)
- Allocate the next unused VLAN id for the meta-net.
- In this scenario, can a meta-net have multiple meta-routers instantiated on a node?
- If so, do we allocate switch bandwidth and a VLAN id for the meta-net or for each meta-router?
- Configure the Ethernet switch
- Enable the VLAN id on the applicable ports
- Need to know the line card to meta-port (i.e. IP tunnel) mappings
- If using an external GigE switch, use SNMP (python module pysnmp)
- If using the Radisys blade, use SNMP???
- Set default QoS parameters, which are???
- Other??
18. Configure NPE (Step 2)
- VLAN table
- Code option and instance number
- Memory for code options
- Instance base address, size and index/instance
- Each instance is given an instance number to use for indexing into a common code-option block of memory
- Each code option is assigned a block of memory
- Code option base address and size. Also the max number of instances that can be supported.
- Select the NPE to host the client MR
- Select eligible NPEs (those that have the requested code option)
- Select the best NPE based on current load and do what???
- Configure the NPE
- Add an entry to the SRAM table mapping VLAN/port to MR instance
- What does this table look like?
- Where is it?
- Allocate a memory block in SRAM for the MR.
- Where in SRAM are the eligible blocks located?
- How do I reference the block?
- Either 1) allocate memory for the code option at load time, or 2) allocate memory dynamically
- Allocate 3 counter blocks for the MR
19. Configure LC(s) (Step 3)
- User may request a specific UDP port number
- Open a UDP socket (on the GPE)
- Open the socket and bind it to the external IP address and UDP port number. This prevents other slices or the system from using the selected port (see the sketch below).
- Configure the line card to forward the tunnel(s) to the correct NPE and MR instance
- Add ingress and egress entries to the TCAM
- How do I know the IP-to-Ethernet destination address mapping for the egress filter?
- For both ingress and egress, allocate a QID and configure the QM with rate and threshold parameters for the MR.
- Do I need to allocate a Queue (whatever this means)?
- Need to keep track of qids (assign the qid when creating the instance, etc.)
- For egress I need to know the output physical port number. I may also need to know this for ingress (if we are using an external switch).
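A minimal sketch of the bind-to-reserve idea above; the function name is illustrative and error handling is trimmed:

    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    /* Reserve an externally visible UDP port by binding a socket to it.
     * As long as this socket stays open, no other slice or system service
     * can bind the same (address, port) pair. */
    int reserve_udp_port(const char *ext_ip, unsigned short port)
    {
        int fd = socket(AF_INET, SOCK_DGRAM, 0);
        if (fd < 0)
            return -1;

        struct sockaddr_in sa;
        memset(&sa, 0, sizeof(sa));
        sa.sin_family = AF_INET;
        sa.sin_port   = htons(port);
        if (inet_pton(AF_INET, ext_ip, &sa.sin_addr) != 1 ||
            bind(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
            close(fd);
            return -1;       /* bad address, or port already taken */
        }
        return fd;           /* caller keeps fd open to hold the port */
    }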
20. Configuring GPE (Step 4)
- Assign a local UDP port to the client for receiving exception and local delivery traffic.
- The user may request a specific port number.
- Use either a preallocated socket or open a new one.
- Use a UNIX domain socket to pass the socket back to the client along with the other results (see the sketch below).
- All traffic will use this UDP tunnel; this means the client must perform IP protocol processing of the encapsulated packet in user space.
- For exception traffic this makes sense.
- For local delivery traffic the client can use a tun/tap interface to send the packet back into the Linux kernel so it can perform more complicated processing (such as TCP connection management). Need to experiment with this.
- Should we assign a unique local IP address for each slice?
- The result of the shared-NPE allocation and the socket are sent back to the client.
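Passing an open socket between processes (here, from the root context to the client slice) is done with SCM_RIGHTS ancillary data over a UNIX domain socket. A minimal sketch of the sending side; the connected UNIX socket 'uds' and the function name are assumptions for illustration, not the misc/ code itself:

    #include <string.h>
    #include <sys/socket.h>
    #include <sys/uio.h>

    /* Send an already-open file descriptor (e.g. the reserved UDP socket)
     * to the peer on a connected UNIX domain socket 'uds'. */
    int send_fd(int uds, int fd_to_pass)
    {
        char dummy = 'x';                       /* must send at least 1 byte */
        struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };

        char ctrl[CMSG_SPACE(sizeof(int))];
        struct msghdr msg;
        memset(&msg, 0, sizeof(msg));
        msg.msg_iov = &iov;
        msg.msg_iovlen = 1;
        msg.msg_control = ctrl;
        msg.msg_controllen = sizeof(ctrl);

        struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
        cmsg->cmsg_level = SOL_SOCKET;
        cmsg->cmsg_type  = SCM_RIGHTS;
        cmsg->cmsg_len   = CMSG_LEN(sizeof(int));
        memcpy(CMSG_DATA(cmsg), &fd_to_pass, sizeof(int));

        return sendmsg(uds, &msg, 0) < 0 ? -1 : 0;
    }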
21. Run-Time Support for Clients
- Managing entries in the NPE TCAM (lookup)
- Add/remove entry
- List entries
- NPE statistics
- Allocate 2 blocks of counters: pre-queue and post-queue.
- Clear a block counter pair (Byte/Pkt) ???
- Get a block counter pair (Byte/Pkt)
- Specify block and index
- Get once, get periodic
- Get a counter group (Byte/Pkt)
- Specify a counter group as a set of (index, block) tuples
- SRAM read/write
- Read/write the MR instance-specific SRAM memory block
- Relative address and byte count; writes include the new value as a byte array.
- Line card: meta-interface packet counters, byte counters, rates and queue thresholds
- Get/set meta-interface rate/threshold
- Other
- Register next-hop nodes as the tuple (IPdst, ETHdst), where IPdst is the destination address in the IP packet and ETHdst is the corresponding Ethernet address.
- Can we assume the destination Ethernet address is always the same?
22. Boot-time Support
- Initialize GPE
- Initialize NPE
- Initialize LC
- Things to init:
- SPI switch
- memory
- microengine code download
- tables??
- default line card tables
- default code paths
- TCAM
23. IP Meta-Router Control
- All meta-net traffic arrives via a UDP tunnel using a local IP address.
- Raw IP packets must be handled in user space.
- Complete exception traffic processing in user space.
- Local delivery traffic: can we inject it into the Linux kernel so it performs the transport layer protocol processing? This would also allow applications to use the standard socket interface (see the tun/tap sketch below).
- Should we use two different IP tunnels, one for exception traffic and one for local delivery?
- Configuration responsibilities?
- Stats monitoring for the demo?
- Get counter values
- Support for traceroute and ping
- ONL-like monitoring tool
- Adding/removing routes
- Static routing tables, or do we run a routing protocol?
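Injecting local-delivery packets back into the kernel would use the standard Linux tun/tap driver: open /dev/net/tun, request a TUN interface, then write decapsulated IP packets to the descriptor. A minimal sketch of the experiment suggested above (the function and interface names are illustrative; nothing like this is implemented yet):

    #include <fcntl.h>
    #include <linux/if_tun.h>
    #include <net/if.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <unistd.h>

    /* Open a TUN device; raw IP packets written to the returned fd are
     * handed to the kernel IP stack as if they arrived on interface 'name'. */
    int open_tun(const char *name)
    {
        int fd = open("/dev/net/tun", O_RDWR);
        if (fd < 0)
            return -1;

        struct ifreq ifr;
        memset(&ifr, 0, sizeof(ifr));
        ifr.ifr_flags = IFF_TUN | IFF_NO_PI;   /* IP packets, no extra header */
        strncpy(ifr.ifr_name, name, IFNAMSIZ - 1);

        if (ioctl(fd, TUNSETIFF, &ifr) < 0) {
            close(fd);
            return -1;
        }
        return fd;   /* write() decapsulated packets from the UDP tunnel here */
    }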
24. IP Meta-Router
- The internal packet format has changed.
- See Jing's slides
- Redirect is not in this version of the meta-router
25. XScale Control Software
- Substrate interface
- Raw interface for reading/writing arbitrary memory locations.
- Substrate stats?
- Add a new meta-router
- Meta-router/slice interface
- All requests go through a local entity (managed)
- Not needed: authenticate the client
- Validate the request (verify memory location and operation)
- Node initialization
- ??
26. Virtual Networking: Basic Concepts
- Substrate links interconnect adjacent substrate routers
- A substrate router hosts one or more meta-router instances
- Meta-links interconnect adjacent meta-routers; they are defined within a substrate link context
- Substrate links may be tunneled within existing networks: IP, MPLS, etc.
27. Adding a Node
- Install the new substrate router
- Define meta-links between meta nodes (routers or hosts)
- Create substrate links between peers
- Instantiate meta-router(s)
28. System Components
- General-purpose processing engines (PE/GP)
- Shared: PlanetLab VM environment.
- Local PlanetLab node manager to configure and manage VMs
- vserver, vnet may change to support substrate functions
- Implement substrate functions in the kernel: rate control, mux/demux, substrate header processing
- Dedicated: no local substrate functions
- May choose to implement substrate header processing and rate control.
- The substrate uses VLANs to ensure isolation (VLAN = MRid)
- Can use 802.1Q priorities to isolate traffic further.
- NP blades (PE/NP)
- Shared: the user supplies parse and header-formatting code.
- Dedicated: the user has full access to and control over the hardware device
- General Meta-Processing Engine (MPE) notes
- Use the loopback to enforce rate limits between dedicated MPEs
- A legacy node is modeled as a dedicated MPE; use the loopback blade to remove/add substrate headers.
- Substrate links interconnect substrate nodes
- Meta-links are defined within their context.
- Assume an external entity configures end-to-end meta-nets and meta-links
29. Block Diagram of a Meta-Router
[Figure: a meta-router composed of Meta-Processing Engines (MPEk1, MPEk2, MPEk3) interconnected by a meta-switch, with meta-interfaces (MI 0-5) of various rates connected to meta-links. Control/management uses the Base channel (Control Net, IPv4).]
- MPEs are interconnected in the data plane by a meta-switch. Packets include Meta-Router and Meta-PE identifiers.
- Some substrate-detected errors or events are reported to the Meta-Router's control MPE.
- Meta-Processing Engines (MPE): virtual machine, COTS PC, NPU, FPGA. PEs differ in ease of programming and performance. An MR may use one or more PEs, possibly of different types.
- The first Meta-Processing Engine (MPE) assigned to Meta-Network MNetk is called MPEk1.
30. System Block Diagram
[Figure: chassis block diagram showing PE/NP and PE/GP blades, line cards (LC) with RTMs (10 x 1GbE), xscale control processors, NPU-A/NPU-B network processors, a TCAM, GbE interfaces (2x1GE), a loopback blade (maps VLANX to VLANY), a node server (user login accounts, node manager), and a shelf manager (I2C/IPMI). Blades connect to a fabric Ethernet switch (10Gbps, data path) and a base Ethernet switch (1Gbps, control).]
31. Top-Level View (exported) of the Node
- Exported node resource list (processing engines, substrate links):
- PE/GP (control, IPaddr) (platform, x86) (type, linux_vserver)
- PE/NP (control, IPaddr) (platform, IXP2800) (type, IXP_SHARED)
- S-Link (type, p2p) (peer, _Desc_) (BW, XGbps)
- PE/GP (control, IPaddr) (platform, x86) (type, dedicated)
- PE/NP (control, IPaddr) (platform, IXP2800) (type, IXP_DEDICATED)
- S-Link (type, p2p) (peer, XXX) (BW, XXGbps)
[Figure also shows the node server hosting substrate control, user login accounts and the node manager.]
32. Substrate: Enabling an MR
[Figure: Meta-Router MR1 for MNetk spanning several PEs, line cards, a loopback port and a host located within the node, all attached to the 10GbE fabric switch ports.]
- Allocate the control-plane MPE (required)
- Allocate the data-plane MPEs
- Enable VLANk on the fabric switch ports
- Enable control over the Base switch (IP-based)
- Update shared MPEs for MI and inter-MPE traffic
- Define meta-interface mappings
- Update the host with the local net gateway
- Use the loopback to define interfaces internal to the system node.
33. Block Diagram
[Figure: data-plane block diagram. Line cards use a lookup table to map each received packet to its MR and MI; shared and dedicated PEs (PE/NP, and PE/GP with a VMM/VM manager) host meta-routers MR1-MR5 (e.g. MR5/MI1) and connect through the fabric switch; a node server and the base switch (control) carry meta-net control traffic (e.g. meta-net5 control, slice/MN VMs, app-level services) to and from the Internet.]
- Each MR/MI pair is assigned its own rate-controlled queue.
- Meta-interfaces are rate controlled.
- Meta-net control and management functions (configure, stats, routing, etc.) communicate with the MR over a separate base switch.
34. Partitioning the Control Plane
- Substrate manager
- Initialization: discover system HW components and capabilities (blades, links, etc)
- Hides low-level implementation details
- Interacts with the shelf manager to reset boards or detect failures.
- Node manager
- Initialization: request the system resource list
- Operational: allocate resources to meta-networks (slice authorities?)
- Request the substrate to reset MPEs
- Substrate assumptions
- All MNets (slices) with a locally defined meta-router/service (sliver) have a control process to which the substrate can send exception packets and event notifications.
- Communication
- Out-of-band uses the Base interface and internal IP addresses
- In-band uses the data plane and MPE id.
- Notifications
- ARP errors, improperly formatted frames, interface down/up, etc.
- If a meta-link is a pass-through link then the Node manager is responsible for handling meta-net level error/event notifications, for example a link going down.
35. Initialization: Substrate Resource Discovery
- Creates a list of devices and their Ethernet addresses
- Network Processor (NP) blades
- Type: network-processor, Arch: ixp2800, Memory: 768MB (DRAM), Disk: 0, Rate: 5Gbps
- General Processor (GP) blades
- Type: linux-vserver, Arch: X, Memory: X, Disk: X, Rate: X
- Line Card blades
- Not exposed to the node manager; used to implement meta-interfaces
- Another entity creates substrate links to interconnect peer substrate nodes.
- Create a table mapping line card blades, physical links and Ethernet addresses.
- Internal representation
- Substrate device ID: <ID, SDid>
- If the device has a local control daemon: <Control, IP Address>
- Type Processing Engine (NP/GP):
- <Platform, (Dual IXP2800/Xeon???)>, <Memory, >, <Storage, >, <Clock, (1.4GHz???)>, <Fabric, 10GbE>, <Base, 1GbE>, ???
- Type Line Card:
- <Platform, Dual IXP2800>, <Ports, <Media, Ethernet>, <Rate, 1Gbps>>, ???
- Substrate Links:
- <Type, p2p>, <Peer, Ethernet Address>, <Rate Limit>, ...
- Meta-Link list: <MLid, MLI>, <MR, MRid>, ...
36. Initialization: Exported Resource Model
- List of available elements
- Attributes of interest?
- Platform: IXP2800, PowerPC, ARM, x86; Memory: DRAM/SRAM; Disk: XGB; Bandwidth: 5Gbps; VM_Type: linux-vserver, IXP_Shared, IXP_Dedicated, GP_Dedicated; Special: TCAM
- Network processor: NP-Shared, NP-Dedicated
- General purpose: GP-Shared (linux-vserver), GP-Dedicated
- Each element is assigned an IP address for control (internal control LAN)
- List of available substrate links
- Access networks (expect an Ethernet LAN interface): the substrate link is multi-access
- Attributes: Access (multi-access), Available Bandwidth, Legacy protocol(s) (i.e. IP), Link protocol (i.e. Ethernet), Substrate ARP implementation.
- Core interface: assume point-to-point, bandwidth controlled
- Attributes: Access, Substrate Bandwidth, Legacy protocol?
37. Instantiate a Router: Register Meta-Router (MR)
- Define the MR-specific Meta-Processing Engines (MPEs)
- Register MR ID MRidk with the substrate
- The substrate allocates VLANk and binds it to MRidk
- Request Meta-Processing Engines
- Shared or dedicated, NP or GP; if shared, then a relative allocation (rspec)
- Shared implies the internal implementation has support for substrate functions
- Dedicated w/substrate: the user implements the substrate functions.
- Dedicated no/substrate implies the substrate will remove any substrate headers from data packets before delivering them to the MPE. For legacy systems.
- Indicate if this MPE is to receive control events from the substrate (Control_MPE).
- The substrate returns an MPE id (MPid) and a control IP address (MPip) for each allocated MPE
- The substrate internally records the Ethernet address of the MPE and enables the VLAN on the applicable port
- The substrate assumes that any MPE may send data traffic to any other MPE
- An MPE specifies the target MPE rather than an MI when sending a packet.
38. Instantiate a Router: Register Meta-Router (MR)
- Create meta-interfaces (with BW constraints)
- Create meta-interfaces associated with external substrate links
- Request that a meta-interface id (MIid) be bound to substrate link x (SLx).
- We need to work out the details of how an SL is specified
- We need to work out the details of who assigns inbound versus outbound meta-link identifiers (when they are used). If it is the downstream node, then some entity (node manager?) reports the outgoing label. This node assigns the inbound label.
- Multi-access substrate/meta link: the node manager or meta-router control entity must configure the meta-interface for ARP. Set the local meta-address and send the destination address with each output data packet.
- The substrate updates tables to bind the MI to the receiving MPE (i.e. where the substrate sends received packets)
- Create meta-interfaces for delivery to internal devices (for example, legacy PlanetLab nodes)
- Create a meta-interface associated with an MPE (i.e. the end system)
39. Scenarios
- Shared PE/NP: send a request to the device controller on the XScale
- Allocate memory for the MR control block
- Allocate a microengine and load MR code for the Parser and Header Formatter
- Allocate meta-interfaces (output queues) and assign bandwidth constraints
- Dedicated PE/NP
- Notify the device control daemon that it will be a dedicated device. May require loading/booting a different image?
- Shared GP
- Use the existing/new PlanetLab framework
- Dedicated GP
- Legacy PlanetLab node
- Other
40. IPv4
- Create the default IPv4 Meta-Router, initially in the non-forwarding state.
- Register the MetaNet; output: Meta-Net ID MNid
- Instantiate the IPv4 router; output: Meta-Router ID MRid
- Add interfaces for legacy IPv4 traffic
- The substrate supports defining a default protocol handler (Meta-Router) for non-substrate traffic.
- For protocol = IPv4, send to the IPv4 meta-router (specify the corresponding MPE).
41. General Control/Management
- Meta-routers use the Base channel to send requests to the control entity on the associated MPE devices
- The node manager sends requests to a central substrate manager (xml-rpc?)
- Requests to configure, start/stop and tear down meta-routers (MPEs and MIs).
- The substrate enforces isolation and policies/monitors meta-router sending rates.
- Rate exceeded error: if an MPE violates its rate limits then its interface is disabled and the control MPE is notified (over the Base channel).
- Shared NP
- xscale daemon
- Requests: start/stop forwarding, allocate shared memory for tables, get/set statistic counters, set/alter the MR control lock, add/remove lookup table entries.
- Lookup entries can be added to send data packets to the control MPE; the packet header may contain a tag to indicate the reason the packet was sent
- Mechanism for allocating space for MR-specific code segments.
- Dedicated NP
- The MPE controls the XScale. When the XScale boots, a control daemon is told to load a specific image containing user code.
42. ARP for Access Networks
- The substrate offers an ARP service to meta-routers
- Meta-router responsibilities
- Before enabling an interface it must register the meta-network address associated with the meta-interface
- Send the destination (next-hop) meta-net address with packets (part of the substrate internal header). The substrate will ARP using this value.
- If the meta-router wants to use a multicast or broadcast address then it must also supply the link-layer destination address. So the substrate must also export the link-layer type.
- Substrate responsibilities
- All substrate nodes on an access network must agree on meta-net identifiers (MLIs)
- Issue ARP requests/responses using the supplied meta-net addresses and meta-net id (MLI).
- Maintain the ARP table and time out entries according to the relevant RFCs.
- ARP failed error: if ARP fails for a supplied address then the substrate must send the packet (or packet context) to the control MPE of the meta-router.