1
Control Update: Focus on PlanetLab integration and booting
  • Fred Kuhns
  • fredk@arl.wustl.edu
  • Applied Research Laboratory
  • Washington University in St. Louis

2
Documents
  • Control documentation
  • http://www.arl.wustl.edu/projects/techX/ppt/
  • This presentation
  • http://www.arl.wustl.edu/projects/techX/ppt/ControlUpdate.ppt
  • SRM interface
  • http://www.arl.wustl.edu/projects/techX/ppt/srm.ppt
  • RMP interface
  • http://www.arl.wustl.edu/projects/techX/ppt/rmp.ppt
  • SCD interface (ingress, egress and npe)
  • http://www.arl.wustl.edu/projects/techX/ppt/scd.ppt
  • Datapath documentation
  • http://www.arl.wustl.edu/projects/techX/design/SPP/
  • NAT overview (Interface??)
  • http://www.arl.wustl.edu/projects/techX/design/SPP/SPP_V1_NAT_design.ppt
  • FlowStats (Interface??)
  • http://www.arl.wustl.edu/projects/techX/design/SPP/FlowStats_Control.ppt

3
Traditional View of a PlanetLab Node
  • Linux OS, vserver
  • System services
  • pl_netflow
  • sirius: brokerage service
  • stork: environmental service
  • CoMon: monitoring and discovery
  • Resource model
  • focused on PCs with single device instances (CPU, NIC)
  • standard Linux/UNIX tools to measure utilization
  • homogeneous environment with a single VMM managing all VM instances on a platform
  • local node manager interface reached through the loopback interface
  • User requests a slice on a set of distributed nodes
  • assigned a VM instance on each node
  • Fedora Linux environment
  • per-slice flowstats
[Diagram: a traditional PlanetLab node - the Node Manager (root VM), system service VMs and user VMs (VM1 ... VMN) run over a Virtual Machine Monitor (VMM) on a general-purpose PC (host.domain, IP address A.B.C.D) attached to the Internet; PLC records the node's site, owner, model, ssh_host_key and groups.]
4
An SPP Node
[Diagram: an SPP node - PLC still sees a single host (spp_host.domain, IP address A.B.C.D) with a site, owner, model, ssh_host_key and groups, but internally the node contains a CP, two GPEs (each with its own Node Manager, system services and user VMs over a VMM on a general-purpose PC), NPEs and a Line Card. A HUB interconnects the boards with a 1GbE control (base) switch and a 10GbE data (fabric) switch. The Line Card holds the forwarding DB/filters for the datapath, NAT, and per-VM fast paths (vm1/fast path1, vmX/fast path1, vm1/fast path2, ...) behind the external interface to the Internet.]
5
Challenges
  • Provide the standard PlanetLab slice environment
  • configure and boot individual GPEs with the standard PlanetLab software, supporting the standard operational environment
  • Support the standard interfaces
  • boot manager
  • node manager's internal and external interfaces
  • resource monitoring
  • Create an interface for allocating and managing fast paths
  • allocate/free NPE resources
  • manage meta-interface mappings to the externally visible IP address and UDP port
  • slice control of allocated fastpath resources

6
SPP Node
[Diagram: SPP node internals and external interfaces - the CP runs the PXE, dhcpd and tftp servers and an httpd serving the boot files (dhcpd.conf, ethers, tftpboot: bootcd.img, overlay_gpeX.img, pxelinux.0, pxelinux.cfg/C0A82031, C0A82041, overlay.img, plnode.txt, plc_config, spp_conf.txt, spp_netinit.py, server certs under /var/www/), an xmlrpc PLCAPI proxy, FlowStats (flowDB, sliceDB), ntpd, user info/home dirs, and the node, slice and resource DBs plus nodeconf.xml. The GPEs run pl_netflow, vnet, ntp and the user slivers. Each NPE has two NPUs (NPU-A, NPU-B) with XScales, a TCAM, and SPI/PCI interconnects. All boards attach to the Hub's Base Ethernet switch (1 Gbps, control) and Fabric Ethernet switch (10 Gbps, data path); the Line Card RTM provides the external interfaces IP1 ... IPN (10x1G/1x10G); the shelf manager is reached over I2C (IPMI).]
7
Software Components
  • Control Processor (CP)
  • Boot and Configuration Control (BCC): node configuration, management and local state management (DB)
  • httpd, dhcpd, tftp and PXE server for GPE and NPE boards; maintains config files
  • Boot CD and distribution file management (overlay images, RPM and tar files) for the GPEs and CP
  • PLCAPI proxy (plc_api) and system-level BootManager (part of gnm)
  • System Resource Manager (SRM): centralized resource management
  • responsible for all resource allocation decisions and for maintaining dynamic system state
  • delegates local operations to individual board-level managers
  • System Node Manager (SNM, aka GNM): top half of the PlanetLab node manager
  • Slice login manager (SLM) and ssh forwarding (modified sshd) -- Ritun
  • Flow Statistics (FS): aggregates pl_netflow data and translates NAT records
  • Sets default (static) routes in the line card
  • What about dynamic route management (BGP/OSPF/RIP)? For now assume a single next-hop router for all routes.
  • General-purpose Processing Element (GPE)
  • Local Boot Manager (LBM): modified PlanetLab BootManager running on the GPEs
  • Resource Manager Proxy (RMP)
  • Node Manager Proxy (NMP): lower half of PlanetLab's node manager
  • Network Processor Element (NPE)
  • Substrate Control Daemon (SCD)

8
Boot and Configuration Control
  • Read the node configuration DB (currently an xml file)
  • Allocate IP subnets and addresses for all boards
  • Assign external IP addresses to the GPE fabric interfaces with the default VLAN id
  • Create a per-GPE configuration DB (currently written to files)
  • Create the dhcp configuration file and start dhcpd, httpd and the system sshd (a sketch of the dhcpd step follows this list)
  • assigns control IP subnets and addresses; assigns the internal substrate IP subnet on the fabric Ethernet
  • Start the PLCAPI proxy (plc_api) server and the system node manager
  • read the node DB for initialization data (currently uses static configuration data and/or re-reads the xml file)
  • Create GPE overlay images (currently done manually)
  • Currently the SNM is split between the plc_api server and the srm, because we do not have a DB and did not want to implement a transaction-like interface for the snm.
  • begin periodic slice updates and gpe assignments, maintain the DB
  • Start the SRM and bring up boards as they report in
  • Initialize the Line Card to forward default traffic (i.e. ssh and icmp) to the CP
  • Initialize the Hub base and fabric switches; initialize any switches not within the chassis
  • Start the SLM and the ssh daemon
  • Remove the SLM configuration file for slices, since it may contain old mappings
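As a rough illustration of the dhcp configuration step above, here is a minimal sketch that emits dhcpd.conf host stanzas for board control interfaces. The boards list, its field names and the helper are hypothetical stand-ins; the real BCC derives this information from the node configuration DB.

  # Sketch only: emit dhcpd.conf host stanzas for GPE/NPE control interfaces.
  # The `boards` list and its field names are hypothetical; the actual BCC
  # reads them from the node configuration DB (currently an xml file).
  boards = [
      {"name": "gpe1", "hwaddr": "00:0e:0c:85:e4:3e", "ipaddr": "192.168.32.65"},
      {"name": "gpe2", "hwaddr": "00:0e:0c:85:e6:06", "ipaddr": "192.168.32.49"},
  ]

  def dhcpd_host_entries(boards):
      """Return dhcpd.conf host blocks, one per board control interface."""
      blocks = []
      for b in boards:
          blocks.append(
              'host %(name)s {\n'
              '  hardware ethernet %(hwaddr)s;\n'
              '  fixed-address %(ipaddr)s;\n'
              '  filename "pxelinux.0";\n'
              '}\n' % b)
      return "".join(blocks)

  if __name__ == "__main__":
      print(dhcpd_host_entries(boards))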

9
Booting SPP1 Example Configuration
[Diagram: SPP1 example boot configuration - the CP (drn05.arl.wustl.edu, 128.252.153.209) serves /tftpboot (ramdisk.gz, zImage.ppm10, bootcd.img, overlay_gpe1.img, overlay_gpe2.img, pxelinux.0, pxelinux.cfg/C0A82031 and C0A82041) and /var/www/html/boot (index.html, bootmanager.sh, bootstrapfs-planetlab-i386.tar.bz2); the Hub (192.168.32.17) holds dhcpd.conf and ethers. The Line Card is in slot 6, the NPE in slot 5, GPE1 in slot 4 and GPE2 in slot 3. Control (base) addresses are on 192.168.32.0/20 and internal data (fabric) addresses on 172.16.1.0/26; external traffic rides VLAN 2 through the Line Card to the ARL network, with the CP doing IP routing and proxy ARP for drn05, the gateway at 128.252.153.31, and myPLC on drn06.arl.wustl.edu.]
10
Example Configuration, SPP3
[Diagram: SPP3 example configuration - same structure as SPP1, but the CP is cp5.arl.wustl.edu and the node's external name is spp3.arl.wustl.edu (128.252.153.3). Control (base) addresses are on 192.168.0.0/20 and internal data (fabric) addresses on 172.16.1.0/26. GPE1 is in slot 3, GPE2 in slot 4, the NPE in slot 5 and the Line Card in slot 6; external traffic again uses VLAN 2 to the ARL network and myPLC runs on drn06.arl.wustl.edu.]
11
bootcd file system
/  bin/  dev/  home/  lib/  ...  etc/init.d/{ pl_boot, pl_netinit, pl_validateconf, pl_sysinit, pl_hwinit, ... }  ...  root/  selinux/  sys/  usr/
  • pl_boot: modified to not use SSL or PGP when retrieving the BootManager script from the CP
  • pl_netinit: sets boot_server to reference the CP
  • pl_validateconf: added SPP-specific variables

12
overlay image
/  etc/{issue, passwd}  kargs.txt  pl_version  usr/isolinux
boot/    spp_netinit.py  ethers  spp_conf.txt  boot_server  boot_server_port  boot_server_path  plnode.txt  cacert.pem  plc_config  pubring.gpg
backup/  boot_server  boot_server_path  boot_server_port  cacert.pem  pubring.gpg
bootme/  BOOTPORT  BOOTSERVER  BOOTSERVER_IP  ID  cacert/drn06.arl.wustl.edu/cacert.pem
  • Changed to list the CP as the boot server and the port as 81
  • Added the SPP initialization script and config files
  • Changed plnode.txt to list this GPE's MAC address for the control interface (see the sketch after this list)
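A small sketch of the plnode.txt change mentioned above. It assumes plnode.txt uses shell-style KEY="value" lines with a NET_DEVICE key holding the primary interface MAC, which matches the stock PlanetLab format as I understand it; the key name and file handling here are assumptions, not the actual SPP edit.

  # Sketch only: rewrite the interface MAC in a plnode.txt-style file.
  # Assumes shell-style KEY="value" lines with a NET_DEVICE key; the exact
  # key names used on the SPP overlay are not shown in these slides.
  import re

  def set_net_device(path, mac):
      with open(path) as f:
          text = f.read()
      new_line = 'NET_DEVICE="%s"' % mac
      if re.search(r'^NET_DEVICE=.*$', text, flags=re.M):
          # Replace the existing NET_DEVICE line.
          text = re.sub(r'^NET_DEVICE=.*$', new_line, text, flags=re.M)
      else:
          # Or append one if it is missing.
          text += "\n" + new_line + "\n"
      with open(path, "w") as f:
          f.write(text)

  # Example: GPE1's base (control) interface MAC from the ethers listing.
  # set_net_device("plnode.txt", "00:0e:0c:85:e4:3e")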

13
GPE Configuration file spp_conf.txt
Config name: spp1.txt

nserv:
  ctrl_ipaddr=192.168.32.1   ctrl_hwaddr=001EC9FE7622
  data_ipaddr=172.16.1.1     data_hwaddr=001EC9FE7623

domain:
  hostname=drn05  domain=arl.wustl.edu
  dns1=128.252.133.45  dns2=128.252.120.1  gateway=128.252.153.31

hosts:
  nserv_f1.0  172.16.1.1     nserv       192.168.32.1   nserv_gbl  192.168.48.1
  shmgr       192.168.48.2   hub         192.168.32.17  hub1_f1.0  172.16.1.2
  hub1_m.0    192.168.48.17  gpe1_f1.0   172.16.1.3     gpe1_f1.1  172.16.1.65
  gpe1_b1.0   192.168.32.65  gpe2_f1.0   172.16.1.4     gpe2_f1.1  172.16.1.66
  gpe2_b1.0   192.168.32.49  npe1_f1.0   172.16.1.5     npe1_b1.0  192.168.32.81
  npe1_m.0    192.168.48.81  npe1_b1.1   192.168.32.82  lc_f1.0    172.16.1.6
  lc_b1.0     192.168.32.97  lc_m.0      192.168.48.97  lc_b1.1    192.168.32.98
  drn05.arl.wustl.edu  128.252.153.209

iface __name__=eth0    dev=eth0    name=gpe1_f1.0  hwaddr=000e0c85e440  type=data     lanid=fabric1  port=0  vlan=0
      ipaddr=172.16.1.3       ipnet=172.16.1.0    ipbcast=172.16.1.63      ipmask=255.255.255.192  arp=no   enable=yes
iface __name__=eth0.2  dev=eth0.2  name=gpe1_f1.0  hwaddr=000e0c85e440  type=data     lanid=fabric1  port=0  vlan=2
      ipaddr=128.252.153.209  ipnet=128.252.0.0   ipbcast=128.252.255.255  ipmask=255.255.0.0      arp=no   enable=yes
iface __name__=eth1    dev=eth1    name=gpe1_f1.1  hwaddr=000e0c85e442  type=data     lanid=fabric1  port=1  vlan=0
      ipaddr=172.16.1.65      ipnet=172.16.1.64   ipbcast=172.16.1.127     ipmask=255.255.255.192  arp=no   enable=yes
iface __name__=eth2    dev=eth2    name=gpe1_b1.0  hwaddr=000e0c85e43e  type=control  lanid=base1    port=0  vlan=0
      ipaddr=192.168.32.65    ipnet=192.168.32.0  ipbcast=192.168.39.255   ipmask=255.255.248.0    arp=yes  enable=yes
14
ethers
------------------------------------------------
---------------------- Board Type cp, Name cp1,
Slot 0 nserv_f1.0 fabric1/0 001EC9FE7
623 172.16.1.1 nserv
base1/0 001EC9FE7622 192.168.32.1
nserv_gbl maint/0 001018320076
192.168.48.1 -----------------------------------
----------------------------------- Board Type
shmgr, Name shmgr1, Slot 0 shmgr
maint/0 0050C23FD274 192.168.48.2
--------------------------------------------------
-------------------- Board Type hub, Name hub1,
Slot 1 hub base1/0 0000503D10
6B 192.168.32.17 hub1_f1.0
fabric1/0 0000503D10B0 172.16.1.2
hub1_m.0 maint/0 0000503D106C
192.168.48.17 ----------------------------------
------------------------------------ Board Type
gpe, Name gpe1, Slot 4 gpe1_f1.0
fabric1/0 000e0c85e440 172.16.1.3
gpe1_f1.1 fabric1/1 000e0c85e442
172.16.1.65 gpe1_b1.0
base1/0 000e0c85e43e 192.168.32.65
--------------------------------------------------
--------------------
------------------------------------------------
---------------------- Board Type gpe, Name
gpe2, Slot 3 gpe2_f1.0
fabric1/0 000E0C85E608 172.16.1.4
gpe2_f1.1 fabric1/1 000E0C85E60A
172.16.1.66 gpe2_b1.0
base1/0 000E0C85E606 192.168.32.49
--------------------------------------------------
-------------------- Board Type npe, Name npe1,
Slot 5 npe1_f1.0 fabric1/0 000000000
000 172.16.1.5 npe1_b1.0
base1/0 0000503d073e 192.168.32.81
npe1_m.0 maint/0 0000503D073C
192.168.48.81 npe1_b1.1
base1/1 0000503D073D 192.168.32.82
--------------------------------------------------
-------------------- Board Type lc, Name lc1,
Slot 6 lc_f1.0 fabric1/0 0000503d0
bd4 172.16.1.6 lc_b1.0
base1/0 0000503D0826 192.168.32.97
lc_m.0 maint/0 0000503D0824
192.168.48.97 lc_b1.1
base1/1 0000503D0825 192.168.32.98
--------------------------------------------------
-------------------- Gateway for drn05
(128.252.153.209), VLAN 2 0000503d0bd4
128.252.153.31
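Slide 18 notes that this ethers table supplies the MAC addresses used for static ARP entries on the GPEs. The sketch below shows one way such a table could be turned into static neighbor entries; the colon-less MAC format follows the listing above, but the use of ip neigh and the helper names are illustrative assumptions, not the actual SPP tooling.

  # Sketch: turn ethers-style "name lan/port mac ip" rows into static ARP entries.
  # Uses `ip neigh replace`; the real SPP scripts may use arp(8) or ioctls instead.
  import subprocess

  def colonize(mac):
      """Insert colons into a 12-hex-digit MAC such as 000e0c85e440."""
      mac = mac.lower()
      return ":".join(mac[i:i + 2] for i in range(0, 12, 2))

  def add_static_arp(entries, dev):
      for name, lan, mac, ip in entries:
          subprocess.check_call([
              "ip", "neigh", "replace", ip,
              "lladdr", colonize(mac), "dev", dev, "nud", "permanent"])

  # Example rows taken from the listing above (GPE1's fabric and base interfaces).
  entries = [
      ("gpe1_f1.0", "fabric1/0", "000e0c85e440", "172.16.1.3"),
      ("gpe1_b1.0", "base1/0",   "000e0c85e43e", "192.168.32.65"),
  ]
  # add_static_arp(entries, dev="eth0")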
15
BootAPI calls made by the BootManager
  • PLCAPI/BootAPI calls (a client-side sketch follows this list)
  • GetSession(node_id, auth, node_ip) - returns a new session key for the node
  • BootCheckAuthentication(Session) - returns true if the Session id is valid
  • GetNodes(Session, node_id, nodegroup_ids, nodenetwork_ids, model, site_id) - returns the indicated parameters for this node (i.e. node_id)
  • GetNodeNetworks(Session, node_id, nodenetwork_ids) - returns the list of interfaces: broadcast, network, ip, dns1, dns2, hostname, netmask, gateway, nodenetwork_id, method, mac, node_id, is_primary, type, bwlimit, nodenetwork_settings_ids
  • GetNodes(Session, node_id, nodegroup_ids) - returns the list of group ids associated with this node
  • GetNodeGroups(Session, nodegroup_id, name) - returns the name string for each node group (in our case SPP)
  • GetNodeNetworkSettings()
  • BootUpdateNode(Session, boot_state) - sets the node's boot state at PLC
  • BootNotifyOwners(Session, event, params) - causes email to be sent to the list of node owners
  • BootUpdateNode(Session, ssh_host_key) - records the latest ssh public key for the node
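For illustration only, here is a client-side sketch of the first few BootAPI calls above, issued over XML-RPC against the CP's PLCAPI proxy. The proxy URL, the port and the auth dictionary layout are assumptions; the call names and argument order follow the list above.

  # Sketch only: BootManager-style PLCAPI/BootAPI calls over XML-RPC.
  # The proxy URL, port 81 and the auth dictionary layout are assumptions.
  import xmlrpc.client

  plc = xmlrpc.client.ServerProxy("http://cp:81/PLCAPI/", allow_none=True)

  node_id = 1                                                # from plnode.txt (placeholder value)
  auth = {"AuthMethod": "hmac", "node_id": node_id, "value": "..."}  # placeholder

  session = plc.GetSession(node_id, auth, "192.168.32.65")   # returns a session key
  if plc.BootCheckAuthentication(session):
      # Fetch this node's boot_state, nodegroup_ids, model and site_id.
      node = plc.GetNodes(session, node_id,
                          ["boot_state", "nodegroup_ids", "model", "site_id"])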

16
Other PLC/Server interactions
  • HTTP/HTTPS
  • Upload alpina boot logs: BOOT_SERVER_URL/alpina-logs/upload.php
  • Compatibility step (we don't use it): BOOT_SERVER_URL/alpina-BootLVM.tar.gz, BOOT_SERVER_URL/alpina-PartDisk.tar.gz
  • Download the file system tar file containing the basic plab node environment (a download sketch follows this list): BOOT_SERVER_URL/boot/bootstrapfs-group-arch.tar.bz2
  • If not in the config file, get the node id: BOOT_SERVER_URL/boot/getnodeid.php
  • Get the yum update configuration file: BOOT_SERVER_URL/PlanetLabConf/yum.conf.php
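A minimal sketch of the bootstrap filesystem download; the URL pattern follows the bullet above, with BOOT_SERVER_URL, group and arch as placeholder values.

  # Sketch: fetch the bootstrap filesystem tarball from the boot server over HTTP.
  # BOOT_SERVER_URL and the group/arch values are placeholders.
  import shutil
  import urllib.request

  BOOT_SERVER_URL = "http://cp:81"          # the CP acts as boot server (port 81)
  group, arch = "planetlab", "i386"

  url = "%s/boot/bootstrapfs-%s-%s.tar.bz2" % (BOOT_SERVER_URL, group, arch)
  with urllib.request.urlopen(url) as resp, open("bootstrapfs.tar.bz2", "wb") as out:
      shutil.copyfileobj(resp, out)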

17
System Initialization Stage 1
  • Use PXE boot and download pxelinux and its config file
  • boot using a basic initial ramdisk, overlay and kernel
  • Use the dhcp, tftp and pxe server on the cp; files are stored in the tftpboot directory: pxelinux.0, pxelinux.cfg/<GPE_IPADDR>, bootcd.img, overlay_gpeX.img, kernel (the config-file naming is sketched after this list)
  • The overlay image is modified for each GPE to include its configuration file, modified planetlab config files and an spp node python script.
  • Currently this is a manual step, but the ultimate (long-term) plan is for the gnm daemon to create the individual images
  • The overlay image contains several files that identify the node and provide the name and address of the PLC and Boot servers. I have modified these to point to the cp.
  • Just before booting the final kernel I change these values to refer to the real plc/api servers.
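The pxelinux.cfg file names used here (C0A82031, C0A82041) are simply the GPE control IP addresses written as eight upper-case hex digits, which is the standard PXELINUX per-client lookup rule. A one-line helper makes the mapping explicit:

  # Sketch: derive the pxelinux.cfg/ file name for a GPE from its control IP.
  def pxelinux_cfg_name(ip):
      return "".join("%02X" % int(octet) for octet in ip.split("."))

  assert pxelinux_cfg_name("192.168.32.49") == "C0A82031"   # gpe2_ctrl
  assert pxelinux_cfg_name("192.168.32.65") == "C0A82041"   # gpe1_ctrl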

18
System initialization Stage 2
  • Boot into a basic, intermediate environment
  • Initial configuration information is obtained from the overlay image
  • Includes spp_conf.txt - defines the gpe interfaces
  • Includes the ethers file - contains mac addresses for static arp entries
  • Updated plnode.txt with the GPE's control interface mac address
  • Modified bootserver files listing the cp as the bootserver
  • Includes spp_netinit.py, a python script to configure the interfaces and update system configuration files (sketched after this list).
  • Enables the primary interface and key network configuration files such as resolv.conf
  • Downloads the BootManager source from the boot_server
  • In our case we download from the CP
  • I explicitly disable the use of ssl and certs (the certificates on the overlay image are for the PLC server, not the CP)
  • Our assumption is that the control (base) network is secure; plus, within an SPP node we don't have to worry about authentication issues.
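A rough sketch of the kind of ifcfg file spp_netinit.py has to produce for each iface record. The dictionary below hand-copies the gpe1_b1.0 entry from spp_conf.txt; the function body is illustrative only, and the real script also handles VLAN sub-interfaces, hosts and resolv.conf.

  # Sketch only: write an ifcfg-<dev> file for one interface record.
  # The dictionary keys mirror the spp_conf.txt fields on slide 13.
  iface = {
      "dev": "eth2", "name": "gpe1_b1.0",
      "hwaddr": "00:0e:0c:85:e4:3e",
      "ipaddr": "192.168.32.65", "ipmask": "255.255.248.0",
      "ipbcast": "192.168.39.255", "enable": "yes",
  }

  def write_ifcfg(iface, dest_dir="/etc/sysconfig/network-scripts"):
      path = "%s/ifcfg-%s" % (dest_dir, iface["dev"])
      lines = [
          "DEVICE=%s" % iface["dev"],
          "HWADDR=%s" % iface["hwaddr"],
          "BOOTPROTO=static",
          "IPADDR=%s" % iface["ipaddr"],
          "NETMASK=%s" % iface["ipmask"],
          "BROADCAST=%s" % iface["ipbcast"],
          "ONBOOT=%s" % ("yes" if iface["enable"] == "yes" else "no"),
      ]
      with open(path, "w") as f:
          f.write("\n".join(lines) + "\n")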

19
BootManager
  • Opens a connection to the PLCAPI on the bootserver
  • Opens a connection to our proxy plcapi/bootapi server running on the CP
  • Gets the node session key: GetSession(node_id, auth, node_ip)
  • Since each call to create a session invalidates any existing keys, we intercept this call on the cp and use a common session key for all gpes (sketched after this list).
  • Determines the node's configuration
  • reads plnode.txt for node_id, node_key and the primary interface settings
  • we use DHCP to configure the control interface but I do not define a dns server
  • if node_id is not found, reads URL BootServer/boot/getnodeid.php
  • Calls BootCheckAuthentication(Session) to verify the session key
  • Calls GetNodes to get the boot_state, node_groups, model and site_id
  • Calls GetNodeNetworks to get configuration information for all interfaces
  • in our case the call would return the externally visible network parameters, which differ from how each GPE is configured
  • long term, we can intercept this call and return GPE-specific interface config info.
  • Short term we use a configuration file in the overlay image with similarly formatted information. I have replaced the BootManager code that reads the config info and configures the interfaces.
  • I had to add support for VLANs and our internal interfaces.
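A minimal sketch of the GetSession intercept described above: the CP-side proxy creates one real session and hands the same key to every GPE, forwarding all other calls to PLC. The server framework, port and forwarding details are assumptions, not the actual plc_api proxy code.

  # Sketch: CP-side proxy that answers GetSession with a shared key and forwards
  # everything else to the real PLCAPI. Port, URL and threading details assumed.
  from xmlrpc.server import SimpleXMLRPCServer
  import xmlrpc.client

  REAL_PLC = xmlrpc.client.ServerProxy("https://plc/PLCAPI/", allow_none=True)
  SHARED_SESSION = None  # one session key shared by all GPEs on this node

  class ProxyAPI:
      def GetSession(self, node_id, auth, node_ip):
          global SHARED_SESSION
          if SHARED_SESSION is None:
              # First GPE to boot creates the real session; later GPEs reuse it.
              SHARED_SESSION = REAL_PLC.GetSession(node_id, auth, node_ip)
          return SHARED_SESSION

      def _dispatch(self, method, params):
          if method == "GetSession":
              return self.GetSession(*params)
          # All other BootAPI calls pass straight through to PLC.
          return getattr(REAL_PLC, method)(*params)

  server = SimpleXMLRPCServer(("0.0.0.0", 81), allow_none=True)
  server.register_instance(ProxyAPI())
  # server.serve_forever()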

20
BootManager Continued
  • Download the node's final filesystem image from the boot_server
  • in our case this is the CP: http://CP/boot/bootstrapfs-planetlab-i386.tar.bz2
  • Download the yum config file
  • I am not currently downloading http://CP/PlanetLabConf/yum.conf
  • Call BootUpdateNode with the new boot_state
  • we will need to intercept this call and both report and set the node state based on all GPEs.
  • Call BootNotifyOwners with the new state
  • forward to PLC
  • Update the network configuration in the new sysimg
  • downloads the BootServer's /PlanetLabConf/plc_config file
  • in our case I have copied it onto the overlay image in the /usr/boot directory.
  • calls GetNodeNetworkSettings for a list of any additional interface attributes, then creates the various configuration files: hosts, resolv.conf, network, ifcfg-eth*
  • I have replaced this step with our own script spp_netinit.py and configuration file spp_conf.txt, which I use to create the same config files in both the current environment and the new sysimg.
  • updates devices and creates the initrd image used for the next stage
  • finally boots the new kernel using the bootstrap file system

21
Boot States
  • The list of boot states is changing as I write this
  • In our version of the plc the states are shown in the table below

State     Next state                                  Description
new       install verified -> rins; error -> dbg      new install; verify install with user
inst      install verified -> rins; error -> dbg      install; same as new
rins      success -> boot; error -> dbg               reinstall; reformat disk and reinstall all software and files
boot      boot; error -> dbg                          boot using existing partitions
dbg       success: same as boot; fail: bootcd image   debug; boot node
diag      user controlled                             diagnostics; bootcd image
disable   user controlled                             disable; bootcd image
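For reference, the same table restated as a transition map (a sketch only, since the state list is still changing):

  # The boot-state table above as a Python transition map (sketch).
  BOOT_STATES = {
      "new":     {"install verified": "rins", "error": "dbg"},
      "inst":    {"install verified": "rins", "error": "dbg"},
      "rins":    {"success": "boot", "error": "dbg"},
      "boot":    {"success": "boot", "error": "dbg"},
      "dbg":     {"success": "boot", "fail": "bootcd image"},
      "diag":    {},   # user controlled
      "disable": {},   # user controlled
  }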
22
PLC Database
  • The PlanetLab central database describes all nodes, slices and users/people.
  • The slice database keeps track of all slices and their node bindings
  • The node database includes externally visible properties and the ability to associate general attributes with these properties:
  • the current (or next) node state (boot_state)
  • node identifier (node_id)
  • list of interface configuration parameters
  • ip address information, mac address, generic list of attributes
  • the node's owner
  • the node's site identifier (site_id)
  • model, which can be used to specify a set of attributes for the node, for example minhw, smp
  • current ssh host key (ssh_host_key)
  • node groups - I believe this is being deprecated in favor of associating a generic set of attributes with a node or its interfaces.

23
SPP Specific Information
  • On an SPP node the resource manager needs to know what kind of board is inserted in each slot and its I/O characteristics
  • It needs to associate interface MAC addresses with boards and interfaces, or with a standalone system connected to an RTM or front panel (for example the CP).
  • It also needs to know which interfaces are connected to the base switch and which to the fabric switch when bringing up general-purpose systems.
  • There is no convenient mechanism for determining this at run time, so I use a configuration file.
  • It also needs to know what resources are available on each board and the allocation policies.
  • It must also have a list of external links, their addresses and the address of any peers (Ethernet).
  • We need to keep track of the current node state (as kept by PLC) as well as the state of each individual board.
  • We need to share state between different daemons

24
Node Configuration File
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<spp>
  <code_options>
    <IPv4 sram="fixed" queues="variable" id="0" fltrs="variable"> <sram> 1024 </sram> </IPv4>
    <I3 sram="fixed" queues="variable" id="1" fltrs="variable"> <sram> 1024 </sram> </I3>
  </code_options>
  <components>
    <cp name="cp1" slot="0" cat="host" alias="nserv">
      <interface name="nserv_f1.0" dev="GigE" lanid="fabric1" assoc="" port="0"> ... </interface> ...
    </cp>
    <shmgr name="shmgr1" slot="0" cat="atca" alias="shmgr1">
      <interface name="shmgr" dev="GigE" lanid="maint" assoc="" port="0"> ... </interface> ...
    </shmgr>
    <hub name="hub1" slot="1" cat="atca" alias="hub1">
      <switch lanid="base1"> </switch>
      <switch lanid="fabric1"> <bw> 10000000000 </bw> </switch>
      <interface name="hub" dev="GigE" lanid="base1" assoc="" port="0"> ... </interface> ...
    </hub>
    <gpe name="gpe1" slot="4" cat="atca" alias="gpe1">
      <interface name="gpe1_f1.0" dev="GigE" lanid="fabric1" assoc="" port="0"> ... </interface> ...
    </gpe>
    <npe name="npe1" slot="5" cat="atca" alias="npe1">
      <product> Radisys_7010 </product> <model> NPEv1 </model>
      <interface name="npe1_f1.0" dev="10GigE" lanid="fabric1" assoc="" port="0"> ... </interface> ...
    </npe>
    <lc name="lc1" slot="6" cat="atca" alias="lc">
      <product> Radisys_7010 </product> <model> LCv1 </model>
      <interface name="lc_f1.0" dev="10GigE" lanid="fabric1" assoc="" port="0"> ... </interface> ...
      <interface name="drn05" dev="GigE" lanid="external" port="0">
        ... <link peering="true" primary="true" dev="GigE"> ... </link> ...
      </interface>
    </lc>
  </components>
</spp>
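To illustrate how a daemon such as the BCC or SRM might consume this file, here is a small parsing sketch using ElementTree. It assumes the element and attribute names shown in the reconstruction above and the nodeconf.xml file name from slide 6; the real parsing code is not shown in these slides.

  # Sketch: enumerate boards and interfaces from the node configuration XML.
  import xml.etree.ElementTree as ET

  tree = ET.parse("nodeconf.xml")          # file name per slide 6
  for board in tree.find("components"):
      print(board.tag, board.get("name"), "slot", board.get("slot"))
      for iface in board.findall("interface"):
          print("  ", iface.get("name"), iface.get("dev"),
                iface.get("lanid"), "port", iface.get("port"))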
25
CP Record
<!-- Interface parameters defined by the user in the original xml file -->
<cp name="cp1" slot="0" cat="host" alias="nserv">
  <interface name="nserv_f1.0" dev="GigE" lanid="fabric1" assoc="" port="0">
    <!-- All internal IP addrs assigned by the configuration software based on runtime parameters -->
    <ipaddr>172.16.1.1</ipaddr> <ipnet>172.16.1.0</ipnet>
    <ipmask>255.255.255.192</ipmask> <ipbcast>172.16.1.63</ipbcast>
    <!-- Device parameters and comment set by the user in the original xml file -->
    <device> eth0 </device> <hwaddr> 001EC9FE7623 </hwaddr>
    <desc> Interface connected to HUB's fabric port </desc>
  </interface>
  <interface name="nserv" dev="GigE" lanid="base1" assoc="" port="0">
    <ipaddr>192.168.32.1</ipaddr> <ipnet>192.168.32.0</ipnet>
    <ipmask>255.255.248.0</ipmask> <ipbcast>192.168.39.255</ipbcast>
    <device> eth1 </device> <hwaddr> 001EC9FE7622 </hwaddr>
    <desc> System control processor's Base Ethernet connection </desc>
  </interface>
  <interface name="nserv_gbl" dev="GigE" lanid="maint" assoc="" port="0">
    <ipaddr>192.168.48.1</ipaddr> <ipnet>192.168.48.0</ipnet>
    <ipmask>255.255.248.0</ipmask> <ipbcast>192.168.55.255</ipbcast>
    <device> eth2 </device> <hwaddr> 001018320076 </hwaddr>
    <desc> Connection to the maintenance ports </desc>
  </interface>
</cp>
26
GPE Record
<gpe name="gpe1" slot="4" cat="atca" alias="gpe1">
  <interface name="gpe1_f1.0" dev="GigE" lanid="fabric1" assoc="" port="0">
    -- IP Address Info --
    <device> eth0 </device> <hwaddr> 000e0c85e440 </hwaddr> (Device Data)
    <bw> 1000000000 </bw> <share> 2 </share> (Resource Policy)
    <desc> MACN2, Fabric 1/0 or AMC Port 0 </desc>
  </interface>
  <interface name="gpe1_f1.1" dev="GigE" lanid="fabric1" assoc="" port="1">
    -- IP Address Info --
    <device> eth1 </device> <hwaddr> 000e0c85e442 </hwaddr>
    <desc> MACN4, Fabric 1/1 or Maintenance Port 1 </desc>
  </interface>
  <interface name="gpe1_b1.0" dev="GigE" lanid="base1" assoc="" port="0">
    -- IP Address Info --
    <device> eth2 </device> <hwaddr> 000e0c85e43e </hwaddr>
    <desc> MACN, Base connection to Primary HUB </desc>
  </interface>
  <interface name="gpe1_b2.0" dev="GigE" lanid="base2" assoc="" port="0">
    -- IP Address Info --
    <device> eth3 </device> <hwaddr> 000e0c85e43f </hwaddr>
    <desc> MACN1, Base connection to alternate HUB </desc>
  </interface>
  <interface name="gpe1_f2.0" dev="GigE" lanid="fabric2" assoc="" port="0">
    -- IP Address Info --
    <device> eth4 </device> <hwaddr> 000e0c85e441 </hwaddr>
    <desc> MACN3, Fabric 2/0 or AMC Port 1 </desc>
  </interface>
  <interface name="gpe1_f2.1" dev="GigE" lanid="fabric2" assoc="" port="1">
    -- IP Address Info --
    <device> eth5 </device> <hwaddr> 000e0c85e443 </hwaddr>
    <desc> MACN5, Fabric 2/1 or Maintenance Port 2 </desc>
  </interface>
</gpe>
27
NPE Record
<npe name="npe1" slot="5" cat="atca" alias="npe1">
  <product> Radisys_7010 </product> <model> NPEv1 </model>
  <interface name="npe1_f1.0" dev="10GigE" lanid="fabric1" assoc="" port="0">
    -- IP Address Info -- -- Device Data -- -- Resource Policy --
    <desc> Fabric interface used for both NPUs </desc>
  </interface>
  <interface name="npe1_b1.0" dev="GigE" lanid="base1" assoc="npua" port="0">
    -- IP Address Info -- -- Device Data --
    <desc> Primary control interface associated with NPUA </desc>
  </interface>
  <interface name="npe1_m.0" dev="GigE" lanid="maint" assoc="npua" port="0">
    -- IP Address Info -- -- Device Data --
    <desc> NPUA Front Maintenance Port </desc>
  </interface>
  <interface name="npe1_b1.1" dev="GigE" lanid="base1" assoc="npub" port="1">
    -- IP Address Info -- -- Device Data --
    <desc> NPUB Front Maintenance Port -- but it's been patched to the Base switch </desc>
  </interface>
</npe>
28
LC Record
<lc name="lc1" slot="6" cat="atca" alias="lc">
  <product> Radisys_7010 </product> <model> LCv1 </model> (Model Data)
  <interface name="lc_f1.0" dev="10GigE" lanid="fabric1" assoc="" port="0">
    -- IP Address Info -- -- Device Data -- -- Resource Policy --
  </interface>
  <interface name="lc_b1.0" dev="GigE" lanid="base1" assoc="npua" port="0">
    -- IP Address Info -- -- Device Data --
  </interface>
  <interface name="lc_m.0" dev="GigE" lanid="maint" assoc="npua" port="0">
    -- IP Address Info -- -- Device Data --
  </interface>
  <interface name="lc_b1.1" dev="GigE" lanid="base1" assoc="npub" port="1">
    -- IP Address Info -- -- Device Data --
  </interface>
  <interface name="drn05" dev="GigE" lanid="external" port="0">
    <hwaddr> 00005029b146 </hwaddr>
    <link peering="true" primary="true" dev="GigE">
      -- Link IP Address Info -- -- Device Data -- -- Resource Policy --
      <domain> arl.wustl.edu </domain> <hostname> drn05 </hostname>
      <dns1> 128.252.133.45 </dns1> <dns2> 128.252.120.1 </dns2>
      <peerIP> 128.252.153.31 </peerIP> <peerMAC> 000FB5FBD867 </peerMAC>
      <vlan> 2 </vlan>
      <port_pool> <!-- used for NAT -->
        <udp count="500" start="30000"> </udp>
        <tcp count="500" start="30000"> </tcp>
      </port_pool>
      <desc> p2p link from drn05 to drn06, the plc </desc>
    </link>
  </interface>
</lc>
29
SRM Interface
NATD to SRM:
  egress_map, ingress_map get_sched_map(LinkIP, BoardMAC)   (deprecated original natd interface!)
  fid, port alloc_epmap(map)
  status free_epmap(fid)
FS to SRM:
  ?? (map vlan to slice id)
RMP to SRM:
  Interfaces (Line Card Links):
    if_list get_interfaces(plabID)
    ifn get_ifn(plabID, ipaddr)
    if_entry get_ifattrs(plabID, ifn)
    ipaddr get_ifpeer(plabID, ifn)
    retcode resrv_fpath_ifbw(bw, ifn)
    retcode reles_fpath_ifbw(bw, ifn)
    To be implemented:
      retcode resrv_slice_ifbw(plabID, bw, ifn)
      retcode reles_slice_ifbw(plabID, bw, ifn)
  EndPoints (local IP and port number; NATD changes may have broken these):
    ep alloc_endpoint(PlabID, ep)
    status free_endpoint(PlabID, ipaddr, port, proto)
  Fast Path:
    fp_params alloc_fastpath(PlabID, copt, bwspec, rcnts, mem)
    status free_fastpath()
  Fast-Path Meta-Interfaces:
    mi, ep alloc_udp_tunnel(bw, ipaddr, port)
    ep get_endpoint(mi)
    status free_udp_tunnel(ipaddr, port)
30
RMP Interface
  • Prototype completed
  • result noop()
  • version get_version()
  • result add_slice(plabID, len, name)
  • result rem_slice(plabID)
  • ret_t alloc_fastpath(copt, bw, rcnts, mem)
  • void free_fastpath()
  • if_list get_interfaces()
  • ifn get_ifn(ipaddr)
  • if_entry get_ifattrs(ifn)
  • ipaddr get_ifpeer(ifn)
  • retcode alloc_pl_ifbw(ifn, bw)
  • retcode reles_pl_ifbw(ifn, bw)
  • retcode alloc_fpath_ifbw(fpid, ifn, bw)
  • retcode reles_fpath_ifbw(fpid, ifn, bw)
  • retcode bind_queue(fpid, miid, list_type, qids)
  • actual_bw set_queue_params(fpid, qid, threshold,
    bw)
  • threshold, bw get_queue_params(fpid, qid)
  • u32 Pkts, u32 Bytes get_queue_len(fpid, qid)
  • To do
  • ep alloc_endpoint(ep)
  • status free_endpoint(ipaddr, port, proto)
  • -- alloc_tunnel --
  • -- free_tunnel --
  • mi, ep alloc_udp_tunnel(fpid, bw, ip, port)
  • status free_udp_tunnel(ipaddr, port)
  • ep get_endpoint(fpid, mi)
  • retcode write_fltr(fpid, fid, fltr)
  • retcode update_result(fpid, fid, result)
  • fltr_t get_fltr_bykey(fpid, key)
  • fltr_t get_fltr_byfid(fpid, fid)
  • result lookup_fltr(fpid, key)
  • retcode rem_fltr_bykey(fpid, key)
  • retcode rem_fltr_byfid(fpid, fid)
  • stats_t read_stats(fpid, sindx, flags)
  • result clear_stats(sindx)
  • handle create_periodic(fp,indx,P,cnt,flags)
  • retcode delete_periodic(fpid, handle)

31
NPE SCD Interface
SRM to SCD:
  status set_fastpath(fpid, copt, VLAN, params, mem)
  status enable_fastpath(fpid)
  status disable_fastpath(fpid)
  status rem_fastpath(fpid)
  status set_sched_params(sid, ifn, BWmax, BWmin)
  status set_encap_cb(sid, srcIP, dMAC)
  status set_fpmi_bw(fpid, sid, miid, bw)
  status start_mes()
  status stop_mes()
  status set_encap_gpe(fpid, gpeIP, npeIP)
  result write_mem(kpa, len, data)
  data read_mem(kpa, len)
SRM and RMP to SCD:
  ret_t write_fltr(dbid, fid, key, mask, result)
  ret_t update_result(dbid, fid, result)
  fltr get_fltr_bykey(dbid, key)
  fltr get_fltr_byfid(dbid, fid)
  result lookup_fltr(dbid, key)
  retcode rem_fltr_bykey(dbid, key)
  retcode rem_fltr_byfid(dbid, fid)
RMP to SCD:
  status set_gpe_info(exPort, ldPort, exQID, ldQID)
  u32 result bind_queue(u16 miid, u8 list_type, u16 qid_list)
  u32 bw set_queue_params(u16 qid, u32 threshold, u32 bw)
  u32 threshold, u32 bw get_queue_params(u16 qid)
  u32 pktCnt, u32 byteCnt get_queue_len(u16 qid)
  result write_sram(offset, len, data)
  data read_sram(offset, len)
  stats read_stats(sindx, flags)
  result clear_stats(sindx)
  handle create_periodic(sindx, P, cnt, flags)
  retcode del_periodic(handle)
  retcode set_callback(handle, udp_port)
  stats get_periodic(handle)
32
LC SCD Interface
SRM to SCD:
  status set_sched_params(sid, ifn, BWmax, BWmin)
  status set_sched_mac(sid, MACdst, MACsrc)
  u32 result set_queue_sched(u16 qid, u16 sid)
  result write_mem(kpa, len, data)
  data read_mem(kpa, len)
SRM and RMP to SCD:
  ret_t write_fltr(dbid, fid, key, mask, result)
  ret_t update_result(dbid, fid, result)
  fltr get_fltr_bykey(dbid, key)
  fltr get_fltr_byfid(dbid, fid)
  result lookup_fltr(dbid, key)
  retcode rem_fltr_bykey(dbid, key)
  retcode rem_fltr_byfid(dbid, fid)
RMP to SCD:
  u32 actual_bw set_queue_params(u16 qid, u32 threshold, u32 bw)
  u32 threshold, u32 bw get_queue_params(u16 qid)
  u32 pktCnt, u32 byteCnt get_queue_len(u16 qid)
  stats read_stats(sindx, flags)
  result clear_stats(sindx)
  handle create_periodic(sindx, P, cnt, flags)
  retcode del_periodic(handle)
  retcode set_callback(handle, udp_port)
  stats get_periodic(handle)
33
Slice Example
  • Get the list of interfaces, their IP addresses and available bandwidth
  • if_list: { if_entry, ... }
  • if_entry: { u16 ifn,      // logical interface number
  •   u16 type,               // peering or multi-access
  •   u32 ipaddr,             // interface's IP address
  •   u32 linkBW,             // link's native BW
  •   u32 availBW }           // BW available for allocation
  • struct epoint_t: { u32 ipaddr, // interface's IP address
  •   u16 port,               // UDP port number for the meta-interface
  •   u32 bw }                // total BW required for the meta-interface
  • iflist = get_interfaces(iflist)   // return list of all available interfaces
  • Estimate the computational complexity and memory bandwidth requirements on the NPE.
  • bwSpec = { BWmax=totalBW, BWmin=0 }   // fast path total BW requirement
  • max general NPE resource counts - for this example I just assume a max number, but in general a user may scale it by the number of meta-interfaces they will use.
  • fpCounts = { FLTR_CNT, QID_CNT, BUFF_CNT, STATS_CNT }
  • Request the substrate to allocate a fastpath instance for the IPv4 code option; assume we will use the default sram buffer sizes. We will also need to listen on the returned sockets.
  • fpid, sockets = alloc_fastpath(ipv4_copt, bwSpec, fpCnts, IPV4_SRAM_SZ, 0)

34
Slice Example - Continued
  • allocate one meta-interface for each external interface and assign our default UDP port number and BW requirement
  • struct mi_t { uint_t mi; epoint_t rp; }
  • mi_t milist[iflist.len()];
  • for (indx = 0, mi = 0; indx < len(iflist); indx++) {
  •   if (miBW > iflist[indx].availBW) throw Error;
  •   // allocate the total BW required on this interface
  •   if (alloc_fpath_ifbw(fpid, iflist[indx].ifn, miBW) == -1)
  •     throw Error;
  •   // allocate one meta-interface on this interface
  •   milist[indx] = alloc_udp_tunnel(fpid, miBW, iflist[indx].ipaddr, myPort);
  •   my_bind_queues(milist[indx]);
  •   my_add_routes(milist[indx]);
  • }

35
Test SPP Node
[Diagram: test SPP node - the CP (keystone.arl.wustl.edu, 128.252.153.81) runs dhcpd and holds /etc/dhcpd.conf, ethers, hosts and /tftpboot (ramdisk.gz, zImage.ppm10); the Hub is at 192.168.64.17 and the Line Card sits in slot 6. Four GPEs occupy slots 2-5, each with /etc/ethers, /etc/hosts and /etc/sysconfig/network-scripts/ifcfg-eth* files; control (base) addresses are on 192.168.64.0/20 and internal data (fabric) addresses on 172.16.1.0/26. External access uses VLAN 2 through the Line Card RTM and front-panel ports to the ARL network (128.252.153.x), with the CP doing IP routing and proxy ARP for keystone and a Router host attached.]
Issue: mounting /opt/crossbuild/ from ebony. We could export the directories from the Router host, or use ebony rather than the Router; in that case we will need an external switch connecting the line cards of each SPP to ebony's eth2.2.
36
Test Bed Use
  • Core platform issues
  • Can we use the second fabric port on the GPE boards?
  • The hub does not display stats or mac forwarding entries for the slots with GPEs. It used to work.
  • The Radisys shelf manager
  • does not reliably reset boards
  • Base1 interface disabled on slot 2
  • NAT/Line Card testing
  • Overall reliability
  • Add support for aging
  • Specific issues (jdd)
  • restarting the line card (without a reboot) occasionally results in the data path thinking the scratch ring to the xscale is full.
  • a looping iperf test from the cp occasionally stalls with no packets getting through the LC
  • Lookup needs a fix to not use the DONE bit to indicate a tcam lookup is done.
  • GPE/Intel board testing