Anemone: Edge-based network management (http://www.research.microsoft.com/projects/anemone/)


Provided by: andybov


1
Anemone: Edge-based network management
http://www.research.microsoft.com/projects/anemone/

Mort (Richard Mortier)
Paul Barham, Austin Donnelly, Rebecca Isaacs

2
Preamble: Microsoft Research

Over 700 people worldwide, spread through 6 research labs
Bangalore, Beijing, Cambridge, Redmond, San Francisco, Silicon Valley
Cover a wide range of CS and EE areas
MSR Charter
Advance the state of the art through cutting-edge research and publishing in the open literature
Provide competitive edge to Microsoft's product groups through technology transfer and consultation
Engage with academic community through participation in conferences, programme committees, journal editorial boards, student thesis committees
Cambridge lab is about 80 researchers, split into 4 main areas
Networking, systems, distributed systems
Magpie, Topology discovery, Pastry, Avalanche, Vigilante, Anemone
Languages, security, theory
Graphics, vision, machine learning
Integrated systems, HCI, hardware

3
Network management is hard!

The process of monitoring and controlling a large, complex distributed system of dumb devices where failures are common and resources scarce
Networks are large: 10^5 hosts, 10^3 routers
Networks are heterogeneous: 130 router hardware/OS combinations
Networks run distributed protocols: OSPF, BGP, all very loosely synchronized
Networks undergo continuous change: links fail and recover, upgrades occur

4
State of the art?

Tools to help visualize and inspect network
Get topology
Recursive use of ping and traceroute
Get traffic data
Routers using SNMP and NetFlow™
Analyze and present the data
Wrap it all up in a GUI: triggers, graphs, top-10s, etc.

5
Unfortunately

There are problems!
Traffic is becoming more opaque to the network core
Increasing deployment of IPSec, tunnelling, encryption
traceroute data is ambiguous and only polls the topology
Best case is the reverse path anyway
SNMP data is often buggy
Non-critical part of router operation
Routers are often resource starved
Not built using the latest CPU, memory technologies
The result is that such systems can end up presenting inaccurate, untimely, incomplete data

6
Anemone

Edge-based distributed network management platform
Collect flow information from hosts, and
Combine with topology information from routing protocols
Enables applications
Visualize current network state
Analyse flow data for intrusion detection
Simulate reconfiguration/failure for planning
Control the network, automatically and in real-time

7
Demo overview

[Diagram labels]
Data gathering
Management applications
Anemone platform
OSPF packets captured from corporate network
Link events
Flow data from hosts, topology data from OSPF
Distributed database computing load throughout network
Continuous queries (topology, failure, recovery)
Subnet list
Results
Load model: emulates real-time per-host monitoring
Per-flow statistics
Synthetic traffic traces
Sample management application
One-shot queries (data transmitted)
Simulated for demo

8
Benefits

Anemone has a priori benefits over state of the art
Visibility into opaque protocols
See into encrypted/tunnelled traffic, e.g. IPSec, PPTP
Plentiful resources at hosts
They need only deal with their own traffic
Independence from poor quality data
No more reliance on SNMP and traceroute data

9
Applications

Where is my traffic going today? Anemone is a platform for network management apps
Pictures of current topology and traffic
Routes + flows + forwarding rules → BIG PICTURE
In fact, where did my traffic go yesterday?
Keep historical data for capacity planning, etc.
A platform for anomaly detection
Historical data suggests normality, live monitoring allows anomalies to be detected
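
One simple way to read "historical data suggests normality" is a baseline-and-deviation check. The sketch below is an illustration only: the z-score test, the threshold of 3 and the sample numbers are assumptions, not something the slides specify.

```python
from statistics import mean, stdev

def is_anomalous(history, live, threshold=3.0):
    """Flag a live measurement that deviates strongly from the historical baseline.

    history: past measurements of one metric (e.g. bytes/s on one link).
    threshold: number of standard deviations treated as anomalous (assumed value).
    """
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # A perfectly flat history: any change at all counts as a deviation.
        return live != mu
    return abs(live - mu) / sigma > threshold

# Hypothetical per-interval load samples for one link.
history = [100, 110, 90, 105, 95]
normal_reading = is_anomalous(history, 104)
spike_reading = is_anomalous(history, 200)
```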

10
Applications

Where might my traffic go tomorrow? Anemone enables what-if analysis
Plug into a simulator back-end
Discrete event simulator or flow allocation solver
Run multiple what-if scenarios
failures
reconfigurations
technology deployments
E.g. what happens to the network if we coalesce all the mail servers into one datacenter?
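
A what-if failure scenario can be sketched as: remove a link, reroute every demand, compare link loads. The code below is a minimal illustration under stated assumptions: demands are routed on a single least-cost path, and the four-node topology, the weights and the demand volumes are invented for the example. It is not the simulator back-end the slides refer to.

```python
import heapq

def shortest_path(graph, src, dst):
    """Dijkstra returning the node list of a least-cost path, or None if unreachable."""
    heap = [(0, src, [src])]
    seen = set()
    while heap:
        cost, node, path = heapq.heappop(heap)
        if node == dst:
            return path
        if node in seen:
            continue
        seen.add(node)
        for nbr, weight in graph.get(node, {}).items():
            if nbr not in seen:
                heapq.heappush(heap, (cost + weight, nbr, path + [nbr]))
    return None

def link_loads(graph, demands):
    """Route every demand on its least-cost path and sum the load per directed link."""
    loads = {}
    for (src, dst), volume in demands.items():
        path = shortest_path(graph, src, dst)
        if path is None:
            continue  # demand cannot be routed in this scenario
        for link in zip(path, path[1:]):
            loads[link] = loads.get(link, 0) + volume
    return loads

def without_link(graph, a, b):
    """Copy of the topology with the bidirectional link a-b removed."""
    return {n: {m: w for m, w in nbrs.items() if {n, m} != {a, b}}
            for n, nbrs in graph.items()}

# Hypothetical topology {node: {neighbour: link weight}} and traffic demands.
graph = {
    "A": {"B": 1, "C": 1},
    "B": {"A": 1, "D": 1},
    "C": {"A": 1, "D": 2},
    "D": {"B": 1, "C": 2},
}
demands = {("A", "D"): 10, ("C", "D"): 5}

baseline = link_loads(graph, demands)
failed = link_loads(without_link(graph, "B", "D"), demands)
```

In this toy case the failure of B-D moves the A to D demand onto C-D, tripling the load on that link.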

11
Applications

Where should my traffic be going? Anemone helps close the control loop
Use it to support an application that recomputes link weights to implement policy goals
Recomputation on the order of hours or days
This enables more dynamic policies
Network configuration could be modified to track e.g. time of day/week/year load changes, potentially reducing bandwidth costs

12
Where are we now?

Studying feasibility and building prototypes
Three major components
Flow collection
Route collection
Anemone platform

13
Data collection: flows

Synthesise flow data from low-level packet tracing
Hosts track active flows
Using ETW, a low-overhead event posting infrastructure
Built prototype device driver provider and user-space consumer
Took 24h packet traces from a client and a server
Peaks were 165 (client) and 5667 (server) live flows per sec, and 39 (client) and 567 (server) active flows per sec
Quite manageable sized datasets
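
As a rough illustration of synthesising flow data from packet tracing, the sketch below groups packet records by 5-tuple and counts the flows seen in each one-second bucket. The `Packet` record layout, the definition of "active" (at least one packet in that second) and the sample trace are assumptions for illustration; the ETW provider and consumer used by the prototype are not shown.

```python
from collections import defaultdict
from typing import NamedTuple

class Packet(NamedTuple):
    ts: float    # capture time in seconds
    src: str
    sport: int
    dst: str
    dport: int
    proto: str
    size: int    # bytes

def flow_key(p):
    """The classic 5-tuple identifying a flow."""
    return (p.src, p.sport, p.dst, p.dport, p.proto)

def synthesise_flows(packets):
    """Aggregate packets into per-flow packet and byte counts."""
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for p in packets:
        record = flows[flow_key(p)]
        record["packets"] += 1
        record["bytes"] += p.size
    return dict(flows)

def active_flows_per_second(packets):
    """Count distinct flows that sent at least one packet in each 1 s bucket."""
    buckets = defaultdict(set)
    for p in packets:
        buckets[int(p.ts)].add(flow_key(p))
    return {sec: len(keys) for sec, keys in sorted(buckets.items())}

# Hypothetical four-packet trace.
trace = [
    Packet(0.1, "10.0.0.1", 1234, "10.0.1.9", 80, "tcp", 100),
    Packet(0.4, "10.0.0.1", 1234, "10.0.1.9", 80, "tcp", 1400),
    Packet(0.7, "10.0.0.2", 5555, "10.0.1.9", 25, "tcp", 60),
    Packet(1.2, "10.0.0.1", 1234, "10.0.1.9", 80, "tcp", 40),
]
flows = synthesise_flows(trace)
per_sec = active_flows_per_second(trace)
```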

14
(No Transcript)
15
(No Transcript)
16
Interlude: OSPF routing 101

How does a packet get from any A to any B? Learn network topology, compute shortest paths
For each node
Discover adjacencies (immediate neighbours)
Advertise these link states to all other routers
Build link state database (network topology)
Compute shortest paths to all destination prefixes
Forward to next-hop using longest-prefix-match (most specific route)
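
The "compute shortest paths" step can be sketched with Dijkstra's algorithm over a link state database. This is a minimal illustration: the dictionary layout of `lsdb`, the router names and the costs are assumptions, and real OSPF details (areas, LSA types, equal-cost multipath) are left out.

```python
import heapq

def shortest_paths(lsdb, source):
    """Dijkstra over a link state database {router: {neighbour: cost}}.

    Returns (dist, next_hop): the cost to each reachable router and the
    first hop the source uses to reach it.
    """
    dist = {source: 0}
    next_hop = {}
    heap = [(0, source, None)]
    done = set()
    while heap:
        d, node, first = heapq.heappop(heap)
        if node in done:
            continue  # stale heap entry for an already-settled router
        done.add(node)
        if first is not None:
            next_hop[node] = first
        for nbr, cost in lsdb.get(node, {}).items():
            nd = d + cost
            if nbr not in done and nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                # Neighbours of the source are their own first hop.
                heapq.heappush(heap, (nd, nbr, nbr if node == source else first))
    return dist, next_hop

# Hypothetical four-router topology with symmetric link costs.
lsdb = {
    "A": {"B": 1, "C": 5},
    "B": {"A": 1, "C": 1, "D": 4},
    "C": {"A": 5, "B": 1, "D": 1},
    "D": {"B": 4, "C": 1},
}
dist, next_hop = shortest_paths(lsdb, "A")
```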

17
Data collection: routes

Passive collection of network-critical control protocol
OSPF is link-state, so collect link state adverts
Completely passive, modulo configuration
Process data to recover network events and topology
Data collected for (local, backbone) areas (20 days)
LSA DB size: (700, 1048) LSAs, (21, 34) kB
Event totals: (2526, 3238) events, (5.3, 6.7) evts/hr
Small, generally stable with bursts of activity
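
Recovering network events from collected link state can be sketched as a diff between successive topology snapshots. The snapshot representation (a set of directed router pairs) and the router names are assumptions for illustration; the last lines only check the arithmetic of the event rates quoted above (event totals divided by 20 days of hours).

```python
def link_events(before, after):
    """Compare two topology snapshots, each a set of (router, neighbour) links."""
    events = [("down", link) for link in sorted(before - after)]
    events += [("up", link) for link in sorted(after - before)]
    return events

def events_per_hour(total_events, days):
    """Average event rate over a collection period given in days."""
    return total_events / (days * 24)

# Hypothetical snapshots taken before and after a change.
snap1 = {("r1", "r2"), ("r2", "r3"), ("r3", "r4")}
snap2 = {("r1", "r2"), ("r3", "r4"), ("r4", "r5")}
events = link_events(snap1, snap2)

# Event totals from the slide for the (local, backbone) areas over 20 days.
rates = (round(events_per_hour(2526, 20), 1), round(events_per_hour(3238, 20), 1))
```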

18
NB: Spike to 100 from initial DB collection truncated for readability
19
(No Transcript)
20
[Plot annotations]
complete dataset
steady state
35 mins: LSRefreshTime + CheckAge?
30 mins: LSRefreshTime?
10 mins: data ca. 25/Nov?
12 mins: RouterDeadInterval?
21
The Anemone platform

Data unification, distribution and presentation
Distributed database, logically containing
Traffic flow matrix (bandwidths), srcs × dsts
Hosts can supply flows they source and sink
Only need a subset of this data to get complete traffic matrix
Each entry annotated with current route, src to dst
Note src/dst might be e.g. (IP end-point, application)
OSPF supplies topology → routes
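
Because every flow is visible at both its source and its sink, reports from only some hosts can still cover the whole matrix. The sketch below merges per-host reports into one traffic matrix; the report format, the host names and the rule of keeping the larger value when both ends report are assumptions for illustration.

```python
def traffic_matrix(reports):
    """Merge per-host flow reports into {(src, dst): bytes}.

    reports maps a reporting host to {(src, dst): bytes} for the flows it
    sources or sinks. A flow reported by both endpoints is counted once.
    """
    matrix = {}
    for host, flows in reports.items():
        for (src, dst), nbytes in flows.items():
            if host not in (src, dst):
                continue  # a host only vouches for flows it terminates
            matrix[(src, dst)] = max(matrix.get((src, dst), 0), nbytes)
    return matrix

# Hypothetical reports: h3 is not instrumented, yet its flows are still
# covered because h1 and h2 are the other endpoints.
reports = {
    "h1": {("h1", "h2"): 500, ("h3", "h1"): 70},
    "h2": {("h1", "h2"): 500, ("h2", "h3"): 20},
}
tm = traffic_matrix(reports)
```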

22
System outline

[Diagram labels]
Packets
Routeing protocol
Flows
Topology
Traffic matrix
Set of routes
Anemone platform
Simulator
Control
Visualize
Simulate

23
The Anemone platform

Provides an API for presenting data
Wish to be able to answer queries like
Who are the top-10 traffic generators?
Easy to aggregate, don't care about topology
What is the load on link l?
Can aggregate from hosts, but need to know routes
What happens if we remove links lm?
Interaction between traffic matrix, topology, even flow control
Related work
Distributed, continuous query, temporal databases
Sensor networks, Astrolabe, SDIMS, PHI
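
The first two example queries can be sketched against an in-memory traffic matrix. This is an illustration only, not the platform's API: the function names, the route representation (a node list per source-destination pair) and the sample data are assumptions.

```python
from collections import Counter

def top_generators(tm, n=10):
    """Top-n sources by bytes sent; needs no topology information."""
    sent = Counter()
    for (src, _dst), nbytes in tm.items():
        sent[src] += nbytes
    return sent.most_common(n)

def link_load(tm, routes, link):
    """Total bytes crossing a directed link, given each pair's route as a node list."""
    total = 0
    for pair, nbytes in tm.items():
        path = routes.get(pair, [])
        if link in zip(path, path[1:]):
            total += nbytes
    return total

# Hypothetical traffic matrix and routes.
tm = {("h1", "h2"): 500, ("h3", "h1"): 70, ("h2", "h3"): 20}
routes = {
    ("h1", "h2"): ["h1", "r1", "r2", "h2"],
    ("h3", "h1"): ["h3", "r2", "r1", "h1"],
    ("h2", "h3"): ["h2", "r2", "h3"],
}
top2 = top_generators(tm, 2)
load_r1_r2 = link_load(tm, routes, ("r1", "r2"))
```

The contrast matches the slide: the top-N query only sums per source, while the link-load query cannot be answered without routes.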

24
The Anemone platform

Currently forming the core of the demo!
Have simulation model
OSPF data gives topology, event list, routes
Simple load model to start with (load subnets)
Predecessor matrix (from SPF) reduces flow-data query set
Where/what/how much to distribute/aggregate?
Is data read- or write-dominated?
Which is more dynamic, flow or topology data?
Can the system successfully self-tune?
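
One reading of how a predecessor matrix reduces the flow-data query set: once the predecessor of every destination on every shortest path is known, a link-load query only needs flow data for the source-destination pairs whose path crosses that link. The sketch below builds the matrix with Floyd-Warshall rather than per-router SPF, purely for brevity; that substitution, the graph layout and the node names are assumptions.

```python
def predecessor_matrix(graph):
    """Floyd-Warshall: pred[s][d] is the node preceding d on a shortest s->d path."""
    nodes = sorted(graph)
    inf = float("inf")
    dist = {s: {d: (0 if s == d else graph[s].get(d, inf)) for d in nodes}
            for s in nodes}
    pred = {s: {d: (s if d in graph[s] else None) for d in nodes} for s in nodes}
    for k in nodes:
        for s in nodes:
            for d in nodes:
                if dist[s][k] + dist[k][d] < dist[s][d]:
                    dist[s][d] = dist[s][k] + dist[k][d]
                    pred[s][d] = pred[k][d]
    return pred

def pairs_using_link(pred, link):
    """(src, dst) pairs whose shortest path traverses the directed link (u, v)."""
    pairs = []
    for s in pred:
        for d in pred[s]:
            node = d
            # Walk back from d towards s along predecessors.
            while node != s and pred[s][node] is not None:
                if (pred[s][node], node) == link:
                    pairs.append((s, d))
                    break
                node = pred[s][node]
    return pairs

# Hypothetical three-router topology with symmetric costs.
graph = {
    "A": {"B": 1, "C": 5},
    "B": {"A": 1, "C": 1},
    "C": {"A": 5, "B": 1},
}
pred = predecessor_matrix(graph)
query_set = pairs_using_link(pred, ("B", "C"))
```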

25
The Anemone platform

Many outstanding research questions
Can we do as well/better than e.g. NetFlow™?
Accuracy of data vs. completeness of instrumentation
Which data sets should we distribute and how?
Just OSPF data? Just flow data? A mixture?
Use DHTs? IP multicast?
How many levels of aggregation?
How many nodes should a query touch?
What sort of API is suitable?
Example queries for sample applications

26
http://www.research.microsoft.com/projects/anemone/

Building a coherent edge-based network management platform using flow monitoring and standard routeing protocols
Applications include visualization, simulation, dynamic control
Research issues include
Accuracy: will not be able to monitor 100% of traffic
Scalability: want to manage a 300,000 node network
Robustness: must work as nodes fail or network partitions
Control systems: use the data to optimize the network in real-time, as well as just observe and simulate

27
Backup slides

SNMP
Internet routeing
Security

28
SNMP

Protocol to manage information tables at devices
Provides get, set, trap, notify operations
get, set: read, write values
trap: signal a condition (e.g. threshold exceeded)
notify: reliable trap
Complexity mostly in the table design
Some standard tables, but many vendor specific
Non-critical, so tables often populated incorrectly

29
Internet routeing

Q: how to get a packet from node to destination?
A1: advertise all reachable destinations and apply a consistent cost function (distance vector)
A2: learn network topology and compute consistent shortest paths (link state)
Each node (1) discovers and advertises adjacencies, (2) builds link state database, (3) computes shortest paths
A1, A2: forward to next-hop using longest-prefix-match
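
Longest-prefix-match forwarding can be sketched with Python's standard `ipaddress` module. The forwarding table contents and next-hop names are invented for the example, and the linear scan is for clarity only; a real router would use a trie or similar structure.

```python
import ipaddress

def longest_prefix_match(table, address):
    """Return the next hop of the most specific prefix containing address, or None."""
    addr = ipaddress.ip_address(address)
    best = None
    for prefix, hop in table.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, hop)
    return best[1] if best else None

# Hypothetical forwarding table {prefix: next hop}.
table = {
    "0.0.0.0/0": "gw",
    "10.0.0.0/8": "r1",
    "10.1.0.0/16": "r2",
    "10.1.2.0/24": "r3",
}
hop = longest_prefix_match(table, "10.1.2.3")
```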

30
Security