Supporting real-time & offline network traffic analysis

1
Supporting real-time & offline network traffic analysis
Chung-Min Chen, Munir Cochinwala, Allen McIntosh, Marc Pucci
Telcordia Technologies Applied Research, Morristown, NJ, USA
2
Outline
  • OSS Requirements
  • Work Proposal
  • Stream Data Management Issues
  • Traffic Warehouse
  • Tribeca: a stream database manager

3
OSS Requirements
  • OSS function: data time frame / response time
  • Traffic control and monitoring: seconds to minutes
  • Service level agreement: 15 minutes to hours
  • Capacity planning: weeks to months

4
Work proposal (system overview)
[Figure: system overview diagram. Labeled components: LANs, WAN, routers (R), SNMP agents, EMS, BPF/tcpdump adaptor, Stream Engine, DBMS, Live SQL, Live Monitor clients, Warehouse.]
5
Real-time traffic analysis: state of the industry
  • Ad hoc or canned programs/scripts:
      • Slow deployment
      • No data sharing
      • Hard to maintain and little reuse
  • Traditional DBMS:
      • Can it keep up with high line speeds (e.g., OC-48)?
      • Cumbersome to program (write into the DB, then query)
      • Semantic mismatch between stream and relation

6
Stream Data Management
  • Stream as a first-class object (like a relation)
  • Stream: a continuous, unbounded sequence of records with
    a total ordering
  • Issues:
      • Stream algebra
      • Data types
      • Query language
      • Implementation

7
Stream Algebra
  • Operators:
      • Selection: relatively easy
      • Join: can be defined nicely (assuming an unbounded buffer)
      • Demultiplex/multiplex: the result could be multiple streams
  • Operands:
      • Stream and stream
      • Stream and relation
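The operators above can be sketched with Python generators. This is only an illustration, not the authors' algebra: the record shape (plain dicts), the function names, and the bounded-buffer join policy are all assumptions. The join keeps just the last `capacity` records per side, which shows why the "unbounded buffer" assumption matters: with a small buffer, late partners are silently missed.

```python
import heapq
from collections import defaultdict, deque


def select(stream, pred):
    """Selection: stateless, one record in, at most one record out."""
    return (rec for rec in stream if pred(rec))


def demux(stream, key):
    """Demultiplex: split one stream into several sub-streams by key.
    Materialized as lists here for brevity (an assumption of this sketch)."""
    out = defaultdict(list)
    for rec in stream:
        out[key(rec)].append(rec)
    return dict(out)


def mux(streams, order):
    """Multiplex: merge sub-streams that are each already ordered by `order`."""
    return heapq.merge(*streams, key=order)


def bounded_join(arrivals, key, capacity):
    """Join two interleaved streams, remembering only the last
    `capacity` records from each side ("L" or "R")."""
    buffers = {"L": deque(maxlen=capacity), "R": deque(maxlen=capacity)}
    for side, rec in arrivals:
        other = "R" if side == "L" else "L"
        for cand in buffers[other]:
            if key(cand) == key(rec):
                yield (rec, cand) if side == "L" else (cand, rec)
        buffers[side].append(rec)
```

With `capacity=1`, the arrival order L(id=1), L(id=2), R(id=1) produces no match, because L(id=1) has already been evicted; with `capacity=2` the match is found.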

8
Data Types
  • BLOB:
      • Leaves the burden to the application developers
  • Conventional relational data types:
      • Need adaptors to convert from raw types to
        relational types
  • Native support for structured binary objects (SBO):
      • Separate fields at the bit level
      • Most flexible and efficient, but requires
        re-implementation of the database type system
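As a rough illustration of what "separate fields at the bit level" involves, the sketch below plays the role of an adaptor that turns a raw IPv4 header into named fields. The function name and the choice of output fields are assumptions made for this example; the header layout itself is the standard 20-byte IPv4 fixed header.

```python
import struct


def parse_ipv4_header(raw: bytes) -> dict:
    """Split the fixed 20-byte IPv4 header into typed fields.
    Hypothetical adaptor for illustration; IP options are ignored."""
    if len(raw) < 20:
        raise ValueError("need at least 20 bytes for an IPv4 header")
    (ver_ihl, _tos, total_len, _ident, _flags_frag,
     _ttl, proto, _checksum, src, _dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    return {
        "version": ver_ihl >> 4,        # high 4 bits of the first byte
        "ihl_words": ver_ihl & 0x0F,    # low 4 bits: header length in 32-bit words
        "packet_size": total_len,       # total length in bytes
        "protocol": proto,              # 6 = TCP, 17 = UDP
        "source_ip_addr": ".".join(str(b) for b in src),
    }
```

A BLOB type would hand the application the raw 20 bytes and leave this unpacking to every query author; native SBO support would let the engine address `version` or `ihl_words` directly without a conversion step.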

9
Stream Query Language
  • How to handle multi-stream output, e.g. group-by?
        select avg(ip_stream.packet_size)
        from ip_stream
        group by ip_stream.source_ip_addr
  • How to handle waiting indefinitely in a join?
        select * from s1, s2
        where s1.packet_id = s2.packet_id
  • Time window clause, temporal attributes/operators, ...
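One way to read the time window clause is sketched below, under stated assumptions: records are time-ordered tuples, the group-by emits one result set per fixed window, and a join candidate is discarded once it is older than the window. The function names and tuple layouts are hypothetical, and this is not the query language the slides propose.

```python
from collections import defaultdict, deque


def windowed_group_avg(stream, window_sec):
    """For time-ordered (ts, source_ip_addr, packet_size) records, yield
    (window_start, {source_ip_addr: average packet_size}) per fixed window."""
    current = None
    sums = defaultdict(lambda: [0, 0])  # source -> [total bytes, packet count]
    for ts, src, size in stream:
        start = ts - (ts % window_sec)
        if current is not None and start != current:
            yield current, {k: s / n for k, (s, n) in sums.items()}
            sums.clear()
        current = start
        sums[src][0] += size
        sums[src][1] += 1
    if current is not None:
        yield current, {k: s / n for k, (s, n) in sums.items()}


def time_window_join(arrivals, window_sec):
    """For time-ordered (ts, side, packet_id) arrivals with side in
    {"s1", "s2"}, yield (packet_id, s1_ts, s2_ts) for partners that
    arrive within window_sec of each other. Older records are dropped,
    so no record waits indefinitely."""
    pending = {"s1": deque(), "s2": deque()}
    for ts, side, pid in arrivals:
        other = "s2" if side == "s1" else "s1"
        for buf in pending.values():
            while buf and ts - buf[0][0] > window_sec:
                buf.popleft()
        for other_ts, other_pid in pending[other]:
            if other_pid == pid:
                yield (pid, ts, other_ts) if side == "s1" else (pid, other_ts, ts)
        pending[side].append((ts, pid))
```

The window bounds both problems at once: the group-by has a well-defined point at which to emit results, and the join buffer cannot grow without limit.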

10
Implementation Issues
  • Bounded buffer management
  • Time-constrained query processing: must keep pace with the
    buffer refresh rate
  • Storage I/O bandwidth requirement (OC-48 or higher?)
  • Migration of data processing to disk
  • Data loss leads to incomplete query results
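To make the buffer refresh rate concrete, the sketch below computes how long a full-rate OC-48 link (2488.32 Mbit/s) takes to fill a buffer, and models a bounded buffer that overwrites its oldest record when the consumer falls behind. The class, its drop-oldest policy, and the function name are assumptions for illustration, not the design described in the slides.

```python
from collections import deque

OC48_BITS_PER_SEC = 2488.32e6  # OC-48 line rate


def refresh_interval_sec(buffer_bytes, line_rate_bps=OC48_BITS_PER_SEC):
    """Seconds until a link running at full line rate has filled the
    buffer once; query processing must finish within this budget."""
    return buffer_bytes * 8 / line_rate_bps


class BoundedBuffer:
    """Fixed-capacity FIFO that drops the oldest unread record on
    overflow and counts the drops (hypothetical policy)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.dropped = 0
        self._items = deque()

    def push(self, rec):
        if len(self._items) >= self.capacity:
            self._items.popleft()  # overwrite the oldest unread record
            self.dropped += 1
        self._items.append(rec)

    def pop(self):
        return self._items.popleft() if self._items else None
```

Under these assumptions a 64 MiB buffer is refilled roughly every 0.22 seconds at OC-48, and any nonzero `dropped` count means the answer to a running query is incomplete.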

11
Traffic Warehouse
  • Repository of traffic data for off-line analysis
  • Efficient navigation across the protocol stack and
    other business table dimensions
  • Storage (cluster, parallelism)
  • Distributed warehouse approach
  • Chen et al., SIGMOD 2000:
      • HTTP, FTP, TCP, IP
      • tcpdump, HTTP server logs
  • Caceres et al., IEEE Comm. 2000: AT&T WorldNet
    data warehouse

12
Tribeca (VLDB96, USENIX98)
  • Single stream input (no join)
  • Supported operators:
      • Selection
      • Projection
      • Aggregates
      • Mux/demux: multi-stream output
      • Time window
  • User-defined data types and extraction functions
    (in C)
  • Tested on ATM cell traces
  • Achieved a 5-7 MB/s (30-40k rec/s) processing rate
    on a Sun Sparc10
  • Former contributors: M. Sullivan, Y. Saraiya, A.
    Heybey
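A back-of-the-envelope check of the reported rate, as a sketch only. It assumes 1 MB = 10^6 bytes, pairs the low and high ends of the two ranges with each other, and compares against the OC-48 line rate of 2488.32 Mbit/s mentioned earlier in the deck; none of these pairings is stated on the slide.

```python
def bytes_per_record(mb_per_sec, krec_per_sec):
    """Average record size implied by a throughput / record-rate pair."""
    return mb_per_sec * 1e6 / (krec_per_sec * 1e3)


OC48_MB_PER_SEC = 2488.32 / 8  # 311.04 MB/s at full line rate

implied_low = bytes_per_record(5, 30)   # roughly 167 bytes per record
implied_high = bytes_per_record(7, 40)  # 175 bytes per record

# How many times faster OC-48 is than the measured processing rate.
gap_best_case = OC48_MB_PER_SEC / 7     # roughly 44x
gap_worst_case = OC48_MB_PER_SEC / 5    # roughly 62x
```

Under these assumptions the measured rate is well over an order of magnitude below OC-48, which is consistent with the deck's concern about keeping up with high line speeds.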

13
Tribeca example query
  • Q1: Count the accumulated number of large IP
    packets (> 250 bytes) transmitted over the link.
  • Q2: Find the number and average length of TCP/IP
    packets for every successive 5-second time
    window. Save to a file.

14
Tribeca example query
source_stream s1 is live, atm_link_1476, AtmCellTrace
result_stream r1 is file res1
stream_demux s1.atm.vci p1
[Diagram labels: s1 (ATM cells), demux on VCI]
15
Tribeca example query
source_stream s1 is live, atm_link_1476, AtmCellTrace
result_stream r1 is file res1
stream_demux s1.atm.vci p1
stream_proj p1.assemble_ip p2
stream_mux p2 p3
assemble_ip is a user-defined function.
[Diagram labels: s1 (ATM cells), demux, assemble & extract, p2 (IP packets), mux, p3]
16
Tribeca example query
source_stream s1 is live, atm_link_1476, AtmCellTrace
result_stream r1 is file res1
stream_demux s1.atm.vci p1
stream_proj p1.assemble_ip p2
stream_mux p2 p3
stream_qual p3.length.geq 250 p4
stream_agg p4.count
[Diagram labels: s1 (ATM cells), demux, assemble & extract, IP packets, mux, length > 250, p4, count, display]
17
Tribeca example query
source_stream s1 is live, atm_link_1476, AtmCellTrace
result_stream r1 is file res1
stream_demux s1.atm.vci p1
stream_proj p1.assemble_ip p2
stream_mux p2 p3
stream_qual p3.length.geq 250 p4
stream_agg p4.count
stream_qual p3.type.eq TCP p5
stream_agg p5.count, p5.length.avg on fixed window 5 sec r1
[Diagram labels: s1 (ATM cells), demux, assemble & extract, IP packets, mux, length > 250, count, display, p5, type = TCP, fixed 5 sec window, count, avg(length), r1 (save to file)]
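The data flow of the complete query can be mimicked in a few lines of Python. This is a simulation for reading the query, not Tribeca itself: the cell fields (`ts`, `vci`, `nbytes`, `last`, `proto`), the `assemble_ip` stand-in, and the in-memory lists are all assumptions, and the byte counts are illustrative rather than real ATM payload sizes. Comments map each step to the corresponding statement.

```python
import heapq
from collections import defaultdict


def stream_demux(cells):
    """Demux on VCI: one sub-stream of cells per virtual circuit."""
    by_vci = defaultdict(list)
    for cell in cells:
        by_vci[cell["vci"]].append(cell)
    return by_vci


def assemble_ip(cells):
    """Hypothetical stand-in for the user-defined assemble_ip function:
    accumulate cell payload sizes until a cell flagged `last` arrives."""
    size = 0
    for cell in cells:
        size += cell["nbytes"]
        if cell["last"]:
            yield {"ts": cell["ts"], "length": size, "type": cell["proto"]}
            size = 0


def run_query(cells, window_sec=5):
    p1 = stream_demux(cells)                              # stream_demux s1.atm.vci p1
    p2 = [list(assemble_ip(sub)) for sub in p1.values()]  # stream_proj p1.assemble_ip p2
    p3 = list(heapq.merge(*p2, key=lambda p: p["ts"]))    # stream_mux p2 p3
    p4 = [p for p in p3 if p["length"] >= 250]            # stream_qual p3.length.geq 250 p4
    q1_count = len(p4)                                    # stream_agg p4.count
    p5 = [p for p in p3 if p["type"] == "TCP"]            # stream_qual p3.type.eq TCP p5
    windows = defaultdict(list)                           # fixed window 5 sec
    for p in p5:
        windows[int(p["ts"] // window_sec)].append(p["length"])
    r1 = [(w * window_sec, len(ls), sum(ls) / len(ls))    # p5.count, p5.length.avg
          for w, ls in sorted(windows.items())]
    return q1_count, r1
```

Note that the query text uses `geq` (at least 250 bytes) while the Q1 description says "> 250 bytes"; the sketch follows the query text.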
18
Tribeca
  • Data type inheritance (IP -> TCP, UDP)
  • Window: fixed vs. moving; user-defined delimiter
  • Record: fixed length, variable length, framing
  • Implementation optimization:
      • Dual buffers
      • Minimize data copying: pass pointers instead
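The difference between fixed and moving windows can be shown with a small sketch. It assumes count-based windows and an average as the aggregate, purely for illustration; the function names are hypothetical, and Tribeca's own window semantics (including user-defined delimiters) are not reproduced here.

```python
from collections import deque


def fixed_windows(values, size):
    """Non-overlapping windows: one aggregate per `size` records."""
    batch = []
    for v in values:
        batch.append(v)
        if len(batch) == size:
            yield sum(batch) / size
            batch = []


def moving_windows(values, size):
    """Overlapping windows: one aggregate per record once the window
    holds `size` records; the oldest record slides out each step."""
    win = deque(maxlen=size)
    for v in values:
        win.append(v)
        if len(win) == size:
            yield sum(win) / size
```

For the input 1..6 with a window of 3, the fixed variant emits two averages (2.0 and 5.0) while the moving variant emits four (2.0, 3.0, 4.0, 5.0), so a moving window produces more output and must retain the last `size` records at all times.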

19
Related Activities
  • CAIDA
  • SLAC
  • NLANR
  • XIWT
  • AT&T, HP, Sun, Telcordia, ...
  • passive Internet traffic collection at major
    Internet backbone routers

20
Related Work
  • Tangram (Parker90, 92):
      • A model that captures streams, sets, and parallelism
      • More a state machine than a query language
  • SEQ (Seshadri95, 96):
      • Static sequences
  • Datacycle (Bowen92):
      • Information filtering on broadcast data