Title: Network Anomalies: Origins, Structure and Diagnosis


1
Network Anomalies: Origins, Structure and
Diagnosis
  • Anukool Lakhina, Oral Qualifier Exam

Committee: Azer Bestavros, John Byers, Mark
Crovella (advisor)
2
Defining Anomalies
  • anomaly (Merriam-Webster, 2004):
  • deviation from the common rule, an irregularity
  • something different, abnormal, peculiar, or not
    easily classified
  • Anomalies are unusual events, relative to some
    baseline normal state
  • Relative events → difficult to pin down
    precisely

3
Network Anomalies
  • We want to capture unusual events that network
    operators care about
  • Network anomalies are unexpected events that can
    adversely impact the availability and
    performance of networks

4
Examples
Survey of problem reports to NANOG:
1994-99: 241 serious problems reported.
2001-04: 375 of the same problems reported.
Routing loops, instability, hijacks, filtering,
blackholes, outages, etc. have not reduced in
10 years! [Feamster04]
5
More Examples
6
Talk Organization
  • Origin of Anomalies
  • How and why do anomalies arise?
  • Structure of Anomalies
  • How do anomalies differ?
  • How can we organize them?
  • Diagnosing Anomalies
  • Strategies to detect, identify, and mitigate
    anomalies
  • This talk: focus on detection and identification

7
Origin of Anomalies
  • Operational events
  • Misconfigurations, accidents, ...
  • Design/Implementation Consequences
  • Software bugs, unexpected protocol interactions, ...
  • Network abuse (malicious intent)
  • DOS attacks, worms, intrusions, ...
  • Unusual end-user behavior (not malicious)
  • Flash crowds, peer-to-peer traffic, ...

8
Operational Anomalies
  • Broadly, arise due to operator error and
    accidents
  • Famous example: AS 7007 incident
  • In 1997, a small ISP announced routes for the
    entire Internet
  • Most ISPs believed the announcements
  • Result: all Internet traffic sent to one router
    for several hours, disrupting Internet
    connectivity to hundreds of networks [Farrow02]
  • BGP configuration errors are pervasive
    [Mahajan02]
  • Cause erroneous updates to 0.1-1% of the BGP
    table each day; 4% of these can cause outage
  • Many errors due to primitive router configuration
  • Distributed program without ability to compile
    & test [Feamster03, Caldwell03]

9
Operator error: not exclusive to the Internet
Operator error is the single largest contributor to
failures in Tandem transaction processing
systems, accounting for 42% of all failures
[Gray85]
10
prevalent in yet more domains
[Oppenheimer03]
  • Dominant in large Internet services
  • Studied failure data from 3 geographically
    distributed Internet services
  • 51% of failures due to operator error
  • Dominant in Public Switched Telephone Networks
  • [Kuhn97] studied FCC disruption reports from
    1992-1994
  • 50% of outages due to operator error
  • 54% in 2000 [Enriquez02]

[Kuhn97]
[Enriquez02]
11
Reasons for Human Error
  • Humans make errors, even when they know what they
    are doing
  • Because understanding state of large,
    tightly-coupled systems is difficult
  • Humans not good at diagnosing problems from first
    principles
  • Especially in an emergency
  • Automation solves common tasks, leaving the rare,
    complex ones for operators
  • Automation irony: poor automation reduces system
    visibility → harder for operators to diagnose
    anomalies
  • Lessons: errors will always occur; automation
    should work in synergy with the operator


[Reason90]
12
Design & Implementation Flaws
  • Anomalies that arise due to implementation bugs
    or from flawed design. Examples:
  • 1988: Internet congestion collapse [Jacobson88]
  • 2004: Unexpected protocol interactions, e.g.,
    interaction between inter-domain & intra-domain
    routing
  • Design goal: isolate Internet from routing
    changes within an AS
  • Reality: small changes in internal routing
    weights can cause large traffic shifts in
    neighbor networks
  • Result: operators set IGP metrics that make BGP
    sense, rather than view them separately → more
    complexity
  • Lesson: routing protocols not designed with
    interactions, network design configurations, and
    dynamics in mind
  • [Teixeira04, Teixeira05]

13
Reasons for Design Flaws
  • Latent design errors exposed in stress conditions
  • Congestion collapse, routers reboot when
    overloaded
  • Correlated failures frequently occur
  • Evidence in IP networks [CBI04], in
    Internet-scale distributed services
    [Yalagandula04]
  • Traditional fault-tolerant designs assume
    independent failures
  • Margin of Safety (in Civil Engineering)
  • 25% of all railroad bridges failed between
    1850-1890s! [Petroski92]
  • What is the equivalent of a margin of safety
    for computer systems & networks? [Patterson02]
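
The gap between the independence assumption and correlated failures can be made concrete with a little arithmetic. The sketch below is illustrative only: the two-replica setup, the function names, and the probabilities are assumptions, not figures from the cited studies. It compares the chance that both replicas fail when failures are independent with the chance when a shared common-cause event can take both down at once.

```python
def p_both_fail_independent(p: float) -> float:
    """Both replicas fail, assuming each fails independently with probability p."""
    return p * p


def p_both_fail_common_cause(p: float, c: float) -> float:
    """Both replicas fail when a common-cause event (probability c) takes out
    both at once; otherwise the replicas fail independently as above."""
    return c + (1.0 - c) * p * p


if __name__ == "__main__":
    p, c = 0.01, 0.001  # hypothetical per-replica and common-cause probabilities
    independent = p_both_fail_independent(p)
    correlated = p_both_fail_common_cause(p, c)
    print(f"independent: {independent:.6f}")
    print(f"with common cause: {correlated:.6f}")
    print(f"ratio: {correlated / independent:.1f}x")
```

Even a rare shared cause dominates the result here, which is why a design sized for independent failures can have far less margin than it appears to.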

[Perrow90]
[Petroski92]
14
Abuse Anomalies
  • Defining characteristic: arise from malicious
    intent and violate the target's confidentiality,
    integrity, and availability [RS91]
  • Examples
  • Intrusions target confidentiality,
  • Route hijacks target integrity,
  • DOS attacks target availability

15
Reasons for Abuse Anomalies
  • Technical enablers
  • Unrestricted connectivity, platform homogeneity,
    anonymity, few defenses
  • Attacker uses automation to target all systems at
    once. Defender must defend all systems
  • Traditional threat: attacker targets high-value
    target, defender allocates more resources to
    defend it
  • Economic motivations
  • Profit: spam forwarding, extortion, ...
  • Emerging marketplace: can buy & sell zombie
    machines
  • Bad guys now have financial incentive to get
    better
  • [Savage05]
  • Political reasons
  • Cyber-terror, cyber-warfare, political protest
    [Weaver04]

16
Unusual End-User behavior
  • Not malicious in intent but can have harmful
    impact on availability and performance
  • Important to manage these anomalies to provision
    network resources
  • e.g., high rate flows, peer to peer traffic,
    measurement experiments, flash crowds
  • Flash crowds: unusual demand for a resource
  • e.g., Starr report, the Slashdot effect, ...

17
Flash Crowds hit MSNBC.com
"MSNBC is experiencing high site traffic. We have
temporarily moved your personalized news to a
separate page: click here"
(MSNBC.com homepage during a flash crowd)
18
Lessons from Anomaly Origins
  • Network anomalies span a broad range, are
    prevalent, and can have catastrophic impact
  • Anomalies are here to stay; new anomalies will
    arise
  • Cannot prevent anomalies, but can try to
    accommodate them
  • Despite anomalies, PSTN still had 99.999%
    availability. Two reasons:
  • 1) Error detection built into design: designers
    devote half of the software in telephone switches
    to error detection and correction [Kuhn97]
  • 2) Quick & correct human intervention (and the
    capabilities to intervene)
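
To put the "five nines" availability figure in perspective, the sketch below converts an availability percentage into a yearly downtime budget. The function name is hypothetical and a 365-day year is assumed.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years ignored


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the yearly downtime implied by an availability percentage."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * MINUTES_PER_YEAR


if __name__ == "__main__":
    for pct in (99.9, 99.99, 99.999):
        print(f"{pct}% available -> {downtime_minutes_per_year(pct):.2f} min/year down")
```

Under these assumptions, 99.999% availability leaves a little over five minutes of downtime per year.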

19
Talk Organization
  • Origin of Anomalies
  • How and why do anomalies arise?
  • Structure of Anomalies
  • How do anomalies differ?
  • How can we organize them?
  • Diagnosing Anomalies
  • Strategies to detect, identify, and mitigate
    anomalies
  • This talk: focus on detection and identification

20
A Common Feature: Anomalies Create Unusual
Network Traffic
  • Despite their diversity, many serious anomalies
    create unusual traffic
  • e.g., DDOS, flash crowds, outages, scans, worms
  • Some anomalies may not affect traffic until
    they become serious, e.g., dormant configuration
    or design errors that result in an outage event
  • Challenges
  • How do we mine anomalies in traffic?
  • What type of traffic to analyze?

(Figure: All Anomalies; Anomalies visible in traffic)
21
Anomalies by Layer
  • Application Layer: generates, interprets data
  • Transport Layer: reliable data transfer (TCP)
  • Network Layer: address assignment and routing
    (BGP, IGP, ...)
  • Physical Layer: MAC addressing and bit
    transmission

Layered TCP/IP Model
22
Abuse Examples, by Layer
  • DDOS floods
  • TCP attacks
  • Route hijacks
  • MAC flooding
  • MAC flooding:
  • Overwhelm address-to-physical-port mappings at
    switch with spoofed packets
  • Switch enters fail-open mode → all incoming
    traffic is now broadcast
  • Result: eavesdropping of legitimate network
    traffic
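
The MAC flooding steps above can be illustrated with a toy model. This is a simplified sketch, not a description of any real switch: the `ToySwitch` class, its fixed-size table, and the host names are all hypothetical, and real switches also age out entries. Once spoofed source addresses fill the table, legitimate hosts cannot be learned, so frames addressed to them are flooded to every port, including the attacker's.

```python
class ToySwitch:
    """Learning switch with a bounded MAC table (hypothetical toy model)."""

    def __init__(self, table_capacity: int):
        self.table_capacity = table_capacity
        self.mac_to_port = {}  # learned source MAC address -> port

    def learn(self, src_mac: str, port: int) -> None:
        # Update a known address, or learn a new one only while there is room.
        if src_mac in self.mac_to_port or len(self.mac_to_port) < self.table_capacity:
            self.mac_to_port[src_mac] = port

    def output_ports(self, dst_mac: str, in_port: int, all_ports: list) -> list:
        # Known destination: forward to its single port.
        if dst_mac in self.mac_to_port:
            return [self.mac_to_port[dst_mac]]
        # Unknown destination: flood to every port except the ingress port.
        return [p for p in all_ports if p != in_port]


def flooded_switch_demo() -> list:
    ports = [1, 2, 3]
    switch = ToySwitch(table_capacity=4)
    # Attacker on port 3 sends frames with many spoofed source addresses.
    for i in range(10):
        switch.learn(f"spoofed-{i}", 3)
    # Legitimate hosts speak afterwards, but the table is already full.
    switch.learn("host-A", 1)
    switch.learn("host-B", 2)
    # A frame from host-A to host-B is now flooded, reaching port 3 as well.
    return switch.output_ports("host-B", in_port=1, all_ports=ports)


if __name__ == "__main__":
    print(flooded_switch_demo())
```

Without the flood of spoofed addresses, the same frame would leave only on host-B's port.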

Deny service to the victim by 1) overwhelming a
resource (DOS, MAC), 2) masquerading as a resource
(hijacks), 3) timing attacks (TCP attacks,
RoQ). RoQ attacks are timing-based attacks
that target adaptation mechanisms instead of the
victim directly, and are thus capable of attacking
at multiple layers [Guirguis04, Guirguis05]
23
Anomaly Origin & Structure
24
Examples by Origin & Structure
25
Multi-Layer Propagation
  • Anomalies travel across layers
  • Each layer has containment capability (e.g.,
    checksums)
  • Going up: if error checks fail to contain
    anomalies, e.g., link failures
  • Going down: e.g., router reboots on overload,
    Nimda worm caused routing instability
    [Andersen04, Wang02]
  • Multi-Layer Anomalies
  • Anomalies at multiple layers
  • e.g., spammers hijack route prefixes (network
    layer) to send spam at the application layer
    [Bellovin01]
  • Makes diagnosis challenging,
  • especially root-cause analysis

26
Talk Organization
  • Origin of Anomalies
  • How and why do anomalies arise?
  • Structure of Anomalies
  • How do anomalies differ?
  • How can we organize them?
  • Diagnosing Anomalies
  • Strategies to detect, identify, and mitigate
    anomalies
  • This talk: focus on detection and identification

27
Anomaly Diagnosis
  • Key ingredients of Anomaly Diagnosis
  • Detection: stating when an anomaly has occurred
    or is occurring
  • Identification: isolating the anomaly from
    normal behavior, stating its type, and where
    possible, exposing its structure and origin
  • Diagnosis also includes
  • Mitigation: avoiding, managing and controlling
    adverse impact of anomalies
  • Lots of research here, most centered on re-design
    of protocols and architectures
  • Not in this talk

28
Identification Approaches
  • Anomaly Identification: given a detected anomaly,
    what is its structure and cause?
  • Problems
  • Origin & structure are typically not
    observable
  • Evidence is partial, ambiguous, inconsistent
  • Strategies
  • Model-driven: e.g., traversing system
    dependency models
  • Rule-based: e.g., expert systems, case-based
    reasoning
  • Learning-based: e.g., clustering and
    classification

29
Detection Approaches
  • 1) Anomalies as known signatures to match
  • Key challenge: define a broad set of signatures
    (without causing false alarms)
  • Advantage: identification for free
  • Problem: cannot detect new anomalies
  • 2) Anomalies as deviations from normality
  • Key challenge: define notion of normality
  • Advantage: can detect new anomalies
  • Problem: identification problem is difficult
  • 3) Hybrid schemes
  • Match anomaly against known signatures; if no
    match, then check for deviations from normality
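
The hybrid scheme above can be sketched in a few lines. Everything concrete here is an assumption made for illustration: the record fields, the two toy signatures, and the three-standard-deviation rule are hypothetical and are not taken from the talk.

```python
from statistics import mean, stdev

# Hypothetical signature set: name -> predicate over a traffic record (a dict).
SIGNATURES = {
    "syn_flood": lambda r: r["syn_ratio"] > 0.9 and r["packets"] > 10_000,
    "port_scan": lambda r: r["distinct_dst_ports"] > 1_000,
}


def hybrid_detect(record: dict, baseline_packets: list, threshold: float = 3.0):
    """Return (verdict, signature_name) for one traffic record."""
    # Step 1: signature matching. A match identifies the anomaly type for free.
    for name, matches in SIGNATURES.items():
        if matches(record):
            return ("signature", name)
    # Step 2: no signature matched, so test for deviation from normality,
    # modeled here (as an assumption) by mean and standard deviation.
    mu, sigma = mean(baseline_packets), stdev(baseline_packets)
    if sigma > 0 and abs(record["packets"] - mu) > threshold * sigma:
        return ("deviation", None)  # detected, but the type is still unknown
    return ("normal", None)


if __name__ == "__main__":
    baseline = [100, 110, 90, 105, 95]
    flood = {"packets": 20_000, "syn_ratio": 0.95, "distinct_dst_ports": 3}
    odd = {"packets": 500, "syn_ratio": 0.10, "distinct_dst_ports": 5}
    print(hybrid_detect(flood, baseline))
    print(hybrid_detect(odd, baseline))
```

The second verdict shows the trade-off listed above: a deviation is flagged even for an unseen anomaly, but it carries no identification.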

this talk
30
Deviation from normal
"The model is based on the hypothesis that
exploitation of a system's vulnerabilities involves
abnormal use of the system; therefore, security
violations could be detected from abnormal
patterns of system usage."
[Denning86]
31
Capturing normal behavior
  • Broadly, three strategies to model normal
    behavior
  • System-knowledge models
  • Use only a priori knowledge of system
  • Useful in static settings when normal behavior
    is known (e.g., configuration correctness
    checking)
  • Data-driven models
  • Use measurements to model normality via
    correlation in data, e.g., using temporal or
    spatial correlation
  • Useful in dynamic settings or when normal is
    unknown (e.g., for traffic anomalies)
  • Hybrid schemes
  • Use measurements as input to a system model, and
    predict expected behavior, e.g., via dependency
    graphs
  • Useful when normal behavior for system is known,
    but workload is unknown (e.g., vulnerability
    filters for end-hosts)
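
As one illustration of a data-driven model that uses temporal correlation, the sketch below forecasts each measurement with an exponentially weighted moving average (EWMA) and flags points that fall far from the forecast. The choice of EWMA, the parameter values, and the sample series are assumptions for illustration; the slides do not prescribe a specific model.

```python
def ewma_anomalies(series, alpha=0.3, threshold=3.0, warmup=5):
    """Return indices whose value deviates strongly from an EWMA forecast.

    alpha     : smoothing weight given to the newest observation
    threshold : allowed deviation, in units of the estimated error std. dev.
    warmup    : number of initial points used only to train the baseline
    """
    flagged = []
    forecast = series[0]
    variance = 0.0  # running estimate of the squared forecast error
    for t in range(1, len(series)):
        error = series[t] - forecast
        is_anomaly = (
            t >= warmup
            and variance > 0
            and abs(error) > threshold * variance ** 0.5
        )
        if is_anomaly:
            flagged.append(t)  # keep anomalous points out of the baseline
        else:
            forecast += alpha * error
            variance = (1 - alpha) * variance + alpha * error * error
    return flagged


if __name__ == "__main__":
    # Hypothetical per-interval byte counts with one obvious spike at index 7.
    traffic = [100, 102, 98, 101, 99, 100, 102, 400, 101, 99]
    print(ewma_anomalies(traffic))
```

Skipping the update on flagged points is a design choice: it keeps a large spike from inflating the baseline and masking what follows.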

32
Measurements by Layer
  • Measurements placed in a layer if they are
    available at that layer (and useful to diagnose
    anomalies)

33
How methods use data to capture normal
A broad categorization of methods
34
Measurements by Location
  • Measure at individual end-host (edge)
  • Detailed payloads possible
  • Limited global visibility, limited mitigation
  • e.g., network intrusion detection systems
    [Paxson99]
  • Measure at multiple end-hosts (overlay)
  • Detailed payloads that are exchanged
  • Better global visibility, but still limited
    mitigation
  • e.g., collaborative intrusion detection
    [Yegneswaran04]
  • Measure at core (ISP)
  • Sampled packet headers
  • Network-wide visibility, effective mitigation
    possible
  • But mining network-wide traffic is difficult
    [LCD05]

35
Recent Trends in Diagnosis
  • Network-Wide Diagnosis
  • Exploit correlation across links in order to
  • build models of normal traffic,
  • trace how anomalies move in the network
    [LCD04, LCD05]
  • Multi-Layer Diagnosis
  • Correlate traffic data from multiple layers
    simultaneously [Roughan04]
  • Enables sophisticated identification and
    root-cause analysis
  • Also a recent thrust in fault-management
    literature [Steinder04]
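
One way to exploit correlation across links is to project each time bin's vector of link loads onto the dominant direction of variation and treat whatever is left over as a sign of abnormal traffic. The sketch below does this with a single principal component found by power iteration. The slides do not spell out a method, so this technique, the function names, and the three-link data set are assumptions for illustration.

```python
def top_direction(rows, iterations=100):
    """Dominant direction of variation across links, via power iteration."""
    n = len(rows[0])
    v = [1.0 / n ** 0.5] * n
    for _ in range(iterations):
        scores = [sum(r[j] * v[j] for j in range(n)) for r in rows]
        w = [sum(s * r[j] for s, r in zip(scores, rows)) for j in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0:  # degenerate data: keep the current direction
            break
        v = [x / norm for x in w]
    return v


def residual_norms(rows):
    """Per time bin, the size of the traffic not explained by the top direction."""
    n = len(rows[0])
    means = [sum(r[j] for r in rows) / len(rows) for j in range(n)]
    centered = [[r[j] - means[j] for j in range(n)] for r in rows]
    v = top_direction(centered)
    residuals = []
    for r in centered:
        score = sum(r[j] * v[j] for j in range(n))
        residuals.append(sum((r[j] - score * v[j]) ** 2 for j in range(n)) ** 0.5)
    return residuals


def example_link_loads():
    """Hypothetical loads on 3 links over 24 time bins, sharing one daily pattern."""
    levels = [20, 30, 40, 50, 60, 70, 80, 70, 60, 50, 40, 30] * 2
    rows = [[float(l), 2.0 * l, 3.0 * l] for l in levels]
    rows[3][0] += 40.0  # inject an anomaly on the first link at time bin 3
    return rows


if __name__ == "__main__":
    res = residual_norms(example_link_loads())
    worst = max(range(len(res)), key=lambda i: res[i])
    print("time bin with largest residual:", worst)
```

The injected spike breaks the pattern the links normally share, so it stands out in the residual even though its size is within the normal range of total traffic.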

36
Final thoughts
  • Network anomalies span a wide range and can have
    severe impact
  • Network anomalies are increasing in prevalence
    and unlikely to go away
  • Effective diagnosis and mitigation methods are
    needed to manage anomalies
  • Despite their diversity, many anomalies disturb
    network traffic
  • General anomaly diagnosis may be possible by
    mining anomalies in network traffic at different
    layers and topological locations