Title: Analyzing the MAC-level Behavior of Wireless Networks in the Wild
1. Analyzing the MAC-level Behavior of Wireless Networks in the Wild
- By Ratul Mahajan, Maya Rodrig, David Wetherall, John Zahorjan
- Presented by Andy Cheng, Worm Yung
2. Introduction
- Authors
- Ratul Mahajan
- Maya Rodrig
- David Wetherall
- John Zahorjan
- Goal: To develop a non-intrusive tool, called Wit, that builds on passive monitoring to analyze the detailed MAC-level behavior of operational wireless networks.
3. Introduction
- Three processing steps
- Merging traces from multiple monitors
- Reconstructing packets that were not captured
- Deriving network performance measures
- Assessing Wit
- Simulation tests
- Real-world test: the SIGCOMM 2004 conference
4. Background
- Measurement-driven analysis of live networks is critical to understanding and improving their operation.
- For wireless networks, very little detailed information is available on the performance of real deployments.
- Example: determining how often clients retransmit their packets.
5. Background
- SNMP logs from APs, or packet traces from the wire adjacent to the APs, are not sufficient for the task.
- Passive monitoring is the main approach in this paper.
- Passive monitoring: a technique that captures traffic from a network by recording a copy of that traffic.
6. Passive Monitoring
- One or more nodes in the vicinity of the wireless network record the attributes of all transmissions that they observe.
- This approach is almost trivial to deploy.
- The traces collected are limited in several respects and will be incomplete.
- It is not easy to estimate how much information is missing.
7. Passive Monitoring
- Traces do not record whether packets were successfully received by their destinations.
- Traces only record information about packet events and omit other important network characteristics.
- Unlike instrumentation, passive monitoring lacks access to the internal state of the nodes.
8. Goal of this Paper
- Develop sound methodologies for analyzing traces collected via passive monitoring of wireless networks.
- Be able to investigate questions that are not easy to answer today, such as:
- How often do clients retransmit their packets?
- What is the average asymmetry in the loss ratio between two nodes?
- How does network performance vary with offered load?
9. Introduction of Wit
- Product of this paper: Wit
- Wit is composed of three components
10. Introduction of Wit
- The first component merges the independent views of multiple monitors into a single, consistent view.
- The second component uses novel inference procedures, based on a formal language model of protocol interactions, to determine whether each packet was received by its destination. It also infers, and adds to the trace, packets that are missing from the merged trace.
11. Introduction of Wit
- The third component derives network-level measures from the enhanced trace. In addition to simple measures such as packet reception probabilities, it estimates the number of nodes contending for the medium as a measure of offered load.
- The authors expect that future research will add more techniques to this third component.
12. Technical Description - Overall
- Key challenges
- A single monitor will miss many transmitted packets
- Monitors do not log whether a packet was received by its destination
- Monitors do not log network-level information
13. Technical Description - Merging
- Input: a number of packet traces, each collected by a different monitor.
- Output: a single, consistent timeline for all the packets observed across all the monitors.
- Eliminate duplicates.
- Assign coherent timestamps to all packets.
14. Technical Description - Merging
- Challenges
- The clocks are not synchronized among different monitors.
- The monitors' clocks may have significantly different skews and drifts.
- Identifying duplicated packets: the only way to distinguish duplicates across traces from distinct transmissions is by time.
15. Technical Description - Merging
- Key: an accurate unified timestamp
- To produce an accurate unified timestamp, the authors propose leveraging reference packets that can be reliably identified as identical across monitors.
16. Technical Description - Merging
- Three steps to merge pairs of traces
17. Technical Description - Merging
- Step 1: identify common references
- Use beacons generated by APs as references, since they carry a unique source MAC address and the 64-bit value of a local, microsecond-resolution timer.
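As a concrete illustration of this step, the sketch below pairs up beacons seen by two monitors using the (source MAC, AP timer value) pair, which uniquely identifies one over-the-air beacon. The trace layout and field names are invented for the example, not Wit's actual format.

```python
def find_common_references(trace_a, trace_b):
    """Pair up beacons observed by both monitors.

    Each trace entry is (local_timestamp, packet), where a packet is a
    dict with 'type', 'src' (AP MAC), and 'ap_timer' (the 64-bit
    microsecond timer the AP embeds in every beacon). The (src, ap_timer)
    pair uniquely identifies one over-the-air beacon transmission.
    """
    beacons_a = {(p['src'], p['ap_timer']): ts
                 for ts, p in trace_a if p['type'] == 'BEACON'}
    refs = []
    for ts_b, p in trace_b:
        if p['type'] == 'BEACON':
            key = (p['src'], p['ap_timer'])
            if key in beacons_a:
                refs.append((beacons_a[key], ts_b))  # (time at A, time at B)
    return sorted(refs)
```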
18. Technical Description - Merging
- Step 2: translate timestamps of the second trace
- Use the timestamps of the references to translate the time coordinates of the second monitor into those of the first.
- A simple linear function is used for the translation:
- The second trace is stretched or shrunk
- A constant is added to align the two
19. Technical Description - Merging
- Step 2: translate timestamps of the second trace (cont.)
- Linear timestamp translation assumes that the relative clock drift is constant.
- This is not true in reality: even the most reliable time sources exhibit varying relative drift.
- Therefore, multiple references are used, with independent interpolation between successive pairs. This reduces the impact of clock fluctuations to short intervals.
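The interpolation described above can be sketched as follows. This is a minimal illustration of piecewise-linear translation between reference pairs; the function and variable names are invented, not Wit's actual code.

```python
import bisect

def translate_timestamp(ts_b, refs):
    """Map a timestamp from monitor B's clock onto monitor A's clock.

    refs is a time-sorted list of (time_at_a, time_at_b) reference pairs,
    e.g. common beacons. Interpolating independently between successive
    pairs confines clock-drift fluctuations to short intervals.
    """
    b_times = [b for _, b in refs]
    i = bisect.bisect_right(b_times, ts_b)
    i = min(max(i, 1), len(refs) - 1)   # clamp: extrapolate beyond the ends
    a0, b0 = refs[i - 1]
    a1, b1 = refs[i]
    scale = (a1 - a0) / (b1 - b0)       # stretches or shrinks B's clock
    return a0 + (ts_b - b0) * scale     # shifts it to align with A's
```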
20. Technical Description - Merging
- Step 3: identify and remove duplicates
- Duplicates are identified as packets of the same type:
- Having the same source and destination
- With a timestamp difference of less than half the minimum time to transmit a packet (106 microseconds for 802.11b)
21. Technical Description - Merging
- Issues
- It is unlikely that any one monitor will have enough references in common with each of the others.
- Therefore, a waterfall merging process is used: the trace merged from two monitors is then merged with the trace from a third monitor, and so on.
- This has a longer running time but improves precision.
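The waterfall process is essentially a left fold over the monitors. A sketch, where merge_pair stands in for the pairwise translate-and-deduplicate step described on the previous slides:

```python
from functools import reduce

def waterfall_merge(traces, merge_pair):
    """Merge monitors one at a time: the result of merging the first two
    traces is merged with the third, and so on. Each new monitor then
    only needs common references with the already-merged trace, not with
    every individual monitor."""
    return reduce(merge_pair, traces)
```

Because the already-merged trace accumulates references from every monitor folded in so far, later monitors are more likely to find enough common references, at the cost of extra passes over the data.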
22. Technical Description - Inferring Missing Information
- The inference phase serves two purposes:
- It uses the information in packets that the monitors did capture to infer at least some that they did not.
- It introduces an entirely new class of information to passive traces: an estimate of whether packets were received by their destination.
23. Technical Description - Inferring Missing Information
- Key insight behind the inference technique
- The packets a node transmits often imply useful information about the packets it must have received.
- E.g., an AP sends an ASSOCIATION RESPONSE only if it recently received an ASSOCIATION REQUEST. (Sender → Receiver)
- Therefore, many key attributes of the missing packet can be reconstructed.
24. Technical Description - Inferring Missing Information
- The formal language approach
- Cast the inference problem as a language recognition task.
- View the input trace as interleaved partial sentences from the language.
- Presume that there was a sentence in the language for which we see only some of the symbols, and ask what complete sentence it was likely to have been.
- The language is recognized by finite state machines (FSMs).
25. Technical Description - Inferring Missing Information
- Processing the trace
- Classify
- Map packets to symbols of the language, based primarily on their type.
- Use the values of the retry bit and the fragment number field in forming symbols; this provides some additional leverage in making inferences, at the cost of a somewhat larger symbol set and FSM.
- Identify the conversation of the packet based on its source and destination.
- For packets without a source field (ACKs and CTSs), the source is deduced from earlier packets.
- Non-unicast packets are treated as conversations of a single packet.
26. Technical Description - Inferring Missing Information
- Processing the trace
- Generate markers
- Introduce a marker if the currently scanned packet indicates that an ongoing conversation has ended.
- This occurs under one of the following conditions:
- The sequence number field signals a new conversation between the endpoints.
- For non-AP nodes, the other endpoint of the current packet differs from the earlier one.
- There is no legal transition in the FSM for the current symbol.
- The timeout interval has passed since the last seen activity.
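The four end-of-conversation conditions above can be sketched as a single predicate. The packet and state fields, and the timeout constant, are illustrative, not Wit's actual representation.

```python
TIMEOUT_US = 50_000  # illustrative timeout, not Wit's actual value

def conversation_ended(pkt, state):
    """True if the current packet indicates the ongoing conversation has
    ended, i.e. a marker should be generated.

    state summarizes the conversation so far: the last sequence number,
    the other endpoint, the time of the last activity, and the set of
    symbols with a legal FSM transition from the current state.
    """
    return (pkt['seq'] != state['seq']                # new conversation signalled
            or (not pkt['from_ap']
                and pkt['peer'] != state['peer'])     # non-AP node changed endpoint
            or pkt['symbol'] not in state['legal']    # no legal FSM transition
            or pkt['ts'] - state['last_ts'] > TIMEOUT_US)  # timed out
```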
27. Technical Description - Inferring Missing Information
28. Technical Description - Inferring Missing Information
- Processing the trace
- Take FSM step
- Key to this process is the construction of the FSM.
- We cannot simply use an FSM corresponding to the protocol, because packets (i.e., sentence symbols) are missing from the trace, and because we want to use the FSM to estimate which packets were received by their destination.
- Traditional FSM matching must be extended.
- The FSM the authors produced for the complete 802.11 protocol has 339 states.
29. Technical Description - Inferring Missing Information
- Processing the trace
- Take FSM step (cont.)
- Inferring packet reception
- Walk non-deterministically along both the packet-received and packet-lost edges, and encode the current state as the distinct set of paths traversed so far.
- When the accept state is reached, the edges traversed on the paths from the start state can be examined to determine whether each packet was received.
30. Technical Description - Inferring Missing Information
- Processing the trace
- Take FSM step (cont.)
- Inferring missing packets
- With missing packets, there may be no legal transition for the current symbol.
- The solution is to augment the FSM with additional edges.
- Abstractly, for each pair of states (Si, Sj) ≠ (Start, Accept), we add an edge from Si to Sj for each distinct trail (a path with no repeated edges) from Si to Sj, labeling it with the final symbol of the trail.
31. Technical Description - Inferring Missing Information
- Processing the trace
- Take FSM step (cont.)
- Inferring missing packets (cont.)
- Move non-deterministically in the augmented FSM until the accept state is reached.
- There may be multiple paths from Start to Accept, all of them consistent with the captured packets. To select one, we assign weights to paths and pick the lowest-weight path. The weight of a path reflects the number of packets it indicates as missing and the rarity of those packet types.
- Un-augmented edges, which correspond to captured packets, have zero weight.
- The weight of an augmented edge is the sum of the weights of the symbols in its annotation.
32. Technical Description - Inferring Missing Information
- Processing the trace
- Take FSM step (cont.)
- Inferring missing packets (cont.)
- The weighting method prefers the shorter of two paths when the symbols of one are a subset of the other's.
- It thus produces conservative estimates of missing packets.
- When the accept state is reached, we synthesize any missing packets along the selected path.
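The path selection amounts to a shortest-path search over the augmented FSM, with zero-weight edges for captured packets. A small Dijkstra-style sketch, where the graph encoding is invented for illustration; the real nitWit walks the FSM non-deterministically as trace symbols arrive.

```python
import heapq

def least_weight_path(edges, start='Start', accept='Accept'):
    """Pick the lowest-weight path from start to accept.

    edges maps a state to a list of (next_state, weight, missing_symbols):
    un-augmented edges (captured packets) have weight 0 and no missing
    symbols, while augmented edges carry the summed weights of the symbols
    they skip over. Returns (weight, missing symbols to synthesize).
    """
    heap = [(0, 0, start, [])]   # (weight, tie-breaker, state, missing so far)
    seen, tick = set(), 0
    while heap:
        w, _, state, missing = heapq.heappop(heap)
        if state == accept:
            return w, missing
        if state in seen:
            continue
        seen.add(state)
        for nxt, ew, syms in edges.get(state, []):
            tick += 1            # unique tie-breaker so lists never compare
            heapq.heappush(heap, (w + ew, tick, nxt, missing + syms))
    return None
```

Because captured packets contribute zero weight, a path whose missing symbols are a subset of another path's can never weigh more, matching the conservative-estimate property described above.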
33. Technical Description - Inferring Missing Information
- Processing the trace
- Take FSM step (cont.)
- Inferring missing packets (cont.)
- Some packet types, such as ACK, RTS, and CTS, have a fixed size. For others, such as DATA packets, the size can be inferred if a retransmission of the packet is observed.
- The transmission time of a missing packet can be inferred if there exists a captured packet relative to which it has a fixed spacing.
- The transmission rate of certain packet types, such as PROBE REQUEST, is usually fixed for a client. For other types, such as ACK, it depends on the rate of the previous, incoming packet. However, the rate of missing DATA packets cannot be inferred unless the rate adaptation behavior of the sender is known.
34. Technical Description - Inferring Missing Information
- Limitations
- What can be known with certainty is inherently limited.
- For any specific partial conversation, we cannot be certain which complete conversation actually transpired.
- We cannot always infer all the properties of a missing packet.
35. Technical Description - Inferring Missing Information
- Limitations (cont.)
- Methods to reduce uncertainty in the future:
- Better leverage timestamps
- Transmission times of inferred packets may be estimated by observing the idle times in the trace and mimicking the 802.11 channel acquisition policy.
- The current focus is on the simpler techniques that bring the most gain.
36. Technical Description - Deriving Measures
- The enhanced trace generated by merging and inference can be mined in many ways to understand detailed MAC-level behavior.
- A station is considered to be contending for the medium from the time the MAC layer gets a packet to send until the time of successful transmission.
- For packets that require MAC-level responses, transmission is considered successful when the response is successfully received.
- Missing packets may cause larger inaccuracies, but their impact is limited by the use of multiple monitors and inference to obtain a reasonably complete trace.
37. Technical Description - Deriving Measures
- Difficulty
- In computing the number of contenders, judging whether a station is contending requires access to state, such as randomly selected back-off values and carrier-sense decisions, that is not present in the trace.
- How to overcome this?
- By making a simple observation: much of the relevant state can be approximated by viewing a station's transmissions through the lens of the medium access rules that it implements.
- E.g., for 802.11, if we see DATA and DATA-retry packets from a station, we know that it was contending for the medium in the time between the two transmissions, and for at least some time before the first one.
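A toy version of this idea: recover per-station contention spans from DATA/DATA-retry timestamps, then count overlapping spans. The lead_in constant and the data layout are invented for illustration; they stand in for Wit's fuller modeling of the 802.11 access rules.

```python
def contention_intervals(pkts, lead_in=300):
    """Approximate one station's contention spans from its DATA packets.

    pkts is a time-sorted list of (ts, is_retry). Each retry chain, from
    the first attempt to the last, is one contention span; we also assume
    the station was contending for lead_in microseconds before its first
    attempt (an invented constant standing in for the back-off time).
    """
    spans, start, prev_ts = [], None, None
    for ts, is_retry in pkts:
        if not is_retry:                 # a new chain begins here
            if start is not None:
                spans.append((start, prev_ts))
            start = ts - lead_in
        prev_ts = ts
    if start is not None:
        spans.append((start, prev_ts))
    return spans

def contenders_at(intervals, t):
    """Count stations whose contention spans cover time t."""
    return sum(1 for spans in intervals.values()
               for s, e in spans if s <= t <= e)
```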
38. Technical Description - Deriving Measures
- Procedure for estimating the number of contenders
39. Implementation of Wit
- Wit is implemented as three components:
- halfWit (1,200 lines of Perl code)
- nitWit (3,200 lines of Perl code)
- dimWit (1,200 lines of Perl code)
- Wit inserts the traces collected by individual monitors into a database, and then uses the database to pass results between stages.
40. halfWit - The Merging Component
- halfWit uses a merge-sort-like methodology.
- It treats the two input traces as time-sorted queues, and at each step outputs the queue head with the lower timestamp (after translation).
- When the precision of timestamp translation is better than half the minimum time to transmit a packet, duplicate packets will appear together as queue heads.
- Waterfall merging is used.
- The precision depends on the order in which the monitors are added, because the order determines the density of common references at each waterfall step.
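The queue-based merge can be sketched with a standard k-way merge. Here is_duplicate stands in for a pairwise duplicate test of the kind described in the merging step, and the packet layout is illustrative.

```python
import heapq

def merge_two_traces(trace_a, trace_b, is_duplicate):
    """Merge-sort-style pairwise merge, in the spirit of halfWit.

    Both inputs are lists of packets sorted by their (already translated)
    timestamp 'ts'. The earlier queue head is output at each step; if the
    translation is precise enough, the two copies of a duplicate come out
    back to back, so comparing each packet with the previous output is
    enough to drop duplicates.
    """
    merged = []
    for pkt in heapq.merge(trace_a, trace_b, key=lambda p: p['ts']):
        if merged and is_duplicate(merged[-1], pkt):
            continue  # second monitor's copy of the same transmission
        merged.append(pkt)
    return merged
```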
41. nitWit - The Inference Component
- nitWit uses a customized regular expression grammar to simplify specification of the FSM.
- Example regular expression:
- [DATA ACK, , -] [DATAretry ACK, , -]
- The fundamental units, enclosed in square brackets, consist of three fields: a sequence of symbols, an indication of the next step if all the packets represented by the symbols are received, and an indication of the next step if any is dropped.
- The regular expression for the entire 802.11 protocol is 660 characters long. It produces an FSM with 339 states and 1,061 edges. Augmentation adds 15,193 edges.
- Optimizations on the FSM:
- Pruning some edges
- Recording only the least-weight path when multiple paths lead to the same state after a transition.
42. dimWit - The Derived Measures Component
- What dimWit does:
- Computes the number of contenders in the network.
- Inserts summary information into a number of auxiliary database tables, so that per-contention-level measures can be computed without reading a number of records proportional to the number of packets.
43. Evaluation
- What is the quality of time synchronization with merging?
- How accurate are inferences of packet reception status?
- What fraction of missing packets is inferred?
- How accurate is the estimate of the number of contenders?
- How should we decide between adding monitors and using inference?
44. Evaluation - Method
- Theoretical way
- Compare against ground truth obtained from the monitored network
- Problems:
- Obtaining such authoritative data is quite difficult
- The instrumentation necessary to obtain such data is problematic
- No commodity hardware reports information on many low-level events required for validation, such as the timing of different retries of a packet
- Method adopted for validation:
- Simulation
45. Evaluation - Simulation Environment
- 2 access points
- 40 clients, randomly distributed
- QualNet simulator simulating an 802.11b-like PHY layer
- Packet reception probability depends on signal strength, transmission rate, other packets in flight, and random bit errors
- 3 grid sizes: 100 x 100, 600 x 600, 900 x 900
- 10 monitors for logging
46. Evaluation - Merging
- Dimensions for evaluating merging
- Correctness
- Run halfWit to merge the logs of the monitors and compare against the authoritative log
- Quality of time synchronization
- Use live network traces as well as simulator traces
- Measure the timestamp uncertainty of packets identified as duplicates during the merge
- Results:
- Correctness: all duplicates, and only duplicates, are removed
- Quality of time synchronization: 99.9% of the duplicate packets' timestamp differences are within 2 µsec, as shown in the following graphs
47. Evaluation - Merging (cont.)
Real traces
Simulator traces
48. Evaluation - Inference
- Dimensions for evaluating inference:
- the ability to infer packet reception status, and
- the ability to infer missing packets
- Results:
- Inference of packet reception status
- Accuracy is 95% when only 1/2 of the total packets are captured
- Accuracy is 90% when only 1/3 of the total packets are captured
- Overall, nitWit does well even when only a small fraction of packets can be captured
49. Evaluation - Inference (cont.)
- The left side shows how accurately nitWit infers whether packets were received
- Correctness and capture percentages are computed using the authoritative simulator log
- The right side shows how accurately nitWit infers receptions for a client's packets as a function of the capture estimate
- The capture estimate is the ratio of the number of packets captured for the client to the sum of the packets captured and inferred for the client
50. Evaluation - Inference (cont.)
- Results:
- Inference of missing packets
- nitWit adds roughly 10-20% more packets when the capture percentage is low
- nitWit can infer much of the remainder when the capture percentage is high
51. Evaluation - Estimating Contenders
- Run dimWit on the merged traces of all ten simulated monitors and compare the estimate against the authoritative log
- Results:
- The graph shows the CDF of the error in estimating the number of contenders
- Error = estimated minus actual number of contenders
- Accuracy decreases with increasing grid size, since fewer packets can be captured
- In the 900x900 grid:
- 90% of packets captured
- dimWit is within 1 contender 87% of the time
- dimWit is within 2 contenders 92% of the time
- In the 100x100 grid:
- 98% of packets captured
- dimWit is within 1 contender 95% of the time
52. Evaluation - Inference Versus Additional Monitors
- Use a simple model to study the ability of inference to deal with incomplete data
- Generate artificial traces with the monitor capture probability and the node reception probability held constant
- Clients repeatedly attempt to send data
- Each DATA packet, both original and retried, and each ACK is independently logged with the capture probability
- Packets are dropped according to the specified reception probability
- Vary the capture probability from 0.7 to 0.95; in practice it is increased by adding more monitors
- Results:
53. Evaluation - Inference Versus Additional Monitors (cont.)
- The figures show the results of running nitWit
over the traces
54. Evaluation - Inference Versus Additional Monitors (cont.)
- Merging and inference are complementary
- Add more monitors at low capture probabilities
- There are diminishing returns as monitors become denser and the capture probability is already high
55. Applying Wit to a Live Network
- Monitored wireless environment: SIGCOMM 2004
- 5 PCs (network monitoring and logging)
- 5 APs
56. Applying Wit to a Live Network - Merging with halfWit
- Monitors were merged in the order of their numbers
- The graph shows the cumulative number of packets as additional monitors are merged
- Solid curves: duplicates not removed
- Dashed curves: duplicates removed
- Each additional monitor increases the number of unique packets, even when merging monitors 1 and 2, which sit next to each other
- Merging enhances the view of wireless activity
57. Applying Wit to a Live Network - Inference with nitWit
- Inference of packet reception status
- nitWit inferred the reception status of 80% of the unicast packets
- Inference of missing packets
- nitWit can infer the size and transmission time of the missing packets.
58. Applying Wit to a Live Network - Analysis with dimWit
- The following are five 802.11 operational insights obtained by analyzing Channel 1 of the SIGCOMM 2004 wireless environment with dimWit
- Uplink was more reliable than downlink
- The graph compares the reception probability for uplink (to the AP) versus downlink transmissions
- Commercial APs have better, possibly multiple, antennae to improve their decoding ability
59. Applying Wit to a Live Network - Analysis with dimWit (cont.)
- Offered load was mostly low
- Histograms of each contention level for an hour-long busy interval
- Left graph: time spent at each contention level
- Fewer than 5 contenders most of the time
- Right graph: packets sent at each contention level
- The network was exercised mostly at low contention levels
- It would be impossible to derive this result without dimWit, as high contention can also lead to low utilization
60. Applying Wit to a Live Network - Analysis with dimWit (cont.)
- The medium was inefficiently utilized
- The graph shows airtime utilization as a function of the number of contenders
- The medium is poorly utilized in the common case of few contenders
- For reference, the theoretical utilization of a single node is roughly 75%, rather than the 30% obtained here
- This appears to be due to nodes waiting unnecessarily in the backoff phase before they transmit
61. Applying Wit to a Live Network - Analysis with dimWit (cont.)
- Reception probability did not decrease with contention
- The graph shows the packet reception probability as a function of the contention level
- It was expected to decline with the number of contenders, due to increased collision losses
- It was a surprise to find that the reception probability remains steady
- Radio losses were the dominant cause of packet drops
62. Applying Wit to a Live Network - Analysis with dimWit (cont.)
- Performance was stable at high contention levels
- The graph shows the rates of packets transmitted and received in the network (network throughput)
- The throughput initially increases, then stabilizes at five or more contenders
- For reference, the throughput with a single node sending 500-byte data packets is roughly 1,200 packets per second
63. Applying Wit to a Live Network
- Findings
- The 802.11 MAC is tuned for higher contention levels than the live network measured here
- Most losses appear to be radio losses
- The MAC appears overly biased towards avoiding collisions, using larger-than-necessary backoff intervals
- This leads to inefficient usage of the medium in the common case of low offered load
64. Conclusion
- Wit provides a passive monitoring tool that:
- Merges traces
- Infers missing information
- Derives performance measures
- The 802.11 MAC is tuned for the uncommon case of high contention levels
65. Comments
- Overall, we believe that the tool should be able to help network admins analyze and measure their wireless networks, as well as understand real problems and improve their systems
- Using an FSM to infer missing packets is a good idea that helps build a complete view of network activity
- However, if there are several APs with overlapping coverage, it might not be meaningful to merge the traces from monitors near them