Title: Trace-driven Context-aware Mobile Networks: Towards Mobile Social Networks
1Trace-driven Context-aware Mobile Networks: Towards Mobile Social Networks
- Ahmed Helmy
- Computer and Information Science and Engineering
(CISE) Dept - Mobile Networking Lab (NOMADS group)
- University of Florida
- helmy_at_ufl.edu
- http://nile.cise.ufl.edu/MobiLib
2Birds-Eye View Mobile Networking Lab
3Introduction Problem Scope
- Future network devices are mobile & personal
- Very tight coupling between devices & humans
- Network performance is significantly affected by (and affects) user behavior
- Movement, grouping, on-line activity, trust, cooperation
- How do users behave in mobile societies?
- What kinds of protocols/networks survive & perform well in highly mobile societies?
4Paradigm Shift in Protocol Design
Used to
- May end up with suboptimal performance or
failures due to lack of context in the design
Propose to
5Problem Statement
- How to gain insight into deployment context?
- How to utilize insight to design future services?
- Approach
- Extensive trace-based analysis to identify dominant trends & characteristics
- Analyze user behavioral patterns
- Individual user behavior and mobility
- Collective user behavior: grouping, encounters
- Integrate findings in modeling and protocol design
- I. User mobility modeling
- II. Behavioral grouping
- III. Information dissemination in mobile societies, profile-cast
6The TRACE framework
MobiLib
Employ (Modeling Protocol Design)
7Vision Community-wide Wireless/Mobility Library
- Library of
- Measurements from universities, vehicular networks
- Realistic models of behavior (mobility, traffic, friendship, encounters)
- Benchmarks for simulation and evaluation
- Tools for trace data mining
- Use insights to design future context-aware protocols?
- http://nile.cise.ufl.edu/MobiLib
8Libraries of Wireless Traces
- Multi-campus (community-wide) traces
- MobiLib (USC (04-06), now at UFL)
- nile.cise.ufl.edu/MobiLib
- 15 traces from USC, Dartmouth, MIT, UCSD, UCSB, UNC, UMass, GATech, Cambridge, UFL, ...
- Tools for mobility modeling (IMPORTANT, TVC), data mining
- CRAWDAD (Dartmouth)
- Types of traces
- University Campus (mainly WLANs)
- Conference AP and encounter traces
- Municipal (off-campus) wireless
- Bus vehicular wireless networks
- Others (on going)
9Wireless Networks and Mobility Measurements
- In our case studies we use WLAN traces
- From university campuses & corporate networks (4 universities, 1 corporate network)
- The largest data sets about wireless network users available to date (# users / lengths)
- No bias: not special-purpose, data from all users in the network
- We also analyze
- Vehicular movement trace (Cab-spotting)
- Human encounter trace (at Infocom Conf)
10Case study I Individual mobility
11Case Study I Goal
- To understand the mobility/usage pattern of individual wireless network users
- To observe how environments/user type/trace-collection techniques impact the observations
- To propose a realistic mobility model based on empirical observations
- That is mathematically tractable
- That is capable of characterizing multiple classes of mobility scenarios
12IMPACT: Investigation of Mobile-user Patterns Across University Campuses using WLAN Trace Analysis
- 4 major campuses; 30-day traces studied from 2 years of traces
- Total users > 12,000
- Total access points > 1,300
- Understand changes of user association behavior w.r.t.
- Time
- Environment
- Device type
- Trace collection method
W. Hsu, A. Helmy, "IMPACT: Investigation of Mobile-user Patterns Across University Campuses using WLAN Trace Analysis", two papers at IEEE Wireless Networks Measurements (WiNMee), April 2006
13Metrics for Individual Mobility Analysis
- What kind of spatial preference do users exhibit?
- The percentile of time spent at the most frequently visited locations
- What kind of temporal repetition do users exhibit?
- The probability of re-appearance
- How often are the nodes present?
- Percentage of online time
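The three metrics above can be sketched in a few lines. This is only an illustration under assumptions not stated on the slide: sessions are given as hypothetical `(ap_id, duration)` pairs, re-appearance is checked per day and per AP, and all function names are made up for this example.

```python
from collections import defaultdict


def top_k_time_fraction(sessions, k):
    """Fraction of online time spent at the k most visited APs.

    sessions: list of (ap_id, duration) pairs for one user (assumed format).
    """
    per_ap = defaultdict(float)
    for ap, duration in sessions:
        per_ap[ap] += duration
    total = sum(per_ap.values())
    if total == 0:
        return 0.0
    top = sorted(per_ap.values(), reverse=True)[:k]
    return sum(top) / total


def online_fraction(sessions, trace_length):
    """Share of the trace period during which the user was associated."""
    return sum(duration for _, duration in sessions) / trace_length


def reappearance_probability(daily_ap_sets, gap_days):
    """P(an AP visited on day d is visited again on day d + gap_days).

    daily_ap_sets: one set of visited AP ids per day (assumed format).
    """
    hits = trials = 0
    for d in range(len(daily_ap_sets) - gap_days):
        for ap in daily_ap_sets[d]:
            trials += 1
            hits += ap in daily_ap_sets[d + gap_days]
    return hits / trials if trials else 0.0


sessions = [("office", 3000), ("library", 600), ("cafe", 400)]
print(top_k_time_fraction(sessions, 2))  # 0.9
print(reappearance_probability([{"office"}, {"office", "library"}, {"office"}], 1))
```

Sweeping `gap_days` from 1 to 14 would show whether a user's pattern peaks at one-day and one-week gaps.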
14Observations Visited Access Points (APs)
[Figure: CCDF of coverage of users (percentage of visited APs), Prob.(coverage > x); average fraction of online time a MN associates with each AP]
- Individual users access only a very small portion of APs in the network.
- On average a user spends more than 95% of time at its top 5 most visited APs.
- Long-term mobility is highly skewed in terms of time associated with each AP.
- Users exhibit on/off behavior that needs to be modeled.
15Repetitive Behavior
- Clear repetitive patterns of association in wireless network users.
- Typically, user association patterns show the strongest repetitive pattern at a time gap of one day/one week.
16Mobility Characteristics from WLANs
- Simple existing models are very different from the characteristics in WLAN
17Mobility Models
- Mobility models are of crucial importance for the evaluation of wireless mobile networks [IMP03]
- Requirements for mobility models
- Realism (detailed behavior from traces)
- Parameterized, tunable behavior
- Mathematical tractability
- Related work on mobility modeling
- Random models (random walk/waypoint): inadequate for human mobility
- Improved synthetic models (pathway model, RPGM, WWP, FWY, MH): more realistic, difficult to analyze
- Trace-based model (T/T): trace-specific, not general
18Time-variant Community (TVC) Model
W. Hsu, Thyro, K. Psounis, A. Helmy, "Modeling Time-variant User Mobility in Wireless Mobile Networks", IEEE INFOCOM, 2007, Trans. on Networking
- Skewed location visiting preference
- Create "communities" to be the preferred area of movement
- Each node can have its own community
- Node moves with two different epoch types: local or roaming
- Each epoch is a random-direction, straight-line movement
- Local epochs in the community
- Roaming epochs around the whole simulation area
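The epoch mechanism can be illustrated with a toy single-community simulation. This is a simplified sketch, not the published TVC model: the square community, the probability `p_local`, the exponential epoch length, clamping at borders, and the direct return into the community after roaming are all assumptions made for this example.

```python
import math
import random


def clamp(value, low, high):
    return max(low, min(high, value))


def tvc_like_trajectory(n_epochs, area=1000.0, community=(400.0, 400.0, 100.0),
                        p_local=0.8, mean_len=50.0, seed=0):
    """Return a list of (epoch_type, x, y) epoch end points for one node.

    community = (x0, y0, side) is a square inside the square simulation area.
    All parameter values are illustrative, not fitted to any trace.
    """
    rng = random.Random(seed)
    cx, cy, side = community
    x, y = cx + side / 2, cy + side / 2  # start at the community centre
    trace = []
    for _ in range(n_epochs):
        kind = "local" if rng.random() < p_local else "roaming"
        if kind == "local":
            x0, x1, y0, y1 = cx, cx + side, cy, cy + side
        else:
            x0, x1, y0, y1 = 0.0, area, 0.0, area
        if not (x0 <= x <= x1 and y0 <= y <= y1):
            # Node roamed away: move straight back to a random community point.
            x, y = rng.uniform(x0, x1), rng.uniform(y0, y1)
        else:
            # Random-direction, straight-line epoch, clamped at the border.
            angle = rng.uniform(0.0, 2.0 * math.pi)
            length = rng.expovariate(1.0 / mean_len)
            x = clamp(x + length * math.cos(angle), x0, x1)
            y = clamp(y + length * math.sin(angle), y0, y1)
        trace.append((kind, x, y))
    return trace


trace = tvc_like_trajectory(2000)
local_share = sum(kind == "local" for kind, _, _ in trace) / len(trace)
print(f"share of local epochs: {local_share:.2f}")
```

Because most epochs end inside the community, the node's location-visiting distribution becomes skewed toward it, which is the property the model is meant to reproduce.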
19Tiered Time-variant Community (TVC) Model
- Periodical re-appearance
- Create structure in time: periods
- Node moves with different parameters in each period to capture time-dependent mobility
- Repetitive structure
- Finer granularity in space & time
- Multi-tier communities
- Multiple time periods
20Using the TVC Model Reproducing Mobility
Characteristics
- (STEP1) Identify the popular locations; assign communities
- (STEP2) Assign parameters to the communities according to stats
- (STEP3) Add user on-off patterns (e.g., in WLAN, users are usually off when moving)
21Using the TVC Model Reproducing Mobility
Characteristics
- WLAN trace (example: MIT trace)
[Figure panels: skewed location visiting preference; periodic re-appearance]
- Model-simplified: single community per node
- Model-complex: multiple communities
- Similar matches achieved for USC and Dartmouth traces
22Using the TVC Model Reproducing Mobility
Characteristics
- Vehicular trace (Cab-spotting)
23Using the TVC Model Reproducing Mobility
Characteristics
- Human encounter trace at a conference
[Figure panels: inter-meeting time; encounter duration. Timeline of "A encounters B" marking encounter duration and inter-meeting time]
24Case study II Groups in WLAN
25Case Study II Goal
- Identify similar users (in terms of long-run mobility preferences) from the diverse WLAN user population
- Understand the constituents of the population
- Identify potential groups for group-aware service
- In this case study we classify users based on their mobility trends (or location-visiting preferences)
- We consider a semester-long USC trace (spring 2006, 94 days) and a quarter-long Dartmouth trace (spring 2004, 61 days)
26Representation of User Association Patterns
- We choose to represent the summary of user association in each day by a single vector
- $a = (a_1, \dots, a_n)$, where $a_j$ is the fraction of online time user i spends at $AP_j$ on day d
- Summarize the long-run mobility in an "association matrix"
- Office, 10AM-12PM
- Library, 3PM-4PM
- Class, 6PM-8PM
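As a small worked example, the daily vector for the Office/Library/Class schedule can be computed as follows. The `(ap_id, hours)` session format and the function names are assumptions for illustration; an off-line day maps to the zero vector.

```python
def association_vector(sessions, ap_ids):
    """Daily association vector: fraction of online time at each AP.

    sessions: list of (ap_id, duration) for one user on one day (assumed format).
    ap_ids: fixed ordering of all APs, so every vector has the same length.
    """
    totals = {ap: 0.0 for ap in ap_ids}
    for ap, duration in sessions:
        if ap in totals:  # ignore APs outside the chosen ordering
            totals[ap] += duration
    online = sum(totals.values())
    if online == 0:
        return [0.0] * len(ap_ids)  # zero vector for an off-line day
    return [totals[ap] / online for ap in ap_ids]


def association_matrix(daily_sessions, ap_ids):
    """One row per day; rows are daily association vectors."""
    return [association_vector(day, ap_ids) for day in daily_sessions]


aps = ["office", "library", "class"]
day = [("office", 2.0), ("library", 1.0), ("class", 2.0)]  # hours on-line
print(association_vector(day, aps))
print(association_matrix([day, []], aps))
```

Here the user is on-line for 5 hours, so the vector is (2/5, 1/5, 2/5).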
27Eigen-behavior
- Eigen-behaviors: the vectors that describe the maximum remaining power in the association matrix (obtained through Singular Value Decomposition), with quantifiable importance
- Eigen-behavior distance: calculates similarity of users by weighted inner products of eigen-behaviors
- Assoc. patterns can be re-constructed with low rank & low error
- Benefits: reduced computation and noise
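A minimal sketch of extracting eigen-behaviors with NumPy's SVD. It assumes that "power" means the share of squared singular values and that rows of the association matrix are days; both are assumptions of this example, as is the function name.

```python
import numpy as np


def eigen_behaviors(assoc_matrix, power_target=0.9):
    """Return (vectors, weights) for the leading eigen-behaviors.

    assoc_matrix: days x APs array of daily association vectors.
    vectors: top-k right singular vectors (one per row).
    weights: share of total power (squared singular values) each captures.
    k is the smallest rank whose cumulative power reaches power_target.
    """
    X = np.asarray(assoc_matrix, dtype=float)
    _, s, vt = np.linalg.svd(X, full_matrices=False)
    power = s ** 2
    if power.sum() == 0:
        return vt[:0], power[:0]  # user never on-line
    weights = power / power.sum()
    k = int(np.searchsorted(np.cumsum(weights), power_target)) + 1
    k = min(k, len(s))
    return vt[:k], weights[:k]


# A perfectly regular user: the same daily vector on five days.
X = [[0.4, 0.2, 0.4]] * 5
vectors, weights = eigen_behaviors(X)
print(len(vectors), weights)
```

For this regular user a single eigen-behavior, parallel to the daily vector, captures essentially all of the power.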
28Similarity-based User Classification
- With the distance between users U and V defined as 1 - Sim(U,V), we use hierarchical clustering to find similar user groups.
[Figure panels: USC; Dartmouth. AMVD = Average Minimum Vector Distance]
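To illustrate the clustering step, here is a small agglomerative clustering routine over a pairwise distance matrix. The slide only says "hierarchical clustering"; the average-linkage rule and the distance threshold used to stop merging are assumptions of this sketch.

```python
def hierarchical_clusters(dist, threshold):
    """Average-linkage agglomerative clustering.

    dist: symmetric matrix (list of lists) of pairwise user distances,
          e.g. 1 - Sim(U, V).
    threshold: stop merging once the closest pair of clusters is farther
               apart than this value (illustrative stopping rule).
    Returns a list of clusters, each a list of user indices.
    """
    clusters = [[i] for i in range(len(dist))]

    def linkage(a, b):
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        d, ia, ib = min(
            (linkage(a, b), ia, ib)
            for ia, a in enumerate(clusters)
            for ib, b in enumerate(clusters)
            if ia < ib
        )
        if d > threshold:
            break
        clusters[ia] = clusters[ia] + clusters[ib]
        del clusters[ib]
    return clusters


dist = [
    [0.00, 0.05, 0.60, 0.60],
    [0.05, 0.00, 0.60, 0.60],
    [0.60, 0.60, 0.00, 0.10],
    [0.60, 0.60, 0.10, 0.00],
]
print(hierarchical_clusters(dist, threshold=0.3))  # [[0, 1], [2, 3]]
```

This brute-force version recomputes every linkage on each merge, so it is only suitable for small examples.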
29Validation of User Groups
- Significance of the groups: users in the same group are indeed much more similar to each other than randomly formed groups (0.93 vs. 0.46 for USC, 0.91 vs. 0.42 for Dartmouth)
- Uniqueness of the groups: the most important group eigen-behavior is important for its own group but not for other groups
30User Groups in WLAN - Observations
- Identified hundreds of distinct groups of similar users
- Skewed group size distribution: the largest 10 groups account for more than 30% of the population on campus. Power-law distributed group sizes.
- Most groups can be described by a list of locations with a clear ordering of importance
- We also observe groups visiting multiple locations with similar importance: taking the most important location for each user is not sufficient
31Case study III Encounter Patterns
32Case Study III Goal
- Understand inter-node encounter patterns from a global perspective
- How do we represent encounter patterns?
- How do the encounter patterns influence network connectivity and communication protocols?
- Encounter definition
- In WLAN: when two mobile nodes access the same AP at the same time they have an encounter
- In DTN: when two mobile nodes move within communication range they have an encounter
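The WLAN encounter definition translates directly into an interval-overlap check per AP. The `(node, ap, start, end)` session tuples and the function names below are assumptions for illustration.

```python
from itertools import combinations


def wlan_encounters(sessions):
    """Derive encounters from overlapping associations at the same AP.

    sessions: list of (node, ap, start, end) tuples (assumed format).
    Returns (node_a, node_b, ap, start, end) tuples sorted by start time.
    """
    by_ap = {}
    for node, ap, start, end in sessions:
        by_ap.setdefault(ap, []).append((node, start, end))
    encounters = []
    for ap, items in by_ap.items():
        for (n1, s1, e1), (n2, s2, e2) in combinations(items, 2):
            if n1 == n2:
                continue
            start, end = max(s1, s2), min(e1, e2)
            if start < end:  # the two association intervals overlap
                encounters.append((min(n1, n2), max(n1, n2), ap, start, end))
    return sorted(encounters, key=lambda e: e[3])


def unique_encounter_fraction(encounters, node, population):
    """Fraction of the other users that `node` ever encountered."""
    peers = {b if a == node else a for a, b, *_ in encounters if node in (a, b)}
    return len(peers) / (population - 1)


sessions = [
    ("A", "ap1", 0, 100),
    ("B", "ap1", 50, 150),
    ("C", "ap2", 0, 100),
    ("C", "ap1", 120, 130),
]
enc = wlan_encounters(sessions)
print(enc)
print(unique_encounter_fraction(enc, "A", population=3))  # 0.5
```

The pairwise comparison per AP is quadratic in the number of sessions at that AP; a sweep over sorted start times would scale better on full traces.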
33Observations Encounters
[Figure: CCDF of unique encounter count, Prob.(unique encounter fraction > x); CCDF of total encounter count, Prob.(total encounter events > x)]
- In all the traces, the MNs encounter a small fraction of the user population.
- A user encounters 1.8%-6% of the user population on average (except UCSD)
- The number of total encounters for the users follows a BiPareto distribution.
34Encounter-Relationship (ER) graph
- Draw a link to connect a pair of nodes if they ever encounter each other, then analyze the graph properties
35Small Worlds of Encounters
- Encounter graph: nodes as vertices, and edges link all vertices that encounter
[Figure: normalized clustering coefficient (CC) and average path length (PL), between the regular-graph and random-graph extremes]
- The encounter graph is a Small World graph (high CC, low PL)
- Even for a short time period (1 day), its metrics (CC, PL) almost saturate
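The two Small World metrics can be computed on an encounter graph stored as an adjacency dictionary. This sketch assumes an undirected graph, counts nodes with fewer than two neighbors as CC = 0, and averages path length over reachable pairs only; these conventions are choices of the example, not taken from the slides.

```python
from collections import deque
from itertools import combinations


def clustering_coefficient(adj):
    """Average local clustering coefficient of an undirected graph.

    adj: dict mapping node -> set of neighbor nodes.
    """
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue  # contributes 0 by convention
        links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def average_path_length(adj):
    """Mean shortest-path length (in hops) over all reachable node pairs."""
    total = pairs = 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:  # breadth-first search from src
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


# Triangle 1-2-3 with a pendant node 4 attached to node 3.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
print(clustering_coefficient(adj), average_path_length(adj))
```

Comparing these values against a regular and a random graph with the same number of nodes and edges gives the normalized CC and PL used to identify Small World behavior.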
36Background Delay Tolerant Networks (DTN)
- DTNs are mobile networks with sparse, intermittent nodal connectivity
- Encounter events provide the communication opportunities among nodes
- Messages are stored and moved across the network with nodal mobility
37Information Diffusion in DTNs via Encounters
- Epidemic routing (spatio-temporal broadcast) achieves almost complete delivery
- Robust to the removal of short encounters
- Robust to selfish nodes (up to 40%)
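Epidemic spread over a time-ordered encounter list can be sketched as below. The `(time, a, b)` encounter format is assumed, and "selfish" is modeled here as a node that accepts a message but never forwards it, which is one possible interpretation rather than the definition used in the study.

```python
def epidemic_spread(encounters, source, start_time=0, selfish=frozenset()):
    """Simulate epidemic routing of one message over encounter events.

    encounters: iterable of (time, node_a, node_b) tuples (assumed format).
    selfish: nodes that receive the message but do not forward it.
    Returns {node: time the node first received the message}.
    """
    have = {source: start_time}
    for t, a, b in sorted(encounters):
        if t < start_time:
            continue
        for carrier, peer in ((a, b), (b, a)):
            can_forward = carrier == source or carrier not in selfish
            if carrier in have and peer not in have and can_forward:
                have[peer] = t
    return have


encounters = [(1, "A", "B"), (2, "C", "D"), (3, "B", "C"), (4, "C", "D")]
print(epidemic_spread(encounters, "A"))
print(epidemic_spread(encounters, "A", selfish={"B"}))
```

In the first run D is reached only at time 4, because the C-D encounter at time 2 happens before C holds the message; in the second run the selfish relay B stops the spread.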
38Encounter-graphs using Friends
- Distribution for friendship index FI is exponential for all the traces
- Friendship between MNs is highly asymmetric
- Among all node pairs: < 5% with FI > 0.01, and < 1% with FI > 0.4
- Top-ranked friends form cliques, and low-ranked friends are key to providing random links (short cuts) that reduce the degree of separation in the encounter graph.
39Profile-cast
W. Hsu, D. Dutta, A. Helmy, Mobicom 2007
- Sending messages to others with similar behavior, without knowing their identity
- Announcements to users with specific behavior
- Interest-based ads, similarity resource discovery
- Assuming DTN-like environment
[Figure: example nodes A, B, C, D, E]
40Profile-cast Use Cases
- Mobility-based profile-cast
- Targeting a group of users who move in a particular pattern (lost-and-found, context-aware messages, moviegoers)
- Approach: use similarity metric between users
- Mobility-independent profile-cast
- Targeting people with certain characteristics independent of mobility (classic music lovers)
- Approach: use Small World encounter patterns
41Mobility-based Profile-cast
Scoped message spread in the mobility space
42Profile-cast Operation
- Profiling user mobility
- The mobility of a node is represented by an association matrix
- Singular value decomposition provides a summary of the matrix (a few eigen-behavior vectors are sufficient, e.g., for 99% of users at most 7 vectors describe 90% of the power in the association matrix)
43Profile-cast Operation
- Determining user similarity
- S sends eigen-behaviors for the virtual profile to N
- N evaluates the similarity by weighted inner products of eigen-behaviors
- Message forwarded if Sim(U,V) is high (the goal is to deliver messages to nodes with similar profile)
- Privacy conserving: N and S do not send information about their own behavior
44Profile-cast Evaluation
- Epidemic: near perfect delivery ratio, low delay, high overhead
- Centralized: near perfect delivery ratio, low overhead, a bit extra delay
- Decentralized: provides tradeoff between delivery & overhead
- Random: poor delivery ratio
[Figure: comparison of Epidemic, Decentralized, and Random schemes]
- Decentralized I-cast achieves > 50% reduction in overhead relative to Epidemic, and > 30% increase in delivery relative to Random
45Evaluation - Result
- Centralized: excellent success rate with only 3% overhead.
- Similarity-based
- (1) 61% success rate at low overhead, 92% success rate at 45% overhead
- (2) A flexible success rate / overhead tradeoff
- RTx with infinite TTL: much more overhead under similar success rate
- Short RTx with many copies: good success rate/overhead, but delay is still long
46Profile-cast Initial Results
- Adjustable overhead/delivery rate tradeoff
- 61% delivery rate of flooding with 3% overhead
- 92% delivery rate with 45% overhead
- Better than single random walk in terms of delay, delivery rate
- Multiple short random walks also work well in this case
47Future Work
- Sending to a mobility profile specified by the sender
- Gradient ascent followed by similarity comparison (in the mobility space)
- Mobility-independent profile-cast
- The encounter pattern provides a network in which most nodes are reachable
- We don't want to flood: how to leverage the Small World encounter pattern to reach the neighborhood of most nodes efficiently?
48Future Work
- One-copy-per-clique in the mobility space
- We expect this to work because similarity in
mobility leads to frequent encounters
49Future Directions (Applications)
- Detect abnormal user behavior & access patterns based on previous profiles
- Behavior-aware push/caching services (targeted ads, events of interest, announcements)
- Caching based on behavioral prediction
- Can/should we extend this paradigm to include social aspects (trust, friendship, ...)?
- Privacy issues and mobile k-anonymity
50On Mobility Predictability of VoIP WLAN Users
J. Kim, Y. Du, M. Chen, A. Helmy, Crawdad 2007 (work in progress)
[Figure panels: Markov O(2) predictor accuracy; VoIP user prediction accuracy]
- VoIP users are highly mobile and behave dramatically differently from WLAN users
- Prediction accuracy drops from an average of 62% for WLAN users to below 25% for VoIP users
- Motivates
- Revisiting mobility modeling
- Revisiting mobility prediction
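For reference, an order-2 Markov location predictor can be sketched as follows. This is a generic online version, not necessarily the exact predictor used in the study: it predicts the most frequent successor of the last two locations and, as an assumption of this example, makes no prediction (and counts no miss) when the context has not been seen before.

```python
from collections import Counter, defaultdict


def markov2_accuracy(history):
    """Online accuracy of an order-2 Markov predictor.

    history: sequence of visited location (AP) ids for one user.
    For each position, predict the most frequent successor of the
    previous two locations seen so far, then update the counts.
    """
    counts = defaultdict(Counter)
    correct = predictions = 0
    for i in range(2, len(history)):
        context = (history[i - 2], history[i - 1])
        if counts[context]:
            guess = counts[context].most_common(1)[0][0]
            predictions += 1
            correct += guess == history[i]
        counts[context][history[i]] += 1
    return correct / predictions if predictions else 0.0


print(markov2_accuracy(list("ABCABCABC")))  # 1.0: a perfectly periodic user
print(markov2_accuracy(list("ABABAC")))     # 0.5
```

A highly mobile user with few repeated two-step contexts yields few correct predictions, which is consistent with the accuracy drop reported for VoIP users.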
51Gender-based feature analysis in Campus-wide WLANs
U. Kumar, N. Yadav, A. Helmy, Mobicom 2007, Crawdad 2007
- Able to classify users by gender using knowledge of campus map
- Users exhibit distinct on-line behavior, preference of device and mobility based on gender
- On-going work
- How much more can we know?
- What is the information-privacy trade-off?
52The Next Generation (Boundless) Classroom
[Figure: students and instructor connected through sensor/ad hoc networks and an embedded sensor network]
Challenges
- Integration of wired Internet, WLANs, ad hoc mobile and sensor networks
- Will this paradigm provide a better learning experience for the students?
53Future Directions Technology-Human
InteractionThe Next Generation Classroom
Protocols, Applications, Services
Human Behavior
Emerging Wireless Multimedia Technologies
Mobility, Load Dynamics
54Multi-Disciplinary Research
Engineering
Social Sciences
Human Computer Interaction (HCI) User Interface
Education
Cognitive Sciences
Psychology
Application Development
Service Provisioning
How to Capture?
Emerging Wireless Multimedia Technologies
Educational/ Learning Experience
Protocol Design
How to Evaluate?
Context-aware Networking
Measurements
Mobility Models
How to Design?
Traffic Models
55Disaster Relief (Self-Configuring) Networks
56On-going and Future Directions Utilizing mobility
- Controlled mobility scenarios
- DakNet, Message Ferries, Info Station
- Mobility-Assisted protocols
- Mobility-assisted information diffusion: EASE, FRESH, DTN, $100 laptop
- Context-aware networking
- Mobility-aware protocols: self-configuring, mobility-adaptive protocols
- Socially-aware protocols: security, trust, friendship, associations, small worlds
- On-going projects
- Next Generation (Boundless) Classroom
- Disaster Relief Self-configuring Survivable Networks
57Thank you!
- Ahmed Helmy helmy_at_ufl.edu
- URL www.cise.ufl.edu/helmy
- MobiLib nile.cise.ufl.edu/MobiLib
58Emerging Wireless Communication
- Opportunities
- Challenges
- Dynamic network structure
- Decentralized service paradigm
- Tight coupling between the devices and individuals
59Outline
60Trace Sets
- Available information from WLAN traces
- MAC addresses of the devices as identifiers
- Location/Time of users (our main focus)
Node: e0_12_29_fc_ba_8c
Association start time   Location_ID          Duration
2197745                  172.16.8.244_11009   4433
2230200                  172.16.8.244_11009   13320
2257917                  172.16.8.244_11009   643
2285119                  172.16.8.244_11009   1017
2297134                  172.16.8.244_11009   7153
2304287                  172.16.8.244_11023   6744
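A short parser for per-node records like the sample above might look like this. It assumes three whitespace-separated fields per record (start time, location id, duration) and that both time fields are integers in the same unit; the actual file layout and units of the traces are not stated on the slide.

```python
from typing import NamedTuple


class Association(NamedTuple):
    start: int      # association start time (unit assumed, e.g. seconds)
    location: str   # location / AP identifier
    duration: int   # association duration, same unit as start


def parse_associations(text):
    """Parse 'start location duration' records, skipping other lines."""
    records = []
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) != 3 or not fields[0].isdigit():
            continue  # header or malformed line
        start, location, duration = fields
        records.append(Association(int(start), location, int(duration)))
    return records


SAMPLE = """
2197745 172.16.8.244_11009 4433
2230200 172.16.8.244_11009 13320
2257917 172.16.8.244_11009 643
2285119 172.16.8.244_11009 1017
2297134 172.16.8.244_11009 7153
2304287 172.16.8.244_11023 6744
"""

records = parse_associations(SAMPLE)
for prev, nxt in zip(records, records[1:]):
    gap = nxt.start - (prev.start + prev.duration)
    print(prev.location, "->", nxt.location, "off-line gap:", gap)
```

The last gap is zero: the association at location 11009 ends exactly when the one at 11023 begins, i.e. a handoff between locations with no off-line period.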
61Summary (Case Study I)
- We observe some omnipresent mobility characteristics from WLANs.
- These characteristics are not captured by existing synthetic mobility models (hence the models are not realistic)
- We propose the Time-variant Community (TVC) model, which is realistic, theoretically tractable, and flexible
62Theoretical Tractability
- For the TVC model, we can derive
- Nodal spatial distribution: the demographic profile of the mobility model
- Average node degree: important for cluster maintenance and geographic routing
- Hitting time / meeting time: important for routing performance analysis
- With low error when the communication range is small compared to the community sizes (communication disk < 25% of community)
63Theoretical Tractability
64Theory Derivation Hitting Time
- Hitting time: the time for a node to move into the communication range of a randomly chosen target coordinate, starting from the stationary distribution
[Figure: node reaching the target (hit)]
65Theory Derivation Hitting Time
- Weighted average conditioned on the relative location of the target
- Calculate the unit-time hitting probability for each scenario
- Calculate hitting probability for the whole time period
- Calculate the conditional hitting time
66Application II Trace-based Mobility Modeling
- Skewed location preference
- Nodes spend 95% of time at top 5 preferred locations.
- Heavily visited preferred spots
- Repetitive behavior
- Nodes show up repeatedly at the same location after integer multiples of days.
- Periodical daily/weekly schedules
67Similarity-based User Classification
Association-based Representation
- For a given day d, the user assoc. vector is defined as an n-element vector
- $a = (a_1, \dots, a_n)$, where $a_j$ is the fraction of online time user i spends at $AP_j$ on day d
- n is the number of APs
- Use zero vector for off-line users
- Vector elements quantify relative attraction of AP to user
- User association consistency
- User i is consistent if daily assoc. vectors can be grouped into few clusters
- Use clustering with Manhattan distance measure
- Example: association vector (AP1, AP2, AP3) = (0.2, 0.4, 0.4)
W. Hsu, D. Dutta, A. Helmy, Mobicom 2007
68 Summarizing user associations
- Association matrix: concatenate user association vectors for all days into a matrix.
- To summarize, perform SVD and store the top-k eigen values/vectors.
- What value of k do we have to use for a good representation of the matrix?
- Captured matrix power
- How much is the reconstruction error?
- Matrix norms: $\|X - X_k\|_p / \|X\|_p$, where $X_k$ is the rank-k reconstruction of $X$
69 Summarizing user associations
Assoc. patterns can be re-constructed with low rank and low error
Matrix reconstruction error < 5%
70Clustering Users with Similar Behavior
- Exhaustive comparison of assoc. vectors
- Find the average of $|a_j^d - a_i^d|$ over all days d for all i,j pairs
- Drawback: $O(nd^2)$ for each pair
- Compare similarity of eigen-vectors obtained from SVD
- Use weighted inner products of eigen-vectors U, V
- $w_{u_i}$: proportion of power of the singular value (SV)
- $D(U,V) = 1 - \mathrm{Sim}(U,V)$
- Correlation > 91% with the exhaustive comparison
Can achieve very good clustering efficiently using distributed computation
A handful of eigen-vectors can capture most of the behavior power
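The similarity formula itself is not legible in the slide text, so the sketch below shows only one plausible reading of "weighted inner products of eigen-vectors". It assumes Sim(U,V) is the sum over all pairs (i, j) of w_ui * w_vj * |u_i . v_j|, with unit-length eigen-vectors and power weights that sum to at most 1 per user.

```python
import numpy as np


def similarity(U, wu, V, wv):
    """Assumed form: Sim(U,V) = sum_i sum_j wu[i] * wv[j] * |u_i . v_j|.

    U, V: arrays whose rows are unit-length eigen-behavior vectors.
    wu, wv: share of power captured by each corresponding vector.
    """
    U = np.asarray(U, dtype=float)
    V = np.asarray(V, dtype=float)
    inner = np.abs(U @ V.T)  # |u_i . v_j| for every pair of vectors
    return float(np.asarray(wu, dtype=float) @ inner @ np.asarray(wv, dtype=float))


def distance(U, wu, V, wv):
    """D(U,V) = 1 - Sim(U,V)."""
    return 1.0 - similarity(U, wu, V, wv)


same = [[0.6, 0.8]]
other = [[0.8, -0.6]]  # orthogonal to `same`
print(distance(same, [1.0], same, [1.0]))
print(distance(same, [1.0], other, [1.0]))
```

Under this reading, two users with identical dominant eigen-behaviors are at a distance close to 0 and users with orthogonal behaviors are at distance 1. Only a handful of vectors per user are compared, instead of all pairs of daily vectors.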
71Encounter Events
- Derived from simultaneous associations to the same locations
- How many other nodes does a node encounter?
[Figure: Prob.(unique encounter fraction > x); annotation: on avg. only 27% of population]
72Encounter-Relationship graph
- To our surprise, the ratio of disconnected pairs of nodes is low!
[Figure: disconnected ratio (%)]
73Summary (Case Study II)
- We use SVD to obtain eigen-behaviors of individual users.
- We use the eigen-behavior distances and hierarchical clustering to classify WLAN users into similar groups.
- This finding is useful for mobility modeling (identifying group sizes and their frequently visited locations), network management, abnormality detection, and group-aware protocols (e.g., profile-cast, our future work)
74Summary (Case Study III)
- The distribution of encounters in real WLAN traces is very different from synthetic models
- The encounter-relationship graph displays Small World characteristics
- Despite a low encounter ratio of the whole population, the encounter events lead to a robust, reachable network (with long delay).
75Future Work Profile-cast
76Goal
- To send messages to a group of nodes within the general population
- The group is defined by the intrinsic behavior patterns of the nodes (CISE students, library visitors, moviegoers)
- The sender does not know the network identities (addresses) of the destinations
- Different from multi-cast: no join/leave, no group maintenance
77Largest number of female users is in social
sciences and is much higher than the male WLAN
users in those buildings. Female users are
surprisingly high (vs males) in the first 2
samples. WLAN activity was down Feb 07 due to
lower enrollment in Spring and potential changes
in the network.
78Females in social, economic, admin and
comm/journalism generally have longer session
durations than males in those majors. In
Engineering, music and chemistry the opposite is
true. Session durations are decreasing indicating
potential increase in mobility.
79Apple is consistently more popular among females than males. Intel (PCs) is more popular among males than females. Increase in use of Apple and Intel in general, and decline in other brands.
80Mobility Profile-cast (intra-group)
Goal
81Mobility Profile-cast (inter-group)
Goal
Flooding
Flooding_sim
82Performance Comparison
Gradient ascent helps to overcome the difficult case when the source is far from T.P.
Few long RW is better when S is far from T.P., but many short RW is better when S is close to T.P.
83Performance Comparison
Few long RW is better when S is far from T.P., but many short RW is better when S is close to T.P.
Gradient ascent helps to overcome the difficult case when the source is far from T.P.
Gradient ascent has some extra delay compared with flooding
84Mobility Independent Profile-cast
Goal
Flooding
SmallWorld-based
Single long random walk
Multiple short random walks