Title: Node Clustering in Wireless Sensor Networks by Considering Structural Characteristics of the Network Graph
1. Node Clustering in Wireless Sensor Networks by Considering Structural Characteristics of the Network Graph
- Nikos Dimokas¹
- Dimitrios Katsaros¹,²
- Yannis Manolopoulos¹
¹ Informatics Dept., Aristotle University, Thessaloniki, Greece
² Computer & Comm. Engineering Dept., University of Thessaly, Volos, Greece
4th ITNG Conference, Las Vegas, NV, 2-4 April 2007
2. Wireless Sensor Network (WSN)
- Wireless Sensor Network features
  - Homogeneous devices
  - Stationary nodes
  - Dispersed network
  - Large network size
  - Self-organized
  - All nodes act as routers
  - No wired infrastructure
  - Potential multihop routes
3. Communication in WSN
- Communication between two unconnected nodes is achieved through intermediate nodes.
- Every node that falls inside the communication range r of a node u is considered reachable from u.
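The reachability rule above can be sketched in a few lines. This is only an illustration, assuming nodes have known 2-D coordinates and a common range r; the function name `reachable_neighbors` and the sample coordinates are hypothetical and not part of the presentation.

```python
import math

def reachable_neighbors(positions, r):
    """Map each node to the set of nodes that lie within communication range r."""
    neighbors = {u: set() for u in positions}
    nodes = list(positions)
    for i, u in enumerate(nodes):
        for v in nodes[i + 1:]:
            # Links are symmetric: v is in range of u iff u is in range of v.
            if math.dist(positions[u], positions[v]) <= r:
                neighbors[u].add(v)
                neighbors[v].add(u)
    return neighbors

# Hypothetical layout: a and c are out of range of each other,
# so they can only communicate through the intermediate node b.
positions = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (2.0, 0.0)}
adj = reachable_neighbors(positions, r=1.5)
```

Here `adj["a"]` contains only `b`, so a message from a to c needs a multihop route via b.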
4. WSN - Applications
- Applications
  - Habitat monitoring
  - Disaster relief
  - Target tracking
- Many of these applications require simple and/or aggregate functions to be reported.
- Clustering allows aggregation and limits data transmissions.
5. What is Clustering
[Figure legend: cluster member, clusterhead, gateway node, intra-cluster link, cross-cluster link]
- Nodes are divided into virtual groups according to some rules.
- Nodes belonging to a group can execute different functions from other nodes.
6. Clustering in WSN
- Involves grouping nodes into clusters and electing a clusterhead (CH)
- Members of a cluster can communicate with their CH directly
- A CH can forward the aggregated data to the central base station through other CHs
- Clustering objectives
  - Allows aggregation
  - Limits data transmission
  - Facilitates the reusability of resources
  - CHs and gateway nodes can form a virtual backbone for intercluster routing
  - Cluster structure gives the impression of a smaller and more stable network
  - Improves network lifetime
  - Reduces network traffic and the contention for the channel
  - Data aggregation and updates take place in CHs
7. Relevant work: Clustering
- Based on the construction of a Dominating Set (DS)
  - Nodes belonging to the DS carry out all communication
  - These nodes run out of energy very soon
- Based on the residual energy of each node
  - Propose ways to rotate the role of CH among the nodes of a cluster
  - Can be easily combined with the algorithms of the first family
- Our proposal, the GESC protocol, supports
  - dynamic estimation of CHs depending on the requester node, and thus improvement of network lifetime
  - a novel metric for characterizing node importance
  - localization
  - a minimum number of messages exchanged among the nodes
8. Relevant work: Topology Control
Minimum Spanning Tree (MST) and Localized Minimum Spanning Tree (LMST): calculated with Dijkstra's algorithm and the algorithm of Li, Hou & Sha, respectively.
[Figure panels: sample graph, MST, LMST]
Relative Neighborhood Graph (RNG): an edge uv is included in the RNG iff it is not the longest edge in any triangle uvw.
Gabriel Graph (GG): an edge uv is included in the GG iff the disk with diameter uv contains no other node inside it.
Delaunay Triangulation (DT), Partial Delaunay Triangulation (PDT), Yao graph (YG), etc.: many other (variants of) geometric structures exist.
- Topology control: choosing a set of links from the possible ones. This is not exactly our problem, so we rely on graph-theoretic concepts rather than geometric ones.
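The RNG and GG membership tests quoted on this slide translate directly into distance comparisons. The sketch below is an illustration only: it considers every node pair as a candidate edge (it ignores the communication range), and the function name `rng_and_gg_edges` and the sample coordinates are hypothetical.

```python
import math
from itertools import combinations

def rng_and_gg_edges(positions):
    """Return (RNG edges, GG edges) over all node pairs of a point set."""
    rng, gg = set(), set()
    for u, v in combinations(sorted(positions), 2):
        d_uv = math.dist(positions[u], positions[v])
        others = [w for w in positions if w not in (u, v)]
        # RNG: uv must not be the longest edge of any triangle uvw.
        if all(max(math.dist(positions[u], positions[w]),
                   math.dist(positions[v], positions[w])) >= d_uv
               for w in others):
            rng.add((u, v))
        # GG: no other node strictly inside the disk with diameter uv,
        # i.e. the angle at w is never obtuse (Thales' theorem).
        if all(math.dist(positions[u], positions[w]) ** 2
               + math.dist(positions[v], positions[w]) ** 2 >= d_uv ** 2
               for w in others):
            gg.add((u, v))
    return rng, gg

# Hypothetical triangle: ab is its longest edge, but c lies outside
# the disk with diameter ab.
positions = {"a": (0.0, 0.0), "b": (4.0, 0.0), "c": (2.0, 3.0)}
rng_edges, gg_edges = rng_and_gg_edges(positions)
```

In this example the edge ab is dropped from the RNG but kept in the GG, which illustrates that the RNG is the sparser of the two structures.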
9. Minimal Dominating Set
- A vertex set is a Dominating Set (DS) if every other vertex is connected to at least one DS vertex
- It is a Connected Dominating Set (CDS) if the subgraph it induces is connected
- It is a Minimum CDS (MCDS) if its size is minimum among all CDSs
- Finding the MCDS of a graph is NP-complete
[Figure panels: DS, CDS]
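The DS and CDS definitions above can be checked mechanically. The following is a minimal sketch, assuming the graph is given as a dictionary `adj` from node to neighbor set; the function names and the sample path graph are hypothetical.

```python
from collections import deque

def is_dominating_set(adj, ds):
    """Every vertex is either in ds or adjacent to a vertex of ds."""
    return all(v in ds or adj[v] & ds for v in adj)

def is_connected_dominating_set(adj, ds):
    """ds dominates the graph and induces a connected subgraph."""
    if not ds or not is_dominating_set(adj, ds):
        return False
    start = next(iter(ds))
    seen, queue = {start}, deque([start])
    while queue:
        u = queue.popleft()
        # Only walk along edges whose endpoints are both in ds.
        for w in adj[u] & ds:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return seen == ds

# Hypothetical path graph 1 - 2 - 3 - 4 - 5.
path = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
ds_only = is_dominating_set(path, {2, 4}) and not is_connected_dominating_set(path, {2, 4})
cds = is_connected_dominating_set(path, {2, 3, 4})
```

On the path graph, {2, 4} dominates every vertex but is not connected, while {2, 3, 4} is a CDS.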
10. Motivation for new clustering protocol
- The protocol should
  - be localized, and thus distributed
  - fully exploit the locally available information in making the best decisions
  - be computationally efficient
  - minimize the number of messages exchanged among the nodes
  - be energy efficient and thus extend network lifetime; this could be achieved by using different nodes for relaying messages
  - not make use of variants, e.g., node IDs, because a (locally) best decision might not be reached (even if it does exist)
11. Well-known CDS algorithm
Wu and Li's algorithm
- Each node exchanges its neighborhood information with all of its one-hop neighbors
- Any node with two unconnected neighbors becomes a dominator (red)
- The set of all the red nodes forms a CDS
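The marking step of this slide can be sketched as follows. This is a minimal centralized illustration of a rule that each node would evaluate locally; the adjacency dictionary `adj`, the function name `wu_li_marking`, and the sample graph are assumptions made for the example.

```python
from itertools import combinations

def wu_li_marking(adj):
    """Mark every node that has two neighbors which are not directly connected."""
    marked = set()
    for v, nbrs in adj.items():
        if any(w not in adj[u] for u, w in combinations(nbrs, 2)):
            marked.add(v)
    return marked

# Hypothetical path graph 1 - 2 - 3 - 4 - 5.
path = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
dominators = wu_li_marking(path)
```

On the path graph the three interior nodes are marked, while the two end nodes, which have a single neighbor each, are not.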
12. Well-known CDS algorithm
Wu and Li's algorithm (Pruning Rules 1 & 2)
Open neighbor set: N(v) = {u | u is a neighbor of v}
Closed neighbor set: N[v] = N(v) ∪ {v}
- A node u can be taken out of the CDS if u has two neighbors v and w such that N(u) is covered by N(v) ∪ N(w) and the ID of u is smaller than the IDs of both v and w
- A node v can be taken out of the CDS if there exists a node u such that N[v] is a subset of N[u] and the ID of v is smaller than the ID of u
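The two pruning conditions can be sketched as set comparisons. This is an illustration under stated assumptions, not the authors' or Wu and Li's implementation: node labels double as IDs, the covering nodes are required to be marked themselves (the slide does not say so explicitly), and both conditions are evaluated against the initial marked set. The function name `prune` and the sample graphs are hypothetical.

```python
from itertools import combinations

def prune(adj, marked):
    """Return the marked set after applying both pruning conditions once."""
    closed = {v: adj[v] | {v} for v in adj}  # closed neighbor sets N[v]
    removed = set()
    for u in marked:
        # Subset condition: N[u] is contained in N[x] for a marked x with a larger ID.
        if any(closed[u] <= closed[x] and u < x for x in marked - {u}):
            removed.add(u)
            continue
        # Two-neighbor condition: N(u) is covered by N(v) | N(w) for two
        # marked neighbors v, w, and u has the smallest ID of the three.
        if any(adj[u] <= adj[v] | adj[w] and u < min(v, w)
               for v, w in combinations(adj[u] & marked, 2)):
            removed.add(u)
    return marked - removed

# Hypothetical graph where node 1 is covered jointly by nodes 2 and 3.
g1 = {1: {2, 3, 4, 5}, 2: {1, 3, 4}, 3: {1, 2, 5}, 4: {1, 2}, 5: {1, 3}}
pruned_g1 = prune(g1, {1, 2, 3})

# Hypothetical graph where N[1] equals N[2], so the smaller ID is dropped.
g2 = {1: {2, 3, 4}, 2: {1, 3, 4}, 3: {1, 2}, 4: {1, 2}}
pruned_g2 = prune(g2, {1, 2})
```

Because node IDs decide which of two equally suitable nodes is removed, the outcome depends on the labeling, which is the kind of ID-based decision the presentation argues against.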
13. HEED protocol (1/2)
- Every sensor node has multiple power levels.
- Periodically selects CHs according to a hybrid of the node residual energy and node degree.
- T_CP is the clustering process duration and T_NO is the network operation interval.
- Clustering is activated every T_CP + T_NO seconds.
- The initial percentage of CHs is C_prob.
- The probability of a node becoming a CH is CH_prob.
14. HEED protocol (2/2)
- Intracluster and intercluster communication
- Intracluster communication cost is proportional to
  - node degree (load distribution)
  - 1 / node degree (dense clusters)
- If variable power levels are allowed for intracluster communication, then select CHs using the average minimum reachability power.
15. LEACH protocol (1/2)
- All nodes can transmit with enough power to reach the base station (BS), and the nodes use power control.
- Cluster formation takes place during the set-up phase and data transfer during the steady-state phase.
- Each node elects itself as CH at the beginning of round r+1 with probability P_i(t), where k is the number of clusters.
- All nodes are CHs the same number of times.
- All nodes have the same energy after N/k rounds, where N is the number of nodes.
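The slide does not spell out the election probability. Assuming it is the expression from the original LEACH paper, P_i(t) = k / (N - k (r mod N/k)) for a node that has not yet served as CH in the current cycle of N/k rounds and 0 otherwise, it can be sketched as below. The function name, its parameters, and the values N = 100 and k = 5 are hypothetical, and N is assumed to be divisible by k.

```python
def leach_ch_probability(n_nodes, k, r, eligible):
    """Self-election probability at the start of round r+1 (assumed LEACH formula)."""
    if not eligible:
        # The node has already been a CH in the current cycle.
        return 0.0
    rounds_per_cycle = n_nodes // k
    return k / (n_nodes - k * (r % rounds_per_cycle))

# Hypothetical network: 100 nodes, 5 clusters, so a cycle lasts 20 rounds.
p_first = leach_ch_probability(100, 5, r=0, eligible=True)
p_last = leach_ch_probability(100, 5, r=19, eligible=True)
p_done = leach_ch_probability(100, 5, r=7, eligible=False)
```

The probability grows from k/N in the first round of a cycle to 1 in the last one, so every node that has not yet served is forced to do so, which is how each node ends up being CH once per N/k rounds.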
16. LEACH protocol (2/2)
- Every node chooses as its CH the one that requires the least energy consumption for communication.
- Every CH sets up a TDMA schedule and transmits it to the nodes of its cluster. Every node can then transmit data in its corresponding time slot.
- Weakness
  - Limited scalability
- Could be complementary to clustering techniques based on the construction of a DS
17. Weakness of current approaches
- Some approaches cannot detect all possible eliminations because ordering based on node ID prevents this. As a consequence, they incur significantly more retransmissions than necessary.
- Others rely on a lot of local information, for instance knowledge of the k-hop neighborhood (k > 2), e.g., [WD04, WL04].
- Other methods are computationally expensive, incurring a cost of O(f^2) or O(f^3), where f is the maximum degree of a node of the ad hoc network, e.g., the methods reported in [WL01, WD03, DW04] and [SSZ02].
- Some methods (e.g., [QVL00, SSZ02]) do not fully exploit the compiled information; for instance, using the degree of a node as its priority when deciding its possible inclusion in the dominating set might not result in the best local decision.
18. Terminology and assumptions
- The WSN is abstracted as a graph G(V,E)
- An edge e(u,v) exists if and only if u is in the transmission range of v and vice versa. All links in the graph are bidirectional.
- The network is assumed to be connected
- N1(v): the set of one-hop neighbours of v
- N2(v): the set of two-hop neighbours of v
- N12(v): the combined set of N1(v) and N2(v)
- LN_v: the induced subgraph of G associated with the vertices in N12(v)
- d_G(v,u): the distance between v and u
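The neighbourhood sets defined above can be computed from an adjacency dictionary as sketched below. The slide does not state whether v itself belongs to the induced subgraph, so the sketch follows the definition literally and exposes a hypothetical `include_self` flag; the function name `two_hop_view` and the sample graph are also assumptions.

```python
def two_hop_view(adj, v, include_self=False):
    """Return N1(v), N2(v), N12(v) and the induced subgraph on N12(v)."""
    n1 = set(adj[v])
    # Two-hop neighbours: neighbours of neighbours, minus v and N1(v).
    n2 = set().union(*(adj[u] for u in n1)) - n1 - {v}
    n12 = n1 | n2
    vertices = n12 | {v} if include_self else n12
    # Induced subgraph: keep only the edges with both endpoints retained.
    ln_v = {u: adj[u] & vertices for u in vertices}
    return n1, n2, n12, ln_v

# Hypothetical path graph 1 - 2 - 3 - 4 - 5, seen from node 1.
path = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
n1, n2, n12, ln_v = two_hop_view(path, 1)
```

Node 1 learns about nodes 2 and 3 only; nodes 4 and 5 lie outside its 2-hop view.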
19. A new measure of node importance
- Let $\sigma_{uw} = \sigma_{wu}$ denote the number of shortest paths from $u \in V$ to $w \in V$ (by definition, $\sigma_{uu} = 0$).
- Let $\sigma_{uw}(v)$ denote the number of shortest paths from u to w that some vertex $v \in V$ lies on.
- We define the node importance index NI(v) of a vertex v as
  $$NI(v) = \sum_{u \neq v \neq w \in V} \frac{\sigma_{uw}(v)}{\sigma_{uw}}$$
- Large values for the NI index of a node v indicate that this node can reach others on relatively short paths, or that v lies on considerable fractions of the shortest paths connecting others. In the former case, it captures the fact of a possibly large degree of node v, and in the latter case, it captures the fact that v might have one (or some) isolated neighbors.
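A direct, unoptimized reading of this definition can be sketched as follows, assuming NI(v) sums, over pairs (u, w), the fraction of shortest u-w paths that pass through v. The sketch counts each unordered pair once, which is an assumption about the convention; function names and sample graphs are hypothetical. It uses the fact that v lies on a shortest u-w path exactly when d(u,v) + d(v,w) = d(u,w), in which case the number of such paths through v is the product of the u-v and v-w path counts.

```python
from collections import deque

def shortest_path_counts(adj, s):
    """BFS from s: hop distance and number of shortest paths to each reachable node."""
    dist, sigma = {s: 0}, {s: 1}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:
                sigma[w] += sigma[u]
    return dist, sigma

def node_importance(adj):
    """NI(v) by direct enumeration of node pairs (cubic in the number of nodes)."""
    info = {s: shortest_path_counts(adj, s) for s in adj}
    nodes = sorted(adj)
    ni = {v: 0.0 for v in nodes}
    for i, u in enumerate(nodes):
        dist_u, sigma_u = info[u]
        for w in nodes[i + 1:]:
            if w not in dist_u:
                continue  # u and w are disconnected
            for v in nodes:
                if v == u or v == w or v not in dist_u:
                    continue
                dist_v, sigma_v = info[v]
                if dist_u[v] + dist_v[w] == dist_u[w]:
                    ni[v] += sigma_u[v] * sigma_v[w] / sigma_u[w]
    return ni

# Hypothetical graphs: a 5-node path and a 4-node cycle.
path = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
cycle = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {3, 1}}
ni_path = node_importance(path)
ni_cycle = node_importance(cycle)
```

On the path, the middle node scores highest because every route between the two halves crosses it; on the cycle, each node carries half of the two shortest paths between its two neighbours.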
20. The NI index in sample graphs
In parentheses is the NI index of the respective node; e.g., 7(156) means that the node with ID 7 has NI equal to 156.
- Nodes with large NI
  - Articulation nodes (in bridges), e.g., 3, 4, 7, 16, 18
  - Nodes with large fanout, e.g., 14, 8, U
- Therefore: geodesic nodes
21. The NI index in a localized algorithm
- For any node v, the NI indexes of the nodes in N12(v), calculated only for the subgraph of the 2-hop (in general, k-hop) neighborhood, reveal the relative importance of the nodes in covering N12(v)
- For a node u (of the 2-hop neighbourhood of a node v), the NI index of u will be denoted as NI_v(u)
22. NI computation
- At first glance, NI computation seems expensive, i.e., O(mn^2) operations in total for a 2-hop neighbourhood consisting of n nodes and m links
  - calculating the shortest path between a particular pair of vertices (assume for the moment that there exists only one) can be done using BFS in O(m) time, and there exist O(n^2) vertex pairs
- Fortunately, we can do better than this by making some smart observations. The improved algorithm (CalculateNodeImportanceIndex) is quite complicated and beyond the scope of this presentation
- THEOREM. The complexity of the algorithm CalculateNodeImportanceIndex is O(nm) for a graph with n vertices and m edges
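One known way to reach the O(nm) bound for a shortest-path-fraction index on an unweighted graph is Brandes' dependency accumulation: one BFS per source, followed by a backward pass over the BFS order. The sketch below uses that technique; whether it matches the authors' CalculateNodeImportanceIndex is an assumption, as are the function name, the choice to count each unordered pair once, and the sample graphs.

```python
from collections import deque

def node_importance_brandes(adj):
    """Shortest-path-fraction index of every node via Brandes-style accumulation."""
    ni = {v: 0.0 for v in adj}
    for s in adj:
        order = []                          # nodes in non-decreasing distance from s
        preds = {v: [] for v in adj}        # predecessors on shortest paths from s
        sigma = {v: 0 for v in adj}         # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:                        # forward pass: one BFS, O(m)
            u = queue.popleft()
            order.append(u)
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
                if dist[w] == dist[u] + 1:
                    sigma[w] += sigma[u]
                    preds[w].append(u)
        delta = {v: 0.0 for v in adj}       # dependency of s on each node
        for w in reversed(order):           # backward pass: O(m)
            for u in preds[w]:
                delta[u] += sigma[u] / sigma[w] * (1 + delta[w])
            if w != s:
                ni[w] += delta[w]
    # Every unordered pair was seen from both endpoints, so halve the totals.
    return {v: total / 2 for v, total in ni.items()}

# Hypothetical graphs: a 5-node path and a 4-node cycle.
path = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
cycle = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {3, 1}}
ni_path = node_importance_brandes(path)
ni_cycle = node_importance_brandes(cycle)
```

Each source costs one BFS plus one pass over the recorded predecessor links, so the total is n times O(m), with no explicit loop over the O(n^2) vertex pairs.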
23. Pseudocode for CalculateNodeImportanceIndex (1/2)
24. Pseudocode for CalculateNodeImportanceIndex (2/2)
25. Evaluation setting (1/2)
- We compare GESC to
  - WL 1&2, the improved scheme incorporating the indicated pruning rules
  - MPR, the MultiPoint Relaying method described in [QVL00]
  - SSZ, reported in [SSZ02], which was selected as a Fast Breaking Paper for October 2003
- Implementation of the protocols using the J-Sim simulation library
- Sensor network topologies with 100, 300 and 500 nodes
- Each topology consists of square grid units
- Sensor nodes are uniformly distributed between the points (0,0) and (100,100)
- Two sensor nodes are neighbors if they are placed in the same or adjacent grid units
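A topology of this kind can be generated as sketched below. This is not the J-Sim setup used in the evaluation: the function name, the `cell_size` and `seed` parameters, and the choice to treat diagonal grid units as adjacent are all assumptions made for illustration.

```python
import random

def grid_topology(n_nodes, cell_size, side=100.0, seed=0):
    """Place nodes uniformly at random and link nodes in the same or adjacent grid units."""
    rng = random.Random(seed)
    pos = {i: (rng.uniform(0, side), rng.uniform(0, side)) for i in range(n_nodes)}
    # Grid unit (column, row) that each node falls into.
    cell = {i: (int(x // cell_size), int(y // cell_size)) for i, (x, y) in pos.items()}
    adj = {i: set() for i in pos}
    for i in range(n_nodes):
        for j in range(i + 1, n_nodes):
            (ci, ri), (cj, rj) = cell[i], cell[j]
            # Same unit or one of the 8 surrounding units (assumed meaning of "adjacent").
            if abs(ci - cj) <= 1 and abs(ri - rj) <= 1:
                adj[i].add(j)
                adj[j].add(i)
    return pos, adj

# Hypothetical run: 100 nodes with grid units of side 10.
pos, adj = grid_topology(n_nodes=100, cell_size=10.0)
avg_degree = sum(len(nbrs) for nbrs in adj.values()) / len(adj)
```

Changing `cell_size` changes the average node degree, which is one plausible way to obtain the different degree levels used in the experiments.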
26. Evaluation setting (2/2)
- Varying levels of node degree, from 4 to 10
- Run each protocol at least 100 times for each different node degree. Each time, a different node is selected to start broadcasting
- Performance metrics
  - Energy dissipation
  - Broadcast messages
  - Latency
27. Impact of the nodes (1/2)
28. Impact of the nodes (2/2)
29. Impact of the average node degree
30. Impact of energy consumption
31. Conclusions and Future Work
- Defined and investigated a novel distributed clustering protocol for WSNs based on a novel localized metric
- The calculation of this metric is very efficient, linear in the number of nodes and linear in the number of links
- Showed that the protocol is very efficient in terms of communication cost and in terms of prolonging network lifetime
- The protocol is able to reap significant performance gains, reducing the number of rebroadcasting nodes
- Simulated an environment to evaluate the performance of the protocol and of competing protocols using the J-Sim simulator
- Comparison with protocols based on residual energy (LEACH, HEED)
- GESC (GEodesic Sensor Clustering) has been shown to prevail