1
Adaptation of Neural Nets
for Resource Discovery Problem in
Dynamic and Distributed P2P
Environment
  • Yevgeniy Ivanchenko, University of Jyväskylä
  • yeivanch_at_cc.jyu.fi

2
OBJECTIVES (I)
  • Since nothing is known about the decision mechanism of NeuroSearch, we need to look inside the algorithm to understand its behavior.
  • Since nothing is known about the behavior of the NeuroSearch algorithm in a dynamic environment, we need to study its behavior under conditions that approximate a real-life situation.

3
OBJECTIVES (II)
  • To understand the behavior of NeuroSearch, data analysis techniques were used. The Self-Organizing Map (SOM) is a well-known tool for data mining tasks.
  • A set of rules was obtained from the analysis of NeuroSearch and tested in a static environment. The question that arises here: is it possible to use an algorithm that exploits properties of a static environment in a dynamic scenario?

4
OBJECTIVES (III)
  • If we know the inner structure of the decision mechanism of NeuroSearch, we will be able to tell how much every input contributes to a particular decision of the algorithm. This can, for example, be used to remove unnecessary input information.
  • This can also help evaluate the complexity and robustness of the algorithm.

5
SOM (I)
  • SOM is a neural network model that maps a high-dimensional space onto a low-dimensional space (usually two-dimensional).
  • After the SOM algorithm has been applied, similar vectors from the input space are located near each other in the output space. This helps to investigate the properties of the obtained clusters and, as a consequence, the causes that produced these clusters on the output map.

6
SOM (II)
  • SOM is usually a hexagonal or rectangular grid of neurons. In the figure, R1 and R2 denote different neighborhood sizes.
  • During the training process the neighborhood size is gradually decreased to provide a more accurate adjustment of the neuron weights.

(Figure: SOM neuron grid with two neighborhood radii, R1 and R2.)
7
SOM (III)
  • In the figure one can see that the neurons covered by the neighborhood kernel function move closer to the input vector.
  • The Best Matching Unit (BMU) is the neuron closest to the current input vector.
  • The weights of the neurons are updated according to the kernel function and the distance to the BMU (a minimal training-step sketch follows the figure).

(Figure: the BMU and its neighbors move toward the input vector.)
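As a concrete illustration of the update just described, here is a minimal sketch of one SOM training step in plain NumPy: find the BMU, weight its grid neighbors with a Gaussian kernel whose radius shrinks over time, and pull the covered neurons toward the input vector. The grid size, input dimension, decay schedule and learning rate are illustrative assumptions, not values from the presented work.

```python
import numpy as np

# Minimal sketch of one SOM training step (plain NumPy, not the authors' code).
rows, cols, dim = 10, 10, 16                      # grid size and input dimension (assumed)
rng = np.random.default_rng(0)
weights = rng.random((rows, cols, dim))           # one weight vector per neuron
grid = np.stack(np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1)

def train_step(x, t, t_max, sigma0=3.0, lr0=0.5):
    """Update the map for a single input vector x at iteration t."""
    sigma = sigma0 * np.exp(-t / t_max)           # shrinking neighborhood radius
    lr = lr0 * np.exp(-t / t_max)                 # shrinking learning rate
    # 1. Best Matching Unit: the neuron whose weight vector is closest to x.
    dists = np.linalg.norm(weights - x, axis=-1)
    bmu = np.unravel_index(np.argmin(dists), dists.shape)
    # 2. Gaussian neighborhood kernel centred on the BMU (distance on the grid).
    grid_dist2 = np.sum((grid - np.array(bmu)) ** 2, axis=-1)
    h = np.exp(-grid_dist2 / (2.0 * sigma ** 2))
    # 3. Move the covered neurons toward the input vector.
    weights[:] += lr * h[..., None] * (x - weights)

for t in range(1000):                             # toy training loop on random data
    train_step(rng.random(dim), t, t_max=1000)
```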
8
DATA ANALYSIS (I)
  • NeuroSearch can be considered the main part of the information model of the system. To build this model a black-box method was used: we model the external behavior of the system while not knowing the causes of its particular behavior.
  • To investigate the decision mechanism of NeuroSearch, an analysis of input-output pairs was done using SOM.

9
DATA ANALYSIS (II)
  • To perform the analysis we used the component planes and the U-matrix with the hit distribution on it. The component planes visualize the values of all components of the vectors over the output map (one plane per component). The U-matrix is one possible way to visualize the output map. The hits on the U-matrix correspond to the decisions of NeuroSearch.
  • This approach allows us to investigate not only the contribution of each component to a particular decision, but also the correlations between components (a sketch of these computations follows below).
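To make the two visualizations concrete, below is a minimal NumPy sketch of how a U-matrix and a single component plane can be computed from a trained map. The real analysis was presumably done with an existing SOM toolbox; this only illustrates the idea, reusing the `weights` array from the training sketch above.

```python
import numpy as np

def u_matrix(weights):
    """For every neuron, average the distance to its 4-connected grid neighbors.
    High values mark cluster borders, low values mark cluster interiors."""
    rows, cols, _ = weights.shape
    u = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            dists = []
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < rows and 0 <= nj < cols:
                    dists.append(np.linalg.norm(weights[i, j] - weights[ni, nj]))
            u[i, j] = np.mean(dists)
    return u

def component_plane(weights, k):
    """Slice out component k of every neuron's weight vector."""
    return weights[:, :, k]
```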

10
DATA ANALYSIS (III)
(Figure: U-matrix on the left; component planes for From and toUnsearchedNeighbors on the right.)
  • The figure shows the U-matrix (the left side of the figure) and a fragment of the component planes (the right side of the figure).
  • It is easy to see that the variable From is responsible for stopping further forwarding of the queries where it equals 1.
  • Other variables take different values in the area where From is 1; for example, the variable toUnsearchedNeighbors has different values in this area.

11
DATA ANALYSIS (IV)
  • After the analysis it was found that four variables (From, toVisited, Sent and currentVisited) are responsible for stopping further forwarding of the queries.
  • The variables toUnsearchedNeighbors and Neighbors are correlated.
  • The variables packetsNow and Hops are highly correlated.
  • The variables fromNeighborAmount, packetsNow and Hops are somewhat correlated.
  • NeuroSearch mostly doesn't send the queries further if Neighbors or toUnsearchedNeighbors is small.

12
DATA ANALYSIS (V)
  • Further investigation of the algorithm is based
    on Hops because only this variable shows the
    state of the algorithm in particular time
    interval, in other words analyzing intervals of
    this variable we can monitor the queries through
    their path.
  • The maximum length of the queries path is 7.
    Thus we have 7 different cases to analyze.
  • Data for each case contains only samples with the
    currently investigating value of Hops variable.
    All samples where at least one of From, Sent,
    currentVisted or toVisited variables is equal to
    1 were removed as well. It is because we already
    know behavior of the algorithm in these areas.
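A minimal sketch of this per-Hops case selection, assuming the samples are kept in a pandas DataFrame whose columns follow the variable names used in the slides (the layout itself is an assumption):

```python
import pandas as pd

def case_for_hops(data: pd.DataFrame, hops: int) -> pd.DataFrame:
    """Keep only samples with the given Hops value, and drop every sample
    where one of the known stopping variables equals 1."""
    stopping = ["From", "Sent", "currentVisited", "toVisited"]
    mask = (data["Hops"] == hops) & (data[stopping] != 1).all(axis=1)
    return data[mask]

# Tiny toy frame just to make the sketch executable:
toy = pd.DataFrame({
    "Hops": [1, 1, 2], "From": [0, 1, 0], "Sent": [0, 0, 0],
    "currentVisited": [0, 0, 0], "toVisited": [0, 0, 0],
})
print(case_for_hops(toy, 1))   # keeps only the first row; one such subset per Hops value
```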

13
DATA ANALYSIS (VI)
  • After investigating the algorithm for the different values of Hops we produced a Rule Based Algorithm (RBA). RBA is based on the rules extracted from the analysis of the U-matrix and the corresponding component planes.
  • The general strategy of the algorithm is quite simple: the decision is mostly based on the interconnection between the Hops, Neighbors/toUnsearchedNeighbors and NeighborsOrder values. In the beginning the algorithm sends the queries to the most connected nodes. As the number of hops in the query increases, NeuroSearch gradually starts to forward the queries to low-connected nodes (a hedged sketch of this strategy follows below).
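The concrete extracted rules are not reproduced in the slides, so the following is only a hedged sketch of the general strategy just described: prefer well-connected, unsearched neighbors early in the path and relax the connectivity requirement as Hops grows. All thresholds and the assumed NeighborsOrder encoding are illustrative, not the actual RBA rules.

```python
def forward_query(hops: int, neighbors: int, to_unsearched: int,
                  neighbors_order: int) -> bool:
    """Decide whether to forward a query over a given link (illustrative only)."""
    # Known stopping behavior: never forward when there is nothing new to reach.
    if to_unsearched == 0:
        return False
    max_hops = 7                      # maximum path length from the slides
    min_degree = max(1, 6 - hops)     # illustrative degree threshold that decays with Hops
    # Assume NeighborsOrder ranks a node's neighbors by connectivity (0 = most connected),
    # so early on only the best-ranked links are used.
    return hops <= max_hops and neighbors >= min_degree and neighbors_order <= neighbors // 2
```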

14
DATA ANALYSIS (VII)

The table shows the efficiency of four algorithms. One can see that NeuroSearch and RBA have almost the same level of performance. This means that RBA captured the behavior of NeuroSearch, and we can say that SOM is well suited for analyzing NeuroSearch. Both of these algorithms are more efficient than BFS-2 and BFS-3.
Comparison between algorithms

Algorithm     Packets   Replies
BFS-2         3000      619
BFS-3         12464     1325
NeuroSearch   4703      979
RBA           4904      963
15
DYNAMIC ENVIRONMENT (I)
  • Since RBA is based on the decision mechanism of NeuroSearch, it is possible to evaluate the behavior of NeuroSearch in a dynamic environment by using RBA.
  • As the simulation environment, a P2P extension for NS-2 was built.
  • The environment provides quite high dynamic change. Two different classes of probabilities define the dynamic changes in the network: the first class is defined randomly before the simulation starts, and the second is defined by formulas (an illustrative churn sketch follows below).
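Since the actual probability formulas are not reproduced here, the following is a purely illustrative churn model: each node gets a base probability drawn before the simulation (standing in for the first class) and a time-dependent probability (standing in for the second class) of leaving or rejoining the network at every step. None of the constants or rules come from the NS-2 extension itself.

```python
import random

random.seed(0)
NUM_NODES = 100
# Class 1 (assumed): a per-node base probability fixed before the simulation starts.
base_p = {node: random.uniform(0.0, 0.05) for node in range(NUM_NODES)}
online = {node: True for node in range(NUM_NODES)}

def step(t: int) -> None:
    """Advance the network by one time step: each node may leave or rejoin."""
    for node in range(NUM_NODES):
        # Class 2 (assumed): a time-dependent probability, stand-in for the slide formulas.
        p = base_p[node] * (1.0 + 0.5 * (t % 10) / 10.0)
        if online[node] and random.random() < p:
            online[node] = False          # node goes offline
        elif not online[node] and random.random() < p:
            online[node] = True           # node comes back online

for t in range(100):
    step(t)
print(sum(online.values()), "of", NUM_NODES, "nodes online after 100 steps")
```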

16
DYNAMIC ENVIRONMENT (II)
To make a qualitative evaluation of performance, RBA was compared to BFS-2 and BFS-3 in static and dynamic environments. The number of replies and the amount of packets used in the static environment are shown in the figures.
17
DYNAMIC ENVIRONMENT (III)
  • Analyzing the behavior of the algorithms in the static environment, one can see that RBA mostly locates more resources than BFS-2 and significantly fewer than BFS-3.
  • In general RBA uses more packets than BFS-2 and significantly fewer than BFS-3.
  • This situation is satisfactory, because RBA is based on NeuroSearch's decision mechanism, which is trained to locate only half of the available resources.
  • At some points RBA locates more resources than BFS-3 while using fewer packets. This means that if a resource isn't common in the network, RBA, and as a consequence NeuroSearch, can find enough instances of this resource.

18
DYNAMIC ENVIRONMENT (IV)
The number of replies and the amount of packets used in the dynamic environment are shown in the figures. Analyzing the figures, one can see that the performance of the algorithms did not suffer much in the dynamic environment.
19
DYNAMIC ENVIRONMENT (V)
The total number of located resources and used packets in the static and dynamic environments is shown in the table:

Algorithm   Packets (static)   Packets (dynamic)   Replies (static)   Replies (dynamic)
BFS-2       3000               2515                619                528
BFS-3       12464              10040               1325               1245
RBA         4904               4865                963                900
The algorithms can still find enough resources in the dynamic environment. There are two possible causes that explain why all investigated algorithms found slightly fewer resources: 1) some nodes in offline mode could contain the queried resources; 2) some nodes in offline mode could lie on a possible path of the query.
20
DYNAMIC ENVIRONMENT (VI)
  • The algorithms used fewer packets in the dynamic environment than in the static environment.
  • The BFS strategy is very sensitive to the size of the network: the BFS-based algorithms used significantly fewer packets in the dynamic environment, where the size of the network was smaller for the whole simulation time.
  • RBA used approximately the same number of packets in both environments, so we can say that RBA is not strongly sensitive to the size of the network.

21
FUTURE WORK
  • Developing a supervised approach to train NeuroSearch.
  • Developing a modification of the algorithm for ad hoc wireless P2P networks.
  • Paying more detailed and deeper attention to the inner structure of the algorithm, using knowledge discovery methods.
  • Investigating and utilizing properties of other P2P algorithms to answer the question of whether these properties should be added to NeuroSearch.

22
Thank you!