International Joint Conference on Neural Networks IJCNN 2005 - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: International Joint Conference on Neural Networks IJCNN 2005


1
International Joint Conference on Neural Networks
(IJCNN 2005)
  • A Self-Organizing Map for Concept Classification in Information Retrieval

Guy Desjardins (intellagent@vif.com)
Robert Godin (godin.robert@uqam.ca)
Robert Proulx (proulx.robert@uqam.ca)
2
Agenda
  • Information Retrieval
  • IR SOM-NN
  • SOM Neural Network Model
  • Experiment and Results
  • Conclusion - Future

3
Information Retrieval
  • Text representation
  • Indexing documents → corpus terms
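
As a rough illustration of this indexing step, the sketch below tokenizes documents and builds an inverted index from corpus terms to documents. The tokenizer and stop-word list are illustrative assumptions, not the preprocessing actually used for the collection.

```python
# Minimal indexing sketch: map corpus terms to the documents containing them.
# The tokenizer and stop-word list here are illustrative only.
import re
from collections import defaultdict

STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in"}

def tokenize(text):
    """Lowercase, split on non-letters, drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]

def build_index(documents):
    """Return an inverted index: term -> set of document ids."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index

docs = {1: "Self-organizing maps cluster documents.",
        2: "Information retrieval matches queries to documents."}
print(sorted(build_index(docs)))   # the corpus terms
```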

4
Information Retrieval
  • Query-document matching

5
Information Retrieval
  • Similarity(Qi, Dj) = cosine(qi, dj)
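
A minimal sketch of the cosine matching above, assuming the query and the document are already represented as weighted term vectors; the vector values below are made up for illustration.

```python
import numpy as np

def cosine_similarity(q, d):
    """cos(q, d) = q.d / (||q|| * ||d||); returns 0 for all-zero vectors."""
    denom = np.linalg.norm(q) * np.linalg.norm(d)
    return float(np.dot(q, d) / denom) if denom else 0.0

# Toy weighted vectors over a 4-term vocabulary.
q = np.array([0.0, 1.2, 0.0, 0.8])   # query Qi
d = np.array([0.5, 0.9, 0.1, 0.0])   # document Dj
print(cosine_similarity(q, d))
```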

6
IR SOM-NN
7
IR SOM-NN
Documents
  • Concepts

8
IR SOM-NN
  • Documents → Concepts
  • document weights
  • tf translation
  • idf translation
  • concept weights
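
The slide only names the ingredients of this translation (document weights, tf, idf, concept weights). The sketch below shows one plausible reading, in which standard tf-idf weights per document are summed over the concept each term belongs to; the term-to-concept mapping and the toy numbers are assumptions, not the paper's actual translation.

```python
# Hedged sketch: translate per-document tf-idf weights into concept weights,
# assuming each term has already been assigned to one concept (SOM output node).
import math
from collections import Counter, defaultdict

def tf_idf(doc_terms, doc_freq, n_docs):
    """Standard tf-idf per term for one document."""
    counts = Counter(doc_terms)
    return {t: c * math.log(n_docs / doc_freq[t]) for t, c in counts.items()}

def concept_weights(term_weights, term_to_concept):
    """Sum the term weights that fall into each concept."""
    cw = defaultdict(float)
    for term, w in term_weights.items():
        cw[term_to_concept[term]] += w
    return dict(cw)

doc_freq = {"neural": 3, "map": 5, "retrieval": 2}            # document frequencies
term_to_concept = {"neural": "C1", "map": "C1", "retrieval": "C2"}
w = tf_idf(["neural", "map", "map", "retrieval"], doc_freq, n_docs=10)
print(concept_weights(w, term_to_concept))
```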

9
SOM Neural Network Model
  • Input layer: document vectors (t1, t2, ..., tn)
  • Output layer: document clusters c1, c2, ..., cj
  • Need to manually translate into term clusters

[Diagram: Standard Topology in Information Retrieval - documents (input) mapped onto document clusters (output)]
10
SOM Neural Network Model
  • Term-document matrix
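
A small sketch of building the term-document matrix: each row is a term vector over document dimensions (d1, ..., dm), which is what the model on the next slide feeds to the SOM. The raw-count weighting is just for illustration.

```python
import numpy as np

def term_document_matrix(docs, vocabulary):
    """Rows = terms, columns = documents; entries = raw term counts."""
    matrix = np.zeros((len(vocabulary), len(docs)))
    for j, doc in enumerate(docs):
        for token in doc:
            if token in vocabulary:
                matrix[vocabulary[token], j] += 1
    return matrix

docs = [["som", "cluster", "term"], ["query", "term", "term"]]
vocab = {"som": 0, "cluster": 1, "term": 2, "query": 3}
A = term_document_matrix(docs, vocab)
print(A)   # each row is one term vector (d1, ..., dm)
```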

11
SOM Neural Network Model
Our Model
  • Input layer: term vectors (d1, d2, ..., dm)
  • Output layer: concepts C1, C2, ..., Cj

[Diagram: terms (input) mapped onto concepts (output)]
12
SOM Neural Network Model
  • Learning rule
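
The slide's formula is not reproduced in this transcript, so the following is a sketch of the standard Kohonen update (best-matching unit plus Gaussian neighbourhood), which is presumably close to what the model uses; the map size and vector dimension here are toy values.

```python
# Sketch of the standard Kohonen SOM update. One training step:
#   1. find the best-matching node b for input x,
#   2. pull b and its neighbours towards x:  w_c += alpha * h(c, b) * (x - w_c)
import numpy as np

def som_step(weights, grid, x, alpha, radius):
    """weights: (nodes, dim) prototypes; grid: (nodes, 2) map coordinates."""
    bmu = np.argmin(np.linalg.norm(weights - x, axis=1))      # best-matching unit
    dist2 = np.sum((grid - grid[bmu]) ** 2, axis=1)           # distance on the map
    h = np.exp(-dist2 / (2.0 * radius ** 2))                  # neighbourhood kernel
    weights += alpha * h[:, None] * (x - weights)             # move towards x
    return bmu

rng = np.random.default_rng(0)
grid = np.array([(i, j) for i in range(20) for j in range(20)], float)  # 20 x 20 map
weights = rng.random((400, 5))                                          # toy 5-dim inputs
som_step(weights, grid, rng.random(5), alpha=0.1, radius=3.0)
```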

13
Experiment
  • Collection TREC FT943
  • 17 109 documents
  • 15 queries
  • 71 011 index terms
  • SOM parameters (see the sketch after this list)
  • two-dimensional map: 20 × 20 = 400 nodes
  • learning rate = 0.1
  • scaling factor = 3750
  • maximum starting neighbourhood = 3
  • Averages
  • 7 documents / index term
  • 29 terms / document
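
A hedged training-loop sketch wiring the parameters above into the som_step update from the earlier sketch. How the scaling factor (3750) enters the schedule is not stated on the slide, so treating it as an exponential-decay time constant for the learning rate and neighbourhood is an assumption.

```python
# Hedged sketch of a training loop with the slide's parameters.
import numpy as np

MAP_SIDE = 20            # 20 x 20 = 400 nodes
ALPHA_0 = 0.1            # initial learning rate
SCALING = 3750.0         # assumed decay time constant (role not stated on the slide)
RADIUS_0 = 3.0           # maximum starting neighbourhood

def train(inputs, steps, som_step_fn):
    """inputs: (n_samples, dim) term vectors; som_step_fn: the earlier som_step."""
    rng = np.random.default_rng(0)
    grid = np.array([(i, j) for i in range(MAP_SIDE) for j in range(MAP_SIDE)], float)
    weights = rng.random((MAP_SIDE * MAP_SIDE, inputs.shape[1]))
    for t in range(steps):
        decay = np.exp(-t / SCALING)                       # assumed schedule
        x = inputs[rng.integers(len(inputs))]              # random training vector
        som_step_fn(weights, grid, x, ALPHA_0 * decay, max(RADIUS_0 * decay, 1e-3))
    return weights
```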

14
Results
  • 362 output nodes activated (90%)
  • Average of 196 terms / node
  • Center of map: high-density nodes
  • nodes (9-11, 9-11) clustered 36% of all terms
  • Periphery of map: more specific concepts
  • number of terms / node < 30

15
Results
16
Results
  • Information retrieval

17
Conclusion - Future
  • SOM: meaningful concept clusters
  • IR: full term vectors ↔ concept vectors
  • Is SOM adequate for IR?
  • Dimensional reduction (71 000 to 400, ≈ 200×)
  • Future work
  • Recursive SOM
  • Explore sensitivity of the parameters
  • Compare with other concept approaches