Title: Automatic Subject Indexing using an Associative Neural Network
1Automatic Subject Indexing using an Associative Neural Network
- Yi-Ming Chung, William M. Pottenger and Bruce R. Schatz
- CANIS (Community Architectures for Network Information Systems) Laboratory
- University of Illinois at Urbana-Champaign
- http://www.canis.uiuc.edu/papers/chung-dl98/
2Outline
- Overview of indexing
- Concept Assigner
- Interspace
- Steps
- Advantages
- Example
- Performance
- Future directions
3Need for Effective Subject Indexing
- World of a billion repositories
- Every community has its own repository
- Full-text search
- Low precision and recall
- Human indexing
- High quality but not scalable
- Need for automatic indexing method
- Indexed by subject matter and scalable
4Steps of Human Subject Indexing
- Analyze the content
- Determine main subjects
- Consult the thesaurus
- Assign a limited number of index terms
5Summary of Human Indexing
- Professional indexer
- High quality
- High cost
- Thesaurus
- Hard to keep current, broad coverage
- Inconsistency
- Humans are inconsistent in term assignment
6Interspace System Architecture
[Architecture diagram. Layer labels: Kernel Layer, Service Layer, Datastore Layer. Component labels: Interspace Analysis, Interspace Services, External Gateways, External Services, Concept Extraction, Persistent Data Store.]
- Knowledge Retrieval
- Concept Space searching
- Full-text searching
- Knowledge Indexing
- Concept Space Generation
- Concept Assignment
- Knowledge Categorization
- Concept Mapping
- Category Mapping
7Steps of Concept Assigner
- Analyze document
- Extract semantic units using noun phrase parser
- Create Concept Space
- Capture domain context using statistical co-occurrence
- Suggest terms
- Use Hopfield net algorithm to explore concepts
- Assign a few index terms
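The "Create Concept Space" step above can be illustrated with a minimal sketch. It assumes each document has already been reduced to a list of noun phrases (the parser itself is not shown), and it uses a simple asymmetric weight, the share of documents containing term a that also contain term b. That weight is an illustrative simplification, not necessarily the weighting used in the Interspace system, and build_concept_space is a hypothetical name.

```python
from collections import defaultdict
from itertools import combinations


def build_concept_space(docs):
    """Return {term_a: {term_b: weight}} from per-document phrase lists.

    Assumed weight (a simplification): df(a, b) / df(a), i.e. the fraction
    of documents containing term_a that also contain term_b.
    """
    doc_freq = defaultdict(int)   # number of documents containing a term
    pair_freq = defaultdict(int)  # number of documents containing both terms
    for phrases in docs:
        unique = sorted(set(phrases))
        for term in unique:
            doc_freq[term] += 1
        for a, b in combinations(unique, 2):
            pair_freq[(a, b)] += 1
            pair_freq[(b, a)] += 1
    space = defaultdict(dict)
    for (a, b), n_ab in pair_freq.items():
        space[a][b] = n_ab / doc_freq[a]
    return dict(space)


# Toy input: phrases a noun phrase parser might have extracted (made up).
docs = [
    ["parallelizing compiler", "idiom recognition", "data dependence"],
    ["parallelizing compiler", "distributed arrays"],
    ["parallelizing compiler", "idiom recognition"],
]
space = build_concept_space(docs)
```

The weights are directional: in this toy collection every document that mentions "idiom recognition" also mentions "parallelizing compiler", but not the other way around, so the two directions get different weights.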
8Advantages of Concept Assigner
- Automatic Indexer
- Better quality than keyword matching
- Not limited to words extracted from the indexed document
- Lower cost
- domain expert but amateur indexer
- Automatic Concept Space creation
- Current and specific
- Scalable (large collection)
- Facilitates Consistency
- Select index terms from a system-suggested list
9Idiom recognition in the Polaris parallelizing compiler
10Idiom recognition in the Polaris parallelizing compiler
11Handling block-cyclic distributed arrays in Vienna Fortran 90
12Handling block-cyclic distributed arrays in Vienna Fortran 90
13Hopfield Net Algorithm
- Initialization
- Activation
- Convergence
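The three phases above can be sketched as a spreading-activation loop over a concept space. The slide names only the phases, so the details here are assumptions: seed terms start at activation 1 and are held there, other terms are updated with a sigmoid of their weighted input, and iteration stops when the total change falls below a threshold. The function name and the parameters theta_j, theta_0, epsilon and max_iter are hypothetical, and weights are assumed to be non-negative.

```python
import math


def hopfield_suggest(weights, seeds, top_k=5, theta_j=0.5, theta_0=0.1,
                     epsilon=1e-3, max_iter=50):
    """Rank terms by activation spread from the seed terms.

    weights: {term_i: {term_j: w_ij}} with non-negative w_ij (assumed).
    seeds:   set of terms extracted from the document being indexed.
    """
    terms = set(weights) | set(seeds)
    for neighbours in weights.values():
        terms |= set(neighbours)

    # Initialization: seed terms are fully active, all others inactive.
    act = {t: (1.0 if t in seeds else 0.0) for t in terms}

    for _ in range(max_iter):
        # Activation: each term sums the weighted activation of its inputs.
        net = {t: 0.0 for t in terms}
        for i, neighbours in weights.items():
            for j, w in neighbours.items():
                net[j] += w * act[i]
        new_act = {}
        for t in terms:
            if t in seeds:
                new_act[t] = 1.0  # assumption: seeds are clamped
            else:
                new_act[t] = 1.0 / (1.0 + math.exp(-(net[t] - theta_j) / theta_0))

        # Convergence: stop once the total change is small enough.
        change = sum(abs(new_act[t] - act[t]) for t in terms)
        act = new_act
        if change <= epsilon:
            break

    ranked = sorted(act.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]


# Toy concept space with made-up weights.
weights = {
    "parallelizing compiler": {"idiom recognition": 0.9, "data dependence": 0.6},
    "idiom recognition": {"parallelizing compiler": 1.0, "reduction": 0.8},
    "fortran": {"distributed arrays": 0.9},
}
suggested = hopfield_suggest(weights, {"parallelizing compiler"}, top_k=3)
```

In this toy run "reduction" is suggested even though it is linked to the seed only through "idiom recognition", which is how suggestions can reach terms that never appear in the indexed document.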
14Performance
- System performance
- Sun UltraSPARC, 200 MHz, 256 MB memory
- User evaluation
15Future Directions
- Large scale experiments
- 9M MEDLINE abstracts
- Concept Mapping/Switching
- Algorithm optimization
- Validation
- Manual and semi-automatic
16Conclusion
- Automatic subject indexing
- Uses Hopfield Net Algorithm
- Automatic creation of concept spaces
- Scalable technique