Title: Tang E. K, Tiun S.T, Abdullah R.
1 Enriching ontology using WordNet
- BY
- Tang E. K, Tiun S.T, Abdullah R.
- Pusat Pengajian Sains Komputer
- Universiti Sains Malaysia
2 Smart Product Information Search
1. Product Info Cataloging/Indexing
2. Product Concepts Relevancy/Categorization
Product info Databases
Indexing of selected contents
Dictionaries, WordNet, Thesaurus, Concepts Category Hierarchy
Product Concepts Categorization
Full text indexing
Information brokering
Product Indexing Database
Product catalogues
Product Concepts Relevancy Network
Product Name Manufacturer Price ..
Language processor
Indexing Info. of Product 1
Indexing Info. of Product n
3. Searching
Search parameters
Interpreting/expanding user's query
SEARCH
Key Words Expression
- Search by key words and other constraints
- Results sorted by relevancy and field range
- Customizable search results presentation
User profiles
Access level Roles Tasks
User's query
3 Ontology → Extended Ontology
Yahoo
WordNet
Computer And Internet (computer1, Internet1)
Arts_and_Humanities (art1, humanities1)
computer1: data-processor1, electronic-computer1, digital-computer1, machine1
Internet1: cyberspace1, computer-network1
- How: Use an external linguistic database (WordNet) with its synonym, hyponym/hypernym and meronym/holonym relationships.
- Why: To solve the Out Of Vocabulary (OOV) problem.
4 Stemming and word sense tagging
(input) Yahoo! > Computer And Internet > Security and Encryption
Stemming process
computer internet security encryption
Sense tagging process
computer1 internet1 security4 encryption1
(output)
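The two steps on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: the stop-word list, the crude plural-stripping rule, and the `SENSE_TABLE` lookup are assumptions made for the example, and the sense numbers are simply copied from the slide's output rather than computed by a disambiguation method.

```python
import re

# Assumed stop-word list; the slides only show that "and" is dropped.
STOP_WORDS = {"and", "or", "of", "the"}

# Hypothetical sense table: the numbers are copied from the slide's output,
# not produced by a real word sense disambiguation step.
SENSE_TABLE = {"computer": 1, "internet": 1, "security": 4, "encryption": 1}


def stem(label):
    """Lowercase a directory label, drop stop words, strip a plural 's'."""
    stems = []
    for word in re.findall(r"[a-z]+", label.lower()):
        if word in STOP_WORDS:
            continue
        if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
            word = word[:-1]
        stems.append(word)
    return stems


def sense_tag(stems, table=SENSE_TABLE):
    """Append the sense number to every stem found in the sense table."""
    return [f"{s}{table[s]}" for s in stems if s in table]


path = ["Computer And Internet", "Security and Encryption"]
stems = [w for label in path for w in stem(label)]
tagged = sense_tag(stems)
print(" ".join(stems))   # computer internet security encryption
print(" ".join(tagged))  # computer1 internet1 security4 encryption1
```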
5 Obtain related words from WordNet
computer1 (word from concept)
WORDNET
SYNONYM → data-processor1, electronic-computer1
HYPERNYM/HYPONYM → machine, digital-computer1
HOLONYM/MERONYM → null
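The lookup on this slide can be sketched with a tiny in-memory stand-in for WordNet. `MINI_WORDNET` and `related_words` are hypothetical names introduced for this example; the entries are the ones shown on the slides for computer1. A real system would query WordNet itself (for instance through a library interface) instead of a hand-written dictionary.

```python
# Hypothetical stand-in for WordNet: one entry, copied from the slides.
MINI_WORDNET = {
    "computer1": {
        "synonym": ["data-processor1", "electronic-computer1"],
        "hypernym_hyponym": ["machine1", "digital-computer1"],
        "holonym_meronym": [],  # "null" on the slide
    },
}

RELATIONS = ("synonym", "hypernym_hyponym", "holonym_meronym")


def related_words(sense, lexicon=MINI_WORDNET):
    """Collect the related words of a sense-tagged word, in relation order."""
    entry = lexicon.get(sense, {})
    related = []
    for relation in RELATIONS:
        for word in entry.get(relation, []):
            if word not in related:
                related.append(word)
    return related


print(related_words("computer1"))
```

A word that is missing from the lexicon simply yields an empty list, which corresponds to a node that receives no extension.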
6 Extended Yahoo directory path
Yahoo: null
Computers and Internets (computer1, Internet1)
computer1: data-processor1, electronic-computer1, digital-computer1, machine1
Internet1: cyberspace1, computer-network1
Security_and_Encryption (security4, encryption1)
security4: security-reason1, precaution1, safeguard1
encryption1: coding1, compression1
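One possible way to hold the extended path in memory is sketched below. The `ExtendedNode` class and the `nodes_matching` helper are hypothetical names chosen for this illustration; the node names and word lists are those shown on the slide. The point of the sketch is the OOV argument: a word such as cyberspace1 does not occur in any Yahoo label, yet it can still be matched to a node through the WordNet extension.

```python
from dataclasses import dataclass, field


@dataclass
class ExtendedNode:
    """A directory node plus the WordNet extension of each concept word."""
    name: str
    concepts: dict = field(default_factory=dict)  # sense -> related words

    def vocabulary(self):
        """All words that now count as evidence for this node."""
        vocab = set()
        for sense, related in self.concepts.items():
            vocab.add(sense)
            vocab.update(related)
        return vocab


path = [
    ExtendedNode("Yahoo"),  # root has no concept words ("null")
    ExtendedNode("Computers and Internets", {
        "computer1": ["data-processor1", "electronic-computer1",
                      "digital-computer1", "machine1"],
        "Internet1": ["cyberspace1", "computer-network1"],
    }),
    ExtendedNode("Security_and_Encryption", {
        "security4": ["security-reason1", "precaution1", "safeguard1"],
        "encryption1": ["coding1", "compression1"],
    }),
]


def nodes_matching(word, nodes):
    """Names of the nodes whose extended vocabulary contains the word."""
    return [n.name for n in nodes if word in n.vocabulary()]


print(nodes_matching("cyberspace1", path))
```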
7 General Overview: Web page topic identification
Web page
Keywords Extraction
Yahoo Ontology
Mapping
Optimization
Topic
Extended Ontology
8 Web page keywords extraction
Keywords extracted from text based on HTML tags
- Words within the title HTML tag: <title>..</title>
- Words that are used for hyperlinks: <a href>...</a>
- Highlighted words
- - Bold: <b>..</b>
- - Italics: <i>..</i>
- - Enlarged: <hD>..</hD>
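The tag-based extraction listed above can be sketched with Python's standard `html.parser` module. The class name `KeywordExtractor`, the sample page, and the tokenization rule (lowercase alphabetic words) are assumptions for this example. As a simplification, every `<a>` element is treated as a hyperlink, whether or not it carries an `href` attribute, and `<hD>` is read as the heading tags h1 to h6.

```python
import re
from html.parser import HTMLParser

# Tags whose text is treated as keyword-bearing, following the slide.
KEYWORD_TAGS = {"title", "a", "b", "i", "h1", "h2", "h3", "h4", "h5", "h6"}


class KeywordExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.depth = 0      # number of keyword tags currently open
        self.keywords = []

    def handle_starttag(self, tag, attrs):
        if tag in KEYWORD_TAGS:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in KEYWORD_TAGS and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        # Keep text only while inside at least one keyword tag.
        if self.depth > 0:
            self.keywords.extend(re.findall(r"[a-z]+", data.lower()))


def extract_keywords(html):
    parser = KeywordExtractor()
    parser.feed(html)
    parser.close()
    return parser.keywords


page = """<html><head><title>Network Security</title></head>
<body><h1>Encryption basics</h1>
<p>Plain text with a <b>firewall</b> and an <i>intrusion</i> note.
See <a href="http://example.com">computer security</a>.</p></body></html>"""

keywords = extract_keywords(page)
print(keywords)
```

Text outside the listed tags (here "Plain text with a ... note") is ignored, so only the emphasized parts of the page contribute keywords.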
9 Topic node identification
Tree
Optimized tree
Single Path
Topic node
Domain Web page directories (Yahoo)
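The slides do not spell out how the tree is reduced to a single path and a topic node. The sketch below is therefore only one plausible reading, assuming a greedy top-down descent that follows, at each level, the child whose vocabulary shares the most words with the page keywords. The scoring rule, the dictionary layout of the tree, and the sample vocabularies are all assumptions made for illustration.

```python
def overlap(node, keywords):
    """Number of page keywords found in a node's vocabulary."""
    return len(node["vocab"] & keywords)


def single_path(root, keywords):
    """Greedy top-down descent; the last node on the path is the topic node."""
    path = [root["name"]]
    node = root
    while node["children"]:
        best = max(node["children"], key=lambda c: overlap(c, keywords))
        if overlap(best, keywords) == 0:
            break  # no child is supported by the keywords
        path.append(best["name"])
        node = best
    return path


# Hypothetical fragment of the directory with plain (untagged) vocabularies.
tree = {
    "name": "Yahoo", "vocab": set(), "children": [
        {"name": "Arts_and_Humanities",
         "vocab": {"art", "humanities"}, "children": []},
        {"name": "Computers and Internets",
         "vocab": {"computer", "internet", "cyberspace", "machine"},
         "children": [
             {"name": "Security_and_Encryption",
              "vocab": {"security", "encryption", "coding", "safeguard"},
              "children": []},
         ]},
    ],
}

page_keywords = {"network", "security", "encryption", "computer"}
path = single_path(tree, page_keywords)
print(path, "topic node:", path[-1])
```

A descent of this kind cannot recover from an early error: once the wrong child is chosen near the root, every node below it is wrong as well.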
10 The experiment and its result
Precision = hits / (hits + mistakes)
RESULT
Optimized tree: 69.7%
Single Path: 51.9%
Topic Node: 29.7%
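The precision measure used on this slide is the share of hits among all classified pages, hits / (hits + mistakes). A minimal sketch follows; the counts in the usage line are hypothetical and are not taken from the experiment.

```python
def precision(hits, mistakes):
    """Precision = hits / (hits + mistakes)."""
    total = hits + mistakes
    if total == 0:
        raise ValueError("precision is undefined when nothing was classified")
    return hits / total


# Hypothetical counts, for illustration only.
print(f"{precision(7, 3):.1%}")
```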
11 We conclude..
- Our approach is simple yet comparable to others' work (others obtained accuracy between 30% and 50%).
- Problems that caused poor results
- - Text extraction: heterogeneity of web documents, poor quality.
- - Domain vocabulary is still insufficient.
- - Top-down model: unrecoverable mistakes (choosing the wrong node at the top leads to the wrong topic node).
- Related work
- - Use a learning method with feature extraction from each category node concept.