Title: Progress Report Related work in KM
1Progress Report Related work in KM
- Advisors: Prof. Hahn-Ming Lee, Prof. Jan-Ming Ho
- Reporters: Shou-Wei Ho, Chung-Hung Lin
- Date: 2009.08.31
2Related work in KM (Knowledge Management)
3Problems in searching for a Chinese name
Only Chinese Corpus
4Challenges in Chinese name translation
- Many pronunciation rules in different areas
- Chen (Taiwan)
- Tsun (Hong Kong)
- Tan (Fukien)
- Some additional words exist.
- Ex.: Kwang-Ming Frank Hwang
- Ex.: Jane Win-Shih Liu
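The two challenges above can be illustrated with a minimal sketch. It assumes that the three romanizations listed on the slide are regional variants of one surname and that an "additional word" is an extra token, such as an English given name, inside an otherwise matching name. The function names, the variant table, and the sample query "Kwang-Ming Hwang" are hypothetical and are not taken from the report.

```python
import re

# Assumption: these romanizations (from the slide) denote the same surname.
SURNAME_VARIANTS = {"chen": ["Chen", "Tsun", "Tan"]}


def expand_surname(given: str, surname: str) -> list[str]:
    """Return one query string per known regional romanization of the surname."""
    variants = SURNAME_VARIANTS.get(surname.lower(), [surname])
    return [f"{given} {variant}" for variant in variants]


def name_tokens(name: str) -> set[str]:
    """Lowercase a romanized name and split it on spaces and hyphens."""
    return {token for token in re.split(r"[\s\-]+", name.lower()) if token}


def may_be_same_person(romanized: str, found: str) -> bool:
    """True if every token of `romanized` occurs in `found`.

    Extra tokens in `found` (for example an English given name) are ignored.
    """
    return name_tokens(romanized) <= name_tokens(found)


if __name__ == "__main__":
    print(expand_surname("Da-Wei", "Chen"))
    print(may_be_same_person("Kwang-Ming Hwang", "Kwang-Ming Frank Hwang"))
```

A token-subset test is deliberately permissive: it tolerates additional words but can also accept unrelated people who share all tokens, so it would only serve as a candidate filter.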
5Ambiguous pages in the WWW
CMU Professor
Guitar Player and Singer
6Anchor text mapping
Search name: Bill Mark
A. Personal main page
B. NVIDIA web site page
7CRE: Why do we extract information from publication list web pages?
- A publication list page is an important resource for many value-added applications, such as citation analysis and academic network analysis.
- What can we get from publication list pages?
- Up-to-date literature before it is formally published
- Reference materials, such as slides and talks
8An automatic extractor
Figure labels: Web Page, Extract, Citation String, Structured Data
9Citation extraction
10Authorship Disambiguation
Prof. A
Prof. A
11Detect 3 relationships (COI)
Prof. A
Prof. B
Student C
12Detect 3 relationships cont.
Prof. A
Prof. S
13Detect 3 relationships cont.
Prof. E
Prof. A
Prof. W
14Mining a Chinese Person's Name from the English Translation
16Name Disambiguation
- Problem
- Given a set of citations with the same author name, how do we identify which citation belongs to which author?
- Goal
- To group the citations into clusters so that each cluster represents one author
17Procedure
18Procedure
- Use the classification result to group citations into several clusters
- Each cluster contains citations belonging to the same author
Grouping
If the SVM determines that two citations are authored by the same person, they are connected to each other
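The grouping step can be read as finding connected components in a graph whose edges are the pairs that the classifier labels "same author". The following is a minimal sketch of that reading using union-find. The `same_author` callable is a hypothetical stand-in for the trained SVM, and `group_citations` is an illustrative name, not one from the report.

```python
from itertools import combinations
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def group_citations(
    citations: Sequence[T],
    same_author: Callable[[T, T], bool],
) -> list[list[T]]:
    """Cluster citations by connecting every pair judged to share an author."""
    parent = list(range(len(citations)))

    def find(i: int) -> int:
        # Follow parent links to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Assumption: the pairwise classifier is run on every pair of citations.
    for i, j in combinations(range(len(citations)), 2):
        if same_author(citations[i], citations[j]):
            root_i, root_j = find(i), find(j)
            if root_i != root_j:
                parent[root_j] = root_i

    clusters: dict[int, list[T]] = {}
    for index, citation in enumerate(citations):
        clusters.setdefault(find(index), []).append(citation)
    return list(clusters.values())


if __name__ == "__main__":
    # Toy classifier: two citations match when their first letter is equal.
    print(group_citations(["a1", "a2", "b1"], lambda x, y: x[0] == y[0]))
```

Because connections are transitive, a single wrong "same author" decision can merge two authors' clusters, which is one reason the pairwise classifier's precision matters in this design.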
19Citation Correspondence
- Query construction
- A good query
- If the proper records are archived in digital libraries, a good query should retrieve them in the search result, and the proper records should be ranked highly.
- The search result should be small.
- Citation correspondence
- Find the proper records in the search result by matching the local citation string against the records in the search result.
- Field-by-field comparison.
- May not be enough due to errors in digital libraries (optional).
- Metrics: precision, recall, and F-measure.
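Field-by-field comparison and the three metrics can be sketched as follows. The field names (`title`, `year`), the normalization rule, and the function names are assumptions made for illustration; the report does not specify them. The metric definitions are the standard ones.

```python
def normalize(value: object) -> str:
    """Lowercase a field value and collapse runs of whitespace."""
    return " ".join(str(value).lower().split())


def fields_match(local: dict, record: dict, fields=("title", "year")) -> bool:
    """Compare only the fields present in both records; all must agree."""
    compared = [f for f in fields if f in local and f in record]
    return bool(compared) and all(
        normalize(local[f]) == normalize(record[f]) for f in compared
    )


def precision_recall_f1(predicted: set, relevant: set) -> tuple[float, float, float]:
    """Standard precision, recall, and F-measure (F1) over two sets of record ids."""
    true_positives = len(predicted & relevant)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    local = {"title": "Name  Disambiguation", "year": 2009}
    record = {"title": "name disambiguation", "year": "2009"}
    print(fields_match(local, record))
    print(precision_recall_f1({1, 2, 3, 4}, {1, 2, 5}))
```

Exact matching after normalization fails on records that contain typos or truncated fields, which is consistent with the slide's remark that field-by-field comparison may not be enough.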
20Partial Solution: Abbreviation Matching
Example: CIKM = Conference on Information and Knowledge Management
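One simple way to match an abbreviation such as CIKM to its full venue name is to test whether its letters appear, in order, among the initials of the words in the full name. This is a heuristic sketch and not necessarily the method used in the report; the function names are hypothetical.

```python
def initials(full_name: str) -> str:
    """Uppercase first letters of every word, with hyphens treated as spaces."""
    words = full_name.replace("-", " ").split()
    return "".join(word[0].upper() for word in words if word[0].isalnum())


def is_abbreviation_of(abbreviation: str, full_name: str) -> bool:
    """True if the abbreviation is an in-order subsequence of the word initials.

    Function words such as "on" and "and" contribute initials that the
    abbreviation is simply allowed to skip, so no stop-word list is needed.
    """
    if not abbreviation:
        return False
    remaining = iter(initials(full_name))
    # `letter in remaining` consumes the iterator, which enforces ordering.
    return all(letter in remaining for letter in abbreviation.upper())


if __name__ == "__main__":
    venue = "Conference on Information and Knowledge Management"
    print(is_abbreviation_of("CIKM", venue))
    print(is_abbreviation_of("SIGIR", venue))
```

The subsequence test accepts any abbreviation that skips words, so short abbreviations can match many venues; in practice it would likely be combined with other evidence such as year or title.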
21Reviewer Recommendation
22COI in Incomplete Collaboration Network via Social Interaction