
1
Departament de Llenguatges i Sistemes Informàtics
Universitat Politècnica de Catalunya
Automatic Assignment of Domain Labels to WordNet
Mauro Castillo V., Francis Real V., German Rigau C.
GWC 2004
2
Outline
  • Introduction
  • WordNet
  • WN Domains
  • Experimentation
  • Evaluation and results
  • Discussion
  • Conclusions

3
Introduction
  • To semantically enrich any WN version with the
    semantic domain labels of MultiWordNet Domains
  • WN is a standard resource for semantic
    processing
  • Effectiveness of Word Domain Disambiguation
  • The work presented explores the automatic and
    systematic assignment of domain labels to glosses
  • The proposed method can be used to correct and verify
    the suggested labeling

4
WordNet
  • The version WN1.6 was used because of the
    availability of WN Domains

5
WN Domains
TOP
WordNet Domain hierarchy developed at IRST
(Magnini and Cavaglià, 2000)
6
WN Domains
  • The synsets have been annotated semiautomatically
    with one or more labels
  • Most synsets have a single label

[Figure: distribution of domain labels per synset]
Average number of labels per synset: noun 1.170, verb 1.078, adj 1.076, adv 1.033
7
WN Domains
  • A domain may include synsets of different
    syntactic categories, e.g. MEDICINE
  • doctor#1 (n)
  • operate#7 (v)
  • medical#1 (a)
  • clinically#1 (r)
  • A domain label may also contain senses from
    different WN subhierarchies, e.g. SPORT
  • athlete#1 → life-form#1
  • game-equipment#1 → physical-object#1
  • sport#1 → act#2
  • playing-field#1 → location#1

8
WN Domains
  • Synsets that have more than one label do not
    seem to follow any pattern
  • sultana#n#1 (pale yellow seedless grape used for
    raisins and wine)
    Domains: Botany, Gastronomy
  • morocco#n#2 (a soft pebble-grained leather made
    from goatskin; used for shoes and book bindings
    etc.)
    Domains: Anatomy, Zoology
  • canicola_fever#n#1 (an acute feverish disease in
    people and in dogs marked by gastroenteritis and
    mild jaundice)
    Domains: Medicine, Physiology, Zoology
  • blue#n#1, blueness#n#1 (the color of the clear
    sky in the daytime; "he had eyes of bright blue")
    Domains: Color, Quality
9
WN Domains
  • FACTOTUM: used to mark the senses of WN that do
    not have a specific domain
  • STOP senses: synsets that appear frequently
    in different contexts, for instance numbers,
    colours, etc.

10
Experimentation
  • Process to automatically assign domain labels to
    WN1.6 glosses
  • Procedures to validate the consistency of the
    domain assignment in WN1.6 and, especially, the
    automatic assignment of the FACTOTUM label

[Table: distribution of synsets with and without the
domain label FACTOTUM in WN1.6]
11
Experimentation
The test set was randomly selected (around 1%) and
the remaining synsets were used as the training set
[Table: test corpus for nouns and verbs]
12
Experimentation
castle#n#4, castling#n#1  Domains: CHESS, SPORT
Gloss: castle, castling (interchanging the positions of
the king and a rook)
Word-domain pairs: castle chess, castle sport, castling chess,
castling sport, interchanging chess, interchanging sport,
interchanging chess, interchanging sport, interchanging chess,
interchanging sport, king chess, king sport, rook chess, rook sport
Calculation of frequency
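The pairing and counting step on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `word_domain_pairs` is hypothetical, the word list is copied from the slide (each word listed once), and the assumption is that every gloss word is paired with every domain label of the synset before frequencies are counted over the training set.

```python
from collections import Counter

def word_domain_pairs(gloss_words, domains):
    """Pair every gloss word with every domain label of the synset."""
    return [(word, domain) for word in gloss_words for domain in domains]

# Words and labels taken from the castle/castling example on the slide.
words = ["castle", "castling", "interchanging", "king", "rook"]
domains = ["chess", "sport"]

pairs = word_domain_pairs(words, domains)

# Frequencies that a weighting measure would later consume:
pair_counts = Counter(pairs)                   # c(w, D)
word_counts = Counter(w for w, _ in pairs)     # c(w)
domain_counts = Counter(d for _, d in pairs)   # c(D)
total_pairs = len(pairs)                       # N (assumed: total number of pairs)
```

In a full run the counters would be accumulated over every synset in the training set rather than over a single gloss.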
13
Experimentation
Measures
M1: Square root formula
M2: Association Ratio
$Ar(w, D) = \Pr(w \mid D) \, \log_2 \frac{\Pr(w \mid D)}{\Pr(w)}$
M3: Logarithm formula
$\log_2 \frac{N \, c(w, D)}{c(w) \, c(D)}$
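A minimal sketch of M2 and M3 computed from raw counts. The function names are hypothetical, and the probability estimates Pr(w|D) = c(w,D)/c(D) and Pr(w) = c(w)/N are assumptions (the slide does not state how the probabilities are estimated). Under those estimates M3 equals the logarithmic factor of M2. M1 is omitted because its formula is not given on the slide.

```python
import math

def association_ratio(c_wd, c_w, c_d, n):
    """M2: Pr(w|D) * log2(Pr(w|D) / Pr(w)), with maximum-likelihood estimates.

    c_wd: count of word w with domain D (must be > 0)
    c_w:  count of word w overall
    c_d:  count of domain D overall
    n:    total count (assumed normaliser for Pr(w))
    """
    p_w_given_d = c_wd / c_d
    p_w = c_w / n
    return p_w_given_d * math.log2(p_w_given_d / p_w)

def logarithm_formula(c_wd, c_w, c_d, n):
    """M3: log2(N * c(w,D) / (c(w) * c(D)))."""
    return math.log2(n * c_wd / (c_w * c_d))

# Toy counts, chosen only to make the arithmetic easy to follow.
m2 = association_ratio(c_wd=4, c_w=8, c_d=16, n=64)
m3 = logarithm_formula(c_wd=4, c_w=8, c_d=16, n=64)
```

Both functions require c(w,D) > 0, so unseen word-domain pairs need separate handling (for example, a weight of zero).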
14
Experimentation
[Diagram: training, matrix of weights, calculation, validation]
15
Experimentation
Position 1: person 30.23
Position 2: politics 13.40
Position 3: law 11.08
...
VD ? weight(wi, dj) percentage
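A ranked list like the one above could be produced along the following lines. This is only a sketch under stated assumptions: it assumes a domain's score is the sum of the weights of the gloss words for that domain, reported as a percentage of the total score. The function `rank_domains`, the aggregation rule, and the toy weights are all hypothetical.

```python
def rank_domains(gloss_words, weights):
    """Rank domains for a gloss by summed word-domain weights.

    weights: dict mapping word -> {domain: weight}, i.e. one row of the
    matrix of weights per word. Returns (domain, percentage) pairs,
    best first. Summing and normalising to 100 is an assumption.
    """
    totals = {}
    for word in gloss_words:
        for domain, weight in weights.get(word, {}).items():
            totals[domain] = totals.get(domain, 0.0) + weight
    grand_total = sum(totals.values())
    if grand_total == 0:
        return []
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return [(domain, 100.0 * score / grand_total) for domain, score in ranked]

# Toy weight matrix (invented numbers, for illustration only).
toy_weights = {
    "king": {"chess": 3.0, "history": 1.0},
    "rook": {"chess": 2.0, "zoology": 2.0},
}
ranking = rank_domains(["king", "rook", "positions"], toy_weights)
```

Words without an entry in the matrix, such as "positions" here, simply contribute nothing to any domain.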
16
Evaluation and Results: nouns
AP: accuracy of the first label
AT: accuracy over all labels
P: precision
R: recall
F1 = 2PR/(P+R)
MiA: measures the success of each formula (M1, M2 or M3) when the first proposed label is correct
MiD: measures the success of each formula (M1, M2 or M3) when the first proposed label is correct (or subsumed as a correct one in the domain hierarchy)
[Table: results for nouns with factotum (CF)]
[Table: results for nouns without factotum (SF)]
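The measures in the legend can be illustrated with a short sketch. It assumes counts are pooled over the whole test set (micro-averaging) and that AP is the fraction of synsets whose first proposed label appears among the gold labels. The slides do not spell out either choice, and the function name `evaluate` and the example data are hypothetical.

```python
def evaluate(items):
    """items: list of (proposed labels in rank order, set of gold labels)."""
    correct = proposed = gold = first_hits = 0
    for ranked, gold_set in items:
        correct += len(set(ranked) & gold_set)   # proposed labels that are right
        proposed += len(ranked)
        gold += len(gold_set)
        if ranked and ranked[0] in gold_set:
            first_hits += 1
    precision = correct / proposed if proposed else 0.0
    recall = correct / gold if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    first_label_accuracy = first_hits / len(items) if items else 0.0
    return {"AP": first_label_accuracy, "P": precision, "R": recall, "F1": f1}

# Two toy synsets: one fully covered by the proposals, one missed.
scores = evaluate([
    (["chess", "sport", "play"], {"chess", "sport"}),
    (["banking"], {"school"}),
])
```

Here 2 of the 4 proposed labels are correct (P = 0.5) and 2 of the 3 gold labels are recovered (R = 2/3).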
17
Evaluation and Results: verbs
AP: accuracy of the first label
AT: accuracy over all labels
P: precision
R: recall
F1 = 2PR/(P+R)
MiA: measures the success of each formula (M1, M2 or M3) when the first proposed label is correct
MiD: measures the success of each formula (M1, M2 or M3) when the first proposed label is correct (or subsumed as a correct one in the domain hierarchy)
[Table: results for verbs with factotum (CF)]
[Table: results for verbs without factotum (SF)]
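The difference between the MiA and MiD criteria can be sketched with a toy hierarchy. The parent links below are illustrative assumptions rather than an extract of the WN Domains hierarchy, and treating "subsumed" as "one label is an ancestor of the other" is also an assumption. All names are hypothetical.

```python
# Toy child -> parent links (assumed for illustration).
PARENT = {
    "school": "pedagogy",
    "university": "pedagogy",
    "banking": "economy",
}

def ancestors(label):
    """Return all ancestors of a label, nearest first."""
    found = []
    while label in PARENT:
        label = PARENT[label]
        found.append(label)
    return found

def correct_exact(proposed, gold_labels):
    """MiA-style check: the proposed label must be one of the gold labels."""
    return proposed in gold_labels

def correct_in_hierarchy(proposed, gold_labels):
    """MiD-style check: exact match, or related by ancestry to a gold label."""
    return any(
        proposed == g or g in ancestors(proposed) or proposed in ancestors(g)
        for g in gold_labels
    )

exact = correct_exact("school", {"pedagogy"})
relaxed = correct_in_hierarchy("school", {"pedagogy"})
unrelated = correct_in_hierarchy("banking", {"school"})
```

With these links, proposing "school" for a synset labelled "pedagogy" fails the exact check but passes the hierarchy-aware one.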
18
Evaluation y Results
  • On average, the method assigns
  • Nouns: 1.23 domain labels (vs. 1.170 in WN Domains)
  • Verbs: 1.20 domain labels (vs. 1.078 in WN Domains)
  • We obtain better results with nouns
  • The best average results were obtained with the
    M1 measure
  • The first proposed label (nouns) reaches 70% accuracy
  • The results for verbs are worse than for nouns; one
    reason may be the high number of verbal
    synsets labeled with the FACTOTUM domain

19
Discussion
Monosemic words
credit_application#n#1 (an application for a line
of credit)
Domains: SCHOOL
Proposal 1. Banking
Proposal 2. Economy
[Hierarchy fragment: economy, banking]
20
Discussion
Relation between labels
academic_program#n#1 (a program of education in
liberal arts and sciences (usually in preparation
for higher education))
Domains: PEDAGOGY
Proposal 1. School
Proposal 2. University
[Hierarchy fragment: pedagogy, school, university]
21
Discussion
Relation between labels
shopping#n#1 (searching for or buying goods or
services; "went shopping for a reliable plumber";
"does her shopping at the mall rather than down
town")
Domains: ECONOMY
Proposal 1. Commerce
[Hierarchy fragment: social_science, commerce, economy]
22
Discussion
Relation between labels
fire_control_radar#n#1 (radar that controls the
delivery of fire on a military target)
Domains: MERCHANT_NAVY
Proposal 1. Military
[Hierarchy fragment: social_science, transport, military, merchant_navy]
23
Discussion
Uncertain cases
birthmark#n#1 (a blemish on the skin formed
before birth)
Domains: QUALITY
Proposal 1. Medicine
bardolatry#n#1 (idolization of William
Shakespeare)
Domains: RELIGION
Proposal 1. History
Proposal 2. Literature
24
Conclusions
  • Automatically assigning domain labels to WN
    glosses seems to be a difficult task
  • The proposed process is very reliable for the
    first proposed labels
  • The proposed labels are ordered by priority
  • It is possible to add new correct labels or
    validate the old ones
