Building a Primitive-based Lexical Consultation System

1
Building a Primitive-based Lexical Consultation
System
  • Prepared by: Lim Beng Tat
  • Supervisor: Dr. Tang Enya Kong
  • Dr. Guo Cheng Ming

2
Abstract: This research describes the design of a
semantic-primitive-based lexical consultation
system and the processes that can be performed on
a machine-readable dictionary (MRD) and a corpus
to produce a machine-tractable dictionary (MTD)
and a tractable corpus automatically. Linguistic
tools, such as a sense tagger, and resources are
created during or after these processes. The
research also shows how to apply an unsupervised
word sense disambiguation method to samples of
unrestricted text from various prospective
application areas, using the newly constructed
MTD. This matters for applications that need
lexical semantics, such as machine translation,
information retrieval, hypertext navigation,
content and thematic analysis, grammatical
analysis, speech processing and text processing.
3
Outline
  • Introduction
  • Problem
  • Objective
  • Lexical Consultation System
  • System design and architecture
  • Example applications
  • Bilingual Knowledge Bank

4
Introduction
  • Dictionaries
  • Supply knowledge (language and world)
  • E.g. Collins English Dictionary (CED), Longman
    Dictionary of Contemporary English (LDOCE) and
    Webster's Ninth New Collegiate Dictionary (W9)

5
Introduction (Cont)
  • Explicit information (POS)
  • Implicit information / semantic information
  • Hypernym/hyponym relations (class/subclass)
  • Synonymy/Antonymy relations
  • Meronym/Holonym relation (part/whole, ...)
  • Collocational relations (compounds, idioms, ...),
    etc.

6
Introduction (Cont)
  • Problem: how to extract semantic information from
    a dictionary?
  • 2 methods
  • Defining patterns
  • Identify significant recurring phrases
  • E.g. "a member of" + NP
  • hand: a member of a ship's crew (W9)
  • Extraction of semantic hierarchy
  • Extraction of hyponyms
  • E.g. dipper: a ladle used for dipping... (CED)
  • ladle: a long-handled spoon... (CED)
  • spoon: a metal, wooden, or plastic utensil...
    (CED)
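
The two extraction methods on this slide can be sketched in a few lines of Python. This is only an illustration, not the system described in the slides: the regular expression, the toy noun list and the function names (`member_of`, `genus_term`, `hypernym_chain`) are assumptions made for this example, and the definitions are the truncated ones quoted above.

```python
import re

# Truncated definitions quoted on the slide (CED and W9).
DEFINITIONS = {
    "hand": "a member of a ship's crew",
    "dipper": "a ladle used for dipping",
    "ladle": "a long-handled spoon",
    "spoon": "a metal, wooden, or plastic utensil",
}

# Assumed defining pattern: "a member of <NP>".
MEMBER_OF = re.compile(r"^an? member of (?P<np>.+)$")

# Toy list of head nouns; a real system would use a parser or POS tagger.
KNOWN_NOUNS = {"ladle", "spoon", "utensil"}


def member_of(definition):
    """Return the NP of an 'a member of NP' definition, or None."""
    match = MEMBER_OF.match(definition)
    return match.group("np") if match else None


def genus_term(definition):
    """Return the first known noun in the definition (assumed hypernym)."""
    for word in re.findall(r"[a-z]+", definition.lower()):
        if word in KNOWN_NOUNS:
            return word
    return None


def hypernym_chain(word):
    """Follow genus terms upward until no definition is available."""
    chain = [word]
    while chain[-1] in DEFINITIONS:
        parent = genus_term(DEFINITIONS[chain[-1]])
        if parent is None or parent in chain:  # stop on gaps and cycles
            break
        chain.append(parent)
    return chain


print(member_of(DEFINITIONS["hand"]))   # a ship's crew
print(hypernym_chain("dipper"))         # ['dipper', 'ladle', 'spoon', 'utensil']
```

The chain stops at utensil only because the toy data has no entry for it; with a full dictionary the walk would continue until it hits a gap or a circular definition.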

7
Introduction (Cont)
  • Disadvantages
  • Circularity
  • E.g. tool: an implement, such as a hammer... (CED)
  • implement: a piece of equipment; tool or utensil.
    (CED)
  • utensil: an implement, tool or container... (CED)
  • Inconsistency in dictionaries
  • E.g. corkscrew: a pointed spiral piece of
    metal... (W9)
  • dinner service: a complete set of plates and
    dishes... (LDOCE)
  • Dictionaries are written for human use
  • Other methods
  • Semantic primitives and word sense disambiguation

8
Semantic Primitive
  • A semantic primitive refers to a core meaning that
    cannot be further analyzed
  • E.g. bachelor and red
  • bachelor means that someone is a man who is
    not married
  • What does red mean?
  • red represents a semantic primitive (a basic
    meaning), while bachelor does not.

9
Semantic Primitive (Cont)
  • 2 types of semantic primitives
  • Prescriptive and descriptive
  • Prescriptive semantic primitives
  • A set of pre-defined primitives
  • E.g. father marry couple
  • marry: human, human
  • father: human
  • couple: human, thing
  • Used to choose the correct sense of couple
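
One reading of the father / marry / couple example is that marry expects two human arguments, so the human sense of couple is preferred over the thing sense. The sketch below follows that reading only. The sense labels (`father1`, `couple1`, `couple2`), the table layout and the function name `pick_senses` are hypothetical and do not come from the slides.

```python
# Hypothetical prescriptive lexicon: each noun sense carries one primitive,
# and each verb lists the primitives it expects for (subject, object).
NOUN_SENSES = {
    "father": {"father1": "human"},
    "couple": {"couple1": "human", "couple2": "thing"},
}
VERB_FRAMES = {"marry": ("human", "human")}


def pick_senses(subject, verb, obj):
    """Return the (subject_sense, object_sense) pairs allowed by the verb frame."""
    want_subject, want_object = VERB_FRAMES[verb]
    return [
        (s_sense, o_sense)
        for s_sense, s_prim in NOUN_SENSES[subject].items()
        if s_prim == want_subject
        for o_sense, o_prim in NOUN_SENSES[obj].items()
        if o_prim == want_object
    ]


print(pick_senses("father", "marry", "couple"))  # [('father1', 'couple1')]
```

The thing sense of couple is filtered out because it does not match the primitive that marry expects for its object.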

10
Semantic Primitive (Cont)
  • Prescriptive semantic primitives
  • Problem: the set always needs to be extended
  • Descriptive semantic primitives
  • A set of semantic primitives derived from a
    natural source of data, such as a dictionary.
  • E.g.

father5 - a term5 of address for priest2 in some
  church especially roman7 or orthodox3 catholic
marry3 - perform1 a marriage4 ceremony
couple1 - a pair5 of people5 who live7 together2
Uniquely identifies each definition of an entry
Avoids circularity
11
Word Sense Disambiguation (WSD)
  • Documents are collections of sentences containing
    words
  • Some words have more than one meaning. These
    meanings are often called word senses.
  • Goal
  • Assign meanings to words in some context
    according to some lexical resource.

12
Objective
  • Produce a Machine-Tractable Dictionary (MTD) from
    a Machine-Readable Dictionary (MRD) using
    descriptive semantic primitives and WSD
  • Produce a tractable database/corpus from an
    existing database/corpus

13
Linguistic Resources
  • Machine-Tractable dictionary
  • Encoded with information extracted from MRD
  • Usable format and highly structured semantic
    information for NLP tasks

  • Descriptive semantic primitives
  • Determining the relatedness or closeness among
    word senses in a dictionary
14
Lexical Consultation System
  • Semantic Primitive Extractor
  • LCDD Generator
  • WSD

15
Semantic Primitive Extractor
  • Search for self-reference circles in definitions
  • For example,

sense_1 def sense_2 sense_5 sense_6
sense_2 def sense_3 sense_2
sense_3 def sense_1 sense_2
sense_4 def sense_5
sense_5 def sense_2 sense_4
sense_6 def sense_5 sense_4
=> sense_1 is a semantic primitive
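
A self-reference circle can be found with a depth-first walk over the definitions, as in the minimal sketch below. It uses the six toy senses from this slide; the function name `find_self_reference_circle` is an assumption. In this toy graph every sense happens to lie on some circle, and the slides do not say how the extractor decides which senses are kept as primitives, so the sketch only reports whether a circle exists for a given starting sense.

```python
# Definitions from the slide: each sense maps to the senses used to define it.
DEFS = {
    "sense_1": ["sense_2", "sense_5", "sense_6"],
    "sense_2": ["sense_3", "sense_2"],
    "sense_3": ["sense_1", "sense_2"],
    "sense_4": ["sense_5"],
    "sense_5": ["sense_2", "sense_4"],
    "sense_6": ["sense_5", "sense_4"],
}


def find_self_reference_circle(start, defs):
    """Depth-first search for a chain of definitions leading back to `start`.

    Returns the circle as a list of senses (start first, start last),
    or None if the definitions never lead back to `start`.
    """
    stack = [(start, [start])]
    visited = set()
    while stack:
        sense, path = stack.pop()
        for used in defs.get(sense, []):
            if used == start:
                return path + [start]
            if used not in visited:
                visited.add(used)
                stack.append((used, path + [used]))
    return None


print(find_self_reference_circle("sense_1", DEFS))
# ['sense_1', 'sense_2', 'sense_3', 'sense_1']
```

Here sense_1 is defined with sense_2, sense_2 with sense_3, and sense_3 with sense_1 again, which closes the circle.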
16
Semantic Primitive Extractor (cont)
  • Step 1 Expanding dictionary

Before expansion:
abandon 1: a feeling of extreme emotional intensity
abandon 2: leave behind
. .
betray 2: abandon
After expansion:
abandon 1: a feeling of extreme emotional intensity
abandon 2: leave behind
. .
betray 2: abandon1 abandon2
Semantic Primitive Extractor (cont)
  • Step 2: identify semantic primitives using
    self-reference circles
  • Example:
  • Extract primitives from the pre-release WordNet
    used during SENSEVAL-2.
  • Pre-release WordNet 1.7: 192,460 entries
  • Extracted primitives: 9,368 entries (around 5% of
    the pre-release WordNet 1.7 entries)

18
LCDD generator
  • Identify the definition layers of each word sense
  • First layer for forecast2 and fixed6
  • Second layer for forecast2 and fixed6

forecast2: predict1 in advance3
fixed6: specify1 in advance3
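
One way to read "definition layers" is that layer 1 holds the senses used in a sense's own definition and layer 2 holds the senses used in the definitions of those layer-1 senses. The sketch below follows that reading. Only the forecast2 and fixed6 definitions come from the slide; the other entries are invented placeholders, and the function name `definition_layers` is an assumption.

```python
# Sense-tagged definitions. The first two come from the slide; the others
# are invented placeholders so that a second layer exists in this example.
DEFS = {
    "forecast2": ["predict1", "advance3"],
    "fixed6": ["specify1", "advance3"],
    "predict1": ["tell2", "future1"],    # hypothetical
    "specify1": ["state4", "detail1"],   # hypothetical
    "advance3": ["before2", "time1"],    # hypothetical
}


def definition_layers(sense, defs, depth=2):
    """Return [layer1, layer2, ...]; layer k holds the senses reached after k lookups."""
    layers = []
    frontier = [sense]
    for _ in range(depth):
        next_layer = []
        for current in frontier:
            for used in defs.get(current, []):
                if used not in next_layer:
                    next_layer.append(used)
        layers.append(next_layer)
        frontier = next_layer
    return layers


forecast_layers = definition_layers("forecast2", DEFS)
fixed_layers = definition_layers("fixed6", DEFS)
print(forecast_layers[0])                          # ['predict1', 'advance3']
print(set(forecast_layers[0]) & set(fixed_layers[0]))  # {'advance3'}
```

The shared layer-1 sense advance3 is the kind of overlap a relatedness measure could count; how the layers are actually compared is not spelled out in the slides.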
19
LCDD generator (Cont)
LCDD(forecast2, fixed6) = a × 70% + (b + c + d)/3 × 30%
Depth-First Method
[Figure: Layer 1 and Layer 2 for forecast2 and for fixed6, with the links between the layers labelled a, b, c and d]
20
WSD
  • Simple Summation Algorithm
  • For example, assume a sentence containing the
    words father, marry and couple, where each word
    has only two senses. There are 2 × 2 × 2 = 8
    candidate sense combinations:
  • father1 marry1 couple1
  • father1 marry1 couple2
  • father1 marry2 couple1
  • father1 marry2 couple2
  • father2 marry1 couple1
  • father2 marry1 couple2
  • father2 marry2 couple1
  • father2 marry2 couple2
  • Dynamic programming techniques
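
The slides do not define the Simple Summation Algorithm in detail. The sketch below assumes one plausible reading: score each of the eight sense combinations by summing a relatedness value over every pair of senses, then keep the highest-scoring combination. The relatedness numbers and the function names `score` and `disambiguate` are invented for illustration.

```python
from itertools import combinations, product

SENSES = {
    "father": ["father1", "father2"],
    "marry": ["marry1", "marry2"],
    "couple": ["couple1", "couple2"],
}

# Hypothetical pairwise relatedness scores; unlisted pairs score 0.
RELATEDNESS = {
    frozenset(["father2", "marry2"]): 0.6,
    frozenset(["marry2", "couple1"]): 0.8,
    frozenset(["father2", "couple1"]): 0.3,
    frozenset(["father1", "marry1"]): 0.2,
}


def score(assignment):
    """Sum the relatedness of every pair of senses in the assignment."""
    return sum(
        RELATEDNESS.get(frozenset(pair), 0.0)
        for pair in combinations(assignment, 2)
    )


def disambiguate(words):
    """Enumerate all sense combinations and return the best one and their count."""
    candidates = list(product(*(SENSES[w] for w in words)))
    return max(candidates, key=score), len(candidates)


best, total = disambiguate(["father", "marry", "couple"])
print(total)  # 8
print(best)   # ('father2', 'marry2', 'couple1')
```

Exhaustive enumeration grows exponentially with sentence length (two senses per word gives 2^n combinations), which is presumably why the slide lists dynamic programming techniques as the next step.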

21
System Design
General Dictionary (MTD)
Lexical Consultation System

Domain MTD for WSD
Domain MRD
22
System Architecture
Bilingual Knowledge Bank (BKB)
Papillon Dictionaries or FEM
23
Tractable Bilingual Knowledge Bank (BKB)
[Figure: English (E) and Malay (M) structures with sense-tagged nodes and position intervals, set side by side. Node labels recoverable from the figure: pick(1)v up(1)p, kutip(1)v, kutip(2)v, he(1)n, dia(1)n, ball(1)n, bola(1)n, man(4)n, lelaki(3)n, old(3)adj, tua(2)adj, the(1)det, the(2)det, itu(1)det.]
Example sentence pairs:
dia kutip bola itu
he pick the ball up
0 lelaki 1 tua 2 itu 3 kutip 4 bola 5 itu 6
0 the 1 old 2 man 3 pick 4 the 5 ball 6 up 7
24
  • Thank you
  • Please send any comments to btlim_at_cs.usm.my

25
Semantic Primitive Extractor (cont)
  • Step 2: compute the frequency of each sense entry
    in the dictionary according to its appearances in
    definition text.
  • Sort the list by frequency
  • An entry with high frequency =>
  • high probability that the entry is a primitive
  • Problems
  • Empty definition
  • Possibility of selecting wrong semantic
    primitives based on the self-reference method

26
WSD (Cont)
  • Improving the quality of a number of Natural
    Language Processing Tasks
  • Machine Translation
  • Information Extraction
  • Internet Search Engines

27
WSD (Cont)
[Figure text: previous path value difference between the two consecutive paths]