Semantic Annotation and Search of Software Artefacts

About This Presentation

Slides: 25

Provided by: carbonVide1

Transcript and Presenter's Notes

1
Semantic Annotation and Search of Software Artefacts

2
Some Terminology

Ontology population: given an ontology, populate it with instances
derived automatically from a text.
Annotation (of text): associating labels with text snippets from a
larger document. Can be linguistic, semantic, etc.
Semantic annotation: the labels used in annotation are associated
with an ontology. Can also include ontology population as a side
effect.
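
The terms above can be illustrated with a minimal sketch. The Annotation class, the example sentence and the example.org URI are all hypothetical, chosen only for illustration; this is not GATE's data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Annotation:
    """A label attached to a text snippet (hypothetical structure)."""
    start: int                          # offset where the snippet begins
    end: int                            # offset just past the snippet
    label: str                          # label attached to the snippet
    ontology_uri: Optional[str] = None  # set only for semantic annotations

    def text(self, document: str) -> str:
        return document[self.start:self.end]


document = "GATE is developed in Sheffield."

# Linguistic annotation: the label is not tied to any ontology.
linguistic = Annotation(0, 4, "ProperNoun")

# Semantic annotation: the label refers to an ontology class (assumed URI).
start = document.index("Sheffield")
semantic = Annotation(start, start + len("Sheffield"), "City",
                      "http://example.org/onto#City")

# Ontology population as a side effect: record the snippet as an instance
# of the class it was annotated with.
instances = {}
instances.setdefault(semantic.ontology_uri, set()).add(semantic.text(document))
```

Here the only difference between the two annotations is the link to an ontology class, which is what allows the annotated snippet to be added to the ontology as an instance.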

3
Annotation
4
Semantic Annotation
5
Case Study: Software Artefacts

6
Case Study: Software Artefacts

7
The GATE Ontology
8
Ontology Population: Structured Data
9
GATE Ontology - populated
10
Ontology Population: Unstructured Data

11
Ontology Population: Unstructured Data
12
Information Access: Conceptual Retrieval

Can make use of abstractions and generalisations powered by the
ontology back-end.
Provides retrieval options not available to full-text search, e.g.
"Capitals of countries in Asia".
The query language is very complex, somewhat similar to SQL, and not
really suitable for end users.
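
As a rough illustration of why such a question is out of reach for full-text search, the sketch below answers it over a toy set of triples. The triples, the property names capitalOf and locatedIn, and the helper functions are assumptions made for this example; a real system would run the query against an ontology store.

```python
# Hypothetical toy knowledge base of (subject, property, object) triples.
TRIPLES = {
    ("Tokyo", "capitalOf", "Japan"),
    ("Paris", "capitalOf", "France"),
    ("Hanoi", "capitalOf", "Vietnam"),
    ("Japan", "locatedIn", "Asia"),
    ("Vietnam", "locatedIn", "Asia"),
    ("France", "locatedIn", "Europe"),
}


def subjects(prop, obj):
    """All subjects s such that (s, prop, obj) is in the knowledge base."""
    return {s for (s, p, o) in TRIPLES if p == prop and o == obj}


def capitals_in(continent):
    """Conceptual retrieval: follow locatedIn, then capitalOf."""
    countries = subjects("locatedIn", continent)
    return sorted(city for country in countries
                  for city in subjects("capitalOf", country))


def full_text_search(documents, phrase):
    """Plain substring search, for comparison."""
    return [d for d in documents if phrase.lower() in d.lower()]


documents = ["Tokyo is the capital of Japan.", "Japan is in Asia."]
```

With this data, capitals_in("Asia") finds Hanoi and Tokyo by joining two relations, while the substring search returns nothing because no single document contains the phrase.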

13
Capitals of countries in Asia (simplified SeRQL)

14
QuestIO: Question-based Interface to Ontologies

15
QuestIO Initialisation

Vocabulary built automatically from the KB (hence
domain independent).
Extract all possible textual descriptions from
the ontology.
Normalise for morphology, lack of tokenisation,
CamelCasing, etc.
Store all lexicalisations in a GATE gazetteer (long init time, fast
run time).
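
The initialisation steps above can be sketched as follows. This is a simplified stand-in, not QuestIO's code: the gazetteer is a plain dictionary rather than a GATE gazetteer, the stem function is a crude plural stripper standing in for morphological analysis, and the ex: resource names are hypothetical.

```python
import re


def stem(token):
    """Crude plural stripping, a stand-in for real morphological analysis."""
    if token.endswith("ies") and len(token) > 4:
        return token[:-3] + "y"
    if token.endswith("s") and not token.endswith("ss") and len(token) > 3:
        return token[:-1]
    return token


def normalise(label):
    """Split CamelCase and underscores, lowercase, and stem each token."""
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", label)
    spaced = re.sub(r"[_\-]+", " ", spaced)
    return " ".join(stem(t) for t in spaced.lower().split())


def build_gazetteer(resources):
    """Map every normalised textual description to the resources it names."""
    gazetteer = {}
    for uri, descriptions in resources.items():
        for text in descriptions:
            gazetteer.setdefault(normalise(text), set()).add(uri)
    return gazetteer


# Hypothetical ontology resources with their textual descriptions.
resources = {
    "ex:CapitalCity": ["CapitalCity", "capital"],
    "ex:Country": ["Country"],
    "ex:Asia": ["Asia"],
}
gazetteer = build_gazetteer(resources)  # built once, at start-up


def lookup(phrase):
    """Look up a query fragment; a single dictionary access at run time."""
    return gazetteer.get(normalise(phrase), set())
```

Building the dictionary touches every description in the ontology, which mirrors the long initialisation time, while each lookup is a single hash access.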

16
Query Construction

[Figure: the query "Capitals of countries located in Asia" annotated
with the ontology labels Capital City, Country, Continent,
Continent_T4 and Asia]
17
Query Construction (II)

Build a SeRQL query by finding appropriate
properties to link the concepts found.
Build a list of candidate properties based on
ontology schema (using domain and range
constraints).
Rank the properties.
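
The candidate-building step can be sketched as below, assuming a toy schema. The class hierarchy, the property names and their domain and range declarations are hypothetical; the sketch only shows how domain and range constraints narrow down which properties could link two concepts.

```python
# Hypothetical schema: class hierarchy and property (domain, range) pairs.
SUBCLASS_OF = {
    "CapitalCity": "City",
    "City": "Location",
    "Country": "Location",
    "Continent": "Location",
}
PROPERTIES = {
    "hasCapital": ("Country", "City"),
    "locatedIn": ("Location", "Location"),
    "hasPopulation": ("Location", "Number"),
}


def ancestors(cls):
    """The class itself followed by all of its superclasses."""
    chain = [cls]
    while chain[-1] in SUBCLASS_OF:
        chain.append(SUBCLASS_OF[chain[-1]])
    return chain


def compatible(cls, constraint):
    """True if cls is the constraint class or one of its subclasses."""
    return constraint in ancestors(cls)


def candidate_properties(left, right):
    """Properties that can link the two classes, in either direction."""
    found = []
    for name, (domain, rng) in sorted(PROPERTIES.items()):
        if compatible(left, domain) and compatible(right, rng):
            found.append((name, left, right))
        if compatible(right, domain) and compatible(left, rng):
            found.append((name, right, left))
    return found
```

For the concepts CapitalCity and Country this yields hasCapital in one direction and locatedIn in both, which is the list that the ranking step then has to order.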

18
Ranking Properties

We combine three types of scores:
Similarity score: compares query fragments with candidate property
names using the Levenshtein string similarity metric.
Specificity score: based on the subproperty relation in the ontology
definition.
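
A minimal sketch of the similarity score follows. The edit distance is the standard Levenshtein algorithm; scaling it into the range 0 to 1 by the longer string's length, and lowercasing both strings first, are assumptions of this example rather than details given in the slides.

```python
def levenshtein(a, b):
    """Number of single-character edits needed to turn a into b."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(fragment, property_name):
    """Similarity in [0, 1]; 1.0 means identical (assumed normalisation)."""
    a, b = fragment.lower(), property_name.lower()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest
```

Under these assumptions the fragment "located in" scores much higher against a property named locatedIn than against one named hasCapital.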

19
Ranking Properties (II)

Distance score: infers an implicit specificity of a property from the
level of the classes that are used as its domain and range.
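
One possible reading of the specificity and distance scores is sketched below. The slides do not give the formulas, so everything here is an assumption: depth in the subproperty hierarchy stands in for specificity, average class depth of domain and range stands in for distance, both are scaled to the range 0 to 1, and the three scores are simply summed. The schema is hypothetical.

```python
# Hypothetical schema used only for this sketch.
SUBCLASS_OF = {"City": "Location", "Country": "Location", "Capital": "City"}
SUBPROPERTY_OF = {"hasCapital": "hasCity", "hasCity": "relatedTo"}
DOMAIN_RANGE = {
    "relatedTo": ("Location", "Location"),
    "hasCity": ("Country", "City"),
    "hasCapital": ("Country", "Capital"),
}


def depth(name, parent_of):
    """Number of steps from name up to the root of its hierarchy."""
    steps = 0
    while name in parent_of:
        name = parent_of[name]
        steps += 1
    return steps


def specificity_score(prop):
    """Deeper in the subproperty hierarchy means more specific."""
    deepest = max(depth(p, SUBPROPERTY_OF) for p in DOMAIN_RANGE)
    return depth(prop, SUBPROPERTY_OF) / deepest if deepest else 0.0


def distance_score(prop):
    """Implicit specificity from the depth of the domain and range classes."""
    classes = set(SUBCLASS_OF) | set(SUBCLASS_OF.values())
    deepest = max(depth(c, SUBCLASS_OF) for c in classes)
    domain, rng = DOMAIN_RANGE[prop]
    if deepest == 0:
        return 0.0
    return (depth(domain, SUBCLASS_OF) + depth(rng, SUBCLASS_OF)) / (2 * deepest)


def rank(similarities):
    """Order properties by the sum of the three scores (assumed combination)."""
    total = {p: s + specificity_score(p) + distance_score(p)
             for p, s in similarities.items()}
    return sorted(total, key=total.get, reverse=True)
```

With equal similarity scores, the most specific property (hasCapital in this toy schema) is ranked first and the most general one last.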

20
Query Execution

21
Evaluation: coverage and correctness

36 questions extracted from GATE list
22 out of 36 questions were answerable (the
answer was in the knowledge base)
12 correctly answered (54.5%)
6 with a partially correct answer (27.3%)
system failed to create a SeRQL query, or created
a wrong one, for 4 questions (18.2%)
Total score
68% correctly answered
32% did not answer at all or did not answer
correctly
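
The percentages can be checked with a few lines of arithmetic. The counts come from the slide; the half credit for partially correct answers is an assumption, chosen because it is one weighting that reproduces the reported 68% total.

```python
answerable = 22
correct, partial, failed = 12, 6, 4


def pct(count, total):
    """Percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


shares = {
    "correct": pct(correct, answerable),
    "partial": pct(partial, answerable),
    "failed": pct(failed, answerable),
}

# Assumption: a partially correct answer counts as half a correct one.
total_score = round(100 * (correct + 0.5 * partial) / answerable)
remainder = 100 - total_score
```

Under that assumption the total is (12 + 3) / 22, which rounds to 68%, leaving 32% for questions that were not answered or not answered correctly.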

22
Evaluation on scalability and portability