Extending databases to precision-controlled retrieval of qualitative information

1 / 19
About This Presentation
Title:

Extending databases to precision-controlled retrieval of qualitative information

Description:

Extending databases to precision-controlled retrieval of qualitative information Victor Polo de Gyves,1 Adolfo Guzman1,2, and Serguei Levachkine2 – PowerPoint PPT presentation

Number of Views:3
Avg rating:3.0/5.0
Slides: 20
Provided by: ipn99

less

Transcript and Presenter's Notes

Title: Extending databases to precision-controlled retrieval of qualitative information


1
Extending databases to precision-controlled
retrieval of qualitative information
  • Victor Polo de Gyves,1 Adolfo Guzman1,2, and
    Serguei Levachkine2
  • (1) SoftwarePro International,
  • (2) Centro de Investigación en Computación,
  • a.guzman_at_acm.org

2
Qualitative variable
  • A variable able to take a symbolic value
  • A symbolic value is a set if can be considered E
    its name or description
  • Palewhite, yellow, orange, beige

3
Hierarchy
  • For an element set E, a hierarchy H is another
    set of elements where any element ei is a
    symbolic value representing
  • An element of E or
  • A Partition. Partition K is a partition of set S
    if it is both a covering for S and an exclusive
    set. The members of K are mutually exclusive and
    collectively exhaust S. Each element of S is in
    exactly one Kj.
  • And the union of every elements by represented by
    ei is E. Example

4
Confusion
  • If I ask for an animal and a snake is given, Is
    there a mistake?
  • I ask for a plant and an animal is given, which
    is the error value? can this error be measured?
  • Measuring the error in using a qualitative value
    r instead of another qualitative value s is
    defined as follows1
  • conf (r, r) conf (r, parent of r) 0
  • conf (r, s) 1 conf (r, parent of s))
  • The confusion in using r instead of s (the
    desired value) is the number of descending links
    from r ? s.
  • Conf(r,s) is not a distance, nor ultradistance
    and it is not symmetric.

1 A. Guzman-Arenas and S. Levachkine. Hierarchies
Measuring Qualitative Variables. Lecture Notes
in Computer Science LNCS 2945 (Computational
Linguistics and Intelligent Text Processing),
(Springer-Verlag 2004). 262-274. ISSN 0372-9743.
5
live_being
animal
plant
mammal
bird
snake
citric
pine
cat
dog
lemon
orange
conf (cat, mammal) 0 (If I am using a cat
instead of a mammal) conf (cat, animal) 0
conf (mammal, cat) 1 conf (cat,
dog) 1 conf (cat, bird) 1 conf (cat, lemon)
3
6
Predicates with controlled confusion
  • P is true? for x if1
  • The object x satisfies predicate P with confusion
    ? if and only if
  • P is true for x when P does not contains
    hierarchical variables
  • When pr is a hierarchical variable and P is of
    the form (pr c), if and only if for the value v
    of the property pr for the object x, v? c (if
    the value v can be used instead of c with
    confusion ?)
  • When P P1? P2, if and only if P1 is satisfied?
    by x, or P2 is satisfied? by x.
  • When P P1 ? P2, if and only if P1 is satisfied?
    by x and P2 is satisfied? by x.
  • When P P1, if and only if P1 is not satisfied?
    by x.

1A. Guzman-Arenas and S. Levachkine. Graduated
errors in approximate queries using hierarchies
and ordered sets. Lecture Notes in Artificial
Intelligence LNAI 2972, (Springer-Verlag 2004).
139-148. ISSN 0302-9743
7
(Ann (lives_in USA) (pet snake)), (Bill (lives_in
English_Speaking_Island) (pet citric)), (Fred
(lives_in USA) (pet cat)), (Tom (lives_in
Mexico)(pet cat)), (Sam (lives_in Cuba) (pet
pine)), (Pedro (lives_in Mexico)(pet dog)).
Predicate P (lives_inUSA) ?(petcat) P is
true? for E0 Fred E1 Fred, Tom , Pedro E2
Fred, Tom, Pedro, Ann E3 Fred, Tom, Pedro, Ann,
Sam, Bill
H1
North_America
Caribbean_Islands
Central_America
live_being
Canada
USA
Mexico
Guatemala
Costa_Rica
animal
plant
Honduras
English_speaking_islands
Spanish_speaking_islands
mammal
bird
snake
citric
pine
Cuba
Puerto Rico
Jamaica
cat
dog
lemon
orange
8
Implementing conf(r, s) for databases
  • Looking for the author of an assasination
  • Witness 1 Witness 2 Witness 3
  • - Red car Elegant car - Small car
  • - Tall normal height
  • - Scar in nose - Scar in the face
  • Confusion can sort a database with suspects
    first, the most probably ones and at the end the
    least probably authors of the murdering.
  • An enterprise manager wishes to make a special
    discount to their customers interested in office
    products.
  • Given a document d on a digital library, sort
    any documents in function of the subject
    similarity with d. The most similar documents
    will be closer.
  • Confusion delimitates objects from a database in
    agreement with the closeness or similarity of
    their properties.

9
Example 1. (address california)1
confusion_at_/tesis/jdbc/jdbc/jars java
-classpath CLASSPATHpostgresql.jar Conexion
"select customers.name, customers.address,
conf(customers.address)from customers where
conf(customers.address,'california')lt1 order by
conf(customers.address)" ------------- SQL
traducido select customers.name,
customers.address, confusion.customers_address_nor
m."california"from customers_address_norm,
customers where ( confusion.customers_address_norm
.nombre customers.address AND
confusion.customers_address_norm."california"lt1
) order by confusion.customers_address_norm."calif
ornia" ------------- NAME ADDRESS
CALIFORNIA Tom's Hamburgers pasadena
0 Microsol silicon valley 0 East coast
meat florida 1 Media Tools new york
1 Texas fruits texas
1 confusion_at_/tesis/jdbc/jdbc/jars
Extended SQL (user)
Pure SQL (machine)
Result
10
Example 2. Update (industrial branch, food)0
confusion_at_/tesis/jdbc/jdbc/jars java
-classpath CLASSPATHpostgresql.jar Conexion
"update customers set discount0.07 where
customers.name in conf(customers.industrial_branch
,'food')lt0" ------------- SQL traducido
update customers set discount0.07 where
customers.name in (select customers.name from
customers_industrial_branch_norm, customers where
( confusion.customers_industrial_branch_norm.nombr
e customers.industrial_branch AND
confusion.customers_industrial_branch_norm."food"lt
0 ) ) confusion_at_/tesis/jdbc/jdbc/jarspsql
conf conf select from customers
name industrial_branch address
discount ---------------------------------------
------------------------ Media Tools
computers new york 0
Garcia Productores tequila mexico
city 0 Microsol software
silicon valley 0 Tom's
Hamburgers food pasadena
0.07 East coast meat meat
florida 0.07 Luigi's italian food
italian food north america 0.07 Mole
Doña Rosa mexican food mexico
0.07 Texas fruits fruits
texas 0.07 Canada seeds
food canada 0.07
Extended SQL (user)
Pure SQL (machine)
Result
11
Example 3 Options for a bachelor application. (
(profesion, informatica)2 ?(sistema, abierto
semestral)2 )3
confusion_at_dlguzman1 jars java -classpath
CLASSPATHpostgresql.jar Conexion "select
profesiones., conf(profesiones.profesion),conf(pr
ofesiones.sistema)from profesiones where
conf(profesiones.profesion,'informatica')lt2 AND
conf(profesiones.sistema,'abierto semestral')lt2
and conf(profesiones.sistema)conf(profesiones.pro
fesion)lt3 order by conf(profesiones.profesion)co
nf(profesiones.sistema)" ------------- SQL
traducido select profesiones.,
confusion.profesiones_profesion_norm."informatica"
,confusion.profesiones_sistema_norm."abierto
semestral"from profesiones_sistema_norm,
profesiones_profesion_norm, profesiones where (
confusion.profesiones_profesion_norm.nombre
profesiones.profesion AND confusion.profesiones_pr
ofesion_norm."informatica"lt2 ) AND (
confusion.profesiones_sistema_norm.nombre
profesiones.sistema AND confusion.profesiones_sist
ema_norm."abierto semestral"lt2 ) and
confusion.profesiones_sistema_norm."abierto
semestral"confusion.profesiones_profesion_norm."i
nformatica"lt3 order by confusion.profesiones_prof
esion_norm."informatica"confusion.profesiones_sis
tema_norm."abierto semestral"
Extended SQL
Pure SQL
12
ESCUELA PROFESION DISPONIBLE SISTEMA
INFORMATICA ABIERTO SEMESTRAL uam
informatica 5 abierto trimestral 0
1 unam informatica 30
escolarizado anual 0 2 ipn
informatica 20 escolarizado semestral 0
2 monterrey informatica 20
escolarizado semestral 0 2 ipn
computacion 30 escolarizado anual 1
2 unam computacion 30
escolarizado semestral 1 2 monterrey
computacion 10 escolarizado semestral 1
2 uam quimica 5 abierto
trimestral 2 1 Finalizando... Fin. co
nfusion_at_dlguzman1 jars
Result
13
The method to create confusion tables
A
c) Calculate and create the confusion table
associated to each attribute of the entity
A alimento, nutritivo, basura, fruta,
carne AxA A2 a,a,a,n,a,b,a,f,a,c
, n,a,n,n,n,b,n,f,n,c,
b,a,b,n,b,b,b,f,b,c,
f,a,f,n,f,b,f,f,f,c,
c,a,c,n,c,b,c,f,c,c
conf(A2) conf(a,a), conf(a,n),
conf(a,b), ..., conf(c,c)
conf(A2) 0, 1, 1, 2, 2, 0, 0, 1, 1, 1, 0,
1, 0, 2, 2, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0,
conf select from mascota_alimento_norm
nombre alimentonutritivobasurafrutaverdura -
------------------------------------------- a
limento 0 1 1 2
2 nutritivo 0 0 1 1
1 basura 0 1 0 2
2 fruta 0 0 1 0
1 verdura 0 0 1 1 0
14
Steps to write predicates
  • Step 1 Create predicate
  • Step 2 Write the XSQL sentence
  • Step 3 Execute the program

15
Step 1. Writing the predicate
  • I whish to find every baseball players living in
    Xochimilco with confusion 1.
  • The predicate is
  • (sport baseball) ? (address xochimilco)1

16
Step 2. Writing XSQL
  • The user looks the predicate and writes it in
    XSQL
  • (sport baseball) ? (address xochimilco)1
  • It is rewritten by the user as
  • conf(sport, baseball)lt1 AND conf(address,
    xochimilco)lt1

17
Step 3. Converting to Pure SQL
  • A program converts the expression in XSQL to pure
    SQL, able to be executed in any database.
  • (sport baseball) ? (address xochimilco)1
  • conf(sport, baseball)lt1 AND conf (address,
    xochimilco)lt1
  • The program converts it to
  • (confusion.friends_sport_norm.nombre
    friends.sport AND confusion.friends_sport_norm.ba
    seball"lt1)
  • AND
  • (confusion.friends_address_norm.nombre
    friends.address AND confusion.friends_address_norm
    ."xochimilco"lt1)

18
Confusion tables
  • The previous expression refers to the tables
    confusion.friends_address_norm
  • and confusion.friends_sport_norm
  • Our program creates the tables just the first
    time
  • Using the hierarchical tree and the method to
    create confusion tables
  • Using these tables the query is more efficient
    because there is no need to recalculate the
    confusion values

19
Conclusions
  • Using hierarchies our program extends any
    database to make it able to recover objects with
    controlled precision
  • Allows to delete and update objects with
    controlled precision
  • The controlled precision organizes the objects in
    a smart way
Write a Comment
User Comments (0)