Title: LDAP Query Access: Challenges and Opportunities
1. LDAP Query Access: Challenges and Opportunities
with Peter Dinda, Northwestern
Part of GIS Task Force on Relational Data Models
2. Goals of Talk
- Pose problem
  - Query interface could be the limiting factor in directory server usability and performance.
- Pose possible solutions
  - Extensions to LDAP query language
  - SQL query processing front-end
  - Adopt relational model as information service data model
- Stimulate discussion with questions
3. Talk topics
- Yes
  - Data models (e.g., hierarchical, relational, object-oriented)
  - Query languages
- No
  - Schemas
  - Communication protocols
  - Interchange formats
  - Message-passing layers
  - Event-based services
4. Establishing a Common Terminology
- LDAP: protocol or data model?
- Difference between schemas and data models
- Difference between hierarchical, relational, and object-oriented data models
5. LDAP Protocol or Directory?
- LDAP v2 provides access to an X.500 directory (RFC 1777); i.e., LDAP is a gateway to the X.500 directory.
  [Figure: LDAP client -- TCP/IP --> LDAP server -- OSI --> X.500 server --> directory]
- LDAP v3 provides access to directories supporting the X.500 model (RFC 2251); i.e., an LDAP server can implement the directory itself.
  [Figure: LDAP client -- TCP/IP --> LDAP server --> directory]
6. Schema versus data model
- Data model
  - Describes entities, structure, relationships
  - e.g., relations, tuples, attributes, domains
- Schema
  - Description of the structure of data in a particular database
  - e.g., creates the tables, defines the attributes, and specifies domains for a given application
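To make the distinction concrete, the following is a minimal sketch using Python's built-in `sqlite3` module. The relational data model supplies the vocabulary (relation, attribute, domain); the schema is one concrete instance of it. The table names echo the hosts/hostdata example used later in the talk, but the column types and the reduced column set are assumptions made for illustration.

```python
import sqlite3

# Schema: a description of the data held in one particular database.
# Column types are assumed; the talk does not specify them.
SCHEMA = """
CREATE TABLE hosts    (ip TEXT PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE hostdata (ip TEXT PRIMARY KEY REFERENCES hosts(ip),
                       arch TEXT, os TEXT, mem INTEGER);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Walk the catalog and list each relation with its (attribute, domain) pairs.
tables = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
).fetchall()

relations = {}
for (table,) in tables:
    columns = conn.execute(f"PRAGMA table_info({table})").fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    relations[table] = [(col[1], col[2]) for col in columns]

for table, attributes in relations.items():
    print(table, attributes)
```

Swapping in a different schema changes the tables and attributes that are listed, while the data model (relations made of typed attributes) stays the same.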
7. Hierarchical, relational, or object-oriented data model?
- Hierarchical: tree structure; a child has only one parent; partitions easily; tree often directly reflected in physical storage. Query language is low-level and procedural. (Figure label: alias)
- Relational: set of tables; query language (SQL) is efficient, well-founded, and declarative. Doesn't handle complex data types well; flat organization is not always natural. (Figure label: foreign key)
- Object-oriented: enhanced conceptualization; handles complex data types; SQL-like interface; query language inefficient; no standard exists; no formal model. (Figure label: compositional hierarchy)
- Object-relational: adopted OO features into relational
8. Problem
The existing LDAP query access interface is inadequate for the types of queries typically posed by users of a grid information service.
9. Example Queries
- Where can I find a load measurement stream for host kanga?
  - source:tcp:kanga:5000, source:udp:239.99.99.99:5000
- Need 1 to 4 machines, all same OS and arch, with combined memory of 1 GB
  - (mojave), (sahara), ((poconos, pyramid, foo), (manch1,2,3,4), etc.)
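The second query is a search over groups of hosts rather than over single entries, which is what makes it hard to express as a single directory filter. The sketch below enumerates the candidate groups in plain Python. The host names come from the slide; the OS, architecture, and memory values are invented, and "combined memory of 1 GB" is read here as "at least 1024 MB", which is an assumption.

```python
from itertools import combinations

# Hypothetical inventory: (name, os, arch, memory in MB).
HOSTS = [
    ("mojave",  "LINUX", "I386",  1024),
    ("sahara",  "LINUX", "I386",  2048),
    ("poconos", "DUX",   "ALPHA", 256),
    ("pyramid", "DUX",   "ALPHA", 256),
    ("foo",     "DUX",   "ALPHA", 512),
]

def candidate_groups(hosts, max_size=4, min_total_mb=1024):
    """Return name tuples for groups of 1..max_size hosts that share one
    OS/arch pair and whose combined memory reaches min_total_mb."""
    groups = []
    for size in range(1, max_size + 1):
        for combo in combinations(hosts, size):
            platforms = {(h[1], h[2]) for h in combo}
            if len(platforms) == 1 and sum(h[3] for h in combo) >= min_total_mb:
                groups.append(tuple(h[0] for h in combo))
    return groups

for group in candidate_groups(HOSTS):
    print(group)
```

With this sample data the single hosts mojave and sahara qualify on their own, and the three ALPHA machines qualify only as a group, mirroring the shape of the answer on the slide.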
10. Relational Database Schema (normalized)
- hosts: IP, name
- hostdata: IP, numproc, mhz, arch, os, osv, mem, vmem, dasd, loc, user, note, UR
- modules: MID, mt, dsid, note, IP
- moduleexecs: mt, arch, os, minosv, ver, name, note, UR
- endpoints: MID, EPID
- datasources: dsid, dst
- endpointdata: EPID, protocol, IP, port, datatype
11. Hierarchical Schema
[Figure: directory tree rooted at ou=grid1 with nodes datasources, moduleexecs, host class, endpointdata, endpoints, modules, and hostdata; edge label: alias]
12. Relational Query 2: I need 2 machines having total memory between 512 and 1024 MB

    SELECT h1.name, hd1.arch, hd1.os, h2.name, hd2.arch, hd2.os,
           hd1.mem + hd2.mem AS TotalMem
    FROM hosts AS h1, hostdata AS hd1, hosts AS h2, hostdata AS hd2
    WHERE h1.ip = hd1.ip AND h2.ip = hd2.ip AND h1.ip != h2.ip
      AND hd1.mem + hd2.mem > 512 AND hd1.mem + hd2.mem < 1024

    -----------------------------------------------------------
    name       arch   os     name       arch   os     TotalMem
    -----------------------------------------------------------
    poconos.   ALPHA  DUX    innuendo.  I386   LINUX  640.00
    poconos.   ALPHA  DUX    pyramid.   ALPHA  DUX    640.00
    innuendo.  I386   LINUX  poconos.   ALPHA  DUX    640.00
    pyramid.   ALPHA  DUX    poconos.   ALPHA  DUX    640.00
    poconos.   ALPHA  DUX    firenze.   I386   LINUX  640.00
    -----------------------------------------------------------
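The self-join above can be tried end to end with Python's built-in `sqlite3` module. This is a minimal sketch: the two tables are cut down to the columns the query touches, and the IP addresses and memory sizes are made-up sample values, not data from the talk.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hosts    (ip TEXT PRIMARY KEY, name TEXT);
CREATE TABLE hostdata (ip TEXT PRIMARY KEY, arch TEXT, os TEXT, mem REAL);
""")

# (ip, name, arch, os, mem in MB): illustrative values only.
rows = [
    ("10.0.0.1", "poconos",  "ALPHA", "DUX",   256.0),
    ("10.0.0.2", "innuendo", "I386",  "LINUX", 2048.0),
    ("10.0.0.3", "pyramid",  "ALPHA", "DUX",   256.0),
    ("10.0.0.4", "firenze",  "I386",  "LINUX", 512.0),
]
conn.executemany("INSERT INTO hosts VALUES (?, ?)",
                 [(r[0], r[1]) for r in rows])
conn.executemany("INSERT INTO hostdata VALUES (?, ?, ?, ?)",
                 [(r[0], r[2], r[3], r[4]) for r in rows])

# Join each host with every other host and keep pairs whose summed memory
# falls strictly inside the requested range.
PAIR_QUERY = """
SELECT h1.name, h2.name, hd1.mem + hd2.mem AS TotalMem
FROM hosts AS h1, hostdata AS hd1, hosts AS h2, hostdata AS hd2
WHERE h1.ip = hd1.ip AND h2.ip = hd2.ip AND h1.ip != h2.ip
  AND hd1.mem + hd2.mem > 512 AND hd1.mem + hd2.mem < 1024
ORDER BY h1.name, h2.name
"""
pairs = conn.execute(PAIR_QUERY).fetchall()
for pair in pairs:
    print(pair)
```

Each qualifying pair appears twice, once in each order, because the only constraint relating the two hosts is `h1.ip != h2.ip`; replacing it with `h1.ip < h2.ip` would return each pair once.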
13. Hierarchical Version
Lacking aliasing to dynamically define logical relationships.

    #define SEARCHBASE "ad=Grid1"                      /* base */
    LDAP *ld;
    LDAPMessage *res;
    char *attrs[] = { "hostdata.name", "hostdata.arch",
                      "hostdata.os", "hostdata.mem", NULL };  /* return attributes */

    /* main */
    ldap_search_s(ld, SEARCHBASE,
                  LDAP_SCOPE_SUBTREE,                  /* scope */
                  "hostdata.name=*",                   /* search filter */
                  attrs, 0, &res);
    /* results processed using ldap_first_entry(), ldap_next_entry(),
       ldap_first_attribute(), etc. */

Lacking aggregate operator to perform functions over data before it is returned.

    ---------------------------------
    name       arch   os     Memory
    ---------------------------------
    poconos.   ALPHA  DUX    256
    innuendo.  I386   LINUX  2048
    pyramid.   ALPHA  DUX    256
    firenze.   ALPHA  DUX    512
    ---------------------------------

Low-level results processing.
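Because the search returns one flat entry per host, the pairing and summing for Query 2 has to happen in the client. The following sketch shows that post-processing step in Python. `ENTRIES` is a stand-in for the decoded search results (values copied from the listing above, with attribute values kept as strings, as a directory client would receive them); `pairs_in_memory_range` is a hypothetical helper, not part of any LDAP API.

```python
from itertools import permutations

# Stand-in for entries returned by the subtree search.
ENTRIES = [
    {"name": "poconos",  "arch": "ALPHA", "os": "DUX",   "mem": "256"},
    {"name": "innuendo", "arch": "I386",  "os": "LINUX", "mem": "2048"},
    {"name": "pyramid",  "arch": "ALPHA", "os": "DUX",   "mem": "256"},
    {"name": "firenze",  "arch": "ALPHA", "os": "DUX",   "mem": "512"},
]

def pairs_in_memory_range(entries, low, high):
    """Client-side replacement for the join and the arithmetic that the
    directory query language does not offer: pair up distinct entries and
    keep those whose summed memory lies strictly between low and high."""
    result = []
    for a, b in permutations(entries, 2):
        total = float(a["mem"]) + float(b["mem"])  # string-to-number conversion is the client's job
        if low < total < high:
            result.append((a["name"], b["name"], total))
    return result

for pair in pairs_in_memory_range(ENTRIES, 512, 1024):
    print(pair)
```

Every entry crosses the network before any filtering on combined memory can happen, which is the cost the missing aggregate operator imposes.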
14. LDAP query access limitations
[Figure: directory tree rooted at dc=att, dc=com with subtrees dc=products, dc=services, and dc=research; entries are labeled objectClass=orgUnit and surName=jagadish]

A. Use of different base entries
    (- (dc=att, dc=com ? sub ? surName=jagadish)
       (dc=research, dc=att, dc=com ? sub ? surName=jagadish))
Query: locate directory entries whose surname is Jagadish in AT&T, except those in research.

B. Selecting parents and children
    (c (dc=att, dc=com ? sub ? objectClass=orgUnit)
       (dc=att, dc=com ? sub ? surName=jagadish))
Query returns each entry that satisfies objectClass=orgUnit and has at least one child entry that satisfies surName=jagadish.
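Without such operators, a client has to issue the two subtree searches separately and combine the result sets itself. The sketch below emulates that in Python over a toy in-memory directory. All entries, the `cn=` names, and the helper functions are hypothetical; `search_subtree` only imitates a single `base ? sub ? attr=value` query and is not a real LDAP call.

```python
# Toy directory: DN as a tuple (most specific component first) -> attributes.
ATT = ("dc=att", "dc=com")
RESEARCH = ("dc=research",) + ATT
PRODUCTS = ("dc=products",) + ATT
SERVICES = ("dc=services",) + ATT

DIRECTORY = {
    ATT: {"objectClass": "domain"},
    RESEARCH: {"objectClass": "orgUnit"},
    PRODUCTS: {"objectClass": "orgUnit"},
    SERVICES: {"objectClass": "orgUnit"},
    ("cn=r1",) + RESEARCH: {"surName": "jagadish"},
    ("cn=p1",) + PRODUCTS: {"surName": "jagadish"},
    ("cn=s1",) + SERVICES: {"surName": "smith"},
}

def search_subtree(base, attr, value):
    """Emulate one query (base ? sub ? attr=value): the set of DNs at or
    below base whose attribute matches."""
    n = len(base)
    return {dn for dn, attrs in DIRECTORY.items()
            if len(dn) >= n and dn[-n:] == base and attrs.get(attr) == value}

def difference(first, second):
    """Operator '-' from example A: results of first not present in second."""
    return first - second

def with_child(parents, children):
    """Operator 'c' from example B: parents having at least one direct child
    among the children result set."""
    return {p for p in parents if any(c[1:] == p for c in children)}

outside_research = difference(
    search_subtree(ATT, "surName", "jagadish"),
    search_subtree(RESEARCH, "surName", "jagadish"))

units_with_jagadish = with_child(
    search_subtree(ATT, "objectClass", "orgUnit"),
    search_subtree(ATT, "surName", "jagadish"))

print(outside_research)
print(units_with_jagadish)
```

Each call to `search_subtree` stands for a round trip to the server, so both examples cost two queries plus client-side set manipulation.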
15. Relational Version of Query: Where can I find a load measurement stream for host kanga?

    SELECT ed.protocol, h.name, ed.port, m.name
    FROM hosts AS h, modules AS m, endpoints AS e, endpointdata AS ed
    WHERE h.name = 'kanga' AND ed.datatype = 'LOAD_MEASUREMENT'
      AND h.IP = m.IP AND m.MID = e.MID AND e.EPID = ed.EPID

Search all endpoints for all running modules on host kanga to find endpoints containing data type LOAD_MEASUREMENT.
Returns: tcp, kanga, 5000, resource_module
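The four-way join can be exercised with Python's built-in `sqlite3` module. This is a sketch under stated assumptions: only the columns the query touches are created, `modules.name` is included because the query selects `m.name`, and every row (addresses, ids, the second host and module) is invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hosts        (ip TEXT PRIMARY KEY, name TEXT);
CREATE TABLE modules      (mid INTEGER PRIMARY KEY, name TEXT, ip TEXT);
CREATE TABLE endpoints    (mid INTEGER, epid INTEGER);
CREATE TABLE endpointdata (epid INTEGER PRIMARY KEY, protocol TEXT,
                           ip TEXT, port INTEGER, datatype TEXT);

INSERT INTO hosts VALUES ('10.0.0.7', 'kanga'), ('10.0.0.8', 'roo');
INSERT INTO modules VALUES (1, 'resource_module', '10.0.0.7'),
                           (2, 'other_module',    '10.0.0.8');
INSERT INTO endpoints VALUES (1, 100), (1, 101), (2, 200);
INSERT INTO endpointdata VALUES
    (100, 'tcp', '10.0.0.7', 5000, 'LOAD_MEASUREMENT'),
    (101, 'tcp', '10.0.0.7', 5001, 'OTHER'),
    (200, 'tcp', '10.0.0.8', 5000, 'LOAD_MEASUREMENT');
""")

# Follow host -> module -> endpoint -> endpoint data through key equalities;
# no knowledge of a storage path or tree layout is needed.
STREAM_QUERY = """
SELECT ed.protocol, h.name, ed.port, m.name
FROM hosts AS h, modules AS m, endpoints AS e, endpointdata AS ed
WHERE h.name = ? AND ed.datatype = ?
  AND h.ip = m.ip AND m.mid = e.mid AND e.epid = ed.epid
"""
streams = conn.execute(STREAM_QUERY, ("kanga", "LOAD_MEASUREMENT")).fetchall()
print(streams)
```

The host name and data type are bound as parameters, so the same statement serves any host or stream type.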
16. Hierarchical Version
Explicit start point in search space: more encompassing queries are obtained by starting higher in the tree, at the expense of costlier queries.

    #define SEARCHBASE "ad=Grid1"
    LDAP *ld;
    LDAPMessage *res;
    char *attrs[] = { "modules.endpoints.endpointdata.protocol",
                      "modules.hostdata.name",
                      "modules.endpoints.endpointdata.port",
                      "modules.name", NULL };

    /* main */
    ld = ldap_open();    /* server host and port arguments omitted */
    ldap_simple_bind_s(ld, user, Passwd);
    ldap_search_s(ld, SEARCHBASE, LDAP_SCOPE_SUBTREE,
                  "modules.hostdata.name=kanga & modules.endpoints.endpointdata=LOAD_MEASUREMENT",
                  attrs, 0, &res);

Explicit path traversal to walk aliases requires that users know structural detail, which makes it difficult to write accurate queries.
17. LDAP query access limitations summary

LDAP limitation | Impact | Relational data opportunity
No queries selecting parents and children | User generates multiple queries, joins results | Supported implicitly by flat tables
No complex queries using different base addresses | Can't cross admin domains. User generates multiple queries. | Distributed relational database? Front-end interface?
Need explicit path knowledge to traverse aliases | Low-level for user | Removed by flat tables
No floating point support | | Supported
No aggregate selection | Imposes low-level processing on user | Supported
18. Solutions
- Query access language extensions
  - Database community is looking at extensions to the LDAP query language. May be possible to influence or adopt.
- Adopt relational data model
  - Relational data model enables efficient query access. Expressive language. Prototype exists as part of RPS.
- Embed converter in data stream exported by directory server
  - dQUOB evaluates SQL-style queries over streaming data; may be part of a solution.
19. Discussion
- Hierarchical model superior for partitioned data space.
  - Queries across partitions likely?
  - If so, LDAP referrals using server chaining or front-end interface.
- What types of queries are likely?
- What's the metric?
  - Minimize number of accesses to server?
  - More expressive queries?
  - Floating point support