Title: Virtual Organizations, Security and Knowledge Discovery in the CrossGrid Project
1. Virtual Organizations, Security and Knowledge Discovery in the CrossGrid Project
Jesús Marco, CrossGrid WP4 (International Testbed)
Instituto de Física de Cantabria, Consejo Superior de Investigaciones Científicas (CSIC), Santander, Spain
http://www.eu-crossgrid.org
2. The EU CrossGrid Project
- European project (5 M€, March 2002-2005)
- Proposed to CPA9, 6th IST call, V FP
- Polish (Cracow and Poznan) / Spanish (CSIC and CESGA) / German (FZK) initiative with the support of CERN (thanks to Fab!)
- CYFRONET (Cracow) is the coordinator of the project (Michal Turala, project leader)
- Objectives
- Extension of the Grid in Europe, assuring interoperability with DataGrid
- Interactive applications (human in the loop)
- Environmental fields (meteorology/air pollution, flooding)
- High Energy Physics (interactive analysis over distributed datasets)
- Medicine (vascular surgery preparation)
- Need
- Develop corresponding middleware and tools
- Deploy on a pan-European testbed
- Partners
- Poland (CYFRONET, PSNC, ICM, INP, INS), Spain (CSIC IFCA, IFIC, RedIRIS, UAB, USC), Germany (FZK, USTUTT, TUM), Slovakia (II SAS), Ireland (TCD), Portugal (LIP), Austria (U.Linz), The Netherlands (UvA), Greece (DEMO, AuTH), Cyprus (UCY)
- Industry: Datamat (I), Algosystems (Gr)
3. VO, SEC and KD in CrossGrid
- CrossGrid interactive applications require
- Complex but secure Virtual Organizations
- CrossGrid middleware provides a framework for development
- Friendly, secure use: Roaming Access Server (Portal/Migrating Desktop)
- Scheduling for collaborative work on VO resources
- CrossGrid testbed
- Relies on local site support for management and security
- Uses Globus basic grid security: GSI
- Follows EU-DataGrid in deployment for interoperability
- Certification Authorities
- Virtual Organization: LDAP
- Next: VOMS
- Knowledge Discovery
- Development of Grid-adapted data mining techniques accessing distributed databases with published metadata catalogs
4. Flood management
- Goal
- Flooding risk prediction
- Method
- Cascade of simulations
- Meteorological
- Hydrological
- Hydraulic
- Virtual Organization
- Need Grid in interactive mode (simulation results for what-if scenarios)
- Seamlessly connect experts, data and computing resources needed for quick decisions
- Highly automated early warning system, based on hydro-meteorological (snowmelt) rainfall-runoff simulations
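The cascade of simulations on this slide can be pictured as three chained stages, each feeding the next. The sketch below is only an illustration: the function names, the runoff coefficient, the catchment area and the channel capacity are all invented placeholders, not values or models from the CrossGrid flood application.

```python
from dataclasses import dataclass


@dataclass
class Forecast:
    rainfall_mm: float   # predicted rainfall over the catchment
    snowmelt_mm: float   # predicted snowmelt contribution


def meteorological(rain_mm: float, snowmelt_mm: float) -> Forecast:
    """Stage 1 (toy): a real run would call a weather model here."""
    return Forecast(rain_mm, snowmelt_mm)


def hydrological(f: Forecast, runoff_coeff: float = 0.5,
                 area_km2: float = 100.0) -> float:
    """Stage 2 (toy rainfall-runoff): water depth -> runoff volume in m^3."""
    depth_m = (f.rainfall_mm + f.snowmelt_mm) / 1000.0
    return runoff_coeff * depth_m * area_km2 * 1_000_000.0


def hydraulic(volume_m3: float, channel_capacity_m3: float = 2.0e6) -> bool:
    """Stage 3 (toy): flag flooding when runoff exceeds channel capacity."""
    return volume_m3 > channel_capacity_m3


def flood_risk(rain_mm: float, snowmelt_mm: float) -> bool:
    """Run the whole cascade for one what-if scenario."""
    return hydraulic(hydrological(meteorological(rain_mm, snowmelt_mm)))


if __name__ == "__main__":
    # Interactive "what-if" use: the expert varies the scenario and re-runs.
    for rain, melt in [(20.0, 0.0), (50.0, 10.0)]:
        print(rain, melt, flood_risk(rain, melt))
```

In an interactive Grid setting each stage would be a separate job on different resources; the point here is only the data flow between stages.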
5. Grid Security Infrastructure (GSI)
- Globus Toolkit implements GSI protocols and APIs to address Grid security needs
- GSI protocols extend standard, well-known public-key protocols for authentication and message protection
- X.509 identity certificates
- SSL/TLS
- GSI supports a standard API, GSS-API, so that a number of applications can use it
- SSH, GridFTP
6. Grid Security Infrastructure (GSI)
[Diagram: GSI layers]
- Proxies and delegation (GSI extensions) for secure single sign-on
- SSL/TLS for authentication and message protection
- PKI (CAs and certificates) for credentials
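A toy model may help to see why proxies and delegation give single sign-on: each proxy is issued by the credential above it, carries a short lifetime, and is accepted only if the whole chain leads back to a trusted CA. The sketch below models only names and expiry times; it performs no cryptography, and the `Cert`, `delegate` and `chain_valid` names are hypothetical, not Globus APIs.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Cert:
    subject: str
    issuer: str
    not_after: datetime
    parent: Optional["Cert"] = None   # the credential that issued this one


def delegate(parent: Cert, now: datetime, lifetime: timedelta) -> Cert:
    """Create a proxy issued by `parent`; it can never outlive its parent."""
    expiry = min(now + lifetime, parent.not_after)
    return Cert(parent.subject + "/CN=proxy", parent.subject, expiry, parent)


def chain_valid(cert: Cert, trusted_ca: str, now: datetime) -> bool:
    """Walk from the proxy up to the user certificate, checking each link.

    Assumption: signatures are not verified; only names and dates are checked.
    """
    while True:
        if now >= cert.not_after:
            return False                      # some link has expired
        if cert.parent is None:
            return cert.issuer == trusted_ca  # user cert must come from the CA
        if cert.issuer != cert.parent.subject:
            return False                      # broken chain
        cert = cert.parent
```

The user unlocks the long-lived certificate once; every later delegation step uses only the short-lived proxy, which limits the damage if a proxy is stolen.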
7. EU-DataGrid Security Services
8. The CrossGrid Testbed
- 16 sites (small and large) in 9 countries, connected through Géant and NRENs
- Grid services: EDG middleware (based on Globus): RB, VO, RC
[Map: testbed sites connected through Géant]
- TCD Dublin
- PSNC Poznan
- UvA Amsterdam
- ICM and IPJ Warsaw
- FZK Karlsruhe
- CYFRONET Cracow
- CSIC-UC IFCA Santander
- USC Santiago
- LIP Lisbon
- AuTH Thessaloniki
- UAB Barcelona
- CSIC RedIRIS Madrid
- CSIC IFIC Valencia
- UCY Nicosia
- DEMO Athens
9. Computing resources
- Site testbed
- LCFG configuration server
- User Interface
- Gatekeeper (Computing Element)
- Worker Nodes
- Storage Element
- 16 sites
- 115 CPUs (Worker Nodes)
- 4 TB (Storage Elements)
- National Certification Authority machines
- Grid services (LIP)
- Information Index
- Top MDS Information Server, points to site Information Servers
- Resource Broker
- Matchmaking and load balancing scheduler
- Replica Catalogue
- Database for physical replica file location
- Certificate Proxy Server
- Short-lived certificates for long-lived processes, used by RB
- Virtual Organization Server
- Database for user authentication (CROSSGRID VO)
- Monitoring
- Mapcenter network monitoring system
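The Resource Broker's role as a "matchmaking and load balancing scheduler" can be illustrated with a very small sketch: filter the sites that satisfy the job requirements, then pick the least loaded one. This is not the EDG Resource Broker algorithm; the `Site` fields, the ranking rule and the site names are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Site:
    name: str
    free_cpus: int
    total_cpus: int
    has_replica: bool   # whether the input data is replicated at this site


def match(sites: List[Site], cpus_needed: int,
          need_data: bool = False) -> Optional[Site]:
    """Matchmaking: keep sites that meet the requirements.
    Load balancing: among those, prefer the highest fraction of free CPUs."""
    candidates = [
        s for s in sites
        if s.free_cpus >= cpus_needed and (s.has_replica or not need_data)
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda s: s.free_cpus / s.total_cpus)


if __name__ == "__main__":
    sites = [
        Site("site-a", 2, 10, True),
        Site("site-b", 8, 10, False),
        Site("site-c", 5, 20, True),
    ]
    print(match(sites, 4))                  # least loaded site with 4 free CPUs
    print(match(sites, 4, need_data=True))  # must also hold a data replica
```

In the testbed the site state would come from the Information Index and the replica locations from the Replica Catalogue; here both are hard-coded.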
10. CrossGrid CA page
11. Working on RA procedure
12. VO server in CrossGrid
13. Overview of VOMS
Focus is on VOMS; details are in D7.6 Security Design.
[Diagram: authentication and authorization flow. Recoverable labels:]
- Credentials: CA; user certificate (dn, ca, Pkey); proxy cert request; proxy cert (dn, cert, Pkey, VOMS cred.; short lifetime); VOMS (VO, group(s), role(s)); MyProxy (renewal; cert+key, long lifetime); delegation (cert+key, short lifetime)
- Authentication (auth): GSI, TrustManager, mod_ssl
- Authorization (authz): pre-process (parameters -> obj.id, req. op.); LCAS and WebServices Authz (dn, attrs, acl, req. op -> yes/no); GACL (obj.id -> acl)
- Mapping (map): LCMAPS (dn -> userid, krb ticket); dn -> DB role
- Final step: doit
- Coarse grained (e.g. gatekeeper, RepMec); fine grained (e.g. SE, /grid; GridSite; Spitfire)
- Implementations: web, C, Java
14. VOMS Overview
- Provides information about the user's relationship with his VO(s)
- Groups, roles (admin, student, ...), capabilities (free-form string), temporal bounds
- Features
- Single login: voms-proxy-init only at the beginning of the session (replaces grid-proxy-init)
- Expiration time: the authorization information is only valid for a limited period of time (possibly different from that of the proxy certificate itself)
- Backward compatibility: the extra VO-related information is in the user's proxy certificate, which can still be used with non-VOMS-aware services
- Multiple VOs: the user may authenticate himself with multiple VOs and create an aggregate proxy certificate
- Security: all client-server communications are secured and authenticated
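The features above (per-VO attributes, their own expiration time, aggregation over several VOs, backward compatibility) can be sketched as a small data model. The class and field names below are hypothetical and do not reproduce the actual VOMS credential encoding.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Tuple


@dataclass(frozen=True)
class VOAttributes:
    vo: str
    groups: Tuple[str, ...]
    roles: Tuple[str, ...]
    capabilities: Tuple[str, ...]   # free-form strings
    not_after: datetime             # may differ from the proxy's own expiry


@dataclass
class Proxy:
    subject: str
    not_after: datetime
    attributes: List[VOAttributes] = field(default_factory=list)  # one per VO


def active_attributes(proxy: Proxy, now: datetime) -> List[VOAttributes]:
    """Attributes a VOMS-aware service would still honour at time `now`."""
    if now >= proxy.not_after:
        return []                   # an expired proxy carries nothing
    return [a for a in proxy.attributes if now < a.not_after]


def legacy_identity(proxy: Proxy) -> str:
    """A non-VOMS-aware service ignores the attributes and sees only the DN."""
    return proxy.subject
```

Keeping the attributes inside the proxy is what gives backward compatibility in this picture: older services read the subject and skip the rest.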
15. VOMS Architecture
[Diagram: VOMS server (vomsd, GSI) and its DB. Interface labels: voms-proxy-init; soap SSL; JDBC; https; DBI; mkgridmap (https)]
- VOMS server: MySQL db with history and audit records
- User query server and client (C)
- Java Web Service based administration interface
- Perl client (batch processing)
- Web browser client (generic administrative tasks)
- Web server interface for mkgridmap
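The mkgridmap interface exists so that sites can build a grid-mapfile from VO membership. A minimal sketch of that step follows, assuming the membership has already been fetched as a dictionary of DNs per VO; the function name, the account names and the DNs are invented for the example, and the real tool's options are not reproduced.

```python
from typing import Dict, List


def make_gridmap(members_by_vo: Dict[str, List[str]],
                 account_by_vo: Dict[str, str]) -> str:
    """Build grid-mapfile text: one '"<DN>" <local account>' line per user.

    Assumption: a DN listed under several VOs keeps the first mapping seen.
    """
    lines = []
    seen = set()
    for vo, dns in members_by_vo.items():
        for dn in dns:
            if dn in seen:
                continue
            seen.add(dn)
            lines.append(f'"{dn}" {account_by_vo[vo]}')
    return "\n".join(sorted(lines)) + "\n"


if __name__ == "__main__":
    members = {"crossgrid": ["/O=Example/CN=Bob", "/O=Example/CN=Alice"]}
    print(make_gridmap(members, {"crossgrid": "cguser"}), end="")
```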
16. User Authorization in EDG 2.x
[Diagram. Recoverable labels: user; service; crl update; user cert (long life); host cert (long life); registration with VO-VOMS; voms-proxy-init; proxy cert (short life); service cert (short life); authz cert (short life); authentication and authorization info; edg-java-security; LCAS]
17. Local Site Authorization
- Local Centre Authorization Service (LCAS)
- Handles authorization requests to local fabric
- Authorization decisions based on proxy user certificate and job specification
- Supports the grid-mapfile mechanism
- Plug-in framework (hooks for external authorization plugins)
- Allowed users (grid-mapfile or allowed_users.db), banned users (ban_users.db), available timeslots (timeslots.db)
- Plugin for VOMS (to process authorization data)
- Local Credential Mapping Service (LCMAPS)
- Provides local credentials needed for jobs in fabric
- Mapping based on user identity, VO affiliation, local site policy
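The plug-in idea behind LCAS, followed by the LCMAPS mapping step, can be sketched as a chain of independent checks that must all agree, and then a lookup of the local account. This is an illustration only: the real services are not written this way, the in-memory sets stand in for allowed_users.db, ban_users.db and timeslots.db, and all names are hypothetical.

```python
from datetime import time
from typing import Callable, Dict, Iterable, Optional, Set

# A plugin answers yes/no for a user DN at a given time of day.
Plugin = Callable[[str, time], bool]


def allowed_users(allowed: Set[str]) -> Plugin:
    return lambda dn, t: dn in allowed


def banned_users(banned: Set[str]) -> Plugin:
    return lambda dn, t: dn not in banned


def timeslot(opens: time, closes: time) -> Plugin:
    return lambda dn, t: opens <= t < closes


def authorize(dn: str, t: time, plugins: Iterable[Plugin]) -> bool:
    """LCAS-style decision: every plugin must say yes."""
    return all(plugin(dn, t) for plugin in plugins)


def map_credentials(dn: str, vo: str, vo_accounts: Dict[str, str],
                    local_overrides: Dict[str, str]) -> Optional[str]:
    """LCMAPS-style mapping: local site policy first, then VO affiliation."""
    return local_overrides.get(dn) or vo_accounts.get(vo)
```

Separating the yes/no decision from the mapping mirrors the split between the two services on this slide: a site can tighten its policy by adding a plugin without touching how accounts are assigned.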
18. Knowledge Discovery
- Will the CrossGrid VO export or discover knowledge?
- Likely for Meteo applications
- Only partially for HEP applications
- First step: extending KDD to the Grid environment
- Data mining on distributed databases (tasks 1.3-1.4, HEP and Meteo large databases)
- Distributed query using
- Metadata and Replica catalogs
- Interactive Database Server modules (e.g. O/R DBMS, PAW)
- Queries in XML format
- Distributed via MPICH-G2 in a master-slave scheme
- Mid-to-large size databases, O(TB)
- Data-mining algorithms adapted to the Grid
- Distributed Neural Network training
- Self-Organizing Maps
- Also distributed using MPICH-G2
- Tests started! Encouraging first results!
- Modeling, benchmarking, performance prediction (CrossGrid WP2 tools)
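The master-slave scheme for distributed training can be sketched without MPI: each worker computes a partial gradient on its local data shard, and the master sums the results and updates the shared weights. The example below simulates the workers with plain function calls in one process and trains a single linear neuron; it is an assumed illustration of the data-parallel idea, not the CrossGrid code, and MPICH-G2 is not used.

```python
from typing import List, Tuple

Shard = List[Tuple[float, float]]   # (x, y) pairs held by one worker


def partial_gradient(w: float, b: float, shard: Shard):
    """Worker: summed squared-error gradient of y ~ w*x + b on local data."""
    gw = gb = 0.0
    for x, y in shard:
        err = (w * x + b) - y
        gw += 2.0 * err * x
        gb += 2.0 * err
    return gw, gb, len(shard)


def train(shards: List[Shard], epochs: int = 500, lr: float = 0.05):
    """Master: broadcast (w, b), gather partial gradients, apply the average."""
    w = b = 0.0
    for _ in range(epochs):
        results = [partial_gradient(w, b, s) for s in shards]  # "slaves"
        n = sum(r[2] for r in results)
        w -= lr * sum(r[0] for r in results) / n
        b -= lr * sum(r[1] for r in results) / n
    return w, b


if __name__ == "__main__":
    # Toy data generated from y = 2x + 1, split over two workers.
    shards = [[(0.0, 1.0), (1.0, 3.0)], [(2.0, 5.0), (3.0, 7.0)]]
    print(train(shards))
```

Only the gradients travel between workers and master, never the data itself, which is what makes the scheme attractive when the databases are large and distributed.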
19. Architecture
20. Challenging issues to be discussed with other projects
- On-line Authentication mechanisms?
- Proxy use for portals/roaming access
- User understanding of Virtual Organizations
- Membership features
- Permanent storage (personal/group/VO/external)
- Optimal use (from accounting and scheduling to replication and resilience)
- Active security policies (Grid-patrols)
- Metadata publication for distributed databases
- Transition to OGSA/OGSI
- Adapting current middleware
- OGSA-DAI use
- Distributed mechanisms (MPICH-G3?)
- New knowledge discovery mechanisms
21. Summary
- Virtual Organizations and Security are key points in the CrossGrid project
- Experience from a real working testbed, thanks to the use of Globus GSI and EU-DataGrid middleware
- Considerable effort on deployment (CA, RA, VO, site management)
- An interoperable pan-European community (CrossGrid and DataGrid)
- VOMS (EDG) opens new possibilities for VOs
- CrossGrid will make clear to the user the VO possibilities, but also the security issues, to assure a friendly environment
- Portal proxy-based secure access, also to be almost transparent
- User groups and roles, together with resource discovery and monitoring
- Knowledge discovery can be seen as a final ideal environment for specific application users; work is progressing in this direction
- Prototypes for data mining on distributed databases are being tested in a realistic Grid environment