Title: Metadata Workshop
1Metadata Workshop
- Rick St. Denis
- Glasgow University
- April 26-28, 2004
2Format
- Goal: Answer the question "What is metadata?" in our document
- Method: Provocateurs
- Topics list: augment now
- Get acquainted, divide and study topics, present together, course of action
- Output: Revamped deliverables
3Rough Agenda
- Mon
  - 2:00-3:00: 5 min on who we are
  - 3:00-3:30: Decide on topics
  - 3:30-5:00: Get to Stepps, Hotel
  - 5:00: Meet in 2 West Ave
- Tues: Provocateur sessions and research
- Wed: Final document with deliverables, plans for future MO, CHEP abstract
4Topics
- Metadata Architecture and components
- Replica Catalogs, file catalogs, physics catalogs
- Use Cases
- Query Languages
- Implementations and Performance. Technology Considerations, Performance reqs
- Service architectures, Deployment Architectures
- Database implementations: text/mysql/postgres/oracle/enth
5Informing ourselves
- SAM Services (Julie)
- Arda/OGSA-DAI (Gavin will outline)
- LHCB (Carmine)
- AMI (Solveig)
- Pool and Graphical Visualization (Carmine)
- Spitfire (Paul)
- PNPA-GGF (Rick)
- Project Management (Tony)
- Package services for release in SourceForge
6Use Cases
- CDF5858 physicist use case (Rick)
- HEPCAL II (Solveig, Tony)
- Production
- Analysis
- ADA: Atlas catalogs, David Adams (Steve)
- D0 (Wyatt)
- Schema Update Document: use cases? (Adam)
7Services
- Compare Arda and SAM approaches; Arda architecture (Gavin)
- Given use cases: define services
- List services: from SAM Services to services
- Interfaces: the SAM service with one schema, the Grid services implemented in several schemas.
- Interfaces: physics catalog impact from failure of lower-level services; file content status.
- Action: outline models of access (physical/logical)
- Discrete or related bits of functionality: dependencies between services. Zenness of services. List of files, directive on where to use, not connection to why anymore. Performance implications on interfaces.
- Wyatt, Gavin, Rick, Julie
8Deployment Architectures
- Where do the services run? Application servers? Tiers of applications and databases
- Replication for HA. At what tier? Application or DB? Oracle? Is it replication or mirroring?
- What is the time constant for replication?
- When do metadata become stale? Freshness date, status bits.
- Centralized catalogs as a single point of failure: what are single points of failure?
- HA strategies
- Federation of metadata
- Julie, Gavin, Paul, Solveig
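The staleness question above can be made concrete with a small sketch. It assumes each replicated catalog entry carries a freshness date and that a maximum tolerated replication lag is agreed per use case; the `CatalogEntry` class, its fields, and the thresholds are hypothetical and not taken from any of the systems discussed.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class CatalogEntry:
    """A replicated metadata record with a freshness date (hypothetical layout)."""
    logical_name: str
    replicated_at: datetime  # when this copy was last synchronised


def is_stale(entry: CatalogEntry, now: datetime, max_age: timedelta) -> bool:
    """Return True when the copy is older than the tolerated replication lag."""
    return now - entry.replicated_at > max_age


now = datetime(2004, 4, 27, 12, 0, tzinfo=timezone.utc)
entry = CatalogEntry("dataset-a", now - timedelta(hours=3))

# The same 3-hour-old copy is stale for a 1-hour limit but fresh for a 1-day limit.
stale_1h = is_stale(entry, now, timedelta(hours=1))
stale_1d = is_stale(entry, now, timedelta(days=1))
```

The point of the sketch is that "stale" depends on the consumer: the replication time constant only matters relative to the lag each use case can tolerate.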
9Tools
- DB: jdbc, phpi, text, mysql, msql, oracle, xml, soap, python
- Dbserver
- Tools on top of sql.
- Relation to deployment architectures: db access directly or application server.
- Replication
- Data Virtualization
- Rick, Gavin, Solveig, Adam, Julie
10Query Languages and Interfaces
- SQL
- Chains and Links (Rick)
- General Dimensions (Wyatt)
- Queries against multiple databases. Related to deployment architecture (dimensions, cl, SBIR II/enth)
- POOL (Carmine)
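As an illustration of a query against multiple databases, here is a minimal sketch using SQLite's `ATTACH DATABASE` from Python's standard `sqlite3` module. Two in-memory databases stand in for a physics catalog and a replica catalog; the table names, columns, and sample rows are assumptions made for the example, not the schema of any experiment.

```python
import sqlite3

# One connection, two databases: "main" plays the physics catalog,
# "replicas_db" plays a separate replica catalog (both hypothetical).
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS replicas_db")

conn.execute("CREATE TABLE main.files (lfn TEXT PRIMARY KEY, run INTEGER)")
conn.execute("CREATE TABLE replicas_db.locations (lfn TEXT, site TEXT)")

conn.executemany(
    "INSERT INTO main.files VALUES (?, ?)",
    [("f1", 100), ("f2", 101)],
)
conn.executemany(
    "INSERT INTO replicas_db.locations VALUES (?, ?)",
    [("f1", "siteA"), ("f1", "siteB"), ("f2", "siteA")],
)

# A single SQL statement joins across both databases.
rows = conn.execute(
    "SELECT f.lfn, l.site "
    "FROM main.files AS f "
    "JOIN replicas_db.locations AS l ON l.lfn = f.lfn "
    "WHERE f.run = ? ORDER BY l.site",
    (100,),
).fetchall()
```

This only works when both databases are reachable from one engine; with catalogs on different servers the join would have to move into an application tier, which is where the link to deployment architecture comes in.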
11Monitoring
- SAM TV (Adam)
- Mining and instrumenting (Caitriana)
- MonALISA
- File access patterns
- Stats
12Security
- Table Access in a distributed architecture
- Server to Server security
- Access to the Server by the user
- A standard certification protocol
- VOMS
- Spitfire security
15Next Steps
- Design for Keyword-Value
- Schema evolution and self-describing schema
- Use previous 2 to automate transition from keyword-value to query-efficient schema and determination of which queries need to be satisfied.
- Unique dataset tool
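The transition from keyword-value storage to a query-efficient schema might look like the following sketch, again using Python's standard `sqlite3` module. The `kv` and `file_meta` tables, the keywords `run` and `stream`, and the choice of which keywords to promote to columns are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Flexible keyword-value storage: easy to extend, slow to query by value.
conn.execute("CREATE TABLE kv (file_id TEXT, keyword TEXT, val TEXT)")
conn.executemany(
    "INSERT INTO kv VALUES (?, ?, ?)",
    [
        ("f1", "run", "100"), ("f1", "stream", "muon"),
        ("f2", "run", "101"), ("f2", "stream", "jet"),
    ],
)

# Keywords chosen for promotion to real columns (hypothetical choice; these
# must be trusted identifiers because they are interpolated into SQL).
promoted = ["run", "stream"]

columns = ", ".join(f"{k} TEXT" for k in promoted)
conn.execute(f"CREATE TABLE file_meta (file_id TEXT PRIMARY KEY, {columns})")

# Pivot: one row per file, one column per promoted keyword.
pivot = ", ".join(
    f"MAX(CASE WHEN keyword = '{k}' THEN val END) AS {k}" for k in promoted
)
conn.execute(
    f"INSERT INTO file_meta SELECT file_id, {pivot} FROM kv GROUP BY file_id"
)

# Index the column that the expected queries filter on.
conn.execute("CREATE INDEX idx_file_meta_run ON file_meta (run)")

rows = conn.execute(
    "SELECT file_id, stream FROM file_meta WHERE run = ?", ("101",)
).fetchall()
```

Deciding which keywords go into `promoted`, and which columns get an index, is the part that depends on knowing which queries need to be satisfied.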
16Deliverables
- Docs from next steps
- Use case filtered for our group
- Services: Decomposition of ER diagram into collaboration diagram
- Deployment Arch: Enumerate problems
- Monitoring: Stats on queries (accumulate/doc)
- QueryLang/Int: Survey of QL (POOL, CL)
- Tools: Wrap corba w/xml
- Deliverables longer term
17Schedules
- Monthly meeting: last Tues of month at 830/1430/1530. First: May 25. H323 8272634
- Mailing list (Paul)
18Metadata for the Common Physicist
A working group on metadata, with representatives from ATLAS, BaBar, CDF, CMS, D0, and LHCB in cooperation with EGEE, has identified overlapping user requirements that may be supported by common service implementations. Classes of metadata specific to each service and their relations are described. These include a set of use cases based on a compilation of various HEP documents. These documents are used to inform interfaces in existing and planned services as described in metadata schema. Emphasis is placed on the evolution of schema using keyword-value pairs that are then transformed into a normalised, performant database schema. A report is made of self-description mechanisms which, coupled with updating processes, allow the APIs to remain static as the schema evolves. A presentation is made of the way use cases drive performance. Requirements are presented for the physical and logical arrangement of service implementations, dictating the degree to which the databases containing the metadata may be distributed or centralised. A set of existing monitoring tools exposes the validity and completeness of the use cases for experiments in various stages of maturity. A survey of the query languages, web service interfaces and tools in use across the experiments is presented.
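One way to picture a self-description mechanism that keeps an API static while the schema evolves is the sketch below. It assumes a SQLite table named `file_meta` (hypothetical) and uses SQLite's `PRAGMA table_info` to discover the columns at call time, so the function `get_metadata` (also hypothetical) never needs to change when a column is added.

```python
import sqlite3


def describe(conn, table):
    """Ask the database which columns the table currently has.

    PRAGMA table_info yields rows of (cid, name, type, notnull, dflt_value, pk).
    The table name must be a trusted identifier.
    """
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


def get_metadata(conn, file_id):
    """Static API: return all metadata for a file as a dict, whatever the schema."""
    columns = describe(conn, "file_meta")
    row = conn.execute(
        "SELECT * FROM file_meta WHERE file_id = ?", (file_id,)
    ).fetchone()
    return dict(zip(columns, row)) if row else None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE file_meta (file_id TEXT PRIMARY KEY, run INTEGER)")
conn.execute("INSERT INTO file_meta VALUES ('f1', 100)")
before = get_metadata(conn, "f1")

# The schema evolves; callers of get_metadata are unaffected.
conn.execute("ALTER TABLE file_meta ADD COLUMN stream TEXT")
conn.execute("UPDATE file_meta SET stream = 'muon' WHERE file_id = 'f1'")
after = get_metadata(conn, "f1")
```

Here `before` holds only `file_id` and `run`, while `after` also carries `stream`, with no change to the calling code.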
19Future
- Work to deliverables
- Meet according to deadlines
- Workshops according to major deadlines