Title: VREs: addressing the Needs of e-Research
1 VREs: addressing the Needs of e-Research
- Rob Allan
- CCLRC e-Science Centre
- Daresbury Laboratory
2 The Research Process
- Create or access Data
- Data exists in many forms: text, numbers, pictures, sound, printed, digital
- Simulations and experimentation or observation create new data
- Data is archived/curated and has associated metadata
- Analyse to produce Information
- Definition of information?
- Trends and relationships between data (if I do A then B will happen)
- Metadata important
- Synthesize existing and new Knowledge
- Knowledge links concepts
- Based on information (C is related to D in some way)
- Knowledge base must be consistent; new knowledge introduced carefully!
- Science relates knowledge to real world events
3 What if?
- You could automatically access all of the Archived Data Sets and those used in every social research publication and decide on the most appropriate data for your research needs, without having to spend days reading through coding schedules and questionnaires
- You could automatically re-estimate all the models others have used on these data sets, and see what happens if you drop or add new variables to the analysis
- You could quickly formulate (check the identification etc.) and estimate any new models or combinations of existing models you thought might be relevant
- You could do this across multiple datasets
- You could match your research questions to information held in existing digital resources. Search for new explanations
- Integrate multiple sources of data and text to help to fill in missing data and ideas.
Example from Social Sciences (Rob Crouchley)
4 Another Use Case
- User logs in to his desktop toolset using single sign-on
- Desktop tools must share identity and authentication information
- Link into Grid
- Link into digital archives
- Based on a survey of the literature, has an idea that an observed phenomenon happens for a certain range of a physical parameter
- Need literature cross-search facilities, Google style, using scientific metadata
- How do we make it easy to ask the right question?
- Chooses a simulation model capable of testing this behaviour
- Needs metadata about models: accuracy vs. parameter range, accuracy vs. performance, complexity of model
- Runs a set of simulations over appropriate parameter space
- Analyses results. Real time? Steering? Visualisation?
- Compare with previous work and observation
- XML or other representation to bring result sets together
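The simulation and comparison steps of this use case can be sketched roughly as follows. This is only an illustration under stated assumptions: `toy_model` is a hypothetical stand-in for a real simulation code, and the `resultSet`/`run` XML layout is invented for this example rather than taken from any standard or from the original slides.

```python
import xml.etree.ElementTree as ET


def toy_model(parameter: float) -> float:
    """Hypothetical stand-in for a simulation: a simple quadratic response."""
    return parameter * parameter - 4.0 * parameter + 3.0


def run_sweep(model, start: float, stop: float, steps: int) -> list[tuple[float, float]]:
    """Evaluate the model at evenly spaced points in [start, stop]."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    width = (stop - start) / (steps - 1)
    return [(start + i * width, model(start + i * width)) for i in range(steps)]


def results_to_xml(model_name: str, results) -> ET.Element:
    """Wrap a result set in an illustrative (non-standard) XML layout."""
    root = ET.Element("resultSet", attrib={"model": model_name})
    for parameter, value in results:
        ET.SubElement(root, "run", attrib={"parameter": repr(parameter), "value": repr(value)})
    return root


def negative_range(root: ET.Element) -> list[float]:
    """Return the parameter values at which the simulated quantity is negative."""
    return [float(run.get("parameter")) for run in root.findall("run")
            if float(run.get("value")) < 0.0]


# Sweep the parameter space, serialise the results, then query the XML.
sweep = run_sweep(toy_model, 0.0, 4.0, 9)
document = results_to_xml("toy_model", sweep)
print(ET.tostring(document, encoding="unicode"))
print(negative_range(document))
```

Keeping the result set in a neutral representation such as XML means a later comparison step only has to read the document, not re-run the model.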
5 Virtual Research Environment
- How can a VRE help e-Research?
- Access everything from an integrated set of desktop tools
- At the present time will probably only address a subset of the steps identified
- Many tools are currently used in both research and support, e.g. Microsoft Office (Outlook, Excel, Word, PowerPoint)
- Many tools used to collaborate and search and share data, e.g. Web browsers and servers; need to aggregate data and information (don't just cut and paste)
- e-Science has introduced more collaboration tools such as Access Grid and PIG, giving audio/visual collaboration for groups
- Peer-to-Peer tools growing in popularity
- Portals popular for CLEs and e-Research (Grid): CHEF/Sakai.
- VRE enables single sign-on and re-use of key services.
6 Service Oriented Architecture
- Re-usable distributed services
- Sites can host their own content: aggregation/watermarking/protect IPR
- Share metadata and formats: cross searching
- Publish services through a variety of interfaces as a VRE:
- Registries
- Portals
- GUIs
- Programming APIs
- Common services can be used differently depending upon application:
- Research
- Teaching
- Dissemination
- But, this is only useful if the services are rich and access all resources required in the research process.
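The idea of publishing a re-usable service through a registry and using it differently per application can be illustrated with a small sketch. Everything here is an assumption made for illustration: `ServiceRegistry`, `keyword_search` and the sample records are hypothetical, and a real VRE would use distributed services behind a network registry, not in-process Python callables.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ServiceRegistry:
    """In-memory registry mapping service names to callables (illustrative only)."""
    _services: dict[str, Callable[..., object]] = field(default_factory=dict)

    def publish(self, name: str, service: Callable[..., object]) -> None:
        if name in self._services:
            raise ValueError(f"service already published: {name}")
        self._services[name] = service

    def lookup(self, name: str) -> Callable[..., object]:
        try:
            return self._services[name]
        except KeyError:
            raise LookupError(f"no such service: {name}") from None

    def names(self) -> list[str]:
        return sorted(self._services)


def keyword_search(records: list[dict], keyword: str) -> list[dict]:
    """A shared 'search' service: case-insensitive match on the title field."""
    return [r for r in records if keyword.lower() in r["title"].lower()]


registry = ServiceRegistry()
registry.publish("search", keyword_search)

# Hypothetical sample records.
records = [
    {"title": "Heart modelling dataset", "audience": "research"},
    {"title": "Introduction to heart physiology", "audience": "teaching"},
    {"title": "Grid job logs", "audience": "research"},
]

# Two applications re-use the same service through the registry.
search = registry.lookup("search")
research_hits = [r for r in search(records, "heart") if r["audience"] == "research"]
teaching_hits = [r for r in search(records, "heart") if r["audience"] == "teaching"]
print(registry.names(), len(research_hits), len(teaching_hits))
```

The research and teaching views differ only in how they filter the output of the common service, which is the kind of re-use the slide describes.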
7 Example 1: EGEE and ARDA
- EGEE: Enabling Grids for e-Science in Europe.
- ARDA: Architectural Roadmap towards Distributed Analysis.
- LHC Computing Grid Report CERN-LCG-2003-033
- Builds on existing software, e.g. AliEn portal and services
- Assesses future user requirements for LCG application area
- Build and extend Grid/database services
- Provide application frameworks, shells, APIs, interactive GUIs, portals etc.
- Includes an object-oriented programming API
- Proposed as an example component of the EGEE work programme for the EU Grid.
8 Key Services for Distributed Analysis
9 APIs and User Interfaces
10 Some UK Research Resources
- Existing resources need to be service enabled
- Access Grid Nodes (e-Science Centres)
- Course Content (University and Training Institutions)
- Condor pools of workstations (University and Teaching institutions)
- Resource Discovery Network resources (JCIE)
- AHDS (AHRB) and e-SS (ESRC) and related training and awareness material, e.g. REDRESS
- Directories: Z-Directory (UKOLN), Z39.50 target directory (Index Data), RSS-express (UKOLN), OAI Data providers (OAI), IESR (JISC)
- Text mining service (BBSRC), Data Curation Centre and any other specific research resources funded in partnership with Research Councils
- Resources referenced in the JISC subject resources guides
- Subject gateways, Data services, Learning and teaching
- Support services.
- Tools referenced in JISC Collections publications list
- National Grid Service nodes (JCSR). Supercomputing facilities such as HPCx, CSAR (managed by EPSRC)
- Data Archive and MIMAS (ESRC)
- Protein Data Bank (Hosted by Wellcome Foundation at EBI)
- Large-scale facilities such as SRS, ISIS, Diamond (hosted at CCLRC) and associated scientific data collections
- LHC Data Grid (PPARC)
- NERC Data Centres and CEH
11 Service Classification
- There are many services in common between the three JISC pillars. Based on studies in previous work (JISC/CETIS/ETF) we proposed the following classification:
- E-Collaboration
- E-Research
- E-Learning
- Digital Information
- Common
- Some of these will be outlined.
- Where some services cross between these classes there may still be differences in name and exact meaning which have to be resolved.
- We address the e-Research areas by considering data, information, knowledge and support.
Click on link
http://www.grids.ac.uk/ETF/public/WebServices/classes.html
12 Data: the what
- Archiving
- Cataloguing
- Data Access and Integration
- Data Virtualisation
- Data Replication
- Data Management
- Deposition
- Markup
- Resource Discovery
- Transformation
- Validation
Data and Metadata Services
Metadata is crucial to discovery and quality. Who will curate raw scientific results?
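To illustrate why shared metadata matters for resource discovery, here is a minimal sketch of a cross-search over two archives. The archives, their field names and the common `title`/`subjects`/`source` schema are all hypothetical, chosen only to show the normalisation step; they do not describe any of the services named in these slides.

```python
# Hypothetical records from two archives that use different field names.
ARCHIVE_A = [
    {"name": "Household survey 2001", "keywords": ["income", "housing"]},
    {"name": "Labour force panel", "keywords": ["employment", "income"]},
]
ARCHIVE_B = [
    {"title": "Regional wage statistics", "subject": "income; employment"},
]


def normalise_a(record: dict) -> dict:
    """Map an archive A record onto the common (assumed) schema."""
    return {"title": record["name"], "subjects": set(record["keywords"]), "source": "A"}


def normalise_b(record: dict) -> dict:
    """Map an archive B record, splitting its semicolon-separated subjects."""
    subjects = {s.strip() for s in record["subject"].split(";") if s.strip()}
    return {"title": record["title"], "subjects": subjects, "source": "B"}


def cross_search(subject: str) -> list[dict]:
    """Search both archives at once via the normalised metadata."""
    catalogue = [normalise_a(r) for r in ARCHIVE_A] + [normalise_b(r) for r in ARCHIVE_B]
    return [r for r in catalogue if subject in r["subjects"]]


hits = cross_search("employment")
for hit in hits:
    print(hit["source"], hit["title"])
```

Without the normalisation functions, the same query would need archive-specific code for every source, which is the cost that agreed metadata formats are meant to remove.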
13 Scientific Data Life Cycle
14 Information: the how
- Information is also available in many forms, often textual, but also audio/video
- Semantics are important to exchange and compare
- Need agreed terminologies
- How is it derived? Can it be trusted? Bias, hype
- Privacy and security
- Access
- Aggregation
- Content Registration
- Creation
- Query
- Metadata
- Presentation
- Notification
- Update
Information Services
15 Knowledge: the why
- Knowledge is the end result of a long process, so far around 4000 years!
- It is highly valued and protected (IPR)
- Passed on in many forms: oral, written, behavioural
- May be subject to cultural and other interpretations
- Different views
- Education process is fundamental to acquiring and using knowledge
- Knowledge representation and digital processing is relatively new
- Ethical considerations
- Knowledge discovery
- Workflow management
- Knowledge management
- Syndication (join)
- Dictionaries and Ontologies
- Terminology
Knowledge Services
16 The Research Support Processes
- We no longer have research supported solely by philanthropists; it's a job for most scientists both in academe and industry. It is also competitive. There are related and important processes:
- Write up ideas based on background, reading journals, attending conferences
- Submit funding proposal; wait for success/failure
- Set up project plan: risks, tasks, deadlines, metrics...
- Link into university finance and people system
- Carry out research based on plan
- Write up results for peers via journals or conferences
- Write up results for funding and monitoring agencies (e.g. RAE review)
- Publish and disseminate: electronic, printed, Web, other?
17 Other Research Services
- Application Management
- Business Workflow
- Deployment
- Distribution
- Fabric Management
- Grid Information
- Job Management
- Process Building
- Proposal Writing
- Resource Discovery
- Resource Management
- Scheduling
- Security
- Validation and Verification
- Visualisation
This area includes active services for control of computation, observation and experimental resources (Grid).
18 Service Framework for e-Research
From S. Wilson, K. Blinco and D. Rehak, "Service Oriented Frameworks", DEST (Australia), JISC-CETIS (UK) and Industry Canada
http://www.jisc.ac.uk/uploaded_documents/AltilabServiceOrientedFrameworks.pdf
19 Example 2: Integrative Biology
- Large project funded by EPSRC with international partners. Actively defining and seeking middleware to implement shared services.
- Multi length/time scale modelling of heart and cancer growth
- Linking multiple components across 3 continents.
- Developing services in the following areas:
- User management
- Executable building
- Executable management
- Data management
- Job management
- Workflow management
- Visualisation and interactive services
- Steering
- Collaborative working
20 IB Architecture
21 Services and Aggregation
Web is ubiquitous to both research and leisure. Has Google replaced the traditional library? I don't need to leave my desk and can download plenty of material.
22 VRE access to many Services